Re: [R] Line similarity

William Dunlap Tue, 30 Apr 2013 13:51:16 -0700

Here is one way to regress, for each row of the data.frame v, the numbers in
columns 2 through 4 on the numbers 1 through 3, store only the slopes, and
then create a column saying whether the slope is greater than zero.


> v[,"Beta"] <- vapply(seq_len(nrow(v)),
      FUN=function(i) coef(lm(value ~ year,
          data=data.frame(value=as.numeric(v[i,2:4]), year=seq_len(3))))[2],
      FUN.VALUE=0)
> v[,"Growing"] <- v[,"Beta"] > 0
> v
  Name Year_1_value Year_2_value Year_3_value Beta Growing
1    A            1            2            3  1.0    TRUE
2    B            2            7           19  8.5    TRUE
3    C            3            4            2 -0.5   FALSE
4    D           10            7            6 -2.0   FALSE
5    E            4            4            5  0.5    TRUE
6    F           NA            3            6  3.0    TRUE
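
The original question asked for three labels (growing, stable, declining)
rather than two. A minimal sketch of one way to get them from the slopes,
assuming a symmetric cutoff on Beta: the name tol and the value 0.75 are
hypothetical, picked only because they reproduce the desired output in the
original message, and this is not claimed to be what Spotfire does.

```r
# Rebuild the example data from the thread so the block runs on its own.
v <- data.frame(
  Name = c("A", "B", "C", "D", "E", "F"),
  Year_1_value = c(1, 2, 3, 10, 4, NA),
  Year_2_value = c(2, 7, 4, 7, 4, 3),
  Year_3_value = c(3, 19, 2, 6, 5, 6)
)

# Per-row least-squares slope, same idea as the vapply() call above.
# lm() drops the NA in row F by default, so that row is fit on two points.
v$Beta <- vapply(seq_len(nrow(v)), function(i) {
  d <- data.frame(value = as.numeric(v[i, 2:4]), year = seq_len(3))
  unname(coef(lm(value ~ year, data = d))[2])
}, FUN.VALUE = 0)

# Hypothetical cutoff: slopes within +/- tol are labelled "Stable".
tol <- 0.75
v$Trend <- cut(v$Beta,
               breaks = c(-Inf, -tol, tol, Inf),
               labels = c("Declining", "Stable", "Growing"))

print(v[, c("Name", "Beta", "Trend")])
```

With real data the cutoff would need to be chosen relative to the scale of
the values (or replaced by a test on the slope's standard error).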

Since you are doing least-squares regression in which the predictors are the
same for all regressions (except the one with the NA in it) you can also do
> coef(lm(value ~ year, list(value=t(as.matrix(v[1:5,2:4])),
      year=seq_len(3))))[2,]
   1    2    3    4    5 
 1.0  8.5 -0.5 -2.0  0.5
but you have to then make a special case for each pattern of missing values.
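
One possible way around the special-casing is to skip lm() and compute each
row's slope in closed form, sum((x - mean(x)) * (y - mean(y))) /
sum((x - mean(x))^2), using only that row's non-missing years. The following
is a sketch under that assumption; the variable names are illustrative, and
rows with fewer than two non-missing values are set to NA.

```r
# Rebuild the example data from the thread so the block runs on its own.
v <- data.frame(
  Name = c("A", "B", "C", "D", "E", "F"),
  Year_1_value = c(1, 2, 3, 10, 4, NA),
  Year_2_value = c(2, 7, 4, 7, 4, 3),
  Year_3_value = c(3, 19, 2, 6, 5, 6)
)

y <- as.matrix(v[, 2:4])

# Year index 1..3 repeated in every row, blanked out wherever y is missing
# so that each row's means and sums use only its observed years.
x <- matrix(seq_len(ncol(y)), nrow(y), ncol(y), byrow = TRUE)
x[is.na(y)] <- NA

# Centre each row on its own mean (the vector recycles down the rows).
xc <- x - rowMeans(x, na.rm = TRUE)
yc <- y - rowMeans(y, na.rm = TRUE)

# Closed-form simple-regression slope per row.
slope <- rowSums(xc * yc, na.rm = TRUE) / rowSums(xc^2, na.rm = TRUE)

# A slope is undefined with fewer than two observed points.
slope[rowSums(!is.na(y)) < 2] <- NA

print(unname(slope))
```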

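If you always use a 3-consecutive-year period you can use
   Growing <- v[,"Year_1_value"] < v[, "Year_3_value"]
because for three equally spaced years the least-squares slope is
(Year_3_value - Year_1_value)/2, so its sign depends only on the endpoints.
Note that this gives NA for any row whose first or third value is missing.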

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Satsangi, Vivek (GE Capital)
> Sent: Tuesday, April 30, 2013 12:57 PM
> To: r-help@r-project.org
> Subject: [R] Line similarity
> 
> Folks,
> 
> This is probably a "help me google this properly, please"-type of question.
> 
> In TIBCO Spotfire, there is a procedure called "line similarity". I use this
> to determine which observations show a growing, stable or declining
> pattern... sort of like a mini-regression on the time-line for each
> observation.
> 
> So if the input is something like this:
> 
> Name Year_1_value Year_2_value Year_3_value
> A 1 2 3
> B 2 7 19
> C 3 4 2
> D 10 7 6
> E 4 4 5
> F NA 3 6
> 
> Then the desired output is as follows:
> A Growing
> B Growing
> C Stable
> D Declining
> E Stable
> F Growing (or NA is also fine)
> 
> The data can also be unstacked, i.e. the three years could be separate rows
> if necessary.
> Is there a package for R that implements something like the above? I can
> obviously try to do a set of simple regressions to classify the rows, but I
> want to gain from the thoughts and learnings of others who may have taken
> the time to implement a package.
> I tried searching with the words "line similarity" or its variants to no
> avail.
> 
> Thanks in advance for your pointers!
> 
> Vivek Satsangi
> GE Capital
> Americas
> 
> 
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
