Re: [R] Bus stop sequence matching problem

2014-08-30 Thread Gabor Grothendieck
Try dtw.  First convert ref to numeric since dtw does not handle
character input.  Then align using dtw and NA out repeated values in
the alignment.  Finally zap ugly row names and calculate loading:

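library(dtw)
# ref must be a factor here (the data.frame default before R 4.0.0)
s1 <- as.numeric(stop_sequence$ref)
s2 <- as.numeric(factor(as.character(stop_onoff$ref),
                        levels(stop_sequence$ref)))
a <- dtw(s1, s2)
# a$index2 holds the matched stop_onoff row along the warping path;
# repeated matches are replaced by NA
DF <- cbind(stop_sequence,
  stop_onoff[replace(a$index2, c(FALSE, diff(a$index2) == 0), NA), ])[-3]
rownames(DF) <- NULL
transform(DF, loading = cumsum(ifelse(is.na(on), 0, on)) -
  cumsum(ifelse(is.na(off), 0, off)))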

giving:

  seq ref on off loading
1  10   A  5   0   5
2  20   B NA  NA   5
3  30   C NA  NA   5
4  40   D  0   2   3
5  50   B 10   2  11
6  60   A  0   6   5

You will need to test this with more data and tweak it if necessary
via the various dtw arguments.


On Fri, Aug 29, 2014 at 8:46 PM, Adam Lawrence <alaw...@gmail.com> wrote:
> I am hoping someone can help me with a bus stop sequencing problem in R,
> where I need to match counts of people getting on and off a bus to the
> correct stop in the bus route stop sequence. I have tried looking
> online/forums for sequence matching but it seems to refer to numeric sequences
> or DNA matching and is over my head. I am after a simple example if anyone can
> please help.
>
> I have two data series as per below (from a database), that I want to
> combine. In this example "stop_sequence" includes the sequence (seq) of bus
> stops and "stop_onoff" is a count of people getting on and off at certain
> stops (there is no entry if no one gets on or off).
>
> stop_sequence <- data.frame(seq=c(10,20,30,40,50,60),
>   ref=c('A','B','C','D','B','A'))
> ##   seq ref
> ## 1  10   A
> ## 2  20   B
> ## 3  30   C
> ## 4  40   D
> ## 5  50   B
> ## 6  60   A
> stop_onoff <-
>   data.frame(ref=c('A','D','B','A'),on=c(5,0,10,0),off=c(0,2,2,6))
> ##   ref on off
> ## 1   A  5   0
> ## 2   D  0   2
> ## 3   B 10   2
> ## 4   A  0   6
>
> I need to match the stop_onoff numbers in the right stop sequence, with the
> correctly matched output as follows (load is a cumulative count of on and
> off)
>
> desired_output <- data.frame(seq=c(10,20,30,40,50,60),
>   ref=c('A','B','C','D','B','A'),
>   on=c(5,'-','-',0,10,0),off=c(0,'-','-',2,2,6), load=c(5,0,0,3,11,5))
> ##   seq ref on off load
> ## 1  10   A  5   0    5
> ## 2  20   B  -   -    0
> ## 3  30   C  -   -    0
> ## 4  40   D  0   2    3
> ## 5  50   B 10   2   11
> ## 6  60   A  0   6    5
>
> In this example the stop "B" is matched to the second stop "B" in the stop
> sequence and not the first because the onoff data is after stop "D".
>
> Any guidance much appreciated.
>
> Regards
> Adam
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] Bus stop sequence matching problem

2014-08-30 Thread David McPearson
Homework? The list has a no-homework policy, but perhaps I'll be forgiven for
posting hints.
In general terms, this is how I approached the problem:
* Loop through the rows of stop_onoff - for (idx in ...something...) {...
* For each row, find the first occurrence of ref in a suitably filtered subset of
stop_sequence, and keep track of these row numbers
* Update columns on and off
* Use cumsum to calculate the number of passengers on the bus

Note the loop. Someone cleverer than I might be able to vectorise that step,
but I couldn't see how.
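
The steps above could be sketched roughly as follows. This is only one reading of the hints, not the poster's actual code: the function name match_stops and the greedy rule "take the first matching stop after the previously matched one" are assumptions, and that rule can pick the wrong stop if the same ref occurs again before the counts resume.

```r
stop_sequence <- data.frame(seq = c(10, 20, 30, 40, 50, 60),
                            ref = c("A", "B", "C", "D", "B", "A"))
stop_onoff <- data.frame(ref = c("A", "D", "B", "A"),
                         on = c(5, 0, 10, 0), off = c(0, 2, 2, 6))

# match_stops is a hypothetical helper name, not part of any package
match_stops <- function(stop_sequence, stop_onoff) {
  out <- stop_sequence
  out$on <- 0
  out$off <- 0
  seq_ref <- as.character(out$ref)        # works for factor or character ref
  cnt_ref <- as.character(stop_onoff$ref)
  last <- 0                               # row matched by the previous count
  for (idx in seq_len(nrow(stop_onoff))) {
    # candidate rows: same ref, strictly after the previous match
    cand <- which(seq_ref == cnt_ref[idx] & seq_len(nrow(out)) > last)
    if (length(cand) == 0) stop("no later stop matches ref ", cnt_ref[idx])
    last <- cand[1]
    out$on[last] <- stop_onoff$on[idx]
    out$off[last] <- stop_onoff$off[idx]
  }
  # passengers on board after each stop
  out$load <- cumsum(out$on) - cumsum(out$off)
  out
}

res <- match_stops(stop_sequence, stop_onoff)
res
```

Stops with no count get on = 0 and off = 0 here, so the load is carried forward rather than reset.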

By the way, if this is homework...

Are you sure your desired_output is correct? I would expect something like
  seq ref on off load
1  10   A  5   0    5
2  20   B  0   0    5
3  30   C  0   0    5
4  40   D  0   2    3
5  50   B 10   2   11
6  60   A  0   6    5

Are you aware that your ref columns are factors and not characters? If
you use stringsAsFactors = FALSE or
stop_onoff <-
data.frame(ref=factor(c('A','D','B','A'), levels =
levels(stop_sequence$ref)),on=c(5,0,10,0),off=c(0,2,2,6))
it will simplify your analysis (or at least reduce some typing).

Type the following in an R console
?data.frame
?factor
and have a read.

Now, if this ain't homework, or you just want someone to do it for you, e-mail
me offline and I'll send you my approach. If it is homework, let me know - I'm
happy to help anyway, but I will be trying to help you solve this for
yourself.

Cheers,
DMcP


Re: [R] Problem with sapa package and spectral density function (SDF)

2014-08-30 Thread G
Hello,

Did you find a solution to this problem on the R mailing list?
I am having the same problem, but either there were no replies to
your question or I could not find them.

Thanks
Anusha

Re: [R] new error with QuantMod getSymbols

2014-08-30 Thread Joshua Ulrich
You didn't provide the file lista.csv, so it's not possible to
reproduce any of these errors.  And there's no call to getSymbols in
your code.  You use loadSymbols, and I am not familiar with that
function.

That said, this sounds like an issue with some of the data being sent
by Yahoo Finance.
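
Since the failures were intermittent, one defensive option is to wrap each download in tryCatch so that a bad response for one symbol does not abort the whole run. The following is a minimal sketch under stated assumptions: download_symbols_safely and its fetch argument are hypothetical names, and fetch stands for whatever function retrieves one symbol (for example function(s) quantmod::getSymbols(s, auto.assign = FALSE)).

```r
# Hypothetical helper: retry each symbol a few times, collect failures
download_symbols_safely <- function(symbols, fetch, retries = 2) {
  results <- setNames(vector("list", length(symbols)), symbols)
  failed <- character(0)
  for (sym in symbols) {
    for (attempt in seq_len(retries + 1)) {
      # turn an error into a value so the loop can continue
      res <- tryCatch(fetch(sym), error = function(e) e)
      if (!inherits(res, "error")) break
    }
    if (inherits(res, "error")) {
      failed <- c(failed, sym)
    } else {
      results[[sym]] <- res
    }
  }
  # keep only the symbols that were fetched successfully
  list(data = results[setdiff(symbols, failed)], failed = failed)
}

# Usage with a stand-in fetch function (no network access needed)
fake_fetch <- function(s) if (s == "BAD") stop("malformed data") else toupper(s)
out <- download_symbols_safely(c("aaa", "BAD", "bbb"), fake_fetch)
out$failed
```

The failed symbols can then be retried later or inspected individually.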
--
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com


On Thu, Aug 28, 2014 at 11:12 PM, Adolfo Yanes <adolfoya...@gmail.com> wrote:
> Hello,
>
> I use the getSymbols function daily to run some models with stock data. Today
> when I tried to update the stock info I got this error:
>
> Error in charToDate(x) :
>   character string is not in a standard unambiguous format
>
> Sometimes I get it after 2 symbols, other times after 150 symbols, another
> time after 40 symbols, then after 203 symbols.
>
> The code for the symbol list is:
>
> lista <- read.csv("lista.csv", header=FALSE)
> lista.list.ana <- vector('list', nrow(lista))
> names(lista.list.ana) <- lista[,1]
> lista.sum <- as.vector(lista[,1])
>
> ## update the list
> lista_simbolos <- download_symbols(lista.sum, lista.list.ana)
>
> The code for the function download_symbols is:
>
> download_symbols <- function(lista.sum., lista.list.ana..) {
>   newnames. <- c("Open", "High", "Low", "Close", "Volume", "Adjusted")
>   for (m in 1:length(lista.sum.)) {
>     print(paste(c("Downloading symbol ", lista.sum.[m], ". ",
>                   length(lista.sum.) - m, " symbols missing"),
>                 sep="", collapse=""))
>     temp <- get(loadSymbols(lista.sum.[m]))
>     names(temp) <- newnames.
>     #lista.list.ana..[[m]] <- loadSymbols(lista.sum.[m])
>     lista.list.ana..[[m]] <- temp
>   }
>   return(lista.list.ana..)
> }
>
> Is it something wrong with yahoo? I tried google and got another error
> Error in `colnames<-`(`*tmp*`, value = c("Open", "High", "Low", "Close",  :
>   length of 'dimnames' [2] not equal to array extent
>
> Thanks for your help
>
> --
> Adolfo Yanes Musetti



Re: [R] Bus stop sequence matching problem

2014-08-30 Thread Charles Berry
Adam Lawrence <alaw005 at gmail.com> writes:

> I am hoping someone can help me with a bus stop sequencing problem in R,
> where I need to match counts of people getting on and off a bus to the
> correct stop in the bus route stop sequence. I have tried looking
> online/forums for sequence matching but seems to refer to numeric
> sequences or DNA matching and over my head. I am after a simple example
> if anyone can please help.

Adam,

Yet another way...

See inline code. BTW, you should have mentioned that you are
a transit planner or included a signature block so folks would know this
is not a homework question.

As others have noted/hinted, there are some unstated assumptions, so
you need to try some test cases to be sure any solution always works.

You only have one outbound/inbound cycle in stop_onoff, right??
If not, I think almost any approach can fail given the right
sequence of 'seq's.


Start here:

 stop_onoff$load <- with(stop_onoff, cumsum(on) - cumsum(off))
 split.ref <- with(stop_sequence, split(seq, ref))
 split.ref.onoff <- split.ref[as.character(stop_onoff$ref)]
 stop.mat <- sapply(split.ref.onoff, rep, length = 2)
 inout <- cbind(stop.mat, c(0, Inf)) > cbind(c(0, Inf), stop.mat)
 stop_onoff$seq <- head(stop.mat[inout], -1)
 merge(stop_sequence[c("ref", "seq")], stop_onoff[-1], by = "seq", all.x = TRUE)
  seq ref on off load
1  10   A  5   0    5
2  20   B NA  NA   NA
3  30   C NA  NA   NA
4  40   D  0   2    3
5  50   B 10   2   11
6  60   A  0   6    5

You can take care of turning the NA's to zeroes or '-'s, I think.
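
One way to handle the NAs, sketched under the assumption that an unmatched stop means nobody boarded or alighted there, is to zero out on and off and then recompute load so it carries forward. The data frame res below is typed in by hand to mirror the merged output above; the name is only for illustration.

```r
# Hand-entered copy of the merged result shown above
res <- data.frame(seq  = c(10, 20, 30, 40, 50, 60),
                  ref  = c("A", "B", "C", "D", "B", "A"),
                  on   = c(5, NA, NA, 0, 10, 0),
                  off  = c(0, NA, NA, 2, 2, 6),
                  load = c(5, NA, NA, 3, 11, 5))

# Treat stops without a count as zero boardings and alightings
res$on[is.na(res$on)] <- 0
res$off[is.na(res$off)] <- 0

# Recompute the running load so it is defined at every stop
res$load <- cumsum(res$on - res$off)
res
```

If '-' is wanted for display instead, the columns would have to become character, which makes further arithmetic awkward, so that is probably best left to the final printing step.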

HTH,

Chuck


[R] clean email addresses?

2014-08-30 Thread Spencer Graves

  Does anyone have suggestions for cleaning a list of email addresses?


  I ask because I can read into R data on registered voters that
includes an email address field.  I wondered if anyone had experience
doing such, especially in R.  (I found an article on "How to Clean Large
Email Contact Lists for Email Marketing Campaigns":
www.wikihow.com/Clean-Large-Email-Contact-Lists-for-Email-Marketing-Campaigns.)
Before I went further with this, I felt a need to ask.



  Thanks,
  Spencer Graves


[R] ddply question

2014-08-30 Thread Felipe Carrillo
I apologize for cross-posting, but my question keeps bouncing back from the
list.

How come pct doesn't work in this ddply call?
I am trying to get a percent of 'TotalCount' by SampleDate and Age.

library(plyr)
b <- structure(list(SampleDate = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = "5/8/1996", class = "factor"), TotalCount = c(1L,
2L, 1L, 1L, 4L, 3L, 1L, 10L, 3L), ForkLength = c(61L, 22L, NA,
NA, 72L, 34L, 100L, 23L, 25L), TotalSalvage = c(12L, 24L, 12L,
12L, 17L, 23L, 31L, 12L, 15L), Age = c(1L, 0L, NA, NA, 1L, 0L,
1L, 0L, 0L)), .Names = c("SampleDate", "TotalCount", "ForkLength",
"TotalSalvage", "Age"), class = "data.frame", row.names = c(NA,
-9L))
b
ddply(b, .(SampleDate, Age), summarise, salvage = sum(TotalSalvage),
      pct = TotalCount/sum(TotalCount))
Error: expecting result of length one, got : 4

# Computing TotalCount inside ddply works but the pct seems wrong...
ddply(b, .(SampleDate, Age), summarise, salvage = sum(TotalSalvage),
      Count = sum(TotalCount), pct = Count/sum(Count))


[R] Error in inDL(x, as.logical(local), as.logical(now), ...)

2014-08-30 Thread Girija Kalyani
Dear Group,

I get this error when loading RCurl. What could be the reason?
My configuration:
R version: 3.1.1
Windows, 32-bit

Error in inDL(x, as.logical(local), as.logical(now), ...) :
  unable to load shared object 'C:/Program Files/R/R-3.1.1/library/RCurl/libs/i386/RCurl.dll':
  LoadLibrary failure:  The specified procedure could not be found.


How do I solve this? RCurl seems to be mandatory for installing rgbif.
-- 
:) Smile is my Style :)



Re: [R] new error with QuantMod getSymbols

2014-08-30 Thread adolfoyanes
Thank you very much, Joshua. Pardon me for confusing loadSymbols with
getSymbols and for not sending the file lista.csv. Apparently the issue was with
Yahoo Finance that day. The next day it worked perfectly.

Best Regards,

Adolfo Yanes


Re: [R] posterior probabilities from lda.predict

2014-08-30 Thread David L Carlson
Function predict.lda() is just answering a different question from the one you
are posing. It is answering the question: given the values on this object, what
is the probability of membership in each of the groups used to construct the
discriminant functions in the first place? Those probabilities sum to 1 and are
generally called the posterior probabilities. Your question is somewhat
different: if this object were a member of group x, what is the probability that
it would have values like these? These are typicality probabilities (how
typical is this observation in this group).

There are two ways to compute typicality probabilities. One is to use the
reduced space defined by the discriminant functions and measure the distance of
a new observation to the centroid of the group. This is the approach taken by
SPSS, which provides the typicality for the group which has the highest
posterior probability. Huberty and Olejnik recommend this procedure on the
grounds that the probability distribution is known. The alternate approach,
which is used commonly in compositional analysis, is to use Mahalanobis distance
with the probability assumed to follow a chi-square distribution. I am not
aware of a package that has a function to produce either of these.

Huberty, Carl J. and Stephen Olejnik. 2006. Applied MANOVA and Discriminant
Analysis. Second Edition. Wiley-Interscience.
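
The second approach (Mahalanobis distance referred to a chi-square distribution) can be sketched with base R alone. This is an illustration under stated assumptions, not an established function: the name typicality is hypothetical, the distances are computed in the full variable space with a pooled within-group covariance matrix (matching the equal-covariance assumption of LDA), the degrees of freedom are taken as the number of variables, and iris is only stand-in data.

```r
# Hypothetical helper: chi-square typicality probabilities per group
typicality <- function(train, groups, newdata) {
  train <- as.matrix(train)
  newdata <- as.matrix(newdata)
  groups <- factor(groups)
  lev <- levels(groups)
  # pooled within-group covariance matrix
  pooled <- Reduce(`+`, lapply(lev, function(g) {
    x <- train[groups == g, , drop = FALSE]
    (nrow(x) - 1) * cov(x)
  })) / (nrow(train) - length(lev))
  out <- vapply(lev, function(g) {
    centre <- colMeans(train[groups == g, , drop = FALSE])
    d2 <- mahalanobis(newdata, center = centre, cov = pooled)
    # upper-tail probability of a squared distance at least this large
    pchisq(d2, df = ncol(train), lower.tail = FALSE)
  }, numeric(nrow(newdata)))
  matrix(out, nrow = nrow(newdata),
         dimnames = list(rownames(newdata), lev))
}

# First row: a genuine setosa flower; second row: far from every group
new_obs <- rbind(as.matrix(iris[1, 1:4]), c(20, 20, 20, 20))
tp <- typicality(iris[, 1:4], iris$Species, new_obs)
round(tp, 4)
```

Unlike posterior probabilities, each row here need not sum to one, so an observation that is atypical of every group shows small values across the whole row.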

David L. Carlson
Department of Anthropology
Texas A&M University


-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Fraser D. Neiman
Sent: Friday, August 29, 2014 4:14 PM
To: r-help@r-project.org
Subject: [R] posterior probabilities from lda.predict

Dear All,

I have used the lda() function in the MASS library to estimate a set of 
discriminant functions to assign samples from a training set to one of six 
groups.  The cross validation generates nearly perfect predictions for samples 
in the training set.  Hooray!

Now I want to use lda.predict() to estimate both discriminant function scores
and probabilities of group membership for a second set of samples whose group
membership is unknown.  For each unknown sample, lda.predict() produces six
probabilities. These probabilities sum to one. So lda.predict() seems to assume
that the unknown samples do, in fact, belong to one of the six groups.

The problem is that it is nearly certain that some of the unknown samples in
the second set do not belong to any of the six groups. For those samples,
probabilities of group membership should be close to zero for all six groups.
In fact, identifying which samples are unlikely to belong to any of the six
groups is a major goal of the analysis.

So the question is, what is lda.predict() doing behind the scenes to force the
group membership probabilities to sum to one? How do I get it to not do this
and produce probabilities that accurately reflect the large Mahalanobis
distances of some of the unknown samples from any group centroid?

I have searched the R-list archive on this and have found several folks asking 
similar questions, but no helpful answers.

Thanks very much!

Fraser


Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable

2014-08-30 Thread David Winsemius


On Aug 29, 2014, at 8:54 PM, David McPearson wrote:

> On Fri, 29 Aug 2014 06:33:01 -0700 Jeff Newmiller <jdnew...@dcn.davis.ca.us>
> wrote
>
>> One clue is the help file for "$"...
>>
>> ?"$"
>>
>> In particular there see the discussion of character indices and the
>> "exact" argument.
>>
> ...snip...
>>
>> On August 29, 2014 1:53:47 AM PDT, Angel Rodriguez
>> <angel.rodrig...@matiainstituto.net> wrote:
>>>
>>> Dear subscribers,
>>>
>>> I've found that if there is a variable in the dataframe with a name
> ...snip...
>>> N <- structure(list(V1 = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
>>>   V2 = c(NA, 1, 1, 1, 1, 1, 1, 1, NA)),
>>>   .Names = c("age", "samplem"), row.names = c(NA, -9L),
>>>   class = "data.frame")
>>> N$sample[N$age >= 65] <- 1
>>> N
>>>   age samplem sample
>>> 1  67      NA      1
>>> 2  62       1      1
>>> 3  74       1      1
>>> 4  61       1      1
>>> 5  60       1      1
>>> 6  55       1      1
>>> 7  60       1      1
>>> 8  59       1      1
>>> 9  58      NA     NA
> ...snip...
>
> Having seen all the responses about partial matching I almost understand.
> I've also replicated the behaviour on R 2.11.1 so it's been around awhile.
> This tells me it ain't a bug - so if any of the cognoscenti have the time and
> inclination can someone give me a brief (and hopefully simple) explanation of
> what is going on under the hood?
>
> It looks (to me) like N$sample[N$age >= 65] <- 1 copies N$samplem to N$sample
> and then does the assignment. If partial matching is the problem (which it
> clearly is) my expectation is that the output should look like
>
>   age samplem
> 1  67       1
> 2  62       1
> 3  74       1
> 4  61       1
> 5  60       1
> 6  55       1
> 7  60       1
> 8  59       1
> 9  58      NA
> That is - no new column.
> (and I just hate it when the world doesn't live up to my expectations!)

Not sure what you are seeing. I am seeing what you expected:

> test <- data.frame(age=1:10, sample=1)
> test$sample[test$age < 5] <- 2
> test
   age sample
1    1      2
2    2      2
3    3      2
4    4      2
5    5      1
6    6      1
7    7      1
8    8      1
9    9      1
10  10      1

--
David

> Bewildered and confused,
> DMcP


David Winsemius, MD
Alameda, CA, USA



Re: [R] Split PVClust plot

2014-08-30 Thread Tal Galili
Hi Tom,

There is an as.dendrogram.pvclust function in the package dendextend
(it is on CRAN: http://cran.r-project.org/web/packages/dendextend/).

You can run:

install.packages('dendextend')
library(dendextend)
result2 <- as.dendrogram(result)
# You can then also use the prune function in dendextend to get the
# subtree you are interested in.

Also, there is an example of pvclust in the package vignette (just
search for "pvclust" here):
http://cran.r-project.org/web/packages/dendextend/vignettes/introduction.html
The example shows how to highlight significant branches (with line width
and color).

With regards,
Tal






Contact Details:
Contact me: tal.gal...@gmail.com |
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)



On Tue, Jul 29, 2014 at 12:40 AM, Worthington, Thomas A
<thomas.worthing...@okstate.edu> wrote:

> Dear All
>
> I'm using PVClust to perform hierarchical clustering. For the output plot
> I can control most of the graphical parameters I need; however, the plot is
> large and I would like to split it vertically into two panels, one above the
> other. Is there a way to plot only part of a PVClust plot? I tried to convert
> it to a dendrogram with
>
> result2 = as.dendrogram(result)
>
> however I get the error message "no applicable method for 'as.dendrogram'
> applied to an object of class "pvclust"". I also wondered whether it would
> be possible to convert to a phylogenetic tree and use the functions in the
> 'ape' package?
>
> Any suggestion on how to split up a PVclust plot would be greatly
> appreciated (code for the plot below)
>
> Thanks
> Tom
>
>
> result <- pvclust(df.1, method.dist="uncentered",
>   method.hclust="average", nboot=10)
> par(mar=c(0,0,0,0))
> par(oma=c(0,0,0,0))
> plot(result, print.pv=FALSE, col.pv=c("red","",""), print.num=FALSE,
>   float=0.02, font=1,
>   axes=T, cex=0.85, main="", sub="", xlab="", ylab="",
>   labels=NULL, hang=-1)
> pvrect(result, alpha=0.95)
 pvrect(result, alpha=0.95)



Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable

2014-08-30 Thread David Winsemius



I realized later that I had not constructed a test of your behavior, and
that when I did, I saw the creation of a third column. The answer is to
read the help page:


?`[<-`

"Character indices can in some circumstances be partially matched (see
pmatch) to the names or dimnames of the object being subsetted (but
never for subassignment)."


Note the caveat in parentheses.
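
A small sketch of that asymmetry, using the data from the original question. The decomposition shown in the comment is an informal paraphrase of how R evaluates a nested replacement, not a quotation from the documentation, and M is just an illustrative name.

```r
N <- data.frame(age = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
                samplem = c(NA, 1, 1, 1, 1, 1, 1, 1, NA))

# Reading N$sample partially matches the existing column "samplem";
# writing with `$<-` never partially matches, so "sample" is created.
# The statement below behaves roughly like:
#   tmp <- N$samplem; tmp[N$age >= 65] <- 1; N$sample <- tmp
N$sample[N$age >= 65] <- 1
names(N)                       # "age" "samplem" "sample"

# Exact matching on the read side avoids the surprise
M <- N[c("age", "samplem")]
is.null(M[["sample"]])         # TRUE: [[ ]] matches names exactly by default
M$sample <- ifelse(M$age >= 65, 1, NA)
M
```

Setting options(warnPartialMatchDollar = TRUE) makes R warn whenever `$` falls back to a partial match, which can help catch this kind of slip early.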

--

David Winsemius, MD
Alameda, CA, USA



[R] what happened when copying a function definition into R prompt then press Enter?

2014-08-30 Thread PO SU

Dear expeRts,
    That is to say, what happens when source code is loaded into memory? How
does that differ from loading installed code into memory? Is either of them
related to .RData?

--

PO SU
mail: desolato...@163.com 
Majored in Statistics from SJTU


[R-es] help: shiny reading files from google drive

2014-08-30 Thread Javier Villacampa González
Hello,

A colleague and I are building a Shiny application. It has turned out quite
nicely, and on a local server (with R) it works well. The problem is that it
stops working once we upload the files to the web. Why? Because at the start
of the application we load some data from our own computer, and that is not
possible once the app is deployed on ShinyApps.io (one of the many hosting
options available online) [1, 2, 3].

One possible solution that occurred to us was to upload the data in CSV
format to Google Drive and, once anyone is allowed to access them, read them
with read.table via the link.

The question is: does anyone know how to read CSV data hosted on Google
Drive? Could you give a practical example? We have not managed to reproduce
the ones we found.

Another question: is there any other free solution for reading data hosted
online?


As always, thanks in advance and kind regards,

Javier

References
[1] https://github.com/rstudio/shinyapps/blob/master/guide/guide.md
[2] http://shiny.rstudio.com/tutorial/lesson7/
[3] https://www.shinyapps.io/


--

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R-es] help: shiny reading files from google drive

2014-08-30 Thread Carlos Ortega
Hello,

There is a package on Bioconductor for this: RGoogleDocs.
Additional references:

http://www.statsravingmad.com/a-tiny-rcurl-headache/

Otherwise, you can do it through a file in the Public folder of a
Dropbox account...

Regards,
Carlos Ortega
www.qualityexcellence.es


