Re: [R] Best Beginner Books?
On Thu, Oct 2, 2014 at 4:54 AM, Jason Eyerly teamtraders3...@gmail.com wrote:

A lot of excellent suggestions! Thank you everyone for the input. I've purchased via Amazon: "A Beginner's Guide to R" by Zuur, "Data Manipulation with R" by Spector, and "Introductory Statistics with R" by Peter Dalgaard. Are there any suggestions as to which order I should read them in when they arrive? For best comprehension, I'm thinking "Introductory Statistics with R" would be the best to start with, and perhaps the beginner's guide after that, or vice versa?

I would not start with a whole book, but with a good short intro (~30-40 pages). Feel free to pick up free publications here: https://github.com/gabx/r-project/tree/master/documentation.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] apply if else statement to vector
Subscribers,

What is the correct syntax to apply the 'if else' conditional statement to vector objects? Example:

vectorx <- c(50,50,20,70)
vectory <- c(50,50,20,20)
vectorz <- function() { if (vectorx > vectory) vectorx else vectorx <- 0 }
vectorz()

Warning message:
In if (vectorx > vectory) vectorx else vectorx <- 0 :
  the condition has length > 1 and only the first element will be used

The help manual (?'if') explains that only a length-one condition is acceptable; what is an appropriate alternative function to use, please? The desired result for vectorz is: 0 0 0 70
Re: [R] apply if else statement to vector
Hi,

in this case the correct syntax is not to use if at all:

vectorx * (vectorx > vectory)
[1]  0  0  0 70

If you insist, you can use ?ifelse, which is a vectorised if.

Regards
Petr

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of r...@openmailbox.org
Sent: Thursday, October 02, 2014 11:02 AM
To: r-help@r-project.org
Subject: [R] apply if else statement to vector

Subscribers,

What is the correct syntax to apply the 'if else' conditional statement to vector objects? Example:

vectorx <- c(50,50,20,70)
vectory <- c(50,50,20,20)
vectorz <- function() { if (vectorx > vectory) vectorx else vectorx <- 0 }
vectorz()

Warning message:
In if (vectorx > vectory) vectorx else vectorx <- 0 :
  the condition has length > 1 and only the first element will be used

The help manual (?'if') explains that only a length-one condition is acceptable; what is an appropriate alternative function to use, please? The desired result for vectorz is: 0 0 0 70
Re: [R] apply if else statement to vector
On 02-10-2014, at 11:01, r...@openmailbox.org wrote:

Subscribers, What is the correct syntax to apply the 'if else' conditional statement to vector objects? [...] The help manual (?'if') explains that only a length-one condition is acceptable; what is an appropriate alternative function to use, please? The desired result for vectorz is: 0 0 0 70

And at the bottom of that page (section See Also) there is a reference to ifelse. Why didn't you have a look at that? Replace your complete if/else statement with

ifelse(vectorx > vectory, vectorx, 0)

Berend
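Both answers can be checked against the poster's desired output; a quick illustration (my own, not part of the thread):

```r
vectorx <- c(50, 50, 20, 70)
vectory <- c(50, 50, 20, 20)

ifelse(vectorx > vectory, vectorx, 0)  # vectorised if/else: [1]  0  0  0 70
vectorx * (vectorx > vectory)          # TRUE/FALSE coerce to 1/0: [1]  0  0  0 70
```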
Re: [R] Print list to text file with list elements names
OK I get it, all options work well now. Thank you!

-----Original Message-----
From: William Dunlap [mailto:wdun...@tibco.com]
Sent: 01 October 2014 19:05
To: Ingrid Charvet
Cc: r-help@r-project.org
Subject: Re: [R] Print list to text file with list elements names

You want to put lapply(myList, print) inside a call to invisible:

invisible(lapply(myList, print))

(or put the whole sink(file)/lapply(...)/sink() sequence into a function, or use tmp <- lapply(...) so autoprinting is suppressed).

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Oct 1, 2014 at 10:41 AM, Ingrid Charvet ingrid.char...@rms.com wrote:

Hi Bill, thanks for your suggestions. Regarding the list elements names, your idea works well - thank you. However, my first issue is not resolved. I omitted the second sink statement, and R still created a text file with duplicate list elements. Then I omitted both sink statements (keeping only the lapply), but obviously then no text file is created at all (although it does print properly on the R console)... I also tried:

func <- function(i) { write.table(inventory[[i]], file = sprintf("test_new%d", i)) }
lapply(1:10, func)

But here of course I get 10 different text files (one per list element) when what I would like is to have them all in one file...

Ingrid

-----Original Message-----
From: William Dunlap [mailto:wdun...@tibco.com]
Sent: 01 October 2014 15:54
To: Ingrid Charvet
Cc: r-help@r-project.org
Subject: Re: [R] Print list to text file with list elements names

Omit the sink() statements to see what is happening - lapply(myList, print) prints each item in the list, and then the output of lapply is printed via the autoprinting mechanism. If you put this into a function, or saved the return value of lapply into a variable, or wrapped the call to lapply in a call to invisible(), then the autoprinting would not happen. You said that you wanted the name of each list item printed before the item. print(myList) would do that, but I assume you've already tried that and didn't like the format.
You can get your own formatting of the name with something like

myList <- list(First = data.frame(x = 1:2, y = letters[1:2]),
               Second = data.frame(x = 1:3, z = LETTERS[24:26]))
invisible(lapply(seq_along(myList), function(i) {
    cat(sep = "", "\n", names(myList)[i], ":\n")
    print(myList[[i]])
}))

First:
  x y
1 1 a
2 2 b

Second:
  x z
1 1 X
2 2 Y
3 3 Z

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Oct 1, 2014 at 3:59 AM, Ingrid Charvet ingrid.char...@rms.com wrote:

Hi everyone, I want to write a list of data frames to a text file and preserve the names given to the list elements (I am using R 3.1.0). I tried:

setNames(myList, myNames)  # myNames is a vector of char elements, same length as myList
sink(sprintf("%s", filename))
lapply(myList, print)
sink()

And here I have two problems:
1. R writes each element of my list to the text file twice, so for example if I have a list with 2 elements (i.e. data frames) in it, it will write 4: in order data frame 1, data frame 2, data frame 1, data frame 2.
2. The names of list elements do not print to the text file.

Any suggestions to solve these issues would be appreciated!

Many thanks,
Ingrid
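Putting Bill's two suggestions together, an end-to-end sketch (the file name is illustrative):

```r
myList <- list(First  = data.frame(x = 1:2, y = letters[1:2]),
               Second = data.frame(x = 1:3, z = LETTERS[24:26]))

sink("myList.txt")               # redirect console output to a file
invisible(lapply(seq_along(myList), function(i) {
  cat("\n", names(myList)[i], ":\n", sep = "")
  print(myList[[i]])             # each element printed once, with its name
}))
sink()                           # restore normal console output
```

invisible() suppresses the autoprinting of lapply's return value, which was the cause of the duplicated output.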
Re: [R] How to check to see if a variable is within a range of another variable
Is there an easy way to check whether a variable is within a +/- 10% range of another variable in R?

You could use

2*abs(A-B)/(A+B) < 0.1

which avoids an apply(). I've assumed you meant "different by under 10% of the mean of the two", hence the 2/(A+B); if you meant 10% of something else, the same kind of thing should work with a different bottom line. For example, if you meant less than 10% of the smaller of the two,

abs(A-B)/pmin(A, B) < 0.1

would do that. And if you want to check that they all comply,

all( 2*abs(A-B)/(A+B) < 0.1 )

will do that.

I liked Peter Langfelder's suggestion of all.equal, though; an interesting use of a tolerance I've previously seen as there only to compensate for floating-point representation errors.

Steve E
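A quick numeric check of the first expression (the values are my own illustration):

```r
A <- c(100, 100, 100)
B <- c(105,  95, 120)

# within 10% of the mean of the pair?
2 * abs(A - B) / (A + B) < 0.1
# [1]  TRUE  TRUE FALSE   (120 differs from 100 by ~18% of their mean)
```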
Re: [R] Inverse Student t-value
Dear All,

By "a manual formula" I mean how to calculate that value by hand: for TINV(0.408831, 1221) the result is 4.0891672. Appreciate your help in advance. Cheers!

On Wed, Oct 1, 2014 at 9:15 PM, jlu...@ria.buffalo.edu wrote:

What do you mean by a manual formula?

Andre geomodel...@gmail.com
09/30/2014 11:54 PM
To jlu...@ria.buffalo.edu, cc Duncan Murdoch murdoch.dun...@gmail.com, r-help@r-project.org
Subject Re: [R] Inverse Student t-value

Hi JLucke, Maybe the old Excel function. TINV and T.INV.2T are the same function, the two-tailed inverse of the Student's t-distribution, while T.INV is used for the left-tailed inverse of the Student's t-distribution. I know the automatic functions in any software, but I just need a manual formula that computes (TINV or T.INV.2T) step by step, presented mathematically, through to the result. Thanks in advance. Cheers!

On Wed, Oct 1, 2014 at 2:39 AM, jlu...@ria.buffalo.edu wrote:

The website has your answer. The t-distribution is a regularized incomplete beta function. The incomplete beta function is given by R's pbeta function. You regularize it with R's beta function. Then you use R's uniroot function to find the inverse. Good homework problem.

Andre geomodel...@gmail.com
Sent by: r-help-boun...@r-project.org
09/30/2014 02:45 PM
To Duncan Murdoch murdoch.dun...@gmail.com, cc r-help@r-project.org
Subject Re: [R] Inverse Student t-value

Hi Duncan, Let me explain again: I just need a manual expression for the inverse Student t value. You could go to the web page http://www.danielsoper.com/statcalc3/calc.aspx?id=10 - that's an inverse Student t value calculator.
Do you know a manual expression to use? Cheers!

On Wednesday, October 1, 2014, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 30/09/2014 2:26 PM, Andre wrote:

Hi Duncan, Actually, I am trying to trace the formula for the critical value of Z, and the manual formula is

=(I7-1)/SQRT(I7)*SQRT((TINV(0.05/I7,I7-2))^2/(I7-2+TINV(0.05/I7,I7-2)))

So, I got a new problem with the TINV formula. I just need a manual equation for TINV.

Sorry, can't help. I'm not sure I understand what you want, but if it's a simple formula for quantiles of the t distribution, it doesn't exist.

Duncan Murdoch

Hope to solve this problem. Cheers!

On Wed, Oct 1, 2014 at 1:20 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 30/09/2014 2:11 PM, Andre wrote:

Hi Duncan, No, that's correct. Actually, I have the data set below;

Then it seems Excel is worse than I would have expected. I confirmed R's value in two other pieces of software, OpenOffice and some software I wrote a long time ago based on an algorithm published in 1977 in Applied Statistics. (They are probably all using the same algorithm. I wonder what Excel is doing?)

N = 1223
alpha = 0.05
Then probability = 0.05/1223 = 0.408831
degrees of freedom = 1223 - 2 = 1221
So, TINV(0.408831, 1221) returns 4.0891672

Could you show me a manual equation in more detail? I really appreciate it if you may give more detail.

I already gave you the expression: abs(qt(0.408831/2, df=1221)). For more detail, I suppose you could look at the help page for the qt function, using help(qt).

Duncan Murdoch

Cheers!
On Wed, Oct 1, 2014 at 1:01 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 30/09/2014 1:31 PM, Andre wrote:

Dear Sir/Madam, I am trying to use the calculation for the two-tailed inverse of the Student's t-distribution function presented by Excel functions like =TINV(probability, deg_freedom). For instance: the Excel function =TINV(0.408831,1221) returns 4.0891672. Would you like to show me a manual calculation for this? Appreciate your help in advance.

That number looks pretty far off the true value. Have you got
Re: [R] Inverse Student t-value
By "a manual formula" I mean how to calculate that value by hand: for TINV(0.408831, 1221) the result is 4.0891672.

This is not really an R help question - you're specifically asking for something that _doesn't_ use R - but if you want to know how R does it, the full C and R code for pt and qt (the cumulative probability distribution function and the inverse, quantile, function respectively) is available in the R source code, which you can obtain from CRAN; see http://CRAN.R-project.org/mirrors.html - see the source code link from any of the listed mirrors.

S Ellison
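Duncan's qt() expression can be wrapped as a small helper; the second call and its comment are my own observation about the thread's numbers (0.05/1223 is really 4.08831e-05, not 0.408831):

```r
# Excel's two-tailed TINV(p, df) expressed with R's qt(), per Duncan Murdoch
tinv_excel <- function(p, df) abs(qt(p / 2, df = df))

tinv_excel(0.408831, 1221)     # well below 1, nowhere near Excel's 4.0891672
tinv_excel(0.05 / 1223, 1221)  # 0.05/1223 is 4.08831e-05; this value is close
                               # to Excel's 4.0891672, which suggests that is
                               # the probability actually being inverted
```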
Re: [R] optimization question
Andras,

Is there an error in your post, or am I missing something? df[, 9] is made up of the last (9th) element of each of a, b, c, and d. The minimum value for sum(df[, 9]) is 0. Given your conditions, there are many, many ways to get this result. Here is just one example:

a <- c(1,1,1,1,1,0,0,0,0)
b <- c(1,0,1,1,1,1,0,0,0)
c <- c(1,0,1,1,0,0,0,0,0)
d <- c(1,1,1,1,0,1,0,0,0)
df <- rbind(a,b,c,d)
df <- cbind(df, h = c(sum(a)*8, sum(b)*8, sum(c)*8, sum(d)*8))
df <- cbind(df, df[,8]*c(1,2,3,2))

Jean

On Wed, Oct 1, 2014 at 7:30 PM, Andras Farkas motyoc...@yahoo.com wrote:

Dear All, please provide help with the following: we have

a <- c(0,1,1,0,1,0,0,0,0)
b <- c(0,0,0,1,0,0,0,0,0)
c <- c(1,0,1,0,1,1,0,0,0)
d <- c(0,1,0,1,0,1,0,0,0)
df <- rbind(a,b,c,d)
df <- cbind(df, h = c(sum(a)*8, sum(b)*8, sum(c)*8, sum(d)*8))
df <- cbind(df, df[,8]*c(1,2,3,2))

I would like to minimize the value for sum(df[,9]) under the following conditions:
1. all values of a, b, c, and d are binary variables and are the variables to change to get the optimal result
2. sum(a), sum(b), and sum(d) should each be 5 or more
3. sum(c) should be 3 or less
4. a[2], a[3], b[2], d[7] and d[8] are fixed to their current values.

any thoughts or reference examples you could help with is greatly appreciated

thanks
andras
Re: [R] optimization question
There is an error Jean, I apologize... I made changes to the vectors and did not correct the bottom line... this is the correct run:

a <- c(0,1,1,0,1,0,0,0,0)
b <- c(0,0,0,1,0,0,0,0,0)
c <- c(1,0,1,0,1,1,0,0,0)
d <- c(0,1,0,1,0,1,0,0,0)
df <- rbind(a,b,c,d)
df <- cbind(df, h = c(sum(a)*8, sum(b)*8, sum(c)*8, sum(d)*8))
df <- cbind(df, df[,10]*c(1,2,3,2))

and I would like to minimize the value for sum(df[,11]) under the following conditions:
1. all values of a, b, c, and d are binary variables and are the variables to change to get the optimal result
2. sum(a), sum(b), and sum(d) should each be 5 or more
3. sum(c) should be 3 or less
4. a[2], a[3], b[2], d[7] and d[8] are fixed to their current values.

sorry about that and thanks for taking the time to respond,

Andras

On Thursday, October 2, 2014 8:40 AM, Adams, Jean jvad...@usgs.gov wrote:

Andras, Is there an error in your post, or am I missing something? [...]
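This is a small binary linear program, so one option (an assumption on my part; the lpSolve package is not mentioned in the thread) is to encode it directly for lpSolve::lp. The objective sum(df[,11]) equals 8*(1*sum(a) + 2*sum(b) + 3*sum(c) + 2*sum(d)):

```r
library(lpSolve)  # assumed installed

# decision variables: a[1:9], b[1:9], c[1:9], d[1:9] stacked as x[1:36]
w <- rep(8 * c(1, 2, 3, 2), each = 9)   # cost of a 1 in a, b, c, d

con <- matrix(0, nrow = 9, ncol = 36)
con[1,  1:9 ] <- 1                      # sum(a)
con[2, 10:18] <- 1                      # sum(b)
con[3, 19:27] <- 1                      # sum(c)
con[4, 28:36] <- 1                      # sum(d)
fixed <- c(2, 3, 11, 34, 35)            # positions of a[2], a[3], b[2], d[7], d[8]
for (i in seq_along(fixed)) con[4 + i, fixed[i]] <- 1

res <- lp("min", w, con,
          c(">=", ">=", "<=", ">=", "=", "=", "=", "=", "="),
          c(5, 5, 3, 5, 1, 1, 0, 0, 0), all.bin = TRUE)

res$objval                                # 8*(1*5 + 2*5 + 3*0 + 2*5) = 200
matrix(res$solution, 4, 9, byrow = TRUE)  # one optimal a, b, c, d
```

Because nothing forces any 1s into c, the optimum sets sum(c) = 0 and the three ">=" sums to exactly 5.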
Re: [R] How to check to see if a variable is within a range of another variable
On 01/10/2014 23:54, Peter Alspach wrote:

Tena koe Kate

If kateDF is a data.frame with your data, then

apply(kateDF, 1, function(x) isTRUE(all.equal(x[2], x[1], check.attributes = FALSE, tolerance = 0.1)))

comes close to (what I think) you want (but not to what you have illustrated in your 'eventual outcome'). Anyhow, it may be enough to allow you to get there.

HTH
Peter Alspach

I seem to need to specify all.equal(..., scale) to get correct values for some data/tolerances:

aDF <- data.frame(a = 1:10, b = 10:1)
apply(aDF, 1, function(x) isTRUE(all.equal(x[2], x[1], check.attributes = FALSE, tolerance = 5, scale = 1)))
 [1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
apply(aDF, 1, function(x) isTRUE(all.equal(x[2], x[1], check.attributes = FALSE, tolerance = 5)))
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE

I'm probably being stupid, but from reading ?all.equal I would have expected scale = 1 and the default scale = NULL to give identical results for the length-one numerics being passed to all.equal. Can anyone explain?

KJ
[R] merge by time, certain value if 5 min before and after an event
Hello!

I hope someone can help me. It would save me days of work. Thanks in advance! I have two dataframes which look like these:

myframe <- data.frame(Timestamp = c("24.09.2012 09:00:00", "24.09.2012 10:00:00", "24.09.2012 11:00:00"),
                      Event = c("low", "high", "low"))
myframe

mydata <- data.frame(Timestamp = c("24.09.2012 09:05:01", "24.09.2012 09:49:50", "24.09.2012 09:51:01", "24.09.2012 10:04:50", "24.09.2012 10:05:10"),
                     location = c(1, 2, 3, 1, 5))
mydata

I want to merge them by time so I have a dataframe which looks like this in the end (i.e. low during 5 min before and after high):

result <- data.frame(Timestamp = c("24.09.2012 09:05:01", "24.09.2012 09:49:50", "24.09.2012 09:51:01", "24.09.2012 10:04:50", "24.09.2012 10:05:10"),
                     location = c(1, 2, 3, 1, 5),
                     Event = c("low", "low", "high", "high", "low"))
result

Does anyone know how to merge them?

Best regards,
Dagmar
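One possible starting point, a sketch under my own reading of the rule (the poster's desired output does not quite match a symmetric 5-minute window, so the edge logic will need adapting):

```r
# the poster's data, repeated so the sketch is self-contained
myframe <- data.frame(Timestamp = c("24.09.2012 09:00:00", "24.09.2012 10:00:00",
                                    "24.09.2012 11:00:00"),
                      Event = c("low", "high", "low"))
mydata <- data.frame(Timestamp = c("24.09.2012 09:05:01", "24.09.2012 09:49:50",
                                   "24.09.2012 09:51:01", "24.09.2012 10:04:50",
                                   "24.09.2012 10:05:10"),
                     location = c(1, 2, 3, 1, 5))

myframe$Timestamp <- as.POSIXct(myframe$Timestamp, format = "%d.%m.%Y %H:%M:%S")
mydata$Timestamp  <- as.POSIXct(mydata$Timestamp,  format = "%d.%m.%Y %H:%M:%S")

# carry the most recent event at or before each fix forward ...
idx <- findInterval(mydata$Timestamp, myframe$Timestamp)
mydata$Event <- as.character(myframe$Event)[idx]

# ... then relabel fixes within 5 minutes of a "high" event time
highs <- myframe$Timestamp[myframe$Event == "high"]
near  <- sapply(mydata$Timestamp,
                function(t) any(abs(difftime(t, highs, units = "mins")) <= 5))
mydata$Event[near] <- "high"
mydata
```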
Re: [R] How to check to see if a variable is within a range of another variable
Keith Jewell said:

... from reading ?all.equal I would have expected scale = 1 and the default scale = NULL to give identical results for the length-one numerics being passed to all.equal. Can anyone explain?

Inspecting the code in all.equal.numeric, I find

xy <- mean((if (cplx) Mod else abs)(target - current))
if (is.null(scale)) {
    xn <- mean(abs(target))
    if (is.finite(xn) && xn > tolerance) {
        xy <- xy/xn
        "relative"
    } else "absolute"
} else {
    xy <- xy/scale
    if (scale == 1) "absolute" else "scaled"
}

target is the first number supplied, current the second; in your example code that is x[2] and x[1] respectively. Later on, xy is compared to the tolerance. In the code, scale = NULL and scale = 1 are clearly treated differently; in particular, when scale is NULL the absolute difference is divided by the first of the (two) numbers if that number is greater than the tolerance, or is used unchanged otherwise, while if scale = 1 it is divided by scale throughout. That would mean that for scale = NULL, your example will divide the difference by 10, 9, ..., 1 in that order before comparing with the tolerance, and if scale = 1 it will simply compare the difference directly with the tolerance.

Calculating your case through for scale = NULL, xy will take the values

ifelse(b > 5, abs(a-b)/b, abs(a-b))
 [1] 0.9000000 0.7777778 0.6250000 0.4285714 0.1666667 1.0000000 3.0000000
 [8] 5.0000000 7.0000000 9.0000000

Of those, only the last 2 are greater than 5, which is the result you found. By contrast, when scale = 1, xy takes the values

abs(a-b)
 [1] 9 7 5 3 1 1 3 5 7 9

of which the two at each end are both greater than 5.

That fairly complicated behaviour is probably a good reason to use a simpler calculation in which you can see how the difference is being scaled ... ;)

Steve Ellison
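The scale = NULL vs scale = 1 difference shows up even on a single pair of numbers (my own minimal illustration):

```r
# target = 10, current = 1, tolerance = 5
all.equal(10, 1, tolerance = 5)             # relative: 9/10 = 0.9 <= 5, so TRUE
all.equal(10, 1, tolerance = 5, scale = 1)  # absolute: 9 > 5, so a difference message
isTRUE(all.equal(10, 1, tolerance = 5, scale = 1))  # FALSE
```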
[R] accumulation curve for kernel density using adehabitatHR
Dear all,

I am trying to create an accumulation curve for kernel density estimation (KDE) of home range size, with kernel density on the y-axis and number of telemetry fixes on the x-axis. I am using the kernelUD and getverticeshr functions from the adehabitatHR package. However, I have a problem: when re-sampling a small number of points to calculate the KDE, the function getverticeshr (and sometimes kernelUD also) returns an error (maybe because the fixes are too close and it is impossible to normalize the surface, but I am not sure), and then it is impossible to build the accumulation curve. I tried to throw out the cases in which the functions return errors and sample once again, but that produces a bias towards high and unrealistic values of home range estimation. Has anyone worked with this before, or have any clues on how to solve it?

Thank you very much.

Bernardo Niebuhr
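One pragmatic option, sketched under my own assumptions (that 'fixes' is a SpatialPoints object suitable for kernelUD, and that recording failures rather than resampling is acceptable): wrap the estimate in tryCatch so a failed resample contributes NA instead of triggering the resample-until-success loop that biases the curve upward.

```r
library(adehabitatHR)  # sketch only; not tested here

# area of the 95% kernel home range for a random subsample of n fixes,
# or NA if kernelUD/getverticeshr fails on that subsample
kde_area <- function(fixes, n, pct = 95) {
  sub <- fixes[sample(length(fixes), n), ]
  tryCatch(getverticeshr(kernelUD(sub), percent = pct)$area,
           error = function(e) NA)   # keep the failure; do not resample
}

# accumulation curve: summarise over replicates, ignoring the NAs
# sizes <- seq(5, 100, by = 5)
# areas <- sapply(sizes, function(n)
#   median(replicate(30, kde_area(fixes, n)), na.rm = TRUE))
```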
[R] Befuddled by ddply
Folks,

I have the following data:

mdf <- structure(list(a = 1:3, b = c(10, 20, 30)), .Names = c("a", "b"),
                 row.names = c(NA, -3L), class = "data.frame")

And function:

defCurveBreak <- function(x, y) { cumsum(rep(diff(c(0, x)), each = y)/y) }

lapply'ing to get the result foo:

foo <- data.frame(lapply(mdf, function(x, y) defCurveBreak(x, y), 4))
foo
      a    b
1  0.25  2.5
2  0.50  5.0
3  0.75  7.5
4  1.00 10.0
5  1.25 12.5
6  1.50 15.0
7  1.75 17.5
8  2.00 20.0
9  2.25 22.5
10 2.50 25.0
11 2.75 27.5
12 3.00 30.0

Which all works fine. I was wondering: is there a way to do this using ddply? Is there a reason to do this using ddply rather than the above idiom? I spent a bunch of time trying to figure out how to set it up in ddply (is that the wrong tool?) to no avail.

Thanks for your time,
Best,
KW
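For what it's worth, a sketch (my own, assuming plyr is installed) of why ddply is arguably the wrong tool here: ddply is built around split-apply-combine on row groups, while defCurveBreak's diff/cumsum chain runs over whole columns, so splitting by rows restarts the chain in every group and changes the answer.

```r
library(plyr)  # assumption: plyr is installed

mdf <- structure(list(a = 1:3, b = c(10, 20, 30)), .Names = c("a", "b"),
                 row.names = c(NA, -3L), class = "data.frame")
defCurveBreak <- function(x, y) cumsum(rep(diff(c(0, x)), each = y)/y)

# diff(c(0, x)) and cumsum() restart inside each one-row group, so the
# 'step' column is NOT the cumulative b column from the lapply() version
ddply(mdf, .(a), function(d) data.frame(step = defCurveBreak(d$b, 4)))
```

The column-wise lapply/data.frame idiom in the question is the natural fit for this computation.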
[R] (no subject)
Dear UseRs,

I obtained the following results from an Anderson-Darling goodness-of-fit test.

dput(EB)
structure(c(2.911, 0.9329, 0.818, 1.539, 0.604, 0.5142, 0.4344,
0.801, 0.963, 0.9925, 0.933, 0.956, 0.883, 0.572), .Dim = c(7L, 2L),
.Dimnames = list(c("EXP", "GUM", "GENLOG", "GENPARETO", "GEV",
"LN", "PAR3"), c("A2", "P")))

My question is how to interpret these results? More precisely, which distribution fitted well on my data?

Thank you very much in advance,
Eliza
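Assuming A2 is the Anderson-Darling statistic and P its p-value (an assumption; the thread does not say which test implementation was used), a larger P means less evidence against that candidate distribution, and none of these p-values fall below the usual 0.05 cutoff. A quick ranking:

```r
EB <- structure(c(2.911, 0.9329, 0.818, 1.539, 0.604, 0.5142, 0.4344,
                  0.801, 0.963, 0.9925, 0.933, 0.956, 0.883, 0.572),
                .Dim = c(7L, 2L),
                .Dimnames = list(c("EXP", "GUM", "GENLOG", "GENPARETO",
                                   "GEV", "LN", "PAR3"), c("A2", "P")))

# sort candidates by p-value: larger P = less evidence against the fit
EB[order(EB[, "P"], decreasing = TRUE), ]
```

On this reading GENLOG (P = 0.9925) is the least contradicted by the data, but a goodness-of-fit p-value ranks plausibility rather than proving one distribution "correct".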
Re: [R] Inverse Student t-value
Andre,

For the last time, there is NO simple rational approximation to the quantiles of the t-distribution. From the R help, qt (= TINV) is based on:

Hill, G. W. (1970) Algorithm 396: Student's t-quantiles. Communications of the ACM, 13(10), 619-620.

Hill, G. W. (1981) Remark on Algorithm 396. ACM Transactions on Mathematical Software, 7, 250-251.

Both of these articles can be googled and the source code obtained from the ACM. (I have done it.) The code uses a rational approximation to the inverse, for which you can compute values by hand. Then you can see why we use qt(). :-)

Joe

Andre geomodel...@gmail.com
10/02/2014 07:27 AM
To jlu...@ria.buffalo.edu, cc Duncan Murdoch murdoch.dun...@gmail.com, r-help@r-project.org
Subject Re: [R] Inverse Student t-value

Dear All, by "a manual formula" I mean how to calculate that value by hand: for TINV(0.408831, 1221) the result is 4.0891672. Appreciate your help in advance. Cheers!

On Wed, Oct 1, 2014 at 9:15 PM, jlu...@ria.buffalo.edu wrote:

What do you mean by a manual formula?

On Wed, Oct 1, 2014 at 2:39 AM, jlu...@ria.buffalo.edu wrote:

The website has your answer. The t-distribution is a regularized incomplete beta function. The incomplete beta function is given by R's pbeta function.
You regularize it with R's beta function. Then you use R's uniroot function to find the inverse. Good homework problem. Andre geomodel...@gmail.com Sent by: r-help-boun...@r-project.org 09/30/2014 02:45 PM To Duncan Murdoch murdoch.dun...@gmail.com, cc r-help@r-project.org Subject Re: [R] Inverse Student t-value Hi Duncan, Let me explain again: I just need a manual expression for the inverse Student t-value. You could go to the web page http://www.danielsoper.com/statcalc3/calc.aspx?id=10 That's an inverse Student t-value calculator. Do you know the manual expression it uses? Cheers! On Wednesday, October 1, 2014, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 30/09/2014 2:26 PM, Andre wrote: Hi Duncan, Actually, I am trying to trace the formula for the critical value of Z, and the manual formula is =(I7-1)/SQRT(I7)*SQRT((TINV(0.05/I7,I7-2))^2/(I7-2+TINV(0.05/I7,I7-2))) So, I have a new problem, with the TINV formula. I just need a manual equation for TINV. Sorry, can't help. I'm not sure I understand what you want, but if it's a simple formula for quantiles of the t distribution, it doesn't exist. Duncan Murdoch Hope you can solve this problem. Cheers! On Wed, Oct 1, 2014 at 1:20 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 30/09/2014 2:11 PM, Andre wrote: Hi Duncan, No, that's correct. Actually, I have the data set below; Then it seems Excel is worse than I would have expected. I confirmed R's value in two other pieces of software, OpenOffice and some software I wrote a long time ago based on an algorithm published in 1977 in Applied Statistics. (They are probably all using the same algorithm. I wonder what Excel is doing?) N = 1223, alpha = 0.05. Then probability = 0.05/1223 = 0.0000408831, degrees of freedom = 1223-2 = 1221. So, TINV(0.0000408831, 1221) returns 4.0891672. Could you show me a manual equation in more detail? I would really appreciate more detail. I already gave you the expression: abs(qt(0.0000408831/2, df=1221)). 
For more detail, I suppose you could look at the help page for the qt function, using help(qt). Duncan Murdoch Cheers! On Wed, Oct 1, 2014 at 1:01 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 30/09/2014 1:31 PM, Andre wrote: Dear Sir/Madam, I am trying to use the calculation for the two-tailed inverse of the Student's t-distribution function presented by Excel functions like =TINV(probability, deg_freedom). For instance: The Excel
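For reference, the pbeta/uniroot approach suggested in this thread can be sketched in a few lines of base R. This is a hedged illustration of the numerical-inversion idea, not the Hill (1970) rational-approximation algorithm that qt() actually uses; the 0.05/1223 and 1221 figures are just the example values from the thread.

```r
## Two-tailed "TINV"-style quantile by numerically inverting the t CDF:
## find t > 0 such that 2 * (1 - pt(t, df)) == p.
tinv2 <- function(p, df) {
  uniroot(function(t) 2 * (1 - pt(t, df)) - p,
          interval = c(0, 1000), tol = 1e-10)$root
}

p  <- 0.05 / 1223     # = 0.0000408831, the probability from the thread
df <- 1221
tinv2(p, df)          # ~ 4.089167, matching Excel's TINV
abs(qt(p / 2, df))    # the same value directly from qt()
```

Note that pt() itself is the regularized incomplete beta function in disguise, which is JLucke's point: the "manual formula" is an integral, so some numerical root-finding is unavoidable.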
[R] tkProgressBar without progress in foreach %dopar%
I have a list (temp.data) with many raster data, and some computations run in parallel. n is the number of raster data and target is the mask. I'd like to use a progress bar. It is created, but while the loop is running the progress is not shown; the loop ends and the progress bar is closed. I've tried to use the functions pbCreate/pbStep in the raster package, but without success... Does someone have any idea...

mypb <- tkProgressBar(title = "R progress bar", label = "", min = 0, max = n, initial = 0, width = 300)
# creating a computing cluster
cl <- makeCluster(detectCores(), type = 'SOCK')
registerDoParallel(cl, cores = detectCores())
foreach(i = 1:n, .packages = c('tcltk', 'rgdal', 'raster')) %dopar% {
  Sys.sleep(.1)
  setTkProgressBar(mypb, i, title = "R progress bar", label = NULL)
  temp.data[[i]] <- mask(temp.data[[i]], target, maskvalue = as.numeric(class.outside))
  writeRaster(temp.data[[i]], filename = files.list[i], format = 'GTiff', overwrite = TRUE)
}
stopCluster(cl)
close(mypb)

Thanks Alexsandro [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem using predict() to draw lines() if predictor variable is logged on the fly
Hello, I am plotting glms with logged predictors. I would like to define the logged variables on the fly rather than hard-coding them into my dataframe. This code (with hard-coded logged variables) works just fine:

xx <- seq(-2, .5, by = 0.1)
lines(xx, predict(power, list(LogArKm00 = xx), type = "response"), col = "black", lwd = 2, lty = 1) # LogArKm00 is a variable in my dataframe

but my attempt to define them on the fly doesn't work (see error below):

plot(log(WbAb, 10) ~ log(ArKm00, 10), data = dat) # power model
xx <- seq(-2, .5, by = 0.1)
lines(xx, predict(power, list(log(ArKm00, 10) = xx), type = "response"), col = "black", lwd = 2, lty = 1) # trying to log the variable on the fly

Error: unexpected '=' in "lines(xx,predict(power,list(log(ArKm00,10)="

I would really appreciate any help sorting this out! Many thanks Mark [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
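When a transformation is written into the model formula, predict() expects the *raw* predictor in newdata and applies the log itself, so the list element must be named after the variable, not after the expression. A hedged sketch with made-up data (the poster's WbAb/ArKm00 data and 'power' model are not available, so the names and values below are stand-ins):

```r
## Synthetic stand-in data on roughly the scale suggested by the post.
set.seed(1)
dat      <- data.frame(ArKm00 = 10^runif(50, -2, .5))
dat$WbAb <- 10^(1 + 0.5 * log(dat$ArKm00, 10) + rnorm(50, sd = 0.05))

## The transformation lives in the formula...
power <- glm(log(WbAb, 10) ~ log(ArKm00, 10), data = dat)

plot(log(WbAb, 10) ~ log(ArKm00, 10), data = dat)
xx <- seq(-2, .5, by = 0.1)
## ...so newdata supplies raw ArKm00; xx is on the log10 scale, hence 10^xx.
yy <- predict(power, newdata = list(ArKm00 = 10^xx), type = "response")
lines(xx, yy, lwd = 2)
```

The key line is `newdata = list(ArKm00 = 10^xx)`: predict() then evaluates `log(ArKm00, 10)` on those raw values exactly as the formula specifies.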
[R] order clusters in heatmap.2
Hi , Is there a way to order clusters in heatmap.2 ? Best Regards, Asmaa [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
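One common approach, sketched here with synthetic data rather than the poster's: heatmap.2() (gplots) accepts a precomputed dendrogram through its Rowv/Colv arguments, so you can build the clustering yourself, reorder its branches with stats::reorder(), and pass the result in.

```r
## Base-R part of the sketch: build and reorder a row dendrogram.
set.seed(7)
x  <- matrix(rnorm(40), nrow = 8)
hc <- hclust(dist(x))
dd <- as.dendrogram(hc)

## reorder() rearranges branches (within the constraints of the tree)
## so that leaves with low weights come first:
dd2 <- reorder(dd, wts = rowMeans(x))

## Then hand the reordered dendrogram to heatmap.2, e.g.:
## library(gplots)
## heatmap.2(x, Rowv = dd2, dendrogram = "row")
```

The weights vector is whatever ordering criterion you want (row means, a trait value, cluster labels); the tree topology is preserved, only the left-to-right order of branches changes.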
[R] concentrated likelihood in dlm
Hi, I am trying to do maximum likelihood estimation on a univariate structural model with diffuse components in dlm. The package already has an MLE function, but I would like to implement two enhancements, both of which are discussed in Harvey's "Forecasting, Structural Time Series Models and the Kalman Filter", section 3.4: 1. drop the first d components for the diffuse terms to construct a proper prior; 2. swap in the concentrated univariate likelihood for the standard multivariate one (the concentrated variance being the prediction error variance). The first item is easy, so my question concerns item (2). My hope is to re-use the MLE calculation apparatus in dlm, but swap out the one line in dlmLL that accumulates the likelihood. The concentrated likelihood requires two ingredients that come from the filter: the one-step-ahead prediction error and its variance. I believe that the prediction errors are easy to find: there are functions that produce them outside dlmLL, and they are also pretty easy to locate in dlmLL itself. I am less clear how to obtain the variance, which is the univariate Var(y(t)|y(t-1)) and is denoted f_t in Harvey's book. Here y is the observation. My confidence is low because the function is written for a multivariate filter in SVD form and my R skills are beginner-intermediate. At best I think f is held there in SVD form, and I'm concerned I might have to convert it if I want to work with it as a scalar. Can anyone help me with an example of how to obtain f_t, either within dlmLL or outside it? I appreciate any hints I can get. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
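For reference, once you have the one-step prediction errors and their variances from the filter, Harvey's concentrated log-likelihood itself is only a few lines. This is a hedged sketch of the formula (my reading of Harvey sec. 3.4), not a drop-in replacement for dlm's internals; e and f stand for the errors nu_t and variances f_t, however you extract them.

```r
## Concentrated Gaussian log-likelihood for a univariate series:
## sigma^2 is profiled out as  sigma2.hat = (1/n) * sum(e^2 / f),
## leaving
##   logLik = -n/2*(log(2*pi) + 1) - n/2*log(sigma2.hat) - 1/2*sum(log(f))
concLL <- function(e, f) {
  n <- length(e)
  sigma2.hat <- sum(e^2 / f) / n
  -0.5 * n * (log(2 * pi) + 1) - 0.5 * n * log(sigma2.hat) - 0.5 * sum(log(f))
}

## Toy check: with f == 1 this reduces to the profiled i.i.d. normal logLik.
e <- c(0.3, -1.2, 0.5, 0.1)
concLL(e, rep(1, 4))
```

The benefit is exactly the one the post is after: the optimizer searches over one fewer parameter, since sigma^2 drops out analytically.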
Re: [R] Custom Function Not Completely working
Thank you Duncan! Dan -----Original Message----- From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com] Sent: Monday, September 29, 2014 6:23 PM To: Lopez, Dan; R help (r-help@r-project.org) Subject: Re: [R] Custom Function Not Completely working On 29/09/2014, 9:07 PM, Lopez, Dan wrote: Hi R Experts, I'm in the process of creating a set of convenience functions for myself. I am somewhat new to writing functions in R. In this example I wanted to load both existing files, called KuhnPerform.Rdata and KuhnPerform.Rhistory. However, it only works for loading the .Rhistory file and not the .Rdata file. Can someone please tell me how to fix this so that it loads both .Rdata and .Rhistory? I checked with dir() that these files are in my working directory and are spelled correctly.

loads <- function(filename = "KuhnPerform") {
  load(paste0(filename, ".Rdata"))            # Load a previously saved workspace.
  loadhistory(paste0(filename, ".Rhistory"))  # Load a previously saved history file.
}

You're loading the objects from the Rdata file into the evaluation frame of your function, and they go away afterwards. You need to specify where they should be loaded using the envir argument, e.g.

load(paste0(filename, ".Rdata"), envir = parent.frame())

(The funny thing here is that the default for envir is parent.frame(), but it means something different in that context.) Duncan Murdoch Dan Workforce Analyst LLNL [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
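Duncan's point can be seen with a small self-contained demo (the helper names f_wrong/f_right are made up for illustration; only load() is shown, since loadhistory() needs a real .Rhistory file):

```r
## Without envir, load() puts the objects in the function's own evaluation
## frame, which is discarded when the function returns.
f_wrong <- function(path) load(path)

## With envir = parent.frame(), the objects land in the *caller's* frame.
f_right <- function(path) load(path, envir = parent.frame())

tf <- tempfile(fileext = ".Rdata")
x <- 42
save(x, file = tf)
rm(x)

f_wrong(tf); exists("x")   # FALSE -- the loaded x was local to f_wrong
f_right(tf); exists("x")   # TRUE  -- x is back in the calling environment
```

The same one-argument change (`envir = parent.frame()`) is all the original loads() function needs for its load() call.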
Re: [R] Best Beginner Books?
It depends quite a bit on where you are coming from. If you come from the mathematical side of things, i.e. you're not scared at the thought of multiplying matrices, know a good deal about statistics, and have some experience with programming languages, the document by Venables and Smith, An Introduction to R (http://cran.r-project.org/manuals.html), may be the most efficient way to get started. My book, originally written for a course that used Altman's book on medical statistics as the main text, is aimed at people who have been (or are being) exposed to the basic set of statistical methods and need to know how to do things in R. Yet others (e.g. Matloff) focus on the programming-language aspects of R, some specifically target Data Science (whatever that means), etc. So you may need to shop around a little. -pd On 02 Oct 2014, at 08:22, arnaud gaboury arnaud.gabo...@gmail.com wrote: On Thu, Oct 2, 2014 at 4:54 AM, Jason Eyerly teamtraders3...@gmail.com wrote: A lot of excellent suggestions! Thank you everyone for the input. I've purchased via Amazon: A Beginner's Guide to R by Zuur; Data Manipulation with R by Spector; Introductory Statistics with R by Peter Dalgaard. Are there any suggestions as to which order I should read them in when they arrive? For best comprehension, I'm thinking Introductory Statistics with R would be the best to start with, and perhaps the beginner's guide after that, or vice versa? I would not start with a whole book, but a good short intro (~30-40 pages). Feel free to pick up free publications here: https://github.com/gabx/r-project/tree/master/documentation. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] tkProgressBar without progress in foreach %dopar%
On 02/10/2014 16:40, Alexsandro Cândido de Oliveira Silva wrote: I have a list (temp.data) with many raster data, and some computations run in parallel. n is the number of raster data and target is the mask. I'd like to use a progress bar. It is created, but while the loop is running the progress is not shown; the loop ends and the progress bar is closed. I've tried to use the functions pbCreate/pbStep in the raster package, but without success... Does someone have any idea... You are using parallel processes here. You do not tell us what packages you are actually using, but my guess is that setTkProgressBar(mypb, i, title = "R progress bar", label = NULL) updates a copy of mypb in the worker process, but it is the master process which is displaying the progress bar. But without the complete reproducible example the posting guide asked for, we can only guess.

mypb <- tkProgressBar(title = "R progress bar", label = "", min = 0, max = n, initial = 0, width = 300)
# creating a computing cluster
cl <- makeCluster(detectCores(), type = 'SOCK')
registerDoParallel(cl, cores = detectCores())
foreach(i = 1:n, .packages = c('tcltk', 'rgdal', 'raster')) %dopar% {
  Sys.sleep(.1)
  setTkProgressBar(mypb, i, title = "R progress bar", label = NULL)
  temp.data[[i]] <- mask(temp.data[[i]], target, maskvalue = as.numeric(class.outside))
  writeRaster(temp.data[[i]], filename = files.list[i], format = 'GTiff', overwrite = TRUE)
}
stopCluster(cl)
close(mypb)

Thanks Alexsandro __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. 
Ripley, rip...@stats.ox.ac.uk Emeritus Professor of Applied Statistics, University of Oxford 1 South Parks Road, Oxford OX1 3TG, UK __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
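A common workaround for the problem Prof. Ripley diagnoses is to keep the progress bar in the master process and update it between chunks of parallel work. A hedged sketch using base 'parallel' with a trivial task, rather than the poster's foreach/doParallel/raster setup (txtProgressBar is used so it runs anywhere; tkProgressBar would be driven the same way):

```r
library(parallel)

n  <- 20
cl <- makeCluster(2)
pb <- txtProgressBar(min = 0, max = n, style = 3)

res <- vector("list", n)
for (i in seq(1, n, by = 4)) {          # submit 4 tasks at a time
  idx <- i:min(i + 3, n)
  res[idx] <- parLapply(cl, idx, function(k) k^2)
  setTxtProgressBar(pb, max(idx))       # the bar advances in the master
}

close(pb)
stopCluster(cl)
```

The granularity of the bar is the chunk size, which is the price of updating only from the master; finer-grained reporting needs some master-worker communication channel.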
[R] Class ltraj and function as.ltraj
Hello R Help Group: I have been struggling to create an object of class ltraj with the function as.ltraj (adehabitatLT) with my bird data. Regarding my data structure, I have GPS data for 10 birds that transmit three times/day, over the course of a year (with missing data). I have a L10.csv file with the following headers: Craneid, Date, Time, Long, Lat, Habitat, ID (for burst).

Step 1: Bring in my data with stringsAsFactors=FALSE to convert all variables (except Lat/Long) from factor to strings. Thanks to David Carlson for that tip!

Step 2: Transform my date and time vectors into POSIXct as follows:

datetime <- as.POSIXct(strptime(paste(L10$Date, L10$Time, sep = " "), format = "%m/%d/%Y %H:%M:%S", tz = "America/Chicago"))

Thanks to Petr Pikal for that tip! Result:

head(datetime)
[1] "2011-07-10 17:00:38 CDT" "2011-07-11 00:01:06 CDT"
[3] "2011-07-11 08:00:38 CDT" "2011-07-11 17:00:38 CDT"
[5] "2011-07-12 00:01:06 CDT" "2011-07-12 08:00:38 CDT"

Good so far...

Step 3:

Coord <- L10[c("Longitude", "Latitude")]
head(Coord)
  Longitude Latitude
1    522598  3360285
2    522579  3360174
3    522618  3360274
4    522656  3360196
5    522397  3360207
6    522425  3360285

Good so far... now comes the tricky part for me.

Step 4:

Craneid <- as.character(L10$Craneid)
id <- as.character(L10$ID)

Step 5:

Test <- as.ltraj(Coord, datetime, Craneid, burst = id, typeII = TRUE)

Drum roll please...

Error in as.ltraj(Coord, datetime, Craneid, burst = id, typeII = TRUE) : non unique dates for a given burst

I include my data.frame for your review. 
head(l10b)
     Longitude Latitude            datetime Craneid    id
1       522598  3360285 2011-07-10 17:00:38   L1_10 L1_10
2       522579  3360174 2011-07-11 00:01:06   L1_10 L1_10
3       522618  3360274 2011-07-11 08:00:38   L1_10 L1_10
4       522656  3360196 2011-07-11 17:00:38   L1_10 L1_10
5       522397  3360207 2011-07-12 00:01:06   L1_10 L1_10
6       522425  3360285 2011-07-12 08:00:38   L1_10 L1_10

     Longitude Latitude            datetime Craneid    id
3803    558205  3346410 2011-04-15 17:00:38   L5_10 L5_10
3804    552813  3341251 2011-04-16 08:00:38   L5_10 L5_10
3805    552784  3341373 2011-04-28 08:00:38   L5_10 L5_10
3806    552833  3341262 2011-04-28 17:00:38   L5_10 L5_10
3807    573502  3407390 2011-06-21 17:00:38   L8_10 L8_10
3808    573271  3407499 2011-06-23 08:00:38   L8_10 L8_10

I have checked and re-checked for duplicates and there are no duplicates. However, when I ask for duplicates in the datetime I get some "FALSE" but a lot of "TRUE"s. So, I am thinking it has to do with the fact that R is not picking up the individual birds which were monitored over the same time period. How do I structure my data in R to recognize the 10 separate birds with their associated coordinates and time stamps? I would ultimately like to run Bias Bridge Movement on these data but I can't get past square one! Help! Thanks in advance for any and all assistance you can provide. You all are so valuable. TLP [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
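The error means some burst contains repeated timestamps, even when whole rows are unique; timestamps shared *across* birds are harmless, which is why duplicated() on the bare datetime column returns so many TRUEs. A hedged base-R check, using toy data in the shape of the post (the column names are taken from it; the values are invented):

```r
## Two birds sharing the same transmission times -- fine for as.ltraj,
## which only objects to repeated date-times within one burst.
l10b <- data.frame(
  datetime = as.POSIXct(rep(c("2011-07-10 17:00:38", "2011-07-11 00:01:06"), 2)),
  id       = rep(c("L1_10", "L5_10"), each = 2)
)

## Count duplicated timestamps *within* each burst:
dup.by.burst <- tapply(l10b$datetime, l10b$id, function(x) sum(duplicated(x)))
dup.by.burst                     # 0 for every burst here, so as.ltraj is happy

## Checking the whole column instead is misleading: it flags times that
## are merely shared across birds.
sum(duplicated(l10b$datetime))   # 2
```

If dup.by.burst shows nonzero counts on the real data, those are the rows as.ltraj is complaining about (often caused by the burst vector not actually separating the birds, or by DST fold-overs when parsing times).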
Re: [R] Loop does not work: Error in else statement (II)
Jim, Thanks for the comment about else! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Using PCA to filter a series
I have four time-series of similar data. I would like to combine these into a single, clean time-series. I could simply find the mean of each time period, but I think that using principal components analysis should extract the most salient pattern and ignore some of the noise. I can compute the components using princomp:

d1 <- c(113, 108, 105, 103, 109, 115, 115, 102, 102, 111, 122, 122, 110, 110, 104, 121, 121, 120, 120, 137, 137, 138, 138, 136, 172, 172, 157, 165, 173, 173, 174, 174, 119, 167, 167, 144, 170, 173, 173, 169, 155, 116, 101, 114, 114, 107, 108, 108, 131, 131, 117, 113)
d2 <- c(138, 115, 127, 127, 119, 126, 126, 124, 124, 119, 119, 120, 120, 115, 109, 137, 142, 142, 143, 145, 145, 163, 169, 169, 180, 180, 174, 181, 181, 179, 173, 185, 185, 183, 183, 178, 182, 182, 181, 178, 171, 154, 145, 147, 147, 124, 124, 120, 128, 141, 141, 138)
d3 <- c(138, 120, 129, 129, 120, 126, 126, 125, 125, 119, 119, 122, 122, 115, 109, 141, 144, 144, 148, 149, 149, 163, 172, 172, 183, 183, 180, 181, 181, 181, 173, 185, 185, 183, 183, 184, 182, 182, 181, 179, 172, 154, 149, 156, 156, 125, 125, 115, 139, 140, 140, 138)
d4 <- c(134, 115, 120, 120, 117, 123, 123, 128, 128, 119, 119, 121, 121, 114, 114, 142, 145, 145, 144, 145, 145, 167, 172, 172, 179, 179, 179, 182, 182, 182, 182, 182, 184, 184, 182, 184, 183, 183, 181, 179, 172, 149, 149, 149, 149, 124, 124, 119, 131, 135, 135, 134)

pca <- princomp(cbind(d1, d2, d3, d4))
plot(pca$scores[,1])

This seems to have created the clean pattern I want, but I would like to project the first component back into the original axes. Is there a simple way to do that? Jonathan B. Thayn [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
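"Projecting the first component back into the original axes" is a rank-1 reconstruction: PC1 scores times the PC1 loadings, plus the column means that princomp removed. A hedged sketch with synthetic stand-ins for d1..d4 so it runs standalone (princomp's default, uncentered-only behaviour is assumed, i.e. scores are computed from mean-centred data):

```r
## Four noisy copies of one shared signal, same length as the posted series.
set.seed(42)
base <- cumsum(rnorm(52))
X    <- sapply(1:4, function(j) 100 + 10 * base + rnorm(52))  # stand-ins for d1..d4

pca <- princomp(X)

## Rank-1 reconstruction: a 52 x 4 matrix of "cleaned" series in the
## original units, one denoised column per input series.
rec1 <- pca$scores[, 1] %*% t(pca$loadings[, 1]) +
        matrix(pca$center, nrow(X), ncol(X), byrow = TRUE)

## Sanity check: using *all* components recovers X exactly.
recAll <- pca$scores %*% t(pca$loadings[, ]) +
          matrix(pca$center, nrow(X), ncol(X), byrow = TRUE)
max(abs(recAll - X))   # ~ 0
```

Each column of rec1 is the PC1-only version of the corresponding input series; averaging them, or taking any single column, gives a one-series "model" signal on the original scale.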
Re: [R] Using PCA to filter a series
On Oct 2, 2014, at 12:18 PM, Jonathan Thayn jth...@ilstu.edu wrote: I have four time-series of similar data. I would like to combine these into a single, clean time-series. I could simply find the mean of each time period, but I think that using principal components analysis should extract the most salient pattern and ignore some of the noise. I can compute the components using princomp:

d1 <- c(113, 108, 105, 103, 109, 115, 115, 102, 102, 111, 122, 122, 110, 110, 104, 121, 121, 120, 120, 137, 137, 138, 138, 136, 172, 172, 157, 165, 173, 173, 174, 174, 119, 167, 167, 144, 170, 173, 173, 169, 155, 116, 101, 114, 114, 107, 108, 108, 131, 131, 117, 113)
d2 <- c(138, 115, 127, 127, 119, 126, 126, 124, 124, 119, 119, 120, 120, 115, 109, 137, 142, 142, 143, 145, 145, 163, 169, 169, 180, 180, 174, 181, 181, 179, 173, 185, 185, 183, 183, 178, 182, 182, 181, 178, 171, 154, 145, 147, 147, 124, 124, 120, 128, 141, 141, 138)
d3 <- c(138, 120, 129, 129, 120, 126, 126, 125, 125, 119, 119, 122, 122, 115, 109, 141, 144, 144, 148, 149, 149, 163, 172, 172, 183, 183, 180, 181, 181, 181, 173, 185, 185, 183, 183, 184, 182, 182, 181, 179, 172, 154, 149, 156, 156, 125, 125, 115, 139, 140, 140, 138)
d4 <- c(134, 115, 120, 120, 117, 123, 123, 128, 128, 119, 119, 121, 121, 114, 114, 142, 145, 145, 144, 145, 145, 167, 172, 172, 179, 179, 179, 182, 182, 182, 182, 182, 184, 184, 182, 184, 183, 183, 181, 179, 172, 149, 149, 149, 149, 124, 124, 119, 131, 135, 135, 134)

pca <- princomp(cbind(d1, d2, d3, d4))
plot(pca$scores[,1])

This seems to have created the clean pattern I want, but I would like to project the first component back into the original axes. Is there a simple way to do that? Do you mean that you want to scale the scores on Axis 1 to the mean and range of your raw data? Or their mean and variance? See ?scale Jonathan B. 
Thayn __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Don McKenzie Research Ecologist Pacific WIldland Fire Sciences Lab US Forest Service Affiliate Professor School of Environmental and Forest Sciences College of the Environment University of Washington d...@uw.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Anderson-Darling gof test (was no subject)
On 03/10/14 03:54, eliza botto wrote: Dear UseRs, I obtained following results from Anderson-Darling Goodness of fit test. dput(EB) structure(c(2.911, 0.9329, 0.818, 1.539, 0.604, 0.5142, 0.4344, 0.801, 0.963, 0.9925, 0.933, 0.956, 0.883, 0.572), .Dim = c(7L, 2L), .Dimnames = list(c(EXP, GUM, GENLOG, GENPARETO, GEV, LN, PAR3), c(A2, P))) My question is how to interpret these results? More precisely, which distribution fitted well on my data? (1) Please use an informative subject line. (2) Your posted data are somewhat opaque; one might guess that A2 holds the values of the test statistic and P the corresponding p-values. However I cannot reproduce your P column by applying the pAD() function. (3) I don't really know anything about the Anderson-Darling test, but the function ad.test() requires that parameters of the distribution be supplied. What is the impact of these parameters' being estimated, as they must surely be? (4) This is really a statistics question rather than an R question, and so is not appropriate for this list. Moreover it would appear that you are well out of your statistical depth. You should seek local statistical consultation rather than trying to scramble your way through topics of which you have no proper understanding by peppering the r-help list with ill-posed questions. cheers, Rolf Turner -- Rolf Turner Technical Editor ANZJS __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] merge by time, certain value if 5 min before and after an event
Dagmar, Can you explain more fully why rows 1, 2, and 5 in your result are low and rows 3 and 4 are high? It is not clear to me from the information you have provided.

result[c(1, 2, 5), ]
            Timestamp location Event
1 24.09.2012 09:05:01        1   low
2 24.09.2012 09:49:50        2   low
5 24.09.2012 10:05:10        5   low

result[3:4, ]
            Timestamp location Event
3 24.09.2012 09:51:01        3  high
4 24.09.2012 10:04:50        1  high

Jean On Thu, Oct 2, 2014 at 8:13 AM, Dagmar ramga...@gmx.net wrote: Hello! I hope someone can help me. It would save me days of work. Thanks in advance! I have two dataframes which look like these:

myframe <- data.frame(Timestamp = c("24.09.2012 09:00:00", "24.09.2012 10:00:00", "24.09.2012 11:00:00"), Event = c("low", "high", "low"))
myframe

mydata <- data.frame(Timestamp = c("24.09.2012 09:05:01", "24.09.2012 09:49:50", "24.09.2012 09:51:01", "24.09.2012 10:04:50", "24.09.2012 10:05:10"), location = c(1, 2, 3, 1, 5))
mydata

# I want to merge them by time so I have a dataframe which looks like this in the end (i.e. low during 5 min before and after high):

result <- data.frame(Timestamp = c("24.09.2012 09:05:01", "24.09.2012 09:49:50", "24.09.2012 09:51:01", "24.09.2012 10:04:50", "24.09.2012 10:05:10"), location = c(1, 2, 3, 1, 5), Event = c("low", "low", "high", "high", "low"))
result

Anyone knows how to merge them? Best regards, Dagmar __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem using predict() to draw lines() if predictor variable is logged on the fly
You will have better luck getting replies to your post if you provide code that we can run. In other words, provide some small example data instead of referring to your data frame that we have no access to. You can use the output from dput() to provide a subset of your data frame to the list:

dput(mydataframe[1:50, ])

Jean On Thu, Oct 2, 2014 at 11:12 AM, mtb...@gmail.com wrote: Hello, I am plotting glms with logged predictors. I would like to define the logged variables on the fly rather than hard-coding them into my dataframe. This code (with hard-coded logged variables) works just fine:

xx <- seq(-2, .5, by = 0.1)
lines(xx, predict(power, list(LogArKm00 = xx), type = "response"), col = "black", lwd = 2, lty = 1) # LogArKm00 is a variable in my dataframe

but my attempt to define them on the fly doesn't work (see error below):

plot(log(WbAb, 10) ~ log(ArKm00, 10), data = dat) # power model
xx <- seq(-2, .5, by = 0.1)
lines(xx, predict(power, list(log(ArKm00, 10) = xx), type = "response"), col = "black", lwd = 2, lty = 1) # trying to log the variable on the fly

Error: unexpected '=' in "lines(xx,predict(power,list(log(ArKm00,10)="

I would really appreciate any help sorting this out! Many thanks Mark [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] merge by time, certain value if 5 min before and after an event
Dear Jean and all, I want all lines to be low, but during the times 9:55 - 10:05 a.m. (i.e. a timespan of 10 min) I want them to be high. In my real data, low and high refer to low tide and high tide in the Wadden Sea, and I want to assign the location of my animal at the time it was taken to the tide (that means there was water not only exactly at 10:00 (as taken from official data) but also 5 min before and after). I hope that is more understandable; if not, ask again. Thanks for trying to help, Dagmar On 02.10.2014 23:01, Adams, Jean wrote: Dagmar, Can you explain more fully why rows 1, 2, and 5 in your result are low and rows 3 and 4 are high? It is not clear to me from the information you have provided.

result[c(1, 2, 5), ]
            Timestamp location Event
1 24.09.2012 09:05:01        1   low
2 24.09.2012 09:49:50        2   low
5 24.09.2012 10:05:10        5   low

result[3:4, ]
            Timestamp location Event
3 24.09.2012 09:51:01        3  high
4 24.09.2012 10:04:50        1  high

Jean On Thu, Oct 2, 2014 at 8:13 AM, Dagmar ramga...@gmx.net wrote: Hello! I hope someone can help me. It would save me days of work. Thanks in advance! I have two dataframes which look like these:

myframe <- data.frame(Timestamp = c("24.09.2012 09:00:00", "24.09.2012 10:00:00", "24.09.2012 11:00:00"), Event = c("low", "high", "low"))
myframe

mydata <- data.frame(Timestamp = c("24.09.2012 09:05:01", "24.09.2012 09:49:50", "24.09.2012 09:51:01", "24.09.2012 10:04:50", "24.09.2012 10:05:10"), location = c(1, 2, 3, 1, 5))
mydata

# I want to merge them by time so I have a dataframe which looks like this in the end (i.e. low during 5 min before and after high):

result <- data.frame(Timestamp = c("24.09.2012 09:05:01", "24.09.2012 09:49:50", "24.09.2012 09:51:01", "24.09.2012 10:04:50", "24.09.2012 10:05:10"), location = c(1, 2, 3, 1, 5), Event = c("low", "low", "high", "high", "low"))
result

Anyone knows how to merge them? 
Best regards, Dagmar __ R-help@r-project.org mailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] merge by time, certain value if 5 min before and after an event
Thanks, Dagmar. So, shouldn't row 3 with a time of 09:51:01 be low and not high? Jean On Thu, Oct 2, 2014 at 4:25 PM, Dagmar ramga...@gmx.net wrote: Dear Jean and all, I want all lines to be low, but during the times 9:55 - 10:05 a.m. (i.e. a timespan of 10 min) I want them to be high. In my real data, low and high refer to low tide and high tide in the Wadden Sea, and I want to assign the location of my animal at the time it was taken to the tide (that means there was water not only exactly at 10:00 (as taken from official data) but also 5 min before and after). I hope that is more understandable; if not, ask again. Thanks for trying to help, Dagmar On 02.10.2014 23:01, Adams, Jean wrote: Dagmar, Can you explain more fully why rows 1, 2, and 5 in your result are low and rows 3 and 4 are high? It is not clear to me from the information you have provided.

result[c(1, 2, 5), ]
            Timestamp location Event
1 24.09.2012 09:05:01        1   low
2 24.09.2012 09:49:50        2   low
5 24.09.2012 10:05:10        5   low

result[3:4, ]
            Timestamp location Event
3 24.09.2012 09:51:01        3  high
4 24.09.2012 10:04:50        1  high

Jean On Thu, Oct 2, 2014 at 8:13 AM, Dagmar ramga...@gmx.net wrote: Hello! I hope someone can help me. It would save me days of work. Thanks in advance! I have two dataframes which look like these:

myframe <- data.frame(Timestamp = c("24.09.2012 09:00:00", "24.09.2012 10:00:00", "24.09.2012 11:00:00"), Event = c("low", "high", "low"))
myframe

mydata <- data.frame(Timestamp = c("24.09.2012 09:05:01", "24.09.2012 09:49:50", "24.09.2012 09:51:01", "24.09.2012 10:04:50", "24.09.2012 10:05:10"), location = c(1, 2, 3, 1, 5))
mydata

# I want to merge them by time so I have a dataframe which looks like this in the end (i.e. low during 5 min before and after high):

result <- data.frame(Timestamp = c("24.09.2012 09:05:01", "24.09.2012 09:49:50", "24.09.2012 09:51:01", "24.09.2012 10:04:50", "24.09.2012 10:05:10"), location = c(1, 2, 3, 1, 5), Event = c("low", "low", "high", "high", "low"))
result

Anyone knows how to merge them? 
Best regards, Dagmar __ R-help@r-project.org mailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
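One way to implement the stated rule in base R, shown as a hedged sketch built from the example frames in the thread. It applies Dagmar's description literally (high within +/- 5 minutes of an official "high" time), under which row 3 (09:51:01) comes out "low", as Jean points out.

```r
myframe <- data.frame(Timestamp = c("24.09.2012 09:00:00", "24.09.2012 10:00:00",
                                    "24.09.2012 11:00:00"),
                      Event = c("low", "high", "low"))
mydata <- data.frame(Timestamp = c("24.09.2012 09:05:01", "24.09.2012 09:49:50",
                                   "24.09.2012 09:51:01", "24.09.2012 10:04:50",
                                   "24.09.2012 10:05:10"),
                     location = c(1, 2, 3, 1, 5))

t.obs  <- as.POSIXct(mydata$Timestamp, format = "%d.%m.%Y %H:%M:%S")
t.high <- as.POSIXct(myframe$Timestamp[myframe$Event == "high"],
                     format = "%d.%m.%Y %H:%M:%S")

## high if within 5 minutes (300 s) of any high-tide time, else low
near.high <- sapply(t.obs, function(t)
  any(abs(difftime(t, t.high, units = "secs")) <= 300))
mydata$Event <- ifelse(near.high, "high", "low")
mydata
```

With many high-tide times this stays O(n * m); findInterval() on the sorted high-tide vector would scale better, but the sapply version is easier to check against the example.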
Re: [R] Using PCA to filter a series
On Oct 2, 2014, at 2:29 PM, Jonathan Thayn jth...@ilstu.edu wrote:

Hi Don. I would like to "de-rotate" the first component back to its original state so that it aligns with the original time-series. My goal is to create a "cleaned", or a "model", time-series from which noise has been removed.

Please cc the list with replies. It's considered courtesy, plus you'll get more help that way than just from me. Your goal sounds almost metaphorical, at least to me. Your first axis "aligns" with the original time series already in that it captures the dominant variation across all four. Beyond that, there are many approaches to signal/noise relations within time-series analysis. I am not a good source of help on these, and you probably need a statistical consult (locally?), which is not the function of this list.

Jonathan Thayn

On Oct 2, 2014, at 2:33 PM, Don McKenzie d...@u.washington.edu wrote:

On Oct 2, 2014, at 12:18 PM, Jonathan Thayn jth...@ilstu.edu wrote:

I have four time-series of similar data. I would like to combine these into a single, clean time-series. I could simply find the mean of each time period, but I think that using principal components analysis should extract the most salient pattern and ignore some of the noise.
I can compute components using princomp:

d1 <- c(113, 108, 105, 103, 109, 115, 115, 102, 102, 111, 122, 122, 110, 110,
        104, 121, 121, 120, 120, 137, 137, 138, 138, 136, 172, 172, 157, 165,
        173, 173, 174, 174, 119, 167, 167, 144, 170, 173, 173, 169, 155, 116,
        101, 114, 114, 107, 108, 108, 131, 131, 117, 113)
d2 <- c(138, 115, 127, 127, 119, 126, 126, 124, 124, 119, 119, 120, 120, 115,
        109, 137, 142, 142, 143, 145, 145, 163, 169, 169, 180, 180, 174, 181,
        181, 179, 173, 185, 185, 183, 183, 178, 182, 182, 181, 178, 171, 154,
        145, 147, 147, 124, 124, 120, 128, 141, 141, 138)
d3 <- c(138, 120, 129, 129, 120, 126, 126, 125, 125, 119, 119, 122, 122, 115,
        109, 141, 144, 144, 148, 149, 149, 163, 172, 172, 183, 183, 180, 181,
        181, 181, 173, 185, 185, 183, 183, 184, 182, 182, 181, 179, 172, 154,
        149, 156, 156, 125, 125, 115, 139, 140, 140, 138)
d4 <- c(134, 115, 120, 120, 117, 123, 123, 128, 128, 119, 119, 121, 121, 114,
        114, 142, 145, 145, 144, 145, 145, 167, 172, 172, 179, 179, 179, 182,
        182, 182, 182, 182, 184, 184, 182, 184, 183, 183, 181, 179, 172, 149,
        149, 149, 149, 124, 124, 119, 131, 135, 135, 134)

pca <- princomp(cbind(d1, d2, d3, d4))
plot(pca$scores[, 1])

This seems to have created the clean pattern I want, but I would like to project the first component back into the original axes. Is there a simple way to do that?

Do you mean that you want to scale the scores on Axis 1 to the mean and range of your raw data? Or their mean and variance? See ?scale

Jonathan B. Thayn
Don McKenzie
Research Ecologist, Pacific Wildland Fire Sciences Lab, US Forest Service
Affiliate Professor, School of Environmental and Forest Sciences, College of the Environment, University of Washington
d...@uw.edu
Re: [R] Class ltraj and function as.ltraj
You don't detail how you determined that "there are no duplicates", and you also provide no proof of it. It's not the usual sense of duplicated(): it needs to be that there are no repeated date-times within a burst (a single animal's trip/journey/trajectory). Can you try this on your datetime and burst id and report back:

## this tests out ok
datetime <- seq(Sys.time(), by = "3 secs", length.out = 20)
id <- rep(c("a", "b"), each = 10)
## none should be zero or negative
sapply(split(datetime, id), function(x) min(diff(x)))

## fudge a broken data set
datetime1 <- datetime
datetime1[15] <- datetime1[14]
sapply(split(datetime1, id), function(x) min(diff(x)))

Cheers, Mike.

On Fri, Oct 3, 2014 at 4:30 AM, tandi perkins id...@hotmail.com wrote:

Hello R Help Group: I have been struggling to create an object of class ltraj with the function as.ltraj (adehabitatLT) with my bird data. Regarding my data structure, I have GPS for 10 birds that transmit three times/day, over the course of a year (with missing data). I have a L10.csv file with the following headers: Craneid, Date, Time, Long, Lat, Habitat, ID (for burst).

Step 1: Bring in my data with stringsAsFactors=FALSE to convert all variables (except Lat/Long) from factor to character. Thanks to David Carlson for that tip!

Step 2: Transform my date and time vectors into POSIXct as follows:

datetime <- as.POSIXct(strptime(paste(L10$Date, L10$Time, sep = " "),
                                format = "%m/%d/%Y %H:%M:%S",
                                tz = "America/Chicago"))

Thanks to Petr Pikal for that tip! Result:

> head(datetime)
[1] "2011-07-10 17:00:38 CDT" "2011-07-11 00:01:06 CDT"
[3] "2011-07-11 08:00:38 CDT" "2011-07-11 17:00:38 CDT"
[5] "2011-07-12 00:01:06 CDT" "2011-07-12 08:00:38 CDT"

Good so far....

Step 3:

Coord <- L10[c("Longitude", "Latitude")]

> head(Coord)
  Longitude Latitude
1    522598  3360285
2    522579  3360174
3    522618  3360274
4    522656  3360196
5    522397  3360207
6    522425  3360285

Good so far.... now comes the tricky part for me.
Step 4:

Craneid <- as.character(L10$Craneid)
id <- as.character(L10$ID)

Step 5:

Test <- as.ltraj(Coord, datetime, Craneid, burst = id, typeII = TRUE)

Drum roll please....

Error in as.ltraj(Coord, datetime, Craneid, burst = id, typeII = TRUE) :
  non unique dates for a given burst

I include my data frame for your review:

> head(l10b)
  Longitude Latitude            datetime Craneid    id
1    522598  3360285 2011-07-10 17:00:38   L1_10 L1_10
2    522579  3360174 2011-07-11 00:01:06   L1_10 L1_10
3    522618  3360274 2011-07-11 08:00:38   L1_10 L1_10
4    522656  3360196 2011-07-11 17:00:38   L1_10 L1_10
5    522397  3360207 2011-07-12 00:01:06   L1_10 L1_10
6    522425  3360285 2011-07-12 08:00:38   L1_10 L1_10

     Longitude Latitude            datetime Craneid    id
3803    558205  3346410 2011-04-15 17:00:38   L5_10 L5_10
3804    552813  3341251 2011-04-16 08:00:38   L5_10 L5_10
3805    552784  3341373 2011-04-28 08:00:38   L5_10 L5_10
3806    552833  3341262 2011-04-28 17:00:38   L5_10 L5_10
3807    573502  3407390 2011-06-21 17:00:38   L8_10 L8_10
3808    573271  3407499 2011-06-23 08:00:38   L8_10 L8_10

I have checked and re-checked for duplicates and there are no duplicates. However, when I ask for duplicates in the datetime I get some FALSE but a lot of TRUE's. So I am thinking it has to do with the fact that R is not picking up the individual birds which were monitored over the same time period. How do I structure my data in R to recognize the 10 separate birds with their associated coordinates and time stamps? I would ultimately like to run Bias Bridge Movement on these data but I can't get past square one! Help! Thanks in advance for any and all assistance you can provide. You all are so valuable. TLP
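A direct way to see which rows trigger the as.ltraj() error is to look for repeated (burst, date-time) pairs; a sketch using the poster's l10b data frame (column names as shown above):

```r
# rows whose date-time occurs more than once within the same burst;
# flag both the first occurrence and the repeats
dups <- duplicated(l10b[c("id", "datetime")]) |
        duplicated(l10b[c("id", "datetime")], fromLast = TRUE)
l10b[dups, ]
```

If this returns any rows, those are the "non unique dates for a given burst" that as.ltraj() is complaining about; duplicated() on the datetime column alone will report TRUE whenever two different birds share a timestamp, which is harmless.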
--
Michael Sumner
Software and Database Engineer
Australian Antarctic Division
Hobart, Australia
e-mail: mdsum...@gmail.com
Re: [R] Using PCA to filter a series
I think you want to convert your principal component to the same scale as d1, d2, d3, and d4. But the original space is a 4-dimensional space in which d1, d2, d3, and d4 are the axes, each with its own mean and standard deviation. Here are a couple of possibilities:

# plot original values for comparison
matplot(cbind(d1, d2, d3, d4), pch = 20, col = 2:5)
# standardize the pc scores to the grand mean and sd
new1 <- scale(pca$scores[, 1]) * sd(c(d1, d2, d3, d4)) + mean(c(d1, d2, d3, d4))
lines(new1)
# use least squares regression to predict the row means of the original four variables
new2 <- predict(lm(rowMeans(cbind(d1, d2, d3, d4)) ~ pca$scores[, 1]))
lines(new2, col = "red")

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Don McKenzie
Sent: Thursday, October 2, 2014 4:39 PM
To: Jonathan Thayn
Cc: r-help@r-project.org
Subject: Re: [R] Using PCA to filter a series

On Oct 2, 2014, at 2:29 PM, Jonathan Thayn jth...@ilstu.edu wrote:

Hi Don. I would like to "de-rotate" the first component back to its original state so that it aligns with the original time-series. My goal is to create a "cleaned", or a "model", time-series from which noise has been removed.

Please cc the list with replies. It's considered courtesy, plus you'll get more help that way than just from me. Your goal sounds almost metaphorical, at least to me. Your first axis "aligns" with the original time series already in that it captures the dominant variation across all four. Beyond that, there are many approaches to signal/noise relations within time-series analysis. I am not a good source of help on these, and you probably need a statistical consult (locally?), which is not the function of this list.
Jonathan Thayn

On Oct 2, 2014, at 2:33 PM, Don McKenzie d...@u.washington.edu wrote:

[earlier messages in the thread quoted in full; trimmed]

Don McKenzie
Research Ecologist, Pacific Wildland Fire Sciences Lab, US Forest Service
Affiliate Professor, School of Environmental and Forest Sciences, College of the Environment, University of Washington
d...@uw.edu
Re: [R] Problem using predict() to draw lines() if predictor variable is logged on the fly
Well, at least the immediate cause is clear: list(log(ArKm00,10)=xx) is invalid syntax. If you want a list element _named_ "log(ArKm00,10)", you'll need to quote the name. However, it's not going to work anyway, because that isn't what predict() expects. You don't supply logged variables; you supply the originals, and predict() will do the logging. This is how it does work:

> x <- rexp(10); y <- rnorm(x, mean = log(x))
> fit <- lm(y ~ log(x))
> fit

Call:
lm(formula = y ~ log(x))

Coefficients:
(Intercept)       log(x)
    0.05337      1.05081

> predict(fit, new = list(x = 1))
         1
0.05337405

(Notice that since log(1) == 0, the predicted value equals the intercept.) To be more specific, this works through the fact that predict() internally generates the model frame for newdata as, effectively,

> model.frame(delete.response(terms(fit)), list(x = 1))
  log(x)
1      0

On 02 Oct 2014, at 23:07, Adams, Jean jvad...@usgs.gov wrote:

You will have better luck getting replies to your post if you provide code that we can run. In other words, provide some small example data instead of referring to your data frame that we have no access to. You can use the output from dput() to provide a subset of your data frame to the list:

dput(mydataframe[1:50, ])

Jean

On Thu, Oct 2, 2014 at 11:12 AM, mtb...@gmail.com wrote:

Hello, I am plotting glms with logged predictors. I would like to define the logged variables on the fly rather than hard coding them into my dataframe.
This code (with hard-coded logged variables) works just fine:

xx <- seq(-2, .5, by = 0.1)
lines(xx, predict(power, list(LogArKm00 = xx), type = "response"),
      col = "black", lwd = 2, lty = 1)  # LogArKm00 is a variable in my dataframe

but my attempt to define them on the fly doesn't work (see error below):

plot(log(WbAb, 10) ~ log(ArKm00, 10), data = dat)  # power model
xx <- seq(-2, .5, by = 0.1)
lines(xx, predict(power, list(log(ArKm00, 10) = xx), type = "response"),
      col = "black", lwd = 2, lty = 1)  # trying to log the variable on the fly

Error: unexpected '=' in "lines(xx,predict(power,list(log(ArKm00,10)="

I would really appreciate any help sorting this out! Many thanks, Mark

--
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
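Following Peter's point, the on-the-fly version works once the model is fitted with log(ArKm00, 10) in the formula and newdata supplies the raw predictor; predict() then does the logging itself. A hedged sketch (dat, WbAb, and ArKm00 are the poster's objects; the glm family is assumed, not stated in the thread):

```r
# assumed fit: the formula logs the predictor, so predict() can too
power <- glm(WbAb ~ log(ArKm00, 10), data = dat)

xx <- seq(-2, .5, by = 0.1)
# supply the raw variable: since log10(10^xx) == xx, this evaluates the
# fitted curve exactly at the plotted x-positions
lines(xx, predict(power, newdata = list(ArKm00 = 10^xx), type = "response"),
      col = "black", lwd = 2, lty = 1)
```

The key change is `list(ArKm00 = 10^xx)`: the newdata list is named after the raw variable, and the formula's log() is applied internally when the model frame is built.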
[R] (no subject)
Hello, I'm new to R. I'm trying to run a power analysis in a for loop to find an ideal sample size. The situation is I am doing counts of fish at 24 different sites and rating those sites as 1 or 2 for good or poor habitat. So the counts are continuous but the habitat rating isn't. When I try to run a negative binomial generalized linear model with given values for mu, exp(1.302) and exp(-0.32), and a given theta, exp(0.75), I get an error message before I can run the whole loop and plot power against sample size. I used a template my teacher showed in class, but since he used a Poisson distribution, I'm stuck on glm.nb. Please help! This is an assignment, so any suggestions would be wonderful and I don't expect anything more. But please remember, I am a beginner and only have a crude knowledge of R at this point. Much appreciated.

# Power analysis
library(MASS)  # for glm.nb
rm(list = ls())
graphics.off()
set.seed(1)
alpha <- 0.05
m <- 200
Ns <- seq(2, 40, 2)
nNs <- length(Ns)
NAs <- rep(NA, nNs)
results <- data.frame(n = Ns, power = NAs)
for (j in 1:nNs) {
  n <- Ns[j]
  reject_count <- 0
  for (i in 1:m) {
    mu_poor <- exp(-0.32)
    mu_suit <- exp(1.302)
    mu <- c(rep(mu_poor, n/2), rep(mu_suit, n/2))
    theta <- exp(0.175)
    count <- rnbinom(n = n, size = theta, mu = mu)
    hab_poor <- 1
    hab_suit <- 2
    habitat <- sample(hab_poor:hab_suit, size = n, replace = TRUE)
    model <- glm.nb(count ~ habitat, link = log)
    pval <- summary(model)$coeff[2, 4]
    reject <- pval < alpha
    if (reject) reject_count <- reject_count + 1
  }
  reject_rate <- reject_count/m
  power <- reject_rate
  typeIIerror <- 1 - reject_rate
  results[j, ]$power <- power
}
plot(x = Ns, y = results$power, xlab = "Sample size", ylab = "Power")
lines(Ns, rep(0.95, nNs))
par(lty = 2)
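The simulation runs with glm.nb() once MASS is loaded and, arguably more importantly, once habitat is generated so it lines up with the mu vector instead of being sampled independently of it. A minimal sketch of a single iteration using the poster's parameter values (treating habitat as a factor is an assumption):

```r
library(MASS)  # glm.nb()

set.seed(1)
n <- 20
# habitat aligned with mu: first half poor, second half suitable
habitat <- factor(rep(c("poor", "suit"), each = n/2))
mu <- ifelse(habitat == "poor", exp(-0.32), exp(1.302))
count <- rnbinom(n, size = exp(0.75), mu = mu)
model <- glm.nb(count ~ habitat)       # log link is the default
pval <- summary(model)$coefficients[2, 4]
pval < 0.05  # TRUE counts as a rejection in the power loop
```

With the original `sample()` call the habitat labels carry no information about mu, so the estimated power would hover near alpha no matter how large n gets.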
[R] Identifying Values in Dataframe
R Help - I'd like to identify each correlation value in the dataframe below above/below .3/-.3 in order to graph the original data points. I've started with the call below to identify each value by its row and column. I'd like to form a data object that identifies each set of variables that meets the criteria and then use that to graph the original data.

do.call(cbind, lapply(list(row = row(data, T), col = col(data, T), value = data), as.character))

structure(c(-0.0228976615669603, 0.0228976615669603, 0.0345568787488209, -0.0345568787488209, 0.0704941162950863, 0.0501672252097525, 0.119766411337358, 0.0697823742392512, 0.0223273454311378, -0.0223273454311378, 0.125952482472234, -0.125952482472234, -0.0748856339421511, -0.0353553864437216, 0.199331873910442, -0.068756564596986, 0.0188033303819659, -0.0188033303819659, 0.124483870344689, -0.124483870344689, 0.158304442010968, -0.158304442010968, -0.00291892981576431, 0.00291892981576431, 0.289784855159971, 0.197017018634618, 0.0725645607308865, 0.0960039687045857, 0.145044311433109, -0.145044311433109, 0.228975321916426, -0.228975321916426, 0.26388395000877, 0.175954114053622, 0.326823414986536, 0.0464304517962428, 0.171060413427109, -0.171060413427109, 0.125608489395663, -0.125608489395663, 0.170699125959079, -0.170699125959079, -0.0537588992684595, 0.0537588992684595, 0.237938136557008, 0.130101348701669, 0.0420659299644508, 0.140016889896702, 0.175781963301805, -0.175781963301805, 0.277913325677977, -0.277913325677977, 0.250436246834054, 0.149310941080417, 0.345171759606147, 0.0499822279379925, 0.180291553611261, -0.180291553611261, 0.0983047617452837, -0.0983047617452837, 0.0908629320478729, -0.0908629320478729, 0.0162624158794471, -0.0162624158794471, 0.219099271324641, 0.143898328556892, 0.0808498509449568, 0.0534039458771934, 0.103639895676339, -0.103639895676339, 0.224298317217259, -0.224298317217259, 0.13939241528796, 0.0923125296440915, 0.2952647762031, -0.0573156158960486, 0.126972946909388,
-0.126972946909388, 0.145176640269481, -0.145176640269481, 0.176722440639515, -0.176722440639515, -0.110517827771672, 0.110517827771672, 0.265161677824653, 0.0349452421080511, -0.00586032680446085, 0.17140674405117, -0.0141919172668973, 0.0141919172668973, 0.0416754535115447, -0.0416754535115447, 0.110687511145091, 0.163199987771469, 0.174846924599677, 0.116657756754811, 0.184924363876472, -0.184924363876472, 0.0740138506321435, -0.0740138506321435, 0.0653341995281647, -0.0653341995281647, 0.101098966521841, -0.101098966521841, -0.0263969055123976, -0.129660641707532, 0.16313101271814, -0.382052406584783, 0.118309823316179, -0.118309823316179, 0.0408519777835592, -0.0408519777835592, -0.15406959482671, -0.274340973979869, 0.1465995621482, -0.0608360452726588, -0.00570057335076342, 0.00570057335076342, 0.0988132735291764, -0.0988132735291764, 0.159498629897594, -0.159498629897594, -0.0258437210789338, 0.0258437210789338, 0.311623918577701, 0.0959243386674064, 0.0373444291324758, 0.131601179184854, 0.0032064022008327, -0.0032064022008327, 0.0126042917937794, -0.0126042917937794, 0.0288999352531186, 0.0343919995425096, 0.0375647873892517, 0.0734866249695526, 0.125835994106989, -0.125835994106989, 0.0557071876372764, -0.0557071876372764, 0.0190687287223345, -0.0190687287223345, 0.0301326072710063, -0.0301326072710063, 0.0881640884608706, 0.0600194980037123, 0.0948090975231923, 0.0259282757599189, 0.120417132810781, -0.120417132810781, 0.196695581235906, -0.196695581235906, 0.166210300278382, 0.0252645245285897, 0.239953962041662, -0.013933692494363, 0.0174600644363753, -0.0174600644363753, 0.169008089964054, -0.169008089964054, 0.0503612778194372, -0.0503612778194372, 0.175816302924921, -0.175816302924921, 0.141434785651191, 0.0824019401386654, 0.173429908437586, -0.136795834367563, 0.219543981806626, -0.219543981806626, 0.290697487363267, -0.290697487363267, 0.331683649439792, -0.0035319780591347, 0.237371764540467, -0.172828690804139, 0.00922163769628213, 
-0.00922163769628213, 0.275507919516733, -0.275507919516733, 0.128267529853407, -0.128267529853407, -0.16619622911667, 0.16619622911667, 0.102467428865746, -0.115779804556684, 0.000997318666924614, 0.297139396802529, 0.040957786791642, -0.040957786791642, -0.0160650315621922, 0.0160650315621922, -0.043765426943726, -0.0637020898937285, 0.142863591010818, 0.214059283535989, 0.13975223034564, -0.13975223034564, -0.0286586386843802, 0.0286586386843802, 0.13735028629468, -0.13735028629468, -0.147016933653806, 0.147016933653806, 0.174438743129307, -0.0116564727226121, -0.0413775943824046, 0.136551598575573, 0.0614942508131549, -0.0614942508131549, 0.0687487372508148, -0.0687487372508148, -0.0352587211103196, -0.0872464182568976, 0.162247767446472, 0.0282617081917608, 0.175608445537029, -0.175608445537029, -0.0339260764991994, 0.0339260764991994, 0.130002432167221, -0.130002432167221, -0.141752408742242, 0.141752408742242, 0.126603115520512, -0.0784556105378803, -0.0736773348350251, 0.154815164010555,
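For the filtering step itself, `which(..., arr.ind = TRUE)` on a correlation matrix gives the qualifying variable pairs directly; a sketch on a built-in dataset (mtcars stands in for the poster's data):

```r
m <- cor(mtcars)
# keep each pair once (upper triangle) where |r| exceeds the 0.3 cutoff
idx <- which(abs(m) > 0.3 & upper.tri(m), arr.ind = TRUE)
hits <- data.frame(row   = rownames(m)[idx[, 1]],
                   col   = colnames(m)[idx[, 2]],
                   value = m[idx])
head(hits)
```

The resulting `hits` data frame names both variables in every pair that meets the criterion, which can then drive plots of the original data (e.g. looping over rows and calling `plot(mtcars[[hits$row[i]]], mtcars[[hits$col[i]]])`).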
Re: [R] Using PCA to filter a series
I suppose I could calculate the eigenvectors directly and not worry about centering the time-series, since they have essentially the same range to begin with:

vec <- eigen(cor(cbind(d1, d2, d3, d4)))$vectors
cp <- cbind(d1, d2, d3, d4) %*% vec
cp1 <- cp[, 1]

I guess there is no way to reconstruct the original input data using just the first component, though, is there? Not the original data in its entirety, just one time-series that is representative of the general pattern. Possibly something like the following, but with just the first component:

o <- cp %*% solve(vec)

Thanks for your help. It's been a long time since I've played with PCA.

Jonathan Thayn

On Oct 2, 2014, at 4:59 PM, David L Carlson wrote:

I think you want to convert your principal component to the same scale as d1, d2, d3, and d4. But the original space is a 4-dimensional space in which d1, d2, d3, and d4 are the axes, each with its own mean and standard deviation. Here are a couple of possibilities:

# plot original values for comparison
matplot(cbind(d1, d2, d3, d4), pch = 20, col = 2:5)
# standardize the pc scores to the grand mean and sd
new1 <- scale(pca$scores[, 1]) * sd(c(d1, d2, d3, d4)) + mean(c(d1, d2, d3, d4))
lines(new1)
# use least squares regression to predict the row means of the original four variables
new2 <- predict(lm(rowMeans(cbind(d1, d2, d3, d4)) ~ pca$scores[, 1]))
lines(new2, col = "red")

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Don McKenzie
Sent: Thursday, October 2, 2014 4:39 PM
To: Jonathan Thayn
Cc: r-help@r-project.org
Subject: Re: [R] Using PCA to filter a series

On Oct 2, 2014, at 2:29 PM, Jonathan Thayn jth...@ilstu.edu wrote:

Hi Don. I would like to "de-rotate" the first component back to its original state so that it aligns with the original time-series. My goal is to create a "cleaned", or a "model", time-series from which noise has been removed.
[earlier messages in the thread quoted in full; trimmed]

Don McKenzie
Research Ecologist, Pacific Wildland Fire Sciences Lab, US Forest Service
Affiliate Professor, School of Environmental and Forest Sciences, College of the Environment, University of Washington
d...@uw.edu
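For what it's worth, a rank-1 reconstruction is possible directly from the princomp() fit: multiply the first-component scores by the first loading vector and add the column means back. A sketch (assumes the d1-d4 vectors and the `pca` object from earlier in the thread):

```r
# rank-1 reconstruction: outer product of PC1 scores and PC1 loadings
recon <- pca$scores[, 1] %o% pca$loadings[, 1]
recon <- sweep(recon, 2, pca$center, "+")  # undo princomp's centering

# four back-projected series, one per original variable, each a smoothed
# version of d1..d4 carrying only the dominant shared pattern
matplot(recon, type = "l", col = 2:5)
```

Any single column of `recon` (or their row means) then serves as the representative "cleaned" series the poster asked about, expressed on the original data scale.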