Re: [R] Head or Tails game
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of darnold
Sent: Friday, August 03, 2012 9:18 PM
To: r-help@r-project.org
Subject: Re: [R] Head or Tails game

Wow! Some great responses! I've only read David, Michael, and Dennis thus far, leading me to develop this result before reading further.

    lead <- function(x) {
      n <- length(x)
      count <- 0
      if (x[1] >= 0) count <- count + 1
      for (i in 2:n) {
        if (x[i] > 0 || (x[i] == 0 && x[i-1] >= 0)) {
          count <- count + 1
        }
      }
      count
    }

    games <- replicate(1, sample(c(-1,1), 40, replace=TRUE))
    games_sum <- apply(games, 2, sum)
    plot(table(games_sum))
    games_lead <- apply(games, 2, cumsum)
    games_lead <- apply(games_lead, 2, lead)
    plot(table(games_lead))

Now I am going to read Arun, William, and Jeff's responses and see what other ideas are being proposed.

Thanks everyone.

D.

Here is another solution that doesn't need to define an additional function with an explicit loop. It seems to be considerably faster than the approach presented above.

    system.time({
      set.seed(123)
      games <- matrix(sample(c(-1, 1), 40*1, TRUE), ncol = 1)
      games_sum <- apply(games, 2, cumsum)
      games_lead <- colSums((games_sum > 0) | (games_sum == 0 & games == -1))
    })
       user  system elapsed
       0.08    0.00    0.08

    plot(table(games_sum[40,]))
    plot(table(games_lead))

Compare this with your solution

    system.time({
      set.seed(123)
      games <- replicate(1, sample(c(-1,1), 40, replace=TRUE))
      games_sum <- apply(games, 2, sum)
      games_lead <- apply(games, 2, cumsum)
      games_lead <- apply(games_lead, 2, lead)
    })
       user  system elapsed
       0.95    0.02    0.98

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
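The two approaches in this message should count the same thing: the number of tosses after which the +1 player is in the lead, with a tie credited to whoever led just before it. The sketch below checks that the explicit loop and the vectorized expression agree. The names n_games and n_tosses, and the value 200, are assumptions chosen for illustration and do not come from the thread.

```r
# Loop version, as posted by darnold (x is a vector of cumulative sums)
lead <- function(x) {
  n <- length(x)
  count <- 0
  if (x[1] >= 0) count <- count + 1
  for (i in 2:n) {
    if (x[i] > 0 || (x[i] == 0 && x[i - 1] >= 0)) count <- count + 1
  }
  count
}

set.seed(123)
n_games  <- 200   # assumed number of simulated games
n_tosses <- 40

# One game per column, each toss is -1 or +1
games     <- matrix(sample(c(-1, 1), n_tosses * n_games, replace = TRUE),
                    ncol = n_games)
games_sum <- apply(games, 2, cumsum)

lead_loop <- apply(games_sum, 2, lead)

# Vectorized version: a tie reached by a -1 step means +1 was leading before
lead_vec <- colSums((games_sum > 0) | (games_sum == 0 & games == -1))

all(lead_loop == lead_vec)
```

The two agree because the running sum moves by exactly one each toss, so a running sum of zero reached by a -1 step is the same condition as "zero now and non-negative one step earlier".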
Re: [R] regexpr with accents
Hello,

Works with me:

    d1 <- data.frame(V1 = 1:3, V2 = c("some text = 9", "some tèxt = 9", "some other text = 9"))

    regexpr("some text = 9", d1$V2)
    [1]  1 -1 -1
    attr(,"match.length")
    [1] 13 -1 -1

    regexpr("some tèxt = 9", d1$V2)
    [1] -1  1 -1
    attr(,"match.length")
    [1] -1 13 -1

    d1$V1[regexpr("some text = 9", d1$V2) > 0] <- 9
    d1$V1[regexpr("some tèxt = 9", d1$V2) > 0] <- 9
    d1
      V1                  V2
    1  9       some text = 9
    2  9       some tèxt = 9
    3  3 some other text = 9

What do you mean by "it did not work"? What were the contents of 'd1'?

    sessionInfo()
    R version 2.15.1 (2012-06-22)
    Platform: x86_64-pc-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=Portuguese_Portugal.1252  LC_CTYPE=Portuguese_Portugal.1252
    [3] LC_MONETARY=Portuguese_Portugal.1252 LC_NUMERIC=C
    [5] LC_TIME=Portuguese_Portugal.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    loaded via a namespace (and not attached):
    [1] fortunes_1.5-0

Hope this helps,

Rui Barradas

On 06-08-2012 06:55, Luca Meyer wrote:
> Hello,
>
> I have built a syntax to find out if a given substring is included in a larger string that works like this:
>
> d1$V1[regexpr("some text = 9", d1$V2) > 0] <- 9
>
> and this works all right as long as "some text" contains the standard ASCII set. However, it does not work when accents are included, as in the following:
>
> d1$V1[regexpr("some tèxt = 9", d1$V2) > 0] <- 9
>
> I have tried to substitute è with several wildcards but it did not work. Can anyone suggest how to have the syntax parse the string ignoring the accent?
>
> Thank you in advance,
> Luca
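The original question asks how to match while ignoring the accent. One possible approach, offered only as a sketch, is a character class that accepts either form of the letter. The accented letter is written as the escape \u00e8 so that the example does not depend on the encoding of the script file; the names hit_exact and hit_any are made up for illustration.

```r
d1 <- data.frame(V1 = 1:3,
                 V2 = c("some text = 9", "some t\u00e8xt = 9",
                        "some other text = 9"),
                 stringsAsFactors = FALSE)

# Literal substring test; grepl() returns TRUE/FALSE directly,
# so no comparison with 0 is needed as with regexpr()
hit_exact <- grepl("some text = 9", d1$V2, fixed = TRUE)

# Regular expression that accepts either "e" or "e with grave accent"
hit_any <- grepl("some t[e\u00e8]xt = 9", d1$V2)

d1$V1[hit_any] <- 9
d1
```

With fixed = TRUE the pattern is taken literally, which also avoids surprises if the search text ever contains regular-expression metacharacters.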
Re: [R] regexpr with accents
Hi,

It works with me. I am using R 2.15 on Ubuntu 12.04.

    d1 <- data.frame(V1 = 1:5, V2 = c("some text = 9", "some téxt=9", "sóme tèxt=9", "söme text=9", "some têxt=9"))
    d1
    #  V1            V2
    #1  1 some text = 9
    #2  2   some téxt=9
    #3  3   sóme tèxt=9
    #4  4   söme text=9
    #5  5   some têxt=9

    d1$V1[regexpr("some téxt=9", d1$V2) > 0] <- 9
    d1$V1[regexpr("söme text=9", d1$V2) > 0] <- 9
    d1$V1[regexpr("some têxt=9", d1$V2) > 0] <- 9
    d1$V1[regexpr("sóme tèxt=9", d1$V2) > 0] <- 9
    d1$V1[regexpr("some text = 9", d1$V2) > 0] <- 9
    d1
    #  V1            V2
    #1  9 some text = 9
    #2  9   some téxt=9
    #3  9   sóme tèxt=9
    #4  9   söme text=9
    #5  9   some têxt=9

A.K.

----- Original Message -----
From: Luca Meyer lucam1...@gmail.com
To: r-help@r-project.org
Cc:
Sent: Monday, August 6, 2012 1:55 AM
Subject: [R] regexpr with accents

> Hello,
>
> I have built a syntax to find out if a given substring is included in a larger string that works like this:
>
> d1$V1[regexpr("some text = 9", d1$V2) > 0] <- 9
>
> and this works all right as long as "some text" contains the standard ASCII set. However, it does not work when accents are included, as in the following:
>
> d1$V1[regexpr("some tèxt = 9", d1$V2) > 0] <- 9
>
> I have tried to substitute è with several wildcards but it did not work. Can anyone suggest how to have the syntax parse the string ignoring the accent?
>
> Thank you in advance,
> Luca
Re: [R] deleting columns from a dataframe where NA is more than 15 percent of the column length
Thank you. It works great.

On Aug 5, 2012, at 9:08 PM, Jorge I Velez jorgeivanve...@gmail.com wrote:
> Hi Faz,
>
> Here is one way of doing it, where x is your data frame:
>
> x[, colMeans(is.na(x)) <= .15]
>
> HTH,
> Jorge.-
>
> On Sun, Aug 5, 2012 at 9:04 PM, Faz Jones wrote:
>> I have a dataframe of 10 different columns (length of each column is the same). I want to eliminate any column that has 'NA' greater than 15% of the column length. Do I first need to make a function for calculating the percentage of NA for each column and then make another dataframe where I apply the function? What's the best way to do this?
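Jorge's one-liner works because is.na(x) gives a logical matrix and colMeans() of that matrix is the fraction of NA values in each column. A small sketch with made-up data (the data frame and the names keep and x_clean are assumptions for illustration):

```r
x <- data.frame(a = c(1, NA, 3, 4, 5, 6, 7, 8, 9, 10),  # 10% NA, kept
                b = c(NA, NA, 3:10),                    # 20% NA, dropped
                c = 1:10)                               # no NA, kept

na_share <- colMeans(is.na(x))   # fraction of NA per column
keep     <- na_share <= 0.15

# drop = FALSE keeps a data frame even if only one column survives
x_clean <- x[, keep, drop = FALSE]
names(x_clean)
```

Adding drop = FALSE is a defensive choice: without it, a result with a single surviving column would collapse to a plain vector.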
Re: [R] deleting columns from a dataframe where NA is more than 15 percent of the column length
Thank you. It was very informative and helpful. It works.

On Aug 5, 2012, at 10:21 PM, arun smartpink...@yahoo.com wrote:
> Hi,
> Try this:
>
> dat1 <- data.frame(x=c(NA,NA,rnorm(6,15),NA), y=c(NA,rnorm(8,15)), z=c(rnorm(7,15),NA,NA))
> dat1[which(colMeans(is.na(dat1)) <= .15)]
>          y
> 1       NA
> 2 13.53085
> 3 12.89453
> 4 15.02625
> 5 14.00387
> 6 15.34618
> 7 15.69293
> 8 15.62377
> 9 14.76479
>
> # You can also use apply, sapply etc.
> dat2 <- data.frame(x=c(NA,NA,rnorm(6,15),NA), y=c(NA,rnorm(8,15)), z=c(rnorm(7,15),NA,NA), u=c(rnorm(9,15)))
> dat2[apply(dat2, 2, function(x) mean(is.na(x)) <= .15)]
> #dat2[sapply(dat2, function(x) mean(is.na(x)) <= .15)]
> #dat2[which(colMeans(is.na(dat2)) <= .15)]
>          y        u
> 1       NA 14.56278
> 2 16.49940 16.25761
> 3 14.11368 14.08768
> 4 14.95139 14.01923
> 5 14.99517 15.91936
> 6 14.46359 14.07573
> 7 15.09702 13.94888
> 8 15.99967 14.97171
> 9 15.51924 15.59981
>
> A.K.
>
> ----- Original Message -----
> From: Faz Jones jonesf...@gmail.com
> To: r-help@r-project.org
> Cc:
> Sent: Sunday, August 5, 2012 9:04 PM
> Subject: [R] deleting columns from a dataframe where NA is more than 15 percent of the column length
>
>> I have a dataframe of 10 different columns (length of each column is the same). I want to eliminate any column that has 'NA' greater than 15% of the column length. Do I first need to make a function for calculating the percentage of NA for each column and then make another dataframe where I apply the function? What's the best way to do this?
Re: [R] Memory limit for Windows 64bit build of R
On Aug 5, 2012, at 3:52 PM, alan.x.simp...@nab.com.au wrote:

> Dear all
>
> I have a Windows Server 2008 R2 Enterprise machine, with 64bit R installed running on 2 x Quad-core Intel Xeon 5500 processor with 24GB DDR3 1066 Mhz RAM. I am seeking to analyse very large data sets (perhaps as much as 10GB), without the additional coding overhead of a package such as bigmemory().

It may depend in part on how that number is arrived at, and on what you plan on doing with it. (Don't consider creating a dist-object.)

> My question is this - if we were to increase the RAM on the machine to (say) 128GB, would this become a possibility? I have read the documentation on memory limits and it seems so, but would like some additional confirmation before investing in any extra RAM.

The typical advice is that you will need memory that is 3 times as large as a large dataset, and I find that even more headroom is needed. I have 32GB and my larger datasets occupy 5-6 GB and I generally have few problems. I had quite a few problems with 18 GB, so I think the ratio should be 4-5 x your 10GB object. I predict you could get by with 64GB. (Please send check for half the difference in cost between 64GB and 128 GB.)

-- 
David.

> Kind regards
>
> Alan
>
> Alan Simpson
> Technical Lead, Retail Model Development
> Retail Models Project
> National Australia Bank
> Level 15, 500 Bourke St, Melbourne VIC
> Tel: +61 (0) 3 8697 7135 | Mob: +61 (0) 412 975 955
> Email: alan.x.simp...@nab.com.au

David Winsemius, MD
Alameda, CA, USA
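The rule of thumb in this message is simple arithmetic, sketched here for the 10GB case. The variable names are assumptions for illustration, and object.size() is used only to confirm that a numeric vector costs roughly 8 bytes per element; actual peak usage depends heavily on what is done with the data.

```r
# A numeric (double) vector uses about 8 bytes per element plus a small header
x     <- numeric(1e6)
bytes <- as.numeric(object.size(x))
bytes / 1e6                # a little over 8 bytes per element

data_gb   <- 10            # size of the data set from the question
headroom  <- c(3, 4, 5)    # multipliers mentioned in the thread
needed_gb <- data_gb * headroom
needed_gb                  # 30 40 50

fits_in_64 <- needed_gb <= 64
fits_in_64
```

Under these multipliers the estimate stays below 64GB, which is consistent with the prediction in the message, though it is only an estimate.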
Re: [R] how to identify values from a column of a dataframe, and insert them in other data.frame with the corresponding id?
Thank you very much John, can you read it now?

Hello,

I'd like to do the following, see if you could help me please:

I have a csv called "datuak" with an id called "calee_id" and a column called "poids". I have another csv called "datuak2" with the same id called "calee_id" (although there are "calee_id" values that are in "datuak" but not in "datuak2" and vice versa), and a column called "kg_totales" in which the values are repeated for each calee_id because they are the sum of the column "kg" for each row. I show you the tables "datuak" and "datuak2":

Datuak (in the example the calee_id is the same, but there are a lot):

    poids  calee_id  maree_id
    10     1.27E+12  0.3013157
    20     1.27E+12  0.05726046
    20     1.27E+12  0.73631699
    25     1.27E+12  0.74492002
    3      1.27E+12  0.74492002
    27     1.27E+12  0.31776439
    43     1.27E+12  0.31776439

Datuak2:

       calee_id     maree_id     kg_totales  effectif
    1  1.33959e+12  0.782835873  129.7       30
    2  1.33959e+12  0.782835873  129.7       40
    3  1.33959e+12  0.782835873  129.7       10
    4  1.33959e+12  0.782835873  129.7       5
    5  1.33959e+12  0.782835873  129.7       1.7
    6  1.33959e+12  0.782835873  129.7       20
    7  1.33959e+12  0.782835873  129.7       20
    8  1.33959e+12  0.782835873  129.7       1
    9  1.33959e+12  0.782835873  129.7       2

I would like to identify in the csv "datuak2" the corresponding "calee_id" values that are also in "datuak", and create a new column in "datuak" with the values for each "calee_id" from "kg_totales", without repeating them. So the final table would be "datuak", with "calee_id", "poids", and the new column "kg_totales" with its corresponding value for each row.

Thank you very much,

Nerea

-----Original Message-----
From: John Kane [mailto:jrkrid...@inbox.com]
Sent: 03 August 2012 20:17
To: Nerea Lezama; r-help@r-project.org
Subject: RE: [R] how to identify values from a column of a dataframe, and insert them in other data.frame with the corresponding id?

Hi Nerea,

For some reason your post is badly garbled and close to impossible to read. Perhaps you need to check your text encoding?

Also, to send sample data it is better to use the dput() command. Do dput(myfile) and then paste the results into your email.

Sorry not to be of more help.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: nlez...@azti.es
> Sent: Fri, 3 Aug 2012 12:34:07 +0200
> To: r-help@r-project.org
> Subject: [R] how to identify values from a column of a dataframe, and insert them in other data.frame with the corresponding id?
>
> [...]
Re: [R] find date between two other dates
Thanks arun and Rui; 3 fantastic suggestions.

The Season interval is not always a month so arun's suggestion works better for this dataset. I couldn't get the as.between function to work on arun's second suggestion, it only returned NAs. However, arun's first suggestion worked a treat!

Many thanks

--
View this message in context: http://r.789695.n4.nabble.com/find-date-between-two-other-dates-tp4639231p4639253.html
Re: [R] Memory limit for Windows 64bit build of R
On 06.08.2012 09:34, David Winsemius wrote:
>
> On Aug 5, 2012, at 3:52 PM, alan.x.simp...@nab.com.au wrote:
>
>> Dear all
>>
>> I have a Windows Server 2008 R2 Enterprise machine, with 64bit R installed running on 2 x Quad-core Intel Xeon 5500 processor with 24GB DDR3 1066 Mhz RAM. I am seeking to analyse very large data sets (perhaps as much as 10GB), without the additional coding overhead of a package such as bigmemory().
>
> It may depend in part on how that number is arrived at, and on what you plan on doing with it. (Don't consider creating a dist-object.)
>
>> My question is this - if we were to increase the RAM on the machine to (say) 128GB, would this become a possibility? I have read the documentation on memory limits and it seems so, but would like some additional confirmation before investing in any extra RAM.
>
> The typical advice is that you will need memory that is 3 times as large as a large dataset, and I find that even more headroom is needed. I have 32GB and my larger datasets occupy 5-6 GB and I generally have few problems. I had quite a few problems with 18 GB, so I think the ratio should be 4-5 x your 10GB object. I predict you could get by with 64GB. (Please send check for half the difference in cost between 64GB and 128 GB.)

10Gb objects should be fine, but note that a vector/array/matrix cannot exceed 2^31-1 elements, hence a 17Gb vector/matrix/array of doubles / reals.

Best,
Uwe Ligges
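The 17Gb figure follows from the element limit multiplied by 8 bytes per double. The arithmetic can be checked directly; the variable names below are assumptions for illustration.

```r
max_len <- 2^31 - 1              # largest number of elements, 2147483647
bytes   <- max_len * 8           # a double takes 8 bytes

gb_decimal <- bytes / 1e9        # about 17.18 (decimal gigabytes)
gib_binary <- bytes / 2^30       # just under 16 (binary gibibytes)

# The same limit is the largest value an R integer can hold
max_len == .Machine$integer.max
```

So "17Gb" is the decimal figure; counted in units of 2^30 bytes the same limit is just under 16.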
Re: [R] Memory limit for Windows 64bit build of R
On 06/08/2012 09:42, Uwe Ligges wrote:
>
> On 06.08.2012 09:34, David Winsemius wrote:
>>
>> On Aug 5, 2012, at 3:52 PM, alan.x.simp...@nab.com.au wrote:
>>
>>> Dear all
>>>
>>> I have a Windows Server 2008 R2 Enterprise machine, with 64bit R installed running on 2 x Quad-core Intel Xeon 5500 processor with 24GB DDR3 1066 Mhz RAM. I am seeking to analyse very large data sets (perhaps as much as 10GB), without the additional coding overhead of a package such as bigmemory().
>>
>> It may depend in part on how that number is arrived at, and on what you plan on doing with it. (Don't consider creating a dist-object.)
>>
>>> My question is this - if we were to increase the RAM on the machine to (say) 128GB, would this become a possibility? I have read the documentation on memory limits and it seems so, but would like some additional confirmation before investing in any extra RAM.
>>
>> The typical advice is that you will need memory that is 3 times as large as a large dataset, and I find that even more headroom is needed. I have

The advice is 'at least 3 times'. It all depends what you are doing (and how slow your swap is -- on Windows it is likely to be slow; on a Linux box with a fast SSD it can be viable to use swap).

>> 32GB and my larger datasets occupy 5-6 GB and I generally have few problems. I had quite a few problems with 18 GB, so I think the ratio should be 4-5 x your 10GB object. I predict you could get by with 64GB.

But 3 x 18GB > 32GB!

>> (Please send check for half the difference in cost between 64GB and 128 GB.)
>
> 10Gb objects should be fine, but note that a vector/array/matrix cannot exceed 2^31-1 elements, hence a 17Gb vector/matrix/array of doubles / reals.

That's true for R 2.15.1, but not the development version. Further, R-devel makes substantially fewer copies of objects, and most of those improvements have been ported to R-patched. dist() is one example of substantial improvements.

-- 
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] low resolution word map
On 2/08/2012 8:06 a.m., Thomas Steiner wrote:
> Hi Ray,
>
> 2012/7/31 Ray Brownrigg ray.brownr...@ecs.vuw.ac.nz:
>> On 07/28/12 23:46, Thomas Steiner wrote:
>>> Hi,
>>> I'd like to have a low resolution word map in R. The maps package has this option, but if I use the argument, the map looses sense: Russia and Australia get empty etc
>>>
>>> library(maps)
>>> m=map(col="skyblue",fill=TRUE,plot=TRUE,resolution=10)
>>> length(m$x)
>>>
>>> If I drop the fill=TRUE, the effect of resolutaion=1 is lost,
>>
>> I don't know why this is so, I hadn't realised that effect of resolution with fill=TRUE.
>
> I don't understand either, but I left this observation to the developpers ;-)
>
>>> ie no change. Is there any other package or could I use the resolution argument differently?
>>
>> Well, you could say:
>> m=map(col=0,fill=TRUE, resolution=10)
>> Is that what you want?
>
> no, if you look at the output (ie the map) you know why: little islands like hawaii do still exist, but brazil is a rectangle of say 8 points... Unfortunately, I don't know why.

What exactly do you mean by "low resolution map"? A point is a point at whatever resolution you choose. What do you expect to see?

Perhaps you need to read the posting guide again, and provide reproducible code (resolutaion=1 is not a valid option to map(), and does not match your earlier resolution=10).

Ray Brownrigg

>> Ray
>
> Thanks,
> Thomas
[R] Odp: Problem with segmented function
Hi

> Hi,
> I appreciate your help with the segmented function. I am relatively new to R. I followed the introduction of the 'segmented' package by Vito Muggeo, but still it does not work. Here are the lines I wrote:
>
> data_test <- data.frame(x=c(1:10), y=c(1,1,1,1,1,2,3,4,5,6))
> lr_test <- lm(y~x, data_test)
> seg_test <- segmented(lr_test, seg.Z~x, psi=1)

You did not read the help page correctly. seg.Z is a named argument in which you specify a formula without a left-hand side, and psi should be a value of x near the expected slope change.

    seg_test <- segmented(lr_test, seg.Z=~x, psi=5)

works correctly.

Regards
Petr

> Error in segmented.lm(lr_test, seg.Z ~ x, psi = 1) :
>   A wrong number of terms in `seg.Z' or `psi'
>
> Thank you very much,
> Stella
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Problem-with-segmented-function-tp4639227.html
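To see why psi = 5 is a sensible starting value for this data, here is a base-R sketch that searches a grid of candidate breakpoints and keeps the one with the smallest residual sum of squares. This is a plain grid search written for illustration, not the estimation method used by the segmented package; the helper rss_at and the grid are assumptions.

```r
data_test <- data.frame(x = 1:10, y = c(1, 1, 1, 1, 1, 2, 3, 4, 5, 6))

# Residual sum of squares of a broken-line fit with a slope change at psi
rss_at <- function(psi, data) {
  fit <- lm(y ~ x + pmax(x - psi, 0), data = data)
  deviance(fit)
}

candidates <- seq(2, 9, by = 0.5)   # assumed grid of candidate breakpoints
rss        <- sapply(candidates, rss_at, data = data_test)

best_psi <- candidates[which.min(rss)]
best_psi
```

The term pmax(x - psi, 0) is zero to the left of psi and grows linearly to the right, so its coefficient is the change in slope at the breakpoint. For this data the fit is exact at x = 5, where y starts to rise.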
Re: [R] how to identify values from a column of a dataframe, and insert them in other data.frame with the corresponding id?
Hi

It is better to use dput for presenting data to others.

You probably want ?merge. Something like

    merge(datuak, datuak2, by = "calee_id", all.x=TRUE)

However, calee_id seems to be a floating point number and it may be rounded, so you should beware of that.

Regards
Petr

> Thank you very much John, can you read it now?
>
> Hello,
>
> I'd like to do the following, see if you could help me please:
>
> I have a csv called "datuak" with an id called "calee_id" and a column called "poids". I have another csv called "datuak2" with the same id called "calee_id" (although there are "calee_id" values that are in "datuak" but not in "datuak2" and vice versa), and a column called "kg_totales" in which the values are repeated for each calee_id because they are the sum of the column "kg" for each row. I show you the tables "datuak" and "datuak2":
>
> Datuak (in the example the calee_id is the same, but there are a lot):
>
>     poids  calee_id  maree_id
>     10     1.27E+12  0.3013157
>     20     1.27E+12  0.05726046
>     20     1.27E+12  0.73631699
>     25     1.27E+12  0.74492002
>     3      1.27E+12  0.74492002
>     27     1.27E+12  0.31776439
>     43     1.27E+12  0.31776439
>
> Datuak2:
>
>        calee_id     maree_id     kg_totales  effectif
>     1  1.33959e+12  0.782835873  129.7       30
>     2  1.33959e+12  0.782835873  129.7       40
>     3  1.33959e+12  0.782835873  129.7       10
>     4  1.33959e+12  0.782835873  129.7       5
>     5  1.33959e+12  0.782835873  129.7       1.7
>     6  1.33959e+12  0.782835873  129.7       20
>     7  1.33959e+12  0.782835873  129.7       20
>     8  1.33959e+12  0.782835873  129.7       1
>     9  1.33959e+12  0.782835873  129.7       2
>
> I would like to identify in the csv "datuak2" the corresponding "calee_id" values that are also in "datuak", and create a new column in "datuak" with the values for each "calee_id" from "kg_totales", without repeating them. So the final table would be "datuak", with "calee_id", "poids", and the new column "kg_totales" with its corresponding value for each row.
>
> Thank you very much,
>
> Nerea
>
> -----Original Message-----
> From: John Kane [mailto:jrkrid...@inbox.com]
> Sent: 03 August 2012 20:17
> To: Nerea Lezama; r-help@r-project.org
> Subject: RE: [R] how to identify values from a column of a dataframe, and insert them in other data.frame with the corresponding id?
>
> Hi Nerea,
>
> For some reason your post is badly garbled and close to impossible to read. Perhaps you need to check your text encoding?
>
> Also, to send sample data it is better to use the dput() command. Do dput(myfile) and then paste the results into your email.
>
> Sorry not to be of more help.
>
> John Kane
> Kingston ON Canada
>
>> -----Original Message-----
>> From: nlez...@azti.es
>> Sent: Fri, 3 Aug 2012 12:34:07 +0200
>> To: r-help@r-project.org
>> Subject: [R] how to identify values from a column of a dataframe, and insert them in other data.frame with the corresponding id?
>>
>> [...]
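A sketch of the merge suggested in this reply, using small made-up tables. Two choices here are assumptions rather than anything stated in the thread: the ids are stored as character strings, one possible way to sidestep the floating-point rounding concern, and the lookup table is reduced to one row per calee_id before merging so that rows of datuak are not multiplied.

```r
datuak <- data.frame(poids    = c(10, 20, 25),
                     calee_id = c("A1", "A1", "B2"),   # made-up ids
                     stringsAsFactors = FALSE)

datuak2 <- data.frame(calee_id   = c("A1", "A1", "C3"),
                      kg_totales = c(129.7, 129.7, 50),
                      effectif   = c(30, 40, 10),
                      stringsAsFactors = FALSE)

# One row per id: kg_totales is repeated within an id, so unique() collapses it
totals <- unique(datuak2[, c("calee_id", "kg_totales")])

# all.x = TRUE keeps every row of datuak; ids missing from datuak2 get NA
res <- merge(datuak, totals, by = "calee_id", all.x = TRUE)
res
```

Merging against datuak2 directly would return one output row for every matching pair of rows, which is why the lookup table is de-duplicated first.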
Re: [R] no font could be found for family Arial
I hope the original poster fixed this a long time ago, but I had the same problem and here is how I fixed it:

- go to the application Fontbook
- check if the Arial font has duplicates, and delete them, even if they are set to Off
- restart the computer.

emmats wrote
> I was re-running some code that I hadn't run in a couple of months to make barplots in R. I didn't change a single thing in the script, but the plots wouldn't work this time around. The plot itself (the bars and axes) will graph in the window, but no text appears. In the console it says I have a number of errors, all of which say "no font could be found for family 'Arial'". I have not knowingly changed anything in R and I would like to be able to make barplots with labels and titles again. Does anyone know how to fix this?

--
View this message in context: http://r.789695.n4.nabble.com/no-font-could-be-found-for-family-Arial-tp3233322p4639257.html
Re: [R] line type lty
http://students.washington.edu/mclarkso/documents/line%20styles%20Ver2.pdf

--
View this message in context: http://r.789695.n4.nabble.com/line-type-lty-tp3466345p4639258.html
[R] LDA for topic modeling in R
Hi, All,

I am using the supervised LDA function (slda) from the 'lda' package in R for topic modeling (http://cran.r-project.org/web/packages/lda/index.html). My data is a collection of documents, in which each document has a label. There are about 97 different categories and 18K documents in total. I tried to use slda to train a model on the data (both the whole dataset and a subset), but it failed partway through with the following error:

    Iteration 0
    Iteration 1
    Iteration 2
    Iteration 3
    Error in structure(.Call(collapsedGibbsSampler, documents, as.integer(K), :
      Numerical problems (-789.682, 0.0454545).

I am using R 2.15.0 and lda 1.3.1.

Does anyone have any idea? Thanks very much!

--
View this message in context: http://r.789695.n4.nabble.com/LDA-for-topic-modeling-in-R-tp4639263.html
Re: [R] Package to remove collinear variables
On Sat, Aug 4, 2012 at 11:27 PM, Roberto rmosce...@unitus.it wrote:
> Hi, I need to remove collinear variables from my Near-Infrared table of spectra.
> What package can I use?
> Something simple, because I am a novice in statistics.

There are many methods of assessing multicollinearity, but to pick one that has a good help page, try vif in the HH package. (There are also other packages that have implemented vif or variations of it.)

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
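For readers who want to see what a variance inflation factor measures before reaching for a package, here is a base-R sketch. It uses the textbook definition, 1 / (1 - R^2) from regressing each variable on all the others, and is not the implementation in the HH package. The function name vif_base and the simulated data are assumptions for illustration.

```r
# VIF of each column: regress it on all other columns and use 1 / (1 - R^2)
vif_base <- function(df) {
  sapply(names(df), function(v) {
    others <- setdiff(names(df), v)
    fit    <- lm(reformulate(others, response = v), data = df)
    1 / (1 - summary(fit)$r.squared)
  })
}

set.seed(1)
x1 <- rnorm(100)
x2 <- x1 + rnorm(100, sd = 0.1)   # nearly a copy of x1, so strongly collinear
x3 <- rnorm(100)                  # unrelated to the others

v <- vif_base(data.frame(x1, x2, x3))
round(v, 1)
```

A common rule of thumb treats values well above 10 as a sign of strong collinearity. In this simulated data x1 and x2 are flagged, while x3 stays close to 1.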
[R] regexpr with accents
Sorry but my previous email did not go through properly. Instead of the ? you should really read an &egrave; or &#232; according to http://www.lookuptables.com/. So there are extended ASCII characters I need to deal with.

I have tried

    d1$V1[regexpr("some t&egrave;xt = 9", d1$V2) > 0] <- 9

and

    d1$V1[regexpr("some t&#232;xt = 9", d1$V2) > 0] <- 9

without success...

Thanks,
Luca
Re: [R] regexpr with accents
Thanks Arun,

It works all right. I just found out that my problem was not with accents but with the correct spelling of some text.

Kind regards,
Luca

On 06 Aug 2012, at 15:01, arun wrote:

> Hi,
> Here, the string within the quotes is read exactly as written. So you may have to use the symbol itself instead of the friendly or numeric code from the link. Or you have to convert those.
>
> d1 <- data.frame(V1 = 1:4, V2 = c("some text = 9", "some t&egrave;xt = 9", "some tèxt = 9", "some t&#232;xt = 9"))
> d1$V1[regexpr("some t&egrave;xt = 9", d1$V2) > 0] <- 9
> d1$V1[regexpr("some t&#232;xt = 9", d1$V2) > 0] <- 9
> d1$V1[regexpr("some tèxt = 9", d1$V2) > 0] <- 9
> d1
>   V1                   V2
> 1  1        some text = 9
> 2  9 some t&egrave;xt = 9
> 3  9        some tèxt = 9
> 4  9   some t&#232;xt = 9
>
> A.K.
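As a small illustration of the point made in this thread (a pattern is matched exactly as typed), the sketch below matches the accented character directly and, separately, converts the two entity spellings to the real character first. The vectors `x` and `y` are made-up data, and the `\u00e8` escape is used only so the example does not depend on the file encoding.

```r
# Made-up example strings; "\u00e8" is the e-grave character.
x <- c("some text = 9", "some t\u00e8xt = 9")

# fixed = TRUE treats the pattern as a literal string, not a regex.
hits <- grepl("some t\u00e8xt = 9", x, fixed = TRUE)
print(hits)

# If the data really contains HTML entities, convert them first.
y <- c("some t&egrave;xt = 9", "some t&#232;xt = 9")
decoded <- gsub("&egrave;|&#232;", "\u00e8", y)
print(decoded)
```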
Re: [R] Memory limit for Windows 64bit build of R
Alan,

More RAM will definitely help. But if you have an object needing more than 2^31-1 (about 2 billion) elements, you'll hit a wall regardless. This could be particularly limiting for matrices. It is less limiting for data.frame objects (where each column could be 2 billion elements). But many R analytics use matrices under the hood, so you may not know up front where you could hit a limit.

Jay

Original message:
> I have a Windows Server 2008 R2 Enterprise machine, with 64-bit R installed, running on 2 x quad-core Intel Xeon 5500 processors with 24GB DDR3 1066 MHz RAM. I am seeking to analyse very large data sets (perhaps as much as 10GB), without the additional coding overhead of a package such as bigmemory. My question is this: if we were to increase the RAM on the machine to (say) 128GB, would this become a possibility? I have read the documentation on memory limits and it seems so, but would like some additional confirmation before investing in any extra RAM.

--
John W. Emerson (Jay)
Associate Professor of Statistics, Adjunct, and Acting Director of Graduate Studies
Department of Statistics
Yale University
http://www.stat.yale.edu/~jay
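To put the 2^31-1 element limit mentioned above into numbers, the following back-of-the-envelope sketch computes how much memory a vector of doubles of that length would occupy and how large a square matrix could be under the same limit. It only does arithmetic; it assumes 8 bytes per double and describes the limit as stated in this thread, which may not apply to every R version.

```r
# Largest element count under the limit discussed in the thread.
max_elements <- .Machine$integer.max          # 2^31 - 1

# Memory needed for that many doubles, assuming 8 bytes each.
bytes_per_double <- 8
gib <- max_elements * bytes_per_double / 2^30
print(gib)

# Largest square matrix (rows = columns) that stays under the limit.
max_square_dim <- floor(sqrt(max_elements))
print(max_square_dim)
```

So a single maximal double vector already needs about 16 GiB, and a square matrix runs into the element limit at roughly 46 thousand rows and columns, regardless of how much RAM is installed.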
[R] issue with nzchar() ?
Dear all

I'm a bit surprised by the results output from nzchar(). The help page says: "nzchar is a fast way to find out if elements of a character vector are *non-empty strings*." (my emphasis). However, if you do

x <- c(letters, NA, '')
nzchar(x)
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[25]  TRUE  TRUE  TRUE FALSE
any(is.na(x))
[1] TRUE

the NA value in the character vector will be considered as a non-empty string, something that I find strange. At best NA is the equivalent of an empty string. In this sense, if you Hmisc::describe() the vector you get, as I would expect, that in the context of character vectors NA and '' values are considered together:

require(Hmisc)
describe(x)
x
      n missing  unique
     26       2      26

lowest : a b c d e, highest: v w x y z

So is this a bug in the function or in the help page?

Regards
Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
Re: [R] issue with nzchar() ?
On Mon, Aug 6, 2012 at 4:48 PM, Liviu Andronic landronim...@gmail.com wrote:
> string, something that I find strange. At best NA is the equivalent of an empty string. In this sense, if you Hmisc::describe() the vector you get, as I would expect, that in the context of character vectors NA and '' values are considered together:

By the way, the same question holds for nchar(): should NA values be reported as 2-char strings, or as 0-char empty/missing values?

x <- c(letters, NA, '')
nchar(x)
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 0

Liviu
[R] Identify points that lie within polygon
I have a complex 2D polygon with thousands of vertices, and I'd like to be able to identify points from a large set contained within the polygon. I was wondering if there might be an efficient way of doing this? Any advice would be useful! Here is a small example of what I mean:

# make polygon
v1 <- c(0,1,1,2,1,3,6,7)
v2 <- c(1,3,3,5,6,7,8,9)
plot(v1, v2, type = "n")
polygon(v1, v2, lwd = 2, col = "red")

# plot a set of candidate grid points
grid <- seq(0, 10, length.out = 30)
pts <- expand.grid(grid, grid)
points(pts, pch = 19, col = 1, cex = 1)

Many thanks!
Alastair

-- View this message in context: http://r.789695.n4.nabble.com/Identify-points-that-lie-within-polygon-tp4639289.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Parallel runs of an external executable with snow in local
Thanks Uwe but, actually, I did so. Since "filetorun.exe" looks in the current folder for "input.txt", I tried moving all needed files to a newly created temporary folder "tmp.id" (say, tmp.1) and running the executable. This works fine when done directly from the Windows command line, but not from R, since using:

system("C:/Users/…/currentworkdirectory/temp.1/filetorun.exe")

causes "filetorun.exe" to look for "input.txt" in "C:/Users/…/currentworkdirectory". So there's no point in moving files to a folder; it seems that the input file must be located in the current R working directory. Does anybody know how to avoid this behaviour?

I hope I've explained that clearly,

Xavier Portell Canal, PhD candidate.
Department of Agri-food Engineering, Universitat Politècnica de Catalunya

Uwe Ligges lig...@statistik.tu-dortmund.de wrote:
To: Xavier Portell/UPC xavier.port...@upc.edu
From: Uwe Ligges lig...@statistik.tu-dortmund.de
Date: 05/08/2012 07:46PM
Cc: r-help@r-project.org
Subject: Re: [R] Parallel runs of an external executable with snow in local

> On 03.08.2012 19:21, Xavier Portell/UPC wrote:
>> Hi everyone,
>>
>> I'm aiming to run an external executable (say filetorun.EXE) in parallel. The external executable collects the data it needs from a file, say input.txt, and in turn generates several output files, say output.txt. I need to generate input.txt, run the executable and keep input.txt and output.txt.
>>
>> I'm using Windows 7, R version 2.15.1 (2012-06-22) on RStudio and platform: i386.pc.mingw32/i386 (32-bit).
>>
>> My first attempt was an R script which, by using
>>
>> system("filetorun.EXE", intern = F, ignore.stdout = F, ignore.stderr = F, wait = T, input = NULL, show.output.on.console = T, minimized = F, invisible = T)
>>
>> ran the executable and copied the required files to a conveniently named folder.
>>
>> After that I changed my previous R script so I could use the function lapply(). This script apparently worked fine. Finally, I tried to parallelize the problem by using snow and parLapply(). The resulting script looks like this:
>>
>> ## Not run
>> library(snow)
>> cl <- makeCluster(3, type = "SOCK")
>> clusterExport(cl, list('param.esp','copy.files','for12.template','program.executor'))
>> parLapply(cl, a.list, a.function)
>> stopCluster(cl)
>> ## End not run
>>
>> Although it runs, the parallelized version is messing up the input parameters passed to the executable (see the table below, where parameters P1 and P2 are considered; .s comes from the serial code and .p from the parallelized one):
>>
>>   s r P1.s P2.s P1.p P2.p
>> 1 1 1  1.0 3.00  2.0 3.00
>> 2 2 1  1.5 3.00  2.0 3.75
>> 3 3 1  2.0 3.00  2.0 3.00
>> 4 4 1  1.0 3.75  1.5 3.00
>> 5 5 1  1.5 3.75  1.5 3.00
>> 6 6 1  2.0 3.75  2.0 3.75
>>
>> My first thought to avoid the described behaviour was creating a temporary folder, say tmp.id with id being an identification run number, and copying filetorun.EXE and input.txt to tmp.id. However, while doing so, I realised that although R runs the correct copy of filetorun.EXE (i.e., the one in tmp.id), the executable looks for input.txt in the working directory.
>
> Not sure about the real setup, but you can actually specify the path, not only filenames.
>
> Uwe Ligges
>
>> I've been looking thoroughly for a solution but I got nothing.
>>
>> Thanks for any help in advance,
>>
>> Xavier Portell Canal
>> PhD candidate
>> Department of Agri-food engineering, Universitat Politècnica de Catalunya
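One common way around the behaviour described in this thread is to change R's working directory to the run folder just before launching the program, and to restore it afterwards. The sketch below is an assumption about how that could look, not the poster's actual script: `run_in_dir` is a hypothetical helper, and a plain R function stands in for the external executable so that the example runs anywhere.

```r
# Hypothetical helper: evaluate fun() with `dir` as the working directory,
# restoring the previous working directory afterwards (even on error).
run_in_dir <- function(dir, fun) {
  old <- setwd(dir)
  on.exit(setwd(old), add = TRUE)
  fun()
}

# Set up a throwaway run folder with its own input file.
run_dir <- file.path(tempdir(), "tmp.1")
dir.create(run_dir, showWarnings = FALSE)
writeLines("P1 = 1.0", file.path(run_dir, "input.txt"))

before <- getwd()
# Stand-in for the executable: just check that input.txt is visible.
# In real use this could be function() system("filetorun.exe").
found <- run_in_dir(run_dir, function() file.exists("input.txt"))
after <- getwd()
print(found)
```

With a socket cluster each worker is a separate R process with its own working directory, so calling such a helper inside the function passed to parLapply() should not interfere with the other workers, provided every run uses its own folder.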
Re: [R] Identify points that lie within polygon
Hello,

With your example, run the following and click on 10 black points inside the area.

ploc <- locator(n = 10)
points(ploc$x, ploc$y, pch = 19, col = "green", cex = 1)

Hope this helps,
Rui Barradas
Re: [R] issue with nzchar() ?
On Mon, Aug 6, 2012 at 9:53 AM, Liviu Andronic landronim...@gmail.com wrote:
> On Mon, Aug 6, 2012 at 4:48 PM, Liviu Andronic landronim...@gmail.com wrote:
>> string, something that I find strange. At best NA is the equivalent of an empty string.

Certainly not to my mind, unless you think that zero and NA should be the same for integers and doubles as well. NA (in whatever form) is, to my mind, _unknown_, which is very different from knowing 0.

>> In this sense, if you Hmisc::describe() the vector you get, as I would expect, that in the context of character vectors NA and '' values are considered together:
>
> By the way, same question holds for nchar(): Should NA values be reported as 2-char strings, or as 0-char empty/missing values?
>
> x <- c(letters, NA, '')
> nchar(x)
>  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 0

I'm not sure why that's the case, but it's documented on the help page (under Value):

  For 'nchar', an integer vector giving the sizes of each element, currently always '2' for missing values (for 'NA').

so I don't see any bug. My guess is that it's this way for backward compatibility from a time when there probably wasn't a proper NA_character_ (that's the parser literal for a character NA) and they really were just "NA" (the string) -- perhaps in some far distant R 3.0 we'll see nchar(NA_character_) = NA_integer_

Best,
Michael
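The distinction drawn above between a missing character value and the two-letter string "NA" can be checked directly. This is a minimal sketch using only base R; the variable names are made up for illustration.

```r
# A missing character value versus the literal two-letter string "NA".
missing_value <- NA_character_
literal_na <- "NA"

is_missing <- c(is.na(missing_value), is.na(literal_na))
print(is_missing)

# The literal string has two characters, like any other two-letter string.
literal_length <- nchar(literal_na)
print(literal_length)

# Both are of type character, which is why they are easy to confuse.
both_character <- is.character(missing_value) && is.character(literal_na)
print(both_character)
```

What nchar() returns for a missing value depends on the R version and on its arguments, so ?nchar for the installed version is the place to check rather than the output quoted in this thread.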
[R] bibtex::read.bib -- extracting bibentry keys
I have two versions of a bibtex database which have gotten badly out of sync. I need to find all the entries in bib2 which are not contained in bib1, according to their bibtex keys. But I can't figure out how to extract a list of the bibentry keys in these databases.

A minor question: Is there some way to prevent read.bib from ignoring entries that do not contain all required fields?

A suggestion: it would be nice if bibtex provided some extractor functions for bibentry fields.

bib1 <- read.bib("C:/localtexmf/bibtex/bib/timeref.bib")
ignoring entry 'Donoho-etal:1988' (line 40) because :
  A bibentry of bibtype 'InCollection' has to correctly specify the field(s): booktitle
... snipping other similar warnings ...
length(bib1)
[1] 628

bib2 <- read.bib("W:/texmf/bibtex/bib/timeref.bib")
ignoring entry 'Donoho-etal:1988' (line 57) because :
  A bibentry of bibtype 'InCollection' has to correctly specify the field(s): booktitle
... snipping other similar warnings ...
length(bib2)
[1] 611

# The first bibentry:
bib1[[1]]
Godfrey EH (1918). "History and Development of Statistics in Canada." In Koren J (ed.), pp. 179-198. Macmillan, New York.

str(bib1[[1]])
Class 'bibentry'  hidden list of 1
 $ :List of 9
  ..$ author   :Class 'person'  hidden list of 1
  .. ..$ :List of 5
  .. .. ..$ given  : chr [1:2] "Ernest" "H."
  .. .. ..$ family : chr "Godfrey"
  .. .. ..$ role   : NULL
  .. .. ..$ email  : NULL
  .. .. ..$ comment: NULL
  ..$ title    : chr "History and Development of Statistics in {Canada}"
  ..$ booktitle: chr "History of Statistics, their Development and Progress in Many Countries"
  ..$ publisher: chr "Macmillan"
  ..$ year     : chr "1918"
  ..$ editor   :Class 'person'  hidden list of 1
  .. ..$ :List of 5
  .. .. ..$ given  : chr "John"
  .. .. ..$ family : chr "Koren"
  .. .. ..$ role   : NULL
  .. .. ..$ email  : NULL
  .. .. ..$ comment: NULL
  ..$ pages    : chr "179--198"
  ..$ address  : chr "New York"
  ..$ crossref : chr "Koren:1918"
  ..- attr(*, "bibtype")= chr "InCollection"
  ..- attr(*, "key")= chr "Godfrey:1918"

So, I try to get the key attribute for this entry, but it returns NULL, and I don't understand why.

attr(bib1[[1]], "key")
NULL
attr(bib1[1], "key")
NULL

-Michael

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
Re: [R] Identify points that lie within polygon
This is off-topic (not about R), and a quick Web search of "test within polygon" yields many results, and adding "R" to the search when using Google provides hints about applying the algorithms in R.

---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.
Re: [R] issue with nzchar() ?
Liviu:

Well, as usual, to a certain extent this is arbitrary and the only issue is whether it is documented correctly. To me, NA (of whatever mode) means indeterminate or unknown, so since "" is known and of length 0, I would have expected NA as a return.

But the point is, not what our particular tastes are ("You say 'tomayto', I say 'tomahto'," an old song goes), but what the docs say. And in both cases, they tell you exactly what you'll get.

For nchar(): "an integer vector giving the sizes of each element, currently always 2 for missing values (for NA)"

and for nzchar(): "a logical vector of the same length as x, true if and only if the element has non-zero length." (note the 'only if').

So I see no error or inconsistencies anywhere.

-- Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
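For anyone who does want missing values and empty strings treated alike, as the original poster expected, the test can be written explicitly instead of relying on nzchar() alone. A minimal sketch, with a made-up three-element vector:

```r
# Made-up vector containing a normal string, a missing value and "".
x <- c("a", NA, "")

# TRUE only for elements that are neither missing nor empty.
has_content <- !is.na(x) & nzchar(x)
print(has_content)

# The complementary test: missing or empty.
is_blank <- is.na(x) | !nzchar(x)
print(is_blank)
```

Writing the NA check out explicitly keeps the result independent of how a given R version treats NA inside nzchar().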
Re: [R] Identify points that lie within polygon
Thanks for the suggestion, I got exactly what I needed from:

library(splancs)
?pip

Alastair
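For reference, the point-in-polygon test itself can be sketched in base R with the classic ray-casting (even-odd) rule. This is not the splancs implementation mentioned above, only an illustration of the idea; `point_in_polygon` is a hypothetical name, and points lying exactly on the boundary are not handled specially.

```r
# Hypothetical helper: even-odd (ray casting) test, vectorised over points.
# px, py: point coordinates; vx, vy: polygon vertices in order.
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- rep(FALSE, length(px))
  j <- n
  for (i in seq_len(n)) {
    # Does the edge (j -> i) straddle the horizontal line through the point,
    # and does it cross that line to the right of the point?
    straddles <- (vy[i] > py) != (vy[j] > py)
    x_cross <- (vx[j] - vx[i]) * (py - vy[i]) / (vy[j] - vy[i]) + vx[i]
    crosses <- straddles & (px < x_cross)
    inside <- xor(inside, crosses)
    j <- i
  }
  inside
}

# Simple check on a unit square.
sq_x <- c(0, 1, 1, 0)
sq_y <- c(0, 0, 1, 1)
res <- point_in_polygon(c(0.5, 2), c(0.5, 2), sq_x, sq_y)
print(res)

# Applied to the polygon and grid from the original question.
v1 <- c(0, 1, 1, 2, 1, 3, 6, 7)
v2 <- c(1, 3, 3, 5, 6, 7, 8, 9)
grid <- seq(0, 10, length.out = 30)
pts <- expand.grid(grid, grid)
in_poly <- point_in_polygon(pts[[1]], pts[[2]], v1, v2)
```

The loop runs once per polygon edge and is vectorised over the points, so the cost grows with (number of vertices) x (number of points); for very large inputs a compiled routine from a spatial package is likely to be faster.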
Re: [R] bibtex::read.bib -- extracting bibentry keys
On Mon, 6 Aug 2012, Michael Friendly wrote:
> I have two versions of a bibtex database which have gotten badly out of sync. I need to find all the entries in bib2 which are not contained in bib1, according to their bibtex keys. But I can't figure out how to extract a list of the bibentry keys in these databases.

read.bib() returns a bibentry object, so you can simply do this as usual for bibentry objects with $key:

x <- read.bib(...)
x$key

or maybe

unlist(x$key)

whichever is more convenient for you. See ?bibentry for more details.

> A minor question: Is there some way to prevent read.bib from ignoring entries that do not contain all required fields?

Also not really an issue with read.bib itself. read.bib() wants to return a bibentry object, but bibentry() only allows creating objects that are valid BibTeX, i.e., that have all required fields.

> A suggestion: it would be nice if bibtex provided some extractor functions for bibentry fields.

So that only a subset of fields is read as opposed to all fields? If you read all fields, you can easily subset afterwards (again using $-notation).

hth,
Z
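Putting the $key suggestion together with the original goal (entries in bib2 that are missing from bib1), a sketch could look like the following. The two small bibentry objects are built by hand with utils::bibentry() as stand-ins for the result of read.bib(), and their keys and titles are invented. The last line is an assumption based on the str() output in the question: the key attribute appears to sit on the inner list element, which would explain why attr() on the bibentry itself returned NULL.

```r
library(utils)

# Stand-ins for read.bib() results; keys and titles are made up.
bib1 <- c(bibentry("Misc", key = "Godfrey:1918", title = "Entry A"),
          bibentry("Misc", key = "Koren:1918",   title = "Entry B"))
bib2 <- c(bibentry("Misc", key = "Koren:1918",   title = "Entry B"),
          bibentry("Misc", key = "Extra:2012",   title = "Entry C"))

# Extract the keys as plain character vectors.
keys1 <- unlist(bib1$key)
keys2 <- unlist(bib2$key)

# Keys present in bib2 but not in bib1.
only_in_bib2 <- setdiff(keys2, keys1)
print(only_in_bib2)

# The corresponding entries themselves.
missing_entries <- bib2[keys2 %in% only_in_bib2]

# Reaching the attribute directly requires unwrapping the object first.
first_key <- attr(unclass(bib1[[1]])[[1]], "key")
print(first_key)
```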
[R] test if elements of a character vector contain letters
Dear all

I'm pretty sure that I'm approaching the problem in a wrong way. Suppose the following character vector:

(x[1:10] <- paste(x[1:10], sample(1:10, 10), sep=''))
 [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"
x
 [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"  "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"

How do you test whether the elements of the vector contain at least one letter (or at least one digit) and obtain a logical vector of the same dimension? I came up with the following awkward function:

is_letter <- function(x, pattern=c(letters, LETTERS)){
    sapply(x, function(y){
        any(sapply(pattern, function(z) grepl(z, y, fixed=T)))
    })
}

is_letter(x)
  a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
    5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
   20    21    22    23    24    25    26
FALSE FALSE FALSE FALSE FALSE FALSE FALSE

is_letter(x, 0:9) ## function slightly misnamed
  a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
    p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
    5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
   20    21    22    23    24    25    26
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

Is there a nicer way to do this?

Regards
Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
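One compact alternative to the nested sapply() approach in the question is a single vectorised grepl() call with a POSIX character class. The sketch below uses a short made-up vector; `has_letter` and `has_digit` are illustrative names.

```r
# Made-up sample in the spirit of the question's vector.
x <- c("a10", "b7", "k", "z", "1", "26")

# At least one alphabetic character / at least one digit per element.
has_letter <- grepl("[[:alpha:]]", x)
has_digit  <- grepl("[[:digit:]]", x)

print(data.frame(x, has_letter, has_digit))
```

grepl() is already vectorised over its second argument, so no explicit loop over elements or over candidate letters is needed; what counts as [[:alpha:]] depends on the locale.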
Re: [R] how to assing unique ID in a table and do regression
Sorry, forgot to Cc the list.

On 06-08-2012 17:29, Rui Barradas wrote:
> Hello,
>
> I'm glad it helped. The result of function cut() is a factor variable, so you can coerce it to integer, giving more normal names, or, if you want to keep track of the intervals the adjusted r2 belong to, go straight to the last two lines in the following code.
>
> #dat1$groups <- as.integer( cut( ...etc... ) )
> [...rest of your code... ]
>
> adj <- summary(lin.temp1)$adj.r.squared
> class(adj) <- "list"
>
> That's it. It has as names the intervals produced by cut that appear in the output you've posted.
>
> Rui Barradas
>
> On 06-08-2012 17:07, Kristi Glover wrote:
>> Dear Rui,
>> Thanks for the help. I really appreciate it. It helped me out. I modified some of the script you gave me because I found the package 'nlme' can also do it. But I do use the script you gave me to split the data:
>>
>> dat1$groups <- cut(dat1$LATITUDE, seq(-56, 79, by=2.5))
>> lin.temp1 <- lmList(S~mean_temp|groups, data=dat1)
>>
>> Could you please give me an idea how I can extract the adjusted r2 and put it in a table? I called summary, and it gave me the value of the adjusted r2 for each group, but I don't know how I can put the adjusted r2 in a table (like: group, r2, adjusted r2).
>>
>> summary(lin.temp1)$adj.r.squared
>> (-56,-53.5] :
>> [1] 0.2565786
>> (-53.5,-51] :
>> [1] 0.0715485
>> (-51,-48.5] :
>> [1] 0.2265334
>>
>> Thanks
>> Kristi
>>
>>> Date: Sat, 4 Aug 2012 16:15:57 +0100
>>> From: ruipbarra...@sapo.pt
>>> To: kristi.glo...@hotmail.com
>>> CC: r-help@r-project.org
>>> Subject: Re: [R] how to assing unique ID in a table and do regression
>>>
>>> Hello,
>>>
>>> Try the following.
>>>
>>> id.groups <- with(dat, cut(ID, breaks=0:ceiling(max(ID))))
>>> sp <- split(dat, id.groups)
>>> regressors <- grep("en", names(dat))
>>> models <- lapply(sp, function(.df)
>>>     lapply(regressors, function(x) lm(.df[["S"]] ~ .df[[x]])))
>>> mod.summ <- lapply(models, function(x) lapply(x, summary))
>>> # First R2
>>> mod.r2 <- lapply(mod.summ, function(x) lapply(x, `[[`, "r.squared"))
>>> mod.r2
>>> # Now p-values
>>> mod.coef <- lapply(mod.summ, function(x) lapply(x, coef))
>>> mod.pvalue <- lapply(mod.coef, function(x) lapply(x, `[`, , 4))
>>> # p-values in matrix form, columns are 'en2', 'en3', etc
>>> #lapply(mod.pvalue, function(x) do.call(cbind, x))
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>> On 04-08-2012 15:22, Kristi Glover wrote:
>>>> Hi R-User,
>>>> I have a very big data set (5000 rows). I wanted to make classes based on a column of that table (that column holds continuous data). After converting into different classes, the class would be a unique ID. I want to run a regression for each ID. For example, I have a data set
>>>>
>>>> dput(dat)
>>>> structure(list(ID = c(0.1, 0.8, 0.1, 1.5, 1.1, 0.9, 1.8, 2.5, 2,
>>>> 2.5, 2.8, 3, 3.1, 3.2, 3.9, 1, 4, 4.7, 4.3, 4.9, 2.1, 2.4),
>>>> S = c(4L, 7L, 9L, 10L, 10L, 8L, 8L, 8L, 17L, 18L, 13L, 13L, 11L,
>>>> 1L, 10L, 20L, 22L, 20L, 18L, 16L, 7L, 20L),
>>>> en2 = c(-2.5767, -2.5767, -2.5767, -2.5767, -2.5767, -2.5767,
>>>> -2.5767, -2.5347, -2.5347, -2.5347, -2.5347, -2.5347, -2.5347,
>>>> -2.4939, -2.4939, -2.4939, -2.4939, -2.4939, -2.4939, -2.4939,
>>>> -2.4543, -2.4543),
>>>> en3 = c(-1.1785, -0.6596, -0.6145, -0.6437, -0.6593, -0.7811,
>>>> -1.1785, -1.1785, -1.1785, -0.6596, -0.6145, -0.6437, -0.6593,
>>>> -1.1785, -0.1342, -0.2085, -0.4428, -0.5125, -0.8075, -1.1785,
>>>> -1.1785, -0.1342),
>>>> en4 = c(-1.4445, -1.3645, -1.1634, -0.7735, -0.6931, -1.1105,
>>>> -1.4127, -1.5278, -1.4445, -1.3645, -1.1634, -0.7735, -0.6931,
>>>> -1.0477, -0.8655, -0.1759, 0.1203, -0.2962, -0.4473, -1.0436,
>>>> -0.9705, -0.8953),
>>>> en5 = c(-0.4783, -0.3296, -0.2026, -0.3579, -0.5154, -0.5726,
>>>> -0.6415, -0.3996, -0.4529, -0.5762, -0.561, -0.6891, -0.7408,
>>>> -0.6287, -0.4337, -0.4586, -0.5249, -0.6086, -0.7076, -0.7114,
>>>> -0.4952, 0.1091)),
>>>> .Names = c("ID", "S", "en2", "en3", "en4", "en5"),
>>>> class = "data.frame", row.names = c(NA, -22L))
>>>>
>>>> Here ID has continuous values; I want to make groups with values 0-1, 1-2, 2-3, 3-4 from the column ID. Then I want to run a regression with S (dependent variable) and en2 (independent variable); again a regression of S on en3, and so on. After that, I want to have a table with r2 and p-value.
>>>>
>>>> Would you help me with how I can do it? I was trying it manually, but it took so much time, therefore I thought to write to you for your help.
>>>>
>>>> Thanks for your help.
>>>> Kristi
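The request in this thread (one table with group, R2 and adjusted R2) can also be sketched in base R without lmList(). The data below are simulated, so only the column names ID, S and en2 are borrowed from the thread; the break points 0:3 and the object names `fits` and `r2_table` are assumptions made for the example.

```r
# Simulated stand-in for the poster's data (values are invented).
set.seed(42)
dat <- data.frame(ID = runif(60, 0, 3), en2 = rnorm(60))
dat$S <- 2 * dat$en2 + rnorm(60)

# Bin the continuous ID into intervals and fit one model per interval.
dat$groups <- cut(dat$ID, breaks = 0:3)
fits <- lapply(split(dat, dat$groups),
               function(d) summary(lm(S ~ en2, data = d)))

# Collect both statistics into one table, one row per group.
r2_table <- data.frame(
  group         = names(fits),
  r.squared     = sapply(fits, `[[`, "r.squared"),
  adj.r.squared = sapply(fits, `[[`, "adj.r.squared"),
  row.names     = NULL,
  stringsAsFactors = FALSE
)
print(r2_table)
```

The interval labels produced by cut() become the group column, so each row can be traced back to its range of ID values.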
Re: [R] more efficient way to parallel
After searching online, I found that clusterCall or foreach might be the solution.

Best wishes,
Jie

On Sun, Aug 5, 2012 at 10:23 PM, Jie jimmycl...@gmail.com wrote:

Dear All,

Suppose I have a program as below. Outside is a loop for simulation (with randomly generated data); inside there are several sapply()'s (10~100) over the data and something else, but these sapply's have to be sequential. Each sapply does not involve very intensive calculation (a few seconds only), so the outer loop takes minutes to finish one iteration. I guess the better way is to parallelize not the sapply's but the outer loop, but I have no idea how to modify it. I have a simple code here, with only two sapply's for simplicity. The logic inside the sapply is not important. Thank you for your attention and suggestions.

    library(parallel)
    library(MASS)
    result.seq=c()
    Maxi <- 100
    for (i in 1:Maxi) {
      ## initialization, not of interest
      Sigmahalf <- matrix(sample(1:1,size = 1,replace =T ), 100)
      Sigma <- t(Sigmahalf)%*%Sigmahalf
      x <- mvrnorm(n=1000, rep(0, 10), Sigma)
      xlist <- list()
      for (j in 1:1000) {
        xlist[[j]] <- list(X = matrix( x [j, ],5))
      }
      ## end of initialization
      dd1 <- sapply(xlist,function(s) {min(abs((eigen(s$X))$values))})
      ##
      sumdd1=sum(dd1)
      for (j in 1:1000) {
        xlist[[j]]$dd1 <- dd1[j]/sumdd1
      }
      ## Assume dd2 and dd1 can not be combined in one sapply()
      dd2 <- sapply(xlist, function(s){min(abs((eigen(s$X))$values))+s$dd1})
      result.seq[i] <- sum(dd1*dd2)
    }
Re: [R] test if elements of a character vector contain letters
    nzchar(x) & !is.na(x)

No?

-- Bert

On Mon, Aug 6, 2012 at 9:25 AM, Liviu Andronic landronim...@gmail.com wrote:

Dear all

I'm pretty sure that I'm approaching the problem in a wrong way. Suppose the following character vector:

    (x[1:10] <- paste(x[1:10], sample(1:10, 10), sep=''))
     [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"
    x
     [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"  "k"   "l"   "m"   "n"
    [15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
    [29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
    [43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"

How do you test whether the elements of the vector contain at least one letter (or at least one digit) and obtain a logical vector of the same dimension? I came up with the following awkward function:

    is_letter <- function(x, pattern=c(letters, LETTERS)){
        sapply(x, function(y){
            any(sapply(pattern, function(z) grepl(z, y, fixed=T)))
        })
    }

    is_letter(x)
      a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
     TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
        p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
     TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
        5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
       20    21    22    23    24    25    26
    FALSE FALSE FALSE FALSE FALSE FALSE FALSE

    is_letter(x, 0:9) ##function slightly misnamed
      a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
     TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
        p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
    FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
        5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
     TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
       20    21    22    23    24    25    26
     TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

Is there a nicer way to do this?

Regards
Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] more efficient way to parallel
Not that I've had a chance to really look at the problem, but I've removed outer loops using parLapply from the parallel package. Works great.

On Mon, Aug 6, 2012 at 11:41 AM, Jie jimmycl...@gmail.com wrote:

After searching online, I found that clusterCall or foreach might be the solution.
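For what it is worth, the parLapply suggestion can be sketched roughly as follows. This is only an illustration under my own assumptions: one_iteration is a made-up stand-in for the body of the outer simulation loop, and two workers is an arbitrary choice.

```r
library(parallel)

# Hypothetical stand-in for one iteration of the outer simulation loop.
one_iteration <- function(i, n_obs) {
  x <- matrix(rnorm(n_obs * 4), ncol = 4)
  dd1 <- apply(x, 1, function(row) min(abs(row)))
  dd2 <- dd1 + dd1 / sum(dd1)
  sum(dd1 * dd2)
}

cl <- makeCluster(2)                   # socket cluster, also usable on Windows
clusterSetRNGStream(cl, iseed = 123)   # independent, reproducible RNG streams
result.seq <- unlist(parLapply(cl, 1:8, one_iteration, n_obs = 100))
stopCluster(cl)

print(result.seq)
```

The sapply() calls inside each iteration stay sequential; only the independent outer iterations are spread over the workers.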
Re: [R] more efficient way to parallel
On 08/06/2012 09:41 AM, Jie wrote:
> After searching online, I found that clusterCall or foreach might be the solution.

Re-write your outer loop as an lapply, then on non-Windows use parallel::mclapply. Or on Windows use makePSOCKcluster and parLapply. I ended with

    library(parallel)
    library(MASS)

    Maxi <- 10
    Maxj <- 1000

    doit <- function(i, Maxi, Maxj) {
        ## initialization, not of interest
        Sigmahalf <- matrix(sample(10000, replace=TRUE), 100)
        Sigma <- t(Sigmahalf) %*% Sigmahalf
        x <- mvrnorm(n=Maxj, rep(0, 100), Sigma)
        xlist <- lapply(seq_len(nrow(x)), function(i, x) matrix(x[i,], 10), x)
        ## end of initialization
        fun <- function(x) {
            v <- eigen(x, symmetric=FALSE, only.values=TRUE)$values
            min(abs(v))
        }
        dd1 <- sapply(xlist, fun)
        dd2 <- dd1 + dd1 / sum(dd1)
        sum(dd1 * dd2)
    }

    system.time(lapply(1:8, doit, Maxi, Maxj))
       user  system elapsed
      6.677   0.016   6.714
    system.time(mclapply(1:64, doit, Maxi, Maxj, mc.cores=8))
       user  system elapsed
     68.857   1.032  10.398

The extra arguments to eigen are important, as is avoiding unnecessary repeated calculations. The strategy of allocate-and-grow (result.vec=numeric(); result.vec[i] <- ...) is very inefficient (result.vec is copied in its entirety for each new value of i); better to preallocate-and-fill (result.vec = integer(Maxi); result.vec[i] = ...) or let lapply manage the allocation.

Martin

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
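To make the allocate-and-grow versus preallocate-and-fill remark concrete, here is a small sketch. The variable names and the squared-index computation are placeholders of mine, not part of the simulation in this thread.

```r
n <- 1000

# Allocate-and-grow: the vector is extended one element at a time
grown <- numeric()
for (i in seq_len(n)) grown[i] <- i^2

# Preallocate-and-fill: the full length is reserved once, then filled
filled <- numeric(n)
for (i in seq_len(n)) filled[i] <- i^2

# Or let an apply-style function manage the allocation
applied <- vapply(seq_len(n), function(i) i^2, numeric(1))

print(identical(grown, filled) && identical(filled, applied))
```

All three give the same vector; the difference is only in how much copying happens along the way, which system.time() will show for larger n.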
Re: [R] test if elements of a character vector contain letters
Hello,

Fun as an exercise in vectorization. 30 times faster. Don't look, guess.

Gave it up? Ok, here it is.

    is_letter <- function(x, pattern=c(letters, LETTERS)){
        sapply(x, function(y){
            any(sapply(pattern, function(z) grepl(z, y, fixed=T)))
        })
    }

    # test ascii codes, just one loop.
    has_letter <- function(x){
        sapply(x, function(y){
            y <- as.integer(charToRaw(y))
            any((65 <= y & y <= 90) | (97 <= y & y <= 122))
        })
    }

    x <- c(letters, 1:26)
    x[1:10] <- paste(x[1:10], sample(1:10, 10), sep='')
    x <- rep(x, 1e3)

    t1 <- system.time(is_letter(x))
    t2 <- system.time(has_letter(x))
    rbind(t1, t2, t1/t2)
       user.self sys.self elapsed user.child sys.child
    t1     15.69        0   15.74         NA        NA
    t2      0.50        0    0.50         NA        NA
           31.38      NaN   31.48         NA        NA
Re: [R] test if elements of a character vector contain letters
On 08/06/2012 09:51 AM, Rui Barradas wrote:
> Hello, Fun as an exercise in vectorization. 30 times faster. Don't look, guess.

    system.time(res0 <- grepl("[[:alpha:]]", x))
       user  system elapsed
      0.060   0.000   0.061
    system.time(res1 <- has_letter(x))
       user  system elapsed
      3.728   0.008   3.747
    all.equal(res0, res1, check.attributes=FALSE)
    [1] TRUE

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
Re: [R] test if elements of a character vector contain letters
Perhaps I am missing something, but why use sapply() when grepl() is already vectorized?

    is.letter <- function(x) grepl("[[:alpha:]]", x)
    is.number <- function(x) grepl("[[:digit:]]", x)

    x <- c(letters, 1:26)
    x[1:10] <- paste(x[1:10], sample(1:10, 10), sep='')
    x <- rep(x, 1e3)

    str(x)
     chr [1:52000] "a2" "b10" "c8" "d3" "e6" "f1" "g5" ...

    system.time(is.letter(x))
       user  system elapsed
      0.011   0.000   0.010
    system.time(is.number(x))
       user  system elapsed
      0.010   0.000   0.011

Regards,

Marc Schwartz
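As a small self-contained illustration of the vectorized approach, the sketch below runs grepl() with POSIX character classes on a short made-up vector (the vector and the variable names are mine). Note that the class name goes inside a bracket expression, hence the doubled brackets.

```r
x <- c("a10", "k", "7", "", NA)

# One call per question; grepl() already works element-wise
with_letter <- grepl("[[:alpha:]]", x)
with_digit  <- grepl("[[:digit:]]", x)

print(data.frame(x, with_letter, with_digit))
```

grepl() reports FALSE for the NA element, so if missing values need to stay missing they have to be handled separately, for example with is.na().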
Re: [R] issue with nzchar() ?
It would be nice to be able to make NA return NA via an argument to the function, but you can easily get that result:

    ifelse(is.na(x), NA, nzchar(x))
     [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    [13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    [25]  TRUE  TRUE    NA FALSE

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Bert Gunter
Sent: Monday, August 06, 2012 10:43 AM
To: Liviu Andronic
Cc: r-help@r-project.org Help
Subject: Re: [R] issue with nzchar() ?

Liviu:

Well, as usual, to a certain extent this is arbitrary and the only issue is whether it is documented correctly. To me, NA (of whatever mode) means "indeterminate" or "unknown", so since "" is known and of length 0, I would have expected NA as a return. But the point is, not what our particular tastes are (You say 'tomayto', I say 'tomahto,' an old song goes), but what the docs say. And in both cases, they tell you exactly what you'll get.

For nchar(): "an integer vector giving the sizes of each element, currently always 2 for missing values (for NA)"

and for nzchar: "a logical vector of the same length as x, true if and only if the element has non-zero length." (note the 'only if').

So I see no error or inconsistencies anywhere.

-- Bert

On Mon, Aug 6, 2012 at 7:53 AM, Liviu Andronic landronim...@gmail.com wrote:

On Mon, Aug 6, 2012 at 4:48 PM, Liviu Andronic landronim...@gmail.com wrote:

string, something that I find strange. At best NA is the equivalent of an empty string. In this sense, if you Hmisc::describe() the vector you get, as I would expect, that in the context of character vectors NA and '' values are considered together:

By the way, same question holds for nchar(): Should NA values be reported as 2-char strings, or as 0-char empty/missing values?

    x <- c(letters, NA, '')
    nchar(x)
     [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 0

Liviu

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
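To see the behaviours discussed in this thread side by side, here is a tiny sketch on a three-element vector. The variable names are mine; nchar() is left out on purpose because, as far as I know, its treatment of NA differs between R versions.

```r
x <- c("a", "", NA)

default_result <- nzchar(x)                        # NA counts as non-empty
na_preserving  <- ifelse(is.na(x), NA, nzchar(x))  # NA stays NA
known_nonempty <- nzchar(x) & !is.na(x)            # NA treated like ""

print(rbind(default_result, na_preserving, known_nonempty))
```

Which of the three is "right" depends on whether a missing string should count as unknown or as empty in the analysis at hand.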
[R] Tukey HSD not fully displayed in R console
Dear all,

I would like to test the differences in a dependent variable X depending on 2 grouping variables with 10 levels each. I do this with a 2-way ANOVA, followed by a Tukey HSD test (TukeyHSD(x)). However, since a lot of combinations are possible with 2 grouping variables of 10 levels each, the result of the Tukey test is not fully displayed in the console. I tried to write it out as a table (write.table()) and open it afterwards in Notepad, and to print e.g. only the first 30 rows of the result, but both without success.

Does anyone have an idea how I can deal with this problem?

Many thanks,
Ulrike

--
View this message in context: http://r.789695.n4.nabble.com/Tukey-HSD-not-fully-displayed-in-R-console-tp4639285.html
Sent from the R help mailing list archive at Nabble.com.
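One possible way around this, sketched on the built-in warpbreaks data rather than the poster's data (which is not shown): a TukeyHSD result is a list with one matrix per model term, so each matrix can be pulled out, subset, or written to a file on its own. The object names and the temporary file path are placeholders of mine.

```r
fit <- aov(breaks ~ wool * tension, data = warpbreaks)
tk <- TukeyHSD(fit)

print(names(tk))   # one component per term, e.g. the interaction

# Turn the interaction comparisons into a data frame
inter <- as.data.frame(tk[["wool:tension"]])
inter$comparison <- rownames(inter)

# Look at the first rows only, or write everything to a file
print(head(inter, 30))
out_file <- file.path(tempdir(), "tukey_interaction.csv")
write.csv(inter, out_file, row.names = FALSE)
```

With two 10-level factors the interaction component would have 4950 rows, so a file is probably more practical than the console.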
[R] OO code organization
Greetings,

I am using S4 classes and the code is organized by putting each class into a separate source file in a separate folder.

In folder base/base.R: defines class base and does

    setGeneric("showSelf", function(this) standardGeneric("showSelf"))
    setMethod("showSelf", signature(this="base"),
        definition = function(this){ ... })

For j=1,...,n: derived_j/derived_j.R defines class derived_j and does

    setMethod("showSelf", signature(this="derived_j"),
        definition = function(this){ ... })

Finally in tests/tests.R we do

    source("../base/base.R")
    source("../derived_1/derived_1.R")
    source("../derived_2/derived_2.R")
    source("../derived_n/derived_n.R")

Now we check which methods showSelf are known at this point:

    showMethods("showSelf")

and get

    Function: showSelf (package .GlobalEnv)
    this="base"
    this="derived_n"

The methods with signature this="derived_j", j<n, are not known. Needless to say this makes the code useless. How can I remedy this evil?

Many thanks,
Michael Meyer
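I cannot tell from the post what the derived_j files contain, so the following is only a sketch of a layout that keeps all methods registered, put into one script for brevity. The slot name and the method bodies are invented. One thing that may be worth ruling out (an assumption on my part, not a diagnosis) is that some sourced file calls setGeneric() again, since as far as I know redefining a generic can discard methods registered earlier; the isGeneric() guard below avoids that.

```r
library(methods)

setClass("base", representation(name = "character"))
setClass("derived_1", contains = "base")
setClass("derived_2", contains = "base")

# Create the generic once only, even if this code is sourced repeatedly
if (!isGeneric("showSelf")) {
  setGeneric("showSelf", function(this) standardGeneric("showSelf"))
}

setMethod("showSelf", signature(this = "base"),
          function(this) cat("base:", this@name, "\n"))
setMethod("showSelf", signature(this = "derived_1"),
          function(this) cat("derived_1:", this@name, "\n"))
setMethod("showSelf", signature(this = "derived_2"),
          function(this) cat("derived_2:", this@name, "\n"))

showMethods("showSelf")
showSelf(new("derived_1", name = "example"))
```

If each class lives in its own file, the guarded setGeneric() block can sit at the top of every file without harm.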
Re: [R] find date between two other dates
Hi,

I ran the second block of code (is.between()) from the mail I sent again. It works fine for me. I am using R 2.15 on Ubuntu 12.04.

    sessionInfo()
    R version 2.15.0 (2012-03-30)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] stringr_0.6 reshape_0.8.4 plyr_1.7.1

    is.between <- function(x, a, b) {
        x < a & x >= b
    }

    ddate <- c("29/12/1998 20:00:33", "02/01/1999 05:20:44", "02/01/1999 06:18:36",
               "02/02/1999 07:06:59", "02/03/1999 07:10:56", "02/03/1999 07:57:18")
    ddate <- as.POSIXct(strptime(ddate, "%d/%m/%Y %H:%M:%S"), "GMT")
    ddate1 <- data.frame(date = ddate)

    date2 <- c("01/12/1998 00:00:00", "31/12/1998 23:59:59",
               "01/01/1999 00:00:00", "31/01/1999 23:59:59",
               "01/02/1999 00:00:00", "28/02/1999 23:59:59",
               "01/03/1999 00:00:00", "31/03/1999 23:59:59")
    date3 <- as.POSIXct(strptime(date2, "%d/%m/%Y %H:%M:%S"), "GMT")

    ddate1[is.between(ddate1$date, date3[2], date3[1]), "Season"] <- 1
    ddate1[is.between(ddate1$date, date3[4], date3[3]), "Season"] <- 2
    ddate1[is.between(ddate1$date, date3[6], date3[5]), "Season"] <- 3
    ddate1[is.between(ddate1$date, date3[8], date3[7]), "Season"] <- 4
    ddate1
                     date Season
    1 1998-12-29 20:00:33      1
    2 1999-01-02 05:20:44      2
    3 1999-01-02 06:18:36      2
    4 1999-02-02 07:06:59      3
    5 1999-03-02 07:10:56      4
    6 1999-03-02 07:57:18      4

# Not sure how you are getting NA. One possibility is that you used date2 (which is not converted) instead of date3 (as in date3 <- as.POSIXct(...)). If you did this:

    ddate1[is.between(ddate1$date, date2[2], date2[1]), "Season"] <- 1
    ddate1[is.between(ddate1$date, date2[4], date2[3]), "Season"] <- 2
    ddate1[is.between(ddate1$date, date2[6], date2[5]), "Season"] <- 3
    ddate1[is.between(ddate1$date, date2[8], date2[7]), "Season"] <- 4
    ddate1
                     date Season
    1 1998-12-29 20:00:33     NA
    2 1999-01-02 05:20:44     NA
    3 1999-01-02 06:18:36     NA
    4 1999-02-02 07:06:59     NA
    5 1999-03-02 07:10:56     NA
    6 1999-03-02 07:57:18     NA

A.K.

- Original Message -
From: penguins cat...@bas.ac.uk
To: r-help@r-project.org
Cc:
Sent: Monday, August 6, 2012 4:13 AM
Subject: Re: [R] find date between two other dates

Thanks arun and Rui; 3 fantastic suggestions. The Season interval is not always a month so arun's suggestion works better for this dataset. I couldn't get the as.between function to work on arun's second suggestion, it only returned NAs. However, arun's first suggestion worked a treat!

Many thanks

--
View this message in context: http://r.789695.n4.nabble.com/find-date-between-two-other-dates-tp4639231p4639253.html
Sent from the R help mailing list archive at Nabble.com.
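When there are many seasons, the four assignment lines can be folded into a loop over vectors of interval boundaries. A minimal sketch, assuming half-open intervals [start, end) and using made-up boundary vectors named starts and ends; both sides of every comparison are POSIXct, which is the point of the date2 versus date3 remark above.

```r
# Same argument order as in the post: upper bound first, then lower bound
is.between <- function(x, upper, lower) x < upper & x >= lower

d <- as.POSIXct(c("1998-12-29 20:00:33", "1999-01-02 05:20:44",
                  "1999-03-02 07:10:56"), tz = "GMT")

# Hypothetical season boundaries, converted to POSIXct before comparing
starts <- as.POSIXct(c("1998-12-01", "1999-01-01", "1999-02-01", "1999-03-01"),
                     tz = "GMT")
ends   <- as.POSIXct(c("1999-01-01", "1999-02-01", "1999-03-01", "1999-04-01"),
                     tz = "GMT")

season <- rep(NA_integer_, length(d))
for (k in seq_along(starts)) {
  season[is.between(d, ends[k], starts[k])] <- k
}

print(data.frame(date = d, Season = season))
```

Dates that fall outside every interval simply keep NA, which makes gaps in the season table easy to spot.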
Re: [R] regexpr with accents
Hi,

Here, the string within the quotes is read exactly as written. So you may have to use the symbol itself instead of the friendly code (&egrave;) or the numerical code (&#232;) from the link. Or you have to convert those.

    d1 <- data.frame(V1 = 1:4, V2 = c("some text = 9", "some t&egrave;xt = 9",
                                      "some tèxt = 9", "some t&#232;xt = 9"))
    d1$V1[regexpr("some t&egrave;xt = 9", d1$V2) > 0] <- 9
    d1$V1[regexpr("some t&#232;xt = 9", d1$V2) > 0] <- 9
    d1$V1[regexpr("some tèxt = 9", d1$V2) > 0] <- 9
    d1
      V1                   V2
    1  1        some text = 9
    2  9 some t&egrave;xt = 9
    3  9        some tèxt = 9
    4  9   some t&#232;xt = 9

A.K.

- Original Message -
From: Luca Meyer lucam1...@gmail.com
To: r-help@r-project.org
Cc:
Sent: Monday, August 6, 2012 8:25 AM
Subject: [R] regexpr with accents

Sorry but my previous email did not go through properly. Instead of the ? you should really read an &egrave; or &#232; according to http://www.lookuptables.com/. So there are extended ASCII characters I need to deal with. I have tried

    d1$V1[regexpr("some t&egrave;xt = 9", d1$V2) > 0] <- 9

and

    d1$V1[regexpr("some t&#232;xt = 9", d1$V2) > 0] <- 9

without success...

Thanks,
Luca
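If the data really contain the accented character (rather than an HTML entity), one way to avoid encoding trouble in the script itself is to write the character as a Unicode escape and match it literally. A small sketch with made-up data; that the real column holds UTF-8 text is an assumption.

```r
d1 <- data.frame(V1 = 1:3,
                 V2 = c("some text = 9", "some t\u00e8xt = 9", "other"),
                 stringsAsFactors = FALSE)

# "\u00e8" is e with grave accent; fixed = TRUE matches the string literally
hit <- grepl("some t\u00e8xt = 9", d1$V2, fixed = TRUE)
d1$V1[hit] <- 9

print(d1)
```

grepl() returns a logical vector directly, so the comparison of regexpr() against 0 is not needed here.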
[R] program of matrix
Hi,

can anybody help me to program this formula:

    g[ll'] = i[ll'] - sum(j = 1..k) c[lj] c[l'j] A[j]^-1

where

    i[ll'] = (1/n) sum(i = 1..n) z[il] z[il']

c[lj] and c[l'j] are matrices, A[j]^-1 is the inverse of an invertible diagonal matrix, n, k, m are given, j = 1...k and l, l' = 1...m.

It is complicated for me; I hope you can help me. Thank you a lot.

--
View this message in context: http://r.789695.n4.nabble.com/program-of-matrix-tp4639288.html
Sent from the R help mailing list archive at Nabble.com.
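The post leaves the dimensions open, so the sketch below rests on my own reading: z is an n x m data matrix, c[lj] are the entries of an m x k matrix, and A[j] is the j-th diagonal entry of a k x k diagonal matrix. Under those assumptions the whole formula is two matrix products, and a double loop over l and l' confirms the result. All names and the random test data are placeholders.

```r
set.seed(42)
n <- 50; m <- 4; k <- 3

Z   <- matrix(rnorm(n * m), n, m)   # z[il]
Cmat <- matrix(rnorm(m * k), m, k)  # c[lj]
a   <- runif(k, 1, 2)               # diagonal entries A[j], all non-zero

I_mat <- crossprod(Z) / n           # i[ll'] = (1/n) sum_i z[il] z[il']
G <- I_mat - Cmat %*% diag(1 / a, nrow = k) %*% t(Cmat)

# Element-by-element version of the same formula, as a check
G_loop <- matrix(0, m, m)
for (l in 1:m) {
  for (lp in 1:m) {
    G_loop[l, lp] <- sum(Z[, l] * Z[, lp]) / n -
      sum(Cmat[l, ] * Cmat[lp, ] / a)
  }
}

print(all.equal(G, G_loop))
```

If c[lj] are themselves matrix blocks rather than scalars, the same idea applies blockwise, but the code would need to change accordingly.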
[R] cannot find function simpleRDA2
Hi,

I am trying to run the command forward.sel.par, however I receive the error message: Error: could not find function 'simpleRDA2'. I have the vegan library loaded. The documentation on varpart has not helped me to understand why I cannot call this function. Maybe I am missing something obvious because I am still an 'R' novice. Below is a reproducible example for you. Thank you always for all of your help.

Lindsey

example:

    X=matrix(rnorm(30),10,3)
    Y=matrix(rnorm(50),10,5)

    forward.sel.par <- function(Y, X, alpha = 0.05, K = nrow(X)-1, R2thresh = 0.99, R2more = 0.001, adjR2thresh = 0.99, Yscale = FALSE, verbose=TRUE)
    ##
    ## Parametric forward selection of explanatory variables in regression and RDA.
    ## Y is the response, X is the table of explanatory variables.
    ##
    ## If Y is univariate, this function implements FS in regression.
    ## If Y is multivariate, this function implements FS using the F-test described
    ## by Miller and Farr (1971). This test requires that
    ##   -- the Y variables be standardized,
    ##   -- the error in the response variables be normally distributed (to be verified by the user).
    ##
    ## This function uses 'simpleRDA2' and 'RsquareAdj' developed for 'varpart' in 'vegan'.
    ##
    ##                Pierre Legendre & Guillaume Blanchet, May 2007
    ##
    ## Arguments --
    ##
    ## Y           Response data matrix with n rows and m columns containing quantitative variables.
    ## X           Explanatory data matrix with n rows and p columns containing quantitative variables.
    ## alpha       Significance level. Stop the forward selection procedure if the p-value of a variable is higher than alpha. The default is 0.05.
    ## K           Maximum number of variables to be selected. The default is one minus the number of rows.
    ## R2thresh    Stop the forward selection procedure if the R-square of the model exceeds the stated value. This parameter can vary from 0.001 to 1.
    ## R2more      Stop the forward selection procedure if the difference in model R-square with the previous step is lower than R2more. The default setting is 0.001.
    ## adjR2thresh Stop the forward selection procedure if the adjusted R-square of the model exceeds the stated value. This parameter can take any value (positive or negative) smaller than 1.
    ## Yscale      Standardize the variables in table Y to variance 1. The default setting is FALSE. The setting is automatically changed to TRUE if Y contains more than one variable. This is a validity condition for the parametric test of significance (Miller and Farr 1971).
    ##
    ## Reference:
    ## Miller, J. K., and S. D. Farr. 1971. Bimultivariate redundancy: a comprehensive measure of
    ##    interbattery relationship. Multivariate Behavioral Research 6: 313-324.
    {
    require(vegan)
    FPval <- function(R2cum,R2prev,n,mm,p)
    ## Compute the partial F and p-value after adding a single explanatory variable to the model.
    ## In FS, the number of df of the numerator of F is always 1. See Sokal & Rohlf 1995, eq 16.14.
    ##
    ## The amendment, based on Miller and Farr (1971), consists in multiplying the numerator and
    ## denominator df by 'p', the number of variables in Y, when computing the p-value.
    ##
    ##                Pierre Legendre, May 2007
    {
    df2 <- (n-1-mm)
    Fstat <- ((R2cum-R2prev)*df2) / (1-R2cum)
    pval <- pf(Fstat,1*p,df2*p,lower.tail=FALSE)
    return(list(Fstat=Fstat,pval=pval))
    }

    Y <- as.matrix(Y)
    X <- apply(as.matrix(X),2,scale,center=TRUE,scale=TRUE)
    var.names = colnames(as.data.frame(X))
    n <- nrow(X)
    m <- ncol(X)
    if(nrow(Y) != n) stop("Numbers of rows not the same in Y and X")
    p <- ncol(Y)
    if(p > 1) {
       Yscale = TRUE
       if(verbose) cat("The variables in response matrix Y have been standardized",'\n')
       }
    Y <- apply(Y,2,scale,center=TRUE,scale=Yscale)
    SS.Y <- sum(Y^2)
    X.out <- c(1:m)

    ## Find the first variable X to include in the model
    R2prev <- 0
    R2cum <- 0
    for(j in 1:m) {
       toto <- simpleRDA2(Y,X[,j],SS.Y)
       if(toto$Rsquare > R2cum) {
          R2cum <- toto$Rsquare
          no.sup <- j
          }
       }
    mm <- 1
    FP <- FPval(R2cum,R2prev,n,mm,p)
    if(FP$pval <= alpha) {
       adjRsq <- RsquareAdj(R2cum,n,mm)
       res1 <- var.names[no.sup]
       res2 <- no.sup
       res3 <- R2cum
       res4 <- R2cum
       res5 <- adjRsq
       res6 <- FP$Fstat
       res7 <- FP$pval
       X.out[no.sup] <- 0
       delta <- R2cum
       } else {
       stop("Procedure stopped (alpha criterion): pvalue for variable ",no.sup," is ",FP$pval)
       }

    ## Add variables X to the model
    while((FP$pval <= alpha) && (mm <= K) && (R2cum <= R2thresh) && (delta >= R2more) && (adjRsq <= adjR2thresh)) {
       mm <- mm+1
       R2prev <- R2cum
       R2cum <- 0
       for(j in 1:m) {
          if(X.out[j] != 0) {
             toto <- simpleRDA2(Y,X[,c(res2,j)],SS.Y)
             if(toto$Rsquare > R2cum) {
                R2cum <- toto$Rsquare
                no.sup <- j
                }
             }
          }
       FP <- FPval(R2cum,R2prev,n,mm,p)
       delta <- R2cum-R2prev
       adjRsq <- RsquareAdj(R2cum,n,mm)
       res1 <- c(res1,var.names[no.sup])
       res2 <- c(res2,no.sup)
       res3 <- c(res3,delta)
       res4 <- c(res4,R2cum)
Re: [R] test if elements of a character vector contain letters
Hi,

Not sure whether this is what you wanted.

x <- letters
(x[1:10] <- paste(x[1:10], sample(1:10, 10), sep=''))
x1 <- c(x, 1:26)
x1
 [1] "a4"  "b3"  "c5"  "d2"  "e9"  "f6"  "g1"  "h8"  "i10" "j7"  "k"   "l"
[13] "m"   "n"   "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"
[25] "y"   "z"   "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"
[37] "11"  "12"  "13"  "14"  "15"  "16"  "17"  "18"  "19"  "20"  "21"  "22"
[49] "23"  "24"  "25"  "26"

grepl("^[[:alpha:]][[:digit:]]", x1)
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE FALSE FALSE FALSE

A.K.

----- Original Message -----
From: Liviu Andronic landronim...@gmail.com
To: r-help@r-project.org Help r-help@r-project.org
Cc:
Sent: Monday, August 6, 2012 12:25 PM
Subject: [R] test if elements of a character vector contain letters

Dear all

I'm pretty sure that I'm approaching the problem in a wrong way. Suppose the following character vector:

(x[1:10] <- paste(x[1:10], sample(1:10, 10), sep=''))
 [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"
x
 [1] "a10" "b7"  "c2"  "d3"  "e6"  "f1"  "g5"  "h8"  "i9"  "j4"  "k"   "l"   "m"   "n"
[15] "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"   "y"   "z"   "1"   "2"
[29] "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"  "11"  "12"  "13"  "14"  "15"  "16"
[43] "17"  "18"  "19"  "20"  "21"  "22"  "23"  "24"  "25"  "26"

How do you test whether the elements of the vector contain at least one letter (or at least one digit) and obtain a logical vector of the same dimension?

I came up with the following awkward function:

is_letter <- function(x, pattern=c(letters, LETTERS)){
    sapply(x, function(y){
        any(sapply(pattern, function(z) grepl(z, y, fixed=T)))
    })
}

is_letter(x)
  a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
    5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
   20    21    22    23    24    25    26
FALSE FALSE FALSE FALSE FALSE FALSE FALSE

is_letter(x, 0:9)  ##function slightly misnamed
  a10    b7    c2    d3    e6    f1    g5    h8    i9    j4     k     l     m     n     o
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
    p     q     r     s     t     u     v     w     x     y     z     1     2     3     4
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
    5     6     7     8     9    10    11    12    13    14    15    16    17    18    19
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
   20    21    22    23    24    25    26
 TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

Is there a nicer way to do this?

Regards
Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
[R] How to convert data to 'normal' if they are in the form of standard scientific notations?
Dear R users,

I read two csv data files into R and called them Tem1 and Tem5. In the first column, each observation in Tem1 has 13 digits, whereas in Tem5 each has 14 digits. Originally they are 'numeric', as can be seen in my code below. How can I display or convert them in a form other than scientific notation, which seems to be the default? I want them to be in a form like '20110911001084', but I'm very confused why when I used the 'as.factor' call it works for my 'Tem1' but not for 'Tem5'...??

Many thanks!

HJ

Tem1[1:5,1]
[1] 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12
Tem5[1:5,1]
[1] 2.011091e+13 2.011091e+13 2.011091e+13 2.011091e+13 2.011091e+13
class(Tem1[1:5,1])
[1] "numeric"
class(Tem5[1:5,1])
[1] "numeric"
as.factor(Tem1[1:5,1])
[1] 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12
Levels: 2.10004e+12
as.factor(Tem5[1:5,1])
[1] 20110911001084 20110911001084 20110911001084 20110911001084 20110911001084
Levels: 20110911001084
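
A note on the display question above: the numbers themselves are not altered by scientific notation, only the way they are printed. The following is a minimal sketch, assuming made-up ID values (the poster's full data are not shown); the names ids13, ids14, fixed13, fixed14 and chr13 are hypothetical.

```r
# Hypothetical IDs standing in for the first columns of Tem1 and Tem5
ids13 <- c(2100040000000, 2100040000001)
ids14 <- c(20110911001084, 20110911001085)

# Fixed notation for display; the result is a character vector
fixed13 <- format(ids13, scientific = FALSE)
fixed14 <- format(ids14, scientific = FALSE)

# sprintf() with zero decimals produces the same kind of result
chr13 <- sprintf("%.0f", ids13)

# Raising 'scipen' makes print() prefer fixed notation for the session
old <- options(scipen = 20)
print(ids13)
options(old)  # restore the previous setting
```

If the column holds identifiers rather than quantities, it may be simpler to read it as text in the first place, for example with the colClasses argument of read.csv().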
Re: [R] test if elements of a character vector contain letters
On Aug 6, 2012, at 12:06 PM, Marc Schwartz marc_schwa...@me.com wrote:

> Perhaps I am missing something, but why use sapply() when grepl() is already vectorized?
>
> is.letter <- function(x) grepl("[:alpha:]", x)
> is.number <- function(x) grepl("[:digit:]", x)

Sorry, typos in the above from my CP. Should be:

is.letter <- function(x) grepl("[[:alpha:]]", x)
is.number <- function(x) grepl("[[:digit:]]", x)

Marc

> x <- c(letters, 1:26)
> x[1:10] <- paste(x[1:10], sample(1:10, 10), sep='')
> x <- rep(x, 1e3)
>
> str(x)
>  chr [1:52000] "a2" "b10" "c8" "d3" "e6" "f1" "g5" ...
>
> system.time(is.letter(x))
>    user  system elapsed
>   0.011   0.000   0.010
>
> system.time(is.number(x))
>    user  system elapsed
>   0.010   0.000   0.011
>
> Regards,
> Marc Schwartz
>
> On Aug 6, 2012, at 11:51 AM, Rui Barradas ruipbarra...@sapo.pt wrote:
>
>> Hello,
>>
>> Fun as an exercise in vectorization. 30 times faster. Don't look, guess.
>>
>> Gave it up? Ok, here it is.
>>
>> is_letter <- function(x, pattern=c(letters, LETTERS)){
>>     sapply(x, function(y){
>>         any(sapply(pattern, function(z) grepl(z, y, fixed=T)))
>>     })
>> }
>> # test ascii codes, just one loop.
>> has_letter <- function(x){
>>     sapply(x, function(y){
>>         y <- as.integer(charToRaw(y))
>>         any((65 <= y & y <= 90) | (97 <= y & y <= 122))
>>     })
>> }
>>
>> x <- c(letters, 1:26)
>> x[1:10] <- paste(x[1:10], sample(1:10, 10), sep='')
>> x <- rep(x, 1e3)
>>
>> t1 <- system.time(is_letter(x))
>> t2 <- system.time(has_letter(x))
>> rbind(t1, t2, t1/t2)
>>    user.self sys.self elapsed user.child sys.child
>> t1     15.69        0   15.74         NA        NA
>> t2      0.50        0    0.50         NA        NA
>>        31.38      NaN   31.48         NA        NA
[R] Splitting Data Into Different Series
Dear R Community,

I'm trying to write a loop to split my data into different series. I need to make a new matrix (or series) according to the series code.

For instance, every time the "code" column assumes the value "433" I need to save "date", "value", and "code" into the dados433 matrix.

Please take a look at the following example:

dados <- matrix(c("2012-01-01","2012-02-01","2012-03-01","2012-04-01","2012-05-01","2012-06-01",
                  "2012-01-01","2012-02-01","2012-03-01","2012-04-01","2012-05-01","2012-06-01",
                  "2012-01-01","2012-02-01","2012-03-01","2012-04-01","2012-05-01","2012-06-01",
                  0.56,0.45,0.21,0.64,0.36,0.08,152136,153081,155872,158356,162157,166226,
                  33.47,34.48,35.24,38.42,35.33,34.43,433,433,433,433,433,433,2005,2005,2005,
                  2005,2005,2005,3939,3939,3939,3939,3939,3939), nrow=18, ncol=3, byrow=FALSE,
                dimnames=list(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18),
                              c("date", "value", "code")))

dados433 <- matrix(data = NA, nrow = 6, ncol = 3, byrow= FALSE)
dados2005 <- matrix(data = NA, nrow = 6, ncol = 3, byrow= FALSE)
dados3939 <- matrix(data = NA, nrow = 6, ncol = 3, byrow= FALSE)

for(i in seq(along=dados[,3])) {
    if(dados[i,3] == 433) {dados433[i,1:3] <- dados[i,1:3]}
}

for(i in seq(along=dados[,3])) {
    if(dados[i,3] == 2005) {dados2005[i,1:3] <- dados[i,1:3]}
}

for(i in seq(along=dados[,3])) {
    if(dados[i,3] == 3939) {dados3939[i,1:3] <- dados[i,1:3]}
}

Best regards,
Henrique Andrade
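
One possible alternative to the three loops above is base R's split(), which divides a data frame by the values of a grouping column. This is only a sketch, assuming the data are held in a data frame rather than a character matrix; the object name series is hypothetical, and the values are copied from the example in the post.

```r
# Same data as in the post, stored as a data frame so each column keeps its type
dados <- data.frame(
  date  = rep(seq(as.Date("2012-01-01"), by = "month", length.out = 6), times = 3),
  value = c(0.56, 0.45, 0.21, 0.64, 0.36, 0.08,
            152136, 153081, 155872, 158356, 162157, 166226,
            33.47, 34.48, 35.24, 38.42, 35.33, 34.43),
  code  = rep(c(433, 2005, 3939), each = 6)
)

# One data frame per series code, returned as a named list
series <- split(dados, dados$code)

# Individual series are then picked out by name
dados433  <- series[["433"]]
dados2005 <- series[["2005"]]
dados3939 <- series[["3939"]]
```

Keeping the pieces in one list also avoids hard-coding a separate object for every code when the number of series grows.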
Re: [R] no font could be found for family Arial
On Aug 6, 2012, at 2:01 AM, tibr wrote:

> I hope the original poster fixed this a long time ago, but I had the same
> problem and here is how I fixed it:
> - go to the application Fontbook
> - check if the Arial font has duplicates, and delete them, even if they are
>   set to "Off"
> - restart the computer.

Which you would have found with a search of the SIG-Mac list had you followed the advice that Prof. Ripley gave at the time. I do not delete both copies of the duplicated fonts when this has occurred on my machine ... only the ones that were defective.

> emmats wrote
>> I was re-running some code that I hadn't run in a couple of months to make
>> barplots in R. I didn't change a single thing in the script, but the
>> plots wouldn't work this time around. The plot itself (the bars and axes)
>> will graph in the window, but no text appears. In the console it says I
>> have a number of errors, all of which say "no font could be found for
>> family 'Arial'". I have not knowingly changed anything in R and I would
>> like to be able to make barplots with labels and titles again. Does
>> anyone know how to fix this?
>
> --
> View this message in context: http://r.789695.n4.nabble.com/no-font-could-be-found-for-family-Arial-tp3233322p4639257.html
> Sent from the R help mailing list archive at Nabble.com.

Please realize that there are multiple R mailing lists and that this list is the wrong one for this question. The Nabble interface obscures that fact among many other facts, including where the real Archives are.

--
David Winsemius, MD
Alameda, CA, USA
Re: [R] trouble with looping for effect of sampling interval increase
You would make it much easier for R-help readers to solve your problem if you provided a small example data set with your code, so that we could reproduce your results and troubleshoot the issues.

Jean

Naidraug white@wright.edu wrote on 08/05/2012 09:08:25 AM:

I've looked everywhere and tinkered for three days now, so I figure asking might be good. Here's a general rundown of what I am trying to get my code to do. I am giving you the whole rundown because I need a solution that retains certain ways of doing things, because they give me the information I need.

I want to examine the effect of increasing my sampling interval on my data. Example: what if instead of sampling every hour I sampled every two, oh yeah, how about every three?.. etc ad nauseam. How I want to do this is to take the data I have now and add an index to it that contains counters. Those counters will look something like 1,2,1,2,.. for the first one, 1,2,3,1,2,3.. for the next one. I have a lot of them, like say a thousand... Then for each column in the index my loops should start in the first column, run only the ones, store that, then run the twos, and store that in the same column of output in a different row. Then move to the next column, run the ones, store in the next column of output, run the twos, store in the next row of that column, run the threes, etc on out until there is no more.

I want to use this index for a number of reasons. The first is that after this I will be going back through and using a different method for sub-sampling but keeping all else the same. So all I have to do there is change the way I generate the index. The second is that it allows me to run many subsamples and see their range.

So the code I have made generates my index, and does the heavy lifting all correctly, as well as my averages and quartiles, but a look at the head() of my key output (IntervalBetas) shows that something has gone amiss. You have to look close to catch it. The values generated for each row of output are identical. This should not be the case, as row one of the first output column should be generated from all values indexed by a one in the first column, whereas in column two there are different values indexed by the number one. I've checked about everything I can think of, done print() on my loop sequence things (those little i and j) and wiggled about everything. I am flummoxed. I think the bit that is messing up is in here:

#Here is the loop for betas from sampling interval increase
c <- WHOLESIZE[2]-1
for (i in 1:c) {
  x <- length(unique(index[,i]))
  for (j in 1:x) {
    data <- WHOLE [WHOLE[,x]==j,1]

But also here is the whole code in case I am wrong that that is the problem area:

#loop for making index
#clean dataset of empty cells
dataset <- na.omit (datasetORIGINAL)
#how messed up was the data?
holeyDATA <- datasetORIGINAL - dataset
D <- dim(dataset)
#what is the smallest sample?
tinysample <- 100
#how long is the dataset?
datalength <- length (dataset)
#MD - how many divisions
MD <- datalength/tinysample
#clear things up for the index loop
WHOLE <- NULL
index <- NULL
#do the index loop
for (a in 1:MD) {
  index <- cbind (index, rep (1:a, length = D[1]))
}
index <- subset(index, select = -c(1) )
#merge dataset and index loop
WHOLE <- cbind (dataset, index)
WHOLESIZE <- dim (WHOLE)
#Housekeeping before loops
IntervalBetas <- NULL
IntervalBetas <- c(NA,NA)
IntervalBetas <- as.data.frame (IntervalBetas)
IntervalLowerQ <- NULL
IntervalUpperQ <- NULL
IntervalMean <- NULL
IntervalMedian <- NULL
#Here is the loop for betas from sampling interval increase
c <- WHOLESIZE[2]-1
for (i in 1:c) {
  x <- length(unique(index[,i]))
  for (j in 1:x) {
    data <- WHOLE [WHOLE[,x]==j,1]
    #get power spectral density
    PSDPLOT <- spectrum (data, detrend = TRUE, plot = FALSE)
    frequency <- PSDPLOT$freq
    PSD <- PSDPLOT$spec
    #log transform the power spectral density
    Logfrequency <- log(frequency)
    LogPSD <- log(PSD)
    #fit my line to the data
    Line <- lm (LogPSD ~ Logfrequency)
    #store the slope of the line
    Betas <- rbind (Betas, -coef(Line)[2])
    #Get values on the curve shape
    BSkew <- skew (Betas)
    BMean <- mean (Betas)
    BMedian <- median (Betas)
    Q <- quantile (Betas)
    #store curve shape values
    IntervalLowerQ <- rbind (IntervalLowerQ , Q[2])
    IntervalUpperQ <- rbind (IntervalUpperQ , Q[4])
    IntervalSkew <- rbind (IntervalSkew , BSkew)
    IntervalMean <- rbind (IntervalMean , BMean)
    IntervalMedian <- rbind (IntervalMedian , BMedian)
    #Store the Betas
    #This is a pain
    BetaSave <- Betas
    no.r <- nrow(IntervalBetas)
    l.v <- length(BetaSave)
    difer <- no.r - l.v
    difers <- abs(difer)
    if (no.r < l.v){
      IntervalBetas <- rbind(IntervalBetas,rep(NA,difers))
    } else {
      (BetaSave <- rbind(BetaSave,rep(NA,difers)))
    }
    IntervalBetas <- cbind (IntervalBetas, BetaSave)
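
The counter scheme described in the post (1,2,1,2,... for interval 2, then 1,2,3,1,2,3,... for interval 3, and so on) can be sketched in a few lines of base R. This is not a diagnosis of the posted code, only a small illustration under stated assumptions: series is a made-up stand-in for the poster's data, subsample_stat is a hypothetical helper, and the mean stands in for the spectral slope the poster computes.

```r
# Made-up series standing in for the poster's measurements
series <- 1:12

# For interval k, counter value j (1..k) marks every k-th observation
# starting at position j; 'stat' is applied to each such subsample
subsample_stat <- function(x, k, stat = mean) {
  counter <- rep(seq_len(k), length.out = length(x))
  sapply(split(x, counter), stat)
}

# One result vector per interval; the lengths differ, so a list is used
intervals <- 2:4
res <- lapply(intervals, function(k) subsample_stat(series, k))
names(res) <- paste0("k", intervals)
```

Storing the results in a list sidesteps the padding with NA that the posted code needs when it binds vectors of different lengths into one data frame.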
Re: [R] test if elements of a character vector contain letters
Only an extra set of brackets:

is.letter <- function(x) grepl("[[:alpha:]]", x)
is.number <- function(x) grepl("[[:digit:]]", x)

Without them, the functions are fast, but wrong.

> x
 [1] "a8"  "b5"  "c10" "d1"  "e6"  "f2"  "g4"  "h3"  "i7"  "j9"  "k"   "l"
[13] "m"   "n"   "o"   "p"   "q"   "r"   "s"   "t"   "u"   "v"   "w"   "x"
[25] "y"   "z"   "1"   "2"   "3"   "4"   "5"   "6"   "7"   "8"   "9"   "10"
[37] "11"  "12"  "13"  "14"  "15"  "16"  "17"  "18"  "19"  "20"  "21"  "22"
[49] "23"  "24"  "25"  "26"
> is.letter <- function(x) grepl("[:alpha:]", x)
> is.letter(x)
 [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE
[13] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE FALSE FALSE FALSE
> is.letter <- function(x) grepl("[[:alpha:]]", x)
> is.letter(x)
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[25]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE FALSE FALSE FALSE

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
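
The difference between the two patterns in this thread comes down to how a bracket expression is read: "[:alpha:]" is an ordinary character set containing ':', 'a', 'l', 'p' and 'h', while "[[:alpha:]]" places the POSIX class inside a bracket expression. A minimal sketch on a made-up vector follows; the names x, single, double_br and has_digit are hypothetical. The single-bracket call is wrapped in suppressWarnings() as a precaution, in case the R version in use flags that pattern.

```r
# Small made-up vector; only some elements contain one of ':', 'a', 'l', 'p', 'h'
x <- c("a8", "h3", "k", "l", "p", "z", "12")

# Single brackets: matches only elements containing ':', 'a', 'l', 'p' or 'h'
single <- suppressWarnings(grepl("[:alpha:]", x))

# Double brackets: matches elements containing any letter
double_br <- grepl("[[:alpha:]]", x)

# Same idea for digits
has_digit <- grepl("[[:digit:]]", x)

data.frame(x, single, double_br, has_digit)
```

Note that "k" and "z" are letters but are missed by the single-bracket pattern, which mirrors the wrong result shown in the message above.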
Re: [R] bibtex::read.bib -- extracting bibentry keys
On 8/6/2012 11:54 AM, Achim Zeileis wrote:
On Mon, 6 Aug 2012, Michael Friendly wrote:

I have two versions of a bibtex database which have gotten badly out of sync. I need to find all the entries in bib2 which are not contained in bib1, according to their bibtex keys. But I can't figure out how to extract a list of the bibentry keys in these databases.

read.bib() returns a "bibentry" object so you can simply do this as usual for "bibentry" objects with $key:

One thing that was confusing was that read.bib returns a bibentry object, all of whose elements are also bibentry objects.

x <- read.bib(...)
x$key

or maybe unlist(x$key)

Whatever is more convenient for you. See ?bibentry for more details.

That is what I was missing -- it would have helped to find a link to utils::bibentry in the [rather scanty] documentation for read.bib. I'm now a happy camper in this regard. What I wanted is given by:

bib1 <- read.bib("C:/localtexmf/bibtex/bib/timeref.bib")
length(bib1)
keys1 <- unlist(bib1$key)
bib2 <- read.bib("W:/texmf/bibtex/bib/timeref.bib")
length(bib2)
keys2 <- unlist(bib2$key)

which(! keys1 %in% keys2)
 [1] 133 249 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628
keys1[which(! keys1 %in% keys2)]
 [1] "Langren:1646"             "Fisher:1915a"             "Stigler:2012"
 [4] "Wainer:2011"              "Minard:1860a"             "CNAM:1906"
 [7] "Wainer:2012"              "Wainer-Ramsay:2010"       "Stephenson-Galneder:1969"
[10] "Waters:1964"              "Agathe:1988"              "Gascoigne:2007"
[13] "Krzywinski:2009"          "Bolle:1929"               "Balbi:1829"
[16] "Bills-Li:2005"            "Lewi:2006"                "Fletcher:1851"
[19] "Perrot:1976"

As a side note, I searched extensively for bibtex tools that would help me resolve the differences between two related bibtex files, but none was as simple as this, once I could get the keys. Thanks to Roman for providing this infrastructure!

So, ignoring for now differences in the contents of the bibentries, a useful tool for my purpose is bibdiff(),

bibdiff <- function(bib1, bib2) {
    keys1 <- unlist(bib1$key)
    keys2 <- unlist(bib2$key)
    only1 <- keys1[which(! keys1 %in% keys2)]
    only2 <- keys2[which(! keys2 %in% keys1)]
    cat("Only in bib1:\n")
    print(only1)
    cat("Only in bib2:\n")
    print(only2)
}

bibdiff(bib1, bib2)
Only in bib1:
 [1] "Langren:1646"             "Fisher:1915a"             "Stigler:2012"
 [4] "Wainer:2011"              "Minard:1860a"             "CNAM:1906"
 [7] "Wainer:2012"              "Wainer-Ramsay:2010"       "Stephenson-Galneder:1969"
[10] "Waters:1964"              "Agathe:1988"              "Gascoigne:2007"
[13] "Krzywinski:2009"          "Bolle:1929"               "Balbi:1829"
[16] "Bills-Li:2005"            "Lewi:2006"                "Fletcher:1851"
[19] "Perrot:1976"
Only in bib2:
[1] "Langren:1644"  "Quetelet:1842"

which gives me the complete answer, as far as it goes.

A minor question: Is there some way to prevent read.bib from ignoring entries that do not contain all required fields?

Also not really an issue with read.bib itself. read.bib() wants to return a "bibentry" object but bibentry() just allows to create objects that are valid BibTeX, i.e., have all required fields.

It turns out that read.bib seems to be pickier than bibtex itself -- it does not accommodate crossref= fields, used for InCollection items; these resolve correctly using bibtex.

For some books in my database, the publisher is unknown. bibtex generates warnings (I think) and does include the references. It would be nicer if there was an argument to read.bib, e.g., strict = {T/F} where strict=FALSE would allow entries not containing all required fields. But perhaps that's buried too deep in the implementation.

bib1 <- read.bib("C:/localtexmf/bibtex/bib/timeref.bib")
ignoring entry 'Donoho-etal:1988' (line 40) because :
    A bibentry of bibtype ‘InCollection’ has to correctly specify the field(s): booktitle
ignoring entry 'Martonne:1919:map' (line 90) because :
    A bibentry of bibtype ‘InCollection’ has to correctly specify the field(s): booktitle, publisher, year
ignoring entry 'Touraine:2002' (line 5423) because :
    A bibentry of bibtype ‘Book’ has to correctly specify the field(s): publisher
ignoring entry 'Cotes:1722' (line 6004) because :
    A bibentry of bibtype ‘Book’ has to correctly specify the field(s): publisher
ignoring entry 'Quetelet:1842' (line 6605) because :
    A bibentry of bibtype ‘Book’ has to correctly specify the field(s): publisher
ignoring entry 'Wenzlick:1950' (line 6663) because :
    A bibentry of bibtype ‘Unpublished’ has to correctly specify the field(s): note
ignoring entry 'Verniquet:1791' (line 6695) because :
    A bibentry of bibtype ‘Book’ has to correctly specify the field(s): publisher
length(bib1)
[1] 628

A suggestion: it would be nice if bibtex provided some extractor functions for bibentry fields.

So that only a subset of fields is read as opposed to all fields? If you read all fields, you can easily subset afterwards (again using $-notation).

No, it was only lack of documentation, and perhaps an example or two for read.bib that caused me to stumble.

hth,
Z

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street
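
Once the keys are plain character vectors, the comparison done by bibdiff() above can also be written with base R's set functions. The sketch below is an assumption-laden variant, not the author's code: key_diff is a hypothetical helper, it returns the results instead of printing them, and the key "Tufte:1983" is invented for the example (the other keys are taken from the output above).

```r
# Returns the keys unique to each vector and the keys they share
key_diff <- function(keys1, keys2) {
  list(only1 = setdiff(keys1, keys2),
       only2 = setdiff(keys2, keys1),
       both  = intersect(keys1, keys2))
}

# Small made-up key vectors standing in for unlist(bib1$key) and unlist(bib2$key)
keys1 <- c("Langren:1646", "Fisher:1915a", "Tufte:1983")
keys2 <- c("Langren:1644", "Tufte:1983", "Quetelet:1842")

d <- key_diff(keys1, keys2)
d$only1
d$only2
```

Returning a list rather than printing makes it possible to reuse the result, for example to subset one of the bibentry objects by the keys that are missing from the other.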
Re: [R] sapply() and by()
Dominic, It's great that you provided some example data, but a much smaller data frame would have sufficed. For example, 10 randomly selected rows from your data ... LF - structure(list(Serra.da.Foladoira = c(27.335652173913, 25.4632608695652, 24.464652173913, 22.550652173913, 22.2177826086956, 29.3744782608695, 24.1317826086956, 25.5464782608695, 27.7517391304348, 25.172), Santiago = c(32.6199565217391, 27.9597826086956, 32.7863913043478, 25.2136086956521, 23.7573043478261, 32.6199565217391, 28.6671304347826, 27.9597826086956, 29.7489565217391, 23.5492608695652), Sergude = c(31.7877826086956, 27.4604782608695, 26.1706086956521, 25.8377391304348, 26.5034782608695, 33.2856956521739, 30.4979130434782, 30.7059565217391, 30.8307826086956, 31.9542173913043), Rio.Do.Sol = c(30.3730869565217, 25.7545217391304, 25.421652173913, 24.1317826086956, 23.4660434782608, 31.1220434782608, 25.8377391304348, 25.8793478260869, 30.7059565217391, 24.464652173913 ), V5 = c(10L, 2L, 2L, 11L, 3L, 8L, 8L, 3L, 8L, 6L)), .Names = c(Serra.da.Foladoira, Santiago, Sergude, Rio.Do.Sol, V5), row.names = c(1017L, 778L, 400L, 1403L, 86L, 1311L, 598L, 1536L, 605L, 520L), class = data.frame) Try this code to calculate the mean of each of the first four columns for each value of the fifth column ... aggregate(LF[, 1:4], list(month=LF$V5), mean) The sapply() approach doesn't have a built in by type of argument. Jean Dominic Roye dominic.r...@gmail.com wrote on 08/06/2012 09:34:58 AM: Hello everyone, I have a dataset with 5 colums (4 colums with thresholds of weather stations and one with month - data of 5 years). Now I would like to calculate the average for each month. I tried this unsuccessfully: lf.med - sapply(LF[,1:4],mean,LF[,5]) Error in mean.default(X[[1L]], ...) : 'trim' must be numeric and have length 1 With lf.med - by(LF[,1:4],LF[,5],mean) It works, but its deprecated. Any help is greatly appreciated!!! Thanky everybody`!! 
Dominic dput(LC) structure(list(Serra.da.Foladoira = c(21.1359565217391, 21.7184782608695, 23.5492608695652, 23.4660434782608, 23.6740869565217, 21.1775652173913, SNIPPED 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L)), row.names = c(NA, -1826L), .Names = c(Serra.da.Foladoira, Santiago, Sergude, Rio.Do.Sol, V5), class = data.frame) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
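Jean's aggregate() suggestion can be tried on a small made-up data frame. The sketch below uses hypothetical values (two stations, three months) rather than Dominic's data; only the column name V5 and the station names are borrowed from the thread.

```r
# Hypothetical toy data: two station columns plus a month column (V5).
LF <- data.frame(
  Santiago = c(30, 32, 25, 27, 20, 22),
  Sergude  = c(31, 33, 26, 28, 21, 23),
  V5       = c(1L, 1L, 2L, 2L, 3L, 3L)
)

# One row per month, one mean per station column.
monthly <- aggregate(LF[, 1:2], by = list(month = LF$V5), FUN = mean)
print(monthly)

# Equivalent formula interface: "all other columns, grouped by V5".
monthly2 <- aggregate(. ~ V5, data = LF, FUN = mean)
print(monthly2)
```

Both calls return an ordinary data frame, so the monthly means can be subset or merged like any other table.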
Re: [R] sapply() and by()
On Aug 6, 2012, at 7:34 AM, Dominic Roye wrote:

> Hello everyone,
>
> I have a dataset with 5 columns (4 columns with thresholds of weather stations and one with month; data of 5 years). Now I would like to calculate the average for each month. I tried this unsuccessfully:
>
> lf.med <- sapply(LF[,1:4],mean,LF[,5])

If you want to group calculations within categories, then sapply is not the right function to turn to immediately. Use one of 'aggregate', 'tapply' or 'ave'.

> Error in mean.default(X[[1L]], ...) : 'trim' must be numeric and have length 1

It is telling you that the unnamed third argument was matched to the 'trim' parameter of the function 'mean'. Perhaps:

aggregate(LF[,1:4], list(LF[,5]), mean)

> With
> lf.med <- by(LF[,1:4],LF[,5],mean)
> it works, but it's deprecated.

Actually, what is deprecated is the function `mean.data.frame`.

> Any help is greatly appreciated!!! Thanks everybody!!

Minimal example. PLEASE.

> Dominic
>
> dput(LC)

Please do note that you offered an object 'LC' but your code referred to 'LF'.

> structure(list(Serra.da.Foladoira = c(21.1359565217391, 21.7184782608695, 23.5492608695652, 23.4660434782608, 23.6740869565217, 21.1775652173913, 19.8460869565217, 23.3412173913043, 22.8835217391304, 24.3398260869565,

[snipped 1800+ length vector]

--
David Winsemius, MD
Alameda, CA, USA
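The argument matching David describes can be reproduced on a toy data frame. This is a minimal sketch with hypothetical values; it shows the error being raised and then the tapply() and ave() alternatives he names.

```r
# Hypothetical toy data standing in for LF.
LF <- data.frame(
  Santiago = c(30, 32, 25, 27),
  Sergude  = c(31, 33, 26, 28),
  V5       = c(1L, 1L, 2L, 2L)
)

# sapply() forwards extra arguments to FUN, so LF[, 3] becomes the
# second positional argument of mean(), which is 'trim'.
err <- tryCatch(sapply(LF[, 1:2], mean, LF[, 3]), error = function(e) e)
print(conditionMessage(err))

# Grouping needs a function that knows about groups, e.g. tapply().
by_month <- sapply(LF[, 1:2], function(col) tapply(col, LF$V5, mean))
print(by_month)

# ave() returns the group mean aligned with the original rows.
LF$Santiago_mean <- ave(LF$Santiago, LF$V5)
print(LF)
```

tapply() gives one value per group, while ave() keeps the original length, which is convenient when the group mean is needed as a new column.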
Re: [R] How to convert data to 'normal' if they are in the form of standard scientific notations?
HJ, You don't provide any reproducible code, so I had to make up my own. dat - data.frame(a=letters[1:5], x=c(20110911001084, 20110911001084, 20110911001084, 20110911001084, 20110911001084), y=c(2.10004e+12, 2.10004e+12, 2.10004e+12, 2.10004e+12, 2.10004e+12)) In my example, the long numbers print out without scientific notation. dat a x y 1 a 20110911001084 210004000 2 b 20110911001084 210004000 3 c 20110911001084 210004000 4 d 20110911001084 210004000 5 e 20110911001084 210004000 I can make it print with scientific notation using the digits argument to the print() function. print(dat, digits=3) ax y 1 a 2.01e+13 2.1e+12 2 b 2.01e+13 2.1e+12 3 c 2.01e+13 2.1e+12 4 d 2.01e+13 2.1e+12 5 e 2.01e+13 2.1e+12 What is your default number of digits? getOption(digits) Jean HJ YAN yhj...@googlemail.com wrote on 08/06/2012 11:14:17 AM: Dear R users I read two csv data files into R and called them Tem1 and Tem5. For the first column, data in Tem1 has 13 digits where in Tem5 there are 14 digits for each observation. Originally there are 'numerical' as can be seen in my code below. But how can I display/convert them using other form rather than scientific notations which seems a standard/default? I want them to be in the form like '20110911001084', but I'm very confused why when I used 'as.factor' call it works for my 'Tem1' but not for 'Tem5'...?? Many thanks! HJ Tem1[1:5,1][1] 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12 2. 10004e+12 Tem5[1:5,1][1] 2.011091e+13 2.011091e+13 2.011091e+13 2. 011091e+13 2.011091e+13 class(Tem1[1:5,1])[1] numeric class(Tem5 [1:5,1])[1] numeric as.factor(Tem1[1:5,1])[1] 2.10004e+12 2. 
10004e+12 2.10004e+12 2.10004e+12 2.10004e+12 Levels: 2.10004e+12 as.factor(Tem5[1:5,1])[1] 20110911001084 20110911001084 20110911001084 20110911001084 20110911001084 Levels: 20110911001084 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
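For the display question itself, a minimal sketch using the two values quoted in the thread: format() and sprintf() return fixed-notation strings, and the scipen option changes how numbers print for the whole session. The difference HJ saw with as.factor() is probably because conversion to character picks whichever of the fixed and scientific forms is shorter, though that is an assumption about data that were not posted.

```r
# Values taken from the thread; doubles represent both exactly.
x <- c(20110911001084, 2100040000000)

# format() with scientific = FALSE returns fixed-notation strings;
# trim = TRUE drops the padding used to align the elements.
fixed <- format(x, scientific = FALSE, trim = TRUE)
print(fixed)

# sprintf() gives the same kind of result.
print(sprintf("%.0f", x))

# Session-wide alternative: penalise scientific notation when printing.
old <- options(scipen = 100)
print(x)
options(old)  # restore the previous setting
```

If the column is really an identifier rather than a quantity, reading it as character in the first place (for example through the colClasses argument of read.csv) avoids the issue entirely.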
Re: [R] program of matrix
On Aug 6, 2012, at 8:04 AM, hafida wrote: Hi can ANY body help me to programme this formula: c[lj] and c[l'j] are matrix A[j]^-1 is an invertible diagonal matrix g[ll']=i[ll'] - sum *#from j=1 to k#* c[lj]c[l'j]A[j]^-1 WHERE i[ll']= 1/n sum from i=1 to n z[il] z[il'] n,k,m are given. j=1...k,l,l'=1...m, it s complicate for me ; hope you can help me thank you a lot -- View this message in context: http://r.789695.n4.nabble.com/program-of-matrix-tp4639288.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting Data Into Different Series
Hello, Try the following. split(data.frame(dados), dados[, code]) Also, it's better to have data like 'dados' in a data.frame, like this you would have dates of class Date, and numbers of classes numeric or integer: dados2 - data.frame(dados) dados2$date - as.Date(dados2$date) dados2$value - as.numeric(dados2$value) dados2$code - as.integer(dados2$code) #See the STRucture str(dados2) The code above would be simplified to split(dados2, dados2$code) And it's also better to keep the result in a list, they are all in one place and you can access the components as result[[ 433 ]] # etc. Hope this helps Rui Barradas Em 06-08-2012 18:06, Henrique Andrade escreveu: Dear R Community, I'm trying to write a loop to split my data into different series. I need to make a new matrix (or series) according to the series code. For instance, every time the code column assumes the value 433 I need to save date, value, and code into the dados433 matrix. Please take a look at the following example: dados - matrix(c(2012-01-01,2012-02-01,2012-03-01,2012-04-01,2012-05-01,2012-06-01, 2012-01-01,2012-02-01,2012-03-01,2012-04-01,2012-05-01,2012-06-01, 2012-01-01,2012-02-01,2012-03-01,2012-04-01,2012-05-01,2012-06-01, 0.56,0.45,0.21,0.64,0.36,0.08,152136,153081,155872,158356,162157,166226, 33.47,34.48,35.24,38.42,35.33,34.43,433,433,433,433,433,433,2005,2005,2005, 2005,2005,2005,3939,3939,3939,3939,3939,3939), nrow=18, ncol=3, byrow=FALSE, dimnames=list(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18), c(date, value, code))) dados433 - matrix(data = NA, nrow = 6, ncol = 3, byrow= FALSE) dados2005 - matrix(data = NA, nrow = 6, ncol = 3, byrow= FALSE) dados3939 - matrix(data = NA, nrow = 6, ncol = 3, byrow= FALSE) for(i in seq(along=dados[,3])) { if(dados[i,3] == 433) {dados433[i,1:3] - dados[i,1:3]} } for(i in seq(along=dados[,3])) { if(dados[i,3] == 2005) {dados2005[i,1:3] - dados[i,1:3]} } for(i in seq(along=dados[,3])) { if(dados[i,3] == 3939) {dados3939[i,1:3] - dados[i,1:3]} } Best 
regards, Henrique Andrade __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
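Rui's split() approach can be sketched on a shortened version of 'dados'. The values below are a hypothetical subset (two rows per code), built directly as a data frame so that each column keeps a sensible class. Note that the list components are addressed with quoted names; the quotes appear to have been lost from `result[[ 433 ]]` in the archived message.

```r
# Hypothetical miniature of 'dados', two rows per series code.
dados2 <- data.frame(
  date  = as.Date(c("2012-01-01", "2012-02-01", "2012-01-01",
                    "2012-02-01", "2012-01-01", "2012-02-01")),
  value = c(0.56, 0.45, 152136, 153081, 33.47, 34.48),
  code  = c(433L, 433L, 2005L, 2005L, 3939L, 3939L)
)

# split() returns a named list with one data frame per code.
series <- split(dados2, dados2$code)
print(names(series))

# Components are addressed by name, as character strings.
print(series[["433"]])
```

Keeping the pieces in one list also makes it easy to apply the same step to every series, for instance with lapply(series, function(d) mean(d$value)).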
Re: [R] program of matrix
Well, I just posted the fourth copy to the list, for which I apologize. (I meant to delete the response I wrote.) Re-re-posting an unclear message seems unwise on your part, 'hafida'. You are not following the advice in the footer to all messages and you are not following the advice in the Posting Guide, so it is no surprise that people are not responding. Read the Posting Guide. From there you should take away these lessons:

Learn to use a shift key.

Learn to post your real name and academic or professional affiliation. This is a technical mailing list and anonymity will lower people's level of willingness to offer advice.

Learn to post R code that constructs a data example.

Learn to provide background (i.e., what are you really trying to do?).

Learn that providing the background regarding why you want to do this will reduce the concern that this is just a homework problem. (Homework submissions are generally ignored.)

--
David.

On Aug 6, 2012, at 8:04 AM, hafida wrote:

> Hi can ANY body help me to programme this formula:
> c[lj] and c[l'j] are matrix
> A[j]^-1 is an invertible diagonal matrix
> g[ll']=i[ll'] - sum *#from j=1 to k#* c[lj]c[l'j]A[j]^-1
> WHERE i[ll']= 1/n sum from i=1 to n z[il] z[il']
> n,k,m are given. j=1...k,l,l'=1...m,
> it s complicate for me ; hope you can help me
> thank you a lot
> --
> View this message in context: http://r.789695.n4.nabble.com/program-of-matrix-tp4639288.html
> Sent from the R help mailing list archive at Nabble.com.

David Winsemius, MD
Alameda, CA, USA
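For readers who land on this thread, here is one possible reading of the formula, offered as a sketch only since the original post leaves the shapes unstated. It assumes z is an n x m matrix, c is an m x k matrix, and A[j] denotes the j-th diagonal entry of a diagonal matrix; all sizes and values below are made up.

```r
set.seed(1)
n <- 20; m <- 4; k <- 3             # hypothetical sizes
Z <- matrix(rnorm(n * m), n, m)     # z[i, l]
C <- matrix(rnorm(m * k), m, k)     # c[l, j]
a <- c(2, 5, 10)                    # assumed diagonal entries A[j], all non-zero

# i[l, l'] = (1/n) * sum over i of z[i, l] * z[i, l']
I_mat <- crossprod(Z) / n

# g[l, l'] = i[l, l'] - sum over j of c[l, j] * c[l', j] / A[j]
G <- I_mat - C %*% diag(1 / a, nrow = k) %*% t(C)

# Same quantity with explicit loops, as a check on the matrix form.
G_loop <- matrix(0, m, m)
for (l in 1:m) for (lp in 1:m) {
  G_loop[l, lp] <- mean(Z[, l] * Z[, lp]) - sum(C[l, ] * C[lp, ] / a)
}
print(all.equal(G, G_loop))
```

Under this reading the whole computation is two matrix products, so no loop over l and l' is needed in practice.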
Re: [R] Tukey HSD not fully displayed in R console
Would just saving the results to an object and then writing that object out work? From the help page example:

summary(fm1 <- aov(breaks ~ wool + tension, data = warpbreaks))
myresults <- TukeyHSD(fm1, "tension", ordered = TRUE)
write.table(myresults, file = "ksksk")

John Kane
Kingston ON Canada

-----Original Message-----
From: ulrikebraeck...@hotmail.com
Sent: Mon, 6 Aug 2012 07:19:09 -0700 (PDT)
To: r-help@r-project.org
Subject: [R] Tukey HSD not fully displayed in R console

> Dear all,
>
> I would like to test the differences in dependent variable X depending on 2 grouping variables with 10 levels each. I do this with a 2-way ANOVA, followed by a Tukey HSD test (TukeyHSD(x)). However, since a lot of combinations are possible with 2 grouping variables, each with 10 levels, the result of the Tukey test is not fully displayed in the console. I tried to print it as a table (write.table()) and open it afterwards in Notepad, or to print e.g. only the first 30 rows of the result, but both without success ...
>
> Does anyone have an idea how I can deal with this problem?
>
> Many thanks,
> Ulrike
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Tukey-HSD-not-fully-displayed-in-R-console-tp4639285.html
> Sent from the R help mailing list archive at Nabble.com.
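One detail that may explain the trouble with write.table(): TukeyHSD() returns a list with one matrix per model term, so it is the individual components, not the whole object, that convert cleanly to a table. A minimal sketch using the help-page model follows; the output file is a temporary file chosen for illustration.

```r
# Example model from ?TukeyHSD, using the built-in warpbreaks data.
fm1 <- aov(breaks ~ wool + tension, data = warpbreaks)
res <- TukeyHSD(fm1, "tension", ordered = TRUE)

# Each component is a numeric matrix: one row per pairwise comparison.
tab <- as.data.frame(res$tension)
print(head(tab, 30))          # show only the first 30 rows

# Write the complete table to a file (a temporary file here).
out_file <- tempfile(fileext = ".csv")
write.csv(tab, file = out_file)
```

With two grouping factors, the same steps can be repeated for each name in names(res), for example inside lapply().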
Re: [R] test if elements of a character vector contain letters
On Mon, Aug 6, 2012 at 6:42 PM, Bert Gunter gunter.ber...@gene.com wrote: nzchar(x) !is.na(x) No? It doesn't work for what I need: x [1] a10 b8 c9 d2 e3 f4 g1 h7 i6 j5 k l m n [15] o p q r s t u v w x y z 1 2 [29] 3 4 5 6 7 8 9 10 11 12 13 14 15 16 [43] 17 18 19 20 21 22 23 24 25 26 nzchar(x) !is.na(x) [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE [18] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE [35] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE [52] TRUE I need to have TRUE when an element contains a letter, and FALSE when an element contains only numbers. The above returns TRUE for the entire vector. Regards Liviu On Mon, Aug 6, 2012 at 9:25 AM, Liviu Andronic landronim...@gmail.com wrote: Dear all I'm pretty sure that I'm approaching the problem in a wrong way. Suppose the following character vector: (x[1:10] - paste(x[1:10], sample(1:10, 10), sep='')) [1] a10 b7 c2 d3 e6 f1 g5 h8 i9 j4 x [1] a10 b7 c2 d3 e6 f1 g5 h8 i9 j4 k l m n [15] o p q r s t u v w x y z 1 2 [29] 3 4 5 6 7 8 9 10 11 12 13 14 15 16 [43] 17 18 19 20 21 22 23 24 25 26 How do you test whether the elements of the vector contain at least one letter (or at least one digit) and obtain a logical vector of the same dimension? 
I came up with the following awkward function: is_letter - function(x, pattern=c(letters, LETTERS)){ sapply(x, function(y){ any(sapply(pattern, function(z) grepl(z, y, fixed=T))) }) } is_letter(x) a10b7c2d3e6f1g5h8i9j4 k l m n o TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE p q r s t u v w x y z 1 2 3 4 TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE 5 6 7 8 9101112131415 16171819 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 20212223242526 FALSE FALSE FALSE FALSE FALSE FALSE FALSE is_letter(x, 0:9) ##function slightly misnamed a10b7c2d3e6f1g5h8i9j4 k l m n o TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE p q r s t u v w x y z 1 2 3 4 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE 5 6 7 8 9101112131415 16171819 TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE 20212223242526 TRUE TRUE TRUE TRUE TRUE TRUE TRUE Is there a nicer way to do this? Regards Liviu -- Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader Do you know how to write? http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm -- Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader Do you know how to write? 
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Overlay Histogram
Dear all,

For two sets of random variables, say, x <- rnorm(1000, 10, 10) and y <- rnorm(1000, 3, 20), is there any way to overlay the histograms (and density curves) of x and y on the plot of y vs. x? The histogram of x would be on the x axis and that of y on the y axis. The density curve here is only meant to approximate the shape of the distribution and does not have to have area 1.

Thank you in advance.

Hannah
Re: [R] Overlay Histogram
See example(layout) for one idea. I think you might also want to look into rug plots.

Best,
Michael

On Mon, Aug 6, 2012 at 2:40 PM, li li hannah@gmail.com wrote:

> Dear all,
>
> For two sets of random variables, say, x <- rnorm(1000, 10, 10) and y <- rnorm(1000, 3, 20), is there any way to overlay the histograms (and density curves) of x and y on the plot of y vs. x? The histogram of x is on the x axis and that of y is on the y axis. The density curve here is to approximate the shape of the distribution and does not have to have area 1.
>
> Thank you in advance.
> Hannah
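The idea behind example(layout) can be adapted to Hannah's x and y roughly as follows. This is a sketch modelled on the help-page example, not a polished figure: the bars are only approximately aligned with the scatter plot axes, and density curves are left out.

```r
set.seed(1)
x <- rnorm(1000, 10, 10)
y <- rnorm(1000, 3, 20)

# Compute the histograms without drawing them.
xh <- hist(x, plot = FALSE)
yh <- hist(y, plot = FALSE)
top <- max(xh$counts, yh$counts)

op <- par(no.readonly = TRUE)   # remember graphics settings

# Panel 1: scatter plot (bottom left); 2: x histogram (top); 3: y histogram (right).
layout(matrix(c(2, 0, 1, 3), 2, 2, byrow = TRUE),
       widths = c(3, 1), heights = c(1, 3))
par(mar = c(4, 4, 1, 1))
plot(x, y)
par(mar = c(0, 4, 1, 1))
barplot(xh$counts, axes = FALSE, ylim = c(0, top), space = 0)
par(mar = c(4, 0, 1, 1))
barplot(yh$counts, axes = FALSE, xlim = c(0, top), space = 0, horiz = TRUE)

par(op)                         # restore graphics settings
```

For marginal rug plots instead, rug(x) and rug(y, side = 2) can be called right after plot(x, y) on a single panel.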
Re: [R] test if elements of a character vector contain letters
You probably mean grepl('[a-zA-Z]', x)

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

On Mon, Aug 6, 2012 at 3:29 PM, Liviu Andronic landronim...@gmail.com wrote:

> On Mon, Aug 6, 2012 at 6:42 PM, Bert Gunter gunter.ber...@gene.com wrote:
>> nzchar(x) & !is.na(x)
>>
>> No?
>
> It doesn't work for what I need:
>
> x
>  [1] a10 b8 c9 d2 e3 f4 g1 h7 i6 j5 k l m n
> [15] o p q r s t u v w x y z 1 2
> [29] 3 4 5 6 7 8 9 10 11 12 13 14 15 16
> [43] 17 18 19 20 21 22 23 24 25 26
> nzchar(x) & !is.na(x)
>  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> [18] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> [35] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> [52] TRUE
>
> I need to have TRUE when an element contains a letter, and FALSE when an element contains only numbers. The above returns TRUE for the entire vector.
>
> Regards
> Liviu
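Yihui's one-liner can be checked on a short vector. The sketch below uses a made-up sample of Liviu's elements plus an empty string and an NA; the POSIX class [[:alpha:]] is shown as a locale-aware alternative to [a-zA-Z].

```r
x <- c("a10", "b7", "k", "z", "1", "26", "", NA)

# TRUE where the element contains at least one letter.
has_letter <- grepl("[[:alpha:]]", x)
print(has_letter)

# The ASCII-only pattern from the reply gives the same answer here.
has_ascii_letter <- grepl("[a-zA-Z]", x)

# TRUE where the element contains at least one digit.
has_digit <- grepl("[[:digit:]]", x)
print(has_digit)
```

grepl() is vectorised and returns FALSE for NA input, so it replaces the nested sapply() calls in is_letter() with a single call.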
Re: [R] program of matrix
A OK I MAKE A MISTAKE OK MR DAVID I WILL DO IT THANK YOU -- View this message in context: http://r.789695.n4.nabble.com/program-of-matrix-tp4639288p4639334.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] program of matrix
I CANT FIND ANY ANSWER MR DAVID -- View this message in context: http://r.789695.n4.nabble.com/program-of-matrix-tp4639288p4639332.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Splitting Data Into Different Series
Dear Rui and Arun, Thanks a lot for your help. I will test all the proposed solutions ;-) Best regards, Henrique Andrade 2012/8/6 Rui Barradas ruipbarra...@sapo.pt: Hello, Try the following. split(data.frame(dados), dados[, code]) Also, it's better to have data like 'dados' in a data.frame, like this you would have dates of class Date, and numbers of classes numeric or integer: dados2 - data.frame(dados) dados2$date - as.Date(dados2$date) dados2$value - as.numeric(dados2$value) dados2$code - as.integer(dados2$code) #See the STRucture str(dados2) The code above would be simplified to split(dados2, dados2$code) And it's also better to keep the result in a list, they are all in one place and you can access the components as result[[ 433 ]] # etc. Hope this helps Rui Barradas Em 06-08-2012 18:06, Henrique Andrade escreveu: Dear R Community, I'm trying to write a loop to split my data into different series. I need to make a new matrix (or series) according to the series code. For instance, every time the code column assumes the value 433 I need to save date, value, and code into the dados433 matrix. 
Please take a look at the following example: dados - matrix(c(2012-01-01,2012-02-01,2012-03-01,2012-04-01,2012-05-01,2012-06-01, 2012-01-01,2012-02-01,2012-03-01,2012-04-01,2012-05-01,2012-06-01, 2012-01-01,2012-02-01,2012-03-01,2012-04-01,2012-05-01,2012-06-01, 0.56,0.45,0.21,0.64,0.36,0.08,152136,153081,155872,158356,162157,166226, 33.47,34.48,35.24,38.42,35.33,34.43,433,433,433,433,433,433,2005,2005,2005, 2005,2005,2005,3939,3939,3939,3939,3939,3939), nrow=18, ncol=3, byrow=FALSE, dimnames=list(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18), c(date, value, code))) dados433 - matrix(data = NA, nrow = 6, ncol = 3, byrow= FALSE) dados2005 - matrix(data = NA, nrow = 6, ncol = 3, byrow= FALSE) dados3939 - matrix(data = NA, nrow = 6, ncol = 3, byrow= FALSE) for(i in seq(along=dados[,3])) { if(dados[i,3] == 433) {dados433[i,1:3] - dados[i,1:3]} } for(i in seq(along=dados[,3])) { if(dados[i,3] == 2005) {dados2005[i,1:3] - dados[i,1:3]} } for(i in seq(along=dados[,3])) { if(dados[i,3] == 3939) {dados3939[i,1:3] - dados[i,1:3]} } Best regards, Henrique Andrade __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Andrade __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting Data Into Different Series
HI, You can subset the data dados433-subset(dados,dados[,3]==433) is.matrix(dados433) #[1] TRUE dados433 date value code 1 2012-01-01 0.56 433 2 2012-02-01 0.45 433 3 2012-03-01 0.21 433 4 2012-04-01 0.64 433 5 2012-05-01 0.36 433 6 2012-06-01 0.08 433 dados2005-subset(dados,dados[,3]==2005) dados3939-subset(dados,dados[,3]==3939) #or split the data dados1-as.data.frame(dados) dados2-split(dados1,dados1$code) - Original Message - From: Henrique Andrade henrique.coe...@gmail.com To: r-help@r-project.org Cc: Sent: Monday, August 6, 2012 1:06 PM Subject: [R] Splitting Data Into Different Series Dear R Community, I'm trying to write a loop to split my data into different series. I need to make a new matrix (or series) according to the series code. For instance, every time the code column assumes the value 433 I need to save date, value, and code into the dados433 matrix. Please take a look at the following example: dados - matrix(c(2012-01-01,2012-02-01,2012-03-01,2012-04-01,2012-05-01,2012-06-01, 2012-01-01,2012-02-01,2012-03-01,2012-04-01,2012-05-01,2012-06-01, 2012-01-01,2012-02-01,2012-03-01,2012-04-01,2012-05-01,2012-06-01, 0.56,0.45,0.21,0.64,0.36,0.08,152136,153081,155872,158356,162157,166226, 33.47,34.48,35.24,38.42,35.33,34.43,433,433,433,433,433,433,2005,2005,2005, 2005,2005,2005,3939,3939,3939,3939,3939,3939), nrow=18, ncol=3, byrow=FALSE, dimnames=list(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18), c(date, value, code))) dados433 - matrix(data = NA, nrow = 6, ncol = 3, byrow= FALSE) dados2005 - matrix(data = NA, nrow = 6, ncol = 3, byrow= FALSE) dados3939 - matrix(data = NA, nrow = 6, ncol = 3, byrow= FALSE) for(i in seq(along=dados[,3])) { if(dados[i,3] == 433) {dados433[i,1:3] - dados[i,1:3]} } for(i in seq(along=dados[,3])) { if(dados[i,3] == 2005) {dados2005[i,1:3] - dados[i,1:3]} } for(i in seq(along=dados[,3])) { if(dados[i,3] == 3939) {dados3939[i,1:3] - dados[i,1:3]} } Best regards, Henrique Andrade __ R-help@r-project.org mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Overlay Histogram
Hi,

These links might be helpful for you:

http://stackoverflow.com/questions/8545035/scatterplot-with-marginal-histograms-in-ggplot2
http://stackoverflow.com/questions/11022675/rotate-histogram-in-r-or-overlay-a-density-in-a-barplot

A.K.

----- Original Message -----
From: li li hannah@gmail.com
To: r-help r-help@r-project.org
Cc:
Sent: Monday, August 6, 2012 3:40 PM
Subject: [R] Overlay Histogram

> Dear all,
>
> For two sets of random variables, say, x <- rnorm(1000, 10, 10) and y <- rnorm(1000, 3, 20), is there any way to overlay the histograms (and density curves) of x and y on the plot of y vs. x? The histogram of x is on the x axis and that of y is on the y axis. The density curve here is to approximate the shape of the distribution and does not have to have area 1.
>
> Thank you in advance.
> Hannah
Re: [R] AR vs ARMA model
Please consult an online reference or a textbook; this topic is covered under model assessment. AIC, BIC, or rolling prediction/forecasting error might be what you want.

Best wishes,
Jie

On Fri, Aug 3, 2012 at 4:07 AM, Soham soham.tommarvolorid...@gmail.com wrote:

> Hi
>
> I am trying to fit a time series. It gives an AR(2) model using the ar function and an ARMA(1,1) model using the autoarmafit function in the timsac package. How do I know which is the correct underlying model? Please help.
>
> --
> View this message in context: http://r.789695.n4.nabble.com/AR-vs-ARMA-model-tp4639015.html
> Sent from the R help mailing list archive at Nabble.com.
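As a concrete illustration of the AIC comparison Jie mentions, here is a minimal sketch on simulated data. The series is generated for the example (it is not Soham's data), and both candidates are fitted with arima() from the stats package rather than with ar() or timsac's autoarmafit, so that the two criteria are computed the same way.

```r
set.seed(123)
# Simulated series standing in for the poster's data.
y <- arima.sim(model = list(ar = 0.6, ma = 0.4), n = 500)

fit_ar2    <- arima(y, order = c(2, 0, 0))   # AR(2)
fit_arma11 <- arima(y, order = c(1, 0, 1))   # ARMA(1,1)

# Lower AIC indicates the preferred model among those compared.
aic <- c(AR2 = AIC(fit_ar2), ARMA11 = AIC(fit_arma11))
print(aic)
```

A small AIC difference means the data do not clearly favour either model; in that case out-of-sample forecast error on held-back observations is a useful second check.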
Re: [R] program of matrix
On Aug 6, 2012, at 11:16 AM, hafida wrote:

> I CANT FIND ANY ANSWER MR DAVID

When I suggested that you learn to use the shift key, I was hoping for a sparing use of that key, such as at the beginning of sentences. The caps-lock key is different from the shift key. You are also posting to a mailing list whose Posting Guide requests that posters include context.

> --
> View this message in context: http://r.789695.n4.nabble.com/program-of-matrix-tp4639288p4639332.html
> Sent from the R help mailing list archive at Nabble.com.

--
David Winsemius, MD
Alameda, CA, USA
Re: [R] cannot find function simpleRDA2
Hi, simpleRDA2 is still in the vegan package, but it is not exported. I.e., the author only intends it for internal use and he doesn't make it available to end users directly. If you need to get at it, you can use getAnywhere(simpleRDA2) which will show it. If you need to make it available to your scripts, you can add the line simpleRDA2 - vegan:::simpleRDA2 # Note three colons which will make a copy in your workspace (global environment) that your functions can access. Best, Michael On Mon, Aug 6, 2012 at 10:40 AM, Lindsey Leigh Sloat llsl...@email.arizona.edu wrote: Hi, I am trying to run the command forward.sel.par, however I receive the error message: Error: could not find function 'simpleRDA2'. I have the vegan library loaded. The documentation on varpart has not helped me to understand why I cannot call this function. Maybe I am missing something obvious because I am still an 'R' novice. Below is a reproducible example for you. Thank you always for all of your help. Lindsey example: X=matrix(rnorm(30),10,3) Y=matrix(rnorm(50),10,5) forward.sel.par - function(Y, X, alpha = 0.05, K = nrow(X)-1, R2thresh = 0.99, R2more = 0.001, adjR2thresh = 0.99, Yscale = FALSE, verbose=TRUE) ## ## Parametric forward selection of explanatory variables in regression and RDA. ## Y is the response, X is the table of explanatory variables. ## ## If Y is univariate, this function implements FS in regression. ## If Y is multivariate, this function implements FS using the F-test described ## by Miller and Farr (1971). This test requires that ## -- the Y variables be standardized, ## -- the error in the response variables be normally distributed (to be verified by the user). ## ## This function uses 'simpleRDA2' and 'RsquareAdj' developed for 'varpart' in 'vegan'. ## ##Pierre Legendre Guillaume Blanchet, May 2007 ## ## Arguments -- ## ## Y Response data matrix with n rows and m columns containing quantitative variables. 
## X Explanatory data matrix with n rows and p columns containing quantitative variables. ## alpha Significance level. Stop the forward selection procedure if the p-value of a variable is higher than alpha. The default is 0.05. ## K Maximum number of variables to be selected. The default is one minus the number of rows. ## R2thresh Stop the forward selection procedure if the R-square of the model exceeds the stated value. This parameter can vary from 0.001 to 1. ## R2moreStop the forward selection procedure if the difference in model R-square with the previous step is lower than R2more. The default setting is 0.001. ## adjR2thresh Stop the forward selection procedure if the adjusted R-square of the model exceeds the stated value. This parameter can take any value (positive or negative) smaller than 1. ## YscaleStandardize the variables in table Y to variance 1. The default setting is FALSE. The setting is automatically changed to TRUE if Y contains more than one variable. This is a validity condition for the parametric test of significance (Miller and Farr 1971). ## ## Reference: ## Miller, J. K., and S. D. Farr. 1971. Bimultivariate redundancy: a comprehensive measure of ##interbattery relationship. Multivariate Behavioral Research 6: 313-324. { require(vegan) FPval - function(R2cum,R2prev,n,mm,p) ## Compute the partial F and p-value after adding a single explanatory variable to the model. ## In FS, the number of df of the numerator of F is always 1. See Sokal Rohlf 1995, eq 16.14. ## ## The amendment, based on Miller and Farr (1971), consists in multiplying the numerator and ## denominator df by 'p', the number of variables in Y, when computing the p-value. 
## ##Pierre Legendre, May 2007 { df2 - (n-1-mm) Fstat - ((R2cum-R2prev)*df2) / (1-R2cum) pval - pf(Fstat,1*p,df2*p,lower.tail=FALSE) return(list(Fstat=Fstat,pval=pval)) } Y - as.matrix(Y) X - apply(as.matrix(X),2,scale,center=TRUE,scale=TRUE) var.names = colnames(as.data.frame(X)) n - nrow(X) m - ncol(X) if(nrow(Y) != n) stop(Numbers of rows not the same in Y and X) p - ncol(Y) if(p 1) { Yscale = TRUE if(verbose) cat(The variables in response matrix Y have been standardized,'\n') } Y - apply(Y,2,scale,center=TRUE,scale=Yscale) SS.Y - sum(Y^2) X.out - c(1:m) ## Find the first variable X to include in the model R2prev - 0 R2cum - 0 for(j in 1:m) { toto - simpleRDA2(Y,X[,j],SS.Y) if(toto$Rsquare R2cum) { R2cum - toto$Rsquare no.sup - j } } mm - 1 FP - FPval(R2cum,R2prev,n,mm,p) if(FP$pval = alpha) { adjRsq - RsquareAdj(R2cum,n,mm) res1 - var.names[no.sup] res2 - no.sup res3 - R2cum res4 - R2cum res5 - adjRsq res6 - FP$Fstat res7 - FP$pval X.out[no.sup] - 0 delta -
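Michael's point about non-exported functions can be tried without vegan installed. The sketch below uses t.test.default from the stats package as a stand-in for simpleRDA2, since it is likewise present in a namespace without being exported; the name my_ttest is made up for the example.

```r
# t.test.default lives in the stats namespace but is not exported.
exported <- "t.test.default" %in% getNamespaceExports("stats")
print(exported)

# getAnywhere() locates it regardless of export status.
found <- getAnywhere("t.test.default")
print(found$where)

# The triple colon reaches into the namespace; assigning the result makes a
# copy in the workspace, as suggested for simpleRDA2.
my_ttest <- stats:::t.test.default
print(is.function(my_ttest))
```

Because non-exported functions are internal, their arguments may change between package versions, so a script relying on `:::` is best pinned to a known version of the package.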
[R] Force evaluation of a symbol when a function is created
I am porting a program from Matlab to R. The problem is that Matlab has a feature where symbols that aren't arguments are evaluated immediately, when the function is created. That is:

    Y = 3
    F = @(x) x*Y

will yield a function such that F(2) = 6. If later, say, Y = 4, then F(2) will still equal 6. R, on the other hand, has lazy evaluation:

    F <- function(x) { x*Y }

will do the following: with Y = 3, F(2) = 6; after Y = 4, F(2) = 8.

Does anyone know of a way to defeat lazy evaluation in R so that I can easily simulate the Matlab behavior? I know that I can live without this in ordinary programming, but it would make my port much easier.

Thanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Force evaluation of a symbol when a function is created
You could use local(), as in

    > F <- local({
    +     Y <- 3
    +     function(x) x * Y
    + })
    > F(7)
    [1] 21
    > Y <- 19
    > F(5)
    [1] 15

Look into 'environments' for more.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Schoenfeld, David Alan,Ph.D.,Biostatistics
Sent: Monday, August 06, 2012 2:08 PM
To: 'r-help@r-project.org'
Subject: [R] Force evaluation of a symbol when a function is created

[...]
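The same idea can be written as a function factory, which may be convenient when many such functions are needed in a port. This is only a minimal sketch: `make_multiplier` is a hypothetical name that does not appear in the thread. The current value of Y is handed to the factory as an argument, and force() evaluates that argument immediately instead of at the first call of the returned function.

```r
# Hypothetical helper: returns a function that multiplies by a fixed value.
make_multiplier <- function(y) {
  force(y)            # evaluate the argument now, not on first use
  function(x) x * y   # y is found in the factory's own environment
}

Y <- 3
F <- make_multiplier(Y)   # captures the value Y has right now
Y <- 4                    # later changes to the global Y are not seen by F
F(2)

# For contrast: a free variable is looked up when the function is called.
G <- function(x) x * Y
G(2)
```

Here F(2) keeps using the value 3 that was captured at creation time, while G(2) follows the current global Y.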
Re: [R] R: Help xts object Subset Date by Day of the Week
On Sun, Aug 5, 2012 at 4:49 PM, Douglas Karabasz doug...@sigmamonster.com wrote:
> I have an xts object made of daily closing prices I have acquired using
> quantmod. Here is my code:
>
>     library(xts)
>     library(quantmod)
>     library(lubridate)
>     # Gets SPY data
>     getSymbols("SPY")
>     # Subset Prices to just closing price
>     SP500 <- Cl(SPY)
>     # Show day of the week for each date using 2-6 for monday-friday
>     SP500wd <- wday(SP500)
>     # Add Price and days of week together
>     SP500wd <- cbind(SP500, SP500wd)
>     # subset Monday into one xts object
>     SPmon <- subset(SP500wd, SP500wd$..2==2)
>
> I then used the package lubridate to show the days of the week. Because an
> xts object has to be numeric, each day is represented as a number, so that
> Monday=2, Tuesday=3, Wednesday=4, Thursday=5, Friday=6, Saturday=7. Since
> this is a financial index you will only see the numbers 2-6, or
> Monday-Friday. I want to subset the data by using the day column. I would
> like some help to figure out the best way to accomplish a few objectives.
>
> 1. Subset the data so that I only show Mondays in sequence. However, I do
> want to make sure that it shows the date, price and the ..2 column (which
> is the day of week) after subsetting the data. (I have it done but am not
> sure if it is the best way.)

I think what you do works; this might also be a one-liner:

    SPY[format(index(SPY), "%a") == "Mon", ]

Alternatively,

    split.default(SPY, format(index(SPY), "%a"))

creates a list of xts objects split by day of the week. (Note you need split.default here because split.xts does something different.)

> 2. Rearrange the object (hopefully without destroying the xts object) so
> that my data lines up like a weekly calendar. So it would look like the
> following.

Unfortunately, your formatting got all chewed up by the R-help server, which doesn't like HTML, so I'm not quite sure what you want here. Possibly some black magic like this?

    SPY.CL <- Cl(SPY)
    length(SPY.CL) <- 7*floor(length(SPY.CL)/7)
    dim(SPY.CL) <- c(length(SPY.CL)/7, 7)

But note that this loses time stamps because each row can only have a single time stamp.

You might also try to.weekly()

Cheers,
Michael

> Monday               Tuesday              Wednesday            Thursday             Friday
> Date      Price Day  Date      Price Day  Date      Price Day  Date      Price Day  Date      Price Day
> 1/5/2009  92.85 2    1/6/2009  93.47 3    1/7/2009  90.67 4    1/8/2009  84.4  5    1/9/2009  89.09 6
> 1/12/2009 86.95 2    1/13/2009 87.11 3    1/14/2009 84.37 4    1/15/2009 91.04 5    1/16/2009 85.06 6
> MLK Monday           1/20/2009 80.57 3    1/21/2009 84.05 4    1/22/2009 82.75 5    1/23/2009 83.11 6
> 1/26/2009 83.68 2    1/27/2009 84.53 3    1/28/2009 87.39 4    1/29/2009 84.55 5    1/30/2009 82.83 6
>
> Thank you,
> Douglas
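The two objectives in this message (pick out Mondays, then lay the prices out like a weekly calendar) can be sketched in base R without xts. Everything below is an illustration under assumptions: the prices are synthetic stand-ins for the SPY closes, and the column names `date`, `close`, `wday` and `week` are made up for the example. The weekday is taken from format(x, "%u"), which returns "1" for Monday through "7" for Sunday regardless of locale, whereas "%a" gives locale-dependent abbreviations.

```r
# Synthetic daily series standing in for the SPY closing prices.
dates  <- seq(as.Date("2009-01-05"), as.Date("2009-01-30"), by = "day")
prices <- data.frame(date = dates,
                     close = round(90 - seq_along(dates) * 0.25, 2))

# ISO weekday number as a string: "1" = Monday ... "7" = Sunday.
prices$wday <- format(prices$date, "%u")

# Keep Monday-Friday only, then pull out the Mondays.
trading <- prices[prices$wday %in% as.character(1:5), ]
mondays <- trading[trading$wday == "1", ]

# Calendar-style layout: one row per week (labelled by that week's Monday),
# one column per weekday. Missing days (e.g. holidays) would show up as NA.
trading$week <- trading$date - (as.integer(trading$wday) - 1)
calendar <- tapply(trading$close,
                   list(week = format(trading$week), day = trading$wday),
                   function(v) v[1])
colnames(calendar) <- c("Mon", "Tue", "Wed", "Thu", "Fri")
calendar
```

The calendar is a plain numeric matrix, so, as noted in the reply above, the per-observation time stamps are not carried along; only the week label in the row names survives.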
Re: [R] R: Help xts object Subset Date by Day of the Week
On Mon, Aug 6, 2012 at 4:30 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:
> On Sun, Aug 5, 2012 at 4:49 PM, Douglas Karabasz doug...@sigmamonster.com wrote:
[...]
> Possibly some black magic like this?
>
>     SPY.CL <- Cl(SPY)
>     length(SPY.CL) <- 7*floor(length(SPY.CL)/7)
>     dim(SPY.CL) <- c(length(SPY.CL)/7, 7)
>
> But note that this loses time stamps because each row can only have a
> single time stamp.

To clarify: that's not _why_ it loses the time stamps (and xts-ness), just that it does happen. Technically, it's because dim<-.xts doesn't exist; the reason it doesn't (I'd imagine) is the time stamp thing.

M

> You might also try to.weekly()
>
> Cheers,
> Michael
Re: [R] test if elements of a character vector contain letters
On Mon, Aug 6, 2012 at 7:35 PM, Marc Schwartz marc_schwa...@me.com wrote:
> is.letter <- function(x) grepl("[[:alpha:]]", x)
> is.number <- function(x) grepl("[[:digit:]]", x)

This does exactly what I wanted:

    > x
     [1] a10 b8  c9  d2  e3  f4  g1  h7  i6  j5  k   l   m   n
    [15] o   p   q   r   s   t   u   v   w   x   y   z   1   2
    [29] 3   4   5   6   7   8   9   10  11  12  13  14  15  16
    [43] 17  18  19  20  21  22  23  24  25  26
    > xb <- grepl("[[:alpha:]]", x)
    > x[xb]  ## extract all vector elements that contain a letter
     [1] a10 b8  c9  d2  e3  f4  g1  h7  i6  j5  k   l   m   n
    [15] o   p   q   r   s   t   u   v   w   x   y   z
    > xb <- grepl("[[:digit:]]", x)
    > x[xb]  ## extract all vector elements that contain a digit
     [1] a10 b8  c9  d2  e3  f4  g1  h7  i6  j5  1   2   3   4
    [15] 5   6   7   8   9   10  11  12  13  14  15  16  17  18
    [29] 19  20  21  22  23  24  25  26

Thanks all for the suggestions!

Regards
Liviu
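Note that the patterns above test whether an element contains a letter or a digit anywhere, which is why "a10" shows up in both results. If the goal were instead elements made up only of letters or only of digits, anchoring the pattern is one option. A small sketch on a shortened, made-up vector:

```r
x <- c("a10", "b8", "k", "z", "1", "26")

has_letter  <- grepl("[[:alpha:]]", x)      # at least one letter somewhere
only_letter <- grepl("^[[:alpha:]]+$", x)   # letters and nothing else
only_digit  <- grepl("^[[:digit:]]+$", x)   # digits and nothing else

x[has_letter]
x[only_letter]
x[only_digit]
```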
Re: [R] Force evaluation of a symbol when a function is created
Thanks to both: cute question, clever, informative answer.

However, Bill, I don't think you **quite** answered him, although the modification needed is completely trivial. Of course, I could never have figured it out without your response.

Anyway, I interpret the question as asking for the function definition to _implicitly_ pick up the value of Y at the time the function is defined, rather than explicitly assigning it in local(). The following are two essentially identical approaches; I prefer the second because it's more transparent to me, but that's just a matter of taste.

    Y <- 3
    F <- local({y <- Y; function(x) x*y})
    G <- evalq(function(x) x*y, env=list(y=Y))

Yielding:

    > Y <- 3
    > F <- local({y <- Y; function(x) x*y})
    > G <- evalq(function(x) x*y, env=list(y=Y))
    > F(5)
    [1] 15
    > G(5)
    [1] 15
    > Y <- 2
    > F(5)
    [1] 15
    > G(5)
    [1] 15
    > F <- local({y <- Y; function(x) x*y})
    > G <- evalq(function(x) x*y, env=list(y=Y))
    > F(5)
    [1] 10
    > G(5)
    [1] 10

Cheers,
Bert

On Mon, Aug 6, 2012 at 2:24 PM, William Dunlap wdun...@tibco.com wrote:
[...]
Re: [R] Force evaluation of a symbol when a function is created
Both of those approaches require the function to be created at the same time that the environment containing some of its bindings is created. You can also take an existing function and assign a new environment to it. E.g.,

    > f <- function(x) y * x
    > ys <- c(2,3,5,7,11)
    > fs <- lapply(ys, function(y) { env <- new.env(parent=baseenv()); env[["y"]] <- y; environment(f) <- env; f })
    > # fs is a list of functions, all identical except for their environments, which contain 'y'.
    > fs[[2]]
    function (x) y * x
    <environment: 0x05df1c38>
    > fs[[2]](10)
    [1] 30
    > fs[[3]]
    function (x) y * x
    <environment: 0x05def8c0>
    > fs[[3]](10)
    [1] 50

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: Bert Gunter [mailto:gunter.ber...@gene.com]
Sent: Monday, August 06, 2012 3:03 PM
To: William Dunlap
Cc: Schoenfeld, David Alan,Ph.D.,Biostatistics; r-help@r-project.org
Subject: Re: [R] Force evaluation of a symbol when a function is created

[...]
Re: [R] Force evaluation of a symbol when a function is created
Thank you both, this was very helpful. I need to study environments more. Do either of you know a good source?

-----Original Message-----
From: Bert Gunter [mailto:gunter.ber...@gene.com]
Sent: Monday, August 06, 2012 6:03 PM
To: William Dunlap
Cc: Schoenfeld, David Alan,Ph.D.,Biostatistics; r-help@r-project.org
Subject: Re: [R] Force evaluation of a symbol when a function is created

[...]
[R] Correct Place to Seek an R-Project Consultant?
I would like to find out how to apply commands found in the bayesm package to analyze data gathered via a choice-based conjoint study. Is there a web resource where I can seek an R-Project consultant experienced in this, whom I could hire to walk me through the appropriate bayesm commands to use for this purpose?

Thanks in advance to all for any info.
Re: [R] Force evaluation of a symbol when a function is created
On Mon, Aug 6, 2012 at 9:03 PM, Schoenfeld, David Alan,Ph.D.,Biostatistics dschoenf...@partners.org wrote:
> Thank you both, this was very helpful. I need to study environments more.
> Do either of you know a good source?

Disclaimer: I really have no idea what I'm talking about.

They are a somewhat subtle, but exceptionally powerful, concept. See, inter alia,

cran.r-project.org/doc/contrib/Fox-Companion/appendix-scope.pdf
http://www.lemnica.com/esotericR/Introducing-Closures/

If you know a little bit of C, it will go a long way in understanding environments in R. You'll want to (eventually) start to associate R names with C pointers, and environments with symbol tables (hence the fact that the printed environment is just a memory address), but that's perhaps a little bit down the road. Because of this, environments differ in their fundamental behavior from other R objects: they're the best way to get pass-by-reference in R.

Best,
Michael
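The pass-by-reference remark can be made concrete with a small sketch. The names `bump`, `bump_list`, `counter` and `lst` are hypothetical and chosen only for this illustration: a function that modifies an environment changes the caller's object, while the same code applied to a list only changes a local copy.

```r
# Hypothetical helper that mutates a counter stored in an environment.
bump <- function(counter) {
  counter$n <- counter$n + 1   # modifies the caller's environment in place
  invisible(NULL)
}

counter <- new.env()
counter$n <- 0
bump(counter)
bump(counter)
counter$n   # both increments are visible to the caller

# The same pattern with a list only changes a copy inside the function.
bump_list <- function(lst) {
  lst$n <- lst$n + 1
  invisible(NULL)
}

lst <- list(n = 0)
bump_list(lst)
lst$n       # unchanged
```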
[R] GAM and interpolation?
Hello fellow R users,

I would need your help on GAM/GAMM models and interpolation on a marked spatial point process (cases and controls).

I use the mgcv package to fit a GAMM model with a binary outcome, a parametric part (var1 + ... + varn), a spline used for the spatial variation, and a random effect coded through another spline, in this form:

    gam(outcome ~ var1 + ... + varn + s(xlong+ylat) + s(var, bs="re"), data=MyData, family=binomial(link="logit"))

My purpose is to calculate a risk map adjusted on my covariates, to compare it with a risk map calculated by kernel ratio and look for obvious differences.

However, the big deal is to interpolate my model to estimate the risk over the whole area of interest, and of course I don't have measurements of the variables (except geographic coordinates) for the whole area: only for the individuals in the dataset.

I am kind of lost. I have been searching for a couple of days now, and I tried the predict.gam function with the easy type="response" and the more mysterious type="lpmatrix", and other possibilities, but cannot find what I am looking for. I only calculate the risk for my individuals.

I thought that the non-parametric spline component of the GAM/GAMM models could have helped me interpolate and fill the gaps. Did I miss something big? Are there solutions (without headache) or a magical package I missed?

Thank you for any help you could bring!

Alex
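One common way to get an adjusted surface out of such a model is to predict on a regular grid of coordinates while holding the other covariates at chosen reference values. The sketch below is not from the thread and rests on several assumptions: the data are simulated, the spatial smooth is written as s(xlong, ylat), the covariate is fixed at its sample mean, and the random-effect smooth is switched off with the `exclude` argument of predict.gam (any valid factor level still has to be supplied in the new data). Whether mean-valued covariates are the right reference for a given study is a modelling decision.

```r
library(mgcv)

set.seed(1)
n <- 400
# Simulated stand-in for MyData: coordinates, one covariate, one grouping factor.
d <- data.frame(xlong = runif(n), ylat = runif(n), var1 = rnorm(n),
                grp = factor(sample(letters[1:5], n, replace = TRUE)))
eta <- -0.5 + 0.8 * d$var1 + sin(2 * pi * d$xlong) * d$ylat
d$outcome <- rbinom(n, 1, plogis(eta))

fit <- gam(outcome ~ var1 + s(xlong, ylat) + s(grp, bs = "re"),
           data = d, family = binomial(link = "logit"), method = "REML")

# Regular grid over the study area; covariates held at reference values.
grid <- expand.grid(xlong = seq(0, 1, length.out = 25),
                    ylat  = seq(0, 1, length.out = 25))
grid$var1 <- mean(d$var1)   # assumption: adjust to the average covariate value
grid$grp  <- d$grp[1]       # placeholder level; its effect is excluded below

# Predicted risk on the grid with the random-effect term set to zero.
grid$risk <- as.numeric(predict(fit, newdata = grid, type = "response",
                                exclude = "s(grp)"))
head(grid)
```

The resulting `grid$risk` column can then be passed to any plotting routine that accepts gridded values, for comparison with the kernel-ratio map.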
Re: [R] Force evaluation of a symbol when a function is created
Hi,

Try this:

    F <- function(x, type="local") {
        Y = 3
        x*Y
    }
    F(3)
    #[1] 9
    Y <- 4
    F(3)
    #[1] 9
    Y <- 5
    F(3)
    #[1] 9

A.K.

----- Original Message -----
From: Schoenfeld, David Alan,Ph.D.,Biostatistics dschoenf...@partners.org
To: 'r-help@r-project.org' r-help@r-project.org
Cc:
Sent: Monday, August 6, 2012 5:07 PM
Subject: [R] Force evaluation of a symbol when a function is created

[...]
[R] Nested ANOVA
Hi,

How do I do a Duncan Multiple Range Test with a nested ANOVA? I am not sure how to write the nested variable within dmrt. Any suggestions?

Archana
[R] label_wrap_gen question
Hi, all

I am trying to use the label_wrap_gen function from this website:
https://github.com/hadley/ggplot2/wiki/labeller

I tried to make a long name like this:

    "Light and heavy good vehicles (diesel) -\nGVX"

    f2 = facet_grid(vehicle ~ ., labeller=label_wrap_gen(width=15))

Eventually, I got something like this in my label:

    Light and heavy good vehicles (diesel) - GVX

I expected the \n to break GVX onto the next row, but it failed. Is it a bug? Or has it been overridden by width=15, so that the \n could not work?

Then I tried

    f2 = facet_grid(vehicle ~ .)

The \n did work and I got

    Light and heavy good vehicles (diesel) -
    GVX

But it also failed, because the label could not be shown in full.

Does anyone have an idea about this? It is freaking me out~ I am sorry, I am stupid at R.

Thanks in advance.

VD

--
View this message in context: http://r.789695.n4.nabble.com/label-wrap-gen-question-tp4639364.html
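A possible explanation, assuming the wiki's label_wrap_gen is built on base R's strwrap(): strwrap() treats a single newline in its input as ordinary whitespace, so an embedded "\n" is discarded before the text is re-wrapped to the requested width. The sketch below shows that behaviour and one way around it, splitting on the explicit breaks first and wrapping each piece separately. `wrap_keep_breaks` is a hypothetical helper, not part of ggplot2.

```r
# strwrap() collapses a single embedded newline into a space.
strwrap("a\nb", width = 15)

# Hypothetical helper: wrap each explicitly separated piece on its own,
# then put the explicit line breaks back.
wrap_keep_breaks <- function(label, width = 15) {
  pieces  <- strsplit(label, "\n", fixed = TRUE)[[1]]
  wrapped <- vapply(pieces,
                    function(p) paste(strwrap(p, width = width), collapse = "\n"),
                    character(1))
  paste(wrapped, collapse = "\n")
}

label <- "Light and heavy good vehicles (diesel) -\nGVX"
kept  <- wrap_keep_breaks(label, width = 15)
cat(kept, "\n")
```

A labeller could apply such a helper to each facet value, so that both the width limit and the hand-placed break before GVX are respected.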
Re: [R] line type lty
lty = 1 denotes a solid (continuous) line
lty = 2 denotes a dashed (broken) line
lty = 3 denotes a dotted line

--
View this message in context: http://r.789695.n4.nabble.com/line-type-lty-tp3466345p4639365.html
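For reference, base graphics defines six numbered line types beyond blank (0), and each also has a name. A small sketch that draws all of them side by side; the layout coordinates are arbitrary choices for the illustration:

```r
lty_names <- c("solid", "dashed", "dotted", "dotdash", "longdash", "twodash")

# Empty plotting region, one horizontal row per line type.
plot(NULL, xlim = c(0, 1), ylim = c(0.5, 6.5),
     xlab = "", ylab = "lty", yaxt = "n")
axis(2, at = 1:6, las = 1)

for (i in 1:6) {
  # lty = i and lty = lty_names[i] select the same line type
  segments(0.05, i, 0.6, i, lty = i, lwd = 2)
  text(0.65, i, lty_names[i], adj = 0)
}
```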
Re: [R] How to convert data to 'normal' if they are in the form of standard scientific notations?
Dear Jean,

Thanks a lot for your help.

The reason I did not provide reproducible code is that my work started with reading in some large csv files, i.e. the data is not created by myself. But the data is from the same data provider, so I would expect to receive it in exactly the same format. I use read.csv to read the data in.

My main question is this: using exactly the same code as in my email, i.e. 'as.factor', why does it work for one of them (it converts the numerical data to a factor in plain notation) while the other one stays in scientific notation? So, in R, how do I check whether the data formats differ between these two files in their original csv form, which might cause the different results?

Also, I tried your code and created some reproducible examples, but still cannot make it work as in your example:

    > a <- c(2.0e+9,2.1e+9)
    > print(a,digits=4)
    [1] 2000000000 2100000000
    # I expected to see 2.0e+9 here...?
    > print(a,digits=7)
    [1] 2000000000 2100000000
    # Think here I should expect same 2.0e+9?
    > getOption("digits")  # Checking my default number of digits now..
    [1] 7
    > b <- c(3000,3100)
    > print(b)
    [1] 3000 3100
    # This is what I expected to see
    > print(b,digits=5)
    [1] 3000 3100
    # I'm so confused why it is not working, e.g. printing 3.0e+9!
    > getOption("digits")  # checking again, but now I would expect it has been changed to 5
    [1] 7

Any thoughts please...?

Thanks
HJ

On Mon, Aug 6, 2012 at 7:04 PM, Jean V Adams jvad...@usgs.gov wrote:
> HJ,
>
> You don't provide any reproducible code, so I had to make up my own.
>
>     dat <- data.frame(a=letters[1:5],
>         x=c(20110911001084, 20110911001084, 20110911001084, 20110911001084, 20110911001084),
>         y=c(2.10004e+12, 2.10004e+12, 2.10004e+12, 2.10004e+12, 2.10004e+12))
>
> In my example, the long numbers print out without scientific notation.
>
>     > dat
>       a              x             y
>     1 a 20110911001084 2100040000000
>     2 b 20110911001084 2100040000000
>     3 c 20110911001084 2100040000000
>     4 d 20110911001084 2100040000000
>     5 e 20110911001084 2100040000000
>
> I can make it print with scientific notation using the digits argument to
> the print() function.
>
>     > print(dat, digits=3)
>       a        x       y
>     1 a 2.01e+13 2.1e+12
>     2 b 2.01e+13 2.1e+12
>     3 c 2.01e+13 2.1e+12
>     4 d 2.01e+13 2.1e+12
>     5 e 2.01e+13 2.1e+12
>
> What is your default number of digits?
>
>     getOption("digits")
>
> Jean
>
> HJ YAN yhj...@googlemail.com wrote on 08/06/2012 11:14:17 AM:
>> Dear R users
>>
>> I read two csv data files into R and called them Tem1 and Tem5.
>>
>> For the first column, data in Tem1 has 13 digits whereas in Tem5 there
>> are 14 digits for each observation.
>>
>> Originally they are 'numeric', as can be seen in my code below. But how
>> can I display/convert them in a form other than scientific notation,
>> which seems to be the standard/default?
>>
>> I want them to be in a form like '20110911001084', but I'm very confused
>> why, when I use the 'as.factor' call, it works for my 'Tem5' but not for
>> 'Tem1'...??
>>
>> Many thanks!
>>
>> HJ
>>
>>     > Tem1[1:5,1]
>>     [1] 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12
>>     > Tem5[1:5,1]
>>     [1] 2.011091e+13 2.011091e+13 2.011091e+13 2.011091e+13 2.011091e+13
>>     > class(Tem1[1:5,1])
>>     [1] "numeric"
>>     > class(Tem5[1:5,1])
>>     [1] "numeric"
>>     > as.factor(Tem1[1:5,1])
>>     [1] 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12 2.10004e+12
>>     Levels: 2.10004e+12
>>     > as.factor(Tem5[1:5,1])
>>     [1] 20110911001084 20110911001084 20110911001084 20110911001084 20110911001084
>>     Levels: 20110911001084
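A likely explanation, though not one confirmed in the thread, is that as.factor() converts through as.character(), which for a double tends to pick the shorter of the fixed and scientific representations: 2.10004e+12 is shorter in scientific form, while 20110911001084 is shorter in fixed form. The sketch below shows three ways to control this; the small inline csv and the column names `id` and `value` are made up for the illustration.

```r
ids <- c(2.10004e+12, 20110911001084)

# Fixed notation on demand, without touching global options.
format(ids, scientific = FALSE)

# Raising the 'scipen' penalty makes printing prefer fixed notation.
old <- options(scipen = 20)
print(ids)
options(old)   # restore the previous setting

# If the column is really an identifier, read it as text so that it is
# never converted to a number in the first place.
csv <- "id,value\n20110911001084,1\n2100040000000,2"
tem <- read.csv(text = csv, colClasses = c(id = "character"))
tem$id
```

Reading identifiers as character also avoids any loss of precision for very long digit strings, and as.factor() then keeps them exactly as they appear in the file.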
Re: [R] DAtes
Well, I believe writing the correct date format would have served the purpose. Suppose tfr contains Date as a column whose class is factor. Then

    tfr$Date <- as.Date(as.character(tfr$Date), "%d/%m/%Y")

should give you the desired output.

--
View this message in context: http://r.789695.n4.nabble.com/DAtes-tp4639172p4639366.html
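A self-contained sketch of that conversion, assuming the dates are stored as day/month/four-digit-year strings (the thread does not show the actual data, so the values here are made up). It also shows that a format string which does not match the data yields NA rather than an error, which is a quick way to spot a wrong format.

```r
# Made-up data frame with dates stored as a factor.
tfr <- data.frame(Date = factor(c("03/08/2012", "25/12/2011")))

# Convert factor -> character -> Date using a matching format.
tfr$Date <- as.Date(as.character(tfr$Date), format = "%d/%m/%Y")
tfr$Date
class(tfr$Date)

# A format that does not match the input gives NA.
bad <- as.Date("2012-08-03", format = "%d/%m/%Y")
is.na(bad)
```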
Re: [R] Plotting Where People Live on a U.S. Map
Dan,

Google Refine (http://goo.gl/AeKml) can actually transform zip codes into longitude/latitude. http://goo.gl/1HDWb will show you how to do this from street addresses, but it should also work from city names -- I think it will allocate a default long/lat for a city, but I am not sure of the exact mechanism.

On Fri, Aug 3, 2012 at 1:10 PM, Lopez, Dan lopez...@llnl.gov wrote:

Thank you!
Dan

From: Sarah Goslee [mailto:sarah.gos...@gmail.com]
Sent: Thursday, August 02, 2012 5:51 PM
To: Lopez, Dan
Cc: R help (r-help@r-project.org)
Subject: Re: [R] Plotting Where People Live on a U.S. Map

Hi Dan,

For question 1, yes, you'll need geographic coordinates. I think it's possible to get a shapefile of zip codes, but maybe someone else will know the details.

For #2, you probably want maps instead of map, and you need to load a package before you can use it:

    install.packages("maps")
    library(maps)

and then your code.

Sarah

--
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org

On Thursday, August 2, 2012, Lopez, Dan wrote:

Hi,

QUESTION TOPIC #1
I have some data I want to plot on a map. But what I have are home addresses: street, city, state, complete postal code, i.e. 95377-1234. Is there a way to plot this data, or do I need latitude and longitude coordinates? If so, how do I convert them? Is there a package that will do the conversion in R?

QUESTION TOPIC #2
I was trying to experiment with this code that I found at the site below, but got a message indicating that the map function is not found. So I tried installing the package but got the message below. Is there an alternative way of doing this (please refer to the URL below)?

# The message I got:

    install.packages("map")
    Warning message:
    package 'map' is not available (for R version 2.15.0)

# The code I tried to run:

    states <- data.frame(map("state", plot=FALSE)[c("x","y")])
    colnames(states) <- c("Lon","Lat")
    ggplot(states, aes(x=Lon, y=Lat)) + geom_path() + geom_point(alpha=0.6,size=0.3,data=subway)

# Where I got the code from, and also an image of what I am attempting to do (please paste this into your browser):
http://www.google.com/imgres?um=1hl=enbiw=1790bih=845tbm=ischtbnid=4rMjXYA_w1qDiM:imgrefurl=http://www.informaniac.net/docid=SJqcsPghztrj0Mimgurl=http://lh5.ggpht.com/_yBbodrC25kU/Ta6Ifqr0ZLI/AAABRCg/98rIF-kMMns/map%25255B7%25255D.pngw=512h=319ei=mgsbUIzqJuKbiAL5v4DQDgzoom=1iact=hcvpx=176vpy=477dur=5741hovh=177hovw=285tx=110ty=113sig=117496213270544868088page=2tbnh=125tbnw=200start=32ndsp=40ved=1t:429,r:0,s:32,i:175

Dan
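Once addresses have been converted to coordinates, plotting them over state outlines could look roughly like the sketch below. It uses base graphics with the maps package that Sarah suggests, not the ggplot code from the linked site. The people data frame and its approximate coordinates are hypothetical and stand in for geocoded addresses.

```r
# Hypothetical, approximate coordinates standing in for geocoded addresses
people <- data.frame(
  city = c("Tracy, CA", "Olympia, WA", "Ithaca, NY"),
  lon  = c(-121.43, -122.90, -76.50),
  lat  = c(37.74, 47.04, 42.44)
)

# Draw only if the maps package is installed
# (install.packages("maps") otherwise)
if (requireNamespace("maps", quietly = TRUE)) {
  library(maps)
  map("state")                                  # outlines of the lower 48 states
  points(people$lon, people$lat,                # longitude on x, latitude on y
         pch = 19, col = "red", cex = 0.8)
}
```

Note that longitude goes on the x axis and latitude on the y axis; swapping them is a common reason for points not appearing on the map.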
[R] shading line plot
Hi everyone,

I have a time series data set, and I want to fill my line plot of this time series with different colors. For example, I want to fill the portion for 1995-1996 with blue, the portion for 1996-1997 with orange, and the portion for 1997-1998 with red. Can anyone please help me?
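One possible reading of the request is to shade the area under the line in a different color for each year. Under that assumption, a minimal base-graphics sketch with polygon() follows. The series is simulated, and the variable names (dates, y, breaks, cols) are illustrative only.

```r
set.seed(1)

# Simulated monthly series from January 1995 to January 1998
dates <- seq(as.Date("1995-01-01"), as.Date("1998-01-01"), by = "month")
y <- 10 + cumsum(rnorm(length(dates)))

# Period boundaries and one fill color per period
breaks <- as.Date(c("1995-01-01", "1996-01-01", "1997-01-01", "1998-01-01"))
cols <- c("blue", "orange", "red")
base <- min(y)                       # bottom edge of the shaded area

# Set up empty axes, then shade each period and draw the line on top
plot(dates, y, type = "n", xlab = "Date", ylab = "Value")
for (k in seq_along(cols)) {
  # Include both boundary dates so adjacent segments meet without a gap
  idx <- which(dates >= breaks[k] & dates <= breaks[k + 1])
  xd <- as.numeric(dates[idx])
  polygon(c(xd, rev(xd)),
          c(y[idx], rep(base, length(idx))),
          col = cols[k], border = NA)
}
lines(dates, y, lwd = 2)
```

If the goal is instead to color the line itself by period, the same loop can call lines(dates[idx], y[idx], col = cols[k]) in place of polygon().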