Re: [R] memory allocation problem
Following on from my memory allocation problem... I tried to run my code on our university HPC facility, requesting 61 GB of memory, and it still cannot allocate a vector of 5 MB:

> load('/home/uqlcatta/test_scripts/.RData')
> myfun <- function(Range, H1, H2, p, coeff)
+ {
+   -(coeff[1] + coeff[2]*H1 + coeff[3]*H2 + coeff[4]*p) *
+     exp(-(coeff[5] + coeff[6]*H1 + coeff[7]*H2 + coeff[8]*p)*Range) +
+     coeff[9] + coeff[10]*H1 + coeff[11]*H2 + coeff[12]*p
+ }
> SS <- function(coeff, steps, Range, H1, H2, p)
+ {
+   sum((steps - myfun(Range, H1, H2, p, coeff))^2)
+ }
> coeff <- c(1,1,1,1,1,1,1,1,1,1,1,1)
> est_coeff <- optim(coeff, SS, steps=org_results$no.steps,
+                    Range=org_results$Range, H1=org_results$H1,
+                    H2=org_results$H2, p=org_results$p)
Error: cannot allocate vector of size 5.0 Mb
Execution halted

Could it be a problem with the function? Any input is very much appreciated.

Lorenzo

-----Original Message-----
From: Lorenzo Cattarino
Sent: Wednesday, 3 November 2010 2:22 PM
To: 'David Winsemius'; 'Peter Langfelder'
Cc: r-help@r-project.org
Subject: RE: [R] memory allocation problem

Thanks for all your suggestions. This is what I get after removing all the other (not useful) objects and running my code:

> getsizes()
                [,1]
org_results 47240832
myfun          11672
getsizes        4176
SS              3248
coeff            168
<NA>              NA
<NA>              NA
<NA>              NA
<NA>              NA
<NA>              NA

> est_coeff <- optim(coeff, SS, steps=org_results$no.steps,
+                    Range=org_results$Range, H1=org_results$H1,
+                    H2=org_results$H2, p=org_results$p)
Error: cannot allocate vector of size 5.0 Mb
In addition: Warning messages:
1: In optim(coeff, SS, steps = org_results$no.steps, Range = org_results$Range, :
  Reached total allocation of 4055Mb: see help(memory.size)
2: In optim(coeff, SS, steps = org_results$no.steps, Range = org_results$Range, :
  Reached total allocation of 4055Mb: see help(memory.size)
3: In optim(coeff, SS, steps = org_results$no.steps, Range = org_results$Range, :
  Reached total allocation of 4055Mb: see help(memory.size)
4: In optim(coeff, SS, steps = org_results$no.steps, Range = org_results$Range, :
  Reached total allocation of 4055Mb: see help(memory.size)

It seems that R is using all the default available memory (4 GB, which is the RAM of my machine).

> memory.limit()
[1] 4055
> memory.size()
[1] 4049.07

My data frame has a size of 47240832 bytes, or about 45 Mb, so it should not be a problem in terms of memory usage? I do not understand what is going on. Thanks for your help anyway.

Lorenzo

-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Wednesday, 3 November 2010 12:48 PM
To: Lorenzo Cattarino
Cc: r-help@r-project.org
Subject: Re: [R] memory allocation problem

Restart your computer. (Yeah, I know that's what the help-desk always says.) Start R before doing anything else. Then run your code in a clean session. Check ls() after start-up to make sure you don't have a bunch of useless stuff in your .RData file. Don't load anything that is not germane to this problem. Use this function to see what sort of space issues you might have after loading objects:

getsizes <- function() {
  z <- sapply(ls(envir=globalenv()), function(x) object.size(get(x)))
  (tmp <- as.matrix(rev(sort(z))[1:10]))
}

Then run your code.

-- David.

On Nov 2, 2010, at 10:13 PM, Lorenzo Cattarino wrote:

I would also like to include details on my R version:

> version
               _
platform       x86_64-pc-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status
major          2
minor          11.1
year           2010
month          05
day            31
svn rev        52157
language       R
version.string R version 2.11.1 (2010-05-31)

From FAQ 2.9
(http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021)
it says that: "For a 64-bit build, the default is the amount of RAM". So in my case the amount of RAM would be 4 GB. R should be able to allocate a vector of size 5 Mb without me typing any command (either as memory.limit() or an appended string in the target path), is that right?

From: Lorenzo Cattarino
Sent: Wednesday, 3 November 2010 10:55 AM
To: 'r-help@r-project.org'
Subject: memory allocation problem

I forgot to mention that I am using Windows 7 (64-bit) and R version 2.11.1 (64-bit).

From: Lorenzo Cattarino

I am trying to run a non-linear parameter optimization using the function optim() and I have problems regarding memory allocation. My data are in a data frame with 9 columns and 656100 rows.

> head(org_results)
  comb.id   p H1 H2 Range Rep no.steps dist aver.hab.amount
1 1 0.1 0 0 11000 0.2528321
Re: [R] memory allocation problem
> head(org_results)
  comb.id   p H1 H2 Range Rep no.steps dist aver.hab.amount
1 1 0.1 0 0 11000 0.2528321 0.1393901
2 1 0.1 0 0 11000 0.4605934 0.1011841
3 1 0.1 0 0 11004 3.4273670 0.1052789
4 1 0.1 0 0 11004 2.8766364 0.1022138
5 1 0.1 0 0 11000 0.3496872 0.1041056
6 1 0.1 0 0 11000 0.1050840 0.3572036

> est_coeff <- optim(coeff, SS, steps=org_results$no.steps,
+                    Range=org_results$Range, H1=org_results$H1,
+                    H2=org_results$H2, p=org_results$p)
Error: cannot allocate vector of size 5.0 Mb
In addition: Warning messages:
1: In optim(coeff, SS, steps = org_results$no.steps, Range = org_results$Range, :
  Reached total allocation of 4055Mb: see help(memory.size)
2: In optim(coeff, SS, steps = org_results$no.steps, Range = org_results$Range, :
  Reached total allocation of 4055Mb: see help(memory.size)
3: In optim(coeff, SS, steps = org_results$no.steps, Range = org_results$Range,
Re: [R] R script on linux?
I'll try this thanx... -- View this message in context: http://r.789695.n4.nabble.com/R-script-on-linux-tp3023650p3024922.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Data transformation
Dear Group,

I need to do the following transformation. I have the dataset

structure(list(Date = structure(1L, .Label = "2010-06-16", class = "factor"),
    ACC.returns1Day = -0.018524832, ACC.returns5Day = 0.000863931,
    ACC.returns7Day = -0.019795222, BCC.returns1Day = -0.009861859,
    BCC.returns5Day = 0.000850706, BCC.returns7Day = -0.014695715),
    .Names = c("Date", "ACC.returns1Day", "ACC.returns5Day",
    "ACC.returns7Day", "BCC.returns1Day", "BCC.returns5Day",
    "BCC.returns7Day"), class = "data.frame", row.names = c(NA, -1L))

I can split the names using:

retNames <- strsplit(names(returns), "\\.returns")

Assuming that the frame has only one row, how do I transform this into

         1Day    5Day    7Day
ACC   -0.0185  0.0009 -0.0198
BCC   -0.0099  0.0009 -0.0147

If I have more than one unique date, is there some nice structure that I could put this into, where I have the date as the parent and a sub-structure that gives the data as above for each unique date? I can always do this with for-loops, but I think there are easier ways to achieve this.

Thanks,
S
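One way to do the reshape, sketched under the assumption that the column names always follow the `<ticker>.returns<horizon>` pattern shown above (the tapply approach is a suggestion, not from the thread):

```r
# Reshape the one-row frame into a tickers-by-horizons matrix.
# Assumes column names follow the "<ticker>.returns<horizon>" pattern.
returns <- structure(list(Date = structure(1L, .Label = "2010-06-16",
                                           class = "factor"),
                          ACC.returns1Day = -0.018524832,
                          ACC.returns5Day = 0.000863931,
                          ACC.returns7Day = -0.019795222,
                          BCC.returns1Day = -0.009861859,
                          BCC.returns5Day = 0.000850706,
                          BCC.returns7Day = -0.014695715),
                     class = "data.frame", row.names = c(NA, -1L))

vals  <- returns[, names(returns) != "Date"]   # drop the Date column
parts <- strsplit(names(vals), "\\.returns")   # e.g. "ACC" and "1Day"
ticker  <- sapply(parts, `[`, 1)
horizon <- sapply(parts, `[`, 2)

# One value per (ticker, horizon) cell, so mean() just passes it through
out <- tapply(unlist(vals), list(ticker, horizon), mean)
round(out, 4)
```

For several dates, one could keep a long data frame and `split()` it by Date, giving a list with one such matrix per date.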
Re: [R] non-numeric argument to binary operator error while reading ncdf file
Thank you everybody for the help! The solution to my problem is here: http://climateaudit.org/2009/10/10/unthreaded-23/

The mv variable is the designated NA for the variable, and it appears that somebody screwed that up in the file. This workaround worked for me: print out the function get.var.ncdf by typing exactly that in the console. Copy the result to a new script window. Redefine the function as getx.var.ncdf. Find the two places in the function where it says:

    mv <- nc$var[[nc$varid2Rindex[varid]]]$missval

Replace each with:

    mv = -1.00e+30

Run the new function. Then

    data1 <- getx.var.ncdf(nc, v1)

will retrieve the data. IT WORKS! ;) Thanks a lot!

Charles

On Wed, Oct 27, 2010 at 4:25 PM, Charles Novaes de Santana charles.sant...@imedea.uib-csic.es wrote:

Hi,

Well, I did it, but all my script was in the first message. I don't have any other variables. I am just reading a NetCDF file and trying to read the variable tasmax, which holds temperature values. The only new information I have is the header of the NetCDF file (Spain02D_tasmax.nc), which I obtained by running the ncdump command:

netcdf Spain02D_tasmax {
dimensions:
        time = 21275 ;
        lat = 40 ;
        lon = 68 ;
variables:
        double time(time) ;
                time:long_name = "Time variable" ;
                time:units = "days since 1950-01-01 00:00:00" ;
        double lat(lat) ;
                lat:standard_name = "latitude" ;
                lat:long_name = "latitude" ;
                lat:units = "degrees north" ;
        double lon(lon) ;
                lon:standard_name = "longitude" ;
                lon:long_name = "longitude" ;
                lon:units = "degrees east" ;
        double tasmax(time, lat, lon) ;
                tasmax:long_name = "Daily maximum temperature" ;
                tasmax:units = "degrees Celsius" ;
                tasmax:missing_value = -.0f ;

// global attributes:
                :Info = "Data generated for the esTcena project (http://www.meteo.unican.es/projects/esTcena)" ;
                :Institution = "IFCA-UC" ;
                :Conventions = "CF-1.0" ;
                :conventionsURL = "http://www.cgd.ucar.edu/cms/eaton/cf-metadata/index.html" ;
                :creation_date = "20-Sep-2010 09:34:26" ;
data:
(...)

tasmax = NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
  NaN, NaN, NaN, NaN, 14.0840393066406, 14.4718475341797, NaN, NaN, NaN, NaN,
  NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
  NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
  NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, (...)

As we can see, tasmax has a lot of NaN (not a number) values, and the missing_value of the file is -.0f. As the following lines show, the error occurs just when the function get.var.ncdf is trying to set the missing values to NA.

> library(ncdf)
> file <- open.ncdf("Spain02D_tasmax.nc")
> temp <- get.var.ncdf(file, "tasmax", verbose=TRUE)
[1] get.var.ncdf: entering. Here is varid:
[1] tasmax
[1] checking to see if passed varid is actually a dimvar
[1] entering vobjtodimname with varid= tasmax
[1] vobjtodimname: is a character type varid. This file has 3 dims
[1] vobjtodimname: no cases found, returning FALSE
[1] get.var.ncdf: isdimvar: FALSE
[1] vobjtovarid: entering with varid=tasmax
[1] Variable named tasmax found in file with varid= 4
[1] vobjtovarid: returning with varid deduced from name; varid= 4
[1] get.var.ncdf: ending up using varid= 4
[1] ndims: 3
[1] get.var.ncdf: varsize:
[1]    68    40 21275
[1] get.var.ncdf: start:
[1] 1 1 1
[1] get.var.ncdf: count:
[1]    68    40 21275
[1] get.var.ncdf: totvarsize: 57868000
[1] Getting var of type 4 (1=short, 2=int, 3=float, 4=double, 5=char, 6=byte)
[1] get.var.ncdf: C call returned 0
[1] count.nodegen: 68  Length of data: 57868000
[2] count.nodegen: 40  Length of data: 57868000
[3] count.nodegen: 21275  Length of data: 57868000
[1] get.var.ncdf: final dims of returned array:
[1]    68    40 21275
[1] varid: 4
[1] nc$varid2Rindex: 0 nc$varid2Rindex: 0 nc$varid2Rindex: 0
[4] nc$varid2Rindex: 1
[1] nc$varid2Rindex[varid]: 1
[1] get.var.ncdf: setting missing values to NA
Error en mv * 1e-05 : non-numeric argument to binary operator

I think that the error is related to this procedure:
to set missing values to NA. But I am not sure; as I told you, I am a newbie and I have not seen an error like this before (even in discussion lists on the web). If any of you who have worked with NetCDF files in R could help me, or if you know any other packages or commands to read these values, I would be really grateful. Thank you very much for your attention! Sorry about my poor English.

Charles

On Wed, Oct 27, 2010 at 3:46 PM, jim holtman jholt...@gmail.com wrote:

put: options(error=utils::recover) in your script so that when an error
Re: [R] Colour filling in panel.bwplot from lattice
On Wed, Nov 3, 2010 at 4:11 AM, Dennis Murphy djmu...@gmail.com wrote:

Hi:

I don't know why, but it seems that in

bwplot(voice.part ~ height, data = singer,
       main = "NOT THE RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red' 'pink' 'violet' 'brown' 'gold'",
       fill = c("yellow", "blue", "green", "red", "pink", "violet", "brown", "gold"))

the assignment of colors is offset by 3:

Levels: Bass 2 Bass 1 Tenor 2 Tenor 1 Alto 2 Alto 1 Soprano 2 Soprano 1

fillcol <- c("yellow", "blue", "green", "red", "pink", "violet", "brown", "gold")

In the above plot,
yellow -> Bass 2 (1)
blue   -> Tenor 1 (4)
green  -> Soprano 2 (7)
red    -> Bass 1 (10 mod 8 = 2)
pink   -> Alto 2 (13 mod 8 = 5)
etc. It's certainly curious.

Curious indeed. It turns out that because of the way this was implemented, every 11th color was used, so you end up with the order

sel.cols <- c("yellow", "blue", "green", "red", "pink", "violet", "brown", "gold")
rep(sel.cols, 100)[ seq(1, by = 11, length.out = 8) ]
[1] "yellow" "red"    "brown"  "blue"   "pink"   "gold"   "green"  "violet"

It's easy to fix this so that we get the expected order, and I will do so for the next release.

Having said that, it should be noted that any vectorization behaviour in lattice panel functions is a consequence of implementation and not guaranteed by design (although certainly useful in many situations). In particular, it is risky to depend on vectorization in multipanel plots, because the vectorization starts afresh in each panel for whatever data subset happens to be in that panel, and there may be no relation between the colors and the original data.

One alternative is to use panel.superpose with panel.groups = panel.bwplot:

bwplot(voice.part ~ height, data = singer,
       groups = voice.part,
       panel = panel.superpose,
       panel.groups = panel.bwplot,
       fill = sel.cols)

-Deepayan
Re: [R] One question on heatmap
On 11/03/2010 02:50 AM, Hua wrote:

... When I try to make a heatmap based on this gene expression value table, I found that when I set 'scale' to 'column', the heatmap will always be red. I think this is because there are very large values in the matrix (gene Actb), while most are just very small. Thus, the color will be very ugly. I just wonder how to set the color to make the heatmap look better? I have tried log-transformation on the matrix and it's better now. But I do want to know if you have better ways to set the color span manually and make the heatmap look better without any log-transformation?

Hi Hua,

It is not all that easy, but it can be done. I read in your data as genexp. Notice how I split the data into three ranges, adding the range extremes in color.scale (plotrix) and then removing the extremes:

expcol[genexp$sample2 < 100] <-
  color.scale(c(0, 100, genexp$sample2[genexp$sample2 < 100]),
              c(1, 0.5), c(0, 0.5), 0)[-(1:2)]
expcol[genexp$sample2 >= 100 & genexp$sample2 < 1000] <-
  color.scale(c(100, 1000,
                genexp$sample2[genexp$sample2 >= 100 & genexp$sample2 < 1000]),
              c(0.5, 0.1), c(0.5, 0.9), 0)[-(1:2)]
expcol[genexp$sample2 >= 1000] <-
  color.scale(c(1000, 1, genexp$sample2[genexp$sample2 >= 1000]),
              c(0.1, 0), c(0.9, 1), 0)[-(1:2)]
barplot(rep(1, 27), col = expcol)
color.legend(0, -0.1, 15, -0.05, c(0, 100, 1000, 1),
             rect.col = c("red", "#808000", "#1ae600", "green"))

Jim
[R] save() with 64 bit and 32 bit R
hi, i have been using a 64 bit desktop machine to process a whole lot of data which i have then subsequently used save() to store. i am now wanting to use this data on my laptop machine, which is a 32 bit install. i suppose that i should not be surprised that the 64 bit data files do not open on my 32 bit machine! does anyone have a smart idea as to how these data can be reformatted for 32 bits? unfortunately the data processing that i did on the 64 bit machine took just under 20 days to complete, so i am not very keen to just throw away this data and begin again on the 32 bit machine. sorry, in retrospect this all seems rather idiotic, but i assumed that the data stored by save() would be compatible between 64 bit and 32 bit (there is no warning in the manual). thanks, andrew.
Re: [R] Drawing circles on a chart
On Wed, Nov 3, 2010 at 2:07 AM, Santosh Srinivas santosh.srini...@gmail.com wrote:

Dear Group, Inside each cell there should be a circle (sphere preferable) with radius of mod(data value). The color should be either red or green depending on -ve or +ve and the intensity should be based on the value of the datapoint. Any help on how to go about this?

If you really want a sphere then you should look at the rgl package, which enables the drawing of 3d graphic objects with illumination. However it does it in its own graphics window and you'll not be able to use any of the standard R graphics functions. Otherwise you'll have to find some way of putting a 3d sphere on a 2d R graphics window, or faking it with a shaded circle and some highlights. Yuck.

Also, drawing circles (strictly, a disc) with radius proportional to data value is usually a bad idea since we interpret areas. A circle with twice the radius has four times the area, and so looks four times as big. But the data is only twice as big...

Barry
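Following Barry's point about areas, here is a sketch (my own, with made-up data) of a flat-disc version in base graphics, scaling the radius by sqrt(|value|) so that disc area, which is what the eye reads, stays proportional to the data:

```r
# Discs on a grid: area proportional to |value|, red negative, green positive.
# 'dat' is hypothetical data standing in for the real matrix.
set.seed(1)
dat <- matrix(runif(12, -1, 1), nrow = 3)

xy   <- expand.grid(x = seq_len(ncol(dat)), y = seq_len(nrow(dat)))
vals <- as.vector(t(dat))                    # row-major, to match xy's order

plot(xy$x, xy$y, type = "n", xlab = "", ylab = "", axes = FALSE,
     xlim = c(0.5, ncol(dat) + 0.5), ylim = c(0.5, nrow(dat) + 0.5))
symbols(xy$x, xy$y,
        circles = sqrt(abs(vals)) / 4,       # sqrt() keeps area ~ |value|
        inches = FALSE, add = TRUE,
        bg = ifelse(vals < 0, "red", "green3"))
```

With inches = FALSE the radii are in user coordinates; intensity could additionally be mapped through a colour ramp rather than two fixed colours.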
[R] boxplot of timeseries with different lengths
Hello List,

I have a time series of observations representing the activity of some users in different time periods, like:

> table(obs1)
 user1  user2  user3 user31 user33  user4  user5  user6  user7  user8 user82 user83 user85  user9
     1      1      3      1      1      1      6      1     11      6     11      7

> table(obs2)
 user1  user2  user3 user31 user33  user4  user5  user6  user7  user8 user82 user83 user84 user85 user86 user87  user9
     3      9     29     12     13     14     21     13     13     15     20     21      1     11      9    427

I would like to boxplot them, but since they have different lengths, I don't know how to handle the dataset properly. Is it wise to use different arrays, one for each observation? Or is it better to force the tabled observations to the same length, in order to put them into a data frame?

Thanks in advance for any advice.

Best regards,
Simone Gabbriellini
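For what it's worth, boxplot() accepts a list, so the vectors do not need padding to a common length. A minimal sketch with made-up counts:

```r
# boxplot() takes a list of vectors, so unequal lengths are fine.
# These counts are invented for illustration, not the real data.
obs1 <- c(1, 1, 3, 1, 1, 1, 6, 1, 11, 6, 11, 7)
obs2 <- c(3, 9, 29, 12, 13, 14, 21, 13, 13, 15, 20, 21, 1, 11, 9, 427)

boxplot(list(period1 = obs1, period2 = obs2),
        ylab = "activity per user",
        log = "y")                  # log scale tames the large outlier
```

This avoids both separate arrays and forcing the observations into one data frame.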
[R] install vegan
Dear all,

I am trying to install vegan, but I always get the following error message:

Warning in install.packages(choose.files("", filters = Filters[c("zip",  :
  'lib = "C:/Programme/R/R-2.12.0/library"' is not writable
Error in install.packages(choose.files("", filters = Filters[c("zip",  :
  unable to install packages
utils:::menuInstallLocal()

Does anybody know what is wrong?

Thanks in advance,
Carolin
[R] Recoding -- test whether number begins with a certain number
Dear R community,

I have a question concerning recoding of a variable. I have a data set in which there is a variable devoted to the ISCO code describing the occupation of each individual (http://www.ilo.org/public/english/bureau/stat/isco/isco88/major.htm). Every type of occupation begins with a number, and every number added to this number describes the occupation in more detail.

Now my problem: I want to recode this variable in a way that every value beginning with a certain number is labeled as the respective category. For example, all values of this variable beginning with a 6 should be labeled as "agri". My problem is that I cannot find a test which I can use for that purpose.

I would really appreciate any help on that subject. Thank you.

Best regards,
Marcel Gerds
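A sketch of one such test, using substr() to pull the leading digit (the example codes and the label lookup table are made up here, covering only a few ISCO major groups):

```r
# Classify occupations by the leading digit of the ISCO code.
# Codes and label table are hypothetical; extend to all ISCO major groups.
isco  <- c(6111, 6210, 2310, 9152)
major <- substr(as.character(isco), 1, 1)     # first digit, as character

labels <- c("2" = "professionals", "6" = "agri", "9" = "elementary")
occupation <- unname(labels[major])
occupation                                    # "agri" for the 6xxx codes
```

Alternatively, grepl("^6", isco) gives a logical test for one group at a time.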
Re: [R] save() with 64 bit and 32 bit R
Andrew Collier wrote:

hi, i have been using a 64 bit desktop machine to process a whole lot of data which i have then subsequently used save() to store. i am now wanting to use this data on my laptop machine, which is a 32 bit install. i suppose that i should not be surprised that the 64 bit data files do not open on my 32 bit machine! does anyone have a smart idea as to how these data can be reformatted for 32 bits? unfortunately the data processing that i did on the 64 bit machine took just under 20 days to complete, so i am not very keen to just throw away this data and begin again on the 32 bit machine. sorry, in retrospect this all seems rather idiotic, but i assumed that the data stored by save() would be compatible between 64 bit and 32 bit (there is no warning in the manual).

The data would normally be compatible on all architectures. However, it may need the same version of R (or a newer one), and may need to have the same packages installed in order to read it.

Duncan Murdoch
Re: [R] save() with 64 bit and 32 bit R
On Wed, 3 Nov 2010, Andrew Collier wrote:

hi, i have been using a 64 bit desktop machine to process a whole lot of data which i have then subsequently used save() to store. i am now wanting to use this data on my laptop machine, which is a 32 bit install. i suppose that i should not be surprised that the 64 bit data files do not open on my 32 bit machine! does anyone have a smart idea as to how these data can be reformatted for 32 bits? unfortunately the data processing that i did on the 64 bit machine took just under 20 days to complete, so i am not very keen to just throw away this data and begin again on the 32 bit machine. sorry, in retrospect this all seems rather idiotic, but i assumed that the data stored by save() would be compatible between 64 bit and 32 bit (there is no warning in the manual). thanks, andrew.

It is, and the help says so:

    All R platforms use the XDR (bigendian) representation of C ints and
    doubles in binary save-d files, and these are portable across all R
    platforms. (ASCII saves used to be useful for moving data between
    platforms but are now mainly of historical interest.)

So there is something specific about your save, and you haven't even told us the error message (see the posting guide). One possibility is that you saved references to namespaces, when those packages need to be installed on the machine used to load() the .RData file (but this is fairly unusual). Another is that you simply don't have enough memory on the 32-bit machine, when one remedy is to go back to the 64-bit machine and save individual objects.

-- Brian D.
Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,                    +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax: +44 1865 272595
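Brian's last suggestion can be sketched like this (the object and file names are made up):

```r
# On the 64-bit machine: save objects one at a time instead of one big
# .RData, so the 32-bit session can load() them individually.
# 'big1' and 'big2' are hypothetical stand-ins for the real objects.
big1 <- runif(10)
big2 <- letters

save(big1, file = "big1.RData")
save(big2, file = "big2.RData")

# On the 32-bit machine: load only what fits in memory.
rm(big1, big2)
load("big1.RData")        # restores just 'big1'
```

The save format itself is the portable XDR representation either way; splitting the objects only changes how much must fit in memory at once.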
[R] Tukey's table
Hi,

I'm building Tukey's table using the qtukey function. I can't get the values for Tukey's one degree of freedom, and I also wanted to eliminate the first column. The program is:

Trat <- c(1:30)               # number of treatments
gl <- c(1:30, 40, 60, 120)    # degrees of freedom

tukval <- matrix(0, nr = length(gl), nc = length(Trat))

for (i in 1:length(gl))
  for (j in 1:length(Trat))
    tukval[i, j] <- qtukey(.95, Trat[j], gl[i])

rownames(tukval) <- gl
colnames(tukval) <- paste(Trat, "", sep = "")

tukval

require(xtable)
xtable(tukval)

Any suggestions?

--
Silvano Cesar da Costa
Departamento de Estatística
Universidade Estadual de Londrina
Fone: 3371-4346
Re: [R] Colour filling in panel.bwplot from lattice
Am 03.11.2010 10:23 (UTC+1) schrieb Deepayan Sarkar:

> It's easy to fix this so that we get the expected order, and I will do
> so for the next release.

Thank you for this proposal. We are looking forward to the next release :-) We frequently have to colour selected boxes to be able to compare special cases over different panels.

> Having said that, it should be noted that any vectorization behaviour
> in lattice panel functions is a consequence of implementation and not
> guaranteed by design (although certainly useful in many situations).
> In particular, it is risky to depend on vectorization in multipanel
> plots, because the vectorization starts afresh in each panel for
> whatever data subset happens to be in that panel, and there may be no
> relation between the colors and the original data.

Thank you for the warning.

> One alternative is to use panel.superpose with panel.groups = panel.bwplot:
>
> bwplot(voice.part ~ height, data = singer,
>        groups = voice.part,
>        panel = panel.superpose,
>        panel.groups = panel.bwplot,
>        fill = sel.cols)

This indeed works nicely as a workaround.

Thanks again for this wonderful package,
Rainer
Re: [R] density() function: differences with S-PLUS
Dear Joshua, first of all, thank you very much for your reply. I hoped that someone who's familiar with both S+ and R could reply, because I have spent some hours looking for a solution. In case someone else would like to try: this is the S-PLUS code and output, while below there is the R code. I obtain the same x values, while the y values are different in both examples. Thank you very much. Nicola

### S-PLUS CODE AND OUTPUT ###
density(1:1000, width = 4)
$x:
 [1]  -2.0      18.51020  39.02041  59.53061  80.04082 100.55102 121.06122
 [8] 141.57143 162.08163 182.59184 203.10204 223.61224 244.12245 264.63265
[15] 285.14286 305.65306 326.16327 346.67347 367.18367 387.69388 408.20408
[22] 428.71429 449.22449 469.73469 490.24490 510.75510 531.26531 551.77551
[29] 572.28571 592.79592 613.30612 633.81633 654.32653 674.83673 695.34694
[36] 715.85714 736.36735 756.87755 777.38776 797.89796 818.40816 838.91837
[43] 859.42857 879.93878 900.44898 920.95918 941.46939 961.97959 982.48980
[50] 1003.0
$y:
 [1] 4.565970e-006 1.31e-003 9.999374e-004 1.31e-003 9.999471e-004 1.31e-003
 [7] 9.999560e-004 1.30e-003 9.999643e-004 1.29e-003 9.999718e-004 1.28e-003
[13] 9.999788e-004 1.26e-003 9.999852e-004 1.24e-003 9.10e-004 1.22e-003
[19] 9.63e-004 1.19e-003 1.01e-003 1.16e-003 1.06e-003 1.13e-003
[25] 1.10e-003 1.10e-003 1.13e-003 1.06e-003 1.16e-003 1.01e-003
[31] 1.19e-003 9.63e-004 1.22e-003 9.10e-004 1.24e-003 9.999852e-004
[37] 1.26e-003 9.999788e-004 1.28e-003 9.999718e-004 1.29e-003 9.999643e-004
[43] 1.30e-003 9.999560e-004 1.31e-003 9.999471e-004 1.31e-003 9.999374e-004
[49] 1.31e-003 4.432131e-006

exdata = iris[, 1, 1]
density(exdata, width = 4)
$x:
 [1] 1.30     1.453061 1.606122 1.759184 1.912245 2.065306 2.218367 2.371429 2.524490
[10] 2.677551 2.830612 2.983673 3.136735 3.289796 3.442857 3.595918 3.748980 3.902041
[19] 4.055102 4.208163 4.361224 4.514286 4.667347 4.820408 4.973469 5.126531 5.279592
[28] 5.432653 5.585714 5.738776 5.891837 6.044898 6.197959 6.351020 6.504082 6.657143
[37] 6.810204 6.963265 7.116327 7.269388 7.422449 7.575510 7.728571 7.881633 8.034694
[46] 8.187755 8.340816 8.493878 8.646939 8.80
$y:
 [1] 0.0007849649 0.0013097474 0.0021225491 0.0033616520 0.0052059615 0.0078856717
 [7] 0.0116917555 0.0169685132 0.0241073754 0.0335286785 0.0456521053 0.0608554862
[13] 0.0794235072 0.1014901241 0.1269807991 0.1555625999 0.1866111931 0.2192033788
[19] 0.2521417640 0.2840144993 0.3132881074 0.3384260582 0.3580208688 0.3709241384
[25] 0.3763578665 0.3739920600 0.3639778683 0.3469316232 0.3238721233 0.2961200278
[31] 0.2651731505 0.2325739601 0.1997853985 0.1680884651 0.1385105802 0.1117884914
[37] 0.0883644110 0.0684099972 0.0518702141 0.0385181792 0.0280126487 0.0199513951
[43] 0.0139159044 0.0095050745 0.0063575653 0.0041639082 0.0026680819 0.0016700727
[49] 0.0010169912 0.0005962089

### R CODE ###
# S-PLUS CODE: density(1:1000, width = 4) -- SAME x BUT DIFFERENT y
density(1:1000, bw = 4, window = "g", n = 50, cut = 0.75)$x
density(1:1000, bw = 4, window = "g", n = 50, cut = 0.75)$y
# S-PLUS CODE: exdata = iris[, 1, 1]; density(exdata, width = 4) -- SAME x BUT DIFFERENT y
exdata = iris$Sepal.Length[iris$Species == "setosa"]
density(exdata, bw = 4, n = 50, cut = 0.75)$x
density(exdata, bw = 4, n = 50, cut = 0.75)$y

2010/11/2 Joshua Wiley jwiley.ps...@gmail.com: Dear Nicola, There are undoubtedly people here who are familiar with both S+ and R, but they may not always be around or get to every question. In that case there are (at least) two good options for you: 1) Say what you want mathematically (something of a universal language) or statistically. 2) Rather than just give us S+ code, show sample data (e.g., 1:1000), and the values you would like obtained (in this case whatever the output from S+ was). This would let us *try* to figure out what happened and duplicate it in R.
From the arcane step of reading R's documentation for density (?density):

width: this exists for compatibility with S; if given, and bw is not, will set bw to width if this is a character string, or to a kernel-dependent multiple of width if this is numeric.

Which makes me wonder if this works for you (in R)? density(1:1000, width = 4) Cheers, Josh

On Tue, Nov 2, 2010 at 3:04 AM, Nicola Sturaro Sommacal (Quantide srl) mailingl...@sturaro.net wrote: Hello! Does anyone know what the differences are between R and S-PLUS in the density() function? For example, I would like to replicate this simple S-PLUS code in R, but I don't understand which parameter I should modify to get the same results.

S-PLUS CODE: density(1:1000, width = 4)
R CODE: density(1:1000, bw = 4, window = "g", n = 50, cut = 0.75)

I obtain the same
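Josh's pointer at ?density's compatibility rule can be checked directly. To my understanding (an assumption, not stated in the thread), a numeric width is converted for the default gaussian kernel as bw = width/4, so the two calls below should produce the same estimate:

```r
# S-PLUS compatibility: density() accepts width and converts it to bw.
# For the gaussian kernel the conversion is (assumed) bw = width / 4.
d.width <- density(1:1000, width = 4)
d.bw    <- density(1:1000, bw = 1)   # assumed-equivalent call
isTRUE(all.equal(d.width$y, d.bw$y))
```

If this holds, the S-PLUS `width = 4` call corresponds to `bw = 1` in R, not `bw = 4`, which would explain the differing y values in the thread.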
[R] bad optimization with nnet?
Hi, I am trying to give an example of overfitting with a multi-layer perceptron. I have done the following small example:

library(nnet)
set.seed(1)
x <- matrix(rnorm(20), 10, 2)
z <- matrix(rnorm(10), 10, 1)
rx <- max(x) - min(x)
rz <- max(z) - min(z)
x <- x/rx
z <- z/rz
erreur <- 10^9
for (i in 1:100) {
  temp.mod <- nnet(x = x, y = z, size = 10, rang = 1, maxit = 1000)
  if (temp.mod$value < erreur) {
    res.mod <- temp.mod
    erreur <- res.mod$value
  }
}
cat("\nFinal error: ", res.mod$value, "\n")

Normally it is an easy task for an MLP with 10 hidden units to reduce the final error to almost 0 (although there is nothing to predict). But the smallest error that I get is 0.753895 (a very poor result). Maybe this problem is already known? Maybe the fault is mine, but I don't see where. Joseph Rynkiewicz
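A plausible explanation (my assumption, not confirmed in the thread): nnet() defaults to a logistic output unit, whose predictions are confined to (0, 1), while the range-scaled targets z/rz are not; passing `linout = TRUE` fits a linear output unit instead and lets the training error approach 0. The scaling problem itself can be seen with base R alone:

```r
# Reproduce the poster's target scaling and inspect its range.
set.seed(1)
x <- matrix(rnorm(20), 10, 2)   # drawn first, as in the post
z <- matrix(rnorm(10), 10, 1)
rz <- max(z) - min(z)
z.scaled <- z / rz              # range-scaled targets; width is exactly 1
range(z.scaled)
# Since the width is 1, some targets necessarily lie outside (0, 1),
# out of reach for a logistic output unit:
outside <- any(z.scaled < 0) || any(z.scaled > 1)
outside
```

If this diagnosis is right, replacing the nnet() call with `nnet(x = x, y = z, size = 10, rang = 1, maxit = 1000, linout = TRUE)` should give the near-zero overfit the poster expected.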
Re: [R] Multiple imputation for nominal data
The aregImpute function in the Hmisc package can do this through predictive mean matching and canonical variates (Fisher's optimum scoring algorithm). Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Multiple-imputation-for-nominal-data-tp3024276p3025181.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Recoding -- test whether number begins with a certain number
On Wed, Nov 3, 2010 at 10:01 AM, Marcel Gerds marcel.ge...@gmx.de wrote: Dear R community, I have a question concerning recoding of a variable. I have a data set with a variable holding the ISCO code describing the occupation of an individual (http://www.ilo.org/public/english/bureau/stat/isco/isco88/major.htm). Every type of occupation begins with a number, and every digit added to this number describes the occupation in more detail. Now my problem: I want to recode this variable so that every value beginning with a certain number is labeled as the respective category. For example, all values of this variable beginning with a 6 are labeled as "agri". My problem is that I cannot find a test which I can use for that purpose. I would really appreciate any help on this subject. Thank you.

If it's a numeric variable, convert to character with 'as.character'. Then check the first character with substr(x, 1, 1). Then create a factor and set the levels...

z = as.integer(runif(10, 0, 100))
z
[1] 26 92 47 99  2 98 15 21 58 82
zc = factor(substr(as.character(z), 1, 1))
zc
[1] 2 9 4 9 2 9 1 2 5 8
Levels: 1 2 4 5 8 9
levels(zc) = c("Foo", "Bar", "Baz", "Qux", "Quux", "Quuux")
zc
[1] Bar   Quuux Baz   Quuux Bar   Quuux Foo   Bar   Qux   Quux
Levels: Foo Bar Baz Qux Quux Quuux
data.frame(z = z, zc = zc)
    z    zc
1  26   Bar
2  92 Quuux
3  47   Baz
4  99 Quuux
5   2   Bar
6  98 Quuux
7  15   Foo
8  21   Bar
9  58   Qux
10 82  Quux

Now all the 9-somethings are Quuux, the 2's are Bar, etc. Barry
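A compact variant of Barry's idea uses a named lookup vector instead of relevelling a factor. The codes and labels below are made up for illustration; only "major group 6 = agri" comes from the original question:

```r
isco  <- c(6111, 2310, 6210, 9333, 1120)    # hypothetical ISCO-88 codes
major <- substr(as.character(isco), 1, 1)   # leading digit = major group
lookup <- c("1" = "managers", "2" = "professionals",
            "6" = "agri", "9" = "elementary")  # partial, illustrative labels
occ <- unname(lookup[major])                # label by first digit
data.frame(isco, occ)
```

Codes whose leading digit is missing from `lookup` simply come back as NA, which makes unhandled major groups easy to spot.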
Re: [R] Colour filling in panel.bwplot from lattice
On Wed, Nov 3, 2010 at 4:25 PM, Rainer Hurling rhur...@gwdg.de wrote:

[...]

One alternative is to use panel.superpose with panel.groups = panel.bwplot:

bwplot(voice.part ~ height, data = singer, groups = voice.part,
       panel = panel.superpose, panel.groups = panel.bwplot,
       fill = sel.cols)

This indeed works nicely 'as a workaround'.

Actually, I would reiterate that this is the right solution, and it's the other fix that qualifies as a quick workaround (especially if you are considering comparing things across multiple panels). -Deepayan
Re: [R] Drawing circles on a chart
Thanks Barry ... actually the intention was to have areas of the circle depicting the value (radius imputed) -Original Message- From: b.rowling...@googlemail.com [mailto:b.rowling...@googlemail.com] On Behalf Of Barry Rowlingson Sent: 03 November 2010 15:02 To: Santosh Srinivas Cc: r-help@r-project.org Subject: Re: [R] Drawing circles on a chart On Wed, Nov 3, 2010 at 2:07 AM, Santosh Srinivas santosh.srini...@gmail.com wrote: Dear Group, Inside each cell there should be a circle (sphere preferable) with radius of mod(data value). The color should be either red or green depending on -ve or +ve and the intensity should be based on the value of the datapoint. Any help on how to go about this? If you really want a sphere then you should look at the rgl package, which enables the drawing of 3d graphic objects with illumination. However it does it in its own graphics window and you'll not be able to use any of the standard R graphics functions. Otherwise you'll have to find some way of putting a 3d sphere on a 2d R graphics window, or faking it with a shaded circle and some highlights. Yuck. Also, drawing circles (strictly, a disc) with radius proportional to data value is usually a bad idea since we interpret areas. A circle with twice the radius has four times the area, and so looks four times as big. But the data is only twice as big... Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
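Barry's area point can be honoured with base R's symbols(): scale the radius by the square root of the absolute value so that disc *area*, not radius, is proportional to the data. The grid and values below are invented for illustration:

```r
v  <- c(-2, 1, 4, 3, -1, 0.5, -4, 2, 1)   # made-up cell values
xy <- expand.grid(x = 1:3, y = 1:3)       # 3x3 grid of cell centres
r  <- sqrt(abs(v))                        # area ~ |value|, since area ~ r^2
cols <- ifelse(v < 0, "red", "green")     # sign encoded as colour
symbols(xy$x, xy$y, circles = r, inches = 0.3,  # largest disc = 0.3 in
        bg = cols, fg = "grey40", xlab = "", ylab = "")
```

Varying colour intensity with the magnitude (as the original post asked) could be layered on with something like colorRampPalette(), but the sqrt scaling is the part that fixes the perceptual problem Barry describes.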
Re: [R] Colour filling in panel.bwplot from lattice
Am 03.11.2010 12:52 (UTC+1) schrieb Deepayan Sarkar: On Wed, Nov 3, 2010 at 4:25 PM, Rainer Hurling rhur...@gwdg.de wrote:

[...]

One alternative is to use panel.superpose with panel.groups = panel.bwplot:

bwplot(voice.part ~ height, data = singer, groups = voice.part,
       panel = panel.superpose, panel.groups = panel.bwplot,
       fill = sel.cols)

This indeed works nicely 'as a workaround'.

Actually, I would reiterate that this is the right solution, and it's the other fix that qualifies as a quick workaround (especially if you are considering comparing things across multiple panels).

Yes, this comparing across multiple panels was our intention. Rainer

-Deepayan
Re: [R] install vegan
On 03.11.2010 10:39, Carolin wrote: Dear all, I am trying to install vegan, but I always get the following error message:

Warning in install.packages(choose.files(, filters = Filters[c("zip", : 'lib = "C:/Programme/R/R-2.12.0/library"' is not writable
Error in install.packages(choose.files(, filters = Filters[c("zip", : unable to install packages
utils:::menuInstallLocal()

Does anybody know what is wrong?

Yes: you do not have permission to write to C:/Programme/R/R-2.12.0/library, where you are trying to install the package to. Uwe Ligges

Thanks in advance, Carolin
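The usual fix for Uwe's diagnosis is to install into a user-writable library and put it on R's search path. The directory below is a stand-in (a temp dir, so the snippet is self-contained; on Windows something like "C:/Users/me/R/library" would be typical), and the install.packages() call is left commented out because it needs network access:

```r
mylib <- file.path(tempdir(), "Rlibs")   # stand-in for a per-user library dir
dir.create(mylib, showWarnings = FALSE)
.libPaths(c(mylib, .libPaths()))         # put it first on the library search path
# install.packages("vegan", lib = mylib)
# library(vegan, lib.loc = mylib)
.libPaths()[1]                           # the new library now comes first
```

Running R with administrator rights once, just for the install, is the other common workaround, but a personal library avoids the permission problem permanently.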
Re: [R] memory allocation problem
The optim function is very resource hungry. I have had similar problems in the past when dealing with extremely large datasets. What is perhaps happening is that each 'step' of the optimization algorithm stores some info so that it can compare to the next 'step', and while the original vector may only be a few Mb of data, over many iterations a huge amount of memory is allocated to the optimization steps. Maybe look at the control options under ?optim, particularly stuff like trace, fnscale, ndeps, etc., that may cut down on the amount of data being stored each step as well as the number of steps needed. Good luck!
--
Jonathan P. Daily Technician - USGS Leetown Science Center 11649 Leetown Road Kearneysville WV, 25430 (304) 724-4480 "Is the room still a room when it's empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it." - Jubal Early, Firefly

From: Lorenzo Cattarino l.cattar...@uq.edu.au To: David Winsemius dwinsem...@comcast.net, Peter Langfelder peter.langfel...@gmail.com Cc: r-help@r-project.org Date: 11/03/2010 03:26 AM Subject: Re: [R] memory allocation problem Sent by: r-help-boun...@r-project.org

Thanks for all your suggestions. This is what I get after removing all the other (not useful) objects and running my code:

getsizes()
                [,1]
org_results 47240832
myfun          11672
getsizes        4176
SS              3248
coeff            168
<NA>              NA
<NA>              NA
<NA>              NA
<NA>              NA
<NA>              NA

est_coeff <- optim(coeff, SS, steps = org_results$no.steps, Range = org_results$Range, H1 = org_results$H1, H2 = org_results$H2, p = org_results$p)

Error: cannot allocate vector of size 5.0 Mb
In addition: Warning messages:
1: In optim(coeff, SS, steps = org_results$no.steps, Range = org_results$Range, : Reached total allocation of 4055Mb: see help(memory.size)
2: In optim(coeff, SS, steps = org_results$no.steps, Range = org_results$Range, : Reached total allocation of 4055Mb: see help(memory.size)
3: In optim(coeff, SS, steps = org_results$no.steps, Range = org_results$Range, : Reached total allocation of 4055Mb: see help(memory.size)
4: In optim(coeff, SS, steps = org_results$no.steps, Range = org_results$Range, : Reached total allocation of 4055Mb: see help(memory.size)

It seems that R is using all the default available memory (4 GB, which is the RAM of my machine).

memory.limit()
[1] 4055
memory.size()
[1] 4049.07

My dataframe has a size of 47240832 bytes, or about 45 Mb. So it should not be a problem in terms of memory usage? I do not understand what is going on. Thanks for your help anyway. Lorenzo

-Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Wednesday, 3 November 2010 12:48 PM To: Lorenzo Cattarino Cc: r-help@r-project.org Subject: Re: [R] memory allocation problem

Restart your computer. (Yeah, I know that's what the help-desk always says.) Start R before doing anything else. Then run your code in a clean session. Check ls() after startup to make sure you don't have a bunch of useless stuff in your .Rdata file. Don't load anything that is not germane to this problem. Use this function to see what sort of space issues you might have after loading objects:

getsizes <- function() {
  z <- sapply(ls(envir = globalenv()), function(x) object.size(get(x)))
  (tmp <- as.matrix(rev(sort(z))[1:10]))
}

Then run your code. -- David.

On Nov 2, 2010, at 10:13 PM, Lorenzo Cattarino wrote: I would also like to include details on my R version:

version
               _
platform       x86_64-pc-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status
major          2
minor          11.1
year           2010
month          05
day            31
svn rev        52157
language       R
version.string R version 2.11.1 (2010-05-31)

From FAQ 2.9 (http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021) it says that: For a 64-bit build, the default is the amount of RAM. So in my case the amount of RAM would be 4 GB. R should be able to allocate a vector of size 5 Mb without me typing any command (either as memory.limit() or an appended string in the target path), is that right?

From: Lorenzo Cattarino Sent: Wednesday, 3 November 2010 10:55 AM To: 'r-help@r-project.org' Subject: memory allocation problem

I forgot to mention that I am using Windows 7 (64-bit) and R version 2.11.1 (64-bit).

From: Lorenzo Cattarino
I am trying to run a nonlinear parameter optimization using the function optim() and I have problems regarding memory allocation. My data are in a dataframe with 9 columns. There are 656100 rows.
head(org_results) comb.id p
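Jonathan's pointer at ?optim's control list can be sketched on a toy least-squares problem. The data and settings below are invented; they are not the poster's model, just an illustration of where the knobs go:

```r
# Toy objective: least squares for y = b1 + b2*x on simulated data.
set.seed(42)
x <- runif(1000)
y <- 2 * x + rnorm(1000, sd = 0.1)
ss <- function(b) sum((y - b[1] - b[2] * x)^2)

fit <- optim(c(0, 0), ss, method = "BFGS",
             control = list(maxit  = 100,    # cap the iteration count
                            reltol = 1e-8))  # convergence tolerance
round(fit$par, 2)   # should be close to c(0, 2)
```

Capping maxit (or loosening reltol, or for L-BFGS-B the factr/pgtol settings) limits how long the optimizer runs, which in turn limits how much intermediate allocation piles up per call.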
[R] optim works on command-line but not inside a function
Dear all, I am trying to optimize a logistic function using optim, inside the following functions:

#Estimating a and b from thetas and outcomes by ML
IRT.estimate.abFromThetaX <- function(t, X, inits, lw = c(-Inf, -Inf), up = rep(Inf, 2)) {
  optRes <- optim(inits, method = "L-BFGS-B", fn = IRT.llZetaLambdaCorrNan,
                  gr = IRT.gradZL, lower = lw, upper = up, t = t, X = X)
  c(optRes$par[2], -(optRes$par[1]/optRes$par[2]))
}

#Estimating a and b from thetas and outcomes by ML, avoiding 0*log(0)
IRT.estimate.abFromThetaX2 <- function(tar, Xes, inits, lw = c(-Inf, -Inf), up = rep(Inf, 2)) {
  optRes <- optim(inits, method = "L-BFGS-B", fn = IRT.llZetaLambdaCorrNan,
                  gr = IRT.gradZL, lower = lw, upper = up, t = tar, X = Xes)
  c(optRes$par[2], -(optRes$par[1]/optRes$par[2]))
}

The problem is that this does not work:

IRT.estimate.abFromThetaX(sx, st, c(0,0))
Error in optim(inits, method = "L-BFGS-B", fn = IRT.llZetaLambdaCorrNan, : L-BFGS-B needs finite values of 'fn'

But if I try the same optim call on the command line, with the same data, it works fine:

optRes <- optim(c(0,0), method = "L-BFGS-B", fn = IRT.llZetaLambdaCorrNan,
+               gr = IRT.gradZL,
+               lower = c(-Inf, -Inf), upper = c(Inf, Inf), t = st, X = sx)
optRes
$par
[1] -0.6975157  0.7944972
$convergence
[1] 0
$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"

Does anyone have an idea what this could be, and what I could try to avoid this error? I tried bounding the parameters, with lower = c(-10, -10) and upper = ..., but that made no difference. Thanks, Diederik Roijers, Utrecht University MSc student.
--
PS: the other functions I am using are:

#IRT.p is the function that represents the probability
#of a positive outcome of an item with difficulty b,
#discriminativity a, in combination with a student with
#competence theta.
IRT.p <- function(theta, a, b) {
  epow <- exp(-a*(theta-b))
  result <- 1/(1+epow)
  result
}

# = IRT.p^-1 ; for usage in the loglikelihood
IRT.oneOverP <- function(theta, a, b) {
  epow <- exp(-a*(theta-b))
  result <- (1+epow)
  result
}

# = (1-IRT.p)^-1 ; for usage in the loglikelihood
IRT.oneOverPneg <- function(theta, a, b) {
  epow <- exp(a*(theta-b))
  result <- (1+epow)
  result
}

#simulation-based sample generation of thetas and outcomes
#based on a given a and b. (See IRT.p) The sample size is n
IRT.generateSample <- function(a, b, n) {
  x <- rnorm(n, mean = b, sd = b/2)
  t <- IRT.p(x, a, b)
  ch <- runif(length(t))
  t[t >= ch] <- 1
  t[t < ch] <- 0
  cbind(x, t)
}

#This loglikelihood function is based on the a and b parameters,
#and requires thetas as input in X, and outcomes in t
#prone to give NaN errors due to 0*log(0)
IRT.logLikelihood2 <- function(params, t, X) {
  pos <- sum(t * log(IRT.p(X, params[1], params[2])))
  neg <- sum((1-t) * log(1 - IRT.p(X, params[1], params[2])))
  -pos-neg
}

#Avoiding NaN problems due to 0*log(0)
#otherwise equivalent to IRT.logLikelihood2
IRT.logLikelihood2CorrNan <- function(params, t, X) {
  pos <- sum(t * log(IRT.oneOverP(X, params[1], params[2])))
  neg <- sum((1-t) * log(IRT.oneOverPneg(X, params[1], params[2])))
  -pos-neg
}

#IRT.p can also be expressed in terms of z and l
#where z=-ab and l=a - makes it a standard logit function
IRT.pZL <- function(theta, z, l) {
  epow <- exp(-(z+l*theta))
  result <- 1/(1+epow)
  result
}

#as IRT.oneOverP but now for IRT.pZL
IRT.pZLepos <- function(theta, z, l) {
  epow <- exp(-(z+l*theta))
  result <- (1+epow)
  result
}

#as IRT.oneOverPneg but now for IRT.pZL
IRT.pZLeneg <- function(theta, z, l) {
  epow <- exp(z+l*theta)
  result <- (1+epow)
  result
}

#The loglikelihood of IRT, but now expressed in terms of z and l
IRT.llZetaLambda <- function(params, t, X) {
  pos <- sum(t * log(IRT.pZL(X, params[1], params[2])))
  neg <- sum((1-t) * log(1 - IRT.pZL(X, params[1], params[2])))
  -pos-neg
}

#Same as IRT.logLikelihood2CorrNan but for IRT.llZetaLambda
IRT.llZetaLambdaCorrNan <- function(params, t, X) {
  pos <- sum(t * log(IRT.pZLepos(X, params[1], params[2])))
  neg <- sum((1-t) * log(IRT.pZLeneg(X, params[1], params[2])))
  pos+neg
}

#Gradient of IRT.llZetaLambda
IRT.gradZL <- function(params, t, X) {
  res <- numeric(length(params))
  res[1] <- sum(t - IRT.pZL(X, params[1], params[2]))
  res[2] <- sum(X*(t - IRT.pZL(X, params[1], params[2])))
  -res
}

#And to create the sample:
s <- IRT.generateSample(0.8, 1, 50)
sx <- s[,1]
st <- s[,2]
IRT.estimate.abFromThetaX(sx, st, c(0,0))

-- View this message in context: http://r.789695.n4.nabble.com/optim-works-on-command-line-but-not-inside-a-function-tp3025414p3025414.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] optim works on command-line but not inside a function
As the error message says, the values of your function must be finite in order to run the algorithm. Some part of your loop is passing arguments (inits maybe... you only tried (0,0) in the CLI example) that cause IRT.llZetaLambdaCorrNan to be infinite.
--
Jonathan P. Daily Technician - USGS Leetown Science Center 11649 Leetown Road Kearneysville WV, 25430 (304) 724-4480 "Is the room still a room when it's empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it." - Jubal Early, Firefly

From: Damokun dmroi...@students.cs.uu.nl To: r-help@r-project.org Date: 11/03/2010 10:19 AM Subject: [R] optim works on command-line but not inside a function Sent by: r-help-boun...@r-project.org

Dear all, I am trying to optimize a logistic function using optim, inside the following functions: [...]
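One detail worth checking in the failing call: the wrapper's signature is function(t, X, inits), but it is invoked as IRT.estimate.abFromThetaX(sx, st, c(0,0)), i.e. with t = sx and X = st, the opposite of the command-line call that works (t = st, X = sx). Independently of that, evaluating the objective at the start values, with exactly the arguments the wrapper would pass, localizes "needs finite values of 'fn'" errors quickly. A generic sketch (the logistic negative log-likelihood below is a stand-in, not the poster's IRT functions):

```r
# Stand-in objective: negative log-likelihood of a logistic model.
neg.loglik <- function(par, t, X)
  -sum(dbinom(t, 1, plogis(par[1] + par[2] * X), log = TRUE))

safe.optim <- function(inits, t, X) {
  v <- neg.loglik(inits, t = t, X = X)   # same args optim() would pass
  if (!is.finite(v))
    stop("objective is not finite at the start values: ", v)
  optim(inits, neg.loglik, method = "L-BFGS-B", t = t, X = X)
}

set.seed(1)
X <- rnorm(50)
t <- rbinom(50, 1, plogis(0.5 * X))      # simulated 0/1 outcomes
res <- safe.optim(c(0, 0), t, X)
res$convergence                          # 0 indicates successful convergence
```

With swapped arguments (continuous values where 0/1 outcomes are expected), such a pre-check fails immediately with an informative message instead of dying inside L-BFGS-B.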
[R] Granger causality with panel data (econometrics question)
Hi folks, I am trying to perform a Granger causality analysis with panel data. There are some packages around for panel data analysis and Granger causality. However, I have found neither a package for both panel data and Granger causality nor any R procedures (homogenous/heterogenous causality hypotheses, related tests such as Wald, unit root tests etc.). Of course, someone must have encountered this problem before me. Can anyone suggest a solution to this case? Thanks in advance. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
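Absent a packaged panel-Granger routine, the per-unit building block can be written in base R; extending it to a panel then means looping it over the cross-sectional units (or pooling with unit dummies). The bivariate data below are simulated for illustration, and this is a sketch of the idea, not a substitute for the homogeneity and unit-root tests the poster mentions:

```r
# One-lag Granger test: does adding lagged x improve a regression of
# y on its own lag?  Restricted vs unrestricted model, compared by F test.
granger.p <- function(y, x) {
  n  <- length(y)
  yt <- y[2:n]; yl <- y[1:(n - 1)]; xl <- x[1:(n - 1)]
  anova(lm(yt ~ yl), lm(yt ~ yl + xl))[2, "Pr(>F)"]
}

set.seed(7)
x <- rnorm(200)
y <- c(0, 0.8 * x[-200]) + rnorm(200, sd = 0.3)  # x Granger-causes y here
granger.p(y, x)   # small p-value, by construction
```

For an actual panel, one would run granger.p within each unit (e.g. via by() or split()) and then combine or compare the unit-level results, which is where the homogeneous-vs-heterogeneous causality hypotheses come in.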
[R] dll problem with C++ function
Dear fellow R-users, I have the problem of being unable to repeatedly use a C++ function within R unless I dyn.unload/dyn.load it after each .C call. The C++ code (too large to attach) compiles without problem using R CMD SHLIB. It loads (using dyn.load("myfun.so")) and executes (via .C("myfun", ...)) properly. The function returns no object; it only reads files from disk, performs calculations and later writes a file to disk. When I now use the same line of code again to re-run the analysis (again via .C), I get an error message claiming a malformed input file. This seemingly malformed input file is absolutely correct. When I now use dyn.unload("myfun.so") and then again dyn.load("myfun.so"), I can use it as before. I have absolutely no clue what is going on here. The C++ function returns a 1 if run correctly and 0 otherwise. The stand-alone version works fine. My feeling is that R cannot deallocate the memory or somehow doesn't grasp that the dll should be freed after running. My impression is there is a very simple reason, but I couldn't find it (in Writing R Extensions or in any of the R help lists, including R-sig-mac). ANY hint greatly appreciated! Cheers, Carsten

For what it's worth, here are my system details:

R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale: [1] de_DE.UTF-8/de_DE.UTF-8/C/C/de_DE.UTF-8/de_DE.UTF-8
attached base packages: [1] stats graphics grDevices utils datasets methods base

The C++ code starts like this:

#include <R.h>
#include <stdio.h>
#include <iostream>
#include <fstream>
#include <string>
#include <stdlib.h>
#include <time.h>
#include "myhelperfunctions.h"
using namespace std;
extern "C" {
  ... various long C++ functions without any change for inclusion into R (apart from renaming main to myfun)
}

--
Dr. Carsten F. Dormann Department of Computational Landscape Ecology Helmholtz Centre for Environmental Research-UFZ (Department Landschaftsökologie) (Helmholtz Zentrum für Umweltforschung - UFZ) Permoserstr. 15 04318 Leipzig Germany Tel: ++49(0)341 2351946 Fax: ++49(0)341 2351939 Email: carsten.dorm...@ufz.de internet: http://www.ufz.de/index.php?de=4205 Registered Office/Sitz der Gesellschaft: Leipzig Commercial Register Number/Registergericht: Amtsgericht Leipzig, Handelsregister Nr. B 4703 Chairman of the Supervisory Board/Vorsitzender des Aufsichtsrats: MinR Wilfried Kraus Scientific Managing Director/Wissenschaftlicher Geschäftsführer: Prof. Dr. Georg Teutsch Administrative Managing Director/Administrativer Geschäftsführer: Dr. Andreas Schmidt
[R] longer object length is not a multiple of shorter object length
Hi folks, I'm following An Introduction to R http://cran.r-project.org/doc/manuals/R-intro.html#R-and-statistics to learn R. Coming to section 2.2, Vector arithmetic: v <- 2*x + y + 1 Warning message: In 2 * x + y : longer object length is not a multiple of shorter object length What does it mean? How do I rectify it? Please help. TIA B.R. Stephen L
Re: [R] longer object length is not a multiple of shorter object length
On Nov 3, 2010, at 11:00 AM, Stephen Liu wrote: Hi folks, I'm following An Introduction to R http://cran.r-project.org/doc/manuals/R-intro.html#R-and-statistics to learn R. Coming to section 2.2, Vector arithmetic: v <- 2*x + y + 1 Warning message: In 2 * x + y : longer object length is not a multiple of shorter object length What does it mean? How do I rectify it? Please help. TIA What does this return: c(length(x), length(y)) # ? David Winsemius, MD West Hartford, CT
[R] Lattice plots for images
Hello UseRs, I need help on how to plot several raster images (such as those obtained from a kernel-smoothed intensity function) in a layout such as that obtained from the lattice package. I would like to obtain something such as obtained from using levelplot or xyplot in lattice. I currently use par(mfrow=c(3,3)) to set up the layout, but the resulting plots leave a lot of blank space between individual plots. If I can get it to the lattice format, I think it will save me some white space. Any help is greatly appreciated. Neba.
Re: [R] longer object length is not a multiple of shorter object length
- Original Message From: David Winsemius dwinsem...@comcast.net To: Stephen Liu sati...@yahoo.com Cc: r-help@r-project.org Sent: Wed, November 3, 2010 11:03:18 PM Subject: Re: [R] longer object length is not a multiple of shorter object length - snip - v <- 2*x + y + 1 Warning message: In 2 * x + y : longer object length is not a multiple of shorter object length What does it mean? How to rectify it? Please help. TIA What does this return: c(length(x), length(y)) # ? c(length(x), length(y)) [1] 5 11 B.R. Stephen L
Re: [R] longer object length is not a multiple of shorter object length
On Nov 3, 2010, at 11:17 AM, Stephen Liu wrote: - Original Message From: David Winsemius dwinsem...@comcast.net To: Stephen Liu sati...@yahoo.com Cc: r-help@r-project.org Sent: Wed, November 3, 2010 11:03:18 PM Subject: Re: [R] longer object length is not a multiple of shorter object length - snip - v <- 2*x + y + 1 Warning message: In 2 * x + y : longer object length is not a multiple of shorter object length What does it mean? How to rectify it? You were not supposed to rectify it. That example was designed to show you what happens in R when two vectors (actually three) are offered to the Arithmetic operators. Read the material that is above and below that expression again. Please help. TIA What does this return: c(length(x), length(y)) # ? c(length(x), length(y)) [1] 5 11 B.R. Stephen L David Winsemius, MD West Hartford, CT
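For readers following along: the lengths 5 and 11 above match the vectors defined earlier in that same manual section, so the warning can be reproduced in a self-contained way. A minimal sketch:

```r
# Vectors from "An Introduction to R", section 2.2
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)   # length 5
y <- c(x, 0, x)                     # length 11

# 2*x (length 5) is recycled to match y (length 11); since 11 is not a
# multiple of 5, R warns -- but still returns a length-11 result.
v <- 2*x + y + 1
length(v)   # 11
```

The warning is informational: the arithmetic is carried out anyway, with the shorter vector recycled from its beginning.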
[R] [klaR package] [NaiveBayes] warning message numerical 0 probability
Hi, I run R 2.10.1 under Ubuntu 10.04 LTS (Lucid Lynx) and klaR version 0.6-4. I compute a model over a 2-class dataset (composed of 700 examples). To that aim, I use the function NaiveBayes provided in the package klaR. When I then use the prediction function predict(my_model, new_data), I get the following warning: In FUN(1:747[[747L]], ...) : Numerical 0 probability with observation 458 As I did not find any documentation or any discussion concerning this warning message, I looked in the klaR source code and found the following line in predict.NaiveBayes.R : warning("Numerical 0 probability with observation ", i) Unfortunately, it is hard to get a clear picture of the whole process from reading the code. I wonder if someone could help me with the meaning of this warning message. Sorry I did not provide an example, but I could not reproduce the same message with a small toy example. Thank you, Fabon Dzogang.
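A guess at what triggers warnings like this in any naive Bayes implementation (a plain-R illustration, not klaR's actual code path): the per-class score is a product of many per-feature densities, and a single outlying feature value can drive the whole product to exact zero in double precision.

```r
# A single extreme value already underflows the normal density:
dnorm(50, mean = 0, sd = 1)                # 0 in double precision

# A product of many small (but nonzero) densities underflows too:
prod(dnorm(rep(8, 40), mean = 0, sd = 1))  # 0

# Summing log-densities keeps the same score finite:
sum(dnorm(rep(8, 40), mean = 0, sd = 1, log = TRUE))
```

If that is the cause, observation 458 is likely far from every class mean on at least one feature, so all class likelihoods evaluate to numerical zero.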
Re: [R] dll problem with C++ function
Just a shot in the dark... Do you properly close the input/output files at the end of your function? If not and the file remains open, it may throw an error upon new attempt to read it. It is possible that dyn.unload, among other things, closes all open connections and hence upon re-load everything works fine. Peter On Wed, Nov 3, 2010 at 6:47 AM, Carsten Dormann carsten.dorm...@ufz.de wrote: [original message quoted in full - snip]
Re: [R] Lattice plots for images
Have you tried using the 'mai' argument to par()? Something like: par(mfrow=c(3,3), mai=c(0,0,0,0)) I've used this in conjunction with image() to plot raster data in a tight grid. http://biostatmatt.com/archives/727 -Matt On Wed, 2010-11-03 at 11:13 -0400, Neba Funwi-Gabga wrote: Hello UseRs, I need help on how to plot several raster images (such as those obtained from a kernel-smoothed intensity function) in a layout such as that obtained from the lattice package. I would like to obtain something such as obtained from using levelplot or xyplot in lattice. I currently use par(mfrow=c(3,3)) to set up the layout, but the resulting plots leave a lot of blank space between individual plots. If I can get it to the lattice format, I think it will save me some white space. Any help is greatly appreciated. Neba. -- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina
Re: [R] Lattice plots for images
On Nov 3, 2010, at 11:13 AM, Neba Funwi-Gabga wrote: Hello UseRs, I need help on how to plot several raster images (such as those obtained from a kernel-smoothed intensity function) in a layout such as that obtained from the lattice package. I would like to obtain something such as obtained from using levelplot or xyplot in lattice. I currently use par(mfrow=c(3,3)) to set up the layout, but the resulting plots leave a lot of blank space between individual plots. If I can get it to the lattice format, I think it will save me some white space. (It's not clear what plotting paradigm you are using since you do not name a particular function or package, but this assumes you will be using lattice. If you are using base graphics, then the answer is undoubtedly ?par ) In the archives are examples you might use to look up in the documentation and then to modify to fit your specifications: http://finzi.psych.upenn.edu/R/Rhelp02/archive/58102.html trellis.par.set(list(layout.widths = list(left.padding = -1))) trellis.par.set(list(layout.widths = list(right.padding = -1, ylab.axis.padding = -0.5))) http://finzi.psych.upenn.edu/R/Rhelp02/archive/62912.html theme.novpadding <- list(layout.heights = list(top.padding = 0, main.key.padding = 0, key.axis.padding = 0, axis.xlab.padding = 0, xlab.key.padding = 0, key.sub.padding = 0, bottom.padding = 0)) Both citations were found with an archive search on "space between lattice plots". -- David Winsemius, MD West Hartford, CT
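The settings quoted above can be combined with levelplot() into a self-contained sketch; the raster data here are made up purely for illustration, and the padding values are the ones cited from the archives.

```r
library(lattice)

# Made-up raster data: three smooth surfaces on a common 50 x 50 grid
g <- expand.grid(x = 1:50, y = 1:50, panel = factor(paste("surface", 1:3)))
g$z <- with(g, sin(x / 8) * cos(y / 8) + as.numeric(panel) / 10)

# Padding settings from the archived posts, squeezing out white space
theme.novpadding <- list(
  layout.heights = list(top.padding = 0, main.key.padding = 0,
                        key.axis.padding = 0, axis.xlab.padding = 0,
                        xlab.key.padding = 0, key.sub.padding = 0,
                        bottom.padding = 0),
  layout.widths = list(left.padding = 0, right.padding = 0)
)

# One panel per surface, laid out in a single row with minimal padding
levelplot(z ~ x * y | panel, data = g, layout = c(3, 1),
          par.settings = theme.novpadding)
```

Because lattice draws all panels in one trellis object, there is no between-figure margin to manage at all, unlike par(mfrow=...) with separate base-graphics plots.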
Re: [R] Line numbers in Sweave
Well, I know this is true for the default Sweave(), and my problem actually comes from the pgfSweave package: I used some tricks to cheat R to parse and deparse the code chunks so that the output can be automatically formatted (and preserve the comments, [1]). The price to pay for not being honest is these line numbers since R 2.12.0 (even if there are no errors in my code). As I am unable to figure out the logic behind Sweave() to generate line numbers, I cannot modify my code either. So a temporary dirty solution is to turn off the reporting of line numbers. I wish the keep.source=FALSE option can preserve comments so that we don't need to touch utils:::RweaveLatexRuncode ([2]). Formatting the code chunks is really a nice feature with keep.source=FALSE, but the price of discarding comments is too high... I think the evaluate package might be a good place to look at (or perhaps the highlight package), which performs just like the R terminal (keep the comments and report the errors without really stopping R, [3]). Thanks! [1] https://github.com/cameronbracken/pgfSweave/blob/master/R/pgfSweaveDriver.R#L297 [2] http://yihui.name/en/wp-content/uploads/2009/11/Sweave2.pdf [3] see the error in the last but one example: http://had.co.nz/ggplot2/stat_smooth.html Regards, Yihui -- Yihui Xie xieyi...@gmail.com Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Tue, Nov 2, 2010 at 5:09 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 02/11/2010 5:50 PM, Yihui Xie wrote: Hi, I thumbed through the source code Sweave.R but was unable to figure out when (under what conditions) R will insert the line numbers to the output. The R 2.12.0 news said: • Parsing errors detected during Sweave() processing will now be reported referencing their original location in the source file. Do we have any options to turn off this reporting? Thanks! Sure: just don't include any syntax errors. 
Duncan Murdoch
Re: [R] non-numeric argument to binary operator error while reading ncdf file
Charles Novaes de Santana wrote: Thank you everybody for the help! The solution of my problem is here: http://climateaudit.org/2009/10/10/unthreaded-23/ The mv variable is the designated NA for the variable and it appears that somebody screwed that up in the file. This workaround worked for me: [...] Charles, not sure if you got my previous email, but if you send me a copy of the file that triggers the problem, I can fix it for everyone instead of requiring that kind of work around. Regards, --Dave --- David W. Pierce Division of Climate, Atmospheric Science, and Physical Oceanography Scripps Institution of Oceanography (858) 534-8276 (voice) / (858) 534-8561 (fax) dpie...@ucsd.edu
Re: [R] Tukey's table
Hi: Try this: Trat <- c(2:30) # number of treatments gl <- c(2:30, 40, 60, 120) # degrees of freedom # Write a one-line 2D function to get the Tukey distribution quantile: f <- function(x, y) qtukey(0.95, x, y) outer(Trat, gl, f) It's slow (takes a few seconds) but it seems to work. HTH, Dennis On Wed, Nov 3, 2010 at 3:52 AM, Silvano silv...@uel.br wrote: Hi, I'm building Tukey's table using the qtukey function. It happens that I can't get the values for Tukey's one degree of freedom and also wanted to eliminate the first column. Firstly, one needs at least two treatments to find a studentized range (which is why you get NaNs across the first row when you set Trat = 1). Secondly, if you have at least two groups, you need at least two observations per group to get a variance estimate, which means that the variance estimate of the difference needs to have at least 2 df. If one group has only one observation in it, the variance of the difference is the variance of the group with >= 2 observations, which doesn't make intuitive sense. This is why you get NaNs along the first column. HTH, Dennis The program is: Trat <- c(1:30) # number of treatments gl <- c(1:30, 40, 60, 120) # degrees of freedom tukval <- matrix(0, nr=length(gl), nc=length(Trat)) for(i in 1:length(gl)) for(j in 1:length(Trat)) tukval[i,j] <- qtukey(.95, Trat[j], gl[i]) rownames(tukval) <- gl colnames(tukval) <- paste(Trat, "", sep="") tukval require(xtable) xtable(tukval) Any suggestions? -- Silvano Cesar da Costa Departamento de Estatística Universidade Estadual de Londrina Fone: 3371-4346
Re: [R] Line numbers in Sweave
On 03/11/2010 11:52 AM, Yihui Xie wrote: Well, I know this is true for the default Sweave(), and my problem actually comes from the pgfSweave package: I used some tricks to cheat R to parse and deparse the code chunks so that the output can be automatically formatted (and preserve the comments, [1]). The price to pay for not being honest is these line numbers since R 2.12.0 (even if there are no errors in my code). As I am unable to figure out the logic behind Sweave() to generate line numbers, I cannot modify my code either. So a temporary dirty solution is to turn off the reporting of line numbers. Could you be more specific? I don't understand how the line numbers would affect anything if you don't have syntax errors. I wish the keep.source=FALSE option can preserve comments so that we don't need to touch utils:::RweaveLatexRuncode ([2]). This is basically impossible: with keep.source=FALSE, you are just seeing deparsed code. The comments don't make it into parsed code, so the deparser never sees them. I don't know the evaluate package, but I think I remember that the highlight package has its own parser, it doesn't use R's. Perhaps R's parser could import some of the differences, but I think it's complicated enough as it is, and would rather not make it more so. Formatting the code chunks is really a nice feature with keep.source=FALSE, but the price of discarding comments is too high... I think the evaluate package might be a good place to look at (or perhaps the highlight package), which performs just like the R terminal (keep the comments and report the errors without really stopping R, [3]). The error in [3] is a run-time error, not a syntax error. It should be unaffected by the line numbers. Duncan Murdoch Thanks! 
[1] https://github.com/cameronbracken/pgfSweave/blob/master/R/pgfSweaveDriver.R#L297 [2] http://yihui.name/en/wp-content/uploads/2009/11/Sweave2.pdf [3] see the error in the last but one example: http://had.co.nz/ggplot2/stat_smooth.html Regards, Yihui -- Yihui Xie xieyi...@gmail.com Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Tue, Nov 2, 2010 at 5:09 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 02/11/2010 5:50 PM, Yihui Xie wrote: Hi, I thumbed through the source code Sweave.R but was unable to figure out when (under what conditions) R will insert the line numbers into the output. The R 2.12.0 news said: • Parsing errors detected during Sweave() processing will now be reported referencing their original location in the source file. Do we have any options to turn off this reporting? Thanks! Sure: just don't include any syntax errors. Duncan Murdoch
[R] How to unquote string in R
s = "Hey" a = "Hello" table = rbind(s, a) write.table(table, paste("blah", ".PROPERTIES", sep = ""), row.names = FALSE, col.names = FALSE) In my table, how do I output only the words and not the words with the quotation marks? -- View this message in context: http://r.789695.n4.nabble.com/How-to-unquote-string-in-R-tp3025654p3025654.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] How to unquote string in R
Put the quote = FALSE argument in write.table On Wed, Nov 3, 2010 at 2:13 PM, lord12 trexi...@yahoo.com wrote: s = "Hey" a = "Hello" table = rbind(s, a) write.table(table, paste("blah", ".PROPERTIES", sep = ""), row.names = FALSE, col.names = FALSE) In my table, how do I output only the words and not the words with the quotation marks? -- View this message in context: http://r.789695.n4.nabble.com/How-to-unquote-string-in-R-tp3025654p3025654.html Sent from the R help mailing list archive at Nabble.com. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O
Re: [R] density() function: differences with S-PLUS
Dear William, thank you very much for your reply. I saw it only after my reply to Joshua. Unfortunately I cannot try until tomorrow, because I don't have S-PLUS on this machine. Thanks again. Nicola 2010/11/3 William Dunlap wdun...@tibco.com Did you get my reply (1:31pm PST Tuesday) to your request? It showed how you needed to use the from= and to= arguments to density to get identical x components in the output, and that the small differences in the y component were due to S+ truncating the gaussian kernel at +-4 standard deviations from the center while R does not truncate the gaussian kernel (the output looks like it uses a Fourier transform to do the convolution). Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Nicola Sturaro Sommacal (Quantide srl) Sent: Wednesday, November 03, 2010 3:34 AM To: Joshua Wiley Cc: r-help@r-project.org Subject: Re: [R] density() function: differences with S-PLUS Dear Joshua, first of all, thank you very much for your reply. I hoped that someone who's familiar with both S+ and R could reply, because I spent some hours looking for a solution. If someone else would like to try, this is the S-PLUS code and output, while below there is the R code. I obtain the same x values, while the y values are different in both examples. Thank you very much.
Nicola ### S-PLUS CODE AND OUTPUT ### density(1:1000, width = 4) $x: [1] -2.00000 18.51020 39.02041 59.53061 80.04082 100.55102 121.06122 [8] 141.57143 162.08163 182.59184 203.10204 223.61224 244.12245 264.63265 [15] 285.14286 305.65306 326.16327 346.67347 367.18367 387.69388 408.20408 [22] 428.71429 449.22449 469.73469 490.24490 510.75510 531.26531 551.77551 [29] 572.28571 592.79592 613.30612 633.81633 654.32653 674.83673 695.34694 [36] 715.85714 736.36735 756.87755 777.38776 797.89796 818.40816 838.91837 [43] 859.42857 879.93878 900.44898 920.95918 941.46939 961.97959 982.48980 [50] 1003.0 $y: [1] 4.565970e-006 1.31e-003 9.999374e-004 1.31e-003 9.999471e-004 1.31e-003 [7] 9.999560e-004 1.30e-003 9.999643e-004 1.29e-003 9.999718e-004 1.28e-003 [13] 9.999788e-004 1.26e-003 9.999852e-004 1.24e-003 9.10e-004 1.22e-003 [19] 9.63e-004 1.19e-003 1.01e-003 1.16e-003 1.06e-003 1.13e-003 [25] 1.10e-003 1.10e-003 1.13e-003 1.06e-003 1.16e-003 1.01e-003 [31] 1.19e-003 9.63e-004 1.22e-003 9.10e-004 1.24e-003 9.999852e-004 [37] 1.26e-003 9.999788e-004 1.28e-003 9.999718e-004 1.29e-003 9.999643e-004 [43] 1.30e-003 9.999560e-004 1.31e-003 9.999471e-004 1.31e-003 9.999374e-004 [49] 1.31e-003 4.432131e-006 exdata = iris[, 1, 1] density(exdata, width = 4) $x: [1] 1.30 1.453061 1.606122 1.759184 1.912245 2.065306 2.218367 2.371429 2.524490 [10] 2.677551 2.830612 2.983673 3.136735 3.289796 3.442857 3.595918 3.748980 3.902041 [19] 4.055102 4.208163 4.361224 4.514286 4.667347 4.820408 4.973469 5.126531 5.279592 [28] 5.432653 5.585714 5.738776 5.891837 6.044898 6.197959 6.351020 6.504082 6.657143 [37] 6.810204 6.963265 7.116327 7.269388 7.422449 7.575510 7.728571 7.881633 8.034694 [46] 8.187755 8.340816 8.493878 8.646939 8.80 $y: [1] 0.0007849649 0.0013097474 0.0021225491 0.0033616520 0.0052059615 0.0078856717 [7] 0.0116917555 0.0169685132 0.0241073754 0.0335286785 0.0456521053 0.0608554862 [13] 0.0794235072 0.1014901241 0.1269807991 0.1555625999 0.1866111931 0.2192033788 [19] 0.2521417640
0.2840144993 0.3132881074 0.3384260582 0.3580208688 0.3709241384 [25] 0.3763578665 0.3739920600 0.3639778683 0.3469316232 0.3238721233 0.2961200278 [31] 0.2651731505 0.2325739601 0.1997853985 0.1680884651 0.1385105802 0.1117884914 [37] 0.0883644110 0.0684099972 0.0518702141 0.0385181792 0.0280126487 0.0199513951 [43] 0.0139159044 0.0095050745 0.0063575653 0.0041639082 0.0026680819 0.0016700727 [49] 0.0010169912 0.0005962089 ### R CODE ### # S-PLUS CODE: density(1:1000, width = 4) SAME x BUT DIFFERENT y density(1:1000, bw = 4, window = "g", n = 50, cut = 0.75)$x density(1:1000, bw = 4, window = "g", n = 50, cut = 0.75)$y # S-PLUS CODE: exdata = iris[, 1, 1]; density(exdata, width = 4) SAME x BUT DIFFERENT y exdata = iris$Sepal.Length[iris$Species == "setosa"] density(exdata, bw = 4, n = 50, cut = 0.75)$x density(exdata, bw = 4, n = 50, cut = 0.75)$y 2010/11/2 Joshua Wiley jwiley.ps...@gmail.com Dear Nicola, There are undoubtedly people here who are familiar with both S+ and R, but they may not always be around or get to
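Bill Dunlap's from=/to= suggestion (quoted in the message above) can be written out explicitly. The offsets below assume the evaluation grid runs over [min - cut*bw, max + cut*bw] with cut = 0.75, which matches the S-PLUS $x grid shown earlier:

```r
x   <- 1:1000
bw  <- 4
cut <- 0.75

# Pin the evaluation grid down explicitly instead of relying on 'cut':
d <- density(x, bw = bw, kernel = "gaussian", n = 50,
             from = min(x) - cut * bw, to = max(x) + cut * bw)
range(d$x)   # -2 and 1003, matching the S-PLUS $x endpoints above
```

With identical grids, any remaining discrepancy is confined to $y, which (per the reply above) comes from S-PLUS truncating the gaussian kernel at +-4 standard deviations while R does not.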
Re: [R] How to unquote string in R
lord12 wrote: s = "Hey" a = "Hello" table = rbind(s, a) write.table(table, paste("blah", ".PROPERTIES", sep = ""), row.names = FALSE, col.names = FALSE) In my table, how do I output only the words and not the words with the quotation marks? You read the help page for the function you're using :). From ?write.table: quote: a logical value (‘TRUE’ or ‘FALSE’) or a numeric vector. If ‘TRUE’, any character or factor columns will be surrounded by double quotes. If a numeric vector, its elements are taken as the indices of columns to quote. In both cases, row and column names are quoted if they are written. If ‘FALSE’, nothing is quoted.
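Putting the two replies together, the original example with quote = FALSE, written to a temporary file so it is self-contained:

```r
s <- "Hey"
a <- "Hello"
tab <- rbind(s, a)

# quote = FALSE drops the surrounding double quotes in the output file
f <- tempfile(fileext = ".PROPERTIES")
write.table(tab, f, quote = FALSE, row.names = FALSE, col.names = FALSE)
readLines(f)   # the two lines of the file are now: Hey / Hello
```

Without quote = FALSE the file would contain "Hey" and "Hello" with literal double quotes, since character columns are quoted by default.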
[R] smooth: differences between R and S-PLUS
Hi! I am studying differences between the R and S-PLUS smooth() functions. I know from the help that they work differently, so I ask: - does a package exist that gives the same results? - alternatively, does someone know how I can obtain the same results in R, using a self-made script? I know that S-PLUS uses the 4(3RSR)2H running median smoothing and I tried to implement it with the code below. I obtain some results equal to the S-PLUS ones, so I think the main problem is understanding how NA values from the moving median are treated. The R result is: [1] NA NA 4.6250 4.9375 4.7500 4. 3.2500 3. [9] 2.8750 2.5000 2.1250 the S-PLUS one is: [1] * * * 4.6250 4.9375 4.7500 4. 3.2500 3.** where * stands for a number different from the R one that I don't remember. Unfortunately I cannot give more details about the S-PLUS function now, because I am working on a machine without this software. If someone can help me, tomorrow (CET time) I will provide more details. Thanks in advance. Nicola ### EXAMPLE # Comments indicate which step of the 4(3RSR)2H algorithm I try to replicate. # Data x1 <- c(4, 1, 3, 6, 6, 4, 1, 6, 2, 4, 2) # 4 out = NULL for (i in 1:11) {out[i] = median(x1[i:(i+3)])} out[is.na(out)] = x1[is.na(out)] out # (3RSR) x2 = smooth(out, "3RSR", twiceit = F) x2 # 2 out2 = NULL for (i in 1:11) {out2[i] = median(x2[i:(i+1)])} out2[is.na(out2)] = x2[is.na(out2)] out2 # H filter(out2, filter = c(1/4, 1/2, 1/4), sides = 2)
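For comparison, base R already exposes Tukey-style compound smoothers through smooth(), although "4(3RSR)2H" itself is not among its kinds (R offers "3RS3R", "3RSS", "3RSR", "3R", "3" and "S"), so the hand-rolled pipeline above is still needed for an exact S-PLUS match. A quick look at what the built-ins give on the same data:

```r
x1 <- c(4, 1, 3, 6, 6, 4, 1, 6, 2, 4, 2)

# R's default compound smoother (Tukey's 3RS3R):
as.vector(smooth(x1))

# The 3RSR variant, used as one stage of the pipeline above:
as.vector(smooth(x1, kind = "3RSR"))

# A plain running median of 3 for reference:
runmed(x1, k = 3)
```

Note that R's built-in smoothers only cover the odd-span median stages; the even-span 4 and 2 stages and the hanning (H) stage of the S-PLUS algorithm have no direct base-R equivalent, which is why the example code computes them by hand.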
Re: [R] Orthogonalization with different inner products
Suppose one wanted to consider random variables X_1,...,X_n and from each subtract off the piece which is correlated with the previous variables in the list, i.e. make new variables Z_i so that Z_1 = X_1 and Z_i = X_i - cov(X_i,Z_1)Z_1/var(Z_1) - ... - cov(X_i,Z_{i-1})Z_{i-1}/var(Z_{i-1}) I have code to do this but I keep getting a non-conformable array error in the line with the covariance. Does anyone have any suggestions? Here is my code: gov = read.table(file.choose(), sep="\t", header=T) gov1 = gov[3:length(gov[1,])] n_indices = length(names(gov1)) x = data.matrix(gov1) v = x R = matrix(rep(0, length(x[,1])*length(x[1,])), length(x[,1])) for(j in 1:n_indices){ u = matrix(rep(0, length(v[,1])), length(v[,1])) for(i in 1:j-1){ u = u + cov(v[,j], v[,i])*v[,i]/var(v[,i]) # (error here) } v[,j] = v[,j] - u } Thanks, Andrew
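One guess at the non-conformable error, offered as a sketch rather than a certain diagnosis: in R, 1:j-1 parses as (1:j) - 1, so for j = 1 the inner loop index includes 0 and v[, 0] selects zero columns. seq_len(j - 1) yields the intended (and for j = 1, empty) index range:

```r
# Sequential residualization (Gram-Schmidt on sample covariances).
# seq_len(j - 1) is empty when j = 1, unlike 1:j-1, which is (1:j) - 1.
residualize <- function(x) {
  v <- x
  for (j in seq_len(ncol(x))) {
    u <- numeric(nrow(x))
    for (i in seq_len(j - 1)) {
      u <- u + cov(v[, j], v[, i]) * v[, i] / var(v[, i])
    }
    v[, j] <- v[, j] - u
  }
  v
}

# Made-up data standing in for the gov1 indices:
set.seed(1)
x <- matrix(rnorm(60), ncol = 3)
z <- residualize(x)
round(cov(z), 10)   # off-diagonal entries are 0 up to rounding
```

Because each Z_j has exactly its sample covariance with every earlier Z_i removed, the columns of the result are pairwise uncorrelated in-sample.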
[R] '=' vs '-'
Hi all, can we use '=' instead of the '<-' operator for assignment in R programs? regards, KM
[R] programming questions
quick programming questions. I want to turn on more errors. there are two traps I occasionally fall into. * I wonder why R thinks that a variable is always defined in a data frame. is.defined(d) [1] FALSE d = data.frame( x=1:5, y=1:5 ) is.defined(d$z) [1] TRUE is.defined(nonexisting$garbage) [1] TRUE this is a bit unfortunate for me, because subsequent errors become less clear. right now, I need to do '(is.defined(d) && !is.null(d$z))' to check that my function inputs are valid. It would be nicer if one could just write if (is.defined(d$z)). * is there a way to turn off automatic recycling? I would rather get an error than unexpected recycling. I can force recycling with rep() when I need to. regards, /iaw Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
Re: [R] programming questions
ivo welch wrote: quick programming questions. I want to turn on more errors. there are two traps I occasionally fall into. * I wonder why R thinks that a variable is always defined in a data frame.

is.defined(d)
[1] FALSE
d= data.frame( x=1:5, y=1:5 )
is.defined(d$z)
[1] TRUE
is.defined(nonexisting$garbage)
[1] TRUE

Which package/version of R is the 'is.defined' function in? I don't seem to have it here on 2.11.1, which I know is not the latest version of R. What does 'defined' mean?

this is a bit unfortunate for me, because subsequent errors become less clear. right now, I need to do '(is.defined(d) and !is.null(d$z))' to check that my function inputs are valid. It would be nicer if one could just write 'if (is.defined(d$z))'.

"z" %in% names(d) ?

* is there a way to turn off automatic recycling? I would rather get an error than unexpected recycling. I can force recycling with rep() when I need to. regards, /iaw Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] programming questions
On 03/11/2010 2:05 PM, ivo welch wrote: quick programming questions. I want to turn on more errors. there are two traps I occasionally fall into. * I wonder why R thinks that a variable is always defined in a data frame. is.defined(d) [1] FALSE d= data.frame( x=1:5, y=1:5 ) is.defined(d$z) [1] TRUE is.defined(nonexisting$garbage) [1] TRUE this is a bit unfortunate for me, because subsequent errors become less clear. right now, I need to do '(is.defined(d) and !is.null(d$z))' to check that my function inputs are valid. It would be nicer if one could just write if (is.defined(d$z). * is there a way to turn off automatic recycling? I would rather get an error than unexpected recycling. I can force recycling with rep() when I need to. Where did you find the is.defined() function? It's not part of R. The R function to do that is exists(). Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] programming questions
yikes. this is all my fault. it was the first thing that I ever defined when I started using R.

is.defined <- function(name) exists(as.character(substitute(name)))

I presume there is something much better... /iaw

On Wed, Nov 3, 2010 at 2:12 PM, Erik Iverson er...@ccbr.umn.edu wrote: ivo welch wrote: quick programming questions. I want to turn on more errors. there are two traps I occasionally fall into. * I wonder why R thinks that a variable is always defined in a data frame. is.defined(d) [1] FALSE d= data.frame( x=1:5, y=1:5 ) is.defined(d$z) [1] TRUE is.defined(nonexisting$garbage) [1] TRUE Which package/version of R is the 'is.defined' function in? I don't seem to have it here on 2.11.1, which I know is not the latest version of R. What does 'defined' mean? this is a bit unfortunate for me, because subsequent errors become less clear. right now, I need to do '(is.defined(d) and !is.null(d$z))' to check that my function inputs are valid. It would be nicer if one could just write 'if (is.defined(d$z))'. "z" %in% names(d) ? * is there a way to turn off automatic recycling? I would rather get an error than unexpected recycling. I can force recycling with rep() when I need to. regards, /iaw Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] programming questions
For data frames you can also use with() in your example: with(d, exists("z")) -- Jonathan P. Daily Technician - USGS Leetown Science Center 11649 Leetown Road Kearneysville WV, 25430 (304) 724-4480 "Is the room still a room when it's empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it." - Jubal Early, Firefly

From: ivo welch ivo.we...@gmail.com To: Erik Iverson er...@ccbr.umn.edu Cc: r-help r-h...@stat.math.ethz.ch Date: 11/03/2010 02:20 PM Subject: Re: [R] programming questions Sent by: r-help-boun...@r-project.org

yikes. this is all my fault. it was the first thing that I ever defined when I started using R. is.defined <- function(name) exists(as.character(substitute(name))) I presume there is something much better... /iaw

On Wed, Nov 3, 2010 at 2:12 PM, Erik Iverson er...@ccbr.umn.edu wrote: ivo welch wrote: quick programming questions. I want to turn on more errors. there are two traps I occasionally fall into. * I wonder why R thinks that a variable is always defined in a data frame. is.defined(d) [1] FALSE d= data.frame( x=1:5, y=1:5 ) is.defined(d$z) [1] TRUE is.defined(nonexisting$garbage) [1] TRUE Which package/version of R is the 'is.defined' function in? I don't seem to have it here on 2.11.1, which I know is not the latest version of R. What does 'defined' mean? this is a bit unfortunate for me, because subsequent errors become less clear. right now, I need to do '(is.defined(d) and !is.null(d$z))' to check that my function inputs are valid. It would be nicer if one could just write 'if (is.defined(d$z))'. "z" %in% names(d) ? * is there a way to turn off automatic recycling? I would rather get an error than unexpected recycling. I can force recycling with rep() when I need to.
regards, /iaw Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] splitting first 10 words in a string
Hi all, Thanks for all the help. I realize I have a lot to learn in R but I love it. m

From: steven mosher [mailto:mosherste...@gmail.com] Sent: Tuesday, November 02, 2010 11:45 PM To: Matevž Pavlič Cc: David Winsemius; Gaj Vidmar; r-h...@stat.math.ethz.ch Subject: Re: [R] splitting first 10 words in a string

just merge the data.frames back together. use merge or cbind(). cbind will be easier:

DF1 <- data.frame(x, y, z)
DF2 <- data.frame(DF1$x)  # copy a column

then you added columns to DF2. just put them back together:

DF3 <- cbind(DF2, DF1$y, DF1$z)

if you spend more time with R you will be able to do things like this elegantly, but for now this way will work and you will learn a bit about R. As for counting instances of a string, I might suggest looking at the table command:

k <- c("all", "but", "all")
table(k)
k
all but
  2   1

So you can do a table for each column in your dataframe.

On Tue, Nov 2, 2010 at 12:53 PM, Matevž Pavlič matevz.pav...@gi-zrmk.si wrote: Hi, Ok, I got this now. At least I think so. I got a data.frame with 15 fields; all other words have been truncated, which is what I want. But I have that in a separate data.frame from the one it was in before (would be nice if it would be in the same ...)

'data.frame': 22801 obs. of 15 variables:
 $ V1 : chr "HUMUS" "SLABO" "MALO" "SLABO" ...
 $ V2 : chr "IN" "GRANULIRAN" "PREPEREL" "VEZAN" ...
 $ V3 : chr "HUMUSNA" "PEŠČEN" "MELJAST" ",KONGLOMERAT," ...
 $ V4 : chr "GLINA" "PROD" "PROD" "P0ROZEN," ...
 $ V5 : chr "Z" "DO" "DO" "S" ...
 $ V6 : chr "MALO" "r" "r" "PLASTMI" ...
 $ V7 : chr "PODA," "=" "=" "GFs," ...
 $ V8 : chr "LAHKO" "8Q" "60mm," "SIVORJAV" ...
 $ V9 : chr "GNETNA," "mm," "S" ...
 $ V10: chr "RJAVA" "S" "PRODNIKI," ...
 $ V11: chr "PRODNIKI" "MALO" ...
 $ V12: chr "DO" "PEŠČEN" ...
 $ V13: chr "R" "S" ...
 $ V14: chr "=" "TANKIMI" ...

Now, I have another problem. Is it possible to count which word occurs most often in each field (V1, V2, V3, ...), and which one is second, and so on? Ideally to create a table for each field (V1, V2, V3, ...) with the word and the number of occurrences in that field (column).
I suppose it could be done in SQL, but since I saw what R can do I guess this can be done here too? Thanks, m

-Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Tuesday, November 02, 2010 8:23 PM To: Matevž Pavlič Cc: Gaj Vidmar; r-h...@stat.math.ethz.ch Subject: Re: [R] splitting first 10 words in a string

On Nov 2, 2010, at 3:01 PM, Matevž Pavlič wrote: Hi all, Thanks for all the help. I managed to do it with what Gaj suggested (Excel :(). The last solution from David is also great, I just don't understand why R put the words in 14 columns and three rows?

Because the maximum number of words was 14 and the fill argument was TRUE. There were three rows because there were three items in the supplied character vector.

I would like it to put just the first 10 words in the source field into 10 different destination fields, but the same row. And so on... is that possible?

I don't know what a destination field might be. Those are not R data types. This would trim the extra columns (in this example set to those greater than 8) by adding a lot of "NULL"s to the end of a colClasses specification, at the expense of a warning message which can be ignored:

read.table(textConnection(words), fill=TRUE, colClasses = c(rep("character", 8), rep("NULL", 30)), stringsAsFactors=FALSE)
   V1    V2    V3      V4    V5    V6    V7      V8
1   I  have     a columnn  with  text  that     has
2   I would  like      to split these words      in
3 but  just first     ten words    in   the string.
Warning message:
In read.table(textConnection(words), fill = T, colClasses = c(rep("character", :
  cols = 14 != length(data) = 38

If you want to assign the first column to a variable then just:

first8 <- read.table(textConnection(words), fill=TRUE, colClasses = c(rep("character", 8), rep("NULL", 30)), stringsAsFactors=FALSE)
var1 <- first8[[1]]
var1
[1] "I"   "I"   "but"

-- David.
Thank you, m

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius Sent: Tuesday, November 02, 2010 3:47 PM To: Gaj Vidmar Cc: r-h...@stat.math.ethz.ch Subject: Re: [R] splitting first 10 words in a string

On Nov 2, 2010, at 6:24 AM, Gaj Vidmar wrote: Though forbidden in this list, in Excel it's just (literally!) five clicks away! (with the column in question selected) Data > Text to Columns > Delimited > tick Space > Finish Pa je! (~Voila in Slovenian) (then import back to R, keeping only the first 10 columns if so desired)

You could do the same thing without needing to leave R. Just read.table( textConnection(..), header=FALSE, fill=TRUE):

read.table(textConnection(words), fill=T)
  V1   V2 V3      V4   V5   V6   V7  V8 V9 V10 V11 V12 V13 V14
1  I have  a columnn with text that
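[Editorial note: the word-counting question above can be answered with the table() suggestion from earlier in the thread applied column by column. A sketch with toy data (not the poster's 22801-row data.frame):]

```r
# Toy stand-in for the parsed word columns V1, V2, ...
df <- data.frame(V1 = c("GLINA", "PROD", "GLINA", "MELJ"),
                 V2 = c("DO", "S", "DO", "DO"),
                 stringsAsFactors = FALSE)

# One frequency table per column, most frequent word first:
freqs <- lapply(df, function(col) sort(table(col), decreasing = TRUE))
freqs$V1  # GLINA appears twice; PROD and MELJ once each
```

`lapply` over a data.frame iterates over its columns, so this produces the per-field tables asked for in one line.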
Re: [R] '=' vs '<-'
Yes, but '<-' is preferred. Note, there are also some differences. You can do the following:

a <- 10
b = 10
identical(a,b)
[1] TRUE

And you can also do

myFun <- function(x, y = 100){
+ result <- x*y
+ result}
myFun(x = 20)
[1] 2000

But, you cannot use '<-' to define the arguments of a function

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of km Sent: Wednesday, November 03, 2010 2:05 PM To: r-help@r-project.org Subject: [R] '=' vs '<-'

Hi all, can we use '=' instead of the '<-' operator for assignment in R programs? regards, KM

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] '=' vs '<-'
On Wed, Nov 3, 2010 at 6:04 PM, km srikrishnamo...@gmail.com wrote: Hi all, can we use '=' instead of the '<-' operator for assignment in R programs?

Yes, mostly. You can also use 'help' to ask such questions:

help("=")

The operators ‘<-’ and ‘=’ assign into the environment in which they are evaluated. The operator ‘<-’ can be used anywhere, whereas the operator ‘=’ is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.

and so on...

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
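[Editorial note: a short sketch of the difference the help page describes. Inside an unbraced subexpression, `=` is parsed as argument naming rather than assignment, which is where the two operators actually diverge:]

```r
# Both assign at top level:
a <- 10
b = 10
identical(a, b)  # TRUE

# Inside a call, '<-' still assigns (in the calling frame)...
system.time(y1 <- rnorm(10))  # y1 is created as a side effect

# ...but '=' is argument naming, so this is an error, because
# system.time() has no argument called 'y2':
# system.time(y2 = rnorm(10))
```

This is why style guides that recommend `<-` for assignment reserve `=` for naming arguments in calls.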
Re: [R] programming questions
On Wed, Nov 3, 2010 at 6:17 PM, ivo welch ivo.we...@gmail.com wrote: yikes. this is all my fault. it was the first thing that I ever defined when I started using R. is.defined <- function(name) exists(as.character(substitute(name))) I presume there is something much better...

You didn't do a good job testing your is.defined :) Let's see what happens when you feed it 'nonexisting$garbage'. What gets passed into 'exists'?

acs=function(name){as.character(substitute(name))}
acs(nonexisting$garbage)
[1] "$"           "nonexisting" "garbage"

- and then your exists test is doing effectively exists("$") which exists. Hence TRUE. What you are getting here is the expression parsed up as a function call ($) and its args. You'll see this if you do:

acs(fix(me))
[1] "fix" "me"

Perhaps you meant to deparse it:

acs=function(name){as.character(deparse(substitute(name)))}
acs(nonexisting$garbage)
[1] "nonexisting$garbage"
exists(acs(nonexisting$garbage))
[1] FALSE

But you'd be better off testing list elements with is.null

Barry

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
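[Editorial note: a minimal sketch of the is.null advice above. For list or data.frame elements, `$` on a missing name returns NULL rather than erroring, so is.null (or a names() test, when NULL-valued elements must be distinguished from absent ones) does the job:]

```r
d <- data.frame(x = 1:5, y = 1:5)

is.null(d$z)       # TRUE  - column 'z' is absent, $ returned NULL
is.null(d$x)       # FALSE - column 'x' is present

# A names() test is unambiguous even for lists that can hold NULL:
"z" %in% names(d)  # FALSE
```

For plain variables (as opposed to list elements), exists("name") with a quoted name remains the right tool.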
[R] getting p-values from fitted ARIMA
Hi, I fitted an ARIMA model using the function arima(). The output consists of the fitted coefficients with their standard errors. However, I need information about the significance of the coefficients, like p-values. I hope you can help me on that issue... ciao Stefan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] '=' vs '<-'
@all: Does it seem reasonable to add a discussion of '=' vs. '<-' to the FAQ? It seems a regular question and something of a hot topic to debate.

@KM Here are links I've accumulated to prior discussions on this topic. I am pretty certain they are all unique.

http://blog.revolutionanalytics.com/2008/12/use-equals-or-arrow-for-assignment.html
http://www.mail-archive.com/r-help@r-project.org/msg69310.html
http://www.mail-archive.com/r-help@r-project.org/msg99789.html
http://www.mail-archive.com/r-help@r-project.org/msg104102.html
http://www.mail-archive.com/r-help@r-project.org/msg16881.html
https://stat.ethz.ch/pipermail/r-sig-teaching/2010q4/000312.html
http://r.789695.n4.nabble.com/advice-opinion-on-vs-in-teaching-R-td1014502.html#a1014502

On Wed, Nov 3, 2010 at 11:04 AM, km srikrishnamo...@gmail.com wrote: Hi all, can we use '=' instead of the '<-' operator for assignment in R programs? regards, KM

-- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] getting p-values from fitted ARIMA
Hi Stefan, Take a look at https://stat.ethz.ch/pipermail/r-help/2009-June/202173.html HTH, Jorge On Wed, Nov 3, 2010 at 2:50 PM, wrote: Hi I fitted an ARIMA model using the function arima(). The output consists of the fitted coefficients with their standard errors. However i need information about the significance of the coefficients, like p-values. I hope you can help me on that issue... ciao Stefan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] NFFT on a Zoo?
I have an irregular time series in a Zoo object, and I've been unable to find any way to do an FFT on it. More precisely, I'd like to do an NFFT (non-equispaced / non-uniform time FFT) on the data. The data is timestamped samples from a cheap self-logging accelerometer. The data is weakly regular, with the following characteristics: - short gaps every ~20ms - large gaps every ~200ms - jitter/noise in the timestamp The gaps cover ~10% of the acquisition time. And they occur often enough that the uninterrupted portions of the data are too short to yield useful individual FFT results, even without timestamp noise. My searches have revealed no NFFT support in R, but I'm hoping it may be known under some other name (just as non-uniform time series are known as 'zoo' rather than 'nts' or 'nuts'). I'm using R through RPy, so any solution that makes use of numpy/scipy would also work. And I care more about accuracy than speed, so a non-library solution in R or Python would also work. Alternatively, is there a technique by which multiple FFTs over smaller (incomplete) data regions may be combined to yield an improved view of the whole? My experiments have so far yielded only useless results, but I'm getting ready to try PCA across the set of partial FFTs. TIA, -BobC __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
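[Editorial note: one standard answer to the question above is the Lomb-Scargle periodogram, which estimates spectral power directly from irregularly spaced samples. A minimal least-squares version in base R (equivalent to Lomb-Scargle up to normalization; no packages assumed, and the data below is synthetic, not the poster's accelerometer log):]

```r
# For each trial frequency f, regress the (mean-centered) series on
# cos(2*pi*f*t) and sin(2*pi*f*t) at the actual sample times t; the
# explained sum of squares is the spectral power at f.
lsq_power <- function(t, y, freqs) {
  y <- y - mean(y)
  sapply(freqs, function(f) {
    X <- cbind(cos(2 * pi * f * t), sin(2 * pi * f * t))
    fit <- lm.fit(X, y)
    sum(y^2) - sum(fit$residuals^2)
  })
}

# Irregularly sampled 5 Hz sine with timestamp jitter:
set.seed(42)
t <- sort(runif(200, 0, 2))
y <- sin(2 * pi * 5 * t) + rnorm(200, sd = 0.1)
freqs <- seq(1, 10, by = 0.1)
freqs[which.max(lsq_power(t, y, freqs))]  # peaks near 5
```

Because the sinusoids are evaluated at the true timestamps, gaps and jitter need no resampling or interpolation, unlike an FFT.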
Re: [R] getting p-values from fitted ARIMA
On Wed, 3 Nov 2010, h0453...@wu.ac.at wrote: Hi, I fitted an ARIMA model using the function arima(). The output consists of the fitted coefficients with their standard errors. However, I need information about the significance of the coefficients, like p-values. I hope you can help me on that issue...

If you want to use a standard normal approximation, you can use coeftest() from the lmtest package. For example:

fit3 <- arima(presidents, c(3, 0, 0))
library(lmtest)
coeftest(fit3)

Whether or not this is a good approximation is a different question, though. See also the comments on ?arima wrt the Hessian.

Best, Z

ciao Stefan

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
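[Editorial note: the same normal-approximation p-values that coeftest() reports can be computed by hand from the arima() fit, which makes the approximation explicit (a sketch; subject to the same caveats about the Hessian noted above):]

```r
fit <- arima(presidents, c(3, 0, 0))

se <- sqrt(diag(vcov(fit)))   # standard errors from the Hessian
z  <- coef(fit) / se          # Wald z statistics
p  <- 2 * pnorm(-abs(z))      # two-sided normal p-values

cbind(estimate = coef(fit), se = se, z = z, p.value = p)
```

This is a large-sample Wald test; for short series the normal approximation can be poor, which is exactly the caveat in the reply above.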
Re: [R] Lattice plots for images
On Wed, Nov 3, 2010 at 8:13 AM, Neba Funwi-Gabga fusigabsm...@gmail.com wrote: Hello UseRs, I need help on how to plot several raster images (such as those obtained from a kernel-smoothed intensity function) in a layout such as that obtained from the lattice package. I would like to obtain something such as obtained from using the levelplot or xyplot in lattice. I currently use: par(mfrow=c(3,3) to set the workspace, but the resulting plots leave a lot of blank space between individual plots. If I can get it to the lattice format, I think it will save me some white space. Any help is greatly appreciated. It's not clear what your question is exactly, but you may want to look at ?panel.levelplot.raster and ?panel.smoothScatter (particularly the 'raster' argument) in lattice. -Deepayan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NFFT on a Zoo?
On Wed, Nov 3, 2010 at 2:59 PM, Bob Cunningham flym...@gmail.com wrote: I have an irregular time series in a Zoo object, and I've been unable to find any way to do an FFT on it. More precisely, I'd like to do an NFFT (non-equispaced / non-uniform time FFT) on the data. The data is timestamped samples from a cheap self-logging accelerometer. The data is weakly regular, with the following characteristics: - short gaps every ~20ms - large gaps every ~200ms - jitter/noise in the timestamp The gaps cover ~10% of the acquisition time. And they occur often enough that the uninterrupted portions of the data are too short to yield useful individual FFT results, even without timestamp noise. My searches have revealed no NFFT support in R, but I'm hoping it may be known under some other name (just as non-uniform time series are known as 'zoo' rather than 'nts' or 'nuts'). I'm using R through RPy, so any solution that makes use of numpy/scipy would also work. And I care more about accuracy than speed, so a non-library solution in R or Python would also work. Alternatively, is there a technique by which multiple FFTs over smaller (incomplete) data regions may be combined to yield an improved view of the whole? My experiments have so far yielded only useless results, but I'm getting ready to try PCA across the set of partial FFTs. Check out the entire thread that starts here. http://www.mail-archive.com/r-help@r-project.org/msg36349.html -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] programming questions
thanks, barry and erik. I didn't do a good job---I did an awful job. alas, should R not come with an is.defined() function? a variable may never have been created, and this is different from a variable existing but holding a NULL. this can be the case in the global environment or in a data frame.

is.null(never.before.seen)
Error: object 'never.before.seen' not found
is.defined(never.before.seen) ## I need this, because I do not want an error:
[1] FALSE

your acs function doesn't really do what I want, either, because { d=data.frame( x=1:4); exists(acs(d$x)) } tells me FALSE . I really need

d <- data.frame( x=1:5, y=1:5 )
is.defined(d$x)
TRUE
is.defined(d$z)
FALSE
is.defined(never.before.seen)
FALSE
is.defined(never.before.seen$anything) ## if a list does not exist, anything in it does not exist either
FALSE

how would I define this function? regards, /iaw

On Wed, Nov 3, 2010 at 2:48 PM, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote: On Wed, Nov 3, 2010 at 6:17 PM, ivo welch ivo.we...@gmail.com wrote: yikes. this is all my fault. it was the first thing that I ever defined when I started using R. is.defined <- function(name) exists(as.character(substitute(name))) I presume there is something much better... You didn't do a good job testing your is.defined :) Let's see what happens when you feed it 'nonexisting$garbage'. What gets passed into 'exists'? acs=function(name){as.character(substitute(name))} acs(nonexisting$garbage) [1] "$" "nonexisting" "garbage" - and then your exists test is doing effectively exists("$") which exists. Hence TRUE. What you are getting here is the expression parsed up as a function call ($) and its args.
You'll see this if you do: acs(fix(me)) [1] fix me Perhaps you meant to deparse it: acs=function(name){as.character(deparse(substitute(name)))} acs(nonexisting$garbage) [1] nonexisting$garbage exists(acs(nonexisting$garbage)) [1] FALSE But you'd be better off testing list elements with is.null Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
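[Editorial note: the semantics wished for above can be sketched with tryCatch: evaluate the expression in the caller, map any error to NULL, and treat NULL as "not defined". This deliberately maps both "absent" and "present but NULL" to FALSE, matching all four examples in the question; it is a sketch, not base R:]

```r
is.defined <- function(x) {
  val <- tryCatch(eval(substitute(x), parent.frame()),
                  error = function(e) NULL)
  !is.null(val)
}

d <- data.frame(x = 1:5, y = 1:5)
is.defined(d$x)                         # TRUE
is.defined(d$z)                         # FALSE (column absent, $ gives NULL)
is.defined(never.before.seen)           # FALSE (error caught, no error raised)
is.defined(never.before.seen$anything)  # FALSE
```

Note the cost: the expression is actually evaluated, so side effects in it will run, and an existing variable whose value happens to be NULL also reports FALSE.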
Re: [R] smooth: differences between R and S-PLUS
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Nicola Sturaro Sommacal (Quantide srl) Sent: Wednesday, November 03, 2010 10:41 AM To: r-help@r-project.org Subject: [R] smooth: differences between R and S-PLUS

Hi! I am studying differences between the R and S-PLUS smooth() functions. I know from the help that they work differently, so I ask: - does a package exist that permits obtaining the same results? - alternatively, does someone know how I can obtain the same results in R, using a self-made script? I know that S-PLUS uses the 4(3RSR)2H running median smoothing and I tried to implement it with the code below. I obtain some results equal to the S-PLUS ones, so I think the main problem is understanding how NA values from the moving median are treated.

The R result is:
[1] NA NA 4.6250 4.9375 4.7500 4. 3.2500 3.
[9] 2.8750 2.5000 2.1250

the S-PLUS one is:
[1] * * * 4.6250 4.9375 4.7500 4. 3.2500 3.**

In S+ I get:

x1 <- c(4, 1, 3, 6, 6, 4, 1, 6, 2, 4, 2)
smooth(x1)
 1: 2.404297 3.283203 4.140625 4.789063 5.093750 4.886719
 7: 4.078125 3.269531 3.00 3.00 3.00
 start deltat frequency
     1      1         1
smooth(x1, twiceit=FALSE)
 1: 2.03125 3.0 3.93750 4.62500 4.93750 4.75000 4.0
 8: 3.25000 3.0 3.0 3.0
 start deltat frequency
     1      1         1

Tukey's EDA book (1977) may give the details on how to deal with the ends. The code in S+ is unchanged since that era, aside from being converted from single to double precision. There are many better smoothers out there.

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

where * stands for a number different from the R one that I don't remember. Unfortunately I cannot give more details about the S-PLUS function now, because I am working on a machine without this software. If someone can help me, tomorrow (CET time) I will provide more details. Thanks in advance. Nicola

### EXAMPLE # Comments indicate which step of the 4(3RSR)2H algorithm I try to replicate.
# Data
x1 = c(4, 1, 3, 6, 6, 4, 1, 6, 2, 4, 2)

# 4
out = NULL
for (i in 1:11) {out[i] = median(x1[i:(i+3)])}
out[is.na(out)] = x1[is.na(out)]
out

# (3RSR)
x2 = smooth(out, "3RSR", twiceit = FALSE)
x2

# 2
out2 = NULL
for (i in 1:11) {out2[i] = median(x2[i:(i+1)])}
out2[is.na(out2)] = x2[is.na(out2)]
out2

# H
filter(out2, filter = c(1/4, 1/2, 1/4), sides = 2)

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] programming questions
alas, should R not come with an is.defined() function?

?exists

a variable may never have been created, and this is different from a variable existing but holding a NULL. this can be the case in the global environment or in a data frame. is.null(never.before.seen) Error: object 'never.before.seen' not found is.defined(never.before.seen) ## I need this, because I do not want an error: [1] FALSE

exists("never.before.seen") # notice the quotes
[1] FALSE

your acs function doesn't really do what I want, either, because { d=data.frame( x=1:4); exists(acs(d$x)) } tells me FALSE . I really need d <- data.frame( x=1:5, y=1:5 ) is.defined(d$x) TRUE

with(d, exists("x"))

is.defined(d$z) FALSE

with(d, exists("z"))

is.defined(never.before.seen) FALSE

exists("never.before.seen")

is.defined(never.before.seen$anything) ## if a list does not exist, anything in it does not exist either FALSE

This one I'm a bit confused about. If you're programming a function, then the user either: 1) passes in an object, which is bound to a local variable, and therefore exists. You can do checks on that object to see that it conforms to any constraints you have set. 2) does not pass in the object, in which case you can test for that with ?missing. Is writing your own functions for others to use what you're doing? --Erik

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] deleting all but some observations in data.frame
Hi, I am sure that can be done in R. How would I delete all but, say, 20 observations in a data.frame? Thank you, M [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] deleting all but some observations in data.frame
It depends on which 20 you want. If you have a data.frame called 'test.df', you can do:

# first 20
test.df[1:20, ]
# -or-
head(test.df, 20)

# random 20
test.df[sample(nrow(test.df), 20), ]

None of this was tested, but it should be a start. --Erik

Matevž Pavlič wrote: Hi, I am sure that can be done in R. How would I delete all but, say, 20 observations in a data.frame? Thank you, M [[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] deleting all but some observations in data.frame
Note that these methods don't 'delete' observations. They all create brand new objects that are subsets of the test.df object. You can effectively 'delete' the observations by replacing the original data.frame with the returned object... so

test.df <- head(test.df, 20)

Erik Iverson wrote: It depends on which 20 you want. If you have a data.frame called 'test.df', you can do: # first 20 test.df[1:20, ] -or- head(test.df, 20) # random 20 test.df[sample(nrow(test.df), 20), ] None of this was tested, but it should be a start. --Erik Matevž Pavlič wrote: Hi, I am sure that can be done in R. How would I delete all but, say, 20 observations in a data.frame? Thank you, M [[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] programming questions
On Nov 3, 2010, at 3:32 PM, ivo welch wrote: thanks, barry and eric. I didn't do a good job---I did an awful job. is.defined(never.before.seen$anything) ## if a list does not exist, anything in it does not exist either

Except the $ function returns NULL rather than an error, and you already said you were willing to accept a NULL value as being different than not-existing. You may want to look at the difference between `$` and `[` methods of accessing values. You can test for never.before.seen as an object:

is.defined <- function(x) !("try-error" %in% class(try(x)))

But it won't give your desired result on d$never.before.seen, which does not throw an error. For that you would need an additional test of the sort Iverson is suggesting. -- David.

FALSE how would I define this function? regards, /iaw

On Wed, Nov 3, 2010 at 2:48 PM, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote: On Wed, Nov 3, 2010 at 6:17 PM, ivo welch ivo.we...@gmail.com wrote: yikes. this is all my fault. it was the first thing that I ever defined when I started using R. is.defined <- function(name) exists(as.character(substitute(name))) I presume there is something much better... You didn't do a good job testing your is.defined :) Let's see what happens when you feed it 'nonexisting$garbage'. What gets passed into 'exists'?

acs=function(name){as.character(substitute(name))}
acs(nonexisting$garbage)
[1] "$"           "nonexisting" "garbage"

- and then your exists test is doing effectively exists("$"), which exists. Hence TRUE. What you are getting here is the expression parsed up as a function call ($) and its args.
You'll see this if you do:

acs(fix(me))
[1] "fix" "me"

Perhaps you meant to deparse it:

acs=function(name){as.character(deparse(substitute(name)))}
acs(nonexisting$garbage)
[1] "nonexisting$garbage"
exists(acs(nonexisting$garbage))
[1] FALSE

But you'd be better off testing list elements with is.null Barry

David Winsemius, MD West Hartford, CT
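[Editor's note] For readers who want to run the two variants side by side, a small self-contained sketch; the names acs/acs2 follow the thread, and 'nonexisting$garbage' is deliberately undefined:

```r
acs  <- function(name) as.character(substitute(name))  # splits the call into pieces
acs2 <- function(name) deparse(substitute(name))       # keeps the expression whole

acs(nonexisting$garbage)   # "$" "nonexisting" "garbage"
acs2(nonexisting$garbage)  # "nonexisting$garbage"
exists(acs2(nonexisting$garbage))  # FALSE, as desired
```

The difference is that as.character() applied to an unevaluated call decomposes it into the function (`$`) and its arguments, while deparse() turns the whole expression back into one string.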
Re: [R] optim works on command-line but not inside a function
Damokun wrote: Dear all, I am trying to optimize a logistic function using optim, inside the following functions:

#Estimating a and b from thetas and outcomes by ML
IRT.estimate.abFromThetaX <- function(t, X, inits, lw=c(-Inf,-Inf), up=rep(Inf,2)){
  optRes <- optim(inits, method="L-BFGS-B", fn=IRT.llZetaLambdaCorrNan,
                  gr=IRT.gradZL, lower=lw, upper=up, t=t, X=X)
  c(optRes$par[2], -(optRes$par[1]/optRes$par[2]) )
}

#Estimating a and b from thetas and outcomes by ML, avoiding 0*log(0)
IRT.estimate.abFromThetaX2 <- function(tar, Xes, inits, lw=c(-Inf,-Inf), up=rep(Inf,2)){
  optRes <- optim(inits, method="L-BFGS-B", fn=IRT.llZetaLambdaCorrNan,
                  gr=IRT.gradZL, lower=lw, upper=up, t=tar, X=Xes)
  c(optRes$par[2], -(optRes$par[1]/optRes$par[2]) )
}

The problem is that this does not work:

IRT.estimate.abFromThetaX(sx, st, c(0,0))
Error in optim(inits, method = "L-BFGS-B", fn = IRT.llZetaLambdaCorrNan, :
  L-BFGS-B needs finite values of 'fn'

But if I try the same optim call on the command line, with the same data, it works fine:

optRes <- optim(c(0,0), method="L-BFGS-B", fn=IRT.llZetaLambdaCorrNan,
+ gr=IRT.gradZL,
+ lower=c(-Inf, -Inf), upper=c(Inf, Inf), t=st, X=sx)
optRes
$par
[1] -0.6975157  0.7944972
$convergence
[1] 0
$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"

In your command line you have set t=st and X=sx. However in the alternative you do: IRT.estimate.abFromThetaX(sx, st, c(0,0)) Therefore you are assigning sx to t and st to X in the IRT.estimate.abFromThetaX function, which is reversed from your command-line call. You should switch sx and st in the function call: IRT.estimate.abFromThetaX(st, sx, c(0,0)) and then all will be well. best Berend

-- View this message in context: http://r.789695.n4.nabble.com/optim-works-on-command-line-but-not-inside-a-function-tp3025414p3026099.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] programming questions
thanks, eric---I need a little more clarification. (yes, I write functions and then forget them, so I want them to be self-sufficient. I want to write functions that check all their arguments for validity.) For example,

my.fn <- function( mylist ) {
  stop.if.not( is.defined(mylist) ) ## ok, superfluous
  stop.if.not( is.defined(mylist$dataframe.in.mylist) )
  stop.if.not( is.defined(mylist$dataframe.in.mylist$a.component.I.need) )
  ### other checks, such as whether the component I need is long enough, positive, etc.
  ### could be various other operations
  mylist$dataframe.in.mylist$a.component.I.need
}

so

my.fn( asd ) ## R gives me an error, asd is not in existence
my.fn( NULL ) ## second error: the list component 'dataframe.in.mylist' I need is not there
my.fn( data.frame( some.other.component=1:4 ) ) ## second error; the list component 'dataframe.in.mylist' I need is not there
my.fn( list( hello=1, silly=data.frame( x=1:4 ) ) ) ## second error: dataframe.in.mylist does not exist
my.fn( list( hello=2, dataframe.in.mylist=data.frame( a.component.I.need=1:4 ) ) ) ## ok

exists() works on a stringified variable name. how do I stringify in R? PS: btw, is it possible to weave documentation into my user function, so that I can type ?is.defined and I get a doc page that I have written? à la Perl POD. I think I asked this before, and the answer was no. /iaw Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com) CV Starr Professor of Economics (Finance), Brown University http://www.ivo-welch.info/

On Wed, Nov 3, 2010 at 3:40 PM, Erik Iverson er...@ccbr.umn.edu wrote: alas, should R not come with an is.defined() function? ?exists a variable may never have been created, and this is different from a variable existing but holding a NULL. this can be the case in the global environment or in a data frame.
is.null(never.before.seen)
Error: object 'never.before.seen' not found
is.defined(never.before.seen) ## I need this, because I do not want an error:
[1] FALSE
exists("never.before.seen") # notice the quotes
[1] FALSE

your acs function doesn't really do what I want, either, because { d=data.frame( x=1:4); exists(acs(d$x)) } tells me FALSE. I really need

d <- data.frame( x=1:5, y=1:5 )
is.defined(d$x)
TRUE
with(d, exists("x"))
is.defined(d$z)
FALSE
with(d, exists("z"))
is.defined(never.before.seen)
FALSE
exists("never.before.seen")
is.defined(never.before.seen$anything) ## if a list does not exist, anything in it does not exist either
FALSE

This one I'm a bit confused about. If you're programming a function, then the user either: 1) passes in an object, which is bound to a local variable, and therefore exists. You can do checks on that object to see that it conforms to any constraints you have set. 2) does not pass in the object, in which case you can test for that with ?missing. Is writing your own functions for others to use what you're doing? --Erik
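[Editor's note] One possible is.defined() that behaves as the wish list above asks is sketched below. It is illustrative only (not from the thread): it combines exists() for top-level names with a try()-guarded evaluation for $-expressions.

```r
is.defined <- function(x) {
  expr <- substitute(x)   # the unevaluated argument
  env  <- parent.frame()  # where to look it up
  if (is.call(expr) && identical(expr[[1]], as.name("$"))) {
    # list/data.frame element: evaluate quietly; a missing top-level
    # object raises an error, a missing element yields NULL
    val <- try(eval(expr, env), silent = TRUE)
    !inherits(val, "try-error") && !is.null(val)
  } else {
    exists(deparse(expr), envir = env)
  }
}

d <- data.frame(x = 1:5, y = 1:5)
is.defined(d$x)                        # TRUE
is.defined(d$z)                        # FALSE
is.defined(never.before.seen)          # FALSE
is.defined(never.before.seen$anything) # FALSE
```

This matches the four desired outcomes quoted above; it deliberately treats a NULL element the same as a missing one, which is the convention the thread settles on.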
Re: [R] NFFT on a Zoo?
On 11/03/2010 12:27 PM, Gabor Grothendieck wrote: On Wed, Nov 3, 2010 at 2:59 PM, Bob Cunningham flym...@gmail.com wrote: I have an irregular time series in a Zoo object, and I've been unable to find any way to do an FFT on it. More precisely, I'd like to do an NFFT (non-equispaced / non-uniform time FFT) on the data. The data is timestamped samples from a cheap self-logging accelerometer. The data is weakly regular, with the following characteristics:
- short gaps every ~20ms
- large gaps every ~200ms
- jitter/noise in the timestamp
The gaps cover ~10% of the acquisition time. And they occur often enough that the uninterrupted portions of the data are too short to yield useful individual FFT results, even without timestamp noise. My searches have revealed no NFFT support in R, but I'm hoping it may be known under some other name (just as non-uniform time series are known as 'zoo' rather than 'nts' or 'nuts'). I'm using R through RPy, so any solution that makes use of numpy/scipy would also work. And I care more about accuracy than speed, so a non-library solution in R or Python would also work. Alternatively, is there a technique by which multiple FFTs over smaller (incomplete) data regions may be combined to yield an improved view of the whole? My experiments have so far yielded only useless results, but I'm getting ready to try PCA across the set of partial FFTs. Check out the entire thread that starts here. http://www.mail-archive.com/r-help@r-project.org/msg36349.html Thanks for the instantaneous reply! While I couldn't follow the details of that discussion, it seems the periodogram is intended to detect weak periodic signals in irregular data. What I have is occasional strong signals of varying amplitude, duration and spectrum (events) in irregular data. Are the two cases equivalent from the periodogram perspective?
Each event looks like an impulse with decaying oscillation (a smack followed by a fading ring), where the initial impulse can sometimes saturate the device. I also don't yet know the bandwidth of the device: I do know that samples are taken at a nominal rate of 640 Hz, and I have about 4 million samples. My initial goal is to determine the accuracy of the timestamps: Is the jitter in the time values real or not? My initial plan was to do 2 NFFTs: One with unmodified data, and one with the time quantized (gridded) to periods of 1/640 Hz. If the gridded FFT is 'sharper', then I'll know the time jitter is meaningless. My secondary goal is to determine the signal bandwidth, with the hope of using a slower sampling rate, since rates of 160 Hz and below are free of gaps. After that, I need to compare data from two devices taken under nominally identical conditions, to see if they record events equivalently (magnitude, duration, and frequency spectrum), with the goal being to determine a relative calibration for the pair of devices. Finally, I'll take more data with the devices in separate (but linked) locations, to determine the mechanical characteristics of that environment. After applying the calibration determined above, I'll again want to compare the events to quantify their differences. Should I be pursuing analytical methods other than the NFFT? -BobC __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] FW: optim works on command-line but not inside a function
Well, the function should not be able to be infinite, as IRT.llZetaLambdaCorrNan is a sum of products of either one or zero and log(1+exp(x)) or log(1+exp(-x)) (these logs are always greater than or equal to log(1)=0). Furthermore, I bounded x to be finite to fix my problem (as I expected that it might try x -> Inf). But this did not help. And it is a mystery to me why it would work on the command line, and not as part of a function (it is just one call, and exactly the same one too). (I tried this in order to find proper intervals and start values where the error would not arise. But to my surprise it just gave normal values when I used the same settings as in the function.) Thanks, Diederik On 3 November 2010 15:41, Roijers, D.M. d.m.roij...@students.uu.nl wrote: --- *From:* Jonathan P Daily [SMTP:jda...@usgs.gov] *Sent:* Wednesday, November 03, 2010 3:26:09 PM *To:* Damokun *Cc:* r-help@r-project.org; r-help-boun...@r-project.org *Subject:* Re: [R] optim works on command-line but not inside a function *Auto forwarded by a Rule* As the error message says, the values of your function must be finite in order to run the algorithm. Some part of your loop is passing arguments (inits maybe... you only tried (0,0) in the CLI example) that cause IRT.llZetaLambdaCorrNan to be infinite. -- Jonathan P. Daily Technician - USGS Leetown Science Center 11649 Leetown Road Kearneysville WV, 25430 (304) 724-4480 Is the room still a room when its empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it.
- Jubal Early, Firefly From: Damokun dmroi...@students.cs.uu.nl To: r-help@r-project.org Date: 11/03/2010 10:19 AM Subject: [R] optim works on command-line but not inside a function Sent by: r-help-boun...@r-project.org --

Dear all, I am trying to optimize a logistic function using optim, inside the following functions:

#Estimating a and b from thetas and outcomes by ML
IRT.estimate.abFromThetaX <- function(t, X, inits, lw=c(-Inf,-Inf), up=rep(Inf,2)){
  optRes <- optim(inits, method="L-BFGS-B", fn=IRT.llZetaLambdaCorrNan,
                  gr=IRT.gradZL, lower=lw, upper=up, t=t, X=X)
  c(optRes$par[2], -(optRes$par[1]/optRes$par[2]) )
}

#Estimating a and b from thetas and outcomes by ML, avoiding 0*log(0)
IRT.estimate.abFromThetaX2 <- function(tar, Xes, inits, lw=c(-Inf,-Inf), up=rep(Inf,2)){
  optRes <- optim(inits, method="L-BFGS-B", fn=IRT.llZetaLambdaCorrNan,
                  gr=IRT.gradZL, lower=lw, upper=up, t=tar, X=Xes)
  c(optRes$par[2], -(optRes$par[1]/optRes$par[2]) )
}

The problem is that this does not work:

IRT.estimate.abFromThetaX(sx, st, c(0,0))
Error in optim(inits, method = "L-BFGS-B", fn = IRT.llZetaLambdaCorrNan, :
  L-BFGS-B needs finite values of 'fn'

But if I try the same optim call on the command line, with the same data, it works fine:

optRes <- optim(c(0,0), method="L-BFGS-B", fn=IRT.llZetaLambdaCorrNan,
+ gr=IRT.gradZL,
+ lower=c(-Inf, -Inf), upper=c(Inf, Inf), t=st, X=sx)
optRes
$par
[1] -0.6975157  0.7944972
$convergence
[1] 0
$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"

Does anyone have an idea what this could be, and what I could try to avoid this error? I tried bounding the parameters, with lower=c(-10, -10) and upper=... but that made no difference. Thanks, Diederik Roijers Utrecht University MSc student. -- PS: the other functions I am using are: #IRT.p is the function that represents the probability #of a positive outcome of an item with difficulty b, #discriminativity a, in combination with a student with #competence theta.
IRT.p <- function(theta, a, b){
  epow <- exp(-a*(theta-b))
  result <- 1/(1+epow)
  result
}

# = IRT.p^-1 ; for usage in the loglikelihood
IRT.oneOverP <- function(theta, a, b){
  epow <- exp(-a*(theta-b))
  result <- (1+epow)
  result
}

# = (1-IRT.p)^-1 ; for usage in the loglikelihood
IRT.oneOverPneg <- function(theta, a, b){
  epow <- exp(a*(theta-b))
  result <- (1+epow)
  result
}

#simulation-based sample generation of thetas and outcomes
#based on a given a and b. (See IRT.p) The sample size is n
IRT.generateSample <- function(a, b, n){
  x <- rnorm(n, mean=b, sd=b/2)
  t <- IRT.p(x,a,b)
  ch <- runif(length(t))
  t[t>=ch] <- 1
  t[t<ch] <- 0
  cbind(x,t)
}

#This loglikelihood function is based on the a and b parameters,
#and requires thetas as input in X, and outcomes in t
#prone to give NaN errors due to 0*log(0)
IRT.logLikelihood2 <- function(params, t, X){
  pos <- sum(t * log(IRT.p(X,params[1],params[2])))
  neg <- sum( (1-t) * log( (1-IRT.p(X,params[1],params[2])) ) )
  -pos-neg
}

#Avoiding NaN problems due to 0*log(0)
#otherwise equivalent to IRT.logLikelihood2
IRT.logLikelihood2CorrNan <- function(params, t, X){
  pos <- sum(t *
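[Editor's note] As an aside, the 0*log(0) problem these functions work around can also be avoided entirely with R's built-in plogis(..., log.p = TRUE). A hedged sketch follows; the function name IRT.negLogLik and the simulated data are illustrative, not from the thread:

```r
# Numerically stable negative log-likelihood for the 2PL logistic model
# p = 1/(1 + exp(-a*(theta - b))).  plogis(eta, log.p = TRUE) returns
# log(p) directly, so log(0) and 0*log(0) are never formed.
IRT.negLogLik <- function(params, t, X) {
  a <- params[1]; b <- params[2]
  eta <- a * (X - b)
  -sum(t * plogis(eta, log.p = TRUE) + (1 - t) * plogis(-eta, log.p = TRUE))
}

set.seed(42)
X <- rnorm(200)                                # simulated abilities
t <- rbinom(200, 1, plogis(1.5 * (X - 0.2)))   # simulated outcomes
fit <- optim(c(0, 0), IRT.negLogLik, t = t, X = X, method = "BFGS")
fit$par  # finite estimates, no NaN warnings
```

Because log P(t=0) is plogis(-eta, log.p = TRUE), both branches stay finite even when eta is very large or very small.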
Re: [R] programming questions
On Wed, Nov 3, 2010 at 1:04 PM, ivo welch ivo.we...@gmail.com wrote: thanks, eric---I need a little more clarification. (yes, I write functions and then forget them, so I want them to be self-sufficient. I want to write functions that check all their arguments for validity.) For example,

my.fn <- function( mylist ) {
  stop.if.not( is.defined(mylist) ) ## ok, superfluous
  stop.if.not( is.defined(mylist$dataframe.in.mylist) )
  stop.if.not( is.defined(mylist$dataframe.in.mylist$a.component.I.need) )
  ### other checks, such as whether the component I need is long enough, positive, etc.
  ### could be various other operations
  mylist$dataframe.in.mylist$a.component.I.need
}

See the Arguments class in R.utils, e.g.

library(R.utils);

my.fn <- function(mylist) {
  # Assert that a data.frame element exists
  df <- Arguments$getInstanceOf(mylist$dataframe.in.mylist, "data.frame");

  # Assert x >= 0 and length in [45,67].
  x <- df$a.component.I.need;
  x <- Arguments$getDoubles(x, range=c(0,Inf), length=c(45,67));

  ### could be various other operations
  mylist$dataframe.in.mylist$a.component.I.need
}

/Henrik

so

my.fn( asd ) ## R gives me an error, asd is not in existence
my.fn( NULL ) ## second error: the list component 'dataframe.in.mylist' I need is not there
my.fn( data.frame( some.other.component=1:4 ) ) ## second error; the list component 'dataframe.in.mylist' I need is not there
my.fn( list( hello=1, silly=data.frame( x=1:4 ) ) ) ## second error: dataframe.in.mylist does not exist
my.fn( list( hello=2, dataframe.in.mylist=data.frame( a.component.I.need=1:4 ) ) ) ## ok

exists() works on a stringified variable name. how do I stringify in R? PS: btw, is it possible to weave documentation into my user function, so that I can type ?is.defined and I get a doc page that I have written? à la Perl POD. I think I asked this before, and the answer was no.
/iaw Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com) CV Starr Professor of Economics (Finance), Brown University http://www.ivo-welch.info/ On Wed, Nov 3, 2010 at 3:40 PM, Erik Iverson er...@ccbr.umn.edu wrote: alas, should R not come with an is.defined() function? ?exists a variable may never have been created, and this is different from a variable existing but holding a NULL. this can be the case in the global environment or in a data frame.

is.null(never.before.seen)
Error: object 'never.before.seen' not found
is.defined(never.before.seen) ## I need this, because I do not want an error:
[1] FALSE
exists("never.before.seen") # notice the quotes
[1] FALSE

your acs function doesn't really do what I want, either, because { d=data.frame( x=1:4); exists(acs(d$x)) } tells me FALSE. I really need

d <- data.frame( x=1:5, y=1:5 )
is.defined(d$x)
TRUE
with(d, exists("x"))
is.defined(d$z)
FALSE
with(d, exists("z"))
is.defined(never.before.seen)
FALSE
exists("never.before.seen")
is.defined(never.before.seen$anything) ## if a list does not exist, anything in it does not exist either
FALSE

This one I'm a bit confused about. If you're programming a function, then the user either: 1) passes in an object, which is bound to a local variable, and therefore exists. You can do checks on that object to see that it conforms to any constraints you have set. 2) does not pass in the object, in which case you can test for that with ?missing. Is writing your own functions for others to use what you're doing? --Erik
Re: [R] NFFT on a Zoo?
From: ggrothendi...@gmail.com Date: Wed, 3 Nov 2010 15:27:13 -0400 To: flym...@gmail.com CC: r-help@r-project.org; rpy-l...@lists.sourceforge.net Subject: Re: [R] NFFT on a Zoo? On Wed, Nov 3, 2010 at 2:59 PM, Bob Cunningham wrote: I have an irregular time series in a Zoo object, and I've been unable to find any way to do an FFT on it. More precisely, I'd like to do an NFFT (non-equispaced / non-uniform time FFT) on the data. The data is timestamped samples from a cheap self-logging accelerometer. The data is weakly regular, with the following characteristics:
- short gaps every ~20ms
- large gaps every ~200ms
- jitter/noise in the timestamp
The gaps cover ~10% of the acquisition time. And they occur often enough that the uninterrupted portions of the data are too short to yield useful individual FFT results, even without timestamp noise. My searches have revealed no NFFT support in R, but I'm hoping it may be known under some other name (just as non-uniform time series are known as 'zoo' rather than 'nts' or 'nuts'). I'm using R through RPy, so any solution that makes use of numpy/scipy would also work. And I care more about accuracy than speed, so a non-library solution in R or Python would also work. Alternatively, is there a technique by which multiple FFTs over smaller (incomplete) data regions may be combined to yield an improved view of the whole? My experiments have so far yielded only useless results, but I'm getting ready to try PCA across the set of partial FFTs. I'm pretty sure all of this is in Oppenheim and Schafer, meaning it is also in any newer books. I recall something about averaging but you'd need to look at details. Alternatively, and this is from distant memory so maybe someone else can comment, you can just feed a regularly spaced time series to anyone, go get FFTW for example, and insert zeroes for missing data. This is equivalent to multiplying your real data with a window function that is zero at missing points.
I think you can prove that multiplication in the time domain is convolution in the FT domain, so you can back this out by deconvolving with your window function spectrum. This probably is not painless, the window spectrum will have badly placed zeroes etc, but it may be helpful. Apparently this is still a bit of an open issue, http://books.google.com/books?id=BW1PdOqZo6ACpg=PA2lpg=PA2dq=dft+window+missing+datasource=blots=fSY-iRoCNNsig=30cC0SdkrDcp62iWc-Mv26mfNjIhl=enei=AMTRTNmyMYP88AauxtzKDAsa=Xoi=book_resultct=resultresnum=6ved=0CDEQ6AEwBTgK#v=onepageqf=false You should be able to do the case of a sine wave with pencil and paper and see if or how this really would work. Check out the entire thread that starts here. http://www.mail-archive.com/r-help@r-project.org/msg36349.html -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
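[Editor's note] The zero-insertion idea above is easy to try in base R. A toy sketch with an assumed 640 Hz rate and a 50 Hz tone (illustrative only, not the poster's data):

```r
# Sample a pure tone on a regular grid, knock out ~10% of the samples,
# zero-fill them (equivalent to a 0/1 window), and compare FFT peaks.
n  <- 1024; fs <- 640                       # assumed sample count and rate
tt <- (0:(n - 1)) / fs
x  <- sin(2 * pi * 50 * tt)                 # pure 50 Hz tone

set.seed(1)
gap <- sample(n, n %/% 10)                  # ~10% missing points
xg  <- x; xg[gap] <- 0                      # zero-fill the gaps

f <- (0:(n - 1)) * fs / n                   # frequency axis in Hz
peak_full <- f[which.max(Mod(fft(x))[1:(n %/% 2)])]
peak_gap  <- f[which.max(Mod(fft(xg))[1:(n %/% 2)])]
c(peak_full, peak_gap)                      # both peaks land at/near 50 Hz
```

The zeroed samples lower the peak and raise the noise floor, but for a strong tone the peak location survives, which is the point of the suggestion.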
[R] Loop
Hi all, I managed to do what I want (with the great help of this mailing list) manually. Now I would like to automate it. I would probably need a for loop to help me with this... but of course I have no idea how to do that in R. Below is the code that I would like to be replicated a number of times (let's say 20). I would like w1 to change to w2, w3, w4 ... up to w20 and thereby create 20 data.frames that I would then bind together with cbind. (I did it like shown below - manually.)

w1 <- table(lit$W1)
w1 <- as.data.frame(w1)
write.table(w1, file="w1.csv", sep=";", row.names=TRUE, dec=".")
w1 <- w1[order(w1$Freq, decreasing=TRUE),]
w1 <- head(w1, 20)

w2 <- table(lit$W2)
w2 <- as.data.frame(w2)
write.table(w2, file="w2.csv", sep=";", row.names=TRUE, dec=".")
w2 <- w2[order(w2$Freq, decreasing=TRUE),]
w2 <- head(w2, 20)
. . .

Thanks for the help, m
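[Editor's note] A hedged sketch of one way to automate this with lapply() instead of 20 hand-written copies. The 'lit' data here is mock data with assumed columns W1...W20; adjust names and file output to taste:

```r
# mock stand-in for the poster's 'lit' data frame: 20 word columns
set.seed(1)
lit <- as.data.frame(matrix(sample(letters, 100 * 20, replace = TRUE), ncol = 20))
names(lit) <- paste0("W", 1:20)

# one top-20 frequency table per column, then bind them side by side
top20 <- lapply(names(lit), function(nm) {
  w <- as.data.frame(table(lit[[nm]]))
  w <- w[order(w$Freq, decreasing = TRUE), ]
  head(w, 20)
})
result <- do.call(cbind, top20)  # 20 rows; two columns (level + Freq) per W
# write.table(result, file = "w_all.csv", sep = ";", row.names = TRUE, dec = ".")
```

The per-column write.table calls from the original code could go inside the lapply body if individual CSV files are still wanted.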
[R] avoiding too many loops - reshaping data
Hello! I have a data frame like this one:

mydf <- data.frame(city=c("a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b"),
brand=c("x","x","y","y","z","z","z","z","x","x","x","y","y","y","z","z"),
value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
(mydf)

What I need to get is a data frame like the one below - cities as rows, brands as columns, and the sums of the value within each city/brand combination in the body of the data frame:

city  x  y   z
a     3 23 336
b     7 42 231

I have written code that involves multiple loops and subindexing - but it's taking too long. I am sure there must be a more efficient way of doing it. Thanks a lot for your hints! -- Dimitri Liakhovitski Ninah Consulting www.ninah.com
Re: [R] avoiding too many loops - reshaping data
Hadley's reshape package (google for it) can do this. There's a nice intro on the site.

library(reshape)
cast(melt(mydf, measure.vars = "value"), city ~ brand, fun.aggregate = sum)
  city  x  y   z
1    a  3 23 450
2    b 12 42 231

Although the numbers differ slightly? I've heard of the reshape2 package, but have no idea if that's replaced the reshape package yet. --Erik

Dimitri Liakhovitski wrote: Hello! I have a data frame like this one:

mydf <- data.frame(city=c("a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b"),
brand=c("x","x","y","y","z","z","z","z","x","x","x","y","y","y","z","z"),
value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
(mydf)

What I need to get is a data frame like the one below - cities as rows, brands as columns, and the sums of the value within each city/brand combination in the body of the data frame:

city  x  y   z
a     3 23 336
b     7 42 231

I have written code that involves multiple loops and subindexing - but it's taking too long. I am sure there must be a more efficient way of doing it. Thanks a lot for your hints!
Re: [R] avoiding too many loops - reshaping data
Try this:

xtabs(value ~ city + brand, mydf)

On Wed, Nov 3, 2010 at 6:23 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a data frame like this one:

mydf <- data.frame(city=c("a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b"),
brand=c("x","x","y","y","z","z","z","z","x","x","x","y","y","y","z","z"),
value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
(mydf)

What I need to get is a data frame like the one below - cities as rows, brands as columns, and the sums of the value within each city/brand combination in the body of the data frame:

city  x  y   z
a     3 23 336
b     7 42 231

I have written code that involves multiple loops and subindexing - but it's taking too long. I am sure there must be a more efficient way of doing it. Thanks a lot for your hints! -- Dimitri Liakhovitski Ninah Consulting www.ninah.com

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O
Re: [R] avoiding too many loops - reshaping data
Thanks a lot! Yes - I just found the reshape package too - and guess what, my math was wrong! reshape2 seems like the more up-to-date version of reshape. Wonder what's faster - xtabs or dcast... Dimitri

On Wed, Nov 3, 2010 at 4:32 PM, Henrique Dallazuanna www...@gmail.com wrote: Try this:

xtabs(value ~ city + brand, mydf)

On Wed, Nov 3, 2010 at 6:23 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a data frame like this one:

mydf <- data.frame(city=c("a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b"),
brand=c("x","x","y","y","z","z","z","z","x","x","x","y","y","y","z","z"),
value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
(mydf)

What I need to get is a data frame like the one below - cities as rows, brands as columns, and the sums of the value within each city/brand combination in the body of the data frame:

city  x  y   z
a     3 23 336
b     7 42 231

I have written code that involves multiple loops and subindexing - but it's taking too long. I am sure there must be a more efficient way of doing it. Thanks a lot for your hints! -- Dimitri Liakhovitski Ninah Consulting www.ninah.com

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O -- Dimitri Liakhovitski Ninah Consulting www.ninah.com
Re: [R] avoiding too many loops - reshaping data
In reshape2 this does the job:

dcast(mydf, city ~ brand, sum)

On Wed, Nov 3, 2010 at 4:37 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Thanks a lot! Yes - I just found the reshape package too - and guess what, my math was wrong! reshape2 seems like the more up-to-date version of reshape. Wonder what's faster - xtabs or dcast... Dimitri On Wed, Nov 3, 2010 at 4:32 PM, Henrique Dallazuanna www...@gmail.com wrote: Try this: xtabs(value ~ city + brand, mydf) On Wed, Nov 3, 2010 at 6:23 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a data frame like this one:

mydf <- data.frame(city=c("a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b"),
brand=c("x","x","y","y","z","z","z","z","x","x","x","y","y","y","z","z"),
value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))
(mydf)

What I need to get is a data frame like the one below - cities as rows, brands as columns, and the sums of the value within each city/brand combination in the body of the data frame:

city  x  y   z
a     3 23 336
b     7 42 231

I have written code that involves multiple loops and subindexing - but it's taking too long. I am sure there must be a more efficient way of doing it. Thanks a lot for your hints! -- Dimitri Liakhovitski Ninah Consulting www.ninah.com

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O -- Dimitri Liakhovitski Ninah Consulting www.ninah.com -- Dimitri Liakhovitski Ninah Consulting www.ninah.com
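[Editor's note] For completeness, the same cross-tabulated sums are available in base R with no extra packages, using the thread's mydf (correct sums: 3/23/450 for city a, 12/42/231 for city b):

```r
mydf <- data.frame(
  city  = c("a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b"),
  brand = c("x","x","y","y","z","z","z","z","x","x","x","y","y","y","z","z"),
  value = c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116))

# sum of value within each city/brand cell, as a plain matrix
sums <- tapply(mydf$value, list(mydf$city, mydf$brand), sum)
sums
#    x  y   z
# a  3 23 450
# b 12 42 231

xtabs(value ~ city + brand, mydf)  # same numbers, as a contingency table
```

tapply returns NA for empty city/brand cells, whereas xtabs returns 0; for this data every cell is populated, so the two agree.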
[R] multi-level cox ph with time-dependent covariates
Dear all, I would like to know if it is possible to fit in R a Cox ph model with time-dependent covariates and to account for hierarchical effects at the same time. Additionally, I'd also like to know if it would be possible to perform any feature selection on this model fit. I have a data set that is composed of multiple marker measurements (and hundreds of covariates) at different time points from different tissue samples of different patients. Suppose that the data were coming from an animal model with very few subjects (n=6) that were followed up after a pathogen exposure, measured several times, sampling different tissues on the same days, until a certain outcome was reached (or the outcome was censored). Suppose that the pathogen can vary over time (it might be a bacterium that selects for drug resistance) and that it can also vary across different tissue reservoirs within the same patient. In other words: names(data) = patient_id, start_time, stop_time, tissue_id, pathogen_type, marker1, ..., marker100, ..., outcome If I had multiple observations per patient at different time intervals, I would model it like this (hope it is correct):

model <- coxph(Surv(start_time, stop_time, outcome) ~ all_covariates + cluster(patient_id))

But now I have both the patient and the tissue, and hundreds of different variables. I thought I could use the coxme library, since it also has a ridge regression feature. Shall I then model nested random effects by considering both the patient_id and the tissue_id? Like

model <- coxme(Surv(start_time, stop_time, outcome) ~ covariates + (1 | patient_id/tissue_id))

Then, how could I shrink the coefficients in order to select a subset of them with non-negligible effects? May I also consider the possibility to run an AIC-based forward-backward selection? thanks and apologies if I am completely out of the trails, M.P.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] avoiding too many loops - reshaping data
Want to thank everyone once more for pointing in the reshape direction. Saved me about 16 hours of looping! Dimitri On Wed, Nov 3, 2010 at 4:38 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: In reshape2 this does the job: dcast(mydf, city ~ brand, sum) On Wed, Nov 3, 2010 at 4:37 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Thanks a lot! Yes - I just found the reshape package too - and guess what, my math was wrong! reshape2 seems like the more up-to-date version of reshape. Wonder what's faster - xtabs or dcast... Dimitri On Wed, Nov 3, 2010 at 4:32 PM, Henrique Dallazuanna www...@gmail.com wrote: Try this: xtabs(value ~ city + brand, mydf) On Wed, Nov 3, 2010 at 6:23 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a data frame like this one: mydf <- data.frame(city=c("a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b"), brand=c("x","x","y","y","z","z","z","z","x","x","x","y","y","y","z","z"), value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116)) (mydf) What I need to get is a data frame like the one below - cities as rows, brands as columns, and the sums of value within each city/brand combination in the body of the data frame: city x y z a 3 23 336 b 7 42 231 I have written code that involves multiple loops and subindexing - but it's taking too long. I am sure there must be a more efficient way of doing it. Thanks a lot for your hints! -- Dimitri Liakhovitski Ninah Consulting www.ninah.com 
-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O -- Dimitri Liakhovitski Ninah Consulting www.ninah.com
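For reference, the two base-R one-liners suggested in this thread can be run on the thread's example data as follows. The expected sums below are computed here rather than copied from the post, since the poster noted his own arithmetic was off:

```r
mydf <- data.frame(
  city  = rep(c("a", "b"), each = 8),
  brand = c("x","x","y","y","z","z","z","z","x","x","x","y","y","y","z","z"),
  value = c(1, 2, 11, 12, 111, 112, 113, 114, 3, 4, 5, 13, 14, 15, 115, 116)
)

# Contingency-table-style sum: cities as rows, brands as columns
tab1 <- xtabs(value ~ city + brand, mydf)

# Same result via tapply (the fastest option in the benchmark later in
# this thread)
tab2 <- tapply(mydf$value, mydf[c("city", "brand")], sum)

tab1
#      brand
# city   x  y   z
#    a   3 23 450
#    b  12 42 231
```

Either result can be wrapped in as.data.frame.matrix() if a plain data frame is needed.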
Re: [R] Loop
Hi, Thanks for the help and the manuals. Will come in very handy, I am sure. But regarding the code, I don't think this is what I want... basically I would like to repeat the code below: w1 <- table(lit$W1) w1 <- as.data.frame(w1) write.table(w1, file="w1.csv", sep=";", row.names=TRUE, dec=".") w1 <- w1[order(w1$Freq, decreasing=TRUE),] w1 <- head(w1, 20) 20 times, where W1-W20 (capital letters) are the fields in a data.frame called lit and w1-w20 are the data.frames being created. Hope that explains it better, m -Original Message- From: Patrick Burns [mailto:pbu...@pburns.seanet.com] Sent: Wednesday, November 03, 2010 9:30 PM To: Matevž Pavlič Subject: Re: [R] Loop If I understand properly, you'll want something like: lit[["w2"]] instead of lit$w2 more accurately: for(i in 1:20) { vari <- paste("w", i) lit[[vari]] ... } The two documents mentioned in my signature may help you. On 03/11/2010 20:23, Matevž Pavlič wrote: Hi all, I managed to do what I want (with the great help of this mailing list) manually. Now I would like to automate it. I would probably need a for loop to help me with this... but of course I have no idea how to do that in R. Below is the code that I would like to be replicated a number of times (let's say 20). I would like to achieve that w1 would change to w2, w3, w4 ... up to w20 and thereby create 20 data.frames that I would then bind together with cbind. (I did it like shown below - manually.) w1 <- table(lit$W1) w1 <- as.data.frame(w1) write.table(w1, file="w1.csv", sep=";", row.names=TRUE, dec=".") w1 <- w1[order(w1$Freq, decreasing=TRUE),] w1 <- head(w1, 20) w2 <- table(lit$W2) w2 <- as.data.frame(w2) write.table(w2, file="w2.csv", sep=";", row.names=TRUE, dec=".") w2 <- w2[order(w2$Freq, decreasing=TRUE),] w2 <- head(w2, 20) . . . Thanks for the help, m 
-- Patrick Burns pbu...@pburns.seanet.com http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
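Patrick's suggestion, fleshed out into a runnable sketch (the lit data frame is simulated here; only the column names W1-W20 follow the question, and the results are collected in a list rather than 20 separate variables, as David suggests later in the thread):

```r
# Simulated stand-in for the poster's `lit` data frame
set.seed(1)
lit <- as.data.frame(replicate(20, sample(letters, 100, replace = TRUE)),
                     stringsAsFactors = FALSE)
names(lit) <- paste("W", 1:20, sep = "")

tops <- list()
for (i in 1:20) {
  # Frequency table of column Wi, sorted, keeping the top 20 values
  wi <- as.data.frame(table(lit[[paste("W", i, sep = "")]]))
  wi <- wi[order(wi$Freq, decreasing = TRUE), ]
  tops[[i]] <- head(wi, 20)
  # Optionally write each one out, as in the original code:
  # write.table(wi, file = paste("w", i, ".csv", sep = ""),
  #             sep = ";", row.names = TRUE, dec = ".")
}

length(tops)   # 20 data frames, ready for do.call(cbind, tops) etc.
```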
[R] multivariate Poisson distribution
Hello, from a search of the archives and functions, I am looking for information on creating random correlated counts from a multivariate Poisson distribution. I cannot seem to find a function that does this. Perhaps it has not yet been created. Has anyone created an R package that does this? Thanks, Jourdan Gold 
[R] biding rows while merging at the same time
Hello! I have 2 data frames like this (well, actually, I have 200 of them): df1 <- data.frame(location=c("loc 1","loc 2","loc 3"), date=c("1/1/2010","1/1/2010","1/1/2010"), a=1:3, b=11:13, c=111:113) df2 <- data.frame(location=c("loc 1","loc 2","loc 3"), date=c("2/1/2010","2/1/2010","2/1/2010"), a=4:6, c=114:116, d=c(1,11,111)) (df1) (df2) I am trying to just rbind them, which is impossible, because not every column is present in every data frame. I can't merge them - merge(df1, df2, by.x="location", by.y="location", all.x=TRUE, all.y=TRUE) - because it kinda cbinds them. What I need is something that looks like this: location date a b c d loc 1 1/1/2010 1 11 111 NA loc 2 1/1/2010 2 12 112 NA loc 3 1/1/2010 3 13 113 NA loc 1 2/1/2010 3 NA 114 1 loc 2 2/1/2010 5 NA 115 11 loc 3 2/1/2010 6 NA 116 111 Thanks a lot for your suggestions! -- Dimitri Liakhovitski Ninah Consulting www.ninah.com 
Re: [R] biding rows while merging at the same time
Never mind - I found it in the reshape package: rbind.fill. I wonder if it's still in reshape2. Dimitri On Wed, Nov 3, 2010 at 5:34 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have 2 data frames like this (well, actually, I have 200 of them): df1 <- data.frame(location=c("loc 1","loc 2","loc 3"), date=c("1/1/2010","1/1/2010","1/1/2010"), a=1:3, b=11:13, c=111:113) df2 <- data.frame(location=c("loc 1","loc 2","loc 3"), date=c("2/1/2010","2/1/2010","2/1/2010"), a=4:6, c=114:116, d=c(1,11,111)) (df1) (df2) I am trying to just rbind them, which is impossible, because not every column is present in every data frame. I can't merge them - merge(df1, df2, by.x="location", by.y="location", all.x=TRUE, all.y=TRUE) - because it kinda cbinds them. What I need is something that looks like this: location date a b c d loc 1 1/1/2010 1 11 111 NA loc 2 1/1/2010 2 12 112 NA loc 3 1/1/2010 3 13 113 NA loc 1 2/1/2010 3 NA 114 1 loc 2 2/1/2010 5 NA 115 11 loc 3 2/1/2010 6 NA 116 111 Thanks a lot for your suggestions! -- Dimitri Liakhovitski Ninah Consulting www.ninah.com 
Re: [R] biding rows while merging at the same time
Just merge(df1, df2, all = TRUE) does it, yes? Dimitri Liakhovitski wrote: Hello! I have 2 data frames like this (well, actually, I have 200 of them): df1 <- data.frame(location=c("loc 1","loc 2","loc 3"), date=c("1/1/2010","1/1/2010","1/1/2010"), a=1:3, b=11:13, c=111:113) df2 <- data.frame(location=c("loc 1","loc 2","loc 3"), date=c("2/1/2010","2/1/2010","2/1/2010"), a=4:6, c=114:116, d=c(1,11,111)) (df1) (df2) I am trying to just rbind them, which is impossible, because not every column is present in every data frame. I can't merge them - merge(df1, df2, by.x="location", by.y="location", all.x=TRUE, all.y=TRUE) - because it kinda cbinds them. What I need is something that looks like this: location date a b c d loc 1 1/1/2010 1 11 111 NA loc 2 1/1/2010 2 12 112 NA loc 3 1/1/2010 3 13 113 NA loc 1 2/1/2010 3 NA 114 1 loc 2 2/1/2010 5 NA 115 11 loc 3 2/1/2010 6 NA 116 111 Thanks a lot for your suggestions! 
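The merge(all = TRUE) suggestion can be checked directly on the example data from the question. Because merge() defaults to joining on all shared column names (location, date, a, c) and no rows match across the two frames, the result stacks all six rows and pads the non-shared columns with NA:

```r
df1 <- data.frame(location = c("loc 1", "loc 2", "loc 3"),
                  date = "1/1/2010", a = 1:3, b = 11:13, c = 111:113)
df2 <- data.frame(location = c("loc 1", "loc 2", "loc 3"),
                  date = "2/1/2010", a = 4:6, c = 114:116, d = c(1, 11, 111))

m <- merge(df1, df2, all = TRUE)  # by = intersect(names(df1), names(df2))
m
# 6 rows; columns location, date, a, c, b, d;
# b is NA on df2's rows and d is NA on df1's rows
```

Note the key columns come first in the output, so the column order differs slightly from the layout sketched in the question.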
Re: [R] Loop
On Nov 3, 2010, at 5:03 PM, Matevž Pavlič wrote: Hi, Thanks for the help and the manuals. Will come in very handy, I am sure. But regarding the code, I don't think this is what I want... basically I would like to repeat the code below: w1 <- table(lit$W1) w1 <- as.data.frame(w1) It appears you are not reading for meaning. Burns has advised you how to construct column names and use them in your initial steps. The `$` function is quite limited in comparison to `[[`, so he was showing you a method that would be more effective. BTW the as.data.frame step is unnecessary, since the first thing write.table does is coerce an object to a data.frame. The write.table name is misleading. It should be write.data.frame. You cannot really write tables with write.table. You would also use: file=paste(vari, "csv", sep=".") as the file argument to write.table write.table(w1, file="w1.csv", sep=";", row.names=TRUE, dec=".") What are these next actions supposed to do after the file is written? Are you trying to store a group of related w objects that will later be indexed in sequence? If so, then a list would make more sense. -- David. w1 <- w1[order(w1$Freq, decreasing=TRUE),] w1 <- head(w1, 20) 20 times, where W1-W20 (capital letters) are the fields in a data.frame called lit and w1-w20 are the data.frames being created. Hope that explains it better, m -Original Message- From: Patrick Burns [mailto:pbu...@pburns.seanet.com] Subject: Re: [R] Loop If I understand properly, you'll want something like: lit[["w2"]] instead of lit$w2 more accurately: for(i in 1:20) { vari <- paste("w", i) lit[[vari]] ... } The two documents mentioned in my signature may help you. On 03/11/2010 20:23, Matevž Pavlič wrote: Hi all, I managed to do what I want (with the great help of this mailing list) manually. Now I would like to automate it. I would probably need a for loop to help me with this... but of course I have no idea how to do that in R. Below is the code that I would like to be replicated a number of times (let's say 20). 
I would like to achieve that w1 would change to w2, w3, w4 ... up to w20 and thereby create 20 data.frames that I would then bind together with cbind. (I did it like shown below - manually.) w1 <- table(lit$W1) w1 <- as.data.frame(w1) write.table(w1, file="w1.csv", sep=";", row.names=TRUE, dec=".") w1 <- w1[order(w1$Freq, decreasing=TRUE),] w1 <- head(w1, 20) w2 <- table(lit$W2) w2 <- as.data.frame(w2) write.table(w2, file="w2.csv", sep=";", row.names=TRUE, dec=".") w2 <- w2[order(w2$Freq, decreasing=TRUE),] w2 <- head(w2, 20) . . . Thanks for the help, m David Winsemius, MD West Hartford, CT 
Re: [R] biding rows while merging at the same time
On Nov 3, 2010, at 5:38 PM, Dimitri Liakhovitski wrote: Never mind - I found it in reshape package: rbind.fill I wonder if it's still in reshape2. Look in plyr. -- David. Dimitri __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Auto-killing processes spawned by foreach::doMC
Hi all, Sometimes I'll find myself ctrl-c-ing like a madman to kill some code that's parallelized via foreach/doMC when I realized that I just set my cpu off to do something boneheaded, and it will keep doing that thing for a while. In these situations, since I interrupted its normal execution, foreach/doMC doesn't clean up after itself by killing the processes that were spawned. Furthermore, I believe that when I quit my main R session, the spawned processes still remain (there, but idle). I can go in via terminal (or some task manager/activity monitor) and kill them manually, but does foreach (or something else (maybe multicore?)) keep track of the process IDs that it spawned? Is there some simple doCleanup() function I can write to get these R processes and kill them automagically? For what it's worth, I'm running on linux os x, R-2.12 and the latest versions of foreach/doMC/multicore (though, I feel like this has been true since I've started using foreach/doMC way back when). Thanks, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
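As far as I know, none of the packages involved documents an exported cleanup helper, so the sketch below is a hypothetical Unix-only workaround, not a foreach/doMC API: it looks up child processes of the current R session via pgrep (Linux/OS X) and signals them with tools::pskill. The multicore package's own children() and kill() functions are worth checking first, since they track the PIDs that multicore itself forked.

```r
# Hypothetical helper, NOT part of foreach/doMC: signal child processes of
# this R session, e.g. workers left behind after interrupting a %dopar% loop.
# Requires a Unix-like system with pgrep available.
kill_orphans <- function(sig = tools::SIGTERM) {
  me <- Sys.getpid()
  # pgrep -P lists direct children of a PID (exit status 1 if none found)
  kids <- suppressWarnings(system(paste("pgrep -P", me), intern = TRUE))
  for (p in kids) tools::pskill(as.integer(p), sig)
  invisible(kids)
}

orphans <- kill_orphans()
```

Since the shell spawned by system() may itself show up transiently in the listing, the returned vector is best treated as approximate.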
[R] how to handle 'g...@gtdata' ?
I have a few questions about GenABEL gwaa data. 1) Is there a universal way that most GenABEL people use to add more individuals to a 'gwaa' data object? For example, I have a 'gwaa' data object, but I need to add some dummy parents. For 'g...@phdata' it's easy to add these rows, but for 'g...@gtdata', I think I need to create SNP data as '0 0 0 0 0 ...' for all the dummy parents first. I am using the function 'convert.snp.ped', so I need a 'pedfile' of this format: #ped id fa mo sex trait snp1.allele1 snp1.allele2 snp2.allele1 snp2.allele2 ...# 1 1 0 0 1 2 0 0 0 0 ... 1 2 0 0 1 0 0 0 0 0 ... 1 3 0 0 2 1 0 0 0 0 ... . . 100 101 0 0 2 1 0 0 0 0 ... If we use the 1M microarray, usually, after QC, there will be ~800 thousand SNPs, so this file is really huge. I created this matrix in R and then tried to export it using write.table(pedfile, file='pedfile', col.names=F, row.names=F, quote=F), but it seems to take forever, because the matrix is too large. Can anyone tell me how to create a 'gwaa' data object efficiently? 2) Is there any way to add genotypic data to 'g...@gtdata' directly, without converting data of another format to 'g...@gtdata' first? Thank you very much! karena -- View this message in context: http://r.789695.n4.nabble.com/how-to-handle-gwaa-gtdata-tp3026206p3026206.html Sent from the R help mailing list archive at Nabble.com. 
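On the export-speed point only (this is generic base R, not GenABEL-specific): collapsing each ped row to a single string and writing everything at once with writeLines() is usually much faster than calling write.table() on a very wide matrix. A small-scale sketch with invented dimensions:

```r
# Invented small dimensions; the real file would have ~800k SNP columns
n_ind <- 100
n_snp <- 1000

geno <- matrix("0", nrow = n_ind, ncol = 2 * n_snp)  # dummy "0 0" genotypes
lead <- data.frame(ped = 1:n_ind, id = 101:200, fa = 0, mo = 0,
                   sex = 1, trait = 2)

# Collapse every row to one whitespace-separated line, then write in one go
rows <- do.call(paste, c(lead, as.data.frame(geno), sep = " "))
out  <- tempfile(fileext = ".ped")
writeLines(rows, out)
```

The do.call(paste, ...) idiom pastes all columns element-wise in one vectorized pass, which avoids write.table's per-cell formatting overhead.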
Re: [R] avoiding too many loops - reshaping data
Here is the summary of methods. tapply is the fastest! library(reshape) system.time(for(i in 1:1000) cast(melt(mydf, measure.vars = "value"), city ~ brand, fun.aggregate = sum)) user system elapsed 18.40 0.00 18.44 library(reshape2) system.time(for(i in 1:1000) dcast(mydf, city ~ brand, sum)) user system elapsed 12.36 0.02 12.37 system.time(for(i in 1:1000) xtabs(value ~ city + brand, mydf)) user system elapsed 2.45 0.00 2.47 system.time(for(i in 1:1000) tapply(mydf$value, mydf[c('city','brand')], sum)) user system elapsed 0.78 0.00 0.79 Dimitri On Wed, Nov 3, 2010 at 4:32 PM, Henrique Dallazuanna www...@gmail.com wrote: Try this: xtabs(value ~ city + brand, mydf) On Wed, Nov 3, 2010 at 6:23 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a data frame like this one: mydf <- data.frame(city=c("a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b"), brand=c("x","x","y","y","z","z","z","z","x","x","x","y","y","y","z","z"), value=c(1,2,11,12,111,112,113,114,3,4,5,13,14,15,115,116)) (mydf) What I need to get is a data frame like the one below - cities as rows, brands as columns, and the sums of value within each city/brand combination in the body of the data frame: city x y z a 3 23 336 b 7 42 231 I have written code that involves multiple loops and subindexing - but it's taking too long. I am sure there must be a more efficient way of doing it. Thanks a lot for your hints! -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O -- Dimitri Liakhovitski Ninah Consulting www.ninah.com 
[R] Rd installation (not markup language) primer?
I have a set of functions that I always load on startup - for example, there is my now infamous is.defined() function. I would like to add some documentation for these functions, so that I can do ?is.defined inside R. The documentation telling me how to mark up Rd files is very good, but I wonder how one installs them for access by the R executable (on OS X for me). Do I drop them into a special directory? Which ones are allowed? Do I need to package everything into a library, or can I just add Rd files for source'd files? Do I parse the Rd files for R, or does R parse the Rd files on demand? So, is there a primer on installing Rd files? /iaw Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com) 
Re: [R] multivariate Poisson distribution
Jourdan Gold jgold at uoguelph.ca writes: Hello, from a search of the archives and functions, I am looking for information on creating random correlated counts from a multivariate Poisson distribution. I cannot seem to find a function that does this. Perhaps it has not yet been created. Has anyone created an R package that does this? As far as I know this is a bit tricky (although I would be happy to hear of simple solutions). Two possibilities are (1) generate a multivariate normal distribution (e.g. MASS::mvrnorm), exponentiate it, and take Poisson deviates [hard to specify what the final correlation is]; (2) use copulas (library(sos); findFn("copula")). Haven't tried library(sos); findFn("correlated Poisson") but you could ... 
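Approach (1) from this reply can be sketched in a few lines (an illustration of the lognormal mixing idea, not a package recommendation; note the induced count correlation is attenuated relative to the latent normal correlation, which is the caveat in square brackets above):

```r
library(MASS)  # for mvrnorm; ships with R

set.seed(1)
Sigma <- matrix(c(1, 0.8, 0.8, 1), nrow = 2)   # latent correlation 0.8
z <- mvrnorm(10000, mu = c(0, 0), Sigma = Sigma)

lambda <- exp(z)                                # correlated Poisson means
counts <- matrix(rpois(length(lambda), lambda), ncol = 2)

cor(counts)[1, 2]   # positive, but noticeably below the latent 0.8
```

Getting a *prescribed* count correlation would require inverting this attenuation numerically, which is where the copula route becomes attractive.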
Re: [R] Rd installation (not markup language) primer?
On 03/11/2010 6:28 PM, ivo welch wrote: I have a set of functions that I always load on startup - for example, there is my now infamous is.defined() function. I would like to add some documentation for these functions, so that I can do ?is.defined inside R. The documentation telling me how to mark up Rd files is very good, but I wonder how one installs them for access by the R executable (on OS X for me). Do I drop them into a special directory? Which ones are allowed? Do I need to package everything into a library, or can I just add Rd files for source'd files? Do I parse the Rd files for R, or does R parse the Rd files on demand? So, is there a primer on installing Rd files? Writing R Extensions is the main documentation. You put them in the man directory of a package, and R does the rest when you install the package. Duncan Murdoch /iaw Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
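Duncan's answer can be made concrete with utils::package.skeleton(), which generates the whole package layout, including stub .Rd files in man/, from objects in the workspace (the is.defined() body here is an invented placeholder, not the poster's actual function):

```r
# Invented placeholder for the poster's function
is.defined <- function(x) exists(deparse(substitute(x)), parent.frame())

# Generate a package skeleton with a man/ directory of stub .Rd files
pkgdir <- tempfile("pkgs")
dir.create(pkgdir)
package.skeleton(name = "mytools", list = "is.defined", path = pkgdir)

# After filling in mytools/man/is.defined.Rd, install and use ?is.defined:
#   R CMD INSTALL mytools
list.files(file.path(pkgdir, "mytools"))
```

The stub Rd files contain placeholder sections that R CMD check will complain about until they are edited, which is a useful checklist for finishing the documentation.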