[R] ggplot not showing all the years on the x-axis
Dear R helpers, I am currently having hard time fixing the values on the x-axis of a plot with ggplot: even though I have 12 years, ggplot plots only 3 of them. Here is my example: library(ggplot2) ii - 2000:2011 ss - rnorm(12,0,1) pm - data.frame(ii,ss) tmpplot - ggplot(pm, aes(x = ii, y = ss)) plot - tmpplot + geom_line() plot In my case, ggplot reports on the year 2000, 2004 and 2008 on the x-axis, but I'd like to have all the years from 2000 to 2011. I know how to fix this with the standard plot in R, but for consistency I'd like to use ggplot. Can anyone help? thanks in advance, f. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ggplot not showing all the years on the x-axis
Hi,this is a question about how to set the scale,try this add a scale_x_discrete() like that: plot - tmpplot + geom_line()+scale_x_continuous(breaks=ii) Yao He 2013/1/8 Francesco Sarracino f.sarrac...@gmail.com: Dear R helpers, I am currently having hard time fixing the values on the x-axis of a plot with ggplot: even though I have 12 years, ggplot plots only 3 of them. Here is my example: library(ggplot2) ii - 2000:2011 ss - rnorm(12,0,1) pm - data.frame(ii,ss) tmpplot - ggplot(pm, aes(x = ii, y = ss)) plot - tmpplot + geom_line() plot In my case, ggplot reports on the year 2000, 2004 and 2008 on the x-axis, but I'd like to have all the years from 2000 to 2011. I know how to fix this with the standard plot in R, but for consistency I'd like to use ggplot. Can anyone help? thanks in advance, f. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- — Master candidate in 2rd year Department of Animal genetics breeding Room 436,College of Animial ScienceTechnology, China Agriculture University,Beijing,100193 E-mail: yao.h.1...@gmail.com —— __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem getting loess tricubic weights
Hi I am trying to get the tricube weights from the loess outputs as I need to calculate an error function which requires the weight. So I have used the following example from the R: cars.lo - loess(dist ~ speed, cars, span=0.5, degree=1, family=symmetric) Then i try to get the weights: cars.lo$weights [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 The results are all 1 so i dont think that the tricube weighting are set. May I know what other parameters do i need to tweak to set the weights to tricube weights? Thank you. -- Best regards Joyce Lin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot residuals per factor
Dear R-users, I want to plot residuals vs fitted for multiple groups with ggplot2. I try this code, but unsuccessful. library(plyr) models-dlply(dat1,d,function(df) mod-lm(y~x,data=df) ggplot(models,aes(.fitted,.resid), color=factor(d))+ geom_hline(yintercept=0,col=white,size=2)+ geom_point()+ geom_smooth(se=F) -- --- Catalin-Constantin ROIBU Forestry engineer, PhD Forestry Faculty of Suceava Str. Universitatii no. 13, Suceava, 720229, Romania office phone +4 0230 52 29 78, ext. 531 mobile phone +4 0745 53 18 01 +4 0766 71 76 58 FAX:+4 0230 52 16 64 silvic.usv.ro [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ggplot not showing all the years on the x-axis
Indeed, it works. I've been mumbling around with scale_x_discrete(), but without any success. Thanks a lot, Yao. You helped me a lot! f. On 8 January 2013 10:06, Yao He yao.h.1...@gmail.com wrote: Hi,this is a question about how to set the scale,try this add a scale_x_discrete() like that: plot - tmpplot + geom_line()+scale_x_continuous(breaks=ii) Yao He 2013/1/8 Francesco Sarracino f.sarrac...@gmail.com: Dear R helpers, I am currently having hard time fixing the values on the x-axis of a plot with ggplot: even though I have 12 years, ggplot plots only 3 of them. Here is my example: library(ggplot2) ii - 2000:2011 ss - rnorm(12,0,1) pm - data.frame(ii,ss) tmpplot - ggplot(pm, aes(x = ii, y = ss)) plot - tmpplot + geom_line() plot In my case, ggplot reports on the year 2000, 2004 and 2008 on the x-axis, but I'd like to have all the years from 2000 to 2011. I know how to fix this with the standard plot in R, but for consistency I'd like to use ggplot. Can anyone help? thanks in advance, f. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Master candidate in 2rd year Department of Animal genetics breeding Room 436,College of Animial ScienceTechnology, China Agriculture University,Beijing,100193 E-mail: yao.h.1...@gmail.com -- Francesco Sarracino, Ph.D. https://sites.google.com/site/fsarracino/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to label two figures in the same chunk independently with knitr
Dear R helpers, I am using knitr to run analysis with R and edit my document with Latex. I am wondering whether there is a way to include 2 or more pictures per chunk and being able to refer them in the text independently and eventually whether it is possible to give them different captions. Let me give you an example.Rnw: \documentclass{article} \title{Example} \author{FS} \begin{document} \maketitle I put some text here. I want to plot to charts in the same figure and label them independently. stat, echo = FALSE, results = 'hide'= ii - 2000:2011 xx - rnorm(12,0,1) yy - rnorm(12,0,1) pm - data.frame(ii,zz,yy @ Now I generate the two pictures and put them into the same chunk with the option out.width set to .49 so that knitr places the two charts side by side: fig:example, echo = FALSE, out.width=.49\\linewidth, fig.cap=this is an example= plot(ii,xx, type = l) plot(ii,yy, type = l, lty = 2) @ Finally, I want the reader to look at the figure on the left. \end{document} How can I do this? If I refer to \ref{fig:example} I will get the number of the figure, but of the chart on the left. Eventually, is it possible to have separate captions for each chart? Thanks in advance for your kind help, f. -- Francesco Sarracino, Ph.D. https://sites.google.com/site/fsarracino/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] posting a question in the R-help forum
At 14:14 07/01/2013, Violet Swakman wrote: Hello, I wanted to post this question below, on the R-help forum, but I'm not sure I succeeded because it said that I wasn't subscribed to the mailing list yet. Now I am subscribed, but will my question be accepted now automatically, or should I submit it again? Thanks in advance, Violet Swakman Violet You have apparently 20 groups but only 23 observations. I think your model would work fine with more data. Note also the warning sign of the correlation of -0.999 for intercept and tarsus length. Hello everyone, I'm having trouble understanding my output from a linear mixed effects model (nlme :: lme), I hope someone can help me. Say I'm interested in the effect of Tarsus length on the Bar length of feathers. I used an lme since some birds were living in the same territory, so territory was included as random effect. Both Bar length and Tarsus length are seen as numerical values, Territory is seen as a factor. # m1 - lme(Bar_length~Tarsus_length, random = ~ 1|Territory, data=data) summary(m1) Linear mixed-effects model fit by REML Data: min_s12 AIC BIC logLik -104.1593 -99.98116 56.07963 Random effects: Formula: ~1 | as.factor(Territory) (Intercept) Residual StdDev: 0.01023884 0.01072872 Fixed effects: Av_bar_length ~ Tarsus_av Value Std.Error DFt-value p-value (Intercept) 0.22391092 0.08472658 19 2.6427470 0.0160 Tarsus_av -0.00048219 0.00338510 2 -0.1424453 0.8998 Correlation: (Intr) Tarsus_av -0.999 Standardized Within-Group Residuals: Min Q1Med Q3 Max -2.02920116 -0.49093095 0.05736504 0.47632005 1.15944871 Number of Observations: 23 Number of Groups: 20 ## I do not understand why the model needs 17 degrees of freedom to calculate 1 intercept and slope (just 1 numerical explanatory variable). Could anyone maybe explain this to me? When I use Season, a factor with 2 levels, as an explanatory variable the same thing happens, the model takes 17 DF's to calculate the effect of Season. Thanks in advance, Violet [[alternative HTML version deleted]] Michael Dewey i...@aghmed.fsnet.co.uk http://www.aghmed.fsnet.co.uk/home.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Oaxaca-Blinder decomposition in R
Dear R-listers, does anybody know of any package developed to implement the Oaxaca-Blinder decomposition in R? I've been googling around and my reserch has been unfruitful. The latest news I've found were 1 year old. Does anybody know of any recent development? Has R ever been employed to run a Oaxaca-Blinder decomposition? Thanks for your kind support, f. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] (no subject)
I am using the nls function and it stops because the number of iterations exceeded 50, but i used the nls.control argument to allow for 500 iterations. Do you have any idea why it's not working? Not entirely. But I do see in the code for SSgompertz that SSgompertz calls nls _without_ specifying maxiter. So I suspect the error may be coming from the selfstart step which is ignoring your control argument. You could check that by adding trace=TRUE to the nls call; I suspect you may see no iterations at all if the self-start fails before nls itself gets going. If that's so, the easiest way I can see round that (barring alrternative packages I've not heard of) is to copy the SSgompertz code to a text file (say 'SSgompertz' at the command line without () to see the code), create your own SSgompertz function from it, add a big maxiter in its internal call and use your version. Failing that, use good starting parameters (usually I fool around with plots until something looks close enough to at least start) for a roll-your-own gompertz model. S Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] New book: Beginner's Guide to GAM with R
Readers of this mailing list may be interested to know that the book A Beginner's Guide to Generalized Additive Models with R' is now available from: http://www.highstat.com/BGGAM.htm Upcoming books in 2013: A Beginner's Guide to GLM with R and JAGS. AF Zuur, J Hilbe, EN Ieno A Beginner's Guide to GAMM with R. AF Zuur, AA Saveliev, EN Ieno Kind regards, Alain Zuur [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Conditional Statistics
I think Simon has provided a good answer to the actual question but as a refugee from SAS I'd suggest having a look at www.et.bs.ehu.es/~etptupaf/pub/R/RforSASSPSSusers.pdf or getting the book Muenchen, R. A. (2008). R for SAS and SPSS Users (1st ed.). Springer. R ans SAS approach things very differently at times. John Kane Kingston ON Canada -Original Message- From: thoms...@email.arizona.edu Sent: Mon, 7 Jan 2013 19:17:27 -0600 To: r-help@r-project.org Subject: [R] Conditional Statistics Hello, I am a new user of R. I am coming from SAS and do statistics on stock market data, economic data, and social data. My question is this: How can you get the mean, standard dev, etc. of a variable based on a conditional statement on either the same variable or a different variable in the same data set? So if I had the closing prices of the SP from 01/01/1990-12/31/1990, how could I get the average price of the SP from 02/01/1990-03/15/1990? Or the average price of the SP on Mondays (assuming a dummy var is created for 1 = Monday, 0 = else). I understand that you can create subsets and new data sets based on the conditional statements; but is there an easier way to do this by typing a line into the mean() statement? That was extremely easy in SAS where you could say: proc means data=sp500; var price; where monday = 1; Thank you for your help. Joe __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. FREE 3D MARINE AQUARIUM SCREENSAVER - Watch dolphins, sharks orcas on your desktop! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] list of lists to matrix
Please supply some sample data. The easiest way to supply data is to use the dput() function. Example with your file named testfile: dput(testfile) Then copy the output and paste into your email. For large data sets, you can just supply a representative sample. Usually, dput(head(testfile, 100)) will be sufficient. Simpe example: aalist - list(aa = c(3.0, 2.9, 2.7), bb = c(0.86, 0.76, 0.66), cc= c(0.07, 0.04, 0.04), cc = c(a, b, c)) dput(aalist) The attached text file was useful but actual or example data is much better. John Kane Kingston ON Canada -Original Message- From: eliza_bo...@hotmail.com Sent: Mon, 7 Jan 2013 16:13:03 + To: r-help@r-project.org Subject: [R] list of lists to matrix dear R family, [a text file has been attached for better understanding] i have a list of 16 and each of of that is further subdivided into variable number of lists. So, i have a kind of list into lists phenomenon. [[1]]$'1' 1 2 3 4 5 6 7 8 9 [[1]]$'2' 1 2 3 4 5 6 7 8 9 i want to convert both these sublists into one column and then cbind it in the following way col1 col2 1 1 2 2 3 3.. 9 9 i want to the same operations on all the 16 lists. thanks in advance, elisa __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Conditional Statistics
...if I had the closing prices of the SP from 01/01/1990-12/31/1990, how could I get the average price of the SP from 02/01/1990-03/15/1990? Or the average price of the SP on Mondays (assuming a dummy var is created for 1 = Monday, 0 = else). tapply has already been referred to. You may also find aggregate() useful, as it gives you back a data frame that includes the conditioning variables if you tell it to. Alse ave, if you want to do something like mean-centring a data set based on group means rather than the grad mean. S Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Logical operator and lists
Hello R-Helpers, I have a slight problem with the expresion data[data==] - NA which works well for a data.frame. But now i must use the same for a list of data.frames. My idea is data[[]][data==] but it don´t work. Thanks!! Dominic [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] error in a abline loop
HI Elaine, In the data you sent to me, it had 5 levels for skin_color. data1-read.csv(skin_color.csv,sep=\t) data1$skin_color-factor(data1$skin_color) levels(data1$skin_color) #[1] 1 2 3 4 5 mypath-file.path(/home/arun/Trial1,paste(Elaine_,1:5,.jpg,sep=)) #change the file.path according to your system for(i in seq_along(mypath)){ jpeg(file=mypath[i]) plot(body_weight~body_length,data=data1[data1$skin_color==i,]) line-lm(body_weight~body_length,data=data1[data1$skin_color==i,]) abline(line,col=c(yellow,blue,green,orange,red)[i],lwd=2) dev.off() } #or lapply(seq_along(mypath),function(i) {jpeg(file=mypath[i]) line-lm(body_weight~body_length,data=data1[data1$skin_color==i,]) plot(body_weight~body_length,data=data1[data1$skin_color==i,]) abline(line,col=c(yellow,blue,green,orange,red)[i],lwd=2) dev.off() }) A.K. - Original Message - From: Elaine Kuo elaine.kuo...@gmail.com To: arun smartpink...@yahoo.com Cc: Sent: Monday, January 7, 2013 9:48 PM Subject: Re: [R] error in a abline loop Hello arun Thank you always. Please kindly help the attached data for your reference. Elaine On Tue, Jan 8, 2013 at 10:00 AM, arun smartpink...@yahoo.com wrote: HI, A possible guess ( with no data): for (i in 1:7) { subs - data$skin_color==levels(data$skin_color)[i] line-lm(body_weight~body_length, data=subset(data, subset=subs), abline(line,col=c(yellow,chocolate1,darkorange2, red3,saddlebrown,coral4,grey38)[i],lwd=2) ) #closing parenthesis for lm( was missing } A.K. - Original Message - From: Elaine Kuo elaine.kuo...@gmail.com To: r-help@r-project.org Cc: Sent: Monday, January 7, 2013 8:23 PM Subject: [R] error in a abline loop Hello I have data of body length and body weight of people of different skin colors. I tried to write a code to plot body length and body weight according to the skin colors. (Thanks for Petr's advice so far.) A loop is used but an error shows up in the following code. It says: unexpected '}' in red3,red3,saddlebrown,coral4,chocolate4,darkblue,navy,grey38)[i],lwd=2) } Please kindly advise how to modify the code. Thank you. The code data -read.csv(H:/skincolor.csv,header=T) # graph par(mai=c(1.03,1.03,0.4,0.4)) plot(data$body_weight, data$body_length, xaxp=c(0,200,4), yaxp=c(0,200,4), type=p, pch=1,lwd=1.0, cex.lab=1.4, cex.axis=1.2, font.axis=2, cex=1.5, las=1, bty=l,col=c(yellow,chocolate1,darkorange2, red3,saddlebrown,coral4,grey38)[as.numeric(data$skin_color)]) #~~~ ## for (i in 1:7) { subs - data$skin_color==levels(data$skin_color)[i] line-lm(body_weight~body_length, data=subset(data, subset=subs), abline(line,col=c(yellow,chocolate1,darkorange2, red3,saddlebrown,coral4,grey38)[i],lwd=2) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] stats-as.dendrogram
Dear Sir/Madam, I am using the 'as.dendrogram' function to convert from an hclust object (using 'fastcluster') to a dendrogram with a dataset of size ~3 (an example code is below). I need the dendrogram structure to use the dendrapply and attributes functions and to access the child nodes, I do not need any of the plot properties. The problem is that it takes an infinite amount of time to convert and it uses a lot of memory. Could you please let me know which part of the code takes the longest time and if there's a way to make it faster. Is it the recursive part or when applying the attributes to each node (merging, etc) ? If there isn't a way to make it faster, are there similar functions(dendrapply, attributes, accessing of child nodes) that use the hclust object instead? Thanks, Sahar Alkhairy # options(expressions=50) NCols=10 NRows=3 DataB -matrix(runif(NCols*NRows), ncol=NCols) HClust - hclust.vector(DataB ) dhc - as.dendrogram(HClust) #gets stuck here forever [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problems when loading package GenABEL
Dear all, since yesterday, I have been experiencing problems with the package GenABEL. When I try to load the package (library(GenABEL)) I get the following error message: Loading required package: MASS Error : .onLoad failed in loadNamespace() for 'GenABEL', details: call: stringSplit[[1]] error: subscript out of bounds Error: package/namespace load failed for ‘GenABEL’ The funny (and worrying) thing is that I get the same error on every machine (linux, mac, windows) I tried, even after removing and reinstalling the package or the whole R. And it's not only me, but also my colleagues here and in different cities! Does anybody know what's going on and how to tackle the problem? Thank you in advance for any useful information. Filippo Biscarini __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] GLMM post- hoc comparisons
Hi All, I have data about seed predation (SP) in fruits of three differents colors (yellow, motted, dark) and in two fruiting seasons (2007, 2008). I performed a GLMM (lmer function, lme4 package) and the outcome showed that the interaction term (color:season) was significant, and some combinations of this interaction have significant Pr(|z|), but I don't think they are the right significant combinations, because when I look the bwplot, this combinations seems to be very different from the other ones. So, I would like to know if there is any test a posteriori to know the p-values for each combination of color:season, and thereby be able to know what conbination/s is/are really significant. m1=lmer(SP ~ color + season:color +(1|Site:tree), data=datosfl, family=poisson) AIC BIC logLik deviance 178.3 196.6 -81.14162.3 Random effects: Groups NameVariance Std.Dev. obsBR (Intercept) 0.064324 0.25362 Site:tree (Intercept) 0.266490 0.51623 Number of obs: 73, groups: obsBR, 73; Site:tree, 37 Estimate Std. Error z value Pr(|z|) (Intercept)2.5089 0.2750 9.125 2e-16 *** colorM-0.1140 0.3242 -0.352 0.7250 colorD-0.6450 0.4178 -1.544 0.1227 Season2008-0.7343 0.3104 -2.365 0.0180 * colorM:Season2008 0.2505 0.4352 0.576 0.5648 colorD:Season2008 1.1445 0.5747 1.992 0.0464 * __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Logical operator and lists
Hello, Dominic, untested: data - lapply( data, function( x) x[ x == ] - NA Hth -- Gerrit On Tue, 8 Jan 2013, Dominic Roye wrote: Hello R-Helpers, I have a slight problem with the expresion data[data==] - NA which works well for a data.frame. But now i must use the same for a list of data.frames. My idea is data[[]][data==] but it don´t work. Thanks!! Dominic__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to label two figures in the same chunk independently with knitr
On 01/08/2013 04:17 AM, Francesco Sarracino wrote: Dear R helpers, I am using knitr to run analysis with R and edit my document with Latex. I am wondering whether there is a way to include 2 or more pictures per chunk and being able to refer them in the text independently and eventually whether it is possible to give them different captions. Hi: You may use, main = , within each plot; or take a look at the LaTeX package, subcaption: http://en.wikibooks.org/wiki/LaTeX/Floats,_Figures_and_Captions#Subfloats -- Ulises M. Alvarez http://sophie.unam.mx/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problems when loading package GenABEL
On 08/01/2013 11:42, Filippo Biscarini wrote: Dear all, since yesterday, I have been experiencing problems with the package GenABEL. When I try to load the package (library(GenABEL)) I get the following error message: Loading required package: MASS Error : .onLoad failed in loadNamespace() for 'GenABEL', details: call: stringSplit[[1]] error: subscript out of bounds Error: package/namespace load failed for ‘GenABEL’ The funny (and worrying) thing is that I get the same error on every machine (linux, mac, windows) I tried, even after removing and reinstalling the package or the whole R. And it's not only me, but also my colleagues here and in different cities! Does anybody know what's going on and how to tackle the problem? Ask the maintainer, who has been informed. He tries to parse the CRAN web page for the package, which has changed. Not at all wise Thank you in advance for any useful information. Filippo Biscarini __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. That did ask you to contact the package maintainer -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] multiple versions of function
On Jan 7, 2013, at 6:58 PM, ivo welch wrote: hi david---can you give just a little more of an example? the function should work with call by order, call by name, and data frame whose columns are the names. /iaw It is I who should be expecting you to provide an example. -- David. Ivo Welch (ivo.we...@gmail.com) On Mon, Jan 7, 2013 at 4:25 PM, David Winsemius dwinsem...@comcast.net wrote: On Jan 7, 2013, at 3:57 PM, ivo welch wrote: dear R experts: I want to define a function the calculates the black-scholes value. it takes 5 named parameters, BS - function(S,K,dt,rf,sigma) {} . let's presume I want to be able to call this not only with my 5 numeric vectors BS( sigma=0.3, S=100, K=100, dt=1, rf=0.1 ) and BS( 100, 100, 1, 0.1, 0.3), but also with a data frame that contains the variables alll in a neat data frame already, BS( data.frame( S=100, K=100, dt=1, rf=0.1, sigma=0.3 )). I could of course define BS6 and BS1, but it would be nice to wrap this functionality into one function that can do both. I know that BS has to parse an '...' argument, but there could be a couple of magical R functions that might make this easier than I would do it with my planned clunky version. what's the elegant version? apply( dfrm, 1, BS) -- David Winsemius Alameda, CA, USA David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot xaxp issue
On Jan 7, 2013, at 5:02 PM, Elaine Kuo wrote: Hello, I figured out that the code should be boyline-lm(body_weight ~ body_length, data=subset (together,,sex==boy)) However, the could be omitted if the field name happened to be numeric, such as 1, 2, or 3. Please kindly explain why the could be omitted for numbers. The interpreter parses atoms composed only of [digits, -, .] , i.e. numeric, differently than it parses atoms with leading [alphas]. Please study the introductory material more thoroughly where this is all explained and illustrated. And do provide minimal reproducible examples. That might be especially important here , since failing to include the quotes could be very prone to error if the underlying object is a factor variable. -- David Thanks again. Elaine On Tue, Jan 8, 2013 at 8:40 AM, Elaine Kuo elaine.kuo...@gmail.com wrote: boyline-lm(body_weight ~ body_length, data=together, subset(together,sex==boy)) __ David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Logical operator and lists
HI, This should also work: set.seed(5) list1-lapply(1:3,function(i) data.frame(col1=sample(c(1:5,),10,replace=TRUE), value=rnorm(10),stringsAsFactors=FALSE)) lapply(list1,function(x) {x[x==]-NA;x}) A.K. - Original Message - From: Dominic Roye dominic.r...@gmail.com To: R help r-help@r-project.org Cc: Sent: Tuesday, January 8, 2013 7:16 AM Subject: [R] Logical operator and lists Hello R-Helpers, I have a slight problem with the expresion data[data==] - NA which works well for a data.frame. But now i must use the same for a list of data.frames. My idea is data[[]][data==] but it don´t work. Thanks!! Dominic [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Logical operator and lists
Hi, Try this: set.seed(5) list1-lapply(1:3,function(i) data.frame(col1=sample(c(1:5,),10,replace=TRUE), value=rnorm(10),stringsAsFactors=FALSE)) res-lapply(list1,function(x) {x[apply(x,2,function(y) y==)]-NA;x}) res[[1]] # col1 value #1 2 -0.6029080 #2 5 -0.4721664 #3 NA -0.6353713 #4 2 -0.2857736 #5 1 0.1381082 #6 5 1.2276303 #7 4 -0.8017795 #8 5 -1.0803926 #9 NA -0.1575344 #10 1 -1.0717600 A.K. - Original Message - From: Dominic Roye dominic.r...@gmail.com To: R help r-help@r-project.org Cc: Sent: Tuesday, January 8, 2013 7:16 AM Subject: [R] Logical operator and lists Hello R-Helpers, I have a slight problem with the expresion data[data==] - NA which works well for a data.frame. But now i must use the same for a list of data.frames. My idea is data[[]][data==] but it don´t work. Thanks!! Dominic [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to label two figures in the same chunk independently with knitr
All you mentioned are possible; knitr has very comprehensive support to figures in LaTeX, and what you want in this case is subfigures (\usepackage{subfig}); here is an example: https://github.com/yihui/knitr-examples/blob/master/067-graphics-options.Rnw (search for 'fig.subcap' for the relevant chunk) And here is a preview: http://i.imgur.com/4lKpw.png Regards, Yihui -- Yihui Xie xieyi...@gmail.com Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Tue, Jan 8, 2013 at 4:17 AM, Francesco Sarracino f.sarrac...@gmail.com wrote: Dear R helpers, I am using knitr to run analysis with R and edit my document with Latex. I am wondering whether there is a way to include 2 or more pictures per chunk and being able to refer them in the text independently and eventually whether it is possible to give them different captions. Let me give you an example.Rnw: \documentclass{article} \title{Example} \author{FS} \begin{document} \maketitle I put some text here. I want to plot to charts in the same figure and label them independently. stat, echo = FALSE, results = 'hide'= ii - 2000:2011 xx - rnorm(12,0,1) yy - rnorm(12,0,1) pm - data.frame(ii,zz,yy @ Now I generate the two pictures and put them into the same chunk with the option out.width set to .49 so that knitr places the two charts side by side: fig:example, echo = FALSE, out.width=.49\\linewidth, fig.cap=this is an example= plot(ii,xx, type = l) plot(ii,yy, type = l, lty = 2) @ Finally, I want the reader to look at the figure on the left. \end{document} How can I do this? If I refer to \ref{fig:example} I will get the number of the figure, but of the chart on the left. Eventually, is it possible to have separate captions for each chart? Thanks in advance for your kind help, f. -- Francesco Sarracino, Ph.D. https://sites.google.com/site/fsarracino/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Applying a user-defined function
Hello List, My goal is to apply a user-defined function on several columns of a data frame. When testing the code on a reproducible example below, I get the following error message. #now Write a new function using the above cut ()/quantile function to apply on different columns of the data frame CutQuintiles - function(x) { + cut (test1$x,quantile (test1$x, (0:5/5)),include.lowest=TRUE) + } #apply the CutQuintile () on every odd-numbered columns of the test1 data frame newcols - sapply(test1 [, seq (1,6,2)], CutQuintiles) Error in cut.default(test1$x, quantile(test1$x, (0:5/5)), include.lowest = TRUE) : 'x' must be numeric I would appreciate receiving your advice. Thanks, Pradip ## The reproducible example begins here test1 - read.table (text= State,ObtMj_P,ObtMj_SE,ExpPrevMed_P,ExpPrevMed_SE,ParMon_P,ParMon_SE Alabama,49.60,1.37,80.00,0.91,12.10,0.68 Alaska,55.00,1.41,81.80,1.08,12.40,0.90 Arizona,52.50,1.56,79.60,1.20,15.80,1.08 Arkansas,50.50,1.22,78.00,0.78,12.80,0.72 California,51.10,0.65,80.50,0.53,13.00,0.41 Colorado,55.10,1.26,81.70,1.03,12.10,0.72 Connecticut,56.30,1.28,85.00,0.93,14.60,0.77 Delaware,53.60,1.30,79.50,1.04,14.70,0.97 District of Columbia,53.50,1.22,76.20,1.03,14.30,1.13 Florida,52.70,0.67,78.90,0.52,14.10,0.45 Georgia,52.50,1.15,79.30,1.02,15.90,0.98 Hawaii,49.40,1.33,83.80,1.12,16.00,1.06 Idaho,48.30,1.23,82.40,0.99,11.90,0.74 Illinois,52.70,0.63,81.00,0.46,13.60,0.40 Indiana,49.60,1.16,80.90,0.91,12.60,0.82 Iowa,46.30,1.37,82.10,1.01,13.60,0.87 Kansas,44.30,1.43,79.20,0.98,12.90,0.79 Kentucky,52.90,1.37,78.70,1.05,14.60,0.98 Louisiana,49.70,1.23,76.80,1.06,14.50,0.76 Maine,55.60,1.44,82.90,0.93,16.70,0.83 Maryland,53.90,1.46,83.60,0.95,14.00,0.80 Massachusetts,55.40,1.41,81.00,1.15,14.70,0.80 Michigan,52.40,0.62,80.50,0.47,15.00,0.43 Minnesota,51.50,1.20,84.40,0.87,14.40,0.86 Mississippi,43.20,1.14,76.60,0.91,12.30,0.78 Missouri,48.70,1.20,80.30,0.90,13.70,0.12 Montana,56.40,1.16,83.70,0.95,12.10,0.68 Nebraska,45.70,1.51,83.40,0.95,12.40,0.90 Nevada,54.20,1.17,80.60,1.07,15.80,1.08 New Hampshire,56.10,1.30,83.30,0.93,12.80,0.72 New Jersey,53.20,1.45,83.70,0.95,13.00,0.41 New Mexico,57.60,1.34,78.90,1.03,12.10,0.72 New York,53.70,0.67,82.60,0.48,14.60,0.77 North Carolina,52.20,1.26,81.90,0.84,14.70,0.97 North Dakota,48.60,1.34,84.20,0.88,14.30,1.13 Ohio,50.90,0.61,82.70,0.49,14.10,0.45 Oklahoma,47.20,1.42,78.80,1.33,15.90,0.98 Oregon,54.00,1.35,80.60,1.14,16.00,1.06 Pennsylvania,53.00,0.63,79.90,0.47,11.90,0.74 Rhode Island,57.20,1.20,79.50,1.02,13.60,0.40 South Carolina,50.50,1.21,79.50,0.95,12.60,0.82 South Dakota,43.40,1.30,81.70,1.05,13.60,0.87 Tennessee,48.90,1.35,78.40,1.35,12.90,0.79 Texas,48.70,0.62,79.00,0.48,14.60,0.98 Utah,42.00,1.49,85.00,0.93,14.50,0.76 Vermont,58.70,1.24,83.70,0.84,16.70,0.83 Virginia,51.80,1.18,82.00,1.04,14.00,0.80 Washington,53.50,1.39,84.10,0.96,14.70,0.80 West Virginia,52.80,1.07,79.80,0.93,15.00,0.43 Wisconsin,49.90,1.50,83.50,1.02,14.40,0.86 Wyoming,49.20,1.29,82.00,0.85,12.30,0.78 , sep=,, row.names='State', header=TRUE, as.is=TRUE) # Verify if The following function ctagorizes the obtmj_p values into one of the 5 equal sized groups- works fine. cut (test1$obtmj_p,quantile (test1$obtmj_p, (0:5/5)),include.lowest=TRUE) #now Write a new function using the above cut ()/quantile function to apply on different columns of the data frame CutQuintiles - function(x) { cut (test1$x,quantile (test1$x, (0:5/5)),include.lowest=TRUE) } #apply the CutQuintile () on every odd-numbered columns of the test1 data frame newcols - sapply(test1 [, seq (1,6,2)], CutQuintiles) # name 3 new columns based on the odd-numbered columns names(newcols) - paste (names(test1 [, seq (1,6,2)]), _cat) ## Pradip K. Muhuri, PhD Statistician Substance Abuse Mental Health Services Administration The Center for Behavioral Health Statistics and Quality Division of Population Surveys 1 Choke Cherry Road, Room 2-1071 Rockville, MD 20857 Tel: 240-276-1070 Fax: 240-276-1260 e-mail: pradip.muh...@samhsa.hhs.govmailto:pradip.muh...@samhsa.hhs.gov The Center for Behavioral Health Statistics and Quality your feedback. Please click on the following link to complete a brief customer survey: http://cbhsqsurvey.samhsa.govhttp://cbhsqsurvey.samhsa.gov/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Applying a user-defined function
On Jan 8, 2013, at 9:11 AM, Muhuri, Pradip (SAMHSA/CBHSQ) wrote: Hello List, My goal is to apply a user-defined function on several columns of a data frame. When testing the code on a reproducible example below, I get the following error message. #now Write a new function using the above cut ()/quantile function to apply on different columns of the data frame CutQuintiles - function(x) { + cut (test1$x,quantile (test1$x, (0:5/5)),include.lowest=TRUE) + } #apply the CutQuintile () on every odd-numbered columns of the test1 data frame newcols - sapply(test1 [, seq (1,6,2)], CutQuintiles) Error in cut.default(test1$x, quantile(test1$x, (0:5/5)), include.lowest = TRUE) : 'x' must be numeric I would appreciate receiving your advice. Take the test$ out of that function's code. You are reaching outside the function when you should only be working on the x object that gets passed into it. Thanks, Pradip ## The reproducible example begins here test1 - read.table (text= State,ObtMj_P,ObtMj_SE,ExpPrevMed_P,ExpPrevMed_SE,ParMon_P,ParMon_SE Alabama,49.60,1.37,80.00,0.91,12.10,0.68 Alaska,55.00,1.41,81.80,1.08,12.40,0.90 Arizona,52.50,1.56,79.60,1.20,15.80,1.08 Arkansas,50.50,1.22,78.00,0.78,12.80,0.72 California,51.10,0.65,80.50,0.53,13.00,0.41 Colorado,55.10,1.26,81.70,1.03,12.10,0.72 Connecticut,56.30,1.28,85.00,0.93,14.60,0.77 Delaware,53.60,1.30,79.50,1.04,14.70,0.97 District of Columbia,53.50,1.22,76.20,1.03,14.30,1.13 Florida,52.70,0.67,78.90,0.52,14.10,0.45 Georgia,52.50,1.15,79.30,1.02,15.90,0.98 Hawaii,49.40,1.33,83.80,1.12,16.00,1.06 Idaho,48.30,1.23,82.40,0.99,11.90,0.74 Illinois,52.70,0.63,81.00,0.46,13.60,0.40 Indiana,49.60,1.16,80.90,0.91,12.60,0.82 Iowa,46.30,1.37,82.10,1.01,13.60,0.87 Kansas,44.30,1.43,79.20,0.98,12.90,0.79 Kentucky,52.90,1.37,78.70,1.05,14.60,0.98 Louisiana,49.70,1.23,76.80,1.06,14.50,0.76 Maine,55.60,1.44,82.90,0.93,16.70,0.83 Maryland,53.90,1.46,83.60,0.95,14.00,0.80 Massachusetts,55.40,1.41,81.00,1.15,14.70,0.80 Michigan,52.40,0.62,80.50,0.47,15.00,0.43 Minnesota,51.50,1.20,84.40,0.87,14.40,0.86 Mississippi,43.20,1.14,76.60,0.91,12.30,0.78 Missouri,48.70,1.20,80.30,0.90,13.70,0.12 Montana,56.40,1.16,83.70,0.95,12.10,0.68 Nebraska,45.70,1.51,83.40,0.95,12.40,0.90 Nevada,54.20,1.17,80.60,1.07,15.80,1.08 New Hampshire,56.10,1.30,83.30,0.93,12.80,0.72 New Jersey,53.20,1.45,83.70,0.95,13.00,0.41 New Mexico,57.60,1.34,78.90,1.03,12.10,0.72 New York,53.70,0.67,82.60,0.48,14.60,0.77 North Carolina,52.20,1.26,81.90,0.84,14.70,0.97 North Dakota,48.60,1.34,84.20,0.88,14.30,1.13 Ohio,50.90,0.61,82.70,0.49,14.10,0.45 Oklahoma,47.20,1.42,78.80,1.33,15.90,0.98 Oregon,54.00,1.35,80.60,1.14,16.00,1.06 Pennsylvania,53.00,0.63,79.90,0.47,11.90,0.74 Rhode Island,57.20,1.20,79.50,1.02,13.60,0.40 South Carolina,50.50,1.21,79.50,0.95,12.60,0.82 South Dakota,43.40,1.30,81.70,1.05,13.60,0.87 Tennessee,48.90,1.35,78.40,1.35,12.90,0.79 Texas,48.70,0.62,79.00,0.48,14.60,0.98 Utah,42.00,1.49,85.00,0.93,14.50,0.76 Vermont,58.70,1.24,83.70,0.84,16.70,0.83 Virginia,51.80,1.18,82.00,1.04,14.00,0.80 Washington,53.50,1.39,84.10,0.96,14.70,0.80 West Virginia,52.80,1.07,79.80,0.93,15.00,0.43 Wisconsin,49.90,1.50,83.50,1.02,14.40,0.86 Wyoming,49.20,1.29,82.00,0.85,12.30,0.78 , sep=,, row.names='State', header=TRUE, as.is=TRUE) # Verify if The following function ctagorizes the obtmj_p values into one of the 5 equal sized groups- works fine. cut (test1$obtmj_p,quantile (test1$obtmj_p, (0:5/5)),include.lowest=TRUE) #now Write a new function using the above cut ()/quantile function to apply on different columns of the data frame CutQuintiles - function(x) { cut (test1$x,quantile (test1$x, (0:5/5)),include.lowest=TRUE) } #apply the CutQuintile () on every odd-numbered columns of the test1 data frame newcols - sapply(test1 [, seq (1,6,2)], CutQuintiles) # name 3 new columns based on the odd-numbered columns names(newcols) - paste (names(test1 [, seq (1,6,2)]), _cat) ## Pradip K. Muhuri, PhD Statistician Substance Abuse Mental Health Services Administration The Center for Behavioral Health Statistics and Quality Division of Population Surveys 1 Choke Cherry Road, Room 2-1071 Rockville, MD 20857 Tel: 240-276-1070 Fax: 240-276-1260 e-mail: pradip.muh...@samhsa.hhs.govmailto:pradip.muh...@samhsa.hhs.gov The Center for Behavioral Health Statistics and Quality your feedback. Please click on the following link to complete a brief customer survey: http://cbhsqsurvey.samhsa.govhttp://cbhsqsurvey.samhsa.gov/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list
[R] Integration in R
Hi R-users. I'm having difficulty with an integration in R via the package cubature. I'm putting it with a simple example here. I wish to integrate a function like: f(x1,x2)=2/3*(x1+x2) in the interval 0x1x27. To be sure I tried it by hand and got 114.33, but the following R code is giving me 102.6667. --- library(cubature) f-function(x) { 2/3 * (x[1] + x[2] ) } adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(x[2],7)) --- I guess something is wrong with the way I have assigned limits in my codes. May I seek some advice from you. Many thanks. Regards, Jamil. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Integration in R
On Jan 8, 2013, at 9:43 AM, Naser Jamil wrote: Hi R-users. I'm having difficulty with an integration in R via the package cubature. I'm putting it with a simple example here. I wish to integrate a function like: f(x1,x2)=2/3*(x1+x2) in the interval 0x1x27. To be sure I tried it by hand and got 114.33, but the following R code is giving me 102.6667. --- library(cubature) f-function(x) { 2/3 * (x[1] + x[2] ) } adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(x[2],7)) What is x[2]? On my machine it was 0.0761, so I obviously got a different answer. -- David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Integration in R
Please reply on list. On Jan 8, 2013, at 10:27 AM, Naser Jamil wrote: Hi David, x[2] is the second variable, x2. It comes from the condition 0x1x27. No, it doesn't come from those conditions. It is being grabbed from some x-named object that exists in your workspace. If your limits were 7 in both dimensions, then the code should be: adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(7,7)) # $integral [1] 228.6667 (At this point I was trusting R's calculus abilities more than yours. I wasn't too trusting of mine either, and so tried seeing if Wolfram Alpha would accept this expression: integrate 2/3 (x+y) over 0 x7, 0y7 ; which it did and calculating the decimal expansion of the exact fraction: 686/3 [1] 228.6667 -- David. Thanks. On 8 January 2013 18:11, David Winsemius dwinsem...@comcast.net wrote: On Jan 8, 2013, at 9:43 AM, Naser Jamil wrote: Hi R-users. I'm having difficulty with an integration in R via the package cubature. I'm putting it with a simple example here. I wish to integrate a function like: f(x1,x2)=2/3*(x1+x2) in the interval 0x1x27. To be sure I tried it by hand and got 114.33, but the following R code is giving me 102.6667. --- library(cubature) f-function(x) { 2/3 * (x[1] + x[2] ) } adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(x[2],7)) What is x[2]? On my machine it was 0.0761, so I obviously got a different answer. -- David Winsemius, MD Alameda, CA, USA David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Integration in R
On Jan 8, 2013, at 10:51 AM, Naser Jamil wrote: Thanks. But then how to implement condition like 0x1x27? I would be happy to know that. Multiply the function by the conditional expression: f-function(x) { 2/3 * (x[1] + x[2] )*(x[1] x[2]) } adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(7,7)) $integral [1] 114. -- David. On 8 January 2013 18:41, David Winsemius dwinsem...@comcast.net wrote: Please reply on list. On Jan 8, 2013, at 10:27 AM, Naser Jamil wrote: Hi David, x[2] is the second variable, x2. It comes from the condition 0x1x27. No, it doesn't come from those conditions. It is being grabbed from some x-named object that exists in your workspace. If your limits were 7 in both dimensions, then the code should be: adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(7,7)) # $integral [1] 228.6667 (At this point I was trusting R's calculus abilities more than yours. I wasn't too trusting of mine either, and so tried seeing if Wolfram Alpha would accept this expression: integrate 2/3 (x+y) over 0 x7, 0y7 ; which it did and calculating the decimal expansion of the exact fraction: 686/3 [1] 228.6667 -- David. Thanks. On 8 January 2013 18:11, David Winsemius dwinsem...@comcast.net wrote: On Jan 8, 2013, at 9:43 AM, Naser Jamil wrote: Hi R-users. I'm having difficulty with an integration in R via the package cubature. I'm putting it with a simple example here. I wish to integrate a function like: f(x1,x2)=2/3*(x1+x2) in the interval 0x1x27. To be sure I tried it by hand and got 114.33, but the following R code is giving me 102.6667. --- library(cubature) f-function(x) { 2/3 * (x[1] + x[2] ) } adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(x[2],7)) What is x[2]? On my machine it was 0.0761, so I obviously got a different answer. -- David Winsemius, MD Alameda, CA, USA David Winsemius, MD Alameda, CA, USA David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Manhattan Plot
Hello, I am trying to create a simple Manhattan plot for a small list of 200 SNPs spread out in the genome in different genes. I have tried different functions (using ggplot2 and a function created by Stephen Turner, mhtplot etc.)-none of them work smoothly. Does anyone have a simple way to create the plot (not for all 22 chromosomes)- with the x axis showing the genes name and not the chromosomal location. Thanks a lot, Einat [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Manhattan Plot
Hello Einat, Have you tried ggbio package's plotGrandLinear from bioconductor? Best, -m On 8 January 2013 20:03, Einat Granot einatgra...@gmail.com wrote: Hello, I am trying to create a simple Manhattan plot for a small list of 200 SNPs spread out in the genome in different genes. I have tried different functions (using ggplot2 and a function created by Stephen Turner, mhtplot etc.)-none of them work smoothly. Does anyone have a simple way to create the plot (not for all 22 chromosomes)- with the x axis showing the genes name and not the chromosomal location. Thanks a lot, Einat [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Levels in new data fed to SVM
Hi all, I've encountered an issue using svm (e1071) in the specific case of supplying new data which may not have the full range of levels that were present in the training data. I've constructed this really primitive example to illustrate the point: library(e1071) training.data - data.frame(x = c(yellow,red,yellow,red), a = c(alpha,alpha,beta,beta), b = c(a, b, a, c)) my.model - svm(x ~ .,data=training.data) test.data - data.frame(x = c(yellow,red), a = c(alpha,beta), b = c(a, b)) predict(my.model,test.data) Error in predict.svm(my.model, test.data) : test data does not match model ! levels(test.data$b) - levels(training.data$b) predict(my.model,test.data) 1 2 yellowred Levels: red yellow In the first case test.data$b does not have the level c and this results in the input data being rejected. I've debugged this down to the point of model matrix creation in the SVM R code. Once I fill up the levels in the test data with the levels from the original data, then there is no problem at all. Assuming my test data has to come from another source where the number of category levels seen might not always be as large as those for the original training data, is there a better way I should be handling this? Thanks __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] incrementation within ifelse
Dear R-helper, I am working on a very large data frame and I am trying to add a new column and write in it with certain conditions. I have try to use this code with the data frame p : ID = 0 p[,newColumn]- ifelse (p$flagFoehn3_durr == 1, ifelse(p$Guetsch == 0, ID - ID ++ , ID ) , 0 ) What I am trying to do is to increment the ID when p$Guetsch == 0 and to put this result in the column. The problem is that ID does not increment itself. An other way is to use a loop for like this example : ID = 0 for (s in 1:(nrow(z))){ z[s,newColumn]- if (z$flagFoehn3_durr[s] == 1){ if(z$flagFoehn3_durr[s-1] == 0){ ID -ID+1 }else{ ID } }else{ 0 } } This work perfectly, but the problem is that it will take me more than a month to run it. Is there a way to increment with the first code I used or a way of running the second code faster (I have more than 1 million rows) Thanks! Cheers, Damien [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Logical operator and lists
Hi, If you don't return(x) or x at the end, set.seed(5) list1-lapply(1:3,function(i) data.frame(col1=sample(c(1:5,),10,replace=TRUE), value=rnorm(10),stringsAsFactors=FALSE)) lapply(list1,function(x) x[x==]) #[[1]] #[1] #[[2]] #character(0) #[[3]] #[1] lapply(list1,function(x) x[x==]-NA) #[[1]] #[1] NA # #[[2]] #[1] NA # #[[3]] #[1] NA lapply(list1,function(x) x[x==]-rep(NA,length(x[x==]))) #[[1]] #[1] NA NA # #[[2]] #logical(0) # #[[3]] #[1] NA NA NA NA NA NA lapply(list1,function(x) {x[x==]-NA;return(x)}) #or lapply(list1,function(x) {x[x==]-NA;x}) #or lapply(list1,function(x) {x[x==]-rep(NA,length(x[x==]));x}) [[1]] # col1 value #1 2 -0.6029080 #2 5 -0.4721664 #3 NA -0.6353713 #4 2 -0.2857736 #5 1 0.1381082 #6 5 1.2276303 #7 4 -0.8017795 #8 5 -1.0803926 #9 NA -0.1575344 #10 1 -1.0717600 -- A.K. From: Dominic Roye dominic.r...@gmail.com To: arun smartpink...@yahoo.com Sent: Tuesday, January 8, 2013 1:06 PM Subject: Re: [R] Logical operator and lists hi, Can you explain me why without ;x at the end, i get only NA? c [[1]] [1] NA [[2]] [1] NA 2013/1/8 arun smartpink...@yahoo.com HI, This should also work: set.seed(5) list1-lapply(1:3,function(i) data.frame(col1=sample(c(1:5,),10,replace=TRUE), value=rnorm(10),stringsAsFactors=FALSE)) lapply(list1,function(x) {x[x==]-NA;x}) A.K. - Original Message - From: Dominic Roye dominic.r...@gmail.com To: R help r-help@r-project.org Cc: Sent: Tuesday, January 8, 2013 7:16 AM Subject: [R] Logical operator and lists Hello R-Helpers, I have a slight problem with the expresion data[data==] - NA which works well for a data.frame. But now i must use the same for a list of data.frames. My idea is data[[]][data==] but it don´t work. Thanks!! Dominic [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Integration in R
Thanks. But then how to implement condition like 0x1x27? I would be happy to know that. On 8 January 2013 18:41, David Winsemius dwinsem...@comcast.net wrote: Please reply on list. On Jan 8, 2013, at 10:27 AM, Naser Jamil wrote: Hi David, x[2] is the second variable, x2. It comes from the condition 0x1x27. No, it doesn't come from those conditions. It is being grabbed from some x-named object that exists in your workspace. If your limits were 7 in both dimensions, then the code should be: adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(7,7)) # $integral [1] 228.6667 (At this point I was trusting R's calculus abilities more than yours. I wasn't too trusting of mine either, and so tried seeing if Wolfram Alpha would accept this expression: integrate 2/3 (x+y) over 0 x7, 0y7 ; which it did and calculating the decimal expansion of the exact fraction: 686/3 [1] 228.6667 -- David. Thanks. On 8 January 2013 18:11, David Winsemius dwinsem...@comcast.net wrote: On Jan 8, 2013, at 9:43 AM, Naser Jamil wrote: Hi R-users. I'm having difficulty with an integration in R via the package cubature. I'm putting it with a simple example here. I wish to integrate a function like: f(x1,x2)=2/3*(x1+x2) in the interval 0x1x27. To be sure I tried it by hand and got 114.33, but the following R code is giving me 102.6667. --**--**--- library(cubature) f-function(x) { 2/3 * (x[1] + x[2] ) } adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(x[2],7)) What is x[2]? On my machine it was 0.0761, so I obviously got a different answer. -- David Winsemius, MD Alameda, CA, USA David Winsemius, MD Alameda, CA, USA [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot residuals per factor
HI, Not sure whether ggplot() works with lists. If you want to plot residuals.vs.fitted for multiple groups, this could help you. Assuming that you want separate plots for each group: #You didn't provide any example. dat1-read.csv(skin_color.csv,sep=\t) #You can replace this with your dataset dat1$d-factor(dat1$skin_color) colnames(dat1)[2:3]-c(y,x) models-dlply(dat1,d,function(df) mod - lm(y~x,data=df)) models[[1]] #Call: #lm(formula = y ~ x, data = df) #Coefficients: #(Intercept) x # 51.8357 0.1407 mypath-file.path(/home/arun/Trial1,paste(catalin_,1:5,.jpg,sep=)) #change the file.path according to your system for(i in seq_along(mypath)){ jpeg(file=mypath[i]) par(mfrow=c(2,2)) line-lm(y~x,data=dat1[dat1$d==i,]) plot(line,which=1:4)# if you want only residual vs. fitted, change which=1 #abline(0,0) dev.off() } line1-lm(y~x,data=dat1[dat1$d==1,]) line1 # #Call: #lm(formula = y ~ x, data = dat1[dat1$d == 1, ]) # #Coefficients: #(Intercept) x # 51.8357 0.1407 A.K. - Original Message - From: catalin roibu catalinro...@gmail.com To: r-help@r-project.org Cc: Sent: Tuesday, January 8, 2013 4:22 AM Subject: [R] plot residuals per factor Dear R-users, I want to plot residuals vs fitted for multiple groups with ggplot2. I try this code, but unsuccessful. library(plyr) models-dlply(dat1,d,function(df) mod-lm(y~x,data=df) ggplot(models,aes(.fitted,.resid), color=factor(d))+ geom_hline(yintercept=0,col=white,size=2)+ geom_point()+ geom_smooth(se=F) -- --- Catalin-Constantin ROIBU Forestry engineer, PhD Forestry Faculty of Suceava Str. Universitatii no. 13, Suceava, 720229, Romania office phone +4 0230 52 29 78, ext. 531 mobile phone +4 0745 53 18 01 +4 0766 71 76 58 FAX: +4 0230 52 16 64 silvic.usv.ro [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Manhattan Plot
Hi Einat As Mehmet suggested, you can try plotGrandlinear in ggbio. an example codes in the manual are here http://tengfei.github.com/ggbio/docs/man/plotGrandLinear.html how to install http://www.bioconductor.org/packages/2.11/bioc/html/ggbio.html But one thing is confusing(Maybe I don't get it right what you want), in Manhattan plot, the x-axis shows chromosome names like in those example, even though you have a short list with only 200 SNPs, if you hope to show gene names as x-axis, that's going to be tricky 1. 200(suppose each snp fall in different genes) x-axis labels in a row is hard or imporssible to read in the plot. 2. it's spread over multiple chromosomes, even if you really want to labels them with gene names, it's tricky to do in gglot2 level. IMHO, as a Manhattan overview, you don't have to show all the details like gene names, just to check general distribution and outliers. I guess what you want is probably a gene-list view like in IGV? http://www.broadinstitute.org/igv/gene_list_view suppose your 200 SNPs falls only to a few genes, let's say 10 genes. you may want something like each gene take one facet or panel in a row/grid of plots, would you mind to explain your case? or maybe you can shoot me personal email for collaboration on your case, because complex gene-slit views are not fully supported in ggbio yet. Thanks Tengfei On Tue, Jan 8, 2013 at 1:19 PM, Suzen, Mehmet msu...@gmail.com wrote: Hello Einat, Have you tried ggbio package's plotGrandLinear from bioconductor? Best, -m On 8 January 2013 20:03, Einat Granot einatgra...@gmail.com wrote: Hello, I am trying to create a simple Manhattan plot for a small list of 200 SNPs spread out in the genome in different genes. I have tried different functions (using ggplot2 and a function created by Stephen Turner, mhtplot etc.)-none of them work smoothly. Does anyone have a simple way to create the plot (not for all 22 chromosomes)- with the x axis showing the genes name and not the chromosomal location. Thanks a lot, Einat [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Tengfei Yin MCDB PhD student 1620 Howe Hall, 2274, Iowa State University Ames, IA,50011-2274 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] tm: custom reader for readPlain
Hello: I have a series of newspaper articles from a Canadian newspaper database (Canadian Newsstand) that look just like below. I've read through this vignette (http://cran.r-project.org/web/packages/tm/vignettes/extensions.pdf) about creating a custom reader to extract meta-data, but I can't understand how to apply this in the context of a text document, rather than in the tabular format as in the vignette. You can see there's all kinds of valuable information in each document -Author, page number, publication year, section, publication title Can anyone provide some suggestions to someone unfamiliar with the tm package as to how to go about creating a custom reader for this situation? Yours truly, Simon Kiss Document 1 of 40 First Nation agrees not to block trains Author: SHAWN BERRY Legislature Bureau Publication info: Daily Gleaner [Fredericton, N.B] 07 Jan 2013: A.3. http://remote.libproxy.wlu.ca/login?url=http://search.proquest.com/docview/1266701269?accountid=15090 Abstract: Participants are also concerned about Chief Theresa Spence who stopped eating solid food on Dec. 11 in a bid to secure a meeting between First Nations leaders, Prime Minister Stephen Harper and Gov. Gen. David Johnston to discuss the treaty relationship. Links: null Full Text: A bunch of text about a story here Subject: Railroads; Native North Americans; Meetings; Injunctions Title: First Nation agrees not to block trains Publication title: Daily Gleaner First page: A.3 Publication year: 2013 Publication date: Jan 7, 2013 Year: 2013 Section: Main Publisher: Infomart, a division of Postmedia Network Inc. Place of publication: Fredericton, N.B. Country of publication: Canada Journal subject: GENERAL INTEREST PERIODICALS--UNITED STATES ISSN: 08216983 Source type: Newspapers Language of publication: English Document type: News ProQuest document ID: 1266701269 Document URL: http://remote.libproxy.wlu.ca/login?url=http://search.proquest.com/docview/1266701269?accountid=15090 Copyright: (Copyright (c) 2013 The Daily Gleaner (Fredericton)) Last updated: 2013-01-07 Database: Canadian Newsstand Complete * Simon J. Kiss, PhD Assistant Professor, Wilfrid Laurier University 73 George Street Brantford, Ontario, Canada N3T 2C9 Cell: +1 905 746 7606 Please avoid sending me Word, PowerPoint or Excel attachments. Sending these documents puts pressure on many people to use Microsoft software and helps to deny them any other choice. In effect, you become a buttress of the Microsoft monopoly. To convert to plain text choose Text Only or Text Document as the Save As Type. Your computer may also have a program to convert to PDF format. Select File, then Print. Scroll through available printers and select the PDF converter. Click on the Print button and enter a name for the PDF file when requested. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Integration in R
On 08-01-2013, at 19:51, Naser Jamil jamilnase...@gmail.com wrote: Thanks. But then how to implement condition like 0x1x27? I would be happy to know that. David implemented the condition by multiplying by x[1]x[2] which in a numeric context is 0 when x[1]x[2] en 1 when x[1]=x[2]. That is what your requirement does. The condition 0x1x27 has been split into three separate parts 0x17 0x27 x1x2 The first two can be converted to arguments of adaptIntegrate. The last is inserted into the function definition effectively making the function return 0 when x1x2 which is what your inequality implies. Berend On 8 January 2013 18:41, David Winsemius dwinsem...@comcast.net wrote: Please reply on list. On Jan 8, 2013, at 10:27 AM, Naser Jamil wrote: Hi David, x[2] is the second variable, x2. It comes from the condition 0x1x27. No, it doesn't come from those conditions. It is being grabbed from some x-named object that exists in your workspace. If your limits were 7 in both dimensions, then the code should be: adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(7,7)) # $integral [1] 228.6667 (At this point I was trusting R's calculus abilities more than yours. I wasn't too trusting of mine either, and so tried seeing if Wolfram Alpha would accept this expression: integrate 2/3 (x+y) over 0 x7, 0y7 ; which it did and calculating the decimal expansion of the exact fraction: 686/3 [1] 228.6667 -- David. Thanks. On 8 January 2013 18:11, David Winsemius dwinsem...@comcast.net wrote: On Jan 8, 2013, at 9:43 AM, Naser Jamil wrote: Hi R-users. I'm having difficulty with an integration in R via the package cubature. I'm putting it with a simple example here. I wish to integrate a function like: f(x1,x2)=2/3*(x1+x2) in the interval 0x1x27. To be sure I tried it by hand and got 114.33, but the following R code is giving me 102.6667. --**--**--- library(cubature) f-function(x) { 2/3 * (x[1] + x[2] ) } adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(x[2],7)) What is x[2]? On my machine it was 0.0761, so I obviously got a different answer. -- David Winsemius, MD Alameda, CA, USA David Winsemius, MD Alameda, CA, USA [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Integration in R
On 08-01-2013, at 22:00, Berend Hasselman b...@xs4all.nl wrote: …... David implemented the condition by multiplying by x[1]x[2] which in a numeric context is 0 when x[1]x[2] en 1 when x[1]=x[2]. OOPS!! Reverse the 0 and the 1 in that sentence (TRUE becomes 1 and FALSE becomes 0) Berend __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Integration in R
On Jan 8, 2013, at 1:07 PM, Berend Hasselman wrote: On 08-01-2013, at 22:00, Berend Hasselman b...@xs4all.nl wrote: …... David implemented the condition by multiplying by x[1]x[2] which in a numeric context is 0 when x[1]x[2] en 1 when x[1]=x[2]. OOPS!! Reverse the 0 and the 1 in that sentence (TRUE becomes 1 and FALSE becomes 0) I didn't worry about which direction was used for the inequality in this case because of consideration of symmetry. (Thinking about it now, I still think I managed to choose the correct direction.) Obviously such a cavalier attitude should not be carried over to the general situation. -- David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Integration in R
On Jan 8, 2013, at 1:31 PM, David Winsemius wrote: On Jan 8, 2013, at 1:07 PM, Berend Hasselman wrote: On 08-01-2013, at 22:00, Berend Hasselman b...@xs4all.nl wrote: …... David implemented the condition by multiplying by x[1]x[2] which in a numeric context is 0 when x[1]x[2] en 1 when x[1]=x[2]. OOPS!! Reverse the 0 and the 1 in that sentence (TRUE becomes 1 and FALSE becomes 0) I didn't worry about which direction was used for the inequality in this case because of consideration of symmetry. (Thinking about it now, I still think I managed to choose the correct direction.) Obviously such a cavalier attitude should not be carried over to the general situation. I hit the send button by mistake. (I originally misread Berend's correction.) I'm only posting this to make clear that it was myself, and not Berend, that I was accusing of having been cavalier. -- David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] tm: custom reader for readPlain
Le mardi 08 janvier 2013 à 15:56 -0500, Simon Kiss a écrit : Hello: I have a series of newspaper articles from a Canadian newspaper database (Canadian Newsstand) that look just like below. I've read through this vignette (http://cran.r-project.org/web/packages/tm/vignettes/extensions.pdf) about creating a custom reader to extract meta-data, but I can't understand how to apply this in the context of a text document, rather than in the tabular format as in the vignette. You can see there's all kinds of valuable information in each document -Author, page number, publication year, section, publication title Can anyone provide some suggestions to someone unfamiliar with the tm package as to how to go about creating a custom reader for this situation? You should create a reader function that takes as an input the text content you pasted at the end of your messages, parses it as appropriate, and returns a PlainTextDocument. The information can be set using the meta() function on the document object before returning it. You can see how this process works by looking at the readFactivaHTML.R file from my tm.plugin.factiva package, and probably from other packages too (do not use readFactivaXML.R as it uses a method that only works for XML input). Of course, parsing the input will take some work, but it shouldn't be too hard if you split each line into a field identifier (the part before :) and the value of the field, and create a character vector from that. An information you did not give us is how are distributed the different articles you need to import. If they are each in a separate files, you can adapt DirSource() from tm so that it calls your reader function on each file. If they are in one file, you need to create a custom source that will read the file, split it and call the reader function on the part corresponding to each article; this latter way is illustrated by the HTML part of the FactivaSource.R file (again, skip the XML part). Finally, maybe you can extract the articles in a different format, ideally in XML, which is easier to use? Or maybe this newspaper is available on Factiva, in which case my package will work for you? Hope this helps Yours truly, Simon Kiss Document 1 of 40 First Nation agrees not to block trains Author: SHAWN BERRY Legislature Bureau Publication info: Daily Gleaner [Fredericton, N.B] 07 Jan 2013: A.3. http://remote.libproxy.wlu.ca/login?url=http://search.proquest.com/docview/1266701269?accountid=15090 Abstract: Participants are also concerned about Chief Theresa Spence who stopped eating solid food on Dec. 11 in a bid to secure a meeting between First Nations leaders, Prime Minister Stephen Harper and Gov. Gen. David Johnston to discuss the treaty relationship. Links: null Full Text: A bunch of text about a story here Subject: Railroads; Native North Americans; Meetings; Injunctions Title: First Nation agrees not to block trains Publication title: Daily Gleaner First page: A.3 Publication year: 2013 Publication date: Jan 7, 2013 Year: 2013 Section: Main Publisher: Infomart, a division of Postmedia Network Inc. Place of publication: Fredericton, N.B. Country of publication: Canada Journal subject: GENERAL INTEREST PERIODICALS--UNITED STATES ISSN: 08216983 Source type: Newspapers Language of publication: English Document type: News ProQuest document ID: 1266701269 Document URL: http://remote.libproxy.wlu.ca/login?url=http://search.proquest.com/docview/1266701269?accountid=15090 Copyright: (Copyright (c) 2013 The Daily Gleaner (Fredericton)) Last updated: 2013-01-07 Database: Canadian Newsstand Complete * Simon J. Kiss, PhD Assistant Professor, Wilfrid Laurier University 73 George Street Brantford, Ontario, Canada N3T 2C9 Cell: +1 905 746 7606 Please avoid sending me Word, PowerPoint or Excel attachments. Sending these documents puts pressure on many people to use Microsoft software and helps to deny them any other choice. In effect, you become a buttress of the Microsoft monopoly. To convert to plain text choose Text Only or Text Document as the Save As Type. Your computer may also have a program to convert to PDF format. Select File, then Print. Scroll through available printers and select the PDF converter. Click on the Print button and enter a name for the PDF file when requested. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal,
Re: [R] tm: custom reader for readPlain
Hmm...Thanks a lot! that seems like really useful stuff. It might be a bit over my head, but I'll look into it. The articles are all contained in one text file, but they are clearly delimited (either by a series of ) or the regular expression ^Document.[0-9]. Simon On 2013-01-08, at 4:44 PM, Milan Bouchet-Valat wrote: Le mardi 08 janvier 2013 à 15:56 -0500, Simon Kiss a écrit : Hello: I have a series of newspaper articles from a Canadian newspaper database (Canadian Newsstand) that look just like below. I've read through this vignette (http://cran.r-project.org/web/packages/tm/vignettes/extensions.pdf) about creating a custom reader to extract meta-data, but I can't understand how to apply this in the context of a text document, rather than in the tabular format as in the vignette. You can see there's all kinds of valuable information in each document -Author, page number, publication year, section, publication title Can anyone provide some suggestions to someone unfamiliar with the tm package as to how to go about creating a custom reader for this situation? You should create a reader function that takes as an input the text content you pasted at the end of your messages, parses it as appropriate, and returns a PlainTextDocument. The information can be set using the meta() function on the document object before returning it. You can see how this process works by looking at the readFactivaHTML.R file from my tm.plugin.factiva package, and probably from other packages too (do not use readFactivaXML.R as it uses a method that only works for XML input). Of course, parsing the input will take some work, but it shouldn't be too hard if you split each line into a field identifier (the part before :) and the value of the field, and create a character vector from that. An information you did not give us is how are distributed the different articles you need to import. If they are each in a separate files, you can adapt DirSource() from tm so that it calls your reader function on each file. If they are in one file, you need to create a custom source that will read the file, split it and call the reader function on the part corresponding to each article; this latter way is illustrated by the HTML part of the FactivaSource.R file (again, skip the XML part). Finally, maybe you can extract the articles in a different format, ideally in XML, which is easier to use? Or maybe this newspaper is available on Factiva, in which case my package will work for you? Hope this helps Yours truly, Simon Kiss Document 1 of 40 First Nation agrees not to block trains Author: SHAWN BERRY Legislature Bureau Publication info: Daily Gleaner [Fredericton, N.B] 07 Jan 2013: A.3. http://remote.libproxy.wlu.ca/login?url=http://search.proquest.com/docview/1266701269?accountid=15090 Abstract: Participants are also concerned about Chief Theresa Spence who stopped eating solid food on Dec. 11 in a bid to secure a meeting between First Nations leaders, Prime Minister Stephen Harper and Gov. Gen. David Johnston to discuss the treaty relationship. Links: null Full Text: A bunch of text about a story here Subject: Railroads; Native North Americans; Meetings; Injunctions Title: First Nation agrees not to block trains Publication title: Daily Gleaner First page: A.3 Publication year: 2013 Publication date: Jan 7, 2013 Year: 2013 Section: Main Publisher: Infomart, a division of Postmedia Network Inc. Place of publication: Fredericton, N.B. Country of publication: Canada Journal subject: GENERAL INTEREST PERIODICALS--UNITED STATES ISSN: 08216983 Source type: Newspapers Language of publication: English Document type: News ProQuest document ID: 1266701269 Document URL: http://remote.libproxy.wlu.ca/login?url=http://search.proquest.com/docview/1266701269?accountid=15090 Copyright: (Copyright (c) 2013 The Daily Gleaner (Fredericton)) Last updated: 2013-01-07 Database: Canadian Newsstand Complete * Simon J. Kiss, PhD Assistant Professor, Wilfrid Laurier University 73 George Street Brantford, Ontario, Canada N3T 2C9 Cell: +1 905 746 7606 Please avoid sending me Word, PowerPoint or Excel attachments. Sending these documents puts pressure on many people to use Microsoft software and helps to deny them any other choice. In effect, you become a buttress of the Microsoft monopoly. To convert to plain text choose Text Only or Text Document as the Save As Type. Your computer may also have a program to convert to PDF format. Select File, then Print. Scroll through available printers and select the PDF converter. Click on the Print button and enter a name for the PDF file when requested. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read
[R] Correct use of the cluster::daisy function
Hi, I have two groups, and I want to find the dissimiarity between the members of the two groups. Since I have mixed level variables on the members, I opt for the daisy function in the cluster package. Let's pretend that the following represent my groups: x - data.frame(sex=factor(c(1,0,0,1,0,1), levels=0:1, labels=c('Male','Female'), ordered=FALSE), age=c(31,28,25,30,29,28), x=c(1,0,0,1,0,1) ) y - data.frame(sex=factor(c(1,0,0), levels=0:1, labels=c('Male','Female'), ordered=FALSE), age=c(27,30,30), x=c(0,1,0) ) Now I'm thinking that I take the first member of group x and compare to the first member of group y. I bind the two members together since daisy computes dissimilarities row-wise. Like this: library(cluster) daisy(rbind(x[1, ], y[1, ])) Is this is correct thinking? If so, I can build a nested loop to calculate the full between group dissimilarity matrix. Like this (ignoring the performance issues of the for loop): m - matrix(nrow=nrow(x), ncol=nrow(y)) for(i in 1:nrow(x)){ for(j in 1:nrow(y)){ m[i,j]- daisy(rbind(x[i, ], y[j, ]))[1] } } I get a matrix with group x along the rows and group y along the columns. m Now, looking at the first row of the result matrix; is it meaningful to say that subject x1 is 'closest' to subject y1 and y3 as they have the lowest dissimilarity coefficient? And that x2 is closest to y3, by the same logic? Etc... Thanks a lot in advnce! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] try()-function does not catch error in BATCH-job if Matrix is loaded
The work-around was actually put in plae prior to the release of R 2.15.2, so updating your R to the current released version will resolve this. Best, luke On Mon, 7 Jan 2013, luke-tier...@uiowa.edu wrote: This is due to long-staning issue in methods internals, which are involved because loading Matrix shadows base::mean with Matrix::mean. A work-around has been in place in R_devel for some time; a proper fix may come at some point in the future. So if your real code doesn't need the moficied mean from Matrix you can use base::mean; otherwise you will have to use an R-devel snapshot. Best, luke On Mon, 7 Jan 2013, Sarah Brockhaus wrote: Hello, In my simulation I use the try()-function to catch possible errors when fitting models. I run the simulationon a Linux-server using the command R CMD BATCH nameOfFile.R . When executing the code as batch-job I get the problem that the execution is halted without giving an error message. But when I run the code interactivly the error is catched by try() as it would be expected. The problem is somewhat strange as it only occurs when the code is executed as a batch-job and when the package Matrix is loaded. I wrote a small example reproducing the error. (In my code the error occurs in mgcv:::fixDependence, which looks like the code I'm using below in order to get a small reproducible example. I realized that the code makes no sense...) ## library(Matrix) R - matrix(abs(rnorm(25)), 5, 5) r0 - r - nrow(R) # while-loop produces error that should be catched by the function try() try( while (mean(R[r0:r, r0:r]) 0) r0 - r0 - 1 ) # so Hello should be printed print(Hello) ## I use R version 2.15.1 (2012-06-22) on a x86_64-pc-linux-gnu (64-bit) platform. I would be grateful for help. Best regards, Sarah Brockhaus PHD-student Department of Statistics University of Munich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Luke Tierney Chair, Statistics and Actuarial Science Ralph E. Wareham Professor of Mathematical Sciences University of Iowa Phone: 319-335-3386 Department of Statistics andFax: 319-335-3017 Actuarial Science 241 Schaeffer Hall email: luke-tier...@uiowa.edu Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] random effects model
Hi Thanks a lot, the corstr exchangeabledoes work. Didn't strike to me for so long. Does the AIC value come out with the gee output? By reference, I meant reference to a easy-read paper or web address that can give me knowledge about implications of missing data. Ta. On 1/8/13, arun kirshna [via R] ml-node+s789695n4654902...@n4.nabble.com wrote: HI, BP.stack5 is the one without missing values. na.omit(). Otherwise, I have to use the option na.action=.. in the ?geese() statement You need to read about the correlation structures. IN unstructured option, more number of parameters needs to be estimated, In repeated measures design, when the underlying structure is not known, it would be better to compare using different options (exchangeable is similar to compound symmetry) and select the one which provide the least value for AIC or BIC. Have a look at http://stats.stackexchange.com/questions/21771/how-to-perform-model-selection-in-gee-in-r It's not clear to me reference to write about missing values. A.K. - Original Message - From: Usha Gurunathan usha.nat...@gmail.com To: arun smartpink...@yahoo.com Cc: Sent: Monday, January 7, 2013 6:12 PM Subject: Re: [R] random effects model Hi AK 2)I shall try putting exch. and check when I get home. Btw, what is BP.stack5? is it with missing values or only complete cases? I guess I am still not clear about the unstructured and exchangeable options, as in which one is better. 1)Rgding the summary(p): NA thing, I tried putting one of my gee equation. Can you suggest me a reference to write about missing values and the implications for my results Thanks. On 1/8/13, arun smartpink...@yahoo.com wrote: HI, Just to add: fit3-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack5,family=binomial,corstr=exch,scale.fix=TRUE) #works summary(fit3)$mean[p] # p #(Intercept) 0. #MaternalAge40.49099242 #MaternalAge50.04686295 #time21 0.86164351 #MaternalAge4:time21 0.59258221 #MaternalAge5:time21 0.79909832 fit4-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack5,family=binomial,corstr=unstructured,scale.fix=TRUE) #when the correlation structure is changed to unstructured #Error in `contrasts-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : # contrasts can be applied only to factors with 2 or more levels #In addition: Warning message: #In is.na(rows) : is.na() applied to non-(list or vector) of type 'NULL' Though, it works with data(Ohio) fit1-geese(resp~age+smoke+age:smoke,id=id,data=ohio1,family=binomial,corstr=unstructured,scale.fix=TRUE) summary(fit1)$mean[p] # p #(Intercept) 0. #age-10.60555454 #age0 0.45322698 #age1 0.01187725 #smoke1 0.86262269 #age-1:smoke1 0.17239050 #age0:smoke1 0.32223942 #age1:smoke1 0.36686706 By checking: with(BP.stack5,table(MaternalAge,time)) # time #MaternalAge 14 21 #3 1104 864 # 4 875 667 # 5 67 53 #less number of observations BP.stack6 - BP.stack5[order(BP.stack5$CODEA, BP.stack5$time),] head(BP.stack6) # very few IDs with MaternalAge==5 # X CODEA Sex MaternalAge Education Birthplace AggScore IntScore #1493 3.1 3 2 3 3 100 #3202 3.2 3 2 3 3 100 #1306 7.1 7 2 4 6 100 #3064 7.2 7 2 4 6 100 #18.1 8 2 4 4 100 #2047 8.2 8 2 4 4 100 # Categ time Obese Overweight hibp #1493 Overweight 14 0 00 #3202 Overweight 21 0 10 #1306 Obese 14 0 00 #3064 Obese 21 1 10 #1Normal 14 0 00 #2047 Normal 21 0 00 BP.stack7-BP.stack6[BP.stack6$MaternalAge!=5,] BP.stack7$MaternalAge-factor(as.numeric(as.character(BP.stack7$MaternalAge) fit5-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack7,family=binomial,corstr=unstructured,scale.fix=TRUE) #Error in `contrasts-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : # contrasts can be applied only to factors with 2 or more levels with(BP.stack7,table(MaternalAge,time)) #It looks like the combinations are still there # time #MaternalAge 14 21 # 3 1104 864 # 4 875 667 It works also with corstr=ar1. Why do you gave the option unstructured? A.K. - Original Message - From: rex2013 usha.nat...@gmail.com To: r-help@r-project.org Cc: Sent: Monday, January 7, 2013 6:15 AM Subject: Re: [R] random effects model Hi A.K Below is the comment I get, not sure why. BP.sub3 is the stacked data without the missing
Re: [R] MA process in panels
Dear R users, after some time I have picked up working on this dataset again. I have found a way which produces reasonable results but I am not sure whether it is truly the correct way to go. I am still looking for a way to estimate a panel ARMA(1,1) with cross-sectional fixed effects (later I wish to include time fixed effects as well -- but for the sake of simplicity, I drop these for now). Since I have a large panel (see above), I wish to regress demeaned time series. After trying out a lot of different approaches (mainly using the nlme-package and the plm-package, but also trying to specify a maxLik function) without figuring out a way to estimate the ARMA, I began thinking about the arima function -- and how it could help me to solve my issue. Please find attached my thoughts on this with an example based on the Grunfeld dataset (unfortunately I did not find a larger, more appropriate panel dataset). If my approach is incorrect, I hope these thoughts nevertheless help some more advanced R users to find a proper way to estimate panel ARMAs. Any comment is highly appreciated! Thank you very much again, Philipp Grueber # Data Import library(plm) library(lmtest) data(Grunfeld) tslag - function(x, d=l) { x - as.vector(x) n - length(x) c(rep(NA,d),x)[1:n] } #In a final dataset, lagging should be done for each cross-section. For the sake of comparability of arima(...,xreg=Grunfeld$inv1,order=c(0,0,0)) with arima(...,xreg=0,order=c(1,0,0)), let's do this: Grunfeld$inv1-tslag(Grunfeld$inv,d=1) # In order to avoid differing results due to heterogenous handling of NAs: Grunfeld-Grunfeld[-1,] # Create dummy variables for comparison: mat_i-as.data.frame.matrix(table(cbind.data.frame(N=1:nrow(Grunfeld),T=Grunfeld$firm)))[,-1] #Now, these two should be the same, but there seems to be a convergence problem (not enough data?). arima_0-arima(x=Grunfeld$inv,xreg=cbind(Grunfeld$value,Grunfeld$capital,mat_i),transform.pars=F,include.mean=T,order=c(1,0,0)) coef(arima_0)[1:4] arima_1-arima(x=Grunfeld$inv,xreg=cbind(Grunfeld$inv1,Grunfeld$value,Grunfeld$capital,mat_i),transform.pars=F,include.mean=T,order=c(0,0,0)) coef(arima_1)[1:4] #I can show that they are not the same but closer using better data: a-arima.sim(n = 10001,model=list(ar=0.5,order=c(1,0,0))) b-a[-1] c-tslag(a,d=1)[-1] coef(lm(b~c)) coef(arima(x=b,include.mean=T,order=c(1,0,0))) coef(arima(x=b,xreg=c,include.mean=T,order=c(0,0,0))) coef(lm(b~c-1)) coef(arima(x=b,include.mean=F,order=c(1,0,0))) coef(arima(x=b,include.mean=F,xreg=c,order=c(0,0,0))) #Probably, this does not matter as for an ARMA(1,0) model, I would prefer to use OLS and thus, the lm function anyway! #Why do I want to know? Because only if these two (arima_0 and arima_1) are the same, I would be able to use the cross-sectionally lagged series of inv, inv_1 as the lagged endogenous variable, right? Grunfeld$inv_1-c() for (i in unique(Grunfeld$firm)){ Grunfeld$inv_1[Grunfeld$firm==i]-tslag(Grunfeld$inv[Grunfeld$firm==i],d=1) } #note: for the sake of comparability, I will not do so in the following. # Create demeaned series y_i-Grunfeld$inv-ave(Grunfeld$inv,index=Grunfeld$firm) y1_i-Grunfeld$inv1-ave(Grunfeld$inv1,index=Grunfeld$firm) x1_i-Grunfeld$value-ave(Grunfeld$value,index=Grunfeld$firm) x2_i-Grunfeld$capital-ave(Grunfeld$capital,index=Grunfeld$firm) y_it1-y_i-ave(y_i,index=Grunfeld$year) y1_it1-y1_i-ave(y1_i,index=Grunfeld$year) x1_it1-x1_i-ave(x1_i,index=Grunfeld$year) x2_it1-x2_i-ave(x2_i,index=Grunfeld$year) #In order to simplify my calculation, I now wish to use demeaned data. I am able to show that these two are the same: arima_a-arima(x=Grunfeld$inv,xreg=cbind(Grunfeld$value,Grunfeld$capital,Grunfeld$inv1,mat_i),transform.pars=F,include.mean=T,order=c(0,0,0)) coef(arima_a)[1:4] arima_b-arima(x=y_i,xreg=cbind(y1_i,x1_i,x2_i),transform.pars=F,include.mean=F,order=c(0,0,0)) coef(arima_b)[1:3] #Even though I believe I do not need a constant term (due to demeaning), here's the test: arima_c-arima(x=y_i,xreg=cbind(y1_i,x1_i,x2_i),transform.pars=F,include.mean=T,order=c(0,0,0)) coef(arima_c)[1:4] #Related with the above question is the fact, that these coefficients are different from the following ones: arima_d-arima(x=Grunfeld$inv,xreg=cbind(Grunfeld$value,Grunfeld$capital,mat_i),transform.pars=F,include.mean=T,order=c(1,0,0)) coef(arima_d)[1:4] arima_e-arima(x=y_i,xreg=cbind(x1_i,x2_i),transform.pars=F,include.mean=F,order=c(1,0,0)) coef(arima_e)[1:3] arima_f-arima(x=y_i,xreg=cbind(x1_i,x2_i),transform.pars=F,include.mean=T,order=c(1,0,0)) coef(arima_f)[1:4] #Finally, I wish to complete my ARMA by introducing the MA process. arima_g-arima(x=Grunfeld$inv,xreg=cbind(Grunfeld$value,Grunfeld$capital,Grunfeld$inv1,mat_i),transform.pars=F,include.mean=T,order=c(0,0,1)) coef(arima_g)[1:5] arima_h-arima(x=y_i,xreg=cbind(y1_i,x1_i,x2_i),transform.pars=F,include.mean=F,order=c(0,0,1)) coef(arima_h)[1:4]
[R] Not getting enough tweets in twitteR
Hi all, I am trying to download as many tweets as possible (say 1000). The documentation states that the limit is 3200. However when I run police-userTimeline('@nswpolice',n=1000) it returns random amounts. When I ran it today I got 144, yesterday it was around 300. Any thoughts? Thanks, Sachin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [twitteR-users] Not getting enough tweets in twitteR
Is there a way to limit the tweets by date so that I can do multiple calls to get the required number of tweets? For example, something in the lines of: police-userTimeline('@**nswpolice',n=1000,since=01/12/2012, to=01/01/2013) #first call police-userTimeline('@**nswpolice',n=1000,since=01/11/2012, to=01/12/2012) #second call On Wed, Jan 9, 2013 at 11:29 AM, Jeff Gentry gen...@hexdump.org wrote: On Wed, 9 Jan 2013, Sachinthaka Abeywardana wrote: I am trying to download as many tweets as possible (say 1000). The documentation states that the limit is 3200. However when I run police-userTimeline('@**nswpolice',n=1000) it returns random amounts. When I ran it today I got 144, yesterday it was around 300. The most likely explanation is that you're bumping into the API's limitation for time. The Twitter API typically will only give you results that go back for a few days, regardless of what you request. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R2html and Blackboard LMS
Dear R People: Has anyone used R2HTML in web files that were on the Blackboard LMS, please? I'm starting to do these but wanted to know if there were any potential pitfalls. Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R2html and Blackboard LMS : solved
Everything is ok on Firefox, IE, and iPad. Thanks, Erin On Tue, Jan 8, 2013 at 7:58 PM, Erin Hodgess erinm.hodg...@gmail.com wrote: Dear R People: Has anyone used R2HTML in web files that were on the Blackboard LMS, please? I'm starting to do these but wanted to know if there were any potential pitfalls. Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] random effects model
HI, In your dataset, the exchangeable or compound symmetry may work as there are only two levels for time. In experimental data analysis involving a factor time with more than 2 levels, randomization of combination of levels of factors applied to the subject/plot etc. gets affected as time is unidirectional. I guess your data is observational, and with two time levels, it may not hurt to use CS as option, though, it would help if you check different options. In the link I sent previously, QIC was used. http://stats.stackexchange.com/questions/577/is-there-any-reason-to-prefer-the-aic-or-bic-over-the-other I am not sure whether AIC/BIC is better than QIC or viceversa. You could sent email to the maintainer of geepack (Jun Yan jun@uconn.edu). Regarding the reference links, You can check this link www.jstatsoft.org/v15/i02/paper . Other references are in the paper. 4.3. Missing values (waves) In case of missing values, the GEE estimates are consistent if the values are missing com- pletely at random (Rubin 1976). The geeglm function assumes by default that observations are equally separated in time. Therefore, one has to inform the function about different sep- arations if there are missing values and other correlation structures than the independence or exchangeable structures are used. The waves arguments takes an integer vector that indicates that two observations of the same cluster with the values of the vector of k respectively l have a correlation of rkl . Hope it helps. A.K. - Original Message - From: rex2013 usha.nat...@gmail.com To: r-help@r-project.org Cc: Sent: Tuesday, January 8, 2013 5:29 PM Subject: Re: [R] random effects model Hi Thanks a lot, the corstr exchangeabledoes work. Didn't strike to me for so long. Does the AIC value come out with the gee output? By reference, I meant reference to a easy-read paper or web address that can give me knowledge about implications of missing data. Ta. On 1/8/13, arun kirshna [via R] ml-node+s789695n4654902...@n4.nabble.com wrote: HI, BP.stack5 is the one without missing values. na.omit(). Otherwise, I have to use the option na.action=.. in the ?geese() statement You need to read about the correlation structures. IN unstructured option, more number of parameters needs to be estimated, In repeated measures design, when the underlying structure is not known, it would be better to compare using different options (exchangeable is similar to compound symmetry) and select the one which provide the least value for AIC or BIC. Have a look at http://stats.stackexchange.com/questions/21771/how-to-perform-model-selection-in-gee-in-r It's not clear to me reference to write about missing values. A.K. - Original Message - From: Usha Gurunathan usha.nat...@gmail.com To: arun smartpink...@yahoo.com Cc: Sent: Monday, January 7, 2013 6:12 PM Subject: Re: [R] random effects model Hi AK 2)I shall try putting exch. and check when I get home. Btw, what is BP.stack5? is it with missing values or only complete cases? I guess I am still not clear about the unstructured and exchangeable options, as in which one is better. 1)Rgding the summary(p): NA thing, I tried putting one of my gee equation. Can you suggest me a reference to write about missing values and the implications for my results Thanks. On 1/8/13, arun smartpink...@yahoo.com wrote: HI, Just to add: fit3-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack5,family=binomial,corstr=exch,scale.fix=TRUE) #works summary(fit3)$mean[p] # p #(Intercept) 0. #MaternalAge4 0.49099242 #MaternalAge5 0.04686295 #time21 0.86164351 #MaternalAge4:time21 0.59258221 #MaternalAge5:time21 0.79909832 fit4-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack5,family=binomial,corstr=unstructured,scale.fix=TRUE) #when the correlation structure is changed to unstructured #Error in `contrasts-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : # contrasts can be applied only to factors with 2 or more levels #In addition: Warning message: #In is.na(rows) : is.na() applied to non-(list or vector) of type 'NULL' Though, it works with data(Ohio) fit1-geese(resp~age+smoke+age:smoke,id=id,data=ohio1,family=binomial,corstr=unstructured,scale.fix=TRUE) summary(fit1)$mean[p] # p #(Intercept) 0. #age-1 0.60555454 #age0 0.45322698 #age1 0.01187725 #smoke1 0.86262269 #age-1:smoke1 0.17239050 #age0:smoke1 0.32223942 #age1:smoke1 0.36686706 By checking: with(BP.stack5,table(MaternalAge,time)) # time #MaternalAge 14 21 # 3 1104 864 # 4 875 667 # 5 67 53 #less number of observations BP.stack6 - BP.stack5[order(BP.stack5$CODEA, BP.stack5$time),] head(BP.stack6) # very few IDs with MaternalAge==5 # X CODEA Sex MaternalAge
[R] another R2HTML question
Hello again. I just started with the R2HTML and it's REALLY slick! I do have one question, please: It shows the output, but not the command which generated the output. How do I get the command, please? For instance, when I do: summary(etch.aov) it shows the nice output but no summary command. Any suggestions on how to put that in, please? Thanks so much in advance, Sincerely, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R2html and Blackboard LMS : solved
Use the echo = TRUE in the HTMLStart function. Sorry. On Tue, Jan 8, 2013 at 8:21 PM, Erin Hodgess erinm.hodg...@gmail.com wrote: Everything is ok on Firefox, IE, and iPad. Thanks, Erin On Tue, Jan 8, 2013 at 7:58 PM, Erin Hodgess erinm.hodg...@gmail.com wrote: Dear R People: Has anyone used R2HTML in web files that were on the Blackboard LMS, please? I'm starting to do these but wanted to know if there were any potential pitfalls. Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Applying a user-defined function
Hello List, Last time, Arun's following solution worked to create 3 new columns (1,3,5). Now how would I tweak this function to create corresponding (additional) columns (7,8,9) of mode factor (levels = 1,2,3,4,5)? Thanks for your continued support. Pradip ### cut and paste from the reproducible example CutQuintiles - function( x) { cut (x,quantile (x, (0:5/5)),include.lowest=TRUE) } #apply the CutQuintile () on every odd-numbered columns of the test1 data frame test1$newcols - sapply(test1 [, seq (1,6,2)], CutQuintiles) # name 3 new columns based on the odd-numbered columns names(test1$newcols) - paste (names(test1 [, seq (1,6,2)]), _cat) ## Reproducible Example test1 - read.table (text= State,ObtMj_P,ObtMj_SE,ExpPrevMed_P,ExpPrevMed_SE,ParMon_P,ParMon_SE Alabama,49.60,1.37,80.00,0.91,12.10,0.68 Alaska,55.00,1.41,81.80,1.08,12.40,0.90 Arizona,52.50,1.56,79.60,1.20,15.80,1.08 Arkansas,50.50,1.22,78.00,0.78,12.80,0.72 California,51.10,0.65,80.50,0.53,13.00,0.41 Colorado,55.10,1.26,81.70,1.03,12.10,0.72 Connecticut,56.30,1.28,85.00,0.93,14.60,0.77 Delaware,53.60,1.30,79.50,1.04,14.70,0.97 District of Columbia,53.50,1.22,76.20,1.03,14.30,1.13 Florida,52.70,0.67,78.90,0.52,14.10,0.45 Georgia,52.50,1.15,79.30,1.02,15.90,0.98 Hawaii,49.40,1.33,83.80,1.12,16.00,1.06 Idaho,48.30,1.23,82.40,0.99,11.90,0.74 Illinois,52.70,0.63,81.00,0.46,13.60,0.40 Indiana,49.60,1.16,80.90,0.91,12.60,0.82 Iowa,46.30,1.37,82.10,1.01,13.60,0.87 Kansas,44.30,1.43,79.20,0.98,12.90,0.79 Kentucky,52.90,1.37,78.70,1.05,14.60,0.98 Louisiana,49.70,1.23,76.80,1.06,14.50,0.76 Maine,55.60,1.44,82.90,0.93,16.70,0.83 Maryland,53.90,1.46,83.60,0.95,14.00,0.80 Massachusetts,55.40,1.41,81.00,1.15,14.70,0.80 Michigan,52.40,0.62,80.50,0.47,15.00,0.43 Minnesota,51.50,1.20,84.40,0.87,14.40,0.86 Mississippi,43.20,1.14,76.60,0.91,12.30,0.78 Missouri,48.70,1.20,80.30,0.90,13.70,0.12 Montana,56.40,1.16,83.70,0.95,12.10,0.68 Nebraska,45.70,1.51,83.40,0.95,12.40,0.90 Nevada,54.20,1.17,80.60,1.07,15.80,1.08 New Hampshire,56.10,1.30,83.30,0.93,12.80,0.72 New Jersey,53.20,1.45,83.70,0.95,13.00,0.41 New Mexico,57.60,1.34,78.90,1.03,12.10,0.72 New York,53.70,0.67,82.60,0.48,14.60,0.77 North Carolina,52.20,1.26,81.90,0.84,14.70,0.97 North Dakota,48.60,1.34,84.20,0.88,14.30,1.13 Ohio,50.90,0.61,82.70,0.49,14.10,0.45 Oklahoma,47.20,1.42,78.80,1.33,15.90,0.98 Oregon,54.00,1.35,80.60,1.14,16.00,1.06 Pennsylvania,53.00,0.63,79.90,0.47,11.90,0.74 Rhode Island,57.20,1.20,79.50,1.02,13.60,0.40 South Carolina,50.50,1.21,79.50,0.95,12.60,0.82 South Dakota,43.40,1.30,81.70,1.05,13.60,0.87 Tennessee,48.90,1.35,78.40,1.35,12.90,0.79 Texas,48.70,0.62,79.00,0.48,14.60,0.98 Utah,42.00,1.49,85.00,0.93,14.50,0.76 Vermont,58.70,1.24,83.70,0.84,16.70,0.83 Virginia,51.80,1.18,82.00,1.04,14.00,0.80 Washington,53.50,1.39,84.10,0.96,14.70,0.80 West Virginia,52.80,1.07,79.80,0.93,15.00,0.43 Wisconsin,49.90,1.50,83.50,1.02,14.40,0.86 Wyoming,49.20,1.29,82.00,0.85,12.30,0.78 , sep=,, row.names='State', header=TRUE, as.is=TRUE) # change names () to lower case names (test1) - tolower (names (test1)) #Write a cut/quantile function to apply on different columns of the data frame CutQuintiles - function( x) { cut (x,quantile (x, (0:5/5)),include.lowest=TRUE) } #apply the CutQuintile () on every odd-numbered columns of the test1 data frame test1$newcols - sapply(test1 [, seq (1,6,2)], CutQuintiles) # name 3 new columns based on the odd-numbered columns names(test1$newcols) - paste (names(test1 [, seq (1,6,2)]), _cat) dim (test1) options (width=100) test1 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem getting loess tricubic weights
As this does not seem to have been answered... I believe you may misunderstand how loess works. The tricube weights are part of the smoothing algorithm and change with each local fit, not fixed weights for observations, which is what the weights argument provides (and initially multiplies the tricube weight, IIRC). I suggest you consult ?predict.loess to get standard deviations of fitted values at existing or new points. -- Bert On Tue, Jan 8, 2013 at 12:57 AM, Joyce Lin joyceli...@gmail.com wrote: Hi I am trying to get the tricube weights from the loess outputs as I need to calculate an error function which requires the weight. So I have used the following example from the R: cars.lo - loess(dist ~ speed, cars, span=0.5, degree=1, family=symmetric) Then i try to get the weights: cars.lo$weights [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 The results are all 1 so i dont think that the tricube weighting are set. May I know what other parameters do i need to tweak to set the weights to tricube weights? Thank you. -- Best regards Joyce Lin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.