Re: [R] iconv question: SQL Server 2005 to R
On 09/10/2013 10:37, Milan Bouchet-Valat wrote:
Le mardi 08 octobre 2013 à 16:02 -0700, Ira Sharenow a écrit :
A colleague is sending me quite a few files that have been saved with MS SQL Server 2005. I am using R 2.15.1 on Windows 7. I am trying to read in the files using standard techniques. Although the file has a csv extension, when I go to Excel or WordPad and do Save As, I see that it is Unicode Text. Notepad indicates that the encoding is Unicode. Right now I have to do a few things from within Excel (such as Text to Columns) and eventually save as a true csv file before I can read it into R and then use it. Is there an easy way to solve this from within R? I am also open to easy SQL Server 2005 solutions. I tried the following from within R:

testDF = read.table("Info06.csv", header = TRUE, sep = ",")
testDF2 = iconv(x = testDF, from = "Unicode", to = "")
Error in iconv(x = testDF, from = "Unicode", to = "") :
  unsupported conversion from 'Unicode' to '' in codepage 1252

# The next line did not produce an error message
testDF3 = iconv(x = testDF, from = "UTF-8", to = "")
testDF3[1:6, 1:3]
Error in testDF3[1:6, 1:3] : incorrect number of dimensions

# The next line did not produce an error message
testDF4 = iconv(x = testDF, from = "macroman", to = "")
testDF4[1:6, 1:3]
Error in testDF4[1:6, 1:3] : incorrect number of dimensions

Encoding(testDF3)
[1] "unknown"
Encoding(testDF4)
[1] "unknown"

These are the first few lines as shown in WordPad:

Date,StockID,Price,MktCap,ADV,SectorID,Days,A1,std1,std2
2006-01-03 00:00:00.000,@Stock1,2.53,467108197.38,567381.1,4,133.14486997089,-0.0162107939626307,0.0346283580367959,0.0126471695454834
2006-01-03 00:00:00.000,@Stock2,1.3275,829803070.531114,6134778.93292,5,124.632223896458,0.071513138376339,0.0410694546850102,0.0172091268025929

What's the actual problem? You did not state any. Do you get accented characters that are not printed correctly after importing the file?
In the two lines above it does not look like there would be any non-ASCII characters in this file, so encoding would not matter.

It is most likely UCS-2. That has embedded NULs, so the encoding does matter. All 8-bit encodings extend ASCII: others do not, in general.

-- Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
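If the file really is UCS-2/UTF-16 ("Unicode Text" in Windows parlance), it can usually be read directly by telling R the encoding. A sketch, reusing the filename from the original post and assuming a little-endian encoding (try "UTF-16LE" or "UCS-2LE"):

```r
# Read a "Unicode Text" export directly; fileEncoding converts the
# file's encoding to the native encoding while reading.
testDF <- read.table("Info06.csv", header = TRUE, sep = ",",
                     fileEncoding = "UTF-16LE")
str(testDF)
```

Note that iconv() operates on character vectors, not whole data frames, which is why testDF3 and testDF4 above came back without dimensions.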
Re: [R] Small p from binomial probability function.
Sounds like you want a 95% binomial confidence interval: binom.test(N, P) will compute this for you, and you can get the bounds directly with binom.test(N, P)$conf.int. Actually, binom.test computes a two-sided confidence interval, which corresponds roughly to the 2.5 and 97.5 percentages in your approach. It doesn't give you the 50% point either, but I don't think that's a meaningful quantity with a two-sided test.

Hope this helps,
Stefan

On 9 Oct 2013, at 15:53, Benjamin Ward (ENV) b.w...@uea.ac.uk wrote:

I was given some code that uses the R function pbinom:

p <- mut * t
sumprobs <- pbinom(N, B, p) * 1000

which gives the output of a probability as a percentage like 5, 50, 95. What the code currently does is find me the values of t I need, by using the above two code lines in a loop; each iteration it increases t by one and runs the two lines. When sumprobs equals 5, it records the value of t, then again when sumprobs is equal to 50, and again when sumprobs is equal to 95 - giving me three t values. This is not an efficient way of doing this if t is large. Is it possible to rearrange pbinom so it gives me the small p (made of mut*t) as the result of plugging in the sumprobs instead, and is there an R function that already does this? Since pbinom is the binomial probability function, I suppose the question is - in more mathematical terminology - can I change this code so that instead of calculating the probability of N successes given the number of trials and the probability of a single success, I calculate the probability of a single success from the probability of N successes and the number of trials? Can R do this for me, so that I plug in 5, 50, and 95 and get the small p out?
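Since pbinom(N, B, p) is monotonically decreasing in p, the original question can also be answered directly by numerical root-finding rather than looping over t. A sketch (the values of N and B below are made up for illustration):

```r
# Find the per-trial probability p at which the binomial CDF
# pbinom(N, B, p) hits a given target, by numerical root-finding.
invert_pbinom <- function(N, B, target) {
  uniroot(function(p) pbinom(N, B, p) - target,
          interval = c(1e-9, 1 - 1e-9), tol = 1e-10)$root
}

p50 <- invert_pbinom(N = 5, B = 100, target = 0.5)
pbinom(5, 100, p50)  # approximately 0.5
```

With p = mut * t as in the post, the corresponding t is simply the recovered p divided by mut.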
Re: [R] Using cpquery function from bnlearn package inside loop
Dear Ryan,

On 9 October 2013 21:26, Ryan Morrison ryan.r.morri...@me.com wrote:
I'm attempting to use the bnlearn package to calculate conditional probabilities, and I'm running into a problem when the cpquery function is used within a loop. I've created an example, shown below, using data included with the package. When using the cpquery function in a loop, a variable created in the loop (evi in the example) is not recognized by the function. I receive the error:
Error in parse(text = evi) : object 'evi' not found
[snip]

Based on the second example you emailed me off-list, it appears to be a scoping problem; that's why the same code works if it's not inside a function. I will try to debug this soon, but I am not an expert in R parsing mechanisms so it will take some time. In the meantime, you can use cpquery(..., method = "lw") instead of the default cpquery(..., method = "ls") if your query looks like the one in the example. The former does not rely on unevaluated expressions, but takes the conditioning values as a list, and it should work regardless. However, if you do so I suggest you install the latest bugfix snapshot from bnlearn.com to avoid a few other bugs in cpquery(..., method = "lw").

Cheers, Marco
-- Marco Scutari, Ph.D.
Research Associate, Genetics Institute (UGI)
University College London (UCL), United Kingdom
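A sketch of the suggested workaround, using the learning.test data shipped with bnlearn (the network structure and the particular query are made up for illustration). With method = "lw" the evidence is a named list of values, so nothing needs to be parsed inside the loop:

```r
library(bnlearn)

data(learning.test)
fitted <- bn.fit(hc(learning.test), learning.test)

# Loop over conditioning values; likelihood weighting ("lw") takes the
# evidence as a list, so no unevaluated expression is involved.
for (lev in levels(learning.test$B)) {
  prob <- cpquery(fitted, event = (A == "a"),
                  evidence = list(B = lev), method = "lw")
  cat("P(A = a | B =", lev, ") =", prob, "\n")
}
```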
Re: [R] R function for Bisecting K-means algorithm
On Wed, Oct 9, 2013 at 12:37 PM, Bert Gunter gunter.ber...@gene.com wrote:
Probably not. I would guess that most (or all) of us have no clue what bisecting a k-means algorithm means. You might have more luck if you explain yourself more clearly (but probably not from me in any case). Is this an R question, by the way? This is not a statistics list -- it's an R programming help list.

Cheers, -- Bert

On Tue, Oct 8, 2013 at 8:13 PM, Vivek Singh vksingh.ii...@gmail.com wrote: Any help on this? Regards, Vivek

On Tue, Oct 8, 2013 at 2:36 PM, Vivek Singh vksingh.ii...@gmail.com wrote: Hi All, Can someone please tell me an R function for the bisecting k-means algorithm. I have used the kmeans() function but am not getting good results. Please help.

-- Thanks and Regards, Vivek Kumar Singh, Research Assistant, School of Computing, National University of Singapore, Mobile: (0065) 82721535

-- Bert Gunter, Genentech Nonclinical Biostatistics, (650) 467-7374

Hi, Just to explain more on bisecting k-means: it's a clustering algorithm which is considered an improvement on plain k-means.
http://minethedata.blogspot.sg/2012/08/bisecting-k-means.html
I have used the R function kmeans, which does clustering. If anyone is aware of an implementation of the bisecting k-means algorithm in R, please help.
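For reference, bisecting k-means starts with all points in one cluster and repeatedly applies 2-means to the "worst" cluster until k clusters remain. A rough sketch using only base R's kmeans() (not a packaged implementation; the splitting criterion used here, largest within-cluster sum of squares, is one common choice among several):

```r
# Bisecting k-means sketch: repeatedly split the cluster with the
# largest within-cluster sum of squares using ordinary 2-means.
bisecting_kmeans <- function(x, k) {
  x <- as.matrix(x)
  clusters <- list(seq_len(nrow(x)))
  while (length(clusters) < k) {
    wss <- vapply(clusters, function(idx)
      sum(scale(x[idx, , drop = FALSE], scale = FALSE)^2), numeric(1))
    i <- which.max(wss)                       # cluster to bisect
    idx <- clusters[[i]]
    km <- kmeans(x[idx, , drop = FALSE], centers = 2, nstart = 10)
    clusters[[i]] <- idx[km$cluster == 1]
    clusters[[length(clusters) + 1]] <- idx[km$cluster == 2]
  }
  memb <- integer(nrow(x))
  for (j in seq_along(clusters)) memb[clusters[[j]]] <- j
  memb
}

memb <- bisecting_kmeans(iris[, 1:4], k = 3)
table(memb, iris$Species)
```

Clusters of size 1 cannot be bisected, so a production version would need to guard against selecting them.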
Re: [R] makeCluster help needed
On 10.10.2013 04:04, Jeffrey Flint wrote:
Uwe, Good news. I installed 3.0.2, and the parallel package examples ran successfully. This time a firewall window popped up. Probably the firewall was the problem with the snow package too, but for some reason the window didn't pop up with the snow package.

Great news.

Thanks for the suggestion to use parallel. I noticed that the package is brand new! Or, at least the pdf help was written 9/25/13.

Not that new, just updated.

Best, Uwe

Jeff

On Sat, Sep 28, 2013 at 10:15 AM, Uwe Ligges lig...@statistik.tu-dortmund.de wrote:
Can you please upgrade R to R-3.0.2 and use the parallel package? And can you please explain why you want to start the workers manually? I'd be happy to look into the details if you can reproduce the problem with a recent version of R and the parallel package. Best, Uwe Ligges

On 28.09.2013 03:20, Jeffrey Flint wrote:
This is in regards to the SNOW library. I'm using Windows. The problem is that makeSOCKcluster hangs in R as well as on the DOS command line. Below I've shown that it completes the Rscript until it reaches the line slaveLoop(master), at which point it hangs.

====
In R:

cl <- makeSOCKcluster(names=c("localhost","localhost"), manual=TRUE, outfile="jeff.log")
Manually start worker on localhost with
  C:/PROGRA~1/R/R-214~1.2/bin/Rscript.exe "C:/Program Files/R/R-2.14.2/library/snow/RSOCKnode.R" MASTER=localhost PORT=11590 OUT=jeff.log SNOWLIB="C:/Program Files/R/R-2.14.2/library"
[HANGS]

====
On the DOS command line:

C:\Documents and Settings\Jeff> C:/PROGRA~1/R/R-214~1.2/bin/Rscript.exe "C:/Program Files/R/R-2.14.2/library/snow/RSOCKnode.R" MASTER=localhost PORT=11590 OUT=jeff.log SNOWLIB="C:/Program Files/R/R-2.14.2/library"
[HANGS]
^C
C:\Documents and Settings\Jeff> type jeff.log
starting worker for localhost:11590

====
In the file RSOCKnode.R, it stalls after the last line, after executing slaveLoop(master).
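For the record, since R 2.14 the parallel package has shipped with R itself; a minimal sketch of the equivalent of the snow workflow above:

```r
library(parallel)

# makePSOCKcluster is parallel's counterpart of snow's makeSOCKcluster;
# with the default manual = FALSE the workers are launched automatically.
cl <- makePSOCKcluster(2)
res <- parSapply(cl, 1:4, function(i) i^2)
stopCluster(cl)
res  # 1 4 9 16
```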
local({
    master <- "localhost"
    port <- "8765"
    snowlib <- Sys.getenv("R_SNOW_LIB")
    outfile <- Sys.getenv("R_SNOW_OUTFILE")
    args <- commandArgs()
    pos <- match("--args", args)
    args <- args[-(1 : pos)]
    for (a in args) {
        pos <- regexpr("=", a)
        name <- substr(a, 1, pos - 1)
        value <- substr(a, pos + 1, nchar(a))
        switch(name,
               MASTER = master <- value,
               PORT = port <- value,
               SNOWLIB = snowlib <- value,
               OUT = outfile <- value,
               RANK = rank <- value,
               TMPWS = tmpWsName <- value)
    }
    ## these should be passed as arguments to makeNWSmaster
    Sys.setenv(MASTER = master)
    Sys.setenv(PORT = port)
    Sys.setenv(RANK = rank)
    Sys.setenv(TMPWS = tmpWsName)
    if (! (snowlib %in% .libPaths()))
        .libPaths(c(snowlib, .libPaths()))
    library(methods)  ## because Rscript as of R 2.7.0 doesn't load methods
    library(nws)
    library(snow)
    sinkWorkerOutput(outfile)
    master <- makeNWSmaster()
    sendData(master, "ping")
    cat("starting NWS worker\n")
    slaveLoop(master)
})
[R] help for compare regression coefficients across groups
Dear all,

I have data related to cell count across time in 2 different types of cells. I have transformed the count data using a log, and I want to test H0: B_celltype1 = B_celltype2 across time. For that I am fitting the following model:

fit_all <- lm(data$count ~ data$cell_type + data$time + data$cell_type*data$time)

The output is:

Coefficients:
                             Estimate Std. Error t value Pr(>|t|)
(Intercept)                 1.0450021  0.0286824  36.434   <2e-16 ***
data$cell_typeOV           -0.0456669  0.0405631  -1.126    0.271
data$time                   0.0115620  0.0004815  24.015   <2e-16 ***
data$cell_typeOV:data$time -0.0009764  0.0006809  -1.434    0.164
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.06318 on 26 degrees of freedom
Multiple R-squared: 0.9764, Adjusted R-squared: 0.9737
F-statistic: 358.8 on 3 and 26 DF, p-value: < 2.2e-16

In spite of the fact that the p-value of the interaction is > 0.05, may I still conclude that B_celltype1 is different from B_celltype2 because the p-value of the overall fit is lower than 0.05?

Thanks in advance for your help. With kind regards, Andreia

-- Andreia J. Amaral, PhD
BioFIG - Center for Biodiversity, Functional and Integrative Genomics
Instituto de Medicina Molecular, University of Lisbon
Tel: +352 21750 (ext. office: 28253)
email: andreiaama...@fm.ul.pt ; andreiaama...@fc.ul.pt
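No: the overall F-test only says the model explains something, not that the slopes differ. The test of H0 (equal time slopes in the two cell types) is the test of the interaction term itself, which can also be read off a nested-model comparison. A sketch, assuming a data frame mydata with columns count, cell_type and time (names taken from the model above):

```r
# Compare the model with and without the interaction: the F-test of
# this comparison is the test of "equal slopes across cell types".
fit_full <- lm(count ~ cell_type * time, data = mydata)
fit_add  <- lm(count ~ cell_type + time, data = mydata)
anova(fit_add, fit_full)   # single-df F-test; equals the interaction t-test squared
```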
[R] convert list of lists to simple list
Dear R Community,

I have the following problem. I use foreach nested loops, which then return a list of lists. E.g.:

[[1]]
[[1]][[1]] Output
[[1]][[2]] Output
[[1]][[3]] Output

[[2]]
[[2]][[1]] Output

What I want to achieve is a single-layer list, i.e. a list in which [[1]][[1]] becomes [[1]], [[1]][[2]] becomes [[2]], ..., [[2]][[1]] becomes [[4]] and so on. Here is a reproducible example:

test <- foreach(i = 1:3) %:% foreach(j = 1:3) %do% { paste(i, j, sep = ",") }

In my real problem I have up to 5-6 layers. Anyone an idea? Thanks!
Re: [R] convert list of lists to simple list
On 10 Oct 2013, at 13:17, ivan i.pet...@gmail.com wrote:
test <- foreach(i = 1:3) %:% foreach(j = 1:3) %do% { paste(i, j, sep = ",") }

Not easily reproducible, unless you write

# install.packages("foreach")
require(foreach)

in front of your code. Here is a starting point:

unlist(test, recursive = FALSE)

… but you would probably need to build a recursive call of the function down to the last 'list level'. unlist(recursive = TRUE) goes one level too far.

Best, Philippe Grosjean
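One way to finish that thought: keep unlisting one level at a time until no element is itself a list. A sketch (flatten is a hypothetical helper name, not a base function; it assumes the leaves are non-list values):

```r
# Flatten arbitrarily nested lists into a single-layer list by peeling
# one level per iteration; stops before destroying the final list layer.
flatten <- function(x) {
  while (any(vapply(x, is.list, logical(1))))
    x <- unlist(x, recursive = FALSE)
  x
}

nested <- list(list("1,1", "1,2"), list(list("2,1"), "2,2"))
flatten(nested)  # list("1,1", "1,2", "2,1", "2,2")
```

This works for any nesting depth, so 5-6 layers are handled the same way as 2.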
Re: [R] Convert a factor to a numeric
foo <- as.numeric(as.character(your_factors))

It's a common mistake to forget the as.character() conversion, in which case you end up with the factor's underlying integer codes rather than the desired values.

-- View this message in context: http://r.789695.n4.nabble.com/Convert-a-factor-to-a-numeric-tp4677964p4677971.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] optimizing code
FMRPROG wrote:
I am generating random numbers from a normal distribution using [snip] I need to optimize the speed WITHOUT using the rnorm function but have no idea how to do this. I assume I should minimise what goes in the loop? Any help would be very much appreciated.

Looks like homework. Thus, only a hint: any time you create something one at a time with a loop, you almost certainly can create all n values by taking advantage of R's built-in vectorization. Every statement in your loops can be vectorized.

-- View this message in context: http://r.789695.n4.nabble.com/optimizing-code-tp4677966p4677972.html
Sent from the R help mailing list archive at Nabble.com.
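A generic illustration of the hint, deliberately unrelated to the homework itself: the same computation written element-by-element and then vectorized.

```r
# Loop version: one value at a time.
out <- numeric(10)
for (i in 1:10) out[i] <- sqrt(i) + i^2

# Vectorized version: all values at once; same result, much faster
# when n is large, because the work happens in compiled code.
out2 <- sqrt(1:10) + (1:10)^2

all.equal(out, out2)  # TRUE
```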
Re: [R] vector where elements are functions evaluated at integers, but length of vector varies
Hi, I have two integers a and b (with a < b), as well as a function f(x). Is there a way of getting the vector (f(a), ..., f(b)) from R without having to explicitly write it out, as my a and b vary? Thanks for your help.

What did you try? Further, without knowing what your function f(x) is, we can't tell you whether it accepts vector inputs.

-- View this message in context: http://r.789695.n4.nabble.com/vector-where-elements-are-functions-evaluated-at-integers-but-length-of-vector-varies-tp4677932p4677973.html
Sent from the R help mailing list archive at Nabble.com.
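Concretely: if f is vectorized, f(a:b) already does it; otherwise sapply applies f to each integer in turn. A sketch with a made-up f:

```r
f <- function(x) x^2   # example function; substitute your own
a <- 2; b <- 5

f(a:b)          # works when f accepts vectors: 4 9 16 25
sapply(a:b, f)  # element by element, works either way: 4 9 16 25
```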
Re: [R] makeCluster help needed
Jeffrey Flint wrote:
Good news. I installed 3.0.2, and the parallel package examples ran successfully. This time a firewall window popped up. Probably the firewall was the problem with the snow package too, but for some reason the window didn't pop up with the snow package. Thanks for the suggestion to use parallel. I noticed that the package is brand new! Or, at least the pdf help was written 9/25/13. Jeff

One thing to watch for that hung me (and R :-) ) up for a while: make sure your .Rprofile doesn't have any commands which are valid only in interactive sessions. I had loadhistory(), which caused the worker Rscript.exe to fail. I replaced that line in .Rprofile with

if (interactive()) loadhistory()

and all was well.

-- View this message in context: http://r.789695.n4.nabble.com/makeCluster-help-needed-tp4677156p4677974.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Help required graphing factors with predicted model settings
Perhaps you are looking for the effects package, which can plot effects (predicted values) for terms in mer objects from lme4?

library(effects)
?effect
library(lme4)
data(cake, package="lme4")
fm1 <- lmer(angle ~ recipe * temperature + (1|recipe:replicate), cake, REML = FALSE)
plot(effect("recipe:temperature", fm1), grid=TRUE)
plot(Effect(c("recipe", "temperature"), fm1))  # equivalent

On 10/10/2013 12:52 AM, Rebecca Stirnemann wrote:
Thanks Jim for helping. Your sample data actually looks like my dataset. The one I put up looks strange for some reason, so please ignore that. I have three landusenumb variables: 1, 2 and 3. Is rep(1,2,3) correct? When I run the following code I am getting:

mod1 <- glmer(frat ~ flandusenumb + ground.cover_lo + (1|fsite), family = binomial, data = mao1)
# Calculate predicted values
newdata1 <- data.frame(ground.cover_lo = c(25,50,100), flandusenumb = rep(1,2,3))
pred34 <- predict(mod1, newdata = newdata1, type = "response")
Error in UseMethod("predict") :
  no applicable method for 'predict' applied to an object of class "mer"

Can you see what I am doing wrong? What I am aiming for is a graph which looks like this. Thanks, Rebecca

On Thu, Oct 10, 2013 at 5:33 PM, Jim Lemon j...@bitwrit.com.au wrote:
On 10/10/2013 08:35 AM, Rebecca Stirnemann wrote:
Dear R wizards, Though I hate to do it, after weeks of my code not working I need some help, since I can't find an example which seems to work. I am trying to create a graph which shows the probability of predation of a nest on one axis (either 1 to 0, or 0% to 100%) and grass height at the bottom. I want to then add my predicted lines from my glmer onto the graph for three habitat types. I would like to repeat this procedure 3 times for three different grass heights (25, 50, 100) to see the effect size.
My data:

landusenumb landuse sitename rat ground.cover_lo
1 plantation far.leftroad_LHS 0 60
1 plantation far.leftroad_LHS 1 70
1 plantation far.leftroad_LHS 1 10
1 plantation far.leftroad_LHS 1 30
1 plantation far.leftroad_LHS 1 50
1 plantation far.leftroad_LHS 0 20
1 plantation far.leftroad_LHS 0 70
1 plantation far.leftroad_LHS 0 100
1 plantation far.leftroad_LHS 0 90

# Graph
# Fit model
mod1 <- glmer(frat ~ flandusenumb + ground.cover_lo + (1|fsite), family = binomial, data = mao1)
# Calculate predicted values
newdata1 <- data.frame(ground.cover_lo = seq(0,10,length=100), flandusenumb = rep(1,2,3))
pred34 <- predict(mod1, newdata = newdata1, type = "response")
# Plot model predicted curves
plot(c(0,100), c(0,1), type="n", xlab="grasscover", ylab="Probability of predation")
lines(newdata1$frat, pred34, lwd=3, col="blue")

Hi Rebecca,
First, your sample data are a bit mangled, and should look like this:

mao1
landusenumb landuse sitename rat ground.cover_lo
1 plantation far.leftroad_LHS 0 60
1 plantation far.leftroad_LHS 1 70
1 plantation far.leftroad_LHS 1 10
1 plantation far.leftroad_LHS 1 30
1 plantation far.leftroad_LHS 1 50
1 plantation far.leftroad_LHS 0 20
1 plantation far.leftroad_LHS 0 70
1 plantation far.leftroad_LHS 0 100
1 plantation far.leftroad_LHS 0 90

If you want the predicted values with ground cover as above, then:

ground.cover_lo = c(25,50,100)

The variable names in the first model don't match those in the data frame, but I assume these were typos. What does pred34 look like? This will tell you what function you should be using to plot it.

Jim

-- Michael Friendly  Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
Chair, Quantitative Methods, York University
Voice: 416 736-2100 x66249  Fax: 416 736-5814
4700 Keele Street  Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
Re: [R] Convert a factor to a numeric
Hi,

It is not clear whether all the variables are factors or only a few are.

dat <- read.table(text="a coef coef.l coef.h
1 1 0.005657825001254 0.00300612956318132 0.00830952043932667
2 2 0.00634505314577229 0.00334102345418614 0.00934908283735844
3 3 0.00368668099805019 0.000289702228748421 0.00708365976735195
4 4 0.0056200291035751 0.00209123538827368 0.00914882281887651
5 5 0.00636609791030242 0.00269683889899591 0.0100353569216089", sep="", colClasses=rep("factor",4))
dat1 <- dat
dat[] <- lapply(dat, function(x) as.numeric(as.character(x)))
str(dat)
#'data.frame': 5 obs. of 4 variables:
# $ a     : num 1 2 3 4 5
# $ coef  : num 0.00566 0.00635 0.00369 0.00562 0.00637
# $ coef.l: num 0.00301 0.00334 0.00029 0.00209 0.0027
# $ coef.h: num 0.00831 0.00935 0.00708 0.00915 0.01004

# With only a subset of variables in the dataset as factors
dat1$a <- as.numeric(as.character(dat1$a))
dat1[sapply(dat1,is.factor)] <- lapply(dat1[sapply(dat1,is.factor)], function(x) as.numeric(as.character(x)))
str(dat1)
#'data.frame': 5 obs. of 4 variables:
# $ a     : num 1 2 3 4 5
# $ coef  : num 0.00566 0.00635 0.00369 0.00562 0.00637
# $ coef.l: num 0.00301 0.00334 0.00029 0.00209 0.0027
# $ coef.h: num 0.00831 0.00935 0.00708 0.00915 0.01004

A.K.

I have a factor data frame which I want to convert to numeric without any change in contents. How could I do that?

a coef coef.l coef.h
1 1 0.005657825001254 0.00300612956318132 0.00830952043932667
2 2 0.00634505314577229 0.00334102345418614 0.00934908283735844
3 3 0.00368668099805019 0.000289702228748421 0.00708365976735195
4 4 0.0056200291035751 0.00209123538827368 0.00914882281887651
5 5 0.00636609791030242 0.00269683889899591 0.0100353569216089
Re: [R] Convert a factor to a numeric
data.matrix() should do the job for you.

Charles

On Thu, Oct 10, 2013 at 8:02 AM, arun smartpink...@yahoo.com wrote:
Hi, It is not clear whether all the variables are factors or only a few are.

dat <- read.table(text="a coef coef.l coef.h
1 1 0.005657825001254 0.00300612956318132 0.00830952043932667
2 2 0.00634505314577229 0.00334102345418614 0.00934908283735844
3 3 0.00368668099805019 0.000289702228748421 0.00708365976735195
4 4 0.0056200291035751 0.00209123538827368 0.00914882281887651
5 5 0.00636609791030242 0.00269683889899591 0.0100353569216089", sep="", colClasses=rep("factor",4))
dat1 <- dat
dat[] <- lapply(dat, function(x) as.numeric(as.character(x)))
str(dat)
#'data.frame': 5 obs. of 4 variables:
# $ a     : num 1 2 3 4 5
# $ coef  : num 0.00566 0.00635 0.00369 0.00562 0.00637
# $ coef.l: num 0.00301 0.00334 0.00029 0.00209 0.0027
# $ coef.h: num 0.00831 0.00935 0.00708 0.00915 0.01004

# With only a subset of variables in the dataset as factors
dat1$a <- as.numeric(as.character(dat1$a))
dat1[sapply(dat1,is.factor)] <- lapply(dat1[sapply(dat1,is.factor)], function(x) as.numeric(as.character(x)))
str(dat1)
#'data.frame': 5 obs. of 4 variables:
# $ a     : num 1 2 3 4 5
# $ coef  : num 0.00566 0.00635 0.00369 0.00562 0.00637
# $ coef.l: num 0.00301 0.00334 0.00029 0.00209 0.0027
# $ coef.h: num 0.00831 0.00935 0.00708 0.00915 0.01004

A.K.

I have a factor data frame which I want to convert to numeric without any change in contents. How could I do that?

a coef coef.l coef.h
1 1 0.005657825001254 0.00300612956318132 0.00830952043932667
2 2 0.00634505314577229 0.00334102345418614 0.00934908283735844
3 3 0.00368668099805019 0.000289702228748421 0.00708365976735195
4 4 0.0056200291035751 0.00209123538827368 0.00914882281887651
5 5 0.00636609791030242 0.00269683889899591 0.0100353569216089
Re: [R] frailtypack
I wanted to provide a follow-up post regarding the question of the coxme program and nested (multilevel) frailty analysis. As it turns out, my failure to produce results was a result of my own error. The following syntax seems to successfully produce results for a model accounting for both clustering of recurrent events within individual (ID) and also individuals within groups (GroupNum). Given that my dataset is over 15,000 rows in length, the success of the program is no small feat (I attempted a similar nested analysis with this data in two other R programs, and in both cases R stopped working during the procedure).

library(coxme)
cgd.nfm <- coxme(Surv(Duration, Censoring) ~ Alcohol*Gender + (1 | GroupNum/ID), data=mydata)
print(cgd.nfm)

My only remaining question concerns the relation between estimates provided by coxme and those provided by a gamma shared frailty model run in SAS. When I run identical shared frailty models in coxme and NLMIXED in SAS, the estimates of fixed effects in coxme are slightly lower and the standard errors are slightly higher. In reading the “Short Introduction to Coxme,” I noticed mention of the Gaussian distribution. I wondered if differences in the effects produced by coxme might be attributable to differences between the Gaussian and the gamma distributions. (Forgive me if these questions are misinformed or rudimentary; this is an entirely new field to me.) I also wondered if there is a way to adjust the syntax of coxme such that the estimates more closely approximate those produced by NLMIXED or, alternatively, if there are reasons for believing the results produced by coxme would be superior.

cf2059 wrote:
Thank you very much for responding. Yes, I incorrectly stated that frailtypack was the only widely available software for the analysis of nested frailty models. When I initially began my search to identify software well suited to nested frailty analysis, frailtypack dominated the google results.
While this package seems to be widely publicized on the internet, I have not thus far found it to be well suited to analysis with my large dataset. I would like to use coxme to analyze my data, as it appears to have far fewer idiosyncrasies than Frailtypack. However, at the moment I am struggling to 1) achieve model convergence with even a basic shared frailty model and 2) produce the correct code for the nested frailty model. My dataset involves recurrent events clustered within individuals (ID) and individuals then clustered within groups of three (GroupNum). One independent variable (Alcohol) varies by group and another (Gender) varies by individual. I ultimately aim to produce a nested frailty model that includes one random intercept variance term at the level of the individual and one at the level of the group. I have already produced results for a basic shared frailty model using SAS NLMIXED--a model that accounts for clustering only at the level of the Group using group-level predictors (Alcohol)--but so far I have not been able to achieve convergence for this same model using coxme. I suspect that supplying the program with starting values might be useful, but I am not familiar enough with the program code to do so. Any suggestions would be very much appreciated. I am new to survival analysis as well as the R software program. 
##Basic Shared Frailty Model
cgd.nfm <- coxme(Surv(Duration, Censoring) ~ Alcohol + (1 | ID), data=mydata)
summary(cgd.nfm)
                 Length Class           Mode
coefficients         1  -none-          numeric
frail                1  -none-          list
penalty              1  -none-          numeric
loglik               3  -none-          numeric
variance             1  bdsmatrix       S4
df                   2  -none-          numeric
hmat                 1  gchol.bdsmatrix S4
iter                 2  -none-          numeric
control              9  -none-          list
u                  709  -none-          numeric
means                1  -none-          numeric
scale                1  -none-          numeric
linear.predictor 15831  -none-          numeric
vcoef                1  -none-          list
n                    2  -none-          numeric
terms                3  terms           call
formulaList          2  -none-          list
y                31662  Surv            numeric
call                 3  -none-          call
ties                 1  -none-          character

##Nested Frailty Model
cgd.nfm <- coxme(Surv(Duration, Censoring) ~ Alcohol*Gender + (1 | GroupNum/ID), data=mydata)
summary(cgd.nfm)
                 Length Class           Mode
coefficients         1  -none-          numeric
frail                2  -none-          list
penalty              1  -none-          numeric
loglik               3  -none-          numeric
variance             1  bdsmatrix       S4
df
Re: [R] Convert a factor to a numeric
I'm not honestly sure why data.matrix didn't work off hand. Perhaps another user can shed some light on this. An alternative is the following: apply(dat, 2, FUN = function(x) as.numeric(as.character(x))) On Thu, Oct 10, 2013 at 8:26 AM, arun smartpink...@yahoo.com wrote: Did you mean to apply it like this or is it something else? data.matrix(dat) # a coef coef.l coef.h 1 13 4 2 2 24 5 4 3 31 1 1 4 42 2 3 5 55 3 5 A.K. On Thursday, October 10, 2013 9:09 AM, Charles Determan Jr deter...@umn.edu wrote: data.matrix() should do the job for you Charles On Thu, Oct 10, 2013 at 8:02 AM, arun smartpink...@yahoo.com wrote: Hi, It is not clear whether all the variables are factor or only a few are.. dat- read.table(text=acoef coef.l coef.h 1 1 0.005657825001254 0.00300612956318132 0.00830952043932667 2 2 0.00634505314577229 0.00334102345418614 0.00934908283735844 3 3 0.00368668099805019 0.000289702228748421 0.00708365976735195 4 4 0.0056200291035751 0.00209123538827368 0.00914882281887651 5 5 0.00636609791030242 0.00269683889899591 0.0100353569216089,sep=,colClasses=rep(factor,4)) dat1- dat dat[] - lapply(dat,function(x) as.numeric(as.character(x))) str(dat) #'data.frame':5 obs. of 4 variables: # $ a : num 1 2 3 4 5 # $ coef : num 0.00566 0.00635 0.00369 0.00562 0.00637 # $ coef.l: num 0.00301 0.00334 0.00029 0.00209 0.0027 # $ coef.h: num 0.00831 0.00935 0.00708 0.00915 0.01004 # With only a subset of variables in the dataset as factors dat1$a- as.numeric(as.character(dat1$a)) dat1[sapply(dat1,is.factor)]- lapply(dat1[sapply(dat1,is.factor)],function(x) as.numeric(as.character(x))) str(dat1) #'data.frame':5 obs. of 4 variables: # $ a : num 1 2 3 4 5 # $ coef : num 0.00566 0.00635 0.00369 0.00562 0.00637 # $ coef.l: num 0.00301 0.00334 0.00029 0.00209 0.0027 # $ coef.h: num 0.00831 0.00935 0.00708 0.00915 0.01004 A.K. I have a factor data frame which I want to convert to numeric without any change in contents. How could I do that? 
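The reason data.matrix() gives the wrong numbers is that it converts factors via their internal integer codes (the position in the sorted level set), not via their labels. A small sketch:

```r
f <- factor(c("0.5", "0.1", "0.9"))

as.numeric(f)                # 2 1 3  -- internal codes (levels sort to 0.1, 0.5, 0.9)
as.numeric(as.character(f))  # 0.5 0.1 0.9 -- the values actually wanted
```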
 a coef coef.l coef.h
1 1 0.005657825001254 0.00300612956318132 0.00830952043932667
2 2 0.00634505314577229 0.00334102345418614 0.00934908283735844
3 3 0.00368668099805019 0.000289702228748421 0.00708365976735195
4 4 0.0056200291035751 0.00209123538827368 0.00914882281887651
5 5 0.00636609791030242 0.00269683889899591 0.0100353569216089

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota
Re: [R] Convert a factor to a numeric
Firstly, please make sure to reply-all so the r-help list also receives these emails. Second, I have just run this sequence and it produces an exact copy with each column as numeric. Use the apply() function; it iterates over each column and converts each to numeric.

dat <- read.table(text="a coef coef.l coef.h
1 1 0.005657825001254 0.00300612956318132 0.00830952043932667
2 2 0.00634505314577229 0.00334102345418614 0.00934908283735844
3 3 0.00368668099805019 0.000289702228748421 0.00708365976735195
4 4 0.0056200291035751 0.00209123538827368 0.00914882281887651
5 5 0.00636609791030242 0.00269683889899591 0.0100353569216089",
sep = "", colClasses = rep("factor", 4))
dat.num <- apply(dat, 2, FUN = function(x) as.numeric(as.character(x)))

Charles

On Thu, Oct 10, 2013 at 8:37 AM, arun smartpink...@yahoo.com wrote:

Looks like it is directly doing as.numeric() without the as.character(). For example:

as.numeric(dat[,2])
#[1] 3 4 1 2 5

On Thursday, October 10, 2013 9:33 AM, Charles Determan Jr deter...@umn.edu wrote:

I'm not honestly sure why data.matrix() didn't work offhand. Perhaps another user can shed some light on this. An alternative is the following:

apply(dat, 2, FUN = function(x) as.numeric(as.character(x)))

On Thu, Oct 10, 2013 at 8:26 AM, arun smartpink...@yahoo.com wrote:

Did you mean to apply it like this, or is it something else?

data.matrix(dat)
#   a coef coef.l coef.h
# 1 1    3      4      2
# 2 2    4      5      4
# 3 3    1      1      1
# 4 4    2      2      3
# 5 5    5      3      5

A.K.

On Thursday, October 10, 2013 9:09 AM, Charles Determan Jr deter...@umn.edu wrote:

data.matrix() should do the job for you

Charles

On Thu, Oct 10, 2013 at 8:02 AM, arun smartpink...@yahoo.com wrote:

Hi,

It is not clear whether all the variables are factor or only a few are.
dat <- read.table(text="a coef coef.l coef.h
1 1 0.005657825001254 0.00300612956318132 0.00830952043932667
2 2 0.00634505314577229 0.00334102345418614 0.00934908283735844
3 3 0.00368668099805019 0.000289702228748421 0.00708365976735195
4 4 0.0056200291035751 0.00209123538827368 0.00914882281887651
5 5 0.00636609791030242 0.00269683889899591 0.0100353569216089",
sep = "", colClasses = rep("factor", 4))
dat1 <- dat
dat[] <- lapply(dat, function(x) as.numeric(as.character(x)))
str(dat)
# 'data.frame': 5 obs. of 4 variables:
#  $ a     : num 1 2 3 4 5
#  $ coef  : num 0.00566 0.00635 0.00369 0.00562 0.00637
#  $ coef.l: num 0.00301 0.00334 0.00029 0.00209 0.0027
#  $ coef.h: num 0.00831 0.00935 0.00708 0.00915 0.01004

# With only a subset of variables in the dataset as factors
dat1$a <- as.numeric(as.character(dat1$a))
dat1[sapply(dat1, is.factor)] <- lapply(dat1[sapply(dat1, is.factor)],
                                        function(x) as.numeric(as.character(x)))
str(dat1)
# 'data.frame': 5 obs. of 4 variables:
#  $ a     : num 1 2 3 4 5
#  $ coef  : num 0.00566 0.00635 0.00369 0.00562 0.00637
#  $ coef.l: num 0.00301 0.00334 0.00029 0.00209 0.0027
#  $ coef.h: num 0.00831 0.00935 0.00708 0.00915 0.01004

A.K.

I have a factor data frame which I want to convert to numeric without any change in contents. How could I do that?

 a coef coef.l coef.h
1 1 0.005657825001254 0.00300612956318132 0.00830952043932667
2 2 0.00634505314577229 0.00334102345418614 0.00934908283735844
3 3 0.00368668099805019 0.000289702228748421 0.00708365976735195
4 4 0.0056200291035751 0.00209123538827368 0.00914882281887651
5 5 0.00636609791030242 0.00269683889899591 0.0100353569216089
--
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota
[R] Rcpp and mclapply
Dear all,

I have an R script that uses Rcpp, and I have been trying to parallelize it using mclapply() (I tried with both the multicore and the parallel packages). Sometimes (not always, interestingly), the CPU use for each core drops, usually so that the total over all cores reaches 100%, i.e., the run is only as fast as if a single core were fully used. I tried running my code directly from within Emacs, and also via a shell command; it happens either way. I suspect there might be some interaction between Rcpp and the multicore/parallel libraries. Did any R(cpp) user encounter such symptoms?

Sophie
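[A minimal, self-contained skeleton can help narrow this down; `heavy_fun` below is a made-up stand-in for the Rcpp routine, not Sophie's actual code. The idea is to time the same dummy workload with lapply() and mclapply() from a fresh session:]

```r
library(parallel)

# Stand-in for the Rcpp-based computation; purely illustrative.
heavy_fun <- function(i) {
  x <- rnorm(1e6)
  sum(x * x)
}

# If mclapply() with 4 cores is not noticeably faster than lapply(),
# the forked workers are being serialized somewhere.
system.time(lapply(1:8, heavy_fun))
system.time(mclapply(1:8, heavy_fun, mc.cores = 4))
```

If the slowdown appears only with the real Rcpp function, a shared resource (for example a multi-threaded BLAS or an OpenMP region interacting with fork()) becomes a plausible suspect.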
[R] Error while running MR using rmr2
Hi,

I have been trying to run a simple MR program using rmr2 on a single-node Hadoop cluster. Here is the environment for the setup:

Ubuntu 12.04 (32 bit)
R (Ubuntu comes with 2.14.1, so updated to 3.0.2)
Installed the latest rmr2 and rhdfs from here (https://github.com/RevolutionAnalytics/RHadoop/wiki/Downloads) and the corresponding dependencies
Hadoop 1.2.1

Now I am trying to run a simple MR program as:

Sys.setenv(HADOOP_HOME="/home/training/Installations/hadoop-1.2.1")
Sys.setenv(HADOOP_CMD="/home/training/Installations/hadoop-1.2.1/bin/hadoop")
library(rmr2)
library(rhdfs)
ints = to.dfs(1:100)
calc = mapreduce(input = ints, map = function(k, v) cbind(v, 2*v))
from.dfs(calc)

The mapreduce job fails with the below error message in hadoop-1.2.1/logs/userlogs/job_201310091055_0001/attempt_201310091055_0001_m_00_0/stderr:

Error in library(functional) : there is no package called 'functional'
Execution halted
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576)

But sessionInfo() shows that the functional package has been loaded:

sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_IN       LC_NUMERIC=C         LC_TIME=en_IN
 [4] LC_COLLATE=en_IN     LC_MONETARY=en_IN    LC_MESSAGES=en_IN
 [7] LC_PAPER=en_IN       LC_NAME=C            LC_ADDRESS=C
[10] LC_TELEPHONE=C       LC_MEASUREMENT=en_IN LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] rhdfs_1.0.6    rJava_0.9-4    rmr2_2.3.0     reshape2_1.2.2 plyr_1.8
 [6] stringr_0.6.2  functional_0.4 digest_0.6.3   bitops_1.0-6   RJSONIO_1.0-3
[11] Rcpp_0.10.5

How to get around this problem?
I have posted the same in StackOverflow also (http://goo.gl/KEKRVJ)

Thanks,
Praveen
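[The stderr above comes from R processes launched by Hadoop's streaming tasks, which may run with a different user and library search path than the interactive session where sessionInfo() was taken. A hedged first check (the remedy line is an assumption, not a confirmed fix for this cluster):]

```r
# Where did `functional` get installed, and which libraries will a
# non-interactive R started by another user actually see?
find.package("functional")
.libPaths()

# If it landed in a personal library (e.g. under ~/R/), installing it
# into the system-wide library so the Hadoop task's R can find it may help:
# install.packages("functional", lib = .Library)
```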
Re: [R] Convert a factor to a numeric
Also, BTW, dat.num is a matrix, while with lapply() the result is still a data frame. Anyway, it depends on what the OP really wants as output.

dat.num <- apply(dat, 2, FUN = function(x) as.numeric(as.character(x)))
dat[] <- lapply(dat, function(x) as.numeric(as.character(x)))
str(dat)
# 'data.frame': 5 obs. of 4 variables:
#  $ a     : num 1 2 3 4 5
#  $ coef  : num 0.00566 0.00635 0.00369 0.00562 0.00637
#  $ coef.l: num 0.00301 0.00334 0.00029 0.00209 0.0027
#  $ coef.h: num 0.00831 0.00935 0.00708 0.00915 0.01004
str(dat.num)
# num [1:5, 1:4] 1 2 3 4 5 ...
# - attr(*, "dimnames")=List of 2
#  ..$ : NULL
#  ..$ : chr [1:4] "a" "coef" "coef.l" "coef.h"
as.data.frame(dat.num)

A.K.

On Thursday, October 10, 2013 9:41 AM, Charles Determan Jr deter...@umn.edu wrote:

Firstly, please make sure to reply-all so the r-help list also receives these emails. Second, I have just run this sequence and it produces an exact copy with each column as numeric. Use the apply() function; it iterates over each column and converts each to numeric.

dat <- read.table(text="a coef coef.l coef.h
1 1 0.005657825001254 0.00300612956318132 0.00830952043932667
2 2 0.00634505314577229 0.00334102345418614 0.00934908283735844
3 3 0.00368668099805019 0.000289702228748421 0.00708365976735195
4 4 0.0056200291035751 0.00209123538827368 0.00914882281887651
5 5 0.00636609791030242 0.00269683889899591 0.0100353569216089",
sep = "", colClasses = rep("factor", 4))
dat.num <- apply(dat, 2, FUN = function(x) as.numeric(as.character(x)))

Charles

On Thu, Oct 10, 2013 at 8:37 AM, arun smartpink...@yahoo.com wrote:

Looks like it is directly doing as.numeric() without the as.character(). For example:

as.numeric(dat[,2])
#[1] 3 4 1 2 5

On Thursday, October 10, 2013 9:33 AM, Charles Determan Jr deter...@umn.edu wrote:

I'm not honestly sure why data.matrix() didn't work offhand. Perhaps another user can shed some light on this.
An alternative is the following:

apply(dat, 2, FUN = function(x) as.numeric(as.character(x)))

On Thu, Oct 10, 2013 at 8:26 AM, arun smartpink...@yahoo.com wrote:

Did you mean to apply it like this, or is it something else?

data.matrix(dat)
#   a coef coef.l coef.h
# 1 1    3      4      2
# 2 2    4      5      4
# 3 3    1      1      1
# 4 4    2      2      3
# 5 5    5      3      5

A.K.

On Thursday, October 10, 2013 9:09 AM, Charles Determan Jr deter...@umn.edu wrote:

data.matrix() should do the job for you

Charles

On Thu, Oct 10, 2013 at 8:02 AM, arun smartpink...@yahoo.com wrote:

Hi,

It is not clear whether all the variables are factor or only a few are.

dat <- read.table(text="a coef coef.l coef.h
1 1 0.005657825001254 0.00300612956318132 0.00830952043932667
2 2 0.00634505314577229 0.00334102345418614 0.00934908283735844
3 3 0.00368668099805019 0.000289702228748421 0.00708365976735195
4 4 0.0056200291035751 0.00209123538827368 0.00914882281887651
5 5 0.00636609791030242 0.00269683889899591 0.0100353569216089",
sep = "", colClasses = rep("factor", 4))
dat1 <- dat
dat[] <- lapply(dat, function(x) as.numeric(as.character(x)))
str(dat)
# 'data.frame': 5 obs. of 4 variables:
#  $ a     : num 1 2 3 4 5
#  $ coef  : num 0.00566 0.00635 0.00369 0.00562 0.00637
#  $ coef.l: num 0.00301 0.00334 0.00029 0.00209 0.0027
#  $ coef.h: num 0.00831 0.00935 0.00708 0.00915 0.01004

# With only a subset of variables in the dataset as factors
dat1$a <- as.numeric(as.character(dat1$a))
dat1[sapply(dat1, is.factor)] <- lapply(dat1[sapply(dat1, is.factor)],
                                        function(x) as.numeric(as.character(x)))
str(dat1)
# 'data.frame': 5 obs. of 4 variables:
#  $ a     : num 1 2 3 4 5
#  $ coef  : num 0.00566 0.00635 0.00369 0.00562 0.00637
#  $ coef.l: num 0.00301 0.00334 0.00029 0.00209 0.0027
#  $ coef.h: num 0.00831 0.00935 0.00708 0.00915 0.01004

A.K.

I have a factor data frame which I want to convert to numeric without any change in contents. How could I do that?
 a coef coef.l coef.h
1 1 0.005657825001254 0.00300612956318132 0.00830952043932667
2 2 0.00634505314577229 0.00334102345418614 0.00934908283735844
3 3 0.00368668099805019 0.000289702228748421 0.00708365976735195
4 4 0.0056200291035751 0.00209123538827368 0.00914882281887651
5 5 0.00636609791030242 0.00269683889899591 0.0100353569216089

--
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota
Re: [R] Rcpp and mclapply
I would bet that you are doing something in C++ that shares some resource between the workers and blocks all but one worker at a time.

---
Jeff Newmiller, jdnew...@dcn.davis.ca.us
Sent from my phone. Please excuse my brevity.

sophie_brugie...@mpipsykl.mpg.de wrote:

Dear all,

I have an R script that uses Rcpp, and I have been trying to parallelize it using mclapply() (I tried with both the multicore and the parallel packages). Sometimes (not always, interestingly), the CPU use for each core drops, usually so that the total over all cores reaches 100%, i.e., the run is only as fast as if a single core were fully used. I tried running my code directly from within Emacs, and also via a shell command; it happens either way. I suspect there might be some interaction between Rcpp and the multicore/parallel libraries. Did any R(cpp) user encounter such symptoms?

Sophie
[R] pairs plot
My data are a matrix with 3 numeric columns. I would like a pairs plot with scatterplots in the upper panels, histograms on the diagonal, and correlations in the lower panels. The default pairs() actually does almost what I want, but only looks semi-awesome. In particular, I didn't find out how to remove the axes from the lower panels, where I only want to display the numeric correlation values; the axes don't fit there. Hence I am looking at ggpairs() from GGally, and calling it without parameters looks almost perfect, but I can't find out how they put the "Corr:" text in the upper panels (so I can't move it to the lower ones), and I do not know how to put the histograms on the diagonal. Please help.

--
Witold Eryk Wolski
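[This layout can also be built with base pairs() alone, adapting the panel functions shown in the examples of ?pairs; the matrix `m` below is simulated stand-in data, not the poster's:]

```r
# Histogram on the diagonal (adapted from the example in ?pairs).
panel.hist <- function(x, ...) {
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(usr[1:2], 0, 1.5))
  h <- hist(x, plot = FALSE)
  rect(h$breaks[-length(h$breaks)], 0, h$breaks[-1],
       h$counts / max(h$counts), col = "grey")
}

# Correlation printed as text; no points are drawn, so no axes are needed.
panel.cor <- function(x, y, digits = 2, ...) {
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  text(0.5, 0.5, format(cor(x, y), digits = digits), cex = 1.5)
}

m <- matrix(rnorm(300), ncol = 3, dimnames = list(NULL, c("a", "b", "c")))
pairs(m, upper.panel = points, diag.panel = panel.hist,
      lower.panel = panel.cor)
```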
[R] system2 commands with backslash
Hi All,

I'm trying to edit a file in place using system2() and sed from within R. I can get my command to work unless there is a backslash in the command, in which case I'm warned about an unrecognized escape. So, for example:

system2("sed -i s/oldword/newword/g d:/junk/x/test.tex")   # works fine
system2("sed -i s/oldword\s/newword/g d:/junk/x/test.tex") # does not work in R (the command works on the command line)

I've experimented with double slashes to escape the \s, I've tried the shell() command, and I've tried experimenting with shQuote(), but I can't seem to get around the unrecognized-escape issue. By the way, it would be preferable to have a solution that avoids doubling backslashes because, unfortunately, in my real-world example I'm actually replacing double backslashes and would prefer not to end up with quadruple backslashes. I'm using Windows 7, 64 bit.

Zev

--
Zev Ross
ZevRoss Spatial Analysis
120 N Aurora, Suite 3A
Ithaca, NY 14850
607-277-0004 (phone)
z...@zevross.com
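[For what it's worth, the "unrecognized escape" is raised by the R parser itself, before system2() ever sees the string: in R source code a literal backslash must be typed as `\\`, but the resulting string holds only one backslash, as nchar() and cat() show. A small demonstration (the sed expression is the poster's, reused here only for illustration):]

```r
s <- "s/oldword\\s/newword/g"  # typed with \\, stored as one backslash
nchar("\\s")                   # 2 characters: one backslash plus "s"
cat(s, "\n")                   # prints: s/oldword\s/newword/g
```

The doubling only happens in source code, not in the string that reaches sed; a file that already contains `\\` does not gain extra backslashes when read with readLines().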
Re: [R] mixed model MANOVA? does it even exist?
Hi,

Sorry to bother you again. I tried doing regressions using lme() (because I want p-values) and ran across two issues.

(1) About model specification: this is a mixed model. I tried two model specifications:

(a) specify that subject ID (subj) is nested within the between-subject variable (age):
summary(lme(bias ~ dprime * emo * race * age, random = ~ 1 | age/subj, data = data))
which gives NaN p-values for age :-(

(b) omit the nested relationship between the variables:
summary(lme(bias ~ dprime * emo * race * age, random = ~ 1 | subj, data = data))
which gives numerical p-values for age.

Should I go with specification (b), even if the variables are indeed nested? Is the formula in (a) wrong?

(2) About anova(model): because several of the factors have more than 2 levels, I wanted to use anova() on the model to compute p-values for whole effects. However, the results from summary(model) and anova(model) are inconsistent: sometimes variables significant in summary(model) are not significant in anova(model), and sometimes the reverse occurs. This also happens with factors that only have 2 levels (so it is not due to dummy variables). Is this normal, or is it a bad sign? What does it mean?

Thank you for any hint about those issues.

Best,
L.

2013/10/9 laurie bayet laurieba...@gmail.com

Hi Peter, and thank you for your quick and helpful reply!

"Do you want to know whether the predictors affect the marginal distributions of Y1, Y2, ... or are you interested in conditional effects given other DVs (aka test for additional information)?"

Hmmm, I think that, yes, I am looking for that additional information (although I don't know what "marginal distributions" means). So multiple regression it is, thank you!

Yes, I do have a random effect to include (subject's number). Is it ok to do that?
Can I do this multiple regression with, say, lmer or glmer, even if I am not sure the relation between the two DVs is actually linear (can I run a multiple regression on ranks instead, or should I test for a linear correlation beforehand)?

Thank you again, your answer was very helpful :-)

Best,
L.

2013/10/9 peter dalgaard pda...@gmail.com

As a matter of principle, yes, multivariate mixed models do exist; look at the last bit of example(manova) (in reasonably recent versions of R). In practice, it often doesn't really buy you much: it just gives a joint test for all the DVs, and the estimates are the same as in separate analyses.

The tricky bit is usually to define precisely what the research question is: do you want to know whether the predictors affect the marginal distributions of Y1, Y2, ... or are you interested in conditional effects given other DVs (aka test for additional information)? The latter case leads to regression models where other DVs are entered as covariates.

There's no issue with having categorical variables as predictors in multiple regression in R; dummy variables are created internally. But if you are considering mixed models, presumably you have a random effect that needs to be included?

-pd

On Oct 9, 2013, at 10:23, laurie bayet wrote:

Hi,

Sorry to bother you again. I would like to estimate the effect of several categorical factors (two between subjects and one within subjects) on two continuous dependent variables that probably covary, with subjects as a random effect. I want to control for the covariance between those two DVs when estimating the effects of the categorical predictors on those two DVs. The thing is, I know the predictors have an effect on DV1, and I know DV2 covaries with DV1, so it would be cheating to simply estimate the effect of the predictors on DV2, because those effects could be indirect (via DV1), right? I see two solutions:

One solution would be a mixed model MANOVA (if that even exists).
But I don't know how to run a mixed model MANOVA. I tried to do it with Statistica but couldn't find the right module (I know how to declare two DVs and run a GLM, but I don't know if the covariance between my two DVs is automatically controlled for). Same thing with R. I asked a question on Statistica's forum with no answer, and tried looking around in the manuals with no improvement.

A backup solution would be a multiple regression (regressing DV2 against DV1 with the categorical predictors). But I am not sure how to implement a mixed model, or which function I should use, and besides, it would be much less convenient because one of my categorical predictors has three levels (so I would have to split it and make it two predictors, right?).

Thank you for any help at all!

Cheers,
L.
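[On issue (2) above, one common source of summary()/anova() disagreement with nlme is that anova.lme() defaults to sequential (Type I) tests, which depend on term order, while summary() reports marginal t-tests for individual coefficients. A sketch on a built-in dataset (Orthodont, not the poster's data) shows how to request marginal tests instead:]

```r
library(nlme)

# Random-intercept model on the built-in Orthodont growth data.
m <- lme(distance ~ age * Sex, random = ~ 1 | Subject, data = Orthodont)

anova(m)                     # sequential (Type I) tests, order-dependent
anova(m, type = "marginal")  # marginal tests, closer to summary()'s t-tests
```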
Re: [R] Rcpp and mclapply
Dear Jeff,

I had suspected something along those lines initially; however, as I had stated, this only happens sometimes... Here is a partial top showing two of my calls:

  PID USER   PR NI  VIRT RES  SHR S %CPU %MEM    TIME+ COMMAND
17404 sophie 20  0 12.4g 11g 1996 R  100  4.6  5:50.11 R
17405 sophie 20  0 12.4g 11g 2016 R  100  4.6  5:49.86 R
17408 sophie 20  0 12.4g 11g 2016 R  100  4.6  5:49.13 R
17411 sophie 20  0 12.4g 11g 2016 R  100  4.6  5:48.39 R
17412 sophie 20  0 12.4g 11g 2016 R  100  4.6  5:48.14 R
 1461 sophie 20  0 12.7g 11g 2016 R    2  4.7 25:19.27 R
 1465 sophie 20  0 12.7g 11g 2016 R    2  4.7 25:19.05 R
 1476 sophie 20  0 12.7g 11g 2016 R    2  4.7 25:18.73 R
 1486 sophie 20  0 12.7g 11g 2016 R    2  4.7 25:18.39 R
 1491 sophie 20  0 12.7g 11g 2016 R    2  4.7 25:18.24 R

The ones in the 1400 range come from one call, the ones around 17000 from another. It seems to be linked to the console instance that the call is made from, in a way that I cannot grasp yet. Still puzzled...

Sophie

Jeff Newmiller wrote:

I would bet that you are doing something in C++ that shares some resource between the workers and blocks all but one worker at a time.

sophie_brugie...@mpipsykl.mpg.de wrote:

Dear all, I have an R script that uses Rcpp, and I have been trying to parallelize it using mclapply() (I tried with both the multicore and the parallel packages). Sometimes (not always, interestingly), the CPU use for each core drops, usually so that the total over all cores reaches 100%, i.e., the run is only as fast as if a single core were fully used. I tried running my code directly from within Emacs, and also via a shell command; it happens either way. I suspect there might be some interaction between Rcpp and the multicore/parallel libraries. Did any R(cpp) user encounter such symptoms?
Sophie
[R] Problems with R
Hi, I have recently installed R and am trying to do some work on it. To be honest I'm finding it a PAIN to you. I use Mac and I can open stata dta. files in R despite using the commands suggested to me and I have been trying to get figure out how to do reduced rank regression on it and the manual is not very useful. Can you help? Thanks, Mash
Re: [R] Small p from binomial probability function.
Hi,

Thank you for your answers. I'm not completely sure if it's binom.test I need, or uniroot. Perhaps I should explain more of the idea behind the code and the actual task I'm trying to do.

The idea is to calculate a confidence interval for the age of two DNA sequences which have diverged, where I know the number of mutations that happened in them, and I know the mutation rate. The binomial probability can be used since mutations have a probability of occurring or being observed so many times in a sequence. This depends on the length of the DNA stretch (which equates to the number of trials, since each base is a possibility of observing a mutation) and the probability of a single mutation occurring, which is p = t * u, since more time means a higher probability that a mutation has occurred.

So my code, using pbinom(), is supposed to calculate the probability that my DNA stretches contain at most the number of mutations observed, P(X <= k), given their size (trials) and the probability of a single mutation (p = t * u). However, I'm interested in finding t: t is what is unknown, so the loop repeatedly evaluates the calculation, increasing t each time and checking P(X <= k); when it is 0.05, 0.50 and 0.95, we record t. Ideally I'd like to rearrange this so I can get the probability of a single success (mutation) p, and then divide by the mutation rate to get my t. My supervisor gave me the loopy code, but I imagine there is a way to plug in P(X <= k) as 0.05 and 0.95 and get my upper and lower t estimates.

According to the built-in R docs, binom.test "performs an exact test of a simple null hypothesis about the probability of success in a Bernoulli experiment." Perhaps this is the one I need rather than uniroot?

Best,
Ben.

From: Stefan Evert [stefa...@collocations.de]
Sent: 10 October 2013 09:37
To: R-help Mailing List
Cc: Benjamin Ward (ENV)
Subject: Re: [R] Small p from binomial probability function.
"Sounds like you want a 95% binomial confidence interval: binom.test(N, P) will compute this for you, and you can get the bounds directly with binom.test(N, P)$conf.int"

Actually, binom.test() computes a two-sided confidence interval, which corresponds roughly to the 2.5 and 97.5 percentage points in your approach. It doesn't give you the 50% point either, but I don't think that's a meaningful quantity with a two-sided test.

Hope this helps,
Stefan

On 9 Oct 2013, at 15:53, Benjamin Ward (ENV) b.w...@uea.ac.uk wrote:

I was given some code that uses the R function pbinom:

p <- mut * t
sumprobs <- pbinom(N, B, p) * 1000

which gives the output of a probability as a percentage like 5, 50, 95. What the code currently does is find the values of t I need by using the above two code lines in a loop; each iteration it increases t by one and runs the two lines. When sumprobs equals 5, it records the value of t, then again when sumprobs equals 50, and again when sumprobs equals 95, giving me three t values. This is not an efficient way of doing this if t is large. Is it possible to rearrange pbinom so it gives me the small p (made of mut*t) as the result of plugging in sumprobs instead, and is there an R function that already does this? Since pbinom is the binomial probability function, I suppose the question in more mathematical terminology is: instead of calculating the probability of N successes given the number of trials and the probability of a single success, can I calculate the probability of a single success from the probability of N successes, the number of trials, and the number of successes? Can R do this for me, so I plug in 5, 50, and 95 and get the small p out?
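[The loop over t can also be replaced by a numeric root-finder: for each target probability q, solve pbinom(N, B, u * t) = q for t. The values of N, B and u below are made-up illustrations, not Ben's data:]

```r
N <- 30       # observed mutations (assumed)
B <- 1000     # sequence length, i.e. number of trials (assumed)
u <- 1e-6     # mutation rate per site per unit time (assumed)

t_for <- function(q) {
  # pbinom(N, B, u * t) decreases from 1 (at t = 0) toward 0 (at u * t = 1),
  # so a sign change is guaranteed somewhere on (0, 1/u).
  uniroot(function(t) pbinom(N, B, u * t) - q, lower = 0, upper = 1 / u)$root
}

sapply(c(0.95, 0.50, 0.05), t_for)  # lower, central, upper estimates of t
```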
Re: [R] Rcpp and mclapply
I cannot imagine how the calling process would affect this. If you want further help, I think you will need to provide a reproducible example and info as requested by the Posting Guide.

---
Jeff Newmiller, jdnew...@dcn.davis.ca.us
Sent from my phone. Please excuse my brevity.

sophie_brugie...@mpipsykl.mpg.de wrote:

Dear Jeff,

I had suspected something along those lines initially; however, as I had stated, this only happens sometimes... Here is a partial top showing two of my calls:

  PID USER   PR NI  VIRT RES  SHR S %CPU %MEM    TIME+ COMMAND
17404 sophie 20  0 12.4g 11g 1996 R  100  4.6  5:50.11 R
17405 sophie 20  0 12.4g 11g 2016 R  100  4.6  5:49.86 R
17408 sophie 20  0 12.4g 11g 2016 R  100  4.6  5:49.13 R
17411 sophie 20  0 12.4g 11g 2016 R  100  4.6  5:48.39 R
17412 sophie 20  0 12.4g 11g 2016 R  100  4.6  5:48.14 R
 1461 sophie 20  0 12.7g 11g 2016 R    2  4.7 25:19.27 R
 1465 sophie 20  0 12.7g 11g 2016 R    2  4.7 25:19.05 R
 1476 sophie 20  0 12.7g 11g 2016 R    2  4.7 25:18.73 R
 1486 sophie 20  0 12.7g 11g 2016 R    2  4.7 25:18.39 R
 1491 sophie 20  0 12.7g 11g 2016 R    2  4.7 25:18.24 R

The ones in the 1400 range come from one call, the ones around 17000 from another. It seems to be linked to the console instance that the call is made from, in a way that I cannot grasp yet. Still puzzled...

Sophie
Re: [R] Problems with R
On Thu, Oct 10, 2013 at 11:09 AM, Mash Hamid mxh...@bham.ac.uk wrote:

> Hi, I have recently installed R and am trying to do some work on it. To be honest I'm finding it a PAIN to you.

To me?

> I use Mac and I can open stata dta. files in R

Great, glad to hear it's working!

> despite using the commands suggested to me

Or maybe not... What did you try? What went wrong?

> and I have been trying to get figure out how to do reduced rank regression on it and the manual is not very useful.

The first hit when googling for "R reduced rank regression" suggests that the VGAM package provides reduced rank regression models.

Hope this helps,
Ista

> Can you help?
> Thanks,
> Mash
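[On the .dta side, the usual route is read.dta() from the foreign package that ships with R. A self-contained round trip, using mtcars as stand-in data since the poster's file and exact commands aren't available:]

```r
library(foreign)

# Write a data frame out as a Stata file, then read it back.
tmp <- tempfile(fileext = ".dta")
write.dta(mtcars, tmp)
df <- read.dta(tmp)
str(df)
```

Posting the exact command tried and the exact error message would make the import problem much easier to diagnose.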
[R] Using calibrate for raking (survey package)
I'm studying the calibration function in the survey package in preparation for raking some survey data. Results from the rake function below agree with other sources. When I run calibrate, I get a warning message and the M and F weights seem to be reversed. Even allowing for that, the deviation between calibrated and raked weights is much more than I expected. I see that in the calibrate function population is supposed to be a vector or table, but can't figure out how to adjust. Can you clarify? Thanks. -M. Laviolette

satisfy <- c(2,5,2,3,4,3,3,3,4,2,2,3,2,3,4,3,3,2,3,3,4,3,3,3,2,
             3,3,3,2,1,4,4,3,3,2,3,4,2,3,3,3,5,3,1,4,3,3,4,4,2,
             3,3,3,5,4,4,5,3,4,4,5,3,3,4,3,3,3,3,2,4,4,3,3,4,3,
             2,4,4,3,4,4,4,5,3,3,4,4,4,3,2,2,4,3,4,3,4,4,3,3,3,
             3,4,4,4,4,3,3,3,3,2,3,3,2,2,5,4,5,2,4,4,4,3,4,4,2,
             4,4,3,4,3,4,2,3,3,2,4,3,4,4,3,5,2,4,4,3,4,5,3,3,3,
             3,2,3,4,4,4,2,4,4,2,3,5,2,2,3,3,3,3,3,4,4,3,3,4,4,
             4,4,4,4,4,4,3,2,3,3,3,3,4,4,4,3,3,4,3,4,4,4,3,3,2)
Gender <- c(1,2,1,1,2,1,1,2,1,2,1,1,2,1,1,1,1,2,2,1,2,1,1,2,1,
            2,1,1,2,2,1,1,2,1,2,2,1,1,1,1,2,1,1,1,1,1,1,1,2,1,
            1,1,1,1,2,1,1,2,2,2,2,2,2,2,2,2,1,1,1,1,2,1,2,1,2,
            1,1,2,1,1,2,1,1,1,1,1,1,2,1,1,2,2,2,2,1,1,2,2,1,2,
            1,1,2,1,2,1,2,2,1,1,1,2,1,1,1,2,1,1,2,1,2,2,2,1,1,
            2,2,1,1,1,2,1,2,1,2,2,1,1,1,2,2,1,2,2,2,2,1,2,2,1,
            2,1,2,1,1,2,2,1,1,1,2,2,1,2,2,2,1,2,2,1,1,1,2,2,2,
            1,2,1,2,2,2,2,1,1,2,1,1,1,2,1,1,2,2,1,1,1,1,2,1,1)
Age <- c(2,3,2,1,2,2,2,2,3,2,2,1,2,2,2,2,2,2,2,2,2,2,2,3,2,
         3,3,3,1,2,2,3,2,2,2,1,3,2,2,2,2,2,2,3,2,2,2,2,2,1,
         3,3,2,3,2,2,2,2,2,2,2,3,2,2,1,2,2,2,1,2,2,3,2,2,1,
         2,2,1,2,2,1,2,2,2,2,2,2,2,3,2,2,1,3,2,2,2,3,2,2,2,
         3,1,2,1,2,2,1,2,2,2,2,2,2,1,2,2,3,1,2,2,2,2,2,2,2,
         2,3,1,1,2,1,2,2,2,2,2,2,2,2,1,3,2,2,2,1,2,1,1,2,1,
         2,1,1,2,2,2,2,2,2,2,2,3,2,1,2,1,1,2,3,3,1,3,3,2,2,
         2,2,2,2,2,2,2,3,2,3,3,2,2,2,3,1,2,1,2,3,2,2,2,3,2)
emp.dat <- data.frame(Gender = factor(Gender, labels = c("M", "F")),
                      Age = factor(Age, labels = c("<30", "30-44", "45+")), satisfy)
pop.gender <- data.frame(Gender = c("M", "F"), Freq = c(3800, 6200))
pop.age <- data.frame(Age = c("<30", "30-44", "45+"), Freq = c(2000, 5000, 3000))
library(survey)
emp.svy <- svydesign(ids = ~0, strata = NULL, weights = ~rep(50, 200), data = emp.dat)
rake.svy <- rake(emp.svy, list(~Gender, ~Age), list(pop.gender, pop.age))
cal.svy <- calibrate(emp.svy, formula = list(~Gender, ~Age),
                     population = list(pop.gender, pop.age), cal.fun = "raking")
# Warning message:
# In regcalibrate.survey.design2(design, formula, population, aggregate.stage
# = aggregate.stage, : Sample and population totals have different names.

# check weights -- M and F seem reversed when calibrate used
library(reshape2)
check1 <- with(rake.svy, cbind(variables, weight = 1/prob))
dcast(check1, Gender ~ Age, sum, value.var = "weight", margins = TRUE)
check2 <- with(cal.svy, cbind(variables, weight = 1/prob))
dcast(check2, Gender ~ Age, sum, value.var = "weight", margins = TRUE)
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
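[Editor's note: if I remember the survey API correctly, calibrate() expects calfun = "raking" and, for population, a single named vector of totals matching the model-matrix column names; check ?calibrate rather than taking that as tested. Separately, for readers wondering what rake() is doing under the hood, raking is just iterative proportional fitting of the weighted crosstab to the marginal population totals. A minimal base-R sketch with hypothetical 2x2 counts (not the survey package's implementation):]

```r
# Iterative proportional fitting (raking): alternately rescale rows and
# columns of a table of weighted counts until both margins match targets.
tab <- matrix(c(30, 20, 25, 25), nrow = 2)  # hypothetical starting counts
row.target <- c(60, 40)                     # known row margins (e.g. Gender)
col.target <- c(45, 55)                     # known column margins (e.g. Age)
for (i in 1:100) {
  tab <- tab * (row.target / rowSums(tab))            # scale rows to targets
  tab <- sweep(tab, 2, col.target / colSums(tab), `*`)  # then columns
}
rowSums(tab)  # converges to row.target
colSums(tab)  # converges to col.target
```

The cell ratios of the starting table are preserved; only the margins are forced to the population totals, which is why raked weights depend on the sample's joint distribution.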
[R] Looking for package to solve for exponent using newton's method
Hi, I'm looking for an R function/package that will let me solve problems of the type: 13 = 2^x + 3^x. The answer to this example is x = 2, but I'm looking for solutions when x isn't so easily determined. Looking around, it seems that there is no algebraic solution for x, unless I'm mistaken. Does anyone know a good package to solve these types of problems? Are there built-in functions to do this? Thanks!
[R] installing package gstat
Hello, # I am able to install.packages("gstat") # but when I try to load it I get an error message library(gstat) # Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) : # there is no package called 'intervals' # In addition: Warning message: # package 'gstat' was built under R version 3.0.2 # Error: package or namespace load failed for 'gstat' What do you recommend? -- Simona Augyte, MS PhD student Ecology and Evolutionary Biology University of Connecticut cell 707-832-7007
[R] order() not producing results as I expect
Hello, I'm using R version 3.0.0 on a mac. I'm having trouble getting order to behave as I expect it should. I'm trying to sort a data.frame according to a character vector. I'm able to sort the data.frame, but it returns an unexpected result. I have no idea where the order that is being produced comes from. Any ideas on how to properly order a data frame by a character vector? Here is the current order of the data frame (called str.dat): head(str.dat) str.names POPINFO POPFLAG LOCDATA Loc1 Loc2 Loc3 ind.names 1 alba1.pop3 3 0 1 1232 alba1 2 alba2.pop3 3 0 1332 alba2 3 alch1.pop4 4 0 2232 alch1 4 alch2.pop4 4 0 2232 alch2 5 alco1.pop4 4 0 3332 alco1 6 alco2.pop4 4 0 3332 alco2 Here's the order I expect it to be in when I use order: head(data.frame(gen.names)) gen.names 1 magv1 2 magv2 3 magv3 4 magv4 5 lc1 6 lc2 Here's the order I'm getting: head(str.dat[order(gen.names),]) str.names POPINFO POPFLAG LOCDATA Loc1 Loc2 Loc3 ind.names 111 ncle2.pop5 5 0 39332 ncle2 112 ncle3.pop5 5 0 39222 ncle3 146 wvma1.pop8 8 0 57332 wvma1 145 wvfa2.pop8 8 0 56332 wvfa2 55 flse6.pop2 2 0 19254 flse6 54 flse5.pop2 2 0 19254 flse5 Many thanks, Karl
Re: [R] order() not producing results as I expect
On 10/10/2013 12:00 PM, Karl Fetter wrote: Hello, I'm using R version 3.0.0 on a mac. I'm having trouble getting order to behave as I expect it should. I'm trying to sort a data.frame according to a character vector. I'm able to sort the data.frame, but it returns an unexpected result. I have no idea where the order that is being produced comes from. It comes from gen.names, which is not a column of your dataset. I think you'll have to give us something reproducible before we can help you. Duncan Murdoch Any ideas on how to properly order a data frame by a character vector? Here is the current order of the data frame (called str.dat): head(str.dat) str.names POPINFO POPFLAG LOCDATA Loc1 Loc2 Loc3 ind.names 1 alba1.pop3 3 0 1 1232 alba1 2 alba2.pop3 3 0 1332 alba2 3 alch1.pop4 4 0 2232 alch1 4 alch2.pop4 4 0 2232 alch2 5 alco1.pop4 4 0 3332 alco1 6 alco2.pop4 4 0 3332 alco2 Here's the order I expect it to be in when I use order: head(data.frame(gen.names)) gen.names 1 magv1 2 magv2 3 magv3 4 magv4 5 lc1 6 lc2 Here's the order I'm getting: head(str.dat[order(gen.names),]) str.names POPINFO POPFLAG LOCDATA Loc1 Loc2 Loc3 ind.names 111 ncle2.pop5 5 0 39332 ncle2 112 ncle3.pop5 5 0 39222 ncle3 146 wvma1.pop8 8 0 57332 wvma1 145 wvfa2.pop8 8 0 56332 wvfa2 55 flse6.pop2 2 0 19254 flse6 54 flse5.pop2 2 0 19254 flse5 Many thanks, Karl
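[Editor's note: if the goal is to reorder the rows of str.dat so that its ind.names column follows the sequence in gen.names, note that order(gen.names) sorts gen.names itself, not the data frame. What is wanted is match(). A sketch with made-up miniature data, since the original objects are not reproducible:]

```r
# Hypothetical miniature version of str.dat
str.dat <- data.frame(ind.names = c("alba1", "alch1", "magv1", "lc1"),
                      POPINFO   = c(3, 4, 5, 2),
                      stringsAsFactors = FALSE)
gen.names <- c("magv1", "lc1", "alba1", "alch1")  # the desired row order

# match() gives, for each gen.names entry, its row index in str.dat
str.sorted <- str.dat[match(gen.names, str.dat$ind.names), ]
str.sorted$ind.names  # now follows gen.names
```

This assumes every gen.names value appears exactly once in ind.names; unmatched names would produce NA rows.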
Re: [R] Looking for package to solve for exponent using newton's method
On 10/10/2013 2:39 PM, Ken Takagi wrote: Hi, I'm looking for an R function/package that will let me solve problems of the type: 13 = 2^x + 3^x. The answer to this example is x = 2, but I'm looking for solutions when x isn't so easily determined. Looking around, it seems that there is no algebraic solution for x, unless I'm mistaken. Does anyone know a good package to solve these types of problems? Are there built in functions to do this? You can get approximate solutions using uniroot:

uniroot(function(x) 2^x + 3^x - 13, c(0, 10))
$root
[1] 1.8
$f.root
[1] -0.0002581592
$iter
[1] 10
$estim.prec
[1] 6.103516e-05
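[Editor's note: the non-zero residual above is just uniroot's default tolerance at work; tightening tol recovers the exact root x = 2 to high precision:]

```r
# Same root-finding problem, but with a much tighter convergence tolerance
f <- function(x) 2^x + 3^x - 13
r <- uniroot(f, c(0, 10), tol = 1e-12)
r$root  # very close to the exact solution x = 2
```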
Re: [R] Looking for package to solve for exponent using newton's method
On 10-10-2013, at 20:39, Ken Takagi katak...@bu.edu wrote: Hi, I'm looking for an R function/package that will let me solve problems of the type: 13 = 2^x + 3^x. The answer to this example is x = 2, but I'm looking for solutions when x isn't so easily determined. Looking around, it seems that there is no algebraic solution for x, unless I'm mistaken. Does anyone know a good package to solve these types of problems? Are there built in functions to do this? Univariate equations can be solved with uniroot, available in base R. You can also use package nleqslv for this but that is intended for systems of nonlinear equations. It does however solve your equation. There is also BB which is especially intended for large sparse systems. Berend
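[Editor's note: since the subject line asks specifically about Newton's method, it is worth noting that the derivative of 2^x + 3^x is available in closed form, so a hand-rolled Newton iteration in base R also works. A sketch, not a package recommendation:]

```r
# Newton's method: iterate x <- x - f(x)/f'(x) until the step is tiny
newton <- function(f, fprime, x0, tol = 1e-12, maxit = 100) {
  x <- x0
  for (i in seq_len(maxit)) {
    step <- f(x) / fprime(x)
    x <- x - step
    if (abs(step) < tol) break
  }
  x
}
f      <- function(x) 2^x + 3^x - 13
fprime <- function(x) log(2) * 2^x + log(3) * 3^x  # d/dx a^x = log(a) * a^x
newton(f, fprime, x0 = 1)  # converges to 2
```

Unlike uniroot, this needs the derivative and a reasonable starting point, but it converges quadratically near the root.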
Re: [R] Looking for package to solve for exponent using newton's method
Thanks! That's just what I needed.
Re: [R] Looking for package to solve for exponent using newton's method
?uniroot

---
Jeff Newmiller, DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
Sent from my phone. Please excuse my brevity.

Ken Takagi katak...@bu.edu wrote: Hi, I'm looking for an R function/package that will let me solve problems of the type: 13 = 2^x + 3^x. The answer to this example is x = 2, but I'm looking for solutions when x isn't so easily determined. Looking around, it seems that there is no algebraic solution for x, unless I'm mistaken. Does anyone know a good package to solve these types of problems? Are there built in functions to do this? Thanks!
Re: [R] Bootstrap (bootSem) causes R to crash
Hi everyone, I'd like to report a similar experience. When I attempt to run the first CFA example in bootSem(), specifically the line: system.time(boot.cnes <- bootSem(sem.cnes, R=100, Cov=hcor, data=CNES)) R crashes and I get a notice about X11. I copied my sessionInfo() below. Any ideas for a workaround? Thanks Eric R version 3.0.1 (2013-05-16) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] grid stats graphics grDevices utils datasets methods base other attached packages: [1] polycor_0.7-8 sfsmisc_1.0-23 mvtnorm_0.9-9996 corrgram_1.5 [5] seriation_1.0-11 colorspace_1.2-3 gclus_1.3.1 TSP_1.0-8 [9] cluster_1.14.4 GGally_0.4.4 reshape_0.8.4 plyr_1.8 [13] ggplot2_0.9.3.1 xtable_1.7-1 xlsx_0.5.1 xlsxjars_0.5.0 [17] rJava_0.9-4 nFactors_2.3.3 lattice_0.20-23 boot_1.3-9 [21] sem_3.1-3 matrixcalc_1.0-3 MASS_7.3-29 psych_1.3.10 [25] foreign_0.8-55 knitr_1.5 formatR_0.9 loaded via a namespace (and not attached): [1] dichromat_2.0-0 digest_0.6.3 evaluate_0.5 gtable_0.1.2 [5] labeling_0.2 munsell_0.4.2 proto_0.3-10 RColorBrewer_1.0-5 [9] reshape2_1.2.2 scales_0.2.3 stringr_0.6.2 tools_3.0.1 On Wednesday, March 20, 2013 1:43:26 PM UTC-4, John Fox wrote: Dear pedroabg, I think that it's impossible to know from the information given why R crashes when bootSem() is called a second time after it worked the first time (which is how I interpret the message that you reference). I don't think that bootSem() is doing anything unusual -- it simply refits the model to bootstrap samples and uses try() to intercept model-fitting failures on the bootstrap replications. Also, although it's not relevant to the problem, it's odd not to store the object returned by bootSem() in a variable, which is perhaps what the author of the message means by (2). 
bootSem() does use a Tk progress bar, and I suppose that it's possible that Tcl/Tk for X-Windows or X-Windows itself isn't installed, and that's the source of the error, but I read (1) as implying that the first call to bootSem() worked. bootSem() only uses a Tk progress bar if tcltk is loaded, but I don't believe on the Mac that this ensures that the tcltk package actually works. If this is indeed the source of the problem, I should probably avoid the Tk progress bar under Mac OS X because of the inadequate support out of the box for the tcltk package on that platform. Best, John --- John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada -Original Message- From: r-help-...@r-project.org On Behalf Of pedroabg Sent: Wednesday, March 20, 2013 1:18 PM To: r-h...@r-project.org Subject: Re: [R] Bootstrap (bootSem) causes R to crash I think we didn't receive this on the list. -- View this message in context: http://r.789695.n4.nabble.com/Bootstrap-bootSem-causes-R-to-crash-tp4661900p4661944.html Sent from the R help mailing list archive at Nabble.com.
[R] Help with expression()
Hi everyone, I am hoping someone can help with my attempted use of the expression function. I have a long series of text and variables to paste together, including a degree symbol. The text is to be placed on my scatter plot using the mtext function. Using expression like this:

changetext = expression(paste("Change from ", mini, " to ", maxi, ": ", diff, degree, C, collapse = ""))

does not evaluate my user defined variables - mini, maxi, and diff - just printing them out as words. Using expression like this:

changetext = paste("Change from ", mini, " to ", maxi, ": ", diff, expression(degree, C), collapse = "")

prints the text twice and does not evaluate the degree symbol. I have tried to place the expression alone in a variable and then run the paste:

degsym = expression(degree, C)
changetext = paste("Change from ", mini, " to ", maxi, ": ", diff, degsym, collapse = "")

giving me the same result as the second option. Is there any way I can use the expression function as in the first example but still have R evaluate my user defined variables? Thanks! Sheri
Re: [R] installing package gstat
Simona, You need to install the dependencies:

install.packages("gstat", dependencies = TRUE)

Tom On Thu, Oct 10, 2013 at 11:58 AM, Simona Augyte simona.aug...@uconn.edu wrote: Hello, # I am able to install.packages("gstat") # but when I try to load it I get an error message library(gstat) # Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) : # there is no package called 'intervals' # In addition: Warning message: # package 'gstat' was built under R version 3.0.2 # Error: package or namespace load failed for 'gstat' What do you recommend? -- Simona Augyte, MS PhD student Ecology and Evolutionary Biology University of Connecticut cell 707-832-7007
[R] Revolutions Blog: September roundup
Revolution Analytics staff write about R every weekday at the Revolutions blog: http://blog.revolutionanalytics.com and every month I post a summary of articles from the previous month of particular interest to readers of r-help. In case you missed them, here are some articles related to R from the month of September: Todd Schneider wrote an algorithm in R to find the most convex US state (it's NY), and created an animation to show how it works: http://bit.ly/19qzJ2O Rob Hyndman (of the forecast package) describes how R-based forecasting saved the Australian government millions, in a video describing his new online course: http://bit.ly/19qzGnO Revolution Analytics sponsored more than 60 local R user groups in 2013, and is now taking sponsorship applications for 2014: http://bit.ly/19qzJ2L R 3.0.2 is now available, with bug fixes and improved documentation support: http://bit.ly/19qzJ2N Some tips for data scientists on using R as part of a command-line tool chain: http://bit.ly/19qzJ2M Replay of a Google Hangout panel discussion on how open-source software including R is changing business: http://bit.ly/19qzGnN Hortonworks shares some resources for getting started with Data Science and R: http://bit.ly/19qzJ2R R represented 57% of the software supplements to the Journal of Computational and Graphical Statistics over the past year: http://bit.ly/19qzGE5 On Talk Like a Pirate Day -- Rrrr! -- R was used to chart real-life pirate attacks (http://bit.ly/19qzGnQ), create a pun-inspired pirate flag (http://bit.ly/19qzJ2S), and the Revolution Analytics staff had some pirate fun (http://bit.ly/19qzJ2T). 
Revolution Analytics partners with Teradata to bring R and big-data statistics into the database: http://bit.ly/19qzJ2U An article in Datanami discusses R in Hadoop: http://bit.ly/19qzJ2X Reports from the alpha test of the Revolution Analytics' RevoScaleR package running in Hadoop: http://bit.ly/19qzGE9 A survey of JSM attendees reveals concerns about data privacy and ethical frameworks for data use: http://bit.ly/19qzJ2V Coursera's online R courses are back on: Computing for Data Analysis started on September 23, and Data Analysis starts on October 28: http://bit.ly/19qzGE7 A neat R-based animation shows the progression of a Metropolis-Hastings algorithm for Bayesian estimation: http://bit.ly/19qzGE8 R was mentioned in articles in Data Informed and TechRepublic: http://bit.ly/19qzGEa There are now more than 125 R user groups worldwide, as this map shows: http://bit.ly/19qzJ2W Slides from two recent Revolution Analytics presentations on: high-performance predictive analytics in R and Hadoop; and Big Data, Big Analytics http://bit.ly/19qzJ2Y A tutorial on how to set up R, Hadoop and RHadoop on a single workstation/laptop (for learning or testing): http://bit.ly/19qzJ2Z R is named the top language for data science for the third year running in the KDNuggets poll: http://bit.ly/19qzJ30 Some non-R stories in the past month included: some terrible data visualizations (http://bit.ly/19qzGEb), a data visualization of checkins in SF (http://bit.ly/19qzJ31), paintings of a retro sci-fi Sweden (http://bit.ly/19qzJje), and software for making 3-D models from 2-D images (http://bit.ly/19qzGEc). Meeting times for local R user groups (http://bit.ly/eC5YQe) can be found on the updated R Community Calendar at: http://bit.ly/bb3naW If you're looking for more articles about R, you can find summaries from previous months at http://blog.revolutionanalytics.com/roundups/. 
You can receive daily blog posts via email using services like blogtrottr.com, or join the Revolution Analytics mailing list at http://revolutionanalytics.com/newsletter to be alerted to new articles on a monthly basis. As always, thanks for the comments and please keep sending suggestions to me at da...@revolutionanalytics.com . Don't forget you can also follow the blog using an RSS reader, or by following me on Twitter (I'm @revodavid). Cheers, David -- David M Smith da...@revolutionanalytics.com VP of Marketing, Revolution Analytics http://blog.revolutionanalytics.com Tel: +1 (650) 646-9523 (Seattle WA, USA) Twitter: @revodavid We're hiring! www.revolutionanalytics.com/careers
[R] Error while running MR using rmr2
Hi, I have been trying to run a simple MR program using rmr2 in a single node Hadoop cluster. Here is the environment for the setup: Ubuntu 12.04 (32 bit); R (Ubuntu comes with 2.14.1, so updated to 3.0.2); installed the latest rmr2 and rhdfs from here: https://github.com/RevolutionAnalytics/RHadoop/wiki/Downloads and the corresponding dependencies; Hadoop 1.2.1. Now I am trying to run a simple MR program as

Sys.setenv(HADOOP_HOME = "/home/training/Installations/hadoop-1.2.1")
Sys.setenv(HADOOP_CMD = "/home/training/Installations/hadoop-1.2.1/bin/hadoop")
library(rmr2)
library(rhdfs)
ints = to.dfs(1:100)
calc = mapreduce(input = ints, map = function(k, v) cbind(v, 2*v))
from.dfs(calc)

The mapreduce job fails with the below error message in hadoop-1.2.1/logs/userlogs/job_201310091055_0001/attempt_201310091055_0001_m_00_0/stderr

Error in library(functional) : there is no package called 'functional'
Execution halted
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576)

But sessionInfo() shows that the functional package has been loaded: sessionInfo() R version 3.0.2 (2013-09-25) Platform: i686-pc-linux-gnu (32-bit) locale: [1] LC_CTYPE=en_IN LC_NUMERIC=C LC_TIME=en_IN [4] LC_COLLATE=en_IN LC_MONETARY=en_IN LC_MESSAGES=en_IN [7] LC_PAPER=en_IN LC_NAME=C LC_ADDRESS=C [10] LC_TELEPHONE=C LC_MEASUREMENT=en_IN LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] rhdfs_1.0.6 rJava_0.9-4 rmr2_2.3.0 reshape2_1.2.2 plyr_1.8 [6] stringr_0.6.2 functional_0.4 digest_0.6.3 bitops_1.0-6 RJSONIO_1.0-3 [11] Rcpp_0.10.5 How to get around this problem? 
Thanks, Praveen
Re: [R] pairs plot
On 10/11/2013 02:01 AM, Witold E Wolski wrote: my data are a matrix with 3 numeric columns. I would like to have a pairs plot with scatterplots in the upper part, with hist at the diag and with correlation at the lower. actually default pairs does almost what I want but looks semi awesome. Especially, I didn't find out how to remove the axes from the lower part, where I only want to display the numeric correlation values, and somehow the axes don't fit. Hence I am looking at ggpairs from GGally, and calling it without parameters looks almost perfect: but I can't find out how they got the Corr: in the upper, so I can't put it in the lower, and I do not know how to put the hist in the diag. Hi Witold, The example below is roughly what you want, I think. This could be wrapped up in a function similar to pairs and I might do this in future if I get the time.

wwdat <- data.frame(a = seq(1, 5, length.out = 20) + rnorm(20),
                    b = seq(1, 4, length.out = 20) + rnorm(20), c = rnorm(20))
require(plotrix)
panes(matrix(1:9, nrow = 3, byrow = TRUE), c(1.2, 1, 1.2), c(1.2, 1, 1.2))
# plot 1
par(mar = c(0, 3, 3, 0))
hist(wwdat$a, xaxt = "n", xlab = "", ylab = "", main = "")
box()
# plot 2
par(mar = c(0, 0, 3, 0))
plot(wwdat$a, wwdat$b, xaxt = "n", yaxt = "n", xlab = "", ylab = "")
axis(3)
# plot 3
par(mar = c(0, 0, 3, 3))
plot(wwdat$a, wwdat$c, xaxt = "n", yaxt = "n", xlab = "", ylab = "")
axis(3)
axis(4)
# plot 4
par(mar = c(0, 3, 0, 0))
plot(0, 0, xlim = c(-1, 1), ylim = c(-1, 1), xaxt = "n", yaxt = "n",
     xlab = "", ylab = "", type = "n")
text(0, 0, round(cor(wwdat$a, wwdat$b), 2), cex = 2.5)
# plot 5
par(mar = c(0, 0, 0, 0))
hist(wwdat$b, xaxt = "n", yaxt = "n", xlab = "", ylab = "", main = "")
# plot 6
par(mar = c(0, 0, 0, 3))
plot(wwdat$b, wwdat$c, xaxt = "n", yaxt = "n", xlab = "", ylab = "")
axis(4)
# plot 7
par(mar = c(3, 3, 0, 0))
plot(0, 0, xlim = c(-1, 1), ylim = c(-1, 1), xaxt = "n", yaxt = "n",
     xlab = "", ylab = "", type = "n")
text(0, 0, round(cor(wwdat$c, wwdat$a), 2), cex = 2.5)
# plot 8
par(mar = c(3, 0, 0, 0))
plot(0, 0, xlim = c(-1, 1), ylim = c(-1, 1), xaxt = "n", yaxt = "n",
     xlab = "", ylab = "", type = "n")
text(0, 0, round(cor(wwdat$c, wwdat$b), 2), cex = 2.5)
# plot 9
par(mar = c(3, 0, 0, 3))
hist(wwdat$c, xaxt = "n", yaxt = "n", xlab = "", ylab = "", main = "")
axis(4)
box()

Jim
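[Editor's note: for comparison, base pairs() gets most of the way there via its panel arguments, without laying out the 3x3 grid by hand; the panel functions below are adapted from the examples in ?pairs. The remaining limitation, which is what Witold ran into, is that pairs() still draws axes around every panel, including the correlation ones.]

```r
set.seed(42)
dat <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20))

panel.hist <- function(x, ...) {     # histogram on the diagonal
  usr <- par("usr")
  par(usr = c(usr[1:2], 0, 1.5))     # rescale y so bars fit in the panel
  h <- hist(x, plot = FALSE)
  rect(h$breaks[-length(h$breaks)], 0, h$breaks[-1],
       h$counts / max(h$counts), col = "grey")
}
panel.cor <- function(x, y, ...) {   # correlation in the lower triangle
  par(usr = c(0, 1, 0, 1))           # plain 0..1 coordinates for the label
  text(0.5, 0.5, format(cor(x, y), digits = 2), cex = 2)
}
pairs(dat, upper.panel = panel.smooth,
      diag.panel = panel.hist, lower.panel = panel.cor)
```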
Re: [R] Possible loop/ if statement query
Fantastic - once again, thanks Arun - your knowledge is very impressive! Ben Gillespie, Research Postgraduate School of Geography, University of Leeds, Leeds, LS2 9JT Tel: +44(0)113 34 33345 Mob: +44(0)770 868 7641 http://www.geog.leeds.ac.uk/ @RiversBenG

From: arun [smartpink...@yahoo.com] Sent: 10 October 2013 03:24 To: Benjamin Gillespie Cc: R help Subject: Re: [R] Possible loop/ if statement query Hi, Try:

b1 <- b
b1[!(b1 >= 7)] <- NA
lst1 <- split(b1, cumsum(c(0, abs(diff(b >= 7)))))
indx <- as.logical((seq_along(lst1) - 1) %% 2)
lst1[indx] <- lapply(seq_along(lst1[indx]), function(i) {
  lst1[indx][[i]] <- rep(i, length(lst1[indx][[i]]))
})
C2 <- unlist(lst1, use.names = FALSE)
all.equal(c1, C2)
# [1] TRUE

A.K. - Original Message - From: arun smartpink...@yahoo.com To: Benjamin Gillespie gy...@leeds.ac.uk Sent: Wednesday, October 9, 2013 8:19 PM Subject: Re: [R] Possible loop/ if statement query Hi, There should be a simpler way with cumsum(diff()).

b = c((1:10), sort(1:9, decreasing=TRUE), (2:12), sort(6:11, decreasing=TRUE), (7:13))
b1 <- b
c1 = c(NA,NA,NA,NA,NA,NA,1,1,1,1,1,1,1,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,2,2,2,2,2,2,2,2,2,2,2,NA,3,3,3,3,3,3,3)
rl1 <- rle(b1 >= 7)
indx1 <- (cumsum(rl1$lengths) + 1)[!rl1$values]
indx2 <- (cumsum(rl1$lengths))[rl1$values]
b1[!(b1 >= 7)] <- NA
lst1 <- split(sort(c(indx1, indx2)), ((seq_along(sort(c(indx1, indx2))) - 1) %/% 2) + 1)
mat1 <- sapply(seq_along(lst1), function(i) {x <- lst1[[i]]; b1[seq(x[1], x[2])] <- i; b1})
indx2New <- !is.na(mat1[,2]) & mat1[,2] == 2
indx3New <- !is.na(mat1[,3]) & mat1[,3] == 3
mat1[!is.na(mat1[,1]) & mat1[,1] > 3, 1] <- c(mat1[,2][indx2New], mat1[,3][indx3New])
all.equal(c1, mat1[,1])
# [1] TRUE

A.K. - Original Message - From: Benjamin Gillespie gy...@leeds.ac.uk To: r-help@R-project.org Sent: Wednesday, October 9, 2013 6:39 PM Subject: [R] Possible loop/ if statement query Dear r genii, I hope you can help. I have vector 'b':

b = c((1:10), sort(1:9, decreasing=TRUE), (2:12), sort(6:11, decreasing=TRUE), (7:13))

and, from 'b' I wish to create vector 'c':

c = c(NA,NA,NA,NA,NA,NA,1,1,1,1,1,1,1,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,2,2,2,2,2,2,2,2,2,2,2,NA,3,3,3,3,3,3,3)

The rules I want to use to create 'c' are: A numeric equal to, or over, 7 in 'b' needs to result in a numeric (i.e. not NA) in 'c'; A numeric of less than 7 in 'b' needs to result in NA in 'c'; Where 'groups' of numerics equal to or over 7 in 'b' are present (i.e. next to each other in the list), the numerics produced in 'c' all need to be the same; Each 'group' of numerics in 'b' must result in a unique numeric in 'c' (and, ideally, they should run in sequence as in 'c' above (1,2,3...)). If anyone has any idea where to start on this or can crack it I'll be most grateful!! Many thanks in advance, Ben Gillespie, Research Postgraduate School of Geography, University of Leeds, Leeds, LS2 9JT Tel: +44(0)113 34 33345 Mob: +44(0)770 868 7641 http://www.geog.leeds.ac.uk/ @RiversBenG
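[Editor's note: for the archives, rle() gets this in three lines; a sketch, checked against the expected output stated in the question:]

```r
b <- c(1:10, sort(1:9, decreasing = TRUE), 2:12,
       sort(6:11, decreasing = TRUE), 7:13)
r <- rle(b >= 7)                               # runs of "at or above 7"
ids <- ifelse(r$values, cumsum(r$values), NA)  # number the TRUE runs 1, 2, ...
out <- rep(ids, r$lengths)                     # expand back to b's length
```

cumsum over the logical run values increments once per run of values >= 7, which is exactly the "unique sequential group number" rule; the FALSE runs become NA.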
Re: [R] Help with expression()
changetext = expression(paste("Change from ", mini, " to ", maxi, ": ", diff, degree, C, collapse = "")) does not evaluate my user defined variables - mini, maxi, and diff - just printing them out as words

bquote() can do it: put the variables which should be evaluated in .(). E.g.,

mini <- 13
maxi <- 97
diff <- maxi - mini
plot(0, 0, main = bquote("Change from" ~ .(mini) ~ "to" ~ .(maxi) * ":" ~ .(diff) ~ degree ~ C))

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sheri O'Connor Sent: Thursday, October 10, 2013 1:08 PM To: R-help@r-project.org Subject: [R] Help with expression() Hi everyone, I am hoping someone can help with my attempted use of the expression function. I have a long series of text and variables to paste together, including a degree symbol. The text is to be placed on my scatter plot using the mtext function. Using expression like this: changetext = expression(paste("Change from ", mini, " to ", maxi, ": ", diff, degree, C, collapse = "")) does not evaluate my user defined variables - mini, maxi, and diff - just printing them out as words. Using expression like this: changetext = paste("Change from ", mini, " to ", maxi, ": ", diff, expression(degree, C), collapse = "") prints the text twice and does not evaluate the degree symbol. I have tried to place the expression alone in a variable and then run the paste: degsym = expression(degree, C) changetext = paste("Change from ", mini, " to ", maxi, ": ", diff, degsym, collapse = "") giving me the same result as the second option. Is there any way I can use the expression function as in the first example but still have R evaluate my user defined variables? Thanks! Sheri --- Sheri O'Connor M.Sc Candidate Department of Biology Lakehead University - Thunder Bay/Orillia 500 University Avenue Orillia, ON L3V 0B9
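[Editor's note: the substitution mechanics can be inspected without a plot device: everything wrapped in .() is evaluated when bquote() is called, and the rest is left as plotmath. A minimal check, reusing the thread's variable names (note that `diff` here harmlessly shadows base::diff):]

```r
mini <- 13
maxi <- 97
diff <- maxi - mini
# .(x) splices the current value of x into the plotmath expression
lab <- bquote("Change from" ~ .(mini) ~ "to" ~ .(maxi) * ":" ~ .(diff) ~ degree ~ C)
deparse(lab)  # the values 13, 97 and 84 are baked into the expression
```

The resulting call can be passed directly to main=, mtext(), etc., just as in Bill's answer.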
Re: [R] Problems with R
Wow, that really sounds terrible. Is someone making you use R? For my tasks it is generally an improvement over other tools. Sometimes it can be a bit puzzling, but often the result works more reliably and it gives me warnings when my data are messed up. If you have a better tool for your work, perhaps you should use that. If you want help on this list, complaining is not going to be very productive, though. In addition to telling us what you are trying to accomplish, you need to show us what you are doing so we can point out where you are going wrong. You might find http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example helpful.

---
Jeff Newmiller, DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
Sent from my phone. Please excuse my brevity.

Mash Hamid mxh...@bham.ac.uk wrote: Hi, I have recently installed R and am trying to do some work on it. To be honest I'm finding it a PAIN to use. I use Mac and I can't open Stata .dta files in R despite using the commands suggested to me, and I have been trying to figure out how to do reduced rank regression on it and the manual is not very useful. Can you help? Thanks, Mash
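[Editor's note: on the concrete .dta question, the foreign package (shipped with R) reads Stata files; if I recall correctly it covers Stata formats up to version 12, so newer Stata releases would need to save in an older format. A round-trip sketch using a temporary file rather than Mash's actual data:]

```r
library(foreign)  # recommended package bundled with R

# Write a small data frame out as .dta, then read it back in
f <- tempfile(fileext = ".dta")
write.dta(data.frame(x = 1:3, y = c(2.5, 3.5, 4.5)), f)
d <- read.dta(f)
d$x  # the original column, recovered from the Stata file
```

If read.dta() fails on a real file, the usual first check is the Stata version that wrote it.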
[R] Splitting times into groups based on a range of times
Hi all, I hope you can help with this one! I have a dataframe 'df' that consists of a vector of times, 'dt2', and a vector of group ids, 'group':

dates2 = rep("01/02/13", times=8)
times2 = c("12:00:00","12:30:00","12:45:00","13:15:00","13:30:00","14:00:00","14:45:00","17:30:00")
y = paste(dates2, times2)
dt2 = strptime(y, "%m/%d/%y %H:%M:%S")
group = c(1,1,2,2,3,3,4,4)
df = data.frame(dt2, group)

I also have a vector 'dt' which is a series of times:

dates = rep("01/02/13", times=24)
times = c("12:00:00","12:15:00","12:30:00","12:45:00","13:00:00","13:15:00","13:30:00","13:45:00","14:00:00","14:15:00","14:30:00","14:45:00","15:00:00","15:15:00","15:30:00","15:45:00","16:00:00","16:15:00","16:30:00","16:45:00","17:00:00","17:15:00","17:30:00","17:45:00")
x = paste(dates, times)
dt = strptime(x, "%m/%d/%y %H:%M:%S")

I wish to create a vector which looks like 'id':

id = c(1,1,1,2,2,2,3,3,3,0,0,4,4,4,4,4,4,4,4,4,4,4,4,0)

The rules I wish to follow to create 'id' are: 1. If a value in 'dt' is either equal to, or within the range of, the times of group x in dataframe 'df', then the value in 'id' will equal x. So, for example, in 'df', group 4 is between the times of 14:45:00 and 17:30:00 on 01/02/13. Thus, the 12th to 23rd values in 'id' equal 4, as these values correspond to times within 'dt' that are equal to or within the range of 14:45:00 and 17:30:00 on 01/02/13. If this doesn't make sense, please ask; I'm not sure where to even start with this... possibly the 'cut' function? Many thanks in advance, Ben Gillespie, Research Postgraduate School of Geography, University of Leeds, Leeds, LS2 9JT Tel: +44(0)113 34 33345 Mob: +44(0)770 868 7641 http://www.geog.leeds.ac.uk/ @RiversBenG __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bootstrap (bootSem) causes R to crash
Hi, I solved the crashing issue by removing references to tcltk. Here is a gist that shows the edits: https://gist.github.com/ericpgreen/6926595 However, I am getting an error about convergence failures:

Error in bootSem2(sem.cnes, R = 100, Cov = hcor, data = CNES) :
  more than 10 consecutive convergence failures

As described in my previous message, I am trying to run the first example in ?bootSem, not my own data. The only difference from the example is that I call the modified bootSem2() function I created by commenting out references to tcltk. Increasing max.failures does not seem to help. I'm not sure what is going on, but I think this example should work fine. Eric On Thursday, October 10, 2013 at 3:01 PM, Eric Green wrote: Hi everyone, I'd like to report a similar experience. When I attempt to run the first CFA example in ?bootSem, specifically the line:

system.time(boot.cnes <- bootSem(sem.cnes, R=100, Cov=hcor, data=CNES))

R crashes and I get a notice about X11. I copied my sessionInfo() below. Any ideas for a workaround?
Thanks Eric

R version 3.0.1 (2013-05-16) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] grid stats graphics grDevices utils datasets methods base other attached packages: [1] polycor_0.7-8 sfsmisc_1.0-23 mvtnorm_0.9-9996 corrgram_1.5 [5] seriation_1.0-11 colorspace_1.2-3 gclus_1.3.1 TSP_1.0-8 [9] cluster_1.14.4 GGally_0.4.4 reshape_0.8.4 plyr_1.8 [13] ggplot2_0.9.3.1 xtable_1.7-1 xlsx_0.5.1 xlsxjars_0.5.0 [17] rJava_0.9-4 nFactors_2.3.3 lattice_0.20-23 boot_1.3-9 [21] sem_3.1-3 matrixcalc_1.0-3 MASS_7.3-29 psych_1.3.10 [25] foreign_0.8-55 knitr_1.5 formatR_0.9 loaded via a namespace (and not attached): [1] dichromat_2.0-0 digest_0.6.3 evaluate_0.5 gtable_0.1.2 [5] labeling_0.2 munsell_0.4.2 proto_0.3-10 RColorBrewer_1.0-5 [9] reshape2_1.2.2 scales_0.2.3 stringr_0.6.2 tools_3.0.1

On Wednesday, March 20, 2013 1:43:26 PM UTC-4, John Fox wrote: Dear pedroabg, I think that it's impossible to know from the information given why R crashes when bootSem() is called a second time after it worked the first time (which is how I interpret the message that you reference). I don't think that bootSem() is doing anything unusual -- it simply refits the model to bootstrap samples and uses try() to intercept model-fitting failures on the bootstrap replications. Also, although it's not relevant to the problem, it's odd not to store the object returned by bootSem() in a variable, which is perhaps what the author of the message means by (2). bootSem() does use a Tk progress bar, and I suppose that it's possible that Tcl/Tk for X-Windows or X-Windows itself isn't installed, and that's the source of the error, but I read (1) as implying that the first call to bootSem() worked. bootSem() only uses a Tk progress bar if tcltk is loaded, but I don't believe on the Mac that this ensures that the tcltk package actually works.
If this is indeed the source of the problem, I should probably avoid the Tk progress bar under Mac OS X because of the inadequate out-of-the-box support for the tcltk package on that platform. Best, John --- John Fox, Senator William McMaster Professor of Social Statistics, Department of Sociology, McMaster University, Hamilton, Ontario, Canada -----Original Message----- From: r-help-...@r-project.org On Behalf Of pedroabg Sent: Wednesday, March 20, 2013 1:18 PM To: r-h...@r-project.org Subject: Re: [R] Bootstrap (bootSem) causes R to crash I think we didn't receive this on the list. -- View this message in context: http://r.789695.n4.nabble.com/Bootstrap-bootSem-causes-R-to-crash-tp4661900p4661944.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Gaussian Quadrature for arbitrary PDF
Hi all, We know that the Hermite polynomial is for the Gaussian, the Laguerre polynomial for the exponential distribution, the Legendre polynomial for the uniform distribution, and the Jacobi polynomial for the beta distribution. Does anyone know which kind of polynomial deals with the log-normal, Student's t, inverse gamma and Fisher's F distributions? Thank you in advance! David __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Gaussian Quadrature for arbitrary PDF
On 10/10/2013 5:02 PM, Marino David wrote: Hi all, We know that the Hermite polynomial is for the Gaussian, the Laguerre polynomial for the exponential distribution, the Legendre polynomial for the uniform distribution, and the Jacobi polynomial for the beta distribution. Does anyone know which kind of polynomial deals with the log-normal, * Lognormal in X is normal for Z = log(X). Therefore, you'd use Hermite polynomials in Z. Student's t, inverse gamma and Fisher's F distribution? * If X follows an F(d1, d2) distribution, then Z = d1*X/(d1*X + d2) follows a beta distribution. Use Jacobi polynomials on Z. * If X follows a Student's t(df), then X^2 follows an F(1, df) distribution. Again, use Jacobi on the appropriate transform. * If X follows an inverse gamma, then 1/X follows a gamma distribution. Does this answer the question? Spencer Thank you in advance! David __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
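[Editor's note] The F-to-beta relation above is easy to verify numerically in R; this sketch simply checks that the two cumulative distribution functions agree (the particular d1, d2 and x values are arbitrary illustration values).

```r
# If X ~ F(d1, d2), then Z = d1*X / (d1*X + d2) ~ Beta(d1/2, d2/2).
# Check by comparing the two CDFs at an arbitrary point.
d1 <- 3; d2 <- 7
x  <- 1.5
z  <- d1 * x / (d1 * x + d2)
c(F_cdf = pf(x, d1, d2), beta_cdf = pbeta(z, d1 / 2, d2 / 2))
```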
Re: [R] Gaussian Quadrature for arbitrary PDF
p.s. Orthogonal polynomials can be defined for any probability distribution on the real line, discrete, continuous, or otherwise, as described in the Wikipedia article on orthogonal polynomials. On 10/10/2013 5:02 PM, Marino David wrote: Hi all, We know that the Hermite polynomial is for the Gaussian, the Laguerre polynomial for the exponential distribution, the Legendre polynomial for the uniform distribution, and the Jacobi polynomial for the beta distribution. Does anyone know which kind of polynomial deals with the log-normal, * Lognormal in X is normal for Z = log(X). Therefore, you'd use Hermite polynomials in Z. Student's t, inverse gamma and Fisher's F distribution? * If X follows an F(d1, d2) distribution, then Z = d1*X/(d1*X + d2) follows a beta distribution. Use Jacobi polynomials on Z. * If X follows a Student's t(df), then X^2 follows an F(1, df) distribution. Again, use Jacobi on the appropriate transform. * If X follows an inverse gamma, then 1/X follows a gamma distribution. Does this answer the question? Spencer Thank you in advance! David __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Gaussian Quadrature for arbitrary PDF
Thanks so much for your response. BTW, do you know of any Gauss quadrature R package that can deal with an arbitrary PDF? Thank you! David 2013/10/11 Spencer Graves spencer.gra...@structuremonitoring.com: p.s. Orthogonal polynomials can be defined for any probability distribution on the real line, discrete, continuous, or otherwise, as described in the Wikipedia article on orthogonal polynomials. On 10/10/2013 5:02 PM, Marino David wrote: Hi all, We know that the Hermite polynomial is for the Gaussian, the Laguerre polynomial for the exponential distribution, the Legendre polynomial for the uniform distribution, and the Jacobi polynomial for the beta distribution. Does anyone know which kind of polynomial deals with the log-normal, * Lognormal in X is normal for Z = log(X). Therefore, you'd use Hermite polynomials in Z. Student's t, inverse gamma and Fisher's F distribution? * If X follows an F(d1, d2) distribution, then Z = d1*X/(d1*X + d2) follows a beta distribution. Use Jacobi polynomials on Z. * If X follows a Student's t(df), then X^2 follows an F(1, df) distribution. Again, use Jacobi on the appropriate transform. * If X follows an inverse gamma, then 1/X follows a gamma distribution. Does this answer the question? Spencer Thank you in advance! David __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
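[Editor's note] As a concrete illustration of the change-of-variable approach for the lognormal, here is a self-contained sketch in base R. It builds Gauss-Hermite nodes and weights via the Golub-Welsch eigenvalue method (the statmod package's gauss.quad() offers the same thing ready-made) and uses them to compute a lognormal mean; the mu and sigma values are arbitrary.

```r
# Gauss-Hermite nodes/weights via Golub-Welsch: eigendecompose the
# symmetric tridiagonal Jacobi matrix for the weight exp(-x^2).
gauss_hermite <- function(n) {
  b <- sqrt(seq_len(n - 1) / 2)           # off-diagonal recurrence terms
  J <- diag(0, n)
  J[cbind(1:(n - 1), 2:n)] <- b
  J[cbind(2:n, 1:(n - 1))] <- b
  e <- eigen(J, symmetric = TRUE)
  list(nodes   = e$values,
       weights = sqrt(pi) * e$vectors[1, ]^2)   # mu0 = integral of exp(-x^2)
}

# E[X] for X lognormal(mu, sigma): write X = exp(Z), Z ~ N(mu, sigma^2),
# and integrate exp(mu + sqrt(2)*sigma*t) against the Hermite weight.
mu <- 0.1; sigma <- 0.5
gh  <- gauss_hermite(20)
est <- sum(gh$weights * exp(mu + sqrt(2) * sigma * gh$nodes)) / sqrt(pi)
c(quadrature = est, exact = exp(mu + sigma^2 / 2))
```

Twenty nodes are already far more than needed here; Gauss-Hermite quadrature converges very quickly for smooth integrands like exp().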
Re: [R] Small p from binomial probability function.
It is mysterious to me why the procedure proposed by Stefan Evert works. It appears to work --- once you modify the call to binom.test() to have the correct syntax. In a sequence of 1000 trials with random values of N, x, and p0, the answers from Evert's procedure agreed with the answer given by uniroot() to within +/- 3.045e-05. However your question was (in effect) how to solve the equation Pr(X >= x) = p0 for p, where X ~ Binom(N,p), with N and x known. What this has to do with confidence intervals for p is, to my mind at least, completely opaque. In contrast it is obvious why the procedure using uniroot() works. I would suggest that you stick with the uniroot() procedure in that it is readily comprehensible. cheers, Rolf Turner On 10/11/13 03:56, Benjamin Ward (ENV) wrote: Hi, Thank you for your answers, I'm not completely sure if it's binom.test I need or uniroot. Perhaps I should explain more the idea behind the code and the actual task I'm trying to do. The idea is to calculate a confidence interval as to the age of two DNA sequences which have diverged, where I know the number of mutations that happened in them, and I know the mutation rate. The binomial probability can be used since mutations have a probability of occurring or being observed so many times in a sequence. This is dependent on the length of the DNA stretch (which equates to the number of trials, since each base is a possibility of observing a mutation) and the probability of a single mutation occurring, which is p = t * u, since more time means a higher probability a mutation may have occurred. So my code, using pbinom, is supposed to calculate the probability that my DNA stretches contain the number of mutations observed, P(X >= k), given their size (trials) and the probability of a single mutation (p = t * u).
However I'm interested in finding t: t is what is unknown, so the loop repeatedly evaluates the calculation, increasing t each time and checking P(X >= k); when it is 0.05, 0.50 and 0.95, we record t. Ideally I'd like to rearrange this so I can get the probability of a single success (mutation) p, and then divide by the mutation rate to get my t. My supervisor gave me the loopy code, but I imagine there is a way to plug in P(X >= k) as 0.05 and 0.95 and get my upper and lower t estimates. According to the R built-in docs: binom.test Description: Performs an exact test of a simple null hypothesis about the probability of success in a Bernoulli experiment. Perhaps this is the one I need rather than uniroot? Best, Ben. From: Stefan Evert [stefa...@collocations.de] Sent: 10 October 2013 09:37 To: R-help Mailing List Cc: Benjamin Ward (ENV) Subject: Re: [R] Small p from binomial probability function. Sounds like you want a 95% binomial confidence interval: binom.test(N, P) will compute this for you, and you can get the bounds directly with binom.test(N, P)$conf.int Actually, binom.test computes a two-sided confidence interval, which corresponds roughly to the 2.5 and 97.5 percentages in your approach. It doesn't give you the 50% point either, but I don't think that's a meaningful quantity with a two-sided test. Hope this helps, Stefan On 9 Oct 2013, at 15:53, Benjamin Ward (ENV) b.w...@uea.ac.uk wrote: I got given some code that uses the R function pbinom:

p <- mut * t
sumprobs <- pbinom(N, B, p) * 1000

which gives the output of a probability as a percentage like 5, 50, 95. What the code currently does is find me the values of t I need, by using the above two code lines in a loop; each iteration it increases t by one and runs the two lines. When sumprobs equals 5, it records the value t, then again when sumprobs is equal to 50, and again when sumprobs is equal to 95 - giving me three t values. This is not an efficient way of doing this if t is large.
Is it possible to rearrange pbinom so it gives me the small p (made of mut*t) as the result of plugging in the sumprobs instead, and is there an R function that already does this? Since pbinom is the binomial probability function, I suppose the question is - in more mathematical terminology - can I change this code so that instead of calculating the probability of N successes given the number of trials and the probability of a single success, I instead calculate the probability of a single success using the probability of N successes and the number of trials? Can R do this for me? So instead I plug in 5, 50, and 95, and then get the small p out? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Splitting times into groups based on a range of times
Hi Ben, I would look into ?findInterval() or ?cut() for an easier solution.

indx <- match(df[,1], as.POSIXct(dt))
indx2 <- unique(df[,2])
lst1 <- lapply(split(indx, ((seq_along(indx)-1) %/% 2) + 1), function(x) seq(x[1], x[2]))
res <- unlist(lapply(seq_along(lst1), function(i) {
  val <- rep(indx2[i], length(lst1[[i]]))
  names(val) <- lst1[[i]]
  val
}))
res1 <- res[match(seq_along(dt), names(res))]
res1[is.na(res1)] <- 0
names(res1) <- NULL
res1
# [1] 1 1 1 2 2 2 3 3 3 0 0 4 4 4 4 4 4 4 4 4 4 4 4 0
identical(id, res1)
# [1] TRUE

On Thursday, October 10, 2013 8:10 PM, Benjamin Gillespie gy...@leeds.ac.uk wrote: Hi all, I hope you can help with this one! I have a dataframe 'df' that consists of a vector of times, 'dt2', and a vector of group ids, 'group':

dates2 = rep("01/02/13", times=8)
times2 = c("12:00:00","12:30:00","12:45:00","13:15:00","13:30:00","14:00:00","14:45:00","17:30:00")
y = paste(dates2, times2)
dt2 = strptime(y, "%m/%d/%y %H:%M:%S")
group = c(1,1,2,2,3,3,4,4)
df = data.frame(dt2, group)

I also have a vector 'dt' which is a series of times:

dates = rep("01/02/13", times=24)
times = c("12:00:00","12:15:00","12:30:00","12:45:00","13:00:00","13:15:00","13:30:00","13:45:00","14:00:00","14:15:00","14:30:00","14:45:00","15:00:00","15:15:00","15:30:00","15:45:00","16:00:00","16:15:00","16:30:00","16:45:00","17:00:00","17:15:00","17:30:00","17:45:00")
x = paste(dates, times)
dt = strptime(x, "%m/%d/%y %H:%M:%S")

I wish to create a vector which looks like 'id':

id = c(1,1,1,2,2,2,3,3,3,0,0,4,4,4,4,4,4,4,4,4,4,4,4,0)

The rules I wish to follow to create 'id' are: 1. If a value in 'dt' is either equal to, or within the range of, the times of group x in dataframe 'df', then the value in 'id' will equal x. So, for example, in 'df', group 4 is between the times of 14:45:00 and 17:30:00 on 01/02/13. Thus, the 12th to 23rd values in 'id' equal 4, as these values correspond to times within 'dt' that are equal to or within the range of 14:45:00 and 17:30:00 on 01/02/13. If this doesn't make sense, please ask; I'm not sure where to even start with this... possibly the 'cut' function?
Many thanks in advance, Ben Gillespie, Research Postgraduate School of Geography, University of Leeds, Leeds, LS2 9JT Tel: +44(0)113 34 33345 Mob: +44(0)770 868 7641 http://www.geog.leeds.ac.uk/ @RiversBenG __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
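[Editor's note] The interval logic can also be expressed directly against the start/end pairs in 'df'. A minimal self-contained sketch: it rebuilds the example data from the question, then assigns each time in 'dt' the id of the group whose range contains it, or 0 otherwise.

```r
# Rebuild the example data from the original post (quotes restored).
dates2 <- rep("01/02/13", times = 8)
times2 <- c("12:00:00","12:30:00","12:45:00","13:15:00",
            "13:30:00","14:00:00","14:45:00","17:30:00")
dt2   <- as.POSIXct(strptime(paste(dates2, times2), "%m/%d/%y %H:%M:%S"))

# The 24 quarter-hour times from 12:00:00 to 17:45:00.
times <- sprintf("%02d:%02d:00", rep(12:17, each = 4), rep(c(0, 15, 30, 45), 6))
dt    <- as.POSIXct(strptime(paste("01/02/13", times), "%m/%d/%y %H:%M:%S"))

# Odd rows of dt2 are group starts, even rows are group ends.
starts <- dt2[seq(1, length(dt2), by = 2)]
ends   <- dt2[seq(2, length(dt2), by = 2)]

id <- vapply(seq_along(dt), function(i) {
  g <- which(dt[i] >= starts & dt[i] <= ends)
  if (length(g)) g[1] else 0L
}, integer(1))
id
```

This produces the vector c(1,1,1,2,2,2,3,3,3,0,0,4,...,4,0) requested in the question; for large inputs, a vectorised findInterval() approach would scale better than the vapply loop.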
Re: [R] Small p from binomial probability function.
I've figured it out. It ***is*** obvious why Evert's procedure works. Once you hold your head at the correct angle, as my first year calculus lecturer used to say. The binom.test() confidence interval gives you the value of a random variable, say U (for upper), such that Pr(U >= p) = p0, where U is a function of the observed binomial random variable, say U = h(X). The observed value of U is h(x), where x is the observed value of X. Now we want p such that Pr(X >= x) = p0, where X ~ Binom(N,p). But when X ~ Binom(N,p), Pr(U >= p) = p0, i.e. Pr(h(X) >= p) = p0, so if we take p = h(x) we have Pr(h(X) >= h(x)) = p0, whence Pr(X >= x) = p0, as desired. Still twists my head, but. cheers, Rolf Turner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
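[Editor's note] The uniroot() route recommended in this thread is short enough to sketch here. The N, x and p0 values are arbitrary illustration values; monotonicity of Pr(X >= x) in p guarantees a unique root on (0, 1).

```r
# Solve Pr(X >= x) = p0 for p, where X ~ Binom(N, p).
N  <- 1000; x <- 37; p0 <- 0.95   # illustration values only

f <- function(p) pbinom(x - 1, N, p, lower.tail = FALSE) - p0

# Pr(X >= x) increases from ~0 to ~1 as p goes from 0 to 1,
# so f changes sign on the interval and uniroot() finds the root.
p_hat <- uniroot(f, interval = c(1e-9, 1 - 1e-9), tol = 1e-10)$root
p_hat
pbinom(x - 1, N, p_hat, lower.tail = FALSE)   # should recover p0
```

The tight tol matters here: the default uniroot tolerance is on p, and with N in the hundreds a small error in p translates into a visible error in the probability.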
Re: [R] help for compare regression coefficients across groups
On Oct 10, 2013, at 3:44 AM, Andreia Fonseca wrote: Dear all, I have data related to cell count across time in 2 different types of cells. I have transformed the count data using a log and I want to test the H0: B_cell_type1 = B_cell_type2 across time. For that I am fitting the following model:

fit_all <- lm(data$count ~ data$cell_type + data$time + data$cell_type*data$time)

the output is

Coefficients:
                             Estimate Std. Error t value Pr(>|t|)
(Intercept)                 1.0450021  0.0286824  36.434   <2e-16 ***
data$cell_typeOV           -0.0456669  0.0405631  -1.126    0.271
data$time                   0.0115620  0.0004815  24.015   <2e-16 ***
data$cell_typeOV:data$time -0.0009764  0.0006809  -1.434    0.164
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.06318 on 26 degrees of freedom
Multiple R-squared: 0.9764, Adjusted R-squared: 0.9737
F-statistic: 358.8 on 3 and 26 DF, p-value: < 2.2e-16

In spite of the fact that the p-value of the interaction is > 0.05, may I still conclude that B_cell_type1 is different from B_cell_type2 because the p-value of the fit is lower than 0.05? You have offered the output of an interaction model and only provided the Wald statistics on terms, which is a situation where those values are usually meaningless. Furthermore, you have not described the study or the data in any detail. You should seek competent local consultation at your institution. If you get an answer based only on this information, that should result in a lower opinion (in the Bayesian sense) of the competence of the responder. -- David. Thanks in advance for your help. With kind regards, Andreia -- Andreia J. Amaral, PhD BioFIG - Center for Biodiversity, Functional and Integrative Genomics Instituto de Medicina Molecular University of Lisbon Tel: +352 21750 (ext.
office: 28253) email: andreiaama...@fm.ul.pt ; andreiaama...@fc.ul.pt David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
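[Editor's note] The standard way to address the underlying question (are the slopes equal across cell types?) is a nested-model F-test of the interaction rather than reading a single Wald line. A sketch on simulated data; all names and coefficient values here are invented for illustration and are not the poster's data.

```r
# Compare slopes across groups: F-test of the interaction via nested models.
set.seed(42)
d <- data.frame(time      = rep(1:15, 2),
                cell_type = rep(c("A", "OV"), each = 15))
d$count <- 1 + 0.012 * d$time -
           0.001 * d$time * (d$cell_type == "OV") +
           rnorm(nrow(d), sd = 0.06)

fit_null <- lm(count ~ cell_type + time, data = d)  # common slope
fit_full <- lm(count ~ cell_type * time, data = d)  # group-specific slopes
anova(fit_null, fit_full)                           # F-test of the interaction
```

With a single interaction term this F-test is equivalent to the t-test on the interaction coefficient, but the nested-model form generalises cleanly when the grouping factor has more than two levels.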
[R] summary and plot
Hi All, I have a huge data set of the following type:

city year sex obs
1 1990 M 25
1 1990 F 32
1 1991 M 15
1 1991 F 22
2 1990 M 42
2 1990 F 36
2 1991 M 12
2 1991 F 16

I want to calculate the percentage of M and F by city, by year, and by year within city, and also plot it. city 1: total Male = 40; total Female = 54; %M = 40/(40+54) = 42.6, %F = 54/(40+54) = 57.4, and so on. Can anybody help me out? Thanks in advance __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] summary and plot
## I would use the likert function in the HH package
## if necessary
## install.packages("HH")
## install.packages("reshape")
library(HH)
library(reshape)
pop <- read.table(header=TRUE, text="
city year sex obs
1 1990 M 25
1 1990 F 32
1 1991 M 15
1 1991 F 22
2 1990 M 42
2 1990 F 36
2 1991 M 12
2 1991 F 16
")
popwide <- cast(pop, city + year ~ sex, value="obs")
popwide
likert(city ~ F + M | year, data=popwide, as.percent=TRUE, main="F & M population by city within year")
likert(year ~ F + M | city, data=popwide, as.percent=TRUE, main="F & M population by year within city")
## Rich

On Thu, Oct 10, 2013 at 10:35 PM, Val valkr...@gmail.com wrote: Hi All, I have a huge data set of the following type: city year sex obs 1 1990 M 25 1 1990 F 32 1 1991 M 15 1 1991 F 22 2 1990 M 42 2 1990 F 36 2 1991 M 12 2 1991 F 16 I want to calculate the percentage of M and F by city, by year, and by year within city, and also plot it. city 1: total Male = 40; total Female = 54; %M = 40/(40+54) = 42.6, %F = 54/(40+54) = 57.4, and so on. Can anybody help me out? Thanks in advance __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] summary and plot
This is better for plotting the percents:

likert(city ~ F + M | year, data=popwide, as.percent="noRightAxis", main="F & M population by city within year")
likert(year ~ F + M | city, data=popwide, as.percent="noRightAxis", main="F & M population by year within city")

We can also plot the populations:

likert(city ~ F + M | year, data=popwide, main="F & M population by city within year")
likert(year ~ F + M | city, data=popwide, main="F & M population by year within city")

The row count totals in the percent plots in my previous email belong only to the first panel. This is a bug that I will need to fix. Rich On Thu, Oct 10, 2013 at 11:04 PM, Richard M. Heiberger r...@temple.edu wrote: ## I would use the likert function in the HH package ## if necessary ## install.packages("HH") ## install.packages("reshape") library(HH) library(reshape) pop <- read.table(header=TRUE, text=" city year sex obs 1 1990 M 25 1 1990 F 32 1 1991 M 15 1 1991 F 22 2 1990 M 42 2 1990 F 36 2 1991 M 12 2 1991 F 16 ") popwide <- cast(pop, city + year ~ sex, value="obs") popwide likert(city ~ F + M | year, data=popwide, as.percent=TRUE, main="F & M population by city within year") likert(year ~ F + M | city, data=popwide, as.percent=TRUE, main="F & M population by year within city") ## Rich On Thu, Oct 10, 2013 at 10:35 PM, Val valkr...@gmail.com wrote: Hi All, I have a huge data set of the following type: city year sex obs 1 1990 M 25 1 1990 F 32 1 1991 M 15 1 1991 F 22 2 1990 M 42 2 1990 F 36 2 1991 M 12 2 1991 F 16 I want to calculate the percentage of M and F by city, by year, and by year within city, and also plot it. city 1: total Male = 40; total Female = 54; %M = 40/(40+54) = 42.6, %F = 54/(40+54) = 57.4, and so on. Can anybody help me out? Thanks in advance
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] summary and plot
Hi, Maybe:

dat1 <- read.table(text="city year sex obs
1 1990 M 25
1 1990 F 32
1 1991 M 15
1 1991 F 22
2 1990 M 42
2 1990 F 36
2 1991 M 12
2 1991 F 16", sep="", header=TRUE, stringsAsFactors=FALSE)
library(plyr)
# by city
d1 <- ddply(dat1, .(city), summarize, Tot=sum(obs))
d2 <- ddply(dat1, .(city, sex), summarize, Tot=sum(obs))
res <- merge(d1, d2, by="city")
res$percent <- round((res$Tot.y/res$Tot.x)*100, 1)
library(plotrix)
plot(percent ~ city, data=res, type="p")
thigmophobe.labels(res$city, res$percent, labels=res$sex)
# by year
d1 <- ddply(dat1, .(year), summarize, Tot=sum(obs))
d2 <- ddply(dat1, .(year, sex), summarize, Tot=sum(obs))
res2 <- merge(d1, d2, by="year")
res2$percent <- round((res2$Tot.y/res2$Tot.x)*100, 1)
plot(percent ~ year, data=res2, type="p", xaxt="n")
axis(1, at=res2$year, labels=res2$year)
thigmophobe.labels(res2$year, res2$percent, labels=res2$sex)

A.K. On Thursday, October 10, 2013 10:37 PM, Val valkr...@gmail.com wrote: Hi All, I have a huge data set of the following type: city year sex obs 1 1990 M 25 1 1990 F 32 1 1991 M 15 1 1991 F 22 2 1990 M 42 2 1990 F 36 2 1991 M 12 2 1991 F 16 I want to calculate the percentage of M and F by city, by year, and by year within city, and also plot it. city 1: total Male = 40; total Female = 54; %M = 40/(40+54) = 42.6, %F = 54/(40+54) = 57.4, and so on. Can anybody help me out? Thanks in advance __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
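[Editor's note] A base-R route (no extra packages), for comparison with the plyr and HH answers above; it reproduces the 42.6% / 57.4% figures from the question using xtabs() and prop.table().

```r
# Rebuild the example data from the question.
pop <- data.frame(city = rep(1:2, each = 4),
                  year = rep(c(1990, 1990, 1991, 1991), 2),
                  sex  = rep(c("M", "F"), 4),
                  obs  = c(25, 32, 15, 22, 42, 36, 12, 16))

tab <- xtabs(obs ~ city + sex, data = pop)           # totals by city and sex
pct <- round(prop.table(tab, margin = 1) * 100, 1)   # row (city) percentages
pct
# barplot(t(pct), beside = TRUE, legend.text = TRUE)  # quick plot
```

The same pattern with `obs ~ year + sex` or `obs ~ city + year + sex` gives the by-year and year-within-city breakdowns.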
Re: [R] Permutation tests in {coin}
Lars Bishop skrev 2013-10-05 22:17: Hello, I'm trying to get familiar with the coin package for doing permutation tests. I'm not sure I understand the documentation regarding the difference between distribution = asymptotic and approximate in the function independence_test. The use of asymptotic or approximate leads to two different approximations of the exact conditional distribution under the null hypothesis. The basis of 'coin' is a multivariate linear test statistic, and it can be shown (Strasser and Weber, 1999) that its conditional distribution is asymptotically normally distributed under the null hypothesis (asymptotic). Alternatively, another approximative conditional null distribution can be obtained using conditional Monte Carlo procedures (approximate). Are permutations of the test statistic actually computed in the asymptotic case, or only when the distribution is specified as approximate? In the asymptotic case, the univariate normal distribution is used when the test statistic is a scalar and the chi-squared distribution is used when the test statistics is a quadratic form. For a multivariate maximum-type test, permutations may be used in the asymptotic case but only in the sense that probabilities from the corresponding multivariate normal distribution are obtained by numerical integration using Monte Carlo procedures. In the latter case, the 'mvtnorm' package is used and further details can be found in its documentation. In the approximate case, permutations are always used irrespective of the type of test statistic. When should I use each option? You should only use asymptotic in situations where you trust the asymptotic approximation. ;-) In all other cases, use approximate or, if possible, exact. HTH, Henric Thanks Lars. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
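[Editor's note] To make the "approximate" (conditional Monte Carlo) idea concrete without depending on coin's API, here is a generic base-R sketch of a Monte Carlo permutation test for a two-sample difference in means. The data values are invented for illustration.

```r
# Monte Carlo permutation test for a difference in means (base R only).
set.seed(1)
x <- c(8.8, 9.5, 10.2, 11.1, 9.9)      # invented sample 1
y <- c(10.8, 11.5, 12.0, 13.2, 11.9)   # invented sample 2

obs    <- mean(x) - mean(y)            # observed test statistic
pooled <- c(x, y)
n      <- length(x)

# Conditional Monte Carlo: permute the group labels, recompute the statistic.
stat <- replicate(9999, {
  idx <- sample(length(pooled), n)
  mean(pooled[idx]) - mean(pooled[-idx])
})

# Two-sided p-value with the usual +1 correction (observed split counted in).
p_val <- (1 + sum(abs(stat) >= abs(obs))) / (1 + length(stat))
p_val
```

This is the same resampling scheme that `distribution = approximate(...)` uses inside coin, just without the linear-statistic machinery; with only 10 observations the exact permutation distribution (all C(10,5) = 252 splits) would also be feasible.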
[R] Assessing compiled code in base packages
I am updating a package that worked previously under R 2.15.x on a Windows OS. When I build and run the package with R 3.0.2 I receive an error message like the following:

C_call_dqags not available for .External() for package stats

How can compiled C (or Fortran) routines from package stats be used in functions in a new package with R 3.0.2? Help will be highly appreciated. Niel

E-mail disclaimer: This e-mail may contain confidential information and may be legally privileged and is intended only for the person to whom it is addressed. If you are not the intended recipient, you are notified that you may not use, distribute or copy this document in any manner whatsoever. Kindly also notify the sender immediately by telephone, and delete the e-mail. The University does not accept liability for any damage, loss or expense arising from this e-mail and/or accessing any files attached to this e-mail. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
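[Editor's note] On the underlying issue: as of the R 3.x series, compiled entry points in base packages such as stats are no longer part of the API, so a package can no longer reach routines like dqags through .External(). The supported route is to call the R-level wrapper; for the QUADPACK integrators (dqags/dqagi), which ?integrate documents as its backend, that wrapper is integrate(). A minimal check:

```r
# integrate() is the R-level interface to the QUADPACK routines
# (dqags/dqagi) that used to be reachable as C entry points in 'stats'.
res <- integrate(dnorm, -Inf, Inf)
res$value   # should be very close to 1
```

If a package genuinely needs the compiled routine rather than the R wrapper, the alternative is to bundle a copy of the (free) Fortran/C source in the package's own src/ directory and register it there.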
Re: [R] Help with expression()
Thanks very much! bquote() did the trick!

On Thu, Oct 10, 2013 at 6:19 PM, William Dunlap wdun...@tibco.com wrote:

"changetext = expression(paste("Change from ", mini, " to ", maxi, ": ", diff, " ", degree, "C", collapse = "")) does not evaluate my user-defined variables (mini, maxi, and diff), just printing them out as words"

bquote() can do it: put the variables which should be evaluated in .(). E.g.,

mini <- 13
maxi <- 97
diff <- maxi - mini
plot(0, 0, main = bquote("Change from" ~ .(mini) ~ "to" ~ .(maxi) * ":" ~ .(diff) ~ degree ~ C))

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sheri O'Connor Sent: Thursday, October 10, 2013 1:08 PM To: R-help@r-project.org Subject: [R] Help with expression()

Hi everyone, I am hoping someone can help with my attempted use of the expression() function. I have a long series of text and variables to paste together, including a degree symbol. The text is to be placed on my scatter plot using the mtext() function.

Using expression like this:

changetext = expression(paste("Change from ", mini, " to ", maxi, ": ", diff, " ", degree, "C", collapse = ""))

does not evaluate my user-defined variables (mini, maxi, and diff), just printing them out as words.

Using expression like this:

changetext = paste("Change from ", mini, " to ", maxi, ": ", diff, expression(degree, "C"), collapse = "")

prints the text twice and does not evaluate the degree symbol.

I have tried to place the expression alone in a variable and then run the paste:

degsym = expression(degree, "C")
changetext = paste("Change from ", mini, " to ", maxi, ": ", diff, degsym, collapse = "")

giving me the same result as the second option.

Is there any way I can use the expression function as in the first example but still have R evaluate my user defined variables? Thanks!
Sheri --- Sheri O'Connor M.Sc Candidate Department of Biology Lakehead University - Thunder Bay/Orillia 500 University Avenue Orillia, ON L3V 0B9 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
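[Editor's note, a minimal sketch summarizing the thread, not part of the original messages; variable names follow the example above. expression() quotes all of its arguments unevaluated, while bquote() substitutes the current value of anything wrapped in .() at call time:]

```r
mini <- 13
maxi <- 97
diff <- maxi - mini

## expression() keeps mini, maxi, diff as unevaluated symbols,
## so the plot annotation shows the names, not the values:
e1 <- expression(paste("Change from ", mini, " to ", maxi))

## bquote() substitutes the current values wherever .() appears:
e2 <- bquote("Change from" ~ .(mini) ~ "to" ~ .(maxi) * ":" ~ .(diff) ~ degree * C)

plot(0, 0)
mtext(e2)   # renders the substituted values (13, 97, 84) with a degree symbol
```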
Re: [R] iconv question: SQL Server 2005 to R
Thanks for the suggestion. From R version 3.0.2, I tried:

testDF7 <- iconv(x = test07, from = "UCS-2", to = "")
Encoding(testDF7)
[1] "unknown"
testDF7[1:6]
[1] NA NA NA NA NA NA

So using "UCS-2" produced the same results as before. I do not think there are any NA values. I cleaned up the csv file from within Excel and then read it into R:

sum(is.na(workingDF))
[1] 0

Also the Excel COUNTBLANK function gave me zero.

On 10/9/2013 11:33 PM, Prof Brian Ripley wrote: On 09/10/2013 10:37, Milan Bouchet-Valat wrote: Le mardi 08 octobre 2013 à 16:02 -0700, Ira Sharenow wrote: A colleague is sending me quite a few files that have been saved with MS SQL Server 2005. I am using R 2.15.1 on Windows 7. I am trying to read in the files using standard techniques. Although the file has a csv extension, when I go to Excel or WordPad and do Save As, I see that it is Unicode Text. Notepad indicates that the encoding is Unicode. Right now I have to do a few things from within Excel (such as Text to Columns) and eventually save as a true csv file before I can read it into R and then use it. Is there an easy way to solve this from within R? I am also open to easy SQL Server 2005 solutions. I tried the following from within R.
testDF = read.table("Info06.csv", header = TRUE, sep = ",")
testDF2 = iconv(x = testDF, from = "Unicode", to = "")
Error in iconv(x = testDF, from = "Unicode", to = "") :
  unsupported conversion from 'Unicode' to '' in codepage 1252
# The next line did not produce an error message
testDF3 = iconv(x = testDF, from = "UTF-8", to = "")
testDF3[1:6, 1:3]
Error in testDF3[1:6, 1:3] : incorrect number of dimensions
# The next line did not produce an error message
testDF4 = iconv(x = testDF, from = "macroman", to = "")
testDF4[1:6, 1:3]
Error in testDF4[1:6, 1:3] : incorrect number of dimensions
Encoding(testDF3)
[1] "unknown"
Encoding(testDF4)
[1] "unknown"

These are the first few lines from WordPad:

Date,StockID,Price,MktCap,ADV,SectorID,Days,A1,std1,std2
2006-01-03 00:00:00.000,@Stock1,2.53,467108197.38,567381.1,4,133.14486997089,-0.0162107939626307,0.0346283580367959,0.0126471695454834
2006-01-03 00:00:00.000,@Stock2,1.3275,829803070.531114,6134778.93292,5,124.632223896458,0.071513138376339,0.0410694546850102,0.0172091268025929

What's the actual problem? You did not state any. Do you get accented characters that are not printed correctly after importing the file? In the two lines above it does not look like there would be any non-ASCII characters in this file, so encoding would not matter.

It is most likely UCS-2. That has embedded NULs, so the encoding does matter. All 8-bit encodings extend ASCII: others do not, in general.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
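[Editor's note, a sketch not taken from the thread: iconv() converts character vectors, not data frames, which is why the attempts above returned NAs and lost their dimensions. For a file saved as "Unicode Text" (i.e. UTF-16/UCS-2, typically little-endian on Windows), the conversion is better handled at read time via read.table()'s fileEncoding argument. The file name is the one from the thread; the separator is an assumption, since Excel's "Unicode Text" format is tab-delimited:]

```r
## Read a "Unicode Text" file directly; adjust sep = "," if the file
## really is comma-separated:
testDF <- read.table("Info06.csv", header = TRUE, sep = "\t",
                     fileEncoding = "UTF-16LE")

## If re-encoding is ever needed after import, apply iconv() per column
## rather than to the data frame as a whole:
testDF[] <- lapply(testDF, function(col)
  if (is.character(col)) iconv(col, from = "UTF-8", to = "") else col)
```

If "UTF-16LE" is not accepted on a given platform, "UTF-16" or "UCS-2LE" are worth trying; the set of supported encoding names depends on the iconv implementation R was built against.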