Re: [R] About populating a dataframe in a loop

2017-01-06 Thread lily li
Thanks, Richard. But if the data cannot fill the constructed data frame, will there be NA values? On Fri, Jan 6, 2017 at 10:07 PM, Richard M. Heiberger wrote: > Incrementally increasing the size of an array is not efficient in R. > The recommended technique is to allocate as

Re: [R] About populating a dataframe in a loop

2017-01-06 Thread jeremiah rounds
As a rule, never rbind in a loop. It has O(n^2) run time because the rbind itself can be O(n) (where n is the number of data.frames). Instead either put them all into a list with lapply or vector("list", length=) and then data.table::rbindlist, do.call(rbind, thelist) or use the equivalent from
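The list-then-bind advice above can be sketched as follows. This is a minimal illustration, not the poster's own code: `make_chunk` is a hypothetical helper standing in for whatever builds one data frame per iteration.

```r
# Hypothetical helper: builds the data frame produced by one loop iteration.
make_chunk <- function(i) {
  data.frame(id = i, x = rnorm(5), A = sample(LETTERS, 5, TRUE))
}

set.seed(1)

# Collect every chunk in a list first; lapply allocates the list once.
chunks <- lapply(1:50, make_chunk)

# Bind all chunks in a single call instead of growing a data frame step by step.
pre.mat <- do.call(rbind, chunks)
dim(pre.mat)
```

If the data.table package is installed, `data.table::rbindlist(chunks)` can replace the `do.call(rbind, chunks)` step.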

Re: [R] About populating a dataframe in a loop

2017-01-06 Thread Richard M. Heiberger
Incrementally increasing the size of an array is not efficient in R. The recommended technique is to allocate as much space as you will need, and then fill it. > system.time({tmp <- 1:5 ; for (i in 1:1000) tmp <- rbind(tmp, 1:5)}) user system elapsed 0.011 0.000 0.011 > dim(tmp) [1]
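A rough sketch of the allocate-then-fill technique described above, applied to a data frame. The sizes (`n_iter`, `rows_per_iter`) and column names are assumptions made for illustration. The loop deliberately fills fewer blocks than were allocated, to show that unfilled rows simply remain NA and can be dropped afterwards.

```r
n_iter <- 10         # number of blocks allocated (assumed known in advance)
rows_per_iter <- 5   # rows produced per iteration (assumed constant)

# Allocate the full result once, filled with NA.
out <- data.frame(x = rep(NA_real_, n_iter * rows_per_iter),
                  A = rep(NA_character_, n_iter * rows_per_iter),
                  stringsAsFactors = FALSE)

set.seed(6574)
for (i in 1:8) {  # only 8 of the 10 allocated blocks are filled
  idx <- ((i - 1) * rows_per_iter + 1):(i * rows_per_iter)
  out$x[idx] <- rnorm(rows_per_iter)
  out$A[idx] <- sample(LETTERS, rows_per_iter, TRUE)
}

n_unfilled <- sum(is.na(out$x))   # rows that were never filled stay NA
out <- out[!is.na(out$x), ]       # drop the unused rows afterwards
```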

Re: [R] About populating a dataframe in a loop

2017-01-06 Thread lily li
Hi Rui, Thanks for your reply. Yes, when I tried to rbind two dataframes, it worked. However, with more than 50, it got stuck for hours. When I terminated the process and opened the csv file separately, it had only one data frame. What is the problem? Thanks. On Fri, Jan 6, 2017 at

[R-es] Hello community, a question about the data.table package

2017-01-06 Thread patricio fuenmayor
Hello. Here is one way: require(data.table) dt <- data.table(v1=letters[1:30],v2=round(runif(30,max=20)),v3=rep(c("x","y","z"),10)) dt[unlist(dt[,.I[which.max(v2)],by=v3,drop=TRUE][,2])] Regards.

Re: [R] Problem with IRkernel Installation Solved - Instructions on how to Solve it

2017-01-06 Thread Ista Zahn
On Jan 6, 2017 2:11 PM, "Paul Bernal" wrote: Ista, If you do not appreciate it or do not find it useful, just discard the message. It's not about me. My concern is for the people you potentially send on a wild goose chase when all they really need to do is follow the

Re: [R] Problem with IRkernel Installation Solved - Instructions on how to Solve it

2017-01-06 Thread Paul Bernal
Ista, If you do not appreciate it or do not find it useful, just discard the message. I tried several things and this is what worked for me. If you have another solution or a better solution let me know. Regards, Paul 2017-01-06 13:00 GMT-05:00 Ista Zahn : > On Fri, Jan 6,

Re: [R-es] Hello community, a question about the data.table package

2017-01-06 Thread eric
Many thanks Carlos, Carlos and Javier, we will try the options. Regards, Eric. On 01/06/2017 12:38 PM, Carlos J. Gil Bellosta wrote: What you want is a sort and then a tail. Building on Carlos Ortega's example, library(data.table) set.seed(22) tmp <- data.table(x =

Re: [R] About populating a dataframe in a loop

2017-01-06 Thread Rui Barradas
Hello, It works for me: set.seed(6574) pre.mat = data.frame() for(i in 1:10){ mat.temp = data.frame(x = rnorm(5), A = sample(LETTERS, 5, TRUE)) pre.mat = rbind(pre.mat, mat.temp) } nrow(pre.mat) # should be 50 Can you give us an example that doesn't work? Rui Barradas On 06-01-2017

Re: [R] Problem with IRkernel Installation Solved - Instructions on how to Solve it

2017-01-06 Thread Ista Zahn
On Fri, Jan 6, 2017 at 8:43 AM, Paul Bernal wrote: > Dear friends, > > Great news! I was able to install the IRkernel successfully and I am now > able to create R notebooks in Jupyter. Congratulations. Just in case anybody out there is > struggling with this too, here

[R] About populating a dataframe in a loop

2017-01-06 Thread lily li
Hi R users, I have a question about filling a dataframe in R using a for loop. I created an empty dataframe first and then filled it, using the code: pre.mat = data.frame() for(i in 1:10){ mat.temp = data.frame(some values filled in) pre.mat = rbind(pre.mat, mat.temp) } However, the

Re: [R] extract minimal variables from model

2017-01-06 Thread Marc Schwartz
> On Jan 6, 2017, at 11:03 AM, Jacob Wegelin wrote: > > Given any regression model, created for instance by lm, lme, lmer, or rqs, > such as > > z1<-lm(weight~poly(Time,2), data=ChickWeight) > > I would like a general way to obtain only those variables used for the

[R] extract minimal variables from model

2017-01-06 Thread Jacob Wegelin
Given any regression model, created for instance by lm, lme, lmer, or rqs, such as z1<-lm(weight~poly(Time,2), data=ChickWeight) I would like a general way to obtain only those variables used for the model. In the current example, this "minimal data frame" would consist of the "weight" and
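One common way to approach the question above for an `lm` fit is sketched below, using `all.vars()` on the model formula. This is an illustration only, not necessarily the answer given in the thread, and whether `formula()` behaves the same way for lme, lmer or rqs fits is an assumption that would need checking per model class.

```r
z1 <- lm(weight ~ poly(Time, 2), data = ChickWeight)

# all.vars() strips function calls such as poly() and returns bare variable names.
used <- all.vars(formula(z1))
used

# Subset the original data to just those columns.
# as.data.frame() drops the extra groupedData classes that ChickWeight carries.
minimal <- as.data.frame(ChickWeight)[, used]
head(minimal)
```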

Re: [R] purrr::map and xml2:: read_xml

2017-01-06 Thread Ulrik Stervbo
Hi Maicel, I'm guessing that B works on 50 files, and that A fails because there is no function called 'read_xmlmap'. If the function that you map works well, removing 'dplyr::sample_n(50)' from 'B' should solve the problem. If that is not the case, we need a bit more information. HTH Ulrik On

[R] purrr::map and xml2:: read_xml

2017-01-06 Thread maicel
Hi List, I am trying to extract the keywords from 1403 papers in XML format. I wrote the code below, but it does not work; it only works with the modification shown below. That variation is not the one I need, however, because the 1403 XML files do not match those in my folder. Could you

Re: [R-es] Hello community, a question about the data.table package

2017-01-06 Thread Carlos J. Gil Bellosta
What you want is a sort and then a tail. Building on Carlos Ortega's example, library(data.table) set.seed(22) tmp <- data.table(x = rnorm(100), y = rnorm(100), z = sample(1:5, 100, replace = TRUE)) setkeyv(tmp, c("z", "y")) tmp[, tail(.SD, 1), by=z] This way you can get the N largest,
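The sort-then-tail idea above extends from the single largest row to the N largest rows per group. A minimal sketch, assuming the data.table package is installed; `N` and `top_n` are names introduced here for illustration.

```r
library(data.table)

set.seed(22)
tmp <- data.table(x = rnorm(100), y = rnorm(100),
                  z = sample(1:5, 100, replace = TRUE))

# Sort by group z, then by y within each group (ascending).
setkeyv(tmp, c("z", "y"))

# After sorting, the last N rows of each group hold the N largest y values.
N <- 3
top_n <- tmp[, tail(.SD, N), by = z]
top_n
```

Setting `N <- 1` reproduces the single-row result from the post.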

Re: [R] testing whether clusters in a PCA plot are significantly different from one another

2017-01-06 Thread Marchesi, Julian
many thanks david for such a swift response, really appreciate your help cheers Julian Julian R. Marchesi Deputy Director and Professor of Clinical Microbiome Research at the Centre for Digestive and Gut Health, Imperial College London, London W2 1NY Tel: +44 (0)20 331 26197 and Professor

Re: [R] testing whether clusters in a PCA plot are significantly different from one another

2017-01-06 Thread David L Carlson
In that case you should be able to use manova where pc1 and pc2 are the dependent (response) variables and group (Baseline, HFD+P, HFD) is the independent (explanatory) variable. Something like lm(cbind(pc1, pc2)~group). That will give you slopes for HFD+P and HFD (difference in mean relative
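The model suggested above can be sketched on simulated data. The data frame `d` below is a made-up stand-in for the real PCA scores (10 samples per group is an assumption), so the output illustrates only the mechanics and says nothing about the actual clusters.

```r
set.seed(42)

# Simulated stand-in for the PCA scores; Baseline is the reference level.
d <- data.frame(
  group = factor(rep(c("Baseline", "HFD+P", "HFD"), each = 10),
                 levels = c("Baseline", "HFD+P", "HFD")),
  pc1 = rnorm(30),
  pc2 = rnorm(30)
)

# Multivariate linear model: one column of coefficients per response.
fit_lm <- lm(cbind(pc1, pc2) ~ group, data = d)
coef(fit_lm)

# The same model as a MANOVA, with an overall test of the group effect.
fit_manova <- manova(cbind(pc1, pc2) ~ group, data = d)
summary(fit_manova, test = "Pillai")
```

With treatment contrasts, the non-intercept rows of `coef(fit_lm)` are the differences in mean pc1 and pc2 between each of HFD+P and HFD and the Baseline group.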

Re: [R] Tobit Regression with unbalanced Panel Data

2017-01-06 Thread peter dalgaard
On 06 Jan 2017, at 15:08 , Vanessa Romero wrote: > BHHH maximisation, 150 iterations > Return code 4: Iteration limit exceeded. > Log-likelihood: -66915.77 on 10 Df > > How can I calculate McFadden's adjusted R2 in R? Google gets you there soon enough (e.g., "mcfadden r2
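For reference, McFadden's pseudo R-squared is usually computed from the log-likelihoods of the fitted model and an intercept-only model. The sketch below is an assumption-laden illustration on a logistic regression, not on the tobit model from the thread: `mcfadden_r2` is a hypothetical helper, and the adjusted version here penalises by the number of estimated parameters, which is one of several conventions in use.

```r
# McFadden's pseudo R-squared: 1 - ll_full / ll_null.
# With k > 0 the adjusted version 1 - (ll_full - k) / ll_null is returned.
mcfadden_r2 <- function(ll_full, ll_null, k = 0) {
  1 - (ll_full - k) / ll_null
}

# Illustration on a logistic regression using a built-in data set.
full <- glm(am ~ wt, data = mtcars, family = binomial)
null <- glm(am ~ 1,  data = mtcars, family = binomial)

ll_full <- as.numeric(logLik(full))
ll_null <- as.numeric(logLik(null))
k <- attr(logLik(full), "df")   # number of estimated parameters

r2     <- mcfadden_r2(ll_full, ll_null)
r2_adj <- mcfadden_r2(ll_full, ll_null, k = k)
c(r2 = r2, r2_adj = r2_adj)
```

For the model in the thread the same formula would need the log-likelihood of an intercept-only fit, which the quoted output does not show; the quoted fit also stopped at the iteration limit, so its log-likelihood may not be the maximised value.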

Re: [R] Tobit Regression with unbalanced Panel Data

2017-01-06 Thread Vanessa Romero
Thank you for your answers. I have just replaced pdata.frame with plm.data and it worked. tobit1<- plm.data(T1, index = c("firm", "year")) But I have two more questions, maybe someone could help: summary(Tob) Call: censReg(formula = Imp ~ Bath + CEOTurnover + ChangeOCF + E + Sales + ROE +

Re: [R-es] Hello community, a question about the data.table package

2017-01-06 Thread Carlos Ortega
Hello, One way to do it is this: #-- library(data.table) set.seed(22) DT <- data.table( x = rnorm(100), y = rnorm(100), z = sample(1:5, 100, replace = TRUE)) DT[, Max := max(y), by=z][y == Max] #-- Which produces this result: > DT[, Max := max(y), by=z][y

[R] Problem with IRkernel Installation Solved - Instructions on how to Solve it

2017-01-06 Thread Paul Bernal
Dear friends, Great news! I was able to install the IRkernel successfully and I am now able to create R notebooks in Jupyter. Just in case anybody out there is struggling with this too, here is what I did (I have Windows 8, but it will probably work for Mac OS X as well): 1-Go to the page

Re: [R] IRkernel Installation Issues

2017-01-06 Thread Paul Bernal
Dear friends, Great news! I was able to install the IRkernel successfully and I am now able to create R notebooks in Jupyter. Just in case anybody out there is struggling with this too, here is what I did (I have Windows 8, but it will probably work for Mac OS X as well): 1-Go to the page

[R] testing whether clusters in a PCA plot are significantly different from one another

2017-01-06 Thread Marchesi, Julian
Rplot_PCA.pdf Description: Rplot_PCA.pdf

Re: [R-es] Hello community, a question about the data.table package

2017-01-06 Thread javier.ruben.marcuzzi
Dear Eric, I think it is simpler if you think about it in another, equivalent way: I would aim to get the rows (and then look up the first column). I would ask: group by column 3, and within those groups, where is the maximum value of column 2. That way, when you have 40 columns instead of 3, you do not
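The row-oriented approach described above can be sketched in two steps with data.table (assumed installed). The extra columns `v4` and `v5` are hypothetical, added only to show that whole rows come back regardless of how many columns the table has.

```r
library(data.table)

set.seed(1)
# Hypothetical table with extra columns beyond the three in the question.
dt <- data.table(v1 = letters[1:30],
                 v2 = round(runif(30, max = 20)),
                 v3 = rep(c("x", "y", "z"), 10),
                 v4 = rnorm(30),
                 v5 = rnorm(30))

# Step 1: for each group of v3, find the row number holding the maximum v2.
rows <- dt[, .I[which.max(v2)], by = v3]$V1

# Step 2: pull those complete rows; every column comes along.
best <- dt[rows]
best
```

Note that `which.max` returns only the first maximum, so ties within a group yield a single row.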

Re: [R] Dates and Times in R

2017-01-06 Thread Ulrik Stervbo
The lubridate package might be helpful. HTH Ulrik On Fri, 6 Jan 2017 at 08:28 PIKAL Petr wrote: > Hi > It strongly reminds me of the following fortune > > library(fortunes) > fortune("surgery") > > Along with Posting guide you should also look at chapter 7 of R intro > manual.