Re: [R] read.table and NaN
Like this?

    > con <- textConnection(object = 'A,B\n1,NaN\nNA,2')
    > tmp <- read.table(con, header = TRUE, sep = ',', na.strings = '', stringsAsFactors = FALSE,
    +                   colClasses = c("numeric", "character"))
    > close.connection(con)
    > tmp
       A   B
    1  1 NaN
    2 NA   2
    > class(tmp[,1])
    [1] "numeric"
    > class(tmp[,2])
    [1] "character"
    > tmp[,2]
    [1] "NaN" "2"

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Wed, Oct 23, 2019 at 6:31 PM Sebastien Bihorel via R-help <r-help@r-project.org> wrote:

> Hi,
>
> Is there a way to make read.table consider NaN as a string of characters
> rather than the internal NaN? Changing the na.strings argument does not
> seem to have any effect on how R interprets the NaN string (while it does
> for the NA string).
>
> con <- textConnection(object = 'A,B\n1,NaN\nNA,2')
> tmp <- read.table(con, header = TRUE, sep = ',', na.strings = '',
>                   stringsAsFactors = FALSE)
> close.connection(con)
> tmp
> class(tmp[,1])
> class(tmp[,2])

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
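A possible variation on the reply above, offered only as a sketch: read every column as character with colClasses = "character" and then convert just the columns that should be numeric. The names csv and raw are made up for this illustration, and the text= argument is used instead of a textConnection purely for brevity.

```r
# Same two-row CSV as in the thread, supplied through the text= argument
csv <- "A,B\n1,NaN\nNA,2"

# colClasses = "character" is recycled over all columns, so no automatic
# type conversion happens and "NaN" stays the literal string "NaN"
raw <- read.table(text = csv, header = TRUE, sep = ",",
                  na.strings = "", colClasses = "character")

# Convert only the column that should be numeric
raw$A <- as.numeric(raw$A)

str(raw)
```

This keeps the decision about which columns become numeric in one explicit place, which may be easier to maintain when the file has many columns.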
Re: [R-es] Calculating the percentage of subcategories for different individuals
Dear Lorena Saavedra Aracena,

I don't fully understand your question. In R the answer often depends on the data and on how they are accessed. For example, if you have a data.frame called datos, summary(datos) may be enough; if not, apply some function to the data and then call summary() on the result.

I have been using R for many years, and the appropriate way of doing things keeps changing over time, simply because new libraries appear that make the work easier. Don't worry about that, though; think about what you consider appropriate. It is better to know the basics well, and then the basics again, because the basics can be combined as functions inside another library; going the other way round would only confuse you.

Besides summary there are table and ftable. These give the result as frequencies, which is not exactly a percentage, but it is almost synonymous.

I hope my explanation is understandable; I am not sure I have answered your question.

Javier Marcuzzi

On Wed, Oct 23, 2019 at 23:07, Lorena Saavedra Aracena (<l.saavedra.arac...@gmail.com>) wrote:

> Good evening,
>
> I am new to R and sometimes find it hard to think of the most practical
> way to do a calculation, so I would appreciate your help.
>
> I have a data matrix with dim = 35745 19, corresponding to the locations
> of 39 dogs; each dog has a little more or a little less than 1000 records.
>
> I need to know the % of use of natural habitat. It is a simple percentage
> calculation, but I would like to automate it so that I don't spend so much
> time running it by hand.
>
> I have one column with the ID of each dog and another with the environment
> categories (urbano, rural and mar).
>
> I calculated the percentage with this script for dog 1:
>
> ## find the total number of natural-zone records per dog
>
> P01 <- subset(TODOS, TODOS$ID == "P01")
> ruralP01 <- subset(P01, P01$Zone == "rural")
> marP01 <- subset(P01, P01$Zone == "mar")
>
> nrow(P01)
> nrow(ruralP01)
> nrow(marP01)
>
> porcent_natP01 <- (nrow(ruralP01) + nrow(marP01)) * 100 / nrow(P01)
> porcent_natP01
>
> and I get 61.35%. It is the most basic way I could think of; could you
> help me make it more automated? I have tried a couple of approaches, but I
> find it a bit hard to see how, given that I have dog categories first and
> then environment categories.
>
> Thank you,
>
> Regards

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es
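The table suggestion in the reply above could be turned into percentages roughly as follows. This is a sketch on a made-up miniature of the TODOS data (the eight rows and the names counts, percentages and pct_natural are assumptions for illustration); only the column names ID and Zone and the zone labels come from the original message.

```r
# Hypothetical miniature of TODOS: one row per recorded location
TODOS <- data.frame(
  ID   = c("P01", "P01", "P01", "P01", "P02", "P02", "P02", "P02"),
  Zone = c("rural", "mar", "urbano", "rural", "urbano", "urbano", "mar", "rural"),
  stringsAsFactors = FALSE
)

# Counts of locations per dog (rows) and zone (columns)
counts <- table(TODOS$ID, TODOS$Zone)

# Row-wise percentages: each dog's row sums to 100
percentages <- prop.table(counts, margin = 1) * 100
percentages

# Natural habitat is taken here to mean rural + mar, as in the original script
pct_natural <- percentages[, "rural"] + percentages[, "mar"]
pct_natural
```

The full percentages table is kept as an intermediate result because it also shows the urbano share for each dog, which the single-number version hides.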
[R-es] Calculating the percentage of subcategories for different individuals
Good evening,

I am new to R and sometimes find it hard to think of the most practical way to do a calculation, so I would appreciate your help.

I have a data matrix with dim = 35745 19, corresponding to the locations of 39 dogs; each dog has a little more or a little less than 1000 records.

I need to know the % of use of natural habitat. It is a simple percentage calculation, but I would like to automate it so that I don't spend so much time running it by hand.

I have one column with the ID of each dog and another with the environment categories (urbano, rural and mar).

I calculated the percentage with this script for dog 1:

## find the total number of natural-zone records per dog

P01 <- subset(TODOS, TODOS$ID == "P01")
ruralP01 <- subset(P01, P01$Zone == "rural")
marP01 <- subset(P01, P01$Zone == "mar")

nrow(P01)
nrow(ruralP01)
nrow(marP01)

porcent_natP01 <- (nrow(ruralP01) + nrow(marP01)) * 100 / nrow(P01)
porcent_natP01

and I get 61.35%. It is the most basic way I could think of; could you help me make it more automated? I have tried a couple of approaches, but I find it a bit hard to see how, given that I have dog categories first and then environment categories.

Thank you,

Regards

--
Lorena Saavedra A.
Ing. Recursos Naturales Renovables
+56 9 9880 2972
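One way the per-dog script above might be generalised to all dogs at once is to flag each location as natural or not and average that flag within each ID. This is a minimal sketch, assuming a tiny made-up TODOS (the rows and the names is_natural and porcent_nat are hypothetical); the mean of a logical vector is the proportion of TRUE values, so no subsetting per dog is needed.

```r
# Hypothetical miniature of TODOS with the same column names as the post
TODOS <- data.frame(
  ID   = c("P01", "P01", "P01", "P01", "P02", "P02", "P02", "P02"),
  Zone = c("rural", "mar", "urbano", "rural", "urbano", "urbano", "mar", "rural"),
  stringsAsFactors = FALSE
)

# TRUE where the location falls in a natural zone (rural or mar)
is_natural <- TODOS$Zone %in% c("rural", "mar")

# Proportion of natural locations per dog, expressed as a percentage
porcent_nat <- tapply(is_natural, TODOS$ID, mean) * 100
porcent_nat
```

With the real data the result would have one entry per dog, named by ID, so the value for dog 1 could be read as porcent_nat["P01"].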
[R] read.table and NaN
Hi,

Is there a way to make read.table consider NaN as a string of characters rather than the internal NaN? Changing the na.strings argument does not seem to have any effect on how R interprets the NaN string (while it does for the NA string).

con <- textConnection(object = 'A,B\n1,NaN\nNA,2')
tmp <- read.table(con, header = TRUE, sep = ',', na.strings = '', stringsAsFactors = FALSE)
close.connection(con)
tmp
class(tmp[,1])
class(tmp[,2])
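For context, a small sketch of the behaviour described in the question (an illustration, not part of the original post): when no colClasses is supplied, read.table converts each column with type.convert(), and that function accepts "NaN" as a valid number rather than treating it as a missing-value marker. The vector b below simply mimics column B of the example.

```r
# Column B of the example as read.table sees it before conversion
b <- c("NaN", "2")

# na.strings = "" mirrors the call in the question; "NaN" is still parsed
# as the numeric NaN because it is a valid number, not an NA string
converted <- type.convert(b, na.strings = "", as.is = TRUE)

class(converted)
is.nan(converted)
```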
Re: [R] negative vector length when merging data frames
Ana... contributed packages like data.table and dplyr are developed completely independently from R, have their own versions, and in fact both of them have recommendations as to how to report bugs in their package descriptions.

As for getting help here, you really need to supply ALL of the information requested to make forward progress in clarifying next steps... there were several items that Duncan mentioned that you failed to provide.

Also, note that dplyr and data.table take very different approaches to handling data, and have been known to not play well with each other. At the very least I would suggest using as.data.frame to convert to a standardized data representation before switching from using functions in one of these packages to using functions in the other package.

[1] https://cran.r-project.org/web/packages/data.table/index.html
[2] https://cran.r-project.org/web/packages/dplyr/index.html

On October 23, 2019 5:05:44 PM PDT, Ana Marija wrote:
>I am using R-3.6.1
>and these libraries:
>library(data.table)
>library(dplyr)
>
>On Wed, Oct 23, 2019 at 6:54 PM Duncan Murdoch wrote:
>>
>> On 23/10/2019 7:04 p.m., Ana Marija wrote:
>> > I also tried left_join but I got: Error: std::bad_alloc
>> >
>> >> df3 <- left_join(l4, asign, by = c("chr","pos"))
>> > Error: std::bad_alloc
>>
>> Looks like bugs in whatever package you're finding "left_join" in (and
>> previously "merge"). Are those from dplyr and base? Showing us
>> str(l4), str(asign), and sessionInfo() would be helpful.
>>
>> Duncan Murdoch

--
Sent from my phone. Please excuse my brevity.
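The as.data.frame suggestion above might look like this in practice. This is only a sketch on toy stand-ins for l4 and asign (the rows and the names l4_df and asign_df are made up for illustration); it has not been tried on the real 166-million-row data, and it does not address memory limits.

```r
# Toy stand-ins for l4 and asign (hypothetical rows, same key columns)
l4 <- data.frame(X1 = c("chr1", "chr1"),
                 X2 = c("13550", "14671"),
                 pval_nominal = c(0.375614, 0.474708),
                 stringsAsFactors = FALSE)
asign <- data.frame(chr = c("chr1", "chr1"),
                    pos = c("13550", "10177"),
                    p.val.Retina = c("0.381708", "0.959523"),
                    stringsAsFactors = FALSE)

# as.data.frame() returns a plain data.frame; on a data.table or tibble it
# would drop the package-specific classes before the base merge is called
l4_df <- as.data.frame(l4)
asign_df <- as.data.frame(asign)

# Base R merge on (chromosome, position)
m <- merge(l4_df, asign_df, by.x = c("X1", "X2"), by.y = c("chr", "pos"))
m
```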
Re: [R] negative vector length when merging data frames
On 23/10/2019 7:04 p.m., Ana Marija wrote:
> I also tried left_join but I got: Error: std::bad_alloc
>
>> df3 <- left_join(l4, asign, by = c("chr","pos"))
> Error: std::bad_alloc

Looks like bugs in whatever package you're finding "left_join" in (and previously "merge"). Are those from dplyr and base? Showing us str(l4), str(asign), and sessionInfo() would be helpful.

Duncan Murdoch

>> dim(l4)
> [1] 166941635         8
>> dim(asign)
> [1] 107371528         5
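Given the dimensions quoted above, one plausible explanation (not confirmed anywhere in this thread) is that the join would produce more rows than a 32-bit integer can count, since every key shared by both tables contributes the product of its occurrences. The sketch below estimates the size of an inner join before attempting it. The toy rows and the names key_l4, key_asign, shared and expected_rows are assumptions for illustration.

```r
# Toy stand-ins (hypothetical rows); the key chr1:13550 repeats on both sides
l4 <- data.frame(X1 = c("chr1", "chr1", "chr1"),
                 X2 = c("13550", "13550", "14671"),
                 stringsAsFactors = FALSE)
asign <- data.frame(chr = c("chr1", "chr1", "chr1"),
                    pos = c("13550", "13550", "10177"),
                    stringsAsFactors = FALSE)

# Count how often each (chromosome, position) key occurs in each table
key_l4    <- table(paste(l4$X1, l4$X2, sep = ":"))
key_asign <- table(paste(asign$chr, asign$pos, sep = ":"))

# An inner join yields n_left * n_right rows for every shared key
shared <- intersect(names(key_l4), names(key_asign))
expected_rows <- sum(as.numeric(key_l4[shared]) * as.numeric(key_asign[shared]))
expected_rows

# TRUE would mean the result cannot be counted with a 32-bit integer
expected_rows > .Machine$integer.max
```

On data of the size reported in the thread, building these key tables is itself expensive, so in practice the check might be run on one chromosome at a time.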
Re: [R] negative vector length when merging data frames
I am using R-3.6.1
and these libraries:
library(data.table)
library(dplyr)

On Wed, Oct 23, 2019 at 6:54 PM Duncan Murdoch wrote:
>
> On 23/10/2019 7:04 p.m., Ana Marija wrote:
> > I also tried left_join but I got: Error: std::bad_alloc
> >
> >> df3 <- left_join(l4, asign, by = c("chr","pos"))
> > Error: std::bad_alloc
>
> Looks like bugs in whatever package you're finding "left_join" in (and
> previously "merge"). Are those from dplyr and base? Showing us
> str(l4), str(asign), and sessionInfo() would be helpful.
>
> Duncan Murdoch
Re: [R] negative vector length when merging data frames
I don't have it installed - that was merely a suggestion. I notice that both data.table and dplyr packages are mentioned as possibilities for "merge big datasets in r". Apparently the best way to do it if you have a database manager is to read the two datasets into tables and do the join via SQL or whatever language is available.

Jim

On Thu, Oct 24, 2019 at 10:17 AM Ana Marija wrote:
>
> no can you please send me an example how the command would look like in my
> case?
>
> On Wed, Oct 23, 2019 at 6:16 PM Jim Lemon wrote:
> >
> > Yes. Have you tried the bigmemory package?
> >
> > Jim
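Staying within base R, one option that is sometimes used for very large joins is to split the work by a key column and merge the pieces separately, so that no single merge has to hold the whole result. The sketch below does this per chromosome. It is an illustration on made-up rows (the values and the names chromosomes, pieces and m are hypothetical), not a tested fix for the data in this thread.

```r
# Toy stand-ins for l4 and asign covering two chromosomes (hypothetical rows)
l4 <- data.frame(X1 = c("chr1", "chr1", "chr2"),
                 X2 = c("13550", "14671", "500"),
                 pval_nominal = c(0.375614, 0.474708, 0.5),
                 stringsAsFactors = FALSE)
asign <- data.frame(chr = c("chr1", "chr2", "chr2"),
                    pos = c("13550", "500", "900"),
                    p.val.Retina = c("0.381708", "0.218132", "0.998262"),
                    stringsAsFactors = FALSE)

# Only chromosomes present in both tables can produce matches
chromosomes <- intersect(unique(l4$X1), unique(asign$chr))

# Merge one chromosome at a time
pieces <- lapply(chromosomes, function(ch) {
  merge(l4[l4$X1 == ch, ], asign[asign$chr == ch, ],
        by.x = c("X1", "X2"), by.y = c("chr", "pos"))
})

# Stack the per-chromosome results into one data frame
m <- do.call(rbind, pieces)
m
```

With real data each piece could also be written to disk inside the loop instead of being kept in memory, which avoids the final rbind altogether.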
Re: [R] negative vector length when merging data frames
No, can you please send me an example of how the command would look in my case?

On Wed, Oct 23, 2019 at 6:16 PM Jim Lemon wrote:
>
> Yes. Have you tried the bigmemory package?
>
> Jim
Re: [R] negative vector length when merging data frames
Yes. Have you tried the bigmemory package?

Jim

On Thu, Oct 24, 2019 at 10:08 AM Ana Marija wrote:
>
> Hi Jim,
>
> I think one of the issue is that data frames are so big,
> > dim(l4)
> [1] 166941635         8
> > dim(asign)
> [1] 107371528         5
>
> so my example would not reproduce the error
Re: [R] negative vector length when merging data frames
Thanks, but I would need a solution in R.

On Wed, Oct 23, 2019 at 6:31 PM Jim Lemon wrote:
>
> I don't have it installed - that was merely a suggestion. I notice
> that both data.table and dplyr packages are mentioned as possibilities
> for "merge big datasets in r". Apparently the best way to do it if you
> have a database manager is to read the two datasets into tables and do
> the join via SQL or whatever language is available.
>
> Jim
Re: [R] negative vector length when merging data frames
Hi Jim,

I think one of the issues is that the data frames are so big:

> dim(l4)
[1] 166941635 8
> dim(asign)
[1] 107371528 5

so my example would not reproduce the error.

On Wed, Oct 23, 2019 at 6:05 PM Jim Lemon wrote:
>
> Hi Ana,
> When I run this example taken from your email:
>
> [...]
>
> It works okay, but there are no matches in the join. So I can't even
> guess what the problem is.
>
> Jim
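For frames of this size, one base R approach is to merge one chromosome at a time, so that each call to merge() only handles a slice of the rows. The sketch below is only an illustration under assumptions: l4_toy and asign_toy are invented stand-ins for l4 and asign, and it assumes that splitting on the chromosome column (X1 / chr) gives pieces that fit in memory. It does not shrink the final result, so with real data each piece would probably be written to a file rather than bound together.

```r
# Toy stand-ins for l4 and asign (invented values, not the poster's data)
l4_toy <- data.frame(
  X1 = c("chr1", "chr1", "chr2", "chr2"),
  X2 = c("13550", "14671", "13550", "20000"),
  pval_nominal = c(0.38, 0.47, 0.70, 0.13),
  stringsAsFactors = FALSE
)
asign_toy <- data.frame(
  chr = c("chr1", "chr2", "chr2", "chr3"),
  pos = c("13550", "13550", "20000", "13550"),
  p.val.Retina = c(0.38, 0.96, 0.22, 0.44),
  stringsAsFactors = FALSE
)

# Chromosomes present in both tables; the others cannot produce matches
chroms <- intersect(unique(l4_toy$X1), unique(asign_toy$chr))

# Merge one chromosome at a time so each merge() call sees only a slice
pieces <- lapply(chroms, function(ch) {
  merge(l4_toy[l4_toy$X1 == ch, ],
        asign_toy[asign_toy$chr == ch, ],
        by.x = c("X1", "X2"), by.y = c("chr", "pos"))
})

# Fine for toy data; with real data, write each piece to a file instead
m_chunked <- do.call(rbind, pieces)
m_chunked
```

The same loop could be run over smaller position ranges within a chromosome if a single chromosome is still too large.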
Re: [R] negative vector length when merging data frames
Ah, it looks like a memory allocation problem.

Jim

On Thu, Oct 24, 2019 at 10:05 AM Ana Marija wrote:
>
> I also tried left_join but I got: Error: std::bad_alloc
>
> > df3 <- left_join(l4, asign, by = c("chr","pos"))
> Error: std::bad_alloc
> > dim(l4)
> [1] 166941635 8
> > dim(asign)
> [1] 107371528 5
>
> [...]
Re: [R] negative vector length when merging data frames
Hi Ana,
When I run this example taken from your email:

l4<-read.table(text="X1 X2 X3 X4 X5 variant_id pval_nominal gene_id.LCL
chr1 13550 G A b38 1:13550:G:A 0.375614 ENSG0227232
chr1 14671 G C b38 1:14671:G:C 0.474708 ENSG0227232
chr1 14677 G A b38 1:14677:G:A 0.699887 ENSG0227232
chr1 16841 G T b38 1:16841:G:T 0.127895 ENSG0227232
chr1 16856 A G b38 1:16856:A:G 0.627822 ENSG0227232
chr1 17005 A G b38 1:17005:A:G 0.802803 ENSG0227232",
 header=TRUE,stringsAsFactors=FALSE)
asign<-read.table(text="gene chr chr_pos pos p.val.Retina
ENSG0227232 chr1 1:10177:A:AC 10177 0.381708
ENSG0227232 chr1 rs145072688:10352:T:TA 10352 0.959523
ENSG0227232 chr1 1:11008:C:G 11008 0.218132
ENSG0227232 chr1 1:11012:C:G 11012 0.218132
ENSG0227232 chr1 1:13110:G:A 13110 0.998262
ENSG0227232 chr1 rs201725126:13116:T:G 13116 0.438572",
 header=TRUE,stringsAsFactors=FALSE)
merge(l4, asign, by.x=c("X1", "X2"), by.y=c("chr", "pos"))
 [1] X1           X2           X3           X4           X5
 [6] variant_id   pval_nominal gene_id.LCL  gene         chr_pos
[11] p.val.Retina
<0 rows> (or 0-length row.names)

It works okay, but there are no matches in the join. So I can't even
guess what the problem is.

Jim

On Thu, Oct 24, 2019 at 9:33 AM Ana Marija wrote:
>
> Hello,
>
> I have two data frames like this:
>
> [...]
>
> Please advise as to why I am getting this error when merging?
>
> Thanks
> Ana
Re: [R] negative vector length when merging data frames
I also tried left_join but I got: Error: std::bad_alloc

> df3 <- left_join(l4, asign, by = c("chr","pos"))
Error: std::bad_alloc
> dim(l4)
[1] 166941635 8
> dim(asign)
[1] 107371528 5

On Wed, Oct 23, 2019 at 5:32 PM Ana Marija wrote:
>
> Hello,
>
> I have two data frames like this:
>
> [...]
>
> Please advise as to why I am getting this error when merging?
>
> Thanks
> Ana
[R] negative vector length when merging data frames
Hello,

I have two data frames like this:

> head(l4)
    X1    X2 X3 X4  X5  variant_id pval_nominal gene_id.LCL
1 chr1 13550  G  A b38 1:13550:G:A     0.375614 ENSG0227232
2 chr1 14671  G  C b38 1:14671:G:C     0.474708 ENSG0227232
3 chr1 14677  G  A b38 1:14677:G:A     0.699887 ENSG0227232
4 chr1 16841  G  T b38 1:16841:G:T     0.127895 ENSG0227232
5 chr1 16856  A  G b38 1:16856:A:G     0.627822 ENSG0227232
6 chr1 17005  A  G b38 1:17005:A:G     0.802803 ENSG0227232
> head(asign)
          gene  chr                chr_pos   pos p.val.Retina
1: ENSG0227232 chr1           1:10177:A:AC 10177     0.381708
2: ENSG0227232 chr1 rs145072688:10352:T:TA 10352     0.959523
3: ENSG0227232 chr1            1:11008:C:G 11008     0.218132
4: ENSG0227232 chr1            1:11012:C:G 11012     0.218132
5: ENSG0227232 chr1            1:13110:G:A 13110     0.998262
6: ENSG0227232 chr1  rs201725126:13116:T:G 13116     0.438572
> m = merge(l4, asign, by.x=c("X1", "X2"), by.y=c("chr", "pos"))
Error in merge.data.frame(l4, asign, by.x = c("X1", "X2"), by.y = c("chr", :
  negative length vectors are not allowed
> sapply(l4,class)
          X1           X2           X3           X4           X5   variant_id
 "character"  "character"  "character"  "character"  "character"  "character"
pval_nominal  gene_id.LCL
   "numeric"  "character"
> sapply(asign,class)
        gene          chr      chr_pos          pos p.val.Retina
 "character"  "character"  "character"  "character"  "character"

Please advise as to why I am getting this error when merging?

Thanks
Ana
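On the question of why merge() fails: a frequently reported cause of "negative length vectors are not allowed" is that the join would return more rows than R's integer limit (.Machine$integer.max, about 2.1 billion), which can happen when the same key combination is repeated many times on both sides. Whether that is what happens here is an assumption, not something confirmed in the thread. The sketch below estimates the size of an inner join without performing it; estimate_join_rows() is a hypothetical helper and the two toy frames are invented stand-ins for l4 and asign.

```r
# Hypothetical helper: estimate how many rows an inner join would return
# without performing the join itself.
estimate_join_rows <- function(x_keys, y_keys) {
  # Count how often each key occurs on each side
  nx <- table(x_keys)
  ny <- table(y_keys)
  shared <- intersect(names(nx), names(ny))
  # Each shared key contributes (count in x) * (count in y) rows;
  # as.numeric() keeps the product from overflowing integer arithmetic
  sum(as.numeric(nx[shared]) * as.numeric(ny[shared]))
}

# Toy data standing in for l4 and asign (not the poster's real data)
l4_toy <- data.frame(X1 = c("chr1", "chr1", "chr1", "chr2"),
                     X2 = c("100", "100", "200", "100"),
                     stringsAsFactors = FALSE)
asign_toy <- data.frame(chr = c("chr1", "chr1", "chr2", "chr3"),
                        pos = c("100", "100", "100", "100"),
                        stringsAsFactors = FALSE)

# Combine the two key columns into one string per row
kx <- paste(l4_toy$X1, l4_toy$X2, sep = ":")
ky <- paste(asign_toy$chr, asign_toy$pos, sep = ":")

n_est <- estimate_join_rows(kx, ky)
n_est
# TRUE here would mean the result is too large for merge() to build
n_est > .Machine$integer.max
```

For the real frames the key vectors would be built from l4$X1, l4$X2 and asign$chr, asign$pos; note that table() over that many strings is itself expensive, so this is a diagnostic to run on a subset first.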
Re: [R] Change colour ggiNEXT plot package iNEXT
Apparently, the iNEXT package was first described in an academic paper
published in 2016, although CRAN archives go back to 2015.

http://chao.stat.nthu.edu.tw/wordpress/paper/120_pdf_appendix.pdf
https://cran.r-project.org/src/contrib/Archive/iNEXT/

The vignette below has a section entitled "General Customization" which
talks about color. See the four lines of code I've added to the
vignette's code to get a general idea what to do.

https://cran.r-project.org/web/packages/iNEXT/vignettes/Introduction.html

library(iNEXT)
library(ggplot2)
library(gridExtra)
library(grid)
data("spider")
out <- iNEXT(spider, q=0, datatype="abundance")
g <- ggiNEXT(out, type=1, color.var = "site")
print(g)
g1 <- g + scale_colour_manual(values=c("yellow", "green"))
print(g1)
g2 <- g1 + scale_fill_manual(values=c("yellow", "green"))
print(g2)

HTH, Bill.

W. Michels, Ph.D.

On Wed, Oct 23, 2019 at 11:13 AM David Winsemius wrote:
>
> On 10/22/19 12:48 PM, Luigi Marongiu wrote:
> > I thought it was a major package for ecological analysis.
>
> Yours is the first question in 20 years of Rhelp about the package iNEXT.
>
> --
> David
>
> [...]
Re: [R] Conditions in R (Help Post)
On 10/22/19 10:19 PM, Yeasmin Alea wrote:
> Thank you. Can you please have a look the below data sets, script and
> question?
>
> Dataset-1: Pen
>   YEAR DAY      X    Y  Sig   phase
> 1 1981   9 -0.213 1.08 1.10 Phase-7
> 2 1981  10  0.065 1.05 1.05 Phase-6
>
> Dataset-2: Book
>   YEAR                Time
> 1 1981 1981-12-03 06:00:00
> 2 1981 1981-12-04 00:00:00
>
> I want the output table as
>   YEAR                Time   phase
> 1 1981 1981-12-03 06:00:00 Phase-7
> 2 1981 1981-12-04 00:00:00 Phase-4

You are posting in HTML. R help is a plain text mailing list. It is easy
to send plain text using gmail. You should start over by configuring
your mail client for this purpose and send the output of

dput(head(Pen))
dput(head(Book))

rather than the versions above which do not lend themselves to simple
input strategies.

--
David.

> How can I combine and match the Dataset-1 DAY (365 days*35 years) +YEAR
> with Dataset-2 YEAR+Time? Dataset 1 has 5,551 rows and dataset 2 has
> 22,210
>
> d$Pen<-Pen[cbind(match(Book$Time,Pen$DAY)]
>
> Kind regards
> Alea Yeasmin
>
> On Wed, Oct 23, 2019 at 2:20 AM jim holtman wrote:
>> Here is one way of doing it; I think the output you show is wrong:
>>
>> [...]
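One possible reading of the matching question, sketched in base R: if DAY in Pen is the day of the year, the same number can be derived from Book$Time, and the two tables can then be matched on YEAR plus DAY. Everything in the sketch is an assumption for illustration: Pen and Book are small invented tables (the DAY values 337 and 338 were chosen so that the example has matches), and the times are treated as UTC.

```r
# Invented stand-ins for the poster's tables
Pen <- data.frame(YEAR = c(1981, 1981),
                  DAY = c(337L, 338L),          # assumed: day of year
                  phase = c("Phase-7", "Phase-4"),
                  stringsAsFactors = FALSE)
Book <- data.frame(YEAR = c(1981, 1981),
                   Time = as.POSIXct(c("1981-12-03 06:00:00",
                                       "1981-12-04 00:00:00"),
                                     tz = "UTC"))

# POSIXlt stores the day of year in $yday, counted from 0, hence the + 1L
Book$DAY <- as.POSIXlt(Book$Time, tz = "UTC")$yday + 1L

# Build a combined YEAR/DAY key on both sides and look up each Book row in Pen
idx <- match(paste(Book$YEAR, Book$DAY), paste(Pen$YEAR, Pen$DAY))
Book$phase <- Pen$phase[idx]   # NA where Pen has no row for that day
Book
```

Because match() returns one index per Book row, several Book times on the same day simply receive the same phase, which fits a Book table that is longer than Pen.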
Re: [R] Change colour ggiNEXT plot package iNEXT
On 10/22/19 12:48 PM, Luigi Marongiu wrote:
> I thought it was a major package for ecological analysis.

Yours is the first question in 20 years of Rhelp about the package iNEXT.

--
David

> Anyway,
> thank you for the tips. I'll dip from there.
>
> On Tue, Oct 22, 2019 at 5:29 PM Jeff Newmiller wrote:
>> Probably, assuming that function returns a ggplot object. You will need to
>> identify the levels of the factor used for distinguishing groups, and add
>> a scale_colour_manual() to the ggplot object with colors specified in the
>> same order as those levels.
>>
>> Support for obscure packages is technically off-topic here ... if you need
>> a more specific answer you may need to correspond with the package authors
>> or use their suggested support resources.
>>
>> On October 22, 2019 2:18:49 AM PDT, Luigi Marongiu wrote:
>>> Dear all,
>>> is it possible to provide custom color to the rarefaction curve of the
>>> package iNEXT (ggiNEXT)?
>>> If I have these data:
>>> ```
>>> library(iNEXT)
>>> library(ggplot2)
>>> data(spider)
>>> out <- iNEXT(spider, q=0, datatype="abundance")
>>> ggiNEXT(out, type=1)
>>> ```
>>> can i colour the lines with, let's say, yellow and green?
>>> Thank you
>> --
>> Sent from my phone. Please excuse my brevity.
[R] problem in drawing relative potency curve
Dear all,

I investigated the effect of herbicides on weed growth (dose-response). I used the drc package and chose the best-fitting model with the mselect function and the model parameters. For some data the logistic model was the best, and for others the Weibull. But when I wanted to draw the relative potency curve, R displayed an error message only for the Weibull model (attached files). Could you please help me?

Thanks a lot.

Yours sincerely,
Ahmad Rahbari

> mydata<- read.csv ("D:\\MY Data\\R\\Tri-set.csv" , sep=",", dec=".", header=TRUE)
> mydata1 <- drm (Survival ~ Dose, Herbicide, fct = W1.3(), data=mydata)
> summary(mydata1)

Model fitted: Weibull (type 1) with lower limit at 0 (3 parms)

Parameter estimates:

       Estimate Std. Error t-value   p-value
b:EC   1.440454   0.239782  6.0073 3.346e-06 ***
b:MC   1.915474   0.320034  5.9852 3.533e-06 ***
d:EC 100.489083   5.303133 18.9490 6.368e-16 ***
d:MC 100.356215   5.257064 19.0898 4.791e-16 ***
e:EC   0.711835   0.054866 12.9740 2.444e-12 ***
e:MC   0.637447   0.042248 15.0882 9.551e-14 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.398498 (24 degrees of freedom)

> EDcomp(mydata1, c(10,10), interval = "delta")

Estimated ratios of effect doses

            Estimate   Lower   Upper
EC/MC:10/10  0.75802 0.18863 1.32740

> EDcomp(mydata1, c(50,50), interval = "delta")

Estimated ratios of effect doses

            Estimate   Lower   Upper
EC/MC:50/50  1.04841 0.77686 1.31997

> EDcomp(mydata1, c(90,90), interval = "delta")

Estimated ratios of effect doses

            Estimate  Lower  Upper
EC/MC:90/90   1.2891 0.9224 1.6559

> relpot(mydata1, plotit = TRUE, compMatch = NULL, percVec = NULL, interval = "none", type="relative")
Error in match.fun(FUN) :
  'object$fct$lowerAs' is not a function, character or symbol
Re: [R] Conditions in R (Help Post)
Thank you. Can you please have a look at the data sets, script and question below?

Dataset-1: Pen
  YEAR DAY      X    Y  Sig   phase
1 1981   9 -0.213 1.08 1.10 Phase-7
2 1981  10  0.065 1.05 1.05 Phase-6

Dataset-2: Book
  YEAR                Time
1 1981 1981-12-03 06:00:00
2 1981 1981-12-04 00:00:00

I want the output table as

  YEAR                Time   phase
1 1981 1981-12-03 06:00:00 Phase-7
2 1981 1981-12-04 00:00:00 Phase-4

How can I combine and match the Dataset-1 DAY (365 days * 35 years) + YEAR with Dataset-2 YEAR + Time? Dataset 1 has 5,551 rows and dataset 2 has 22,210.

d$Pen<-Pen[cbind(match(Book$Time,Pen$DAY)]

Kind regards
Alea Yeasmin

On Wed, Oct 23, 2019 at 2:20 AM jim holtman wrote:

> Here is one way of doing it; I think the output you show is wrong:
>
> library(tidyverse)
> input <- read_delim(" YEAR DAY X Y Sig
> 1981 9 -0.213 1.08 1.10
> 1981 10 0.065 1.05 1.05", delim = ' ', trim_ws = TRUE)
>
> input <- mutate(input,
>   phase = case_when(X < 0 & Y < 0 & Y < X ~ 'phase=1',
>                     X < 0 & Y > 0 & Y < X ~ 'phase=2',
>                     X < 0 & Y > 0 & Y < X ~ 'phase=7',
>                     X < 0 & Y > 0 & Y > X ~ 'phase=8',
>                     X > 0 & Y < 0 & Y < X ~ 'phase=3',
>                     X > 0 & Y < 0 & Y > X ~ 'phase=4',
>                     X > 0 & Y > 0 & Y > X ~ 'phase=6',
>                     X > 0 & Y > 0 & Y < X ~ 'phase=5',
>                     TRUE ~ 'unknown'
>   )
> )
>
> > input
> # A tibble: 2 x 6
>    YEAR   DAY      X     Y   Sig phase
> 1  1981     9 -0.213  1.08  1.1  phase=8
> 2  1981    10  0.065  1.05  1.05 phase=6
>
> Jim Holtman
> *Data Munger Guru*
>
> *What is the problem that you are trying to solve? Tell me what you want to
> do, not how you want to do it.*
>
> On Tue, Oct 22, 2019 at 9:43 AM Yeasmin Alea wrote:
>
>> Hello Team
>> I would like to add a new column (for example-Phase) from the below data
>> set based on the conditions
>>   YEAR DAY      X    Y  Sig
>> 1 1981   9 -0.213 1.08 1.10
>> 2 1981  10  0.065 1.05 1.05
>> *Conditions*
>>
>> D$Phase=sapply(D,function(a,b) {
>> a <-D$X
>> b<-D$Y
>> if (a<0 && b<0 && b<a)
>> {phase=1} else if (a<0 && b<0 && b>a)
>> {phase=2} else if (a<0 && b>0 && b<a)
>> {phase=7} else if (a<0 && b>0 && b>a)
>> {phase=8} else if (a>0 && b<0 && b<a)
>> {phase=3} else if (a>0 && b<0 && b>a)
>> {phase=4} else if (a>0 && b>0 && b>a)
>> {phase=6} else (a>0 && b>0 && b<a)
>> {phase=5}
>> })
>>
>> Can anyone help to fix the script to get a Phase column based on the
>> conditions. The table will be like the below
>>   YEAR DAY      X    Y  Sig   Phase
>> 1 1981   9 -0.213 1.08 1.10 phase=7
>> 2 1981  10  0.065 1.05 1.05 phase=6
>>
>> Many thanks
>> Alea
Re: [R] If Loop I Think
Hi

***do not think in if or if loops in R***.

To elaborate on Jim's solution further: with a simple function based on a logical expression

fff <- function(x) (x!="")+0

you could use apply

t(apply(phdf[,3:5], 1, fff))

and add the results to your data frame columns

phdf[, 6:8] <- t(apply(phdf[,3:5], 1, fff))

Regarding tutorials: the basic stuff is in R-intro, and there is excellent documentation for each function. And as the pool of R users is huge, you could simply ask Google, e.g.

r change values based on condition

Cheers
Petr

> -----Original Message-----
> From: R-help On Behalf Of Jim Lemon
> Sent: Wednesday, October 23, 2019 12:26 AM
> To: Phillip Heinrich
> Cc: r-help
> Subject: Re: [R] If Loop I Think
>
> Hi Philip,
> Try this:
>
> phdf<-read.table(
>  text="Row Outs RunnerFirst RunnerSecond RunnerThird R1 R2 R3
> 1 0
> 2 1
> 3 1
> 4 1 arenn001
> 5 2 arenn001
> 6 0
> 7 0 perad001
> 8 0 polla001 perad001
> 9 0 goldp001 polla001 perad001
> 10 0 lambj001 goldp001
> 11 1 lambj001 goldp001
> 12 2 lambj001
> 13 0
> 14 1 ",
>  header=TRUE,stringsAsFactors=FALSE,fill=TRUE)
> phdf$R1<-ifelse(nchar(phdf$RunnerFirst) > 0,1,0)
> phdf$R2<-ifelse(nchar(phdf$RunnerSecond) > 0,1,0)
> phdf$R3<-ifelse(nchar(phdf$RunnerThird) > 0,1,0)
>
> Jim
>
> On Wed, Oct 23, 2019 at 7:54 AM Phillip Heinrich wrote:
> >
> > Row Outs RunnerFirst RunnerSecond RunnerThird R1 R2 R3
> > 1 0
> > 2 1
> > 3 1
> > 4 1 arenn001
> > 5 2 arenn001
> > 6 0
> > 7 0 perad001
> > 8 0 polla001 perad001
> > 9 0 goldp001 polla001 perad001
> > 10 0 lambj001 goldp001
> > 11 1 lambj001 goldp001
> > 12 2 lambj001
> > 13 0
> > 14 1
> >
> > With the above data, Arizona Diamondbacks baseball, I’m trying to put
> > zeros into the R1 column is the RunnerFirst column is blank and a one if
> > the column has a coded entry such as rows 4,5,7,8,& 9. Similarly I want
> > zeros in R2 and R3 if RunnerSecond and RunnerThird respectively are blank
> > and ones if there is an entry.
> >
> > I’ve tried everything I know how to do such as “If Loops”, “If-Then
> > loops”, “apply”, “sapply”, etc. I wrote function below and it ran without
> > errors but I have no idea what to do with it to accomplish my goal:
> >
> > R1 <- function(x) {
> >   if (ari18.test3$RunnerFirst == " "){
> >     ari18.test3$R1 <- 0
> >     return(R1)
> >   }else{
> >     R1 <- ari18.test3$R1 <- 1
> >     return(R1)
> >   }
> > }
> >
> > The name of the data frame is ari18.test3
> >
> > On a more philosophical note, data handling in R seems to be made up of
> > thousands of details with no over-riding principles. I’ve read two books
> > on R and a number of tutorial and watched several videos but I don’t seem
> > to be making any progress. Can anyone suggest videos, or tutorials, or
> > books that might help? Database stuff has never been my strong point but
> > I’m determined to learn.
> >
> > Thanks,
> > Philip Heinrich
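As a follow-up to the R1/R2/R3 question in this thread, here is a minimal base R sketch of the vectorised approach that needs neither if nor a loop over rows. The small phdf below is an invented subset of the posted table, and treating NA or whitespace-only entries as "no runner" is an assumption about how the blanks were read in.

```r
# Invented four-row subset of the posted data
phdf <- data.frame(
  Outs         = c(0, 1, 0, 0),
  RunnerFirst  = c("", "arenn001", "perad001", "polla001"),
  RunnerSecond = c("", "", "", "perad001"),
  RunnerThird  = c("", "", "", ""),
  stringsAsFactors = FALSE
)

runner_cols <- c("RunnerFirst", "RunnerSecond", "RunnerThird")

# For each runner column: 1 where there is a non-blank entry, 0 otherwise.
# The comparison works on the whole column at once, so no if/else is needed.
phdf[c("R1", "R2", "R3")] <- lapply(phdf[runner_cols], function(x) {
  as.integer(!is.na(x) & nzchar(trimws(x)))
})

phdf
```

The logical test inside the function is the whole "condition"; as.integer() turns TRUE/FALSE into 1/0, which is the same idea as the (x != "") + 0 trick above.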