Re: [R-es] Consulta
Hi Berenice,

What do you mean when you say that R does not recognize the package? Do you get an error message? If you have not already tried reinstalling the package, please do so. To reproduce the error with your code, we would need one of the PDFs you are using (you could share a link via Dropbox or similar).

Regards,
Emilio

> On 24 Sep 2019, at 1:49, BERENICE DOMINGUEZ SANCHEZ wrote:
>
> Good afternoon everyone:
>
> I had R 3.6 and was using the pdftools package to extract information from
> PDF files. I updated to version 3.6.1 and R no longer recognizes the
> package. Can anyone help me? It practically does not recognize any of the
> pdftools functions.
Re: [R] [FORGED] Re: Loop With Dates
On 22/09/19 11:19 PM, Richard O'Keefe wrote: Whenever you want a vector that counts something, cumsum of a logical vector is a good thing to try. Fortune nomination. cheers, Rolf -- Honorary Research Fellow Department of Statistics University of Auckland Phone: +64-9-373-7599 ext. 88276 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
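Richard's tip can be illustrated with a small sketch. The vector `x` and the threshold of 5 are made up for illustration and do not come from the original thread.

```r
# Made-up data: count, at each position, how many values so far exceeded 5.
x <- c(3, 7, 2, 9, 1, 8)

# A logical vector marks the events of interest ...
over5 <- x > 5

# ... and cumsum() turns it into a running count (TRUE counts as 1, FALSE as 0).
running_count <- cumsum(over5)

print(data.frame(x = x, over5 = over5, running_count = running_count))
```

The same pattern is often used to build group identifiers, for example starting a new group every time a condition becomes TRUE.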
[R-es] Consulta
Good afternoon everyone:

I had R 3.6 and was using the pdftools package to extract information from PDF files. I updated to version 3.6.1 and R no longer recognizes the package. Can anyone help me? It practically does not recognize any of the pdftools functions.

library(pdftools)
library(stringr)
library(NLP)
library(tm)
library(tesseract)
library(magick)
install.packages("magick")
install.packages("pdftools")

txt <- system.file("texts", "txt", package = "tm")

rfc_rg <- "([A-Z]{3,})([0-9]{6})([A-Z]|[0-9]){0,3}"
#poliza_rg <- "(34|36|37|39)(ME|MEC|CH|MB|TF|GI|VE|TS|IM|ER|VE)*([0-9]{6,})[-]([0-9]){2}[-][A-Z]"
poliza_rg <- "(ME|CH|MB|TF|GI|gi|VE|TS|IM|ER|VE)*([0-9]{8,})[-]([0-9]){2}"
registro_rg <- "(CNSF-H0711-)([0-9]{4})[- ]([0-9]){4}"
subgrupo_rg <- "_([0-9]){1,3}."
mon_rg <- "SMGM|UMAM|MN"

ruta <- 'C:/Users/bdominguez/Documents/H0711/Bond/1909/'
archivos<-list.files(path=ruta,pattern = '*.pdf')

imagen <- image_read_pdf(path=paste(ruta,"/",nombre,".pdf",sep=""))
prueba <-image_ocr(imagen, language = 'eng')
lineas<-unlist(str_split(prueba,pattern = "\n"))
lineasp<-unlist(str_split(prueba[2],pattern = "\r\n"))

newnom <- NULL
renglones <- NULL
for (nombre in archivos){
  subgrupo <- str_extract(str_extract(nombre,pattern = subgrupo_rg),pattern = "[0-9]{1,3}")
  imagen <- image_read_pdf(path=paste(ruta,"/",nombre,".pdf",sep=""))
  prueba <-image_ocr(imagen, language = 'eng')
  lineas<-unlist(str_split(prueba,pattern = "\n"))
  poliza <- NULL
  poliza<-str_extract(lineas[1],poliza_rg)
  newnom <- c(newnom,paste(poliza[1],substr(nombre,5,6),".pdf",sep=''))

  registro <- NULL
  registro<-str_extract(lineas[49],registro_rg)

  rfc <- NULL
  rfc <- str_extract(lineas[5],rfc_rg)

  #lineasnew<-unlist(str_split(lineas[2],pattern = "\r\n"))
  #lineasdosnew<-unlist(str_split(lineas[1],pattern = "\r\n"))

  cobertura <- NA
  extranjera <- NA
  suma_str <- NA
  deducible_str <- NA

  suma <- NA
  coaseguro <- NA
  deducible <- NA
  tope <- NA
  mon <- NA
  mondedu <- NA

  cobertura <- grep("Cobertura en el Extranjero",lineas,value=TRUE)
  extranjera <- grep("Emergencia en el Extranjero",lineas,value=TRUE)
  suma_str <- grep("SUMA ASEGURADA:",lineas,value=TRUE)
  deducible_str <- grep("DEDUCIBLE:",lineas,value=TRUE)
  sumacob <- NA
  sumaext <- NA

  pprimaria <- grep("Numero de Póliza:", lineas, value = TRUE)
  dnprimariaa <- grep("Nombre de la Aseguradora Primaria:", lineas, value = TRUE)

  #cer<- grep("Certificado No. ",lineas, value=TRUE)
  #ntit<- grep("Ramo", lineas, value=TRUE)

  sumacob<-as.numeric(str_extract(cobertura[1],pattern = "[0-9]{1,}"))
  if (length(sumacob)==0){
    sumacob = NA
  }

  sumaext<-as.numeric(str_extract(extranjera[17],pattern = "[0-9]{1,}"))
  if (length(sumaext)==0){
    sumaext = NA
  }
  valores <- NULL
  monedas <- NULL
  valores <- str_extract_all(suma_str[17],pattern = "[0-9]{0,3},*[0-9]{0,3},*[0-9]{1,3}(.[0-9]{1,}){0,1}",simplify=TRUE)
  monedas <- str_extract(suma_str,pattern = mon_rg)
  if (length(valores[1])==0){
    suma = NA
    mon = NA
  }else{
    suma = as.numeric(gsub(pattern = ",*",replacement = "",valores[1]))
    mon <- as.character(monedas[1])
  }

  if (length(valores[2])==0){
    coaseguro = NA
  }else{
    coaseguro = as.numeric(valores[2])
  }
  valores <- NULL
  valores <- str_extract_all(deducible_str[1],pattern = "[0-9]{0,3},*[0-9]{0,3},*[0-9]{1,3}(.[0-9]{1,}){0,1}",simplify=TRUE)

  if (length(valores[1])==0){
    deducible <- NA
  }else{
    deducible <- as.numeric(gsub(pattern = ",",replacement = "",valores[1]))
  }

  monedas <- NULL
  monedas <- str_extract(deducible_str[1],pattern = mon_rg)

  if (length(monedas)==0){
    mondedu <- NA
  }else{
    mondedu <- monedas
  }

  if (length(valores[2])==0){
    tope = NA
  }else{
    tope = as.numeric(gsub(pattern = ",",replacement = "",valores[2]))
  }

  renglon <- data.frame(archivo=nombre,poliza=as.character(poliza[1]),cobertura=sumacob,emergencia=sumaext,registro=registro[1],suma=suma,coaseguro=coaseguro,deducible=deducible,tope=tope,rfc=rfc,mon=mon,mondedu=mondedu,subgrupo=subgrupo,
                        cert=as.character(cer[1]), cer_tit=as.character(lineasdos[14]),
                        titu=as.character(lineasdos[10]))
  renglones <- rbind(renglones,renglon)
}

# Using the data in the data frame, rename the files; the subdirectories need to be created
noms <- data.frame(archivo=archivos,poliza=newnom)
noms <- renglones[!is.na(renglones$poliza),c('archivo','cer_tit')]
ungrupo<-sqldf("select poliza,count(cert) from noms group by 1 having count(cert) <= 1 ")
noms<-sqldf("select * from noms where poliza in (select poliza from ungrupo)")
length(noms$archivo)
salida <- "/renombra/"
for (i in 1:length(noms[,1])){
  if (!is.na(noms[i,'cer_tit'])){
    pfrom <- paste(ruta,"/",noms[i,'archivo'],sep='')
    pto <-
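Before debugging the script itself, it may help to check whether the packages are actually installed for the R version now in use. The following is only a diagnostic sketch, not a confirmed fix for the problem described above; the package list is taken from the `library()` calls in the post, and the assumption is that the upgrade left some of them unavailable. Note also that `image_read_pdf()` belongs to magick and needs pdftools to be installed.

```r
# Packages used in the posted script that do the PDF and OCR work.
pkgs <- c("pdftools", "magick", "tesseract", "stringr")

# requireNamespace() returns FALSE (instead of failing) when a package is
# not installed for the running R version.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

print(R.version.string)
print(installed)

missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed for this R version: ",
          paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to reinstall them
}
```

If everything is reported as installed, posting the exact error message printed by `library(pdftools)` would be the next step.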
Re: [R] 95% bootstrap CIs
Hello,

Is this what you are looking for?

ci95 <- apply(my.data, 2, quantile, probs = c(0.025, 0.975))

Hope this helps,

Rui Barradas

At 20:42 on 23/09/19, varin sacha via R-help wrote:
> Dear R-Experts,
>
> Here is my reproducible R code to get the Mean squared error of GAM and
> MARS after I = 50 iterations/replications. If I want to get the 95%
> bootstrap CIs around the MSE of GAM and around the MSE of MARS, how can I
> complete/modify my R code?
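To see what the suggested line returns without fitting any models, here is a minimal sketch. The matrix `my.data` below is filled with made-up numbers standing in for the 50 replicated MSE values; the second part (resampling the replicates to get an interval for the mean MSE) is an assumption about what might be wanted and is not part of the reply above.

```r
set.seed(1)

# Stand-in for the 50 x 2 matrix of replicated MSEs (values are invented).
my.data <- cbind(MSE_GAM = rexp(50, rate = 2), MSE_MARS = rexp(50, rate = 1))

# Percentile interval over the replicates, column by column.
ci95 <- apply(my.data, 2, quantile, probs = c(0.025, 0.975))
print(ci95)

# Assumed variant: bootstrap the rows to get an interval for the mean MSE.
boot_means <- replicate(2000, {
  idx <- sample(nrow(my.data), replace = TRUE)
  colMeans(my.data[idx, , drop = FALSE])
})
boot_ci <- apply(boot_means, 1, quantile, probs = c(0.025, 0.975))
print(boot_ci)
```

The first interval describes the spread of individual replicate MSEs; the second describes the uncertainty in their average, so it is usually much narrower.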
[R] 95% bootstrap CIs
Dear R-Experts,

Here is my reproducible R code to get the Mean squared error of GAM and MARS after I = 50 iterations/replications. If I want to get the 95% bootstrap CIs around the MSE of GAM and around the MSE of MARS, how can I complete/modify my R code?

Many thanks for your precious help.

##
library(mgcv)
library(earth)

my.experiment <- function() {
  n <- 500
  x <- runif(n, 0, 5)
  z <- rnorm(n, 2, 3)
  a <- runif(n, 0, 5)

  y_model <- 0.1*x^3 - 0.5*z^2 - a + x*z + x*a + 3*x*a*z + 10
  y_obs <- y_model + c( rnorm(n*0.97, 0, 0.1), rnorm(n*0.03, 0, 0.5) )

  gam_model <- gam(y_obs~s(x)+s(z)+s(a))
  mars_model <- earth(y_obs~x+z+a)

  MSE_GAM <- mean((gam_model$fitted.values - y_model)^2)
  MSE_MARS <- mean((mars_model$fitted.values - y_model)^2)

  return( c(MSE_GAM, MSE_MARS) )
}

my.data = t(replicate( 50, my.experiment() ))
colnames(my.data) <- c("MSE_GAM", "MSE_MARS")
summary(my.data)
##
Re: [R-es] help: boxplot multivariables
Hi Diego,

Thank you. I am fairly new to R and sometimes find it hard to follow. No problem at all, I think I will end up using ggplot anyway.

The NAs are there because this is an experiment in which GS can only occur when a person is present, and in some episodes the animal is alone. Perhaps a 0 could be used instead for the boxplot. So I wanted to compare the GS and GI behaviours. I followed your script, and with this small sample of data this is what I got for GS (is it the same as what you obtained?)

[image: image.png]

Perhaps I asked about the strangest combination, because GS almost never happened. In fact, I would like to do the same with AA and AD, Z1-Z2, and so on with other behaviours, so as to get a boxplot similar to this one but with my data:

[image: image.png]

I only pasted a small extract because there is a lot of data. But yes, the data you copied are correct. Thanks.

On Mon, 23 Sep 2019 at 14:28, Diego Martín wrote:
> Hi Lorena:
>
> Well, given the way you pose the question, this is what comes to me
> automatically: [...]
>
> On Mon, 23 Sep 2019 at 16:34, Lorena Saavedra Aracena
> (<l.saavedra.arac...@gmail.com>) wrote:
>> Hi,
>> I am trying to make a boxplot to compare different behaviours. [...]

--
Lorena Saavedra A.
Ing. Recursos Naturales Renovables
+56 9 9880 2972

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es
Re: [R-es] help: boxplot multivariables
Hi Lorena:

Well, given the way you pose the question, this is what comes to me automatically:

library(dplyr)
library(purrr)
library(ggplot2)

dLSaa %>%
  ggplot(aes(factor(Ep), GI)) +
  geom_boxplot()

map_dbl(.x = dLSaa$GS, .f = function(x) {sum(is.na(x))})

dLSaa %>%
  ggplot(aes(factor(Ep), GS)) +
  geom_boxplot()

I know you would rather use the boxplot() function, but ggplot is what I work with most, and I have almost become used to thinking that way.

There are a few things. First, the data surprise me given the result that comes out, especially for the GS column; that is why I included the map_dbl call with is.na, to highlight the NAs, but also because of the values I find in that column. The other issue is that I am not sure the columns are comparable. But all of that is beside the point, because what matters is what you are doing and that this approach is of some use to you. I hope it helps.

Kind regards.

PS. I am attaching the table I used, the one you posted, which I had to read in myself, so that you can check it in case I made a mistake.

On Mon, 23 Sep 2019 at 16:34, Lorena Saavedra Aracena (<l.saavedra.arac...@gmail.com>) wrote:
> Hi,
> I am trying to make a boxplot to compare different behaviours.
> I have a data set with 17 columns and want to create a boxplot
> considering 3 of them. [...]

dLSaa.RData
Description: Binary data
[R-es] help: boxplot multivariables
Hi,

I am trying to make a boxplot to compare different behaviours. I have a data set with 17 columns and want to create a boxplot considering 3 of them. I have searched forums, and this question is always answered using subsets of data within a single column, so I have not managed to do it.

This is an example of my data:

ID: animal id
EP: episode
AA: lying down and alert
AD: lying down resting
B: sitting; C: standing; D: locomotion; F: exploration; GS: social play; GI: individual play
Z: zone 1, 2, 3, 4

 ID  Ep  AA  AD   B   C   D   F  GS  GI  Z1  Z2  Z3  Z4
  1   1   8   0   0   0   4   5   0   0   8   1   0   0
  1   2   5   0   0   2   5   3   0   0   9   0   0   0
  1   3   0   0   0   0  12   0  11   0   1   2   0   1
  1   4   2   0   0   2   4   0  NA   8   8   0   0   0
  1   5   3   0   1   0   8   0   0   7   1   3   0   4
  1   6   5   0   0   5   2   0   0  10   0  11   0   0
  2   1   0   0  12   0   0   1   0   0   0   0  12   0
  2   2   8   0   0   0   3   1   0   0   0   0   0   9
  2   3   0   0   1   1  10   1   0   0   1   0   1   0
  2   4   0   0   0   6   5   0  NA   0   4   1   2   0
  2   5   1   0   7   0   4   0   0   0   0   2   0   6
  2   6   4   0   2   1   5   0   0   0   2   0   2   4
  3   1   2   6   0   4   0   2   0   0   0   0  12   0
  3   2  10   1   0   0   0   0   0   0   0   0  11   0
  3   3   0   0   0   7   5   3   0   0   4   0   0   0
  3   4   0   0   0   1   5   1  NA   0   5   0   0   0
  3   5   2   0   0   5   5   0   0   0   0   3   0   2
  3   6   4   2   0   4   2   0   0   0   4   6   0   0

I need to create a boxplot where x is always Ep (episodes) and y covers, for example, GS and GI, so that I can compare those behaviours. Ideally I would like to do it simply with boxplot(), but I am equally open to trying ggplot.

Many thanks, regards

--
Lorena Saavedra A.
Ing. Recursos Naturales Renovables
+56 9 9880 2972
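One possible way to do this with base `boxplot()` is to stack the two behaviour columns into a long format and then use a two-factor formula. This is only a sketch: the data frame `d` re-types the Ep, GS and GI columns from the table above, and the names `long`, `behaviour` and `value` are chosen here for illustration.

```r
# Ep, GS and GI re-typed from the posted table (18 rows: 3 animals x 6 episodes).
d <- data.frame(
  Ep = rep(1:6, times = 3),
  GS = c(0, 0, 11, NA, 0, 0,   0, 0, 0, NA, 0, 0,   0, 0, 0, NA, 0, 0),
  GI = c(0, 0, 0, 8, 7, 10,    0, 0, 0, 0, 0, 0,    0, 0, 0, 0, 0, 0)
)

# Stack GS and GI into one value column plus a label column.
long <- data.frame(
  Ep        = rep(d$Ep, times = 2),
  behaviour = rep(c("GS", "GI"), each = nrow(d)),
  value     = c(d$GS, d$GI)
)

# One box per behaviour within each episode; NA values are dropped by boxplot().
bp <- boxplot(value ~ behaviour + Ep, data = long,
              col = c("grey80", "white"), las = 2,
              xlab = "", ylab = "Count",
              main = "GS and GI by episode")
```

The same stacking works for other pairs such as AA and AD or Z1 and Z2 by swapping the column names. With ggplot2, the long data frame could be passed to `geom_boxplot()` with `fill = behaviour` instead.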
Re: [R] The "--slave" option
> Richard O'Keefe
> on Sat, 21 Sep 2019 09:39:18 +1200 writes:

> Ah, *now* we're getting somewhere. There is something
> that *can* be done that's genuinely helpful.
> From the R(1) manual page:
>
>   -q, --quiet   Don't print startup message
>   --silent      Same as --quiet
>   --slave       Make R run as quietly as possible
>
> It might have been better to use --nobanner instead of
> --quiet. So perhaps
>
>   -q, --quiet   Don't print the startup message. This is
>                 the only output that is suppressed.
>   --silent      Same as --quiet. Suppress the startup
>                 message only.
>   --slave       Make R run as quietly as possible. This is
>                 for use when running R as a subordinate process. See
>                 "Introduction to Sub-Processes in R"
>                 https://cran.r-project.org/web/packages/subprocess/vignettes/intro.html
>                 for an example.

Thank you, Stephen and Richard.

I think we (the R Core Team) *can* make the description a bit more verbose. However, as practically all "--" descriptions fit in one short line (and as the 'subprocess' package is just an extension pkg, and may disappear (and more reasons)), I'd like to be less verbose than your proposal. What about

  -q, --quiet   Don't print startup message
  --silent      Same as --quiet
  --slave       Make R run as quietly as possible. For use when
                running R as sub(ordinate) process.

If you look more closely, you'll notice that --slave is not much quieter than --quiet, the only (?) difference being that the input is not copied and (only "mostly") the R prompt is also not printed.

And from my experiments (in Linux (Fedora 30)), one might even notice that in some cases --slave prints the R prompt (to stderr?), which one might consider bogus (I'm not: not wanting to spend time fixing this platform-independently):

--slave :

MM@lynne$ echo '(i <- 1:3)
i*10' | R-3.6.1 --slave --vanilla
> [1] 1 2 3
[1] 10 20 30
MM@lynne$ f=/tmp/Rslave.out$$; echo '(i <- 1:3)
i*10' | R-3.6.1 --slave --vanilla | tee $f
> [1] 1 2 3
[1] 10 20 30
MM@lynne$ cat $f
[1] 1 2 3
[1] 10 20 30

--quiet :

MM@lynne$ f=/tmp/Rquiet.out$$; echo '(i <- 1:3)
i*10' | R-3.6.1 --quiet --vanilla | tee $f
> (i <- 1:3)
[1] 1 2 3
> i*10
[1] 10 20 30
>
MM@lynne$ cat $f
> (i <- 1:3)
[1] 1 2 3
> i*10
[1] 10 20 30
>
MM@lynne$

But there's a bit more to it: in my examples above, both --quiet and --slave were used together with --vanilla. In general --slave *also* never saves, i.e., uses the equivalent of q('no'), whereas --quiet does [ask or ...].

Last but not least, from very simply reading R's source code on this, it becomes blatant that you can use '-s' instead of '--slave', but we (R Core) have probably not documented that on purpose (so we could reserve it for something more important, and redefine the simple use of '-s' some time in the future ?)

So, all those who want to restrict their language could use '-s' for now.

In addition, we could add >> one << other alias to --slave, say --subprocess (or --quieter ? or ???), and one could make that the preferred use some time in the future.

Well, these were another two hours of time *not* spent improving R technically, but spent reading e-mails, source code, and considering. Maybe well spent, maybe not ...

Martin Maechler
ETH Zurich and R Core Team

> On Sat, 21 Sep 2019 at 02:29, Stephen Ellison wrote:
>>
>> > Sure, it's a silly example, but it makes about as much sense as using
>> > "slave" to mean "quiet".
>>
>> It doesn't. It's a set of options chosen for when R is
>> called as a slave process from a controlling process, and
>> in that it is a reasonable description of the circumstance.
>>
>> --quiet is a separate command line option with different
>> effect.
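The difference described above can also be checked from within R by launching a child R process with each option and capturing its standard output. This is a rough sketch under a few assumptions: that the `R` front end sits in `R.home("bin")`, that the short form `-s` mentioned in the message is accepted by the installed R version, and that `-e` is used instead of piping input as in the shell transcripts.

```r
# Path to the R front end of the running installation (assumed to exist).
r_bin <- file.path(R.home("bin"), "R")

# Expression evaluated by the child process; fill = TRUE ends the line.
expr <- shQuote("cat(1:3, fill = TRUE)")

# --quiet only drops the startup banner: the input is still echoed.
out_quiet <- system2(r_bin, c("--quiet", "--vanilla", "-e", expr), stdout = TRUE)

# -s (short for --slave, per the message above) also suppresses the echo.
out_s <- system2(r_bin, c("-s", "--vanilla", "-e", expr), stdout = TRUE)

print(out_quiet)
print(out_s)
```

Only standard output is captured here, so a prompt written to stderr, as suspected in the message, would not show up in `out_s`.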
Re: [R] Average distance in kilometers between subsets of points with ggmap /geosphere
Hi Malte,

I only skimmed your question and looked at the desired output. I wondered whether the apply function could meet your needs. Here's a small example that might help you:

m <- matrix(1:9, nrow = 3)
m <- cbind(m, apply(m, MARGIN = 1, mean))  # MARGIN = 1 says to apply the function row-wise
m
#      [,1] [,2] [,3] [,4]
# [1,]    1    4    7    4
# [2,]    2    5    8    5
# [3,]    3    6    9    6

HTH,
Eric

On Mon, Sep 23, 2019 at 10:18 AM Malte Hückstädt <deaddatascienti...@gmail.com> wrote:
> I would like to determine the geographical distances from a number of
> addresses and determine the mean value (the mean distance) from these.
>
> In case the dataframe has only one row, I have found a solution: [...]
>
> Unfortunately, I fail to implement a function that executes the code for
> an entire data set. My current problem is, that the function does not
> calculate the distance-averages rowwise, but calculates the average value
> from all lines of the data set.
[R] Average distance in kilometers between subsets of points with ggmap /geosphere
I would like to determine the geographical distances between a number of addresses and compute their mean value (the mean distance).

For the case where the data frame has only one row, I have found a solution:

```r
# Load packages
library(readxl)
library(openxlsx)
library(googleway)
#library(sf)
library(tidyverse)
library(geosphere)
library("ggmap")

# Set the API key
set_key("")
api_key <- ""
register_google(key=api_key)

# Data
df <- data.frame(
  V1 = c("80538 München, Germany", "01328 Dresden, Germany", "80538 München, Germany",
         "07745 Jena, Germany","10117 Berlin, Germany"),
  V2 = c("82152 Planegg, Germany", "01069 Dresden, Germany", "82152 Planegg, Germany",
         "07743 Jena, Germany","14195 Berlin, Germany"),
  V3 = c("85748 Garching, Germany", "01069 Dresden, Germany", "85748 Garching, Germany",
         NA, "10318 Berlin, Germany"),
  V4 = c("80805 München, Germany", "01187 Dresden, Germany", "80805 München, Germany",
         "07745 Jena, Germany", NA), stringsAsFactors=FALSE
)

# replace NA for the geocode function
df[is.na(df)] <- ""

# slice it
df1 <- slice(df, 5:5)

# lon/lat information
df_2 <- geocode(c(df1$V1, df1$V2,df1$V3, df1$V4)) %>% na.omit()

# to matrix
mat_df <- as.matrix(df_2)

# distance matrix
dist_mat <- distm(mat_df)

# mean distance of row 5
mean(dist_mat[lower.tri(dist_mat)])/1000
```

Unfortunately, I have not managed to write a function that runs this code for an entire data set. My current problem is that the function does not calculate the mean distances row-wise, but instead calculates a single mean from all rows of the data set.

```r
# Function

Mean_Dist <- function(df,w,x,y,z) {

  # for (row in 1:nrow(df)) {
  #   dist_mat <- geocode(c(w, x, y, z))
  #
  # }

  df <- geocode(c(w, x, y, z)) %>% na.omit() # get lon/lat information from the addresses

  mat_df <- as.matrix(df) # write it into a matrix

  dist_mat <- distm(mat_df)

  dist_mean <- mean(dist_mat[lower.tri(dist_mat)])

  return(dist_mean)
}

df %>% mutate(lon = Mean_Dist(df,df$V1, df$V2,df$V3, df$V4)/1000)
```

Do you have any idea what mistake I made?

To clarify my question: what I am trying to create is a data frame like this one (see column V5):

```r
  V1                     V2                     V3                      V4                     V5
1 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany 80805 München, Germany Mean_Dist_row1
2 01328 Dresden, Germany 01069 Dresden, Germany 01069 Dresden, Germany  01187 Dresden, Germany Mean_Dist_row2
3 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany 80805 München, Germany Mean_Dist_row3
4 07745 Jena, Germany    07743 Jena, Germany    07745 Jena, Germany     07745 Jena, Germany    Mean_Dist_row4
5 10117 Berlin, Germany  14195 Berlin, Germany  10318 Berlin, Germany   14476 Potsdam, Germany Mean_Dist_row5
```

i.e. an average of the distances within each row.
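A likely reason for the single pooled mean is that `Mean_Dist()` receives whole columns (`df$V1`, `df$V2`, ...), so all addresses are geocoded together and one distance matrix is built for the entire data set. The sketch below shows the row-wise idea without any API calls: the coordinates are invented, approximate values standing in for geocoded addresses, and `haversine_km()` and `mean_pairwise_km()` are hypothetical helpers written here in base R rather than functions from geosphere or ggmap.

```r
# Great-circle distance in km between lon/lat points (haversine formula).
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Mean of all pairwise distances among the points belonging to ONE row.
mean_pairwise_km <- function(coords) {
  coords <- coords[stats::complete.cases(coords), , drop = FALSE]
  if (nrow(coords) < 2) return(NA_real_)
  pairs <- utils::combn(nrow(coords), 2)
  d <- haversine_km(coords[pairs[1, ], "lon"], coords[pairs[1, ], "lat"],
                    coords[pairs[2, ], "lon"], coords[pairs[2, ], "lat"])
  mean(d)
}

# Invented coordinates: one lon/lat matrix per data-frame row.
rows <- list(
  row1 = cbind(lon = c(11.58, 11.43, 11.65), lat = c(48.14, 48.10, 48.25)),
  row2 = cbind(lon = c(13.74, 13.74, NA),    lat = c(51.05, 51.05, NA))
)

# Apply the per-row function once per row instead of once for all rows.
mean_dist <- vapply(rows, mean_pairwise_km, numeric(1))
print(mean_dist)
```

With real data, the same structure could be obtained by geocoding the addresses of each row separately (for example inside a loop over `seq_len(nrow(df))`) and passing each resulting lon/lat matrix to the per-row function.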
Re: [R] [SPAM] Re: The "--slave" option
Hi Abby,

I don’t really understand why you’re upset with me, but a) they’re cultured cell lines, not animals, b) they might cure people, c) I don’t do experiments, d) modern slavery, dated today: https://www.theguardian.com/world/2019/sep/21/such-brutality-tricked-into-slavery-in-the-thai-fishing-industry .

I simply don’t like the word, and it doesn’t even describe what the option does.

Best,
Ben

> On 22 Sep 2019, at 00:56, Abby Spurdle wrote:
>
> (excerpts only)
>> slavery being easily justified by the Bible while abolition is not is an
>> experience.
>> P.S. Do any R developers actually read this?
>
> I've read one or two verses...
>
> I also found this (by you):
> https://www.ncbi.nlm.nih.gov/pubmed/20362542
>
> Which uses embryonic stem cells.
> I recognize that they're mouse embryos.
> However, your article cites at least five other articles (probably, a
> lot more), that use human embryonic stem cells.
>
> You complain about slavery (that doesn't exist), and then promote
> murder (which does exist).
> What does that say about you...
>
> And that's ignoring the way you treat animals
> We slice and dice data, you slice and dice living creatures.
>
> Here's two songs about freedom, if you have ears to hear:
> https://youtu.be/lKw6uqtGFfo
> https://youtu.be/HAIdo707Sac
Re: [R] [SPAM] Re: The "--slave" option
Hah, fair. I do hope somebody does see it and gives it a thought.

Thanks,
Ben

On Sun, 22 Sep 2019 at 01:29, Roy Mendelssohn - NOAA Federal <roy.mendelss...@noaa.gov> wrote:

> Please All:
>
> While as I said in my first post I am still not convinced that the OP was
> in good faith to improve R and not a troll (yours to decide), I also don't
> think attacking a person's research to counter a point that has nothing to
> do with their research is what is wanted on this mail-list. There is one
> very simple alternative - don't reply.
>
> Ben - members of R-core do read this mail-list, and the fact that not a
> single one has replied probably tells you what you need to know.
>
> -Roy
>
> > On Sep 21, 2019, at 3:56 PM, Abby Spurdle wrote:
> >
> > (excerpts only)
> >> slavery being easily justified by the Bible while abolition is not is
> >> an experience.
> >> P.S. Do any R developers actually read this?
> >
> > I've read one or two verses...
> >
> > I also found this (by you):
> > https://www.ncbi.nlm.nih.gov/pubmed/20362542
> >
> > Which uses embryonic stem cells.
> > I recognize that they're mouse embryos.
> > However, your article cites at least five other articles (probably, a
> > lot more), that use human embryonic stem cells.
> >
> > You complain about slavery (that doesn't exist), and then promote
> > murder (which does exist).
> > What does that say about you...
> >
> > And that's ignoring the way you treat animals
> > We slice and dice data, you slice and dice living creatures.
> >
> > Here's two songs about freedom, if you have ears to hear:
> > https://youtu.be/lKw6uqtGFfo
> > https://youtu.be/HAIdo707Sac
>
> **
> "The contents of this message do not reflect any position of the U.S.
> Government or NOAA."
> **
> Roy Mendelssohn
> Supervisory Operations Research Analyst
> NOAA/NMFS
> Environmental Research Division
> Southwest Fisheries Science Center
> ***Note new street address***
> 110 McAllister Way
> Santa Cruz, CA 95060
> Phone: (831)-420-3666
> Fax: (831) 420-3980
> e-mail: roy.mendelss...@noaa.gov www: https://www.pfeg.noaa.gov/
>
> "Old age and treachery will overcome youth and skill."
> "From those who have been given much, much will be expected"
> "the arc of the moral universe is long, but it bends toward justice" -MLK Jr.
[R] dabestr plot color change manually
Dear R plot experts,

I'm working with the dabestr package for effect size estimation plots (https://cran.r-project.org/web/packages/dabestr/vignettes/using-dabestr.html) and would like to change two things manually. Does anyone know:

1) how I can manually change the tick color (I'd like black rather than the default grey)?
2) how I can set the points to two distinct colors (I'd like black and grey rather than the default automatic gradient)?

Any suggestion will be highly appreciated. Thanks in advance.

Regards,
Moshi

JSPS Postdoctoral Fellow
Laboratory of Population Biology
Department of Marine Biosciences
Graduate School of Marine Science and Technology
Tokyo University of Marine Science and Technology
4-5-7 Konan, Minato-ku, Tokyo 108-8477, Japan
Mobile: 050-6874-9072