Re: [R-es] Consulta

2019-09-23 Thread Emilio L. Cano
Hi Berenice,

What do you mean when you say it does not recognize the package? Do you get an error message?
I don't know whether you have tried reinstalling the package; if not, please do.

To reproduce the error with your code, we would need one of the PDFs you 
use (you can share a link to Dropbox or similar).

Regards,
Emilio
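
A minimal sketch of the checks that would help narrow this down, using only base R. The variable names are illustrative, and nothing here is specific to the original machine; it simply reports whether pdftools can be found in the current library paths and which R version is running.

```r
# Package named in the question
pkg <- "pdftools"

# TRUE if the package can be loaded from the current library paths
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  # Version of the package that this R session actually finds
  print(packageVersion(pkg))
} else {
  message(pkg, " was not found; try install.packages(\"", pkg, "\")")
}

# Details worth including when reporting the problem
print(R.version.string)
print(.libPaths())
```

If `installed` comes back FALSE after the upgrade, reinstalling the package is the natural next step; if it is TRUE, the exact error message from the failing call is what is needed.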

> On 24 Sep 2019, at 1:49, BERENICE DOMINGUEZ SANCHEZ  
> wrote:
> 
> Good afternoon, everyone:
> 
> I had R version 3.6 and was using the pdftools package to extract 
> information from PDF files. I upgraded to version 3.6.1 and it no longer 
> recognizes the package. Can anyone help me? It basically does not recognize 
> the pdftools functions.
> 
> library(pdftools)
> library(stringr)​
> library(NLP)​
> library(tm)​
> library(tesseract)​
> library(magick)​
> install.packages("magick")​
> install.packages("pdftools")​
> ​
> txt <- system.file("texts", "txt", package = "tm")​
> ​
> rfc_rg <- "([A-Z]{3,})([0-9]{6})([A-Z]|[0-9]){0,3}"​
> #poliza_rg <- 
> "(34|36|37|39)(ME|MEC|CH|MB|TF|GI|VE|TS|IM|ER|VE)*([0-9]{6,})[-]([0-9]){2}[-][A-Z]"​
> poliza_rg <- "(ME|CH|MB|TF|GI|gi|VE|TS|IM|ER|VE)*([0-9]{8,})[-]([0-9]){2}"​
> registro_rg <- "(CNSF-H0711-)([0-9]{4})[- ]([0-9]){4}"​
> subgrupo_rg <- "_([0-9]){1,3}."​
> mon_rg <- "SMGM|UMAM|MN"​
> ​
> ​
> ruta <- 'C:/Users/bdominguez/Documents/H0711/Bond/1909/'​
> archivos<-list.files(path=ruta,pattern = '*.pdf')​
> ​
> ​
> imagen <- image_read_pdf(path=paste(ruta,"/",nombre,".pdf",sep=""))​
> prueba <-image_ocr(imagen, language = 'eng')​
> lineas<-unlist(str_split(prueba,pattern = "\n"))​
> lineasp<-unlist(str_split(prueba[2],pattern = "\r\n"))​
> ​
> newnom <- NULL​
> renglones <- NULL​
> for (nombre in archivos){​
>  subgrupo <- str_extract(str_extract(nombre,pattern = subgrupo_rg),pattern = 
> "[0-9]{1,3}")​
>  imagen <- image_read_pdf(path=paste(ruta,"/",nombre,".pdf",sep=""))​
>  prueba <-image_ocr(imagen, language = 'eng')​
>  lineas<-unlist(str_split(prueba,pattern = "\n"))​
>  poliza <- NULL​
>  poliza<-str_extract(lineas[1],poliza_rg)​
>  newnom <- c(newnom,paste(poliza[1],substr(nombre,5,6),".pdf",sep=''))​
>  ​
>  registro <- NULL​
>  registro<-str_extract(lineas[49],registro_rg)​
>  ​
>  rfc <- NULL​
>  rfc <- str_extract(lineas[5],rfc_rg)​
>  ​
>  ​
>  #lineasnew<-unlist(str_split(lineas[2],pattern = "\r\n"))​
>  #lineasdosnew<-unlist(str_split(lineas[1],pattern = "\r\n"))​
>  ​
>  cobertura <- NA​
>  extranjera <- NA​
>  suma_str   <- NA​
>  deducible_str <- NA​
>  ​
>  suma <- NA​
>  coaseguro <- NA​
>  deducible <- NA​
>  tope <- NA​
>  mon <- NA​
>  mondedu <- NA​
>  ​
>  cobertura  <- grep("Cobertura en el Extranjero",lineas,value=TRUE)​
>  extranjera <- grep("Emergencia en el Extranjero",lineas,value=TRUE)​
>  suma_str   <- grep("SUMA ASEGURADA:",lineas,value=TRUE)​
>  deducible_str   <- grep("DEDUCIBLE:",lineas,value=TRUE)​
>  sumacob <- NA​
>  sumaext <- NA​
>  ​
>  pprimaria <- grep("Numero de Póliza:", lineas, value = TRUE)​
>  dnprimariaa <- grep("Nombre de la Aseguradora Primaria:", lineas, value = 
> TRUE)​
>  ​
>  #cer<- grep("Certificado No. ",lineas, value=TRUE)​
>  #ntit<- grep("Ramo", lineas, value=TRUE)​
>  ​
>  sumacob<-as.numeric(str_extract(cobertura[1],pattern = "[0-9]{1,}"))​
>  if (length(sumacob)==0){​
>sumacob = NA​
>  }​
>  ​
>  sumaext<-as.numeric(str_extract(extranjera[17],pattern = "[0-9]{1,}"))​
>  if (length(sumaext)==0){​
>sumaext = NA​
>  }​
>  valores <- NULL​
>  monedas <- NULL​
>  valores <- str_extract_all(suma_str[17],pattern = 
> "[0-9]{0,3},*[0-9]{0,3},*[0-9]{1,3}(.[0-9]{1,}){0,1}",simplify=TRUE)​
>  monedas <- str_extract(suma_str,pattern = mon_rg)​
>  if (length(valores[1])==0){​
>suma = NA​
>mon = NA​
>  }else{​
>suma = as.numeric(gsub(pattern = ",*",replacement = "",valores[1]))​
>mon <- as.character(monedas[1])​
>  }​
>  ​
>  if (length(valores[2])==0){​
>coaseguro = NA​
>  }else{​
>coaseguro = as.numeric(valores[2])​
>  }​
>  valores <- NULL​
>  valores <- str_extract_all(deducible_str[1],pattern = 
> "[0-9]{0,3},*[0-9]{0,3},*[0-9]{1,3}(.[0-9]{1,}){0,1}",simplify=TRUE)​
>  ​
>  if (length(valores[1])==0){​
>deducible <- NA​
>  }else{​
>deducible <- as.numeric(gsub(pattern = ",",replacement = "",valores[1]))​
>  }​
>  ​
>  monedas <- NULL  ​
>  monedas <- str_extract(deducible_str[1],pattern = mon_rg)​
>  ​
>  if (length(monedas)==0){​
>mondedu <- NA​
>  }else{​
>mondedu <- monedas​
>  }​
>  ​
>  ​
>  if (length(valores[2])==0){​
>tope = NA​
>  }else{​
>tope = as.numeric(gsub(pattern = ",",replacement = "",valores[2]))​
>  }​
>  ​
>  renglon <- 
> data.frame(archivo=nombre,poliza=as.character(poliza[1]),cobertura=sumacob,emergencia=sumaext,registro=registro[1],suma=suma,coaseguro=coaseguro,deducible=deducible,tope=tope,rfc=rfc,mon=mon,mondedu=mondedu,subgrupo=subgrupo,
>  cert=as.character(cer[1]), cer_tit=as.character(lineasdos[14]), 
> titu=as.character(lineasdos[10]))​
>  renglones <- rbind(renglones,renglon)​
> }​
> ​
> 

Re: [R] [FORGED] Re: Loop With Dates

2019-09-23 Thread Rolf Turner

On 22/09/19 11:19 PM, Richard O'Keefe wrote:

> Whenever you want a vector that counts something,
> cumsum of a logical vector is a good thing to try.

Fortune nomination.
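
A tiny sketch of the tip, for readers who have not met it before. The vector `x` and the rule "a new group starts whenever the value drops" are made up for illustration and are not taken from the original thread.

```r
# Hypothetical data: we want to number the runs of non-decreasing values
x <- c(5, 7, 1, 3, 9, 2, 2, 8)

# Logical vector: TRUE at the first element and wherever the value drops
starts <- c(TRUE, diff(x) < 0)

# The running count of TRUEs is the group index of each element
group <- cumsum(starts)
group
```

The same pattern applies to dates: build a logical vector that is TRUE where something new begins, then take `cumsum()` of it.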

cheers,

Rolf

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R-es] Consulta

2019-09-23 Thread BERENICE DOMINGUEZ SANCHEZ
Good afternoon, everyone:

I had R version 3.6 and was using the pdftools package to extract 
information from PDF files. I upgraded to version 3.6.1 and it no longer 
recognizes the package. Can anyone help me? It basically does not recognize 
the pdftools functions.

library(pdftools)
library(stringr)
library(NLP)
library(tm)
library(tesseract)
library(magick)
install.packages("magick")
install.packages("pdftools")

txt <- system.file("texts", "txt", package = "tm")

rfc_rg <- "([A-Z]{3,})([0-9]{6})([A-Z]|[0-9]){0,3}"
#poliza_rg <- "(34|36|37|39)(ME|MEC|CH|MB|TF|GI|VE|TS|IM|ER|VE)*([0-9]{6,})[-]([0-9]){2}[-][A-Z]"
poliza_rg <- "(ME|CH|MB|TF|GI|gi|VE|TS|IM|ER|VE)*([0-9]{8,})[-]([0-9]){2}"
registro_rg <- "(CNSF-H0711-)([0-9]{4})[- ]([0-9]){4}"
subgrupo_rg <- "_([0-9]){1,3}."
mon_rg <- "SMGM|UMAM|MN"

ruta <- 'C:/Users/bdominguez/Documents/H0711/Bond/1909/'
archivos<-list.files(path=ruta,pattern = '*.pdf')

imagen <- image_read_pdf(path=paste(ruta,"/",nombre,".pdf",sep=""))
prueba <-image_ocr(imagen, language = 'eng')
lineas<-unlist(str_split(prueba,pattern = "\n"))
lineasp<-unlist(str_split(prueba[2],pattern = "\r\n"))

newnom <- NULL
renglones <- NULL
for (nombre in archivos){
  subgrupo <- str_extract(str_extract(nombre,pattern = subgrupo_rg),pattern = "[0-9]{1,3}")
  imagen <- image_read_pdf(path=paste(ruta,"/",nombre,".pdf",sep=""))
  prueba <-image_ocr(imagen, language = 'eng')
  lineas<-unlist(str_split(prueba,pattern = "\n"))
  poliza <- NULL
  poliza<-str_extract(lineas[1],poliza_rg)
  newnom <- c(newnom,paste(poliza[1],substr(nombre,5,6),".pdf",sep=''))

  registro <- NULL
  registro<-str_extract(lineas[49],registro_rg)

  rfc <- NULL
  rfc <- str_extract(lineas[5],rfc_rg)

  #lineasnew<-unlist(str_split(lineas[2],pattern = "\r\n"))
  #lineasdosnew<-unlist(str_split(lineas[1],pattern = "\r\n"))

  cobertura <- NA
  extranjera <- NA
  suma_str   <- NA
  deducible_str <- NA

  suma <- NA
  coaseguro <- NA
  deducible <- NA
  tope <- NA
  mon <- NA
  mondedu <- NA

  cobertura  <- grep("Cobertura en el Extranjero",lineas,value=TRUE)
  extranjera <- grep("Emergencia en el Extranjero",lineas,value=TRUE)
  suma_str   <- grep("SUMA ASEGURADA:",lineas,value=TRUE)
  deducible_str   <- grep("DEDUCIBLE:",lineas,value=TRUE)
  sumacob <- NA
  sumaext <- NA

  pprimaria <- grep("Numero de Póliza:", lineas, value = TRUE)
  dnprimariaa <- grep("Nombre de la Aseguradora Primaria:", lineas, value = TRUE)

  #cer<- grep("Certificado No. ",lineas, value=TRUE)
  #ntit<- grep("Ramo", lineas, value=TRUE)

  sumacob<-as.numeric(str_extract(cobertura[1],pattern = "[0-9]{1,}"))
  if (length(sumacob)==0){
    sumacob = NA
  }

  sumaext<-as.numeric(str_extract(extranjera[17],pattern = "[0-9]{1,}"))
  if (length(sumaext)==0){
    sumaext = NA
  }
  valores <- NULL
  monedas <- NULL
  valores <- str_extract_all(suma_str[17],pattern = "[0-9]{0,3},*[0-9]{0,3},*[0-9]{1,3}(.[0-9]{1,}){0,1}",simplify=TRUE)
  monedas <- str_extract(suma_str,pattern = mon_rg)
  if (length(valores[1])==0){
    suma = NA
    mon = NA
  }else{
    suma = as.numeric(gsub(pattern = ",*",replacement = "",valores[1]))
    mon <- as.character(monedas[1])
  }

  if (length(valores[2])==0){
    coaseguro = NA
  }else{
    coaseguro = as.numeric(valores[2])
  }
  valores <- NULL
  valores <- str_extract_all(deducible_str[1],pattern = "[0-9]{0,3},*[0-9]{0,3},*[0-9]{1,3}(.[0-9]{1,}){0,1}",simplify=TRUE)

  if (length(valores[1])==0){
    deducible <- NA
  }else{
    deducible <- as.numeric(gsub(pattern = ",",replacement = "",valores[1]))
  }

  monedas <- NULL
  monedas <- str_extract(deducible_str[1],pattern = mon_rg)

  if (length(monedas)==0){
    mondedu <- NA
  }else{
    mondedu <- monedas
  }

  if (length(valores[2])==0){
    tope = NA
  }else{
    tope = as.numeric(gsub(pattern = ",",replacement = "",valores[2]))
  }

  renglon <- data.frame(archivo=nombre,poliza=as.character(poliza[1]),
                        cobertura=sumacob,emergencia=sumaext,registro=registro[1],
                        suma=suma,coaseguro=coaseguro,deducible=deducible,tope=tope,
                        rfc=rfc,mon=mon,mondedu=mondedu,subgrupo=subgrupo,
                        cert=as.character(cer[1]), cer_tit=as.character(lineasdos[14]),
                        titu=as.character(lineasdos[10]))
  renglones <- rbind(renglones,renglon)
}

# Using the data in the data frame, rename the files; the subdirectories have to be created

noms <- data.frame(archivo=archivos,poliza=newnom)

noms <- renglones[!is.na(renglones$poliza),c('archivo','cer_tit')]
ungrupo<-sqldf("select poliza,count(cert) from noms group by 1  having count(cert) <= 1 ")
noms<-sqldf("select * from noms where poliza in (select poliza from ungrupo)")
length(noms$archivo)
salida <- "/renombra/"

for (i in 1:length(noms[,1])){
  if (!is.na(noms[i,'cer_tit'])){
    pfrom <- paste(ruta,"/",noms[i,'archivo'],sep='')
    pto <- 

Re: [R] 95% bootstrap CIs

2019-09-23 Thread Rui Barradas

Hello,

Is this what you are looking for?


ci95 <- apply(my.data, 2, quantile, probs = c(0.025, 0.975))
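
A self-contained sketch of what that line returns, using a simulated stand-in for `my.data` so that it runs without mgcv or earth. The numbers are random placeholders, not results of the GAM/MARS experiment, and the interval shown is the empirical 2.5%/97.5% quantile range across the 50 replications.

```r
set.seed(1)

# Stand-in for my.data: 50 replications (rows) of two MSE values (columns)
my.data <- cbind(MSE_GAM  = rexp(50, rate = 10),
                 MSE_MARS = rexp(50, rate = 5))

# Column-wise 2.5% and 97.5% quantiles: one interval per method
ci95 <- apply(my.data, 2, quantile, probs = c(0.025, 0.975))
ci95
```

The result is a 2 x 2 matrix with rows `2.5%` and `97.5%` and one column per method, so `ci95[, "MSE_GAM"]` gives the lower and upper limits for GAM.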


Hope this helps,

Rui Barradas

At 20:42 on 23/09/19, varin sacha via R-help wrote:

Dear R-Experts,

Here is my reproducible R code to get the Mean squared error of GAM and MARS 
after I = 50 iterations/replications.
If I want to get the 95% bootstrap CIs around the MSE of GAM and around the MSE 
of MARS, how can I complete/modify my R code ?

Many thanks for your precious help.

##

library(mgcv)
  library(earth)
my.experiment <- function() {
n<-500
x <-runif(n, 0, 5)
  z <- rnorm(n, 2, 3)
a <- runif(n, 0, 5)
y_model <- 0.1*x^3 - 0.5*z^2 - a + x*z + x*a + 3*x*a*z + 10
  y_obs <- y_model +c( rnorm(n*0.97, 0, 0.1), rnorm(n*0.03, 0, 0.5) )
  gam_model<- gam(y_obs~s(x)+s(z)+s(a))
mars_model<-earth(y_obs~x+z+a)
MSE_GAM<-mean((gam_model$fitted.values - y_model)^2)
  MSE_MARS<-mean((mars_model$fitted.values - y_model)^2)
  return( c(MSE_GAM, MSE_MARS) )
}
my.data = t(replicate( 50, my.experiment() ))
colnames(my.data) <- c("MSE_GAM", "MSE_MARS")
summary(my.data)

##


[R] 95% bootstrap CIs

2019-09-23 Thread varin sacha via R-help
Dear R-Experts,

Here is my reproducible R code to get the Mean squared error of GAM and MARS 
after I = 50 iterations/replications.
If I want to get the 95% bootstrap CIs around the MSE of GAM and around the MSE 
of MARS, how can I complete/modify my R code ?

Many thanks for your precious help.

##

library(mgcv)
library(earth)

# One replication: simulate data, fit GAM and MARS, return both MSEs
my.experiment <- function() {
  n <- 500
  x <- runif(n, 0, 5)
  z <- rnorm(n, 2, 3)
  a <- runif(n, 0, 5)
  # Noise-free signal, then observations with mostly small and a few larger errors
  y_model <- 0.1*x^3 - 0.5*z^2 - a + x*z + x*a + 3*x*a*z + 10
  y_obs <- y_model + c(rnorm(n*0.97, 0, 0.1), rnorm(n*0.03, 0, 0.5))
  gam_model <- gam(y_obs ~ s(x) + s(z) + s(a))
  mars_model <- earth(y_obs ~ x + z + a)
  # MSE of the fitted values against the noise-free signal
  MSE_GAM <- mean((gam_model$fitted.values - y_model)^2)
  MSE_MARS <- mean((mars_model$fitted.values - y_model)^2)
  return(c(MSE_GAM, MSE_MARS))
}

# 50 replications, one row per replication
my.data = t(replicate(50, my.experiment()))
colnames(my.data) <- c("MSE_GAM", "MSE_MARS")
summary(my.data)

##


Re: [R-es] help: boxplot multivariables

2019-09-23 Thread Lorena Saavedra Aracena
Hi Diego,
thank you. I am fairly new to R and sometimes find it hard to follow. And no
problem, I think I will end up using ggplot anyway.
The NAs are there because this is an experiment in which GS can only occur when
a person is present, and in some episodes the animal is alone. Perhaps
a 0 could be used instead for the boxplot. So I wanted to compare the
GS and GI behaviours.
I followed your script, and with those few data points this is what I got for GS (is
it the same as what you got?)
[image: image.png]

Perhaps I asked about the oddest combination, because GS almost never happened.
What I would really like is to do the same with AA and AD, Z1-Z2, and so on with other
behaviours, so as to get a boxplot similar to this one but with my
data:
[image: image.png]

I only pasted a small extract because there is a lot of data. But yes, the data
you copied are perfect.

Thanks.

On Mon, 23 Sep 2019 at 14:28, Diego Martín ()
wrote:

> Hi Lorena:
>
>    Well, the way you pose the question, this is what comes out
> for me straight away, as shown below:
>
> library(dplyr)
> library(purrr)
> library(ggplot2)
>
> dLSaa %>%
> ggplot(aes(factor(Ep), GI)) +
>   geom_boxplot()
>
> map_dbl(.x = dLSaa$GS, .f = function(x) {sum(is.na(x))})
>
> dLSaa %>%
>   ggplot(aes(factor(Ep), GS)) +
>   geom_boxplot()
>
>  I know you would rather have used the boxplot() function, but
> ggplot is what I work with most, and I have almost grown used to
> thinking in those terms.
>
>  A few things. First, the data surprise me given
> the result that comes out, especially for the GS column; that is why
> I included the map_dbl call with is.na, to highlight the NAs, but
> also because of the values I find in it. The other issue is that I am not
> sure the columns are comparable. But anyway, all this is beside the point,
> because what matters is what you are doing and whether this
> approach is of some use to you. I hope it helps.
>
>  Kind regards.
>
> PS. I attach the table I used, the one you posted, which I had to
> read in myself, so that you can check it in case I made a mistake.
>
> On Mon, 23 Sep 2019 at 16:34, Lorena Saavedra Aracena (<
> l.saavedra.arac...@gmail.com>) wrote:
>
>> Hello,
>> I am trying to make a boxplot to compare different
>> behaviours.
>> I have a data set with 17 columns, and I want to create a boxplot
>> using 3 of them. I have searched forums, and this question is always
>> answered using subsets of data within a single column, and I have not
>> managed to get it working.
>>
>> This is a sample of my data:
>> ID: animal id
>> EP: episode
>> AA: lying down and alert
>> AD: lying down resting
>> B: sitting; C: standing; D: locomotion; F: exploration; GS: social play;
>> GI:
>> individual play
>> Z: zone 1,2,3,4
>>
>> ID  Ep AA AD B C D F GS GI Z1 Z2 Z3 Z4
>> 1 1 8 0 0 0 4 5 0 0 8 1 0 0
>> 1 2 5 0 0 2 5 3 0 0 9 0 0 0
>> 1 3 0 0 0 0 12 0 11 0 1 2 0 1
>> 1 4 2 0 0 2 4 0 NA 8 8 0 0 0
>> 1 5 3 0 1 0 8 0 0 7 1 3 0 4
>> 1 6 5 0 0 5 2 0 0 10 0 11 0 0
>> 2 1 0 0 12 0 0 1 0 0 0 0 12 0
>> 2 2 8 0 0 0 3 1 0 0 0 0 0 9
>> 2 3 0 0 1 1 10 1 0 0 1 0 1 0
>> 2 4 0 0 0 6 5 0 NA 0 4 1 2 0
>> 2 5 1 0 7 0 4 0 0 0 0 2 0 6
>> 2 6 4 0 2 1 5 0 0 0 2 0 2 4
>> 3 1 2 6 0 4 0 2 0 0 0 0 12 0
>> 3 2 10 1 0 0 0 0 0 0 0 0 11 0
>> 3 3 0 0 0 7 5 3 0 0 4 0 0 0
>> 3 4 0 0 0 1 5 1 NA 0 5 0 0 0
>> 3 5 2 0 0 5 5 0 0 0 0 3 0 2
>> 3 6 4 2 0 4 2 0 0 0 4 6 0 0
>>
>> I need to create a boxplot where x is always Ep (episodes) and y
>> covers, for example, GS and GI, so that I can compare those behaviours.
>> I would prefer to do it simply with boxplot(), but I am
>> equally open to trying ggplot.
>>
>> Many thanks, regards
>>
>>
>> --
>>
>> *Lorena Saavedra A.**Ing. Recursos Naturales Renovables*
>> *+56 9 9880 2972*
>>
>> [[alternative HTML version deleted]]
>>
>> ___
>> R-help-es mailing list
>> R-help-es@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-help-es
>>
>

-- 

*Lorena Saavedra A.**Ing. Recursos Naturales Renovables*
*+56 9 9880 2972*
___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R-es] help: boxplot multivariables

2019-09-23 Thread Diego Martín
Hi Lorena:

   Well, the way you pose the question, this is what comes out
for me straight away, as shown below:

library(dplyr)
library(purrr)
library(ggplot2)

dLSaa %>%
ggplot(aes(factor(Ep), GI)) +
  geom_boxplot()

map_dbl(.x = dLSaa$GS, .f = function(x) {sum(is.na(x))})

dLSaa %>%
  ggplot(aes(factor(Ep), GS)) +
  geom_boxplot()

 I know you would rather have used the boxplot() function, but
ggplot is what I work with most, and I have almost grown used to thinking in
those terms.

 A few things. First, the data surprise me given
the result that comes out, especially for the GS column; that is why
I included the map_dbl call with is.na, to highlight the NAs, but also
because of the values I find in it. The other issue is that I am not sure
the columns are comparable. But anyway, all this is beside the point, because
what matters is what you are doing and whether this
approach is of some use to you. I hope it helps.

 Kind regards.

PS. I attach the table I used, the one you posted, which I had to
read in myself, so that you can check it in case I made a mistake.
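
For the base-graphics route asked about in the original question, a possible sketch with boxplot() follows. It assumes the posted extract is read into a data frame called `dLSaa` (the name used in the script above); the helper name `long` and the colours are illustrative choices, not part of the original exchange.

```r
# The extract posted in the question, read from text
dLSaa <- read.table(header = TRUE, text = "
ID Ep AA AD B C D F GS GI Z1 Z2 Z3 Z4
1 1 8 0 0 0 4 5 0 0 8 1 0 0
1 2 5 0 0 2 5 3 0 0 9 0 0 0
1 3 0 0 0 0 12 0 11 0 1 2 0 1
1 4 2 0 0 2 4 0 NA 8 8 0 0 0
1 5 3 0 1 0 8 0 0 7 1 3 0 4
1 6 5 0 0 5 2 0 0 10 0 11 0 0
2 1 0 0 12 0 0 1 0 0 0 0 12 0
2 2 8 0 0 0 3 1 0 0 0 0 0 9
2 3 0 0 1 1 10 1 0 0 1 0 1 0
2 4 0 0 0 6 5 0 NA 0 4 1 2 0
2 5 1 0 7 0 4 0 0 0 0 2 0 6
2 6 4 0 2 1 5 0 0 0 2 0 2 4
3 1 2 6 0 4 0 2 0 0 0 0 12 0
3 2 10 1 0 0 0 0 0 0 0 0 11 0
3 3 0 0 0 7 5 3 0 0 4 0 0 0
3 4 0 0 0 1 5 1 NA 0 5 0 0 0
3 5 2 0 0 5 5 0 0 0 0 3 0 2
3 6 4 2 0 4 2 0 0 0 4 6 0 0
")

# Stack the two behaviours into long format: 'values' holds the counts,
# 'ind' says which behaviour (GS or GI) each count belongs to
long <- stack(dLSaa[, c("GS", "GI")])
long$Ep <- rep(dLSaa$Ep, times = 2)   # episode of each stacked value

# One box per behaviour within each episode; NAs are dropped by boxplot()
boxplot(values ~ ind + Ep, data = long,
        col = c("grey80", "white"), las = 2,
        xlab = "", ylab = "Count")
legend("topleft", legend = levels(long$ind), fill = c("grey80", "white"))
```

Swapping `c("GS", "GI")` for another pair such as `c("AA", "AD")` or `c("Z1", "Z2")` should give the corresponding comparison.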

On Mon, 23 Sep 2019 at 16:34, Lorena Saavedra Aracena (<
l.saavedra.arac...@gmail.com>) wrote:

> Hello,
> I am trying to make a boxplot to compare different behaviours.
> I have a data set with 17 columns, and I want to create a boxplot
> using 3 of them. I have searched forums, and this question is always
> answered using subsets of data within a single column, and I have not
> managed to get it working.
>
> This is a sample of my data:
> ID: animal id
> EP: episode
> AA: lying down and alert
> AD: lying down resting
> B: sitting; C: standing; D: locomotion; F: exploration; GS: social play; GI:
> individual play
> Z: zone 1,2,3,4
>
> ID  Ep AA AD B C D F GS GI Z1 Z2 Z3 Z4
> 1 1 8 0 0 0 4 5 0 0 8 1 0 0
> 1 2 5 0 0 2 5 3 0 0 9 0 0 0
> 1 3 0 0 0 0 12 0 11 0 1 2 0 1
> 1 4 2 0 0 2 4 0 NA 8 8 0 0 0
> 1 5 3 0 1 0 8 0 0 7 1 3 0 4
> 1 6 5 0 0 5 2 0 0 10 0 11 0 0
> 2 1 0 0 12 0 0 1 0 0 0 0 12 0
> 2 2 8 0 0 0 3 1 0 0 0 0 0 9
> 2 3 0 0 1 1 10 1 0 0 1 0 1 0
> 2 4 0 0 0 6 5 0 NA 0 4 1 2 0
> 2 5 1 0 7 0 4 0 0 0 0 2 0 6
> 2 6 4 0 2 1 5 0 0 0 2 0 2 4
> 3 1 2 6 0 4 0 2 0 0 0 0 12 0
> 3 2 10 1 0 0 0 0 0 0 0 0 11 0
> 3 3 0 0 0 7 5 3 0 0 4 0 0 0
> 3 4 0 0 0 1 5 1 NA 0 5 0 0 0
> 3 5 2 0 0 5 5 0 0 0 0 3 0 2
> 3 6 4 2 0 4 2 0 0 0 4 6 0 0
>
> I need to create a boxplot where x is always Ep (episodes) and y
> covers, for example, GS and GI, so that I can compare those behaviours.
> I would prefer to do it simply with boxplot(), but I am
> equally open to trying ggplot.
>
> Many thanks, regards
>
>
> --
>
> *Lorena Saavedra A.**Ing. Recursos Naturales Renovables*
> *+56 9 9880 2972*
>
> [[alternative HTML version deleted]]
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es
>


dLSaa.RData
Description: Binary data

[R-es] help: boxplot multivariables

2019-09-23 Thread Lorena Saavedra Aracena
Hello,
I am trying to make a boxplot to compare different behaviours.
I have a data set with 17 columns, and I want to create a boxplot
using 3 of them. I have searched forums, and this question is always
answered using subsets of data within a single column, and I have not
managed to get it working.

This is a sample of my data:
ID: animal id
EP: episode
AA: lying down and alert
AD: lying down resting
B: sitting; C: standing; D: locomotion; F: exploration; GS: social play; GI:
individual play
Z: zone 1,2,3,4

ID  Ep AA AD B C D F GS GI Z1 Z2 Z3 Z4
1 1 8 0 0 0 4 5 0 0 8 1 0 0
1 2 5 0 0 2 5 3 0 0 9 0 0 0
1 3 0 0 0 0 12 0 11 0 1 2 0 1
1 4 2 0 0 2 4 0 NA 8 8 0 0 0
1 5 3 0 1 0 8 0 0 7 1 3 0 4
1 6 5 0 0 5 2 0 0 10 0 11 0 0
2 1 0 0 12 0 0 1 0 0 0 0 12 0
2 2 8 0 0 0 3 1 0 0 0 0 0 9
2 3 0 0 1 1 10 1 0 0 1 0 1 0
2 4 0 0 0 6 5 0 NA 0 4 1 2 0
2 5 1 0 7 0 4 0 0 0 0 2 0 6
2 6 4 0 2 1 5 0 0 0 2 0 2 4
3 1 2 6 0 4 0 2 0 0 0 0 12 0
3 2 10 1 0 0 0 0 0 0 0 0 11 0
3 3 0 0 0 7 5 3 0 0 4 0 0 0
3 4 0 0 0 1 5 1 NA 0 5 0 0 0
3 5 2 0 0 5 5 0 0 0 0 3 0 2
3 6 4 2 0 4 2 0 0 0 4 6 0 0

I need to create a boxplot where x is always Ep (episodes) and y
covers, for example, GS and GI, so that I can compare those behaviours.
I would prefer to do it simply with boxplot(), but I am
equally open to trying ggplot.

Many thanks, regards


-- 

*Lorena Saavedra A.**Ing. Recursos Naturales Renovables*
*+56 9 9880 2972*


Re: [R] The "--slave" option

2019-09-23 Thread Martin Maechler
> Richard O'Keefe 
> on Sat, 21 Sep 2019 09:39:18 +1200 writes:

> Ah, *now* we're getting somewhere.  There is something
> that *can* be done that's genuinely helpful.
> From the R(1) manual page:
>    -q, --quiet Don't print startup message

>    --silent Same as --quiet

>    --slave Make R run as quietly as possible

> It might have been better to use --nobanner instead of
> --quiet.  So perhaps

> -q, --quiet Don't print the startup message.  This is
> the only output that is suppressed.

> --silent Same as --quiet.  Suppress the startup
> message only.

> --slave Make R run as quietly as possible.  This is
> for use when running R as a subordinate process.  See
> "Introduction to Sub-Processes in R"
> https://cran.r-project.org/web/packages/subprocess/vignettes/intro.html
> for an example.

Thank you, Stephen and Richard.

I think we (the R Core Team) *can* make the description a bit
more verbose. However, as practically all "--" descriptions
are fitting in one short line, (and as the 'subprocess' package is just an
extension pkg, and may disappear (and more reasons)) I'd like to
be less verbose than your proposal.

What about

  -q, --quiet   Don't print startup message

  --silent      Same as --quiet

  --slave       Make R run as quietly as possible.  For use when
                running R as sub(ordinate) process.

If you look more closely, you'll notice that --slave is not much
quieter than --quiet, the only (?) difference being that the
input is not copied and (only "mostly") the R prompt is also not printed.

And from my experiments (in Linux (Fedora 30)), one might even
notice that in some cases --slave prints the R prompt (to stderr?)
which one might consider bogus (I'm not: not wanting to spend
time fixing this platform-independently) :

--slave :


MM@lynne$ echo '(i <- 1:3)
i*10' | R-3.6.1 --slave --vanilla
> [1] 1 2 3
[1] 10 20 30
MM@lynne$ f=/tmp/Rslave.out$$; echo '(i <- 1:3)
i*10' | R-3.6.1 --slave --vanilla | tee $f
> [1] 1 2 3
[1] 10 20 30
MM@lynne$ cat $f
[1] 1 2 3
[1] 10 20 30

--quiet :


MM@lynne$ f=/tmp/Rquiet.out$$; echo '(i <- 1:3)
i*10' | R-3.6.1 --quiet --vanilla | tee $f
> (i <- 1:3)
[1] 1 2 3
> i*10
[1] 10 20 30
> 
MM@lynne$ cat $f
> (i <- 1:3)
[1] 1 2 3
> i*10
[1] 10 20 30
> 
MM@lynne$ 



But there's a bit more to it: In my examples above, both --quiet
and --slave were used together with --vanilla.  In general
--slave *also* never saves, i.e., uses the equivalent of
q('no'), whereas --quiet does [ask or ...].

Last but not least, from very simply reading R's source code on
this, it becomes blatant that you can use  '-s'  instead of '--slave',
but we (R Core) have probably not documented that on purpose (so
we could reserve it for something more important, and redefine
the simple use of '-s' some time in the future ?)

So, all those who want to restrict their language could use '-s'
for now.  In addition, we could add  >> one <<  other alias to
--slave, say --subprocess (or --quieter ? or ???)
and one could make that the preferred use some time in the future.

Well, these were another two hours of time *not* spent improving
R technically, but spent reading e-mails, source code, and considering.
Maybe well spent, maybe not ...

Martin Maechler
ETH Zurich and R Core Team
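
A small sketch of the sub-process use case described above, driven from R itself with system2(). The expression passed via -e is an arbitrary example, and the sketch assumes the R executable is the one found under R.home("bin"); it is an illustration of the option as discussed in this thread, not a statement about how any particular package launches R.

```r
# Path to the R front end of the running installation (assumed location)
r_bin <- file.path(R.home("bin"), "R")

# Run a child R process quietly, without loading or saving a workspace,
# and capture what it writes to stdout
out <- system2(r_bin,
               args = c("--slave", "--vanilla", "-e", shQuote("cat(1:3 * 10)")),
               stdout = TRUE)
out
```

With --slave the captured output contains only what the expression printed, with no startup banner and no echoed input, which is the point of the option when R is run as a subordinate process.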




> On Sat, 21 Sep 2019 at 02:29, Stephen Ellison
>  wrote:
>>
>> > Sure, it's a silly example, but it makes about as much sense as using
>> > "slave" to mean "quiet".
>>
>> It doesn't. It's a set of options chosen for when R is
>> called as a slave process from a controlling process, and
>> in that it is a reasonable description of the
>> circumstance.
>>
>> --quiet is a separate command line option with different
>> effect.
>>
>>


Re: [R] Average distance in kilometers between subsets of points with ggmap /geosphere

2019-09-23 Thread Eric Berger
Hi Malte,
I only skimmed your question and looked at the desired output.
I wondered if the apply function could meet your needs.
Here's a small example that might help you:

m <- matrix(1:9,nrow=3)
m <- cbind(m,apply(m,MARGIN=1,mean))  # MARGIN=1 says to apply the function row-wise
m

#      [,1] [,2] [,3] [,4]
# [1,]    1    4    7    4
# [2,]    2    5    8    5
# [3,]    3    6    9    6

HTH,
Eric
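
Applied to the distance question, the likely cause is that mutate() calls Mean_Dist once with whole columns, so every address in the data set is pooled into a single distance matrix. A row-by-row sketch follows. To keep it runnable without an API key it assumes the addresses have already been geocoded: the coordinates are rough illustrative values (not geocode() output), `haversine_km` is a base-R stand-in for geosphere::distm, and all function names are hypothetical.

```r
# Great-circle distance in km between two sets of points (haversine formula)
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Mean of all pairwise distances among the points of ONE row
mean_pairwise_km <- function(lon, lat) {
  ok <- !is.na(lon) & !is.na(lat)        # drop missing addresses
  lon <- lon[ok]
  lat <- lat[ok]
  if (length(lon) < 2) return(NA_real_)  # no pair, no distance
  idx <- combn(length(lon), 2)           # every unordered pair of points
  mean(haversine_km(lon[idx[1, ]], lat[idx[1, ]],
                    lon[idx[2, ]], lat[idx[2, ]]))
}

# Illustrative coordinates: one row per data-frame row, one column per address
lon <- matrix(c(11.6, 11.4, 11.7, 11.6,
                13.8, 13.7, 13.7, NA), nrow = 2, byrow = TRUE)
lat <- matrix(c(48.1, 48.1, 48.2, 48.2,
                51.0, 51.0, 51.0, NA), nrow = 2, byrow = TRUE)

# One mean distance per row, computed separately for each row
mean_km <- vapply(seq_len(nrow(lon)),
                  function(i) mean_pairwise_km(lon[i, ], lat[i, ]),
                  numeric(1))
mean_km
```

With the real data the same idea would mean geocoding and calling distm() once per row (for example inside a loop over `seq_len(nrow(df))`) and storing each result in the new column.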


On Mon, Sep 23, 2019 at 10:18 AM Malte Hückstädt <
deaddatascienti...@gmail.com> wrote:

> I would like to determine the geographical distances from a number of
> addresses and determine the mean value (the mean distance) from these.
>
> In case the dataframe has only one row, I have found a solution:
>
> ```r
> # Pakete laden
> library(readxl)
> library(openxlsx)
> library(googleway)
> #library(sf)
> library(tidyverse)
> library(geosphere)
> library("ggmap")
>
> #API Key bestimmen
> set_key("")
> api_key <- ""
> register_google(key=api_key)
>
> #  Data
> df <- data.frame(
>   V1 = c("80538 München, Germany", "01328 Dresden, Germany", "80538
> München, Germany",
>  "07745 Jena, Germany","10117 Berlin, Germany"),
>   V2 = c("82152 Planegg, Germany", "01069 Dresden, Germany", "82152
> Planegg, Germany",
>  "07743 Jena, Germany","14195 Berlin, Germany"),
>   V3 = c("85748 Garching, Germany", "01069 Dresden, Germany",  "85748
> Garching, Germany",
>  NA, "10318 Berlin, Germany"),
>   V4 = c("80805 München, Germany", "01187 Dresden, Germany", "80805
> München, Germany",
>  "07745 Jena, Germany", NA), stringsAsFactors=FALSE
> )
>
> #replace NA for geocode-funktion
> df[is.na(df)] <- ""
>
> #slice it
> df1 <- slice(df, 5:5)
>
> #  lon lat Informations
> df_2 <- geocode(c(df1$V1, df1$V2,df1$V3, df1$V4)) %>% na.omit()
>
> # to Matrix
> mat_df  <- as.matrix(df_2)
>
> #dist-mat
> dist_mat <- distm(mat_df)
>
> #mean-dist of row 5
> mean(dist_mat[lower.tri(dist_mat)])/1000
> ```
>
> Unfortunately, I fail to implement a function that executes the code for
> an entire data set. My current problem is, that the function does not
> calculate the distance-averages rowwise, but calculates the average value
> from all lines of the data set.
>
> ```r
> #Funktion
>
> Mean_Dist <- function(df,w,x,y,z) {
>
>   # for (row in 1:nrow(df)) {
>   #   dist_mat <- geocode(c(w, x, y, z))
>   #
>   # }
>
>   df <- geocode(c(w, x, y, z)) %>% na.omit() # ziehe lon lat Informationen
> aus Adressen
>
>   mat_df <- as.matrix(df) # schreibe diese in eine Matrix
>
>   dist_mat <- distm(mat_df)
>
>   dist_mean <- mean(dist_mat[lower.tri(dist_mat)])
>
>   return(dist_mean)
> }
>
> df %>%  mutate(lon =  Mean_Dist(df,df$V1, df$V2,df$V3, df$V4)/1000)
>
> ```
> Do you have any idea what mistake I made?
>
> to clarify my question: What I'm trying to create a dataframe like this
> one (V5):
>
> ```r
>   V1 V2 V3
> V4  V5
>   
> 
> 1 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany
> 80805 München, Germany Mean_Dist_row1
> 2 01328 Dresden, Germany 01069 Dresden, Germany 01069 Dresden, Germany
> 01187 Dresden, Germany Mean_Dist_row2
> 3 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany
> 80805 München, Germany Mean_Dist_row3
> 4 07745 Jena, Germany07743 Jena, Germany07745 Jena, Germany
>  07745 Jena, Germany Mean_Dist_row4
> 5 10117 Berlin, Germany  14195 Berlin, Germany  10318 Berlin, Germany
>  14476 Potsdam, Germany Mean_Dist_row5
> ```
>
> eg an average of the distance of each row.
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


[R] Average distance in kilometers between subsets of points with ggmap /geosphere

2019-09-23 Thread Malte Hückstädt
I would like to compute the geographical distances between a number of addresses 
and then take their mean (the mean distance).

For a data frame with only one row, I have found a solution:

```r
# Load packages
library(readxl)
library(openxlsx)
library(googleway)
#library(sf)
library(tidyverse)
library(geosphere)
library("ggmap")

# Set the API key
set_key("")
api_key <- ""
register_google(key=api_key)

# Data
df <- data.frame(
  V1 = c("80538 München, Germany", "01328 Dresden, Germany", "80538 München, Germany",
         "07745 Jena, Germany", "10117 Berlin, Germany"),
  V2 = c("82152 Planegg, Germany", "01069 Dresden, Germany", "82152 Planegg, Germany",
         "07743 Jena, Germany", "14195 Berlin, Germany"),
  V3 = c("85748 Garching, Germany", "01069 Dresden, Germany", "85748 Garching, Germany",
         NA, "10318 Berlin, Germany"),
  V4 = c("80805 München, Germany", "01187 Dresden, Germany", "80805 München, Germany",
         "07745 Jena, Germany", NA),
  stringsAsFactors=FALSE
)

# Replace NA with "" for the geocode function
df[is.na(df)] <- ""

# Take a single row (row 5)
df1 <- slice(df, 5:5)

# Longitude/latitude information
df_2 <- geocode(c(df1$V1, df1$V2,df1$V3, df1$V4)) %>% na.omit()

# Convert to a matrix
mat_df  <- as.matrix(df_2)

# Distance matrix (distm returns metres)
dist_mat <- distm(mat_df)

# Mean distance of row 5, in km
mean(dist_mat[lower.tri(dist_mat)])/1000
```

Unfortunately, I have failed to write a function that runs this code for an 
entire data set. My current problem is that the function does not calculate 
the mean distances row by row, but instead calculates a single mean over all 
rows of the data set.

```r
# Function

Mean_Dist <- function(df,w,x,y,z) {
  
  # for (row in 1:nrow(df)) {
  #   dist_mat <- geocode(c(w, x, y, z))
  #   
  # }
  
  # extract lon/lat information from the addresses
  df <- geocode(c(w, x, y, z)) %>% na.omit()
  
  mat_df <- as.matrix(df) # write it into a matrix
  
  dist_mat <- distm(mat_df)
  
  dist_mean <- mean(dist_mat[lower.tri(dist_mat)])
  
  return(dist_mean)
}

df %>%  mutate(lon =  Mean_Dist(df,df$V1, df$V2,df$V3, df$V4)/1000)

```
Do you have any idea what mistake I made?

To clarify my question: I'm trying to create a data frame like this one 
(see column V5):

```r
  V1                     V2                     V3                      V4                     V5
1 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany 80805 München, Germany Mean_Dist_row1
2 01328 Dresden, Germany 01069 Dresden, Germany 01069 Dresden, Germany  01187 Dresden, Germany Mean_Dist_row2
3 80538 München, Germany 82152 Planegg, Germany 85748 Garching, Germany 80805 München, Germany Mean_Dist_row3
4 07745 Jena, Germany    07743 Jena, Germany    07745 Jena, Germany     07745 Jena, Germany    Mean_Dist_row4
5 10117 Berlin, Germany  14195 Berlin, Germany  10318 Berlin, Germany   14476 Potsdam, Germany Mean_Dist_row5
```

i.e. the average distance for each row.
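
One possible reading of the problem, offered as a sketch rather than a definitive answer: `c(df$V1, df$V2, df$V3, df$V4)` concatenates whole columns, so the function sees every address at once and returns a single number, which `mutate()` then recycles into every row. The sketch below applies the same computation to one row at a time. The names `mean_dist_km`, `row_mean_dist` and the `geocoder` argument are hypothetical helpers introduced here for illustration; `geocoder` is assumed to be any function that maps a character vector to a data frame with `lon` and `lat` columns, which is the shape `ggmap::geocode()` returns.

```r
library(geosphere)

# Mean pairwise distance in kilometers for ONE set of addresses.
# `geocoder` (an assumption of this sketch) maps a character vector to a
# data frame with columns `lon` and `lat`.
mean_dist_km <- function(addresses, geocoder) {
  # drop missing and empty addresses before geocoding
  addresses <- addresses[!is.na(addresses) & addresses != ""]
  if (length(addresses) < 2) return(NA_real_)

  coords <- as.data.frame(geocoder(addresses))
  coords <- coords[stats::complete.cases(coords[, c("lon", "lat")]),
                   c("lon", "lat")]
  if (nrow(coords) < 2) return(NA_real_)

  # pairwise distances in meters; keep each pair only once
  d <- distm(as.matrix(coords))
  mean(d[lower.tri(d)]) / 1000
}

# Apply the computation row by row and return one value per row.
row_mean_dist <- function(df, cols, geocoder) {
  vapply(
    seq_len(nrow(df)),
    function(i) mean_dist_km(unlist(df[i, cols], use.names = FALSE), geocoder),
    numeric(1)
  )
}

# Usage with the data frame from the post (needs a registered Google API key):
# df$V5 <- row_mean_dist(df, c("V1", "V2", "V3", "V4"), ggmap::geocode)
```

Passing the geocoder in as an argument keeps the distance logic testable without an API key; rows with fewer than two usable addresses yield `NA` instead of an error.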


Re: [R] [SPAM] Re: The "--slave" option

2019-09-23 Thread Benjamin Lang
Hi Abby,

I don’t really understand why you’re upset with me, but a) they’re cultured 
cell lines, not animals, b) they might cure people, c) I don’t do experiments, 
d) modern slavery, dated today: 
https://www.theguardian.com/world/2019/sep/21/such-brutality-tricked-into-slavery-in-the-thai-fishing-industry
 .

I simply don’t like the word, and it doesn’t even describe what the option does.

Best,
Ben

> On 22 Sep 2019, at 00:56, Abby Spurdle  wrote:
> 
> (excerpts only)
>> slavery being easily justified by the Bible while abolition is not is an 
>> experience.
>> P.S. Do any R developers actually read this?
> 
> I've read one or two verses...
> 
> I also found this (by you):
> https://www.ncbi.nlm.nih.gov/pubmed/20362542
> 
> Which uses embryonic stem cells.
> I recognize that they're mouse embryos.
> However, your article cites at least five other articles (probably, a
> lot more), that use human embryonic stem cells.
> 
> You complain about slavery (that doesn't exist), and then promote
> murder (which does exist).
> What does that say about you...
> 
> And that's ignoring the way you treat animals
> We slice and dice data, you slice and dice living creatures.
> 
> Here's two songs about freedom, if you have ears to hear:
> https://youtu.be/lKw6uqtGFfo
> https://youtu.be/HAIdo707Sac



Re: [R] [SPAM] Re: The "--slave" option

2019-09-23 Thread Benjamin Lang
Hah, fair. I do hope somebody does see it and gives it a thought.

Thanks,
Ben

On Sun, 22 Sep 2019 at 01:29, Roy Mendelssohn - NOAA Federal <
roy.mendelss...@noaa.gov> wrote:

> Please All:
>
> While as I said in my first post I am still not convinced that the OP was
> in good faith to improve R and not a troll  (yours to decide), I also don't
> think attacking a person's research to counter a point that has nothing to
> do with their research is what is wanted on this mail-list.  There is one
> very simple alternative - don't reply.
>
> Ben - members of R-core do read this mail-list,  and the fact that not a
> single one has replied probably tells you what you need to know.
>
> -Roy
>
>
> > On Sep 21, 2019, at 3:56 PM, Abby Spurdle  wrote:
> >
> > (excerpts only)
> >> slavery being easily justified by the Bible while abolition is not is
> an experience.
> >> P.S. Do any R developers actually read this?
> >
> > I've read one or two verses...
> >
> > I also found this (by you):
> > https://www.ncbi.nlm.nih.gov/pubmed/20362542
> >
> > Which uses embryonic stem cells.
> > I recognize that they're mouse embryos.
> > However, your article cites at least five other articles (probably, a
> > lot more), that use human embryonic stem cells.
> >
> > You complain about slavery (that doesn't exist), and then promote
> > murder (which does exist).
> > What does that say about you...
> >
> > And that's ignoring the way you treat animals
> > We slice and dice data, you slice and dice living creatures.
> >
> > Here's two songs about freedom, if you have ears to hear:
> > https://youtu.be/lKw6uqtGFfo
> > https://youtu.be/HAIdo707Sac
>
> **
> "The contents of this message do not reflect any position of the U.S.
> Government or NOAA."
> **
> Roy Mendelssohn
> Supervisory Operations Research Analyst
> NOAA/NMFS
> Environmental Research Division
> Southwest Fisheries Science Center
> ***Note new street address***
> 110 McAllister Way
> Santa Cruz, CA 95060
> Phone: (831)-420-3666
> Fax: (831) 420-3980
> e-mail: roy.mendelss...@noaa.gov www: https://www.pfeg.noaa.gov/
>
> "Old age and treachery will overcome youth and skill."
> "From those who have been given much, much will be expected"
> "the arc of the moral universe is long, but it bends toward justice" -MLK
> Jr.
>
>



[R] dabestr plot color change manually

2019-09-23 Thread Moshiur Rahman
Dear R plot experts,

I'm working with the dabestr package for effect size estimation plots (
https://cran.r-project.org/web/packages/dabestr/vignettes/using-dabestr.html)
and I'd like to change the following two things manually. Does anyone
know:

1) how I can manually change the tick color (I'd like black rather
than the default grey)? and

2) how I can set the points to two distinct colors (I'd like black and
grey rather than the default automatic gradient)?

Any suggestions would be highly appreciated.

Thanks in advance.

Regards,

Moshi

JSPS Postdoctoral Fellow
Laboratory of Population Biology
Department of Marine Biosciences
Graduate School of Marine Science and Technology
Tokyo University of Marine Science and Technology
4-5-7 Konan, Minato-ku, Tokyo 108-8477, Japan
Mobile: 050-6874-9072
