[R] Expected min height of view

2021-01-07 Thread Gregory Coats via R-help
I upgraded from R 4.0.2 to R 4.0.3 for Apple Mac at Duke University. Now the
only output I get from R 4.0.3 is an error message.
Greg Coats

2021-01-07 22:58:42.997 R[8311:37566] Warning: Expected min height of view: () to be less than or equal to 30 but got a height of 32.00. This error will be logged once per view in violation.
> version
               _
platform       x86_64-apple-darwin17.0
arch           x86_64
os             darwin17.0
system         x86_64, darwin17.0
status
major          4
minor          0.3
year           2020
month          10
day            10
svn rev        79318
language       R
version.string R version 4.0.3 (2020-10-10)
nickname       Bunny-Wunnies Freak Out
>
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Secondary y axis in ggplot2: did not respond when change its y-axis value

2021-01-07 Thread Marna Wagley
Hi R users,
I was trying to plot a graph with a secondary axis and used the following
code for the data, but the secondary line and the secondary y-axis values did
not match. I would like to show both lines in one graph.

Any suggestions?

library(ggplot2)
library(reshape2)
daT <- structure(list(x = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L), y1 = c(9754L, 1051L, 5833L, 5769L, 2479L,
470L, 5828L, 174L, 2045L, 6099L, 8780L, 8732L, 4053L, 9419L,
4728L, 3587L), y2 = c(0.51, 0.61, 0.3, 0.81, 0.89, 0, 1.9, 0.76,
0.87, 0.29, 0, 0.42, 0.73, 0.96, 0.62, 0.06), group = c("A",
"A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B",
"B", "B")), class = "data.frame", row.names = c(NA, -16L))
print(daT)
# Long format: one row per (x, group, variable) combination
daT1 <- melt(daT, id.vars = c("x", "group"))
# Pass the data directly; %>% would need magrittr or dplyr to be loaded
ggplot(daT1) +
  geom_line(aes(x = x, y = value, group = variable, color = variable)) +
  facet_wrap(~group) +
  scale_y_continuous(sec.axis = sec_axis(~ . * 0.0001))
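A common way to make the second line and the secondary axis agree is to rescale y2 into y1 units before plotting and to hand sec_axis() the inverse transformation. The following is only a sketch under that assumption, using the same values as daT. The scaling factor k (the ratio of the two maxima) is an illustrative choice and does not come from the original post.

```r
library(ggplot2)

daT <- data.frame(
  x = rep(1:8, times = 2),
  y1 = c(9754, 1051, 5833, 5769, 2479, 470, 5828, 174,
         2045, 6099, 8780, 8732, 4053, 9419, 4728, 3587),
  y2 = c(0.51, 0.61, 0.3, 0.81, 0.89, 0, 1.9, 0.76,
         0.87, 0.29, 0, 0.42, 0.73, 0.96, 0.62, 0.06),
  group = rep(c("A", "B"), each = 8)
)

# Assumed scaling: stretch y2 so that its maximum matches the maximum of y1.
k <- max(daT$y1) / max(daT$y2)

p <- ggplot(daT, aes(x = x)) +
  geom_line(aes(y = y1, colour = "y1")) +
  geom_line(aes(y = y2 * k, colour = "y2")) +   # y2 drawn in y1 units
  facet_wrap(~group) +
  scale_y_continuous(
    name = "y1",
    sec.axis = sec_axis(~ . / k, name = "y2")   # axis labelled back in y2 units
  )
print(p)
```

The data transformation (multiply by k) and the axis transformation (divide by k) have to be inverses of each other; otherwise the right-hand axis no longer describes the y2 line.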



Re: [R-es] Heatmap-style plot, but not a heatmap XD

2021-01-07 Thread Eric
Many thanks, Carlos. Best regards,

Eric.

On 07-01-21 18:48, Carlos Ortega wrote:
> And there are more alternatives here:
>
> https://www.r-graph-gallery.com/2d-density-plot-with-ggplot2.html

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R-es] Heatmap-style plot, but not a heatmap XD

2021-01-07 Thread Carlos Ortega
And there are more alternatives here:

https://www.r-graph-gallery.com/2d-density-plot-with-ggplot2.html

On Thu, 7 Jan 2021 at 22:47, Carlos Ortega wrote:

> Hello,
>
> Take a look at this:
>
> https://github.com/LKremer/ggpointdensity
>
> Regards,
> Carlos Ortega
> www.qualityexcellence.es


-- 
Regards,
Carlos Ortega
www.qualityexcellence.es



Re: [R-es] Heatmap-style plot, but not a heatmap XD

2021-01-07 Thread Carlos Ortega
Hello,

Take a look at this:

https://github.com/LKremer/ggpointdensity

Regards,
Carlos Ortega
www.qualityexcellence.es



-- 
Regards,
Carlos Ortega
www.qualityexcellence.es



Re: [R-es] Formula with randomly chosen variables

2021-01-07 Thread Carlos Ortega
Hello,

The differences with respect to a random forest (I don't know whether you
used randomForest or ranger) may come from the remaining parameters: the
rpart defaults (listed in rpart.control()) differ from those of a random
forest (at least from those of ranger), for example the number of elements
considered at a split.

Thanks,
Carlos.
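The rpart defaults mentioned above can be inspected directly, and loosening them shows how much they limit tree size. This is only a sketch: the relaxed values (cp = 0, minsplit = 5) and the three predictors are assumptions chosen for illustration, and the car90 data set that ships with rpart stands in for the real data.

```r
library(rpart)

# Defaults applied by rpart() when no control argument is given.
defaults <- rpart.control()
print(defaults[c("minsplit", "minbucket", "cp", "maxdepth")])

# Assumed forest-like settings: no complexity penalty and small nodes allowed.
deep <- rpart.control(cp = 0, minsplit = 5, xval = 0)

fit_default <- rpart(Mileage ~ Weight + HP + Disp, data = car90)
fit_deep    <- rpart(Mileage ~ Weight + HP + Disp, data = car90, control = deep)

# Count the terminal nodes of a fitted tree.
n_leaves <- function(fit) sum(fit$frame$var == "<leaf>")
c(default = n_leaves(fit_default), deep = n_leaves(fit_deep))
```

Trees grown with the default cp = 0.01 and minsplit = 20 stop splitting early, which is one plausible reason a hand-built ensemble of rpart trees behaves differently from a packaged random forest.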



-- 
Regards,
Carlos Ortega
www.qualityexcellence.es



[R-es] Heatmap-style plot, but not a heatmap XD

2021-01-07 Thread Eric
Hi all, I have a problem that I have been turning over while trying to
solve it with R, and I think I have reached a solution, but I suspect it
has already happened to someone and must already be implemented in R.
However, my search has gone badly because, despite the simplicity of the
problem, I apparently have trouble describing it correctly, so I cannot
find its solution on the web. That is why I am asking human intelligences,
to see whether they understand me better than the search engines' AIs XD.
The problem is as follows:

I have a matrix with several thousand rows and two columns. I plot those
two columns in a scatterplot which, because many points lie next to or even
on top of one another, looks like a dark mass with no clear trend. However,
I have an R2 above 0.5, so it is not such a dark mass after all: there is
some trend in the relationship between the variables, but it cannot be seen
graphically. So I would like to colour the 2D plot so that areas with a
higher density of points are redder and those with fewer points are
yellower, for example. I thought of creating a third column by hand for the
cases where one point lies exactly on top of another, but the problem is
that the variables are continuous, so a point with exactly the same XY
coordinates is rare, even though the points sit right next to each other.

So my question is whether there is an R library that solves the problem of
automatically colouring the scatterplot according to the density of points
in a region, heatmap style (heatmap itself is no use to me because it
requires me to supply the third component of the XYZ plot to colour it).

As always, many thanks for your attention and time. I hope to return the
favour when I can.

Regards!!

Eric.
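One base R route to this kind of plot is densCols() from grDevices, which estimates a 2D kernel density and returns one colour per point. The sketch below uses simulated data in place of the two-column matrix; the variable names, the sample size and the yellow-to-red palette are assumptions made for illustration.

```r
set.seed(1)                        # simulated stand-in for the two columns
n <- 5000
x <- rnorm(n)
y <- 0.8 * x + rnorm(n, sd = 0.6)

# Yellow for sparse regions, red for dense ones.
pal <- colorRampPalette(c("yellow", "orange", "red"))

# One colour per point, taken from a smoothed 2D density estimate
# (densCols() relies on the recommended KernSmooth package).
cols <- densCols(x, y, colramp = pal)

plot(x, y, col = cols, pch = 20, cex = 0.6,
     main = "Scatterplot coloured by local point density")
```

No third column has to be built by hand: the density is estimated from the neighbourhood of each point, so continuous coordinates that never coincide exactly are not a problem.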



Re: [R] seq.Date when date is the last date of the month

2021-01-07 Thread Jeremie Juste
Hello Dirk,

Many thanks for the feedback. I did come across your SO answer but
didn't know the RcppBDT package; thanks for pointing that out.

Best regards,
Jeremie

On Thursday,  7 Jan 2021 at 13:59, Dirk Eddelbuettel wrote:
> Jeremie,
>
> As months have irregular number of dates, one needs to use a function that
> accounts for that (date libraries and packages have that, one of the earliest
> for R was my RcppBDT package using Boost Date_Time), or be otherwise clever.
>
> Here is a one-liner using the latter approach:
>
>     seq(as.Date("2010-02-01"), length=24, by="1 month") - 1
>
> See this old StackOverflow answer where I used this before:
>
>
> https://stackoverflow.com/questions/8333838/generate-a-sequence-of-the-last-day-of-the-month-over-two-years
>
> Dirk
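The one-liner works because the first day of every month always exists, so stepping by month from a first-of-month date never produces an invalid day; subtracting one day then lands on the last day of the previous month. A small sketch of that idea follows (the variable names are illustrative only):

```r
# First day of each month starting in February 2010, shifted back by one day.
month_ends <- seq(as.Date("2010-02-01"), length.out = 24, by = "1 month") - 1
head(month_ends, 3)

# For contrast: stepping by month from the 31st yields invalid days, which
# seq() counts forward into the following month (see ?seq.POSIXt).
rolled <- seq(as.Date("2010-01-31"), length.out = 3, by = "1 month")
rolled
```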

-- 
Jeremie Juste



Re: [R-es] Formula with randomly chosen variables

2021-01-07 Thread Manuel Mendoza
Thanks José Luis, and also Carlos and Juan. Regarding your suggestion, José
Luis, of using a random forest: that is exactly what I am programming, an
RF. I already had the bootstrap-with-trees program written, and I have
programmed an RF starting from it. I solved the problem I had with what
Carlos told me. It works well, although I am surprised that the correlation
on the OOB samples does not improve over the bootstrap, whereas with the
randomForest package it improved substantially.
I am posting it here as is, in case you want to take a look.
Regards

library(rpart)

set.seed(5)
data <- read.table(file = "Data.csv", header = TRUE, sep = ",")

colnames(data)
p <- 19
nreps <- 1000

# Create an empty matrix to store the predictions of each tree:
OOBpreds <- matrix(NA, nrow = nrow(data), ncol = nreps,
                   dimnames = list(rownames(data), NULL))

target <- c('IFd')
vars   <- setdiff(names(data), target)

for (i in 1:nreps) {
  selected <- sample(1:nrow(data), size = floor((2/3) * nrow(data)), replace = TRUE)
  training <- data[selected, ]
  OOB <- data[-selected, ]  # the out-of-bag set holds the rows not in the training data set
  num_vars <- floor(p / 3)
  vars_samp <- vars[sample(1:length(vars), num_vars)]
  fmla <- as.formula(paste(target, " ~ ", paste(vars_samp, collapse = "+")))
  fit <- rpart(fmla, data = training)
  OOBpreds[-selected, i] <- predict(fit, OOB)
  # i %% 10 == 0 means: the remainder of dividing i by 10 is 0
  if (i %% 10 == 0) print(paste("Iteration ", i))
}

ResOOB <- rowMeans(OOBpreds, na.rm = TRUE)  # mean of all predictions for each sample
OOBBagging <- lm(data$IFd ~ ResOOB)  # to compute the correlation between prediction and observation
rsqOOBRT <- summary(OOBBagging)$adj.r.squared  # R2

dev.new()  # portable replacement for windows(), which exists only on Windows
plot(data$IFd ~ ResOOB, main = paste("R2=", round(rsqOOBRT, 2)))
abline(0, 1, lty = 2, col = 2)
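The formula-building step in the loop above can also be written with reformulate() from the stats package, which avoids pasting strings together. A minimal sketch, using the car90 data set that ships with rpart as a stand-in for Data.csv (the target Mileage and the choice of 6 predictors are assumptions taken from the example elsewhere in this thread):

```r
library(rpart)

set.seed(1)
data(car90, package = "rpart")
target <- "Mileage"
vars   <- setdiff(names(car90), target)

vars_samp <- sample(vars, 6)                        # 6 predictors drawn at random
fmla <- reformulate(vars_samp, response = target)   # Mileage ~ v1 + ... + v6
fit  <- rpart(fmla, data = car90)
print(fmla)
```

Note that sample(vars, 6) draws directly from the character vector, so the indexing step vars[sample(1:length(vars), ...)] is not needed.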






On Thu, 7 Jan 2021 at 11:03, José Luis Cañadas wrote:

> Hello Manuel.
> Have you thought about fitting a random forest, setting it to use all the
> data in each bootstrap sample and a percentage of the variables?
>
> On Thu, 7 Jan 2021 at 1:42, Carlos Ortega wrote:
>
>> Hello Manuel,
>>
>> Here is one way; I use the "car90" data set that ships with "rpart".
>>
>> #-
>> library(rpart)
>>
>> data(car90)
>> target <- c('Mileage')
>> vars   <- setdiff(names(car90), target)
>>
>> num_loops <- 10
>> for (i in 1:num_loops) {
>>   num_vars <- 6
>>   vars_samp <- vars[sample(1:length(vars), num_vars)]
>>   fmla <- as.formula(paste(target, " ~ ", paste(vars_samp, collapse = "+")))
>>   fit <- rpart(fmla, data = car90)
>>   print(fit)
>> }
>>
>> #-
>>
>> This can be made more sophisticated, even capturing the output of each
>> iteration... :-).
>>
>> Thanks,
>> Carlos Ortega
>> www.qualityexcellence.es
>>
>> On Wed, 6 Jan 2021 at 22:44, Manuel Mendoza (<mmend...@fulbrightmail.org>)
>> wrote:
>>
>> > Hello, I am fitting a regression tree (although it could be any other
>> > analysis) inside a loop, and I want it to pick a different set of
>> > variables on each pass. The df has 19 predictors, but I want it to use
>> > only 6 of them, at random, each time. What should I put where the
>> > question mark is?
>> >
>> >   fit<- rpart(IFd ~ ? , data=training)
>> >
>> > Thanks, as always,
>> > Manuel


Re: [R] seq.Date when date is the last date of the month

2021-01-07 Thread Jeremie Juste


Hello Jim,

Many thanks for the feedback

> Using "month" first advances the month without changing the day: if
> this results in an invalid day of the month, it is counted forward
> into the next month: see the examples.
Indeed I missed the documentation of seq.Date that refers to
seq.POSIXt. Many thanks for pointing this out.

Still I would be tempted to count back the day into the same month
instead of counting forward. But this behavior seems intentional and
documented so no need to question it. 

> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
The problem I'm trying to solve is this.

For a date d and an integer m,
add.month(d,m)

would first advance the month without changing the day: if this results
in an invalid day of the month, it is counted backward into the same
month.

So it is nearly the same behavior as seq.Date, except that
add.month("2020-08-31",1) ==> 2020-09-30
add.month("2020-08-01",1) ==> 2020-09-01
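The behaviour described above can be sketched in base R by clamping the day to the last day of the target month. This is only one possible implementation, written for a single date d and a single integer m; the internal variable names are illustrative.

```r
# Sketch of add.month(d, m): advance by m months and, if the day does not
# exist in the target month, count back to that month's last day.
add.month <- function(d, m) {
  lt <- as.POSIXlt(as.Date(d))
  # Months elapsed since January 1900, shifted by m.
  total <- (lt$year * 12L + lt$mon) + as.integer(m)
  year  <- 1900L + total %/% 12L
  month <- total %% 12L + 1L
  first <- as.Date(sprintf("%04d-%02d-01", year, month))
  # Last day of the target month: first day of the next month minus one day.
  last <- seq(first, by = "1 month", length.out = 2)[2] - 1
  # Keep the original day of the month unless it overshoots the month end.
  min(first + (lt$mday - 1), last)
}

add.month("2020-08-31", 1)
add.month("2020-08-01", 1)
```

Because the month arithmetic is done on a month count rather than on the date itself, negative values of m and year boundaries are handled by the same code path.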


Best regards,
Jeremie



Re: [R-es] Lucene project

2021-01-07 Thread Diego Martín
Dear colleagues:

Bear in mind that the *solr* package has not been available on CRAN since
2018. In its place there is *solrium*; see
*https://cran.r-project.org/web/packages/solrium/index.html*. Its use is not
exactly the same, although in general it is very similar. There are nuances
in the parameters of some functions, for example *solr_search* with *conn*, ...

Regards.


On Thu, 7 Jan 2021 at 19:48, Diego Martín wrote:

> Hello:
>
> Many thanks for the question. It made me rethink what I am working on.
> Yes, I had seen Solr, but I soon ruled it out for two reasons: first, it
> states that it has APIs for Python and Ruby but makes no reference at all
> to R. Second, it is specific, and I had thought it better to go directly
> to Lucene, which does not mention R either.
>
> Thanks to you, Carlos, I have seen that there is a package on CRAN called
> *solr* that might work, and the page
> *https://ropensci.org/blog/2014/01/27/solr/* has interesting comments on
> how to use it. Well, it is a start. I am going to read up on this library
> and try to see whether it fits what I am looking for.
>
> Many thanks as always. Best regards.
>
> On Thu, 7 Jan 2021 at 0:54, Carlos Ortega wrote:
>
>> Hello,
>>
>> Doesn't the connection with SolR work for you?
>> SolR and Lucene have been the same project since 2010...
>>
>> Regards,
>> Carlos Ortega
>> www.qualityexcellence.es
>>
>> On Wed, 6 Jan 2021 at 23:06, Diego Martín wrote:
>>
>>> Dear colleagues:
>>>
>>> I have searched and have not found an available connection between R
>>> and Lucene. Does any of you know whether one exists, as it does for
>>> Python [1]?
>>>
>>> Many thanks.
>>>
>>> [1] PyLucene.


Re: [R] non-standard reshape from long to wide

2021-01-07 Thread Yuan Chun Ding
Hi Rui,

Thank you so much!! Your code works well and I am looking into the pivot_wider
function.

Yuan Ding



Re: [R] non-standard reshape from long to wide

2021-01-07 Thread Rui Barradas

Hello,

Here is a dplyr solution. The main trick is to create a column of 1's, 
then pipe to pivot_wider.



library(dplyr)
library(tidyr)

df.long %>%
  mutate(values = 1) %>%
  pivot_wider(
    id_cols = sample,
    names_from = marker,
    values_from = values,
    values_fill = NA
  )
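For comparison, much the same wide table can be sketched in base R by cross-tabulating the two columns. This is an alternative technique, not the pivot_wider() approach shown above; it assumes each sample/marker pair occurs at most once, and the name wide is illustrative.

```r
sample <- c("xr", "xr", "fh", "fh", "fh", "uy", "uy", "uy", "uy")
marker <- c("x", "y", "g", "x", "k", "y", "x", "u", "j")
df.long <- data.frame(sample, marker)

# Cross-tabulate sample by marker: 1 where the pair occurs, 0 otherwise.
tab  <- table(df.long$sample, df.long$marker)
wide <- as.data.frame.matrix(tab)

# The expected output uses NA rather than 0 for absent pairs.
wide[wide == 0] <- NA
wide
```

Unlike the pivot_wider() result, the samples end up as row names and both rows and columns are sorted alphabetically.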


Note: your df.wide is not a data.frame, the transpose coerces it to 
matrix. In this case it doesn't matter because it was just an example of 
expected output but in other, real use cases you must be careful.


df.wide <- as.data.frame(df.wide)

would solve it.
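The coercion mentioned in the note is easy to check. The sketch below rebuilds the example from the original question and confirms that t() returns a matrix, which as.data.frame() then converts back (the intermediate name m is illustrative):

```r
xr <- c(1, 1, NA, NA, NA, NA)
fh <- c(1, NA, 1, 1, NA, NA)
uy <- c(1, 1, NA, NA, 1, 1)

# t() works on matrices, so the data frame is coerced before transposing.
m <- t(data.frame(xr, fh, uy))
is.matrix(m)
is.data.frame(m)

# Convert back and restore the marker names as column names.
df.wide <- as.data.frame(m)
colnames(df.wide) <- c("x", "y", "g", "k", "u", "j")
str(df.wide)
```

The distinction matters as soon as columns of different types are involved, because a matrix holds a single type and would silently convert everything to it.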


Hope this helps,

Rui Barradas

At 18:39 on 07/01/21, Yuan Chun Ding wrote:

Dear R user,

I want to reshape a long data frame to wide format. I made the following
example files. Can you help me?

Thank you,

Yuan Chun Ding

sample <-c("xr" , "xr" , "fh" , "fh" , "fh" , "uy" , "uy" , "uy" , "uy");
marker <-c("x" , "y" , "g" , "x" , "k" , "y" , "x" , "u" , "j");
df.long <-data.frame(sample, marker);

xr <-c(1,1,NA,NA,NA,NA);

fh <-c(1,NA,1,1,NA,NA);
uy <-c(1,1,NA,NA,1,1);

df.wide <- t(data.frame(xr,fh,uy));
colnames(df.wide)<-c("x","y","g","k", "u","j");



Re: [R] AIC for robust GAM ?

2021-01-07 Thread varin sacha via R-help
Hi Bert,

Many thanks for your response.

Best,


On Wednesday, 6 January 2021 at 21:47:14 UTC+1, Bert Gunter wrote:

Per the posting guide linked below:

"If the question relates to a contributed package , e.g., one downloaded from 
CRAN, try contacting the package maintainer first. You can also use 
find("functionname") and packageDescription("packagename") to find this 
information. Only send such questions to R-help or R-devel if you get no reply 
or need further assistance. This applies to both requests for help and to bug 
reports."

So I believe that you should try to contact the maintainer for your question 
first. You may get lucky here, of course, but your query seems rather too 
technical to expect a useful response on R-Help.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and 
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Jan 6, 2021 at 12:17 PM varin sacha via R-help  
wrote:
> Dear R-Experts,
> 
> Here below my reproducible R code.
> How can I get the AIC of my model (robust GAM) ?
> 
> Best Regards,
> 
> 
> y<-c(499,491,500,517,438,495,501,525,516,494,500,453,479,481,505,465,477,520,520,480,477,416,502,503,497,513,492,469,504,482,502,498,463,504,495)
> x<-c(499,496,424,537,480,484,503,575,540,436,486,506,496,481,508,425,501,519,546,507,452,498,471,495,499,522,509,474,502,534,504,466,527,485,525)
> library(robustgam)
> true.family <- poisson()
> fit=robustgam(x,y,sp=0,family=true.family,smooth.basis='ps',K=3)
> AIC(fit)


Re: [R] seq.Date when date is the last date of the month

2021-01-07 Thread Dirk Eddelbuettel


Jeremie,

As months have an irregular number of days, one needs to use a function that
accounts for that (date libraries and packages have that; one of the earliest
for R was my RcppBDT package using Boost Date_Time), or be otherwise clever.

Here is a one-liner using the latter approach:

   seq(as.Date("2010-02-01"), length=24, by="1 month") - 1

See this old StackOverflow answer where I used this before:

   https://stackoverflow.com/questions/8333838/generate-a-sequence-of-the-last-day-of-the-month-over-two-years
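
The one-liner works because the first of a month always exists, so stepping
by month from a first-of-month date never overflows, and subtracting one day
lands on the last day of the preceding month. A minimal sketch applied to the
dates from the question (the variable names are only illustrative):

```r
# First day of each month from September to November 2012
firsts <- seq(as.Date("2012-09-01"), by = "1 month", length.out = 3)

# Step back one day to land on the last day of the preceding month
month_ends <- firsts - 1
print(month_ends)
# [1] "2012-08-31" "2012-09-30" "2012-10-31"
```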

Dirk

-- 
https://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [R] seq.Date when date is the last date of the month

2021-01-07 Thread jim holtman
Yes, it is the expected behaviour if you check the documentation:

Using "month" first advances the month without changing the day: if
this results in an invalid day of the month, it is counted forward
into the next month: see the examples.
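
The rollover described in that passage can be reproduced by hand. This is
only a sketch of the idea, assuming that bumping the month field of a
POSIXlt value and converting back to Date mirrors what happens for
by = "month"; it is not a copy of the seq.Date source:

```r
# Break the date into fields, then advance the month without touching the day
d <- as.POSIXlt(as.Date("2012-08-31"))
d$mon <- d$mon + 1        # nominally "2012-09-31", which does not exist

# Converting back normalises the invalid day forward into the next month
rolled <- as.Date(d)
format(rolled)
# [1] "2012-10-01"
```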

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



On Thu, Jan 7, 2021 at 11:20 AM Jeremie Juste  wrote:
>
> Hello,
>
> I recently bumped into a behavior that surprised me.
> When performing the following command, I would expect the second
> element to be "2012-09-30" but got "2012-10-01" instead
> > seq(as.Date("2012-08-31"),by="1 month",length=2)
> [1] "2012-08-31" "2012-10-01"
>
> When the same command is performed for the start of the month, I get the
> result I expect.
> > seq(as.Date("2012-08-01"),by="1 month",length=2)
> [1] "2012-08-01" "2012-09-01"
>
>
> Is there an explanation for this behavior?
>
> Best regards,
> --
> Jeremie Juste
>


[R] seq.Date when date is the last date of the month

2021-01-07 Thread Jeremie Juste
Hello,

I recently bumped into a behavior that surprised me.
When performing the following command, I would expect the second
element to be "2012-09-30" but got "2012-10-01" instead
> seq(as.Date("2012-08-31"),by="1 month",length=2)
[1] "2012-08-31" "2012-10-01"

When the same command is performed for the start of the month, I get the
result I expect.
> seq(as.Date("2012-08-01"),by="1 month",length=2)
[1] "2012-08-01" "2012-09-01"


Is there an explanation for this behavior?

Best regards,
-- 
Jeremie Juste



Re: [R] non-standard reshape from long to wide

2021-01-07 Thread Bert Gunter
Show us your attempt on your example data. Also note that warnings are
*not* errors, though they typically do indicate problems.

-- Bert

On Thu, Jan 7, 2021 at 11:09 AM Yuan Chun Ding  wrote:

> Hi Bert,
>
> No, this is not homework related. The original data have 87352 rows. I used
> the standard reshape function and got warning messages, so I reformatted the
> wide format to meet my research purpose.
>
> mut2 <-mut[,c("Tumor_Sample_Barcode","mut.id", "Hugo_Symbol")]
> mut2 <-mut2[order(mut2$Hugo_Symbol),]
> mut3 <-mut2[!duplicated(mut2),]
> mut4 <-reshape(mut3, idvar = "Hugo_Symbol", timevar =
> "Tumor_Sample_Barcode", direction = "wide")
>
> There were 50 or more warnings (use warnings() to see the first 50)
> > View(mut4)
> > warnings()
> Warning messages:
> 1: In reshapeWide(data, idvar = idvar, timevar = timevar,  ... :
>   multiple rows match for
> Tumor_Sample_Barcode=TCGA-A8-A09Z-01A-11W-A019-09: first taken
> 2: In reshapeWide(data, idvar = idvar, timevar = timevar,  ... :
>
> *From:* Bert Gunter [mailto:bgunter.4...@gmail.com]
> *Sent:* Thursday, January 7, 2021 10:52 AM
> *To:* Yuan Chun Ding 
> *Cc:* r-help@r-project.org
> *Subject:* Re: [R] non-standard reshape from long to wide
>
> Is this homework? There is a no-homework policy on this list.
>
> If not, note that you are usually asked to show what you tried and the
> error messages you received.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Thu, Jan 7, 2021 at 10:40 AM Yuan Chun Ding  wrote:
>
> Dear R user,
>
> I want to reshape a long data frame to wide format, I made the following
> example files.  Can you help me?
>
> Thank you,
>
> Yuan Chun Ding
>
> sample <-c("xr" , "xr" , "fh" , "fh" , "fh" , "uy" , "uy" , "uy" , "uy");
> marker <-c("x" , "y" , "g" , "x" , "k" , "y" , "x" , "u" , "j");
> df.long <-data.frame(sample, marker);
>
> xr <-c(1,1,NA,NA,NA,NA);
> fh <-c(1,NA,1,1,NA,NA);
> uy <-c(1,1,NA,NA,1,1);
>
> df.wide <- t(data.frame(xr,fh,uy));
> colnames(df.wide)<-c("x","y","g","k", "u","j");
>


Re: [R] non-standard reshape from long to wide

2021-01-07 Thread Yuan Chun Ding
Hi Bert,

No, this is not homework related. The original data have 87352 rows. I used the 
standard reshape function and got warning messages, so I reformatted the wide 
format to meet my research purpose.

mut2 <-mut[,c("Tumor_Sample_Barcode","mut.id", "Hugo_Symbol")]
mut2 <-mut2[order(mut2$Hugo_Symbol),]
mut3 <-mut2[!duplicated(mut2),]
mut4 <-reshape(mut3, idvar = "Hugo_Symbol", timevar = "Tumor_Sample_Barcode", 
direction = "wide")

There were 50 or more warnings (use warnings() to see the first 50)
> View(mut4)
> warnings()
Warning messages:
1: In reshapeWide(data, idvar = idvar, timevar = timevar,  ... :
  multiple rows match for Tumor_Sample_Barcode=TCGA-A8-A09Z-01A-11W-A019-09: 
first taken
2: In reshapeWide(data, idvar = idvar, timevar = timevar,  ... :
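
The warning "multiple rows match ... first taken" typically appears when an
idvar/timevar pair occurs on more than one row, in which case reshape() keeps
only the first. A minimal sketch, assuming a small hypothetical stand-in for
mut3 (the sample names and mutation ids below are invented for illustration),
that counts such repeats and, as one possible option, collapses the ids per
pair before reshaping:

```r
# Hypothetical stand-in for mut3: one gene/sample pair carries two mutation ids
mut3 <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S1", "S2"),
  mut.id               = c("m1", "m2", "m3"),
  Hugo_Symbol          = c("TP53", "TP53", "TP53")
)

# Rows are unique across all three columns, yet the key pair still repeats
key   <- mut3[, c("Hugo_Symbol", "Tumor_Sample_Barcode")]
n_dup <- sum(duplicated(key))

# One option: paste the ids together per gene/sample pair, then reshape
collapsed <- aggregate(mut.id ~ Hugo_Symbol + Tumor_Sample_Barcode,
                       data = mut3, FUN = paste, collapse = ";")
wide <- reshape(collapsed, idvar = "Hugo_Symbol",
                timevar = "Tumor_Sample_Barcode", direction = "wide")
print(wide)
```

Whether collapsing is appropriate depends on the research question; counting
mutations per pair would be another reasonable choice.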
From: Bert Gunter [mailto:bgunter.4...@gmail.com]
Sent: Thursday, January 7, 2021 10:52 AM
To: Yuan Chun Ding 
Cc: r-help@r-project.org
Subject: Re: [R] non-standard reshape from long to wide

Is this homework? There is a no-homework policy on this list.

If not, note that you are usually asked to show what you tried and the error 
messages you received.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and 
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jan 7, 2021 at 10:40 AM Yuan Chun Ding wrote:
Dear R user,

I want to reshape a long data frame to wide format, I made the following 
example files.  Can you help me?

Thank you,

Yuan Chun Ding

sample <-c("xr" , "xr" , "fh" , "fh" , "fh" , "uy" , "uy" , "uy" , "uy");
marker <-c("x" , "y" , "g" , "x" , "k" , "y" , "x" , "u" , "j");
df.long <-data.frame(sample, marker);

xr <-c(1,1,NA,NA,NA,NA);
fh <-c(1,NA,1,1,NA,NA);
uy <-c(1,1,NA,NA,1,1);

df.wide <- t(data.frame(xr,fh,uy));
colnames(df.wide)<-c("x","y","g","k", "u","j");



Re: [R] non-standard reshape from long to wide

2021-01-07 Thread Bert Gunter
Is this homework? There is a no-homework policy on this list.

If not, note that you are usually asked to show what you tried and the
error messages you received.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Jan 7, 2021 at 10:40 AM Yuan Chun Ding  wrote:

> Dear R user,
>
> I want to reshape a long data frame to wide format, I made the following
> example files.  Can you help me?
>
> Thank you,
>
> Yuan Chun Ding
>
> sample <-c("xr" , "xr" , "fh" , "fh" , "fh" , "uy" , "uy" , "uy" , "uy");
> marker <-c("x" , "y" , "g" , "x" , "k" , "y" , "x" , "u" , "j");
> df.long <-data.frame(sample, marker);
>
> xr <-c(1,1,NA,NA,NA,NA);
> fh <-c(1,NA,1,1,NA,NA);
> uy <-c(1,1,NA,NA,1,1);
>
> df.wide <- t(data.frame(xr,fh,uy));
> colnames(df.wide)<-c("x","y","g","k", "u","j");
>


Re: [R-es] Lucene project

2021-01-07 Thread Diego Martín
Hello:

Many thanks for the question. It has made me rethink the
matter I am working on. Yes, I had looked at Solr, but I soon ruled it out
for two reasons: first, it declares APIs for Python and
Ruby but makes not the slightest reference to R. Second, it is specific, and
I had thought it better to go directly to Lucene, which does not mention
R either.

Thanks to you, Carlos, I have seen that there is a package on CRAN
called *solr* that could be useful, and the page
*https://ropensci.org/blog/2014/01/27/solr/* has interesting comments
on how to use it. Well, it is a start. I am going to
read up on this library and try to see whether it fits what I am
looking for.

Many thanks as always. Best regards.

On Thu, 7 Jan 2021 at 0:54, Carlos Ortega wrote:

> Hello,
>
> Doesn't the connection with SolR work for you?
> SolR and Lucene have been the same project since 2010...
>
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
>
> On Wed, 6 Jan 2021 at 23:06, Diego Martín wrote:
>
>> Dear colleagues:
>>
>> I have searched and not found an available
>> connection between R and Lucene. Does any of you know whether one exists,
>> as it does for Python [1]?
>>
>> Many thanks.
>>
>> [1] PyLucene.

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


[R] non-standard reshape from long to wide

2021-01-07 Thread Yuan Chun Ding
Dear R user,

I want to reshape a long data frame to wide format. I made the following 
example data. Can you help me?

Thank you,

Yuan Chun Ding

sample <-c("xr" , "xr" , "fh" , "fh" , "fh" , "uy" , "uy" , "uy" , "uy");
marker <-c("x" , "y" , "g" , "x" , "k" , "y" , "x" , "u" , "j");
df.long <-data.frame(sample, marker);
   
xr <-c(1,1,NA,NA,NA,NA);
fh <-c(1,NA,1,1,NA,NA);
uy <-c(1,1,NA,NA,1,1);

df.wide <- t(data.frame(xr,fh,uy));
colnames(df.wide)<-c("x","y","g","k", "u","j");
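
One possible base R route from df.long to the layout of df.wide is a
cross-tabulation of sample by marker. This is a sketch only, assuming each
sample/marker pair appears at most once, as in the example data; the names
tab and wide are introduced here for illustration:

```r
sample <- c("xr", "xr", "fh", "fh", "fh", "uy", "uy", "uy", "uy")
marker <- c("x", "y", "g", "x", "k", "y", "x", "u", "j")
df.long <- data.frame(sample, marker)

# Count occurrences of every sample/marker pair (1 if present, 0 if absent)
tab  <- table(df.long$sample, df.long$marker)
wide <- as.data.frame.matrix(tab)

# The desired result marks absent pairs with NA rather than 0
wide[wide == 0] <- NA

# Put rows and columns in the same order as the hand-built df.wide
wide <- wide[c("xr", "fh", "uy"), c("x", "y", "g", "k", "u", "j")]
print(wide)
```

If a pair can occur more than once, the counts would exceed 1 and would need
to be capped or handled separately.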



Re: [R] New mask.ok option for libraries

2021-01-07 Thread Duncan Murdoch

On 07/01/2021 10:02 a.m., Magnus Torfason wrote:
> I had sent the following to r-devel a while ago, but perhaps r-help is more
> appropriate.

No, this is definitely an R-devel topic.  You didn't get any replies 
there, but that doesn't mean you should post it again in the wrong place.

Duncan Murdoch

> I guess my question is what to do with this, would people
> generally file an issue, or is there a way to hear if this is something
> that makes sense to add – whether more info would be helpful and so on?
>
> =
> I was very happy to see the new mask.ok option. It works very well when
> conflicts.policy is "strict":
>
> ---
> options(conflicts.policy="strict")
> library(igraph, exclude="decompose", mask.ok=c("spectrum","union"))
> #> [No messages]
> ---
>
> However, if no conflicts.policy has been set, the masked objects are loudly
> reported, even if they are specified with mask.ok:
>
> ---
> library(igraph, exclude="decompose", mask.ok=c("spectrum","union"))
> #>
> #> Attaching package: 'igraph'
> #> The following object is masked from 'package:stats':
> #>
> #> spectrum
> #> The following object is masked from 'package:base':
> #>
> #> union
> ---
>
> It seems that if I specify mask.ok, that particular masking is expected and
> should NOT be reported, regardless of what the conflicts.policy is. It
> would be very useful for many users who are not ready to switch over to a
> strict conflicts.policy, to nevertheless be able to suppress messages about
> expected conflicts using mask.ok and thus only get messages when unexpected
> masking occurs.
> =
>
> Best,
> Magnus


[R] New mask.ok option for libraries

2021-01-07 Thread Magnus Torfason
I had sent the following to r-devel a while ago, but perhaps r-help is more
appropriate. I guess my question is what to do with this, would people
generally file an issue, or is there a way to hear if this is something
that makes sense to add – whether more info would be helpful and so on?

=
I was very happy to see the new mask.ok option. It works very well when
conflicts.policy is "strict":

---
options(conflicts.policy="strict")
library(igraph, exclude="decompose", mask.ok=c("spectrum","union"))
#> [No messages]
---

However, if no conflicts.policy has been set, the masked objects are loudly
reported, even if they are specified with mask.ok:

---
library(igraph, exclude="decompose", mask.ok=c("spectrum","union"))
#>
#> Attaching package: 'igraph'
#> The following object is masked from 'package:stats':
#>
#> spectrum
#> The following object is masked from 'package:base':
#>
#> union
---

It seems that if I specify mask.ok, that particular masking is expected and
should NOT be reported, regardless of what the conflicts.policy is. It
would be very useful for many users who are not ready to switch over to a
strict conflicts.policy, to nevertheless be able to suppress messages about
expected conflicts using mask.ok and thus only get messages when unexpected
masking occurs.
=

Best,
Magnus



Re: [ESS] info file not building in emacs

2021-01-07 Thread Miguel Rodriguez via ESS-help
Apparently the issue has already been reported in
https://github.com/melpa/melpa/issues/7347# . I’ll try the manual download
and take a look at the Makeconf.

Many thanks!

Miguel Rodriguez
rodmiganto...@gmail.com
+4915163393479



> On 5 Jan 2021, at 23:14, Sparapani, Rodney via ESS-help 
>  wrote:
> 
> Check Makeconf to see where the info file goes.
> I have uncommented INFODIR=/usr/local/info
> 
> -- 
> 
> Rodney Sparapani, Associate Professor of Biostatistics
> Chair ISBA Section on Biostatistics and Pharmaceutical Statistics
> Institute for Health and Equity, Division of Biostatistics
> Medical College of Wisconsin, Milwaukee Campus
> 
> 

__
ESS-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/ess-help


Re: [R] text analysis errors

2021-01-07 Thread Rasmus Liland
On 2021-01-07 11:34 +1100, Jim Lemon wrote:
> On Thu, Jan 7, 2021 at 10:40 AM Gordon Ballingrud
>  wrote:
> >
> > Hello all,
> >
> > I have asked this question on many forums without response. And although
> > I've made progress myself, I am stuck as to how to respond to a particular
> > error message.
> >
> > I have a question about text-analysis packages and code. The general idea
> > is that I am trying to perform readability analyses on a collection of
> > about 4,000 Word files. I would like to do any of a number of such
> > analyses, but the problem now is getting R to recognize the uploaded files
> > as data ready for analysis. But I have been getting error messages. Let me
> > show what I have done so far. I have three separate commands because I
> > broke the file of 4,000 files up into three separate ones because,
> > evidently, the file was too voluminous to be read alone in its entirety.
> > So, I divided the files up into three roughly similar folders. They are
> > called ‘WPSCASES’ one through three. Here is my code, with the error
> > messages for each command recorded below:
> >
> > token <-
> > tokenize("/Users/Gordon/Desktop/WPSCASESONE/",lang="en",doc_id="sample")
> >
> > The code is the same for the other folders; the name of the folder is
> > different, but otherwise identical.
> >
> > The error message reads:
> >
> > *Error in nchar(tagged.text[, "token"], type = "width") : invalid multibyte
> > string, element 348*
> >
> > The error messages are the same for the other two commands. But the
> > 'element' number is different. It's 925 for the second folder, and 4302 for
> > the third.
> >
> > token2 <-
> > tokenize("/Users/Gordon/Desktop/WPSCASES2/",lang="en",doc_id="sample")
> >
> > token3 <-
> > tokenize("/Users/Gordon/Desktop/WPSCASES3/",lang="en",doc_id="sample")
> >
> > These are the other commands if that's helpful.
> >
> > I’ve tried to discover whether the ‘element’ that the error message
> > mentions corresponds to the file of that number in the file’s order. But
> > since folder 3 does not have 4,300 files in it, I think that that was
> > unlikely. Please let me know if you can figure out how to fix this stuff so
> > that I can start to use ‘koRpus’ commands, like ‘readability’ and its
> > progeny.
> >
> > Thank you,
> > Gordon
> 
> Hi Gordon,
> Looks to me as though you may have to extract the text from the Word
> files. Export As Text.

Hi!  

quanteda::tokenize says it needs a character vector or «corpus» as input:

https://www.rdocumentation.org/packages/quanteda/versions/0.99.12/topics/tokenize

... or is this tokenize from the tokenizers package? I found something
about «doc_id» here:

https://cran.r-project.org/web/packages/tokenizers/vignettes/introduction-to-tokenizers.html

You can convert docx to markdown using 
pandoc:

pandoc --from docx --to markdown $inputfile

odt also works, and many others.

I believe pandoc is included in RStudio, but I have never used it from
there myself, so that is really bad advice, I think.
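
For a folder of several thousand files, the pandoc command above could be
driven from R. The following is a rough sketch, assuming pandoc is on the
PATH and the inputs are .docx files; docx_to_text is a hypothetical helper
name, not a function from any package:

```r
# Convert every .docx file in `dir` to plain text with pandoc.
# Returns the paths of the .txt files written (character(0) if none found).
docx_to_text <- function(dir, pandoc = unname(Sys.which("pandoc"))) {
  files <- list.files(dir, pattern = "\\.docx$", full.names = TRUE)
  if (length(files) == 0L) return(character(0))
  if (!nzchar(pandoc)) stop("pandoc was not found on the PATH")

  out <- sub("\\.docx$", ".txt", files)
  for (i in seq_along(files)) {
    status <- system2(pandoc,
                      c("--from", "docx", "--to", "plain",
                        "--output", shQuote(out[i]), shQuote(files[i])))
    if (status != 0) warning("pandoc failed for ", files[i])
  }
  out
}
```

A call such as docx_to_text("/Users/Gordon/Desktop/WPSCASESONE") would then
leave one .txt file next to each Word file, ready to be read as plain text.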

To read doc, I use wvHtml:

wvHtml $inputfile - 2> /dev/null | w3m -dump -T text/html
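
Once the text has been extracted, an "invalid multibyte string" error usually
points at bytes that are not valid in the session's encoding. A minimal base R
sketch for locating and re-encoding such elements; the sample strings are
hypothetical, and treating the stray bytes as Latin-1 is an assumption that
would need checking against the real files:

```r
# Hypothetical example vector: the second element carries a raw Latin-1 byte
# (0xE9) and is therefore not valid UTF-8.
txt <- c("plain ascii", "caf\xe9 latin1 byte")

ok  <- validUTF8(txt)    # FALSE flags the offending elements
bad <- which(!ok)        # positions to inspect, comparable to "element 348"

# Assumption: the stray bytes are Latin-1; re-encode everything as UTF-8.
fixed <- iconv(txt, from = "latin1", to = "UTF-8")

# The call that failed in the report now runs on the cleaned vector
widths <- nchar(fixed, type = "width")
```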

Rasmus




Re: [R-es] Formula with randomly chosen variables

2021-01-07 Thread José Luis Cañadas
Hello Manuel.
Have you considered fitting a random forest, setting it to use all the data
in each bootstrap sample and a percentage of the variables?

On Thu, 7 Jan 2021 at 1:42, Carlos Ortega wrote:

> Hello Manuel,
>
> Here is one way; I use the "car90" data set that is included in
> "rpart".
>
> #-
> library(rpart)
>
> data(car90)
> target <- c('Mileage')
> vars   <- setdiff(names(car90), target)
>
> num_loops <- 10
> for( i in 1:num_loops) {
>    num_vars <- 6
>    vars_samp <- vars[ sample(1:length(vars), num_vars)]
>    fmla <- as.formula(paste(target, " ~ ", paste(vars_samp, collapse = "+")))
>    fit <- rpart(fmla, data = car90)
>    print(fit)
> }
>
> #-
>
> This can be made more sophisticated, even capturing the output of each
> iteration... :-).
>
> Thanks,
> Carlos Ortega
> www.qualityexcellence.es
>
> On Wed, 6 Jan 2021 at 22:44, Manuel Mendoza (<
> mmend...@fulbrightmail.org>)
> wrote:
>
> > Hello, I am fitting a regression tree (although it could be any other
> > analysis) inside a loop, and I want it to take a different set of
> > variables on each pass. The df has 19 predictors, but I want it to use
> > only 6 of them, at random, each time. What should I put where the
> > question mark is?
> >
> >   fit<- rpart(IFd ~ ? , data=training)
> >
> > Thanks, as always,
> > Manuel
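
Carlos's loop could also be written so that every fit is kept together with
the variables it used. A sketch under stated assumptions: mtcars stands in
for the poster's training data (which is not available here), mpg stands in
for IFd, and runs is an illustrative name:

```r
library(rpart)

training   <- mtcars          # stand-in data set, not the poster's data
target     <- "mpg"           # stand-in for the response IFd
predictors <- setdiff(names(training), target)

set.seed(123)                 # make the random draws repeatable
runs <- lapply(seq_len(10), function(i) {
  vars_samp <- sample(predictors, 6)                  # 6 predictors at random
  fmla <- reformulate(vars_samp, response = target)   # builds mpg ~ a + b + ...
  list(vars = vars_samp, fit = rpart(fmla, data = training))
})

# Inspect which variables were drawn in the first run
runs[[1]]$vars
```

reformulate() does the same job as pasting the formula together by hand, and
returning a list from lapply() captures each iteration instead of only
printing it.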


Re: [R] Error Running Bootstrap Function within Wrapper Function

2021-01-07 Thread Kevin Egan
Hi Stephen,

Thanks for the advice. From my understanding, the function foo1 should
provide a vector of coefficients whose length matches the columns
in the data frame provided. This is based on the results from glmnet. The
same should hold for foo2: I've provided an if statement that
performs OLS if the sum of coef_nonzero is greater than 0, meaning there
should be at least one independent variable from the original data set with
which to perform OLS, since its coefficient from glmnet was nonzero.

I previously have not had any issues with this method for other tests I
have run, it seems to only occur as this data increases in observations.
Would it be possible for you to look at the attached file? I provide a new
foo1 function that generates the data I am using, the same 3 functions I
sent previously, and the function I am using for glmnet. I'm still not sure
why, but as the number of bootstrap samples increases, I get an error with
my methods, particularly with foo3.

At this point, I think the problem could be with the wrapper function or
with the data itself.

I appreciate any help I can get at this point. I've been trying to debug
this function for almost a month.

Thanks,

Kevin


https://www.dropbox.com/s/1sdidrnmzrqmukt/r-help%20example.R?dl=0

On Wed, Jan 6, 2021 at 12:48 PM Stephen Ellison 
wrote:

> Kevin,
>
> I didn't have the data set (you might want to post a link to a
> downloadable file instead), but on a quick look at the code your foo1
> function looks as if it is not guaranteed to return an array of the same
> length each time. It's testing for nonzero coefs in a fitted model and then
> dropping exact zero coefs. That need not (and often will not) return the
> same coefficients every time in a simulation.  foo2 does the same general
> kind of thing.
>
> Could that be part of the problem?
>
> S Ellison
>
>
> > -----Original Message-----
> > From: R-help  On Behalf Of Kevin Egan
> > Sent: 05 January 2021 12:26
> > To: r-help 
> > Subject: [R] Error Running Bootstrap Function within Wrapper Function
> >
> > Hello,
> >
> > I am currently trying to solve a problem with the boot package and writing a
> > function within a function in R. I have developed several functions to
> > perform the lasso but continue to receive an error when bootstrapping these
> > functions within a wrapper function. When I perform these methods using
> > tsboot outside the wrapper function, I do not get an error. However, when
> > placed within my function I continue to get the error "Error in t[r, ] <-
> > res[[r]] : number of items to replace is not a multiple of length."
> >
> > I've attached an example of my functions as well as a data file of the data
> > I am using. I'm sorry the file is so large, but I do not get a problem with a
> > smaller number of observations.
> >
> >
> > library(boot)
> > library(glmnet)
> > library(np)
> > library(magrittr)  # provides %>%, which the functions below use
> > foo1 <- function(data, index) { # index is the bootstrap sample index
> >   x <- data[index, -1] %>%
> >     as.matrix() %>%
> >     unname()
> >   y <- data[index, 1] %>%
> >     scale(center = TRUE, scale = FALSE) %>%
> >     as.matrix() %>%
> >     unname()
> >   ols <- lm(y ~ x)
> >   # The intercept estimate should be dropped.
> >   ols.coef <- as.numeric(coef(ols))[-1]
> >   ols.coef[is.na(ols.coef)] <- 0
> >   # Weight each penalty by 1 / |OLS coefficient| (adaptive-lasso style).
> >   lasso <- cv.glmnet(x, y, alpha = 1,
> >                      penalty.factor = 1 / abs(ols.coef))
> >   # Extract the coefficients at lambda.min, dropping the intercept.
> >   coef <- as.vector(coef(lasso,
> >                          s = lasso$lambda.min))[-1]
> >   return(coef)
> > }
> > foo2 <- function(data, index) { # index is the bootstrap sample index
> >   x <- data[index, -1] %>%
> >     as.matrix() %>%
> >     unname()
> >   y <- data[index, 1] %>%
> >     scale(center = TRUE, scale = FALSE) %>%
> >     as.matrix() %>%
> >     unname()
> >   ols <- lm(y ~ x)
> >   # The intercept estimate should be dropped.
> >   ols.coef <- as.numeric(coef(ols))[-1]
> >   ols.coef[is.na(ols.coef)] <- 0
> >   # Weight each penalty by 1 / |OLS coefficient| (adaptive-lasso style).
> >   lasso <- cv.glmnet(x, y, alpha = 1,
> >                      penalty.factor = 1 / abs(ols.coef))
> >   # Extract the coefficients at lambda.min, dropping the intercept.
> >   coef <- as.vector(coef(lasso,
> >                          s = lasso$lambda.min))[-1]
> >   coef_nonzero <- coef != 0
> >   # Refit OLS on the selected columns and write the estimates back in place.
> >   if (sum(coef_nonzero) > 0) {
> >     ls_obj <- lm(y ~ x[, coef_nonzero, drop = FALSE])
> >     ls_coef <- as.vector(coef(ls_obj))[-1]
> >     coef[coef_nonzero] <- ls_coef
> >   }
> >   return(coef)
> > }
> > foo3 <- function(data, num_samples) {
> >   bstar <- b.star(data[, 1], round = TRUE)
> >   # Select the block length from the circular block result
> >   blocklength <- bstar[, 2]
> >   init_boot_ts <- tsboot(tseries = data,
> >                          statistic = foo1,
> >                          R = num_samples, l = blocklength,
> >