Re: [R] Finding combination of states

2023-09-04 Thread Bert Gunter
... and just for fun, here is a non-string version (more appropriate for
complex state labels??):

gvec <- function(ntimes, states, init, final, repeats = TRUE)
   ## ntimes: integer, number of unique times
   ## states: vector of unique states
   ## init: initial state
   ## final: final state
{
   out <- cbind(init,
                as.matrix(expand.grid(rep(list(states), ntimes - 2))),
                final)
   if(!repeats)
      ## keep rows in which no two adjacent states are equal;
      ## drop = FALSE keeps a matrix even if only one row survives
      out[apply(out, 1, \(x) all(x[-1] != x[-ntimes])), , drop = FALSE]
   else out
}

yielding:


> gvec(4, letters[1:5], "b", "e", repeats = TRUE)
  init Var1 Var2 final
 [1,] "b"  "a"  "a"  "e"
 [2,] "b"  "b"  "a"  "e"
 [3,] "b"  "c"  "a"  "e"
 [4,] "b"  "d"  "a"  "e"
 [5,] "b"  "e"  "a"  "e"
 [6,] "b"  "a"  "b"  "e"
 [7,] "b"  "b"  "b"  "e"
 [8,] "b"  "c"  "b"  "e"
 [9,] "b"  "d"  "b"  "e"
[10,] "b"  "e"  "b"  "e"
[11,] "b"  "a"  "c"  "e"
[12,] "b"  "b"  "c"  "e"
[13,] "b"  "c"  "c"  "e"
[14,] "b"  "d"  "c"  "e"
[15,] "b"  "e"  "c"  "e"
[16,] "b"  "a"  "d"  "e"
[17,] "b"  "b"  "d"  "e"
[18,] "b"  "c"  "d"  "e"
[19,] "b"  "d"  "d"  "e"
[20,] "b"  "e"  "d"  "e"
[21,] "b"  "a"  "e"  "e"
[22,] "b"  "b"  "e"  "e"
[23,] "b"  "c"  "e"  "e"
[24,] "b"  "d"  "e"  "e"
[25,] "b"  "e"  "e"  "e"
>
> gvec(4, letters[1:5], "b", "e", repeats = FALSE)
  init Var1 Var2 final
 [1,] "b"  "c"  "a"  "e"
 [2,] "b"  "d"  "a"  "e"
 [3,] "b"  "e"  "a"  "e"
 [4,] "b"  "a"  "b"  "e"
 [5,] "b"  "c"  "b"  "e"
 [6,] "b"  "d"  "b"  "e"
 [7,] "b"  "e"  "b"  "e"
 [8,] "b"  "a"  "c"  "e"
 [9,] "b"  "d"  "c"  "e"
[10,] "b"  "e"  "c"  "e"
[11,] "b"  "a"  "d"  "e"
[12,] "b"  "c"  "d"  "e"
[13,] "b"  "e"  "d"  "e"
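
A quick sanity check on the row counts above, as a sketch rather than part of the original post: the number of admissible sequences can be counted without listing them, by multiplying a states-by-states transition matrix ntimes - 1 times. The helper name count_paths is made up for illustration.

```r
## Hypothetical helper (not from the thread): count sequences instead of listing them
count_paths <- function(ntimes, states, init, final, repeats = TRUE) {
   k <- length(states)
   ## A[i, j] = 1 if a move from state i to state j is allowed
   A <- matrix(1, k, k, dimnames = list(states, states))
   if (!repeats) diag(A) <- 0   # forbid staying in the same state
   ## one matrix factor per transition between consecutive times
   P <- diag(k)
   dimnames(P) <- dimnames(A)
   for (i in seq_len(ntimes - 1)) P <- P %*% A
   P[init, final]
}

count_paths(4, letters[1:5], "b", "e", repeats = TRUE)   # 25
count_paths(4, letters[1:5], "b", "e", repeats = FALSE)  # 13
```

Both counts agree with the number of rows printed above.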

:-)

-- Bert

On Mon, Sep 4, 2023 at 2:04 PM Bert Gunter  wrote:

> Well, if strings with repeats (as you defined them) are to be excluded, I
> think it's simple just to use regular expressions to remove them.
>
> e.g.
> g <- function(ntimes, states, init, final, repeats = TRUE)
>## ntimes: integer, number of unique times
>## states: vector of unique states
>## init: initial state
>## final: final state
> {
> out <- do.call(paste0,c(init,expand.grid(rep(list(states), ntimes-2)),
> final))
> if(!repeats)
> out[!grepl(paste(paste0(states,states),  collapse = "|"),out)]
> else out
> }
> So:
>
> > g(4, LETTERS[1:5], "B", "E", repeats = FALSE)
>  [1] "BCAE" "BDAE" "BEAE" "BABE" "BCBE" "BDBE" "BEBE" "BACE"
>  [9] "BDCE" "BECE" "BADE" "BCDE" "BEDE"
>
> Perhaps not the most efficient way to do this, of course.
>
> Cheers,
> Bert
>
>
> On Mon, Sep 4, 2023 at 12:57 PM Eric Berger  wrote:
>
>> My initial response was buggy and also used a deprecated function.
>> Also, it seems possible that one may want to rule out any strings where
>> the same state appears consecutively.
>> I say that such a string has a repeat.
>>
>> myExpand <- function(v, n) {
>>   do.call(tidyr::expand_grid, replicate(n, v, simplify = FALSE))
>> }
>>
>> no_repeat <- function(s) {
>>   v <- unlist(strsplit(s, NULL))
>>   sum(v[-1]==v[-length(v)]) == 0
>> }
>>
>> f <- function(states, nsteps, first, last, rm_repeat=TRUE) {
>>   if (nsteps < 3) stop("nsteps must be at least 3")
>> out <- paste(first,
>>   myExpand(states, nsteps-2) |>
>> apply(MAR=1, \(x) paste(x, collapse="")),
>>   last, sep="")
>> if (rm_repeat) {
>>   ok <- sapply(out, no_repeat)
>>   out <- out[ok]
>> }
>> out
>> }
>>
>> f(LETTERS[1:5],4,"B","E")
>>
>> #  [1] "BABE" "BACE" "BADE" "BCAE" "BCBE" "BCDE" "BDAE" "BDBE" "BDCE"
>> # [10] "BEAE" "BEBE" "BECE" "BEDE"
>>
>> On Mon, Sep 4, 2023 at 10:33 PM Bert Gunter 
>> wrote:
>>
>>> Sorry, my last line should have read:
>>>
>>> If neither this nor any of the other suggestions is what is desired, I
>>> think the OP will have to clarify his query.
>>>
>>> Bert
>>>
>>> On Mon, Sep 4, 2023 at 12:31 PM Bert Gunter 
>>> wrote:
>>>
>>>> I think there may be some uncertainty here about what the OP requested.
>>>> My interpretation is:
>>>>
>>>> n different times
>>>> k different states
>>>> Any state can appear at any time in the vector of times and can be
>>>> repeated
>>>> Initial and final states are given
>>>>
>>>> So modifying Tim's expand.grid() solution a bit yields:
>>>>
>>>> g <- function(ntimes, states, init, final){
>>>>    ## ntimes: integer, number of unique times
>>>>    ## states: vector of unique states
>>>>    ## init: initial state
>>>>    ## final: final state
>>>> do.call(paste0,c(init,expand.grid(rep(list(states), ntimes-2)), final))
>>>> }
>>>>
>>>> e.g.
>>>>
>>>> > g(4, LETTERS[1:5], "B", "D")
>>>>  [1] "BAAD" "BBAD" "BCAD" "BDAD" "BEAD" "BABD" "BBBD" "BCBD"
>>>>  [9] "BDBD" "BEBD" "BACD" "BBCD" "BCCD" "BDCD" "BECD" "BADD"
>>>> [17] "BBDD" "BCDD" "BDDD" "BEDD" "BAED" "BBED" "BCED" "BDED"
>>>> [25] "BEED"
>>>>
>>>> If neither this nor any of the other suggestions is not what is
>>>> desired, I think the OP will have to clarify his query.
>>>>
>>>> Cheers,
>>>> Bert
>>>>
>>>> On Mon, Sep 4, 2023 at 9:25 AM Ebert,Timothy Aaron
>>>> wrote:
>>>>
>>>>> Does this work for you?
>>>>>
>>>>>

Re: [R] aggregate formula - differing results

2023-09-04 Thread Achim Zeileis

On Mon, 4 Sep 2023, Ivan Calandra wrote:

> Thanks Rui for your help; that would be one possibility indeed.
>
> But am I the only one who finds that behavior of aggregate() completely
> unexpected and confusing? Especially considering that dplyr::summarise() and
> doBy::summaryBy() deal with NAs differently, even though they all use
> mean(na.rm = TRUE) to calculate the group stats.


I agree with Rui that this behaves as documented but I also agree with 
Ivan that the behavior is potentially confusing. Not so much because other 
packages behave differently but mostly because the handling of missing 
values differs between the different aggregate() methods.


Based on my teaching experience, I feel that a default of 
na.action=na.pass would be less confusing, especially in the case with 
multivariate "response".


In the univariate case the discrepancy can be surprising - in the default 
method you need na.rm=TRUE but in the formula method you get the same 
result without additional arguments (due to na.action=na.omit). But in the 
multivariate case the discrepancy is not obvious, especially for 
beginners, because the results in other variables without NAs are affected 
as well.


A minimal toy example is the following data with two groups (x = A vs. B) 
and two "responses" (y without NAs and z with NA):


d <- data.frame(x = c("A", "A", "B", "B"), y = 1:4, z = c(1:3, NA))
d
##   x y  z
## 1 A 1  1
## 2 A 2  2
## 3 B 3  3
## 4 B 4 NA

Except for naming of the columns, both of the following summaries for y by 
x (without NAs) yield the same result:


aggregate(d$y, list(d$x), FUN = mean)
aggregate(y ~ x, data = d, FUN = mean)
##   x   y
## 1 A 1.5
## 2 B 3.5

For a single variable _with_ NAs, the default method needs the na.rm = 
TRUE argument, the formula method does not. Again, except for naming of the 
columns:


aggregate(d$z, list(d$x), FUN = mean, na.rm = TRUE)
aggregate(z ~ x, data = d, FUN = mean)
##   x   z
## 1 A 1.5
## 2 B 3.0

Conversely, if you do want the NAs in the groups, the following two are 
the same (except for naming):


aggregate(d$z, list(d$x), FUN = mean)
aggregate(z ~ x, data = d, FUN = mean, na.action = na.pass)
##   x   z
## 1 A 1.5
## 2 B  NA

But in the multivariate case, it is not so obvious why the following two 
commands differ in their results for y (!), the variable without NAs, in 
group B:


aggregate(d[, c("y", "z")], list(d$x), FUN = mean, na.rm = TRUE)
##   Group.1   y   z
## 1       A 1.5 1.5
## 2       B 3.5 3.0
##           ^^^

aggregate(cbind(y, z) ~ x, data = d, FUN = mean)
##   x   y   z
## 1 A 1.5 1.5
## 2 B 3.0 3.0
##     ^^^

Hence, in my programming courses I tell students to use na.action=na.pass 
in the formula method and to handle NAs in the FUN argument themselves.


I guess that this is not important enough to change the default in 
aggregate.formula. Or are there R core members who also find that this 
inconsistency between the different methods is worth addressing?


If not, maybe an explicit example could be added on the help page? Showing 
something like this might help:


## default: omit entire row 4 where z=NA
aggregate(cbind(y, z) ~ x, data = d, FUN = mean)
##   x   y   z
## 1 A 1.5 1.5
## 2 B 3.0 3.0

## alternatively: omit row 4 only for z result but not for y result
aggregate(cbind(y, z) ~ x, data = d, FUN = mean, na.action = na.pass, na.rm = 
TRUE)
##   x   y   z
## 1 A 1.5 1.5
## 2 B 3.5 3.0
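
The mechanism behind the two results can be mimicked by hand. This is only a sketch, on the assumption that the formula method applies na.action to all variables in the formula before FUN is called; the object names kept, y_omit and y_pass are made up for illustration.

```r
d <- data.frame(x = c("A", "A", "B", "B"), y = 1:4, z = c(1:3, NA))

## na.omit drops every row that has an NA in any of the variables
kept <- na.omit(d)
nrow(kept)                                # 3: row 4 is gone for y as well as z
y_omit <- tapply(kept$y, kept$x, mean)    # group B uses row 3 only

## with all rows kept, na.rm = TRUE only matters for variables that have NAs
y_pass <- tapply(d$y, d$x, mean, na.rm = TRUE)   # group B uses rows 3 and 4

y_omit["B"]   # 3
y_pass["B"]   # 3.5
```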

Best wishes,
Achim


On 04/09/2023 13:46, Rui Barradas wrote:

At 10:44 on 04/09/2023, Ivan Calandra wrote:

Dear useRs,

I have just stumbled across a behavior in aggregate() that I cannot 
explain. Any help would be appreciated!


Sample data:
my_data <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100", 
"FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", "HORN-103", 
"HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 130.1, 140.41, 121.37, 
70.52, 122.3, 71.01, 104.5), SurfaceArea = c(1736.87, 1571.83, 1656.46, 
1247.18, 1177.47, 1169.26, 444.61, 1791.48, 461.15, 1127.2), Length = 
c(44.384, 29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 32.075, 21.337, 
35.459), Width = c(45.982, 67.303, 52.679, 26.42, 25.149, 33.427, 20.683, 
62.783, 26.417, 35.297), PLATWIDTH = c(38.84, NA, 15.33, 30.37, 11.44, 
14.88, 13.86, NA, NA, 26.71), PLATTHICK = c(8.67, NA, 7.99, 11.69, 3.3, 
16.52, 4.58, NA, NA, 9.35), EPA = c(78, NA, 78, 54, 72, 49, 56, NA, NA, 
56), THICKNESS = c(10.97, NA, 9.36, 6.4, 5.89, 11.05, 4.9, NA, NA, 10.08), 
WEIGHT = c(34.3, NA, 25.5, 18.6, 14.9, 29.5, 4.5, NA, NA, 23), RAWMAT = 
c("FLINT", "FLINT", "FLINT", "FLINT", "FLINT", "HORNFELS", "HORNFELS", 
"HORNFELS", "HORNFELS", "HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 
111L, 112L, 113L, 114L, 115L), class = "data.frame")


1) Simple aggregation with 2 variables:
aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data, FUN = mean, na.rm 
= TRUE)


2) Using the dot notation - different results:
aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE)

3) Using 

Re: [R] Finding combination of states

2023-09-04 Thread Bert Gunter
Well, if strings with repeats (as you defined them) are to be excluded, I
think it's simple just to use regular expressions to remove them.

e.g.
g <- function(ntimes, states, init, final, repeats = TRUE)
   ## ntimes: integer, number of unique times
   ## states: vector of unique states
   ## init: initial state
   ## final: final state
{
out <- do.call(paste0,c(init,expand.grid(rep(list(states), ntimes-2)),
final))
if(!repeats)
   out[!grepl(paste(paste0(states,states),  collapse = "|"),out)]
else out
}
So:

> g(4, LETTERS[1:5], "B", "E", repeats = FALSE)
 [1] "BCAE" "BDAE" "BEAE" "BABE" "BCBE" "BDBE" "BEBE" "BACE"
 [9] "BDCE" "BECE" "BADE" "BCDE" "BEDE"

Perhaps not the most efficient way to do this, of course.
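
A side note on pattern-based filtering in general, as a sketch: negative indexing with grep() returns an empty vector when nothing matches, because x[-integer(0)] selects nothing, whereas !grepl() keeps everything in that case. For single-character state labels, a backreference also detects adjacent repeats without building the alternation. The helper name drop_repeats is hypothetical.

```r
## Hypothetical helper: drop strings in which some character is immediately repeated
## (only meaningful when every state label is a single character)
drop_repeats <- function(x) x[!grepl("(.)\\1", x)]

x <- c("BABE", "BBAE", "BAEE")
drop_repeats(x)                 # "BABE"

## the pitfall with -grep(): no match means nothing is kept
y <- c("BAE", "BCE")
y[-grep("(.)\\1", y)]           # character(0)
drop_repeats(y)                 # "BAE" "BCE"
```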

Cheers,
Bert


On Mon, Sep 4, 2023 at 12:57 PM Eric Berger  wrote:

> My initial response was buggy and also used a deprecated function.
> Also, it seems possible that one may want to rule out any strings where
> the same state appears consecutively.
> I say that such a string has a repeat.
>
> myExpand <- function(v, n) {
>   do.call(tidyr::expand_grid, replicate(n, v, simplify = FALSE))
> }
>
> no_repeat <- function(s) {
>   v <- unlist(strsplit(s, NULL))
>   sum(v[-1]==v[-length(v)]) == 0
> }
>
> f <- function(states, nsteps, first, last, rm_repeat=TRUE) {
>   if (nsteps < 3) stop("nsteps must be at least 3")
> out <- paste(first,
>   myExpand(states, nsteps-2) |>
> apply(MAR=1, \(x) paste(x, collapse="")),
>   last, sep="")
> if (rm_repeat) {
>   ok <- sapply(out, no_repeat)
>   out <- out[ok]
> }
> out
> }
>
> f(LETTERS[1:5],4,"B","E")
>
> #  [1] "BABE" "BACE" "BADE" "BCAE" "BCBE" "BCDE" "BDAE" "BDBE" "BDCE"
> # [10] "BEAE" "BEBE" "BECE" "BEDE"
>
> On Mon, Sep 4, 2023 at 10:33 PM Bert Gunter 
> wrote:
>
>> Sorry, my last line should have read:
>>
>> If neither this nor any of the other suggestions is what is desired, I
>> think the OP will have to clarify his query.
>>
>> Bert
>>
>> On Mon, Sep 4, 2023 at 12:31 PM Bert Gunter 
>> wrote:
>>
>>> I think there may be some uncertainty here about what the OP requested.
>>> My interpretation is:
>>>
>>> n different times
>>> k different states
>>> Any state can appear at any time in the vector of times and can be
>>> repeated
>>> Initial and final states are given
>>>
>>> So modifying Tim's expand.grid() solution a bit yields:
>>>
>>> g <- function(ntimes, states, init, final){
>>>## ntimes: integer, number of unique times
>>>## states: vector of unique states
>>>## init: initial state
>>>## final: final state
>>> do.call(paste0,c(init,expand.grid(rep(list(states), ntimes-2)), final))
>>> }
>>>
>>> e.g.
>>>
>>> > g(4, LETTERS[1:5], "B", "D")
>>>  [1] "BAAD" "BBAD" "BCAD" "BDAD" "BEAD" "BABD" "BBBD" "BCBD"
>>>  [9] "BDBD" "BEBD" "BACD" "BBCD" "BCCD" "BDCD" "BECD" "BADD"
>>> [17] "BBDD" "BCDD" "BDDD" "BEDD" "BAED" "BBED" "BCED" "BDED"
>>> [25] "BEED"
>>>
>>> If neither this nor any of the other suggestions is not what is desired,
>>> I think the OP will have to clarify his query.
>>>
>>> Cheers,
>>> Bert
>>>
>>> On Mon, Sep 4, 2023 at 9:25 AM Ebert,Timothy Aaron 
>>> wrote:
>>>
>>>> Does this work for you?
>>>>
>>>> t0<-t1<-t2<-LETTERS[1:5]
>>>> al2<-expand.grid(t0, t1, t2)
>>>> al3<-paste(al2$Var1, al2$Var2, al2$Var3)
>>>> al4 <- gsub(" ", "", al3)
>>>> head(al3)
>>>>
>>>> Tim
>>>>
>>>> -Original Message-
>>>> From: R-help  On Behalf Of Eric Berger
>>>> Sent: Monday, September 4, 2023 10:17 AM
>>>> To: Christofer Bogaso
>>>> Cc: r-help
>>>> Subject: Re: [R] Finding combination of states
>>>>
>>>> [External Email]
>>>>
>>>> The function purrr::cross() can help you with this. For example:
>>>>
>>>> f <- function(states, nsteps, first, last) {
>>>>    paste(first, unlist(lapply(purrr::cross(rep(list(v),nsteps-2)),
>>>>       \(x) paste(unlist(x), collapse=""))), last, sep="")
>>>> }
>>>> f(LETTERS[1:5], 3, "B", "E")
>>>> [1] "BAE" "BBE" "BCE" "BDE" "BEE"
>>>>
>>>> HTH,
>>>> Eric
>>>>
>>>>
>>>> On Mon, Sep 4, 2023 at 3:42 PM Christofer Bogaso <
>>>> bogaso.christo...@gmail.com> wrote:
>>>> >
>>>> > Let's say I have 3 time points, T0, T1, and T2 (the number of such time
>>>> > points can be arbitrary). At each time point, an object can be in any of 5
>>>> > states, A, B, C, D, E (the number of such states can be arbitrary).
>>>> >
>>>> > I need to find all possible ways in which that object, starting in state
>>>> > B (say) at time T0, can be in state E (for example) at time T2.
>>>> >
>>>> > For example, one possibility is BAE.
>>>> >
>>>> > Is there any function available in R that can give me a vector of
>>>> > such possibilities for an arbitrary number of states and times, and for
>>>> > given initial and final (desired) states?
>>>> >
>>>> > Any pointer will be much appreciated.
>>>> >
>>>> > Thanks for your time.
>>>> >
>>>> > __
>>>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> > https://stat/
>>>> >

Re: [R-es] problemas al instalar "dplyr" - problems installing "dplyr"

2023-09-04 Thread J.M. Santiago

Many thanks for the solution. It worked for me. I had a somewhat "old"
version of Rtools installed, and that may have been the cause.

Regards,


José María

---

Dr. José María Santiago Sáez (PhD)
Madrid - España/Spain
+34 646 165 291
jms...@picos.com
https://sites.google.com/site/santiagojosemaria/ 


On 04.09.2023 21:42, Carlos Ortega wrote:

Hi José,

I see you are on Windows, judging by the path that appears in the error...
In that case, this is what you need to do:


* I don't know whether you installed RTools with earlier versions.
* If not, you should do so: "dplyr" looks for a compiler in order to
build itself and cannot find one. That is exactly what RTools provides.
* Since you are using version 4.3, you need to install it from here:
https://cran.r-project.org/bin/windows/Rtools/rtools43/rtools.html

Thanks!
Carlos.
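
A quick way to see whether R can find a compiler at all, as a base-R sketch rather than an official diagnostic: Sys.which() returns an empty string for any program that is not on the PATH. The helper name tools_found is made up for illustration, and the tool names "make" and "g++" are taken from the error log.

```r
## Hypothetical helper: TRUE for each build tool that R can find on the PATH
tools_found <- function(tools = c("make", "g++")) {
   found <- nzchar(Sys.which(tools))
   names(found) <- tools
   found
}

tools_found()
## if g++ is reported as FALSE, source packages with compiled code cannot be built
```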

On Mon, 4 Sep 2023 at 21:30, J.M. Santiago () wrote:

Hi all,

I have installed the latest versions of R and RStudio but I can't install the "dplyr" package. When I try, I get the following message.

Please, if anybody has an idea of how to solve this problem, I would be very grateful for help.

Thank you very much,

José M.


Installing package into 'C:/Users/jmsan/AppData/Local/R/win-library/4.3'
(as 'lib' is unspecified)

There is a binary version available but the source version is later:
binary source needs_compilation
dplyr  1.1.2  1.1.3  TRUE

installing the source package 'dplyr'

trying URL 'https://cran.rstudio.com/src/contrib/dplyr_1.1.3.tar.gz'
Content type 'application/x-gzip' length 1083635 bytes (1.0 MB)
downloaded 1.0 MB

* installing *source* package 'dplyr' ...
** package 'dplyr' successfully unpacked and MD5 sums checked
** using staged installation
** libs
g++ -std=gnu++17  -I"C:/PROGRA~1/R/R-43~1.1/include" -DNDEBUG 
-I"c:/rtools43/x86_64-w64-mingw32.static.posix/include" -O2 -Wall  -mfpmath=sse 
-msse2 -mstackrealign  -c chop.cpp -o chop.o
sh: line 1: g++: command not found
make: *** [C:/PROGRA~1/R/R-43~1.1/etc/x64/Makeconf:272: chop.o] Error 127
ERROR: compilation failed for package 'dplyr'
* removing 'C:/Users/jmsan/AppData/Local/R/win-library/4.3/dplyr'
Warning in install.packages :
installation of package 'dplyr' had non-zero exit status

---

Dr. José María Santiago Sáez (PhD)
Madrid - España/Spain
+34 646 165 291
jms...@picos.com
https://sites.google.com/site/santiagojosemaria/ 
___

R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


--
Saludos,
Carlos Ortega
www.qualityexcellence.es [1]



Links:
--
[1] http://www.qualityexcellence.es


Re: [R-es] problemas al instalar "dplyr" - problems installing "dplyr"

2023-09-04 Thread J.M. Santiago

Many thanks for the solution. It worked for me. I had a somewhat "old"
version of Rtools installed, and that may have been the cause.

Regards,


José María

---

Dr. José María Santiago Sáez (PhD)
Madrid - España/Spain
+34 646 165 291
jms...@picos.com
https://sites.google.com/site/santiagojosemaria/ 


On 04.09.2023 21:46, Emilio L. Cano wrote:

Hi,

I suppose it is asking whether you want to install from source because there is a newer version than the one available as a binary (I assume you are using Windows).

So either answer NO to that question, or install Rtools.

Hope this helps, regards,
Emilio
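
The "answer NO" route can also be taken non-interactively. The following is a sketch only; the install.packages() calls are left commented out because they need network access, and which of the two options suits a given setup is an assumption to check against the install.packages and options help pages.

```r
## Option 1: request the binary build explicitly (Windows and macOS only)
## install.packages("dplyr", type = "binary")

## Option 2: stop install.packages() from offering newer source versions
old <- options(install.packages.check.source = "no")
getOption("install.packages.check.source")   # "no"
## install.packages("dplyr")
options(old)   # restore the previous setting
```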



On 4 Sep 2023, at 21:23, J.M. Santiago  wrote:


Hi all,

I have installed the latest versions of R and RStudio but I can't install the "dplyr" package. When I try, I get the following message.

Please, if anybody has an idea of how to solve this problem, I would be very grateful for help.

Thank you very much,

José M.


Installing package into 'C:/Users/jmsan/AppData/Local/R/win-library/4.3'
(as 'lib' is unspecified)

There is a binary version available but the source version is later:
binary source needs_compilation
dplyr  1.1.2  1.1.3  TRUE

installing the source package 'dplyr'

trying URL 'https://cran.rstudio.com/src/contrib/dplyr_1.1.3.tar.gz'
Content type 'application/x-gzip' length 1083635 bytes (1.0 MB)
downloaded 1.0 MB

* installing *source* package 'dplyr' ...
** package 'dplyr' successfully unpacked and MD5 sums checked
** using staged installation
** libs
g++ -std=gnu++17  -I"C:/PROGRA~1/R/R-43~1.1/include" -DNDEBUG 
-I"c:/rtools43/x86_64-w64-mingw32.static.posix/include" -O2 -Wall  -mfpmath=sse 
-msse2 -mstackrealign  -c chop.cpp -o chop.o
sh: line 1: g++: command not found
make: *** [C:/PROGRA~1/R/R-43~1.1/etc/x64/Makeconf:272: chop.o] Error 127
ERROR: compilation failed for package 'dplyr'
* removing 'C:/Users/jmsan/AppData/Local/R/win-library/4.3/dplyr'
Warning in install.packages :
installation of package 'dplyr' had non-zero exit status

---

Dr. José María Santiago Sáez (PhD)
Madrid - España/Spain
+34 646 165 291
jms...@picos.com
https://sites.google.com/site/santiagojosemaria/


Re: [R] Finding combination of states

2023-09-04 Thread Eric Berger
My initial response was buggy and also used a deprecated function.
Also, it seems possible that one may want to rule out any strings where the
same state appears consecutively.
I say that such a string has a repeat.

myExpand <- function(v, n) {
  do.call(tidyr::expand_grid, replicate(n, v, simplify = FALSE))
}

no_repeat <- function(s) {
  v <- unlist(strsplit(s, NULL))
  sum(v[-1]==v[-length(v)]) == 0
}

f <- function(states, nsteps, first, last, rm_repeat=TRUE) {
  if (nsteps < 3) stop("nsteps must be at least 3")
  out <- paste(first,
               myExpand(states, nsteps-2) |>
                 apply(MARGIN=1, \(x) paste(x, collapse="")),
               last, sep="")
  if (rm_repeat) {
    ok <- sapply(out, no_repeat)
    out <- out[ok]
  }
  out
}

f(LETTERS[1:5],4,"B","E")

#  [1] "BABE" "BACE" "BADE" "BCAE" "BCBE" "BCDE" "BDAE" "BDBE" "BDCE"
# [10] "BEAE" "BEBE" "BECE" "BEDE"

On Mon, Sep 4, 2023 at 10:33 PM Bert Gunter  wrote:

> Sorry, my last line should have read:
>
> If neither this nor any of the other suggestions is what is desired, I
> think the OP will have to clarify his query.
>
> Bert
>
> On Mon, Sep 4, 2023 at 12:31 PM Bert Gunter 
> wrote:
>
>> I think there may be some uncertainty here about what the OP requested.
>> My interpretation is:
>>
>> n different times
>> k different states
>> Any state can appear at any time in the vector of times and can be
>> repeated
>> Initial and final states are given
>>
>> So modifying Tim's expand.grid() solution a bit yields:
>>
>> g <- function(ntimes, states, init, final){
>>## ntimes: integer, number of unique times
>>## states: vector of unique states
>>## init: initial state
>>## final: final state
>> do.call(paste0,c(init,expand.grid(rep(list(states), ntimes-2)), final))
>> }
>>
>> e.g.
>>
>> > g(4, LETTERS[1:5], "B", "D")
>>  [1] "BAAD" "BBAD" "BCAD" "BDAD" "BEAD" "BABD" "BBBD" "BCBD"
>>  [9] "BDBD" "BEBD" "BACD" "BBCD" "BCCD" "BDCD" "BECD" "BADD"
>> [17] "BBDD" "BCDD" "BDDD" "BEDD" "BAED" "BBED" "BCED" "BDED"
>> [25] "BEED"
>>
>> If neither this nor any of the other suggestions is not what is desired,
>> I think the OP will have to clarify his query.
>>
>> Cheers,
>> Bert
>>
>> On Mon, Sep 4, 2023 at 9:25 AM Ebert,Timothy Aaron 
>> wrote:
>>
>>> Does this work for you?
>>>
>>> t0<-t1<-t2<-LETTERS[1:5]
>>> al2<-expand.grid(t0, t1, t2)
>>> al3<-paste(al2$Var1, al2$Var2, al2$Var3)
>>> al4 <- gsub(" ", "", al3)
>>> head(al3)
>>>
>>> Tim
>>>
>>> -Original Message-
>>> From: R-help  On Behalf Of Eric Berger
>>> Sent: Monday, September 4, 2023 10:17 AM
>>> To: Christofer Bogaso 
>>> Cc: r-help 
>>> Subject: Re: [R] Finding combination of states
>>>
>>> [External Email]
>>>
>>> The function purrr::cross() can help you with this. For example:
>>>
>>> f <- function(states, nsteps, first, last) {
>>>    paste(first, unlist(lapply(purrr::cross(rep(list(v),nsteps-2)),
>>>       \(x) paste(unlist(x), collapse=""))), last, sep="")
>>> }
>>> f(LETTERS[1:5], 3, "B", "E")
>>> [1] "BAE" "BBE" "BCE" "BDE" "BEE"
>>>
>>> HTH,
>>> Eric
>>>
>>>
>>> On Mon, Sep 4, 2023 at 3:42 PM Christofer Bogaso <
>>> bogaso.christo...@gmail.com> wrote:
>>> >
>>> > Let's say I have 3 time points, T0, T1, and T2 (the number of such time
>>> > points can be arbitrary). At each time point, an object can be in any of 5
>>> > states, A, B, C, D, E (the number of such states can be arbitrary).
>>> >
>>> > I need to find all possible ways in which that object, starting in state
>>> > B (say) at time T0, can be in state E (for example) at time T2.
>>> >
>>> > For example, one possibility is BAE.
>>> >
>>> > Is there any function available in R that can give me a vector of
>>> > such possibilities for an arbitrary number of states and times, and for
>>> > given initial and final (desired) states?
>>> >
>>> > Any pointer will be much appreciated.
>>> >
>>> > Thanks for your time.
>>> >
>>> > __
>>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > PLEASE do read the posting guide
>>> > http://www.r-project.org/posting-guide.html
>>> > and provide commented, minimal, self-contained, reproducible code.
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> 

Re: [R-es] problemas al instalar "dplyr" - problems installing "dplyr"

2023-09-04 Thread Emilio L. Cano
Hi,

I suppose it is asking whether you want to install from source because there is a newer version than the one available as a binary (I assume you are using Windows).

So either answer NO to that question, or install Rtools.

Hope this helps, regards,
Emilio

On 4 Sep 2023, at 21:23, J.M. Santiago  wrote:

Hi all,

I have installed the latest versions of R and RStudio but I can't install the "dplyr" package. When I try, I get the following message.

Please, if anybody has an idea of how to solve this problem, I would be very grateful for help.

Thank you very much,

José M.

Installing package into ‘C:/Users/jmsan/AppData/Local/R/win-library/4.3’
(as ‘lib’ is unspecified)

  There is a binary version available but the source version is later:
  binary source needs_compilation
dplyr  1.1.2  1.1.3  TRUE

installing the source package ‘dplyr’

trying URL 'https://cran.rstudio.com/src/contrib/dplyr_1.1.3.tar.gz'
Content type 'application/x-gzip' length 1083635 bytes (1.0 MB)
downloaded 1.0 MB

* installing *source* package 'dplyr' ...
** package 'dplyr' successfully unpacked and MD5 sums checked
** using staged installation
** libs
g++ -std=gnu++17  -I"C:/PROGRA~1/R/R-43~1.1/include" -DNDEBUG -I"c:/rtools43/x86_64-w64-mingw32.static.posix/include" -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign  -c chop.cpp -o chop.o
sh: line 1: g++: command not found
make: *** [C:/PROGRA~1/R/R-43~1.1/etc/x64/Makeconf:272: chop.o] Error 127
ERROR: compilation failed for package 'dplyr'
* removing 'C:/Users/jmsan/AppData/Local/R/win-library/4.3/dplyr'
Warning in install.packages :
  installation of package ‘dplyr’ had non-zero exit status



---

Dr. José María Santiago Sáez (PhD)
Madrid - España/Spain
+34 646 165 291
jms...@picos.com
https://sites.google.com/site/santiagojosemaria/


Re: [R] Finding combination of states

2023-09-04 Thread Bert Gunter
Sorry, my last line should have read:

If neither this nor any of the other suggestions is what is desired, I
think the OP will have to clarify his query.

Bert

On Mon, Sep 4, 2023 at 12:31 PM Bert Gunter  wrote:

> I think there may be some uncertainty here about what the OP requested. My
> interpretation is:
>
> n different times
> k different states
> Any state can appear at any time in the vector of times and can be repeated
> Initial and final states are given
>
> So modifying Tim's expand.grid() solution a bit yields:
>
> g <- function(ntimes, states, init, final){
>## ntimes: integer, number of unique times
>## states: vector of unique states
>## init: initial state
>## final: final state
> do.call(paste0,c(init,expand.grid(rep(list(states), ntimes-2)), final))
> }
>
> e.g.
>
> > g(4, LETTERS[1:5], "B", "D")
>  [1] "BAAD" "BBAD" "BCAD" "BDAD" "BEAD" "BABD" "BBBD" "BCBD"
>  [9] "BDBD" "BEBD" "BACD" "BBCD" "BCCD" "BDCD" "BECD" "BADD"
> [17] "BBDD" "BCDD" "BDDD" "BEDD" "BAED" "BBED" "BCED" "BDED"
> [25] "BEED"
>
> If neither this nor any of the other suggestions is not what is desired, I
> think the OP will have to clarify his query.
>
> Cheers,
> Bert
>
> On Mon, Sep 4, 2023 at 9:25 AM Ebert,Timothy Aaron  wrote:
>
>> Does this work for you?
>>
>> t0<-t1<-t2<-LETTERS[1:5]
>> al2<-expand.grid(t0, t1, t2)
>> al3<-paste(al2$Var1, al2$Var2, al2$Var3)
>> al4 <- gsub(" ", "", al3)
>> head(al3)
>>
>> Tim
>>
>> -Original Message-
>> From: R-help  On Behalf Of Eric Berger
>> Sent: Monday, September 4, 2023 10:17 AM
>> To: Christofer Bogaso 
>> Cc: r-help 
>> Subject: Re: [R] Finding combination of states
>>
>> [External Email]
>>
>> The function purrr::cross() can help you with this. For example:
>>
>> f <- function(states, nsteps, first, last) {
>>    paste(first, unlist(lapply(purrr::cross(rep(list(v),nsteps-2)),
>>       \(x) paste(unlist(x), collapse=""))), last, sep="")
>> }
>> f(LETTERS[1:5], 3, "B", "E")
>> [1] "BAE" "BBE" "BCE" "BDE" "BEE"
>>
>> HTH,
>> Eric
>>
>>
>> On Mon, Sep 4, 2023 at 3:42 PM Christofer Bogaso <
>> bogaso.christo...@gmail.com> wrote:
>> >
>> > Let's say I have 3 time points, T0, T1, and T2 (the number of such time
>> > points can be arbitrary). At each time point, an object can be in any of 5
>> > states, A, B, C, D, E (the number of such states can be arbitrary).
>> >
>> > I need to find all possible ways in which that object, starting in state
>> > B (say) at time T0, can be in state E (for example) at time T2.
>> >
>> > For example, one possibility is BAE.
>> >
>> > Is there any function available in R that can give me a vector of
>> > such possibilities for an arbitrary number of states and times, and for
>> > given initial and final (desired) states?
>> >
>> > Any pointer will be much appreciated.
>> >
>> > Thanks for your time.
>> >
>> > __
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.r-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.r-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>



Re: [R] Finding combination of states

2023-09-04 Thread Bert Gunter
I think there may be some uncertainty here about what the OP requested. My
interpretation is:

n different times
k different states
Any state can appear at any time in the vector of times and can be repeated
Initial and final states are given

So modifying Tim's expand.grid() solution a bit yields:

g <- function(ntimes, states, init, final){
   ## ntimes: integer, number of unique times
   ## states: vector of unique states
   ## init: initial state
   ## final: final state
do.call(paste0,c(init,expand.grid(rep(list(states), ntimes-2)), final))
}

e.g.

> g(4, LETTERS[1:5], "B", "D")
 [1] "BAAD" "BBAD" "BCAD" "BDAD" "BEAD" "BABD" "BBBD" "BCBD"
 [9] "BDBD" "BEBD" "BACD" "BBCD" "BCCD" "BDCD" "BECD" "BADD"
[17] "BBDD" "BCDD" "BDDD" "BEDD" "BAED" "BBED" "BCED" "BDED"
[25] "BEED"
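
A quick sanity check on the count, offered only as a sketch: with k states and n times, fixing the first and last state leaves n - 2 free positions, so there should be k^(n - 2) strings. The function name paths() below is hypothetical (it is not from the thread); it mirrors g() but passes stringsAsFactors = FALSE so that the grid holds character columns rather than factors.

```r
## Hypothetical variant of g(): same idea, character columns instead of factors
paths <- function(ntimes, states, init, final) {
  stopifnot(ntimes >= 3)  # at least one free position between init and final
  mid <- expand.grid(rep(list(states), ntimes - 2), stringsAsFactors = FALSE)
  do.call(paste0, c(list(init), mid, list(final)))
}

states <- LETTERS[1:5]
res <- paths(4, states, "B", "D")

## k^(n - 2) strings are expected: 5^2 = 25 here
length(res) == length(states)^(4 - 2)
```

The count grows exponentially in the number of times, so for long sequences it may be worth checking k^(n - 2) before materialising the whole grid.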

If neither this nor any of the other suggestions is what is desired, I
think the OP will have to clarify his query.

Cheers,
Bert

On Mon, Sep 4, 2023 at 9:25 AM Ebert,Timothy Aaron  wrote:

> Does this work for you?
>
> t0<-t1<-t2<-LETTERS[1:5]
> al2<-expand.grid(t0, t1, t2)
> al3<-paste(al2$Var1, al2$Var2, al2$Var3)
> al4 <- gsub(" ", "", al3)
> head(al4)
>
> Tim
>
> -Original Message-
> From: R-help  On Behalf Of Eric Berger
> Sent: Monday, September 4, 2023 10:17 AM
> To: Christofer Bogaso 
> Cc: r-help 
> Subject: Re: [R] Finding combination of states
>
> [External Email]
>
> The function purrr::cross() can help you with this. For example:
>
> f <- function(states, nsteps, first, last) {
>    paste(first, unlist(lapply(purrr::cross(rep(list(states),nsteps-2)),
>       \(x) paste(unlist(x), collapse=""))), last, sep="")
> }
> f(LETTERS[1:5], 3, "B", "E")
> [1] "BAE" "BBE" "BCE" "BDE" "BEE"
>
> HTH,
> Eric
>
>
> On Mon, Sep 4, 2023 at 3:42 PM Christofer Bogaso <
> bogaso.christo...@gmail.com> wrote:
> >
> > Let's say I have 3 time points, T0, T1, and T2 (the number of such
> > time points can be arbitrary). At each time point, an object can be in
> > any of 5 states, A, B, C, D, E (the number of such states can be
> > arbitrary).
> >
> > I need to find all possible ways in which that object, starting in
> > state B (say) at time T0, can be in state E (for example) at time T2.
> >
> > For example, one possibility is BAE.
> >
> > Is there any function available in R that can give me a vector of
> > such possibilities for an arbitrary number of states and times, and
> > for given initial and final (desired) states?
> >
> > Any pointers will be much appreciated.
> >
> > Thanks for your time.
> >
>
>



[R-es] problemas al instalar "dplyr" - problems installing "dplyr"

2023-09-04 Thread J.M. Santiago
Hi all,

I have installed the latest versions of R and RStudio, but I can't install
the "dplyr" package. When I try, I get the message below.

If anyone has an idea of how to resolve this problem, I would be very
grateful for the help.

Thank you very much,

José M.


Installing package into 'C:/Users/jmsan/AppData/Local/R/win-library/4.3'
(as 'lib' is unspecified)

  There is a binary version available but the source version is later:
      binary source needs_compilation
dplyr  1.1.2  1.1.3              TRUE

installing the source package 'dplyr'

trying URL 'https://cran.rstudio.com/src/contrib/dplyr_1.1.3.tar.gz'
Content type 'application/x-gzip' length 1083635 bytes (1.0 MB)
downloaded 1.0 MB

* installing *source* package 'dplyr' ...
** package 'dplyr' successfully unpacked and MD5 sums checked
** using staged installation
** libs
g++ -std=gnu++17  -I"C:/PROGRA~1/R/R-43~1.1/include" -DNDEBUG
-I"c:/rtools43/x86_64-w64-mingw32.static.posix/include" -O2 -Wall 
-mfpmath=sse -msse2 -mstackrealign  -c chop.cpp -o chop.o

sh: line 1: g++: command not found
make: *** [C:/PROGRA~1/R/R-43~1.1/etc/x64/Makeconf:272: chop.o] Error 127
ERROR: compilation failed for package 'dplyr'
* removing 'C:/Users/jmsan/AppData/Local/R/win-library/4.3/dplyr'
Warning in install.packages :
 installation of package 'dplyr' had non-zero exit status
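
The line "sh: line 1: g++: command not found" suggests that R could not find a C++ compiler (on Windows this normally comes from Rtools), so the source build of dplyr 1.1.3 failed. A minimal sketch of how one might check for this and fall back to the available binary (1.1.2), assuming a Windows setup like the one in the log. The helper pkg_type() is hypothetical and only illustrates the choice; the install call itself is left as a comment.

```r
## Is a C++ compiler visible to R? Sys.which() returns "" when it is not found.
gpp <- Sys.which("g++")
have_compiler <- nzchar(gpp)

## Hypothetical helper: which 'type' to request from install.packages()
## "binary" avoids compilation; "both" allows a source build when needed.
pkg_type <- function(have_compiler) {
  if (have_compiler) "both" else "binary"
}

## Not run here (needs network access and a Windows or macOS binary repository):
## install.packages("dplyr", type = pkg_type(have_compiler))
```

Installing a matching version of Rtools is the other route if the newer source version is really needed.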

---

Dr. José María Santiago Sáez (PhD)
Madrid - España/Spain
+34 646 165 291
jms...@picos.com
https://sites.google.com/site/santiagojosemaria/
___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R] Finding combination of states

2023-09-04 Thread Ebert,Timothy Aaron
Does this work for you?

t0<-t1<-t2<-LETTERS[1:5]
al2<-expand.grid(t0, t1, t2)
al3<-paste(al2$Var1, al2$Var2, al2$Var3)
al4 <- gsub(" ", "", al3)
head(al4)
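
The code above lists all 125 triples. A sketch of one way to keep only those that start in "B" and end in "E", as the original question asks; the column names t0, t1, t2 and the variable names are chosen here for illustration.

```r
states <- LETTERS[1:5]

## All combinations of states over three time points, as character columns
al2 <- expand.grid(t0 = states, t1 = states, t2 = states,
                   stringsAsFactors = FALSE)

## Keep rows with the required initial and final states
keep <- al2$t0 == "B" & al2$t2 == "E"

## paste0() avoids the separator, so no gsub() step is needed
paths_BE <- paste0(al2$t0, al2$t1, al2$t2)[keep]
paths_BE
```

Filtering after the fact builds the full grid first, so fixing the end states before calling expand.grid() scales better when there are many time points.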

Tim

-Original Message-
From: R-help  On Behalf Of Eric Berger
Sent: Monday, September 4, 2023 10:17 AM
To: Christofer Bogaso 
Cc: r-help 
Subject: Re: [R] Finding combination of states

[External Email]

The function purrr::cross() can help you with this. For example:

f <- function(states, nsteps, first, last) {
   paste(first, unlist(lapply(purrr::cross(rep(list(states),nsteps-2)),
      \(x) paste(unlist(x), collapse=""))), last, sep="")
}
f(LETTERS[1:5], 3, "B", "E")
[1] "BAE" "BBE" "BCE" "BDE" "BEE"

HTH,
Eric


On Mon, Sep 4, 2023 at 3:42 PM Christofer Bogaso  
wrote:
>
> Let's say I have 3 time points, T0, T1, and T2 (the number of such time
> points can be arbitrary). At each time point, an object can be in any of
> 5 states, A, B, C, D, E (the number of such states can be arbitrary).
>
> I need to find all possible ways in which that object, starting in state
> B (say) at time T0, can be in state E (for example) at time T2.
>
> For example, one possibility is BAE.
>
> Is there any function available in R that can give me a vector of
> such possibilities for an arbitrary number of states and times, and for
> given initial and final (desired) states?
>
> Any pointers will be much appreciated.
>
> Thanks for your time.
>



Re: [R] Problems with installing R packages from source and running C++ in R, even on fresh R installation

2023-09-04 Thread Ivan Krylov
On Mon, 04 Sep 2023 12:05:38 +
Christophe Bousquet wrote:

> I will try compiling R from source when I am back from holidays, and
> ask you if I need assistance.

Make sure to compile with DEBUG=1 so that the compiler flags needed to
emit debugging information will be enabled. Good luck!

-- 
Best regards,
Ivan



Re: [R] Book Recommendation

2023-09-04 Thread Robert Baer
This is a great find for those of us lurking on this thread. Thanks for 
sharing Greg (and of course Paul).


On 8/30/2023 3:52 PM, Greg Snow wrote:

Stephen,  I see lots of answers with packages and resources, but not
book recommendations.  I have used Introduction to Data Technologies
by Paul Murrell (https://www.stat.auckland.ac.nz/~paul/ItDT/) to teach
SQL and database design and would recommend looking at it as a
possibility.

On Mon, Aug 28, 2023 at 9:47 AM Stephen H. Dawson, DSL via R-help
 wrote:

Good Morning,


I am doing some research to develop a new course where I teach. I am
looking for a book to use in the course content to teach accomplishing
SQL in R.

Does anyone know of a book on this topic to recommend for consideration?


Thank You,
--
*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com








Re: [R] aggregate formula - differing results

2023-09-04 Thread Bert Gunter
Ivan:
Just one perhaps extraneous comment.

You said that you were surprised that aggregate() and group_by() did not
have the same behavior. That is a misconception on your part. As you know,
the tidyverse recapitulates the functionality of many base R functions; but
it makes no claims to do so in exactly the same way and, indeed, often
makes deliberate changes to "improve" behavior. So if you wish to use both,
you should *expect* such differences, which, of course, are documented in
the man pages (and often elsewhere).

Cheers,
Bert

On Mon, Sep 4, 2023 at 5:21 AM Ivan Calandra  wrote:

> Haha, got it now, there is an na.action argument (which defaults to
> na.omit) to aggregate() which is applied before calling mean(na.rm =
> TRUE). Thank you Rui for pointing this out.
>
> So running it with na.pass instead of na.omit gives the same results as
> dplyr::group_by()+summarise():
> aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE,
> na.action = na.pass)
>
> Cheers,
> Ivan
>
> On 04/09/2023 13:56, Rui Barradas wrote:
> > At 12:51 on 04/09/2023, Ivan Calandra wrote:
> >> Thanks Rui for your help; that would be one possibility indeed.
> >>
> >> But am I the only one who finds that behavior of aggregate()
> >> completely unexpected and confusing? Especially considering that
> >> dplyr::summarise() and doBy::summaryBy() deal with NAs differently,
> >> even though they all use mean(na.rm = TRUE) to calculate the group
> >> stats.
> >>
> >> Best wishes,
> >> Ivan
> >>
> >> On 04/09/2023 13:46, Rui Barradas wrote:
> >>> At 10:44 on 04/09/2023, Ivan Calandra wrote:
>  Dear useRs,
> 
>  I have just stumbled across a behavior in aggregate() that I cannot
>  explain. Any help would be appreciated!
> 
>  Sample data:
>  my_data <- structure(list(ID = c("FLINT-1", "FLINT-10",
>  "FLINT-100", "FLINT-101", "FLINT-102", "HORN-10", "HORN-100",
>  "HORN-102", "HORN-103", "HORN-104"), EdgeLength = c(130.75, 168.77,
>  142.79, 130.1, 140.41, 121.37, 70.52, 122.3, 71.01, 104.5),
>  SurfaceArea = c(1736.87, 1571.83, 1656.46, 1247.18, 1177.47,
>  1169.26, 444.61, 1791.48, 461.15, 1127.2), Length = c(44.384,
>  29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 32.075, 21.337,
>  35.459), Width = c(45.982, 67.303, 52.679, 26.42, 25.149, 33.427,
>  20.683, 62.783, 26.417, 35.297), PLATWIDTH = c(38.84, NA, 15.33,
>  30.37, 11.44, 14.88, 13.86, NA, NA, 26.71), PLATTHICK = c(8.67, NA,
>  7.99, 11.69, 3.3, 16.52, 4.58, NA, NA, 9.35), EPA = c(78, NA, 78,
>  54, 72, 49, 56, NA, NA, 56), THICKNESS = c(10.97, NA, 9.36, 6.4,
>  5.89, 11.05, 4.9, NA, NA, 10.08), WEIGHT = c(34.3, NA, 25.5, 18.6,
>  14.9, 29.5, 4.5, NA, NA, 23), RAWMAT = c("FLINT", "FLINT", "FLINT",
>  "FLINT", "FLINT", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS",
>  "HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 113L,
>  114L, 115L), class = "data.frame")
> 
>  1) Simple aggregation with 2 variables:
>  aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data, FUN =
>  mean, na.rm = TRUE)
> 
>  2) Using the dot notation - different results:
>  aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE)
> 
>  3) Using dplyr, I get the same results as #1:
>  group_by(my_data, RAWMAT) %>%
> summarise(across(c("Length", "Width"), ~ mean(.x, na.rm = TRUE)))
> 
>  4) It gets weirder: using all columns in #1 give the same results
>  as in #2 but different from #1 and #3
>  aggregate(cbind(EdgeLength, SurfaceArea, Length, Width, PLATWIDTH,
>  PLATTHICK, EPA, THICKNESS, WEIGHT) ~ RAWMAT, data = my_data, FUN =
>  mean, na.rm = TRUE)
> 
>  So it seems it is not only due to the notation (cbind() vs. dot).
>  Is it a bug? A peculiar thing in my dataset? I tend to think this
>  could be due to some variables (or their names) as all notations
>  seem to agree when I remove some variables (although I haven't
>  found out which variable(s) is (are) at fault), e.g.:
> 
>  my_data2 <- structure(list(ID = c("FLINT-1", "FLINT-10",
>  "FLINT-100", "FLINT-101", "FLINT-102", "HORN-10", "HORN-100",
>  "HORN-102", "HORN-103", "HORN-104"), EdgeLength = c(130.75, 168.77,
>  142.79, 130.1, 140.41, 121.37, 70.52, 122.3, 71.01, 104.5),
>  SurfaceArea = c(1736.87, 1571.83, 1656.46, 1247.18, 1177.47,
>  1169.26, 444.61, 1791.48, 461.15, 1127.2), Length = c(44.384,
>  29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 32.075, 21.337,
>  35.459), Width = c(45.982, 67.303, 52.679, 26.42, 25.149, 33.427,
>  20.683, 62.783, 26.417, 35.297), RAWMAT = c("FLINT", "FLINT",
>  "FLINT", "FLINT", "FLINT", "HORNFELS", "HORNFELS", "HORNFELS",
>  "HORNFELS", "HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 111L,
>  112L, 113L, 114L, 115L), class = "data.frame")
> 
>  aggregate(cbind(EdgeLength, SurfaceArea, Length, Width) ~ RAWMAT,
>  

Re: [R] Finding combination of states

2023-09-04 Thread Eric Berger
The function purrr::cross() can help you with this. For example:

f <- function(states, nsteps, first, last) {
   paste(first, unlist(lapply(purrr::cross(rep(list(states),nsteps-2)),
      \(x) paste(unlist(x), collapse=""))), last, sep="")
}
f(LETTERS[1:5], 3, "B", "E")
[1] "BAE" "BBE" "BCE" "BDE" "BEE"
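
Since purrr 1.0.0, cross() is deprecated, so a base-R sketch of the same function may be useful. It keeps the argument order of f() above; the name f_base is made up for this example, and expand.grid() is swapped in for purrr::cross().

```r
## Base-R sketch of f(): expand.grid() enumerates the intermediate states
f_base <- function(states, nsteps, first, last) {
  stopifnot(nsteps >= 3)  # need at least one intermediate step
  mid <- expand.grid(rep(list(states), nsteps - 2), stringsAsFactors = FALSE)
  ## Collapse each row of intermediate states into one string
  paste0(first, do.call(paste0, mid), last)
}

f_base(LETTERS[1:5], 3, "B", "E")
```

For the example call, this yields the same five strings as shown above.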

HTH,
Eric


On Mon, Sep 4, 2023 at 3:42 PM Christofer Bogaso
 wrote:
>
> Let's say I have 3 time points, T0, T1, and T2 (the number of such time
> points can be arbitrary). At each time point, an object can be in any of
> 5 states, A, B, C, D, E (the number of such states can be arbitrary).
>
> I need to find all possible ways in which that object, starting in state
> B (say) at time T0, can be in state E (for example) at time T2.
>
> For example, one possibility is BAE.
>
> Is there any function available in R that can give me a vector of
> such possibilities for an arbitrary number of states and times, and for
> given initial and final (desired) states?
>
> Any pointers will be much appreciated.
>
> Thanks for your time.
>



Re: [R] [Pkg-Collaboratos] BioShapes Almost-Package

2023-09-04 Thread Leonard Mada via R-help
Thank you very much for all the responses; especially Duncan's guidance. 
I will add some further ideas on workflows below.


There were quite a few views on GitHub; but there is not much to see, as 
there is absolutely no documentation.  I have added in the meantime a 
basic example:

https://github.com/discoleo/BioShapes/blob/main/Examples.Bioshapes.png
The actual code can do a lot more.



Some ideas on workflows:

1. Most code is written as C/C++; the R code is a thin wrapper around 
the C/C++ functions
It is practical to embed the documentation with the R code - as there is 
no complex code anyway. The same may apply for small packages.


2. Complex R code
The comments may clutter the code. It is also difficult to maintain this 
documentation, as the comments are less easily readable. Separating the 
documentation from the code is a good idea.


Unfortunately, this is not so obvious when you start working on your 
first package.


Many thanks,

Leonard


On 9/4/2023 5:47 AM, Jeff Newmiller wrote:

Leonard... the reason roxygen exists is to allow markup in source files to be 
used to automatically generate the numerous files required by standard R 
packages as documented in Writing R Extensions.

If your goal is to not use source files this way then the solution is to not 
use roxygen at all. Just create those files yourself by directly editing them 
from scratch.
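
For readers who have not seen the inline style being discussed, here is a minimal sketch of roxygen-style markup in an R source file. The function add2() is a made-up example and is not part of BioShapes; the tags @param, @return and @export are standard roxygen2 tags, from which roxygen2::roxygenise() would generate the corresponding man/*.Rd file.

```r
#' Add two numbers (hypothetical example function)
#'
#' @param x,y Numeric vectors.
#' @return The element-wise sum of x and y.
#' @export
add2 <- function(x, y) {
  x + y
}

add2(1, 2)
```

To R itself the `#'` lines are ordinary comments, which is why the markup can sit directly above the code it documents.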

On September 3, 2023 7:06:09 PM PDT, Leonard Mada via R-help 
 wrote:

Thank you Bert.


Clarification:

Indeed, I am using an add-on package: it is customary for that package -
that is what I have seen - to have the entire documentation included as
comments in the R src files. (But maybe I am wrong.)


I will try to find some time over the next few days to explore in more
detail the R documentation. Although, I do not know how this will
interact with the add-on package.


Sincerely,


Leonard


On 9/4/2023 4:58 AM, Bert Gunter wrote:

1. R-package-devel is where queries about package protocols should go.

2. But...
"Is there a succinct, but sufficiently informative description of
documentation tools?"
"Writing R Extensions" (shipped with R) is *the* reference for R
documentation. Whether it's sufficiently "succinct" for you, I cannot
say.

"I find that including the documentation in the source files is very
distracting."
?? R documentation (.Rd) files are separate from source (.R) files.
Inline documentation in source files is an "add-on" capability
provided by optional packages if one prefers to do this. Such packages
parse the source files to extract the documentation into the .Rd
files. So not sure what you mean here. Apologies if I have misunderstood.
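
As a rough illustration of the separate-file approach, the sketch below writes a small hand-made .Rd file to a temporary location and parses it with tools::parse_Rd(). The documented function add2 and the file contents are invented for this example; a real package would keep such files in its man/ directory, and "Writing R Extensions" lists the sections that are actually required.

```r
## Contents of a minimal, hand-written Rd file (hypothetical function add2)
rd_lines <- c(
  "\\name{add2}",
  "\\alias{add2}",
  "\\title{Add two numbers}",
  "\\description{Returns the element-wise sum of two numeric vectors.}",
  "\\usage{add2(x, y)}",
  "\\arguments{",
  "  \\item{x, y}{Numeric vectors.}",
  "}",
  "\\value{The element-wise sum.}"
)

## Write to a temporary file and parse it to confirm it is well-formed Rd
rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_lines, rd_file)
parsed <- tools::parse_Rd(rd_file)
class(parsed)
```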

" I would prefer to have only basic comments in the source
files and an expanded documentation in a separate location."
If I understand you correctly, this is exactly what the R package
process specifies. Again, see the "Writing R Extensions" manual for
details.

Also, if you wish to have your package on CRAN, it requires that the
package documents all functions in the package as specified by the
"Writing ..." manual.

Again, further questions and elaboration should go to the
R-package-devel list, although I think the manual is really the
authoritative resource to follow.

Cheers,
Bert



On Sun, Sep 3, 2023 at 5:06 PM Leonard Mada via R-help
 wrote:

 Dear R-List Members,

 I am looking for collaborators to further develop the BioShapes
 almost-package. I added a brief description below.

 A.) BioShapes (Almost-) Package

 The aim of the BioShapes quasi-package is to facilitate the generation
 of graphical objects resembling biological and chemical entities,
 enabling the construction of diagrams based on these objects. It
 currently includes functions to generate diagrams depicting viral
 particles, liposomes, double helix / DNA strands, various cell types
 (like neurons, brush-border cells and duct cells), Ig-domains, as well
 as more basic shapes.

 It should offer researchers in the field of biological and chemical
 sciences a tool to easily generate diagrams depicting the studied
 biological processes.

 The package lacks proper documentation and is not yet released on
 CRAN. However, it is available on GitHub:
 https://github.com/discoleo/BioShapes

 Although there are 27 unique cloners on GitHub, I am still looking for
 contributors and collaborators. I would appreciate any collaborations to
 develop it further. I can be contacted both by email and on GitHub.


 B.) Documentation Tools

 Is there a succinct, but sufficiently informative description of
 documentation tools?
 I find that including the documentation in the source files is very
 distracting. I would prefer to have only basic comments in the source
 files and an expanded documentation in a separate location.

 This question may be more appropriate for the R-package-devel list. I

Re: [R] Time out error while connecting to Github repository

2023-09-04 Thread Martin Maechler
> siddharth sahasrabudhe via R-help 
> on Sun, 3 Sep 2023 09:54:28 +0530 writes:

> I want to access the .csv file from my github
> repository. While connecting to the Github repository I am
> getting the following error:

> Error in curl::curl_fetch_memory(file) : Timeout was
> reached: [raw.githubusercontent.com] Failed to connect to
> raw.githubusercontent.com port 443 after 5250 ms: Timed
> out

> The R-code is as below:

library(tidyverse)  ## << UNNEEDED !

library(rio)

> data <- import("
> 
https://raw.githubusercontent.com/siddharth-sahasrabudhe/Youtube-video-files/main/deck.csv
> ")

> Can you please suggest how I can resolve this
> issue?

Yes: You used the wrong "file name", because you added a newline
on both ends by breaking the line.

This works nicely:

> require(rio)
> dd <- import("https://raw.githubusercontent.com/siddharth-sahasrabudhe/Youtube-video-files/main/deck.csv")
> str(dd)
> str(dd)
'data.frame':   52 obs. of  3 variables:
 $ face : chr  "king" "queen" "jack" "ten" ...
 $ suit : chr  "spades" "spades" "spades" "spades" ...
 $ value: int  13 12 11 10 9 8 7 6 5 4 ...
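
A small sketch of how the stray line breaks can be detected and removed before the URL is used. It needs no network access; the variable names are made up for illustration, and trimws() is base R.

```r
## The URL as written in the original post: the string starts and ends
## with a newline because the line was broken inside the quotes
url_raw <- "
https://raw.githubusercontent.com/siddharth-sahasrabudhe/Youtube-video-files/main/deck.csv
"

## TRUE: the string contains newline characters
has_newline <- grepl("\n", url_raw, fixed = TRUE)

## trimws() strips leading and trailing whitespace, including newlines
url_clean <- trimws(url_raw)
url_clean
```

Passing url_clean to import() should then behave like the one-line version above.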


> -- 
> Regards Siddharth Sahasrabudhe



[R] Finding combination of states

2023-09-04 Thread Christofer Bogaso
Let's say I have 3 time points, T0, T1, and T2 (the number of such time
points can be arbitrary). At each time point, an object can be in any of 5
states, A, B, C, D, E (the number of such states can be arbitrary).

I need to find all possible ways in which that object, starting in state
B (say) at time T0, can be in state E (for example) at time T2.

For example, one possibility is BAE.

Is there any function available in R that can give me a vector of
such possibilities for an arbitrary number of states and times, and for
given initial and final (desired) states?

Any pointers will be much appreciated.

Thanks for your time.



Re: [R] aggregate formula - differing results

2023-09-04 Thread Ivan Calandra
Haha, got it now, there is an na.action argument (which defaults to 
na.omit) to aggregate() which is applied before calling mean(na.rm = 
TRUE). Thank you Rui for pointing this out.


So running it with na.pass instead of na.omit gives the same results as 
dplyr::group_by()+summarise():
aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE, 
na.action = na.pass)
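
A toy illustration of the difference, as a sketch on made-up data (the data frame d and its columns are not from the thread). With the default na.action = na.omit, any row with an NA in any variable of the formula is dropped before FUN is called, so the NA in y also removes the corresponding value of x.

```r
d <- data.frame(g = c("a", "a", "b", "b"),
                x = c(1, 3, 5, 7),
                y = c(10, NA, 30, 50))

## Default na.action = na.omit: row 2 is dropped entirely because y is NA,
## so the mean of x in group "a" is computed from a single row
omit <- aggregate(cbind(x, y) ~ g, data = d, FUN = mean, na.rm = TRUE)

## na.pass keeps all rows; na.rm = TRUE then removes NAs column by column
pass <- aggregate(cbind(x, y) ~ g, data = d, FUN = mean, na.rm = TRUE,
                  na.action = na.pass)

omit$x  # 1 6
pass$x  # 2 6
```

This is also why adding columns that contain NAs to the left-hand side changes the means of columns that have no NAs themselves.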


Cheers,
Ivan

On 04/09/2023 13:56, Rui Barradas wrote:

At 12:51 on 04/09/2023, Ivan Calandra wrote:

Thanks Rui for your help; that would be one possibility indeed.

But am I the only one who finds that behavior of aggregate() 
completely unexpected and confusing? Especially considering that 
dplyr::summarise() and doBy::summaryBy() deal with NAs differently, 
even though they all use mean(na.rm = TRUE) to calculate the group 
stats.


Best wishes,
Ivan

On 04/09/2023 13:46, Rui Barradas wrote:

At 10:44 on 04/09/2023, Ivan Calandra wrote:

Dear useRs,

I have just stumbled across a behavior in aggregate() that I cannot 
explain. Any help would be appreciated!


Sample data:
my_data <- structure(list(ID = c("FLINT-1", "FLINT-10", 
"FLINT-100", "FLINT-101", "FLINT-102", "HORN-10", "HORN-100", 
"HORN-102", "HORN-103", "HORN-104"), EdgeLength = c(130.75, 168.77, 
142.79, 130.1, 140.41, 121.37, 70.52, 122.3, 71.01, 104.5), 
SurfaceArea = c(1736.87, 1571.83, 1656.46, 1247.18, 1177.47, 
1169.26, 444.61, 1791.48, 461.15, 1127.2), Length = c(44.384, 
29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 32.075, 21.337, 
35.459), Width = c(45.982, 67.303, 52.679, 26.42, 25.149, 33.427, 
20.683, 62.783, 26.417, 35.297), PLATWIDTH = c(38.84, NA, 15.33, 
30.37, 11.44, 14.88, 13.86, NA, NA, 26.71), PLATTHICK = c(8.67, NA, 
7.99, 11.69, 3.3, 16.52, 4.58, NA, NA, 9.35), EPA = c(78, NA, 78, 
54, 72, 49, 56, NA, NA, 56), THICKNESS = c(10.97, NA, 9.36, 6.4, 
5.89, 11.05, 4.9, NA, NA, 10.08), WEIGHT = c(34.3, NA, 25.5, 18.6, 
14.9, 29.5, 4.5, NA, NA, 23), RAWMAT = c("FLINT", "FLINT", "FLINT", 
"FLINT", "FLINT", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS", 
"HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 113L, 
114L, 115L), class = "data.frame")


1) Simple aggregation with 2 variables:
aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data, FUN = 
mean, na.rm = TRUE)


2) Using the dot notation - different results:
aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE)

3) Using dplyr, I get the same results as #1:
group_by(my_data, RAWMAT) %>%
   summarise(across(c("Length", "Width"), ~ mean(.x, na.rm = TRUE)))

4) It gets weirder: using all columns in #1 give the same results 
as in #2 but different from #1 and #3
aggregate(cbind(EdgeLength, SurfaceArea, Length, Width, PLATWIDTH, 
PLATTHICK, EPA, THICKNESS, WEIGHT) ~ RAWMAT, data = my_data, FUN = 
mean, na.rm = TRUE)


So it seems it is not only due to the notation (cbind() vs. dot). 
Is it a bug? A peculiar thing in my dataset? I tend to think this 
could be due to some variables (or their names) as all notations 
seem to agree when I remove some variables (although I haven't 
found out which variable(s) is (are) at fault), e.g.:


my_data2 <- structure(list(ID = c("FLINT-1", "FLINT-10", 
"FLINT-100", "FLINT-101", "FLINT-102", "HORN-10", "HORN-100", 
"HORN-102", "HORN-103", "HORN-104"), EdgeLength = c(130.75, 168.77, 
142.79, 130.1, 140.41, 121.37, 70.52, 122.3, 71.01, 104.5), 
SurfaceArea = c(1736.87, 1571.83, 1656.46, 1247.18, 1177.47, 
1169.26, 444.61, 1791.48, 461.15, 1127.2), Length = c(44.384, 
29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 32.075, 21.337, 
35.459), Width = c(45.982, 67.303, 52.679, 26.42, 25.149, 33.427, 
20.683, 62.783, 26.417, 35.297), RAWMAT = c("FLINT", "FLINT", 
"FLINT", "FLINT", "FLINT", "HORNFELS", "HORNFELS", "HORNFELS", 
"HORNFELS", "HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 
112L, 113L, 114L, 115L), class = "data.frame")


aggregate(cbind(EdgeLength, SurfaceArea, Length, Width) ~ RAWMAT, 
data = my_data2, FUN = mean, na.rm = TRUE)


aggregate(. ~ RAWMAT, data = my_data2[-1], FUN = mean, na.rm = TRUE)

group_by(my_data2, RAWMAT) %>%
   summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))


Thank you in advance for any hint.
Best wishes,
Ivan




 *LEIBNIZ-ZENTRUM*
*FÜR ARCHÄOLOGIE*

*Dr. Ivan CALANDRA*
**Head of IMPALA (IMaging Platform At LeizA)

*MONREPOS* Archaeological Research Centre, Schloss Monrepos
56567 Neuwied, Germany

T: +49 2631 9772 243
T: +49 6131 8885 543
ivan.calan...@leiza.de

leiza.de 

ORCID 
ResearchGate


LEIZA is a foundation under public law of the State of 
Rhineland-Palatinate and the City of Mainz. Its headquarters are in 
Mainz. Supervision is carried out by the Ministry of Science and 
Health of the State of Rhineland-Palatinate. LEIZA is a research 
museum of the Leibniz Association.

__

Re: [R] Problems with installing R packages from source and running C++ in R, even on fresh R installation

2023-09-04 Thread Christophe Bousquet via R-help
> If you're up to compiling R from source [] and using a symbolic
> debugger [**] to step through Rcmd.exe, we could try to do that.
> Murphy's law says that the copy of Rcmd.exe you'll build from source
> will work well and refuse to reproduce the problem for you to
> investigate. (Beyond that, there is binary-level debugging, which I'm
> not well versed in.)

Dear Ivan,

Yes, I would be up for trying, but this week I need to finish some stuff before 
going on holidays.
I will try compiling R from source when I am back from holidays, and ask you if 
I need assistance.

Many thanks again for trying to solve this issue.

Best regards,
Christophe



Re: [R] aggregate formula - differing results

2023-09-04 Thread Rui Barradas

At 12:51 on 04/09/2023, Ivan Calandra wrote:

Thanks Rui for your help; that would be one possibility indeed.

But am I the only one who finds that behavior of aggregate() completely 
unexpected and confusing? Especially considering that dplyr::summarise() 
and doBy::summaryBy() deal with NAs differently, even though they all 
use mean(na.rm = TRUE) to calculate the group stats.


Best wishes,
Ivan

On 04/09/2023 13:46, Rui Barradas wrote:

At 10:44 on 04/09/2023, Ivan Calandra wrote:

Dear useRs,

I have just stumbled across a behavior in aggregate() that I cannot 
explain. Any help would be appreciated!


Sample data:
my_data <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100", 
"FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", 
"HORN-103", "HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 
130.1, 140.41, 121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = 
c(1736.87, 1571.83, 1656.46, 1247.18, 1177.47, 1169.26, 444.61, 
1791.48, 461.15, 1127.2), Length = c(44.384, 29.831, 43.869, 48.011, 
54.109, 41.742, 23.854, 32.075, 21.337, 35.459), Width = c(45.982, 
67.303, 52.679, 26.42, 25.149, 33.427, 20.683, 62.783, 26.417, 
35.297), PLATWIDTH = c(38.84, NA, 15.33, 30.37, 11.44, 14.88, 13.86, 
NA, NA, 26.71), PLATTHICK = c(8.67, NA, 7.99, 11.69, 3.3, 16.52, 
4.58, NA, NA, 9.35), EPA = c(78, NA, 78, 54, 72, 49, 56, NA, NA, 56), 
THICKNESS = c(10.97, NA, 9.36, 6.4, 5.89, 11.05, 4.9, NA, NA, 10.08), 
WEIGHT = c(34.3, NA, 25.5, 18.6, 14.9, 29.5, 4.5, NA, NA, 23), RAWMAT 
= c("FLINT", "FLINT", "FLINT", "FLINT", "FLINT", "HORNFELS", 
"HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS")), row.names = c(1L, 
2L, 3L, 4L, 5L, 111L, 112L, 113L, 114L, 115L), class = "data.frame")


1) Simple aggregation with 2 variables:
aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data, FUN = mean, 
na.rm = TRUE)


2) Using the dot notation - different results:
aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE)

3) Using dplyr, I get the same results as #1:
group_by(my_data, RAWMAT) %>%
   summarise(across(c("Length", "Width"), ~ mean(.x, na.rm = TRUE)))

4) It gets weirder: using all columns in #1 gives the same results as 
in #2 but different from #1 and #3
aggregate(cbind(EdgeLength, SurfaceArea, Length, Width, PLATWIDTH, 
PLATTHICK, EPA, THICKNESS, WEIGHT) ~ RAWMAT, data = my_data, FUN = 
mean, na.rm = TRUE)


So it seems it is not only due to the notation (cbind() vs. dot). Is 
it a bug? A peculiar thing in my dataset? I tend to think this could 
be due to some variables (or their names) as all notations seem to 
agree when I remove some variables (although I haven't found out 
which variable(s) is (are) at fault), e.g.:


my_data2 <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100", 
"FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", 
"HORN-103", "HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 
130.1, 140.41, 121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = 
c(1736.87, 1571.83, 1656.46, 1247.18, 1177.47, 1169.26, 444.61, 
1791.48, 461.15, 1127.2), Length = c(44.384, 29.831, 43.869, 48.011, 
54.109, 41.742, 23.854, 32.075, 21.337, 35.459), Width = c(45.982, 
67.303, 52.679, 26.42, 25.149, 33.427, 20.683, 62.783, 26.417, 
35.297), RAWMAT = c("FLINT", "FLINT", "FLINT", "FLINT", "FLINT", 
"HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS")), 
row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 113L, 114L, 115L), 
class = "data.frame")


aggregate(cbind(EdgeLength, SurfaceArea, Length, Width) ~ RAWMAT, 
data = my_data2, FUN = mean, na.rm = TRUE)


aggregate(. ~ RAWMAT, data = my_data2[-1], FUN = mean, na.rm = TRUE)

group_by(my_data2, RAWMAT) %>%
   summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))


Thank you in advance for any hint.
Best wishes,
Ivan





Hello,

You can define a vector of the columns of interest and subset the data 
with it. Then the default na.action = na.omit will no longer remove 
the rows with NA vals in at least one column and 

Re: [R] aggregate formula - differing results

2023-09-04 Thread Ivan Calandra

Thanks Rui for your help; that would be one possibility indeed.

But am I the only one who finds that behavior of aggregate() completely 
unexpected and confusing? Especially considering that dplyr::summarise() 
and doBy::summaryBy() deal with NAs differently, even though they all 
use mean(na.rm = TRUE) to calculate the group stats.


Best wishes,
Ivan

On 04/09/2023 13:46, Rui Barradas wrote:

On 04/09/2023 at 10:44, Ivan Calandra wrote:

Dear useRs,

I have just stumbled across a behavior in aggregate() that I cannot 
explain. Any help would be appreciated!


Sample data:
my_data <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100", 
"FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", 
"HORN-103", "HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 
130.1, 140.41, 121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = 
c(1736.87, 1571.83, 1656.46, 1247.18, 1177.47, 1169.26, 444.61, 
1791.48, 461.15, 1127.2), Length = c(44.384, 29.831, 43.869, 48.011, 
54.109, 41.742, 23.854, 32.075, 21.337, 35.459), Width = c(45.982, 
67.303, 52.679, 26.42, 25.149, 33.427, 20.683, 62.783, 26.417, 
35.297), PLATWIDTH = c(38.84, NA, 15.33, 30.37, 11.44, 14.88, 13.86, 
NA, NA, 26.71), PLATTHICK = c(8.67, NA, 7.99, 11.69, 3.3, 16.52, 
4.58, NA, NA, 9.35), EPA = c(78, NA, 78, 54, 72, 49, 56, NA, NA, 56), 
THICKNESS = c(10.97, NA, 9.36, 6.4, 5.89, 11.05, 4.9, NA, NA, 10.08), 
WEIGHT = c(34.3, NA, 25.5, 18.6, 14.9, 29.5, 4.5, NA, NA, 23), RAWMAT 
= c("FLINT", "FLINT", "FLINT", "FLINT", "FLINT", "HORNFELS", 
"HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS")), row.names = c(1L, 
2L, 3L, 4L, 5L, 111L, 112L, 113L, 114L, 115L), class = "data.frame")


1) Simple aggregation with 2 variables:
aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data, FUN = mean, 
na.rm = TRUE)


2) Using the dot notation - different results:
aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE)

3) Using dplyr, I get the same results as #1:
group_by(my_data, RAWMAT) %>%
   summarise(across(c("Length", "Width"), ~ mean(.x, na.rm = TRUE)))

4) It gets weirder: using all columns in #1 gives the same results as 
in #2 but different from #1 and #3
aggregate(cbind(EdgeLength, SurfaceArea, Length, Width, PLATWIDTH, 
PLATTHICK, EPA, THICKNESS, WEIGHT) ~ RAWMAT, data = my_data, FUN = 
mean, na.rm = TRUE)


So it seems it is not only due to the notation (cbind() vs. dot). Is 
it a bug? A peculiar thing in my dataset? I tend to think this could 
be due to some variables (or their names) as all notations seem to 
agree when I remove some variables (although I haven't found out 
which variable(s) is (are) at fault), e.g.:


my_data2 <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100", 
"FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", 
"HORN-103", "HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 
130.1, 140.41, 121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = 
c(1736.87, 1571.83, 1656.46, 1247.18, 1177.47, 1169.26, 444.61, 
1791.48, 461.15, 1127.2), Length = c(44.384, 29.831, 43.869, 48.011, 
54.109, 41.742, 23.854, 32.075, 21.337, 35.459), Width = c(45.982, 
67.303, 52.679, 26.42, 25.149, 33.427, 20.683, 62.783, 26.417, 
35.297), RAWMAT = c("FLINT", "FLINT", "FLINT", "FLINT", "FLINT", 
"HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS")), 
row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 113L, 114L, 115L), 
class = "data.frame")


aggregate(cbind(EdgeLength, SurfaceArea, Length, Width) ~ RAWMAT, 
data = my_data2, FUN = mean, na.rm = TRUE)


aggregate(. ~ RAWMAT, data = my_data2[-1], FUN = mean, na.rm = TRUE)

group_by(my_data2, RAWMAT) %>%
   summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))


Thank you in advance for any hint.
Best wishes,
Ivan





Hello,

You can define a vector of the columns of interest and subset the data 
with it. Then the default na.action = na.omit will no longer remove 
the rows with NA vals in at least one column and the results are the 
same.


However, this will 

Re: [R] aggregate formula - differing results

2023-09-04 Thread Rui Barradas

On 04/09/2023 at 10:44, Ivan Calandra wrote:

Dear useRs,

I have just stumbled across a behavior in aggregate() that I cannot 
explain. Any help would be appreciated!


Sample data:
my_data <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100", 
"FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", "HORN-103", 
"HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 130.1, 140.41, 
121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = c(1736.87, 1571.83, 
1656.46, 1247.18, 1177.47, 1169.26, 444.61, 1791.48, 461.15, 1127.2), 
Length = c(44.384, 29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 
32.075, 21.337, 35.459), Width = c(45.982, 67.303, 52.679, 26.42, 
25.149, 33.427, 20.683, 62.783, 26.417, 35.297), PLATWIDTH = c(38.84, 
NA, 15.33, 30.37, 11.44, 14.88, 13.86, NA, NA, 26.71), PLATTHICK = 
c(8.67, NA, 7.99, 11.69, 3.3, 16.52, 4.58, NA, NA, 9.35), EPA = c(78, 
NA, 78, 54, 72, 49, 56, NA, NA, 56), THICKNESS = c(10.97, NA, 9.36, 6.4, 
5.89, 11.05, 4.9, NA, NA, 10.08), WEIGHT = c(34.3, NA, 25.5, 18.6, 14.9, 
29.5, 4.5, NA, NA, 23), RAWMAT = c("FLINT", "FLINT", "FLINT", "FLINT", 
"FLINT", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS")), 
row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 113L, 114L, 115L), class = 
"data.frame")


1) Simple aggregation with 2 variables:
aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data, FUN = mean, 
na.rm = TRUE)


2) Using the dot notation - different results:
aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE)

3) Using dplyr, I get the same results as #1:
group_by(my_data, RAWMAT) %>%
   summarise(across(c("Length", "Width"), ~ mean(.x, na.rm = TRUE)))

4) It gets weirder: using all columns in #1 gives the same results as in 
#2 but different from #1 and #3
aggregate(cbind(EdgeLength, SurfaceArea, Length, Width, PLATWIDTH, 
PLATTHICK, EPA, THICKNESS, WEIGHT) ~ RAWMAT, data = my_data, FUN = mean, 
na.rm = TRUE)


So it seems it is not only due to the notation (cbind() vs. dot). Is it 
a bug? A peculiar thing in my dataset? I tend to think this could be due 
to some variables (or their names) as all notations seem to agree when I 
remove some variables (although I haven't found out which variable(s) is 
(are) at fault), e.g.:


my_data2 <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100", 
"FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", "HORN-103", 
"HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 130.1, 140.41, 
121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = c(1736.87, 1571.83, 
1656.46, 1247.18, 1177.47, 1169.26, 444.61, 1791.48, 461.15, 1127.2), 
Length = c(44.384, 29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 
32.075, 21.337, 35.459), Width = c(45.982, 67.303, 52.679, 26.42, 
25.149, 33.427, 20.683, 62.783, 26.417, 35.297), RAWMAT = c("FLINT", 
"FLINT", "FLINT", "FLINT", "FLINT", "HORNFELS", "HORNFELS", "HORNFELS", 
"HORNFELS", "HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 
113L, 114L, 115L), class = "data.frame")


aggregate(cbind(EdgeLength, SurfaceArea, Length, Width) ~ RAWMAT, data = 
my_data2, FUN = mean, na.rm = TRUE)


aggregate(. ~ RAWMAT, data = my_data2[-1], FUN = mean, na.rm = TRUE)

group_by(my_data2, RAWMAT) %>%
   summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))


Thank you in advance for any hint.
Best wishes,
Ivan





Hello,

You can define a vector of the columns of interest and subset the data 
with it. Then the default na.action = na.omit will no longer remove the 
rows with NA vals in at least one column and the results are the same.


However, this will not give the mean values of the other numeric 
columns, just of those two.
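
If the means of all numeric columns are wanted, one possible alternative (not taken from the replies in this thread) is the data-frame method of aggregate(). It has no `na.action` argument and passes each column of each group to `FUN`, so `na.rm = TRUE` acts column by column. A minimal sketch on a made-up data frame; the names `toy`, `num_cols` and `by_group` are illustrative only:

```r
# Hypothetical toy data: only WEIGHT contains missing values
toy <- data.frame(
  RAWMAT = rep(c("FLINT", "HORNFELS"), each = 3),
  Length = c(44, 30, 46, 42, 24, 30),
  Width  = c(46, 67, 52, 33, 21, 63),
  WEIGHT = c(34, NA, 26, 30, 4, NA)
)

# Pick the numeric columns programmatically
num_cols <- names(toy)[vapply(toy, is.numeric, logical(1))]

# Data-frame method: grouping is given via 'by', no formula involved,
# so no rows are removed before FUN is called
by_group <- aggregate(toy[num_cols], by = list(RAWMAT = toy$RAWMAT),
                      FUN = mean, na.rm = TRUE)
by_group
```

Here `Length` and `Width` are averaged over all rows of each group, while `WEIGHT` is averaged over its non-missing values only.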




# define a vector of columns of interest
cols <- c("Length", "Width", "RAWMAT")

# 1) Simple aggregation with 2 variables, select cols:
aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data[cols], FUN = 
mean, na.rm = TRUE)


# 2) Using the dot notation - if cols are selected, equal results:
aggregate(. ~ RAWMAT, data 

Re: [R] [Pkg-Collaboratos] BioShapes Almost-Package

2023-09-04 Thread Martin Maechler
> Duncan Murdoch 
> on Mon, 4 Sep 2023 04:51:32 -0400 writes:

> On 03/09/2023 10:47 p.m., Jeff Newmiller wrote:
>> Leonard... the reason roxygen exists is to allow markup
>> in source files to be used to automatically generate the
>> numerous files required by standard R packages as
>> documented in Writing R Extensions.
>> 
>> If your goal is to not use source files this way then the
>> solution is to not use roxygen at all. Just create those
>> files yourself by directly editing them from scratch.

> Just a bit of elaboration on Jeff's suggestion -- here's
> the workflow I prefer to using Roxygen.

> Once you have a function that works:

> 1.  install the package
> 2.  set your working directory to the package "man" directory
> 3.  run `prompt(functionname)`
> 4.  edit `functionname.Rd` in the "man" directory, which will
> already be filled in as a skeleton help file, with comments
> describing what else to add.

> Don't run prompt() again after editing, or you'll lose
> all your edits.  But this is a good way to get started.

> I think for the first few times the comments are really
> helpful, but I wouldn't mind a way to suppress them.

> Duncan Murdoch

Me neither.  A new option, not changing the default, would make sense.
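
The prompt() workflow quoted above can be sketched as follows. This is only an illustration under stated assumptions: `scale_to_unit` is a hypothetical function, and the skeleton is written to a temporary directory instead of a package's "man" directory so that the snippet can be run anywhere.

```r
# Hypothetical function to be documented
scale_to_unit <- function(x, na.rm = TRUE) {
  rng <- range(x, na.rm = na.rm)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Write the skeleton help file; in a real package the target would be
# the "man" directory of the package sources
rd_file <- file.path(tempdir(), "scale_to_unit.Rd")
utils::prompt(scale_to_unit, filename = rd_file)

# Inspect the first lines of the generated skeleton
head(readLines(rd_file), 10)
```

As noted in the quoted message, running prompt() again on the same file name would overwrite any edits made to the skeleton.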

Martin



Re: [R] [Pkg-Collaboratos] BioShapes Almost-Package

2023-09-04 Thread Martin Maechler
> Jeff Newmiller 
> on Sun, 03 Sep 2023 19:47:32 -0700 writes:

> Leonard... the reason roxygen exists is to allow markup in
> source files to be used to automatically generate the
> numerous files required by standard R packages as
> documented in Writing R Extensions.  If your goal is to
> not use source files this way then the solution is to not
> use roxygen at all. Just create those files yourself by
> directly editing them from scratch.

Yes. Many experienced R programmers do not use Roxygen
(or use it only rarely; e.g., together with ESS (Emacs Speaks
 Statistics) to make the initial creation, or sometimes a thorough
 updating, of the help pages man/*.Rd more convenient).

There are different tastes and different work flows for
different people.

Martin


> On September 3, 2023 7:06:09 PM PDT, Leonard Mada via
> R-help  wrote:
>> Thank you Bert.
>> 
>> 
>> Clarification:
>> 
>> Indeed, I am using an add-on package: it is customary for
>> that package - that is what I have seen - to have the
>> entire documentation included as comments in the R src
>> files. (But maybe I am wrong.)
>> 
>> 
>> I will try to find some time over the next few days to
>> explore in more detail the R documentation. Although, I
>> do not know how this will interact with the add-on
>> package.
>> 
>> 
>> Sincerely,
>> 
>> 
>> Leonard
>> 
>> 
>> On 9/4/2023 4:58 AM, Bert Gunter wrote:
>>> 1. R-package-devel is where queries about package
>>> protocols should go.
>>> 
>>> 2. But...  "Is there a succinct, but sufficiently
>>> informative description of documentation tools?"
>>> "Writing R Extensions" (shipped with R) is *the*
>>> reference for R documentation. Whether it's sufficiently
>>> "succinct" for you, I cannot say.
>>> 
>>> "I find that including the documentation in the source
>>> files is very distracting."  ?? R documentation (.Rd)
>>> files are separate from source (.R) files.  Inline
>>> documentation in source files is an "add-on" capability
>>> provided by optional packages if one prefers to do
>>> this. Such packages parse the source files to extract
>>> the documentation into the .Rd files. So not sure what
>>> you mean here. Apologies if I have misunderstood.
>>> 
>>> " I would prefer to have only basic comments in the
>>> source files and an expanded documentation in a separate
>>> location."  If I understand you correctly, this is
>>> exactly what the R package process specifies. Again, see
>>> the "Writing R Extensions" manual for details.
>>> 
>>> Also, if you wish to have your package on CRAN, it
>>> requires that the package documents all functions in the
>>> package as specified by the "Writing ..." manual.
>>> 
>>> Again, further questions and elaboration should go to
>>> the R-package-devel list, although I think the manual is
>>> really the authoritative resource to follow.
>>> 
>>> Cheers, Bert
>>> 
>>> 
>>> 
>>> On Sun, Sep 3, 2023 at 5:06 PM Leonard Mada via R-help
>>>  wrote:
>>> 
>>> Dear R-List Members,
>>> 
>>> I am looking for collaborators to further develop the
>>> BioShapes almost-package. I added a brief description
>>> below.
>>> 
>>> A.) BioShapes (Almost-) Package
>>> 
>>> The aim of the BioShapes quasi-package is to facilitate
>>> the generation of graphical objects resembling
>>> biological and chemical entities, enabling the
>>> construction of diagrams based on these objects. It
>>> currently includes functions to generate diagrams
>>> depicting viral particles, liposomes, double helix / DNA
>>> strands, various cell types (like neurons, brush-border
>>> cells and duct cells), Ig-domains, as well as more basic
>>> shapes.
>>> 
>>> It should offer researchers in the field of biological
>>> and chemical sciences a tool to easily generate diagrams
>>> depicting the studied biological processes.
>>> 
>>> The package lacks proper documentation and is not yet
>>> released on CRAN. However, it is available on GitHub:
>>> https://github.com/discoleo/BioShapes
>>> 
>>> Although there are 27 unique cloners on GitHub, I am
>>> still looking for contributors and collaborators. I
>>> would appreciate any collaborations to develop it
>>> further. I can be contacted both by email and on GitHub.
>>> 
>>> 
>>> B.) Documentation Tools
>>> 
>>> Is there a succinct, but sufficiently informative
>>> description of documentation tools?  I find that
>>> including the documentation in the source files is very
>>> distracting. I would prefer to have only basic comments
>>> in the source files and an expanded documentation in a
>>> separate location.
>>> 

[R] Time out error while connecting to Github repository

2023-09-04 Thread siddharth sahasrabudhe via R-help
I want to access the .csv file from my github repository. While connecting
to the Github repository I am getting the following error:

Error in curl::curl_fetch_memory(file) :
Timeout was reached: [raw.githubusercontent.com] Failed to connect to
raw.githubusercontent.com port 443 after 5250 ms: Timed out

The R-code is as below:

library(tidyverse)
library(rio)

data <- import("https://raw.githubusercontent.com/siddharth-sahasrabudhe/Youtube-video-files/main/deck.csv")

Can you please suggest how I can resolve this issue?

-- 

Regards
Siddharth Sahasrabudhe
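
One way to narrow down a timeout like this is to separate the download from the parsing. The sketch below is an assumption about what might help, not a confirmed fix: `fetch_csv()` is a hypothetical helper that temporarily raises base R's `timeout` option, downloads with `download.file()` and reports a failure instead of stopping. If the connection is blocked by a firewall or proxy, a longer timeout will not change the outcome.

```r
# Hypothetical helper: download a CSV to a temporary file, then read it.
# Returns NULL (with a message) if the download fails.
fetch_csv <- function(url, timeout_sec = 120) {
  old <- options(timeout = timeout_sec)   # base R download timeout, in seconds
  on.exit(options(old), add = TRUE)       # restore the previous setting

  dest <- tempfile(fileext = ".csv")
  status <- tryCatch(
    download.file(url, destfile = dest, mode = "wb", quiet = TRUE),
    error = function(e) {
      message("Download failed: ", conditionMessage(e))
      NA_integer_
    }
  )

  # download.file() returns 0 on success
  if (is.na(status) || status != 0L) return(NULL)
  read.csv(dest)
}

# Intended use with the URL from the message above (not run here):
# deck <- fetch_csv("https://raw.githubusercontent.com/siddharth-sahasrabudhe/Youtube-video-files/main/deck.csv")
```

If the download step itself fails, the problem lies with the network connection and not with rio or the CSV file.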



Re: [R] aggregate formula - differing results

2023-09-04 Thread Ivan Calandra
Thanks Iago for the pointer.


It then means that na.rm = TRUE is not applied in the same way within 
aggregate() as opposed to dplyr::group_by() + summarise(), right? Within 
aggregate, it behaves like na.omit(), that is, it excludes the 
incomplete cases (whole rows), whereas with group_by() + summarise() it 
is applied on each vector (variable), which is what I actually would expect.


I hadn't shown it, but doBy::summaryBy() produces the same results as 
group_by() + summarise().
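
For comparison, the per-column behaviour can be requested from the formula method of aggregate() by overriding its default `na.action`. A minimal sketch, assuming a made-up data frame `toy` (hypothetical, not the data from this thread): with `na.action = na.pass` the incomplete rows reach `FUN`, and `na.rm = TRUE` then drops the NAs separately within each column.

```r
# Hypothetical toy data: only WEIGHT contains missing values
toy <- data.frame(
  RAWMAT = rep(c("FLINT", "HORNFELS"), each = 3),
  Length = c(44, 30, 46, 42, 24, 30),
  Width  = c(46, 67, 52, 33, 21, 63),
  WEIGHT = c(34, NA, 26, 30, 4, NA)
)

# na.pass keeps all rows in the model frame; mean(na.rm = TRUE) then
# handles the NAs one column at a time
per_column <- aggregate(. ~ RAWMAT, data = toy, FUN = mean, na.rm = TRUE,
                        na.action = na.pass)

per_column$Length   # 40 32 (all rows used)
per_column$WEIGHT   # 30 17 (NAs dropped within WEIGHT only)
```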


Ivan


On 04/09/2023 12:45, Iago Giné Vázquez wrote:
> It seems that the issue is the missing values. If in #1 you use the 
> dataset na.omit(my_data) instead of my_data, you get the same output 
> as in #2 and in #4, where all observations with missing data are 
> removed since you are including all the variables.
>
>
> The second dataset has no issue since it has no missing data.
>
> Iago
> 
> *From:* R-help on behalf of Ivan Calandra
> *Sent:* Monday, 4 September 2023 11:44
> *To:* R-help
> *Subject:* [R] aggregate formula - differing results
> Dear useRs,
>
> I have just stumbled across a behavior in aggregate() that I cannot
> explain. Any help would be appreciated!
>
> Sample data:
> my_data <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100",
> "FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", "HORN-103",
> "HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 130.1, 140.41,
> 121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = c(1736.87, 1571.83,
> 1656.46, 1247.18, 1177.47, 1169.26, 444.61, 1791.48, 461.15, 1127.2),
> Length = c(44.384, 29.831, 43.869, 48.011, 54.109, 41.742, 23.854,
> 32.075, 21.337, 35.459), Width = c(45.982, 67.303, 52.679, 26.42,
> 25.149, 33.427, 20.683, 62.783, 26.417, 35.297), PLATWIDTH = c(38.84,
> NA, 15.33, 30.37, 11.44, 14.88, 13.86, NA, NA, 26.71), PLATTHICK =
> c(8.67, NA, 7.99, 11.69, 3.3, 16.52, 4.58, NA, NA, 9.35), EPA = c(78,
> NA, 78, 54, 72, 49, 56, NA, NA, 56), THICKNESS = c(10.97, NA, 9.36, 6.4,
> 5.89, 11.05, 4.9, NA, NA, 10.08), WEIGHT = c(34.3, NA, 25.5, 18.6, 14.9,
> 29.5, 4.5, NA, NA, 23), RAWMAT = c("FLINT", "FLINT", "FLINT", "FLINT",
> "FLINT", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS")),
> row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 113L, 114L, 115L), class =
> "data.frame")
>
> 1) Simple aggregation with 2 variables:
> aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data, FUN = mean,
> na.rm = TRUE)
>
> 2) Using the dot notation - different results:
> aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE)
>
> 3) Using dplyr, I get the same results as #1:
> group_by(my_data, RAWMAT) %>%
>    summarise(across(c("Length", "Width"), ~ mean(.x, na.rm = TRUE)))
>
> 4) It gets weirder: using all columns in #1 gives the same results as in
> #2 but different from #1 and #3
> aggregate(cbind(EdgeLength, SurfaceArea, Length, Width, PLATWIDTH,
> PLATTHICK, EPA, THICKNESS, WEIGHT) ~ RAWMAT, data = my_data, FUN = mean,
> na.rm = TRUE)
>
> So it seems it is not only due to the notation (cbind() vs. dot). Is it
> a bug? A peculiar thing in my dataset? I tend to think this could be due
> to some variables (or their names) as all notations seem to agree when I
> remove some variables (although I haven't found out which variable(s) is
> (are) at fault), e.g.:
>
> my_data2 <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100",
> "FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", "HORN-103",
> "HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 130.1, 140.41,
> 121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = c(1736.87, 1571.83,
> 1656.46, 1247.18, 1177.47, 1169.26, 444.61, 1791.48, 461.15, 1127.2),
> Length = c(44.384, 29.831, 43.869, 48.011, 54.109, 41.742, 23.854,
> 32.075, 21.337, 35.459), Width = c(45.982, 67.303, 52.679, 26.42,
> 25.149, 33.427, 20.683, 62.783, 26.417, 35.297), RAWMAT = c("FLINT",
> "FLINT", "FLINT", "FLINT", "FLINT", "HORNFELS", "HORNFELS", "HORNFELS",
> "HORNFELS", "HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L,
> 113L, 114L, 115L), class = "data.frame")
>
> aggregate(cbind(EdgeLength, SurfaceArea, Length, Width) ~ RAWMAT, data =
> my_data2, FUN = mean, na.rm = TRUE)
>
> aggregate(. ~ RAWMAT, data = my_data2[-1], FUN = mean, na.rm = TRUE)
>
> group_by(my_data2, RAWMAT) %>%
>    summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))
>
>
> Thank you in advance for any hint.
> Best wishes,
> Ivan
>

Re: [R] aggregate formula - differing results

2023-09-04 Thread Iago Giné Vázquez
It seems that the issue is the missing values. If in #1 you use the dataset 
na.omit(my_data) instead of my_data, you get the same output as in #2 and in 
#4, where all observations with missing data are removed since you are 
including all the variables.


The second dataset has no issue since it has no missing data.
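
This can be checked on a small example. The sketch below uses a made-up data frame `toy` (hypothetical, not the data from this thread) with NAs in a single column. It illustrates that the row-wise removal comes from the default `na.action = na.omit` of the formula method, which is applied to all variables in the formula before `FUN` is called.

```r
# Hypothetical toy data: only WEIGHT contains missing values
toy <- data.frame(
  RAWMAT = rep(c("FLINT", "HORNFELS"), each = 3),
  Length = c(44, 30, 46, 42, 24, 30),
  Width  = c(46, 67, 52, 33, 21, 63),
  WEIGHT = c(34, NA, 26, 30, 4, NA)
)

# Only Length and Width are in the formula: neither has NAs,
# so all 6 rows are used
two_cols <- aggregate(cbind(Length, Width) ~ RAWMAT, data = toy,
                      FUN = mean, na.rm = TRUE)

# Dot notation: WEIGHT is in the formula too, so the default
# na.action = na.omit drops rows 2 and 6 before FUN is called
all_cols <- aggregate(. ~ RAWMAT, data = toy, FUN = mean, na.rm = TRUE)

# Removing the incomplete rows first reproduces the dot-notation result
two_cols_complete <- aggregate(cbind(Length, Width) ~ RAWMAT,
                               data = na.omit(toy), FUN = mean, na.rm = TRUE)

two_cols$Length            # 40 32 (all rows)
all_cols$Length            # 45 33 (complete rows only)
two_cols_complete$Length   # 45 33
```

With the dot notation, rows 2 and 6 are dropped for every column, not just for `WEIGHT`, which is why `na.rm = TRUE` has nothing left to remove.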

Iago

From: R-help on behalf of Ivan Calandra
Sent: Monday, 4 September 2023 11:44
To: R-help
Subject: [R] aggregate formula - differing results

Dear useRs,

I have just stumbled across a behavior in aggregate() that I cannot
explain. Any help would be appreciated!

Sample data:
my_data <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100",
"FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", "HORN-103",
"HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 130.1, 140.41,
121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = c(1736.87, 1571.83,
1656.46, 1247.18, 1177.47, 1169.26, 444.61, 1791.48, 461.15, 1127.2),
Length = c(44.384, 29.831, 43.869, 48.011, 54.109, 41.742, 23.854,
32.075, 21.337, 35.459), Width = c(45.982, 67.303, 52.679, 26.42,
25.149, 33.427, 20.683, 62.783, 26.417, 35.297), PLATWIDTH = c(38.84,
NA, 15.33, 30.37, 11.44, 14.88, 13.86, NA, NA, 26.71), PLATTHICK =
c(8.67, NA, 7.99, 11.69, 3.3, 16.52, 4.58, NA, NA, 9.35), EPA = c(78,
NA, 78, 54, 72, 49, 56, NA, NA, 56), THICKNESS = c(10.97, NA, 9.36, 6.4,
5.89, 11.05, 4.9, NA, NA, 10.08), WEIGHT = c(34.3, NA, 25.5, 18.6, 14.9,
29.5, 4.5, NA, NA, 23), RAWMAT = c("FLINT", "FLINT", "FLINT", "FLINT",
"FLINT", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS")),
row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 113L, 114L, 115L), class =
"data.frame")

1) Simple aggregation with 2 variables:
aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data, FUN = mean,
na.rm = TRUE)

2) Using the dot notation - different results:
aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE)

3) Using dplyr, I get the same results as #1:
group_by(my_data, RAWMAT) %>%
   summarise(across(c("Length", "Width"), ~ mean(.x, na.rm = TRUE)))

4) It gets weirder: using all columns in #1 gives the same results as in
#2 but different from #1 and #3
aggregate(cbind(EdgeLength, SurfaceArea, Length, Width, PLATWIDTH,
PLATTHICK, EPA, THICKNESS, WEIGHT) ~ RAWMAT, data = my_data, FUN = mean,
na.rm = TRUE)

So it seems it is not only due to the notation (cbind() vs. dot). Is it
a bug? A peculiar thing in my dataset? I tend to think this could be due
to some variables (or their names) as all notations seem to agree when I
remove some variables (although I haven't found out which variable(s) is
(are) at fault), e.g.:

my_data2 <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100",
"FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", "HORN-103",
"HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 130.1, 140.41,
121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = c(1736.87, 1571.83,
1656.46, 1247.18, 1177.47, 1169.26, 444.61, 1791.48, 461.15, 1127.2),
Length = c(44.384, 29.831, 43.869, 48.011, 54.109, 41.742, 23.854,
32.075, 21.337, 35.459), Width = c(45.982, 67.303, 52.679, 26.42,
25.149, 33.427, 20.683, 62.783, 26.417, 35.297), RAWMAT = c("FLINT",
"FLINT", "FLINT", "FLINT", "FLINT", "HORNFELS", "HORNFELS", "HORNFELS",
"HORNFELS", "HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L,
113L, 114L, 115L), class = "data.frame")

aggregate(cbind(EdgeLength, SurfaceArea, Length, Width) ~ RAWMAT, data =
my_data2, FUN = mean, na.rm = TRUE)

aggregate(. ~ RAWMAT, data = my_data2[-1], FUN = mean, na.rm = TRUE)

group_by(my_data2, RAWMAT) %>%
   summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))


Thank you in advance for any hint.
Best wishes,
Ivan





[R] aggregate formula - differing results

2023-09-04 Thread Ivan Calandra

Dear useRs,

I have just stumbled across a behavior in aggregate() that I cannot 
explain. Any help would be appreciated!


Sample data:
my_data <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100", 
"FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", "HORN-103", 
"HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 130.1, 140.41, 
121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = c(1736.87, 1571.83, 
1656.46, 1247.18, 1177.47, 1169.26, 444.61, 1791.48, 461.15, 1127.2), 
Length = c(44.384, 29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 
32.075, 21.337, 35.459), Width = c(45.982, 67.303, 52.679, 26.42, 
25.149, 33.427, 20.683, 62.783, 26.417, 35.297), PLATWIDTH = c(38.84, 
NA, 15.33, 30.37, 11.44, 14.88, 13.86, NA, NA, 26.71), PLATTHICK = 
c(8.67, NA, 7.99, 11.69, 3.3, 16.52, 4.58, NA, NA, 9.35), EPA = c(78, 
NA, 78, 54, 72, 49, 56, NA, NA, 56), THICKNESS = c(10.97, NA, 9.36, 6.4, 
5.89, 11.05, 4.9, NA, NA, 10.08), WEIGHT = c(34.3, NA, 25.5, 18.6, 14.9, 
29.5, 4.5, NA, NA, 23), RAWMAT = c("FLINT", "FLINT", "FLINT", "FLINT", 
"FLINT", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS")), 
row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 113L, 114L, 115L), class = 
"data.frame")


1) Simple aggregation with 2 variables:
aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data, FUN = mean, 
na.rm = TRUE)


2) Using the dot notation - different results:
aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE)

3) Using dplyr, I get the same results as #1:
library(dplyr)
group_by(my_data, RAWMAT) %>%
  summarise(across(c("Length", "Width"), ~ mean(.x, na.rm = TRUE)))

4) It gets weirder: using all columns in the cbind() call from #1 gives 
the same results as #2, but different from #1 and #3:
aggregate(cbind(EdgeLength, SurfaceArea, Length, Width, PLATWIDTH, 
PLATTHICK, EPA, THICKNESS, WEIGHT) ~ RAWMAT, data = my_data, FUN = mean, 
na.rm = TRUE)


So it seems it is not only due to the notation (cbind() vs. dot). Is it 
a bug? A peculiar thing in my dataset? I tend to think this could be due 
to some variables (or their names) as all notations seem to agree when I 
remove some variables (although I haven't found out which variable(s) is 
(are) at fault), e.g.:


my_data2 <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100", 
"FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", "HORN-103", 
"HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 130.1, 140.41, 
121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = c(1736.87, 1571.83, 
1656.46, 1247.18, 1177.47, 1169.26, 444.61, 1791.48, 461.15, 1127.2), 
Length = c(44.384, 29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 
32.075, 21.337, 35.459), Width = c(45.982, 67.303, 52.679, 26.42, 
25.149, 33.427, 20.683, 62.783, 26.417, 35.297), RAWMAT = c("FLINT", 
"FLINT", "FLINT", "FLINT", "FLINT", "HORNFELS", "HORNFELS", "HORNFELS", 
"HORNFELS", "HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 
113L, 114L, 115L), class = "data.frame")


aggregate(cbind(EdgeLength, SurfaceArea, Length, Width) ~ RAWMAT, data = 
my_data2, FUN = mean, na.rm = TRUE)


aggregate(. ~ RAWMAT, data = my_data2[-1], FUN = mean, na.rm = TRUE)

group_by(my_data2, RAWMAT) %>%
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))
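
Note on a likely cause (not part of the original question, so treat it as an assumption about this data set): the formula method of aggregate() has na.action = na.omit as its default, so any row with an NA in any variable named in the formula is dropped before FUN is called, and na.rm = TRUE never sees those rows. A minimal sketch with made-up data (the data frame d and its columns g, x, y are hypothetical):

```r
# Made-up data (not the poster's): column y has one NA in group "a"
d <- data.frame(
  g = c("a", "a", "b", "b"),
  x = c(1, 3, 5, 7),
  y = c(10, NA, 30, 40)
)

# Only x is in the formula, so all four rows are used
r1 <- aggregate(x ~ g, data = d, FUN = mean, na.rm = TRUE)

# Dot notation pulls y into the formula; the default na.action = na.omit
# removes row 2 entirely, so the mean of x in group "a" changes
r2 <- aggregate(. ~ g, data = d, FUN = mean, na.rm = TRUE)

# Keep the rows and let na.rm = TRUE deal with NAs column by column
r3 <- aggregate(. ~ g, data = d, FUN = mean, na.rm = TRUE,
                na.action = na.pass)

r1$x  # 2 6
r2$x  # 1 6
r3$x  # 2 6
```

If this is what happens here, adding na.action = na.pass to calls #2 and #4 should make their Length and Width means agree with #1 and #3.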


Thank you in advance for any hint.
Best wishes,
Ivan




*LEIBNIZ-ZENTRUM*
*FÜR ARCHÄOLOGIE*

*Dr. Ivan CALANDRA*
Head of IMPALA (IMaging Platform At LeizA)

*MONREPOS* Archaeological Research Centre, Schloss Monrepos
56567 Neuwied, Germany

T: +49 2631 9772 243
T: +49 6131 8885 543
ivan.calan...@leiza.de

leiza.de 



LEIZA is a foundation under public law of the State of 
Rhineland-Palatinate and the City of Mainz. Its headquarters are in 
Mainz. Supervision is carried out by the Ministry of Science and Health 
of the State of Rhineland-Palatinate. LEIZA is a research museum of the 
Leibniz Association.



Re: [R] [Pkg-Collaboratos] BioShapes Almost-Package

2023-09-04 Thread Duncan Murdoch

On 03/09/2023 10:47 p.m., Jeff Newmiller wrote:

> Leonard... the reason roxygen exists is to allow markup in source files to be
> used to automatically generate the numerous files required by standard R
> packages as documented in Writing R Extensions.
>
> If your goal is to not use source files this way then the solution is to not
> use roxygen at all. Just create those files yourself by directly editing them
> from scratch.


Just a bit of elaboration on Jeff's suggestion -- here's the workflow I 
prefer to using Roxygen.


Once you have a function that works:

1.  install the package
2.  set your working directory to the package "man" directory
3.  run `prompt(functionname)`
4.  edit `functionname.Rd` in the "man" directory, which will already be 
filled in as a skeleton help file, with comments describing what else to 
add.


Don't run prompt() again after editing, or you'll lose all your edits. 
But this is a good way to get started.
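
The steps above can be sketched roughly as follows. This is only an illustration: circle_area is a hypothetical function, and a temporary directory stands in for the package's "man" directory. Instead of changing the working directory, the sketch uses the filename argument of prompt() and checks for an existing file first, so an edited help file is not overwritten.

```r
# Hypothetical function standing in for "a function that works"
circle_area <- function(r) pi * r^2

# Hypothetical stand-in for the package's "man" directory
man_dir <- file.path(tempdir(), "man")
dir.create(man_dir, showWarnings = FALSE)

rd_file <- file.path(man_dir, "circle_area.Rd")

# Only create the skeleton if no help file exists yet,
# so that later edits are not lost
if (!file.exists(rd_file)) {
  utils::prompt(circle_area, filename = rd_file)
}

# Look at the first lines of the generated skeleton
head(readLines(rd_file))
```

The generated file is then edited by hand, as in step 4.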


I think for the first few times the comments are really helpful, but I 
wouldn't mind a way to suppress them.


Duncan Murdoch
