Re: [Rd] na.omit inconsistent with is.na on list

2021-08-12 Thread Gabriel Becker
On Thu, Aug 12, 2021 at 4:30 PM Toby Hocking  wrote:

> Hi Gabe thanks for the feedback.
>
> On Thu, Aug 12, 2021 at 1:19 PM Gabriel Becker 
> wrote:
>
>> Hi Toby,
>>
>> This definitely appears intentional, the first  expression of
>> stats:::na.omit.default is
>>
>>     if (!is.atomic(object))
>>         return(object)
>>
> Based on this code, it does seem that the documentation could be clarified
> to say atomic vectors.
>
>>
>> So it is explicitly just returning the object in non-atomic cases, which
>> includes lists. I was not involved in this decision (obviously) but my
>> guess is that it is due to the fact that what constitutes an observation
>> "being complete" is unclear in the list case. What should
>>
>> na.omit(list(5, NA, c(NA, 5)))
>>
>> return? Just the first element, or the first and the last? It seems, at
>> least to me, unclear.
>>
> I agree in principle/theory that it is unclear, but in practice is.na has
> an unambiguous answer (if a list element is a scalar NA then it is considered
> missing, otherwise not).
>

Well, yes it's unambiguous, but I would argue less likely than the other
option to be correct. Remember what na.omit is supposed to do: "remove
observations which are not complete".

Now for data.frames, this means it removes any row (i.e. observation,
regardless of its internal structure) where *any* column contains an NA. The most
analogous interpretation of na.omit on a list, in the well-behaved (i.e. list
of atomic vectors) case, I think, is that we consider it a ragged
collection of "observations", in which case x[!is.na(x)] with x a list
would do the wrong thing because it is not checking these "observations"
for completeness.

Perhaps others disagree with me about that. In any case, this only works
when you can check the elements of the list for "completeness" at all; a
list can have anything for elements, and then checking for completeness
becomes impossible...
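
To make the two readings concrete, here is a minimal sketch for the well-behaved case (a list of atomic vectors). The variable names are illustrative only, and neither subsetting expression is what na.omit does today:

```r
# Illustrative only: two candidate meanings of "complete" for a list.
x <- list(5, NA, c(NA, 5))

# Reading 1 (is.na on the list): drop elements that are a single NA.
drop_scalar_na <- x[!is.na(x)]

# Reading 2 ("ragged observations"): drop elements containing any NA.
# This relies on every element being atomic.
drop_any_na <- x[!vapply(x, anyNA, logical(1))]

str(na.omit(x))      # na.omit.default returns a list unchanged
str(drop_scalar_na)  # keeps 5 and c(NA, 5)
str(drop_any_na)     # keeps only 5
```

The two readings disagree on the third element, c(NA, 5), which is the ambiguity in question.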

As is, I do also wonder if a warning should be thrown letting the user know
that their call isn't doing ANY of the possible things it could mean...

Best,
~G


>> A small change to the documentation to add "atomic (in the sense of
>> is.atomic returning \code{TRUE})" in front of "vectors" or similar, where
>> the supported object types are described, seems justified, though, imho, as the
>> current documentation is either ambiguous or technically incorrect,
>> depending on what we take "vector" to mean.
>>
>> Best,
>> ~G
>>
>> On Wed, Aug 11, 2021 at 10:16 PM Toby Hocking  wrote:
>>
>>> Also, the na.omit method for data.frame with list column seems to be
>>> inconsistent with is.na,
>>>
>>> > L <- list(NULL, NA, 0)
>>> > str(f <- data.frame(I(L)))
>>> 'data.frame': 3 obs. of  1 variable:
>>>  $ L:List of 3
>>>   ..$ : NULL
>>>   ..$ : logi NA
>>>   ..$ : num 0
>>>   ..- attr(*, "class")= chr "AsIs"
>>> > is.na(f)
>>>          L
>>> [1,] FALSE
>>> [2,]  TRUE
>>> [3,] FALSE
>>> > na.omit(f)
>>>    L
>>> 1
>>> 2 NA
>>> 3  0
>>>
>>> On Wed, Aug 11, 2021 at 9:58 PM Toby Hocking  wrote:
>>>
>>> > na.omit is documented as "na.omit returns the object with incomplete
>>> cases
>>> > removed." and "At present these will handle vectors," so I expected
>>> that
>>> > when it is used on a list, it should return the same thing as if we
>>> subset
>>> > via is.na; however I observed the following,
>>> >
>>> > > L <- list(NULL, NA, 0)
>>> > > str(L[!is.na(L)])
>>> > List of 2
>>> >  $ : NULL
>>> >  $ : num 0
>>> > > str(na.omit(L))
>>> > List of 3
>>> >  $ : NULL
>>> >  $ : logi NA
>>> >  $ : num 0
>>> >
>>> > Should na.omit be fixed so that it returns a result that is consistent
>>> > with is.na? I assume that is.na is the canonical definition of what
>>> > should be considered a missing value in R.
>>> >
>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] na.omit inconsistent with is.na on list

2021-08-12 Thread Toby Hocking
Hi Gabe thanks for the feedback.

On Thu, Aug 12, 2021 at 1:19 PM Gabriel Becker 
wrote:

> Hi Toby,
>
> This definitely appears intentional, the first  expression of
> stats:::na.omit.default is
>
>     if (!is.atomic(object))
>         return(object)
>

Based on this code, it does seem that the documentation could be clarified
to say atomic vectors.

>
> So it is explicitly just returning the object in non-atomic cases, which
> includes lists. I was not involved in this decision (obviously) but my
> guess is that it is due to the fact that what constitutes an observation
> "being complete" is unclear in the list case. What should
>
> na.omit(list(5, NA, c(NA, 5)))
>
> return? Just the first element, or the first and the last? It seems, at
> least to me, unclear.
>
I agree in principle/theory that it is unclear, but in practice is.na has
an unambiguous answer (if a list element is a scalar NA then it is considered
missing, otherwise not).

> A small change to the documentation to add "atomic (in the sense of
> is.atomic returning \code{TRUE})" in front of "vectors" or similar, where
> the supported object types are described, seems justified, though, imho, as the
> current documentation is either ambiguous or technically incorrect,
> depending on what we take "vector" to mean.
>
> Best,
> ~G
>
> On Wed, Aug 11, 2021 at 10:16 PM Toby Hocking  wrote:
>
>> Also, the na.omit method for data.frame with list column seems to be
>> inconsistent with is.na,
>>
>> > L <- list(NULL, NA, 0)
>> > str(f <- data.frame(I(L)))
>> 'data.frame': 3 obs. of  1 variable:
>>  $ L:List of 3
>>   ..$ : NULL
>>   ..$ : logi NA
>>   ..$ : num 0
>>   ..- attr(*, "class")= chr "AsIs"
>> > is.na(f)
>>          L
>> [1,] FALSE
>> [2,]  TRUE
>> [3,] FALSE
>> > na.omit(f)
>>    L
>> 1
>> 2 NA
>> 3  0
>>
>> On Wed, Aug 11, 2021 at 9:58 PM Toby Hocking  wrote:
>>
>> > na.omit is documented as "na.omit returns the object with incomplete
>> cases
>> > removed." and "At present these will handle vectors," so I expected that
>> > when it is used on a list, it should return the same thing as if we
>> subset
>> > via is.na; however I observed the following,
>> >
>> > > L <- list(NULL, NA, 0)
>> > > str(L[!is.na(L)])
>> > List of 2
>> >  $ : NULL
>> >  $ : num 0
>> > > str(na.omit(L))
>> > List of 3
>> >  $ : NULL
>> >  $ : logi NA
>> >  $ : num 0
>> >
>> > Should na.omit be fixed so that it returns a result that is consistent
>> > with is.na? I assume that is.na is the canonical definition of what
>> > should be considered a missing value in R.
>> >
>>
>



Re: [Rd] Force quitting a FORK cluster node on macOS and Solaris wreaks havoc

2021-08-12 Thread Simon Urbanek


Henrik,

I'm not quite sure I understand the report to be honest.

Just a quick comment here - using quit() in a forked child is not allowed, 
because the R clean-up is only intended for the master as it will be blowing 
away the master's state, connections, working directory, running master's exit 
handlers etc. That's why the children have to use either abort or mcexit() to 
terminate - which is what mcparallel() does. If you use q() a lot of things go 
wrong no matter the platform - e.g. try using ? in the master session after 
sourcing your code.
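
As a minimal sketch of the supported route mentioned above (Unix-alikes only, since forking is not available on Windows), the child below is created by mcparallel() and is never asked to quit(). The payload expression is purely illustrative:

```r
# Sketch only: run work in a forked child without calling quit() in it.
library(parallel)

# mcparallel() forks a child and evaluates the expression there; the
# child is ended by the parallel package, not by R's usual clean-up.
job <- mcparallel({
  a <- 42
  a * 2
})

# mccollect() waits for the child and returns a list of results.
res <- mccollect(job)
print(res[[1]])
```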

Cheers,
Simon


> On 12/08/2021, at 8:22 PM, Henrik Bengtsson  
> wrote:
> 
> The following smells like a bug in R to me, because it puts the main R
> session into an unstable state.  Consider the following R script:
> 
> a <- 42
> message("a=", a)
> cl <- parallel::makeCluster(1L, type="FORK")
> try(parallel::clusterEvalQ(cl, quit(save="no")))
> message("parallel:::isChild()=", parallel:::isChild())
> message("a=", a)
> rm(a)
> 
> The purpose of this was to emulate what happens when a parallel
> worker crashes.
> 
> Now, if you source() the above on macOS, you might(*) end up with:
> 
>> a <- 42
>> message("a=", a)
> a=42
>> cl <- parallel::makeCluster(1L, type="FORK")
>> try(parallel::clusterEvalQ(cl, quit(save="no")))
> Error: Error in unserialize(node$con) : error reading from connection
>> message("parallel:::isChild()=", parallel:::isChild())
> parallel:::isChild()=FALSE
>> message("a=", a)
> a=42
>> rm(a)
>> try(parallel::clusterEvalQ(cl, quit(save="no")))
> Error: Error in unserialize(node$con) : error reading from connection
>> message("parallel:::isChild()=", parallel:::isChild())
> parallel:::isChild()=FALSE
>> message("a=", a)
> Error: Error in message("a=", a) : object 'a' not found
> Execution halted
> 
> Note how 'rm(a)' is supposed to be the last line of code to be
> evaluated.  However, the force quitting of the FORK cluster node
> appears to result in the main code being evaluated twice (in
> parallel?).
> 
> (*) This does not happen on all macOS variants. For example, it works
> fine on CRAN's 'r-release-macos-x86_64' but it does give the above
> behavior on 'r-release-macos-arm64'.  I can reproduce it on GitHub
> Actions 
> (https://github.com/HenrikBengtsson/teeny/runs/3309235106?check_suite_focus=true#step:10:219)
> but not on R-hub's 'macos-highsierra-release' and
> 'macos-highsierra-release-cran'.  I can also reproduce it on R-hub's
> 'solaris-x86-patched' and 'solaris-x86-patched-ods' machines.  However,
> I still haven't found a Linux machine where this happens.
> 
> If one replaces quit(save="no") with tools::pskill(Sys.getpid()) or
> parallel:::mcexit(0L), this behavior does not take place (at least not
> on GitHub Actions and R-hub).
> 
> I don't have access to a macOS or a Solaris machine, so I cannot
> investigate further myself. For example, could it be an issue with
> quit(), or is it possible to trigger by other means? And more
> importantly, should this be fixed? Also, I'd be curious what happens
> if you run the above in an interactive R session.
> 
> /Henrik
> 


Re: [Rd] na.omit inconsistent with is.na on list

2021-08-12 Thread Gabriel Becker
Hi Toby,

This definitely appears intentional; the first expression of
stats:::na.omit.default is

    if (!is.atomic(object))
        return(object)


So it is explicitly just returning the object in non-atomic cases, which
includes lists. I was not involved in this decision (obviously) but my
guess is that it is due to the fact that what constitutes an observation
"being complete" is unclear in the list case. What should

na.omit(list(5, NA, c(NA, 5)))

return? Just the first element, or the first and the last? It seems, at
least to me, unclear. A small change to the documentation to add "atomic
(in the sense of is.atomic returning \code{TRUE})" in front of "vectors"
or similar, where the supported object types are described, seems justified,
though, imho, as the current documentation is either ambiguous or
technically incorrect, depending on what we take "vector" to mean.

Best,
~G

On Wed, Aug 11, 2021 at 10:16 PM Toby Hocking  wrote:

> Also, the na.omit method for data.frame with list column seems to be
> inconsistent with is.na,
>
> > L <- list(NULL, NA, 0)
> > str(f <- data.frame(I(L)))
> 'data.frame': 3 obs. of  1 variable:
>  $ L:List of 3
>   ..$ : NULL
>   ..$ : logi NA
>   ..$ : num 0
>   ..- attr(*, "class")= chr "AsIs"
> > is.na(f)
>          L
> [1,] FALSE
> [2,]  TRUE
> [3,] FALSE
> > na.omit(f)
>    L
> 1
> 2 NA
> 3  0
>
> On Wed, Aug 11, 2021 at 9:58 PM Toby Hocking  wrote:
>
> > na.omit is documented as "na.omit returns the object with incomplete
> cases
> > removed." and "At present these will handle vectors," so I expected that
> > when it is used on a list, it should return the same thing as if we
> subset
> > via is.na; however I observed the following,
> >
> > > L <- list(NULL, NA, 0)
> > > str(L[!is.na(L)])
> > List of 2
> >  $ : NULL
> >  $ : num 0
> > > str(na.omit(L))
> > List of 3
> >  $ : NULL
> >  $ : logi NA
> >  $ : num 0
> >
> > Should na.omit be fixed so that it returns a result that is consistent
> > with is.na? I assume that is.na is the canonical definition of what
> > should be considered a missing value in R.
> >
>


Re: [Rd] Rprofile.site function or variable definitions break with R 4.1

2021-08-12 Thread Gabriel Becker
Hi Andrew and Dirk,

The other question to think about is what your Rprofile.site was doing
before. We can infer from this error that it was apparently defining things
*in the namespace for the base package*. How often is that actually what
you wanted it to do, or a good idea?

I haven't played around with it, as I don't use Rprofile.site to actually
create/assign objects; like Dirk, I only set options or option-adjacent things
(such as .libPaths). But I imagine you could get it to put things into the
global environment, or attach a special "local config" entry to the search
path and put things there, if you so desired.
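
A rough, untested-at-startup sketch of both ideas follows. The entry name "local_config" and the object names are hypothetical (the latter borrowed from the error messages in the original report); attach(NULL, name = ...) is the documented way to create an empty environment on the search path:

```r
# Hypothetical Rprofile.site fragment; all names here are illustrative.
local({
  # Option 1: attach an empty environment to the search path and fill it.
  cfg <- attach(NULL, name = "local_config")
  assign("my_function_name", function() "hello from site profile", envir = cfg)

  # Option 2: assign directly into the global environment.
  assign("my_variable_name", 42, envir = globalenv())
})
```

Objects placed in the global environment can be removed with rm(list = ls()) or masked by a restored workspace, whereas the attached entry stays on the search path for the session, so the first option may be the more robust of the two.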

Best,
~G

On Thu, Aug 12, 2021 at 12:41 PM Dirk Eddelbuettel  wrote:

>
> On 12 August 2021 at 15:19, Andrew Piskorski wrote:
> | Ok, but what's the recommended way to actually USE Rprofile.site now?
> | Should I move all my local configuration into a special package, and
> | do nothing in Rprofile.site except require() that package?
>
> Exactly as before. I set my mirror as I have before and nothing changes
>
>   ## We set the cloud mirror, which is 'network-close' to everybody, as default
>   local({
>       r <- getOption("repos")
>       r["CRAN"] <- "https://cloud.r-project.org"
>       options(repos = r)
>   })
>
> I cannot help but think that you are shooting the messenger (here
> Rprofile.site) for an actual behaviour change in R itself ?
>
> Dirk
>
> --
> https://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>


Re: [Rd] Rprofile.site function or variable definitions break with R 4.1

2021-08-12 Thread Dirk Eddelbuettel


On 12 August 2021 at 15:19, Andrew Piskorski wrote:
| Ok, but what's the recommended way to actually USE Rprofile.site now?
| Should I move all my local configuration into a special package, and
| do nothing in Rprofile.site except require() that package?

Exactly as before. I set my mirror as I have before and nothing changes

  ## We set the cloud mirror, which is 'network-close' to everybody, as default
  local({
      r <- getOption("repos")
      r["CRAN"] <- "https://cloud.r-project.org"
      options(repos = r)
  })

I cannot help but think that you are shooting the messenger (here
Rprofile.site) for an actual behaviour change in R itself ?

Dirk

-- 
https://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



[Rd] Rprofile.site function or variable definitions break with R 4.1

2021-08-12 Thread Andrew Piskorski
With R 4.1, it seems you can no longer do much in your "Rprofile.site"
file.  Attempting to define any functions or set any variables there
gives errors like these:

  Error: cannot add binding of 'my_function_name' to the base environment
  Error: cannot add binding of 'my_variable_name' to the base environment

Presumably that's because of this change in R 4.1.0:

  https://cran.r-project.org/doc/manuals/r-patched/NEWS.html
  CHANGES IN R 4.1.0
  The base environment and its namespace are now locked (so one can no
  longer add bindings to these or remove from these).

Ok, but what's the recommended way to actually USE Rprofile.site now?
Should I move all my local configuration into a special package, and
do nothing in Rprofile.site except require() that package?

Thanks for your help and advice!

-- 
Andrew Piskorski 



Re: [Rd] Problem in random number generation for Marsaglia-Multicarry + Kinderman-Ramage

2021-08-12 Thread peter dalgaard
With these matters, one has to be careful to distinguish between method error 
and implementation error. 

The reason for changing the RNG setup in R v. 1.7.0 was pretty much this kind 
of unfortunate interaction between M-M and K-R. There are even more egregious 
examples for the distribution of maxima of normal variables. Try e.g.

RNGversion("1.6.0") # Marsaglia-Multicarry, Kinderman-Ramage
 s <- replicate(1e6,max(rnorm(10)))
 plot(density(s))

(A further bug in K-R was fixed in 1.7.1, but that is tangential to this.)

A glimpse of the source of the problem is seen in the "microcorrelations" in 
this:
 
RNGkind("Mar")
 m <- matrix(runif(4e7), 2)
 plot(m[1,], m[2,], xlim=c(0,1e-3), pch=".")
 m <- matrix(runif(4e7), 2)
 points(m[1,], m[2,], pch=".")

These examples are from 2003, so the issue has been known for almost 2 decades. 
However, to the best of our knowledge, the M-M RNG is a faithful implementation 
of their method, so we have left the RNG in R's arsenal, in case someone needed 
it for some specific purpose. 
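
For anyone wanting to compare generator combinations without disturbing their session, a small helper along these lines may be useful. It is only a sketch: the function name tail_counts and its defaults are made up for illustration, and suppressWarnings() is there in case a given R version warns about a particular combination:

```r
# Sketch: count extreme normal draws under a given RNG / normal.kind pair.
# The previous RNG kinds are restored on exit (the seed itself is not).
tail_counts <- function(kind, normal.kind, n = 1e6, cut = 3, seed = 1) {
  old <- RNGkind()
  on.exit(suppressWarnings(RNGkind(old[1], old[2])), add = TRUE)
  suppressWarnings(set.seed(seed, kind = kind, normal.kind = normal.kind))
  v <- rnorm(n)
  c(lower = sum(v < -cut), upper = sum(v > cut), expected = n * pnorm(-cut))
}

tail_counts("Mersenne-Twister", "Inversion")
tail_counts("Marsaglia-Multicarry", "Kinderman-Ramage")
```

With a sound combination the lower and upper counts should both sit close to the expected value; a large asymmetry between them points at an interaction of the kind described above.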

- pd

> On 12 Aug 2021, at 11:51 , GILLIBERT, Andre  
> wrote:
> 
> Dear R developers,
> 
> 
> I believe I have discovered a severe flaw that occurs with the combination of 
> the Marsaglia-Multicarry pseudo-random number generator and the 
> Kinderman-Ramage algorithm for generating normally distributed numbers.
> 
> 
> The sample program is very simple (tested on R-4.1.1 x86_64 on Windows 10):
> 
> set.seed(1, "Marsaglia-Multicarry", normal.kind="Kinderman-Ramage")
> v=rnorm(1e7)
> poisson.test(sum(v < (-4)))$conf.int # returns c(34.5, 62.5)
> poisson.test(sum(v > (4)))$conf.int # returns c(334.2, 410.7)
> pnorm(-4)*1e7 # returns 316.7
> 
> 
> There should be approximately 316 values less than -4 and 316 values 
> greater than +4, but there are far too few values less than -4.
> 
> Results are similar with other random seeds, and things are even more obvious 
> with larger sample sizes.
> 
> The Kinderman-Ramage algorithm is fine when combined with Mersenne-Twister, and 
> Marsaglia-Multicarry is fine when combined with the normal.kind="Inversion" 
> algorithm, but the combination of Marsaglia-Multicarry and Kinderman-Ramage 
> seems to have severe flaws.
> 
> R should at least warn about that combination!
> 
> What do you think? Should I file a bug report?
> 
> --
> Sincerely
> André GILLIBERT
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



[Rd] Problem in random number generation for Marsaglia-Multicarry + Ahrens-Dieter

2021-08-12 Thread GILLIBERT, Andre
Dear R developers,

At the same time as I discovered a flaw in Marsaglia-Multicarry + 
Kinderman-Ramage, I found another in Marsaglia-Multicarry + Ahrens-Dieter.
It is less obvious than the Kinderman-Ramage one, so I created a new thread for 
this bug.

The following code shows the problem (tested on R 4.1.1 x86_64 for Windows 10):

== start of code sample ==
set.seed(1, "Marsaglia-Multicarry", normal.kind="Ahrens-Dieter")
v=rnorm(1e8)

q=qnorm(seq(0.01, 0.99, 0.01))
cv=cut(v, breaks=c(-Inf, q, +Inf))
observed=table(cv)
chisq.test(observed) # p < 2.2e-16
== end of code sample ==

The chisq.test returns a p-value < 2.2e-16, whereas a non-significant p-value 
was expected.
The additional code below shows severe irregularities in the distribution 
across quantile bins:

== continuation of code sample ==
expected = chisq.test(observed)$expected
z = (observed - expected)/sqrt(expected)
mean (abs(z) > 6) # 58% of z-scores are greater than 6 while none should be
== end of code sample ==

The bug is specific to the combination Marsaglia-Multicarry + Ahrens-Dieter.
There is no problem with Marsaglia-Multicarry + Inversion or Mersenne-Twister + 
Ahrens-Dieter.

I would expect at least a warning (or an error) from R for such a buggy 
combination.

--
Sincerely
André GILLIBERT




[Rd] Problem in random number generation for Marsaglia-Multicarry + Kinderman-Ramage

2021-08-12 Thread GILLIBERT, Andre
Dear R developers,


I believe I have discovered a severe flaw that occurs with the combination of 
the Marsaglia-Multicarry pseudo-random number generator and the 
Kinderman-Ramage algorithm for generating normally distributed numbers.


The sample program is very simple (tested on R-4.1.1 x86_64 on Windows 10):

set.seed(1, "Marsaglia-Multicarry", normal.kind="Kinderman-Ramage")
v=rnorm(1e7)
poisson.test(sum(v < (-4)))$conf.int # returns c(34.5, 62.5)
poisson.test(sum(v > (4)))$conf.int # returns c(334.2, 410.7)
pnorm(-4)*1e7 # returns 316.7


There should be approximately 316 values less than -4 and 316 values greater 
than +4, but there are far too few values less than -4.

Results are similar with other random seeds, and things are even more obvious 
with larger sample sizes.

The Kinderman-Ramage algorithm is fine when combined with Mersenne-Twister, and 
Marsaglia-Multicarry is fine when combined with the normal.kind="Inversion" 
algorithm, but the combination of Marsaglia-Multicarry and Kinderman-Ramage 
seems to have severe flaws.

R should at least warn about that combination!

What do you think? Should I file a bug report?

--
Sincerely
André GILLIBERT



[Rd] Force quitting a FORK cluster node on macOS and Solaris wreaks havoc

2021-08-12 Thread Henrik Bengtsson
The following smells like a bug in R to me, because it puts the main R
session into an unstable state.  Consider the following R script:

a <- 42
message("a=", a)
cl <- parallel::makeCluster(1L, type="FORK")
try(parallel::clusterEvalQ(cl, quit(save="no")))
message("parallel:::isChild()=", parallel:::isChild())
message("a=", a)
rm(a)

The purpose of this was to emulate what happens when a parallel
worker crashes.

Now, if you source() the above on macOS, you might(*) end up with:

> a <- 42
> message("a=", a)
a=42
> cl <- parallel::makeCluster(1L, type="FORK")
> try(parallel::clusterEvalQ(cl, quit(save="no")))
Error: Error in unserialize(node$con) : error reading from connection
> message("parallel:::isChild()=", parallel:::isChild())
parallel:::isChild()=FALSE
> message("a=", a)
a=42
> rm(a)
> try(parallel::clusterEvalQ(cl, quit(save="no")))
Error: Error in unserialize(node$con) : error reading from connection
> message("parallel:::isChild()=", parallel:::isChild())
parallel:::isChild()=FALSE
> message("a=", a)
Error: Error in message("a=", a) : object 'a' not found
Execution halted

Note how 'rm(a)' is supposed to be the last line of code to be
evaluated.  However, the force quitting of the FORK cluster node
appears to result in the main code being evaluated twice (in
parallel?).

(*) This does not happen on all macOS variants. For example, it works
fine on CRAN's 'r-release-macos-x86_64' but it does give the above
behavior on 'r-release-macos-arm64'.  I can reproduce it on GitHub
Actions 
(https://github.com/HenrikBengtsson/teeny/runs/3309235106?check_suite_focus=true#step:10:219)
but not on R-hub's 'macos-highsierra-release' and
'macos-highsierra-release-cran'.  I can also reproduce it on R-hub's
'solaris-x86-patched' and 'solaris-x86-patched-ods' machines.  However,
I still haven't found a Linux machine where this happens.

If one replaces quit(save="no") with tools::pskill(Sys.getpid()) or
parallel:::mcexit(0L), this behavior does not take place (at least not
on GitHub Actions and R-hub).

I don't have access to a macOS or a Solaris machine, so I cannot
investigate further myself. For example, could it be an issue with
quit(), or is it possible to trigger by other means? And more
importantly, should this be fixed? Also, I'd be curious what happens
if you run the above in an interactive R session.

/Henrik



Re: [Rd] Double to uint64_t on M1

2021-08-12 Thread Prof Brian Ripley

On 12/08/2021 04:52, Simon Urbanek wrote:


Dipterix,

this has nothing to do with R. 2^63 is too large to be represented as a signed 
64-bit integer, so the behavior is undefined - to quote from the C99 specs (6.3.1.4):

"If the value of the integral part cannot be represented by the integer type, the 
behavior is undefined."

Your subject doesn't match your question as the uint64_t conversion is 
well-defined and the same on both platforms, but the conversion to int64_t is 
undefined.


As I was writing a reply to say the same thing, a few more comments.

- the example is actually in C++, but also undefined there.

- R is more careful:
> as.integer(2^31)
[1] NA
Warning message:
NAs introduced by coercion to integer range

- there is a sanitizer for this, on platforms including Linux and macOS 
(at least with clang, 
https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#supported-platforms).
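
In the same spirit as the as.integer() check above, one could test the range on the R side before handing a double to compiled code. This is only a sketch, and fits_int64 is a made-up helper name rather than anything provided by R:

```r
# Sketch: TRUE where a double lies in the range representable as int64_t.
# 2^63 is exactly representable as a double, so the upper bound is strict.
fits_int64 <- function(x) {
  is.finite(x) & x >= -2^63 & x < 2^63
}

fits_int64(c(2^62, 2^63, -2^63, NA, Inf))
# TRUE FALSE TRUE FALSE FALSE
```

Values failing the check could then be mapped to NA with a warning, as R itself does for out-of-range integer coercion.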




Cheers,
Simon



On 12/08/2021, at 10:50 AM, Dipterix Wang  wrote:

Hi,

I was trying to convert REALSXP to int64_t in C, then found that converting 
2^63 is inconsistent across platforms:


On M1 ARM macOS, converting 2^63 (double) to `int64_t` gives 
9223372036854775807
On an x86_64 Ubuntu server, converting 2^63 (double) to `int64_t` gives 
-9223372036854775808

I was wondering whether this is the desired behavior in R?

Here's the code to replicate the results above.

print_bit <- Rcpp::cppFunction(r"(
SEXP print_bit(SEXP obj){

  int64_t tmp1 = *REAL0(obj);
  printf("%lld ", tmp1);

  return(R_NilValue);
}
)")

print_bit(2^63)

Thanks,
- Dipterix




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
