Re: [Rd] Archive policy and Rcpp?

2023-04-10 Thread Marc Schwartz via R-devel
Hi Dominick,

While somebody from CRAN might eventually reply here, your query is actually OT 
for R-Devel, which has been, since circa 2015, focused on more technical code 
development (e.g. C/C++/FORTRAN, etc.) and core R development issues, rather 
than on CRAN and package development related topics.

A better option for questions on package development/CRAN related topics would 
be to post to R-Package-Devel:

  https://stat.ethz.ch/mailman/listinfo/r-package-devel

or perhaps better yet in this specific case, directly to the CRAN folks at 
c...@r-project.org.

FWIW, my read of the CRAN policy at:

  https://cran.r-project.org/web/packages/policies.html

would suggest that there is an expectation that older versions of packages are 
"archived in perpetuity".

Regards,

Marc Schwartz
R-Devel Co-Admin


On April 10, 2023 at 10:23:09 AM, Dominick Samperi (djsamp...@gmail.com) wrote:

> It appears that my archived packages Rcpp and RcppTemplate have
> been removed at CRAN, yet they appeared in the CRAN archives
> until recently.
>
> What is the CRAN policy on archives and removal?
>
> Thanks,
> Dominick
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Making headers self-contained for static analysis

2023-03-16 Thread Marc Schwartz via R-devel
Hi,

There are a limited number of MIME file types that are accepted through the 
list server, with plain text being one. Even though a patch file should be 
plain text, it is possible that your mail client may not have set the correct 
MIME type for your patch file attachment. If so, that would explain why it was 
filtered from the list.

For future reference, if you change the file extension to ".txt" and then 
attach it, that should get picked up as plain text and get through the list 
server filters.

Regards,

Marc Schwartz
R-Devel Co-Admin


On March 16, 2023 at 2:32:39 PM, Lionel Henry via R-devel 
(r-devel@r-project.org) wrote:

> People have let me know that the attachment didn't make it through.
> Do patches get filtered out?
>
> Please find it there:
> https://github.com/lionel-/r-svn/commit/e3de56798b1321a3fa8688a42bbb73d763b78024.patch
>
> I'm also happy to post it on the bugzilla if that makes sense.
>
> Best,
> Lionel
>
> On 3/16/23, Lionel Henry wrote:
> > Hello,
> >
> > I started using clangd to get better static analysis and code
> > refactoring tooling with the R sources (using eglot-mode in Emacs, it
> > just works once you've generated a `compile_commands.json` file with
> > `bear make all`). I noticed that the static analyser can't understand
> > several header files because these are not self-contained. So I went
> > through all .h files and inserted the missing includes, cf the
> > attached patch.
> >
> > Making the headers self-contained has consequences for the .c or .cpp
> > files that include them. In the case of C files, the only downside I
> > see is that it might cause users to accidentally rely on indirect
> > inclusion of standard headers, instead of directly including the
> > header to make the dependency explicit as would be good practice.
> > This doesn't seem like a big deal compared to the benefits of enabling
> > static analysis.
> >
> > However in the case of C++ that's more problematic. We don't want to
> > include the C headers because that would pollute the global namespace
> > and users might prefer to import the missing symbols (`size_t` and
> > `FILE`) selectively. Also that wouldn't help static analysis within
> > the header files since the analysers use the C path. So I have guarded
> > inclusion of standard C headers behind a `__cplusplus` check.
> >
> > If that makes sense, would R core consider applying the attached patch
> > to the R sources?
> >
> > Best,
> > Lionel
> >
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] End of Support Date of Version 3 of “R”

2023-02-15 Thread Marc Schwartz via R-devel
Hi,

R's software development life cycle is documented here:

  https://www.r-project.org/doc/R-SDLC.pdf

This is available via the Certification link, under Documentation, on the main 
R Project web page:

  https://www.r-project.org

Section 4.4 of the document, on page 10, covers the release cycles, and section 
4.6, on page 11, covers maintenance, support and retirement.

Section 4.6 includes the following text at the end of that section on page 12:

"The x.y.0 releases are maintained via a series of x.y.z patch releases. At a 
new x.y.0 version of R, the prior version is retired from formal support. R 
Core’s efforts are then focused on the new Release (and the on-going 
Development) version. No further development, bug fixes or patches are made 
available for the retired versions. Thus there is always only one current 
version of R. However, the SVN repository will allow older release branches to 
be reopened, should the need arise."


Version 4.0.0 of R was released on April 24, 2020, thus ending formal support 
for version 3.x.x, with the last 3.x.x version being 3.6.3, which was released 
on February 29, 2020.
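
For anyone wanting to check this against a given installation, a trivial sketch 
(not part of the original exchange):

  getRversion()
  getRversion() < "4.0.0"  # TRUE if you are still running a retired 3.x.x release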

Regards,

Marc Schwartz


On February 15, 2023 at 10:21:26 AM, Eric Bernard (eric.berna...@michelin.com) wrote:

>
> Hello !
>
> Good day.
>
> I'd like to know what is the End of Support Date of Version 3 of R.
>
> Thanks for your answer.
>
> Have a good day.
>
> Best Regards
>
> Eric Bernard
> DCTI/BS/EC
>
> Cordialement.
>
> Eric Bernard
> Michelin
> DCTI/BS/EC
>
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-Forge http link redirection setup busted

2021-07-19 Thread Marc Schwartz via R-devel
Hi,

Only you can unsubscribe yourself.

Please visit:

https://stat.ethz.ch/mailman/options/r-devel

where you can unsubscribe from this list and if needed, get a password 
reminder, after entering your email address.

Regards,

Marc Schwartz 


> On Jul 19, 2021, at 5:30 PM, Arlene Battishill  
> wrote:
> 
> 
> Please unsubscribe me.
> 
>> On Mon, Jul 19, 2021 at 5:15 PM Spencer Graves  
>> wrote:
>> 
>> 
>> On 7/19/21 3:36 PM, Marc Schwartz via R-devel wrote:
>> > Dirk,
>> > 
>> > I have not used R-Forge in years, but on the main page 
>> > (https://r-forge.r-project.org), just before the blue "Latest News" box, 
>> > it shows:
>> > 
>> > "If you experience any problems or need help you can submit a support 
>> > request to the R-Forge team or write an email to r-fo...@r-project.org."
>> > 
>> > I have copied that e-mail address here. The URL for the support request 
>> > form requires a login to R-Forge to access.
>> > 
>> > Also, there appears to be a top level project associated with the 
>> > R-Forge Admins here:
>> > 
>> >https://r-forge.r-project.org/projects/site/
>> > 
>> > with references to the project members in the upper right hand corner.
>> > 
>> > No activity from the Admins there since December 2020, from what I can 
>> > tell...
>> 
>> 
>>   I encourage anyone still trying to use R-Forge to migrate to 
>> GitHub. 
>>   I migrated in 2019 after having many problems with changes not 
>> triggering checks and getting messages like, "there is no package called 
>> 'Matrix'".  There was a discussion on r-package-de...@r-project.org in 
>> January and February of this year on how to do that.  That thread could 
>> probably help anyone interested in doing this.
>> 
>> 
>>   Spencer Graves
>> 
>> > 
>> > Regards,
>> > 
>> > Marc Schwartz
>> > 
>> > 
>> > Dirk Eddelbuettel wrote on 7/19/21 4:03 PM:
>> >>
>> >> (Sorry for posting here but the top-level r-forge page does not make 
>> >> it all
>> >> that clear where to contact its admins.)
>> >>
>> >> When an old-style 'http' URL at r-forge is resolved / redirected to 
>> >> 'https',
>> >> it is corrupted and the redirect breaks.  That required a package 
>> >> re-upload
>> >> for me a few days ago (as the CRAN url checker is unhappy about a 
>> >> borked URL)
>> >> and you can see it live too e.g. via
>> >>
>> >>
>> >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>> >>
>> >> and going to 'rcpp-devel archives' where the hover URL is (note 'http')
>> >>
>> >>http://lists.r-forge.r-project.org/pipermail/rcpp-devel/
>> >>
>> >> which when followed 'naively' drops the / between .org and pipermail and
>> >> gets a 404
>> >>
>> >>https://lists.r-forge.r-project.orgpipermail/rcpp-devel/
>> >>
>> >> Reinjecting the missing / helps as
>> >>
>> >>https://lists.r-forge.r-project.org/pipermail/rcpp-devel/
>> >>
>> >> works fine.  This is likely a typo / error in the redirection setup of 
>> >> the
>> >> web server.
>> >>
>> >> I would be very grateful if someone could pass this on to whoever 
>> >> looks after
>> >> r-forge these days.
>> >>
>> >> Best,  Dirk
>> >>
>> > 
>> > __
>> > R-devel@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>> 
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-Forge http link redirection setup busted

2021-07-19 Thread Marc Schwartz via R-devel

Dirk,

I have not used R-Forge in years, but on the main page 
(https://r-forge.r-project.org), just before the blue "Latest News" box, 
it shows:


"If you experience any problems or need help you can submit a support 
request to the R-Forge team or write an email to r-fo...@r-project.org."


I have copied that e-mail address here. The URL for the support request 
form requires a login to R-Forge to access.


Also, there appears to be a top level project associated with the 
R-Forge Admins here:


  https://r-forge.r-project.org/projects/site/

with references to the project members in the upper right hand corner.

No activity from the Admins there since December 2020, from what I can 
tell...


Regards,

Marc Schwartz


Dirk Eddelbuettel wrote on 7/19/21 4:03 PM:


(Sorry for posting here but the top-level r-forge page does not make it all
that clear where to contact its admins.)

When an old-style 'http' URL at r-forge is resolved / redirected to 'https',
it is corrupted and the redirect breaks.  That required a package re-upload
for me a few days ago (as the CRAN url checker is unhappy about a borked URL)
and you can see it live too e.g. via

   https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

and going to 'rcpp-devel archives' where the hover URL is (note 'http')

   http://lists.r-forge.r-project.org/pipermail/rcpp-devel/

which when followed 'naively' drops the / between .org and pipermail and
gets a 404

   https://lists.r-forge.r-project.orgpipermail/rcpp-devel/

Reinjecting the missing / helps as

   https://lists.r-forge.r-project.org/pipermail/rcpp-devel/

works fine.  This is likely a typo / error in the redirection setup of the
web server.

I would be very grateful if someone could pass this on to whoever looks after
r-forge these days.

Best,  Dirk



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] subset argument in nls() and possibly other functions

2021-07-13 Thread Marc Schwartz via R-devel

Hi John,

In scanning some of the more popular model functions (e.g. lm(), glm(), 
lme(), coxph(), etc.), I see that none seem to provide examples of the use 
of the 'subset' argument, even though it is documented for them.


That being said, there is some old (2003) documentation by Prof Ripley here:

https://developer.r-project.org/model-fitting-functions.html

that may be helpful, and where the link to the lm() function source code 
on the above page should be:


  https://svn.r-project.org/R/trunk/src/library/stats/R/lm.R

Within that source file, you might want to focus upon the 
model.frame.lm() function, the basic form of which is used internally in 
many (most? all?) of the typical model-related functions in R to create 
the internal data frame from the specified formula, which is then used to 
fit the model.


There is a parallel model.frame.glm() function for glm() here:

https://svn.r-project.org/R/trunk/src/library/stats/R/glm.R

There is also a 2003 paper by Thomas Lumley on non-standard evaluation 
that may be helpful:


https://developer.r-project.org/nonstandard-eval.pdf

The help for the generic ?model.frame has the following text for the 
'subset' argument:


"a specification of the rows to be used: defaults to all rows. This can 
be any valid indexing vector (see|[.data.frame 
|) for the rows 
of|data|or if that is not supplied, a data frame made up of the 
variables used in|formula|."


I cannot recall off-hand using the 'subset' argument myself in ~20 
years of using R, but I do seem to recall some old discussions on the 
e-mail lists, which I cannot locate at present. A search via 
rseek.org may yield some benefit.
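
As a brief illustration (my own sketch, using lm() and the built-in mtcars data 
rather than nls()), 'subset' is simply passed through to model.frame(), so it is 
equivalent to subsetting the data frame yourself:

  fit_sub <- lm(mpg ~ wt, data = mtcars, subset = cyl == 4)
  fit_man <- lm(mpg ~ wt, data = mtcars[mtcars$cyl == 4, ])

  all.equal(coef(fit_sub), coef(fit_man))  # TRUE
  nrow(model.frame(fit_sub))               # 11, i.e. only the 4-cylinder cars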


Regards,

Marc Schwartz


J C Nash wrote on 7/13/21 7:21 PM:

In mentoring and participating in a Google Summer of Code project "Improvements to 
nls()",
I've not found examples of use of the "subset" argument in the call to nls(). 
Moreover,
in searching through the source code for the various functions related to 
nls(), I can't
seem to find where subset is used, but a simple example, included below, 
indicates it works.
Three approaches all seem to give the same results.

Can someone point to documentation or code so we can make sure we get our 
revised programs
to work properly? The aim is to make them more maintainable and provide 
maintainer documentation,
along with some improved functionality. We seem, for example, to already be 
able to offer
analytic derivatives where they are feasible, and should be able to add 
Marquardt-Levenberg
stabilization as an option.

Note that this "subset" does not seem to be the "subset()" function of R.

John Nash

# CroucherSubset.R -- https://walkingrandomly.com/?p=5254

xdata = c(-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9)
ydata = 
c(0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001)
Cform <- ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata)
Cstart<-list(p1=1,p2=0.2)
Cdata<-data.frame(xdata, ydata)
Csubset<-1:8 # just first 8 points

# Original problem - no subset
fit0 = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), data=Cdata, 
start=list(p1=1,p2=.2))
summary(fit0)

# via subset argument
fit1 = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), data=Cdata, 
start=list(p1=1,p2=.2), subset=Csubset)
summary(fit1)

# via explicit subsetting
Csdata <- Cdata[Csubset, ]
Csdata
fit2 = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), data=Csdata, 
start=list(p1=1,p2=.2))
summary(fit2)

# via weights -- seems to give correct observation count if zeros not recognized
wts <- c(rep(1,8), rep(0,2))
fit3 = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), data=Cdata, 
weights=wts, start=list(p1=1,p2=.2))
summary(fit3)

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Status of "**" operator

2021-05-21 Thread Marc Schwartz via R-devel

Hi All,

I was just sent some older R code from circa 2004, which contains the 
use of the "**" operator, which is parsed as "^".


From looking at ?"**", I see the following in the Note section:

"** is translated in the parser to ^, but this was undocumented for many 
years. It appears as an index entry in Becker et al (1988), pointing to 
the help for Deprecated but is not actually mentioned on that page. Even 
though it had been deprecated in S for 20 years, it was still accepted 
in R in 2008."



In using R 4.1.0:

> 2**3
[1] 8

the operator is still accepted in 2021.
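
For what it is worth, the parser translation noted above is easy to see directly 
(a small illustration, not taken from the help page):

> quote(2**3)
2^3
> identical(2**3, 2^3)
[1] TRUE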

Thus, has there been any discussion regarding the deprecation of this 
operator, or should the help file at least be updated to reflect the 
status in 2021?


Thanks,

Marc Schwartz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] power.prop.test() documentation question

2020-12-16 Thread Marc Schwartz via R-devel
Hi All,

Based upon a discussion on power/sample size calculations on another, non-R 
related, list, some light bulbs went on regarding the assumptions of what type 
of statistical test is going to be used with various power/sample size 
calculators/functions for proportions. In some cases, this is clearly stated; 
in others, it is not.

In the case of power.prop.test() and comparing outputs against other 
calculators, there appears to be an implied presumption that an un-corrected 
chi-square test will be used, as opposed to a corrected chi-square or Fisher 
Exact Test (FET), in the 2x2 case. Sample sizes for the un-corrected chi-square 
will generally be smaller than either the corrected chi-square or the FET, 
given similar inputs, where the latter two, not surprisingly given their common 
conservative bias, will yield similar sample size results. 

This is not explicitly documented in ?power.prop.test, though it is in some 
other applications, as noted above. 

As a particular example from the other discussions, using p1 = 0.142, p2 = 
0.266, with power = 0.8 and sig.level = 0.05, power.prop.test() yields a sample 
size of ~165 per group. Other calculators that presume either a corrected 
chi-square or the FET, yield ~180 per group. 
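
For anyone wanting to reproduce that figure, the call would be along these lines 
(a sketch added here for convenience, not part of the original calculations):

  power.prop.test(p1 = 0.142, p2 = 0.266, power = 0.8, sig.level = 0.05)
  # reports n of roughly 165 per group, under the implied un-corrected
  # chi-square assumption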

I raise this issue because, should one use the function to calculate a 
prospective sample size for a study, and then actually use a corrected 
chi-square to analyze the data, per routine use and/or a formal analysis plan, 
the power of that test will be lower than was presumed for the a priori 
calculation. It may not make a big difference in some proportion of the cases 
relative to p <= alpha, but given the idiosyncrasies of the observed data at 
the end of the study, along with the effective loss of some power, it may very 
well be relevant to the results and their strict interpretation. It may also 
impact, to some extent, the a priori planning for the study, relative to the 
needed target sample size, budgeting and other considerations for a study 
sponsor.

Is there any logic in adding some notes to ?power.prop.test, to indicate the 
implied presumption of the use of an un-corrected chi-square test? 

Thanks for any comments, including telling me that I need more caffeine and to 
increase my oxygen uptake...

Regards,

Marc Schwartz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Error in unsplit() with tibbles

2020-11-21 Thread Marc Schwartz via R-devel
Hi,

Peter, thanks for the clarification.


Mario, I was not looking to debate the pros and cons of each environment, 
simply to point out that expecting mutually compatible functionality is not 
generalizable, especially when third party authors can make structural changes 
to their objects over time that can then make them incompatible with base R 
functions, even if they are compatible today.

That is a key basis for third party packages offering specific class methods, 
whether S3 or S4, for object classes that are unique to their packages. That 
approach provides the obvious level of transparency.

For the tidyverse folks to offer a variant of split() and unsplit() that have 
specific methods for tibbles would seem entirely reasonable, presuming that 
they don't have a philosophical barrier to doing so, in deference to other 
approaches that do conform to their preferred function syntax.

Regards,

Marc


> On Nov 21, 2020, at 12:04 PM, Mario Annau  wrote:
> 
> Cool - thank you Peter!
> 
> @Marc: This is really not a tidyverse vs base-R debate and I personally think 
> that they should both work together for most parts. The common environment is 
> still R. But just to give you the full picture I also filed a bug for tibbles 
> (https://github.com/tidyverse/tibble/issues/829). With these two fixes I 
> think that split/unsplit would work for tibbles and users (like me) just 
> don't have to care in which "environments" they are working in.
> 
> Cheers,
> Mario
> 
> 
> On Sat, 21 Nov 2020 at 17:54, Peter Dalgaard <pda...@gmail.com> wrote:
> I get the sentiment, but this is really just bad coding (on my own part, I 
> suspect), so we might as well just fix it...
> 
> -pd
> 
> > On 21 Nov 2020, at 17:42, Marc Schwartz via R-devel <r-devel@r-project.org> wrote:
> > 
> > 
> >> On Nov 21, 2020, at 10:55 AM, Mario Annau <mario.an...@gmail.com> wrote:
> >> 
> >> Hello,
> >> 
> >> using the `unsplit()` function with tibbles currently leads to the
> >> following error:
> >> 
> >>> mtcars_tb <- as_tibble(mtcars, rownames = NULL)
> >>> s <- split(mtcars_tb, mtcars_tb$gear)
> >>> unsplit(s, mtcars_tb$gear)
> >> Error: Must subset rows with a valid subscript vector.
> >> ℹ Logical subscripts must match the size of the indexed input.
> >> x Input has size 15 but subscript `rep(NA, len)` has size 32.
> >> Run `rlang::last_error()` to see where the error occurred.
> >> 
> >> Tibble seems to (rightly) complain, that a logical vector has been used for
> >> subsetting which does not have the same length as the data.frame (rows).
> >> Since `NA` is a logical value, the subset should be changed to
> >> `NA_integer_` in `unsplit()`:
> >> 
> >>> unsplit
> >> function (value, f, drop = FALSE)
> >> {
> >>   len <- length(if (is.list(f)) f[[1L]] else f)
> >>   if (is.data.frame(value[[1L]])) {
> >>   x <- value[[1L]][rep(*NA_integer_*, len), , drop = FALSE]
> >>   rownames(x) <- unsplit(lapply(value, rownames), f, drop = drop)
> >>   }
> >>   else x <- value[[1L]][rep(NA, len)]
> >>   split(x, f, drop = drop) <- value
> >>   x
> >> }
> >> 
> >> Cheers,
> >> Mario
> > 
> > 
> > Hi,
> > 
> > Perhaps I am missing something, but if you are using objects, like tibbles, 
> > that are intended to be part of another environment, in this case the 
> > tidyverse, why would you not use functions to manipulate these objects that 
> > were specifically created in the other environment?
> > 
> > I don't use the tidyverse, but it seems to me that to expect base R 
> > functions to work with objects not created in base R, is problematic, even 
> > though, perhaps by coincidence, they may work without adverse effects, as 
> > appears to be the case with split(). 
> > 
> > In other words, you should not, in reality, have had an a priori 
> > expectation that split() would work with a tibble either.
> > 
> > Rather than modifying the base R functions, like unsplit(), as you are 
> > suggesting, to be compatible with these third party objects, the burden 
> > should either be on you to use relevant tidyverse functions, or on the 
> > authors of the tidyverse to provide relevant class methods to provide that 
> > functionality.
> > 
> > Regards,
> > 
> > Marc Schwartz
> > 

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Error in unsplit() with tibbles

2020-11-21 Thread Marc Schwartz via R-devel


> On Nov 21, 2020, at 10:55 AM, Mario Annau  wrote:
> 
> Hello,
> 
> using the `unsplit()` function with tibbles currently leads to the
> following error:
> 
>> mtcars_tb <- as_tibble(mtcars, rownames = NULL)
>> s <- split(mtcars_tb, mtcars_tb$gear)
>> unsplit(s, mtcars_tb$gear)
> Error: Must subset rows with a valid subscript vector.
> ℹ Logical subscripts must match the size of the indexed input.
> x Input has size 15 but subscript `rep(NA, len)` has size 32.
> Run `rlang::last_error()` to see where the error occurred.
> 
> Tibble seems to (rightly) complain, that a logical vector has been used for
> subsetting which does not have the same length as the data.frame (rows).
> Since `NA` is a logical value, the subset should be changed to
> `NA_integer_` in `unsplit()`:
> 
>> unsplit
> function (value, f, drop = FALSE)
> {
>len <- length(if (is.list(f)) f[[1L]] else f)
>if (is.data.frame(value[[1L]])) {
>x <- value[[1L]][rep(*NA_integer_*, len), , drop = FALSE]
>rownames(x) <- unsplit(lapply(value, rownames), f, drop = drop)
>}
>else x <- value[[1L]][rep(NA, len)]
>split(x, f, drop = drop) <- value
>x
> }
> 
> Cheers,
> Mario


Hi,

Perhaps I am missing something, but if you are using objects, like tibbles, 
that are intended to be part of another environment, in this case the 
tidyverse, why would you not use functions to manipulate these objects that 
were specifically created in the other environment?

I don't use the tidyverse, but it seems to me that to expect base R functions 
to work with objects not created in base R, is problematic, even though, 
perhaps by coincidence, they may work without adverse effects, as appears to be 
the case with split(). 

In other words, you should not, in reality, have had an a priori expectation 
that split() would work with a tibble either.

Rather than modifying the base R functions, like unsplit(), as you are 
suggesting, to be compatible with these third party objects, the burden should 
either be on you to use relevant tidyverse functions, or on the authors of the 
tidyverse to provide relevant class methods to provide that functionality.

Regards,

Marc Schwartz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Hard memory limit of 16GB under Windows?

2020-04-07 Thread Marc Schwartz via R-devel
Hi Samuel,

You may already be aware, but if not, RStudio has their own support mechanisms 
here:

  https://support.rstudio.com/hc/en-us

If this does turn out to be RStudio specific, you may wish to check there for 
additional insights.

Regards,

Marc Schwartz


> On Apr 7, 2020, at 10:24 AM, Tomas Kalibera  wrote:
> 
> Hi Samuel,
> 
> please also have a look at ?memory.limit. You can set this limit at R 
> startup. It is in megabytes. Maybe R Studio sets it at runtime.
> 
> Best
> Tomas
> 
> On 4/7/20 3:57 PM, Samuel Granjeaud IR/Inserm wrote:
>> Hi Tomas,
>> 
>> Many thanks for your answer.
>> 
>> Here is a copy of a fresh session under RStudio, and after a copy under Rgui.
>> Strangely enough the result of memory.limit() is not the same. Without your 
>> question I would not have looked to RGui, being used to work with RStudio.
>> 
>> The value under RGui seems to correspond to the total RAM of the computer. 
>> It makes me notice that the value is in MB.
>> 
>> The value under Rstudio was so huge (1.759219e+13) that I just interpreted 
>> it as GB. But I was totally wrong. So in fact I don't know what it refers 
>> to. The documentation says "For a 64-bit versions of R under 64-bit Windows 
>> the limit is currently 8Tb.", but it looks like being 16TB, which my 
>> computer don't have of course.
>> 
>> I still have to understand why my colleague has a problem of memory 
>> allocation (cannot allocate vector of size 12.6 Gb).
>> 
>> Sorry for the wrong interpretation, but thanks for the help,
>> Samuel
>> 
>> --- RStudio
>> 
>> R version 3.6.3 (2020-02-29) -- "Holding the Windsock"
>> Copyright (C) 2020 The R Foundation for Statistical Computing
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> 
>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>> You are welcome to redistribute it under certain conditions.
>> Type 'license()' or 'licence()' for distribution details.
>> 
>> R is a collaborative project with many contributors.
>> Type 'contributors()' for more information and
>> 'citation()' on how to cite R or R packages in publications.
>> 
>> Type 'demo()' for some demos, 'help()' for on-line help, or
>> 'help.start()' for an HTML browser interface to help.
>> Type 'q()' to quit R.
>> 
>> > memory.limit()
>> [1] 1.759219e+13
>> > sessionInfo()
>> R version 3.6.3 (2020-02-29)
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> Running under: Windows 10 x64 (build 18363)
>> 
>> Matrix products: default
>> 
>> locale:
>> [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252 
>> LC_MONETARY=French_France.1252 LC_NUMERIC=C
>> [5] LC_TIME=French_France.1252
>> 
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods base
>> 
>> loaded via a namespace (and not attached):
>> [1] compiler_3.6.3 tools_3.6.3
>> >
>> 
>> --- RGui
>> 
>> R version 3.6.3 (2020-02-29) -- "Holding the Windsock"
>> Copyright (C) 2020 The R Foundation for Statistical Computing
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> 
>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>> You are welcome to redistribute it under certain conditions.
>> Type 'license()' or 'licence()' for distribution details.
>> 
>> R is a collaborative project with many contributors.
>> Type 'contributors()' for more information and
>> 'citation()' on how to cite R or R packages in publications.
>> 
>> Type 'demo()' for some demos, 'help()' for on-line help, or
>> 'help.start()' for an HTML browser interface to help.
>> Type 'q()' to quit R.
>> 
>> > ls()
>> character(0)
>> > memory.limit()
>> [1] 32627
>> > sessionInfo()
>> R version 3.6.3 (2020-02-29)
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> Running under: Windows 10 x64 (build 18363)
>> 
>> Matrix products: default
>> 
>> locale:
>> [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252 
>> LC_MONETARY=French_France.1252
>> [4] LC_NUMERIC=C   LC_TIME=French_France.1252
>> 
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods base
>> 
>> loaded via a namespace (and not attached):
>> [1] compiler_3.6.3
>> >
>> 
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] findInterval Documentation Suggestion

2020-03-06 Thread Marc Schwartz via R-devel


> On Mar 6, 2020, at 9:17 AM, brodie gaslam via R-devel  
> wrote:
> 
>> On Friday, March 6, 2020, 8:56:54 AM EST, Martin Maechler 
>>  wrote: 
> 
>> Note that the  * -> LaTeX -> PDF rendered version looks a bit nicer.
> 
> Ah yes, that does indeed look quite a bit nicer.
> 
>> I wrote the function and that help page originally.
> 
> And thank you for doing so. It is a wonderful function.
> (0 sarcasm here).
> 
>> For that reason, replacing the well defined precise
>> inequality-based definition by *much* less precise English prose
>> is out of the question.
> 
> I figured that might be an issue.  Would you be open to 
> providing a prose translation, but putting that in the 
> details? If so, it would be useful to get feedback on 
> what parts of the prose I proposed are imprecise enough 
> to be incorrect/incomplete for some corner case.
> 
> Finally, would it make sense to move this discussion to
> bugzilla?
> 
> Best,
> 
> Brodie.


Hi,

Just to put forth an alternative to modifying the existing, precise content 
that Martin wrote: in many cases, that content can be reasonably supplemented 
by the addition of specific examples, and perhaps concise comments, that 
demonstrate what may otherwise be surprising behavior.

If Brodie can construct one or more such examples that might provide additional 
insights, then perhaps they can be considered for inclusion in the help file, 
so that both goals can be met: preserving the language that Martin has 
contributed, while expanding comprehension.
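
As a hypothetical illustration of the kind of supplementary example that could 
be considered (my own sketch, not proposed wording for the help page):

  vec <- c(1, 2, 4, 8)
  findInterval(c(0.5, 1, 3, 8, 10), vec)
  # [1] 0 1 2 4 4
  # 0.5 precedes vec[1] (hence 0), 1 falls on vec[1], 3 lies in [2, 4),
  # and both 8 and 10 fall in the last interval.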

Regards,

Marc Schwartz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] as-cran issue ==> set _R_CHECK_LENGTH_1_* settings!

2020-01-14 Thread Marc Schwartz via R-devel
> On Jan 14, 2020, at 3:29 PM, Abby Spurdle  wrote:
> 
>> I do want to entice people to have a long look beyond closed
>> source OS into the world of Free Software where not only R is
>> FOSS (Free and Open Source Software) but (all / almost) all the
>> tools you use are of that same spirit.
> 
> And while everyone is talking about operating systems...
> 
> Recently, I tried to install R on Fedora.
> However, it only gave me the option of downloading and installing R
> 3.6.1, when the current release is/was R 3.6.2.
> I decided to wait, and may try again later, over the next week.
> 
> Is it possible for things to be free *and* simple?

Abby,

Which version of Fedora are you on?

The Fedora RPM build system for R:

  https://koji.fedoraproject.org/koji/packageinfo?packageID=1230

would seem to suggest that R 3.6.2 may not be available for Fedora 29 or 
earlier, which is not a surprise, given the rapid update cycle used on Fedora.

R 3.6.2 is available for Fedora 30, 31 and 32 per the above page.

If you are on Fedora >=30, you might check your yum repo to see if it has been 
properly updated.

Otherwise, if you are on Fedora <=29, you should think about updating your 
Fedora installation.

You may or may not be aware that there is a dedicated list for R on Fedora/RHEL 
and derivatives:

  https://stat.ethz.ch/mailman/listinfo/r-sig-fedora

Tom Callaway, who is the RH/Fedora maintainer for R is on that list, so you can 
pose queries to him via that list for any issues with R on Fedora.

Regards,

Marc Schwartz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] mean

2020-01-09 Thread Marc Schwartz via R-devel
Peter,

Thanks for the reply.

If that were the case, then should not the following be allowed to work with 
ordered factors?

> median(factor(c("1", "2", "3"), ordered = TRUE))
Error in median.default(factor(c("1", "2", "3"), ordered = TRUE)) : 
  need numeric data

At least on the surface, if you can lexically order a character vector:

> median(c("red", "blue", "green"))
[1] "green"

you can also order a factor, or ordered factor, and if the number of elements 
is odd, return a median value.
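
A minimal sketch of what such a method might look like, purely for illustration 
(this is not a proposal for base R, and it only handles the odd-length case):

  median.ordered <- function(x, na.rm = FALSE, ...) {
    if (na.rm) x <- x[!is.na(x)]
    n <- length(x)
    if (n %% 2L == 0L) stop("no unique middle value for an even number of elements")
    sort(x)[(n + 1L) %/% 2L]
  }

  # with the S3 method above defined, median() dispatches on class "ordered"
  # and returns the middle level rather than the error shown above:
  median(factor(c("1", "2", "3"), ordered = TRUE))
  # [1] 2
  # Levels: 1 < 2 < 3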

Regards,

Marc


> On Jan 9, 2020, at 10:46 AM, peter dalgaard  wrote:
> 
> I think median() behaves as designed: As long as the argument can be ordered, 
> the "middle observation" makes sense, except when the middle falls between 
> two categories, and you can't define and average of the two candidates for a 
> median.
> 
> The "sick man" would seem to be var(). Notice that it is also inconsistent 
> with cov():
> 
>> cov(c("1","2","3","4"),c("1","2","3","4") )
> Error in cov(c("1", "2", "3", "4"), c("1", "2", "3", "4")) : 
>  is.numeric(x) || is.logical(x) is not TRUE
>> var(c("1","2","3","4"),c("1","2","3","4") )
> [1] 1.67
> 
> -pd
> 
> 
>> On 9 Jan 2020, at 14:49 , Marc Schwartz via R-devel  
>> wrote:
>> 
>> Jean-Luc,
>> 
>> Please keep the communications on the list, for the benefit of others, now 
>> and in the future, via the list archive. I am adding r-devel back here.
>> 
>> I can't speak to the rationale in some of these cases. As I noted, it may be 
>> (is likely) due to differing authors over time, and there may have been 
>> relevant use cases at the time that the code was written, resulting in the 
>> various checks. Presumably, the additional checks were not incorporated into 
>> the other functions to enforce a level of consistency.
>> 
>> We will need to wait for someone from R Core to comment.
>> 
>> Regards,
>> 
>> Marc
>> 
>>> On Jan 9, 2020, at 8:34 AM, Lipatz Jean-Luc  
>>> wrote:
>>> 
>>> Ok, inconsistencies.
>>> 
>>> The last test you wrote is a bit strange. I agree that it is useful to warn 
>>> about a computation that has no sense in the case of factors. But why 
>>> test data.frames? If you go that way using random structures, you can 
>>> also try:
>>> 
>>>> median(list(1,2),list(3,4),list(4,5))
>>> Error in if (na.rm) x <- x[!is.na(x)] else if (any(is.na(x))) 
>>> return(x[FALSE][NA]) : 
>>> l'argument n'est pas interprétable comme une valeur logique
>>> De plus : Warning message:
>>> In if (na.rm) x <- x[!is.na(x)] else if (any(is.na(x))) 
>>> return(x[FALSE][NA]) :
>>> la condition a une longueur > 1 et seul le premier élément est utilisé
>>> 
>>> giving a message which, despite of his length, doesn't really explain the 
>>> reason of the error.
>>> 
>>> Why not a test on arguments like?
>>> if (!is.numeric(x)) 
>>>stop("need numeric data")
>>> 
>>> 
>>> -Message d'origine-
>>> De : Marc Schwartz  
>>> Envoyé : jeudi 9 janvier 2020 14:19
>>> À : Lipatz Jean-Luc 
>>> Cc : R-Devel 
>>> Objet : Re: [Rd] mean
>>> 
>>> 
>>>> On Jan 9, 2020, at 7:40 AM, Lipatz Jean-Luc  
>>>> wrote:
>>>> 
>>>> Hello,
>>>> 
>>>> Is there a reason for the following behaviour?
>>>>> mean(c("1","2","3"))
>>>> [1] NA
>>>> Warning message:
>>>> In mean.default(c("1", "2", "3")) :
>>>> l'argument n'est ni numérique, ni logique : renvoi de NA
>>>> 
>>>> But:
>>>>> var(c("1","2","3"))
>>>> [1] 1
>>>> 
>>>> And also:
>>>>> median(c("1","2","3"))
>>>> [1] "2"
>>>> 
>>>> But:
>>>>> quantile(c("1","2","3"),p=.5)
>>>> Error in (1 - h) * qs[i] : 
>>>> argument non numérique pour un opérateur binaire
>>>> 
>>>> It sounds like a lack of symmetry.

Re: [Rd] mean

2020-01-09 Thread Marc Schwartz via R-devel
Jean-Luc,

Please keep the communications on the list, for the benefit of others, now and 
in the future, via the list archive. I am adding r-devel back here.

I can't speak to the rationale in some of these cases. As I noted, it may be 
(is likely) due to differing authors over time, and there may have been 
relevant use cases at the time that the code was written, resulting in the 
various checks. Presumably, the additional checks were not incorporated into 
the other functions to enforce a level of consistency.

We will need to wait for someone from R Core to comment.

Regards,

Marc

> On Jan 9, 2020, at 8:34 AM, Lipatz Jean-Luc  wrote:
> 
> Ok, inconsistencies.
> 
> The last test you wrote is a bit strange. I agree that it is useful to warn 
> about a computation that has no sense in the case of factors. But why 
> test data.frames? If you go that way using random structures, you can also 
> try:
> 
>> median(list(1,2),list(3,4),list(4,5))
> Error in if (na.rm) x <- x[!is.na(x)] else if (any(is.na(x))) 
> return(x[FALSE][NA]) : 
>  l'argument n'est pas interprétable comme une valeur logique
> De plus : Warning message:
> In if (na.rm) x <- x[!is.na(x)] else if (any(is.na(x))) return(x[FALSE][NA]) :
>  la condition a une longueur > 1 et seul le premier élément est utilisé
> 
> giving a message which, despite of his length, doesn't really explain the 
> reason of the error.
> 
> Why not a test on arguments like?
>  if (!is.numeric(x)) 
>  stop("need numeric data")
> 
> 
> -Message d'origine-
> De : Marc Schwartz  
> Envoyé : jeudi 9 janvier 2020 14:19
> À : Lipatz Jean-Luc 
> Cc : R-Devel 
> Objet : Re: [Rd] mean
> 
> 
>> On Jan 9, 2020, at 7:40 AM, Lipatz Jean-Luc  wrote:
>> 
>> Hello,
>> 
>> Is there a reason for the following behaviour?
>>> mean(c("1","2","3"))
>> [1] NA
>> Warning message:
>> In mean.default(c("1", "2", "3")) :
>> l'argument n'est ni numérique, ni logique : renvoi de NA
>> 
>> But:
>>> var(c("1","2","3"))
>> [1] 1
>> 
>> And also:
>>> median(c("1","2","3"))
>> [1] "2"
>> 
>> But:
>>> quantile(c("1","2","3"),p=.5)
>> Error in (1 - h) * qs[i] : 
>> argument non numérique pour un opérateur binaire
>> 
>> It sounds like a lack of symmetry. 
>> Best regards.
>> 
>> 
>> Jean-Luc LIPATZ
>> Insee - Direction générale
>> Responsable de la coordination sur le développement de R et la mise en 
>> oeuvre d'alternatives à SAS
> 
> 
> Hi,
> 
> It would appear, whether by design or just inconsistent implementations, 
> perhaps by different authors over time, that the checks for whether or not 
> the input vector is numeric differ across the functions.
> 
> A further inconsistency is for median(), where:
> 
>> median(c("1", "2", "3", "4"))
> [1] NA
> Warning message:
> In mean.default(sort(x, partial = half + 0L:1L)[half + 0L:1L]) :
>  argument is not numeric or logical: returning NA
> 
> as a result of there being 4 elements, rather than 3, and the internal checks 
> in the code, where in the case of the input vector having an even number of 
> elements, mean() is used:
> 
>if (n%%2L == 1L) 
>sort(x, partial = half)[half]
>else mean(sort(x, partial = half + 0L:1L)[half + 0L:1L])
> 
> 
> Similarly:
> 
>> median(factor(c("1", "2", "3")))
> Error in median.default(factor(c("1", "2", "3"))) : need numeric data
> 
> because the input vector is a factor, rather than character, and the initial 
> check has:
> 
>  if (is.factor(x) || is.data.frame(x)) 
>  stop("need numeric data")
> 
> 
> Regards,
> 
> Marc Schwartz
> 
> 

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] mean

2020-01-09 Thread Marc Schwartz via R-devel


> On Jan 9, 2020, at 7:40 AM, Lipatz Jean-Luc  wrote:
> 
> Hello,
> 
> Is there a reason for the following behaviour?
>> mean(c("1","2","3"))
> [1] NA
> Warning message:
> In mean.default(c("1", "2", "3")) :
>  l'argument n'est ni numérique, ni logique : renvoi de NA
> 
> But:
>> var(c("1","2","3"))
> [1] 1
> 
> And also:
>> median(c("1","2","3"))
> [1] "2"
> 
> But:
>> quantile(c("1","2","3"),p=.5)
> Error in (1 - h) * qs[i] : 
>  argument non numérique pour un opérateur binaire
> 
> It sounds like a lack of symmetry. 
> Best regards.
> 
> 
> Jean-Luc LIPATZ
> Insee - Direction générale
> Responsable de la coordination sur le développement de R et la mise en oeuvre 
> d'alternatives à SAS


Hi,

It would appear, whether by design or just inconsistent implementations, 
perhaps by different authors over time, that the checks for whether or not the 
input vector is numeric differ across the functions.

A further inconsistency is for median(), where:

> median(c("1", "2", "3", "4"))
[1] NA
Warning message:
In mean.default(sort(x, partial = half + 0L:1L)[half + 0L:1L]) :
  argument is not numeric or logical: returning NA

as a result of there being 4 elements, rather than 3, and the internal checks 
in the code, where in the case of the input vector having an even number of 
elements, mean() is used:

if (n%%2L == 1L) 
sort(x, partial = half)[half]
else mean(sort(x, partial = half + 0L:1L)[half + 0L:1L])


Similarly:

> median(factor(c("1", "2", "3")))
Error in median.default(factor(c("1", "2", "3"))) : need numeric data

because the input vector is a factor, rather than character, and the initial 
check has:

  if (is.factor(x) || is.data.frame(x)) 
  stop("need numeric data")


Regards,

Marc Schwartz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Offer zip builds

2019-06-03 Thread Marc Schwartz via R-devel



> On Jun 3, 2019, at 6:31 PM, Steven Penny  wrote:
> 
> On Mon, Jun 3, 2019 at 4:11 PM Marc Schwartz wrote:
>> I have not tried it, but if that is the case here, you may be able to use the
>> normal R binary installer, but adjust the default install options when
>> prompted, allowing you to customize the install location and other 
>> parameters,
>> that may be suitable in the absence of Admin rights.
>> 
>> Prior statements, not official, would suggest that R Core is not likely to
>> assist in providing official options for useRs to circumvent OS security
>> restrictions.
> 
> There's nothing nefarious here. It would allow people to use the R environment
> without running an installer. If someone is a new user they may want to try
> R out, and installers can be invasive as they commonly:
> 
> - copy files to install dir
> - copy files to profile dir
> - set registry entries
> - set environment variables
> - set start menu entries
> 
> and historically uninstallers have a bad record of reverting these changes.
> We should not put this burden upon new users, or even have them resort to a
> virtual machine to avoid the items above. Having a ZIP file allows new users
> to run the R environment; then, if they like it, perhaps they can run the
> installer going forward. Are you familiar with Windows? Everything I am
> describing hasn't changed in at least 20 years.
> 
> I don't have a criticism of the R installer; I have not run tests to be able
> to determine if it's well behaved or not. It's the *not knowing* that is the issue.
> With Windows, every installer could be perceived as a "black box".


Hi,

I am on macOS primarily, albeit, I have run both Windows and Linux routinely in 
years past.

That being said, these days, I do run Windows 10 under a Parallels VM on macOS, 
as I have a single commercial application that I need to run for clients now 
and then, and it sadly only runs on a real Windows install (e.g. not with Wine).

To your points:

The R for Windows FAQ does provide some information on installing R as a 
non-Admin:

  
https://cran.r-project.org/bin/windows/base/rw-FAQ.html#How-do-I-install-R-for-Windows_003f

as well as Registry change related information:

  
https://cran.r-project.org/bin/windows/base/rw-FAQ.html#Does-R-use-the-Registry_003f

There is also information on running from external media:

 
https://cran.r-project.org/bin/windows/base/rw-FAQ.html#Can-I-run-R-from-a-CD-or-USB-drive_003f

and uninstalling:

  
https://cran.r-project.org/bin/windows/base/rw-FAQ.html#How-do-I-UNinstall-R_003f


In addition, the R-Admin manual provides information on the Inno Setup 
installer:

  
https://cran.r-project.org/doc/manuals/r-release/R-admin.html#Building-the-Inno-Setup-installer
  
https://cran.r-project.org/doc/manuals/r-release/R-admin.html#The-Inno-Setup-installer

which leads you to:

  http://jrsoftware.org/isinfo.php

and shows that Inno Setup is, like R, fully open source, hence reviewable and 
not a black box, any more than R itself is. That should not be a surprise...

While I understand the use case you describe, it is, as I noted initially, up 
to R Core to be willing to provide an official release of a ZIP-based 
installation. Unless you can make the case to them to expend their finite 
resources to support this as part of each version release process, it does 
not, in light of the prior discussions, appear to be a priority.

Again, I do not speak for them.

Otherwise, it falls to the community to volunteer to engage in that activity 
and fulfill the need.

Regards,

Marc

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Offer zip builds

2019-06-03 Thread Marc Schwartz via R-devel



> On Jun 3, 2019, at 4:40 PM, Abby Spurdle  wrote:
> 
>> If you go here:
>> https://cran.cnr.berkeley.edu/bin/windows/base
>> you see EXE installers for Windows. This contrasts with other programming
>> languages that offer both an executable installer and ZIP files that can
> be
>> extracted and run
> 
> Are you suggesting that R should do the same?
> If so, I second that, excellent idea.
> (However, gzip preferred).
> 
> I've had significant problems with the Windows installer.
> I've never had significant problems with zip files.
> Also, I assuming that the zip approach would be easier for systems
> administrators.
> However, I'm not a systems administrator...
> 
> 
> Abs


Hi,

First, I do not speak for R Core, who would, in the end, be responsible for 
offering something official here.

Second, prior discussions on this topic have generally pointed to:

  https://sourceforge.net/projects/rportable/

as one source for a portable version of R, albeit, with some dependencies (e.g. 
PortableApps framework)

That being said, again, based upon prior discussions on this topic, the typical 
reason for needing a ZIP archive of an R installation, is to circumvent Windows 
OS security restrictions, whereby a useR does not have the requisite Admin 
rights to install R via the default installer.

Thus, you can presumably download a ZIP of an R installation, unzip it in a 
location of your choosing, whereby you can then execute/run the R .exe binary. 
If you can't do that, then a ZIP will not be helpful to you.

I have not tried it, but if that is the case here, you may be able to use the 
normal R binary installer, but adjust the default install options when 
prompted, allowing you to customize the install location and other parameters, 
that may be suitable in the absence of Admin rights.

Prior statements, not official, would suggest that R Core is not likely to 
assist in providing official options for useRs to circumvent OS security 
restrictions.

Regards,

Marc Schwartz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] survival changes

2019-06-01 Thread Marc Schwartz via R-devel



> On Jun 1, 2019, at 12:59 PM, Peter Langfelder  
> wrote:
> 
> On Sat, Jun 1, 2019 at 3:22 AM Therneau, Terry M., Ph.D. via R-devel
>  wrote:
>> 
>> In the next version of the survival package I intend to make a non-upwardly 
>> compatible
>> change to the survfit object.  With over 600 dependent packages this is not 
>> something to
>> take lightly, and I am currently undecided about the best way to go about 
>> it.  I'm looking
>> for advice.
>> 
>> The change: 20+ years ago I had decided not to include the initial x=0,y=1 
>> data point in
>> the survfit object itself.  It was not formally an estimand and the 
>> plot/points/lines etc
>> routines could add this on themselves.  That turns out to have been a 
>> mistake, and has led
>> to a steady proliferation of extra bits as I realized that the time axis 
>> doesn't always
>> start at 0, and later (with multi state) that y does not always start at 1 
>> (though the
>> states sum to 1), and later the the error doesn't always start at 0, and 
>> another
>> realization with cumulative hazard, and ...
>> The new survfit method for multi-state coxph models was going to add yet 
>> another special
>> case.  Basically every component is turning into a duplicate of "row 1" vs 
>> "all the
>> others".  (And inconsistently named.)
>> 
>> Three possible solutions
>> 1. Current working draft of survival_3.0.3:  Add a 'version' element to the 
>> survfit object
>> and a 'survfit2.3' function that converts old to new.  All my downstream 
>> functions (print,
>> plot,...) start with an "if (old) update to new" line.  This has allowed me 
>> to stage
>> updates to the functions that create survfit objects -- I expect it to 
>> happen slowly.
>> There will also be a survfit3.2 function to go backwards. Both the forward 
>> and backwards
>> functions leave objects alone if they are currently in the desired format.
>> 
>> 2. Make a new class "survfit3" and the necessary 'as' functions. The package 
>> would contain
>> plot.survfit and plot.survfit3 methods, the former a two line "convert and 
>> call the
>> second" function.
>> 
>> 3. Something I haven't thought of.
> 
> A more "clean break" solution would be to start a whole new package
> (call it survival2) that would make these changes, and deprecate the
> current survival. You could add warnings about deprecation and urging
> users to switch in existing survival functions. You could continue
> bugfixes for survival but only add new features to survival2. The new
> survival2 and the current survival could live side by side on CRAN for
> quite some time, giving maintainers of dependent packages (and just
> plain users) enough time to switch. This could allow you to
> change/clean up other parts of the package that you could perhaps also
> use a rethink/rewrite, without too much concern for backward
> compatibility.
> 
> Peter


Hi,

I would be cautious about going in that direction, bearing in mind that survival 
is a Recommended package, and is therefore included in the default R distribution 
from the R Foundation and other parties. Having two versions can, and likely 
will, result in substantial confusion, and I would argue against that approach.

There is language in the CRAN submission policy that covers API changes, which 
strictly speaking, may or may not be the case here, depending upon which 
direction Terry elects to go:

"If an update will change the package’s API and hence affect packages depending 
on it, it is expected that you will contact the maintainers of affected 
packages and suggest changes, and give them time (at least 2 weeks, ideally 
more) to prepare updates before submitting your updated package. Do mention in 
the submission email which packages are affected and that their maintainers 
have been informed. In order to derive the reverse dependencies of a package 
including the addresses of maintainers who have to be notified upon changes, 
the function reverse_dependencies_with_maintainers is available from the 
developer website."


Given the potential extent and impact of the changes being considered, it would 
seem reasonable to:

1. Post a note to R-Devel (possibly R-Help to cover a larger useR base) 
regarding whatever changes are finalized and formally announce them. The 
changes are likely to affect end useRs as well as package maintainers.

2. Send communications directly via e-mail to the relevant package maintainers 
that have dependencies on survival.

3. Consider a longer deprecation time frame for relevant functions, to raise 
awareness and allow for changes to be made by package maintainers and useRs as 
may be apropos. Perhaps post reminders to R-Help at relevant time points in 
advance as you approach the formal deprecation and release of the updated 
package.


Terry, if you have not used it yet and/or are not aware of it, take a look at 
?Deprecated in base:

  https://stat.ethz.ch/R-manual/R-devel/library/base/html/Deprecated.html

which is helpful in setting up a deprecation process. If you 

Re: [Rd] prettyNum digits=0 not compatible with scientific notation

2019-03-22 Thread Marc Schwartz via R-devel



> On Mar 22, 2019, at 7:25 PM, peter dalgaard  wrote:
> 
> 
> 
>> On 22 Mar 2019, at 18:07 , Martin Maechler  
>> wrote:
>> 
>> gives (on Linux R 3.5.3, Fedora 28)
>> 
>>  d=10 d=7  d=2  d=1 d=0   
>> [1,] "123456" "123456" "123456" "1e+05" "%#4.0-1e"
>> [2,] "12345.6""12345.6""12346"  "12346" "%#4.0-1e"
>> [3,] "1234.56""1234.56""1235"   "1235"  "1235"
>> [4,] "123.456""123.456""123""123"   "123" 
>> [5,] "12.3456""12.3456""12" "12""12"  
>> [6,] "1.23456""1.23456""1.2""1" "1"   
>> [7,] "0.123456"   "0.123456"   "0.12"   "0.1"   "0"   
>> [8,] "0.0123456"  "0.0123456"  "0.012"  "0.01"  "0"   
>> [9,] "0.00123456" "0.00123456" "0.0012" "0.001" "0"   
>> 
>> but probably looks better on Mac
> 
> 
> Yes (3.5.1 though)
> 
>> nn <- 123456*10^(0:-8); dd <- c(10, 7, 2:0); names(dd) <- paste0("d=",dd)
>> sapply(dd, function(dig) sapply(nn, format, digits=dig))
>  d=10 d=7  d=2  d=1 d=0 
> [1,] "123456" "123456" "123456" "1e+05" "1.e+05"
> [2,] "12345.6""12345.6""12346"  "12346" "1.e+04"
> [3,] "1234.56""1234.56""1235"   "1235"  "1235"  
> [4,] "123.456""123.456""123""123"   "123"   
> [5,] "12.3456""12.3456""12" "12""12"
> [6,] "1.23456""1.23456""1.2""1" "1" 
> [7,] "0.123456"   "0.123456"   "0.12"   "0.1"   "0" 
> [8,] "0.0123456"  "0.0123456"  "0.012"  "0.01"  "0" 
> [9,] "0.00123456" "0.00123456" "0.0012" "0.001" "0"  
> 


Here is 3.5.3 on macOS:

> nn <- 123456*10^(0:-8); dd <- c(10, 7, 2:0); names(dd) <- paste0("d=",dd)
> sapply(dd, function(dig) sapply(nn, format, digits=dig))
  d=10 d=7  d=2  d=1 d=0 
 [1,] "123456" "123456" "123456" "1e+05" "1.e+05"
 [2,] "12345.6""12345.6""12346"  "12346" "1.e+04"
 [3,] "1234.56""1234.56""1235"   "1235"  "1235"  
 [4,] "123.456""123.456""123""123"   "123"   
 [5,] "12.3456""12.3456""12" "12""12"
 [6,] "1.23456""1.23456""1.2""1" "1" 
 [7,] "0.123456"   "0.123456"   "0.12"   "0.1"   "0" 
 [8,] "0.0123456"  "0.0123456"  "0.012"  "0.01"  "0" 
 [9,] "0.00123456" "0.00123456" "0.0012" "0.001" "0" 


Regards,

Marc Schwartz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Proposed patch for ?Extract

2019-02-21 Thread Marc Schwartz via R-devel
Hi,

In follow up to the thread on R-Help yesterday:

  https://stat.ethz.ch/pipermail/r-help/2019-February/461725.html

I am attaching a proposed patch against the trunk version of Extract.Rd, with 
wording added to the "Matrices and arrays" section, to note that indexing these 
object by factors behaves in a manner consistent with vectors.

Regards,

Marc Schwartz

--- ExtractORIG.Rd  2019-02-21 08:17:47.0 -0500
+++ ExtractMOD.Rd   2019-02-21 08:23:35.0 -0500
@@ -151,6 +151,9 @@
   indices can be numeric, logical, character, empty or even factor.
   An empty index (a comma separated blank) indicates that all entries in
   that dimension are selected.
+  As with vectors, indexing by factors is equivalent to indexing by the
+  numeric codes (see \code{\link{factor}}) and not by the character
+  values which are printed (for which use \code{[as.character(i)]}).
   The argument \code{drop} applies to this form of indexing.
 
   A third form of indexing is via a numeric matrix with the one column
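
As a short illustration of the behavior the added wording describes (my own 
sketch, not part of the proposed patch):

  m <- matrix(1:9, 3, dimnames = list(c("a", "b", "c"), NULL))
  f <- factor("c", levels = c("c", "a", "b"))

  m[f, ]                # uses the numeric code of "c", which is 1, so row "a"
  m[as.character(f), ]  # uses the printed label, so row "c"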


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] compairing doubles

2018-08-31 Thread Marc Schwartz via R-devel



> On Aug 31, 2018, at 9:36 AM, Iñaki Ucar  wrote:
> 
> El vie., 31 ago. 2018 a las 15:10, Felix Ernst
> () escribió:
>> 
>> Dear all,
>> 
>> I am a bit unsure whether this qualifies as a bug, but it is definitely a 
>> strange behaviour. That is why I wanted to discuss it.
>> 
>> With the following function, I want to test for evenly spaced numbers, 
>> starting from anywhere.
>> 
>> .is_continous_evenly_spaced <- function(n){
>>  if(length(n) < 2) return(FALSE)
>>  n <- n[order(n)]
>>  n <- n - min(n)
>>  step <- n[2] - n[1]
>>  test <- seq(from = min(n), to = max(n), by = step)
>>  if(length(n) == length(test) &&
>> all(n == test)){
>>return(TRUE)
>>  }
>>  return(FALSE)
>> }
>> 
>>> .is_continous_evenly_spaced(c(1,2,3,4))
>> [1] TRUE
>>> .is_continous_evenly_spaced(c(1,3,4,5))
>> [1] FALSE
>>> .is_continous_evenly_spaced(c(1,1.1,1.2,1.3))
>> [1] FALSE
>> 
>> I expect the result for 1 and 2, but not for 3. Upon investigation it turns 
>> out that n == test is TRUE for every pair, but not for the pair of 0.2.
>> 
>> The types reported are always double, however n[2] == 0.1 reports FALSE as 
>> well.
>> 
>> The whole problem is solved by switching from all(n == test) to 
>> all(as.character(n) == as.character(test)). However that is weird, isn’t it?
>> 
>> Does this work as intended? Thanks for any help, advice and suggestions in 
>> advance.
> 
> I guess this has something to do with how the sequence is built and
> the inherent error of floating point arithmetic. In fact, if you
> return test minus n, you'll get:
> 
> [1] 0.000000e+00 0.000000e+00 2.220446e-16 0.000000e+00
> 
> and the error gets bigger when you continue the sequence; e.g., this
> is for c(1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7):
> 
> [1] 0.000000e+00 0.000000e+00 2.220446e-16 2.220446e-16 4.440892e-16
> [6] 4.440892e-16 4.440892e-16 0.000000e+00
> 
> So, independently of this is considered a bug or not, instead of
> 
> length(n) == length(test) && all(n == test)
> 
> I would use the following condition:
> 
> isTRUE(all.equal(n, test))
> 
> Iñaki
> 
>> 
>> Best regards,
>> Felix


Hi,

This is essentially FAQ 7.31:

  
https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
 


Review that and the references therein to gain some insights into binary 
representations of floating point numbers.
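
A classic illustration of the underlying issue (a small sketch added here for 
convenience, not part of the original reply):

  0.1 + 0.2 == 0.3
  # [1] FALSE
  all.equal(0.1 + 0.2, 0.3)   # comparison with a numeric tolerance
  # [1] TRUE
  abs((0.1 + 0.2) - 0.3) < .Machine$double.eps^0.5
  # [1] TRUE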

Rather than the more complicated code you have above, try the following:

evenlyspaced <- function(x) {
  gaps <- diff(sort(x))
  all(gaps[-1] == gaps[1])
}

Note the use of ?diff:

> diff(c(1, 2, 3, 4))
[1] 1 1 1

> diff(c(1, 3, 4, 5))
[1] 2 1 1

> diff(c(1, 1.1, 1.2, 1.3))
[1] 0.1 0.1 0.1

However, in reality, due to the floating point representation issues noted 
above:

> print(diff(c(1, 1.1, 1.2, 1.3)), 20)
[1] 0.100088818 0.099866773
[3] 0.100088818

So the differences between the numbers are not exactly 0.1.

Using the function above, you get:

> evenlyspaced(c(1, 2, 3, 4))
[1] TRUE

> evenlyspaced(c(1, 3, 4, 5))
[1] FALSE

> evenlyspaced(c(1, 1.1, 1.2, 1.3))
[1] FALSE

As has been noted, if you want the gap comparison to be based upon some margin 
of error, use ?all.equal rather than the explicit equals comparison that I have 
in the function above. Something along the lines of:

evenlyspaced <- function(x) {
  gaps <- diff(sort(x))
  all(sapply(gaps[-1], function(g) isTRUE(all.equal(g, gaps[1]))))
}

In which case, you now get:

evenlyspaced(c(1, 1.1, 1.2, 1.3))
[1] TRUE


Regards,

Marc Schwartz


[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Where does L come from?

2018-08-25 Thread Marc Schwartz via R-devel
On Aug 25, 2018, at 9:26 AM, Hadley Wickham  wrote:
> 
> Hi all,
> 
> Would someone mind pointing to me to the inspiration for the use of
> the L suffix to mean "integer"?  This is obviously hard to google for,
> and the R language definition
> (https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Constants)
> is silent.
> 
> Hadley


The link you have above does reference the use of 'L', but not the derivation.

There is a thread on R-Help from 2012 ("Difference between 10 and 10L"), where 
Prof. Ripley addresses the issue in response to Bill Dunlap and the OP:

  https://stat.ethz.ch/pipermail/r-help/2012-May/311771.html

In searching, I also found the following thread on SO:

  https://stackoverflow.com/questions/22191324/clarification-of-l-in-r/22192378

which had a link to the R-Help thread above and others.

Regards,

Marc Schwartz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel