[Rd] install.packages() / update.packages() sometimes outputs to stdout and sometimes to stderr [and menu() readline()]

2015-05-17 Thread Henrik Bengtsson
I've noticed that install.packages()
[https://svn.r-project.org/R/trunk/src/library/utils/R/packages.R] and
update.packages()
[https://svn.r-project.org/R/trunk/src/library/utils/R/packages2.R]
sometimes output to stdout and sometimes to stderr.

It looks like stderr is used (e.g. via cat()) when the message is part
of querying the user, e.g.

update.packages - function(lib.loc = NULL, repos = getOption(repos),
[...]
cat(old[k, Package], :\n,
Version, old[k, Installed],
installed in, old[k, LibPath],
if(checkBuilt) paste(built under R, old[k, Built]),
\n,
Version, old[k, ReposVer], available at,
simplifyRepos(old[k, Repository], type))
cat(\n)
answer - substr(readline(Update (y/N/c)?  ), 1L, 1L)
if(answer == c | answer == C) {
cat(cancelled by user\n)
return(invisible())
}

but it is not consistently so, because some are sent to stderr (e.g.
via message()), e.g.

install.packages -
function(pkgs, lib, repos = getOption(repos),
[...]
if(action == interactive  interactive()) {
msg -
ngettext(sum(later  hasSrc),
 Do you want to install from sources
the package which needs compilation?,
 Do you want to install from sources
the packages which need compilation?)
message(msg, domain = NA)
res - readline(y/n: )
if(res != y) later - later  !hasSrc
} else if (action == never) {
cat(  Binaries will be installed\n)
later - later  !hasSrc
}

Also, as one see in the latter example, it is not only interactive
user queries for which stdout is used.  It's simply not consistent -
at least I cannot see pattern.

If these are not intended behaviors, I'm happy to provide patches.
I'd prefer stderr for all user queries (see below) - but I can also
see how this is something that needs considered thoughts and made an
official design policy across the R base distribution.


[Related to the above but could deserve it's own thread (feel free to
move the below to its own thread)]

utils::menu(..., graphics=FALSE)
[https://svn.r-project.org/R/trunk/src/library/utils/R/menu.R] queries
the user via standard output, which becomes an issue when running
interactive report generators, which mostly captures stdout and makes
it part of the produced artifact.  Personally, I'd argue that querying
the user via stderr would be a better choice in more cases.

Also, it's a bit weird that base::readline(), which is used for the
actual prompting of the user, is sent neither to R's stdout nor
stderr, e.g.

 zz - file(all.Rout, open = wt)
 sink(zz); sink(zz, type = message)
 ans - menu(letters[1:3], title=Select one:, graphics=FALSE)
Selection: 1

 sink(type = message); sink()
 ans
[1] 1

 cat(readLines(all.Rout), sep=\n)
Select one:

1: a
2: b
3: c

Note the only thing displayed to the user is the prompt Selection: ,
which is generated by readline().  It does however output to the
system's stdout (verified on Linux and Windows), e.g.

$ Rscript -e readline('Press ENTER: ')  stdout.log
$ cat stdout.log
Press ENTER:
[1] 

Compare this to how it works in, for instance, bash:

$ read -p Press ENTER:  ans  stdout.log
Press ENTER:
$ read -p Press ENTER:  ans  stderr.log
$ cat stderr.log
Press ENTER:

My preference would be that menu() and readline() and other messages
for querying the user would output to the same connection/stream.
Again, I'd favor stderr over stdout, but possibly even a third
alternative designed specifically for user queries, cf. my R-devel
post '[Rd] Output to raw console rather than stdout/stderr?' on
2015-02-01 (https://stat.ethz.ch/pipermail/r-devel/2015-February/070578.html).
Just like when using menu(..., graphics=TRUE), this would not clutter
up the output to stdout (or stderr).  But even without this third
alternative, I argue that stderr is better than stdout. I'm happy to
provide patches for this as well.


Thanks,

Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] The function cummax() seems to have a bug.

2015-05-17 Thread Henrik Bengtsson
Below is some further troubleshooting on this:

From code inspection this bug happens for only:

* for integer values
* when the first element is NA_integer_ and the second is not.


Examples:

# Numeric/doubles works as expected
 cummax(c(NA_real_, 0, 1, 2, 3))
[1] NA NA NA NA NA

# It does not occur when the first value is non-NA
 cummax(c(0L, NA_integer_, 1L, 2L, 3L))
[1]  0 NA NA NA NA

# When first value is NA, it is not remembered
# (because internal for loop starts with 2nd element)
 cummax(c(NA_integer_, 0L, 1L, 2L, 3L))
[1] NA  0  1  2  3

The problem is not there for cummin():

 cummin(c(0L, NA_integer_, 1L, 2L, 3L))
[1]  0 NA NA NA NA
 cummin(c(NA_integer_, 0L, 1L, 2L, 3L))
[1] NA NA NA NA NA

but that is just pure luck due to the fact how NA_integer_ is
internally represented as the smallest possible 4-byte signed integer,
i.e.

LibExtern intR_NaInt;   /* NA_INTEGER:= INT_MIN currently */
#define NA_INTEGER  R_NaInt

Note the comment, which implies that code should not rely on the
assumption that NA_integer_ == NA_INTEGER == R_NaInt == INT_MIN; it
could equally well have been INT_MAX, which in case cummin()would
return the wrong result whereas cummax() wouldn't. So, cummin() makes
the same mistake ascummax(), where the for-loop skips the test for NA
of the first element, cf.
https://github.com/wch/r-source/blob/trunk/src/main/cum.c#L145-L148

The simple solution is probably to do (cf. native icumsum):

[HB-X201]{hb}: svn diff src/main/cum.c
Index: src/main/cum.c
===
--- src/main/cum.c  (revision 68378)
+++ src/main/cum.c  (working copy)
@@ -130,7 +130,7 @@
 int *ix = INTEGER(x), *is = INTEGER(s);
 int max = ix[0];
 is[0] = max;
-for (R_xlen_t i = 1 ; i  xlength(x) ; i++) {
+for (R_xlen_t i = 0 ; i  xlength(x) ; i++) {
if(ix[i] == NA_INTEGER) break;
is[i] = max = (max  ix[i]) ? max : ix[i];
 }
@@ -142,7 +142,7 @@
 int *ix = INTEGER(x), *is = INTEGER(s);
 int min = ix[0];
 is[0] = min;
-for (R_xlen_t i = 1 ; i  xlength(x) ; i++ ) {
+for (R_xlen_t i = 0 ; i  xlength(x) ; i++ ) {
if(ix[i] == NA_INTEGER) break;
is[i] = min = (min  ix[i]) ? min : ix[i];
 }

/Henrik

On Sun, May 17, 2015 at 4:13 AM, Dongcan Jiang dongcan.ji...@gmail.com wrote:
 Hi,

 The function cummax() seems to have a bug.

 x - c(NA, 0)
 storage.mode(x) - integer
 cummax(x)
 [1] NA  0

 The correct result of this case should be NA NA. The mistake in
 [https://github.com/wch/r-source/blob/trunk/src/main/cum.c#L130-L136] may be
 the reason.

 Best Regards,
 Dongcan

 --
 Dongcan Jiang
 Team of Search Engine  Web Mining
 School of Electronic Engineering  Computer Science
 Peking University, Beijing, 100871, P.R.China

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] That 'make check-all' problem with the survival package

2015-05-17 Thread Hin-Tak Leung
--
On Sat, May 16, 2015 2:33 PM BST Marc Schwartz wrote:


 On May 16, 2015, at 6:11 AM, Hin-Tak Leung ht...@users.sourceforge.net 
 wrote:
 
 
 
 --
 On Sat, May 16, 2015 8:04 AM BST Uwe Ligges wrote:
 
 Not sure why this goes to R-devel. You just could have asked the 
 maintainer. Terry Therneau is aware of it and promised he will fix it.
 
 
 The quickest fix is to add cmprsk to the recommended list , and that's is an 
 R-devel issue.


Actually, the easiest solution would be for Terry to either modify relevant 
code according to:

  
http://cran.r-project.org/doc/manuals/r-release/R-exts.html#Suggested-packages

or perhaps better, as noted, to remove the need for cmprsk at all.


We shall disagree on what is quickest - one other option I did not mention is 
to back
out the version upgrade to 2.38-1 in February, and ship the previous 2.37-7.

I trust that Terry made the addition for good reasons. So my thoughts are 
geared more towards
how to accommodate the new addition, rather than revert/remove it.
(though I'd be happier to isolate a smaller change than a bulk 2.37-7 - 2.38.1 
upgrade
and think more about that. Hence the mention of the repositories.).

The latter raises the philosophical issue as to whether or not a Recommended 
package should or really needs to have any connections at all to third party 
CRAN packages. It seems to me that the default R distribution of “Base” and 
“Recommended” packages should be able to fully run in a stand alone manner 
without any declared external connections to other non-default packages.

The only other Recommended package that I found that has such a connection is 
nlme, which has a Suggests for Frank’s Hmisc. However, based upon a grep 
review of the package contents, there are only two code based references to 
Hmisc that I located:

./R/newFunc.R:264:      ## e.g. Hmisc's labelled
./tests/augPred_lab.R:2:if(require(Hmisc)) {

Neither of which is really needed (the first being a comment only, so benign) 
and of course the latter goes against the current guidance in R-exts. The 
second, within the if() code, uses the Hmisc label() function, which it seems 
to me is not really needed here.

Regards,

Marc Schwartz


 
 On 16.05.2015 07:22, Hin-Tak Leung wrote:
 'make check-all' for current R has been showing this error in the middle
 for a few months now - any thought on fixing this? I think cmprsk
 should be either included in the recommended bundle, or
 the survival vignette to not depend on it. Having 'make check-all' showing
 glaring ERROR's for a few months seems to defeat the purpose of
 doing any checking at all via 'make check-all'.
 
 FWIW, I did look at when/how the issue was introduced, but it appeared
 that svn://svn.r-forge.r-project.org/svnroot/survival is no longer being
 updated, and git://github.com/cran/survival.git only shows release jumps.
 Anyway, if first appears with survival 2.38-1 in February, and as the 
 previous
 2.37-7 was 13 months older, this info is of no use to anybody.
 I didn't write earlier as I thought the issue would go away at some point;
 but obviously this isn't the case after 3 months.
 
 ---
  ERROR
 Errors in running code in vignettes:
 when running code in ‘compete.Rnw’
   ...
 temp$fstat - as.numeric(event)
 
 temp$msex - with(temp, 1 * (sex == M))
 
 fgfit1 - with(temp, crr(etime, fstat, cov1 = cbind(age,
 +     msex, mspike), failcode = 2, cencode = 1, variance = TRUE))
 
   When sourcing ‘compete.R’:
 Error: could not find function crr
 Execution halted
 
 * checking re-building of vignette outputs ... NOTE
 Error in re-building vignettes:
   ...
 Warning in coxph(Surv(futime, death) ~ group:age2 + sex + strata(group),  :
   X matrix deemed to be singular; variable 23 24 25
 Loading required package: cmprsk
 Warning in library(package, lib.loc = lib.loc, character.only = TRUE, 
 logical.return = TRUE,  :
   there is no package called ‘cmprsk’
 
 Error: processing vignette 'compete.Rnw' failed with diagnostics:
  chunk 15 (label = finegray)
 Error in eval(expr, envir, enclos) : could not find function crr
 Execution halted
 
 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel
 
 
 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] The function cummax() seems to have a bug.

2015-05-17 Thread Dongcan Jiang
Hi,

The function cummax() seems to have a bug.

 x - c(NA, 0)
 storage.mode(x) - integer
 cummax(x)
[1] NA  0

The correct result of this case should be NA NA. The mistake in [
https://github.com/wch/r-source/blob/trunk/src/main/cum.c#L130-L136] may be
the reason.

Best Regards,
Dongcan

-- 
Dongcan Jiang
Team of Search Engine  Web Mining
School of Electronic Engineering  Computer Science
Peking University, Beijing, 100871, P.R.China

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel