Re: [Rd] qt() returns Inf with certain negative ncp values

2022-06-15 Thread Stephen Berman
On Tue, 14 Jun 2022 18:00:24 +0200 Martin Maechler  
wrote:

>> GILLIBERT, Andre
>> on Tue, 14 Jun 2022 13:39:41 +0000 writes:
>
> > Hello,
> >> I asked about the following observations on r-help and it
> >> was suggested that they may indicate an algorithmic
> >> problem with qt(), so I thought I should report them
> >> here.
>
> Which is fine.
> Usually you should *CAREFULLY* read the corresponding reference
> documentation before posting.

I actually have read the documentation before, though admittedly I
didn't reread it carefully before posting, but I vaguely remembered the
reservations about the tail accuracy of large values.  The main reason I
posted was my surprise at getting seemingly good values, then suddenly
Inf, then again seemingly good values.  I actually ran into this issue
when I was graphing various effect sizes with t-distributions and with a
large effect suddenly got an error and no graph, but then with an even
larger effect got a graph again.
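
The pattern described above (seemingly good values, then Inf, then good values again) can be looked for systematically by scanning a grid of noncentrality parameters. What follows is only a minimal sketch: the helper name scan_qt_ncp and the particular p, df and grid are illustrative assumptions, not values from the original report.

```r
# Hypothetical helper: evaluate qt() over a grid of ncp values and record
# which results are not finite (Inf, -Inf or NaN).
scan_qt_ncp <- function(p, df, ncp_grid) {
    # qt() is vectorized over ncp; warnings are suppressed here because
    # only the finiteness of the result is of interest.
    q <- suppressWarnings(qt(p, df = df, ncp = ncp_grid))
    data.frame(ncp = ncp_grid, q = q, finite = is.finite(q))
}

# Illustrative values only (assumptions, not taken from the report).
res <- scan_qt_ncp(p = 0.01, df = 5, ncp_grid = seq(-35, 0, by = 0.5))
res[!res$finite, ]   # rows, if any, where qt() did not return a finite value
```

Checking is.finite() before plotting would at least turn an unexpected error into an inspectable table of problem points.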

[...]
> Still, this lack of a better algorithm had bothered me (as R
> Core member) in the past quite a bit, and I had implemented other
> approximations for cases where the current algorithm is
> deficient... but I had not been entirely satisfied, nor had I
> finished exploring or finding solutions in all relevant cases.
>
> In the meantime I had created CRAN package 'DPQ' (Density,
> Probability, Quantile computations) which also contains
> quite a few functions related to better/alternative computations
> of pt(*, ncp=*)  which I call pnt(), not the least because R's
> implementation of the algorithm is in   /src/nmath/pnt.c
> and the C function is called pnt().
>
> Till now, I have not found a student or a collaborator to
> finally get this project further  {{hint, hint!}}.
>
> In DPQ, (download the *source* package if you are interested),
> there's a help page listing the current approaches I have
>
>   https://search.r-project.org/CRAN/refmans/DPQ/html/pnt.html
> or
>   https://rdrr.io/cran/DPQ/man/pnt.html
>
> Additionally, in the source (man/pnt.Rd) there are comments about a not yet
> implemented one, and there are even two R scripts exhibiting
> bogus (and already fixed) behavior of the non-central t CDF:
>
>  https://rdrr.io/rforge/DPQ/src/tests/t-nonc-tst.R   and
>  https://rdrr.io/rforge/DPQ/src/tests/pnt-prec.R
>
> Indeed, this situation *can* be improved, but it needs dedicated work
> of people somewhat knowledgeable in applied math etc.
>
> Would you (readers ..) be interested in helping?

I'm afraid I don't have enough knowledge or time to be useful for such a
project.  But what you describe sounds interesting and I'll try to find
time to look at it.

Thanks for your reply, I really appreciate it.

Steve Berman

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] build failure: 'hashtab' is not an exported object from 'namespace:utils'

2021-12-16 Thread Stephen Berman
On Thu, 16 Dec 2021 09:57:40 +0000 Prof Brian Ripley wrote:

> On 16/12/2021 09:13, Stephen Berman wrote:
>> I just did `svn up' on the R development sources, switched to the build
>> directory (I build R out of tree), ran make, and got this:
>
> Precisely which version of R-devel updating from which version? -- this is an
> area that has changed frequently in the last several days.

Yes, it's been more than a month since my last build: I updated from

r81161 | maechler | 2021-11-08 14:30:50 +0100 (Mon, 08 Nov 2021) | 1 line

to

r81384 | pd | 2021-12-16 01:20:16 +0100 (Thu, 16 Dec 2021) | 1 line

> I suspect 'make clean' is not enough -- use 'make distclean' for an ab initio
> build.

That certainly gave me a clean slate -- and the build then succeeded.
Thanks.

On Thu, 16 Dec 2021 12:58:55 +0300 Ivan Krylov  wrote:

> On Thu, 16 Dec 2021 10:13:11 +0100
> Stephen Berman  wrote:
>
>> Is this a known issue and is there a fix?
>
> For me, the fix was to remove the already-installed
> $SVNROOT/library/utils (which didn't yet contain hashtab) and re-run
> make, letting the R build process re-install it from scratch.

I guess that would have been $BUILD/library/utils for me?  Perhaps that
would have shortened the build time, but I had already done `make
distclean' and a complete rebuild was quick enough.  But thanks for the
reply.

Steve Berman



[Rd] build failure: 'hashtab' is not an exported object from 'namespace:utils'

2021-12-16 Thread Stephen Berman
I just did `svn up' on the R development sources, switched to the build
directory (I build R out of tree), ran make, and got this:

make[6]: Entering directory '/home/steve/build/r-devel/src/library/tools/src'
../../../../library/tools/libs/tools.so is unchanged
make[6]: Leaving directory '/home/steve/build/r-devel/src/library/tools/src'
make[5]: Leaving directory '/home/steve/build/r-devel/src/library/tools/src'
make[4]: Leaving directory '/home/steve/build/r-devel/src/library/tools'
make[4]: Entering directory '/home/steve/build/r-devel/src/library/tools'
installing 'sysdata.rda'
Error: 'hashtab' is not an exported object from 'namespace:utils'
Execution halted
make[4]: *** [/home/steve/src/R/r-devel/share/make/basepkg.mk:151: sysdata] Error 1
make[4]: Leaving directory '/home/steve/build/r-devel/src/library/tools'
make[3]: *** [Makefile:36: all] Error 2
make[3]: Leaving directory '/home/steve/build/r-devel/src/library/tools'
make[2]: *** [Makefile:37: R] Error 1
make[2]: Leaving directory '/home/steve/build/r-devel/src/library'
make[1]: *** [Makefile:28: R] Error 1
make[1]: Leaving directory '/home/steve/build/r-devel/src'
make: *** [Makefile:61: R] Error 1

I then did `make clean', ran configure and make again, and got the same
failure.  Is this a known issue and is there a fix?

Steve Berman



Re: [Rd] read.table() fails with https in R 3.6 but not in R 3.5

2019-05-06 Thread Stephen Berman
On Mon, 6 May 2019 11:12:25 +0200 Ralf Stubner  wrote:

> On 04.05.19 19:04, Stephen Berman wrote:
>> In versions of R prior to 3.6.0 the following invocation succeeds,
>> returning the data frame shown:
>>
>>> read.table("https://www.dwds.de/r/stat?corpus=kern=tokens=decade=text",
>>> header=TRUE)
>>    Dekade   Anzahl
>> 1    1900 11467254
>> 2    1910 13023370
>> 3    1920 13434601
>> 4    1930 13296355
>> 5    1940 12121250
>> 6    1950 13191131
>> 7    1960 10587420
>> 8    1970 10944129
>> 9    1980 11279439
>> 10   1990 12052652
>>
>> But in version 3.6.0 it fails:
>>
>>> read.table("https://www.dwds.de/r/stat?corpus=kern=tokens=decade=text",
>>> header=TRUE)
>> Error in file(file, "rt") :
>>   cannot open the connection to
>> 'https://www.dwds.de/r/stat?corpus=kern=tokens=decade=text'
>> In addition: Warning message:
>> In file(file, "rt") :
>>   cannot open URL
>> 'https://www.dwds.de/r/stat?corpus=kern=tokens=decade=text':
>> HTTP status was '403 Forbidden'
>
> I can reproduce the behavior on Debian using the CRAN supplied package
> for R 3.6.0. Trying to read the page with 'curl' produces also a 403
> error plus some HTML text (in German) explaining that I am treated as a
> 'robot' due to the supplied User-Agent (here: curl/7.52.1). One
> suggested solution is to adjust that value which does solve the issue:
>
>  > options(HTTPUserAgent='mozilla')

I confirm that works for me, too.  Thanks!  FWIW, the default value of
HTTPUserAgent in R 3.6 here is "R (3.6.0 x86_64-suse-linux-gnu x86_64
linux-gnu)", and using this (in R 3.6) fails as I reported, while the
default value of HTTPUserAgent in R 3.5 here is "R (3.5.0
x86_64-suse-linux-gnu x86_64 linux-gnu)" and using that (in R 3.5)
succeeds.  However, setting HTTPUserAgent in R 3.5 to "libcurl/7.60.0"
fails just as it does in 3.6.  It's not clear to me if this particular
website is being too restrictive or if R 3.6 should deal with it, or at
least mention the issue in NEWS or somewhere else.
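
Since the workaround changes a global option, it may be preferable to set the user agent only for the duration of one call and restore the previous value afterwards. A minimal sketch, assuming a hypothetical helper named with_user_agent (not part of R); the agent string "mozilla" is the one suggested above:

```r
# Hypothetical helper: evaluate 'expr' with a temporary HTTPUserAgent option
# and restore the previous value on exit, even if 'expr' fails.
with_user_agent <- function(agent, expr) {
    old <- options(HTTPUserAgent = agent)
    on.exit(options(old), add = TRUE)
    # 'expr' is a lazily evaluated argument; forcing it here means it runs
    # while the temporary option is in effect.
    force(expr)
}

# Demonstration without network access: the option is visible inside the
# call and restored afterwards.
inside <- with_user_agent("mozilla", getOption("HTTPUserAgent"))

# With network access one could write, for some URL 'u' (placeholder):
#   tab <- with_user_agent("mozilla", read.table(u, header = TRUE))
```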

Steve Berman



[Rd] read.table() fails with https in R 3.6 but not in R 3.5

2019-05-06 Thread Stephen Berman
In versions of R prior to 3.6.0 the following invocation succeeds,
returning the data frame shown:

> read.table("https://www.dwds.de/r/stat?corpus=kern=tokens=decade=text",
>  header=TRUE)
   Dekade   Anzahl
1    1900 11467254
2    1910 13023370
3    1920 13434601
4    1930 13296355
5    1940 12121250
6    1950 13191131
7    1960 10587420
8    1970 10944129
9    1980 11279439
10   1990 12052652

But in version 3.6.0 it fails:

> read.table("https://www.dwds.de/r/stat?corpus=kern=tokens=decade=text",
>  header=TRUE)
Error in file(file, "rt") :
  cannot open the connection to 'https://www.dwds.de/r/stat?corpus=kern=tokens=decade=text'
In addition: Warning message:
In file(file, "rt") :
  cannot open URL 'https://www.dwds.de/r/stat?corpus=kern=tokens=decade=text': HTTP status was '403 Forbidden'

The table at this URL is generated by a query processor and the same
failure happens in 3.6.0 with other queries at this website.  This
website does not appear to serve data via http: replacing https by http
in the above gives the same results, and in 3.6.0 the error message
contains the URL with http but in the warning message the URL is with
https.  I have also tried a few other websites that serve
(non-generated) tabular data via https
(e.g. 
https://graphchallenge.s3.amazonaws.com/synthetic/gc3/Theory-16-25-81-Bk.tsv)
and with these read.table() succeeds in 3.6.0, so the problem isn't
https in general.  Maybe it has to do with the page being generated
rather than static?  There's only one reference to https in the 3.6.0
NEWS, concerning libcurl; I can't tell if it's relevant.

In case it matters, this is with R packaged for openSUSE, and I've found
the above difference between 3.5 and 3.6 on both openSUSE Leap 15.0 and
openSUSE Tumbleweed.

Steve Berman



Re: [Rd] encoding argument of source() in 3.5.0

2018-06-06 Thread Stephen Berman
On Tue, 5 Jun 2018 16:03:54 +0200 Tomas Kalibera  
wrote:

> Thanks for the report, fixed in R-devel (74848).
>
> Best
> Tomas

FTR, I confirm that the problem I reported is now fixed under both
GNU/Linux and MS-Windows.  Thanks!

Steve Berman



Re: [Rd] encoding argument of source() in 3.5.0

2018-06-04 Thread Stephen Berman
On Mon, 4 Jun 2018 10:44:11 +0200 Martin Maechler  
wrote:

>> peter dalgaard 
>> on Sun, 3 Jun 2018 23:51:24 +0200 writes:
>
> > Looks like this actually comes from readLines(), nothing
> > to do with source() as such: In current R-devel (still):
>
> >> f <- file("http://home.versanet.de/~s-berman/source2.R", encoding="UTF-8")
> >> readLines(f)
> > character(0)
> >> close(f)
> >> f <- file("http://home.versanet.de/~s-berman/source2.R")
> >> readLines(f)
> > [1] "source.test2 <- function() {"   "print(\"Non-ascii: äöüß\")"
> > [3] "}" 
>
> > -pd
>
> and that's not even readLines(), but rather how exactly the
> connection is defined [even in your example above]
>
>   > urlR <- "http://home.versanet.de/~s-berman/source2.R"
>   > readLines(urlR, encoding="UTF-8")
>   [1] "source.test2 <- function() {"   "print(\"Non-ascii: äöüß\")"
>   [3] "}" 
>   > f <- file(urlR, encoding = "UTF-8")
>   > readLines(f)
>   character(0)
>
> and the same behavior with scan()  instead of readLines() :
>
>> scan(urlR,"") # works
> Read 7 items
> [1] "source.test2"   "<-" "function()" "{" 
> [5] "print(\"Non-ascii:" "äöüß\")""}" 
>> scan(f,"") # fails
> Read 0 items
> character(0)
>> 
>
> So it seems as if the bug is in the file() [or url()] C code ..

Yes, the problem seems to be restricted to loading files from a
(non-local) URL; i.e. this works fine on my computer:

  > source("file:///home/steve/prog/R/source2.R", encoding="UTF-8")

Also, I noticed this works too:

  > read.table("http://home.versanet.de/~s-berman/table2", encoding="UTF-8", skip=1)

where (if I read the source correctly) using `skip=1' makes read.table()
call readLines().  (The read.table() invocation also works without
`skip'.)
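
Given the observation quoted above that readLines() with a character URL and `encoding' works while a file() connection opened with `encoding' does not, one possible stopgap is to read the lines first and then parse and evaluate them, bypassing source()'s `encoding' argument. This is only a sketch under that assumption, not the fix being discussed for R-devel; the helper name source_marked_utf8 is hypothetical, and the demonstration uses a local temporary file so that it runs without network access.

```r
# Hypothetical helper: read the lines, mark them as UTF-8, then parse and
# evaluate them, instead of passing 'encoding' to source().
source_marked_utf8 <- function(path_or_url, envir = parent.frame()) {
    # With a character file name or URL, 'encoding' only marks the strings
    # read as UTF-8; no re-encoding connection is created.
    lines <- readLines(path_or_url, encoding = "UTF-8", warn = FALSE)
    exprs <- parse(text = lines, keep.source = FALSE, encoding = "UTF-8")
    for (e in exprs) eval(e, envir = envir)
    invisible(length(exprs))
}

# Self-contained demonstration with a temporary file written as UTF-8.
tmp <- tempfile(fileext = ".R")
code <- "source.test2 <- function() 'Non-ascii: \u00e4\u00f6\u00fc\u00df'"
writeLines(enc2utf8(code), tmp, useBytes = TRUE)
env <- new.env()
n <- source_marked_utf8(tmp, envir = env)
env$source.test2()
```

Unlike source(), this sketch does not echo expressions or keep source references.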

> But then we also have to consider Windows .. where I think most changes have
> happened during the  R-3.4.4 --> R-3.5.0  transition.

Yes, please.  I need (or at least it would be convenient) to be able to
load R code containing non-ascii characters from the web under
MS-Windows.

Steve Berman



[Rd] encoding argument of source() in 3.5.0

2018-06-02 Thread Stephen Berman
In R 3.5.0 using the `encoding' argument of source() prevents loading
files from the internet; without the `encoding' argument files can be
loaded from the internet, but if they contain non-ascii characters,
these are not correctly displayed under MS-Windows (but they are
correctly displayed under GNU/Linux).  With R 3.4.{2,3,4} there is no
such problem: using `encoding' the files are loaded and non-ascii
characters are correctly displayed under MS-Windows (but not without
`encoding').  Here is a transcript from R 3.5.0 under GNU/Linux (the
URLs are real, in case anyone wants to try and reproduce the problem):

> ls()
character(0)
> source("http://home.versanet.de/~s-berman/source1.R", encoding="UTF-8")
> ls()
character(0)
> source("http://home.versanet.de/~s-berman/source2.R", encoding="UTF-8")
> ls()
character(0)
> source("http://home.versanet.de/~s-berman/source1.R")
> ls()
[1] "source.test1"
> source("http://home.versanet.de/~s-berman/source2.R")
> ls()
[1] "source.test1" "source.test2"
> source.test1()
[1] "This is a test."
> source.test2()
[1] "Non-ascii: äöüß"

(The four non-ascii characters are Unicode 0xE4, 0xF6, 0xFC, 0xDF.)
With 3.5.0 under MS-Windows, the transcript is the same except for the
display of the last output, which is this:

[1] "Non-ascii: Ã¤Ã¶Ã¼ÃŸ"

(Here there are eight non-ascii characters: the two UTF-8 bytes of each
of the four non-ascii characters above, each byte displayed as a
separate character.)

Here is a transcript from R 3.4.3 under MS-Windows (under GNU/Linux it's
the same except that the non-ascii characters are also correctly
displayed even without the `encoding' argument):

> ls()
character(0)
> source("http://home.versanet.de/~s-berman/source1.R")
> ls()
[1] "source.test1"
> source("http://home.versanet.de/~s-berman/source2.R")
> ls()
[1] "source.test1" "source.test2"
> source.test1()
[1] "This is a test."
> source.test2()
[1] "Non-ascii: Ã¤Ã¶Ã¼ÃŸ"
> rm(source.test2)
> ls()
[1] "source.test1"
> source("http://home.versanet.de/~s-berman/source2.R", encoding="UTF-8")
> ls()
[1] "source.test1" "source.test2"
> source.test2()
[1] "Non-ascii: äöüß"

I did a web search but didn't find any reports of this issue, nor did I
see any relevant entry in the 3.5.0 NEWS, so this looks like a bug, but
maybe I've overlooked something.  I'd be grateful for any enlightenment.

Steve Berman



Re: [Rd] Another issue with Sys.timezone

2017-10-20 Thread Stephen Berman
On Fri, 20 Oct 2017 09:15:42 +0200 Martin Maechler <maech...@stat.math.ethz.ch> 
wrote:

>>>>>> Stephen Berman <stephen.ber...@gmx.net>
>>>>>> on Thu, 19 Oct 2017 17:12:50 +0200 writes:
>
> > On Wed, 18 Oct 2017 18:09:41 +0200 Martin Maechler
> > <maech...@stat.math.ethz.ch> wrote:
> >>>>>>> Martin Maechler <maech...@stat.math.ethz.ch>
> >>>>>>> on Mon, 16 Oct 2017 19:13:31 +0200 writes:
>
[...]
> >>> whereas on Windows I get Europe/Berlin for the first (why on
> >>> earth - I'm really in Zurich) and get "CEST" ("Central European Summer
> >>> Time")
> >>> for the 2nd one instead of NA ... simply using a smarter version
> >>> of your proposal.   The windows source is
> >>> in R's source at  src/library/base/R/windows/system.R :
> >>> 
> >>> Sys.timezone <- function(location = TRUE)
> >>> {
> >>>     tz <- Sys.getenv("TZ", names = FALSE)
> >>>     if(nzchar(tz)) return(tz)
> >>>     if(location) return(.Internal(tzone_name()))
> >>>     z <- as.POSIXlt(Sys.time())
> >>>     zz <- attr(z, "tzone")
> >>>     if(length(zz) == 3L) zz[2L + z$isdst] else zz[1L]
> >>> }
> >>> 
> >>> From what I read, the last three lines also work in your setup
> >>> where it seems zz would be of length 1, right ?
>
> > Those lines do indeed work here, but zz has three elements:
>
> >> attributes(as.POSIXlt(Sys.time()))$tzone
> > [1] "" "CET"  "CEST"
>
> { "but" ??   yes, three elements is what I see too, but for that
>   reason there's the  if(length(zz) == 3L) ... }

The "but" was in response to "it seems zz would be of length 1", but
perhaps I misunderstood you.
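
To make the length-3 versus length-1 distinction concrete, the last three lines of the quoted Windows code can be wrapped in a small standalone function. This is a sketch only; the name tz_abbrev is hypothetical and the `time' argument is added purely for illustration.

```r
# Hypothetical wrapper around the last three lines of the quoted code.
tz_abbrev <- function(time = Sys.time()) {
    z <- as.POSIXlt(time)
    zz <- attr(z, "tzone")
    # With three elements (zone name, standard abbreviation, DST
    # abbreviation), isdst = 0 selects element 2 and isdst = 1 element 3.
    # If isdst is -1 (unknown), the index is 1, i.e. the first element.
    # With a single element (as for UTC) that element is returned.
    if (length(zz) == 3L) zz[2L + z$isdst] else zz[1L]
}

tz_abbrev()                                                # local zone
tz_abbrev(as.POSIXct("2017-10-20 12:00:00", tz = "UTC"))   # one-element case
```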

[...]
> >> As you say yourself, the above system("... xargs md5sum ...")
> >> using workaround is really too platform specific  but I'd guess
> >> there should be a less error prone way to get the long timezone
> >> name on your system ...
>
> > If I understand the zic(8) man page, the files in /usr/share/zoneinfo
> > should contain this information, but I don't know how to extract it,
> > since these are compiled files.  And since on my system /etc/localtime
> > is a copy of one of these compiled files, I don't know of any other way
> > to recover the location name without comparing it to those files.
>
> >> If that remains "contained" (i.e. small) and works with files
> >> and R's files tools -- e.g. file.*() ones [but not system()],
> >> I'd consider a patch to the above source file
> >> (sent by you to the R-devel mailing list --- or after having
> >> gotten an account there by asking, via bug report & patch
> >> attachment at https://bugs.r-project.org/ )
>
> > If comparing file size sufficed, that would be easy to do in R;
> > unfortunately, it is not sufficient, since some files designating
> > different time zones in /usr/share/zoneinfo do have the same size.  So
> > the only alternative I can think of is to compare bytes, e.g. with
> > md5sum or with cmp.  Is there some way to do this in R without using
> > system()?
>
> Can't you use
>   tz1 <- readBin("/etc/localtime", "raw", 200L)
> plus later
>   tz2 <- gsub(...,  rawToChar(tz1))
>
> on your  /etc/localtime file 
> almost identically as the current code does for "/etc/timezone" ?

Oh, thanks.  I've looked at this code over and over again in the last
few days and somehow still didn't see its usefulness (maybe because I
haven't had occasion to deal with binary data in R till now).  Anyway,
just substituting "/etc/localtime" for "/etc/timezone" doesn't work,
since my /etc/localtime file seems not to hold the timezone location
name in a form recoverable with rawToChar() (all I see are the
abbreviated timezones CEST, CEMT and CET-1CEST); but I can use the raw
bytes to make the comparison with files in /usr/share/zoneinfo.  With
the attached patch, I get both the timezone location name (with
location=TRUE) and the abbreviated timezone (with location=FALSE).  One
thing I wonder about: is looking at just the first 200 bytes guaranteed
to be sufficient, or would it be better to use n=file.size() to examine
the whole file?
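
For readers without the attachment, the byte comparison described above can be sketched as follows. This is not the attached patch, only an illustration of the idea; the function name find_zone_by_bytes is hypothetical, and it reads each candidate file in full (via file.size()) rather than only the first 200 bytes. Time zones that are aliases of each other have identical files, so the sketch simply returns the first match in alphabetical order.

```r
# Hypothetical sketch: find the file under 'zoneinfo' whose bytes are
# identical to those of 'localtime'; return its relative path or NA.
find_zone_by_bytes <- function(localtime = "/etc/localtime",
                               zoneinfo = "/usr/share/zoneinfo") {
    if (!file.exists(localtime) || !dir.exists(zoneinfo))
        return(NA_character_)
    size <- file.size(localtime)
    target <- readBin(localtime, "raw", n = size)
    files <- list.files(zoneinfo, recursive = TRUE)
    # Cheap filter first: only files of the same size can be identical.
    same_size <- which(file.size(file.path(zoneinfo, files)) == size)
    for (nm in files[same_size]) {
        bytes <- readBin(file.path(zoneinfo, nm), "raw", n = size)
        if (identical(bytes, target)) return(nm)
    }
    NA_character_
}

# Demonstration on a small made-up directory tree (not real zone data).
zi <- file.path(tempdir(), "zoneinfo-demo-bytes")
dir.create(file.path(zi, "A"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(zi, "B"), showWarnings = FALSE)
writeBin(as.raw(1:10), file.path(zi, "A", "one"))
writeBin(as.raw(11:20), file.path(zi, "B", "two"))    # same size as A/one
lt <- file.path(tempdir(), "localtime-demo-bytes")
file.copy(file.path(zi, "B", "two"), lt, overwrite = TRUE)
found <- find_zone_by_bytes(lt, zi)
```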

Steve Berman

*** datetime.R.orig	2017-10-20 17:15:0

Re: [Rd] Another issue with Sys.timezone

2017-10-19 Thread Stephen Berman
On Wed, 18 Oct 2017 18:09:41 +0200 Martin Maechler <maech...@stat.math.ethz.ch> 
wrote:

>>>>>> Martin Maechler <maech...@stat.math.ethz.ch>
>>>>>> on Mon, 16 Oct 2017 19:13:31 +0200 writes:

(I also included a reply to part of this response of yours below.)

>>>>>> Stephen Berman <stephen.ber...@gmx.net>
>>>>>> on Sun, 15 Oct 2017 01:53:12 +0200 writes:
>
>> > (I reported the test failure mentioned below to R-help but was advised
>> > that this list is the right one to address the issue; in the meantime I
>> > investigated the matter somewhat more closely, including searching
>> > recent R-devel postings, since I haven't been following this list.)
>> 
>> > Last May there were two reports here of problems with Sys.timezone, one
>> > where the zoneinfo directory is in a nonstandard location
>> > (https://stat.ethz.ch/pipermail/r-devel/2017-May/074267.html) and the
>> > other where the system lacks the file /etc/localtime
>> > (https://stat.ethz.ch/pipermail/r-devel/2017-May/074275.html).  My
>> > system exhibits a third case: it lacks /etc/timezone and does not set
>> > TZ systemwide, but it does have /etc/localtime, which is a copy of,
>> > rather than a symlink to, a file under zoneinfo.  On this system
>> > Sys.timezone() returns NA and the Sys.timezone test in reg-tests-1d
>> > fails.  However, on my system I can get the (abbreviated) timezone in
>> > R by using as.POSIXlt, e.g. as.POSIXlt(Sys.time())$zone.  If
>> > Sys.timezone took advantage of this, e.g. as below, it would be useful
>> > on such systems as mine and the regression test would pass.
>> 
>> > my.Sys.timezone <-
>> > function (location = TRUE)
>> > {
>> >     tz <- Sys.getenv("TZ", names = FALSE)
>> >     if (!location || nzchar(tz))
>> >         return(Sys.getenv("TZ", unset = NA_character_))
>> >     lt <- normalizePath("/etc/localtime")
>> >     if (grepl(pat <- "^/usr/share/zoneinfo/", lt) ||
>> >         grepl(pat <- "^/usr/share/zoneinfo.default/", lt))
>> >         sub(pat, "", lt)
>> >     else if (lt == "/etc/localtime")
>> >         if (!file.exists("/etc/timezone"))
>> >             return(as.POSIXlt(Sys.time())$zone)
>> >         else if (dir.exists("/usr/share/zoneinfo") && {
>> >             info <- file.info(normalizePath("/etc/timezone"), extra_cols = FALSE)
>> >             (!info$isdir && info$size <= 200L)
>> >         } && {
>> >             tz1 <- tryCatch(readBin("/etc/timezone", "raw", 200L),
>> >                             error = function(e) raw(0L))
>> >             length(tz1) > 0L && all(tz1 %in% as.raw(c(9:10, 13L, 32:126)))
>> >         } && {
>> >             tz2 <- gsub("^[[:space:]]+|[[:space:]]+$", "", rawToChar(tz1))
>> >             tzp <- file.path("/usr/share/zoneinfo", tz2)
>> >             file.exists(tzp) && !dir.exists(tzp) &&
>> >                 identical(file.size(normalizePath(tzp)), file.size(lt))
>> >         })
>> >             tz2
>> >         else NA_character_
>> > }
>> 
>> > One problem with this is that the zone component of as.POSIXlt only
>> > holds the abbreviated timezone, not the Olson name.  
>> 
>> Yes, indeed.  So, really only for  Sys.timezone(location = FALSE)  this
>> should be given, for the default  location = TRUE   it should
>> still give NA (i.e. NA_character_)  in your setup.
>> 
>> Interestingly, the Windows version of Sys.timezone(location =
>> FALSE) uses something like your proposal, and I tend to think that
>> -- again only for location=FALSE -- this should be used on
>> non-Windows as well, at least instead of returning NA then.
>> 
>> Also for me on 3 different Linuxen (Fedora 24, F. 26, and ubuntu
>> 14.04 LTS), I get
>> 
>>   > Sys.timezone()
>>   [1] "Europe/Zurich"
>>   > Sys.timezone(FALSE)
>>   [1] NA
>

[Rd] Another issue with Sys.timezone

2017-10-14 Thread Stephen Berman
(I reported the test failure mentioned below to R-help but was advised
that this list is the right one to address the issue; in the meantime I
investigated the matter somewhat more closely, including searching
recent R-devel postings, since I haven't been following this list.)

Last May there were two reports here of problems with Sys.timezone, one
where the zoneinfo directory is in a nonstandard location
(https://stat.ethz.ch/pipermail/r-devel/2017-May/074267.html) and the
other where the system lacks the file /etc/localtime
(https://stat.ethz.ch/pipermail/r-devel/2017-May/074275.html).  My
system exhibits a third case: it lacks /etc/timezone and does not set TZ
systemwide, but it does have /etc/localtime, which is a copy of, rather
than a symlink to, a file under zoneinfo.  On this system Sys.timezone()
returns NA and the Sys.timezone test in reg-tests-1d fails.  However, on
my system I can get the (abbreviated) timezone in R by using as.POSIXlt,
e.g. as.POSIXlt(Sys.time())$zone.  If Sys.timezone took advantage of
this, e.g. as below, it would be useful on such systems as mine and the
regression test would pass.

my.Sys.timezone <-
function (location = TRUE)
{
    tz <- Sys.getenv("TZ", names = FALSE)
    if (!location || nzchar(tz))
        return(Sys.getenv("TZ", unset = NA_character_))
    lt <- normalizePath("/etc/localtime")
    if (grepl(pat <- "^/usr/share/zoneinfo/", lt) ||
        grepl(pat <- "^/usr/share/zoneinfo.default/", lt))
        sub(pat, "", lt)
    else if (lt == "/etc/localtime")
        if (!file.exists("/etc/timezone"))
            return(as.POSIXlt(Sys.time())$zone)
        else if (dir.exists("/usr/share/zoneinfo") && {
            info <- file.info(normalizePath("/etc/timezone"), extra_cols = FALSE)
            (!info$isdir && info$size <= 200L)
        } && {
            tz1 <- tryCatch(readBin("/etc/timezone", "raw", 200L),
                            error = function(e) raw(0L))
            length(tz1) > 0L && all(tz1 %in% as.raw(c(9:10, 13L, 32:126)))
        } && {
            tz2 <- gsub("^[[:space:]]+|[[:space:]]+$", "", rawToChar(tz1))
            tzp <- file.path("/usr/share/zoneinfo", tz2)
            file.exists(tzp) && !dir.exists(tzp) &&
                identical(file.size(normalizePath(tzp)), file.size(lt))
        })
            tz2
        else NA_character_
}

One problem with this is that the zone component of as.POSIXlt only
holds the abbreviated timezone, not the Olson name.  I don't know how to
get the Olson name using only R functions, but maybe it would be good
enough to return the abbreviated timezone where possible, e.g. as above.
(On my system I can get the Olson name of the timezone in R with a shell
pipeline, e.g.: system("find /usr/share/zoneinfo/ -type f | xargs md5sum
| grep $(md5sum /etc/localtime | cut -d ' ' -f 1) | head -n 1 | cut -d
'/' -f 5,6"), but the last part of this is tailored to my configuration
and the whole thing is not OS-neutral, so it isn't suitable for
Sys.timezone.)
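
The checksum comparison in the shell pipeline above can also be written with R's own tools::md5sum(), which avoids system() and the path-specific cut at the end. This is a rough sketch rather than a proposal for Sys.timezone; the function name zones_matching_md5 is hypothetical. It returns every matching file, since zones that are aliases of each other have identical files and so cannot be told apart this way.

```r
# Hypothetical sketch: relative paths of all files under 'zoneinfo' whose
# MD5 checksum equals that of 'localtime'.
zones_matching_md5 <- function(localtime = "/etc/localtime",
                               zoneinfo = "/usr/share/zoneinfo") {
    target <- unname(tools::md5sum(localtime))
    if (is.na(target)) return(character(0))   # localtime missing/unreadable
    files <- list.files(zoneinfo, recursive = TRUE)
    sums <- unname(tools::md5sum(file.path(zoneinfo, files)))
    files[!is.na(sums) & sums == target]
}

# Demonstration on a small made-up directory tree (not real zone data).
zi <- file.path(tempdir(), "zoneinfo-demo-md5")
for (d in c("A", "B", "C"))
    dir.create(file.path(zi, d), recursive = TRUE, showWarnings = FALSE)
writeBin(as.raw(1:10), file.path(zi, "A", "one"))
writeBin(as.raw(11:20), file.path(zi, "B", "two"))
file.copy(file.path(zi, "B", "two"), file.path(zi, "C", "alias"),
          overwrite = TRUE)
lt <- file.path(tempdir(), "localtime-demo-md5")
file.copy(file.path(zi, "B", "two"), lt, overwrite = TRUE)
matches <- zones_matching_md5(lt, zi)
```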

Steve Berman
