Re: [Rd] utils::install.packages with quiet=TRUE fails for source packages on Windows

2018-01-26 Thread peter dalgaard
 mvtnorm  1.0-6  1.0-7  TRUE
>> Do you want to install from sources the package which needs compilation?
>> y/n: y
>> installing the source package 'mvtnorm'
>> Tracing system2(cmd0, args, env = env, stdout = outfile, stderr = outfile) 
>> on entry
>> args :  Named chr [1:5] "CMD" "INSTALL" "-l" 
>> "\"C:\\Users\\askers\\AppData\\Local\\Temp\\RtmpoRb97l\"" ...
>> command :  chr "C:/PROGRA~1/R/R-34~1.3/bin/x64/R"
>> env :  chr(0)
>> input :  NULL
>> invisible :  logi TRUE
>> minimized :  logi FALSE
>> stderr :  logi FALSE
>> stdin :  chr ""
>> stdout :  logi FALSE
>> wait :  logi TRUE
>> Warning messages:
>> 1: running command '"C:/PROGRA~1/R/R-34~1.3/bin/x64/R" CMD INSTALL -l 
>> "C:\Users\askers\AppData\Local\Temp\RtmpoRb97l" 
>> C:\Users\askers\AppData\Local\Temp\RtmpoRb97l/downloaded_packages/mvtnorm_1.0-7.tar.gz'
>>  had status 1
>> 2: In utils::install.packages("mvtnorm", lib = tempdir(), quiet = TRUE) :
>>   installation of package 'mvtnorm' had non-zero exit status
>> I do not encounter this problem on my Linux machine.
>> Andreas

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] [R-win] Bug 17159 - recursive dir.create() fails on windows shares due to permissions (MMaechler: Resending to R-windows@R-pr..)

2018-01-17 Thread Peter Dalgaard
I can easily believe that. It was mainly for Joris's benefit: it might not be
necessary to reinstall.

-pd

> On 17 Jan 2018, at 11:55 , Thompson, Pete  wrote:
> 
> That solution works fine for the use case where each user has a network based 
> home directory and needs to run R from there, but doesn’t help with my 
> situation. I need to be able to support arbitrary network based paths in 
> arbitrary numbers – so mapping drives isn’t an option. I have found a 
> workaround using symbolic links to the network share created within the 
> temporary folder, but would much prefer that R support UNC paths – it seems a 
> reasonably simple fix.
> 
> Cheers
> Pete
> 
> 
> On 17/01/2018, 10:52, "Peter Dalgaard"  wrote:
> 
>I usually draw a complete blank if  I try to assist our IT department with 
> such issues (we really need better documentation than the Admin manual for 
> large-system installs by non-experts in R).
> 
>However, it is my impression that there are also options involving 
> environment variables and LFS naming. E.g., map the networked user directory 
> to, say, a P: "drive" and make sure that the environment is set up to reflect 
> this.
> 
>-pd
> 
>> On 16 Jan 2018, at 17:52 , Joris Meys  wrote:
>> 
>> Hi all,
>> 
>> I ran into this exact issue yesterday during the exam of statistical
>> computing. Users can install packages in a user library that R tries to
>> create automatically on the network drive of the student. But that doesn't
>> happen, as the UNC path is not read correctly, leading to R attempting to
>> create a local directory and being told it has no right to do so.
>> 
>> That is an older version of R though (3.3), but I'm wondering whether I
>> should ask our IT department to just update R on all these computers to the
>> latest version, or whether we have to look for another solution.
>> 
>> Cheers
>> Joris
>> 
>> On Mon, Jan 8, 2018 at 1:43 PM, Thompson, Pete 
>> wrote:
>> 
>>> Hi, I’d like to ask about bug 17159:
>>> 
>>> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17159
>>> 
>>> I can confirm that I see exactly this bug when using dir.create on paths
>>> of UNC form (\\server\share\xxx), with the recursive flag set. I’m seeing
>>> this when attempting to use install.packages with such a path (which I know
>>> isn’t supported, but would be great if it was!). I can see that a patch has
>>> been suggested for the problem and from looking at the source code I
>>> believe it’s a correct fix. Is there a possibility of getting this patch
>>> included?
>>> 
>>> The existing logic for Windows recursive dir.create (platform.c lines
>>> 2209-22203) appears to be:
>>> - Skip over any \\share at the start of the directory name
>>> - Loop while there are pieces of directory name left (i.e. we haven’t hit
>>> the last \ character)
>>> = Find the next portion of the directory name (up to the next \
>>> character)
>>> = Attempt to create the directory (unless it is of the form x: - i.e. a
>>> drive name)
>>> = Ignore any ‘already exists’ errors, otherwise throw an error
>>> 
>>> This logic appears flawed in that it skips \\share which isn’t a valid
>>> path format (according to https://msdn.microsoft.com/en-
>>> us/library/windows/desktop/aa365247(v=vs.85).aspx ). Dredging my memory,
>>> it’s possible that \\share was a supported format in very old versions of
>>> Windows, but it’s been a long time since the UNC format came in. It’s also
>>> possible that \\share is a valid format in some odd environments, but the
>>> UNC format is far more widely used.
>>> 
>>> The patch suggested by Evan Cortens is simply to change the skip logic to
>>> skip over \\server\share instead of \\share. This will certainly fix the
>>> common use case of using UNC paths, but doesn’t attempt to deal with all
>>> the more complex options in Microsoft’s documentation. I doubt many users
>>> would ask for the complex cases, but the basic UNC format would be of wide
>>> applicability.
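
For concreteness, the suggested skip amounts to consuming a \\server\share
prefix before the component-creation loop starts. A rough R transliteration of
the idea (a sketch only, not the actual C patch; skip_unc is a made-up name):

skip_unc <- function(path) {
    # drop a leading \\server\share, keeping the components still to create
    sub("^\\\\\\\\[^\\\\]+\\\\[^\\\\]+", "", path)
}
skip_unc("\\\\server\\share\\a\\b")   # leaves \a\b: create "a", then "a\b"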
>>> 
>>> Thanks
>>> Pete Thompson
>>> Director, Information Technology
>>> Head of Spotfire Centre of Excellence
>>> IQVIA

Re: [Rd] [R-win] Bug 17159 - recursive dir.create() fails on windows shares due to permissions (MMaechler: Resending to R-windows@R-pr..)

2018-01-17 Thread Peter Dalgaard

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [Rd] Numerical stability in chisq.test

2017-12-28 Thread peter dalgaard


> On 28 Dec 2017, at 13:08 , Kurt Hornik  wrote:
> 
>>>>>> Jan Motl writes:
> 
>> The chisq.test on line 57 contains following code:
>>  STATISTIC <- sum(sort((x - E)^2/E, decreasing = TRUE))
> 
> The preceding 2 lines seem relevant:
> 
>## Sorting before summing may look strange, but seems to be
>## a sensible way to deal with rounding issues (PR#3486):
>STATISTIC <- sum(sort((x - E) ^ 2 / E, decreasing = TRUE))
> 
> -k

My thoughts too. PR 3486 is about simulated tables that theoretically have 
STATISTIC equal to the one observed, but come out slightly different, messing 
up the simulated p value. The sort is not actually intended to squeeze the very 
last bit of accuracy out of the computation, just to make sure that the 
round-off affects equivalent tables in the same way. "Fixing" the code may 
therefore unfix PR#3486; at the very least some care is required if this is 
modified.  
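
To illustrate the round-off point itself (a toy sketch, unrelated to the
actual chisq.test code): floating point addition is not associative, so the
same multiset of cell contributions can sum differently in different orders,
while a fixed sorted order removes that ambiguity.

set.seed(4711)                     # arbitrary seed, just for reproducibility
x <- rexp(100)^10                  # values of wildly different magnitude
y <- sample(x)                     # the same values, permuted
sum(x) == sum(y)                   # may be FALSE: order changes the round-off
sum(sort(x, decreasing = TRUE)) == sum(sort(y, decreasing = TRUE))   # TRUE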

-pd


> 
>> However, based on book "Accuracy and stability of numerical algorithms" 
>> available from:
>>  
>> http://ftp.demec.ufpr.br/CFD/bibliografia/Higham_2002_Accuracy%20and%20Stability%20of%20Numerical%20Algorithms.pdf
>> Table 4.1 on page 89, it is better to sort the data in increasing order than 
>> in decreasing order, when the data are non-negative.
> 
>> An example:
>>  x = matrix(c(rep(1.1, 10000), 10^16), nrow = 10001, ncol = 1)   # We
>> have a vector with 10000*1.1 and 1*10^16
>>  c(sum(sort(x, decreasing = TRUE)), sum(sort(x, decreasing = FALSE)))
>> The result:
>>  10000000000010996 10000000000011000
>> When we sort the data in the increasing order, we get the correct result. If 
>> we sort the data in the decreasing order, we get a result that is off by 4.
> 
>> Shouldn't the sort be in the increasing order rather than in the decreasing 
>> order?
> 
>> Best regards,
>> Jan Motl
> 
> 
>> PS: This post is based on discussion on 
>> https://stackoverflow.com/questions/47847295/why-does-chisq-test-sort-data-in-descending-order-before-summation
>>  and the response from the post to r-h...@r-project.org.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] Dialect for shell scripts

2017-12-18 Thread peter dalgaard
Solaris is pretty much dead at this point (closed source or not), but it is not 
the only oddball OS around. 

The need to support a wide palette of (Unix) OS variations has been declining
rapidly in recent years. Fifteen years ago every supercomputer seemed to have
its own set of OS quirks, and we wanted to be able to support the cutting edge.
Now things seem to converge on a few variations on the big three: Windows, 
MacOS, Linux, possibly on some parallel/cloud infrastructure. 

However, it is probably worth maintaining the conservative defensive policies 
at least for a while yet.

(I suspect that the text in WRE is actually older than the current POSIX 
standard, which is what causes confusion as to what constitutes a Bourne shell. 
It may be worth editing it to say that we are in fact more restrictive than 
current POSIX.)

-pd  

> On 18 Dec 2017, at 23:36 , Paul McQuesten  wrote:
> 
> I do not have a dog in this fight, but I have to ask:
>How much person time is worthwhile to invest in supporting Solaris 10?
> 
> It has been closed-source (Post-Oracle)
> <https://en.wikipedia.org/wiki/Solaris_(operating_system)#Post-Oracle_closed_source_(Solaris_10_after_March_2010,_and_Solaris_11_(2011_and_later))>
> since
> March 2010.
> 
> On Mon, Dec 18, 2017 at 1:23 PM, Kurt Hornik  wrote:
> 
>>>>>>> Iñaki Úcar writes:
>> 
>> Same from here: in addition to what the standards say, it always pays to
>> be defensive and check "Portable Shell Programming" in the Autoconf
>> manual.  Among other things, this says
>> 
>> '$((EXPRESSION))'
>> Arithmetic expansion is not portable as some shells (most notably
>> Solaris 10 '/bin/sh') don't support it.
>> 
>> motivating the code shown below.  Perhaps simplest to always use expr.
>> 
>> -k
>> 
>> 
>>> For what it's worth, Autoconf does not assume that arithmetic
>>> expansion will be available. Instead, it emits the following shell
>>> code:
>> 
>>> if ( eval 'test $(( 1 + 1 )) = 2' ) 2>/dev/null; then
>>>   eval 'func_arith ()
>>>   {
>>>     func_arith_result=$(( $* ))
>>>   }'
>>> else
>>>   func_arith ()
>>>   {
>>>     func_arith_result=`expr "$@"`
>>>   }
>>> fi
>> 
>>> 2017-12-17 23:55 GMT+01:00 Rodrigo Tobar :
>>>> Dear all,
>>>> 
>>>> During a recent package submission, we were highlighted that some lines
>> in
>>>> our configure script didn't follow the correct syntax. The lines looked
>> like
>>>> this:
>>>> 
>>>> x=$(($y/10))
>>>> 
>>>> We were indicated at the time that this is because the statement does
>> not
>>>> use Bourne shell syntax, which is absolutely true, and also that the
>> manual
>>>> warns about this, which is true again. So far everything is clear.
>>>> 
>>>> However, what confuses me is that even when the manual says that "you
>> can
>>>> include an executable (Bourne) shell script configure in your package"
>> [1],
>>>> the associated footnote says something slightly different: "The script
>>>> should only assume a POSIX-compliant /bin/sh" [2]. The footnote goes
>> even
>>>> further, and links to the POSIX specification of the Shell Command
>> Language
>>>> [3] (as published by The Open Group), which explicitly includes
>> arithmetic
>>>> expressions like the one above in its syntax [4].
>>>> 
>>>> My question then is: what exact dialect should be considered? Given
>> that the
>>>> statement above does not work in the Bourne shell, I conclude that the
>>>> Bourne shell is not POSIX-compliant. That in turn would make the manual
>>>> ambiguous as to the precise dialect that should be used by our configure
>>>> scripts, and either the shells used by R should be changed to be
>>>> POSIX-compliant, or the manual edited to be more precise regarding the
>>>> expected dialect.
>>>> 
>>>> Many thanks.
>>>> 
>>>> Rodrigo
>>>> 
>>>> [1] https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Configure-and-cleanup
>>>> [2] https://cran.r-project.org/doc/manuals/r-release/R-exts.html#FOOT25
>>>> [3] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html
>>>> [4] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap

Re: [Rd] Discourage the weights= option of lm with summarized data

2017-12-03 Thread peter dalgaard

> On 3 Dec 2017, at 16:31 , Arie ten Cate  wrote:
> 
> Peter,
> 
> This is a highly structured text. Just for the discussion, I separate
> the building blocks, where (D) and (E) and (F) are new:
> 
> BEGIN OF TEXT 
> 
> (A)
> 
> Non-‘NULL’ ‘weights’ can be used to indicate that different
> observations have different variances (with the values in ‘weights’
> being inversely proportional to the variances);
> 
> (B)
> 
> or equivalently, when the elements of ‘weights’ are positive integers
> w_i, that each response y_i is the mean of w_i unit-weight
> observations
> 
> (C)
> 
> (including the case that there are w_i observations equal to y_i and
> the data have been summarized).
> 
> (D)
> 
> However, in the latter case, notice that within-group variation is not
> used. Therefore, the sigma estimate and residual degrees of freedom
> may be suboptimal;
> 
> (E)
> 
> in the case of replication weights, even wrong.
> 
> (F)
> 
> Hence, standard errors and analysis of variance tables should be
> treated with care.
> 
> END OF TEXT 
> 
> I don't understand (D), partly because it is unclear to me whether (D)
> refers to (C) or to (B)+(C):

B, including C, is "the latter case". 

>If (D) refers only to (C), as the reader might automatically think
> with the repetition of the word "case", then it is unclear to me to
> what block (E) refers.

Not so. If it did, it should go inside the parentheses.

>If, on the other hand, (D) refers to (B)+(C) then (E) probably
> refers to (C) and then I suggest to make this more clear by replacing
> "in the case of replication weights" in (E) by "in the case of
> summarized data".
> 

That would be wrong. Data can be summarized via group means (and SDs, which
are unused, hence the suboptimality), _including_ the special case where all
elements of a group are identical.
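
To spell out what is and isn't recovered from group means (a toy sketch with
made-up data):

set.seed(1)
x <- rep(1:3, each = 5)
y <- rnorm(15, mean = x)
full  <- lm(y ~ x)
ym    <- tapply(y, x, mean)
means <- lm(ym ~ I(1:3), weights = rep(5, 3))
rbind(coef(full), coef(means))             # identical coefficients
c(sigma(full), sigma(means))               # differ: within-group SS unused
c(df.residual(full), df.residual(means))   # 13 vs 1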

> I suggest to change "even wrong" in (E) into the more down-to-earth "wrong".

That would seem to be a matter of taste. 

However, "equivalently" in (B) does not look right.

> 
> (For the record: I prefer something like my original explanation of
> the problem with (C), instead of (D)+(E)+(F):
>"With summarized data the standard errors get smaller with
> increasing numbers of observations w_i. However, when for instance all
> w_i are multiplied with the same constant larger than one, the
> reported standard errors do not get smaller since the w_i are defined
> apart from an arbitrary positive multiplicative constant. Hence the
> reported standard errors tend to be too large and the reported t
> values and the reported number of significance stars too small.
> Obviously, also the reported number of observations and the reported
> number of degrees of freedom are too small."
>Note that with heteroskedasticity, _the_ residual standard error
> has no meaning.)
> 
> Finally, about the original text: (B) and (C) mention only y_i, not
> x_i, while this is about entire observations. Maybe this can be remedied
> also?
> 
>  Arie
> 
> On Tue, Nov 28, 2017 at 1:01 PM, peter dalgaard  wrote:
>> My local R-devel version now has (in ?lm)
>> 
>> Non-‘NULL’ ‘weights’ can be used to indicate that different
>> observations have different variances (with the values in
>> ‘weights’ being inversely proportional to the variances); or
>> equivalently, when the elements of ‘weights’ are positive integers
>> w_i, that each response y_i is the mean of w_i unit-weight
>> observations (including the case that there are w_i observations
>> equal to y_i and the data have been summarized). However, in the
>> latter case, notice that within-group variation is not used.
>> Therefore, the sigma estimate and residual degrees of freedom may
>> be suboptimal; in the case of replication weights, even wrong.
>> Hence, standard errors and analysis of variance tables should be
>> treated with care.
>> 
>> OK?
>> 
>> 
>> -pd
>> 
>> 
>>> On 12 Oct 2017, at 13:48 , Arie ten Cate  wrote:
>>> 
>>> OK. We have now three suggestions to repair the text:
>>> - remove the text
>>> - add "not" at the beginning of the text
>>> - add at the end of the text a warning; something like:
>>> 
>>> "Note that in this case the standard estimates of the parameters are
>>> in general not correct, and hence also the t values and the p value.
>>> Also the number of degrees of freedom is not correct. (The parameter
>>> values are correct.)"

Re: [Rd] Discourage the weights= option of lm with summarized data

2017-11-28 Thread peter dalgaard
My local R-devel version now has (in ?lm)

 Non-‘NULL’ ‘weights’ can be used to indicate that different
 observations have different variances (with the values in
 ‘weights’ being inversely proportional to the variances); or
 equivalently, when the elements of ‘weights’ are positive integers
 w_i, that each response y_i is the mean of w_i unit-weight
 observations (including the case that there are w_i observations
 equal to y_i and the data have been summarized). However, in the
 latter case, notice that within-group variation is not used.
 Therefore, the sigma estimate and residual degrees of freedom may
 be suboptimal; in the case of replication weights, even wrong.
 Hence, standard errors and analysis of variance tables should be
 treated with care.

OK?


-pd


> On 12 Oct 2017, at 13:48 , Arie ten Cate  wrote:
> 
> OK. We have now three suggestions to repair the text:
> - remove the text
> - add "not" at the beginning of the text
> - add at the end of the text a warning; something like:
> 
>  "Note that in this case the standard estimates of the parameters are
> in general not correct, and hence also the t values and the p value.
> Also the number of degrees of freedom is not correct. (The parameter
> values are correct.)"
> 
> A remark about the glm example: the Reference manual says: "For a
> binomial GLM prior weights are used to give the number of trials when
> the response is the proportion of successes ".  Hence in the
> binomial case the weights are frequencies.
> With y <- 0.51 and w <- 100 you get the same result.
> 
>   Arie
> 
> On Mon, Oct 9, 2017 at 5:22 PM, peter dalgaard  wrote:
>> AFAIR, it is a little more subtle than that.
>> 
>> If you have replication weights, then the estimates are right, it is "just" 
>> that the SE from summary.lm() are wrong. Somehow, the text should reflect 
>> this.
>> 
>> It is of some importance when you put glm() into the mix, because you can in 
>> fact get correct results from things like
>> 
>> y <- c(0,1)
>> w <- c(49,51)
>> glm(y~1, weights=w, family=binomial)
>> 
>> -pd
>> 
>>> On 9 Oct 2017, at 07:58 , Arie ten Cate  wrote:
>>> 
>>> Yes.  Thank you; I should have quoted it.
>>> I suggest to remove this text or to add the word "not" at the beginning.
>>> 
>>>  Arie
>>> 
>>> On Sun, Oct 8, 2017 at 4:38 PM, Viechtbauer Wolfgang (SP)
>>>  wrote:
>>>> Ah, I think you are referring to this part from ?lm:
>>>> 
>>>> "(including the case that there are w_i observations equal to y_i and the 
>>>> data have been summarized)"
>>>> 
>>>> I see; indeed, I don't think this is what 'weights' should be used for 
>>>> (the other part before that is correct). Sorry, I misunderstood the point 
>>>> you were trying to make.
>>>> 
>>>> Best,
>>>> Wolfgang
>>>> 
>>>> -Original Message-
>>>> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Arie ten 
>>>> Cate
>>>> Sent: Sunday, 08 October, 2017 14:55
>>>> To: r-devel@r-project.org
>>>> Subject: [Rd] Discourage the weights= option of lm with summarized data
>>>> 
>>>> Indeed: Using 'weights' is not meant to indicate that the same
>>>> observation is repeated 'n' times.  As I showed, this gives erroneous
>>>> results. Hence I suggested that it is discouraged rather than
>>>> encouraged in the Details section of lm in the Reference manual.
>>>> 
>>>>  Arie
>>>> 
>>>> ---Original Message-
>>>> On Sat, 7 Oct 2017, wolfgang.viechtba...@maastrichtuniversity.nl wrote:
>>>> 
>>>> Using 'weights' is not meant to indicate that the same observation is
>>>> repeated 'n' times. It is meant to indicate different variances (or to
>>>> be precise, that the variance of the last observation in 'x' is
>>>> sigma^2 / n, while the first three observations have variance
>>>> sigma^2).
>>>> 
>>>> Best,
>>>> Wolfgang
>>>> 
>>>> -Original Message-
>>>> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Arie ten 
>>>> Cate
>>>> Sent: Saturday, 07 October, 2017 9:36
>>>> To: r-devel@r-project.org
>>>> Subject: [Rd] Discourage the weights= option of lm with 

Re: [Rd] Discourage the weights= option of lm with summarized data

2017-11-28 Thread peter dalgaard
It's on my todo list (for R-devel, it is not _that_ important), other things 
just keep taking priority...

-pd

> On 28 Nov 2017, at 09:29 , Arie ten Cate  wrote:
> 
> Since the three posters agree (only) that there is a bug, I propose to
> file it as a bug, which is the least we can do now.
> 
> There is more to it: the only other case of a change in the Reference
> Manual which I know of, is also about the weights option! This is in
> coxph. The Reference Manual version 3.0.0 (2013) says about coxph:
> 
>   " ... If weights is a vector of integers, then the estimated
> coefficients are equivalent to estimating the model from data with the
> individual cases replicated as many times as indicated by weights."
> 
> This is not true, as can be seen from the following code, which uses
> the data from the first example in the Reference Manual of coxph:
> 
>   library(survival)
>   print(df1 <- as.data.frame(list(
>     time   = c(4,3,1,1,2,2,3),
>     status = c(1,1,1,0,1,1,0),
>     x      = c(0,2,1,1,1,0,0),
>     sex    = c(0,0,0,0,1,1,1)
>   )))
>   print(w <- rep(2,7))
>   print(coxph(Surv(time,status) ~ x + strata(sex),data=df1,weights=w))
>  # manually doubling the data:
>   print(df2 <- rbind(df1,df1))
>   print(coxph(Surv(time,status) ~ x + strata(sex), data=df2))
> 
> This should not come as a surprise, since with coxph the computation
> of the likelihood (given the parameters) for a single observation uses
> also the other observations.
> 
> This bug has been repaired. The present Reference Manual of coxph says
> that the weights option specifies a vector of case weights, to which
> is added only: "For a thorough discussion of these see the book by
> Therneau and Grambsch."
> 
> Let us repair the other bug also.
> 
>Arie
> 
> On Thu, Oct 12, 2017 at 1:48 PM, Arie ten Cate  wrote:
>> OK. We have now three suggestions to repair the text:
>> - remove the text
>> - add "not" at the beginning of the text
>> - add at the end of the text a warning; something like:
>> 
>>  "Note that in this case the standard estimates of the parameters are
>> in general not correct, and hence also the t values and the p value.
>> Also the number of degrees of freedom is not correct. (The parameter
>> values are correct.)"
>> 
>> A remark about the glm example: the Reference manual says: "For a
>> binomial GLM prior weights are used to give the number of trials when
>> the response is the proportion of successes ".  Hence in the
>> binomial case the weights are frequencies.
>> With y <- 0.51 and w <- 100 you get the same result.
>> 
>>   Arie
>> 
>> On Mon, Oct 9, 2017 at 5:22 PM, peter dalgaard  wrote:
>>> AFAIR, it is a little more subtle than that.
>>> 
>>> If you have replication weights, then the estimates are right, it is "just" 
>>> that the SE from summary.lm() are wrong. Somehow, the text should reflect 
>>> this.
>>> 
>>> It is of some importance when you put glm() into the mix, because you can 
>>> in fact get correct results from things like
>>> 
>>> y <- c(0,1)
>>> w <- c(49,51)
>>> glm(y~1, weights=w, family=binomial)
>>> 
>>> -pd
>>> 
>>>> On 9 Oct 2017, at 07:58 , Arie ten Cate  wrote:
>>>> 
>>>> Yes.  Thank you; I should have quoted it.
>>>> I suggest to remove this text or to add the word "not" at the beginning.
>>>> 
>>>>  Arie
>>>> 
>>>> On Sun, Oct 8, 2017 at 4:38 PM, Viechtbauer Wolfgang (SP)
>>>>  wrote:
>>>>> Ah, I think you are referring to this part from ?lm:
>>>>> 
>>>>> "(including the case that there are w_i observations equal to y_i and the 
>>>>> data have been summarized)"
>>>>> 
>>>>> I see; indeed, I don't think this is what 'weights' should be used for 
>>>>> (the other part before that is correct). Sorry, I misunderstood the point 
>>>>> you were trying to make.
>>>>> 
>>>>> Best,
>>>>> Wolfgang
>>>>> 
>>>>> -Original Message-
>>>>> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Arie 
>>>>> ten Cate
>>>>> Sent: Sunday, 08 October, 2017 14:55
>>>>> To: r-devel@r-project.org
>>>>> Subject: [Rd] Discourage the weights= option of lm with summarized data
>>>>> 
>>>>> Indeed: Using 'weights' is not meant to indicate that the same
>>>>> observation is repeated 'n' times.

[Rd] R 3.4.3 scheduled for November 30

2017-11-09 Thread Peter Dalgaard
Full schedule available on developer.r-project.org (pending auto-update from 
SVN)

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] Extreme bunching of random values from runif with Mersenne-Twister seed

2017-11-05 Thread peter dalgaard

> On 5 Nov 2017, at 15:17 , Duncan Murdoch  wrote:
> 
> On 04/11/2017 10:20 PM, Daniel Nordlund wrote:
>> Tirthankar,
>> "random number generators" do not produce random numbers.  Any given
>> generator produces a fixed sequence of numbers that appear to meet
>> various tests of randomness.  By picking a seed you enter that sequence
>> in a particular place and subsequent numbers in the sequence appear to
>> be unrelated.  There are no guarantees that if YOU pick a SET of seeds
>> they won't produce a set of values that are of a similar magnitude.
>> You can likely solve your problem by following Radford Neal's advice of
>> not using the the first number from each seed.  However, you don't need
>> to use anything more than the second number.  So, you can modify your
>> function as follows:
>> function(x) {
>>   set.seed(x, kind = "default")
>>   y = runif(2, 17, 26)
>>   return(y[2])
>> }
>> Hope this is helpful,
> 
> That's assuming that the chosen seeds are unrelated to the function output, 
> which seems unlikely on the face of it.  You can certainly choose a set of 
> seeds that give high values on the second draw just as easily as you can 
> choose seeds that give high draws on the first draw.
> 
> The interesting thing about this problem is that Tirthankar doesn't believe
> that the seed selection process is aware of the function output.  I would say
> that it must be, and he should be investigating how that happens; if he is
> worried about the output, he shouldn't be worrying about R's RNG.
> 

Hmm, no. The basic issue is that RNGs are constructed so that with x_{n+1} = 
f(x_n),
x_1, x_2, x_3,... will look random, not so that f(s_1), f(s_2), f(s_3), ... 
will look random for any s_1, s_2, ... . This is true, even if seeds s_1, s_2, 
... are not chosen so as to mess with the RNG. In the present case, it seems 
that the seeds around 86e6 tend to give similar output. On the other hand, it 
is not _just_ the similarity in magnitude that does it, try e.g.

s <- as.integer(runif(100, 86.54e6, 86.98e6))
r <- sapply(s, function(s){set.seed(s); runif(1,17,26)})
plot(s,r, pch=".")

and no obvious pattern emerges. My best guess is that the seeds are not only of 
similar magnitude, but also have other bit-pattern similarities.
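
If one insists on mapping seeds directly to single draws, a cheap defensive
sketch (the helper name and the burn-in length are made up here) is to
discard a few initial values rather than trust the first one:

seed_draw <- function(seed, burn = 10) {
    set.seed(seed)
    runif(burn + 1, 17, 26)[burn + 1]   # keep only the (burn+1)-th draw
}
s <- as.integer(runif(100, 86.54e6, 86.98e6))
r <- sapply(s, seed_draw)
plot(s, r, pch = ".")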

(Isn't there a Knuth quote to the effect that "Every random number generator 
will fail in at least one application"?)

One remaining issue is whether it is really true that the same seeds give
different output on different platforms. That shouldn't happen, I believe.


> Duncan Murdoch
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] Memory address of character datatype

2017-11-02 Thread peter dalgaard
I'm not really disagreeing with this, but is not the point of pryr to let you 
investigate internals from the R level? 

Building code that relies on pryr returning things with specific properties is 
very likely doubleplusunrecommended by pryr's author as well.

In that spirit, I suppose that you could reasonably wish for features that 
would let you peek at memory locations and follow pointers around, etc. As long 
as you don't poke() anything into memory, you are not likely to break anything. 
(Hmm, unless you try printing non-objects and suchlike...) Of course users 
should be aware that any change to R internals may invalidate previously 
working code (e.g., by changing the "24" in the OP's example).  I don't see any 
such functionality in pryr though.
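
For example (assuming pryr is installed), even a cached address is fragile,
because an ordinary assignment can trigger a fresh allocation:

library(pryr)
x <- c("abc", "def")
address(x)       # some address
x[3] <- "ghi"    # growing the vector forces a new allocation...
address(x)       # ...so the previously cached address is stale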

-pd

> On 2 Nov 2017, at 10:08 , Tomas Kalibera  wrote:
> 
> If you were curious about the hidden details of the memory layout in R, the 
> best reference is the source code. In your example, you are not getting to 
> your string because there is one more pointer in the way, "x" is a vector of 
> strings, each string is represented by a pointer.
> 
> At C level, there is an API for getting an address of the value, e.g. 
> INTEGER(x) or CHAR(STRING_ELT(x)).
> At R level, there is no such API.
> 
> You should never bypass these APIs.  The restrictions of the APIs allow us to 
> change details of the memory layout between svn versions or even as the 
> program executes (altrep), in order to save memory or improve performance. 
> Also, it means that the layout can be slightly different between platforms, 
> e.g. 32-bit vs 64-bit.
> 
> Unfortunately address(x) from pryr bypasses the APIs - you should never use 
> address(x) in your programs and I wish address(x) did not exist. If you had a 
> concrete problem at hand you wanted to solve with "address(x)", feel free to 
> ask for a viable solution.
> 
> Best
> Tomas
> 
> 
> 
> 
> On 11/01/2017 07:37 PM, lille stor wrote:
>> Hi,
>>  To get the memory address of where the value of variable "x" (of datatype 
>> "numeric") is stored one does the following in R (in 32 bit):
>> library(pryr)
>>   x <- 1024
>>   addr <- as.numeric(address(x)) + 24   # 24 is needed to jump the
>> variable info and point to the data itself (i.e. 1024)
>>  The question now is what is the value of the jump so that one can obtain 
>> the memory address of where the value of variable "x" (of datatype 
>> "character"):
>>  
>>   library(pryr)
>>   x <- "abc"
>>   addr <- as.numeric(address(x)) + ??   # what should be the value of
>> the jump so that it points to the data of variable "x" (i.e. abc)?
>>  Thank you in advance!
>> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] Discourage the weights= option of lm with summarized data

2017-10-09 Thread peter dalgaard
AFAIR, it is a little more subtle than that. 

If you have replication weights, then the estimates are right, it is "just" 
that the SE from summary.lm() are wrong. Somehow, the text should reflect this.

It is of some importance when you put glm() into the mix, because you can in 
fact get correct results from things like

y <- c(0,1)
w <- c(49,51)
glm(y~1, weights=w, family=binomial)
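
For comparison, the same fit in the aggregated (successes, failures) form
gives the identical intercept, log(51/49):

glm(cbind(51, 49) ~ 1, family = binomial)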

-pd
 


> On 9 Oct 2017, at 07:58 , Arie ten Cate  wrote:
> 
> Yes.  Thank you; I should have quoted it.
> I suggest to remove this text or to add the word "not" at the beginning.
> 
>   Arie
> 
> On Sun, Oct 8, 2017 at 4:38 PM, Viechtbauer Wolfgang (SP)
>  wrote:
>> Ah, I think you are referring to this part from ?lm:
>> 
>> "(including the case that there are w_i observations equal to y_i and the 
>> data have been summarized)"
>> 
>> I see; indeed, I don't think this is what 'weights' should be used for (the 
>> other part before that is correct). Sorry, I misunderstood the point you 
>> were trying to make.
>> 
>> Best,
>> Wolfgang
>> 
>> -Original Message-
>> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Arie ten 
>> Cate
>> Sent: Sunday, 08 October, 2017 14:55
>> To: r-devel@r-project.org
>> Subject: [Rd] Discourage the weights= option of lm with summarized data
>> 
>> Indeed: Using 'weights' is not meant to indicate that the same
>> observation is repeated 'n' times.  As I showed, this gives erroneous
>> results. Hence I suggested that it is discouraged rather than
>> encouraged in the Details section of lm in the Reference manual.
>> 
>>   Arie
>> 
>> ---Original Message-
>> On Sat, 7 Oct 2017, wolfgang.viechtba...@maastrichtuniversity.nl wrote:
>> 
>> Using 'weights' is not meant to indicate that the same observation is
>> repeated 'n' times. It is meant to indicate different variances (or to
>> be precise, that the variance of the last observation in 'x' is
>> sigma^2 / n, while the first three observations have variance
>> sigma^2).
>> 
>> Best,
>> Wolfgang
>> 
>> -Original Message-
>> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Arie ten 
>> Cate
>> Sent: Saturday, 07 October, 2017 9:36
>> To: r-devel@r-project.org
>> Subject: [Rd] Discourage the weights= option of lm with summarized data
>> 
>> In the Details section of lm (linear models) in the Reference manual,
>> it is suggested to use the weights= option for summarized data. This
>> must be discouraged rather than encouraged. The motivation for this is
>> as follows.
>> 
>> With summarized data the standard errors get smaller with increasing
>> numbers of observations. However, the standard errors in lm do not get
>> smaller when for instance all weights are multiplied with the same
>> constant larger than one, since the inverse weights are merely
>> proportional to the error variances.
>> 
>> Here is an example of the estimated standard errors being too large
>> with the weights= option. The p value and the number of degrees of
>> freedom are also wrong. The parameter estimates are correct.
>> 
>>  n <- 10
>>  x <- c(1,2,3,4)
>>  y <- c(1,2,5,4)
>>  w <- c(1,1,1,n)
>>  xb <- c(x,rep(x[4],n-1))  # restore the original data
>>  yb <- c(y,rep(y[4],n-1))
>>  print(summary(lm(yb ~ xb)))
>>  print(summary(lm(y ~ x, weights=w)))
>> 
>> Compare with PROC REG in SAS, with a WEIGHT statement (like R) and a
>> FREQ statement (for summarized data).
>> 
>>Arie
>> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] possible bug in R CMD Rd2pdf

2017-09-27 Thread peter dalgaard
If it is looking for ./DESCRIPTION, perhaps it matters which directory you 
invoke it from?

-pd

> On 27 Sep 2017, at 04:27 , Kasper Daniel Hansen 
>  wrote:
> 
> When I include the macros \packageAuthor, \packageDescription,
> \packageTitle, \packageMaintainer in a XX-package.Rd file, R CMD Rd2pdf
> fails with
> 
> $ R CMD Rd2pdf mpra
> Hmm ... looks like a package
> Converting Rd files to LaTeX Error : mpra/man/mpra-package.Rd:6: file
> './DESCRIPTION' does not exist
> 
> This does not happen if I comment out 4 occurrences of these 4 macros in
> mpra-package.Rd.
> 
> This is with
> 
> R Under development (unstable) (2017-09-26 r73351) -- "Unsuffered
> Consequences"
> Copyright (C) 2017 The R Foundation for Statistical Computing
> Platform: x86_64-apple-darwin16.7.0 (64-bit)
> 
> or
> 
> R version 3.4.2 RC (2017-09-26 r73351) -- "Short Summer"
> Copyright (C) 2017 The R Foundation for Statistical Computing
> Platform: x86_64-apple-darwin16.7.0 (64-bit)
> 
> and MacTex 2017.
> 
> Best,
> Kasper
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] Incorrect Import by Data for CSV File

2017-09-25 Thread peter dalgaard

> On 25 Sep 2017, at 14:27 , Prof Brian Ripley  wrote:
> 
> On 25/09/2017 08:00, Dario Strbenac wrote:
>> Good day,
>> The data function can import a variety of file formats, one of them being 
>> CSV.
> 
> That isn't its documented purpose.  It was the original way for packages to 
> provide datasets as needed (before lazy data was added).
> 
> Problematically, all of the table columns are collapsed into a single data 
> frame column. This occurs because "files ending .csv or .CSV are read using 
> read.table(..., header = TRUE, sep = ";", as.is=FALSE)". I suggest that the 
> semi-colon used as the column separator be changed to a comma.
> 
> We suggest you read the documentation ... the (non-English-locales) version 
> with a semicolon separator is one of four documented formats, and the 
> English-language one is not.  Even if it were desirable it would not be 
> possible to make a backwards-incompatible change after almost 20 years.
> 
> It really isn't clear why anyone would want to use anything other than the 
> second option (.rda) for data() unless other manipulations are needed (e.g. 
> to attach a package).  But that option was not part of the original 
> implementation.
> 

It can be handy to have raw ascii data included in a package for people to see, 
but then you can use the .R mechanism to read the data. It is done for a couple 
of cases in the ISwR package, see e.g. the stroke.R and stroke.csv pair. This 
also allows you to fix up other things that you have no chance of specifying
directly in the file:

stroke <- read.csv2("stroke.csv", na.strings=".")
names(stroke) <- tolower(names(stroke))
stroke <- within(stroke, {
    sex <- factor(sex, levels=0:1, labels=c("Female","Male"))
    dgn <- factor(dgn)
    coma <- factor(coma, levels=0:1, labels=c("No","Yes"))
    minf <- factor(minf, levels=0:1, labels=c("No","Yes"))
    diab <- factor(diab, levels=0:1, labels=c("No","Yes"))
    han <- factor(han, levels=0:1, labels=c("No","Yes"))
    died <- as.Date(died, format="%d.%m.%Y")
    end <- pmin(died, as.Date("1996-01-01"), na.rm=TRUE)
    dstr <- as.Date(dstr, format="%d.%m.%Y")
    obsmonths <- as.numeric(end - dstr, "days")/30.6
    obsmonths[obsmonths == 0] <- 0.1
    dead <- !is.na(died) & died < as.Date("1996-01-01")
    died[!dead] <- NA
    rm(end)
})
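
With the stroke.R/stroke.csv pair in the package's data directory, data()
then sources the .R file, so users get the fixed-up data frame directly
(sketched here for ISwR):

library(ISwR)
data(stroke)    # runs data/stroke.R, which reads stroke.csv and recodes it
head(stroke)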


-pd


> -- 
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Emeritus Professor of Applied Statistics, University of Oxford
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] R-devel r73293 and the testthat package

2017-09-18 Thread peter dalgaard

> On 18 Sep 2017, at 00:25 , Will Landau  wrote:
> 
> Hello,
> 
> 
> Windows R-devel no longer lets me use testthat even though the CRAN checks 
> are pretty much clean. I have copied my session output below. 
> 
> Will


Well, there is a reason for the nickname of R-devel:

> R Under development (unstable) (2017-09-16 r73293) -- "Unsuffered 
> Consequences"

Re-stabilizing after the ALTREP updates took longer than expected, so some 
packages may not be updated yet. I suspect you should just wait and see if it 
comes around by itself.

-pd

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] Natural vs National in R signon banner?

2017-09-01 Thread Peter Dalgaard
Just leave it, I think. Some nations have 4 national languages (as Martin will 
know), some languages are not national, and adopted children often do not speak 
their native (=born) language... I suspect someone already put a substantial 
amount of thought into the terminology.

-pd


> On 1 Sep 2017, at 09:45 , Martin Maechler  wrote:
> 
>>>>>> Paul McQuesten 
>>>>>>on Thu, 31 Aug 2017 18:48:12 -0500 writes:
> 
>> Actually, I do agree with you about Microsoft.
>> But they have so many users that their terminology should not be ignored.
> 
>> Here are a few more views:
> 
>> https://www.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.performance/natl_lang_supp_locale_speed.htm
>> https://docs.oracle.com/cd/E23824_01/html/E26033/glmbx.html
>> http://support.sas.com/documentation/cdl/en/nlsref/69741/HTML/default/viewer.htm#n1n9bwctsthuqbn1xgipyw5xwujl.htm
>> https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=GSA_config_nls
>> https://sites.ualberta.ca/dept/chemeng/AIX-43/share/man/info/C/a_doc_lib/aixprggd/genprogc/nls.htm
>> http://scc.ustc.edu.cn/zlsc/tc4600/intel/2017.0.098/compiler_f/common/core/GUID-1AEC889E-98A7-4A7D-91B3-865C476F603D.html
> 
>> It does appear, however, that what I call 'National Language' is often
>> referred to as 'Native Language'. And the 'National Language' terminology
>> is said to not be used consistently:
>> https://en.wikipedia.org/wiki/National_language
> 
>> I do still feel, however, that claiming 'Natural Language' support in R
>> sets expectations of new users overly high.
> 
>> Thank you for spending so much time on such a minor nit.
> 
> continuing the nits and gnats :
> 
> I think I now understand what you mean.  From the little I
> understand about English intricacies and with my not
> fully developed gut feeling of good English (which I rarely
> speak but sometimes appreciate when reading / listening),
> I would indeed
> 
> prefer  'Native Language'
> to      'Natural Language'
> 
> Martin Maechler
> ETH Zurich
> 
>> Regards
> 
> 
> 
>> On Thu, Aug 31, 2017 at 5:45 PM, Duncan Murdoch 
>> wrote:
> 
>>> On 31/08/2017 6:37 PM, Paul McQuesten wrote:
>>> 
>>>> Thanks, Duncan. But if it is not inappropriate, I feel empowered to argue.
>>>> 
>>>> According to this definition, https://en.wikipedia.org/wiki/
>>>> Natural_language:
>>>> In neuropsychology, linguistics and the philosophy of language, a
>>>> natural language or ordinary language is any language that has evolved
>>>> naturally in humans ...
>>>> 
>>>> Thus this banner statement may appear over-claiming to a significant
>>>> fraction of R users.
>>>> 
>>>> It seems that LOCALE is called 'National language' support in other
>>>> software systems.
>>>> Eg: https://www.microsoft.com/resources/msdn/goglobal/default.mspx
>>>> 
>>> 
>>> I wouldn't take Microsoft as an authority on this (or much of anything).
>>> They really are amazingly incompetent, considering how much money they earn.
>>> 
>>> Duncan Murdoch
>>> 
>>> 
>>>> And, yes, this is a low priority issue. All of you have better things to
>>>> do.
>>>> 
>>>> R is an extremely powerful and comprehensive software system.
>>>> Thank you all for that.
>>>> And I would like to clean one gnat from the windshield.
>>>> 
>>>> I just wax pedantic at times.
>>>> 
>>>> On Thu, Aug 31, 2017 at 5:13 PM, Duncan Murdoch wrote:
>>>> 
>>>> On 31/08/2017 5:38 PM, Paul McQuesten wrote:
>>>> 
>>>> The R signon banner includes this statement:
>>>> Natural language support but running in an English locale
>>>> 
>>>> Should that not say 'National' instead of 'Natural'?
>>>> Meaning that LOCALE support is enabled, not that the interface
>>>> understands
>>>> human language?
>>>> 
>>>> 
>>>> No, "natural language" refers to human languages, but it doesn't
>>>> imply that R understands them.  NLS just means that messages may be
>>>> presented in (or translated to) other human languages in an
>>>> appropriate context.
>>>> 
>>>> For example, you can start R on most platform

[Rd] R 3.4.2 scheduled for September 28

2017-08-31 Thread Peter Dalgaard
Full schedule available on developer.r-project.org (pending auto-update from 
SVN)

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] Are r2dtable and C_r2dtable behaving correctly?

2017-08-25 Thread peter dalgaard

> On 25 Aug 2017, at 12:04 , Peter Dalgaard  wrote:
> 
>> There are three possible matrices, and these come out in proportions 1:4:1, 
>> the one with all cells filled with ones being
>> most common.
> 
> ... and
> 
>> dhyper(0:2,2,2,2)
> [1] 0.167 0.667 0.167
>> dhyper(0:2,2,2,2) *6
> [1] 1 4 1
> 
> so that is exactly what you would expect.

And, incidentally, this is the "statistician's socks" puzzle from introductory 
probability: 

A statistician has 2 white socks and 2 black socks. Late for work on a dark
November morning, he puts on two socks at random. What is the probability that
he goes to work with different-colored socks?
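
The answer drops out of the same hypergeometric: the middle table is exactly
"one sock of each color", so

dhyper(1, 2, 2, 2)   # 2/3, the probability of odd socks (the 4/6 above)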

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] Are r2dtable and C_r2dtable behaving correctly?

2017-08-25 Thread Peter Dalgaard
> [...]  16006   8827   4633   2050    865    340    116
>     64     65     66     67
>     38     13      7      1
>> 
> 
> For a  2x2  table, there's really only one degree of freedom,
> hence the above characterizes the full distribution for that
> case.
> 
> I would have expected to see all possible values in  0:100
> instead of such a "normal like" distribution with carrier only
> in [34, 67].
> 
> There are newer publications and maybe algorithms.
> So maybe the algorithm is "flawed by design" for really large
> total number of observations, rather than wrong
> Seems interesting ...
> 
> Martin Maechler
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] Are r2dtable and C_r2dtable behaving correctly?

2017-08-25 Thread Peter Dalgaard

> On 25 Aug 2017, at 10:30 , Martin Maechler  wrote:
> 
[...]
> https://stackoverflow.com/questions/37309276/r-r2dtable-contingency-tables-are-too-concentrated
> 
> 
>> set.seed(1); system.time(tabs <- r2dtable(1e6, c(100, 100), c(100, 100))); 
>> A11 <- vapply(tabs, function(x) x[1, 1], numeric(1))
>   user  system elapsed 
>  0.218   0.025   0.244 
>> table(A11)
> 
>     34     35     36     37     38     39     40     41     42     43
>      2     17     40    129    334    883   2026   4522   8766  15786
>     44     45     46     47     48     49     50     51     52     53
>  26850  42142  59535  78851  96217 107686 112438 108237  95761  78737
>     54     55     56     57     58     59     60     61     62     63
>  59732  41474  26939  16006   8827   4633   2050    865    340    116
>     64     65     66     67
>     38     13      7      1
>> 
> 
> For a  2x2  table, there's really only one degree of freedom,
> hence the above characterizes the full distribution for that
> case.
> 
> I would have expected to see all possible values in  0:100
> instead of such a "normal like" distribution with carrier only
> in [34, 67].

Hmm, am I missing a point here?

> round(dhyper(0:100,100,100,100)*1e6)
  [1]      0      0      0      0      0      0      0      0      0      0
 [11]      0      0      0      0      0      0      0      0      0      0
 [21]      0      0      0      0      0      0      0      0      0      0
 [31]      0      0      0      1      4     13     43    129    355    897
 [41]   2087   4469   8819  16045  26927  41700  59614  78694  95943 108050
 [51] 112416 108050  95943  78694  59614  41700  26927  16045   8819   4469
 [61]   2087    897    355    129     43     13      4      1      0      0
 [71]      0      0      0      0      0      0      0      0      0      0
 [81]      0      0      0      0      0      0      0      0      0      0
 [91]      0      0      0      0      0      0      0      0      0      0
[101]      0


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] configure.ac

2017-08-01 Thread peter dalgaard
If you check developer.r-project.org, you'll find links to the scripts that we 
use for building releases and pre-releases of R. These are usually run on a 
Mac, but shouldn't require much change for Linux. In particular, notice this 
lead-in in the prerelease script:

rm -rf BUILD-dist
mkdir BUILD-dist
cd R
aclocal -I m4
autoconf
cd ../BUILD-dist
etc

-pd

> On 1 Aug 2017, at 17:29 , Ramón Fallon  wrote:
> 
> Hi,
> 
> Just a quick mail to mention that I cannot generate a new configure script
> using autoconf or autoreconf. I had edited the configure.ac and thought ...
> "oh, that's my fault", but then I tried it on R-patched and R-3.4.1 without
> touching configure.ac and had the same problems.
> 
> The "building R packages" documentation seems to suggest that "autoconf"
> should take care of it, but I must be missing something as I expect it to
> be a common task.
> 
> I also tried explicit autohell (yes, I know) commands
> 1) autoreconf --force -v
> completes but invoking configure (with no options) gives
> 
> checking build system type... x86_64-pc-linux-gnu
> checking host system type... x86_64-pc-linux-gnu
> loading site script './config.site'
> loading build-specific script './config.site'
> ./configure: line 2982: syntax error near unexpected token `blas'
> ./configure: line 2982: `  withval=$with_blas; R_ARG_USE(blas)'
> 
OK.. there's a recipe that one can use, starting with:
> 
> libtoolize --force
> 
> but you get:
> 
A sequence typically found out there, starting with
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `tools'.
> libtoolize: linking file `tools/ltmain.sh'
> libtoolize: You should add the contents of the following files to
> `aclocal.m4':
> libtoolize:   `/usr/share/aclocal/libtool.m4'
> libtoolize:   `/usr/share/aclocal/ltoptions.m4'
> libtoolize:   `/usr/share/aclocal/ltversion.m4'
> libtoolize:   `/usr/share/aclocal/ltsugar.m4'
> libtoolize:   `/usr/share/aclocal/lt~obsolete.m4'
> libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to configure.ac and
> libtoolize: rerunning libtoolize, to keep the correct libtool macros
> in-tree.
> libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
> 
> 
> then:
> 
> aclocal
> autoheader
> automake --force-missing --add-missing
> 
> first two go OK, but the third gives
> 
configure.ac: no proper invocation of AM_INIT_AUTOMAKE was found.
> configure.ac: You should verify that configure.ac invokes AM_INIT_AUTOMAKE,
> configure.ac: that aclocal.m4 is present in the top-level directory,
> configure.ac: and that aclocal.m4 was recently regenerated (using aclocal).
> automake: no `Makefile.am' found for any configure output
> 
> then the following runs OK
> autoconf
> 
> but running configure gives the same BLAS error.
> 
But I'm fairly sure one shouldn't start looking at what's wrong with BLAS; rather,
> it's just the configure options not being read properly. The
> AM_INIT_AUTOMAKE
> issue definitely seems important.
> 
> Is there anything I'm missing?
> 
> Cheers and thanks in advance!
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [Rd] italic font on cairo devices in R 3.4

2017-07-23 Thread Peter Dalgaard

> On 12 Jul 2017, at 18:41 , Ilia Kats  wrote:
> 
> FYI, I now have a second confirmed Mac OS system (with R 3.4.1) where the 
> text is not rendered as italic.

Me three,

It affects cairo_pdf() only, plain pdf() renders the example just fine.

-pd

> 
> Ilia
> 
> 
> 
>  Original Message 
> Subject: Re: [Rd] italic font on cairo devices in R 3.4
> Date: 2017-07-12 05:48:24 +0200
> From: Paul Murrell
> To: ilia-kats, r-devel
>> Hi
>> 
>> Do you have the 'fonts-texgyre' (Debian) package installed?
>> If not, does installing that help?
>> 
>> Paul
>> 
>> On 07/07/17 20:30, Ilia Kats wrote:
>>> [cross-post from R-help]
>>> 
>>> Hi all,
>>> 
>>> I have the following problem: Since R 3.4.0, italic fonts rendered on Cairo 
>>> devices appear pixelated. Here's a minimal example:
>>> cairo_pdf('test.pdf')
>>> plot(1:10, ylab=expression(italic(test)))
>>> dev.off()
>>> 
>>> The same problem occurs with bolditalic, but not bold. I am using Debian 
>>> Stretch. Several friends tried the same on their machines, another Debian 
>>> machine has the same problem. On MacOSX the output was not pixelated, but 
>>> it wasn't italic either. Ubuntu 16.04.2 xenial works fine. My impression is 
>>> that R somehow can't find the proper font to use and falls back to 
>>> something weird. Ideas?
>>> 
>>> Note that I'm not subscribed to the list, so please CC me in replies.
>>> 
>>> Cheers, Ilia
>>> 
>> 
> 
> -- 
> Linux - Where do you want to fly today?
> -- Unknown source
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] [patch] ?confint: "assumes asymptotic normality"

2017-07-20 Thread peter dalgaard

> On 20 Jul 2017, at 19:46 , Scott Kostyshak  wrote:
> 
> On Thu, Jul 20, 2017 at 04:21:04PM +0200, Martin Maechler wrote:
>>>>>>> Scott Kostyshak 
>>>>>>>on Thu, 20 Jul 2017 03:28:37 -0400 writes:
>> 
>>>> From ?confint:
>>> "Computes confidence intervals" and "The default method assumes
>>> asymptotic normality"
>> 
>>> For me, a "confidence interval" implies an exact confidence interval in
>>> formal statistics (I concede that when speaking, the term is often used
>>> more loosely). And of course, even if a test statistic is asymptotically
>>> normal (so the assumption is satisfied), the finite distribution might
>>> not be normal and thus an exact confidence interval would not be
>>> computed.
>> 
>>> Attached is a patch that simply changes "asymptotic normality" to
>>> "normality" in confint.Rd. This encourages the user of the function to
>>> think about whether their asymptotically normal statistic is "normal
>>> enough" in a finite sample to get something reliable from confint().
>> 
>>> Alternatively, we could instead change "Computes confidence intervals"
>>> to "Computes asymptotic confidence intervals".
>> 
>>> I hope I'm not being too pedantic here.
>> 
>> well, it's just at the 97.5% border line of "too pedantic"  ...
> 
> :)
> 
>> ;-)
>> 
>> I think you are right with your first proposal to drop
>> "asymptotic" here.  After all, there's the explict 'fac <- qnorm(a)'.
> 
> Note that I received a private email that my message was indeed too
> pedantic and expressed disagreement with the proposal. I'm not sure if
> they intended it to be private so I will respond in private and see if
> they feel like bringing the discussion on the list. Or perhaps this
> minor (and perhaps controversial?) issue is not worth any additional
> time.

At any rate, it is important not to let the pedantry cause the text to become 
misleading. If you just write "assumes normality", readers may consider the 
procedure to be simply wrong when the estimator (or worse: the original data) 
is not normally distributed. And "computes asymptotic c.i." is just wrong, 
because they are sometimes exact. 

It may be necessary to spell things out more extensively. Something like "the 
default method assumes normality and that the s.e. is known. For asymptotically 
normally distributed estimators, it yields an asymptotic confidence interval."

-pd
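
For concreteness, the default method essentially does the following (a sketch 
assuming only working coef() and vcov() methods; the real confint.default also 
handles the 'parm' argument and prettier percentage labels):

confint_default_sketch <- function(object, level = 0.95)
{
    a <- (1 - level) / 2
    fac <- qnorm(c(a, 1 - a))        # the normal quantiles referred to below
    cf <- coef(object)
    ses <- sqrt(diag(vcov(object)))  # standard errors from vcov()
    cf + ses %o% fac                 # estimate +/- qnorm * s.e.
}
confint_default_sketch(lm(dist ~ speed, data = cars))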

  

> 
>> One could consider to make  'qnorm' an argument of the
>> default method to allow more general distributional assumptions,
>> but it may be wiser to have useRs write their own
>> confint.() method, notably for cases where
>> diag(vcov(object)) is an efficiency waste...
> 
> Thanks for your comments,
> 
> Scott
> 
>> Martin
>> 
>> 
>>> Scott
>> 
>> 
>>> -- 
>>> Scott Kostyshak
>>> Assistant Professor of Economics
>>> University of Florida
>>> https://people.clas.ufl.edu/skostyshak/
>> 
>> 
>>> --
>>> Index: src/library/stats/man/confint.Rd
>>> ===
>>> --- src/library/stats/man/confint.Rd(revision 72930)
>>> +++ src/library/stats/man/confint.Rd(working copy)
>>> @@ -31,7 +31,7 @@
>>> }
>>> \details{
>>> \code{confint} is a generic function.  The default method assumes
>>> -  asymptotic normality, and needs suitable \code{\link{coef}} and
>>> +  normality, and needs suitable \code{\link{coef}} and
>>> \code{\link{vcov}} methods to be available.  The default method can be
>>> called directly for comparison with other methods.
>> 
>> 
>>> --
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] make check-recommended hanging on up-to-date Rdevel from SVN

2017-07-07 Thread peter dalgaard

> On 6 Jul 2017, at 23:26 , Gabriel Becker  wrote:
> 
> Hi all,
> 
> I'm getting an issue with Rdevel where make check-recommended hangs
> consistently for me on Mac El Capitan when checking the Matrix package. I
> did svn update and tools/rsync_recommended earlier today and it didn't fix
> the issue.
> 
> Specifically, it is hanging on the
> 
> * checking dependencies in R code ...

Not happening for me on Sierra, it seems. However, I have a minimal source 
install of R-devel, so that

* checking package dependencies ... NOTE
Package suggested but not available for checking: ‘expm’

Packages which this enhances but not available for checking:
  ‘MatrixModels’ ‘graph’ ‘SparseM’ ‘sfsmisc’

... and it is conceivable that the issue is with one of those(?)

-pd

> 
> 
> stage (while checking Matrix, it passes fine for MASS and lattice).
> Currently I'm getting the same behavior when I do
> 
> tools:::.check_packages("/tests/RecPackages/Matrix")
> 
> 
> Is this a known issue and if not, do other people see this behavior?
> 
> Best,
> ~G
> 
> 
> Various info:
> 
> Build script:
> 
> #!/bin/bash
> 
> export CC="gcc -std=gnu99 -fsanitize=address"
> # make -O0 if we want full debugging symbols
> export CFLAGS="-fno-omit-frame-pointer -g -O2 -Wall -pedantic -mtune=native"
> #export CFLAGS="-fno-omit-frame-pointer -g -O0 -Wall -pedantic
> -mtune=native"
> 
> export CXX="g++ -fsanitize=address -fno-omit-frame-pointer"
> export F77="gfortran -arch x86_64"
> export FC="gfortran -arch x86_64"
> export MAIN_LDFLAGS=" -fsanitize=address"
> 
> ../checkedout/Rsource/rawtrunk/configure
> --prefix=/Users/beckerg4/local/Rdevel --enable-R-framework
> --enable-memory-profiling
> make -j3
> make install
> 
> Svn Info:
> 
> beckerg4-T4G3QN:rawtrunk beckerg4$ *svn info*
> 
> Path: .
> 
> Working Copy Root Path: /Users/beckerg4/gabe/checkedout/Rsource/rawtrunk
> 
> URL: https://svn.r-project.org/R/trunk
> 
> Relative URL: ^/trunk
> 
> Repository Root: https://svn.r-project.org/R
> 
> Repository UUID: 00db46b3-68df-0310-9c12-caf00c1e9a41
> 
> Revision: 72894
> 
> Node Kind: directory
> 
> Schedule: normal
> 
> Last Changed Author: lawrence
> 
> Last Changed Rev: 72894
> 
> Last Changed Date: 2017-07-06 00:12:06 -0700 (Thu, 06 Jul 2017)
> 
> 
> 
> Svn status (no local changes)
> 
> beckerg4-T4G3QN:rawtrunk beckerg4$* svn status*
> 
> beckerg4-T4G3QN:rawtrunk beckerg4$
> 
> 
> Session info after Matrix is attached:
> 
> *> library(Matrix)*
> 
> *> sessionInfo()*
> 
> R Under development (unstable) (2017-07-06 r72894)
> 
> Platform: x86_64-apple-darwin15.6.0 (64-bit)
> 
> Running under: OS X El Capitan 10.11.6
> 
> 
> Matrix products: default
> 
> BLAS: /Users/beckerg4/gabe/Rdevelbuild/lib/libRblas.dylib
> 
> LAPACK: /Users/beckerg4/gabe/Rdevelbuild/lib/libRlapack.dylib
> 
> 
> locale:
> 
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> 
> 
> attached base packages:
> 
> [1] stats graphics  grDevices utils datasets  methods   base
> 
> 
> other attached packages:
> 
> [1] Matrix_1.2-10
> 
> 
> loaded via a namespace (and not attached):
> 
> [1] compiler_3.5.0  grid_3.5.0  lattice_0.20-35
> 
> 
> 
> -- 
> Gabriel Becker, PhD
> Associate Scientist (Bioinformatics)
> Genentech Research
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Odd behaviour in within.list() when deleting 2+ variables

2017-06-26 Thread peter dalgaard

> On 26 Jun 2017, at 21:56 , Martin Maechler  wrote:
> 
> 
> Indeed, the fix I've committed reverts almost to the previous
> first version of  within.data.frame  (which is from Peter
> Dalgaard, for those who don't know).
> 

Great foresight on my part there, eh? ;-)

-p

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Odd behaviour in within.list() when deleting 2+ variables

2017-06-26 Thread Peter Dalgaard

> On 26 Jun 2017, at 19:04 , Martin Maechler  wrote:
> 
>>>>>> peter dalgaard 
>>>>>>on Mon, 26 Jun 2017 13:43:28 +0200 writes:
> 
>> This seems to be due to changes made by Martin Maechler in
>> 2008. Presumably this fixed something, but it escapes my
>> memory.
> 
> Yes: The change set (svn -c46441) also contains the following NEWS entry
> 
> BUG FIXES
> 
> o within(, { ... }) now also works when '...' removes
>   more than one column.
> 

The odd thing is that the assign-NULL technique used for removing a single 
column, NOW also seems to work for several columns in a data frame, so I wonder 
what the bug was back then...

-pd


> 
>> However, it seems to have broken the equivalence
>> between within.list and within.data.frame, so now
> 
>>  within.list <- within.data.frame
> 
>> does not suffice.
> 
> There have been many improvements since then, so maybe we can
> change the code so that the above will work again.
> 
> Another problem seems that we had no tests of  within.list()
> anywhere... so we will have them now.
> 
> I've had an idea that seems to work and even simplify the
> code  will get back to the issue later in the evening.
> 
> Martin
> 
> 
>> The crux of the matter seems to be that both the following
>> constructions work for data frames
> 
>>> aq <- head(airquality)
>>> names(aq)
>> [1] "Ozone"   "Solar.R" "Wind""Temp""Month"   "Day"
>>> aq[c("Wind","Temp")] <- NULL
>>> aq
>> Ozone Solar.R Month Day
>> 141 190 5   1
>> 236 118 5   2
>> 312 149 5   3
>> 418 313 5   4
>> 5NA  NA 5   5
>> 628  NA 5   6
>>> aq <- head(airquality)
>>> aq[c("Wind","Temp")] <- vector("list",2)
>>> aq
>> Ozone Solar.R Month Day
>> 141 190 5   1
>> 236 118 5   2
>> 312 149 5   3
>> 418 313 5   4
>> 5NA  NA 5   5
>> 628  NA 5   6
> 
>> However, for lists they differ:
> 
>>> aq <- as.list(head(airquality))
>>> aq[c("Wind","Temp")] <- vector("list",2)
>>> aq
>> $Ozone
>> [1] 41 36 12 18 NA 28
> 
>> $Solar.R
>> [1] 190 118 149 313  NA  NA
> 
>> $Wind
>> NULL
> 
>> $Temp
>> NULL
> 
>> $Month
>> [1] 5 5 5 5 5 5
> 
>> $Day
>> [1] 1 2 3 4 5 6
> 
>>> aq <- as.list(head(airquality))
>>> aq[c("Wind","Temp")] <- NULL
>>> aq
>> $Ozone
>> [1] 41 36 12 18 NA 28
> 
>> $Solar.R
>> [1] 190 118 149 313  NA  NA
> 
>> $Month
>> [1] 5 5 5 5 5 5
> 
>> $Day
>> [1] 1 2 3 4 5 6
> 
> 
>> -pd
> 
>>> On 26 Jun 2017, at 04:40 , Hong Ooi via R-devel  
>>> wrote:
>>> 
>>> The behaviour of within() with list input changes if you delete 2 or more 
>>> variables, compared to deleting one:
>>> 
>>> l <- list(x=1, y=2, z=3)
>>> 
>>> within(l,
>>> {
>>> rm(z)
>>> })
>>> #$x
>>> #[1] 1
>>> #
>>> #$y
>>> #[1] 2
>>> 
>>> 
>>> within(l, {
>>> rm(y)
>>> rm(z)
>>> })
>>> #$x
>>> #[1] 1
>>> #
>>> #$y
>>> #NULL
>>> #
>>> #$z
>>> #NULL
>>> 
>>> 
>>> When 2 or more variables are deleted, the list entries are instead set to 
>>> NULL. Is this intended?
>>> 
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
>> -- 
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Office: A 4.23
>> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> 
> 
> 
> 
> 
> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Odd behaviour in within.list() when deleting 2+ variables

2017-06-26 Thread peter dalgaard
This seems to be due to changes made by Martin Maechler in 2008. Presumably 
this fixed something, but it escapes my memory.

However, it seems to have broken the equivalence between within.list and 
within.data.frame, so now

within.list <- within.data.frame

does not suffice.

The crux of the matter seems to be that both the following constructions work 
for data frames

> aq <- head(airquality)
> names(aq)
[1] "Ozone"   "Solar.R" "Wind""Temp""Month"   "Day"
> aq[c("Wind","Temp")] <- NULL
> aq
  Ozone Solar.R Month Day
141 190 5   1
236 118 5   2
312 149 5   3
418 313 5   4
5NA  NA 5   5
628  NA 5   6
> aq <- head(airquality)
> aq[c("Wind","Temp")] <- vector("list",2)
> aq
  Ozone Solar.R Month Day
141 190 5   1
236 118 5   2
312 149 5   3
418 313 5   4
5NA  NA 5   5
628  NA 5   6

However, for lists they differ:

> aq <- as.list(head(airquality))
> aq[c("Wind","Temp")] <- vector("list",2)
> aq
$Ozone
[1] 41 36 12 18 NA 28

$Solar.R
[1] 190 118 149 313  NA  NA

$Wind
NULL

$Temp
NULL

$Month
[1] 5 5 5 5 5 5

$Day
[1] 1 2 3 4 5 6

> aq <- as.list(head(airquality))
> aq[c("Wind","Temp")] <- NULL
> aq
$Ozone
[1] 41 36 12 18 NA 28

$Solar.R
[1] 190 118 149 313  NA  NA

$Month
[1] 5 5 5 5 5 5

$Day
[1] 1 2 3 4 5 6


-pd
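
For the record, one way to restore the equivalence is to drop the removed 
variables explicitly. A minimal sketch (not the committed fix, just an 
illustration of the data[del] <- NULL idiom from the examples above):

within.list2 <- function(data, expr, ...)
{
    parent <- parent.frame()
    e <- evalq(environment(), data, parent)
    eval(substitute(expr), e)
    l <- as.list(e)
    del <- setdiff(names(data), names(l))  # variables rm()'ed inside expr
    data[names(l)] <- l                    # update the surviving components
    data[del] <- NULL                      # genuinely drop the deleted ones
    data
}
within.list2(list(x = 1, y = 2, z = 3), { rm(y); rm(z) })  # only $x remains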

> On 26 Jun 2017, at 04:40 , Hong Ooi via R-devel  wrote:
> 
> The behaviour of within() with list input changes if you delete 2 or more 
> variables, compared to deleting one:
> 
> l <- list(x=1, y=2, z=3)
> 
> within(l,
> {
>rm(z)
> })
> #$x
> #[1] 1
> #
> #$y
> #[1] 2
> 
> 
> within(l, {
>rm(y)
>rm(z)
> })
> #$x
> #[1] 1
> #
> #$y
> #NULL
> #
> #$z
> #NULL
> 
> 
> When 2 or more variables are deleted, the list entries are instead set to 
> NULL. Is this intended?
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] duplicated factor labels.

2017-06-23 Thread peter dalgaard
Hmm, the danger in this is that duplicated factor levels _used_ to be allowed 
(i.e. multiple codes with the same level). Disallowing it is what broke 
read.spss() on some files, because SPSS's concept of value labels is not 1-to-1 
with factors. 

Reallowing it with different semantics could be premature. I mean, if we hadn't 
had the "forbidden" step, read.spss() could have changed behaviour unnoticed. 
So what if there is code relying on duplicate factor levels, which hasn't been 
run for some time?

-pd

> On 23 Jun 2017, at 10:42 , Martin Maechler  wrote:
> 
>>>>>> Martin Maechler 
>>>>>>on Thu, 22 Jun 2017 11:43:59 +0200 writes:
> 
>>>>>> Paul Johnson 
>>>>>>on Fri, 16 Jun 2017 11:02:34 -0500 writes:
> 
>>> On Fri, Jun 16, 2017 at 2:35 AM, Joris Meys  wrote:
>>>> To extwnd on Martin 's explanation :
>>>> 
>>>> In factor(), levels are the unique input values and labels the unique 
>>>> output
>>>> values. So the function levels() actually displays the labels.
>>>> 
> 
>>> Dear Joris
> 
>>> I think we agree. Currently, factor insists both levels and labels be 
>>> unique.
> 
>>> I wish that it would not accept nonunique labels. I also understand it
>>> is impractical to change this now in base R.
> 
>>> I don't think I succeeded in explaining why this would be nicer.
>>> Here's another example. Fairly often, we see input data like
> 
>>> x <- c("Male", "Man", "male", "Man", "Female")
> 
>>> The first four represent the same value.  I'd like to go in one step
>>> to a new factor variable with enumerated types "Male" and "Female".
>>> This fails
> 
>>> xf <- factor(x, levels = c("Male", "Man", "male", "Female"),
>>> labels = c("Male", "Male", "Male", "Female"))
> 
>>> Instead, we need 2 steps.
> 
>>> xf <- factor(x, levels = c("Male", "Man", "male", "Female"))
>>> levels(xf) <- c("Male", "Male", "Male", "Female")
> 
>>> I think it is quirky that `levels<-.factor` allows the duplicated
>>> labels, whereas factor does not.
> 
>>> I wrote a function rockchalk::combineLevels to simplify combining
>>> levels, but most of the students here like plyr::mapvalues to do it.
>>> The use of levels() can be tricky because one must enumerate all
>>> values, not just the ones being changed.
> 
>>> But I do understand Martin's point. Its been this way 25 years, it
>>> won't change. :).
> 
>> Well.. the above is a bit out of context.
> 
>> Your first example really did not make a point to me (and Joris)
>> and I showed that you could use even two different simple factor() calls to
>> produce what you wanted 
>> yc <- factor(c("1",NA,NA,"4","4","4"))
>> yn <- factor(c( 1, NA,NA, 4,  4,  4))
> 
>> Your new example is indeed  much more convincing !
> 
>> (Note though that the two steps that are needed can be written 
>> more shortly
> 
>> The  "been this way 25 years"  is one a reason to be very
>> cautious(*) with changes, but not a reason for no changes!
> 
>> (*) Indeed as some of you have noted we really should not "break behavior".
>> This means to me we cannot accept a change there which gives
>> an error or a different result in cases the old behavior gave a valid factor.
> 
>> I'm looking at a possible change currently
>> [not promising that a change will happen ...]
> 
> In the end, I've liked the change (after 2-3 iterations), and
> now been brave to commit to R-devel (svn 72845).
> 
> With the change, I had to disable one of our own regression
> checks (tests/reg-tests-1b.R, line 726):
> 
> The following is now (in R-devel -> R 3.5.0) valid:
> 
>> factor(1:2, labels = c("A","A"))
>   [1] A A
>   Levels: A
>> 
> 
> I wonder how many CRAN package checks will "break" from
> this (my guess is in the order of a dozen), but I hope
> that these breakages will be benign, e.g., similar to the above
> case where before an error was expected via tools :: assertError(.)
> 
> Martin
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines

2017-06-16 Thread peter dalgaard

> On 16 Jun 2017, at 21:17 , Duncan Murdoch  wrote:
> 
> paste0("this is the first part",
>"this is the second part")
> 
> If the rather insignificant amount of time it takes to execute this function 
> call really matters (and I'm not convinced of that), then shouldn't it be 
> solved by the compiler applying constant folding to paste0()?

And, of course, if it is equivalent to a literal, it can be precomputed. There 
is no point in having it in the middle of a tight loop. 
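
Nothing stops one from doing that folding by hand, of course; a trivial sketch:

msg <- paste0("this is the first part ",
              "this is the second part")           # evaluated once, up front
out <- character(3)
for (i in 1:3) out[i] <- sprintf("%d: %s", i, msg)  # msg reused, never re-pasted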

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R history: Why 'L; in suffix character ‘L’ for integer constants?

2017-06-16 Thread peter dalgaard
Wikipedia claims that C ints are still only guaranteed to be at least 16 bits, 
and longs are at least 32 bits. So no, R's integers are long.

-pd
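
Whatever the C-level naming, the width is easy to verify from R itself:

.Machine$integer.max  # 2147483647, i.e. 2^31 - 1: a 32-bit signed type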

> On 16 Jun 2017, at 20:20 , William Dunlap via R-devel  
> wrote:
> 
> But R "integers" are C "ints", as opposed to S "integers", which are C
> "long ints".  (I suppose R never had to run on ancient hardware with 16 bit
> ints.)
> 
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
> 
> On Fri, Jun 16, 2017 at 10:47 AM, Yihui Xie  wrote:
> 
>> Yeah, that was what I heard from our instructor when I was a graduate
>> student: L stands for Long (integer).
>> 
>> Regards,
>> Yihui
>> --
>> https://yihui.name
>> 
>> 
>> On Fri, Jun 16, 2017 at 11:00 AM, Serguei Sokol 
>> wrote:
>>> Le 16/06/2017 à 17:54, Henrik Bengtsson a écrit :
>>>> 
>>>> I'm just curious (no complaints), what was the reason for choosing the
>>>> letter 'L' as a suffix for integer constants?  Does it stand for
>>>> something (literal?), is it because it visually stands out, ..., or no
>>>> specific reason at all?
>>> 
>>> My guess is that it is inherited form C "long integer" type (contrary to
>>> "short integer" or simply "integer")
>>> https://en.wikipedia.org/wiki/C_data_types
>> 
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] 'ordered' destroyed to 'factor'

2017-06-16 Thread peter dalgaard

> On 16 Jun 2017, at 15:59 , Robert McGehee  wrote:
> 
> For instance, what would you expect to get from unlist() if each element of 
> the list had different levels, or were both ordered, but in a different way, 
> or if some elements of the list were factors and others were ordered factors?
>> unlist(list(ordered(c("a","b")), ordered(c("b","a"))))
> [1] ?

Those actually have the same levels in the same order: a < b

Possibly, this brings the point home more clearly

unlist(list(ordered(c("a","c")), ordered(c("b","d"))))

(Notice that alphabetical order is largely irrelevant, so all of these level 
orderings are equally possible:

a < c < b < d
a < b < c < d
a < b < d < c
b < a < c < d
b < a < d < c
b < d < a < c

).

-pd
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Unexpected interaction between missing() and a blank expression

2017-06-06 Thread peter dalgaard

> On 6 Jun 2017, at 18:50 , Hong Ooi via R-devel  wrote:
> 
> This is something I came across just now:
> 
> f <- function(x) missing(x)
> z <- quote(expr=)
> 
> f(z)
> # TRUE
> 
> The object z contains the equivalent of a missing function argument. Another 
> method for generating a missing arg would be alist(a=)$a .
> 
> Should f(z) return TRUE in this case? I interpret missing() as checking 
> whether the parent function call had a value supplied for the given argument. 
> Here, I have supplied an argument (z), so I would expect f to return FALSE.

Missing values propagate in R, e.g.

> f <- function(x) missing(x)
> g <- function(y) f(y)
> g()
[1] TRUE

This is technically done by having a "missing" object, which is not really 
intended to be visible to users, but pops up in a few esoteric constructions. 
Trying do anything constructive with the missing object usually leads to grief, 
or at least surprises, e.g.:

> z <-quote(expr=)
> z <- z
Error: argument "z" is missing, with no default

-pd
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] stats::line() does not produce correct Tukey line when n mod 6 is 2 or 3

2017-05-31 Thread peter dalgaard

> On 31 May 2017, at 16:40 , Joris Meys  wrote:
> 
> And with "equally spaced" I obviously meant "of equal size". It's getting
> too hot in the office here...

We have a fair amount of cool westerly wind up here that I could transfer to 
you via  WWTP (Wind and Weather Transport Protocol). If you open up a 
sufficiently large pipe, that is. 

Anyways, in the past we have tried to follow Tukey's instructions on details 
like the definition of the "hinges" on boxplots, so presumably we should try 
and do likewise for this case. 

I suspect that Tukey would say "divide the data into three roughly equal-sized 
groups" or some such. The obvious thing to do would be to allocate N %/% 3 to 
each group and then the N %% 3 remaining symmetrically and as evenly as 
possible, which in my book would rather be (1,0,1) than (0, 2, 0) for the case 
N %% 3 == 2. If  N %% 3 == 1, there is no alternative to (0, 1, 0) by this 
logic.
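
In code, that allocation could look like this (a hypothetical helper, not what 
line() currently implements):

tukey_sizes <- function(n) {
    base <- n %/% 3
    switch((n %% 3) + 1,
           c(base,     base,     base),      # n %% 3 == 0
           c(base,     base + 1, base),      # n %% 3 == 1: (0, 1, 0)
           c(base + 1, base,     base + 1))  # n %% 3 == 2: (1, 0, 1)
}
tukey_sizes(9)   # 3 3 3
tukey_sizes(11)  # 4 3 4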

> 
> On Wed, May 31, 2017 at 4:39 PM, Joris Meys  wrote:
> 
>> Seriously, if a method gives a wrong result, it's wrong. line() does NOT
>> implement the algorithm of Tukey, even not after the patch. We're not
>> discussing Excel here, are we?
>> 
>> The method of Tukey is rather clear, and it is NOT using the default
>> quantile definition from the quantile function. Actually, it doesn't even
>> use quantiles to define the groups. It just says that the groups should be
>> more or less equally spaced. As the method of Tukey relies on the medians
>> of the subgroups, it would make sense to pick a method that is
>> approximately unbiased with regard to the median. That would be type 8
>> imho.
>> 
>> To get the size of the outer groups, Tukey would've been more than happy
>> enough with a:
>> 
>>> floor(length(dfr$time) / 3)
>> [1] 6
>> 
>> There you have the size of your left and right group, and now we can
>> discuss about which median type should be used for the robust fitting.
>> 
>> But I can honestly not understand why anyone in his right mind would
>> defend a method that is clearly wrong while not working at Microsoft's
>> spreadsheet department.
>> 
>> Cheers
>> Joris
>> 
>> On Wed, May 31, 2017 at 4:03 PM, Serguei Sokol 
>> wrote:
>> 
>>> Le 31/05/2017 à 15:40, Joris Meys a écrit :
>>> 
>>>> OTOH,
>>>> 
>>>>> sapply(1:9, function(i){
>>>> +   sum(dfr$time <= quantile(dfr$time, 1./3., type = i))
>>>> + })
>>>> [1] 8 8 6 6 6 6 8 6 6
>>>> 
>>>> Only the default (type = 7) and the first two types give the result
>>>> lines() gives now. I think there is plenty of reasons to give why any of
>>>> the other 6 types might be better suited in Tukey's method.
>>>> 
>>>> So to my mind, chaning the definition of line() to give sensible output
>>>> that is in accordance with the theory, does not imply any inconsistency
>>>> with the quantile definition in R. At least not with 6 out of the 9
>>>> different ones ;-)
>>>> 
>>> Nice shot.
>>> But OTOE (on the other end ;)
>>>> sapply(1:9, function(i){
>>> +   sum(dfr$time >= quantile(dfr$time, 2./3., type = i))
>>> + })
>>> [1] 8 8 8 8 6 6 8 6 6
>>> 
>>> Here "8" gains 5 votes against 4 for "6". There were two defector methods
>>> that changed the point number and should be discarded. Which leaves us
>>> with the score 3:4, still in favor of "6" but the default method should
>>> prevail
>>> in my sens.
>>> 
>>> Serguei.
>>> 
>> 
>> 
>> 
>> --
>> Joris Meys
>> Statistical consultant
>> 
>> Ghent University
>> Faculty of Bioscience Engineering
>> Department of Mathematical Modelling, Statistics and Bio-Informatics
>> 
>> tel :  +32 (0)9 264 61 79 <+32%209%20264%2061%2079>
>> joris.m...@ugent.be
>> ---
>> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>> 
> 
> 
> 
> -- 
> Joris Meys
> Statistical consultant
> 
> Ghent University
> Faculty of Bioscience Engineering
> Department of Mathematical Modelling, Statistics and Bio-Informatics
> 
> tel :  +32 (0)9 264 61 79
> joris.m...@ugent.be
> ---
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] stats::line() does not produce correct Tukey line when n mod 6 is 2 or 3

2017-05-29 Thread peter dalgaard
A usually trustworthy R correspondent posted a pure R implementation on SO at 
some point in his lost youth:

https://stackoverflow.com/questions/3224731/john-tukey-median-median-or-resistant-line-statistical-test-for-r-and-line

This one does indeed generate the line of identity for the (1:9, 1:9) case, so 
I do suspect that we have a genuine scr*wup in line().

Notice, incidentally, that

> line(1:9+rnorm(9,,1e-1),1:9+rnorm(9,,1e-1))

Call:
line(1:9 + rnorm(9, , 0.1), 1:9 + rnorm(9, , 0.1))

Coefficients:
[1]  -0.9407   1.1948

I.e., it is not likely an issue with exact integers or perfect fit.

-pd
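
For reference, a minimal non-iterative sketch along the lines of that SO 
implementation (assuming equal-sized outer thirds; Tukey's full resistant line 
iterates the fit):

mmline <- function(x, y) {
    o <- order(x); x <- x[o]; y <- y[o]
    n <- length(x)
    k <- n %/% 3                      # size of the outer thirds
    L <- 1:k; R <- (n - k + 1):n      # left and right groups
    slope <- (median(y[R]) - median(y[L])) / (median(x[R]) - median(x[L]))
    c(intercept = median(y - slope * x), slope = slope)
}
mmline(1:9, 1:9)  # intercept 0, slope 1, as one would hope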



> On 29 May 2017, at 07:21 , GlenB  wrote:
> 
>> Tukey divides the points into three groups, not the x and y values
> separately.
> 
>> I'll try to get hold of the book for a direct quote, might take a couple
> of days.
> 
> Ah well, I can't get it for a week. But the fact that it's often called
> Tukey's three group line (try a search on *tukey three group line* and
> you'll get plenty of hits) is pretty much a giveaway.
> 
> 
> On Mon, May 29, 2017 at 2:19 PM, GlenB  wrote:
> 
>> Tukey divides the points into three groups, not the x and y values
>> separately.
>> 
>> I'll try to get hold of the book for a direct quote, might take a couple
>> of days.
>> 
>> 
>> 
>> On Mon, May 29, 2017 at 8:40 AM, Duncan Murdoch 
>> wrote:
>> 
>>> On 27/05/2017 9:28 PM, GlenB wrote:
>>> 
>>>> Bug: stats::line() does not produce correct Tukey line when n mod 6 is 2
>>>> or
>>>> 3
>>>> 
>>>> Example: line(1:9,1:9) should have intercept 0 and slope 1 but it gives
>>>> intercept -1 and slope 1.2
>>>> 
>>>> Trying line(1:i,1:i) across a range of i makes it clear there's a cycle
>>>> of
>>>> length 6, with four of every six correct.
>>>> 
>>>> Bug has been present across many versions.
>>>> 
>>>> The machine I just tried it on just now has R3.2.3:
>>>> 
>>> 
>>> If you look at the source (in src/library/stats/src/line.c), the
>>> explanation is clear:  the x value is chosen as the 1/6 quantile (according
>>> to a particular definition of quantile), and the y value is chosen as the
>>> median of the y values where x is less than or equal to the 1/3 quantile.
>>> Those are different definitions (though I think they would be
>>> asymptotically equivalent under pretty weak assumptions), so it's not
>>> surprising the x value doesn't correspond perfectly to the y value, and the
>>> line ends up "wrong".
>>> 
>>> So is it a bug?  Well, that depends on Tukey's definition.  I don't have
>>> a copy of his book handy so I can't really say.  Maybe the R function is
>>> doing exactly what Tukey said it should, and that's not a bug.  Or maybe R
>>> is wrong.
>>> 
>>> Duncan Murdoch
>>> 
>>> 
>> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] help pages base R not rendered correctly?

2017-05-23 Thread peter dalgaard

> On 23 May 2017, at 15:56 , Joris Meys  wrote:
> 
> Hi Duncan,
> 
> that explains, thank you. If nobody finds the time to fix that, I might
> give it a shot myself this summer. Barbeque is overrated.

I beg to differ! Chances of rain are underestimated, though (in .be as in .dk, 
I suspect). 

;-)

-pd

> 
> Cheers
> Joris
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [R] Somewhat obscure bug in R 3.4.0 building from source

2017-05-21 Thread Peter Dalgaard
Inline below...

> On 21 May 2017, at 20:57 , Duncan Murdoch  wrote:
> 
> On 21/05/2017 10:30 AM, Peter Carbonetto wrote:
>> Hi,
>> 
>> I uncovered a bug in installing R 3.4.0 from source in Linux, following the
>> standard procedure (configure; make; make install). Is this an appropriate
>> place to report this bug? If not, can you please direct me to the
>> appropriate place?
> 
> Generally R-devel is better; I've responded there.
> 
>> 
>> The error occurs only when I do "make clean" followed by "make" again; make
>> works the first time.
>> 
>> The error is a failure to build NEWS.pdf:
>> 
>> Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet = quiet,  :
>>  pdflatex is not available
>> Calls:  -> texi2pdf -> texi2dvi
>> Execution halted
>> make[1]: *** [NEWS.pdf] Error 1
>> make: [docs] Error 2 (ignored)
>> 
>> and can be reproduced wit the following sequence:
>> 
>> ./configure
>> make
>> make clean
>> make
> 
> We usually don't build in the source directory; see the second recommendation 
> in the admin manual section 2.1.  So it's possible there's a bug triggered 
> when you do that.  Can you try building in a separate directory?

Notice that the error is that "pdflatex" is missing from your setup. We do, for 
the benefit of users with defective TeX installations, supply a pre-built 
NEWS.pdf (and NEWS.html too) in the source tarballs. However, they are 
technically make targets and make clean will wipe them; in that case, you had 
better have the tools to rebuild them! 

-pd

> 
> Duncan Murdoch
> 
>> 
>> This suggests to me that perhaps "make clean" is not working.
>> 
>> I'm happy to provide more details so that you are able to reproduce the bug.
>> 
>> Thanks,
>> 
>> Peter Carbonetto, Ph.D.
>> Computational Staff Scientist, Statistics & Genetics
>> Research Computing Center
>> University of Chicago
>> 
>>  [[alternative HTML version deleted]]
>> 
>> __
>> r-h...@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [R] R-3.4.0 fails test

2017-05-18 Thread peter dalgaard

> On 18 May 2017, at 14:58 , Martyn Plummer  wrote:
> 
> 
> 
>> On 18 May 2017, at 14:51, peter dalgaard  wrote:
>> 
>> 
>>> On 18 May 2017, at 13:47 , Joris Meys  wrote:
>>> 
>>> Correction: Also dlt uses the default timezone, but POSIXlt is not 
>>> recalculated whereas POSIXct is. Reason for that is the different way 
>>> values are stored (hours, minutes, seconds as opposed to minutes from 
>>> origin, as explained in my previous mail)
>>> 
>> 
>> I would suspect that there is something more subtle going on, New Zealand 
>> time is 10, 11, or 12 hours from Central European, depending on time of year 
>> (10 in our Summer, 12 in theirs and 11 during the overlap at both ends, if 
>> you must know), and we are talking a 1 hour difference.  
>> 
>> However DST transitions were both in March/April, so that's not it. Maybe a 
>> POSIX[lc]t expert can comment?
> 
> If I change the month from December to June then I see the same phenomenon in 
> my Europe/Paris time zone. The issue seems to be that, for the date chosen 
> for the test, Summer/daylight savings time is in force in NZ and some other 
> parts of the southern hemisphere , but not in the northern hemisphere.
> 

Of course! I overlooked that the date in the test is the issue, not the current 
date. (Let's blame that on the fact that Summer seems to have finally arrived 
in Copenhagen...)

"svn praise" claims this test is due to Martin Maechler in r71742, so maybe he 
knows how to fix it. (I wonder if he just used the current date at the time, or 
actually thought that there would be no DST issues in December ;-) )

-pd


> Martyn
> 
>> -pd
>> 
>>> CHeers
>>> Joris
>>> 
>>> On Thu, May 18, 2017 at 1:45 PM, Joris Meys  wrote:
>>> This has to do with your own timezone. If I run that code on my computer, 
>>> both formats are correct. If I do this after 
>>> 
>>> Sys.setenv(TZ = "UTC")
>>> 
>>> Then:
>>> 
>>>> cbind(format(dlt), format(dct))
>>> [,1]  [,2] 
>>> [1,] "2016-12-06 21:45:41" "2016-12-06 20:45:41"
>>> [2,] "2016-12-06 21:45:42" "2016-12-06 20:45:42"
>>> 
>>> The reason for that, is that dlt has a timezone set, but dct doesn't. To be 
>>> correct, it only takes the first value "", which indicates "Use the default 
>>> timezone of the locale".
>>> 
>>>> attr(dlt, "tzone")
>>> [1] "" "CET"  "CEST"
>>>> attr(dct, "tzone")
>>> [1] ""
>>> 
>>> The thing is, in POSIXlt the timezone attribute is stored together with the 
>>> actual values for hour, minute etc. in list format. Changing the timezone 
>>> doesn't change those values, but it will change the time itself:
>>> 
>>>> Sys.unsetenv("TZ")
>>>> dlt2 <- dlt
>>>> attr(dlt2,"tzone") <- "UTC"
>>>> dlt2
>>> [1] "2016-12-06 21:45:41 UTC" "2016-12-06 21:45:42 UTC"
>>> [3] "2016-12-06 21:45:43 UTC" "2016-12-06 21:45:44 UTC"
>>> 
>>> in POSIXct the value doesn't change either, just the attribute. But this 
>>> value is the number of seconds since the origin. So the time itself doesn't 
>>> change, but you'll see a different hour.
>>> 
>>>> dct
>>> [1] "2016-12-06 21:45:41 CET" "2016-12-06 21:45:42 CET"
>>> ...
>>>> attr(dct,"tzone") <- "UTC"
>>>> dct
>>> [1] "2016-12-06 20:45:41 UTC" "2016-12-06 20:45:42 UTC"
>>> [3] "2016-12-06 20:45:43 UTC" "2016-12-06 20:45:44 UTC"
>>> 
>>> So what you see, is simply the result of your timezone settings on your 
>>> computer.
>>> 
>>> Cheers
>>> Joris
>>> 
>>>> On Thu, May 18, 2017 at 1:19 PM, peter dalgaard  wrote:
>>>> 
>>>> On 18 May 2017, at 11:00 , Patrick Connolly  
>>>> wrote:
>>>> 
>>>> On Wed, 17-May-2017 at 01:21PM +0200, Peter Dalgaard wrote:
>>>> 
>>>> |>
>>>> |> Anyways, you might want to
>>>> |>
>>>> |> a) move the discussion to R-devel
>>>> |> b) include your platform (hardware, OS) and time zone info
>>>> 
>>>> Syste

Re: [Rd] [R] R-3.4.0 fails test

2017-05-18 Thread peter dalgaard

> On 18 May 2017, at 13:47 , Joris Meys  wrote:
> 
> Correction: Also dlt uses the default timezone, but POSIXlt is not 
> recalculated whereas POSIXct is. Reason for that is the different way values 
> are stored (hours, minutes, seconds as opposed to minutes from origin, as 
> explained in my previous mail)
> 

I would suspect that there is something more subtle going on, New Zealand time 
is 10, 11, or 12 hours from Central European, depending on time of year (10 in 
our Summer, 12 in theirs and 11 during the overlap at both ends, if you must 
know), and we are talking a 1 hour difference.  

However DST transitions were both in March/April, so that's not it. Maybe a 
POSIX[lc]t expert can comment?

-pd

> CHeers
> Joris
> 
> On Thu, May 18, 2017 at 1:45 PM, Joris Meys  wrote:
> This has to do with your own timezone. If I run that code on my computer, 
> both formats are correct. If I do this after 
> 
> Sys.setenv(TZ = "UTC")
> 
> Then:
> 
> > cbind(format(dlt), format(dct))
>   [,1]  [,2] 
>  [1,] "2016-12-06 21:45:41" "2016-12-06 20:45:41"
>  [2,] "2016-12-06 21:45:42" "2016-12-06 20:45:42"
> 
> The reason for that, is that dlt has a timezone set, but dct doesn't. To be 
> correct, it only takes the first value "", which indicates "Use the default 
> timezone of the locale".
> 
> > attr(dlt, "tzone")
> [1] "" "CET"  "CEST"
> > attr(dct, "tzone")
> [1] ""
> 
> The thing is, in POSIXlt the timezone attribute is stored together with the 
> actual values for hour, minute etc. in list format. Changing the timezone 
> doesn't change those values, but it will change the time itself:
> 
> > Sys.unsetenv("TZ")
> > dlt2 <- dlt
> > attr(dlt2,"tzone") <- "UTC"
> > dlt2
>  [1] "2016-12-06 21:45:41 UTC" "2016-12-06 21:45:42 UTC"
>  [3] "2016-12-06 21:45:43 UTC" "2016-12-06 21:45:44 UTC"
> 
> in POSIXct the value doesn't change either, just the attribute. But this 
> value is the number of seconds since the origin. So the time itself doesn't 
> change, but you'll see a different hour.
> 
> > dct
>  [1] "2016-12-06 21:45:41 CET" "2016-12-06 21:45:42 CET"
> ...
> > attr(dct,"tzone") <- "UTC"
> > dct
>  [1] "2016-12-06 20:45:41 UTC" "2016-12-06 20:45:42 UTC"
>  [3] "2016-12-06 20:45:43 UTC" "2016-12-06 20:45:44 UTC"
> 
> So what you see, is simply the result of your timezone settings on your 
> computer.
> 
> Cheers
> Joris
> 
> On Thu, May 18, 2017 at 1:19 PM, peter dalgaard  wrote:
> 
> > On 18 May 2017, at 11:00 , Patrick Connolly  
> > wrote:
> >
> > On Wed, 17-May-2017 at 01:21PM +0200, Peter Dalgaard wrote:
> >
> > |>
> > |> Anyways, you might want to
> > |>
> > |> a) move the discussion to R-devel
> > |> b) include your platform (hardware, OS) and time zone info
> >
> > System:Host: MTA-V1-427894 Kernel: 3.19.0-32-generic x86_64 (64 bit 
> > gcc: 4.8.2)
> >   Desktop: KDE Plasma 4.14.2 (Qt 4.8.6) Distro: Linux Mint 17.3 Rosa
> 
> I suppose that'll do...
> 
> 
> > Time zone: NZST
> 
> 
> 
> >
> > |> c) run the offending code lines "by hand" and show us the values of 
> > format(dlt) and format(dct) so we can see what the problem is, something 
> > like
> > |>
> > |> dlt <- structure(
> > |> list(sec = 52, min = 59L, hour = 18L, mday = 6L, mon = 11L, year = 
> > 116L,
> > |>wday = 2L, yday = 340L, isdst = 0L, zone = "CET", gmtoff = 3600L),
> > |>class = c("POSIXlt", "POSIXt"), tzone = c("", "CET", "CEST"))
> > |> dlt$sec <- 1 + 1:10
> > |> dct <- as.POSIXct(dlt)
> > |> cbind(format(dlt), format(dct))
> >
> >> cbind(format(dlt), format(dct))
> >  [,1]  [,2]
> > [1,] "2016-12-06 21:45:41" "2016-12-06 22:45:41"
> > [2,] "2016-12-06 21:45:42" "2016-12-06 22:45:42"
> > [3,] "2016-12-06 21:45:43" "2016-12-06 22:45:43"
> > [4,] "2016-12-06 21:45:44" "2016-12-06 22:45:44"
> > [5,] "2016-12-06 21:45:45" "2016-12-06 22:45:45"
> > [6,] "2016-12-06 21:45:46" "2016-12-06 22:45:46"
> > [7,] "2016-12-06 2

Re: [Rd] [R] R-3.4.0 fails test

2017-05-18 Thread peter dalgaard

> On 18 May 2017, at 11:00 , Patrick Connolly  
> wrote:
> 
> On Wed, 17-May-2017 at 01:21PM +0200, Peter Dalgaard wrote:
> 
> |> 
> |> Anyways, you might want to 
> |> 
> |> a) move the discussion to R-devel
> |> b) include your platform (hardware, OS) and time zone info
> 
> System:Host: MTA-V1-427894 Kernel: 3.19.0-32-generic x86_64 (64 bit gcc: 
> 4.8.2)
>   Desktop: KDE Plasma 4.14.2 (Qt 4.8.6) Distro: Linux Mint 17.3 Rosa

I suppose that'll do...


> Time zone: NZST



> 
> |> c) run the offending code lines "by hand" and show us the values of 
> format(dlt) and format(dct) so we can see what the problem is, something like
> |> 
> |> dlt <- structure(
> |> list(sec = 52, min = 59L, hour = 18L, mday = 6L, mon = 11L, year = 
> 116L,
> |>wday = 2L, yday = 340L, isdst = 0L, zone = "CET", gmtoff = 3600L),
> |>class = c("POSIXlt", "POSIXt"), tzone = c("", "CET", "CEST"))
> |> dlt$sec <- 1 + 1:10 
> |> dct <- as.POSIXct(dlt)
> |> cbind(format(dlt), format(dct))
> 
>> cbind(format(dlt), format(dct))
>  [,1]  [,2] 
> [1,] "2016-12-06 21:45:41" "2016-12-06 22:45:41"
> [2,] "2016-12-06 21:45:42" "2016-12-06 22:45:42"
> [3,] "2016-12-06 21:45:43" "2016-12-06 22:45:43"
> [4,] "2016-12-06 21:45:44" "2016-12-06 22:45:44"
> [5,] "2016-12-06 21:45:45" "2016-12-06 22:45:45"
> [6,] "2016-12-06 21:45:46" "2016-12-06 22:45:46"
> [7,] "2016-12-06 21:45:47" "2016-12-06 22:45:47"
> [8,] "2016-12-06 21:45:48" "2016-12-06 22:45:48"
> [9,] "2016-12-06 21:45:49" "2016-12-06 22:45:49"
> [10,] "2016-12-06 21:45:50" "2016-12-06 22:45:50"
>> 
> 


So exactly 1 hour out of whack. Is there a Daylight Saving Times issue, 
perchance?

-pd


> 
> -- 
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
>   ___Patrick Connolly   
> {~._.~}   Great minds discuss ideas
> _( Y )_Average minds discuss events 
> (:_~*~_:)  Small minds discuss people  
> (_)-(_) . Eleanor Roosevelt
> 
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] stopifnot() does not stop at first non-TRUE argument

2017-05-16 Thread peter dalgaard

> On 16 May 2017, at 18:37 , Suharto Anggono Suharto Anggono via R-devel 
>  wrote:
> 
> switch(i, ...)
> extracts 'i'-th argument in '...'. It is like
> eval(as.name(paste0("..", i))) .

Hey, that's pretty neat! 

-pd
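
A quick sketch of how the trick yields a short-circuiting check (hypothetical 
code, not the actual stopifnot()):

Stopifnot2 <- function(...)
{
    n <- nargs()
    for (i in seq_len(n)) {
        ok <- switch(i, ...)            # evaluates only argument number i
        if (!isTRUE(all(ok)))
            stop("argument ", i, " is not all TRUE")
    }
    invisible(TRUE)
}
Stopifnot2(TRUE, 2 + 2 == 4)                       # passes
try(Stopifnot2(2 + 2 == 5, print("not reached")))  # stops before argument 2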

> 
> Just mentioning other things:
> - For 'n',
> n <- nargs()
> can be used.
> - sys.call() can be used in place of match.call() .
> ---
>>>>>> peter dalgaard 
>>>>>>on Mon, 15 May 2017 16:28:42 +0200 writes:
> 
>> I think Hervé's idea was just that if switch can evaluate arguments 
>> selectively, so can stopifnot(). But switch() is .Primitive, so does it from 
>> C. 
> 
> if he just meant that, then "yes, of course" (but not so interesting).
> 
>> I think it is almost a no-brainer to implement a sequential stopifnot if 
>> dropping to C code is allowed. In R it gets trickier, but how about this:
> 
> Something like this, yes, that's close to what Serguei Sokol had proposed
> (and of course I *do*  want to keep the current sophistication
> of stopifnot(), so this is really too simple)
> 
>> Stopifnot <- function(...)
>> {
>> n <- length(match.call()) - 1
>> for (i in 1:n)
>> {
>> nm <- as.name(paste0("..",i))
>> if (!eval(nm)) stop("not all true")
>> }
>> }
>> Stopifnot(2+2==4)
>> Stopifnot(2+2==5, print("Hey!!!") == "Hey!!!")
>> Stopifnot(2+2==4, print("Hey!!!") == "Hey!!!")
>> Stopifnot(T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,F,T)
> 
> 
>>> On 15 May 2017, at 15:37 , Martin Maechler  
>>> wrote:
>>> 
>>> I'm still curious about Hervé's idea on using  switch()  for the
>>> issue.
> 
>> -- 
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Office: A 4.23
>> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] stopifnot() does not stop at first non-TRUE argument

2017-05-15 Thread peter dalgaard
However, it doesn't look much of a hassle to fuse my suggestion into the 
current stopifnot: Basically, just use eval(as.name(paste0("..",i))) instead of 
ll[[i]] and base the initial calculation of n on match.call() rather than on 
list(...).

-pd


> On 15 May 2017, at 17:04 , Martin Maechler  wrote:
> 
>>>>>> peter dalgaard 
>>>>>>on Mon, 15 May 2017 16:28:42 +0200 writes:
> 
>> I think Hervé's idea was just that if switch can evaluate arguments 
>> selectively, so can stopifnot(). But switch() is .Primitive, so does it from 
>> C. 
> 
> if he just meant that, then "yes, of course" (but not so interesting).
> 
>> I think it is almost a no-brainer to implement a sequential stopifnot if 
>> dropping to C code is allowed. In R it gets trickier, but how about this:
> 
> Something like this, yes, that's close to what Serguei Sokol had proposed
> (and of course I *do*  want to keep the current sophistication
> of stopifnot(), so this is really too simple)
> 
>> Stopifnot <- function(...)
>> {
>> n <- length(match.call()) - 1
>> for (i in 1:n)
>> {
>> nm <- as.name(paste0("..",i))
>> if (!eval(nm)) stop("not all true")
>> }
>> }
>> Stopifnot(2+2==4)
>> Stopifnot(2+2==5, print("Hey!!!") == "Hey!!!")
>> Stopifnot(2+2==4, print("Hey!!!") == "Hey!!!")
>> Stopifnot(T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,F,T)
> 
> 
>>> On 15 May 2017, at 15:37 , Martin Maechler  
>>> wrote:
>>> 
>>> I'm still curious about Hervé's idea on using  switch()  for the
>>> issue.
> 
>> -- 
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Office: A 4.23
>> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> 
> 
> 
> 
> 
> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] stopifnot() does not stop at first non-TRUE argument

2017-05-15 Thread peter dalgaard
I think Hervé's idea was just that if switch can evaluate arguments 
selectively, so can stopifnot(). But switch() is .Primitive, so does it from C. 

I think it is almost a no-brainer to implement a sequential stopifnot if 
dropping to C code is allowed. In R it gets trickier, but how about this:

Stopifnot <- function(...)
{
  n <- length(match.call()) - 1
  for (i in 1:n)
  {
nm <- as.name(paste0("..",i))
if (!eval(nm)) stop("not all true")
  }
}
Stopifnot(2+2==4)
Stopifnot(2+2==5, print("Hey!!!") == "Hey!!!")
Stopifnot(2+2==4, print("Hey!!!") == "Hey!!!")
Stopifnot(T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,F,T)


> On 15 May 2017, at 15:37 , Martin Maechler  wrote:
> 
> I'm still curious about Hervé's idea on using  switch()  for the
> issue.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-06 Thread peter dalgaard

> On 6 May 2017, at 01:49 , Nick Brown  wrote:
> 
> Hi John,
> 
> Thanks for the comment... but that appears to mean that SPSS has a big 
> problem.  I have always been told that to include an interaction term in a 
> regression, the only way is to do the multiplication by hand.  But then it 
> seems to be impossible to stop SPSS from re-standardizing the variable that 
> corresponds to the interaction term.  Am I missing something?  Is there a way 
> to perform the regression with the interaction in SPSS without computing the 
> interaction as a separate variable?

Just look at the unstandardized coefficients in SPSS. The standardized ones are 
of some usefulness, but it is limited in the case of syntesized regressors like 
product terms. I imagine that the interpretation also goes whacky for squared 
terms, dummy coded groupings, etc.

(Does SPSS really still not have an automated way of generating interaction 
terms? 1977 called... Googling, it looks like GLM understands them, REGRESS 
not.)

-pd
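
For anyone wanting to reproduce the SPSS-style Betas in R: each unstandardized 
coefficient is rescaled by sd(x_j)/sd(y), with the product column treated as 
just another column, which is exactly the separate re-standardization at issue. 
A hypothetical helper (assumes an intercept in position 1 and complete cases):

beta_std <- function(fit) {
    b <- coef(fit)[-1]                        # drop the intercept
    X <- model.matrix(fit)[, -1, drop = FALSE]
    y <- model.response(model.frame(fit))
    b * apply(X, 2, sd) / sd(y)               # beta_j = b_j * sd(x_j) / sd(y)
}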

> 
> Best,
> Nick
> 
> From: "John Fox" 
> To: "Nick Brown" , "peter dalgaard" 
> Cc: r-devel@r-project.org
> Sent: Friday, 5 May, 2017 8:22:53 PM
> Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS
> 
> Dear Nick,
> 
> 
> On 2017-05-05, 9:40 AM, "R-devel on behalf of Nick Brown"
>  wrote:
> 
> >>I conjecture that something in the vicinity of
> >> res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
> >>scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
> >>summary(res) 
> >> would reproduce the SPSS Beta values.
> >
> >Yes, that works. Thanks!
> 
> That you have to work hard in R to match the SPSS results isn’t such a bad
> thing when you factor in the observation that standardizing the
> interaction regressor, ZMEAN_PA * ZDIVERSITY_PA, separately from each of
> its components, ZMEAN_PA and ZDIVERSITY_PA, is nonsense.
> 
> Best,
>  John
> 
> -----
> John Fox, Professor
> McMaster University
> Hamilton, Ontario, Canada
> Web: http://socserv.mcmaster.ca/jfox/
> 
> 
> > 
> >
> >- Original Message -
> >
> >From: "peter dalgaard" 
> >To: "Viechtbauer Wolfgang (SP)"
> >, "Nick Brown"
> >
> >Cc: r-devel@r-project.org
> >Sent: Friday, 5 May, 2017 3:33:29 PM
> >Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS
> >
> >Thanks, I was getting to try this, but got side tracked by actual work...
> >
> >Your analysis reproduces the SPSS unscaled estimates. It still remains to
> >figure out how Nick got
> >
> >> 
> >coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
> >
> >(Intercept) ZMEAN_PA ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
> >0.07342198 -0.39650356 -0.36569488 -0.09435788
> >
> >
> >which does not match your output. I suspect that ZMEAN_PA and
> >ZDIVERSITY_PA were scaled for this analysis (but the interaction term
> >still obviously is not). I conjecture that something in the vicinity of
> >
> >res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
> >scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
> >summary(res) 
> >
> >would reproduce the SPSS Beta values.
> >
> >
> >> On 5 May 2017, at 14:43 , Viechtbauer Wolfgang (SP)
> >> wrote:
> >> 
> >> I had no problems running regression models in SPSS and R that yielded
> >>the same results for these data.
> >> 
> >> The difference you are observing is from fitting different models. In
> >>R, you fitted: 
> >> 
> >> res <- lm(DEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=dat)
> >> summary(res) 
> >> 
> >> The interaction term is the product of ZMEAN_PA and ZDIVERSITY_PA. This
> >>is not a standardized variable itself and not the same as "ZINTER_PA_C"
> >>in the png you showed, which is not a variable in the dataset, but can
> >>be created with: 
> >> 
> >> dat$ZINTER_PA_C <- with(dat, scale(ZMEAN_PA * ZDIVERSITY_PA))
> >> 
> >> If you want the same results as in SPSS, then you need to fit:
> >> 
> >> res <- lm(DEPRESSION ~ ZMEAN_PA + ZDIVERSITY_PA + ZINTER_PA_C,
> >>data=dat) 
> >> summary(res) 
> >> 
> >> This yields: 
> >> 
> >> Coefficients: 
> >> Estimate Std. Error t value Pr(>|t|)
> >> (Intercept) 6.41041 0.01722 372.21 <2e-16 ***
> >> ZMEAN_PA -1.62726 0.04200 -38.74 <2e-16 ***
> >> ZDIVE

Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-05 Thread peter dalgaard
Thanks, I was getting to try this, but got side tracked by actual work...

Your analysis reproduces the SPSS unscaled estimates. It still remains to 
figure out how Nick got

> coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
           (Intercept)               ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
            0.07342198            -0.39650356            -0.36569488            -0.09435788


which does not match your output. I suspect that ZMEAN_PA and ZDIVERSITY_PA 
were scaled for this analysis (but the interaction term still obviously is 
not). I conjecture that something in the vicinity of

res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) + scale(ZMEAN_PA 
* ZDIVERSITY_PA), data=dat)
summary(res)

would reproduce the SPSS Beta values.


> On 5 May 2017, at 14:43 , Viechtbauer Wolfgang (SP) 
>  wrote:
> 
> I had no problems running regression models in SPSS and R that yielded the 
> same results for these data.
> 
> The difference you are observing is from fitting different models. In R, you 
> fitted:
> 
> res <- lm(DEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=dat)
> summary(res)
> 
> The interaction term is the product of ZMEAN_PA and ZDIVERSITY_PA. This is 
> not a standardized variable itself and not the same as "ZINTER_PA_C" in the 
> png you showed, which is not a variable in the dataset, but can be created 
> with:
> 
> dat$ZINTER_PA_C <- with(dat, scale(ZMEAN_PA * ZDIVERSITY_PA))
> 
> If you want the same results as in SPSS, then you need to fit:
> 
> res <- lm(DEPRESSION ~ ZMEAN_PA + ZDIVERSITY_PA + ZINTER_PA_C, data=dat)
> summary(res)
> 
> This yields:
> 
> Coefficients:
>  Estimate Std. Error t value Pr(>|t|)
> (Intercept)6.410410.01722  372.21   <2e-16 ***
> ZMEAN_PA  -1.627260.04200  -38.74   <2e-16 ***
> ZDIVERSITY_PA -1.500820.07447  -20.15   <2e-16 ***
> ZINTER_PA_C   -0.589550.05288  -11.15   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> Exactly the same as in the png.
> 
> Peter already mentioned this as a possible reason for the discrepancy: 
> https://stat.ethz.ch/pipermail/r-devel/2017-May/074191.html ("Is it perhaps 
> the case that x1 and x2 have already been scaled to have standard deviation 
> 1? In that case, x1*x2 won't be.")
> 
> Best,
> Wolfgang
> 
> -Original Message-
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Nick Brown
> Sent: Friday, May 05, 2017 10:40
> To: peter dalgaard
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS
> 
> Hi, 
> 
> Here is (I hope) all the relevant output from R. 
> 
> > mean(s1$ZDEPRESSION, na.rm=T)
> [1] -1.041546e-16
> > mean(s1$ZDIVERSITY_PA, na.rm=T)
> [1] -9.660583e-16
> > mean(s1$ZMEAN_PA, na.rm=T)
> [1] -5.430282e-15
> > lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1)$coef
>               ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
>             -0.3962254             -0.3636026             -0.1425772
> 
> ## This is what I thought was the problem originally. :-)
> 
> > coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
>            (Intercept)               ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
>             0.07342198            -0.39650356            -0.36569488            -0.09435788
> > coefficients(lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
>                               ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
>             0.07342198            -0.39650356            -0.36569488            -0.09435788
> 
> The equivalent from SPSS is attached. The unstandardized 
> coefficients in SPSS look nothing like those in R. The standardized 
> coefficients in SPSS match the lm.ridge()$coef numbers very closely indeed, 
> suggesting that the same algorithm may be in use. 
> 
> I have put the dataset file, which is the untouched original I received from 
> the authors, in this Dropbox folder: 
> https://www.dropbox.com/sh/xsebjy55ius1ysb/AADwYUyV1bl6-iAw7ACuF1_La?dl=0. 
> You can read it into R with this code (one variable needs to be standardized 
> and centered; everything else is already in the file): 
> 
> s1 <- read.csv("Emodiversity_Study1.csv", stringsAsFactors=FALSE) 
> s1$ZDEPRESSION <- scale(s1$DEPRESSION) 
> Hey, maybe R is fine and I've stumbled on a bug in SPSS? If so, I'm sure IBM 
> will want to fix it quickly (ha ha ha). 
> 
> Nick 
> 
> - Original Message -
> 
> From: "peter dalgaard"  
> To: "Nick Brown"  
> Cc: "Simon Bonner" , r-devel@r-project.or

Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-05 Thread peter dalgaard
>> (Of course, lambda=0 is 
>> the 
>> default, so it can be omitted; I can alternate between including or deleting 
>> ".ridge" in the function call, and watch the coefficient for the interaction 
>> change.) 
>> 
>> 
>> 
>> What seems slightly strange to me here is that I assumed that lm.ridge() 
>> just 
>> piggybacks on lm() anyway, so in the specific case where lambda=0 and there 
>> is no "ridging" to do, I'd expect exactly the same results. 
>> 
>> 
>> Unfortunately there are 34,000 cases in the dataset, so a "minimal" reprex 
>> will 
>> not be easy to make, but I can share the data via Dropbox or something if 
>> that 
>> would help. 
>> 
>> 
>> 
>> I appreciate that when there is strong collinearity then all bets are off in 
>> terms 
>> of what the betas mean, but I would really expect lm() and lm.ridge() to 
>> give 
>> the same results. (I would be happy to ignore SPSS, but for the moment it's 
>> part of the majority!) 
>> 
>> 
>> 
>> Thanks for reading, 
>> Nick 
>> 
>> 
>> [[alternative HTML version deleted]] 
>> 
>> __ 
>> R-devel@r-project.org mailing list 
>> https://stat.ethz.ch/mailman/listinfo/r-devel 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-04 Thread peter dalgaard
Um, the link to StackOverflow does not seem to contain the same question. It 
does contain a stern warning not to use the $coef component of lm.ridge...

Is it perhaps the case that x1 and x2 have already been scaled to have standard 
deviation 1? In that case, x1*x2 won't be.

Also notice that SPSS tends to use "Beta" for standardized regression 
coefficients, and (AFAIR) "b" for the regular ones.

-pd

> On 4 May 2017, at 16:28 , Nick Brown  wrote:
> 
> Hallo, 
> 
> I hope I am posting to the right place. I was advised to try this list by Ben 
> Bolker (https://twitter.com/bolkerb/status/859909918446497795). I also posted 
> this question to StackOverflow 
> (http://stackoverflow.com/questions/43771269/lm-gives-different-results-from-lm-ridgelambda-0).
>  I am a relative newcomer to R, but I wrote my first program in 1975 and have 
> been paid to program in about 15 different languages, so I have some general 
> background knowledge. 
> 
> 
> I have a regression from which I extract the coefficients like this: 
> lm(y ~ x1 * x2, data=ds)$coef 
> That gives: x1=0.40, x2=0.37, x1*x2=0.09 
> 
> 
> 
> When I do the same regression in SPSS, I get: 
> beta(x1)=0.40, beta(x2)=0.37, beta(x1*x2)=0.14. 
> So the main effects are in agreement, but there is quite a difference in the 
> coefficient for the interaction. 
> 
> 
> X1 and X2 are correlated about .75 (yes, yes, I know - this model wasn't my 
> idea, but it got published), so there is quite possibly something going on 
> with collinearity. So I thought I'd try lm.ridge() to see if I can get an 
> idea of where the problems are occurring. 
> 
> 
> The starting point is to run lm.ridge() with lambda=0 (i.e., no ridge 
> penalty) and check we get the same results as with lm(): 
> lm.ridge(y ~ x1 * x2, lambda=0, data=ds)$coef 
> x1=0.40, x2=0.37, x1*x2=0.14 
> So lm.ridge() agrees with SPSS, but not with lm(). (Of course, lambda=0 is 
> the default, so it can be omitted; I can alternate between including or 
> deleting ".ridge" in the function call, and watch the coefficient for the 
> interaction change.) 
> 
> 
> 
> What seems slightly strange to me here is that I assumed that lm.ridge() just 
> piggybacks on lm() anyway, so in the specific case where lambda=0 and there 
> is no "ridging" to do, I'd expect exactly the same results. 
> 
> 
> Unfortunately there are 34,000 cases in the dataset, so a "minimal" reprex 
> will not be easy to make, but I can share the data via Dropbox or something 
> if that would help. 
> 
> 
> 
> I appreciate that when there is strong collinearity then all bets are off in 
> terms of what the betas mean, but I would really expect lm() and lm.ridge() 
> to give the same results. (I would be happy to ignore SPSS, but for the 
> moment it's part of the majority!) 
> 
> 
> 
> Thanks for reading, 
> Nick 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] stopifnot() does not stop at first non-TRUE argument

2017-05-03 Thread peter dalgaard
The first line of stopifnot is

n <- length(ll <- list(...))

which takes ALL arguments and forms a list of them. This implies evaluation, so 
explains the effect that you see.

To do it differently, you would have to do something like 

   dots <- match.call(expand.dots=FALSE)$...

and then explicitly evaluate each argument in turn in the caller frame. This 
amount of nonstandard evaluation sounds like it would incur a performance 
penalty, which could be undesirable.
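
For illustration, a minimal sketch of such a short-circuiting variant
("stopifnot2" is a made-up name, and this skips stopifnot()'s other niceties):

stopifnot2 <- function(...) {
    dots <- match.call(expand.dots = FALSE)$...
    for (i in seq_along(dots)) {
        ## evaluate one argument at a time, in the caller's frame
        v <- eval(dots[[i]], envir = parent.frame())
        if (any(is.na(v)) || !all(v))
            stop(paste(deparse(dots[[i]]), collapse = " "),
                 " is not all TRUE", call. = FALSE)
    }
    invisible()
}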

If you want to enforce the order of evaluation, there is always

   stopifnot(A)
   stopifnot(B)

-pd

> On 3 May 2017, at 02:50 , Hervé Pagès  wrote:
> 
> Hi,
> 
> It's surprising that stopifnot() keeps evaluating its arguments after
> it reaches the first one that is not TRUE:
> 
>  > stopifnot(3 == 5, as.integer(2^32), a <- 12)
>  Error: 3 == 5 is not TRUE
>  In addition: Warning message:
>  In stopifnot(3 == 5, as.integer(2^32), a <- 12) :
>NAs introduced by coercion to integer range
>  > a
>  [1] 12
> 
> The details section in its man page actually suggests that it should
> stop at the first non-TRUE argument:
> 
>  ‘stopifnot(A, B)’ is conceptually equivalent to
> 
>   { if(any(is.na(A)) || !all(A)) stop(...);
> if(any(is.na(B)) || !all(B)) stop(...) }
> 
> Best,
> H.
> 
> -- 
> Hervé Pagès
> 
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
> 
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] failure of make check-all

2017-04-06 Thread peter dalgaard

> On 6 Apr 2017, at 15:51 , Therneau, Terry M., Ph.D.  wrote:
> 
> Peter,
>  Retry how much of it?  That is, where does it go in the sequence from svn up 
> to make check?  I'll update my notes so as to do it correctly.


From the configure step, I'd expect.

Actually, now that I look at your output again, you seem to be compiling in the 
source directory, which is generally a bad idea, because you end up with 
derived files intermixed with source code all over the place. If you compile in 
a parallel directory, then you can in the worst case just wipe it and start 
over. Also, compiling in the source dir has confused "make" on previous 
occasions (ideally, it should work, but most of us don't check and so are 
poorly motivated to track down issues). So

svn up
# mkdir ../BUILD if not there already
cd ../BUILD
../R/configure
make
...


Also, for checking the tests, I think the canonical way is 

path/to/R --vanilla < reg-examples3.R

which may be subtly different from using source().

I'd surmise that your issue is something with the C-level routine registration, 
but I haven't followed the technicalities of that in full detail. Did your 
build rebuild the survival package?

-pd

> 
> In any case, I put it first and reran the whole command chain.  I had 
> recently upgraded linux from 14.xx LTS to 16.04 LTS so it makes sense to 
> start over.  This removed the large diff in base-Ex.Rout from the first part 
> of the log, but the terminal error still remains:
> 
> make[3]: Leaving directory '/usr/local/src/R-devel/tests'
> make[3]: Entering directory '/usr/local/src/R-devel/tests'
> running code in 'reg-tests-3.R' ... OK
>  comparing 'reg-tests-3.Rout' to './reg-tests-3.Rout.save' ... OK
> running code in 'reg-examples3.R' ...Makefile.common:98: recipe for target 
> 'reg-examples3.Rout' failed
> make[3]: *** [reg-examples3.Rout] Error 1
> make[3]: Leaving directory '/usr/local/src/R-devel/tests'
> Makefile.common:273: recipe for target 'test-Reg' failed
> 
> Here are lines 97-100 of tests/Makefile.common:
> 
> .R.Rout:
>@rm -f $@ $@.fail $@.log
>@$(ECHO) $(ECHO_N) "running code in '$<' ...$(ECHO_C)" > $@.log
>@$(R) < $< > $@.fail 2>&1 || (cat $@.log && rm $@.log && exit 1)
> 
> 
> There is one complier warning message, it prints in pink so as not to miss it!
> 
> main.c: In function ‘dummy_ii’:
> main.c:1669:12: warning: function returns address of local variable 
> [-Wreturn-local-addr]
> return (uintptr_t) &ii;
> 
> -
> 
> So as to be more complete I did "cd tests; R" and source("reg-examples3.R"), 
> and lo and behold the error is
> 
>> source('reg-examples3.R')
> Loading required package: MASS
> Loading required package: survival
> Error in fitter(X, Y, strats, offset, init, control, weights = weights,  :
>  object 'Ccoxmart2' not found
> 
> Looking at src/coxmart2.c and src/init.c I don't see anything different than 
> the other dozen .C routines in my survival package.  The file tests/book7.R 
> in the package exercises this routine, and CMD check passes.
> 
> Hints?
> 
> Terry T.
> 
> 
> 
> 
> 
> 
> On 04/06/2017 07:52 AM, peter dalgaard wrote:
>> You may want to retry that after a make distclean, in case anything changed 
>> in the toolchain.
>> 
>> -pd
>> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] failure of make check-all

2017-04-06 Thread peter dalgaard
>  comparing 'arith.Rout' to './arith.Rout.save' ... OK
> running code in 'lm-tests.R' ... OK
>  comparing 'lm-tests.Rout' to './lm-tests.Rout.save' ... OK
> running code in 'ok-errors.R' ... OK
>  comparing 'ok-errors.Rout' to './ok-errors.Rout.save' ... OK
> running code in 'method-dispatch.R' ... OK
>  comparing 'method-dispatch.Rout' to './method-dispatch.Rout.save' ... OK
> running code in 'any-all.R' ... OK
>   comparing 'any-all.Rout' to './any-all.Rout.save' ... OK
> running code in 'd-p-q-r-tests.R' ... OK
>  comparing 'd-p-q-r-tests.Rout' to './d-p-q-r-tests.Rout.save' ... OK
> make[3]: Leaving directory '/usr/local/src/R-devel/tests'
> running sloppy specific tests
> make[3]: Entering directory '/usr/local/src/R-devel/tests'
> running code in 'complex.R' ... OK
>  comparing 'complex.Rout' to './complex.Rout.save' ... OK
> running code in 'print-tests.R' ... OK
>  comparing 'print-tests.Rout' to './print-tests.Rout.save' ... OK
> running code in 'lapack.R' ... OK
>  comparing 'lapack.Rout' to './lapack.Rout.save' ... OK
> running code in 'datasets.R' ... OK
>  comparing 'datasets.Rout' to './datasets.Rout.save' ... OK
> running code in 'datetime.R' ... OK
>  comparing 'datetime.Rout' to './datetime.Rout.save' ... OK
> running code in 'iec60559.R' ... OK
>  comparing 'iec60559.Rout' to './iec60559.Rout.save' ... OK
> make[3]: Leaving directory '/usr/local/src/R-devel/tests'
> make[2]: Leaving directory '/usr/local/src/R-devel/tests'
> make[2]: Entering directory '/usr/local/src/R-devel/tests'
> running regression tests ...
> make[3]: Entering directory '/usr/local/src/R-devel/tests'
> running code in 'array-subset.R' ... OK
> running code in 'reg-tests-1a.R' ... OK
> running code in 'reg-tests-1b.R' ... OK
> running code in 'reg-tests-1c.R' ... OK
> running code in 'reg-tests-1d.R' ... OK
> running code in 'reg-tests-2.R' ... OK
>  comparing 'reg-tests-2.Rout' to './reg-tests-2.Rout.save' ... OK
> running code in 'reg-examples1.R' ... OK
> running code in 'reg-examples2.R' ... OK
> running code in 'reg-packages.R' ... OK
> running code in 'p-qbeta-strict-tst.R' ... OK
> running code in 'r-strict-tst.R' ... OK
> running code in 'reg-IO.R' ... OK
>  comparing 'reg-IO.Rout' to './reg-IO.Rout.save' ... OK
> running code in 'reg-IO2.R' ... OK
>  comparing 'reg-IO2.Rout' to './reg-IO2.Rout.save' ... OK
> running code in 'reg-plot.R' ... OK
>  comparing 'reg-plot.pdf' to './reg-plot.pdf.save' ... OK
> running code in 'reg-S4-examples.R' ... OK
> running code in 'reg-BLAS.R' ... OK
> make[3]: Leaving directory '/usr/local/src/R-devel/tests'
> make[3]: Entering directory '/usr/local/src/R-devel/tests'
> running code in 'reg-tests-3.R' ... OK
>  comparing 'reg-tests-3.Rout' to './reg-tests-3.Rout.save' ... OK
> running code in 'reg-examples3.R' ...Makefile.common:98: recipe for target 
> 'reg-
> examples3.Rout' failed
> make[3]: *** [reg-examples3.Rout] Error 1
> make[3]: Leaving directory '/usr/local/src/R-devel/tests'
> Makefile.common:273: recipe for target 'test-Reg' failed
> make[2]: *** [test-Reg] Error 2
> make[2]: Leaving directory '/usr/local/src/R-devel/tests'
> Makefile.common:165: recipe for target 'test-all-basics' failed
> make[1]: *** [test-all-basics] Error 1
> make[1]: Leaving directory '/usr/local/src/R-devel/tests'
> Makefile:239: recipe for target 'check-all' failed
> make: *** [check-all] Error 2
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Very hard to reproduce bug (?) in R-devel

2017-04-05 Thread peter dalgaard

> On 05 Apr 2017, at 20:40 , Winston Chang  wrote:
> 
> I think there's a good chance that this is due to a bug in R. I have
> been trying to track down the cause of the problem but haven't been
> able find it.
> 
> -Winston

Apologies in advance if this is just stating the obvious, but let me try and 
put some general ideas  on the table.

- is anything non-deterministic involved? (Doesn't sound so, but...)
- could it be something with the bytecompiler?
- can you get something (_anything_) to trigger the bug (in any variant) when 
running R under gdb? I'm thinking gctorture() etc.
- it is odd that you cannot immediately get the same behaviour with R -d gdb or 
valgrind. Are you sure you are actually running the same script in the same way?
- if you can get a hold of something inside gdb, then there should be some 
potential for backtracking using hardware watchpoints and such. As in: This 
memory location doesn't contain the value I expected; what changed it?
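
For the gctorture()/gdb points above, the usual stress recipe is something
like this (a sketch; the script name is a placeholder):

## typically run under a debugger, e.g.  R -d gdb  and then 'run' at the gdb prompt
gctorture(TRUE)             # GC at every allocation: very slow, but makes
                            # memory-protection bugs fire deterministically
source("failing-script.R")  # placeholder for whatever triggers the bug
gctorture(FALSE)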

-pd


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Error in documentation for ?legend

2017-03-27 Thread peter dalgaard

> On 27 Mar 2017, at 09:59 , Martin Maechler  wrote:
> 
> 
> (You did not understand Peter:  He *did* agree with you that
> there's no 'title.cex' argument  and explained why the oddity
> probably has happened in the distant past ..)


I was also pointing out that the help page specifically does NOT document any 
such argument - otherwise the self-tests would have found the inconsistency 
long ago. What it does have is a (false) hint that there might somewhere be 
something called title.cex defaulting to the value of cex. 

The main point is that when someone claims that there is inconsistency between 
documentation and code, the first thing I do is to check the (long) argument 
list, the second thing is the (even longer) list of documented arguments. 
Neither has title.cex. No version information was given, so I couldn't know 
whether it might have been fixed in one of the more recent releases, etc.

(I didn't fix it immediately on the off chance that the original author had 
actually planned to implement a title.cex feature. But he probably didn't.)

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Error in documentation for ?legend

2017-03-25 Thread peter dalgaard

> On 25 Mar 2017, at 00:39 , POLITZER-AHLES, Stephen [CBS] 
>  wrote:
> 
> To whom it may concern:
> 
> 
> The help page for ?legend refers to a `title.cex` parameter, which suggests 
> that the function has such a parameter.

No it does not. All arguments are listed and documented, none of them is 
title.cex, and there's no "...".

However, the documentation for "cex" has this oddity inside:

 cex: character expansion factor *relative* to current
  ‘par("cex")’.  Used for text, and provides the default for
  ‘pt.cex’ and ‘title.cex’.

Checking the sources suggests that this is the last anyone has seen of 
title.cex:

pd$ grep -r title.cex src
src/library/graphics/man/legend.Rd:\code{pt.cex} and \code{title.cex}.}
pd$ 

The text was inserted as part of the addition of the title.col (!) argument, so 
it looks like the author got some wires crossed.

-pd

> As far as I can tell, though, it doesn't; here's an example:
> 
> 
>> plot(1,1)
>> legend("topright",pch=1, legend="something", title="my legend", title.cex=2)
> Error in legend("topright", pch = 1, legend = "something", title = "my 
> legend",  :
>  unused argument (title.cex = 2)
> 
> 
> This issue appears to have been discussed online before (e.g. here's a post 
> from 2011 mentioning it: 
> http://r.789695.n4.nabble.com/Change-the-text-size-of-the-title-in-a-legend-of-a-R-plot-td3482880.html)
>  but I'm not sure if anyone ever reported it to R developers.
> 
> 
> Is it possible for someone to update the ?legend documentation page so that 
> it doens't refer to a parameter that isn't usable?
> 
> 
> Best,
> 
> Steve Politzer-Ahles
> 
> 
> ---
> Stephen Politzer-Ahles
> The Hong Kong Polytechnic University
> Department of Chinese and Bilingual Studies
> http://www.mypolyuweb.hk/~sjpolit/
> 
> 
> 
> www.polyu.edu.hk/80anniversary
> 
> Disclaimer:\ \ This message (including any attachments) ...{{dropped:19}}
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] R 3.4.0

2017-03-17 Thread Peter Dalgaard
R 3.4.0 "You Stupid Darkness" is now scheduled for April 21

The detailed schedule can be found on developer.r-project.org

For the Core Team

Peter D.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Pressing either Ctrl-\ of Ctrl-4 core dumps R

2017-02-13 Thread peter dalgaard

On 12 Feb 2017, at 23:54 , Henrik Bengtsson  wrote:

> I still don't understand why the terminal treats keypress Ctrl+4 the
> same as Ctrl+\, but at least I'm not alone;
> https://catern.com/posts/terminal_quirks.html#fn.3.

I would guess that this was just to get certain escape chars within reach on 
non-US keyboard layouts, e.g. the [\] characters are replaced by some 
permutation of the three extra letters in Scandinavian languages (as Henrik 
surely knows all about). So the awkward ones were reassigned/duplicated at 
Ctrl-2 -- Ctrl-8.

On, say, a current Mac Terminal.app, Ctrl-\ should be Ctrl-Shift-Alt-7, but 
that key combination actually just generates "7". I also recall terminals where 
some characters could only be obtained via compose sequences, e.g. compose-/-/ 
for "\", and there was no obvious way to add a Ctrl modifier to that.

-pd


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] R 3.3.3 on March 6

2017-02-05 Thread Peter Dalgaard
The wrap-up release of the R-3.3.x series will be on Monday, March 6th. 

Package maintainers should check that their packages still work with this 
release. In particular, recommended-package maintainers should be extra careful 
since we do not want unexpected turbulence at this point.

On behalf of the R Core Team
Peter Dalgaard

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Unexpected EOF in R-patched_2017-01-30

2017-01-31 Thread peter dalgaard

> On 31 Jan 2017, at 18:56 , Avraham Adler  wrote:
> 
> Hello.
> 
> When trying to unpack today's version of R-patched,

From which source? The files from cran.r-project.org seem OK, both those in 
src/base-prerelease and those from ETHZ. Also, is it not "tar -xfz" when 
reading a compressed file?

-pd 

> I get the following error:
> 
> C:\R>tar -xf R-patched_2017-01-30.tar.gz
> 
> gzip: stdin: unexpected end of file
> tar: Unexpected EOF in archive
> tar: Unexpected EOF in archive
> tar: Error is not recoverable: exiting now
> 
> I got the same error for R-patched_2017-01-30.tar.gz but not for 
> R-3.3.2.tar.gz.
> 
> Thank you,
> 
> Avi
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] unlicense

2017-01-18 Thread peter dalgaard
>>
>>>>>> If it is recognized by the OSF or FSF or some other authority as a FOSS
>>>>>> license, then CRAN would probably also recognize it.  If not, then CRAN
>>>>>> doesn't have the resources to evaluate it and so is unlikely to
>>>>>> recognize
>>>>>> it.
>>>>> 
>>>>> 
>>>>> 
>>>>> Unlicense is listed in https://spdx.org/licenses/
>>>>> 
>>>>> Debian does include software "licensed" like this, and seems to think
>>>>> this is one way (not the only one) of declaring something to be
>>>>> "public domain".  The first two examples I found:
>>>>> 
>>>>> https://tracker.debian.org/media/packages/r/rasqal/copyright-0.9.29-1
>>>>> 
>>>>> 
>>>>> https://tracker.debian.org/media/packages/w/wiredtiger/copyright-2.6.1%2Bds-1
>>>>> 
>>>>> This follows the format explained in
>>>>> 
>>>>> 
>>>>> https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/#license-specification,
>>>>> which does not explicitly include Unlicense, but does include CC0,
>>>>> which AFAICT is meant to formally license something so that it is
>>>>> equivalent to being in the public domain. R does include CC0 as a
>>>>> shorthand (e.g., geoknife).
>>>>> 
>>>>> https://www.debian.org/legal/licenses/ says that
>>>>> 
>>>>> 
>>>>> 
>>>>> Licenses currently found in Debian main include:
>>>>> 
>>>>> - ...
>>>>> - ...
>>>>> - public domain (not a license, strictly speaking)
>>>>> 
>>>>> 
>>>>> 
>>>>> The equivalent for CRAN would probably be something like "License:
>>>>> public-domain + file LICENSE".
>>>>> 
>>>>> -Deepayan
>>>>> 
>>>>>> Duncan Murdoch
>>>>>> 
>>>>>> 
>>>>>> __
>>>>>> R-devel@r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>> 
>>>>> 
>>>>> 
>>>>> __
>>>>> R-devel@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>> 
>>>> 
>>>> __
>>>> R-devel@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] problem with print.generic(x)deparse(substitute(x))

2017-01-09 Thread peter dalgaard

On 09 Jan 2017, at 10:53 , Spencer Graves  wrote:

> # Define an object of class 'dum'
> k <- 1
> class(k) <- 'dum'
> str(k) # as expected
> 
> # Define print.dum
> print.dum <- function(x, ...)
>  deparse(substitute(x))
> 
> print(k) # Prints "k" as expected
> # THE FOLLOWING PRINTS NOTHING:
> k # Why?


Because it doesn't work that way...

First of all, your print.dum relies on autoprinting of its return value, it 
doesn't print anything itself. That's not how one usually writes print methods: 
You should print something and (usually) return the argument invisibly. 

Autoprinting calls the print method to do the actual printing and returns the 
object invisibly, irrespective of the return value from the print method. To 
wit:

> k
> dput(.Last.value)
structure(1, class = "dum")

(Trying to print the return value would invite infinite looping.)

However, there's another catch: deparse(substitute(...)) relies on knowing the 
argument to print() before evaluation, but autoprinting does not retain that 
information, it just looks at the object that has been computed and passes it 
to the relevant print method, so you get this effect:

> print.dum <- function(x, ...)
+  print(deparse(substitute(x)))
> k
[1] "x"

-pd



-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [R] Problems when trying to install and load package "rzmq"

2017-01-03 Thread peter dalgaard
Possibly so. 

However, the ZeroMQ libraries do exist for Windows, so it might be possible to 
get the package working there. However, CRAN probably won't have the libraries, 
so cannot produce a binary package, and it is also quite possible that the 
package author is not a Windows person. 

At the very least, you'll need some familiarity with the Windows toolchain and 
be prepared to apply a fair amount of elbow grease.

-pd

(crosspost to r-help removed)

On 29 Dec 2016, at 22:04 , Paul Bernal  wrote:

> Dear Jeff,
> 
> Thank you for your fast and kind reply. When you say that you do not think
> this can be done on windows, then I would have to use something like Ubuntu
> or Linux?
> 
> Best regards
> 
> Paul
> 
> 2016-12-29 16:00 GMT-05:00 Jeff Newmiller :
> 
>> Read the system requirements [1]. I don't think you can do this on windows.
>> 
>> [1] https://cran.r-project.org/web/packages/rzmq/index.html
>> --
>> Sent from my phone. Please excuse my brevity.
>> 
>> On December 29, 2016 12:23:26 PM PST, Paul Bernal 
>> wrote:
>>> After connecting to a mirror, I typed the following command:
>>> 
>>> install.packages("rzqm")
>>> 
>>> but I received the following message:
>>> 
>>> ERROR: compilation failed for package 'rzmq'
>>> 
>>> removing 'E:/Documents/R/win-library/3.3/rzmq'
>>> 
>>> package which is only available in source form, and may need
>>> compilation of
>>> C/C++/Fortran: 'rzmq'
>>> These will not be installed
>>> 
>>> The computer environment is Windows 8 64x bits
>>> 
>>> 
>>> Any help and/or guidance will be greatly appreciated
>>> 
>>>  [[alternative HTML version deleted]]
>>> 
>>> __
>>> r-h...@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> r-h...@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] varimax implementation in stats package

2017-01-02 Thread peter dalgaard
factanal() was originally in MASS which is support software for Venables & 
Ripley (2002). They have Bartholomew & Knott (1999) as the main reference for 
factor analysis, so that would be a place to look (I don't have it to hand). 

At any rate, varimax optimizes a well-defined criterion, so the "only" thing to 
do is to verify that the algorithm does that, not that it is somehow equivalent 
to any other algorithm. On the face of it, I would guess that it is derived ab 
initio, there is nothing pairwise about the code.

-pd

On 02 Jan 2017, at 08:53 , Sebastian Starke  wrote:

> Hello,
> 
> recently I was looking at the implementation of the "varimax" rotation 
> procedure from the "stats" package and to me it looks quite different from 
> the algorithm originally suggested by Kaiser in 1958.
> 
> The R procedure iteratively uses singular value decompositions of some 
> matrices whereas Kaiser proposed to iteratively compute rotation matrices 
> between all pairs of factors which does not seem to happen ( at least not 
> explicitely ) in the R version.
> 
> My question now is whether R uses a completely different approach than Kaiser 
> (if so, then could you please point me to a publication or explanation of the 
> algorithm used since I wasn't able to find any) or if it is the Kaiser method 
> just well hidden under quite a bit of clever linear algebra ( explanations on 
> why the methods are equal is also appreciated).
> 
> Thanks for any hints or clarifications!
> 
> With best regards
> 
> Sebastian Starke
> 
> ______
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] unlist strips date class

2016-12-05 Thread peter dalgaard

On 02 Dec 2016, at 23:13 , Hervé Pagès  wrote:

> More generally one might reasonably expect 'unlist(x)' to be equivalent
> to 'do.call(c, x)' on a list 'x' where all the list elements are atomic
> vectors:

Well, both are generic, and e.g. there is no "Date" method for unlist(), but 
there is for c(). It is not clear that the two should be kept in lockstep and 
there is certainly no mechanism to enforce that.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] using with inside loop breaks next

2016-10-27 Thread Peter Dalgaard
(a)/(c) mostly, I think. The crux is that "next" is unhappy about being 
evaluated in a different environment than the containing loop. Witness this:


> for (i in 1:10) {if (i == 5) evalq(next); print(i)}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
> for (i in 1:10) {if (i == 5) evalq(next, new.env()); print(i)}
[1] 1
[1] 2
[1] 3
[1] 4
Error in eval(substitute(expr), envir, enclos) : 
  no loop for break/next, jumping to top level
> for (i in 1:10) {if (i == 5) evalq(next, parent.env(new.env())); print(i)}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
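
A rewrite in the spirit of option (c), keeping next in the loop's own frame
(a sketch):

for (lst in list(list(a = 1), list(a = 2), list(a = 3))) {
    a <- lst$a        # pull things out instead of using with()
    if (a == 2) next  # 'next' now runs directly in the loop's frame
    print(a)
}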

-pd



> On 27 Oct 2016, at 09:51 , Richard Cotton  wrote:
> 
> If I want to use with inside a loop, it seems that next gets confused.
> To reproduce:
> 
> for(lst in list(list(a = 1), list(a = 2), list(a = 3)))
> {
>  with(lst, if(a == 2) next else print(a))
> }
> 
> I expect 1 and 3 to be printed, but I see
> 
> [1] 1
> Error in eval(expr, envir, enclos) :
>  no loop for break/next, jumping to top level
> 
> Is this
> a) by design, or
> b) a bug, or
> c) a thing that is rare enough that I should just rewrite my code?
> 
> -- 
> Regards,
> Richie
> 
> Learning R
> 4dpiecharts.com
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] BUG?: On Linux setTimeLimit() fails to propagate timeout error when it occurs (works on Windows)

2016-10-26 Thread peter dalgaard
Spencer also had tools and rsconnect loaded (via a namespace) but it doesn't 
seem to make a difference for me if I load them. It also doesn't seem to matter 
for me whether it is CRAN R, locally built R, Terminal, R.app. However, RStudio 
differs 

> setTimeLimit(elapsed=1)
Error: reached elapsed time limit
> setTimeLimit(elapsed=1)
Error: reached elapsed time limit
> setTimeLimit(elapsed=1); system.time({Sys.sleep(10);message("done")})
Error in Sys.sleep(10) : reached elapsed time limit
Timing stopped at: 0.003 0.003 0.733 

-pd


> On 26 Oct 2016, at 21:54 , Henrik Bengtsson  
> wrote:
> 
> Thank you for the feedback and confirmations.  Interesting to see that
> it's also reproducible on macOS except for Spencer; that might
> indicate a difference in builds.
> 
> BTW, my original post suggested that timeout error was for sure
> detected while running Sys.sleep(10).  However, it could of course
> also be that it is only detected after it finishes.
> 
> 
> For troubleshooting, the help("setTimeLimit", package = "base") says that:
> 
> * "Time limits are checked whenever a user interrupt could occur. This
> will happen frequently in R code and during Sys.sleep, but only at
> points in compiled C and Fortran code identified by the code author."
> 
> The example here uses Sys.sleep(), which supports and detects user interrupts.
> 
> 
> The timeout error message is thrown by the R_ProcessEvents(void)
> function as defined in:
> 
> * src/unix/sys-unix.c
> (https://github.com/wch/r-source/blob/trunk/src/unix/sys-unix.c#L421-L453)
> * src/gnuwin32/system.c
> (https://github.com/wch/r-source/blob/trunk/src/gnuwin32/system.c#L110-L140)
> 
> So, they're clearly different implementations on Windows and Unix.
> Also, for the Unix implementation, the code differ based on
> preprocessing directive HAVE_AQUA, which could explain why Spencer
> observes a different behavior than Peter and Berend (all on macOS).
> 
> 
> Whenever the R_CheckUserInterrupt() function is called it in turn
> always calls R_ProcessEvents().  At the end, there is a code snippet -
> if (R_interrupts_pending) onintr(); - which is Windows specific and
> could be another important difference between Windows and Unix.  This
> function is defined in:
> 
> * src/main/errors.c
> (https://github.com/wch/r-source/blob/trunk/src/main/errors.c#L114-L134)
> 
> 
> The do_setTimeLimit() function controls global variables cpuLimitValue
> and elapsedLimitValue, which are checked in R_ProcessEvents(), but
> other than setting the timeout limits I don't think it's involved in
> the runtime checks. The do_setTimeLimit() is defined in:
> 
> * src/main/sysutils.c
> (https://github.com/wch/r-source/blob/trunk/src/main/sysutils.c#L1692-L1736)
> 
> 
> Unfortunately, right now, I've got little extra time to troubleshoot
> this further.
> 
> /Henrik
> 
> On Wed, Oct 26, 2016 at 2:22 AM, Berend Hasselman  wrote:
>> 
>>> On 26 Oct 2016, at 04:44, Henrik Bengtsson  
>>> wrote:
>>> ...
>>> This looks like a bug to me.  Can anyone on macOS confirm whether this
>>> is also a problem there or not?
>>> 
>> 
>> 
>> Tried it on macOS El Capitan and got this (running in R.app with R version 
>> 3.3.2 RC (2016-10-23 r71574):
>> 
>>> setTimeLimit(elapsed=1)
>>> system.time({ Sys.sleep(10); message("done") })
>> Error in Sys.sleep(10) : reached elapsed time limit
>> Timing stopped at: 0.113 0.042 10.038
>> 
>> Berend
>> 
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] BUG?: On Linux setTimeLimit() fails to propagate timeout error when it occurs (works on Windows)

2016-10-26 Thread peter dalgaard

On 26 Oct 2016, at 04:44 , Henrik Bengtsson  wrote:

> This looks like a bug to me.  Can anyone on macOS confirm whether this
> is also a problem there or not?

I don't know whether it is a problem ( ;-) ), but it does the same thing 
(checked Mavericks, Yosemite and Sierra)

-pd

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] typo or stale info in qr man

2016-10-25 Thread peter dalgaard

On 25 Oct 2016, at 10:08 , Martin Maechler  wrote:

>>>>>> Wojciech Musial (Voitek) 
>>>>>>     on Mon, 24 Oct 2016 15:07:55 -0700 writes:
> 
>> man for `qr` says that the function uses LINPACK's DQRDC, while it in
>> fact uses DQRDC2.
> 
> which is a modification of LINPACK's DQRDC.
> 
> But you are right, and I have added to the help file (and a tiny
> bit to the comments in the Fortran source).
> 
> When this change was done > 20 years ago, it was still hoped 
> that the numerical linear algebra community or more specifically
> those behind LAPACK would eventually provide this functionality
> with LAPACK (and we would then use that),
> but that has never happened according to my knowledge.
> 

I had some thoughts on this recently and resolved that the base issue is that R 
wants successive (Gram/Schmidt-type) orthogonalization of the design matrix, 
not really QR as such. 

The LINPACK QR routine happens to work by orthogonalization, but it is far from 
the only way of doing QR, and most likely not the "best" one 
(speedwise/precisionwise) if a QR decompositiion as such is the target. 
(Pivoting is only part of the story)

lm() and associates (notably anova()) rely so much on successive terms being 
orthogonalized that method="qr" really is a misnomer. For much the same reason, 
it really is too much to expect that numerical analysts would enforce 
orthogonality features on a general QR-decomposer. 

I suppose that if we want to be free of LINPACK, we may need to step back and 
write our own orthogonalization routines based on other routines in LAPACK or 
on the BLAS directly.
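
For intuition, that successive orthogonalization amounts to something like the
following pure-R sketch (deliberately naive and unpivoted, not what dqrdc2
actually does):

gs <- function(X) {
    Q <- X
    for (j in seq_len(ncol(X)))
        for (k in seq_len(j - 1))
            ## subtract the projection of the current residual of column j
            ## onto the already-orthogonalized column k (modified Gram-Schmidt)
            Q[, j] <- Q[, j] - sum(Q[, k] * Q[, j]) / sum(Q[, k]^2) * Q[, k]
    Q  # successively orthogonal columns spanning the same nested subspaces
}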

-pd

> Thank you for the 'heads up'.
> 
> Martin Maechler
> ETH Zurich
> 
> ______
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] anonymous function parsing bug?

2016-10-21 Thread peter dalgaard

> On 21 Oct 2016, at 19:17 , Wilm Schumacher  wrote:
> 
> Am 21.10.2016 um 18:10 schrieb William Dunlap:
>> Are you saying that
>>f1 <- function(x) log(x)
>>f2 <- function(x) { log } (x)
>> should act differently?
> yes. Or more precisely: I would expect that. "Should" implies, that I want to 
> change something. I just want to understand the behavior (or file a bug, if 
> this would have been one).

I think Bill and Luke are failing in trying to make you work out the logic for 
yourself...

The point is that 
{
  some_computation
}(x)

is an expression that evaluates some_computation and applies it as a function 
to the argument x (or fails if not a function). 

When you define functions, the body can be a single expression, so

f <- function(a)
{
  some_computation
}(x)

is effectively the same as

f <- function(a) {
 {
   some_computation
 }(x)
}

where you seem to be expecting

{f <- function(a) {
 {
   some_computation
 }
}
}(x)

Got it?
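
Concretely (a quick sketch):

f1 <- function(x) log(x)
f2 <- function(x) { log } (x)
f1(exp(1))  # 1
f2(exp(1))  # also 1: the trailing (x) is part of f2's body and is applied
            # to f2's own argument x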
  
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Numerical accuracy of matrix multiplication

2016-09-16 Thread peter dalgaard

On 16 Sep 2016, at 12:41 , Alexis Sarda  wrote:

> Hello,
> 
> while testing the crossprod() function under Linux, I noticed the following:
> 
> set.seed(883)
> x <- rnorm(100)
> x %*% x - sum(x^2) # equal to 1.421085e-14
> 
> Is this difference normal? It seems to be rather large for double precision.
> 

It's less than .Machine$double.eps, relative (!) to x  %*% x ~= 100.
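
Concretely, with the x from the example below (a quick check):

rel <- abs(drop(x %*% x) - sum(x^2)) / drop(x %*% x)
rel < .Machine$double.eps  # TRUE: the relative difference is below machine epsilon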

-pd

> Regards,
> Alexis.
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] problem with abline(lm(...)) for plot(y~x, log='xy')

2016-08-07 Thread peter dalgaard
Try log10()...
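
That is, fit on the log10 scale, which is the coordinate system that log='xy'
plots in (a sketch):

tstFit10 <- lm(log10(y) ~ log10(x), tstDat)
plot(y ~ x, tstDat, log = "xy")
abline(tstFit10)  # the line is now in the same (log10) user coordinates as the points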

-pd

> On 07 Aug 2016, at 21:03 , Spencer Graves  wrote:
> 
> Hello:
> 
> 
>   In the following plot, the fitted line plots 100 percent above the 
> points:
> 
> 
> tstDat <- data.frame(x=10^(1:3), y=10^(1:3+.1*rnorm(3)))
> tstFit <- lm(log(y)~log(x), tstDat)
> plot(y~x, tstDat, log='xy')
> abline(tstFit)
> 
> 
>   I can get the correct line with the following:
> 
> 
> tstPredDat <- data.frame(x=10^seq(1, 3, len=2))
> tstPred <- predict(tstFit, tstPredDat)
> lines(tstPredDat$x, exp(tstPred))
> 
> 
>   I tried "abline(tstFit)" hoping it would work.  If the error had not 
> been so obvious, I might not have noticed it.
> 
> 
>   Thanks for your work to build a better R (and through that a better 
> world).
> 
> 
>   Spencer Graves
> 
> ______
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] What happened to Ross Ihaka's proposal for a Common Lisp based R successor?

2016-08-05 Thread peter dalgaard

On 05 Aug 2016, at 06:41 , Andrew Judson  wrote:

> I read this paper
> <https://www.stat.auckland.ac.nz/~ihaka/downloads/Compstat-2008.pdf> and
> haven't been able to find out what happened - I have seen some sporadic
> mention in message groups but nothing definitive. Does anyone know?

Presumably Ross does...

You get a hint if you go one level up and look for the newest file:

https://www.stat.auckland.ac.nz/~ihaka/downloads/New-System.pdf


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Issues building from svn

2016-07-19 Thread peter dalgaard

> On 19 Jul 2016, at 22:05 , Dirk Eddelbuettel  wrote:
> 
> 
> On a fresh svn checkout in a fresh directory, following up on 'configure'
> (with many options, pointer to exact invocation below) by 'make' ends in
> tears:

This also breaks plain builds of r-devel. 

Subdir make is such a pain! Top level Makefile has

SUBDIRS = m4 tools doc etc share src tests
...
@for d in $(SUBDIRS); do \
  (cd $${d} && $(MAKE) R) || exit 1; \
done

so building something in doc that relies on something built in src ends badly. 
Preliminary fiddling indicates that it might work to move doc after src in 
SUBDIRS, but the reverse (src to before doc) did not. 

[snippage]

> The problem does manifest itself when the svn directory is not 'fresh' as in
> the case of the Docker builds as ../bin/R exists from the previous builds.
> 
> Dirk
> 

Er, a "not" went missing in there??

-p

> -- 
> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] String encoding problem

2016-07-07 Thread peter dalgaard

> On 07 Jul 2016, at 18:15 , Hadley Wickham  wrote:
> 
> Right - I'm aware of that.  But to me, it doesn't seem correct to
> print a string that is not a valid R string. Why is an unknown
> encoding printed like UTF-8?
> 

It isn't -- no UTF-8 would have the \xbf. I may be flogging a dead horse, but 
it seems to me that there are three alternatives:

- refuse the input (x <- "\xc9\x82\xbf" gives "sorry, not a UTF-8 string" or so)
- refuse to print it (print(x) gives "cannot print non-UTF-8 string")
- what happens now

and a fourth one might be to actually allow mixing of \u0007 and \x07 and \007, 
but I suspect that there are demons down the line which is why it is not 
happening now. (Does it ring a bell with anyone?)

-pd


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Latest R-devel build failing on OS X

2016-05-24 Thread peter dalgaard
I had a regression in config.site so the nightly build didn't. Retrying

Looks like it will build, but the ctl-R, ctl-C bug is still present on OSX 
(w/Simon's libs). This _was_ fixed for a while, was it not?

(The NEWS entry is also wrong: The issue existed before readline 6.3)

-pd

On 24 May 2016, at 12:55 , Martin Maechler  wrote:

> 
> Can you (Frederick, Peter, Keith, but ideally others, too)
> confirm that you don't see any problems anymore, when building a
> version of R-devel from sources that are newer 
> than (or equal to)  svn revision 70632 (2016-05-19 10:59:51, see below)?
> 
> I'm asking because the question is open if these should be
> "back ported" to R 3.3.0 patched or not.
> 
> Best regards,
> Martin
> 
>>>>>> Martin Maechler 
>>>>>>     on Thu, 19 May 2016 11:02:48 +0200 writes:
> 
>>>>>>  
>>>>>>     on Wed, 18 May 2016 15:03:31 -0700 writes:
> 
>>>> Readline <= 6.2 shouldn't require the SIGWINCH patch, so
>>>> if older versions have trouble finding rl_resize_terminal
>>>> then you could wrap a macro around that part.
> 
>>> I find python related patches that use
> 
>>> #ifdef HAVE_RL_RESIZE_TERMINAL
> 
>>> so they must have configured for that.  We could and
>>> probably should do the same, but as a Linux_only guy
>>> currently (even basically only one flavor of Linux), I'd
>>> appreciate others to produce code for that.
> 
>> Actually that was easy (in hindsight.. I took too long!)
>> enough, so I've now committed
> 
>> 
>> r70632 | maechler | 2016-05-19 10:59:51 +0200 (Thu, 19 May 2016) | 1 line
>> Changed paths:
>> M configure
>> M configure.ac
>> M src/include/config.h.in
>> M src/unix/sys-std.c
> 
>> check for rl_resize_terminal() now
>> 
> 
>> ... and Keith should even not see the warning anymore
>> (nor Peter the error, when compiling using readline 5.x instead of 6.[23]).
> 
> 
>[...]

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Latest R-devel build failing on OS X

2016-05-18 Thread peter dalgaard
Spoke too soon, both systems now build, but neither has the original bugs 
fixed

(Incidentally, I realized why the ctl-R...ctl-C bug never bit me: The emacs 
habit is to exit isearch with ctl-G and that works flawlessly.) 

-pd

> On 18 May 2016, at 22:40 , peter dalgaard  wrote:
> 
> Ah, got it. For some ancient reason config.site on that machine had 
> 
> LDFLAGS=-L/usr/X11R6/lib
> 
> in config.site, and that prevented configure from inserting -L 
> /usr/local/lib, so linked /usr/lib/libreadline.dylib, which is the 
> Apple-supplied one, which possibly is really libedit...
> 
> -p
> 
> 
>> On 18 May 2016, at 22:01 , peter dalgaard  wrote:
>> 
>> gcc   -L/usr/X11R6/lib -o R.bin Rmain.o CommandLineArgs.o Rdynload.o 
>> Renviron.o RNG.o agrep.o apply.o arithmetic.o array.o attrib.o bind.o 
>> builtin.o character.o coerce.o colors.o complex.o connections.o context.o 
>> cum.o dcf.o datetime.o debug.o deparse.o devices.o dotcode.o dounzip.o 
>> dstruct.o duplicate.o edit.o engine.o envir.o errors.o eval.o format.o 
>> gevents.o gram.o gram-ex.o graphics.o grep.o identical.o inlined.o inspect.o 
>> internet.o iosupport.o lapack.o list.o localecharset.o logic.o main.o 
>> mapply.o match.o memory.o names.o objects.o options.o paste.o platform.o 
>> plot.o plot3d.o plotmath.o print.o printarray.o printvector.o printutils.o 
>> qsort.o radixsort.o random.o raw.o registration.o relop.o rlocale.o 
>> saveload.o scan.o seq.o serialize.o sort.o source.o split.o sprintf.o 
>> startup.o subassign.o subscript.o subset.o summary.o sysutils.o times.o 
>> unique.o util.o version.o g_alab_her.o g_cntrlify.o g_fontdb.o g_her_glyph.o 
>> xxxpr.o   `ls ../unix/*.o ../appl/*.o ../nmath/*.o` ../extra/tre/libtre.a 
>> ../extra/intl/libintl.a ../extra/tzone/libtz.a -L../../lib/x86_64 -lRblas 
>> -L/usr/lib/gcc/i686-apple-darwin11/4.2.1/x86_64 
>> -L/usr/lib/gcc/i686-apple-darwin11/4.2.1 -L/usr/lib -lgfortran 
>> -Wl,-framework -Wl,CoreFoundation -lreadline  -lpcre -llzma -lbz2 -lz 
>> -licucore -lm -llzma -liconv
>> Undefined symbols for architecture x86_64:
>> "_rl_resize_terminal", referenced from:
>> _Rstd_ReadConsole in sys-std.o
>> ld: symbol(s) not found for architecture x86_64
>> clang: error: linker command failed with exit code 1 (use -v to see 
>> invocation)
>> make[3]: *** [R.bin] Error 1
>> make[2]: *** [R] Error 2
>> make[1]: *** [R] Error 1
>> make: *** [R] Error 1
>> 
>> and the MBAir still builds... 
>> 
>> Confused!,
>> 
>> -pd
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> 
> 
> 
> 
> 
> 
> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Latest R-devel build failing on OS X

2016-05-18 Thread peter dalgaard
Ah, got it. For some ancient reason config.site on that machine had 

LDFLAGS=-L/usr/X11R6/lib

in config.site, and that prevented configure from inserting -L /usr/local/lib, 
so linked /usr/lib/libreadline.dylib, which is the Apple-supplied one, which 
possibly is really libedit...

-p


> On 18 May 2016, at 22:01 , peter dalgaard  wrote:
> 
> gcc   -L/usr/X11R6/lib -o R.bin Rmain.o CommandLineArgs.o Rdynload.o 
> Renviron.o RNG.o agrep.o apply.o arithmetic.o array.o attrib.o bind.o 
> builtin.o character.o coerce.o colors.o complex.o connections.o context.o 
> cum.o dcf.o datetime.o debug.o deparse.o devices.o dotcode.o dounzip.o 
> dstruct.o duplicate.o edit.o engine.o envir.o errors.o eval.o format.o 
> gevents.o gram.o gram-ex.o graphics.o grep.o identical.o inlined.o inspect.o 
> internet.o iosupport.o lapack.o list.o localecharset.o logic.o main.o 
> mapply.o match.o memory.o names.o objects.o options.o paste.o platform.o 
> plot.o plot3d.o plotmath.o print.o printarray.o printvector.o printutils.o 
> qsort.o radixsort.o random.o raw.o registration.o relop.o rlocale.o 
> saveload.o scan.o seq.o serialize.o sort.o source.o split.o sprintf.o 
> startup.o subassign.o subscript.o subset.o summary.o sysutils.o times.o 
> unique.o util.o version.o g_alab_her.o g_cntrlify.o g_fontdb.o g_her_glyph.o 
> xxxpr.o   `ls ../unix/*.o ../appl/*.o ../nmath/*.o!
 ` ../extra/tre/libtre.a ../extra/intl/libintl.a ../extra/tzone/libtz.a 
-L../../lib/x86_64 -lRblas -L/usr/lib/gcc/i686-apple-darwin11/4.2.1/x86_64 
-L/usr/lib/gcc/i686-apple-darwin11/4.2.1 -L/usr/lib -lgfortran
-Wl,-framework -Wl,CoreFoundation -lreadline  -lpcre -llzma -lbz2 -lz -licucore 
-lm -llzma -liconv
> Undefined symbols for architecture x86_64:
>  "_rl_resize_terminal", referenced from:
>  _Rstd_ReadConsole in sys-std.o
> ld: symbol(s) not found for architecture x86_64
> clang: error: linker command failed with exit code 1 (use -v to see 
> invocation)
> make[3]: *** [R.bin] Error 1
> make[2]: *** [R] Error 2
> make[1]: *** [R] Error 1
> make: *** [R] Error 1
> 
> and the MBAir still builds... 
> 
> Confused!,
> 
> -pd

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Latest R-devel build failing on OS X

2016-05-18 Thread peter dalgaard
lib -lgfortran
>> -Wl,-framework -Wl,CoreFoundation -lreadline  -lpcre -llzma -lbz2 -lz 
>> -licucore -lm -llzma -liconv
>> Undefined symbols for architecture x86_64:
>> "_rl_mark", referenced from:
>> _popReadline in sys-std.o
>> "_rl_readline_state", referenced from:
>> _popReadline in sys-std.o
>> "_rl_resize_terminal", referenced from:
>> _Rstd_ReadConsole in sys-std.o
>> ld: symbol(s) not found for architecture x86_64
>> clang: error: linker command failed with exit code 1 (use -v to see 
>> invocation)
>> make[3]: *** [R.bin] Error 1
> 
>> On 18 May 2016, at 14:18 , Keith O'Hara  wrote:
> 
>>> Dear R-devel,
>>> 
>>> The latest version of R-devel (05-17) is throwing an error for me when 
>>> building on OS X (v 10.11.4):
>>> 
>>> making Rembedded.d from Rembedded.c
>>> making dynload.d from dynload.c
>>> making system.d from system.c
>>> making sys-unix.d from sys-unix.c
>>> making sys-std.d from sys-std.c
>>> making X11.d from X11.c
>>> clang -I. -I../../src/include -I../../src/include  -I/usr/local/include 
>>> -DHAVE_CONFIG_H-fPIC  -Wall -mtune=core2 -g -O2  -c Rembedded.c -o 
>>> Rembedded.o
>>> clang -I. -I../../src/include -I../../src/include  -I/usr/local/include 
>>> -DHAVE_CONFIG_H-fPIC  -Wall -mtune=core2 -g -O2  -c dynload.c -o 
>>> dynload.o
>>> clang -I. -I../../src/include -I../../src/include  -I/usr/local/include 
>>> -DHAVE_CONFIG_H-fPIC  -Wall -mtune=core2 -g -O2  -c system.c -o system.o
>>> clang -I. -I../../src/include -I../../src/include  -I/usr/local/include 
>>> -DHAVE_CONFIG_H-fPIC  -Wall -mtune=core2 -g -O2  -c sys-unix.c -o 
>>> sys-unix.o
>>> clang -I. -I../../src/include -I../../src/include  -I/usr/local/include 
>>> -DHAVE_CONFIG_H-fPIC  -Wall -mtune=core2 -g -O2  -c sys-std.c -o 
>>> sys-std.o
>>> sys-std.c:592:5: warning: implicit declaration of function 'RL_UNSETSTATE' 
>>> is invalid in C99
>>>[-Wimplicit-function-declaration]
>>>  RL_UNSETSTATE(RL_STATE_ISEARCH | RL_STATE_NSEARCH | RL_STATE_VIMOTION |
>>>  ^
>>> sys-std.c:592:19: error: use of undeclared identifier 'RL_STATE_ISEARCH'
>>>  RL_UNSETSTATE(RL_STATE_ISEARCH | RL_STATE_NSEARCH | RL_STATE_VIMOTION |
>>>^
>>> sys-std.c:592:38: error: use of undeclared identifier 'RL_STATE_NSEARCH'
>>>  RL_UNSETSTATE(RL_STATE_ISEARCH | RL_STATE_NSEARCH | RL_STATE_VIMOTION |
>>>   ^
>>> sys-std.c:592:57: error: use of undeclared identifier 'RL_STATE_VIMOTION'
>>>  RL_UNSETSTATE(RL_STATE_ISEARCH | RL_STATE_NSEARCH | RL_STATE_VIMOTION |
>>>  ^
>>> sys-std.c:593:5: error: use of undeclared identifier 'RL_STATE_NUMERICARG'
>>>RL_STATE_NUMERICARG | RL_STATE_MULTIKEY);
>>>^
>>> sys-std.c:593:27: error: use of undeclared identifier 'RL_STATE_MULTIKEY'
>>>RL_STATE_NUMERICARG | RL_STATE_MULTIKEY);
>>>  ^
>>> sys-std.c:596:40: error: use of undeclared identifier 'rl_mark'
>>>  rl_line_buffer[rl_point = rl_end = rl_mark = 0] = 0;
>>> ^
>>> sys-std.c:597:5: error: use of undeclared identifier 'rl_done'
>>>  rl_done = 1;
>>>  ^
>>> sys-std.c:998:7: warning: implicit declaration of function 
>>> 'rl_resize_terminal' is invalid in C99
>>>[-Wimplicit-function-declaration]
>>>  rl_resize_terminal();
>>>  ^
>>> 2 warnings and 7 errors generated.
>>> make[3]: *** [sys-std.o] Error 1
>>> make[2]: *** [R] Error 2
>>> make[1]: *** [R] Error 1
>>> make: *** [R] Error 1
>>> 
>>> 
>>> 
>>> 
>>> My configuration information:
>>> 
>>> R is now configured for x86_64-apple-darwin15.4.0
>>> 
>>> Source directory:          .
>>> Installation directory:    /Builds/R-devel
>>> 
>>> C compiler:                clang  -Wall -mtune=core2 -g -O2
>>> Fortran 77 compiler:       gfortran-4.8  -g -O2
>>> 
>>> C++ compiler:              clang++  -Wall -mtune=core2 -g -O2
>>> C++11 compiler:            clang++  -std=c++11 -Wall -mtune=core2 -g -O2
>>> Fortran 90/95 compiler:    gfortran-4.8 -Wall -g -O2

Re: [Rd] Latest R-devel build failing on OS X

2016-05-18 Thread peter dalgaard
'rl_resize_terminal' is invalid in C99
> [-Wimplicit-function-declaration]
>   rl_resize_terminal();
>   ^
> 2 warnings and 7 errors generated.
> make[3]: *** [sys-std.o] Error 1
> make[2]: *** [R] Error 2
> make[1]: *** [R] Error 1
> make: *** [R] Error 1
> 
> 
> 
> 
> My configuration information:
> 
> R is now configured for x86_64-apple-darwin15.4.0
> 
> Source directory:          .
> Installation directory:    /Builds/R-devel
> 
> C compiler:                clang  -Wall -mtune=core2 -g -O2
> Fortran 77 compiler:       gfortran-4.8  -g -O2
> 
> C++ compiler:              clang++  -Wall -mtune=core2 -g -O2
> C++11 compiler:            clang++  -std=c++11 -Wall -mtune=core2 -g -O2
> Fortran 90/95 compiler:    gfortran-4.8 -Wall -g -O2
> Obj-C compiler:            clang -Wall -mtune=core2 -g -O2 -fobjc-exceptions
> 
> Interfaces supported:      aqua, tcltk
> External libraries:        readline, BLAS(OpenBLAS), LAPACK(in blas), curl
> Additional capabilities:   PNG, JPEG, TIFF, NLS, cairo, ICU
> Options enabled:           shared R library, R profiling, memory profiling
> 
> Capabilities skipped:      
> Options not enabled:       shared BLAS
> 
> Recommended packages:  yes
> 
> 
> Apologies in advance if I have incorrectly formatted the issue or omitted 
> something important.
> 
> Kind regards,
> Keith
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] where to send patches to R source code

2016-05-13 Thread peter dalgaard
Actually, I think both Martin and I use readline R versions on a daily basis 
(Linux and OSX Terminal respectively). In my case, it is just that I rarely 
use the backwards search feature and I'm old enough that terminal widths other 
than 80 look odd to me so I don't resize much either. Of course that doesn't 
mean that the bugs aren't annoying to others!

-pd



On 13 May 2016, at 00:23 , frede...@ofb.net wrote:

> Hi Peter, Martin, and others,
> 
> Thanks for your replies.
> 
> - The bugs apply to all systems that use GNU Readline, not just Linux
>  or Arch Linux.
> 
> - Readline version 6.3 changed the signal handling so that SIGWINCH is
>  no longer handled automatically by the library. This means it's not
>  currently possible for people using R on e.g. Linux to resize the
>  terminal, or at least when they do so they have to make sure that
>  all their commands fit in one line and don't wrap.
> 
> - There is also a long-standing bug in Readline where the callback
>  interface didn't properly clear the line on SIGINT (^C). This means
>  that "exiting" reverse-incremental-search with ^C would give an
>  apparently empty prompt which still had some pending input, so if
>  you hit ^C-Return then an unintended command would get executed.
> 
> If they're not "bothering all that many people", then perhaps it's
> because everyone uses Windows or Mac OS X or RStudio. For me these are
> pretty significant bugs. The second one causes unintended code to be
> executed. Random code could delete files, for example, or worse. The
> first one bites me every time I want to change the size of a window,
> which is pretty often.
> 
> I tried to get Readline maintainer Chet Ramey to fix these on the
> Readline side, but he disagreed with my proposal:
> 
> https://lists.gnu.org/archive/html/bug-readline/2016-04/threads.html
> 
> I'm glad that my message here at least was seen and I hope that
> someone who uses the R command line on Linux will have time to verify
> that the patches work correctly. They are basically Chet-approved
> workarounds for bugs/changes in Readline, not very complicated.
> 
> Do either of you know a Linux R person you could ping to get these
> patches checked out? I'm not overly frustrated, and I'm not in a major
> hurry, but from what we've observed it seems like waiting for someone
> concerned to come along and finally read Bugzilla or the R-Devel
> archives is not going to result in a very dense Poisson process...
> 
> Thanks,
> 
> Frederick Eaton
> 
> On Thu, May 12, 2016 at 03:42:59PM +0200, peter dalgaard wrote:
>> 
>>> On 12 May 2016, at 10:03 , Martin Maechler  
>>> wrote:
>>> 
>>>>>>>> 
>>>>>>>>   on Wed, 11 May 2016 23:00:20 -0700 writes:
>>> 
>>>> Dear R Developers,
>>>> I wrote to this list a week ago with some patches that fix bugs in R's
>>>> GNU Readline interface, but I haven't had a reply. I'm not in a hurry
>>>> but I'd like to make sure that my message is getting read by the right
>>>> people. Should I be sending my patches somewhere else?
>>> 
>>> Thank you Frederick for your reports and patches.
>>> You did send them to the correct place, https://bugs.r-project.org/
>>> 
>>> Sometimes (as here) a combination of circumstances do lead to
>>> nobody picking them up quickly.
>>> In this case,
>>> 
>>> - probably none of R-core use or even have easy access to Arch Linux
>>> so we cannot easily verify that there is a bug at all
>>> nor -consequently- verify that your patch does fix the bug.
>> 
>> Actually, the bugs look like they should apply fairly generally, just maybe 
>> not bothering all that many people. But there could be portability issues 
>> with the fixes, so I suspect some of us were waiting for "a readline expert" 
>> to check them out.
>> 
>> -pd
>> 
>> BTW: Anyone with a fix for the stuck-at-eol issue? (aaabbbccc) 
>> 
>>> 
>>> - no other user has confirmed the bug on his/her platform, so
>>> there did not seem a huge demand...
>>> 
>>> - Accidentally many in R core may be busy with other bugs, teaching, .
>>> and just lack the extra resources to delve into these problems
>>> at the current moment.
>>> 
>>> Hence, there was not even an 'Acknowledged' change to your
>>> reports--indeed as nobody had been able to see there is a problem
>>> existing outside of your personal computer.
>>> 

Re: [Rd] where to send patches to R source code

2016-05-12 Thread peter dalgaard

> On 12 May 2016, at 10:03 , Martin Maechler  wrote:
> 
>>>>>>  
>>>>>>on Wed, 11 May 2016 23:00:20 -0700 writes:
> 
>> Dear R Developers,
>> I wrote to this list a week ago with some patches that fix bugs in R's
>> GNU Readline interface, but I haven't had a reply. I'm not in a hurry
>> but I'd like to make sure that my message is getting read by the right
>> people. Should I be sending my patches somewhere else?
> 
> Thank you Frederick for your reports and patches.
> You did send them to the correct place, https://bugs.r-project.org/
> 
> Sometimes (as here) a combination of circumstances do lead to
> nobody picking them up quickly.
> In this case,
> 
> - probably none of R-core use or even have easy access to Arch Linux
>  so we cannot easily verify that there is a bug at all
>  nor -consequently- verify that your patch does fix the bug.

Actually, the bugs look like they should apply fairly generally, just maybe not 
bothering all that many people. But there could be portability issues with the 
fixes, so I suspect some of us were waiting for "a readline expert" to check 
them out.

-pd

BTW: Anyone with a fix for the stuck-at-eol issue? (aaabbbccc) 

> 
> - no other user has confirmed the bug on his/her platform, so
>  there did not seem a huge demand...
> 
> - Accidentally many in R core may be busy with other bugs, teaching, .
>  and just lack the extra resources to delve into these problems
>  at the current moment.
> 
> Hence, there was not even an 'Acknowledged' change to your
> reports--indeed as nobody had been able to see there is a problem
> existing outside of your personal computer.
> 
> I agree that this must seem a bit frustrating to you.
> 
> --
> Martin
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] boxplot with formula involving two-factor levels

2016-04-29 Thread peter dalgaard

> On 29 Apr 2016, at 23:54 , d...@stat.oregonstate.edu wrote:
> 
> Hi,
> 
> I noticed two seemingly equivalent call to boxplot will give different plots 
> (in the way how the combined factor levels are arranged on the x-axis): 
> 
> x = factor(rep(c("a", "b", "c"), each=2));
> y = rep(factor(c("one", "two")), each=3);
> r = 3;
> n = r * 6;
> x = rep(x, 3);
> y = rep(y, 3);
> z = rnorm(n);
> 
> par(mfrow=c(2,1));
> 
> ## The following two seemingly equivalent calls to boxplot give different 
> results
> boxplot(z~x:y);
> 
> f = x:y;
> boxplot(z~f);
> 
> This is puzzling to me. Is this normal? 

Normal, but a little odd. The root cause is the difference between x:y and 
interaction(x,y), documented on the help page for the latter.
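
For instance (a minimal sketch -- the orderings below are what I'd expect from
memory, so do check levels() on the actual factors):

  x <- factor(rep(c("a", "b"), 3)); y <- factor(rep(c("one", "two"), each = 3))
  levels(x:y)                # "a:one" "a:two" "b:one" "b:two" (lexicographic)
  levels(interaction(x, y))  # "a.one" "b.one" "a.two" "b.two" (first varies fastest)

boxplot() draws the groups in level order, hence the different x-axis layouts.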

-pd


> Thanks!
> 
> Best,
> Yanming
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] residual standard "error"

2016-04-28 Thread peter dalgaard

On 28 Apr 2016, at 10:00 , Martin Maechler  wrote:

>>>>>> Randall Pruim 
>>>>>>on Sun, 17 Apr 2016 13:54:28 + writes:
> 
>> I see that the sigma() function has recently been introduced into R 3.3.  
>> The help for sigma() says:
>> Extract the estimated standard deviation of the errors, the “residual 
>> standard deviation” (misnomed also “residual standard error”, e.g., in 
>> summary.lm()'s output, from a fitted model.
> 
>> Is there any reason not to fix the mis-naming of residual standard error now 
>> too?  Both functions are in the stats package.  It seems odd for one 
>> function in the package to point out an issue with another, especially when 
>> fixing it would only affect displayed output and not the rest of the API.
> 
> Yes, there is a reason, believe it or not, it is called "tradition".
> 
> 1) The tradition of some/many(?) statistics text books who use the same 
> misnomer
> 2) The "tradition" of S+ (S, S-PLUS) before R and R ever since,
>   with its output re-printed in lecture notes and other text books.
> 
> For that reason I did not dare to change it in print.summary.lm(),
> even though I could have been one of the few to change it at a time
> when R was still in its infancy (years *before* it got to
> version 1.0.0 on Feb. 29, 2000).

You may want to Google "Standard error of the estimate" (and weep...)

-pd

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Fresh build from source of R-3.2.5 failing "make check" under 64-bit Ubuntu [SOLVED]

2016-04-22 Thread peter dalgaard

> On 22 Apr 2016, at 00:25 , Mark Dalphin  
> wrote:
> 
>> 
>> mount | grep /opt/
> myHost.science:/mnt/home/opt/apps on /opt/apps type nfs 
> (rw,soft,bg,nfsvers=3,addr=XXX.XXX.XXX.XXX)
> 
> Just the NFS, I guess. This is not good, but not an R-Devel problem.

On a hunch: Check out what the "soft" mount option does. I seem to recall some 
complications related to immediacy of operations.

-pd

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Fresh build from source of R-3.2.5 failing "make check" under 64-bit Ubuntu

2016-04-21 Thread peter dalgaard
y Depends   Imports LinkingTo Suggests
> myTst "myTst" "myLib" "1.0"   NA   "methods" NA  NANA 
>  Enhances License License_is_FOSS
> myTst NA   "What license is it under?" NA
>  License_restricts_use OS_type MD5sum NeedsCompilation Built 
> myTst NANA  NA NA   "3.2.5"
>> stopifnot(identical(res[,"Package"], setNames(,sort(c(p.lis, "myTst",
> +   res[,"LibPath"] == "myLib")
> Error: identical(res[, "Package"], setNames(, sort(c(p.lis, "myTst"
> is not TRUE
> Execution halted
> 
>> ls myLib
> exNSS4  myTst  pkgA  pkgB
> 
> So, it looks like the "installed.packages()" is not correctly detecting
> what is present in "myLib".
> 
> Has anyone else seen this? Has anyone got any ideas about what is going
> wrong? My environment does not include R_LIBS, LD_LIBRARY, etc. The PATH
> does include an older version of R, 3.1.1.
> 
> Regards,
> Mark Dalphin
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Benchmarks for R

2016-04-21 Thread peter dalgaard
make check-all 

?

On 21 Apr 2016, at 10:23 , Francisco Banha  wrote:

> Hello,
> 
> I'm currently working on a project where I have to make chages to the source 
> code of R itself. By doing this it's possible that I mess something up and 
> make R stop working correctly.
> Can anyone tell me if there are some benchmarks that test the whole R 
> language or at least the most important part of it?
> Thank you.
> 
> Best regards,
> Francisco
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Building R-patched and R-devel fails

2016-04-18 Thread peter dalgaard

On 17 Apr 2016, at 17:09 , Uwe Ligges  wrote:

> 
> 
> On 17.04.2016 11:01, Prof Brian Ripley wrote:
>> On 17/04/2016 07:25, Berwin A Turlach wrote:
>>> G'day all,
>>> 
>>> probably you have noticed this by now, but I thought I ought to report
>>> it. :)
>> 
>> Already fixed for Unix by the time this reached me.  Since that version
>> of Survival has been put into 3.2 patched, that also needed its
>> Makefile.in updated.
>> 
>> I've left Windows changes to those with a Windows build system.
> 
> 
> Done for R-devel and the 3.3 branch just in case something happens on other 
> machines. At least on my machine the build happened in an appropriate order 
> anyway.

Great.

In case you missed Terry's note: This dependency transpires to be due to a very 
recent addition to survival, which explains why we were never bitten 
by it in the past. And since M comes before s in most collations, it probably 
only affected parallel builds anyway. And even in that case, the outcome of 
race conditions is undetermined.

-pd


> 
> Best,
> Uwe
> 
>>> 
>>> My scripts that update the SVN sources for R-patched and R-devel, run
>>> `tools/rsync-recommended' (for both) and then install both these
>>> versions from scratch failed this morning.  Apparently the new version
>>> of the recommended package `survival' depends on the recommended
>>> package `Matrix', but the makefile does not ensure that Matrix is built
>>> before survival.  So my log files had entries:
>>> 
>>>   [...]
>>>   ERROR: dependency ‘Matrix’ is not available for package ‘survival’
>>>   * removing ‘/opt/src/R-devel-build/library/survival’
>>>   Makefile:51: recipe for target 'survival.ts' failed
>>>   make[2]: *** [survival.ts] Error 1
>>>   make[2]: *** Waiting for unfinished jobs
>>>   [...]
>>> 
>>> Presumably, in both branches, the files Makefile.in and Makefile.win in
>>> src/library/Recommended have to be adapted to contain the following
>>> line at the end among the "Hardcoded dependencies":
>>> 
>>>survival.ts: Matrix.ts
>>> 
>>> Cheers,
>>> 
>>>Berwin
>>> 
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> 
>> 
>> 
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] sys.function(0)

2016-03-28 Thread peter dalgaard
Dunno, really. Some strange things can happen with nonstandard evaluation, like 
having a function designed to evaluate something in the parent of its caller, 
but nonetheless sometimes being called from the command line. So things are 
sometimes defensively coded.
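
A sketch of the pattern (hypothetical toy functions, not from any package):

  f <- function() parent.frame(2)   # two generations up from f
  g <- function() f()
  g()   # R_GlobalEnv: the caller of g
  f()   # also R_GlobalEnv: asked past the top of the stack, yet no error

so stack-climbing code keeps working when someone calls it at the prompt.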

-pd

> On 28 Mar 2016, at 00:08 , Mick Jordan  wrote:
> 
> A related question is why are sys.parent/parent.frame so permissive in their 
> error checking? E.g:
> 
> > sys.parent(-1)
> [1] 0
> > sys.parent(-2)
> [1] 0
> > sys.parent(1)
> [1] 0
> > sys.parent(2)
> [1] 0
> > parent.frame(4)
> <environment: R_GlobalEnv>
> >

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] sys.function(0)

2016-03-27 Thread peter dalgaard

> On 27 Mar 2016, at 22:05 , Mick Jordan  wrote:
> 
> As I understand 
> https://stat.ethz.ch/R-manual/R-devel/library/base/html/sys.parent.html
> sys.function(n) returns the function associated with stack frame n. 
> Since frame 0 is defined as .GlobalEnv which is not associated with a 
> function, I would expect this to always return NULL. However, it does not:
> 
>> sys.function()
> NULL
>> f <- function(x) sys.function(x)
>> f(0)
> function(x) sys.function(x)
>> f(1)
> function(x) sys.function(x)
>> f(2)
> Error in sys.function(x) : not that many frames on the stack
> 
> Why the different behavior when sys.function(0) is called inside another 
> function?

This is a documentation bug. The case "which = 0" differs between sys.frame() 
and sys.call()/sys.function(). For the latter, it means the current 
call/function, whereas sys.frame(0) is always the global envir. It is pretty 
clear from the underlying C code that the three functions treat their argument 
differently:

R_sysframe has

    if (n == 0)
        return(R_GlobalEnv);

    if (n > 0)
        n = framedepth(cptr) - n;
    else
        n = -n;

whereas the other two (R_syscall and R_sysfunction) omit the special treatment 
for n==0. Without this, n==0, comes out unchanged from the if-construct, 
indicating that one should go 0 frames up the stack (same as 
n==framedepth(cptr)).

Obviously, it won't work to document the "which" argument identically for all 
three functions...  
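
An R-level sketch of the asymmetry:

  f <- function() identical(sys.frame(0), globalenv())
  f()    # TRUE: sys.frame(0) is always the global environment
  g <- function() sys.function(0)
  g()    # g itself: for sys.function, 0 means "the current function"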

-pd


> 
> Mick Jordan
> 
> 
>   [[alternative HTML version deleted]]
> 
> ______
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] summary( prcomp(*, tol = .) ) -- and 'rank.'

2016-03-25 Thread peter dalgaard

> On 25 Mar 2016, at 10:08 , Jari Oksanen  wrote:
> 
>> 
>> On 25 Mar 2016, at 10:41 am, peter dalgaard  wrote:
>> 
>> As I see it, the display showing the first p << n PCs adding up to 100% of 
>> the variance is plainly wrong. 
>> 
>> I suspect it comes about via a mental short-circuit: If we try to control p 
>> using a tolerance, then that amounts to saying that the remaining PCs are 
>> effectively zero-variance, but that is (usually) not the intention at all. 
>> 
>> The common case is that the remainder terms have a roughly _constant_, 
>> small-ish variance and are interpreted as noise. Of course the magnitude of 
>> the noise is important information.  
>> 
> But then you should use Factor Analysis which has that concept of “noise” 
> (unlike PCA).

Actually, FA has a slightly different concept of noise. PCA can be interpreted 
as a purely technical operation, but also as an FA variant with same variance 
for all components.

Specifically, FA is 

Sigma = LL' + Psi

with Psi a diagonal matrix. If Psi = sigma^2 I , then L can be determined (up 
to rotation) as the first p components of PCA. (This is used in ML algorithms 
for FA since it allows you to concentrate the likelihood to be a function of 
Psi.)
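
In LaTeX notation, one standard way to write that result (a sketch, with L
written as \Lambda and the first p eigenpairs of Sigma):

  \Sigma = \Lambda\Lambda^\top + \sigma^2 I
  \quad\Longrightarrow\quad
  \Lambda = U_p \,\mathrm{diag}\bigl(\sqrt{\lambda_1 - \sigma^2}, \ldots,
            \sqrt{\lambda_p - \sigma^2}\bigr)\, R

where U_p holds the first p eigenvectors of \Sigma, \lambda_i the
corresponding eigenvalues, and R is an arbitrary rotation.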

Methods like PC regression are not being very specific about the model, but the 
underlying line of thought is that PCs with small variances are 
"uninformative", so that you can make do with only the first handful 
regressors. I tend to interpret "uninformative" as "noise-like" in these 
contexts.

-pd

> 
> Cheers, Jari Oksanen
> 
>>> On 25 Mar 2016, at 00:02 , Steve Bronder  wrote:
>>> 
>>> I agree with Kasper, this is a 'big' issue. Does your method of taking only
>>> n PCs reduce the load on memory?
>>> 
>>> The new addition to the summary looks like a good idea, but Proportion of
>>> Variance as you describe it may be confusing to new users. Am I correct in
>>> saying Proportion of variance describes the amount of variance with respect
>>> to the number of components the user chooses to show? So if I only choose
>>> one I will explain 100% of the variance? I think showing 'Total Proportion
>>> of Variance' is important if that is the case.
>>> 
>>> 
>>> Regards,
>>> 
>>> Steve Bronder
>>> Website: stevebronder.com
>>> Phone: 412-719-1282
>>> Email: sbron...@stevebronder.com
>>> 
>>> 
>>> On Thu, Mar 24, 2016 at 2:58 PM, Kasper Daniel Hansen <
>>> kasperdanielhan...@gmail.com> wrote:
>>> 
>>>> Martin, I fully agree.  This becomes an issue when you have big matrices.
>>>> 
>>>> (Note that there are awesome methods for actually only computing a small
>>>> number of PCs (unlike your code which uses svn which gets all of them);
>>>> these are available in various CRAN packages).
>>>> 
>>>> Best,
>>>> Kasper
>>>> 
>>>> On Thu, Mar 24, 2016 at 1:09 PM, Martin Maechler <
>>>> maech...@stat.math.ethz.ch
>>>>> wrote:
>>>> 
>>>>> Following from the R-help thread of March 22 on "Memory usage in prcomp",
>>>>> 
>>>>> I've started looking into adding an optional   'rank.'  argument
>>>>> to prcomp  allowing to more efficiently get only a few PCs
>>>>> instead of the full p PCs, say when p = 1000 and you know you
>>>>> only want 5 PCs.
>>>>> 
>>>>> (https://stat.ethz.ch/pipermail/r-help/2016-March/437228.html)
>>>>> 
>>>>> As it was mentioned, we already have an optional 'tol' argument
>>>>> which allows *not* to choose all PCs.
>>>>> 
>>>>> When I do that,
>>>>> say
>>>>> 
>>>>>   C <- chol(S <- toeplitz(.9 ^ (0:31))) # Cov.matrix and its root
>>>>>   all.equal(S, crossprod(C))
>>>>>   set.seed(17)
>>>>>   X <- matrix(rnorm(32000), 1000, 32)
>>>>>   Z <- X %*% C  ## ==>  cov(Z) ~=  C'C = S
>>>>>   all.equal(cov(Z), S, tol = 0.08)
>>>>>   pZ <- prcomp(Z, tol = 0.1)
>>>>>   summary(pZ) # only ~14 PCs (out of 32)
>>>>> 
>>>>> I get for the last line, the   summary.prcomp(.) call :
>>>>> 
>>>>>> summary(pZ) # only ~14 PCs (out of 32)
>>>>> Importance of components:
>>>>>

Re: [Rd] summary( prcomp(*, tol = .) ) -- and 'rank.'

2016-03-25 Thread peter dalgaard
As I see it, the display showing the first p << n PCs adding up to 100% of the 
variance is plainly wrong. 

I suspect it comes about via a mental short-circuit: If we try to control p 
using a tolerance, then that amounts to saying that the remaining PCs are 
effectively zero-variance, but that is (usually) not the intention at all. 

The common case is that the remainder terms have a roughly _constant_, 
small-ish variance and are interpreted as noise. Of course the magnitude of the 
noise is important information.  
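
A small sketch of the two readings (USArrests is just a convenient stand-in):

  p <- prcomp(USArrests, scale. = TRUE)
  v <- p$sdev^2
  v[1:2] / sum(v[1:2])   # renormalized over the kept PCs -- sums to 1
  v[1:2] / sum(v)        # share of the *full* variance -- the usual quantity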

-pd

> On 25 Mar 2016, at 00:02 , Steve Bronder  wrote:
> 
> I agree with Kasper, this is a 'big' issue. Does your method of taking only
> n PCs reduce the load on memory?
> 
> The new addition to the summary looks like a good idea, but Proportion of
> Variance as you describe it may be confusing to new users. Am I correct in
> saying Proportion of variance describes the amount of variance with respect
> to the number of components the user chooses to show? So if I only choose
> one I will explain 100% of the variance? I think showing 'Total Proportion
> of Variance' is important if that is the case.
> 
> 
> Regards,
> 
> Steve Bronder
> Website: stevebronder.com
> Phone: 412-719-1282
> Email: sbron...@stevebronder.com
> 
> 
> On Thu, Mar 24, 2016 at 2:58 PM, Kasper Daniel Hansen <
> kasperdanielhan...@gmail.com> wrote:
> 
>> Martin, I fully agree.  This becomes an issue when you have big matrices.
>> 
>> (Note that there are awesome methods for actually only computing a small
> number of PCs (unlike your code which uses svd, which gets all of them);
>> these are available in various CRAN packages).
>> 
>> Best,
>> Kasper
>> 
>> On Thu, Mar 24, 2016 at 1:09 PM, Martin Maechler <
>> maech...@stat.math.ethz.ch
>>> wrote:
>> 
>>> Following from the R-help thread of March 22 on "Memory usage in prcomp",
>>> 
>>> I've started looking into adding an optional   'rank.'  argument
>>> to prcomp  allowing to more efficiently get only a few PCs
>>> instead of the full p PCs, say when p = 1000 and you know you
>>> only want 5 PCs.
>>> 
>>> (https://stat.ethz.ch/pipermail/r-help/2016-March/437228.html)
>>> 
>>> As it was mentioned, we already have an optional 'tol' argument
>>> which allows *not* to choose all PCs.
>>> 
>>> When I do that,
>>> say
>>> 
>>> C <- chol(S <- toeplitz(.9 ^ (0:31))) # Cov.matrix and its root
>>> all.equal(S, crossprod(C))
>>> set.seed(17)
>>> X <- matrix(rnorm(32000), 1000, 32)
>>> Z <- X %*% C  ## ==>  cov(Z) ~=  C'C = S
>>> all.equal(cov(Z), S, tol = 0.08)
>>> pZ <- prcomp(Z, tol = 0.1)
>>> summary(pZ) # only ~14 PCs (out of 32)
>>> 
>>> I get for the last line, the   summary.prcomp(.) call :
>>> 
>>>> summary(pZ) # only ~14 PCs (out of 32)
>>> Importance of components:
>>>                           PC1    PC2    PC3    PC4     PC5     PC6     PC7     PC8
>>> Standard deviation     3.6415 2.7178 1.8447 1.3943 1.10207 0.90922 0.76951 0.67490
>>> Proportion of Variance 0.4352 0.2424 0.1117 0.0638 0.03986 0.02713 0.01943 0.01495
>>> Cumulative Proportion  0.4352 0.6775 0.7892 0.8530 0.89288 0.92001 0.93944 0.95439
>>>                            PC9    PC10    PC11    PC12    PC13   PC14
>>> Standard deviation     0.60833 0.51638 0.49048 0.44452 0.40326 0.3904
>>> Proportion of Variance 0.01214 0.00875 0.00789 0.00648 0.00534 0.0050
>>> Cumulative Proportion  0.96653 0.97528 0.98318 0.98966 0.99500 1.0000
>>>> 
>>> 
>>> which computes the *proportions* as if there were only 14 PCs in
>>> total (but there were 32 originally).
>>> 
>>> I would think that the summary should  or could in addition show
>>> the usual  "proportion of variance explained"  like result which
>>> does involve all 32  variances or std.dev.s ... which are
>>> returned from the svd() anyway, even in the case when I use my
>>> new 'rank.' argument which only returns a "few" PCs instead of
>>> all.
>>> 
>>> Would you think the current  summary() output is good enough or
>>> rather misleading?
>>> 
>>> I think I would want to see (possibly in addition) proportions
>>> with respect to the full variance and not just to the variance
>>> of those f

Re: [Rd] Regression in strptime

2016-03-15 Thread peter dalgaard

> On 15 Mar 2016, at 11:52 , Martin Maechler  wrote:
> 
> 
> o  R version 3.3.0 (Supposedly Educational) prerelease versions will appear 
> starting Monday 2016-03-14. Final release is scheduled for Thursday 
> 2016-04-14.

Oops, that's actually incorrect. It's R-3.2.4-patched for a couple more days. 
Branching for R-3.3.x happens on Thursday. But R-devel snapshots will be of the 
same thing until then.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Regression in strptime

2016-03-12 Thread peter dalgaard
OK, .Internal is not necessary to reproduce oddity in this area. I also see 
things like (notice 1980)

> strptime(paste0(sample(1900:1999,80,replace=TRUE),"/01/01"), "%Y/%m/%d", 
> tz="CET")
 [1] "1942-01-01 CEST" "1902-01-01 CET"  "1956-01-01 CET"  "1972-01-01 CET" 
 [5] "1962-01-01 CET"  "1900-01-01 CET"  "1921-01-01 CET"  "1972-01-01 CET" 
 [9] "1918-01-01 CET"  "1989-01-01 CET"  "1900-01-01 CET"  "1970-01-01 CET" 
[13] "1971-01-01 CET"  "1910-01-01 CET"  "1956-01-01 CET"  "1953-01-01 CET" 
[17] "1964-01-01 CET"  "1932-01-01 CET"  "1968-01-01 CET"  "1990-01-01 CET" 
[21] "1961-01-01 CET"  "1920-01-01 CET"  "1961-01-01 CET"  "1941-01-01 CEST"
[25] "1947-01-01 CET"  "1979-01-01 CET"  "1943-01-01 CET"  "1976-01-01 CET" 
[29] "1951-01-01 CET"  "1912-01-01 CET"  "1983-01-01 CET"  "1985-01-01 CET" 
[33] "1970-01-01 CET"  "1917-01-01 CET"  "1930-01-01 CET"  "1966-01-01 CET" 
[37] "1953-01-01 CET"  "1938-01-01 CET"  "1974-01-01 CET"  "1959-01-01 CET" 
[41] "1984-01-01 CET"  "1928-01-01 CET"  "1970-01-01 CET"  "1959-01-01 CET" 
[45] "1935-01-01 CET"  "1934-01-01 CET"  "1935-01-01 CET"  "1951-01-01 CET" 
[49] "1907-01-01 CET"  "1985-01-01 CET"  "1906-01-01 CET"  "1912-01-01 CET" 
[53] "1966-01-01 CET"  "1944-01-01 CET"  "1952-01-01 CET"  "1936-01-01 CET" 
[57] "1967-01-01 CET"  "1925-01-01 CET"  "1980-01-01 CEST" "1930-01-01 CET" 
[61] "1999-01-01 CET"  "1965-01-01 CET"  "1903-01-01 CET"  "1942-01-01 CET" 
[65] "1917-01-01 CET"  "1995-01-01 CET"  "1939-01-01 CET"  "1949-01-01 CET" 
[69] "1950-01-01 CET"  "1966-01-01 CET"  "1996-01-01 CET"  "1966-01-01 CET" 
[73] "1999-01-01 CET"  "1961-01-01 CET"  "1946-01-01 CET"  "1902-01-01 CET" 
[77] "1983-01-01 CET"  "1981-01-01 CET"  "1949-01-01 CET"  "1977-01-01 CET" 

The issue seems to be present in R-devel but not in (CRAN) 3.2.0

-pd


> On 12 Mar 2016, at 17:43 , Mick Jordan  wrote:
> 
> On 3/12/16 12:33 AM, peter dalgaard wrote:
>>> On 12 Mar 2016, at 00:05 , Mick Jordan  wrote:
>>> 
>>> This is definitely obscure but we had a unit test that called 
>>> .Internal(strptime, "1942/01/01", %Y/%m/%d") with timezone (TZ) set to CET.
>> Umm, that doesn't even parse. And fixing the typo, it doesn't run:
>> 
>>> .Internal(strptime, "1942/01/01", %Y/%m/%d")
>> Error: unexpected SPECIAL in ".Internal(strptime, "1942/01/01", %Y/%"
>>> .Internal(strptime, "1942/01/01", "%Y/%m/%d")
>> Error in .Internal(strptime, "1942/01/01", "%Y/%m/%d") :
>>   3 arguments passed to '.Internal' which requires 1
>> 
>> 
>> 
>>> In R-3.1.3 that returned "1942-01-01 CEST" which, paradoxically, is correct 
>>> as they evidently did strange things in Germany during the war period. Java 
>>> also returns the same. However, R-3.2.4 returns "1942-01-01 CET".
>> Did you mean:
>> 
>> pd$ r-release-branch/BUILD-dist/bin/R
>> 
>> R version 3.2.4 Patched (2016-03-10 r70319) -- "Very Secure Dishes"
>> Copyright (C) 2016 The R Foundation for Statistical Computing
>> Platform: x86_64-apple-darwin13.4.0/x86_64 (64-bit)
>> [...]
>>> strptime("1942/01/01", "%Y/%m/%d", tz="CET")
>> [1] "1942-01-01 CEST"
>> 
>> But then as you see, it does have DST on New Years Day.
>> 
>> All in all, there is something you are not telling us.
>> 
>> Notice that all DST information is OS dependent as it depends on which 
>> version of the "Olson database" is installed.
>> 
>> 
> You are correct that I was sloppy with syntax for the example. We are, for 
> better or worse, calling the .Internal, but actually with a large vector of 
> arguments, of which the 1942 entry is element 82. I can confirm that for the 
> vector of length 1 example that I didn't test but just assumed would also 
> fail, the answer is correct. However, it is not for the full ve

Re: [Rd] Regression in strptime

2016-03-12 Thread peter dalgaard

> On 12 Mar 2016, at 00:05 , Mick Jordan  wrote:
> 
> This is definitely obscure but we had a unit test that called 
> .Internal(strptime, "1942/01/01", %Y/%m/%d") with timezone (TZ) set to CET.

Umm, that doesn't even parse. And fixing the typo, it doesn't run:

> .Internal(strptime, "1942/01/01", %Y/%m/%d")
Error: unexpected SPECIAL in ".Internal(strptime, "1942/01/01", %Y/%"
> .Internal(strptime, "1942/01/01", "%Y/%m/%d")
Error in .Internal(strptime, "1942/01/01", "%Y/%m/%d") : 
  3 arguments passed to '.Internal' which requires 1



> In R-3.1.3 that returned "1942-01-01 CEST" which, paradoxically, is correct 
> as they evidently did strange things in Germany during the war period. Java 
> also returns the same. However, R-3.2.4 returns "1942-01-01 CET".

Did you mean:

pd$ r-release-branch/BUILD-dist/bin/R

R version 3.2.4 Patched (2016-03-10 r70319) -- "Very Secure Dishes"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0/x86_64 (64-bit)
[...]
> strptime("1942/01/01", "%Y/%m/%d", tz="CET")
[1] "1942-01-01 CEST"

But then as you see, it does have DST on New Years Day.

All in all, there is something you are not telling us.

Notice that all DST information is OS dependent as it depends on which version 
of the "Olson database" is installed.


> 
> Mick Jordan
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rmultinom.c error probability not sum to 1

2016-03-10 Thread peter dalgaard

> On 10 Mar 2016, at 21:25 , m.van_iter...@lumc.nl wrote:
> 
> Hi all, 
> 
> I should have given a better explanation of my problem. Here it is.
> 
> I extracted from my code the bit that gives the error. Place this in a file 
> called test.c

Aha. Missing info #1, C not R...

> 
> #include <stdio.h>
> #include <stdlib.h>
> #include <math.h>
> #include <R.h>
> #include <Rmath.h>
> 
> int main(){
> 
>  double prob[3] = {0.0, 0.0, 0.0};
>  double prob_tot = 0.;
> 
>  prob[0] = 0.3*dnorm(2, 0, 1, 0);
>  prob[1] = 0.5*dnorm(5, 0, 1, 0);
>  prob[2] = 0.2*dnorm(-3, 0, 1, 0);
> 
>  //obtain prob_tot
>  prob_tot = 0.;
>  for(int j = 0; j < 3; j++)
>    prob_tot += prob[j];
> 
>  //normalize probabilities
>  for(int j = 0; j < 3; j++)
>    prob[j] = prob[j]/prob_tot;
> 
>  //or this give the same error
>  //prob[2] = 1.0 - prob[1] - prob[0];
> 

prob_tot = 0; missing here

>  //checking indeed prob_tot not exactly 1
>  for(int j = 0; j < 3; j++)
>    prob_tot += prob[j];
> 
>  Rprintf("Prob_tot: %f\n", prob_tot);
> 
>  int rN[3];
>  rmultinom(1, prob, 1, rN);

Er, where do you tell rmultinom that variates are three-dimensional? It's not 
going to infer it from array sizes. 
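
At the R level the analogous call infers the number of classes from
length(prob) -- a sketch using the probabilities from the example:

  prob <- c(0.3, 0.5, 0.2) * dnorm(c(2, 5, -3))
  prob <- prob / sum(prob)
  rmultinom(1, size = 1, prob = prob)   # a 3 x 1 matrix of counts

The C entry point has no such information and needs K passed explicitly.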

-pd

>  return 0;
> }
> 
> run R CMD SHLIB test.c to generate the test.so. Now from within R
> 
>> dyn.load("test.so")
>> .C("main")
> Prob_tot: 1.017084
> Error: rbinom: probability sum should be 1, but is 0.948075
> 
> Maybe I miss some trivial C knowledge why this is not exactly one!
> 
> Thanks in advance!
> 
> Regards, 
> Maarten
> 
> From: peter dalgaard [pda...@gmail.com]
> Sent: Thursday, March 10, 2016 1:26 PM
> To: Iterson, M. van (MOLEPI)
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] rmultinom.c error probability not sum to 1
> 
> On 10 Mar 2016, at 12:47 , m.van_iter...@lumc.nl wrote:
> 
>> Dear all,
>> 
>> I have a questions regarding using the c function rmultinom.c.
>> 
>> I got the following error message "rbinom: probability sum should be 1, but 
>> is 0.999264"
>> 
>> Which is thrown by:
>> 
>> if(fabs((double)(p_tot - 1.)) > 1e-7)
>> MATHLIB_ERROR(_("rbinom: probability sum should be 1, but is %g"),
>> (double) p_tot);
>> 
>> I understand my probabilities do not sum to one close enough. I tried the 
>> following,
>> p[2] = 1. - p[0] - p[1],  where 'p', are the probabilities but this is not 
>> sufficient to pass the error message!
>> 
> 
> 
> p[0] 
> 
> 
>> Thanks in advance!
>> 
>> Regards,
>> Maarten
>> 
>> (I don't think this is an issue with versions but I used R version 3.2.3 and 
>> can provide more details on my linux build if necessary.)
>> 
>> 
>> 
>>  [[alternative HTML version deleted]]
>> 
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Problem building R-3.2.4

2016-03-10 Thread peter dalgaard
Yes, this is fixed in R-patched, but you can just change the $ to @ which is 
what was intended.
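
That is, the first command of the liblzma.a rule was presumably meant to read

  @rm -f $@

(the @ merely suppresses echoing of the command).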

You could also install a system-wide version of the library. Notice that in 
3.3.x, the included xz & al. will disappear.

-pd

> On 10 Mar 2016, at 17:51 , Mick Jordan  wrote:
> 
> I am trying to build R-3.2.4 on an Oracle Enterprise Linux system, where I 
> have previously built R-3.1.3 and predecessors without problems. I ran 
> "./configure --with-x=no" ok. The make fails in src/extra/xz with what looks 
> like a Makefile problem:
> 
> liblzma.a: $(liblzma_a_OBJECTS)
>$rm -f $@
>$(AR) -cr $@ $(liblzma_a_OBJECTS)
>$(RANLIB) $@
> 
> 
> What I see in the make log is:
> 
> gcc -std=gnu99 -I./api -I. -I../../../src/include -I../../../src/include 
> -I/usr/local/include -DHAVE_CONFIG_H -fopenmp  -g -O2  -c x86.c -o x86.o
> m -f liblzma.a
> make[4]: m: Command not found
> make[4]: *** [liblzma.a] Error 127
> make[4]: Leaving directory `/tmp/R-3.2.4/src/extra/xz'
> make[3]: *** [R] Error 2
> make[3]: Leaving directory `/tmp/R-3.2.4/src/extra/xz'
> make[2]: *** [make.xz] Error 2
> make[2]: Leaving directory `/tmp/R-3.2.4/src/extra'
> make[1]: *** [R] Error 1
> make[1]: Leaving directory `/tmp/R-3.2.4/src'
> make: *** [R] Error 1
> 
> I'm very suspicious of the "$rm -f $@" line, which also appears in the 
> Makefile.in. Seems like $r has resolved to empty leading to the command "m -f 
> liblzma.a"
> 
> Mick Jordan
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rmultinom.c error probability not sum to 1

2016-03-10 Thread peter dalgaard

On 10 Mar 2016, at 12:47 , m.van_iter...@lumc.nl wrote:

> Dear all,
> 
> I have a questions regarding using the c function rmultinom.c.
> 
> I got the following error message "rbinom: probability sum should be 1, but 
> is 0.999264"
> 
> Which is thrown by:
> 
> if(fabs((double)(p_tot - 1.)) > 1e-7)
> MATHLIB_ERROR(_("rbinom: probability sum should be 1, but is %g"),
> (double) p_tot);
> 
> I understand my probabilities do not sum to one close enough. I tried the 
> following,
> p[2] = 1. - p[0] - p[1],  where 'p', are the probabilities but this is not 
> sufficient to pass the error message!
> 


p[0] 


> Thanks in advance!
> 
> Regards,
> Maarten
> 
> (I don't think this is an issue with versions but I used R version 3.2.3 and 
> can provide more details on my linux build if necessary.)
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> ______
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Problem installing R-devel dated 4 March 2016.

2016-03-06 Thread peter dalgaard
Hm, we'll likely never find out. It looks a bit like a race condition or a 
Makefile deficiency in which some dependencies are not explicitly recorded (so 
that it tries to copy files before they have been made). I suppose that could 
happen if you try "make install" w/o a preceding "make". 

-pd

> On 06 Mar 2016, at 04:54 , Rolf Turner  wrote:
> 
> 
> Thanks Peter.  I tried running the uninstalled R; it worked.  I checked on 
> the existence of FAQ, etc. --- yep everything was there.  I don't know about 
> over-zealous virus checkers; I haven't overtly installed any such.
> 
> So, mystified, I started all over again from scratch.  This time it worked; 
> seamlessly.
> 
> Totally mysterious.  Story of my life.  Be that as it were, all systems are 
> go now, and my package checking was successful.
> 
> Sorry for the noise.
> 
> cheers,
> 
> Rolf
> 
> -- 
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
> 
> On 06/03/16 03:10, peter dalgaard wrote:
>> 
>>> On 05 Mar 2016, at 03:35 , Rolf Turner  wrote:
>>> 
>>> 
>>> I am trying to install the latest development version of R so as to be able 
>>> to perform a package check according the rules specified for CRAN.
>>> 
>>> I performed the following steps:
>>> 
>>> (1) Downloaded  R-devel.tar.gz, dated 04-Mar-2016 03:21, from CRAN
> (2) Unpacked.
>>> (3) Created directory "BldDir" next to the directory "R-devel".
>>> (4) Changed directories to BldDir.
>>> (5) Executed  ../R-devel/configure --with-tcltk --with-cairo .
>>> (6) Executed make .
>>> (7) Executed sudo make install .
>>> 
>>> I got the error messages:
>>>  .
>>>  .
>>>  .
>>>> mkdir -p -- /usr/local/lib64/R/doc
>>>> /usr/bin/install: cannot stat `FAQ': No such file or directory
>>>> /usr/bin/install: cannot stat `RESOURCES': No such file or directory
>>>> /usr/bin/install: cannot stat `NEWS': No such file or directory
>>>> /usr/bin/install: cannot stat `NEWS.pdf': No such file or directory
>>>> /usr/bin/install: cannot stat `NEWS.rds': No such file or directory
>>>> /usr/bin/install: cannot stat `NEWS': No such file or directory
>>>> /usr/bin/install: cannot stat `NEWS.pdf': No such file or directory
>>>> make[1]: *** [install-sources2] Error 1
>>>> make[1]: Leaving directory `/home/rolf/Desktop/R-dev-inst/BldDir/doc'
>>>> make: *** [install] Error 1
>>> 
>>> Can someone/anyone tell me what I am missing or doing wrong?
>>> 
>> 
>> Beats me. & I just checked that make install works on my system (usually, I 
>> just run test versions out of their build dirs).
>> 
>> You might check a couple of things though:
>> 
>> - does ~/Desktop/R-dev-inst/BldDir/bin/R work?
>> - does ~/Desktop/R-dev-inst/BldDir/doc/FAQ et al. actually exist?
>> - is there an overzealous virus checker active (those have been known to 
>> move fresh files to "safe locations" right under people's feet...)
>> 
>> -pd
>> 
>> 
>>> Ta.
> 
> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Problem installing R-devel dated 4 March 2016.

2016-03-05 Thread peter dalgaard

> On 05 Mar 2016, at 03:35 , Rolf Turner  wrote:
> 
> 
> I am trying to install the latest development version of R so as to be able 
> to perform a package check according the rules specified for CRAN.
> 
> I performed the following steps:
> 
> (1) Downloaded  R-devel.tar.gz, dated 04-Mar-2016 03:21, from CRAN
> (2) Unpacked.
> (3) Created directory "BldDir" next to the directory "R-devel".
> (4) Changed directories to BldDir.
> (5) Executed  ../R-devel/configure --with-tcltk --with-cairo .
> (6) Executed make .
> (7) Executed sudo make install .
> 
> I got the error messages:
>  .
>  .
>  .
>> mkdir -p -- /usr/local/lib64/R/doc
>> /usr/bin/install: cannot stat `FAQ': No such file or directory
>> /usr/bin/install: cannot stat `RESOURCES': No such file or directory
>> /usr/bin/install: cannot stat `NEWS': No such file or directory
>> /usr/bin/install: cannot stat `NEWS.pdf': No such file or directory
>> /usr/bin/install: cannot stat `NEWS.rds': No such file or directory
>> /usr/bin/install: cannot stat `NEWS': No such file or directory
>> /usr/bin/install: cannot stat `NEWS.pdf': No such file or directory
>> make[1]: *** [install-sources2] Error 1
>> make[1]: Leaving directory `/home/rolf/Desktop/R-dev-inst/BldDir/doc'
>> make: *** [install] Error 1
> 
> Can someone/anyone tell me what I am missing or doing wrong?
> 

Beats me. & I just checked that make install works on my system (usually, I 
just run test versions out of their build dirs).

You might check a couple of things though:

- does ~/Desktop/R-dev-inst/BldDir/bin/R work?
- does ~/Desktop/R-dev-inst/BldDir/doc/FAQ et al. actually exist?
- is there an overzealous virus checker active (those have been known to move 
fresh files to "safe locations" right under people's feet...)

-pd


> Ta.
> 
> cheers,
> 
> Rolf Turner
> 
> -- 
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] as.vector in R-devel loaded 3/3/2016

2016-03-04 Thread peter dalgaard
Er, until _what_ is fixed?

I see no anomalies with the version in R-pre:

> library(Matrix)
> as.vector
standardGeneric for "as.vector" defined from package "base"

function (x, mode = "any") 
standardGeneric("as.vector")

Methods may be defined for arguments: x, mode
Use  showMethods("as.vector")  for currently available ones.
> str(as.vector(1:3))
 int [1:3] 1 2 3
> str(as.vector(1:3+0))
 num [1:3] 1 2 3
> str(as.vector(list(1,2,3))
+ )
List of 3
 $ : num 1
 $ : num 2
 $ : num 3
> str(as.vector(list(1,2,3), mode="integer"))
 int [1:3] 1 2 3
> str(as.vector(list(1,2,3), mode="numeric"))
 num [1:3] 1 2 3


Also, *current* r-devel has the same definition:

$ ~/r-devel/BUILD-dist/bin/R

R Under development (unstable) (2016-03-03 r70270) -- "Unsuffered Consequences"
[...yadayada...]
> library(Matrix)
> as.vector
function (x, mode = "any") 
.Internal(as.vector(x, mode))




> On 04 Mar 2016, at 01:09 , Jeff Laake - NOAA Federal  
> wrote:
> 
> I dug into this a little further and discovered the problem.  When my
> package is for checking, it loads Matrix.  In the R-devel version of
> Matrix, as.vector is re-defined without mode specified
> 
>> as.vector
> standardGeneric for "as.vector" defined from package "base"
> 
> function (x, mode)
> standardGeneric("as.vector")
> 
> Methods may be defined for arguments: x, mode
> Use  showMethods("as.vector")  for currently available ones.
> 
> In R3.2.3 it is defined with mode="any" specified.
> 
>> as.vector
> standardGeneric for "as.vector" defined from package "base"
> 
> function (x, mode = "any")
> standardGeneric("as.vector")
> 
> Methods may be defined for arguments: x, mode
> Use  showMethods("as.vector")  for currently available ones.
> 
> Until this is fixed I'll copy over the devel version of Matrix.
> 
> --jeff
> 
> 
> On Thu, Mar 3, 2016 at 7:23 AM, Jeff Laake - NOAA Federal <
> jeff.la...@noaa.gov> wrote:
> 
>> I just installed R-devel to check my package before submitting.  I got an
>> error in my vignette in regards to as.vector.  When I looked at the code
>> for as.vector in R-devel it is
>> 
>> standardGeneric for "as.vector" defined from package "base"
>> 
>> function (x, mode)
>> standardGeneric("as.vector")
>> 
>> Methods may be defined for arguments: x, mode
>> Use  showMethods("as.vector")  for currently available ones.
>> 
>> The code from R3.2.3 is
>>> as.vector
>> function (x, mode = "any")
>> .Internal(as.vector(x, mode))
>> 
>> 
>>> 
>> 
>> Is default for mode missing as I suspect or will mode be required from now
>> on?
>> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R 3.2.4 rc issue

2016-03-04 Thread peter dalgaard
Thanks for the info, Dirk.

The tarball builds don't run make check (because of a policy decision that it 
is better to have the sources available on all platforms for testing than to 
have none if it breaks on a single platform). However the build as of tonight 
has no problem with make check on the build machine. Did you by any chance 
forget that Matrix is a recommended package and expected to be available when 
checking?

-pd

> On 04 Mar 2016, at 04:52 , Dirk Eddelbuettel  wrote:
> 
> 
> I generally run 'make; make check' (with more settings) when building the
> Debian package.  Running 3.2.4 rc from last night, I see a lot of package
> loading issues during 'make check'.  Here is splines as one examples:
> 
> checking package 'splines'
> * using log directory '/build/r-base-3.2.3.20160303/tests/splines.Rcheck'
> * using R version 3.2.4 RC (2016-03-02 r70270)
> * using platform: x86_64-pc-linux-gnu (64-bit)
> * using session charset: ASCII
> * using option '--no-build-vignettes'
> * looks like 'splines' is a base package
> * skipping installation test
> * checking package directory ... OK
> * checking DESCRIPTION meta-information ... OK
> * checking top-level files ... OK
> * checking for left-over files ... OK
> * checking index information ... OK
> * checking package subdirectories ... OK
> * checking whether the package can be loaded ... OK
> * checking whether the package can be loaded with stated dependencies ... OK
> * checking whether the package can be unloaded cleanly ... OK
> * checking whether the namespace can be loaded with stated dependencies ... OK
> * checking whether the namespace can be unloaded cleanly ... OK
> * checking S3 generic/method consistency ... OK
> * checking replacement functions ... OK
> * checking foreign function calls ... OK
> * checking R code for possible problems ... OK
> * checking Rd files ... OK
> * checking Rd metadata ... OK
> * checking Rd cross-references ... WARNING
> Unknown package 'Matrix' in Rd xrefs
> * checking for missing documentation entries ... OK
> * checking for code/documentation mismatches ... OK
> * checking Rd \usage sections ... OK
> * checking Rd contents ... OK
> * checking compiled code ... OK
> * checking examples ... SKIPPED
> * checking tests ...
>  Running 'spline-tst.R'
> ERROR
> Running the tests in 'tests/spline-tst.R' failed.
> Last 13 lines of output:
>> proc.time()
> user  system elapsed 
>2.272   0.020   2.291 
>> 
>> ###- sparse / dense   interpSpline() 
>> ---
>> 
>> ## from  help(interpSpline) -- ../man/interpSpline.Rd
>> ispl <- interpSpline( women$height, women$weight)
>> isp. <- interpSpline( women$height, women$weight, sparse=TRUE)
>  Error in splineDesign(knots, x, ord, derivs, sparse = sparse) : 
>splineDesign(*, sparse=TRUE) needs package 'Matrix' correctly installed
>  Calls: interpSpline -> interpSpline.default -> splineDesign
>  Execution halted
> * checking PDF version of manual ... OK
> * DONE
> 
> Status: 1 ERROR, 1 WARNING
> See
>  '/build/r-base-3.2.3.20160303/tests/splines.Rcheck/00check.log'
> for details.
> 
> 
> Did something change in requirements?
> 
> Dirk
> 
> -- 
> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Source code of early S versions

2016-03-01 Thread peter dalgaard

> On 29 Feb 2016, at 19:54 , Barry Rowlingson  
> wrote:
> 
>> PS:  somehow "historical" would be less unnerving than "archeological"
> 
> At least I didn't say palaeontological.

So John should feel more like stone age than dinosaur?

(Some portion of this must be a fortune candidate!)

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Unable to Install Packages from Binaries on Windows for R 3.2.3

2016-02-27 Thread peter dalgaard

> On 27 Feb 2016, at 05:22 , Ramnath Vaidyanathan  
> wrote:
> 
> Installing packages from binaries on Windows seems broken, when using
> mirrors that are up to date with CRAN
> 
> install.packages(
>  'httr',
>  type = 'binary',
>  repos = "https://cran.rstudio.com/"
> )
> 
> Changing repos to the Kansas CRAN mirror installs the package as expected,
> but that could be because the KS mirror has not yet synced.
> 
> Someone pointed out that the PACKAGES.gz file at
> https://cran.r-project.org/bin/windows/contrib/3.2/ seems to be corrupted
> (0 KB), and this could be the issue.


It's at 202K now in both places. Perhaps just retry?
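
One way to probe the index directly from R (a sketch; nrow() is just a cheap
sanity check):

  ap <- available.packages(contriburl =
            contrib.url("https://cran.rstudio.com/", type = "win.binary"))
  nrow(ap)   # zero rows (or an error) would point at a broken PACKAGES index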

-pd

> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

