Re: [Rd] Proposed Patch for poly.Rd

Marc Schwartz Fri, 14 Jul 2017 09:01:36 -0700

> On Jul 14, 2017, at 9:50 AM, Martin Maechler <maech...@stat.math.ethz.ch> 
> wrote:
> 
>>>>>> Martin Maechler <maech...@stat.math.ethz.ch>
>>>>>>    on Fri, 14 Jul 2017 16:30:50 +0200 writes:
> 
>>>>>> Marc Schwartz <marc_schwa...@me.com>
>>>>>>    on Fri, 14 Jul 2017 06:57:26 -0500 writes:
> 
>>>> On Jul 13, 2017, at 5:07 PM, Marc Schwartz <marc_schwa...@me.com> wrote:
>>>> 
>>>> 
>>>>> On Jul 13, 2017, at 3:37 PM, Marc Schwartz <marc_schwa...@me.com> wrote:
>>>>> 
>>>>> 
>>>>>> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch <murdoch.dun...@gmail.com> 
>>>>>> wrote:
>>>>>> 
>>>>>> On 13/07/2017 4:08 PM, Marc Schwartz wrote:
>>>>>>> Hi All,
>>>>>>> 
>>>>>>> As per the discussion today on R-Help:
>>>>>>> 
>>>>>>> https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
>>>>>>> 
>>>>>>> I am attaching a proposed patch for poly.Rd to provide clarifying 
>>>>>>> wording relative to naming the 'degree' argument explicitly, in the 
>>>>>>> case where the 'x' argument is a matrix, rather than a vector.
>>>>>>> 
>>>>>>> This is based upon the svn trunk version of poly.Rd.
>>>>>> 
>>>>>> I don't think this is the right fix.  The use of the unnamed 2nd arg as 
>>>>>> degree happens whether the first arg is a matrix or not.
>>>>>> 
>>>>>> I didn't read the whole thread in detail, but it appears there's a bug 
>>>>>> somewhere, in the report or in the poly() code or in the plsr() code. 
>>>>>> That bug should be reported on the bug list if it turns out to be in 
>>>>>> base R, and to the package maintainer if it is in plsr().
>>>>>> 
>>>>>> Duncan Murdoch
>>>>> 
>>>>> 
>>>>> Duncan,
>>>>> 
>>>>> Thanks for your reply. You only really need to read that last post in the 
>>>>> thread linked to above.
>>>>> 
>>>>> I won't deny the possibility of a bug in poly(), relative to the handling 
>>>>> of 'x' as a matrix. The behavior occurs in the poly() function in a pure 
>>>>> stand alone fashion, without the need for plsr():
>>>>> 
>>>>> x1 <- runif(20)
>>>>> x2 <- runif(20)
>>>>> mx <- cbind(x1, x2)
>>>>> 
>>>> 
>>>> <snip>
>>>> 
>>>> Duncan,
>>>> 
>>>> Tracing through the code for poly() using debug once with:
>>>> 
>>>> poly(mx, 2)
>>>> 
>>>> and then with:
>>>> 
>>>> poly(mx, degree = 2)
>>>> 
>>>> there is a difference in the transformation of 'mx' internally by the use 
>>>> of:
>>>> 
>>>> if (is.matrix(x)) {
>>>> m <- unclass(as.data.frame(cbind(x, ...)))
>>>> return(do.call(polym, c(m, degree = degree, raw = raw, list(coefs = 
>>>> coefs))))
>>>> }
>>>> 
>>>> 
>>>> In the first case, 'mx' ends up being transformed to:
>>>> 
>>>> Browse[2]> m
>>>> $x1
>>>> [1] 0.99056941 0.13953093 0.38965567 0.35353514 0.90838486 0.97552474
>>>> [7] 0.01135743 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157 0.31164777 0.81694822
>>>> [19] 0.58641410 0.08858699
>>>> 
>>>> $x2
>>>> [1] 0.6628658 0.9221436 0.3162418 0.8494452 0.4665010 0.3403719
>>>> [7] 0.4040692 0.4916650 0.9091161 0.2956006 0.3454689 0.3331070
>>>> [13] 0.8788974 0.5614636 0.7794396 0.2304009 0.6566537 0.6875646
>>>> [19] 0.5110733 0.4122336
>>>> 
>>>> $V3
>>>> [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
>>>> 
>>>> attr(,"row.names")
>>>> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
>>>> 
>>>> Thus, when do.call() is used, m$V3 is passed as the 'x' argument on the 
>>>> third iteration, essentially resulting in:
>>>> 
>>>>> polym(rep(2, 20), degree = 2)
>>>> Error in poly(dots[[1L]], degree, raw = raw, simple = raw && nd > 1) : 
>>>> 'degree' must be less than number of unique points
>>>> 
>>>> 
>>>> Note also that in this case, 'dots', which is the result of using 
>>>> list(...) on the initial call, is:
>>>> 
>>>> Browse[2]> dots
>>>> [[1]]
>>>> [1] 2
>>>> 
>>>> 
>>>> In the second case:
>>>> 
>>>> Browse[2]> m
>>>> $x1
>>>> [1] 0.99056941 0.13953093 0.38965567 0.35353514 0.90838486 0.97552474
>>>> [7] 0.01135743 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157 0.31164777 0.81694822
>>>> [19] 0.58641410 0.08858699
>>>> 
>>>> $x2
>>>> [1] 0.6628658 0.9221436 0.3162418 0.8494452 0.4665010 0.3403719
>>>> [7] 0.4040692 0.4916650 0.9091161 0.2956006 0.3454689 0.3331070
>>>> [13] 0.8788974 0.5614636 0.7794396 0.2304009 0.6566537 0.6875646
>>>> [19] 0.5110733 0.4122336
>>>> 
>>>> attr(,"row.names")
>>>> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
>>>> 
>>>> 
>>>> So, there is no m$V3.
>>>> 
>>>> Note also that 'dots' ends up being:
>>>> 
>>>> Browse[2]> dots
>>>> list()
>>>> 
>>>> 
>>>> In both cases, 'degree' is indeed 2, but the result of 'list(...)' on the 
>>>> initial function call is quite different.
>>>> 
>>>> So, I may be hypo-caffeinated, but if there is a bug here, it may be due 
>>>> to the way in which cbind() is being called in the code above, where the 
>>>> three dots are being used?
>>>> 
>>>> I can replicate the presumably correct behavior by using:
>>>> 
>>>> m <- unclass(as.data.frame(cbind(x)))
>>>> 
>>>> instead of:
>>>> 
>>>> m <- unclass(as.data.frame(cbind(x, ...)))
>>>> 
>>>> But I am not sure if removing the three dots in the cbind() call may have 
>>>> other unintended consequences.
>>>> 
>>>> Regards,
>>>> 
>>>> Marc
> 
> 
>>> Duncan,
> 
>>> Some additional information here.
>>> Reviewing the source code for the function in SVN:
> 
>>> https://svn.r-project.org/R/trunk/src/library/stats/R/contr.poly.R
> 
>>> there is a relevant comment in the code:
> 
>>> if(is.matrix(x)) { ## FIXME: fails when combined with 'unnamed degree' above
>>> m <- unclass(as.data.frame(cbind(x, ...)))
>>> return(do.call(polym, c(m, degree = degree, raw = raw,
>>> list(coefs=coefs))))
>>> }
> 
> 
>>> A version review would suggest that the above comment was added to the code 
>>> back in 2015.
> 
>> Yes, by me, possibly here :
> 
>> $ svn log -v -c68727
>> ------------------------------------------------------------------------
>> r68727 | maechler | 2015-07-23 16:14:59 +0200 (Thu, 23 Jul 2015) | 1 line
>> Changed paths:
>> M /trunk/doc/NEWS.Rd
>> M /trunk/src/library/stats/R/contr.poly.R
>> M /trunk/src/library/stats/man/poly.Rd
>> M /trunk/tests/Examples/stats-Ex.Rout.save
>> M /trunk/tests/reg-tests-1c.R
> 
>> poly(), polym() now work better notably for prediction
>> ------------------------------------------------------------------------
>> $ svn-diffB -c68727 doc/NEWS.Rd
>> Index: doc/NEWS.Rd
>> ===================================================================
>> 126a127,133
>>> 
>>> \item \code{polym()} gains a \code{coefs = NULL} argument and
>>> returns class \code{"poly"} just like \code{poly()} which gets a
>>> new \code{simple=FALSE} option.  They now lead to correct
>>> \code{predict()}ions, e.g., on subsets of the original data.
>>> %% see https://stat.ethz.ch/pipermail/r-devel/2015-July/071532.html
> 
> 
>>> So it would appear that the behavior being discussed here is known.
> 
>> Indeed!  I remember to have spent quite a few hours with the
>> code and its different uses before committing that patch.
> 
>>> I am still confused by the need for the '...' in the call to cbind(), which 
>>> as far as I can tell, has been in the code at least back to 2003, when the 
>>> poly() code was split from base. 
> 
>>> I am not sure why one would want to pass on other '...' arguments to 
>>> cbind(), but I am presumably missing something here.
> 
>> Yes, I think passing the '...' is important there...
>> OTOH, I'm almost sure that I wrote the 'FIXME' because I thought
>> one should be able to do things better.
>> So, I'm happy to e-talk to you about how to get rid of the FIXME
>> and still remain back-compatible: Notably with the paragraph in  ?poly
>> |> Details:
>> |>
>> |>      Although formally ‘degree’ should be named (as it follows ‘...’),
>> |>      an unnamed second argument of length 1 will be interpreted as the
>> |>      degree, such that ‘poly(x, 3)’ can be used in formulas.
> 
> As a matter of fact,  a patch seems very simple, and I am
> testing it now.
> 
> Won't have much more time today, but will return "on this channel"
> later, maybe tomorrow.
> 
> Martin



Martin,

Thanks for taking the time to look at this!

Marc

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Proposed Patch for poly.Rd

Reply via email to