Re: [Rd] CRAN policies

Mark.Bravington Sat, 31 Mar 2012 04:18:24 -0700

Herewith comments on some replies to my earlier post. To avoid burying my own 
points, I'll briefly restate my views (which may have evolved a bit):


 - We should not be concocting yet more complicated rules to solve imaginary 
problems;

 - RCMD CHECK should have (i) Notes, which are up to the individual to ponder 
and are not CRAN's concern, and (ii) Warnings, which trigger rejection from 
CRAN. And Warnings should be for a really good reason. Then developers have 
clarity, and the CRANia have less to do. Surely the CRANia are on a 
hiding-to-nothing if they create work for themselves by continuing to require 
manual inspection; the torrent of packages is only going to get deeper.

 - Given the vast number of packages, the burden of work imposed by false 
positives from new Warnings (or "significant Notes", mmmm) should be very 
carefully considered against any benefits of true positives. I think the 
balance is going wrong.

Righto, here are my comments on responses, some of which overlap. Thanks for 
those; I've snipped heavily and paraphrased to save space, no offence intended.

 - Matthew D: "It [all additions of Notes in RCMD CHECK] improves quality, 
surely." My comments in the final para were actually about Warnings sensu 
above, not Notes-- sorry if that was unclear. If someone is willing to add 
checks that lead just to Notes sensu above, then good on 'em. But, from my own 
experience and reports from others, I certainly do not consider that all 
Notes/Warnings really do indicate lack-of-quality (even excluding 
visible-bindings). I don't know exactly what's in QC.R, but one recent new 
thing did trigger a complaint from CRAN about a non-problem in 10-year-old 
code. The ensuing discussion cost me, and CRAN, time that neither of us have to 
spare. My job does not give me time to keep re-hitting a moving target. As to 
Memos vs Notes vs Warnings: why wouldn't two categories do? Packaging rules are 
quite complicated enough already!  (The temptation to make them even more 
baroque just to try to stem the flood is understandable, but not laudable...)

 Footnote: I've just glanced at the check results for mvbutils under R-devel. 
Another new and in my opinion unreasonable Warning has cropped up on 
10-year-old perfectly functional code (beside others which may have a point). 
I'll start a separate thread, but this reinforces my view that fixing Notes or 
even Warnings doesn't necessarily improve quality-- and it's not limited to the 
visible-bindings case.

 - Spencer G: well, I didn't say "RCMD CHECK is bad"! I'm not advocating 
anarchy, merely pointing out that: there are limits to what RCMD CHECK can and 
should do, that it is fulfilling two different roles which are getting muddled, 
and that not everyone finds all of it useful. I'm honestly glad you do, but I 
don't (except Codoc, as I said), so one-size-does-not-fit-all. FWIW, my own 
pathway to efficient writing relies on (i) a good debugger (the debug package), 
and (ii) a really seamless method for editing my packages on-the-fly (one part 
of the mvbutils package).


 - Paul G: [300 Notes? Please explain!] Actually, I could explain the idiom 
quicker than I can write this paragraph, but I don't want to here because I'm 
opposed on principle to Notes requiring an explanation. This part of mvbutils 
has worked for 10 years. Someone has subsequently decided that code should look 
a certain way, and has added a check that isn't in the language itself-- but 
they haven't thought of everything, and of course they never could. (That might 
be paranoid. Maybe they aren't trying to impose how things should look, and 
rather are just trying to be helpful, which would be fine. It depends on how 
Notes are being interpreted, which from this thread is no longer clear. The 
R-core line used to be "Notes are just notes" but now we seem to have 
"significant Notes" and vague threats about "lots of Notes" etc. Paranoia seems 
reasonable.) However, anyone interested is welcome to look at the mvbutils 
package, as Bill did. The main idiom is clearly documented in ?mlocal, 
and the other cases are usually eval() I think. Since there is no 
reliable way for a static check to figure out where the eval() happens, it 
hasn't got a hope of assessing whether bindings exist. Dammit, now I'm 
explaining, which I didn't want to, but only so that someone can change the 
check, mind...
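
For anyone who wants to see the general shape of the problem, here is a minimal sketch. It is not the mvbutils code: add_increment, accumulate, running_total and increment are hypothetical names made up for illustration. The helper's body is only ever evaluated in the caller's frame, where its variables do exist. A static check that looks at the helper on its own cannot know that, so codetools::checkUsage() should be expected to report "no visible binding" notes for it.

```r
# Hypothetical helper in the spirit of the idiom: its body is meant to be
# evaluated in the caller's frame, so 'running_total' and 'increment' are
# deliberately not defined inside it or globally.
add_increment <- function() running_total + increment

accumulate <- function(n) {
  running_total <- 0
  increment <- 2
  for (i in seq_len(n)) {
    # eval() defaults to the calling frame, so the helper's body sees the
    # local variables defined above.
    running_total <- eval(body(add_increment))
  }
  running_total
}

accumulate(3)  # 6

# The static check inspects add_increment in isolation and has no way of
# knowing where its body will be evaluated.
if (requireNamespace("codetools", quietly = TRUE)) {
  codetools::checkUsage(add_increment, name = "add_increment")
}
```

The code runs correctly, yet the check still complains about the helper, which is the sense in which the Note says nothing about quality.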

  As to peace-of-mind from RCMD CHECK: well, I certainly don't have it! Two 
reasons:

  (1) You don't have to look far on CRAN to find packages that are badly 
written (and more often badly documented) but pass their RCMD CHECK fine. I 
tried this just now and got a result on my first go.

  (2) I know very well how to modify my code to evade the notes and most 
warnings, without changing any of what it does-- as could anyone with a bit of 
creativity. If I had inclination and time, I could do it. In no reasonable 
sense would it be "better" code, though.

 So RCMD CHECK is neither a necessary nor sufficient condition for virtue. 
Inspection of a language as rich as R will never be foolproof. The user simply 
has to take it on trust that a package does what it claims, or otherwise decide 
not to use it. How the package does it, is up to the author. My experience of 
other people's software is: peace-of-mind starts with helpful documentation, 
and also depends on whether I get a sense from the archives that the author 
might actually help if I run into something odd. Several well-known packages 
fail these tests, so I avoid them. Automated checks, beyond a certain limited 
point which they have probably reached, seem to me to be playing the wrong game.

 - Bill D: [proposal for additional "documentation" mechanism] Thanks for going 
to the trouble of looking at my code; I certainly appreciate the effort, but 
your proposal is exactly what I am against! The issue is with the check, not 
with my code, and as above I do not see why I should need to add elaborate 
justifications. For some, this particular check (visible-binding) is apparently 
useful. For others, it's not. So why not just leave it as a Note that people 
can worry about or not if they want? It should not be of concern to those very 
busy CRANia people.

 - Joshua W: [CRAN can set its own rules, and if a package doesn't easily fit 
them, maybe it should be put elsewhere.] Certainly CRAN/R-core (the distinction 
is shadowy to me) can, and frequently does, decide to do whatever it wants, 
including decisions about what to host. But it does not follow that every 
decision taken is axiomatically a Good Thing for R. More effort now goes into R 
development from people outside R core than inside it (>3000 packages). If a 
CRAN/Rcore decision entails a lot of work for others to amend code in ways that 
do not make the code work better, then it doesn't strike me as a good decision. 
Ditto if perfectly functional code is forced off CRAN, where it is (sort of) 
easy to find-- it becomes more difficult for the wider R community to get it, 
and of course it may not get *any* checks that way. NB I am not commenting here 
on individual aspects of RCMD CHECK etc-- this is a general point about mission 
creep, helps and hindrances, and balance of workload.


 Mark

Mark Bravington
CSIRO CMIS
Marine Lab
Hobart
Australia
________________________________________
From: Joshua Wiley [jwiley.ps...@gmail.com]
Sent: 31 March 2012 06:03
To: Kevin Wright
Cc: Bravington, Mark (CMIS, Hobart); r-de...@stat.math.ethz.ch
Subject: Re: [Rd] CRAN policies

On Fri, Mar 30, 2012 at 11:41 AM, Kevin Wright <kw.s...@gmail.com> wrote:
> I'll echo Mark's concerns.  R _used_ to be a language for "turning ideas
> into software quickly".  Now it is more like "prototyping ideas in software
> quickly", and then spend a substantial amount of time trying to follow
> administrative rules to package the code.

..if you want to submit to CRAN.  There are practically zero if you
host on your own website.  Of course developers are free to do
whatever they want and R core does not get to tell them what/how to do
it.  R core does get a say when you ask them to host your source and
build your package binaries.

> Quality has its costs.

So does using CRAN.  If it is not the best solution for your problem,
use something else.  Hadley uses github for development of ggplot2, and
with the devtools package, it is relatively easy for users to install
the source ggplot2 code.  Something like that might be appropriate for
code/packages where you just want to 'turn ideas into software
quickly'.  There is an extra step required for users to use it, but
that makes sense because it weeds out inept users from using code with
less quality control.

>
> Many of the code checks I find quite useful, but the "no visible binding"
> one generates lots of nuisance notes for me.  I must have a similar coding
> style to Mark.
>
> Kevin
>
>
> On Thu, Mar 29, 2012 at 8:29 PM, <mark.braving...@csiro.au> wrote:
>
>> I'm concerned this thread is heading the wrong way, towards techno-fixes
>> for imaginary problems. R package-building is already encumbered with a
>> huge set of complicated rules, and more instructions/rules eg for metadata
>> would make things worse not better.
>>
>> RCMD CHECK on the 'mvbutils' package generates over 300 Notes about "no
>> visible binding...", which inevitably I just ignore. They arise because
>> RCMD CHECK is too "stupid" to understand one of my preferred coding idioms
>> (I'm not going to explain what-- that's beside the point). And RCMD CHECK
>> always will be too "stupid" to understand everything that a rich language
>> like R might quite reasonably cause experienced coders to do.
>>
>> It should not be CRAN's business how I write my code, or even whether my
>> code does what it is supposed to. It might be CRAN's business to try to
>> work out whether my code breaks CRAN's policies, eg by causing R to crash
>> horribly-- that's presumably what Warnings are for (but see below). And
>> maybe there could be circumstances where an automatic check might be
>> "worried" enough to alert the CRANia and require manual explanation and
>> emails etc from a developer, but even that seems doomed given the growing
>> deluge of packages.
>>
>> RCMD CHECK currently functions both as a "sanitizer" for CRAN, and as a
>> developer-tool. But the fact that the one program does both things seems
>> accidental to me, and I think this dual-use is muddying the discussion.
>> There's a big distinction between (i) code-checks that developers
>> themselves might or might not find useful-- which should be left to the
>> developer, and will vary from person to person-- and (ii) code-checks that
>> CRAN enforces for its own peace-of-mind. Maybe it's convenient to have both
>> functions in the same place, and it'd be fine to use Notes for one and
>> Warnings for the other, but the different purposes should surely be kept
>> clear.
>>
>> Personally, in building over 10 packages (only 2 on CRAN), I haven't found
>> RCMD CHECK to be of any use, except for the code-documentation and
>> example-running bits. I know other people have different opinions, but
>> that's the point: one-size-does-not-fit-all when it comes to coding tools.
>>
>> And wrto the Warnings themselves: I feel compelled to point out that it's
>> logically impossible to fully check whether R code will do bad things. One
>> has to wonder at what point adding new checks becomes futile or
>> counterproductive. There must be over 2000 people who have written CRAN
>> packages by now; every extra check and non-back-compatible additional
>> requirement runs the risk of generating false-positives and incurring many
>> extra person-hours to "fix" non-problems. Plus someone needs to document
>> and explain the check (adding to the rule mountain), plus there is the time
>> spent in discussions like this..!
>>
>> Mark
>>
>> Mark Bravington
>> CSIRO CMIS
>> Marine Lab
>> Hobart
>> Australia
>> ________________________________________
>> From: r-devel-boun...@r-project.org [r-devel-boun...@r-project.org] On
>> Behalf Of Hadley Wickham [had...@rice.edu]
>> Sent: 30 March 2012 07:42
>> To: William Dunlap
>> Cc: r-de...@stat.math.ethz.ch; Spencer Graves
>> Subject: Re: [Rd] CRAN policies
>>
>> > Most of that stuff is already in codetools, at least when it is checking
>> functions
>> > with checkUsage().  E.g., arguments of ~ are not checked.  The  expr
>> argument
>> > to with() will not be checked if you add  skipWith=TRUE to the call to
>> checkUsage.
>> >
>> >  > library(codetools)
>> >
>> >  > checkUsage(function(dataFrame) with(dataFrame, {Num/Den ; Resp ~
>> Pred}))
>> >  <anonymous>: no visible binding for global variable 'Num' (:1)
>> >  <anonymous>: no visible binding for global variable 'Den' (:1)
>> >
>> >  > checkUsage(function(dataFrame) with(dataFrame, {Num/Den ; Resp ~
>> Pred}), skipWith=TRUE)
>> >
>> >  > checkUsage(function(dataFrame) with(DataFrame, {Num/Den ; Resp ~
>> Pred}), skipWith=TRUE)
>> >  <anonymous>: no visible binding for global variable 'DataFrame'
>> >
>> > The only part that I don't see is the mechanism to add code-walker
>> functions to
>> > the environment in codetools that has the standard list of them for
>> functions with
>> > nonstandard evaluation:
>> >  > objects(codetools:::collectUsageHandlers, all=TRUE)
>> >   [1] "$"             "$<-"           ".Internal"
>> >   [4] "::"            ":::"           "@"
>> >   [7] "@<-"           "{"             "~"
>> >  [10] "<-"            "<<-"           "="
>> >  [13] "assign"        "binomial"      "bquote"
>> >  [16] "data"          "detach"        "expression"
>> >  [19] "for"           "function"      "Gamma"
>> >  [22] "gaussian"      "if"            "library"
>> >  [25] "local"         "poisson"       "quasi"
>> >  [28] "quasibinomial" "quasipoisson"  "quote"
>> >  [31] "Quote"         "require"       "substitute"
>> >  [34] "with"
>>
>> It seems like we really need a standard way to add metadata to functions:
>>
>> attr(with, "special_args") <- "expr"
>> attr(lm, "special_args") <- c("formula", "weights", "subset")
>>
>> This would be useful because it could automatically contribute to the
>> documentation.
>>
>> Similarly,
>>
>> attr(my.new.method, "s3method") <- c("my.new", "method")
>>
>> could be useful.
>>
>> Hadley
>>
>>
>> --
>> Assistant Professor / Dobelman Family Junior Chair
>> Department of Statistics / Rice University
>> http://had.co.nz/
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
>
>
> --
> Kevin Wright
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
