Re: [Rd] Underscores in package names

2019-08-16 Thread Kevin Wright
I've heard the arguments against dots in names many times. The t.test and
data.frame examples have been repeated so often that it has become accepted
as gospel.  In my experience, evidence of any actual problems is fairly
limited (almost non-existent).  I've been happily using dots in function
names for 20 (sigh) years and only 1 time had an unanticipated S3 class
kick in.  I find the "." much easier to type than "_" because of the
proximity of the keys to the home-row on the keyboard.

On Thu, Aug 15, 2019 at 8:00 AM Jim Hester wrote:

> Martin,
>
> Thank you for discussing this amongst R-core and for detailing the
> R-core discussion here.
>
> Some specific examples where having underscores available would have
> been useful.
>
> 1. My primerTree package (2013) was originally primer_tree, but I had
> to change the name to camelCase to comply with the check requirements.
> Using camelCase in the package name makes reading code jarring, as the
> functions all use snake_case.
> 2. The widely used testthat package would likely be called test_that,
> like the corresponding function within the package. This also
> highlights one of the drawbacks of the current situation, without
> separators the package name is more difficult to read, does it have
> two t's or three?
> 3. The assertive suite of packages use `.` for separation, e.g.
> `assertive.base`, `assertive.datetimes` etc. but all functions within
> the packages use `_` separators, again likely this was done out of
> necessity rather than desire.
>
> There are many more I am sure, these were some that came immediately
> to mind. More important than the specific examples is the opportunity
> cost of having this restriction, which we cannot really quantify.
>
> Using dots for separators has a number of practical problems.
> Functions using dots are ambiguous, e.g. is `as.data.frame()` a
> regular function, an `as.data()` method for a `frame` object, or an
> `as()` method for a `data.frame` object? And in fact regular functions
> can be accidentally promoted to S3 methods by defining a S3 generic,
> which does actually happen in real life, confusing users [1]. While
> package names are not functions, using dots in package names
> encourages the use of dots in functions, a dangerous practice. Dots in
> names is also one of the common stones cast at R as a language, as
> dots are used for object oriented method dispatch in other common
> languages.
>
> The prevalence of dotted functions is the only major naming convention
> which is steadily decreasing over time. It now accounts for only
> around 15% of all function names when looking at all 94 Million lines
> of code currently available on CRAN (See Figure 2. from Yen et. al.
> [2]).
>
> Thanks again for the public discussion,
>
> Jim
>
> [1]: https://twitter.com/_ColinFay/status/1105579764797108230
> [2]: https://osf.io/preprints/socarxiv/ts2wq/
>
> On Wed, Aug 14, 2019 at 5:16 AM Martin Maechler wrote:
> >
> > > Duncan Murdoch
> > > on Fri, 9 Aug 2019 20:23:28 -0400 writes:
> >
> > > On 09/08/2019 4:37 p.m., Gabriel Becker wrote:
> > >> Duncan,
> > >>
> > >>
> > >> On Fri, Aug 9, 2019 at 1:17 PM Duncan Murdoch wrote:
> > >>
> > >> On 09/08/2019 2:41 p.m., Gabriel Becker wrote:
> > >> > Note that this proposal would make mypackage_2.3.1 a valid
> > >> > *package name*, whose corresponding tarball name might be
> > >> > mypackage_2.3.1_2.3.2 after a patch. Yes, it's a silly example,
> > >> > but why allow that kind of ambiguity?
> > >> >
> > >> CRAN already has a package named "FuzzyNumbers.Ext.2", whose
> > >> tarball is FuzzyNumbers.Ext.2_3.2.tar.gz, so I think we've already
> > >> lost that game.
> > >>
> > >>
> > >> I suppose technically 2 is a valid version number for a package (?),
> > >> so I suppose you have me there. But as Ben pointed out while I was
> > >> writing this, all I can really say is that in practice they read to
> > >> me (as someone who has administered R on a large cluster and written
> > >> build-system software for it) as substantially different levels of
> > >> ambiguity. I do acknowledge, as Ben does, that yes, a more complex
> > >> regular expression/splitting algorithm can be written that would
> > >> handle the more general package names. I just don't personally see a
> > >> motivation that justifies changing something this fundamental (even
> > >> if it is both narrow and was initially more or less arbitrarily
> > >> chosen) about R at this late date.
> > >>
> > >> I guess at the end of the day what I'm saying is that breaking and
> > >> changing things is sometimes good, but if we're going to rock the
> > >> boat, personally I'd want to do so going after bigger wins than this
> > >> one. That's just my opinion though.
> >
> > > 

Re: [Rd] Underscores in package names

2019-08-16 Thread Jan Gorecki
Thanks Abby and Martin,

In every company where I have worked with R (3 in total), there was at
least one process (and up to ~10) designed and implemented to depend on
the current package naming scheme, with the underscore as the separator
between the package name and its version. From my experience, I believe
this is a (very?) common practice. I also use it myself.
The arguments for allowing underscores in package names are simply weak.
Dots in function names are an entirely different issue, caused by S3
dispatch. There is no need to look at other OOP languages; this is R.
A package name is not a function name.
There are no practical gains.
There is nothing wrong with having a package "a.pkg" and a function "a_pkg()".

Regards,
Jan Gorecki


On Fri, Aug 16, 2019 at 1:20 AM Abby Spurdle wrote:
>
> > While
> > package names are not functions, using dots in package names
> > encourages the use of dots in functions, a dangerous practice.
>
> "dangerous"...?
> I can't understand the necessity of RStudio and Tiny-Verse affiliated
> persons to repeatedly use subjective and unscientific phrasing.
>
> Elegant, Advanced, Dangerous...
> At UseR, there was even "Advanced Use of your Favorite IDE".
>
> This is not science.
> This is marketing.
>
> There's nothing dangerous about it other than your belief that it's
> dangerous.
> I note that many functions in the stats package use dots in function names.
> Your statement implies that the stats package is badly designed, which it
> is not.
> Out of 14,800-ish packages on CRAN, very few of them are even close to the
> standard set by the stats package, in my opinion.
>
> And as noted by other people in this thread, changing naming policies could
> interfere with a lot of software "out there", which is dangerous.
>
> > Dots in
> > names is also one of the common stones cast at R as a language, as
> > dots are used for object oriented method dispatch in other common
> > languages.
>
> I don't think the goal is to copy other OOP systems.
> Furthermore, some shells use dot as the current working directory and Java
> uses dots in package namespaces.
> And then there's regular expressions...
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Underscores in package names

2019-08-15 Thread Abby Spurdle
> While
> package names are not functions, using dots in package names
> encourages the use of dots in functions, a dangerous practice.

"dangerous"...?
I can't understand the necessity of RStudio and Tiny-Verse affiliated
persons to repeatedly use subjective and unscientific phrasing.

Elegant, Advanced, Dangerous...
At UseR, there was even "Advanced Use of your Favorite IDE".

This is not science.
This is marketing.

There's nothing dangerous about it other than your belief that it's
dangerous.
I note that many functions in the stats package use dots in function names.
Your statement implies that the stats package is badly designed, which it
is not.
Out of 14,800-ish packages on CRAN, very few of them are even close to the
standard set by the stats package, in my opinion.

And as noted by other people in this thread, changing naming policies could
interfere with a lot of software "out there", which is dangerous.

> Dots in
> names is also one of the common stones cast at R as a language, as
> dots are used for object oriented method dispatch in other common
> languages.

I don't think the goal is to copy other OOP systems.
Furthermore, some shells use dot as the current working directory and Java
uses dots in package namespaces.
And then there's regular expressions...


Re: [Rd] Underscores in package names

2019-08-15 Thread Jim Hester
Martin,

Thank you for discussing this amongst R-core and for detailing the
R-core discussion here.

Here are some specific examples where having underscores available would
have been useful:

1. My primerTree package (2013) was originally primer_tree, but I had
to change the name to camelCase to comply with the check requirements.
Using camelCase in the package name makes reading code jarring, as the
functions all use snake_case.
2. The widely used testthat package would likely be called test_that,
like the corresponding function within the package. This also
highlights one of the drawbacks of the current situation: without
separators the package name is more difficult to read. Does it have
two t's or three?
3. The assertive suite of packages uses `.` for separation, e.g.
`assertive.base`, `assertive.datetimes` etc., but all functions within
the packages use `_` separators; again, this was likely done out of
necessity rather than desire.

There are many more, I am sure; these were some that came immediately
to mind. More important than the specific examples is the opportunity
cost of having this restriction, which we cannot really quantify.

Using dots for separators has a number of practical problems.
Functions using dots are ambiguous, e.g. is `as.data.frame()` a
regular function, an `as.data()` method for a `frame` object, or an
`as()` method for a `data.frame` object? In fact, regular functions
can be accidentally promoted to S3 methods by defining an S3 generic,
which does actually happen in real life, confusing users [1]. While
package names are not functions, using dots in package names
encourages the use of dots in functions, a dangerous practice. Dots in
names are also one of the common stones cast at R as a language, as
dots are used for object-oriented method dispatch in other common
languages.
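
A minimal sketch of the accidental promotion described above. All
function and class names here (`describe`, `describe.me`, class "me")
are hypothetical and not taken from any package; the point is only that
a plain helper with a dot in its name gets picked up by S3 dispatch as
soon as a matching generic exists:

```r
# A plain helper whose name merely happens to contain a dot
# (hypothetical name, for illustration only).
describe.me <- function(x) "regular helper called"

# Later, possibly in another package, someone defines a generic 'describe'.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "default method called"

obj <- structure(list(), class = "me")

# Dispatch finds describe.me because its name matches <generic>.<class>,
# even though it was never written as a method.
result_me    <- describe(obj)
result_other <- describe(1)
```

Here `result_me` comes from the helper, while `result_other` falls
through to the default method.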

Dotted function names are the only major naming convention whose
prevalence is steadily decreasing over time. It now accounts for only
around 15% of all function names when looking at all 94 million lines
of code currently available on CRAN (see Figure 2 from Yen et al.
[2]).

Thanks again for the public discussion,

Jim

[1]: https://twitter.com/_ColinFay/status/1105579764797108230
[2]: https://osf.io/preprints/socarxiv/ts2wq/

On Wed, Aug 14, 2019 at 5:16 AM Martin Maechler wrote:
>
> > Duncan Murdoch
> > on Fri, 9 Aug 2019 20:23:28 -0400 writes:
>
> > On 09/08/2019 4:37 p.m., Gabriel Becker wrote:
> >> Duncan,
> >>
> >>
> >> On Fri, Aug 9, 2019 at 1:17 PM Duncan Murdoch wrote:
> >>
> >> On 09/08/2019 2:41 p.m., Gabriel Becker wrote:
> >> > Note that this proposal would make mypackage_2.3.1 a valid
> >> > *package name*, whose corresponding tarball name might be
> >> > mypackage_2.3.1_2.3.2 after a patch. Yes, it's a silly example,
> >> > but why allow that kind of ambiguity?
> >> >
> >> CRAN already has a package named "FuzzyNumbers.Ext.2", whose tarball is
> >> FuzzyNumbers.Ext.2_3.2.tar.gz, so I think we've already lost that game.
> >>
> >>
> >> I suppose technically 2 is a valid version number for a package (?), so
> >> I suppose you have me there. But as Ben pointed out while I was writing
> >> this, all I can really say is that in practice they read to me (as
> >> someone who has administered R on a large cluster and written
> >> build-system software for it) as substantially different levels of
> >> ambiguity. I do acknowledge, as Ben does, that yes, a more complex
> >> regular expression/splitting algorithm can be written that would handle
> >> the more general package names. I just don't personally see a
> >> motivation that justifies changing something this fundamental (even if
> >> it is both narrow and was initially more or less arbitrarily chosen)
> >> about R at this late date.
> >>
> >> I guess at the end of the day what I'm saying is that breaking and
> >> changing things is sometimes good, but if we're going to rock the
> >> boat, personally I'd want to do so going after bigger wins than this
> >> one. That's just my opinion though.
>
> > Sorry, I wasn't clear.  I agree with you.  I was just saying that the
> > particular argument based on ugly tarball names isn't the reason.
>
> > Duncan Murdoch
>
> Thank you (and Gabe).
>
> We have had some R core internal "talk" about Jim Hester's
> suggestion (of adding underscores to the allowed characters in
> package names).
> Duncan had already given a good reason why such a change would be problematic
> (the underscore being used as unique separator of package name
>  and version in source and binary package archives),
> and with Jim's offer to find and provide patches for all places
> this is used in the R sources, we've convinced ourselves that
> there is much more code "out there", notably 'devops' code in
> scripts, which currently 

Re: [Rd] Underscores in package names

2019-08-14 Thread Martin Maechler
> Duncan Murdoch 
> on Fri, 9 Aug 2019 20:23:28 -0400 writes:

> On 09/08/2019 4:37 p.m., Gabriel Becker wrote:
>> Duncan,
>> 
>> 
>> On Fri, Aug 9, 2019 at 1:17 PM Duncan Murdoch wrote:
>> 
>> On 09/08/2019 2:41 p.m., Gabriel Becker wrote:
>> > Note that this proposal would make mypackage_2.3.1 a valid
>> > *package name*, whose corresponding tarball name might be
>> > mypackage_2.3.1_2.3.2 after a patch. Yes, it's a silly example,
>> > but why allow that kind of ambiguity?
>> >
>> CRAN already has a package named "FuzzyNumbers.Ext.2", whose tarball is
>> FuzzyNumbers.Ext.2_3.2.tar.gz, so I think we've already lost that game.
>> 
>> 
>> I suppose technically 2 is a valid version number for a package (?) so I 
>> suppose you have me there. But as Ben pointed out while I was writing 
>> this, all I can really say is that in practice they read to me (as 
>> someone who has administered R on a large cluster and written 
>> build-system software for it) as substantially different levels of 
>> ambiguity. I do acknowledge, as Ben does, that yes a more complex 
>> regular expression/splitting algorithm can be written that would handle 
>> the more general package names. I just don't personally see a motivation 
>> that justifies changing something this fundamental (even if it is both 
>> narrow and was initially more or less arbitrarily chosen) about R at 
>> this late date.
>> 
>> I guess at the end of the day what I'm saying is that breaking
>> and changing things is sometimes good, but if we're going to rock the
>> boat, personally I'd want to do so going after bigger wins than this one.
>> That's just my opinion though.

> Sorry, I wasn't clear.  I agree with you.  I was just saying that the 
> particular argument based on ugly tarball names isn't the reason.

> Duncan Murdoch

Thank you (and Gabe).

We have had some R core internal "talk" about Jim Hester's
suggestion (of adding underscores to the allowed characters in
package names).
Duncan had already given a good reason why such a change would be problematic
(the underscore being used as the unique separator of package name
 and version in source and binary package archives),
and even with Jim's offer to find and provide patches for all places
this is used in the R sources, we've convinced ourselves that
there is much more code "out there", notably 'devops' code in
scripts, which currently relies on the current package naming
rules and which could break, often only rarely and hence
possibly unnoticed for too long.
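
To make the concern concrete, here is a rough sketch of the kind of
script helper that relies on the single underscore. The function
`parse_tarball()` and the file name "primer_tree_1.0.0.tar.gz" are
hypothetical, written only for illustration; they are not taken from
the R sources or from any real deployment:

```r
# Hypothetical helper of the kind found in deployment scripts: it assumes the
# file name contains exactly one "_" separating package name from version.
parse_tarball <- function(file) {
  base  <- sub("\\.tar\\.gz$", "", basename(file))
  parts <- strsplit(base, "_", fixed = TRUE)[[1]]
  if (length(parts) != 2L) {
    stop("cannot split '", file, "' into name and version")
  }
  list(name = parts[1], version = parts[2])
}

# Works for a name that is valid today, dots and digits included.
ok <- parse_tarball("FuzzyNumbers.Ext.2_3.2.tar.gz")

# A name containing "_" (not valid today) trips the same helper.
bad <- tryCatch(parse_tarball("primer_tree_1.0.0.tar.gz"),
                error = function(e) "error")
```

A helper like this at least fails loudly; one that silently took the
first two pieces would be the "unnoticed for too long" case.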

Also, we've not seen compelling arguments why the current scheme
would be too limited (people mentioned that if you must use a
separator, "." was available).
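
For reference, a small sketch comparing the two patterns quoted in this
thread. The helper `is_valid()` and the `^...$` anchoring are
assumptions made for this illustration, not the way R's own checks are
implemented:

```r
current  <- "[[:alpha:]][[:alnum:].]*[[:alnum:]]"   # pattern quoted in this thread
proposed <- "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"  # Jim Hester's suggested variant

# Anchor the pattern so that the whole name has to match (an assumption here).
is_valid <- function(name, pattern) grepl(paste0("^", pattern, "$"), name)

candidates  <- c("testthat", "assertive.base", "primer_tree", "ggplot2")
current_ok  <- is_valid(candidates, current)
proposed_ok <- is_valid(candidates, proposed)
```

Under the current pattern only "primer_tree" is rejected; the dotted
name and the name with a digit both pass.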

Consequence:  We stay with the stability principle and the
package naming scheme is _not_ going to be changed for now.

Martin Maechler
ETH Zurich and R Core Team


Re: [Rd] Underscores in package names

2019-08-12 Thread Stephen Ellison
To throw a very small pennyworth into this debate, the metRology package I
maintain uses mixed case to highlight R for that community when I'm talking
about or citing it. R takeup in that community is not yet high and the visible
reminder seems to help.

I'll obviously accept a consensus decision for some other case convention taken 
on sound technical grounds, but if this is essentially an aesthetic matter I'd 
prefer not to change it for someone else's idea of what looks pretty and what 
doesn’t. 

Steve Ellison

> -----Original Message-----
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of neonira
> Arinoem
> Sent: 09 August 2019 20:39
> To: Ben Bolker
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] Underscores in package names
> 
> 
> Naming policies are always tricky. The one proposed by Hadley, as the one
> proposed by Google, are usable but not optimal according to most common
> needs, that are
> 
> 1. Name a package
> 2. Name a class
> 3. Name a function
> 4. Name a parameter of a function
> 5. Name a variable
> 
> ...



Re: [Rd] Underscores in package names

2019-08-09 Thread robin hankin
Having written the 'lorentz' ,'Davies' and 'schwarzschild' packages,
I'm interested in packages that are named for a particular person.
There are (by my count) 34 packages on CRAN like this, with names that
are the surname of a particular (real) person.  Of these 34,  only 7
are capitalized.

hankin.ro...@gmail.com



On Sat, Aug 10, 2019 at 6:50 AM Gabriel Becker wrote:
>
> On Fri, Aug 9, 2019 at 11:05 AM neonira Arinoem wrote:
>
> > Won't it be better to have a convention that allows lowercase, dash,
> > underscore and dot as only valid characters for new package names and keep
> > the ancient format validation scheme for older package names?
> >
>
> Validation isn't the only thing we need to do wrt package names. we also
> need to detect them, and particularly,  in at least one case, extract them
> from package tarball filenames (which we also need to be able to
> detect/find).
>
> If we were writing a new language and people wanted to allow snake case in
> package names, sure, but we're talking about changing how a small but
> package names and package tarballs have always (or at least a very long
> time, I didn't check) had the same form, and it seems expressive enough to
> me? I mean periods are allowed if you feel a strong need for something
> other than a letter.
>
> Note that this proposal would make mypackage_2.3.1 a valid *package name*,
> whose corresponding tarball name might be mypackage_2.3.1_2.3.2 after a
> patch. Yes, it's a silly example, but why allow that kind of ambiguity?
>
>
>
> For the record @Ben Bolker 
>
> Packages that mix case anywhere in their package name:
>
> > table(grepl("((^[a-z].*[A-Z])|(^[A-Z].*[a-z]))", row.names(a1)))
>
> FALSE  TRUE
>  8818  5932
>
> Packages which start with lower case and have at least one upper:
>
> > table(grepl("((^[a-z].*[A-Z]))", row.names(a1)))
>
> FALSE  TRUE
> 12315  2435
>
> Packages which start with uppercase and have at least one lower:
>
> > table(grepl("((^[A-Z].*[a-z]))", row.names(a1)))
>
> FALSE  TRUE
> 11253  3497
>
> Packages which take advantage of the above-mentioned legality of periods:
>
> > table(grepl(".", row.names(a1), fixed=TRUE))
>
> FALSE  TRUE
> 14259   491
>
> Packages with pure lower-case alphabetic names:
>
> > table(grepl("^[a-z]+$", row.names(a1)))
>
> FALSE  TRUE
>  7712  7038
>
> Packages with pure upper-case alphabetic names:
>
> > table(grepl("^[A-Z]+$", row.names(a1)))
>
> FALSE  TRUE
> 13636  1114
>
> Packages with at least one numeric digit in their name:
>
> > table(grepl("[0-9]", row.names(a1)))
>
> FALSE  TRUE
> 14208   542
>
>
> It would be interesting to do an actual analysis of the changes in these
> trends over time, but I Really should be working, so that will have to
> either wait or be done by someone else.
> Best,
> ~G
>
>
>
> > This could be implemented by a single function, taking a strictNaming_b_1
> > parameter which defaults to true. Easy to use, and compliance results will
> > vary according to the parameter value, allowing strict compliance for new
> > package names and lazy compliance for older ones.
> >
> > Doing so allows enforcing a new package name convention while also
> > ensuring continuity of compliance for already existing package names.
> >
> > Fabien GELINEAU alias Neonira
> >
> > On Fri, Aug 9, 2019 at 18:40, Kevin Wright wrote:
> >
> > > Please, no.  I'd also like to disallow uppercase letters in package
> > names.
> > > For instance, the cuteness of using a capital "R" in package names is
> > > outweighed by the annoyance of trying to remember which packages use an
> > > upper-case letter.
> > >
> > > On Thu, Aug 8, 2019 at 9:32 AM Jim Hester wrote:
> > >
> > > > Are there technical reasons that package names cannot be snake case?
> > > > This seems to be enforced by `.standard_regexps()$valid_package_name`
> > > > which currently returns
> > > >
> > > >"[[:alpha:]][[:alnum:].]*[[:alnum:]]"
> > > >
> > > > Is there any technical reason this couldn't be altered to accept `_`
> > > > as well, e.g.
> > > >
> > > >   "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
> > > >
> > > > I realize that historically `_` has not always been valid in variable
> > > > names, but this has now been acceptable for 15+ years (since R 1.9.0 I
> > > > believe). Might we also allow underscores for package names?
> > > >
> > > > Jim
> > > >
> > > > __
> > > > R-devel@r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > > >
> > >
> > >
> > > --
> > > Kevin Wright
> > >
> > >
> >

Re: [Rd] Underscores in package names

2019-08-09 Thread Duncan Murdoch

On 09/08/2019 4:37 p.m., Gabriel Becker wrote:
> Duncan,
>
> On Fri, Aug 9, 2019 at 1:17 PM Duncan Murdoch wrote:
>
>> On 09/08/2019 2:41 p.m., Gabriel Becker wrote:
>>> Note that this proposal would make mypackage_2.3.1 a valid
>>> *package name*, whose corresponding tarball name might be
>>> mypackage_2.3.1_2.3.2 after a patch. Yes, it's a silly example,
>>> but why allow that kind of ambiguity?
>>
>> CRAN already has a package named "FuzzyNumbers.Ext.2", whose tarball is
>> FuzzyNumbers.Ext.2_3.2.tar.gz, so I think we've already lost that game.
>
> I suppose technically 2 is a valid version number for a package (?), so I
> suppose you have me there. But as Ben pointed out while I was writing
> this, all I can really say is that in practice they read to me (as
> someone who has administered R on a large cluster and written
> build-system software for it) as substantially different levels of
> ambiguity. I do acknowledge, as Ben does, that yes, a more complex
> regular expression/splitting algorithm can be written that would handle
> the more general package names. I just don't personally see a motivation
> that justifies changing something this fundamental (even if it is both
> narrow and was initially more or less arbitrarily chosen) about R at
> this late date.
>
> I guess at the end of the day what I'm saying is that breaking
> and changing things is sometimes good, but if we're going to rock the
> boat, personally I'd want to do so going after bigger wins than this one.
> That's just my opinion though.


Sorry, I wasn't clear.  I agree with you.  I was just saying that the 
particular argument based on ugly tarball names isn't the reason.


Duncan Murdoch


Re: [Rd] Underscores in package names

2019-08-09 Thread Gabriel Becker
Duncan,


On Fri, Aug 9, 2019 at 1:17 PM Duncan Murdoch wrote:

> On 09/08/2019 2:41 p.m., Gabriel Becker wrote:
> > Note that this proposal would make mypackage_2.3.1 a valid *package
> > name*, whose corresponding tarball name might be mypackage_2.3.1_2.3.2
> > after a patch. Yes, it's a silly example, but why allow that kind of
> > ambiguity?
> >
> CRAN already has a package named "FuzzyNumbers.Ext.2", whose tarball is
> FuzzyNumbers.Ext.2_3.2.tar.gz, so I think we've already lost that game.
>

I suppose technically 2 is a valid version number for a package (?) so I
suppose you have me there. But as Ben pointed out while I was writing this,
all I can really say is that in practice they read to me (as someone who
has administered R on a large cluster and written build-system software for
it) as substantially different levels of ambiguity. I do acknowledge, as
Ben does, that yes a more complex regular expression/splitting algorithm
can be written that would handle the more general package names. I just
don't personally see a motivation that justifies changing something this
fundamental (even if it is both narrow and was initially more or less
arbitrarily chosen) about R at this late date.

I guess at the end of the day what I'm saying is that breaking and
changing things is sometimes good, but if we're going to rock the boat,
personally I'd want to do so going after bigger wins than this one. That's
just my opinion though.

Best,
~G


> Duncan Murdoch
>
>


Re: [Rd] Underscores in package names

2019-08-09 Thread Ben Bolker


 Ugh, but not *as* ambiguous as the proposed example (you can still
split unambiguously on "_"; yes, you could split on "last _" in
Gabriel's example, but ...)
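
A short sketch of the two splitting rules being contrasted, applied to
Gabriel's example string. The variable names are chosen here for
illustration only:

```r
tarball <- "mypackage_2.3.1_2.3.2"

# Split at the first "_": the current convention, where names cannot contain "_".
first_split <- c(name    = sub("_.*$", "", tarball),
                 version = sub("^[^_]*_", "", tarball))

# Split at the last "_": what a parser would need if "_" were allowed in names.
last_split <- c(name    = sub("_[^_]*$", "", tarball),
                version = sub("^.*_", "", tarball))
```

The first rule reads the package as "mypackage", the second as
"mypackage_2.3.1", so the same file name yields two different packages
depending on which rule a script happens to use.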

On 2019-08-09 4:17 p.m., Duncan Murdoch wrote:
> On 09/08/2019 2:41 p.m., Gabriel Becker wrote:
>> Note that this proposal would make mypackage_2.3.1 a valid *package
>> name*, whose corresponding tarball name might be mypackage_2.3.1_2.3.2
>> after a patch. Yes, it's a silly example, but why allow that kind of
>> ambiguity?
>>
> CRAN already has a package named "FuzzyNumbers.Ext.2", whose tarball is
> FuzzyNumbers.Ext.2_3.2.tar.gz, so I think we've already lost that game.
> 
> Duncan Murdoch
>


Re: [Rd] Underscores in package names

2019-08-09 Thread Duncan Murdoch

On 09/08/2019 10:23 a.m., Jim Hester wrote:
> To be clear, I'd be happy to contribute code to make this work, with
> the changes mentioned by Duncan and elsewhere in the codebase, if
> someone on R-core was interested in reviewing it.


You seem to have ignited a lot of discussion.

Just to add my own point of view:  I think removing a restriction on the 
allowed names is a generally bad idea.  I think Rasmus Bååth gave a 
really valid complaint about the variety of naming conventions in R in a 
presentation I saw based on his article


https://journal.r-project.org/archive/2012/RJ-2012-018/index.html

Looking at the article now, it's not as entertaining as I remember his 
presentation was, but it makes good points about the value of consistency.


Duncan Murdoch


Re: [Rd] Underscores in package names

2019-08-09 Thread Duncan Murdoch

On 09/08/2019 2:41 p.m., Gabriel Becker wrote:
> Note that this proposal would make mypackage_2.3.1 a valid *package name*,
> whose corresponding tarball name might be mypackage_2.3.1_2.3.2 after a
> patch. Yes, it's a silly example, but why allow that kind of ambiguity?

CRAN already has a package named "FuzzyNumbers.Ext.2", whose tarball is 
FuzzyNumbers.Ext.2_3.2.tar.gz, so I think we've already lost that game.


Duncan Murdoch


Re: [Rd] Underscores in package names

2019-08-09 Thread neonira Arinoem
Yes, Brian. That's currently possible.

I am not speaking of what is currently possible but of the rules we should
enforce, using both strict compliance for new package names and lazy
compliance for older packages.
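
A rough sketch of what such a two-level check could look like. The
function name `is_valid_package_name()`, the argument name `strict`, and
the strict pattern itself are all hypothetical choices for this
illustration (the strict pattern encodes "lowercase, dash, underscore
and dot" only); the lazy pattern is the existing rule quoted earlier in
the thread:

```r
# Hypothetical sketch of the two-level check; neither the function name nor
# the strict pattern comes from R itself.
is_valid_package_name <- function(name, strict = TRUE) {
  pattern <- if (strict) {
    # new names: lowercase letters, dash, underscore and dot only
    "^[[:lower:]][[:lower:]._-]*[[:lower:]]$"
  } else {
    # existing rule quoted in this thread
    "^[[:alpha:]][[:alnum:].]*[[:alnum:]]$"
  }
  grepl(pattern, name)
}

strict_new  <- is_valid_package_name("primer-tree")
strict_old  <- is_valid_package_name("ggplot2")
lazy_old    <- is_valid_package_name("ggplot2", strict = FALSE)
lazy_snake  <- is_valid_package_name("primer_tree", strict = FALSE)
```

With these patterns "ggplot2" passes only the lazy check, and
"primer_tree" would pass only the strict one.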

On Fri, Aug 9, 2019 at 21:35, Brian G. Peterson wrote:

> On 2019-08-09 14:27, neonira Arinoem wrote:
> > I do not follow you, Gabriel. Package names must not use digits.
> > The tarball will use them, taken from the Version field of the
> > DESCRIPTION file.
> >
> > That's why I consider the weird name you presented as irrelevant,
> > and not to be considered.
>
> ggplot2 ?
>
> Numbers are allowed in package names right now.
>
>
> > On Fri, Aug 9, 2019 at 20:41, Gabriel Becker wrote:
> >
> >>
> >>
> >> On Fri, Aug 9, 2019 at 11:05 AM neonira Arinoem wrote:
> >>
> >>> Won't it be better to have a convention that allows lowercase, dash,
> >>> underscore and dot as only valid characters for new package names and
> >>> keep
> >>> the ancient format validation scheme for older package names?
> >>>
> >>
> >> Validation isn't the only thing we need to do wrt package names. we
> >> also
> >> need to detect them, and particularly,  in at least one case, extract
> >> them
> >> from package tarball filenames (which we also need to be able to
> >> detect/find).
> >>
> >> If we were writing a new language and people wanted to allow snake
> >> case in
> >> package names, sure, but we're talking about changing how a
> >> small but
> >> package names and package tarballs have always (or at least a very
> >> long
> >> time, I didn't check) had the same form, and it seems expressive
> >> enough to
> >> me? I mean periods are allowed if you feel a strong need for something
> >> other than a letter.
> >>
> >> Note that this proposal would make mypackage_2.3.1 a valid *package
> >> name*,
> >> whose corresponding tarball name might be mypackage_2.3.1_2.3.2 after
> >> a
> >> patch. Yes its a silly example, but why allow that kind of ambiguity?
> >>
> >>
> >>
> >> For the record @Ben Bolker 
> >>
> >> Packages that mix case anywhere in their package name:
> >>
> >> > table(grepl("((^[a-z].*[A-Z])|(^[A-Z].*[a-z]))", row.names(a1)))
> >>
> >>
> >> FALSE  TRUE
> >>
> >>  8818  5932
> >>
> >>
> >> Packages which start with lower case and have at least one upper
> >>
> >> > table(grepl("((^[a-z].*[A-Z]))", row.names(a1)))
> >>
> >>
> >> FALSE  TRUE
> >>
> >> 12315  2435
> >>
> >>
> >> Packages which start with uppercase and have at least one lower
> >>
> >> > table(grepl("((^[A-Z].*[a-z]))", row.names(a1)))
> >>
> >>
> >> FALSE  TRUE
> >>
> >> 11253  3497
> >>
> >> Packages which take advantage of the above-mentioned legality of
> >> periods
> >>
> >> > table(grepl(".", row.names(a1), fixed=TRUE))
> >>
> >>
> >> FALSE  TRUE
> >>
> >> 14259   491
> >>
> >> Packages with pure lower-case alphabetic names
> >>
> >> > table(grepl("^[a-z]+$", row.names(a1)))
> >>
> >>
> >> FALSE  TRUE
> >>
> >>  7712  7038
> >>
> >>
> >> Packages with pure upper-case alphabetic names
> >>
> >> > table(grepl("^[A-Z]+$", row.names(a1)))
> >>
> >>
> >> FALSE  TRUE
> >>
> >> 13636  1114
> >>
> >>
> >> Package with at least one numeric digit in their name
> >>
> >> > table(grepl("[0-9]", row.names(a1)))
> >>
> >>
> >> FALSE  TRUE
> >>
> >> 14208   542
> >>
> >>
> >> It would be interesting to do an actual analysis of the changes in
> >> these
> >> trends over time, but I Really should be working, so that will have to
> >> either wait or be done by someone else.
> >> Best,
> >> ~G
> >>
> >>
> >>
> >>> This could be implemented by a single function, taking a
> >>> strictNaming_b_1
> >>> parameter which defaults to true. Easy to use, and compliance results
> >>> will
> >>> vary according to the parameter value, allowing strict compliance for
> >>> new
> >>> package names and lazy compliance for older ones.
> >>>
> >>> Doing so allows to enforce a new package name convention while also
> >>> insuring continuity of compliance for already existing package names.
> >>>
> >>> Fabien GELINEAU alias Neonira
> >>>
> >>> Le ven. 9 août 2019 à 18:40, Kevin Wright  a écrit
> >>> :
> >>>
> >>> > Please, no.  I'd also like to disallow uppercase letters in package
> >>> names.
> >>> > For instance, the cuteness of using a capital "R" in package names is
> >>> > outweighed by the annoyance of trying to remember which packages use
> an
> >>> > upper-case letter.
> >>> >
> >>> > On Thu, Aug 8, 2019 at 9:32 AM Jim Hester 
> >>> > wrote:
> >>> >
> >>> > > Are there technical reasons that package names cannot be snake
> case?
> >>> > > This seems to be enforced by
> `.standard_regexps()$valid_package_name`
> >>> > > which currently returns
> >>> > >
> >>> > >"[[:alpha:]][[:alnum:].]*[[:alnum:]]"
> >>> > >
> >>> > > Is there any technical reason this couldn't be altered to accept
> `_`
> >>> > > as well, e.g.
> >>> > >
> >>> > >   "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
> >>> > >
> >>> > > I realize that historically `_` has not always been valid in
> 

Re: [Rd] Underscores in package names

2019-08-09 Thread Gabriel Becker
Neonira,

On Fri, Aug 9, 2019 at 12:27 PM neonira Arinoem  wrote:

> I do not follow you Gabriel. Package name must not use digit numbers.
> Tarball will use them, taken from the DESCRIPTION file, version field.
>

I was referring to Jim Hester's original proposal, which AFAIU was just to
add "_" to the allowed characters. Yours goes much further: it also adds the
dash but removes all numbers (which I admit I didn't notice) and upper-case
letters. This is a much more radical change, and one I don't really
understand the justification for. I get forcing lowercase (I'd rather the
machinery were just case-insensitive, myself), but disallowing numbers,
given that one of the most popular contributed packages of all time,
ggplot2, has a number in it, seems strange. I also don't really grok the
desire for dashes on top of periods and underscores.


Best,

~G



> That's why I consider the weird case name you presented as irrelevant, and
> not to be considered.
>
>
> On Fri, Aug 9, 2019 at 8:41 PM, Gabriel Becker wrote:
>
>>
>>
>> On Fri, Aug 9, 2019 at 11:05 AM neonira Arinoem 
>> wrote:
>>
>>> Won't it be better to have a convention that allows lowercase, dash,
>>> underscore and dot as only valid characters for new package names and
>>> keep
>>> the ancient format validation scheme for older package names?
>>>
>>
>> Validation isn't the only thing we need to do wrt package names. we also
>> need to detect them, and particularly,  in at least one case, extract them
>> from package tarball filenames (which we also need to be able to
>> detect/find).
>>
>> If we were writing a new language and people wanted to allow snake case
>> in package names, sure, but we're talking about about changing how a small
>> but package names and package tarballs have always (or at least a very long
>> time, I didn't check) had the same form, and it seems expressive enough to
>> me? I mean periods are allowed if you feel a strong need for something
>> other than a letter.
>>
>> Note that this proposal would make mypackage_2.3.1 a valid *package name*,
>> whose corresponding tarball name might be mypackage_2.3.1_2.3.2 after a
>> patch. Yes its a silly example, but why allow that kind of ambiguity?
>>
>>
>>
>> For the record @Ben Bolker 
>>
>> Packages that mix case anywhere in their package name:
>>
>> > table(grepl("((^[a-z].*[A-Z])|(^[A-Z].*[a-z]))", row.names(a1)))
>>
>>
>> FALSE  TRUE
>>
>>  8818  5932
>>
>>
>> Packages which start with lower case and have at least one upper
>>
>> > table(grepl("((^[a-z].*[A-Z]))", row.names(a1)))
>>
>>
>> FALSE  TRUE
>>
>> 12315  2435
>>
>>
>> Packages which start with uppercase and have at least one lower
>>
>> > table(grepl("((^[A-Z].*[a-z]))", row.names(a1)))
>>
>>
>> FALSE  TRUE
>>
>> 11253  3497
>>
>> Packages which take advantage of the above-mentioned legality of periods
>>
>> > table(grepl(".", row.names(a1), fixed=TRUE))
>>
>>
>> FALSE  TRUE
>>
>> 14259   491
>>
>> Packages with pure lower-case alphabetic names
>>
>> > table(grepl("^[a-z]+$", row.names(a1)))
>>
>>
>> FALSE  TRUE
>>
>>  7712  7038
>>
>>
>> Packages with pure upper-case alphabetic names
>>
>> > table(grepl("^[A-Z]+$", row.names(a1)))
>>
>>
>> FALSE  TRUE
>>
>> 13636  1114
>>
>>
>> Package with at least one numeric digit in their name
>>
>> > table(grepl("[0-9]", row.names(a1)))
>>
>>
>> FALSE  TRUE
>>
>> 14208   542
>>
>>
>> It would be interesting to do an actual analysis of the changes in these
>> trends over time, but I Really should be working, so that will have to
>> either wait or be done by someone else.
>> Best,
>> ~G
>>
>>
>>
>>> This could be implemented by a single function, taking a strictNaming_b_1
>>> parameter which defaults to true. Easy to use, and compliance results
>>> will
>>> vary according to the parameter value, allowing strict compliance for new
>>> package names and lazy compliance for older ones.
>>>
>>> Doing so allows to enforce a new package name convention while also
>>> insuring continuity of compliance for already existing package names.
>>>
>>> Fabien GELINEAU alias Neonira
>>>
>>> Le ven. 9 août 2019 à 18:40, Kevin Wright  a écrit :
>>>
>>> > Please, no.  I'd also like to disallow uppercase letters in package
>>> names.
>>> > For instance, the cuteness of using a capital "R" in package names is
>>> > outweighed by the annoyance of trying to remember which packages use an
>>> > upper-case letter.
>>> >
>>> > On Thu, Aug 8, 2019 at 9:32 AM Jim Hester 
>>> > wrote:
>>> >
>>> > > Are there technical reasons that package names cannot be snake case?
>>> > > This seems to be enforced by `.standard_regexps()$valid_package_name`
>>> > > which currently returns
>>> > >
>>> > >"[[:alpha:]][[:alnum:].]*[[:alnum:]]"
>>> > >
>>> > > Is there any technical reason this couldn't be altered to accept `_`
>>> > > as well, e.g.
>>> > >
>>> > >   "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
>>> > >
>>> > > I realize that historically `_` has not always been valid in variable
>>> > > names, but this has now been 

Re: [Rd] Underscores in package names

2019-08-09 Thread neonira Arinoem
Naming policies are always tricky. The one proposed by Hadley, like the one
proposed by Google, is usable but not optimal for the most common
needs, which are:

1. Name a package
2. Name a class
3. Name a function
4. Name a parameter of a function
5. Name a variable


My approach is the following:

1. Package names should be made of lowercase characters, dash, dot and
underscore.

2. Class names are UpperCamelCased.

3. Function names are lowerCamelCased.

4. Function parameters are semantic names resulting from an
underscore-separated lowerCamelCased function name, type acronym and length
specification.

5. Variables should be snake case.


That way you cannot confuse one for the other. This gives a clear view,
eases reading and speeds up implementation.

As always, this could be applied to new packages and, to some extent, to
package upgrades.

What do you think of such an approach?


On Fri, Aug 9, 2019 at 8:18 PM, Ben Bolker wrote:

>
>   Creeping code complexity ...
>
>   I like to think that the cuteR names will have a Darwinian
> disadvantage in the long run. FWIW Hadley Wickham argues (rightly, I
> think) against mixed-case names:
> http://r-pkgs.had.co.nz/package.html#naming. I too am guilty of picking
> mixed-case package names in the past.  Extra credit if the package name
> and the standard function have different cases! e.g.
> glmmADMB::glmmadmb(), although (a) that wasn't my choice and (b) at
> least it was never on CRAN and (c) it wasn't one of the cuteR variety.
>
>   Bonus points for the first analysis of case conventions in existing
> CRAN package names ... I'll start.
>
> > a1 <- rownames(available.packages())
> > cute <- "[a-z]*R[a-z]*"
> > table(grepl(cute,a1))
>
> FALSE  TRUE
> 12565  2185
>
>
> On 2019-08-09 2:00 p.m., neonira Arinoem wrote:
> > Won't it be better to have a convention that allows lowercase, dash,
> > underscore and dot as only valid characters for new package names and
> keep
> > the ancient format validation scheme for older package names?
> >
> > This could be implemented by a single function, taking a strictNaming_b_1
> > parameter which defaults to true. Easy to use, and compliance results
> will
> > vary according to the parameter value, allowing strict compliance for new
> > package names and lazy compliance for older ones.
> >
> > Doing so allows to enforce a new package name convention while also
> > insuring continuity of compliance for already existing package names.
> >
> > Fabien GELINEAU alias Neonira
> >
> > Le ven. 9 août 2019 à 18:40, Kevin Wright  a écrit :
> >
> >> Please, no.  I'd also like to disallow uppercase letters in package
> names.
> >> For instance, the cuteness of using a capital "R" in package names is
> >> outweighed by the annoyance of trying to remember which packages use an
> >> upper-case letter.
> >>
> >> On Thu, Aug 8, 2019 at 9:32 AM Jim Hester 
> >> wrote:
> >>
> >>> Are there technical reasons that package names cannot be snake case?
> >>> This seems to be enforced by `.standard_regexps()$valid_package_name`
> >>> which currently returns
> >>>
> >>>"[[:alpha:]][[:alnum:].]*[[:alnum:]]"
> >>>
> >>> Is there any technical reason this couldn't be altered to accept `_`
> >>> as well, e.g.
> >>>
> >>>   "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
> >>>
> >>> I realize that historically `_` has not always been valid in variable
> >>> names, but this has now been acceptable for 15+ years (since R 1.9.0 I
> >>> believe). Might we also allow underscores for package names?
> >>>
> >>> Jim
> >>>
> >>> __
> >>> R-devel@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>>
> >>
> >>
> >> --
> >> Kevin Wright
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Underscores in package names

2019-08-09 Thread Brian G. Peterson

On 2019-08-09 14:27, neonira Arinoem wrote:

> I do not follow you Gabriel. Package name must not use digit numbers.
> Tarball will use them, taken from the DESCRIPTION file, version field.
>
> That's why I consider the weird case name you presented as irrelevant, and
> not to be considered.


ggplot2 ?

Numbers are allowed in package names right now.
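
As a quick check of that point, here is a minimal sketch that tests a few names against the pattern quoted earlier in the thread as the value of `.standard_regexps()$valid_package_name`. The pattern is copied in as a literal, and the helper `is_valid_name` is a hypothetical name used only for illustration.

```r
# Pattern as quoted in the thread (copied as a literal string).
valid_package_name <- "[[:alpha:]][[:alnum:].]*[[:alnum:]]"

# Hypothetical helper: anchor the pattern so the whole name has to match.
is_valid_name <- function(x) {
  grepl(paste0("^", valid_package_name, "$"), x)
}

# A digit is accepted anywhere except the first position;
# an underscore is rejected.
is_valid_name(c("ggplot2", "data.table", "primer_tree", "2fast"))
# TRUE TRUE FALSE FALSE
```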




--
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock

__
R-devel@r-project.org mailing list

Re: [Rd] Underscores in package names

2019-08-09 Thread neonira Arinoem
I do not follow you, Gabriel. Package names must not use digits.
The tarball will use them, taken from the Version field of the DESCRIPTION
file.

That's why I consider the odd name you presented irrelevant, and
not to be considered.


On Fri, Aug 9, 2019 at 8:41 PM, Gabriel Becker wrote:

>
>
> On Fri, Aug 9, 2019 at 11:05 AM neonira Arinoem  wrote:
>
>> Won't it be better to have a convention that allows lowercase, dash,
>> underscore and dot as only valid characters for new package names and keep
>> the ancient format validation scheme for older package names?
>>
>
> Validation isn't the only thing we need to do wrt package names. we also
> need to detect them, and particularly,  in at least one case, extract them
> from package tarball filenames (which we also need to be able to
> detect/find).
>
> If we were writing a new language and people wanted to allow snake case in
> package names, sure, but we're talking about about changing how a small but
> package names and package tarballs have always (or at least a very long
> time, I didn't check) had the same form, and it seems expressive enough to
> me? I mean periods are allowed if you feel a strong need for something
> other than a letter.
>
> Note that this proposal would make mypackage_2.3.1 a valid *package name*,
> whose corresponding tarball name might be mypackage_2.3.1_2.3.2 after a
> patch. Yes its a silly example, but why allow that kind of ambiguity?
>
>
>
> For the record @Ben Bolker 
>
> Packages that mix case anywhere in their package name:
>
> > table(grepl("((^[a-z].*[A-Z])|(^[A-Z].*[a-z]))", row.names(a1)))
>
>
> FALSE  TRUE
>
>  8818  5932
>
>
> Packages which start with lower case and have at least one upper
>
> > table(grepl("((^[a-z].*[A-Z]))", row.names(a1)))
>
>
> FALSE  TRUE
>
> 12315  2435
>
>
> Packages which start with uppercase and have at least one lower
>
> > table(grepl("((^[A-Z].*[a-z]))", row.names(a1)))
>
>
> FALSE  TRUE
>
> 11253  3497
>
> Packages which take advantage of the above-mentioned legality of periods
>
> > table(grepl(".", row.names(a1), fixed=TRUE))
>
>
> FALSE  TRUE
>
> 14259   491
>
> Packages with pure lower-case alphabetic names
>
> > table(grepl("^[a-z]+$", row.names(a1)))
>
>
> FALSE  TRUE
>
>  7712  7038
>
>
> Packages with pure upper-case alphabetic names
>
> > table(grepl("^[A-Z]+$", row.names(a1)))
>
>
> FALSE  TRUE
>
> 13636  1114
>
>
> Package with at least one numeric digit in their name
>
> > table(grepl("[0-9]", row.names(a1)))
>
>
> FALSE  TRUE
>
> 14208   542
>
>
> It would be interesting to do an actual analysis of the changes in these
> trends over time, but I Really should be working, so that will have to
> either wait or be done by someone else.
> Best,
> ~G
>
>
>
>> This could be implemented by a single function, taking a strictNaming_b_1
>> parameter which defaults to true. Easy to use, and compliance results will
>> vary according to the parameter value, allowing strict compliance for new
>> package names and lazy compliance for older ones.
>>
>> Doing so allows to enforce a new package name convention while also
>> insuring continuity of compliance for already existing package names.
>>
>> Fabien GELINEAU alias Neonira
>>
>> Le ven. 9 août 2019 à 18:40, Kevin Wright  a écrit :
>>
>> > Please, no.  I'd also like to disallow uppercase letters in package
>> names.
>> > For instance, the cuteness of using a capital "R" in package names is
>> > outweighed by the annoyance of trying to remember which packages use an
>> > upper-case letter.
>> >
>> > On Thu, Aug 8, 2019 at 9:32 AM Jim Hester 
>> > wrote:
>> >
>> > > Are there technical reasons that package names cannot be snake case?
>> > > This seems to be enforced by `.standard_regexps()$valid_package_name`
>> > > which currently returns
>> > >
>> > >"[[:alpha:]][[:alnum:].]*[[:alnum:]]"
>> > >
>> > > Is there any technical reason this couldn't be altered to accept `_`
>> > > as well, e.g.
>> > >
>> > >   "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
>> > >
>> > > I realize that historically `_` has not always been valid in variable
>> > > names, but this has now been acceptable for 15+ years (since R 1.9.0 I
>> > > believe). Might we also allow underscores for package names?
>> > >
>> > > Jim
>> > >
>> > > __
>> > > R-devel@r-project.org mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/r-devel
>> > >
>> >
>> >
>> > --
>> > Kevin Wright
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > __
>> > R-devel@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list

Re: [Rd] Underscores in package names

2019-08-09 Thread Gabriel Becker
On Fri, Aug 9, 2019 at 11:05 AM neonira Arinoem  wrote:

> Won't it be better to have a convention that allows lowercase, dash,
> underscore and dot as only valid characters for new package names and keep
> the ancient format validation scheme for older package names?
>

Validation isn't the only thing we need to do wrt package names. We also
need to detect them and, in at least one case, extract them
from package tarball filenames (which we also need to be able to
detect/find).

If we were writing a new language and people wanted to allow snake case in
package names, sure, but we're talking about changing how a small but
package names and package tarballs have always (or at least a very long
time, I didn't check) had the same form, and it seems expressive enough to
me? I mean periods are allowed if you feel a strong need for something
other than a letter.

Note that this proposal would make mypackage_2.3.1 a valid *package name*,
whose corresponding tarball name might be mypackage_2.3.1_2.3.2 after a
patch. Yes, it's a silly example, but why allow that kind of ambiguity?
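
The ambiguity can be made concrete with a small sketch. It assumes a `.tar.gz` suffix on the example name and splits the file name in two plausible ways; the variable names are illustrative and are not taken from any existing tool.

```r
# Tarball name from the example above, with an assumed .tar.gz suffix.
tarball <- "mypackage_2.3.1_2.3.2.tar.gz"
stem <- sub("\\.tar\\.gz$", "", tarball)

# Reading 1: the package name stops at the first underscore.
name_first    <- sub("_.*$", "", stem)      # "mypackage"
version_first <- sub("^[^_]*_", "", stem)   # "2.3.1_2.3.2"

# Reading 2: the version starts after the last underscore.
name_last    <- sub("_[^_]*$", "", stem)    # "mypackage_2.3.1"
version_last <- sub("^.*_", "", stem)       # "2.3.2"
```

While underscores are not allowed in package names, the two readings always agree, so code that splits on either the first or the last underscore gets the same answer. Once underscores are allowed, only the second reading recovers the intended name, and code written against the first one would silently break.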



For the record @Ben Bolker 

Packages that mix case anywhere in their package name:

> table(grepl("((^[a-z].*[A-Z])|(^[A-Z].*[a-z]))", row.names(a1)))

FALSE  TRUE
 8818  5932

Packages which start with lower case and have at least one upper:

> table(grepl("((^[a-z].*[A-Z]))", row.names(a1)))

FALSE  TRUE
12315  2435

Packages which start with uppercase and have at least one lower:

> table(grepl("((^[A-Z].*[a-z]))", row.names(a1)))

FALSE  TRUE
11253  3497

Packages which take advantage of the above-mentioned legality of periods:

> table(grepl(".", row.names(a1), fixed=TRUE))

FALSE  TRUE
14259   491

Packages with pure lower-case alphabetic names:

> table(grepl("^[a-z]+$", row.names(a1)))

FALSE  TRUE
 7712  7038

Packages with pure upper-case alphabetic names:

> table(grepl("^[A-Z]+$", row.names(a1)))

FALSE  TRUE
13636  1114

Packages with at least one numeric digit in their name:

> table(grepl("[0-9]", row.names(a1)))

FALSE  TRUE
14208   542

It would be interesting to do an actual analysis of the changes in these
trends over time, but I really should be working, so that will have to
either wait or be done by someone else.
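
For anyone who wants to rerun these counts, the following is a minimal sketch that collects the same patterns in one named vector and tabulates them in a single pass. The calls above use `row.names(a1)`, which suggests `a1` held the matrix returned by `available.packages()`; that is an inference. The sketch uses a short hand-written vector of names instead so that it runs without network access, and `perl = TRUE` is an added choice intended to keep `[a-z]` and `[A-Z]` independent of the locale.

```r
# Patterns taken from the calls above; the labels are illustrative.
patterns <- c(
  mixed_case = "(^[a-z].*[A-Z])|(^[A-Z].*[a-z])",
  has_period = "\\.",
  pure_lower = "^[a-z]+$",
  pure_upper = "^[A-Z]+$",
  has_digit  = "[0-9]"
)

# Small illustrative sample; swap in rownames(available.packages())
# to count over a full repository index.
pkgs <- c("ggplot2", "data.table", "MASS", "Rcpp", "knitr", "testthat")

# Number of names matching each pattern.
counts <- vapply(
  patterns,
  function(p) sum(grepl(p, pkgs, perl = TRUE)),
  integer(1)
)
counts
```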
Best,
~G



> This could be implemented by a single function, taking a strictNaming_b_1
> parameter which defaults to true. Easy to use, and compliance results will
> vary according to the parameter value, allowing strict compliance for new
> package names and lazy compliance for older ones.
>
> Doing so allows to enforce a new package name convention while also
> insuring continuity of compliance for already existing package names.
>
> Fabien GELINEAU alias Neonira
>
> On Fri, Aug 9, 2019 at 6:40 PM, Kevin Wright wrote:
>
> > Please, no.  I'd also like to disallow uppercase letters in package
> names.
> > For instance, the cuteness of using a capital "R" in package names is
> > outweighed by the annoyance of trying to remember which packages use an
> > upper-case letter.
> >
> > On Thu, Aug 8, 2019 at 9:32 AM Jim Hester 
> > wrote:
> >
> > > Are there technical reasons that package names cannot be snake case?
> > > This seems to be enforced by `.standard_regexps()$valid_package_name`
> > > which currently returns
> > >
> > >"[[:alpha:]][[:alnum:].]*[[:alnum:]]"
> > >
> > > Is there any technical reason this couldn't be altered to accept `_`
> > > as well, e.g.
> > >
> > >   "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
> > >
> > > I realize that historically `_` has not always been valid in variable
> > > names, but this has now been acceptable for 15+ years (since R 1.9.0 I
> > > believe). Might we also allow underscores for package names?
> > >
> > > Jim
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
> >
> >
> > --
> > Kevin Wright
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Underscores in package names

2019-08-09 Thread Tobias Verbeke
> Creeping code complexity ...
> 
>  I like to think that the cuteR names will have a Darwinian
> disadvantage in the long run. FWIW Hadley Wickham argues (rightly, I
> think) against mixed-case names:
> http://r-pkgs.had.co.nz/package.html#naming.

Good development environments will offer content assist (or tab completion or 
similar) which will not be hindered by naming conventions (whether camel case, 
dromedary case or other forms that snaked into the R world). Talking about 
Darwinian advantages, Wikipedia[1] just taught me about the existence of 
'darwin case' ?!

Best,
Tobias

[1] https://en.wikipedia.org/wiki/Camel_case

> On 2019-08-09 2:00 p.m., neonira Arinoem wrote:
>> Won't it be better to have a convention that allows lowercase, dash,
>> underscore and dot as only valid characters for new package names and keep
>> the ancient format validation scheme for older package names?
>> 
>> This could be implemented by a single function, taking a strictNaming_b_1
>> parameter which defaults to true. Easy to use, and compliance results will
>> vary according to the parameter value, allowing strict compliance for new
>> package names and lazy compliance for older ones.
>> 
>> Doing so allows to enforce a new package name convention while also
>> insuring continuity of compliance for already existing package names.
>> 
>> Fabien GELINEAU alias Neonira
>> 
>> Le ven. 9 août 2019 à 18:40, Kevin Wright  a écrit :
>> 
>>> Please, no.  I'd also like to disallow uppercase letters in package names.
>>> For instance, the cuteness of using a capital "R" in package names is
>>> outweighed by the annoyance of trying to remember which packages use an
>>> upper-case letter.
>>>
>>> On Thu, Aug 8, 2019 at 9:32 AM Jim Hester 
>>> wrote:
>>>
>>>> Are there technical reasons that package names cannot be snake case?
>>>> This seems to be enforced by `.standard_regexps()$valid_package_name`
>>>> which currently returns
>>>>
>>>>   "[[:alpha:]][[:alnum:].]*[[:alnum:]]"
>>>>
>>>> Is there any technical reason this couldn't be altered to accept `_`
>>>> as well, e.g.
>>>>
>>>>   "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
>>>>
>>>> I realize that historically `_` has not always been valid in variable
>>>> names, but this has now been acceptable for 15+ years (since R 1.9.0 I
>>>> believe). Might we also allow underscores for package names?
>>>>
>>>> Jim
>>>>
>>>> __
>>>> R-devel@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>>
>>>
>>> --
>>> Kevin Wright
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>> 
>>  [[alternative HTML version deleted]]
>> 
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Underscores in package names

2019-08-09 Thread Ben Bolker


  Creeping code complexity ...

  I like to think that the cuteR names will have a Darwinian
disadvantage in the long run. FWIW Hadley Wickham argues (rightly, I
think) against mixed-case names:
http://r-pkgs.had.co.nz/package.html#naming. I too am guilty of picking
mixed-case package names in the past.  Extra credit if the package name
and the standard function have different cases! e.g.
glmmADMB::glmmadmb(), although (a) that wasn't my choice and (b) at
least it was never on CRAN and (c) it wasn't one of the cuteR variety.

  Bonus points for the first analysis of case conventions in existing
CRAN package names ... I'll start.

> a1 <- rownames(available.packages())
> cute <- "[a-z]*R[a-z]*"
> table(grepl(cute,a1))

FALSE  TRUE
12565  2185
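
One note on what this pattern counts: `[a-z]*` may match zero characters and the pattern is not anchored, so `grepl()` returns TRUE for every name that contains a capital "R" anywhere. The sketch below shows this on a few illustrative package names; the anchored variant is only one possible narrower definition of "cute", not something proposed in the thread.

```r
pkgs <- c("Rcpp", "knitr", "ROCR", "MASS")   # illustrative sample

cute <- "[a-z]*R[a-z]*"
unanchored <- grepl(cute, pkgs)
contains_R <- grepl("R", pkgs, fixed = TRUE)

# On this sample the unanchored pattern gives the same answer
# as a plain search for "R".
identical(unanchored, contains_R)

# Assumed narrower definition: exactly one capital R, every other
# character a lowercase letter. perl = TRUE is meant to keep [a-z]
# independent of the locale.
anchored <- grepl("^[a-z]*R[a-z]*$", pkgs, perl = TRUE)
```

On this sample the unanchored pattern flags both Rcpp and ROCR, while the anchored one flags only Rcpp.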


On 2019-08-09 2:00 p.m., neonira Arinoem wrote:
> Won't it be better to have a convention that allows lowercase, dash,
> underscore and dot as only valid characters for new package names and keep
> the ancient format validation scheme for older package names?
> 
> This could be implemented by a single function, taking a strictNaming_b_1
> parameter which defaults to true. Easy to use, and compliance results will
> vary according to the parameter value, allowing strict compliance for new
> package names and lazy compliance for older ones.
> 
> Doing so allows to enforce a new package name convention while also
> insuring continuity of compliance for already existing package names.
> 
> Fabien GELINEAU alias Neonira
> 
> On Fri, Aug 9, 2019 at 6:40 PM, Kevin Wright wrote:
> 
>> Please, no.  I'd also like to disallow uppercase letters in package names.
>> For instance, the cuteness of using a capital "R" in package names is
>> outweighed by the annoyance of trying to remember which packages use an
>> upper-case letter.
>>
>> On Thu, Aug 8, 2019 at 9:32 AM Jim Hester 
>> wrote:
>>
>>> Are there technical reasons that package names cannot be snake case?
>>> This seems to be enforced by `.standard_regexps()$valid_package_name`
>>> which currently returns
>>>
>>>"[[:alpha:]][[:alnum:].]*[[:alnum:]]"
>>>
>>> Is there any technical reason this couldn't be altered to accept `_`
>>> as well, e.g.
>>>
>>>   "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
>>>
>>> I realize that historically `_` has not always been valid in variable
>>> names, but this has now been acceptable for 15+ years (since R 1.9.0 I
>>> believe). Might we also allow underscores for package names?
>>>
>>> Jim
>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>>
>> --
>> Kevin Wright
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Underscores in package names

2019-08-09 Thread neonira Arinoem
Wouldn't it be better to have a convention that allows only lowercase
letters, dash, underscore and dot as valid characters in new package names,
and keeps the old validation scheme for existing package names?

This could be implemented by a single function taking a strictNaming_b_1
parameter that defaults to true. It would be easy to use, and the compliance
result would vary with the parameter value: strict compliance for new
package names and lenient compliance for older ones.

Doing so would make it possible to enforce a new package name convention
while also ensuring continuity of compliance for existing package names.
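
A minimal sketch of what such a function might look like. Only the
strictNaming_b_1 parameter name comes from the proposal; the function name
and both patterns are assumptions made for illustration. In particular, the
strict pattern reads "lowercase, dash, underscore and dot" literally and so
excludes digits, and the legacy pattern is the current rule quoted below.

```r
# Hypothetical validator; not part of base R or the tools package.
is_valid_package_name <- function(name, strictNaming_b_1 = TRUE) {
  # Assumed strict rule: lowercase letters, with dot, underscore or dash
  # allowed only between the first and last character.
  strict <- "^[[:lower:]][[:lower:]._-]*[[:lower:]]$"
  # Legacy rule: the existing valid_package_name pattern, anchored.
  legacy <- "^[[:alpha:]][[:alnum:].]*[[:alnum:]]$"
  grepl(if (strictNaming_b_1) strict else legacy, name)
}

is_valid_package_name("test_that")                            # new-style name
is_valid_package_name("primerTree", strictNaming_b_1 = FALSE) # existing name
```

Under these assumptions a new name such as test_that passes only the strict
check, and an existing name such as primerTree passes only the legacy check.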

Fabien GELINEAU alias Neonira

On Fri, Aug 9, 2019 at 18:40, Kevin Wright wrote:

> Please, no.  I'd also like to disallow uppercase letters in package names.
> For instance, the cuteness of using a capital "R" in package names is
> outweighed by the annoyance of trying to remember which packages use an
> upper-case letter.
>
> On Thu, Aug 8, 2019 at 9:32 AM Jim Hester 
> wrote:
>
> > Are there technical reasons that package names cannot be snake case?
> > This seems to be enforced by `.standard_regexps()$valid_package_name`
> > which currently returns
> >
> >    "[[:alpha:]][[:alnum:].]*[[:alnum:]]"
> >
> > Is there any technical reason this couldn't be altered to accept `_`
> > as well, e.g.
> >
> >   "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
> >
> > I realize that historically `_` has not always been valid in variable
> > names, but this has now been acceptable for 15+ years (since R 1.9.0 I
> > believe). Might we also allow underscores for package names?
> >
> > Jim
> >
>
>
> --
> Kevin Wright
>


Re: [Rd] Underscores in package names

2019-08-09 Thread Gabriel Becker
Hi Jim,

While it's true that it wouldn't be *particularly* hard^^ to adapt the base
code to change this, there is certainly a non-zero amount of user/package
code that relies on the well-defined package tarball naming scheme as well.
I know because I've written some myself in switchr/GRAN* but I seriously
doubt I'm the only one. I would imagine there's also quite a bit more if
you include DevOps-style build/administration scripts and not just user R
code.

To me, the benefit of this change seems a pretty minor "nice-to-have" when
weighed against breaking even a moderate amount of existing code.

^^ making sure we found every place the tarball naming scheme/package name
constraints are implicitly assumed in the R sources might well be less
trivial than we think, though once found I agree the changes would likely
be *relatively* straightforward. For example, I happen to know that in
addition to the places Duncan pointed out,  tools::update_PACKAGES relies
heavily on code that extracts the name and version of a package from
something that "looks like a package tarball" as an optimization mechanism,
so that would need to be reworked. It seems likely (almost certain?) that
write_PACKAGES also relies on matching the tarball-name pattern when
determining which packages are present, though I remember fewer details
there because I didn't write most of it.
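
To make the dependency concrete, here is a rough sketch of the kind of
filename matching described above. It is not the code used by
tools::update_PACKAGES or write_PACKAGES; the variable names, the version
pattern and the file list are assumptions for illustration only.

```r
# Hypothetical filter for files that "look like a package tarball".
name_re    <- "[[:alpha:]][[:alnum:].]*[[:alnum:]]"  # current package-name rule
version_re <- "[0-9]+([-.][0-9]+)+"                  # e.g. 1.0 or 0.0-7
tarball_re <- paste0("^(", name_re, ")_(", version_re, ")\\.tar\\.gz$")

# Made-up directory listing.
files <- c("testthat_2.2.1.tar.gz", "assertive.base_0.0-7.tar.gz",
           "primer_tree_1.0.0.tar.gz", "README.md", "PACKAGES.gz")

is_tarball  <- grepl(tarball_re, files)
pkg_name    <- sub(tarball_re, "\\1", files[is_tarball])
pkg_version <- sub(tarball_re, "\\2", files[is_tarball])
```

With the current name rule baked into the pattern, a file such as
primer_tree_1.0.0.tar.gz is silently skipped, which is the sort of implicit
assumption that would have to be found and reworked.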

Best,
~G



On Fri, Aug 9, 2019 at 9:40 AM Kevin Wright  wrote:

> Please, no.  I'd also like to disallow uppercase letters in package names.
> For instance, the cuteness of using a capital "R" in package names is
> outweighed by the annoyance of trying to remember which packages use an
> upper-case letter.
>
> On Thu, Aug 8, 2019 at 9:32 AM Jim Hester 
> wrote:
>
> > Are there technical reasons that package names cannot be snake case?
> > This seems to be enforced by `.standard_regexps()$valid_package_name`
> > which currently returns
> >
> >    "[[:alpha:]][[:alnum:].]*[[:alnum:]]"
> >
> > Is there any technical reason this couldn't be altered to accept `_`
> > as well, e.g.
> >
> >   "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
> >
> > I realize that historically `_` has not always been valid in variable
> > names, but this has now been acceptable for 15+ years (since R 1.9.0 I
> > believe). Might we also allow underscores for package names?
> >
> > Jim
> >
>
>
> --
> Kevin Wright
>


Re: [Rd] Underscores in package names

2019-08-09 Thread Kevin Wright
Please, no.  I'd also like to disallow uppercase letters in package names.
For instance, the cuteness of using a capital "R" in package names is
outweighed by the annoyance of trying to remember which packages use an
upper-case letter.

On Thu, Aug 8, 2019 at 9:32 AM Jim Hester  wrote:

> Are there technical reasons that package names cannot be snake case?
> This seems to be enforced by `.standard_regexps()$valid_package_name`
> which currently returns
>
>    "[[:alpha:]][[:alnum:].]*[[:alnum:]]"
>
> Is there any technical reason this couldn't be altered to accept `_`
> as well, e.g.
>
>   "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
>
> I realize that historically `_` has not always been valid in variable
> names, but this has now been acceptable for 15+ years (since R 1.9.0 I
> believe). Might we also allow underscores for package names?
>
> Jim
>


-- 
Kevin Wright



Re: [Rd] Underscores in package names

2019-08-09 Thread Jim Hester
To be clear, I'd be happy to contribute code to make this work, with
the changes mentioned by Duncan and elsewhere in the codebase, if
someone on R-core was interested in reviewing it.

Jim

On Thu, Aug 8, 2019 at 11:05 AM Duncan Murdoch  wrote:
>
> On 08/08/2019 10:31 a.m., Jim Hester wrote:
> > Are there technical reasons that package names cannot be snake case?
> > This seems to be enforced by `.standard_regexps()$valid_package_name`
> > which currently returns
> >
> > "[[:alpha:]][[:alnum:].]*[[:alnum:]]"
> >
> > Is there any technical reason this couldn't be altered to accept `_`
> > as well, e.g.
> >
> >    "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
> >
> > I realize that historically `_` has not always been valid in variable
> > names, but this has now been acceptable for 15+ years (since R 1.9.0 I
> > believe). Might we also allow underscores for package names?
>
> The tarball names separate the package name from the version number
> using an underscore.  There is code that is written to assume there is
> at most one underscore, e.g. .check_package_CRAN_incoming in
> src/library/tools/R/QC.R.
>
> That code could be changed, but so could the proposed package name...
>
> Duncan Murdoch



Re: [Rd] Underscores in package names

2019-08-08 Thread Duncan Murdoch

On 08/08/2019 10:31 a.m., Jim Hester wrote:

> Are there technical reasons that package names cannot be snake case?
> This seems to be enforced by `.standard_regexps()$valid_package_name`
> which currently returns
>
>    "[[:alpha:]][[:alnum:].]*[[:alnum:]]"
>
> Is there any technical reason this couldn't be altered to accept `_`
> as well, e.g.
>
>    "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
>
> I realize that historically `_` has not always been valid in variable
> names, but this has now been acceptable for 15+ years (since R 1.9.0 I
> believe). Might we also allow underscores for package names?


The tarball names separate the package name from the version number 
using an underscore.  There is code that is written to assume there is 
at most one underscore, e.g. .check_package_CRAN_incoming in 
src/library/tools/R/QC.R.


That code could be changed, but so could the proposed package name...
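
As an illustration of the single-underscore assumption, here is a small
sketch. Both helpers are hypothetical and are not taken from QC.R; the
tarball name is made up.

```r
# Hypothetical parser that assumes exactly one underscore in the file name.
parse_naive <- function(file) {
  base  <- sub("\\.tar\\.gz$", "", file)
  parts <- strsplit(base, "_", fixed = TRUE)[[1]]
  list(name = parts[1], version = parts[2])
}

# Hypothetical alternative: split at the LAST underscore instead.
parse_last <- function(file) {
  base <- sub("\\.tar\\.gz$", "", file)
  list(name    = sub("_[^_]*$", "", base),  # everything before the last "_"
       version = sub("^.*_", "", base))     # everything after the last "_"
}

parse_naive("primer_tree_0.1.0.tar.gz")  # name "primer", version "tree"
parse_last("primer_tree_0.1.0.tar.gz")   # name "primer_tree", version "0.1.0"
```

Splitting at the last underscore stays unambiguous because version strings
consist of integers separated by "." or "-" and so never contain an
underscore, but every place that splits on the first one would need the same
change.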

Duncan Murdoch



[Rd] Underscores in package names

2019-08-08 Thread Jim Hester
Are there technical reasons that package names cannot be snake case?
This seems to be enforced by `.standard_regexps()$valid_package_name`
which currently returns

   "[[:alpha:]][[:alnum:].]*[[:alnum:]]"

Is there any technical reason this couldn't be altered to accept `_`
as well, e.g.

  "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"

I realize that historically `_` has not always been valid in variable
names, but this has now been acceptable for 15+ years (since R 1.9.0 I
believe). Might we also allow underscores for package names?
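
For illustration, a minimal sketch comparing the two patterns. The pattern
strings are copied from above; anchoring them with ^ and $ so that the whole
name must match is an assumption about how the check is applied, and the
candidate names are examples only.

```r
# Current and proposed rules, anchored so the entire name must match.
current  <- "^[[:alpha:]][[:alnum:].]*[[:alnum:]]$"
proposed <- "^[[:alpha:]][[:alnum:]._]*[[:alnum:]]$"

# Example names; "_bad" and "bad_" probe the first and last character rules.
candidates <- c("testthat", "test_that", "primer_tree", "_bad", "bad_")

result <- data.frame(
  name     = candidates,
  current  = grepl(current, candidates),
  proposed = grepl(proposed, candidates)
)
print(result)
```

The proposed pattern accepts test_that and primer_tree while still rejecting
names that start or end with an underscore.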

Jim
