[R-SIG-Mac] System-wide site library [Was: CRAN installer for macOS - directory permissions]

2022-06-08 Thread Simon Urbanek
We could re-design the layout of the framework and site locations to be more in 
line with the modern Apple standards. Splitting the library into system (R 
itself only) and site library could make the framework more correctly self 
sufficient. The site library would then go into "/Library/Application 
Support/org.R-project.R/library" and user library in the equally named 
subdirectory of $HOME. That would directly correspond to the Apple designations 
of NSApplicationSupportDirectory in NSLocalDomainMask and NSUserDomainMask 
domains. The drawback would be that none of this is versioned any longer, so we 
probably would have to rely on different bundle IDs (e.g. to distinguish 
big-sur builds from high-sierra builds) and possibly add versioned 
subdirectories inside our realm. Also this would make it impossible to make 
self-contained R apps, because the packages are outside of the framework path 
structure, but I have not seen anyone using that feature in a long time.
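
To make the proposed layout concrete, here is a minimal shell sketch of the two locations. The function name r_app_support_lib and the optional versioned subdirectory argument are illustrative assumptions, not an agreed design; only the two base paths come from the proposal above.

```shell
# Hypothetical helper: print the proposed library directory for a domain.
#   $1 = "site" (local domain) or "user" (user domain)
#   $2 = optional versioned subdirectory, e.g. "4.2" (an assumption)
r_app_support_lib() {
  case "$1" in
    site) base="/Library/Application Support" ;;
    user) base="$HOME/Library/Application Support" ;;
    *) echo "usage: r_app_support_lib site|user [version]" >&2; return 1 ;;
  esac
  dir="$base/org.R-project.R/library"
  # Append the versioned subdirectory only when one was given.
  if [ -n "${2:-}" ]; then
    dir="$dir/$2"
  fi
  printf '%s\n' "$dir"
}

r_app_support_lib site
r_app_support_lib user 4.2
```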

Obviously, this would be a major breaking change, so it would be for R-devel, and R.app would 
have to be updated to use the corresponding paths for the package management, 
but that’s not too hard. I wouldn’t put this on top of my list given that the 
effect is mostly cosmetic, but by using the Apple API it would allow the 
framework to be used in a container if anyone cared (there are bigger issues if 
anyone wanted to create an iOS version, though ;)).

Cheers,
Simon




Re: [R-SIG-Mac] CRAN installer for macOS - directory permissions

2022-06-08 Thread Duncan Murdoch
Henrik, you posted this a couple of days ago and I didn't address the 
_R_CHECK_DEPENDS_ONLY_ point you raised.


You're right that the current implementation of _R_CHECK_DEPENDS_ONLY_ 
doesn't work if all packages are installed in one lib.  This is a flaw, 
with one fix being to never put contributed packages into the system 
lib.  (I haven't done a Linux install in a long time; don't they by 
default put recommended packages there?  They can also be Suggested 
packages, so if they're in the system lib, that's a bug.)


Another possible fix is to change how _R_CHECK_DEPENDS_ONLY_ works, so 
that it affects package loading directly, by allowing the user to 
specify a whitelist of packages (e.g. based on the dependencies in the 
DESCRIPTION file) and having the package loader refuse to load any 
package unless it's in there.  I think I like the current implementation 
better.


So I'd change my recommendation for single-user systems:  they should 
have two libs.  One contains base packages and nothing else, the other 
contains all contributed packages, including recommended ones.  Assuming 
the single user is in the admin group, they could modify the second lib, 
but only reinstalls of R would change the first one.


On a multi-user system there would typically be another lib in the user 
account.


Duncan Murdoch

On 03/06/2022 12:45 p.m., Henrik Bengtsson wrote:

I see two fairly big problems with users installing R packages to
.Library by default.  One is related to package checking and CRAN, and
one is related to translation of expectations when moving between
operating systems (as Patrick already pointed out).  At the end, I'll
also argue that R_LIBS_SITE exists for those who wish to maintain
site-wide R package libraries to be shared among users, which is
better than using .Library for this.

# R CMD check

When you check a package with 'R CMD check --as-cran', or, with
environment variable `_R_CHECK_DEPENDS_ONLY_` set to true, the checks
are run in a sandbox where only declared package dependencies and any
packages in the system package library (= .Library) are on the library
path (= .libPaths()), e.g.

print(.libPaths())
[1] "/tmp/alice/RtmpYDq3KF/RLIBS_2410b74eb16752"
[2] "/path/to/R-4.2.0/lib/R/library"

What's in the user's library (= R_LIBS_USER) or in the site library (=
.Library.site/R_LIBS_SITE) is not part of the testing.  This mechanism
is very valuable since it helps to identify undeclared package
dependencies.

**The default behavior on macOS discussed here, where R packages are
installed to .Library, breaks this.**  Developers with non-base R
packages in .Library will not benefit from the 'R CMD check --as-cran'
checks for undeclared packages. This increases the risk of them not
being aware of the problem of undeclared packages, which is a
discussion we see from time to time on R-devel and R-pkg-devel, e.g.
when it comes to what should be listed under Suggests: or not.
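
A rough way to see whether a library mixes base and non-base packages is to look at the Priority field of each installed package's DESCRIPTION file, since base packages carry "Priority: base". The sketch below is only a heuristic under that assumption; the function name list_non_base_pkgs is hypothetical, and the library directory is passed in rather than discovered.

```shell
# Hypothetical helper: list packages in a library directory whose
# DESCRIPTION does not declare "Priority: base".
list_non_base_pkgs() {
  libdir="$1"
  for desc in "$libdir"/*/DESCRIPTION; do
    # Skip the unexpanded glob when the directory has no packages.
    [ -f "$desc" ] || continue
    if ! grep -q '^Priority:[[:space:]]*base[[:space:]]*$' "$desc"; then
      basename "$(dirname "$desc")"
    fi
  done
}
```

For example, list_non_base_pkgs "/path/to/R-4.2.0/lib/R/library" would print every recommended or contributed package found there. Any name it prints is a package that the check sandbox would still see even if the package under test never declared it.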

BTW, this makes me wonder how many macOS developers notice this
problem only as they submit to CRAN, and have to resubmit. Also, this
issue might add extra work to the CRAN Team, e.g. spending time
looking at and responding to possible false positives, handling more
emails, and handling more re-submissions.


# Social expectations

The second problem with the current default macOS behavior is when
people hop between systems and operating systems.  Particularly, a
macOS user coming to Unix or Windows does not immediately understand
how and where R packages are installed.  They get a prompt about a
"personal library" and might choose to decline because it's not what
they're used to seeing.  Then they might end up in the Stack Overflow
cut'n'paste rabbit hole, where they find some instructions on setting
'R_LIBS_USER=$HOME/R-lib' without version specifiers.  Works fine
until they upgrade R next year, when they start getting weird warnings
or errors of some packages not working that they slowly start to
accept as the normal behavior of R. I see this problem on large HPC
environments where I help out thousands of HPC users. Also, reading
various support forums out there, I think this is a real problem. It's
only recently, thanks to Patrick, I learned about this rather odd
macOS behavior, and I do think it is a cause for confusion and
miscommunication.  Another problem with different OS behaviors is that
it complicates documentation and instructions.  I strongly believe it
would be beneficial to the R community if we all have the same
experience and expectations regardless of OS.
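
For contrast with the version-less 'R_LIBS_USER=$HOME/R-lib' setting described above, here is a minimal sketch of a versioned one. It relies on the %v specifier, which R expands to the version without the patchlevel (e.g. 4.2) at startup. The helper name set_versioned_user_lib and the R-lib directory name are illustrative assumptions.

```shell
# Hypothetical helper: append a versioned R_LIBS_USER line to an Renviron
# file, unless that file already sets R_LIBS_USER.
set_versioned_user_lib() {
  renviron="$1"
  # Single quotes keep ${HOME} and %v literal so that R, not the shell,
  # expands them when it reads the file.
  line='R_LIBS_USER=${HOME}/R-lib/%v'
  if [ -f "$renviron" ] && grep -q '^R_LIBS_USER=' "$renviron"; then
    echo "R_LIBS_USER already set in $renviron; leaving it alone" >&2
    return 0
  fi
  printf '%s\n' "$line" >> "$renviron"
}
```

Calling set_versioned_user_lib "$HOME/.Renviron" would then give each R x.y series its own library. The directory still has to exist before R will put it on the library path.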

I believe the above problems are best addressed by changing the
*default* settings on macOS so that it is *not* possible to install to
.Library, and instead require a user to install to their personal
package library.  Advanced users who prefer to install to .Library,
can still configure R, or .Library, to do so.
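
In file-permission terms, one way to get this default is to clear the group write bit on the library directory (mode 755 rather than 775). The sketch below shows the difference on a scratch directory instead of a real .Library; the helper name perm_string is an assumption.

```shell
# Print the permission string of a directory (first 10 characters of `ls -ld`).
perm_string() {
  ls -ld "$1" | cut -c1-10
}

scratch="$(mktemp -d)"
chmod 775 "$scratch"
perm_string "$scratch"   # drwxrwxr-x: group members can install here
chmod 755 "$scratch"
perm_string "$scratch"   # drwxr-xr-x: only the owner can write
rmdir "$scratch"
```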

As Patrick suggests, defaulting .Library to 755, instead of 775, or
not setting the "admin" group, seems like a simple solution that
wou