On 10/29/2014 01:07 PM, Vincent Carey wrote:
On Wed, Oct 29, 2014 at 2:15 PM, Hervé Pagès wrote:
Hi,
On 10/28/2014 08:51 PM, Vincent Carey wrote:
On Tue, Oct 28, 2014 at 5:48 PM, Hervé Pagès wrote:
On 10/28/2014 12:42 PM, Vincent Carey wrote:
On Tue, Oct 28, 2014 at 2:29 PM, Hervé Pagès wrote:
Hi,
On 10/28/2014 08:48 AM, Vincent Carey wrote:
On Tue, Oct 28, 2014 at 11:23 AM, Kasper Daniel Hansen
<kasperdanielhansen@gmail.com> wrote:
Well, first I want to make sure that there is not something special
regarding S4 methods and classes. I have a feeling that they are a
special case.
Second, while I agree with Jim's general opinion, it is a little bit
different when I have return objects which are defined in other
packages. If I don't depend on this other package, the user is hosed
wrt. the return object, unless I manually export all classes from this
other
In what sense? If you return an instance of GRanges, certain things can
be done even if GenomicRanges is not attached.
Yes certain things maybe, but it's hard to predict which ones.
You can get values of slots, for example.
With the following little package
%vjcair> cat foo/NAMESPACE
importFrom(IRanges, IRanges)
importClassesFrom(GenomicRanges, GRanges)
importFrom(GenomicRanges, GRanges)
export(myfun)
%vjcair> cat foo/DESCRIPTION
Package: foo
Title: foo
Version: 0.0.0
Author: VJ Carey <stvjc@channing.harvard.edu>
Description:
Suggests:
Depends:
Imports: GenomicRanges
Maintainer: VJ Carey <stvjc@channing.harvard.edu>
License: Private
LazyLoad: yes
%vjcair> cat foo/R/*
myfun = function(seqnames="1", ranges=IRanges(1,2), ...)
  GRanges(seqnames=seqnames, ranges=ranges, ...)
The following works:
> library(foo)
> x = myfun()
> x
GRanges object with 1 range and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]        1    [1, 2]      *
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
So the show method works, even though I have not touched it. (I did not
expect it to work, in fact.)
Exactly. Let's call it luck ;-)
Additionally, I can get access to slots.
The end user should never try to access slots directly but use getters
and setters instead. And most getters and setters for GRanges objects
are defined and documented in the GenomicRanges package. Those that are
not are defined in packages that GenomicRanges depends on.
But ranges() fails. If I, the user, want to use it, I need to arrange
for that.
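A minimal sketch of the loaded-versus-attached distinction at issue here. It uses stats4 (which ships with R) purely as a stand-in for a package that is imported but not attached; that substitution is an assumption made so the snippet runs without Bioconductor installed.

```r
# Illustration only: stats4 stands in for a package that another package
# imports, so its namespace is loaded but never attached via library().
loadNamespace("stats4")

is_loaded   <- "stats4" %in% loadedNamespaces()  # namespace is in memory
is_attached <- "package:stats4" %in% search()    # exports visible at the prompt?

# Exported functions stay reachable through `::` without attaching anything.
mle_fun <- stats4::mle

cat("loaded:", is_loaded, " attached:", is_attached, "\n")
```

In a fresh session the second flag is normally FALSE. The user then has the same two options as for ranges(): attach the package with library(), or call the function with the pkg::fun form.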
IMO if your package returns a GRanges object to the user, then the user
should be able to access the man page for GRanges objects with ?GRanges.
Oddly enough, that seems to be incorrect. I added a man page to foo that
has a \link[GenomicRanges]{GRanges-class}. I ran help.start and the
cross reference from my man page succeeds. Furthermore with the
sessionInfo below, ?GRanges succeeds at the CLI.
Did you try to run example(GRanges)? I'm not sure that will work.
Correct. Cursory look at source shows that help() uses
loadedNamespaces() to find the help file. example() could probably do
likewise.
Sounds reasonable. So it seems that some recent changes in R make
it possible to access the man page and examples for stuff that
is imported but not attached. This is an important shift in paradigm
to me. In the past I would just rely on the simple notion that
what I can access with ? or example() reflects what's in my
search path. Now if I do ?DNAStringSet and it succeeds, I can't
assume DNAStringSet() is in my search path anymore. And if I
want to copy/paste a few commands from the examples in order to
try them in my session, they might fail because the package where
these examples belong is not necessarily attached.
I wonder whether that means we should now start every example
section with library(foo)? The rationale for not doing it so far
was that if you can access the man page with ? then that means
the package is already attached.
I think that would be excessive. You are correct that some code will
not run, and the user will have to decide what to do. We have access to
core members. example() could be tuned to check for attachment of the
package hosting the page and fail if the host package is not attached,
with a hint as to how to proceed. For cutting and pasting, caveat emptor.
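The idea of having example() check for attachment and fail with a hint could look roughly like the sketch below. The wrapper name example_if_attached and the placeholder package name are hypothetical; this is not how example() itself behaves, only one way a user-level check might be written.

```r
# Hypothetical helper (not part of R): refuse to run examples unless the
# package hosting the help page is attached, and say how to proceed.
example_if_attached <- function(topic, package) {
  if (!(paste0("package:", package) %in% search())) {
    stop(sprintf("package '%s' is not attached; try library(%s) first",
                 package, package), call. = FALSE)
  }
  utils::example(topic, package = package, character.only = TRUE)
}

# A package that is not attached triggers the hint instead of running code.
# "notAttachedPkg" is a made-up name used only to exercise the error path.
msg <- tryCatch(example_if_attached("anything", "notAttachedPkg"),
                error = function(e) conditionMessage(e))
cat(msg, "\n")
```

When the package is attached, the call simply falls through to utils::example(), so pasted example code would find its functions on the search path.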
As a side note the decision to extend the scope of ? to loaded
packages and not to all installed packages feels arbitrary to me.
Going all the way would make ? even more useful and would be
consistent with what I see when navigating the documentation in
a browser. So when the user wants to call DNAStringSet() but
doesn't remember where it lives, ?DNAStringSet would be a quick
and easy way to find out, whether or not the package is loaded via
a namespace.
I think this is a reasonable objective.
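For what it's worth, help() already has a try.all.packages argument that goes some way in this direction. A small sketch, using the topic "mle" from stats4 only as a convenient example of something that is installed with R but usually not attached:

```r
# help() searches loaded packages first; with try.all.packages = TRUE it falls
# back to every installed package when the topic is not found there.
h <- help("mle", try.all.packages = TRUE)

# The result is an object of class "help_files_with_topic". Printing it either
# displays the page or reports the packages in which the topic was found.
class(h)
```

In the fallback case help() only reports where the topic lives rather than displaying the page, so it answers the "where does it live" question without attaching anything.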
Anyway, to get back to the original topic, IMO this change in R
still doesn't justify changing the Depends vs Imports game. I see
at least 3 strong cases for using 'Depends: A' instead of 'Imports: A'
in package B:
(1) B defines (and exports) a class that extends a class defined in A.
In my view there is a risk of needless namespace pollution in this case.
Depends seems extreme, other things being equal. Better to let the user
determine in real time whether this should occur. It seems to me that
particularly when packages have lots of complicated interrelationships,
it is best to have the developers manage symbols internally to the code,
reducing as much as possible the impact on the user's environment.
Minimizing the use of Depends seems consistent with this.
(2) B defines (and exports) methods for a generic defined in A.
(3) B defines (and exports) functions or methods that return
objects of a class defined in package A.
'Imports: A' should be reserved to situations where A is used
internally by B and in a way that is B's internal business only
and none of the end-user's business. A typical example is the
internal use of RSQLite and biomaRt in GenomicFeatures.
I'm sympathetic to this view but would rather be out of the business of
figuring out what the end-user's business is apart from using and
getting value from the functions defined in the package that I contributed.
Leaving the attachments up to the user is one way.
I can see the attractiveness of trying to minimize what gets attached
to the user's session but I'm also concerned that trying to go too far
in that direction ultimately has no real benefit and can hurt the
user-friendliness of the software.
We should try to assemble data on this concern. I don't know how to do it.
Well, user-friendliness is hard to measure because it can be very
subjective. Personally I don't feel that my package B is the most
user-friendly if my functions return objects of a class defined
in package A and if A is not in Depends. If I know in advance that
my users will almost always need to attach A before they can do
anything with these objects, then I'd rather do that for them.
Note that it's different from trying to anticipate any possible
use of these objects by my users, which I agree is a business
I'd rather stay out of.
H.
H.
For example after I do library(rtracklayer), I can indeed do
?DNAStringSet at the command line (I'm surprised this works), but
then example(DNAStringSet) fails:
> example(DNAStringSet)
Warning message:
In example(DNAStringSet) : no help found for ‘DNAStringSet’
I'm also surprised this is just a warning but that's another story...
H.
I am not trying to defend the NOTE but the principle of minimizing
Depends declarations needs to be considered critically, and I am just
exploring the space.
> ?GRanges # it worked as usual in the tty
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats     graphics  grDevices datasets  utils     tools     methods
[8] base
other attached packages:
[1] foo_0.0.0            rmarkdown_0.3.8      knitr_1.6
[4] weaver_1.31.0        codetools_0.2-9      digest_0.6.4
[7] BiocInstaller_1.16.0
loaded via a namespace (and not attached):
 [1] BiocGenerics_0.11.5   evaluate_0.5.5        formatR_1.0
 [4] GenomeInfoDb_1.1.26   GenomicRanges_1.17.48 htmltools_0.2.6
 [7] IRanges_1.99.32       parallel_3.1.1        S4Vectors_0.2.8
[10] stats4_3.1.1          stringr_0.6.2         XVector_0.5.8
And that works only if the GenomicRanges package is attached. Attaching
GenomicRanges will also attach other packages that GenomicRanges depends
on where some GRanges accessors might be defined and documented (e.g.
metadata()).
In some cases you'll decide you want the user to have a full complement
of methods for your package to function meaningfully. For example, I am
considering using dplyr idioms to work with data structures in a
package, and it seems I should just depend on dplyr rather than pick out
and document which things I want to expose. But that may still be an
undesirable design.
package, like
importClassesFrom("GenomicRanges", "GRanges")
exportClasses("GRanges")
Surely that is not intended.
It is important that my package works without being attached to the
search path and I do this by carefully importing what I need, ie. my
code does not require that my dependencies are attached to the search
path. But the end user will be hosed without it.
Yes s/he will. Fortunately when your package namespace gets loaded by
another package, then nothing gets attached to the search path, even if
your package depends (instead of imports) on other packages. So using
Depends instead of Imports for your own dependencies won't make any
difference in that respect, which is good.
My impression is that the NOTE in R CMD check was written by someone who
did not anticipate large-scale use and re-use of classes and methods
across many packages.
That's my impression too.
Cheers,
H.
Best,
Kasper
On Tue, Oct 28, 2014 at 11:14 AM, James W. MacDonald wrote:
I agree with Vince. It's your job as a package developer to make
available to your package all the functions necessary for the package to
work. But I am not sure it is your job to load all the packages that
your end user might need.
Best,
Jim
On Tue, Oct 28, 2014 at 11:04 AM, Vincent Carey
<stvjc@channing.harvard.edu> wrote:
On Tue, Oct 28, 2014 at 10:19 AM, Kasper Daniel Hansen
<kasperdanielhansen@gmail.com> wrote:
What is the current best paradigm for using all the classes in
S4Vectors/GenomeInfoDb/GenomicRanges/IRanges?
I obviously import methods and classes from the relevant packages.
But shouldn't I depend on these packages as well, since I basically want
the user to have this functionality at the command line? That is what I
do now.
I've wondered about this as well. It seems the principle is that the
user should take care of attaching additional packages when needed. It
might be appropriate to give a hint in the package startup message, if
having some other package attached would typically be of great utility.
Given your list above, I would think that depending on GenomicRanges
would often be sufficient, and IRanges/S4Vectors would not require
dependency assertion. I would think that GenomeInfoDb should be a
voluntary attachment for a specific session.
These are just my guesses -- I doubt there will be complete consensus,
but I have started to think very critically about using Depends, and I
think it is better when its use is minimized.
That of course leads to the R CMD check NOTE on depending on too many
packages.... I guess I should ignore that one.
Best,
Kasper
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health
Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: [email protected]
Phone: (206) 667-5791
Fax: (206) 667-1319
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel