Hi,

On 10/28/2014 08:51 PM, Vincent Carey wrote:


On Tue, Oct 28, 2014 at 5:48 PM, Hervé Pagès wrote:



    On 10/28/2014 12:42 PM, Vincent Carey wrote:



        On Tue, Oct 28, 2014 at 2:29 PM, Hervé Pagès wrote:

             Hi,

             On 10/28/2014 08:48 AM, Vincent Carey wrote:

                 On Tue, Oct 28, 2014 at 11:23 AM, Kasper Daniel Hansen
                 <kasperdanielhansen@gmail.com> wrote:

                     Well, first I want to make sure that there is not
                     something special regarding S4 methods and classes. I
                     have a feeling that they are a special case.

                     Second, while I agree with Jim's general opinion, it is
                     a little bit different when I have return objects which
                     are defined in other packages. If I don't depend on this
                     other package, the user is hosed wrt. the return object,
                     unless I manually export all classes from this other


                 In what sense?  If you return an instance of GRanges,
                 certain things can be done even if GenomicRanges is not
                 attached.


             Yes certain things maybe, but it's hard to predict which ones.

                 You can get values of slots, for example.

                 With the following little package

                 %vjcair> cat foo/NAMESPACE

                 importFrom(IRanges, IRanges)

                 importClassesFrom(GenomicRanges, GRanges)

                 importFrom(GenomicRanges, GRanges)

                 export(myfun)



                 %vjcair> cat foo/DESCRIPTION

                 Package: foo

                 Title: foo

                 Version: 0.0.0

                 Author: VJ Carey <stvjc@channing.harvard.edu>

                 Description:

                 Suggests:

                 Depends:

                 Imports: GenomicRanges

                 Maintainer: VJ Carey <stvjc@channing.harvard.edu>


                 License: Private

                 LazyLoad: yes



                 %vjcair> cat foo/R/*

                 myfun = function(seqnames="1", ranges=IRanges(1,2), ...)

                      GRanges(seqnames=seqnames, ranges=ranges, ...)


                 The following works:


                     library(foo)


                     x = myfun()


                     x


                 GRanges object with 1 range and 0 metadata columns:

                         seqnames    ranges strand

                            <Rle> <IRanges>  <Rle>

                     [1]        1    [1, 2]      *

                     -------

                     seqinfo: 1 sequence from an unspecified genome; no seqlengths


                 So the show method works, even though I have not touched
                 it.  (I did not expect it to work, in fact.)


             Exactly. Let's call it luck ;-)

                   Additionally, I can get access to slots.


             The end user should never try to access slots directly but use
             getters and setters instead. And most getters and setters for
             GRanges objects are defined and documented in the GenomicRanges
             package. Those that are not are defined in packages that
             GenomicRanges depends on.

                 But ranges() fails.  If I, the user, want to use it, I need
                 to arrange for that.


             IMO if your package returns a GRanges object to the user, then
             the user should be able to access the man page for GRanges
             objects with ?GRanges.


        Oddly enough, that seems to be incorrect.  I added a man page to foo
        that has a \link[GenomicRanges]{GRanges-class}.  I ran help.start and
        the cross reference from my man page succeeds.  Furthermore with the
        sessionInfo below, ?GRanges succeeds at the CLI.


    Did you try to run example(GRanges)? I'm not sure that will work.


Correct.  Cursory look at source shows that help() uses loadedNamespaces()
to find the help file.  example() could probably do likewise.

Sounds reasonable. So it seems that some recent changes in R make
it possible to access the man page and examples for stuff that
is imported but not attached. This is an important shift in paradigm
to me. In the past I would just rely on the simple notion that
what I can access with ? or example() reflects what's in my
search path. Now if I do ?DNAStringSet and it succeeds, I can't
assume DNAStringSet() is in my search path anymore. And if I
want to copy/paste a few commands from the examples in order to
try them in my session, they might fail because the package where
these examples belong is not necessarily attached.
I wonder whether that means we should now start every example
section with library(foo)? The rationale for not doing it so far
was that if you can access the man page with ? then that means
the package is already attached.
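
To make the loaded-versus-attached distinction concrete, here is a minimal
sketch. It assumes nothing beyond a stock R installation and uses stats4
(which ships with R) purely as a stand-in for a package like Biostrings;
the variable names are only for illustration.

```r
# Load a namespace without attaching it to the search path.
# stats4 is a stand-in here; any installed package would do.
loadNamespace("stats4")

is_loaded   <- "stats4" %in% loadedNamespaces()
is_attached <- "package:stats4" %in% search()

# is_loaded is TRUE at this point; is_attached stays FALSE unless
# library(stats4) was called earlier in the session.
print(c(loaded = is_loaded, attached = is_attached))

# An unqualified call to mle() relies on the package being attached,
# whereas the qualified form only needs the namespace to be loaded.
f <- stats4::mle
```

So a successful ? lookup says something about loadedNamespaces(), while
copy/pasting unqualified calls from an example depends on search().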

As a side note the decision to extend the scope of ? to loaded
packages and not to all installed packages feels arbitrary to me.
Going all the way would make ? even more useful and would be
consistent with what I see when navigating the documentation in
a browser. So when the user wants to call DNAStringSet() but
doesn't remember where it lives, ?DNAStringSet would be a quick
and easy way to know, and this whether the package is loaded via
a namespace or not.
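
For what it's worth, help() already has an opt-in argument that goes in
that direction, try.all.packages. A small sketch, again using the "mle"
topic from stats4 as an arbitrary stand-in; the way the package name is
pulled out of the returned paths is my assumption about their layout
(<library>/<package>/help/<topic>), not something documented as an API.

```r
# Ask help() to fall back on every installed package when the topic is
# not found among the loaded ones. "mle" is only a stand-in topic.
hits <- help("mle", try.all.packages = TRUE)

# The result is a character vector of help file paths with a class
# attribute; the package name is assumed to sit two directory levels
# above each file.
pkgs <- basename(dirname(dirname(as.character(hits))))
print(pkgs)

# A fuzzy search across installed packages is also available
# (equivalent to ??mle at the prompt):
# help.search("mle")
```

Setting options(help.try.all.packages = TRUE) would make this the default
for ? in a session, if I read the help() signature correctly.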

Anyway, to get back to the original topic, IMO this change in R
still doesn't justify changing the Depends vs Imports game. I see
at least 3 strong cases for using 'Depends: A' instead of 'Imports: A'
in package B:
  (1) B defines (and exports) a class that extends a class defined in A.
  (2) B defines (and exports) methods for a generic defined in A.
  (3) B defines (and exports) functions or methods that return
      objects of a class defined in package A.
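
As a rough illustration of case (3), here is what the NAMESPACE of such a
package B could look like, modeled on the foo package quoted further down.
B and myfun are hypothetical names. The only difference from foo is on the
DESCRIPTION side, where GenomicRanges would be listed under Depends instead
of Imports; the NAMESPACE imports are still needed so that B's own code
does not rely on the search path.

```r
# NAMESPACE of a hypothetical package B whose DESCRIPTION has
# "Depends: GenomicRanges" (a sketch, not a complete file).
importFrom(IRanges, IRanges)
importClassesFrom(GenomicRanges, GRanges)
importFrom(GenomicRanges, GRanges)

# myfun() returns a GRanges object to the end user, which is why
# GenomicRanges is attached via Depends in this variant.
export(myfun)
```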

'Imports: A' should be reserved to situations where A is used
internally by B and in a way that is B's internal business only
and none of the end-user's business. A typical example is the
internal use of RSQLite and biomaRt in GenomicFeatures.

I can see the attractiveness of trying to minimize what gets attached
to the user's session but I'm also concerned that trying to go too far
in that direction ultimately has no real benefit and can hurt the
user-friendliness of the software.

H.



    For example after I do library(rtracklayer), I can indeed do
    ?DNAStringSet at the command line (I'm surprised this works), but
    then example(DNAStringSet) fails:

       > example(DNAStringSet)
       Warning message:
       In example(DNAStringSet) : no help found for ‘DNAStringSet’

    I'm also surprised this is just a warning but that's another story...

    H.

        I am not trying to defend the NOTE but the principle of minimizing
        Depends declarations needs to be considered critically, and I am
        just exploring the space.

          > ?GRanges  # it worked as usual in the tty

          > sessionInfo()

        R version 3.1.1 (2014-07-10)

        Platform: x86_64-apple-darwin13.1.0 (64-bit)


        locale:

        [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8


        attached base packages:

        [1] stats     graphics  grDevices datasets  utils     tools     methods

        [8] base


        other attached packages:

        [1] foo_0.0.0            rmarkdown_0.3.8      knitr_1.6

        [4] weaver_1.31.0        codetools_0.2-9      digest_0.6.4

        [7] BiocInstaller_1.16.0


        loaded via a namespace (and not attached):

           [1] BiocGenerics_0.11.5   evaluate_0.5.5        formatR_1.0

           [4] GenomeInfoDb_1.1.26   GenomicRanges_1.17.48 htmltools_0.2.6

           [7] IRanges_1.99.32       parallel_3.1.1        S4Vectors_0.2.8

        [10] stats4_3.1.1          stringr_0.6.2         XVector_0.5.8

             And that works only if the GenomicRanges package is attached.
             Attaching GenomicRanges will also attach other packages that
             GenomicRanges depends on where some GRanges accessors might be
             defined and documented (e.g. metadata()).



                 In some cases you'll decide you want the user to have a
                 full complement of methods for your package to function
                 meaningfully.  For example, I am considering using dplyr
                 idioms to work with data structures in a package, and it
                 seems I should just depend on dplyr rather than pick out
                 and document which things I want to expose.  But that may
                 still be an undesirable design.


                     package, like
                         importClassesFrom("GenomicRanges", "GRanges")

                         exportClasses("GRanges")
                     Surely that is not intended.

                     It is important that my package works without being
                     attached to the search path and I do this by carefully
                     importing what I need, ie. my code does not require that
                     my dependencies are attached to the search path.  But
                     the end user will be hosed without it.


             Yes s/he will. Fortunately when your package namespace gets
             loaded by another package, then nothing gets attached to the
             search path, even if your package depends (instead of imports)
             on other packages. So using Depends instead of Imports for your
             own dependencies won't make any difference in that respect,
             which is good.


                     My impression is that the NOTE in R CMD check was
                     written by someone who did not anticipate large-scale
                     use and re-use of classes and methods across many
                     packages.


             That's my impression too.

             Cheers,
             H.


                     Best,
                     Kasper


                     On Tue, Oct 28, 2014 at 11:14 AM, James W. MacDonald
                     wrote:

                         I agree with Vince. It's your job as a package
                         developer to make available to your package all the
                         functions necessary for the package to work. But I
                         am not sure it is your job to load all the packages
                         that your end user might need.

                         Best,

                         Jim



                         On Tue, Oct 28, 2014 at 11:04 AM, Vincent Carey
                         <stvjc@channing.harvard.edu> wrote:

                             On Tue, Oct 28, 2014 at 10:19 AM, Kasper Daniel
                             Hansen <kasperdanielhansen@gmail.com> wrote:

                                 What is the current best paradigm for using
                                 all the classes in
                                 S4Vectors/GenomeInfoDb/GenomicRanges/IRanges?

                                 I obviously import methods and classes from
                                 the relevant packages.

                                 But shouldn't I depend on these packages as
                                 well?  Since I basically want the user to
                                 have this functionality at the command line?
                                 That is what I do now.


                             I've wondered about this as well.  It seems the
                             principle is that the user should take care of
                             attaching additional packages when needed.  It
                             might be appropriate to give a hint in the
                             package startup message, if having some other
                             package attached would typically be of great
                             utility.

                             Given your list above, I would think that
                             depending on GenomicRanges would often be
                             sufficient, and IRanges/S4Vectors would not
                             require dependency assertion.  I would think
                             that GenomeInfoDb should be a voluntary
                             attachment for a specific session.

                             These are just my guesses -- I doubt there will
                             be complete consensus, but I have started to
                             think very critically about using Depends, and
                             I think it is better when its use is minimized.


                                 That of course leads to the R CMD check NOTE
                                 on depending on too many packages.... I
                                 guess I should ignore that one.

                                 Best,
                                 Kasper

                                           [[alternative HTML version deleted]]

        _______________________________________________
        Bioc-devel@r-project.org mailing list
        https://stat.ethz.ch/mailman/listinfo/bioc-devel






                         --
                         James W. MacDonald, M.S.
                         Biostatistician
                         University of Washington
                         Environmental and Occupational Health Sciences
                         4225 Roosevelt Way NE, # 100
                         Seattle WA 98105-6099






             --
             Hervé Pagès

             Program in Computational Biology
             Division of Public Health Sciences
             Fred Hutchinson Cancer Research Center
             1100 Fairview Ave. N, M1-B514
             P.O. Box 19024
             Seattle, WA 98109-1024

             E-mail: [email protected]
             Phone: (206) 667-5791
             Fax: (206) 667-1319






--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [email protected]
Phone:  (206) 667-5791
Fax:    (206) 667-1319

_______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
