On 10/29/2014 01:07 PM, Vincent Carey wrote:


On Wed, Oct 29, 2014 at 2:15 PM, Hervé Pagès <[email protected]> wrote:

    Hi,

    On 10/28/2014 08:51 PM, Vincent Carey wrote:



        On Tue, Oct 28, 2014 at 5:48 PM, Hervé Pagès
        <[email protected]> wrote:



             On 10/28/2014 12:42 PM, Vincent Carey wrote:



                 On Tue, Oct 28, 2014 at 2:29 PM, Hervé Pagès
                 <[email protected]> wrote:

                      Hi,

                      On 10/28/2014 08:48 AM, Vincent Carey wrote:

                          On Tue, Oct 28, 2014 at 11:23 AM, Kasper Daniel
                          Hansen <[email protected]> wrote:

                              Well, first I want to make sure that there is not
                              something special regarding S4 methods and
                              classes. I have a feeling that they are a special
                              case.

                              Second, while I agree with Jim's general opinion,
                              it is a little bit different when I have return
                              objects which are defined in other packages. If I
                              don't depend on this other package, the user is
                              hosed wrt. the return object, unless I manually
                              export all classes from this other


                          In what sense?  If you return an instance of GRanges,
                          certain things can be done even if GenomicRanges is
                          not attached.


                      Yes certain things maybe, but it's hard to predict
                      which ones.

                          You can get values of slots, for example.

                          With the following little package

                          %vjcair> cat foo/NAMESPACE

                          importFrom(IRanges, IRanges)

                          importClassesFrom(GenomicRanges, GRanges)

                          importFrom(GenomicRanges, GRanges)

                          export(myfun)



                          %vjcair> cat foo/DESCRIPTION

                          Package: foo

                          Title: foo

                          Version: 0.0.0

                          Author: VJ Carey <[email protected]>

                          Description:

                          Suggests:

                          Depends:

                          Imports: GenomicRanges

                          Maintainer: VJ Carey <[email protected]>


                          License: Private

                          LazyLoad: yes



                          %vjcair> cat foo/R/*

                          myfun = function(seqnames="1", ranges=IRanges(1,2), ...)

                               GRanges(seqnames=seqnames, ranges=ranges, ...)


                          The following works:


                              library(foo)


                              x = myfun()


                              x


                          GRanges object with 1 range and 0 metadata columns:

                                  seqnames    ranges strand

                                     <Rle> <IRanges>  <Rle>

                              [1]        1    [1, 2]      *

                              -------

                              seqinfo: 1 sequence from an unspecified genome; no seqlengths


                          So the show method works, even though I have not
                          touched it.  (I did not expect it to work, in fact.)


                      Exactly. Let's call it luck ;-)

                            Additionally, I can get access to slots.


                      The end user should never try to access slots directly
                      but use getters and setters instead. And most getters
                      and setters for GRanges objects are defined and
                      documented in the GenomicRanges package. Those that are
                      not are defined in packages that GenomicRanges depends on.

                          But ranges() fails.  If I, the user, want to use it,
                          I need to arrange for that.


                      IMO if your package returns a GRanges object to the
                      user, then the user should be able to access the man
                      page for GRanges objects with ?GRanges.


                 Oddly enough, that seems to be incorrect.  I added a man page
                 to foo that has a \link[GenomicRanges]{GRanges-class}.  I ran
                 help.start and the cross reference from my man page succeeds.
                 Furthermore with the sessionInfo below, ?GRanges succeeds at
                 the CLI.


             Did you try to run example(GRanges)? I'm not sure that will work.


        Correct.  Cursory look at source shows that help() uses
        loadedNamespaces() to find the help file.  example() could probably
        do likewise.
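
The loaded-versus-attached distinction discussed here can be checked directly in a session. The following is a minimal sketch, not taken from the thread: it assumes a fresh R session and uses stats4 (which ships with R and is normally not attached) as a stand-in for a package that is imported but not attached, such as GenomicRanges in the foo example.

```r
# stats4 stands in for a package that was pulled in via Imports:
# its namespace gets loaded, but it is not put on the search path.
loadNamespace("stats4")

# A loaded namespace shows up in loadedNamespaces() ...
is_loaded <- "stats4" %in% loadedNamespaces()

# ... but only attached packages appear in search().
is_attached <- "package:stats4" %in% search()

# An unqualified lookup walks the search path, so it may not find mle()
# when stats4 is merely loaded.
found_unqualified <- exists("mle", mode = "function")

# A qualified reference with :: does not depend on attachment.
found_qualified <- is.function(stats4::mle)

print(c(loaded = is_loaded, attached = is_attached,
        unqualified = found_unqualified, qualified = found_qualified))
```

If help lookup consults loadedNamespaces() as described above, a topic can be found in this state even though an unqualified call to the same function fails; qualifying with :: or calling library() resolves it.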


    Sounds reasonable. So it seems that some recent changes in R make
    it possible to access the man page and examples for stuff that
    is imported but not attached. This is an important shift in paradigm
    to me. In the past I would just rely on the simple notion that
    what I can access with ? or example() reflects what's in my
    search path. Now if I do ?DNAStringSet and it succeeds, I can't
    assume DNAStringSet() is in my search path anymore. And if I
    want to copy/paste a few commands from the examples in order to
    try them in my session, they might fail because the package where
    these examples belong is not necessarily attached.
    I wonder whether that means we should now start every example
    section with library(foo)? The rationale for not doing it so far


I think that would be excessive.  You are correct that some code will
not run, and the user will have to decide what to do.  We have access to
core members.  example() could be tuned to check for attachment of the
package hosting the page and fail if the host package is not attached, with
a hint as to how to proceed.  For cutting and pasting, caveat emptor.

    was that if you can access the man page with ? then that means
    the package is already attached.

    As a side note the decision to extend the scope of ? to attached
    packages and not to all installed packages feels arbitrary to me.
    Going all the way would make ? even more useful and would be
    consistent with what I see when navigating the documentation in
    a browser. So when the user wants to call DNAStringSet() but
    doesn't remember where it lives, ?DNAStringSet would be a quick
    and easy way to know, and this whether the package is loaded via
    a namespace or not.


I think this is a reasonable objective.


    Anyway, to get back to the original topic, IMO this change in R
    still doesn't justify changing the Depends vs Imports game. I see
    at least 3 strong cases for using 'Depends: A' instead of 'Imports: A'
    in package B:
       (1) B defines (and exports) a class that extends a class defined in A.


In my view there is a risk of needless namespace pollution in this case.
Depends seems extreme, other things being equal.  Better to let the user
determine in real time whether this should occur.  It seems to me that,
particularly when packages have lots of complicated interrelationships, it
is best to have the developers manage symbols internally to the code,
reducing as much as possible the impact on the user environment.
Minimizing the use of Depends seems consistent with this.

       (2) B defines (and exports) methods for a generic defined in A.
       (3) B defines (and exports) functions or methods that return
           objects of a class defined in package A.

    'Imports: A' should be reserved to situations where A is used
    internally by B and in a way that is B's internal business only
    and none of the end-user's business. A typical example is the
    internal use of RSQLite and biomaRt in GenomicFeatures.


I'm sympathetic to this view but would rather be out of the business of
figuring out what the end-user's business is apart from using and
getting value from the functions defined in the package that I contributed.
Leaving the attachments up to the user is one way.


    I can see the attractiveness of trying to minimize what gets attached
    to the user's session but I'm also concerned that trying to go too far
    in that direction ultimately has no real benefit and can hurt the
    user-friendliness of the software.


We should try to assemble data on this concern.  I don't know how to do it.

Well user-friendliness is hard to measure because it can be very
subjective. Personally I don't feel that my package B is the most
user-friendly if my functions return objects of a class defined
in package A and if A is not in Depends. If I know in advance that
my users will almost always need to attach A before they can do
anything with these objects, then I'd rather do that for them.

Note that it's different from trying to anticipate any possible
use of these objects by my users, which I agree is a business
I'd rather stay out of.

H.



    H.



             For example after I do library(rtracklayer), I can indeed do
             ?DNAStringSet at the command line (I'm surprised this works), but
             then example(DNAStringSet) fails:

                > example(DNAStringSet)
                Warning message:
                In example(DNAStringSet) : no help found for ‘DNAStringSet’

             I'm also surprised this is just a warning but that's another
             story...

             H.

                 I am not trying to defend the NOTE but the principle of
                 minimizing Depends declarations needs to be considered
                 critically, and I am just exploring the space.

                   > ?GRanges  # it worked as usual in the tty

                   > sessionInfo()

                 R version 3.1.1 (2014-07-10)

                 Platform: x86_64-apple-darwin13.1.0 (64-bit)


                 locale:

                 [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8



                 attached base packages:

                 [1] stats     graphics  grDevices datasets  utils     tools     methods

                 [8] base


                 other attached packages:

                 [1] foo_0.0.0            rmarkdown_0.3.8      knitr_1.6

                 [4] weaver_1.31.0        codetools_0.2-9      digest_0.6.4

                 [7] BiocInstaller_1.16.0


                 loaded via a namespace (and not attached):

                  [1] BiocGenerics_0.11.5   evaluate_0.5.5        formatR_1.0

                  [4] GenomeInfoDb_1.1.26   GenomicRanges_1.17.48 htmltools_0.2.6

                  [7] IRanges_1.99.32       parallel_3.1.1        S4Vectors_0.2.8

                 [10] stats4_3.1.1          stringr_0.6.2         XVector_0.5.8

                      And that works only if the GenomicRanges package is
                      attached. Attaching GenomicRanges will also attach other
                      packages that GenomicRanges depends on where some GRanges
                      accessors might be defined and documented (e.g.
                      metadata()).



                          In some cases you'll decide you want the user to have
                          a full complement of methods for your package to
                          function meaningfully.  For example, I am considering
                          using dplyr idioms to work with data structures in a
                          package, and it seems I should just depend on dplyr
                          rather than pick out and document which things I want
                          to expose.  But that may still be an undesirable
                          design.


                              package, like

                                  importClassesFrom("GenomicRanges", "GRanges")
                                  exportClasses("GRanges")

                              Surely that is not intended.

                              It is important that my package works without
                              being attached to the search path and I do this by
                              carefully importing what I need, ie. my code does
                              not require that my dependencies are attached to
                              the search path.  But the end user will be hosed
                              without it.


                      Yes s/he will. Fortunately when your package namespace
                      gets loaded by another package, then nothing gets
                      attached to the search path, even if your package depends
                      (instead of imports) on other packages. So using Depends
                      instead of Imports for your own dependencies won't make
                      any difference in that respect, which is good.


                              My impression is that the NOTE in R CMD check was
                              written by someone who did not anticipate
                              large-scale use and re-use of classes and methods
                              across many packages.


                      That's my impression too.

                      Cheers,
                      H.


                              Best,
                              Kasper


                              On Tue, Oct 28, 2014 at 11:14 AM, James W.
                              MacDonald <[email protected]> wrote:

                                  I agree with Vince. It's your job as a package
                                  developer to make available to your package
                                  all the functions necessary for the package to
                                  work. But I am not sure it is your job to load
                                  all the packages that your end user might
                                  need.

                                  Best,

                                  Jim



                                  On Tue, Oct 28, 2014 at 11:04 AM, Vincent
                                  Carey <[email protected]> wrote:

                                      On Tue, Oct 28, 2014 at 10:19 AM, Kasper
                                      Daniel Hansen <[email protected]> wrote:

                                          What is the current best paradigm for
                                          using all the classes in
                                          S4Vectors/GenomeInfoDb/GenomicRanges/IRanges

                                          I obviously import methods and classes
                                          from the relevant packages.

                                          But shouldn't I depend on these
                                          packages as well?  Since I basically
                                          want the user to have this
                                          functionality at the command line?
                                          That is what I do now.


                                      I've wondered about this as well.  It
                                      seems the principle is that the user
                                      should take care of attaching additional
                                      packages when needed.  It might be
                                      appropriate to give a hint in the package
                                      startup message, if having some other
                                      package attached would typically be of
                                      great utility.
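
A startup hint of this kind could be sketched as follows. This is an illustration only: the package name foo and the message wording are hypothetical, and the function would normally live in a file such as R/zzz.R of the package. The .onAttach hook and packageStartupMessage() are standard R.

```r
# Hypothetical R/zzz.R for a package "foo" that imports GenomicRanges
# and returns GRanges objects to the user.
.onAttach <- function(libname, pkgname) {
  # Only give the hint when GenomicRanges is not already on the search path.
  if (!"package:GenomicRanges" %in% search()) {
    packageStartupMessage(
      pkgname, " returns GRanges objects; ",
      "run library(GenomicRanges) to use accessors such as ranges()."
    )
  }
  invisible(NULL)
}
```

Using packageStartupMessage() rather than message() lets users silence the hint with suppressPackageStartupMessages(library(foo)), so the package stays quiet in scripts while still pointing interactive users to the package they are likely to need.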

                                      Given your list above, I would think that
                                      depending on GenomicRanges would often be
                                      sufficient, and IRanges/S4Vectors would
                                      not require dependency assertion.  I would
                                      think that GenomeInfoDb should be a
                                      voluntary attachment for a specific
                                      session.

                                      These are just my guesses -- I doubt
                                      there will be complete consensus, but I
                                      have started to think very critically
                                      about using Depends, and I think it is
                                      better when its use is minimized.


                                          That of course leads to the R CMD
                                          check NOTE on depending on too many
                                          packages.... I guess I should ignore
                                          that one.

                                          Best,
                                          Kasper

                                          [[alternative HTML version deleted]]

                                          _______________________________________________
                                          [email protected] mailing list
                                          https://stat.ethz.ch/mailman/listinfo/bioc-devel






                                  --
                                  James W. MacDonald, M.S.
                                  Biostatistician
                                  University of Washington
                                  Environmental and Occupational Health Sciences
                                  4225 Roosevelt Way NE, # 100
                                  Seattle WA 98105-6099






                      --
                      Hervé Pagès

                      Program in Computational Biology
                      Division of Public Health Sciences
                      Fred Hutchinson Cancer Research Center
                      1100 Fairview Ave. N, M1-B514
                      P.O. Box 19024
                      Seattle, WA 98109-1024

                      E-mail: [email protected]
                      Phone: (206) 667-5791
                      Fax: (206) 667-1319








--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [email protected]
Phone:  (206) 667-5791
Fax:    (206) 667-1319

_______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
