On 10/28/2014 12:42 PM, Vincent Carey wrote:


On Tue, Oct 28, 2014 at 2:29 PM, Hervé Pagès <[email protected]> wrote:

    Hi,

    On 10/28/2014 08:48 AM, Vincent Carey wrote:

        On Tue, Oct 28, 2014 at 11:23 AM, Kasper Daniel Hansen
        <[email protected]> wrote:

            Well, first I want to make sure that there is not something special
            regarding S4 methods and classes. I have a feeling that they are a
            special case.

            Second, while I agree with Jim's general opinion, it is a little bit
            different when I have return objects which are defined in other
            packages. If I don't depend on this other package, the user is hosed
            wrt. the return object, unless I manually export all classes from
            this other


        In what sense?  If you return an instance of GRanges, certain things
        can be done even if GenomicRanges is not attached.


    Yes certain things maybe, but it's hard to predict which ones.

        You can get values of slots, for example.

        With the following little package

        %vjcair> cat foo/NAMESPACE
        importFrom(IRanges, IRanges)
        importClassesFrom(GenomicRanges, GRanges)
        importFrom(GenomicRanges, GRanges)
        export(myfun)



        %vjcair> cat foo/DESCRIPTION
        Package: foo
        Title: foo
        Version: 0.0.0
        Author: VJ Carey <[email protected]>
        Description:
        Suggests:
        Depends:
        Imports: GenomicRanges
        Maintainer: VJ Carey <[email protected]>
        License: Private
        LazyLoad: yes



        %vjcair> cat foo/R/*
        myfun = function(seqnames="1", ranges=IRanges(1,2), ...)
             GRanges(seqnames=seqnames, ranges=ranges, ...)


        The following works:


        > library(foo)
        > x = myfun()
        > x
        GRanges object with 1 range and 0 metadata columns:
              seqnames    ranges strand
                 <Rle> <IRanges>  <Rle>
          [1]        1    [1, 2]      *
          -------
          seqinfo: 1 sequence from an unspecified genome; no seqlengths


        So the show method works, even though I have not touched it.  (I did
        not expect it to work, in fact.)


    Exactly. Let's call it luck ;-)

          Additionally, I can get access to slots.


    The end user should never try to access slots directly but use getters
    and setters instead. And most getters and setters for GRanges objects
    are defined and documented in the GenomicRanges package. Those that are
    not are defined in packages that GenomicRanges depends on.
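
A minimal sketch of the slot-versus-accessor distinction, using only the
methods package. The Interval class and the lowerBound() generic are
hypothetical stand-ins invented for illustration; they are not part of
GenomicRanges or IRanges.

```r
library(methods)

# Hypothetical toy class standing in for a real S4 class such as GRanges.
setClass("Interval", representation(start = "numeric", end = "numeric"))

# Hypothetical accessor: the documented interface a package would export.
setGeneric("lowerBound", function(x) standardGeneric("lowerBound"))
setMethod("lowerBound", "Interval", function(x) x@start)

iv <- new("Interval", start = 1, end = 2)

# Direct slot access works, but ties the caller to the internal layout.
via_slot <- iv@start

# The accessor returns the same value and survives changes to the slots,
# provided the generic is visible (its package attached, or called with ::).
via_getter <- lowerBound(iv)
```

If the class author later renames or restructures the slots, only the method
body has to change; code calling the accessor keeps working.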

        But ranges() fails.  If I, the user, want to use it, I need to
        arrange for that.


    IMO if your package returns a GRanges object to the user, then the user
    should be able to access the man page for GRanges objects with ?GRanges.


Oddly enough, that seems to be incorrect.  I added a man page to foo that has
a \link[GenomicRanges]{GRanges-class}.  I ran help.start and the cross
reference from my man page succeeds.  Furthermore with the sessionInfo below,
?GRanges succeeds at the CLI.

Did you try to run example(GRanges)? I'm not sure that will work.

For example after I do library(rtracklayer), I can indeed do
?DNAStringSet at the command line (I'm surprised this works), but
then example(DNAStringSet) fails:

  > example(DNAStringSet)
  Warning message:
  In example(DNAStringSet) : no help found for ‘DNAStringSet’

I'm also surprised this is just a warning but that's another story...

H.

I am not trying to defend the NOTE but the principle of minimizing
Depends declarations needs to be considered critically, and I am just
exploring the space.

 > ?GRanges  # it worked as usual in the tty

 > sessionInfo()

R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     tools     methods
[8] base

other attached packages:
[1] foo_0.0.0            rmarkdown_0.3.8      knitr_1.6
[4] weaver_1.31.0        codetools_0.2-9      digest_0.6.4
[7] BiocInstaller_1.16.0

loaded via a namespace (and not attached):
 [1] BiocGenerics_0.11.5   evaluate_0.5.5        formatR_1.0
 [4] GenomeInfoDb_1.1.26   GenomicRanges_1.17.48 htmltools_0.2.6
 [7] IRanges_1.99.32       parallel_3.1.1        S4Vectors_0.2.8
[10] stats4_3.1.1          stringr_0.6.2         XVector_0.5.8

    And that works only if the GenomicRanges package is attached. Attaching
    GenomicRanges will also attach other packages that GenomicRanges depends
    on where some GRanges accessors might be defined and documented (e.g.
    metadata()).



        In some cases you'll decide you want the user to have a full complement
        of methods for your package to function meaningfully.  For example, I am
        considering using dplyr idioms to work with data structures in a
        package, and it seems I should just depend on dplyr rather than pick out
        and document which things I want to expose.  But that may still be an
        undesirable design.


            package, like
                importClassesFrom("GenomicRanges", "GRanges")
                exportClasses("GRanges")
            Surely that is not intended.

            It is important that my package works without being attached to the
            search path, and I do this by carefully importing what I need, i.e.
            my code does not require that my dependencies are attached to the
            search path.  But the end user will be hosed without it.


    Yes s/he will. Fortunately when your package namespace gets loaded by
    another package, then nothing gets attached to the search path, even if
    your package depends (instead of imports) on other packages. So using
    Depends instead of Imports for your own dependencies won't make any
    difference in that respect, which is good.
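
A rough illustration of the difference between a loaded namespace and an
attached package, using stats4 because it ships with R. This only shows
load-versus-attach for a single package, not the full Depends/Imports
behaviour described above, and it assumes stats4 is not already attached in
the session.

```r
# Record whether stats4 was already on the search path (assumed not).
was_attached <- "package:stats4" %in% search()

# Loading the namespace makes the package available to code that imports it
# or uses ::, but does not put it on the search path.
loadNamespace("stats4")
loaded_only <- "stats4" %in% loadedNamespaces()
attached_after_load <- "package:stats4" %in% search()

# Namespace-qualified access works without attaching.
f <- stats4::mle

# library() attaches the package, so its exports become visible at the prompt.
library(stats4)
attached_after_library <- "package:stats4" %in% search()
```

In a fresh session, attached_after_load is expected to be FALSE while
attached_after_library is TRUE, which matches the "loaded via a namespace
(and not attached)" section of sessionInfo().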


            My impression is that the NOTE in R CMD check was written by someone
            who did not anticipate large-scale use and re-use of classes and
            methods across many packages.


    That's my impression too.

    Cheers,
    H.


            Best,
            Kasper


            On Tue, Oct 28, 2014 at 11:14 AM, James W. MacDonald
            <[email protected]> wrote:

                I agree with Vince. It's your job as a package developer to make
                available to your package all the functions necessary for the
                package to work. But I am not sure it is your job to load all
                the packages that your end user might need.

                Best,

                Jim



                On Tue, Oct 28, 2014 at 11:04 AM, Vincent Carey
                <[email protected]> wrote:

                    On Tue, Oct 28, 2014 at 10:19 AM, Kasper Daniel Hansen
                    <[email protected]> wrote:

                        What is the current best paradigm for using all the
                        classes in S4Vectors/GenomeInfoDb/GenomicRanges/IRanges?

                        I obviously import methods and classes from the relevant
                        packages.

                        But shouldn't I depend on these packages as well?  Since
                        I basically want the user to have this functionality at
                        the command line? That is what I do now.


                    I've wondered about this as well.  It seems the principle is
                    that the user should take care of attaching additional
                    packages when needed.  It might be appropriate to give a
                    hint in the package startup message, if having some other
                    package attached would typically be of great utility.
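
One way such a hint might look, as a sketch only. It assumes a hypothetical
package whose functions return GRanges objects; the message wording is
invented, and the function would normally live in a file such as R/zzz.R.

```r
# Hook run by R when the (hypothetical) package is attached with library().
.onAttach <- function(libname, pkgname) {
  # Only nudge the user if GenomicRanges is not already on the search path.
  if (!"package:GenomicRanges" %in% search()) {
    packageStartupMessage(
      "Functions in ", pkgname, " return GRanges objects; ",
      "run library(GenomicRanges) to use accessors such as ranges()."
    )
  }
}
```

Using packageStartupMessage() rather than message() lets users silence the
hint with suppressPackageStartupMessages().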

                    Given your list above, I would think that depending on
                    GenomicRanges would often be sufficient, and
                    IRanges/S4Vectors would not require dependency assertion.
                    I would think that GenomeInfoDb should be a voluntary
                    attachment for a specific session.

                    These are just my guesses -- I doubt there will be complete
                    consensus, but I have started to think very critically about
                    using Depends, and I think it is better when its use is
                    minimized.


                        That of course leads to the R CMD check NOTE on
                        depending on too many
                        packages.... I guess I should ignore that one.

                        Best,
                        Kasper

                        _________________________________________________
                        [email protected] mailing list
                        https://stat.ethz.ch/mailman/listinfo/bioc-devel






                --
                James W. MacDonald, M.S.
                Biostatistician
                University of Washington
                Environmental and Occupational Health Sciences
                4225 Roosevelt Way NE, # 100
                Seattle WA 98105-6099






    --
    Hervé Pagès

    Program in Computational Biology
    Division of Public Health Sciences
    Fred Hutchinson Cancer Research Center
    1100 Fairview Ave. N, M1-B514
    P.O. Box 19024
    Seattle, WA 98109-1024

    E-mail: [email protected]
    Phone: (206) 667-5791
    Fax: (206) 667-1319




