This email is enough to start the conversation, but the person who will help is 
on holiday until approximately January 3 so a response will be delayed.

Martin

On 12/21/18, 7:26 PM, "Bioc-devel on behalf of Lu, Dongyi (Lambda)" 
<bioc-devel-boun...@r-project.org on behalf of d...@caltech.edu> wrote:

    I don’t mean to name the package “SingleCell”; I was referring to the 
biocViews term. Also, the BUS format is quite different from the 10x molecule 
info file. CellRanger aligns reads to the genome with STAR, whereas the BUS 
file is generated by pseudoalignment to a transcriptome index and gives the set 
of transcripts a read is compatible with rather than the gene a read aligns to.
    
    The ExperimentHub vignette about creating a new ExperimentHub package says 
that we should contact a Bioconductor team member to upload the data. Does that 
mean I should email one of the core team members directly?
    
    Lambda
    
    On 12/21/18, 3:02 AM, "Bioc-devel on behalf of 
bioc-devel-requ...@r-project.org" <bioc-devel-boun...@r-project.org on behalf 
of bioc-devel-requ...@r-project.org> wrote:
    
        Send Bioc-devel mailing list submissions to
                bioc-devel@r-project.org
        
        To subscribe or unsubscribe via the World Wide Web, visit
                https://stat.ethz.ch/mailman/listinfo/bioc-devel
        or, via email, send a message with subject or body 'help' to
                bioc-devel-requ...@r-project.org
        
        You can reach the person managing the list at
                bioc-devel-ow...@r-project.org
        
        When replying, please edit your Subject line so it is more specific
        than "Re: Contents of Bioc-devel digest..."
        
        
        Today's Topics:
        
           1. Re:  New ExperimentHub resource and some related questions
              (Aaron Lun)
           2. Re:  New ExperimentHub resource and some related questions
              (Shepherd, Lori)
           3. Re: Aliasing `]` breaks BiocCheck::BiocCheck() version 1.18.0
              (Martin Morgan)
           4. Re: Aliasing `]` breaks BiocCheck::BiocCheck() version 1.18.0
              (Tierney, Luke)
           5. Re: Compilation flags, CHECK errors and BiocNeighbors
              (Obenchain, Valerie)
        
        ----------------------------------------------------------------------
        
        Message: 1
        Date: Thu, 20 Dec 2018 12:00:20 +0000
        From: Aaron Lun <infinite.monkeys.with.keyboa...@gmail.com>
        To: bioc-devel <bioc-devel@r-project.org>
        Subject: Re: [Bioc-devel]  New ExperimentHub resource and some related
                questions
        Message-ID: <9bf95433-af04-431b-b71d-62425195d...@gmail.com>
        Content-Type: text/plain; charset="utf-8"
        
        I presume your package is not actually called “SingleCell” (in point 
1). This would be pretty confusing when compared to the simpleSingleCell 
package, the SingleCellExperiment package, and the SingleCell biocViews term 
itself. It would probably make more sense to call it BUStoolsR or some other 
appropriate pun (e.g., RBUS, which is funniest when it gets to version 3.8.0.).
        
        Also, at first glance, the BUS format seems pretty similar to 10X’s 
molecule information file, for which the DropletUtils package has a series of 
reader functions. You may find some of the code there useful for your package. 
I might also add a readBUS() function to DropletUtils if this turns out to be a 
popular format for droplet data, though TBH the sparse matrix is a much more 
common starting point.
        
        -A
        
        > On 20 Dec 2018, at 01:42, Lu, Dongyi (Lambda) <d...@caltech.edu> 
wrote:
        > 
        > Hi everyone,
        > 
        > I’m writing a package (biocViews SingleCell) that converts files of 
the BUS format (standing for Barcode, UMI, Set, see 
https://www.biorxiv.org/content/early/2018/11/21/472571) into a sparse matrix 
in R that can be used in Seurat and SingleCellExperiment. In order to write the 
examples and the vignette, I’m also putting the data itself into a package for 
ExperimentHub. The data used here are some mixed human and mouse cells from 
10x. Here are my questions:
        > 
        > 
        >  1.  In the documentation for 
`ExperimentHubData::makeExperimentHubMetadata`, the fields `RDataClass` and 
`DispatchClass` are required. However, this accompanying dataset package is 
meant to download text files (generated by command line tools outside R) to 
disk rather than into the R session, and it’s the job of the SingleCell package 
to convert the text files into a sparse matrix. There is a website documenting 
how the command line tools were used to generate the text files. So is this 
dataset still appropriate for ExperimentHub?
        >  2.  If it is appropriate, then what shall I put in `RDataClass` and 
`DispatchClass`?
        > 
        > Thanks,
        > Lambda
        > 
        > _______________________________________________
        > Bioc-devel@r-project.org mailing list
        > https://stat.ethz.ch/mailman/listinfo/bioc-devel
        
        
        
        
        ------------------------------
        
        Message: 2
        Date: Thu, 20 Dec 2018 12:05:57 +0000
        From: "Shepherd, Lori" <lori.sheph...@roswellpark.org>
        To: "Lu, Dongyi (Lambda)" <d...@caltech.edu>,
                "bioc-devel@r-project.org" <bioc-devel@r-project.org>
        Subject: Re: [Bioc-devel]  New ExperimentHub resource and some related
                questions
        Message-ID:
                
<mw2pr12mb23645e21836b066c9e38f9ddf9...@mw2pr12mb2364.namprd12.prod.outlook.com>
                
        Content-Type: text/plain; charset="utf-8"
        
        There is a DispatchClass, FilePath, that will download the file and 
give you the path to the file in the cache location rather than loading it into 
the R session. You can then use the file path in whatever read/load/etc. method 
you deem fit.
        
        For RDataClass, I would say either character or matrix, knowing that 
there will be instructions on how to load the data somewhere in your package.
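
A minimal sketch of how these two fields might sit in the metadata table, 
assuming the column names listed in the ExperimentHubData documentation (they 
should be checked against the current vignette). Every value other than 
DispatchClass and RDataClass is a hypothetical placeholder, including the 
package name, paths and URL.

```r
# Hypothetical metadata row for a dataset made of plain-text files.
# Only DispatchClass and RDataClass come from the advice in this thread;
# all other values are placeholders for illustration.
meta <- data.frame(
    Title = "Example BUS output (placeholder)",
    Description = "Text files produced by command line tools (placeholder)",
    BiocVersion = "3.9",
    Genome = NA_character_,
    SourceType = "TXT",                      # assumed to be an accepted value
    SourceUrl = "https://example.org/placeholder",
    SourceVersion = "placeholder",
    Species = NA_character_,
    TaxonomyId = NA_integer_,
    Coordinate_1_based = NA,
    DataProvider = "placeholder",
    Maintainer = "Jane Doe <jane@example.org>",
    RDataClass = "character",                # the hub hands back a file path
    DispatchClass = "FilePath",              # download only, do not load
    RDataPath = "examplePkg/example_file.txt",
    stringsAsFactors = FALSE
)

# Written to a temporary file here; in a data package this table would
# conventionally live at inst/extdata/metadata.csv.
csv <- file.path(tempdir(), "metadata.csv")
write.csv(meta, csv, row.names = FALSE)
```

With FilePath, retrieving the resource is expected to return a local path in 
the cache, which the companion package can then pass to its own reader.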
        
        
        
        Lori Shepherd
        
        Bioconductor Core Team
        
        Roswell Park Cancer Institute
        
        Department of Biostatistics & Bioinformatics
        
        Elm & Carlton Streets
        
        Buffalo, New York 14263
        
        ________________________________
        From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of Lu, 
Dongyi (Lambda) <d...@caltech.edu>
        Sent: Wednesday, December 19, 2018 8:42:39 PM
        To: bioc-devel@r-project.org
        Subject: [Bioc-devel] New ExperimentHub resource and some related 
questions
        
        Hi everyone,
        
        I’m writing a package (biocViews SingleCell) that converts files of 
the BUS format (standing for Barcode, UMI, Set, see 
https://www.biorxiv.org/content/early/2018/11/21/472571) into a sparse matrix 
in R that can be used in Seurat and SingleCellExperiment. In order to write the 
examples and the vignette, I’m also putting the data itself into a package for 
ExperimentHub. The data used here are some mixed human and mouse cells from 
10x. Here are my questions:
        
        
          1.  In the documentation for 
`ExperimentHubData::makeExperimentHubMetadata`, the fields `RDataClass` and 
`DispatchClass` are required. However, this accompanying dataset package is 
meant to download text files (generated by command line tools outside R) to 
disk rather than into the R session, and it’s the job of the SingleCell package 
to convert the text files into a sparse matrix. There is a website documenting 
how the command line tools were used to generate the text files. So is this 
dataset still appropriate for ExperimentHub?
          2.  If it is appropriate, then what shall I put in `RDataClass` and 
`DispatchClass`?
        
        Thanks,
        Lambda
        
        _______________________________________________
        Bioc-devel@r-project.org mailing list
        https://stat.ethz.ch/mailman/listinfo/bioc-devel
        
        
        This email message may contain legally privileged and/or confidential 
information.  If you are not the intended recipient(s), or the employee or 
agent responsible for the delivery of this message to the intended 
recipient(s), you are hereby notified that any disclosure, copying, 
distribution, or use of this email message is prohibited.  If you have received 
this message in error, please notify the sender immediately by e-mail and 
delete this email message from your computer. Thank you.
        
        
        
        
        ------------------------------
        
        Message: 3
        Date: Thu, 20 Dec 2018 14:17:04 +0000
        From: Martin Morgan <mtmorgan.b...@gmail.com>
        To: "Tierney, Luke" <luke-tier...@uiowa.edu>, "Shepherd, Lori"
                <lori.sheph...@roswellpark.org>
        Cc: bioc-devel <bioc-devel@r-project.org>
        Subject: Re: [Bioc-devel] Aliasing `]` breaks BiocCheck::BiocCheck()
                version 1.18.0
        Message-ID:
                
<mwhpr05mb3582c1f459721640bde93c55f9...@mwhpr05mb3582.namprd05.prod.outlook.com>
                
        Content-Type: text/plain; charset="utf-8"
        
        this comes from `findGlobals()`
        
        > foo <- `[`
        > findGlobals(foo)
        Error in makeUsageCollector(fun, ...) : only works for closures
        > traceback()
        4: stop("only works for closures")
        3: makeUsageCollector(fun, ...)
        2: collectUsage(fun, enterGlobal = enter)
        1: findGlobals(foo)
        
        In the bigger context it is in code that looks for poor 'coding 
practice', in this particular case looking for use of T / F rather than TRUE / 
FALSE, where the logic is to parse each function for use of global variables, 
and then to search for T / F amongst those.
        
        The full traceback when run on the package at 
https://github.com/mtmorgan/PkgA/tree/BiocCheck-sbs
        
        * Checking coding practice...
        Error in makeUsageCollector(fun, ...) : only works for closures
        > traceback()
        9: stop("only works for closures")
        8: makeUsageCollector(fun, ...)
        7: collectUsage(fun, enterGlobal = enter)
        6: findGlobals(value)
        5: FUN(X[[i]], ...)
        4: lapply(objs, FUN = function(obj) {
               value = env[[obj]]
               if (is.function(value)) 
                   findGlobals(value)
               else character(0)
           })
        3: findLogicalRdir(pkgname, c("T", "F"))
        2: checkCodingPractice(package_dir, parsedCode, package_name)
        1: BiocCheck::BiocCheck(".")
        
        Martin
        
        On 12/19/18, 8:32 AM, "Bioc-devel on behalf of Tierney, Luke" 
<bioc-devel-boun...@r-project.org on behalf of luke-tier...@uiowa.edu> wrote:
        
            codetools already checks only closures in checkUsageEnv and hence
            checkUsagePackage, so this is an issue on the Bioc side.
            
            Best,
            
            luke
            
            On Tue, 18 Dec 2018, Tierney, Luke wrote:
            
            > Codetools should probably be ignoring those. Will have a look
            >
            > Sent from my iPhone
            >
            >> On Dec 18, 2018, at 6:54 AM, Shepherd, Lori 
<lori.sheph...@roswellpark.org> wrote:
            >>
            >> Can you please open an issue for this so we don't lose track of 
it -
            >>
            >> https://github.com/Bioconductor/BiocCheck/issues
            >>
            >>
            >>
            >> Lori Shepherd
            >>
            >> Bioconductor Core Team
            >>
            >> Roswell Park Cancer Institute
            >>
            >> Department of Biostatistics & Bioinformatics
            >>
            >> Elm & Carlton Streets
            >>
            >> Buffalo, New York 14263
            >>
            >> ________________________________
            >> From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of 
Shian Su <s...@wehi.edu.au>
            >> Sent: Monday, December 17, 2018 8:34:10 PM
            >> To: bioc-devel
            >> Subject: [Bioc-devel] Aliasing `]` breaks BiocCheck::BiocCheck() 
version 1.18.0
            >>
            >> Hi all,
            >>
            >> If you put
            >>
            >> foo <- `[`
            >>
            >> Somewhere in a package, it will trigger
            >>
            >> Error in makeUsageCollector(fun, ...) : only works for closures
            >>
            >> In BiocCheck::BiocCheck() (version 1.18.0). This comes from
            >>
            >> if (typeof(fun) != "closure")
            >>        stop("only works for closures")
            >>
            >> In codetools::makeUsageCollector(), but
            >>
            >>> typeof(`[`)
            >> ## "special"
            >>
            >> Not that it matters for my use-case because I had discovered 
magrittr’s extract alias, but it might be an edge case worth covering, 
especially since the error message is so cryptic.
            >>
            >> Kind regards,
            >> Shian Su
            >>
            >> _______________________________________________
            >> Bioc-devel@r-project.org mailing list
            >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
            > _______________________________________________
            > Bioc-devel@r-project.org mailing list
            > https://stat.ethz.ch/mailman/listinfo/bioc-devel
            
            -- 
            Luke Tierney
            Ralph E. Wareham Professor of Mathematical Sciences
            University of Iowa                  Phone:             319-335-3386
            Department of Statistics and        Fax:               319-335-3017
                Actuarial Science
            241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
            Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
            
            _______________________________________________
            Bioc-devel@r-project.org mailing list
            https://stat.ethz.ch/mailman/listinfo/bioc-devel
            
        
        
        ------------------------------
        
        Message: 4
        Date: Thu, 20 Dec 2018 14:31:47 +0000
        From: "Tierney, Luke" <luke-tier...@uiowa.edu>
        To: Martin Morgan <mtmorgan.b...@gmail.com>
        Cc: "Shepherd, Lori" <lori.sheph...@roswellpark.org>, bioc-devel
                <bioc-devel@r-project.org>
        Subject: Re: [Bioc-devel] Aliasing `]` breaks BiocCheck::BiocCheck()
                version 1.18.0
        Message-ID: <alpine.DEB.2.21.1812200829080.3478@luke-Latitude-7480>
        Content-Type: text/plain; charset="utf-8"
        
        That's where the error is signaled, but the issue is in
        
        > 4: lapply(objs, FUN = function(obj) {
        >       value = env[[obj]]
        >       if (is.function(value))
        >           findGlobals(value)
        >       else character(0)
        >   })
        > 3: findLogicalRdir(pkgname, c("T", "F"))
        
        Change is.function(value) to typeof(value) == "closure" and you should 
be OK.
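
A minimal sketch of the suggested guard, assuming a hypothetical environment 
`env` standing in for a package namespace. This is not the BiocCheck source, 
only an illustration of why testing for "closure" avoids the error while 
is.function() does not.

```r
library(codetools)

# Hypothetical stand-in for a package namespace.
env <- new.env()
env$foo <- `[`                                # a primitive of type "special"
env$bar <- function(x) if (x == T) 1 else 0   # a closure that uses T

objs <- ls(env)
globals <- lapply(objs, function(obj) {
    value <- env[[obj]]
    # is.function(value) is TRUE for primitives as well, and findGlobals()
    # stops on those; checking the type skips them instead.
    if (typeof(value) == "closure")
        findGlobals(value)
    else
        character(0)
})
names(globals) <- objs

# Flag objects whose globals include T or F.
uses_TF <- vapply(globals, function(g) any(c("T", "F") %in% g), logical(1))
```

Here `foo` is skipped without error and `bar` is flagged for its use of T.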
        
        Best,
        
        luke
        
        On Thu, 20 Dec 2018, Martin Morgan wrote:
        
        > this comes from `findGlobals()`
        >
        >> foo <- `[`
        >> findGlobals(foo)
        > Error in makeUsageCollector(fun, ...) : only works for closures
        >> traceback()
        > 4: stop("only works for closures")
        > 3: makeUsageCollector(fun, ...)
        > 2: collectUsage(fun, enterGlobal = enter)
        > 1: findGlobals(foo)
        >
        > In the bigger context it is in code that looks for poor 'coding 
practice', in this particular case looking for use of T / F rather than TRUE / 
FALSE, where the logic is to parse each function for use of global variables, 
and then to search for T / F amongst those.
        >
        > The full traceback when run on the package at 
https://github.com/mtmorgan/PkgA/tree/BiocCheck-sbs
        >
        > * Checking coding practice...
        > Error in makeUsageCollector(fun, ...) : only works for closures
        >> traceback()
        > 9: stop("only works for closures")
        > 8: makeUsageCollector(fun, ...)
        > 7: collectUsage(fun, enterGlobal = enter)
        > 6: findGlobals(value)
        > 5: FUN(X[[i]], ...)
        > 4: lapply(objs, FUN = function(obj) {
        >       value = env[[obj]]
        >       if (is.function(value))
        >           findGlobals(value)
        >       else character(0)
        >   })
        > 3: findLogicalRdir(pkgname, c("T", "F"))
        > 2: checkCodingPractice(package_dir, parsedCode, package_name)
        > 1: BiocCheck::BiocCheck(".")
        >
        > Martin
        >
        > On 12/19/18, 8:32 AM, "Bioc-devel on behalf of Tierney, Luke" 
<bioc-devel-boun...@r-project.org on behalf of luke-tier...@uiowa.edu> wrote:
        >
        >    codetools already checks only closures in checkUsageEnv and hence
        >    checkUsagePackage, so this is an issue on the Bioc side.
        >
        >    Best,
        >
        >    luke
        >
        >    On Tue, 18 Dec 2018, Tierney, Luke wrote:
        >
        >    > Codetools should probably be ignoring those. Will have a look
        >    >
        >    > Sent from my iPhone
        >    >
        >    >> On Dec 18, 2018, at 6:54 AM, Shepherd, Lori 
<lori.sheph...@roswellpark.org> wrote:
        >    >>
        >    >> Can you please open an issue for this so we don't lose track of 
it -
        >    >>
        >    >> https://github.com/Bioconductor/BiocCheck/issues
        >    >>
        >    >>
        >    >>
        >    >> Lori Shepherd
        >    >>
        >    >> Bioconductor Core Team
        >    >>
        >    >> Roswell Park Cancer Institute
        >    >>
        >    >> Department of Biostatistics & Bioinformatics
        >    >>
        >    >> Elm & Carlton Streets
        >    >>
        >    >> Buffalo, New York 14263
        >    >>
        >    >> ________________________________
        >    >> From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf 
of Shian Su <s...@wehi.edu.au>
        >    >> Sent: Monday, December 17, 2018 8:34:10 PM
        >    >> To: bioc-devel
        >    >> Subject: [Bioc-devel] Aliasing `]` breaks 
BiocCheck::BiocCheck() version 1.18.0
        >    >>
        >    >> Hi all,
        >    >>
        >    >> If you put
        >    >>
        >    >> foo <- `[`
        >    >>
        >    >> Somewhere in a package, it will trigger
        >    >>
        >    >> Error in makeUsageCollector(fun, ...) : only works for closures
        >    >>
        >    >> In BiocCheck::BiocCheck() (version 1.18.0). This comes from
        >    >>
        >    >> if (typeof(fun) != "closure")
        >    >>        stop("only works for closures")
        >    >>
        >    >> In codetools::makeUsageCollector(), but
        >    >>
        >    >>> typeof(`[`)
        >    >> ## "special"
        >    >>
        >    >> Not that it matters for my use-case because I had discovered 
magrittr’s extract alias, but it might be an edge case worth covering, 
especially since the error message is so cryptic.
        >    >>
        >    >> Kind regards,
        >    >> Shian Su
        >    >>
        >    >> _______________________________________________
        >    >> Bioc-devel@r-project.org mailing list
        >    >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
        >    > _______________________________________________
        >    > Bioc-devel@r-project.org mailing list
        >    > https://stat.ethz.ch/mailman/listinfo/bioc-devel
        >
        >    --
        >    Luke Tierney
        >    Ralph E. Wareham Professor of Mathematical Sciences
        >    University of Iowa                  Phone:             319-335-3386
        >    Department of Statistics and        Fax:               319-335-3017
        >        Actuarial Science
        >    241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
        >    Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
        >
        >    _______________________________________________
        >    Bioc-devel@r-project.org mailing list
        >    https://stat.ethz.ch/mailman/listinfo/bioc-devel
        >
        >
        
        -- 
        Luke Tierney
        Ralph E. Wareham Professor of Mathematical Sciences
        University of Iowa                  Phone:             319-335-3386
        Department of Statistics and        Fax:               319-335-3017
            Actuarial Science
        241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
        Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
        
        ------------------------------
        
        Message: 5
        Date: Thu, 20 Dec 2018 19:52:08 +0000
        From: "Obenchain, Valerie" <valerie.obench...@roswellpark.org>
        To: Aaron Lun <infinite.monkeys.with.keyboa...@gmail.com>,
                "bioc-devel@r-project.org" <bioc-devel@r-project.org>
        Subject: Re: [Bioc-devel] Compilation flags, CHECK errors and
                BiocNeighbors
        Message-ID:
                
<mwhpr1201mb02547c0566b9daf16450cf7bff...@mwhpr1201mb0254.namprd12.prod.outlook.com>
                
        Content-Type: text/plain; charset="utf-8"
        
        The problem is that during the nightly builds, one of the Bioconductor 
        packages writes out a .R/Makevars.win in biocbuild's HOME during R CMD 
        build.
        
        Yesterday I removed the .R/ directory before the builds started and, as 
        expected, today's NodeInfo on tokay2 and packages using C++11 show 
        the correct flags.
        
        If this .R/Makevars.win is not removed, it will (and did in the past) 
        pollute the next build cycle such that the NodeInfo and all packages 
        using C++11 would report/use the wrong flags.
        
        I think I've narrowed down which package is doing this and will contact 
        the maintainer. We'll also implement some sanitation code in the BBS to 
        prevent this from happening again.
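
For anyone wanting to check a local setup for the same kind of pollution, a 
minimal sketch, assuming R >= 3.5.0 where tools::makevars_user() is available. 
It only reports whether a user-level Makevars file exists and what it contains; 
it is not the sanitation code planned for the BBS.

```r
# Locate the user-level Makevars file, if any. On Windows this is where a
# file such as .R/Makevars.win in HOME would be picked up.
user_makevars <- tools::makevars_user()

if (length(user_makevars)) {
    # Flags set here override those in R's own etc/Makeconf.
    message("User-level Makevars found: ", user_makevars)
    print(readLines(user_makevars, warn = FALSE))
} else {
    message("No user-level Makevars found; the Makeconf flags apply.")
}
```

Running `R CMD config CXX11FLAGS`, as quoted further down in this thread, shows 
the flags that are actually in effect.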
        
        The reason HOME is writable is that many applications need to create 
        files (often hidden) such as lock files, cache, config files etc. If 
        they can't, they'll break and they will sometimes break in a subtle way 
        that is not immediately obvious.
        
        One last follow up is to explain why the previous iteration of the 
        NodeInfo on the build report reported the incorrect C++11 flags. The 
        problem there was that previously we were only picking up CXX1XFLAGS 
        instead of the individual CXX11FLAGS, CXX14FLAGS etc.
        
        Thanks for being persistent on this issue and for bringing the 
        conversation to bioc-devel.
        
        Val
        
        
        
        On 12/18/18 8:39 AM, Obenchain, Valerie wrote:
        > The devel build report hasn't posted yet but I took a look at the new
        > compiler flag output Herve implemented. The results show tokay2 is
        > indeed using
        > 
        > CXX11FLAGS: -O3 -march=native -mtune=native
        > 
        > This is inconsistent with what we have in the R/etc/<arch>/Makeconf 
for
        > both architectures on both tokay1 and tokay2. The Makeconf looks like 
this:
        > 
        > CXX11 = $(BINPREF)g++ $(M_ARCH)
        > CXX11FLAGS = -O2 -Wall $(DEBUGFLAG) -mtune=generic
        > CXX11PICFLAGS =
        > CXX11STD = -std=gnu++11
        > 
        > I don't know why the Makeconf is not being respected on tokay2. I can
        > confirm the inconsistency in an R session -
        > 
        > tokay2:
        > 
        > PS C:\Users\biocbuild\bbs-3.9-bioc\R> ./bin/R CMD config CXX11FLAGS
        > -O3 -march=native -mtune=native
        > 
        > tokay1:
        > 
        > PS C:\Users\biocbuild\bbs-3.8-bioc\R> ./bin/R CMD config CXX11FLAGS
        > -O2 -Wall -mtune=generic
        > 
        > I'll work with Herve to resolve this.
        > 
        > Val
        > 
        > 
        > 
        > On 12/17/18 5:05 PM, Aaron Lun wrote:
        >> Thanks Val. I don’t think it’s a BiocNeighbors thing, as it doesn’t 
try
        >> to customize the compilation flags or have its own Makevars. 
Moreover,
        >> the “-O3 -mtune=native -mtune=generic” flags seem to show up on all 
of
        >> my packages containing C++11 code. Some cursory checks of other 
packages
        >> suggest that the correct flags (“-O2 -mtune=generic”) are used for 
C++98
        >> code.
        >>
        >> -A
        >>
        >>> On 17 Dec 2018, at 17:47, Obenchain, Valerie 
<valerie.obench...@roswellpark.org> wrote:
        >>>
        >>> Hi Aaron,
        >>>
        >>> The only compilation flags that are different for tokay1 (release) 
and
        >>> tokay2 (devel) are C++14 flags. BiocNeighbors is not using C++14 but
        >>> C++11 so I think the changes we discussed previously actually don't
        >>> apply to your case.
        >>>
        >>> All compilation flags we use are listed at the top of the build 
report,
        >>> e.g., for tokay2:
        >>>
        >>> 
https://www.bioconductor.org/checkResults/devel/bioc-LATEST/tokay2-NodeInfo.html
        >> 
<https://www.bioconductor.org/checkResults/devel/bioc-LATEST/tokay2-NodeInfo.html>
        >>>
        >>> I can look into this further but right now I'm not sure where the 
'-O3
        >>> -march=native -mtune=native' is coming from in the check output for
        >>> BiocNeighbors. We don't use 'native' on the builders for 
build/check or
        >>> for creating binaries.
        >>>
        >>> Herve might have more insight on this.
        >>>
        >>> Val
        >>>
        >>>
        >>>
        >>>
        >>>
        >>>
        >>>
        >>> On 12/15/18 10:56 PM, Aaron Lun wrote:
        >>>> Sometime between 6-18 November, BiocNeighbors’ BioC-devel builds 
began failing on Windows 64-bit, and have continued to fail since:
        >>>>
        >>>> 
http://bioconductor.org/checkResults/devel/bioc-LATEST/BiocNeighbors/
        >> 
<http://bioconductor.org/checkResults/devel/bioc-LATEST/BiocNeighbors/>
        >> 
<http://bioconductor.org/checkResults/devel/bioc-LATEST/BiocNeighbors/
        >> 
<http://bioconductor.org/checkResults/devel/bioc-LATEST/BiocNeighbors/>>
        >>>>
        >>>> The most interesting part is the nature of the failures. They are 
not segmentation faults but rather “incorrect” output in the unit tests:
        >>>>
        >>>> - BiocNeighbors uses the Annoy algorithm for approximate nearest 
neighbor search, which is provided as a header-only C++ library in the 
RcppAnnoy package.
        >>>>
        >>>> - I have compiled the BiocNeighbors C++ code with an “#include” 
for these libraries to use the Annoy routines. For testing, I compared the 
output of my C++ code to the output of the code in the RcppAnnoy package.
        >>>>
        >>>> - It is these tests that are failing (i.e., the output does not 
match up) during CHECK on Windows 64-bit only, despite the fact that the same 
library is being “#include”d in both the BiocNeighbors and RcppAnnoy sources!
        >>>>
        >>>> What makes this particularly intriguing is that the differences 
between BiocNeighbors and RcppAnnoy are very minor. Less than 1% of the 
neighbor identities differ, and only for some of the scenarios, so it’s not an 
obvious bug that would be changing the  output en masse. Now, the package also 
uses/tests Annoy in
        >> BioC-release but builds fine on tokay1:
        >>>>
        >>>> 
http://bioconductor.org/checkResults/release/bioc-LATEST/BiocNeighbors/
        >> 
<http://bioconductor.org/checkResults/release/bioc-LATEST/BiocNeighbors/> 
<http://bioconductor.org/checkResults/release/bioc-LATEST/BiocNeighbors/
        >> 
<http://bioconductor.org/checkResults/release/bioc-LATEST/BiocNeighbors/>>
        >>>>
        >>>> The major difference between the Bioc-release/devel builds is the 
compilation flags, which have changed from “-O2 -mtune=generic” to “-O3 
-march=native -mtune=native” in tokay2. I am told (thanks Val) that the timing 
of this change is consistent with the start of the BiocNeighbors build 
failures on tokay2. I would guess
        >> that RcppAnnoy is also compiled with “-O2 -mtune=generic” on the CRAN
        >> build systems, introducing differences in optimization levels between
        >> the BiocNeighbors and RcppAnnoy binaries. These could be responsible 
for
        >> the discrepancies in the search results.
        >>>>
        >>>> I was able to reproduce this on my Unix cluster (gcc 6.5.0) where 
setting “-march=native” with either “-O3” or “-O2” caused a difference in the 
calculations. After much trial and error, I eventually narrowed this down to 
the “-mfma” flag, which seems to change the precision of multiply-and-add 
operations and thus the
        >> search results. This occurs even when AVX support is turned off; I 
guess
        >> the compiler tries to be smart if it detects you are doing some kind 
of
        >> simultaneous multiply and addition, which is a pretty common thing 
to do
        >> when computing Euclidean distances.
        >>>>
        >>>> In summary: can we not use “-march=native” on tokay2? (Val, I know 
we discussed this, but whatever changes you made to the compilation flags don’t 
seem to have propagated to the build machines.) As the case study with 
BiocNeighbors shows, this leads to inconsistencies  between the CRAN and 
BioC-devel binaries for the same code, which
        >> unnecessarily complicates downstream usage and unit tests. I also 
wonder
        >> how binaries specialized for tokay2’s architecture would behave on 
other
        >> CPUs with different instruction sets, if they would run at all.
        >>>>
        >>>> Cheers,
        >>>>
        >>>> Aaron
        >>>>
        >>>> _______________________________________________
        >>>> Bioc-devel@r-project.org <mailto:Bioc-devel@r-project.org> mailing 
list
        >>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
        >> <https://stat.ethz.ch/mailman/listinfo/bioc-devel>
        >>>>
        >>>
        >>>
        >>>
        >>
        >> _______________________________________________
        >> Bioc-devel@r-project.org mailing list
        >> https://stat.ethz.ch/mailman/listinfo/bioc-devel
        > 
        > 
        > 
        > _______________________________________________
        > Bioc-devel@r-project.org mailing list
        > https://stat.ethz.ch/mailman/listinfo/bioc-devel
        > 
        
        
        
        
        ------------------------------
        
        Subject: Digest Footer
        
        _______________________________________________
        Bioc-devel mailing list
        Bioc-devel@r-project.org
        https://stat.ethz.ch/mailman/listinfo/bioc-devel
        
        
        ------------------------------
        
        End of Bioc-devel Digest, Vol 177, Issue 17
        *******************************************
    
    _______________________________________________
    Bioc-devel@r-project.org mailing list
    https://stat.ethz.ch/mailman/listinfo/bioc-devel
    
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
