from:"Hervé Pagès"

Re: [Bioc-devel] Biostrings: unicode character ("compact ellipsis") in print()/show() output (2nd attempt, rephrased)

2020-03-09 Thread Hervé Pagès

-2Dx86-5F64-2Ddebian-2Dcl=DwIDaQ=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=iEfGQDmbJr0Dp78RIYxXPcyoSMXMo31SAqDdWvOcdxI=YtTzYBdTB81bDnav9WcphW_315e5Urmzz03mziPFAOc= 
ang/apcluster-00check.html

),

but leads to warnings also in a UTF-8 locale.

Any help is gratefully appreciated, thanks so much in advance!

Best regards,
Ulrich


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Rd] unlink() on "~" removes the home directory

2020-02-26 Thread Hervé Pagès


On 2/26/20 14:47, Gábor Csárdi wrote:

!!! DON'T TRY THE CODE IN THIS EMAIL AT HOME !!!


Ok I'll try it at work on my boss's computer, sounds a lot safer.

H.



Well, unlink() does what it is supposed to do, so you could argue that
there is nothing wrong with it. Also, nobody would call unlink() on
"~", right?

The situation is not so simple, however. E.g. if you happen to have a
directory called "~", and you iterate over all files and directories
to selectively remove some of them, then your code might end up
calling unlink on the local "~" directory, and then your home is gone.

But you would not create a directory named "~", that is just asking
for trouble. Well, surely, _intentionally_ you would not do that.
Unintentionally, you might. E.g. something like this is enough:

# Create a subpath within a base directory
badfun <- function(base = ".", path) {
   dir.create(file.path(base, path), recursive = TRUE, showWarnings = FALSE)
}
badfun(path = "~/foo")

(If you did run this, be very careful how you remove the directory called "~"!)

A real example is `R CMD build` which deletes the home directory of
the current user if the root of the package contains a non-empty "~"
directory. Luckily this is now fixed in R-devel, so R 4.0.0 will do
better. (R 3.6.3 will not.) See
https://github.com/wch/r-source/commit/1d4f7aa1dac427ea2213d1f7cd7b5c16e896af22

I have seen several bug reports about various packages (that call R
CMD build) removing the home directory, so this indeed happens in
practice to a number of people. The commit above will fix `R CMD
build`, but it would be great to "fix" this in general.

It seems pretty hard to prevent users from creating a "~"
directory. But preventing unlink() from deleting "~" does not actually
seem too hard. If unlink() could just refuse to remove "~" (when expand
= TRUE), that would be great. It seems to me that the current behavior
is very rarely intended, and its consequences are potentially
disastrous.

If unlink("~", recursive = TRUE) errors, you can still remove a local
"~" file/dir with unlink("./~", ...). And you can still remove your
home directory if you really want to do that, with
unlink(path.expand("~"), ...). So no functionality is lost.
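
The proposed behavior can be sketched in user code. This is only an illustration, assuming a wrapper named safe_unlink() (a hypothetical name, not part of base R and not the patch discussed here); it rejects the bare tilde and passes everything else to unlink() unchanged. It does not handle other forms such as "~user".

```r
# Hypothetical wrapper (not part of base R) sketching the proposed behavior.
safe_unlink <- function(x, recursive = FALSE, force = FALSE) {
  # Treat "~" and "~/" (with any number of trailing slashes) as the bare tilde.
  is_bare_tilde <- grepl("^~/*$", x)
  if (any(is_bare_tilde)) {
    stop("refusing to remove \"~\": use \"./~\" for a local file or ",
         "path.expand(\"~\") for the home directory", call. = FALSE)
  }
  unlink(x, recursive = recursive, force = force)
}

# An ordinary path is removed as usual.
f <- tempfile("safe-unlink-demo-")
file.create(f)
safe_unlink(f)
file.exists(f)

# The bare tilde is rejected before unlink() is ever called.
msg <- tryCatch(safe_unlink("~", recursive = TRUE),
                error = function(e) conditionMessage(e))
msg
```

With such a wrapper the two escape hatches mentioned above keep working, because "./~" and the expanded home path do not match the bare-tilde pattern.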

Also, if anyone is aware of packages/functions that tend to create "~"
directories or files, please let me know.

I would be happy to submit a patch for the new unlink("~") behavior.

Thanks,
Gabor

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Bioc-devel] TIMEOUT error in building vignette in package scruff

2020-02-03 Thread Hervé Pagès


Hi Zhe,

On 2/2/20 18:12, Wang, Zhe wrote:

Hi Herve,

I made a few modifications to scruff package trying to reduce computing time 
for the examples and vignette compilation. But still the development build does 
not pass on the Windows host due to timeout error. Can you check what went 
wrong on Windows OS?


What "went wrong" is that for some reason the code in the vignette is 
too slow on Windows. Note that 'R CMD build scruff' takes about 1273 
seconds on malbec2. Given that things tend to run slower on Windows 
compared to Linux (for example, 'R CMD build Rsubread' takes 259 seconds 
on Windows vs 96 seconds only on Linux), it is not unreasonable to 
imagine that 'R CMD build scruff' would take more than 2400 seconds on 
tokay2. Hence the TIMEOUT.
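
The estimate can be written out as a back-of-the-envelope calculation. It is only a rough sketch: it assumes that scruff slows down on Windows by the same factor as Rsubread, which nothing above guarantees, and the variable names are made up for the illustration.

```r
# Timings quoted in the message above, in seconds.
scruff_linux   <- 1273   # 'R CMD build scruff' on malbec2 (Linux)
rsubread_win   <- 259    # 'R CMD build Rsubread' on Windows
rsubread_linux <- 96     # 'R CMD build Rsubread' on Linux
timeout        <- 2400   # build time limit implied above

# Assumption: scruff slows down on Windows by the same factor as Rsubread.
slowdown   <- rsubread_win / rsubread_linux
scruff_win <- scruff_linux * slowdown

round(slowdown, 2)     # 2.7
round(scruff_win)      # 3434
scruff_win > timeout   # TRUE
```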


Did you take a look at Rsubread's performance regression introduced on 
Jan 7? Did you report it to the Rsubread maintainer? When you choose to 
depend on someone else's package, it means that you are prepared to deal 
with the changes they make to their package. This includes letting them 
know when they introduce problems that affect you.


Cheers,
H.



Thank you,
Zhe

-Original Message-
From: Wang, Zhe
Sent: Thursday, January 9, 2020 8:16 AM
To: Pages, Herve 
Subject: RE: [Bioc-devel] TIMEOUT error in building vignette in package scruff

Hi Herve,

Thanks a lot for helping out. I'll take a look at Rsubread's source code and 
see if there's anything I can do to fix this. If not I guess I'll contact their 
maintainer for a solution.

Zhe

-Original Message-
From: Pages, Herve [mailto:hpa...@fredhutch.org]
Sent: Wednesday, January 8, 2020 7:07 PM
To: Wang, Zhe ; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] TIMEOUT error in building vignette in package scruff

I can confirm that downgrading Rsubread to an older version eliminates the 
slowdown. Here are the timings I get on my laptop (Ubuntu 16.04.6
LTS) for 'R CMD build scruff' with different versions of Rsubread (all these 
timings use the same R-devel version and the latest master branch of scruff):

                 Rsubread             |
  commit    date         version      | time R CMD build scruff
  ------------------------------------+------------------------
  3eb25843  Jan  7 2020  2.1.1        |   49 min 20 sec
  6438d4a9  Dec 17 2019  2.1.1        | > 10 min (*)
  3f716691  Dec 16 2019  2.1.1        | > 10 min (*)
  b3602eb6  Dec 10 2019  2.1.1        | > 10 min (*)
  ef5e4c21  Dec  9 2019  2.1.1        |   45 min 33 sec
  d0b1acfb  Oct 29 2019  2.1.0        |    3 min 47 sec
  ------------------------------------+------------------------
  (*) command was interrupted before completion

So the culprit seems to be commit ef5e4c21d:

hpages@spectre:~/Rsubread$ git log ef5e4c21d -n 1
commit ef5e4c21d5a6d633aec0b2922dd9f230fed23463
Author: Yang Liao 
Date:   Mon Dec 9 13:42:08 2019 +1100

Updated Ambient-RNA detection algorithm

Hope this helps,
H.


On 1/8/20 10:30, Pages, Herve wrote:

Hi Zhe,

The code in the vignette seems to rely a lot on the Rsubread package
to perform some very computationally intensive operations. Could it be
that some recent changes to the Rsubread package are somehow related
to the sudden slowdown of these operations? That's something I suggest
you investigate.

Cheers,
H.


On 1/8/20 07:50, Wang, Zhe wrote:

Hi all,

I am the maintainer of package scruff. Recently I encountered a timeout error 
in creating vignette in the development branch of the package. I tried 
increasing the number of cores for parallelization from 2 to 3 but it still 
timed out. I did not make any changes to the R codes in the vignette before 
this error happened. I tested building the vignette locally and it compiles 
without error. Does anyone know how I should fix this? Should I comment out 
some of the command in the vignette to reduce computing time? Should I try 
increasing the core number a bit more?

Thanks,
Zhe



Re: [Bioc-devel] how can I declare that a package doesn't/can't fully support Windows

2018-10-12 Thread Hervé Pagès


I'm not too familiar with Rsamtools::pileup() but I know it does
handle indels. I also remember that last time I tried it was very
fast, very flexible, and has a very extensive man page.

H.

On 10/12/2018 09:09 AM, Tim Triche, Jr. wrote:
can it handle indels reasonably well?  It might be a reasonable 
substitute, especially if it can spit out a VRanges somehow


--t


On Thu, Oct 11, 2018 at 10:14 AM Hervé Pagès <hpa...@fredhutch.org> wrote:


Sounds good.

I should also mention that Rsamtools has pileup() that is available
on all platforms. Don't know how easy it would be to use to achieve
the kind of variant calling you're doing in MTseeker though...

Cheers,
H.

On 10/11/2018 04:35 AM, Tim Triche, Jr. wrote:
 > This makes sense. Windows users won’t be easily able to call
 > variants across thousands of samples but at least the plotting,
 > impact prediction, etc will work fine for them.
 >
 > I will need to define things such that the variant calling is
 > optional, which is not too absurd: I’ll add loading of
 > MVRanges/MVRangesList objects from VCFs and move gmapR to suggests.
 >
 > Personally I use the variant calling functionality regularly, but
 > I have my doubts as to whether someone on Windows would even have
 > enough RAM to call variants across 2000+ samples in a shot, so this
 > is a good compromise.  Everything else is, I believe, platform
 > agnostic. This should work.
 >
 > --t
 >
 >> On Oct 11, 2018, at 1:41 AM, Hervé Pagès <hpa...@fredhutch.org> wrote:
 >>
 >> And of course: the whole trick I described below only makes sense
 >> if MTseeker doesn't rely on gmapR for its core functionality, that
 >> is, if not having gmapR installed still allows the user to accomplish
 >> something meaningful with MTseeker.
 >>
 >> Otherwise the trick below will make MTseeker available on Windows
 >> but Windows users won't be able to accomplish anything meaningful
 >> with it. In that case, marking the package as unsupported on Windows
 >> would be preferable.
 >>
 >> Hope this makes sense,
 >>
 >> H.
 >>
 >>> On 10/10/2018 10:26 PM, Hervé Pagès wrote:
 >>> Hi Tim,
 >>> No platform-specific dontrun capabilities AFAIK but you can use
 >>> something like:
 >>>    if (requireNamespace("gmapR", quietly=TRUE)) {
 >>>        ...
 >>>        ...
 >>>    }
 >>> in your man pages.
 >>> You would also need to move gmapR from Imports to Suggests.
 >>> Then make sure that MTseeker passes 'R CMD check' **without**
 >>> the gmapR package being installed. You'll need to define and set
 >>> environment variable _R_CHECK_FORCE_SUGGESTS_ to 0 for this.
 >>> Do it with:
 >>>    export _R_CHECK_FORCE_SUGGESTS_=0
 >>> on Linux or Mac, or with:
 >>>    set _R_CHECK_FORCE_SUGGESTS_=0
 >>> on Windows.
 >>> Once MTseeker is accepted, we'll add a .BBSoptions file with special
 >>> directive:
 >>>    CHECKprepend.win: set _R_CHECK_FORCE_SUGGESTS_=0&&
 >>> This will have the effect to set the environment variable on the
 >>> Windows build machines before running 'R CMD check' there.
 >>> So MTseeker will be supported and available on all platforms.
 >>> For MTseekerData: the package doesn't seem to make any use of gmapR
 >>> so you can probably remove gmapR from its Suggests field.
 >>> Hope this helps,
 >>> H.
 >>>> On 10/10/2018 07:46 AM, Tim Triche, Jr. wrote:
 >>>> it looks like gmapR does not support Windows, and as a result, my MTseeker
 >>>> package cannot build on tokay1, so the Data package which requires it also
 >>>> cannot build on tokay1.  Are there platform-specific dontrun capabilities?
 >>>>
 >>>> http://bioconductor.org/spb_reports/MTseekerData_buildreport_20181010103212.html
 >>>>
 >>>> Short of somehow forcing gmapR to build on Windows, which I believe is
 >>>> beyond my control, is there a way to declare that parts of the MTseeker
 >>>> package are unsupported/unsupportable on Windows?
 >>>>
 >>>> I suppose I could cleave off

Re: [Bioc-devel] Bioconductor package build fails

2018-10-11 Thread Hervé Pagès

rectly on Bioconductor's Windows and OS X machines but fails 
on Ubuntu:

http://bioconductor.org/checkResults/release/bioc-LATEST/variancePartition/malbec2-buildsrc.html

The code in the package has been stable, and I am not able to reproduce the 
error on my Ubuntu 14.04.5 LTS machine with R 3.5.1.

The issue has something to do with multithreading using doParallel.  The error 
is:

Warning in socketConnection("localhost", port = port, server = TRUE, blocking = 
TRUE,  :
   port 11104 cannot be opened
Quitting from lines 102-154 (variancePartition.Rnw)
Error: processing vignette 'variancePartition.Rnw' failed with diagnostics:
cannot open the connection
Execution halted

With the Bioconductor 3.7 Freeze coming, I'm concerned about getting stuck with 
a broken package.

Thanks,
  - Gabriel


Re: [Bioc-devel] how can I declare that a package doesn't/can't fully support Windows

2018-10-11 Thread Hervé Pagès


Sounds good.

I should also mention that Rsamtools has pileup() that is available
on all platforms. Don't know how easy it would be to use to achieve
the kind of variant calling you're doing in MTseeker though...

Cheers,
H.

On 10/11/2018 04:35 AM, Tim Triche, Jr. wrote:

This makes sense. Windows users won’t be easily able to call variants across 
thousands of samples but at least the plotting, impact prediction, etc will 
work fine for them.

I will need to define things such that the variant calling is optional, which 
is not too absurd — I’ll add loading of MVRanges/MVRangesList objects from VCFs 
and move gmapR to suggests.

Personally I use the variant calling functionality regularly, but I have my 
doubts as to whether someone on Windows would even have enough RAM to call 
variants across 2000+ samples in a shot, so this is a good compromise.  
Everything else is, I believe, platform agnostic. This should work.

--t


On Oct 11, 2018, at 1:41 AM, Hervé Pagès  wrote:

And of course: the whole trick I described below only makes sense
if MTseeker doesn't rely on gmapR for its core functionality, that
is, if not having gmapR installed still allows the user to accomplish
something meaningful with MTseeker.

Otherwise the trick below will make MTseeker available on Windows
but Windows users won't be able to accomplish anything meaningful
with it. In that case, marking the package as unsupported on Windows
would be preferable.

Hope this makes sense,

H.


On 10/10/2018 10:26 PM, Hervé Pagès wrote:
Hi Tim,
No platform-specific dontrun capabilities AFAIK but you can use
something like:
   if (requireNamespace("gmapR", quietly=TRUE)) {
   ...
   ...
   }
in your man pages.
You would also need to move gmapR from Imports to Suggests.
Then make sure that MTseeker passes 'R CMD check' **without**
the gmapR package being installed. You'll need to define and set
environment variable _R_CHECK_FORCE_SUGGESTS_ to 0 for this.
Do it with:
   export _R_CHECK_FORCE_SUGGESTS_=0
on Linux or Mac, or with:
   set _R_CHECK_FORCE_SUGGESTS_=0
on Windows.
Once MTseeker is accepted, we'll add a .BBSoptions file with special
directive:
   CHECKprepend.win: set _R_CHECK_FORCE_SUGGESTS_=0&&
This will have the effect to set the environment variable on the
Windows build machines before running 'R CMD check' there.
So MTseeker will be supported and available on all platforms.
For MTseekerData: the package doesn't seem to make any use of gmapR
so you can probably remove gmapR from its Suggests field.
Hope this helps,
H.

On 10/10/2018 07:46 AM, Tim Triche, Jr. wrote:
it looks like gmapR does not support Windows, and as a result, my MTseeker
package cannot build on tokay1, so the Data package which requires it also
cannot build on tokay1.  Are there platform-specific dontrun capabilities?

http://bioconductor.org/spb_reports/MTseekerData_buildreport_20181010103212.html

Short of somehow forcing gmapR to build on Windows, which I believe is
beyond my control, is there a way to declare that parts of the MTseeker
package are unsupported/unsupportable on Windows?

I suppose I could cleave off the variant-recalling portions but that seems
a little ridiculous. The original goal was to take the non-NuMT reads from
a given alignment, realign (only) those to rCRS/RSRS, and call against
that, for better mitochondrial haplogroup inference. We're still working
towards the full version, but even just calling variants against rCRS with
indels is hugely useful, and the ability to screen out haplogroup-specific
variants while retaining indels, SNVs, etc. turns out to be VERY handy.
More generally, there isn't any equivalent (AFAIK) in BioC, at all.

--t


Re: [Bioc-devel] how can I declare that a package doesn't/can't fully support Windows

2018-10-10 Thread Hervé Pagès


And of course: the whole trick I described below only makes sense
if MTseeker doesn't rely on gmapR for its core functionality, that
is, if not having gmapR installed still allows the user to accomplish
something meaningful with MTseeker.

Otherwise the trick below will make MTseeker available on Windows
but Windows users won't be able to accomplish anything meaningful
with it. In that case, marking the package as unsupported on Windows
would be preferable.

Hope this makes sense,

H.

On 10/10/2018 10:26 PM, Hervé Pagès wrote:

Hi Tim,

No platform-specific dontrun capabilities AFAIK but you can use
something like:

   if (requireNamespace("gmapR", quietly=TRUE)) {
   ...
   ...
   }

in your man pages.

You would also need to move gmapR from Imports to Suggests.

Then make sure that MTseeker passes 'R CMD check' **without**
the gmapR package being installed. You'll need to define and set
environment variable _R_CHECK_FORCE_SUGGESTS_ to 0 for this.
Do it with:

   export _R_CHECK_FORCE_SUGGESTS_=0

on Linux or Mac, or with:

   set _R_CHECK_FORCE_SUGGESTS_=0

on Windows.

Once MTseeker is accepted, we'll add a .BBSoptions file with special
directive:

   CHECKprepend.win: set _R_CHECK_FORCE_SUGGESTS_=0&&

This will have the effect to set the environment variable on the
Windows build machines before running 'R CMD check' there.
So MTseeker will be supported and available on all platforms.

For MTseekerData: the package doesn't seem to make any use of gmapR
so you can probably remove gmapR from its Suggests field.

Hope this helps,
H.


On 10/10/2018 07:46 AM, Tim Triche, Jr. wrote:
it looks like gmapR does not support Windows, and as a result, my MTseeker
package cannot build on tokay1, so the Data package which requires it also
cannot build on tokay1.  Are there platform-specific dontrun capabilities?

http://bioconductor.org/spb_reports/MTseekerData_buildreport_20181010103212.html

Short of somehow forcing gmapR to build on Windows, which I believe is
beyond my control, is there a way to declare that parts of the MTseeker
package are unsupported/unsupportable on Windows?

I suppose I could cleave off the variant-recalling portions but that seems
a little ridiculous. The original goal was to take the non-NuMT reads from
a given alignment, realign (only) those to rCRS/RSRS, and call against
that, for better mitochondrial haplogroup inference. We're still working
towards the full version, but even just calling variants against rCRS with
indels is hugely useful, and the ability to screen out haplogroup-specific
variants while retaining indels, SNVs, etc. turns out to be VERY handy.
More generally, there isn't any equivalent (AFAIK) in BioC, at all.

--t


Re: [Bioc-devel] how can I declare that a package doesn't/can't fully support Windows

2018-10-10 Thread Hervé Pagès


Hi Tim,

No platform-specific dontrun capabilities AFAIK but you can use
something like:

  if (requireNamespace("gmapR", quietly=TRUE)) {
  ...
  ...
  }

in your man pages.

You would also need to move gmapR from Imports to Suggests.
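
The same guard can be used inside package functions once gmapR is only in Suggests. Here is a minimal sketch, assuming a helper called run_if_installed() (a made-up name, not something MTseeker or gmapR provides) that runs a piece of code only when the optional package is available:

```r
# Hypothetical helper: run `fun` only if the optional package `pkg` is
# installed, otherwise emit a message and return NULL invisibly.
run_if_installed <- function(pkg, fun) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; skipping.")
    return(invisible(NULL))
  }
  fun()
}

# 'stats' ships with R, so the function is called.
run_if_installed("stats", function() stats::median(c(1, 5, 9)))

# A package that does not exist: the function is never called.
run_if_installed("notARealPackage12345", function() stop("never reached"))
```

In the package itself the guarded code would call the optional package with the gmapR:: prefix, so that nothing from it is needed at load time.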

Then make sure that MTseeker passes 'R CMD check' **without**
the gmapR package being installed. You'll need to define and set
environment variable _R_CHECK_FORCE_SUGGESTS_ to 0 for this.
Do it with:

  export _R_CHECK_FORCE_SUGGESTS_=0

on Linux or Mac, or with:

  set _R_CHECK_FORCE_SUGGESTS_=0

on Windows.
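
If you launch the check from an R session rather than from a shell, the variable can also be set with Sys.setenv(), which works the same way on all platforms. This is just a sketch; the tarball name in the commented-out line is a placeholder, not a real file.

```r
# Set the variable for this R session; processes started from the session
# afterwards inherit it.
Sys.setenv("_R_CHECK_FORCE_SUGGESTS_" = "0")
Sys.getenv("_R_CHECK_FORCE_SUGGESTS_")

# Placeholder tarball name; uncomment to run the check from this session.
# system2("R", c("CMD", "check", "MTseeker_x.y.z.tar.gz"))
```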

Once MTseeker is accepted, we'll add a .BBSoptions file with special
directive:

  CHECKprepend.win: set _R_CHECK_FORCE_SUGGESTS_=0&&

This will have the effect to set the environment variable on the
Windows build machines before running 'R CMD check' there.
So MTseeker will be supported and available on all platforms.

For MTseekerData: the package doesn't seem to make any use of gmapR
so you can probably remove gmapR from its Suggests field.

Hope this helps,
H.


On 10/10/2018 07:46 AM, Tim Triche, Jr. wrote:

it looks like gmapR does not support Windows, and as a result, my MTseeker
package cannot build on tokay1, so the Data package which requires it also
cannot build on tokay1.  Are there platform-specific dontrun capabilities?

http://bioconductor.org/spb_reports/MTseekerData_buildreport_20181010103212.html

Short of somehow forcing gmapR to build on Windows, which I believe is
beyond my control, is there a way to declare that parts of the MTseeker
package are unsupported/unsupportable on Windows?

I suppose I could cleave off the variant-recalling portions but that seems
a little ridiculous. The original goal was to take the non-NuMT reads from
a given alignment, realign (only) those to rCRS/RSRS, and call against
that, for better mitochondrial haplogroup inference. We're still working
towards the full version, but even just calling variants against rCRS with
indels is hugely useful, and the ability to screen out haplogroup-specific
variants while retaining indels, SNVs, etc. turns out to be VERY handy.
More generally, there isn't any equivalent (AFAIK) in BioC, at all.

--t


Re: [Bioc-devel] Problem with setClassUnion and DelayedArray

2018-10-10 Thread Hervé Pagès


Hi Elizabeth,

I agree that the setClassUnion() warning is rather esoteric, especially
the "consider setClassUnion()" part.

  library(DelayedArray)
  setClassUnion("matrixOrHDF5", c("matrix", "DelayedArray"))
  # Warning message:
  # subclass "DelayedArray1" of class "DelayedArray" is not local and
  # cannot be updated for new inheritance information; consider
  # setClassUnion()
Furthermore, showClass("DelayedArray1") reports the correct inheritance
information:

  showClass("DelayedArray1")
  # Class "DelayedArray1" [package "DelayedArray"]
  #
  # Slots:
  #
  # Name:        index delayed_ops        seed
  # Class:        list        list         ANY
  #
  # Extends:
  # Class "DelayedArray", directly
  # Class "DelayedUnaryIsoOp", by class "DelayedArray", distance 2
  # Class "matrixOrHDF5", by class "DelayedArray", distance 2
  # Class "DelayedUnaryOp", by class "DelayedArray", distance 3
  # Class "DelayedOp", by class "DelayedArray", distance 4
  # Class "Array", by class "DelayedArray", distance 5

As well as extends():

  extends("DelayedArray1")
  # [1] "DelayedArray1"     "DelayedArray"      "DelayedUnaryIsoOp"
  # [4] "matrixOrHDF5"      "DelayedUnaryOp"    "DelayedOp"
  # [7] "Array"

  extends("DelayedArray", "matrixOrHDF5")
  # [1] TRUE

  extends("DelayedArray1", "DelayedArray")
  # [1] TRUE

  extends("DelayedArray1", "matrixOrHDF5")
  # [1] TRUE

So it might just be a spurious warning :-/

Anyway, I've exported the DelayedArray1 class in DelayedArray 0.7.48:


https://github.com/Bioconductor/DelayedArray/commit/26061a9b28b87b8a3ee26b8b81ff3334b55115c1

No more warning with this version:

  library(DelayedArray)
  setClassUnion("matrixOrHDF5", c("matrix", "DelayedArray"))
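
For anyone who wants to try the same inheritance checks without DelayedArray installed, here is a small sketch with toy classes. ToyArray, ToyArray1 and matrixOrToy are made-up names standing in for DelayedArray, DelayedArray1 and matrixOrHDF5; since all the classes are defined locally, this does not reproduce the warning itself, only the checks.

```r
library(methods)

# Toy stand-ins (hypothetical names) for DelayedArray and its subclass.
setClass("ToyArray", representation(seed = "ANY"))
setClass("ToyArray1", contains = "ToyArray")

# Union of a basic class and the toy class, as in the report.
setClassUnion("matrixOrToy", c("matrix", "ToyArray"))

# The member class and its subclass both extend the union.
extends("ToyArray", "matrixOrToy")
extends("ToyArray1", "matrixOrToy")

# Instances are recognized as belonging to the union.
is(matrix(1:4, 2), "matrixOrToy")
is(new("ToyArray1", seed = 1:3), "matrixOrToy")
```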

Cheers,
H.


On 10/10/2018 11:51 AM, Elizabeth Purdom wrote:

Hello,

I am using `setClassUnion` in my package `clusterExperiment` in the following 
code to allow for either matrix or DelayedArray:

setClassUnion("matrixOrHDF5",members=c("matrix", "DelayedArray"))

This causes the following warning in checking my package:

Warning: subclass "DelayedArray1" of class "DelayedArray" is not local and 
cannot be updated for new inheritance information; consider setClassUnion()

I’ve gotten this warning in other settings, and I believe it is due to the fact 
that setClassUnion works on the subclasses of the members, so if you give the 
argument `members=c(“X”,”Y”)` and you haven’t imported into your package all of 
the subclasses of “X” and “Y” it is warning you those non-imported classes 
haven’t been dealt with (though if so, I’d say the warning is awfully cryptic, 
especially since it says to use `setClassUnion` as a solution). In my other 
cases, I have just gone ahead and imported all of the subclasses from the 
package that defines the member classes and have gotten rid of the message (in 
the past it hasn’t been so many). But “DelayedArray1” is not an exported class 
of the DelayedArray package so that is not an option here.

I have been just ignoring this warning, since I understand (I think) the 
warning, I can’t do anything about it, and am not concerned about it since this 
new class is only used internally by my function for the slot definition. And I 
don’t think the user sees this generally. But given that we’re coming up on a 
release I thought I would ask if there’s anything I can do to get rid of this 
warning! Or can I go with my first instinct and safely ignore it?

Thanks,
Elizabeth Purdom


Re: [Bioc-devel] a pattern to be avoided? mcols(x)$y <- z

2018-10-03 Thread Hervé Pagès


Hi Vince,

This issue was reported here a couple of weeks ago:

  https://github.com/Bioconductor/GenomicRanges/issues/11

Internally $<- uses something like:

  do.call(DataFrame, list(DF1, DF2))

to combine the metadata columns. However in some situations
the do.call(DataFrame, list(...)) form is **very** inefficient
compared to the more direct DataFrame(...) form:

  library(S4Vectors)
  DF1 <- DataFrame(a=Rle(11:1999, 1011:2999), b=5)
  DF2 <- DataFrame(c=Rle(12:2000, 1011:2999))
  system.time(DF12 <- do.call(DataFrame, list(DF1, DF2)))
  #   user  system elapsed
  #  4.476   0.000   4.476
  system.time(DF12b <- DataFrame(DF1, DF2))
  #   user  system elapsed
  #  0.002   0.000   0.001
  identical(DF12, DF12b)
  # [1] TRUE

@Michael: Any idea what's going on?

Thanks,
H.


On 10/03/2018 07:01 AM, Vincent Carey wrote:

The following comes up in use of Fdb.InfiniumMethylation.hg19::getPlatform


debug: mcols(GR)$channel <- Rle(as.factor(mcols(GR)$channel450))

Browse[3]> system.time(uu <- Rle(as.factor(mcols(GR)$channel450)))
   user  system elapsed
  0.020   0.003   0.022

Browse[3]> system.time(mcols(GR)$channel <- Rle(as.factor(mcols(GR)$channel450)))
   user  system elapsed
 47.263   0.067  47.373

Browse[3]> GR$channel[1]
factor-Rle of length 1 with 1 run
  Lengths:    1
  Values : Both
Levels(3): Both Grn Red

Browse[3]> system.time(GR$channel <- Rle(as.factor(mcols(GR)$channel450)))
   user  system elapsed
  0.058   0.006   0.065


Presumably the mcols()$<- copies/rewrites a lot of data needlessly?




Re: [Bioc-devel] GenomeInfoDB SeqInfo function error

2018-09-27 Thread Hervé Pagès

Hi Dario,

On 09/13/2018 09:18 AM, Dario Righelli wrote:

Hello everyone,

In the DEScan2 package I'm using the GenomeInfoDb::Seqinfo function with
genome="mm10".

Sometimes it fails with this error message:

"cannot open the connection to
'ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/635/GCF_01635.20_GRCm38/GCF_01635.20_GRCm38_assembly_report.txt'"

even though the file is reachable.

I cannot reproduce this, not too surprisingly...

This kind of intermittent internet access problem is not uncommon
and typically hard to reproduce. GenomeInfoDb::Seqinfo() was trying
to download a file from ftp.ncbi.nlm.nih.gov and failed for some
reason. It could be because NCBI's FTP site was temporarily unavailable
or because of any other network problem between NCBI and the machine
where GenomeInfoDb::Seqinfo() was called. Unfortunately there is not
much we can do about these transient connectivity issues in general.

However we can mitigate them:

- One way would be to use a caching mechanism, e.g. to use
BiocFileCache to store the data downloaded by
GenomeInfoDb::Seqinfo(genome="some_genome") locally the 1st time
it's downloaded for a particular genome.

- Another way would be to have this data already included in
GenomeInfoDb (or GenomeInfoDbData) for the most frequently used
genomes. In addition, the caching mechanism could still be used
for the other genomes.

- Yet another way would be to have
GenomeInfoDb::Seqinfo(genome="some_genome") re-try the download
a couple of times (after waiting 1 or 2 sec before re-trying)
before giving up. This could be done in combination with the
above features. The re-try feature could even be integrated into
BiocFileCache.
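
To make the re-try idea a bit more concrete, here is a minimal sketch in
base R. Note that with_retries() and flaky_fetch() are hypothetical names
made up for illustration; nothing like this exists in GenomeInfoDb or
BiocFileCache as far as this thread is concerned, and the flaky function
only simulates a download that fails twice before succeeding.

```r
# Hypothetical helper (not part of GenomeInfoDb or BiocFileCache):
# call `fetch()` up to `max_tries` times, pausing between attempts.
with_retries <- function(fetch, max_tries = 3, wait_sec = 1) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(result, "error"))
      return(result)            # success: stop re-trying
    if (attempt < max_tries)
      Sys.sleep(wait_sec)       # wait a bit before the next attempt
  }
  stop("all ", max_tries, " attempts failed; last error: ",
       conditionMessage(result))
}

# Stand-in for a flaky download: fails twice, then succeeds.
n_calls <- 0
flaky_fetch <- function() {
  n_calls <<- n_calls + 1
  if (n_calls < 3) stop("cannot open the connection")
  "assembly report"
}

report <- with_retries(flaky_fetch, max_tries = 3, wait_sec = 0)
```

In a package one would presumably wrap the real call instead, along the
lines of with_retries(function() GenomeInfoDb::Seqinfo(genome="mm10")),
possibly combined with a local cache so that the download only happens once.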

For now, though, my feeling is that this issue is not enough of an
annoyance to justify putting these new developments high on
the TODO list.

Just throwing some random thoughts here. Don't know what others
think about this.

I noticed it because I received an ERROR report from the bioconductor test bot.
I have a unit test for my package that doesn't pass on linux, but it works on
other machines.

Looking on the Internet, this seems like an old (solved) problem.

Would you mind sharing a link to this information? Thanks!

Cheers,
H.

What do you suggest to do?

thanks,
dario


[Rd] Bug in printing array of type "list"

2018-09-26 Thread Hervé Pagès


Hi,

This array is of type "list" but print() reports otherwise:

  a1 <- array(list(1), 2:0)

  typeof(a1)
  # [1] "list"

  a1
  # <2 x 1 x 0 array of character>
  #  [,1]
  # [1,]
  # [2,]

No such problem with an array of type "logical":

  a2 <- array(NA, 2:0)

  typeof(a2)
  # [1] "logical"

  a2
  # <2 x 1 x 0 array of logical>
  #  [,1]
  # [1,]
  # [2,]

Thanks,
H.


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] as.vector() broken on a matrix or array of type "list"

2018-09-26 Thread Hervé Pagès

Hi Martin,

On 09/26/2018 12:41 AM, Martin Maechler wrote:

Hervé Pagès
 on Tue, 25 Sep 2018 23:27:19 -0700 writes:

 > Hi, Unlike on an atomic matrix, as.vector() doesn't drop
 > the "dim" attribute of matrix or array of type "list":

m <- matrix(list(), nrow=2, ncol=3)
m
#  [,1] [,2] [,3]
# [1,] NULL NULL NULL
# [2,] NULL NULL NULL

as.vector(m)
#  [,1] [,2] [,3]
# [1,] NULL NULL NULL
# [2,] NULL NULL NULL

as documented and as always, including (probably all) versions of S and S-plus.

is.vector(as.vector(m))
# [1] FALSE

as bad as that looks, that's also "known" and has been the case
forever as well...

I agree that the semantics of as.vector(.)  are not what you
would expect, and probably neither what we would do when
creating R today. *)
The help page {the same for as.vector() and is.vector()}
mentions that as.vector() behavior more than once, notably at
the end of 'Details' and its 'Note's
... with one exception where you have a strong point, and the documentation
is incomplete at least -- under the heading

  Methods for 'as.vector()':

... follow the conventions of the default method.  In particular

...
...
...

• ‘is.vector(as.vector(x, m), m)’ should be true for any mode ‘m’,
   including the default ‘"any"’.

and you are right that this is not fulfilled in the case the
list has a 'dim' attribute.

But I don't think we "can" change as.vector(.) for that case
(where it is a no-op).
Rather  possibly is.vector(.) should not return FALSE but TRUE -- with
the reasoning (I think most experienced R programmers would
agree) that the foremost property of 'm' is to be
  - a list() {with a dim attribute and matrix-like indexing possibility}
rather than
  - a 'matrix' {where every matrix entry is a list()}.

Note that this change would break all the code around that uses
is.vector() to distinguish between an array (of mode "atomic" or
"list") and a non-array. Arguably is.array() should preferably be
used for that but I'm sure there is a lot of code around that uses
is.vector().

The root of the problem is that as.vector() doesn't drop attributes
that is.vector() sees as "vector breakers" i.e. as breaking the vector
nature of an object. So for example is.vector() considers the "dim"
attribute to be a vector breaker but as.vector() doesn't drop it.

So yes in order to bring is.vector() and as.vector() in agreement you
can either change one or the other, or both. My gut feeling though is
that it would be less disruptive to not change what is.vector() thinks
about the "dim" attribute and to make sure that as.vector() **always**
drops it (together with "dimnames" if present). How much code around
could there be that calls as.vector() on an array and expects the "dim"
attribute to be dropped **except** when the mode() of the array is
"list"? It is more likely that the code around that calls as.vector()
on an array doesn't expect such exception and so is broken. This was
actually the case for my code ;-)
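
In the meantime, code that needs a dimensionless result regardless of the
mode of the array can drop the "dim" attribute explicitly. A minimal
sketch (strip_dim() is a hypothetical helper name, not an existing base R
function):

```r
# Hypothetical helper: always return a dimensionless vector or list.
strip_dim <- function(x) {
  dim(x) <- NULL   # setting "dim" to NULL also removes "dimnames"
  x
}

m <- matrix(vector("list", 6), nrow = 2, ncol = 3)  # a 2 x 3 list-matrix
v <- strip_dim(m)

is.null(dim(v))   # no "dim" attribute left
is.vector(v)      # v is now a plain list of length 6
```

The same helper leaves an atomic matrix as the plain atomic vector that
as.vector() would return, so it can be used uniformly on both kinds of
arrays.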

Thanks,
H.

At the moment my gut feeling would propose to only update the
documentation, adding that one case as "an exception for historic reasons".

Martin

-
*) {Possibly such an R we would create today would be much closer to
 julia, where every function is generic / a multi-dispatch method
 "a la S4"  and still be blazingly fast, thanks to JIT
 compilation, method caching and more smart things.}
But as you know one of the strengths of (base) R is its stability
and reliability.  You can only use something as "the language
of applied statistics and data science" and rely on published
code still working 10 years later if the language is not
changed/redesigned from scratch every few years ((as some ... are)).


[Rd] as.vector() broken on a matrix or array of type "list"

2018-09-26 Thread Hervé Pagès


Hi,

Unlike on an atomic matrix, as.vector() doesn't drop the "dim"
attribute of matrix or array of type "list":

  m <- matrix(list(), nrow=2, ncol=3)
  m
  #  [,1] [,2] [,3]
  # [1,] NULL NULL NULL
  # [2,] NULL NULL NULL

  as.vector(m)
  #  [,1] [,2] [,3]
  # [1,] NULL NULL NULL
  # [2,] NULL NULL NULL

  is.vector(as.vector(m))
  # [1] FALSE

Thanks,
H.


Re: [Rd] Bias in R's random integers?

2018-09-20 Thread Hervé Pagès


Hi,

Note that it wouldn't be the first time that sample() changes behavior
in a non-backward compatible way:

  https://stat.ethz.ch/pipermail/r-devel/2012-October/065049.html

Cheers,
H.


On 09/20/2018 08:15 AM, Duncan Murdoch wrote:

On 20/09/2018 6:59 AM, Ralf Stubner wrote:

On 9/20/18 1:43 AM, Carl Boettiger wrote:
For a well-tested C algorithm, based on my reading of Lemire, the unbiased
"algorithm 3" in https://arxiv.org/abs/1805.10941 is already part of the C
standard library in OpenBSD and macOS (as arc4random_uniform), and in the
GNU standard library.  Lemire also provides C++ code in the appendix of his
piece for both this and the faster "nearly divisionless" algorithm.

It would be excellent if any R core members were interested in considering
bindings to these algorithms as a patch, or might express expectations for
how that patch would have to operate (e.g. re Duncan's comment about
non-integer arguments to sample size).  Otherwise, an R package binding
seems like a good starting point, but I'm not the right volunteer.

It is difficult to do this in a package, since R does not provide access
to the random bits generated by the RNG. Only a float in (0,1) is
available via unif_rand(). 


I believe it is safe to multiply the unif_rand() value by 2^32, and take 
the whole number part as an unsigned 32 bit integer.  Depending on the 
RNG in use, that will give at least 25 random bits.  (The low order bits 
are the questionable ones.  25 is just a guess, not a guarantee.)


However, if one is willing to use an external RNG, it is of course
possible. After reading about Lemire's work [1], I had planned to
integrate such an unbiased sampling scheme into the dqrng package,
which I have now started. [2]

Using Duncan's example, the results look much better:


library(dqrng)
m <- (2/5)*2^32
y <- dqsample(m, 1e6, replace = TRUE)
table(y %% 2)

     0      1
500252 499748


Another useful diagnostic is

   plot(density(y[y %% 2 == 0]))

Obviously that should give a more or less uniform density, but for 
values near m, the default sample() gives some nice pretty pictures of 
quite non-uniform densities.


By the way, there are actually quite a few examples of very large m 
besides m = (2/5)*2^32 where performance of sample() is noticeably bad. 
You'll see problems in y %% 2 for any integer a > 1 with m = 2/(1 + 2a) 
* 2^32, problems in y %% 3 for m = 3/(1 + 3a)*2^32 or m = 3/(2 + 
3a)*2^32, etc.
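
A toy illustration of where this kind of bias comes from, under a
deliberately simplified assumption: pretend the generator can only return
2^3 = 8 equally likely values (instead of roughly 2^32) and map them to
the range 0..m-1 with floor(u * m). This is only a sketch of the
mechanism, not the actual code of sample(), and the variable names are
made up for the example.

```r
bits <- 3                       # toy generator: 2^3 = 8 equally likely values
m <- 3                          # target range is 0, 1, 2
u <- (0:(2^bits - 1)) / 2^bits  # every value the toy generator can return

# How many of the 8 generator values land on each of 0, 1, 2?
hits <- tabulate(floor(u * m) + 1, nbins = m)
hits
# [1] 3 3 2
```

Because 8 is not a multiple of 3, the three outcomes cannot all receive
the same number of generator values, so one outcome is necessarily less
likely than the others. With 32 bits the imbalance is far smaller but it
becomes visible again once m is a sizeable fraction of 2^32.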


So perhaps I'm starting to be convinced that the default sample() should 
be fixed.


Duncan Murdoch




Currently I am taking the other interpretation of "truncated":


table(dqsample(2.5, 1e6, replace = TRUE))

     0      1
499894 500106

I will adjust this to whatever is decided for base R.


However, there is currently neither long vector nor weighted sampling
support. And the performance without replacement is quite bad compared
to R's algorithm with hashing.

cheerio
ralf

[1] via http://www.pcg-random.org/posts/bounded-rands.html

[2] https://github.com/daqana/dqrng/tree/feature/sample






Re: [Bioc-devel] Errors in Building AssessORF

2018-09-10 Thread Hervé Pagès

Hi Deepank,

Just to clarify, this is about a new submission and the build
you're referring to is:

http://bioconductor.org/spb_reports/AssessORF_buildreport_20180910173835.html

Note that I can reproduce this error on my laptop with:

  git clone https://github.com/DRK248/AssessORF
  git clone https://github.com/DRK248/AssessORFData
  R CMD INSTALL AssessORF AssessORFData
  R CMD build AssessORF
  R CMD check AssessORF_0.99.9.tar.gz

The last command (R CMD check) produces:

  ...
  * checking examples ... ERROR
  Running examples in ‘AssessORF-Ex.R’ failed
  The error most likely occurred in:

  > ### Name: ScoreAssesmentResults
  > ### Title: Score Gene Assessment Results
  > ### Aliases: ScoreAssesmentResults
  >
  > ### ** Examples
  >
  >
  > ScoreAssessmentResults(readRDS(system.file("extdata", "ATCC17978_PreSaved_ResultsObj_Prodigal.rds", package = "AssessORF")), "a")

  Warning in gzfile(file, "rb") :
    cannot open compressed file '', probable reason 'No such file or directory'

  Error in gzfile(file, "rb") : cannot open the connection
  Calls: ScoreAssessmentResults -> readRDS -> gzfile
  Execution halted
  ...

I cloned the 2 packages less than 30 min ago so I have the most current
versions. Also since this was the first time I cloned and installed
AssessORF and AssessORFData on my machine, I shouldn't have any stale
files or stale versions of the packages. So this error actually looks
like a true positive to me and not an error due to some caching or other
dysfunction of the SPB (Single Package Builder).

The problem is that the package contains an example (in the
ScoreAssesmentResults.Rd man page) that tries to read the
inst/extdata/ATCC17978_PreSaved_ResultsObj_Prodigal.rds
file. But the inst/extdata/ folder doesn't contain such a file:

  hpages@spectre:~/AssessORF$ ls inst/extdata/
  Adenoviridae.sqlite
  MGAS5005_PreSaved_ResultsObj_Prodigal.rds
  MGAS5005_PreSaved_DataMapObj.rds
  MGAS5005_Prodigal.sco

You should be able to reproduce this error on your machine too by
using the standard 'R CMD build' + 'R CMD check' sequence.
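
Note that the empty file name ('') in the warning above is what
system.file() returns when it cannot find the requested file. A small
sketch of that behavior using the stats package as a stand-in (the file
name is made up and is assumed not to exist there):

```r
# system.file() silently returns "" when the file is not found...
missing_path <- system.file("extdata", "no_such_file.rds", package = "stats")
missing_path == ""

# ...unless mustWork=TRUE is used, in which case it raises an error
# right away instead of passing "" on to readRDS().
outcome <- tryCatch(
  system.file("extdata", "no_such_file.rds", package = "stats",
              mustWork = TRUE),
  error = function(e) "error raised"
)
outcome
```

Using mustWork=TRUE in examples and vignettes tends to give a more
informative failure than the "cannot open the connection" error.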

Finally note that AssessORFData Suggests AssessORF even though it
looks like it should be the other way around (AssessORF uses
AssessORFData but AssessORFData has no mention of AssessORF and
doesn't seem to make any use of it).

Cheers,
H.

On 09/10/2018 02:57 PM, Deepank Korandla wrote:

Hi,

I'm the developer of the AssessORF package. For the last couple of valid
builds that I pushed (each with an appropriate version bump), I am getting
the same error from the automated package builder involving code that is no
longer part of my package. Specifically, the automated package builder is
trying to run an outdated example for one of the functions in my package
and failing because the name of the function has changed and because the
data files used in the old example are no longer part of my package. I am
not sure how to resolve this issue on my end as I have pushed multiple
valid builds since I started seeing this error and nothing has changed. Can
anyone help me with this?

Thanks,
Deepank Korandla
Wright Lab, University of Pittsburgh


Re: [Bioc-devel] proteoQC

2018-09-07 Thread Hervé Pagès

Care to provide some details? What have you tried? What error did you get?

A must-read document that extensively covers the topic of maintaining
a Bioconductor package via git/GitHub is:

  https://bioconductor.org/developers/how-to/git/

See in particular question 3. in the FAQ.

Hope this helps,
H.

On 09/07/2018 10:25 AM, Bo Wen wrote:

Yes.
Bo

On Fri, Sep 7, 2018 at 12:24 PM Hervé Pagès <hpa...@fredhutch.org> wrote:

Hi Bo,

Are you saying you cannot push changes to your package?

H.

On 09/07/2018 09:55 AM, Bo Wen wrote:
 > Hi,
 > I'm the developer of proteoQC
 > (http://bioconductor.org/packages/release/bioc/html/proteoQC.html).
 > I changed my job so my previous email is not valid. Can anyone help
 > me update my email to wenbos...@gmail.com in the package?
 > Thanks.
 > Bo Wen


Re: [Bioc-devel] proteoQC

2018-09-07 Thread Hervé Pagès


Hi Bo,

Are you saying you cannot push changes to your package?

H.

On 09/07/2018 09:55 AM, Bo Wen wrote:

Hi,
I'm the developer of proteoQC
(http://bioconductor.org/packages/release/bioc/html/proteoQC.html).
I changed my job so my previous email is not valid. Can anyone help me update
my email to wenbos...@gmail.com in the package?
Thanks.
Bo Wen


Re: [Bioc-devel] BSGenome submission Annotation pckg

2018-09-06 Thread Hervé Pagès

Hi Jose,

Thanks for contributing to the project. The easiest way would be for
you to put the BSgenome packages you made somewhere accessible
and send me the links to them. I will add them to our data annotation
repo after running some quick checks on them.

Cheers,
H.

On 09/06/2018 10:01 AM, Jose V. Die wrote:

Sorry for the confusion. Rectifying my previous email: This time from my right
Email address and showing the Subject.

Hello everyone.

We run two molecular breeding programs at the Department of Genetics
(University of Cordoba) with the plant species asparagus and chickpea.
We made two BSgenome packages with their respective genome reference sequences
and would like to make them available to everyone by contributing to the
repository of Annotation packages.

Could you please tell me what the next step is?

Thanks,
Jose Die

The package tarballs are available here.
Asparagus genome:
https://drive.google.com/open?id=1r9_F76YZ51tS28yKG_VnUMpfQ2F_P-sP

Chickpea genome:
https://drive.google.com/open?id=1tkpxIw5DZS5OQN_9D0qE-ENekxvGDYhG


Re: [Bioc-devel] Web hook push error

2018-09-06 Thread Hervé Pagès


ORFik is in the devel builds:

  http://bioconductor.org/checkResults/devel/bioc-LATEST/index.html

as well as in the release builds:

  http://bioconductor.org/checkResults/release/bioc-LATEST/index.html

This indicates that the package has been accepted and added to
Bioconductor before we released Bioconductor 3.7 back in April.

Upon acceptance of your package you should have received an email
(off issue tracker) with important information about how to maintain
a Bioconductor package. One important bit to keep in mind is that
your package is now hosted at git.bioconductor.org in addition to
being on GitHub, so you need to maintain it at both locations (modifying
it on GitHub doesn't affect the copy on git.bioconductor.org, unless
you take action). The links Lori sent in her email below explain
how to do this. FWIW here is the full collection of HOWTO documents
that extensively cover the topic of maintaining a Bioconductor package
via git/GitHub:

  http://bioconductor.org/developers/how-to/git/

H.


On 09/05/2018 06:52 AM, Shepherd, Lori wrote:

Your package has been accepted and officially added to the devel version of 
Bioconductor.  You need to follow the instructions for setting up remotes and 
pushing to the git.bioconductor.org server.  The webhook may be deactivated and 
is only used during the submission process.


http://bioconductor.org/developers/how-to/git/sync-existing-repositories/

http://bioconductor.org/developers/how-to/git/push-to-github-bioc/



Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Bioc-devel on behalf of Håkon Tjeldnes

Sent: Wednesday, September 5, 2018 9:39:42 AM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] Web hook push error

I noticed today that since release, the web hook has not pushed:

Web hook bar is green, with the message:
can't build unless issue is open and has the '2. review in progress'
  label, or is closed and has the 'TESTING' label.


This is for the package ORFik


Have we set up the web hook in the wrong way ?


Re: [Rd] Argument 'dim' misspelled in error message

2018-09-04 Thread Hervé Pagès


Thanks!

On 09/01/2018 05:42 AM, Kurt Hornik wrote:

Hervé Pagès writes:


Thanks: fixed in the trunk with c75223.

Best
-k


Hi,
The following error message misspells the name of
the 'dim' argument:



  > array(integer(0), dim=integer(0))
  Error in array(integer(0), dim = integer(0)) :
    'dims' cannot be of length 0

The name of the argument is 'dim' not 'dims':

  > args(array)
  function (data = NA, dim = length(data), dimnames = NULL)
  NULL



Cheers,
H.




[Rd] Argument 'dim' misspelled in error message

2018-08-31 Thread Hervé Pagès


Hi,

The following error message misspells the name of
the 'dim' argument:

  > array(integer(0), dim=integer(0))
  Error in array(integer(0), dim = integer(0)) :
    'dims' cannot be of length 0

The name of the argument is 'dim' not 'dims':

  > args(array)
  function (data = NA, dim = length(data), dimnames = NULL)
  NULL

Cheers,
H.


Re: [Bioc-devel] Help with class lost after subsetting.

2018-08-28 Thread Hervé Pagès


On 08/28/2018 11:46 AM, Hervé Pagès wrote:

On 08/27/2018 11:01 PM, Martin Morgan wrote:



On 08/28/2018 12:19 AM, Charles Plessy wrote:

Dear Bioconductor developers,

In the CAGEr package, I created a "CAGEexp" class that extends
"MultiAssayExperiment" without adding new slots, in order to define
generic functions that require CAGEr-specific contents in the colData
slot.

Unfortunately, when run in the development branch of Bioconductor,
the CAGEexp objects lose their class when they are subsetted.  Here
is an example:

> CAGEr::exampleCAGEexp
A CAGEexp object of 4 listed
(...)

> CAGEr::exampleCAGEexp[,1]
A MultiAssayExperiment object of 4 listed
(...)

This breaks examples in the package, as well as existing code.

I am lost on how to troubleshoot this.  May I ask for your help ?


I debugged this using first `selectMethod("[", 
"MultiAssayExperiment")` and then `showMethod()` / `selectMethod()` to 
arrive at `subsetByColData,MultiAssayExperiment,ANY-method`.


The problem is that this line

https://github.com/waldronlab/MultiAssayExperiment/blob/master/R/subsetBy-methods.R#L261



returns a MultiAssayExperiment; what it should do is probably closer 
to the 'copy constructor' functionality of `initialize()`, along the 
lines of


   initialize(x, ExperimentList = ..., )

This could be opened as an issue on the MultiAssayExperiment github 
repository; maybe Herve or Michael or others might comment on the best 
implementation.


Yep. Personally I tend to prefer BiocGenerics:::replaceSlots()
over initialize() because the former can be called with check=FALSE
in order to skip a possibly expensive validation. So:

     BiocGenerics:::replaceSlots(x,
         ExperimentList = harmon[["experiments"]],
         colData = harmon[["colData"]],
         sampleMap = harmon[["sampleMap"]],
         metadata = metadata(x),
         check = FALSE)

If you know that the replacement values are valid (because of the way
you prepared them), then validation should not be needed.

Also when only **some** of the slots are updated (which is not the
case in the above example where all the slots are being replaced),


ERRATA: It seems that MultiAssayExperiment objects have one more slot,
the "drops" slot, that the code above does not modify, so this would
be one more reason IMO to use BiocGenerics:::replaceSlots() instead
of something like initialize(x, ExperimentList = ..., ) or
new(class(x), ExperimentList = ..., ).

H.


I find that the use of initialize() is misleading from a readability
point of view.

See https://stat.ethz.ch/pipermail/bioc-devel/2017-September/011496.html
for a discussion about this about 1 year ago.

H.



Martin



Best regards,









___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Help with class lost after subsetting.

2018-08-28 Thread Hervé Pagès


On 08/27/2018 11:01 PM, Martin Morgan wrote:



On 08/28/2018 12:19 AM, Charles Plessy wrote:

Dear Bioconductor developers,

In the CAGEr package, I created a "CAGEexp" class that extends
"MultiAssayExperiment" without adding new slots, in order to define
generic functions that require CAGEr-specific contents in the colData
slot.

Unfortunately, when run in the development branch of Bioconductor,
the CAGEexp objects lose their class when they are subsetted.  Here
is an example:


CAGEr::exampleCAGEexp

A CAGEexp object of 4 listed
(...)


CAGEr::exampleCAGEexp[,1]

A MultiAssayExperiment object of 4 listed
(...)

This breaks examples in the package, as well as existing code.

I am lost on how to troubleshoot this.  May I ask for your help ?


I debugged this using first `selectMethod("[", "MultiAssayExperiment")` 
and then `showMethod()` / `selectMethod()` to arrive at 
`subsetByColData,MultiAssayExperiment,ANY-method`.


The problem is that this line

https://github.com/waldronlab/MultiAssayExperiment/blob/master/R/subsetBy-methods.R#L261



returns a MultiAssayExperiment; what it should do is probably closer to 
the 'copy constructor' functionality of `initialize()`, along the lines of


   initialize(x, ExperimentList = ..., )

This could be opened as an issue on the MultiAssayExperiment github 
repository; maybe Herve or Michael or others might comment on the best 
implementation.


Yep. Personally I tend to prefer BiocGenerics:::replaceSlots()
over initialize() because the former can be called with check=FALSE
in order to skip a possibly expensive validation. So:

    BiocGenerics:::replaceSlots(x,
        ExperimentList = harmon[["experiments"]],
        colData = harmon[["colData"]],
        sampleMap = harmon[["sampleMap"]],
        metadata = metadata(x),
        check = FALSE)

If you know that the replacement values are valid (because of the way
you prepared them), then validation should not be needed.

Also when only **some** of the slots are updated (which is not the
case in the above example where all the slots are being replaced),
I find that the use of initialize() is misleading from a readability
point of view.

See https://stat.ethz.ch/pipermail/bioc-devel/2017-September/011496.html
for a discussion about this about 1 year ago.

H.



Martin



Best regards,







Re: [Rd] Where does L come from?

2018-08-25 Thread Hervé Pagès


On 08/25/2018 04:33 PM, Duncan Murdoch wrote:

On 25/08/2018 4:49 PM, Hervé Pagès wrote:

The choice of the L suffix in R to mean "R integer type", which
is mapped to the "int" type at the C level, and NOT to the "long int"
type, is really unfortunate as it seems to be misleading and confusing
a lot of people.


I don't have stats about this so I take back the "lot".

Can you provide any evidence of that (e.g. a link to a message from one 
of these people)?  I think a lot of people don't really know about the L 
suffix, but that's different from being confused or misled by it.


And if you make a criticism like that, it would really be fair to 
suggest what R should have done instead.  I can't think of anything 
better, given that "i" was already taken, and that the lack of a decimal 
place had historically not been significant.  Using "I" *would* have 
been confusing (3i versus 3I being very different).  Deciding that 3 
suddenly became an integer value different from 3. would have led to 
lots of inefficient conversions (since stats mainly deals with floating 
point values).


Maybe 10N, or 10n? I'm not convinced that 10I would have been
confusing but the I can easily be mistaken for a 1.

H.



Duncan Murdoch




The fact that nowadays "int" and "long int" have the same size on most
platforms is only anecdotal here.

Just my 2 cents.

H.

On 08/25/2018 10:01 AM, Dirk Eddelbuettel wrote:


On 25 August 2018 at 09:28, Carl Boettiger wrote:
| I always thought it meant "Long" (I'm assuming R's integers are long
| integers in C sense (iirrc one can declare 'long x', and it being common to
| refer to integers as "longs"  in the same way we use "doubles" to mean
| double precision floating point).  But pure speculation on my part, so I'm
| curious!

It does per my copy (dated 1990 !!) of the 2nd ed of Kernighan & Ritchie.  It
explicitly mentions (sec 2.2) that 'int' may be 16 or 32 bits, and 'long' is
32 bit; and (in sec 2.3) introduces the I, U, and L labels for constants.  So
"back then when" 32 bit was indeed long.  And as R uses 32 bit integers ...

(It is all murky because the size is an implementation detail and later
"essentially everybody" moved to 32 bit integers and 64 bit longs as the 64
bit architectures became prevalent.  Which is why when it matters one should
really use more explicit types like int32_t or int64_t.)

Dirk








Re: [Rd] Where does L come from?

2018-08-25 Thread Hervé Pagès





On 08/25/2018 02:23 PM, Dirk Eddelbuettel wrote:


On 25 August 2018 at 13:49, Hervé Pagès wrote:
| The choice of the L suffix in R to mean "R integer type", which
| is mapped to the "int" type at the C level, and NOT to the "long int"
| type, is really unfortunate as it seems to be misleading and confusing
| a lot of people.

The point I was trying to make in what you quote below is that the L may come
from a time when int and long int were in fact the same on most relevant
architectures. And it is hardly R's fault that C was allowed to change.

Also, it hardly matters given that R has precisely one integer type so I am
unsure where you see the confusion between long int and int.
  
| The fact that nowadays "int" and "long int" have the same size on most
| platforms is only anecdotal here.
|
| Just my 2 cents.

Are you sure?

   R> Rcpp::evalCpp("sizeof(long int)")
   [1] 8
   R> Rcpp::evalCpp("sizeof(int)")
   [1] 4
   R>


My bad, it's only the same on Windows. My point is that the discussion
about the size of int vs long int is only a distraction here. The
important bit is that 10L in R is represented by 10 in C, which is an
int, not by 10L, which is a long int. Could hardly be more confusing.
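
To make the last point concrete, here is a minimal C11 sketch. It is not
taken from R's sources, and the macro and function names are made up for
illustration. It uses _Generic to report the type the compiler assigns
to the literals 10 and 10L.

```c
#include <assert.h>
#include <limits.h>

/* The C standard guarantees that long can hold every int value. */
static_assert(INT_MAX <= LONG_MAX, "long must be at least as wide as int");

/* Hypothetical helper: 0 if the expression has type int,
   1 if it has type long, -1 for anything else. */
#define TYPE_CODE(x) _Generic((x), int: 0, long: 1, default: -1)

/* Type of the unsuffixed literal, the C counterpart of R's 10L
   according to the message above. */
int code_of_plain_10(void)
{
    return TYPE_CODE(10);
}

/* Type of the C literal 10L, which is a long int. */
int code_of_suffixed_10(void)
{
    return TYPE_CODE(10L);
}
```

The two functions return different codes on every conforming compiler,
whether or not int and long happen to have the same size on the
platform.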

H.




Dirk

| H.
|
| On 08/25/2018 10:01 AM, Dirk Eddelbuettel wrote:
| >
| > On 25 August 2018 at 09:28, Carl Boettiger wrote:
| > | I always thought it meant "Long" (I'm assuming R's integers are long
| > | integers in C sense (iirrc one can declare 'long x', and it being common to
| > | refer to integers as "longs"  in the same way we use "doubles" to mean
| > | double precision floating point).  But pure speculation on my part, so I'm
| > | curious!
| >
| > It does per my copy (dated 1990 !!) of the 2nd ed of Kernighan & Ritchie.  It
| > explicitly mentions (sec 2.2) that 'int' may be 16 or 32 bits, and 'long' is
| > 32 bit; and (in sec 2.3) introduces the I, U, and L labels for constants.  So
| > "back then when" 32 bit was indeed long.  And as R uses 32 bit integers ...
| >
| > (It is all murky because the size is an implementation detail and later
| > "essentially everybody" moved to 32 bit integers and 64 bit longs as the 64
| > bit architectures became prevalent.  Which is why when it matters one should
| > really use more explicit types like int32_t or int64_t.)
| >
| > Dirk
| >




Re: [Rd] Where does L come from?

2018-08-25 Thread Hervé Pagès


The choice of the L suffix in R to mean "R integer type", which
is mapped to the "int" type at the C level, and NOT to the "long int"
type, is really unfortunate as it seems to be misleading and confusing
a lot of people.

The fact that nowadays "int" and "long int" have the same size on most
platforms is only anecdotal here.

Just my 2 cents.

H.

On 08/25/2018 10:01 AM, Dirk Eddelbuettel wrote:


On 25 August 2018 at 09:28, Carl Boettiger wrote:
| I always thought it meant "Long" (I'm assuming R's integers are long
| integers in C sense (iirrc one can declare 'long x', and it being common to
| refer to integers as "longs"  in the same way we use "doubles" to mean
| double precision floating point).  But pure speculation on my part, so I'm
| curious!

It does per my copy (dated 1990 !!) of the 2nd ed of Kernighan & Ritchie.  It
explicitly mentions (sec 2.2) that 'int' may be 16 or 32 bits, and 'long' is
32 bit; and (in sec 2.3) introduces the I, U, and L labels for constants.  So
"back then when" 32 bit was indeed long.  And as R uses 32 bit integers ...

(It is all murky because the size is an implementation detail and later
"essentially everybody" moved to 32 bit integers and 64 bit longs as the 64
bit architectures became prevalent.  Which is why when it matters one should
really use more explicit types like int32_t or int64_t.)

Dirk




Re: [Rd] longint

2018-08-16 Thread Hervé Pagès


On 08/16/2018 11:30 AM, Prof Brian Ripley wrote:

On 16/08/2018 18:33, Hervé Pagès wrote:

...


Only on Intel platforms int is 32 bits. Strictly speaking int is only
required to be >= 16 bits. Who knows what the size of an int is on
the Sunway TaihuLight for example ;-)


R's configure checks that int is 32 bit and will not compile without it 
(src/main/arithmetic.c) ... so int and int32_t are the same on all 
platforms where the latter is defined.
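
A package that wants the same guarantee in its own C code could state it
at compile time. The following is only a sketch of that idea in C11, not
R's actual configure test, and the helper name is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Compile-time guards: refuse to build where int is not exactly 32 bits
   or does not have the same range as int32_t. */
static_assert(sizeof(int) * CHAR_BIT == 32, "this code assumes a 32-bit int");
static_assert(INT_MAX == INT32_MAX && INT_MIN == INT32_MIN,
              "int and int32_t must have the same range");

/* Hypothetical helper: width of int in bits, for run-time reporting. */
int int_width_bits(void)
{
    return (int) (sizeof(int) * CHAR_BIT);
}
```

If either assertion fails, the translation unit does not compile, so the
assumption cannot be violated silently.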


Good to know. Thanks for the clarification!


Re: [Rd] longint

2018-08-16 Thread Hervé Pagès


On 08/16/2018 05:12 AM, Dirk Eddelbuettel wrote:


On 15 August 2018 at 20:32, Benjamin Tyner wrote:
| Thanks for the replies and for confirming my suspicion.
|
| Interestingly, src/include/S.h uses a trick:
|
|     #define longint int
|
| and so does the nlme package (within src/init.c).

As Bill Dunlap already told you, this is a) ancient and b) was concerned with
the int as 16 bit to 32 bit transition period. Ie a long time ago. Old C
programmers remember.

You should preferably not even use 'long int' on the other side but rely on
the fact that all compilers nowadays allow you to specify exactly what size is
used via int64_t (long), int32_t (int), ... and the unsigned cousins (which R
does not have).  So please receive the value as an int64_t and then cast it to
an int32_t -- which corresponds to R's notion of an integer on every platform.


Only on Intel platforms int is 32 bits. Strictly speaking int is only
required to be >= 16 bits. Who knows what the size of an int is on
the Sunway TaihuLight for example ;-)

H.



And please note that that conversion is lossy.  If you must keep 64 bits then
the bit64 package by Jens Oehlschlaegel is good and eg fully supported inside
data.table. We use it for 64-bit integers as nanosecond timestamps in our
nanotime package (which has some converters).
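
The "receive as int64_t, then cast to int32_t" advice, together with the
warning that the conversion is lossy, can be sketched as a checked
narrowing. The function name is hypothetical and this is an
illustration, not code from R or from any of the packages mentioned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: if value fits in 32 bits, store it in *out and
   return true; otherwise leave *out untouched and return false. */
bool narrow_to_int32(int64_t value, int32_t *out)
{
    assert(out != NULL);
    if (value < INT32_MIN || value > INT32_MAX)
        return false;           /* the cast would lose information */
    *out = (int32_t) value;
    return true;
}
```

Code that hands the result back to R would need one more check, since R
reserves the smallest int value for NA_integer_.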

Dirk




Re: [Rd] longint

2018-08-15 Thread Hervé Pagès

No segfault but a BIG warning from the compiler. That's because
dereferencing the pointer inside your myfunc() function will
produce an int that is not predictable i.e. it is system-dependent.
Its value will depend on sizeof(long int) (which is not
guaranteed to be 8) and on the endianness of the system.

Also if the pointer you pass in the call to the function is
an array of long ints, then pointer arithmetic inside your myfunc()
won't necessarily take you to the array element that you'd expect.
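
A small sketch of both effects, with hypothetical function names. It
uses memcpy instead of a mismatched pointer dereference so that the
sketch itself stays well defined, and it assumes long is at least as
large as int.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: long is at least as large as int (true on mainstream
   platforms); the build fails otherwise. */
static_assert(sizeof(int) <= sizeof(long), "long must not be smaller than int");

/* Return the int stored in the first sizeof(int) bytes of a long,
   i.e. what code reading through an int pointer would look at. */
int leading_int_bytes(long value)
{
    int result;
    memcpy(&result, &value, sizeof result);
    return result;
}

/* Bytes between consecutive elements when an array is walked
   through a long pointer versus an int pointer. */
size_t stride_as_long(void) { return sizeof(long); }
size_t stride_as_int(void)  { return sizeof(int); }
```

With an 8-byte long, leading_int_bytes(1L) is 1 on a little-endian
system and 0 on a big-endian one. When the two strides differ, element i
seen through the int pointer is not element i of the long array.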

Note that there are very specific situations where you can actually
do this kind of thing e.g. in the context of writing a callback
function to pass to qsort(). See 'man 3 qsort' if you are on a Unix
system. In that case pointers to void and explicit casts should
be used. If done properly, this is portable code and the compiler won't
issue warnings.
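
For reference, a minimal sketch of the qsort() pattern described above:
the callback receives pointers to void and casts them back to the
element type explicitly. The names compare_long and sort_longs are made
up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callback with the signature qsort() requires. */
static int compare_long(const void *a, const void *b)
{
    long x = *(const long *) a;
    long y = *(const long *) b;
    return (x > y) - (x < y);   /* avoids the overflow that x - y could cause */
}

/* Hypothetical wrapper: sort n long ints in ascending order. */
void sort_longs(long *values, size_t n)
{
    if (n == 0)
        return;
    assert(values != NULL);
    qsort(values, n, sizeof values[0], compare_long);
}
```

The element size passed to qsort() and the cast inside the callback both
refer to long, so the pointer arithmetic and the dereference agree.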

H.


On 08/15/2018 07:05 AM, Brian Ripley wrote:




On 15 Aug 2018, at 12:48, Duncan Murdoch  wrote:


On 15/08/2018 7:08 AM, Benjamin Tyner wrote:
Hi
In my R package, imagine I have a C function defined:
    void myfunc(int *x) {
        // some code
    }
but when I call it, I pass it a pointer to a longint instead of a
pointer to an int. Could this practice potentially result in a segfault?


I don't think the passing would cause a segfault, but "some code" might be 
expecting a positive number, and due to the type error you could pass in a positive 
longint and have it interpreted as a negative int.


Are you thinking only of a little-endian system?  A 32-bit lookup of a pointer 
to a 64-bit area could read the wrong half and get a completely different value.



Duncan Murdoch





Re: [Bioc-devel] Block bootstrap for GenomicRanges

2018-08-14 Thread Hervé Pagès


I see. If the blocks need to be completely random (but still
with the constraint that once rearranged they form a partition
of the chromosome), then a slightly modified solution would be:

## Add a feature id to the ranges in y. This is not required
## but will help see what happens to the features:

  mcols(y)$feature_id <- head(letters, length(y))
  y
  # IRanges object with 12 ranges and 1 metadata column:
  #            start       end     width |  feature_id
  #        <integer> <integer> <integer> | <character>
  #    [1]        51        55         5 |           a
  #    [2]        61        65         5 |           b
  #    [3]        71        75         5 |           c
  #    [4]       111       115         5 |           d
  #    [5]       121       125         5 |           e
  #    ...       ...       ...       ... .         ...
  #    [8]       511       515         5 |           h
  #    [9]       521       525         5 |           i
  #   [10]       921       925         5 |           j
  #   [11]       931       935         5 |           k
  #   [12]       941       945         5 |           l

## Generate the random blocks:

  random_blocks <- IRanges(start=round(runif(10,1,901)), width=100)

## Add a block id. Again, not needed for the algo below, but will
## help understand the final object y_prime:

  mcols(random_blocks)$block_id <- head(LETTERS, length(random_blocks))
  random_blocks
  # IRanges object with 10 ranges and 1 metadata column:
  #            start       end     width |    block_id
  #        <integer> <integer> <integer> | <character>
  #    [1]       283       382       100 |           A
  #    [2]       898       997       100 |           B
  #    [3]       298       397       100 |           C
  #    [4]       680       779       100 |           D
  #    [5]       722       821       100 |           E
  #    [6]       632       731       100 |           F
  #    [7]       594       693       100 |           G
  #    [8]       689       788       100 |           H
  #    [9]       886       985       100 |           I
  #   [10]       673       772       100 |           J

## Compute the shift involved in rearranging each block:

  rearranged_blocks <- successiveIRanges(width(random_blocks))
  block_shift <- start(rearranged_blocks) - start(random_blocks)

## Compute y':

  y_prime <- do.call(c,
    lapply(seq_along(random_blocks),
      function(b) {
        features_to_shift <- subsetByOverlaps(y, random_blocks[b])
        block_id <- mcols(random_blocks)$block_id[b]
        mcols(features_to_shift)$block_id <- rep(block_id,
                                                 length(features_to_shift))
        shift(features_to_shift, block_shift[b])
      }
    )
  )

  y_prime
  # IRanges object with 6 ranges and 2 metadata columns:
  #           start       end     width |  feature_id    block_id
  #       <integer> <integer> <integer> | <character> <character>
  #   [1]       124       128         5 |           j           B
  #   [2]       134       138         5 |           k           B
  #   [3]       144       148         5 |           l           B
  #   [4]       836       840         5 |           j           I
  #   [5]       846       850         5 |           k           I
  #   [6]       856       860         5 |           l           I

Still based on shift(), which avoids all the little annoyances
of using Rle's as an intermediate representation of the ranges.

It uses a loop which might be a problem if the number of blocks is
big (say more than 5). There might be a way to avoid the loop
though, but it's probably not trivial...
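
For readers who want the shift arithmetic spelled out independently of
IRanges, here is a plain C sketch of the "compute the shift involved in
rearranging each block" step. It only mirrors the arithmetic (blocks are
laid end to end starting at position 1, as in the R code above); the
function name and signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* For n blocks given by start[] and width[], compute in shift[] how far
   each block moves when the blocks are laid end to end from position 1. */
void block_shifts(const int *start, const int *width, size_t n, int *shift)
{
    assert(start != NULL && width != NULL && shift != NULL);
    int next_start = 1;                     /* start of the next rearranged block */
    for (size_t i = 0; i < n; i++) {
        shift[i] = next_start - start[i];   /* new start minus old start */
        next_start += width[i];
    }
}
```

With the blocks printed above, block B (start 898, second in the list)
gets a shift of 101 - 898 = -797, which moves feature j from 921 to 124,
matching the first row of y_prime.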

H.


On 08/14/2018 05:26 AM, Michael Love wrote:

dear Hervé,

Thanks again for the quick and useful reply!

I think that the theory behind the block bootstrap [Kunsch (1989), Liu
and Singh (1992), Politis and Romano (1994)], needs that the blocks be
drawn with replacement (you can get some features twice) and that the
blocks can be overlapping. In a hand-waving way, I think, it's "good"
for the variance estimation on any statistic of interest that y' may
have more or less features than y.

I will explore a bit using the solutions you've laid out.

Now that I think about it, the start-position based solution that I
was thinking about will break if two features in y share the same
start position, so that's not good.

On Mon, Aug 13, 2018 at 11:58 PM, Hervé Pagès  wrote:

That helps. I think I start to understand what you are after.

See below...


On 08/13/2018 06:07 PM, Michael Love wrote:


dear Hervé,

Thanks for the quick reply about directions to take this.

I'm sorry for not providing sufficient detail about the goal of block
bootstrapping in my initial post. Let me try again. For a moment, let
me ignore multiple chromosomes/seqs and just focus on a single set of
IRanges.

The point of the block bootstrap is: Let's say we want to find the
number of overlaps of x and y, and then assess how surprised we are at
how large that overlap is. Both of them may have features that tend to
cluster together along the genome (independently). One method would
just be to move the features in y around to random start sites, makin

Re: [Bioc-devel] Block bootstrap for GenomicRanges

2018-08-13 Thread Hervé Pagès

width of the
range, I think this Views approach would actually work.

y.boot.1.rng <- IRanges(start(y.boot.1)[runValue(y.boot.1) == 1],
   width=runLength(y.boot.1)[runValue(y.boot.1) == 1])

I'm interested in building a function that takes in IRanges and
outputs these shuffled set of IRanges.




Re: [Bioc-devel] Block bootstrap for GenomicRanges

2018-08-13 Thread Hervé Pagès


Re: [Bioc-devel] Is biocLite() working for you right now? Could be a problem on our side. Issue with DelayedArray install with a PC on R 3.5.0

2018-08-03 Thread Hervé Pagès



Error in file(filename, "r", encoding = encoding) :
   cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
   InternetOpenUrl failed: 'A connection with the server could not be
established'

sessionInfo()

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.5.0



On my Mac however, I can install DelayedArray.


source('http://bioconductor.org/biocLite.R')

Bioconductor version 3.7 (BiocInstaller 1.30.0), ?biocLite for help

biocLite('DelayedArray')

BioC_mirror: https://bioconductor.org
Using Bioconductor 3.7 (BiocInstaller 1.30.0), R 3.5.1 (2018-07-02).
Installing package(s) ‘DelayedArray’
trying URL 'https://bioconductor.org/packages/3.7/bioc/bin/macosx/el-capitan/contrib/3.5/DelayedArray_0.6.2.tgz'
Content type 'application/x-gzip' length 1308365 bytes (1.2 MB)
==
downloaded 1.2 MB

sessionInfo()

R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] BiocInstaller_1.30.0

loaded via a namespace (and not attached):
[1] compiler_3.5.1 tools_3.5.1


So, I have no idea how to approach this and just wanted to double
check that things are ok from your side.


Thanks,
Leo


Re: [Bioc-devel] as.data.frame for GRanges when one meta column is a data frame

2018-07-05 Thread Hervé Pagès

e to bring to your attention that the TnT package is failing
to pass 'R CMD build' on all platforms in the devel version of
Bioconductor (i.e. BioC 3.8):

http://bioconductor.org/checkResults/devel/bioc-LATEST/TnT

Would you mind taking a look at this? Don't hesitate to ask on the
bioc-de...@r-project.org mailing list if you have any questions or need
help.


While devel is a place to experiment with new features, we expect
packages to build and check cleanly in a reasonable time period and
not stay broken for any extended period of time. The package has been
failing since 06/11/18.

If no action is taken over the next few weeks we will begin the
deprecation process for your package.


Thank you for your time and effort, and your continued contribution
to Bioconductor.

Please be advised that Bioconductor has switched from svn to Git. Some
helpful links can be found here:
https://bioconductor.org/developers/how-to/git/
http://bioconductor.org/developers/how-to/git/bug-fix-in-release-and-devel/



Lori Shepherd
Bioconductor Core Team
Roswell Park Cancer Institute
Department of Biostatistics & Bioinformatics
Elm & Carlton Streets
Buffalo, New York 14263


[Rd] MARGIN in base::unique.matrix() and base::unique.array()

2018-07-02 Thread Hervé Pagès


Hi,

The man page for base::unique.matrix() and base::unique.array() says
that MARGIN is expected to be a single integer. OTOH the code in charge
of checking the user supplied MARGIN is:

  if (length(MARGIN) > ndim || any(MARGIN > ndim))
      stop(gettextf("MARGIN = %d is invalid for dim = %d",
                    MARGIN, dx), domain = NA)

which doesn't really make sense since it lets through a MARGIN of
length > 1 (e.g. 1:2).

As a consequence the user gets an obscure error message when specifying
a MARGIN that satisfies the above check but is in fact invalid:

  > unique(matrix(1:10, ncol=2), MARGIN=1:2)
  Error in args[[MARGIN]] <- !duplicated.default(temp, fromLast = fromLast,  :
    object of type 'symbol' is not subsettable

Also the code used by the above check to generate the error message
is broken:

  > unique(matrix(1:10, ncol=2), MARGIN=1:3)
  Error in sprintf(gettext(fmt, domain = domain), ...) :
    arguments cannot be recycled to the same length

  > unique(matrix(1:10, ncol=2), MARGIN=3)
  Error in unique.matrix(matrix(1:10, ncol = 2), MARGIN = 3) :
    c("MARGIN = 3 is invalid for dim = 5", "MARGIN = 3 is invalid for dim = 2")
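
For illustration only, here is a minimal sketch of what a stricter check could look like, assuming (as the man page says) that MARGIN must be a single integer. The helper name check_margin() is hypothetical and is not part of base R:

```r
# Hypothetical helper (not part of base R): validate MARGIN before use,
# assuming, as the man page says, that it must be a single integer.
check_margin <- function(MARGIN, dx) {
    ndim <- length(dx)
    ok <- is.numeric(MARGIN) && length(MARGIN) == 1L && !is.na(MARGIN) &&
          MARGIN == trunc(MARGIN) && MARGIN >= 1 && MARGIN <= ndim
    if (!ok)
        stop(sprintf("MARGIN must be a single integer between 1 and %d, got: %s",
                     ndim, paste(deparse(MARGIN), collapse = " ")),
             call. = FALSE)
    as.integer(MARGIN)
}

dx <- dim(matrix(1:10, ncol = 2))
check_margin(2, dx)                      # a valid MARGIN is returned as integer

# An invalid MARGIN now fails early with a single, readable message.
msg <- tryCatch(check_margin(1:2, dx),
                error = function(e) conditionMessage(e))
msg
```

With something like this the sprintf() call only ever sees length-1 arguments, so the error message itself cannot break.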


Thanks,
H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Bioc-devel] reverting to older version

2018-06-11 Thread Hervé Pagès


I can only encourage you to keep track of the most significant changes
in your package in its NEWS file, especially if its history is a little
bit complicated, as seems to be the case here. Briefly explaining the
motivations behind those changes is a good idea.

Cheers,
H.


On 06/11/2018 08:47 PM, Samsiddhi Bhattacharjee wrote:

OK...1.98.0 and 1.99.0 sounds good. Shall do that.
Is it necessary to convey the reasons for the change, e.g. in the NEWS file?
That's my last question...I hope !



On Tuesday, June 12, 2018, Hervé Pagès <hpa...@fredhutch.org> wrote:


Ah ok. Yes 1.99.0 is fine. Then the package will be released as 2.0.0
in Fall as part of BioC 3.8.

Not that version numbers have a strong meaning but I was thinking that
maybe you could bump to 1.98.0 in release to sort of indicate the fact
that the package in release is the precursor of what's going to become
2.0.0 in the next release. If 1.98.0 works as expected, you should
freeze it i.e. only touch it when you absolutely need to fix something
in it.

Hope this helps,
H.

On 06/11/2018 06:33 PM, Samsiddhi Bhattacharjee wrote:

Thanks, I shall do that. Is it OK to keep the master as 1.99.0?
It should probably have been 1.19.1?


On Monday, June 11, 2018, Hervé Pagès <hpa...@fredhutch.org> wrote:

     Hi,

     Having a package that is known to be broken in release is not
     really an option.

     How about replacing all the files in the RELEASE_3_7 branch
     with what's in the master branch. For the version, just bump
     z (in x.y.z) to its next version. Don't touch x or y. So the
     version would become 1.18.1 in release. Then commit (it's going
     to be a single commit) with a commit message that says something
     like "Resync with master branch".

     Cheers,
     H.

     On 06/11/2018 09:27 AM, Samsiddhi Bhattacharjee wrote:

         Hi,

         I am maintainer of package ASSET. We have recently discovered
         some issues (most importantly computational speed issues) with
         recent versions of our package and wanted to revert the code
         to an older version, ASSET v 1.8.0, present in Bioconductor
         release 3.2, before proceeding to make further enhancements to
         the package.

         In release 3.3, there were major changes to the package; it is
         like a branch that we now realize we need to abandon. We had
         introduced a new feature and for that we switched from
         deterministic p-value calculation to stochastic calculation.
         We did not notice the issues until now. We want to switch back
         to the deterministic one, which was present last in 3.2.

         As suggested by Nitesh, I have made the changes in the devel
         branch (basically by copying the code as it was in release
         3.2, and only updating the DESCRIPTION file to make the
         version 1.99.0), as this will be a major change (although we
         are taking a few steps back, we will probably add some steps
         forward before release 3.8).

         I wanted to put a .onAttach() message in the current version
         to make the user aware of the issues, possibly mentioning the
         next release and/or pointing to the older release. However, as
         Herve has pointed out, people may mix up devel and release
         versions, causing problems. Hence Herve had suggested:

         "It will be much better if you actually fix the release
         version of your package. This should just be a matter of
         porting the fixes you do in devel with 'git cherry-pick'."

         The reason I am hesitating is that the changes (diff of 3.7
         and 3.2) are quite a lot and doing selective changes as
         suggested will introduce further bugs, and even after
         selection these changes will be *many*. Is it ok to backport
         a "patch" to the release with a large number of changes? If
         yes, what should the version number be bumped to?

Re: [Bioc-devel] reverting to older version

2018-06-11 Thread Hervé Pagès


Ah ok. Yes 1.99.0 is fine. Then the package will be released as 2.0.0
in Fall as part of BioC 3.8.

Not that version numbers have a strong meaning but I was thinking that
maybe you could bump to 1.98.0 in release to sort of indicate the fact
that the package in release is the precursor of what's going to become
2.0.0 in the next release. If 1.98.0 works as expected, you should
freeze it i.e. only touch it when you absolutely need to fix something
in it.
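
As a quick sanity check on the numbering, here is a small sketch showing how the versions mentioned in this thread order when compared as versions rather than strings. It relies on the usual Bioconductor convention that y in x.y.z is even in release and odd in devel; the variable names are only for illustration:

```r
# Versions discussed in this thread. package_version() compares them
# component by component rather than as character strings.
v <- package_version(c("1.18.1", "1.98.0", "1.99.0", "2.0.0"))

# Each version is strictly lower than the next one.
increasing <- all(v[-length(v)] < v[-1])

# Extract y from x.y.z (a numeric_version is a list of integer vectors).
minor <- vapply(unclass(v), function(parts) parts[2L], integer(1))

# Even y: fits a release branch; odd y: fits the devel branch.
data.frame(version = as.character(v), even_y = minor %% 2L == 0L)
```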

Hope this helps,
H.

On 06/11/2018 06:33 PM, Samsiddhi Bhattacharjee wrote:
Thanks, I shall do that. Is it OK to keep the master as 1.99.0? It should
probably have been 1.19.1?



On Monday, June 11, 2018, Hervé Pagès <hpa...@fredhutch.org> wrote:


Hi,

Having a package that is known to be broken in release is not
really an option.

How about replacing all the files in the RELEASE_3_7 branch
with what's in the master branch. For the version, just bump
z (in x.y.z) to its next version. Don't touch x or y. So the
version would become 1.18.1 in release. Then commit (it's going
to be a single commit) with a commit message that says something
like "Resync with master branch".

Cheers,
H.

On 06/11/2018 09:27 AM, Samsiddhi Bhattacharjee wrote:

Hi,

I am maintainer of package ASSET. We have recently discovered some issues
(most importantly computational speed issues) with recent versions of our
package and wanted to revert the code to an older version, ASSET v 1.8.0,
present in Bioconductor release 3.2, before proceeding to make further
enhancements to the package.

In release 3.3, there were major changes to the package; it is like a
branch that we now realize we need to abandon. We had introduced a new
feature and for that we switched from deterministic p-value calculation to
stochastic calculation. We did not notice the issues until now. We want to
switch back to the deterministic one, which was present last in 3.2.

As suggested by Nitesh, I have made the changes in the devel branch
(basically by copying the code as it was in release 3.2, and only updating
the DESCRIPTION file to make the version 1.99.0), as this will be a major
change (although we are taking a few steps back, we will probably add some
steps forward before release 3.8).

I wanted to put a .onAttach() message in the current version to make the
user aware of the issues, possibly mentioning the next release and/or
pointing to the older release. However, as Herve has pointed out, people
may mix up devel and release versions, causing problems. Hence Herve had
suggested:

"It will be much better if you actually fix the release version of your
package. This should just be a matter of porting the fixes you do in devel
with 'git cherry-pick'."

The reason I am hesitating is that the changes (diff of 3.7 and 3.2) are
quite a lot and doing selective changes as suggested will introduce further
bugs, and even after selection these changes will be *many*. Is it ok to
backport a "patch" to the release with a large number of changes? If yes,
what should the version number be bumped to?

Thanks in advance.

Regards,

--
Samsiddhi

         [[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=fgBGvYIMbW3NwrKMVPVed43z9LsMyZhyprB7VIWmzRQ=mkxJZC0R8tmJDvJ5e5BD4q_sni2JIJB-sCIAkpGut9c=



Re: [Bioc-devel] reverting to older version

2018-06-11 Thread Hervé Pagès

Hi,

Having a package that is known to be broken in release is not
really an option.

How about replacing all the files in the RELEASE_3_7 branch
with what's in the master branch. For the version, just bump
z (in x.y.z) to its next version. Don't touch x or y. So the
version would become 1.18.1 in release. Then commit (it's going
to be a single commit) with a commit message that says something
like "Resync with master branch".
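
To make the "bump z only" rule concrete, here is a tiny sketch. The helper bump_z() is hypothetical, and the starting version 1.18.0 is an assumption (the thread only says the result would be 1.18.1):

```r
# Hypothetical helper: bump z in x.y.z and leave x and y untouched.
bump_z <- function(version) {
    parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1L]])
    stopifnot(length(parts) == 3L, !anyNA(parts))
    parts[3L] <- parts[3L] + 1L
    paste(parts, collapse = ".")
}

# Assumed current release version, for illustration only.
bump_z("1.18.0")
```

The returned string is what would go in the Version field of the DESCRIPTION file on the release branch.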

Cheers,
H.

On 06/11/2018 09:27 AM, Samsiddhi Bhattacharjee wrote:

Hi,

I am maintainer of package ASSET. We have recently discovered some issues
(most importantly computational speed issues) with recent versions of our
package and wanted to revert the code to an older version, ASSET v 1.8.0,
present in Bioconductor release 3.2, before proceeding to make further
enhancements to the package.

In release 3.3, there were major changes to the package; it is like a
branch that we now realize we need to abandon. We had introduced a new
feature and for that we switched from deterministic p-value calculation to
stochastic calculation. We did not notice the issues until now. We want to
switch back to the deterministic one, which was present last in 3.2.

As suggested by Nitesh, I have made the changes in the devel branch
(basically by copying the code as it was in release 3.2, and only updating
the DESCRIPTION file to make the version 1.99.0), as this will be a major
change (although we are taking a few steps back, we will probably add some
steps forward before release 3.8).

I wanted to put a .onAttach() message in the current version to make the
user aware of the issues, possibly mentioning the next release and/or
pointing to the older release. However, as Herve has pointed out, people
may mix up devel and release versions, causing problems. Hence Herve had
suggested:

"It will be much better if you actually fix the release version of your
package. This should just be a matter of porting the fixes you do in devel
with 'git cherry-pick'."

The reason I am hesitating is that the changes (diff of 3.7 and 3.2) are
quite a lot and doing selective changes as suggested will introduce further
bugs, and even after selection these changes will be *many*. Is it ok to
backport a "patch" to the release with a large number of changes? If yes,
what should the version number be bumped to?

Thanks in advance.

Regards,

--
Samsiddhi

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=fgBGvYIMbW3NwrKMVPVed43z9LsMyZhyprB7VIWmzRQ=mkxJZC0R8tmJDvJ5e5BD4q_sni2JIJB-sCIAkpGut9c=

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone: (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Rd] Subsetting the "ROW"s of an object

2018-06-08 Thread Hervé Pagès





On 06/08/2018 02:15 PM, Hadley Wickham wrote:

On Fri, Jun 8, 2018 at 2:09 PM, Berry, Charles  wrote:




On Jun 8, 2018, at 1:49 PM, Hadley Wickham  wrote:

Hmmm, yes, there must be some special case in the C code to avoid
recycling a length-1 logical vector:



Here is a version that (I think) handles Herve's issue of arrays having one or 
more 0 dimensions.

subset_ROW <-
 function(x,i)
{
 dims <- dim(x)
 index_list <- which(dims[-1] != 0L) + 3
 mc <- quote(x[i])
 nd <- max(1L, length(dims))
 mc[ index_list ] <- list(TRUE)
 mc[[ nd + 3L ]] <- FALSE
 names( mc )[ nd+3L ] <- "drop"
 eval(mc)
}

Curiously enough the timing is *much* better for this implementation than for 
the first version I sent.

Constructing a version of `mc' that looks like `x[i,,,,drop=FALSE]' can be done
with `alist(a=)' in place of `list(TRUE)' in the earlier version but seems to
slow things down noticeably. It requires almost twice (!!) as much time as the
version above.


I think that's probably because alist() is a slow way to generate a
missing symbol:

bench::mark(
   alist(x = ),
   list(x = quote(expr = )),
   check = FALSE
)[1:5]
#> # A tibble: 2 x 5
#>   expression                    min     mean   median      max
#> 1 alist(x = )                 2.8µs   3.54µs   3.29µs   34.9µs
#> 2 list(x = quote(expr = ))    169ns 219.38ns    181ns   24.2µs

(note the units)
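
A minimal sketch of how quote(expr = ) can be used to build the whole call with missing subscripts, using base R only. The function name make_row_call() is made up for this example and is not what S4Vectors does internally:

```r
# Hypothetical helper: build the call x[i, , ..., drop = FALSE] for an
# object with 'nd' dimensions, using quote(expr = ) for each missing subscript.
make_row_call <- function(nd) {
    missing_args <- rep(list(quote(expr = )), nd - 1L)
    as.call(c(list(as.name("["), as.name("x"), as.name("i")),
              missing_args,
              list(drop = FALSE)))
}

x <- array(seq_len(24), c(4, 3, 2))
i <- 2:3

cl <- make_row_call(length(dim(x)))
res <- eval(cl)    # evaluated here, where 'x' and 'i' are defined
dim(res)
```

The call is built once per number of dimensions, so the cost of creating the missing symbols matters mostly when this runs in a tight loop.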


That's a good one. Need to change this in S4Vectors::default_extractROWS()
and other places. Thanks!

H.



Hadley




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Subsetting the "ROW"s of an object

2018-06-08 Thread Hervé Pagès


The C code for subsetting doesn't need to recycle a logical subscript.
It only needs to walk on it and start again at the beginning of the
vector when it reaches the end. Not exactly the same as detecting the
"take everything along that dimension" situation though.
x[TRUE, TRUE, TRUE] triggers the full subsetting machinery when x[]
and x[ , , ] could (and should) easily avoid it.
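
For anyone who wants to try this without the bench package, here is a rough sketch using system.time() only. The array is smaller than the one in the thread so that it runs quickly, and the timings will vary from machine to machine:

```r
# Smaller array than in the thread so that this runs quickly.
arr <- array(rnorm(4 * 4 * 4 * 1e4), c(4, 4, 4, 1e4))
i <- c(1, 3)

# Both forms select exactly the same elements ...
same <- identical(arr[i, TRUE, TRUE, TRUE], arr[i, , , ])

# ... so only the time spent inside the subsetting code can differ.
t_true    <- system.time(for (k in 1:50) arr[i, TRUE, TRUE, TRUE])
t_missing <- system.time(for (k in 1:50) arr[i, , , ])

c(true = t_true[["elapsed"]], missing = t_missing[["elapsed"]])
```

system.time() is a low precision tool, so small differences between the two forms may not show up here.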

H.

On 06/08/2018 01:49 PM, Hadley Wickham wrote:

Hmmm, yes, there must be some special case in the C code to avoid
recycling a length-1 logical vector:

dims <- c(4, 4, 4, 1e5)

arr <- array(rnorm(prod(dims)), dims)
dim(arr)
#> [1]  4  4  4 10
i <- c(1, 3)

bench::mark(
   arr[i, TRUE, TRUE, TRUE],
   arr[i, , , ]
)[c("expression", "min", "mean", "max")]
#> # A tibble: 2 x 4
#>   expression                    min     mean      max
#> 1 arr[i, TRUE, TRUE, TRUE]   41.8ms   43.6ms   46.5ms
#> 2 arr[i, , , ]               41.7ms   43.1ms   46.3ms


On Fri, Jun 8, 2018 at 12:31 PM, Berry, Charles  wrote:




On Jun 8, 2018, at 11:52 AM, Hadley Wickham  wrote:

On Fri, Jun 8, 2018 at 11:38 AM, Berry, Charles  wrote:




On Jun 8, 2018, at 10:37 AM, Hervé Pagès  wrote:

Also the TRUEs cause problems if some dimensions are 0:


matrix(raw(0), nrow=5, ncol=0)[1:3 , TRUE]

Error in matrix(raw(0), nrow = 5, ncol = 0)[1:3, TRUE] :
   (subscript) logical subscript too long


OK. But this is easy enough to handle.



H.

On 06/08/2018 10:29 AM, Hadley Wickham wrote:

I suspect this will have suboptimal performance since the TRUEs will
get recycled. (Maybe there is, or could be, ALTREP support for
recycling.)
Hadley



AFAICS, it is not an issue. Taking

arr <- array(rnorm(2^22),c(2^10,4,4,4))

as a test case

and using a function that will either use the literal code `x[i,,,,drop=FALSE]'
or `eval(mc)':

subset_ROW4 <-
  function(x, i, useLiteral=FALSE)
{
    literal <- quote(x[i,,,,drop=FALSE])
    mc <- quote(x[i])
    nd <- max(1L, length(dim(x)))
    mc[seq(4,length=nd-1L)] <- rep(TRUE, nd-1L)
    mc[["drop"]] <- FALSE
    if (useLiteral)
        eval(literal)
    else
        eval(mc)
}

I get identical times with

system.time(for (i in 1:1) subset_ROW4(arr,seq(1,length=10,by=100),TRUE))

and with

system.time(for (i in 1:1) subset_ROW4(arr,seq(1,length=10,by=100),FALSE))


I think that's because you used a relatively low precision timing
mechanism, and included the index generation in the timing. I see:

arr <- array(rnorm(2^22),c(2^10,4,4,4))
i <- seq(1,length = 10, by = 100)

bench::mark(
  arr[i, TRUE, TRUE, TRUE],
  arr[i, , , ]
)
#> # A tibble: 2 x 1
#>   expression         min     mean   median      max  n_gc
#> 1 arr[i, TRUE,…    7.4µs   10.9µs  10.66µs   1.22ms     2
#> 2 arr[i, , , ]    7.06µs    8.8µs   7.85µs 538.09µs     2

So not a huge difference, but it's there.



Funny. I get similar results to yours above albeit with smaller differences. 
Usually < 5 percent.

But with subset_ROW4 I see no consistent difference.

In this example, it runs faster on average using `eval(mc)' to return the 
result:


arr <- array(rnorm(2^22),c(2^10,4,4,4))
i <- seq(1,length=10,by=100)
bench::mark(subset_ROW4(arr,i,FALSE), subset_ROW4(arr,i,TRUE))[,1:8]

# A tibble: 2 x 8
  expression                      min     mean   median      max `itr/sec` mem_alloc  n_gc
1 subset_ROW4(arr, i, FALSE)   28.9µs   34.9µs   32.1µs   1.36ms    28686.    5.05KB     5
2 subset_ROW4(arr, i, TRUE)    28.9µs     35µs   32.4µs 875.11µs    28572.    5.05KB     5




And on subsequent reps the lead switches back and forth.


Chuck







--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Subsetting the "ROW"s of an object

2018-06-08 Thread Hervé Pagès


A missing subscript is still preferable to a TRUE though because it
carries the meaning "take it all". A TRUE also achieves this but via
implicit recycling. For example x[ , , ] and x[TRUE, TRUE, TRUE]
achieve the same thing (if length(x) != 0) and are both no-ops but
the subsetting code gets a chance to immediately and easily detect
the former as a no-op whereas it will probably not be able to do it
so easily for the latter. So in this case it will most likely generate
a copy of 'x' and fill the new array by taking a full walk on it.
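
A small sketch of the equivalence described above, on an array with no zero-length dimension. It only shows that the results are the same; whether a copy is made internally is not something this code can demonstrate:

```r
x <- array(seq_len(24), c(4, 3, 2))

# Missing subscripts: "take it all" along every dimension.
all_missing <- x[ , , ]

# TRUE subscripts: same result, obtained by recycling each length-1 TRUE.
all_true <- x[TRUE, TRUE, TRUE]

stopifnot(identical(all_missing, x), identical(all_true, x))

# Recycling is what a logical subscript shorter than the extent means:
# c(TRUE, FALSE) recycled over 4 rows selects rows 1 and 3.
stopifnot(identical(x[c(TRUE, FALSE), , ], x[c(1, 3), , ]))
```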

H.

On 06/08/2018 11:52 AM, Hadley Wickham wrote:

On Fri, Jun 8, 2018 at 11:38 AM, Berry, Charles  wrote:




On Jun 8, 2018, at 10:37 AM, Hervé Pagès  wrote:

Also the TRUEs cause problems if some dimensions are 0:

  > matrix(raw(0), nrow=5, ncol=0)[1:3 , TRUE]
  Error in matrix(raw(0), nrow = 5, ncol = 0)[1:3, TRUE] :
(subscript) logical subscript too long


OK. But this is easy enough to handle.



H.

On 06/08/2018 10:29 AM, Hadley Wickham wrote:

I suspect this will have suboptimal performance since the TRUEs will
get recycled. (Maybe there is, or could be, ALTREP support for
recycling.)
Hadley



AFAICS, it is not an issue. Taking

arr <- array(rnorm(2^22),c(2^10,4,4,4))

as a test case

and using a function that will either use the literal code `x[i,,,,drop=FALSE]'
or `eval(mc)':

subset_ROW4 <-
  function(x, i, useLiteral=FALSE)
{
    literal <- quote(x[i,,,,drop=FALSE])
    mc <- quote(x[i])
    nd <- max(1L, length(dim(x)))
    mc[seq(4,length=nd-1L)] <- rep(TRUE, nd-1L)
    mc[["drop"]] <- FALSE
    if (useLiteral)
        eval(literal)
    else
        eval(mc)
}

I get identical times with

system.time(for (i in 1:1) subset_ROW4(arr,seq(1,length=10,by=100),TRUE))

and with

system.time(for (i in 1:1) subset_ROW4(arr,seq(1,length=10,by=100),FALSE))


I think that's because you used a relatively low precision timing
mechanism, and included the index generation in the timing. I see:

arr <- array(rnorm(2^22),c(2^10,4,4,4))
i <- seq(1,length = 10, by = 100)

bench::mark(
   arr[i, TRUE, TRUE, TRUE],
   arr[i, , , ]
)
#> # A tibble: 2 x 1
#>   expression         min     mean   median      max  n_gc
#> 1 arr[i, TRUE,…    7.4µs   10.9µs  10.66µs   1.22ms     2
#> 2 arr[i, , , ]    7.06µs    8.8µs   7.85µs 538.09µs     2

So not a huge difference, but it's there.

Hadley




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Subsetting the "ROW"s of an object

2018-06-08 Thread Hervé Pagès


Also the TRUEs cause problems if some dimensions are 0:

  > matrix(raw(0), nrow=5, ncol=0)[1:3 , TRUE]
  Error in matrix(raw(0), nrow = 5, ncol = 0)[1:3, TRUE] :
(subscript) logical subscript too long
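
For comparison, here is a short sketch showing that a missing subscript handles the zero-extent case, plus one possible guard if a logical subscript has to be used. The guard is only an illustration, not something proposed in this thread:

```r
m <- matrix(raw(0), nrow = 5, ncol = 0)

# A missing subscript works fine along a dimension of extent 0.
ok <- m[1:3, , drop = FALSE]
dim(ok)

# A TRUE subscript is longer than the extent (1 > 0), hence the error above.
res <- tryCatch(m[1:3, TRUE], error = function(e) conditionMessage(e))

# Illustrative guard: use a zero-length logical when the extent is 0.
col_index <- if (ncol(m) == 0L) logical(0) else TRUE
guarded <- m[1:3, col_index, drop = FALSE]
dim(guarded)
```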

H.

On 06/08/2018 10:29 AM, Hadley Wickham wrote:

I suspect this will have suboptimal performance since the TRUEs will
get recycled. (Maybe there is, or could be, ALTREP support for
recycling.)
Hadley

On Fri, Jun 8, 2018 at 10:16 AM, Berry, Charles  wrote:




On Jun 8, 2018, at 8:45 AM, Hadley Wickham  wrote:

Hi all,

Is there a better way to subset the ROWs (in the sense of NROW) of
a vector, matrix, data frame or array than this?



You can use TRUE to fill the subscripts for dimensions 2:nd



subset_ROW <- function(x, i) {
  nd <- length(dim(x))
  if (nd <= 1L) {
    x[i]
  } else {
    dims <- rep(list(quote(expr = )), nd - 1L)
    do.call(`[`, c(list(quote(x), quote(i)), dims, list(drop = FALSE)))
  }
}



subset_ROW <-
 function(x,i)
{
 mc <- quote(x[i])
 nd <- max(1L, length(dim(x)))
 mc[seq(4, length=nd-1L)] <- rep(list(TRUE), nd - 1L)
 mc[["drop"]] <- FALSE
 eval(mc)

}



subset_ROW(1:10, 4:6)
#> [1] 4 5 6

str(subset_ROW(array(1:10, c(10)), 2:4))
#>  int [1:3(1d)] 2 3 4
str(subset_ROW(array(1:10, c(10, 1)), 2:4))
#>  int [1:3, 1] 2 3 4
str(subset_ROW(array(1:10, c(5, 2)), 2:4))
#>  int [1:3, 1:2] 2 3 4 7 8 9
str(subset_ROW(array(1:10, c(10, 1, 1)), 2:4))
#>  int [1:3, 1, 1] 2 3 4

subset_ROW(data.frame(x = 1:10, y = 10:1), 2:4)
#>   x y
#> 2 2 9
#> 3 3 8
#> 4 4 7



HTH,

Chuck







--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Subsetting the "ROW"s of an object

2018-06-08 Thread Hervé Pagès


On 06/08/2018 10:32 AM, Hervé Pagès wrote:

On 06/08/2018 10:15 AM, Michael Lawrence wrote:

There probably should be an abstraction for this. In S4Vectors, we
have extractROWS().


FWIW the code in S4Vectors that does what your subset_ROW() does is:


https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_Bioconductor_S4Vectors_blob_04cc9516af986b30445e99fd1337f13321b7b4f6_R_subsetting-2Dutils.R-23L466-2DL476=DwIFaQ=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=LnDTzOeXwI6VI-4SVVi2rwDE7A-az-AhxPAB6X7Lkhc=_2PVGd2BrNNHtPjGsJkhSLAmtX3eoFuZDWWs2c8zZ4w= 


Wrong link sorry. Here is the correct one:


https://github.com/Bioconductor/S4Vectors/blob/04cc9516af986b30445e99fd1337f13321b7b4f6/R/subsetting-utils.R#L453-L464

H.




(This is the default "extractROWS" method.)

Except for the normalization of 'i', it does the same as your
subset_ROW(). I don't know how to do this without generating a call
with missing arguments.

H.



Michael

On Fri, Jun 8, 2018 at 8:45 AM, Hadley Wickham  
wrote:

Hi all,

Is there a better way to subset the ROWs (in the sense of NROW) of
a vector, matrix, data frame or array than this?

subset_ROW <- function(x, i) {
  nd <- length(dim(x))
  if (nd <= 1L) {
    x[i]
  } else {
    dims <- rep(list(quote(expr = )), nd - 1L)
    do.call(`[`, c(list(quote(x), quote(i)), dims, list(drop = FALSE)))
  }
}

subset_ROW(1:10, 4:6)
#> [1] 4 5 6

str(subset_ROW(array(1:10, c(10)), 2:4))
#>  int [1:3(1d)] 2 3 4
str(subset_ROW(array(1:10, c(10, 1)), 2:4))
#>  int [1:3, 1] 2 3 4
str(subset_ROW(array(1:10, c(5, 2)), 2:4))
#>  int [1:3, 1:2] 2 3 4 7 8 9
str(subset_ROW(array(1:10, c(10, 1, 1)), 2:4))
#>  int [1:3, 1, 1] 2 3 4

subset_ROW(data.frame(x = 1:10, y = 10:1), 2:4)
#>   x y
#> 2 2 9
#> 3 3 8
#> 4 4 7

It seems like there should be a way to do this that doesn't require
generating a call with missing arguments, but I can't think of it.

Thanks!

Hadley

--
https://urldefense.proofpoint.com/v2/url?u=http-3A__hadley.nz=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=MF0DzYDiaYtcFXIyQwpQKs9lVbLNvdBBUubTv7BVAfM=GSpoAzc1Kn_BnTIkDh0HBFGKtRm-xFodxEPOejriC9Q= 



__
R-devel@r-project.org mailing list
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Ddevel=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=MF0DzYDiaYtcFXIyQwpQKs9lVbLNvdBBUubTv7BVAfM=HsEbNAT5IElAUS-W2VVSeJs4tfQc77heV7BbQxru518= 





__
R-devel@r-project.org mailing list
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Ddevel=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=MF0DzYDiaYtcFXIyQwpQKs9lVbLNvdBBUubTv7BVAfM=HsEbNAT5IElAUS-W2VVSeJs4tfQc77heV7BbQxru518= 







--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Subsetting the "ROW"s of an object

2018-06-08 Thread Hervé Pagès


On 06/08/2018 10:15 AM, Michael Lawrence wrote:

There probably should be an abstraction for this. In S4Vectors, we
have extractROWS().


FWIW the code in S4Vectors that does what your subset_ROW() does is:


https://github.com/Bioconductor/S4Vectors/blob/04cc9516af986b30445e99fd1337f13321b7b4f6/R/subsetting-utils.R#L466-L476

(This is the default "extractROWS" method.)

Except for the normalization of 'i', it does the same as your
subset_ROW(). I don't know how to do this without generating a call
with missing arguments.

H.



Michael

On Fri, Jun 8, 2018 at 8:45 AM, Hadley Wickham  wrote:

Hi all,

Is there a better way to subset the ROWs (in the sense of NROW) of
a vector, matrix, data frame or array than this?

subset_ROW <- function(x, i) {
  nd <- length(dim(x))
  if (nd <= 1L) {
    x[i]
  } else {
    dims <- rep(list(quote(expr = )), nd - 1L)
    do.call(`[`, c(list(quote(x), quote(i)), dims, list(drop = FALSE)))
  }
}

subset_ROW(1:10, 4:6)
#> [1] 4 5 6

str(subset_ROW(array(1:10, c(10)), 2:4))
#>  int [1:3(1d)] 2 3 4
str(subset_ROW(array(1:10, c(10, 1)), 2:4))
#>  int [1:3, 1] 2 3 4
str(subset_ROW(array(1:10, c(5, 2)), 2:4))
#>  int [1:3, 1:2] 2 3 4 7 8 9
str(subset_ROW(array(1:10, c(10, 1, 1)), 2:4))
#>  int [1:3, 1, 1] 2 3 4

subset_ROW(data.frame(x = 1:10, y = 10:1), 2:4)
#>   x y
#> 2 2 9
#> 3 3 8
#> 4 4 7

It seems like there should be a way to do this that doesn't require
generating a call with missing arguments, but I can't think of it.

Thanks!

Hadley

--
https://urldefense.proofpoint.com/v2/url?u=http-3A__hadley.nz=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=MF0DzYDiaYtcFXIyQwpQKs9lVbLNvdBBUubTv7BVAfM=GSpoAzc1Kn_BnTIkDh0HBFGKtRm-xFodxEPOejriC9Q=

__
R-devel@r-project.org mailing list
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Ddevel=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=MF0DzYDiaYtcFXIyQwpQKs9lVbLNvdBBUubTv7BVAfM=HsEbNAT5IElAUS-W2VVSeJs4tfQc77heV7BbQxru518=



__
R-devel@r-project.org mailing list
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Ddevel=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=MF0DzYDiaYtcFXIyQwpQKs9lVbLNvdBBUubTv7BVAfM=HsEbNAT5IElAUS-W2VVSeJs4tfQc77heV7BbQxru518=



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Bioc-devel] build cache on bioconductor build system?

2018-05-30 Thread Hervé Pagès


Hi Vivek,

Are you submitting a package and bumped its version in order to trigger
a new build? The build system used for the submission is the "Single
Package Builder" (SPB) and is not the same as the main builder used
for the daily builds (BBS). It would help if you could provide the link
to the build report where you see the error or at least the name of
your package.

On 05/30/2018 07:30 AM, Vivek Bhardwaj wrote:


The package throwing the error is VariantAnnotation (coming from function:
locateVariants). I checked its build for the dev branch and indeed it's
broken.

All right then I would remove the version specification from my package and
I guess I'll wait for VariantAnnotation to get fixed.


FWIW the latest version of VariantAnnotation in devel is 1.27.1:

  https://bioconductor.org/packages/3.8/bioc/html/VariantAnnotation.html

and it passes BUILD and CHECK on all platforms today.

The SPB sometimes needs to install dependencies that on rare
occasions can get stale a few days later. So in a sense yes it can
cache packages. Lori, our SPB expert, will be able to help you with
this if you provide the name of your package. Thanks!

H.



Best,
Vivek



On 05/30/2018 04:04 PM, Martin Morgan wrote:

There is no cache; like users you get the 'current' version of the
package from the relevant repository. If your package used
concatenateObjects, then it needs to be updated. If your package is
broken through a third package, it needs to be fixed (perhaps it has
been but did not propagate, see the build reports for packages in the
devel branch 
https://urldefense.proofpoint.com/v2/url?u=http-3A__bioconductor.org_checkResults_=DwIDaQ=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=XilqVN5SNnjEa4_siRsVQPa3EsNEK53OKM7n156DDnc=nghjJ8CDqIvM9dcgTCxhGrxP3XlnGhurWc5aREDAIyk=
 , and look for the
stoplight at the right).

Martin

On 05/30/2018 09:56 AM, Vivek Bhardwaj wrote:

Hi All

Is there any build cache on bioc build system? I updated S4Vectors and
IRanges in my dependencies and triggered a new build of my package, only
to find that this broke other dependencies due to renaming of function
`concatenateObjects` to `bindRows` in S4Vectors. Now I removed the
mentioned versions in my DESCRIPTION, but the build is still broken.
Also tried specifying a particular version of these packages and I get
the error:

** byte-compile and prepare package for lazy loading
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
  namespace 'IRanges' 2.15.13 is already loaded, but == 2.14.10 is required

How do I refresh the cache?
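
The loadNamespace() error quoted above is what an exact version pin looks like at load time: the build machine already has a newer IRanges than the one requested with "==". As a minimal sketch (base R only; the variable names are illustrative and the DESCRIPTION entries in the comments are assumptions about how the pin was written), the two kinds of constraint can be compared with package_version():

```r
# Version strings taken from the error message above.
loaded   <- package_version("2.15.13")  # version found on the build machine
required <- package_version("2.14.10")  # version named in the pin

# An exact pin, e.g. "IRanges (== 2.14.10)", stops matching as soon as
# the repository moves on to a newer version.
exact_pin_ok <- loaded == required

# A minimum-version requirement, e.g. "IRanges (>= 2.14.10)", keeps
# matching later releases.
minimum_ok <- loaded >= required

print(c(exact_pin_ok = exact_pin_ok, minimum_ok = minimum_ok))
```

This only illustrates the comparison; it says nothing about whether the newer version is API-compatible with the package being built.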









--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] How best to remap S4Vectors::Hits indices?

2018-05-25 Thread Hervé Pagès


Hi Pariksheet,

On 05/22/2018 04:57 PM, Pariksheet Nanda wrote:

Hi folks,

I'm working on a package that does some trivial GRanges position
classifications; primarily to standardize nomenclature according to the
literature in workflows.

The API for S4Vectors::Hits() generally doesn't seem amenable to modifying
Hits objects, except for the remapHits() feature (which I see under
the covers really generates a new Hits object).


Exactly. And that is the case for any object in R that is not a
reference object (i.e. that is not an environment, external pointer,
or reference class instance). Modifying it always generates a new
object. For example replacing a column of a data frame with
my_df$foo <- value or a slot of an S4 object with my_object@foo <- value
generates a new object. So adding setter methods for Hits objects
wouldn't change that.
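
A minimal sketch of that distinction, using base R and the methods package only. The class name "Toy" and all variable names are made up for the illustration:

```r
library(methods)

# A data frame is an ordinary (non-reference) object: modifying a copy
# leaves the original untouched.
my_df <- data.frame(foo = 1:3)
my_df2 <- my_df
my_df2$foo <- 4:6
original_unchanged <- identical(my_df$foo, 1:3)

# The same holds for a slot of an S4 object ("Toy" is a hypothetical class).
setClass("Toy", representation(foo = "numeric"))
my_object <- new("Toy", foo = 1)
my_object2 <- my_object
my_object2@foo <- 2
slot_unchanged <- identical(my_object@foo, 1)

# An environment is a reference object: both names refer to the same thing,
# so the change made through 'env2' is visible through 'env'.
env <- new.env()
env$foo <- 1
env2 <- env
env2$foo <- 2
env_shared <- identical(env$foo, 2)

print(c(original_unchanged, slot_unchanged, env_shared))
```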

The only reason we don't provide from()/queryHits() or
to()/subjectHits() setters is because we've not been able to identify
use cases that justify having them so far. For those use cases where
the 'from' and 'to' slots both need to be modified (in an atomic way),
calling the Hits() constructor to generate a new object does the job.



I was hoping someone could take a quick look at a short function I'm using
to subset and reindex Hits in the da_tss() function:
https://github.com/coregenomics/nascentrna/blob/a2d9d10564c3a88759237b56ec49d0d3e73f6d16/R/classify.R#L70
Yes, to illustrate the problem I'm having, I've directly used the @-style
S4 access which is, of course, a terrible thing to do because it defeats
the purpose of S4 object validation, which is why I'm e-mailing the list
for an alternative.  I feel like casting to something like a data.frame,
changing the indices, and changing back to Hits would be wasteful and
an improper use of the Bioconductor framework?


No need to cast the object to a data.frame. That would indeed be
wasteful. Just compute the new 'from' and 'to' vectors then do
'Hits(from, to, nLnode=nLnode(hits), nRnode=nRnode(hits))'
to create the modified object ('hits' being the original object).
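
For the use case described in the question (search on a subset, then report indices relative to the original object), this could look roughly as follows. It is only a sketch: 'keep', 'n_query' and the toy Hits object are invented for the example, and because the 'from' indices are mapped back to the original object, nLnode is set to the original length rather than to nLnode(hits).

```r
library(S4Vectors)

# Hypothetical setup: 'keep' holds the positions, in the original query,
# of the elements that were actually compared.
n_query <- 5L            # length of the original query
keep <- c(2L, 4L, 5L)    # subset that was searched

# Stand-in for the result of the search on the subset: 'from' indexes
# into the subset (1..3), not into the original query.
hits <- Hits(from = c(1L, 3L, 3L), to = c(2L, 1L, 2L),
             nLnode = length(keep), nRnode = 2L)

# Map 'from' back to the original indices and build a new object.
remapped <- Hits(keep[from(hits)], to(hits),
                 nLnode = n_query, nRnode = nRnode(hits))

remapped
```

No slot is touched with @, so the new object goes through the constructor's own checks.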

Hope this helps,
H.



Here are the corresponding tests that run the da_tss() function:
https://github.com/coregenomics/nascentrna/blob/a2d9d10564c3a88759237b56ec49d0d3e73f6d16/tests/testthat/test-classifiers.R

What it comes down to is this:
I want to compare a subset of GRanges for hits, but revert to the original
GRanges indices when returning the results.

Thanks for any advice!
Pariksheet




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Build error in tokay2: cannot reserve space for vector

2018-05-21 Thread Hervé Pagès


On 05/21/2018 05:50 AM, Martin Morgan wrote:
Remember that 32-bit Windows can only address vectors that are less than 
2^32 - 1 elements long -- it looks like your example is trying to do 
more than this, and the solution is to implement a more modest example.


32-bit Windows limits the amount of memory used by a single process to
a little bit less than 3GB. So I think you can actually create vectors
that have more than 2^32 elements as long as the total memory used by
R doesn't exceed 3GB.

One more thing: since this is a 32-bit Windows issue, you should be able
to reproduce this locally by starting R in 32-bit mode:

  R --arch i386
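
Once R is started, a quick way to confirm which build is running is to look at the pointer size. This is a small base-R sketch (the variable names are arbitrary); note that the --arch switch only helps where a 32-bit build is installed alongside the 64-bit one:

```r
# TRUE when the running R is a 32-bit build (4-byte pointers).
is_32bit <- .Machine$sizeof.pointer == 4L

# Architecture string of the running build, e.g. "i386" or "x86_64".
arch <- R.version$arch

cat("32-bit build:", is_32bit, "\narch:", arch, "\n")
```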

Hope this helps,

H.



Martin

On 05/18/2018 04:05 AM, Sergio Picart Armada wrote:


Dear Bioconductor team,

I'm the maintainer of the FELLA package.
Lately the check in tokay2 has failed, see
http://bioconductor.org/checkResults/release/bioc-LATEST/FELLA/tokay2-checksrc.html

Specifically:

Message: At vector.pmt:442 : cannot reserve space for vector, Out of memory
Class:   simpleError/error/condition

It only happens in tokay2 and I cannot reproduce this locally.

Is this something on the server side or should I take action?

Thank you,
Sergi




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] modify _R_CHECK_FORCE_SUGGESTS_ ?

2018-05-21 Thread Hervé Pagès


Hi Vivek,

I'd say don't worry about R CMD check failing with the Single Package
Builder on Windows for now. When your package gets accepted, we'll
set _R_CHECK_FORCE_SUGGESTS_ to FALSE on Windows so your package will
be supported on this platform despite having Rsubread in Suggests.

Note that this is not unprecedented: we're already doing this for
VariantTools (suggests gmapR, which is not supported on Windows
either) and singleCellTK (suggests Rsubread).

The way we do this is by adding a .BBSoptions file to the package
source tree with the following line in it:

  CHECKprepend.win: set _R_CHECK_FORCE_SUGGESTS_=0&&

Of course, you'll have to make sure that the code in the man pages,
vignettes and unit tests can run without Rsubread being installed.
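
One common way to do that is to guard the optional code with requireNamespace(). The helper below is a hypothetical sketch, not something provided by BBS or by Rsubread, and the body passed to it is a placeholder for the real mapping call:

```r
# Run 'expr' only when the suggested package 'pkg' is installed;
# otherwise emit a message and return NULL invisibly.
# 'with_suggested' is a made-up helper name.
with_suggested <- function(pkg, expr) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; skipping this step.")
    return(invisible(NULL))
  }
  expr  # lazily evaluated, so it only runs when 'pkg' is available
}

# In an example, vignette chunk or unit test:
result <- with_suggested("Rsubread", {
  # code that calls into Rsubread would go here
  "mapping step ran"
})
```

Because 'expr' is a lazily evaluated argument, nothing inside the braces is executed on a machine where the suggested package is missing.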

Cheers,
H.


On 05/19/2018 02:49 AM, Bhardwaj, Vivek wrote:

I think one option for me would be to switch to Rbowtie instead of Rsubread,
since this seems to have a Windows binary. I would try this as soon as I have
access to a Windows system.




Vivek Bhardwaj
PhD Candidate | International Max Planck Research School
Max Planck Institute of Immunobiology and Epigenetics
Stübeweg 51, Freiburg

From: Bhardwaj, Vivek
Sent: Saturday, May 19, 2018 10:35 AM
To: Michael Lawrence; Martin Morgan
Cc: bioc-devel@r-project.org
Subject: Re: [Bioc-devel] modify _R_CHECK_FORCE_SUGGESTS_ ?


@All thanks for your comments

I have provided the Rsubread mapping wrapper as an optional step in the pipeline. Users can
continue the analysis after mapping with their tool of choice as well. Therefore I added
it to "Suggests", skip running it in the example, and don't evaluate that
chunk in the vignette.

I wouldn't want to skip supporting Windows only because of an optional step,
but I also prefer to keep that function as it allows users to perform
end-to-end analysis in R if they wish to.

What would be the best option for me for now?



On 5/19/18 12:00 AM, Michael Lawrence wrote:

On Fri, May 18, 2018 at 2:26 PM, Martin Morgan
<martin.mor...@roswellpark.org> wrote:


You can create a plain text file in the root directory of your package
.BBSoptions with the line

   UnsupportedPlatform: win

Your package will not be available on Windows, losing about 1/2 your
potential audience.


I'm not sure how many people will endeavor to run that part of the
icetea pipeline on their Windows laptop. What about separating the
preprocessing and downstream exploratory stuff into two packages?



A better strategy is to figure out why you are
Suggests:'ing Rsubread, and find alternative cross-platform solutions.

I would not hesitate to add your voice to mine in asking the Rsubread
maintainer to make their package cross-platform compatible. While this
requires considerable work, it would benefit the Bioconductor community in
this and subsequent years.

Martin

On 05/18/2018 04:55 PM, Bhardwaj, Vivek wrote:


Hi All


My package is in review and the build is failing since a suggested package
(Rsubread) is not available on windows. Is there a way for me to instruct
the build machine on bioc to use: _R_CHECK_FORCE_SUGGESTS_ = FALSE ??



Best,

Vivek



Vivek Bhardwaj
PhD Candidate | International Max Planck Research School
Max Planck Institute of Immunobiology and Epigenetics
Stübeweg 51, Freiburg




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___

Re: [Bioc-devel] Batch Package Submission

2018-05-17 Thread Hervé Pagès


Hi Paolo,

On 05/17/2018 08:38 AM, Paolo Martini wrote:

Dear list,
I am about to submit a new package.
For my convenience (code reuse across other packages that I have written) I
split my code into three separate packages: one for general utilities, one
for tasks specific to omics data and one specifically for integrating
multiple omics.

Should I submit the three of them as a batch or individually?


Individually. Since these packages depend on each other, submit
the package that the others depend on first. You might actually
want to wait a little bit and see how the review goes for that
first package before you submit the others. Maybe the review
process of this first package will trigger important changes
to it that will also require significant refactoring of the
other packages. So it seems to me that the overall process might
be smoother if you do one package at a time.

Please also provide links to the other packages (as a comment
in the issue tracker) after you submit the first package. This
will provide some context to the review.

Thanks,
H.



Thanks a lot



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Rd] Dispatch mechanism seems to alter object before calling method on it

2018-05-16 Thread Hervé Pagès


On 05/16/2018 01:24 PM, Michael Lawrence wrote:

On Wed, May 16, 2018 at 12:23 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:

On 05/16/2018 10:22 AM, Michael Lawrence wrote:


Factors and data.frames are not structures, because they must have a
class attribute. Just call them "objects". They are higher level than
structures, which in practice just shape data without adding a lot of
semantics. Compare getClass("matrix") and getClass("factor").

I agree that inheritance through explicit coercion is confusing. As
far as I know, there are only 2 places where it is used:
1) Objects with attributes but no class, basically "structure" and its
subclasses "array" <- "matrix"
2) Classes that extend a reference type ("environment", "name" and
"externalptr") via hidden delegation (@.xData)

I'm not sure if anyone should be doing #2. For #1, a simple "fix"
would be just to drop inheritance of "structure" from "vector". I
think the intent was to mimic base R behavior, where it will happily
strip (or at least ignore) attributes when passing an array or matrix
to an internal function that expects a vector.

A related problem, which explains why factor and data.frame inherit
from "vector" even though they are objects, is that any S4 object
derived from those needs to be (for pragmatic compatibility reasons)
an integer vector or list, respectively, internally (the virtual
@.Data slot). Separating that from inheritance would probably be
difficult.

Yes, we can consider these to be problems, to some extent stemming
from the behavior and design of R itself, but I'm not sure it's worth
doing anything about them at this point.



Thanks for the informative discussion. It still doesn't explain
why 'm' gets its attributes stripped and 'x' does not though:

   m <- matrix(1:12, ncol=3)
   x <- structure(1:3, titi="A")

   setGeneric("foo", function(x) standardGeneric("foo"))
   setMethod("foo", "vector", identity)

   foo(m)
   # [1]  1  2  3  4  5  6  7  8  9 10 11 12

   foo(x)
   # [1] 1 2 3
   # attr(,"titi")
   # [1] "A"

If I understand correctly, both are "structures", not "objects".



The structure 'x' has no class, so nothing special is going to happen.
As you know, S4 has a well-defined class hierarchy. Just look at
getClass("structure") to see its subclasses. There was at some point
an attempt to create a sort of dynamic inheritance, where a 'test'
function would be called and could figure this out. However, that was
never implemented. For one thing, it would be even more confusing.


Why aren't these problems worth fixing? More generally speaking
the erratic behavior of the S4 system with respect to S3 objects
has been a plague since the beginning of the methods package.
And many people have complained about this on many occasions in
one way or another. For the record, here are some of the most
notorious problems:

   class(as.numeric(1:4))
   # [1] "numeric"
   class(as(1:4, "numeric"))
   # [1] "integer"



This is not really a problem with the methods package. is.numeric(1L)
is TRUE, thus integer extends numeric, so coercing an integer to
numeric is a no-op.


Only as(1:4, "numeric", strict=FALSE) should be a no-op.
as(1:4, "numeric") should still coerce because as() is supposed
to perform strict coercion by default.


as.numeric() should really be called as.double()
or something. But that's not going to change, of course.


as.numeric() is doing the right thing (i.e. strict coercion) so there
is no need to touch it.




   is.vector(matrix())
   # [1] FALSE
   is(matrix(), "vector")
   # [1] TRUE



We already discussed this in the context of "structure" inheriting
from "vector" and explicit coercion.


   is.list(data.frame())
   # [1] TRUE
   is(data.frame(), "list")
   # [1] FALSE
   extends("data.frame", "list")
   # [1] TRUE



This is a compromise for compatibility with inherits(), since the
result of data.frame() is an S3 object.


So we should add to the list that inherits(data.frame(), "list") is
broken too. Once it gets fixed, is(data.frame(), "list") won't need
to compromise anymore and will be free to return the correct answer.





   is(data.frame(), "vector")
   # [1] FALSE
   is(data.frame(), "factor")
   # [1] FALSE
   is(data.frame(), "vector_OR_factor")
   # [1] TRUE



The question is: which inheritance to follow, S3 or S4? Since "vector"
is a basic class, inheritance follows S3 rules. But the class union is
an S4 class, so it follows S4 rules.


   etc...

Many people stay away from S4 because of these incomprehensible
behaviors.

Finally note that even pure S3 operations can p

Re: [Rd] Dispatch mechanism seems to alter object before calling method on it

2018-05-16 Thread Hervé Pagès


On 05/16/2018 10:22 AM, Michael Lawrence wrote:

Factors and data.frames are not structures, because they must have a
class attribute. Just call them "objects". They are higher level than
structures, which in practice just shape data without adding a lot of
semantics. Compare getClass("matrix") and getClass("factor").

I agree that inheritance through explicit coercion is confusing. As
far as I know, there are only 2 places where it is used:
1) Objects with attributes but no class, basically "structure" and its
subclasses "array" <- "matrix"
2) Classes that extend a reference type ("environment", "name" and
"externalptr") via hidden delegation (@.xData)

I'm not sure if anyone should be doing #2. For #1, a simple "fix"
would be just to drop inheritance of "structure" from "vector". I
think the intent was to mimic base R behavior, where it will happily
strip (or at least ignore) attributes when passing an array or matrix
to an internal function that expects a vector.

A related problem, which explains why factor and data.frame inherit
from "vector" even though they are objects, is that any S4 object
derived from those needs to be (for pragmatic compatibility reasons)
an integer vector or list, respectively, internally (the virtual
@.Data slot). Separating that from inheritance would probably be
difficult.

Yes, we can consider these to be problems, to some extent stemming
from the behavior and design of R itself, but I'm not sure it's worth
doing anything about them at this point.


Thanks for the informative discussion. It still doesn't explain
why 'm' gets its attributes stripped and 'x' does not though:

  m <- matrix(1:12, ncol=3)
  x <- structure(1:3, titi="A")

  setGeneric("foo", function(x) standardGeneric("foo"))
  setMethod("foo", "vector", identity)

  foo(m)
  # [1]  1  2  3  4  5  6  7  8  9 10 11 12

  foo(x)
  # [1] 1 2 3
  # attr(,"titi")
  # [1] "A"

If I understand correctly, both are "structures", not "objects".

Why aren't these problems worth fixing? More generally speaking
the erratic behavior of the S4 system with respect to S3 objects
has been a plague since the beginning of the methods package.
And many people have complained about this on many occasions in
one way or another. For the record, here are some of the most
notorious problems:

  class(as.numeric(1:4))
  # [1] "numeric"
  class(as(1:4, "numeric"))
  # [1] "integer"

  is.vector(matrix())
  # [1] FALSE
  is(matrix(), "vector")
  # [1] TRUE

  is.list(data.frame())
  # [1] TRUE
  is(data.frame(), "list")
  # [1] FALSE
  extends("data.frame", "list")
  # [1] TRUE

  setClassUnion("vector_OR_factor", c("vector", "factor"))
  is(data.frame(), "vector")
  # [1] FALSE
  is(data.frame(), "factor")
  # [1] FALSE
  is(data.frame(), "vector_OR_factor")
  # [1] TRUE

  etc...

Many people stay away from S4 because of these incomprehensible
behaviors.

Finally note that even pure S3 operations can produce output that
doesn't make sense:

  is.list(data.frame())
  # [1] TRUE
  is.vector(list())
  # [1] TRUE
  is.vector(data.frame())
  # [1] FALSE

  (that is: a data frame is a list and a list is a vector but
  a data frame is not a vector!)
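
For what it's worth, the last result follows from the documented definition of is.vector(): it returns TRUE only for objects that carry no attributes other than names. A short base-R sketch (variable names are arbitrary):

```r
x <- 1:3
plain <- is.vector(x)                  # TRUE: no attributes at all

named <- is.vector(c(a = 1, b = 2))    # TRUE: only a names attribute

y <- structure(1:3, titi = "A")
with_attr <- is.vector(y)              # FALSE: extra attribute 'titi'

# A data frame is a list that carries 'names', 'class' and 'row.names'
# attributes, hence is.list() is TRUE but is.vector() is FALSE.
df_attrs <- names(attributes(data.frame(a = 1)))

print(list(plain = plain, named = named, with_attr = with_attr,
           df_attrs = df_attrs))
```

So the three answers are consistent with the documentation, even if the naming makes them read like a contradiction.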

Why aren't these problems taken more seriously?

Thanks,
H.



Michael

On Wed, May 16, 2018 at 8:33 AM, Hervé Pagès <hpa...@fredhutch.org> wrote:

On 05/15/2018 09:13 PM, Michael Lawrence wrote:


My understanding is that array (or any other structure) does not
"simply" inherit from vector, because structures are not vectors in
the strictest sense. Basically, once a vector gains attributes, it is
a structure, not a vector. The methods package accommodates this by
defining an "is" relationship between "structure" and "vector" via an
"explicit coerce", such that any "structure" passed to a "vector"
method is first passed to as.vector(), which strips attributes. This
is very much by design.



It seems that the problem is really with matrices and arrays, not
with "structures" in general:

   f <- factor(c("z", "x", "z"), levels=letters)
   m <- matrix(1:12, ncol=3)
   df <- data.frame(f=f)
   x <- structure(1:3, titi="A")

Only the matrix loses its attributes when passed to a "vector"
method:

   setGeneric("foo", function(x) standardGeneric("foo"))
   setMethod("foo", "vector", identity)

   foo(f) # attributes are preserved
   # [1] z x z
   # Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z

   foo(m) # attributes are stripped
   #

Re: [Rd] Dispatch mechanism seems to alter object before calling method on it

2018-05-16 Thread Hervé Pagès


On 05/15/2018 09:13 PM, Michael Lawrence wrote:

My understanding is that array (or any other structure) does not
"simply" inherit from vector, because structures are not vectors in
the strictest sense. Basically, once a vector gains attributes, it is
a structure, not a vector. The methods package accommodates this by
defining an "is" relationship between "structure" and "vector" via an
"explicit coerce", such that any "structure" passed to a "vector"
method is first passed to as.vector(), which strips attributes. This
is very much by design.


It seems that the problem is really with matrices and arrays, not
with "structures" in general:

  f <- factor(c("z", "x", "z"), levels=letters)
  m <- matrix(1:12, ncol=3)
  df <- data.frame(f=f)
  x <- structure(1:3, titi="A")

Only the matrix loses its attributes when passed to a "vector"
method:

  setGeneric("foo", function(x) standardGeneric("foo"))
  setMethod("foo", "vector", identity)

  foo(f) # attributes are preserved
  # [1] z x z
  # Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z

  foo(m) # attributes are stripped
  # [1]  1  2  3  4  5  6  7  8  9 10 11 12

  foo(df)# attributes are preserved
  #   f
  # 1 z
  # 2 x
  # 3 z

  foo(x) # attributes are preserved
  # [1] 1 2 3
  # attr(,"titi")
  # [1] "A"

Also if structures are passed to as.vector() before being passed to
a "vector" method, shouldn't as.vector() and foo() be equivalent on
them? For 'f' and 'x' they're not:

  as.vector(f)
  # [1] "z" "x" "z"

  as.vector(x)
  # [1] 1 2 3

Finally note that for factors and data frames the "vector" method gets
selected despite the fact that is( , "vector") is FALSE:

  is(f, "vector")
  # [1] FALSE

  is(m, "vector")
  # [1] TRUE

  is(df, "vector")
  # [1] FALSE

  is(x, "vector")
  # [1] TRUE

Couldn't we recognize these problems as real, even if they are by
design? Hopefully we can all agree that:
- the dispatch mechanism should only dispatch, not alter objects;
- is() and selectMethod() should not contradict each other.
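
The first point can be checked mechanically. The sketch below reuses the foo() generic from this thread and reports, for each test object, whether it comes back from the "vector" method unchanged. The list 'objs' and the tryCatch() wrapper are additions for the illustration, and no expected output is given because the result may depend on the R version:

```r
library(methods)

setGeneric("foo", function(x) standardGeneric("foo"))
setMethod("foo", "vector", function(x) x)

objs <- list(
  factor     = factor(c("z", "x", "z"), levels = letters),
  matrix     = matrix(1:12, ncol = 3),
  data.frame = data.frame(f = c("z", "x", "z")),
  structure  = structure(1:3, titi = "A")
)

# TRUE where the object reaches the method (and comes back) unchanged,
# FALSE where dispatch altered it, NA where no method could be selected.
preserved <- vapply(
  objs,
  function(obj) tryCatch(identical(foo(obj), obj), error = function(e) NA),
  logical(1)
)

print(preserved)
```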

Thanks,
H.



Michael


On Tue, May 15, 2018 at 5:25 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:

Hi,

This was quite unexpected:

   setGeneric("foo", function(x) standardGeneric("foo"))

   setMethod("foo", "vector", identity)

   foo(matrix(1:12, ncol=3))
   # [1]  1  2  3  4  5  6  7  8  9 10 11 12

   foo(array(1:24, 4:2))
   # [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

If I define a method for array objects, things work as expected though:

   setMethod("foo", "array", identity)

   foo(matrix(1:12, ncol=3))
   #      [,1] [,2] [,3]
   # [1,]    1    5    9
   # [2,]    2    6   10
   # [3,]    3    7   11
   # [4,]    4    8   12

So, luckily, I have a workaround.

But shouldn't the dispatch mechanism stay away from the business of
altering the objects passed to it?

Thanks,
H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Bioc-devel] [Fwd: Problem showing a GRanges with a meta data column of data frame]

2018-05-15 Thread Hervé Pagès


Hi Jialin,

Thanks for the report. This should be fixed in BioC 3.7 (S4Vectors
0.18.2 and IRanges 2.14.10) and BioC devel (S4Vectors 0.19.3 and
IRanges 2.15.11).

The updated packages will become available via biocLite() in the
next 24h (if everything goes as expected).

Best,
H.

On 05/14/2018 07:38 PM, Jialin Ma wrote:

Sorry, forgot to attach the session info:

sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-suse-linux-gnu (64-bit)
Running under: openSUSE Tumbleweed

Matrix products: default
BLAS: /usr/lib64/R/lib/libRblas.so
LAPACK: /usr/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] GenomicRanges_1.32.2 GenomeInfoDb_1.16.0  IRanges_2.14.8
[4] S4Vectors_0.18.1     BiocGenerics_0.26.0  magrittr_1.5

loaded via a namespace (and not attached):
[1] zlibbioc_1.26.0        compiler_3.5.0         XVector_0.20.0
[4] tools_3.5.0            GenomeInfoDbData_1.1.0 RCurl_1.95-4.10
[7] yaml_2.1.19            bitops_1.0-6


 Forwarded Message 
From: Jialin Ma <marl...@gmx.cn>
To: bioc-devel <bioc-devel@r-project.org>
Subject: Problem showing a GRanges with a meta data column of data
frame
Date: Tue, 15 May 2018 10:17:35 +0800
Mailer: Evolution 3.24.4


Hi all,

I recently upgraded R to 3.5 and use the new release of Bioconductor.
The new version of S4Vectors seems to have problem showing a GRanges
with a meta-data column of a data frame. One simplified example is:

gr <- GRanges('chr2', IRanges(1, 11))
gr$df <- data.frame(a = 32)

rep(gr, 11) ## No error

rep(gr, 12) ## Error when printing
# GRanges object with 12 ranges and 1 metadata column:
# Error in .Call2("vector_OR_factor_extract_ranges", x, start, width,
PACKAGE = "S4Vectors") :
#   'end' must be <= 'length(x)'

Any help would be appreciated!

Best regards,
Jialin







--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] BSGenome submission.

2018-05-14 Thread Hervé Pagès


Yes the .tar.gz file. Thanks!  H.

On 05/14/2018 11:37 AM, Jose Die wrote:

Hi Hervé,

You mean the .tar.gz file?
Sorry, first time I’m doing this.

Jose




El 14/5/2018, a las 1:33, Hervé Pagès <hpa...@fredhutch.org> escribió:

Hi Jose,

Contributed BSgenome packages get added to the repository of
annotation packages rather than to AnnotationHub (think of
AnnotationHub as a repository of files, not packages). Please
make these BSgenome packages available somewhere and provide us
with a link so we can take a look at them. They will need to go
thru a quick review before acceptance. Thanks!

H.


On 05/11/2018 01:39 PM, Jose Die wrote:

Hello.
I have built two BSgenome packages using the BSgenome package and I'd like to distribute them.
Please, could anyone give me some information on how to submit them to the 
AnnotationHub?
Thanks,
Jose






--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] GenomicRanges List subclass and apply

2018-05-14 Thread Hervé Pagès


Hi Jack,

You can use

  sapply(seq_along(gr), function(i) print(gr[i]))

instead of

  sapply(gr, print)

But yes, as Michael noted, looping on a GRanges or IRanges object
is generally not efficient and should be avoided. There is almost
always a "vectorized" solution and it's generally much faster.
However, depending on what you are trying to do exactly, coming up
with a "vectorized" solution can be tricky.
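
As a small illustration of the difference, here is the same quantity computed both ways on the two-range object from the original post. width() is used only as a convenient example of a vectorized accessor; the variable names are arbitrary:

```r
library(GenomicRanges)

gr <- GRanges(1, IRanges(1:2, 3:4))

# Element by element: subset with gr[i] inside the loop.
loop_widths <- sapply(seq_along(gr), function(i) width(gr[i]))

# Vectorized: one call on the whole object.
vec_widths <- width(gr)

identical(loop_widths, vec_widths)
```

On an object with millions of ranges the loop version creates one small GRanges per iteration, which is where most of the time goes.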

Cheers,
H.

On 05/14/2018 07:28 AM, Jack Fu wrote:

Hey all,

I think some of the recent changes to GRanges have affected using the
apply family of functions with GRanges objects:

   o GenomicRanges now is a List subclass. This means that GRanges objects
     and their derivatives are now considered list-like objects (even though
     [[ don't work on them yet, this will be implemented in Bioconductor 3.8).


The following code will throw:
gr <- GRanges(1, IRanges(1:2, 3:4))
sapply(gr, print)

Error in (function (classes, fdef, mtable)  :
unable to find an inherited method for function 'getListElement' for
signature '"GRanges"'

Access using gr[1], gr[1:2] still works normally.
Are there any recommendations on a workaround for this issue without
resorting back to for loops?

Thanks all,
Jack


___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] BSGenome submission.

2018-05-13 Thread Hervé Pagès


Hi Jose,

Contributed BSgenome packages get added to the repository of
annotation packages rather than to AnnotationHub (think of
AnnotationHub as a repository of files, not packages). Please
make these BSgenome packages available somewhere and provide us
with a link so we can take a look at them. They will need to go
thru a quick review before acceptance. Thanks!

H.


On 05/11/2018 01:39 PM, Jose Die wrote:

Hello.

I built two BSgenome packages using BSgenome and I'd like to distribute them.

Please, could anyone give me some information on how to submit them to
AnnotationHub?

Thanks,
Jose






___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] CRAN R 3.5 binary - where is it?

2018-05-10 Thread Hervé Pagès


On 05/10/2018 12:13 PM, Kenneth Condon wrote:

Hi all,

It sounds like the best thing to do is go back to the original binary and
remove the source installation.

Clarice, you know I can't even remember if I did that - I think I might try
to fix the current dependency issue tomorrow, but if any more come up, I'll go
back to the beginning and do as you suggested with the binary again.

Herve, the link I provided has R 3.4.4 .deb binaries from 2018 - not sure
where you are getting the 2014 from.


According to the link I sent you, CRAN provides R .deb packages for the
following versions of Ubuntu: Artful Aardvark (17.10), Zesty Zapus
(17.04), Xenial Xerus (16.04; LTS), and Trusty Tahr (14.04; LTS).

You provided a link to .deb packages for Ubuntu Trusty. Ubuntu Trusty
is from 2014, so it is somewhat outdated, with an EOL scheduled for next
year I think.

Anyway, I don't see any R 3.5.0 .deb package on CRAN for any of
these Ubuntu versions yet. Please make sure to check the link I sent
you. It tells you what to do if you have questions about those packages.

Cheers,
H.



Thanks,
Kenneth

On Thu, May 10, 2018 at 7:43 PM, Clarice Groeneveld <
clari.groenev...@gmail.com> wrote:


Hi Kenneth,

I see you're on Ubuntu 14.04. Have you tried in Terminal:
sudo apt-get update
sudo apt-get upgrade

to upgrade your R version?

Best,
Clarice.

On Thu, May 10, 2018 at 14:20, Kenneth Condon <roonysga...@gmail.com>
wrote:


Hi all,

I want to submit a package to bioconductor this week but first I want to
build it on the latest R 3.5 release (with bioconductor 3.7). However, CRAN
only has 3.4 binaries
<https://cran.r-project.org/bin/linux/ubuntu/trusty/?C=N;O=A> so I've had
to build 3.5 from source which has unfortunately sent me to dependency
hell.

Rather than spending the next 3 days fixing all dependencies, does anyone
know when the 3.5 dependency is likely to surface?

I think bioconductor 3.7 cannot be installed on R 3.4 so any advice would
be appreciated.

Thanks,
Kenneth.


___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] CRAN R 3.5 binary - where is it?

2018-05-10 Thread Hervé Pagès


Hi Kenneth,

Given the link you're providing, it looks like you are looking
for an R 3.5 package for Ubuntu Trusty (which is from 2014).
Please see https://cran.r-project.org/bin/linux/ubuntu/ for
the best place to ask this.

Cheers,
H.

On 05/10/2018 10:20 AM, Kenneth Condon wrote:

Hi all,

I want to submit a package to bioconductor this week but first I want to
build it on the latest R 3.5 release (with bioconductor 3.7). However, CRAN
only has 3.4 binaries
<https://cran.r-project.org/bin/linux/ubuntu/trusty/?C=N;O=A> so I've had
to build 3.5 from source which has unfortunately sent me to dependency hell.

Rather than spending the next 3 days fixing all dependencies, does anyone
know when the 3.5 dependency is likely to surface?

I think bioconductor 3.7 cannot be installed on R 3.4 so any advice would
be appreciated.

Thanks,
Kenneth.


___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Rd] length of `...`

2018-05-08 Thread Hervé Pagès


Thanks Martin for the clarifications.  H.

On 05/04/2018 06:02 AM, Martin Maechler wrote:

Hervé Pagès <hpa...@fredhutch.org>
 on Thu, 3 May 2018 08:55:20 -0700 writes:


 > Hi,
 > It would be great if one of the experts could comment on the
 > difference between Hadley's dotlength and ...length? The fact
 > that someone bothered to implement a new primitive for that
 > when there seems to be a very simple and straightforward R-only
 > solution suggests that there might be some gotchas/pitfalls with
 > the R-only solution.

Namely


dotlength <- function(...) nargs()



(This is subtly different from calling nargs() directly as it will
only count the elements in ...)



Hadley



Well,  I was the "someone".  In the past I had seen (and used myself)

length(list(...))

and of course that was not usable.
I knew of some substitute() / match.call() tricks [but I think
did not know Bill's cute substitute(...()) !] at the time, but
found them too esoteric.

Additionally and importantly,  ...length()  and  ...elt(n)  were
developed  "synchronously",  and the R-substitutes for ...elt()
definitely are less trivial (I did not find one at the time), as
Duncan's example to Bill's proposal has shown, so I had looked
at .Primitive() solutions of both.

In hindsight I should have asked here for advice,  but maybe at
the time I had been a bit frustrated by the results of some of
my RFCs ((nothing specific in mind !))

But __if__ there's really no example where current (3.5.0 and newer)

   ...length()

differs from Hadley's  dotlength()
I'd be very happy to replace ...length 's C based definition by
Hadley's beautiful minimal solution.

Martin



Re: [Bioc-devel] class name collision in cache: igvR and Gviz

2018-05-04 Thread Hervé Pagès


Hi Paul,

Luckily you caught this only 3 days after the release so renaming the
class now is probably ok and shouldn't be disruptive.
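For what it's worth, a rough sketch of what the rename could look like, using only the methods package. The class names come from Paul's message below; the slots (trackName, tbl) are hypothetical and only there so that the example runs.

```r
library(methods)

# Abstract base class under a package-specific name (the slot is hypothetical).
setClass("igvAnnotationTrack",
         representation = representation("VIRTUAL", trackName = "character"))

# One of the concrete classes from the original message (the 'tbl' slot is hypothetical).
setClass("DataFrameAnnotationTrack",
         contains = "igvAnnotationTrack",
         representation = representation(tbl = "data.frame"))

trk <- new("DataFrameAnnotationTrack",
           trackName = "genes",
           tbl = data.frame(chrom = "chr1", start = 1L, end = 10L))

# The concrete object still "is a" member of the renamed virtual class.
is(trk, "igvAnnotationTrack")
isVirtualClass("igvAnnotationTrack")
```

Since the base class is virtual, user code should rarely need to name it directly, which is part of why a rename this early is unlikely to be disruptive.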

Cheers,
H.


On 05/04/2018 10:52 AM, Paul Shannon wrote:

I just discovered a class name collision - AnnotationTrack, in Gviz and my new
package igvR. I wish to get wise counsel before proceeding with a fix. Here's
the error message:

Found more than one class "AnnotationTrack" in cache;
using the first, from namespace ‘igvR' Also defined by ‘Gviz’

AnnotationTrack is an abstract base class in my new package igvR.  The concrete 
derived classes at present are

   DataFrameAnnotationTrack
   GRangesAnnotationTrack
   UCSCBedAnnotationTrack

It would be easy for me to rename AnnotationTrack to “GenomeAnnotationTrack” or 
even “igvAnnotationTrack”, thereby  avoiding the name clash.

Reasonable?  Fix in both release and devel?

Thanks -

  - Paul


___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Rd] length of `...`

2018-05-03 Thread Hervé Pagès

Hi,

It would be great if one of the experts could comment on the
difference between Hadley's dotlength and ...length? The fact
that someone bothered to implement a new primitive for that
when there seems to be a very simple and straightforward R-only
solution suggests that there might be some gotchas/pitfalls with
the R-only solution.
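To make the comparison concrete, here is a small sketch of the two candidates side by side. It assumes R >= 3.5.0 (for ...length()); primlength() and evaluating() are names made up for this example. It only shows that the two agree on these inputs, not that they agree everywhere.

```r
# Hadley's R-only version: dotlength <- function(...) nargs()
dotlength <- function(...) nargs()

# Thin wrapper around the primitive (requires R >= 3.5.0).
primlength <- function(...) ...length()

# Neither forces its arguments, so the stop() calls are never evaluated.
n1 <- dotlength(stop("one"), stop("two"), stop("three"))
n2 <- primlength(stop("one"), stop("two"), stop("three"))

# By contrast, length(list(...)) evaluates every argument and fails here.
evaluating <- function(...) length(list(...))
res <- tryCatch(evaluating(stop("one"), stop("two")),
                error = function(e) "error")
```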

Thanks,
H.

On 05/03/2018 08:34 AM, Hadley Wickham wrote:

On Thu, May 3, 2018 at 8:18 AM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

On 03/05/2018 11:01 AM, William Dunlap via R-devel wrote:

In R-3.5.0 you can use ...length():
> f <- function(..., n) ...length()
> f(stop("one"), stop("two"), stop("three"), n=7)
[1] 3

Prior to that substitute() is the way to go
> g <- function(..., n) length(substitute(...()))
> g(stop("one"), stop("two"), stop("three"), n=7)
[1] 3

R-3.5.0 also has the ...elt(n) function, which returns
the evaluated n'th entry in ... , without evaluating the
other ... entries.
> fn <- function(..., n) ...elt(n)
> fn(stop("one"), 3*5, stop("three"), n=2)
[1] 15

Prior to 3.5.0, eval the appropriate component of the output
of substitute() in the appropriate environment:
> gn <- function(..., n) {
+   nthExpr <- substitute(...())[[n]]
+   eval(nthExpr, envir=parent.frame())
+ }
> gn(stop("one"), environment(), stop("two"), n=2)

Bill, the last of these doesn't quite work, because ... can be passed down
through a string of callers.  You don't necessarily want to evaluate it in
the parent.frame().  For example:

x <- "global"
f <- function(...) {
   x <- "f"
   g(...)
}
g <- function(...) {
   firstExpr <- substitute(...())[[1]]
   c(list(...)[[1]], eval(firstExpr, envir = parent.frame()))
}

Calling g(x) correctly prints "global" twice, but calling f(x) incorrectly
prints

[1] "global" "f"

You can get the first element of ... without evaluating the rest using ..1,
but I don't know a way to do this for general n in pre-3.5.0 base R.
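For R 3.5.0 and later, the ...elt(n) primitive mentioned earlier sidesteps the parent.frame() problem, because it forces the promise in whatever environment it was created in. A minimal sketch, reusing the f/g names from the example above (the n = 1 default is an addition for illustration):

```r
x <- "global"

# ...elt(n) (R >= 3.5.0) forces only the n-th element of '...'.
g <- function(..., n = 1) ...elt(n)

f <- function(...) {
    x <- "f"   # local x; must not be picked up when '...' is forwarded
    g(...)
}

# The second argument is never evaluated, so stop("!") does not fire.
via_f <- f(x, stop("!"))
via_g <- g(x, stop("!"))
```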

If you don't mind using a package:

# works with R 3.1 and up
library(rlang)

x <- "global"
f <- function(...) {
   x <- "f"
   g(...)
}
g <- function(...) {
   dots <- enquos(...)
   eval_tidy(dots[[1]])
}

f(x, stop("!"))
#> [1] "global"
g(x, stop("!"))
#> [1] "global"

Hadley


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Bioc-devel] Suggested edits to support site posting guide

2018-05-01 Thread Hervé Pagès

owing
link allows edits:

https://docs.google.com/document/d/1baiBUYB8E02KMbaojjoo-tKV3Ctd9sA5lVCvwFIWquA/edit?usp=sharing

best,
Mike


___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Error: node stack overflow

2018-04-30 Thread Hervé Pagès


Great, thanks!

I was curious and did:

hpages@spectre:~/svn/R/R-3-5-branch$ svn diff -r 74669:74674 src/library/methods/R/methodsTable.R

Index: src/library/methods/R/methodsTable.R
===================================================================
--- src/library/methods/R/methodsTable.R	(revision 74669)
+++ src/library/methods/R/methodsTable.R	(revision 74674)
@@ -697,7 +697,7 @@
 
 .findNextFromTable <- function(method, f, optional, envir, prev = character())
 {
-    fdef <- getGeneric(f)
+    fdef <- getGeneric(f, where=envir)
     env <- environment(fdef)
     ##target <- method@target
     n <- get(".SigLength", envir = env)

That was it? Whoa! Probably one of the best illustrations I've seen
that a hard-to-reproduce bug doesn't necessarily require a complex
or sophisticated fix ;-) I'm not implying that the fix was easy here;
a proper fix can be hard to figure out even if it's simple.

BTW, before the fix, the 'envir' argument was ignored so I wonder if
a static code analysis tool couldn't have detected this...

H.


On 04/30/2018 03:26 PM, Michael Lawrence wrote:

I just pushed it to the 3.5 branch.

On Mon, Apr 30, 2018 at 2:14 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:

Excellent! Are you planning to commit this to the 3.5 branch too?
In that case we'll wait a couple more days before installing R 3.5
patched on the build machines for the BioC 3.8 builds.

Thanks,
H.



Re: [Bioc-devel] Error: node stack overflow

2018-04-30 Thread Hervé Pagès


Excellent! Are you planning to commit this to the 3.5 branch too?
In that case we'll wait a couple more days before installing R 3.5
patched on the build machines for the BioC 3.8 builds.

Thanks,
H.


On 04/30/2018 01:43 PM, Michael Lawrence wrote:

It's checked into devel now. Thanks for the well documented examples, Hervé.

On Mon, Apr 30, 2018 at 10:26 AM, Michael Lawrence <micha...@gene.com> wrote:

I've fixed it and will push to R-devel as soon as it passes checks.

Michael

On Sun, Apr 29, 2018 at 9:04 PM, Michael Lawrence <micha...@gene.com> wrote:

Just noticed this thread. I will look into this and hopefully fix it.

On Sun, Apr 29, 2018 at 6:12 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:

Hi,

I made progress on this. This has actually nothing to do with Java.
You get the same thing with the flexmix package. What rJava and flexmix
have in common is that they both define a method on the base::unique()
implicit S4 generic.

The issue actually originates in the methods package. In order to remove
rJava, BiocGenerics and IRanges from the equation, I made 2 minimalist
packages, uniqueMethod and uniqueGeneric, that can be used to reproduce
the issue. See:

   
https://github.com/Bioconductor/uniqueGeneric

I committed a workaround in S4Vectors (0.17.44). With this version of
S4Vectors:

   library(rJava)
   library(IRanges)
   unique(IRanges())
   # IRanges object with 0 ranges and 0 metadata columns:
   #start   end width
   #  

Let me know if you still run into problems with this.

Cheers,
H.


sessionInfo()

R Under development (unstable) (2018-02-26 r74306)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /home/hpages/R/R-3.5.r74306/lib/libRblas.so
LAPACK: /home/hpages/R/R-3.5.r74306/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] IRanges_2.13.29     S4Vectors_0.17.44   BiocGenerics_0.25.3
[4] rJava_0.9-9

loaded via a namespace (and not attached):
[1] compiler_3.5.0


On 04/14/2018 03:11 AM, Hervé Pagès wrote:


Hi Zheng,

I can totally reproduce this on my Ubuntu laptop:

library(rJava)
library(IRanges)
unique(IRanges())
# Error in validObject(.Object) :
#   invalid class “MethodWithNext” object: Error : C stack usage
#   7969396 is too close to the limit

See my sessionInfo() at the end of this email.

Probably related to this (but not 100% sure): loading rJava seems
to break selectMethod().

More precisely: The rJava package defines some "unique" S4 methods
and the BiocGenerics package defines (and exports) the unique() S4
generic with the following statement:

setGeneric("unique", signature="x")

Here is what happens when loading the rJava package first:

library(rJava)
library(BiocGenerics)

setClass("A", slots=c(a="integer"))
setMethod("unique", "A",
  function(x, incomparables=FALSE, ...) {x@a <- unique(x@a); x}
)

selectMethod("unique", "A")
# Method Definition (Class "derivedDefaultMethod"):
#
# function (x, incomparables = FALSE, ...)
# UseMethod("unique")
# 
# 
#
# Signatures:
# x
# target  "A"
# defined "ANY"

selectMethod() doesn't find the method for A objects!

It seems that selectMethod() is looking in the method table for
the implicit unique() generic defined in rJava instead of the
explicit unique() generic defined in BiocGenerics. If we tell
selectMethod() which generic to consider, then it finds the method
for A objects:

selectMethod(BiocGenerics::unique, "A")
# Method Definition:
#
# function (x, incomparables = FALSE, ...)
# {
#   x@a <- unique(x@a)
#   x
# }
#
# Signatures:
# x
# target  "A"
# defined "A"
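For comparison, here is the baseline behavior one would expect in a fresh session where no other package has put methods on unique(). This is only a sketch of the non-broken case; it does not reproduce the problem, which (as noted below) needs the generic to be defined in a package.

```r
library(methods)

# Explicit generic derived from base::unique(), dispatching on 'x' only.
setGeneric("unique", signature = "x")

setClass("A", slots = c(a = "integer"))
setMethod("unique", "A",
    function(x, incomparables = FALSE, ...) {x@a <- unique(x@a); x}
)

# With a single generic in play, the method for "A" is found and used.
found <- existsMethod("unique", "A")
obj <- unique(new("A", a = c(1L, 1L, 2L)))
obj@a
```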

In order to reproduce the above problem without the BiocGenerics
package in the equation, it's not enough to do:

library(rJava)
setGeneric("unique", signature="x")
etc...

The setGeneric("unique", signature="x") statement must be put in
a package. I've created a minimalist package on GitHub that just
wraps this statement:


https://urldefense.proofpoint.com/v2/url?u=https-3A__gi

Re: [Bioc-devel] Virtual class for `matrix` and `DelayedArray`? (or better strategy for dealing with them both)

2018-04-30 Thread Hervé Pagès


Interesting. I tried something like that in the past i.e. start with
a unary setClassUnion() but then got into problems when I tried to add
new members to the union by **extending** the union class:

  https://stat.ethz.ch/pipermail/r-devel/2016-March/072489.html

So it seems like I should have used setIs() instead.

Then later I realized that I could just do with having the 'seed' slot
of the DelayedArray class be "ANY" and having the validity method
checking that the slot actually contains something for which dim()
is not null. I actually adopted this "array-like == dim() is not NULL"
approach everywhere in the package. It's simple and works well.
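A minimal sketch of that idea, using a toy class rather than the actual DelayedArray code (ToyDelayed and the error message are made up for illustration):

```r
library(methods)

# Toy stand-in for a DelayedArray-like class; not the real implementation.
setClass("ToyDelayed",
    slots = c(seed = "ANY"),
    validity = function(object) {
        # "array-like" is taken to mean: dim() is not NULL.
        if (is.null(dim(object@seed)))
            return("'seed' must be array-like (dim() must not be NULL)")
        TRUE
    }
)

# A matrix seed is accepted ...
ok <- new("ToyDelayed", seed = matrix(1:6, nrow = 2))

# ... while a plain vector (no dim attribute) is rejected by the validity method.
bad <- tryCatch(new("ToyDelayed", seed = 1:6),
                error = function(e) conditionMessage(e))
```

Anything that answers dim() with a non-NULL value passes, so no class union has to be kept in sync with the set of supported seeds.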

H.

On 04/30/2018 01:33 PM, Michael Lawrence wrote:

It would be great to be able to define a matrix-like abstraction
independent of 'matrix' and 'DelayedMatrix'. It could also encompass
objects from the Matrix package and potentially other things. So you
could define a parent class of 'matrix' using setClassUnion() and then
use setIs() to establish further derivations:

setClassUnion("MatrixLike", "matrix")
setIs("DelayedMatrix", "MatrixLike")
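The two statements above can be tried without DelayedArray by substituting a toy class. In this sketch ToyDelayedMatrix and nCells() are hypothetical names, introduced only to show that a method written for the union dispatches on both members:

```r
library(methods)

# Union with 'matrix' as its only initial member.
setClassUnion("MatrixLike", "matrix")

# Toy stand-in for DelayedMatrix (the real class lives in DelayedArray).
setClass("ToyDelayedMatrix", slots = c(seed = "matrix"))
setMethod("dim", "ToyDelayedMatrix", function(x) dim(x@seed))

# Register the toy class as a member of the union after the fact.
setIs("ToyDelayedMatrix", "MatrixLike")

# One method on the union serves both kinds of objects.
setGeneric("nCells", function(x) standardGeneric("nCells"))
setMethod("nCells", "MatrixLike", function(x) prod(dim(x)))

n_plain <- nCells(matrix(0, 2, 3))
n_toy <- nCells(new("ToyDelayedMatrix", seed = matrix(0, 4, 5)))
```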

Michael

On Mon, Apr 30, 2018 at 11:35 AM, Hervé Pagès <hpa...@fredhutch.org> wrote:

The class union should probably be:

   setClassUnion("matrixOrDelayed", c("matrix", "DelayedMatrix"))

i.e. use DelayedMatrix instead of DelayedArray.

So in addition to the class union and to Stephanie's solution, which
IMO are both valid solutions, you could also go for something like this:

myNewRowMeans <- function(x,...)
{
    if (length(dim(x)) != 2)
        stop("'x' must be a matrix-like object")
    ...
}

that is, just a regular function that checks that 'x' is matrix-like
based on its number of dimensions. If you really want to restrict to
matrix and DelayedMatrix only, replace the test with

 if (!(is.matrix(x) || is(x, "DelayedMatrix")))
 stop("'x' must be a matrix or DelayedMatrix object")

The difference being that now the function will reject matrix-like
objects that are not matrix or DelayedMatrix objects (e.g. a Matrix
derivative from the Matrix package).

Cheers,
H.



On 04/30/2018 09:29 AM, Stephanie M. Gogarten wrote:


Rather than a class union, how about an internal function that is called
by the methods for both matrix and DelayedArray:


setGeneric("myNewRowMeans", function(x,...) {
    standardGeneric("myNewRowMeans")})

#' @importFrom DelayedArray rowMeans
.myNewRowMeans <- function(x,...){
  # a lot of code independent of x
  print("This is a lot of code shared regardless of class of x\n")
  # a lot of code that depends on x, but is dispatched by the functions called
  out<-rowMeans(x)
  #a lot of code based on output of out
  out<-out+1
  return(out)
}

setMethod("myNewRowMeans",
          signature = "matrix",
          definition = function(x,...){
              .myNewRowMeans(x,...)
          }
)

setMethod("myNewRowMeans",
          signature = "DelayedArray",
          definition = function(x,...){
              .myNewRowMeans(x,...)
          }
)


On 4/30/18 9:10 AM, Tim Triche, Jr. wrote:


But if you merge methods like that, the failing method can be that much more
difficult to identify. It took a couple of weeks to chase that bug down
properly, and it ended up down to rowMeans2 vs rowMeans.

I suppose the merged/abstracted method allows one to centralize any such
dispatch into one place and swap out ill-behaved methods once identified,
so as long as DelayedArray/DelayedMatrixStats quirks are
documented/understood, maybe it is better to create this union class?

The Matrix/matrixStats/DelayedMatrix/DelayedMatrixStats situation has been
"interesting" in practical terms, as seemingly simple abstractions appear
to require more thought. That was my only point.


--t

On Mon, Apr 30, 2018 at 11:28 AM, Martin Morgan <
martin.mor...@roswellpark.org> wrote:


But that issue will be fixed, so Tim's advice is inappropriate.


On 04/30/2018 10:42 AM, Tim Triche, Jr. wrote:


Don't do that.  Seriously, just don't.


https://github.com/Bioconductor/DelayedArray/issues/16

--t

On Mon, Apr 30, 2018 at 10:02 AM, Elizabeth Purdom <
epur...@stat.berkeley.edu> wrote:

Hello,



I am trying to extend my package to handle `HDF5Matrix` class (or more
generally `DelayedArray`). I currently have S4 functions for `matrix`
class. Usually I have a method for `SummarizedExperiment`, which will
call the method on `assay(x)` and I want the method to be able to deal
with the case where `assay(x)` is a `DelayedArray`.

Most of my functions, howeve

Re: [Bioc-devel] Virtual class for `matrix` and `DelayedArray`? (or better strategy for dealing with them both)

2018-04-30 Thread Hervé Pagès

  signature = "matrixOrDelayed",
  definition = function(x,...){
    # a lot of code independent of x
    print("This is a lot of code shared regardless of class of x\n")
    # a lot of code that depends on x, but is dispatched by the functions called
    out<-rowMeans(x)
    # a lot of code based on output of out
    out<-out+1
    return(out)
  }
)
```


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Virtual class for `matrix` and `DelayedArray`? (or better strategy for dealing with them both)

2018-04-30 Thread Hervé Pagès


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Last minute error in the build

2018-04-30 Thread Hervé Pagès


Done:


https://github.com/Bioconductor/HDF5Array/commit/c525570bc927274c37d9f267a2cf194d8e545d91

I also applied the fix to the new RELEASE_3_7 branch
of HDF5Array.

If you install the latest version of HDF5Array (1.8.0
in the RELEASE_3_7 branch and 1.9.0 in master), that should
clear the error you get when running the code in scmeth
vignette. The build system will pick-up the latest version
of HDF5Array so scmeth should go green again on the build
report tomorrow.

H.

On 04/30/2018 10:21 AM, Hervé Pagès wrote:

Hi Divy,

I will take care of this. You don't need to do anything.

Cheers,
H.

On 04/30/2018 07:16 AM, Kangeyan, Divy wrote:

Hi,
 I am the author of *scmeth* package that was submitted during this
cycle. I know that there will be a new bioconductor release tomorrow. My
package has been passing all the builds until April 27th which is past
April 25th, the deadline to pass R CMD check and R CMD build. Now I see
errors in the nightly builds which probably happened over the weekend.
Following is the error:

Error: processing vignette 'my-vignette.Rmd' failed with diagnostics:
HDF5Matrix object uses internal representation from DelayedArray
   >= 0.5.11 and < 0.5.24 and cannot be displayed or used. Please
   update it with:

   object <- updateObject(object, verbose=TRUE)

   and re-serialize it.
Execution halted


The error looks like it is related to a deprecated object in my vignette. I
am not sure whether this is something that I have to update in the new
development version or something that I can resolve now. Considering the
new release is tomorrow and all the builds passed beyond the deadline I am
wondering whether the package will have a release version?

Thank you,
Divy


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Last minute error in the build

2018-04-30 Thread Hervé Pagès


Hi Divy,

I will take care of this. You don't need to do anything.

Cheers,
H.

On 04/30/2018 07:16 AM, Kangeyan, Divy wrote:

Hi,
 I am the author of *scmeth* package that was submitted during this
cycle. I know that there will be a new bioconductor release tomorrow. My
package has been passing all the builds until April 27th which is past
April 25th, the deadline to pass R CMD check and R CMD build. Now I see
errors in the nightly builds which probably happened over the weekend.
Following is the error:

Error: processing vignette 'my-vignette.Rmd' failed with diagnostics:
HDF5Matrix object uses internal representation from DelayedArray
   >= 0.5.11 and < 0.5.24 and cannot be displayed or used. Please
   update it with:

   object <- updateObject(object, verbose=TRUE)

   and re-serialize it.
Execution halted


The error looks like it is related to a deprecated object in my vignette. I
am not sure whether this is something that I have to update in the new
development version or something that I can resolve now. Considering the
new release is tomorrow and all the builds passed beyond the deadline I am
wondering whether the package will have a release version?

Thank you,
Divy


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Error: node stack overflow

2018-04-29 Thread Hervé Pagès

Hi,

I made progress on this. This has actually nothing to do with Java.
You get the same thing with the flexmix package. What rJava and flexmix
have in common is that they both define a method on the base::unique()
implicit S4 generic.

The issue actually originates in the methods package. In order to remove
rJava, BiocGenerics and IRanges from the equation, I made 2 minimalist
packages, uniqueMethod and uniqueGeneric, that can be used to reproduce
the issue. See:

  https://github.com/Bioconductor/uniqueGeneric

I committed a workaround in S4Vectors (0.17.44). With this version of
S4Vectors:

  library(rJava)
  library(IRanges)
  unique(IRanges())
  # IRanges object with 0 ranges and 0 metadata columns:
  #        start       end     width
  #    <integer> <integer> <integer>

Let me know if you still run into problems with this.

Cheers,
H.

> sessionInfo()
R Under development (unstable) (2018-02-26 r74306)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /home/hpages/R/R-3.5.r74306/lib/libRblas.so
LAPACK: /home/hpages/R/R-3.5.r74306/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] IRanges_2.13.29     S4Vectors_0.17.44   BiocGenerics_0.25.3
[4] rJava_0.9-9

loaded via a namespace (and not attached):
[1] compiler_3.5.0

On 04/14/2018 03:11 AM, Hervé Pagès wrote:

Hi Zheng,

I can totally reproduce this on my Ubuntu laptop:

   library(rJava)
   library(IRanges)
   unique(IRanges())
   # Error in validObject(.Object) :
   #   invalid class “MethodWithNext” object: Error : C stack usage 7969396 is too close to the limit

See my sessionInfo() at the end of this email.

Probably related to this (but not 100% sure): loading rJava seems
to break selectMethod().

More precisely: The rJava package defines some "unique" S4 methods
and the BiocGenerics package defines (and exports) the unique() S4
generic with the following statement:

   setGeneric("unique", signature="x")

Here is what happens when loading the rJava package first:

   library(rJava)
   library(BiocGenerics)

   setClass("A", slots=c(a="integer"))
   setMethod("unique", "A",
     function(x, incomparables=FALSE, ...) {x@a <- unique(x@a); x}
   )

   selectMethod("unique", "A")
   # Method Definition (Class "derivedDefaultMethod"):
   #
   # function (x, incomparables = FALSE, ...)
   # UseMethod("unique")
   # 
   # 
   #
   # Signatures:
   # x
   # target  "A"
   # defined "ANY"

selectMethod() doesn't find the method for A objects!

It seems that selectMethod() is looking in the method table for
the implicit unique() generic defined in rJava instead of the
explicit unique() generic defined in BiocGenerics. If we tell
selectMethod() which generic to consider, then it finds the method
for A objects:

   selectMethod(BiocGenerics::unique, "A")
   # Method Definition:
   #
   # function (x, incomparables = FALSE, ...)
   # {
   #   x@a <- unique(x@a)
   #   x
   # }
   #
   # Signatures:
   # x
   # target  "A"
   # defined "A"

In order to reproduce the above problem without the BiocGenerics
package in the equation, it's not enough to do:

   library(rJava)
   setGeneric("unique", signature="x")
   etc...

The setGeneric("unique", signature="x") statement must be put in
a package. I've created a minimalist package on GitHub that just
wraps this statement:

  https://github.com/Bioconductor/uniqueGeneric

This package can be used instead of BiocGenerics to reproduce the
problem above.

I'm not 100% sure that this problem is related to the issue you
reported originally but it seems very likely to me.

Not quite sure what the next step should be. I've been told by
some R core developers that there are known interaction issues
between Java, rJava and R that are currently being worked on.
Someone should ask on the R-devel mailing list or directly to
Simon Urbanek, the rJava author, for more information about this.

H.

 > sessionInfo()
R Under development (unstable) (2018-02-26 r74306)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /home/hpages/R/R-3.5.r74306/lib/libRblas.so
LAPACK: /home/hpages/R/R-3.5.r74306/

Re: [Bioc-devel] DMRcaller build error

2018-04-27 Thread Hervé Pagès


On 04/27/2018 12:31 PM, Radu Zabet wrote:

Thank you for that Herve!

I managed to figure out what the problem was.

I was using a GRangesList constructor and,  when validating the argument 
passed to the function, I was checking if the class was GRangesList 
instead of CompressedGRangesList.


CompressedGRangesList is a subclass of virtual class GRangesList that
uses a particular internal representation to store the list in an
efficient manner.

Note that it's always better to use is(x, "GRangesList") for this kind
of check. The exact class of 'x' does not matter as long as 'x' derives
from GRangesList. Doing class(x) == "CompressedGRangesList" will be
FALSE if 'x' is another GRangesList derivative and that is probably not
what you want.
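
A small sketch of the point above, using toy classes so that it runs with the methods package alone. The class names ToyList, CompressedToyList and SimpleToyList are hypothetical stand-ins for the virtual class and two of its derivatives, not anything defined in GenomicRanges.

```r
library(methods)

# Hypothetical virtual class with two concrete subclasses.
setClass("ToyList", representation("VIRTUAL"))
setClass("CompressedToyList", contains = "ToyList",
         slots = c(data = "list"))
setClass("SimpleToyList", contains = "ToyList",
         slots = c(data = "list"))

x <- new("CompressedToyList", data = list(1, 2))
y <- new("SimpleToyList", data = list(1, 2))

is(x, "ToyList")                  # TRUE: inheritance is taken into account
is(y, "ToyList")                  # TRUE as well
class(y) == "CompressedToyList"   # FALSE: exact class comparison rejects y
```

A validity check written with is() keeps working when a new derivative appears, whereas a comparison on class(x) has to be revisited each time.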

Cheers,
H.



Radu

On Fri, Apr 27, 2018 at 8:11 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:


Hi Radu,

DMRcaller is all green on today's report:

https://bioconductor.org/checkResults/3.7/bioc-LATEST/DMRcaller/

Remember that after you update (and push to git.bioconductor.org) you
need to wait at least 18h before the update is visible on the build
report. 18h is if you pushed right before the builds start (the software
builds start every day at 4:45 pm EST). If you push right after the
builds start then you will need to wait 42h!
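
The 18h and 42h figures follow from simple arithmetic, which can be sketched as below. The function name hours_until_visible() is hypothetical, and the 18-hour lag between the start of a build run and the publication of its report is an assumption read off the message, not a documented constant.

```r
# Assumption: builds start daily at 16:45 and a push shows up on the
# report about 18 hours after the start of the run that picks it up.
hours_until_visible <- function(push_hour, build_start = 16.75, lag = 18) {
  # Hours to wait until the next build run starts (0 if pushed at the start).
  until_next_start <- (build_start - push_hour) %% 24
  until_next_start + lag
}

hours_until_visible(16.75)  # 18: pushed right as the builds start
hours_until_visible(17)     # 41.75: pushed 15 minutes too late
```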

Cheers,
H.

On 04/20/2018 07:47 AM, Radu Zabet wrote:

Hi everyone,

I am the maintainer of DMRcaller. I did an update yesterday and on the
build report today, the package failed with the following error:

Error: processing vignette 'DMRcaller.Rnw' failed with diagnostics:
   methylationProfile needs to be a GRangesList
Execution halted


On my machine (MacOS with R 3.4.4) it works

* creating vignettes ... OK

Any suggestion of why this might happen? Is there something I am
    missing?

Radu



-- 
Hervé Pagès


Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319




--
Best regards,

Dr Nicolae Radu Zabet
Lecturer in Computational Biology,
School of Biological Sciences, University of Essex,
Colchester, CO4 3SQ, United Kingdom
T: +44(0)1206872630
E: nza...@essex.ac.uk


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] build machines

2018-04-27 Thread Hervé Pagès


On 04/27/2018 10:50 AM, Martin Morgan wrote:
For what it's worth, BiocParallel, as outlined in its
vignette, limits the number of cores via


     if (nzchar(Sys.getenv("BBS_HOME")))
     cores <- min(4L, cores)

i.e., checking an environment variable set on the build system. This is 
highly fragile and I wouldn't necessarily recommend this outside the 
BiocParallel context.


One problem with this is that when people troubleshoot they don't
get the same thing as what they see on the build report.

How about detecting that code is being run in the context of
R CMD build or R CMD check instead? Is there an easy/robust
way to do this?
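
One possible direction, offered only as a sketch: the helper limit_cores() below is hypothetical, and it relies on the _R_CHECK_LIMIT_CORES_ environment variable, which to my knowledge is only set when R CMD check is run with --as-cran, so it does not cover a plain R CMD build or R CMD check.

```r
# Hypothetical helper: cap the number of workers when an environment
# variable suggests the code runs on the build system or under a check.
limit_cores <- function(cores, cap = 2L) {
  on_bbs   <- nzchar(Sys.getenv("BBS_HOME"))               # Bioconductor builders
  on_check <- nzchar(Sys.getenv("_R_CHECK_LIMIT_CORES_"))  # assumption: --as-cran only
  if (on_bbs || on_check) min(cap, cores) else cores
}

limit_cores(parallel::detectCores())
```

This still has the drawback mentioned above: a developer reproducing a failure locally does not get the capped behaviour unless the same variables are set by hand.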

Thanks,
H.



Martin

On 04/27/2018 01:39 PM, Ludwig Geistlinger wrote:

Hi Hervé,


Some packages are good citizens and limit the number of
cores to 1 or 2 only during 'R CMD check' but some packages
try to use all the cores that are available


That seems to be an important note for developers using parallel 
computation.
What's best practice to realize this within my code, i.e. checking 
whether the code is currently subject to R CMD check (and accordingly 
reducing the number of cores used)?


Thanks,
Ludwig

--
Dr. Ludwig Geistlinger
CUNY School of Public Health


From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of 
Kasper Daniel Hansen <kasperdanielhan...@gmail.com>

Sent: Friday, April 27, 2018 10:29 AM
To: Hervé Pagès
Cc: bioc-devel@r-project.org
Subject: Re: [Bioc-devel] build machines

Thanks.

I used
   /usr/bin/time -v R CMD check ...
to record the max memory usage of the check, which for minfi suggests
around 5Gb.  That's a lot.

Best,
Kasper

On Thu, Apr 26, 2018 at 3:02 PM, Hervé Pagès <hpa...@fredhutch.org> 
wrote:



Hi,

The Linux and Windows builders have 32 GB of RAM, the Mac
builders 64 Gb.

We also run concurrent R CMD check's.

Here is a summary:

   platform           RAM    nb of    nb of concurrent
                      (Gb)   cores    R CMD check's
   ---------------------------------------------------
   Linux (malbecs)     32      20         10
   Windows (tokays)    32      40         24
   Mac (meridas)       64      24         18

That's a lot of concurrency. And there is actually more
concurrency than that if you consider the fact that many
packages run things in parallel during 'R CMD check'.
Some packages are good citizens and limit the number of
cores to 1 or 2 only during 'R CMD check' but some packages
try to use all the cores that are available. This will have
a strong impact on the overall progress of the builds. We
don't have an easy way to identify those packages right now.

On average, based on our monitoring of the build machines,
things seem to work ok, i.e. the concurrent R CMD check's
don't seem to be competing too much to access resources.

But occasionally there could be too much competition. The
crazy big elapsed time compared to the relatively short user
and system times that you observed, Kasper, is likely to reflect
that. It could be a sign that the machine ran out of memory
and started swapping. The fact that it happens to your package
doesn't mean that your package uses too much memory. The swapping
is the result of the **cumulated** memory usage of all the
R CMD check's running at that moment. It could be worth checking
how much memory R CMD check'ing your package uses though.

The exact set of packages that are being R CMD check'ed at any
given time is in constant fluctuation and will also vary from
one day to the other. This would explain why some days you see
timeouts on some platforms and some days not. We don't have
an easy way to know which packages were competing with yours
during the 40 min window that 'R CMD check' was running on your
package until the build system declared a timeout. It's possible
(by looking at the BBS logs) but is time consuming.

We should probably add some memory at some point to the Windows
builders. 32 Gb is not enough to smoothly run 24 R CMD check's
concurrently.
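
As a rough back-of-the-envelope check of the numbers in the summary table, the memory available per concurrent check can be computed as below. This assumes an even split of RAM between checks, which is a simplification; the column names are made up for the example.

```r
# Figures copied from the summary table above.
builders <- data.frame(
  platform   = c("Linux (malbecs)", "Windows (tokays)", "Mac (meridas)"),
  ram_gb     = c(32, 32, 64),
  cores      = c(20, 40, 24),
  concurrent = c(10, 24, 18)
)

# RAM per concurrent R CMD check, assuming an even split.
builders$gb_per_check <- round(builders$ram_gb / builders$concurrent, 2)
builders[, c("platform", "gb_per_check")]
# Linux 3.2, Windows 1.33, Mac 3.56
```

Under that simplification a single check needing around 5 Gb, as measured for minfi further up in the thread, already exceeds the per-check share on every platform, and by the widest margin on Windows.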

H.


On 04/26/2018 08:48 AM, Diogo FT Veiga wrote:


Hi Daniel,

I have the same issue with my package (new contribution). I just finished
reviewing the package with the modifications requested.

I am getting a warning because R CMD check is exceeding 5 min, but this is
happening only on the Windows machine.

On Linux and OSX the check finishes in <= 4min, while on Windows it takes
~6min.

http://bioconductor.org/spb_reports/maser_buildreport_20180425114748.html


Not sure how to proceed from here.

Thanks,
Diogo


On Thu, Apr 26, 2018 at 9:52 AM, Kasper Daniel Hansen <
kasperdanielhan...@gmail.com> wrote:

We have been working on the minfi package lately, with a move to a

Delayed

Re: [Bioc-devel] How to update R version in terminal?

2018-04-27 Thread Hervé Pagès


Hi,

On 04/27/2018 09:34 AM, Yuande Tan wrote:

Dear Sirs,
I want to update R 3.3.1 in terminal in my local mac computer.
I use the following command to install r 3.5.0
brew cask install r-app

and also

brew link --overwrite r


But R --version shows

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"

Copyright (C) 2016 The R Foundation for Statistical Computing

Platform: x86_64-apple-darwin11.0.0 (64-bit)


Anyone can help me to address this problem?


Any reason you're not installing the CRAN binary?

  https://cran.r-project.org/bin/macosx/

Please note that this is not a Bioconductor question. It is better to
ask this kind of question on general R discussion channels like the
R-help or R-SIG-Mac mailing list or Stack Overflow.

Cheers,
H.




Thanks



Yuande


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] build machines

2018-04-26 Thread Hervé Pagès


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] git push confusion

2018-04-23 Thread Hervé Pagès

 

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] problem with class definitions between S4Vectors and RNeXML in using Summarized Experiment

2018-04-23 Thread Hervé Pagès


Completely agree on the importance of disambiguating class references
in the long term.

However, I still think that having a mechanism to disambiguate doesn't
mean we shouldn't make an effort to avoid name clashes, especially when
the name clash is easy to avoid, like in the case discussed here.
Annotated is not a good name anyway. Something like Annotatable would be
more appropriate. Objects of this class **can** be annotated but many of
them are not.

H.

On 04/14/2018 12:25 PM, Michael Lawrence wrote:

Last night I checked in a workaround to S4Vectors. It just calls
getClass("Annotated") instead of passing the class name directly.

I'll check in a simple fix for is() today maybe to R 3.6 (devel) and
then we'll be good for now.

On Sat, Apr 14, 2018 at 8:59 AM, Martin Morgan
<martin.mor...@roswellpark.org> wrote:

On 04/14/2018 07:21 AM, Vincent Carey wrote:


But Annotated is defined in S4Vectors and RNeXML; the latter is not a
Bioconductor package.

The likelihood of collisions among class names defined in different
packages seems pretty high
as S4 adoption grows.  So requiring a systematic approach to
disambiguating
class references seems inevitable.



I agree that renaming is not a robust solution, and would encourage Michael
to commit the change using the class definition in `is()` with more
elaborate solutions (really this is a problem with is(), where it should be
fixed, rather than introducing complicated syntax?) left for a later day.

Martin




On Sat, Apr 14, 2018 at 4:05 AM, Hervé Pagès <hpa...@fredhutch.org> wrote:


How about renaming Annotated? Isn't having 2 classes around with the
same name fundamentally a bad situation? No amount of workarounds will
change that.

H.


On 04/12/2018 04:06 PM, Michael Lawrence wrote:


Yea, good idea, I was thinking of supporting :: in class names and
parsing them out. In code is better.  Maybe %::%? It wouldn't have to
get a class object (for one thing, a class might not exist), because
the methods package supports a 'package' attribute on the character
vector, abstracted by packageSlot().
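
For readers unfamiliar with that mechanism, here is a minimal sketch of what tagging a class name with its package looks like. It only attaches the attribute to the string; it does not load S4Vectors or show how is() would consume the tag.

```r
library(methods)

cl <- "Annotated"
packageSlot(cl) <- "S4Vectors"  # tags the character vector, nothing is loaded

packageSlot(cl)   # "S4Vectors"
attributes(cl)    # the tag is stored as the 'package' attribute
```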



On Thu, Apr 12, 2018 at 3:26 PM, Vincent Carey
<st...@channing.harvard.edu> wrote:


If we need to disambiguate class references, perhaps an operator
could help?  Along the lines of base::"::" ...

"%c%" <- function(package,class) {
  pk = as.character(substitute(package))
  cl = as.character(substitute(class))
  getClass(cl, where=getNamespace(pk))
}

Biobase %c% ExpressionSet  # a classRepresentation instance

is(1:5, Biobase %c% ExpressionSet)  # FALSE

is(Biobase::ExpressionSet(), "ExpressionSet")  # TRUE

is(Biobase::ExpressionSet(),  Biobase %c% ExpressionSet) # TRUE






On Thu, Apr 12, 2018 at 3:57 PM, Michael Lawrence
<lawrence.mich...@gene.com> wrote:



Hi Davide,

We can get this fixed soon, but I was hoping to hear e.g. Herve's
opinion first if he has one.

Michael

On Thu, Apr 12, 2018 at 12:53 PM, Davide Risso
<dar2...@med.cornell.edu> wrote:


Hi Michael,

Thanks for looking into this.

Can you or someone with push permission to S4Vectors implement the
workaround that you mentioned?

Happy to create a pull request on Github if that helps.

We’re trying to solve this to fix the clusterExperiment package build
on Bioc-devel.

Thanks,
Davide


On Apr 12, 2018, at 1:27 PM, Michael Lawrence
<lawrence.mich...@gene.com>
wrote:

Yea it's basically

library(S4Vectors)
library(RNeXML)
is(1:5, "Annotated")
# Found more than one class "Annotated" in cache; using the first,
from namespace 'S4Vectors'
# Also defined by ‘RNeXML’
# [1] FALSE

But can be worked around:

is(1:5, getClass("Annotated", where=getNamespace("S4Vectors")))

# [1] FALSE

Of course, using class objects instead of class names in every call
to
is() is not very palatable, but that's how it's done in all other
languages, as far as I know.

There is an inconsistency between new() and is() when resolving the
class name. new() looks into the calling package's namespace, while
is() looks at the package for the class of the 'object'. The new()
approach seems sensible for that function, since packages should be
abstracting the construction of their objects with constructors. The
is() approach is broken though, because it's easy to imagine cases
where some foreign object is passed to a function, and the
function checks the type with is().

I can change is() to use the calling package as the fallback, so
DataFrame(1:5) no longer produces a message. But calling it from
another package, or global env, will still break, just like new().
How does that sound?

On the other hand, maybe we should be more careful with calls to is()
and use class objects. That's a good workaround in this case, anyway,
since I probably can't get the change into R before release.
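
To make the class-object workaround concrete, here is a minimal sketch that uses only the methods package. The class name "Track" and its slot are made up for illustration and do not come from S4Vectors or RNeXML. The only point is that is() accepts a class definition in place of a class name, so no lookup by name (and no ambiguity between packages) is involved.

```r
library(methods)

# Hypothetical class, defined only for this illustration.
setClass("Track", representation(metadata = "list"))

# Resolve the class definition once, then reuse the object.
TrackDef <- getClass("Track")

obj <- new("Track", metadata = list(source = "example"))

is(obj, TrackDef)   # checked against the class object, not the name
is(1:5, TrackDef)   # an integer vector is not a Track
```

In a package, the definition would presumably be fetched with where=getNamespace(...) as in the workaround above, so that the intended package's class is the one used.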

Michael


On Thu, Apr 12, 2018 at 9:03 AM, Aaron Lun <a...@wehi.edu.au> wrote:

Well,

Re: [Bioc-devel] dbApply name collision, RMySQL and RPostgreSQL, no direct call to either

2018-04-23 Thread Hervé Pagès


Hi Paul,

trena imports RMySQL and RPostgreSQL. Both packages define and export
dbApply() so by importing all the symbols from both packages you get a
name clash. You can get around this by importing only the things you
need. It seems that you only call the following generic functions in
trena:

  dbConnect
  dbListTables
  dbGetQuery
  dbListConnections
  dbDisconnect

These are S4 generic functions that are defined in the DBI package so
they should be imported from there. Also import the corresponding
methods defined in RMySQL and RPostgreSQL. Your NAMESPACE file will
look something like this:

importFrom(DBI,
dbConnect,
dbListTables,
dbGetQuery,
dbListConnections,
dbDisconnect)


importMethodsFrom(RMySQL,
dbConnect,
dbListTables,
dbGetQuery,
dbListConnections,
dbDisconnect)

importMethodsFrom(RPostgreSQL,
dbConnect,
dbListTables,
dbGetQuery,
dbListConnections,
dbDisconnect)
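
Before trimming the imports, it can help to see which symbols two packages both export. The helper below is only a sketch; exportClashes is a made-up name, not part of any package. It is demonstrated on two packages that ship with R so that it runs anywhere.

```r
# Symbols exported by both namespaces. These are the candidates for
# "replacing previous import" warnings when both packages are imported
# wholesale with import().
exportClashes <- function(pkg1, pkg2) {
  sort(intersect(getNamespaceExports(pkg1), getNamespaceExports(pkg2)))
}

# Demonstration on base-R packages; for the case discussed here one would
# call exportClashes("RMySQL", "RPostgreSQL") with both packages installed.
exportClashes("stats", "utils")
```

Any name this returns is a reason to prefer importFrom() / importMethodsFrom() over a blanket import() of both packages.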

BTW have you considered using RMariaDB instead of RMySQL?

  https://cran.r-project.org/web/packages/RMariaDB/

Cheers,
H.

On 04/23/2018 11:12 AM, Paul Shannon wrote:


 Warning: replacing previous import ‘RMySQL::dbApply’ by 
‘RPostgreSQL::dbApply’ when loading ‘trena’

We do not call dbApply directly anywhere in the package. I imagine it is called 
routinely by functions that we do call.

Any suggestions on how to clear this warning?

Thanks.

  - Paul




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Error: node stack overflow

2018-04-14 Thread Hervé Pagès

via a namespace (and not attached):

 [1] Rcpp_0.12.16           digest_0.6.15          rprojroot_1.3-2
 [4] bitops_1.0-6           backports_1.1.2        magrittr_1.5
 [7] evaluate_0.10.1        zlibbioc_1.25.0        stringi_1.1.7
[10] XVector_0.19.9         tools_3.5.0            stringr_1.3.0
[13] RCurl_1.95-4.10        compiler_3.5.0         htmltools_0.3.6
[16] knitr_1.20             GenomeInfoDbData_1.1.0


*



On Mon, Apr 2, 2018 at 2:25 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:


Hi Zheng,

Thanks for the report. I will look into this and will let you know.

H.

On 04/01/2018 02:38 AM, Zheng Wei wrote:

Dear all,

I find this error if calling library(rJava) before using
BiocGenerics::unique

The code is pasted below.

Thanks,
Zheng

  > library(rJava)
  > library(GenomicRanges)
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

      clusterApply, clusterApplyLB, clusterCall, clusterEvalQ
      clusterExport, clusterMap, parApply, parCapply, parLapp
      parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:rJava’:

      anyDuplicated, duplicated, sort, unique

The following objects are masked from ‘package:stats’:

      IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

      anyDuplicated, append, as.data.frame, basename, cbind,
      colnames, colSums, dirname, do.call, duplicated, eval,
      Filter, Find, get, grep, grepl, intersect, is.unsorted,
      lengths, Map, mapply, match, mget, order, paste, pmax,
      pmin, pmin.int, Position, rank, rbind, Reduce, rowMeans
      rowSums, sapply, setdiff, sort, table, tapply, union, u
      unsplit, which, which.max, which.min

Loading required package: S4Vectors


Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

      expand.grid

Loading required package: IRanges
Loading required package: GenomeInfoDb
  > gr1 <- GRanges(seqnames=Rle(c("ch1", "chMT"), c(2, 4)),
+                ranges=IRanges(16:21, 20),
+                strand=rep(c("+", "-", "*"), 2))
  > unique(gr1)
Error: node stack overflow
  > BiocGenerics::unique(gr1)
Error: node stack overflow



-- 
Hervé Pagès


Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] problem with class definitions between S4Vectors and RNeXML in using Summarized Experiment

2018-04-14 Thread Hervé Pagès

https://github.com/epurdom/clusterExperiment/issues/66),
when it
appeared to be a problem with two definitions of the ‘Annotated’
class in two packages that are both dependencies of packages we
call. At that time, Michael Lawrence posted that he would fix the
problem, and it was then fixed in later versions of bioconductor/R.
But it appears to be back.  I am unfortunately unable to get the
RNeXML package to compile from source on my computer with the
current Mac OS X development binary which I just downloaded
(2018-04-05 r74542), so I haven’t been able to completely redo the code
that we presented in that earlier github issue to confirm it is the
exact same problem. I am having to rely on the error reports/logs
from both Bioconductor and TravisCI (e.g. 2018-04-07 r74551), where
this message shows up everywhere and didn’t before. Thus I’m
guessing that since they are the same messages from before that the
source is again the call to SummarizedExperiment.

I would note that in development version 2018-03-22 r74446, where I
was able to install all of the packages, I was not getting these
messages.

Thanks,
Elizabeth Purdom





--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] AAString - Amino acid code enforced?

2018-04-14 Thread Hervé Pagès


Hi Felix,

Please see my answer in the issue you opened on GitHub:

  https://github.com/Bioconductor/Biostrings/issues/10

Cheers,
H.


On 04/02/2018 06:07 AM, Felix Ernst wrote:

Dear all,

probably this is for Hervé Pagès:

I tried the following code, which should according to ?AAString not work, since 
ÜÖÄ are not part of any AA code.


AAString("ÜÄÖ")

   3-letter "AAString" instance
seq: ÜÄÖ

sessionInfo()

R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Biostrings_2.46.0   XVector_0.18.0      IRanges_2.12.0      S4Vectors_0.16.0    BiocGenerics_0.24.0

loaded via a namespace (and not attached):
[1] zlibbioc_1.24.0 compiler_3.4.4  tools_3.4.4 yaml_2.1.18


I don’t have access right now to the devel version of Biostrings, but I checked 
out the current code in the GitHub repo and its recent changes. I am pretty 
sure that this behavior is also in the current devel branch. Can someone 
confirm this?

My current interest is in using the XString classes and methods for an 
additional biological string representation. The initial question was: how can 
I restrict this to a certain character set if the characters are not saved 
byte encoded? The latter option is not available to me, since characters like 
‚«‘ or ‚=‘ result in a two-byte code when using the charToRaw function. This trips 
up the build of the internal lookup tables, which are passed down to the C 
backend.

Therefore I looked into how this is done for an AAString as opposed to a 
BString. I discovered that it currently isn't. I also looked into the 
current 2.47.12 repo, which as far as I can tell does not use the 
AMINO_ACID_CODE constant in the creation of an AAString object.

So my questions are:
- What is the best practice for extending a class from XString with a 
restricted character set, which is not byte encoded?
- Is there a way to use byte encoding for chars with two or more bytes?

  Thanks in advance for any help and suggestions.

Best regards,
Felix

PS: regarding the second question: One could change „as.integer(charToRaw(paste(letters, 
collapse="")))“ to „lapply(lapply(letters,charToRaw),as.integer)“ in 
.letterAsByteVal, but in any case it will not be atomic anymore, which I think is 
required to be accepted by the C backend. I didn’t test it.
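
A small base-R sketch of the multi-byte issue described above. It only illustrates how charToRaw behaves and is not Biostrings code. The characters are written with \u escapes so the result does not depend on the file encoding or the session locale.

```r
# "A", U with diaeresis, and a left guillemet, as \u escapes.
chars <- c("A", "\u00dc", "\u00ab")

# One raw (byte) vector per character: in UTF-8 the lengths differ,
# so the result is a list and no longer an atomic one-byte-per-letter vector.
bytes <- lapply(enc2utf8(chars), charToRaw)
lengths(bytes)
# 1 2 2

# Code points, by contrast, are exactly one integer per character.
codepoints <- vapply(chars, utf8ToInt, integer(1), USE.NAMES = FALSE)
codepoints
# 65 220 171
```

Whether code points could replace byte values in the lookup tables depends on what the C backend expects, which this sketch does not address.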









--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] update schedule of packages

2018-04-11 Thread Hervé Pagès


Hi,

As stated on https://bioconductor.org/, Bioconductor has two releases
each year. See the release dates of all the releases so far here:

  https://bioconductor.org/about/release-announcements/

As you can see, there is generally one release in Spring and one in
Fall.

H.

On 04/11/2018 08:02 AM, Minoo Ashtiani wrote:

Hi there,


Thanks for publishing my package on Bioconductor. Just one more question: is 
Bioconductor only updated once a year, in April?


We are planning to submit a paper about the package to a journal. We wanted to 
know whether we can make changes to the package in response to the journal 
reviewers' comments before next April or not?





--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Bioc-devel changes and SummarizedExperiment class

2018-04-10 Thread Hervé Pagès


Hi Leonard,

This should be fixed in SGSeq 1.13.6 (see commit
5dc16968f7ea1a4b59595ebaabacca9a76699b80).

Cheers,
H.

On 04/04/2018 09:23 AM, Leonard Goldstein wrote:

Hi Hervé,

Some recent changes in bioc-devel are causing trouble with
SummarizedExperiment objects if the rowRanges slot inherits from
GRangesList. Please see example below.

Thanks in advance for your help.

Leonard

--

library(SGSeq)

## SGVariants object inherits from GRangesList




is(sgv_pred)

  [1] "SGVariants" "GRangesList"
  [3] "Paths"  "GenomicRangesList"
  [5] "CompressedRangesList"   "GenomicRanges_OR_GRangesList"
  [7] "RangesList" "CompressedList"
  [9] "GenomicRanges_OR_GenomicRangesList" "List"
[11] "Vector" "list_OR_List"
[13] "Annotated"


## example counts




counts <- matrix(1:2, ncol = 1)

## creating SummarizedExperiment object fails




SummarizedExperiment(assays = list(counts), rowRanges = sgv_pred)

class: RangedSummarizedExperiment
dim: 2 1
metadata(0):
assays(1): ''
Error in .local(object, ..., verbose) : unused argument (check = FALSE)


## works after coercing to GRangesList




SummarizedExperiment(assays = list(counts), rowRanges = as(sgv_pred,
"GRangesList"))
class: RangedSummarizedExperiment
dim: 2 1
metadata(0):
assays(1): ''
rownames: NULL
rowData names(20): from to ... variantType variantName
colnames: NULL
colData names(0):


sessionInfo()

R Under development (unstable) (2017-10-20 r73567)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.6 (Santiago)

Matrix products: default
BLAS:
/gnet/is2/p01/apps/R/3.5.0-20171105-devel/x86_64-linux-2.6-rhel6/lib64/R/lib/libRblas.so
LAPACK:
/gnet/is2/p01/apps/R/3.5.0-20171105-devel/x86_64-linux-2.6-rhel6/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] SGSeq_1.13.5                SummarizedExperiment_1.9.16
 [3] DelayedArray_0.5.23         BiocParallel_1.13.3
 [5] matrixStats_0.53.1          Biobase_2.39.2
 [7] Rsamtools_1.31.3            Biostrings_2.47.12
 [9] XVector_0.19.9              GenomicRanges_1.31.23
[11] GenomeInfoDb_1.15.5         IRanges_2.13.28
[13] S4Vectors_0.17.39           BiocGenerics_0.25.3

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16              compiler_3.5.0
 [3] GenomicFeatures_1.31.10   prettyunits_1.0.2
 [5] bitops_1.0-6              tools_3.5.0
 [7] zlibbioc_1.25.0           progress_1.1.2
 [9] biomaRt_2.35.13           digest_0.6.15
[11] bit_1.1-13                RSQLite_2.1.0
[13] memoise_1.1.0             lattice_0.20-35
[15] pkgconfig_2.0.1           igraph_1.2.1
[17] Matrix_1.2-13             DBI_0.8
[19] GenomeInfoDbData_1.1.0    rtracklayer_1.39.9
[21] httr_1.3.1                stringr_1.3.0
[23] bit64_0.9-8               grid_3.5.0
[25] R6_2.2.2                  AnnotationDbi_1.41.4
[27] XML_3.98-1.10             blob_1.1.1
[29] magrittr_1.5              GenomicAlignments_1.15.13
[31] RUnit_0.4.31              assertthat_0.2.0
[33] stringi_1.1.7             RCurl_1.95-4.10








--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Error: node stack overflow

2018-04-02 Thread Hervé Pagès


Hi Zheng,

Thanks for the report. I will look into this and will let you know.

H.

On 04/01/2018 02:38 AM, Zheng Wei wrote:

Dear all,

I find this error if calling library(rJava) before using 
BiocGenerics::unique


The code is pasted below.

Thanks,
Zheng

 > library(rJava)
 > library(GenomicRanges)
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

      clusterApply, clusterApplyLB, clusterCall, clusterEvalQ
      clusterExport, clusterMap, parApply, parCapply, parLapp
      parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:rJava’:

      anyDuplicated, duplicated, sort, unique

The following objects are masked from ‘package:stats’:

      IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

      anyDuplicated, append, as.data.frame, basename, cbind,
      colnames, colSums, dirname, do.call, duplicated, eval,
      Filter, Find, get, grep, grepl, intersect, is.unsorted,
      lengths, Map, mapply, match, mget, order, paste, pmax,
      pmin, pmin.int, Position, rank, rbind, Reduce, rowMeans
      rowSums, sapply, setdiff, sort, table, tapply, union, u
      unsplit, which, which.max, which.min

Loading required package: S4Vectors


Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

      expand.grid

Loading required package: IRanges
Loading required package: GenomeInfoDb
 > gr1 <- GRanges(seqnames=Rle(c("ch1", "chMT"), c(2, 4)),
+                ranges=IRanges(16:21, 20),
+                strand=rep(c("+", "-", "*"), 2))
 > unique(gr1)
Error: node stack overflow
 > BiocGenerics::unique(gr1)
Error: node stack overflow




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

[Bioc-devel] Workflows are now in git (and other Important workflow-related changes)

2018-03-30 Thread Hervé Pagès


To the authors/maintainers of the workflows:


Following the svn-to-git migration of the software and data experiment
packages last summer, we've completed the migration of the workflow
packages.

The canonical location for the workflow source code now is
git.bioconductor.org

Please use your git client to access/maintain your workflow the same
way you would do it for a software or data-experiment package.

We've also migrated the workflows to our in-house build system.
Starting with Bioc 3.7, the build report for the devel versions of
the workflows can be found here:

  https://bioconductor.org/checkResults/devel/workflows-LATEST/

We run these builds every other day (Mondays, Wednesdays, Fridays).
Because of limited build resources, we now run the data-experiment
builds on Sundays, Tuesdays, and Thursdays only (instead of daily).

The links to the package landing pages are not working yet. This
will be addressed in the next few days.

Please address any error you see on the report for the workflow
you maintain.

Note that, from now on, we're also following the same version scheme
for these packages as for the software and data-experiment packages.
That is, we're using an even y (in x.y.z) in release and an odd y in
devel. We'll take care of bumping y at release time (like we do for
software and data-experiment packages).
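
The even/odd rule for y can be checked mechanically. The following is just a sketch of that rule; isDevelVersion is a made-up helper name and not a function from BiocInstaller or any other package.

```r
# TRUE if the y in x.y.z is odd, i.e. the version follows the devel scheme;
# FALSE if y is even (release scheme).
isDevelVersion <- function(version) {
  parts <- unclass(package_version(version))[[1L]]
  stopifnot(length(parts) >= 2L)
  parts[2L] %% 2L == 1L
}

isDevelVersion("1.29.5")   # odd y: devel
isDevelVersion("2.46.0")   # even y: release
```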

After the next Bioconductor release (scheduled for May 1), we'll start
building the release versions of the workflows in addition to the
devel versions. The build report for the release versions will be here:

  https://bioconductor.org/checkResults/release/workflows-LATEST/

Finally, please note that with the latest version of BiocInstaller
(1.29.5), workflow packages can be installed with biocLite(), like
any other Bioconductor package. We'll deprecate the old mechanism
(workflowInstall()) at some point in the future.

Thanks to Andrzej, Lori, Nitesh, and Valerie for working on this
migration.

Let us know if you have any question about this.

H.


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] No Windows binary for GenomicRanges

2018-03-28 Thread Hervé Pagès

Hi Gordon,

The TIMEOUT that was preventing the Windows binary of GenomicRanges
from propagating has been addressed. The latest version of the package
(1.31.23) is now available for BioC devel users on all platforms via
biocLite().

Cheers,
H.

On 03/19/2018 10:01 AM, Hervé Pagès wrote:

Hi Gordon,

We're seeing a number of abnormal/unexpected TIMEOUTs on tokay2:

https://bioconductor.org/checkResults/3.7/bioc-LATEST/tokay2-index.html#show=timeout

These TIMEOUTs have been preventing the Windows binaries of the
corresponding packages from propagating. We're currently investigating
the cause of these TIMEOUTs and will let you know.

Sorry for the inconvenience.

On 03/18/2018 07:35 PM, Gordon K Smyth wrote:
There has been no Windows binary for the GenomicRanges package since
version 1.31.12, which was nearly 2 months ago.

This means that I can't install GenomicFeatures either, because it
depends on a later version of GenomicRanges, so the whole Bioc ranges
infrastructure is incapacitated.

What is the ETA for a Windows version? I notice there haven't been any
commits to GenomicRanges since 27 Feb.

Gordon

--
Professor Gordon K Smyth
Head, Bioinformatics Division
Walter and Eliza Hall Institute of Medical Research


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone: (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Tensorflow support for bioconductor packages

2018-03-28 Thread Hervé Pagès


On 03/28/2018 02:41 PM, Hervé Pagès wrote:

Hi Kieran,

Note that you can execute arbitrary code at load time by defining
an .onLoad() hook in your package. So you *could* put something
like this in your package:

   .onUnload <- function(libpath)
   {
     if (!reticulate::py_module_available("tensorflow"))
     tensorflow::install_tensorflow()
   }


should be .onLoad() in the above code

more below...



However, having things being automatically downloaded/installed
on the user machine at package load-time is not a good idea. There
are just too many things that can go wrong.

For example, I just tried to run tensorflow::install_tensorflow()
on my laptop (Ubuntu 16.04) and was successful only after the 3rd
attempt (I had to make some changes/adjustments to my system between
each attempt). And Debian Linux is probably the easiest target!

Also note that install.packages() tries to load the package at the
end of the installation when installing from source so if the
.onUnload() hook fails, install.packages() considers that

  ^^^
   .onLoad()

same here, sorry

H.


the installation of the package failed and it removes it.

Finally note that this installation needs to download hundreds of
Mb of Python stuff.

So this is probably the reasons why the authors of the tensorflow
CRAN package chose to separate installation of the tensorflow Python
module from the installation of the package itself. There are plenty
of good reasons for doing that.

What I would suggest instead is that you start your vignette with a
note reminding the user to run tensorflow::install_tensorflow() if
s/he didn't already do it. As a side note: I couldn't find a way to
programmatically figure out whether the tensorflow Python module is
already installed in the man page for tensorflow::install_tensorflow(),
I had to dig in the source code of the unit tests to find 
reticulate::py_module_available("tensorflow")).


In addition, you could also start each of your functions that rely on
the tensorflow Python module with a check to see whether the module is
available, and fail gracefully (with an informative error message) if
it's not.

We'll figure out a way to install the tensorflow Python module on our
build machines.

Hope this helps,
H.


On 03/28/2018 09:23 AM, Kieran Campbell wrote:

Hi all,

Rstudio have released the Tensorflow package for R -
https://tensorflow.rstudio.com/tensorflow/ - and we have started
incorporating it into some of our genomics packages for the heavy
numerical computation.

We would ideally like these to be submitted to Bioconductor, but
there's a custom line required for Tensorflow installation in that
after calling

install.packages("tensorflow")

then Tensorflow must be installed via

tensorflow::install_tensorflow()

which would break package testing if tensorflow was simply imported
into the R package and wasn't already installed. Is there any way to
customise a package installation within Bioconductor to trigger the
tensorflow::install_tensorflow() ?

As more people use tensorflow / deep learning in genomics I can see
this being a problem so it would be good to have a solution in place.

Many thanks,

Kieran Campbell








--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] Tensorflow support for bioconductor packages

2018-03-28 Thread Hervé Pagès


Hi Kieran,

Note that you can execute arbitrary code at load time by defining
an .onLoad() hook in your package. So you *could* put something
like this in your package:

  .onLoad <- function(libname, pkgname)
  {
    if (!reticulate::py_module_available("tensorflow"))
      tensorflow::install_tensorflow()
  }

However, having things being automatically downloaded/installed
on the user machine at package load-time is not a good idea. There
are just too many things that can go wrong.

For example, I just tried to run tensorflow::install_tensorflow()
on my laptop (Ubuntu 16.04) and was successful only after the 3rd
attempt (I had to make some changes/adjustments to my system between
each attempt). And Debian Linux is probably the easiest target!

Also note that install.packages() tries to load the package at the
end of the installation when installing from source so if the
.onLoad() hook fails, install.packages() considers that
the installation of the package failed and it removes it.

Finally note that this installation needs to download hundreds of
Mb of Python stuff.

So these are probably the reasons why the authors of the tensorflow
CRAN package chose to separate installation of the tensorflow Python
module from the installation of the package itself. There are plenty
of good reasons for doing that.

What I would suggest instead is that you start your vignette with a
note reminding the user to run tensorflow::install_tensorflow() if
s/he didn't already do it. As a side note: I couldn't find a way to
programmatically figure out whether the tensorflow Python module is
already installed in the man page for tensorflow::install_tensorflow().
I had to dig in the source code of the unit tests to find
reticulate::py_module_available("tensorflow").


In addition, you could also start each of your functions that rely on
the tensorflow Python module with a check to see whether the module is
available, and fail gracefully (with an informative error message) if
it's not.
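
A minimal sketch of such a check, under some assumptions: requirePythonModule and fitModel are made-up names for illustration, and the 'available' argument exists only so the check can be exercised without Python. By default it uses reticulate::py_module_available(), the function mentioned above.

```r
# Stop with an informative message unless the Python module is available.
requirePythonModule <- function(module, available = NULL) {
  if (is.null(available)) {
    if (!requireNamespace("reticulate", quietly = TRUE))
      stop("the 'reticulate' package is needed to look for Python module '",
           module, "'", call. = FALSE)
    available <- reticulate::py_module_available
  }
  if (!isTRUE(available(module)))
    stop("Python module '", module, "' is not available; ",
         "for tensorflow, try tensorflow::install_tensorflow()",
         call. = FALSE)
  invisible(TRUE)
}

# Hypothetical package function that relies on the module.
fitModel <- function(x, ...) {
  requirePythonModule("tensorflow", ...)
  # ... tensorflow-based computation would go here ...
  invisible(x)
}
```

With this in place the user gets one clear error at the start of the call instead of an obscure failure deep inside the Python bridge.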

We'll figure out a way to install the tensorflow Python module on our
build machines.

Hope this helps,
H.


On 03/28/2018 09:23 AM, Kieran Campbell wrote:

Hi all,

Rstudio have released the Tensorflow package for R -
https://tensorflow.rstudio.com/tensorflow/ - and we have started
incorporating it into some of our genomics packages for the heavy
numerical computation.

We would ideally like these to be submitted to Bioconductor, but
there's a custom line required for Tensorflow installation in that
after calling

install.packages("tensorflow")

then Tensorflow must be installed via

tensorflow::install_tensorflow()

which would break package testing if tensorflow was simply imported
into the R package and wasn't already installed. Is there any way to
customise a package installation within Bioconductor to trigger the
tensorflow::install_tensorflow() ?

As more people use tensorflow / deep learning in genomics I can see
this being a problem so it would be good to have a solution in place.

Many thanks,

Kieran Campbell


Re: [Bioc-devel] Error during CHECK on Windows (tokay1)

2018-03-22 Thread Hervé Pagès

FWIW the build report for BioC release got updated about 2h ago
with today's results and we don't see this CHECK error for
openPrimeR anymore:

https://bioconductor.org/checkResults/3.6/bioc-LATEST/openPrimeR/tokay1-checksrc.html

This is typical with race conditions: some days we'll see the error
and some days we won't. Since openPrimeR has not changed between
release and devel, it's possible that we'll sometimes see this error
in devel too.


Re: [Bioc-devel] Error during CHECK on Windows (tokay1)

2018-03-22 Thread Hervé Pagès


The build system captures the output of the 'R CMD check' command
and displays it. If you look at the devel build report you'll
see that it also displays the content of some of the files produced
by 'R CMD check':

  - openPrimeR.Rcheck/00install.out
  - openPrimeR.Rcheck/tests_i386/testthat.Rout
  - openPrimeR.Rcheck/tests_x64/testthat.Rout
  - openPrimeR.Rcheck/examples_i386/openPrimeR-Ex.timings
  - openPrimeR.Rcheck/examples_x64/openPrimeR-Ex.timings

Yes, it will open these files in order to read their content, but
openPrimeR-Ex_i386.Rout is not one of them.

It also checks that those files exist before trying to open them.
If they don't exist, they're ignored. This is why the 'Tests output'
and 'Example timings' sections at the bottom of openPrimeR check
report in release are empty.

However I'm pretty confident that 'R CMD check' opens
openPrimeR-Ex_i386.Rout after running the examples in 32-bit
mode. I think it parses the file in order to detect/report
problems that happened during the run of the examples.

Just to clarify: I don't think the problem is that the file
doesn't exist when 'R CMD check' tries to open it. I think the
problem is that some process is still holding on to the file (via
a write connection), which prevents 'R CMD check' from opening
it (even if it tries to open it in read-only mode).

Disabling parallel execution in your examples would be very
informative!

H.


On 03/22/2018 10:29 AM, Matthias Döring wrote:

Dear Hervé,
thanks for the detailed explanation. I have, however, one question.
Isn't this rather a problem with the Bioconductor build system than with
the package itself? I will explain why I think so.
From my limited understanding, it seems to me that the out file from
running the package examples, in this case, "openPrimeR-Ex.Rout" is not
read by R CMD check itself but only written to.
So, only after R CMD check has run, should the Bioconductor build system
try to open the file to do some checks (I don't know what it's checking
but apparently there is something of interest in the file).

At this point, there should be a check whether the file is available for
reading, shouldn't there be? I took a dive into the past and found this
comment from Martin Morgan to a person that had the same problem as I,
which supports my view.

http://grokbase.com/t/r/bioc-devel/11asgm4efj/development-version-of-bayseq-failing-check-on-windows-machines

So I'm not sure if adjusting the parallel execution would be a permanent
fix that would ensure that all Bioconductor packages pass the checks
smoothly on Windows. Maybe Martin could comment on this as well?

Kind regards
Matthias


On 03/22/2018 05:08 PM, Hervé Pagès wrote:

Hi Matthias,

Not sure what's causing this but I just wanted to mention a couple
of things:

Even though you didn't modify your package, it can start failing
for many reasons. The most common reason is that R or another package
that your package depends on was modified. It turns out that a few
days ago we updated R from 3.4.3 to 3.4.4 on the release build
machines. Not sure that's related to the error you're seeing on
tokay1 but this cannot be completely ruled out.

Unlike Unix-like systems, Windows doesn't let a process open a file
if the file is currently in use (by the same or another process).
The "cannot open file 'foo': Permission denied" error is typically
the result of such a situation. This type of error can be hard to
reproduce because it won't necessarily happen each time one runs
'R CMD check' on Windows. It only happens if one process is still
holding on to file 'foo' when another process tries to access 'foo'.
But sometimes the process holding on to 'foo' finishes a little bit
earlier (and releases the file) or the process trying to access
'foo' does it a little bit later, and everything is fine.
Because of this, troubleshooting this kind of error (called a
race condition) can be tricky.
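
To illustrate the kind of situation described above (only a sketch:
write_log() and the file name are made up, and this is not the code
that 'R CMD check' or openPrimeR actually runs), a file written through
a connection stays held by the process until the connection is closed.
Closing it explicitly, e.g. with on.exit(), releases the file before
anything else tries to read it.

```r
# Hypothetical helper: write some lines to a file and make sure the
# write connection is closed when the function exits, even on error.
write_log <- function(path, lines) {
    con <- file(path, open = "w")
    on.exit(close(con), add = TRUE)
    writeLines(lines, con)
    invisible(path)
}

log_file <- tempfile(fileext = ".Rout")
write_log(log_file, c("first line", "second line"))

# The connection is closed at this point, so reading the file back
# does not compete with a writer that is still holding on to it.
log_lines <- readLines(log_file)
```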

The first thing you could do is disable parallel execution in
your examples and see if that eliminates the issue. If you still
see the error, then you can remove doParallel/foreach from the
equation. If you don't see the error anymore (but you'll have
to wait several build iterations to be sure because you could
just be lucky that the race condition doesn't happen for a few
days in a row), then you can fairly suspect that the problem
has something to do with doParallel/foreach and contact the
doParallel/foreach folks.

Hope this helps,
H.



Re: [Bioc-devel] Error during CHECK on Windows (tokay1)

2018-03-22 Thread Hervé Pagès

Hi Matthias,

Not sure what's causing this but I just wanted to mention a couple
of things:

The first thing you could do is disable parallel execution in
your examples and see if that eliminates the issue. If you still
see the error, then you can remove doParallel/foreach from the
equation. If you don't see the error anymore (but you'll have
to wait several build iterations to be sure because you could
just be lucky that the race condition doesn't happen for a few
days in a row), then you can fairly suspect that the problem
has something to do with doParallel/foreach and contact the
doParallel/foreach folks.
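
Something along these lines could be used to switch parallel execution
on and off in the examples. This is only a sketch based on the base
'parallel' package rather than on doParallel/foreach, and
run_examples() and its arguments are made-up names. With foreach, I
believe registering the sequential backend with foreach::registerDoSEQ()
plays the same role as the use_parallel = FALSE branch below.

```r
library(parallel)

# Hypothetical wrapper: run 'fun' over 'xs' either sequentially or on a
# local cluster that is always shut down when the function exits.
run_examples <- function(xs, fun, use_parallel = FALSE, workers = 2L) {
    if (!use_parallel) {
        return(lapply(xs, fun))
    }
    cl <- makeCluster(workers)
    # Stopping the cluster terminates the worker processes, so they
    # cannot keep any file open after the examples have finished.
    on.exit(stopCluster(cl), add = TRUE)
    parLapply(cl, xs, fun)
}

# Sequential run, as suggested for troubleshooting.
seq_res <- run_examples(1:4, function(x) x^2)
```

Calling run_examples(1:4, function(x) x^2, use_parallel = TRUE) takes
the cluster branch instead; comparing the build results of the two
settings over several days is what would tell whether the parallel
backend is involved.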

Hope this helps,
H.

On 03/22/2018 07:31 AM, Matthias Döring wrote:

Although I didn't make any changes to my package openPrimeR, it is
currently not passing the checks on tokay1 due to the following error
while running the examples:

* checking examples ...
** running examples for arch 'i386' ...Warning in file(con, "r") :
cannot open file '../openPrimeR-Ex_i386.Rout': Permission denied
Error in file(con, "r") : cannot open the connection
Execution halted

http://bioconductor.org/checkResults/release/bioc-LATEST/openPrimeR/tokay1-checksrc.html

I've just searched for this type of error and found that it can be
caused by the parallel backend if it isn't explicitly closed. However,
I'm not creating clusters manually but rather using the foreach package
for this purpose, so this shouldn't be the root of the problem.
Interestingly, there are no problems for the development version on
tokay2
(http://bioconductor.org/checkResults/devel/bioc-LATEST/openPrimeR/tokay2-checksrc.html).

Should I just ignore this problem? Any pointers?

Best

Matthias


Re: [Bioc-devel] Update existing packages and change name

2018-03-22 Thread Hervé Pagès


Hi Sokratis,

On 03/22/2018 06:13 AM, Sokratis Kariotis wrote:

  Hey all,

I have been updating my pathprint and pathprintGEOData packages (in 3.7)
and I encountered the following error in one of them (only on windows
server):

Error: cannot allocate vector of size 237.3 Mb

Here is the report:
https://bioconductor.org/checkResults/3.7/bioc-LATEST/pathprint/tokay2-checksrc.html

How can I fix this? Thanks in advance.


Note that this error happens only when running the examples in 32-bit
mode (arch i386) but not in 64-bit mode (arch x64). 32-bit Windows
doesn't allow a process to use more than 2 Gb of RAM. There is no such
limit with 64-bit Windows.

You have 2 options: (1) reduce the memory footprint of the examples,
or (2) give up on supporting your package for 32-bit Windows users.

(2) should be the last resort.
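
For option (1), one possibility (just a sketch, with made-up names and
sizes) is to have the examples work on a smaller subset when R runs in
32-bit mode. .Machine$sizeof.pointer is 4 in a 32-bit R session and 8
in a 64-bit one.

```r
# TRUE when the current R session is a 32-bit one.
is_32bit <- function() .Machine$sizeof.pointer == 4L

# Hypothetical example size: fewer rows when running in 32-bit mode.
n_rows <- if (is_32bit()) 1000L else 100000L
example_data <- matrix(0, nrow = n_rows, ncol = 10L)

# Report how much memory the example object takes.
example_size <- format(object.size(example_data), units = "Mb")

# Drop large intermediate objects as soon as they are no longer needed.
rm(example_data)
invisible(gc())
```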

Cheers,
H.



Cheers,
Sokratis


On 21 February 2018 at 20:37, Obenchain, Valerie <
valerie.obench...@roswellpark.org> wrote:


Hi Sokratis,

Unfortunately you can't rename a package ... and there is no such thing as
a 'sub package'. Maybe you meant sub-package in terms of dependencies. If you
have new data to add that's independent of what's in pathprintGEOData you
can create a new data package.

I'm not sure what you mean by 'considerably more data'. The
pathprintGEOData package is currently 22 MB:

~ >du -h pathprintGEOData_1.4.0.tar.gz
22M     pathprintGEOData_1.4.0.tar.gz

We don't support git LFS yet in the experimental data repo. If you plan on
making the package(s) very large we should consider making an ExperimentHub
package where the data can be stored in S3.

Valerie



On 02/21/2018 05:16 AM, Sokratis Kariotis wrote:

Hey all, I am updating 2 of my packages (pathprint & pathprintGEOData) with
considerably more data and a few code updates. I also want to change the
name of the data package to reflect the new content. Can the package name
be changed or do I have to create new sub-packages? Thanks in advance.

Cheers




Re: [Bioc-devel] Help for Error "Maximal Number of DLLs reached..."

2018-03-21 Thread Hervé Pagès

‘FEM’
Warning: replacing previous import ‘igraph::normalize’ by 
‘BiocGenerics::normalize’ when loading ‘FEM’
Warning: replacing previous import 'plyr::summarise' by 
'plotly::summarise' when loading 'ChAMP'
Warning: replacing previous import 'plyr::rename' by 'plotly::rename' 
when loading 'ChAMP'
Warning: replacing previous import 'plyr::arrange' by 'plotly::arrange' 
when loading 'ChAMP'
Warning: replacing previous import 'plyr::mutate' by 'plotly::mutate' 
when loading 'ChAMP'

Error in dyn.load(file, DLLpath = DLLpath, ...) :
  unable to load shared object 
'/home/hpages/R/R-3.4.3/library/robustbase/libs/robustbase.so':

  `maximal number of DLLs reached...
ERROR: lazy loading failed for package 'ChAMP'
* removing '/home/hpages/R/R-3.4.3/library/ChAMP'

The downloaded source packages are in
‘/tmp/Rtmp87p9oq/downloaded_packages’
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done

So even if ChAMP's documentation says something about increasing the
max number of DLLs, most users are probably going to find this situation
frustrating and complain about it.


Thus I think I have to reduce the DLLs used by the
current package, right? Like removing some dependency packages or
functions. My other question is how many DLLs are allowed by the
Bioconductor check? I think it's fewer than 100, but I don't know
whether I should cut it down to 80, 60, or even 50 DLLs.

It's really disappointing that I need to modify quite a lot of code, and
it could even hurt some key functionality of the package. Thus I am
seeking your help and suggestions here.


Frankly, I think a package with so many dependencies cannot be 
maintained -- a change in any one of those packages could break your 
package (e.g., by changing their own dependencies to include additional 
DLLs!), and it must be virtually impossible to get sufficient test 
coverage to be confident that these problems will be detected before the 
package is made available to the user. It is time to consider a more 
modular design focusing on essential features.


The first thing you could try to do is move some deps from Depends or
Imports to Suggests. Right now, after doing library(ChAMP),
sessionInfo() reports 36 packages on the search path and another 160
packages loaded via a namespace! Couldn't some of those packages be
moved to Suggests without hurting the usability of ChAMP?
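
To get a rough idea of where a session stands, something like the
following can be run before and after loading the package (a sketch
only; count_loaded() is a made-up helper name). getLoadedDLLs(),
search() and loadedNamespaces() are base R functions. As far as I know
the limit in R 3.4 defaults to 100 DLLs and can be raised with the
R_MAX_NUM_DLLS environment variable, which must be set before R starts.

```r
# Hypothetical helper: count loaded DLLs, entries on the search path,
# and loaded namespaces in the current session.
count_loaded <- function() {
    c(dlls       = length(getLoadedDLLs()),
      attached   = length(search()),
      namespaces = length(loadedNamespaces()))
}

before <- count_loaded()
# library(ChAMP)   # uncomment in a session where ChAMP is installed
after <- count_loaded()

# What loading the package added to the session.
added <- after - before
```

Moving a dependency from Depends/Imports to Suggests only helps if the
package is then loaded on demand, so comparing 'added' before and after
such a change shows whether it actually reduced the DLL count.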

One last thing: ChAMP is at version 2.9.10 in release and 2.9.11 in
devel. This is not good. The y part of x.y.z should always be even
in release. Generally speaking, you should only bump the z part of the
version when you make changes to your package. We take care of bumping
the y part for you at release time. See:

  https://bioconductor.org/developers/how-to/version-numbering/

Too late to fix in release though. Just be aware that when we release
BioC 3.7, we'll bump ChAMP version to 2.10.0 in the new release and
to 2.11.0 in the new devel (i.e. BioC 3.8).
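
The even/odd rule can be checked with a few lines of R. This is only an
illustration; is_release_version() is a made-up name and not part of
any Bioconductor tool.

```r
# Hypothetical helper: TRUE when the y part of an x.y.z version string
# is even, which is what is expected in a release branch.
is_release_version <- function(version) {
    parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
    stopifnot(length(parts) == 3L, !anyNA(parts))
    parts[2] %% 2L == 0L
}

# The versions mentioned above: current release, next release, next devel.
checks <- c(current_release = is_release_version("2.9.10"),
            next_release    = is_release_version("2.10.0"),
            next_devel      = is_release_version("2.11.0"))
```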

Cheers,
H.



Martin



Best
Yuan Tian


Re: [Bioc-devel] No Windows binary for GenomicRanges

2018-03-19 Thread Hervé Pagès


Hi Gordon,

We're seeing a number of abnormal/unexpected TIMEOUTs on tokay2:


https://bioconductor.org/checkResults/3.7/bioc-LATEST/tokay2-index.html#show=timeout

These TIMEOUTs have been preventing the Windows binaries of the
corresponding packages from propagating. We're currently investigating
the cause of these TIMEOUTs and will let you know.

Sorry for the inconvenience.

H.


On 03/18/2018 07:35 PM, Gordon K Smyth wrote:

There has been no Windows binary for the GenomicRanges package since version 
1.31.12, which was nearly 2 months ago.

This means that I can't install GenomicFeatures either, because it depends on a 
later version of GenomicRanges, so the whole Bioc ranges infrastructure is 
incapacitated.

What is the ETA for a Windows version? I notice there haven't been any commits 
to GenomicRanges since 27 Feb.

Gordon

--
Professor Gordon K Smyth
Head, Bioinformatics Division
Walter and Eliza Hall Institute of Medical Research


Re: [Bioc-devel] mcols Function Not Found for Windows Build

2018-03-15 Thread Hervé Pagès


Hi Dario,

You're missing several imports. The CHECK results on the other platforms
contain "no visible global function definition" notes for several
symbols including 'mcols':


https://bioconductor.org/checkResults/3.7/bioc-LATEST/ClassifyR/malbec2-checksrc.html

Not sure why these missing imports only cause problems on Windows but
the error seems to happen in the context of BiocParallel, which could
be doing some platform-dependent business behind the scenes. Note that,
generally speaking, code can still work properly even with missing
imports if the package where the missing imports are defined is in
the Depends field. This is the case here for S4Vectors. Because it's
in Depends and not in Imports, it ends up in the search path after
doing library(ClassifyR) so symbols defined in S4Vectors can be found
even if they are not imported. However, you would probably get an
error on all platforms if S4Vectors was in Imports instead of Depends.

Unless you have a good reason for importing selectively from S4Vectors,
I would recommend that you import S4Vectors entirely. This will
make the maintenance of your package easier in the long run and with
no significant downside.
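
In the NAMESPACE file the two approaches look roughly like this (a
sketch; I haven't checked what ClassifyR's NAMESPACE currently
contains, and the symbols listed in the selective import are only
examples). A package would use one or the other, not both.

```
# Selective import: every S4Vectors symbol used by the package
# must be listed explicitly.
importFrom(S4Vectors, mcols, DataFrame)

# Importing S4Vectors entirely: nothing to keep in sync when the
# package starts using another S4Vectors symbol.
import(S4Vectors)
```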

Cheers,
H.


On 03/15/2018 06:00 PM, Dario Strbenac wrote:

Good day,

I notice an error happening when the vignette of ClassifyR is checked by 
tokay2. mcols is not found. I viewed the check reports of S4Vectors, and there 
are some Warnings for all operating systems, but no platform has Error, so it's 
unlikely to be related to the problem. Is there a way to make ClassifyR guard 
against this problem in Windows? I don't know how to begin solving this issue.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] Best practice on commit

2018-03-15 Thread Hervé Pagès


Hi,

Just to clarify, all software packages are built and checked every
night, independently of whether their version got bumped or not.
The version bump only allows the modified package to propagate
to the public repo and become available via biocLite(), possibly
replacing the previous version of the package (if one is already
available).

The package landing page on the website should reflect what's in
the public repo for a given BioC version at any time (the landing
pages can lag behind by about 20 min which is the time it takes
to regenerate all the package landing pages after the daily package
propagation).

Cheers,
H.


On 03/15/2018 07:15 AM, Tyler Smith wrote:

On Sun, Mar 11, 2018, at 4:10 AM, Egon Willighagen wrote:

But previously I learned that when you push
something to the repository, you should bump the version, so currently I
do this for every change I make, leaving a ridiculous number of minor
releases and really short NEWS entries...


I wondered about this, particularly for the NEWS entries. What I have been 
doing is bumping the version number for each commit - to make sure that the 
checks on the server get run - but collapsing multiple version bullets into a 
single NEWS item. i.e., my NEWS looks like:


Changes in version 1.5.6 (2018-03-08)
-

User Visible Changes:

 * BUG FIX: the G2 peak of the B sample was not getting incorporated
   into model construction, which caused model fitting to fail on
   samples with histograms skewed towards the left.

 * Updated DebrisModel documentation.

Internal Changes:

 * Fixed broken test.

Changes in version 1.5.3 (2018-01-17)
-



I think that will be more convenient for users, who won't care that the bug 
fix, updated documentation, and fixed test were actually three different 
commits. If they do want that level of detail, that information is available 
directly from the repository. And from my perspective, I definitely wanted to 
bump the version for at least the test fix and the bug fix, to make sure that 
the server-side tests were run.

Best,

Tyler


Re: [Bioc-devel] as.list fails on IRanges inside of lapply(, blah)

2018-02-20 Thread Hervé Pagès


On 02/20/2018 01:25 PM, Gabe Becker wrote:

Herve,

Thanks for the response. The loop across ranges that's still in
there is:


  dss = switch(seqtype,
               bp = DNAStringSet(lapply(ranges(srcs),
                                        function(x) origin[x])),
               aa = AAStringSet(lapply(ranges(srcs),
                                       function(x) origin[x])),
               stop("Unrecognized origin sequence type: ", seqtype)
               )

(Line 495 in genbankReader.R)


That was also fixed in genbankr 1.7.2. I replaced this with

  dss = extractAt(origin, ranges(srcs))

Do 'git show 340b0d4fac511f8171391fdeb2233ca6a410743d' to see
the details of the changes I made.
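
For readers without Biostrings at hand, the difference between looping
over ranges and a single vectorized call can be sketched with base R
only. This is an analogue, not the genbankr code: substring() on a
plain character string stands in for extractAt() on a DNAString, the
sequence is made up, and the start/end values are the ones from the
rng2 example in this thread.

```r
origin <- "ACGTACGTAC"      # made-up sequence
starts <- c(1L, 7L)
ends   <- c(3L, 10L)

# Looping over the ranges one at a time (the pattern being replaced).
looped <- vapply(seq_along(starts),
                 function(i) substr(origin, starts[i], ends[i]),
                 character(1))

# One vectorized call that returns a separate string for each range.
vectorized <- substring(origin, starts, ends)

# Same idea for the widths: arithmetic on whole vectors, no loop.
widths <- ends - starts + 1L
```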

Cheers,
H.



srcs is a GRanges, making ranges(srcs) an IRanges, so this lapply fails.
I'm not sure what I'm meant to do here as there's not an already
vectorized version that I know of that does the right thing (I want
separate DNAStrings for each range, so origin[ranges(srcs)] doesn't work).


I mean I can force the conversion to list with
lapply(1:length(srcs), function(i) ranges(srcs)[i]) or similar but that
seems pretty ugly...


As for the other issue with the build not working in release, that is a
bug in the rentrez package (which is on CRAN, not Bioc). I've submitted a
PR to fix that, and we'll see what the response is as to whether I need to
remove that integration or not.


~G






On Tue, Feb 20, 2018 at 10:48 AM, Hervé Pagès wrote:


Hi Gabe,

I made a couple of changes to genbankr (1.7.2) to avoid that looping,
e.g. I replaced things like

     sapply(gr, width)

with

     width(gr)

I can't run a full 'R CMD build' + 'R CMD check' on the package though
because the code in the vignette seems to fail for reasons unrelated
to the recent changes to IRanges / GenomicRanges (I get the same error
with the release version, see release build report).

The previous behavior of as.list() on IRanges and GRanges objects will
be restored (with a deprecation warning) once all the packages that
need a fix get one (only 7 packages left on my list). I should be done
with them in the next couple of days.

H.


On 02/20/2018 09:41 AM, Gabe Becker wrote:

All,

I'm trying to track down the new failure in my genbankr package and it
appears to come down to the fact that I'm trying to lapply over an
IRanges, which fails in the IRanges to list (or List?) conversion. The
particular case that fails in my example is an IRanges of length 1 but that
does not appear to matter, as lapply fails over IRanges of length >1 as
well.

Is this intentional? If so, it seems a change of this magnitude would
warrant a deprecation cycle at least. If not, please let me know so I can
leave the code as is and wait for the fix.

rng1 = IRanges(start = 1, end = 5)


rng2 = IRanges(start = c(1, 7), end = c(3, 10))


rng1


IRanges object with 1 range and 0 metadata columns:

            start       end     width

          

    [1]         1         5         5

rng2


IRanges object with 2 ranges and 0 metadata columns:

            start       end     width

          

    [1]         1         3         3

    [2]         7        10         4

lapply(rng1, identity)


*Error in (function (classes, fdef, mtable)  : *

*  unable to find an inherited method for function
‘getListElement’ for
signature ‘"IRanges"’*

lapply(rng2, identity)


*Error in (function (classes, fdef, mtable)  : *

*  unable to find an inherited method for function
‘getListElement’ for
signature ‘"IRanges"’*

sessionInfo()


R Under development (unstable) (2018-02-16 r74263)

Platform: x86_64-apple-darwin15.6.0 (64-bit)

Running under: OS X El Capitan 10.11.6


Matrix products: default

BLAS:

/Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRblas.dylib

LAPACK:

/Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib


locale:

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8


attached base packages:

[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base


other attached packages:

*[1] IRanges_2.13.26     S4Vectors_0.17.33   BiocGenerics_0.25.3*


loaded via a namespace (a


Re: [Bioc-devel] as.list fails on IRanges inside of lapply(, blah)

2018-02-20 Thread Hervé Pagès


Hi Gabe,

I made a couple of changes to genbankr (1.7.2) to avoid that kind of
looping, e.g. I replaced things like

sapply(gr, width)

with

width(gr)
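
A minimal sketch of that kind of replacement, assuming the IRanges package is installed. The object name `ir` and the index-based loop are illustrative only and are not taken from genbankr:

```r
library(IRanges)

# Illustrative ranges (same values as rng2 in the report quoted below).
ir <- IRanges(start = c(1, 7), end = c(3, 10))

# Vectorized accessor: a single call, no looping over individual ranges.
w_vectorized <- width(ir)

# Index-based loop, shown only for comparison. It subsets with `[`, so it
# does not rely on as.list() working on the IRanges object.
w_looped <- vapply(seq_along(ir), function(i) width(ir[i]), integer(1))

# Both approaches give the same integer vector of widths.
stopifnot(identical(w_vectorized, w_looped))
```

The vectorized form avoids any dependence on how the object converts to a list.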

I can't run a full 'R CMD build' + 'R CMD check' on the package though
because the code in the vignette seems to fail for reasons unrelated
to the recent changes to IRanges / GenomicRanges (I get the same error
with the release version, see release build report).

The previous behavior of as.list() on IRanges and GRanges objects will
be restored (with a deprecation warning) once all the packages that
need a fix get one (only 7 packages left on my list). I should be done
with them in the next couple of days.

H.

On 02/20/2018 09:41 AM, Gabe Becker wrote:

All,

I'm trying to track down the new failure in my genbankr package and it
appears to come down to the fact that I'm trying to lapply over an
IRanges, which fails in the IRanges to list (or List?) conversion. The
particular case that fails in my example is an IRanges of length 1 but that
does not appear to matter, as lapply fails over IRanges of length >1 as
well.

Is this intentional? If so, it seems a change of this magnitude would
warrant a deprecation cycle at least. If not, please let me know so I can
leave the code as is and wait for the fix.


rng1 = IRanges(start = 1, end = 5)



rng2 = IRanges(start = c(1, 7), end = c(3, 10))



rng1


IRanges object with 1 range and 0 metadata columns:
          start       end     width
      <integer> <integer> <integer>
  [1]         1         5         5


rng2


IRanges object with 2 ranges and 0 metadata columns:
          start       end     width
      <integer> <integer> <integer>
  [1]         1         3         3
  [2]         7        10         4


lapply(rng1, identity)


*Error in (function (classes, fdef, mtable)  : *

*  unable to find an inherited method for function ‘getListElement’ for
signature ‘"IRanges"’*


lapply(rng2, identity)


*Error in (function (classes, fdef, mtable)  : *

*  unable to find an inherited method for function ‘getListElement’ for
signature ‘"IRanges"’*


sessionInfo()


R Under development (unstable) (2018-02-16 r74263)

Platform: x86_64-apple-darwin15.6.0 (64-bit)

Running under: OS X El Capitan 10.11.6


Matrix products: default

BLAS:
/Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRblas.dylib

LAPACK:
/Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib


locale:

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8


attached base packages:

[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base


other attached packages:

*[1] IRanges_2.13.26 S4Vectors_0.17.33   BiocGenerics_0.25.3*


loaded via a namespace (and not attached):

[1] compiler_3.5.0 tools_3.5.0



Best,
~G





--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] as.list of a GRanges

2018-02-19 Thread Hervé Pagès

Hi Renan,

Most packages affected by these changes are packages that loop on
the individual ranges of a GRanges object. They generally don't
call as.list() directly but use something like lapply(), vapply(),
sapply(), Map(), Reduce(), etc... All these functions indeed call
as.list() internally on the supplied object before looping on it.
Just to clarify, when I say I found a dozen Bioconductor packages
in the entire software repo where as.list() was used on a GRanges
object, I'm counting all the packages that use it explicitly or
implicitly. This includes signeR, which I had on my list of packages
to fix.
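
The implicit call can be observed with base R alone. The sketch below uses a made-up S3 class, `toy_ranges`, as a stand-in for a range container (it is hypothetical and unrelated to how GRanges is actually implemented) and counts how often its as.list() method is reached:

```r
# Hypothetical toy class: a numeric vector of starts carrying the ends
# as an attribute. Not part of any package.
toy_ranges <- function(start, end) {
  structure(start, end = end, class = "toy_ranges")
}

as_list_calls <- 0L

# as.list() method that records each call and returns one element per range.
as.list.toy_ranges <- function(x, ...) {
  as_list_calls <<- as_list_calls + 1L
  starts <- as.vector(unclass(x))  # as.vector() drops the attributes
  ends <- attr(x, "end")
  Map(function(s, e) c(start = s, end = e), starts, ends)
}

tr <- toy_ranges(start = c(1, 7), end = c(3, 10))

# vapply() never mentions as.list(), yet the method above is invoked.
widths <- vapply(tr, function(r) r[["end"]] - r[["start"]] + 1, numeric(1))

# Reduce() converts its input the same way before folding over it.
span <- Reduce(
  function(a, b) c(start = min(a[["start"]], b[["start"]]),
                   end = max(a[["end"]], b[["end"]])),
  tr
)

stopifnot(as_list_calls == 2L)
```

Any change to what as.list() returns for a class therefore changes the behavior of such loops, even in code that never calls as.list() directly.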

BTW in this particular instance, I would recommend doing

reduce(granges, drop.empty.ranges=TRUE)

instead of

Reduce(union, as(granges, "GRangesList"))

reduce() walks on the individual ranges of the supplied object at
the C level so is much faster than performing a binary union in
an R loop. It should also be more memory efficient.

Cheers,
H.

On 02/16/2018 09:02 AM, Renan Valieris wrote:

FWIW, this change also affects code that doesn't call as.list() explicitly,

such as Reduce(union, granges): Reduce is implemented in base, and
will call as.list() on its argument if it isn't a vector already.

I understand it wasn't intended to be used this way, but with this in mind
there are more packages potentially affected by the change.

On Fri, Feb 16, 2018 at 1:25 PM, Nathan Sheffield <nat...@code.databio.org>
wrote:

For what it's worth, my package (LOLA) was one that used as.list on a
GRanges or GRangesList, and those calls were broken by changes to devel.
Since I was also pushing changes at the time, I assumed the devel build
errors were due to my updates -- I spent quite a bit of time trying to
figure out what was wrong before I realized this breakage was not caused by
my updates, but by upstream changes in GRanges...eventually I tracked down
errors to as.list (and ultimately, found other errors, which we discussed
earlier on this list), but my conclusion from this was that, from my
perspective, using the deployed bioc devel as a way to test for what
refactoring will break doesn't seem like the ideal way to go -- I assumed
that generally, other package changes wouldn't typically be pushed that
would break my package's build, so it devalued the role of the dev builds
and reduced my confidence in using that (now when I see error I may assume
it's something else, and wait a few days, instead of diving right in to try
to solve the problem).

I like the idea of temporarily restoring as.list with a deprecation
message -- also, as a general development philosophy going forward in terms
of testing on devel. This would have saved me a lot of time troubleshooting
in this instance.

Just my 2 cents.

-Nathan

On 02/16/2018 02:57 AM, Bernat Gel wrote:

Hi Hervé and others,

Thanks for the responses.

I wouldn't call as.list() of a GRanges an "obscure behaviour" but more a
"works as expected, even if not clearly documented" behaviour.

In any case I can change the code to as(gr, "GRangesList") as suggested.

Thanks again for the responses and discussion :)

Bernat

*Bernat Gel Moreno*
Bioinformatician

Hereditary Cancer Program
Program of Predictive and Personalized Medicine of Cancer (PMPPC)
Germans Trias i Pujol Research Institute (IGTP)

Campus Can Ruti
Carretera de Can Ruti, Camí de les Escoles s/n
08916 Badalona, Barcelona, Spain

Tel: (+34) 93 554 3068
Fax: (+34) 93 497 8654
08916 Badalona, Barcelona, Spain
b...@igtp.cat <mailto:b...@igtp.cat>
www.germanstrias.org 

El 02/15/2018 a las 11:19 PM, Hervé Pagès escribió:

On 02/15/2018 01:57 PM, Michael Lawrence wrote:

On Thu, Feb 15, 2018 at 1:45 PM, Hervé Pagès <hpa...@fredhutch.org
<mailto:hpa...@fredhutch.org>> wrote:

 On 02/15/2018 11:53 AM, Cook, Malcolm wrote:

 Hi,

 Can I ask, is this change under discussion in current release or
 so far in Bioconductor devel only (my assumption)?

 Bioconductor devel only.

> On 02/15/2018 08:37 AM, Michael Lawrence wrote:
> > So is as.list() no longer supported for GRanges objects?
> > I have found it useful in places.
>
> Very few places. I found a dozen of them in the entire
> software repo.

 However there are probably more in the wild...

 What as.list() was doing on a GRanges object was not docu

Re: [Bioc-devel] as.list of a GRanges

2018-02-19 Thread Hervé Pagès


On 02/19/2018 06:43 AM, Michael Lawrence wrote:



On Mon, Feb 19, 2018 at 2:10 AM, Bernat Gel <b...@igtp.cat 
<mailto:b...@igtp.cat>> wrote:


Hi Hervé,

I completely agree with the goal of having the semantics of
list-like operations standardised and documented to avoid surprises,
and if, to do so, the current use of as.list must be changed, I'm
perfectly OK with that. I had not seen the strange behaviour with
IRanges, 



Just want to point out that it's important to keep in mind that many of 
our users never use IRanges directly, so consistency is not an absolute 
requirement.


Even if you only use GRanges objects, it's confusing that lapply()
works on them but not mapply(). The ongoing changes will also
address inconsistencies within the GRanges API, not just the
inconsistencies between the GRanges and IRanges APIs.

H.



so I was not aware of the problem.

In any case, thanks for fixing (and simplifying) karyoploteR. In
retrospect I don't know why I didn't use simple vectorization!
So, thanks


Bernat

*Bernat Gel Moreno*
Bioinformatician

Hereditary Cancer Program
Program of Predictive and Personalized Medicine of Cancer (PMPPC)
Germans Trias i Pujol Research Institute (IGTP)

Campus Can Ruti
Carretera de Can Ruti, Camí de les Escoles s/n
08916 Badalona, Barcelona, Spain

Tel: (+34) 93 554 3068
Fax: (+34) 93 497 8654
08916 Badalona, Barcelona, Spain
b...@igtp.cat <mailto:b...@igtp.cat>
www.germanstrias.org








El 02/17/2018 a las 04:19 AM, Hervé Pagès escribió:

Hi Bernat,

On 02/15/2018 11:57 PM, Bernat Gel wrote:

Hi Hervé and others,

Thanks for the responses.

I wouldn't call as.list() of a GRanges an "obscure behaviour"
but more a "works as expected, even if not clearly
documented" behaviour.


Most users/developers will probably agree that as.list() worked
as expected on a GRanges object. But then they'll be surprised
and confused when they use it on an IRanges object and discover
that it does something completely different. The current effort
is to bring more consistency between GRanges and IRanges objects
and to have their list-like semantics aligned and documented so
there will be no more such surprises.


In any case I can change the code to as(gr, "GRangesList")
as suggested.


I went ahead and fixed karyoploteR. This is karyoploteR 1.5.2. Make
sure to resync your GitHub repo by following the instructions here:



https://bioconductor.org/developers/how-to/git/sync-existing-repositories/



Note that the loop on the GRanges object (via the call to Map())
was not needed and could be replaced with a solution that uses
proper vectorization.

Best,
H.


Thanks again for the responses and discussion :)

Bernat


*Bernat Gel Moreno*
Bioinformatician

Hereditary Cancer Program
Program of Predictive and Personalized Medicine of Cancer
(PMPPC)
Germans Trias i Pujol Research Institute (IGTP)

Campus Can Ruti
Carretera de Can Ruti, Camí de les Escoles s/n
08916 Badalona, Barcelona, Spain

Tel: (+34) 93 554 3068
Fax: (+34) 93 497 8654
08916 Badalona, Barcelona, Spain
b...@igtp.cat <mailto:b.
