Re: [Bioc-devel] Bioc2019 planning committee

2018-09-12 Thread Stephanie M. Gogarten
Are there any plans to have another BioC on the west coast in the near 
future?


thanks,
Stephanie

On 9/10/18 1:32 PM, Levi Waldron wrote:

Hi all,

I'm organizing a planning committee for Bioc2019 in New York City (at NYU
and Rockefeller University), June 24-26. There is lots to be done including
peer review of proposed talks, workshops, and posters, developing the
programme, adapting the web site (e.g. see http://bioc2018.bioconductor.org/),
seeking sponsorship, promoting the conference, and creating next year's
workshop booklet (e.g. see https://bioconductor.github.io/BiocWorkshops/).
If you would like to take part, let me know, and I will include you in a kick-off
planning meeting within the next couple of weeks.

Sincerely,
Levi



___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] Virtual class for `matrix` and `DelayedArray`? (or better strategy for dealing with them both)

2018-04-30 Thread Stephanie M. Gogarten
Rather than a class union, how about an internal function that is called 
by the methods for both matrix and DelayedArray:



setGeneric("myNewRowMeans", function(x,...) { 
standardGeneric("myNewRowMeans")})


#' @importFrom DelayedArray rowMeans
.myNewRowMeans <- function(x,...){
    # a lot of code independent of x
    print("This is a lot of code shared regardless of class of x\n")
    # a lot of code that depends on x, but is dispatched by the
    # functions called
    out<-rowMeans(x)
    #a lot of code based on output of out
    out<-out+1
    return(out)
}

setMethod("myNewRowMeans",
  signature = "matrix",
  definition = function(x,...){
  .myNewRowMeans(x,...)
  }
)

setMethod("myNewRowMeans",
  signature = "DelayedArray",
  definition = function(x,...){
  .myNewRowMeans(x,...)
  }
)


On 4/30/18 9:10 AM, Tim Triche, Jr. wrote:

But if you merge methods like that, the error method can be that much more
difficult to identify. It took a couple of weeks to chase that bug down
properly, and it ended up down to rowMeans2 vs rowMeans.

I suppose the merged/abstracted method allows one to centralize any such
dispatch into one place and swap out ill-behaved methods once identified,
so as long as DelayedArray/DelayedMatrixStats quirks are
documented/understood, maybe it is better to create this union class?

The Matrix/matrixStats/DelayedMatrix/DelayedMatrixStats situation has been
"interesting" in practical terms, as seemingly simple abstractions appear
to require more thought. That was my only point.


--t

On Mon, Apr 30, 2018 at 11:28 AM, Martin Morgan <
martin.mor...@roswellpark.org> wrote:


But that issue will be fixed, so Tim's advice is inappropriate.


On 04/30/2018 10:42 AM, Tim Triche, Jr. wrote:


Don't do that.  Seriously, just don't.

https://github.com/Bioconductor/DelayedArray/issues/16

--t

On Mon, Apr 30, 2018 at 10:02 AM, Elizabeth Purdom <
epur...@stat.berkeley.edu> wrote:

Hello,


I am trying to extend my package to handle the `HDF5Matrix` class (or more
generally `DelayedArray`). I currently have S4 functions for the `matrix`
class. Usually I have a method for `SummarizedExperiment`, which will call
the method on `assay(x)`, and I want the method to be able to deal with the
case where `assay(x)` is a `DelayedArray`.

Most of my functions, however, do not require separate code depending on
whether `x` is a `matrix` or `DelayedArray`. They are making use of
existing functions that will make that choice for me, e.g. rowMeans or
subsetting. My goal right now is compatibility, not cleverness, and I'm not
creating HDF5 methods to handle other cases. (If something doesn't
currently exist, then I just enclose `x` with `data.matrix` or `as.matrix`
and call the matrix into memory; for cleanliness and ease in updating with
appropriate methods in future, I could make separate S4 functions for these
specific tasks to dispatch, but that's outside of the scope of my
question). So for simplicity assume I don't really need to dispatch *my
code* -- that the methods I'm going to use do that.

The natural solution for me seems to be to use `setClassUnion`, and I was
wondering if such a virtual class already exists? Or is there a better way
to handle this?

Here's a simple example, using `rowMeans` as my example:

```
setGeneric("myNewRowMeans", function(x,...) {
    standardGeneric("myNewRowMeans")})
setClassUnion("matrixOrDelayed",members=c("matrix", "DelayedArray"))

#' @importFrom DelayedArray rowMeans
setMethod("myNewRowMeans",
    signature = "matrixOrDelayed",
    definition = function(x,...){
        # a lot of code independent of x
        print("This is a lot of code shared regardless of class of x\n")
        # a lot of code that depends on x, but is dispatched by the
        # functions called
        out<-rowMeans(x)
        #a lot of code based on output of out
        out<-out+1
        return(out)
    }
)
```





This email message may contain legally privileged and/or confidential
information.  If you are not the intended recipient(s), or the employee or
agent responsible for the delivery of this message to the intended
recipient(s), you are hereby notified that any disclosure, copying,
distribution, or use of this email message is prohibited.  If you have
received this message in error, please notify the sender immediately by
e-mail and delete this email message from your computer. Thank you.





Re: [Bioc-devel] Missing link files in Windows (release and devel)

2018-01-17 Thread Stephanie M. Gogarten
This sentence seems to indicate that the original format 
\link[base]{rbind} should work:


"Because they have been frequently misused, the HTML help system looks 
for topic foo in package pkg if it does not find file foo.html."



On 1/17/18 8:56 AM, James W. MacDonald wrote:

On Wed, Jan 17, 2018 at 11:38 AM, Leonardo Collado Torres 
wrote:


Thanks Martin! I just finished fixing the links in all my packages
using the \link[base:cbind]{rbind} syntax. One of them did seem a bit
weird to me:

  Rd warning: C:/Users/biocbuild/bbs-3.6-bioc/tmpdir/RtmpqyL54j/R.INSTALL22cc280d642c/derfinder/man/loadCoverage.Rd:15:
  missing file link 'BamFile'

As far as I can tell, shouldn't \link[Rsamtools:BamFile]{BamFile} be
the same as \link[Rsamtools]{BamFile} ? On my mac the help page is
called BamFile, but maybe the html file on Windows has a different
name.



On both my Linux  and Windows boxes it's BamFile-class.

Jim




Best,
Leo

On Tue, Jan 16, 2018 at 4:58 PM, Martin Morgan
 wrote:



On 01/16/2018 10:37 AM, Leonardo Collado Torres wrote:


Hi,

I have been seeing warnings in several of my packages on both release
and devel only in the Windows build machines in relation to missing
link files. Is this something that I can address from my side or a
more widespread issue? If it matters, I use roxygen2 for making my Rd
files.



taking the first example

https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Cross_002dreferences

"There are two other forms of optional argument specified as
\link[pkg]{foo} and \link[pkg:bar]{foo} to link to the package pkg, to
files foo.html and bar.html respectively. These are rarely needed,..."

You have \link[base]{rbind} so R is looking for rbind.html. The file is
actually cbind.html so \link[base:cbind]{rbind}. I don't know why the error
is only on Windows, perhaps because html manuals are only generated on
Windows?

I guess the 'These are rarely needed,...' part of the manual might be
informative.

Martin



Best,
Leonardo

Examples:

Rd warning: C:/Users/biocbuild/bbs-3.6-bioc/tmpdir/Rtmpi4zjs1/R.INSTALL2e746d54e04/recount/man/geo_characteristics.Rd:17:
missing file link 'rbind'

Rd warning: C:/Users/biocbuild/bbs-3.7-bioc/tmpdir/Rtmp2NQKYR/R.INSTALL21702e4399f/regionReport/man/derfinderReport.Rd:46:
missing file link 'plotIdeogram'

Rd warning: C:/Users/biocbuild/bbs-3.7-bioc/tmpdir/RtmpQtPk8B/R.INSTALL17382396f82/derfinder/man/analyzeChr.Rd:54:
missing file link 'TxDb.Hsapiens.UCSC.hg19.knownGene'

Rd warning: C:/Users/biocbuild/bbs-3.7-bioc/tmpdir/RtmpQtPk8B/R.INSTALL17382396f82/derfinder/man/annotateRegions.Rd:49:
missing file link 'countOverlaps'



Re: [Bioc-devel] Confusion with how to maintain release/devel files on local computer.

2017-11-01 Thread Stephanie M. Gogarten
One possible point of confusion: Laurent's workflow includes maintaining 
separate branches "master" and "devel", which he syncs to his own Github 
repo and Bioconductor's git repo respectively. However, the 
documentation on the bioc website 
(https://bioconductor.org/developers/how-to/git/) assumes that you have 
only one "master" branch that you push to both remotes.
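For comparison, here is a minimal sketch of the single-branch layout that the documentation assumes: one local "master" pushed to two remotes. The two bare repositories are local stand-ins (hypothetical paths) for GitHub and git.bioconductor.org, so the script runs without network access; the names "origin" and "upstream" follow the usage in this thread.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Stand-ins for the two remotes (hypothetical local paths, not real URLs).
git init -q --bare github.git
git init -q --bare bioc.git

# A working repository with a single local "master" branch.
git init -q pkg
cd pkg
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.org"
echo "Package: example" > DESCRIPTION
git add DESCRIPTION
git commit -q -m "initial commit"

# origin = GitHub stand-in, upstream = Bioconductor stand-in.
git remote add origin "$work/github.git"
git remote add upstream "$work/bioc.git"

# The same branch is pushed to both remotes.
git push -q origin master
git push -q upstream master

local_sha=$(git rev-parse master)
origin_sha=$(git --git-dir="$work/github.git" rev-parse master)
upstream_sha=$(git --git-dir="$work/bioc.git" rev-parse master)
echo "$local_sha $origin_sha $upstream_sha"
```

In this layout there is nothing to keep in sync locally; the cost is remembering to push to both remotes.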


On 11/1/17 2:33 PM, Laurent Gatto wrote:


On  1 November 2017 20:36, Arman Shahrisa wrote:


I'm confused with development process.

At first, I need to have a folder with the accepted package. Then I need to pull
origin RELEASE_3_6?

Then in another folder, I need to pull origin master?


No, it all happens in the same folder, but switching between branches
using git. Here's an example of one of my own packages. The first
command list all available branches (all, using -a, means also
remote-only branches). My current branch is noted with an *, and I also
have a feature branch called writeMSData, which also lives on GitHub
(https://github.com/lgatto/MSnbase/, but that's optional).

$ git branch -a
   devel
* master
   writeMSData
   remotes/origin/HEAD -> origin/master
   remotes/origin/centroiding
   remotes/origin/fixBracketSubset
   remotes/origin/issue82
   remotes/origin/master
   remotes/origin/orbifilter
   remotes/origin/processingData
   remotes/origin/removePrecMz
   remotes/origin/writeMSData
   remotes/upstream/RELEASE_2_10
   remotes/upstream/RELEASE_2_11
   remotes/upstream/RELEASE_2_12
   remotes/upstream/RELEASE_2_13
   remotes/upstream/RELEASE_2_14
   remotes/upstream/RELEASE_2_8
   remotes/upstream/RELEASE_2_9
   remotes/upstream/RELEASE_3_0
   remotes/upstream/RELEASE_3_1
   remotes/upstream/RELEASE_3_2
   remotes/upstream/RELEASE_3_3
   remotes/upstream/RELEASE_3_4
   remotes/upstream/RELEASE_3_5
   remotes/upstream/master

As you can see (and as specified by Gabe in his earlier reply), I
haven't pulled all Bioconductor releases. master points to GitHub's
origin/master branch, and devel points to Bioconductor's
upstream/master. As you can see above, I haven't got the latest release
references yet. I can do this with

$ git fetch --all
Fetching origin
Fetching upstream
remote: Counting objects: 6, done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 6 (delta 4), reused 0 (delta 0)
Unpacking objects: 100% (6/6), done.
 From git.bioconductor.org:packages/MSnbase
  * [new branch]  RELEASE_3_6 -> upstream/RELEASE_3_6
b680678..a98138c  master -> upstream/master

And now

$ git branch -a
   devel
* master
   writeMSData
   remotes/origin/HEAD -> origin/master
   remotes/origin/centroiding
   remotes/origin/fixBracketSubset
   remotes/origin/issue82
   remotes/origin/master
   remotes/origin/orbifilter
   remotes/origin/processingData
   remotes/origin/removePrecMz
   remotes/origin/writeMSData
   remotes/upstream/RELEASE_2_10
   remotes/upstream/RELEASE_2_11
   remotes/upstream/RELEASE_2_12
   remotes/upstream/RELEASE_2_13
   remotes/upstream/RELEASE_2_14
   remotes/upstream/RELEASE_2_8
   remotes/upstream/RELEASE_2_9
   remotes/upstream/RELEASE_3_0
   remotes/upstream/RELEASE_3_1
   remotes/upstream/RELEASE_3_2
   remotes/upstream/RELEASE_3_3
   remotes/upstream/RELEASE_3_4
   remotes/upstream/RELEASE_3_5
   remotes/upstream/RELEASE_3_6
   remotes/upstream/master

If I want to modify the development branch (i.e. Bioconductor's
upstream/master), then I checkout devel (that's how I named it
locally), do changes and push.

$ git checkout devel
## do stuff
$ git push

Same principle for other branches.
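A minimal sketch of how a local branch named devel could be set up to follow upstream/master, as described above. The bare repository is a local stand-in (hypothetical path) for git.bioconductor.org. Because the local and remote branch names differ, the push names both sides explicitly; depending on the `push.default` setting, a bare `git push` may be refused in this situation.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Local stand-in for git.bioconductor.org (hypothetical path).
git init -q --bare bioc.git

# Seed the stand-in with a "master" branch.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.org"
echo "Version: 1.1.0" > DESCRIPTION
git add DESCRIPTION
git commit -q -m "initial commit"
git push -q "$work/bioc.git" master
cd "$work"

# Working repository: "upstream" plays the role of the Bioconductor remote.
git init -q pkg
cd pkg
git config user.name "Example Dev"
git config user.email "dev@example.org"
git remote add upstream "$work/bioc.git"
git fetch -q upstream

# Local "devel" follows upstream/master.
git checkout -q -b devel --track upstream/master

## do stuff
echo "Version: 1.1.1" > DESCRIPTION
git commit -q -am "bump version"

# Name source and destination explicitly, since the branch names differ.
git push -q upstream devel:master

tracked=$(git rev-parse --abbrev-ref "devel@{upstream}")
echo "devel tracks $tracked"
```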


So that by opening each folder, I know what I'm editing.
Also during push, I need to be careful about where I'm pushing changes.
Origin is bioc's git address of my package whereas master is the package 
directory in GitHub?


No - I suggest you read a bit about git (GitHub is a web interface using
git) to familiarise yourself with the concepts and vocabulary.


Am I getting it correct?
Is there anywhere that contains the whole process and code in steps?


All the setup and more details are provided in

   https://github.com/bioconductor/bioc_git_transition/

in particular the FAQ and all the scenarios at the bottom

   https://github.com/Bioconductor/bioc_git_transition/blob/master/doc/faq.md

Best wishes,

Laurent


Best regards,
Arman





Re: [Bioc-devel] BiocGenerics request

2017-10-13 Thread Stephanie M. Gogarten



On 10/13/17 3:32 PM, Hervé Pagès wrote:

Hi Stephanie,

Can you provide a little bit more context? Have you observed
conflicts between VariantAnnotation:::asVCF() and other asVCF()
functions defined elsewhere? Any reason why you can't use/import
VariantAnnotation:::asVCF() in your package?


I would prefer to leave VariantAnnotation in "Suggests" rather than 
"Imports" for performance reasons - it adds substantially to package 
load time, but will be used only rarely.




Alternatively, have you considered using/defining a coercion method
to VCF instead? That should work (and would be preferred) if you
don't need the extra arguments that an "asVCF" method would allow
you to support.


I will look into coercion methods; thanks for the suggestion! Can one 
define a coercion method where the "to" class is in a package that's not 
attached until the method is called (with requireNamespace)?


thanks,
Stephanie



The VCF class is defined in the VariantAnnotation package so it
would be weird to have the asVCF() generic in BiocGenerics.

Cheers,
H.


On 10/13/2017 01:55 PM, Stephanie M. Gogarten wrote:

Can we move the "asVCF" generic to BiocGenerics?

thanks,
Stephanie


[Bioc-devel] BiocGenerics request

2017-10-13 Thread Stephanie M. Gogarten

Can we move the "asVCF" generic to BiocGenerics?

thanks,
Stephanie



[Bioc-devel] splitting a Bioconductor package in two

2017-09-06 Thread Stephanie M. Gogarten

Hi,

If one wanted to move some code from an existing Bioconductor package 
into its own, separate package for ease of maintenance, would that 
constitute a new package submission with the full review process? Or is 
there some expedited submission available?


We are refactoring the GENESIS package and debating whether to move the 
low-level statistical testing code into a different package (which 
GENESIS would then import). As I see it, pros include easier maintenance 
and testing, and the potential value of the stats package to other users 
and developers who might not want to import the entire GDS framework 
that GENESIS uses. Cons include the overhead of submitting, writing 
vignettes, etc. for a set of functions that are meant to be used internally.


What are other pros and cons of this approach (versus just keeping 
everything in one big package)?


thanks,
Stephanie



Re: [Bioc-devel] New pre-receive hook on git.bioconductor.org

2017-08-30 Thread Stephanie M. Gogarten

Hi Nitesh,

Does this mean that merge commits are not allowed? I have been 
developing features on branches and then merging back into the master 
branch with a merge commit, which helps me keep track of which commits 
go together in a single feature. But when I just tried to push upstream, 
I was blocked by the duplicate commit filter.


Stephanie

On 8/30/17 5:15 AM, Turaga, Nitesh wrote:

Hi Sean,

A `git diff` between an "empty commit" (a commit without a body, such as
`Creating branch for BioC 3.5 release`) and a commit with a body results in an
empty diff.

So unfortunately, it catches things which are not supposed to be caught. This
is an edge case which I missed (but have now fixed) in the pre-receive hook. The
fixed patch has been pushed; it should have propagated by now and
maintainers can try again.

Hope this helps. Let me know if there are any more questions.

Best,

Nitesh



On Aug 30, 2017, at 8:09 AM, Sean Davis  wrote:

Hi, Nitesh.

Can you fill us in on what is changing between "duplicated commits" and "please try 
again"?

Thanks,
Sean


On Wed, Aug 30, 2017 at 8:04 AM, Turaga, Nitesh  
wrote:
Hi Philipp,

Please try again.

Best,

Nitesh



On Aug 30, 2017, at 4:33 AM, Angerer, Philipp 
 wrote:

hi, sadly my duplicate commits are authored by Herve Pages and already in the 
repo, so i can’t work at all on my package (destiny).

$ git push
Counting objects: 12, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (12/12), done.
Writing objects: 100% (12/12), 1.14 KiB | 1.14 MiB/s, done.
Total 12 (delta 9), reused 0 (delta 0)
remote: Error: duplicate commits.
remote:
remote: There are duplicate commits in your commit history, These cannot be
remote: pushed to the Bioconductor git server. Please make sure that this is
remote: resolved.
remote:
remote: Take a look at the documentation to fix this,
remote: https://bioconductor.org/developers/how-to/git/sync-existing-repositories/,
remote: particularly, point #8 (force Bioconductor master to Github master).
remote:
remote: For more information, or help resolving this issue, contact
remote: . Provide the error, the package name and
remote: any other details we might need.
remote:
remote: Use
remote:
remote: git show 72261dcfd1d1e7adf73f16bb4e6d9e38eecccbf9
remote: git show b3fb272fc4517faab44a3792825a07460eadc345
remote:
remote: to see body of commits.
remote:
To git.bioconductor.org:packages/destiny
  ! [remote rejected] RELEASE_3_5 -> RELEASE_3_5 (pre-receive hook declined)
error: failed to push some refs to 'g...@git.bioconductor.org:packages/destiny'
$ git show 72261dcfd1d1e7adf73f16bb4e6d9e38eecccbf9
commit 72261dcfd1d1e7adf73f16bb4e6d9e38eecccbf9
Author: Herve Pages 
Date:   Mon Apr 24 19:45:44 2017 +

 Creating branch for BioC 3.5 release

 git-svn-id: 
file:///home/git/hedgehog.fhcrc.org/bioconductor/branches/RELEASE_3_5/madman/Rpacks/destiny@129128
 bc3139a8-67e5-0310-9ffc-ced21a209358
$ git show b3fb272fc4517faab44a3792825a07460eadc345
commit b3fb272fc4517faab44a3792825a07460eadc345
Author: Herve Pages 
Date:   Mon Apr 24 19:25:24 2017 +

 bump x.y.z versions to even y prior to creation of 3_5 branch

 git-svn-id: 
file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/destiny@129126
 bc3139a8-67e5-0310-9ffc-ced21a209358

diff --git a/DESCRIPTION b/DESCRIPTION
index 9a01095..2368e6d 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,7 +1,7 @@
  Package: destiny
  Type: Package
  Title: Creates diffusion maps
-Version: 2.2.11
+Version: 2.4.0
  Date: 2014-12-19
  Authors@R: c(

From: "Nitesh Turaga" 
To: "bioc-devel" 
Sent: Tuesday, 29 August 2017 18:15:38
Subject: [Bioc-devel] New pre-receive hook on git.bioconductor.org

Hi Maintainers,

We have added a new pre-receive hook on our git.bioconductor.org system
today. We now prevent large files (>5MB) from being committed. We also
prevent 'duplicate' commits.
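A minimal sketch of a local check for large files that a maintainer could run before committing. It assumes the 5MB limit means 5 * 1024 * 1024 bytes, which may not match how the hook counts; the directory and file names are hypothetical.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical package directory with one small and one large file.
mkdir -p pkg/inst/extdata
echo "Package: example" > pkg/DESCRIPTION
dd if=/dev/zero of=pkg/inst/extdata/big.rda bs=1048576 count=6 2>/dev/null

# Assumed threshold: 5 * 1024 * 1024 bytes.
limit=5242880

# List regular files larger than the threshold, skipping git's own data.
large=$(find pkg -type f -size +"${limit}c" ! -path '*/.git/*')
echo "$large"
```

This only inspects the working directory; a file that was committed earlier and later deleted would still be present in the history.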

A duplicate commit is when two commits have the same content, but different
SHA1 signatures. This can happen, for instance, when you clone
git.bioconductor.org and then merge with your local git development
repository or github.com/Bioconductor-mirror repository. The best way to
approach this issue is described in the documentation,
http://bioconductor.org/developers/how-to/git/abandon-changes#force-bioconductor--to-github-, 
including the option to cherry-pick local commits into the master branch
synced from git.bioconductor.org.
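To illustrate what "same content, different SHA1" means, here is a minimal sketch using `git patch-id`, which hashes the diff a commit introduces rather than the commit itself. This is only an illustration under assumed names and is not necessarily how the pre-receive hook detects duplicates.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q pkg
cd pkg
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.org"

echo "Version: 1.0.0" > DESCRIPTION
git add DESCRIPTION
git commit -q -m "initial commit"

# A side branch that adds one commit on top of the initial one.
git checkout -q -b feature
echo "notes" > NEWS
git add NEWS
git commit -q -m "add NEWS"

# master moves on, then receives the same change through a different parent.
git checkout -q master
echo "Title: example" >> DESCRIPTION
git commit -q -am "add title"
git cherry-pick feature >/dev/null

sha_a=$(git rev-parse feature)
sha_b=$(git rev-parse master)

# patch-id prints "<patch-id> <commit-id>"; keep only the patch id.
id_a=$(git show "$sha_a" | git patch-id | cut -d' ' -f1)
id_b=$(git show "$sha_b" | git patch-id | cut -d' ' -f1)

echo "commit ids: $sha_a $sha_b"
echo "patch ids:  $id_a $id_b"
```

The two commits carry the same change but have different commit ids, which is the situation the announcement describes.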

Please let us know if you have any questions.

Best,

Nitesh



Re: [Bioc-devel] Git transition -- regenerating repositories from svn

2017-08-21 Thread Stephanie M. Gogarten
I was actually thinking of SeqVarTools, since I have local commits that 
I'm not ready to push yet. GENESIS and GWASdata could be put on the "do 
not regenerate" list also - they have no unknown users, and I'd rather 
not go through all the steps again if I don't have to.


thanks,
Stephanie

On 8/21/17 1:23 PM, Martin Morgan wrote:

On 08/21/2017 03:17 PM, Stephanie M. Gogarten wrote:

If we followed the steps here:
https://www.bioconductor.org/developers/how-to/git/maintain-github-bioc/

How much, if any, of this will need to be redone after the 
repositories are regenerated? In particular, if I don't have an 
unknown user, will the regenerated commits be equal to the previous 
commits, or will "git fetch upstream" duplicate my commit history?


Hi Stephanie --

if there are no unknown users, then we should not regenerate your git 
repository. Is this GWAStools? If so let's leave it on the list of 
repositories not to regenerate?


Martin




thanks,
Stephanie

On 8/21/17 9:00 AM, Martin Morgan wrote:

Hi git transitioners --

We'd like to regenerate git repositories from svn. This is because 
some svn user ids were mapped to 'unknown' git users, so that 
contributors would not be credited accurately. This will  invalidate 
any local clones made from git.bioconductor.org.


Our plan is to regenerate all git repositories EXCEPT those that have 
been modified when we are ready (probably tomorrow morning). Modified 
repositories that we would NOT regenerate, based on current commits, 
are listed below; repositories modified between now and when we are 
ready to update would also NOT be regenerated:


beadarray BiocStyle CAMERA Cardinal CEMiTool ChemmineR cydar cytofkit 
derfinder derfinderHelper derfinderPlot DmelSGI DOSE EBImage ELMER 
ensembldb FamAgg gcapc GenVisR ggtree GOexpress gQTLstats GWASTools 
isomiRs karyoploteR LOBSTAHS motifcounter piano Rdisop REMP Rhdf5lib 
rnaseqcomp seqplots systemPipeR TCGAbiolinks TCGAbiolinksGUI vsn



For a little more detail, the problem is manifest as 'unknown' 
authors in a git commit, e.g., in Biobase from svn user 'jmc'


commit b5ae43bc8aae967b80062da13e5085a6a305b274
Author: unknown 
Date:   Fri Dec 7 15:17:06 2001 +

 fixed the arguments to 'show' methods


A more common problem is that the git author 'name' is 'unknown', as 
in this limma commit


commit 5910dc34a952a72816ada787d3f2c849edf48a95
Author: unknown <sm...@wehi.edu.au>
Date:   Tue Jul 25 07:23:39 2017 +



The problem primarily affects users with svn accounts from the 
earlier part of Bioconductor's svn history, and stems from incomplete 
historical records about the user name associated with svn accounts 
(this information is not stored in svn per se).


Please feel free to respond here if your package is listed above but 
you would like it to be regenerated anyway; remember that you will 
lose any commits made, and invalidate your local repository.


Sorry for the inconvenience,

Martin




Re: [Bioc-devel] Git transition -- regenerating repositories from svn

2017-08-21 Thread Stephanie M. Gogarten

If we followed the steps here:
https://www.bioconductor.org/developers/how-to/git/maintain-github-bioc/

How much, if any, of this will need to be redone after the repositories 
are regenerated? In particular, if I don't have an unknown user, will 
the regenerated commits be equal to the previous commits, or will "git 
fetch upstream" duplicate my commit history?


thanks,
Stephanie

On 8/21/17 9:00 AM, Martin Morgan wrote:

Hi git transitioners --

We'd like to regenerate git repositories from svn. This is because some 
svn user ids were mapped to 'unknown' git users, so that contributors 
would not be credited accurately. This will  invalidate any local clones 
made from git.bioconductor.org.


Our plan is to regenerate all git repositories EXCEPT those that have 
been modified when we are ready (probably tomorrow morning). Modified 
repositories that we would NOT regenerate, based on current commits, are 
listed below; repositories modified between now and when we are ready to 
update would also NOT be regenerated:


beadarray BiocStyle CAMERA Cardinal CEMiTool ChemmineR cydar cytofkit 
derfinder derfinderHelper derfinderPlot DmelSGI DOSE EBImage ELMER 
ensembldb FamAgg gcapc GenVisR ggtree GOexpress gQTLstats GWASTools 
isomiRs karyoploteR LOBSTAHS motifcounter piano Rdisop REMP Rhdf5lib 
rnaseqcomp seqplots systemPipeR TCGAbiolinks TCGAbiolinksGUI vsn



For a little more detail, the problem is manifest as 'unknown' authors 
in a git commit, e.g., in Biobase from svn user 'jmc'


commit b5ae43bc8aae967b80062da13e5085a6a305b274
Author: unknown 
Date:   Fri Dec 7 15:17:06 2001 +

 fixed the arguments to 'show' methods


A more common problem is that the git author 'name' is 'unknown', as in 
this limma commit


commit 5910dc34a952a72816ada787d3f2c849edf48a95
Author: unknown 
Date:   Tue Jul 25 07:23:39 2017 +



The problem primarily affects users with svn accounts from the earlier 
part of Bioconductor's svn history, and stems from incomplete historical 
records about the user name associated with svn accounts (this 
information is not stored in svn per se).


Please feel free to respond here if your package is listed above but you 
would like it to be regenerated anyway; remember that you will lose any 
commits made, and invalidate your local repository.


Sorry for the inconvenience,

Martin




Re: [Bioc-devel] git transition for projects with prior git history

2017-08-10 Thread Stephanie M. Gogarten

I tried following the instructions in scenario 9 after adding a remote:
$ git remote add upstream https://git.bioconductor.org/packages/GENESIS.git
$ git fetch --all
Fetching origin
Fetching upstream
warning: no common commits

When I merge both upstream and origin, I see all my commits in 
duplicate: one from my original repo, and one from Bioconductor.


My repo has a rather complicated history: original author forked from 
the Bioc mirror, I forked from his repo, submitted a pull request, he 
pushed those changes back to SVN. Later I took over maintenance, set the 
Bioc mirror as a remote, and pushed my changes directly using git 
cherry-pick.


I'm guessing that the only reasonable path forward here is to just 
delete the current repo and start over after the transition, but I'm 
wondering if anyone else has seen the "no common commits" message, or 
has any other ideas.
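A minimal sketch of how one could confirm that two branches share no history, which is what the "no common commits" warning indicates. The two repositories are hypothetical local stand-ins: "mine" for an existing repository and "bioc" for the Bioconductor copy. `git merge-base` exits with a non-zero status when there is no common ancestor.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

make_repo() {
    # $1 = directory, $2 = file content; creates a one-commit repository.
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    git -C "$1" config user.name "Example Dev"
    git -C "$1" config user.email "dev@example.org"
    echo "$2" > "$1/DESCRIPTION"
    git -C "$1" add DESCRIPTION
    git -C "$1" commit -q -m "initial commit in $1"
}

# Same file content, but two independently created histories.
make_repo mine "Package: example"
make_repo bioc "Package: example"

cd mine
git remote add upstream "$work/bioc"
git fetch -q upstream

# merge-base fails when the branches have no common ancestor.
if git merge-base master upstream/master >/dev/null; then
    shared=yes
else
    shared=no
fi
echo "shared history: $shared"
```

Git 2.9 and later refuse such a merge unless `--allow-unrelated-histories` is given; forcing the merge keeps both histories, which would be consistent with seeing every commit twice.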


thanks,
Stephanie

On 7/27/17 1:52 PM, McDavid, Andrew wrote:

Is there a recommended recipe to utilize the 
git.bioconductor.org remote with an existing git repo 
that has non-zero history?  I tried adding the 
git.bioconductor.org as a remote, making a branch, and 
then checking out a branch on that remote, but it gave my computer sad.  Do I need to clone 
a new repo instead?

Example:
$ git remote -vv
bioc https://github.com/Bioconductor-mirror/MAST.git (fetch)
bioc https://github.com/Bioconductor-mirror/MAST.git (push)
biocgit g...@git.bioconductor.org:packages/MAST (fetch)
biocgit g...@git.bioconductor.org:packages/MAST (push)
origin g...@github.com:RGLab/MAST.git (fetch)
origin g...@github.com:RGLab/MAST.git (push)

$ git fetch biocgit
$ git checkout -b bgMaster --track biocgit/master
...

...
$ git merge master bgMaster
fatal: refusing to merge unrelated histories



Re: [Bioc-devel] adding methods to BiocGenerics

2017-05-19 Thread Stephanie M. Gogarten

On 5/19/17 12:15 PM, Michael Lawrence wrote:

That could work, also.

On Fri, May 19, 2017 at 10:27 AM, Martin Morgan
<martin.mor...@roswellpark.org> wrote:

On 05/19/2017 12:14 PM, Michael Lawrence wrote:


Since SeqArray already imports SummarizedExperiment and
VariantAnnotation, why can't it just move them to Depends and use
their generics?



do they need to be moved to Depends? Or simply import the generic and export the
new method(s) (and generic)?


We used to do this, but Xiuwen just moved SummarizedExperiment and 
VariantAnnotation from Imports to Suggests, because it was taking too 
long to load the package with them attached. So now we are redefining 
the generics.


In our daily use of SeqArray we almost never use VariantAnnotation or 
these generics, so the change improves our pipeline performance without 
any immediate downsides for us. But it does add annoyance for any 
(potential) users of both SeqArray and VariantAnnotation in the same 
session, as they would have to add the package name with "::" to 
disambiguate the methods.




Martin



On Fri, May 19, 2017 at 8:39 AM, Stephanie M. Gogarten
<sdmor...@u.washington.edu> wrote:


Dear Core Team,

What do you think about adding the following generics to BiocGenerics?

colData
rowRanges
ref
alt
qual
filt
header
fixed
info
geno

They are currently defined by both VariantAnnotation/SummarizedExperiment
and SeqArray. Given the increasingly widespread use of VCF files, it seems
likely that other packages may want to use them in future also.

Stephanie

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel



___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel






[Bioc-devel] adding methods to BiocGenerics

2017-05-19 Thread Stephanie M. Gogarten

Dear Core Team,

What do you think about adding the following generics to BiocGenerics?

colData
rowRanges
ref
alt
qual
filt
header
fixed
info
geno

They are currently defined by both 
VariantAnnotation/SummarizedExperiment and SeqArray. Given the 
increasingly widespread use of VCF files, it seems likely that other 
packages may want to use them in future also.


Stephanie



Re: [Bioc-devel] covr

2017-02-27 Thread Stephanie M. Gogarten

Hi Estefania,

After you commit to your github repository, you will need to explicitly 
push those changes to svn to get them into Bioconductor devel. I 
recommend following the instructions under "Dealing with prior history / 
merge conflicts" on this page:


http://www.bioconductor.org/developers/how-to/git-mirrors/

Using the "git cherry-pick" command seems to avoid a lot of errors 
others have seen in trying to move commits from git to svn. (At least, 
it worked for me.)


If you have not already done so, make sure to run the 
"update_remotes.sh" script referenced on that page.


Stephanie

On 2/23/17 9:11 AM, Estefania Mancini wrote:

Thanks Herve!

I will try to include test code in the next version of the package.

By the way, I am a little lost about how to make changes to the package. I want 
to modify the code of the accepted package and add people to my project on 
GitHub, so I think it is better for us to continue using that repository. What 
do you suggest? I have a version on GitHub, linked to the Bioconductor 
repository. Do commits there affect the devel version of my package? I don't 
clearly understand where I should commit.

Thanks in advance,

Estefania


From: Hervé Pagès [hpa...@fredhutch.org]
Sent: Monday, February 13, 2017 03:15 p.m.
To: Estefania Mancini; bioc-devel@r-project.org
Subject: Re: [Bioc-devel] covr

Hi Estefania,

See here

   https://bioconductor.org/developers/how-to/unitTesting-guidelines/

for our guidelines on how to add unit tests to your package.

Cheers,
H.

On 02/13/2017 04:17 AM, Estefania Mancini wrote:

Hi,
I would like to add coverage tests to my package. What should I do? Thanks in 
advance.

Estefania




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



[Bioc-devel] ncdf deprecation

2015-11-30 Thread Stephanie M. Gogarten
I was notified today that the ncdf package is deprecated and will be 
removed from CRAN in January. Since this date does not correspond with a 
Bioconductor release, is it possible to add ncdf to the Bioconductor 
"extra" repository for the duration of this release? I will modify 
GWASTools to use the ncdf4 package, but such a major change does not 
seem appropriate to introduce into the release branch.


The Bioconductor packages TargetSearch and xcms are affected as well as 
GWASTools.


thanks,
Stephanie



Re: [Bioc-devel] Reducing memory footprint of large object

2015-11-05 Thread Stephanie M. Gogarten
gdsfmt is another option for storing large datasets on disk, similar to 
HDF5. Take a look at packages SNPRelate, GWASTools and SeqArray which 
all use it to store genotype data.


Stephanie

On 11/5/15 8:41 AM, Fischer, Bernd wrote:

Hi Christian,

You should have a look at packages for partial reading of data, e.g.
bigmemory, which load data only partially into memory, or implement partial
reading yourself using HDF5 and rhdf5.
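
The idea of partial reading can be sketched in base R alone: keep the values in a binary file and read back only the slice that is needed. This is an illustration of the principle under simple assumptions (a flat file of 4-byte integers, a hypothetical helper named `read_slice`), not the rhdf5, bigmemory, or gdsfmt API, all of which add chunking, compression, and metadata on top.

```r
# Write 100,000 integers to a binary file; only the file holds the full data.
path <- tempfile(fileext = ".bin")
positions <- seq.int(714086L, by = 3L, length.out = 100000L)
writeBin(positions, path, size = 4L)

# Hypothetical helper: read 'count' integers starting at 1-based index 'start'.
read_slice <- function(path, start, count) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  # Each integer occupies 4 bytes; seek() takes a zero-based byte offset.
  seek(con, where = (start - 1L) * 4L, origin = "start")
  readBin(con, what = integer(), n = count, size = 4L)
}

# Only five values are brought into memory here.
slice <- read_slice(path, start = 50001L, count = 5L)
```

For a nested list such as the per-SNP read start positions, one could store the unlisted vector this way together with a vector of per-element lengths, and compute the offsets for a given SNP from the cumulative sums.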

Best,

Bernd




On 05.11.2015, at 16:22, Christian Arnold  wrote:


Hi all,

I wanted to ask on this list full of experts whether any of you
has advice about the following problem:

I got a large SNPhood object from someone (package SNPhood, which I
developed) from an analysis of 200,000 SNPs or so that stores lots of
read counts and the positions of overlapping reads in general. In total,
the object is 2 GB large. I examined the object and identified the slots
that need the most memory. In this particular slot, a nested list is
stored that saves the read start positions of all overlapping reads for
each SNP region.

For example, for one individual, a list of length 120,049 with integer
vectors, with 20,853,838 elements within the vectors in total:




format(object.size(SNPhood.o@internal$readStartPos$ambiguous$GM12878),
units = "Mb")
[1] "86 Mb"

Unsurprisingly, unlisting improves this only slightly:
format(object.size(unlist(SNPhood.o@internal$readStartPos$ambiguous$GM12878)),
units = "Mb")
[1] "79.6 Mb"


length(SNPhood.o@internal$readStartPos$ambiguous$GM12878)

[1] 120049

length(unlist(SNPhood.o@internal$readStartPos$ambiguous$GM12878))

[1] 20853838

The vector of read start positions may look like this:

head(unlist(SNPhood.o@internal$readStartPos$ambiguous$GM12878),50)
  [1] 714086 714087 714088 714089 714099 714100 714106 714108 714110
714114 714114 714123 714123 714123 714125 714125 714128 714130 714138
714139 714145 714148 714149 714150 714151 714152 714154 714164 714164
714172 714173 714184 714186 714187 714188 714189 714192 714194 714198
714204 714206 714209 714209 714212 714216 714219 714219 714223 714224 714224

So there are a few reads with identical start sites, but this does not
occur too often. I indeed need all of this information for further
processing.

Do you have any idea how I could store this information more efficiently so
that the overall object size is reduced? I could try an Rle, but the
structure of the data does not seem ideal for this...

Any tips are very much appreciated!

Thanks,
Christian

--
—
Christian Arnold, PhD
Staff Bioinformatician

SCB Unit - Computational Biology
Joint appointment Genome Biology
Joint appointment European Bioinformatics Institute (EMBL-EBI)

European Molecular Biology Laboratory (EMBL)
Meyerhofstrasse 1; 69117, Heidelberg, Germany

Email: christian.arn...@embl.de
Phone: +49(0)6221-387-8472
Web: http://www.zaugg.embl.de/


[Bioc-devel] version increments for unchanged packages

2015-06-11 Thread Stephanie M. Gogarten
Why is it that packages with no changes still get new version numbers at 
every release? For example, my experiment data package GWASdata has not 
changed since the last release, but the version was bumped from 1.4.0 to 
1.6.0. I imagine most users expect that a change in version number 
indicates some change in content.


Stephanie



[Bioc-devel] bug in source control HOWTO

2014-12-04 Thread Stephanie M. Gogarten
Using the URLs exactly as given on the Source Control HOWTO results in 
the following error:


svn co https://hedgehog.fhcrc.org/bioconductor/branches/RELEASE_3_0/madman/Rpacks/MYPACKAGENAME
svn merge -c140 https://hedgehog.fhcrc.org/gentleman/bioconductor/trunk/madman/Rpacks/MYPACKAGENAME


svn: 
'https://hedgehog.fhcrc.org/bioconductor/branches/RELEASE_3_0/madman/Rpacks/GWASTools' 
isn't in the same repository as 
'https://hedgehog.fhcrc.org/gentleman/bioconductor'


The svn merge command should remove gentleman from the example URL.

Stephanie



Re: [Bioc-devel] RSS package feeds not updated

2014-12-02 Thread Stephanie M. Gogarten



On 12/2/14 8:17 AM, Dan Tenenbaum wrote:



- Original Message -

From: Julian Gehring julian.gehr...@embl.de
To: bioc-devel@r-project.org
Cc: Felix Klein fkl...@embl.de
Sent: Tuesday, December 2, 2014 2:19:01 AM
Subject: [Bioc-devel] RSS package feeds not updated

Hi,

some/many of the package build RSS feeds [1] don't receive updates. For
example,

   http://bioconductor.org/rss/build/packages/DESeq2.rss (from
   2014-11-04)

   http://bioconductor.org/rss/build/packages/BiocInstaller.rss (from
   2014-10-15)

report package builds as broken, although the current builds are
fine.

In contrast,

   http://bioconductor.org/rss/build/packages/GenomicRanges.rss

was recently updated (2014-12-01) and also contains the build report of
both the devel and release branch (the broken ones listed above don't).

Most of the packages that I checked manually seem to be outdated/broken
- I guess there is an issue with the underlying feed system.



Thanks for the report. I've had a look. I've changed the system so it doesn't 
try to be smart about not updating feeds for packages that do not have issues.
This means all package feeds will be updated daily, whether there is an issue 
or not. Not sure if this will be annoying to feed users; we can revisit it if 
it's an issue.


I have to say, I liked the old system better.  I have a feed for each of 
my packages in the bottom of my email program, so I could see at a 
glance if there was a problem with one of the packages (new message in 
the feed).  Failure messages are a lot more noticeable if they're not 
buried among lots of "nothing to see here" messages.


Stephanie



Thanks,
Dan




Best
Julian


[1] http://bioconductor.org/developers/rss-feeds/



Re: [Bioc-devel] Help with creating first Bioconductor package

2014-11-14 Thread Stephanie M. Gogarten



On 11/14/14, 9:20 AM, Laurent Gatto wrote:


Dear January,

On 14 November 2014 10:51, January Weiner wrote:


Dear all,

I am building my first Bioconductor package and, before wasting
everyone's time with a faulty submission, I would like to clarify
certain things.

1) The package seems to fulfill the requirements of the Bioconductor
Package Guidelines and passes all checks except one consideration:

CONSIDER: Indenting lines with a multiple of 4 spaces;

I love to indent my code with 2 spaces, is it a problem? Or do I have
to reformat all code before release? This is doable, but it would
complicate my workflow and if I am allowed to avoid it, I will. I see
that not all Bioconductor packages stick to this formatting (many even
use tabs instead of spaces).


I don't think this is a reason for rejection, as long as all other
aspects of the formatting are fine (for instance, no miles-long lines).


Emacs and RStudio both have 2 spaces as the default, so you will be in 
good company.  My own package is not even internally consistent, as it 
contains code written by multiple people, all using text editors with 
different indentation defaults.  Sometimes I get the urge to go through 
the (many) .R files and reformat it all, but it never makes it to the 
top of the to-do list.





2) Another problem I have is testing the package on other platforms. I
do not have a Windows machine to test my package. Could someone help
me and test my package (build, check and BiocCheck) on Windows and
MacOS? Otherwise -- how do you check your packages? You keep an up to
date R development environment on three platforms?


As mentioned by Julian, you could use http://win-builder.r-project.org/
for Windows. But I don't think you are expected to check it on all
platforms before submission. If your package contains straightforward R
code, there is no reason to anticipate issues on other platforms.


I don't have a Windows machine either, so I rely on the Bioc build 
system to alert me if something goes wrong on Windows but not the other 
platforms.


Stephanie



Re: [Bioc-devel] Please bump version number when committing changes

2014-09-05 Thread Stephanie M. Gogarten
I am guilty of doing this today, but I have (I think) a good reason. 
I'm making a bunch of changes that are all related to each other, but 
are being implemented and tested in stages.  I'd like to use svn to 
commit when I've made a set of changes that works, so I can roll back if 
I break something in the next step, but I'd like the users to see them 
all at once as a single version update.  Perhaps others are doing 
something similar?


Stephanie

On 9/4/14, 12:04 PM, Dan Tenenbaum wrote:

Hello,

Looking through our svn logs, I see that there are many commits that are not 
accompanied by version bumps.
All svn commits (or, if you are using the git-svn bridge, every group of commits included 
in a push) should include a version bump (that is, incrementing the z segment 
of the x.y.z version number). This practice is documented at 
http://www.bioconductor.org/developers/how-to/version-numbering/ .

Failure to bump the version has two consequences:

1) Your changes will not propagate to our package repository or web site, so 
users installing your package via biocLite() will not receive the latest 
changes unless you bump the version.

2) Users *can* always get the current files of your package using Subversion, 
but if you've made changes without bumping the version number, it can be 
difficult to troubleshoot problems. If two people are looking at what appears 
to be the same version of a package, but it's behaving differently, it can be 
really frustrating to realize that the packages actually differ (but not by 
version number).

So if you're not already, please get in the habit of bumping the version number 
with each set of changes you commit.
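
As a small illustration of what "incrementing the z segment" means in practice, the sketch below uses base R's `package_version()` to bump the last component of an x.y.z string. The helper name `bump_z` is hypothetical and this is not part of any Bioconductor tooling; in a real package the Version field of DESCRIPTION would be edited.

```r
# Hypothetical helper: increment the z segment of an x.y.z version string.
bump_z <- function(version) {
  parts <- unclass(package_version(version))[[1]]
  stopifnot(length(parts) == 3L)
  parts[3] <- parts[3] + 1L
  paste(parts, collapse = ".")
}

bump_z("1.4.0")   # "1.4.1"

# Versions compare component-wise, not as text, so 1.4.10 follows 1.4.9.
package_version(bump_z("1.4.9")) > package_version("1.4.9")   # TRUE
```

Because comparison is numeric per component, there is no need to "roll over" z after 9.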

Let us know on bioc-devel if you have any questions about this.

Thanks,
Dan



Re: [Bioc-devel] Confusing namespace issue with IRanges 1.99.17

2014-07-09 Thread Stephanie M. Gogarten



On 7/8/14 12:27 PM, Hervé Pagès wrote:

On 07/08/2014 11:58 AM, Leonardo Collado Torres wrote:

Hello,

Thank you everyone for the replies and help!

I did not know that it was due to S4Vectors::extractROWS nor what
Hervé exposed about the upcoming changes to them.

Regarding "probably it is not desirable to move packages from loaded
to attached, but I don't think this influences performance in a
meaningful way?", I think that it doesn't. I was just surprised to see
the change since I thought that I was correctly specifying the
namespace.

As for "But what's with needing to load IRanges to subset an Rle? Is
that temporary?", the real use case is the function fstats.apply()
located here
https://github.com/lcolladotor/derfinderHelper/blob/master/R/fstats.apply.R

It basically takes as input a DataFrame where each column is a
coverage Rle and calculates some statistics with it. The function has
three methods implemented: one in Rle world that is slow with
large-sample data sets, another one that involves coercion to a regular
matrix object, and a third one that involves coercing to a
Matrix::sparseMatrix object, which is faster and less memory intensive.
It is for this last one that I use the mapply() call (see
https://github.com/lcolladotor/derfinderHelper/blob/master/R/fstats.apply.R#L184).
I guess that .transformSparseMatrix() could probably be made more
efficient but I haven't explored how to do so any further.

Going back to the namespace, I thought that it was considered a best
practice to just import the functions/methods needed. That's why I try
to have specific imports (using roxygen2). For instance, for
fstats.apply() I use the following roxygen2 tags:

#' @importFrom S4Vectors Rle
#' @importMethodsFrom S4Vectors as.numeric
#' @importMethodsFrom IRanges as.data.frame as.matrix Reduce ncol nrow which '['
#' @importFrom Matrix sparseMatrix
#' @importMethodsFrom Matrix '%*%' drop

I can see in some BioC packages the namespace uses specific imports
and others where they import the full package.


Honestly I don't know why so many BioC packages do that. But it seems
to be a strong trend. IMHO it's a lot of work for very little benefit.
Doesn't seem to make a big difference from a loading time perspective.
However it makes the NAMESPACE big and adds some unnecessary overhead
to the overall maintainability of the package. For example, when some
low-level functionality moves from one package to the other (like it
happened recently with the Rle class), then all the BioC packages that
selectively import stuff from IRanges need to have their NAMESPACE
fixed.

I've heard some people claiming they do it to minimize the risk of a
name collision. Fair enough. But name collisions are pretty rare.
A simple and straightforward approach is to import full packages
until a name collision issue actually happens. For most packages,
it will never happen. But if it happens, you'll get a warning at
both installation and load time, so you can't miss it. Then you can
adjust the NAMESPACE by selectively importing from one of the 2
packages involved in the collision.


I thought selective imports were considered best practice as well.  I 
seem to remember an email from Martin on this list a while ago saying 
just that.  So perhaps that is why everyone is doing it?


As in many things, perhaps the middle road is the best practice?  If you 
are using only one or two functions from a package, importFrom makes a 
lot more sense.  But if you are using multiple classes and methods from 
a package, or if you have to start importing things like '[', then it is 
more straightforward to import the entire package.
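
A sketch of what that middle road could look like with the roxygen2 tags already used in this thread: one symbol imported selectively from a package that is used lightly, and a full import of a package that is used heavily. The choice of Matrix and IRanges here simply mirrors the packages mentioned above and is an assumption about how one might split them, not a recommendation for any particular package.

```r
#' @importFrom Matrix sparseMatrix
#' @import IRanges
NULL
```

roxygen2 would turn these tags into `importFrom(Matrix, sparseMatrix)` and `import(IRanges)` directives in the NAMESPACE file.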


Stephanie



Selective importing is sometimes pushed to the extreme: I've seen
BioC packages trying to selectively import stuff from the methods
package! There is probably zero benefit in doing this, only maintenance
complications in the long run... Also I think I remember reading
somewhere (R-devel list? R official doc? Can't remember exactly)
that packages are not supposed to do that.

My 2 cents. I'm sure not everybody will agree with this.

H.


Should I stop doing so
and just import the full packages? That is:

#' @import IRanges Matrix S4Vectors

It would go from around 4 secs to around 6 secs to load the tiny package.


In my use case, I shipped fstats.apply() to a tiny package containing
just the function for using a Snow-based BiocParallel::bplapply(). The
original package would take too long to load (around 40 secs, it used
to import a total of 18 packages) and this has a very large impact
compared to using a multicore-based bplapply(). However, the Snow-based
version uses significantly less memory.



Thank you,
Leo













On Tue, Jul 8, 2014 at 11:15 AM, Hervé Pagès hpa...@fhcrc.org wrote:

Hi guys,


On 07/08/2014 05:29 AM, Michael Lawrence wrote:


This is why I tell people not to use require(). But what's with needing to
load IRanges to subset an Rle? Is that temporary?



Very temporary. The source code 

[Bioc-devel] asVCF error coming from normalizeSingleBracketSubscript

2013-11-22 Thread Stephanie M. Gogarten

Hi Valerie,

The asVCF method in SeqArray is failing as of today with a (to me) 
mysterious error.  I get it for the test files chr22.vcf.gz, ex2.vcf, 
and gl_chr1.vcf in extdata of VariantAnnotation, but not for 
SeqArray/extdata/CEU_Exon.vcf.  Do you have any suggestions of where I 
might look to figure out where this error is coming from?


thanks,
Stephanie

vcffile <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
gdsfile <- tempfile()
seqVCF2GDS(vcffile, gdsfile)
gdsobj <- seqOpen(gdsfile)
options(error=recover)
vcfg <- asVCF(gdsobj)
Error in normalizeSingleBracketSubscript(i, x) : subscript contains NAs

Enter a frame number, or 0 to exit

 1: asVCF(gdsobj)
 2: asVCF(gdsobj)
 3: .local(x, ...)
 4: VCF(rowData = .rowData(x), colData = .colData(x), exptData = 
SimpleList(hea

 5: .info(x, info)
 6: `[<-`(`*tmp*`, x == "", value = NA)
 7: `[<-`(`*tmp*`, x == "", value = NA)
 8: lsubset_List_by_List(x, i, value)
 9: .fast_lsubset_List_by_List(x, i, value)
10: replaceROWS(unlisted_x, unlisted_i, unlisted_value)
11: replaceROWS(unlisted_x, unlisted_i, unlisted_value)
12: extractROWS(setNames(seq_along(x), names(x)), i)
13: extractROWS(setNames(seq_along(x), names(x)), i)
14: normalizeSingleBracketSubscript(i, x)

 sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] VariantAnnotation_1.8.6 Rsamtools_1.14.1        Biostrings_2.30.1
[4] GenomicRanges_1.14.3    XVector_0.2.0           IRanges_1.20.6
[7] BiocGenerics_0.8.0      SeqArray_1.2.0          gdsfmt_1.0.0

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.24.0   Biobase_2.22.0         biomaRt_2.18.0
 [4] bitops_1.0-6           BSgenome_1.30.0        DBI_0.2-7
 [7] GenomicFeatures_1.14.2 RCurl_1.95-4.1         RSQLite_0.11.4
[10] rtracklayer_1.22.0     stats4_3.0.2           tools_3.0.2
[13] XML_3.95-0.2           zlibbioc_1.8.0



Re: [Bioc-devel] exp. data, updates

2013-07-10 Thread Stephanie M. Gogarten
I had this issue with GWASTools/GWASdata at one point.  There is only 
one copy of an experiment data package, so if you change it, it will 
break the current release version of bsseq.  The solution I came up with 
was to add new objects with different names to the data package, and 
change my examples in the devel version to use the new names.  You can 
delete the old objects after the next release cycle.


Stephanie

On 7/10/13 7:05 AM, Kasper Daniel Hansen wrote:

I have just changed the class definition for a core class in bsseq.  A
supporting experiment data package, bsseqData, contains two objects of this
core class and now it needs to be updated (specifically because the
example() contains a command which prints the object and this printing now
fails).

I can update the bsseqData package, but do I mess with earlier releases
when I do this? I am uncertain because the /data is stored in another
repository. How should I handle this?

Best,
Kasper



[Bioc-devel] converting character vector to DNAStringSetList

2013-04-19 Thread Stephanie M. Gogarten

Hi all,

There is a non-exported function in VariantAnnotation called 
.toDNAStringSetList that converts a vector of comma-separated character 
strings to a DNAStringSetList.  I'd like to use this code in a package 
I'm working on.  Would it make sense to export this from Biostrings?  If 
not, what is the proper way to attribute code if I borrow a few lines 
from .toDNAStringSetList?


thanks,
Stephanie



Re: [Bioc-devel] converting character vector to DNAStringSetList

2013-04-19 Thread Stephanie M. Gogarten

Thanks, Herve!

To answer Tim's question (though it has become irrelevant after Herve's 
solution): because I'd rather not add VariantAnnotation to my dependency 
tree for only a couple of lines of code.  It also seemed like the sort 
of thing that should be possible with a DNAStringSetList constructor, so 
I'm glad it will be starting tomorrow.


Stephanie

On 4/19/13 12:35 PM, Hervé Pagès wrote:

Hi Stephanie,

On 04/19/2013 11:55 AM, Stephanie M. Gogarten wrote:

Hi all,

There is a non-exported function in VariantAnnotation called
.toDNAStringSetList that converts a vector of comma-separated character
strings to a DNAStringSetList.  I'd like to use this code in a package
I'm working on.  Would it make sense to export this from Biostrings?  If
not, what is the proper way to attribute code if I borrow a few lines
from .toDNAStringSetList?


This should be a 1-liner but the 1-liner fails at the moment because the
DNAStringSetList() constructor doesn't work on a list:

comma_sep_strings <- c("AA,TT", "ACGT", "", "TT,A,,TAG")
DNAStringSetList(strsplit(comma_sep_strings, ",", fixed=TRUE))
   Error in strsplit(comma_sep_strings, ",") : non-character argument

Of course this should work (and it will work tomorrow after you run
biocLite()). In the mean time the workaround is to use do.call():

do.call(DNAStringSetList, strsplit(comma_sep_strings, ",",
fixed=TRUE))
   DNAStringSetList of length 4
   [[1]] AA TT
   [[2]] ACGT
   [[3]]   A DNAStringSet instance of length 0
   [[4]] TT A  TAG

Cheers,
H.

PS: Calling strsplit() with 'fixed=TRUE' makes it about 4x faster :-)
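
For reference, the splitting step on its own needs only base R. The sketch below reuses the example strings from this message and shows what strsplit() hands to the constructor; it does not call Biostrings, so the DNAStringSetList step itself is left out.

```r
comma_sep_strings <- c("AA,TT", "ACGT", "", "TT,A,,TAG")

# fixed = TRUE treats "," as a literal string rather than a regular expression.
pieces <- strsplit(comma_sep_strings, ",", fixed = TRUE)

# One list element per input string; the empty string yields character(0),
# and the doubled comma in the last input yields an empty field.
lengths(pieces)   # 2 1 0 4
pieces[[4]]       # "TT" "A" "" "TAG"
```

This matches the shape of the DNAStringSetList shown above: four elements, the third of length 0.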



thanks,
Stephanie



Re: [Bioc-devel] I would like to publish a bioconductor package.

2013-02-27 Thread Stephanie M. Gogarten
You can solve the package size issue by putting your example data in a 
separate experiment data package 
(http://www.bioconductor.org/packages/release/data/experiment/).


Stephanie

On 2/27/13 3:03 AM, Davide Rambaldi wrote:

Hi all,

I am working on a library called flowFit. The purpose of this library is to 
analyze FACS data from proliferation tracking dye studies.

The library depends on the flowCore and flowViz Bioconductor libraries and uses 
minpack.lm (Levenberg-Marquardt algorithm) to fit a set of peaks over the FACS 
data.

A typical experimental pipeline:

1) Acquire with FACS a sample of unlabelled cells
2) Acquire with FACS a sample of labeled and unstimulated cells (the Parent 
Population)
3) Acquire with FACS a sample of labeled and stimulated cells (the 
Proliferative Population)

In R we can use the flowCore functions to transform the raw data and to gate 
the population of interest. Once we have gated the correct population, with 2 
commands of flowFit you can perform the fitting:


library(flowFit)
parent.fitting.cfse <- parentFitting(QuahAndParish[[1]], "FITC-A")
fitting.cfse <- proliferationFitting(QuahAndParish[[2]], "FITC-A", 
parent.fitting.cfse@parentPeakPosition, parent.fitting.cfse@parentPeakSize)


The function can generate also some graphical output with:


plot(fitting.cfse)


To demonstrate the correctness of the fitting I have made some in silico 
simulations and a retrospective analysis of the data from the paper:

New and improved methods for measuring lymphocyte proliferation in vitro and in 
vivo using CFSE-like fluorescent dyes, Benjamin J.C. Quah, Christopher R. Parish, 
Journal of Immunological Methods (2012)

In this paper, the same population of lymphocytes (proliferating under the same 
growth conditions) was stained with 3 different proliferation tracking dyes: if 
the fitting algorithm is working as expected, we expect to estimate the same % 
of cells per generation in the 3 samples.

Comparing the 3 samples, we didn't see any significant difference in the 
estimated % of cells per generation, suggesting that the algorithm 
is correctly estimating the % of cells per generation.

I have posted a graphical output example with the Quah and Parish data (pdf) 
here:

http://dl.dropbox.com/u/40644496/QuahAndPArishOut.pdf

The dataset will be included in the library (in the data subdir).

I am currently writing the vignette (following the guidelines at 
http://www.bioconductor.org/developers/package-guidelines/) and fixing some 
graphical bugs (like the oversized legend …).

The package passes R CMD build and R CMD check (time: 86 seconds) with no errors 
on OSX and Linux (I have to find a Windows machine somewhere ...); I still have 
to test with the R-devel version of R.

The library is bigger than expected (4.2 Mb) because the example datasets (FCS 
files converted to .Rdata) are big (3.7M) and I don't know how to solve this 
issue...

My question is, how do I proceed from here?

I would like to publish the library/methods in a paper (the Bioinformatics 
journal, maybe?) and submit the library to Bioconductor. What is the correct 
way to proceed?

Thanks

P.S.: If I missed (again!) some FAQ, I apologize.

-
Davide Rambaldi, PhD.
-
IEO ~ MolMed
[e] davide.ramba...@ieo.eu
[e] davide.ramba...@gmail.com
