[Bioc-devel] Empty DataFrame Causes SummarizedExperiment Constructor Error

2023-05-12 Thread Dario Strbenac via Bioc-devel
Good day,

The default value of colData is DataFrame(). Not specifying an informative 
colData is fine.

countsMini <- matrix(rpois(100, 100), ncol = 10)
colnames(countsMini) <- paste("Cell", 1:10)
rownames(countsMini) <- paste("Gene", 1:10)
SummarizedExperiment(assays = list(counts = countsMini)) # Creates the object successfully.

But, explicitly specifying an empty DataFrame triggers an error. I don't 
understand why it is not equivalent to the constructor's default.

SummarizedExperiment(assays = list(counts = countsMini), colData = DataFrame())
Error in `rownames<-`(`*tmp*`, value = .get_colnames_from_first_assay(assays)) :
  invalid rownames length

What is the subtle difference? It also seems like there could be a clearer 
error message emitted if this is caught in the right place.
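
One way such a difference can arise in R is sketched below, on the assumption 
(not confirmed in this message) that the constructor branches on 
missing(colData) rather than comparing the supplied value with the default. 
The function makeObject is hypothetical and uses a base data.frame in place of 
DataFrame.

```r
# Hypothetical constructor: the default value is only a placeholder and the
# body branches on whether the argument was supplied at all.
makeObject <- function(counts, colData = data.frame())
{
  if (missing(colData)) {
    # Argument omitted: build a zero-column table with one row per sample.
    colData <- data.frame(row.names = colnames(counts))
  } else {
    # Argument supplied: try to label its rows with the sample names.
    rownames(colData) <- colnames(counts)
  }
  colData
}

counts <- matrix(1:4, ncol = 2, dimnames = list(NULL, c("A", "B")))
omitted <- makeObject(counts)  # Succeeds.
explicit <- try(makeObject(counts, colData = data.frame()), silent = TRUE)
inherits(explicit, "try-error")  # The zero-row table cannot take two row names.
```

If that assumption holds, omitting colData and passing an empty table take 
different code paths even though the values are identical, which would account 
for the error appearing only in the second call.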

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia
___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] TCGAbiolinks fails

2023-04-28 Thread Dario Strbenac via Bioc-devel
Good day,

The package has checking errors which the developers of it need to fix 
themselves.

Quitting from lines 114-121 (subtypes.Rmd) 
Error: processing vignette 'subtypes.Rmd' failed with diagnostics:
object 'lgg.gbm.subtype' not found

The installation error simply indicates that the package has never built 
successfully in Bioconductor 3.17.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] S4 Methods Documentation Convention Triggers Warnings

2023-01-27 Thread Dario Strbenac via Bioc-devel
Good day,

So, is the ultimate solution to manually change everything to the format of

\item{\code{show(x)}:}{
  ...
} ?

The warnings persist, so it does not seem as though R will revert to allowing 
the currently-popular syntax past its check.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] S4 Methods Documentation Convention Triggers Warnings

2022-11-26 Thread Dario Strbenac via Bioc-devel
Good day,

For a long time, it has been a convention to document S4 methods in the format:

\section{Displaying}{
  In the code snippets below, \code{x} is a GRanges object.
  \describe{
    \item{}{
      \code{show(x)}:
      Displays the first five and last five elements.
    }
  }
}

In R Under Development, this is now a warning:

* checking Rd files ... WARNING
checkRd: (5) GRanges-class.Rd:115-165: \item in \describe must have non-empty label.

This affects my own package as well as the core Bioconductor packages which I 
used as inspiration for designing my package documentation seven years ago. 
What should the new convention be? Or could R developers be convinced to get 
rid of this check before this prototype is released?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] DataFrameList to Wide Format DataFrame

2021-12-16 Thread Dario Strbenac via Bioc-devel
Hello,

Ah, yes, the sample names should of course be in the rows - Friday afternoon 
error. In the question, I specified "largely the same set of features", 
implying that the overlap is not complete. So, the example below will error.

DFL <- DataFrameList(X = DataFrame(a = 1:3, b = 3:1, row.names = LETTERS[1:3]),
                     Y = DataFrame(b = 4:6, c = 6:4, row.names = LETTERS[20:22]))
unlist(DFL)
Error in .aggregate_and_align_all_colnames(all_colnames, strict.colnames = strict.colnames) :
  the DFrame objects to combine must have the same column names

This is long but works:

allFeatures <- unique(unlist(lapply(DFL, colnames)))
DFL <- lapply(DFL, function(DF)
{
  missingFeatures <- setdiff(allFeatures, colnames(DF))
  DF[missingFeatures] <- NA
  DF
})
DFLflattened <- do.call(rbind, DFL)

Is there a one-line function for it?
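
No one-line function is named in this message. A small helper along the 
following lines could wrap the workaround and also record which list element 
each row came from. It is only a sketch: it uses base data.frame objects so 
that it runs without S4Vectors, and combineTables and the dataset column are 
hypothetical names.

```r
# Hypothetical helper: pad each table with the columns it lacks, tag every row
# with the name of the list element it came from, then stack the tables.
combineTables <- function(tables)
{
  allFeatures <- unique(unlist(lapply(tables, colnames)))
  padded <- Map(function(tab, name)
  {
    for (feature in setdiff(allFeatures, colnames(tab)))
      tab[[feature]] <- NA
    tab[["dataset"]] <- name
    tab[c("dataset", allFeatures)]  # Same column order in every table.
  }, tables, names(tables))
  do.call(rbind, unname(padded))
}

tables <- list(X = data.frame(a = 1:3, b = 3:1, row.names = LETTERS[1:3]),
               Y = data.frame(b = 4:6, c = 6:4, row.names = LETTERS[20:22]))
wide <- combineTables(tables)
wide
```

Features that a data set did not measure appear as NA, so the rows of every 
data set share one set of columns.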

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] DataFrameList to Wide Format DataFrame

2021-12-16 Thread Dario Strbenac via Bioc-devel
Good day,

Is there a function in the S4Vectors API which converts a DataFrameList into a 
DataFrame, automatically putting the list names into one of the metadata 
columns, analogous to MultiAssayExperiment's wideFormat function? The scenario 
is multiple data sets from different organisations measuring largely the 
same set of features and patient outcome, but on completely different sets of 
patients in each organisation.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] bpparam Non-deterministic Default

2021-11-26 Thread Dario Strbenac via Bioc-devel
Hello,

Might it instead be made possible to set an RNGseed value by specifying one to 
bpparam but still get the automated back-end selection, so that it could easily 
be set to a particular value in an R package?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



[Bioc-devel] bpparam Non-deterministic Default

2021-11-26 Thread Dario Strbenac via Bioc-devel
Good day,

I maintain an R package which makes use of functions such as bplapply which have 
bpparam() as the default. I have received feedback from a beginner user that 
the results changed when he knitted his R Markdown document a second time. This 
stems from the default constructor of bpparam() which sets no RNGseed. I am 
wondering about the desirability of changing the RNGseed default in 
BiocParallel to a particular uncontroversial number, such as 12345, so that 
beginners get deterministic behaviour.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] S4 Method Slow Execution if Signature Has Multiple Class Unions

2021-11-22 Thread Dario Strbenac via Bioc-devel
Good day,

I created two constructor methods for a generic function. One is for the 
default empty constructor and the other is a constructor for when one or more 
parameters are specified by the user. The method signatures are:

1. c("missing", "missing", "missing", "missing", "missing", "missing", 
"missing", "missing"),
2. c("characterOrMissing", "numericOrMissing", "numericOrMissing", 
"numericOrMissing", "numericOrMissing", "characterOrMissing", 
"BiocParallelParamOrMissing", "numericOrMissing")

The class unions are defined as you might expect.

setClassUnion("characterOrMissing", c("character", "missing"))
setClassUnion("numericOrMissing", c("numeric", "missing"))
setClassUnion("BiocParallelParamOrMissing", c("BiocParallelParam", "missing"))

The first method works as expected:

> system.time(CrossValParams())
   user  system elapsed 
  0.165   0.000   0.165

The second takes over ten minutes and uses 100% CPU throughout, according 
to top.

> system.time(CrossValParams("Leave-k-Out", leave = 2))
   user  system elapsed 
760.018  15.093 775.090

Strangely, if I rerun this code, it works quickly the second time.

> system.time(CrossValParams("Leave-k-Out", leave = 2))
   user  system elapsed 
  0.145   0.000   0.145

I haven't been able to come up with a minimal reproducible example of the issue. 
How can this be done consistently and efficiently?
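
One alternative that avoids dispatching on an eight-argument signature is an 
ordinary constructor function with default values. The sketch below is an 
assumption about what would suffice, not ClassifyR's actual code: the class 
name CrossValSketch, its slots and the default "Permute k-Fold" are 
hypothetical.

```r
library(methods)

setClass("CrossValSketch", representation(samplesSplits = "character",
                                          leave = "numeric",
                                          seed = "numeric"))

# A plain function: the defaults cover the empty call, so no generic and no
# "missing" class unions are needed to tell the two cases apart.
CrossValSketch <- function(samplesSplits = "Permute k-Fold", leave = 2,
                           seed = NA_real_)
{
  new("CrossValSketch", samplesSplits = samplesSplits, leave = leave,
      seed = seed)
}

emptyCall <- CrossValSketch()
customCall <- CrossValSketch("Leave-k-Out", leave = 2)
```

The slow first call followed by a fast second call is consistent with method 
selection being worked out once and then cached, although nothing in this 
message confirms that this is the cause.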

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Delayed Assignment to S4 Slots

2021-10-13 Thread Dario Strbenac via Bioc-devel
Good day,

I have an S4 class with some slots in my Bioconductor package. One of the slots 
stores the range of top variables to try during feature selection (the 
variables might be ranked by some score, like a t-test). The empty constructor 
looks like

setMethod("ResubstituteParams", "missing", function()
{
  new("ResubstituteParams", nFeatures = seq(10, 100, 10), performanceType = 
"balanced error")
})

But, someone might have a small omics data set with only 40 features (e.g. 
CyTOF). Therefore, trying the top 10, 20, ..., 100 is not a good default. A 
good default would wait until the S4 class is accessed within cross-validation 
and then, based on the dimensions of the matrix or DataFrame, pick a suitable 
range. I looked at delayedAssign, but x is described as "a variable name (given 
as a quoted string in the function call)". It doesn't seem to apply to S4 slots 
based on my understanding of it.

> r <- ResubstituteParams()
> delayedAssign("r@nFeatures", nrow(measurements))
> measurements <- matrix(1:100, ncol = 10)
> r@nFeatures # Still the value from empty constructor.
 [1]  10  20  30  40  50  60  70  80  90 100
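
delayedAssign creates a variable whose name is the literal string it is given, 
so the call above defines a new variable called "r@nFeatures" and leaves the 
slot alone. One way to get a data-dependent default is to store NULL in the 
slot and resolve it when the data is available. The sketch below assumes that 
approach; ResubstituteSketch, numericOrNULL and resolveFeatures are 
hypothetical names.

```r
library(methods)

# The string is taken as a variable name, not parsed as a slot access.
delayedAssign("r@nFeatures", 5)
literalNameExists <- exists("r@nFeatures")

# Allow the slot to hold NULL, meaning "decide later".
setClassUnion("numericOrNULL", c("numeric", "NULL"))
setClass("ResubstituteSketch", representation(nFeatures = "numericOrNULL"))

# Hypothetical helper, called inside cross-validation once the data is known.
resolveFeatures <- function(params, measurements)
{
  if (!is.null(params@nFeatures)) return(params@nFeatures)  # User's choice.
  upper <- min(100, nrow(measurements))
  if (upper < 10) return(upper)  # Fewer than ten features: use them all.
  seq(10, upper, 10)
}

params <- new("ResubstituteSketch", nFeatures = NULL)
measurements <- matrix(rnorm(400), nrow = 40)  # 40 features.
chosen <- resolveFeatures(params, measurements)
chosen
```

With 40 features the helper stops at 40 rather than 100, while a value set 
explicitly by the user is returned unchanged.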

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] Windows-specific Function Not Found Error

2021-10-12 Thread Dario Strbenac via Bioc-devel
Hello,

Ah, I had a few different uses of MultiAssayExperiment::colData in a particular 
function of the package, but one line had only colData without the scoping in 
front. I wish that R error messages displayed R file names and line numbers 
more often.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Windows-specific Function Not Found Error

2021-10-12 Thread Dario Strbenac via Bioc-devel
Good day,

I see a checking failure for ClassifyR for Windows Server 2019 only. The error 
is

Error: BiocParallel errors
  4 remote errors, element index: 1, 4, 6, 8
  6 unevaluated and other errors
  first remote error: could not find function "colData"

Is there anything I can change in my code to help it pass? The error doesn't 
appear on the two other Bioconductor servers.

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] VariantAnnotation Installation Compile Error

2021-05-25 Thread Dario Strbenac
Hello,

The problem stemmed from an .Rprofile file which was setting .libPaths to the 
directory path of a library of packages for the previous version of R. 
Starting R with the --vanilla option avoided the problem.
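
A quick check for this kind of mix-up is to list the active library paths and 
look for packages built under a different R series. The sketch below uses only 
base R; the variable names are illustrative.

```r
# Library directories that this R session searches, in order.
.libPaths()

# Installed packages with the R version each was built under.
pkgs <- installed.packages()[, c("Package", "LibPath", "Built"), drop = FALSE]

# The running R series, for example "4.1".
currentSeries <- paste(R.version$major, sub("\\..*", "", R.version$minor),
                       sep = ".")

# Packages whose build version does not start with the running series.
isCurrent <- startsWith(pkgs[, "Built"], paste0(currentSeries, "."))
stale <- pkgs[which(!isCurrent), , drop = FALSE]
stale
```

Any rows in stale point to a library directory carried over from an older 
installation.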

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] VariantAnnotation Installation Compile Error

2021-05-24 Thread Dario Strbenac
Good day,

I also see NULL on R start-up. I installed R from source.

$ R-4.1.0/bin/R

R version 4.1.0 (2021-05-18) -- "Camp Pontanezen"
......
Type 'q()' to quit R.

NULL
> 

I noticed a couple of error messages at the end of the installation which I 
thought were harmless. I will reinstall R. The extracted directory and prefix 
directory were the same, which might be problematic.

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] VariantAnnotation Installation Compile Error

2021-05-24 Thread Dario Strbenac
Good day,

No, the temporary directory has space remaining. I wonder what file it is 
referring to by "No such file or directory". I had an idea to reinstall 
Biostrings using force = TRUE, but it didn't help.

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] VariantAnnotation Installation Compile Error

2021-05-23 Thread Dario Strbenac
Good day,

I apparently have a valid Bioconductor package library but VariantAnnotation 
won't install successfully.

> valid()
[1] TRUE
> install("VariantAnnotation")
Bioconductor version 3.13 (BiocManager 1.30.15), R 4.1.0 (2021-05-18)
Installing package(s) 'VariantAnnotation'
trying URL 
'https://bioconductor.org/packages/3.13/bioc/src/contrib/VariantAnnotation_1.38.0.tar.gz'
Content type 'application/x-gzip' length 1726088 bytes (1.6 MB)
==
downloaded 1.6 MB

NULL
* installing *source* package ‘VariantAnnotation’ ...
** using staged installation
** libs
gcc -I"/verona/biostat/software/R-4.1.0/include" -DNDEBUG NULL 
-D_FILE_OFFSET_BITS=64 
-I'/dskh/biostat/software/R-4.1.0/library/S4Vectors/include' 
-I'/dskh/biostat/software/R-4.1.0/library/IRanges/include' 
-I'/dskh/biostat/software/R-4.1.0/library/XVector/include' 
-I'/dskh/biostat/software/R-4.1.0/library/Biostrings/include' 
-I'/dskh/biostat/software/R-4.1.0/library/Rhtslib/include' -I/usr/local/include 
  -fpic  -g -O2  -c Biostrings_stubs.c -o Biostrings_stubs.o
gcc: error: NULL: No such file or directory
make: *** [/verona/biostat/software/R-4.1.0/etc/Makeconf:168: 
Biostrings_stubs.o] Error 1
ERROR: compilation failed for package ‘VariantAnnotation’

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)
BLAS:   /dskh/biostat/software/R-4.1.0/lib/libRblas.so
LAPACK: /dskh/biostat/software/R-4.1.0/lib/libRlapack.so

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] cannot reproduce the build error with InTAD package

2020-12-28 Thread Dario Strbenac
It looks like you are creating a MultiAssayExperiment in your vignette. 
Numerous Bioconductor packages relying on MultiAssayExperiment infrastructure 
started failing a few days ago with the release of version 1.17.3, but I don't 
see the breaking change explained in the News file.


Re: [Bioc-devel] Recent change in MultiAssayExperiment for inferred MAE-level colData

2020-12-28 Thread Dario Strbenac
This also happens to ClassifyR.


Re: [Bioc-devel] Update R package I developed which has been released by bioconductor

2020-09-20 Thread Dario Strbenac
Good day,

Step 1: Follow the steps at 
http://bioconductor.org/developers/how-to/git/push-to-github-bioc/

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] BiocParallel Variable Not Found

2020-03-17 Thread Dario Strbenac
Good day,

I am not sure how to fix my package properly, even with the good example. A 
link to the specific part of my function is 
https://github.com/DarioS/ClassifyR/blob/e35899caceb401691990136387a517f4c3b57d5e/R/runTests.R#L567
 and the example in the help page of runTestsEasyHard function triggers the 
error shown in Bioconductor's daily build.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] BiocParallel Variable Not Found

2020-03-17 Thread Dario Strbenac
Good day,

Thanks for the examples which demonstrate the issue. Do you have other 
recommendations if, inside the loop, another function in the package is being 
called and the variable being passed is the ellipsis? There are only a couple 
of variables which might be provided by the user collected in the ellipsis, so 
the functional approach might still be the best in that case.
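
The functional approach with an ellipsis can be sketched in base R as follows. 
lapply forwards its own ... to FUN, and bplapply has the same X, FUN, ... 
layout, so the call shape should carry over; that it resolves the Windows 
error in ClassifyR is an assumption. fitOne, runAll and the scale element are 
hypothetical.

```r
# Hypothetical worker: everything it needs arrives through its arguments, so
# nothing has to be found in the calling environment.
fitOne <- function(index, measurements, selParams, ...)
{
  extra <- list(...)  # Optional settings collected from the user.
  sum(measurements[, index]) * selParams$scale + length(extra)
}

runAll <- function(measurements, selParams, ...)
{
  # The ellipsis is passed straight through to the worker function.
  lapply(seq_len(ncol(measurements)), fitOne, measurements = measurements,
         selParams = selParams, ...)
}

measurements <- matrix(1:6, ncol = 3)
results <- unlist(runAll(measurements, selParams = list(scale = 2),
                         verbose = TRUE))
results
```

Because selParams and the ellipsis contents are arguments of the worker, they 
travel with each call instead of being looked up on the worker process.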

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] BiocParallel Variable Not Found

2020-03-17 Thread Dario Strbenac
Good day,

I have a loop in a function of my R package which by default uses bpparam() to 
set the framework used for parallelisation. On Windows, I see the error

Error: BiocParallel errors
  element index: 1, 2, 3, 4, 5, 6, ...
  first error: object 'selParams' not found

This error does not happen on the Linux or MacOS operating systems. It happens 
using both R 3.6 and the upcoming version 4. The error can be reproduced 
running the examples of runTests function in ClassifyR.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] Use of SummerisedExperiments or MultiAssayExperiments of many many Dataframes/ nested List objects

2020-01-31 Thread Dario Strbenac
Good day,

You are operating with tables of statistical hypothesis test summaries, rather 
than input data which the tests are done with, so it doesn't make sense to use 
SummarizedExperiment or MultiAssayExperiment. The data is not experimental 
measurements. You should try DataFrame from Bioconductor package S4Vectors. 
It's better than a data.frame and won't flood your console with output.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] Whats the timeframe for Bioc Support updates and downtime?

2020-01-17 Thread Dario Strbenac
Good day,

Could the forum have automatic saving of drafted text like some other forums?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] Bioconductor 3.10 is released!!

2019-10-30 Thread Dario Strbenac
Good day,

In the development branch, all packages are only built on Linux.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] patch old releases? e.g. RELEASE_3_8

2019-07-12 Thread Dario Strbenac
Good day,

No; anything older than the release branch at present is not modifiable.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] how to achieve reproducibility with BiocParallel regardless of number of threads and OS (set.seed is disallowed)

2019-06-19 Thread Dario Strbenac
Good day,

Should setting workers to 1 and RNGseed to a number result in a warning to the 
user that the seed will effectively be ignored?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] GAlignments Constructor Type Checking Error

2019-04-14 Thread Dario Strbenac
Good day,

Although the documentation states "Generally not used directly", I'm trying it. 
A small example fails because the input is evaluated to be in the wrong format, 
but it doesn't seem so when I look at the variable type of strand.

debug(GAlignments)
GAlignments("chr1", 1L, strand = Rle(factor('+')), cigar = "10M")

debug: new("GAlignments", NAMES = names, seqnames = seqnames, 
start = pos, cigar = cigar, strand = strand, elementMetadata = 
elementMetadata, 
seqinfo = seqinfo)
Browse[2]> strand
factor-Rle of length 1 with 1 run
  Lengths: 1
  Values : +
Levels(1): +
Browse[2]> n
Error in validObject(.Object) : 
  invalid class “GAlignments” object: 'strand(x)' must be an unnamed 'factor' 
Rle with no NAs (and with levels +, - and *)

This looks like a false-positive to me. Also, it would increase readability if 
the constructor didn't run off the edge of the PDF page in the reference manual 
by using \preformatted. Also, I wonder why seqnames is automatically converted 
into a factor Rle, but strand isn't. Couldn't strand also use .asFactorRle?
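
The validity message lists the levels +, - and *, whereas factor('+') has the 
single level +. The base R sketch below shows the difference; that wrapping 
the three-level factor in Rle() satisfies the GAlignments validity check is an 
assumption not tested here.

```r
# A factor built from the value alone only knows the levels it has seen.
narrow <- factor("+")
levels(narrow)

# Declaring all three strand levels up front, as the message asks for.
full <- factor("+", levels = c("+", "-", "*"))
levels(full)
```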

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



[Bioc-devel] GitHub Pages Vignettes

2019-03-14 Thread Dario Strbenac
Good day,

On the Bioconductor website, scde has no vignettes listed in the Documentation 
section. Looking at the contents of the package, there is a vignettes directory 
with four vignettes in it. They each have output: md_document and are hosted on 
GitHub.io. The vignettes also are not accessible from within R

> browseVignettes("scde")
No vignettes found by browseVignettes("scde")

Is such a documentation design choice suitable for Bioconductor packages? The 
Vignettes section of Package Guidelines states that a vignette is mandatory, 
but there is no statement about acceptable output formats of vignettes. Also, 
the Package Vignettes webpage seems to have been written before HTML vignettes 
were possible, because it refers only to Rnw and PDF files. Its URL is 
http://bioconductor.org/help/package-vignettes/ Could such requirements be made 
explicit?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



[Bioc-devel] QuasR Overwrites Base Graphics Settings

2019-03-06 Thread Dario Strbenac
Good day,

Doing a quality control plot with QuasR overwrites the user's figure margins 
and further plots don't work. A small, self-contained example is

plot(1:10) # A plot without error
library(QuasR)
testFile <- system.file("extdata", "ex1.bam", package="Rsamtools")
qQCReport(testFile) # Fails because figure margins too large
plot(1:10) # Also fails because figure margins too large

The value of par("mar") is different before and after using qQCReport. Can 
QuasR be changed so that it does not clobber the R session's graphics 
parameters?
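
A common base R idiom for leaving the session's graphics parameters untouched 
is sketched below. Whether it fits QuasR's internals is an assumption, and 
plotWithWideMargins is a hypothetical stand-in for a plotting function such as 
qQCReport.

```r
plotWithWideMargins <- function(values)
{
  # par() returns the previous values of the parameters it changes.
  oldPar <- par(mar = c(8, 8, 4, 4))
  # Restore them when the function exits, even if plotting fails.
  on.exit(par(oldPar))
  plot(values)
  invisible(NULL)
}

pdf(NULL)  # Graphics device that writes no file.
before <- par("mar")
plotWithWideMargins(1:10)
after <- par("mar")
dev.off()
identical(before, after)
```

Using on.exit means the restore also happens when the plotting code stops with 
an error such as "figure margins too large".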

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] Unable to install database package in devel version of Bioconductor 3.9

2019-02-24 Thread Dario Strbenac
Good day,

You need to provide more information to get useful guidance. What version of R 
did you use? From the error message, it seems that it's less than 3.5.0 but it 
should be R Under Development.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] package named spdeq causes error

2019-02-21 Thread Dario Strbenac
Good day,

I don't, but your software package imports agricolae which imports spdep. spdep 
is available from CRAN, so it's strange that the Bioconductor build server 
running Linux has not been able to install it.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] support the stable version of R

2019-01-16 Thread Dario Strbenac
Good day,

R's checking may encourage a dependency on R to be placed in the DESCRIPTION 
file, based on examining the data files distributed with the package. For 
ClassifyR, I get a warning if the dependency is absent.

* checking data for ASCII and uncompressed saves ... WARNING
  Warning: package needs dependence on R (>= 2.10)

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] ClassifyR Check Error on Linux and MacOS Systems

2018-12-17 Thread Dario Strbenac
Good day,

Thanks for running it. I found the error you had happens because of an example 
which used random sampling and rarely returned a zero-length result. I have 
made the example deterministic so that it always succeeds.

The error you saw is not related to the problem seen on the build servers, 
though. I found a browser() inside an R function which I forgot to remove 
before committing. After removal and committal, the error on malbec1 and 
merida1 is gone. It is surprising that it did not trigger an error when 
checking the package before committing it and that the error message observed 
on the build servers was not clear about what the problem was.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] ClassifyR Check Error on Linux and MacOS Systems

2018-12-06 Thread Dario Strbenac
Good day,

There is an error for ClassifyR on malbec1 and merida1 caused by a 
documentation example. However, it doesn't occur on tokay1. Can I get more 
information about which example is emitting the error on malbec1 server?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] Cannot access remote upstream after changing the laptop

2018-11-21 Thread Dario Strbenac
Good day,

You could also copy the private key from the old computer to the new computer, 
if you still can use the old computer.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] Need update access agaisnt package 'banocc'

2018-10-20 Thread Dario Strbenac
Good day,

The warning message is caused by a documentation linking inconsistency in R 
running on different operating systems. It may be avoided, but it's not 
essential. Perhaps documentation linking will soon be consistent between 
operating systems.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] Windows error "UCSC library operation failed" in package karyoploteR

2018-09-30 Thread Dario Strbenac
Good day,

The import of BigWig files does not work on Windows, and this is documented. Execute 
?BigWigFile-class and notice in the Description section: "These functions do 
not work on Windows.".
------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] EXTERNAL: Fwd: PREDA problems reported in the Multiple platform build/check report for BioC 3.7

2018-08-06 Thread Dario Strbenac
Good day,

Similar to you, I am awaiting the restoration of sparsediscrim which was 
removed on the same day as PREDA.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Action for Uncompressed Data Warning

2018-06-16 Thread Dario Strbenac
Good day,

I added a new data set to a package I develop and there is the warning:

* checking data for ASCII and uncompressed saves ... WARNING
  
  Note: significantly better compression could be obtained
        by using R CMD build --resave-data
                old_size new_size compress
  asthma.RData     715Kb    484Kb    bzip2

Should I ignore it or save it again with compression? The 231 Kb reduction in 
file size seems insignificant.
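
If the file is to be saved again, one option besides R CMD build 
--resave-data is tools::resaveRdaFiles. The sketch below uses a stand-in data 
set written to a temporary directory, since asthma.RData is not available 
here; asthmaLike and dataFile are illustrative names.

```r
# Stand-in data set saved with the default (gzip) compression.
asthmaLike <- data.frame(id = seq_len(50000),
                         value = rep(c(1.5, 2.5), 25000))
dataFile <- file.path(tempdir(), "asthmaLike.RData")
save(asthmaLike, file = dataFile)
sizeBefore <- file.size(dataFile)

# Rewrite the file in place with the compression named in the check output.
tools::resaveRdaFiles(dataFile, compress = "bzip2")
sizeAfter <- file.size(dataFile)

# Report the size and compression type now recorded for the file.
tools::checkRdaFiles(dataFile)
c(before = sizeBefore, after = sizeAfter)
```

The objects stored in the file are unchanged; only the compression of the file 
on disk differs.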

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] BiocParallel on Windows Never Ends

2018-06-13 Thread Dario Strbenac
Good day,

I couldn't get a working param object. It never completes the command 

param = bpstart(SnowParam(2, manager.hostname = "144.130.152.1", manager.port = 2559))

I obtained the IP address by typing "My IP address" into Google and it gave me 
the address shown. I used netstat -an and 

  Proto  Local Address      Foreign Address    State
  TCP    127.0.0.1:2559     0.0.0.0:0          LISTENING

was one of the results displayed. I have reproduced this problem on another 
computer with Windows 10. I also tried

param = bpstart(SnowParam(2, manager.hostname = "127.0.0.1", manager.port = 2559))

but it doesn't complete.

I was able to identify the problem is with the line

bpbackend(x) <- do.call(parallel::makeCluster, cargs)

So, to summarise,

> cargs
$`spec`
[1] 2

$type
[1] "SOCK"

$snowlib
[1] "C:/Program Files/R/R-3.5.0/library/BiocParallel"

$master
[1] "127.0.0.1"

$port
[1] 2559

> do.call(parallel::makeCluster, cargs) # Freezes.

Should I ask the question on R-devel because it doesn't appear to be specific 
to Bioconductor?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] BiocParallel on Windows Never Ends

2018-06-12 Thread Dario Strbenac
Good day,

I was interested in how my package performs on a 32-bit Windows 
computer because I'm going to give a workshop about it soon and some people 
might bring old laptops. I found that using SnowParam with workers set to more 
than 1 never finishes. The minimal code to cause the issue is:

bplapply(1:10, function(i) LETTERS[i], BPPARAM = SnowParam(workers = 1)) # Immediately returns a result.
bplapply(1:10, function(i) LETTERS[i], BPPARAM = SnowParam(workers = 2)) # Never completes.

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] BiocParallel_1.14.1

loaded via a namespace (and not attached):
[1] compiler_3.5.0 snow_0.4-2 parallel_3.5.0

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] Unexpected Warning About Cross-Reference Without Package Specification

2018-05-12 Thread Dario Strbenac
Good day,

Thanks. I'll use the [limma] specifier to avoid the Warning from the build 
system.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] Unexpected Warning About Cross-Reference Without Package Specification

2018-05-12 Thread Dario Strbenac
Good day,

I created a minimalist package that demonstrates the issue and it is attached 
to this letter. After using R CMD build, the subsequent R CMD check process 
emits one warning.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia

Tester.tar.gz
Description: Tester.tar.gz


Re: [Bioc-devel] Unexpected Warning About Cross-Reference Without Package Specification

2018-05-11 Thread Dario Strbenac
Good day,

limma was installed using biocLite, so it would be built before R CMD check was 
run. I could summarise all of the relevant information and send to 
R-package-devel mailing list to check if it is a bug.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] Unexpected Warning About Cross-Reference Without Package Specification

2018-05-11 Thread Dario Strbenac
Good day,

Indeed, it is in the Suggests component of the dependency specification. I 
didn't find any extra requirements for this case in the Cross-references 
section of Writing R Extensions, so I'm unsure of where to read about the rule.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] BiocInstaller: next generation

2018-05-11 Thread Dario Strbenac
Good day,

The features of the proposed package seem a lot like BiocInstaller. Once I have 
upgraded R and have the newest BiocInstaller installed using the bootstrapping 
technique of source("https://bioconductor.org/biocLite.R"), I typically do

library(BiocInstaller)
biocLite("GenomicAlignments")

to install the GenomicAlignments package in a subsequent R session, for 
instance. This avoids repetitive sourcing of the biocLite script from the 
Bioconductor server.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Unexpected Warning About Cross-Reference Without Package Specification

2018-05-10 Thread Dario Strbenac
Good day,

I have documented a parameter that is linked to lmFit's documentation.

\item{...}{Optional settings that are passed to \code{\link{lmFit}}.}

The package checking process displays a warning.

* checking Rd cross-references ... WARNING
Missing link or links in documentation object 'limmaSelection.Rd':
  ‘lmFit’

If I add [limma] to the cross-reference, the link is resolved.

* checking Rd cross-references ... OK

Why is the package specification not optional in this scenario? I am using the 
latest release of R.

* using R version 3.5.0 (2018-04-23)
* using platform: x86_64-pc-linux-gnu (64-bit)
* using session charset: UTF-8

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



[Bioc-devel] mcols Function Not Found for Windows Build

2018-03-15 Thread Dario Strbenac
Good day,

I notice an error happening when the vignette of ClassifyR is checked by 
tokay2. mcols is not found. I viewed the check reports of S4Vectors, and there 
are some Warnings for all operating systems, but no platform has Error, so it's 
unlikely to be related to the problem. Is there a way to make ClassifyR guard 
against this problem in Windows? I don't know how to begin solving this issue.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] BiocCheck - warning: files are over 5MB

2018-03-10 Thread Dario Strbenac
Good day,

You could make use of the package named BSgenome.Celegans.UCSC.ce11. It 
contains the DNA sequences of all of the chromosomes of the roundworm and 
doesn't add any size to your package.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Numeric Operation on DataFrame

2018-01-15 Thread Dario Strbenac
Good day,

Would it be useful to provide the same operations which can be done to a 
data.frame for a DataFrame in a future release of S4Vectors? For example,

dataTable <- data.frame(aFeature = 1:5, anotherFeature = 5:1)
colMeans(dataTable)
#       aFeature anotherFeature 
#              3              3 
dataTableS4 <- DataFrame(aFeature = 1:5, anotherFeature = 5:1)
colMeans(dataTableS4)
Error in colMeans(dataTableS4) : 
'x' must be an array of at least two dimensions

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] Emails to package maintainer bouncing back - what to do?

2017-11-19 Thread Dario Strbenac
Good day,

Although the maintainer is unreachable, the original developer, Gábor Csárdi, 
is an active member of the R programming community. You should write to him.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Pandoc on Build Computers

2017-11-14 Thread Dario Strbenac
Good day,

In the vignette of ClassifyR, I have

institute: The University of Sydney, Australia

at the top. institute has been a valid entry since version 1.17 of Pandoc, 
released in 2016. Could Pandoc on the Bioconductor computers be updated? I 
notice that version 2.0.2 has been available since earlier this week.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] exonsBy dropping genes from TxDb

2017-10-28 Thread Dario Strbenac
Good day,

I stepped through the code until execution reached the end of postForm in RCurl 
which is called by getBM and obtains the textual result from the server. If I 
check the contents of write$value(), the example missing transcript is not 
there.

Browse[3]> grep("ENST0485971", write$value())
integer(0)

write$value is a weird function. Its prototype is function (collapse = "", 
...) but its body contains code such as

if (is.null(collapse)) 
return(txt)

I wonder where txt is created. It's not passed as an extra variable.

Browse[7]> print(list(...))
list()

Searching the R code reveals that txt is created as a global variable in 
another function named dynCurlReader by the code statement txt <<- character().
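
A minimal sketch of the mechanism, using a hypothetical makeReader function rather than RCurl's actual code: in R, <<- assigns to an existing binding in an enclosing environment and only falls back to the global environment when none is found, so an inner function can use txt without receiving it as an argument. Whether dynCurlReader keeps txt in its own closure or in the global environment has not been verified here.

```r
# Hypothetical reader factory illustrating the closure pattern; not RCurl code.
makeReader <- function() {
  txt <- character()                 # binding in the enclosing environment
  update <- function(str) {
    txt <<- c(txt, str)              # modifies the enclosing txt, not a local copy
    invisible(NULL)
  }
  value <- function(collapse = "", ...) {
    if (is.null(collapse))
      return(txt)                    # txt is found by lexical scoping
    paste(txt, collapse = collapse)
  }
  list(update = update, value = value)
}

reader <- makeReader()
reader$update("first chunk;")
reader$update("second chunk")
result <- reader$value()
pieces <- reader$value(collapse = NULL)
print(result)
print(pieces)
```

In this sketch, txt can be inspected with environment(reader$value)$txt while debugging, which may be easier than searching the code for its definition.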

RCurl also uses functions that don't begin with a dot but are undocumented.

ans = encode(ans)
Browse[7]> ?encode
No documentation for ‘encode’ in specified packages and libraries

Anyway, the transcript ID is also missing from txt.

Browse[7]> grep("ENST0485971", txt)
integer(0)

It's hard to know what the obfuscated code of RCurl is doing.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] how can I contribute to the success of great packages?

2017-10-20 Thread Dario Strbenac
Good day,

Thanks for the clarification. I appreciate your regular insights on the support 
forum over the years. It seems that Gviz will be stable enough to use, although 
the same maintainer's domainsignatures package has strikethrough across its 
name in the 3.6 build report, indicating its deprecation from Bioconductor. 
domainsignatures has no NEWS file explaining why it is being deprecated, so the 
deprecation seems unplanned and unintentional, so end-users of it would have no 
advance notice if it later became defunct until they were faced with a failed 
biocLite installation command. I simply wish to avoid that situation with 
genomic plotting. Indeed, I wouldn't be as cautious if I was considering csaw, 
for example, and noticed build system warnings close to the deadline.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Gviz Abandonware

2017-10-20 Thread Dario Strbenac
Hello,

Gviz hasn't been updated for the past two months but has a CHECK warning and 
there are almost no answered questions on the support website in the past three 
months. Is it worthwhile developing plotting functions based on Gviz if it is 
likely to become defunct next year?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] Why should Bioconductor developers re-use core classes?

2017-10-18 Thread Dario Strbenac
Good day,

It might be useful to readers to have a comparison table (ticks and crosses) in 
the MultiAssayExperiment vignette that compares the features available in it to 
those available in SummarizedExperiment, to allow quicker decision making.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] ShortRead readFasta UniProt Incorrect Import

2017-10-17 Thread Dario Strbenac
Good day,

If I have a FASTA file that contains

>sp|Q9NYW0|T2R10_HUMAN Taste receptor type 2 member 10 OS=Homo sapiens GN=TAS2R10 PE=1 SV=3
MLRVVEGIFIFVVVSESVFGVLGNGFIGLVNCIDCAKNKLSTIGFILTGLAISRIFLIWI
IITDGFIQIFSPNIYASGNLIEYISYFWVIGNQSSMWFATSLSIFYFLKIANFSNYIFLW
LKSRTNMVLPFMIVFLLISSLLNFAYIAKILNDYKTKNDTVWDLNMYKSEYFIKQILLNL
GVIFFFTLSLITCIFLIISLWRHNRQMQSNVTGLRDSNTEAHVKAMKVLISFIILFILYF
IGMAIEISCFTVRENKLLLMFGMTTTAIYPWGHSFILILGNSKLKQASLRVLQQLKCCEK
RKNLRVT

readFasta fails to import it with the warning

proteins <- readFasta('.', "test.fasta")

Warning message:
In .Call2("fasta_index", filexp_list, nrec, skip, seek.first.rec,  :
  reading FASTA file test.fasta: ignored 129 invalid one-letter sequence codes

Also, the amino acid sequence is incomplete. There are 307 amino acids, but 

> width(proteins)
[1] 178

It's undesirable for users that some amino acids are discarded. Hopefully, they 
notice the warning message before proceeding with the analysis.

Admittedly, readFasta is in ShortRead, so is designed to work with 
high-throughput sequencing reads. But, perhaps it would be better suited to an 
infrastructure package such as Biobase and generalised to correctly import any 
FASTA file. There's even a Bioconductor workflow at 
https://www.bioconductor.org/help/workflows/sequencing/ which has a section 
titled "DNA/amino acid sequence from FASTA files" and demonstrates the use of 
readFasta.

I used version 1.34.2 of ShortRead which is the newest one.

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] Why should Bioconductor developers re-use core classes?

2017-10-17 Thread Dario Strbenac
Good day,

I developed ClassifyR, which is a classification framework, based on 
ExpressionSet. Now that we're getting enquiries about inputting multiple 
datasets derived from the same patients, we plan to completely refactor the 
software to use MultiAssayExperiment as a foundation class.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] EXTERNAL: Fetching Upstream Permission Denied

2017-10-11 Thread Dario Strbenac
Good day,

Thanks for your help. In the end, export GIT_SSH_COMMAND='ssh -i 
~/SSHkeys/digiOcean' did the trick. The write access is showing.

 R W  packages/ClassifyR

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Fetching Upstream Permission Denied

2017-10-11 Thread Dario Strbenac
Good day,

I submitted my public key a couple of months ago and am now trying to do 
some maintenance.

The code I used is:

git clone https://github.com/DarioS/ClassifyR.git
cd ClassifyR
git remote add upstream g...@git.bioconductor.org:packages/ClassifyR.git
git config core.sshCommand "ssh -i ~/SSHkeys/digiOcean"
git checkout master
git fetch upstream

but I get an error.

Permission denied (publickey).
fatal: Could not read from remote repository.

The key has the appropriate permissions.

$ ls -l ~/SSHkeys/digiOcean
-rw------- 1 dario dario 1675 Aug  5  2015 /home/dario/SSHkeys/digiOcean

Copying the private key to ~/.ssh/ does not help. How can I do it?

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] MultiAssayExperiment Subsetting Fails if Column Data Has One Column

2017-09-12 Thread Dario Strbenac
Good day,

Subsetting a MultiAssayExperiment object fails if the column data has one 
column but not 2 or more columns. Perhaps drop = FALSE is missing for the 
DataFrame subsetting. A minimal example is:

rowColNames <- list(paste0("Gene", 1:10), paste0("Person", 1:10))
aTable <- matrix(rnorm(100), ncol = 10, dimnames = rowColNames)
classes <- data.frame(row.names = paste0("Person", 1:10),
  class = rep(c("Non-Responder", "Recovery"), each = 5))
measurementsSet <- MultiAssayExperiment(list(RNA = aTable), classes)
measurementsSet[1, 1, ]
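
The suspected mechanism can be sketched with a plain base R data.frame. This only illustrates how row subsetting of a one-column table collapses to a vector unless drop = FALSE is given; whether the same thing happens inside MultiAssayExperiment's subsetting code is the guess stated above, not something confirmed.

```r
# Same one-column sample table as in the example above.
classes <- data.frame(row.names = paste0("Person", 1:10),
                      class = rep(c("Non-Responder", "Recovery"), each = 5))

dropped <- classes[1:3, ]                # one column: result collapses to a vector
kept <- classes[1:3, , drop = FALSE]     # stays a data.frame and keeps row names

print(class(dropped))
print(dim(kept))
print(rownames(kept))
```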

other attached packages:
[1] S4Vectors_0.15.7            BiocGenerics_0.23.1         MultiAssayExperiment_1.3.34

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] ExperimentList Constructor Failing

2017-09-12 Thread Dario Strbenac
Good day,

Whatever the problem is, it's gone with R Under Development and all packages 
installed from the development branch of Bioconductor.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] ExperimentList Constructor Failing

2017-09-12 Thread Dario Strbenac
Good day,

Although the package seems to build without errors, I can't run the basic 
examples of MultiAssayExperiment successfully.

library(MultiAssayExperiment)
> example("ExperimentList")

ExprmL> ## Create an empty ExperimentList instance
ExprmL> ExperimentList()
Error in checkSlotAssignment(object, name, value) : 
  assignment of an object of class “NULL” is not valid for slot 
‘elementMetadata’ in an object of class “ExperimentList”; is(value, 
"DataTableORNULL") is not TRUE

Everything seems fine with the package check:

> BiocInstaller::biocValid()

* sessionInfo()

R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C               LC_TIME=en_AU.UTF-8
 [4] LC_COLLATE=en_AU.UTF-8     LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                  LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] MultiAssayExperiment_1.2.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12               BiocInstaller_1.26.1       compiler_3.4.1
 [4] GenomeInfoDb_1.12.2        plyr_1.8.4                 XVector_0.16.0
 [7] bitops_1.0-6               tools_3.4.1                zlibbioc_1.22.0
[10] digest_0.6.12              tibble_1.3.4               gtable_0.2.0
[13] lattice_0.20-35            rlang_0.1.2                Matrix_1.2-11
[16] DelayedArray_0.2.7         shiny_1.0.5                parallel_3.4.1
[19] GenomeInfoDbData_0.99.0    gridExtra_2.3              stringr_1.2.0
[22] UpSetR_1.3.3               S4Vectors_0.14.4           IRanges_2.10.3
[25] stats4_3.4.1               grid_3.4.1                 shinydashboard_0.6.1
[28] glue_1.1.1                 Biobase_2.36.2             R6_2.2.2
[31] purrr_0.2.3                tidyr_0.7.1                magrittr_1.5
[34] reshape2_1.4.2             ggplot2_2.2.1              scales_0.5.0
[37] matrixStats_0.52.2         htmltools_0.3.6            BiocGenerics_0.22.0
[40] GenomicRanges_1.28.5       SummarizedExperiment_1.6.3 mime_0.5
[43] xtable_1.8-2               colorspace_1.3-2           httpuv_1.3.5
[46] stringi_1.1.5              RCurl_1.95-4.8             lazyeval_0.2.0
[49] munsell_0.4.3

* Out-of-date packages
      Package LibPath                         Installed Built   ReposVer Repository
rJava "rJava" "/usr/local/lib/R/site-library" "0.9-8"   "3.2.3" "0.9-8"  "https://cran.rstudio.com/src/contrib"

update with biocLite()

Error: 1 package(s) out of date

The same example works on another computer using the Windows operating system. 
What's the issue with this Linux environment?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] how to verify upstream git changes

2017-09-01 Thread Dario Strbenac
Good day,

I like the idea of a commits log on the Bioconductor website. It was useful 
being able to see at a glance which packages have recently been changing.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] SnpSet Creation Function Prototype Not Valid R Code

2017-08-23 Thread Dario Strbenac
Good day,

It's formatted in monospace font, implying that it's R code, but

new('SnpSet', phenoData = [AnnotatedDataFrame], experimentData = [MIAME], 
annotation = [character], protocolData = [AnnotatedDataFrame], call = [matrix], 
callProbability = [matrix], ...)

is just pseudocode. Also, object creation using new is discouraged. Perhaps 
SnpSet could have a proper constructor, like ExpressionSet does?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] git and public keys

2017-08-22 Thread Dario Strbenac
Good day,

Is the private key in a location other than the default SSH key folders? If so, 
use the ssh-add command to have the SSH agent know about it.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] Generate valid SSH keys for the bioc-git server!

2017-08-21 Thread Dario Strbenac
Good day,

I filled out the form on Thursday, but can't fetch the repository.

$ git fetch upstream
Permission denied (publickey).
fatal: Could not read from remote repository.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Default Coverage Value

2017-07-27 Thread Dario Strbenac
Hello,

The coverage function is still inconvenient to use with a vector of weights to 
convert a GRanges metadata column into a RleList object.

"The coverage method for GRanges could gain a default value argument."
- Michael Lawrence, January 2013.

"Something like coverage(foo, bar, ..., NA.value=-1)?"
- Tim Triche, Jr., January 2013.

Might this plan be restored (with a default value of 0 for backwards 
compatibility)?

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



[Bioc-devel] SplicingGraphs Feature Suggestion

2017-06-22 Thread Dario Strbenac
Hello,

It would be convenient if the colour or the width of the edges could be 
customised to represent whether an edge is equally present in two experimental 
conditions or the degree to which it is enriched in one of them of an RNA-seq 
dataset.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



[Bioc-devel] pairwiseAlignment Improvements

2017-04-28 Thread Dario Strbenac
Good day,

The location of indels can be retrieved from a PairwiseAlignmentsSingleSubject 
object by using indel. Determining any difference between the two sequences, 
including substitutions, is neither quick nor easy. I suppose that summary displays 
details of the mismatches, but the variable is of class 
PairwiseAlignmentsSingleSubjectSummary which has no documented accessors. So, 
the code to access the information looks bad.

summaryAlign@mismatchSummary[["subject"]]

  SubjectPosition Subject Pattern Count Probability
1               2       T       A     1           1
2               3       T       A     1           1

This could be improved with accessors for end users.

Also, instead of being a data.frame, this would be better stored as IRanges 
with associated metadata columns, accessible with mcols, so that methods like 
reduce could easily be used to look for contiguous blocks of differences.
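
To illustrate what finding contiguous blocks means here, a base R stand-in for what reduce would do on ranges is sketched below. The positions are made up, and the blockId and blocks names are only for illustration; with real IRanges objects, reduce would replace all of this.

```r
# Made-up mismatch positions along the subject sequence.
positions <- c(2L, 3L, 7L, 10L, 11L, 12L)

# A new block starts wherever the gap to the previous position is not 1.
blockId <- cumsum(c(TRUE, diff(positions) != 1L))

# Collapse each run of adjacent positions into one start/end pair.
blocks <- data.frame(
  start = as.vector(tapply(positions, blockId, min)),
  end   = as.vector(tapply(positions, blockId, max))
)
blocks$width <- blocks$end - blocks$start + 1L
print(blocks)
```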

Is there a reason why the show method for the summary only shows mismatches, 
even if there are indels contained in it? This seems arbitrary and also 
misleading, because it always gives a false impression that there are no indels.

Could the return data types consistently be made to be IRanges? Sometimes it's 
IntegerList, sometimes it's IRanges. For example,

> A
  11-letter "DNAString" instance
seq: GAACGAGGACC
> B
  8-letter "DNAString" instance
seq: GGACGAGC
> alignment <- pairwiseAlignment(A, B, gapOpening = 0, gapExtension = 1, substitutionMatrix = substitutions)
> alignment@subject@mismatch
IntegerList of length 1
[[1]] 2
> alignment@subject@indel
IRangesList of length 1
[[1]]
IRanges of length 1
    start end width
[1]     8   9     2

Lastly, why are functions like insertion, deletion, and indel documented in 
Numeric Summary Methods? Unlike nchar and score, they are not numerical 
summaries of the data.

It'd be nice to see this part of Biostrings thoroughly refactored with more 
focus on UX.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] ShortReadQ Serialisation Slow and Creates Large File

2017-04-26 Thread Dario Strbenac
Good day,

Accidentally using save instead of writeFastq led me to notice how large a 
ShortReadQ object on disk is. A small set of reads 

> RNAreads
class: ShortReadQ
length: 42680 reads; width: 50..100 cycles

was saved two ways. As a text file, they take 11 MB uncompressed. But, when 
saved in binary format, the size on disk is 2.0 GB. Is a lot of unnecessary 
detail saved when the object is serialised?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



[Bioc-devel] GRangesList Conversion Fails For Unstranded Sequencing Data

2017-03-14 Thread Dario Strbenac
Good day,

countOverlaps doesn't work for a GAlignmentPairs object with strandMode set to 
0. This is because of an oversight in the grglist function. It has an if 
statement that checks whether the strand mode is 1 or 2. Then, it tries to 
subset the variable 'x_unlisted'. However, if the strand mode is 0, neither of 
the conditional sections of code is executed, so the 'x_unlisted' variable is 
never created and the following error occurs:

Error in .local(x, use.names, use.mcols, ...) : object 'x_unlisted' not found

It's a surprise no one else has encountered this bug before.
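
A minimal sketch of the control flow described above, using hypothetical functions (chooseMate and chooseMateGuarded) and placeholder return values. It is not the GenomicAlignments source, only the same shape of bug and one possible guard.

```r
# Hypothetical stand-in: the variable is only assigned for modes 1 and 2.
chooseMate <- function(strandMode) {
  if (strandMode == 1) {
    x_unlisted <- "strand of first mate"
  } else if (strandMode == 2) {
    x_unlisted <- "strand of last mate"
  }
  x_unlisted  # never assigned when strandMode is 0
}

# One possible guard: handle mode 0 explicitly and reject anything else.
chooseMateGuarded <- function(strandMode) {
  switch(as.character(strandMode),
         "0" = "unstranded",
         "1" = "strand of first mate",
         "2" = "strand of last mate",
         stop("strandMode must be 0, 1 or 2"))
}

failure <- tryCatch(chooseMate(0), error = function(e) conditionMessage(e))
print(failure)
print(chooseMateGuarded(0))
```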

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] edgeR and limma Default Offsets

2017-02-16 Thread Dario Strbenac
Good day,

Now I notice the differences in how the prior counts are applied. In edgeR's 
cpm:

prior.count.scaled <- lib.size/mean(lib.size) * prior.count
lib.size <- lib.size + 2 * prior.count.scaled
......
t(x) + prior.count.scaled

but in limma's voom:

t(counts + 0.5)/(lib.size + 1)

Basically, the values added to the counts and the library size ignore the 
library size of each sample in the voom function.
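
A small worked comparison of the two formulas quoted above, on made-up counts for two samples with very different library sizes. It is a sketch only: the edgeRStyle and voomStyle names are illustrative, the 0.25 prior count is the edgeR default mentioned earlier in this thread, and the lines elided by the dots in cpm's code are assumed to do nothing beyond dividing by the adjusted library size and scaling to a million.

```r
# Toy counts: 3 genes x 2 samples (made-up values).
counts <- matrix(c(10, 0, 200, 40, 5, 900), nrow = 3,
                 dimnames = list(paste0("Gene", 1:3), c("Small", "Large")))
lib.size <- colSums(counts)
prior.count <- 0.25

# edgeR-style: the added value is scaled by each sample's relative library size.
prior.count.scaled <- lib.size / mean(lib.size) * prior.count
edgeRStyle <- log2(t((t(counts) + prior.count.scaled) /
                     (lib.size + 2 * prior.count.scaled)) * 1e6)

# voom-style: every sample gets the same 0.5 and 1, whatever its library size.
voomStyle <- log2(t((t(counts) + 0.5) / (lib.size + 1)) * 1e6)

print(prior.count.scaled)
print(round(edgeRStyle - voomStyle, 3))
```

The scaled prior counts average to prior.count across samples, so the smaller library receives a smaller added value than the larger one, whereas the voom-style offset is identical for both.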

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] edgeR and limma Default Offsets

2017-02-15 Thread Dario Strbenac
Good day,

The cpm function in edgeR uses a default offset of 0.25 and voom in limma uses 
0.5 (and provides no user modification) to calculate the base 2 logarithm of 
the counts per million. Might these be made consistent?

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



[Bioc-devel] Alternative Hypothesis Specification For edgeR

2017-01-15 Thread Dario Strbenac
Good day,

In a future release, could the user be allowed to specify an alternative 
hypothesis such as the coefficient being positive? DESeq2 provides an 
altHypothesis parameter for such a purpose.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] readGAlignments Lacks strandMode

2017-01-14 Thread Dario Strbenac
Good day,

Now I know about invertStrand, I agree that it's best to keep the strandMode 
only for paired-end data. Indeed, it's an example at the end of the lengthy 
documentation of GAlignments.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia
___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


[Bioc-devel] GAlignments Sorting Causes C Stack Error

2017-01-08 Thread Dario Strbenac
Good day,

When sort is used on a GAlignments object, a stack error is shown, no matter 
how small the object is.

> testAlignments
GAlignments object with 3 alignments and 0 metadata columns:
                                          seqnames strand       cigar    qwidth     start       end     width     njunc
                                             <Rle>  <Rle> <character> <integer> <integer> <integer> <integer> <integer>
  700666F:126:C8768ANXX:3:2204:3175:99484    chr14      +      71S27M        98  18386040  18386066        27         0
  700666F:126:C8768ANXX:1:1107:8115:31928    chr14      +      40S60M       100  18915005  18915064        60         0
  700666F:126:C8768ANXX:1:2206:7564:34686    chr14      +      40S50M        90  18915005  18915054        50         0
  ---
  seqinfo: 23 sequences from an unspecified genome

> sort(testAlignments)
Error: C stack usage  7970544 is too close to the limit

I use up-to-date packages.

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8
 [6] LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C           LC_TELEPHONE=C
[11] LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] GenomicAlignments_1.10.0   Rsamtools_1.26.1           Biostrings_2.42.1          XVector_0.14.0
 [5] SummarizedExperiment_1.4.0 Biobase_2.34.0             GenomicRanges_1.26.2       GenomeInfoDb_1.10.1
 [9] IRanges_2.8.1              S4Vectors_0.12.1           BiocGenerics_0.20.0

loaded via a namespace (and not attached):
[1] lattice_0.20-34    bitops_1.0-6       grid_3.3.2         zlibbioc_1.20.0    Matrix_1.2-7.1     BiocParallel_1.8.1
[7] tools_3.3.2

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



[Bioc-devel] readGAlignments Lacks strandMode

2017-01-05 Thread Dario Strbenac
Good day,

readGAlignmentPairs has strandMode but readGAlignments doesn't, which means 
that single-end strand-specific RNA-seq that generates sequences on the 
opposite strand to the gene needs a subsequent ifelse statement. The API could 
be more consistent by providing a strandMode option for readGAlignments and 
other similar functions in GenomicAlignments.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] readGAlignmentPairs Fails if Used Inside mclapply Loop

2016-12-12 Thread Dario Strbenac
Hello,

I fixed the suggested test case, since the command didn't specify the 
connection and produced an error. It appears to work without a problem. 

> bamFile <- mappedReadsGenome[2]
> length(serialize(readGAlignmentPairs(bamFile, strandMode=2), NULL))
[1] 1329295005

I also tried mc.cores = 2 and it also resulted in an error. Each of the files 
has 30 to 40 million mappings, so I wouldn't expect them to be too big. I'll 
stick to bplapply.

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] readGAlignmentPairs Fails if Used Inside mclapply Loop

2016-12-12 Thread Dario Strbenac
Good day,

I found that readGAlignmentPairs fails when used inside an mclapply loop but 
not an sapply loop. I haven't had such problems with other functions when using 
mclapply.

> class(mappedToGenomeFiles)
[1] "character"
> length(mappedToGenomeFiles)
[1] 13

> mappedReadsGenome <- sapply(mappedToGenomeFiles, function(bamFile)
  {
  readGAlignmentPairs(bamFile, strandMode = 2)
  })
# No error. Each item is of GAlignmentPairs class.

But, with mclapply:

> mappedReadsGenome <- mclapply(mappedToGenomeFiles, function(bamFile)
  {
  readGAlignmentPairs(bamFile, strandMode = 2)
  }, mc.cores = 7)
Warning message:
In mclapply(mappedToGenomeFiles, function(bamFile) { :
  scheduled cores 6, 5, 3, 1, 4, 2 encountered errors in user code, all values 
of the jobs will be affected
> mappedReadsGenome
[[1]]
[1] "fatal error in wrapper code"
attr(,"class")
[1] "try-error"
[[2]]
[1] "fatal error in wrapper code"
attr(,"class")
[1] "try-error"
   .
   .
   .   
[[7]]
GAlignmentPairs object with 41860576 pairs, strandMode=2, and 0 metadata columns:
             seqnames strand   :                 ranges  --                 ranges
                <Rle>  <Rle>   :              <IRanges>  --              <IRanges>
         [1]    chr14      +   :   [19010525, 19010623]  --   [19010414, 19010513]
         [2]    chr14      +   :   [19010543, 19010612]  --   [19010505, 19010604]
         [3]    chr14      +   :   [19010608, 19010707]  --   [19010577, 19010676]
         [4]    chr14      +   :   [19011187, 19011286]  --   [19011142, 19011241]
         [5]    chr14      +   :   [19011318, 19011415]  --   [19011187, 19011286]
         ...      ...    ... ...                    ... ...                    ...
  [41860572]     chr4      +   : [190972787, 190972886]  -- [190972685, 190972784]
  [41860573]     chr4      -   : [190974302, 190974385]  -- [190974302, 190974385]
  [41860574]     chr4      -   : [190978480, 190978579]  -- [190978542, 190978641]
  [41860575]     chr4      -   : [190982116, 190982215]  -- [190982125, 190982224]
  [41860576]     chr4      +   : [191031678, 191031776]  -- [191031630, 191031729]
  ---
  seqinfo: 25 sequences from an unspecified genome
   .
   .
   .
[[13]]
[1] "fatal error in wrapper code"
attr(,"class")
[1] "try-error"

Interestingly, reading in from one of the thirteen file paths worked.

In contrast, a simple test case of the same length works:

X=1:13
mclapply(X, function(x) x + 1, mc.cores = 7) # Prints 2:14.

The BAM file import also works with bplapply and BPPARAM = 
MulticoreParam(workers = 7)

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C           LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] GenomicAlignments_1.10.0   SummarizedExperiment_1.4.0 GenomicFeatures_1.26.0     AnnotationDbi_1.36.0       Biobase_2.34.0
 [6] Rsamtools_1.26.1           Biostrings_2.42.0          XVector_0.14.0             GenomicRanges_1.26.1       GenomeInfoDb_1.10.1
[11] IRanges_2.8.1              S4Vectors_0.12.0           BiocGenerics_0.20.0

loaded via a namespace (and not attached):
 [1] zlibbioc_1.20.0    BiocParallel_1.8.1 lattice_0.20-34    tools_3.3.2        grid_3.3.2         DBI_0.5-1          Matrix_1.2-7.1
 [8] rtracklayer_1.34.1 bitops_1.0-6       RCurl_1.95-4.8     biomaRt_2.30.0     RSQLite_1.0.0      XML_3.98-1.5

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Implementation of vmatchPattern Indels

2016-12-04 Thread Dario Strbenac
Good day,

I notice that with.indels has been a parameter to vmatchPattern for almost a 
decade but is only a stub. I am hoping that this suggestion could put it into 
future development plans so the underlying functionality could be implemented 
soon. It could be a useful option for preprocessing of CRISPR genomic screens 
without leaving the R analysis environment, which is a new use case that did 
not exist before.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] vmatchPattern Returns Out of Bounds Indices

2016-11-18 Thread Dario Strbenac
Good day,

> These questions really belong to the support site.

I suppose, although it seemed like an unexpected issue at first because it's 
not documented within ?lowlevel-matching so users don't know what to expect.

> You'll get that behaviour by allowing indels.

This reveals a discrepancy between the documentation and the way the function 
operates. In the documentation, the function definition of vmatchPattern has 
with.indels = FALSE in it. However, changing it to TRUE results in

Error in .XStringSet.vmatchPattern(pattern, subject, max.mismatch, 
min.mismatch,  : 
  vmatchPattern() does not support indels yet

This is utilising Biostrings 2.42.0 in R 3.3.1.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] vmatchPattern Returns Out of Bounds Indices

2016-11-16 Thread Dario Strbenac
Hello,

If using vmatchPattern to find a sequence in another sequence, the resulting 
end index can be beyond the width of the corresponding sequence in the subject 
XStringSet. For example:

forwardPrimer <- "TCTTGTGGAAAGGACGAAACACCG"
> range(width(reads))
[1] 75 75
primerEnds <- vmatchPattern(forwardPrimer, reads, max.mismatch = 1)
> range(unlist(endIndex(primerEnds)))
[1] 23 76

This causes problems if using extractAt to obtain the sequences within each 
read. For example:

sequences = extractAt(reads, locations)
Error in .normarg_at2(at, x) : 
  some ranges in 'at' are off-limits with respect to their corresponding 
sequence
  in 'x'

It's rare, but still a problem, nonetheless.

> table(unlist(endIndex(primerEnds)) > 75)

 FALSE   TRUE 
366225      2 

This happens with Biostrings 2.42.0.
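
One possible workaround, sketched in base R under stated assumptions, is to drop matches that overhang either end of a read before extracting. The integer vectors below are hypothetical stand-ins for the per-read output of startIndex() and endIndex(); they are not taken from the data above. Note that the primer is 24 letters long, so the reported minimum end index of 23 would correspond to a start of 23 - 24 + 1 = 0, which suggests that matches may overhang the left end as well.

```r
# Hypothetical stand-ins for width(reads), startIndex() and endIndex().
readWidths  <- c(75L, 75L, 75L)
matchStarts <- list(1L, 53L, integer(0))
matchEnds   <- list(24L, 76L, integer(0))

# A match is in bounds when it starts at or after position 1 and ends at or
# before the last position of its own read.
inBounds <- Map(function(s, e, w) s >= 1L & e <= w,
                matchStarts, matchEnds, readWidths)

# Keep only the in-bounds matches of each read.
keptStarts <- Map(function(s, keep) s[keep], matchStarts, inBounds)
keptEnds   <- Map(function(e, keep) e[keep], matchEnds, inBounds)
```

The second read loses its match because the end index of 76 exceeds the read width of 75, and the third read had no match to begin with.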

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



[Bioc-devel] Feasibility of Parallel Extraction of Matches with extractAllMatches

2016-11-16 Thread Dario Strbenac
Good day,

I'd like to request that extractAllMatches works when subject is an XStringSet. 
The function could check that subject and mindex have the same length and then 
process them in parallel. Currently, the following example isn't immediately 
possible.

words <- BStringSet(c("xxGOATzz", "xxMOATzz", "xxNOTEzz"))
matches <- vmatchPattern("GOAT", words, max.mismatch = 1)
similarWords <- extractAllMatches(words, matches) # Not possible.

Could that be implemented for the next release of Biostrings? Or, perhaps it 
can be deprecated since it duplicates the functionality of substr?

> substr(words, start(matches), end(matches))
[1] "GOAT" "MOAT" NA 
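
For comparison, the parallel extraction requested above can be sketched in base R with substring(), which is vectorised over its start and stop arguments. The start and end vectors are hypothetical stand-ins for one match per word, with NA marking a word that has no match. This illustrates the requested behaviour only and is not how Biostrings implements it.

```r
words <- c("xxGOATzz", "xxMOATzz", "xxNOTEzz")

# Hypothetical match positions, one per word; NA means "no match".
matchStart <- c(3L, 3L, NA)
matchEnd   <- c(6L, 6L, NA)

# Element i of the result is words[i] cut at matchStart[i]..matchEnd[i].
similarWords <- substring(words, matchStart, matchEnd)
```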

Also, the expected subsetting fails for MIndex objects.

> class(matches)
[1] "ByPos_MIndex"
> length(matches)
[1] 3
> length(matches[1])
[1] 3

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Subversion Log Stalled

2016-11-10 Thread Dario Strbenac
Good day,

The log at http://bioconductor.org/developers/svnlog/ stopped updating two 
weeks ago.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] Bioc-devel Digest, Vol 152, Issue 10

2016-11-06 Thread Dario Strbenac
Good day,

The problem is simply caused by an incorrectly typed file path. The error 
message isn't clear about this and describes some temporary file name (based on 
today's date and time) which is confusing. Perhaps the importFusionData 
function could be made more robust by checking for the file's existence at the 
beginning of the function. For example,

if(file.exists(filename)) {
  # Do fusion file import.
} else {
  stop("Could not find the specified fusion file.")
}
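
A runnable sketch of the same idea, assuming a hypothetical wrapper named importWithCheck and using readLines() as a stand-in for the real import routine (neither is part of chimera), could also report the offending path in the error message:

```r
# Hypothetical wrapper: 'importer' stands in for the real import routine.
importWithCheck <- function(filename, importer) {
  if (!file.exists(filename)) {
    stop("Could not find the specified fusion file: ", filename)
  }
  importer(filename)
}

# Usage with a temporary file and readLines() as a stand-in importer.
junctionFile <- tempfile(fileext = ".junction")
writeLines("chr1\t100\t+", junctionFile)
imported <- importWithCheck(junctionFile, readLines)

# A mistyped path now fails early, with a message that names the path.
missingMessage <- tryCatch(
  importWithCheck(file.path(tempdir(), "no_such_file.junction"), readLines),
  error = function(e) conditionMessage(e)
)
```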

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] chimera Attempts to Open Non-existent File

2016-11-06 Thread Dario Strbenac
Good day,

The examples section of importFusionData is almost entirely commented out, so 
it's unclear whether it works. Since a lot of the package code is never run by 
R CMD check and the test coverage is 0%, it's plausibly a package development 
issue.

-
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] chimera Attempts to Open Non-existent File

2016-11-02 Thread Dario Strbenac
Good day,

The importFusionData function fails trying to open a file that doesn't exist.

> fd = importFusionData("star", 
+     "/verona/nobackup/biostat/datasets/melanoma/AAHChimeric.out.junction")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'Thu_Nov_3_13-46-27_2016': No such file or directory

The section of code where the error occurs seems to be in the .starImport 
function.

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



[Bioc-devel] FlipFlop Ignores Read Strand and Requires Antiquated File Formats

2016-11-01 Thread Dario Strbenac
Hello,

The FlipFlop package is made for isoform quantification. Why are there no 
options to specify the strand of the RNA-seq reads? Without them, the method 
produces incorrect counts wherever overlapping genes on both strands are being 
transcribed. Also, the software requires a SAM file as input. This is 
inefficient, since most mapping results are stored as BAM files. It would be 
better if FlipFlop made more use of the import and export functions available 
in Rsamtools. Requiring the gene database to be in BED12 format also creates 
unnecessary work for the user. ENSEMBL and GENCODE both provide GTF and GFF3 
files, which can easily be imported into R with functions provided by 
rtracklayer.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



[Bioc-devel] ignoreSelf Option for findOverlaps of GenomicRanges Query

2016-10-26 Thread Dario Strbenac
Good day,

For an IRanges object, findOverlaps has ignoreSelf and ignoreRedundant options. 
However, these aren't available for a GenomicRanges input object, even though 
the subject parameter is optional and a query GRanges object can be overlapped 
with itself. Could this be changed to be consistent?
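
Until such options exist, the two filters can be applied to the hits afterwards. The base R sketch below assumes the hits have been turned into a plain data frame with queryHits and subjectHits columns (the values are made up for illustration): ignoring self hits removes pairs where a range overlaps itself, and ignoring redundant hits keeps only one of each pair (i, j) and (j, i).

```r
# Hypothetical hits from overlapping a set of three ranges with itself.
hits <- data.frame(queryHits   = c(1L, 1L, 2L, 2L, 3L),
                   subjectHits = c(1L, 2L, 1L, 2L, 3L))

# Equivalent of ignoring self hits: drop pairs where query == subject.
noSelf <- hits[hits$queryHits != hits$subjectHits, ]

# Equivalent of also ignoring redundant hits: of (1, 2) and (2, 1),
# keep only the pair with the smaller index first.
nonRedundant <- noSelf[noSelf$queryHits < noSelf$subjectHits, ]
```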

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] NEWS files

2016-10-12 Thread Dario Strbenac
Good day,

I see it, too. There's no problem.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


Re: [Bioc-devel] NEWS files

2016-10-12 Thread Dario Strbenac
Good day,

ClassifyR has a NEWS file, but I don't see any link to it on ClassifyR's 
webpage. There are no warnings or errors during the checking process. What is 
causing it to be missed? I also tried news(package = "ClassifyR") and it 
renders well, although the R logo is gigantic.

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] VCF Intersection Using readVcf Remarkably Slow

2016-09-27 Thread Dario Strbenac
Good day,

file <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")
anotherFile <- system.file("extdata", "hapmap_exome_chr22.vcf.gz", package = 
"VariantAnnotation")
aSet <- readVcf(file, "hg19")
system.time(commonMutations <- readVcf(anotherFile, "hg19", rowRanges(aSet)))
   user  system elapsed 
209.120  16.628 226.083 

Reading in the Exome chromosome 22 VCF and intersecting it with the other file 
in the data directory takes almost 4 minutes.

However, reading in the whole file is much faster.

> system.time(anotherSet <- readVcf(anotherFile, "hg19"))
   user  system elapsed 
  0.376   0.016   0.392 

Doing the intersection manually then takes a fraction of a second.

> system.time(fastCommonMutations <- intersect(rowRanges(aSet), 
+     rowRanges(anotherSet)))
   user  system elapsed 
  0.128   0.000   0.129

This comparison ignores the finer details such as the identities of the 
alleles, but does it have to be so slow? My real use case is intersecting 
dozens of VCF files of cancer samples with the ExAC consortium's VCF file, 
which is 4 GB in size when compressed. I can't imagine how long that would 
take.

Can the code of readVcf be optimised?
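
To illustrate the caveat about allele identities, here is a minimal base R sketch with made-up variants in plain data frames (hypothetical column names chrom, pos, ref and alt; no VariantAnnotation objects involved). Matching on position alone and matching on position plus alleles can give different answers:

```r
# Hypothetical variants from a sample and from a reference panel.
sampleVariants <- data.frame(chrom = c("22", "22", "22"),
                             pos   = c(100L, 200L, 300L),
                             ref   = c("A", "C", "G"),
                             alt   = c("T", "G", "A"),
                             stringsAsFactors = FALSE)
referenceVariants <- data.frame(chrom = c("22", "22"),
                                pos   = c(200L, 300L),
                                ref   = c("C", "G"),
                                alt   = c("G", "T"),
                                stringsAsFactors = FALSE)

# Position-only intersection, comparable to intersecting ranges.
samePosition <- merge(sampleVariants, referenceVariants,
                      by = c("chrom", "pos"))

# Intersection that also requires identical reference and alternate alleles.
sameAllele <- merge(sampleVariants, referenceVariants,
                    by = c("chrom", "pos", "ref", "alt"))
```

Here two variants share a position, but only the one at position 200 also shares its alleles.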

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Warning when Reading Example VCF

2016-09-27 Thread Dario Strbenac
Good day,

When importing a VCF file from VariantAnnotation's data directory into R, a 
warning is emitted.

library(VariantAnnotation)
aFile <- system.file("extdata", "hapmap_exome_chr22.vcf.gz", package = 
"VariantAnnotation")
aSet <- readVcf(aFile, "hg19")

Warning message:
In .bcfHeaderAsSimpleList(header) :
  duplicate keys in header will be forced to unique rownames

Is there a problem with the format of one of the VCF files distributed with 
VariantAnnotation? I wouldn't expect any package data files to emit warnings 
to the end user.

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 15.10
VariantAnnotation 1.18.7

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] Why sleuth is not in Bioconductor?

2016-09-14 Thread Dario Strbenac
Hello,

That contradicts the instructions on the sleuth Download page. It contains the 
R commands

biocLite("rhdf5")
and
devtools::install_github("pachterlab/sleuth")

You may have read the instructions too quickly and mixed the arguments up.

Bioconductor requires lots of function documentation with runnable examples and 
a vignette. Sleuth isn't currently at the R package quality level necessary for 
Bioconductor.

------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] String Matching in Parallel

2016-09-04 Thread Dario Strbenac
Hello,

Functions such as vmatchPattern and vmatchPDict naturally lend themselves to 
being parallelised. Could they be enhanced to accept a BiocParallelParam 
object? Or is there no significant performance difference between that and 
using them as-is inside a bplapply loop that repeatedly calls DNAString or 
DNAStringSet? (It's odd that vmatchPattern, when searching BSgenome objects, 
requires a DNAString for the pattern rather than a DNAStringSet.)
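
The chunk-then-apply pattern in question can be sketched in base R as follows. This is only an illustration under stated assumptions: the reads are made-up character strings, regexpr() with fixed = TRUE stands in for the Biostrings matching function, and lapply() stands in for bplapply(), which could be swapped in to process the chunks on separate workers.

```r
# Hypothetical reads and pattern as plain character strings.
reads <- c("ACGTTGCA", "TTGACGTA", "GGGGCCCC", "ACGTACGT", "CATGCATG")
pattern <- "ACGT"

# Split the reads into contiguous chunks, one per worker.
nChunks <- 2L
chunkIds <- cut(seq_along(reads), breaks = nChunks, labels = FALSE)
chunks <- split(reads, chunkIds)

# Match within each chunk; lapply() is the serial stand-in for bplapply().
perChunk <- lapply(chunks, function(chunk) {
  as.integer(regexpr(pattern, chunk, fixed = TRUE))
})

# Recombine in the original read order; -1 marks reads without a match.
firstMatchStart <- unlist(perChunk, use.names = FALSE)
```

Whether this beats a single vectorised call depends on how the cost of shipping each chunk to a worker compares with the matching time itself.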

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia



Re: [Bioc-devel] BStringSet Documentation

2016-09-02 Thread Dario Strbenac
Hello,

Actually, I thought that substr unintentionally worked and perhaps they should 
both produce an error message. Thanks for adding the functionality for 
strsplit, though!

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] BStringSet Documentation

2016-09-01 Thread Dario Strbenac
Good day,

According to the documentation, I wouldn't think that substr or strsplit would 
work on a BStringSet, but substr does.

> IDs
  A BStringSet instance of length 5
    width seq
[1]    61 D00626:168:C9CWMANXX:1:1105:1816:1998 1:N:0:TCCGGAGA+ATAGAGGC
[2]    61 D00626:168:C9CWMANXX:1:1105:2113:1989 1:N:0:TCCGGAGA+ATAGAGGC
[3]    61 D00626:168:C9CWMANXX:1:1105:2703:1986 1:N:0:TCCGGAGA+ATAGAGGC
[4]    61 D00626:168:C9CWMANXX:1:1105:3255:1979 1:N:0:TCCGGAGA+ATAGAGGC
[5]    61 D00626:168:C9CWMANXX:1:1105:4525:1995 1:N:0:TCCGGAGA+ATAGAGGC
> substr(IDs, 1, 37)
[1] "D00626:168:C9CWMANXX:1:1105:1816:1998"
[2] "D00626:168:C9CWMANXX:1:1105:2113:1989"
[3] "D00626:168:C9CWMANXX:1:1105:2703:1986"
[4] "D00626:168:C9CWMANXX:1:1105:3255:1979"
[5] "D00626:168:C9CWMANXX:1:1105:4525:1995"
> strsplit(IDs, ' ')
Error in strsplit(IDs, " ") : non-character argument

I think that both of these functions shouldn't work or both should work, to be 
consistent.
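
As an interim workaround, the split can be done on an ordinary character vector. The sketch below uses two of the IDs shown above as plain strings; with a real BStringSet the assumption is that it would first be converted with as.character(IDs).

```r
# Two of the read IDs from above, as an ordinary character vector.
ids <- c("D00626:168:C9CWMANXX:1:1105:1816:1998 1:N:0:TCCGGAGA+ATAGAGGC",
         "D00626:168:C9CWMANXX:1:1105:2113:1989 1:N:0:TCCGGAGA+ATAGAGGC")

# Split each ID at the space, then take the first field of each.
parts <- strsplit(ids, " ", fixed = TRUE)
readNames <- vapply(parts, `[`, character(1), 1)
```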

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


[Bioc-devel] Repitools Development Version Webpage Gone

2016-08-26 Thread Dario Strbenac
Recently, the overview webpage of the development version of Repitools has 
vanished. It is still listed in the build report, though. There are also some 
strange build errors on Linux.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia


