Re: [Bioc-devel] VariantAnnotation: VCF to VRanges with multiple INFO values

2014-12-11 Thread Julian Gehring
Can you backport the fixes to bioc-release which is also affected?

Best
Julian


Valerie Obenchain (12/09/14 03:24):

 Thanks to Michael and Julian for taking care of this. Fixes are in 
 devel, >= 1.13.15.

 Valerie


 On 12/03/14 08:44, Michael Lawrence wrote:
 Looks like an issue when expand()ing the VCF. Maybe Val could take a look?

 On Wed, Dec 3, 2014 at 7:39 AM, Julian Gehring julian.gehr...@embl.de
 wrote:

 Hi,

 The conversion from a 'VCF' to 'VRanges' object fails if an INFO field
 with multiple values for different ALT alleles is present:

 Here is an example VCF entry for which this fails (line 71151250 in
 'ALL.wgs.phase3_shapeit2_mvncall_integrated_v5.20130502.sites.vcf.gz',
 taken from

 ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5.20130502.sites.vcf.gz
 ):

 17  10001541  rs12451372  C  G,T  100  PASS
 AC=700,298;AF=0.139776,0.0595048;AN=5008;NS=2504;DP=17289;EAS_AF=0.2421,0;AMR_AF=0.1801,0.0115;AFR_AF=0.0749,0.2194;EUR_AF=0.0915,0;SAS_AF=0.1431,0;AA=T|||

 The respective code to reproduce this:

library(VariantAnnotation)
## 'path' is the local path to the VCF file referenced above
roi = GRanges("17", IRanges(1e7+1541, width = 1))
vcf = readVcf(path, "GRCh37", ScanVcfParam(which = roi, info = "AF"))
## 'info = character()' and other versions also cause the error

vrc = as(vcf, "VRanges") ## error

 fails with

Error in colSums(ielt) : 'x' must be an array of at least two dimensions

 This occurs both with the latest version of VariantAnnotation in
 bioc-release and bioc-devel.
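
 To make the shape of such a record concrete, here is a minimal base-R
 sketch. It is not the VariantAnnotation code path, and all variable names
 are illustrative. It splits the per-ALT INFO values (AC, AF) into one row
 per ALT allele, while the single-valued AN is recycled, which is roughly
 what expanding the record has to achieve.

```r
# Illustrative only: base R, no VariantAnnotation involved.
alt  <- strsplit("G,T", ",", fixed = TRUE)[[1]]
info <- "AC=700,298;AF=0.139776,0.0595048;AN=5008"

# Split "KEY=v1,v2" pairs into a named list of character vectors.
pairs  <- strsplit(strsplit(info, ";", fixed = TRUE)[[1]], "=", fixed = TRUE)
values <- lapply(pairs, function(p) strsplit(p[2], ",", fixed = TRUE)[[1]])
names(values) <- vapply(pairs, function(p) p[1], character(1))

# One row per ALT allele; the single AN value is recycled across rows.
expanded <- data.frame(
  alt = alt,
  AC  = as.integer(values$AC),
  AF  = as.numeric(values$AF),
  AN  = as.integer(values$AN),
  stringsAsFactors = FALSE
)
print(expanded)
```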

 Best
 Julian

 ___
 Bioc-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/bioc-devel




Re: [Bioc-devel] Experiment Data biocViews updates

2014-12-11 Thread Sonali Arora

Hi Vince,

As an initial cut,
a) I have removed WholeGenomeData from under TechnologyData -
SequencingData
b) added Somatic and Germline under SpecimenSource. (You need to
click the checkbox on top to see these two, as they are childless biocViews.)


Thanks and Regards,
Sonali.

On 12/8/2014 8:44 AM, Vincent Carey wrote:

On Mon, Dec 8, 2014 at 11:22 AM, Sonali Arora sar...@fredhutch.org wrote:


Hi Vince,

On 12/8/2014 7:38 AM, Vincent Carey wrote:


Very nice.  Is WholeGenomeData under SequencingData  sufficiently
clear?


We mirrored the Technology category under Software biocViews for
TechnologyData, thus SequencingData mirrors Sequencing.
What would you suggest instead?


Well, this is something we should get some consensus on if there will be
any changes.  It seems to me that various assays can be regarded as whole
genome data, whereas I think what is intended for that term is whole genome
DNA sequencing, and one might want to distinguish germ line and tumor
sequencing for an initial cut?





We may also want to have a source tissue terminology in the experimental
data space.


We have :
SpecimenSource - Tissue;


Again I don't want to introduce pointless complexity.  But my understanding
is that one can only
use terms that are in the hierarchy, so if there are informative
distinctions missing from the hierarchy,
resources go out with vague labels.

One set of terms for tissues is at

http://www.gtexportal.org/home/

Since we might want to repackage some of that data, perhaps a coarse subset
of those terms would be useful to have in the hierarchy.



SpecimenSource - Proteome;
SpecimenSource - Genome;
SpecimenSource - StemCell;
SpecimenSource - CellCulture;

I believe you want us to expand Tissue more ? Please suggest the best way
to break it up further.

Thanks for the quick response,
Sonali.



On Mon, Dec 8, 2014 at 10:28 AM, Sonali Arora sar...@fredhutch.org
wrote:

  Hi everyone,

We have revised the Experiment Data biocViews and have updated most of
the Experiment Data packages using our word lookup function,
recommendBiocViews().

Package authors are encouraged to check the new biocViews added to their
Experiment Data package and add more relevant ones from the new updated
Experiment Data biocViews tree.

The new Experiment Data biocViews tree can be viewed at:
http://www.bioconductor.org/packages/devel/BiocViews.html#___ExperimentData

Just a friendly reminder - you are allowed to add biocViews only from the
category of biocViews that your package belongs to.
For example, ExperimentData packages can contain biocViews only from the
Experiment Data biocViews category.

We hope that this updated tree will help you find test data more easily
and efficiently for the new software packages that you write.

--
Thanks and Regards,
Sonali




--
Thanks and Regards,
Sonali




___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


[Bioc-devel] biovizBase Installation rbind Problem

2014-12-11 Thread Dario Strbenac
Hello,

I can't install biovizBase, although I have the correct versions of all 
packages in Depends and Imports fields.

> source("http://bioconductor.org/biocLite.R")
Bioconductor version 3.0 (BiocInstaller 1.16.1), ?biocLite for help
> biocLite("biovizBase")
BioC_mirror: http://bioconductor.org
Using Bioconductor version 3.0 (BiocInstaller 1.16.1), R version 3.1.2.
Installing package(s) 'biovizBase'
trying URL 'http://bioconductor.org/packages/3.0/bioc/src/contrib/biovizBase_1.14.0.tar.gz'
Content type 'application/x-gzip' length 2429664 bytes (2.3 Mb)
opened URL
==
downloaded 2.3 Mb

* installing *source* package 'biovizBase' ...
** libs
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG  -fpic  -O2 -pipe -g  -c R_init_biovizBase.c -o R_init_biovizBase.o
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG  -fpic  -O2 -pipe -g  -c bin_offsets.c -o bin_offsets.o
gcc -std=gnu99 -shared -L/usr/lib/R/lib -Wl,-z,relro -o biovizBase.so R_init_biovizBase.o bin_offsets.o -L/usr/lib/R/lib -lR
installing to /dskh/nobackup/biostat/Bioconductor/biovizBase/libs
** R
** data
** inst
** preparing package for lazy loading
Error in rbind(deparse.level, ...) : 
  numbers of columns of arguments do not match
Error : unable to load R code in package 'biovizBase'
ERROR: lazy loading failed for package 'biovizBase'
* removing '/dskh/nobackup/biostat/Bioconductor/biovizBase'
* restoring previous '/dskh/nobackup/biostat/Bioconductor/biovizBase'

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
 [1] splines   grid      stats4    parallel  stats     graphics  grDevices
 [8] utils     datasets  methods   base

other attached packages:
 [1] BiocInstaller_1.16.1     dichromat_2.0-0          RColorBrewer_1.1-2
 [4] VariantAnnotation_1.12.6 Rsamtools_1.18.2         AnnotationDbi_1.28.1
 [7] Biobase_2.26.0           scales_0.2.4             Hmisc_3.14-6
[10] Formula_1.1-2            survival_2.37-7          lattice_0.20-29
[13] Biostrings_2.34.0        XVector_0.6.0            GenomicRanges_1.18.3
[16] GenomeInfoDb_1.2.3       IRanges_2.0.0            S4Vectors_0.4.0
[19] BiocGenerics_0.12.1

loaded via a namespace (and not attached):
 [1] BBmisc_1.8              BSgenome_1.34.0         BatchJobs_1.5
 [4] BiocParallel_1.0.0      DBI_0.3.1               GenomicAlignments_1.2.1
 [7] GenomicFeatures_1.18.2  RCurl_1.95-4.5          RSQLite_1.0.0
[10] Rcpp_0.11.3             XML_3.98-1.1            acepack_1.3-3.3
[13] base64enc_0.1-2         biomaRt_2.22.0          bitops_1.0-6
[16] brew_1.0-6              checkmate_1.5.0         cluster_1.15.3
[19] codetools_0.2-9         colorspace_1.2-4        digest_0.6.6
[22] fail_1.2                foreach_1.4.2           foreign_0.8-61
[25] iterators_1.0.7         latticeExtra_0.6-26     munsell_0.4.2
[28] nnet_7.3-8              plyr_1.8.1              rpart_4.1-8
[31] rtracklayer_1.26.2      sendmailR_1.2-1         stringr_0.6.2
[34] tools_3.1.2             zlibbioc_1.12.0

Are there important and undeclared dependencies for biovizBase?

I also get different errors for a few other packages I'm trying to update:

                 Installed Built ReposVer
AllelicImbalance 1.0.0     3.0.3 1.4.0
GGtools          4.10.0    3.0.3 5.2.0
Gviz             1.6.0     3.0.3 1.10.3
ReportingTools   2.2.0     3.0.3 2.6.0
biomvRCNS        1.2.0     3.0.3 1.6.0
biovizBase       1.10.8    3.0.3 1.14.0
casper           1.4.0     3.0.3 2.0.0
cummeRbund       2.4.1     3.0.3 2.8.2
ggbio            1.10.16   3.0.3 1.14.0
intansv          1.2.0     3.0.3 1.6.0
methyAnalysis    1.4.2     3.0.3 1.8.0
qrqc             1.16.0    3.0.3 1.20.0
spliceR          1.3.1     3.0.3 1.8.0
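
The Built column in the table above records the R version each installed
package was built under. As a purely diagnostic sketch (base R only; the
helper name major_minor is made up here, and whether rebuilding such
packages resolves the errors in this report is an assumption, not something
established in this thread), one could list packages built under an older
R series than the one currently running:

```r
# List installed packages whose "Built" field records an older R
# major.minor series than the running R.
ip <- installed.packages()

# Hypothetical helper: reduce "3.0.3" to "3.0".
major_minor <- function(v) sub("^([0-9]+\\.[0-9]+).*$", "\\1", v)

built   <- package_version(major_minor(ip[, "Built"]))
running <- package_version(major_minor(as.character(getRversion())))

all_pkgs <- data.frame(
  Package   = ip[, "Package"],
  Installed = ip[, "Version"],
  Built     = ip[, "Built"],
  stringsAsFactors = FALSE,
  row.names = NULL
)
stale <- all_pkgs[which(built < running), ]
print(stale)
```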

> biocLite("spliceR")
BioC_mirror: http://bioconductor.org
Using Bioconductor version 3.0 (BiocInstaller 1.16.1), R version 3.1.2.
Installing package(s) 'spliceR'
trying URL 'http://bioconductor.org/packages/3.0/bioc/src/contrib/spliceR_1.8.0.tar.gz'
Content type 'application/x-gzip' length 356213 bytes (347 Kb)
opened URL
==
downloaded 347 Kb

* installing *source* package 'spliceR' ...
** libs
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG  -fpic  -O2 -pipe -g  -c utils.c -o utils.o
gcc -std=gnu99 -shared -L/usr/lib/R/lib -Wl,-z,relro -o spliceR.so utils.o -L/usr/lib/R/lib -lR
installing to /dskh/nobackup/biostat/Bioconductor/spliceR/libs
** R
** inst
** preparing package for lazy loading
Error : object 'renameSeqlevels' is not exported by 'namespace:GenomicRanges'
Error : package 'Gviz' could not be loaded
ERROR: lazy loading failed for package 'spliceR'
* removing '/dskh/nobackup/biostat/Bioconductor/spliceR'
* restoring previous '/dskh/nobackup/biostat/Bioconductor/spliceR'

--
Dario Strbenac
PhD 

Re: [Bioc-devel] biovizBase Installation rbind Problem

2014-12-11 Thread Dan Tenenbaum


- Original Message -
 From: Dario Strbenac dstr7...@uni.sydney.edu.au
 To: bioc-devel@r-project.org
 Sent: Thursday, December 11, 2014 5:00:11 PM
 Subject: [Bioc-devel] biovizBase Installation rbind Problem
 
 Hello,
 
 I can't install biovizBase, although I have the correct versions of
 all packages in Depends and Imports fields.
 
 > source("http://bioconductor.org/biocLite.R")
 Bioconductor version 3.0 (BiocInstaller 1.16.1), ?biocLite for help
 > biocLite("biovizBase")
 BioC_mirror: http://bioconductor.org
 Using Bioconductor version 3.0 (BiocInstaller 1.16.1), R version
 3.1.2.
 Installing package(s) 'biovizBase'
 trying URL
 'http://bioconductor.org/packages/3.0/bioc/src/contrib/biovizBase_1.14.0.tar.gz'
 Content type 'application/x-gzip' length 2429664 bytes (2.3 Mb)
 opened URL
 ==
 downloaded 2.3 Mb
 
 * installing *source* package 'biovizBase' ...
 ** libs
 gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG  -fpic  -O2 -pipe
 -g  -c R_init_biovizBase.c -o R_init_biovizBase.o
 gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG  -fpic  -O2 -pipe
 -g  -c bin_offsets.c -o bin_offsets.o
 gcc -std=gnu99 -shared -L/usr/lib/R/lib -Wl,-z,relro -o biovizBase.so
 R_init_biovizBase.o bin_offsets.o -L/usr/lib/R/lib -lR
 installing to /dskh/nobackup/biostat/Bioconductor/biovizBase/libs
 ** R
 ** data
 ** inst
 ** preparing package for lazy loading
 Error in rbind(deparse.level, ...) :
   numbers of columns of arguments do not match
 Error : unable to load R code in package 'biovizBase'
 ERROR: lazy loading failed for package 'biovizBase'
 * removing '/dskh/nobackup/biostat/Bioconductor/biovizBase'
 * restoring previous '/dskh/nobackup/biostat/Bioconductor/biovizBase'
 

See https://support.bioconductor.org/p/63510/#63529




Re: [Bioc-devel] biovizBase Installation rbind Problem

2014-12-11 Thread Dan Tenenbaum


- Original Message -
 From: Dan Tenenbaum dtene...@fredhutch.org
 To: Dario Strbenac dstr7...@uni.sydney.edu.au
 Cc: bioc-devel@r-project.org
 Sent: Thursday, December 11, 2014 5:10:34 PM
 Subject: Re: [Bioc-devel] biovizBase Installation rbind Problem
 
 
 

Re: [Rd] R CMD check --as-cran and (a)spell checking

2014-12-11 Thread Martin Maechler
 Henrik Bengtsson h...@biostat.ucsf.edu
 on Fri, 5 Dec 2014 18:17:57 -0800 writes:

 Does anyone know if it is possible to add a dictionary
 file of known words that becomes part of the *built*
 package to tell 'R CMD check --as-cran' not to report
 these words as misspelled?  I want this dictionary to come
 with the *.tar.gz such that it will be available
 regardless of where the package is checked.  For instance,
 currently I get:

 * using log directory 'T:/R/_R-3.1.2patched/matrixStats.Rcheck'
 * using R version 3.1.2 Patched (2014-12-03 r67101)
 * using platform: x86_64-w64-mingw32 (64-bit)
 * using session charset: ISO8859-1
 * checking for file 'matrixStats/DESCRIPTION' ... OK
 * this is package 'matrixStats' version '0.12.0'
 * checking CRAN incoming feasibility ... NOTE
 Maintainer: 'Henrik Bengtsson henr...@braju.com'
 Possibly mis-spelled words in DESCRIPTION:
   rowMedians (18:74)
   rowRanks (18:92)
   rowSds (18:111)
 * checking package namespace information ... OK
 ...

I agree that some customization possibility would be great here.
Maybe it'll be sufficient to allow a short list of about a dozen
words in the DESCRIPTION itself, e.g., with an item

Extrawords: rowMedians, rowRanks

Would you feel like providing a patch to the R (devel) sources
for this?
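
As a rough illustration of the idea only: 'Extrawords' is the field name
floated above, not something R CMD check currently understands, and the
list of flagged words is simply copied from the NOTE quoted earlier. With
those assumptions, a sketch of how such a field could be read and used to
filter reported words might look like this:

```r
# Hypothetical: 'Extrawords' is a proposed field, not an existing
# DESCRIPTION field that R CMD check understands.
desc_file <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: matrixStats",
  "Version: 0.12.0",
  "Extrawords: rowMedians, rowRanks"
), desc_file)

# read.dcf() returns a character matrix with one column per requested field.
desc  <- read.dcf(desc_file, fields = "Extrawords")
extra <- trimws(strsplit(desc[1, "Extrawords"], ",", fixed = TRUE)[[1]])

# Words a spell checker might have flagged (taken from the NOTE above).
flagged  <- c("rowMedians", "rowRanks", "rowSds")
reported <- setdiff(flagged, extra)
print(reported)
```

Only words missing from the extra list would still be reported.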

Best,
Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] package.skeleton leads to R CMD check aborting

2014-12-11 Thread Dirk Eddelbuettel

The package.skeleton() base function is useful for creating empty packages.

The fact that the help files generated by the function cause errors was long
an annoyance, but I noticed last month (while preparing for a workshop on R
and packages) that it creates packages which cause 'R CMD check ...' to die
in error.  Which is always a bug.  

I just verified that it still dies in error under R-devel, and filed bug
report 16105 at the bugzilla instance. See below for a session log.

As a bug reporter, I should offer help. I do, though somewhat hesitantly as
package.skeleton() has gotten a bit complicated over the years.  But if a
patch simplifying the output of a simple default package is of interest I
will work on it.
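
For readers who want to poke at the generated files without leaving R, here
is a minimal sketch. It assumes a toy function named hello and a temporary
output directory, neither of which is from the session log below. It builds
a skeleton and runs tools::checkRd() over the generated help templates. The
calls are wrapped in try() because the templates may or may not parse
cleanly, depending on the R version.

```r
# Build a skeleton in a temporary directory from a single toy function.
env <- new.env()
assign("hello", function() "hello", envir = env)  # toy object, illustrative

out_dir <- tempfile("skeleton")
dir.create(out_dir)
package.skeleton(name = "quickDemo", list = "hello",
                 environment = env, path = out_dir)

pkg_dir  <- file.path(out_dir, "quickDemo")
rd_files <- list.files(file.path(pkg_dir, "man"), full.names = TRUE)

# tools::checkRd() parses one Rd file and reports problems it finds;
# try() keeps the loop going if a generated template is rejected.
results <- lapply(rd_files, function(f) try(tools::checkRd(f), silent = TRUE))
print(basename(rd_files))
```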

Dirk


edd@max:/tmp$ rm -rf demo/
edd@max:/tmp$ mkdir demo && cd demo
edd@max:/tmp/demo$ R-devel.sh  

R Under development (unstable) (2014-12-09 r67142) -- Unsuffered Consequences
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.


John Kane: I have 120 columns in a data.frame. I have one value in a column
named "blaw" that I want to change. How do I find the coordinates?
Roger Koenker: It is the well-known wicked which problem: If you had
(grammatically incorrectly) thought "... which I want to change" then you might
have been led to type (in another window):
  ?which
and you would have seen the light. Maybe that() should be an alias for which()?
   -- John Kane and Roger Koenker
      R-help (August 2006)

R> package.skeleton("quickDemo")
Creating directories ...
Creating DESCRIPTION ...
Creating NAMESPACE ...
Creating Read-and-delete-me ...
Saving functions and data ...
Making help files ...
Done.
Further steps are described in './quickDemo/Read-and-delete-me'.
R> q()
edd@max:/tmp/demo$ R-devel.sh CMD build quickDemo
* checking for file ‘quickDemo/DESCRIPTION’ ... OK
* preparing ‘quickDemo’:
* checking DESCRIPTION meta-information ... OK
* installing the package to process help pages
* saving partial Rd database
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* building ‘quickDemo_1.0.tar.gz’

edd@max:/tmp/demo$ R-devel.sh CMD check quickDemo_1.0.tar.gz
* using log directory ‘/tmp/demo/quickDemo.Rcheck’
* using R Under development (unstable) (2014-12-09 r67142)
* using platform: x86_64-unknown-linux-gnu (64-bit)
* using session charset: UTF-8
* checking for file ‘quickDemo/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘quickDemo’ version ‘1.0’
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘quickDemo’ can be installed ... WARNING
Found the following significant warnings:
  Warning: /tmp/Rtmp3vskac/Rbuild4e1945bdc52a/quickDemo/man/quickDemo-package.Rd:26: All text must be in a section
See ‘/tmp/demo/quickDemo.Rcheck/00install.out’ for details.
* checking installed package size ... OK
* checking package directory ... OK
* checking DESCRIPTION meta-information ... WARNING
Non-standard license specification:
  What license is it under?
Standardizable: FALSE
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... WARNING
prepare_Rd: quickDemo-package.Rd:26: All text must be in a section
* checking Rd metadata ... OK
* checking Rd cross-references ... WARNING
Unknown package 

Re: [Rd] Fwd: No source view when using gdb

2014-12-11 Thread Martyn Plummer
On Thu, 2014-12-11 at 14:00 +0100, Pierrick Bruneau wrote:
 Dear R contributors,
 
 Say I want to debug some C code invoked through .Call() - say
 "varbayes" in the VBmix package. Following the instructions in
 "Writing R Extensions", I perform the following actions:
 
 R -d gdb
 run
 library(VBmix)
 CTRL+C
 break varbayes
 signal 0
 mod <- varbayes(as.matrix(iris)[,1:4], 2)
 
 The breakpoint is indeed activated, seemingly at the correct position
 in the source file, but instead of the actual text at the respective
 line, I get the following :
 
 69      varbayes.c: No such file or directory.
 
 Issuing "next" afterwards seems to achieve the expected purpose (step
 by step progression), but source code lines are replaced by, e.g.:
 
 72      in varbayes.c
 
 There should be some way of installing the source code files, but I
 did not find R-specific info there. Does someone have a clue for my
 problem?

This happens when you install a package from the tarball. The source is
unpacked into a temporary directory which is then deleted. Try unpacking
the source tarball yourself, and then installing from the unpacked
directory, e.g. 

tar xfvz VBmix_0.2.17.tar.gz
R CMD INSTALL VBmix
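
The same two steps can also be driven from within R. The following is only
a sketch: the function name install_from_unpacked and its arguments are
made up for illustration, and it assumes the tarball contains a single
top-level package directory, as source tarballs produced by R CMD build do.

```r
# Sketch: unpack a source tarball, then optionally install from the unpacked
# directory so the C sources stay on disk where the debugger can find them.
install_from_unpacked <- function(tarball, exdir = dirname(tarball),
                                  install = TRUE) {
  # Name of the single top-level directory inside the tarball.
  top <- unique(sub("/.*$", "", untar(tarball, list = TRUE)))
  stopifnot(length(top) == 1L)

  untar(tarball, exdir = exdir)
  pkg_dir <- file.path(exdir, top)

  # repos = NULL tells install.packages() to install from a local path.
  if (install) install.packages(pkg_dir, repos = NULL, type = "source")
  pkg_dir
}

# Example call (not run here):
# install_from_unpacked("VBmix_0.2.17.tar.gz")
```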

Martyn

 Thanks in advance,
 Pierrick
 


Re: [Rd] Fwd: No source view when using gdb

2014-12-11 Thread Pierrick Bruneau
Works like a charm, thanks!



[Rd] SUGGESTION: Force install.packages() to use ASCII encoding when parse():ing code?

2014-12-11 Thread Henrik Bengtsson
SUGGESTION:
Would it make sense if install.packages() and friends always used an
"ascii"(*) encoding when parse():ing R package source code files?

I believe this should be safe, because R code files should be in ASCII
[http://en.wikipedia.org/wiki/ASCII] and only in source-code comments
you may use other characters.  This is from Section 'Package
subdirectories' in 'Writing R Extensions':

Only ASCII characters (and the control characters tab, formfeed, LF
and CR) should be used in code files. Other characters are accepted in
comments, but then the comments may not be readable in e.g. a UTF-8
locale. Non-ASCII characters in object names will normally fail when
the package is installed. Any byte will be allowed in a quoted
character string but \u escapes should be used for non-ASCII
characters. However, non-ASCII character strings may not be usable in
some locales and may display incorrectly in others.

Since comments are dropped by parse(), their actual content does not
matter, and the rest of the code should be in ASCII.
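
As a small aid for spotting such files, here is a sketch of a helper that
flags lines containing any byte above 0x7F. The name has_non_ascii and the
toy file are made up for illustration. The tools package also ships
showNonASCIIfile() for a similar purpose.

```r
# Flag lines of a file that contain any byte above 0x7F (i.e. non-ASCII).
has_non_ascii <- function(file) {
  lines <- readLines(file, warn = FALSE)
  vapply(lines, function(l) any(charToRaw(l) > as.raw(0x7f)),
         logical(1), USE.NAMES = FALSE)
}

# Toy file: a Latin-1 encoded comment followed by a plain ASCII line of code.
toy <- tempfile(fileext = ".R")
writeBin(c(as.raw(c(0x23, 0x20, 0xe9)), charToRaw("tudiant\nx <- 1\n")), toy)

flags <- has_non_ascii(toy)
print(flags)
```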

(*) It could be that the specific encoding "ascii" is not
cross-platform. If so, is there another way to specify a pure ASCII
encoding?



BACKGROUND:
If a user/system sets the 'encoding' option at startup, it may break
package installations from source if the package has source code
comments with non-ASCII characters.  For example,

$ mkdir foo; cd foo
$ echo "options(encoding='UTF-8')" > .Rprofile
$ R --vanilla
> install.packages("R.oo", type="source")

> install.packages("R.oo", type="source")
Installing package into 'C:/Users/hb/R/win-library/3.2'
(as 'lib' is unspecified)
--- Please select a CRAN mirror for use in this session ---
trying URL 'http://cran.at.r-project.org/src/contrib/R.oo_1.18.0.tar.gz'
Content type 'application/x-gzip' length 394545 bytes (385 KB)
opened URL
downloaded 385 KB

* installing *source* package 'R.oo' ...
** package 'R.oo' successfully unpacked and MD5 sums checked
** R
Warning in parse(outFile) :
  invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/R.oo'
** inst
** preparing package for lazy loading
Warning in parse(n = -1, file = file, srcfile = NULL, keep.source = FALSE) :
  invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/R.oo'
** help
[...]

(This can be an extremely time consuming task to troubleshoot,
particularly if reported to a package maintainer not having access to
the original system).

FYI, setting it only in the session is alright:

> options(encoding="UTF-8")
> install.packages("R.oo", type="source")

because install.packages() launches a separate R process for the
installation and it's only then the startup code becomes an issue.


TROUBLESHOOTING:
My understanding for the

Warning in parse(n = -1, file = file, srcfile = NULL, keep.source = FALSE) :
  invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/

is that this happens when there is a non-ASCII character in one of the
source-code comments (*) with a bit pattern matching a multi-byte
UTF-8 sequence [http://en.wikipedia.org/wiki/UTF-8#Description].  For
instance, consider a source code comment with an acute accent:

> raw <- as.raw(c(0x23, 0x20, 0xe9, 0x74, 0x75, 0x64, 0x69, 0x61, 0x6e, 0x74,
+ 0x0a))
> writeBin(raw, con="foo.R")
> code <- readLines("foo.R")
> code
[1] "# étudiant"

> options(encoding="UTF-8")
> parse("foo.R")
Warning message:
In readLines(file, warn = FALSE) :
  invalid input found on input connection 'foo.R'

> options(encoding="ascii")
> parse("foo.R")
expression()

Reason for the invalid input: The bit pattern for raw[3:5], is:

 R.utils::intToBin(raw[3:5])
[1] 11101001 01110100 01110101

The first byte (raw[3]) matches the special UTF-8 byte pattern 1110xxxx,
which according to UTF-8 should be followed by two more bytes with bit
patterns 10xxxxxx and 10xxxxxx
[http://en.wikipedia.org/wiki/UTF-8#Description].  Since raw[4:5] does
not match those, it's an invalid UTF-8 byte sequence.  So, technically
this does not happen for all comments using acute accents, but it's
very likely.  More generally, a multi-byte UTF-8 sequence is expected
when byte pattern 11xxxxxx (>= 192 in decimal values) is encountered.
Looking at http://en.wikipedia.org/wiki/ISO/IEC_8859, there are several
characters with this bit pattern in many Latin-N encodings, which
I'd assume are still in dominant use by many developers.
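
The byte-pattern check described above can be sketched in a few lines of base
R. The two helper functions are hypothetical names made up for this
illustration; they only mirror the lead-byte / continuation-byte rules and are
not how R itself validates input.

```r
# Hypothetical helper: number of continuation bytes a UTF-8 lead byte announces.
utf8_expected_continuations <- function(byte) {
  b <- as.integer(byte)
  if (b < 0x80L) return(0L)                    # 0xxxxxxx: plain ASCII
  if (bitwAnd(b, 0xE0L) == 0xC0L) return(1L)   # 110xxxxx
  if (bitwAnd(b, 0xF0L) == 0xE0L) return(2L)   # 1110xxxx
  if (bitwAnd(b, 0xF8L) == 0xF0L) return(3L)   # 11110xxx
  NA_integer_                                  # 10xxxxxx or not a valid lead byte
}

# Hypothetical helper: TRUE for bytes of the form 10xxxxxx.
is_continuation <- function(bytes) bitwAnd(as.integer(bytes), 0xC0L) == 0x80L

bytes <- as.raw(c(0xe9, 0x74, 0x75))             # raw[3:5] from the example
n_expected <- utf8_expected_continuations(bytes[1])
followers_ok <- all(is_continuation(bytes[2:3]))
```

Here n_expected is 2 (0xe9 is 1110 1001), while followers_ok is FALSE because
0x74 and 0x75 both start with a 0 bit, so the sequence is not valid UTF-8.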

So, since options(encoding="UTF-8") was set at startup, that is also
the encoding that R tries to follow.  My suggestion is that R should
always use a pure-ASCII encoding when parsing R code in packages,
because that is what 'Writing R Extensions' says we should use in the
first place.

/Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] SUGGESTION: Force install.packages() to use ASCII encoding when parse():ing code?

2014-12-11 Thread Duncan Murdoch

On 11/12/2014 12:59 PM, Henrik Bengtsson wrote:

SUGGESTION:
Would it make sense if install.packages() and friends always use an
"ascii"(*) encoding when parse():ing R package source code files?


I think that would be a step backwards.  It would be better to accept 
other encodings.  As an English speaker this isn't a big deal to me, but 
users of other languages may want to have messages and variable names in 
their native language, and ASCII might not be enough for that.


On the other hand, I think it's quite reasonable to require a declared 
encoding if anything other than ASCII is used, and possibly to fail for 
some encodings.  It is probably also reasonable to at least warn when 
non-ASCII characters are used in strings in packages on CRAN, as many 
users can't display all characters.
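
As a rough illustration of the kind of check this would involve, the sketch
below flags source lines that contain any byte above 0x7f. The helper name
has_non_ascii is made up for this example and the check works on raw bytes
only, so it does not distinguish comments from strings; tools::showNonASCII()
is an existing utility in the same spirit.

```r
# Hypothetical helper: TRUE for each line that contains a non-ASCII byte.
has_non_ascii <- function(lines) {
  vapply(
    lines,
    function(line) any(as.integer(charToRaw(line)) > 0x7f),
    logical(1),
    USE.NAMES = FALSE
  )
}

# Example input: the second line holds a Latin-1 encoded e-acute (0xe9).
src <- c("x <- 1", "msg <- \"caf\xe9\"", "y <- 2")
flagged <- which(has_non_ascii(src))
```

In this example only the second line is flagged, so a check like this could
drive a warning when no encoding has been declared.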


Duncan Murdoch



[Rd] Significant memory leak when using XML on Windows

2014-12-11 Thread Janko Thyson
Dear list,

I'm sorry to keep coming back with this time and time again, but this bug
is still not fixed even though the root cause of the issue has been around
for 2-3 years now. And as the number of packages that depend on XML grows,
I thought maybe this deserves some wider attention.

I did my best to make reproduction of the issue as easy as possible:
https://github.com/omegahat/XML/issues/4
http://goo.gl/aV17Lv

But as I'm not familiar with C, I'm rather out of ideas about what else to do.

Duncan has been really dedicated and helpful so far, but unfortunately he
seems to have too little time to really dig into this himself. So I thought
I'd try and raise the attention of other developers who have the skills to
fix this. Apparently, the issue is caused by the way the memory consumed by
the underlying C-objects/pointers is released (or not released, for that
matter).

I'd so much appreciate if someone could have a look at this. If I can be of
any help whatsoever, please let me know!

Thanks and best regards,
Janko



Re: [Rd] SUGGESTION: Force install.packages() to use ASCII encoding when parse():ing code?

2014-12-11 Thread Henrik Bengtsson
On Thu, Dec 11, 2014 at 10:47 AM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:
 On 11/12/2014 12:59 PM, Henrik Bengtsson wrote:

 SUGGESTION:
 Would it make sense if install.packages() and friends always use an
 "ascii"(*) encoding when parse():ing R package source code files?


 I think that would be a step backwards.  It would be better to accept other
 encodings.  As an English speaker this isn't a big deal to me, but users of
 other languages may want to have messages and variable names in their native
 language, and ASCII might not be enough for that.

Thanks for the feedback.  While I'll probably agree with you that R
packages should support other source code encodings than ASCII, that
would require a change in the specifications and design.  What I'm
proposing is (just) an adjustment to the implementation to meet the
current specs and design.


 On the other hand, I think it's quite reasonable to require a declared
 encoding if anything other than ASCII is used, and possibly to fail for some
 encodings.  It is probably also reasonable to at least warn when non-ASCII
 characters are used in strings in packages on CRAN, as many users can't
 display all characters.

That would be a reasonable extension of the design, which would be
backward compatible with the current design, i.e. if encoding for the
source code is not declared, then it is assumed to be ASCII.

Source code comments are special, because the current design
('Writing R Extensions') somehow leaves it open to use any type of
encoding there.  Read liberally, it could even mean that you can use
different encodings for different comments in the same file (which is
not unlikely to occur, considering cut'n'paste and open-source
licenses).  If other encodings are to be supported, then I see two
ways forward:

1. Have R completely ignore what's in the comments (what follows #
until the newline) such that encoding does not matter, or
2. require the same encoding for the source code comments as the rest
of the code.

As I see it, today's design falls (could fall?) under 1, but the
implementation does not go all the way to support it.
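
A rough sketch of what option 1 could look like, purely as an illustration and
not as a proposal for the actual implementation: neutralize every non-ASCII
byte before parsing, so the encoding of the comments can no longer matter.
The function name parse_ascii_only is made up, and the sketch assumes that
non-ASCII bytes occur only in comments (bytes inside string literals would be
replaced as well).

```r
# Hypothetical helper: parse a file after replacing every byte above 0x7f
# with '?', so that the result no longer depends on options("encoding").
parse_ascii_only <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  bytes[as.integer(bytes) > 0x7f] <- as.raw(0x3f)  # 0x3f is '?'
  parse(text = rawToChar(bytes), keep.source = FALSE)
}

# Example file: the Latin-1 comment from the earlier example plus one statement.
path <- tempfile(fileext = ".R")
writeBin(
  c(as.raw(c(0x23, 0x20, 0xe9, 0x74, 0x75, 0x64, 0x69, 0x61, 0x6e, 0x74, 0x0a)),
    charToRaw("x <- 1\n")),
  con = path
)
exprs <- parse_ascii_only(path)
```

The comment is dropped by parse() as usual and exprs holds the single
expression x <- 1, whatever 'encoding' option is in effect.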

/Henrik

PS. It should be emphasized that this is about R packages. AFAIK, you
can already now source() code written in any encoding, e.g.

> raw <- as.raw(c(
+  0xcf, 0x80, 0x20, 0x3c, 0x2d, 0x20, 0x70, 0x69, 0x0a,
+  0x70, 0x72, 0x69, 0x6e, 0x74, 0x28, 0xcf, 0x80, 0x29, 0x0a
+ ))
> writeBin(raw, con="pi.R")
> source("pi.R", encoding="UTF-8")
[1] 3.141593


 Duncan Murdoch


 I believe this should be safe, because R code files should be in ASCII
 [http://en.wikipedia.org/wiki/ASCII] and only in source-code comments
 you may use other characters.  This is from Section 'Package
 subdirectories' in 'Writing R Extensions':

 Only ASCII characters (and the control characters tab, formfeed, LF
 and CR) should be used in code files. Other characters are accepted in
 comments, but then the comments may not be readable in e.g. a UTF-8
 locale. Non-ASCII characters in object names will normally fail when
 the package is installed. Any byte will be allowed in a quoted
 character string but \u escapes should be used for non-ASCII
 characters. However, non-ASCII character strings may not be usable in
 some locales and may display incorrectly in others.

 Since comments are dropped by parse(), their actual content does not
 matter, and the rest of the code should be in ASCII.

 (*) It could be that the specific encoding ascii is not cross
 platforms. If so, is there another way to specify a pure ASCII
 encoding?



 BACKGROUND:
 If a user/system sets the 'encoding' option at startup, it may break
 package installations from source if the package has source code
 comments with non-ASCII characters.  For example,

 $ mkdir foo; cd foo
 $ echo "options(encoding='UTF-8')" > .Rprofile
 $ R --vanilla
 > install.packages("R.oo", type="source")

 > install.packages("R.oo", type="source")
 Installing package into 'C:/Users/hb/R/win-library/3.2'
 (as 'lib' is unspecified)
 --- Please select a CRAN mirror for use in this session ---
 trying URL 'http://cran.at.r-project.org/src/contrib/R.oo_1.18.0.tar.gz'
 Content type 'application/x-gzip' length 394545 bytes (385 KB)
 opened URL
 downloaded 385 KB

 * installing *source* package 'R.oo' ...
 ** package 'R.oo' successfully unpacked and MD5 sums checked
 ** R
 Warning in parse(outFile) :
   invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/R.oo'
 ** inst
 ** preparing package for lazy loading
 Warning in parse(n = -1, file = file, srcfile = NULL, keep.source = FALSE) :
   invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/R.oo'
 ** help
 [...]

 (This can be an extremely time consuming task to troubleshoot,
 particularly if reported to a package maintainer not having access to
 the original system).

 FYI, setting it only in the session is alright:

 > options(encoding="UTF-8")
 > install.packages("R.oo", type="source")

 because install.packages() launches