Re: [Bioc-devel] VariantAnnotation: VCF to VRanges with multiple INFO values
Can you backport the fixes to bioc-release, which is also affected?

Best
Julian

Valerie Obenchain (12/09/14 03:24):
> Thanks to Michael and Julian for taking care of this. Fixes are in
> devel, >= 1.13.15.
>
> Valerie
>
> On 12/03/14 08:44, Michael Lawrence wrote:
>> Looks like an issue when expand()ing the VCF. Maybe Val could take a
>> look?
>>
>> On Wed, Dec 3, 2014 at 7:39 AM, Julian Gehring julian.gehr...@embl.de wrote:
>>> Hi,
>>>
>>> The conversion from a 'VCF' to a 'VRanges' object fails if an INFO
>>> field with multiple values for different ALT alleles is present.
>>>
>>> Here is an example VCF entry for which this fails (line 71151250 in
>>> 'ALL.wgs.phase3_shapeit2_mvncall_integrated_v5.20130502.sites.vcf.gz',
>>> taken from
>>> ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5.20130502.sites.vcf.gz):
>>>
>>>   10001541 rs12451372 C G,T 100 PASS AC=700,298;AF=0.139776,0.0595048;AN=5008;NS=2504;DP=17289;EAS_AF=0.2421,0;AMR_AF=0.1801,0.0115;AFR_AF=0.0749,0.2194;EUR_AF=0.0915,0;SAS_AF=0.1431,0;AA=T|||
>>>
>>> The respective code to reproduce this:
>>>
>>>   library(VariantAnnotation)
>>>   roi = GRanges("17", IRanges(1e7+1541, width = 1))
>>>   vcf = readVcf(path, "GRCh37", ScanVcfParam(which = roi, info = "AF"))
>>>   ## 'info = character()' and other versions also cause the error
>>>   vrc = as(vcf, "VRanges") ## error
>>>
>>> fails with
>>>
>>>   Error in colSums(ielt) :
>>>     'x' must be an array of at least two dimensions
>>>
>>> This occurs both with the latest version of VariantAnnotation in
>>> bioc-release and bioc-devel.
>>>
>>> Best
>>> Julian

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
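For readers unfamiliar with multi-allelic records, here is a minimal base-R sketch of what "one INFO value per ALT allele" means for the entry in the report. It is not VariantAnnotation's expand() implementation. The helper per_allele() is hypothetical, and deciding per-allele fields by comparing lengths is a simplifying assumption (a real parser would consult the Number=A declaration in the VCF header).

```r
# Fields copied from the example record in the report (subset of INFO).
ref  <- "C"
alt  <- strsplit("G,T", ",", fixed = TRUE)[[1]]
info <- "AC=700,298;AF=0.139776,0.0595048;AN=5008"

# Split "key=v1,v2" entries into a named list of character vectors.
entries <- strsplit(strsplit(info, ";", fixed = TRUE)[[1]], "=", fixed = TRUE)
values  <- lapply(entries, function(kv) strsplit(kv[2], ",", fixed = TRUE)[[1]])
names(values) <- vapply(entries, function(kv) kv[1], character(1))

# Hypothetical helper: keep values that already have one entry per ALT
# allele, and recycle site-level values (such as AN) across the alleles.
per_allele <- function(v) {
  if (length(v) == length(alt)) v else rep(v, length.out = length(alt))
}

# One row per ALT allele, which is the shape a VRanges-like object needs.
expanded <- data.frame(
  ref = ref,
  alt = alt,
  AC  = as.integer(per_allele(values$AC)),
  AF  = as.numeric(per_allele(values$AF)),
  AN  = as.integer(per_allele(values$AN)),
  stringsAsFactors = FALSE
)
print(expanded)
```

The point of the sketch is only that AC and AF carry two values that must be split across the G and T rows, while AN is a single site-level value that is repeated.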
Re: [Bioc-devel] Experiment Data biocViews updates
Hi Vince,

As an initial cut,
a) I have removed WholeGenomeData from under TechnologyData -> SequencingData
b) added Somatic and Germline under SpecimenSource. (You need to click the checkbox on top to see these two, as they are childless biocViews.)

Thanks and Regards,
Sonali.

On 12/8/2014 8:44 AM, Vincent Carey wrote:
> On Mon, Dec 8, 2014 at 11:22 AM, Sonali Arora sar...@fredhutch.org wrote:
>> Hi Vince,
>>
>> On 12/8/2014 7:38 AM, Vincent Carey wrote:
>>> Very nice. Is WholeGenomeData under SequencingData sufficiently
>>> clear?
>>
>> We mirrored the Technology category under Software biocViews for
>> TechnologyData, thus SequencingData mirrored Sequencing. What would
>> you suggest instead?
>
> Well, this is something we should get some consensus on if there will
> be any changes. It seems to me that various assays can be regarded as
> whole genome data, whereas I think what is intended for that term is
> whole genome DNA sequencing, and one might want to distinguish germ
> line and tumor sequencing for an initial cut?
>
>>> We may also want to have a source tissue terminology in the
>>> experimental data space.
>>
>> We have:
>> SpecimenSource -> Tissue;
>
> Again, I don't want to introduce pointless complexity. But my
> understanding is that one can only use terms that are in the
> hierarchy, so if there are informative distinctions missing from the
> hierarchy, resources go out with vague labels. One set of terms for
> tissues is at http://www.gtexportal.org/home/. Since we might want to
> repackage some of that data, perhaps a coarse subset of those terms
> would be useful to have in the hierarchy.
>
>> SpecimenSource -> Proteome;
>> SpecimenSource -> Genome;
>> SpecimenSource -> StemCell;
>> SpecimenSource -> CellCulture;
>>
>> I believe you want us to expand Tissue more? Please suggest the best
>> way to break it up further.
>>
>> Thanks for the quick response,
>> Sonali.
>>
>>> On Mon, Dec 8, 2014 at 10:28 AM, Sonali Arora sar...@fredhutch.org wrote:
>>>> Hi everyone,
>>>>
>>>> We have revised the Experiment Data biocViews and have updated most
>>>> of the Experiment Data packages using our word lookup function
>>>> (recommendBiocViews()). Package authors are encouraged to check the
>>>> new biocViews added to their Experiment Data package and add more
>>>> relevant ones from the updated Experiment Data biocViews tree.
>>>>
>>>> The new Experiment Data biocViews tree can be viewed at:
>>>> http://www.bioconductor.org/packages/devel/BiocViews.html#___ExperimentData
>>>>
>>>> Just a friendly reminder: you are allowed to add biocViews only
>>>> from the category of biocViews that your package belongs to. For
>>>> example, ExperimentData packages can contain biocViews only from
>>>> the Experiment Data biocViews category.
>>>>
>>>> We hope that this updated tree will help you find test data more
>>>> easily and efficiently for the new software packages that you
>>>> write.
>>>>
>>>> --
>>>> Thanks and Regards,
>>>> Sonali

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
[Bioc-devel] biovizBase Installation rbind Problem
Hello,

I can't install biovizBase, although I have the correct versions of all packages in the Depends and Imports fields.

> source("http://bioconductor.org/biocLite.R")
Bioconductor version 3.0 (BiocInstaller 1.16.1), ?biocLite for help
> biocLite("biovizBase")
BioC_mirror: http://bioconductor.org
Using Bioconductor version 3.0 (BiocInstaller 1.16.1), R version 3.1.2.
Installing package(s) 'biovizBase'
trying URL 'http://bioconductor.org/packages/3.0/bioc/src/contrib/biovizBase_1.14.0.tar.gz'
Content type 'application/x-gzip' length 2429664 bytes (2.3 Mb)
opened URL
==
downloaded 2.3 Mb

* installing *source* package 'biovizBase' ...
** libs
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -fpic -O2 -pipe -g -c R_init_biovizBase.c -o R_init_biovizBase.o
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -fpic -O2 -pipe -g -c bin_offsets.c -o bin_offsets.o
gcc -std=gnu99 -shared -L/usr/lib/R/lib -Wl,-z,relro -o biovizBase.so R_init_biovizBase.o bin_offsets.o -L/usr/lib/R/lib -lR
installing to /dskh/nobackup/biostat/Bioconductor/biovizBase/libs
** R
** data
** inst
** preparing package for lazy loading
Error in rbind(deparse.level, ...) :
  numbers of columns of arguments do not match
Error : unable to load R code in package 'biovizBase'
ERROR: lazy loading failed for package 'biovizBase'
* removing '/dskh/nobackup/biostat/Bioconductor/biovizBase'
* restoring previous '/dskh/nobackup/biostat/Bioconductor/biovizBase'

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] splines   grid      stats4    parallel  stats     graphics  grDevices
[8] utils     datasets  methods   base

other attached packages:
 [1] BiocInstaller_1.16.1     dichromat_2.0-0          RColorBrewer_1.1-2
 [4] VariantAnnotation_1.12.6 Rsamtools_1.18.2         AnnotationDbi_1.28.1
 [7] Biobase_2.26.0           scales_0.2.4             Hmisc_3.14-6
[10] Formula_1.1-2            survival_2.37-7          lattice_0.20-29
[13] Biostrings_2.34.0        XVector_0.6.0            GenomicRanges_1.18.3
[16] GenomeInfoDb_1.2.3       IRanges_2.0.0            S4Vectors_0.4.0
[19] BiocGenerics_0.12.1

loaded via a namespace (and not attached):
 [1] BBmisc_1.8              BSgenome_1.34.0         BatchJobs_1.5
 [4] BiocParallel_1.0.0      DBI_0.3.1               GenomicAlignments_1.2.1
 [7] GenomicFeatures_1.18.2  RCurl_1.95-4.5          RSQLite_1.0.0
[10] Rcpp_0.11.3             XML_3.98-1.1            acepack_1.3-3.3
[13] base64enc_0.1-2         biomaRt_2.22.0          bitops_1.0-6
[16] brew_1.0-6              checkmate_1.5.0         cluster_1.15.3
[19] codetools_0.2-9         colorspace_1.2-4        digest_0.6.6
[22] fail_1.2                foreach_1.4.2           foreign_0.8-61
[25] iterators_1.0.7         latticeExtra_0.6-26     munsell_0.4.2
[28] nnet_7.3-8              plyr_1.8.1              rpart_4.1-8
[31] rtracklayer_1.26.2      sendmailR_1.2-1         stringr_0.6.2
[34] tools_3.1.2             zlibbioc_1.12.0

Are there important and undeclared dependencies for biovizBase?

I also get different errors for a few other packages I'm trying to update:

                 Installed Built ReposVer
AllelicImbalance 1.0.0     3.0.3 1.4.0
GGtools          4.10.0    3.0.3 5.2.0
Gviz             1.6.0     3.0.3 1.10.3
ReportingTools   2.2.0     3.0.3 2.6.0
biomvRCNS        1.2.0     3.0.3 1.6.0
biovizBase       1.10.8    3.0.3 1.14.0
casper           1.4.0     3.0.3 2.0.0
cummeRbund       2.4.1     3.0.3 2.8.2
ggbio            1.10.16   3.0.3 1.14.0
intansv          1.2.0     3.0.3 1.6.0
methyAnalysis    1.4.2     3.0.3 1.8.0
qrqc             1.16.0    3.0.3 1.20.0
spliceR          1.3.1     3.0.3 1.8.0

> biocLite("spliceR")
BioC_mirror: http://bioconductor.org
Using Bioconductor version 3.0 (BiocInstaller 1.16.1), R version 3.1.2.
Installing package(s) 'spliceR'
trying URL 'http://bioconductor.org/packages/3.0/bioc/src/contrib/spliceR_1.8.0.tar.gz'
Content type 'application/x-gzip' length 356213 bytes (347 Kb)
opened URL
==
downloaded 347 Kb

* installing *source* package 'spliceR' ...
** libs
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -fpic -O2 -pipe -g -c utils.c -o utils.o
gcc -std=gnu99 -shared -L/usr/lib/R/lib -Wl,-z,relro -o spliceR.so utils.o -L/usr/lib/R/lib -lR
installing to /dskh/nobackup/biostat/Bioconductor/spliceR/libs
** R
** inst
** preparing package for lazy loading
Error : object 'renameSeqlevels' is not exported by 'namespace:GenomicRanges'
Error : package 'Gviz' could not be loaded
ERROR: lazy loading failed for package 'spliceR'
* removing '/dskh/nobackup/biostat/Bioconductor/spliceR'
* restoring previous '/dskh/nobackup/biostat/Bioconductor/spliceR'

--
Dario Strbenac PhD
Re: [Bioc-devel] biovizBase Installation rbind Problem
----- Original Message -----
> From: Dario Strbenac dstr7...@uni.sydney.edu.au
> To: bioc-devel@r-project.org
> Sent: Thursday, December 11, 2014 5:00:11 PM
> Subject: [Bioc-devel] biovizBase Installation rbind Problem
>
> Hello, I can't install biovizBase, although I have the correct
> versions of all packages in Depends and Imports fields.
>
> [...]
>
> ** preparing package for lazy loading
> Error in rbind(deparse.level, ...) :
>   numbers of columns of arguments do not match
> Error : unable to load R code in package 'biovizBase'
> ERROR: lazy loading failed for package 'biovizBase'
> * removing '/dskh/nobackup/biostat/Bioconductor/biovizBase'
> * restoring previous '/dskh/nobackup/biostat/Bioconductor/biovizBase'

See https://support.bioconductor.org/p/63510/#63529

> [...]
Re: [Bioc-devel] biovizBase Installation rbind Problem
----- Original Message -----
> From: Dan Tenenbaum dtene...@fredhutch.org
> To: Dario Strbenac dstr7...@uni.sydney.edu.au
> Cc: bioc-devel@r-project.org
> Sent: Thursday, December 11, 2014 5:10:34 PM
> Subject: Re: [Bioc-devel] biovizBase Installation rbind Problem
>
> ----- Original Message -----
>> From: Dario Strbenac dstr7...@uni.sydney.edu.au
>> To: bioc-devel@r-project.org
>> Sent: Thursday, December 11, 2014 5:00:11 PM
>> Subject: [Bioc-devel] biovizBase Installation rbind Problem
>>
>> Hello, I can't install biovizBase, although I have the correct
>> versions of all packages in Depends and Imports fields.
>>
>> [...]
>>
>> Error in rbind(deparse.level, ...) :
>>   numbers of columns of arguments do not match
>> Error : unable to load R code in package 'biovizBase'
>> ERROR: lazy loading failed for package 'biovizBase'
>
> See https://support.bioconductor.org/p/63510/#63529
>
>> [...]
Re: [Rd] R CMD check --as-cran and (a)spell checking
Henrik Bengtsson h...@biostat.ucsf.edu on Fri, 5 Dec 2014 18:17:57 -0800 writes:

> Does anyone know if it is possible to add a dictionary file of known
> words that becomes part of the *built* package to tell 'R CMD check
> --as-cran' not to report these words as misspelled? I want this
> dictionary to come with the *.tar.gz such that it will be available
> regardless of where the package is checked. For instance, currently I
> get:
>
> * using log directory 'T:/R/_R-3.1.2patched/matrixStats.Rcheck'
> * using R version 3.1.2 Patched (2014-12-03 r67101)
> * using platform: x86_64-w64-mingw32 (64-bit)
> * using session charset: ISO8859-1
> * checking for file 'matrixStats/DESCRIPTION' ... OK
> * this is package 'matrixStats' version '0.12.0'
> * checking CRAN incoming feasibility ... NOTE
> Maintainer: 'Henrik Bengtsson henr...@braju.com'
> Possibly mis-spelled words in DESCRIPTION:
>   rowMedians (18:74)
>   rowRanks (18:92)
>   rowSds (18:111)
> * checking package namespace information ... OK
> ...

I agree that some customization possibility would be great here. Maybe it'll be sufficient to allow a short list of about a dozen words in the DESCRIPTION itself, e.g., with an item

  Extrawords: rowMedians, rowRanks

Would you feel like providing a patch to the R (devel) sources for this?

Best,
Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
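To make the proposal concrete, here is a small sketch of how a checker might consume such a field. 'Extrawords' is only the name floated in this thread, not an existing DESCRIPTION field that R recognises, and the vector of flagged words is made up from the check log above plus one genuine typo for illustration.

```r
# A toy DESCRIPTION with the hypothetical 'Extrawords' field.
desc_lines <- c(
  "Package: matrixStats",
  "Version: 0.12.0",
  "Extrawords: rowMedians, rowRanks, rowSds"
)

# read.dcf() accepts a connection, so no temporary file is needed.
con  <- textConnection(desc_lines)
desc <- read.dcf(con, fields = "Extrawords")
close(con)

# Split the comma-separated field into individual known words.
known_words <- strsplit(desc[1, "Extrawords"], ",[[:space:]]*")[[1]]

# Words a spell checker might have flagged (illustrative input).
flagged <- c("rowMedians", "rowRanks", "rowSds", "teh")

# Only words not whitelisted by the package would still be reported.
still_reported <- setdiff(flagged, known_words)
print(still_reported)
```

Keeping the list in DESCRIPTION means it travels inside the *.tar.gz, which is the property Henrik asked for.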
[Rd] package.skeleton leads to R CMD check aborting
The package.skeleton() base function is useful for creating empty packages.

The fact that the help files generated by the function cause errors has long been an annoyance, but I noticed last month (while preparing for a workshop on R and packages) that it creates packages which cause 'R CMD check ...' to die in error. Which is always a bug.

I just verified that it still dies in error under R-devel, and filed bug report 16105 at the bugzilla instance. See below for a session log.

As a bug reporter, I should offer help. I do, though somewhat hesitantly, as package.skeleton() has gotten a bit complicated over the years. But if a patch simplifying the output of a simple default package is of interest, I will work on it.

Dirk

edd@max:/tmp$ rm -rf demo/
edd@max:/tmp$ mkdir demo && cd demo
edd@max:/tmp/demo$ R-devel.sh

R Under development (unstable) (2014-12-09 r67142) -- "Unsuffered Consequences"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

John Kane: I have 120 columns in a data.frame. I have one value in a column named "blaw" that I want to change. How do I find the coordinates?
Roger Koenker: It is the well-known wicked which problem: If you had (grammatically incorrectly) thought "... which I want to change" then you might have been led to type (in another window): ?which and you would have seen the light. Maybe that() should be an alias for which()?
   -- John Kane and Roger Koenker
      R-help (August 2006)

R> package.skeleton("quickDemo")
Creating directories ...
Creating DESCRIPTION ...
Creating NAMESPACE ...
Creating Read-and-delete-me ...
Saving functions and data ...
Making help files ...
Done.
Further steps are described in './quickDemo/Read-and-delete-me'.
R> q()
edd@max:/tmp/demo$ R-devel.sh CMD build quickDemo
* checking for file ‘quickDemo/DESCRIPTION’ ... OK
* preparing ‘quickDemo’:
* checking DESCRIPTION meta-information ... OK
* installing the package to process help pages
* saving partial Rd database
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* building ‘quickDemo_1.0.tar.gz’

edd@max:/tmp/demo$ R-devel.sh CMD check quickDemo_1.0.tar.gz
* using log directory ‘/tmp/demo/quickDemo.Rcheck’
* using R Under development (unstable) (2014-12-09 r67142)
* using platform: x86_64-unknown-linux-gnu (64-bit)
* using session charset: UTF-8
* checking for file ‘quickDemo/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘quickDemo’ version ‘1.0’
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘quickDemo’ can be installed ... WARNING
Found the following significant warnings:
  Warning: /tmp/Rtmp3vskac/Rbuild4e1945bdc52a/quickDemo/man/quickDemo-package.Rd:26: All text must be in a section
See ‘/tmp/demo/quickDemo.Rcheck/00install.out’ for details.
* checking installed package size ... OK
* checking package directory ... OK
* checking DESCRIPTION meta-information ... WARNING
Non-standard license specification:
  What license is it under?
Standardizable: FALSE
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... WARNING
prepare_Rd: quickDemo-package.Rd:26: All text must be in a section
* checking Rd metadata ... OK
* checking Rd cross-references ... WARNING
Unknown package
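The Rd problems in the log above can be looked at without a full build and check cycle. The following is a rough sketch, not the procedure used in the report: it builds a skeleton in a temporary directory from a throwaway environment (the function hello() is made up for the example) and runs tools::checkRd() over the generated help files. What checkRd() reports will depend on the R version, so no expected output is shown.

```r
# Work in a scratch directory so nothing is left behind in the cwd.
pkg_root <- tempfile("skeleton-demo-")
dir.create(pkg_root)

# package.skeleton() needs at least one object; supply a dummy function
# in a private environment rather than relying on the global workspace.
env <- new.env()
assign("hello", function() "hello", envir = env)

suppressMessages(
  package.skeleton("quickDemo", environment = env, path = pkg_root)
)

pkg_dir  <- file.path(pkg_root, "quickDemo")
rd_files <- list.files(file.path(pkg_dir, "man"),
                       pattern = "\\.Rd$", full.names = TRUE)

# Run the Rd checker on each generated help file; trap errors so that
# one malformed template does not hide the results for the others.
for (rd in rd_files) {
  cat("==", basename(rd), "==\n")
  result <- tryCatch(tools::checkRd(rd),
                     error = function(e) conditionMessage(e))
  print(result)
}
```

This only exercises the help-file templates; the DESCRIPTION licence warning in the log would still need 'R CMD check' to surface.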
Re: [Rd] Fwd: No source view when using gdb
On Thu, 2014-12-11 at 14:00 +0100, Pierrick Bruneau wrote:
> Dear R contributors,
>
> Say I want to debug some C code invoked through .Call(), say
> "varbayes" in the VBmix package. Following the instructions in
> "Writing R Extensions", I perform the following actions:
>
>   R -d gdb
>   run
>   library(VBmix)
>   CTRL+C
>   break varbayes
>   signal 0
>   mod <- varbayes(as.matrix(iris)[,1:4], 2)
>
> The breakpoint is indeed activated, seemingly at the correct position
> in the source file, but instead of the actual text at the respective
> line, I get the following:
>
>   69  varbayes.c: No such file or directory.
>
> Issuing "next" afterwards seems to attain the expected purpose (step
> by step progression), but source code lines are replaced by, e.g.:
>
>   72  in varbayes.c
>
> There should be some way of "installing" the source code files, but I
> did not find R-specific info there. Does someone have a clue for my
> problem?

This happens when you install a package from the tarball. The source is unpacked into a temporary directory which is then deleted, so gdb can no longer find the files. Try unpacking the source tarball yourself, and then installing from the unpacked directory, e.g.

  tar xfvz VBmix_0.2.17.tar.gz
  R CMD INSTALL VBmix

Martyn

> Thanks in advance,
>
> Pierrick

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Fwd: No source view when using gdb
Works like a charm, thanks!

On Thu, Dec 11, 2014 at 3:00 PM, Martyn Plummer plumm...@iarc.fr wrote:
> This happens when you install a package from the tarball. The source
> is unpacked into a temporary directory which is then deleted. Try
> unpacking the source tarball yourself, and then installing from the
> unpacked directory
>
> [...]
[Rd] SUGGESTION: Force install.packages() to use ASCII encoding when parse():ing code?
SUGGESTION:

Would it make sense if install.packages() and friends always use an "ascii"(*) encoding when parse():ing R package source code files?

I believe this should be safe, because R code files should be in ASCII [http://en.wikipedia.org/wiki/ASCII] and only in source-code comments may you use other characters. This is from Section 'Package subdirectories' in 'Writing R Extensions':

  Only ASCII characters (and the control characters tab, formfeed, LF and CR) should be used in code files. Other characters are accepted in comments, but then the comments may not be readable in e.g. a UTF-8 locale. Non-ASCII characters in object names will normally fail when the package is installed. Any byte will be allowed in a quoted character string but \u escapes should be used for non-ASCII characters. However, non-ASCII character strings may not be usable in some locales and may display incorrectly in others.

Since comments are dropped by parse(), their actual content does not matter, and the rest of the code should be in ASCII.

(*) It could be that the specific encoding "ascii" is not cross-platform. If so, is there another way to specify a pure ASCII encoding?

BACKGROUND:

If a user/system sets the 'encoding' option at startup, it may break package installations from source if the package has source code comments with non-ASCII characters. For example,

$ mkdir foo; cd foo
$ echo "options(encoding='UTF-8')" > .Rprofile
$ R --vanilla
> install.packages("R.oo", type="source")
Installing package into 'C:/Users/hb/R/win-library/3.2'
(as 'lib' is unspecified)
--- Please select a CRAN mirror for use in this session ---
trying URL 'http://cran.at.r-project.org/src/contrib/R.oo_1.18.0.tar.gz'
Content type 'application/x-gzip' length 394545 bytes (385 KB)
opened URL
downloaded 385 KB

* installing *source* package 'R.oo' ...
** package 'R.oo' successfully unpacked and MD5 sums checked
** R
Warning in parse(outFile) :
  invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/R.oo'
** inst
** preparing package for lazy loading
Warning in parse(n = -1, file = file, srcfile = NULL, keep.source = FALSE) :
  invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/R.oo'
** help
[...]

(This can be an extremely time-consuming task to troubleshoot, particularly if reported to a package maintainer who does not have access to the original system.)

FYI, setting it only in the session is alright:

> options(encoding="UTF-8")
> install.packages("R.oo", type="source")

because install.packages() launches a separate R process for the installation, and it is only there that the startup code becomes an issue.

TROUBLESHOOTING:

My understanding of the

  Warning in parse(n = -1, file = file, srcfile = NULL, keep.source = FALSE) :
    invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/

is that this happens when there is a non-ASCII character in one of the source-code comments (*) with a bit pattern matching a multi-byte UTF-8 sequence [http://en.wikipedia.org/wiki/UTF-8#Description]. For instance, consider a source code comment with an acute accent:

> raw <- as.raw(c(0x23, 0x20, 0xe9, 0x74, 0x75, 0x64, 0x69, 0x61, 0x6e, 0x74, 0x0a))
> writeBin(raw, con="foo.R")
> code <- readLines("foo.R")
> code
[1] "# étudiant"

> options(encoding="UTF-8")
> parse("foo.R")
Warning message:
In readLines(file, warn = FALSE) :
  invalid input found on input connection 'foo.R'

> options(encoding="ascii")
> parse("foo.R")
expression()

Reason for the invalid input: The bit pattern for raw[3:5] is:

> R.utils::intToBin(raw[3:5])
[1] "11101001" "01110100" "01110101"

The first byte (raw[3]) matches the special UTF-8 byte pattern 1110xxxx, which according to UTF-8 should be followed by two more bytes with bit patterns 10xxxxxx and 10xxxxxx [http://en.wikipedia.org/wiki/UTF-8#Description]. Since raw[4:5] does not match those, it's an invalid UTF-8 byte sequence. So, technically this does not happen for all comments using acute accents, but it's very likely. More generally, a multi-byte UTF-8 sequence is expected when byte pattern 11xxxxxx (>= 192 in decimal values) is encountered. Looking at http://en.wikipedia.org/wiki/ISO/IEC_8859, there are several characters with this bit pattern in many Latin-N encodings, which I'd assume are still in dominant use by many developers.

So, since options(encoding="UTF-8") was set at startup, that is also the encoding that R tries to follow. My suggestion is that R should be able to always use a pure-ASCII encoding when parsing R code in packages, because that is what 'Writing R Extensions' says we should use in the first place.

/Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
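The byte-level argument in the TROUBLESHOOTING section can be checked directly in base R, without R.utils. This is a sketch under the assumption that validUTF8() is available (it was added to base R after this thread, in R 3.3.0); the variable names are illustrative only.

```r
# The same bytes as in the example: "# étudiant\n" encoded in Latin-1.
bytes <- as.raw(c(0x23, 0x20, 0xe9, 0x74, 0x75, 0x64, 0x69,
                  0x61, 0x6e, 0x74, 0x0a))

# 0xe9 has the form 1110xxxx, i.e. it announces a 3-byte UTF-8 sequence.
is_three_byte_lead <- bitwAnd(as.integer(bytes[3]), 0xF0L) == 0xE0L

# A valid sequence needs two continuation bytes of the form 10xxxxxx,
# but the next two bytes (0x74, 0x75) are plain ASCII letters.
are_continuations <- bitwAnd(as.integer(bytes[4:5]), 0xC0L) == 0x80L

# Consequently the line is not valid UTF-8 as it stands ...
latin1_line <- rawToChar(bytes)
valid_as_is <- validUTF8(latin1_line)

# ... but becomes valid once it is converted from its real encoding.
utf8_line       <- iconv(latin1_line, from = "latin1", to = "UTF-8")
valid_converted <- validUTF8(utf8_line)

print(is_three_byte_lead)   # TRUE
print(are_continuations)    # FALSE FALSE
print(valid_as_is)          # FALSE
print(valid_converted)      # TRUE
print(charToRaw(utf8_line)[3:4])  # c3 a9: the two-byte UTF-8 form of é
```

A check along these lines could help a maintainer confirm that a reported "invalid input" warning comes from a Latin-1 comment rather than from the code itself.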
Re: [Rd] SUGGESTION: Force install.packages() to use ASCII encoding when parse():ing code?
On 11/12/2014 12:59 PM, Henrik Bengtsson wrote: SUGGESTION: Would it make sense if install.packages() and friends always use an ascii(*) encoding when parse():ing R package source code files? I think that would be a step backwards. It would be better to accept other encodings. As an English speaker this isn't a big deal to me, but users of other languages may want to have messages and variable names in their native language, and ASCII might not be enough for that. On the other hand, I think it's quite reasonable to require a declared encoding if anything other than ASCII is used, and possibly to fail for some encodings. It is probably also reasonable to at least warn when non-ASCII characters are used in strings in packages on CRAN, as many users can't display all characters. Duncan Murdoch I believe this should be safe, because R code files should be in ASCII [http://en.wikipedia.org/wiki/ASCII] and only in source-code comments you may use other characters. This is from Section 'Package subdirectories' in 'Writing R Extensions': Only ASCII characters (and the control characters tab, formfeed, LF and CR) should be used in code files. Other characters are accepted in comments, but then the comments may not be readable in e.g. a UTF-8 locale. Non-ASCII characters in object names will normally fail when the package is installed. Any byte will be allowed in a quoted character string but \u escapes should be used for non-ASCII characters. However, non-ASCII character strings may not be usable in some locales and may display incorrectly in others. Since comments are dropped by parse(), their actual content does not matter, and the rest of the code should be in ASCII. (*) It could be that the specific encoding ascii is not cross platforms. If so, is there another way to specify a pure ASCII encoding? 
> BACKGROUND: If a user/system sets the 'encoding' option at startup, it may break package installations from source if the package has source code comments with non-ASCII characters. For example,
>
> $ mkdir foo; cd foo
> $ echo "options(encoding='UTF-8')" > .Rprofile
> $ R --vanilla
> > install.packages("R.oo", type="source")
> install.packages("R.oo", type="source")
> Installing package into 'C:/Users/hb/R/win-library/3.2'
> (as 'lib' is unspecified)
> --- Please select a CRAN mirror for use in this session ---
> trying URL 'http://cran.at.r-project.org/src/contrib/R.oo_1.18.0.tar.gz'
> Content type 'application/x-gzip' length 394545 bytes (385 KB)
> opened URL
> downloaded 385 KB
> * installing *source* package 'R.oo' ...
> ** package 'R.oo' successfully unpacked and MD5 sums checked
> ** R
> Warning in parse(outFile) :
>   invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/R.oo'
> ** inst
> ** preparing package for lazy loading
> Warning in parse(n = -1, file = file, srcfile = NULL, keep.source = FALSE) :
>   invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/R.oo'
> ** help
> [...]
>
> (This can be an extremely time-consuming task to troubleshoot, particularly if reported to a package maintainer who does not have access to the original system.)
>
> FYI, setting it only in the session is alright:
>
> > options(encoding="UTF-8")
> > install.packages("R.oo", type="source")
>
> because install.packages() launches a separate R process for the installation, and it is only then that the startup code becomes an issue.
>
> TROUBLESHOOTING: My understanding of the
>
>   Warning in parse(n = -1, file = file, srcfile = NULL, keep.source = FALSE) : invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/
>
> is that this happens when there is a non-ASCII character in one of the source-code comments (*) with a bit pattern matching a multi-byte UTF-8 sequence [http://en.wikipedia.org/wiki/UTF-8#Description].
> For instance, consider a source code comment with an acute accent:
>
> > raw <- as.raw(c(0x23, 0x20, 0xe9, 0x74, 0x75, 0x64, 0x69, 0x61, 0x6e, 0x74, 0x0a))
> > writeBin(raw, con="foo.R")
> > code <- readLines("foo.R")
> > code
> [1] "# étudiant"
> > options(encoding="UTF-8")
> > parse("foo.R")
> Warning message:
> In readLines(file, warn = FALSE) :
>   invalid input found on input connection 'foo.R'
> > options(encoding="ascii")
> > parse("foo.R")
> expression()
>
> Reason for the invalid input: The bit pattern for raw[3:5] is:
>
> > R.utils::intToBin(raw[3:5])
> [1] "11101001" "01110100" "01110101"
>
> The first byte (raw[3]) matches the special UTF-8 byte pattern 1110xxxx, which according to UTF-8 should be followed by two more bytes with bit patterns 10xxxxxx and 10xxxxxx [http://en.wikipedia.org/wiki/UTF-8#Description]. Since raw[4:5] does not match those, it's an invalid UTF-8 byte sequence. So, technically this does not happen for all comments using acute accents, but it's very likely. More generally, a multi-byte UTF-8 sequence is expected whenever a byte with bit pattern 11xxxxxx (>= 192 in decimal) is encountered. Looking at http://en.wikipedia.org/wiki/ISO/IEC_8859, there are several characters with this bit pattern in many Latin-N encodings, which I'd assume is still in
[Rd] Significant memory leak when using XML on Windows
Dear list,

I'm sorry to keep coming back to this time and time again, but this bug is still not fixed even though the root cause of the issue has been around for 2-3 years now. And as the number of packages that depend on XML grows, I thought maybe this deserves some wider attention.

I did my best to make reproduction of the issue as easy as possible:

https://github.com/omegahat/XML/issues/4
http://goo.gl/aV17Lv

But as I'm not familiar with C, I'm out of ideas about what else to do. Duncan has been really dedicated and helpful so far, but unfortunately he seems to have too little time to really dig into this himself. So I thought I'd try to raise the attention of other developers who have the skills to fix this. Apparently, the issue is caused by the way the memory consumed by the underlying C objects/pointers is released (or not released, for that matter).

I'd very much appreciate it if someone could have a look at this. If I can be of any help whatsoever, please let me know!

Thanks and best regards,
Janko
Re: [Rd] SUGGESTION: Force install.packages() to use ASCII encoding when parse():ing code?
On Thu, Dec 11, 2014 at 10:47 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
> On 11/12/2014 12:59 PM, Henrik Bengtsson wrote:
>> SUGGESTION: Would it make sense if install.packages() and friends always use an "ascii" (*) encoding when parse():ing R package source code files?
>
> I think that would be a step backwards. It would be better to accept other encodings. As an English speaker this isn't a big deal to me, but users of other languages may want to have messages and variable names in their native language, and ASCII might not be enough for that.

Thanks for the feedback. While I probably agree with you that R packages should support source code encodings other than ASCII, that would require a change in the specifications and design. What I'm proposing is (just) an adjustment to the implementation to meet the current specs and design.

> On the other hand, I think it's quite reasonable to require a declared encoding if anything other than ASCII is used, and possibly to fail for some encodings. It is probably also reasonable to at least warn when non-ASCII characters are used in strings in packages on CRAN, as many users can't display all characters.

That would be a reasonable extension of the design, which would be backward compatible with the current design, i.e. if the encoding of the source code is not declared, then it is assumed to be ASCII.

Source code comments are special, because the current design ('Writing R Extensions') somehow leaves it open to use any type of encoding in them. Read liberally, it could even be that you can use different encodings for different comments in the same file (which is not unlikely to occur, considering cut'n'paste and open-source licenses). If other encodings are to be supported, then I see two ways forward:

1. Have R completely ignore what's in the comments (what follows # until the newline) such that encoding does not matter, or
2. require the same encoding for the source code comments as the rest of the code.
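Way forward 1 can be roughly approximated in user code today. The following is only a sketch under stated assumptions, not how install.packages() or parse() actually behave: the helper parse_as_ascii() is hypothetical, and for simplicity it replaces every non-ASCII byte with "?" before parsing, which also alters non-ASCII bytes inside string literals and not just those in comments.

```r
# Hypothetical helper: parse an R source file after neutralising all
# non-ASCII bytes, so that the encoding of comments cannot matter.
parse_as_ascii <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  # Replace every byte above 0x7f with '?' (0x3f). Caveat: this also
  # touches string literals, not only comments.
  bytes[as.integer(bytes) > 127L] <- as.raw(0x3f)
  parse(text = rawToChar(bytes), keep.source = FALSE)
}

# Demo file: a Latin-1 comment ("# étudiant") followed by "x <- 1".
tmp <- tempfile(fileext = ".R")
writeBin(as.raw(c(0x23, 0x20, 0xe9, 0x74, 0x75, 0x64, 0x69, 0x61, 0x6e,
                  0x74, 0x0a,
                  0x78, 0x20, 0x3c, 0x2d, 0x20, 0x31, 0x0a)),
         con = tmp)

exprs <- parse_as_ascii(tmp)
```

Because the comment is discarded by the parser anyway, the only expression left in exprs is the assignment. A real implementation would need to leave string literals alone, which requires tokenising the file rather than a blanket byte replacement.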
As I see it, today's design falls (could fall?) under 1, but the implementation does not go all the way to support it.

/Henrik

PS. It should be emphasized that this is about R packages. AFAIK, you can already source() code written in any encoding, e.g.

> raw <- as.raw(c(
+   0xcf, 0x80, 0x20, 0x3c, 0x2d, 0x20, 0x70, 0x69, 0x0a,
+   0x70, 0x72, 0x69, 0x6e, 0x74, 0x28, 0xcf, 0x80, 0x29, 0x0a
+ ))
> writeBin(raw, con="pi.R")
> source("pi.R", encoding="UTF-8")
[1] 3.141593

> Duncan Murdoch
>
>> I believe this should be safe, because R code files should be in ASCII [http://en.wikipedia.org/wiki/ASCII] and other characters may be used only in source-code comments. This is from Section 'Package subdirectories' in 'Writing R Extensions':
>>
>>   Only ASCII characters (and the control characters tab, formfeed, LF and CR) should be used in code files. Other characters are accepted in comments, but then the comments may not be readable in e.g. a UTF-8 locale. Non-ASCII characters in object names will normally fail when the package is installed. Any byte will be allowed in a quoted character string but \u escapes should be used for non-ASCII characters. However, non-ASCII character strings may not be usable in some locales and may display incorrectly in others.
>>
>> Since comments are dropped by parse(), their actual content does not matter, and the rest of the code should be in ASCII.
>>
>> (*) It could be that the specific encoding "ascii" is not cross-platform. If so, is there another way to specify a pure ASCII encoding?
>>
>> BACKGROUND: If a user/system sets the 'encoding' option at startup, it may break package installations from source if the package has source code comments with non-ASCII characters.
>> For example,
>>
>> $ mkdir foo; cd foo
>> $ echo "options(encoding='UTF-8')" > .Rprofile
>> $ R --vanilla
>> > install.packages("R.oo", type="source")
>> install.packages("R.oo", type="source")
>> Installing package into 'C:/Users/hb/R/win-library/3.2'
>> (as 'lib' is unspecified)
>> --- Please select a CRAN mirror for use in this session ---
>> trying URL 'http://cran.at.r-project.org/src/contrib/R.oo_1.18.0.tar.gz'
>> Content type 'application/x-gzip' length 394545 bytes (385 KB)
>> opened URL
>> downloaded 385 KB
>> * installing *source* package 'R.oo' ...
>> ** package 'R.oo' successfully unpacked and MD5 sums checked
>> ** R
>> Warning in parse(outFile) :
>>   invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/R.oo'
>> ** inst
>> ** preparing package for lazy loading
>> Warning in parse(n = -1, file = file, srcfile = NULL, keep.source = FALSE) :
>>   invalid input found on input connection 'C:/Users/hb/R/win-library/3.2/R.oo/R/R.oo'
>> ** help
>> [...]
>>
>> (This can be an extremely time-consuming task to troubleshoot, particularly if reported to a package maintainer who does not have access to the original system.)
>>
>> FYI, setting it only in the session is alright:
>>
>> > options(encoding="UTF-8")
>> > install.packages("R.oo", type="source")
>>
>> because install.packages() launches