Re: [Bioc-devel] differences between petty and perceval (OS X 10.6.8 build machines for release/devel)
Dear Dan, Martin and Nate,

Thank you for looking into it. I guess that is pointing to a problem within bowtie. It looks like the EXC_BAD_ACCESS you see on petty in ebwt.h is not reproducible on the other Mac or Linux machines we tried. Is it possible to run valgrind on petty? That may confirm or rule out whether the memory (de-)allocation issues reported on Linux are related.

I would like to submit a bug report to the bowtie developers, but am reluctant to do that without being able to reproduce the problem or test potential fixes. I would have the option to go through Rbowtie build cycles, but would have to rely on the assumption that petty will keep hitting this hiccup even with modified bowtie code. The minor differences between bowtie 1.0.1 and bowtie 1.0.1-bug-312 argue against that.

I am tempted to stay with the current situation:
- OS X before 10.9 needs to use Rbowtie = 1.4.4 (based on bowtie 1.0.1)
- OS X 10.9 onwards and everything else uses Rbowtie = 1.4.5 (based on bowtie 1.0.1 /patched bugs-312).

Thanks again for your efforts,
Michael

On 14.06.2014 01:31, Dan Tenenbaum wrote:

Hi Michael,

- Original Message -
From: Michael Stadler michael.stad...@fmi.ch
To: Dan Tenenbaum dtene...@fhcrc.org, bioc-devel@r-project.org
Sent: Friday, June 13, 2014 12:32:52 AM
Subject: differences between petty and perceval (OS X 10.6.8 build machines for release/devel)

Hi Dan,

I'm cc'ing the list; maybe somebody else has experienced differences between petty and perceval.

Rbowtie release (1.4.5) is not building under OS X 10.6.8 (petty). Rbowtie release (1.4.5) and development (1.5.5) are virtually identical (only DESCRIPTION and NEWS differ). The development version builds without problems on perceval, but the release version fails on petty:
http://bioconductor.org/checkResults/devel/bioc-LATEST/Rbowtie/perceval-buildsrc.html
http://bioconductor.org/checkResults/release/bioc-LATEST/Rbowtie/petty-buildsrc.html

The only difference I can make out from the node info pages is that perceval has an additional section on the C++11 compiler that is lacking from petty's NodeInfo page.

Unfortunately, I cannot reproduce the issue: both Rbowtie 1.4.5 and 1.5.5 build successfully under OS X 10.6.8 and 10.7.5 using llvm-gcc-4.2.

Do you have any idea what else could be different between petty and perceval?

Martin and Nate and I took a look at this. I managed to come up with a bowtie command line that would reliably reproduce the segfault on petty. Then we ran that under gdb (and turned off compiler optimizations) and came up with this, which may or may not help you:

petty:vignettes biocbuild$ gdb --args '/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rbowtie/bowtie' -y -S -k 10 -m 10 -v 2 -r -p 4 --best --strata 'doit/refsIndex/index' 'doit/SpliceMapTemp_876c378e20ac/25mers.map' 'doit/SpliceMapTemp_876c378e20ac/25mers.map_unsorted'
GNU gdb 6.3.50-20050815 (Apple version gdb-1708) (Mon Aug 15 16:03:10 UTC 2011)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries ... done

(gdb) run
Starting program: /Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rbowtie/bowtie -y -S -k 10 -m 10 -v 2 -r -p 4 --best --strata doit/refsIndex/index doit/SpliceMapTemp_876c378e20ac/25mers.map doit/SpliceMapTemp_876c378e20ac/25mers.map_unsorted
Reading symbols for shared libraries ++. done

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x23d0d92d
[Switching to process 36144 thread 0x20f]
0x000478b1 in Ebwt<seqan::String<seqan::SimpleType<unsigned char, seqan::_Dna>, seqan::Alloc<void> > >::rowL (this=0xbfffda10, l=@0xa300e14) at ebwt.h:1816
1816        return unpack_2b_from_8b(l.side(this->_ebwt)[l._by], l._bp);
(gdb) l
1811    inline int Ebwt<TStr>::rowL(const SideLocus& l) const {
1812        // Extract and return appropriate bit-pair
1813    #ifdef SIXTY4_FORMAT
1814        return (((uint64_t*)l.side(this->_ebwt))[l._by >> 3] >> ((((l._by & 7) << 2) + l._bp) << 1)) & 3;
1815    #else
1816        return unpack_2b_from_8b(l.side(this->_ebwt)[l._by], l._bp);
1817    #endif
1818    }
1819
1820    /**
(gdb) p this->_ebwt
$1 = (uint8_t *) 0x4804a00 "\b2"
(gdb) p *this->_ebwt
$2 = 8 '\b'
(gdb) p l._by
$3 = 45
(gdb) p l.side
$4 = SideLocus::side(unsigned char const*) const
(gdb) p l.side(this->_ebwt)
$5 = (uint8_t *) 0x23d0d900 <Address 0x23d0d900 out of bounds>
(gdb) p l.side(this->_ebwt)[l._by]
Cannot access memory at address 0x23d0d92d
(gdb) p this->_ebwt
$6 = (uint8_t *) 0x4804a00 "\b2"
(gdb)

Running
[Bioc-devel] question about affy::plotLocation
Dear BiocDevelR!

I work a lot with the excellent *affy package* of Rafael A. Irizarry, and I find it very useful. I have had a slightly strange experience with its *plotLocation function*: it seems *I have to mirror the Y coordinates* to plot properly. Perhaps this is because CEL file reading starts from the top, while plotting starts from the bottom. I'm not sure if I'm right; can you check that I haven't made a mistake? If I am right, I suggest a (simple) solution for this.

I attach two plots made from a GEO GSM CEL file (see script). In the first, I plotted all gene names (ProbeSets) on the CEL file image; in the second, I plotted them after mirroring the Y coordinates. As you can see, in the raw plot there are points on the chip name (printed by BioB spots).

I attach my plotting script too, and a potential correction for affy::plotLocation. (I've tried it, and it seems good.)

Yours sincerely,
Kristóf Jakab

I've linked 2 files to this email:
geo_testing_spot_locations_mirrored.png https://www.box.com/shared/ow3q5sn3fpmyz3u8w533 (6.0 MB)
geo_testing_spot_locations_raw.png https://www.box.com/shared/3sj9i3lpkixkq85qar0r (6.1 MB)

___ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Re: [Bioc-devel] question about affy::plotLocation - scripts
It seems I can't send attachments, so I copy the code here.

test_plotLocation_affy.R

    #!/usr/bin/env Rscript
    # kristof.ja...@hegelab.org

    # MAKE AFFYBATCH
    #--
    # download CEL file
    library(GEOquery)
    getGEOSuppFiles("GSM229005")
    #--
    # read CEL file
    library(affy)
    geoS <- ReadAffy(filenames=paste("GSM229005", "GSM229005.CEL.gz", sep="/"))

    # PLOTTING TO PNG
    #--
    # raw
    png(filename="geo_testing_spot_locations_raw.png", height=744*10, width=744*10, res=1200)
    ## image (log scale intensities)
    image(geoS, transfo=log)
    ## perfect matches
    l <- indexProbes(geoS, which="pm", geneNames(geoS))
    lapply(l, function(li) {
      xy <- indices2xy(li, abatch=geoS)
      plotLocation(xy, col="tomato", pch=18, cex=0.075)
    })
    ## mismatches
    l <- indexProbes(geoS, which="mm", geneNames(geoS))
    lapply(l, function(li) {
      xy <- indices2xy(li, abatch=geoS)
      plotLocation(xy, col="aquamarine", pch=18, cex=0.075)
    })
    dev.off()
    #--
    # mirrored
    png(filename="geo_testing_spot_locations_mirrored.png", height=744*10, width=744*10, res=1200)
    ## image (log scale intensities)
    image(geoS, transfo=log)
    ## perfect matches
    l <- indexProbes(geoS, which="pm", geneNames(geoS))
    lapply(l, function(li) {
      xy <- indices2xy(li, abatch=geoS)
      xy <- cbind(x=xy[,1], y=(743-xy[,2]))  # mirroring
      plotLocation(xy, col="tomato", pch=18, cex=0.075)
    })
    ## mismatches
    l <- indexProbes(geoS, which="mm", geneNames(geoS))
    lapply(l, function(li) {
      xy <- indices2xy(li, abatch=geoS)
      xy <- cbind(x=xy[,1], y=(743-xy[,2]))  # mirroring
      plotLocation(xy, col="aquamarine", pch=18, cex=0.075)
    })
    dev.off()

correction_for_plotLocation.R

    plotLocation <- function(x, col="green", pch=22, ...) {
      if (is.list(x)) {
        x <- cbind(unlist(lapply(x, function(x) x[,1])),
                   unlist(lapply(x, function(x) x[,2])))
      }
      points(x[,1], 743-x[,2]  # mirroring 744×744 matrix, numbered from 0 to 743
             , pch=pch, col=col, ...)
    }

On 06/16/2014 10:59 AM, Kristóf Jakab wrote:
[...]
Re: [Bioc-devel] question about affy::plotLocation - scripts
Hi Kristóf,

On 6/16/2014 10:20 AM, Kristóf Jakab wrote:
It seems I can't send attachments, I copy the codes here.
[...]

Thanks for pointing this out. It's apparent almost nobody ever uses this code, as it has been in the affy package since pretty much the beginning (2002), and you are the first to notice this.

Unfortunately, hard-coding the number of rows isn't the answer, since Affy arrays have different dimensions. Probably the best fix is to add an additional required argument 'affybatch' that we can use to extract the chip dimensions from.

Best,
Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
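
Jim's point about taking the chip height from the data instead of hard-coding 743 could be sketched roughly as below. This is only an illustration in base R: mirror_y, n_rows and offset are hypothetical names and not part of affy, and the zero-based coordinates are an assumption taken from Kristóf's comment ("numbered from 0 to 743"). In an actual fix the row count would presumably be read from the AffyBatch passed in as the extra argument.

```r
# Hypothetical helper, not part of affy: flip the y coordinates so that
# the first chip row ends up at the top of the plot.
#   xy     : two-column matrix of (x, y) probe coordinates
#   n_rows : number of rows on the chip (744 for the GSM229005 example)
#   offset : smallest valid coordinate (assumed 0; use 1 for one-based data)
mirror_y <- function(xy, n_rows, offset = 0) {
  xy <- as.matrix(xy)
  cbind(x = xy[, 1], y = (n_rows - 1 + 2 * offset) - xy[, 2])
}

# Small made-up coordinate set on a 744-row chip
xy <- cbind(x = c(0, 10, 743), y = c(0, 100, 743))
flipped <- mirror_y(xy, n_rows = 744)
```

Keeping the row count as a parameter means the same helper would work for arrays of other sizes, which is the objection raised against the constant 743.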
[Bioc-devel] filterVcf: why require a filter?
Hi,

I was trying to use filterVcf just to filter a VCF by a range, via 'which' in ScanVcfParam, without any filters, but it failed with:

Error in filterVcf(tbx, genome = genome, destination = destination, ..., (from #2) :
  no 'prefilters' or 'filters' specified

Why not allow the identity, i.e., the case where the filter is inherent in the restricted query?

Michael
Re: [Bioc-devel] evaluation of C post-increments changed in GCC 4.8.2
hi Nathaniel, cc Dan,

thanks a lot for clearing up the entire story completely. I'm afraid that one or two cycles ago in our conversation i did a simple reply instead of a reply-all, and the bioc-devel list was no longer included in the recipients of these emails. since what you say below sounds like a relevant piece of information for anyone working with C code, i'm cc'ing the bioc-devel list again.

cheers,
robert.

On 6/16/14 11:15 PM, Nathaniel Hayden wrote:

Hi, Robert. You are correct. zin2 and petty failed to emit warnings for the problematic code. After some digging we discovered that for gcc, any optimization level above 0 prevents emission of the -Wsequence-point warning in this case. But the optimizations must stay for production code.

As a follow-up to the earlier recommendations about flags to use during package development, we have added content to the Package Guidelines page on our website: http://www.bioconductor.org/developers/package-guidelines/#c-code

The failure of some build machines to emit the warning under production conditions underscores the importance of the original recommendation to enable as many warnings as possible during development. Thanks for bringing it up!

Nate

On Mon 16 Jun 2014 07:42:36 AM PDT, Robert Castelo wrote:

hi Nathaniel,

On 06/14/2014 01:01 AM, Nathaniel Hayden wrote:

Hi, Robert. You're welcome. It sounds like something isn't happening, but you think it should. Could you be more precise about what you expect to happen (the conditions that *should* lead to the warning, but do not)? There are lots of variables floating around:

- devel or release? (I see similar commits to devel and release so unclear which I should look at; current devel version looks like it fails before it has a chance to give the warning.)

yes, this was an unrelated error, which actually Dan warned me about and for which i sent a fix yesterday. the situation i was describing was occurring in both devel and release, but both are fixed by now.

- it sounds like you're talking about a Mavericks machine in the Bioc build system; can you confirm which one? Both the devel and release Mavericks build machines use clang, and both linux machines (zin1/zin2) use gcc with -Wall.

so, for instance, the release version from VariantFiltering 1.0.1 was giving these warnings i was talking about:

Found the following significant warnings:
  methods-WeightMatrix.c:256:19: warning: unsequenced modification and access to 'q' [-Wunsequenced]
  methods-WeightMatrix.c:638:17: warning: unsequenced modification and access to 'q' [-Wunsequenced]

*only* in the R CMD check from 'morelia' and not from 'petty' or 'zin2', while all three machines in principle have the -Wall option activated. currently, because i submitted the fix, version 1.0.2 does not give these warnings anymore. however, i have just committed a new version to the release branch, 1.0.3, that has this problem back in line 256:

    while ((*q++=tolower(*q)));

and should recreate the odd situation i saw, that only 'morelia' warns about this line, but not 'petty' or 'zin2'.

thanks!
robert.

Thanks,
Nate

On 06/13/2014 12:54 AM, Robert Castelo wrote:

hi Nathaniel,

thanks for the very clear examples. after all, probably it is just my package which may have this problem. one further question below..

On 06/12/2014 07:12 PM, Nathaniel Hayden wrote:

Hi, Robert. C++ is my area so I can't speak as knowledgeably about C, [...] I confirm that using your test file gcc 4.6.3 indeed warns about unsequenced shenanigans with -Wall: 'warning: operation on ‘p’ may be undefined [-Wsequence-point]'. I would add it's also a good idea during the development cycle to use the -Wextra and -pedantic flags. (You can read about them here: http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html)

the strange thing is that the only machine in the building pipeline of BioC that warned about this in my package was the one running Mac OSX Mavericks with gcc 4.8.2, and not also the Linux zin2, which is running gcc 4.6.3. you can see it if the 1.0.1 version of VariantFiltering is still at the check report. anyway, i'll use those options during development and that should spare me this kind of problem in the future.

thanks!
robert.
Re: [Rd] index.search
Adrian Dușa dusa.adr...@unibuc.ro
    on Mon, 16 Jun 2014 08:33:59 +0300 writes:

On Mon, Jun 16, 2014 at 6:37 AM, Gabriel Becker gmbec...@ucdavis.edu wrote:
[...]
You can. This is valid R source, so the parser will understand it:

    expr = parse(text = example(deMorgan, package="QCA", give.lines=TRUE))

You can then evaluate some or all of that expression using either R's own eval function or, e.g., Hadley Wickham's evaluate package (for your particular use case evaluate will be easier, I think).

Oh, I see...! In that case I can use it, of course. I did install the evaluate package, although one would expect some better documentation (no examples at all, especially for the main evaluate function).

[...]
index.search is an unexported function, which means that it is subject to change in how it behaves without notice or even externally available reasons. You can get it via :::, but again, it's really not the right tool here, and not safe to use in general in code you expect to keep working.

Yes, I figured that much. Of course it's not meant to be used in any decently working code, but I learn heavily by simply looking at these sorts of (hidden) R functions.

Thanks again,
Adrian

Apropos "not the right tool": I'm a bit astonished that nobody mentioned the fact that R already provides the tool to automatically compare all example outputs with a previous version (of the package's example outputs).

*THE* manual (which every package writer should know about, re-read/browse about once a year, and search for such questions), Writing R Extensions, section "Package subdirectories" (e.g. on the CRAN master in Vienna, http://cran.r-project.org/doc/manuals/R-exts.html#Package-subdirectories ) says

|If directory 'tests' has a subdirectory 'Examples' containing a file
|'PKG-Ex.Rout.save', this is compared to the output file for running the
|examples when the latter are checked.

So: after an 'R CMD check PKG' you only need to take and keep the PKG-Ex.Rout file that is produced (in the PKG.Rcheck/ directory), and save it into PKG/tests/Examples/PKG-Ex.Rout.save, and from then on, every time you run R CMD check PKG the comparison will be made.

Martin Maechler, ETH Zurich

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
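
The copy step described above could be scripted along these lines. This is a minimal sketch, not an official tool: save_example_output and its arguments are hypothetical names, and the target location follows the quoted manual passage (a subdirectory 'Examples' under 'tests').

```r
# Hypothetical helper: copy the example output produced by 'R CMD check'
# into the package source tree as the reference file for later checks.
#   pkg       : package name, e.g. "QCA"
#   pkg_dir   : path to the package source directory
#   check_dir : path to the PKG.Rcheck directory written by R CMD check
save_example_output <- function(pkg, pkg_dir = pkg,
                                check_dir = paste0(pkg, ".Rcheck")) {
  produced <- file.path(check_dir, paste0(pkg, "-Ex.Rout"))
  if (!file.exists(produced)) {
    stop("no example output found at ", produced)
  }
  target_dir <- file.path(pkg_dir, "tests", "Examples")
  dir.create(target_dir, recursive = TRUE, showWarnings = FALSE)
  target <- file.path(target_dir, paste0(pkg, "-Ex.Rout.save"))
  file.copy(produced, target, overwrite = TRUE)
  invisible(target)
}
```

Run from the directory that holds both the package source and the .Rcheck directory, a call such as save_example_output("QCA") would then refresh the reference output after each deliberate change to the examples.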
Re: [Rd] index.search
Oh my... this is so simple, why didn't I think of that...?

Thanks a lot Martin, beautiful,
Adrian

On Mon, Jun 16, 2014 at 10:32 AM, Martin Maechler maech...@stat.math.ethz.ch wrote:
[...]

--
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
1, Schitu Magureanu Bd.
050025 Bucharest sector 5
Romania
Tel.:+40 21 3126618 \ +40 21 3120210 / int.101
Fax: +40 21 3158391
Re: [Rd] index.search
Thanks for the great insight. I love that there's always something else to learn in R.

Brian Lee Yung Rowe
Founder, Zato Novo
Professor, M.S. Data Analytics, CUNY

On Jun 16, 2014, at 3:34 AM, Martin Maechler maech...@stat.math.ethz.ch wrote:
[...]
Re: [Rd] index.search
On 16/06/2014 3:32 AM, Martin Maechler wrote:
[...]

It's also worth mentioning that there is something similar to test for changes to vignettes: If there is a target output file .Rout.save in the vignette source directory, the output from running the code in that vignette is compared with the target output file and any differences are reported (but not recorded in the log file).

The slightly surprising thing is that R CMD check doesn't produce vignette.Rout; the file that is compared to vignette.Rout.save is vignette.log.

Duncan Murdoch
Re: [Rd] index.search
On Mon, Jun 16, 2014 at 10:32 AM, Martin Maechler maech...@stat.math.ethz.ch wrote:
[...]
Apropos not the right tool. I'm a bit astonished that nobody mentioned the fact R already provides the tool to automatically compare all example outputs with a previous version (of the packages example outputs):

As appealing as this is, while trying to figure out a solution of my own (until Martin's email), I think I've succeeded in creating a rather useful function which allows fine-grained control over each and every line of code in the examples sections:

#
helpfiles <- c("allExpressions", "calibrate", "createMatrix", "deMorgan",
               "demoChart", "eqmcc", "factorize", "findSubsets",
               "findSupersets", "findTh", "getRow", "pof", "solveChart",
               "superSubset", "truthTable")

testQCAmaybe <- function() {
    results <- vector(mode="list", length=length(helpfiles))
    names(results) <- helpfiles
    for (i in seq(length(helpfiles))) {
        Rdfile <- file.path(find.package("QCA"), paste(helpfiles[i], ".Rd", sep=""))
        commands <- parse(text=capture.output(tools::Rd2ex(Rdfile)))
        results[[i]] <- vector(mode="list", length=length(commands))
        names(results[[i]]) <- commands
        for (j in seq(length(commands))) {
            results[[i]][[j]] <- suppressWarnings(capture.output(eval(commands[j])))
        }
    }
    return(results)
}
#

Using all.equal(), over the entire list or sequentially over parts of it, quickly identifies sources of difference.

I hope this helps anyone,
Adrian

--
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
1, Schitu Magureanu Bd.
050025 Bucharest sector 5
Romania
Tel.:+40 21 3126618 \ +40 21 3120210 / int.101
Fax: +40 21 3158391
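
The all.equal() comparison mentioned at the end of this message might look roughly like the sketch below. It assumes two result lists shaped like the return value of testQCAmaybe() (one element per help file, each holding the captured output); changed_examples and the toy data are hypothetical and only there to make the sketch runnable without QCA installed.

```r
# Hypothetical helper: name the help files whose captured example output
# differs between two runs (e.g. before and after changing the package).
changed_examples <- function(old, new) {
  topics <- union(names(old), names(new))
  differs <- vapply(topics, function(topic) {
    # a topic missing from one run yields NULL and so counts as different
    !isTRUE(all.equal(old[[topic]], new[[topic]]))
  }, logical(1))
  topics[differs]
}

# Toy stand-ins for two runs of testQCAmaybe()
old <- list(deMorgan = list("[1] 1"), pof = list("[1] 2"))
new <- list(deMorgan = list("[1] 1"), pof = list("[1] 3"))
changed <- changed_examples(old, new)
```

Once a help file is flagged, the same all.equal() call can be applied to its individual elements to find the offending command.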
Re: [Rd] what is the current correct repos structure for mac osx binaries?
Dear R-devel,

Apologies for the confusing typo in the reported paths in my previous question, and thanks to Simon for providing the answer: the default repository type on the mac is now mac.binary.mavericks, not mac.binary as the docs for install.packages state. Perhaps the docs for install.packages could be updated to something like:

...
type    character, indicating the type of package to download and install. Possible values are (currently) source, mac.binary.BUILD_NAME and win.binary. The BUILD_NAME on OSX is determined internally by ???.
...

I'm still not quite clear how the CRAN-like repository should be structured for OSX. CRAN seems to include .tgz packages in both
http://cran.r-project.org/bin/macosx/contrib/3.1/
and
http://cran.r-project.org/bin/macosx/mavericks/contrib/3.1/

The directory contents are not identical, but both include packages built as recently as today. Is bin/macosx/contrib/3.1/ a snowleopard build? Do I need to maintain two directories as well? It seems that if I put my packages in http://foo/bin/macosx/contrib/3.1/ the mavericks machines won't find them, but if I put the packages in http://foo/bin/macosx/mavericks/contrib/3.1/ people with the snowleopard build won't find them. Perhaps this is the desired behavior if the mavericks binaries are not snowleopard compatible?

thanks again for your help,
-skye

On 06/13/2014 05:22 PM, Simon Urbanek wrote:

On Jun 13, 2014, at 5:41 PM, Skye Bender-deMoll skyeb...@skyeome.net wrote:

Dear R-developers,

As part of our package building process, we maintain internal CRAN-like repositories of our packages. This has worked pretty well, but we are running into issues with R 3.1 and OSX mavericks. Specifically, machines with osx mavericks seem to, by default, expect packages to be located under a 'mavericks' sub-directory, but this is not the location reported when generating a mac.binary appropriate contrib url.

contrib.url('foo')
[1] "foo/bin/macosx/mavericks/contrib/3.1/"

If I ask where the mac binaries are on a linux machine (AND on mac mavericks machines) I get

contrib.url('foo',type='mac.binary')
[1] "foo/bin/macosx/mavericks/contrib/3.1/"

I don't think that is true. On all machines (Linux, OS X, ...) I get

contrib.url('foo', type='mac.binary')
[1] "foo/bin/macosx/contrib/3.1"

Note that the type for the mavericks build is mac.binary.mavericks, so on all machines you also get

contrib.url('foo',type='mac.binary.mavericks')
[1] "foo/bin/macosx/mavericks/contrib/3.1"

The only difference is in the defaults for pkgType: they differ by build, but the repo structure is fixed and consistent across all platforms.

Cheers,
Simon

But the OSX machine gives an error and fails to locate the packages if they are located at foo/bin/macosx/contrib/3.1/

So where are the mac binaries supposed to be located in a CRAN-like repository so that they can be installed on a mac with the default install command? And is there a way for a non-mac machine (i.e. our linux deploy server) to determine that directory other than contrib.url(, type='mac.binary')?

thanks for your help,
-skye
[Rd] model.frame and parent environment
Someone has reported a problem with predict.coxph that I can't seem to solve. The underlying issue is with model.frame.coxph; the same issue is also found in lm, so I'll use that for the example.

--
test <- data.frame(y = 1:10 + runif(10), x=1:10)
myfun <- function(formula, nd) {
    fit <- lm(formula, data=nd, model=FALSE)
    model.frame(fit)
}

myfun(test)
Error in is.data.frame(data): object 'nd' not found

1. The key line, in both model.frame.coxph and model.frame.lm, is
      eval(fcall, env, parent.frame())
and it appears (at least to me) that the parent.frame() part of this is effectively ignored when fcall is itself a reference to model.frame. I'd like to understand this better.

2. The modeling functions coxph and survreg in the survival package default to model=FALSE, originally in mimicry of lm and glm; I don't know when R changed the default to model=TRUE for lm and glm. One possible response to my question would be advice to change my routine's defaults too. I'm somewhat reluctant since I work with a few very large data sets, but would entertain that discussion as well. I'd still like to understand how model.frame could be made to work under the current regimen.

Terry Therneau
Re: [Rd] what is the current correct repos structure for mac osx binaries?
On Jun 16, 2014, at 1:18 PM, Skye Bender-deMoll skyeb...@skyeome.net wrote:

> Dear R-devel,
>
> Apologies for the confusing typo in the reported paths my previous question, thanks to Simon for providing the answer that the default repository type on the mac is now mac.binary.mavericks not mac.binary as the docs for install.packages state.

That is incorrect. The default varies by the distribution you use. For the regular binary based on 10.6+ it is mac.binary. For the special Mavericks distribution it is mac.binary.mavericks.

> Perhaps the docs for install packages could be updated something like:
>
> ...
> type    character, indicating the type of package to download and install Possible values are (currently) "source", "mac.binary.BUILD_NAME" and "win.binary". The BUILD_NAME on OSX is determined internally by ???.
> ...
>
> I'm still not quite clear how the CRAN-like repository should be structured for OSX. CRAN seems to include .tgz packages in both
>
> http://cran.r-project.org/bin/macosx/contrib/3.1/
> and
> http://cran.r-project.org/bin/macosx/mavericks/contrib/3.1/
>
> The directory contents are not identical, but both include packages built as recently as today. Is bin/macosx/contrib/3.1/ a snowleopard build?
>
> Do I need to maintain two directories as well? It seems like if I put my packages in http://foo/bin/macosx/contrib/3.1/ the mavericks machines won't find them. But if I put the packages in http://foo/bin/macosx/mavericks/contrib/3.1/ people with the snowleopard build wont find them. Perhaps this is the desired behavior if the mavericks binaries are not snowleopard compatible?

Yes, the Mavericks build is incompatible with the Snow Leopard build; that's why there are two separate distributions and two separate repositories.
Cheers,
Simon

> thanks again for your help,
> -skye
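To make the two-repository layout described in this thread concrete, here is a small sketch that a non-Mac deploy server could use to compute both binary directories. The helper name mac_contrib_dirs is hypothetical (it is not part of R), and it simply hard-codes the R 3.1 era paths that Simon reports from contrib.url(); on a real server, calling contrib.url() with type = "mac.binary" and type = "mac.binary.mavericks" is the authoritative source, and the paths may differ in other R versions.

```r
# Hypothetical helper (not part of R): builds the two binary contrib
# directories reported in this thread for a given repository root.
# Assumption: the layout is the R 3.1 one quoted from contrib.url() above.
mac_contrib_dirs <- function(repos, r_version = "3.1") {
  c(
    # Snow Leopard (10.6+) build, pkgType "mac.binary"
    mac.binary = paste(repos, "bin", "macosx", "contrib", r_version, sep = "/"),
    # Mavericks build, pkgType "mac.binary.mavericks"
    mac.binary.mavericks = paste(repos, "bin", "macosx", "mavericks",
                                 "contrib", r_version, sep = "/")
  )
}

dirs <- mac_contrib_dirs("http://foo")
print(dirs)
```

Since the two builds are incompatible, a repository serving both kinds of Mac clients would need packages built for each distribution placed in the corresponding directory.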
Re: [Rd] model.frame and parent environment
On 16/06/2014 19:35, Therneau, Terry M., Ph.D. wrote:
> Someone has reported a problem with predict.coxph that I can't seem to solve. The underlying issue is with model.frame.coxph; the same issue is also found in lm so I'll use that for the example.
>
> --
> test <- data.frame(y = 1:10 + runif(10), x=1:10)
> myfun <- function(formula, nd) {
>     fit <- lm(formula, data=nd, model=FALSE)
>     model.frame(fit)
> }
>
> myfun(test)
> Error in is.data.frame(data): object 'nd' not found

You have specified formula = test and given no value for nd. Is that really what you intended? It is undocumented that it works for lm().

> 1. The key line, in both model.frame.coxph and model.frame.lm is
>       eval(fcall, env, parent.frame())
> and it appear (at least to me) that the parent.frame() part of this is effectively ignored when fcall is itself a reference to model.frame. I'd like to understand this better.

Way back (ca R 1.2.0) an advocate of lexical scoping changed model.frame.lm to refer to an environment not a data frame for 'env'. That pretty fundamental change means that your sort of example is not a recommended way to do this: you are mixing scoping models.

> 2. The modeling functions coxph and survreg in the survival default to model=FALSE, originally in mimicry of lm and glm; I don't know when R changed the default to model=TRUE for lm and glm. One possible response

I am not sure R ever did: model = TRUE was the default 16 years ago at the beginning of the CVS/SVN archive.

> to my question would be advice to change my routine's defaults too. I'm somewhat reluctant since I work with a few very large data sets, but would entertain that discussion as well. I'd still like to understand how model.frame could be made to work under the current regimen.

For smaller problems using model = TRUE is the most robust solution. As the components of the model frame can be changed after fitting, there is no way to guarantee to recreate the model frame, so to be sure you need to store it.

If you called myfun(y ~ x, test) it will look for 'nd' in the global environment, the environment of the formula. One way to get that to work more often is something like

myfun <- function(formula, nd) {
    qnd <- substitute(nd)
    fit <- lm(formula, data=nd, model=FALSE)
    fit$call$data <- qnd
    model.frame(fit)
}

so it looks for the value of 'nd' instead.

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
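The behaviour discussed in this reply can be checked with a short, self-contained sketch. It puts the original function, the substitute() workaround and the model = TRUE alternative side by side. The names myfun_fixed and myfun_stored are made up for this illustration, and the sketch assumes that the data frame test and the formula y ~ x are created in the same environment, as they would be at the top level of an interactive session.

```r
test <- data.frame(y = 1:10 + runif(10), x = 1:10)

# Original version: the stored call refers to `nd`, which only exists
# inside myfun, so re-evaluating the call later cannot find it.
myfun <- function(formula, nd) {
  fit <- lm(formula, data = nd, model = FALSE)
  model.frame(fit)
}
broken <- tryCatch(myfun(y ~ x, test), error = function(e) e)

# Workaround from the reply: record the caller's expression for `nd`
# (here the symbol `test`) in the stored call.
myfun_fixed <- function(formula, nd) {
  qnd <- substitute(nd)
  fit <- lm(formula, data = nd, model = FALSE)
  fit$call$data <- qnd
  model.frame(fit)
}
mf_fixed <- myfun_fixed(y ~ x, test)

# Alternative from the reply: keep the model frame in the fit, so
# model.frame() returns the stored copy instead of rebuilding it.
myfun_stored <- function(formula, nd) {
  fit <- lm(formula, data = nd, model = TRUE)
  model.frame(fit)
}
mf_stored <- myfun_stored(y ~ x, test)

print(inherits(broken, "error"))
print(dim(mf_fixed))
print(dim(mf_stored))
```

As the reply notes, the workaround only helps "more often": it still depends on the recorded expression being resolvable from the environment of the formula, whereas storing the model frame avoids re-evaluation altogether at the cost of memory.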
Re: [Rd] model.frame and parent environment
I had a typo in the prior example when transcribing from R to the message (the call to myfun). My apologies for that. Corrected message below.

Someone has reported a problem with predict.coxph that I can't seem to solve. The underlying issue is with model.frame.coxph; the same issue is also found in lm, so I'll use that for the example.

--
test <- data.frame(y = 1:10 + runif(10), x=1:10)
myfun <- function(formula, nd) {
    fit <- lm(formula, data=nd, model=FALSE)
    model.frame(fit)
}

myfun(y ~ x, test)
Error in is.data.frame(data): object 'nd' not found

1. The key line, in both model.frame.coxph and model.frame.lm, is
      eval(fcall, env, parent.frame())
and it appears (at least to me) that the parent.frame() part of this is effectively ignored when fcall is itself a reference to model.frame. I'd like to understand this better.

2. The modeling functions coxph and survreg in the survival package default to model=FALSE, originally in mimicry of lm and glm; I don't know when R changed the default to model=TRUE for lm and glm. One possible response to my question would be advice to change my routine's defaults too. I'm somewhat reluctant since I work with a few very large data sets, but would entertain that discussion as well. I'd still like to understand how model.frame could be made to work under the current regimen.

Terry Therneau
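On question 1, one relevant fact from the documentation of eval() is that its third argument (enclos) is only consulted when the second argument (envir) is a list, data frame or pairlist; when envir is already an environment, enclos is not used. The sketch below illustrates just that rule with made-up names (lookup_env, caller, local_value); it is not the code of model.frame.lm.

```r
# An environment that does not contain `local_value`
lookup_env <- new.env(parent = baseenv())

caller <- function() {
  local_value <- 42
  expr <- quote(local_value)

  # envir is an environment: enclos (this function's frame) is ignored,
  # so `local_value` is not found
  as_env <- tryCatch(eval(expr, lookup_env, environment()),
                     error = function(e) e)

  # envir is a list: names missing from the list are looked up in enclos,
  # so `local_value` is found in this function's frame
  as_list <- eval(expr, list(other = 1), environment())

  list(as_env = as_env, as_list = as_list)
}

out <- caller()
print(inherits(out$as_env, "error"))
print(out$as_list)
```

If env in the key line is an environment, as the reply in this thread indicates, this rule would explain why the parent.frame() argument appears to have no effect.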