Re: [Bioc-devel] error when installing 'Rsamtools', 'GenomicAlignments' and 'rtracklayer'
Hi Herve,

Thanks for your reply. I updated my R to yesterday's build and it works great. Looks like it was a problem with my old build of R-devel (3.1.0).

Yours sincerely,

Jianhong Ou
LRB 670A
Program in Gene Function and Expression
364 Plantation Street
Worcester, MA 01605

On 12/13/13 2:50 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:
> Hi Jianhong,
>
> On 12/13/2013 11:45 AM, Ou, Jianhong wrote:
>> Dear all,
>>
>> When I try to update Rsamtools, GenomicAlignments and rtracklayer, I get this error:
>>
>>   Error : objects 'GAlignments', 'GAlignmentPairs' are not exported by 'namespace:GenomicRanges'
>
> You didn't show us what you did exactly. Try to do:
>
>   biocLite("GenomicRanges")
>   biocLite("Rsamtools")
>   biocLite("GenomicAlignments")
>   biocLite("rtracklayer")
>   biocLite()
>
> in *that* order.
>
> Cheers,
> H.
>
>> Does anybody know how to figure out this problem?
>>
>> > biocValid()
>>
>> * sessionInfo()
>>
>> R Under development (unstable) (2013-09-29 r64014)
>> Platform: x86_64-apple-darwin12.5.0 (64-bit)
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] parallel  stats  graphics  grDevices  utils  datasets  methods  base
>>
>> other attached packages:
>> [1] BiocInstaller_1.13.3    TxDb.Mmusculus.UCSC.mm10.knownGene_2.10.1
>> [3] GenomicFeatures_1.15.4  AnnotationDbi_1.25.9
>> [5] Biobase_2.23.3          GenomicRanges_1.15.13
>> [7] XVector_0.3.3           IRanges_1.21.14
>> [9] BiocGenerics_0.9.1
>>
>> loaded via a namespace (and not attached):
>>  [1] biomaRt_2.19.1      Biostrings_2.31.3  bitops_1.0-6
>>  [4] BSgenome_1.31.7     DBI_0.2-7          GenomicAlignments_0.99.3
>>  [7] RCurl_1.95-4.1      Rsamtools_1.15.9   RSQLite_0.11.4
>> [10] rtracklayer_1.23.3  stats4_3.1.0       tools_3.1.0
>> [13] XML_3.98-1.1        zlibbioc_1.9.0
>>
>> * Out-of-date packages
>>
>>                   Package            LibPath                                                         Installed  Built
>> GenomicAlignments GenomicAlignments  /Library/Frameworks/R.framework/Versions/3.1/Resources/library  0.99.3     3.1.0
>> Rsamtools         Rsamtools          /Library/Frameworks/R.framework/Versions/3.1/Resources/library  1.15.9     3.1.0
>> rtracklayer       rtracklayer        /Library/Frameworks/R.framework/Versions/3.1/Resources/library  1.23.3     3.1.0
>>
>>                   ReposVer  Repository
>> GenomicAlignments 0.99.9    http://bioconductor.org/packages/2.14/bioc/src/contrib
>> Rsamtools         1.15.15   http://bioconductor.org/packages/2.14/bioc/src/contrib
>> rtracklayer       1.23.6    http://bioconductor.org/packages/2.14/bioc/src/contrib
>>
>> update with biocLite()
>> Error: 3 package(s) out of date
>>
>> Yours sincerely,
>> Jianhong Ou

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
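Hervé's suggested fix, written out as a session sketch (a guess at the intended workflow, using the BiocInstaller functions of that era; `biocValid()` is the check that produced the "out of date" report above):

```r
## Update the core packages in dependency order, then everything else.
## Assumes BiocInstaller (Bioconductor devel branch, late 2013) is installed.
library(BiocInstaller)

biocLite("GenomicRanges")
biocLite("Rsamtools")
biocLite("GenomicAlignments")
biocLite("rtracklayer")
biocLite()       # update any remaining out-of-date packages

biocValid()      # should now report no out-of-date packages
```

The order matters because Rsamtools, GenomicAlignments and rtracklayer all import classes (GAlignments, GAlignmentPairs) that had just moved out of GenomicRanges, so GenomicRanges must be current before its reverse dependencies are rebuilt.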
Re: [Bioc-devel] equivalent of seqselect,vector,Rle
Hi Michael,

On 12/13/2013 01:03 PM, Michael Lawrence wrote:
> I used to use seqselect for subsetting ordinary R vectors by Ranges and
> Rle. IRanges:::extractROWS does this, but it's hidden behind the
> namespace. What is the public way of doing this? Maybe we just need to
> export extractROWS()? Or something with a better name?

I'll add [,vector,Ranges and [,vector,Rle methods (and probably also [,factor,Ranges and [,factor,Rle). They'll just be wrappers to IRanges:::extractROWS, which I'd like to keep hidden. I was not sure people were doing this on ordinary R vectors, so I was waiting for someone to speak up.

H.
Re: [Bioc-devel] equivalent of seqselect,vector,Rle
Thanks, makes sense. I didn't realize we could dispatch on the 'i' parameter. I sort of recall the perception that we couldn't, and that that was one of the main motivations behind seqselect. But it does appear possible.

Michael

On Fri, Dec 13, 2013 at 1:10 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:
> Hi Michael,
>
> On 12/13/2013 01:03 PM, Michael Lawrence wrote:
>> I used to use seqselect for subsetting ordinary R vectors by Ranges and
>> Rle. IRanges:::extractROWS does this, but it's hidden behind the
>> namespace. What is the public way of doing this? Maybe we just need to
>> export extractROWS()? Or something with a better name?
>
> I'll add [,vector,Ranges and [,vector,Rle methods (and probably also
> [,factor,Ranges and [,factor,Rle). They'll just be wrappers to
> IRanges:::extractROWS, which I'd like to keep hidden. I was not sure
> people were doing this on ordinary R vectors, so I was waiting for
> someone to speak up.
>
> H.
Re: [Bioc-devel] equivalent of seqselect,vector,Rle
Hi Michael,

On 12/13/2013 06:39 PM, Michael Lawrence wrote:
> Coercion might suffice. I do remember Patrick optimizing these
> selections with e.g. memcpy(), so they are pretty fast.

The memcpy() trick was used (and is still used in extractROWS) when seqselect'ing by a Ranges object. For subsetting *by* an integer-Rle, there was no (and there is still no) optimization: the subscript was just passed thru as.integer() internally. Subsetting by a numeric-Rle or character-Rle was broken.

> No profiling data though. I do have some performance critical code that
> has relied on the Rle-based extraction. Would be nice to avoid
> re-evaluating the performance.

From a performance point of view, there should be no significant difference between doing x[as.vector(i)] and doing IRanges:::extractROWS(x, i) when 'i' is an Rle, because the latter passes 'i' thru as.vector() internally (the internal helper normalizeSingleBracketSubscript actually does that). However, I would still recommend you use the latter in your package, so it will take advantage of optimizations that might happen in the future.

H.

On Fri, Dec 13, 2013 at 6:19 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:
> On 12/13/2013 01:49 PM, Michael Lawrence wrote:
>> Thanks, makes sense. I didn't realize we could dispatch on the 'i'
>> parameter. I sort of recall the perception that we couldn't, and that
>> was one of the main motivations behind seqselect. But it does appear
>> possible.
>
> Well, I was hoping I could do this but it doesn't work :-/  Found in the
> man page for `[`:
>
>     S4 methods:
>     These operators are also implicit S4 generics, but as primitives,
>     S4 methods will be dispatched only on S4 objects 'x'.
>
> OK, fair enough. But the following is really misleading:
>
>     > library(IRanges)
>     > `[`
>     .Primitive("[")
>     > getGeneric("[")
>     standardGeneric for "[" defined from package "base"
>
>     function (x, i, j, ..., drop = TRUE)
>     standardGeneric("[", .Primitive("["))
>     <bytecode: 0x168cba0>
>     <environment: 0x1ccfd90>
>     Methods may be defined for arguments: x, i, j, drop
>     Use showMethods("[") for currently available ones.
>
> So the implicit generic actually does dispatch on 'i'. I can see my new
> [,vector,Ranges method:
>
>     > selectMethod("[", c("vector", "Ranges"))
>     Method Definition:
>
>     function (x, i, j, ..., drop = TRUE)
>     {
>         if (!missing(j) || length(list(...)) > 0L)
>             stop("invalid subsetting")
>         extractROWS(x, i)
>     }
>     <environment: namespace:IRanges>
>
>     Signatures:
>             x        i
>     target  "vector" "Ranges"
>     defined "vector" "Ranges"
>
> And dispatch works if I explicitly call the generic:
>
>     > getGeneric("[")(letters, IRanges(4, 8))
>     [1] "d" "e" "f" "g" "h"
>
> but not if I call the primitive:
>
>     > letters[IRanges(4, 8)]
>     Error in letters[IRanges(4, 8)] : invalid subscript type 'S4'
>
> Seems like the primitive first checks 'x', and only if it's an S4 object
> does it then delegate to the implicit S4 generic. Probably for
> performance reasons, as it avoids the cost of having to perform full
> multiple dispatch when 'x' is an ordinary object.
>
> The following hack works:
>
>     > `[` <- getGeneric("[")
>     > letters[IRanges(4, 8)]
>     [1] "d" "e" "f" "g" "h"
>
> but putting this in IRanges feels wrong (I tried and it caused trouble
> with ref classes). So I guess I should go ahead and export/document
> extractROWS() and replaceROWS(). What are the other options?
>
> In the mean time of course you can always pass your Ranges or Rle
> subscript thru unlist() or as.vector() first (not much more typing than
> doing seqselect(), and I don't expect this will impact performance too
> much in practice).
>
> H.
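Hervé's interim workaround, written out as a small session sketch. Treat it as untested: decoding the subscript with as.integer()/as.vector() is my reading of his "pass your Ranges or Rle subscript thru unlist() or as.vector() first", and it assumes the IRanges coercion methods of that era.

```r
library(IRanges)

x <- letters
i <- IRanges(4, 8)        # the range 4..8

## `[` is a primitive: with a non-S4 'x' it never reaches the implicit
## S4 generic, so x[i] fails with "invalid subscript type 'S4'".
## Decode the subscript to an ordinary vector first instead:
x[as.integer(i)]          # the elements seqselect(x, i) used to return

r <- Rle(c(TRUE, FALSE), c(3, 23))
x[as.vector(r)]           # logical-Rle subscript, decoded the same way
```

This is exactly what extractROWS does internally for an Rle subscript, which is why Hervé expects no measurable performance difference between the two spellings.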
[Rd] substring() and propagation of names
Hi,

In R < 3.0.0, we used to get:

    > substring(c(A="abcdefghij", B="123456789"), 2, 6:2)
          A       B       A       B       A
    "bcdef"  "2345"   "bcd"    "23"     "b"

But in R >= 3.0.0, we get:

    > substring(c(A="abcdefghij", B="123456789"), 2, 6:2)
    [1] "bcdef" "2345"  "bcd"   "23"    "b"

The names are not propagated anymore. Is this an intended change or a bug? I can't find anything about this in the NEWS file. The man page for substring() in R >= 3.0.0 still states:

    Value:
    ...
    For 'substring', a character vector of length the longest of the
    arguments. This will have names taken from 'x' (if it has any after
    coercion, repeated as needed), and other attributes copied from 'x'
    if it is the longest of the arguments).

Also note that the first argument of substring() is 'text', not 'x'.

Thanks,
H.

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
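For anyone bitten by the change, here is a sketch of a workaround that restores the documented behavior by recycling the names back onto the result (my own suggestion, not from the thread; `substring_named` is an invented name):

```r
## substring() in R >= 3.0.0 drops names; re-attach them from 'text',
## recycled to the result length, mimicking the documented behavior.
substring_named <- function(text, first, last = 1000000L) {
  res <- substring(text, first, last)
  if (!is.null(names(text)))
    names(res) <- rep_len(names(text), length(res))
  res
}

substring_named(c(A = "abcdefghij", B = "123456789"), 2, 6:2)
##       A       B       A       B       A
## "bcdef"  "2345"   "bcd"    "23"     "b"
```

The result has length 5 because 6:2 is the longest argument; 'text' and its names are recycled across it, which is why the names alternate A, B, A, B, A.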
Re: [Rd] Strategies for keeping autogenerated .Rd files out of a Git tree
Gabor

I agree with you. There's Travis CI, and r-travis -- an attempt to integrate R package testing with Travis. Pushing back to GitHub is possible, but the setup is somewhat difficult. Also, this can be subject to race conditions, because each push triggers a test run and these can happen in parallel even for the same repository. How do you handle branches?

It would be really great to be able to execute custom R code before building. Perhaps in a PreBuild: section in DESCRIPTION?

Cheers

Kirill

On 12/12/2013 02:21 AM, Gábor Csárdi wrote:
> Hi, this is maybe mostly a personal preference, but I prefer not to put
> generated files in the vc repository. Changes in the generated files,
> especially if there are many of them, pollute the diffs and make them
> less useful.
>
> If you really want to be able to install the package directly from
> github, one solution is to
> 1. create another repository that contains the complete generated
>    package, so that install_github() can install it.
> 2. set up a CI service that can download the package from github, build
>    the package or the generated files (and check the package while it is
>    at it), and then push the built stuff back to github.
> 3. set up a hook on github that invokes the CI after each commit.
>
> I have used this setup in various projects with jenkins-ci and it works
> well. Diffs are clean, the package is checked and built frequently, and
> people can download it without having to install the tools that generate
> the generated files. The only downside is that you need to install a CI,
> so you need a server for that. Maybe you can do this with travis-ci,
> maybe not; I am not familiar with it that much.
>
> Best,
> Gabor
>
> On Wed, Dec 11, 2013 at 7:39 PM, Kirill Müller
> <kirill.muel...@ivt.baug.ethz.ch> wrote:
>> Hi
>>
>> Quite a few R packages are now available on GitHub long before they
>> appear on CRAN; installation is simple thanks to
>> devtools::install_github(). However, it seems to be common practice to
>> keep the .Rd files (and NAMESPACE and the Collate section in the
>> DESCRIPTION) in the Git tree, and to update them manually, even if they
>> are autogenerated from the R code by roxygen2. This requires extra work
>> for each update of the documentation and also binds package development
>> to a specific version of roxygen2 (because otherwise lots of bogus
>> changes can be added by roxygenizing with a different version).
>>
>> What options are there to generate the .Rd files during build/install?
>> In https://github.com/hadley/devtools/issues/43 the issue has been
>> discussed; perhaps it can be summarized as follows:
>>
>> - The devtools package is not the right place to implement
>>   roxygenize-before-build
>> - A continuous integration service would be better for that, but
>>   currently there's nothing that would be easy to use
>> - Roxygenizing via src/Makefile could work but requires further
>>   investigation and an installation of Rtools/xcode on Windows/OS X
>>
>> Especially the last point looks interesting to me, but since this is
>> not widely used there must be pitfalls I'm not aware of. The general
>> idea would be:
>>
>> - Place code that builds/updates the .Rd and NAMESPACE files into
>>   src/Makefile
>> - Users installing the package from source will require infrastructure
>>   (Rtools/make)
>> - For binary packages, the .Rd files are already generated and added to
>>   the .tar.gz during R CMD build before they are submitted to
>>   CRAN/WinBuilder, and they are also generated (in theory) by
>>   R CMD build --binary
>>
>> I'd like to hear your opinion on that. I have also found a thread on
>> package development workflow
>> (https://stat.ethz.ch/pipermail/r-devel/2011-September/061955.html) but
>> there was nothing on un-versioning .Rd files.
>>
>> Cheers
>>
>> Kirill

--
ETH Zürich
Institute for Transport Planning and Systems
HIL F 32.2
Wolfgang-Pauli-Str. 15
8093 Zürich

Phone: +41 44 633 33 17
Fax: +41 44 633 10 57
Secretariat: +41 44 633 31 05
E-Mail: kirill.muel...@ivt.baug.ethz.ch
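The "generate rather than version" step being discussed can be sketched in two lines; a hypothetical workflow, not an endorsed mechanism (it assumes roxygen2 is installed and is run from the package root, and it leaves open the harder question of who runs it on install):

```r
## Regenerate man/*.Rd and NAMESPACE from the roxygen comments in R/,
## then build the source tarball so the generated docs ship with it.
roxygen2::roxygenize(".")
system("R CMD build .")
```

This is essentially what a CI job (option 2 in Gábor's list) would automate: regenerate, check, build, and push the built artifact somewhere install_github() can reach.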
Re: [Rd] Strategies for keeping autogenerated .Rd files out of a Git tree
Pushing back to github is not so difficult. See e.g.
http://blog.r-enthusiasts.com/2013/12/04/automated-blogging.html

You can manage branches easily in travis. You could, for example, decide to do something different if you are on the master branch ...

Romain

Le 13 déc. 2013 à 12:03, Kirill Müller <kirill.muel...@ivt.baug.ethz.ch> a écrit :
> Gabor
>
> I agree with you. There's Travis CI, and r-travis -- an attempt to
> integrate R package testing with Travis. Pushing back to GitHub is
> possible, but the setup is somewhat difficult. Also, this can be subject
> to race conditions, because each push triggers a test run and these can
> happen in parallel even for the same repository. How do you handle
> branches?
>
> It would be really great to be able to execute custom R code before
> building. Perhaps in a PreBuild: section in DESCRIPTION?
>
> Cheers
>
> Kirill

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Status of reserved keywords and builtins
> It would have those benefits, but it would be harder to prototype
> changes by actually replacing the `if` function. Implementations that
> want to optimize the calls have other ways to do it, e.g. the sorts of
> things the compiler does.

Does anyone actually prototype changes to the `if` function?

Allowing users to replace the definitions of reserved keywords and builtins is horribly expensive performance-wise, with or without compilation. If you look at the compiler package, the way it optimizes these function calls is by breaking the language spec. See the beginnings of sections 5 and 6 of Luke's write-up (http://homepage.stat.uiowa.edu/~luke/R/compiler/compiler.pdf), noting that the *default* optimization level is 2, at which level:

    In addition to the inlining permitted by Level 1, functions that are
    syntactically special or are considered core language functions and
    are found via the global environment at compile time may be inlined.

This is an area where a small change to the language spec would impact essentially no users and would result in a language that could be executed much more efficiently.

Justin Talbot
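The optimization levels Justin refers to can be set explicitly when byte-compiling; a small sketch using the compiler package shipped with R (the speed difference is not shown here, only how the levels are selected):

```r
library(compiler)

f <- function(x) if (x > 0) sqrt(x) else 0

## At the default optimization level (2) the compiler may inline `if`,
## sqrt, etc. when they are found via the global environment at compile
## time; level 0 keeps the normal (slower) per-call lookup semantics.
f2 <- cmpfun(f, options = list(optimize = 2))
f0 <- cmpfun(f, options = list(optimize = 0))

f2(4)  # 2
f0(4)  # 2 -- same result, different lookup behavior under the hood
```

This inlining is exactly why redefining `sqrt` (or `if`) in the global environment may not be seen by already-compiled code, which is the spec-breaking behavior under discussion.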
Re: [Rd] Strategies for keeping autogenerated .Rd files out of a Git tree
On 12/13/2013 12:50 PM, Romain Francois wrote:
> Pushing back to github is not so difficult. See e.g.
> http://blog.r-enthusiasts.com/2013/12/04/automated-blogging.html

Thanks for the writeup, I'll try this. Perhaps it's better to push the results of `R CMD build`, though.

> You can manage branches easily in travis. You could for example decide
> to do something different if you are on the master branch ...

That's right. But then no .Rd files are built when I'm on a branch, so I can't easily preview the result. The ideal situation would be:

1. I manage only R source files on GitHub, not .Rd files, NAMESPACE or the Collate section of DESCRIPTION. Machine-readable instructions on how to build those are provided with the package.
2. Anyone can install from GitHub using devtools::install_github(). This should also work for branches, forks and pull requests.
3. I can build the package so that the result can be accepted by CRAN.

The crucial point on that list is point 2; the others I can easily solve myself. The way I see it, point 2 can be tackled by extending devtools or by extending the way packages are built. Extending devtools seems to be the inferior approach, although, to be honest, I'd be fine with that as well.

-Kirill
Re: [Rd] Status of reserved keywords and builtins
>> It would have those benefits, but it would be harder to prototype
>> changes by actually replacing the `if` function. Implementations that
>> want to optimize the calls have other ways to do it, e.g. the sorts of
>> things the compiler does.
>
> Does anyone actually prototype changes to the `if` function?

I do - in the dplyr package (https://github.com/hadley/dplyr), I construct environments where many of the most common R functions are replaced by alternates that return SQL strings. This makes it possible to use R's parser while translating output into another language. I think it's a pretty elegant approach that's facilitated by lexical scoping and first-class environments, but it's an admittedly rare case.

Hadley

--
http://had.co.nz/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
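A toy illustration of the environment-swapping approach Hadley describes (all names here are invented for illustration; dplyr's real translation machinery is far more involved):

```r
## Evaluate an R expression in an environment where selected functions
## emit SQL text instead of computing values. R's parser does the
## parsing; lexical lookup in sql_env does the "translation".
sql_env <- new.env()
sql_env$`+`  <- function(a, b) paste(a, "+", b)
sql_env$mean <- function(x) paste0("AVG(", x, ")")

translate_sql <- function(expr) eval(expr, sql_env)

translate_sql(quote(mean("price") + "tax"))
## "AVG(price) + tax"
```

Because `+` and `mean` are looked up in sql_env first, the quoted expression never performs arithmetic; each call just builds up a string. This only works for functions that can be replaced, which is why the thread's question about replacing `if` matters.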
Re: [Rd] Status of reserved keywords and builtins
On 13-12-13 8:36 AM, Justin Talbot wrote:
>> It would have those benefits, but it would be harder to prototype
>> changes by actually replacing the `if` function. Implementations that
>> want to optimize the calls have other ways to do it, e.g. the sorts of
>> things the compiler does.
>
> Does anyone actually prototype changes to the `if` function?

I don't know of any examples of that, but I can easily imagine someone wanting to. For example, some conditions take a long time to evaluate. Maybe I would want to compute both TRUE and FALSE paths in parallel in anticipation of the result, if I have cores to spare. That's pretty tricky to get right because of side effects, so prototyping in R code could make a lot of sense.

> Allowing users to replace the definitions of reserved keywords and
> builtins is horribly expensive performance-wise with or without
> compilation. If you look at the compiler package, the way it optimizes
> these function calls is by breaking the language spec. See the
> beginnings of sections 5 and 6 of Luke's write-up
> (http://homepage.stat.uiowa.edu/~luke/R/compiler/compiler.pdf), noting
> that the *default* optimization level is 2, at which level:
>
>     In addition to the inlining permitted by Level 1, functions that
>     are syntactically special or are considered core language functions
>     and are found via the global environment at compile time may be
>     inlined.
>
> This is an area where a small change to the language spec would impact
> essentially no users and would result in a language that could be
> executed much more efficiently.

That only breaks the language spec if the compiler doesn't detect cases where it is an invalid optimization. It may be that that is currently the case (I haven't checked), but it needn't always be. I would much prefer that the compiler code were made smarter about detecting this rather than adding exceptions to the language design.

Duncan Murdoch

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] [R] freetype 2.5.2, problem with the survival package, build R 2.15.x with gcc 4.8.x
On Dec 11, 2013, at 7:30 PM, Hin-Tak Leung wrote: Here is a rather long discussion etc about freetype 2.5.2, problem with the survival package, and build R 2.15.x with gcc 4.8.x. Please feel free to skip forward. - freetype 2.5.2: the fix to cope with one of the Mac OS X's system fonts just before the release of freetype 2.5.1 caused a regression, crashing over one of Microsoft windows' system fonts. So there is a 2.5.2. There are new 2.5.2 bundles for windows Mac OS X. The official win/mac binaries of R were built statically with 2+-years-old freetype with a few known problems. Most should upgrade/rebuild. http://sourceforge.net/projects/outmodedbonsai/files/R/ - problem with the survival package: Trying to re-run a vignette to get the same result as two years ago reveal a strange change. I went and bisected it down to r11513 and r11516 of the survival package. -- r11513 clogit(cc ~ addContr(A) + addContr(C) + addContr(A.C) + strata(set)) coef exp(coef) se(coef) z p addContr(A)2 -0.620 0.5380.217 -2.86 0.0043 addContr(C)2 0.482 1.6200.217 2.22 0.0270 addContr(A.C)1-2 -0.778 0.4590.275 -2.83 0.0047 addContr(A.C)2-1 NANA0.000NA NA addContr(A.C)2-2 NANA0.000NA NA Likelihood ratio test=26 on 3 df, p=9.49e-06 n= 13110, number of events= 3524 -- - r11516 - clogit(cc ~ addContr(A) + addContr(C) + addContr(A.C) + strata(set)) coef exp(coef) se(coef) z p addContr(A)2 -0.14250 0.867 110812 -1.29e-06 1 addContr(C)2 0.00525 1.005 110812 4.74e-08 1 addContr(A.C)1-2 -0.30097 0.740 110812 -2.72e-06 1 addContr(A.C)2-1 -0.47712 0.621 110812 -4.31e-06 1 addContr(A.C)2-2 NANA0NA NA Likelihood ratio test=26 on 4 df, p=3.15e-05 n= 13110, number of events= 3524 -- r11514 does not build, and r11515 have serious memory hogs, so the survival package broke somewhere between r11513 and r11516. Anyway, here is the diff in the vignette, and the data, etc is in the directory above. If somebody want to fix this before I spend any more time on this particular matter, please feel free to do so. 
http://sourceforge.net/projects/outmodedbonsai/files/Manuals%2C%20Overviews%20and%20Slides%20for%20talks/2013SummerCourse/practicals/with-answers/practical8_survival-clogit-diff.pdf/download That's the one problem from David's 10 practicals which are not due to bugs in snpStats. Some might find it reassuring that only 3 of the 4 problems with the practicals are due to snpStats bugs. http://sourceforge.net/projects/outmodedbonsai/files/Manuals%2C%20Overviews%20and%20Slides%20for%20talks/2013SummerCourse/practicals/with-answers/practical7_snpStatsBug-diff.pdf/download http://sourceforge.net/projects/outmodedbonsai/files/Manuals%2C%20Overviews%20and%20Slides%20for%20talks/2013SummerCourse/practicals/with-answers/practical6_snpStatsBug-diff.pdf/download http://sourceforge.net/projects/outmodedbonsai/files/Manuals%2C%20Overviews%20and%20Slides%20for%20talks/2013SummerCourse/practicals/with-answers/practical3_snpStatsBug-diff.pdf/download - build R 2.15.x with gcc 4.8.x I wish the R commit log was a bit more detailed with r62430 than just tweak needed for gcc 4.8.x. Anyway, building R 2.15.x with gcc 4.8.x could result in segfaults in usage as innocent and essential as running summary() on a data.frame: *** caught segfault *** address 0x2f8e6a00, cause 'memory not mapped' Traceback: 1: sort.list(y) 2: factor(a, exclude = exclude) 3: table(object, exclude = NULL) 4: summary.default(X[[3L]], ...) 5: FUN(X[[3L]], ...) 6: lapply(X = as.list(object), FUN = summary, maxsum = maxsum, digits = 12, ...) 7: summary.data.frame(support) ... r62430 needs a bit of adapting to apply to R 2.15.x , but you get the idea. I hope this info is useful to somebody else who is still using R 2.15.x , no doubt for very good reasons. Hin-Tak Leung wrote: The freetype people fixed the 2nd set of issues with system fonts shipped with Mac OS X, and released 2.5.1 almost immediately after that. So there are new bundles under http://sourceforge.net/projects/outmodedbonsai/files/R/ . 
Just a reminder that the official R binaries for Windows/Mac OS X are statically linked with rather dated versions of freetype with a few known issues. This affects the cairo-based functionality in R, so a rebuild is needed. Most Unix users should just upgrade their system's libfreetype; dynamic linking should take care of the rest. __ r-h...@r-project.org mailing list
Re: [Rd] [R] freetype 2.5.2, problem with the survival package, build R 2.15.x with gcc 4.8.x
On Dec 11, 2013, at 7:30 PM, Hin-Tak Leung wrote: [...]

First: Sorry for the blank message. Need more coffee.

Second: Does this mean that only Mac users who are still using 2.15.x need to worry about this issue?
Third: I'm reading this (and Terry's comment about singularity conditions) to mean that a numerical discrepancy, between the vignette output when the code was run and what was expected, was causing a segfault under some situation that I cannot quite reconstruct. Was the implication that Mac users (of 2.15.x) need to build from source only if they want to build the survival package from source? Does this have any implications for those of us who use the survival package as a binary? (And I'm using 3.0.2, so a split answer might be needed
Re: [Rd] Strategies for keeping autogenerated .Rd files out of a Git tree
On 12/11/2013 4:39 PM, Kirill Müller wrote:

Hi

Quite a few R packages are now available on GitHub long before they appear on CRAN, and installation is simple thanks to devtools::install_github(). However, it seems to be common practice to keep the .Rd files (and NAMESPACE, and the Collate section in the DESCRIPTION) in the Git tree, and to update them manually, even if they are autogenerated from the R code by roxygen2. This requires extra work for each update of the documentation and also binds package development to a specific version of roxygen2 (because otherwise lots of bogus changes can be added by roxygenizing with a different version).

What options are there to generate the .Rd files during build/install? In https://github.com/hadley/devtools/issues/43 the issue has been discussed; perhaps it can be summarized as follows:

- The devtools package is not the right place to implement roxygenize-before-build.
- A continuous integration service would be better for that, but currently there's nothing that would be easy to use.
- Roxygenizing via src/Makefile could work but requires further investigation, and an installation of Rtools/Xcode on Windows/OS X.

Especially the last point looks interesting to me, but since this is not widely used there must be pitfalls I'm not aware of. The general idea would be:

- Place code that builds/updates the .Rd and NAMESPACE files into src/Makefile.
- Users installing the package from source will require infrastructure (Rtools/make).
- For binary packages, the .Rd files are already generated and added to the .tar.gz during R CMD build before they are submitted to CRAN/WinBuilder, and they are also generated (in theory) by R CMD build --binary.

I'd like to hear your opinion on that. I have also found a thread on package development workflow (https://stat.ethz.ch/pipermail/r-devel/2011-September/061955.html) but there was nothing on un-versioning .Rd files.
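To make the src/Makefile idea concrete, a minimal sketch might look like this. This is only an illustration of the approach under discussion, not an established convention: the target names and the roxygen2 call are my assumptions, and it relies on R running make in src/ at install time.

```make
# Hypothetical src/Makefile fragment: regenerate man/*.Rd and NAMESPACE from
# roxygen comments before anything else is built. Requires make and roxygen2
# on the installing machine -- exactly the infrastructure cost noted above.
all: roxygenize

roxygenize:
	cd .. && "$(R_HOME)/bin/Rscript" -e "roxygen2::roxygenize('.')"
```

One pitfall this sketch already hints at: src/Makefile is normally reserved for compiled code, so a documentation-only package would gain a src/ directory purely for this hook.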
One downside I can see with this third approach is that by making the package documentation generation part of the build process, you then make the package depend on/require roxygen2 (or whatever tool you are using to generate documentation). This dependency, though, is only needed to build the package, not to actually use it. And by pushing this dependency onto the end users of the package, you have transferred the problem you mentioned ("... and also binds package development to a specific version of roxygen2 ...") to the many end users rather than the few developers. Cheers Kirill -- Brian S. Diggs, PhD Senior Research Associate, Department of Surgery Oregon Health & Science University __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
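The build-time-only dependency Brian describes is usually expressed as a soft dependency. A sketch (the package name is made up, and whether Suggests is the right field for a tool that the build actually requires is precisely the judgment call under discussion):

```
# Hypothetical DESCRIPTION fragment: roxygen2 is needed only when building
# from the raw source tree, so it is declared as a soft (Suggests) dependency
# rather than a hard (Imports/Depends) one.
Package: mypkg
Version: 0.1
Suggests: roxygen2
```

With this declaration, installing from a source tarball that already contains the generated .Rd files would not pull in roxygen2, while building from the raw tree would still need it available.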
Re: [Rd] Strategies for keeping autogenerated .Rd files out of a Git tree
On Fri, Dec 13, 2013 at 6:03 AM, Kirill Müller kirill.muel...@ivt.baug.ethz.ch wrote:

Gabor, I agree with you. There's Travis CI, and r-travis -- an attempt to integrate R package testing with Travis. Pushing back to GitHub is possible, but the setup is somewhat difficult. Also, this can be subject to race conditions, because each push triggers a test run and they can happen in parallel even for the same repository.

I set up my CI so that it does not allow concurrent builds from the same job, so there are no race conditions. This is probably possible with Travis; I don't know.

How do you handle branches?

So far I didn't, and only pushed back the main branch. But you can just push back to different branches. In this case I would probably create another repo, and have the same branches in both the source repo and the publish repo.

It would be really great to be able to execute custom R code before building. Perhaps in a PreBuild: section in DESCRIPTION?

I am just using make to create the package. This creates all autogenerated files and then calls R CMD build. Another option for this whole problem is not using GitHub at all, but setting up a CRAN-like repository and making the CI publish the built and checked packages there.

Gabor

Cheers Kirill

On 12/12/2013 02:21 AM, Gábor Csárdi wrote:

Hi, this is maybe mostly a personal preference, but I prefer not to put generated files in the vc repository. Changes in the generated files, especially if there are many of them, pollute the diffs and make them less useful. If you really want to be able to install the package directly from GitHub, one solution is to:

1. create another repository that contains the complete generated package, so that install_github() can install it.
2. set up a CI service that can download the package from GitHub, build the package or the generated files (check the package while it is at it), and then push the built stuff back to GitHub.
3. set up a hook on GitHub that invokes the CI after each commit.

I have used this setup in various projects with jenkins-ci and it works well. Diffs are clean, the package is checked and built frequently, and people can download it without having to install the tools that generate the generated files. The only downside is that you need to install a CI, so you need a server for that. Maybe you can do this with travis-ci, maybe not, I am not familiar with it that much.

Best, Gabor

On Wed, Dec 11, 2013 at 7:39 PM, Kirill Müller kirill.muel...@ivt.baug.ethz.ch wrote: [...]
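Gabor's make-driven workflow (regenerate the autogenerated files, then build) could be captured in a top-level makefile along these lines. This is a sketch under assumptions: the target names, the package name, and the roxygen2 call are mine, not details of his actual setup.

```make
# Hypothetical top-level Makefile for the workflow Gabor describes: the CI
# runs `make check`, then pushes the resulting tarball to the publish repo
# or a CRAN-like repository.
PKG := mypkg

doc:
	Rscript -e "roxygen2::roxygenize('$(PKG)')"

build: doc
	R CMD build $(PKG)

check: build
	R CMD check $(PKG)_*.tar.gz
```

Because the generated .Rd files end up inside the built tarball, the source repository itself never needs to version them.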
Re: [Rd] Strategies for keeping autogenerated .Rd files out of a Git tree
It seems like an easy solution to the "users don't know make" problem is to provide a makefile which runs any R code it finds in pkg/inst/preinstall/preinstall.R, perhaps shipped in dev_tools/inst/extras or simply from a website. That way users don't need to know make; they just need to know what to name the file with the R code they want to run at the beginning of the install process. The only time this wouldn't work is if the package already has another makefile, in which case the author of the package clearly knows make and thus can add the necessary invocations him/herself. Also, no need to worry about local vs remote, GitHub vs tarred source, etc. It would all just work. My understanding is that this would include the Windows binaries built by services like WinBuilder, though I'm not super familiar with such things, so there may be some details to watch out for. There might be a security concern, but I don't think this would be any less secure than installing a non-trusted R package in the first place. ~G

On Fri, Dec 13, 2013 at 9:26 AM, Gábor Csárdi csardi.ga...@gmail.com wrote: [...]
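The generic preinstall hook proposed earlier in this message could be as small as the following. Note the inst/preinstall/preinstall.R path is the proposal's own naming convention, not anything R itself recognizes, and "Rscript" stands in for however the makefile would invoke R.

```shell
# Sketch of the proposed hook: at the start of the install, run the package's
# preinstall script if it ships one; do nothing (and succeed) otherwise, so
# packages without the script are unaffected.
if [ -f inst/preinstall/preinstall.R ]; then
    Rscript inst/preinstall/preinstall.R
fi
```

The "do nothing otherwise" branch is what makes the hook safe to add unconditionally to a shared makefile.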
Re: [Rd] Strategies for keeping autogenerated .Rd files out of a Git tree
FWIW this is essentially what RForge.net provides. Each GitHub commit triggers a build (branches are supported as the branch info is passed in the WebHook) which can be either classic R CMD build or a custom shell script (hence you can do anything you want). The result is a tar ball (which includes the generated files) and that tar ball gets published in the R package repository. R CMD check is run as well on the tar ball and the results are published. This way you don't need devtools, users can simply use install.packages() without requiring any additional tools. There are some talks about providing the above as a cloud service, so that anyone can run and/or use it. Cheers, Simon On Dec 13, 2013, at 8:51 AM, Kirill Müller kirill.muel...@ivt.baug.ethz.ch wrote: On 12/13/2013 12:50 PM, Romain Francois wrote: Pushing back to github is not so difficult. See e.g http://blog.r-enthusiasts.com/2013/12/04/automated-blogging.html Thanks for the writeup, I'll try this. Perhaps it's better to push the results of `R CMD build`, though. You can manage branches easily in travis. You could for example decide to do something different if you are on the master branch ... That's right. But then no .Rd files are built when I'm on a branch, so I can't easily preview the result. The ideal situation would be: 1. I manage only R source files on GitHub, not Rd files, NAMESPACE nor the Collate section of DESCRIPTION. Machine-readable instructions on how to build those are provided with the package. 2. Anyone can install from GitHub using devtools::install_github(). This also should work for branches, forks and pull requests. 3. I can build the package so that the result can be accepted by CRAN. The crucial point on that list is point 2, the others I can easily solve myself. The way I see it, point 2 can be tackled by extending devtools or extending the ways packages are built. Extending devtools seems to be the inferior approach, although, to be honest, I'd be fine with that as well. 
-Kirill

Romain

On 13 Dec 2013, at 12:03, Kirill Müller kirill.muel...@ivt.baug.ethz.ch wrote: [...]
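The webhook-driven flow Simon describes might look roughly like the following custom build script. This is purely illustrative pseudocode in shell form, not runnable as-is: the REPO_URL and BRANCH variables (supposedly from the webhook payload) and the roxygen2 step are my assumptions, not RForge.net's actual interface.

```shell
#!/bin/sh
# Illustrative sketch: clone the branch named in the webhook payload,
# regenerate the autogenerated files, build the tarball, check it, and leave
# the tarball for publication in the CRAN-like repository.
set -e
git clone --branch "${BRANCH:-master}" "$REPO_URL" pkg
Rscript -e "roxygen2::roxygenize('pkg')"
R CMD build pkg
R CMD check pkg_*.tar.gz
```

The key property is that the published artifact is a standard source tarball, so end users need only install.packages() and no extra tooling.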
Re: [Rd] Strategies for keeping autogenerated .Rd files out of a Git tree
Oh, I didn't know RForge.net supported external git repos, cool!

Gabor

On Fri, Dec 13, 2013 at 3:14 PM, Simon Urbanek simon.urba...@r-project.org wrote: [...]
Re: [Rd] Strategies for keeping autogenerated .Rd files out of a Git tree
Btw. one thing that probably would not work (well) with RForge.net (or another CRAN-like repo) is multiple branches. The problem is that you cannot put the branch name in the package version string, because that is not allowed, and then the versions from the multiple branches get mixed up. This works fine with install_github() because you can explicitly specify the branch there. One possible solution is to create multiple repos, one for each branch. Not really elegant, though. I don't really need this myself; I am just mentioning it because it came up in this thread.

Gabor

On Fri, Dec 13, 2013 at 3:24 PM, Gábor Csárdi csardi.ga...@gmail.com wrote: [...]
Re: [Rd] Strategies for keeping autogenerated .Rd files out of a Git tree
Thanks a lot. This would indeed solve the problem. I'll try mkdist today ;-) Is the NEWS file parsed before or after mkdist has been executed? Would you be willing to share the code for the infrastructure, perhaps on GitHub?

-Kirill

On 12/13/2013 09:14 PM, Simon Urbanek wrote: [...]
Re: [Rd] Strategies for keeping autogenerated .Rd files out of a Git tree
On 12/13/2013 06:09 PM, Brian Diggs wrote: [...]

That's right. As outlined in another message, roxygen2 would be required for building from the raw source (hosted on GitHub) but not for installing from a source tarball (which would contain the .Rd files). Not sure if that's possible, though.