date:20190226

[Bioc-devel] R CMD check takes too long

2019-02-26 Thread Kevin Wang

Hi,
I have submitted my package scMerge through Bioconductor’s GitHub issue page. 
The only warning I am receiving now tells me that R CMD check took more than 5 
minutes. I am not sure how I can shorten this time, as a number of the checks 
take much longer than on my local computers. Even on the BioC build report, 
the example for the function scMerge takes 5.6 seconds on Linux, 8 seconds on 
OS X, and 75 seconds on Windows.

http://bioconductor.org/spb_reports/scMerge_buildreport_20190226202635.html

Any help would be appreciated

Best Wishes
Kevin
Kevin Y.X. WANG | PhD Candidate
Postgraduate Teaching Fellow
Faculty of Science, School of Mathematics and Statistics
THE UNIVERSITY OF SYDNEY
Carslaw Building (F07) | The University of Sydney | NSW | Australia | 2006
T +61 2 9114 1276  | M +61 404 955 255
E kevin.w...@sydney.edu.au  | W 
https://kevinwang09.github.io/
CRICOS 00026A
This email plus any attachments to it are confidential. Any unauthorised use is 
strictly prohibited. If you receive this email in error, please delete it and 
any attachments.
Please think of our environment and only print this e-mail if necessary.



___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] GenVisR Package Failing

2019-02-26 Thread Skidmore, Zach

Got it, thanks for the tip! I've pushed a fix that should resolve this but I 
have another question.

The only way I could get check.Renviron to load was to either

1. explicitly load the file in R with readRenviron("~/check.Renviron")

2. manually set the R ENV variable with 
Sys.setenv("R_CHECK_ENVIRON"="~/check.Renviron")
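As a rough sketch of the two workarounds listed above: the snippet below writes a throwaway file in place of ~/check.Renviron (the temporary path is an assumption, used only so the example is self-contained), loads it with readRenviron(), and then shows the Sys.setenv() form. The variable values are the ones quoted from the build system in Lori's message below.

```r
# Stand-in for ~/check.Renviron; a temporary file keeps the sketch self-contained.
envFile <- tempfile(fileext = ".Renviron")
writeLines(c(
    "_R_CHECK_LENGTH_1_CONDITION_=package:_R_CHECK_PACKAGE_NAME_",
    "_R_CHECK_LENGTH_1_LOGIC2_=package:_R_CHECK_PACKAGE_NAME_"
), envFile)

# Option 1: load the file into the current R session.
loaded <- readRenviron(envFile)
condSetting <- Sys.getenv("_R_CHECK_LENGTH_1_CONDITION_")

# Option 2: point R_CHECK_ENVIRON at the file. Set this way, it is only seen
# by processes started afterwards from this session (for example R CMD check).
Sys.setenv(R_CHECK_ENVIRON = envFile)
checkEnviron <- Sys.getenv("R_CHECK_ENVIRON")
```

Either option has to be repeated in each new session, which is the inconvenience described below.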

From the docs on bioc I should have been able to just add
export R_CHECK_ENVIRON=~/check.Renviron to my .bashrc, and R should have
recognized that the variable was set; however, this did not work. I also tried
putting the file in /Library/Frameworks/R.framework/Resources/etc/ based on
what I read here, but that did not work either. Any thoughts? I want to avoid
having to manually set these variables every time I work on the package.

Maybe the easiest thing is to just set these flags in the RStudio config for 
this project. I see you can do this with --install=value, but I couldn't find 
any examples of setting this for a list of parameters, and what I tried must 
have been the wrong syntax.

Zach

On 2/26/19 10:28 AM, Shepherd, Lori wrote:

The ERROR your package is experiencing has to do with conditional length 
greater than 1.  We did send you an email indicating this entitled 
"Bioconductor Package ERROR"  that had the following information:

In a continued effort to improve code, there have been efforts to identify 
problematic existing code in packages. The current issue at hand is identifying 
a conditional with a length greater than 1. This could either be an oversight 
of not using any() or all(), or could potentially indicate problematic code 
with unexpected multiple output.

dummy example:

```
> check = c(1:5)
> if (check > 3){
+     FALSE
+ }
```

It is controlled with the following that are implemented on the build system

_R_CHECK_LENGTH_1_CONDITION_=package:_R_CHECK_PACKAGE_NAME_
_R_CHECK_LENGTH_1_LOGIC2_=package:_R_CHECK_PACKAGE_NAME_

And there is some documentation here which should help get the environment 
similar to reproduce the error:

http://bioconductor.org/developers/package-guidelines/#checkingenv

There is a more detailed Test output section at the bottom of the build report 
page that shows the code that was being run.

http://bioconductor.org/checkResults/3.9/bioc-LATEST/GenVisR/malbec2-checksrc.html

Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263

From: Bioc-devel on behalf of Skidmore, Zach
Sent: Tuesday, February 26, 2019 11:10 AM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] GenVisR Package Failing

Hi All,

I maintain the GenVisR package which is currently failing on the devel
branch. I can see it has something to do with the test cases however the
error message is not clear, I don't get a line number for which test
case has failed or any other indication to that effect. Normally this
wouldn't be a problem but I only see failures in the bioc environments
(locally everything is fine). I had thought it might be related to my
visual test cases with the vdiffr package however I skipped all those in
version 1.15.1.

Is there any way I can find the sessionInfo() for the bioc-devel
environment to make sure everything matches my local environment?

Thanks!

Zach

The materials in this message are private and may contain Protected Healthcare 
Information or other information of a sensitive nature. If you are not the 
intended recipient, be advised that any unauthorized use, disclosure, copying 
or the taking of any action in reliance on the contents of this information is 
strictly prohibited. If you have received this email in error, please 
immediately notify the sender via telephone or return mail.

This email message may contain legally privileged and/or confidential 
information. If you are not the intended recipient(s), or the employee or agent 
responsible for the delivery of this message to the intended recipient(s), you 
are hereby notified that any disclosure, copying, distribution, or use of this 
email message is prohibited. If you have received this message in error, please 
notify the sender immediately by e-mail and delete this email message from your 
computer. Thank you.


Re: [R-pkg-devel] List of reverse dependencies, including archived packages

2019-02-26 Thread Dirk Eddelbuettel



On 26 February 2019 at 20:38, Uwe Ligges wrote:
| 
| 
| On 26.02.2019 18:59, Iñaki Ucar wrote:
| > On Tue, 26 Feb 2019 at 18:38, Oliver Dechant  wrote:
| >>
| >> Somewhat relatedly is there a way to monitor when a package has new
| >> reverse depends or imports added?
| > 
| > I don't know of any service currently providing this, if that's what
| > you're asking. But it's pretty straightforward to set up a cron job to
| > monitor the DESCRIPTION file of interest.
| 
| Well, if *reverse* dependencies are of interest, rather monitor the 
| package's CRAN webpage which lists reverse dependencies.

Or access the already-parsed-and-ready-to-use info:

R> crandb <- tools::CRAN_package_db()
R> colnames(crandb)
 [1] "Package"                 "Version"                 "Priority"
 [4] "Depends"                 "Imports"                 "LinkingTo"
 [7] "Suggests"                "Enhances"                "License"
[10] "License_is_FOSS"         "License_restricts_use"   "OS_type"
[13] "Archs"                   "MD5sum"                  "NeedsCompilation"
[16] "Additional_repositories" "Author"                  "Authors@R"
[19] "Biarch"                  "BugReports"              "BuildKeepEmpty"
[22] "BuildManual"             "BuildResaveData"         "BuildVignettes"
[25] "Built"                   "ByteCompile"             "Classification/ACM"
[28] "Classification/ACM-2012" "Classification/JEL"      "Classification/MSC"
[31] "Classification/MSC-2010" "Collate"                 "Collate.unix"
[34] "Collate.windows"         "Contact"                 "Copyright"
[37] "Date"                    "Description"             "Encoding"
[40] "KeepSource"              "Language"                "LazyData"
[43] "LazyDataCompression"     "LazyLoad"                "MailingList"
[46] "Maintainer"              "Note"                    "Packaged"
[49] "RdMacros"                "SysDataCompression"      "SystemRequirements"
[52] "Title"                   "Type"                    "URL"
[55] "VignetteBuilder"         "ZipData"                 "Published"
[58] "Path"                    "X-CRAN-Comment"          "Reverse depends"
[61] "Reverse imports"         "Reverse linking to"      "Reverse suggests"
[64] "Reverse enhances"        "MD5sum"
R> 
R> dim(crandb)
[1] 13776    65
R>

Now, if Oliver wants this _through time_ he will have to snapshot it. The
information provided is always 'as is' for 'right now'.
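A minimal sketch of such snapshotting, under stated assumptions: the helper names splitDeps, revDeps and newRevDeps are hypothetical, and a small stand-in table replaces the live database so the code runs offline. The column names match the colnames() output above.

```r
# Split comma-separated dependency fields into a vector of package names.
splitDeps <- function(x) {
    x <- as.character(x)
    x <- x[!is.na(x)]
    trimws(as.character(unlist(strsplit(x, ",", fixed = TRUE))))
}

# Reverse dependencies of `pkg` as recorded in a CRAN_package_db()-style table.
revDeps <- function(db, pkg, fields = c("Reverse depends", "Reverse imports")) {
    rows <- db[db$Package == pkg, , drop = FALSE]
    sort(unique(unlist(lapply(fields, function(f) splitDeps(rows[[f]])))))
}

# Packages present in the new snapshot but not in the old one.
newRevDeps <- function(old, new) setdiff(new, old)

# Stand-in table (hypothetical package names); for real use, replace it with
# db <- tools::CRAN_package_db() and store each result, e.g. with saveRDS().
db <- data.frame(
    Package = c("pkgA", "pkgB"),
    `Reverse depends` = c("pkgX, pkgY", NA),
    `Reverse imports` = c("pkgZ", NA),
    check.names = FALSE, stringsAsFactors = FALSE
)
current <- revDeps(db, "pkgA")
added <- newRevDeps(old = c("pkgX"), new = current)
```

Run periodically (for instance from the cron job suggested earlier in the thread), comparing the stored result with the current one would report newly added reverse depends or imports.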

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [Rd] Compile R to WebAssembly / Emscripten?

2019-02-26 Thread Gabriel Becker

As I recall, the major blocker is that R links against a number of other
things (notably BLAS, pcre, etc) so while technically possible (?) I
suppose, the universe of things you'd have to compile over and then get
working is much larger than just the R internals.

I think most people who consider this (including me years ago, as well as
the poster of Gabor's message to rdevel) hit that point and then go try to
find a less herculean task to pursue.

~G

On Wed, Feb 20, 2019 at 12:57 AM Gábor Csárdi 
wrote:

> This was some time ago:
> https://stat.ethz.ch/pipermail/r-devel/2013-May/066724.html
>
> So probably not hopeless, but I would think it is a lot of work.
>
> Gabor
>
> On Wed, Feb 20, 2019 at 8:17 AM Todd Wilder  wrote:
> >
> > Has anyone attempted to compile R (probably without any OS bindings) to
> > WebAssembly / Emscripten? If so, how far did you get? (would be crazy
> > awesome if you could get all the way to a ggplot bitmap output). If not, is
> > this a waste of time or is there some daylight to doing this?

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [R-pkg-devel] List of reverse dependencies, including archived packages

2019-02-26 Thread Uwe Ligges





On 26.02.2019 18:59, Iñaki Ucar wrote:
> On Tue, 26 Feb 2019 at 18:38, Oliver Dechant  wrote:
>>
>> Somewhat relatedly is there a way to monitor when a package has new
>> reverse depends or imports added?
>
> I don't know of any service currently providing this, if that's what
> you're asking. But it's pretty straightforward to set up a cron job to
> monitor the DESCRIPTION file of interest.

Well, if *reverse* dependencies are of interest, rather monitor the 
package's CRAN webpage which lists reverse dependencies.

Best,
Uwe Ligges

> Iñaki


Re: [R-pkg-devel] List of reverse dependencies, including archived packages

2019-02-26 Thread Iñaki Ucar

On Tue, 26 Feb 2019 at 18:38, Oliver Dechant  wrote:
>
> Somewhat relatedly is there a way to monitor when a package has new
> reverse depends or imports added?

I don't know of any service currently providing this, if that's what
you're asking. But it's pretty straightforward to set up a cron job to
monitor the DESCRIPTION file of interest.

Iñaki


Re: [R-pkg-devel] List of reverse dependencies, including archived packages

2019-02-26 Thread Oliver Dechant

Somewhat relatedly is there a way to monitor when a package has new
reverse depends or imports added?

-- 
Oliver Dechant

Re: [Bioc-devel] How critical is package style for bioconductor?

2019-02-26 Thread Michael Lawrence

Preferably the public API would be camel case for consistency with
other Bioconductor APIs (at least the core ones). It's not a huge deal
though.


On Tue, Feb 26, 2019 at 8:12 AM Turaga, Nitesh
 wrote:
>
> As long as it’s consistent, you can use another style. Consistency helps 
> reviewers read the code easily.
>
> Most people generally use “camelCase” or “snake_case” for variable names and 
> function names.  But keep in mind that class names need to be “CamelCase”, as 
> that is the R style of doing things.
>
> You should follow the other set of rules as given (such as Indentation, File 
> names, Class names, use of space, comments, Namespaces, end user messages, 
> Misc section) http://bioconductor.org/developers/how-to/coding-style/.
>
> Best,
>
> Nitesh
>
>
> On Feb 26, 2019, at 11:00 AM, Aaron Chevalier 
> mailto:a...@bu.edu>> wrote:
>
> Hi all,
>
> I'm developing a package to submit to Bioconductor and I've noticed
> that the style guide suggests camelCasealternatingWords which I find hard
> to read and is different from other style guides and automated
> style-checking I've seen (lintr, Hadley Wickham).
>
> Is this an optional requirement that can be done with another consistent
> style, or if not are there packages that enforce this style guide in an
> automated fashion? I tried BiocCheck and it said nothing about the naming
> conventions.
>
> Thanks!
>

Re: [Bioc-devel] GenVisR Package Failing

2019-02-26 Thread Shepherd, Lori

The ERROR your package is experiencing has to do with conditional length 
greater than 1.  We did send you an email indicating this entitled 
"Bioconductor Package ERROR"  that had the following information:



In a continued effort to improve code, there have been efforts to identify 
problematic existing code in packages. The current issue at hand is identifying 
a conditional with a length greater than 1. This could either be an oversight 
of not using any() or all(), or could potentially indicate problematic code 
with unexpected multiple output.
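
To illustrate the any()/all() point, here is a small sketch (the variable names are made up for the example). The tryCatch() wrapper is there because the length-5 condition is an error since R 4.2.0, while earlier versions only warned unless the check flags described in this message were set.

```r
check <- 1:5

# A length-5 condition: an error since R 4.2.0; earlier versions warned and
# silently used only the first element.
outcome <- tryCatch(
    { if (check > 3) "first element used" },
    warning = function(w) "warning",
    error = function(e) "error"
)

# State the intent explicitly so the condition has length one.
anyAbove <- any(check > 3)   # TRUE: at least one element exceeds 3
allAbove <- all(check > 3)   # FALSE: not every element exceeds 3
```

Which of any() or all() is right depends on what the original code meant; if neither fits, the vector-valued result is probably the "unexpected multiple output" case.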

dummy example:

```
> check = c(1:5)
> if (check > 3){
+     FALSE
+ }
```



It is controlled with the following that are implemented on the build system


_R_CHECK_LENGTH_1_CONDITION_=package:_R_CHECK_PACKAGE_NAME_
_R_CHECK_LENGTH_1_LOGIC2_=package:_R_CHECK_PACKAGE_NAME_

And there is some documentation here which should help get the environment 
similar to reproduce the error:

http://bioconductor.org/developers/package-guidelines/#checkingenv




There is a more detailed Test output section at the bottom of the build report 
page that shows the code that was being run.

http://bioconductor.org/checkResults/3.9/bioc-LATEST/GenVisR/malbec2-checksrc.html








Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263



From: Bioc-devel  on behalf of Skidmore, Zach 

Sent: Tuesday, February 26, 2019 11:10 AM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] GenVisR Package Failing

Hi All,


I maintain the GenVisR package which is currently failing on the devel
branch. I can see it has something to do with the test cases however the
error message is not clear, I don't get a line number for which test
case has failed or any other indication to that effect. Normally this
wouldn't be a problem but I only see failures in the bioc environments
(locally everything is fine). I had thought it might be related to my
visual test cases with the vdiffr package however I skipped all those in
version 1.15.1.


Is there any way I can find the sessionInfo() for the bioc-devel
environment to make sure everything matches my local environment?


Thanks!

Zach




Re: [Bioc-devel] How critical is package style for bioconductor?

2019-02-26 Thread Kasper Daniel Hansen

lintr can be configured to use camelCase.
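
A hedged sketch of what that configuration might look like, assuming a recent lintr release that provides object_name_linter() with a styles argument and lint() with a text argument (older releases used differently named linters). The snippet being linted is invented for the example, and the code only runs if lintr is installed.

```r
# Two assignments: one snake_case name and one camelCase name.
snippet <- "my_value <- 1\nmyValue <- 2\n"

if (requireNamespace("lintr", quietly = TRUE)) {
    # Only check object names, and require camelCase.
    camelOnly <- lintr::object_name_linter(styles = "camelCase")
    lints <- lintr::lint(text = snippet, linters = list(camelOnly))
    print(lints)
}
```

The same linter can presumably be set project-wide through a .lintr file, which would address the automated-enforcement part of the question.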

On Tue, Feb 26, 2019 at 11:12 AM Turaga, Nitesh <
nitesh.tur...@roswellpark.org> wrote:

> As long as it’s consistent, you can use another style. Consistency helps
> reviewers read the code easily.
>
> Most people generally use “camelCase” or “snake_case” for variable names
> and function names.  But keep in mind that class names need to be
> “CamelCase”, as that is the R style of doing things.
>
> You should follow the other set of rules as given (such as Indentation,
> File names, Class names, use of space, comments, Namespaces, end user
> messages, Misc section)
> http://bioconductor.org/developers/how-to/coding-style/.
>
> Best,
>
> Nitesh
>
>
> On Feb 26, 2019, at 11:00 AM, Aaron Chevalier  a...@bu.edu>> wrote:
>
> Hi all,
>
> I'm developing a package to submit to Bioconductor and I've noticed
> that the style guide suggests camelCasealternatingWords which I find hard
> to read and is different from other style guides and automated
> style-checking I've seen (lintr, Hadley Wickham).
>
> Is this an optional requirement that can be done with another consistent
> style, or if not are there packages that enforce this style guide in an
> automated fashion? I tried BiocCheck and it said nothing about the naming
> conventions.
>
> Thanks!
>

Re: [Bioc-devel] How critical is package style for bioconductor?

2019-02-26 Thread Turaga, Nitesh

As long as it’s consistent, you can use another style. Consistency helps 
reviewers read the code easily.

Most people generally use “camelCase” or “snake_case” for variable names and 
function names.  But keep in mind that class names need to be “CamelCase”, as 
that is the R style of doing things.

You should follow the other set of rules as given (such as Indentation, File 
names, Class names, use of space, comments, Namespaces, end user messages, Misc 
section) http://bioconductor.org/developers/how-to/coding-style/.

Best,

Nitesh


On Feb 26, 2019, at 11:00 AM, Aaron Chevalier mailto:a...@bu.edu>> 
wrote:

Hi all,

I'm developing a package to submit to Bioconductor and I've noticed
that the style guide suggests camelCasealternatingWords which I find hard
to read and is different from other style guides and automated
style-checking I've seen (lintr, Hadley Wickham).

Is this an optional requirement that can be done with another consistent
style, or if not are there packages that enforce this style guide in an
automated fashion? I tried BiocCheck and it said nothing about the naming
conventions.

Thanks!


[Bioc-devel] GenVisR Package Failing

2019-02-26 Thread Skidmore, Zach

Hi All,


I maintain the GenVisR package which is currently failing on the devel
branch. I can see it has something to do with the test cases however the
error message is not clear, I don't get a line number for which test
case has failed or any other indication to that effect. Normally this
wouldn't be a problem but I only see failures in the bioc environments
(locally everything is fine). I had thought it might be related to my
visual test cases with the vdiffr package however I skipped all those in
version 1.15.1.


Is there any way I can find the sessionInfo() for the bioc-devel
environment to make sure everything matches my local environment?


Thanks!

Zach




[Bioc-devel] How critical is package style for bioconductor?

2019-02-26 Thread Aaron Chevalier

Hi all,

I'm developing a package to submit to Bioconductor and I've noticed
that the style guide suggests camelCasealternatingWords which I find hard
to read and is different from other style guides and automated
style-checking I've seen (lintr, Hadley Wickham).

Is this an optional requirement that can be done with another consistent
style, or if not are there packages that enforce this style guide in an
automated fashion? I tried BiocCheck and it said nothing about the naming
conventions.

Thanks!


[Rd] Possible Update to R-internals Manual

2019-02-26 Thread brodie gaslam via R-devel

According to R-ints, the only current use of the `truelength` metadatum is 
for environment hash tables.  Jim Hester just made me aware that R 3.4.0 
introduced a new use case: growable vectors.

I attach a patch to the R-ints manual that reflects this change.  The wording 
is obviously just a suggestion.  Additionally, it may be worth moving the 
footnote into the main body of the document now that there are more use cases.
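
For context, the R-level pattern that this use case concerns is assignment one past the end of a vector inside a loop. The sketch below shows only that pattern; there appears to be no documented R-level accessor for `truelength`, so the over-allocation itself is not visible from this code.

```r
n <- 10L
x <- integer(0)
for (i in seq_len(n)) {
    # Assigning one past the current end grows the vector by one element.
    x[i] <- i * 2L
}
```

With the growable-vector change, such a loop should need fewer reallocations than one per iteration, which is what the proposed footnote text describes.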

Best,

Brodie.

Index: doc/manual/R-ints.texi
===
--- doc/manual/R-ints.texi  (revision 76152)
+++ doc/manual/R-ints.texi  (working copy)
@@ -366,6 +366,9 @@
 
 Bit 4 is turned on to mark S4 objects.
 
+Bit 5 for vectors is used to indicate that the vector is overallocated
+and thus may be growable without a new allocation.
+
 Bits 1, 2, 3, 5 and 6 are used for a @code{CHARSXP} to denote its
 encoding.  Bit 1 indicates that the @code{CHARSXP} should be treated as
 a set of bytes, not necessarily representing a character in any known
@@ -406,16 +409,19 @@
 types are a @code{VECTOR_SEXPREC}, which again consists of the header
 and the same three pointers, but followed by two integers giving the
 length and `true length'@footnote{This is almost unused.  The only
-current use is for hash tables of environments (@code{VECSXP}s), where
+current uses are for hash tables of environments (@code{VECSXP}s), where
 @code{length} is the size of the table and @code{truelength} is the
-number of primary slots in use, and for the reference hash tables in
+number of primary slots in use, for the reference hash tables in
 serialization (@code{VECSXP}s), where @code{truelength} is the number of
-slots in use.} of the vector, and then followed by the data (aligned as
-required: on most 32-bit systems with a 24-byte @code{VECTOR_SEXPREC}
-node the data can follow immediately after the node).  The data are a
-block of memory of the appropriate length to store `true length'
-elements (rounded up to a multiple of 8 bytes, with the 8-byte blocks
-being the `Vcells' referred in the documentation for @code{gc()}).
+slots in use, and for vectors that are over-allocated due to assignment
+past the original length, where @code{length} is the in-use length and
+@code{truelength} is the allocated length.} of the vector, and then
+followed by the data (aligned as required: on most 32-bit systems with a
+24-byte @code{VECTOR_SEXPREC} node the data can follow immediately after
+the node).  The data are a block of memory of the appropriate length to
+store `true length' elements (rounded up to a multiple of 8 bytes, with
+the 8-byte blocks being the `Vcells' referred in the documentation for
+@code{gc()}).
 
 The `data' for the various types are given in the table below.  A lot of
 this is interpretation, i.e.@: the types are not checked.

Re: [R-pkg-devel] Possible Rtools path problem

2019-02-26 Thread Horia Yeb

Hey all and thank you for helping me,
In the end, I re-installed R, and R tools and the check gives me no error
–on this computer. I tried all of the above, didn't solve the problem.

Thanks again anyway!
Horia.

On Mon, 25 Feb 2019 at 19:09, Uwe Ligges 
wrote:

> One of the programs that don't work is find from Rtools if it is behind
> C:\Windows\system32, where the completely different Windows find is found.
>
> Best,
> Uwe Ligges
>
>
> On 25.02.2019 19:06, Duncan Murdoch wrote:
> > On 25/02/2019 11:01 a.m., Dirk Eddelbuettel wrote:
> >>
> >> The R-on-Windows FAQ recommends NOT installing in a path with spaces.
> >>
> >> The R Installer on Windows defaults to a path with spaces.
> >>
> >> I cannot reconcile it either. Such is life, sometimes.
> >>
> >> But when I had to work on that platform in the past I put my open source
> >> stuff into c:/opt/ -- so maybe try reinstalling?
> >>
> >> Rtools also had (has ?) a gotcha requiring c:/ placement.
> >>
> >
> > Those might be the problem, but to me it looks more like a path order
> > problem:
> >
> >> *My path.getenv is : *
> >>
> >>
> >>  [1] "C:\\Users\\ USER\\Documents\\R\\R-3.5.2\\bin\\i386"
> >>  [2] "C:\\Program Files (x86)\\Intel\\Intel(R) Management Engine Components\\iCLS\\"
> >>  [3] "C:\\Program Files\\Intel\\Intel(R) Management Engine Components\\iCLS\\"
> >>  [4] "C:\\WINDOWS\\system32"
> >>  [5] "C:\\WINDOWS"
> >>  [6] "C:\\WINDOWS\\System32\\Wbem"
> >>  [7] "C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\"
> >>  [8] "C:\\Program Files (x86)\\Intel\\Intel(R) Management Engine Components\\DAL"
> >>  [9] "C:\\Program Files\\Intel\\Intel(R) Management Engine Components\\DAL"
> >> [10] "C:\\Program Files (x86)\\Intel\\Intel(R) Management Engine Components\\IPT"
> >> [11] "C:\\Program Files\\Intel\\Intel(R) Management Engine Components\\IPT"
> >> [12] "C:\\Program Files\\CMake\\bin"
> >> [13] "C:\\Program Files\\R\\R-3.5.0\\bin"
> >> [14] "C:\\WINDOWS\\System32\\OpenSSH\\"
> >> [15] "C:\\HashiCorp\\Vagrant\\bin"
> >> [16] "C:\\ProgramData\\chocolatey\\bin"
> >> [17] "C:\\Program Files\\Microsoft SQL Server\\130\\Tools\\Binn\\"
> >> [18] "C:\\Program Files\\Java\\jdk1.8.0_181\\bin"
> >> [19] "C:\\Qt\\Tools\\QtCreator\\bin"
> >> [20] "C:\\Qt\\5.11.1\\msvc2017_64\\bin"
> >> [21] "C:\\Windows\\System32"
> >> [22] "C:\\Program Files (x86)\\GnuWin32\\bin"
> >> [23] "C:\\Program Files\\Git\\cmd"
> >> [24] "C:\\Rtools\\bin"
> >> [25] "C:\\Rtools\\MinGW\\bin"
> >> [26] "C:\\Users\\ USER\\AppData\\Local\\Microsoft\\WindowsApps"
> >> [27] "C:\\Users\\ USER\\AppData\\Local\\atom\\bin"
> >> [28] "C:\\Users\\ USER\\AppData\\Local\\Programs\\MiKTeX 2.9\\miktex\\bin\\x64\\"
> >
> > The OP has 23 directories in the path ahead of the Rtools directories;
> > I'd guess one of them contains a like-named command that is messing
> > things up.  From the message
> >
> >> 1: In FUN(X[[i]], ...) : this requires 'nm' to be on the PATH
> >
> > my guess would be that there's a bad 'nm.exe' somewhere in there.  The
> > real one is likely in directory 25 (I haven't got current Rtools
> > installed, so can't tell), but if there's another one earlier, things
> > won't work.
> >
> > I'd recommend putting the Rtools directories first.  That might mess up
> > one of the other programs that also wants to be first, so I wouldn't do
> > it globally, just set up a batch or cmd file to modify the path when you
> > want to use Rtools.
> >
> > Duncan Murdoch
> >

Re: [Rd] bias issue in sample() (PR 17494)

2019-02-26 Thread Tierney, Luke

On Tue, 26 Feb 2019, Kirill Müller wrote:

> Ralf
>
>
> I don't doubt this is expected with the current implementation, I doubt the 
> implementation is desirable. Suggesting to turn this to
>
> pbirthday(1e6, classes = 2^53)
> ## [1] 5.550956e-05

That isn't a small number given simulation sizes people routinely run
these days. Just about right to miss an issue in a pilot run and get
bitten on the real one.
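
To put numbers on the two resolutions discussed in this thread, the sketch below evaluates pbirthday() for 2^32 and 2^53 classes and adds a back-of-the-envelope count of expected repeats at 2^32. The formula for the expected number of distinct values is the standard occupancy result; the variable names are made up for the example.

```r
n <- 1e6

# Probability of at least one coincidence among n draws.
p32 <- pbirthday(n, classes = 2^32)
p53 <- pbirthday(n, classes = 2^53)

# Expected number of draws that repeat an earlier value when sampling
# uniformly from N equally likely values: n - N * (1 - (1 - 1/N)^n).
N <- 2^32
expectedDistinct <- N * (-expm1(n * log1p(-1 / N)))
expectedRepeats <- n - expectedDistinct
```

Under these assumptions expectedRepeats comes out at roughly 116, the same order of magnitude as the 137 duplicates implied by the runif() output quoted further down (1e6 minus 999863).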

In the inversion generator for normals we already use a higher
resolution uniform produced from two regular ones. I considered
switching to that approach for all uniforms, either in addition to or
instead of changing the uniform integer sampling algorithm used in
sample(). But that would have been even more disruptive:

- all simulation results (except normals) would change;
- there would be a performance penalty;
- the streams would be used up twice as fast;

It would also probably be necessary to rethink things like how to use
the L'Ecuyer generator to produce multiple streams in the `parallel`
package.

We may need to take this route in the future, but it didn't seem like
a good idea at this time.
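
As an R-level illustration of "a higher resolution uniform produced from two regular ones", here is a minimal sketch. It follows the general scheme of taking the integer part from one draw and the fractional part from a second; the scale 2^27 and the name highResUnif are illustrative assumptions, not a transcription of R's C code.

```r
# Combine two uniform draws into one value with finer resolution:
# the first draw picks one of `big` bins, the second a position inside the bin.
highResUnif <- function(n, big = 2^27) {
    (floor(big * runif(n)) + runif(n)) / big
}

set.seed(123)
u <- highResUnif(1e5)
```

Each value consumes two draws from the underlying generator, which is the "streams would be used up twice as fast" cost listed above.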

Best,

luke

>
> (which is still non-zero, but much less likely to cause confusion.)
>
>
> Best regards
>
> Kirill
>
> On 26.02.19 10:18, Ralf Stubner wrote:
>> Kirill,
>> 
>> I think some level of collision is actually expected! R uses a 32bit MT
>> that can produce 2^32 different doubles. The probability for a collision
>> within a million draws is
>> 
>>> pbirthday(1e6, classes = 2^32)
>> [1] 1
>> 
>> Greetings
>> Ralf
>> 
>> 
>> On 26.02.19 07:06, Kirill Müller wrote:
>>> Gabe
>>> 
>>> 
>>> As mentioned on Twitter, I think the following behavior should be fixed
>>> as part of the upcoming changes:
>>> 
>>> R.version.string
>>> ## [1] "R Under development (unstable) (2019-02-25 r76160)"
>>> .Machine$double.digits
>>> ## [1] 53
>>> set.seed(123)
>>> RNGkind()
>>> ## [1] "Mersenne-Twister" "Inversion"    "Rejection"
>>> length(table(runif(1e6)))
>>> ## [1] 999863
>>> 
>>> I don't expect any collisions when using Mersenne-Twister to generate a
>>> million floating point values. I'm not sure what causes this behavior,
>>> but it's documented in ?Random:
>>> 
>>> "Do not rely on randomness of low-order bits from RNGs. Most of the
>>> supplied uniform generators return 32-bit integer values that are
>>> converted to doubles, so they take at most 2^32 distinct values and long
>>> runs will return duplicated values (Wichmann-Hill is the exception, and
>>> all give at least 30 varying bits.)"
>>> 
>>> The "Wichmann-Hill" bit is interesting:
>>> 
>>> RNGkind("Wichmann-Hill")
>>> length(table(runif(1e6)))
>>> ## [1] 1000000
>>> length(table(runif(1e6)))
>>> ## [1] 1000000
>>> 
>>> Mersenne-Twister has a much much larger periodicity than Wichmann-Hill,
>>> it would be great to see the above behavior also for Mersenne-Twister.
>>> Thanks for considering.
>>> 
>>> 
>>> Best regards
>>> 
>>> Kirill
>>> 
>>> 
>>> On 20.02.19 08:01, Gabriel Becker wrote:
>>>> Luke,
>>>>
>>>> I'm happy to help with this. It's great to see this get tackled (I've
>>>> cc'ed
>>>> Kelli Ottoboni who helped flag this issue).
>>>>
>>>> I can prepare a patch for the RNGkind related stuff and the doc update.
>>>>
>>>> As for ???, what are your (and others') thoughts about the possibility of
>>>> a) a reproducibility API which takes either an R version (or maybe
>>>> alternatively a date) and sets the RNGkind to the default for that
>>>> version/date, and/or b) that sessionInfo be modified to capture (and
>>>> display) the RNGkind in effect.
>>>>
>>>> Best,
>>>> ~G
>>>>
>>>>
>>>> On Tue, Feb 19, 2019 at 11:52 AM Tierney, Luke
>>>> wrote:
>>>>
>>>>> Before the next release we really should sort out the bias issue in
>>>>> sample() reported by Ottoboni and Stark in
>>>>> https://www.stat.berkeley.edu/~stark/Preprints/r-random-issues.pdf and
>>>>> filed as a bug report by Duncan Murdoch at
>>>>> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17494.
>>>>>
>>>>> Here are two examples of bad behavior through current R-devel:
>>>>>
>>>>>    set.seed(123)
>>>>>    m <- (2/5) * 2^32
>>>>>    x <- sample(m, 1000000, replace = TRUE)
>>>>>    table(x %% 2, x > m / 2)
>>>>>    ##
>>>>>    ##    FALSE   TRUE
>>>>>    ## 0 300620 198792
>>>>>    ## 1 200196 300392
>>>>>
>>>>>    table(sample(2/7 * 2^32, 1000000, replace = TRUE) %% 2)
>>>>>    ##
>>>>>    ##      0      1
>>>>>    ## 429054 570946
>>>>>
>>>>> I committed a modification to R_unif_index to address this by
>>>>> generating random bits (blocks of 16) and rejection sampling, but for
>>>>> now this is only enabled if the environment variable R_NEW_SAMPLE is
>>>>> set before the first call.
>>>>>
>>>>> Some things still needed:
>>>>>
>>>>> - someone to look over the change and see if there are any issues
>>>>> - adjustment of RNGkind to allow the old behavior to be selected
>>>>> - make the new behavior the default
>>>>> -

Re: [Bioc-devel] Warning about the size of git pack file

2019-02-26 Thread Turaga, Nitesh

Hi Vinh,

See in line replies,


On Feb 22, 2019, at 2:30 PM, Vinh Tran <t...@bio.uni-frankfurt.de> wrote:

Dear all,

I am planning to submit a package into Bioconductor. But BiocCheck() gave a 
warning about the size of the git pack file:

$warning
[1] "The following files are over 5MB in size: 
'.git/objects/pack/pack-29a3f2d2d1ed7d701d59dd2ce921229f2b9dab68.pack'"

I tried to use git filter-branch to delete the big files in git history, but 
the pack file is still almost 25MB. I also tried BFG-Tool to do the same task, 
but I couldn’t push the changes from BFG to github due to some hidden branches 
(or deleted/merged git pulls) that denied the commit.

I would like to ask, if it is a serious warning that must be resolved? If yes, 
could you please suggest me another way to reduce the size of the git pack file?


Yes, we consider it a warning that needs to be resolved. Please see if any of 
the following links help you first,

http://bioconductor.org/developers/how-to/git/remove-large-data/

There is an older post 
https://stat.ethz.ch/pipermail/bioc-devel/2018-November/014274.html which deals 
with the same issue. Cleaning up large .git files with BFG cleaner. Please take 
a look at that as well and see if it helps you.

Another thing to keep in mind is that, unless you prune the repository 
afterwards (for example with “git gc --prune=now”), your package size doesn’t 
actually get smaller, because the previous data is still stored in the 
.git/objects/pack files.
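
To check locally which files would trip the size warning before resubmitting, something like the following sketch may help. large_files() is an illustrative helper, not part of BiocCheck, and treating "5MB" as 5 * 1024^2 bytes is an assumption.

```r
# Illustrative helper: list files under `dir` (including hidden ones such
# as .git) whose size exceeds `limit` bytes.
# Assumption: "5MB" is taken here as 5 * 1024^2 bytes.
large_files <- function(dir, limit = 5 * 1024^2) {
  files <- list.files(dir, recursive = TRUE, all.files = TRUE,
                      full.names = TRUE)
  sizes <- file.size(files)
  files[!is.na(sizes) & sizes > limit]
}

# Example: large_files("path/to/package")
```

Running it again after cleaning the history and pruning shows whether the pack file has actually shrunk.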


Secondly, I also got a note from BiocCheck() that I am not listed in the 
bioc-devel mailing list. But it is unclear to me, because I've already 
subscribed to both bioc-devel and also the support.bioconductor.org site using 
this email (t...@bio.uni-frankfurt.de). Could anyone check this for me, please?

$note
[1] "Cannot determine whether maintainer is subscribed to the 
bioc-devel\nmailing list (requires admin credentials).  Subscribe 
here:\nhttps://stat.ethz.ch/mailman/listinfo/bioc-devel"


You should be able to submit the package and if the issue persists on the build 
machine, we can take a look at it. One of the admins on the mailing list can 
see if you are subscribed successfully or not.

I wouldn’t worry about this till your package gets submitted without the large 
files issue.

Best,

Nitesh


Thanks so much for your help!
Best regards,
Vinh

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel



This email message may contain legally privileged and/or confidential 
information.  If you are not the intended recipient(s), or the employee or 
agent responsible for the delivery of this message to the intended 
recipient(s), you are hereby notified that any disclosure, copying, 
distribution, or use of this email message is prohibited.  If you have received 
this message in error, please notify the sender immediately by e-mail and 
delete this email message from your computer. Thank you.

Re: [R-pkg-devel] List of reverse dependencies, including archived packages

2019-02-26 Thread Peter Carbonetto

That works quite well! Thank you for the suggestion.

Peter

On Tue, Feb 26, 2019 at 1:45 AM Iñaki Ucar  wrote:

> On Tue, 26 Feb 2019 at 05:29, Peter Carbonetto
>  wrote:
> >
> > I'm wondering if there is a way to get a list of reverse dependencies on
> > CRAN that includes archived packages.
> >
> > I am asking because "ashr", an R package I maintain, was recently removed
> > from CRAN, and now it is back after I fixed the critical problem.
> However,
> > some of the downstream packages that were affected by this removal are no
> > longer listed as a reverse dependency.
>
> I think that the easiest way is to use Microsoft's CRAN time machine
> [1]. Simply set the session's repo to the day before the archiving
> date, and then you can get the complete list using regular procedures.
>
> [1] https://mran.microsoft.com/timemachine
>
> Iñaki
>
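
A minimal sketch of the snapshot approach described above. It assumes the snapshot URL pattern https://mran.microsoft.com/snapshot/<date> is reachable, which may no longer be the case; snapshot_repo() and reverse_deps_at() are hypothetical helper names, and the date is a placeholder to be replaced with the day before the archiving date.

```r
# Hypothetical helper: repository URL for a dated CRAN snapshot.
snapshot_repo <- function(snapshot_date) {
  paste0("https://mran.microsoft.com/snapshot/", snapshot_date)
}

# Hypothetical helper: reverse dependencies of `pkg` as recorded in the
# package index of that snapshot.
reverse_deps_at <- function(pkg, snapshot_date) {
  db <- available.packages(repos = snapshot_repo(snapshot_date))
  tools::package_dependencies(
    pkg, db = db,
    which = c("Depends", "Imports", "LinkingTo", "Suggests"),
    reverse = TRUE
  )[[pkg]]
}

# Example call (needs network access and a live snapshot server):
# reverse_deps_at("ashr", "YYYY-MM-DD")
```

Comparing the result with the same call against the current CRAN index shows which downstream packages dropped out of the listing.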

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [Bioc-devel] Error package build

2019-02-26 Thread Shepherd, Lori

Yes the maintainer that is going to actively maintain the package (or 
coordinate maintenance) should be listed in the Description and that maintainer 
email MUST be registered at the mailing list and support site.

Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263

From: Bioc-devel  on behalf of margaret linan 

Sent: Tuesday, February 26, 2019 6:07:30 AM
To: Kasper Daniel Hansen; valentin.d...@asu.edu
Cc: bioc-devel
Subject: Re: [Bioc-devel] Error package build

Hi -

I have been working with Dr. Valentin Dinu on the PoTRA package since last 
year. I have the current version on GitHub: Bioconductor-PoTRA.

His lab members work on developing and testing different versions of PoTRA, so 
those versions are kept on his lab page, which is why I created the separate 
GitHub repository.

I can send him the links to subscribe and register for the different 
Bioconductor email lists and devel support site if that resolves the build 
errors.

Thanks,
Margaret

Sent from my iPhone

> On Feb 25, 2019, at 10:32 PM, Kasper Daniel Hansen 
>  wrote:
>
> According to the GitHub repos, the current maintainer is
>   Valentin Dinu 
> Which is not the email you have sent this message from. Do you have multiple 
> email addresses and this is causing this?
>
>> On Mon, Feb 25, 2019 at 8:44 PM margaret linan  wrote:
>> Hi -
>>
>> I have submitted my package PoTRA using the Bioconductor Issues page and the 
>> build results keep stating two errors:
>>
>> 1. That I am not a subscriber to the bioc-devel mailing list.
>>
>> 2. That I am not registered at the support site.
>>
>> I am a subscriber and have registered, so I am confused about why these are 
>> coming up as errors.
>>
>> Thanks.
>>
>> ML
>>
>> Sent from my iPhone
>> ___
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Rd] bias issue in sample() (PR 17494)

2019-02-26 Thread Kirill Müller


Ralf


I don't doubt this is expected with the current implementation, I doubt 
the implementation is desirable. Suggesting to turn this to


pbirthday(1e6, classes = 2^53)
## [1] 5.550956e-05

(which is still non-zero, but much less likely to cause confusion.)


Best regards

Kirill

On 26.02.19 10:18, Ralf Stubner wrote:

Kirill,

I think some level of collision is actually expected! R uses a 32bit MT
that can produce 2^32 different doubles. The probability for a collision
within a million draws is


pbirthday(1e6, classes = 2^32)

[1] 1

Greetings
Ralf


On 26.02.19 07:06, Kirill Müller wrote:

Gabe


As mentioned on Twitter, I think the following behavior should be fixed
as part of the upcoming changes:

R.version.string
## [1] "R Under development (unstable) (2019-02-25 r76160)"
.Machine$double.digits
## [1] 53
set.seed(123)
RNGkind()
## [1] "Mersenne-Twister" "Inversion"    "Rejection"
length(table(runif(1e6)))
## [1] 999863

I don't expect any collisions when using Mersenne-Twister to generate a
million floating point values. I'm not sure what causes this behavior,
but it's documented in ?Random:

"Do not rely on randomness of low-order bits from RNGs. Most of the
supplied uniform generators return 32-bit integer values that are
converted to doubles, so they take at most 2^32 distinct values and long
runs will return duplicated values (Wichmann-Hill is the exception, and
all give at least 30 varying bits.)"

The "Wichmann-Hill" bit is interesting:

RNGkind("Wichmann-Hill")
length(table(runif(1e6)))
## [1] 1000000
length(table(runif(1e6)))
## [1] 1000000

Mersenne-Twister has a much much larger periodicity than Wichmann-Hill,
it would be great to see the above behavior also for Mersenne-Twister.
Thanks for considering.


Best regards

Kirill


On 20.02.19 08:01, Gabriel Becker wrote:

Luke,

I'm happy to help with this. It's great to see this get tackled (I've
cc'ed
Kelli Ottoboni who helped flag this issue).

I can prepare a patch for the RNGkind related stuff and the doc update.

As for ???, what are your (and others') thoughts about the possibility of
a) a reproducibility API which takes either an R version (or maybe
alternatively a date) and sets the RNGkind to the default for that
version/date, and/or b) that sessionInfo be modified to capture (and
display) the RNGkind in effect.
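
Until something like option (b) exists, a script can record the RNG settings itself. This is only a sketch of the idea; the variable names are illustrative, and it assumes RNGkind() accepts as many arguments as it returns.

```r
# Record the RNG settings next to the usual session details.
rng_settings <- RNGkind()
session <- list(info = sessionInfo(), rng = rng_settings)

# Later, restore the recorded settings before re-running a simulation.
# Assumes RNGkind() accepts as many arguments as it returned.
do.call(RNGkind, as.list(rng_settings))
set.seed(123)
```

Saving the `session` list alongside the results keeps the generator choice with the output it produced.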

Best,
~G


On Tue, Feb 19, 2019 at 11:52 AM Tierney, Luke 
wrote:


Before the next release we really should sort out the bias issue in
sample() reported by Ottoboni and Stark in
https://www.stat.berkeley.edu/~stark/Preprints/r-random-issues.pdf and
filed as a bug report by Duncan Murdoch at
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17494.

Here are two examples of bad behavior through current R-devel:

   set.seed(123)
   m <- (2/5) * 2^32
   x <- sample(m, 1000000, replace = TRUE)
   table(x %% 2, x > m / 2)
   ##
   ##    FALSE   TRUE
   ## 0 300620 198792
   ## 1 200196 300392

   table(sample(2/7 * 2^32, 1000000, replace = TRUE) %% 2)
   ##
   ##      0      1
   ## 429054 570946

I committed a modification to R_unif_index to address this by
generating random bits (blocks of 16) and rejection sampling, but for
now this is only enabled if the environment variable R_NEW_SAMPLE is
set before the first call.
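
The idea of "random bits plus rejection sampling" can be sketched in R as follows. This is an illustration only, not the committed C code: unif_index_sketch() is a hypothetical name, and it draws one bit per runif() call rather than blocks of 16 bits.

```r
# Illustrative only; R's actual change lives in C (R_unif_index).
# Draw an index in 1..n by assembling just enough random bits and
# rejecting any value that falls outside the range.
unif_index_sketch <- function(n) {
  bits <- max(1, ceiling(log2(n)))
  repeat {
    b <- runif(bits) < 0.5                    # one random bit per draw
    v <- sum(b * 2^(seq_len(bits) - 1))       # value in [0, 2^bits)
    if (v < n) return(v + 1)
  }
}

set.seed(123)
table(replicate(10000, unif_index_sketch(5)))
```

Every accepted value comes from the same number of equally likely bit patterns, so no index is favoured; the price is an occasional rejected draw.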

Some things still needed:

- someone to look over the change and see if there are any issues
- adjustment of RNGkind to allow the old behavior to be selected
- make the new behavior the default
- adjust documentation
- ???

Unfortunately I don't have enough free cycles to do this, but I can
help if someone else can take the lead.

There are two other places I found that might suffer from the same
issue, in walker_ProbSampleReplace (pointed out by O & S) and in
src/nmath/wilcox.c.  Both can be addressed by using R_unif_index. I
have done that for walker_ProbSampleReplace, but the wilcox change
might need adjusting to support the standalone math library and I
don't feel confident enough I'd get that right.

Best,

luke


--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics and    Fax:   319-335-3017
  Actuarial Science
241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


 [[alternative HTML version deleted]]


Re: [Bioc-devel] Error package build

2019-02-26 Thread margaret linan

Hi -

I have been working with Dr. Valentin Dinu on the PoTRA package since last 
year. I have the current version on GitHub: Bioconductor-PoTRA.

His lab members work on developing and testing different versions of PoTRA, so 
those versions are kept on his lab page, which is why I created the separate 
GitHub repository.

I can send him the links to subscribe and register for the different 
Bioconductor email lists and devel support site if that resolves the build 
errors.

Thanks,
Margaret 

Sent from my iPhone

> On Feb 25, 2019, at 10:32 PM, Kasper Daniel Hansen 
>  wrote:
> 
> According to the GitHub repos, the current maintainer is
>   Valentin Dinu 
> Which is not the email you have sent this message from. Do you have multiple 
> email addresses and this is causing this?
> 
>> On Mon, Feb 25, 2019 at 8:44 PM margaret linan  wrote:
>> Hi -
>> 
>> I have submitted my package PoTRA using the Bioconductor Issues page and the 
>> build results keep stating two errors:
>> 
>> 1. That I am not a subscriber to the bioc-devel mailing list.
>> 
>> 2. That I am not registered at the support site.
>> 
>> I am a subscriber and have registered, so I am confused about why these are 
>> coming up as errors.
>> 
>> Thanks.
>> 
>> ML
>> 
>> Sent from my iPhone
>> ___
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Rd] bias issue in sample() (PR 17494)

2019-02-26 Thread Ralf Stubner

Kirill,

I think some level of collision is actually expected! R uses a 32bit MT
that can produce 2^32 different doubles. The probability for a collision
within a million draws is

> pbirthday(1e6, classes = 2^32)
[1] 1
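
The size of the effect can also be estimated directly. The sketch below computes the expected number of duplicated values among n draws from N equally likely classes; the variable names are illustrative.

```r
n <- 1e6    # number of draws
N <- 2^32   # distinct values the generator can return

# Expected number of distinct values among n uniform draws from N classes,
# written with log1p/expm1 to avoid cancellation.
expected_distinct <- -N * expm1(n * log1p(-1 / N))
expected_dupes <- n - expected_distinct
expected_dupes       # roughly n^2 / (2 * N), about 116
```

That is the same order of magnitude as the 1e6 - 999863 = 137 duplicates in the run quoted below.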

Greetings
Ralf


On 26.02.19 07:06, Kirill Müller wrote:
> Gabe
> 
> 
> As mentioned on Twitter, I think the following behavior should be fixed
> as part of the upcoming changes:
> 
> R.version.string
> ## [1] "R Under development (unstable) (2019-02-25 r76160)"
> .Machine$double.digits
> ## [1] 53
> set.seed(123)
> RNGkind()
> ## [1] "Mersenne-Twister" "Inversion"    "Rejection"
> length(table(runif(1e6)))
> ## [1] 999863
> 
> I don't expect any collisions when using Mersenne-Twister to generate a
> million floating point values. I'm not sure what causes this behavior,
> but it's documented in ?Random:
> 
> "Do not rely on randomness of low-order bits from RNGs. Most of the
> supplied uniform generators return 32-bit integer values that are
> converted to doubles, so they take at most 2^32 distinct values and long
> runs will return duplicated values (Wichmann-Hill is the exception, and
> all give at least 30 varying bits.)"
> 
> The "Wichmann-Hill" bit is interesting:
> 
> RNGkind("Wichmann-Hill")
> length(table(runif(1e6)))
> ## [1] 1000000
> length(table(runif(1e6)))
> ## [1] 1000000
> 
> Mersenne-Twister has a much much larger periodicity than Wichmann-Hill,
> it would be great to see the above behavior also for Mersenne-Twister.
> Thanks for considering.
> 
> 
> Best regards
> 
> Kirill
> 
> 
> On 20.02.19 08:01, Gabriel Becker wrote:
>> Luke,
>>
>> I'm happy to help with this. It's great to see this get tackled (I've
>> cc'ed
>> Kelli Ottoboni who helped flag this issue).
>>
>> I can prepare a patch for the RNGkind related stuff and the doc update.
>>
>> As for ???, what are your (and others') thoughts about the possibility of
>> a) a reproducibility API which takes either an R version (or maybe
>> alternatively a date) and sets the RNGkind to the default for that
>> version/date, and/or b) that sessionInfo be modified to capture (and
>> display) the RNGkind in effect.
>>
>> Best,
>> ~G
>>
>>
>> On Tue, Feb 19, 2019 at 11:52 AM Tierney, Luke 
>> wrote:
>>
>>> Before the next release we really should sort out the bias issue in
>>> sample() reported by Ottoboni and Stark in
>>> https://www.stat.berkeley.edu/~stark/Preprints/r-random-issues.pdf and
>>> filed as a bug report by Duncan Murdoch at
>>> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17494.
>>>
>>> Here are two examples of bad behavior through current R-devel:
>>>
>>>   set.seed(123)
>>>   m <- (2/5) * 2^32
>>>   x <- sample(m, 1000000, replace = TRUE)
>>>   table(x %% 2, x > m / 2)
>>>   ##
>>>   ##    FALSE   TRUE
>>>   ## 0 300620 198792
>>>   ## 1 200196 300392
>>>
>>>   table(sample(2/7 * 2^32, 1000000, replace = TRUE) %% 2)
>>>   ##
>>>   ##      0      1
>>>   ## 429054 570946
>>>
>>> I committed a modification to R_unif_index to address this by
>>> generating random bits (blocks of 16) and rejection sampling, but for
>>> now this is only enabled if the environment variable R_NEW_SAMPLE is
>>> set before the first call.
>>>
>>> Some things still needed:
>>>
>>> - someone to look over the change and see if there are any issues
>>> - adjustment of RNGkind to allow the old behavior to be selected
>>> - make the new behavior the default
>>> - adjust documentation
>>> - ???
>>>
>>> Unfortunately I don't have enough free cycles to do this, but I can
>>> help if someone else can take the lead.
>>>
>>> There are two other places I found that might suffer from the same
>>> issue, in walker_ProbSampleReplace (pointed out by O & S) and in
>>> src/nmath/wilcox.c.  Both can be addressed by using R_unif_index. I
>>> have done that for walker_ProbSampleReplace, but the wilcox change
>>> might need adjusting to support the standalone math library and I
>>> don't feel confident enough I'd get that right.
>>>
>>> Best,
>>>
>>> luke
>>>
>>>
>>> -- 
>>> Luke Tierney
>>> Ralph E. Wareham Professor of Mathematical Sciences
>>> University of Iowa  Phone: 319-335-3386
>>> Department of Statistics and    Fax:   319-335-3017
>>>  Actuarial Science
>>> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
>>> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Ralf Stubner
Senior Software Engineer / Trainer

daqana GmbH
Dortustraße 48
14467
