On 6/16/2015 1:32 PM, Peter Meissner wrote:
On .06.2015, at 14:55, Millot Gael <gael.mil...@curie.fr> wrote:

Hi.

I have a problem with the default behavior of sample(), which performs
sample(1:x) when x is a single value.
This behavior is well explained in ?sample.
However, this behavior is annoying when the number of values is not
predictable. Would it be possible to add an argument
that deactivates this expansion and performs the sampling on the single value itself?
Examples of the results I would like to get:
sample(10, size = 1, replace = FALSE)
10

sample(10, size = 3, replace = TRUE)
10 10 10

sample(10, size = 3, replace = FALSE)
Error

I think the problem here is that the function actually does what you
would expect it to do from a statistical perspective. A sample of size
three from a population of one, drawn without replacement, is simply
not defined. What should the function return?


If I understand correctly, this error is exactly what the poster would like to see but currently does not get. If length(population) == 1, sample() currently draws from 1:population, not from the population itself. So:

> sample(8:10, 3, replace = FALSE)
[1] 10  8  9
> sample(9:10, 3, replace = FALSE)
Error in sample.int(length(x), size, replace, prob) :
  cannot take a sample larger than the population when 'replace = FALSE'
> sample(10:10, 3, replace = FALSE)
[1]  8 10  2

I have to admit that I also find this behaviour inconsistent, even if it is well described already on the first line of the details in the documentation. It is definitely a feature which can cause some trouble, and where the tests might end up more complicated than you would first think.



... You can always wrap your code in try() like this to prevent errors
from breaking loops or functions:

try(sample(...))

No error is given when length(population) == 1, and the result might look perfectly valid if the population varies between runs. So this can easily stay in the script as an undetected bug.
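
As a rough illustration of this point, the following sketch shows that try() has nothing to catch in this case. The variable population is a hypothetical stand-in for a vector that happens to have length one:

```r
# Hypothetical population that turned out to have a single element.
population <- 10

# sample() expands 10 to 1:10, so no error is raised here.
res <- try(sample(population, 3, replace = FALSE), silent = TRUE)

inherits(res, "try-error")   # FALSE: try() caught nothing
all(res %in% 1:10)           # TRUE: the values were drawn from 1:10
```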


... or you might check your arguments before execution:


if (!replace && length(population) >= size) {
   sample(population, size = size, replace = replace)
} else {
   ...
}

This test is not sufficient if length(population) and size are both 1: the condition is then TRUE, and sample() still draws from 1:population. So you will also need to check for this special case:

if (length(population) == 1 && size == 1) {
  population
} else if (!replace && length(population) >= size) {
  sample(population, size = size, replace = replace)
} else {
  ...
}
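
An alternative to these explicit checks is to sample indices instead of values. If I remember correctly, the examples in ?sample suggest a helper along these lines; treat the sketch below as an assumption about one possible workaround, not as the recommended fix:

```r
# Sample positions with sample.int() and use them to index x,
# so a length-one x is never expanded to 1:x.
resample <- function(x, ...) x[sample.int(length(x), ...)]

resample(10, 1)                   # 10
resample(10, 3, replace = TRUE)   # 10 10 10
resample(8:10, 3)                 # a permutation of 8, 9, 10

# A sample larger than the population now fails, as the poster asked.
res <- try(resample(10, 3), silent = TRUE)
inherits(res, "try-error")        # TRUE
```

With such a helper the three cases from the original post give 10, 10 10 10, and an error.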

Then the question would be whether this test could be replaced with a new argument to sample(), e.g. expandSingle, which is TRUE by default for backward compatibility and can be set to FALSE if you don't want population to be expanded to 1:population. It could certainly be useful in some cases, but you still need to know about the expansion to use it. I think most of these bugs occur because users did not think about the expansion in the first place, or did not realize that their population could be of length 1 in some situations. These users would therefore not think about changing the argument either.
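
To make the idea concrete, here is a minimal sketch of how such an argument might behave, written as a wrapper and not as a change to sample() itself. The function name sample_expand and the argument expandSingle are hypothetical; neither exists in base R:

```r
# Hypothetical wrapper: expandSingle = TRUE keeps the current behaviour,
# expandSingle = FALSE always samples from the elements of x.
sample_expand <- function(x, size = NULL, replace = FALSE, prob = NULL,
                          expandSingle = TRUE) {
  if (expandSingle) {
    # Defer to base sample(), including the 1:x expansion.
    if (is.null(size)) {
      sample(x, replace = replace, prob = prob)
    } else {
      sample(x, size = size, replace = replace, prob = prob)
    }
  } else {
    # Sample positions, so a single value is treated as the population.
    if (is.null(size)) size <- length(x)
    x[sample.int(length(x), size = size, replace = replace, prob = prob)]
  }
}

sample_expand(10, 1, expandSingle = FALSE)   # 10
sample_expand(10, 3)                         # three values from 1:10
```

Because the default has to stay TRUE, a call that does not set the argument keeps the pitfall described above.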

Cheers,
Jon




Many thanks for your help.

Best wishes,

Gael Millot.


Gael Millot
UMR 3244 (IC-CNRS-UPMC) et Universite Pierre et Marie Curie
Equipe Recombinaison et instabilite genetique
Pav Trouillet Rossignol 5eme etage
Institut Curie
26 rue d'Ulm
75248 Paris Cedex 05
FRANCE
tel : 33 1 56 24 66 34
fax : 33 1 56 24 66 44
Email : gael.mil...@curie.fr
http://perso.curie.fr/Gael.Millot/index.html



______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Best, Peter


--
Jon Olav Skøien
Joint Research Centre - European Commission
Institute for Environment and Sustainability (IES)
Climate Risk Management Unit

Via Fermi 2749, TP 100-01,  I-21027 Ispra (VA), ITALY

jon.sko...@jrc.ec.europa.eu
Tel:  +39 0332 789205

Disclaimer: Views expressed in this email are those of the individual and do not necessarily represent official views of the European Commission.
