On 03-07-14 03:43, Vaclav Petras wrote:
On Wed, Jul 2, 2014 at 8:15 PM, Glynn Clements
<gl...@gclements.plus.com> wrote:
> Shouldn't the seed be generated from, e.g., the OS time,
> which would ensure that each run gives a different result?
No. The reason is to provide reproducibility. Anyone running the same
command with the same data should obtain the same result.
It is certainly good to be able to reproduce the results of commands.
However, I think that in most (statistical) software the default /
expected behaviour is to have a new, automatically generated seed at
each run. In R, for example, you have to specify the seed explicitly
with the function set.seed() if you want the same numbers again. I
would therefore think that most users will expect similar behaviour in
GRASS. It would certainly be my personal preference to have the option
to set the seed explicitly if you want reproducibility, but have it
generated automatically otherwise. But that is just a personal
preference.
Does the reproducibility go beyond one operating system, compiler or
library? I don't think that the first random number is specified by
the C language standard. If the results were really reproducible,
it would be good for the testing framework, but I'm afraid that they
are not (with my limited knowledge about the topic).
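
To illustrate the portability concern: the C standard leaves the
algorithm behind rand() to the implementation, so the same seed can
produce different sequences on different platforms. A generator whose
arithmetic is fully specified does not have this problem. The sketch
below reimplements, in Python, the 48-bit linear congruential recurrence
that POSIX documents for srand48()/drand48(). The function names are
illustrative only, and nothing here is meant to describe what GRASS
modules currently do.

```python
# Constants of the 48-bit linear congruential generator documented by
# POSIX for the drand48() family: X(n+1) = (A * X(n) + C) mod 2**48.
A = 0x5DEECE66D
C = 0xB
MASK = (1 << 48) - 1


def seed_state(seed):
    """Build the initial 48-bit state as specified for srand48():
    the low 32 bits of the seed on top, 0x330E in the low 16 bits."""
    return ((seed & 0xFFFFFFFF) << 16) | 0x330E


def next_uniform(state):
    """Advance the state once; return (new_state, float in [0, 1))."""
    state = (A * state + C) & MASK
    return state, state / float(1 << 48)


def sequence(seed, n):
    """Return the first n uniform values for a given seed."""
    state = seed_state(seed)
    values = []
    for _ in range(n):
        state, x = next_uniform(state)
        values.append(x)
    return values


if __name__ == "__main__":
    # Pure integer arithmetic: the same seed gives the same values
    # regardless of operating system, compiler or C library.
    print(sequence(42, 3))
```

Because the state update uses only integer operations with fixed
constants, a test suite could compare results across machines, which is
not guaranteed when a module relies on the platform's rand().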
If you want a different result each time, set GRASS_RND_SEED to a
different value each time, e.g.
GRASS_RND_SEED=`date +%N` r.mapcalc "a = rand(0,100)"
[%N is the nanoseconds portion of the current time; this is a GNU
extension.]
Perhaps this can be explained like this in the manual page? A far better
option would be to provide this as a normal parameter so it can be set
from the GUI or the command line like any other option.
I've heard that this is not enough on powerful computers/clusters,
that you also have to use the PID because the nanoseconds might be the
same (I think I remember that it was nanoseconds, not seconds).
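
A minimal sketch of that idea: mix the nanosecond clock with the
process ID so that jobs started at the same instant still get different
seeds. The helper names make_seed and seeded_environment, the
multiplier used for mixing and the 31-bit range are assumptions made for
illustration; only GRASS_RND_SEED and the r.mapcalc expression come from
this thread.

```python
import os
import time

# Odd multiplier (Knuth's multiplicative hashing constant). Because it is
# odd, distinct PIDs below 2**31 give distinct seeds for a fixed time.
MIX = 2654435761
SEED_MASK = 0x7FFFFFFF  # assumption: keep the seed a non-negative 31-bit int


def make_seed(now_ns=None, pid=None):
    """Combine the clock and the process ID into one integer seed."""
    if now_ns is None:
        now_ns = time.time_ns()
    if pid is None:
        pid = os.getpid()
    return (now_ns + pid * MIX) & SEED_MASK


def seeded_environment(seed, base_env=None):
    """Return a copy of the environment with GRASS_RND_SEED set."""
    env = dict(os.environ if base_env is None else base_env)
    env["GRASS_RND_SEED"] = str(seed)
    return env


if __name__ == "__main__":
    env = seeded_environment(make_seed())
    print(env["GRASS_RND_SEED"])
    # Inside a GRASS session the environment could then be passed on, e.g.
    # subprocess.call(["r.mapcalc", "a = rand(0,100)"], env=env)
```

This only protects against collisions between processes on the same
machine at the same instant. Jobs on different cluster nodes can share
both a PID and a timestamp, so a job or node identifier would have to be
mixed in as well.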
> On a related note, it would be nice to be able to set the seed (I
> think there has been such a request before, but not sure about the
> answer at that time).
GRASS_RND_SEED was the answer.
I think there should be some possibility of randomization
(auto-setting of the seed) built into the modules providing
random(ized) results. Perhaps a flag which would turn it on. It could
also be an option which would behave like GRASS_RND_SEED but would have
one special value for auto-generating the seed. (GRASS_RND_SEED, if
present, would override this option.) When choosing the default value
of the option, we should ask what the expected behavior of a module
giving random results actually is.
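
The precedence described above could be sketched as follows. Everything
here is hypothetical: the special value "auto", the function name
resolve_seed and the fixed fallback of 0 are placeholders for a design
that has not been decided. Only the GRASS_RND_SEED variable exists
today.

```python
import os
import random

AUTO = "auto"  # hypothetical special value: "generate a seed for me"


def resolve_seed(option_value=None, environ=None, default=0):
    """Pick the seed a module would use under the proposed rules."""
    if environ is None:
        environ = os.environ
    # 1. GRASS_RND_SEED, if present, overrides the module option.
    if "GRASS_RND_SEED" in environ:
        return int(environ["GRASS_RND_SEED"])
    # 2. The special value asks for a fresh seed from the OS entropy pool.
    if option_value == AUTO:
        return random.SystemRandom().randrange(2**31)
    # 3. Any other given value is taken as an explicit, reproducible seed.
    if option_value is not None:
        return int(option_value)
    # 4. Nothing given: fall back to the default under discussion.
    return default


if __name__ == "__main__":
    print(resolve_seed("42", environ={}))                         # explicit
    print(resolve_seed(AUTO, environ={"GRASS_RND_SEED": "7"}))    # env wins
    print(resolve_seed(AUTO, environ={}))                         # random
```

Whether step 4 should return a fixed value (reproducible by default) or
behave like "auto" (randomized by default) is exactly the open question
about the default.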
Yes, that would be great. As for the default value, see my earlier argument.
This would provide a nicer interface in Python, a standard interface on
the command line, and the possibility to set it in the GUI (which means
the possibility to set it for users who don't use the command line).
Moreover, it would provide all users with a way of setting the
random seed in the manner which we consider the best according to our
knowledge.
Agreed. The way to set the seed now may not be understood by everybody,
and with all the work going into streamlining the GUI, fairly
important options like this should also be available through the GUI.
Vaclav
_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev