Re: [Haskell-cafe] Mersenne-random and standard random API

2012-02-10 Thread Yves Parès
I just thought about something: basically all these APIs provides a IO
[a] (where a is a randomly generable type) function.
Is there a problem with the approach that is to rely on lazy evaluation to
pass to pure code (either explicitely or through State) the infinite list
generated in IO and consume its head each time we need a random value?
Because there is no issue such as resource holding, like in the case of
file reading and enumerators, which would make lazy IO a not so good
approach...

2012/2/9 Aleksey Khudyakov alexey.sklad...@gmail.com

 On 09.02.2012 17:28, Jerzy Karczmarczuk wrote:

 Aleksey Khudyakov:

 On 09.02.2012 15:32, Jerzy Karczmarczuk wrote:

 1. Mersenne Twister, AND congruential generators AND the Marsaglia
 stuff, all use some kind of seed, all are stateful. There are no
 miracles. Just look the agressive monadization, the form of defaultSeed,
 etc. within MWC.hs, before saying that this generator doesn't rely on
 some global state.

  I think you are missing the point here. Surely all PRNG carry some
 state around. But both StdGen and mwc-random (and likely many others)
 allow to have many generators at once (for use in different threads)

 This is irrelevant. I believe that there is a misunderstanding in
 terminology. When I say global state it means not a local variable in
 some function. You seem to say one object per programme. This is
 confirmed by:

  mersenne-random is just wrapper around vastly impure library (as
 documentation says) and allows only one genrator per program.
 This is why I said it uses *global* state


  In any case, the seed changes after each generation, and
 must be stored somewhere.

  No. It doesn't and cannot

  data StdGen
  = StdGen Int32 Int32

 If generator state is stored in IORef it's not possible to implement
 `next :: g → (Int,g)'. To do something useful with it one have to
 go to IO monad but we can't. So state have to be copied.

  We can't WHAT?
 Look, all data that change or are created MUST be stored somewhere,
 don't say dubious things. Your next function is a threading generator,
 which makes another StdGen, OK, but this is not a copy. This is a
 creation of a new seed. When I spoke about IORefs, I thought about the
 global generator, which USES the l'Ecuyer stuff, newStdGen and its
 friends.

 The threading becomes implicit. Try, say
 r=newStdGen
 r = return . next

 and you will see, it works, and you don't keep explicitly your seed.
  From the efficiency point of view all this is comparable. With IOrefs
 you do not pollute the memory which has to be garbage-collected, but
 its administration is heavier than the standard heap operations. StdGen
 with l'Ecuyer two-number seed, or some 600 of the Mersenne, I don't see
 a conceptual difference. The Marsaglia generator has a global seed quite
 voluminous as well.

  I didn't quite understand you

 In order to implement RandomGen one have to implement `next' function

  next :: g → (Int,g)

 Lets assume that g stores internal generator state (In case of NWC256 it's
 258 Word32s). We cannot modify it in place. Someone may hold this g and
 changing it behind the scenes will violate referential transparence. We
 have to create new array and this is expensive.

 There still way out as Duncan Coutts pointed out. We may generate
 stream of random numbers in lazy ST monad and use them as random generator.



  No idea what do you mean. In the Random library you will find the
 generators using IORefs, AND the class Random, with the member random
 (or randomR, etc.) and you may write
 getStdRandom random
 getStdRandom random
 ...
 as you wish, getting different results. What's wrong with that?


 Nothing. I was talking about problems with `next' function. One always can
 use IORefs to create global generator but that's irrelevant


 __**_
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/**mailman/listinfo/haskell-cafehttp://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Mersenne-random and standard random API

2012-02-10 Thread Aleksey Khudyakov

On 10.02.2012 18:38, Yves Parès wrote:

I just thought about something: basically all these APIs provides a IO
[a] (where a is a randomly generable type) function.
Is there a problem with the approach that is to rely on lazy evaluation
to pass to pure code (either explicitely or through State) the infinite
list generated in IO and consume its head each time we need a random value?
Because there is no issue such as resource holding, like in the case of
file reading and enumerators, which would make lazy IO a not so good
approach...



It's more like Seed → [a]. IO is not really needed here. I can see
following problems. None are real showstoppers (except maybe 2)


1. One may want to generate values of different types and/or
   distibutions. This is easily solved by state monad. But I can't
   see advantage over wrapped ST.

2. No way to make snapshot of generator state.

3. Laziness overhead. It may be significant if you try to sqeeze last
   bit of performance.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Mersenne-random and standard random API

2012-02-10 Thread Yves Parès
 It's more like Seed → [a]. IO is not really needed here.
You're absolutely right. I stand corrected.

 1. One may want to generate values of different types and/or distibutions.
Yes, I saw this limitation before. But you can just generate one infinite
list (and then one seed) per distribution, right?
Or could you generate something homomorphic to [(Random a) = a]? (Each
element would then basically be a closure containing the current Seed)

 2. No way to make snapshot of generator state.
You mean the current seed?
You don't need to access that if you only consume PRNs from the infinite
list(s), and that during the whole program.

 3. Laziness overhead. It may be significant if you try to squeeze last
bit of performance.
I'm afraid that if you're frightened by the laziness overhead then you
won't use Haskell.
I'm kidding, but it's just I don't picture the overhead being so huge in
that case. What so big would the runtime need to keep in memory except for
the closure that generates the infinite list?

2012/2/10 Aleksey Khudyakov alexey.sklad...@gmail.com

 On 10.02.2012 18:38, Yves Parès wrote:

 I just thought about something: basically all these APIs provides a IO
 [a] (where a is a randomly generable type) function.
 Is there a problem with the approach that is to rely on lazy evaluation
 to pass to pure code (either explicitely or through State) the infinite
 list generated in IO and consume its head each time we need a random
 value?
 Because there is no issue such as resource holding, like in the case of
 file reading and enumerators, which would make lazy IO a not so good
 approach...


 It's more like Seed → [a]. IO is not really needed here. I can see
 following problems. None are real showstoppers (except maybe 2)


 1. One may want to generate values of different types and/or
   distibutions. This is easily solved by state monad. But I can't
   see advantage over wrapped ST.

 2. No way to make snapshot of generator state.

 3. Laziness overhead. It may be significant if you try to sqeeze last
   bit of performance.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Mersenne-random and standard random API

2012-02-09 Thread Ertugrul Söylemez
Yves Parès yves.pa...@gmail.com wrote:

 I've been in the past told that mersenne-random was much better than
 the standard random package.

This is relative.  The Mersenne Twister algorithm has a large memory
footprint, but in relation to the size of the average Haskell program,
this is probably not much of an issue.


 So is it possible to use the fast and efficient mersenne generator
 with the convenient and general random API?

No, because mersenne-random provides an impure generator with a global
state, as is also pointed out by the Haddock documentation.  As an
alternative consider the mersenne-random-pure64 and the cprng-aes
packages.  Both provide the interface you want.

Personally I prefer the cprng-aes package, because it is reasonably fast
and convenient for non-cryptographic applications, but also provides a
strong Crypto-API interface, if you need cryptographically strong random
numbers.


Greets,
Ertugrul

-- 
nightmare = unsafePerformIO (getWrongWife = sex)
http://ertes.de/


signature.asc
Description: PGP signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Mersenne-random and standard random API

2012-02-09 Thread Aleksey Khudyakov

On 09.02.2012 01:56, Yves Parès wrote:

Hi,
I've been in the past told that mersenne-random was much better than the
standard random package.

...

So is it possible to use the fast and efficient mersenne generator with
the convenient and general random API?

I think design of Random type class basically precludes efficient 
generators with large periods and consequently large state.

Look at next function:

 next :: g - (Int, g)

It means that state has to be copied but for efficiency we want to
mutate it in place. I consider Random type class a failure and ignore
it.

P.S. For monte-carlo and thing like that I'd recommend mwc-random
it more featureful than mersenne-random and don't rely on global state

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Mersenne-random and standard random API

2012-02-09 Thread Jerzy Karczmarczuk

Aleksey Khudyakov :
I think design of Random type class basically precludes efficient 
generators with large periods and consequently large state.

Look at next function:

 next :: g - (Int, g)

It means that state has to be copied but for efficiency we want to
mutate it in place. I consider Random type class a failure and ignore
it.

P.S. For monte-carlo and thing like that I'd recommend mwc-random
it more featureful than mersenne-random and don't rely on global state

I am afraid that Aleksey Khudakov confuses a few things.

1. Mersenne Twister, AND congruential generators AND the Marsaglia 
stuff, all use some kind of seed, all are stateful. There are no 
miracles. Just look the agressive monadization, the form of defaultSeed, 
etc. within MWC.hs, before saying that this generator doesn't rely on 
some global state.


2. In the standard generator stuff the state is stored in IoRefs and is 
not copied. Did Aleksey ever look inside the sources of these 
generators? In any case, the seed changes after each generation, and 
must be stored somewhere.


3. The API question is a conventional one. People who are really 
unhappy, may make their own interfaces, or give such a mini-project as a 
students' assignment, instead of weeping that the library is lousy and 
should be ignored. E. g., I wanted random numers in some purely 
functional, lazy context, and I didn't want the existing interface ; I 
manufactured a lazy stream interface, and that was all. Look Ma!: no 
global state...


4. L'Ecuyer signalled some 15 years ago that MWC generators introduce 
some bias on the most significant bits (complementary MWC are safer). 
This is less annoying that the last bits periodicity of simple 
congruential generators, but for SOME Monte-Carlo it may be harmful.


===

In general, I believe that saying publicly that some part of the 
available library is a failure, should be avoided, unless accompanied by 
some SERIOUS analysis, and - if possible - some constructive suggestions.


With my thanks to all people who made those generators however imperfect 
they are. Only Mister Nobody is perfect.


Jerzy Karczmarczuk


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Mersenne-random and standard random API

2012-02-09 Thread Aleksey Khudyakov

On 09.02.2012 15:32, Jerzy Karczmarczuk wrote:

Aleksey Khudyakov :
1. Mersenne Twister, AND congruential generators AND the Marsaglia
stuff, all use some kind of seed, all are stateful. There are no
miracles. Just look the agressive monadization, the form of defaultSeed,
etc. within MWC.hs, before saying that this generator doesn't rely on
some global state.

I think you are missing the point here. Surely all PRNG carry some state 
around. But both StdGen and mwc-random (and likely many others)

allow to have many generators at once (for use in different threads)

mersenne-random is just wrapper around vastly impure library (as
documentation says) and allows only one genrator per program.
This is why I said it uses *global* state



2. In the standard generator stuff the state is stored in IoRefs and is
not copied. Did Aleksey ever look inside the sources of these
generators? In any case, the seed changes after each generation, and
must be stored somewhere.


No. It doesn't and cannot

 data StdGen
  = StdGen Int32 Int32

If generator state is stored in IORef it's not possible to implement
`next :: g → (Int,g)'. To do something useful with it one have to
go to IO monad but we can't. So state have to be copied.




3. The API question is a conventional one. People who are really
unhappy, may make their own interfaces, or give such a mini-project as a
students' assignment, instead of weeping that the library is lousy and
should be ignored. E. g., I wanted random numers in some purely
functional, lazy context, and I didn't want the existing interface ; I
manufactured a lazy stream interface, and that was all. Look Ma!: no
global state...


Usually. But sometimes it's not possible to implement some algorithms
for particular API. For example if PRNG rely on in-place array updates
it cannot implement Random type class.




4. L'Ecuyer signalled some 15 years ago that MWC generators introduce
some bias on the most significant bits (complementary MWC are safer).
This is less annoying that the last bits periodicity of simple
congruential generators, but for SOME Monte-Carlo it may be harmful.


Thank you for reminder. I wanted to read their paper for some time.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Mersenne-random and standard random API

2012-02-09 Thread Jerzy Karczmarczuk

Aleksey Khudyakov:

On 09.02.2012 15:32, Jerzy Karczmarczuk wrote:

1. Mersenne Twister, AND congruential generators AND the Marsaglia
stuff, all use some kind of seed, all are stateful. There are no
miracles. Just look the agressive monadization, the form of defaultSeed,
etc. within MWC.hs, before saying that this generator doesn't rely on
some global state.

I think you are missing the point here. Surely all PRNG carry some 
state around. But both StdGen and mwc-random (and likely many others)

allow to have many generators at once (for use in different threads)
This is irrelevant. I believe that there is a misunderstanding in 
terminology. When I say global state it means not a local variable in 
some function. You seem to say one object per programme. This is 
confirmed by:



mersenne-random is just wrapper around vastly impure library (as
documentation says) and allows only one genrator per program.
This is why I said it uses *global* state




In any case, the seed changes after each generation, and
must be stored somewhere.


No. It doesn't and cannot

 data StdGen
  = StdGen Int32 Int32

If generator state is stored in IORef it's not possible to implement
`next :: g → (Int,g)'. To do something useful with it one have to
go to IO monad but we can't. So state have to be copied.


We can't WHAT?
Look, all data that change or are created MUST be stored somewhere, 
don't say dubious things. Your next function is a threading generator, 
which makes another StdGen, OK, but this is not a copy. This is a 
creation of a new seed. When I spoke about IORefs, I thought about the 
global generator, which USES the l'Ecuyer stuff, newStdGen and its friends.


The threading becomes implicit. Try, say
r=newStdGen
r = return . next

and you will see, it works, and you don't keep explicitly your seed. 
From the efficiency point of view all this is comparable. With IOrefs 
you do not pollute the memory which has to be garbage-collected, but 
its administration is heavier than the standard heap operations. StdGen 
with l'Ecuyer two-number seed, or some 600 of the Mersenne, I don't see 
a conceptual difference. The Marsaglia generator has a global seed quite 
voluminous as well.




  ...sometimes it's not possible to implement some algorithms
for particular API. For example if PRNG rely on in-place array updates
it cannot implement Random type class.



No idea what do you mean. In the Random library you will find the 
generators using IORefs, AND the class Random, with the member random 
(or randomR, etc.) and you may  write

getStdRandom random
getStdRandom random
...
as you wish, getting different results. What's wrong with that?

Jerzy Karczmarczuk



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Mersenne-random and standard random API

2012-02-09 Thread Duncan Coutts
On 9 February 2012 10:59, Aleksey Khudyakov alexey.sklad...@gmail.com wrote:

 So is it possible to use the fast and efficient mersenne generator with
 the convenient and general random API?

 I think design of Random type class basically precludes efficient generators
 with large periods and consequently large state.
 Look at next function:

 next :: g - (Int, g)

 It means that state has to be copied but for efficiency we want to
 mutate it in place. I consider Random type class a failure and ignore
 it.

Actually it is not true that the state has to be copied. Using the
lazy ST monad we can implement this interface and internally use
mutable ST arrays.

See for example
http://web.archive.org/web/20090108050217/http://www.augustsson.net/Darcs/MT/MersenneTwister.hs

It ends up with this function that generates the infinite lazy
sequence of words from a seed word. This can be then made to fit the
next :: g - (Int, g) interface, with g = [W].

mersenneTwister :: W - [W]
mersenneTwister s = L.runST (mersenneTwisterST s)

For reference:

import Control.Monad.ST.Lazy as L
type W = Word32
mersenneTwisterST :: W - L.ST s [W]

So yes this looses a small constant factor because of the extra lazy
list (which could be reduced somewhat), so it's not suitable for the
absolutely max performance interface, but it certainly allows the use
of traditional mutable PRNGs with the nice interface with decent
performance.

Certainly none of these PRNGs support split, but that's a separate
issue. Overall, I think the Random type class is salvageable given a
few changes. In the end, the Random module probably needs both an
ST-monadic interface and a pure one. People can use the ST one if it
happens to be more convenient or if they need to squeeze the last drop
of performance out.

Duncan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Mersenne-random and standard random API

2012-02-09 Thread Aleksey Khudyakov

On 09.02.2012 18:27, Duncan Coutts wrote:

Actually it is not true that the state has to be copied. Using the
lazy ST monad we can implement this interface and internally use
mutable ST arrays.

See for example
http://web.archive.org/web/20090108050217/http://www.augustsson.net/Darcs/MT/MersenneTwister.hs

It ends up with this function that generates the infinite lazy
sequence of words from a seed word. This can be then made to fit the
next :: g -  (Int, g) interface, with g = [W].


This is very elegant trick! Ability to snapshot and restore generator
is lost.



So yes this looses a small constant factor because of the extra lazy
list (which could be reduced somewhat), so it's not suitable for the
absolutely max performance interface, but it certainly allows the use
of traditional mutable PRNGs with the nice interface with decent
performance.

Certainly none of these PRNGs support split, but that's a separate
issue. Overall, I think the Random type class is salvageable given a
few changes. In the end, the Random module probably needs both an
ST-monadic interface and a pure one. People can use the ST one if it
happens to be more convenient or if they need to squeeze the last drop
of performance out.

There is another problem. Functions like `next' aren't meant to be used 
by humans. One have to thread state manually and this is tedious

and error-prone. So PRNG should be wrapped into monad. Are `next'-like
function needed at all?

I'd expect that creation of generic and useful type class would be
very nontrivial.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Mersenne-random and standard random API

2012-02-09 Thread Aleksey Khudyakov

On 09.02.2012 17:28, Jerzy Karczmarczuk wrote:

Aleksey Khudyakov:

On 09.02.2012 15:32, Jerzy Karczmarczuk wrote:

1. Mersenne Twister, AND congruential generators AND the Marsaglia
stuff, all use some kind of seed, all are stateful. There are no
miracles. Just look the agressive monadization, the form of defaultSeed,
etc. within MWC.hs, before saying that this generator doesn't rely on
some global state.


I think you are missing the point here. Surely all PRNG carry some
state around. But both StdGen and mwc-random (and likely many others)
allow to have many generators at once (for use in different threads)

This is irrelevant. I believe that there is a misunderstanding in
terminology. When I say global state it means not a local variable in
some function. You seem to say one object per programme. This is
confirmed by:


mersenne-random is just wrapper around vastly impure library (as
documentation says) and allows only one genrator per program.
This is why I said it uses *global* state




In any case, the seed changes after each generation, and
must be stored somewhere.


No. It doesn't and cannot

 data StdGen
 = StdGen Int32 Int32

If generator state is stored in IORef it's not possible to implement
`next :: g → (Int,g)'. To do something useful with it one have to
go to IO monad but we can't. So state have to be copied.


We can't WHAT?
Look, all data that change or are created MUST be stored somewhere,
don't say dubious things. Your next function is a threading generator,
which makes another StdGen, OK, but this is not a copy. This is a
creation of a new seed. When I spoke about IORefs, I thought about the
global generator, which USES the l'Ecuyer stuff, newStdGen and its friends.

The threading becomes implicit. Try, say
r=newStdGen
r = return . next

and you will see, it works, and you don't keep explicitly your seed.
 From the efficiency point of view all this is comparable. With IOrefs
you do not pollute the memory which has to be garbage-collected, but
its administration is heavier than the standard heap operations. StdGen
with l'Ecuyer two-number seed, or some 600 of the Mersenne, I don't see
a conceptual difference. The Marsaglia generator has a global seed quite
voluminous as well.


I didn't quite understand you

In order to implement RandomGen one have to implement `next' function

 next :: g → (Int,g)

Lets assume that g stores internal generator state (In case of NWC256 
it's 258 Word32s). We cannot modify it in place. Someone may hold this g 
and changing it behind the scenes will violate referential transparence. 
We have to create new array and this is expensive.


There still way out as Duncan Coutts pointed out. We may generate
stream of random numbers in lazy ST monad and use them as random generator.



No idea what do you mean. In the Random library you will find the
generators using IORefs, AND the class Random, with the member random
(or randomR, etc.) and you may write
getStdRandom random
getStdRandom random
...
as you wish, getting different results. What's wrong with that?


Nothing. I was talking about problems with `next' function. One always 
can use IORefs to create global generator but that's irrelevant


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe