Re: [rng] Copying samplers

2019-05-09 Thread Gilles Sadowski
On Thu, 9 May 2019 at 17:00, Alex Herbert wrote:

Re: [rng] Copying samplers

2019-05-09 Thread Alex Herbert



On 09/05/2019 15:39, Gilles Sadowski wrote:

On Thu, 9 May 2019 at 15:41, Alex Herbert wrote:

On Sat, 4 May 2019 at 23:52, Alex Herbert  wrote:




On 4 May 2019, at 22:34, Gilles Sadowski  wrote:

Hi.


Fine to have
  DiscreteSamplerProvider
  ContinuousSamplerProvider
[Perhaps the "Supplier" suffix would be better to avoid confusion with
"UniformRandomProvider".]

At first sight, I don't think that the generic interface would have
any actual use since, ultimately, the return value of "sample()"
will be either "int" or "double" (no polymorphism).


The generic interface is for the samplers that are typed for collections
and currently return a sample of type T, or those that return arrays. It
would not be for Integer or Double from the probability distribution
samplers. Here is what could use it:


Re: [rng] Copying samplers

2019-05-09 Thread Gilles Sadowski
On Thu, 9 May 2019 at 15:41, Alex Herbert wrote:

Re: [rng] Copying samplers

2019-05-09 Thread Alex Herbert
On Sat, 4 May 2019 at 23:52, Alex Herbert  wrote:


Re: [rng] Copying samplers

2019-05-04 Thread Alex Herbert



> On 4 May 2019, at 22:34, Gilles Sadowski  wrote:

Re: [rng] Copying samplers

2019-05-04 Thread Gilles Sadowski
Hi.

On Sat, 4 May 2019 at 21:31, Alex Herbert wrote:
>
>
>
> > On 4 May 2019, at 14:46, Gilles Sadowski  wrote:
> >
> > Hello.
> >
> > On Fri, 3 May 2019 at 16:57, Alex Herbert wrote:
> >>
> >> Most of the samplers in the library have very small states that are easy
> >> to compute. Some have computations that are more expensive, such as the
> >> LargeMeanPoissonSampler or the DiscreteProbabilityCollectionSampler.
> >>
> >> However once the state is computed the only part of the state that
> >> changes is the RNG. I would like to suggest a way to copy samplers as
> >> something like:
> >>
> >> DiscreteSampler newInstance(UniformRandomProvider)
> >>
> >> The new instance would share all the private state of the first sampler
> >> except the RNG. This can be used for multi-threaded applications which
> >> require a new sampler per thread but sample from the same distribution.
> >>
> >> A particular case in point is the as yet not integrated
> >> MarsagliaTsangWangSmallMeanPoissonSampler (see RNG-91 [1]) which has a
> >> "large" state [2] that takes a "long" time [3] to compute but is
> >> effectively immutable. This could be shared across instances saving
> >> memory for parallel application.
> >>
> >> A copy instance would be almost zero set-up time and provide opportunity
> >> for caching of commonly used samplers.
> >
> > The goal is sharing (immutable) state so it seems that the semantics is
> > not "copy".
> >
> > Isn't it a "factory" that we are after?  E.g. something like:
> > public final class CachedSamplingFactory {
> >     private static PoissonSamplerCache poisson = new PoissonSamplerCache();
> >
> >     public PoissonSampler createPoissonSampler(UniformRandomProvider rng,
> >                                                double mean) {
> >         if (!poisson.isCached(mean)) {
> >             // Initialize (requires synchronization) ...
> >             poisson.createCache(mean);
> >         }
> >         // Construct using pre-built state.
> >         return new PoissonSampler(poisson.getCache(mean), rng);
> >     }
> > }
> > [It may be overkill, more work, and less performant…]
>
> But you need a factory for every class you want to share state for. And the 
> factory actually has to look in a cache. If you operate on an instance then 
> you get what you want. Another working version of the same sampler. It would 
> also be thread safe without synchronisation as long as the state is 
> immutable. The only mutable state is the passed in RNG.

Agreed.  It was what I meant by the last sentence.

> >
> > IIUC, you suggest to add "newInstance" in the "DiscreteSampler" interface
> > (?).
>
> I did think of extending DiscreteSampler with this functionality. Not adding 
> to the interface as it currently is ‘functional’ as it has only one method. I 
> think that should not change. Having thought about it a bit more I like the 
> idea of a new functional interface. Perhaps:
>
> interface DiscreteSamplerProvider {
>     DiscreteSampler create(UniformRandomProvider rng);
> }
>
> Substitute ‘Provider’ for:
>
> - Generator
> - Supplier (possible clash or alignment with Java 8 depending on the way it 
> is done)
> - Factory (though the method is not static so I do not like this)
> - etc
>
> So this then becomes a functional interface that can be used by anything. 
> However instances of a sampler would be expected to return a sampler matching 
> their own functionality.
>
> Note there are some samplers not implementing an interface that also could 
> benefit from this. Namely CollectionSampler and 
> DiscreteProbabilityCollectionSampler. So does this need a generic interface:
>
> interface Sampler<T> {
>     T sample();
> }
>
> To be complemented with:
>
> interface SamplerProvider<T> {
>     Sampler<T> create(UniformRandomProvider rng);
> }
>
> So the library would require:
>
> SamplerProvider<T>
> DiscreteSamplerProvider
> ContinuousSamplerProvider
>
> Any sampler can choose to implement being a Provider. There are some cases
> where it is moot. For example a ZigguratNormalizedGaussianSampler just stores
> the rng in the constructor. However it could still be a Provider; the method
> would simply call the constructor. It would allow writing a generic
> multi-threaded application that just uses e.g. a DiscreteSamplerProvider to
> create samplers for each thread. You can then drop in the actual
> implementation you require. For example you could swap the type of
> PoissonSampler in your simulation by swapping the provider for the Poisson
> distribution.
>
> How does that sound?

Fine to have
  DiscreteSamplerProvider
  ContinuousSamplerProvider
[Perhaps the "Supplier" suffix would be better to avoid confusion with
"UniformRandomProvider".]

At first sight, I don't think that the generic interface would have
any actual use since, ultimately, the return value of "sample()"
will be either "int" or "double" (no polymorphism).

Gilles

>
> Alex
>
>
>
> > I'm a bit wary that this would compound two different functionalities:
> >  * data generator (method 

Re: [rng] Copying samplers

2019-05-04 Thread Alex Herbert


> On 4 May 2019, at 14:46, Gilles Sadowski  wrote:
> 
> Hello.
> 
> On Fri, 3 May 2019 at 16:57, Alex Herbert wrote:
>> 
>> Most of the samplers in the library have very small states that are easy
>> to compute. Some have computations that are more expensive, such as the
>> LargeMeanPoissonSampler or the DiscreteProbabilityCollectionSampler.
>> 
>> However once the state is computed the only part of the state that
>> changes is the RNG. I would like to suggest a way to copy samplers as
>> something like:
>> 
>> DiscreteSampler newInstance(UniformRandomProvider)
>> 
>> The new instance would share all the private state of the first sampler
>> except the RNG. This can be used for multi-threaded applications which
>> require a new sampler per thread but sample from the same distribution.
>> 
>> A particular case in point is the as yet not integrated
>> MarsagliaTsangWangSmallMeanPoissonSampler (see RNG-91 [1]) which has a
>> "large" state [2] that takes a "long" time [3] to compute but is
>> effectively immutable. This could be shared across instances saving
>> memory for parallel application.
>> 
>> A copy instance would be almost zero set-up time and provide opportunity
>> for caching of commonly used samplers.
> 
> The goal is sharing (immutable) state so it seems that the semantics is
> not "copy".
> 
> Isn't it a "factory" that we are after?  E.g. something like:
> public final class CachedSamplingFactory {
>     private static PoissonSamplerCache poisson = new PoissonSamplerCache();
>
>     public PoissonSampler createPoissonSampler(UniformRandomProvider rng,
>                                                double mean) {
>         if (!poisson.isCached(mean)) {
>             // Initialize (requires synchronization) ...
>             poisson.createCache(mean);
>         }
>         // Construct using pre-built state.
>         return new PoissonSampler(poisson.getCache(mean), rng);
>     }
> }
> [It may be overkill, more work, and less performant…]
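For reference, the "requires synchronization" step in the factory quoted above could perhaps be handled without explicit locks. This is only a sketch: PoissonState and PoissonStateCache are hypothetical names invented here (they are not the library's PoissonSamplerCache), and the cached "state" is reduced to a single pre-computed value.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class CacheDemo {
    public static void main(String[] args) {
        PoissonStateCache cache = new PoissonStateCache();
        PoissonState a = cache.get(2.0);
        PoissonState b = cache.get(2.0); // second lookup reuses the first result
        System.out.println(a == b);
        System.out.println(cache.computations());
    }
}

/** Hypothetical immutable state that is costly to compute for a given mean. */
final class PoissonState {
    final double mean;
    final double expMinusMean;

    PoissonState(double mean) {
        this.mean = mean;
        this.expMinusMean = Math.exp(-mean);
    }
}

/** Hypothetical cache keyed on the exact mean value. */
final class PoissonStateCache {
    private final Map<Double, PoissonState> cache = new ConcurrentHashMap<>();
    private final AtomicInteger computations = new AtomicInteger();

    /** Returns the state for the mean, computing it at most once per key. */
    PoissonState get(double mean) {
        return cache.computeIfAbsent(mean, m -> {
            computations.incrementAndGet();
            return new PoissonState(m);
        });
    }

    /** Number of times a state was actually computed. */
    int computations() {
        return computations.get();
    }
}
```

The map lookup on every call is the cost the reply below objects to.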

But you need a factory for every class you want to share state for, and the
factory actually has to look in a cache. If you operate on an instance then you
get what you want: another working version of the same sampler. It would also
be thread safe without synchronisation as long as the state is immutable. The
only mutable state is the passed-in RNG.
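A rough sketch of what operating on an instance could look like. Rng and IntSampler are simplified stand-ins for UniformRandomProvider and DiscreteSampler, and TableSampler with its withRng method is a hypothetical example, not an existing sampler.

```java
import java.util.SplittableRandom;

final class SharedStateDemo {
    public static void main(String[] args) {
        SplittableRandom r1 = new SplittableRandom(1L);
        SplittableRandom r2 = new SplittableRandom(2L);
        TableSampler first = TableSampler.of(r1::nextDouble, new double[] {0.2, 0.3, 0.5});
        TableSampler second = first.withRng(r2::nextDouble);
        System.out.println(second.sharesTableWith(first));
        System.out.println(second.sample());
    }
}

/** Simplified stand-in for UniformRandomProvider (assumption: only nextDouble() is used). */
interface Rng {
    double nextDouble();
}

/** Simplified stand-in for DiscreteSampler. */
interface IntSampler {
    int sample();
}

/** Hypothetical sampler whose costly state is a cumulative probability table. */
final class TableSampler implements IntSampler {
    private final double[] cumulative; // computed once, never modified afterwards
    private final Rng rng;             // the only part that differs between instances

    private TableSampler(Rng rng, double[] cumulative) {
        this.rng = rng;
        this.cumulative = cumulative;
    }

    /** Pays the set-up cost: builds the cumulative table from the probabilities. */
    static TableSampler of(Rng rng, double[] probabilities) {
        double[] table = new double[probabilities.length];
        double sum = 0;
        for (int i = 0; i < probabilities.length; i++) {
            sum += probabilities[i];
            table[i] = sum;
        }
        return new TableSampler(rng, table);
    }

    /** Near-zero cost: the new sampler reuses the table and only swaps the RNG. */
    TableSampler withRng(Rng other) {
        return new TableSampler(other, cumulative);
    }

    /** True if both samplers read from the very same table instance. */
    boolean sharesTableWith(TableSampler other) {
        return cumulative == other.cumulative;
    }

    @Override
    public int sample() {
        double u = rng.nextDouble() * cumulative[cumulative.length - 1];
        for (int i = 0; i < cumulative.length; i++) {
            if (u < cumulative[i]) {
                return i;
            }
        }
        return cumulative.length - 1; // guards against rounding at the upper edge
    }
}
```

Because the table is only ever read after construction, the two samplers can be used from different threads as long as each has its own RNG.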

> 
> IIUC, you suggest to add "newInstance" in the "DiscreteSampler" interface (?).

I did think of extending DiscreteSampler with this functionality rather than
adding to the interface, which is currently ‘functional’ as it has only one
method. I think that should not change. Having thought about it a bit more I
like the idea of a new functional interface. Perhaps:

interface DiscreteSamplerProvider {
    DiscreteSampler create(UniformRandomProvider rng);
}

Alternatives to ‘Provider’ in the name:

- Generator
- Supplier (possible clash or alignment with Java 8 depending on the way it is 
done)
- Factory (though the method is not static so I do not like this)
- etc

So this then becomes a functional interface that can be used by anything. 
However instances of a sampler would be expected to return a sampler matching 
their own functionality.

Note there are some samplers not implementing an interface that also could 
benefit from this. Namely CollectionSampler and 
DiscreteProbabilityCollectionSampler. So does this need a generic interface:

interface Sampler<T> {
    T sample();
}

To be complemented with:

interface SamplerProvider<T> {
    Sampler<T> create(UniformRandomProvider rng);
}

So the library would require:

SamplerProvider<T>
DiscreteSamplerProvider
ContinuousSamplerProvider

Any sampler can choose to implement being a Provider. There are some cases 
where it is moot. For example a ZigguratNormalizedGaussianSampler just stores 
the rng in the constructor. However it could still be a Provider; the method 
would simply call the constructor. It would allow writing a generic 
multi-threaded application that just uses e.g. a DiscreteSamplerProvider to 
create samplers for each thread. You can then drop in the actual implementation 
you require. For example you could swap the type of PoissonSampler in your 
simulation by swapping the provider for the Poisson distribution.
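
A minimal sketch of how such a provider might be used to give each parallel 
task its own sampler. Everything here is an assumption made for illustration: 
the nested interfaces only mirror the names proposed above, DoubleSupplier 
stands in for UniformRandomProvider, UniformIndexSampler and totals are 
hypothetical, and a parallel stream stands in for an application's threads.

```java
import java.util.Arrays;
import java.util.SplittableRandom;
import java.util.function.DoubleSupplier;
import java.util.stream.IntStream;

// Hypothetical sketch; these types are not the Commons RNG API.
final class SamplerProviderSketch {
    interface DiscreteSampler {
        int sample();
    }

    /** Functional interface: builds a sampler bound to the given source of randomness. */
    interface DiscreteSamplerProvider {
        DiscreteSampler create(DoubleSupplier rng);
    }

    /** A sampler that is also a provider of samplers for the same distribution. */
    static final class UniformIndexSampler implements DiscreteSampler, DiscreteSamplerProvider {
        private final DoubleSupplier rng; // assumed to return values in [0, 1)
        private final int n;

        UniformIndexSampler(DoubleSupplier rng, int n) {
            this.rng = rng;
            this.n = n;
        }

        @Override
        public int sample() {
            return (int) (rng.getAsDouble() * n);
        }

        @Override
        public DiscreteSampler create(DoubleSupplier newRng) {
            // Trivial state, so "create" only calls the constructor.
            return new UniformIndexSampler(newRng, n);
        }
    }

    /** Each task asks the provider for its own sampler and sums its own samples. */
    static long[] totals(DiscreteSamplerProvider provider, int tasks, int samplesPerTask) {
        return IntStream.range(0, tasks).parallel().mapToLong(t -> {
            // One generator per task, seeded by the task index (illustrative only).
            DiscreteSampler sampler = provider.create(new SplittableRandom(t)::nextDouble);
            long sum = 0;
            for (int i = 0; i < samplesPerTask; i++) {
                sum += sampler.sample();
            }
            return sum;
        }).toArray();
    }

    public static void main(String[] args) {
        DiscreteSamplerProvider provider =
            new UniformIndexSampler(new SplittableRandom(0)::nextDouble, 6);
        System.out.println(Arrays.toString(totals(provider, 4, 1000)));
    }
}
```

Since totals depends only on the provider interface, a different distribution 
or implementation can be dropped in without touching the task code.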

How does that sound?

Alex



> I'm a bit wary that this would compound two different functionalities:
>  * data generator (method "sample"),
>  * generator generator (method "newInstance").
> [But I currently don't have an example where this would be a problem.]
> 
> Regards,
> Gilles
> 
>> Alex
>> 
>> [1] https://issues.apache.org/jira/browse/RNG-91 
>> 
>> 
>> [2] kB, or possibly MB, of tabulated data
>> 
>> [3] Set-up cost for a Poisson sampler is in the order of 30 to 165 times
>> as long as a SmallMeanPoissonSampler for a mean of 2 and 32. Note
>> however that construction still takes only 1.1 and 4.5 microseconds for
>> the "long" time.
> 

Re: [rng] Copying samplers

2019-05-04 Thread Gilles Sadowski
Hello.

On Fri, 3 May 2019 at 16:57, Alex Herbert wrote:
>
> Most of the samplers in the library have very small states that are easy
> to compute. Some have computations that are more expensive, such as the
> LargeMeanPoissonSampler or the DiscreteProbabilityCollectionSampler.
>
> However once the state is computed the only part of the state that
> changes is the RNG. I would like to suggest a way to copy samplers as
> something like:
>
> DiscreteSampler newInstance(UniformRandomProvider)
>
> The new instance would share all the private state of the first sampler
> except the RNG. This can be used for multi-threaded applications which
> require a new sampler per thread but sample from the same distribution.
>
> A particular case in point is the as yet not integrated
> MarsagliaTsangWangSmallMeanPoissonSampler (see RNG-91 [1]) which has a
> "large" state [2] that takes a "long" time [3] to compute but is
> effectively immutable. This could be shared across instances saving
> memory for parallel application.
>
> A copy instance would be almost zero set-up time and provide opportunity
> for caching of commonly used samplers.

The goal is sharing (immutable) state so it seems that the semantics is
not "copy".

Isn't it a "factory" that we are after?  E.g. something like:
public final class CachedSamplingFactory {
    private static PoissonSamplerCache poisson = new PoissonSamplerCache();

    public PoissonSampler createPoissonSampler(UniformRandomProvider rng, double mean) {
        if (!poisson.isCached(mean)) {
            poisson.createCache(mean); // Initialize (requires synchronization) ...
        }
        return new PoissonSampler(poisson.getCache(mean), rng); // Construct using pre-built state.
    }
}
[It may be overkill, more work, and less performant...]
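
One way the synchronization in the factory idea above could be handled is with 
ConcurrentHashMap.computeIfAbsent, as in the minimal sketch below. All names 
are hypothetical: PoissonState and this PoissonSampler are stand-ins rather 
than the library's classes, DoubleSupplier stands in for UniformRandomProvider, 
and the sampling loop is the simple product-of-uniforms method (usually 
attributed to Knuth), chosen only to keep the example short.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.DoubleSupplier;

// Hypothetical sketch; these types are not the Commons RNG API.
final class CachedSamplingFactorySketch {
    /** Immutable pre-computed state for one mean. */
    static final class PoissonState {
        final double expMinusMean;

        PoissonState(double mean) {
            this.expMinusMean = Math.exp(-mean);
        }
    }

    static final class PoissonSampler {
        private final PoissonState state; // shared, never modified
        private final DoubleSupplier rng; // per-instance; assumed to return values in [0, 1)

        PoissonSampler(PoissonState state, DoubleSupplier rng) {
            this.state = state;
            this.rng = rng;
        }

        PoissonState state() {
            return state;
        }

        int sample() {
            // Multiply uniform deviates until the product drops to exp(-mean) or below.
            int n = 0;
            double p = rng.getAsDouble();
            while (p > state.expMinusMean) {
                p *= rng.getAsDouble();
                n++;
            }
            return n;
        }
    }

    private final ConcurrentMap<Double, PoissonState> cache = new ConcurrentHashMap<>();

    PoissonSampler createPoissonSampler(DoubleSupplier rng, double mean) {
        // computeIfAbsent builds the state at most once per key, even under contention.
        PoissonState state = cache.computeIfAbsent(mean, m -> new PoissonState(m));
        return new PoissonSampler(state, rng);
    }

    public static void main(String[] args) {
        CachedSamplingFactorySketch factory = new CachedSamplingFactorySketch();
        PoissonSampler a = factory.createPoissonSampler(Math::random, 2.0);
        PoissonSampler b = factory.createPoissonSampler(Math::random, 2.0);
        System.out.println(a.state() == b.state()); // prints "true": both share one cached state
    }
}
```

The cost noted above still applies: every creation pays for a map lookup, and 
a separate cache is needed for each kind of sampler.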

IIUC, you suggest to add "newInstance" in the "DiscreteSampler" interface (?).
I'm a bit wary that this would compound two different functionalities:
  * data generator (method "sample"),
  * generator generator (method "newInstance").
[But I currently don't have an example where this would be a problem.]

Regards,
Gilles

> Alex
>
> [1] https://issues.apache.org/jira/browse/RNG-91
>
> [2] kB, or possibly MB, of tabulated data
>
> [3] Set-up cost for a Poisson sampler is in the order of 30 to 165 times
> as long as a SmallMeanPoissonSampler for a mean of 2 and 32. Note
> however that construction still takes only 1.1 and 4.5 microseconds for
> the "long" time.

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



[rng] Copying samplers

2019-05-03 Thread Alex Herbert
Most of the samplers in the library have very small states that are easy 
to compute. Some have computations that are more expensive, such as the 
LargeMeanPoissonSampler or the DiscreteProbabilityCollectionSampler.


However once the state is computed the only part of the state that 
changes is the RNG. I would like to suggest a way to copy samplers as 
something like:


DiscreteSampler newInstance(UniformRandomProvider)

The new instance would share all the private state of the first sampler 
except the RNG. This can be used for multi-threaded applications which 
require a new sampler per thread but sample from the same distribution.


A particular case in point is the as yet not integrated 
MarsagliaTsangWangSmallMeanPoissonSampler (see RNG-91 [1]) which has a 
"large" state [2] that takes a "long" time [3] to compute but is 
effectively immutable. This could be shared across instances saving 
memory for parallel application.


A copy instance would be almost zero set-up time and provide opportunity 
for caching of commonly used samplers.
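
A minimal sketch of what such a copy could look like, under stated 
assumptions: TableSampler is a hypothetical sampler with a pre-computed 
cumulative table, DoubleSupplier stands in for UniformRandomProvider, and 
sharesStateWith exists only to make the sharing visible. It is not code from 
the library.

```java
import java.util.SplittableRandom;
import java.util.function.DoubleSupplier;

// Hypothetical sketch; these types are not the Commons RNG API.
final class SharedStateSketch {
    /** Samples an index from fixed probabilities using a cumulative table. */
    static final class TableSampler {
        private final double[] cumulative; // built once, never modified afterwards
        private final DoubleSupplier rng;  // the only state that differs between copies

        TableSampler(DoubleSupplier rng, double[] probabilities) {
            this.rng = rng;
            this.cumulative = new double[probabilities.length];
            double sum = 0;
            for (int i = 0; i < probabilities.length; i++) {
                sum += probabilities[i];
                cumulative[i] = sum;
            }
        }

        /** Copy constructor: reuses the table of the source instead of rebuilding it. */
        private TableSampler(DoubleSupplier rng, TableSampler source) {
            this.rng = rng;
            this.cumulative = source.cumulative;
        }

        /** Returns a sampler for the same distribution that draws from the given rng. */
        TableSampler newInstance(DoubleSupplier newRng) {
            return new TableSampler(newRng, this);
        }

        int sample() {
            // rng is assumed to return values in [0, 1); scale to the table total.
            double u = rng.getAsDouble() * cumulative[cumulative.length - 1];
            for (int i = 0; i < cumulative.length; i++) {
                if (u < cumulative[i]) {
                    return i;
                }
            }
            return cumulative.length - 1;
        }

        boolean sharesStateWith(TableSampler other) {
            return cumulative == other.cumulative;
        }
    }

    public static void main(String[] args) {
        SplittableRandom r1 = new SplittableRandom(1L);
        SplittableRandom r2 = new SplittableRandom(2L);
        TableSampler first = new TableSampler(r1::nextDouble, new double[] {0.2, 0.3, 0.5});
        TableSampler second = first.newInstance(r2::nextDouble);
        System.out.println(first.sharesStateWith(second)); // prints "true"
    }
}
```

Because the table is never written after construction, the two instances can 
be used from different threads without synchronisation, provided each has its 
own RNG.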


Alex

[1] https://issues.apache.org/jira/browse/RNG-91

[2] kB, or possibly MB, of tabulated data

[3] Set-up cost for a Poisson sampler is on the order of 30 to 165 times 
that of a SmallMeanPoissonSampler for means of 2 and 32 respectively. Note 
however that construction still takes only 1.1 and 4.5 microseconds for 
the "long" time.



