Re: Making mir.random.ndvariable.multivariateNormalVar create bigger data sets than 2

2018-09-10 Thread 9il via Digitalmars-d-learn

On Tuesday, 27 February 2018 at 09:23:49 UTC, kerdemdemir wrote:

I need a classifier in my project.
Since I believe it is the easiest to implement, I am trying to 
implement logistic regression.


[...]


Mir Random v1.0.0 has new `range` overloads that can work with 
NdRandomVariable.

Example: https://run.dlang.io/is/jte3gx


Re: Making mir.random.ndvariable.multivariateNormalVar create bigger data sets than 2

2018-02-27 Thread jmh530 via Digitalmars-d-learn

On Tuesday, 27 February 2018 at 21:54:34 UTC, Nathan S. wrote:
Cross-posting from the github issue 
(https://github.com/libmir/mir-random/issues/77) with a 
workaround (execute it at https://run.dlang.io/is/Swr1xU):


[snip]



Step in the right direction at least.


Re: Making mir.random.ndvariable.multivariateNormalVar create bigger data sets than 2

2018-02-27 Thread Nathan S. via Digitalmars-d-learn
Cross-posting from the github issue 
(https://github.com/libmir/mir-random/issues/77) with a 
workaround (execute it at https://run.dlang.io/is/Swr1xU):



I am not sure what the correct interface should be for this in 
the long run, but for now you can use a wrapper function to 
convert an ndvariable to a variable:


```d
/++
Converts an N-dimensional variable to a fixed-dimensional variable.
+/
auto specifyDimension(ReturnType, NDVariable)(NDVariable vr)
    if (__traits(isStaticArray, ReturnType)
        && __traits(compiles, {static assert(NDVariable.isRandomVariable);}))
{
    import mir.random : isSaturatedRandomEngine;
    import mir.random.variable : isRandomVariable;

    static struct V
    {
        enum bool isRandomVariable = true;
        NDVariable vr;

        ReturnType opCall(G)(scope ref G gen)
            if (isSaturatedRandomEngine!G)
        {
            ReturnType ret;
            vr(gen, ret[]);
            return ret;
        }

        ReturnType opCall(G)(scope G* gen)
            if (isSaturatedRandomEngine!G)
        {
            return opCall!(G)(*gen);
        }
    }

    static assert(isRandomVariable!V);
    V v = { vr };
    return v;
}
```

So `main` from your above example becomes:

```d
void main()
{
    import std.stdio;
    import mir.random : Random, threadLocalPtr;
    import mir.random.ndvariable : multivariateNormalVar;
    import mir.random.algorithm : range;
    import mir.ndslice.slice : sliced;
    import std.range : take;

    auto mu = [10.0, 0.0].sliced;
    auto sigma = [2.0, -1.5, -1.5, 2.0].sliced(2,2);

    Random* rng = threadLocalPtr!Random;
    auto sample = rng
        .range(multivariateNormalVar(mu, sigma).specifyDimension!(double[2]))
        .take(10);
    writeln(sample);
}
```


Re: Making mir.random.ndvariable.multivariateNormalVar create bigger data sets than 2

2018-02-27 Thread jmh530 via Digitalmars-d-learn

On Tuesday, 27 February 2018 at 17:24:22 UTC, Nathan S. wrote:

On Tuesday, 27 February 2018 at 16:42:00 UTC, Nathan S. wrote:

On Tuesday, 27 February 2018 at 15:08:42 UTC, jmh530 wrote:
Nevertheless, it probably can't hurt to file an issue if you 
can't get something like the first one to work. I would think 
it should just work.


The problem is that `mir.random.ndvariable` doesn't satisfy 
`mir.random.variable.isRandomVariable!T`. ndvariables have a 
slightly different interface from variables: instead of 
`rv(gen)` returning a result, `rv(gen, dst)` writes to dst. I 
agree that the various methods for working with variables 
should be enhanced to work with ndvariables.


So, I see that the interface will have to be slightly different 
for ndvariable than for variable. With the exception of 
MultivariateNormalVariable, the same ndvariable instance can be 
called to fill output of any length "n", so one can't 
meaningfully create a range based on just the ndvariable 
without further specification. What would "front" return? For 
MultivariateNormalVariable "n" is constrained but it is a 
runtime parameter rather than a compile-time parameter.


You'll want to ping @9il / Ilya Yaroshenko to discuss what the 
API should be like for this.


Honestly, I think the post above was my first use of mir.random, 
so I'm nowhere near familiar enough at this point to add much 
useful feedback. I'm definitely glad that it is getting worked on 
and plan on using it in the future.


The only thing I would note is that there are not just 
N-dimensional random variables, there are also NxN-dimensional 
(matrix-valued) random variables (not sure what else there could 
be, but it would be significantly less popular). A Wishart 
distribution (used for the distribution of covariance matrices) 
can be simulated by stacking zero-mean multivariate normal draws 
as the rows of a matrix and multiplying the transpose of that 
matrix by the matrix itself. This produces an NxN matrix. 
Ideally, the API could handle this type of distribution as well.
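
To make the Wishart point concrete, here is a minimal sketch using only Phobos, not mir. The names `standardNormal` and `wishart2` are made up for illustration, the dimension is fixed at 2 to keep the Cholesky step short, and the normal draws come from a plain Box-Muller transform. It assumes `sigma` is symmetric positive definite.

```d
import std.math : cos, log, sqrt, PI;
import std.random : Random, uniform01, unpredictableSeed;

// One standard normal draw via the Box-Muller transform.
double standardNormal(G)(ref G gen)
{
    immutable u1 = 1.0 - uniform01(gen); // in (0, 1], so log(u1) is finite
    immutable u2 = uniform01(gen);
    return sqrt(-2.0 * log(u1)) * cos(2.0 * PI * u2);
}

// Hypothetical helper: returns S = X^T X, where the n rows of X are
// independent zero-mean N(0, sigma) vectors of dimension 2.
double[2][2] wishart2(G)(ref G gen, double[2][2] sigma, size_t n)
{
    // Cholesky factor L of sigma (lower triangular, sigma = L L^T).
    immutable l00 = sqrt(sigma[0][0]);
    immutable l10 = sigma[1][0] / l00;
    immutable l11 = sqrt(sigma[1][1] - l10 * l10);

    double[2][2] s = [[0.0, 0.0], [0.0, 0.0]];
    foreach (i; 0 .. n)
    {
        immutable z0 = standardNormal(gen);
        immutable z1 = standardNormal(gen);
        // x = L z is one N(0, sigma) draw.
        immutable double[2] x = [l00 * z0, l10 * z0 + l11 * z1];
        // Accumulate the outer product x x^T.
        foreach (r; 0 .. 2)
            foreach (c; 0 .. 2)
                s[r][c] += x[r] * x[c];
    }
    return s;
}

void main()
{
    import std.stdio : writeln;

    auto gen = Random(unpredictableSeed);
    double[2][2] sigma = [[2.0, -1.5], [-1.5, 2.0]];
    writeln(wishart2(gen, sigma, 5));
}
```

The result type here is a matrix whose shape is known at compile time only because the sketch hard-codes dimension 2. In general the shape would be a runtime quantity, just like "n" for the multivariate normal.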


Another type of distribution I sometimes see is from Bayesian 
statistics (less common than typical distributions and could 
probably be built on top of what is already in mir.random, but I 
figured it couldn't hurt to bring it to your attention). A 
normal-inverse-gamma distribution is one example of these types 
of distributions. Simulating from this distribution would produce 
a pair of the mean and variance, not just one value. This would 
contrast with multivariate normal in that you would know it has 
two dimensions at compile-time.
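
A rough sketch of what "a pair known at compile time" could look like, again using only Phobos rather than mir. Everything here is an assumption made for illustration: the `MeanVariance` struct, the function names, and the restriction to an integer shape parameter `alpha` so that the gamma draw can be written as a sum of exponentials.

```d
import std.math : cos, log, sqrt, PI;
import std.random : Random, uniform01, unpredictableSeed;

// Result type with a fixed, compile-time-known layout.
struct MeanVariance
{
    double mean;
    double variance;
}

// Uniform draw in the open interval (0, 1).
double openUniform(G)(ref G gen)
{
    double u;
    do
    {
        u = uniform01(gen);
    }
    while (u == 0.0);
    return u;
}

// One standard normal draw via the Box-Muller transform.
double standardNormal(G)(ref G gen)
{
    immutable u1 = openUniform(gen);
    immutable u2 = uniform01(gen);
    return sqrt(-2.0 * log(u1)) * cos(2.0 * PI * u2);
}

// Gamma(shape k, scale 1) for integer k >= 1, as a sum of k Exp(1) draws.
double gammaInt(G)(ref G gen, uint k)
{
    double sum = 0.0;
    foreach (i; 0 .. k)
        sum -= log(openUniform(gen));
    return sum;
}

// variance ~ InverseGamma(alpha, beta); mean | variance ~ N(mu0, variance / lambda).
MeanVariance normalInverseGamma(G)(ref G gen, double mu0, double lambda,
                                   uint alpha, double beta)
{
    immutable variance = beta / gammaInt(gen, alpha);
    immutable mean = mu0 + sqrt(variance / lambda) * standardNormal(gen);
    return MeanVariance(mean, variance);
}

void main()
{
    import std.stdio : writeln;

    auto gen = Random(unpredictableSeed);
    writeln(normalInverseGamma(gen, 0.0, 1.0, 3, 2.0));
}
```

Because the return type is an ordinary value, a variable like this fits the scalar-style `rv(gen)` interface directly and would not need a destination buffer.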


Re: Making mir.random.ndvariable.multivariateNormalVar create bigger data sets than 2

2018-02-27 Thread Nathan S. via Digitalmars-d-learn

On Tuesday, 27 February 2018 at 16:42:00 UTC, Nathan S. wrote:

On Tuesday, 27 February 2018 at 15:08:42 UTC, jmh530 wrote:
Nevertheless, it probably can't hurt to file an issue if you 
can't get something like the first one to work. I would think 
it should just work.


The problem is that `mir.random.ndvariable` doesn't satisfy 
`mir.random.variable.isRandomVariable!T`. ndvariables have a 
slightly different interface from variables: instead of 
`rv(gen)` returning a result, `rv(gen, dst)` writes to dst. I 
agree that the various methods for working with variables 
should be enhanced to work with ndvariables.


So, I see that the interface will have to be slightly different 
for ndvariable than for variable. With the exception of 
MultivariateNormalVariable, the same ndvariable instance can be 
called to fill output of any length "n", so one can't 
meaningfully create a range based on just the ndvariable without 
further specification. What would "front" return? For 
MultivariateNormalVariable "n" is constrained but it is a runtime 
parameter rather than a compile-time parameter.
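
One possible answer to "what would front return?", sketched with toy types and Phobos only. `ToyNdVar`, `NdSampleRange` and `ndRange` are hypothetical names, not part of mir. The idea is that the caller supplies "n" at run time and `front` exposes a reusable buffer of that length.

```d
import std.random : Random, uniform01, unpredictableSeed;
import std.range : take;

// Toy ndvariable (not mir's): fills whatever buffer it is given.
struct ToyNdVar
{
    void opCall(G)(ref G gen, double[] dst)
    {
        foreach (ref x; dst)
            x = uniform01(gen);
    }
}

// Hypothetical adaptor: an infinite input range of n-element draws, where n
// is chosen at run time. `front` is a view of an internal buffer that
// popFront overwrites, so callers must .dup it to keep a sample.
struct NdSampleRange(G, V)
{
    private G* gen;
    private V rv;
    private double[] buf;

    this(G* gen, V rv, size_t n)
    {
        this.gen = gen;
        this.rv = rv;
        buf = new double[n];
        popFront(); // prime the first sample
    }

    enum bool empty = false;

    @property const(double)[] front() const
    {
        return buf;
    }

    void popFront()
    {
        rv(*gen, buf);
    }
}

auto ndRange(G, V)(G* gen, V rv, size_t n)
{
    return NdSampleRange!(G, V)(gen, rv, n);
}

void main()
{
    import std.stdio : writeln;

    auto gen = Random(unpredictableSeed);
    ToyNdVar rv;
    foreach (sample; ndRange(&gen, rv, 4).take(3))
        writeln(sample);
}
```

Reusing one buffer avoids an allocation per draw, at the cost that each `front` is only valid until the next `popFront`. Returning a fresh array each time would be the safer but slower choice.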


You'll want to ping @9il / Ilya Yaroshenko to discuss what the 
API should be like for this.


Re: Making mir.random.ndvariable.multivariateNormalVar create bigger data sets than 2

2018-02-27 Thread Nathan S. via Digitalmars-d-learn

On Tuesday, 27 February 2018 at 15:08:42 UTC, jmh530 wrote:
Nevertheless, it probably can't hurt to file an issue if you 
can't get something like the first one to work. I would think 
it should just work.


The problem is that `mir.random.ndvariable` doesn't satisfy 
`mir.random.variable.isRandomVariable!T`. ndvariables have a 
slightly different interface from variables: instead of 
`rv(gen)` returning a result, `rv(gen, dst)` writes to dst. I 
agree that the various methods for working with variables should 
be enhanced to work with ndvariables.
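
To illustrate the difference without depending on mir, here is a small sketch with toy stand-ins. `ToyScalarVar`, `ToyNdVar` and the `returnsValue` check are invented for this example and only mimic the two calling conventions. They are not mir's types or its `isRandomVariable` trait.

```d
import std.random : Random, uniform01, unpredictableSeed;

// Toy stand-in for a scalar variable: rv(gen) returns one draw.
struct ToyScalarVar
{
    double opCall(G)(ref G gen)
    {
        return uniform01(gen);
    }
}

// Toy stand-in for an ndvariable: rv(gen, dst) fills a caller-supplied buffer.
struct ToyNdVar
{
    void opCall(G)(ref G gen, double[] dst)
    {
        foreach (ref x; dst)
            x = uniform01(gen);
    }
}

// Hypothetical check for what a range adaptor needs: a one-argument call
// whose result can be stored and handed out as `front`.
enum bool returnsValue(V) = __traits(compiles, {
    Random g;
    V v;
    auto x = v(g);
});

static assert(returnsValue!ToyScalarVar);
static assert(!returnsValue!ToyNdVar);

void main()
{
    import std.stdio : writeln;

    auto gen = Random(unpredictableSeed);
    ToyScalarVar s;
    ToyNdVar nd;

    double one = s(gen); // the draw is the return value
    double[3] many;
    nd(gen, many[]);     // the draw is written into the buffer
    writeln(one, " ", many);
}
```

The second `static assert` is the toy version of the problem: generic code written against `rv(gen)` simply does not compile when handed something that only supports `rv(gen, dst)`.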


Re: Making mir.random.ndvariable.multivariateNormalVar create bigger data sets than 2

2018-02-27 Thread jmh530 via Digitalmars-d-learn

On Tuesday, 27 February 2018 at 09:23:49 UTC, kerdemdemir wrote:

I need a classifier in my project.
Since I believe it is the easiest to implement, I am trying to 
implement logistic regression.


I am trying to do the same as the python example:  
https://beckernick.github.io/logistic-regression-from-scratch/


I need two data sets to test with.

This works (https://run.dlang.io/is/yGa4a0):

double[2] x1;
Random* gen = threadLocalPtr!Random;

auto mu = [0.0, 0.0].sliced;
auto sigma = [1.0, 0.75, 0.75, 1].sliced(2,2);
auto rv = multivariateNormalVar(mu, sigma);
rv(gen, x1[]);
writeln(x1);

But when I increase my data set size from double[2] to 
double[100] I get an assertion failure:


mir-random-0.4.3/mir-random/source/mir/random/ndvariable.d(378): Assertion failure

which is:
assert(result.length == n);

How can I get a result vector with a size of around 5000?


Erdemdem


I haven't made much use of mir.random yet...

The dimension 2 in this case is the size of the dimension of the 
random variable. What you want to do is simulate multiple times 
from this 2-dimensional random variable.


It looks like the examples on the main Readme page use 
mir.random.algorithm.range. I tried the code below, but I got 
errors. I did notice that the MultivariateNormalVariable 
documentation says that it is still in beta.


void main()
{
    import mir.random : Random, unpredictableSeed;
    import mir.random.ndvariable : MultivariateNormalVariable;
    import mir.random.algorithm : range;
    import mir.ndslice.slice : sliced;
    import std.range : take;

    auto mu = [10.0, 0.0].sliced;
    auto sigma = [2.0, -1.5, -1.5, 2.0].sliced(2,2);

    auto rng = Random(unpredictableSeed);
    auto sample = range!rng
        (MultivariateNormalVariable!double(mu, sigma))
        .take(10);
}

However, doing it manually with a for loop works.

void main()
{
    import mir.random : rne;
    import mir.random.ndvariable : multivariateNormalVar;
    import mir.random.algorithm : range;
    import mir.ndslice.slice : sliced;
    import std.stdio : writeln;

    auto mu = [10.0, 0.0].sliced;
    auto sigma = [2.0, -1.5, -1.5, 2.0].sliced(2,2);

    auto rv = multivariateNormalVar(mu, sigma);

    double[2][100] x;
    for (size_t i = 0; i < 100; i++) {
        rv(rne, x[i][]);
    }
    writeln(x);
}

Nevertheless, it probably can't hurt to file an issue if you 
can't get something like the first one to work. I would think it 
should just work.