> On 13 Jul 2017, at 19:14, Fabien COELHO <coe...@cri.ensmp.fr> wrote:
> Documentation says that the closer theta is from 0 the flatter the
> distribution
> but the implementation requires at least 1, including strange error messages:
>  zipfian parameter must be greater than 1.000000 (not 1.000000)
> Could theta be allowed between 0 & 1 ? I've tried forcing with theta = 0.1
> and it worked well, so I'm not sure that I understand the restriction.
> I also tried with theta=0.001 but it seemed less good.
The algorithm works with theta less than 1. The only problem is that theta
cannot be exactly 1, because the following line of code would divide by zero:

cell->alpha = 1. / (1 - theta);

That is why I added the restriction. I now see two possible solutions:
1) Exclude 1, and allow everything else in the range (0;+∞).
2) Or increase or decrease theta by a very small amount if it is exactly 1.
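
A minimal sketch of option 1, to make the proposal concrete. The helper names
zipf_theta_is_valid and zipf_compute_alpha are hypothetical and are not taken
from the patch; only the alpha formula comes from the line quoted above.

```c
#include <math.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not pgbench code): accept any finite theta in
 * (0, +inf) except exactly 1, where 1 / (1 - theta) is undefined.
 */
bool
zipf_theta_is_valid(double theta)
{
    return isfinite(theta) && theta > 0.0 && theta != 1.0;
}

/*
 * Compute alpha = 1 / (1 - theta).
 * Returns false and leaves *alpha untouched if theta is rejected.
 */
bool
zipf_compute_alpha(double theta, double *alpha)
{
    if (!zipf_theta_is_valid(theta))
        return false;

    *alpha = 1.0 / (1.0 - theta);
    return true;
}
```

With this check, theta = 0.1 and theta = 3.0 are both accepted and only
theta = 1 (or a non-positive value) is reported as an error. Option 2 would
instead replace theta by a nearby value before the division.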

> I have also tried to check the distribution wrt the explanations, with the
> attached scripts, n=100, theta=1.000001/1.5/3.0: It does not seem to work,
> there is repeatable +15% bias on i=3 and repeatable -3% to -30% bias for
> values in i=10-100, this for different values of theta (1.000001,1.5, 3.0).
>
> If you try the script, beware to set parameters (theta, n) consistently.

I ran the scripts you attached with different values of theta and different
numbers of outcomes (not n; n stays at 100). For theta = 0.1 and a large
number of outcomes, the result is very close to a zipfian distribution: with
100 000 outcomes the bias is between -6% and 8% over the whole range, and with
1 000 000 outcomes it is between -2% and 2%.

By number of outcomes (NOO) I mean how many times random_zipfian was called.
For example:
pgbench -f compte_bench.sql -t 100000

So I think it works, but less well for a small number of outcomes. We also
need to find the optimal theta for better results.

