Oh, sorry, I forgot to include the URLs for the pictures I referred to:

http://en.wikipedia.org/wiki/Normal_distribution
http://en.wikipedia.org/wiki/Exponential_distribution

regards,
--
Mitsumasa KONDO

2014-03-15 17:50 GMT+09:00 Mitsumasa KONDO <kondo.mitsum...@gmail.com>:

> Hi
>
> 2014-03-15 15:53 GMT+09:00 Fabien COELHO <coe...@cri.ensmp.fr>:
>
>> Hello Heikki,
>>
>>> A couple of comments:
>>>
>>> * There should be an explicit "\setrandom ... uniform" option too, even
>>> though you get that implicitly if you don't specify the distribution
>>
>> Indeed. I agree. I suggested it, but it got lost.
>
> OK. If we want to stay close to the SQL grammar, you are right. I will
> add it.
>
>>> * What exactly does the "threshold" mean? The docs informally explain
>>> that "the larger the threshold, the more frequent values close to the
>>> middle of the interval are drawn", but that's pretty vague.
>>
>> There are explanations and computations as comments in the code. If it is
>> about the documentation, I'm not sure that a very precise mathematical
>> definition will help a lot of people, and might rather hinder
>> understanding, so the doc focuses on an intuitive explanation instead.
>
> Yeah, I think we should explain only the information needed to use this
> feature. If we add the mathematical theory to the docs, it will be too
> difficult for users, and that would be wasted effort.
>
>>> * Do min and max really make sense for gaussian and exponential
>>> distributions? For gaussian, I would expect mean and standard deviation
>>> as the parameters, not min/max/threshold.
>>
>> Yes... and no :-) The aim is to draw an integer primary key from a table,
>> so it must be in a specified range. This is approximated by drawing a
>> double value with the expected distribution (gaussian or exponential) and
>> projecting it carefully onto integers. If it is out of range, there is a
>> loop and another value is drawn. The minimal threshold constraint (2.0)
>> ensures that the probability of looping is low.
>
> I think this is difficult to understand from our text... So I created a
> picture that should help you understand it.
> Please take a look.
>
>>> * How about setting the variable as a float instead of integer? Would
>>> seem more natural to me. At least as an option.
>>
>> Which variable? The values set by setrandom are mostly used for primary
>> keys. We really want integers in a range.
>
> I think he meant the threshold parameter. The threshold is a very
> sensitive parameter, so it needs to be a double. I think you will agree
> once you see the attached picture.
>
> regards,
> --
> Mitsumasa KONDO
> NTT Open Source Software Center
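
In case it helps alongside the pictures, here is a minimal Python sketch of the draw-and-project loop described in the quoted text, for the gaussian case only. It is not the code from the patch. The function name gaussian_int, the centering on the middle of the interval, and the reading of the threshold as "number of standard deviations from the middle to the edge of the interval" are all my assumptions for illustration.

```python
import random

MIN_THRESHOLD = 2.0  # lower bound on the threshold mentioned in the thread


def gaussian_int(lo, hi, threshold, rng=None):
    """Draw an integer in [lo, hi], clustered around the middle.

    Hypothetical helper, not the patch's function. Assumption: the
    interval spans [-threshold, +threshold) standard deviations.
    """
    if threshold < MIN_THRESHOLD:
        raise ValueError("threshold must be at least 2.0")
    rng = rng or random.Random()

    # Draw a standard normal double; loop and redraw if it is out of range.
    # With threshold >= 2.0, fewer than 5% of draws are rejected.
    while True:
        z = rng.gauss(0.0, 1.0)
        if -threshold <= z < threshold:
            break

    # Project the accepted double onto the integer range.
    u = (z + threshold) / (2.0 * threshold)  # nominally in [0, 1)
    # min() guards the rare case where rounding pushes u up to exactly 1.0.
    return min(hi, lo + int(u * (hi - lo + 1)))


if __name__ == "__main__":
    rng = random.Random(2014)
    print([gaussian_int(1, 100, 5.0, rng) for _ in range(10)])
```

Under these assumptions, with lo=1, hi=100 and threshold=5.0, roughly two thirds of the draws land in 41..60, and a larger threshold concentrates them further. Because the threshold sets the shape this directly, small changes to it matter, which is the reason to accept it as a double.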