rok commented on issue #47288:
URL: https://github.com/apache/arrow/issues/47288#issuecomment-3201074791

   > > Generating random data for testing (where you generally want data to be 
"interesting" but you don't need high statistical quality) is not the same 
thing as providing random generation facilities for production (where you 
generally want data to have guaranteed statistical properties, such as: 
uniform/normal/etc., with certain parameters).
   
   Agreed. I started this discussion because we already have a 
[random kernel](https://arrow.apache.org/docs/cpp/compute.html#random-number-generation)
 which we claim is uniform. So I assumed we're now in the business of 
statistically exact random generation too.
   
   > We can definitely provide more random generation kernels for other numeric 
types, but you can also generate `float64` and cast to the target type (the 
exception being higher-precision data types such as decimals).
   
   Does casting preserve the statistical properties of the distribution?
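
To make the question concrete, here is a minimal sketch of one way a cast can change a guarantee. It assumes a generator that yields `float64` values uniform on `[0, 1)` and a cast that rounds to nearest. It uses Python's standard `struct` module as a stand-in for a `float64` to `float32` cast, not Arrow's cast kernel, and the helper names `to_float32` and `to_int_bucket` are hypothetical.

```python
import struct


def to_float32(x: float) -> float:
    """Round a float64 to the nearest float32 (stand-in for a cast)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def to_int_bucket(u: float, n: int) -> int:
    """Map u in [0, 1) to an integer in [0, n) by scaling and truncating."""
    return int(u * n)


# Largest float64 strictly below 1.0: a legal output of a [0, 1) generator.
largest_below_one = 1.0 - 2.0**-53
assert largest_below_one < 1.0

# float32 has only 24 bits of significand, so the value rounds up to 1.0,
# which lies outside the half-open interval the float64 source guaranteed.
cast_value = to_float32(largest_below_one)
assert cast_value == 1.0

# Scaling and truncating the same value stays inside [0, n) for n = 10.
assert to_int_bucket(largest_below_one, 10) == 9
```

So under these assumptions the shape of the distribution survives only approximately: the upper bound can become inclusive after narrowing, and any integer target needs an explicit scale-and-truncate step whose bias depends on the range.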
   
   > As for random binary types, are there well-known distributions we can 
expose?
   
   I'm not familiar with any. If we agree there's interest, I suppose we 
should discuss them case by case.

