egolearner commented on issue #47255:
URL: https://github.com/apache/arrow/issues/47255#issuecomment-3166265706

   Following the discussion from 
https://github.com/apache/arrow/issues/47199#issuecomment-3150104399_
   
   > Although if we have a dependency free test data generator we would 
probably want to use it everywhere. And in that case we probably want to spend 
the time to connect to the cpp generator we already have in place for cpp 
testing for Python testing.
   
   
   One way to expose the C++ generator is to extend the compute function 
`random`, documented at 
https://arrow.apache.org/docs/cpp/compute.html#random-number-generation. 
Currently `random` only generates double-precision floating-point numbers in 
[0, 1). We could extend it to generate random integers as well by adding 
`min`/`max`/`type` arguments. 
   
   
   ```
   def random(n, *, initializer='system', options=None, memory_pool=None,
   +                   min=0.0, max=1.0, type=pyarrow.float64()):
       """
       Generate numbers in the range [0, 1).
   
       Generated values are uniformly-distributed, double-precision
       in range [0, 1). Algorithm and seed can be changed via RandomOptions.
   ```
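
To make the proposed arguments concrete, here is a minimal pure-Python sketch. 
It does not call Arrow: `random_sketch`, its string-valued `type` argument, and 
the `seed` parameter are hypothetical stand-ins, and the argument names shadow 
Python builtins only to mirror the proposal. The float branch keeps today's 
half-open scaling, the integer branch uses a closed interval, and anything 
else is rejected.

```python
import random as _stdlib_random


def random_sketch(n, *, min=0.0, max=1.0, type="float64", seed=None):
    """Hypothetical stand-in for an extended `random`; not Arrow's API."""
    rng = _stdlib_random.Random(seed)
    if type == "float64":
        # Scale a [0, 1) draw into [min, max); with the defaults this is
        # exactly the current [0, 1) behaviour.
        return [min + (max - min) * rng.random() for _ in range(n)]
    if type == "int64":
        # Closed interval [min, max]: both endpoints can be returned.
        return [rng.randint(int(min), int(max)) for _ in range(n)]
    # Non-numerical types are rejected in this sketch.
    raise TypeError(f"unsupported type: {type}")


floats = random_sketch(1000, seed=1)
ints = random_sketch(1000, min=1, max=3, type="int64", seed=1)
```

A real implementation would take an Arrow `DataType` and return an Arrow 
array; plain lists and strings are used here only to keep the sketch 
dependency free.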
   
   Behaviour change: `random` currently generates numbers in the half-open 
range [0, 1.0), but for integers a closed interval [min, max] is more 
natural, as with Python's `random.randint` (`np.random.randint` itself uses a 
half-open interval [low, high)). As a result, the default call would generate 
values in [0, 1.0] instead of [0, 1.0).
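
The difference between the two interval conventions can be seen with the 
Python standard library alone. This is only an illustration of the semantics, 
not of Arrow's behaviour: `random.randint` includes both endpoints, while 
`random.randrange` excludes the upper one.

```python
import random

rng = random.Random(0)

# Closed interval [0, 1]: both endpoints are possible results.
closed = {rng.randint(0, 1) for _ in range(200)}

# Half-open interval [0, 1): the upper endpoint is never returned.
half_open = {rng.randrange(0, 1) for _ in range(200)}
```

With integer bounds 0 and 1, the half-open convention can only ever yield 0, 
which is why a closed interval tends to be the less surprising choice for 
integer output.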
   
   For non-numerical types (e.g. bool/string), we could either not support 
them in `random` or support them without min/max limits. I would prefer not to 
support non-numerical types in `random`.
   
   Any suggestions? @rok @raulcd 

