matthewmturner commented on issue #1486:
URL: 
https://github.com/apache/arrow-datafusion/issues/1486#issuecomment-1016684233


   @realno you can take this with a grain of salt as I am new to this.
   
   My thinking is that I would prefer to see the exact median implementation 
before having an approximate one (i.e., the approximate would be an add-on 
feature). I could be wrong, but I believe DataFusion had `DISTINCT` before 
`approx_distinct`.
   
   Regarding the implementation - I thought that we would be able to use the 
existing arrow sort kernel and array methods for this and not have to 
re-implement existing functionality:
   
   - sort: https://docs.rs/arrow/latest/arrow/compute/kernels/sort/fn.sort.html
   - length: 
https://docs.rs/arrow/latest/arrow/array/trait.Array.html#method.len
   - value: 
https://docs.rs/arrow/latest/arrow/array/struct.PrimitiveArray.html#method.value
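
To make the sort / length / value idea concrete, here is a minimal sketch of an exact median. It is written against a plain `&[f64]` slice with no dependencies, so it is only a stand-in: the function name `exact_median` is hypothetical, the slice sort stands in for the arrow `sort` kernel, `len()` for `Array::len`, and indexing for `PrimitiveArray::value`. Null handling and non-float types are left out.

```rust
/// Exact median of a slice of f64 values (hypothetical helper, not a DataFusion API).
/// Returns None for empty input.
fn exact_median(values: &[f64]) -> Option<f64> {
    // Step 1 (sort): copy the input and sort it ascending.
    // `total_cmp` gives a total order over f64, so NaN does not panic.
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));

    // Step 2 (length): the element count decides which index(es) to read.
    let n = sorted.len();
    if n == 0 {
        return None;
    }

    // Step 3 (value): read the middle element, or average the two middle
    // elements when the count is even.
    let mid = n / 2;
    if n % 2 == 1 {
        Some(sorted[mid])
    } else {
        Some((sorted[mid - 1] + sorted[mid]) / 2.0)
    }
}
```

The catch with this approach is that every input value has to be held and sorted before a result is available, which is presumably where an approximate variant would help later on.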
   
   I suppose this would be somewhere between your Option 1 and Option 2.
   
   I definitely defer to @alamb though.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
