alamb commented on issue #21528:
URL: https://github.com/apache/datafusion/issues/21528#issuecomment-4237576874

   Thanks @kumarUjjawal and @aryan-212 
   
   As @asolimando points out, it is not at all clear to me whether this is a 
bug or not -- I am not an expert in how approx distinct works
   
   For this kind of change I would be worried that changing the existing 
implementation would result in other users perceiving the changes as a bug / 
regression ("something that used to work for me is now not working")
   
   Maybe you can step back and provide some more context on the rationale for 
this change. For example, is your goal to make an approx_* function that 
produces the same answer as Spark/Databricks? 
   
   Your description says:
   
   > Run:
   > 
   > select approx_percentile(cc_sq_ft, 0.85) from call_center;
   > Observe the output
   
   But it didn't explain what the output was
   
   If the goal is to create a function that mirrors Spark/Databricks, perhaps 
adding something to `datafusion-functions-spark` is a better approach than 
trying to change the existing function's behavior
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

