alamb commented on issue #21528:
URL: https://github.com/apache/datafusion/issues/21528#issuecomment-4237576874
Thanks @kumarUjjawal and @aryan-212
As @asolimando points out, it is not at all clear to me whether this is a bug or
not -- I am not an expert in how approx distinct works.
For this kind of change I would be worried that changing the existing
implementation would result in other users perceiving the changes as a bug /
regression ("something that used to work for me is now not working")
Maybe you can step back and provide some more context of the rationale for
this change. For example, is your goal to make an approx_* function that
produces the same answer as Spark/Databricks?
Your description says:
> Run:
>
> select approx_percentile(cc_sq_ft, 0.85) from call_center;
> Observe the output
But it didn't explain what the output was.
If the goal is to create a function that mirrors Spark/Databricks, perhaps
adding something to `datafusion-functions-spark` is a better approach than
trying to change the existing function's behavior.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]