[GitHub] [spark] RyanBerti commented on pull request #39678: [SPARK-16484][SQL] Add HyperLogLogPlusPlus sketch generator/evaluator/aggregator

2023-05-03 Thread via GitHub
RyanBerti commented on PR #39678: URL: https://github.com/apache/spark/pull/39678#issuecomment-1533691734 Closing in favor of https://github.com/apache/spark/pull/40615 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] RyanBerti commented on pull request #39678: [SPARK-16484][SQL] Add HyperLogLogPlusPlus sketch generator/evaluator/aggregator

2023-01-21 Thread via GitHub
RyanBerti commented on PR #39678: URL: https://github.com/apache/spark/pull/39678#issuecomment-1399332706 Hi @dtenedor and @huaxingao, thanks for the input! I agree with you both that migrating Spark's existing HLL++ implementation to use the Apache DataSketches library would be

[GitHub] [spark] RyanBerti commented on pull request #39678: [SPARK-16484][SQL] Add HyperLogLogPlusPlus sketch generator/evaluator/aggregator

2023-01-20 Thread via GitHub
RyanBerti commented on PR #39678: URL: https://github.com/apache/spark/pull/39678#issuecomment-1398762642 For reference, @dtenedor worked with me on a pre-review of these changes; relevant discussions are available in this PR in my fork: https://github.com/RyanBerti/spark/pull/1