Re: [E] Re: Apache DataSketches integration

2021-08-27 Alexander Saydakov
I submitted a pull request with some changes I tried to explain here.
https://github.com/apache/impala/pull/30
There are still open questions for me regarding:
- better dependency mechanism
- updating dependency to the latest 3.1.0
- process flow in aggregate functions (avoiding overhead of pairwi

Re: [E] Re: Apache DataSketches integration

2021-08-24 Alexander Saydakov
I am afraid that I was misunderstood regarding a few points. Let me try to clarify. Regarding serialization using bytes as opposed to a stream: this has nothing to do with the BINARY data type in Impala. Currently I see in the Impala code something like this (simplified): std::stringstream tmp; sketch
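
To make the bytes-versus-stream distinction above concrete, here is a minimal, self-contained sketch. It does not complete the truncated snippet, and it is not the DataSketches or Impala code: `ToySketch`, its two `serialize` overloads, and the helpers `via_stream`, `via_bytes`, and `both_paths_agree` are hypothetical names invented for illustration. It only shows that one object can be serialized either through a `std::stringstream` or directly into a byte buffer, with the same resulting bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a sketch object (NOT the DataSketches API).
struct ToySketch {
  std::uint64_t count = 0;
  std::uint64_t seed = 0;

  // Stream-based serialization: writes the fields to any std::ostream.
  void serialize(std::ostream& os) const {
    os.write(reinterpret_cast<const char*>(&count), sizeof(count));
    os.write(reinterpret_cast<const char*>(&seed), sizeof(seed));
  }

  // Byte-based serialization: fills one contiguous buffer directly.
  std::vector<std::uint8_t> serialize() const {
    std::vector<std::uint8_t> bytes(sizeof(count) + sizeof(seed));
    std::memcpy(bytes.data(), &count, sizeof(count));
    std::memcpy(bytes.data() + sizeof(count), &seed, sizeof(seed));
    return bytes;
  }
};

// Stream path: serialize into a temporary stringstream, then copy the
// stream's contents out with str().
std::string via_stream(const ToySketch& s) {
  std::stringstream tmp;
  s.serialize(tmp);
  return tmp.str();
}

// Byte path: no intermediate stream object is involved.
std::vector<std::uint8_t> via_bytes(const ToySketch& s) {
  return s.serialize();
}

// Both paths are expected to produce identical bytes.
bool both_paths_agree(const ToySketch& s) {
  const std::string a = via_stream(s);
  const std::vector<std::uint8_t> b = via_bytes(s);
  return a.size() == b.size() &&
         std::memcmp(a.data(), b.data(), b.size()) == 0;
}
```

In this toy version the choice between the two paths is independent of how the resulting bytes are later stored, whether in a string column or a binary one.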

Re: [E] Re: Apache DataSketches integration

2021-08-16 Alexander Saydakov
I am away for a few days. I will have a look soon. Thank you.
On Mon, Aug 16, 2021 at 9:44 AM Quanlong Huang wrote:
> Thank Fucun for creating the JIRAs!
>
> Regarding the dependency. I see that the current approach is to copy all
>> files from Datasketches into a single pile. Is there a better