ankit0811 commented on PR #14462: URL: https://github.com/apache/druid/pull/14462#issuecomment-1639471268
> Thanks for raising this PR!! I am relatively unfamiliar with the complex column's serde and am still going through the implementation, however, I have a few high-level comments, based on my understanding of the StringFirst/Last aggregators: > > 1. In the `deserializeColumn`, we should allocate the first byte for a version number. This would help if we decide to update the strategy down the line and want backward compatibility. > 2. The current StringFirst/Last comparators use delta encoding and compression on the numeric column to reduce the size. Also, the PR ([Improve String Last/First Storage Efficiency #12879](https://github.com/apache/druid/pull/12879)) which introduced this change also added a few classes which help in this. I suppose that we would want it sometime. The versioning as mentioned above would help in making these changes if we decide that we don't want it now. However, the complexity of changing from one version to another is high and we require backward compatibility therefore we should question if that should be done in the original PR itself (check this comment: https://github.com/apache/druid/blob/d63eff3b1b6b15fac1bb278a784ce6c391a52ce2/processing/src/main/java/org/apache/druid/query/aggregation/SerializablePairLongStringComplexMetricSerde.java#L54-L66 > ) > 3. Can we refactor the preexisting LongString serde classes to be subclasses of the newly introduced `AbstractSerializableLongObjectPairSerde`? I see that there are some similarities in the methods. In any case, it would be helpful if all of them are under the same umbrella, and we can further categorize the newly added classes as children of `AbstractSerializableLongNumbericPairSerde` Best if we take this in a separate PR? else it will be too big a PR to be reviewed. It already touches quite a few classes -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
