ankit0811 commented on PR #14462:
URL: https://github.com/apache/druid/pull/14462#issuecomment-1639471268

   > Thanks for raising this PR!! I am relatively unfamiliar with the complex 
column's serde and am still going through the implementation, however, I have a 
few high-level comments, based on my understanding of the StringFirst/Last 
aggregators:
   > 
   > 1. In the `deserializeColumn`, we should allocate the first byte for a 
version number. This would help if we decide to update the strategy down the 
line and want backward compatibility.
   > 2. The current StringFirst/Last comparators use delta encoding and 
compression on the numeric column to reduce the size. Also, the PR ([Improve 
String Last/First Storage Efficiency 
#12879](https://github.com/apache/druid/pull/12879)) which introduced this 
change also added a few classes which help in this. I suppose that we would 
want it sometime. The versioning as mentioned above would help in making these 
changes if we decide that we don't want it now. However, the complexity of 
changing from one version to another is high and we require backward 
compatibility therefore we should question if that should be done in the 
original PR itself (check this comment: 
https://github.com/apache/druid/blob/d63eff3b1b6b15fac1bb278a784ce6c391a52ce2/processing/src/main/java/org/apache/druid/query/aggregation/SerializablePairLongStringComplexMetricSerde.java#L54-L66
   >    )
   > 3. Can we refactor the preexisting LongString serde classes to be 
subclasses of the newly introduced `AbstractSerializableLongObjectPairSerde`? I 
see that there are some similarities in the methods. In any case, it would be 
helpful if all of them are under the same umbrella, and we can further 
categorize the newly added classes as children of 
`AbstractSerializableLongNumbericPairSerde`
   
   Best if we take this in a separate PR? else it will be too big a PR to be 
reviewed. It already touches quite a few classes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to