cryptoe commented on code in PR #17044:
URL: https://github.com/apache/druid/pull/17044#discussion_r1760473229
##########
server/src/main/java/org/apache/druid/segment/metadata/FingerprintGenerator.java:
##########
@@ -53,12 +59,19 @@ public FingerprintGenerator(ObjectMapper objectMapper)
* Generates fingerprint or hash string for an object using SHA-256 hash algorithm.
*/
@SuppressWarnings("UnstableApiUsage")
- public String generateFingerprint(SchemaPayload schemaPayload, String dataSource, int version)
+ public String generateFingerprint(final SchemaPayload schemaPayload, final String dataSource, final int version)
{
+ // Sort the column names in lexicographic order
+ // This ensures that all permutations of a given columns would result in the same fingerprint
+ // thus avoiding schema explosion in the metadata database
+ // Note that this signature is not persisted anywhere, it is only used for fingerprint computation
+ final RowSignature sortedSignature = getLexicographicallySortedSignature(schemaPayload.getRowSignature());
+ final SchemaPayload updatedPayload = new SchemaPayload(sortedSignature, schemaPayload.getAggregatorFactories());
Review Comment:
Please also add a note about the aggregator factories: they are independent of
column order since they are backed by a map.
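
To illustrate why sorting makes the fingerprint independent of column order, here is a minimal sketch using only the JDK. It is not the code in this PR: `FingerprintSketch`, `fingerprint`, and the name-to-type `Map<String, String>` are hypothetical stand-ins for `FingerprintGenerator`, `RowSignature`, and `SchemaPayload`, and the real class serializes through an `ObjectMapper` rather than hashing strings directly.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not Druid's FingerprintGenerator.
public class FingerprintSketch
{
  // Hashes column name/type pairs after sorting them by column name.
  public static String fingerprint(Map<String, String> columnTypes)
  {
    // TreeMap iterates its keys in natural (lexicographic) String order,
    // so the insertion order of the input map no longer matters.
    final Map<String, String> sorted = new TreeMap<>(columnTypes);

    final MessageDigest digest;
    try {
      digest = MessageDigest.getInstance("SHA-256");
    }
    catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }

    for (Map.Entry<String, String> entry : sorted.entrySet()) {
      digest.update(entry.getKey().getBytes(StandardCharsets.UTF_8));
      digest.update((byte) 0); // separator between name and type
      digest.update(entry.getValue().getBytes(StandardCharsets.UTF_8));
      digest.update((byte) 0); // separator between columns
    }

    // Render the 32-byte digest as 64 lowercase hex characters.
    final StringBuilder hex = new StringBuilder();
    for (byte b : digest.digest()) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }

  public static void main(String[] args)
  {
    // Same columns, inserted in different orders.
    final Map<String, String> first = new LinkedHashMap<>();
    first.put("__time", "LONG");
    first.put("dim1", "STRING");
    first.put("metric1", "DOUBLE");

    final Map<String, String> second = new LinkedHashMap<>();
    second.put("metric1", "DOUBLE");
    second.put("__time", "LONG");
    second.put("dim1", "STRING");

    // Both permutations hash to the same value once sorted.
    System.out.println(fingerprint(first).equals(fingerprint(second)));
  }
}
```

A payload component that is already keyed by name, as the aggregator factories are said to be here, could be canonicalized the same way, which is why the note is worth stating explicitly in the code comment.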
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]