cryptoe commented on code in PR #17044:
URL: https://github.com/apache/druid/pull/17044#discussion_r1760473229


##########
server/src/main/java/org/apache/druid/segment/metadata/FingerprintGenerator.java:
##########
@@ -53,12 +59,19 @@ public FingerprintGenerator(ObjectMapper objectMapper)
    * Generates fingerprint or hash string for an object using SHA-256 hash algorithm.
    */
   @SuppressWarnings("UnstableApiUsage")
-  public String generateFingerprint(SchemaPayload schemaPayload, String dataSource, int version)
+  public String generateFingerprint(final SchemaPayload schemaPayload, final String dataSource, final int version)
   {
+    // Sort the column names in lexicographic order
+    // This ensures that all permutations of a given columns would result in the same fingerprint
+    // thus avoiding schema explosion in the metadata database
+    // Note that this signature is not persisted anywhere, it is only used for fingerprint computation
+    final RowSignature sortedSignature = getLexicographicallySortedSignature(schemaPayload.getRowSignature());
+    final SchemaPayload updatedPayload = new SchemaPayload(sortedSignature, schemaPayload.getAggregatorFactories());

Review Comment:
   Please also add a note about the aggregator factories: they are independent 
of column order, since they are backed by a map.
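
   To illustrate the property the comment should describe, here is a rough, 
self-contained sketch. It is not Druid's `FingerprintGenerator`: the class 
`FingerprintSketch`, the method `fingerprint`, and the use of plain 
`Map<String, String>` for both columns and aggregators are assumptions made for 
illustration only. It hashes both maps in sorted key order, so the insertion 
order of either one does not affect the result. Whether the real aggregator map 
serializes deterministically is something the PR itself would need to confirm.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; names and types here are not taken from Druid.
public class FingerprintSketch
{
  // Computes a SHA-256 hex digest over column and aggregator entries in sorted key order.
  public static String fingerprint(Map<String, String> columnTypes, Map<String, String> aggregators)
  {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      // TreeMap orders String keys lexicographically, regardless of the input map's order.
      update(digest, new TreeMap<>(columnTypes));
      digest.update((byte) 1); // separator between the two sections
      update(digest, new TreeMap<>(aggregators));

      StringBuilder hex = new StringBuilder();
      for (byte b : digest.digest()) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    }
    catch (NoSuchAlgorithmException e) {
      // SHA-256 is required to be available on every Java platform.
      throw new IllegalStateException(e);
    }
  }

  private static void update(MessageDigest digest, TreeMap<String, String> sorted)
  {
    for (Map.Entry<String, String> entry : sorted.entrySet()) {
      digest.update(entry.getKey().getBytes(StandardCharsets.UTF_8));
      digest.update((byte) 0); // delimiter so adjacent fields cannot run together
      digest.update(entry.getValue().getBytes(StandardCharsets.UTF_8));
      digest.update((byte) 0);
    }
  }
}
```

   With this shape, two payloads that list the same columns in a different 
order produce the same fingerprint, while a change to any name or type 
produces a different one.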



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]