clintropolis commented on code in PR #14014:
URL: https://github.com/apache/druid/pull/14014#discussion_r1155479450


##########
processing/src/main/java/org/apache/druid/segment/IndexMergerV9.java:
##########
@@ -263,19 +191,19 @@ private File makeIndexFiles(
       log.debug("Completed factory.json in %,d millis", 
System.currentTimeMillis() - startTime);
 
       progress.progress();
-      final Map<String, TypeSignature<ValueType>> metricTypes = new 
TreeMap<>(Comparators.naturalNullsFirst());
-      final List<ColumnCapabilities> dimCapabilities = 
Lists.newArrayListWithCapacity(mergedDimensions.size());
-      mergeCapabilities(adapters, mergedDimensions, metricTypes, 
dimCapabilities);
+      final Map<String, ColumnFormat> metricTypes = new 
TreeMap<>(Comparators.naturalNullsFirst());
+      final List<ColumnFormat> dimFormats = 
Lists.newArrayListWithCapacity(mergedDimensions.size());
+      mergeFormat(adapters, mergedDimensions, metricTypes, dimFormats);
 
-      final Map<String, DimensionHandler> handlers = 
makeDimensionHandlers(mergedDimensions, dimCapabilities);
+      final Map<String, DimensionHandler> handlers = 
makeDimensionHandlers(mergedDimensions, dimFormats);
       final List<DimensionMergerV9> mergers = new ArrayList<>();
       for (int i = 0; i < mergedDimensions.size(); i++) {
         DimensionHandler handler = handlers.get(mergedDimensions.get(i));
         mergers.add(
             handler.makeMerger(
                 indexSpec,
                 segmentWriteOutMedium,
-                dimCapabilities.get(i),
+                dimFormats.get(i).toColumnCapabilities(),
                 progress,
                 closer
             )

Review Comment:
   the snag here is that right now the string dimension merger needs the 'has 
multiple values' flag set on the capabilities to determine whether it makes a 
serializer for a single value or mutli value string. 
   
   I didn't want to rework the classic columns and indexers to use the 
`ColumnFormat` stuff at this time to minimize the number of changes, but the 
answer is yes, once we make a dedicate format for the `String` dimension schema 
i would think it would capture whether the string was multi-valued or not and 
we can drop this argument.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to