kgyrtkirk commented on code in PR #16517:
URL: https://github.com/apache/druid/pull/16517#discussion_r1627164179


##########
processing/src/main/java/org/apache/druid/segment/DictionaryEncodedColumnMerger.java:
##########
@@ -322,7 +322,7 @@ public void processMergedRow(ColumnValueSelector selector) 
throws IOException
     if (encodedValueSerializer instanceof ColumnarMultiIntsSerializer) {
       ((ColumnarMultiIntsSerializer) encodedValueSerializer).addValues(row);
     } else {
-      int value = row.size() == 0 ? 0 : row.get(0);
+      int value = rowSize == 0 ? 0 : row.get(0);

Review Comment:
   oh - that's true; and its being used as well around here!
   
   but how it works now is that it will call `getRow` for every `size` and 
`get(i)` invocations as well [see 
here](https://github.com/apache/druid/blob/30c59042e0cde3c76392afa3ba73d0830b11b0a5/processing/src/main/java/org/apache/druid/segment/DictionaryEncodedColumnIndexer.java#L161)
   
   the costly process is to produce the `row`  itself; currently built by: 
[StringDimensionIndexer$IndexerDimensionSelector#getRow](https://github.com/apache/druid/blob/30c59042e0cde3c76392afa3ba73d0830b11b0a5/processing/src/main/java/org/apache/druid/segment/StringDimensionIndexer.java#L304)
 - if that could be cached somehow - that would result in the same benefits.
   There are better access to internals there; I believe if the "same" entry is 
being used (`currEntry.get()`) then it might be ok to return the previously 
built row.
   Although the 
[IncrementalIndexRow](https://github.com/apache/druid/blob/30c59042e0cde3c76392afa3ba73d0830b11b0a5/processing/src/main/java/org/apache/druid/segment/incremental/IncrementalIndexRow.java#L110)
  looks partially mutable; but the since the `timestamp` can't be set: I think 
its not intended to be reused - so I think you could put a caching logic there 
based on whether `currEntry.get()` is the same
   
   to ensure that I'll don't send you on a wild goose chase I'll ask 
@clintropolis to possibly doublecheck the above thinking :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to