clintropolis commented on code in PR #17460:
URL: https://github.com/apache/druid/pull/17460#discussion_r1844406606


##########
processing/src/main/java/org/apache/druid/segment/nested/DictionaryIdLookup.java:
##########
@@ -262,45 +231,73 @@ private void ensureStringDictionaryLoaded()
   private void ensureLongDictionaryLoaded()
   {
     if (longDictionary == null) {
-      longDictionaryFile = makeTempFile(name + 
ColumnSerializerUtils.LONG_DICTIONARY_FILE_NAME);
-      longBuffer = SerializerUtils.mapSerializer(longDictionaryFile, 
longDictionaryWriter);
-      longDictionary = FixedIndexed.read(longBuffer, TypeStrategies.LONG, 
ByteOrder.nativeOrder(), Long.BYTES).get();
-      // reset position
-      longBuffer.position(0);
+      final String fileName = ColumnSerializerUtils.getInternalFileName(
+          name,
+          ColumnSerializerUtils.LONG_DICTIONARY_FILE_NAME
+      );
+      longDictionaryFile = makeTempDir(fileName);
+      longBufferMapper = closer.register(
+          ColumnSerializerUtils.mapSerializer(longDictionaryFile, 
longDictionaryWriter, fileName)
+      );
+      try {
+        final ByteBuffer buffer = longBufferMapper.mapFile(fileName);

Review Comment:
   i think this is a fine assumption for now, `FixedIndexed` currently only 
writes to a single chunk and doesn't support "large" parts that are spread 
across multiple containers like `GenericIndexed`. Once this changes, we would 
still map the entry point file container the same way (assuming we are still 
stuck with ByteBuffer and have to chunk big stuff) and pass in the smoosh 
mapper to `FixedIndexed.read` similar to how we do for the string case:
   
   ```
   ...
           final ByteBuffer stringBuffer = stringBufferMapper.mapFile(fileName);
           stringDictionary = 
StringEncodingStrategies.getStringDictionarySupplier(
               stringBufferMapper,
               stringBuffer,
               ByteOrder.nativeOrder()
           ).get();
   ...
   ```
   
   I could add some checks in case someone updates `FixedIndexedWriter` to 
allow it to spread across chunks but forgets to update these read call sites, 
though I imagine that if this is done all of the calls to `FixedIndexed.read` 
would be updated to take a mapper if we do it in a similar way as 
`GenericIndexed`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to