nsivabalan commented on code in PR #13614:
URL: https://github.com/apache/hudi/pull/13614#discussion_r2232101872


##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/KeyBasedFileGroupRecordBuffer.java:
##########
@@ -99,11 +99,9 @@ public void processDataBlock(HoodieDataBlock dataBlock, Option<KeySpec> keySpecO
   @Override
   public void processNextDataRecord(BufferedRecord<T> record, Serializable recordKey) throws IOException {
     BufferedRecord<T> existingRecord = records.get(recordKey);
-    Option<BufferedRecord<T>> bufferRecord = doProcessNextDataRecord(record, existingRecord);

Review Comment:
   So we can also delete the `doProcessNextDataRecord` method, right?



##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/BufferedRecordSerializer.java:
##########
@@ -31,52 +31,81 @@
 
 /**
  * An implementation of {@link CustomSerializer} for {@link BufferedRecord}.
- *
  */
 public class BufferedRecordSerializer<T> implements CustomSerializer<BufferedRecord<T>> {
-  public static final int KRYO_SERIALIZER_INITIAL_BUFFER_SIZE = 1048576;
-  private final Kryo kryo;
-  // Caching ByteArrayOutputStream to avoid recreating it for every operation
-  private final ByteArrayOutputStream baos;
+  // Caching kryo serializer to avoid creating kryo instance for every serde operation
+  private static final ThreadLocal<InternalSerializerInstance> SERIALIZER_REF =

Review Comment:
   Curious to understand the overhead here.
   We instantiate this just once per file group, right (even before this patch)? I see it is used in `ExternalSpillableMap` for the merged log records, so it is one instance per file group.
   So how much benefit do we actually get here?
   And where do we access these across threads? Isn't `ThreadLocal.withInitial()` mainly meant to help with thread safety, with each thread getting its own local copy?

   Can you shed some light, please?
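
To make the question concrete, here is a minimal sketch of what `ThreadLocal.withInitial()` gives in general: one lazily created instance per thread, reused by every caller on that thread. This is not the PR's code. `ThreadLocalDemo`, `ExpensiveSerializer`, `CREATED`, and `serializerIdForFileGroup` are hypothetical names for illustration, and the sketch assumes the cached object is costly to construct.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalDemo {
    // Hypothetical stand-in for an object that is expensive to construct.
    static final class ExpensiveSerializer {
        final int id;

        ExpensiveSerializer(int id) {
            this.id = id;
        }
    }

    // Counts how many serializer instances have been constructed overall.
    static final AtomicInteger CREATED = new AtomicInteger();

    // One lazily created instance per thread, shared by all callers on that thread.
    static final ThreadLocal<ExpensiveSerializer> SERIALIZER_REF =
        ThreadLocal.withInitial(() -> new ExpensiveSerializer(CREATED.incrementAndGet()));

    // Simulates per-file-group usage: every call on a thread sees that thread's instance.
    static int serializerIdForFileGroup() {
        return SERIALIZER_REF.get().id;
    }

    public static void main(String[] args) throws InterruptedException {
        int first = serializerIdForFileGroup();
        int second = serializerIdForFileGroup();

        // A different thread triggers its own initial value.
        int[] other = new int[1];
        Thread t = new Thread(() -> other[0] = serializerIdForFileGroup());
        t.start();
        t.join();

        System.out.println("same thread reuses instance: " + (first == second));
        System.out.println("other thread gets its own: " + (first != other[0]));
    }
}
```

Under these assumptions, the saving would come from a thread that processes many file groups paying the construction cost once instead of once per file group. Whether that is significant here depends on how many file groups each thread handles and how costly the instance is to create.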
   


