zhijiangW commented on a change in pull request #11507: [FLINK-16587] Add basic CheckpointBarrierHandler for unaligned checkpoint
URL: https://github.com/apache/flink/pull/11507#discussion_r403725829
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/api/serialization/SpillingAdaptiveSpanningRecordDeserializer.java
 ##########
 @@ -557,6 +579,28 @@ private void addNextChunkFromMemorySegment(MemorySegment segment, int offset, in
                        }
                }
 
+               @Nullable
+               MemorySegment copyToTargetSegment() {
+                       // for the case of only partial length, no data
+                       final int position = lengthBuffer.position();
+                       if (position > 0) {
+                               MemorySegment segment = MemorySegmentFactory.allocateUnpooledSegment(lengthBuffer.remaining());
+                               segment.put(0, lengthBuffer, lengthBuffer.remaining());
+                               lengthBuffer.position(position);
+                               return segment;
+                       }
+
+                       // for the case of full length, partial data in buffer
+                       if (recordLength != -1) {
+                               // In the PoC we skip the case of large record which size exceeds THRESHOLD_FOR_SPILLING,
 
 Review comment:
   I have not thought the solution through yet. In my previous PoC I also found
this case difficult to handle, because reading partial data back from spilled
files is not easy at the moment, so I left it as a TODO rather than spend much
effort on it.
   
   I think we might not support the large record case in the MVP, but we should
not silently ignore it. Otherwise, once we hit this case in production, the
completed checkpoint cannot be restored correctly because partial records are
missing. Maybe we can discard the unaligned checkpoint in the large record case,
or throw an exception that tells users to tune their checkpoint settings?
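
   One way to surface the large record case instead of silently skipping it
might look like the sketch below. It is a minimal illustration in plain Java,
not Flink code: `PartialRecordSnapshot`, `copyPartialRecord`,
`UnsupportedLargeRecordException`, and the `spillingThreshold` parameter are
all hypothetical names, and the threshold only stands in for
THRESHOLD_FOR_SPILLING from the diff above.

```java
import java.util.Arrays;

// Hypothetical sketch: none of these names exist in Flink.
final class PartialRecordSnapshot {

    /** Signals that a partially received record is too large to be captured. */
    static final class UnsupportedLargeRecordException extends Exception {
        UnsupportedLargeRecordException(String message) {
            super(message);
        }
    }

    /**
     * Copies the bytes of a partially received record so they can be persisted.
     *
     * @param recordLength      full length of the record announced by its length prefix
     * @param buffered          in-memory buffer holding the bytes received so far
     * @param accumulated       number of valid bytes in {@code buffered}
     * @param spillingThreshold assumed stand-in for the deserializer's spilling threshold
     * @throws UnsupportedLargeRecordException if the record would have been spilled to disk
     */
    static byte[] copyPartialRecord(
            int recordLength, byte[] buffered, int accumulated, int spillingThreshold)
            throws UnsupportedLargeRecordException {
        if (recordLength > spillingThreshold) {
            // Fail loudly rather than return nothing: a checkpoint that drops
            // this partial record could not be restored correctly.
            throw new UnsupportedLargeRecordException(
                    "Record of " + recordLength + " bytes exceeds the spilling threshold of "
                            + spillingThreshold + " bytes; partial data cannot be captured.");
        }
        if (accumulated < 0 || accumulated > buffered.length) {
            throw new IllegalArgumentException("Invalid accumulated byte count: " + accumulated);
        }
        // Copy only the bytes that have actually arrived.
        return Arrays.copyOf(buffered, accumulated);
    }
}
```

   A caller could catch the exception and decline the current unaligned
checkpoint, or let it propagate so that users are prompted to adjust their
checkpoint settings. Which of the two is preferable is left open here.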
