vamshikrishnakyatham commented on code in PR #13852:
URL: https://github.com/apache/hudi/pull/13852#discussion_r2330811556


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/timeline/versioning/v1/TimelineArchiverV1.java:
##########
@@ -442,9 +443,14 @@ private void writeToFile(Schema wrapperSchema, List<IndexedRecord> records) thro
       Map<HeaderMetadataType, String> header = new HashMap<>();
       header.put(HoodieLogBlock.HeaderMetadataType.SCHEMA, wrapperSchema.toString());
       final String keyField = table.getMetaClient().getTableConfig().getRecordKeyFieldProp();
-      List<HoodieRecord> indexRecords = records.stream().map(HoodieAvroIndexedRecord::new).collect(Collectors.toList());
-      HoodieAvroDataBlock block = new HoodieAvroDataBlock(indexRecords, header, keyField);
-      writer.appendBlock(block);
+      List<HoodieRecord> indexRecords = records.stream()
+          .filter(Objects::nonNull)

Review Comment:
   You are right, I will fix that. However, the error stack from the flaky run
   (https://github.com/apache/hudi/actions/runs/17526286058/job/49777188436?pr=13852)
   shows that the data in the record is null.
   
   My assumption about what may be going wrong:
   
   ```
   Thread 1: Gets lock -> Takes timeline snapshot -> Sees instant inst1
   Thread 2: Waits for lock
   Thread 1: Archives instant inst1 -> Deletes files -> Releases lock
   Thread 2: Gets lock -> Creates new timeline -> Instant inst1 missing!
   Thread 2: groupByTsAction.get() returns null for inst1
   Thread 2: Returns Stream.empty() -> silent data loss
   LSM archive: Missing entries -> incomplete (null) data
   Downgrade: Tries to convert the incomplete archive -> null record data
   ```
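
   To make the suspected sequence concrete, here is a minimal sketch of the pattern, not the actual archiver code. The class `StaleSnapshotSketch` and the methods `buildGroupedTimeline` and `collectEntries` are hypothetical names; the map only stands in for the `groupByTsAction` lookup, and it assumes Thread 2 still holds an instant list computed before Thread 1 archived `inst1`.

   ```java
   import java.util.Arrays;
   import java.util.HashMap;
   import java.util.List;
   import java.util.Map;
   import java.util.stream.Collectors;
   import java.util.stream.Stream;

   // Hypothetical illustration only; none of these names exist in Hudi.
   class StaleSnapshotSketch {

     // Stand-in for the grouped lookup: instant time -> entries on the active timeline.
     static Map<String, List<String>> buildGroupedTimeline(List<String> activeInstants) {
       Map<String, List<String>> grouped = new HashMap<>();
       for (String instant : activeInstants) {
         grouped.put(instant, Arrays.asList(instant + ".requested", instant + ".completed"));
       }
       return grouped;
     }

     // Suspected pattern: a missing key is mapped to Stream.empty(),
     // so the instant disappears from the result without any error.
     static List<String> collectEntries(List<String> instantsToArchive,
                                        Map<String, List<String>> grouped) {
       return instantsToArchive.stream()
           .flatMap(instant -> {
             List<String> entries = grouped.get(instant);
             return entries == null ? Stream.<String>empty() : entries.stream();
           })
           .collect(Collectors.toList());
     }

     public static void main(String[] args) {
       // Thread 2's stale view still lists inst1.
       List<String> staleInstants = Arrays.asList("inst1", "inst2");
       // Thread 1 has already archived inst1, so the fresh timeline only has inst2.
       Map<String, List<String>> fresh = buildGroupedTimeline(Arrays.asList("inst2"));
       // inst1 contributes nothing and no exception is raised.
       System.out.println(collectEntries(staleInstants, fresh));
     }
   }
   ```

   If this is what happens, failing fast (or logging) when the lookup returns null would surface the race at its source, whereas filtering nulls at write time only hides it.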



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
