rahil-c commented on code in PR #17768:
URL: https://github.com/apache/hudi/pull/17768#discussion_r2697157911
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/common/model/HoodieSparkRecord.java:
##########
@@ -327,7 +328,15 @@ public Option<HoodieAvroIndexedRecord>
toIndexedRecord(HoodieSchema recordSchema
@Override
public ByteArrayOutputStream getAvroBytes(HoodieSchema recordSchema,
Properties props) throws IOException {
- throw new UnsupportedOperationException();
+ // Convert Spark InternalRow to Avro GenericRecord
+ if (data == null) {
Review Comment:
@vinothchandar
Originally I hit the following exception in
`TestLanceDataSource#testBasicUpsertModifyExistingRow` when trying to upsert on
an existing row for the MOR case (where there should be a Lance base
file plus an Avro log file):
```
Caused by: org.apache.hudi.exception.HoodieAppendException: Failed while
appending records to
/var/folders/lm/0j1q1s_n09b4wgqkdqbzpbkm0000gn/T/junit-11448262777148643233/dataset/test_lance_upsert_merge_on_read/.3169035e-e73a-49ec-be8f-c7045242bf56-0_20260115220744098.log.1_0-38-60
at
org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:511)
at
org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:470)
at
org.apache.hudi.table.action.deltacommit.BaseSparkDeltaCommitActionExecutor.handleUpdate(BaseSparkDeltaCommitActionExecutor.java:82)
at
org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:358)
... 35 more
Caused by: java.lang.UnsupportedOperationException
at
org.apache.hudi.common.model.HoodieSparkRecord.getAvroBytes(HoodieSparkRecord.java:331)
at
org.apache.hudi.common.table.log.block.HoodieAvroDataBlock.serializeRecords(HoodieAvroDataBlock.java:122)
at
org.apache.hudi.common.table.log.block.HoodieDataBlock.getContentBytes(HoodieDataBlock.java:132)
at
org.apache.hudi.common.table.log.HoodieLogFormatWriter.appendBlocks(HoodieLogFormatWriter.java:147)
at
org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:503)
... 38 more
```
When examining the frames of the stack trace, I can see that it is going
through the `upsert` path into `HoodieAppendHandle`
<img width="864" height="283" alt="Screenshot 2026-01-15 at 10 12 28 PM"
src="https://github.com/user-attachments/assets/5edd8d24-dbb8-42c4-ae73-a9ca106e4915"
/>
and attempts to write a log file in
`HoodieAppendHandle#appendDataAndDeleteBlocks`, at the following code pointer:
https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java#L503
The actual block seems to be the `HoodieAvroDataBlock`
<img width="1231" height="248" alt="Screenshot 2026-01-15 at 10 17 29 PM"
src="https://github.com/user-attachments/assets/eef2960b-61ca-44ed-bff3-49d487e5af00"
/>
which contains a method called `serializeRecords`
https://github.com/apache/hudi/blob/30029e37017f64b1a4d682f08c99021fadede70b/hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieAvroDataBlock.java#L122
The actual record type in this case is `HoodieSparkRecord`, which did not
previously implement `getAvroBytes`, which is why I implemented it here for now.
<img width="1150" height="439" alt="Screenshot 2026-01-15 at 10 21 44 PM"
src="https://github.com/user-attachments/assets/0f0fef7d-cdc8-411a-86de-faefc11522d2"
/>
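For reference, a minimal sketch of the serialization step described above — converting a record to an Avro `GenericRecord` and writing it to a `ByteArrayOutputStream` with Avro's binary encoder. This is only an illustration of the pattern, not the PR's actual implementation: `toGenericRecord` is a hypothetical stand-in for the real `InternalRow`-to-Avro conversion that `HoodieSparkRecord` would perform.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

public class AvroBytesSketch {

  // Hypothetical stand-in: the real code would convert the Spark
  // InternalRow payload into a GenericRecord matching recordSchema.
  static GenericRecord toGenericRecord(Schema schema) {
    GenericRecord rec = new GenericData.Record(schema);
    rec.put("id", 1);
    return rec;
  }

  // Sketch of the getAvroBytes shape: serialize the converted record
  // into a ByteArrayOutputStream using Avro's binary encoder.
  static ByteArrayOutputStream getAvroBytes(Schema schema) throws IOException {
    GenericRecord rec = toGenericRecord(schema);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);
    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(baos, null);
    writer.write(rec, encoder);
    encoder.flush();
    return baos;
  }

  public static void main(String[] args) throws IOException {
    Schema schema = SchemaBuilder.record("Rec").fields()
        .requiredInt("id").endRecord();
    ByteArrayOutputStream out = getAvroBytes(schema);
    System.out.println(out.size() > 0);
  }
}
```

With an implementation along these lines, `HoodieAvroDataBlock#serializeRecords` can consume the returned bytes instead of hitting the old `UnsupportedOperationException`.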
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]