Alexey Kudinkin created HUDI-2898:
-------------------------------------

             Summary: `AvroRuntimeException` thrown when clustering payloads w/ no "ts" field
                 Key: HUDI-2898
                 URL: https://issues.apache.org/jira/browse/HUDI-2898
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Alexey Kudinkin


While validating Z-order/Hilbert curve ordering, I ran into problems ingesting the 
[Amazon Reviews|https://s3.amazonaws.com/amazon-reviews-pds/readme.html] 
dataset into Hudi. The following exception appears in the log:

 
{code:java}
21/11/30 10:22:42 ERROR HoodieWriteHandle: Error writing record HoodieRecord{key=HoodieKey { recordKey=R2I675JE64OFU1 partitionPath=default}, currentLocation='null', newLocation='null'}
org.apache.avro.AvroRuntimeException: Not a valid schema field: ts
        at org.apache.avro.generic.GenericData$Record.get(GenericData.java:256)
        at org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(HoodieAvroUtils.java:462)
        at org.apache.hudi.common.model.DefaultHoodieRecordPayload.updateEventTime(DefaultHoodieRecordPayload.java:90)
        at org.apache.hudi.common.model.DefaultHoodieRecordPayload.getInsertValue(DefaultHoodieRecordPayload.java:84)
        at org.apache.hudi.execution.HoodieLazyInsertIterable$HoodieInsertValueGenResult.<init>(HoodieLazyInsertIterable.java:90)
        at org.apache.hudi.execution.HoodieLazyInsertIterable.lambda$getTransformFunction$0(HoodieLazyInsertIterable.java:103)
        at org.apache.hudi.common.util.queue.BoundedInMemoryQueue.insertRecord(BoundedInMemoryQueue.java:190)
        at org.apache.hudi.common.util.queue.IteratorBasedQueueProducer.produce(IteratorBasedQueueProducer.java:46)
        at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$0(BoundedInMemoryExecutor.java:92)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
{code}
 

 

The root cause is that the `DefaultHoodieRecordPayload` class tries to access 
the `ts` field in a payload that does not actually contain it, which results 
in an `AvroRuntimeException`. 
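
The failure mode can be illustrated with a small self-contained sketch. It does not use the real Avro or Hudi classes: `SimpleRecord`, `get` and `getIfPresent` are hypothetical stand-ins, and the field names other than `ts` are illustrative. It only shows why a strict by-name lookup fails when the schema lacks the field, and one possible shape of a guarded lookup; it is not the actual fix.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Simplified stand-in for a schema-backed record. NOT the Avro or Hudi API.
public class MissingFieldSketch {

    // Hypothetical record type: a fixed set of schema fields plus their values.
    static final class SimpleRecord {
        private final Set<String> schemaFields;
        private final Map<String, Object> values = new HashMap<>();

        SimpleRecord(Set<String> schemaFields) {
            this.schemaFields = schemaFields;
        }

        void put(String field, Object value) {
            if (!schemaFields.contains(field)) {
                throw new IllegalArgumentException("Not a valid schema field: " + field);
            }
            values.put(field, value);
        }

        boolean hasField(String field) {
            return schemaFields.contains(field);
        }

        // Strict lookup: asking for a field that is absent from the schema throws,
        // which is the kind of behaviour reported in the stack trace above.
        Object get(String field) {
            if (!schemaFields.contains(field)) {
                throw new IllegalStateException("Not a valid schema field: " + field);
            }
            return values.get(field);
        }
    }

    // Guarded lookup (an assumption about what a defensive caller could do):
    // check the schema first and return an empty Optional instead of throwing.
    static Optional<Object> getIfPresent(SimpleRecord record, String field) {
        return record.hasField(field)
                ? Optional.ofNullable(record.get(field))
                : Optional.empty();
    }

    public static void main(String[] args) {
        // A payload whose schema has no "ts" field (field names are illustrative).
        SimpleRecord review = new SimpleRecord(
                new HashSet<>(Arrays.asList("review_id", "star_rating")));
        review.put("review_id", "R2I675JE64OFU1");

        try {
            review.get("ts");
        } catch (IllegalStateException e) {
            System.out.println("strict lookup failed: " + e.getMessage());
        }
        System.out.println("guarded lookup present: "
                + getIfPresent(review, "ts").isPresent());
    }
}
```

In this sketch the strict lookup throws for `ts`, while the guarded lookup reports the field as absent and lets the caller decide how to proceed.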

 

This regression was introduced by 
[PR#4115|https://github.com/apache/hudi/pull/4115], which changed the default 
record payload class to `DefaultHoodieRecordPayload`.



