Alexey Kudinkin created HUDI-2898:
-------------------------------------
Summary: `AvroRuntimeException` thrown when clustering payloads w/
no "ts" field
Key: HUDI-2898
URL: https://issues.apache.org/jira/browse/HUDI-2898
Project: Apache Hudi
Issue Type: Bug
Reporter: Alexey Kudinkin
While validating Z-order/Hilbert curve ordering, I ran into problems ingesting the
[Amazon Reviews|https://s3.amazonaws.com/amazon-reviews-pds/readme.html]
dataset into Hudi, with the following exception appearing in the log:
{code:java}
21/11/30 10:22:42 ERROR HoodieWriteHandle: Error writing record HoodieRecord{key=HoodieKey { recordKey=R2I675JE64OFU1 partitionPath=default}, currentLocation='null', newLocation='null'}
org.apache.avro.AvroRuntimeException: Not a valid schema field: ts
    at org.apache.avro.generic.GenericData$Record.get(GenericData.java:256)
    at org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(HoodieAvroUtils.java:462)
    at org.apache.hudi.common.model.DefaultHoodieRecordPayload.updateEventTime(DefaultHoodieRecordPayload.java:90)
    at org.apache.hudi.common.model.DefaultHoodieRecordPayload.getInsertValue(DefaultHoodieRecordPayload.java:84)
    at org.apache.hudi.execution.HoodieLazyInsertIterable$HoodieInsertValueGenResult.<init>(HoodieLazyInsertIterable.java:90)
    at org.apache.hudi.execution.HoodieLazyInsertIterable.lambda$getTransformFunction$0(HoodieLazyInsertIterable.java:103)
    at org.apache.hudi.common.util.queue.BoundedInMemoryQueue.insertRecord(BoundedInMemoryQueue.java:190)
    at org.apache.hudi.common.util.queue.IteratorBasedQueueProducer.produce(IteratorBasedQueueProducer.java:46)
    at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$0(BoundedInMemoryExecutor.java:92)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
{code}
The root cause was found to be the `DefaultHoodieRecordPayload` class
trying to access the `ts` field in a payload that does not actually contain it,
resulting in an `AvroRuntimeException`.
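The failure mode can be illustrated with a small, self-contained sketch. It does not use the Avro or Hudi APIs: `SchemaRecord`, `eventTimeUnguarded`, and `eventTimeGuarded` are hypothetical names, and the field names `review_id` and `star_rating` are only illustrative. The stand-in record rejects field names that are not in its schema, as the record in the stack trace above does, and the guarded variant shows one possible defensive lookup; this is not necessarily the fix that was adopted.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical model of the failure; NOT the Avro or Hudi API.
final class MissingFieldSketch {

    /** Stand-in record that rejects field names missing from its schema. */
    static final class SchemaRecord {
        private final Set<String> schemaFields;
        private final Map<String, Object> values = new HashMap<>();

        SchemaRecord(Set<String> schemaFields) {
            this.schemaFields = schemaFields;
        }

        boolean hasField(String field) {
            return schemaFields.contains(field);
        }

        void put(String field, Object value) {
            requireField(field);
            values.put(field, value);
        }

        Object get(String field) {
            requireField(field);
            return values.get(field);
        }

        private void requireField(String field) {
            if (!schemaFields.contains(field)) {
                // Same message text as in the reported log, raised by a plain JDK exception.
                throw new IllegalStateException("Not a valid schema field: " + field);
            }
        }
    }

    /** Unguarded lookup: throws when the event-time field is absent from the schema. */
    static Object eventTimeUnguarded(SchemaRecord record, String eventTimeField) {
        return record.get(eventTimeField);
    }

    /** Guarded lookup: checks the schema first and yields empty when the field is absent. */
    static Optional<Object> eventTimeGuarded(SchemaRecord record, String eventTimeField) {
        if (!record.hasField(eventTimeField)) {
            return Optional.empty();
        }
        return Optional.ofNullable(record.get(eventTimeField));
    }

    public static void main(String[] args) {
        // A record whose schema has no "ts" field (illustrative field names).
        SchemaRecord review = new SchemaRecord(
                new HashSet<>(Arrays.asList("review_id", "star_rating")));
        review.put("review_id", "R2I675JE64OFU1");

        System.out.println("guarded lookup present: " + eventTimeGuarded(review, "ts").isPresent());
        try {
            eventTimeUnguarded(review, "ts");
        } catch (IllegalStateException e) {
            System.out.println("unguarded lookup failed: " + e.getMessage());
        }
    }
}
```

In this sketch the guarded lookup treats a missing event-time field as "no event time" rather than as an error, which is the behavior one would expect for payloads that never declared such a field.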
This regression was traced to
[PR#4115|https://github.com/apache/hudi/pull/4115], which changed the default record
payload class to `DefaultHoodieRecordPayload`.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)