rubenssoto commented on issue #2563:
URL: https://github.com/apache/hudi/issues/2563#issuecomment-783862335
`Caused by: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key 7176859 from old file s3://dl/courier_api/customer_address/3ee388f2-fa45-437a-a279-d9e3e3369bbd-0_9-137-2635_20210223033155.parquet to new file s3://ld/courier_api/customer_address/3ee388f2-fa45-437a-a279-d9e3e3369bbd-0_9-377-7189_20210223035129.parquet with writerSchema {
"type" : "record",
"name" : "customer_address_record",
"namespace" : "hoodie.customer_address",
"fields" : [ {
"name" : "_hoodie_commit_time",
"type" : [ "null", "string" ],
"doc" : "",
"default" : null
}, {
"name" : "_hoodie_commit_seqno",
"type" : [ "null", "string" ],
"doc" : "",
"default" : null
}, {
"name" : "_hoodie_record_key",
"type" : [ "null", "string" ],
"doc" : "",
"default" : null
}, {
"name" : "_hoodie_partition_path",
"type" : [ "null", "string" ],
"doc" : "",
"default" : null
}, {
"name" : "_hoodie_file_name",
"type" : [ "null", "string" ],
"doc" : "",
"default" : null
}, {
"name" : "Op",
"type" : [ "string", "null" ]
}, {
"name" : "LineCreatedTimestamp",
"type" : [ "string", "null" ]
}, {
"name" : "created_date",
"type" : [ {
"type" : "long",
"logicalType" : "timestamp-micros"
}, "null" ]
}, {
"name" : "updated_date",
"type" : [ {
"type" : "long",
"logicalType" : "timestamp-micros"
}, "null" ]
}, {
"name" : "id",
"type" : [ "int", "null" ]
}, {
"name" : "address_type",
"type" : [ "string", "null" ]
}, {
"name" : "name",
"type" : [ "string", "null" ]
}, {
"name" : "customer_email",
"type" : [ "string", "null" ]
}, {
"name" : "street",
"type" : [ "string", "null" ]
}, {
"name" : "number",
"type" : [ "string", "null" ]
}, {
"name" : "address_line2",
"type" : [ "string", "null" ]
}, {
"name" : "city",
"type" : [ "string", "null" ]
}, {
"name" : "province",
"type" : [ "string", "null" ]
}, {
"name" : "zipcode",
"type" : [ "string", "null" ]
}, {
"name" : "country",
"type" : [ "string", "null" ]
}, {
"name" : "neighborhood",
"type" : [ "string", "null" ]
}, {
"name" : "latitude",
"type" : [ "double", "null" ]
}, {
"name" : "longitude",
"type" : [ "double", "null" ]
}, {
"name" : "commit_version",
"type" : "long"
}, {
"name" : "_hoodie_is_deleted",
"type" : "boolean"
} ]
}
  at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:256)
  at org.apache.hudi.table.action.commit.AbstractMergeHelper$UpdateHandler.consumeOneRecord(AbstractMergeHelper.java:122)
  at org.apache.hudi.table.action.commit.AbstractMergeHelper$UpdateHandler.consumeOneRecord(AbstractMergeHelper.java:112)
  at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:37)
  at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:121)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  ... 3 more
Caused by: java.lang.RuntimeException: Null-value for required field: commit_version
  at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:194)
  at org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:165)
  at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
  at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
  at org.apache.hudi.io.storage.HoodieParquetWriter.writeAvro(HoodieParquetWriter.java:94)
  at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:251)
  ... 8 more
Driver stacktrace:
  at jobs.TableProcessor.start(TableProcessor.scala:104)
  at TableProcessorWrapper$.$anonfun$main$2(TableProcessorWrapper.scala:23)
  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
  at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
  at scala.util.Success.$anonfun$map$1(Try.scala:255)
  at scala.util.Success.map(Try.scala:213)
  at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
  at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
  at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
  at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
  at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
  at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
  at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
  at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
  at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
ApplicationMaster host: ip-10-0-53-212.us-west-2.compute.internal
ApplicationMaster RPC port: 41723
queue: default
start time: 1614052265461
final status: FAILED
tracking URL: http://ip-10-0-49-168.us-west-2.compute.internal:20888/proxy/application_1613496813774_2805/
user: hadoop`
@nsivabalan I hit this error too. My table originally did not have the column `commit_version` (it is a column I created). I added `commit_version` in my script, and the new data then tried to update the old records.
Is this problem addressed as well?
Thank you so much.
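For what it's worth, the writerSchema in the log declares `commit_version` as a bare `"long"` with no default, so when Hudi rewrites the old rows (which predate the column) parquet-avro has no value to put in a required field. A minimal sketch of that condition (the helper below is hypothetical, not Hudi code) shows why declaring the new column as a `["null", "long"]` union with a `null` default avoids the failure:

```python
import json

def unsafe_new_fields(schema_json, old_record):
    """Return the fields that would trigger 'Null-value for required field'
    when an old record (written before the column existed) is rewritten:
    fields with no value that are neither nullable nor defaulted."""
    schema = json.loads(schema_json)
    bad = []
    for field in schema["fields"]:
        t = field["type"]
        nullable = isinstance(t, list) and "null" in t
        has_default = "default" in field
        if old_record.get(field["name"]) is None and not (nullable or has_default):
            bad.append(field["name"])
    return bad

# The shape from the stack trace: commit_version is a required long.
failing = json.dumps({
    "type": "record", "name": "r",
    "fields": [{"name": "id", "type": ["int", "null"]},
               {"name": "commit_version", "type": "long"}],
})
# An evolution-friendly shape: nullable union with a null default.
fixed = json.dumps({
    "type": "record", "name": "r",
    "fields": [{"name": "id", "type": ["int", "null"]},
               {"name": "commit_version",
                "type": ["null", "long"], "default": None}],
})

old_row = {"id": 7176859}  # a row written before commit_version existed
print(unsafe_new_fields(failing, old_row))  # -> ['commit_version']
print(unsafe_new_fields(fixed, old_row))    # -> []
```

In Spark terms, populating the new column with a non-null value for every row before the upsert (or declaring it nullable with a default) should let the merge of old files succeed.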