[
https://issues.apache.org/jira/browse/HUDI-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Selvaraj Periyasamy updated HUDI-1057:
--------------------------------------
Comment: was deleted
(was: Some of my old records were inserted with null for those columns,
instead of [])
> optional int32 is not a group
> -----------------------------
>
> Key: HUDI-1057
> URL: https://issues.apache.org/jira/browse/HUDI-1057
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Selvaraj Periyasamy
> Priority: Major
>
> I am using Hudi 0.5.0 and writing to a COW table using Spark. Consecutive
> writes fail with the error below.
>
> Caused by: java.lang.ClassCastException: optional int32 trli_sequence_number_list is not a group
>   at org.apache.parquet.schema.Type.asGroupType(Type.java:202)
>   at org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:206)
>   at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:112)
>   at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:79)
>   at org.apache.parquet.avro.AvroRecordMaterializer.<init>(AvroRecordMaterializer.java:33)
>   at org.apache.parquet.avro.AvroReadSupport.prepareForRead(AvroReadSupport.java:132)
>   at org.apache.parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:175)
>   at org.apache.parquet.hadoop.ParquetReader.initReader(ParquetReader.java:149)
>   at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:125)
>   at org.apache.hudi.func.ParquetReaderIterator.hasNext(ParquetReaderIterator.java:47)
>   at org.apache.hudi.common.util.queue.IteratorBasedQueueProducer.produce(IteratorBasedQueueProducer.java:44)
>   at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$0(BoundedInMemoryExecutor.java:91)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   ... 4 more
>
>
> The corresponding schema is below.
>
> {
> "name" : "trli_sequence_number_list",
> "type" : [ {
> "type" : "array",
> "items" : [ "string", "null" ]
> }, "null" ]
> },
>
>
> I have multiple columns with an array data type.
>
> {
> "name" : "rli_invoice_number_list",
> "type" : [ {
> "type" : "array",
> "items" : [ "string", "null" ]
> }, "null" ]
> }, {
> "name" : "trli_sequence_number_list",
> "type" : [ {
> "type" : "array",
> "items" : [ "string", "null" ]
> }, "null" ]
> },
>
> Is there a way to avoid this error?
>
>
> Similarly, there is another error in the log from the same run.
>
> Caused by: java.lang.ClassCastException: optional binary app_application_list (UTF8) is not a group
>
>
>
>
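The "is not a group" message suggests that the existing Parquet file stores
the column as a primitive type (int32 or binary) while the Avro schema used
for reading declares it as an array, which Parquet represents as a group. As
a rough aid for narrowing this down, the sketch below lists the fields of an
Avro record schema that are declared as arrays, so they can be compared
against the schema of the older Parquet files. It is only an illustration:
the record name "example_record" and the field "some_int_column" are
hypothetical, since the issue shows only field fragments, and the helper
array_fields is not part of Hudi, Avro, or Parquet.

```python
import json


def array_fields(avro_record_schema):
    """Return names of fields typed as an array, or as a union containing one."""

    def is_array(avro_type):
        if isinstance(avro_type, dict):
            return avro_type.get("type") == "array"
        if isinstance(avro_type, list):  # an Avro union is a JSON list of branches
            return any(is_array(branch) for branch in avro_type)
        return False  # primitive type name such as "string" or "int"

    return [
        field["name"]
        for field in avro_record_schema.get("fields", [])
        if is_array(field["type"])
    ]


# The two array fields are copied from the issue; the enclosing record and
# "some_int_column" are hypothetical and exist only to make the schema complete.
schema = json.loads("""
{
  "type": "record",
  "name": "example_record",
  "fields": [
    {"name": "rli_invoice_number_list",
     "type": [{"type": "array", "items": ["string", "null"]}, "null"]},
    {"name": "trli_sequence_number_list",
     "type": [{"type": "array", "items": ["string", "null"]}, "null"]},
    {"name": "some_int_column", "type": ["int", "null"]}
  ]
}
""")

print(array_fields(schema))
```

Any field reported here that appears as a plain "optional int32" or
"optional binary" column in an older Parquet file would be a candidate for
the mismatch described in the stack trace.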
--
This message was sent by Atlassian Jira
(v8.3.4#803005)