Selvaraj Periyasamy created HUDI-1057:
-----------------------------------------

             Summary: optional int32 is not a group
                 Key: HUDI-1057
                 URL: https://issues.apache.org/jira/browse/HUDI-1057
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Selvaraj Periyasamy


I am using Hudi 0.5.0 and writing to a COW table using Spark. Consecutive writes 
fail with the error below.

 

Caused by: java.lang.ClassCastException: optional int32 trli_sequence_number_list is not a group
    at org.apache.parquet.schema.Type.asGroupType(Type.java:202)
    at org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:206)
    at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:112)
    at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:79)
    at org.apache.parquet.avro.AvroRecordMaterializer.<init>(AvroRecordMaterializer.java:33)
    at org.apache.parquet.avro.AvroReadSupport.prepareForRead(AvroReadSupport.java:132)
    at org.apache.parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:175)
    at org.apache.parquet.hadoop.ParquetReader.initReader(ParquetReader.java:149)
    at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:125)
    at org.apache.hudi.func.ParquetReaderIterator.hasNext(ParquetReaderIterator.java:47)
    at org.apache.hudi.common.util.queue.IteratorBasedQueueProducer.produce(IteratorBasedQueueProducer.java:44)
    at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$0(BoundedInMemoryExecutor.java:91)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    ... 4 more

 

 

The corresponding schema entry is shown below.

 

{
  "name" : "trli_sequence_number_list",
  "type" : [ {
    "type" : "array",
    "items" : [ "string", "null" ]
  }, "null" ]
},

 

 

I have multiple columns with an array data type.

 

{
  "name" : "rli_invoice_number_list",
  "type" : [ {
    "type" : "array",
    "items" : [ "string", "null" ]
  }, "null" ]
}, {
  "name" : "trli_sequence_number_list",
  "type" : [ {
    "type" : "array",
    "items" : [ "string", "null" ]
  }, "null" ]
},
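
The error message appears to mean that the Parquet file being read stores 
trli_sequence_number_list as a primitive (optional int32), while the Avro schema 
above declares it as an array, which parquet-avro reads through a Parquet group. 
The sketch below is one way to spot such columns before writing. It is only an 
illustration: the helper names (unwrap_nullable, needs_group, find_mismatches) and 
the observed_parquet_kinds mapping are hypothetical, and the only observed type 
filled in is the int32 reported in the stack trace.

```python
import json

# Avro types that parquet-avro reads through a Parquet group type.
COMPLEX_AVRO_TYPES = {"array", "map", "record"}


def unwrap_nullable(avro_type):
    """Return the non-null branches of an Avro type (a union is a JSON list)."""
    branches = avro_type if isinstance(avro_type, list) else [avro_type]
    return [b for b in branches if b != "null"]


def needs_group(avro_type):
    """True if any non-null branch is an array, map, or record."""
    return any(
        isinstance(b, dict) and b.get("type") in COMPLEX_AVRO_TYPES
        for b in unwrap_nullable(avro_type)
    )


def find_mismatches(avro_fields, parquet_kinds):
    """List fields whose Avro type needs a group but whose Parquet type is not one.

    parquet_kinds maps a column name to "group" or to a primitive type name.
    Columns missing from parquet_kinds are skipped, since nothing is known
    about them.
    """
    return [
        f["name"]
        for f in avro_fields
        if f["name"] in parquet_kinds
        and needs_group(f["type"])
        and parquet_kinds[f["name"]] != "group"
    ]


# The two fields quoted in this report.
fields = json.loads("""
[
  {"name": "rli_invoice_number_list",
   "type": [{"type": "array", "items": ["string", "null"]}, "null"]},
  {"name": "trli_sequence_number_list",
   "type": [{"type": "array", "items": ["string", "null"]}, "null"]}
]
""")

# Hypothetical mapping; only the type reported in the stack trace is filled in.
observed_parquet_kinds = {"trli_sequence_number_list": "int32"}

mismatched = find_mismatches(fields, observed_parquet_kinds)
print(mismatched)
```

In practice, observed_parquet_kinds would have to be filled from the footer of the 
existing Parquet file, which this sketch does not attempt.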

 

Is there a way to avoid this error?

 

 

Similarly, there is another error in the log from the same run.

 

Caused by: java.lang.ClassCastException: optional binary app_application_list (UTF8) is not a group

 

 

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
