GourdErwa opened a new issue #4834:
URL: https://github.com/apache/hudi/issues/4834


   Writes succeed, but as soon as a Parquet file has been written, querying the table fails with an error.
   
   Steps to reproduce the behavior:
   
   ```sql
   # FLINK 1.14.4
   # hudi 0.10.1
   
   # link https://hudi.apache.org/docs/flink-quick-start-guide#insert-data
   
   set execution.result-mode = tableau;
   
   CREATE TABLE IF NOT EXISTS hudi_example_map
   (
       device STRING NOT NULL,
       data_time BIGINT,
       dt        INTEGER,
       tags MAP <STRING,STRING>,
       fields MAP <STRING,STRING>
   ) PARTITIONED BY (`dt`,`device`)
   WITH (
     'connector' = 'hudi',
     'path' = 'hdfs:///user/hudi/iot/hudi_example_map',
     'hoodie.datasource.write.recordkey.field' = 'device,data_time',
     'table.type' = 'COPY_ON_WRITE'
   );
   
   insert into hudi_example_map values ('id1', 1, 20220101, Map['k1', 'v1', 'k2', 'v2'], Map['k1', 'v1', 'k2', 'v2']);
   insert into hudi_example_map values ('id1', 2, 20220101, Map['k1', 'v1', 'k2', 'v2'], Map['k1', 'v1', 'k2', 'v2']);
   insert into hudi_example_map values ('id1', 3, 20220101, Map['k1', 'v1', 'k2', 'v2'], Map['k1', 'v1', 'k2', 'v2']);
   
   Flink SQL> select * from hudi_example_map;
   ```
   
   Error log:
   
   ```
   ## debug msg=typeName:{tags}. typeName:{MAP}. t.isPrimitive()={false}. t.isRepetition(Type.Repetition.REPEATED)={false}
   WARN  org.apache.flink.runtime.taskmanager.Task  [] - Source: bounded_source(table=[hudi_example_map], fields=[device, data_time, dt, tags, fields]) -> NotNullEnforcer(fields=[device]) (4/4)#0 (a0e97ef25005459abe6b207cf7938f75) switched from RUNNING to FAILED
     cause: java.lang.UnsupportedOperationException: Complex types not supported.
        at org.apache.hudi.table.format.cow.ParquetColumnarRowSplitReader.checkSchema(ParquetColumnarRowSplitReader.java:244)
        at org.apache.hudi.table.format.cow.ParquetColumnarRowSplitReader.<init>(ParquetColumnarRowSplitReader.java:152)
        at org.apache.hudi.table.format.cow.ParquetSplitReaderUtil.genPartColumnarRowReader(ParquetSplitReaderUtil.java:125)
        at org.apache.hudi.table.format.cow.CopyOnWriteInputFormat.open(CopyOnWriteInputFormat.java:116)
        at org.apache.hudi.table.format.cow.CopyOnWriteInputFormat.open(CopyOnWriteInputFormat.java:65)
        at org.apache.flink.streaming.api.functions.source.InputFormatSourceFunction.run(InputFormatSourceFunction.java:84)
        at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:110)
        at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:67)
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:323)
   
   ```
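   The stack trace points at `ParquetColumnarRowSplitReader.checkSchema`, which in Hudi 0.10.1's Flink copy-on-write read path throws before a single row is read whenever the requested schema contains a non-primitive column. A minimal Java sketch of that behavior (a simplified model, not the actual Hudi source; `Field` and `Kind` are hypothetical stand-ins for the real Parquet `Type` objects):

   ```java
   import java.util.List;

   // Simplified model of the check that fails in the reported stack trace:
   // the reader walks the requested fields and throws as soon as it meets a
   // non-primitive (MAP/ARRAY/ROW) column. Field and Kind are hypothetical.
   public class SchemaCheckSketch {

       enum Kind { PRIMITIVE, MAP, ARRAY, ROW }

       record Field(String name, Kind kind) {}

       static void checkSchema(List<Field> fields) {
           for (Field f : fields) {
               if (f.kind() != Kind.PRIMITIVE) {
                   // Same message as in the reported exception
                   throw new UnsupportedOperationException("Complex types not supported.");
               }
           }
       }

       public static void main(String[] args) {
           // The issue's table: primitive columns pass, the MAP columns do not
           List<Field> schema = List.of(
                   new Field("device", Kind.PRIMITIVE),
                   new Field("data_time", Kind.PRIMITIVE),
                   new Field("dt", Kind.PRIMITIVE),
                   new Field("tags", Kind.MAP),
                   new Field("fields", Kind.MAP));
           try {
               checkSchema(schema);
           } catch (UnsupportedOperationException e) {
               System.out.println(e.getMessage());
           }
       }
   }
   ```

   This matches the observed behavior: the inserts succeed, but any `select` on the table aborts inside `CopyOnWriteInputFormat.open()` because the schema check runs before rows are read.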
   
   **Environment Description**
   * Flink version : 1.14.3
   
   * Hudi version : 0.10.1
   
   * Hadoop version : 3.0.0-cdh6.3.2
   
   * Storage (HDFS/S3/GCS..) : HDFS
   
   * Running on Docker? (yes/no) : no
   
