AirToSupply opened a new issue #2976:
URL: https://github.com/apache/hudi/issues/2976


   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Build from source on branch [master]; the version is 0.9.0-SNAPSHOT.
   2. Start a Flink 1.12.x streaming job that reads data from a Hudi table.
   3. Observe the running job: an exception causes the Flink job to fail.
   
   **Expected behavior**
   java.lang.IllegalArgumentException: Unexpected type: ...
   
   
   **Environment Description**
   
   * Hudi version : 0.9.0-SNAPSHOT
   
   * Spark version : None
   
   * Hive version : None
   
   * Hadoop version : 2.9.2
   
   * Storage (HDFS/S3/GCS..) : HDFS
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   The exception occurs when the specified partition column is not the last 
field in the sequence of fields written to the Hudi table.
   
   For example, suppose the fields (including partition columns) are written 
to the Hudi table in the order col1, col2, col3. If the partition column is 
col1, the exception is thrown. If the partition column is col3, the read works 
normally.
   
   
   **Stacktrace**
   
   The exception stack is as follows:
   
   
![BB0B7B65-BC82-40da-ABD9-6550956AAFDD](https://user-images.githubusercontent.com/62897740/119125433-588c0780-ba64-11eb-9bb6-1fad46a2a3b5.png)
   
   The local debugging is as follows:
   
   
![C10E0226-BBAD-4ef3-B3AE-161586449B35](https://user-images.githubusercontent.com/62897740/119125566-82452e80-ba64-11eb-81ab-3576fc4ff97b.png)
   
   Initial diagnosis: when the Hudi table is read through Flink, 
org.apache.hudi.table.format.cow.ParquetSplitReaderUtil#genPartColumnarRowReader
 is called. In the ParquetColumnarRowSplitReader object that this method 
returns, the selectedTypes and selectedFieldNames arrays are misaligned.
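
   To illustrate what "misaligned" means here, the following is a minimal, self-contained sketch and not the Hudi code. It assumes, purely for illustration, that the reader moves the partition column to the end of the selected field names while the types stay in the original schema order. The class `PartitionAlignmentSketch`, the helpers `partitionLastOrder` and `project`, and the column types are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: NOT the Hudi implementation, only an illustration
// of how two parallel arrays can drift out of alignment.
class PartitionAlignmentSketch {

    // Returns positions into the original schema, with the partition
    // column (if present) moved to the end.
    static int[] partitionLastOrder(String[] fieldNames, String partitionField) {
        List<Integer> order = new ArrayList<>();
        int partitionIndex = -1;
        for (int i = 0; i < fieldNames.length; i++) {
            if (fieldNames[i].equals(partitionField)) {
                partitionIndex = i;
            } else {
                order.add(i);
            }
        }
        if (partitionIndex >= 0) {
            order.add(partitionIndex);
        }
        return order.stream().mapToInt(Integer::intValue).toArray();
    }

    // Reorders values according to the given positions.
    static String[] project(String[] values, int[] order) {
        String[] out = new String[order.length];
        for (int i = 0; i < order.length; i++) {
            out[i] = values[order[i]];
        }
        return out;
    }

    public static void main(String[] args) {
        String[] names = {"col1", "col2", "col3"};
        String[] types = {"STRING", "INT", "DOUBLE"}; // assumed types

        int[] order = partitionLastOrder(names, "col1");
        String[] selectedNames = project(names, order);

        // Misaligned: names were reordered, types were not.
        String[] misalignedTypes = types;
        // Aligned: the same reordering is applied to both arrays.
        String[] alignedTypes = project(types, order);

        System.out.println("names:      " + Arrays.toString(selectedNames));
        System.out.println("misaligned: " + Arrays.toString(misalignedTypes));
        System.out.println("aligned:    " + Arrays.toString(alignedTypes));
    }
}
```

   Under this assumption, choosing col3 as the partition column leaves the order unchanged, so the two arrays agree by coincidence. That would be consistent with the behavior described above, but it has not been verified against the actual reader.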
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

