leobiscassi commented on issue #6142:
URL: https://github.com/apache/hudi/issues/6142#issuecomment-1198147081

   Hi Hudi community, I'm experiencing a similar issue. For some tables in my 
data lake, we get the following error when trying to query them:
   
   [16777224] Query failed (#20220727_185609_00434_4n5pr): The column 
my_column_name_here of table my_tablename_here is declared as type string, but 
the Parquet file 
(s3a://bucket/prefix/befb27ee-ee21-4791-95bb-d8aeb521aff9-0_15-22-5118_20220629223504.parquet)
 declares the column as type INT32 com.facebook.presto.spi.PrestoException: The 
column my_column_name_here of table my_tablename_here is declared as type 
string, but the Parquet file 
(s3a://bucket/prefix/befb27ee-ee21-4791-95bb-d8aeb521aff9-0_15-22-5118_20220629223504.parquet)
 declares the column as type INT32
   
   **My environment**
   hudi: amzn 0.10.1 / amzn 0.11.0 on EMR
   presto: 0.267 / 0.272 on EMR
   
   What I've tried so far to fix it:
   
   - Tested on more than one Hudi version (0.10.1 and 0.11.0)
   - Copied the `hudi-presto-bundle.jar` jar from EMR to the Presto installation
   - Followed 
[this](https://stackoverflow.com/questions/60183579/presto-fails-with-type-mismatch-errors)
 Stack Overflow thread and tried setting 
`hive.parquet.use-column-names=true` in the `hive.properties` file on EMR
   
   None of this worked. Does anyone know how to deal with it, or whether it is a 
bug in the integration?
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
