rubenssoto commented on issue #1981:
URL: https://github.com/apache/hudi/issues/1981#issuecomment-677647272


   Yeah, I could try.
   
   I ran some tests. The smaller table was partitioned by day, so I 
repartitioned it by year-month, which gives me larger files. My simple count 
improved a lot: it used to take 1 minute and 30 seconds and now takes 17 
seconds, but the same count on the bigger table takes only 7 seconds.
   
   I was able to try on EMR, but I got this error:
   
   Query 20200820_125020_00004_h9eb5 failed: Not valid Parquet file: 
s3://datalake/raw/courier_api/demand_coverage/created_year_month_brt=2020-06-01/b89ad14e-8cf2-446b-934a-b27107e88e20-0_26-8-4880_20200819200116.parquet
 expected magic number: [80, 65, 82, 49] got: [51, -66, -112, 88] 
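
   For reference, the expected bytes [80, 65, 82, 49] are the ASCII string 
"PAR1", the marker a Parquet file carries at its start and end. Below is a 
minimal sketch for checking a local copy of the file, assuming it has already 
been downloaded from S3. The function names are hypothetical helpers, not part 
of Hudi or Presto.

```python
import os

# Parquet files begin and end with these 4 bytes: [80, 65, 82, 49].
PARQUET_MAGIC = b"PAR1"


def read_magic(path):
    """Return the first and last 4 bytes of a local file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(max(size - 4, 0))
        tail = f.read(4)
    return head, tail


def to_signed(data):
    """Show bytes as signed values, the way the Java error message prints them."""
    return [b - 256 if b > 127 else b for b in data]


def looks_like_parquet(path):
    """True only if both the leading and trailing markers are PAR1."""
    head, tail = read_magic(path)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

   If `looks_like_parquet` returns False, `to_signed` on the values from 
`read_magic` can be compared with the "got" values in the error message.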

