pengzhiwei2018 commented on pull request #2283: URL: https://github.com/apache/hudi/pull/2283#issuecomment-768405930
> I had the same problem, but I saw fewer rows, not more. Reading with the Spark datasource I have more than 30 million rows, but using Spark SQL with Hive only 4 million.
>
> I had this problem only when these two options are enabled:
>
> "spark.sql.hive.convertMetastoreParquet": "false"
> "spark.hadoop.hoodie.metadata.enable": "true"
>
> @pengzhiwei2018

Hi @rubenssoto, currently Spark SQL treats a Hudi table as a plain Hive table and reads it as parquet directly. As a result it reads extra records left behind by updates & deletes to the table, i.e. more rows, not fewer. So I think your problem may be different from this one.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
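
A minimal sketch of how the two options quoted above might be passed on the command line (the exact session setup is an assumption; check the Hudi querying documentation for your version):

```shell
# Disable Spark's built-in parquet conversion for Hive metastore tables,
# so the table's registered Hudi input format (which resolves file slices
# and filters out superseded data) is used instead of scanning every
# parquet file directly:
spark-shell \
  --conf "spark.sql.hive.convertMetastoreParquet=false" \
  --conf "spark.hadoop.hoodie.metadata.enable=true"
```

With `spark.sql.hive.convertMetastoreParquet=true` (the default), Spark SQL bypasses the table's input format and reads the parquet files as-is, which can surface old file versions left by updates and deletes, and hence extra rows.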
