pengzhiwei2018 commented on pull request #2283: URL: https://github.com/apache/hudi/pull/2283#issuecomment-768405930
> I had the same problem, but I saw fewer rows, not more. Reading with the Spark datasource I have more than 30 million rows, but using Spark SQL with Hive only 4 million.
>
> I had this problem only when these two options are enabled:
>
> "spark.sql.hive.convertMetastoreParquet": "false"
> "spark.hadoop.hoodie.metadata.enable": "true"
>
> @pengzhiwei2018

Hi @rubenssoto, currently Spark SQL treats a Hudi table as a plain Hive table and reads it as parquet directly. As a result it reads extra records left behind by updates & deletes to the table, i.e. more rows, not fewer. So I think your problem may be different from this one.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
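
A minimal sketch of how the two options quoted above might be passed on the command line (the exact session setup is an assumption; check the Hudi querying documentation for your version):

```shell
# Disable Spark's built-in parquet conversion for Hive metastore tables,
# so the table's registered Hudi input format (which resolves file slices
# and filters out superseded data) is used instead of scanning every
# parquet file directly:
spark-shell \
  --conf "spark.sql.hive.convertMetastoreParquet=false" \
  --conf "spark.hadoop.hoodie.metadata.enable=true"
```

With `spark.sql.hive.convertMetastoreParquet=true` (the default), Spark SQL bypasses the table's input format and reads the parquet files as-is, which can surface old file versions left by updates and deletes, and hence extra rows.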
