stackfun commented on issue #1860: URL: https://github.com/apache/hudi/issues/1860#issuecomment-663176816

I used the setting you recommended and still get similar results. In this run, I was inserting 200 records in the writer job:

```
Hive Query: 600
Spark Query: 777
Hive Query: 800
Spark Query: 800
Hive Query: 800
Spark Query: 800
Hive Query: 800
Spark Query: 800
Hive Query: 800
Spark Query: 851
Hive Query: 1000
Spark Query: 1000
```

I'm refreshing the table before each query, so the table metadata in Spark should be cleared. Does this seem like a bug to you, or is there some other setting I should try?

I was stress testing Hudi's atomic write feature, as our team is determining whether we can use Hudi for an efficient data lake. Directly querying the Hive table using Spark SQL seems to work flawlessly, so we're not blocked.
