kasakrisz commented on PR #6089:
URL: https://github.com/apache/hive/pull/6089#issuecomment-3575499587

   @ramitg254 
   Thanks for reporting this bug and working on the fix. I have a few questions 
to get a better picture of the issue:
   
   > I ran a script which executes the queries of pattern cbo_* and query* 
under perf directory with
   
   1. Did you actually execute all TPC-DS queries or just compile them? The 
driver `TestTezTPCDS30TBPerfCliDriver` doesn't execute the queries since the 
data is not available. It uses a Postgres HMS backend db dump to simulate an 
environment where the TPC-DS schema exists and calls Hive's SQL compiler using 
the `explain` and `explain cbo` commands.
   
   2. Do the numbers you shared in the tables (Apache master version, with 
aggrStatsUseDB/without aggrStatsUseDB and with batching/without batching) show 
the overall compilation time of all queries?
   
   3. I haven't found any new tests or golden file changes in this patch. 
Could you please provide a minimal repro of the issue? Please don't copy-paste 
any q file of the TPC-DS queries. IIUC this issue should be reproducible with a 
table having a few partitions and a batch size smaller than the number of 
partitions. Unit tests are also welcome.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
