Re: Default JIT setting in V12

Soumyadeep Chakraborty Wed, 29 Jan 2020 12:31:24 -0800

Hello,

Based on this thread, Alexandra and I decided to investigate whether we
could borrow some passes from -O1 and add them to the default
optimization of -O0 with mem2reg. To determine which passes would make
the most sense, we ran ICW with jit_above_cost set to 0, dumped all the
backends and then analyzed them with 'opt'. Based on the stats dumped,
the instcombine and sroa passes had the most scope for optimization. We
have attached the stats we dumped.


Then, we investigated whether mixing in sroa and instcombine gave us a
better run time. We used TPCH Q1 (TPCH repo we used:
https://github.com/dimitri/tpch-citus) at scales of 1, 5 and 50. We
found that there was no significant difference in query runtime over the
default of -O0 with mem2reg.

We also performed the same experiment with -O1 as the default
optimization level, as Andres had suggested on this thread. We found
that the results were much more promising (see the results for scale
= 5 and 50 below). At the lower scale of 1, we had to force optimization
to meet the query cost. At this lower scale, the ramp up to -O1 had no
adverse impact from increased query optimization time.


Results summary (eyeball-averaged over 5 runs, excluding the first run
after a restart; for each configuration we flushed the OS cache and
restarted the database):

settings: max_parallel_workers_per_gather = 0

scale = 50:
-O3                                : 77s
-O0 + mem2reg                      : 107s
-O0 + mem2reg + instcombine        : 107s
-O0 + mem2reg + sroa               : 107s
-O0 + mem2reg + sroa + instcombine : 107s
-O1                                : 84s

scale = 5:
-O3                                : 8s
-O0 + mem2reg                      : 10s
-O0 + mem2reg + instcombine        : 10s
-O0 + mem2reg + sroa               : 10s
-O0 + mem2reg + sroa + instcombine : 10s
-O1                                : 8s


scale = 1:
-O3                                : 1.7s
-O0 + mem2reg                      : 1.7s
-O0 + mem2reg + instcombine        : 1.7s
-O0 + mem2reg + sroa               : 1.7s
-O0 + mem2reg + sroa + instcombine : 1.7s
-O1                                : 1.7s
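
To put the numbers above in relative terms, here is a small sketch that
computes the runtime reduction of -O1 and -O3 against the -O0 + mem2reg
default. It is only an illustration: the RUNTIMES layout and the
reduction_pct name are made up for this example, and the inputs are the
eyeballed averages from the summary, so the percentages are no more
precise than those.

```python
# Eyeballed average runtimes in seconds, copied from the results summary.
# The extra -O0 pass combinations are omitted because they matched the
# -O0 + mem2reg baseline at every scale.
RUNTIMES = {
    50: {"-O3": 77.0, "-O0 + mem2reg": 107.0, "-O1": 84.0},
    5: {"-O3": 8.0, "-O0 + mem2reg": 10.0, "-O1": 8.0},
    1: {"-O3": 1.7, "-O0 + mem2reg": 1.7, "-O1": 1.7},
}

BASELINE = "-O0 + mem2reg"


def reduction_pct(scale, config):
    """Percentage runtime reduction of `config` relative to the baseline."""
    base = RUNTIMES[scale][BASELINE]
    return 100.0 * (base - RUNTIMES[scale][config]) / base


if __name__ == "__main__":
    for scale in sorted(RUNTIMES, reverse=True):
        for config in ("-O1", "-O3"):
            pct = reduction_pct(scale, config)
            print(f"scale={scale:>2} {config}: {pct:.1f}% less runtime than {BASELINE}")
```

At scale 50 this works out to roughly a 21% reduction for -O1 and 28%
for -O3; at scale 5 both come to 20%, and at scale 1 there is no
difference.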

Based on the evidence above, maybe it is worth considering ramping up the
default optimization level to -O1.

Regards,

Soumyadeep and Alexandra

opt_dump.pdf
Description: Adobe PDF document
