Tim Armstrong has posted comments on this change.

Change subject: Reduce size of cross-compiled IR
......................................................................


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/2464/1//COMMIT_MSG
Commit Message:

Line 9: instnamer
> what did that do and why were we using it?
It makes the cross-compiled IR slightly more readable if you disassemble it: 
"This is a little utility pass that gives instructions names, this is mostly 
useful when diffing the effect of an optimization because deleting an unnamed 
instruction can change all other instruction numbering, making the diff very 
noisy."

It made the IR size 30% larger because of the longer strings so it didn't seem 
worthwhile. It maybe made more sense when the IR module was smaller.


Line 20: reduced from 5698 to 3883.
> did you do any query benchmarks?
I did a while back, there wasn't much of a change aside from slightly reduced 
codegen time. A couple of queries might have changed by a few % due to 
different JIT output (hard to tell if it's noise or not), but I'd have to dig 
in to understand if there is anything there.

Since we're changing the JIT compiler anyway it seems easier to investigate any 
codegen regressions all at once, since I don't think.


http://gerrit.cloudera.org:8080/#/c/2464/1/be/src/common/status.h
File be/src/common/status.h:

Line 101: 
> all the ones in this file we always want inlined whether it's IR or not, wh
The LLVM runtime optimisation (at -O2) will almost certainly inline these. I 
believe it treats the "inline" annotation as a hint and has a somewhat lower 
threshold for inlining.

The cross-compiled IR is generated at -O1, which only inlines things annotated 
with the alway_inline attribute. It's better to inline trivial 1-2 line 
functions at cross-compilation time to save the runtime optimiser some work.


http://gerrit.cloudera.org:8080/#/c/2464/1/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

Line 131:   int IR_ALWAYS_INLINE level() const { return level_; }
> seems strange that llvm wouldn't already inline this given that it should b
It does do the inlining in the runtime -O2 optimisation, but it was difficult 
to coax the optimiser into doing it to the cross-compiled IR without starting 
to inline larger funcitons.

The heuristic seems to be some combination of # of callsites and function size 
in LLVM instructions. I think these accessor functions bodies end up being two 
LLVM instructions (compute offset, then load) so it thinks inlining the 
function will increase code size. I tried to get the cross-compiled IR 
optimiser to inline these automatically, but increasing the threshold too much 
started to inline larger functions.


-- 
To view, visit http://gerrit.cloudera.org:8080/2464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iea04ad2e4b365564ee71082657262485d3a85446
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to