On March 25, 2017 4:56:11 PM PDT, Ants Aasma <ants.aa...@eesti.ee> wrote:
>On Sun, Mar 26, 2017 at 12:22 AM, Andres Freund <and...@anarazel.de> wrote:
>>> At least with current gcc (6.3.1 on Fedora 25) at -O2,
>>> what I see is multiple places jumping to the same indirect jump
>>> instruction :-(. It's not a total disaster: as best I can tell, all the
>>> uses of EEO_JUMP remain distinct. But gcc has chosen to implement about
>>> 40 of the 71 uses of EEO_NEXT by jumping to the same couple of
>>> instructions that increment the "op" register and then do an indirect
>>> jump :-(.
>>
>> Yea, I see some of that too - "usually" when there's more than just the
>> jump in common. I think there's some gcc variables that influence this
>> (min-crossjump-insns (5), max-goto-duplication-insns (8)). Might be
>> worthwhile experimenting with setting them locally via a pragma or such.
>> I think Aants wanted to experiment with that, too.
>
>I haven't had the time to research this properly, but initial tests
>show that with GCC 6.2 adding
>
>#pragma GCC optimize ("no-crossjumping")
>
>fixes merging of the op tail jumps.
>
>Some quick and dirty benchmarking suggests that the benefit for the
>interpreter is about 15% (5% speedup on a workload that spends 1/3 in
>ExecInterpExpr). My idea of prefetching op->resnull/resvalue to local
>vars before the indirect jump is somewhere between a tiny benefit and
>no effect, certainly not worth introducing extra complexity. Clang 3.8
>does the correct thing out of the box and is a couple of percent
>faster than GCC with the pragma.

That's large enough to be worth doing (although I recall you seeing all
jumps commonalized). We should probably do this on a per-function basis,
however (either using #pragma GCC push_options, or function attributes).

Andres
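
For illustration, here is a minimal sketch of what per-function scoping might look like. It is not the PostgreSQL code: the opcodes, the Op struct, run_program() and the NEXT() macro are all made up for the example, and only loosely mirror the EEO_NEXT-style tail dispatch discussed above. The pragmas are guarded so that only GCC sees them; Clang also defines __GNUC__ but does not implement "#pragma GCC optimize".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine; not PostgreSQL's step types. */
typedef enum { OP_PUSH, OP_ADD, OP_MUL, OP_DONE } OpCode;

typedef struct
{
    OpCode code;
    long   arg;     /* only used by OP_PUSH */
} Op;

/* Disable cross-jumping for the functions between push and pop only. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize ("no-crossjumping")
#endif

long
run_program(const Op *op)
{
    /* Computed-goto table (GNU C extension); order must match OpCode. */
    static const void *const dispatch[] = {
        &&do_push, &&do_add, &&do_mul, &&do_done
    };
    long stack[16];     /* no overflow checking in this sketch */
    int  sp = 0;

    /* Each use expands to its own "advance op, indirect jump" tail. */
#define NEXT() do { op++; goto *dispatch[op->code]; } while (0)

    assert(op != NULL);
    goto *dispatch[op->code];

do_push:
    stack[sp++] = op->arg;
    NEXT();
do_add:
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();
do_mul:
    sp--;
    stack[sp - 1] *= stack[sp];
    NEXT();
do_done:
    return stack[sp - 1];
#undef NEXT
}

/* Restore whatever options were in effect for the rest of the file. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif
```

The function-attribute variant would put __attribute__((optimize("no-crossjumping"))) on run_program() instead of the push/pop pair. Whether either form actually keeps the tail jumps separate would need to be checked in the generated assembly for the compiler version in question.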