I am working on a change that introduces a couple of indirect function calls in the fast path. I used indirect function calls instead of if/else chains because they keep the code cleaner and more readable, and they provide for extensibility.

I do not expect any measurable overhead, as modern CPUs use prefetching and multiple parallel execution engines, and indirect-jump prediction has gotten very good too. The assembly shows that the difference is one instruction to load the table address, followed by an indirect call (call *) instead of a direct callq. I could not find official Intel numbers[1], but third-party numbers suggest that the indirect call may have an overhead of about 1 clock cycle. The overhead of loading the table address is similar to reading a variable or testing a flag.

Will a patch that introduces indirect function calls be rejected simply because it uses them in the fast path, or is the use/impact evaluated on a case-by-case basis? I do see indirect function calls used in the TCP fast path and in Ethernet drivers, so they can't be that bad.



[1] I believe this is because the instruction is broken down into multiple uops, and due to prefetching and parallel execution the latency is hard to measure.
