On Fri, 8 Jul 2011, Richard Guenther wrote:
On Fri, Jul 8, 2011 at 5:20 AM, Dimitrios Apostolou <ji...@gmx.net> wrote:
Hello list,
The attached patch does two things for df_get_call_refs():
* First, it uses HARD_REG_SETs for defs_generated and
regs_invalidated_by_call, instead of bitmaps. In total this replaces more
than 400K calls (for my testcase) to bitmap_bit_p() with the much faster
TEST_HARD_REG_BIT, which reduces the total instruction count from about 13M
to 1.5M.
* Second, it produces the REFs in REGNO order, which is important for keeping
the collection_rec sorted most of the time and avoiding expensive calls to
qsort().
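
To illustrate both points, here is a minimal standalone sketch. It is not
the patch itself and it does not use GCC's real definitions: the
sketch_hard_reg_set type, the SKETCH_* constants and every sketch_* function
are hypothetical stand-ins for HARD_REG_SET, TEST_HARD_REG_BIT and the loop
in df_get_call_refs(). It assumes a target with 64 hard registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the target's hard register count.  */
#define SKETCH_NUM_HARD_REGS 64
#define SKETCH_WORD_BITS 64
#define SKETCH_NUM_WORDS \
  ((SKETCH_NUM_HARD_REGS + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)

/* A flat, fixed-size bit set: no list walking, no allocation.  */
typedef struct
{
  uint64_t words[SKETCH_NUM_WORDS];
} sketch_hard_reg_set;

static void
sketch_clear (sketch_hard_reg_set *set)
{
  for (size_t i = 0; i < SKETCH_NUM_WORDS; i++)
    set->words[i] = 0;
}

static void
sketch_set_bit (sketch_hard_reg_set *set, unsigned regno)
{
  set->words[regno / SKETCH_WORD_BITS]
    |= (uint64_t) 1 << (regno % SKETCH_WORD_BITS);
}

/* Membership test: one shift and one mask on a known word.  */
static int
sketch_test_bit (const sketch_hard_reg_set *set, unsigned regno)
{
  return (set->words[regno / SKETCH_WORD_BITS]
          >> (regno % SKETCH_WORD_BITS)) & 1;
}

/* Store in OUT every register that is in INVALIDATED but not in
   DEFS_GENERATED, and return how many were stored.  Walking regno
   upwards means OUT is already sorted, so no qsort() is needed.  */
static size_t
sketch_collect_call_clobbers (const sketch_hard_reg_set *invalidated,
                              const sketch_hard_reg_set *defs_generated,
                              unsigned *out)
{
  size_t n = 0;
  for (unsigned regno = 0; regno < SKETCH_NUM_HARD_REGS; regno++)
    if (sketch_test_bit (invalidated, regno)
        && !sketch_test_bit (defs_generated, regno))
      out[n++] = regno;
  return n;
}
```

The caller only has to provide an output array with room for
SKETCH_NUM_HARD_REGS entries; the order in which bits were set does not
affect the order of the result.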
Thanks to Paolo Bonzini for the idea and mentoring.
The second part makes a big difference when accompanied by another patch to
df_insn_refs_collect(). I'll post a follow-up patch, which is unfortunately
unstable in some of my tests, so I'd appreciate any comments.
Did you check the impact on memory usage? I suppose on targets
with not many hard registers it should even improve, but do we expect
memory usage to be worse in any case?
Hi Richard, I didn't check memory usage; is that important? Since
struct bitmap is fairly bulky, it should take an arch with lots of hard
regs for memory usage to get worse (which one has the most?).
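
For a rough feel of the flat-set side of that comparison, the following
sketch computes how many bytes a fixed array of words needs for a given
number of hard registers. The function name is hypothetical and the word
type is assumed to be unsigned long; it says nothing about the actual size
of struct bitmap, which would have to be measured separately.

```c
#include <limits.h>
#include <stddef.h>

/* Bytes needed by a flat set of NREGS bits stored in unsigned long
   words (hypothetical helper, not a GCC function).  */
static size_t
sketch_flat_set_bytes (size_t nregs)
{
  size_t word_bits = sizeof (unsigned long) * CHAR_BIT;
  size_t nwords = (nregs + word_bits - 1) / word_bits;  /* round up */
  return nwords * sizeof (unsigned long);
}
```

The footprint grows by one word each time the register count crosses a
multiple of the word size, so it stays small unless a target has a very
large number of hard registers.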
But still, wouldn't a tradeoff of a few bytes be acceptable for a much faster
type? And IMHO it makes the code easier to understand, since once you see
HARD_REG_SET you know you can't expect anything else. FWIW I'm now in the
process of converting all other bitmap uses for hard regs to HARD_REG_SETs,
at least within DF. I'm not sure whether performance gains will be visible,
however, since not much code is as hot as df_get_call_refs().
Thanks,
Dimitris