On Sep 30, 2014, at 2:22 AM, Bin Cheng <bin.ch...@arm.com> wrote:
> Then I decided to take one step forward to introduce a generic
> instruction fusion infrastructure in GCC, because in essence, load/store
> pair is nothing different with other instruction fusion, all these
> optimizations want is to push instructions together in instruction flow.

I like the step you took.  I had exactly this in mind when I wrote the original.

> N0 ~= 1300
> N1/N2 ~= 5000
> N3 ~= 7500

Nice.  It would be good to see timing numbers as well (CSiBE, SPEC, or some
other benchmark) to ensure that the code isn’t actually worse.  I didn’t have
any large-scale benchmark runs with my code, and I did worry about extending
lifetimes and increasing register pressure.

> I cleared up Mike's patch and fixed some implementation bugs in it

So, I’m wondering what the bugs or missed opportunities were.  Were they the
type of problem that generated incorrect code, or merely the type that missed
an optimization opportunity?
