Status: New
Owner: ----

New issue 2096 by [email protected]: Fixing Navier-Stokes performance on ARM.
http://code.google.com/p/v8/issues/detail?id=2096

ARM currently under-performs in the Navier-Stokes benchmark.
I had a look today and found something.
On my Tegra2 a one line hack (patch ns.diff attached) changes the score from ~1000 to ~1800.

Any idea how to cleanly fix this?
Let me know if you need more details/traces or have questions (available on chat).


More details:

1 ---- issue with representation inference

In navier-stokes.js, starting from line 204:

function advect(...)
{
  var Wp5 = width + 0.5;     // width = 128
  var Hp5 = height + 0.5;    // height = 128

  for() {    // OSR-ed loop
    ...

    if (x < 0.5)
      x = 0.5;
    else if (x > Wp5)
      x = Wp5;

    ...
  }
...

Wp5 and Hp5 have double values. But it seems something goes wrong in the representation inference mechanism, and they end up converted to int32 - the conversion (non-truncating) fails triggering deop.

The attached c1visualizer output attached shows in block B1 that a 'wrong' decision has been taken.

2 ---- Deoptimization

From my observation the actual deoptimization occurs when trying to set up the value when entering the OSR-ed loop. I guess the first attempt to conversion is done in when entering the OSR-ed loop.

In attached advect.opt_code: (--print_opt_code --code_comments)

Jumping from

316                   ;;; @140: tagged-to-i.
317 0x4b06508c   652  e1b010c1       movs r1, r1, asr #1
318 0x4b065090   656  2a000203       bcs +2068 -> 2724  (0x4b0658a4)

to the deferred-tagged-to-i, which fails with an inexact conversion (128.5 to int32)

1162 0x4b0658e4  2788  e312001f       tst r2, #31
1163                                  ;; Deoptimization check. id = 47
1164 0x4b0658e8  2792  1a000142       bne +1296 -> 4088  (0x4b065df8)


3 ---- Too optimistic representation change in TryChange() ?

I think the decision to prefer Integer32 to Double in
  HInferRepresentation::TryChange(HValue* value) (hydrogen.cc line 1904)
is too optimistic.
It seems to me
  IsConvertibleToInteger()
does not always guarantee that the value can be converted to an int32.

As the patch shows, preferring double over int32 fixes the performance gap.



Note that I am not sure how the 3 points exactly relate together. I didn't investigate further after finding a hack.
(It seems to me 1 and 2 are two different results caused by 3.)
It is unlikely that the transition (in the c1visualizer trace)
  0    1    i351  Change d23 d to i -0? range[-2147483648,2147483647,m0=0]
is generated by (hydrogen.cc line 1860)
  HInferRepresentation::InferBasedOnUses()
since we should have returned line 1862
  if(r.isSpecialization() || value->HasNoUses()) return;


Attachments:
        ns.diff  642 bytes
        c1-visu.png  159 KB
        advect.opt_code  102 KB

--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev

Reply via email to