[valgrind] [Bug 385843] [PATCH] ARM: mark caller-save VFP registes as trashed by calls

2017-10-17 Thread Ivo Raisr
https://bugs.kde.org/show_bug.cgi?id=385843

Ivo Raisr changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |CONFIRMED
           Assignee|jsew...@acm.org             |iv...@ivosh.net

--- Comment #11 from Ivo Raisr  ---
We do have a number of test programs under none/tests/arm which are supposed to
test VFP.

Could you please check whether your test program belongs in one of these, and
otherwise add it there as a new test?

Then post the results of running the regression test suite before and after
your changes are applied. I will then take your patch in.

-- 
You are receiving this mail because:
You are watching all bug changes.

[valgrind] [Bug 385843] [PATCH] ARM: mark caller-save VFP registes as trashed by calls

2017-10-17 Thread Sindre Aamås
https://bugs.kde.org/show_bug.cgi?id=385843

--- Comment #10 from Sindre Aamås  ---
I probably ran across this because our toolchain defaults to -mfpu=neon, which
may be uncommon. That would explain why this has not been reported previously.

[valgrind] [Bug 385843] [PATCH] ARM: mark caller-save VFP registes as trashed by calls

2017-10-17 Thread Ivo Raisr
https://bugs.kde.org/show_bug.cgi?id=385843

--- Comment #9 from Ivo Raisr  ---
(In reply to Sindre Aamås from comment #6)
> I have not looked at the stats, but the output is as follows.

As suspected, the generated code is bloated by the additional spilling before
helper calls.
Ratio before: 15.9
Ratio after:  16.0

[valgrind] [Bug 385843] [PATCH] ARM: mark caller-save VFP registes as trashed by calls

2017-10-17 Thread Sindre Aamås
https://bugs.kde.org/show_bug.cgi?id=385843

--- Comment #8 from Sindre Aamås  ---
Created attachment 108398
  --> https://bugs.kde.org/attachment.cgi?id=108398&action=edit
Stats after

[valgrind] [Bug 385843] [PATCH] ARM: mark caller-save VFP registes as trashed by calls

2017-10-17 Thread Sindre Aamås
https://bugs.kde.org/show_bug.cgi?id=385843

--- Comment #7 from Sindre Aamås  ---
Created attachment 108397
  --> https://bugs.kde.org/attachment.cgi?id=108397&action=edit
Stats before

[valgrind] [Bug 385843] [PATCH] ARM: mark caller-save VFP registes as trashed by calls

2017-10-17 Thread Sindre Aamås
https://bugs.kde.org/show_bug.cgi?id=385843

--- Comment #6 from Sindre Aamås  ---
Q registers should mostly be relevant for NEON code (often intrinsics or
assembly routines with limited control flow). I ran memcheck with and without
the change on a fairly NEON-intensive component (mostly DSP and neural
networks) and saw little performance difference, if any. (This was on a Cortex
A57 running 32-bit code, which may seem like a weird choice, but it was
convenient.) I have not looked at the stats, but the output is as follows.

[valgrind] [Bug 385843] [PATCH] ARM: mark caller-save VFP registes as trashed by calls

2017-10-16 Thread Ivo Raisr
https://bugs.kde.org/show_bug.cgi?id=385843

--- Comment #5 from Ivo Raisr  ---
Your fix touches register allocation. It is crucial that both ARMInstr_Call()
and getRRegUniverse_ARM() are kept in sync (as also hinted in
getRRegUniverse_ARM()) and that the register allocator is presented with a
workable set of registers.

By marking all Q registers as caller-saved (trashed by the call), the register
allocator would need to spill them all before the call. This creates a
performance penalty and bloats the generated r-code.
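
The cost described above can be illustrated with a toy model. This is only a
sketch and not the VEX register allocator: the function name count_q_spills and
the bitmask representation (bit N stands for register qN) are assumptions made
for illustration. A register has to be saved around a call only if it is both
live across the call and marked as trashed by it.

```c
#include <stdint.h>

/* Toy model (hypothetical, not VEX code): bit N of each mask stands for qN.
 * live_q    - Q registers holding values that are still needed after the call
 * trashed_q - Q registers the call is declared to clobber
 * Returns how many registers must be spilled before the call. */
unsigned count_q_spills(uint16_t live_q, uint16_t trashed_q)
{
    uint16_t must_spill = live_q & trashed_q;
    unsigned n = 0;

    /* Count set bits by clearing the lowest set bit each iteration. */
    while (must_spill != 0) {
        must_spill &= (uint16_t)(must_spill - 1);
        n++;
    }
    return n;
}
```

With four live registers q8-q11 (mask 0x0F00), declaring nothing trashed gives
0 spills, while declaring q0-q3 and q8-q15 trashed (mask 0xFF0F) gives 4, one
per live register, at every call site.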

Would you try running Memcheck on a program which uses 128-bit VFP (Q)
registers with '--stats=yes' and note the difference in the reported 'ratio'?

How prevalent are programs utilizing 128-bit VFP registers compared to 64-bit
ones? In other words, are compilers (gcc) likely to utilize those registers a
lot?


A clue for your answer might lie in the document you cited first:
"Registers s16-s31 (d8-d15, q4-q7) must be preserved across subroutine calls;
registers s0-s15 (d0-d7, q0-q3) do not need to be preserved (and can be used
for passing arguments or returning results in standard procedure-call
variants). Registers d16-d31 (q8-q15), if present, do not need to be
preserved."
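
The quoted rule can be written down as a small classification. This is a
sketch only: the helper names are hypothetical and do not exist in VEX. It
relies on the architectural overlay of the register banks (sN is half of
d(N/2), and qN is the pair d(2N), d(2N+1)).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers, not VEX code: classify VFP/NEON registers according
 * to the quoted procedure-call rule. */

/* d8-d15 must be preserved by the callee; d0-d7 and d16-d31 need not be. */
bool d_reg_is_callee_saved(unsigned d)
{
    assert(d < 32);
    return d >= 8 && d <= 15;
}

/* sN is one half of d(N/2), so s16-s31 are callee-saved. */
bool s_reg_is_callee_saved(unsigned s)
{
    assert(s < 32);
    return d_reg_is_callee_saved(s / 2);
}

/* qN is the pair d(2N), d(2N+1), so q4-q7 are callee-saved. */
bool q_reg_is_callee_saved(unsigned q)
{
    assert(q < 16);
    return d_reg_is_callee_saved(2 * q) && d_reg_is_callee_saved(2 * q + 1);
}
```

Under this rule only q4-q7 survive a call; q0-q3 and q8-q15 have to be treated
as trashed.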

[valgrind] [Bug 385843] [PATCH] ARM: mark caller-save VFP registes as trashed by calls

2017-10-16 Thread Sindre Aamås
https://bugs.kde.org/show_bug.cgi?id=385843

Sindre Aamås changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #108392|0                           |1
        is obsolete|                            |

--- Comment #4 from Sindre Aamås  ---
Created attachment 108396
  --> https://bugs.kde.org/attachment.cgi?id=108396&action=edit
[PATCH] VEX/ARM: mark caller-save VFP registes as trashed by calls

[valgrind] [Bug 385843] [PATCH] ARM: mark caller-save VFP registes as trashed by calls

2017-10-16 Thread Sindre Aamås
https://bugs.kde.org/show_bug.cgi?id=385843

--- Comment #3 from Sindre Aamås  ---
(In reply to Ivo Raisr from comment #2)
> The comment in getRRegUniverse_ARM() should be more explicit about which
> registers are caller save and callee save.
> 
> And if this is indeed the case, then the patch needs to take care of
> supplying a different set of Q registers for register allocator use.
> Callee-saved registers are preferred over caller-saved ones.

The callee-saved registers are already used up by the S and D registers (which
alias the Q registers). Can we get this fixed first and then look at improving
the register selection later? (Or do you have a suggestion for a better set of
registers?)
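
The aliasing mentioned above can be made concrete with a short sketch. The
function names are hypothetical and not taken from VEX. Each qN occupies the
same storage as d(2N) and d(2N+1), and for q0-q7 also as s(4N) to s(4N+3), so
a callee-saved register handed out as an S or D register is no longer
available as a Q register.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers, not VEX code: test whether two VFP/NEON register
 * names refer to overlapping storage. */

/* qN overlays d(2N) and d(2N+1). */
bool q_aliases_d(unsigned q, unsigned d)
{
    assert(q < 16 && d < 32);
    return d / 2 == q;
}

/* sN (s0-s31) lies inside q(N/4); q8-q15 have no S aliases. */
bool q_aliases_s(unsigned q, unsigned s)
{
    assert(q < 16 && s < 32);
    return s / 4 == q;
}
```

For example, q4 overlaps d8, d9 and s16-s19, all of which are callee-saved.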

[valgrind] [Bug 385843] [PATCH] ARM: mark caller-save VFP registes as trashed by calls

2017-10-16 Thread Ivo Raisr
https://bugs.kde.org/show_bug.cgi?id=385843

Ivo Raisr changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iv...@ivosh.net

--- Comment #2 from Ivo Raisr  ---
The comment in getRRegUniverse_ARM() should be more explicit about which
registers are caller save and callee save.

And if this is indeed the case, then the patch needs to take care of supplying
a different set of Q registers for register allocator use. Callee-saved
registers are preferred over caller-saved ones.

[valgrind] [Bug 385843] [PATCH] ARM: mark caller-save VFP registes as trashed by calls

2017-10-16 Thread Sindre Aamås
https://bugs.kde.org/show_bug.cgi?id=385843

--- Comment #1 from Sindre Aamås  ---
Created attachment 108392
  --> https://bugs.kde.org/attachment.cgi?id=108392&action=edit
[PATCH] VEX/ARM: mark caller-save VFP registes as trashed by calls
