https://bugs.kde.org/show_bug.cgi?id=481127
Mark Wielaard changed:
What|Removed |Added
Status|CONFIRMED |RESOLVED
Resolution|---
https://bugs.kde.org/show_bug.cgi?id=481127
--- Comment #11 from Paul Floyd ---
For the attribution I would just put
Patch contributed by Gražvydas Ignotas and Bruno
Lathuilière
(with the right accents if possible)
--
You are receiving this mail because:
You are watching all bug changes.
https://bugs.kde.org/show_bug.cgi?id=481127
--- Comment #10 from Mark Wielaard ---
Created attachment 168440
--> https://bugs.kde.org/attachment.cgi?id=168440&action=edit
valgrind.fma_arm64.diff
This is the variant of the patch that I tested.
It looks good to me.
How should we credit this when
https://bugs.kde.org/show_bug.cgi?id=481127
Mark Wielaard changed:
What|Removed |Added
Status|REPORTED|CONFIRMED
Ever confirmed|0
https://bugs.kde.org/show_bug.cgi?id=481127
--- Comment #8 from Bruno Lathuilière ---
The merge of the last two versions is done:
https://raw.githubusercontent.com/edf-hpc/verrou/bl/test_merge_fma/valgrind.fma_amd64.diff
https://bugs.kde.org/show_bug.cgi?id=481127
--- Comment #7 from Bruno Lathuilière ---
Sorry, I didn't finish my message (bad keyboard shortcut...)
I think we will need new IR ops to take vectorized FMA into account:
Iop_MAdd_F64x2, Iop_MAdd_F64x4, Iop_MAdd_F32x4, Iop_MAdd_F32x8,
https://bugs.kde.org/show_bug.cgi?id=481127
--- Comment #6 from Bruno Lathuilière ---
(In reply to Paul Floyd from comment #5)
> Should we also support F16?
No, there is no Iop_MAddF16 or Iop_MSubF16.
And to my knowledge AVX512 is the only way to generate half-precision
floating-point operations. And
https://bugs.kde.org/show_bug.cgi?id=481127
--- Comment #5 from Paul Floyd ---
Should we also support F16?
Does this also work with the other permutations 132 and 231?
Lastly, do packed and scalar make any difference?
This will need a regression test as well.
https://bugs.kde.org/show_bug.cgi?id=481127
--- Comment #4 from Bruno Lathuilière ---
I was able to test my patch
https://github.com/edf-hpc/verrou/blob/master/valgrind.fma_amd64.diff with
fma4 hardware: it works.
I also improved the patch from "notasas" to take double precision into account (cf
https://bugs.kde.org/show_bug.cgi?id=481127
--- Comment #3 from Bruno Lathuilière ---
Created attachment 167413
--> https://bugs.kde.org/attachment.cgi?id=167413&action=edit
Improve the previous patch to take into account double.
https://bugs.kde.org/show_bug.cgi?id=481127
Bruno Lathuilière changed:
What|Removed |Added
CC||bruno.lathuili...@edf.fr
--- Comment #2
https://bugs.kde.org/show_bug.cgi?id=481127
Paul Floyd changed:
What|Removed |Added
CC||pjfl...@wanadoo.fr
--- Comment #1 from Paul Floyd