--- Comment #22 from aph at gcc dot gnu dot org 2009-04-23 11:08 ---
Re named register variables:
You can, instead of using
[coeff_ptr_l1] "+r" (coeff_ptr_l1)
declare something like
register long double *coeff_ptr_l1 asm ("%%r8");
and then use %%r8 in your asm. This means that you
--- Comment #1 from d at teklibre dot com 2009-04-22 13:49 ---
Created an attachment (id=17672)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17672&action=view)
Test program demonstrating 16 register breakage on inline asm in x86_64
--
--- Comment #2 from d at teklibre dot com 2009-04-22 14:04 ---
tested against: gcc (Ubuntu 4.3.2-1ubuntu12) 4.3.2 - fails
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39847
--- Comment #5 from pinskia at gcc dot gnu dot org 2009-04-22 15:00 ---
Actually no, it would be better if you moved over to using the intrinsics, which
can be optimized and scheduled.
--
pinskia at gcc dot gnu dot org changed:
What|Removed |Added
--- Comment #4 from d at teklibre dot com 2009-04-22 14:58 ---
+ counts as two operands? ok, that makes sense. So, basically
2*num_of_physical_regs would be a saner default ...
--
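A minimal sketch of the operand counting discussed in comment #4, assuming GCC on x86_64 (other targets take the plain C fallback). The helper name swap_regs is hypothetical and does not come from the attached test program. Each "+r" operand is both read and written, so, as the comment says, it counts as two operands against the limit of 30; the two operands here would use four slots.

```c
#include <stdio.h>

/* Hypothetical helper: swaps two longs, using inline asm where available. */
static void swap_regs(long *a, long *b)
{
#if defined(__GNUC__) && defined(__x86_64__)
    long x = *a;
    long y = *b;
    /* Two "+r" (read-write) operands. Per comment #4, each one counts
       as two operands, so this statement uses four of the 30 slots. */
    __asm__ ("xchg %[x], %[y]"
             : [x] "+r" (x), [y] "+r" (y));
    *a = x;
    *b = y;
#else
    /* Fallback for compilers or targets without x86_64 GNU inline asm. */
    long t = *a;
    *a = *b;
    *b = t;
#endif
}
```

By the same arithmetic, 16 symbolic "+r" operands would count as 32, which is over the limit reported in the error message.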
--- Comment #6 from d at teklibre dot com 2009-04-22 15:40 ---
Pinska: Actually, no. I started with the intrinsics and looked hard at what the
code scheduler was doing before settling on rewriting this in inline assembly.
The intrinsics have several problems that affect the code
--- Comment #7 from pinskia at gcc dot gnu dot org 2009-04-22 15:45 ---
(In reply to comment #6)
> Pinska: Actually, no. I started with the intrinsics and looked hard at what the
> code scheduler was doing before settling on rewriting this in inline assembly.
> The intrinsics have
--- Comment #8 from aph at gcc dot gnu dot org 2009-04-22 16:53 ---
I don't see why this is changed to WONTFIX. Fixing inline asm to allow the use
of all a machine's registers is trivial, and should not be refused for the sake
of a pedantic argument about whether someone should be
--- Comment #10 from aph at redhat dot com 2009-04-22 17:33 ---
Subject: Re: 16 symbolic register names generates error: more than 30 operands in 'asm'
> 0) I will build gcc-4.4 and try that. I will also make the 1 line patch to it
> to try increasing the number of asm params, and try
--- Comment #11 from aph at gcc dot gnu dot org 2009-04-22 17:47 ---
I suspect the reason the limit is 30 is that when that code was written the
largest register set was 32 registers, 2 of which were reserved to the
implementation. Inline asm hasn't kept up with the hardware.
--
--- Comment #12 from jakub at gcc dot gnu dot org 2009-04-22 17:51 ---
That's not going to fly very well e.g. on ia64:
config/ia64/ia64.h:#define FIRST_PSEUDO_REGISTER 334
then we have automatic arrays like:
char operands_match[MAX_RECOG_OPERANDS][MAX_RECOG_OPERANDS];
etc., many of them
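To put rough numbers on the concern in comment #12, the sketch below computes the size of one char table dimensioned [MAX_RECOG_OPERANDS][MAX_RECOG_OPERANDS]. The function name is hypothetical. The values 30 and 334 come from this thread; 60 and 668 are simply those values doubled, as floated in comment #4. How many such arrays GCC actually allocates is not stated here.

```c
#include <stddef.h>

/* Hypothetical helper: bytes needed by one automatic array of the form
   char table[n][n], the shape of operands_match quoted in comment #12. */
static size_t operands_match_bytes(size_t max_recog_operands)
{
    /* The table is square, so its size grows quadratically with the limit. */
    return max_recog_operands * max_recog_operands * sizeof(char);
}
```

Under these assumptions one table needs 900 bytes at 30 operands, 3600 bytes at 60, and 446224 bytes at 668 (twice ia64's FIRST_PSEUDO_REGISTER).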
--- Comment #9 from d at teklibre dot com 2009-04-22 17:24 ---
Pinskia:
It is going to take me a long time to address these issues piecemeal, so...
0) I will build gcc-4.4 and try that. I will also make the 1 line patch to it
to try increasing the number of asm params, and try that. I
--- Comment #13 from d at teklibre dot com 2009-04-22 17:55 ---
@Andrew
> I suspect the reason the limit is 30 is that when that code was written the
> largest register set was 32 registers, 2 of which were reserved to the
> implementation. Inline asm hasn't kept up with the hardware.
That
--- Comment #14 from d at teklibre dot com 2009-04-22 18:00 ---
@Jakub:
I'm going to build this thing today. (once I figure out the best way, and I
figure it will take a while, even so) Are there any specific tests I should run
to check for performance issues? I expect any stack
--- Comment #15 from pinskia at gcc dot gnu dot org 2009-04-22 18:03 ---
(In reply to comment #13)
That old huh? Given that I/O operands take two virtual regs... methinks that
the history of this is more of an x86ism...
and symbolic register parameters date back to gcc 3.1
--- Comment #16 from d at teklibre dot com 2009-04-22 18:30 ---
@Jakub/Andrew:
max_recog_operands = MIN(FIRST_PSEUDO_REGISTER*2, SOME_SANE_VALUE_DERIVED_FROM_SMASHING_THE_STACK_ON_IA64); // ?
I certainly am not in a position to make a one line change to gcc and test it
on ia64 or
--- Comment #17 from aph at gcc dot gnu dot org 2009-04-22 18:40 ---
I agree with Jakub's point.
David, can you try instead of register operands using named register variables
instead? I think that may work, unless there is some other limit of which I'm
unaware.
--
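A minimal sketch of a named register variable, assuming GCC on x86_64 (other targets take the plain C fallback). The function name add_via_r8 and the choice of r8 are illustrative only and are not taken from the attached test program. The sketch keeps the variable in the operand list, which is the use GCC documents for local register variables; dropping it from the list and naming the register directly in the template is not shown here.

```c
/* Hypothetical helper: adds b to a, with the accumulator pinned to r8. */
static long add_via_r8(long a, long b)
{
#if defined(__GNUC__) && defined(__x86_64__)
    /* Named register variable: ask GCC to keep acc in r8 when it is
       used as an asm operand. */
    register long acc __asm__("r8") = a;
    /* acc is still passed as an operand, so GCC knows the asm reads and
       writes it; %[acc] expands to %r8 in the emitted instruction. */
    __asm__ ("addq %[b], %[acc]"
             : [acc] "+r" (acc)
             : [b] "r" (b)
             : "cc");
    return acc;
#else
    /* Fallback for compilers or targets without x86_64 GNU inline asm. */
    return a + b;
#endif
}
```

Calling add_via_r8(2, 3) returns 5 on either path.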
--- Comment #18 from hjl dot tools at gmail dot com 2009-04-22 19:03 ---
(In reply to comment #6)
> Pinska: Actually, no. I started with the intrinsics and looked hard at what the
> code scheduler was doing before settling on rewriting this in inline assembly.
> The intrinsics have
--- Comment #19 from d at teklibre dot com 2009-04-22 19:48 ---
@Andrew: I agree with Jakub's point too, but don't believe merely doubling the
number of operands will hurt much. Am trying it against 4.3.2... it's building
as I write.
When I figure out how to safely build 4.4 I will
--- Comment #20 from d at teklibre dot com 2009-04-23 03:29 ---
I got gcc 4.4 built with the 1 line patch.
It assembles my 24 operand function just fine (which had several errors in the
asm that I couldn't detect without assembling it - pushed out to my repo now -
it even gets through
--- Comment #21 from d at teklibre dot com 2009-04-23 03:44 ---
(In reply to comment #20)
It successfully compiles, links, and runs my project (ScreamingRabbitCode) at
-O3 -mtune=core2 w/o the asm code in 17.397s. gcc 4.3.2 takes 17.865s. (both
are best case times over several runs