Hi Ben,
On Wed, 16 Jul 2014, Ben Avison wrote:
On Sun, 13 Jul 2014 10:10:00 +0100, Martin Storsjö <[email protected]> wrote:
On Fri, 11 Jul 2014, Ben Avison wrote:
I tried this code on most of our odd configurations, and ran into a
bunch of issues, most of which I've been able to resolve in one way or
another, but some parts of it is a bit ugly...
Thanks Martin. It might seem like it sometimes, but I don't deliberately
set out to break gas-preprocessor with every new patch series :)
Well, given that gas-preprocessor originally only supported the (at that
time) small subset of gas features used by the existing arm and ppc
assembly, it's quite understandable that new code finds issues - this only
makes the tool better :-)
It's a shame we have to worry about getting this code to assemble
correctly in Thumb mode, as it seems rather unlikely that it will ever
actually be used that way - most CPUs either don't support enough Thumb
instructions for it to assemble, or have NEON that will be used in
preference, or don't support short vector VFP mode, which the code
requires.
Nevertheless, I have made the changes you suggest - probably the same as
the ones you made locally. I'll repost this patch shortly.
Thanks! Indeed, it is a bit of extra burden to make sure it works in those
mostly irrelevant configurations, but it eases the maintainance of the
code if it is as portable as possible.
Btw, I saw that you renamed the local symbols to all use the .L prefix -
sorry if I was unclear about this. This would only have been necessary if
we kept the thumb offset table (and even then, we would have had to jump
through some extra hoops, to make sure there were no non-local labels
anywhere between the offset table and the functions it pointed to). Since
this new patch only uses a normal function pointer table, this shouldn't
be necessary any longer, so in case there's a need for yet another round
of the patch you can do this whichever way you prefer. Keeping the symbol
names visible would perhaps be useful for debugging?
If you prefer to do it that way I can also amend that before pushing (no
need to resend just for that), in case there's nothing else that needs to
be done.
+A .word ff_fft16_vfp @ this one alone is exported
This reference needs to be wrapped in X()
I've only ever seen X() resolve to a no-op, so I don't get this sort of
thing flagged during my own tests. When does it make any difference? I
was never clear on whether it only applied to labels from *other*
compilation units rather than the current one (made less clear by the
fact that it never seems to be used when a label is defined), or whether
it applied just to code labels, or both code and data. The name "X"
doesn't really give much of a hint!
I think the name X() was chosen just as a macro as short as possible. It's
used for mangling of external names - on arm it only matters on
darwin/iOS, where external symbols are prefixed with an underscore. This
underscore is automatically added when you declare functions with the
function macro, but has to be added whenever you reference such symbols
elsewhere. Given the naming scheme, you would use it when referencing any
ff_ prefixed labels.
.align 3
Is there any particular reason why this is aligned to 8 bytes instead
of 4 - shouldn't 4 (aka .align 2) be enough for float constants? (Yes,
I know this isn't added by this patch though.)
It's because the ARM11's memory bus is actually 64 bits wide. cos1pi4 and
cos1pi8 are loaded in the same vldr instruction; if cos1pi4 is 64-bit
aligned then this takes 1 cycle, otherwise 2 cycles. This multiplies up,
so for something like FFT512, that's quite a few extra cycles.
Ok, thanks for the clarification. This issue should have gone away now
that you use a table with function pointers instead of manually
calculating offsets.
+ ldr v5, =\costable
Should this perhaps use the movrelx macro?
Um, maybe. I must admit to not really understanding what the movrel and
movrelx macros are for and when to use which. Could I respectfully ask
for some comments in asm.S to explain this to those of us who aren't
familiar with the full range of supported ABIs?
Sure. I'm actually not all that familiar with it myself, hopefully Janne
can fill in the details.
I noticed this since gas-preprocessor for MS armasm didn't handle ldr with
imported symbols yet - because all the similar cases had been wrapped in
the movrel/movrelx macros, that end up as movw/movt in the armasm case.
Nevertheless, wouldn't this be more readable written just as a normal
gas macro instead of as a cpp macro? That's how the corresponding macro
is done in fft_neon.S.
I couldn't figure out how to do string concatenations in gas - it's not
my favourite assembly language and I often find string manipulation is
one of its weaknesses. Nevertheless, the trick in fft_neon.S seems to
work, so thanks for the pointer.
Ok - thanks for the resend, I'll retest it in all these configurations
now.
// Martin
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel