Zachary Turner wrote:
> I guess the same reason people would want any asm functions in C
> source code. Sometimes it's just the best way to express something.
> Like in the example I mentioned, I could write 4 different functions
> in assembly, one for each size suffix, wrap them all up in a separate
> assembly language file but IMHO it's more readable, quicker to code,
> and more expressive to use a template switch like I've done. C++ is
> built on the philosophy of giving you enough rope to hang yourself
> with.
>
> I don't think there's a better way to express the selection of an
> instruction based on operand size than through a naked template
> specialization.
>
> Using a .s file is more difficult to port across different compilers.
> Many compilers provide support for naked functions and it's easy to
> just use a #ifdef to check which compiler you're running on and define
> the appropriate naked declaration string.
>
> Besides, it's supported for embedded architectures, it's frustrating
> because it feels like back in the days of a 386SX's where the
> processors had working FPUs on them but they were switched off "just
> because". All the investment has already been done to add support for
> naked functions, so I think people should be "permitted" to use it,
> even if other people feel like they should be using something else.
I still don't get it. A GCC inline asm version of this is
-------------------------------------------------------------------------
#include <stdint.h>

template<typename T> intptr_t scas(T *a, T val, int len);

// Byte-sized specialization; result receives the final value of %rdi.
template<> intptr_t scas<uint8_t>(uint8_t *a, uint8_t val, int len)
{
  intptr_t result;
  __asm__ ("rep scasb" : "=D"(result): "a"(val), "D"(a), "c"(len));
  return result;
}

template<typename T>
int find_first_nonzero_scas(T* x, int cnt)
{
  intptr_t result = 0;
  result = scas<T>(x, 0, cnt);
  result -= reinterpret_cast<intptr_t>(x);  // bytes scanned
  result /= sizeof(T);                      // elements scanned
  return --result;
}
-------------------------------------------------------------------------
which, when instantiated, generates
int find_first_nonzero_scas<unsigned char>(unsigned char*, int):
        movq    %rdi, %rdx
        xorl    %eax, %eax
        movl    %esi, %ecx
        notq    %rdx
        rep scasb
        leaq    (%rdx,%rdi), %rax
        ret
How is this not better in every way?
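For anyone checking the arithmetic, here is a rough portable sketch of what that instruction sequence computes. The name find_first_nonzero_portable is made up for this illustration and appears nowhere in the code above. It assumes the usual repe scas behaviour: the pointer advances past every element examined, including the first one that does not match.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical portable equivalent, for illustration only.
// Walks the array the way "repe scas" does: each step consumes one
// element and the scan stops right after the first nonzero one.
template<typename T>
int find_first_nonzero_portable(const T* x, int cnt)
{
  const T* p = x;
  int remaining = cnt;
  while (remaining > 0) {
    --remaining;
    if (*p++ != 0)
      break;              // p already points one past the mismatch
  }
  // Elements examined, minus one: same as the "--result" step above.
  return static_cast<int>(p - x) - 1;
}
```

With this sketch an all-zero array of cnt elements yields cnt - 1, and cnt == 0 yields -1, so a caller that needs to tell "not found" apart from "last element" has to check the element itself.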
I can understand that you want something compatible with your source. But
you said "I don't think anyone has ever presented a good example of where
[naked asms are] really really useful on x86 architectures."
Baffled,
Andrew.