Would it be possible to set a flag HAVE_MPN_SUMDIFF_N in mpir.h if
this function is available and similarly for addsub? That way code
like mine could use these functions when they are available by
conditionally including alternatives in the cases where these defines
don't exist.
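
A minimal sketch of what I mean, assuming the proposed HAVE_MPN_SUMDIFF_N
define existed (it does not yet). The names fft_sumdiff_n and limb_t are
hypothetical, and I am assuming the return value is 2*carry + borrow and
that the outputs do not alias the inputs:

```c
#include <assert.h>

#ifdef HAVE_MPN_SUMDIFF_N
/* Proposed flag (an assumption, not in mpir.h today): use the library's
   mpn_sumdiff_n directly when it is exported. */
#include <mpir.h>
typedef mp_limb_t limb_t;
#define fft_sumdiff_n mpn_sumdiff_n
#else
/* Stand-in limb type so the sketch compiles without mpir.h; real code
   would take mp_limb_t and mp_size_t from mpir.h instead. */
typedef unsigned long limb_t;

/* Hypothetical generic fallback: s = x + y and d = x - y over n limbs.
   Returns 2*carry + borrow. Outputs must not alias the inputs. */
static limb_t fft_sumdiff_n(limb_t *s, limb_t *d,
                            const limb_t *x, const limb_t *y, long n)
{
    limb_t cy = 0, bw = 0;
    long i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        limb_t a = x[i], b = y[i];

        limb_t t = a + b;          /* wraps on overflow */
        limb_t c1 = t < a;         /* carry out of a + b */
        s[i] = t + cy;
        cy = c1 | (s[i] < t);      /* carry out of adding the old carry */

        limb_t u = a - b;          /* wraps on underflow */
        limb_t b1 = a < b;         /* borrow out of a - b */
        d[i] = u - bw;
        bw = b1 | (u < bw);        /* borrow out of subtracting old borrow */
    }
    return 2 * cy + bw;
}
#endif
```

The calling code then uses fft_sumdiff_n everywhere and picks up the
library version automatically once the define appears.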

Obviously this won't be a problem once the code goes into mpir, but
this sort of thing is a strong disincentive to writing fast mpn code,
as initially it is much easier to develop such code standalone than as
part of mpir.

Bill.

On 31 December 2011 18:48, Bill Hart <goodwillh...@googlemail.com> wrote:
> For the FFT the aliasing is not so important, as the butterfly can be
> done into temporary space then swapped. I use this strategy a lot and
> it doesn't affect the FFT timings.
>
> Bill
>
> On 31 December 2011 18:22, Jason <ja...@njkfrudils.plus.com> wrote:
>> On Saturday 31 December 2011 17:47:03 Jason wrote:
>>> On Tuesday 27 December 2011 17:27:48 Bill Hart wrote:
>>> > In my FFT I make use of mpn_sumdiff_n and mpn_addsub_n. It seems these
>>> > are not exported even though there are generic C versions.
>>> >
>>> > Also, I see there is no sumdiff_n.as on core2 style machines. Is it
>>> > possible to include mpn_sumdiff_n.c in the library on such machines so
>>> > that it is included unconditionally for all machines?
>>> >
>>>
>>> We would still have the other arches to do, i.e. power, arm etc. To make it
>>> unconditional, addsub needs to allocate some tmp space. I suppose we could
>>> split addsub into the various overlap cases; this may be possible, but
>>> for sumdiff I don't think it is.
>>>
>>
>> addsub is possible, and so is addadd, although the case addadd_n(t,x,y,z)
>> where t=x=y=z requires mul_1(t,x,3), which on core2 and sandybridge is the
>> same speed as two adds. I don't know about the other arches, although if we
>> consider this a rare case then it may not be important. For sumdiff the only
>> difficult case is when the sum and difference are aliased with both the
>> sources. Could we exclude this overlap condition? It would also relax the
>> instruction ordering, which would make it easier to find faster asm versions.
>>
>>
>>> > Is there a reason to not have an assembly optimised version for core2?
>>> >
>>>
>>> I haven't found one for core2 or sandybridge which is faster than a separate
>>> add and sub.
>>>
>>> > Bill.
>>> >
>>> >
>>>
>>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups 
>> "mpir-devel" group.
>> To post to this group, send email to mpir-devel@googlegroups.com.
>> To unsubscribe from this group, send email to 
>> mpir-devel+unsubscr...@googlegroups.com.
>> For more options, visit this group at 
>> http://groups.google.com/group/mpir-devel?hl=en.
>>
