On Oct 30, 2012, at 4:13 PM, Michael Liao <[email protected]> wrote:
> These unary insns are not modelled accurately to capture the hardware > behavior (high parts of result is preserved in SSE variant). Even though > the fix works here, it'd better to fix them in X86 backend. Bug is filed > in PR14221. I actually have a fix in backend, but thought it is cleaner to fix it in the front-end :) Thanks, Manman > > Yours > - Michael > > On Tue, 2012-10-30 at 10:57 -0700, manman ren wrote: >> On Oct 29, 2012, at 6:01 PM, Eli Friedman <[email protected]> wrote: >> >>> On Thu, Oct 25, 2012 at 8:23 PM, Manman Ren <[email protected]> wrote: >>>> >>>> >>>> Sent from my iPhone >>>> >>>> On Oct 25, 2012, at 7:54 PM, Eli Friedman <[email protected]> wrote: >>>> >>>>> On Thu, Oct 25, 2012 at 7:35 PM, Manman Ren <[email protected]> wrote: >>>>>> >>>>>> >>>>>> Sent from my iPhone >>>>>> >>>>>> On Oct 25, 2012, at 6:13 PM, Eli Friedman <[email protected]> wrote: >>>>>> >>>>>>> On Thu, Oct 25, 2012 at 5:25 PM, Manman Ren <[email protected]> wrote: >>>>>>>> Author: mren >>>>>>>> Date: Thu Oct 25 19:25:10 2012 >>>>>>>> New Revision: 166743 >>>>>>>> >>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=166743&view=rev >>>>>>>> Log: >>>>>>>> X86 SSE Intrinsics: update header for sqrt_ss, rsqrt_ss and rcp_ss. >>>>>>>> >>>>>>>> There intrinsics pass through the upper FP values from the input. >>>>>>>> rdar://12558838 >>>>>>>> >>>>>>>> Modified: >>>>>>>> cfe/trunk/lib/Headers/xmmintrin.h >>>>>>>> cfe/trunk/test/CodeGen/sse-builtins.c >>>>>>>> >>>>>>>> Modified: cfe/trunk/lib/Headers/xmmintrin.h >>>>>>>> URL: >>>>>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/xmmintrin.h?rev=166743&r1=166742&r2=166743&view=diff >>>>>>>> ============================================================================== >>>>>>>> --- cfe/trunk/lib/Headers/xmmintrin.h (original) >>>>>>>> +++ cfe/trunk/lib/Headers/xmmintrin.h Thu Oct 25 19:25:10 2012 >>>>>>>> @@ -95,7 +95,8 @@ >>>>>>>> static __inline__ __m128 __attribute__((__always_inline__, >>>>>>>> __nodebug__)) >>>>>>>> _mm_sqrt_ss(__m128 a) >>>>>>>> { >>>>>>>> - return __builtin_ia32_sqrtss(a); >>>>>>>> + __m128 c = __builtin_ia32_sqrtss(a); >>>>>>>> + return (__m128) { c[0], a[1], a[2], a[3] }; >>>>>>>> } >>>>>>> >>>>>>> What does exactly does __builtin_ia32_sqrtss return, if not the result >>>>>>> of a sqrtss instruction? >>>>>> builtin returns the result of sqrtss instruction, which only updates the >>>>>> lowest FP. mm_sqrt_ss has the upper FPs pass through to the output. >>>>> >>>>> What exactly is in the top 96 bits of the return value of >>>>> __builtin_ia32_sqrtss, then? >>>> Those bits are undefined with the current implementation, same goes for >>>> sqrt_sd. >>> >>> Why are the intrinsics defined to take vectors in the first place, then? >> Not sure why the builtins for sqrtss, rsqrtss, rcpss and sqrtsd take vectors. >> Their top bits are undefined since they directly map to the corresponding >> instructions which have the top bits from the un-initialized destination >> registers. >> >> Thanks, >> Manman >>> >>> -Eli >> >> _______________________________________________ >> cfe-commits mailing list >> [email protected] >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits > > _______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
