On Tue, Jun 5, 2012 at 1:18 PM, Chandler Carruth <[email protected]> wrote:
> On Tue, Jun 5, 2012 at 1:15 PM, Stephen Canon <[email protected]> wrote:
>
>> On Jun 5, 2012, at 1:08 PM, Chandler Carruth <[email protected]> wrote:
>>
>> Hey Lang,
>>
>> Sorry to jump in late, but I was catching up on email and finally read through this thread. This is the exchange that caught my interest:
>>
>> On Fri, Jun 1, 2012 at 4:50 AM, Stephen Canon <[email protected]> wrote:
>>
>>> On May 31, 2012, at 10:40 PM, John McCall <[email protected]> wrote:
>>>
>>> > On May 31, 2012, at 7:22 PM, Lang Hames wrote:
>>> >> Thanks for the suggestion Matthieu. I spoke to Doug and he recommended using attributes rather than a FunctionDecl bit to represent the fp_contract state.
>>> >
>>> > Hmm. I had suggested a bit on FunctionDecl on the assumption that this would often be controlled globally, maybe by using a flag to control the default or by activating a #pragma before including all the headers. Actually, I could even imagine a target (maybe a GPU target?) opting in to this behavior by default. If we're going to use an Attr, we need to make sure it doesn't get added unless the current #pragma state is different from the global default; we really don't want to be allocating an attribute for every function definition in the translation unit.
>>>
>>> We want FP_CONTRACT ON to be the default for all targets. It's also worth noting that it's critical that we support setting the pragma to OFF, but in practice this will be exceedingly rare (almost certainly less than 1% of sources, and probably far less than that).
>>
>> Based on this comment, I'm really not keen on the current representation, but maybe I've misunderstood it, so I'll ask questions first:
>>
>> The 'fmuladd' intrinsic is used to whitelist specific operations for fused multiply+add handling, correct?
>>
>> Correct.
>>
>> If so, and if Stephen's stance is correct (I certainly agree with it!) that this should be allowed for the vast majority of code, that means that almost every fmul and fadd in the current IR should be a candidate for fusing?
>>
>> Only those that originate from a common source-language *expression*. Your examples should not be fused because the multiply and add are in two separate expressions (which is why we need FE involvement; that information isn't available later).
>
> Ok, now I'm extra confused. Thanks for replying; hopefully you can help me understand better.
>
> Why would it not be OK to fuse multiplies and adds that occur in two source-language expressions? I have some vague memory of Fortran having lots of special rules about within-expression semantics versus semantics across expressions, but C++ has no such constraints to my knowledge, nor would it want them.
>
> Having these types of artificial source-representation restrictions on semantics in C++ undermines specific language constructs like overloaded operators and transparent "wrapper" classes.

Trying to at least do my homework, as I'm not usually working w/ numerics, I've been reading up. I've now read the FP_CONTRACT part of the C11 spec, and I see where your statement comes from.

I find this restriction... mysterious. I would love to understand why it is important to prevent inlining from exposing contraction opportunities, if you can give any examples. That said, FP_CONTRACT doesn't apply to C++, and it's quite unlikely to become a serious part of the standard given these (among other) limitations.
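To make the distinction concrete, here is how I understand it (a sketch only; the function names are made up, the comments are just my reading of the C11 contraction rules, and whether an fma actually gets emitted is of course up to the implementation):

  // One expression: under FP_CONTRACT ON this multiply+add is a candidate
  // for contraction into a single fused multiply-add.
  double fused_ok(double a, double b, double c) {
    return a * b + c;
  }

  // Two expressions: the multiply and the add come from separate source
  // expressions, so under a strict reading they are not candidates -- and
  // by the time the middle end sees an adjacent fmul/fadd pair, that
  // source-level boundary is gone, which is why the FE has to record it.
  double not_fused(double a, double b, double c) {
    double t = a * b;
    return t + c;
  }

  // The case that worries me: a transparent wrapper. The multiply lives in
  // operator* and the add in operator+, i.e. in two different source
  // expressions, so a strict reading forbids fusing them -- yet after
  // inlining, dot_step() is indistinguishable from fused_ok() in the IR.
  struct Real {
    double v;
    friend Real operator*(Real x, Real y) { return {x.v * y.v}; }
    friend Real operator+(Real x, Real y) { return {x.v + y.v}; }
  };

  double dot_step(Real a, Real b, Real c) {
    return (a * b + c).v;
  }

That last case is exactly the kind of thing I don't want the representation to rule out.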
Curiously, in C++11 none of this may be needed to get the benefit of fused multiply-add: [expr] p11 seems to indicate that in C++ we are almost always allowed to use increased precision to represent operations. The only exception we can find in the C++ standard (and thanks to Richard for helping me crawl through this part) is this:

  static_cast<float>(static_cast<double>(x))

For any expression 'x' of floating point type, the expression may be evaluated with extra precision, but the result of round-trip casting it through a double must not. ;] It's not entirely clear this contortion was intended[1]. This definition, while awkward and arbitrary, has the nice property of cleanly representing the boundaries of the increased-precision allowance w/o regard for inlining or other optimizations.

The state of C++11 makes my (somewhat crazy) idea of a flag a less attractive representation, as does the C11 contraction specification, but it still doesn't make me enthused about the default representation becoming an intrinsic and forcing the FE to pre-fuse all of these, rather than marking the range of fuse-able operations and allowing the middle end to perform the fusion. I'm actually beginning to like the start/stop intrinsic pair to represent the sequences of ineligible operations.

-Chandler

[1] There is a footnote in the latest working draft that indicates 'static_cast<float>(x)' may have been intended to be enough to force the precision, but the current wording isn't strict enough for that to be the case.
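To spell the round-trip contortion out in code (again only a sketch: the names are made up, 'double' stands in for whatever wider precision the implementation happens to use, and whether any extra precision or fusion actually occurs is entirely up to the implementation):

  float narrow(float x, float y, float z) {
    // Under [expr] p11 this may be evaluated with extra precision -- kept in
    // a wider register, or contracted into a single fused multiply-add.
    float wide = x * y + z;

    // A plain static_cast<float>(wide) is arguably not strong enough to force
    // the excess precision to be dropped under the current wording (see [1]);
    // the round trip through double is the spelling that clearly requires it.
    return static_cast<float>(static_cast<double>(wide));
  }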
