leonardchan added inline comments.

================
Comment at: clang/lib/Basic/FixedPoint.cpp:242
+  } else
+    Overflowed = Result < Min || Result > Max;
+
----------------
ebevhan wrote:
> rjmccall wrote:
> > ebevhan wrote:
> > > rjmccall wrote:
> > > > leonardchan wrote:
> > > > > ebevhan wrote:
> > > > > > rjmccall wrote:
> > > > > > > ebevhan wrote:
> > > > > > > > ebevhan wrote:
> > > > > > > > > rjmccall wrote:
> > > > > > > > > > If the maximum expressible value is *k*, and the 
> > > > > > > > > > fully-precise multiplication yields *k+e* for some epsilon 
> > > > > > > > > > *e* that isn't representable in the result semantics, is 
> > > > > > > > > > that considered an overflow?  If so, I think you need to do 
> > > > > > > > > > the shift after these bound checks, since the shift 
> > > > > > > > > > destroys the difference between *k* and *k+e*.  That is, 
> > > > > > > > > > unless there's a compelling mathematical argument that it's 
> > > > > > > > > > not possible to overflow only in the full-precision 
> > > > > > > > > > multiplication — but while I think that's possibly true of 
> > > > > > > > > > `_Fract` (since *k^2 < k*), it seems unlikely to be true of 
> > > > > > > > > > `_Accum`, although I haven't looked for a counter-example.  
> > > > > > > > > > And if there is a compelling argument, it should probably 
> > > > > > > > > > be at least alluded to in a comment.
> > > > > > > > > > 
> > > > > > > > > > Would this algorithm be simpler if you took advantage of 
> > > > > > > > > > the fact that `APFixedPointSemantics` doesn't have to 
> > > > > > > > > > correspond to a real type?  You could probably just convert 
> > > > > > > > > > to a double-width common semantics, right?
> > > > > > > > > > If the maximum expressible value is *k*, and the 
> > > > > > > > > > fully-precise multiplication yields *k+e* for some epsilon 
> > > > > > > > > > *e* that isn't representable in the result semantics, is 
> > > > > > > > > > that considered an overflow? If so, I think you need to do 
> > > > > > > > > > the shift after these bound checks, since the shift 
> > > > > > > > > > destroys the difference between *k* and *k+e*.
> > > > > > > > > 
> > > > > > > > > I don't think I would consider that to be overflow; that's 
> > > > > > > > > precision loss. E-C considers these to be different:
> > > > > > > > > 
> > > > > > > > > > If the source value cannot be represented exactly by the 
> > > > > > > > > > fixed-point type, the source value is rounded to either the 
> > > > > > > > > > closest fixed-point value greater than the source value 
> > > > > > > > > > (rounded up) or to the closest fixed-point value less than 
> > > > > > > > > > the source value (rounded down).
> > > > > > > > > >
> > > > > > > > > > When the source value does not fit within the range of the 
> > > > > > > > > > fixed-point type, the conversion overflows. [...]
> > > > > > > > > >
> > > > > > > > > > [...]
> > > > > > > > > >
> > > > > > > > > > If the result type of an arithmetic operation is a 
> > > > > > > > > > fixed-point type, [...] the calculated result is the 
> > > > > > > > > > mathematically exact result with overflow handling and 
> > > > > > > > > > rounding performed to the full precision of the result type 
> > > > > > > > > > as explained in 4.1.3. 
> > > > > > > > > 
> > > > > > > > > There is also no value of `e` that would affect saturation. 
> > > > > > > > > Any full precision calculation that gives `k+e` must be `k` 
> > > > > > > > > after downscaling, since the bits that represent `e` must 
> > > > > > > > > come from the extra precision range. Even though `k+e` is 
> > > > > > > > > technically larger than `k`, saturation would still just give 
> > > > > > > > > us `k` after truncating out `e`, so the end result is the 
> > > > > > > > > same.
> > > > > > > > > 
> > > > > > > > > > Would this algorithm be simpler if you took advantage of 
> > > > > > > > > > the fact that APFixedPointSemantics doesn't have to 
> > > > > > > > > > correspond to a real type? You could probably just convert 
> > > > > > > > > > to a double-width common semantics, right?
> > > > > > > > > 
> > > > > > > > > It's likely possible to use APFixedPoint in the calculations 
> > > > > > > > > here, but I used APInt to make the behavior explicit and not 
> > > > > > > > > accidentally be dependent on the behavior of APFixedPoint's 
> > > > > > > > > conversions or operations.
> > > > > > > > Although... I guess I see your point in that an intermediate 
> > > > > > > > result of k+e technically "does not fit within the range of the 
> > > > > > > > fixed-point type"... but I wonder if treating such cases as 
> > > > > > > > overflow is particularly meaningful. I don't find there to be 
> > > > > > > > much of a distinction between such a case and the case where 
> > > > > > > > the exact result lands in between two representable values. We 
> > > > > > > > just end up with a less precise result.
> > > > > > > Right, I was wondering if there was an accepted answer here.  For 
> > > > > > > saturating arithmetic, it's equivalent to truncate this extra 
> > > > > > > precision down to *k* or to saturate to the maximum representable 
> > > > > > > value, since by assumption that was just *k*; but for 
> > > > > > > non-saturating arithmetic, it affects whether the operation has 
> > > > > > > UB.  All else being the same, it's better to have fewer 
> > > > > > > corner-case sources of UB.
> > > > > > > 
> > > > > > > My read is that Embedded C is saying there's a sequence here: 
> > > > > > > compute the exact mathematical result; round that to the 
> > > > > > > precision of the result type; the operation overflows if the 
> > > > > > > rounded result is not representable in the result type.  Is the 
> > > > > > > rounding direction completely unspecified, down to being 
> > > > > > > potentially operand-specific?  If so, we could just say that we 
> > > > > > > always round to avoid overflow if possible.  The main 
> > > > > > > consideration here is that we need to give the operation the same 
> > > > > > > semantics statically and dynamically, and I don't know if there's 
> > > > > > > any situation where those semantics would affect the performance 
> > > > > > > of the operation when done dynamically.
> > > > > > > For saturating arithmetic, it's equivalent to truncate this extra 
> > > > > > > precision down to *k* or to saturate to the maximum representable 
> > > > > > > value, since by assumption that was just *k*; but for 
> > > > > > > non-saturating arithmetic, it affects whether the operation has 
> > > > > > > UB.
> > > > > > 
> > > > > > I'm fairly sure that the conclusions here about k and e only hold 
> > > > > > if k truly is the maximum representable value. If k is anything 
> > > > > > else (even epsilon-of-the-representable-range less), k+e can never 
> > > > > > be greater than the maximum.
> > > > > > 
> > > > > > And actually, crunching the numbers on this... If we have integers 
> > > > > > a and b of width N, sign extended to the double bitwidth A and B, 
> > > > > > there can be no values for a and b for which A*B is greater than 
> > > > > > N_Max<<N (`k`). Taking 8-bit as an example: Max is 127, and Max<<8 
> > > > > > is 32512. The maximum possible value attainable is -128*-128, which 
> > > > > > is 16384. That isn't even close to the k+e case.
> > > > > > 
> > > > > > I'm unsure if this reasoning applies in the minimum case as well.
> > > > > > 
> > > > > > > My read is that Embedded C is saying there's a sequence here: 
> > > > > > > compute the exact mathematical result; round that to the 
> > > > > > > precision of the result type; the operation overflows if the 
> > > > > > > rounded result is not representable in the result type.
> > > > > > 
> > > > > > I wonder if it's intended to be a sequence. It's starting to feel 
> > > > > > like it can't actually be both cases at the same time.
> > > > > > 
> > > > > > And actually, crunching the numbers on this... If we have integers 
> > > > > > a and b of width N, sign extended to the double bitwidth A and B, 
> > > > > > there can be no values for a and b for which A*B is greater than 
> > > > > > N_Max<<N (k). Taking 8-bit as an example: Max is 127, and Max<<8 is 
> > > > > > 32512. The maximum possible value attainable is -128*-128, which is 
> > > > > > 16384. That isn't even close to the k+e case.
> > > > > 
> > > > > I think you mean the scale instead of `N` for `N_Max<<N`, and we 
> > > > > would run into this case for `(N_max << scale) < (a * b) < ((N_max + 
> > > > > 1) << scale)` where `a` and `b` represent the scaled integers. An 
> > > > > example is `1.75 * 2.25`, represented as 4 bit unsigned ints with 
> > > > > scales of 2:
> > > > > 
> > > > > ```
> > > > >   01.11 (1.75)
> > > > > x 10.01 (2.25)
> > > > > -------------
> > > > > 11.1111 (3.9375) -> shr 2 -> 11.11 (3.75)
> > > > > ```
> > > > > 
> > > > > where our `e` in this case is < 0.25.
> > > > > 
> > > > > My interpretation of the spec (which could be wrong) is whenever they 
> > > > > refer to "source value", they mean the exact mathematical result 
> > > > > (`3.9375`), so precision loss and overflow can occur at the same time 
> > > > > independently of each other. For the non-saturating case, I'd 
> > > > > consider the `k + e` to be UB because of this.
> > > > Your logic only works if the entire integer is scaled, i.e. for 
> > > > `_Fract`; for `_Accum` types where the scale S can be less than N, it's 
> > > > possible to have an "epsilon" overflow.  For example, with S=4 and N=8, 
> > > > `(44/16) * (93/16) == (255/16) + (12/256)`.
> > > > 
> > > > Here's a program to brute-force search for counter-examples for an 
> > > > arbitrary unsigned fixed-point type: 
> > > > https://gist.github.com/rjmccall/562c2c7c9d289edd8cdf034edd6c1f17
> > > > I think you mean the scale instead of N
> > > 
> > > > Your logic only works if the entire integer is scaled
> > > 
> > > Yes, you're absolutely correct, big mistake on my part. Realized that I'd 
> > > made the mistake the same day but stuff got in the way of responding :)
> > > 
> > > > My interpretation of the spec (which could be wrong) is whenever they 
> > > > refer to "source value", they mean the exact mathematical result 
> > > > (3.9375), so precision loss and overflow can occur at the same time 
> > > > independently of each other. For the non-saturating case, I'd consider 
> > > > the k + e to be UB because of this.
> > > 
> > > I agree with the interpretation of "source value".
> > > 
> > > This is still a bit uncertain for me, though. Can they really occur 
> > > simultaneously? Aren't we just considering the overflow case first rather 
> > > than the precision loss/rounding case first? If we instead rounded down 
> > > first (the shift) and then checked overflow, it wouldn't be UB.
> > > 
> > > It feels like a common case to get this kind of result. All that happened 
> > > during the operation was that we lost precision. Is it really worth 
> > > considering it to be UB?
> > Well, like I said up-thread, since the spec doesn't seem to impose any 
> > constraints on rounding at all, I think we can just define it such that we 
> > always round to avoid overflow if possible.  For saturating math, it's the 
> > same either way, since we either (1) "round to avoid overflow" and thus 
> > only see a maximal/minimal value or (2) we detect overflow and thus 
> > substitute the maximal/minimal value.  For non-saturating math, it changes 
> > whether UB formally occurs, which I think affects three areas:
> > 
> > - C++ `constexpr`, which isn't allowed to invoke UB.  Abstractly, it's 
> > better to accept more programs here instead of emitting really pedantic 
> > errors about unrepresentable overflows.
> > 
> > - Fixed-point intrinsics which check whether UB occurred dynamically. I 
> > don't think we have any of these today, but we might add them someday — 
> > among other things, I think the UBSan people would say that UBSan should 
> > have a check for this, which would require such an intrinsic.  It's not 
> > unlikely that this will complicate the implementation because we won't be 
> > able to simply consider whether the underlying integer operation overflowed.
> > 
> > - High-level optimizations that exploit the UB-ness of non-saturating 
> > overflow.  For example, it is abstractly true that `x * C > x` when `x` is 
> > known to be strictly positive and `C` is a constant greater than 1, but if 
> > we define rounding as avoiding overflow, there might be corner cases where 
> > this isn't true for some `C`.  I'm not sure we would ever do any 
> > optimizations like this, but if we did, they'd probably have to be more 
> > conservative in some cases.
> > 
> > So it's really a judgment call for you folks, one that you need to make 
> > with an understanding of where you want to take this feature.
> Okay, these are good points. I think I'm starting to agree with the idea of 
> avoiding overflow if possible. I was a bit concerned that it might be a bit 
> too strong of a claim to make, for example if there were cases where it would 
> be more natural for a particular calculation to detect an overflow rather 
> than round and avoid it. But I'm starting to wonder if there really are any 
> such cases.
> 
> I would also be fine with simply defining the order in which we perform the 
> operations; round first, then check for overflow. That would be in line with 
> the order it's written in the spec, but I don't know if that was how it was 
> intended to work.
> 
> We'll see what Leonard has to say.
I think for simplicity, and since this doesn't seem to actively go against the 
spec, it would be good to round first and then check for overflow.
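
To make the distinction concrete, here's a small sketch (not the patch's 
actual code; plain Python integers standing in for the `APInt` math) of the two 
orderings, using the unsigned 8-bit, scale-4 example of `(44/16) * (93/16)` 
from earlier in the thread:

```python
# Sketch of the two orderings discussed above, with Python ints in
# place of APInt. Unsigned 8-bit fixed point with scale 4, so the
# maximum representable value k is 255/16 = 15.9375.
SCALE = 4
MAX = 255  # maximum underlying value for the unsigned 8-bit type

def mul_round_then_check(a, b):
    # Round first (the right shift, toward negative infinity), then
    # check bounds on the rounded result.
    rounded = (a * b) >> SCALE      # a*b is exact, at scale 2*SCALE
    return rounded, rounded > MAX

def mul_check_then_round(a, b):
    # Check bounds on the full-precision product, then round.
    full = a * b
    return full >> SCALE, full > (MAX << SCALE)

# (44/16) * (93/16) = 15.984375 = k + 12/256: the epsilon-overflow case.
print(mul_round_then_check(44, 93))   # (255, False): rounding removes e
print(mul_check_then_round(44, 93))   # (255, True):  flagged as overflow
```

Rounding first reports no overflow for the `k + e` case, while checking the 
full-precision product first does.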

Going on a tangent (I don't remember if this was brought up before, but do 
remind me if there was a consensus on this): let's say we have a target that 
defines rounding to always be toward positive infinity for its multiplication 
intrinsics. Currently in this patch, I believe the default is always rounding 
toward negative infinity, from the right shift after the multiplication. To 
match the static calculation behavior against the dynamic intrinsics, would it 
be better to add a field in `TargetInfo`, next to the fixed point type widths, 
that specifies the rounding mode?

Something that's been bothering me about this is that if we wanted to do 
something like `constexpr` evaluation for these types, we'd also need to 
consider the rounding, which could mean that a `constexpr` value varies 
depending on the target, unless that's allowed or already accounted for.
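
For illustration, a toy sketch of that concern (the two rounding modes here 
are hypothetical, not anything the patch defines), reusing the 4-bit unsigned, 
scale-2 format from earlier in the thread:

```python
# Toy illustration of target-dependent rounding: the same constant
# multiplication gives different results under round-down vs round-up.
# Both modes are hypothetical; 4-bit unsigned values with scale 2.
SCALE = 2

def mul_round_down(a, b):
    # Round toward negative infinity (plain right shift).
    return (a * b) >> SCALE

def mul_round_up(a, b):
    # Round toward positive infinity.
    return (a * b + (1 << SCALE) - 1) >> SCALE

# 1.25 * 1.75 (underlying values 5 and 7); exact result is 2.1875.
print(mul_round_down(5, 7))  # 8 -> 2.0
print(mul_round_up(5, 7))    # 9 -> 2.25
```

A `constexpr` evaluator would have to pick one of these per target, which is 
exactly where the target-dependent-constant question comes up.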


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73186/new/

https://reviews.llvm.org/D73186



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
