Re: [swift-dev] Making the sign of NaNs unspecified to enable enum layout optimization

Joe Groff via swift-dev Mon, 24 Oct 2016 13:42:06 -0700

> On Oct 24, 2016, at 1:36 PM, John McCall wrote:
> 
>> On Oct 24, 2016, at 1:23 PM, Joe Groff wrote:
>>> On Oct 24, 2016, at 12:58 PM, John McCall wrote:
>>> 
>>>> On Oct 24, 2016, at 12:30 PM, Stephen Canon wrote:
>>>>> On Oct 24, 2016, at 2:55 PM, John McCall via swift-dev wrote:
>>>>> 
>>>>>> On Oct 24, 2016, at 8:49 AM, Joe Groff via swift-dev wrote:
>>>>>>> On Oct 22, 2016, at 10:39 AM, Chris Lattner wrote:
>>>>>>> 
>>>>>>>> On Oct 20, 2016, at 2:59 PM, Joe Groff via swift-dev wrote:
>>>>>>>>> 
>>>>>>>>> copysign( ) is a reason to not pick the first option.  I’m not very 
>>>>>>>>> worried about it, but it is a reason.  I see no problem with the 
>>>>>>>>> second option.
>>>>>>>> 
>>>>>>>> As we discussed in person this morning, de-canonicalizing b11 might be 
>>>>>>>> a better compromise to minimize the potential impact of layout 
>>>>>>>> optimizations. That would leave the implementation with 2^51 NaN 
>>>>>>>> representations (50 significand bits, plus the sign bit) in Double to 
>>>>>>>> play with, which ought to be enough for anyone™. I liked the idea of 
>>>>>>>> using the sign bit originally since testing for NaNs and sign bits is 
>>>>>>>> something that can be easily done using common FPU instructions 
>>>>>>>> without crossing domains, but as you noted, it sounds like comparison 
>>>>>>>> and branching operations tend to do that anyway, so masking and 
>>>>>>>> branching using integer operations shouldn't be too much of a burden. 
>>>>>>>> Jordan's question of to what degree we consider different NaN 
>>>>>>>> encodings to be distinct semantic values is still an interesting one, 
>>>>>>>> but if we take only the b11 NaN payloads away, that should minimize 
>>>>>>>> the degree to which the implementation needs to be considered as a 
>>>>>>>> constraint in having that discussion.
>>>>>>> 
>>>>>>> To your original email, I agree this is an important problem to tackle, 
>>>>>>> and that we should handle the inhabitant masking when the FP value is 
>>>>>>> converted to optional.
>>>>>>> 
>>>>>>> That said, I don’t understand the above.  With the “b11” 
>>>>>>> representation, how is a “Double?” tested for “.None”? One 
>>>>>>> advantage of using the sign bit is that “is negative” comparisons are 
>>>>>>> very cheap on RISC systems, because you don’t have to materialize a 
>>>>>>> large/weird immediate.
>>>>>> 
>>>>>> That's why I liked using the sign bit originally too. Steve noted that, 
>>>>>> since any operation on an Optional is probably going to involve testing 
>>>>>> and branching before revealing the underlying float value, and float 
>>>>>> comparisons and branches tend to unavoidably burn a couple cycles 
>>>>>> engaging the integer ALU, there's unlikely to be much benefit on ARM or 
>>>>>> Intel avoiding integer masking operations. (More strictly RISCy 
>>>>>> architectures like Power would be more negatively impacted, perhaps.) On 
>>>>>> ARM64 at least, the bitmask for a b11 NaN is still representable as an 
>>>>>> immediate, since it involves a single contiguous run of 1 bits.
>>>>> 
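
For illustration, here is a minimal Swift sketch of the b11 bitmask discussed above. It assumes "b11" means a NaN whose two leading significand bits are both set, and it ignores the sign bit. The names `b11Mask` and `isLeading11NaN` are hypothetical and not part of any Swift API.

```swift
// IEEE 754 binary64 layout: 1 sign bit, 11 exponent bits, 52 significand bits.
let exponentMask: UInt64 = 0x7FF0_0000_0000_0000    // all exponent bits set
let leadingTwoBits: UInt64 = 0x000C_0000_0000_0000  // top two significand bits
let b11Mask: UInt64 = exponentMask | leadingTwoBits // 0x7FFC_0000_0000_0000

/// Hypothetical helper: true when the encoding is a NaN whose two leading
/// significand bits are both set (the sign bit is ignored).
func isLeading11NaN(_ bits: UInt64) -> Bool {
    return bits & b11Mask == b11Mask
}

// The mask is a single contiguous run of 13 one bits (bits 50 through 62).
let runLength = b11Mask.nonzeroBitCount
let shifted: UInt64 = b11Mask >> b11Mask.trailingZeroBitCount
let expectedRun: UInt64 = (UInt64(1) << runLength) - 1
let isContiguousRun = shifted == expectedRun

// 50 free significand bits plus the sign bit leave 2^51 encodings.
let reservedCount: UInt64 = UInt64(1) << 51
```

With this mask the default quiet NaN (`Double.nan`) is not flagged, since only the topmost significand bit is set in its encoding.
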
>>>>> There isn't any efficient way of just testing the sign bit of a value 
>>>>> using FP instructions that I can see.  You could maybe take advantage of 
>>>>> the vector registers overlapping the FP registers and use integer vector 
>>>>> operations, but it would take a lot of code and have false-dependency 
>>>>> problems.  So in both representations, the most efficient test sequence 
>>>>> seems to be (1) get value in integer register (2) compare against some 
>>>>> specific integer value.  And in that case, in both representations it 
>>>>> seems to me that the obvious extra-inhabitant sequence is 0xFFFFFFFF, 
>>>>> 0xFFFFFFFE, …
>>>> 
>>>> The test for detecting the reserved encoding is essentially identical 
>>>> either way (pseudo-assembly):
>>>> 
>>>>    detectNegativeNaN:
>>>>            ADD encoding, encoding, 0x0010000000000000
>>>>            JC nil
>>>> 
>>>>    detectLeading11NaN:
>>>>            ADD encoding, encoding, 0x0004000000000000
>>>>            JO nil
>>> 
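
The two pseudo-assembly tests quoted above can be approximated in Swift with overflow-reporting addition. This is only a sketch: the function names are hypothetical, and the carry and overflow flags are modeled with `addingReportingOverflow` on `UInt64` and `Int64` respectively.

```swift
/// Models `ADD encoding, 0x0010000000000000; JC nil`.
/// The unsigned add carries out exactly when the top 12 bits
/// (sign plus exponent) are all ones.
func carriesOut(_ bits: UInt64) -> Bool {
    return bits.addingReportingOverflow(0x0010_0000_0000_0000).overflow
}

/// Models `ADD encoding, 0x0004000000000000; JO nil`.
/// The signed add overflows exactly when the sign bit is clear and the
/// next 13 bits (exponent plus two leading significand bits) are all ones.
func overflowsSigned(_ bits: UInt64) -> Bool {
    let signed = Int64(bitPattern: bits)
    return signed.addingReportingOverflow(0x0004_0000_0000_0000).overflow
}

// A negative NaN carries; the default (positive) quiet NaN does not.
let negativeNaNCarries = carriesOut(0xFFF8_0000_0000_0000)
let positiveNaNCarries = carriesOut(Double.nan.bitPattern)

// A positive leading-11 NaN overflows; the default quiet NaN does not.
let leading11Overflows = overflowsSigned(0x7FFC_0000_0000_0000)
let defaultNaNOverflows = overflowsSigned(Double.nan.bitPattern)
```

As written, the carry test in this sketch also fires for the encoding of negative infinity (0xFFF0000000000000), because its top 12 bits are all ones too.
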
>>> Sure, that's basically just a different way of spelling the comparison.  
>>> For the most part, though, Swift will not need to perform this operation; 
>>> it'll be checking for a specific value.  I don't see any reason to say that 
>>> e.g. .none can be encoded by an arbitrary reserved NaN rather than a 
>>> specific one.
>> 
>> When we know there's exactly one no-payload case, as with .none in Optional, 
>> we do have the option of testing for an arbitrary extra inhabitant if it 
>> happens to be cheaper/smaller code, since having any extra inhabitant 
>> representation other than the first would be UB anyway.
> 
> Sure.
> 
>> In these cases, either the mask or first inhabitant should fit in an ARM64 
>> bitmask immediate, and are a 64-bit movabs on Intel either way, so it's 
>> probably not worthwhile.
> 
> Well, if we always set the sign bit on our extra inhabitants, we end up with 
> a prefix that's amenable to extra inhabitants typically being small-magnitude 
> negative numbers, right?  Or am I missing something important?


Ah, I see what you're saying now. Yeah, setting the sign bit for extra 
inhabitants definitely makes sense, for the benefit of platforms with less 
clever immediate encodings.
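
A small Swift sketch of that idea, assuming the extra inhabitants count down from the all-ones 64-bit pattern (the 64-bit analogue of the 0xFFFFFFFF, 0xFFFFFFFE, … sequence mentioned earlier in the thread). The helper `extraInhabitant` is hypothetical and does not describe how the compiler actually assigns them.

```swift
/// Hypothetical helper: the index-th extra inhabitant, counting down
/// from the all-ones bit pattern.
func extraInhabitant(_ index: UInt64) -> UInt64 {
    return UInt64.max - index
}

let firstBits = extraInhabitant(0)   // 0xFFFF_FFFF_FFFF_FFFF
let secondBits = extraInhabitant(1)  // 0xFFFF_FFFF_FFFF_FFFE

// Reinterpreted as signed integers, these are small-magnitude negatives,
// which are cheap to materialize as immediates on many architectures.
let firstSigned = Int64(bitPattern: firstBits)
let secondSigned = Int64(bitPattern: secondBits)

// Both patterns are still NaNs with the sign bit set.
let firstValue = Double(bitPattern: firstBits)
let firstIsNegativeNaN = firstValue.isNaN && firstValue.sign == .minus
```
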

-Joe

_______________________________________________
swift-dev mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-dev
