> On Jan 10, 2018, at 9:29 PM, Chris Lattner <clatt...@nondot.org> wrote:
> 
> On Jan 10, 2018, at 11:55 AM, Michael Ilseman via swift-dev 
> <swift-dev@swift.org> wrote:
>> (A gist-formatted version of this email can be found at 
>> https://gist.github.com/milseman/bb39ef7f170641ae52c13600a512782f)
> 
> I’m very very excited for this, thank you for the detailed writeup and 
> consideration of the effects and tradeoffs involved.
> 
>> Given that ordering is not fit for human consumption, but rather machine 
>> processing, it might as well be fast. The current ordering differs on Darwin 
>> and Linux platforms, with Linux in particular suffering from poor 
>> performance due to choice of ordering (UCA with DUCET) and older versions of 
>> ICU. Instead, [String Comparison 
>> Prototype](https://github.com/apple/swift/pull/12115) provides a simpler ordering 
>> that allows for many common-case fast-paths and optimizations. For all the 
>> Unicode enthusiasts out there, this is the lexicographical ordering of 
>> NFC-normalized UTF-16 code units.
> 
> Thank you for fixing this.  Your tradeoffs make perfect sense to me.
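
A minimal sketch of the ordering described in the quoted paragraph, for readers who want to see it concretely. The helper name `precedesInNFCUTF16Order` is hypothetical, and the sketch leans on Foundation's `precomposedStringWithCanonicalMapping` for NFC normalization; it is not the prototype's implementation and has none of its fast paths.

```swift
import Foundation

// Hypothetical helper (not from the prototype): orders two strings by the
// UTF-16 code units of their NFC-normalized forms.
func precedesInNFCUTF16Order(_ lhs: String, _ rhs: String) -> Bool {
    // Foundation's canonical precomposition, i.e. NFC.
    let l = lhs.precomposedStringWithCanonicalMapping
    let r = rhs.precomposedStringWithCanonicalMapping
    // Plain lexicographical comparison of the code units.
    return l.utf16.lexicographicallyPrecedes(r.utf16)
}

let precomposed = "caf\u{E9}"   // "é" as a single scalar
let decomposed = "cafe\u{301}"  // "e" followed by a combining acute accent

// Canonically equivalent strings normalize to the same code units,
// so neither one orders before the other.
let equivalentFormsTie = !precedesInNFCUTF16Order(precomposed, decomposed)
    && !precedesInNFCUTF16Order(decomposed, precomposed)

// Code unit order is not dictionary order: "Z" (0x5A) sorts before "a" (0x61).
let upperBeforeLower = precedesInNFCUTF16Order("Z", "a")
```

The last line shows why such an ordering suits machine processing rather than human consumption: it is cheap to compute but puts every uppercase ASCII letter ahead of every lowercase one.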
> 
>> ### Small String Optimization
> ..
>> For example, assuming a 16-byte String struct and 8 bits used for flags and 
>> discriminators (including discriminators for which small form a String is 
>> in), 120 bits are available for a small string payload. 120 bits can hold 7 
>> UTF-16 code units, which is sufficient for most graphemes and many common 
>> words and separators. 120 bits can also fit 15 ASCII/UTF-8 code units 
>> without any packing, which suffices for many programmer/system strings 
>> (which have a strong skew towards ASCII).
>> 
>> We may also want a compact 5-bit encoding for formatted numbers, such as 
>> 64-bit memory addresses in hex, `Int.max` in base-10, and `Double` in 
>> base-10, which would require 18, 19, and 24 characters respectively. 120 
>> bits with a 5-bit encoding would fit all of these. This would speed up the 
>> creation and interpolation of many strings containing numbers.
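
The capacity figures in the two quoted paragraphs can be checked with a few lines of arithmetic. This is only a sketch under the assumptions stated there (a 16-byte struct with 8 bits reserved for flags and discriminators); the constant names are made up for illustration, and the 24-character figure for `Double` is taken from the text rather than computed.

```swift
// Assumptions from the quoted text: 16-byte String struct, 8 bits of flags.
let structBits = 16 * 8
let flagBits = 8
let payloadBits = structBits - flagBits      // 120 bits of payload

// How many code units fit at each width.
let utf16Capacity = payloadBits / 16         // 7 UTF-16 code units
let asciiCapacity = payloadBits / 8          // 15 ASCII/UTF-8 code units
let fiveBitCapacity = payloadBits / 5        // 24 five-bit symbols

// Lengths of the formatted numbers mentioned above.
let hexAddress = "0x" + String(UInt64.max, radix: 16)  // 64-bit address in hex
let intMax = String(Int64.max)                         // Int.max on a 64-bit platform
let doubleLength = 24                                  // figure quoted in the text

// Of the three, only the 5-bit form holds all of them; 15 ASCII code units do not.
let longest = max(hexAddress.count, intMax.count, doubleLength)
let fitsInFiveBit = longest <= fiveBitCapacity
let fitsInASCII = longest <= asciiCapacity
```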
> 
> I think it is important to consider that having more special cases and 
> different representations slows down nearly *every* operation on string 
> because they have to check and detangle all of the possible representations.  
> Given the ability to hold 15 digits of ASCII, I don’t see why it would be 
> worthwhile to worry about a 5-bit representation for digits.  String should 
> be an Any!

^ String should NOT be an Any!
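
To make the cost of extra representations concrete, here is a rough sketch. The `Storage` enum and its cases are hypothetical and are not Swift's actual String layout; the point is only that each added form becomes another branch in every operation.

```swift
// Hypothetical storage forms for illustration only.
enum Storage {
    case smallASCII([UInt8])    // up to 15 ASCII code units
    case smallUTF16([UInt16])   // up to 7 UTF-16 code units
    case smallFiveBit([UInt8])  // up to 24 symbols, one 5-bit value per element
    case heap(String)           // everything else
}

// Even a trivial query has to check and detangle every representation.
func utf16Count(of storage: Storage) -> Int {
    switch storage {
    case .smallASCII(let units):
        return units.count
    case .smallUTF16(let units):
        return units.count
    case .smallFiveBit(let symbols):
        return symbols.count
    case .heap(let string):
        return string.utf16.count
    }
}
```

Dropping the `smallFiveBit` case removes one branch from this function and from every other operation written the same way.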

-Chris

_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev
