JoshRosen commented on issue #24709: [SPARK-27841][SQL] Improve UTF8String to/fromString()/numBytesForFirstByte() performance
URL: https://github.com/apache/spark/pull/24709#issuecomment-496026174

By the way, if I were to prioritize these changes for inclusion / consideration, I'd rank them as:

1. `numBytesForFirstByte()`
2. `fromString()`
3. `toString()`

The `fromString()` changes have a significantly larger impact than the `toString()` changes because they result in a much larger reduction in garbage creation.

Since this is all just an experiment, I'd be totally cool with spinning off a subset of these changes into a separate, much tinier PR in case we decide that only some of them are worthwhile.
