stevedlawrence opened a new pull request #316: Assortment of changes to improve performance
URL: https://github.com/apache/incubator-daffodil/pull/316

- Optimize how we parse date/times from an XML infoset. Previously we attempted to parse a handful of different potential patterns using ICU4J SimpleDateFormat.parse(). Profiling showed that even this simple parse performed many allocations. Since the format is fairly strict, we can implement the parsing ourselves, which is much more efficient, minimizes allocations, and avoids the thread-locals that were needed because SimpleDateFormat is not thread safe.

- Instead of creating a new Calendar every time we parse an XML Date/Time, clone an already existing one. Initializing a new Calendar is pretty expensive; cloning is much more efficient.

- Skip remapping XML strings if they do not contain any invalid characters. This avoids unnecessary string allocations in the common case, but does incur some overhead in the rare case when remapping is needed.

- Use the same empty Array of DFADelimiters in the delimiter stack. Otherwise we allocate a bunch of empty arrays, and Scala ends up allocating even more stuff to support them (e.g. ClassTags).

- Fix MaybeBoolean to avoid allocating new MaybeBooleans.

- Use Array.length == 0 instead of Array.isEmpty in inner loops. Using .isEmpty will allocate a Scala ArrayOps, which is unnecessary if all we're doing is checking the length. Note that .size will also allocate an ArrayOps.

- Fix maybeConstant_ in Evaluatable to avoid Maybe allocations.

These issues were discovered while profiling an unparse of a CSV schema. Performance averaged around 20% faster in my testing.

DAFFODIL-2222, DAFFODIL-1883
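
For readers unfamiliar with the date/time items, the following is a minimal sketch of the two ideas together: hand-parsing a strict format and cloning a pre-built calendar. It is not the code in this pull request. The names `XmlDate`, `template`, `digits`, and `parseDate` are hypothetical, it handles only the plain `YYYY-MM-DD` form, and it uses `java.util.GregorianCalendar` as a stand-in for the ICU4J Calendar so the example has no external dependencies.

```scala
import java.util.{Calendar, GregorianCalendar, TimeZone}

object XmlDate {
  // Built once. Constructing a calendar is the expensive step, so every
  // parse below clones this template instead of constructing a new one.
  // Assumption: the template is never mutated after initialization.
  private val template: Calendar = {
    val c = new GregorianCalendar(TimeZone.getTimeZone("UTC"))
    c.clear()
    c.setLenient(false)
    c
  }

  // Reads the decimal digits in s[from, until) without allocating.
  private def digits(s: String, from: Int, until: Int): Int = {
    var i = from
    var n = 0
    while (i < until) {
      val d = s.charAt(i) - '0'
      if (d < 0 || d > 9)
        throw new IllegalArgumentException("not a digit at index " + i)
      n = n * 10 + d
      i += 1
    }
    n
  }

  // Hypothetical strict parser for "YYYY-MM-DD" only (no time zone suffix).
  def parseDate(s: String): Calendar = {
    if (s.length != 10 || s.charAt(4) != '-' || s.charAt(7) != '-')
      throw new IllegalArgumentException("expected YYYY-MM-DD: " + s)
    val cal = template.clone().asInstanceOf[Calendar]
    // Calendar months are zero-based, hence the "- 1".
    cal.set(digits(s, 0, 4), digits(s, 5, 7) - 1, digits(s, 8, 10))
    cal
  }
}
```

Because the parser holds no mutable formatter state, it needs no thread-local, which is the property the first item relies on.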
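
The "skip remapping" item can be illustrated with a scan-first fast path. This is a sketch under stated assumptions rather than the implementation in this pull request: `XmlRemap`, `isValid`, and `remapChar` are hypothetical names, the validity rule is a simplified reading of XML 1.0 that does not inspect surrogate pairs, and invalid characters are replaced with U+FFFD purely for illustration.

```scala
object XmlRemap {
  // Simplified validity check (assumption): tab, LF, CR, and U+0020..U+FFFD
  // are accepted; surrogate pairs are not examined.
  private def isValid(c: Char): Boolean =
    c == '\t' || c == '\n' || c == '\r' || (c >= ' ' && c < 0xFFFE)

  // Hypothetical replacement policy: substitute U+FFFD for invalid chars.
  private def remapChar(c: Char): Char =
    if (isValid(c)) c else 0xFFFD.toChar

  def remap(s: String): String = {
    val len = s.length
    var i = 0
    // Scan for the first invalid character without allocating anything.
    while (i < len && isValid(s.charAt(i))) i += 1
    if (i == len) {
      s // common case: nothing to remap, return the same instance
    } else {
      // Rare case: copy the valid prefix, then remap the remainder.
      val sb = new java.lang.StringBuilder(len)
      sb.append(s, 0, i)
      while (i < len) {
        sb.append(remapChar(s.charAt(i)))
        i += 1
      }
      sb.toString
    }
  }
}
```

The initial scan is the overhead paid in the rare case; in the common case the input string is returned unchanged and no new string is allocated.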
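
The two array items follow the same pattern, sketched below with hypothetical names. `DFADelimiter` here is a stand-in class, not the real Daffodil type, and `Delims`, `forLevel`, and `hasDelims` are invented for this example.

```scala
object Delims {
  // Stand-in for the real delimiter type (assumption).
  final class DFADelimiter(val text: String)

  // Allocated once and shared, so pushing an "empty" level onto the
  // delimiter stack allocates no array and needs no ClassTag lookup.
  val Empty: Array[DFADelimiter] = Array.empty[DFADelimiter]

  // Hypothetical helper: reuse the shared array when there is nothing to hold.
  def forLevel(found: List[String]): Array[DFADelimiter] =
    if (found.isEmpty) Empty
    else found.map(new DFADelimiter(_)).toArray

  // Reads the array's length field directly rather than going through the
  // ArrayOps wrapper that .isEmpty and .size use.
  def hasDelims(a: Array[DFADelimiter]): Boolean = a.length != 0
}
```

Sharing one empty array is only safe because a zero-length array has no elements to mutate; callers must not rely on each level having its own distinct instance.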
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
