mbeckerle opened a new pull request #30: Performance improvements around FormatInfo change.
URL: https://github.com/apache/incubator-daffodil/pull/30

Flattened out EvaluatableBase since it was only used in one place. Simplified methods, e.g., apply() is gone. Inlined simple methods.

Combined CharsetEv and CheckCharsetEv into one so only one evaluation is needed.

Introduced BitsCharset, which is NOT derived from Java's java.nio.charset.Charset because that class has a lot of final methods. We need to be able to create a proxy charset that delegates to an existing Java Charset. Because of all the final methods in Java's Charset, there is no way to do this. So instead we have Daffodil's own BitsCharset, which implements an API very close to that of Java's Charset. But BitsCharset is NOT derived from Java's Charset, which is fine because we are the only ones calling it anyway. A BitsCharset implements the additional bits-oriented methods on itself and on the encoders and decoders it creates.

NBitsWidthCharset is a BitsCharset. NonByteSizedCharset is a Java Charset. Turns out we need both: NonByteSizedCharset depends on the Java Charset framework. It implements only decodeLoop(), and depends on Java Charset methods to call it. The NonByteSized stuff is implemented with this pattern in mind, so we need BitsCharset, which is NOT a Java Charset, and we need NonByteSizedCharset, which IS a Java Charset.

Lastly: DFDLCharset is now a container for a BitsCharset, or really for the name thereof. Probably BitsCharset and DFDLCharset can be merged together so that all BitsCharsets are serializable. (Issue1: Merge DFDLCharset and BitsCharset?)

JavaBitsCharset implements BitsCharset by delegation to a Java Charset. NBitsWidthCharset implements BitsCharset directly in terms of a NonByteSizedCharset, which actually implements Java Charset. (Issue2: Can we clarify the names here? These classes are tightly coupled. Really these are NBitsWidth_BitsCharset and NBitsWidth_JavaCharset.)

There is an issue with test mixed_encodings2 in ContentFramingProps.tdml:

<element A - a single 7 bit character - LSBF>
<element B - a UTF-8 character so has bitOrder MSBF, also specifies alignment = 8 bits>

The question is whether alignment = 8 should be enough, and whether the alignment should be performed using the LSBF bit order that we need in order to get to the byte boundary so we can change bitOrder, or whether the alignment region for B has to be rendered in MSBF bit order. We want to align to 8 bits BEFORE we change the bitOrder. I.e., alignment is really about wherever the prior term left us as far as bitOrder.

Added methods to allow us to skip mandatory alignment calculations.

Improved performance by caching FormatInfo members in the PState/UState. Caches are cleared on every parse1 or unparse1 call, so the FormatInfo members are computed only once per parse or unparse.

DAFFODIL-1876
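
To illustrate the delegation idea behind BitsCharset and JavaBitsCharset, here is a minimal Java sketch. It is not Daffodil's actual code (which is Scala): the method names, including bitWidthOfACodeUnit, and the fixed width of 8 are assumptions made for illustration. It only shows how a Charset-like interface that does not extend java.nio.charset.Charset can wrap an existing Java Charset, sidestepping final methods such as Charset.name().

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class BitsCharsetSketch {

    /** Hypothetical stand-in for BitsCharset: Charset-like, but NOT a java Charset. */
    public interface BitsCharset {
        String name();
        int bitWidthOfACodeUnit(); // assumed name for a bits-oriented addition
        CharsetDecoder newDecoder();
        CharsetEncoder newEncoder();
    }

    /** Implements BitsCharset purely by delegating to an existing java Charset. */
    public static final class JavaBitsCharset implements BitsCharset {
        private final Charset delegate;

        public JavaBitsCharset(Charset delegate) {
            this.delegate = delegate;
        }

        @Override public String name() { return delegate.name(); }

        // Assumption for this sketch: the wrapped charset uses 8-bit code units.
        @Override public int bitWidthOfACodeUnit() { return 8; }

        @Override public CharsetDecoder newDecoder() { return delegate.newDecoder(); }

        @Override public CharsetEncoder newEncoder() { return delegate.newEncoder(); }
    }

    public static void main(String[] args) throws Exception {
        BitsCharset cs = new JavaBitsCharset(StandardCharsets.UTF_8);
        // Decode the two bytes of "hi" through the delegated decoder.
        CharBuffer decoded = cs.newDecoder().decode(ByteBuffer.wrap(new byte[] {0x68, 0x69}));
        System.out.println(cs.name() + " " + cs.bitWidthOfACodeUnit() + " " + decoded);
    }
}
```

Because callers only ever see the BitsCharset interface, a non-byte-sized implementation could sit behind the same interface without being a java Charset itself.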
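
For the mixed_encodings2 case, the size of the alignment region can be worked out with simple arithmetic. The Java sketch below assumes element A starts at bit position 0, so element B would start at 0-based bit position 7; the class and method names are hypothetical. It computes only how many bits are skipped and says nothing about which bit order those bits are rendered in, which is the open question.

```java
public class AlignmentSketch {

    /** Bits to skip from a 0-based bit position to reach the next multiple of alignmentBits. */
    public static long bitsToSkip(long bitPos0b, int alignmentBits) {
        long remainder = bitPos0b % alignmentBits;
        return remainder == 0 ? 0 : alignmentBits - remainder;
    }

    public static void main(String[] args) {
        // After a single 7-bit character, one more bit reaches the byte boundary.
        System.out.println(bitsToSkip(7, 8));
        // Already on a byte boundary: nothing to skip.
        System.out.println(bitsToSkip(8, 8));
    }
}
```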
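
The caching of FormatInfo members can be pictured roughly as follows. This Java sketch is an assumption about the general shape only, not the code in this pull request: the Cached class and its methods are hypothetical names. A value is evaluated on first use, reused for the rest of the call, and cleared at the start of the next parse1 or unparse1.

```java
import java.util.function.Supplier;

public class FormatInfoCacheSketch {

    /** Holds one lazily evaluated value until clear() is called. */
    public static final class Cached<T> {
        private final Supplier<T> evaluate;
        private T value;
        private boolean filled;

        public Cached(Supplier<T> evaluate) {
            this.evaluate = evaluate;
        }

        /** Evaluates on first use; later calls return the stored value. */
        public T get() {
            if (!filled) {
                value = evaluate.get();
                filled = true;
            }
            return value;
        }

        /** Called at the start of each (hypothetical) parse1 or unparse1. */
        public void clear() {
            value = null;
            filled = false;
        }
    }

    public static void main(String[] args) {
        int[] evaluations = {0};
        Cached<String> encoding = new Cached<>(() -> {
            evaluations[0]++; // count how often the expensive evaluation runs
            return "UTF-8";
        });

        encoding.clear(); // first parse1 call begins
        encoding.get();
        encoding.get();   // served from the cache
        System.out.println(evaluations[0]); // 1

        encoding.clear(); // second parse1 call begins
        encoding.get();
        System.out.println(evaluations[0]); // 2
    }
}
```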
