[ https://issues.apache.org/jira/browse/DAFFODIL-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063318#comment-17063318 ]
Steve Lawrence commented on DAFFODIL-2293:
------------------------------------------

I think everything you need is contained in a single directory:

[daffodil-io/src/main/scala/org/apache/daffodil/processors/charset|https://github.com/apache/incubator-daffodil/tree/master/daffodil-io/src/main/scala/org/apache/daffodil/processors/charset]

You should be able to copy one of the {{USASCII*BitPacked.scala}} files to create a {{USASCII8BitPacked.scala}} file, and change it so that a code unit is 8 bits wide. Then modify {{DaffodilCharsetProvider.scala}} to add your new charset.

Some concerns I have:

1. The bulk of the functionality for these non-byte-size character sets is provided by {{BitsCharsetNonByteSize.scala}}. This file has at least one assumption (on line 68) that the width of a code unit is <= 7, but we need 8 here. It's possible that you can just bump this number to 8 and everything will just work. I'm not sure whether any other parts of that code rely on that assumption.

2. I think the code also assumes that a character is not made up of multiple bytes (e.g. UTF-8, UTF-16). If you want to support one of these, you'll probably need to make some additional changes. Hopefully you just need ASCII or ISO-8859-1 or something else that is strictly single-byte.
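
To make the numbers in the question concrete, here is a small Python sketch. It is not Daffodil code, and the helper names {{fill_bits}} and {{decode_code_units}} are made up for illustration. It assumes most-significant-bit-first order and that {{dfdl:alignment="8"}} is counted in bits, neither of which is stated in this thread. It shows how many bits precede driverId, how many bits would have to be skipped to reach an 8-bit boundary, and what 8-bit code units read from an arbitrary bit offset look like.

```python
# Field bit strings copied from the example data quoted in the issue.
STATE = "0101"                    # dfdl:length="4"
INDICATORS = "0000110000000001"   # dfdl:length="16"
V = "0001100100"                  # dfdl:length="10"
# The 130 bits the reporter listed for driverId, exactly as posted.
DRIVER_ID_AS_POSTED = "0000110001001100010011000100110001001100100011001000110010001100100011001100110011001100110011001100110100001101000011010000110100"


def fill_bits(position: int, alignment: int) -> int:
    """Number of bits to skip so that `position` becomes a multiple of `alignment`."""
    return -position % alignment


def decode_code_units(bits: str, start: int, count: int, width: int = 8) -> str:
    """Decode `count` fixed-width code units starting at bit offset `start`.

    `start` may be any bit position, not only a byte boundary.
    Assumption: most significant bit first, one code unit per character.
    """
    chars = []
    for i in range(count):
        unit = bits[start + i * width : start + (i + 1) * width]
        if len(unit) < width:
            raise ValueError("not enough bits left for a full code unit")
        chars.append(chr(int(unit, 2)))
    return "".join(chars)


# Bits consumed by the three fields before driverId: 4 + 16 + 10.
preceding = len(STATE) + len(INDICATORS) + len(V)
# Bits needed to reach the next multiple of 8 from that position.
skip = fill_bits(preceding, 8)

stream = STATE + INDICATORS + V + DRIVER_ID_AS_POSTED
# 128 bits of string data at 8 bits per character is 16 characters.
driver_id = decode_code_units(stream, preceding + skip, 128 // 8)

print(preceding, skip, repr(driver_id))
```

On the posted data this prints {{30 2 '1111222233334444'}}. So the 130 bits listed for driverId are consistent with 2 skipped bits followed by 16 eight-bit characters. Whether the real data is laid out that way is not confirmed here. A charset whose code units are 8 bits wide but not tied to byte boundaries would correspond to calling {{decode_code_units}} with a {{start}} that is not a multiple of 8.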

> Too many bits in xs:string
> --------------------------
>
>                 Key: DAFFODIL-2293
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2293
>             Project: Daffodil
>          Issue Type: Question
>            Reporter: Alexander Deutschmann
>            Priority: Major
>
> Hello everyone,
> I have the following schema:
> {code:xml}
> <xs:complexType name="statusReportDetails">
>   <xs:sequence>
>     <xs:element name="state" type="abc:stateenum" dfdl:length="4" />
>     <xs:element name="indicators" type="abc:indicators" dfdl:length="16" />
>     <xs:element name="v" type="v" dfdl:length="10" />
>     <xs:element name="driverId" type="xs:string" dfdl:lengthKind="explicit" dfdl:length="128" dfdl:alignment="8" />
>   </xs:sequence>
> </xs:complexType>
> {code}
> And the related bitstream:
> {code:java}
> 0101 -> Enum
> 0000110000000001 -> indicators
> 0001100100 -> v
> 0000110001001100010011000100110001001100100011001000110010001100100011001100110011001100110011001100110100001101000011010000110100 -> driverId
> {code}
> The driverId has 130 bits and not the 128 bits defined in the schema.
> My question is: where do the first two bits come from? I assume it is a configuration mistake or something like that.
> I hope someone can help me.
> Thank you.
> Alex

--
This message was sent by Atlassian Jira
(v8.3.4#803005)