[ 
https://issues.apache.org/jira/browse/DAFFODIL-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063318#comment-17063318
 ] 

Steve Lawrence commented on DAFFODIL-2293:
------------------------------------------

I think everything you need is contained in a single directory: 
[daffodil-io/src/main/scala/org/apache/daffodil/processors/charset|https://github.com/apache/incubator-daffodil/tree/master/daffodil-io/src/main/scala/org/apache/daffodil/processors/charset]

You should be able to copy one of the {{USASCII*BitPacked.scala}} files to 
create a {{USASCII8BitPacked.scala}} file, then make the appropriate changes so 
that its code units are 8 bits wide. After that, modify 
{{DaffodilCharsetProvider.scala}} to add your new charset.

Some concerns I have:
1. The bulk of the functionality for these non-byte-size character sets is 
provided by {{BitsCharsetNonByteSize.scala}}. This file has at least one 
assumption (on line 68) that the width of a code unit is <= 7, but we need 8 
here. It's possible that you can just bump this number to 8 and everything 
will just work. I'm not sure whether any other parts of that code rely on that 
assumption.
2. I think the code also assumes that a character is not made up of 
multiple bytes (as it can be in e.g. UTF-8 or UTF-16). If you want to support 
one of these, you'll probably need to make some additional changes. Hopefully 
you just need ASCII or ISO-8859-1 or something else that is strictly single-byte. 
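
To illustrate what bumping the code unit width from 7 to 8 means, here is a minimal standalone sketch in Python. It is not Daffodil code: {{decode_packed}}, {{ASCII_7BIT}} and {{LATIN1_8BIT}} are hypothetical names made up for this example, and it assumes most-significant-bit-first order and a plain lookup table, which may not match how {{BitsCharsetNonByteSize.scala}} actually works.

```python
def decode_packed(bits: str, width: int, table: str) -> str:
    """Decode a string of '0'/'1' characters as fixed-width code units.

    Hypothetical helper for illustration only; assumes MSB-first bit order.
    """
    # The analogue of the "<= 7" assumption, relaxed to allow 8-bit units.
    if not 1 <= width <= 8:
        raise ValueError("code unit width must be between 1 and 8 bits")
    # One table entry per possible code unit value.
    if len(table) != 1 << width:
        raise ValueError("lookup table must have 2**width entries")
    if len(bits) % width != 0:
        raise ValueError("bit count is not a multiple of the code unit width")
    # Each code unit maps to exactly one character (no multi-unit characters).
    return "".join(
        table[int(bits[i:i + width], 2)] for i in range(0, len(bits), width)
    )


# Lookup tables: 128 entries for 7-bit units, 256 entries for 8-bit units.
ASCII_7BIT = "".join(chr(c) for c in range(128))
LATIN1_8BIT = "".join(chr(c) for c in range(256))

# Two '1' characters (0x31) followed by two '2' characters (0x32), 8 bits each.
bits = "00110001" * 2 + "00110010" * 2
print(decode_packed(bits, 8, LATIN1_8BIT))  # prints 1122
```

The one-character-per-code-unit lookup is also why concern 2 matters: a multi-byte encoding such as UTF-8 cannot be expressed as a single table indexed by one code unit.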

> Too many bits in xs:string
> --------------------------
>
>                 Key: DAFFODIL-2293
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2293
>             Project: Daffodil
>          Issue Type: Question
>            Reporter: Alexander Deutschmann
>            Priority: Major
>
> Hello everyone,
> I have the following schema:
> {code:xml}
> <xs:complexType name="statusReportDetails">
>               <xs:sequence>
>                       <xs:element name="state" type="abc:stateenum" 
> dfdl:length="4" />
>                       <xs:element name="indicators" type="abc:indicators"  
> dfdl:length="16" />
>                       <xs:element name="v" type="v" dfdl:length="10" />
>                       <xs:element name="driverId" type="xs:string" 
> dfdl:lengthKind="explicit" dfdl:length="128" dfdl:alignment="8" />
>               </xs:sequence>
>       </xs:complexType>
> {code}
> And the related bitstream:
> {code:java}
> 0101 -> Enum
> 0000110000000001 -> indicators
> 0001100100 -> v
> 0000110001001100010011000100110001001100100011001000110010001100100011001100110011001100110011001100110100001101000011010000110100
>  -> driverId
> {code}
> The driverId has 130 bits, not the 128 bits that are defined in the 
> schema. 
> My question is: where do the first two bits come from? I assume it is a 
> configuration mistake or something like that.
> I hope someone can help me.
> Thank you.
> Alex



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
