[ https://issues.apache.org/jira/browse/DAFFODIL-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17130689#comment-17130689 ]
Mike Beckerle commented on DAFFODIL-2351:
-----------------------------------------

See also DAFFODIL-1927, the extensible layer transforms feature. There are
numerous other tickets about layering; search for tickets on "layer" to find
them.

> layer improvements to enable JPEG format
> ----------------------------------------
>
>                 Key: DAFFODIL-2351
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2351
>             Project: Daffodil
>          Issue Type: Bug
>          Components: Back End
>    Affects Versions: 2.6.0
>            Reporter: Mike Beckerle
>            Priority: Major
>             Fix For: 3.0.0
>
>
> The JPEG format has "Entropy Coded Segments" (ECS).
> Each is terminated by the byte pattern that indicates the start of the
> following JPEG segment, so we need the ability to isolate these bytes by
> finding, but not consuming, the start of the next segment.
> Currently the only way to do this is with lengthKind='pattern' and a regex
> with lookahead. This is problematic because regex scanning is implemented
> with buffers that are gradually enlarged as needed. The buffers cannot be
> made big enough, so this will simply not work for JPEGs with very large
> images (the JPEG2000 format has the same problem and holds even larger
> images).
> The ability to define a layer that contains data up to, but not including, a
> particular marker is needed. In JPEG the marker is a 2-byte sequence.
> In addition, for JPEG, these ECS segments are "byte stuffed". This is an
> escaping scheme: if the first byte of the marker is found in the data, a zero
> byte is inserted after it so that it does not match the marker. This inserted
> zero needs to be removed from the data on parsing, and re-inserted on
> unparsing, by the layer transform.
> Finally, the implementation of this feature must not require staging a copy
> of the entire contents of the ECS segment in any array, so long as the
> ultimate destination of the bytes is a DFDL BLOB (an extension to DFDL v1.0).
> These layers need to allow streaming the bytes of the ECS segment out to an
> external BLOB (e.g., a BLOB file) without the need to create any object in
> the Daffodil process memory that is the size of the whole ECS segment.
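
As a rough illustration of the layer behaviour the ticket asks for, here is a
minimal Python sketch. It is not Daffodil code and does not use the Daffodil
layer API; the names PushbackReader, copy_ecs and stuff_ecs are hypothetical.
It assumes the simplified rule stated in the ticket: 0xFF followed by 0x00 is
a stuffed data byte, and 0xFF followed by any other byte is the start of the
next marker. Real JPEG streams also contain restart markers and fill bytes,
which this sketch ignores. Memory use is bounded by the chunk size, not by the
size of the ECS segment.

```python
import io


class PushbackReader:
    """Hypothetical helper: a byte stream that lets callers un-read bytes."""

    def __init__(self, raw):
        self._raw = raw
        self._pending = b""

    def read(self, n):
        # Serve previously un-read bytes first, then fall through to the source.
        if self._pending:
            out, self._pending = self._pending[:n], self._pending[n:]
            return out
        return self._raw.read(n)

    def unread(self, data):
        # Put bytes back so the next read() sees them again, in order.
        self._pending = data + self._pending


def copy_ecs(src, sink, chunk_size=4096):
    """Parse direction: copy un-stuffed ECS bytes from src to sink.

    Stops in front of the next marker and leaves the marker unread in src.
    Assumption: 0xFF 0x00 is stuffed data, 0xFF <anything else> is a marker.
    """
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return  # end of input without a marker
        i = chunk.find(b"\xff")
        if i < 0:
            sink.write(chunk)  # no 0xFF in this chunk, pass it straight through
            continue
        sink.write(chunk[:i])
        rest = chunk[i:]
        if len(rest) < 2:
            # 0xFF was the last byte of the chunk, so fetch its successor.
            more = src.read(1)
            if not more:
                raise ValueError("truncated input: lone 0xFF at end of data")
            rest += more
        if rest[1] == 0x00:
            sink.write(b"\xff")    # drop the stuffed zero
            src.unread(rest[2:])   # keep scanning the remainder
        else:
            src.unread(rest)       # marker found: do not consume it
            return


def stuff_ecs(src, sink, chunk_size=4096):
    """Unparse direction: re-insert a zero byte after every 0xFF data byte."""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return
        sink.write(chunk.replace(b"\xff", b"\xff\x00"))


if __name__ == "__main__":
    payload = b"\x01\xff\x02"
    stuffed = io.BytesIO()
    stuff_ecs(io.BytesIO(payload), stuffed)
    # Append a 2-byte marker and some following data after the ECS bytes.
    stream = PushbackReader(io.BytesIO(stuffed.getvalue() + b"\xff\xd9rest"))
    blob = io.BytesIO()  # stands in for an external BLOB file
    copy_ecs(stream, blob)
    assert blob.getvalue() == payload
    assert stream.read(2) == b"\xff\xd9"
```

In this sketch the sink can be any writable binary stream, such as a file
opened for the BLOB, so the ECS bytes are never held in memory all at once.
After copy_ecs returns, the next read on the PushbackReader yields the marker
itself, which is the "find but do not consume" behaviour described above.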