[
https://issues.apache.org/jira/browse/DAFFODIL-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917998#comment-16917998
]
Michael Beckerle commented on DAFFODIL-2194:
--------------------------------------------
Good points. I think one thing to consider is perhaps we don't allow points of
uncertainty around blobs at all. I.e., you must have occursCountKind="explicit"
so there are no optional element PoUs, and we require choices to be
choice-by-dispatch. Then there can be no backtracking. It would be an SDE to
have a blob inside of a PoU.
Note that in a choice PoU with multiple branches, the BLOB could still appear in
the last branch, since by then there is no more backtracking.
A BLOB could also appear in a speculative branch if it comes *after* a discriminator.
There may be some other caveats, but it might be possible to just say PoU and
BLOB don't mix.
> buffered data output stream has a chunk limit of 2GB
> ----------------------------------------------------
>
> Key: DAFFODIL-2194
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2194
> Project: Daffodil
> Issue Type: Bug
> Components: Back End
> Reporter: Steve Lawrence
> Assignee: Steve Lawrence
> Priority: Major
> Fix For: 2.5.0
>
>
> A buffered data output stream is backed by a growable ByteArrayOutputStream,
> which can only grow to 2GB in size. So if we ever try to write more than 2GB
> to a buffered output stream during unparse (very possible with large blobs),
> we'll get an OutOfMemoryError.
> One potential solution is to be aware of the size of a ByteArrayOutputStream
> when buffering output and automatically create a split when it gets to 2GB in
> size. This will still require a ton of memory since we're buffering these in
> memory, but we'll at least be able to unparse more than 2GB of continuous
> data.
> Note that we should still be able to unparse more than 2GB of data total, as
> long as there is no single buffer that's more than 2GB.
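The split-on-limit idea in the description could be sketched roughly as follows. This is a hypothetical illustration, not Daffodil's actual buffering code: the class and method names are invented, and a tiny chunk limit is used here where the real limit would be near Integer.MAX_VALUE.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: buffer output across a list of ByteArrayOutputStreams,
// starting a new one (a "split") before any single backing array reaches the
// chunk limit. Total buffered data can then exceed the limit even though no
// individual buffer does.
class ChunkedBufferingOutputStream extends OutputStream {
    private final int chunkLimit; // would be ~2GB (Integer.MAX_VALUE) in practice
    private final List<ByteArrayOutputStream> chunks = new ArrayList<>();
    private ByteArrayOutputStream current;

    ChunkedBufferingOutputStream(int chunkLimit) {
        this.chunkLimit = chunkLimit;
        this.current = new ByteArrayOutputStream();
        chunks.add(current);
    }

    @Override
    public void write(int b) {
        if (current.size() >= chunkLimit) split();
        current.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) {
        // Fill the current chunk up to the limit, then split and continue.
        while (len > 0) {
            int room = chunkLimit - current.size();
            if (room == 0) { split(); room = chunkLimit; }
            int n = Math.min(room, len);
            current.write(b, off, n);
            off += n;
            len -= n;
        }
    }

    // Start a fresh chunk so no single backing array exceeds chunkLimit.
    private void split() {
        current = new ByteArrayOutputStream();
        chunks.add(current);
    }

    // Total buffered bytes across all chunks (may exceed chunkLimit).
    long totalSize() {
        long total = 0;
        for (ByteArrayOutputStream c : chunks) total += c.size();
        return total;
    }

    int chunkCount() { return chunks.size(); }

    // Deliver all chunks, in order, to the real output stream.
    void writeTo(OutputStream out) throws IOException {
        for (ByteArrayOutputStream c : chunks) c.writeTo(out);
    }
}
```

For example, with a chunk limit of 4 bytes, writing 10 bytes would yield three chunks of sizes 4, 4, and 2, and writeTo would replay all 10 bytes in order. The memory-pressure caveat from the description still applies: everything is held in RAM until delivered.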
--
This message was sent by Atlassian Jira
(v8.3.2#803003)