[
https://issues.apache.org/jira/browse/DAFFODIL-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917969#comment-16917969
]
Michael Beckerle commented on DAFFODIL-2194:
--------------------------------------------
True enough. If I just have a protocol that works like HTTP, the entire
message, even though it is just an array of lines of text, must be measured
for length because the header has to carry the total length. So even without
any blobs, a 2.1 GByte page of small objects would blow past this limit even
if the ONLY suspension was for this single header length field.
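To make that concrete, here is a minimal, hypothetical sketch (plain Scala,
not Daffodil internals) of why a total-length header forces the whole body to
be buffered before the first header byte can be written:

    import java.io.{ByteArrayOutputStream, DataOutputStream, OutputStream}

    // Hypothetical illustration: a format whose header carries the total
    // body length cannot emit the header until the entire body has been
    // unparsed, so the body is buffered in memory first.
    object LengthPrefixedWriter {
      def write(out: OutputStream)(writeBody: OutputStream => Unit): Unit = {
        val body = new ByteArrayOutputStream() // grows toward the ~2GB array ceiling
        writeBody(body)                        // whole body held in memory
        val data = new DataOutputStream(out)
        data.writeInt(body.size())             // header: total length, known only now
        body.writeTo(out)                      // then the buffered body
        data.flush()
      }
    }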
So this fix is really two fixes: one to fix the 2 GByte boundary on output
buffer sizes, and the other to avoid bringing blobs into memory at all.
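For the second fix, the idea (sketched here under assumed names, not the
actual patch) would be to copy blob bytes from disk straight to the output
instead of materializing them in a buffer:

    import java.io.OutputStream
    import java.nio.file.{Files, Path}

    // Assumed approach for the blob fix: stream the blob in small transfers
    // rather than loading it into a ByteArrayOutputStream.
    def streamBlob(blob: Path, out: OutputStream): Unit = {
      val in = Files.newInputStream(blob)
      try {
        val buf = new Array[Byte](64 * 1024)
        var n = in.read(buf)
        while (n >= 0) {
          out.write(buf, 0, n)
          n = in.read(buf)
        }
      } finally {
        in.close()
      }
    }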
> buffered data output stream has a chunk limit of 2GB
> ----------------------------------------------------
>
> Key: DAFFODIL-2194
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2194
> Project: Daffodil
> Issue Type: Bug
> Components: Back End
> Reporter: Steve Lawrence
> Assignee: Steve Lawrence
> Priority: Major
> Fix For: 2.5.0
>
>
> A buffered data output stream is backed by a growable ByteArrayOutputStream,
> which can only grow to 2GB in size. So if we ever try to write more than 2GB
> to a buffered output stream during unparse (very possible with large blobs),
> we'll get an OutOfMemoryError.
> One potential solution is to be aware of the size of a ByteArrayOutputStream
> when buffering output and automatically create a split when it gets to 2GB in
> size. This will still require a ton of memory since we're buffering these in
> memory, but we'll at least be able to unparse more than 2GB of continuous
> data.
> Note that we should still be able to unparse more than 2GB of data total, as
> long as there's no single buffer that's more than 2GB.
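Regarding the split-at-2GB idea in the description above, here is a minimal
sketch of one way it could work (assumed design and names, not the actual
Daffodil fix): chain ByteArrayOutputStream chunks so that no single backing
array ever crosses the JVM array limit.

    import java.io.{ByteArrayOutputStream, OutputStream}
    import scala.collection.mutable.ArrayBuffer

    // Assumed design: start a fresh chunk whenever the current one reaches
    // the limit, so each backing array stays under the JVM's ~2GB ceiling.
    // Data is still buffered in memory; only single-chunk overflow is avoided.
    class ChunkedBufferOutputStream(chunkLimit: Int = Int.MaxValue - 8)
      extends OutputStream {

      private val chunks = ArrayBuffer(new ByteArrayOutputStream())
      private def current = chunks.last

      override def write(b: Int): Unit = {
        if (current.size() >= chunkLimit) chunks += new ByteArrayOutputStream()
        current.write(b)
      }

      override def write(b: Array[Byte], off: Int, len: Int): Unit = {
        // Split large writes across chunks so each stays under chunkLimit.
        var offset = off
        var remaining = len
        while (remaining > 0) {
          if (current.size() >= chunkLimit) chunks += new ByteArrayOutputStream()
          val room = math.min(remaining, chunkLimit - current.size())
          current.write(b, offset, room)
          offset += room
          remaining -= room
        }
      }

      def totalSize: Long = chunks.iterator.map(_.size().toLong).sum

      def writeTo(out: OutputStream): Unit = chunks.foreach(_.writeTo(out))
    }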
--
This message was sent by Atlassian Jira
(v8.3.2#803003)