[
https://issues.apache.org/jira/browse/DRILL-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940400#comment-14940400
]
ASF GitHub Bot commented on DRILL-3874:
---------------------------------------
Github user jaltekruse commented on the pull request:
https://github.com/apache/drill/pull/181#issuecomment-144846430
Overall, the changes look solid.
I am thinking it may be worth trying to remove the bodies of the existing
getBufferSize() methods and replace them with calls into these new methods,
passing in the current valueCount, which is expected to be set by a call to
setValueCount(int) before getBufferSize() is used. It seems odd to have two
different ways to compute the same values, although I see the reason for it:
previously we were using writerIndex() for all of the fixed-length
primitives, and that is also set in setValueCount().
I don't consider it a must-fix before check-in; I just wanted to know whether
you had thought about this and ruled it out.
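To make the suggestion concrete, here is a minimal sketch of the refactor on a
simplified fixed-width vector. The class and field names are illustrative, not
the actual template-generated Drill code, and it assumes the new method added
in the pull request takes the value count as a parameter (called
getBufferSizeFor here for illustration):

    // Simplified stand-in for a fixed-width vector such as Float8Vector
    // (8 bytes per value). Illustrative only, not the generated Drill code.
    class SimpleFloat8Vector {
      private static final int VALUE_WIDTH = 8;
      private int valueCount;

      void setValueCount(int valueCount) {
        this.valueCount = valueCount;
        // the real vectors also advance the buffer's writerIndex here
      }

      // New-style method: size needed for an arbitrary value count,
      // computed without consulting writerIndex().
      int getBufferSizeFor(int valueCount) {
        return valueCount == 0 ? 0 : valueCount * VALUE_WIDTH;
      }

      // Existing method, with its body replaced by a call into the new
      // method so the size is computed in exactly one place.
      int getBufferSize() {
        return getBufferSizeFor(valueCount);
      }
    }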
> flattening large JSON objects consumes too much direct memory
> -------------------------------------------------------------
>
> Key: DRILL-3874
> URL: https://issues.apache.org/jira/browse/DRILL-3874
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.1.0
> Reporter: Chris Westin
> Assignee: Chris Westin
>
> A JSON record has a field whose value is an array with 20,000 elements; the
> record's size is 4MB. A select is used to flatten this. The query profile
> reports that the peak memory utilization was 8GB, most of it used by the
> flatten.