[ https://issues.apache.org/jira/browse/DRILL-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940700#comment-14940700 ]
ASF GitHub Bot commented on DRILL-3874:
---------------------------------------
Github user cwestin commented on the pull request:
https://github.com/apache/drill/pull/181#issuecomment-144898125
Parth:
Re ObjectVector: I don't know what that's for; I just followed the pattern:
getBufferSize() already throws that exception.
Re OUTPUT_MEMORY_LIMIT: what do you think? I tend to avoid adding more
knobs, but I can easily do that if you like (with the current 512MB as the
default). Let me know soon; I'm about to kick off testing on Jason's suggested
replacement of the getBufferSize() implementations with calls to
getBufferSizeFor(). The problem I see here is that it will affect all
flatten()s, whether they need it or not. And this isn't really the long-term
solution anyway; the real fix is to add projection capabilities so that we're
not passing the original record through like this.
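To illustrate the kind of limiting being discussed, here is a minimal sketch of capping an output batch's row count by dividing a configurable memory limit by an estimated per-row size. The class and constant names (OutputBatchSizer, OUTPUT_MEMORY_LIMIT) are hypothetical, not Drill's actual API; the 512MB figure is the default mentioned in the thread.

```java
// Hypothetical sketch, not Drill's actual implementation: size an output
// batch so its estimated memory footprint stays under a configurable limit.
public class OutputBatchSizer {
    // Illustrative default taken from the discussion above (512 MB).
    static final long OUTPUT_MEMORY_LIMIT = 512L * 1024 * 1024;
    // Upper bound on rows per batch, regardless of how small rows are.
    static final int MAX_ROWS = 64 * 1024;

    /**
     * Rows per output batch such that rows * bytesPerRowEstimate stays
     * under memoryLimit, clamped to [1, MAX_ROWS].
     */
    static int rowsPerBatch(long memoryLimit, long bytesPerRowEstimate) {
        if (bytesPerRowEstimate <= 0) {
            return MAX_ROWS; // no estimate available; fall back to the row cap
        }
        long rows = memoryLimit / bytesPerRowEstimate;
        return (int) Math.max(1, Math.min(rows, MAX_ROWS));
    }

    public static void main(String[] args) {
        // If every output row still carries the original 4 MB record
        // (the pass-through problem described above), only 128 such rows
        // fit in a 512 MB budget.
        System.out.println(rowsPerBatch(OUTPUT_MEMORY_LIMIT, 4L * 1024 * 1024)); // 128
    }
}
```

This also shows why a per-row estimate alone doesn't solve the problem: as long as the original wide record is passed through with each flattened row, the budget divides by the full record size, which is what projection-push-down would avoid.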
> flattening large JSON objects consumes too much direct memory
> -------------------------------------------------------------
>
> Key: DRILL-3874
> URL: https://issues.apache.org/jira/browse/DRILL-3874
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.1.0
> Reporter: Chris Westin
> Assignee: Chris Westin
>
> A JSON record has a field whose value is an array with 20,000 elements; the
> record's size is 4MB. A select is used to flatten this. The query profile
> reports that the peak memory utilization was 8GB, most of it used by the
> flatten.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)