[
https://issues.apache.org/jira/browse/ARROW-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated ARROW-1943:
----------------------------------
Labels: pull-request-available (was: )
> Handle setInitialCapacity() for deeply nested lists of lists
> ------------------------------------------------------------
>
> Key: ARROW-1943
> URL: https://issues.apache.org/jira/browse/ARROW-1943
> Project: Apache Arrow
> Issue Type: Bug
> Reporter: Siddharth Teotia
> Assignee: Siddharth Teotia
> Labels: pull-request-available
>
> The current implementation of setInitialCapacity() multiplies the capacity
> by a factor of 5 for every level of list nesting.
> So if the schema is LIST (LIST (LIST (LIST (LIST (LIST (LIST (BIGINT)))))))
> and we start with an initial capacity of 128, we end up throwing an
> OversizedAllocationException from the BigIntVector: the capacity grows by a
> factor of 5 at every level, so by the time we reach the inner scalar vector
> that actually stores the data, we are well over the maximum size limit per
> vector (1 MB).
> We saw this problem in Dremio when we failed to read deeply nested JSON data.
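> For reference, a rough sketch of the arithmetic (a hypothetical snippet, not
> Arrow's actual code; the factor of 5, the starting capacity of 128, and the
> 8-byte BIGINT width are taken from the description above):
>
>     // Sketch: how a per-level factor of 5 inflates the innermost capacity.
>     public class NestedListCapacitySketch {
>         public static void main(String[] args) {
>             long capacity = 128;                      // capacity passed to setInitialCapacity()
>             for (int level = 0; level < 7; level++) { // 7 nested LIST levels in the schema above
>                 capacity *= 5;                        // current code multiplies by 5 per level
>             }
>             long bigIntBytes = capacity * 8L;         // BIGINT values are 8 bytes each
>             System.out.println(capacity);             // 10000000 values
>             System.out.println(bigIntBytes);          // 80000000 bytes, far above the ~1 MB per-vector limit
>         }
>     }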
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)