[ https://issues.apache.org/jira/browse/ARROW-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-1943:
----------------------------------
    Labels: pull-request-available  (was: )

> Handle setInitialCapacity() for deeply nested lists of lists
> ------------------------------------------------------------
>
>                 Key: ARROW-1943
>                 URL: https://issues.apache.org/jira/browse/ARROW-1943
>             Project: Apache Arrow
>          Issue Type: Bug
>            Reporter: Siddharth Teotia
>            Assignee: Siddharth Teotia
>              Labels: pull-request-available
>
> The current implementation of setInitialCapacity() multiplies the capacity by 
> a factor of 5 for every level of nesting we descend into a list.
> So if the schema is LIST(LIST(LIST(LIST(LIST(LIST(LIST(BIGINT))))))) and we 
> start with an initial capacity of 128, we end up throwing 
> OversizedAllocationException from the BigIntVector, because at every level we 
> increased the capacity by a factor of 5, and by the time we reached the inner 
> scalar vector that actually stores the data, we were well over the max size 
> limit per vector (1 MB).
> We saw this problem in Dremio when we failed to read deeply nested JSON data.
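
To make the blow-up concrete: 128 records multiplied by 5 at each of the 7 list
levels gives 128 * 5^7 = 10,000,000 values, i.e. roughly 80 MB for an 8-byte
BIGINT, far beyond the 1 MB per-vector limit mentioned above. The snippet below
is only an illustrative sketch of that arithmetic (it does not use Arrow's
actual setInitialCapacity() code); the factor of 5, the 128 starting capacity,
and the 1 MB limit are taken from the description, while the class and variable
names are made up for the example.

    // Illustrative sketch (not Arrow's implementation): shows how a per-level
    // multiplier of 5 pushes the innermost vector past a 1 MB allocation limit.
    public class NestedCapacitySketch {
        public static void main(String[] args) {
            final int initialCapacity = 128;          // starting record count from the report
            final int perLevelFactor = 5;             // per-level multiplier from the report
            final int nestedLevels = 7;               // LIST(...LIST(BIGINT)...) with 7 list levels
            final long bigIntWidth = 8;               // BIGINT is 8 bytes per value
            final long maxBytesPerVector = 1L << 20;  // 1 MB limit mentioned in the report

            long capacity = initialCapacity;
            for (int level = 1; level <= nestedLevels; level++) {
                capacity *= perLevelFactor;
                System.out.printf("level %d -> value capacity %,d%n", level, capacity);
            }
            long bytes = capacity * bigIntWidth;
            System.out.printf("inner BigIntVector would need %,d bytes (limit %,d): %s%n",
                bytes, maxBytesPerVector,
                bytes > maxBytesPerVector ? "OversizedAllocationException" : "ok");
        }
    }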



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
