[
https://issues.apache.org/jira/browse/DRILL-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Rogers updated DRILL-5100:
-------------------------------
Issue Type: Bug (was: Sub-task)
Parent: (was: DRILL-5080)
> External Sort does not manage memory requirements of a schema change
> --------------------------------------------------------------------
>
> Key: DRILL-5100
> URL: https://issues.apache.org/jira/browse/DRILL-5100
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.8.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
>
> The external sort is given a fixed amount of memory to hold buffered
> in-memory batches prior to spilling. External sort also handles certain
> schema changes when union vectors are enabled. When a schema change occurs,
> existing vectors are coerced into the new schema format, perhaps replacing an
> existing vector with a new union vector.
> This conversion requires (direct) memory. If it occurs when the external sort
> has already nearly filled its in-memory buffer, the conversion can exceed the
> memory limit and cause the sort to fail.
> The following shows the allocated memory before and after each schema change
> in the unit test {{TestExternalSort.testNumericTypes}}:
> {code}
> Before: 134144
> After: 150528
> Before: 150528
> After: 166912
> {code}
> Union vectors appear to be larger than the original BIGINT vectors: in both
> cases above, the allocation grew by 16384. External sort must anticipate this
> growth and, if necessary, spill to ensure sufficient room exists for the new,
> larger vectors.
> Further, the conversion process itself requires that two copies of each
> vector be in memory: the original and the new, converted one. The external
> sort does not check to ensure this much working memory is available, leading
> to potential OOM errors during each vector conversion.
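
The check described above could be sketched roughly as follows. This is a minimal illustration only, not Drill's actual code: the class name, method names, the memory limit, and the individual vector sizes (32768 and 49152) are all assumptions chosen so that the totals match the first Before/After pair reported in the description. It shows that the peak occurs while both copies of a vector exist, so a conversion can exceed the limit even when the post-conversion total would fit.

```java
// Hypothetical sketch: these names are illustrative and are not part of Drill's API.
final class SchemaChangeMemoryCheck {

    private SchemaChangeMemoryCheck() {
    }

    /**
     * Peak allocation while one vector is being converted. The original
     * vector is still counted in allocatedBytes, and the converted copy
     * must be allocated alongside it.
     */
    static long peakBytesDuringConversion(long allocatedBytes, long convertedVectorBytes) {
        return Math.addExact(allocatedBytes, convertedVectorBytes);
    }

    /** Allocation once the original vector has been released. */
    static long bytesAfterConversion(long allocatedBytes,
                                     long originalVectorBytes,
                                     long convertedVectorBytes) {
        return Math.addExact(allocatedBytes - originalVectorBytes, convertedVectorBytes);
    }

    /** True if the sort should spill before attempting the conversion. */
    static boolean mustSpillBeforeConversion(long allocatedBytes,
                                             long memoryLimitBytes,
                                             long convertedVectorBytes) {
        return peakBytesDuringConversion(allocatedBytes, convertedVectorBytes) > memoryLimitBytes;
    }

    public static void main(String[] args) {
        long allocated = 134144L;       // "Before" value from the description
        long originalVector = 32768L;   // assumed size of the BIGINT vector
        long convertedVector = 49152L;  // assumed size of the union vector
        long limit = 160000L;           // assumed memory limit for the sort

        System.out.println("After conversion: "
                + bytesAfterConversion(allocated, originalVector, convertedVector));
        System.out.println("Peak during conversion: "
                + peakBytesDuringConversion(allocated, convertedVector));
        System.out.println("Must spill first: "
                + mustSpillBeforeConversion(allocated, limit, convertedVector));
    }
}
```

With these assumed numbers the post-conversion total is within the limit, but the peak during conversion is not, so the sketch would spill first. A real fix would also need to obtain the size of the converted vector before allocating it, which this sketch simply takes as an input.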
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)