[
https://issues.apache.org/jira/browse/DRILL-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Rogers updated DRILL-5688:
-------------------------------
Reviewer: Karthikeyan Manivannan
Description:
DRILL-5211 describes how Drill runs into OOM issues due to Drill's two
allocators: Netty and Unsafe. That JIRA also describes the solution: limit
vectors to 16 MB in length (with the eventual goal of limiting overall batch
size). DRILL-5517 added "size-aware" support to the column accessors, which were
created to parallel Drill's existing readers and writers. (The parallel
implementation ensures that we don't break existing code that uses the existing
mechanism, the same approach we took for the external sort.)
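To illustrate the size limit described above, here is a minimal sketch of a writer that refuses a value once the vector would exceed a byte cap. It is not Drill's implementation: the class SizeAwareIntWriter, its methods, and the plain int[] backing store are hypothetical names invented for this example, and only the 16 MB figure comes from the ticket.

```java
import java.util.Arrays;

/** Hypothetical sketch (not Drill code): an int column writer with a byte cap. */
final class SizeAwareIntWriter {
    /** The 16 MB per-vector limit mentioned in the ticket. */
    static final int MAX_VECTOR_BYTES = 16 * 1024 * 1024;

    private final int maxBytes;
    private int[] buffer;
    private int count;

    SizeAwareIntWriter(int maxBytes) {
        this.maxBytes = maxBytes;
        // Start small, but never allocate more than the cap allows.
        this.buffer = new int[Math.min(1024, maxBytes / Integer.BYTES)];
    }

    /** Writes the value, or returns false and writes nothing if the cap would be exceeded. */
    boolean write(int value) {
        long needed = (long) (count + 1) * Integer.BYTES;
        if (needed > maxBytes) {
            return false; // Caller should end the batch and start a new vector.
        }
        if (count == buffer.length) {
            // Double the buffer, clamped so it never grows past the cap.
            int maxValues = maxBytes / Integer.BYTES;
            int newLength = (int) Math.min((long) buffer.length * 2, maxValues);
            buffer = Arrays.copyOf(buffer, newLength);
        }
        buffer[count++] = value;
        return true;
    }

    int valueCount() { return count; }

    int byteSize() { return count * Integer.BYTES; }
}
```

A caller would construct the writer with SizeAwareIntWriter.MAX_VECTOR_BYTES and treat a false return from write() as the signal that the current batch is full.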
This ticket describes work to extend the column accessors to handle repeated
maps and lists. Key themes:
* Define a common metadata schema for use in this layer and the "result set
loader" of DRILL-5657. This schema layer builds on top of the existing schema
to add the kind of metadata needed here and by the "sizer" created for the
external sort.
* Define a JSON-like reader and writer structure that supports the full Drill
data model semantics. (The earlier version focused on the scalar types and
arrays of scalars to prove the concept of limiting vector sizes.)
* Update test code to use the revised column writer structure.
Implementation details appear in the PR.
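As a rough illustration of the "JSON-like" writer structure named in the themes above, the sketch below models a row as a tuple writer whose columns may be scalars, nested tuples, or arrays, so that a repeated map is simply an array of tuples. Every name here (JsonLikeWriterSketch, TupleWriter, ArrayWriter, and their methods) is hypothetical and chosen for this example; the actual accessor API is the one in the PR, and this sketch builds plain Java collections rather than Drill vectors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not Drill code) of a JSON-like writer hierarchy. */
final class JsonLikeWriterSketch {

    /** Writes a map ("tuple"): named columns holding scalars, tuples, or arrays. */
    static final class TupleWriter {
        private final Map<String, Object> row = new LinkedHashMap<>();

        TupleWriter scalar(String name, Object value) {
            row.put(name, value);
            return this;
        }

        /** Adds a nested map column and returns the writer for it. */
        TupleWriter tuple(String name) {
            TupleWriter child = new TupleWriter();
            row.put(name, child.row);
            return child;
        }

        /** Adds a repeated column and returns the writer for its elements. */
        ArrayWriter array(String name) {
            ArrayWriter child = new ArrayWriter();
            row.put(name, child.items);
            return child;
        }

        Map<String, Object> result() { return row; }
    }

    /** Writes a repeated column: elements are scalars or tuples. */
    static final class ArrayWriter {
        private final List<Object> items = new ArrayList<>();

        ArrayWriter scalar(Object value) {
            items.add(value);
            return this;
        }

        /** Appends a new map element; an array of these is a repeated map. */
        TupleWriter tuple() {
            TupleWriter element = new TupleWriter();
            items.add(element.row);
            return element;
        }
    }

    /** Builds one row containing a repeated map column named "orders". */
    static Map<String, Object> example() {
        TupleWriter row = new TupleWriter();
        row.scalar("id", 1);
        ArrayWriter orders = row.array("orders");
        orders.tuple().scalar("sku", "A").scalar("qty", 2);
        orders.tuple().scalar("sku", "B").scalar("qty", 5);
        return row.result();
    }
}
```

Calling JsonLikeWriterSketch.example() yields a map that prints as {id=1, orders=[{sku=A, qty=2}, {sku=B, qty=5}]}, which mirrors how a repeated map nests inside a row.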
was:
DRILL-5211 describes how Drill runs into OOM issues due to Drill's two
allocators: Netty and Unsafe. That JIRA also describes the solution: limit
vectors to 16 MB in length (with the eventual goal of limiting overall batch
size). DRILL-5517 added "size-aware" support to the column accessors, which were
created to parallel Drill's existing readers and writers. (The parallel
implementation ensures that we don't break existing code that uses the existing
mechanism, the same approach we took for the external sort.)
This ticket describes work to extend the column accessors to handle repeated
maps and lists. Key themes:
* Define a common metadata schema for use in this layer and the "result set
loader" of DRILL-5657. This schema layer builds on top of the existing schema
to add the kind of metadata needed here and by the "sizer" created for the
external sort.
* Define a JSON-like reader and writer structure that supports the full Drill
data model semantics. (The earlier version focused on the scalar types and
arrays of scalars to prove the concept of limiting vector sizes.)
* Update test code to use the revised column writer structure.
Implementation details will appear in the PR.
> Add repeated map support to column accessors
> --------------------------------------------
>
> Key: DRILL-5688
> URL: https://issues.apache.org/jira/browse/DRILL-5688
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.12.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.12.0
>
>
> DRILL-5211 describes how Drill runs into OOM issues due to Drill's two
> allocators: Netty and Unsafe. That JIRA also describes the solution: limit
> vectors to 16 MB in length (with the eventual goal of limiting overall batch
> size). DRILL-5517 added "size-aware" support to the column accessors, which
> were created to parallel Drill's existing readers and writers. (The parallel
> implementation ensures that we don't break existing code that uses the
> existing mechanism, the same approach we took for the external sort.)
> This ticket describes work to extend the column accessors to handle repeated
> maps and lists. Key themes:
> * Define a common metadata schema for use in this layer and the "result set
> loader" of DRILL-5657. This schema layer builds on top of the existing schema
> to add the kind of metadata needed here and by the "sizer" created for the
> external sort.
> * Define a JSON-like reader and writer structure that supports the full Drill
> data model semantics. (The earlier version focused on the scalar types and
> arrays of scalars to prove the concept of limiting vector sizes.)
> * Update test code to use the revised column writer structure.
> Implementation details appear in the PR.