[
https://issues.apache.org/jira/browse/ARROW-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16620933#comment-16620933
]
Paul Rogers commented on ARROW-3267:
------------------------------------
Yes, that's where Drill started too, and it is what step 2 in the previous note
does.
I suspect you'll find that, once you have a function, you'll want an easy way
to create the schema (step 1).
Then, unless a mechanism already exists, if you watch allocation logging,
you'll see vector doublings you can avoid. So you'll soon want to optimize
allocation performance by providing size hints. The size hints can be a separate
set of data, or can be part of the schema passed to the empty_table function.
(You might want to have an allocate_table function that creates the table and
allocates vectors.)
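To make the doubling point concrete, here is a rough sketch in plain Python. It
is not Arrow or Drill code: GrowableVector is a made-up toy class that only
tracks capacity, and the allocate_table signature shown is an assumption about
what such a helper might look like, with size hints passed as a separate dict.

```python
class GrowableVector:
    """Toy stand-in for a column vector; tracks capacity only, stores no data."""

    def __init__(self, size_hint=0):
        # Start at the hinted capacity, or a single slot when no hint is given.
        self.capacity = max(1, size_hint)
        self.length = 0
        self.doublings = 0

    def append(self, value):
        # Double the capacity whenever the vector is full, as a real
        # vector would do by reallocating and copying its buffer.
        if self.length == self.capacity:
            self.capacity *= 2
            self.doublings += 1
        self.length += 1


def allocate_table(schema, size_hints=None):
    """Hypothetical helper: one vector per column, pre-sized from optional hints."""
    size_hints = size_hints or {}
    return {name: GrowableVector(size_hints.get(name, 0)) for name in schema}


schema = ["id", "name"]  # column names only; types are omitted in this sketch
unhinted = allocate_table(schema)
hinted = allocate_table(schema, size_hints={"id": 1000, "name": 1000})

# Write the same 1000 rows into both tables.
for table in (unhinted, hinted):
    for row in range(1000):
        for vector in table.values():
            vector.append(row)

print("doublings without hints:", unhinted["id"].doublings)
print("doublings with hints:   ", hinted["id"].doublings)
```

Each doubling in a real vector implies a reallocation and a copy, so the gap
between the two counts is the work a size hint would save.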
Sounds like you've not hit these issues yet, but keep this in mind if/when you
do.
> [Python] Create empty table from schema
> ---------------------------------------
>
> Key: ARROW-3267
> URL: https://issues.apache.org/jira/browse/ARROW-3267
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Uwe L. Korn
> Assignee: Uwe L. Korn
> Priority: Major
> Fix For: 0.11.0
>
>
> When one knows the expected schema for one's input data but has no input data
> for a data pipeline, it is necessary to construct an empty table as a
> sentinel value to pass through.
> This is a small but often useful convenience function.