[ 
https://issues.apache.org/jira/browse/ARROW-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16620933#comment-16620933
 ] 

Paul Rogers commented on ARROW-3267:
------------------------------------

Yes, that's where Drill started also, and it is what step 2 in the previous note 
does.

I suspect you'll find that, once you have a function, you'll want an easy way 
to create the schema (step 1).

Then, unless a mechanism already exists, if you watch allocation logging, 
you'll see vector doublings you can avoid. So you'll soon want to optimize 
allocation performance by providing size hints. The size hints can be supplied 
as separate data, or can be part of the schema passed to the empty_table 
function. (You might want to have an allocate_table function that creates the 
table and allocates vectors.)
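
To make the doubling cost concrete, here is a toy model in plain Python. It is 
not Arrow or Drill code. The count_reallocations function and its parameters 
are hypothetical names for illustration, and it assumes a buffer that doubles 
its capacity each time it overflows.

```python
# Toy model only: not Arrow or Drill code. It counts how often a growable
# buffer has to be regrown when its capacity doubles on overflow.

def count_reallocations(num_values, initial_capacity=1, size_hint=None):
    """Return how many times the buffer is regrown while appending num_values items."""
    # A size hint, when given, replaces the default starting capacity.
    capacity = size_hint if size_hint is not None else initial_capacity
    capacity = max(capacity, 1)
    reallocations = 0
    for used in range(1, num_values + 1):
        if used > capacity:
            capacity *= 2          # the "vector doubling" seen in allocation logs
            reallocations += 1
    return reallocations

no_hint = count_reallocations(100_000)                        # start at capacity 1
with_hint = count_reallocations(100_000, size_hint=100_000)   # exact hint
low_hint = count_reallocations(100_000, size_hint=50_000)     # hint too small by half
print(no_hint, with_hint, low_hint)  # prints: 17 0 1
```

In this model even a hint that is off by a factor of two removes all but one of 
the doublings, so the hint need not be exact to be useful.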

Sounds like you've not hit these issues yet, but keep this in mind if/when you 
do.
 

> [Python] Create empty table from schema
> ---------------------------------------
>
>                 Key: ARROW-3267
>                 URL: https://issues.apache.org/jira/browse/ARROW-3267
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Uwe L. Korn
>            Assignee: Uwe L. Korn
>            Priority: Major
>             Fix For: 0.11.0
>
>
> When one knows the expected schema for the input data but has no input data 
> for a data pipeline, it is necessary to construct an empty table as a 
> sentinel value to pass through.
> This is a small but often useful convenience function.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
