[jira] [Assigned] (ARROW-1964) [Python] Expose Builder classes

2018-05-10 Thread Antoine Pitrou (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou reassigned ARROW-1964:
-

Assignee: Donal Simmie

> [Python] Expose Builder classes
> ---
>
> Key: ARROW-1964
> URL: https://issues.apache.org/jira/browse/ARROW-1964
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Uwe L. Korn
>Assignee: Donal Simmie
>Priority: Major
>  Labels: beginner, pull-request-available
> Fix For: 0.10.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Having the builder classes available from Python would be very helpful. 
> Currently a construction of an Arrow array always need to have a Python list 
> or numpy array as intermediate. As  the builder in combination with jemalloc 
> are very efficient in building up non-chunked memory, it would be nice to 
> directly use them in certain cases.
> The most useful builders are the 
> [StringBuilder|https://github.com/apache/arrow/blob/5030e235047bdffabf6a900dd39b64eeeb96bdc8/cpp/src/arrow/builder.h#L714]
>  and 
> [DictionaryBuilder|https://github.com/apache/arrow/blob/5030e235047bdffabf6a900dd39b64eeeb96bdc8/cpp/src/arrow/builder.h#L872]
>  as they provide functionality to create columns that are not easily 
> constructed using NumPy methods in Python.
> The basic approach would be to wrap the C++ classes in 
> https://github.com/apache/arrow/blob/master/python/pyarrow/includes/libarrow.pxd
>  so that they can be used from Cython. Afterwards, we should start a new file 
> {{python/pyarrow/builder.pxi}} where we have classes take typical Python 
> objects like {{str}} and pass them on to the C++ classes. At the end, these 
> classes should also return (Python accessible) {{pyarrow.Array}} instances.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (ARROW-1964) [Python] Expose Builder classes

2018-05-06 Thread Alex Hagerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Hagerman reassigned ARROW-1964:


Assignee: (was: Alex Hagerman)

> [Python] Expose Builder classes
> ---
>
> Key: ARROW-1964
> URL: https://issues.apache.org/jira/browse/ARROW-1964
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Uwe L. Korn
>Priority: Major
>  Labels: beginner, pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Having the builder classes available from Python would be very helpful. 
> Currently a construction of an Arrow array always need to have a Python list 
> or numpy array as intermediate. As  the builder in combination with jemalloc 
> are very efficient in building up non-chunked memory, it would be nice to 
> directly use them in certain cases.
> The most useful builders are the 
> [StringBuilder|https://github.com/apache/arrow/blob/5030e235047bdffabf6a900dd39b64eeeb96bdc8/cpp/src/arrow/builder.h#L714]
>  and 
> [DictionaryBuilder|https://github.com/apache/arrow/blob/5030e235047bdffabf6a900dd39b64eeeb96bdc8/cpp/src/arrow/builder.h#L872]
>  as they provide functionality to create columns that are not easily 
> constructed using NumPy methods in Python.
> The basic approach would be to wrap the C++ classes in 
> https://github.com/apache/arrow/blob/master/python/pyarrow/includes/libarrow.pxd
>  so that they can be used from Cython. Afterwards, we should start a new file 
> {{python/pyarrow/builder.pxi}} where we have classes take typical Python 
> objects like {{str}} and pass them on to the C++ classes. At the end, these 
> classes should also return (Python accessible) {{pyarrow.Array}} instances.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (ARROW-1964) [Python] Expose Builder classes

2018-05-04 Thread Alex Hagerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Hagerman reassigned ARROW-1964:


Assignee: Alex Hagerman

> [Python] Expose Builder classes
> ---
>
> Key: ARROW-1964
> URL: https://issues.apache.org/jira/browse/ARROW-1964
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Uwe L. Korn
>Assignee: Alex Hagerman
>Priority: Major
>  Labels: beginner, pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Having the builder classes available from Python would be very helpful. 
> Currently a construction of an Arrow array always need to have a Python list 
> or numpy array as intermediate. As  the builder in combination with jemalloc 
> are very efficient in building up non-chunked memory, it would be nice to 
> directly use them in certain cases.
> The most useful builders are the 
> [StringBuilder|https://github.com/apache/arrow/blob/5030e235047bdffabf6a900dd39b64eeeb96bdc8/cpp/src/arrow/builder.h#L714]
>  and 
> [DictionaryBuilder|https://github.com/apache/arrow/blob/5030e235047bdffabf6a900dd39b64eeeb96bdc8/cpp/src/arrow/builder.h#L872]
>  as they provide functionality to create columns that are not easily 
> constructed using NumPy methods in Python.
> The basic approach would be to wrap the C++ classes in 
> https://github.com/apache/arrow/blob/master/python/pyarrow/includes/libarrow.pxd
>  so that they can be used from Cython. Afterwards, we should start a new file 
> {{python/pyarrow/builder.pxi}} where we have classes take typical Python 
> objects like {{str}} and pass them on to the C++ classes. At the end, these 
> classes should also return (Python accessible) {{pyarrow.Array}} instances.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)