[jira] [Assigned] (ARROW-1964) [Python] Expose Builder classes
[ https://issues.apache.org/jira/browse/ARROW-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou reassigned ARROW-1964:
-------------------------------------

    Assignee: Donal Simmie

> [Python] Expose Builder classes
> -------------------------------
>
> Key: ARROW-1964
> URL: https://issues.apache.org/jira/browse/ARROW-1964
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Python
> Reporter: Uwe L. Korn
> Assignee: Donal Simmie
> Priority: Major
> Labels: beginner, pull-request-available
> Fix For: 0.10.0
>
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> Having the builder classes available from Python would be very helpful.
> Currently, the construction of an Arrow array always needs a Python list
> or NumPy array as an intermediate. As the builders in combination with jemalloc
> are very efficient in building up non-chunked memory, it would be nice to
> use them directly in certain cases.
> The most useful builders are the
> [StringBuilder|https://github.com/apache/arrow/blob/5030e235047bdffabf6a900dd39b64eeeb96bdc8/cpp/src/arrow/builder.h#L714]
> and
> [DictionaryBuilder|https://github.com/apache/arrow/blob/5030e235047bdffabf6a900dd39b64eeeb96bdc8/cpp/src/arrow/builder.h#L872]
> as they provide functionality to create columns that are not easily
> constructed using NumPy methods in Python.
> The basic approach would be to wrap the C++ classes in
> https://github.com/apache/arrow/blob/master/python/pyarrow/includes/libarrow.pxd
> so that they can be used from Cython. Afterwards, we should start a new file
> {{python/pyarrow/builder.pxi}} where we have classes that take typical Python
> objects like {{str}} and pass them on to the C++ classes. At the end, these
> classes should also return (Python-accessible) {{pyarrow.Array}} instances.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
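To make the request in the issue description more concrete, here is a minimal pure-Python sketch of what a string builder does: it takes {{str}} values one at a time and accumulates them into contiguous buffers, without an intermediate Python list of the values. Everything in it is an assumption made for illustration: the class name StringBuilderSketch, its method names, and the tuple returned by finish() are hypothetical and are not the pyarrow API. A real wrapper would, as the description says, delegate to the C++ builder through Cython and return a {{pyarrow.Array}}.

```python
from array import array


class StringBuilderSketch:
    """Hypothetical stand-in for a string builder (illustration only, not pyarrow)."""

    def __init__(self):
        # Offsets into the data buffer; always one more entry than there are values.
        # "i" is a C int, which is 32 bits on common platforms.
        self._offsets = array("i", [0])
        # Contiguous UTF-8 bytes of all appended values.
        self._data = bytearray()
        # One flag per value; False marks a null.
        self._valid = []

    def append(self, value):
        """Append a single str, or a null when value is None."""
        if value is None:
            self._valid.append(False)
        elif isinstance(value, str):
            self._data.extend(value.encode("utf-8"))
            self._valid.append(True)
        else:
            raise TypeError("expected str or None, got %r" % type(value))
        # A null repeats the previous offset, so its slice is empty.
        self._offsets.append(len(self._data))

    def append_values(self, values):
        """Append every item of an iterable of str/None."""
        for value in values:
            self.append(value)

    def __len__(self):
        return len(self._valid)

    @property
    def null_count(self):
        return self._valid.count(False)

    def finish(self):
        """Return (offsets, data, validity) and reset the builder for reuse."""
        result = (list(self._offsets), bytes(self._data), list(self._valid))
        self.__init__()
        return result


builder = StringBuilderSketch()
builder.append("foo")
builder.append(None)
builder.append_values(["é", ""])
offsets, data, valid = builder.finish()
```

Value i occupies data[offsets[i]:offsets[i + 1]], so a null and an empty string both have an empty slice and are told apart only by the validity flags. Resetting in finish() is a design choice of this sketch so that one builder can produce several arrays.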
[jira] [Assigned] (ARROW-1964) [Python] Expose Builder classes
[ https://issues.apache.org/jira/browse/ARROW-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Hagerman reassigned ARROW-1964:
-----------------------------------

    Assignee:     (was: Alex Hagerman)

> [Python] Expose Builder classes
> -------------------------------
>
> Key: ARROW-1964
> URL: https://issues.apache.org/jira/browse/ARROW-1964
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Python
> Reporter: Uwe L. Korn
> Priority: Major
> Labels: beginner, pull-request-available
> Fix For: 1.0.0
>
> Time Spent: 2h 50m
> Remaining Estimate: 0h
[jira] [Assigned] (ARROW-1964) [Python] Expose Builder classes
[ https://issues.apache.org/jira/browse/ARROW-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Hagerman reassigned ARROW-1964:
-----------------------------------

    Assignee: Alex Hagerman

> [Python] Expose Builder classes
> -------------------------------
>
> Key: ARROW-1964
> URL: https://issues.apache.org/jira/browse/ARROW-1964
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Python
> Reporter: Uwe L. Korn
> Assignee: Alex Hagerman
> Priority: Major
> Labels: beginner, pull-request-available
> Fix For: 1.0.0
>
> Time Spent: 1h 50m
> Remaining Estimate: 0h