[ https://issues.apache.org/jira/browse/ARROW-13939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413344#comment-17413344 ]
Weston Pace commented on ARROW-13939:
-------------------------------------
Hmm, I'm not sure what documentation you are referring to. If you are looking
at the C++ documentation, note that the Cython API does not fully mirror the
C++ API. In other words, CResult does not have every method that Result has.
If you want the value from a result, the proper thing to do is to call
GetResultValue(val), which checks the status of the result and, if it is
valid, returns the value. If it isn't valid, it converts the invalid status
into the appropriate Python exception and raises it.
> Will iterating the whole table be slow in cython?
If you are going through every value with GetScalar then yes, it probably will
be, but I don't know for sure. Ideally, if you want to process every value, you
will want to get access to the raw buffers and operate on them directly. Can
you give an example of the transformation you want to do? Your best bet might
be to create compute kernels in C++ to do the manipulation you desire and then
call those kernel functions from Python.
> which is the best to use to append new elements to. Is there a way i create
> an empty table of same schema and keep appending to it. Or should I use
> vectors/list and then pass them to create a table.
Where are these elements coming from? For example, if you are receiving them
already in Python (via some on_new_event method or something) then a simple and
reasonably efficient approach would be to just gather them in a Python list
and, when the list is large enough, convert the list to an Arrow array. If the
elements you are receiving are in C++ then you probably don't want to marshal
them to Python and add them to a Python list. Using the C++ array builders
would be a better choice.
> how to do resampling of arrow table using cython
> ------------------------------------------------
>
> Key: ARROW-13939
> URL: https://issues.apache.org/jira/browse/ARROW-13939
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++, Python
> Reporter: krishna deepak
> Priority: Minor
>
> Please can someone point me to resources, how to write a resampling code in
> cython for Arrow table.
> # Will iterating the whole table be slow in cython?
> # which is the best to use to append new elements to. Is there a way i
> create an empty table of same schema and keep appending to it. Or should I
> use vectors/list and then pass them to create a table.
> Performance is very important for me. Any help is highly appreciated.