[
https://issues.apache.org/jira/browse/ARROW-13939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413477#comment-17413477
]
krishna deepak commented on ARROW-13939:
----------------------------------------
[~westonpace]
Thanks, this is very helpful.
Regarding documentation, that makes sense. However, the Cython documentation is
a single page without much useful information. The function
{code:java}
GetResultValue(val){code}
is nowhere to be found there.
I'm still stuck after using it. It returns a 'shared_ptr[CScalar]', and
therefore a 'CScalar *', but I don't know how to extract the value from it.
Let's say I know the scalar is an IntScalar. How do I extract its value, e.g.
int a = doSomethingOnCResult(val)?
---------------------------------------------------------------------------------------------------------------------------------------------------------------
What I'm trying to do is convert data from [[11:01, 3], [11:02, 4],
[11:03, 2], [11:04, 1], [11:05, 3], [11:06, 6]] to [[11:03, 4], [11:06, 6]], i.e.
resample 1-minute data to 3-minute data. Here the transformation function is the
max of all values in each 3-minute window.
So I have to iterate through all the values.
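To make the transformation concrete, here is a minimal pure-Python sketch of it. It assumes each 3-minute window is closed on the right and labelled by its right edge (so 11:01 to 11:03 is labelled 11:03), which is what the example above suggests. The names `to_minutes`, `to_hhmm`, and `resample_max` are hypothetical helpers for illustration, not Arrow or Cython APIs.

```python
def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hhmm(total):
    """Convert minutes since midnight back to an 'HH:MM' string."""
    return f"{total // 60:02d}:{total % 60:02d}"


def resample_max(rows, width=3):
    """Keep the max value of each right-closed window of `width` minutes.

    Assumption: a window is labelled by its right edge, i.e. the smallest
    multiple of `width` that is >= the row's timestamp.
    """
    best = {}
    for hhmm, value in rows:
        t = to_minutes(hhmm)
        edge = -(-t // width) * width  # ceiling division, then scale back
        if edge not in best or value > best[edge]:
            best[edge] = value
    return [[to_hhmm(edge), value] for edge, value in sorted(best.items())]


rows = [["11:01", 3], ["11:02", 4], ["11:03", 2],
        ["11:04", 1], ["11:05", 3], ["11:06", 6]]
print(resample_max(rows))  # [['11:03', 4], ['11:06', 6]]
```

This only pins down the semantics. It says nothing about how fast the same loop would be in Cython over Arrow data.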
> Your best bet might be to create compute kernels in C++ to do the
> manipulation you desire and then call those kernel functions from python.
I believe this resembles what I'm doing by having this resampling code in
Cython. If I'm wrong, please let me know.
> if you want to process every value, you will want to get access to the raw
> buffers and operate on them.
I have no idea how to do this. Could you please point me to some resources?
---------------------------------------------------------------------------------------------------------------------------------------------------------------
> Where are these elements coming from?
Everything is in Cython. I pass my larger table from Python to a Cython
resampling function. This function iterates over the whole table and builds a
new table as it iterates.
My plan is to use C++ vectors to build the individual columns, pass them to the
Arrow Table constructor, and then return the result to Python.
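The plan above (one pass over the input, appending to one container per output column, then building the table once at the end) can be sketched in plain Python, with lists standing in for the C++ vectors. This is only an illustration under stated assumptions: the input is already sorted by time, timestamps are minutes since midnight, and `resample_columns` is a hypothetical name.

```python
def resample_columns(times, values, width=3):
    """Single-pass, column-wise max resampling.

    Assumptions: `times` is sorted ascending and holds minutes since
    midnight; windows are right-closed and labelled by their right edge.
    """
    out_times, out_values = [], []  # one growable container per output column
    for t, v in zip(times, values):
        edge = -(-t // width) * width  # right edge of the window containing t
        if out_times and out_times[-1] == edge:
            # Still in the current window: update the running max in place.
            if v > out_values[-1]:
                out_values[-1] = v
        else:
            # First row of a new window: start a new output row.
            out_times.append(edge)
            out_values.append(v)
    return {"time": out_times, "value": out_values}


columns = resample_columns([661, 662, 663, 664, 665, 666], [3, 4, 2, 1, 3, 6])
print(columns)  # {'time': [663, 666], 'value': [4, 6]}

# With pyarrow installed, the finished columns could then be turned into a
# table in one step, e.g. pyarrow.table(columns), instead of appending to a
# table row by row.
```

Because the input is sorted, only the last output row ever needs updating, so no per-window lookup structure is required.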
> how to do resampling of arrow table using cython
> ------------------------------------------------
>
> Key: ARROW-13939
> URL: https://issues.apache.org/jira/browse/ARROW-13939
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++, Python
> Reporter: krishna deepak
> Priority: Minor
>
> Could someone please point me to resources on how to write resampling code in
> Cython for an Arrow table?
> # Will iterating over the whole table be slow in Cython?
> # Which is the best structure to append new elements to? Is there a way I can
> create an empty table with the same schema and keep appending to it, or should I
> use vectors/lists and then pass them to create a table?
> Performance is very important for me. Any help is highly appreciated.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)