[ 
https://issues.apache.org/jira/browse/ARROW-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-8945:
--------------------------------
    Description: 
I've been thinking it would be useful to have a minimal Cython package, call it 
"cyarrow", containing some pxd files and a small amount of compiled pyx code 
(using a C compiler only) that enables projects written in Cython to interact 
with Arrow datasets in minimal ways (for example, iterating over their values, 
interacting with dictionary-encoded/categorical arrays) that don't amount to 
reimplementation of the "hard stuff" where they would want to utilize pyarrow 
or the C++ library instead. Otherwise, every Python project that has compiled 
code in Cython and wants to use the C interface 
(https://github.com/apache/arrow/blob/master/docs/source/format/CDataInterface.rst)
 would have to create their own minimal implementation. 

The target users for this project would be Python projects like scikit-learn
that are mostly written in Cython.

  was:
I've been thinking it would be useful to have a minimal Cython package, call it 
"cyarrow", containing some pxd files and a small amount of compiled pyx code 
(using a C compiler only) that enables projects written in Cython to interact 
with Arrow datasets in minimal ways (for example, iterating over their values, 
interacting with dictionary-encoded/categorical arrays) that don't amount to 
reimplementation of the "hard stuff" where they would want to utilize pyarrow 
or the C++ library instead. Otherwise, every Python project that has compiled 
code in Cython and wants to use the C interface would have to create their own 
minimal implementation. 

Target user for this project would be Python projects like scikit-learn that 
are mostly written in Cython


> [Python] An independent Cython package for Cython-based projects that want to 
> program against the C data interface
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-8945
>                 URL: https://issues.apache.org/jira/browse/ARROW-8945
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>            Reporter: Wes McKinney
>            Priority: Major
>
> I've been thinking it would be useful to have a minimal Cython package, call 
> it "cyarrow", containing some pxd files and a small amount of compiled pyx 
> code (using a C compiler only) that enables projects written in Cython to 
> interact with Arrow datasets in minimal ways (for example, iterating over 
> their values, interacting with dictionary-encoded/categorical arrays) that 
> don't amount to reimplementation of the "hard stuff" where they would want to 
> utilize pyarrow or the C++ library instead. Otherwise, every Python project 
> that has compiled code in Cython and wants to use the C interface 
> (https://github.com/apache/arrow/blob/master/docs/source/format/CDataInterface.rst)
>  would have to create their own minimal implementation. 
>
> The target users for this project would be Python projects like scikit-learn
> that are mostly written in Cython.
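
The kind of minimal interaction described above (iterating over values without pyarrow or the C++ library) might look roughly like the following. This is only a sketch in plain Python with ctypes rather than Cython. The `ArrowArray` field layout follows the C data interface document linked above; `iter_int64` and `make_example` are hypothetical helper names, not part of any existing or proposed "cyarrow" API, and the array is built by hand for illustration instead of being exported by a real producer.

```python
import ctypes


class ArrowArray(ctypes.Structure):
    """ctypes mirror of `struct ArrowArray` from the Arrow C data interface."""


# Fields are assigned after the class statement because the struct refers to itself.
ArrowArray._fields_ = [
    ("length", ctypes.c_int64),
    ("null_count", ctypes.c_int64),
    ("offset", ctypes.c_int64),
    ("n_buffers", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("buffers", ctypes.POINTER(ctypes.c_void_p)),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowArray))),
    ("dictionary", ctypes.POINTER(ArrowArray)),
    ("release", ctypes.c_void_p),  # release callback, left NULL in this sketch
    ("private_data", ctypes.c_void_p),
]


def iter_int64(arr):
    """Yield the values of a primitive int64 array, with None for nulls.

    Hypothetical helper: assumes buffers[0] is the validity bitmap
    (possibly NULL) and buffers[1] holds the int64 values.
    """
    validity = arr.buffers[0]  # None when the pointer is NULL
    bitmap = None
    if validity:
        bitmap = ctypes.cast(validity, ctypes.POINTER(ctypes.c_uint8))
    data = ctypes.cast(arr.buffers[1], ctypes.POINTER(ctypes.c_int64))
    for i in range(arr.offset, arr.offset + arr.length):
        # Validity bits are least-significant-bit first; a set bit means valid.
        if bitmap is not None and not (bitmap[i >> 3] >> (i & 7)) & 1:
            yield None
        else:
            yield data[i]


def make_example():
    """Build a four-element int64 array by hand (illustration only)."""
    values = (ctypes.c_int64 * 4)(10, 20, 30, 40)
    bitmap = (ctypes.c_uint8 * 1)(0b1011)  # bit 2 is clear: element 2 is null
    buffers = (ctypes.c_void_p * 2)(
        ctypes.addressof(bitmap), ctypes.addressof(values)
    )
    arr = ArrowArray(
        length=4,
        null_count=1,
        offset=0,
        n_buffers=2,
        n_children=0,
        buffers=ctypes.cast(buffers, ctypes.POINTER(ctypes.c_void_p)),
    )
    # The caller must keep the backing memory alive while using `arr`.
    return arr, (values, bitmap, buffers)


arr, keepalive = make_example()
print(list(iter_int64(arr)))  # [10, 20, None, 40]
```

A real consumer would also read the `ArrowSchema` format string to decide how to interpret the buffers, and would call the producer's `release` callback when finished; both are omitted here to keep the sketch short.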


