[
https://issues.apache.org/jira/browse/ARROW-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Hagerman updated ARROW-2787:
---------------------------------
Labels: cython (was: )
> Memory Issue passing table from python to c++ via cython
> --------------------------------------------------------
>
> Key: ARROW-2787
> URL: https://issues.apache.org/jira/browse/ARROW-2787
> Project: Apache Arrow
> Issue Type: Bug
> Components: Integration, Python
> Affects Versions: 0.9.0
> Environment: clang6
> Reporter: Joseph Toth
> Priority: Major
> Labels: cython
>
> I wanted to create a simple example of reading a table in Python and passing
> it to C++, but either I'm doing something wrong or there is a memory issue.
> When the table gets to C++ and I print out the column names, it also prints
> out a lot of junk and what looks like pydocs. Let me know if you need any
> more info.
> Thanks!
>
> *demo.py*
> import numpy
> from psy.automl import cyth
> import pandas as pd
> import pyarrow as pa  # needed for pa.Table below
> from absl import app
>
>
> def main(argv):
>     sup = pd.DataFrame({
>         'int': [1, 2],
>         'str': ['a', 'b']
>     })
>     table = pa.Table.from_pandas(sup)
>     cyth.c_t(table)
>
>
> if __name__ == '__main__':
>     app.run(main)
> *cyth.pyx*
> import pandas as pd
> import pyarrow as pa
> from pyarrow.lib cimport *
>
> cdef extern from "cyth.h" namespace "psy":
>     void t(shared_ptr[CTable])
>
> def c_t(obj):
>     # These prints work:
>     # for i in range(obj.num_columns):
>     #     print(obj.column(i).name)
>     cdef shared_ptr[CTable] tbl = pyarrow_unwrap_table(obj)
>     t(tbl)
> *cyth.h*
> #include <iostream>
> #include <string>
> #include "arrow/api.h"
> #include "arrow/python/api.h"
> #include "Python.h"
>
> namespace psy {
>
> void t(std::shared_ptr<arrow::Table> pytable) {
>   // This works
>   std::cout << "NUM" << pytable->num_columns();
>   // This prints a lot of garbage
>   for (int i = 0; i < pytable->num_columns(); i++) {
>     std::cout << pytable->column(i)->name();
>   }
> }
>
> }  // namespace psy
>