[
https://issues.apache.org/jira/browse/ARROW-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joseph Toth updated ARROW-2787:
-------------------------------
Description:
I wanted to create a simple example of reading a table in Python and passing it
to C++, but either I'm doing something wrong or there is a memory issue. When
the table gets to C++ and I print out the column names, it also prints out a lot
of junk and what looks like pydocs. Let me know if you need any more info.
Thanks!
*demo.py*
import numpy
import pandas as pd
import pyarrow as pa  # missing in the original snippet; pa.Table is used below
from absl import app
from psy.automl import cyth

def main(argv):
    sup = pd.DataFrame({
        'int': [1, 2],
        'str': ['a', 'b']
    })
    table = pa.Table.from_pandas(sup)
    cyth.c_t(table)

# Entry point, presumably omitted from the report
if __name__ == '__main__':
    app.run(main)
*cyth.pyx*
import pandas as pd
import pyarrow as pa
from pyarrow.lib cimport *

cdef extern from "cyth.h" namespace "psy":
    void t(shared_ptr[CTable])

def c_t(obj):
    # These prints work
    # for i in range(obj.num_columns):
    #     print(obj.column(i).name)
    cdef shared_ptr[CTable] tbl = pyarrow_unwrap_table(obj)
    t(tbl)
*cyth.h*
#include <iostream>
#include <string>
#include "arrow/api.h"
#include "arrow/python/api.h"
#include "Python.h"

namespace psy {

void t(std::shared_ptr<arrow::Table> pytable) {
  // This works
  std::cout << "NUM" << pytable->num_columns();
  // This prints a lot of garbage
  for (int i = 0; i < pytable->num_columns(); i++) {
    std::cout << pytable->column(i)->name();
  }
}

}  // namespace psy
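One way to tell genuinely corrupted names from output that is merely run
together (the loop in cyth.h writes "NUM" and every column name with no
separator or newline) is to print each name in brackets with its byte length.
The following is a minimal sketch of that idea, not a diagnosis of the issue.
FakeColumn, FakeTable, describe_columns and make_demo_table are hypothetical
names introduced here; the stand-in types mimic only num_columns(), column(i)
and name() as they are called in cyth.h, so the block builds without Arrow.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

namespace psy {

// Hypothetical stand-ins (not Arrow classes): they mimic only the calls
// that cyth.h makes on the table, so this sketch compiles on its own.
struct FakeColumn {
  std::string name_;
  const std::string& name() const { return name_; }
};

struct FakeTable {
  std::vector<std::shared_ptr<FakeColumn>> columns_;
  int num_columns() const { return static_cast<int>(columns_.size()); }
  std::shared_ptr<FakeColumn> column(int i) const { return columns_[i]; }
};

// Returns one line with the column count, then one line per column in the
// form "index: [name] (N bytes)". Brackets and the byte count make stray or
// oversized data easy to spot.
template <typename TableT>
std::string describe_columns(const std::shared_ptr<TableT>& table) {
  assert(table != nullptr);
  std::ostringstream out;
  out << "NUM " << table->num_columns() << '\n';
  for (int i = 0; i < table->num_columns(); ++i) {
    // Copy the name so it does not depend on the temporary column pointer.
    const std::string name = table->column(i)->name();
    out << i << ": [" << name << "] (" << name.size() << " bytes)\n";
  }
  return out.str();
}

// Builds a two-column stand-in table matching the DataFrame in demo.py.
inline std::shared_ptr<FakeTable> make_demo_table() {
  auto table = std::make_shared<FakeTable>();
  table->columns_.push_back(std::make_shared<FakeColumn>(FakeColumn{"int"}));
  table->columns_.push_back(std::make_shared<FakeColumn>(FakeColumn{"str"}));
  return table;
}

}  // namespace psy
```

Because describe_columns is a template, passing the
std::shared_ptr<arrow::Table> received by t() should work the same way, as long
as column(i)->name() is available in the Arrow version in use. A name that
reports a very large byte count or contains non-printable bytes would suggest
that the string itself is damaged rather than the formatting.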
> Memory Issue passing table from python to c++ via cython
> --------------------------------------------------------
>
> Key: ARROW-2787
> URL: https://issues.apache.org/jira/browse/ARROW-2787
> Project: Apache Arrow
> Issue Type: Bug
> Components: Integration
> Affects Versions: 0.9.0
> Environment: clang6
> Reporter: Joseph Toth
> Priority: Major