[
https://issues.apache.org/jira/browse/ARROW-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Li updated ARROW-13649:
-----------------------------
Summary: [C++][Python] SIGSEGV inside datasets or compute kernel (was:
pyarrow is causing segfault randomly)
> [C++][Python] SIGSEGV inside datasets or compute kernel (was: pyarrow is
> causing segfault randomly)
> ---------------------------------------------------------------------------------------------------
>
> Key: ARROW-13649
> URL: https://issues.apache.org/jira/browse/ARROW-13649
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 5.0.0
> Environment: openSUSE Leap 15.2
> conda python3.9 env
> Reporter: krishna deepak
> Priority: Critical
>
> I'm using pyarrow to read Feather files, and I'm randomly getting the
> following segfault:
> *** SIGSEGV received at time=1629226305 on cpu 3 ***
> PC: @ 0x7fa9e177272a (unknown) arrow::BitUtil::SetBitmap()
> @ 0x7fa9f5dec2d0 (unknown) (unknown)
> Segmentation fault (core dumped)
> I initially thought it was because of a bug in my Cython code, but even
> after removing all Cython calls, I still get this error randomly.
> The Python code is a very simple read:
> {code:python}
> index_data = ds.dataset(INDEX_DATA_PATH / self.ticker / str(year) / 'indexed_table.feather',
>                         format='feather')
> index_data = index_data.to_table()
> trade_days = self.get_trading_days(year)
>
> options_data = ds.dataset(OPTIONS_DATA_PATH / self.ticker / self.expiry_type / str(year),
>                           format='feather')
> options_data = options_data.to_table(
>     filter=(
>         (ds.field('dt') >= trade_days[0]) & (ds.field('dt') <= trade_days[-1])
>     ),
>     columns=options_data_columns
> )
>
> expiry_dts = [x.as_py() for x in pc.unique(options_data['expiry_dt'])]
> expiry_dts.sort()
> {code}
>
> The error only happens randomly, roughly 1 out of 5 times.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)