[ 
https://issues.apache.org/jira/browse/ARROW-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

krishna deepak updated ARROW-13649:
-----------------------------------
    Description: 
I'm using pyarrow to read feather files. I'm randomly getting the following 
segfault error.
{noformat}
*** SIGSEGV received at time=1629226305 on cpu 3 ***
PC: @     0x7fa9e177272a  (unknown)  arrow::BitUtil::SetBitmap()
    @     0x7fa9f5dec2d0  (unknown)  (unknown)
Segmentation fault (core dumped)
{noformat}

I initially thought it was due to a bug in my Cython code, but even after removing all
Cython calls I still get the error intermittently.

The Python code is a very simple read:
{code:python}
index_data = ds.dataset(INDEX_DATA_PATH / self.ticker / str(year) / 'indexed_table.feather',
                        format='feather')
index_data = index_data.to_table()
trade_days = self.get_trading_days(year)

options_data = ds.dataset(OPTIONS_DATA_PATH / self.ticker / self.expiry_type / str(year),
                          format='feather')
options_data = options_data.to_table(
    filter=(
        (ds.field('dt') >= trade_days[0]) & (ds.field('dt') <= trade_days[-1])
    ),
    columns=options_data_columns
)

expiry_dts = [x.as_py() for x in pc.unique(options_data['expiry_dt'])]
expiry_dts.sort()
{code}
 

The error only happens intermittently, roughly 1 out of 5 runs.
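For anyone trying to reproduce this, the snippet above omits its imports and constants. A minimal self-contained sketch of the same read path, with placeholder paths, ticker, dates, and column list standing in for my actual values, looks roughly like:

{code:python}
from datetime import date
from pathlib import Path

import pyarrow.dataset as ds
import pyarrow.compute as pc

# Placeholder locations and identifiers; substitute real feather data here.
INDEX_DATA_PATH = Path('/data/index')
OPTIONS_DATA_PATH = Path('/data/options')
ticker, expiry_type, year = 'XYZ', 'monthly', 2020
options_data_columns = ['dt', 'expiry_dt', 'strike', 'close']

# Single feather file opened as a dataset and materialized into a Table.
index_data = ds.dataset(INDEX_DATA_PATH / ticker / str(year) / 'indexed_table.feather',
                        format='feather').to_table()

# Stand-in for self.get_trading_days(year): just the year's bounds.
trade_days = [date(year, 1, 1), date(year, 12, 31)]

# Directory of feather files read with a filter on the 'dt' column
# and a column projection.
options_dataset = ds.dataset(OPTIONS_DATA_PATH / ticker / expiry_type / str(year),
                             format='feather')
options_data = options_dataset.to_table(
    filter=(ds.field('dt') >= trade_days[0]) & (ds.field('dt') <= trade_days[-1]),
    columns=options_data_columns,
)

# Distinct expiry dates, converted to Python objects and sorted.
expiry_dts = sorted(x.as_py() for x in pc.unique(options_data['expiry_dt']))
{code}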


> pyarrow is causing segfault randomly
> ------------------------------------
>
>                 Key: ARROW-13649
>                 URL: https://issues.apache.org/jira/browse/ARROW-13649
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 5.0.0
>         Environment: openSUSE Leap 15.2
> conda python3.9 env
>            Reporter: krishna deepak
>            Priority: Critical
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
