[jira] [Commented] (ARROW-2082) [Python] SegFault in pyarrow.parquet.write_table with specific options

ASF GitHub Bot (JIRA) Tue, 17 Apr 2018 10:14:57 -0700

    [ 
https://issues.apache.org/jira/browse/ARROW-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441198#comment-16441198
 ]


ASF GitHub Bot commented on ARROW-2082:
---------------------------------------

xhochy commented on a change in pull request #456: ARROW-2082: Prevent segfault 
that was occurring when writing a nanosecond timestamp with arrow writer 
properties set to coerce timestamps and support deprecated int96 timestamps.
URL: https://github.com/apache/parquet-cpp/pull/456#discussion_r182158877
 
 

 ##########
 File path: src/parquet/arrow/arrow-reader-writer-test.cc
 ##########
 @@ -1403,6 +1403,56 @@ TEST(TestArrowReadWrite, ConvertedDateTimeTypes) {
   AssertTablesEqual(*ex_table, *result);
 }
 
+// Regression for ARROW-2802
+TEST(TestArrowReadWrite, CoerceTimestampsAndSupportDeprecatedInt96) {
+  using namespace ::arrow;
 
 Review comment:
   Please list the types you use explicitly here, e.g. `using 
::arrow::TimestampType`. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> [Python] SegFault in pyarrow.parquet.write_table with specific options
> ----------------------------------------------------------------------
>
>                 Key: ARROW-2082
>                 URL: https://issues.apache.org/jira/browse/ARROW-2082
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.8.0
>         Environment: tested on MacOS High Sierra with python 3.6 and Ubuntu 
> Xenial (Python 3.5)
>            Reporter: Clément Bouscasse
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>
> I originally filed an issue in the pandas project but we've tracked it down 
> to arrow itself, when called via pandas in specific circumstances:
> [https://github.com/pandas-dev/pandas/issues/19493]
> basically using
> {code:java}
>  df.to_parquet('filename.parquet', flavor='spark'){code}
> gives a seg fault if `df` contains a datetime column.
> Under the covers,  pandas translates this to the following call:
> {code:java}
> pq.write_table(table, 'output.parquet', flavor='spark', compression='snappy', 
> coerce_timestamps='ms')
> {code}
> which gives me an instant crash.
> There is a repro on the github ticket.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (ARROW-2082) [Python] SegFault in pyarrow.parquet.write_table with specific options

Reply via email to