[jira] [Commented] (ARROW-6958) [python] tutorial script for arrow in spark throws error

2019-10-23 Thread Joris Van den Bossche (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957749#comment-16957749
 ] 

Joris Van den Bossche commented on ARROW-6958:
--

The relevant Spark issue is https://issues.apache.org/jira/browse/SPARK-29367

> [python] tutorial script for arrow in spark throws error
> 
>
> Key: ARROW-6958
> URL: https://issues.apache.org/jira/browse/ARROW-6958
> Project: Apache Arrow
> Issue Type: Bug
> Components: Java, Python
> Affects Versions: 0.15.0
> Environment: Ubuntu v 18. Cluster spun up on google dataproc - see 
> startup for specs of cluster
> Reporter: Karl Svensson
> Priority: Major
> Labels: newbie
> Fix For: 0.8.0
>
> Attachments: arrow_error.txt, start-cluster-nl.ps1.txt
>
>
> Running the Arrow example for PySpark 
> (https://github.com/apache/spark/blob/master/examples/src/main/python/sql/arrow.py)
> causes a java.lang.IllegalArgumentException. With pyarrow v0.8.0 the same 
> script runs correctly.
> Attached are the startup settings I'm using in Google Dataproc to create the 
> cluster, as well as the output (with error text). It isn't immediately 
> obvious to me what is causing the issue.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-6958) [python] tutorial script for arrow in spark throws error

2019-10-23 Thread Joris Van den Bossche (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957745#comment-16957745
 ] 

Joris Van den Bossche commented on ARROW-6958:
--

PySpark is not yet compatible with the latest pyarrow release, 0.15.0; see 
e.g. https://stackoverflow.com/questions/58273063/pandasudf-and-pyarrow-0-15-0 
for an explanation.
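
As a rough illustration of the incompatibility described above, the sketch 
below decides from two version strings whether a legacy-IPC workaround is 
likely needed. It assumes that the affected combination is pyarrow >= 0.15.0 
with Spark older than 3.0, and that the workaround is the 
ARROW_PRE_0_15_IPC_FORMAT=1 environment variable that the Spark documentation 
describes for Spark 2.3.x and 2.4.x. The helper names (parse_version, 
needs_legacy_ipc, legacy_ipc_env) are hypothetical and not part of pyarrow or 
pyspark.

```python
def parse_version(version):
    """Turn a string such as '0.15.0' into a 3-tuple of ints, e.g. (0, 15, 0).

    Only the leading digits of the first three dot-separated fields are used;
    missing fields are padded with zeros.
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def needs_legacy_ipc(pyarrow_version, spark_version):
    """Assumption: pyarrow >= 0.15.0 together with Spark < 3.0 is affected."""
    return (parse_version(pyarrow_version) >= (0, 15, 0)
            and parse_version(spark_version) < (3, 0, 0))


def legacy_ipc_env(pyarrow_version, spark_version):
    """Return the environment variables to set for the workaround, if any."""
    if needs_legacy_ipc(pyarrow_version, spark_version):
        return {"ARROW_PRE_0_15_IPC_FORMAT": "1"}
    return {}


if __name__ == "__main__":
    # Hypothetical version pair matching the report: pyarrow 0.15.0, Spark 2.4.x.
    print(legacy_ipc_env("0.15.0", "2.4.4"))
    # The older pyarrow that the reporter found to work needs no workaround.
    print(legacy_ipc_env("0.8.0", "2.4.4"))
```

The variable has to be visible to both the driver and the executors' Python 
workers, so on a cluster it is typically set in conf/spark-env.sh rather than 
only in the local shell. Pinning pyarrow below 0.15.0 is the other option.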



