[ https://issues.apache.org/jira/browse/SPARK-26943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782518#comment-16782518 ]
Will Uto commented on SPARK-26943:
----------------------------------

Thanks for the information - I was hoping that if I e.g. installed PySpark v2.4.0 in each Python virtual environment on each cluster worker/node, then I could run against Spark v2.4.0, but it sounds like I would instead need to upgrade Spark itself through something like Cloudera Manager.

> Weird behaviour with `.cache()`
> -------------------------------
>
>                 Key: SPARK-26943
>                 URL: https://issues.apache.org/jira/browse/SPARK-26943
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.1.0
>            Reporter: Will Uto
>            Priority: Major
>
> {code:java}
> sdf.count(){code}
> works fine. However:
> {code:java}
> sdf = sdf.cache()
> sdf.count()
> {code}
> does not, and produces the error:
> {code:java}
> Py4JJavaError: An error occurred while calling o314.count.
> : org.apache.spark.SparkException: Job aborted due to stage failure: Task 75
> in stage 8.0 failed 4 times, most recent failure: Lost task 75.3 in stage 8.0
> (TID 438, uat-datanode-02, executor 1): java.text.ParseException: Unparseable
> number: "(N/A)"
> at java.text.NumberFormat.parse(NumberFormat.java:350)
{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
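For context on why `.cache()` can surface an error that a plain `count()` does not: `count()` alone may not force every column to be deserialized, while caching materializes the full rows, so a numeric parse of a placeholder token like `"(N/A)"` then fails. Below is a hypothetical stand-in in plain Python (not PySpark, and not the reporter's actual data or parser) sketching the failing parse and one possible cleanup - mapping unparseable tokens to nulls before numeric conversion:

```python
# Hypothetical stand-in for the JVM-side numeric parse that raises
# java.text.ParseException: tokens like "(N/A)" are not valid numbers.
# One workaround is to convert such placeholders to missing values
# (in PySpark this would be done on the column before caching).

def parse_number(token: str):
    """Parse a numeric string; return None for unparseable tokens."""
    try:
        # Strip thousands separators the way a lenient parser might.
        return float(token.replace(",", ""))
    except ValueError:
        return None  # "(N/A)" and similar placeholders become missing values

raw = ["1,234", "(N/A)", "56.7"]
cleaned = [parse_number(v) for v in raw]
print(cleaned)  # [1234.0, None, 56.7]
```

This is only a sketch of the failure mode: the real fix in PySpark would be to sanitize or cast the offending column (names and data above are illustrative) before `.cache()` forces full materialization.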