[ https://issues.apache.org/jira/browse/SPARK-37730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564732#comment-17564732 ]
Apache Spark commented on SPARK-37730:
--------------------------------------

User 'dongjoon-hyun' has created a pull request for this issue:
https://github.com/apache/spark/pull/37146

> plot.hist throws AttributeError on pandas=1.3.5
> -----------------------------------------------
>
>                 Key: SPARK-37730
>                 URL: https://issues.apache.org/jira/browse/SPARK-37730
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.2.0, 3.3.0
>         Environment: Conda environment.yml (also tested with 3.3.0-SNAPSHOT):
> {{name: testenv}}
> {{channels:}}
> {{  - conda-forge}}
> {{dependencies:}}
> {{  - python=3.9.9}}
> {{ }}
> {{  - numpy=1.21.5}}
> {{  - pandas=1.3.5}}
> {{  - matplotlib=3.5.1}}
> {{ }}
> {{  - pyspark=3.2.0}}
>            Reporter: Michał Słapek
>            Assignee: Michał Słapek
>            Priority: Major
>             Fix For: 3.3.0, 3.2.2
>
>
> plot.hist from PySpark throws an AttributeError when pyspark.pandas is
> used with pandas=1.3.5.
> In commit
> [https://github.com/pandas-dev/pandas/commit/029907c9d69a0260401b78a016a6c4515d8f1c40],
> pandas replaced MPLPlot._add_legend_handle with
> MPLPlot._append_legend_handles_labels.
> I've attached a PR on GitHub that replaces the use of MPLPlot._add_legend_handle
> in PySpark with MPLPlot._append_legend_handles_labels.
>
> Code:
>
> {code:python}
> import pyspark.pandas as ps
> from matplotlib import pyplot as plt
> ps.set_option("plotting.backend", "matplotlib")
> df = ps.DataFrame({'data': [4, 5, 5, 6, 8, 9]})
> df['data'].plot.hist()
> plt.show()
> {code}
>
>
> Truncated traceback:
> {code}
> Traceback (most recent call last):
>   File "/home/develop/Documents/sparkbug/code.py", line 6, in <module>
>     df['data'].plot.hist()
>   ...
>   File "/mnt/transient/develop/miniconda3/envs/testenv/lib/python3.9/site-packages/pyspark/pandas/plot/matplotlib.py", line 403, in _make_plot
>     self._add_legend_handle(artists[0], label, index=i)
> AttributeError: 'PandasOnSparkHistPlot' object has no attribute '_add_legend_handle'
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
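
The incompatibility described in the issue can be illustrated with a small sketch. This is not the code from the pull request, which per the description simply swaps one method call for the other. The helper `register_legend_entry` and the two stand-in classes are hypothetical names introduced here for illustration; the stand-ins only mimic the two method names mentioned in the issue and are not the real pandas or PySpark plot classes.

```python
class LegacyPlot:
    """Hypothetical stand-in for a plot class that still has _add_legend_handle."""

    def __init__(self):
        self.legend_handles = []
        self.legend_labels = []

    def _add_legend_handle(self, handle, label, index=None):
        # Simplified: only records the handle and label.
        self.legend_handles.append(handle)
        self.legend_labels.append(label)


class CurrentPlot:
    """Hypothetical stand-in for a plot class that only has the newer method."""

    def __init__(self):
        self.legend_handles = []
        self.legend_labels = []

    def _append_legend_handles_labels(self, handle, label):
        self.legend_handles.append(handle)
        self.legend_labels.append(label)


def register_legend_entry(plot, handle, label, index=None):
    """Add a legend entry using whichever of the two methods the plot object has.

    Assumption: the newer method takes (handle, label) with no index argument.
    """
    append = getattr(plot, "_append_legend_handles_labels", None)
    if append is not None:
        append(handle, label)
    else:
        # Fall back to the older method named in the traceback.
        plot._add_legend_handle(handle, label, index=index)


legacy, current = LegacyPlot(), CurrentPlot()
register_legend_entry(legacy, "artist", "data", index=0)
register_legend_entry(current, "artist", "data", index=0)
print(legacy.legend_labels, current.legend_labels)
```

Dispatching on the attribute rather than on a pandas version string keeps the sketch independent of exactly which release made the change. A project that supports only newer pandas versions could call the newer method directly, as the issue description proposes.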