Nicholas Chammas created SPARK-5865:
---------------------------------------
Summary: Add doc warnings for methods that collect an RDD to the driver
Key: SPARK-5865
URL: https://issues.apache.org/jira/browse/SPARK-5865
Project: Spark
Issue Type: Improvement
Components: Spark Core, SQL
Reporter: Nicholas Chammas
Priority: Minor
We should include a note in the docstring of every method that collects an RDD
to the driver, so that users have some hint of why their call might be running
out of memory (OOM).
{{RDD.collect()}}
* [Scala|https://github.com/apache/spark/blob/d8adefefcc2a4af32295440ed1d4917a6968f017/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L803-L806]
* [Python|https://github.com/apache/spark/blob/d8adefefcc2a4af32295440ed1d4917a6968f017/python/pyspark/rdd.py#L680-L683]
{{DataFrame.toPandas()}}
* [Python|https://github.com/apache/spark/blob/c76da36c2163276b5c34e59fbb139eeb34ed0faa/python/pyspark/sql/dataframe.py#L637-L645]
{{Column.toPandas()}}
* [Python|https://github.com/apache/spark/blob/c76da36c2163276b5c34e59fbb139eeb34ed0faa/python/pyspark/sql/dataframe.py#L965-L973]
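
As a rough illustration of the kind of note proposed for the methods listed above, here is a minimal Python sketch. It is not Spark code: {{collects_to_driver}}, {{DRIVER_NOTE}}, and {{FakeRDD}} are hypothetical names invented for this example, and the wording of the note is only a suggestion. The decorator appends the same warning to the docstring of each method it wraps, which is one possible way to keep the wording consistent across methods.

```python
# Hypothetical sketch: none of these names exist in Spark.
# DRIVER_NOTE is suggested wording only, not an agreed-upon text.
DRIVER_NOTE = (
    "Note: this method collects the entire dataset to the driver, "
    "so it should only be used when the result is expected to be small."
)


def collects_to_driver(func):
    """Append DRIVER_NOTE to the docstring of ``func`` and return ``func``."""
    doc = (func.__doc__ or "").rstrip()
    func.__doc__ = (doc + "\n\n" + DRIVER_NOTE) if doc else DRIVER_NOTE
    return func


class FakeRDD:
    """Stand-in for an RDD, used only to show the resulting docstring."""

    def __init__(self, items):
        self._items = list(items)

    @collects_to_driver
    def collect(self):
        """Return a list that contains all of the elements in this RDD."""
        return list(self._items)


if __name__ == "__main__":
    # Show the docstring a user would see via help(FakeRDD.collect).
    print(FakeRDD.collect.__doc__)
```

Writing the note by hand in each docstring would work just as well; the decorator only avoids repeating the text.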