[ https://issues.apache.org/jira/browse/SPARK-31525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164086#comment-17164086 ]
Apache Spark commented on SPARK-31525:
--------------------------------------

User 'tianshizz' has created a pull request for this issue:
https://github.com/apache/spark/pull/29214

> Inconsistent result of df.head(1) and df.head()
> -----------------------------------------------
>
>                 Key: SPARK-31525
>                 URL: https://issues.apache.org/jira/browse/SPARK-31525
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 2.4.6, 3.0.0
>            Reporter: Joshua Hendinata
>            Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In this line
> [https://github.com/apache/spark/blob/master/python/pyspark/sql/dataframe.py#L1339],
> calling `df.head()` on an empty dataframe returns *None*,
> but calling `df.head(1)` on an empty dataframe returns an
> *empty list* instead.
> This behaviour is inconsistent and can cause confusion,
> especially when calling `len(df.head())`, which throws an
> exception for an empty dataframe.
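
The inconsistency reported in the issue can be reproduced without a Spark session. The following is only a minimal sketch: `FakeDataFrame` is a hypothetical stand-in class, not part of PySpark, and its `head` is assumed to mirror the logic at the linked line of dataframe.py (no argument returns a single row or None; an explicit `n` returns a list).

```python
class FakeDataFrame:
    """Hypothetical stand-in for a PySpark DataFrame; not part of Spark."""

    def __init__(self, rows):
        self._rows = list(rows)

    def take(self, n):
        # Return up to n rows as a list (an empty list if there are none).
        return self._rows[:n]

    def head(self, n=None):
        # Assumed to mirror the reported behaviour:
        # no argument -> a single row or None; explicit n -> a list of rows.
        if n is None:
            rs = self.head(1)
            return rs[0] if rs else None
        return self.take(n)


empty = FakeDataFrame([])
no_arg = empty.head()      # a single value: None for an empty dataframe
with_arg = empty.head(1)   # always a list: empty for an empty dataframe

# len() on the no-argument result fails for an empty dataframe,
# because None has no length.
try:
    len(no_arg)
    len_error = None
except TypeError as exc:
    len_error = exc

# A caller who needs a count regardless of emptiness can use head(1).
safe_count = len(with_arg)
```

Under these assumptions, code that needs a uniform return type is safer calling `head(1)` and checking the length of the list than calling `head()` and passing the result to `len`.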