[ https://issues.apache.org/jira/browse/SPARK-24671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-24671.
-------------------------------
    Resolution: Won't Fix

> DataFrame length using a dunder/magic method in PySpark
> -------------------------------------------------------
>
>                 Key: SPARK-24671
>                 URL: https://issues.apache.org/jira/browse/SPARK-24671
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 2.3.1
>            Reporter: Ondrej Kokes
>            Priority: Minor
>
> In Python, if a class implements a method called __len__, one can use the
> builtin `len` function to get the length of an instance of said class, whatever
> that means in its context. This is e.g. how you get the number of rows of a
> pandas DataFrame.
> It should be straightforward to add this functionality to PySpark, because
> df.count() is already implemented, so the patch I'm proposing is just two
> lines of code (and two lines of tests). It's in this commit, I'll submit a PR
> shortly.
> https://github.com/kokes/spark/commit/4d0afaf3cd046b11e8bae43dc00ddf4b1eb97732
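
For illustration, the idea described in the issue (having `len` delegate to an existing `count()` method) can be sketched roughly as below. This is not the linked commit and not PySpark code: `SketchFrame` is a hypothetical stand-in class holding rows in a plain list, used only so the example runs without Spark. Since the issue was resolved as Won't Fix, PySpark's DataFrame should not be assumed to support `len`.

```python
class SketchFrame:
    """Hypothetical stand-in for a DataFrame-like class (not PySpark's API)."""

    def __init__(self, rows):
        # Assumption: rows live in a local list; a real Spark DataFrame is distributed.
        self._rows = list(rows)

    def count(self):
        # Stand-in for an existing row-count method such as df.count().
        return len(self._rows)

    def __len__(self):
        # The proposed idea: let the builtin len() delegate to count().
        return self.count()


df = SketchFrame([("a", 1), ("b", 2), ("c", 3)])
print(len(df))                 # 3
print(bool(SketchFrame([])))   # False
```

One consequence of this design worth keeping in mind: when a class defines `__len__` but not `__bool__`, Python also uses `__len__` for truth testing, so a check like `if df:` would call `count()` as well. In this sketch that is cheap; on a distributed dataset a count may be an expensive operation.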