GitHub user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/21654#discussion_r217773474
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -375,6 +375,9 @@ def _truncate(self):
             return int(self.sql_ctx.getConf(
                 "spark.sql.repl.eagerEval.truncate", "20"))
    +    def __len__(self):
--- End diff --
I _think_ a reasonable thing to do when someone calls `list(df)` is to
collect: they clearly want the DataFrame as a list. Whether that's a good
idea is up to the developer.
---