GitHub user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22365#discussion_r216482340
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -880,18 +880,23 @@ def sampleBy(self, col, fractions, seed=None):
| 0| 5|
| 1| 9|
+---+-----+
+ >>> dataset.sampleBy(col("key"), fractions={2: 1.0}, seed=0).count()
--- End diff --
Added
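
For readers unfamiliar with stratified sampling, here is a minimal pure-Python sketch of what the added doctest call expresses. It is not Spark's implementation. The helper `sample_by` and the `rows` data are hypothetical and exist only for illustration. It assumes that a stratum missing from `fractions` is sampled with fraction zero, which matches the documented behaviour of `sampleBy`.

```python
import random


def sample_by(rows, key, fractions, seed=None):
    """Keep each row with probability fractions.get(key(row), 0.0).

    Hypothetical helper, not part of PySpark: strata that are absent
    from `fractions` are treated as having a sampling fraction of zero.
    """
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < fractions.get(key(row), 0.0)]


# Hypothetical data: 100 rows whose keys cycle through 0, 1, 2.
rows = [{"key": i % 3} for i in range(100)]

# A fraction of 1.0 for stratum 2 keeps every row with key == 2;
# strata 0 and 1 are not listed, so none of their rows are kept.
sampled = sample_by(rows, key=lambda r: r["key"], fractions={2: 1.0}, seed=0)
print(len(sampled))
```

With a fraction of 1.0 the result does not depend on the seed, which makes such a call convenient for a deterministic doctest.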