Github user akopich commented on the issue:
https://github.com/apache/spark/pull/19565
Ping @hhbyyh, @WeichenXu123, @srowen.
Seems like the discussion is stuck. Does anybody think that the general
approach implemented in this PR should be changed? Currently it is filtering
before sampling with no caching.
