Github user viirya commented on the issue:
https://github.com/apache/spark/pull/20146
Ah, I didn't see that suggestion. Using a dataset without duplicated values
sounds good to me. I will look for a suitable dataset. Or do you have a
suggested one already?
