GitHub user felixcheung commented on the issue:
https://github.com/apache/spark/pull/20146
I think all datasets with a string order get indexed, as far as I recall?
Picking an existing R dataset is just a convenience; we can also make up a
few lines of data if that works out better.
As a separate note, though, the difference in sort order is potentially
something we should document, especially if it goes beyond glm, for example
in SQL functions too.
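
To make the sort-order point concrete, here is a rough sketch in plain Python (not Spark or R code) of how two common ways of ordering string labels can assign different indices to the same data. The helper names are made up for illustration, and the alphabetical tie-break in the frequency-based version is an assumption, not a statement about what any particular implementation does.

```python
from collections import Counter


def index_by_frequency_desc(values):
    # Hypothetical helper: most frequent label gets index 0.
    # Ties are broken alphabetically (an assumption made for this sketch).
    counts = Counter(values)
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: i for i, label in enumerate(ordered)}


def index_alphabetically(values):
    # Hypothetical helper: labels are indexed in plain alphabetical order.
    ordered = sorted(set(values))
    return {label: i for i, label in enumerate(ordered)}


data = ["b", "a", "c", "b", "c", "c"]
freq_index = index_by_frequency_desc(data)
alpha_index = index_alphabetically(data)

print(freq_index)   # {'c': 0, 'b': 1, 'a': 2}
print(alpha_index)  # {'a': 0, 'b': 1, 'c': 2}
```

Whichever label ends up first (or last) is typically the one treated as the reference level in a model fit, so the same data can yield differently labelled coefficients depending on the ordering used.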