Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/160#issuecomment-38318849
These are supposedly all 0.0 or 1.0, since they're really labels, not
numeric values. Given that, I reverse myself and suppose that "== 1" is maybe
best. You could argue that "> 0.5" is more flexible in case someone passes
in values that aren't quite 0/1 labels but probabilities instead. That's a
decent argument, but I wonder if it just invites abuse of the fact that labels
happen to be floating-point values now.
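
To make the trade-off concrete, here is a minimal sketch of the two checks side by side. The object and method names (LabelChecks, isPositiveExact, isPositiveThreshold) are hypothetical and are not taken from the code in this pull request; the sketch only assumes labels are plain Double values.

```scala
// Hypothetical helper, not part of the PR: contrasts the two label checks.
object LabelChecks {
  // Strict check: only a label of exactly 1.0 counts as positive.
  def isPositiveExact(label: Double): Boolean = label == 1.0

  // Lenient check: any value above 0.5 counts as positive,
  // so a probability such as 0.7 is silently accepted as a label.
  def isPositiveThreshold(label: Double): Boolean = label > 0.5

  def main(args: Array[String]): Unit = {
    // 0.0 and 1.0 are proper labels; 0.7 is a probability passed in by mistake.
    val inputs = Seq(0.0, 1.0, 0.7)
    inputs.foreach { x =>
      println(s"$x exact=${isPositiveExact(x)} threshold=${isPositiveThreshold(x)}")
    }
  }
}
```

The two checks agree on clean 0.0 and 1.0 labels and differ only on an input like 0.7, which is exactly the case where the lenient version hides the fact that the caller passed something other than a label.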