Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/4622#issuecomment-91229215
@mengxr I have added a new function to automatically calculate the
preferences used in the AP algorithm. In fact, it is just the median of the
similarities. Because I use the Hive function `percentile_approx` to
calculate the median value, spark-hive has to be added as a dependency of
MLlib. I am not sure this is a good idea. What do you think?
If it is not, where would be a better place to put this function so it is
still available to users? Or would it be better to implement a custom
method to compute the median value?
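
One way to avoid the spark-hive dependency would be a small hand-rolled exact median. A minimal sketch in Python (the same logic would translate to a distributed Scala version by sorting the similarities with `RDD.sortBy`, indexing with `zipWithIndex`, and looking up the middle element); the `median` helper here is purely illustrative, not part of the PR:

```python
def median(values):
    """Exact median by sorting.

    For an odd count, return the middle element; for an even count,
    average the two middle elements. A distributed analogue would
    sort the RDD and select the middle index instead of sorting
    everything on the driver.
    """
    s = sorted(values)
    n = len(s)
    if n == 0:
        raise ValueError("median of empty sequence")
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2.0

# Example: the median similarity would be used as the AP preference.
print(median([0.3, 0.1, 0.9]))
print(median([1.0, 2.0, 3.0, 4.0]))
```

Unlike `percentile_approx`, this computes the exact median; whether an approximation is acceptable for the AP preference is a separate trade-off.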