GitHub user manishamde commented on the pull request:

    https://github.com/apache/spark/pull/2780#issuecomment-58991243
  
    @chouqin Thanks for the PR. It looks good to me in general.
    
    If I understand correctly, we are now histogramming (value counting) each
    feature instead of using simple array access, which could have led to
    duplicates. For large `maxBins` settings, this could be slower than the
    current implementation. Any thoughts on performance?
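
    To make the question concrete, here is a rough Python sketch of the two
    strategies as I understand them. It is not the PR's code or the MLlib
    implementation: both function names, the stride rule, and the sample data
    are assumptions made up for illustration.

```python
from collections import Counter


def splits_by_stride(samples, num_splits):
    """Hypothetical 'simple array access': read thresholds at evenly
    spaced positions of the sorted sample. Repeated values can yield
    duplicate thresholds."""
    s = sorted(samples)
    stride = len(s) / (num_splits + 1)
    return [s[int(i * stride)] for i in range(1, num_splits + 1)]


def splits_by_value_count(samples, num_splits):
    """Hypothetical 'value counting': count each distinct value, then walk
    the distinct values in order and emit a threshold each time the running
    count reaches the next stride target. Thresholds are distinct."""
    counts = sorted(Counter(samples).items())  # (value, count), ordered by value
    if len(counts) <= num_splits + 1:
        # Few distinct values: every value except the largest is a threshold.
        return [value for value, _ in counts[:-1]]
    stride = len(samples) / (num_splits + 1)
    splits, target, seen = [], stride, 0
    for value, count in counts[:-1]:
        seen += count
        if seen >= target:
            splits.append(value)
            # Skip targets already covered by this heavily repeated value.
            while target <= seen:
                target += stride
            if len(splits) == num_splits:
                break
    return splits


samples = [1, 1, 1, 1, 1, 1, 2, 3, 4, 5]
print(splits_by_stride(samples, 3))       # [1, 1, 3]
print(splits_by_value_count(samples, 3))  # [1, 3]
```

    In this sketch the counting variant adds a pass over the sample plus a
    sort over the distinct values. Whether that cost is noticeable in the PR
    depends on its actual implementation.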


