Github user staple commented on the pull request:

    https://github.com/apache/spark/pull/2491#issuecomment-57855256
  
    @mengxr Sorry about that. In the future I’ll follow the best practice you’ve outlined.
    
    Here are the takeaways from my perspective:
    - Investigate the use of sparse storage for the conditional distribution. I believe the existing implementation in master uses dense conditional distribution matrices, but a sparse representation should also be possible.
    - Remove the grouping of conditional probabilities, since it adds complexity and you mentioned you aren’t sure whether it will help performance.
    - Add support for predictValues with consistent partitioning.
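
    To illustrate the first item, here is a minimal sketch of sparse storage for smoothed conditional log-probabilities. It is plain Python, not the MLlib implementation, and the names `train_sparse` and `log_theta` and the `(label, {feature_index: count})` input format are assumptions made up for this example. The idea is that features never seen with a label all share one smoothed value, so only the features that were seen need to be stored.

```python
import math
from collections import defaultdict

def train_sparse(docs, num_features, smoothing=1.0):
    """Build per-label sparse conditional log-probabilities.

    docs: iterable of (label, {feature_index: count}) pairs (hypothetical format).
    smoothing must be > 0 so that unseen features get a finite log-probability.
    Returns {label: (default_log_prob, {feature_index: log_prob})}.
    """
    counts = defaultdict(lambda: defaultdict(float))  # label -> feature -> count
    totals = defaultdict(float)                       # label -> total count
    for label, features in docs:
        for j, c in features.items():
            counts[label][j] += c
            totals[label] += c

    model = {}
    for label, feats in counts.items():
        log_denom = math.log(totals[label] + smoothing * num_features)
        # Every feature not seen with this label shares this one value.
        default = math.log(smoothing) - log_denom
        seen = {j: math.log(c + smoothing) - log_denom for j, c in feats.items()}
        model[label] = (default, seen)
    return model

def log_theta(model, label, j):
    """Look up log P(feature j | label), falling back to the shared default."""
    default, seen = model[label]
    return seen.get(j, default)
```

    A dense layout would hold `num_features` values per label, while this one holds one entry per feature observed with that label plus a single default. Whether that saves memory in practice depends on how sparse the training data is.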
    
    I’ll look into all of these. Thanks for your feedback!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
