Github user dbtsai commented on the pull request:

    https://github.com/apache/spark/pull/1379#issuecomment-66336110
  
    @avulanov 
    
    1. I did the same optimization for MLlib in [my recent 
PRs](https://github.com/apache/spark/commits/master?author=dbtsai).
    
    * Accessing the values of a dense/sparse vector directly inside a loop is 
very slow unless you first take a local reference to the primitive array, 
because each access otherwise goes through an extra dereference. See #3577 and 
#3435; #3435 includes a bytecode analysis of this issue.
    * Breeze's foreachActive is very slow, so I implemented a 4x faster version 
in #3288. In my experience, you have to be cautious when using Breeze in a 
critical code path.
    
    2. I haven't checked out your ANN implementation yet, but I will today. 
I'll send you our optimized gradient computation code for MLOR. It will be 
interesting to see the new benchmark compared with the one you tested.
    
    3. See page 27 of Prof. CJ Lin's slides: 
http://www.csie.ntu.edu.tw/~cjlin/talks/SFmeetup.pdf It's just doing 
feature expansion by mapping the data into a higher-dimensional space.
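
To make point 3 concrete, here is a minimal sketch of one common kind of 
feature expansion, a degree-2 polynomial mapping. This is an assumption for 
illustration only and may not be the exact mapping shown on the slide; 
`FeatureExpansion` and `expandDegree2` are hypothetical names, not MLlib APIs.

```scala
// Hypothetical helper (not part of MLlib): degree-2 polynomial expansion.
object FeatureExpansion {
  // Maps x = (x_0, ..., x_{n-1}) to the original features followed by
  // every product x_i * x_j with i <= j (squares included).
  def expandDegree2(x: Array[Double]): Array[Double] = {
    val n = x.length
    // n original features plus n * (n + 1) / 2 pairwise products.
    val out = new Array[Double](n + n * (n + 1) / 2)
    System.arraycopy(x, 0, out, 0, n)
    var k = n
    var i = 0
    while (i < n) {
      var j = i
      while (j < n) {
        out(k) = x(i) * x(j)
        k += 1
        j += 1
      }
      i += 1
    }
    out
  }
}
```

A linear model trained on the expanded vector can then fit interactions 
between the original features, at the cost of a quadratically larger input.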

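For point 1, the local-reference pattern can be sketched roughly as follows. 
`SparseVec` and `LocalRefSketch` are hypothetical stand-ins, not the MLlib or 
Breeze classes, and this is not the code from #3288, #3435, or #3577; it only 
illustrates hoisting the backing arrays into local vals before a tight loop.

```scala
// Hypothetical stand-in for a sparse vector; not the MLlib class.
final class SparseVec(val size: Int,
                      val indices: Array[Int],
                      val values: Array[Double])

object LocalRefSketch {
  // Dot product of a sparse vector with a dense weight array.
  def dot(v: SparseVec, weights: Array[Double]): Double = {
    // Take local references once, so the loop body reads the primitive
    // arrays directly instead of calling the accessors on every iteration.
    val localIndices = v.indices
    val localValues = v.values
    val nnz = localIndices.length
    var sum = 0.0
    var k = 0
    while (k < nnz) {
      sum += localValues(k) * weights(localIndices(k))
      k += 1
    }
    sum
  }

  // A foreachActive-style traversal written with the same pattern:
  // calls f(index, value) for each stored entry.
  def foreachActive(v: SparseVec)(f: (Int, Double) => Unit): Unit = {
    val localIndices = v.indices
    val localValues = v.values
    val nnz = localIndices.length
    var k = 0
    while (k < nnz) {
      f(localIndices(k), localValues(k))
      k += 1
    }
  }
}
```

How much this helps depends on the JVM and the JIT, so it is worth 
benchmarking on the actual code path before and after the change.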

