Github user akopich commented on the pull request:

    https://github.com/apache/spark/pull/1269#issuecomment-66109601
  
    (1) Users implementing their own regularizers
    
    OK. I'd prefer to make all the regularizer methods `private[mllib]`. 
    
    (2) Regular and Robust in the same class
    
    I understand what dynamic polymorphism is. Unfortunately, the 
`getNewTheta()` methods take different parameters in the robust and non-robust 
classes. 
    
    More importantly, the user needs to know which class is returned, robust 
or non-robust. Without this knowledge one has to cast the returned parameter 
(e.g. from type `DocumentParameters` to type `RobustDocumentParameters`) in 
order to access the `noise` field. That's why I see no way to provide the user 
with a single facade class. 
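
    A minimal sketch of the casting problem. The classes below are simplified 
stand-ins, and their constructors, the `FacadeSketch` object, and its `infer` 
and `noiseOf` helpers are assumptions made for illustration, not the code in 
the PR:

```scala
// Hypothetical, simplified stand-ins for the parameter classes.
class DocumentParameters(val theta: Array[Float])

class RobustDocumentParameters(theta: Array[Float], val noise: Array[Float])
  extends DocumentParameters(theta)

object FacadeSketch {
  // A single facade can only promise the common supertype.
  def infer(robust: Boolean): DocumentParameters =
    if (robust) new RobustDocumentParameters(Array(0.7f, 0.3f), Array(0.1f))
    else new DocumentParameters(Array(0.7f, 0.3f))

  // `params.noise` does not compile on a DocumentParameters reference,
  // so the caller has to downcast (here via a pattern match) to reach it.
  def noiseOf(params: DocumentParameters): Option[Array[Float]] = params match {
    case r: RobustDocumentParameters => Some(r.noise)
    case _                           => None
  }
}
```

    With a facade like this, every caller who needs `noise` has to repeat the 
match in `noiseOf`, which is the cast I'd like to avoid.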
    
    And thank you for mentioning visibility  -- my fault. 
    
    (3) PLSA and RobustPLSA code duplication
    
    Thank you very much for reading the code.
    
    (4) Float vs. Double and linear algebra operations
    
    OK. I'll use `Array[Array[Float]]` then. But you've mentioned that it'd be 
nice to extract all the linear algebra code to `mllib/linalg/`. Could you please 
point to the code of mine that implements linear algebra operations and should 
be moved to `mllib/linalg/`? BTW, I'm not sure it's possible, because 
`mllib/linalg/` relies on `trait Matrix` while my code relies on 
`Array[Array[Float]]`. 
    
    (5) You've also said that Enumerator should be private. I can certainly 
make it private and change `TopicModel.infer()` so that it consumes 
`RDD[Seq[String]]` instead of `RDD[Documents]` and calls `Enumerator` 
internally. 
    
    But what if one wants to train ten models consecutively (in order to 
choose the best parameters)? Enumeration will be performed ten times. Isn't 
that a waste?
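
    A rough sketch of the concern, using plain Scala collections in place of 
RDDs. The `enumerate` and `train` functions and the pass counter are 
hypothetical stand-ins, not the PR's `Enumerator` or training code:

```scala
object EnumerationSketch {
  // Counts how many times the corpus is enumerated.
  var enumerationPasses = 0

  // Hypothetical stand-in for Enumerator: maps each distinct token to an Int index.
  def enumerate(docs: Seq[Seq[String]]): Seq[Seq[Int]] = {
    enumerationPasses += 1
    val index = docs.flatten.distinct.zipWithIndex.toMap
    docs.map(_.map(index))
  }

  // Hypothetical stand-in for training; returns a dummy score.
  def train(docs: Seq[Seq[Int]], numTopics: Int): Double =
    docs.map(_.size).sum.toDouble / numTopics

  val raw: Seq[Seq[String]] = Seq(Seq("a", "b"), Seq("b", "c"))

  // Enumerator hidden inside training: one enumeration pass per model.
  def hidden(): Seq[Double] =
    (1 to 10).map(k => train(enumerate(raw), k))

  // Enumerator exposed: one enumeration pass shared by all ten models.
  def exposed(): Seq[Double] = {
    val docs = enumerate(raw)
    (1 to 10).map(k => train(docs, k))
  }
}
```

    In this sketch `hidden()` adds 10 to `enumerationPasses`, while 
`exposed()` adds only 1.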
    
    


