Github user akopich commented on the pull request:

    https://github.com/apache/spark/pull/1269#issuecomment-65223123
  
    @jkbradley, thank you for you comments!
    
    It seems like we should discuss API for this set of models first. 
    As far as I can understand, you are not about to provide a user with an 
ability to implement their own regularizers. Are you? 
    
    In this case it's possible to implement a facade class and set all the 
other classes private. But there's still an issue. 
    
    First, note that `RobustDocumentParameters` inherits from 
`DocumentParameters` and `RobustGlobalParameters` inherits from 
`GlobalParameters`. Now imagine the facade class. Let it have a parameter 
`robust : Boolean`. Now let it have a method like 
    
    `def infer(documents : RDD[Seq[String]]) : (RDD[DocumentParameters], 
GlobalParameter)`
    
    If `robust == false`, everything is OK, but how am I supposed to return 
`RobustDocumentParameters` and `RobustGlobalParameters` if `robust == true`? 
It's definitely possible, but user will have to explicitly cast returned 
values. 
    
    The only way I see to solve this problem is to define a pair of facades 
(robust and non-robust) inheriting one another.  Probably, it's just because 
I'm not a Scala expert. If there's a better solution, please give me a hint. 
    
    By the way. You said, `PLSA` and `RobustPLSA` are very similar and suffer 
from code duplication. Their codes are similar, though different (e.g. in 
robust version I have to pass background and noise matrices to almost every 
function). Now I've done my best to avoid code duplication via inheriting both 
of these classes from `AbstractPLSA`. If there is a way for further code 
generalization, please point it.
    
    You've also mentioned  mllib/linalg/. Should I use DenseMatrix instead of 
Array[Array[Float]]? I don't like this idea because DenseMatrix contains Double 
values -- it seems to be a waste of memory and network resourses. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to