[GitHub] spark pull request: [SPARK-2199] [mllib] topic modeling

jkbradley Wed, 10 Dec 2014 13:41:43 -0800

Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/1269#issuecomment-66529072
  
    @akopich  Thanks for the updates.  It looks like the rebase did not work 
correctly (the PR now shows 10K+ lines!).  It should be possible to fix with 
another rebase plus conflict resolution.  But since most updates in this PR 
are in separate files, it may be easiest to get a clean copy of master, copy 
the files and changes in, and then force push to update this PR's branch (or 
open a new PR).
    
    About Enumerator / TokenIndexer: Renaming sounds good.  You're right that 
both APIs (setters vs. constructor arguments) are used in spark.mllib, but in 
the new spark.ml API, only setters will be used.  (And hopefully spark.ml will 
become the primary ML package soon.)  That's my main reason for recommending 
setters.
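
    To illustrate the setter style, here is a minimal sketch.  The class 
body, the parameter names (minCount, lowercase), and the getters are 
hypothetical, made up for illustration; they are not the PR's actual 
TokenIndexer API.  Each setter returns this.type so that calls can be chained.

```scala
// Hypothetical sketch: parameters and defaults are assumptions,
// not taken from the PR under review.
class TokenIndexer {
  // Defaults live in the class, so the constructor takes no arguments.
  private var minCount: Int = 1
  private var lowercase: Boolean = true

  /** Sets the (hypothetical) minimum token count; returns this for chaining. */
  def setMinCount(value: Int): this.type = {
    require(value >= 1, s"minCount must be >= 1 but got $value")
    minCount = value
    this
  }

  /** Sets the (hypothetical) lowercasing flag; returns this for chaining. */
  def setLowercase(value: Boolean): this.type = {
    lowercase = value
    this
  }

  def getMinCount: Int = minCount
  def getLowercase: Boolean = lowercase
}

// Usage: configure through chained setters instead of constructor arguments.
val indexer = new TokenIndexer().setMinCount(5).setLowercase(false)
```

    With this style, adding a parameter later does not change the 
constructor signature, so existing callers keep compiling.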
    
    About Dirichlet: Since there isn't a precedent to follow, I'd recommend 
putting it in a new stat/impl/ folder and leaving it private.  A later PR could 
add public APIs for some common distributions.


