[jira] [Commented] (SPARK-4981) Add a streaming singular value decomposition
[ https://issues.apache.org/jira/browse/SPARK-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16132118#comment-16132118 ]

Nick Pentreath commented on SPARK-4981:
---

Hey folks, as interesting as this would be, I think it's fairly clear that it won't be moving ahead any time soon (and furthermore, any ML on Structured Streaming is not imminent). Shall we close this off?

> Add a streaming singular value decomposition
>
> Key: SPARK-4981
> URL: https://issues.apache.org/jira/browse/SPARK-4981
> Project: Spark
> Issue Type: New Feature
> Components: DStreams, MLlib
> Reporter: Jeremy Freeman
>
> This is for tracking WIP on a streaming singular value decomposition
> implementation. This will likely be more complex than the existing streaming
> algorithms (k-means, regression), but should be possible using the family of
> sequential update rules outlined in this paper:
> "Fast low-rank modifications of the thin singular value decomposition"
> by Matthew Brand
> http://www.stat.osu.edu/~dmsl/thinSVDtracking.pdf

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
---
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4981) Add a streaming singular value decomposition
[ https://issues.apache.org/jira/browse/SPARK-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300848#comment-14300848 ]

Reza Zadeh commented on SPARK-4981:
---

Another option: see slide 31 to solve the problem using IndexedRDDs, thanks to Ankur's nice slides and work on IndexedRDD:
https://issues.apache.org/jira/secure/attachment/12656374/2014-07-07-IndexedRDD-design-review.pdf
[jira] [Commented] (SPARK-4981) Add a streaming singular value decomposition
[ https://issues.apache.org/jira/browse/SPARK-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299666#comment-14299666 ]

Reza Zadeh commented on SPARK-4981:
---

To be model parallel, we can simply warm-start the current ALS implementation in org.apache.spark.mllib.recommendation. The work involved would be to expose a warm-start option in ALS, and then redo training with, say, 2 iterations instead of 10 on each batch of RDDs. The stream would be over batches of Ratings. This should be the simplest option.
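The warm-start idea above can be illustrated outside Spark. The following is only a sketch in plain Python, not the MLlib ALS code: the functions `solve`, `half_step`, `train` and `rmse`, the `init` parameter, and the toy batches are all hypothetical names invented here. It shows a cold start with 10 iterations followed by a warm start with 2 iterations on a new batch of (user, item, rating) triples.

```python
import random


def solve(a, b):
    """Solve the small dense system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def half_step(ratings, fixed, rank, lam, key, other):
    """Re-solve one side's factors by regularized least squares, other side held fixed."""
    grouped = {}
    for rating in ratings:
        grouped.setdefault(rating[key], []).append((rating[other], rating[2]))
    out = {}
    for idx, obs in grouped.items():
        a = [[lam if r == c else 0.0 for c in range(rank)] for r in range(rank)]
        b = [0.0] * rank
        for j, value in obs:
            f = fixed[j]
            for r in range(rank):
                b[r] += value * f[r]
                for c in range(rank):
                    a[r][c] += f[r] * f[c]
        out[idx] = solve(a, b)
    return out


def train(ratings, rank, iterations, lam=0.1, init=None, seed=0):
    """ALS on one batch; `init` (hypothetical option) carries over earlier factors."""
    rng = random.Random(seed)
    users, items = ({}, {}) if init is None else (dict(init[0]), dict(init[1]))
    for u, i, _ in ratings:
        # Only ids never seen before get a random starting vector.
        users.setdefault(u, [rng.uniform(0.1, 1.0) for _ in range(rank)])
        items.setdefault(i, [rng.uniform(0.1, 1.0) for _ in range(rank)])
    for _ in range(iterations):
        users.update(half_step(ratings, items, rank, lam, key=0, other=1))
        items.update(half_step(ratings, users, rank, lam, key=1, other=0))
    return users, items


def rmse(ratings, users, items):
    se = sum((v - sum(p * q for p, q in zip(users[u], items[i]))) ** 2
             for u, i, v in ratings)
    return (se / len(ratings)) ** 0.5


# Toy stream: two batches drawn from the same rank-1 matrix (made-up data).
batch1 = [(u, i, (u + 1.0) * (i + 1.0)) for u in range(3) for i in range(2)]
batch2 = [(3, i, 4.0 * (i + 1.0)) for i in range(2)]
cold = train(batch1, rank=1, iterations=10, lam=0.01)            # cold start
warm = train(batch2, rank=1, iterations=2, lam=0.01, init=cold)  # warm start
```

Note that in this sketch the factors of users absent from a batch are carried over unchanged while the item factors are refit on the new batch alone, so the two can drift apart over time. How to handle that would be a design decision for a real implementation.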
[jira] [Commented] (SPARK-4981) Add a streaming singular value decomposition
[ https://issues.apache.org/jira/browse/SPARK-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299682#comment-14299682 ]

Tathagata Das commented on SPARK-4981:
---

+1 This will be awesome :P
[jira] [Commented] (SPARK-4981) Add a streaming singular value decomposition
[ https://issues.apache.org/jira/browse/SPARK-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259789#comment-14259789 ]

Reza Zadeh commented on SPARK-4981:
---

We could do matrix completion (least squares objective, regularized; note that this is not SVD) in a streaming fashion using Stochastic Gradient Descent. See the update equations in Algorithm 1:
http://stanford.edu/~rezab/papers/factorbird.pdf

The stream is over individual entries (as opposed to a whole row/column). We should probably do streaming matrix completion before streaming SVD.
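As a rough illustration of streaming over individual entries, here is a minimal plain-Python sketch of a generic regularized SGD update for matrix factorization. It is not a transcription of Algorithm 1 from the linked paper and it is not Spark code. The class name `StreamingMatrixCompletion`, its parameters (`rank`, `step`, `lam`), and the toy data are assumptions made up for this example.

```python
import random


class StreamingMatrixCompletion:
    """Keeps row/column factors and updates them one observed entry at a time."""

    def __init__(self, rank=2, step=0.1, lam=0.01, seed=0):
        self.rank, self.step, self.lam = rank, step, lam
        self.rng = random.Random(seed)
        self.rows, self.cols = {}, {}

    def _factor(self, table, key):
        # Lazily create a small positive factor vector for unseen ids.
        if key not in table:
            table[key] = [self.rng.uniform(0.1, 0.5) for _ in range(self.rank)]
        return table[key]

    def predict(self, i, j):
        p, q = self._factor(self.rows, i), self._factor(self.cols, j)
        return sum(a * b for a, b in zip(p, q))

    def update(self, i, j, value):
        """One SGD step on (value - p.q)^2 plus an L2 penalty; returns the pre-update error."""
        p, q = self._factor(self.rows, i), self._factor(self.cols, j)
        err = value - sum(a * b for a, b in zip(p, q))
        for k in range(self.rank):
            pk, qk = p[k], q[k]  # use the old values for both updates
            p[k] += self.step * (err * qk - self.lam * pk)
            q[k] += self.step * (err * pk - self.lam * qk)
        return err


# Toy stream: entries of a small rank-1 matrix, replayed several times.
truth = {(i, j): u * v
         for i, u in enumerate([1.0, 0.5, 0.8])
         for j, v in enumerate([1.0, 0.6])}
model = StreamingMatrixCompletion()


def mse():
    return sum((v - model.predict(i, j)) ** 2 for (i, j), v in truth.items()) / len(truth)


before = mse()
for _ in range(200):
    for (i, j), v in truth.items():
        model.update(i, j, v)
after = mse()
```

Each call to `update` touches only one row factor and one column factor, which is what makes a per-entry stream workable. The step size and regularization values here are arbitrary and would need tuning on real data.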