[jira] [Commented] (SPARK-4981) Add a streaming singular value decomposition

2017-08-18 Thread Nick Pentreath (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16132118#comment-16132118
 ] 

Nick Pentreath commented on SPARK-4981:
---

Hey folks, as interesting as this would be, I think it's fairly clear that it 
won't be moving ahead any time soon (and furthermore any 
ML-on-Structured-Streaming is not imminent). Shall we close this off?

> Add a streaming singular value decomposition
> 
>
> Key: SPARK-4981
> URL: https://issues.apache.org/jira/browse/SPARK-4981
> Project: Spark
> Issue Type: New Feature
> Components: DStreams, MLlib
> Reporter: Jeremy Freeman
>
> This is for tracking WIP on a streaming singular value decomposition 
> implementation. This will likely be more complex than the existing streaming 
> algorithms (k-means, regression), but should be possible using the family of 
> sequential update rules outlined in this paper:
> "Fast low-rank modifications of the thin singular value decomposition"
> by Matthew Brand
> http://www.stat.osu.edu/~dmsl/thinSVDtracking.pdf
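
For readers who want the shape of such a sequential update, here is a sketch of the column-append case as it is commonly presented for thin-SVD updating. The notation is an assumption chosen for illustration and is not taken from the issue text: X = U S V^T is the current rank-r thin SVD and c is a newly arrived column.

```latex
% Assumed notation: X = U S V^T (thin SVD, rank r), c = new column.
\begin{align*}
p &= U^{\top} c, \qquad e = c - U p, \qquad \rho = \lVert e \rVert_2, \qquad j = e / \rho, \\
\begin{bmatrix} X & c \end{bmatrix}
  &= \begin{bmatrix} U & j \end{bmatrix}
     \begin{bmatrix} S & p \\ 0 & \rho \end{bmatrix}
     \begin{bmatrix} V & 0 \\ 0 & 1 \end{bmatrix}^{\top}, \\
\begin{bmatrix} S & p \\ 0 & \rho \end{bmatrix}
  &= U' S' V'^{\top}
  \quad \text{(SVD of a small $(r+1) \times (r+1)$ matrix)}, \\
U_{\mathrm{new}} &= \begin{bmatrix} U & j \end{bmatrix} U', \qquad
S_{\mathrm{new}} = S', \qquad
V_{\mathrm{new}} = \begin{bmatrix} V & 0 \\ 0 & 1 \end{bmatrix} V'.
\end{align*}
```

Only the small (r+1) x (r+1) matrix is decomposed at each step, which is what makes the rule attractive for streaming. When rho is zero the new column already lies in the span of U, so j is undefined and the rank does not grow. To hold the rank fixed at r, the smallest singular value and its vectors can be dropped after each update.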



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4981) Add a streaming singular value decomposition

2015-02-01 Thread Reza Zadeh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300848#comment-14300848
 ] 

Reza Zadeh commented on SPARK-4981:
---

Another option: see slide 31 to solve the problem using IndexedRDDs, thanks to 
Ankur's nice slides and work on IndexedRDD:

https://issues.apache.org/jira/secure/attachment/12656374/2014-07-07-IndexedRDD-design-review.pdf




[jira] [Commented] (SPARK-4981) Add a streaming singular value decomposition

2015-01-30 Thread Reza Zadeh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299666#comment-14299666
 ] 

Reza Zadeh commented on SPARK-4981:
---

To be model parallel, we can simply warm-start the current ALS implementation 
in org.apache.spark.mllib.recommendation.

The work involved would be to expose a warm-start option in ALS, and then redo 
training on each batch RDD with, say, 2 iterations instead of 10.

The stream would be over batches of Ratings.

This should be the simplest option.
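
A minimal sketch of the warm-start idea, in plain Python rather than Spark, may make the proposal concrete. It does not use the MLlib ALS class, and every name in it (`solve`, `als_half_step`, `update_on_batch`, `rmse`) is hypothetical. The factor dictionaries persist across batches, and each batch runs only a couple of alternating least-squares sweeps starting from the previous factors.

```python
import random


def solve(A, b):
    """Solve the small dense system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def als_half_step(fixed, obs_by_key, rank, reg):
    """Refit one side's factors by ridge regression while the other side stays fixed."""
    out = {}
    for key, obs in obs_by_key.items():
        # Normal equations: (reg * I + sum f f^T) x = sum rating * f
        A = [[reg * (p == q) for q in range(rank)] for p in range(rank)]
        b = [0.0] * rank
        for other, rating in obs:
            f = fixed[other]
            for p in range(rank):
                b[p] += rating * f[p]
                for q in range(rank):
                    A[p][q] += f[p] * f[q]
        out[key] = solve(A, b)
    return out


def update_on_batch(U, V, batch, rank=2, reg=0.1, iterations=2, rng=random):
    """Warm start: refine the existing factors U and V in place on one batch
    of (user, item, rating) triples, using only a few iterations."""
    by_user, by_item = {}, {}
    for u, i, r in batch:
        # Ids never seen before get a small random factor vector.
        U.setdefault(u, [rng.uniform(-0.1, 0.1) for _ in range(rank)])
        V.setdefault(i, [rng.uniform(-0.1, 0.1) for _ in range(rank)])
        by_user.setdefault(u, []).append((i, r))
        by_item.setdefault(i, []).append((u, r))
    for _ in range(iterations):
        U.update(als_half_step(V, by_user, rank, reg))
        V.update(als_half_step(U, by_item, rank, reg))
    return U, V


def rmse(U, V, ratings):
    """Root-mean-square error of the factor model on the given ratings."""
    se = sum((r - sum(x * y for x, y in zip(U[u], V[i]))) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5
```

In a streaming job, `update_on_batch` would be called once per incoming batch with the same `U` and `V`. Factors for users and items absent from a batch are left untouched, which is one possible design choice among several.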





[jira] [Commented] (SPARK-4981) Add a streaming singular value decomposition

2015-01-30 Thread Tathagata Das (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299682#comment-14299682
 ] 

Tathagata Das commented on SPARK-4981:
--

+1 This will be awesome :P




[jira] [Commented] (SPARK-4981) Add a streaming singular value decomposition

2014-12-28 Thread Reza Zadeh (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259789#comment-14259789
 ] 

Reza Zadeh commented on SPARK-4981:
---

We could do matrix completion (least squares objective, regularized; note that 
this is not SVD) in a streaming fashion using Stochastic Gradient Descent.

See the update equations in Algorithm 1:
http://stanford.edu/~rezab/papers/factorbird.pdf

The stream is over individual entries (as opposed to a whole row or column).

We should probably do streaming matrix completion before streaming SVD.
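
As a rough illustration of entry-at-a-time updates, here is a small pure-Python sketch of the textbook regularized SGD step for a least-squares factorization. It may differ in details from Algorithm 1 in the linked paper, and the function names, learning rate, and regularization weight are all assumptions made for the example.

```python
import random


def make_factors(n, rank, rng):
    """Create n small random factor vectors of the given rank."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n)]


def sgd_update(U, V, i, j, value, lr=0.05, reg=0.01):
    """Apply one regularized SGD step for a single observed entry (i, j, value).

    Only row i of U and row j of V are touched, so each streamed entry
    costs O(rank) work.
    """
    u, v = U[i], V[j]
    err = value - sum(a * b for a, b in zip(u, v))
    for k in range(len(u)):
        uk, vk = u[k], v[k]  # use the pre-update values for both gradients
        u[k] += lr * (err * vk - reg * uk)
        v[k] += lr * (err * uk - reg * vk)
    return err


def rmse(U, V, entries):
    """Root-mean-square error of the factor model on the given entries."""
    se = sum((r - sum(x * y for x, y in zip(U[i], V[j]))) ** 2 for i, j, r in entries)
    return (se / len(entries)) ** 0.5


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical 3 x 4 rank-1 matrix used as the source of streamed entries.
    rows, cols = [1.0, 2.0, 3.0], [1.0, 0.5, 2.0, 1.5]
    entries = [(i, j, rows[i] * cols[j]) for i in range(3) for j in range(4)]
    U, V = make_factors(3, 2, rng), make_factors(4, 2, rng)
    for _ in range(5000):
        i, j, value = rng.choice(entries)  # one entry arrives at a time
        sgd_update(U, V, i, j, value)
    print("RMSE after streaming:", rmse(U, V, entries))
```

Because each step reads and writes only one row of each factor matrix, the same update can be applied to entries as they arrive, without waiting for a complete row or column.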
