Re: scala Vector vs mllib Vector

2014-10-04 Thread Dean Wampler
Spark isolates each task, so the usual hazards of shared mutable state
mostly don't apply; I would use the MLlib vector. I didn't mention this, but
it also integrates with Breeze, a Scala mathematics library that you might
find useful.
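
A minimal sketch of the in-place update pattern, using only the Scala standard library. The update rule (subtract a scaled gradient), the object name WeightUpdateSketch, and the method updateInPlace are assumptions for illustration; the thread does not say how the weights actually change. A plain Array[Double] stands in for the array inside a dense MLlib vector.

```scala
object WeightUpdateSketch {
  // Hypothetical update rule: w(i) := w(i) - stepSize * gradient(i).
  // The weights are mutated in place, so no new array is allocated per iteration.
  def updateInPlace(weights: Array[Double],
                    gradient: Array[Double],
                    stepSize: Double): Unit = {
    require(weights.length == gradient.length, "dimension mismatch")
    var i = 0
    while (i < weights.length) {
      weights(i) -= stepSize * gradient(i)
      i += 1
    }
  }

  def main(args: Array[String]): Unit = {
    val w = Array(1.0, 2.0, 3.0)
    updateInPlace(w, Array(1.0, 0.0, -2.0), 0.5)
    println(w.mkString(", ")) // prints "0.5, 2.0, 4.0"
  }
}
```

With spark-mllib on the classpath, org.apache.spark.mllib.linalg.Vectors.dense(w) builds an MLlib Vector from such an array when one needs to be passed to MLlib code.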

dean

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition (O'Reilly)
Typesafe
@deanwampler
http://polyglotprogramming.com

On Sat, Oct 4, 2014 at 8:52 AM, ll  wrote:

> thanks dean.  thanks for the answer with great clarity!
>
> i'm working on an algorithm that has a weight vector W(w0, w1, ..., wN).
> the elements of this weight vector are adjusted/updated frequently - every
> iteration of the algorithm.  how would you recommend implementing this
> vector?  what is the best practice to implement this in Scala & Spark?
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/scala-Vector-vs-mllib-Vector-tp15736p15741.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>


Re: scala Vector vs mllib Vector

2014-10-04 Thread ll
thanks dean.  thanks for the answer with great clarity!  

i'm working on an algorithm that has a weight vector W(w0, w1, ..., wN).  the
elements of this weight vector are adjusted/updated frequently - every
iteration of the algorithm.  how would you recommend implementing this
vector?  what is the best practice to implement this in Scala & Spark?






Re: scala Vector vs mllib Vector

2014-10-04 Thread Dean Wampler
Briefly, MLlib's Vector and its concrete subclasses DenseVector and
SparseVector wrap Java arrays, which are mutable and maximize memory
efficiency. To update one of these vectors, you mutate the elements of the
underlying array. That's great for performance, but dangerous in
multithreaded programs for all the usual reasons. Scala's Vector is a
*persistent data structure* (best to google that term...), with effectively
O(1) operations, but a higher constant factor. Scala Vector instances are
immutable, so "mutating" operations return a new Vector, but the "persistent"
implementation uses structure sharing (reusing the unchanged parts) to make
copies efficient.
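
The difference between the two update styles can be sketched with the standard library alone. The object and method names here (PersistentVsMutable, persistentUpdate, inPlaceUpdate) are made up for illustration, and a plain Array[Double] stands in for the array behind an MLlib vector.

```scala
object PersistentVsMutable {
  // Persistent update: returns a new Vector and leaves the original untouched.
  def persistentUpdate(v: Vector[Double], i: Int, x: Double): Vector[Double] =
    v.updated(i, x)

  // In-place update: every reference to the same array sees the change.
  def inPlaceUpdate(a: Array[Double], i: Int, x: Double): Unit =
    a(i) = x

  def main(args: Array[String]): Unit = {
    val v1 = Vector(1.0, 2.0, 3.0)
    val v2 = persistentUpdate(v1, 1, 20.0)
    println(v1) // Vector(1.0, 2.0, 3.0)
    println(v2) // Vector(1.0, 20.0, 3.0)

    val a = Array(1.0, 2.0, 3.0)
    val alias = a // a second reference to the same array
    inPlaceUpdate(a, 1, 20.0)
    println(alias(1)) // 20.0
  }
}
```

The alias seeing the new value is exactly what makes the array version fast and, when the array is shared across threads, hazardous.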

Also, Scala Vector isn't designed to represent sparse vectors.
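
To make the sparse point concrete, here is a rough sketch of a sparse layout that stores only the non-zero entries. SparseSketch is a hypothetical stand-in written for this example, not MLlib's class; it uses the same three pieces (size, indices, values) that MLlib's Vectors.sparse factory takes, and it assumes the indices are sorted in ascending order.

```scala
// Hypothetical stand-in for a sparse vector: only non-zero entries are stored.
// Assumes `indices` is sorted ascending and has the same length as `values`.
final case class SparseSketch(size: Int, indices: Array[Int], values: Array[Double]) {
  require(indices.length == values.length, "indices and values must align")

  // Look up element i; anything not stored is implicitly zero.
  def apply(i: Int): Double = {
    require(i >= 0 && i < size, s"index $i out of bounds")
    val pos = java.util.Arrays.binarySearch(indices, i)
    if (pos >= 0) values(pos) else 0.0
  }

  // Number of entries actually held in memory.
  def storedEntries: Int = values.length
}

object SparseSketchDemo {
  def main(args: Array[String]): Unit = {
    // One million logical elements, two stored values.
    val s = SparseSketch(1000000, Array(3, 42), Array(1.5, -2.0))
    println(s(42))           // -2.0
    println(s(7))            // 0.0
    println(s.storedEntries) // 2
  }
}
```

A Scala Vector[Double] of the same logical size would hold all one million elements, zeros included.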

dean


On Sat, Oct 4, 2014 at 1:44 AM, ll  wrote:

> what are the pros/cons of each?  when should we use mllib Vector, and when
> to use standard scala Vector?  thanks.
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/scala-Vector-vs-mllib-Vector-tp15736.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.