AFAIK ordering is not strictly guaranteed unless the RDD is the
product of a sort. I think that in practice, you'll never find
elements of a file read in some random order, for example (although
see the recent issue about partition ordering potentially depending on
how the local file system lists them).
Sean,
On Mon, Jan 26, 2015 at 10:28 AM, Sean Owen wrote:
> Note that RDDs don't really guarantee anything about ordering though,
> so this only makes sense if you've already sorted some upstream RDD by
> a timestamp or sequence number.
>
Speaking of order, is there some reading on guarantees and semantics of RDD ordering?
> distance between (x1,y1) and (x2,y2), and
> distance between (x2,y2) and (x3,y3)
>
> Imagine that the list of coordinate points comes from a GPS and describes a
> trip.
>
> - Steve
>
> From: Joseph Lust
> Date: Sunday, January 25, 2015 at 17:17
> To: Steve Nunez , "user@spark.apache.org"
If this is really about just Scala Lists, then a simple answer (using
tuples of doubles) is:
val points: List[(Double, Double)] = ...
// All ordered pairs, including each point paired with itself (distance 0)
val distances = for (p1 <- points; p2 <- points) yield {
  val dx = p1._1 - p2._1
  val dy = p1._2 - p2._2
  math.sqrt(dx * dx + dy * dy)
}
distances.sum / 2
It's "/ 2" because each unordered pair is counted twice, once in each order.
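For the GPS-trip use case, though, what's wanted is the sum of distances between *consecutive* points, not all pairs. On a plain Scala collection a sliding window expresses that directly; a small sketch (the sample points are made up for illustration):

```scala
val points: List[(Double, Double)] = List((0.0, 0.0), (3.0, 4.0), (3.0, 8.0))
// Pair each point with its successor, then sum the segment lengths.
val tripLength = points.sliding(2).collect {
  case List((x1, y1), (x2, y2)) =>
    math.sqrt((x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1))
}.sum
// (0,0)->(3,4) is 5.0 and (3,4)->(3,8) is 4.0, so tripLength is 9.0
```

The `collect` with a partial function also quietly handles a single-element list, where `sliding(2)` yields a window of length 1 that simply doesn't match.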
Subject: Re: Pairwise Processing of a List
So you’ve got a point A and you want the sum of distances between it and all
other points? Or am I misunderstanding you?
// target point; could be a Broadcast variable sent to all workers
val tarPt = (10, 20)
val pts = Seq((2, 2), (3, 3), (2, 3), (10, 2))
val rdd = sc.parallelize(pts)
// distance from each point to the target, summed across the cluster
rdd.map(pt => math.sqrt(
  math.pow(pt._1 - tarPt._1, 2) + math.pow(pt._2 - tarPt._2, 2)
)).sum()
Hi,
On Mon, Jan 26, 2015 at 9:32 AM, Steve Nunez wrote:
> I’ve got a list of points: List[(Float, Float)] that represent (x,y)
> coordinate pairs and need to sum the distances. It’s easy enough to compute
> the distance:
>
Are you saying you want all combinations (N^2) of distances? That should be doable, though it grows quadratically with the number of points.