Hi Guillermo,
assuming that the first "a,b" is a typo and you actually meant "a,d",
this is a sorting problem.

You could easily model your data as an RDD or tuples (or as a
dataframe/set) and use the sortBy (or orderBy for dataframe/sets)
methods.

best,
--Jakob

On Wed, Feb 24, 2016 at 2:26 PM, Guillermo Ortiz <konstt2...@gmail.com> wrote:
> I want to do some algorithm in Spark.. I know how to do it in a single
> machine where all data are together, but I don't know a good way to do it in
> Spark.
>
> If someone has an idea..
> I have some data like this
> a , b
> x , y
> b , c
> y , y
> c , d
>
> I want something like:
> a , d
> b , d
> c , d
> x , y
> y , y
>
> I need to know that a->b->c->d, so a->d, b->d and c->d.
> I don't want the code, just an idea how I could deal with it.
>
> Any idea?

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org

Reply via email to