Hi Guillermo, assuming that the first "a,b" is a typo and you actually meant "a,d", this is a sorting problem.
You could easily model your data as an RDD of tuples (or as a DataFrame/Dataset) and use the sortBy method (or orderBy for DataFrames/Datasets).

best,
--Jakob

On Wed, Feb 24, 2016 at 2:26 PM, Guillermo Ortiz <konstt2...@gmail.com> wrote:
> I want to do some algorithm in Spark.. I know how to do it in a single
> machine where all data are together, but I don't know a good way to do it in
> Spark.
>
> If someone has an idea..
> I have some data like this
> a , b
> x , y
> b , c
> y , y
> c , d
>
> I want something like:
> a , d
> b , d
> c , d
> x , y
> y , y
>
> I need to know that a->b->c->d, so a->d, b->d and c->d.
> I don't want the code, just an idea how I could deal with it.
>
> Any idea?
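
P.S. In case it helps to see the sorting idea concretely, here is a minimal sketch on the sample pairs from the quoted message. It uses plain Python's sorted() as a local stand-in so it runs without a cluster; the helper name sort_pairs, the column names "src"/"dst", and the sc/spark variables in the comments are assumptions for illustration, not anything from the original thread.

```python
# Sample pairs copied from the quoted message.
pairs = [("a", "b"), ("x", "y"), ("b", "c"), ("y", "y"), ("c", "d")]


def sort_pairs(rows):
    """Order (key, value) tuples by key, then by value (hypothetical helper)."""
    return sorted(rows, key=lambda kv: (kv[0], kv[1]))


sorted_pairs = sort_pairs(pairs)
for key, value in sorted_pairs:
    print(key, ",", value)

# Assumed PySpark equivalents, given an existing SparkContext `sc`
# and SparkSession `spark`:
#   sc.parallelize(pairs).sortBy(lambda kv: kv).collect()
#   spark.createDataFrame(pairs, ["src", "dst"]).orderBy("src", "dst")
```

Note that sorting only reorders the pairs; the second element of each pair stays exactly as it was in the input.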