Deepak, depending on your use case, you might find it appropriate, and certainly easy, to create a lightweight sequence number service that serves requests from parallel clients.
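To make the idea concrete, here is a rough sketch of such a sequence number service, written in Python and kept in-process for brevity. Everything in it is hypothetical: the names SequenceService, BlockClient, reserve, id_for and run_demo are made up for illustration, and a real deployment would put the counter behind a small network endpoint. It also assumes each UUID is loaded by exactly one client, so no two clients ever need to agree on the same key. Each client reserves a block of IDs at a time, so the central counter is touched once per block rather than once per UUID.

```python
import threading
import uuid


class SequenceService:
    """Hands out disjoint blocks of consecutive integer IDs (hypothetical, in-process)."""

    def __init__(self, start=0):
        self._next = start
        self._lock = threading.Lock()

    def reserve(self, count):
        """Reserve `count` IDs and return them as a half-open range."""
        if count <= 0:
            raise ValueError("count must be positive")
        with self._lock:
            first = self._next
            self._next += count
        return range(first, first + count)


class BlockClient:
    """Per-worker helper: assigns IDs locally, refilling from the service in blocks."""

    def __init__(self, service, block_size=1000):
        self._service = service
        self._block_size = block_size
        self._ids = iter(())   # no IDs reserved yet
        self._mapping = {}     # key (e.g. a UUID) -> assigned integer ID

    def id_for(self, key):
        """Return the ID already assigned to `key`, or assign the next free one."""
        if key in self._mapping:
            return self._mapping[key]
        try:
            value = next(self._ids)
        except StopIteration:
            # Current block is used up: reserve a fresh one from the service.
            self._ids = iter(self._service.reserve(self._block_size))
            value = next(self._ids)
        self._mapping[key] = value
        return value


def run_demo(workers=4, keys_per_worker=2500):
    """Run several clients in parallel threads and return every assigned ID."""
    service = SequenceService()
    results = [None] * workers

    def work(i):
        client = BlockClient(service, block_size=1000)
        results[i] = [client.id_for(uuid.uuid4()) for _ in range(keys_per_worker)]

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [v for ids in results for v in ids]
```

The IDs come out unique across clients but not dense: a client that stops partway through a block leaves a gap. That should be harmless if the Longs only need to be distinct vertex IDs.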
http://stackoverflow.com/questions/2671858/distributed-sequence-number-generation/5685869

There's no shame in using non-Spark or custom services in aid of Spark processing :)

--
Christopher T. Nguyen
Co-founder & CEO, Adatao <http://adatao.com>
linkedin.com/in/ctnguyen

On Mon, Feb 24, 2014 at 10:53 AM, Deepak Nulu <deepakn...@gmail.com> wrote:

> Hi Evan,
>
> Thanks for the quick response. The only mapping between UUIDs and Longs that
> I can think of is one where I sequentially assign Longs as I load the UUIDs
> from the DB. But this results in having to centralize this mapping. I am
> guessing that centralizing this is not a good idea for a distributed graph
> processing engine.
>
> Also, I will be running Spark on the same nodes as my distributed DB
> (Cassandra) and I am hoping that the Spark worker on each node will load the
> data from the local Cassandra node. I am not sure if this is possible with
> GraphX, but I am hoping it is, and therefore my concern with centralizing
> the UUID<->Long mapping.
>
> Thanks.
>
> -deepak
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/GraphX-with-UUID-vertex-IDs-instead-of-Long-tp1953p1982.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.