Implementing random walk in spark

naveenkumarmarri Wed, 24 Feb 2016 09:41:33 -0800

Hi,

I'm new to Spark and am trying to compute similarity between users/products.
I have a huge table that I can't self-join on the cluster I have.


I'm trying to implement the self join using a random walk methodology, which
will give approximate results. The table represents a bipartite graph with 2
columns.

Idea:

   - Pick an element (t1) from the first column at random.
   - Pick the corresponding element (t2) for t1 in the graph.
   - Look up the elements connected to t2 in the graph and pick one at
   random, say t3.
   - Create an edge between t1 and t3.
   - Repeat this on the order of at least n*n times so that the results
   approximate the full self join.
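
The steps above can be sketched on a single machine as follows. This is a
minimal plain-Python illustration of the sampling idea, not Spark code; the
function name sample_similarity_edges, the num_walks and seed parameters, and
the choice to skip walks that return to t1 are all assumptions made for the
example.

```python
import random
from collections import defaultdict


def sample_similarity_edges(pairs, num_walks, seed=None):
    """Approximate a self join on the second column with two-step random walks.

    pairs     -- iterable of (left, right) rows of the two-column table
    num_walks -- how many walks to sample (hypothetical parameter)
    seed      -- optional seed for reproducible sampling (hypothetical)
    """
    rng = random.Random(seed)

    # Build adjacency lists for both sides of the bipartite graph.
    left_to_right = defaultdict(list)
    right_to_left = defaultdict(list)
    for left, right in pairs:
        left_to_right[left].append(right)
        right_to_left[right].append(left)

    left_nodes = sorted(left_to_right)
    edge_counts = defaultdict(int)

    for _ in range(num_walks):
        t1 = rng.choice(left_nodes)          # random element of column 1
        t2 = rng.choice(left_to_right[t1])   # random neighbour of t1
        t3 = rng.choice(right_to_left[t2])   # random neighbour of t2
        if t1 != t3:                         # assumption: ignore self-edges
            # Store the edge as an unordered pair and count how often it is hit.
            edge_counts[tuple(sorted((t1, t3)))] += 1

    return dict(edge_counts)


if __name__ == "__main__":
    table = [("u1", "p1"), ("u2", "p1"), ("u3", "p2")]
    print(sample_similarity_edges(table, num_walks=200, seed=0))
```

The count attached to each edge shows how often the walk reached that pair,
which could serve as a rough similarity weight.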

Questions


   - Is Spark a suitable environment for this?
   - I've coded the logic for picking elements at random, but I'm facing
   issues when building the graph.
   - Should I consider GraphX?

Any help is highly appreciated.

Regards,
Naveen
