Hi,

My problem: integrating a Giraph LinkRank implementation into the Nutch 2.x DbUpdate stage.

In Nutch 2.x, DbUpdateJob applies DbUpdateMapper and DbUpdateReducer to the WebPage objects, and some scoring filters are activated as well.
I want to export this data as:

nodes:
http://www.google.com 0.33
http://www.yahoo.com 0.33
http://www.bing.com 0.33

edges:
http://www.google.com http://www.yahoo.com
http://www.yahoo.com http://www.bing.com
http://www.bing.com http://www.google.com


and then use this data in my Giraph LinkRank implementation. I could not figure out where and how to extract this data, call a Giraph job, and use the results in the next steps of Nutch 2.x.
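
To make the target format concrete, here is a minimal plain-Java sketch that produces the node and edge lines shown above. It assumes the URL, score, and outlink URLs have already been read from each WebPage. The names LinkGraphExport, Page, nodeLines, and edgeLines are hypothetical and belong to neither Nutch, Gora, nor Giraph; the sketch only illustrates the text layout, not how the data is fetched from the store.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class LinkGraphExport {

    /** Hypothetical holder for the per-page data: a score and the outlink URLs. */
    static final class Page {
        final float score;
        final List<String> outlinks;

        Page(float score, List<String> outlinks) {
            this.score = score;
            this.outlinks = outlinks;
        }
    }

    /** One "url score" line per page, with the score printed to two decimals. */
    static List<String> nodeLines(Map<String, Page> pages) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Page> e : pages.entrySet()) {
            String score = String.format(Locale.ROOT, "%.2f", e.getValue().score);
            lines.add(e.getKey() + " " + score);
        }
        return lines;
    }

    /** One "source target" line per outlink. */
    static List<String> edgeLines(Map<String, Page> pages) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Page> e : pages.entrySet()) {
            for (String target : e.getValue().outlinks) {
                lines.add(e.getKey() + " " + target);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample data taken from the example in this post.
        Map<String, Page> pages = new LinkedHashMap<>();
        pages.put("http://www.google.com", new Page(0.33f, Arrays.asList("http://www.yahoo.com")));
        pages.put("http://www.yahoo.com", new Page(0.33f, Arrays.asList("http://www.bing.com")));
        pages.put("http://www.bing.com", new Page(0.33f, Arrays.asList("http://www.google.com")));

        System.out.println("nodes:");
        nodeLines(pages).forEach(System.out::println);
        System.out.println();
        System.out.println("edges:");
        edgeLines(pages).forEach(System.out::println);
    }
}
```

In a real job the map would presumably be replaced by whatever iterates over the stored WebPage rows, and the two line lists would be written to files that the Giraph job reads as vertex and edge input.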


Should I write a Gora job to export/import the data in HBase, plus a Giraph job that reads from HBase and runs LinkRank?


Best,

