Some questions regarding nutch in distributed computing environment

jeffersonzhou Wed, 10 Aug 2011 01:24:47 -0700

Hi,


I have three questions and hope someone can help answer them.

 

1. Is there a way to update, add, or delete entries in the crawlDB? I am
mainly interested in how to do this in a distributed computing environment.

2. I have used Berkeley DB with standalone Nutch, and I want to use
Berkeley DB in a distributed Nutch environment. How can I read from and write
to a Berkeley DB stored in HDFS?

3. I have stored some frequently used data in memory. How can that data
be accessed by all the Nutch instances?

 

Thanks
