Hi, all,

I am a new Nutch user. Before I learned about Nutch, I wrote a crawler
myself, but its quality was not good, so I decided to try Nutch.

After reading some material about Nutch, I noticed that it stores all of
the crawled pages in persistent Lucene indexes. In my project, I would like
to get the crawled data in memory so that I can manipulate it in Java or C#
collections. I would rather not have to read it back from the indexes that
Nutch builds.

Is there a way to do that? Thanks so much!

Best regards,
Li Bing
