Hi,

On Thu, Jun 10, 2010 at 20:17, Alex McLintock <[email protected]> wrote:
> I'm not exactly new to Nutch, but haven't used it for a year or so.
> I'm a bit out of touch with current "state of the art".
>
> I see there is some HBase code in the form of some patches. I don't
> know whether this is more than "proof of concept" stuff.
>
> I also see that there is a 1.1 release candidate in the works.
>
> however I can see no mention of HBase in the release candidate? Is it
> there at all?
>
> If I use Nutch I am going to have to develop several plugins of my own
> and perhaps change the way that URLs are found for second and
> subsequent crawls. I think that HBase would significantly help with
> this.
>
>
> References:
> http://www.gossamer-threads.com/lists/lucene/general/99072 [VOTE]
> Apache Nutch 1.1 Release Candidate #2
> and
> http://people.apache.org/~mattmann/apache-nutch-1.1/rc2/CHANGES-1.1.txt
> and
> https://issues.apache.org/jira/browse/NUTCH-650
>

Nutch-HBase integration is still on track, but development slowed down a
lot for a while. It is currently picking up speed again. Early next week
I will send an email explaining the current situation, and then we can
discuss next steps from there.

FWIW, my goal is to finish it for Nutch 2.0.

--
Doğacan Güney

