Hi,

Seeing as we have the ball rolling with the 2.0 RC, I thought I'd ask about a suitable project descriptor.
So far on trunk we have

** Apache Nutch is an open source web-search software project. Stemming from Apache Lucene, it now builds on Apache Solr, adding web-specifics such as a crawler, a link-graph database and parsing support handled by Apache Tika for HTML and an array of other document formats.

This is merely a pot shot, but I was thinking for Nutch 2.0, something like

** Apache Nutch 2.X is an experimental branch of the Apache Nutch open source web-search software project. It builds on Apache Gora for data persistence and Apache Solr for indexing, adding web-specifics such as a crawler, a link-graph database and parsing support handled by Apache Tika for HTML and an array of other document formats.

Although there are not many changes here, I just wanted to run it by you folks...?

Thanks
Lewis

