" and and array other document " looks like a typo, rest is fine On 13 June 2012 13:45, Ferdy Galema <[email protected]> wrote:
> Hi,
>
> I would remove the 'experimental' notion. Aside from that it's fine with
> me.
>
> Ferdy.
>
>
> On Wed, Jun 13, 2012 at 2:29 PM, Lewis John Mcgibbney <
> [email protected]> wrote:
>
>> Hi,
>>
>> Seeing as we have the ball rolling with the 2.0 RC. I thought I'd ask
>> about a suitable project descriptor.
>>
>> So far on trunk we have
>>
>> ** Apache Nutch is an open source web-search software project.
>> Stemming from Apache Lucene, it now builds on Apache Solr adding
>> web-specifics, such as a crawler, a link-graph database and parsing
>> support handled by Apache Tika for HTML and and array other document
>> formats.
>>
>> This is merely a pot shot, but I was thinking for Nutch 2.0, something
>> like
>>
>> ** Apache Nutch 2.X is an experimental branch of the Apache Nutch open
>> source web-search software project. It builds on Apache Gora for data
>> persistence and Apache Solr for indexing adding web-specifics, such as
>> a crawler, a link-graph database and parsing support handled by Apache
>> Tika for HTML and and array other document formats.
>>
>> Although there are not many changes here I just wanted to run it by
>> you folks...?
>>
>> Thanks
>> Lewis
>>
>> --
>> Lewis
>>
>
>

--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

