[cc to crawler-commons list]

I wasn't part of the initial discussion, so I don't know what the arguments
for and against were.
I suppose it depends partly on user adoption. The project has had a slow
start, but with this initial release it should gain a bit of traction. The
license is already Apache 2.0. We'll see how it goes, but as long as it
thrives I don't really mind where it lives.

Julien

On 6 July 2011 21:15, Markus Jelsma <[email protected]> wrote:

> Impressive! Are you guys going for the ASF incubator?
>
> > [Apologies for cross-posting]
> >
> > The initial release of crawler-commons is available from :
> > http://code.google.com/p/crawler-commons/downloads/list
> >
> > The purpose of this project is to develop a set of reusable Java
> > components that implement functionality common to any web crawler. These
> > components would benefit from collaboration among various existing web
> > crawler projects, and reduce duplication of effort.
> > The current version contains resources for :
> > - parsing robots.txt
> > - parsing sitemaps
> > - URL analyzer which returns Top Level Domains
> > - a simple HttpFetcher
> >
> > This release is available on Sonatype's OSS Nexus repository
> > [https://oss.sonatype.org/content/repositories/releases/com/google/code/crawler-commons/]
> > and should be available on Maven Central soon.
> >
> > Please send your questions, comments or suggestions to
> > http://groups.google.com/group/crawler-commons
> >
> > Best regards,
> >
> > Julien
> >
> > --
> >
> > Open Source Solutions for Text Engineering
> >
> > http://digitalpebble.blogspot.com/
> > http://www.digitalpebble.com
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
