Impressive! Are you guys going for the ASF incubator?

> [Apologies for cross-posting]
> 
> The initial release of crawler-commons is available from:
> http://code.google.com/p/crawler-commons/downloads/list
> 
> The purpose of this project is to develop a set of reusable Java components
> that implement functionality common to any web crawler. These components
> would benefit from collaboration among various existing web crawler
> projects, and reduce duplication of effort.
> The current version contains resources for:
> - parsing robots.txt
> - parsing sitemaps
> - a URL analyzer that returns top-level domains
> - a simple HttpFetcher
> 
> This release is available on Sonatype's OSS Nexus repository
> [https://oss.sonatype.org/content/repositories/releases/com/google/code/crawler-commons/]
> and should be available on Maven Central soon.
> 
> Please send your questions, comments or suggestions to
> http://groups.google.com/group/crawler-commons
> 
> Best regards,
> 
> Julien
> 
> --
> 
> Open Source Solutions for Text Engineering
> 
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
