Hey Luca,

On Mon, Mar 2, 2015 at 1:08 PM, <[email protected]> wrote:

>
> I'm new to using Any23, and it's already been a great library to use.
>

Great.


> However I'm stuck with something rather basic. I followed this example
> on how to simply GET a URL and return the triples it contains:
> http://any23.apache.org/dev-data-extraction.html
>

OK


>
> I'd like to run many HTTP requests in a non-blocking fashion,
> concurrently. Are there facilities to do this using the HTTP code
> contained in Any23?
>

There is no code in Any23 for this. You may, however, wish to investigate
the Any23 Basic HTTP crawler plugin:
https://github.com/apache/any23/tree/master/plugins/basic-crawler
You can define the number of crawlers on the command line:
https://github.com/apache/any23/blob/master/plugins/basic-crawler/src/main/java/org/apache/any23/cli/Crawler.java#L67
As an alternative, you could investigate using something like Crawler
Commons [0] or Apache Nutch [1] to deal with the HTTP logic.
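
If you would rather stay with plain Java, one option is to wrap the
extraction call from the example you followed in a task and run the tasks
on a thread pool. Below is a minimal sketch using only the JDK. The names
ConcurrentExtraction, extractAll and extractor are hypothetical and not
part of Any23; the extractor function is where your own per-URL extraction
code would go. Note that this is concurrent rather than truly non-blocking:
each request still blocks its own worker thread. I have not verified
whether a single Any23 instance can be shared between threads, so creating
one per task is the conservative choice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of Any23.
final class ConcurrentExtraction {

    /**
     * Applies `extractor` to every URL on a fixed-size thread pool and
     * returns the results keyed by URL, in the order the URLs were given.
     * `extractor` is an assumption: it stands in for whatever per-URL
     * work you need, e.g. your own wrapper around the Any23 extraction.
     */
    static <R> Map<String, R> extractAll(List<String> urls,
                                         Function<String, R> extractor,
                                         int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per URL; at most `threads` run at a time.
            List<Future<R>> futures = new ArrayList<>();
            for (String url : urls) {
                Callable<R> task = () -> extractor.apply(url);
                futures.add(pool.submit(task));
            }
            // Collect results; get() waits until each task has finished.
            Map<String, R> results = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                results.put(urls.get(i), futures.get(i).get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            // Surface the original failure from the worker thread.
            throw new IllegalStateException("extraction failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The thread count plays roughly the same role as the number of crawlers in
the plugin above: it caps how many requests are in flight at once. A single
failing URL aborts the whole batch in this sketch, so you may prefer to
catch exceptions inside your extractor and return an error value instead.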

[0] https://code.google.com/p/crawler-commons/
[1] http://nutch.apache.org
