Hi,

just my 2 cents on this one.

It doesn't matter which site we use: this will come back to us sooner or later. So I'd rather have the test itself start an embedded Jetty server and point the tests at a local URL. It is more work to do, but we fix it once and for all, and the tests can be richer, since you can test more things when you know exactly what is going to be crawled/fetched/extracted etc.
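To make the idea concrete, here is a rough sketch of a test that brings up its own local site and builds the seed URL from it. Note the assumptions: it uses the JDK's built-in com.sun.net.httpserver.HttpServer instead of Jetty, only to keep the example free of extra dependencies, and the class name LocalSiteSketch, the fixture page and the fetch helper are all made up for illustration. The real CrawlerTest would hand the local URL to the crawler CLI rather than fetch it directly.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: a local, fully controlled site for the crawler test.
class LocalSiteSketch {

    // Made-up fixture page carrying a single RDFa statement, so the test
    // knows in advance what should be extracted.
    static final String PAGE =
        "<html xmlns:dc=\"http://purl.org/dc/elements/1.1/\">"
        + "<body><h1 property=\"dc:title\">Local fixture</h1></body></html>";

    // Starts an HTTP server on a free loopback port that serves PAGE at "/".
    static HttpServer startLocalSite() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            byte[] body = PAGE.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "text/html; charset=UTF-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    // Stand-in for the crawler: simply downloads the page at the given URL.
    static String fetch(String url) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startLocalSite();
        try {
            // The seed URL points at the local server, not at a public site.
            String seed = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
            String html = fetch(seed);
            System.out.println("Fetched " + html.length() + " chars from " + seed);
        } finally {
            server.stop(0); // always shut the server down, even if the test fails
        }
    }
}
```

Binding to port 0 lets the OS pick a free port, so the test should not clash with anything else running on the build machine.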
If no one is willing to take this, I can volunteer ;-) and start by fixing the first one:
org.apache.any23.cli.CrawlerTest

Cheers
Szymon

On 20 April 2012 14:00, Michele Mostarda <[email protected]> wrote:
> A solution would be to use another RDF enabled site guaranteed to be more
> reliable to test the Crawler CLI.
>
> The best.
>
> Mic
>
> On 19 April 2012 23:06, Lewis John Mcgibbney <[email protected]>wrote:
>
>> It doesn't look like the crawler actually picks up any document???
>>
>> -------------------------------------------------------
>> T E S T S
>> -------------------------------------------------------
>> Running org.apache.any23.cli.CrawlerTest
>>
>> ------------------------------------------------------------------------
>> Apache Any23 :: crawler
>> ------------------------------------------------------------------------
>>
>> Deleting content of:
>> /tmp/crawler-metadata-dc5e0a70-7fbd-41bc-a66c-3549b37e9f38/frontier
>> log4j:WARN No appenders could be found for logger
>> (edu.uci.ics.crawler4j.crawler.CrawlController).
>> log4j:WARN Please initialize the log4j system properly.
>> Processing page: [http://eventiesagre.it/]
>> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 11.56 sec
>> <<< FAILURE!
>> Total Documents: 0, Total Triples: 0
>>
>
>
>
> --
> Michele Mostarda
> Senior Software Engineer
> skype: michele.mostarda
> twitter: micmos
> mail: [email protected]
> site : http://www.michelemostarda.com
