> > I think it would be better to have the junit tests start jetty then
> > crawl localhost. I'd love to see some end-to-end unit tests like that.

+1

> I think this would also make it nice to test things like recursive
> linking, parsing pdfs or other file formats, observing robots.txt or
> any crawling bugs that are encountered and then fixed.

I think end-to-end testing must focus on end-to-end problems (i.e. PDF parsing is already checked by unit tests, and that is really the right place for it).
It would be better to perform some end-to-end tests (functional tests) that check, for example (not exhaustive):

* that, across many configurations, the right documents are fetched and correctly parsed (as you suggested);
* some limit cases: protocol errors, corrupted content, ...
* fetching/crawling/indexing performance with many configurations;
* search performance with varying query loads, database sizes, ...

These are just some ideas... But it would be very cool if you could work on this subject.

> Suggestions for where to put such test content in the tree?

What about creating a trunk/qa "module"?

Regards

Jérôme

-- 
http://motrech.free.fr/
http://www.frutch.org/
