Hey there,

I'm having issues searching with my newly (vastly) expanded database. Could anyone shed any light on this? Basically, on a newly started server, I search for "test", and this appears in catalina.out:

2006-12-20 10:51:40,710 INFO NutchBean - creating new bean
2006-12-20 10:51:40,725 INFO NutchBean - opening merged index in crawl/index
2006-12-20 10:51:40,871 INFO Configuration - found resource common-terms.utf8 at file:/nutch/apache-tomcat-5.5/webapps/ROOT/WEB-INF/classes/common-terms.utf8
2006-12-20 10:51:40,880 INFO NutchBean - opening segments in crawl/segments
2006-12-20 10:51:40,898 INFO SummarizerFactory - Using the first summarizer extension found: Basic Summarizer
2006-12-20 10:51:40,901 INFO NutchBean - opening linkdb in crawl/linkdb
2006-12-20 10:51:40,907 INFO NutchBean - query request from 195.166.60.2
2006-12-20 10:51:40,925 INFO NutchBean - query: test
2006-12-20 10:51:40,925 INFO NutchBean - lang: en
2006-12-20 10:51:40,974 INFO NutchBean - searching for 20 raw hits
2006-12-20 10:52:13,306 ERROR [jsp] - Servlet.service() for servlet jsp threw exception
java.lang.OutOfMemoryError: Java heap space

If I then refresh the page (which is blank, by the way), I get this:

2006-12-20 10:53:23,729 INFO NutchBean - query request from 195.166.60.2
2006-12-20 10:53:23,730 INFO NutchBean - query: test
2006-12-20 10:53:23,730 INFO NutchBean - lang: en
2006-12-20 10:53:23,735 INFO NutchBean - searching for 20 raw hits
2006-12-20 10:54:04,685 ERROR [jsp] - Servlet.service() for servlet jsp threw exception
java.lang.RuntimeException: java.lang.NoClassDefFoundError

..plus a lot of stack trace.

The odd thing, though, is that if I do this:

[EMAIL PROTECTED]:/nutch$ bin/nutch org.apache.nutch.searcher.NutchBean test
Total hits: 64106
 0 20061215102534/http://www.dyslexia-test.co.uk/
 ... About us About dyslexia Dyslexia Test 7-16 Dyslexia Test for Adults Frequently ... results in the test ...
 1 20061215102534/http://www.dsa.gov.uk/
[etc]

it works absolutely fine. Does anyone have any idea what might be preventing the web interface from working properly? I have seen this Tomcat installation work with exactly the same webapp before - that is, before I expanded the index.

[EMAIL PROTECTED]:/nutch$ bin/nutch readdb crawl/crawldb/ -stats
CrawlDb statistics start: crawl/crawldb/
Statistics for CrawlDb: crawl/crawldb/
TOTAL urls: 11502550
retry 0: 11429183
retry 1: 61224
retry 2: 10594
retry 3: 1549
min score: 0.0
avg score: 0.05785237
max score: 1309.991
status 1 (DB_unfetched): 9067758
status 2 (DB_fetched): 2221161
status 3 (DB_gone): 213631
CrawlDb statistics: done

Any help would be great.

Thanks
-Rob
