Vishal Shah wrote:
> Hello Richard,
>
> How big is your index?

My index is quite small, probably 1000 pages.

> I have had these problems sometimes. In some cases, it's because
> there isn't enough memory to search the results. This can be
> resolved by setting JAVA_OPTS="-Xmx512m" or some other suitable
> value. Try setting this env variable or putting the export statement
> in the bin/startup.sh script for tomcat to see if it works.

OK, this is good general knowledge, but I don't think this is my problem.

> Also, have you merged your indices by running the bin/nutch merge
> command? Make sure the output of this command goes to a dir called
> index, because that's what the searcher looks for first. If not found,
> it will look for the indexes directory.

I have an indexes directory. I haven't run merge; since I have only built one index, I didn't need to. I just do the basic generate, fetch, and update steps, followed by invertlinks and then an index command. Only after the next index command, at which point I have two indexes, do I run the merge command. Sometimes I just delete the index and reindex everything.

> Try giving this a shot too.
>
> Regards,
>
> -vishal.
>
> -----Original Message-----
> From: Richard Braman [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, September 19, 2006 5:00 AM
> To: [email protected]
> Subject: Nutch running on FC5 - No search results yet
>
> Thanks Andrzej, when I installed Fedora Core 5 it had an option for Java
> development kit, which I incorrectly assumed was JDK. I was able to
> get JDK up and running on Fedora using JPackage for Sun Compat. There are
> some good instructions here for others who may need to get nutch up on
> FC5: http://ccl.net/cca/software/SOURCES/JAVA/JSDK-1.5/index.shtml
> Now nutch is somewhat working on FC5, but there are still a couple of
> problems I never ran into on Windows. One is the null meta data error on
> fetch, which causes fetch to abort. I was able to fetch with no problem
> once this host was filtered out. This only happens on a specific host;
> maybe it's something breaking the HTML parser.
> I posted this problem in another post. I don't think this is unique
> to FC5, as I never crawled this same site on Windows.
>
> The second error is with the searcher. Even though I was able to fetch
> and index some pages, I am not getting any search results. My tomcat log
> files look good, but no results are being returned. I know I can look at
> the index through the Luke GUI, but I can't get X Windows up. I still
> have an issue open that requires recompiling the kernel with FGLRX for
> the ATI video drivers. Lubos recommended trying CentOS instead of Fedora,
> so I might try that, as the lack of X Windows can be frustrating. I am
> sure there is a way to use Luke from the command line; maybe I will try
> that to make sure I have data in my index.
>
> Here is my catalina log. I am running on Tomcat 5.
>
> 2006-09-18 18:37:59,678 INFO PluginRepository - Nutch Content Parser (org.apache.nutch.parse.Parser)
> 2006-09-18 18:37:59,679 INFO PluginRepository - Ontology Model Loader (org.apache.nutch.ontology.Ontology)
> 2006-09-18 18:37:59,679 INFO PluginRepository - Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
> 2006-09-18 18:37:59,679 INFO PluginRepository - Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
> 2006-09-18 18:37:59,758 INFO NutchBean - creating new bean
> 2006-09-18 18:37:59,814 INFO NutchBean - opening indexes in alaskacruises/indexes
> 2006-09-18 18:38:00,030 INFO Configuration - found resource common-terms.utf8 at file:/var/lib/tomcat5/webapps/nutch-0.9-dev/WEB-INF/classes/common-terms.utf8
> 2006-09-18 18:38:00,060 INFO NutchBean - opening segments in alaskacruises/segments
> 2006-09-18 18:38:00,125 INFO SummarizerFactory - Using the first summarizer extension found: Basic Summarizer
> 2006-09-18 18:38:00,125 INFO NutchBean - opening linkdb in alaskacruises/linkdb
> 2006-09-18 18:38:00,151 INFO NutchBean - query request from 192.168.1.34
> 2006-09-18 18:38:00,228 INFO NutchBean - query: alaska
> 2006-09-18 18:38:00,229 INFO NutchBean - lang: en
> 2006-09-18 18:38:00,423 INFO NutchBean - searching for 20 raw hits
> 2006-09-18 18:38:00,632 INFO NutchBean - total hits: 0

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
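
A quick way to check which directory the searcher would pick up, following the lookup order Vishal describes (index first, then indexes). This is only a minimal sketch: find_index_dir is a hypothetical helper, not part of Nutch, and it mirrors the order described in this thread rather than the actual NutchBean code.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): print the index directory that
# would be used under a crawl dir, assuming the lookup order described in
# the thread: a merged "index" dir first, then the "indexes" dir.
find_index_dir() {
    crawl_dir="$1"
    if [ -d "$crawl_dir/index" ]; then
        echo "$crawl_dir/index"
    elif [ -d "$crawl_dir/indexes" ]; then
        echo "$crawl_dir/indexes"
    else
        # Neither directory exists: report on stderr and signal failure.
        echo "no index found under $crawl_dir" >&2
        return 1
    fi
}
```

For the crawl in the log, running `find_index_dir alaskacruises` from the right working directory prints alaskacruises/indexes when only the indexes directory exists. The path in the log is relative, so it is probably resolved against the directory Tomcat was started from; that is an assumption worth verifying.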

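
The crawl cycle Richard describes (generate, fetch, update, invertlinks, index) can be sketched as a small script. The argument order follows the Nutch 0.8 tutorial as best I recall, so check the usage message printed by bin/nutch for your version. crawl_round and the NUTCH variable are hypothetical names, and the crawl directory layout (crawldb, segments, linkdb, indexes) is assumed.

```shell
#!/bin/sh
# Path to the nutch launcher; override NUTCH to point elsewhere.
NUTCH="${NUTCH:-bin/nutch}"

# Hypothetical wrapper: run one generate/fetch/update round on a crawl dir,
# then rebuild the link database and index the newly fetched segment.
crawl_round() {
    crawl="$1"
    $NUTCH generate "$crawl/crawldb" "$crawl/segments" || return 1
    # Segment dirs are named by timestamp, so the last one sorted is the newest.
    segment=$(ls -d "$crawl"/segments/2* | tail -1)
    $NUTCH fetch "$segment" || return 1
    $NUTCH updatedb "$crawl/crawldb" "$segment" || return 1
    $NUTCH invertlinks "$crawl/linkdb" -dir "$crawl/segments" || return 1
    $NUTCH index "$crawl/indexes" "$crawl/crawldb" "$crawl/linkdb" "$segment"
}
```

Usage would be something like `crawl_round alaskacruises`. Once more than one index exists, the merge step Vishal mentions (bin/nutch merge, with the output going to a directory called index) would follow.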