Thanks Andrzej. When I installed Fedora Core 5 it had an option for a Java
development kit, which I incorrectly assumed was the JDK. I was able to
get the JDK up and running on Fedora using JPackage for Sun Compat. There are
some good instructions here for others who may need to get Nutch up on
FC5: http://ccl.net/cca/software/SOURCES/JAVA/JSDK-1.5/index.shtml
Nutch is now somewhat working on FC5, but there are still a couple of
problems I never ran into on Windows. One is the null metadata error on
fetch, which causes the fetch to abort. It only happens on one specific
host, so maybe something on it is breaking the HTML parser; with this host
filtered out I was able to fetch without problems. I posted this problem
in another thread. I don't think it is unique to FC5, as I never crawled
this same site on Windows.

The second error is with the searcher. Even though I was able to fetch and
index some pages, I am not getting any search results. My Tomcat log
files look good, but no results are being returned. I know I can look at
the index through the Luke GUI, but I can't get X Windows up. I still have
an open issue that requires recompiling the kernel with FGLRX for the ATI
video drivers. Lubos recommended trying CentOS instead of Fedora, so I
might try that, as the lack of X Windows can be frustrating. I am sure
there is a way to use Luke from the command line; maybe I will try that to
make sure I have data in my index.
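
Until Luke is usable, a rough first check that needs no GUI is to confirm that the index directory actually exists where the searcher looks for it and that it holds non-empty files. The following is only a minimal sketch using the Java standard library; the class name IndexDirCheck and the method countNonEmptyFiles are made up for illustration, and it does not read the Lucene index itself, so it cannot say whether the documents in it are searchable. The default path is the relative one that appears in the log below, which a Java process resolves against its current working directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of Nutch or Lucene.
public class IndexDirCheck {

    // Counts regular files larger than 0 bytes anywhere under dir.
    // Returns -1 if dir does not exist or is not a directory.
    static long countNonEmptyFiles(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return -1;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> p.toFile().length() > 0)
                        .count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: default to the relative path shown in the log;
        // pass a different path as the first argument if needed.
        Path dir = Paths.get(args.length > 0 ? args[0] : "alaskacruises/indexes");

        // Show the absolute location the relative path resolves to
        // from the current working directory.
        Path resolved = dir.toAbsolutePath().normalize();
        System.out.println("Looking in: " + resolved);

        long count = countNonEmptyFiles(resolved);
        if (count < 0) {
            System.out.println("Directory not found from this working directory.");
        } else {
            System.out.println("Non-empty files found: " + count);
        }
    }
}
```

Running it from the directory Tomcat was started in, and again from the directory where the crawl was run, would show whether both resolve the relative path to the same place.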

Here is my catalina log. I am running on Tomcat 5.

2006-09-18 18:37:59,678 INFO  PluginRepository -        Nutch Content Parser (org.apache.nutch.parse.Parser)
2006-09-18 18:37:59,679 INFO  PluginRepository -        Ontology Model Loader (org.apache.nutch.ontology.Ontology)
2006-09-18 18:37:59,679 INFO  PluginRepository -        Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
2006-09-18 18:37:59,679 INFO  PluginRepository -        Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
2006-09-18 18:37:59,758 INFO  NutchBean - creating new bean
2006-09-18 18:37:59,814 INFO  NutchBean - opening indexes in alaskacruises/indexes
2006-09-18 18:38:00,030 INFO  Configuration - found resource common-terms.utf8 at file:/var/lib/tomcat5/webapps/nutch-0.9-dev/WEB-INF/classes/common-terms.utf8
2006-09-18 18:38:00,060 INFO  NutchBean - opening segments in alaskacruises/segments
2006-09-18 18:38:00,125 INFO  SummarizerFactory - Using the first summarizer extension found: Basic Summarizer
2006-09-18 18:38:00,125 INFO  NutchBean - opening linkdb in alaskacruises/linkdb
2006-09-18 18:38:00,151 INFO  NutchBean - query request from 192.168.1.34
2006-09-18 18:38:00,228 INFO  NutchBean - query: alaska
2006-09-18 18:38:00,229 INFO  NutchBean - lang: en
2006-09-18 18:38:00,423 INFO  NutchBean - searching for 20 raw hits
2006-09-18 18:38:00,632 INFO  NutchBean - total hits: 0



_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
