Further information on this: I'm running on a single machine in fake clustering mode.
A tmp directory gets created, with nothing but another empty directory inside of it. The hadoop log file just repeats the same thing every 30 seconds:

2009-08-11 20:20:57,803 INFO plugin.PluginRepository - Plugins: looking in: /local/apps/software/nutch/plugins
2009-08-11 20:20:58,158 INFO plugin.PluginRepository - Plugin Auto-activation mode: [true]
2009-08-11 20:20:58,159 INFO plugin.PluginRepository - Registered Plugins:
2009-08-11 20:20:58,159 INFO plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
2009-08-11 20:20:58,159 INFO plugin.PluginRepository - Basic Query Filter (query-basic)
2009-08-11 20:20:58,159 INFO plugin.PluginRepository - Basic URL Normalizer (urlnormalizer-basic)
2009-08-11 20:20:58,159 INFO plugin.PluginRepository - Basic Indexing Filter (index-basic)
2009-08-11 20:20:58,159 INFO plugin.PluginRepository - Html Parse Plug-in (parse-html)
2009-08-11 20:20:58,160 INFO plugin.PluginRepository - Site Query Filter (query-site)
2009-08-11 20:20:58,160 INFO plugin.PluginRepository - Basic Summarizer Plug-in (summary-basic)
2009-08-11 20:20:58,160 INFO plugin.PluginRepository - HTTP Framework (lib-http)
2009-08-11 20:20:58,160 INFO plugin.PluginRepository - Pass-through URL Normalizer (urlnormalizer-pass)
2009-08-11 20:20:58,160 INFO plugin.PluginRepository - Regex URL Filter (urlfilter-regex)
2009-08-11 20:20:58,160 INFO plugin.PluginRepository - Http Protocol Plug-in (protocol-http)
2009-08-11 20:20:58,160 INFO plugin.PluginRepository - XML Response Writer Plug-in (response-xml)
2009-08-11 20:20:58,160 INFO plugin.PluginRepository - Regex URL Normalizer (urlnormalizer-regex)
2009-08-11 20:20:58,160 INFO plugin.PluginRepository - OPIC Scoring Plug-in (scoring-opic)
2009-08-11 20:20:58,160 INFO plugin.PluginRepository - CyberNeko HTML Parser (lib-nekohtml)
2009-08-11 20:20:58,161 INFO plugin.PluginRepository - Anchor Indexing Filter (index-anchor)
2009-08-11 20:20:58,161 INFO plugin.PluginRepository - URL Query Filter (query-url)
2009-08-11 20:20:58,161 INFO plugin.PluginRepository - Regex URL Filter Framework (lib-regex-filter)
2009-08-11 20:20:58,161 INFO plugin.PluginRepository - JSON Response Writer Plug-in (response-json)
2009-08-11 20:20:58,161 INFO plugin.PluginRepository - Registered Extension-Points:
2009-08-11 20:20:58,161 INFO plugin.PluginRepository - Nutch Summarizer (org.apache.nutch.searcher.Summarizer)
2009-08-11 20:20:58,161 INFO plugin.PluginRepository - Nutch Protocol (org.apache.nutch.protocol.Protocol)
2009-08-11 20:20:58,161 INFO plugin.PluginRepository - Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
2009-08-11 20:20:58,162 INFO plugin.PluginRepository - Nutch Field Filter (org.apache.nutch.indexer.field.FieldFilter)
2009-08-11 20:20:58,162 INFO plugin.PluginRepository - HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
2009-08-11 20:20:58,162 INFO plugin.PluginRepository - Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
2009-08-11 20:20:58,162 INFO plugin.PluginRepository - Nutch Search Results Response Writer (org.apache.nutch.searcher.response.ResponseWriter)
2009-08-11 20:20:58,162 INFO plugin.PluginRepository - Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
2009-08-11 20:20:58,162 INFO plugin.PluginRepository - Nutch URL Filter (org.apache.nutch.net.URLFilter)
2009-08-11 20:20:58,162 INFO plugin.PluginRepository - Nutch Online Search Results Clustering Plugin (org.apache.nutch.clustering.OnlineClusterer)
2009-08-11 20:20:58,162 INFO plugin.PluginRepository - Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
2009-08-11 20:20:58,162 INFO plugin.PluginRepository - Nutch Content Parser (org.apache.nutch.parse.Parser)
2009-08-11 20:20:58,163 INFO plugin.PluginRepository - Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
2009-08-11 20:20:58,163 INFO plugin.PluginRepository - Ontology Model Loader (org.apache.nutch.ontology.Ontology)
2009-08-11 20:20:58,171 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2009-08-11 20:20:58,202 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter

Is Solr output a plugin, and is it not set up above?

2009/8/11 Alex McLintock <[email protected]>:
> I'm trying to send my Nutch crawl to Solr. I've "generated, fetched,
> updated" several times. I've done an invertlinks.
> But when I try to do the solrindex it just sits there for ages and
> doesn't seem to stress the Solr server at all.
>
> I'm using Nutch 1.0, Sun Java 1.6, Ubuntu Linux 9.04.
>
> /local/apps/software/nutch$ bin/nutch solrindex
> http://rio23:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*
>
> Is there some kind of "verbose" option so that I can better see what
> it is doing? I could maybe insert some extra debugging, or do I need to
> run this in Eclipse?
>
> The Java process seems to be using up most of a core's CPU time, so it
> seems to be doing *something*.
>
> This is my first Solr project, so I have proved that it is up and
> running, but haven't actually added any data to it yet...
>
> Alex
>
