I don't know the answer, but I'd also check the Tomcat/J2EE container
logs to see if there are any clues.  This helped me solve a problem with
nutch solrindex once.  Also, I think the data directory for Solr should
be growing as it adds more documents.
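
A rough way to do both checks from a script, in case it helps. This is
only a sketch: the function names and the keyword pattern are my own
assumptions (nothing here comes from Nutch or Solr), and you would need
to supply the real paths for your container log and your Solr data
directory.

```python
import os
import re

# Assumed keywords; real container logs may mark problems differently.
PROBLEM_PATTERN = re.compile(r"ERROR|SEVERE|WARN|Exception")


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file vanished between listing and stat; skip it.
                pass
    return total


def suspicious_lines(log_path):
    """Return the lines of a log file that match PROBLEM_PATTERN."""
    with open(log_path, errors="replace") as handle:
        return [line.rstrip("\n") for line in handle
                if PROBLEM_PATTERN.search(line)]
```

Calling dir_size_bytes on the Solr data directory twice, a minute or so
apart, while solrindex is running should show whether anything is being
written.  suspicious_lines on the container log gives a short list to
read through instead of the whole file.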

|-----Original Message-----
|From: Alex McLintock [mailto:[email protected]]
|Sent: Tuesday, August 11, 2009 12:22 PM
|To: [email protected]
|Subject: Re: Nutch to SolR. First steps
|
|Further information to this....
|
|I'm running on a single machine in fake clustering mode.
|
|A tmp directory gets created, with nothing but another empty directory
|inside of it.
|
|The hadoop log file just says the same thing over and over every 30
|seconds....
|
|2009-08-11 20:20:57,803 INFO  plugin.PluginRepository - Plugins:
|looking in: /local/apps/software/nutch/plugins
|2009-08-11 20:20:58,158 INFO  plugin.PluginRepository - Plugin
|Auto-activation mode: [true]
|2009-08-11 20:20:58,159 INFO  plugin.PluginRepository - Registered
|Plugins:
|2009-08-11 20:20:58,159 INFO  plugin.PluginRepository -         the
|nutch core extension points (nutch-extensionpoints)
|2009-08-11 20:20:58,159 INFO  plugin.PluginRepository -         Basic
|Query Filter (query-basic)
|2009-08-11 20:20:58,159 INFO  plugin.PluginRepository -         Basic
|URL Normalizer (urlnormalizer-basic)
|2009-08-11 20:20:58,159 INFO  plugin.PluginRepository -         Basic
|Indexing Filter (index-basic)
|2009-08-11 20:20:58,159 INFO  plugin.PluginRepository -         Html
|Parse Plug-in (parse-html)
|2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         Site
|Query Filter (query-site)
|2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         Basic
|Summarizer Plug-in (summary-basic)
|2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         HTTP
|Framework (lib-http)
|2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -
|Pass-through URL Normalizer (urlnormalizer-pass)
|2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         Regex
|URL Filter (urlfilter-regex)
|2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         Http
|Protocol Plug-in (protocol-http)
|2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         XML
|Response Writer Plug-in (response-xml)
|2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         Regex
|URL Normalizer (urlnormalizer-regex)
|2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         OPIC
|Scoring Plug-in (scoring-opic)
|2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -
|CyberNeko HTML Parser (lib-nekohtml)
|2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         Anchor
|Indexing Filter (index-anchor)
|2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         URL
|Query Filter (query-url)
|2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         Regex
|URL Filter Framework (lib-regex-filter)
|2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         JSON
|Response Writer Plug-in (response-json)
|2009-08-11 20:20:58,161 INFO  plugin.PluginRepository - Registered
|Extension-Points:
|2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         Nutch
|Summarizer (org.apache.nutch.searcher.Summarizer)
|2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         Nutch
|Protocol (org.apache.nutch.protocol.Protocol)
|2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         Nutch
|Analysis (org.apache.nutch.analysis.NutchAnalyzer)
|2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch
|Field Filter (org.apache.nutch.indexer.field.FieldFilter)
|2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         HTML
|Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
|2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch
|Query Filter (org.apache.nutch.searcher.QueryFilter)
|2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch
|Search Results Response Writer
|(org.apache.nutch.searcher.response.ResponseWriter)
|2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch
|URL Normalizer (org.apache.nutch.net.URLNormalizer)
|2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch
|URL Filter (org.apache.nutch.net.URLFilter)
|2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch
|Online Search Results Clustering Plugin
|(org.apache.nutch.clustering.OnlineClusterer)
|2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch
|Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
|2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch
|Content Parser (org.apache.nutch.parse.Parser)
|2009-08-11 20:20:58,163 INFO  plugin.PluginRepository -         Nutch
|Scoring (org.apache.nutch.scoring.ScoringFilter)
|2009-08-11 20:20:58,163 INFO  plugin.PluginRepository -
|Ontology Model Loader (org.apache.nutch.ontology.Ontology)
|2009-08-11 20:20:58,171 INFO  indexer.IndexingFilters - Adding
|org.apache.nutch.indexer.basic.BasicIndexingFilter
|2009-08-11 20:20:58,202 INFO  indexer.IndexingFilters - Adding
|org.apache.nutch.indexer.anchor.AnchorIndexingFilter
|
|
|
|Is Solr output a plugin, and is it not set up above?
|
|2009/8/11 Alex McLintock <[email protected]>:
|> I'm trying to send my Nutch crawl to SolR. I've "generated, fetched,
|> updated", several times. I've done an invertlinks.
|> But when I try to do the solrindex it just sits there for ages and
|> doesn't seem to stress the Solr server at all.
|>
|> I'm using Nutch 1.0, Sun Java 1.6, Ubuntu Linux 9.04.
|>
|> /local/apps/software/nutch$ bin/nutch solrindex
|> http://rio23:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*
|>
|> Is there some kind of "verbose" option so that I can better see what
|> it is doing? I could maybe insert some extra debugging, or do I need to
|> run this in Eclipse?
|>
|> The Java process seems to be using up most of a core's CPU time so it
|> seems to be doing *something*.
|>
|> This is my first Solr project so I have proved that it is up and
|> running, but haven't actually added any data to it yet...
|>
|> Alex
|>
