Some further information on this.

I'm running on a single machine in fake clustering mode.

A tmp directory gets created, with nothing but another empty directory
inside of it.

The Hadoop log file just repeats the same block every 30 seconds:

2009-08-11 20:20:57,803 INFO  plugin.PluginRepository - Plugins: looking in: /local/apps/software/nutch/plugins
2009-08-11 20:20:58,158 INFO  plugin.PluginRepository - Plugin Auto-activation mode: [true]
2009-08-11 20:20:58,159 INFO  plugin.PluginRepository - Registered Plugins:
2009-08-11 20:20:58,159 INFO  plugin.PluginRepository -         the nutch core extension points (nutch-extensionpoints)
2009-08-11 20:20:58,159 INFO  plugin.PluginRepository -         Basic Query Filter (query-basic)
2009-08-11 20:20:58,159 INFO  plugin.PluginRepository -         Basic URL Normalizer (urlnormalizer-basic)
2009-08-11 20:20:58,159 INFO  plugin.PluginRepository -         Basic Indexing Filter (index-basic)
2009-08-11 20:20:58,159 INFO  plugin.PluginRepository -         Html Parse Plug-in (parse-html)
2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         Site Query Filter (query-site)
2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         Basic Summarizer Plug-in (summary-basic)
2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         HTTP Framework (lib-http)
2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         Pass-through URL Normalizer (urlnormalizer-pass)
2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         Regex URL Filter (urlfilter-regex)
2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         Http Protocol Plug-in (protocol-http)
2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         XML Response Writer Plug-in (response-xml)
2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         Regex URL Normalizer (urlnormalizer-regex)
2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         OPIC Scoring Plug-in (scoring-opic)
2009-08-11 20:20:58,160 INFO  plugin.PluginRepository -         CyberNeko HTML Parser (lib-nekohtml)
2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         Anchor Indexing Filter (index-anchor)
2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         URL Query Filter (query-url)
2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         Regex URL Filter Framework (lib-regex-filter)
2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         JSON Response Writer Plug-in (response-json)
2009-08-11 20:20:58,161 INFO  plugin.PluginRepository - Registered Extension-Points:
2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         Nutch Summarizer (org.apache.nutch.searcher.Summarizer)
2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         Nutch Protocol (org.apache.nutch.protocol.Protocol)
2009-08-11 20:20:58,161 INFO  plugin.PluginRepository -         Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch Field Filter (org.apache.nutch.indexer.field.FieldFilter)
2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch Search Results Response Writer (org.apache.nutch.searcher.response.ResponseWriter)
2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch URL Filter (org.apache.nutch.net.URLFilter)
2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch Online Search Results Clustering Plugin (org.apache.nutch.clustering.OnlineClusterer)
2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
2009-08-11 20:20:58,162 INFO  plugin.PluginRepository -         Nutch Content Parser (org.apache.nutch.parse.Parser)
2009-08-11 20:20:58,163 INFO  plugin.PluginRepository -         Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
2009-08-11 20:20:58,163 INFO  plugin.PluginRepository -         Ontology Model Loader (org.apache.nutch.ontology.Ontology)
2009-08-11 20:20:58,171 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2009-08-11 20:20:58,202 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
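
A rough way to confirm that this block really does repeat on a fixed interval is to pull out the timestamp of its first line each time it appears and look at the gaps. The following is only a minimal Python sketch, assuming the log4j layout shown in the excerpt (a 23-character timestamp at the start of each entry). The function name `repeat_intervals` and the `MARKER` string are illustrative, and the log file path is whatever is passed on the command line.

```python
import sys
from datetime import datetime

# Assumption: the first entry of each repeated block contains this text.
MARKER = "plugin.PluginRepository - Plugins:"


def repeat_intervals(lines, marker=MARKER):
    """Return the gaps, in seconds, between successive marker lines."""
    stamps = []
    for line in lines:
        if marker in line:
            # The timestamp is the first 23 characters,
            # e.g. "2009-08-11 20:20:57,803".
            stamps.append(datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f"))
    # Pair each timestamp with the next one and take the difference.
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    # Usage: python repeat_intervals.py <path to the Hadoop log file>
    with open(sys.argv[1]) as log_file:
        print(repeat_intervals(log_file))
```

If every gap comes out close to 30 seconds, the job is probably being restarted or retried on a timer rather than making progress, though the sketch itself cannot say which.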



Is the Solr output a plugin, and if so, is it missing from the list above?

2009/8/11 Alex McLintock <[email protected]>:
> I'm trying to send my Nutch crawl to Solr. I've "generated, fetched,
> updated" several times. I've done an invertlinks.
> But when I try to do the solrindex it just sits there for ages and
> doesn't seem to stress the Solr server at all.
>
> I'm using Nutch 1.0, Sun Java 1.6, Ubuntu Linux 9.04.
>
> /local/apps/software/nutch$ bin/nutch solrindex
> http://rio23:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*
>
> Is there some kind of "verbose" option so that I can better see what
> it is doing? I could maybe insert some extra debugging, or do I need to
> run this in Eclipse?
>
> The Java process seems to be using up most of a core's CPU time so it
> seems to be doing *something*.
>
> This is my first Solr project so I have proved that it is up and
> running, but haven't actually added any data to it yet...
>
> Alex
>
