I do have the language identifier enabled in the plugins.
I tried removing this item and got:
050424 175406 parsing: /home2/mozdex/nutch/build/plugins/clustering-carrot2/plugin.xml
050424 175406 parsing: /home2/mozdex/nutch/build/plugins/ontology/plugin.xml
Exception in thread "main" java.lang.ExceptionInInitializerError
        at org.apache.nutch.indexer.IndexSegment.indexPages(IndexSegment.java:145)
        at org.apache.nutch.indexer.IndexSegment.main(IndexSegment.java:254)
Caused by: java.lang.RuntimeException: org.apache.nutch.indexer.IndexingFilter not found.
        at org.apache.nutch.indexer.IndexingFilters.<clinit>(IndexingFilters.java:36)
        ... 2 more
[EMAIL PROTECTED] [/home2/mozdex/nutch]#
It looks like when you generate a segment with a
plugin enabled, you must keep that plugin enabled to
process the segment afterwards.
Can I re-run the segment through a fix or filter and
create an index on it? (Or am I barking up the wrong
tree here?)
--- Byron Miller <[EMAIL PROTECTED]> wrote:
> I'm not sure what it is, but it seems I can only index
> about 28-32 pg/sec. While not terribly slow on its own,
> it did take 30+ hours to index a 4 million page segment.
>
> I used to see indexing scroll by. Is there anything new,
> or perhaps a config I can tweak, to bring back some of
> the performance from before?
>
> (Is it because index-more is gathering that much more
> data?)
>
> _______________________________________________
> Nutch-general mailing list
> [email protected]
>
https://lists.sourceforge.net/lists/listinfo/nutch-general
>
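As a quick sanity check on the figures quoted above, here is a minimal sketch of the arithmetic: pages divided by the indexing rate, converted to hours. The class name IndexRate and the method hoursToIndex are made up for illustration and are not part of Nutch; the only inputs are the 4 million pages and the 28-32 pg/sec from the earlier message.

```java
// Hypothetical helper, not part of Nutch: estimates how long indexing takes
// at a steady rate, using only the numbers quoted in the message above.
public class IndexRate {

    // Returns the estimated indexing time in hours for the given page count
    // and a constant rate in pages per second.
    static double hoursToIndex(long pages, double pagesPerSecond) {
        if (pagesPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        double seconds = pages / pagesPerSecond;
        return seconds / 3600.0;
    }

    public static void main(String[] args) {
        long pages = 4_000_000L;   // segment size from the message
        double slow = 28.0;        // lower bound of the observed rate
        double fast = 32.0;        // upper bound of the observed rate

        System.out.println(String.format("at %.0f pg/sec: %.1f hours", slow, hoursToIndex(pages, slow)));
        System.out.println(String.format("at %.0f pg/sec: %.1f hours", fast, hoursToIndex(pages, fast)));
    }
}
```

At those rates the estimate lands between roughly 35 and 40 hours, which is in line with the "30+ hours" observed, so the run time looks like a direct result of the per-page rate rather than a stall somewhere in the run.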