My apologies in advance.

I've been digging through the mail archives searching for information on
splitting the index after crawling, but I am only getting more confused;
what I find is too incomplete for a newbie like myself.

I see references to using mergesegs, but not enough detail to make an
educated guess (at least at my level, which I admit is low right now).

I've gotten to the point of having worked my way through the tutorial here:
http://wiki.apache.org/nutch/Nutch0.9-Hadoop0.10-Tutorial
and have a working site using a single computer. I have four more computers
to add, and would like to try distributed search.

When I read that tutorial up to the Distributed Searching portion, the
"split the index" step mentions this link:
http://wiki.apache.org/nutch/%5Bhttp%3A//www.nabble.com/Lucene-index-manipulation-tools-tf2781692.html#a7760917

But that may as well be saying "then some magic happens".

Does anyone have "step by step" instructions for splitting the index for use
in distributed search using mergesegs or otherwise? It doesn't have to have
a lot of explanation, just a list of example steps.


Mostly this is experimental for me, with no major plans other than my own
education, but because I am starting completely fresh at this, some things
are still quite confusing.

Thanks,
Jesse
