Hi Andrzej and Piotr,
I just want to confirm that this worked.
In the segment directory, I renamed fetcher/ to fetcher_output/, removed
index/ and index.done, and then ran the following:
% bin/nutch parse <path to segment>
% bin/nutch index <path to segment>
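
For anyone who wants to repeat this, here is a rough sketch of the same steps as a small shell function. The function name reparse_segment and the NUTCH variable are my own placeholders, not part of Nutch; only the rename, the two removals, and the parse and index commands come from what I actually ran. It assumes you run it from the Nutch install directory so that bin/nutch resolves.

```shell
#!/bin/sh
# Sketch only: reparse_segment and NUTCH are hypothetical names for illustration.
# NUTCH defaults to bin/nutch, i.e. it assumes the Nutch install dir is the cwd.
NUTCH="${NUTCH:-bin/nutch}"

reparse_segment() {
    segment="$1"

    # Rename fetcher/ to fetcher_output/, the name the separate parse step expects.
    # Skip the rename if it was already done on an earlier attempt.
    if [ -d "$segment/fetcher" ] && [ ! -e "$segment/fetcher_output" ]; then
        mv "$segment/fetcher" "$segment/fetcher_output" || return 1
    fi

    # Remove the old index and its marker file so the segment is indexed again.
    rm -rf "$segment/index" "$segment/index.done"

    # Parse first; index only if parsing succeeded.
    $NUTCH parse "$segment" && $NUTCH index "$segment"
}
```

After sourcing the file, it would be called as: reparse_segment <path to segment>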
All of the PDFs appear to be parsed. However, when I ran a search, a few
results still did not show a summary, even though the PDF contents were
clearly indexed: the query string I used did not appear in the document
title or the URL.
Again, thanks.
-lucas
On Aug 25, 2005, at 12:48 PM, Andrzej Bialecki wrote:
> Piotr Kosiorowski wrote:
>> You can try it out but I think parsing separately expects some
>> directories in segment have different names than you have after
>> standard fetch with parsing.
>
> Yes, just rename fetcher/ to fetcher_output/ .
>
> --
> Best regards,
> Andrzej Bialecki <><
> Information Retrieval, Semantic Web
> Embedded Unix, System Integration
> http://www.sigram.com Contact: info at sigram dot com