Hi Andrzej and Piotr,

I just want to confirm that this worked.

I renamed fetcher/ to fetcher_output/, removed index/ and index.done, and then ran the following:

   % bin/nutch parse <path to segment>
   % bin/nutch index <path to segment>
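
In case it helps anyone else, the cleanup could be scripted roughly as below. This is only a minimal sketch of what I did by hand: the function name prepare_segment is made up for illustration, and it assumes the segment was produced by a standard fetch with parsing, so that it contains fetcher/ and possibly index/ and index.done.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): prepare a segment so that
# it can be parsed and indexed again as separate steps.
prepare_segment() {
    seg="$1"
    # Refuse to touch anything that is not an existing directory.
    [ -d "$seg" ] || { echo "no such segment: $seg" >&2; return 1; }

    # Rename fetcher/ to fetcher_output/, as suggested in this thread.
    # Skip the rename if fetcher_output/ already exists.
    if [ -d "$seg/fetcher" ] && [ ! -e "$seg/fetcher_output" ]; then
        mv "$seg/fetcher" "$seg/fetcher_output"
    fi

    # Remove the old index and its "done" marker.
    rm -rf "$seg/index" "$seg/index.done"
}

# Usage, from the Nutch install directory:
#   prepare_segment <path to segment> &&
#   bin/nutch parse <path to segment> &&
#   bin/nutch index <path to segment>
```

The helper only touches the directory layout; the parse and index commands are still run exactly as shown above.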

All of the PDFs appear to have been parsed. However, when I ran a search, a few results still did not show a summary. The contents of those PDFs were clearly searched, though, since the query string I used did not appear in the document title or the URL.

Again, thanks.

-lucas

On Aug 25, 2005, at 12:48 PM, Andrzej Bialecki wrote:

Piotr Kosiorowski wrote:
You can try it out, but I think parsing separately expects some directories in the segment to have different names than the ones you get after a standard fetch with parsing.

Yes, just rename fetcher/ to fetcher_output/ .


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

