Re: FetchedSegments.getSummary() for a PDF

Lucas Rockwell Fri, 26 Aug 2005 11:48:07 -0700

Hi Andrzej and Piotr,

I just want to confirm that this worked.

I renamed fetcher/ to fetcher_output/ and removed index/ and index.done, and then ran the following:


   % bin/nutch parse <path to segment>
   % bin/nutch index <path to segment>
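
For anyone repeating this, the preparation steps described above could be sketched as a small shell function. This is only a sketch under assumptions: the helper name prepare_segment is hypothetical, and it assumes that fetcher/, index/ and index.done all sit directly under the segment directory, which the thread does not state explicitly.

```shell
# Hypothetical helper (not part of Nutch): prepare an already-fetched
# segment so that the separate parse step can be run on it.
# Assumption: fetcher/, index/ and index.done live directly under the segment.
prepare_segment() {
    seg="$1"

    # The separate parse step expects fetcher_output/ rather than fetcher/.
    # Skip the rename if it was already done on a previous run.
    if [ -d "$seg/fetcher" ] && [ ! -e "$seg/fetcher_output" ]; then
        mv "$seg/fetcher" "$seg/fetcher_output" || return 1
    fi

    # Remove the old index and its marker so the segment can be re-indexed.
    rm -rf "$seg/index" "$seg/index.done"
}

# Usage (the path is a placeholder, as in the commands above):
#   prepare_segment <path to segment>
#   bin/nutch parse <path to segment>
#   bin/nutch index <path to segment>
```

The function only touches the directory layout; the parse and index commands are still run by hand afterwards.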

All of the PDFs appear to be parsed -- however, when I did a search I still got a few that would not show the summary, but it clearly searched the contents of the PDF, as the query string I used did not appear in the doc title or the URL.


Again, thanks.

-lucas

On Aug 25, 2005, at 12:48 PM, Andrzej Bialecki wrote:

Piotr Kosiorowski wrote:

You can try it out, but I think parsing separately expects some directories in the segment to have different names than you have after a standard fetch with parsing.


Yes, just rename fetcher/ to fetcher_output/ .


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
