Re: Does Nutch index content for .PDF image on text format?

2009-02-27 Thread Andrzej Bialecki
Bradford Stephens wrote:
> Greetings,
>
> IIRC, Lucene (which Nutch uses for document indexing) actually indexes data types via plugins. So if you have a plugin for PDF parsing (I believe there is one), then you would be able to do what you wish for it.
>
> Cheers,
> Bradford
>
> On Thu, Feb 26, 2009 at 11:40

Re: Does Nutch index content for .PDF image on text format?

2009-02-26 Thread Bradford Stephens
> h search the text within the image and then catalog the text as part of that PDF document?
>
> *Does Nutch index content for .PDF image on text format?*

Does Nutch index content for .PDF image on text format?

2009-02-26 Thread Robert Edmiston
the image and then catalog the text as part of that PDF document? *Does Nutch index content for .PDF image on text format?*