about PDF / HTML index

alvaro z Tue, 15 Jul 2003 15:19:54 -0700

I'm using Lucene with TXT and HTML files, and it's working.

The only problem with HTML files is that I have to index them as TXT first, 
before indexing them as HTML.


Has anyone tried to index PDF files?

I'm trying PDFBox. Are there any samples for indexing PDF files with any of the 
parsers (PDFBox, JPedal, etc.)? I can't find any.

Thanks for helping,

Alvaro, from Lima, Peru


