Re: pdf box help

2007-03-12 Thread Steven Rowe
This may help: http://www.pdfbox.org/userguide/text_extraction.html#Lucene+Integration
ashwin kumar wrote:
> Hi all, I am able to convert a PDF into a text file using PDFBox, and this
> is the code that I used:
>
> import org.pdfbox.pdfparser.PDFParser;
> import org.pdfbox.pdmodel.PDDocument;
> i

Re: pdf box help

2007-03-11 Thread karl wettin
On 12 Mar 2007 at 07:54, ashwin kumar wrote:
> Ya, sorry, got it. But that link contains only a program to index text. I have already successfully indexed .txt; now I want to index PDF.
You cannot index the PDF itself. You need to index the text you have extracted from it:
>> > content = strip.getText(doc);
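To make the point above concrete, here is a minimal sketch of "index the extracted text, not the PDF". It deliberately uses a tiny in-memory inverted index from the Java standard library as a stand-in for Lucene, so it runs without PDFBox or Lucene on the classpath. The class name `ExtractedTextIndex`, its methods, and the sample string are all hypothetical; in the real program the string would come from `strip.getText(doc)` and would be handed to a Lucene document field instead.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in for a real search index such as Lucene.
public class ExtractedTextIndex {
    // term -> ids of the documents whose extracted text contains that term
    private final Map<String, Set<String>> postings = new TreeMap<>();

    // Index plain text that has ALREADY been extracted from a PDF.
    // The PDF bytes themselves are never passed to the index.
    public void add(String docId, String extractedText) {
        String[] tokens = extractedText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String token : tokens) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Return the ids of all documents containing the given term (case-insensitive).
    public Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        ExtractedTextIndex index = new ExtractedTextIndex();
        // Assumption: in the real code this string is the result of strip.getText(doc).
        String content = "Text pulled out of ashwin.pdf by the stripper";
        index.add("ashwin.pdf", content);
        System.out.println(index.search("stripper")); // prints [ashwin.pdf]
    }
}
```

The design choice is the same one the Lucene integration linked earlier in the thread relies on: extraction and indexing are two separate steps, and only the output of the first step is ever tokenized.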

Re: pdf box help

2007-03-11 Thread ashwin kumar
Ya, sorry, got it. But that link contains only a program to index text. I have already successfully indexed .txt; now I want to index PDF.
On 3/12/07, karl wettin wrote:
> On 12 Mar 2007 at 07:44, ashwin kumar wrote:
> > It says that the requested URL is not found.
> Compare the URL in you

Re: pdf box help

2007-03-11 Thread karl wettin
On 12 Mar 2007 at 07:44, ashwin kumar wrote:
> It says that the requested URL is not found.
Compare the URL in your browser with the URL in the mail. Perhaps your mail client does not handle the line feed?
> On 3/12/07, karl wettin wrote:
> > On 12 Mar 2007 at 07:03, ashwi

Re: pdf box help

2007-03-11 Thread ashwin kumar
It says that the requested URL is not found.
On 3/12/07, karl wettin wrote:
> On 12 Mar 2007 at 07:03, ashwin kumar wrote:
> > Hi all, I am able to convert a PDF into a text file using PDFBox, and this is the code that I used:
> > {
> > String pdfFile=new String ("D:\\ASHWIN\\

Re: pdf box help

2007-03-11 Thread karl wettin
On 12 Mar 2007 at 07:03, ashwin kumar wrote:
> Hi all, I am able to convert a PDF into a text file using PDFBox, and this is the code that I used:
> {
> String pdfFile = new String("D:\\ASHWIN\\res\\ashwin.pdf");
> PDDocument doc = PDDocument.load(pdfFile);
> PDFTextStripper strip = new PDFTextStr