RE: indexing pdfs

2007-03-09 Thread Kainth, Sachin
To: java-user@lucene.apache.org Subject: Re: indexing pdfs Hi Sachin, the link you gave me offers only a zip file and an exe file for download, and the zip file contains no class files. Wouldn't we need a jar file or class files? On 3/8/07, Kainth, Sachin [EMAIL PROTECTED] wrote: Hi

indexing pdfs

2007-03-08 Thread ashwin kumar
Hi, can someone help me by giving sample programs for indexing PDF and .doc files? Thanks. Regards, Ashwin

RE: indexing pdfs

2007-03-08 Thread Kainth, Sachin
From: ashwin kumar [mailto:[EMAIL PROTECTED] Sent: 08 March 2007 09:37 To: java-user@lucene.apache.org Subject: indexing pdfs Hi, can someone help me by giving sample programs for indexing PDF and .doc files? Thanks. Regards, Ashwin

Re: indexing pdfs

2007-03-08 Thread Ulf Dittmer
For DOC files you can use the Jakarta POI library. Text extraction is outlined here: http://jakarta.apache.org/poi/hwpf/quick-guide.html Ulf On 08.03.2007, at 10:37, ashwin kumar wrote: Hi, can someone help me by giving sample programs for indexing PDF and .doc files

Re: indexing pdfs

2007-03-08 Thread ashwin kumar
PDFTextStripper(); // get text from doc using stripper return stripper.getText(doc); } Sachin -Original Message- From: ashwin kumar [mailto:[EMAIL PROTECTED] Sent: 08 March 2007 09:37 To: java-user@lucene.apache.org Subject: indexing pdfs hi can some one help me

RE: indexing pdfs

2007-03-08 Thread Kainth, Sachin
[mailto:[EMAIL PROTECTED] Sent: 08 March 2007 11:35 To: java-user@lucene.apache.org Subject: Re: indexing pdfs Is the only way to index PDFs to convert them to text first and then index that text? On 3/8/07, Kainth, Sachin [EMAIL PROTECTED] wrote: Hi Ashwin, You can try PDFBox to convert the PDF

Re: indexing pdfs

2007-03-08 Thread ashwin kumar
. The only other way I have heard of is to use IFilters. I believe SeekAFile does indexing of PDFs. Sachin -Original Message- From: ashwin kumar [mailto:[EMAIL PROTECTED] Sent: 08 March 2007 11:35 To: java-user@lucene.apache.org Subject: Re: indexing pdfs Is the only way to index PDFs
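
For readers following the question above, here is a minimal sketch of the extract-then-index flow, written in plain Java with no PDF or search library. Everything in it is an assumption made for illustration: the class name ExtractThenIndexSketch, the extractor parameter (a hypothetical stand-in for whatever turns file bytes into text, such as PDFBox or POI), and the toy in-memory index, which is not Lucene's API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

/** Toy inverted index: a stand-in for a real index such as Lucene's, not its API. */
class ExtractThenIndexSketch {
    // Maps each lower-cased term to the ids of the documents that contain it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    /**
     * Step 1: run the (hypothetical) extractor to turn raw file bytes into text.
     * Step 2: split that text into terms and record the document id under each term.
     */
    void add(String docId, byte[] rawBytes, Function<byte[], String> extractor) {
        String text = extractor.apply(rawBytes);
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Returns the ids of documents containing the term, sorted alphabetically. */
    List<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }
}
```

The point of the sketch is only that the index never sees the PDF or DOC format itself: the file format is handled entirely by the extractor, and the index works on the plain text it returns. In a real setup the extractor would wrap a library call and the index would be a Lucene index.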

RE: indexing pdfs

2007-03-08 Thread Kainth, Sachin
Hi, Here it is: http://www.seekafile.org/ -Original Message- From: ashwin kumar [mailto:[EMAIL PROTECTED] Sent: 08 March 2007 13:07 To: java-user@lucene.apache.org Subject: Re: indexing pdfs Hi again, do we have to download any jar files to run this program? If so, can you give me

Re: indexing pdfs

2007-03-08 Thread ashwin kumar
Message- From: ashwin kumar [mailto:[EMAIL PROTECTED] Sent: 08 March 2007 13:07 To: java-user@lucene.apache.org Subject: Re: indexing pdfs Hi again, do we have to download any jar files to run this program? If so, can you give me the link, please? Ashwin On 3/8/07, Kainth, Sachin [EMAIL PROTECTED] wrote