Re: pdfboxhelp
Hi Natarajan,

I put log4j.properties in the classpath. My new classpath is:

.;..;C:\j2sdk1.4.1\lib;C:\j2sdk1.4.1\lib\jndi.jar;C:\j2sdk1.4.1\lib\webclient.jar;C:\j2sdk1.4.1\lib\mail.jar;C:\j2sdk1.4.1\lib\activation.jar;C:\j2sdk1.4.1\lib\xml-apis.jar;D:\JAVAPRO;C:\j2sdk1.4.1\jre\lib\ext\msbase.jar;C:\j2sdk1.4.1\lib\servlet.jar;E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;C:\Program Files\Altova\xmlspy\XMLSpyInterface.jar;C:\j2sdk1.4.1\lib\sax.jar;C:\j2sdk1.4.1\lib\dom.jar;C:\j2sdk1.4.1\lib\xalan.jar;C:\j2sdk1.4.1\lib\xercesImpl.jar;C:\j2sdk1.4.1\lib\xmlParserAPIs.jar;C:\j2sdk1.4.1\lib\parser.jar;C:\j2sdk1.4.1\lib\jaxp.jar;C:\j2sdk1.4.1\lib\xml.jar;C:\j2sdk1.4.1\lib\classes12.zip;C:\struts.jar;F:\apache-ant-1.6.1\lib\ant.jar;C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;C:\j2sdk1.4.1\lib\lucene-20030909.jar;D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar;C:\j2sdk1.4.1\lib\log4j.properties;

but there is no difference in the output.

----- Original Message -----
From: Natarajan.T [EMAIL PROTECTED]
To: 'Lucene Users List' [EMAIL PROTECTED]
Sent: Monday, August 23, 2004 10:56 AM
Subject: RE: pdfboxhelp

Hi Santhosh,

The attached file must be in your class path.

Natarajan.
Re: pdfboxhelp
I put the file in the classpath:

.;..;C:\j2sdk1.4.1\lib;C:\j2sdk1.4.1\lib\jndi.jar;C:\j2sdk1.4.1\lib\webclient.jar;C:\j2sdk1.4.1\lib\mail.jar;C:\j2sdk1.4.1\lib\activation.jar;D:\JAVAPRO;E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;C:\j2sdk1.4.1\lib\classes12.zip;C:\struts.jar;C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;C:\j2sdk1.4.1\lib\lucene-20030909.jar;D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar;C:\j2sdk1.4.1\lib\log4j.properties;D:\setups\searchEngine\PDFBox-0.6.6\external\ant.jar;D:\setups\searchEngine\PDFBox-0.6.6\external\checkstyle-all-2.4.jar;D:\setups\searchEngine\PDFBox-0.6.6\external\junit.jar;D:\setups\searchEngine\PDFBox-0.6.6\external\lucene-1.4-final.jar;D:\setups\searchEngine\PDFBox-0.6.6\external\lucene-demos-1.4-final.jar;D:\setups\searchEngine\PDFBox-0.6.6\external\xercesImpl.jar;D:\setups\searchEngine\PDFBox-0.6.6\external\xml-apis.jar;

but there is no change in the output; it is the same as before:

E:\>java org.pdfbox.ExtractText C:\test.pdf C:\test.txt
log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParser).
log4j:WARN Please initialize the log4j system properly.

What might be the error?
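For anyone debugging a long classpath like the ones in this thread, it can help to check what kind of thing each entry is, because the JVM only searches directories and .jar/.zip archives. The following is a minimal sketch under that assumption; the class and method names are made up for illustration and are not part of PDFBox, Lucene, or log4j. It lists entries that exist as plain files but are not archives, such as a .properties file named directly.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;
import java.util.regex.Pattern;

public class ClasspathEntryCheck {

    /**
     * Returns the classpath entries that are regular files but not .jar/.zip archives.
     * The isRegularFile predicate is injected so the logic can be tried without touching disk.
     */
    public static List<String> nonArchiveFiles(String classpath, String separator,
                                               Predicate<String> isRegularFile) {
        List<String> result = new ArrayList<>();
        for (String raw : classpath.split(Pattern.quote(separator))) {
            String entry = raw.trim();
            if (entry.isEmpty()) {
                continue; // e.g. produced by a doubled separator
            }
            String lower = entry.toLowerCase(Locale.ROOT);
            boolean archive = lower.endsWith(".jar") || lower.endsWith(".zip");
            if (!archive && isRegularFile.test(entry)) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Check the classpath of the JVM that runs this program.
        String cp = System.getProperty("java.class.path");
        List<String> odd = nonArchiveFiles(cp, File.pathSeparator, p -> new File(p).isFile());
        System.out.println("Plain files listed as classpath entries: " + odd);
    }
}
```

Running it with the same classpath as the failing command shows which entries, if any, name a plain file directly.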
Re: pdfboxhelp
Your classpath should point to the directory that contains log4j.properties, not to the file itself; see below.

sv

On Mon, 23 Aug 2004, Santosh wrote:

Hi natarajan, I kept log4j.properties in the classpath my new classpath is

C:\j2sdk1.4.1\lib\log4j.properties;

should be

C:\j2sdk1.4.1\lib\

but there is no difference in the output
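The point about directories can be tried in isolation. The sketch below is an illustration, not log4j's own lookup code: it assumes only that the configuration file is located as a classpath resource named log4j.properties, and the class name is hypothetical. It creates a temporary directory containing an empty log4j.properties, builds a class loader whose single entry is that directory, and asks the loader for the resource.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class ResourceLookupDemo {

    /**
     * Puts an empty log4j.properties into a fresh temp directory and checks whether
     * a class loader rooted at that directory can find it by name.
     */
    public static boolean foundViaDirectory() {
        try {
            Path dir = Files.createTempDirectory("cp-demo");
            Files.createFile(dir.resolve("log4j.properties"));
            // The class path entry is the directory, not the file inside it.
            URL[] entries = { dir.toUri().toURL() };
            try (URLClassLoader loader = new URLClassLoader(entries, null)) {
                return loader.getResource("log4j.properties") != null;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("found via directory entry: " + foundViaDirectory());
        // The same question for the JVM that is actually running (null means not found):
        System.out.println("on this JVM's classpath: "
                + ClassLoader.getSystemResource("log4j.properties"));
    }
}
```

The second line printed by `main` is a quick diagnostic: run it with the same classpath as the failing command to see whether the file is visible at all.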
RE: pdfboxhelp
Hi,

To begin with, try to build the indexes offline [outside the Tomcat container]. Once indexing completes, give your search the real path of the offline index folder, start Tomcat, and then use the search.

As you experiment, you will get comfortable with the requirements of indexing and search.

Karthik

-----Original Message-----
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 4:55 PM
To: Lucene Users List
Subject: Re: pdfboxhelp

Yes, I did the same. I copied all the classes into the classes folder, but now when I build the index using IndexHTML the PDFs are not added to the index; only text and HTML files are added. What changes should I make to IndexHTML.java to build the index with PDFs?

----- Original Message -----
From: Karthik N S [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 4:54 PM
Subject: RE: pdfboxhelp

Hi,

If you are using the jar file with a web interface for JSP/servlet development:

1) Place the jar file in webapps/<your application>/WEB-INF/lib and correct the classpath for this change.
2) Create your own package, put all your Java files in it, and copy them to /WEB-INF/classes/<your package>. Then use them from there.

Karthik

-----Original Message-----
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 4:31 PM
To: Lucene Users List
Subject: Re: pdfboxhelp

Thanks Natarajan and Karthik,

I corrected the classpath, but where should I write your code? Should I put it in IndexHTML.java, which comes with Lucene, or somewhere else?

One more thing: I put the PDFBox jar file in the classpath. Is this enough, or do I have to build PDFBox?

Thank you

----- Original Message -----
From: Natarajan.T [EMAIL PROTECTED]
To: 'Lucene Users List' [EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 3:20 PM
Subject: RE: pdfboxhelp

Hi Santhosh,

Try the code below (the pdfbox.jar file must be in your classpath).

    import java.io.IOException;
    import java.io.InputStream;
    import org.pdfbox.encryption.DecryptDocument;
    import org.pdfbox.pdfparser.PDFParser;
    import org.pdfbox.pdmodel.PDDocument;
    import org.pdfbox.util.PDFTextStripper;

    // Extracts the text of a PDF read from the given stream.
    public String getContent(InputStream reader) throws IOException {
        PDDocument pdDoc = null;
        String pdftext = "";
        try {
            PDFParser parser = new PDFParser(reader);
            parser.parse();
            pdDoc = parser.getPDDocument();
            if (pdDoc.isEncrypted()) {
                // Try the default (empty) password.
                DecryptDocument decryptor = new DecryptDocument(pdDoc);
                decryptor.decryptDocument("");
            }
            PDFTextStripper stripper = new PDFTextStripper();
            pdftext = stripper.getText(pdDoc);
            // 'info' is assumed to be a field of the enclosing class.
            info = pdDoc.getDocumentInformation();
        } catch (Exception err) {
            System.out.println(err.getMessage());
        } finally {
            // Close only if parsing got far enough to create the document.
            if (pdDoc != null) {
                pdDoc.close();
            }
        }
        return pdftext;
    }

Natarajan.

-----Original Message-----
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 3:14 PM
To: Lucene Users List
Subject: Re: pdfboxhelp

Hi Don,

Your idea is nice, but whenever I write the following code in Lucene's IndexHTML.java:

    import org.pdfbox.searchengine.lucene.*;

    File pdfFile = new File("/path/to/the/file.pdf");
    // Below returns a parsed PDF file in a Lucene Document object.
    Document doc = LucenePDFDocument.getDocument(pdfFile);

I am getting the following error:

    package org.pdfbox.searchengine.lucene does not exist

I have downloaded the PDFBox source code and put the jar file in the classpath. Please help me with this.

----- Original Message -----
From: Don Vaillancourt
To: Lucene Users List
Sent: Friday, August 20, 2004 7:37 PM
Subject: Re: pdfboxhelp

Here is the super simple code required.

    import org.pdfbox.searchengine.lucene.*;

    File pdfFile = new File("/path/to/the/file.pdf");
    // Below returns a parsed PDF file in a Lucene Document object.
    Document doc = LucenePDFDocument.getDocument(pdfFile);

Santosh wrote:

Exactly, that is what I need.

----- Original Message -----
From: Don Vaillancourt
To: Lucene Users List
Sent: Friday, August 20, 2004 6:39 PM
Subject: Re: pdfboxhelp

What are your intentions with PDFBox? You want to use it to index PDF files?

Santosh wrote:

hi,

I have downloaded the PDFBox zip, but I am unsure where to start. How can I try the demo? I don't see any help document in this download. Please help me.

regards
Santosh kumar
SoftPro Systems
Hyderabad
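On the question of what to change in IndexHTML.java: one plausible shape, assuming the indexer decides what to do with each file from its extension, is to add a PDF branch next to the existing HTML and text branches. The sketch below is hypothetical (the class, enum, and method names are invented here, and it is not the Lucene demo's actual code). It returns a label instead of building a Lucene Document so that it runs without PDFBox or Lucene on the classpath.

```java
import java.util.Locale;

public class HandlerChooser {

    /** The kinds of handling this sketch distinguishes. */
    public enum Handler { PDF, HTML, TEXT, SKIP }

    /** Picks a handler from the file name's extension, ignoring case. */
    public static Handler choose(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".pdf")) {
            // A real indexer would build the Document here, for example with
            // LucenePDFDocument.getDocument(file) as quoted in the thread.
            return Handler.PDF;
        }
        if (name.endsWith(".html") || name.endsWith(".htm")) {
            return Handler.HTML;
        }
        if (name.endsWith(".txt")) {
            return Handler.TEXT;
        }
        return Handler.SKIP;
    }

    public static void main(String[] args) {
        String[] samples = { "C:\\test.pdf", "index.html", "notes.txt", "logo.gif" };
        for (String f : samples) {
            System.out.println(f + " -> " + choose(f));
        }
    }
}
```

Lower-casing the name first means files such as REPORT.PDF are not silently skipped, which is an easy way for PDFs to go missing from an index.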
Re: pdfboxhelp
hi karthik,

I have downloaded PDFBox and put the PDFBox jar file in the classpath, but when I type the following command at the command prompt I get this error:

D:\setups\searchEngine\PDFBox-0.6.6\src>java org.pdfbox.ExtractText C:\test.pdf C:\test.txt
log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParser).
log4j:WARN Please initialize the log4j system properly

Why am I getting this error? Please help.
RE: pdfboxhelp
Hi Santosh,

I think your PDFBox is using the log4j package. Try to set the classpath to include the log4j.jar path.

Is it just a WARNING or an ERROR that you are getting? Send me your configuration and let me help you with it.

Karthik
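On the warning-versus-error question: the two log4j:WARN lines are warnings that log4j prints when no appender has been configured. A minimal sketch of a configuration that gives log4j a console appender follows. It assumes log4j 1.2 property syntax, and the class name is made up for illustration. The program writes a bare-bones log4j.properties into a directory of your choice; that directory, not the file, then has to be on the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class WriteMinimalLog4jConfig {

    /** Lines of a console-only configuration in log4j 1.2 property syntax. */
    public static List<String> minimalConfig() {
        return Arrays.asList(
            "log4j.rootLogger=INFO, stdout",
            "log4j.appender.stdout=org.apache.log4j.ConsoleAppender",
            "log4j.appender.stdout.layout=org.apache.log4j.PatternLayout",
            "log4j.appender.stdout.layout.ConversionPattern=%d %-5p %c - %m%n");
    }

    /** Writes log4j.properties into the given directory and returns the file's path. */
    public static Path writeTo(Path directory) {
        try {
            Path target = directory.resolve("log4j.properties");
            return Files.write(target, minimalConfig(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Directory defaults to the current one; pass another as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("wrote " + writeTo(dir).toAbsolutePath());
    }
}
```

The log level and pattern here are arbitrary choices for the example; any valid appender definition should be enough to make the "No appenders could be found" warning go away.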
Re: pdfboxhelp
hi karthik,

I put log4j in the classpath. I am sending my CLASSPATH variable:

CLASSPATH
.;..;C:\j2sdk1.4.1\lib;C:\j2sdk1.4.1\lib\jndi.jar;C:\j2sdk1.4.1\lib\webclient.jar;C:\j2sdk1.4.1\lib\mail.jar;C:\j2sdk1.4.1\lib\activation.jar;C:\j2sdk1.4.1\lib\xml-apis.jar;D:\JAVAPRO;C:\j2sdk1.4.1\jre\lib\ext\msbase.jar;C:\j2sdk1.4.1\lib\servlet.jar;E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;C:\Program Files\Altova\xmlspy\XMLSpyInterface.jar;C:\j2sdk1.4.1\lib\sax.jar;C:\j2sdk1.4.1\lib\dom.jar;C:\j2sdk1.4.1\lib\xalan.jar;C:\j2sdk1.4.1\lib\xercesImpl.jar;C:\j2sdk1.4.1\lib\xmlParserAPIs.jar;C:\j2sdk1.4.1\lib\parser.jar;C:\j2sdk1.4.1\lib\jaxp.jar;C:\j2sdk1.4.1\lib\xml.jar;C:\j2sdk1.4.1\lib\classes12.zip;C:\struts.jar;F:\apache-ant-1.6.1\lib\ant.jar;C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;C:\j2sdk1.4.1\lib\lucene-20030909.jar;D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar

Please check the error.
Fw: pdfboxhelp
hi karthik, did you find any solution? Should I send the PDF to you?

- Original Message -
From: Karthik N S [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, August 23, 2004 9:21 AM
Subject: RE: pdfboxhelp

Hi, to begin with, try to build the indexes offline [outside the Tomcat container]. Once the indexes are complete, give your search the real path of the offline index folder, start Tomcat, and then run the search. As you experiment you will get comfortable with the requirements of indexing and search.
Karthik

-Original Message-
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 4:55 PM
To: Lucene Users List
Subject: Re: pdfboxhelp

Yes, I did the same. I copied all the classes into the classes folder, but now when I build the index using IndexHTML, the PDFs are not added to the index; only text and HTML files are added. What changes should I make to IndexHTML.java to build an index that includes PDFs?

- Original Message -
From: Karthik N S [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 4:54 PM
Subject: RE: pdfboxhelp

Hi, if you are using the jar file with a web interface for JSP/servlet development:
1) Place the jar file in webapps/your-application/WEB-INF/lib and also correct the classpath for this change.
2) Create your own package, put all your Java files in it, and copy them to WEB-INF/classes/your-package. Then use the same.
Karthik
RE: pdfboxhelp
Hi Santhosh, The attached file must be in your class path. Natarajan.
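On the log4j warning discussed in this thread: the JVM treats each classpath entry as either a directory or an archive, and log4j 1.x looks up log4j.properties as a resource inside those entries. An entry that names the properties file itself will therefore most likely not help; the directory that contains the file is what needs to be listed. The sketch below is only a diagnostic aid under those assumptions. The class name ClasspathCheck and its methods are hypothetical, and the extension test is a heuristic rather than the class loader's exact rule.

```java
import java.io.File;
import java.net.URL;

class ClasspathCheck {
    /** Returns where a resource resolves on the current classpath, or null if it is not found. */
    static URL locate(String resourceName) {
        return ClasspathCheck.class.getClassLoader().getResource(resourceName);
    }

    /** Heuristic: an entry is plausible only if it is an existing directory or is named like an archive. */
    static boolean looksLikeValidEntry(String entry) {
        File f = new File(entry);
        String name = f.getName().toLowerCase();
        return f.isDirectory() || name.endsWith(".jar") || name.endsWith(".zip");
    }

    public static void main(String[] args) {
        // Flag entries such as ...\lib\log4j.properties that name a plain file.
        String classPath = System.getProperty("java.class.path");
        for (String entry : classPath.split(File.pathSeparator)) {
            if (!entry.isEmpty() && !looksLikeValidEntry(entry)) {
                System.out.println("Suspicious classpath entry: " + entry);
            }
        }
        // Report whether the configuration file is visible as a classpath resource.
        URL url = locate("log4j.properties");
        System.out.println(url == null
                ? "log4j.properties is not visible on the classpath"
                : "log4j.properties found at " + url);
    }
}
```

Run it with the same CLASSPATH as the failing command. If the file is reported as not visible, adding the folder that holds log4j.properties (rather than the file path) is the usual fix to try.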
RE: pdfboxhelp
Hi Santosh, hold on, it's Monday and I am running behind schedule with my job... I will reply to you some time around noon. Karthik
RE: pdfboxhelp
Hi Santhosh,

Try out the code below. (The pdfbox.jar file must be in your classpath.)

    public String getContent(InputStream reader) throws IOException {
        PDFParser parser = null;
        PDDocument pdDoc = null;
        PDFTextStripper stripper = null;
        String pdftext = "";
        try {
            parser = new PDFParser(reader);
            parser.parse();
            pdDoc = parser.getPDDocument();
            if (pdDoc.isEncrypted()) {
                // Try the empty password for encrypted documents.
                DecryptDocument decryptor = new DecryptDocument(pdDoc);
                decryptor.decryptDocument("");
            }
            stripper = new PDFTextStripper();
            pdftext = stripper.getText(pdDoc);
            // info is assumed to be a field declared elsewhere in the class.
            info = pdDoc.getDocumentInformation();
        } catch (Exception err) {
            System.out.println(err.getMessage());
        } finally {
            // Close only if parsing got far enough to create the document.
            if (pdDoc != null) {
                pdDoc.close();
            }
        }
        return pdftext;
    }

Natarajan.

-Original Message-
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 3:14 PM
To: Lucene Users List
Subject: Re: pdfboxhelp

Hi Don, your idea is nice, but whenever I write the following code in IndexHTML.java of Lucene

    import org.pdfbox.searchengine.lucene.*;
    File pdfFile = new File("/path/to/the/file.pdf");
    // Below returns a parsed PDF file in a Lucene Document object.
    Document doc = LucenePDFDocument.getDocument(pdfFile);

I am getting the following error:

    package org.pdfbox.searchengine.lucene does not exist

I have downloaded the PDFBox source code and kept the jar file in the classpath. Please help me with this.
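A "package ... does not exist" error at compile time usually means the compiler cannot see any jar or directory that contains classes from that package. One way to narrow this down is to check whether the jar on the classpath really contains the package. The sketch below uses only the standard java.util.jar API. The class name JarPackageCheck and its method are hypothetical helpers, not part of PDFBox or Lucene.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class JarPackageCheck {
    /** Returns true if the jar holds at least one .class file in the package or one of its subpackages. */
    static boolean containsPackage(File jar, String packageName) {
        String prefix = packageName.replace('.', '/') + "/";
        try (JarFile jarFile = new JarFile(jar)) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith(prefix) && name.endsWith(".class")) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            // The file is missing or is not a readable archive.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java JarPackageCheck <jar> <package>");
            return;
        }
        boolean found = containsPackage(new File(args[0]), args[1]);
        System.out.println(args[1] + (found ? " is present in " : " is NOT present in ") + args[0]);
    }
}
```

For example, it could be run against the PDFBox jar with org.pdfbox.searchengine.lucene as the package name. If the package is present, the next thing to check is whether that jar is on the classpath that javac actually uses.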
Re: pdfboxhelp
Thanks Natarajan and Karthik, I corrected the classpath, but where should I write your code? Should I put it in IndexHTML.java, which comes with Lucene, or somewhere else? One more thing: I kept the PDFBox jar file in the classpath. Is this enough, or do I have to build PDFBox? Thank you.
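On the question of where the PDF code belongs: an indexer that only handles HTML and text has to decide per file which kind of document to build, so the PDF call would most plausibly go at that decision point. The sketch below shows only the dispatch step and uses nothing beyond the standard library. The class IndexDispatch and the enum Kind are hypothetical names, and the comments mark where the Lucene and PDFBox calls from this thread would go; how IndexHTML.java is actually structured is not assumed here.

```java
import java.util.Locale;

class IndexDispatch {
    enum Kind { PDF, HTML, TEXT, SKIP }

    /** Chooses a handler from the file name's extension, ignoring case. */
    static Kind kindOf(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".pdf")) {
            // Here the indexer would build the document with
            // LucenePDFDocument.getDocument(file) and add it to the index.
            return Kind.PDF;
        }
        if (name.endsWith(".html") || name.endsWith(".htm")) {
            // Existing HTML handling stays as it is.
            return Kind.HTML;
        }
        if (name.endsWith(".txt")) {
            // Existing plain-text handling stays as it is.
            return Kind.TEXT;
        }
        return Kind.SKIP;
    }

    public static void main(String[] args) {
        // Print the handler that would be chosen for each file name given.
        for (String arg : args) {
            System.out.println(arg + " -> " + kindOf(arg));
        }
    }
}
```

If PDFs are silently missing from the index, the first thing to look for is an extension check like this one that has no branch for .pdf files.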
pdfboxhelp
hi, I have downloaded the pdfbox zip, but I am unsure where to start. How can I try the demo? I don't see any help document with this download. Please help me.

regards
Santosh kumar
SoftPro Systems
Hyderabad
"The harder you train in peace, the lesser you bleed in war"

---SOFTPRO DISCLAIMER--
Information contained in this E-MAIL and any attachments are confidential being proprietary to SOFTPRO SYSTEMS is 'privileged' and 'confidential'. If you are not an intended or authorised recipient of this E-MAIL or have received it in error, You are notified that any use, copying or dissemination of the information contained in this E-MAIL in any manner whatsoever is strictly prohibited. Please delete it immediately and notify the sender by E-MAIL. In such a case reading, reproducing, printing or further dissemination of this E-MAIL is strictly prohibited and may be unlawful. SOFTPRO SYSYTEMS does not REPRESENT or WARRANT that an attachment hereto is free from computer viruses or other defects. The opinions expressed in this E-MAIL and any ATTACHEMENTS may be those of the author and are not necessarily those of SOFTPRO SYSTEMS.
Re: pdfboxhelp
What are your intentions with PDFBox? Do you want to use it to index PDF files?

--
Don Vaillancourt
Director of Software Development
WEB IMPACT INC.
phone: 416-815-2000 ext. 245
fax: 416-815-2001
email: [EMAIL PROTECTED]
web: http://www.web-impact.com

This email message is intended only for the addressee(s) and contains information that may be confidential and/or copyright. If you are not the intended recipient please notify the sender by reply email and immediately delete this email. Use, disclosure or reproduction of this email by anyone other than the intended recipient(s) is strictly prohibited. No representation is made that this email or any attachments are free of viruses. Virus scanning is recommended and is the responsibility of the recipient.
- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: pdfboxhelp
Exactly, that is what I need.
Re: pdfboxhelp
Here is the super simple code required.

    import java.io.File;
    import org.apache.lucene.document.Document;
    import org.pdfbox.searchengine.lucene.*;

    File pdfFile = new File("/path/to/the/file.pdf");
    // Below returns a parsed PDF file in a Lucene Document object.
    Document doc = LucenePDFDocument.getDocument(pdfFile);
Re: pdfboxhelp
----- Original Message -----
From: Don Vaillancourt
To: Lucene Users List
Sent: Friday, August 20, 2004 7:37 PM
Subject: Re: pdfboxhelp

Here is the super simple code required.

    import java.io.File;
    import org.apache.lucene.document.Document;
    import org.pdfbox.searchengine.lucene.*;

    File pdfFile = new File("/path/to/the/file.pdf");
    // Below returns a parsed PDF file in a Lucene Document object.
    Document doc = LucenePDFDocument.getDocument(pdfFile);

Santosh wrote:

Exactly, that is what I need.

----- Original Message -----
From: Don Vaillancourt
To: Lucene Users List
Sent: Friday, August 20, 2004 6:39 PM
Subject: Re: pdfboxhelp

What are your intentions with PDFBox? Do you want to use it to index PDF files?

Santosh wrote:

Hi,
I have downloaded the PDFBox zip, but I am not sure where to start. How can I try the demo? I don't see any help document with this download. Please help me.

Regards,
Santosh Kumar
SoftPro Systems
Hyderabad
"The harder you train in peace, the lesser you bleed in war"

--
Don Vaillancourt
Director of Software Development
WEB IMPACT INC.
phone: 416-815-2000 ext. 245
fax: 416-815-2001
email: [EMAIL PROTECTED]
web: http://www.web-impact.com
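Since the goal stated in the thread is to index PDF files, a step that usually comes before the getDocument call is finding the files to hand to it. The following is only a rough sketch of that step using the standard library; the class name PdfFileCollector and the method collectPdfs are made up for illustration and are not part of PDFBox or Lucene.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of PDFBox or Lucene): walks a directory tree
// and collects the regular files whose names end in ".pdf" (any letter case).
class PdfFileCollector {

    // Returns every PDF file found under root, including subdirectories.
    static List<File> collectPdfs(File root) {
        List<File> found = new ArrayList<File>();
        collect(root, found);
        return found;
    }

    private static void collect(File entry, List<File> found) {
        if (entry.isDirectory()) {
            File[] children = entry.listFiles();
            if (children == null) {
                return; // directory could not be read: skip it
            }
            for (File child : children) {
                collect(child, found);
            }
        } else if (entry.isFile()
                && entry.getName().toLowerCase(Locale.ROOT).endsWith(".pdf")) {
            found.add(entry);
        }
    }

    public static void main(String[] args) {
        // Assumption: the directory to scan is given as the first argument,
        // defaulting to the current directory.
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File pdf : collectPdfs(root)) {
            // Each file found here is a candidate for the getDocument call
            // quoted in the thread.
            System.out.println(pdf.getPath());
        }
    }
}
```

Each File in the returned list could then be passed to LucenePDFDocument.getDocument, the call quoted in the thread, assuming the PDFBox and Lucene jars are on the classpath.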
Re: pdfboxhelp
Did I leave you speechless!? :-)
Re: pdfboxhelp
I am sorry, the mail was sent accidentally.

----- Original Message -----
From: Don Vaillancourt
To: Lucene Users List
Sent: Friday, August 20, 2004 8:02 PM
Subject: Re: pdfboxhelp

Did I leave you speechless!? :-)

---SOFTPRO DISCLAIMER-- Information contained in this E-MAIL and any attachments are confidential being proprietary to SOFTPRO SYSTEMS is 'privileged' and 'confidential'.
If you are not an intended or authorised recipient of this E-MAIL or have received it in error, You are notified that any use, copying or dissemination of the information contained in this E-MAIL in any manner whatsoever is strictly prohibited. Please delete it immediately and notify the sender by E-MAIL. In such a case reading, reproducing, printing or further dissemination of this E-MAIL is strictly prohibited and may be unlawful. SOFTPRO SYSYTEMS does not REPRESENT or WARRANT that an attachment