Re: pdfboxhelp

2004-08-23 Thread Santosh
Hi natarajan,
I put log4j.properties on the classpath.
My new classpath is:

.;..;
C:\j2sdk1.4.1\lib;
C:\j2sdk1.4.1\lib\jndi.jar;
C:\j2sdk1.4.1\lib\webclient.jar;
C:\j2sdk1.4.1\lib\mail.jar;
C:\j2sdk1.4.1\lib\activation.jar;
C:\j2sdk1.4.1\lib\xml-apis.jar;
D:\JAVAPRO;
C:\j2sdk1.4.1\jre\lib\ext\msbase.jar;
C:\j2sdk1.4.1\lib\servlet.jar;
E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;
C:\Program Files\Altova\xmlspy\XMLSpyInterface.jar;
C:\j2sdk1.4.1\lib\sax.jar;
C:\j2sdk1.4.1\lib\dom.jar;
C:\j2sdk1.4.1\lib\xalan.jar;
C:\j2sdk1.4.1\lib\xercesImpl.jar;
C:\j2sdk1.4.1\lib\xmlParserAPIs.jar;
C:\j2sdk1.4.1\lib\parser.jar;
C:\j2sdk1.4.1\lib\jaxp.jar;
C:\j2sdk1.4.1\lib\xml.jar;
C:\j2sdk1.4.1\lib\classes12.zip;
C:\struts.jar;
F:\apache-ant-1.6.1\lib\ant.jar;
C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;
C:\j2sdk1.4.1\lib\lucene-20030909.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar;
C:\j2sdk1.4.1\lib\log4j.properties;

but there is no difference in the output.


- Original Message -
From: Natarajan.T [EMAIL PROTECTED]
To: 'Lucene Users List' [EMAIL PROTECTED]
Sent: Monday, August 23, 2004 10:56 AM
Subject: RE: pdfboxhelp


 Hi Santhosh,

 The attached file must be in your class path.


 Natarajan.




Re: pdfboxhelp

2004-08-23 Thread Santosh
I put the file on the classpath:

.;..;
C:\j2sdk1.4.1\lib;
C:\j2sdk1.4.1\lib\jndi.jar;
C:\j2sdk1.4.1\lib\webclient.jar;
C:\j2sdk1.4.1\lib\mail.jar;
C:\j2sdk1.4.1\lib\activation.jar;
D:\JAVAPRO;
E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;
C:\j2sdk1.4.1\lib\classes12.zip;
C:\struts.jar;
C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;
C:\j2sdk1.4.1\lib\lucene-20030909.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar;
C:\j2sdk1.4.1\lib\log4j.properties;
D:\setups\searchEngine\PDFBox-0.6.6\external\ant.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\checkstyle-all-2.4.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\junit.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\lucene-1.4-final.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\lucene-demos-1.4-final.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\xercesImpl.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\xml-apis.jar;

but there is no change in the output; it is the same as before:

E:\>java org.pdfbox.ExtractText C:\test.pdf C:\test.txt
log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParser).
log4j:WARN Please initialize the log4j system properly.

What might be the error?
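
A classpath of this size is easy to get subtly wrong. The JVM treats each classpath entry as either a directory or a .jar/.zip archive, so an entry that names some other file directly is a likely mistake. The following is a minimal sketch of a check for that, not an official tool: the class name ClasspathLint and the extension heuristic are assumptions made for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

public class ClasspathLint {

    /**
     * Returns the entries that look like plain files rather than
     * directories or .jar/.zip archives. Heuristic (an assumption of this
     * sketch): the last path segment has a purely alphabetic extension
     * other than "jar" or "zip".
     */
    static List<String> suspiciousEntries(String classpath, String separator) {
        List<String> suspicious = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            String trimmed = entry.trim();
            // Only the last path segment matters; accept both slash styles.
            int slash = Math.max(trimmed.lastIndexOf('\\'), trimmed.lastIndexOf('/'));
            String name = trimmed.substring(slash + 1).toLowerCase(Locale.ROOT);
            int dot = name.lastIndexOf('.');
            if (dot <= 0 || dot == name.length() - 1) {
                continue; // no extension: ".", "..", or a directory such as "lib"
            }
            String ext = name.substring(dot + 1);
            // Numeric "extensions" such as the 1 in "j2sdk1.4.1" are version numbers.
            boolean alphabetic = ext.chars().allMatch(Character::isLetter);
            if (alphabetic && !ext.equals("jar") && !ext.equals("zip")) {
                suspicious.add(trimmed);
            }
        }
        return suspicious;
    }

    public static void main(String[] args) {
        // Check the classpath this JVM was actually started with.
        String classpath = System.getProperty("java.class.path");
        for (String entry : suspiciousEntries(classpath, File.pathSeparator)) {
            System.out.println("Not a directory or archive: " + entry);
        }
    }
}
```

Run with the same CLASSPATH as the failing command, it would list any entry that points at a single non-archive file. The heuristic can misfire on directories whose names end in an alphabetic suffix, so treat the output as a hint only.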



Re: pdfboxhelp

2004-08-23 Thread Stephane James Vaucher
Your classpath should point to the directory that contains log4j.properties,
not to the file itself; see below.

sv

On Mon, 23 Aug 2004, Santosh wrote:

 Hi natarajan,
 I kept log4j.properties in the classpath
 my new classpath is
 
 C:\j2sdk1.4.1\lib\log4j.properties;

should be C:\j2sdk1.4.1\lib\
 
 but there is no difference in the output
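
To see whether the directory change has taken effect, one option is to ask the class loader directly. This is a minimal sketch under the assumption that log4j's default initialization looks up log4j.properties as a classpath resource; the class name FindLog4jConfig and the method locate are made up for illustration.

```java
import java.net.URL;

public class FindLog4jConfig {

    /** Reports where a resource is found on the classpath, or that it is missing. */
    static String locate(String resourceName) {
        // Searches the directories and archives on the classpath for the name.
        URL url = ClassLoader.getSystemResource(resourceName);
        return url == null
                ? resourceName + " not found on the classpath"
                : resourceName + " found at " + url;
    }

    public static void main(String[] args) {
        System.out.println(locate("log4j.properties"));
    }
}
```

If this reports the file as not found while it sits in C:\j2sdk1.4.1\lib\, the directory itself is probably still missing from the classpath.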
 
 

RE: pdfboxhelp

2004-08-22 Thread Karthik N S
Hi


To begin with, try to build the indexes offline [outside the Tomcat container].
On completing the indexes, feed your search the real path of the
offline index folder, start Tomcat, and then use the search.
As you experiment with it, you will become comfortable with the
requirements of indexing/search.  ;[

Karthik

-Original Message-
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 4:55 PM
To: Lucene Users List
Subject: Re: pdfboxhelp


Yes, I did the same.
I copied all the classes into the classes folder, but
now when I build the index using IndexHTML, the PDFs are not added to
the index; only text and HTML files are added.
What changes should I make to IndexHTML.java to build the index with PDFs?
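
One possible approach to the question above is to branch on the file extension before building the Lucene document, and to route PDFs to the LucenePDFDocument.getDocument(File) call quoted elsewhere in this thread. The sketch below shows only the branching step in plain Java; the class IndexDispatch, the enum Kind, and the extension list are hypothetical and are not part of Lucene's IndexHTML demo or of PDFBox.

```java
import java.util.Locale;

public class IndexDispatch {

    /** Hypothetical document kinds, named only for this illustration. */
    enum Kind { PDF, HTML, TEXT, SKIP }

    /** Picks a kind from the file extension, ignoring case. */
    static Kind kindOf(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".pdf")) {
            return Kind.PDF;   // would be handed to LucenePDFDocument.getDocument(file)
        }
        if (lower.endsWith(".html") || lower.endsWith(".htm")) {
            return Kind.HTML;  // would stay on the indexer's existing HTML path
        }
        if (lower.endsWith(".txt")) {
            return Kind.TEXT;  // would stay on the indexer's existing text path
        }
        return Kind.SKIP;      // anything else is not indexed
    }

    public static void main(String[] args) {
        String[] names = {"report.PDF", "index.html", "notes.txt", "logo.gif"};
        for (String name : names) {
            System.out.println(name + " -> " + kindOf(name));
        }
    }
}
```

In an indexer, the PDF branch would add the returned Document to the IndexWriter in the same way the other branches do; that wiring is left out here because it depends on the Lucene version in use.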
- Original Message -
From: Karthik N S [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 4:54 PM
Subject: RE: pdfboxhelp


 Hi

 1) If you are using the jar file with a web interface for JSP/servlet
 development, place the jar file in webapps/<your application>/WEB-INF/lib
 and also correct the classpath for this change.

 2) Create your own package, put all your Java files in it, and copy the
 Java files to /WEB-INF/classes/<your package>


 Then use the same.  ;{


 Karthik

 -Original Message-
 From: Santosh [mailto:[EMAIL PROTECTED]
 Sent: Saturday, August 21, 2004 4:31 PM
 To: Lucene Users List
 Subject: Re: pdfboxhelp


 Thanks, Natarajan and Karthik,

 I corrected the classpath,

 but where should I write your code?
 Should I write it in IndexHTML.java, which comes with Lucene, or
 in some other place?
 One more thing:
 I put the PDFBox jar file on the classpath. Is this enough, or do I have
 to build PDFBox?

 Thank you
 - Original Message -
 From: Natarajan.T [EMAIL PROTECTED]
 To: 'Lucene Users List' [EMAIL PROTECTED]
 Sent: Saturday, August 21, 2004 3:20 PM
 Subject: RE: pdfboxhelp


  Hi Santhosh,
 
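  Try out the code below. (The pdfbox.jar file must be in your classpath.)
 
  import java.io.IOException;
  import java.io.InputStream;
  import org.pdfbox.encryption.DecryptDocument;
  import org.pdfbox.pdfparser.PDFParser;
  import org.pdfbox.pdmodel.PDDocument;
  import org.pdfbox.pdmodel.PDDocumentInformation;
  import org.pdfbox.util.PDFTextStripper;
 
  // Extracts the text of a PDF; returns "" if extraction fails.
  public String getContent(InputStream reader) throws IOException {
      PDDocument pdDoc = null;
      String pdftext = "";
      try {
          PDFParser parser = new PDFParser(reader);
          parser.parse();
          pdDoc = parser.getPDDocument();
          if (pdDoc.isEncrypted()) {
              DecryptDocument decryptor = new DecryptDocument(pdDoc);
              decryptor.decryptDocument("");  // try the empty password
          }
          pdftext = new PDFTextStripper().getText(pdDoc);
          // Document metadata (title, author, ...), unused here.
          PDDocumentInformation info = pdDoc.getDocumentInformation();
      } catch (Exception err) {
          System.out.println(err.getMessage());
      } finally {
          if (pdDoc != null) {
              pdDoc.close();  // close even when parsing fails
          }
      }
      return pdftext;
  }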
 
  Natarajan.
 
  -Original Message-
  From: Santosh [mailto:[EMAIL PROTECTED]
  Sent: Saturday, August 21, 2004 3:14 PM
  To: Lucene Users List
  Subject: Re: pdfboxhelp
 
  Hi Don,
 
  Your idea is nice, but whenever I write the following code in
  IndexHTML.java of Lucene
 
  import org.pdfbox.searchengine.lucene.*;
 
  File pdfFile = new File("/path/to/the/file.pdf");
 
  // Below returns a parsed PDF file in a Lucene Document object.
  Document doc = LucenePDFDocument.getDocument(pdfFile);
 
  I am getting the following error:
 
  package org.pdfbox.searchengine.lucene does not exist
 
  I have downloaded the PDFBox source code and kept the jar file in the
  classpath. Please help me with this.
  - Original Message -
  From: Don Vaillancourt
  To: Lucene Users List
  Sent: Friday, August 20, 2004 7:37 PM
  Subject: Re: pdfboxhelp
 
 
Here is the super simple code required.
 
import java.io.File;
import org.apache.lucene.document.Document;
import org.pdfbox.searchengine.lucene.*;
 
File pdfFile = new File("/path/to/the/file.pdf");
 
// Returns the parsed PDF file as a Lucene Document object.
Document doc = LucenePDFDocument.getDocument(pdfFile);
 
Santosh wrote:
 
  Exactly, that is what I need.
  - Original Message -
  From: Don Vaillancourt
  To: Lucene Users List
  Sent: Friday, August 20, 2004 6:39 PM
  Subject: Re: pdfboxhelp
 
 
What are your intentions with PDFBox?
 
You want to use it to index PDF files?
 
Santosh wrote:
 
  hi,
 
  I have downloaded the PDFBox zip, but I am not sure where to
  start. How can I try the demo? I don't see any help document in this
  download. Please help me.
 
 
  regards
  Santosh kumar
  SoftPro Systems
  Hyderabad
 
 
  The harder you train in peace, the lesser you bleed in war
 

Re: pdfboxhelp

2004-08-22 Thread Santosh
hi karthik,

I have downloaded PDFBox and put the PDFBox jar file on the classpath, but when
I type the following command at the command prompt I get this error:

D:\setups\searchEngine\PDFBox-0.6.6\src>java org.pdfbox.ExtractText C:\test.pdf C:\test.txt
log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParser).
log4j:WARN Please initialize the log4j system properly

Why am I getting this error? Please help.



RE: pdfboxhelp

2004-08-22 Thread Karthik N S
Hi Santosh

I think your PDFBox is using the Log4j package. Try to set the classpath to
include the log4j.jar path.

[Is it just a WARNING or an ERROR you are getting?]

Send me your configuration and let me help you with it.  ;[


Karthik


Re: pdfboxhelp

2004-08-22 Thread Santosh
hi karthik,
I put log4j on the classpath. Here is my classpath variable:

CLASSPATH

.;..;
C:\j2sdk1.4.1\lib;
C:\j2sdk1.4.1\lib\jndi.jar;
C:\j2sdk1.4.1\lib\webclient.jar;
C:\j2sdk1.4.1\lib\mail.jar;
C:\j2sdk1.4.1\lib\activation.jar;
C:\j2sdk1.4.1\lib\xml-apis.jar;
D:\JAVAPRO;
C:\j2sdk1.4.1\jre\lib\ext\msbase.jar;
C:\j2sdk1.4.1\lib\servlet.jar;
E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;
C:\Program Files\Altova\xmlspy\XMLSpyInterface.jar;
C:\j2sdk1.4.1\lib\sax.jar;
C:\j2sdk1.4.1\lib\dom.jar;
C:\j2sdk1.4.1\lib\xalan.jar;
C:\j2sdk1.4.1\lib\xercesImpl.jar;
C:\j2sdk1.4.1\lib\xmlParserAPIs.jar;
C:\j2sdk1.4.1\lib\parser.jar;
C:\j2sdk1.4.1\lib\jaxp.jar;
C:\j2sdk1.4.1\lib\xml.jar;
C:\j2sdk1.4.1\lib\classes12.zip;
C:\struts.jar;
F:\apache-ant-1.6.1\lib\ant.jar;
C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;
C:\j2sdk1.4.1\lib\lucene-20030909.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar

Please check the error.



- Original Message -
From: Karthik N S [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, August 23, 2004 10:26 AM
Subject: RE: pdfboxhelp


 Hi Santosh

   I think u'r Pdf is using  Log4j package ,Try toe set the classpath for
 log4j.jar path.

  [ Is it a just a WARNING  or an ERROR  u are getting.

   Send me in u'r Configuration management Let me help u with it ; [


 Karthik

 -Original Message-
 From: Santosh [mailto:[EMAIL PROTECTED]
 Sent: Monday, August 23, 2004 10:11 AM
 To: Lucene Users List
 Cc: Ben Litchfield
 Subject: Re: pdfboxhelp


 hi karthik,

 I have downloaded pdfbox and kept pdfjar file in the classpath, but when I
 am typing following command in the command prompt I am getting the error:

 D:\setups\searchEngine\PDFBox-0.6.6\srcjava org.pdfbox.ExtractText
 C:\test.pdf
 C:\test.txt
 log4j:WARN No appenders could be found for logger
 (org.pdfbox.pdfparser.PDFParse
 r).
 log4j:WARN Please initialize the log4j system properly

 why I am getting this error? plz help


 - Original Message -
 From: Karthik N S [EMAIL PROTECTED]
 To: Lucene Users List [EMAIL PROTECTED]
 Sent: Monday, August 23, 2004 9:21 AM
 Subject: RE: pdfboxhelp


  Hi
 
 
  To Begin with try to build Indexes offline  [ out of Tomcat
container]
  and  on completing indxexes, feed u'r search  with the realpath of the
 offline indexed folder,Start the Tomcat and then use the
  search on As u experiment it out u will be comfortable
withrequirment
 of Indexing /Search..   ; [
 
  Karthik
 
  -Original Message-
  From: Santosh [mailto:[EMAIL PROTECTED]
  Sent: Saturday, August 21, 2004 4:55 PM
  To: Lucene Users List
  Subject: Re: pdfboxhelp
 
 
  Yes I did the same.
  I copied all the classes into classes folder but
  now when I am building the index using IndexHTML the pdfs are not added
to
  this index, only text and htmls are added to index.
  what changes should I do for IndexHTML.java to build index with pdf
  - Original Message -
  From: Karthik N S [EMAIL PROTECTED]
  To: Lucene Users List [EMAIL PROTECTED]
  Sent: Saturday, August 21, 2004 4:54 PM
  Subject: RE: pdfboxhelp
 
 
   Hi
  
   1) If you are using the jar file in a web application (JSP/servlet
   development), place the jar file in webapps/<your application>/WEB-INF/lib
   and correct the classpath to match.

   2) Create your own package for your Java files and copy the class files
   to WEB-INF/classes/<your package>.


   Then use them from there.
  
  
   Karthik
  
   -Original Message-
   From: Santosh [mailto:[EMAIL PROTECTED]
   Sent: Saturday, August 21, 2004 4:31 PM
   To: Lucene Users List
   Subject: Re: pdfboxhelp
  
  
   Thanks Natarajan and Karthik,

   I corrected the classpath.

   But where should I put your code? Should I add it to IndexHTML.java,
   which comes with Lucene, or somewhere else?

   One more thing: I put the PDFBox jar file in the classpath. Is this
   enough, or do I have to build PDFBox?

   Thank you
   - Original Message -
   From: Natarajan.T [EMAIL PROTECTED]
   To: 'Lucene Users List' [EMAIL PROTECTED]
   Sent: Saturday, August 21, 2004 3:20 PM
   Subject: RE: pdfboxhelp
  
  
Hi Santhosh,
   
Try out this below code. (pdfbox.jar file must be in your classpath)

public String getContent(InputStream reader) throws IOException {
    PDFParser parser = null;
    PDDocument pdDoc = null;
    PDFTextStripper stripper = null;
    String pdftext = "";
    try {
        parser = new PDFParser(reader);
        parser.parse();
        pdDoc = parser.getPDDocument();
        if (pdDoc.isEncrypted()) {
            DecryptDocument decryptor = new DecryptDocument(pdDoc);
            decryptor.decryptDocument("");
        }
        stripper = new PDFTextStripper();
        pdftext = stripper.getText(pdDoc);

        info = pdDoc.getDocumentInformation();
    } catch (Exception err) {
        System.out.println(err.getMessage());
    }
    pdDoc.close();
    return pdftext;
}
   
Natarajan.
   
-Original Message-
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: Saturday

Fw: pdfboxhelp

2004-08-22 Thread Santosh
Hi Karthik,
Did you find a solution? Should I send the PDF to you?
- Original Message -
From: Santosh [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, August 23, 2004 10:23 AM
Subject: Re: pdfboxhelp


 Hi Karthik,
 I added log4j to the classpath. Here is my CLASSPATH variable:

 CLASSPATH

 .;..;C:\j2sdk1.4.1\lib;C:\j2sdk1.4.1\lib\jndi.jar;
 C:\j2sdk1.4.1\lib\webclient.jar;C:\j2sdk1.4.1\lib\mail.jar;
 C:\j2sdk1.4.1\lib\activation.jar;C:\j2sdk1.4.1\lib\xml-apis.jar;
 D:\JAVAPRO;C:\j2sdk1.4.1\jre\lib\ext\msbase.jar;
 C:\j2sdk1.4.1\lib\servlet.jar;
 E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;
 C:\Program Files\Altova\xmlspy\XMLSpyInterface.jar;
 C:\j2sdk1.4.1\lib\sax.jar;C:\j2sdk1.4.1\lib\dom.jar;
 C:\j2sdk1.4.1\lib\xalan.jar;C:\j2sdk1.4.1\lib\xercesImpl.jar;
 C:\j2sdk1.4.1\lib\xmlParserAPIs.jar;C:\j2sdk1.4.1\lib\parser.jar;
 C:\j2sdk1.4.1\lib\jaxp.jar;C:\j2sdk1.4.1\lib\xml.jar;
 C:\j2sdk1.4.1\lib\classes12.zip;C:\struts.jar;
 F:\apache-ant-1.6.1\lib\ant.jar;C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;
 C:\j2sdk1.4.1\lib\lucene-20030909.jar;
 D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar

 Please check the error.
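
With a classpath this long it is easy to carry an entry that does not exist or cannot be used. The JVM only uses directories and .jar/.zip archives as classpath entries; anything else is ignored. Below is a small sketch, using only the JDK, that classifies each entry. ClasspathCheck and describe are illustrative names, and the code targets a current JDK rather than J2SDK 1.4.

```java
import java.io.File;
import java.util.Locale;
import java.util.regex.Pattern;

public class ClasspathCheck {

    /** Classifies one classpath entry as "missing", "directory", "archive" or "plain file". */
    static String describe(String entry) {
        File f = new File(entry);
        if (!f.exists()) {
            return "missing";
        }
        if (f.isDirectory()) {
            return "directory";
        }
        String lower = entry.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".jar") || lower.endsWith(".zip")) {
            return "archive";
        }
        // Exists, but is neither a directory nor an archive, so class loading ignores it.
        return "plain file";
    }

    public static void main(String[] args) {
        // Check the classpath given as the first argument, or else the JVM's own classpath.
        String classpath = args.length > 0 ? args[0] : System.getProperty("java.class.path");
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty()) {
                System.out.println(describe(entry) + "\t" + entry);
            }
        }
    }
}
```

On Windows the variable can be passed in quotes, for example java ClasspathCheck "%CLASSPATH%", so that entries containing spaces such as the Program Files paths stay in one argument.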




RE: pdfboxhelp

2004-08-22 Thread Natarajan.T
Hi Santhosh,

The attached file must be in your class path.


Natarajan.



-Original Message-
From: Santosh [mailto:[EMAIL PROTECTED] 
Sent: Monday, August 23, 2004 10:51 AM
To: Lucene Users List
Subject: Fw: pdfboxhelp

Hi Karthik,
Did you find a solution? Should I send the PDF to you?

RE: pdfboxhelp

2004-08-22 Thread Karthik N S
Hi Santosh

  Hold on, it's Monday and I am running behind schedule with my job. I will
reply to you some time around noon.


 Karthik

-Original Message-
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: Monday, August 23, 2004 10:51 AM
To: Lucene Users List
Subject: Fw: pdfboxhelp


Hi Karthik,
Did you find a solution? Should I send the PDF to you?

RE: pdfboxhelp

2004-08-21 Thread Natarajan.T
Hi Santhosh,

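Try the code below (the PDFBox jar must be on your classpath).

// Package names as used by PDFBox 0.6.x.
import java.io.IOException;
import java.io.InputStream;
import org.pdfbox.encryption.DecryptDocument;
import org.pdfbox.pdfparser.PDFParser;
import org.pdfbox.pdmodel.PDDocument;
import org.pdfbox.pdmodel.PDDocumentInformation;
import org.pdfbox.util.PDFTextStripper;

// Extracts the text of a PDF read from the given stream.
public String getContent(InputStream reader) throws IOException {
    PDDocument pdDoc = null;
    String pdftext = "";
    try {
        PDFParser parser = new PDFParser(reader);
        parser.parse();
        pdDoc = parser.getPDDocument();

        if (pdDoc.isEncrypted()) {
            // Try the default (empty) password.
            DecryptDocument decryptor = new DecryptDocument(pdDoc);
            decryptor.decryptDocument("");
        }
        PDFTextStripper stripper = new PDFTextStripper();
        pdftext = stripper.getText(pdDoc);

        // Title, author and so on; not used further in this method.
        PDDocumentInformation info = pdDoc.getDocumentInformation();
    } catch (Exception err) {
        System.out.println(err.getMessage());
    } finally {
        // pdDoc is still null if parsing failed before it was assigned.
        if (pdDoc != null) {
            pdDoc.close();
        }
    }
    return pdftext;
}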

Natarajan.

-Original Message-
From: Santosh [mailto:[EMAIL PROTECTED] 
Sent: Saturday, August 21, 2004 3:14 PM
To: Lucene Users List
Subject: Re: pdfboxhelp

Hi Don,

Your idea is nice, but whenever I write the following code in
IndexHTML.java of Lucene


import org.pdfbox.searchengine.lucene.*;

File pdfFile = new File("/path/to/the/file.pdf");

// Below returns the parsed PDF file as a Lucene Document object.
Document doc = LucenePDFDocument.getDocument(pdfFile);

I am getting the following error:

package org.pdfbox.searchengine.lucene does not exist

I have downloaded the PDFBox source code and put the jar file in the
classpath. Please help me with this.
  - Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 7:37 PM
  Subject: Re: pdfboxhelp


  Here is the super simple code required.

  import org.pdfbox.searchengine.lucene.*;

  File pdfFile = new File("/path/to/the/file.pdf");

  // Below returns the parsed PDF file as a Lucene Document object.
  Document doc = LucenePDFDocument.getDocument(pdfFile);

  
  Santosh wrote:

Exactly, that is what I need.
  - Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 6:39 PM
  Subject: Re: pdfboxhelp


  What are your intentions with PDFBox?

  Do you want to use it to index PDF files?

  Santosh wrote:

hi,

I have downloaded the PDFBox zip, but I am not sure where to start. How can
I try the demo? I don't see any help document in this download. Please
help me.


regards
Santosh kumar
SoftPro Systems
Hyderabad


The harder you train in peace, the lesser you bleed in war

---SOFTPRO DISCLAIMER--

Information contained in this E-MAIL and any attachments are
confidential being  proprietary to SOFTPRO SYSTEMS  is 'privileged'
and 'confidential'.

If you are not an intended or authorised recipient of this E-MAIL or
have received it in error, You are notified that any use, copying or
dissemination  of the information contained in this E-MAIL in any
manner whatsoever is strictly prohibited. Please delete it immediately
and notify the sender by E-MAIL.

In such a case reading, reproducing, printing or further dissemination
of this E-MAIL is strictly prohibited and may be unlawful.

SOFTPRO SYSYTEMS does not REPRESENT or WARRANT that an attachment
hereto is free from computer viruses or other defects. 

The opinions expressed in this E-MAIL and any ATTACHEMENTS may be
those of the author and are not necessarily those of SOFTPRO SYSTEMS.


  



  -- 
  Don Vaillancourt
  Director of Software Development

  WEB IMPACT INC.
  phone: 416-815-2000 ext. 245
  fax: 416-815-2001
  email: [EMAIL PROTECTED]
  web: http://www.web-impact.com



  This email message is intended only for the addressee(s)
  and contains information that may be confidential and/or
  copyright. If you are not the intended recipient please
  notify the sender by reply email and immediately delete
  this email. Use, disclosure or reproduction of this email
  by anyone other than the intended recipient(s) is strictly
  prohibited. No representation is made that this email or
  any attachments are free of viruses. Virus scanning is
  recommended and is the responsibility of the recipient.




Re: pdfboxhelp

2004-08-21 Thread Santosh
Thanks Natarajan and Karthik,

I corrected the classpath.

But where should I put your code? Should I add it to IndexHTML.java, which
comes with Lucene, or somewhere else?

One more thing: I put the PDFBox jar file in the classpath. Is this enough,
or do I have to build PDFBox?

Thank you
- Original Message -
From: Natarajan.T [EMAIL PROTECTED]
To: 'Lucene Users List' [EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 3:20 PM
Subject: RE: pdfboxhelp


 Hi Santhosh,

 Try out this below code. (pdfbox.jar file must be in your classpath)

 public String getContent(InputStream reader) throws IOException {
     PDFParser parser = null;
     PDDocument pdDoc = null;
     PDFTextStripper stripper = null;
     String pdftext = "";
     try {
         parser = new PDFParser(reader);
         parser.parse();
         pdDoc = parser.getPDDocument();
         if (pdDoc.isEncrypted()) {
             DecryptDocument decryptor = new DecryptDocument(pdDoc);
             decryptor.decryptDocument("");
         }
         stripper = new PDFTextStripper();
         pdftext = stripper.getText(pdDoc);

         info = pdDoc.getDocumentInformation();
     } catch (Exception err) {
         System.out.println(err.getMessage());
     }
     pdDoc.close();
     return pdftext;
 }

 Natarajan.

 -Original Message-
 From: Santosh [mailto:[EMAIL PROTECTED]
 Sent: Saturday, August 21, 2004 3:14 PM
 To: Lucene Users List
 Subject: Re: pdfboxhelp

 Hi Don,

 Your idea is nice, but whenever I write the following code in
 IndexHTML.java of Lucene


 import org.pdfbox.searchengine.lucene.*;

 File pdfFile = new File("/path/to/the/file.pdf");

 // Below returns the parsed PDF file as a Lucene Document object.
 Document doc = LucenePDFDocument.getDocument(pdfFile);

 I am getting the following error:

 package org.pdfbox.searchengine.lucene does not exist

 I have downloaded the PDFBox source code and put the jar file in the
 classpath. Please help me with this.
 - Original Message -
 From: Don Vaillancourt
 To: Lucene Users List
 Sent: Friday, August 20, 2004 7:37 PM
 Subject: Re: pdfboxhelp


   Here is the super simple code required.

   import org.pdfbox.searchengine.lucene.*;

   File pdfFile = new File("/path/to/the/file.pdf");

   // Below returns the parsed PDF file as a Lucene Document object.
   Document doc = LucenePDFDocument.getDocument(pdfFile);

   Santosh wrote:

 Exactly, that is what I need.
 - Original Message -
 From: Don Vaillancourt
 To: Lucene Users List
 Sent: Friday, August 20, 2004 6:39 PM
 Subject: Re: pdfboxhelp


   What are your intentions with PDFBox?

   Do you want to use it to index PDF files?

   Santosh wrote:

 hi,

 I have downloaded the PDFBox zip, but I am not sure where to start. How
 can I try the demo? I don't see any help document in this download.
 Please help me.


 regards
 Santosh kumar
 SoftPro Systems
 Hyderabad


 The harder you train in peace, the lesser you bleed in war

 






pdfboxhelp

2004-08-20 Thread Santosh
hi,

I have downloaded the PDFBox zip, but I am not sure where to start. How can
I try the demo? I don't see any help document in this download. Please help me.


regards
Santosh kumar
SoftPro Systems
Hyderabad


The harder you train in peace, the lesser you bleed in war






Re: pdfboxhelp

2004-08-20 Thread Don Vaillancourt




What are your intentions with PDFBox?

Do you want to use it to index PDF files?

Santosh wrote:

  hi,

I have downloaded the PDFBox zip, but I am not sure where to start. How can I try the demo? I don't see any help document in this download. Please help me.


regards
Santosh kumar
SoftPro Systems
Hyderabad


"The harder you train in peace, the lesser you bleed in war"



  



-- 

Don Vaillancourt
Director of Software Development


WEB IMPACT INC.
phone: 416-815-2000 ext. 245
fax: 416-815-2001
email: [EMAIL PROTECTED]
web: http://www.web-impact.com




This email message is intended only for the addressee(s)
and contains information that may be confidential and/or
copyright. If you are not the intended recipient please
notify the sender by reply email and immediately delete
this email. Use, disclosure or reproduction of this email
by anyone other than the intended recipient(s) is strictly
prohibited. No representation is made that this email or
any attachments are free of viruses. Virus scanning is
recommended and is the responsibility of the recipient.




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: pdfboxhelp

2004-08-20 Thread Santosh
Exactly, that is what I need.
  - Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 6:39 PM
  Subject: Re: pdfboxhelp


  What are your intentions with PDFBox?

  Do you want to use it to index PDF files?

  Santosh wrote:

hi,

I have downloaded the PDFBox zip, but I am not sure where to start. How can
I try the demo? I don't see any help document in this download. Please help me.


regards
Santosh kumar
SoftPro Systems
Hyderabad


The harder you train in peace, the lesser you bleed in war



  



  -- 
  Don Vaillancourt
  Director of Software Development

  WEB IMPACT INC.
  phone: 416-815-2000 ext. 245
  fax: 416-815-2001
  email: [EMAIL PROTECTED]
  web: http://www.web-impact.com



  This email message is intended only for the addressee(s)
  and contains information that may be confidential and/or
  copyright. If you are not the intended recipient please
  notify the sender by reply email and immediately delete
  this email. Use, disclosure or reproduction of this email
  by anyone other than the intended recipient(s) is strictly
  prohibited. No representation is made that this email or
  any attachments are free of viruses. Virus scanning is
  recommended and is the responsibility of the recipient.









--


  -
  To unsubscribe, e-mail: [EMAIL PROTECTED]
  For additional commands, e-mail: [EMAIL PROTECTED]






Re: pdfboxhelp

2004-08-20 Thread Don Vaillancourt




Here is the super simple code required.

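import java.io.File;
import org.apache.lucene.document.Document;
import org.pdfbox.searchengine.lucene.*;

File pdfFile = new File("/path/to/the/file.pdf");

// Returns the parsed PDF file as a Lucene Document object
// (getDocument can throw an IOException).
Document doc = LucenePDFDocument.getDocument(pdfFile);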
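
To index more than one file, a call like the one above would normally sit inside a loop over the PDFs on disk. What follows is only a sketch of that surrounding step, written with the standard library alone. The class name PdfFileCollector and its collect method are hypothetical helpers, not part of PDFBox or Lucene, and the PDFBox call appears only in a comment.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of PDFBox or Lucene): gathers the PDF
// files under a directory so that each one can be converted and indexed.
final class PdfFileCollector {

    /** Recursively collects every regular file under root whose name ends in ".pdf", ignoring case. */
    static List<File> collect(File root) {
        List<File> pdfs = new ArrayList<>();
        walk(root, pdfs);
        return pdfs;
    }

    private static void walk(File entry, List<File> pdfs) {
        if (entry.isDirectory()) {
            File[] children = entry.listFiles(); // null if the directory cannot be read
            if (children == null) {
                return;
            }
            Arrays.sort(children); // visit entries in a predictable order
            for (File child : children) {
                walk(child, pdfs);
            }
        } else if (entry.isFile()
                && entry.getName().toLowerCase(Locale.ROOT).endsWith(".pdf")) {
            pdfs.add(entry);
        }
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File pdf : collect(root)) {
            // Each file listed here could be handed to
            // LucenePDFDocument.getDocument(pdf) as in the snippet above.
            System.out.println(pdf.getPath());
        }
    }
}
```

Running it with a directory argument prints the path of each PDF found. Files with other extensions are skipped, and unreadable directories are silently ignored.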

 
Santosh wrote:

  Exactly, that is what I need.
  - Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 6:39 PM
  Subject: Re: pdfboxhelp


  What are your intentions with PDFBox?

  You want to use it to index PDF files?

  Santosh wrote:

hi,

I have downloaded the PDFBox zip, but I am unsure where to start. How can I try the demo? I don't see any help document in this download. Please help me.


regards
Santosh kumar
SoftPro Systems
Hyderabad


"The harder you train in peace, the lesser you bleed in war"

-- 

Don Vaillancourt
Director of Software Development


WEB IMPACT INC.
phone: 416-815-2000 ext. 245
fax: 416-815-2001
email: [EMAIL 

Re: pdfboxhelp

2004-08-20 Thread Santosh

  - Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 7:37 PM
  Subject: Re: pdfboxhelp



Re: pdfboxhelp

2004-08-20 Thread Don Vaillancourt




Did I leave you speechless!? :-)

Santosh wrote:

- Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 7:37 PM
  Subject: Re: pdfboxhelp



Re: pdfboxhelp

2004-08-20 Thread Santosh
I am sorry, that mail was sent accidentally.
  - Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 8:02 PM
  Subject: Re: pdfboxhelp


  Did I leave you speechless!?  :-)
