problem restoring index

2004-12-08 Thread Santosh
hi,

When I restart Tomcat, the index gets corrupted. If I take a backup of the
index and then restart Tomcat, the index does not work properly.

Do I have to index all the documents again whenever I restart Tomcat?




---SOFTPRO DISCLAIMER--



Information contained in this E-MAIL and any attachments are

confidential being  proprietary to SOFTPRO SYSTEMS  is 'privileged'

and 'confidential'.



If you are not an intended or authorised recipient of this E-MAIL or

have received it in error, You are notified that any use, copying or

dissemination  of the information contained in this E-MAIL in any

manner whatsoever is strictly prohibited. Please delete it immediately

and notify the sender by E-MAIL.



In such a case reading, reproducing, printing or further dissemination

of this E-MAIL is strictly prohibited and may be unlawful.



SOFTPRO SYSYTEMS does not REPRESENT or WARRANT that an attachment

hereto is free from computer viruses or other defects.



The opinions expressed in this E-MAIL and any ATTACHEMENTS may be

those of the author and are not necessarily those of SOFTPRO SYSTEMS.





Help for alternatives for search words

2004-11-29 Thread Santosh
I am using Lucene for searching. It is working fine, but I have a problem with
alternate words. If I search for fooddy, Lucene gives results for food as well.

How can I trace these alternate words (foam, for example) in the documents? How
does Lucene support this feature? How can I get a list of alternate words? I
need three alternate words for each search word. Each time a user enters roam,
I need to show: are you searching for? (food, foods, foody)
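
A fuzzy query finds documents that contain similar terms, but it does not by itself hand back a list of suggestions. One way to build a "did you mean" list is to compare the search word against a list of known words and keep the closest few by edit distance. The sketch below is plain Java with no Lucene calls; the names DidYouMean and suggest and the sample vocabulary are made up for illustration, and in a real application the vocabulary would presumably be taken from the terms stored in the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class DidYouMean {

    /** Levenshtein edit distance, computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns up to max vocabulary words closest to query, nearest first. */
    static List<String> suggest(String query, List<String> vocabulary, int max) {
        List<String> candidates = new ArrayList<>();
        for (String word : vocabulary) {
            if (!word.equals(query)) {
                candidates.add(word);
            }
        }
        // Sort by distance to the query; ties are broken alphabetically.
        candidates.sort(
            Comparator.comparingInt((String w) -> editDistance(query, w))
                      .thenComparing(Comparator.naturalOrder()));
        return candidates.subList(0, Math.min(max, candidates.size()));
    }

    public static void main(String[] args) {
        // Hypothetical vocabulary; a real one would come from the index.
        List<String> vocabulary =
            Arrays.asList("food", "foods", "foody", "roam", "foam", "index");
        System.out.println(suggest("fooddy", vocabulary, 3));
    }
}
```

With the sample vocabulary, suggest("fooddy", vocabulary, 3) returns foody, food, foods, which could be shown to the user as the "are you searching for?" list.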






Re: modifying existing index

2004-11-24 Thread Santosh
I am now able to delete from the index using the following:

if (indexDir.exists())
{
    IndexReader reader = IndexReader.open(indexDir);
    uidIter = reader.terms(new Term("id", ""));
    while (uidIter.term() != null && uidIter.term().field() == "id") {
        reader.delete(uidIter.term());
        uidIter.next();
    }
    reader.close();
}

where id is the keyword field. But this also deletes all the documents.
How can I modify my code to delete only the particular document with a
given id?





I am creating the index in the following way:

Document doc = new Document();
doc.add(Field.Text("text", text));
doc.add(Field.Keyword("id", Long.toString(id)));
doc.add(Field.Keyword("title", title));
doc.add(Field.Keyword("keywords", keywords));
doc.add(Field.Keyword("type", type));
writer.addDocument(doc);
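
The loop in the message above walks every term in the id field and deletes each one, which would explain why all the documents disappear. Following Chuck's suggestion in this thread, deleting a single document only needs one call to IndexReader.delete(Term) with that document's id value. The sketch below assumes the Lucene 1.4-era API used elsewhere in this thread and the field names from the indexing code above; the class name UpdateById and its parameters are hypothetical.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class UpdateById {

    /** Replaces the document whose "id" keyword equals the given id. */
    public static void update(String indexPath, long id, String text)
            throws IOException {
        // Step 1: mark the old copy as deleted, matching one exact term.
        IndexReader reader = IndexReader.open(indexPath);
        try {
            reader.delete(new Term("id", Long.toString(id)));
        } finally {
            // Close the reader before opening the writer.
            reader.close();
        }

        // Step 2: add the modified document to the existing index.
        Document doc = new Document();
        doc.add(Field.Keyword("id", Long.toString(id)));
        doc.add(Field.Text("text", text));

        // create=false appends to the index instead of rebuilding it.
        IndexWriter writer =
            new IndexWriter(indexPath, new StandardAnalyzer(), false);
        try {
            writer.addDocument(doc);
        } finally {
            writer.close();
        }
    }
}
```

Only the id and text fields are shown; the title, keywords, and type fields would be added in the same way as in the indexing code above.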









- Original Message -
From: Chuck Williams [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Wednesday, November 24, 2004 1:06 PM
Subject: RE: modifying existing index


A good way to do this is to add a keyword field with whatever unique id
you have for the document.  Then you can delete the term containing a
unique id to delete the document from the index (look at
IndexReader.delete(Term)).  You can look at the demo class IndexHTML to
see how it does incremental indexing for an example.

Chuck



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]






modifying existing index

2004-11-23 Thread Santosh
I am using Lucene for indexing. When I create the index, the documents are
added. But when I want to modify a single existing document and re-index it,
it is treated as a new document and added once more, so I get the same
document twice in the results.
To overcome this I am deleting the existing index and recreating the whole
index. But is it possible to index the modified document again and overwrite
the existing document without deleting and recreating the index? Can I do
this? If so, how?

And one more question.
Can Lucene do stemming?
If I search for roam, I know that it can give results for foam using a fuzzy
query. But my requirement is: if I search for roam, can I get the list of
similar words as output, so that I can show the end user something like
---   do you mean foam?
How can I get the list of similar words in the given content?









fetching similar wordlist as given word

2004-11-23 Thread Santosh
Can Lucene do stemming?
If I search for roam, I know that it can give results for foam using a fuzzy
query. But my requirement is: if I search for roam, can I get the list of
similar words as output, so that I can show the end user something like
---   do you mean foam?
How can I get the list of similar words in the given content?








Re: modifying existing index

2004-11-23 Thread Santosh
I have gone through IndexReader and found the method delete(int docNum),
but where do I get the document number from? Is it predefined, or do we have
to assign a number prior to indexing?
- Original Message -
From: Luke Francl [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Wednesday, November 24, 2004 1:26 AM
Subject: Re: modifying existing index


 On Tue, 2004-11-23 at 13:59, Santosh wrote:
  I am using Lucene for indexing. When I create the index, the documents
are added. But when I want to modify a single existing document and
re-index it, it is treated as a new document and added once more, so
that I get the same document twice in the results.
  To overcome this I am deleting the existing index and recreating the whole
index. But is it possible to index the modified document again and overwrite
the existing document without deleting and recreating? Can I do this? If so, how?

 You do not need to recreate the whole index. Just mark the document as
 deleted using the IndexReader and then add it again with the
 IndexWriter. Remember to close your IndexReader and IndexWriter after
 doing this.

 The deleted document will be removed the next time you optimize your
 index.

 Luke Francl








Re: worddoucments search

2004-08-25 Thread Santosh
I have gone through textmining.org and I am able to extract the text as a
string. But how can I get it in Lucene Document format?
- Original Message -
From: Otis Gospodnetic [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Tuesday, August 24, 2004 11:54 PM
Subject: Re: worddoucments search


 As I just answered in a separate email to Ryan - we used textmining.org library, too,
as an example of something that is easier to use than POI.  It's been a while since I
wrote that chapter, so it slipped my mind when I replied.  Yes, use textmining.org
first, you'll be able to include it in your code in 2 minutes.  Good stuff.

 Otis








integrationofLucene and PDF box

2004-08-24 Thread Santosh
Has anybody integrated Lucene with PDFBox?
Can we do it by changing the code in IndexFiles.java or IndexHTML.java?

regards
Santosh kumar







Re: integration of lucene with pdfbox

2004-08-24 Thread Santosh
I don't know how to add a Lucene Document to the index; I only know how to
add a given directory.
Can anybody please tell me how to add a Lucene Document to the index?
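
Adding a Lucene Document to an index takes an IndexWriter and a call to addDocument. The sketch below combines that with the LucenePDFDocument.getDocument call that Ben mentions in the quoted reply; it assumes PDFBox 0.6.x and Lucene 1.4 on the classpath, and the class name IndexOnePdf and its two command-line arguments are hypothetical.

```java
import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.pdfbox.searchengine.lucene.LucenePDFDocument;

public class IndexOnePdf {

    public static void main(String[] args) throws IOException {
        File pdfFile = new File(args[0]);   // PDF to index
        String indexDir = args[1];          // directory that holds the index

        // PDFBox parses the PDF and returns a ready-made Lucene Document.
        Document doc = LucenePDFDocument.getDocument(pdfFile);

        // create=true starts a new index; pass false to add to an existing one.
        IndexWriter writer =
            new IndexWriter(indexDir, new StandardAnalyzer(), true);
        try {
            writer.addDocument(doc);
            writer.optimize();
        } finally {
            writer.close();
        }
    }
}
```

The same two lines (getDocument, then addDocument) could be placed in a copy of the demo indexer for files whose names end in .pdf.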
- Original Message -
From: Ben Litchfield [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, August 23, 2004 8:13 PM
Subject: Re: integration of lucene with pdfbox




 If you can use lucene on its own then you already know how to add a lucene
 Document to the index.  So you need to be able to take a PDF and get a
 lucene Document.

 org.pdfbox.searchengine.lucene.LucenePDFDocument.getDocument()

 does that for you.

 Ben


 On Mon, 23 Aug 2004, Santosh wrote:

  I have downloaded pdfbox and lucene and kept jar files in the class
path, I am able to work with both of them independently but how can I
integrate both
 
  regards
  Santosh kumar
 
  
 







worddoucments search

2004-08-24 Thread Santosh
Can Lucene search Word documents? If so, please give me information about it.

regards
Santosh kumar







Re: pdfboxhelp

2004-08-23 Thread Santosh
Hi natarajan,
I kept log4j.properties in the classpath
my new classpath is

.;..;C:\j2sdk1.4.1\lib;C:\j2sdk1.4.1\lib\jndi.jar;
C:\j2sdk1.4.1\lib\webclient.jar;C:\j2sdk1.4.1\lib\mail.jar;
C:\j2sdk1.4.1\lib\activation.jar;C:\j2sdk1.4.1\lib\xml-apis.jar;D:\JAVAPRO;
C:\j2sdk1.4.1\jre\lib\ext\msbase.jar;C:\j2sdk1.4.1\lib\servlet.jar;
E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;
C:\Program Files\Altova\xmlspy\XMLSpyInterface.jar;C:\j2sdk1.4.1\lib\sax.jar;
C:\j2sdk1.4.1\lib\dom.jar;C:\j2sdk1.4.1\lib\xalan.jar;
C:\j2sdk1.4.1\lib\xercesImpl.jar;C:\j2sdk1.4.1\lib\xmlParserAPIs.jar;
C:\j2sdk1.4.1\lib\parser.jar;C:\j2sdk1.4.1\lib\jaxp.jar;
C:\j2sdk1.4.1\lib\xml.jar;C:\j2sdk1.4.1\lib\classes12.zip;C:\struts.jar;
F:\apache-ant-1.6.1\lib\ant.jar;C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;
C:\j2sdk1.4.1\lib\lucene-20030909.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar;
C:\j2sdk1.4.1\lib\log4j.properties;

but there is no difference in the output


- Original Message -
From: Natarajan.T [EMAIL PROTECTED]
To: 'Lucene Users List' [EMAIL PROTECTED]
Sent: Monday, August 23, 2004 10:56 AM
Subject: RE: pdfboxhelp


 Hi Santhosh,

 The attached file must be in your class path.


 Natarajan.



 -Original Message-
 From: Santosh [mailto:[EMAIL PROTECTED]
 Sent: Monday, August 23, 2004 10:51 AM
 To: Lucene Users List
 Subject: Fw: pdfboxhelp

 hi karthik,
 did u find any solution? should I send the pdf to u?
 - Original Message -
 From: Santosh [EMAIL PROTECTED]
 To: Lucene Users List [EMAIL PROTECTED]
 Sent: Monday, August 23, 2004 10:23 AM
 Subject: Re: pdfboxhelp


  hi karthik,
   I kept log4j in the classpath , I am sending classpath variable
 
  CLASSPATH
 
 
  .;..;C:\j2sdk1.4.1\lib;C:\j2sdk1.4.1\lib\jndi.jar;
  C:\j2sdk1.4.1\lib\webclient.jar;C:\j2sdk1.4.1\lib\mail.jar;
  C:\j2sdk1.4.1\lib\activation.jar;C:\j2sdk1.4.1\lib\xml-apis.jar;D:\JAVAPRO;
  C:\j2sdk1.4.1\jre\lib\ext\msbase.jar;C:\j2sdk1.4.1\lib\servlet.jar;
  E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;
  C:\Program Files\Altova\xmlspy\XMLSpyInterface.jar;C:\j2sdk1.4.1\lib\sax.jar;
  C:\j2sdk1.4.1\lib\dom.jar;C:\j2sdk1.4.1\lib\xalan.jar;
  C:\j2sdk1.4.1\lib\xercesImpl.jar;C:\j2sdk1.4.1\lib\xmlParserAPIs.jar;
  C:\j2sdk1.4.1\lib\parser.jar;C:\j2sdk1.4.1\lib\jaxp.jar;
  C:\j2sdk1.4.1\lib\xml.jar;C:\j2sdk1.4.1\lib\classes12.zip;C:\struts.jar;
  F:\apache-ant-1.6.1\lib\ant.jar;C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;
  C:\j2sdk1.4.1\lib\lucene-20030909.jar;
  D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar
 
  please check the error
 
 
 
  - Original Message -
  From: Karthik N S [EMAIL PROTECTED]
  To: Lucene Users List [EMAIL PROTECTED]
  Sent: Monday, August 23, 2004 10:26 AM
  Subject: RE: pdfboxhelp
 
 
   Hi Santosh
  
 I think u'r Pdf is using  Log4j package ,Try toe set the classpath
 for
   log4j.jar path.
  
[ Is it a just a WARNING  or an ERROR  u are getting.
  
 Send me in u'r Configuration management Let me help u with it
 ; [
  
  
   Karthik
  
   -Original Message-
   From: Santosh [mailto:[EMAIL PROTECTED]
   Sent: Monday, August 23, 2004 10:11 AM
   To: Lucene Users List
   Cc: Ben Litchfield
   Subject: Re: pdfboxhelp
  
  
   hi karthik,
  
   I have downloaded pdfbox and kept pdfjar file in the classpath, but
 when
 I
   am typing following command in the command prompt I am getting the
 error:
  
   D:\setups\searchEngine\PDFBox-0.6.6\src>java org.pdfbox.ExtractText C:\test.pdf C:\test.txt
   log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParser).
   log4j:WARN Please initialize the log4j system properly
  
   why I am getting this error? plz help
  
  
   - Original Message -
   From: Karthik N S [EMAIL PROTECTED]
   To: Lucene Users List [EMAIL PROTECTED]
   Sent: Monday, August 23, 2004 9:21 AM
   Subject: RE: pdfboxhelp
  
  
Hi
   
   
To Begin with try to build Indexes offline  [ out of Tomcat
  container]
and  on completing indxexes, feed u'r search  with the realpath of
 the
   offline indexed folder,Start the Tomcat and then use the
search on As u experiment it out u will be comfortable
  withrequirment
   of Indexing /Search..   ; [
   
Karthik
   
-Original Message-
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 4:55 PM
To: Lucene Users List
Subject: Re: pdfboxhelp
   
   
Yes I did the same.
I copied all the classes into classes folder but
now when I am building the index using IndexHTML the pdfs are not
 added
  to
this index, only text and htmls are added to index.
what changes should I do for IndexHTML.java to build index with
 pdf
- Original Message -
From: Karthik N S [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 4:54 PM
Subject: RE: pdfboxhelp
   
   
 Hi

 If u are using

Re: pdfboxhelp

2004-08-23 Thread Santosh
I kept the file in the classpath

.;..;C:\j2sdk1.4.1\lib;C:\j2sdk1.4.1\lib\jndi.jar;
C:\j2sdk1.4.1\lib\webclient.jar;C:\j2sdk1.4.1\lib\mail.jar;
C:\j2sdk1.4.1\lib\activation.jar;D:\JAVAPRO;
E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;
C:\j2sdk1.4.1\lib\classes12.zip;C:\struts.jar;
C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;C:\j2sdk1.4.1\lib\lucene-20030909.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar;
C:\j2sdk1.4.1\lib\log4j.properties;
D:\setups\searchEngine\PDFBox-0.6.6\external\ant.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\checkstyle-all-2.4.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\junit.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\lucene-1.4-final.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\lucene-demos-1.4-final.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\xercesImpl.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\xml-apis.jar;



but there is no change in the output, it is same as previous

E:\>java org.pdfbox.ExtractText C:\test.pdf C:\test.txt
log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParser).
log4j:WARN Please initialize the log4j system properly.

what might be the error?
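
The two log4j:WARN lines are warnings rather than errors: log4j 1.x prints them when it finds no configuration. It looks for a file named log4j.properties as a classpath resource, so the directory that contains the file, rather than the path to the file itself, has to be on the classpath. A minimal configuration is sketched below; it is an assumption about what such a file could contain and not the file Natarajan attached.

```properties
# Send WARN and above from all loggers to the console.
log4j.rootLogger=WARN, stdout

# Console appender named "stdout" with a simple one-line layout.
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%-5p [%c] %m%n
```

With a file like this found on the classpath, the "No appenders could be found" message should no longer appear.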


- Original Message -
From: Natarajan.T [EMAIL PROTECTED]
To: 'Lucene Users List' [EMAIL PROTECTED]
Sent: Monday, August 23, 2004 10:56 AM
Subject: RE: pdfboxhelp


 Hi Santhosh,

 The attached file must be in your class path.


 Natarajan.




integration of lucene with pdfbox

2004-08-23 Thread Santosh
I have downloaded PDFBox and Lucene and put the jar files in the classpath. I am
able to work with each of them independently, but how can I integrate the two?

regards
Santosh kumar






Re: pdfboxhelp

2004-08-22 Thread Santosh
hi karthik,

I have downloaded PDFBox and put the PDFBox jar file in the classpath, but when
I type the following command at the command prompt I get this error:

D:\setups\searchEngine\PDFBox-0.6.6\src>java org.pdfbox.ExtractText C:\test.pdf C:\test.txt
log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParser).
log4j:WARN Please initialize the log4j system properly

Why am I getting this error? Please help.


- Original Message -
From: Karthik N S [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, August 23, 2004 9:21 AM
Subject: RE: pdfboxhelp


 Hi


 To begin with, try to build the indexes offline [out of the Tomcat container],
 and on completing the indexes, feed u'r search with the real path of the
 offline indexed folder. Start Tomcat and then use the search. As u experiment
 with it u will become comfortable with the requirements of indexing/search..   ; [

 Karthik

 -Original Message-
 From: Santosh [mailto:[EMAIL PROTECTED]
 Sent: Saturday, August 21, 2004 4:55 PM
 To: Lucene Users List
 Subject: Re: pdfboxhelp


 Yes I did the same.
 I copied all the classes into classes folder but
 now when I am building the index using IndexHTML the pdfs are not added to
 this index, only text and htmls are added to index.
 what changes should I do for IndexHTML.java to build index with pdf
 - Original Message -
 From: Karthik N S [EMAIL PROTECTED]
 To: Lucene Users List [EMAIL PROTECTED]
 Sent: Saturday, August 21, 2004 4:54 PM
 Subject: RE: pdfboxhelp


  Hi
 
  If u are using the jar file with Web Interface for jsp/servlet dev,
Place
  the jar file in  webapps/u'rapplication/Web-inf/lib
  and also correct the Classpath for the present modification.
 
  2)create u'r own package and put all u'r java files  copy the java files
 to
  /Web-inf/Classes/u'r package
 
 
  Then use the same..;{
 
 
  Karthik
 
  -Original Message-
  From: Santosh [mailto:[EMAIL PROTECTED]
  Sent: Saturday, August 21, 2004 4:31 PM
  To: Lucene Users List
  Subject: Re: pdfboxhelp
 
 
  thanks  Natarajan and karthik,
 
  I corrected classpath
 
  but where I should write your code?
  should I write your code in IndexHTML.java  which comes along with
lucene
 or
  some other place?
  one more thing
  I kept pdfbox jar file in the classpath is this enough or I have to
build
  the pdfbox?
 
  thankyou
  - Original Message -
  From: Natarajan.T [EMAIL PROTECTED]
  To: 'Lucene Users List' [EMAIL PROTECTED]
  Sent: Saturday, August 21, 2004 3:20 PM
  Subject: RE: pdfboxhelp
 
 
   Hi Santhosh,
  
   Try out this below code.(pdfbox.jar file must be in your
classpath)
  
   public String getContent(InputStream reader) throws IOException {
       PDFParser parser = null;
       PDDocument pdDoc = null;
       PDFTextStripper stripper = null;
       String pdftext = "";
       try {
           parser = new PDFParser(reader);
           parser.parse();
           pdDoc = parser.getPDDocument();
           if (pdDoc.isEncrypted()) {
               DecryptDocument decryptor = new DecryptDocument(pdDoc);
               decryptor.decryptDocument("");
           }
           stripper = new PDFTextStripper();
           pdftext = stripper.getText(pdDoc);
           // info is assumed to be a field declared elsewhere in the class
           info = pdDoc.getDocumentInformation();
       } catch (Exception err) {
           System.out.println(err.getMessage());
       }
       // pdDoc stays null if parsing failed, so guard the close
       if (pdDoc != null) {
           pdDoc.close();
       }
       return pdftext;
   }
  
   Natarajan.
  
   -Original Message-
   From: Santosh [mailto:[EMAIL PROTECTED]
   Sent: Saturday, August 21, 2004 3:14 PM
   To: Lucene Users List
   Subject: Re: pdfboxhelp
  
   Hi Don,
  
   your idea is nice, but whenever I write the following code in
   IndexHTML.java of lucene

   import org.pdfbox.searchengine.lucene.*;

   File pdfFile = new File("/path/to/the/file.pdf");

   // Below returns a parsed PDF file in a Lucene Document object.
   Document doc = LucenePDFDocument.getDocument(pdfFile);

   I am getting the following error

   package org.pdfbox.searchengine.lucene does not exist

   I have downloaded pdfbox source code and kept the jar file in the
   classpath, please help me on this
   - Original Message -
   From: Don Vaillancourt
   To: Lucene Users List
   Sent: Friday, August 20, 2004 7:37 PM
   Subject: Re: pdfboxhelp
  
  
   Here is the super simple code required.

   import org.pdfbox.searchengine.lucene.*;

   File pdfFile = new File("/path/to/the/file.pdf");

   // Below returns a parsed PDF file in a Lucene Document object.
   Document doc = LucenePDFDocument.getDocument(pdfFile);
  
 Santosh wrote:
  
   exactly, the same is required to me
   - Original Message -
   From: Don Vaillancourt
   To: Lucene Users List
   Sent: Friday, August 20, 2004 6:39 PM
   Subject: Re: pdfboxhelp
  
  
 What are your intensions with PDFBox?
  
 You want to use it to index PDF files?
  
 Santosh wrote:
  
   hi,
  
   I have downloaded pdfbox zip. but i am in ambigous state that where to
   start. how can I check with demo, I dont see any help document with
this
   download, please help me.
  
  
   regards
   Santosh kumar
   SoftPro Systems

Re: pdfboxhelp

2004-08-22 Thread Santosh
hi karthik,
 I kept log4j in the classpath , I am sending classpath variable

CLASSPATH

.;..;C:\j2sdk1.4.1\lib;C:\j2sdk1.4.1\lib\jndi.jar;
C:\j2sdk1.4.1\lib\webclient.jar;C:\j2sdk1.4.1\lib\mail.jar;
C:\j2sdk1.4.1\lib\activation.jar;C:\j2sdk1.4.1\lib\xml-apis.jar;D:\JAVAPRO;
C:\j2sdk1.4.1\jre\lib\ext\msbase.jar;C:\j2sdk1.4.1\lib\servlet.jar;
E:\Program Files\Apache Tomcat 4.0\common\lib\servlet.jar;
C:\Program Files\Altova\xmlspy\XMLSpyInterface.jar;C:\j2sdk1.4.1\lib\sax.jar;
C:\j2sdk1.4.1\lib\dom.jar;C:\j2sdk1.4.1\lib\xalan.jar;
C:\j2sdk1.4.1\lib\xercesImpl.jar;C:\j2sdk1.4.1\lib\xmlParserAPIs.jar;
C:\j2sdk1.4.1\lib\parser.jar;C:\j2sdk1.4.1\lib\jaxp.jar;
C:\j2sdk1.4.1\lib\xml.jar;C:\j2sdk1.4.1\lib\classes12.zip;C:\struts.jar;
F:\apache-ant-1.6.1\lib\ant.jar;C:\j2sdk1.4.1\lib\PDFBox-0.6.6.jar;
C:\j2sdk1.4.1\lib\lucene-20030909.jar;
D:\setups\searchEngine\PDFBox-0.6.6\external\log4j.jar

please check the error



- Original Message -
From: Karthik N S [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, August 23, 2004 10:26 AM
Subject: RE: pdfboxhelp


 Hi Santosh

   I think your PDF setup is using the Log4j package. Try to set the classpath
 to include the log4j.jar path.

  [ Is it just a WARNING or an ERROR you are getting?

   Send me your configuration and let me help you with it ; [


 Karthik

 -Original Message-
 From: Santosh [mailto:[EMAIL PROTECTED]
 Sent: Monday, August 23, 2004 10:11 AM
 To: Lucene Users List
 Cc: Ben Litchfield
 Subject: Re: pdfboxhelp


 hi karthik,

 I have downloaded pdfbox and kept pdfjar file in the classpath, but when I
 am typing following command in the command prompt I am getting the error:

 D:\setups\searchEngine\PDFBox-0.6.6\src>java org.pdfbox.ExtractText C:\test.pdf C:\test.txt
 log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParser).
 log4j:WARN Please initialize the log4j system properly

 why I am getting this error? plz help
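
The two log4j:WARN lines are usually warnings rather than failures: log4j 1.x prints them when it finds no configuration, and the program itself may still have run. One common remedy is a log4j.properties file in a directory on the classpath. The sketch below is a hedged illustration, not PDFBox's own configuration: the Log4jDefaults class is hypothetical, the property keys are standard log4j 1.x keys, and the WARN level and pattern are arbitrary choices.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper that produces the text of a minimal log4j 1.x configuration.
final class Log4jDefaults {

    /** Builds a configuration that sends WARN and above to the console. */
    static Properties consoleConfig() {
        Properties props = new Properties();
        props.setProperty("log4j.rootLogger", "WARN, stdout");
        props.setProperty("log4j.appender.stdout", "org.apache.log4j.ConsoleAppender");
        props.setProperty("log4j.appender.stdout.layout", "org.apache.log4j.PatternLayout");
        props.setProperty("log4j.appender.stdout.layout.ConversionPattern", "%d %-5p %c - %m%n");
        return props;
    }

    /** Renders the properties as sorted key=value lines, ready to save as log4j.properties. */
    static String render(Properties props) {
        Map<String, String> sorted = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            sorted.put(key, props.getProperty(key));
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : sorted.entrySet()) {
            out.append(entry.getKey()).append('=').append(entry.getValue()).append('\n');
        }
        return out.toString();
    }
}
```

Saving the output of Log4jDefaults.render(Log4jDefaults.consoleConfig()) as log4j.properties in a classpath directory (for example the current directory, since "." is on the classpath above) should give log4j an appender to use.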


 - Original Message -
 From: Karthik N S [EMAIL PROTECTED]
 To: Lucene Users List [EMAIL PROTECTED]
 Sent: Monday, August 23, 2004 9:21 AM
 Subject: RE: pdfboxhelp


  Hi
 
 
  To begin with, try to build the indexes offline [outside the Tomcat
  container], and on completing the indexes, feed your search with the real
  path of the offline indexed folder. Start Tomcat and then use the
  search. As you experiment with it you will get comfortable with the
  requirements of indexing/search..   ; [
 
  Karthik
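
Karthik's suggestion amounts to keeping the index in a fixed directory outside the web application and handing the search code its real path. A minimal sketch of that lookup follows, assuming the path arrives as a plain string (from a system property, a servlet init parameter, or similar). IndexLocation is a hypothetical name, and the code only validates the directory; it does not open the index.

```java
import java.io.File;

// Hypothetical helper: resolves and validates the directory of an index built offline.
final class IndexLocation {

    /**
     * Turns a configured path into an absolute directory, failing early
     * if the path is blank or the directory does not exist.
     */
    static File resolve(String configuredPath) {
        if (configuredPath == null || configuredPath.trim().isEmpty()) {
            throw new IllegalArgumentException("index directory is not configured");
        }
        File dir = new File(configuredPath.trim()).getAbsoluteFile();
        if (!dir.isDirectory()) {
            throw new IllegalStateException("index directory not found: " + dir);
        }
        return dir;
    }
}
```

The search code would then pass the resolved directory's path to whatever opens the index. Failing early here makes a missing or misconfigured path visible at startup instead of at the first query.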
 
  -Original Message-
  From: Santosh [mailto:[EMAIL PROTECTED]
  Sent: Saturday, August 21, 2004 4:55 PM
  To: Lucene Users List
  Subject: Re: pdfboxhelp
 
 
  Yes I did the same.
  I copied all the classes into the classes folder, but
  now when I build the index using IndexHTML the PDFs are not added to
  this index; only text and HTML files are added.
  What changes should I make to IndexHTML.java to build the index with PDFs?
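
One way to approach this is to branch on the file extension before a document is built, so that PDFs take a different path from HTML and text files. The sketch below shows only that branching step. FileKind is a hypothetical name, the list of extensions is an assumption, and nothing here is taken from the actual IndexHTML source.

```java
import java.util.Locale;

// Hypothetical helper: decides how a file should be turned into an index document.
final class FileKind {

    enum Kind { PDF, HTML, TEXT, SKIP }

    /** Classifies a file by its extension, ignoring case. */
    static Kind classify(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".pdf")) {
            // This is where LucenePDFDocument.getDocument(file) from Don's message would be called.
            return Kind.PDF;
        }
        if (name.endsWith(".html") || name.endsWith(".htm")) {
            return Kind.HTML;
        }
        if (name.endsWith(".txt")) {
            return Kind.TEXT;
        }
        return Kind.SKIP;
    }
}
```

The indexing loop would call classify on each file name and add the resulting document to the index writer only when the kind is not SKIP.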
  - Original Message -
  From: Karthik N S [EMAIL PROTECTED]
  To: Lucene Users List [EMAIL PROTECTED]
  Sent: Saturday, August 21, 2004 4:54 PM
  Subject: RE: pdfboxhelp
 
 
   Hi
  
   1) If you are using the jar file with a web interface for JSP/servlet
   development, place the jar file in webapps/<your application>/WEB-INF/lib
   and also correct the classpath for the present modification.

   2) Create your own package and copy all your Java files (compiled classes)
   to WEB-INF/classes/<your package>.


   Then use the same..;{
  
  
   Karthik
  
   -Original Message-
   From: Santosh [mailto:[EMAIL PROTECTED]
   Sent: Saturday, August 21, 2004 4:31 PM
   To: Lucene Users List
   Subject: Re: pdfboxhelp
  
  
   thanks  Natarajan and karthik,
  
   I corrected classpath
  
   but where I should write your code?
   should I write your code in IndexHTML.java  which comes along with
 lucene
  or
   some other place?
   one more thing
   I kept pdfbox jar file in the classpath is this enough or I have to
 build
   the pdfbox?
  
   thankyou
   - Original Message -
   From: Natarajan.T [EMAIL PROTECTED]
   To: 'Lucene Users List' [EMAIL PROTECTED]
   Sent: Saturday, August 21, 2004 3:20 PM
   Subject: RE: pdfboxhelp
  
  
Hi Santhosh,
   
Try out this below code.(pdfbox.jar file must be in your
 classpath)
   
public String getContent(InputStream reader) throws IOException {
    PDFParser parser = null;
    PDDocument pdDoc = null;
    PDFTextStripper stripper = null;
    String pdftext = "";
    try {
        parser = new PDFParser(reader);
        parser.parse();
        pdDoc = parser.getPDDocument();
        // Decrypt the document first if it is encrypted.
        if (pdDoc.isEncrypted()) {
            DecryptDocument decryptor = new DecryptDocument(pdDoc);
            decryptor.decryptDocument();
        }
        stripper = new PDFTextStripper();
        pdftext = stripper.getText(pdDoc);

        // "info" is not declared in this snippet; it is assumed to be a field defined elsewhere.
        info = pdDoc.getDocumentInformation();
    } catch (Exception err) {
        System.out.println(err.getMessage());
    } finally {
        // pdDoc is still null if parsing failed, so guard the close.
        if (pdDoc != null) {
            pdDoc.close();
        }
    }
    return pdftext;
}
   
Natarajan.
   
-Original Message-
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: Saturday

Fw: pdfboxhelp

2004-08-22 Thread Santosh
hi karthik,
did u find any solution? should I send the pdf to u?
- Original Message -
From: Santosh [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, August 23, 2004 10:23 AM
Subject: Re: pdfboxhelp


 hi karthik,
  I kept log4j in the classpath , I am sending classpath variable

 CLASSPATH


.;..;C:\j2sdk1.4.1\lib;C:\j2sdk1.4.1\lib\jndi.jar;C:\j2sdk1.4.1\lib\webclien

t.jar;C:\j2sdk1.4.1\lib\mail.jar;C:\j2sdk1.4.1\lib\activation.jar;C:\j2sdk1.

4.1\lib\xml-apis.jar;D:\JAVAPRO;C:\j2sdk1.4.1\jre\lib\ext\msbase.jar;C:\j2sd
 k1.4.1\lib\servlet.jar;E:\Program Files\Apache Tomcat
 4.0\common\lib\servlet.jar;C:\Program

Files\Altova\xmlspy\XMLSpyInterface.jar;C:\j2sdk1.4.1\lib\sax.jar;C:\j2sdk1.

4.1\lib\dom.jar;C:\j2sdk1.4.1\lib\xalan.jar;C:\j2sdk1.4.1\lib\xercesImpl.jar

;C:\j2sdk1.4.1\lib\xmlParserAPIs.jar;C:\j2sdk1.4.1\lib\parser.jar;C:\j2sdk1.

4.1\lib\jaxp.jar;C:\j2sdk1.4.1\lib\xml.jar;C:\j2sdk1.4.1\lib\classes12.zip;C

:\struts.jar;F:\apache-ant-1.6.1\lib\ant.jar;C:\j2sdk1.4.1\lib\PDFBox-0.6.6.

jar;C:\j2sdk1.4.1\lib\lucene-20030909.jar;D:\setups\searchEngine\PDFBox-0.6.
 6\external\log4j.jar

 please check the error



 - Original Message -
 From: Karthik N S [EMAIL PROTECTED]
 To: Lucene Users List [EMAIL PROTECTED]
 Sent: Monday, August 23, 2004 10:26 AM
 Subject: RE: pdfboxhelp


  Hi Santosh
 
I think u'r Pdf is using  Log4j package ,Try toe set the classpath for
  log4j.jar path.
 
   [ Is it a just a WARNING  or an ERROR  u are getting.
 
Send me in u'r Configuration management Let me help u with it ; [
 
 
  Karthik
 
  -Original Message-
  From: Santosh [mailto:[EMAIL PROTECTED]
  Sent: Monday, August 23, 2004 10:11 AM
  To: Lucene Users List
  Cc: Ben Litchfield
  Subject: Re: pdfboxhelp
 
 
  hi karthik,
 
  I have downloaded pdfbox and kept pdfjar file in the classpath, but when
I
  am typing following command in the command prompt I am getting the
error:
 
  D:\setups\searchEngine\PDFBox-0.6.6\srcjava org.pdfbox.ExtractText
  C:\test.pdf
  C:\test.txt
  log4j:WARN No appenders could be found for logger
  (org.pdfbox.pdfparser.PDFParse
  r).
  log4j:WARN Please initialize the log4j system properly
 
  why I am getting this error? plz help
 
 
  - Original Message -
  From: Karthik N S [EMAIL PROTECTED]
  To: Lucene Users List [EMAIL PROTECTED]
  Sent: Monday, August 23, 2004 9:21 AM
  Subject: RE: pdfboxhelp
 
 
   Hi
  
  
   To Begin with try to build Indexes offline  [ out of Tomcat
 container]
   and  on completing indxexes, feed u'r search  with the realpath of the
  offline indexed folder,Start the Tomcat and then use the
   search on As u experiment it out u will be comfortable
 withrequirment
  of Indexing /Search..   ; [
  
   Karthik
  
   -Original Message-
   From: Santosh [mailto:[EMAIL PROTECTED]
   Sent: Saturday, August 21, 2004 4:55 PM
   To: Lucene Users List
   Subject: Re: pdfboxhelp
  
  
   Yes I did the same.
   I copied all the classes into classes folder but
   now when I am building the index using IndexHTML the pdfs are not
added
 to
   this index, only text and htmls are added to index.
   what changes should I do for IndexHTML.java to build index with pdf
   - Original Message -
   From: Karthik N S [EMAIL PROTECTED]
   To: Lucene Users List [EMAIL PROTECTED]
   Sent: Saturday, August 21, 2004 4:54 PM
   Subject: RE: pdfboxhelp
  
  
Hi
   
If u are using the jar file with Web Interface for jsp/servlet dev,
  Place
the jar file in  webapps/u'rapplication/Web-inf/lib
and also correct the Classpath for the present modification.
   
2)create u'r own package and put all u'r java files  copy the java
 files
   to
/Web-inf/Classes/u'r package
   
   
Then use the same..;{
   
   
Karthik
   
-Original Message-
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 4:31 PM
To: Lucene Users List
Subject: Re: pdfboxhelp
   
   
thanks  Natarajan and karthik,
   
I corrected classpath
   
but where I should write your code?
should I write your code in IndexHTML.java  which comes along with
  lucene
   or
some other place?
one more thing
I kept pdfbox jar file in the classpath is this enough or I have to
  build
the pdfbox?
   
thankyou
- Original Message -
From: Natarajan.T [EMAIL PROTECTED]
To: 'Lucene Users List' [EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 3:20 PM
Subject: RE: pdfboxhelp
   
   
 Hi Santhosh,

 Try out this below code.(pdfbox.jar file must be in your
  classpath)

 public String getContent(InputStream  reader) throws
   IOException{PDFParser
parser = null;PDDocument pdDoc = null;PDFTextStripper stripper =
   null;String
pdftext = ;try{parser = new PDFParser(reader);parser.parse();pdDoc
=
parser.getPDDocument();if(pdDoc.isEncrypted()){DecryptDocument
 decryptor
  =
new

Re: pdfboxhelp

2004-08-21 Thread Santosh
thanks  Natarajan and karthik,

I corrected classpath

but where should I write your code?
Should I write it in IndexHTML.java, which comes along with Lucene, or
some other place?
One more thing:
I kept the pdfbox jar file in the classpath. Is this enough, or do I have to build
pdfbox?

thankyou
- Original Message -
From: Natarajan.T [EMAIL PROTECTED]
To: 'Lucene Users List' [EMAIL PROTECTED]
Sent: Saturday, August 21, 2004 3:20 PM
Subject: RE: pdfboxhelp


 Hi Santhosh,

 Try out this below code.(pdfbox.jar file must be in your classpath)

 public String getContent(InputStream reader) throws IOException {
     PDFParser parser = null;
     PDDocument pdDoc = null;
     PDFTextStripper stripper = null;
     String pdftext = "";
     try {
         parser = new PDFParser(reader);
         parser.parse();
         pdDoc = parser.getPDDocument();
         // Decrypt the document first if it is encrypted.
         if (pdDoc.isEncrypted()) {
             DecryptDocument decryptor = new DecryptDocument(pdDoc);
             decryptor.decryptDocument();
         }
         stripper = new PDFTextStripper();
         pdftext = stripper.getText(pdDoc);

         // "info" is not declared in this snippet; it is assumed to be a field defined elsewhere.
         info = pdDoc.getDocumentInformation();
     } catch (Exception err) {
         System.out.println(err.getMessage());
     } finally {
         // pdDoc is still null if parsing failed, so guard the close.
         if (pdDoc != null) {
             pdDoc.close();
         }
     }
     return pdftext;
 }

 Natarajan.

 -Original Message-
 From: Santosh [mailto:[EMAIL PROTECTED]
 Sent: Saturday, August 21, 2004 3:14 PM
 To: Lucene Users List
 Subject: Re: pdfboxhelp

 Hi Don,

 your Idea is nice, but whenever I write the  following code in
 IndexHTML.java of lucene


 import org.pdfbox.searchengine.lucene.*;

 File pdfFile = new File("/path/to/the/file.pdf");

 // Below returns a parse PDF file in a Lucene Document object.
 Document doc = LucenePDFDocument.getDocument(pdfFile);

 Iam getting the following error

 package org.pdfbox.searchengine.lucene does not exist

 I have downloaded pdfbox source code and kept the jar file in the
 classpath, please help me on this
 - Original Message -
 From: Don Vaillancourt
 To: Lucene Users List
 Sent: Friday, August 20, 2004 7:37 PM
 Subject: Re: pdfboxhelp


   Here is the super simple code required.

   import org.pdfbox.searchengine.lucene.*;

   File pdfFile = new File("/path/to/the/file.pdf");

   // Below returns a parsed PDF file in a Lucene Document object.
   Document doc = LucenePDFDocument.getDocument(pdfFile);

   Santosh wrote:

 exactly, the same is required to me
 - Original Message -
 From: Don Vaillancourt
 To: Lucene Users List
 Sent: Friday, August 20, 2004 6:39 PM
 Subject: Re: pdfboxhelp


   What are your intensions with PDFBox?

   You want to use it to index PDF files?

   Santosh wrote:

 hi,

 I have downloaded pdfbox zip. but i am in ambigous state that where to
 start. how can I check with demo, I dont see any help document with this
 download, please help me.


 regards
 Santosh kumar
 SoftPro Systems
 Hyderabad


 The harder you train in peace, the lesser you bleed in war

 





   --
   Don Vaillancourt
   Director of Software Development

   WEB IMPACT INC.
   phone: 416-815-2000 ext. 245
   fax: 416-815-2001
   email: [EMAIL PROTECTED]
   web: http://www.web-impact.com



   This email message is intended only for the addressee(s) and contains
information that may be confidential and/or copyright. If you are not the
intended recipient please notify the sender by reply email and immediately
delete this email. Use, disclosure or reproduction of this email by anyone
other than the intended recipient(s) is strictly prohibited. No
representation is made that this email or any attachments are free of
viruses. Virus scanning is recommended and is the responsibility of the
recipient.




pdf search

2004-08-20 Thread Santosh
Hi,

I am a newbie to Lucene.

I have downloaded the zip file. Now how can I give my own list of words to Lucene?
In the demo I saw that Lucene automatically creates an index when we run the Java
program, but I want to give my own search words. How is that possible?


regards
Santosh kumar
SoftPro Systems
Hyderabad


The harder you train in peace, the lesser you bleed in war






Fw: pdf search

2004-08-20 Thread Santosh
How can I search through PDF?
- Original Message - 
From: Santosh 
To: Lucene Users List 
Sent: Friday, August 20, 2004 5:59 PM
Subject: pdf search


Hi,

I am new bee to lucene.

I have downloaded zip file. now how can i give my own list words to lucene?
In the demo i saw that lucene is automatically creating index if we run the java 
program.but I want to give my own search words, how is it possible? 


regards
Santosh kumar
SoftPro Systems
Hyderabad


The harder you train in peace, the lesser you bleed in war






pdfboxhelp

2004-08-20 Thread Santosh
hi,

I have downloaded the pdfbox zip, but I am unsure where to start. How can
I check it with the demo? I don't see any help document with this download, please help me.


regards
Santosh kumar
SoftPro Systems
Hyderabad


The harder you train in peace, the lesser you bleed in war






Re: pdf search

2004-08-20 Thread Santosh
hi karthik,

I have a website with some items, each containing HTML and PDF documents. I
have to store keywords against each item. Whenever a user enters a search
word that matches any one of the existing keywords, it should
show the link to the particular item.
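
The requirement described here, a search word that matches a stored keyword and returns the link of the owning item, can be pictured as a simple keyword-to-links map. The sketch below is independent of Lucene and only illustrates the data structure. KeywordIndex, addItem and lookup are hypothetical names, and exact case-insensitive matching is an assumption about what "matches" should mean.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory map from keyword to the links of the items tagged with it.
final class KeywordIndex {

    private final Map<String, Set<String>> linksByKeyword = new HashMap<>();

    /** Registers an item link under each of its keywords. */
    void addItem(String itemLink, String... keywords) {
        for (String keyword : keywords) {
            linksByKeyword
                .computeIfAbsent(normalize(keyword), k -> new TreeSet<>())
                .add(itemLink);
        }
    }

    /** Returns the links of all items whose keyword list contains the search word. */
    Set<String> lookup(String searchWord) {
        Set<String> links = linksByKeyword.get(normalize(searchWord));
        return links == null
            ? Collections.<String>emptySet()
            : Collections.unmodifiableSet(links);
    }

    /** Keywords are compared case-insensitively and without surrounding whitespace. */
    private static String normalize(String word) {
        return word.trim().toLowerCase(Locale.ROOT);
    }
}
```

A search engine index would play the same role at larger scale, with each item as one document and its keywords stored in a searchable field.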


- Original Message -
From: Karthik N S [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Friday, August 20, 2004 6:56 PM
Subject: RE: pdf search


 hi

 What is it that you intend to search, and what are these own 'search words'?

  First explain your requirement properly to the forum to get the intended
 results.



 with regards
 Karthik

 -Original Message-
 From: Santosh [mailto:[EMAIL PROTECTED]
 Sent: Friday, August 20, 2004 5:59 PM
 To: Lucene Users List
 Subject: pdf search


 Hi,

 I am new bee to lucene.

 I have downloaded zip file. now how can i give my own list words to
lucene?
 In the demo i saw that lucene is automatically creating index if we run
the
 java program.but I want to give my own search words, how is it possible?


 regards
 Santosh kumar
 SoftPro Systems
 Hyderabad


 The harder you train in peace, the lesser you bleed in war


 



 -
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]






Re: pdfboxhelp

2004-08-20 Thread Santosh
exactly, the same is required to me
  - Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 6:39 PM
  Subject: Re: pdfboxhelp


  What are your intentions with PDFBox?

  You want to use it to index PDF files?

  Santosh wrote:

hi,

I have downloaded pdfbox zip. but i am in ambigous state that where to start. how can 
I check with demo, I dont see any help document with this download, please help me.


regards
Santosh kumar
SoftPro Systems
Hyderabad


The harder you train in peace, the lesser you bleed in war



  



  -- 
  Don Vaillancourt
  Director of Software Development

  WEB IMPACT INC.
  phone: 416-815-2000 ext. 245
  fax: 416-815-2001
  email: [EMAIL PROTECTED]
  web: http://www.web-impact.com


















Re: pdfboxhelp

2004-08-20 Thread Santosh

  - Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 7:37 PM
  Subject: Re: pdfboxhelp


  Here is the super simple code required.

  import org.pdfbox.searchengine.lucene.*;

  File pdfFile = new File("/path/to/the/file.pdf");

  // Below returns a parse PDF file in a Lucene Document object.
  Document doc = LucenePDFDocument.getDocument(pdfFile);

  
  Santosh wrote:

exactly, the same is required to me
  - Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 6:39 PM
  Subject: Re: pdfboxhelp


  What are your intensions with PDFBox?

  You want to use it to index PDF files?

  Santosh wrote:

hi,

I have downloaded pdfbox zip. but i am in ambigous state that where to start. how can 
I check with demo, I dont see any help document with this download, please help me.


regards
Santosh kumar
SoftPro Systems
Hyderabad


The harder you train in peace, the lesser you bleed in war



  



  -- 
  Don Vaillancourt
  Director of Software Development

  WEB IMPACT INC.
  phone: 416-815-2000 ext. 245
  fax: 416-815-2001
  email: [EMAIL PROTECTED]
  web: http://www.web-impact.com














Re: pdfboxhelp

2004-08-20 Thread Santosh
I am sorry, the mail was sent accidentally
  - Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 8:02 PM
  Subject: Re: pdfboxhelp


  Did I leave you speechless!?  :-)

  Santosh wrote:

  - Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 7:37 PM
  Subject: Re: pdfboxhelp


  Here is the super simple code required.

  import org.pdfbox.searchengine.lucene.*;

  File pdfFile = new File("/path/to/the/file.pdf");

  // Below returns a parse PDF file in a Lucene Document object.
  Document doc = LucenePDFDocument.getDocument(pdfFile);

  
  Santosh wrote:

exactly, the same is required to me
  - Original Message - 
  From: Don Vaillancourt 
  To: Lucene Users List 
  Sent: Friday, August 20, 2004 6:39 PM
  Subject: Re: pdfboxhelp


  What are your intensions with PDFBox?

  You want to use it to index PDF files?

  Santosh wrote:

hi,

I have downloaded pdfbox zip. but i am in ambigous state that where to start. how can 
I check with demo, I dont see any help document with this download, please help me.


regards
Santosh kumar
SoftPro Systems
Hyderabad


The harder you train in peace, the lesser you bleed in war



  



  -- 
  Don Vaillancourt
  Director of Software Development

  WEB IMPACT INC.
  phone: 416-815-2000 ext. 245
  fax: 416-815-2001
  email: [EMAIL PROTECTED]
  web: http://www.web-impact.com












--


  -
  To unsubscribe, e-mail: [EMAIL PROTECTED]
  For additional commands, e-mail: [EMAIL PROTECTED]


searchhelp

2004-08-19 Thread Santosh
Hi,

I am using the Lucene search engine for my application.

I am able to search through text files and HTML files as specified by Lucene.

Can you please clarify my doubts?

1. Can Lucene search through PDFs and Word documents? If yes, then how?

2. Can Lucene search through a database? If yes, then how?

Thank you,

Santosh







Re: searchhelp

2004-08-19 Thread Santosh
I have recently joined the list and have not gone through any previous mails. If
you have any mails or related code, please forward them to me.
- Original Message -
From: Chandan Tamrakar [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Thursday, August 19, 2004 3:47 PM
Subject: Re: searchhelp


 For PDF, you need to extract the text from the PDF files using the PDFBox
 library, and for Word documents you can use the Apache POI APIs. There are
 messages posted on the Lucene list related to your queries. About databases,
 I guess someone must have done it. :)
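
The advice above amounts to one pattern: convert each source to plain text first, then hand that text to the index. The following is only a minimal sketch of that pattern using the Java standard library. The TextExtractor interface, the ToyIndex class and the sample texts are all hypothetical and stand in for PDFBox, POI, JDBC and Lucene itself, none of whose real APIs are shown here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ExtractThenIndex {

    /** Hypothetical hook: turns one source (PDF, Word file, database row) into plain text. */
    interface TextExtractor {
        String extract(String source);
    }

    /** Toy inverted index standing in for Lucene: term -> ids of documents containing it. */
    static class ToyIndex {
        private final Map<String, Set<String>> postings = new HashMap<>();

        /** Splits the text into lowercase alphanumeric terms and records the document id for each. */
        void add(String docId, String text) {
            for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
                }
            }
        }

        /** Returns the ids of documents containing the term, in sorted order. */
        List<String> search(String term) {
            Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
            return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
        }
    }

    /** Builds a small index from three pretend sources; the extractors are stubs. */
    static ToyIndex buildDemoIndex() {
        // Assumption: in a real application these lambdas would call PDFBox, POI or JDBC.
        TextExtractor pdf = source -> "Quarterly report on food exports";
        TextExtractor word = source -> "Meeting notes about the food menu";
        TextExtractor dbRow = source -> "Customer order shipped yesterday";

        ToyIndex index = new ToyIndex();
        index.add("report.pdf", pdf.extract("report.pdf"));
        index.add("notes.doc", word.extract("notes.doc"));
        index.add("orders/42", dbRow.extract("orders/42"));
        return index;
    }

    public static void main(String[] args) {
        ToyIndex index = buildDemoIndex();
        System.out.println(index.search("food"));   // documents from both file types
        System.out.println(index.search("order"));  // the database row
    }
}
```

The point of the sketch is that the index never sees a PDF, a Word file or a table, only strings. The same holds for a database: each row would be read, turned into text fields, and added as one document.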




Re: searchhelp

2004-08-19 Thread Santosh
Thanks everybody,

but I didn't get any code or any real help from these links.
Has anybody performed this kind of search before? If yes, then please send me
the code, or tell me what code I have to add to my present Lucene code.
- Original Message -
From: David Townsend [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Thursday, August 19, 2004 4:17 PM
Subject: RE: searchhelp


JGURU FAQ
http://www.jguru.com/faq/Lucene

OFFICIAL FAQ
http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi

MAIL ARCHIVE
http://www.mail-archive.com/[EMAIL PROTECTED]/

hope this helps.

