Hello Chetan,

The code that comes with the Lucene book contains a little framework
for indexing rich-text documents.  It sounds like you may be able to
use it as-is and extend it with a parser for Excel files, which we
didn't include in the code (should we include it in the next edition?).
While PDFBox comes with that handy Lucene-specific class that you are
using, it may be better for you to be in control of exactly how you
construct your Lucene documents, so that you decide which fields get
indexed.
cf. http://www.lucenebook.com/search?query=framework
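
To illustrate the general shape of such a framework, here is a minimal
sketch in plain Java with no Lucene, PDFBox, or POI dependency.  Every
name in it (SimpleField, DocumentHandler, ExtensionDispatcher,
searchableFields) is a hypothetical stand-in, not the book's code or
Lucene's API, and it assumes the text has already been extracted from the
file by whatever parser you use.  The point is only that when you build
the fields yourself, you choose which ones are indexed, "summary"
included.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: none of these types are Lucene, PDFBox, or POI classes.
final class RichTextIndexingSketch {

    /** Stand-in for an index field; "indexed" decides whether it is searchable. */
    static final class SimpleField {
        final String name;
        final String value;
        final boolean indexed;

        SimpleField(String name, String value, boolean indexed) {
            this.name = name;
            this.value = value;
            this.indexed = indexed;
        }
    }

    /** One handler per file type turns already-extracted text into fields. */
    interface DocumentHandler {
        List<SimpleField> toFields(String fileName, String extractedText);
    }

    /** Picks a handler by file extension (case-insensitive). */
    static final class ExtensionDispatcher {
        private final Map<String, DocumentHandler> handlers = new HashMap<>();

        void register(String extension, DocumentHandler handler) {
            handlers.put(extension.toLowerCase(Locale.ROOT), handler);
        }

        List<SimpleField> handle(String fileName, String extractedText) {
            int dot = fileName.lastIndexOf('.');
            String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
            DocumentHandler handler = handlers.get(ext);
            if (handler == null) {
                throw new IllegalArgumentException("No handler registered for " + fileName);
            }
            return handler.toFields(fileName, extractedText);
        }
    }

    /** Builds fields so that both the body and a short summary are searchable. */
    static List<SimpleField> searchableFields(String fileName, String text) {
        List<SimpleField> fields = new ArrayList<>();
        fields.add(new SimpleField("path", fileName, false)); // kept for display only
        fields.add(new SimpleField("contents", text, true));
        String summary = text.length() <= 40 ? text : text.substring(0, 40);
        fields.add(new SimpleField("summary", summary, true)); // indexed on purpose
        return fields;
    }

    /** Registers the same hypothetical handler for .pdf and .xls files. */
    static ExtensionDispatcher defaultDispatcher() {
        ExtensionDispatcher dispatcher = new ExtensionDispatcher();
        dispatcher.register("pdf", RichTextIndexingSketch::searchableFields);
        dispatcher.register("xls", RichTextIndexingSketch::searchableFields);
        return dispatcher;
    }

    public static void main(String[] args) {
        List<SimpleField> fields =
                defaultDispatcher().handle("report.PDF", "Quarterly sales figures");
        for (SimpleField f : fields) {
            System.out.println(f.name + " indexed=" + f.indexed + " value=" + f.value);
        }
    }
}
```

In a real setup each handler would call the actual parser for its file
type and create real Lucene fields in place of SimpleField; adding Excel
support would then mean registering one more handler.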

Otis

--- chetan minajagi <[EMAIL PROTECTED]> wrote:

> Hi Karthik/Cocula,
> 
> Luke didn't work, but Limo helped. I seem to get results when I use
> Limo for my text/xls files.
> Now the problem is with PDF search.
> What I see through Limo is that the "summary" field is not indexed,
> and hence there are no hits.
> I'm using the default document returned by
>  LucenePDFDocument.getDocument(myPdfFile);
> So how do I ensure that the fields in this document that are not
> indexed are set to indexed?
> As far as I can see, I can only probe whether a field is indexed
> by using Field.isIndexed(), but is there a method by which I can set
> it to indexed?
> Can someone provide any help or pointers in this regard?
>  
> Thanks & Regards,
> Chetan
> 
> Karthik N S <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> You probably need to use the Luke software to peek inside your index.
> Use it, then come back for more help.
> 
> 
> Karthik
> 
> 
> -----Original Message-----
> From: chetan minajagi [mailto:[EMAIL PROTECTED]
> Sent: Thursday, January 20, 2005 12:05 PM
> To: lucene-user@jakarta.apache.org
> Subject: help in indexing
> 
> 
> Hi,
> 
> This might seem elementary to most of you.
> I am trying to build a search tool for internal use using Lucene.
> I have used the following:
> .pdf  --> PDFBox
> .html --> Lucene's demo class (HTMLDocument)
> .xls  --> POI
> 
> The indexing seems to work without throwing any errors.
> But when I try to search, I always end up with zero hits.
> I have tried searching for the same string that I see when I print
> the document (System.out.print(Document)), but in vain.
> Can somebody let me know where and what could be wrong?
> Regards,
> Chetan
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 
>               


