Hello Chetan,

The code that comes with the Lucene book contains a small framework for indexing rich-text documents. It sounds like you may be able to use it as is and extend it with a parser for Excel files, which we didn't include in the code (should we include it in the next edition?). While PDFBox comes with that handy Lucene-specific class you are using, it may be better for you to be in control of exactly how you construct your Lucene documents. See http://www.lucenebook.com/search?query=framework
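To make the idea of "a small framework plus your own parser" concrete, here is a rough, stdlib-only sketch of the general pattern: one handler per file extension, each returning named text fields that you would then turn into Lucene fields yourself. This is not the book's code. The names DocumentHandler, HandlerRegistry, PlainTextHandler, and SpreadsheetStandInHandler are hypothetical, and the spreadsheet handler only treats its input as tab-separated text; a real one would presumably read the workbook with POI.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class RichTextIndexingSketch {

    /** Hypothetical contract: turn raw file bytes into named text fields. */
    interface DocumentHandler {
        Map<String, String> extractFields(byte[] content);
    }

    /** Puts the whole file into a single "contents" field. */
    static class PlainTextHandler implements DocumentHandler {
        public Map<String, String> extractFields(byte[] content) {
            Map<String, String> fields = new LinkedHashMap<>();
            fields.put("contents", new String(content, StandardCharsets.UTF_8));
            return fields;
        }
    }

    /**
     * Stand-in for an Excel handler (assumption: input is tab-separated text).
     * It joins all cell values with spaces so they can be searched as one field.
     */
    static class SpreadsheetStandInHandler implements DocumentHandler {
        public Map<String, String> extractFields(byte[] content) {
            String raw = new String(content, StandardCharsets.UTF_8);
            String[] cells = raw.split("[\\t\\r\\n]+");
            Map<String, String> fields = new LinkedHashMap<>();
            fields.put("contents", String.join(" ", cells).trim());
            return fields;
        }
    }

    /** Picks a handler by file extension, case-insensitively. */
    static class HandlerRegistry {
        private final Map<String, DocumentHandler> handlers = new LinkedHashMap<>();

        void register(String extension, DocumentHandler handler) {
            handlers.put(extension.toLowerCase(Locale.ROOT), handler);
        }

        Map<String, String> handle(String fileName, byte[] content) {
            int dot = fileName.lastIndexOf('.');
            if (dot < 0) {
                throw new IllegalArgumentException("No extension: " + fileName);
            }
            String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
            DocumentHandler handler = handlers.get(ext);
            if (handler == null) {
                throw new IllegalArgumentException("No handler registered for ." + ext);
            }
            // Copy the handler's fields and record which file they came from.
            Map<String, String> fields = new LinkedHashMap<>(handler.extractFields(content));
            fields.put("filename", fileName);
            return fields;
        }
    }

    public static void main(String[] args) {
        HandlerRegistry registry = new HandlerRegistry();
        registry.register("txt", new PlainTextHandler());
        registry.register("xls", new SpreadsheetStandInHandler());

        byte[] sheet = "name\tcity\nChetan\tBangalore".getBytes(StandardCharsets.UTF_8);
        Map<String, String> fields = registry.handle("people.xls", sheet);
        // Each entry is a candidate Lucene field; how it is indexed is up to you.
        fields.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Since your own code decides which fields exist, this is also the place where you would choose, field by field, whether each one gets indexed, stored, or both when you build the Lucene document.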
Otis

--- chetan minajagi <[EMAIL PROTECTED]> wrote:
> Hi Karthik/Cocula,
>
> Luke didn't work, but Limo helped. I seem to get results when I use
> Limo for my text/xls files.
> Now the problem is with PDF search.
> The problem that I see is that the "summary" field, as seen through Limo, is
> not indexed, and hence there are no hits.
> I'm using the default document returned by
> LucenePDFDocument.getDocument(myPdfFile);
> So how do I ensure that the few fields in it which are not
> indexed are set to indexed?
> As far as I can see, I can only probe whether a field is indexed or
> not by using
> Field.isIndexed(), but is there a method by which I can set it to
> indexed?
> Can someone provide any help or pointers in this regard?
>
> Thanks & Regards,
> Chetan
>
> Karthik N S <[EMAIL PROTECTED]> wrote:
> Hi
>
> Probably you need to use the Luke software to peek inside your index. Use it,
> then
> come back for more help.
>
> Karthik
>
> -----Original Message-----
> From: chetan minajagi [mailto:[EMAIL PROTECTED]
> Sent: Thursday, January 20, 2005 12:05 PM
> To: lucene-user@jakarta.apache.org
> Subject: help in indexing
>
> Hi,
>
> It might seem elementary to most of you.
> I am trying to build a search tool for internal use using Lucene.
> I have used the following:
> .pdf --> PDFBox
> .html --> demo file of Lucene (HTMLDocument)
> .xls --> POI
>
> The indexing seems to work without throwing any errors.
> But when I try to search, I always end up with zero hits.
> I have tried to use the same string that I see
> (System.out.print(Document)),
> but in vain.
> Can somebody let me know where and what could be wrong?
> Regards,
> Chetan