Hi Erik,

Thanks for the pointers, I have modified the Indexer.java to index the
files from the directory by removing the file extenstion check of
(".txt"). Now I do get the index from the files.

New situation is that when I run the FileSearch

java org.apache.lucene.demo.SearchFiles
Query: tty
Searching for: tty
3 total matching documents
0. No path nor URL for this document
1. No path nor URL for this document
2. No path nor URL for this document

I do not get the actual path from the index and using Luke I get the
three hits. Last two are from the index and not the real documents.

Any idea what is happeneing and how can I fix it.

Thanks.
-H

Erik Hatcher wrote:
> On Jan 10, 2005, at 7:06 PM, Hetan Shah wrote:
> 
>>Got the latest Ant and got the demo to work. I am however not sure 
>>which part in the whole source code is the indexing for different file 
>>types is done, say for example .html .txt and such?
> 
> 
> Your best bet is to dig around in the codebase.  The Indexer.java code 
> is hard-coded to only do .txt file extensions - this was on purpose as 
> the first example in the book, figuring someone using this code on the 
> their C:\ drive would be relatively safe and fast to run.
> 
> Their is also an example easily run from the Ant launcher to show how 
> various document types can be handled using an extensible framework.  
> Run "ant ExtensionFileHandler".  It doesn't actually index the document 
> it creates, but displays it to the console.  It would be pretty trivial 
> to pair the Indexer.java code up with the file handler framework to 
> crawl a directory tree and index any content it recognizes.
> 
> 
>>Appreciate your help. If you have any sample code would certainly 
>>appreciate that also.
> 
> 
> You got all the code already.  It should be fairly straightforward to 
> navigate the src tree, especially with the Table of Contents handy:
> 
>       http://www.lucenebook.com/toc
> 
> (incidentally, this dynamic TOC page is blending the blog content with 
> the TOC using an IndexReader to find all blog entries that refer to 
> each section - and you'll see the two, minor and cosmetic, errata 
> listed there already).
> 
>       Erik
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to