Re: MS-Word docs.

2005-11-27 Thread Steven Bell
I did run down the issue, and it's a case of tired coder. I wasn't creating a new document object in the method I was using to handle Word documents. Thanks very much for the links, guys. I appreciate it!

steve.

Chris Hostetter wrote:
: I dump the doc files into a text file with the same var
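For anyone reading the archive later: one possible reading of the fix described above is that the handler method has to build a fresh document object for every file instead of reusing one. The sketch below only illustrates that pattern. `Doc` is a hypothetical stand-in for an index document (it is not the Lucene class), and `handleWordText` and `indexAll` are made-up names, not code from this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NewDocPerFile {
    // Hypothetical stand-in for an index document: a bag of named fields.
    static class Doc {
        final Map<String, String> fields = new LinkedHashMap<>();
        void add(String name, String value) { fields.put(name, value); }
    }

    // Hypothetical handler: creates a NEW Doc on every call, so each
    // Word file's extracted text ends up in its own document.
    static Doc handleWordText(String path, String textStr) {
        Doc doc = new Doc();
        doc.add("path", path);
        doc.add("content", textStr);
        return doc;
    }

    // Builds one document per (path, extracted text) pair.
    static List<Doc> indexAll(Map<String, String> extracted) {
        List<Doc> index = new ArrayList<>();
        for (Map.Entry<String, String> e : extracted.entrySet()) {
            index.add(handleWordText(e.getKey(), e.getValue()));
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> extracted = new LinkedHashMap<>();
        extracted.put("a.doc", "first document text");
        extracted.put("b.doc", "second document text");
        List<Doc> index = indexAll(extracted);
        System.out.println(index.size() + " documents built");
    }
}
```

In a real Lucene setup the same idea would apply to whatever object is handed to the index writer: if it is created once and reused, or never created in the Word branch at all, the text can look fine in a dump file and still never reach the index.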

Re: MS-Word docs.

2005-11-27 Thread Chris Hostetter
: I dump the doc files into a text file with the same variable I use in
: the Lucene doc.add(Field.UnStored("content", textStr)); and they look
: fine in the file. However searches return nothing.

If I'm reading that sentence correctly, then you are saying that you've tried isolating your MS-Wor

Re: MS-Word docs.

2005-11-27 Thread Otis Gospodnetic
Hello Steven,

There is a small ready-to-use framework in Lucene in Action that you can use to index MS Word, PDF, RTF, XML, and plain-text docs - http://lucenebook.com/ . I suggest you stick with the POI libraries, as it looks like the Textmining code is no longer maintained.

Otis

--- Steven Bell <[EMAI