Lucene-created files

2002-03-06 Thread Melissa Mifsud
Hi, does anyone know the significance of the files that are generated by Lucene? I know they are essentially the term index; however, I need to have a full understanding of them. Also, they look encrypted... can anyone confirm this? Melissa

phrase query and slop factor

2002-03-06 Thread Norbert Pabi
What must the slop factor be to allow any combination of the words in a phrase? -- Norbert Pabi

Re: phrase query and slop factor

2002-03-06 Thread Otis Gospodnetic
Wouldn't that depend on how far from each other you wanted to allow them to be? If you have a document with 100 words indexed and you are searching for "first second", wouldn't you have to set the slop to about 100, just in case the word 'first' is the very first word in the document, and 'second'
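
The arithmetic behind the "about 100" estimate above can be sketched as follows. This is not Lucene code and not from the thread. It is a simplified model that assumes the slop needed for a match is the spread of the term positions after each is shifted back by its index in the phrase, and it ignores repeated terms. The function name `required_slop` is hypothetical.

```python
def required_slop(positions):
    """Estimate the slop needed for a phrase to match.

    positions[i] is the token position in the document of the i-th
    phrase term. Assumed model (not taken from Lucene's source):
    shift each position back by its index in the phrase, then take
    the spread of the shifted values.
    """
    shifted = [pos - i for i, pos in enumerate(positions)]
    return max(shifted) - min(shifted)


# Phrase "first second" in a 100-token document (positions 0..99).
print(required_slop([0, 1]))    # adjacent, in order: 0
print(required_slop([1, 0]))    # adjacent, reversed: 2
print(required_slop([0, 99]))   # 'first' at the start, 'second' at the end: 98
print(required_slop([99, 0]))   # 'second' at the start, 'first' at the end: 100
```

Under this model, a two-term phrase needs a slop no larger than the document length in tokens, which agrees with the estimate in the message above.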

RE: Lucene-created files

2002-03-06 Thread Mark Tucker
This document is by no means complete, but it might get you started. My approach was to execute some code and look at the files using a hex editor. Then I began looking for the file extensions in the code and trying to decipher them. I have not had time to get very far and some of my initial

Virtual Index

2002-03-06 Thread Paul Dlug
We have a relatively large (300,000+ documents) set of XML files to index. The files themselves are articles broken up by journal and decade so that users can restrict their search to specific journals and year ranges. Under our old search engine this was done by creating a separate index for

Re: Virtual Index

2002-03-06 Thread Otis Gospodnetic
If you prefer the old way (multiple indices), you can do that with Lucene, too. Look at the MultiSearcher class. Lucene also supports range queries, which may be helpful. I haven't used them, but they sound like the thing to look at. Otis --- Paul Dlug [EMAIL PROTECTED] wrote: We have a relatively

RE: phrase query and slop factor

2002-03-06 Thread Oshima, Scott
Just don't make it a phrase query. Remember, a phrase is a set string. You're talking about combinations of non-set strings. -----Original Message----- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]] Sent: Wednesday, March 06, 2002 8:40 AM To: Lucene Users List Subject: Re: phrase query and

Re: how to parse XHTML

2002-03-06 Thread Peter Carlson
Terry, check out the contributions section of the Lucene site. It has a few XML document parsers. --Peter On 3/5/02 9:08 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote: Terry, These are really not Lucene questions. Lucene will let you index text, but you need to figure out how to parse

1.02 download on jakarta.apache.org?

2002-03-06 Thread Shannon Booher
Maybe I'm just blind, but Lucene v1.02 does not appear to be available through jakarta.apache.org. There is no listing for Lucene under Release Builds, only Milestone and Nightly... thanks, sjb

Re: Support for russian morphology in Lucene

2002-03-06 Thread Philipp Chudinov
It's me :) I have no idea about morphology, but a great wish to use Lucene with Russian. Nice to see you here. Maybe we should try to do things together. -----Original Message----- From: Vadim Solonovich [EMAIL PROTECTED] To: Lucene Developers List [EMAIL PROTECTED] Cc: Lucene Users List [EMAIL

FNFE while indexing

2002-03-06 Thread Kelvin Tan
Encountering an odd FNFE during indexing... 2002-03-07 14:38:56,160 [Thread-2] ERROR com.marketingbright.core.tasks.SearchIndexingTask - C:\index\_l.fnm (The system cannot find the file specified) java.io.FileNotFoundException: C:\market\catalina\webapps\marketingbright\index\_l.fnm (The system