Re: Keyword fields which don't contribute to a document's score?

2002-12-06 Thread Doug Cutting
In the pre-release version available in the nightly builds you can boost document fields at index time. Check out the CHANGES.txt file for details. Doug Ashley Collins wrote: Is it possible to stop keyword fields contributing to a document's score? Leaving only text fields? Is the best way t

Re: Incremental indexing

2002-12-06 Thread Doug Cutting
Eric Jain wrote: Currently, I use the following procedure to update an index incrementally: 1. Build document 2. Open index reader 3. Delete any previous version of the document using a key field 4. Close index reader 5. Open index writer 6. Add document to index 7. Cl

RE: Lucene Speed under diff JVMs

2002-12-06 Thread Jonathan Reichhold
It doesn't surprise me that the IBM JDK is faster at indexing. In my experience, this JVM is better optimized for that case. I did some serious load testing with various JVM implementations from Sun and IBM and found the opposite when it came to searching, i.e. Lucene searches were fastest unde

RE: Lucene Speed under diff JVMs

2002-12-06 Thread Armbrust, Daniel C.
Class that was used (attached). And a correction: the UnStored field had 1000 words, not 500. -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]] Sent: Friday, December 06, 2002 10:57 AM To: Lucene Users List Subject: RE: Lucene Speed under diff JVMs Otis doesn't mind. -

RE: Lucene Speed under diff JVMs

2002-12-06 Thread Otis Gospodnetic
Otis doesn't mind. --- "Armbrust, Daniel C." <[EMAIL PROTECTED]> wrote: > One more bit of info that I should have included: > > The randomly generated documents consisted of 2 fields, one Text with > 3 words, and one UnStored with 500 words. Average word length was 7 > characters. > > If Otis (

RE: Lucene Speed under diff JVMs

2002-12-06 Thread Armbrust, Daniel C.
One more bit of info that I should have included: The randomly generated documents consisted of 2 fields, one Text with 3 words, and one UnStored with 500 words. Average word length was 7 characters. If Otis (he wrote it, I just made a tweak or two) doesn't mind, I'll post the source code. Da

Keyword fields which don't contribute to a document's score?

2002-12-06 Thread Ashley Collins
Is it possible to stop keyword fields contributing to a document's score? Leaving only text fields? Is the best way to boost the terms I know are keyword fields by small numbers? e.g. sender:"[EMAIL PROTECTED]"^0.001 Thanks. Ashley
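
For readers skimming the thread, the effect of a tiny query-time boost such as ^0.001 can be illustrated with a toy calculation. This is only a sketch under a simplifying assumption: each query clause contributes its raw score multiplied by its boost, and Lucene's normalization and coordination factors are ignored. The class name BoostSketch, the method combinedScore, and the raw scores are hypothetical and are not part of Lucene.

```java
public class BoostSketch {

    // Toy model (not Lucene's real formula): sum each clause's raw score
    // scaled by its query-time boost.
    static double combinedScore(double[] rawScores, double[] boosts) {
        if (rawScores.length != boosts.length) {
            throw new IllegalArgumentException("one boost per clause is required");
        }
        double total = 0.0;
        for (int i = 0; i < rawScores.length; i++) {
            total += rawScores[i] * boosts[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical raw scores: a text clause and a keyword clause.
        double[] rawScores = {2.0, 3.0};
        // The keyword clause is boosted by 0.001, as in the query above.
        double[] boosts = {1.0, 0.001};
        System.out.println(combinedScore(rawScores, boosts));
    }
}
```

In this model the keyword clause still has to match, but it adds only about a thousandth of its raw score, so the ranking is driven almost entirely by the text clause.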

RE: Lucene Speed under diff JVMs

2002-12-06 Thread Armbrust, Daniel C.
To clarify (which means adding the info I should have put in the first time but missed), the run was of 40,000 documents. The number was an average. Each run was done twice (and the results were identical). And the machine was a dual-processor machine, so most OS tasks ran on the idle proce

Re: Indexing email messages?

2002-12-06 Thread Ashley Collins
Thanks for the link. Looking in SZIndex.java it seems that all fields are added in a for loop. Do other people think that this is the best way to proceed? Can anyone help with my other questions? Cheers. Ashley From: petite_abeille <[EMAIL PROTECTED]> Reply-To: "Lucene Users List" <[EMAIL

Re: Indexing email messages?

2002-12-06 Thread petite_abeille
On Friday, Dec 6, 2002, at 11:12 Europe/Zurich, Ashley Collins wrote: I'm using Lucene to index MIME messages and have a couple of questions. You should take a look at ZOE as it does all that and more. It's open source and uses Lucene to index every single bit of email. http://guests.evecto

Indexing email messages?

2002-12-06 Thread Ashley Collins
I'm using Lucene to index MIME messages and have a couple of questions. 1) What is the best way to handle keyword fields which are repeated? Like "recipient" for example. At the moment I have a for loop doing document.add(Field.Keyword("recipient", address)); But this seems to limit qu