SV: Changing the Scoring api

2006-09-12 Thread Marcus Falck
Hi Hoss, No, there wasn't anything wrong with your suggestions, except that they had landed in my junk mail for some reason, stupid Outlook. I haven't had a chance to test all of your suggestions yet, but I had already implemented my own Similarity class that has the coord fixed to 1. And it

SV: Changing the Scoring api

2006-09-12 Thread Marcus Falck
However, BooleanQuery's disableCoord does seem to take effect. But I still have the problem when I'm constructing queries with wildcards. / Marcus -Original message- From: Marcus Falck [mailto:[EMAIL PROTECTED] Sent: 12 September 2006 09:34 To:

Re: Highligher Example

2006-09-12 Thread Tom Emerson
Autonomy's KeyView is an alternative to Stellent. It does not cover all of the file formats that Stellent does, though many of them are probably not interesting for most applications. When I last looked at it, it did not handle mail archives, though there was a plan to add that. I found it more

Re: getCurrentVersion question

2006-09-12 Thread Tom Emerson
As far as I know there isn't a way to do this. What we do is add a metadata document to each index that includes the creation date, the user name of the creating user, and various other tidbits. This gets updated on incremental updates to the index as well. Easily done and makes it easy to query.

Storing fields without term positions

2006-09-12 Thread Timo Nentwig
Hi everybody, is it possible to store fields without term position data (the .prx file)? We store custom data in the field and use it as a kind of filter for queries, so we don't need any term position data, and it bloats the index size by nearly a factor of 3. Thanks Timo

Re: SV: Changing the Scoring api

2006-09-12 Thread Chris Hostetter
: However the BooleanQuery's disableCoord seems to make effect. : But I still have the problem when I'm constructing queries with wildcards. really? ... that's strange, WildcardQuery uses the disableCoord feature of BooleanQuery. Do you have an example of what you mean? : already had

group field selection of the form field:(a b c)

2006-09-12 Thread Pramodh Shenoy
Hi Eric/Usergroup, I am working on a help content index-search project based on Lucene. One of my requirements is to search for a particular text in the content of files from specific directories. When I index the content, e.g. guides/accountmanagement/index.htm and

Re: getCurrentVersion question

2006-09-12 Thread Mag Gam
Tom: great! Now how do you add metadata? I am new to the Lucene API + Java, but willing to learn. Got an example? TIA On 9/12/06, Tom Emerson [EMAIL PROTECTED] wrote: As far as I know there isn't a way to do this. What we do is add a metadata document to each index that includes the creation

Re: Using Hibernate to store Lucene Indexes in a Database

2006-09-12 Thread Beady Geraghty
I don't know if the use of a DATALINK data type would be relevant in your case. Here are some references. http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/start/c0005450.htm

Re: group field selection of the form field:(a b c)

2006-09-12 Thread Erick Erickson
Interestingly, you have extra spaces when you construct your queries, e.g. queries[2]= accountmanagement has an extra space at the beginning, but when you index the document, there are no spaces. I believe that, since you're indexing the fields UN_TOKENIZED, the spaces are preserved in the
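
To illustrate the point about UN_TOKENIZED fields: an untokenized value is indexed as a single term, exactly as given, so a query term with a leading space will not match a value indexed without one. Below is a minimal plain-Java model of that behaviour. A set of strings stands in for the field's terms; this is not the Lucene API, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Plain-Java model of exact-term matching on an untokenized field.
// Not Lucene code: the Set stands in for the terms of one field.
final class UntokenizedTermSketch {

    // "Index" each value verbatim, as one term, with no analysis or trimming.
    static Set<String> indexUntokenized(String... values) {
        Set<String> terms = new HashSet<String>();
        for (String v : values) {
            terms.add(v);
        }
        return terms;
    }

    // A term query matches only if the query term equals an indexed term exactly.
    static boolean matches(Set<String> terms, String queryTerm) {
        return terms.contains(queryTerm);
    }

    public static void main(String[] args) {
        Set<String> terms = indexUntokenized("accountmanagement", "databasemanagement");
        String padded = " accountmanagement"; // note the leading space
        System.out.println(matches(terms, padded));
        System.out.println(matches(terms, padded.trim()));
    }
}
```

Trimming the query term (or the indexed value, whichever carries the padding) before use is enough to make the two sides agree.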

Re: getCurrentVersion question

2006-09-12 Thread Erick Erickson
Just add another document (I do something similar). The key is to remember that documents in the same index do NOT have to have the same fields. So, say for your regular documents, you have fields (f1, f2, f3, f4). For your meta-data document, you index fields (md1, md2, md3...). The value for
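
A rough plain-Java model of the idea, for readers new to it: each document is just a bag of named fields, and nothing forces two documents in one index to share field names. Maps stand in for documents and a list for the index; this is not the Lucene API. The marker field "doctype" and its value "metadata" are hypothetical names chosen for the sketch, as are the sample field values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model: one "index" holding regular documents and one
// metadata document with a completely different set of fields.
final class MetadataDocSketch {

    // Hypothetical marker used to find the metadata document again.
    static final String MARKER_FIELD = "doctype";
    static final String MARKER_VALUE = "metadata";

    // Build a document from alternating field-name / value pairs.
    static Map<String, String> doc(String... nameValuePairs) {
        Map<String, String> d = new HashMap<String, String>();
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            d.put(nameValuePairs[i], nameValuePairs[i + 1]);
        }
        return d;
    }

    // Return the first document carrying the marker, or null if none exists.
    static Map<String, String> findMetadata(List<Map<String, String>> index) {
        for (Map<String, String> d : index) {
            if (MARKER_VALUE.equals(d.get(MARKER_FIELD))) {
                return d;
            }
        }
        return null;
    }

    // A tiny sample index: two regular documents plus one metadata document.
    static List<Map<String, String>> sampleIndex() {
        List<Map<String, String>> index = new ArrayList<Map<String, String>>();
        index.add(doc("f1", "alpha", "f2", "beta"));
        index.add(doc("f1", "gamma", "f2", "delta"));
        index.add(doc(MARKER_FIELD, MARKER_VALUE, "md1", "2006-09-12", "md2", "someuser"));
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> meta = findMetadata(sampleIndex());
        System.out.println(meta.get("md1"));
    }
}
```

In a real index the lookup would be a query on the marker field rather than a scan, and an incremental update would replace the metadata document with a fresh one.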

RE: group field selection of the form field:(a b c)

2006-09-12 Thread Pramodh Shenoy
The spaces just crept in, I guess, when I copied the code to Outlook :-), actually there aren't any. Let me take a look at Luke, especially testing to see what should be returned when I run the parsed query.. sounds very interesting.. Thanks a lot Pramodh From:

Re: UTF8 accents umlauts filter?

2006-09-12 Thread Yonik Seeley
Thanks for the links Michael... this one does look interesting: http://dev.alt.textdrive.com/browser/lu/LUStringBasicLatin.txt The challenge would be to make it fast... perhaps a custom hash table, or look into the cost of a perfect hash function. Just to clear up some unicode/terminology
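
One possible way to keep such a filter fast, sketched under assumptions: for the Latin-1 Supplement range (U+00C0 to U+00FF) a direct array lookup avoids hashing altogether. Note that this swaps in a plain lookup table rather than the hash table or perfect hash mentioned above, and it covers only that one range. The replacement choices (for example ß to "ss" and Þ to "TH") are illustrative assumptions and are not taken from the linked LUStringBasicLatin.txt; the class name is hypothetical.

```java
// Sketch: fold accented Latin-1 Supplement characters to ASCII with a
// direct array lookup. Characters outside U+00C0..U+00FF pass through.
final class Latin1Folder {

    private static final char FIRST = '\u00C0';

    // One entry per code point from U+00C0 to U+00FF, in order.
    // null means "leave unchanged" (the multiplication and division signs).
    private static final String[] TABLE = {
        "A", "A", "A", "A", "A", "A", "AE", "C", "E", "E", "E", "E", "I", "I", "I", "I",
        "D", "N", "O", "O", "O", "O", "O", null, "O", "U", "U", "U", "U", "Y", "TH", "ss",
        "a", "a", "a", "a", "a", "a", "ae", "c", "e", "e", "e", "e", "i", "i", "i", "i",
        "d", "n", "o", "o", "o", "o", "o", null, "o", "u", "u", "u", "u", "y", "th", "y"
    };

    static String fold(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            int idx = c - FIRST;
            if (idx >= 0 && idx < TABLE.length && TABLE[idx] != null) {
                out.append(TABLE[idx]); // may be more than one char, e.g. "ss"
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fold("Stra\u00DFe"));
        System.out.println(fold("caf\u00E9"));
    }
}
```

A table like this costs one subtraction and one bounds check per character. Covering accented characters beyond this range would need either a larger table or the kind of hash-based mapping discussed in the thread.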

RE: group field selection of the form field:(a b c)

2006-09-12 Thread Doron Cohen
I think option B cannot work because, due to the MUST operator, it requires both databasemanagement and accountmanagement to be in the subtype field. Option A, however, should work once the padding blank spaces are removed from the field name - notice that while the standard analyzer would trim

Re: UTF8 accents umlauts filter?

2006-09-12 Thread Ken Krugler
Thanks for the links Michael... this one does look interesting: http://dev.alt.textdrive.com/browser/lu/LUStringBasicLatin.txt The challenge would be to make it fast... perhaps a custom hash table, or look into the cost of a perfect hash function. Just to clear up some unicode/terminology

Re: group field selection of the form field:(a b c)

2006-09-12 Thread Erick Erickson
As long as the field is added to the *same* document, I don't see a problem with option B, although I'll admit that I haven't used MultiFieldQueryParser. But there was a discussion a while ago about adding tokens with the same field name to a document via document.add being exactly the same as