RE: Help for text based indexing

2004-09-15 Thread Cocula Remi
No. In group:Group1 AND Hello, the group: prefix means that the word Group1 is searched for in the group field. -Original Message- From: mahaveer jain [mailto:[EMAIL PROTECTED] Sent: Tuesday, 14 September 2004 18:24 To: Lucene Users List Subject: RE: Help for text based indexing If i

RE: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-15 Thread Aad Nales
David, Perhaps I misunderstand something, so please correct me if I do. I used http://www.searchmorph.com/kat/spell.jsp to look for conts without changing any of the default values. What I got as results did not include 'const' which has quite a high frequency in your index and should have a

Concurent operations with Lucene

2004-09-15 Thread Daniel CHAN
Hi, I'm currently developing a search engine for a few websites and would like to use Lucene to do so. After reading some docs, a post on jGuru states that some concurrent operations are forbidden with Lucene (http://www.jguru.com/faq/view.jsp?EID=913302). However, the post dated from 2 years

RE: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-15 Thread Aad Nales
By trying: if you type const you will find that it returns 216 hits. The third sports 'const' as a term (space separated and all). I would expect 'conts' to return with const as well. But again I might be mistaken. I am now trying to figure out what the problem might be: 1. my expectations (most

Re: Concurent operations with Lucene

2004-09-15 Thread Otis Gospodnetic
Hello Only 1 process can modify (add/delete) an index at a time. Have you seen Nutch (http://nutch.org/)? Otis --- Daniel CHAN [EMAIL PROTECTED] wrote: Hi, I'm currently developing a search engine for a few websites and would like to use Lucene to do so. After reading some docs, a post

Re: Concurent operations with Lucene

2004-09-15 Thread Terry Steichen
Otis, What's the relationship between Nutch and Lucene? Terry - Original Message - From: Otis Gospodnetic To: Lucene Users List Sent: Wednesday, September 15, 2004 7:29 AM Subject: Re: Concurent operations with Lucene Hello Only 1 process can modify (add/delete) an

boosting fields in MultiFieldQueryParser with different factors

2004-09-15 Thread Fiebig, Swen (init)
Hello group, is there a way to boost the different fields of a MultiFieldQueryParser with different factors? Or at least in the resulting Query? Greetings, Swen Fiebig

Re: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-15 Thread David Spencer
Aad Nales wrote: By trying: if you type const you will find that it returns 216 hits. The third sports 'const' as a term (space separated and all). I would expect 'conts' to return with const as well. But again I might be mistaken. I am now trying to figure out what the problem might be: 1. my

Lucene docs

2004-09-15 Thread Ian McDonnell
What is the best resource for beginners looking to understand Lucene's functionality, i.e. its use of fields, documents, the index reader and writer, etc.? Is there any web resource that goes into details on the exact workings of it? Ian

Re: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-15 Thread David Spencer
Andrzej Bialecki wrote: Aad Nales wrote: David, Perhaps I misunderstand something, so please correct me if I do. I used http://www.searchmorph.com/kat/spell.jsp to look for conts without changing any of the default values. What I got as results did not include 'const' which has quite a high

Re: Lucene docs

2004-09-15 Thread Honey George
Try these, http://jakarta.apache.org/lucene/docs/gettingstarted.html http://www.darksleep.com/lucene/ Thanks, George --- Ian McDonnell [EMAIL PROTECTED] wrote: What is the best resource for beginners looking to understand Lucene's functionality, i.e. its use of fields, documents, the index

problem with SortField[] in search method (newbie)

2004-09-15 Thread Wermus Fernando
Luceners, My search covers all of my entities. My entities are accounts, contacts, tasks, etc. The search looks at a group of each entity's fields. This works fine even though I haven't indexed any entity in a document. But if I sort by some fields from different entities, I get the following

Re: Lucene docs

2004-09-15 Thread Steven Rowe
URL:http://wiki.apache.org/jakarta-lucene/IntroductionToLucene Ian McDonnell wrote: What is the best resource for beginners looking to understand Lucene's functionality, i.e. its use of fields, documents, the index reader and writer, etc.? Is there any web resource that goes into details on the exact

RE: problem with SortField[] in search method (newbie)

2004-09-15 Thread Aviran
You can only sort on indexed fields (more than that, it will work properly only on untokenized fields, i.e. keyword fields). Aviran -Original Message- From: Wermus Fernando [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 15, 2004 13:13 PM To: [EMAIL PROTECTED] Subject: problem with

RE: problem with SortField[] in search method (newbie)

2004-09-15 Thread Wermus Fernando
Aviran, I can search in non-indexed fields without any exception, but I can't order by the same fields. Besides, I can't know in advance whether they are indexed in my app, because I only index fields that have a value; if a field has no value, I don't add it to the document. What

Re: problem with SortField[] in search method (newbie)

2004-09-15 Thread Praveen Peddi
Does that mean you indexed all non-null fields? I think you should change your code so that you always index the fields you want to sort on. In any case, it looks like some of your documents have a shortName that is not null and not indexed. If you do not have any non-indexed shortNames in the index, I don't

Re: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-15 Thread Andrzej Bialecki
David Spencer wrote: To restate the question for a second. The misspelled word is: conts. The suggestion expected is const, which seems reasonable enough as it's just a transposition away, thus the string distance is low. But - I guess the problem w/ the algorithm is that for short words like
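
To make the "just a transposition away" point concrete, here is a minimal Java sketch that compares two common string distances for conts and const. It is not the NGramSpeller code and makes no claim about which distance that tool uses; the class and method names (EditDistanceSketch, levenshtein, withTranspositions) are invented for illustration. Plain Levenshtein distance counts the swap as two substitutions, while the variant that allows adjacent transpositions (optimal string alignment) counts it as one edit.

```java
public class EditDistanceSketch {

    /** Plain Levenshtein distance: insertions, deletions and substitutions only. */
    public static int levenshtein(String a, String b) {
        return distance(a, b, false);
    }

    /** Optimal string alignment distance: Levenshtein plus adjacent transpositions. */
    public static int withTranspositions(String a, String b) {
        return distance(a, b, true);
    }

    // Standard dynamic-programming table; d[i][j] is the distance between
    // the first i characters of a and the first j characters of b.
    private static int distance(String a, String b, boolean allowTransposition) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;

        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1), // deletion, insertion
                        d[i - 1][j - 1] + cost);                    // match or substitution
                // Optional extra case: the last two characters are swapped.
                if (allowTransposition && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("conts", "const"));        // prints 2
        System.out.println(withTranspositions("conts", "const")); // prints 1
    }
}
```

On a five-letter word, a distance of 2 is already 40% of the length, which is one way short words can end up scoring poorly when transpositions are not treated as a single edit.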

Re: boosting fields in MultiFieldQueryParser with different factors

2004-09-15 Thread Daniel Naber
On Wednesday 15 September 2004 18:06, Fiebig, Swen (init) wrote: is there a way to boost the different fields of a MultiFieldQueryParser with different factors? Or at least in the resulting Query? The easiest way is probably to subclass MultiFieldQueryParser and implement a method that

QueryParser.parse() and Lucene1.4.1

2004-09-15 Thread Polina Litvak
I have a question regarding QueryParser and lucene-1.4.1.jar: When using lucene-1.3-final.jar, a query of the form: Field:(A AND -(B)) was parsed into +Field:A -Field:B (using QueryParser.parse()). After making the switch to lucene-1.4.1.jar, the same query is being parsed into Field:A Field:-

Re: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-15 Thread David Spencer
Aad Nales wrote: By trying: if you type const you will find that it returns 216 hits. The third sports 'const' as a term (space separated and all). I would expect 'conts' to return with const as well. But again I might be mistaken. I am now trying to figure out what the problem might be: 1. my

Re: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-15 Thread David Spencer
Andrzej Bialecki wrote: David Spencer wrote: To restate the question for a second. The misspelled word is: conts. The suggestion expected is const, which seems reasonable enough as it's just a transposition away, thus the string distance is low. But - I guess the problem w/ the algorithm is

Re: frequent terms - Re: combining open office spellchecker with Lucene

2004-09-15 Thread David Spencer
Doug Cutting wrote: David Spencer wrote: [1] The user enters a query like: recursize descent parser [2] The search code parses this and sees that the 1st word is not a term in the index, but the next 2 are. So it ignores the last 2 terms (descent and parser) and suggests alternatives to

Re: QueryParser.parse() and Lucene1.4.1

2004-09-15 Thread Daniel Naber
On Wednesday 15 September 2004 21:58, Polina Litvak wrote: Does anyone know how to work around this new feature? I can't remember any changes in this area, but I just tried with the current version from CVS and the output is the one which you want. Regards Daniel

Re: Similarity score computation documentation

2004-09-15 Thread Ken McCracken
Hi Doug, Thanks for the reply. idf is a function of the individual terms in the query. I think that the grouping in the Similarity formula would be off with it at the end. It currently looks like ( SUM_t in q ( tf * idf * getBoost * lengthNorm ) ) * coord * queryNorm whereas I think what
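
For readability, the grouping that the message above describes as the current one can be typeset as follows. This is only a transcription of the quoted expression; the left-hand side notation score(q, d) is an assumption added for illustration, and the alternative grouping the author goes on to propose is cut off in the snippet, so it is not reproduced here.

```latex
\mathrm{score}(q, d) =
  \Bigl( \sum_{t \in q} \mathit{tf} \cdot \mathit{idf} \cdot \mathit{getBoost} \cdot \mathit{lengthNorm} \Bigr)
  \cdot \mathit{coord} \cdot \mathit{queryNorm}
```

Written this way, idf sits inside the sum because it depends on the individual term t, whereas coord and queryNorm are applied once to the whole sum.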