hey Uwe, so from your last answer, I understand I'm done.. no need to do anything, I can already compare the queries.
However there is actually a misunderstanding: my booleanqueries have variable number of boolean clauses because the fields are fixed but the terms per field are not. So, for example, I have: BooleanQuery1: field1:term, SHOULD field1:term, SHOULD field2:term, SHOULD field2:term, SHOULD field2:term, SHOULD field3:term, SHOULD BooleanQuery2: field1:term, SHOULD field2:term, SHOULD field3:term, SHOULD field3:term, SHOULD field3:term, SHOULD field3:term, SHOULD field3:term, SHOULD Is any of the points we discussed so far not anymore valid ? thanks On 29 March 2011 10:48, Uwe Schindler <u...@thetaphi.de> wrote: > > thanks for your reply. I thought I've solved the issue according to Uwe, > the > > queries without coord function were reasonably comparable, but now you > > actually reopened it. > > > > So, I need to be sure I'm making them comparable and I would like to ask > the > > following. > > > > My BooleanQueries have similar structure. Important: they only contain > > TermQueries. The fields are always 3 but the terms number can vary... > this > is > > an example of BooleanQuery (sorry for the syntax): > > > > field1:term1, SHOULD > > field1:term2, SHOULD > > field2:term1, SHOULD > > field2:term2, SHOULD > > field2:term3, SHOULD > > field3:term1, SHOULD > > ... > > > > If it is not clear how the BooleanQueries are, I can print some of them > for > > you. They have same number of fields but different number of terms. > > > > 1- Do you still think QueryNorm is not an issue ? Funny, because in the > > documentation I can read: > > QueryNorm(q) is a normalizing factor used to make scores between queries > > comparable. This factor does not affect document ranking (since all > ranked > > documents are multiplied by the same factor), but rather just attempts to > > make scores from different queries (or even different indexes) > comparable. > > > > It seems I can compare queries from the documentation. > > But as you are always using the same type of query (TermQuery), the > QueryNorm should not change, so no issue at all. It differs if you have a > variable number of Boolean clauses, the Query norm could help you to make > the queries comparable. But if you only have always the same looking BQ > with > exact same number of TQ in it (only different terms) its not an issue at > all. In all other cases, the query norm helps to compare e.g. a BQ with 5 > TQ > clauses with another BQ that has 8 TQ clauses. > > > 2- I don't think I'm using queryBoosts, are they enabled by default in > the > > BooleanQuery ? > > Query boost are only active if you do TermQuery.setBoost(anything != 1.0f). > > > 3- FieldNorm is not mentioned in Similarity class. How can I disable it ? > > SHould I disable it ? Is it a issue ? > > FieldNorm should not be a problem, as it's an indexed feature. So the same > document has always the same FieldNorm (which is a combination of length > norm, indexing document boost). If two queries hit the same document the > scores for this document should be comparable, as the FieldNorm is the same > for both cases. > > See point 6) in the Similarity docs: norm(t,d) > > > 4- If I'm not wrong Uwe told me I can compute comparable cosine > similarities > > even with documents of different length. Tf and Idf are unbounded, and my > > docs have different length. Can't I measure the similarity between query > and > > doc vectors anyway ? > > The field norm normalizes that. So where is the problem? > > > 5 - Again, I've been told I can compare queries and from documentation, I > > can see that queryNorm factor normalizes all queries. But you are saying > I > > should manually normalize them somehow ? It is not clear > > It only affects different querys (e.g. number of Boolean clauses differ, > type of queries differ). > > Uwe > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >