Re: Creating parser query "by hand"

2007-05-31 Thread Bhavin Pandya
Hi Vaasu, You can convert a string query into a Lucene query using the QueryParser class. Please check the API. Thanks. Bhavin Pandya - Original Message - From: "Vaasu" <[EMAIL PROTECTED]> To: Sent: Friday, June 01, 2007 11:41 AM Subject: Re: Creating parser query "by hand" Hi, I am workin

Re: Creating parser query "by hand"

2007-05-31 Thread Vaasu
Hi, I am working on search using Lucene 2.0.0. My requirement is to construct a Lucene query in the form of a string (+contents:java programming +author:xxx), which is then used to construct a Lucene query so that it can be run with a Lucene searcher. I want to know how to convert String quer

exception during optimize

2007-05-31 Thread Cedric Ho
Hi, When I tried to build an index last night, the following exception occurred during the call to IndexWriter.optimize(): java.lang.NullPointerException at org.apache.lucene.index.IndexFileDeleter.findDeletableFiles(IndexFileDeleter.java:88) at org.apache.lucene.index.IndexWriter.mer

Re: The values which compute scores.

2007-05-31 Thread Chris Hostetter
: What I'm trying to do is prevent Lucene from providing better ranking : for documents that use a term multiple times than those that have more : term hits. : : I've got some huge queries with quite a number of unique terms. I : want the documents that hit more unique terms to float to the top,

Re: boosting different parts of the same field

2007-05-31 Thread Grant Ingersoll
If you are up for _BLEEDING EDGE_ stuff, have a look at the Payload addition to the trunk. With this, you can set payloads on individual terms (i.e. the title tokens) and then use the BoostingTermQuery when searching to factor in the payload in the scoring. This is very much "buyer beware"

Re: The values which compute scores.

2007-05-31 Thread Walt Stoneburner
Grant writes: One question that comes to mind, is what are you looking to do? What I'm trying to do is prevent Lucene from providing better ranking for documents that use a term multiple times than those that have more term hits. I've got some huge queries with quite a number of unique terms.

Re: boosting different parts of the same field

2007-05-31 Thread Les Fletcher
I recently posted a similar question to this list. Currently I am just adding the boosted field to the default search field multiple times. Look for a thread on this list with a subject of "Document Boost." There were a few interesting ideas posted about how to go about this. Les Ken Krugle

Re: boosting different parts of the same field

2007-05-31 Thread Ken Krugler
This is an issue with a field (let's call it "fulltext") containing all other fields' values (to perform a "search in all" query). Still, while performing "search in all" I would like to boost some parts of this "fulltext" field. One way I do this in Solr (where you can easily specify this type of c

Re: boosting different parts of the same field

2007-05-31 Thread Mark Miller
I think you're looking at a limitation of using the "fulltext" approach. You are also losing the boosting you would get for matching on shorter fields. Sometimes you just have to give in and query across all fields or take the compromise. Lucene does not have a concept of "meta-fields" within fi

Re: boosting different parts of the same field

2007-05-31 Thread wojtek hury
This is an issue with a field (let's call it "fulltext") containing all other fields' values (to perform a "search in all" query). Still, while performing "search in all" I would like to boost some parts of this "fulltext" field. wojtek On 5/31/07, Donna L Gresh <[EMAIL PROTECTED]> wrote: >Is there

Re: boosting different parts of the same field

2007-05-31 Thread Donna L Gresh
>Is there a way of boosting only fragment of the field? Let's say that I have >a title and short description of something which I want to index into >"myfield" field - is there a way of boosting title as more important for >scoring than description? I thought that maybe something like below would

boosting different parts of the same field

2007-05-31 Thread wojtek hury
Is there a way of boosting only a fragment of a field? Let's say that I have a title and a short description of something, which I want to index into the "myfield" field - is there a way of boosting the title as more important for scoring than the description? I thought that maybe something like below would work

Re: restricting hits to a subset of "id"s

2007-05-31 Thread Donna L Gresh
Thanks Yonik, this is working well (the BitSet and Filter option). It's always helpful to have a pointer as to where to start-- >It's probably easier to use a Filter (which essentially does the same >thing at a lower level in the search API). >Use termDocs(Term) to look up the ids, add them to a

Re: Potential issue with DisjunctionMaxScorer

2007-05-31 Thread Yonik Seeley
On 5/31/07, balasubramanian sudaakeran <[EMAIL PROTECTED]> wrote: Hi, I found the following piece of logic in the DisjunctionMaxScorer.skipTo function, which may have a potential issue (marked in code as <<<>>>). public boolean skipTo(int target) throws IOException { if (firstTime) {

Re: Potential issue with DisjunctionMaxScorer

2007-05-31 Thread Yonik Seeley
If I haven't been recently looking at some of these scorers, it takes a while to wrap my head around them again. It would be really helpful if you could provide a unit test that shows the failure, and attach it to a JIRA issue. -Yonik On 5/31/07, balasubramanian sudaakeran <[EMAIL PROTECTED]>

Potential issue with DisjunctionMaxScorer

2007-05-31 Thread balasubramanian sudaakeran
Hi, I found the following piece of logic in the DisjunctionMaxScorer.skipTo function, which may have a potential issue (marked in code as <<<>>>). public boolean skipTo(int target) throws IOException { if (firstTime) { if (!more) return false; heapify(); first

Re: The values which compute scores.

2007-05-31 Thread Daniel Einspanjer
The score normalization is actually more important for purposes of review. It actually is possible that both D1 and D2 properly match to F1. Some customers have repeats of the same film (e.g. Spiderman 2 and Spiderman 2 in HD). When the system goes through and records the potential matches, our r

Re: The values which compute scores.

2007-05-31 Thread Doron Cohen
I have no particular experience with matching problems, so the following might be off target... Anyhow, if I understand correctly, the problem is that, currently, given a set of customer film descriptions {D1, D2, ... , Dn}, a set of n queries is created and each query can match at most one film in th