how to write my own query?

2010-06-04 Thread Li Li
hi all,
   I want to implement a query that takes term positions and the terms'
relative positions into consideration. It only supports multi-term
queries, like a boolean OR query.
   But I want to consider term positions and the terms' relative positions.
   e.g. there are two docs
  doc1 apache lucene is a open source project
  doc2 apache is a http server and lucene ...
  if the user searches for apache lucene, doc1 will win because apache and
lucene appear closer together than in doc2
  e.g.
  doc1 some other text apache lucene is a open source project
  doc2 apache lucene is a open source project some other text
  doc2 wins because apache lucene appears at the first position

  I think I can imitate the boolean query and just integrate position
information into the boolean OR query, but I am not familiar with Lucene's
implementation. Could anyone show me some directions? thank you.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: how to write my own query?

2010-06-04 Thread Erik Hatcher
This is perhaps best discussed on the java-user list instead.  Here are
some thoughts...


On Jun 4, 2010, at 2:36 AM, Li Li wrote:


hi all,
  I want to implement a query that takes term positions and the terms'
relative positions into consideration. It only supports multi-term
queries, like a boolean OR query.
  But I want to consider term positions and the terms' relative positions.
  e.g. there are two docs
 doc1 apache lucene is a open source project
 doc2 apache is a http server and lucene ...
 if the user searches for apache lucene, doc1 will win because apache and
lucene appear closer together than in doc2


A PhraseQuery will do that.  It's commonplace to OR in a (sloppy)
phrase query built from the user's query so that proximity boosts the
score.  No custom query is needed to accomplish this.
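
To make the proximity idea concrete, here is a minimal plain-Java sketch (it does not use the Lucene API). It finds the smallest gap between two query terms in a tokenized document and turns it into a weight of 1/(gap + 1), loosely modeled on the sloppy-phrase weighting in Lucene's DefaultSimilarity. The class name ProximitySketch and the helpers minGap and proximityWeight are hypothetical, and the in-order-only gap is a simplifying assumption.

```java
import java.util.Arrays;
import java.util.List;

public class ProximitySketch {

    /**
     * Smallest number of tokens lying between an occurrence of a and a
     * later occurrence of b, or -1 if b never follows a.
     * Simplification: only the order "a ... b" is considered.
     */
    static int minGap(List<String> tokens, String a, String b) {
        int best = -1;
        int lastA = -1;
        for (int i = 0; i < tokens.size(); i++) {
            String t = tokens.get(i);
            if (t.equals(a)) {
                lastA = i;                       // remember the latest occurrence of a
            } else if (t.equals(b) && lastA >= 0) {
                int gap = i - lastA - 1;         // tokens strictly between a and b
                if (best < 0 || gap < best) {
                    best = gap;
                }
            }
        }
        return best;
    }

    /** Closer terms get a larger weight; no match gets 0. */
    static float proximityWeight(int gap) {
        return gap < 0 ? 0f : 1.0f / (gap + 1);
    }

    public static void main(String[] args) {
        List<String> doc1 = Arrays.asList("apache lucene is a open source project".split(" "));
        List<String> doc2 = Arrays.asList("apache is a http server and lucene".split(" "));

        float w1 = proximityWeight(minGap(doc1, "apache", "lucene"));
        float w2 = proximityWeight(minGap(doc2, "apache", "lucene"));

        System.out.println("doc1 weight = " + w1);
        System.out.println("doc2 weight = " + w2);
    }
}
```

With the two example documents, doc1 (terms adjacent) receives a larger weight than doc2 (five tokens in between), which is the ranking asked for. In a real index the weight would be combined with the ordinary term scores rather than used alone.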



 e.g.
 doc1 some other text apache lucene is a open source project
 doc2 apache lucene is a open source project some other text
 doc2 wins because apache lucene appears at the first position


And here, SpanFirstQuery is your friend.  So OR'ing a PhraseQuery and  
a SpanFirstQuery (with nested SpanNearQuery, or whatever is  
appropriate) seems to accomplish your goals.


Give those a try and report back if things still aren't quite what  
you're after.


Erik





Re: how to write my own query?

2010-06-04 Thread Li Li
thank you. But I don't think SpanFirstQuery is what I need, because I
want to get all documents that contain any of the terms, but give a boost
to the ones where the term appears nearer the top. The same goes for the
terms' relative positions.
e.g.
 doc1 apache lucene is a open source project
 doc2 apache is a http server and many many other words ... lucene ...
if the user searches for apache lucene, I want both docs to be presented
to the user, but doc1 should get a higher score. I don't want to use a
phrase query because it's slow (compared to a boolean query), and setting
the slop to 1 seems strange.
e.g.
  doc1 some other text ... apache lucene is a open source project
  doc2 apache lucene is a open source project some other text

SpanFirstQuery is not what I need. If the user searches for apache, I want
to show both docs but give a higher score to doc2, because the matched
term's position is smaller than in doc1. If I use SpanFirstQuery, e.g.
  SpanFirstQuery sfq = new SpanFirstQuery(apache, 100);
I will fail to find docs that contain apache only at positions larger
than 100.
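
The difference between a hard position cutoff and the soft position boost described above can be sketched in plain Java (again, not the Lucene API). The class PositionBoostSketch, both methods, and the scoring formula 1 + 1/(1 + firstPosition) per matched term are hypothetical choices for illustration only. The cutoff check is a simplified model of how a first-N-positions restriction behaves for a single term.

```java
import java.util.Arrays;
import java.util.List;

public class PositionBoostSketch {

    /**
     * Hard cutoff: the term only counts if it occurs before position `end`.
     * Documents where the term appears later are dropped entirely.
     */
    static boolean matchesWithinFirst(List<String> docTokens, String term, int end) {
        int first = docTokens.indexOf(term);
        return first >= 0 && first < end;
    }

    /**
     * Soft boost with OR semantics: every document containing any query term
     * matches, and each matched term adds 1 plus a bonus that shrinks as the
     * term's first position grows. The formula is an assumption for this sketch.
     */
    static double score(List<String> docTokens, List<String> queryTerms) {
        double score = 0.0;
        for (String term : queryTerms) {
            int first = docTokens.indexOf(term);
            if (first >= 0) {
                score += 1.0 + 1.0 / (1 + first);
            }
        }
        return score;
    }

    public static void main(String[] args) {
        List<String> doc1 = Arrays.asList(
                "some other text apache lucene is a open source project".split(" "));
        List<String> doc2 = Arrays.asList(
                "apache lucene is a open source project some other text".split(" "));
        List<String> query = Arrays.asList("apache");

        // Hard cutoff at 3 positions: doc1 is lost, doc2 is kept.
        System.out.println("doc1 within first 3: " + matchesWithinFirst(doc1, "apache", 3));
        System.out.println("doc2 within first 3: " + matchesWithinFirst(doc2, "apache", 3));

        // Soft boost: both documents match, doc2 ranks higher.
        System.out.println("doc1 score = " + score(doc1, query));
        System.out.println("doc2 score = " + score(doc2, query));
    }
}
```

Here both documents keep a non-zero score for the query apache, and doc2 ranks above doc1 because the term occurs earlier, whereas the cutoff variant drops doc1 altogether.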




