D'oh, sorting. I absolutely love it when I overlook the obvious <G>.
[EMAIL PROTECTED]
On Fri, Nov 7, 2008 at 4:58 AM, Michael McCandless
[EMAIL PROTECTED] wrote:
Couldn't you just do a single Query that sorts first by category and second
by relevance?
Mike
Erick Erickson wrote:
It
Couldn't you just do a single Query that sorts first by category and
second by relevance?
Mike
Erick Erickson wrote:
It seems to me that the easiest thing would be to fire two queries and
then just concatenate the results
category:A AND body:fred
category:B AND body:fred
If you
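Mike's suggestion above (one query, sorted first by category and then by relevance) can be sketched in plain Java. This is only an illustration of the ordering, not Lucene code: the Hit class and its field names are hypothetical stand-ins, and ascending category order is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-ins for illustration; none of these are Lucene classes.
class CategoryThenScore {
    static final class Hit {
        final String id;
        final String category;
        final float score;

        Hit(String id, String category, float score) {
            this.id = id;
            this.category = category;
            this.score = score;
        }
    }

    // Primary key: category, ascending. Secondary key: score, highest first.
    static final Comparator<Hit> ORDER =
        Comparator.comparing((Hit h) -> h.category)
                  .thenComparing(Comparator.comparingDouble((Hit h) -> h.score).reversed());

    // Returns a sorted copy and leaves the input list untouched.
    static List<Hit> sort(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(ORDER);
        return sorted;
    }
}
```

With this ordering all category A hits come before all category B hits, and each category is ranked by score, which is the same result as firing the two queries and concatenating them.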
This actually brings up an interesting question, and something I have been
curious about.
In this case, does it make more sense to do Boosting by Category, or to do
sorting? From what I understand, Lucene sorting involves putting the
relevant fields into memory, and then executing a sort.
Is
Well, it's not like sorting hadn't occurred to me. Unfortunately, what
I recalled was that you could only sort results on one field (I do date
sorted searches all the time in my application). I should have gone
back and looked. My memory failed me as I can see that you can sort on
multiple
This is a good point.
Sorting populates the field cache (internal to Lucene) for that field,
meaning it loads all values for all docs and holds them in memory.
This makes the first query slow and consumes RAM in proportion to
how large your index is.
Whereas boosting should be able
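The behaviour described above (the first sort on a field loads one value per document, and later sorts reuse it) can be pictured with a toy cache. This is a sketch under assumptions, not Lucene's actual field cache: ToyFieldCache, its method names, and the reader callback are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Toy model only; Lucene's real field cache is internal and differs in detail.
class ToyFieldCache {
    private final Map<String, String[]> cache = new HashMap<>();
    private int loads = 0;

    // Returns one value per document for the field.
    // The first call for a field reads every document (slow, O(maxDoc) memory);
    // later calls return the same array without touching the reader.
    String[] get(String field, int maxDoc, IntFunction<String> reader) {
        String[] values = cache.get(field);
        if (values == null) {
            values = new String[maxDoc];
            for (int doc = 0; doc < maxDoc; doc++) {
                values[doc] = reader.apply(doc);
            }
            cache.put(field, values);
            loads++;
        }
        return values;
    }

    // Number of times a full load was performed, for inspection.
    int loads() {
        return loads;
    }
}
```

The array has one slot per document in the index, which is why the memory cost follows index size and not the size of any one result set.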
Hi Guys,
I currently have a bug: wrong term offset values for fields analyzed
with KeywordAnalyzer (and also for unanalyzed fields, where I assume
the code path may be the same).
The offset of a field seems to be incremented by the entry length of the
previously analyzed field.
I had a look into
If you sort first by score, keep in mind that the raw scores are very
precise and you could see many unique values in the result set. The
secondary sort field would only be used to break equal scores. We had to use
a custom comparator to 'smooth out' the scores to allow the second field to
take
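The idea of smoothing scores so that a secondary field can matter might look roughly like the following. The poster's actual comparator is not shown in the message, so this is only a guess at the technique: the bucketing by a fixed step, the date field as the secondary key, and all names here are assumptions.

```java
import java.util.Comparator;

// Hypothetical sketch; not the comparator the poster used, and not a Lucene API.
class SmoothedScoreOrder {
    static final class Hit {
        final String id;
        final float score;
        final long date;

        Hit(String id, float score, long date) {
            this.id = id;
            this.score = score;
            this.date = date;
        }
    }

    // Collapses nearby raw scores into one bucket,
    // e.g. with step 0.1f both 0.83 and 0.87 land in bucket 8.
    static long bucket(float score, float step) {
        return (long) Math.floor(score / step);
    }

    // Higher score bucket first; within a bucket, newer date first.
    static Comparator<Hit> order(float step) {
        return Comparator
            .comparingLong((Hit h) -> bucket(h.score, step)).reversed()
            .thenComparing(Comparator.comparingLong((Hit h) -> h.date).reversed());
    }
}
```

With raw scores, 0.87 would always beat 0.83 and the date would never be consulted. With a step of 0.1 the two hits tie on the first key and the date decides. The step size controls how much relevance precision is traded away.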
boost:(+petroleum +engineer +refinery)
(+contents:(+petroleum +engineer +refinery)
+((*:* -boost:petroleum)
(*:* -boost:engineer)
(*:* -boost:refinery)))
That's an interesting solution. Would this result in many more documents
being visited by the scorer, possibly impacting
Thanks for raising these!
For the 1st issue (KeywordTokenizer fails to set start/end offset on
its token), I think we add your two lines to fix it. I'll open an
issue for this.
The 2nd issue (if same field name has more than one NOT_ANALYZED
instance in a doc then the offsets are double
Hi,
I'm wondering if there is any easy technique to number the terms in an index
(By number I mean map a sequence of terms to a contiguous range of integers
and map terms to these numbers efficiently)
Looking at the Term class and the .tis/.tii index format it appears that the
terms are stored
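To make the question concrete, here is a minimal sketch of numbering terms outside of Lucene: keep the distinct terms in a sorted array, so a term's number is its position and the lookup is a binary search. TermNumbering and its methods are hypothetical names, and Java's natural String ordering is used here, which may not match the order of terms inside a Lucene index.

```java
import java.util.Arrays;

// Hypothetical helper; it does not read the .tis/.tii files or use any Lucene class.
class TermNumbering {
    private final String[] sorted;

    // Deduplicates and sorts the terms once, up front.
    TermNumbering(String[] terms) {
        this.sorted = Arrays.stream(terms).distinct().sorted().toArray(String[]::new);
    }

    // Term -> number in O(log n); returns -1 for an unknown term.
    int numberOf(String term) {
        int i = Arrays.binarySearch(sorted, term);
        return i >= 0 ? i : -1;
    }

    // Number -> term in O(1).
    String termOf(int number) {
        return sorted[number];
    }

    // Numbers run from 0 to size() - 1 with no gaps.
    int size() {
        return sorted.length;
    }
}
```

The numbers are only stable as long as the term set does not change. Adding a term shifts every number after it.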
http://www.gossamer-threads.com/lists/lucene/java-user/
Date: Fri, 7 Nov 2008 14:27:38 -0700
From: [EMAIL PROTECTED]
To: java-user@lucene.apache.org
Subject: searchable archives
Hey,
Is this list available somewhere that you can search the entire archives at
one time?
Thanks,
Chad
Hey,
Is this list available somewhere that you can search the entire archives at
one time?
Thanks,
Chad
I just need a little confirmation of my understanding here.
If I say that a field is to be stored, the entire thing is written to the
index. It might also be indexed in a tokenized fashion if I also specify
that.
What are the advantages to storing a field then?
So you can search for that field?
I'm upgrading from a very old version of Lucene to 2.4. I tried to research
all the possible changes; this included reading the change file from the 2.4
version, which appears to reach back through all of the versions. However,
I'm finding major API changes that aren't documented in that file.
Or nabble or markmail
- Mark
On Nov 7, 2008, at 3:33 PM, Dragon Fly [EMAIL PROTECTED]
wrote:
http://www.gossamer-threads.com/lists/lucene/java-user/
Date: Fri, 7 Nov 2008 14:27:38 -0700
From: [EMAIL PROTECTED]
To: java-user@lucene.apache.org
Subject: searchable archives
Hey,
Is this
On Fri, Nov 7, 2008 at 4:36 PM, ChadDavis [EMAIL PROTECTED] wrote:
I just need a little confirmation of my understanding here.
If I say that a field is to be stored, the entire thing is written to the
index. It might also be indexed in a tokenized fashion if I also specify
that.
Right.
Hello,
I'm writing my first JSP application, so this may be too much of a
newbie question, in which case I hope you can refer me to documentation
which can help me out.
How do I keep only one IndexSearcher open for all the searches on my
website?
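One common way to approach the question above is to open the searcher once, lazily, and hand the same instance to every request. The sketch below is generic Java, not Lucene or servlet API: SharedSearcher is a hypothetical holder, the type parameter T stands in for IndexSearcher, and the opener callback is assumed to do the actual opening. It also assumes the shared object is safe to use from several threads at once.

```java
import java.util.function.Supplier;

// Hypothetical holder; T stands in for the searcher type.
final class SharedSearcher<T> {
    private final Supplier<T> opener;
    private volatile T instance;

    SharedSearcher(Supplier<T> opener) {
        this.opener = opener;
    }

    // Opens the searcher on first use and returns the same instance afterwards.
    // Double-checked locking on a volatile field keeps concurrent first calls
    // from opening it twice.
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = opener.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

In a web application a single holder like this would typically live in a static field or in application scope, so every JSP or servlet request reaches the same one. Reopening after the index changes is not covered by this sketch.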
Greetings,
I'm getting a strange behaviour when using the FrenchAnalyzer.
Calling the same class (Searcher.java, see below) from a JSP file and from a
Java class gives different results when the query contains accents!
Notice the different value of the query object:
q = secrétaire
If
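The message is cut off before the second query value, so the cause here is unknown. One frequent source of this kind of JSP versus plain Java difference is the request text being decoded with a different charset than the one it was encoded with. The sketch below only demonstrates that effect in isolation. It is an assumption about the problem, not a diagnosis, and AccentDecoding is a hypothetical helper.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration of a charset mismatch; not taken from the poster's code.
class AccentDecoding {
    // Encodes as UTF-8 but decodes as ISO-8859-1, as a misconfigured request might.
    static String misdecode(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake above by re-encoding as ISO-8859-1 and decoding as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }
}
```

Passing "secr\u00e9taire" through misdecode turns the single accented character into two characters, so the analyzer sees a different term than the one that was indexed. If the query value printed from the JSP shows that pattern, the request encoding is worth checking before the analyzer.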