[ http://issues.apache.org/jira/browse/LUCENE-129?page=comments#action_12357779 ]
Sam Hough commented on LUCENE-129:
--
I think FSDirectory needs a finalize method adding to remove its reference
from FSDirectory.DIRECTORIES otherwise, through normal garbage c
[ http://issues.apache.org/jira/browse/LUCENE-129?page=comments#action_12357780 ]
Sam Hough commented on LUCENE-129:
--
Doh. Sorry. Been a long day. Finalize won't be called if DIRECTORIES still
points at it :( Think twice, post once.
Does this mean that cli
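To illustrate the point about finalize and DIRECTORIES, here is a minimal sketch. RegistryDemo and REGISTRY are hypothetical stand-ins, not Lucene's FSDirectory code. It shows that an object held by a static map stays strongly reachable, so the collector never reclaims it and a finalizer on it would never run.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a directory cache; this is not Lucene's FSDirectory.
final class RegistryDemo {
    // A static map holding strong references, similar in spirit to a DIRECTORIES table.
    static final Map<String, Object> REGISTRY = new HashMap<>();

    // Registers a fresh object and hands back only a weak handle,
    // so the registry holds the sole strong reference.
    static WeakReference<Object> register(String key) {
        Object dir = new Object();
        REGISTRY.put(key, dir);
        return new WeakReference<>(dir);
    }

    // Requests a GC and reports whether the referent is still reachable.
    static boolean survivesGc(WeakReference<Object> ref) {
        System.gc();
        return ref.get() != null;
    }

    public static void main(String[] args) {
        WeakReference<Object> ref = register("index");
        // Always true: the registry keeps the object strongly reachable.
        System.out.println("while registered: " + survivesGc(ref));

        REGISTRY.remove("index");
        // Collection is now permitted but not guaranteed, so this may print either value.
        System.out.println("after removal: " + survivesGc(ref));
    }
}
```

Only an explicit removal from the map (or a map that holds weak references) makes the object eligible for collection.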
If that's the way to go, we should do it by default so the user doesn't have to.
Unless the scores between two types of queries are compatible, it's a
bad idea to transparently switch between them since it will cause
relevancy to unpredictably change in the future (triggered by either a
query chan
Hi,
> 1) It might be OK to implement retrieving field values separately for a
> document. However, I think from a simplicity point of view, it might be
> better to have the application code do this drudgery. Adding this feature
> could complicate the nice and simple design of Lucene without much
[ http://issues.apache.org/jira/browse/LUCENE-395?page=all ]
Yonik Seeley resolved LUCENE-395:
-
Resolution: Fixed
Assign To: Yonik Seeley (was: Lucene Developers)
fixed BooleanQuery hashCode/equals and committed patches.
> CoordConstrainedBoo
Need QueryParser support for BooleanQuery.minNrShouldMatch
--
Key: LUCENE-466
URL: http://issues.apache.org/jira/browse/LUCENE-466
Project: Lucene - Java
Type: Improvement
Components: Search
Versions: un
: > Should we dynamically decide to switch to FieldNormQuery when
: > BooleanQuery.maxClauseCount is exceeded? That way queries that
: Why not leave that decision to the program using the query?
: Something like this:
: - catch the TooManyClauses exception,
: - adapt (the offending parts of) th
[ http://issues.apache.org/jira/browse/LUCENE-323?page=comments#action_12357806 ]
Yonik Seeley commented on LUCENE-323:
-
Added Iterable to DisjunctionMaxQuery as a semi Java5 friendly way to iterate
over the disjuncts. Added ability to add all disjunct
Yonik Seeley wrote:
Totally untested, but here is a hack at what the scorer might look
like when the number of terms is large.
Looks plausible to me.
You could instead use a byte[maxDoc] and encode/decode floats as you
store and read them, to use a lot less RAM.
// could also use a bitse
On 11/16/05, Doug Cutting <[EMAIL PROTECTED]> wrote:
> You could instead use a byte[maxDoc] and encode/decode floats as you
> store and read them, to use a lot less RAM.
Hmmm, very interesting idea.
Less than one decimal digit of precision might be hard to swallow when
you have to add scores toget
Float.floatToRawIntBits (in Java1.4) gives the raw float bits without
normalization (like *(int*)&floatvar would in C). Since it doesn't do
normalization of NaN values, it's faster (and hopefully optimized to a
simple inline machine instruction by the JVM).
On my Pentium4, using floatToRawIntBits
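A small sketch of the difference described above, using only the two standard java.lang.Float methods. The class name RawBitsDemo and the NaN payload value are made up for illustration. For ordinary values both calls return the same bits; for a NaN, floatToIntBits always returns the canonical pattern 0x7fc00000, while floatToRawIntBits skips that check.

```java
// Hypothetical demo class; only the java.lang.Float calls are real API.
final class RawBitsDemo {
    // An arbitrary quiet-NaN bit pattern with a non-zero payload (illustrative value).
    static final int NAN_WITH_PAYLOAD = 0x7fc12345;

    // Collapses every NaN to the canonical pattern 0x7fc00000.
    static int canonical(float f) {
        return Float.floatToIntBits(f);
    }

    // Returns the bits without the NaN check.
    static int raw(float f) {
        return Float.floatToRawIntBits(f);
    }

    public static void main(String[] args) {
        float ordinary = 1.5f;
        // Both print 3fc00000 for an ordinary value.
        System.out.println(Integer.toHexString(canonical(ordinary)));
        System.out.println(Integer.toHexString(raw(ordinary)));

        float nan = Float.intBitsToFloat(NAN_WITH_PAYLOAD);
        // Canonical form is always 7fc00000.
        System.out.println(Integer.toHexString(canonical(nan)));
        // Raw form is whatever NaN pattern the platform kept.
        System.out.println(Integer.toHexString(raw(nan)));
    }
}
```

For code that never feeds NaN into the conversion, the two methods are interchangeable in result and differ only in cost.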
Hi Folks.
I downloaded Lucene and tried to run ant. It initially gave me the
following error:
BUILD FAILED
file:/home/parikpol/downloads/lucene-1.4.3/build.xml:11: Unexpected element
"tstamp"
I commented out the tstamp tag from build.xml, and now it gives me the
following errors:
compile-
Yonik Seeley wrote:
Hmmm, very interesting idea.
Less than one decimal digit of precision might be hard to swallow when
you have to add scores together though:
smallfloat(score1) + smallfloat(score2) + smallfloat(score3)
Do you think that the 5/3 exponent/mantissa split is right for this,
or wo
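For readers following the byte-per-document idea, here is a rough sketch of an 8-bit float with 5 exponent bits and 3 mantissa bits. It is not the code from this thread or from Lucene: the class name SmallFloatSketch, the exponent bias of 15, and the choice to truncate rather than round are all assumptions made for illustration.

```java
// Hypothetical 8-bit float: 5 exponent bits, 3 mantissa bits, no sign bit.
final class SmallFloatSketch {
    private static final int MANTISSA_BITS = 3;
    private static final int BIAS = 15;                               // assumed exponent bias
    private static final int SHIFT = 23 - MANTISSA_BITS;              // drop the low 20 mantissa bits
    private static final int OFFSET = (127 - BIAS) << MANTISSA_BITS;  // rebias the float exponent

    // Encodes by truncation. Zero and negatives map to 0, tiny positives to 1, overflow to 255.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = (bits >> SHIFT) - OFFSET;
        if (small <= 0) {
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= 0x100) {
            return (byte) 0xff;
        }
        return (byte) small;
    }

    // Decodes back to a float; byte 0 is reserved for exactly 0.0f.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        return Float.intBitsToFloat(((b & 0xff) + OFFSET) << SHIFT);
    }

    public static void main(String[] args) {
        float score = 0.3f;
        float stored = decode(encode(score));
        // Three mantissa bits keep less than one decimal digit, so the error
        // is visible as soon as a few scores are summed.
        System.out.println("exact sum:   " + (score + score + score));
        System.out.println("encoded sum: " + (stored + stored + stored));
    }
}
```

With these assumptions 0.3f is stored as 0.28125f, which is the kind of loss the smallfloat(score1) + smallfloat(score2) + smallfloat(score3) concern is about.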
On Tuesday 15 November 2005 23:45, Yonik Seeley wrote:
> Totally untested, but here is a hack at what the scorer might look
> like when the number of terms is large.
>
> -Yonik
>
>
> package org.apache.lucene.search;
>
> import org.apache.lucene.index.TermEnum;
> import org.apache.lucene.index.
I can confirm this takes ~ 20% of an overall Indexing operation (see
attached link from YourKit).
http://people.apache.org/~psmith/luceneYourkit.jpg
Mind you, the whole "signalling via IOException" in the
FastCharStream is a way bigger overhead, although I agree much harder
to fix.
Paul
Wow! A much larger gain than I expected!
Thanks for the profile Paul!
-Yonik
On 11/16/05, Paul Smith <[EMAIL PROTECTED]> wrote:
> I can confirm this takes ~ 20% of an overall Indexing operation (see
> attached link from YourKit).
>
> http://peopl
Use Float.floatToRawIntBits over Float.floatToIntBits
--
Key: LUCENE-467
URL: http://issues.apache.org/jira/browse/LUCENE-467
Project: Lucene - Java
Type: Improvement
Components: Other
Versions: 1.9
[ http://issues.apache.org/jira/browse/LUCENE-467?page=comments#action_12357827 ]
Yonik Seeley commented on LUCENE-467:
-
Paul Smith's profiling shows encodeNorm() taking 20% of the total
indexing time, with floatToIntBits registering all of th
In general I would not take this sort of profiler output too literally.
If floatToRawIntBits is 5x faster, then you'd expect a 16% improvement
from using it, but my guess is you'll see far less. Still, it's
probably worth switching & measuring as it might be significant.
Doug
Paul Smith wro
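The 16% figure follows from simple arithmetic: if a call accounts for 20% of the run time and becomes 5x faster, it then costs 4%, saving 16%. A tiny sketch of that estimate, with a hypothetical helper name and assuming the profiler's 20% is accurate:

```java
// Hypothetical helper for the back-of-the-envelope estimate above.
final class ProfileEstimate {
    // Fraction of total run time saved when a part taking `fraction`
    // of the run time becomes `speedup` times faster.
    static double expectedSaving(double fraction, double speedup) {
        return fraction * (1.0 - 1.0 / speedup);
    }

    public static void main(String[] args) {
        // 20% of run time, 5x faster: roughly 0.16 of total time saved.
        System.out.println(expectedSaving(0.20, 5.0));
    }
}
```

The estimate is only as good as the 20% input, which is the caveat about taking profiler output literally.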
On 17/11/2005, at 9:24 AM, Doug Cutting wrote:
In general I would not take this sort of profiler output too
literally. If floatToRawIntBits is 5x faster, then you'd expect a
16% improvement from using it, but my guess is you'll see far
less. Still, it's probably worth switching & measuri
1. Run profiler
2. Sort methods by CPU time spent
3. Optimize
4. Repeat
:)
On 11/16/05, Paul Smith <[EMAIL PROTECTED]> wrote:
>
> On 17/11/2005, at 9:24 AM, Doug Cutting wrote:
>
> > In general I would not take this sort of profiler output too
> > literally. If floatToRawIntBits is 5x faster, th
On 17/11/2005, at 10:21 AM, Chris Lamprecht wrote:
1. Run profiler
2. Sort methods by CPU time spent
3. Optimize
4. Repeat
:)
Umm, well I know I could make it quicker, it's just whether it still
_works_ as expected. Maintaining the contract means I'll need to
develop some good junit
[ http://issues.apache.org/jira/browse/LUCENE-467?page=comments#action_12357838 ]
Yonik Seeley commented on LUCENE-467:
-
With -server mode, it's only 3 times as fast, and both are really fairly fast.
I do wonder if the profiler had its numbers right, or
[ http://issues.apache.org/jira/browse/LUCENE-467?page=comments#action_12357839 ]
Paul Smith commented on LUCENE-467:
---
I probably didn't make my testing framework as clear as I should. YourKit was
set up to use method sampling (waking up every X milliseco
[ http://issues.apache.org/jira/browse/LUCENE-467?page=comments#action_12357851 ]
Yonik Seeley commented on LUCENE-467:
-
Fun with premature optimization!
I know this isn't a bottleneck, but here is the fastest floatToByte() that I
could come up with: