wiki volunteer?

2006-10-23 Thread Erik Hatcher
It'd be nice to have our wiki appropriately named, without the "jakarta" bit in there. It'd take someone volunteering to bring up the issue with the Apache infrastructure team and see what is involved in making that switch (and of course reading any Apache management FAQs along the way so

number of term occurrences

2006-10-23 Thread beatriz ramos
Hello, I'm working with Lucene. I need to get the number of occurrences of a term in a document. I have looked through the documentation and I can't find anything. Do you have any idea? Thanks.

BufferedIndexInput performance improvement

2006-10-23 Thread Nadav Har'El
Hi Luceners, During a profiling session, I discovered that BufferedIndexInput.readBytes(), the function which reads a bunch of bytes from an index, is very inefficient in many cases. It is efficient for one or two bytes, and also efficient for a very large number of bytes (e.g., when the norms are

RE: wiki volunteer?

2006-10-23 Thread Steven Parkes
I think I half-volunteered for this a while ago, pending any objections. There never were any, but I didn't circle back and do anything about it. But given the prodding, I will ...

[jira] Commented: (LUCENE-673) Exceptions when using Lucene over NFS

2006-10-23 Thread Steven Parkes (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-673?page=comments#action_12444069 ] Steven Parkes commented on LUCENE-673: -- This is more of an aside than anything else, but V2-3 clients do have some support for delete after close, right? The

Re: BufferedIndexInput performance improvement

2006-10-23 Thread Yonik Seeley
On 10/23/06, Nadav Har'El <[EMAIL PROTECTED]> wrote: The basic problem in the existing code was that if you ask it to read 100 bytes, readBytes() simply calls readByte() 100 times in a loop, Gack! I had no idea this was the case. Luckily, I don't think readBytes() is used in any hotspots, righ
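
The difference between the per-byte loop described above and a bulk copy can be sketched roughly as follows. This is only an illustration, not the patch from this thread: `BulkReadSketch`, its fields, and both `readBytes*` method names are hypothetical, and an in-memory `byte[]` stands in for the underlying index file.

```java
import java.util.Arrays;

// Hypothetical stand-in for a buffered index input; this is not Lucene's class.
final class BulkReadSketch {
    private final byte[] source;  // plays the role of the underlying file
    private final byte[] buffer;
    private int sourcePos = 0;    // next unread position in source
    private int bufferPos = 0;    // next unread position in buffer
    private int bufferLen = 0;    // number of valid bytes in buffer

    BulkReadSketch(byte[] source, int bufferSize) {
        this.source = source;
        this.buffer = new byte[bufferSize];
    }

    // Load the next chunk of the source into the buffer.
    private void refill() {
        int n = Math.min(buffer.length, source.length - sourcePos);
        if (n <= 0) throw new IllegalStateException("read past EOF");
        System.arraycopy(source, sourcePos, buffer, 0, n);
        sourcePos += n;
        bufferPos = 0;
        bufferLen = n;
    }

    byte readByte() {
        if (bufferPos >= bufferLen) refill();
        return buffer[bufferPos++];
    }

    // The pattern described in the thread: one readByte() call per requested byte.
    void readBytesPerByte(byte[] dest, int offset, int len) {
        for (int i = 0; i < len; i++) dest[offset + i] = readByte();
    }

    // Bulk variant: copy whatever the buffer holds with one arraycopy, refill, repeat.
    void readBytesBulk(byte[] dest, int offset, int len) {
        while (len > 0) {
            if (bufferPos >= bufferLen) refill();
            int n = Math.min(len, bufferLen - bufferPos);
            System.arraycopy(buffer, bufferPos, dest, offset, n);
            bufferPos += n;
            offset += n;
            len -= n;
        }
    }

    public static void main(String[] args) {
        byte[] src = new byte[1000];
        for (int i = 0; i < src.length; i++) src[i] = (byte) i;
        byte[] a = new byte[300];
        byte[] b = new byte[300];
        new BulkReadSketch(src, 64).readBytesPerByte(a, 0, 300);
        new BulkReadSketch(src, 64).readBytesBulk(b, 0, 300);
        System.out.println(Arrays.equals(a, b)); // both variants read the same bytes
    }
}
```

Both methods return identical data; the bulk variant simply replaces up to `len` method calls and bounds checks with a handful of `System.arraycopy` calls.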

[jira] Commented: (LUCENE-673) Exceptions when using Lucene over NFS

2006-10-23 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-673?page=comments#action_12444086 ] Yonik Seeley commented on LUCENE-673: - > but V2-3 clients do have some support for delete after close, right? The > whole .nfs thing? I don't think that w

[jira] Commented: (LUCENE-673) Exceptions when using Lucene over NFS

2006-10-23 Thread Steven Parkes (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-673?page=comments#action_12444088 ] Steven Parkes commented on LUCENE-673: -- Yeah, I think you're right. I figured I was missing something. > Exceptions when using Lucene over NFS > -

Re: [jira] Resolved: (LUCENE-443) ConjunctionScorer tune-up

2006-10-23 Thread Peter Keegan
I did some profile testing with the new ConjunctionScorer in 2.1 and discovered a new bottleneck in ConjunctionScorer.sortScorers. The java.util.Arrays.sort method is cloning the Scorers array on every sort, which is quite expensive on large indexes because of the size of the 'norms' array within,

Re: [jira] Resolved: (LUCENE-443) ConjunctionScorer tune-up

2006-10-23 Thread Yonik Seeley
On 10/23/06, Peter Keegan <[EMAIL PROTECTED]> wrote: I did some profile testing with the new ConjunctionScorer in 2.1 and discovered a new bottleneck in ConjunctionScorer.sortScorers. The java.util.Arrays.sort method is cloning the Scorers array on every sort, Huh... that's interesting. I wond

Re: [jira] Resolved: (LUCENE-443) ConjunctionScorer tune-up

2006-10-23 Thread Paul Elschot
On Monday 23 October 2006 22:12, Peter Keegan wrote: > I did some profile testing with the new ConjunctionScorer in 2.1 and > discovered a new bottleneck in ConjunctionScorer.sortScorers. The > java.util.Arrays.sort method is cloning the Scorers array on every sort, > which is quite expensive on la

[jira] Created: (LUCENE-693) ConjunctionScorer - more tuneup

2006-10-23 Thread Peter Keegan (JIRA)
ConjunctionScorer - more tuneup --- Key: LUCENE-693 URL: http://issues.apache.org/jira/browse/LUCENE-693 Project: Lucene - Java Issue Type: Bug Components: Search Affects Versions: 2.1 Envir

Re: [jira] Resolved: (LUCENE-443) ConjunctionScorer tune-up

2006-10-23 Thread Peter Keegan
Huh... that's interesting. I wonder why Arrays.sort(int[]) is all in-place but sort(Object[]) is not. I was wondering that myself. Here's the code: public static <T> void sort(T[] a, Comparator<? super T> c) { T[] aux = (T[])a.clone(); if (c==null) mergeSort(aux, a, 0, a.length, 0);
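
For comparison with the JDK code quoted above, here is a minimal sketch of a sort that works in place and never clones the array. It is a plain insertion sort, chosen only because it is short and allocation-free for small arrays. It is not what ConjunctionScorer does and not a fix proposed in this thread; `InPlaceSortSketch` and `insertionSort` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;

final class InPlaceSortSketch {
    // Sorts the array in place; no auxiliary array is allocated.
    static <T> void insertionSort(T[] a, Comparator<? super T> c) {
        for (int i = 1; i < a.length; i++) {
            T key = a[i];
            int j = i - 1;
            // Shift larger elements one slot to the right to make room for key.
            while (j >= 0 && c.compare(a[j], key) > 0) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        Integer[] docs = {42, 7, 19, 3};
        insertionSort(docs, Comparator.<Integer>naturalOrder());
        System.out.println(Arrays.toString(docs)); // prints [3, 7, 19, 42]
    }
}
```

Insertion sort is quadratic in the worst case, so this only makes sense when the array is small, as a list of sub-scorers typically would be.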

Re: [jira] Resolved: (LUCENE-443) ConjunctionScorer tune-up

2006-10-23 Thread Peter Keegan
Isn't the issue the creation of a new Comparator each time the scorers are sorted? That could be easily fixed by keeping a single comparator around to do all the work. Yes, it's the Comparator, but I think even if you kept it around, the Arrays.sort would still clone the Scorers, no? Peter On 10/2

Re: [jira] Resolved: (LUCENE-443) ConjunctionScorer tune-up

2006-10-23 Thread Peter Keegan
Can you open a JIRA issue for this? Yes, it's: http://issues.apache.org/jira/browse/LUCENE-693 Peter

[jira] Commented: (LUCENE-693) ConjunctionScorer - more tuneup

2006-10-23 Thread Paul Elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-693?page=comments#action_12444150 ] Paul Elschot commented on LUCENE-693: - As just discussed on java-dev, the creation of an object during the call to sort could well be due to the creation of a

[jira] Updated: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-10-23 Thread Marvin Humphrey (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=all ] Marvin Humphrey updated LUCENE-675: --- Attachment: BenchmarkingIndexer.pm extract_reuters.plx Grant had asked me if he could reuse some code from the indexer benchmarks I wrote

[jira] Updated: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-10-23 Thread Marvin Humphrey (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=all ] Marvin Humphrey updated LUCENE-675: --- Attachment: LuceneIndexer.java One more file... > Lucene benchmark: objective performance test for Lucene > -

Contrib modules

2006-10-23 Thread Grant Ingersoll
OK, I am putting finishing touches on benchmark contribution and was wondering about 3rd party licenses. Namely, I used Digester for some configuration stuff and was wondering if I need to include the Digester license (and the other affiliated licenses) in the lib directory or are these "s

[jira] Commented: (LUCENE-693) ConjunctionScorer - more tuneup

2006-10-23 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-693?page=comments#action_12444174 ] Yonik Seeley commented on LUCENE-693: - It occurs to me that we shouldn't even need to sort anything! Stay tuned... I'm coming up with a patch. > ConjunctionSc

Scorer.skipTo() valid before next()?

2006-10-23 Thread Yonik Seeley
I got a bit of a surprise trying to re-implement the ConjunctionScorer. It turns out that skipTo(0) does not always return the same thing as next() on a newly created scorer. Some scorers give invalid results if skipTo() is called before next(). The javadoc is unclear on the subject, but the jav
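
To make the question concrete, here is a small sketch of an iterator for which skipTo(0) on a fresh instance does land on the same document as next(). It is an assumption-laden illustration, not Lucene code: `ArrayDocIterator` is a hypothetical class over a sorted `int[]`, and writing skipTo() as a do/while around next() is simply one way to get that behaviour.

```java
final class SkipToSketch {
    // Hypothetical stand-in exposing a next()/skipTo()/doc() trio over sorted doc ids.
    static final class ArrayDocIterator {
        private final int[] docs;  // sorted ascending
        private int idx = -1;      // -1 means "not positioned yet"

        ArrayDocIterator(int... docs) {
            this.docs = docs;
        }

        // Move to the next document; false once the list is exhausted.
        boolean next() {
            idx++;
            return idx < docs.length;
        }

        // Always advances at least once, then keeps going until doc() >= target.
        boolean skipTo(int target) {
            do {
                if (!next()) return false;
            } while (docs[idx] < target);
            return true;
        }

        // Only valid after next() or skipTo() has returned true.
        int doc() {
            return docs[idx];
        }
    }

    public static void main(String[] args) {
        ArrayDocIterator viaNext = new ArrayDocIterator(3, 8, 15);
        ArrayDocIterator viaSkip = new ArrayDocIterator(3, 8, 15);
        viaNext.next();
        viaSkip.skipTo(0);
        System.out.println(viaNext.doc() == viaSkip.doc()); // same first document
    }
}
```

An implementation that instead assumes it is already positioned when skipTo() is called would read `docs[-1]` (or stale state) here, which is the kind of invalid result the message describes.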

Re: Contrib modules

2006-10-23 Thread Otis Gospodnetic
If there is a policy, you can bet it's a conservative one, but this is ASF software, so I think licenses aren't needed. Otis - Original Message From: Grant Ingersoll <[EMAIL PROTECTED]> To: java-dev@lucene.apache.org Sent: Monday, October 23, 2006 8:45:50 PM Subject: Contrib modules

[jira] Updated: (LUCENE-693) ConjunctionScorer - more tuneup

2006-10-23 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-693?page=all ] Yonik Seeley updated LUCENE-693: Attachment: conjunction.patch Here's a patch that: 1) nails things down in the constructor (removes incremental add code) 2) removes sorting 3) always skips to