Re: Javadoc not available due to non-public classes?

2005-02-25 Thread Paul Elschot
api. This is not a bug, but a feature. There is also a javadocs-internal target in the build.xml file. It has an access=package attribute. (I prefer to use private) Regards, Paul Elschot - To unsubscribe, e-mail: [EMAIL

Re: Into javadocs? [Bug 31841] - [PATCH] MultiSearcher problems with Similarity.docFreq()

2005-02-21 Thread Paul Elschot
Doug, Would you mind if some pieces of your reply end up in the javadocs? Regards, Paul Elschot. On Monday 21 February 2005 18:49, [EMAIL PROTECTED] wrote: DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org

Re: BitSet implementation and large index

2005-02-14 Thread Paul Elschot
can score documents out of order, which is not compatible with the order required by filtering using the stored document number differences. Regards, Paul Elschot
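The ordering constraint mentioned above can be illustrated with a minimal Java sketch. It assumes a filter stored as differences between consecutive document numbers; the class name DeltaFilter and its methods are hypothetical and are not Lucene API. Because the differences can only be decoded front to back, the filter can only answer questions about documents in increasing order.

```java
// Hypothetical illustration, not Lucene code: a filter kept as document number differences.
final class DeltaFilter {
    private final int[] deltas; // differences between consecutive accepted doc numbers
    private int index = 0;      // next delta to decode
    private int current = -1;   // last decoded doc number (-1 before the first one)
    private int lastAsked = -1; // last doc number passed to accept()

    DeltaFilter(int[] deltas) {
        this.deltas = deltas;
    }

    // Encode strictly increasing doc numbers as differences, starting from -1.
    static int[] encode(int[] sortedDocs) {
        int[] result = new int[sortedDocs.length];
        int prev = -1;
        for (int i = 0; i < sortedDocs.length; i++) {
            if (sortedDocs[i] <= prev) {
                throw new IllegalArgumentException("doc numbers must be strictly increasing");
            }
            result[i] = sortedDocs[i] - prev;
            prev = sortedDocs[i];
        }
        return result;
    }

    // Forward-only check: decoding cannot go back, so out-of-order calls are rejected.
    boolean accept(int doc) {
        if (doc < lastAsked) {
            throw new IllegalStateException("documents must be checked in increasing order");
        }
        lastAsked = doc;
        while (current < doc && index < deltas.length) {
            current += deltas[index++];
        }
        return current == doc;
    }
}
```

A scorer that delivers documents out of order would trigger the IllegalStateException in this sketch, which is the incompatibility the message refers to.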

Re: Weird BooleanQuery behavior

2005-02-08 Thread Paul Elschot
document. Isn't it weird? Is it an expected behavior? That depends on the index that is being searched. A correct result could be the same document for both queries. Regards, Paul Elschot

Re: Weird BooleanQuery behavior

2005-02-08 Thread Paul Elschot
and assign the copyright to the ASF? BooleanQuery and BooleanScorer class but it's too *deep* to dig into :-) There has also been a recent addition in the development branch. Regards, Paul Elschot

Re: Study Group (WAS Re: Normalized Scoring)

2005-02-06 Thread Paul Elschot
Gospodnetic wrote:  Exactly.  Luckily, since then I've learned a bit from lucene-dev  discussions and side IR readings, so some of the topics are making  more sense now. One could collect and annotate more references here: http://wiki.apache.org/jakarta-lucene/InformationRetrieval Regards, Paul

Re: How to proceed with Bug 31841 - MultiSearcher problems with Similarity.docFreq() ?

2005-01-13 Thread Paul Elschot
, with a replicated cache, much the same thing would need to be done remotely. Regards, Paul Elschot. P.S. Are you sure it is worthwhile to do this? Term density (and its square root tf()) vary much more than idf nowadays. Chuck -Original Message- From: Wolf Siberski [mailto:[EMAIL PROTECTED

Re: Span Query Performance

2005-01-06 Thread Paul Elschot
A bit more: On Thursday 06 January 2005 10:22, Paul Elschot wrote: On Thursday 06 January 2005 02:17, Andrew Cunningham wrote: Hi all, I'm currently doing a query similar to the following: for w in wordset:     query = w near (word1 V word2 V word3 ... V word1422);     perform

Re: auto-filters?

2005-01-06 Thread Paul Elschot
On Thursday 06 January 2005 22:31, Doug Cutting wrote: Paul Elschot wrote: Filters are more efficient than query terms for many I think there are two reasons for the performance gain: - having things in RAM, e.g. the bits of a filter after it is computed once, - being able to search per

Re: Document Boosts as floats, alternate encoding

2004-12-15 Thread Paul Elschot
On Wednesday 15 December 2004 21:46, Dan Climan wrote: I'm experimenting with Document boosts and I'm finding them effective for certain types of scoring enhancements. My concern is that because of the way they are stored (ie an encoded byte). There are not enough boost values to cover the

Re: cvs commit: jakarta-lucene/src/java/org/apache/lucene/search BooleanQuery.java

2004-12-14 Thread Paul Elschot
. Regards, Paul Elschot. On Tuesday 14 December 2004 20:28, [EMAIL PROTECTED] wrote: dnaber 2004/12/14 11:28:44 Modified: src/java/org/apache/lucene/search BooleanQuery.java Log: slightly improve the TooManyClauses exception documentation Revision Changes Path 1.28

Re: Boolean Scorer

2004-12-12 Thread Paul Elschot
that max is *required* to implement reasonable multi-field searching (1). I can't imagine a case where the current I agree that a maximum score over fields is useful. ... -Original Message- From: Paul Elschot [mailto:[EMAIL PROTECTED] Sent: Saturday, December 11, 2004 2:05 PM

Re: Boolean Scorer

2004-12-11 Thread Paul Elschot
. It is done by the make...SumScorer methods. Regards, Paul Elschot

Re: Boolean Scorer

2004-12-11 Thread Paul Elschot
that computes the score is java protected. It can be overridden to make the best of it in another world. Regards, Paul Elschot

Re: Boolean Scorer

2004-12-10 Thread Paul Elschot
this should be no problem, though. Regards, Paul Elschot

Re: Sorting Of Index

2004-11-17 Thread Paul Elschot
On Wednesday 17 November 2004 17:29, Palmer, Andrew MMI Woking wrote: Hi, I have been using Lucene now for about 4 years (pre Jakarta) and the system has worked fine. One feature that would be nice for my development would be the ability to sort the index by a particular field. There

Re: idf and explain(), was Re: Search and Scoring

2004-10-19 Thread Paul Elschot
Chuck, Doug, I have quite a bit of this in the DisjunctionScorer I mentioned before. I'll post it soon, i.e. in a (few) day(s). It combines nicely with what I called a DensityTermQuery, a TermQuery that scores by its density, and power norms. Regards, Paul Elschot

Re: idf^2

2004-10-17 Thread Paul Elschot
mechanism of Lucene. With some hindsight it's a nice proof of concept for a structured query language on top of Lucene. Regards, Paul Elschot Chuck -Original Message- From: Paul Elschot [mailto:[EMAIL PROTECTED] Sent: Sunday, October 17, 2004 3:15 AM To: Lucene Developers List

Re: Contribution: better multi-field searching

2004-10-13 Thread Paul Elschot
discussed recently, but I don't remember the outcome, perhaps someone else? Regards, Paul Elschot

Re: Contribution: better multi-field searching

2004-10-12 Thread Paul Elschot
are optional. Instead of summing just use the maximum. BooleanScorer works ahead for each scorer to avoid the need for keeping the scorers sorted. But you'll probably lose skipTo() when using BooleanScorer. Regards, Paul Elschot
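The suggestion to use the maximum instead of the sum can be sketched in a few lines of Java. This is only an illustration under the assumption that each optional clause (for example, the same term searched in a different field) has already produced a score for the current document; FieldScoreCombiner is a hypothetical name, not a Lucene class.

```java
// Hypothetical illustration: combining per-field scores of optional clauses.
final class FieldScoreCombiner {
    // Sums the scores of all optional clauses for one document.
    static float sum(float[] clauseScores) {
        float total = 0f;
        for (float s : clauseScores) {
            total += s;
        }
        return total;
    }

    // Alternative discussed in the thread: keep only the best clause score.
    static float max(float[] clauseScores) {
        float best = 0f;
        for (float s : clauseScores) {
            if (s > best) {
                best = s;
            }
        }
        return best;
    }
}
```

With the maximum, a document that matches the term in several fields is not rewarded merely for the number of fields it matches in.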

Re: Using MMapDirectory fails TestCompoundFile; MMapDirectory for huge indexes

2004-10-01 Thread Paul Elschot
On Friday 01 October 2004 22:16, Doug Cutting wrote: Paul Elschot wrote: I'm working on a memory mapped directory that uses multiple buffers for large files. Great! There will be a small performance hit, as each call to readByte() will need to first check whether it's overflowed
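The per-byte check mentioned above can be sketched as follows. This is not the MMapDirectory code under discussion; it is a minimal illustration that assumes a large file split over several buffers, and it uses heap buffers from ByteBuffer.wrap() in place of real memory-mapped ones so that it stays self-contained. The class name MultiBufferInput is hypothetical.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Hypothetical illustration: reading bytes from a file split over several buffers.
final class MultiBufferInput {
    private final ByteBuffer[] buffers;
    private int bufferIndex = 0; // buffer currently being read

    // Splits data into chunks of at most chunkSize bytes (chunkSize must be positive).
    MultiBufferInput(byte[] data, int chunkSize) {
        int count = (data.length + chunkSize - 1) / chunkSize;
        buffers = new ByteBuffer[count];
        for (int i = 0; i < count; i++) {
            int start = i * chunkSize;
            int length = Math.min(chunkSize, data.length - start);
            buffers[i] = ByteBuffer.wrap(data, start, length).slice();
        }
    }

    byte readByte() {
        // The extra check on every call: move on when the current buffer is exhausted.
        while (bufferIndex < buffers.length && !buffers[bufferIndex].hasRemaining()) {
            bufferIndex++;
        }
        if (bufferIndex >= buffers.length) {
            throw new BufferUnderflowException();
        }
        return buffers[bufferIndex].get();
    }
}
```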

Re: Patch for bug 31174

2004-09-11 Thread Paul Elschot
to this list. Regards, Paul Elschot

Re: RemoteSearchable will not work anylonger, due to changes in BooleanClause

2004-08-29 Thread Paul Elschot
, on current CVS head, after ifdown eth0 on Suse 9.1 I also get Network is unreachable errors from ant build.xml -Dtestcase=TestRemoteSearchable test. After ifup eth0 the test passes normally again, no hangs at all. java -version is 1.4.2-b28. Regards, Paul Elschot

Re: Regexp Query

2004-08-25 Thread Paul Elschot
to a BooleanQuery. There is a default maximum of 1024 clauses there, so it might make sense to have a minimum length for the constant prefix. (The maximum number of clauses exists because each search Term requires a buffer.) Regards, Paul Elschot P.S. This might be useful, just ignore the defined class name and its

Re: API cleanup: BooleanQuery.add()

2004-08-23 Thread Paul Elschot
operator with subqueries/clauses. Perhaps one should distinguish parsed queries (AndQuery, OrQuery etc.) from searchable queries (BooleanQuery, SpanQuery) by putting them in separate packages. Regards, Paul Elschot

Re: the future of DateField

2004-08-17 Thread Paul Elschot
Damian, Around 18 May 2004 much of this was discussed on lucene-user as range queries over large ranges. Regards, Paul

Re: bitwise OR in BooleanScorer

2004-08-15 Thread Paul Elschot
On Saturday 14 August 2004 17:15, Daniel Naber wrote: Hi, BooleanScorer's next() method uses a bitwise OR in a while loop: ...} while (bucketTable.first != null | more); Is there any reason for this, couldn't this just be || ? BTW, I found this with FindBugs
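The difference between | and || on booleans in Java can be shown with a small sketch. For two plain boolean values both operators give the same result; the difference is that | always evaluates its right operand, while || skips it when the left operand is already true. The class and method names below are hypothetical and only serve the illustration.

```java
// Hypothetical illustration of non-short-circuit (|) versus short-circuit (||) OR.
final class BitwiseVsShortCircuit {
    static int calls = 0;

    // Right-hand operand with a visible side effect.
    static boolean more() {
        calls++;
        return false;
    }

    // Returns how often more() ran when combined with | .
    static int callsWithBitwiseOr(boolean first) {
        calls = 0;
        boolean result = first | more();
        return calls;
    }

    // Returns how often more() ran when combined with || .
    static int callsWithLogicalOr(boolean first) {
        calls = 0;
        boolean result = first || more();
        return calls;
    }
}
```

In the loop quoted above the right operand is a plain variable, so on this reading the choice between the two would not change the result of the condition.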

Javadocs for Scorer.java and TermScorer.java

2004-08-09 Thread Paul Elschot
Doug, See http://issues.apache.org/bugzilla/show_bug.cgi?id=30360 After your comments on the patch, I changed build.xml to include a javadocs-internal target to create javadocs for public and internal API. I'd like to continue to provide javadocs for other subclasses of Scorer, but these

Re: FilteringQuery.java, Filter

2004-08-01 Thread Paul Elschot
On Friday 30 July 2004 23:29, Robert Engels wrote: I thought that in the next release we were going to change 'Filter' to an interface, with a definition interface Filter { boolean accept(Document doc); } Is this not going to happen? I don't know, I wasn't involved in that. I'd rather have the BitSet

FilteringQuery.java

2004-07-30 Thread Paul Elschot
Dear developers, At the moment IndexSearcher.search(Query, Filter) computes a score for every document matching the query before checking the filter. With the BitSet.nextSetBit() method one might implement a filter as a required clause in a Query. This would even allow the possible use of
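The idea of treating the filter as a required clause can be sketched with java.util.BitSet.nextSetBit(). This is an illustration only, not the FilteringQuery.java from the post: it assumes the query's matching documents are available as a sorted int array, and the class name FilterAsRequiredClause is hypothetical. Documents rejected by the filter are skipped before any score would be computed.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration: intersecting query matches with filter bits.
final class FilterAsRequiredClause {
    // Returns the doc numbers present both in sortedQueryDocs and in filterBits.
    static List<Integer> intersect(int[] sortedQueryDocs, BitSet filterBits) {
        List<Integer> result = new ArrayList<>();
        int i = 0;
        int doc = filterBits.nextSetBit(0); // -1 when no bit is set
        while (doc >= 0 && i < sortedQueryDocs.length) {
            if (sortedQueryDocs[i] < doc) {
                i++; // a real scorer could skip ahead to doc here
            } else if (sortedQueryDocs[i] > doc) {
                doc = filterBits.nextSetBit(sortedQueryDocs[i]); // jump the filter forward
            } else {
                result.add(doc); // only now would a score be needed
                i++;
                doc = filterBits.nextSetBit(doc + 1);
            }
        }
        return result;
    }
}
```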

Some javadoc additions: via bugzilla?

2004-07-27 Thread Paul Elschot
Dear developers, I have made javadocs for Scorer.java and TermScorer.java. For this I also had to change build.xml to use package access for the javadocs target. That caused some minor javadoc error messages in CompoundFileReader.java and FieldInfos.java, which I fixed. Can I post corresponding

Simple patch for broken link in javadoc of Weight.java

2004-07-26 Thread Paul Elschot
Regards, Paul Index: search/Weight.java === RCS file: /home/cvspublic/jakarta-lucene/src/java/org/apache/lucene/search/Weight.java,v retrieving revision 1.3 diff -u -r1.3 Weight.java --- search/Weight.java 29 Mar 2004 22:48:04

Re: 1.4 RC3?

2004-05-11 Thread Paul Elschot
On Tuesday 11 May 2004 20:09, Doug Cutting wrote: Any reason not to roll out a 1.4 RC3 release today? There's a bug in 1.4RC2 that breaks existing Lucene applications which has been fixed, so I'd like to get something out ASAP. There's an unfixed bug in span search. But since the span code

Re: [Jakarta Lucene Wiki] Updated: PoweredBy

2004-04-25 Thread Paul Elschot
Erik, MIT's superarchive isn't listed, it seems: http://www.technologyreview.com/articles/atwood1202.asp?p=0 On Saturday 24 April 2004 04:19, Erik Hatcher wrote: Wow, just wow http://www.akamai.com/en/html/services/edgecomputing_search.html On Apr 23, 2004, at 10:00 AM, [EMAIL

Re: Ordered span query with more than 2 subqueries: avoid?

2004-04-19 Thread Paul Elschot
Hi Doug, On Monday 19 April 2004 20:56, Doug Cutting wrote: Paul Elschot wrote: On Tuesday 06 April 2004 18:11, Doug Cutting wrote: I think this is indeed the problem. Currently it always increments the earliest span. Rather I think it should increment the first span, still within slop

Surround query parser

2004-04-18 Thread Paul Elschot
Dear developers, I'd like to contribute a query parser named Surround. The implementation uses mostly Lucene's BooleanQuery, TermQuery, SpanNearQuery, SpanOrQuery and SpanTermQuery. These are chosen depending on the query operator. Currently the sources are in a CVS working copy next to a

Re: cvs commit: jakarta-lucene/src/test/org/apache/lucene/search/spans TestSpans.java

2004-04-09 Thread Paul Elschot
/search/spans TestSpans.java Log: add SpanNearQuery test case contributed by Paul Elschot - note, this test currently fails due to a bug. Revision Changes Path 1.1 jakarta-lucene/src/test/org/apache/lucene/search/spans/TestSpans.java

Test case for ordered span query with more than 2 subqueries and slop.

2004-04-07 Thread Paul Elschot
Since the attachment didn't arrive, I'll try again. On Tuesday 06 April 2004 21:23, Paul Elschot wrote: Doug, On Tuesday 06 April 2004 18:11, Doug Cutting wrote: Paul Elschot wrote: A test of the ordered span query with three terms: w1 w2 w3 and slop 1 against document

Re: Ordered span query with more than 2 subqueries: avoid?

2004-04-06 Thread Paul Elschot
Doug, On Tuesday 06 April 2004 18:11, Doug Cutting wrote: Paul Elschot wrote: A test of the ordered span query with three terms: w1 w2 w3 and slop 1 against document: w1 w3 w2 w3 fails. Thanks for catching this. It would be helpful if you could submit a JUnit test which

Ordered span query with more than 2 subqueries: avoid?

2004-04-01 Thread Paul Elschot
Dear readers, (Not sure whether this would be better posted to lucene-user.) A test of the ordered span query with three terms: w1 w2 w3 and slop 1 against document: w1 w3 w2 w3 fails. The javadoc (1.4 rc3) of SpanNearQuery gives: Matches spans which are near one another. One can
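The test case above can be checked by hand with a brute-force sketch that does not use Lucene at all. It assumes that slop counts the positions inside the matched span that are not taken by one of the query terms; that reading, and the class name OrderedNearCheck, are assumptions made for this illustration. Under that reading the terms w1 w2 w3 occur in order at positions 0, 2 and 3 of the document w1 w3 w2 w3, with one unmatched position in between.

```java
// Hypothetical brute-force check for an ordered near match; not Lucene code.
final class OrderedNearCheck {
    // True if all terms (at least one) occur in order within the allowed slop.
    static boolean matches(String[] doc, String[] terms, int slop) {
        for (int start = 0; start < doc.length; start++) {
            if (doc[start].equals(terms[0])
                    && matchFrom(doc, terms, 1, start, start, slop)) {
                return true;
            }
        }
        return false;
    }

    // Tries every later position for terms[termIndex], after prevPos.
    private static boolean matchFrom(String[] doc, String[] terms, int termIndex,
                                     int start, int prevPos, int slop) {
        if (termIndex == terms.length) {
            // Assumed slop definition: span length minus number of terms.
            return (prevPos - start + 1) - terms.length <= slop;
        }
        for (int p = prevPos + 1; p < doc.length; p++) {
            if (doc[p].equals(terms[termIndex])
                    && matchFrom(doc, terms, termIndex + 1, start, p, slop)) {
                return true;
            }
        }
        return false;
    }
}
```

Calling matches() with the document and terms from the post and slop 1 returns true in this sketch, and with slop 0 it returns false.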

Re: Queries with only non required terms: not as OR?

2004-03-03 Thread Paul Elschot
Doug, On Wednesday 03 March 2004 18:47, Doug Cutting wrote: Paul Elschot wrote: I read a bit into the source code and I found this comment at BooleanQuery.scorer(): // Also, at this point a // BooleanScorer cannot be embedded in a ConjunctionScorer, as the hits // from

Queries with only non required terms: not as OR?

2004-03-02 Thread Paul Elschot
Hello, I'm trying to implement a query language with, among others, AND and OR operators for Lucene. I can get the AND operator to work by mapping it to a BooleanQuery with only required terms. However, when I try to implement the OR operator by mapping it to a BooleanQuery with non required terms, it

Re: Queries with only non required terms: not as OR?

2004-03-02 Thread Paul Elschot
implement skipTo() correctly, which is // required by ConjunctionScorer. The test function I used assumes that documents will be collected in order. Could this be the source of the problem? Paul. On Tuesday 02 March 2004 21:48, Paul Elschot wrote: Hello, I'm trying to implement a query language