DO NOT REPLY [Bug 31841] - [PATCH] MultiSearcher problems with Similarity.docFreq()

2005-02-22 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://issues.apache.org/bugzilla/show_bug.cgi?id=31841.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=31841


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #14312|0                           |1
        is obsolete|                            |




--- Additional Comments From [EMAIL PROTECTED]  2005-02-22 16:33 ---
Created an attachment (id=14344)
 -- (http://issues.apache.org/bugzilla/attachment.cgi?id=14344&action=view)
Next version - sends weights instead of queries

Thanks for the valuable feedback, which has resulted in this new version. Now
the query freezing approach is avoided by sending weights to the searchables
instead of queries. Thus queries are still reusable, but the Searchable
interface had to be extended. Also, previous API and behavior modifications
have been reverted as far as possible.

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: How to proceed with Bug 31841 - MultiSearcher problems with Similarity.docFreq() ?

2005-02-22 Thread Wolf Siberski
Doug Cutting wrote:
> Wolf Siberski wrote:
>> Now I found another solution which requires more changes, but IMHO is
>> much cleaner:
>> - when a query computes its Weight, it caches it in an attribute
>> - a query can be 'frozen'. A frozen query always returns the cached
>>   Weight when calling Query.weight().
>
> Originally there was no Weight in Lucene, only Query and Scorer.  Weight
> was added in order to make it so that searching did not modify a Query,
> so that a Query instance could be reused.  Searcher-dependent state of
> the query is meant to reside in the Weight.  IndexReader-dependent state
> resides in the Scorer.  Your freezing a query violates this.  Can't we
> create the weight once in Searcher.search?

I see. Yes, it is possible to avoid this by sending the weights to the
Searchables instead of the queries. This is much better, because it becomes
more explicit what is going on. The price is an extension (or modification)
of the Searchable interface. I've added corresponding search(Weight...)
methods to the existing search(Query...) methods and deprecated the latter.
If Searchable is meant to be Lucene internal, then IMHO these 'duplicates'
should be removed.

Regarding your other comments: I've been a bit too eager in refactoring,
not giving enough thought to backward compatibility issues. Now I've
reverted to existing API and behavior as far as (IMHO) possible,
and that was pretty far. The only API change necessary is
createWeight() _throws IOException_, because the idfs have to
be computed in the Weight constructors.
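
A rough sketch of the idea, under stated assumptions: the Searchable,
Weight, InMemoryIndex and MultiSearcher types below are hypothetical
simplifications written for this illustration and are not the interfaces
from the attached patch; the idf formula is assumed as well. The sketch
shows the weight being built once from document frequencies summed over
all sub-indexes and then handed to each of them, and why createWeight()
needs to declare IOException once docFreq() may be a remote call.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical, simplified types; NOT the interfaces from the actual patch.
class WeightDispatchSketch {

    /** Stand-in for a (possibly remote) sub-index. */
    interface Searchable {
        int docFreq(String term) throws IOException;
        int maxDoc() throws IOException;
        /** Scores every local document with a weight computed by the caller. */
        double[] search(Weight weight) throws IOException;
    }

    /** Carries the idf computed once from statistics of the whole collection. */
    static final class Weight {
        final String term;
        final double idf;
        Weight(String term, double idf) { this.term = term; this.idf = idf; }
    }

    /** Toy in-memory index: each document is an array of terms. */
    static final class InMemoryIndex implements Searchable {
        private final String[][] docs;
        InMemoryIndex(String[][] docs) { this.docs = docs; }

        private static int termFreq(String[] doc, String term) {
            int tf = 0;
            for (String t : doc) if (t.equals(term)) tf++;
            return tf;
        }
        public int docFreq(String term) {
            int df = 0;
            for (String[] doc : docs) if (termFreq(doc, term) > 0) df++;
            return df;
        }
        public int maxDoc() { return docs.length; }
        public double[] search(Weight w) {
            double[] scores = new double[docs.length];
            for (int i = 0; i < docs.length; i++) {
                scores[i] = Math.sqrt(termFreq(docs[i], w.term)) * w.idf;
            }
            return scores;
        }
    }

    static final class MultiSearcher {
        private final List<Searchable> subs;
        MultiSearcher(List<Searchable> subs) { this.subs = subs; }

        // Declared to throw IOException: gathering docFreq may mean remote calls.
        Weight createWeight(String term) throws IOException {
            int df = 0;
            int maxDoc = 0;
            for (Searchable s : subs) {
                df += s.docFreq(term);
                maxDoc += s.maxDoc();
            }
            // Assumed idf formula: ln(maxDoc / (docFreq + 1)) + 1.
            return new Weight(term, Math.log((double) maxDoc / (df + 1)) + 1.0);
        }

        /** Builds the weight once, then sends the same weight to every sub-index. */
        double[][] search(String term) throws IOException {
            Weight w = createWeight(term);
            double[][] scores = new double[subs.size()][];
            for (int i = 0; i < subs.size(); i++) {
                scores[i] = subs.get(i).search(w);
            }
            return scores;
        }
    }
}
```

In this sketch two identical documents receive the same score no matter
which sub-index holds them, because no sub-index derives an idf from its
own local document frequency.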
Thanks for your valuable feedback. It helped a lot to understand
the Lucene 'spirit' behind the code. An improved patch is attached
to the Bugzilla issue.
--Wolf


DO NOT REPLY [Bug 31841] - [PATCH] MultiSearcher problems with Similarity.docFreq()

2005-02-22 Thread bugzilla

http://issues.apache.org/bugzilla/show_bug.cgi?id=31841





--- Additional Comments From [EMAIL PROTECTED]  2005-02-22 19:17 ---
This looks great to me!

+1

Thanks again for patiently working through this rather extensive change.




Re: How to proceed with Bug 31841 - MultiSearcher problems with Similarity.docFreq() ?

2005-02-22 Thread Doug Cutting
Wolf Siberski wrote:
> The price is an extension (or modification) of the
> Searchable interface. I've added corresponding search(Weight...) methods
> to the existing search(Query...) methods and deprecated the latter.

I think this is the right solution.

> If Searchable is meant to be Lucene internal, then IMHO these 'duplicates'
> should be removed.

Searchable should be public, so that other RPC mechanisms may be used,
rather than RMI.  Thus the architecture supports distributed search and
RMI is just one potential platform.  Searchable is meant to be the
abstract network protocol.  Queries, filters and sort criteria are
designed to be compact so that they may be efficiently passed to a
remote index, with only the top-scoring hits returned, rather than
every non-zero scoring hit.  HitCollector-based access to remote indexes
is discouraged.  HitCollectors are rather primarily meant to be used to
implement queries, sorting and filtering.
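
To illustrate why returning only the top-scoring hits keeps the protocol
compact, here is a minimal sketch of merging the top-n lists of several
sub-indexes. The Hit class and the merge() helper are hypothetical names
invented for this illustration; they are not Lucene classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper for illustration; not a Lucene class.
class TopHitsMergeSketch {

    /** One hit as a sub-index would report it. */
    static final class Hit {
        final int searcher;  // which sub-index produced the hit
        final int doc;       // document number local to that sub-index
        final float score;
        Hit(int searcher, int doc, float score) {
            this.searcher = searcher;
            this.doc = doc;
            this.score = score;
        }
    }

    /** Keeps the n best hits over all partial results, best first. */
    static List<Hit> merge(List<List<Hit>> partials, int n) {
        // Min-heap on score: the head is always the weakest hit currently kept.
        PriorityQueue<Hit> heap =
            new PriorityQueue<>(Comparator.comparingDouble((Hit h) -> h.score));
        for (List<Hit> partial : partials) {
            for (Hit h : partial) {
                heap.add(h);
                if (heap.size() > n) {
                    heap.poll();  // drop the weakest so at most n hits remain
                }
            }
        }
        List<Hit> best = new ArrayList<>(heap);
        best.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return best;
    }
}
```

Each sub-index only needs to send its own n best hits, so the data that
crosses the network is bounded by n times the number of sub-indexes rather
than by the number of matching documents.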

The deprecated methods should be removed in Lucene 2.0.  We could 
probably remove them now without breaking anyone, but it's better to be 
safe.

> Regarding your other comments: I've been a bit too eager in refactoring,
> not giving enough thought to backward compatibility issues. Now I've
> reverted to existing API and behavior as far as (IMHO) possible,
> and that was pretty far. The only API change necessary is
> createWeight() _throws IOException_, because the idfs have to
> be computed in the Weight constructors.

I think that's okay.  Thanks for all your work!

> An improved patch is attached to the Bugzilla issue.

This patch now looks great to me.  +1

Does anyone object to committing this patch?

  http://issues.apache.org/bugzilla/show_bug.cgi?id=31841

Doug


DO NOT REPLY [Bug 33678] - More javadocs for Weight

2005-02-22 Thread bugzilla

http://issues.apache.org/bugzilla/show_bug.cgi?id=33678





--- Additional Comments From [EMAIL PROTECTED]  2005-02-22 20:45 ---
Created an attachment (id=14345)
 -- (http://issues.apache.org/bugzilla/attachment.cgi?id=14345&action=view)
Javadoc additions to Searchable and HitCollector

Derived from Doug's lucene-dev reply of 22 Feb 2005 about bug 31841




DO NOT REPLY [Bug 33019] - [PATCH] BooleanScorer can score documents in non increasing order

2005-02-22 Thread bugzilla

http://issues.apache.org/bugzilla/show_bug.cgi?id=33019





--- Additional Comments From [EMAIL PROTECTED]  2005-02-22 21:18 ---
Created an attachment (id=14347)
 -- (http://issues.apache.org/bugzilla/attachment.cgi?id=14347&action=view)
Control allowSkipTo() on 1.4 scorer from BooleanQuery





DO NOT REPLY [Bug 33019] - [PATCH] BooleanScorer can score documents in non increasing order

2005-02-22 Thread bugzilla

http://issues.apache.org/bugzilla/show_bug.cgi?id=33019





--- Additional Comments From [EMAIL PROTECTED]  2005-02-22 21:22 ---
Created an attachment (id=14348)
 -- (http://issues.apache.org/bugzilla/attachment.cgi?id=14348&action=view)
Adapted TestBoolean2.java to use skipTo on the 1.4 scorer

These two patches allow experiments with 3 versions of BooleanScorer:
- the 1.4 scorer (almost) unmodified,
- the 1.4 scorer implementing skipTo() and scoring docs in order,
- the new default scorer.
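
For context, skipTo() presupposes that a scorer reports documents in
increasing document-number order. A minimal sketch of that contract
follows; DocIterator and intersect() are hypothetical names written for
this illustration and are not the BooleanScorer code from either patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of an in-order iterator with skipTo().
class SkipToSketch {

    /** Iterates over a strictly increasing array of matching document numbers. */
    static final class DocIterator {
        private final int[] docs;
        private int pos = -1;
        DocIterator(int[] docs) { this.docs = docs; }

        boolean next() { return ++pos < docs.length; }
        int doc() { return docs[pos]; }

        /** Advances at least once, to the first document >= target; false when exhausted. */
        boolean skipTo(int target) {
            do {
                if (!next()) return false;
            } while (doc() < target);
            return true;
        }
    }

    /** Conjunction of two in-order iterators, leapfrogging with skipTo(). */
    static List<Integer> intersect(DocIterator a, DocIterator b) {
        List<Integer> out = new ArrayList<>();
        if (!a.next() || !b.next()) return out;
        while (true) {
            if (a.doc() == b.doc()) {
                out.add(a.doc());
                if (!a.next() || !b.next()) return out;
            } else if (a.doc() < b.doc()) {
                if (!a.skipTo(b.doc())) return out;
            } else {
                if (!b.skipTo(a.doc())) return out;
            }
        }
    }
}
```

If either iterator returned documents out of order, the leapfrogging in
intersect() could step past a common document, which is why in-order
scoring and skipTo() go together.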





[Jakarta Lucene Wiki] Updated: PoweredBy

2005-02-22 Thread lucene-cvs
   Date: 2005-02-22T14:06:15
   Editor: JunLu
   Wiki: Jakarta Lucene Wiki
   Page: PoweredBy
   URL: http://wiki.apache.org/jakarta-lucene/PoweredBy

   add http://www.docjar.com, a javadoc search engine powered by Lucene.

Change Log:

--
@@ -4,6 +4,7 @@
  * [http://aduna.biz/products/metadataserver/index.html Aduna Metadata Server] - RDF-based indexing server for metadata and full text
  * [http://www.celoxis.com/ Celoxis] - web based project management tool
  * [http://www.cvmail.com/ CvMail] - web based tool for recruiters (to manage job-applications by mail)
+ * [http://www.docjar.com/ DocJar] - search engine of thousand open source Java API document.
  * [http://www.eclipse.org Eclipse] - the Eclipse IDE uses Lucene for searching its documentation
  * [http://www.yawah.com/ eRez Imaging Server] - Dynamic Imaging Server
  * [http://eyebrowse.tigris.org/ Eyebrowse] - a browser for Unix mbox format mail archives
