DO NOT REPLY [Bug 31841] - [PATCH] MultiSearcher problems with Similarity.docFreq()
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://issues.apache.org/bugzilla/show_bug.cgi?id=31841. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=31841

[EMAIL PROTECTED] changed:

           What    |Removed |Added
  Attachment #14312|0       |1
        is obsolete|        |

--- Additional Comments From [EMAIL PROTECTED] 2005-02-22 16:33 ---
Created an attachment (id=14344)
 -- (http://issues.apache.org/bugzilla/attachment.cgi?id=14344&action=view)
Next version - sends weights instead of queries

Thanks for the valuable feedback which has resulted in this new version. Now the query freezing approach is avoided by sending weights to the searchables instead of queries. Thus queries are still reusable, but the Searchable interface had to be extended. Also, previous API and behavior modifications have been reverted as far as possible.

--
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: How to proceed with Bug 31841 - MultiSearcher problems with Similarity.docFreq() ?
Doug Cutting wrote:
> Wolf Siberski wrote:
>> Now I found another solution which requires more changes, but IMHO is much cleaner:
>> - when a query computes its Weight, it caches it in an attribute
>> - a query can be 'frozen'. A frozen query always returns the cached Weight when calling Query.weight().
>
> Originally there was no Weight in Lucene, only Query and Scorer. Weight was added in order to make it so that searching did not modify a Query, so that a Query instance could be reused. Searcher-dependent state of the query is meant to reside in the Weight. IndexReader-dependent state resides in the Scorer. Your freezing a query violates this. Can't we create the weight once in Searcher.search?

I see. Yes, it is possible to avoid this by sending the weights to the Searchables instead of the queries. This is much better, because it becomes more explicit what is going on. The price is an extension (or modification) of the Searchable interface. I've added corresponding search(Weight...) methods to the existing search(Query...) methods and deprecated the latter. If Searchable is meant to be Lucene internal, then IMHO these 'duplicates' should be removed.

Regarding your other comments: I've been a bit too eager in refactoring, not giving enough thought to backward compatibility issues. Now I've reverted to existing API and behavior as far as (IMHO) possible, and that was pretty far. The only API change necessary is createWeight() _throws IOException_, because the idfs have to be computed in the Weight constructors.

Thanks for your valuable feedback. It helped a lot to understand the Lucene 'spirit' behind the code. An improved patch is attached to the Bugzilla issue.

--Wolf
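The split discussed above (reusable Query, searcher-dependent Weight created once, the same Weight handed to every sub-searcher) can be illustrated with a small stand-alone sketch. Everything in it is hypothetical and written for illustration only: the class names, the DocFreqSource interface, and the idf formula are assumptions and are not taken from the patch or from Lucene's actual API. It only shows why computing the weight once from combined statistics makes scores comparable across sub-indexes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class WeightSketch {

    /** Source of the statistics a weight needs; hypothetical, not a Lucene interface. */
    interface DocFreqSource {
        int docFreq(String term);
        int maxDoc();
    }

    /** Searcher-independent and never modified by searching, so it can be reused. */
    static final class TermQuery {
        final String term;
        TermQuery(String term) { this.term = term; }

        /** Puts the searcher-dependent state (here only an idf) into a new object. */
        TermWeight createWeight(DocFreqSource stats) {
            // Illustrative idf formula chosen for this sketch only.
            double idf = Math.log((double) stats.maxDoc() / (stats.docFreq(term) + 1)) + 1.0;
            return new TermWeight(term, idf);
        }
    }

    /** Carries the statistics-dependent part of scoring. */
    static final class TermWeight {
        final String term;
        final double idf;
        TermWeight(String term, double idf) { this.term = term; this.idf = idf; }
    }

    /** One sub-index; each document is a map from term to term frequency. */
    static final class SubIndex implements DocFreqSource {
        private final List<Map<String, Integer>> docs;
        SubIndex(List<Map<String, Integer>> docs) { this.docs = docs; }

        public int docFreq(String term) {
            int n = 0;
            for (Map<String, Integer> doc : docs) if (doc.containsKey(term)) n++;
            return n;
        }

        public int maxDoc() { return docs.size(); }

        /** Scores with the idf carried by the weight, not with local statistics. */
        List<Double> search(TermWeight weight) {
            List<Double> scores = new ArrayList<>();
            for (Map<String, Integer> doc : docs) {
                Integer tf = doc.get(weight.term);
                if (tf != null) scores.add(Math.sqrt(tf) * weight.idf);
            }
            return scores;
        }
    }

    /** Combines sub-indexes; creates the weight once and sends it to each of them. */
    static final class MultiIndex implements DocFreqSource {
        private final List<SubIndex> subs;
        MultiIndex(List<SubIndex> subs) { this.subs = subs; }

        public int docFreq(String term) {
            int n = 0;
            for (SubIndex s : subs) n += s.docFreq(term);
            return n;
        }

        public int maxDoc() {
            int n = 0;
            for (SubIndex s : subs) n += s.maxDoc();
            return n;
        }

        List<Double> search(TermQuery query) {
            TermWeight weight = query.createWeight(this); // global statistics, computed once
            List<Double> scores = new ArrayList<>();
            for (SubIndex s : subs) scores.addAll(s.search(weight));
            return scores;
        }
    }

    /** Helper: a document containing a single term once. */
    static Map<String, Integer> doc(String term) {
        return Collections.singletonMap(term, 1);
    }

    public static void main(String[] args) {
        SubIndex a = new SubIndex(Arrays.asList(doc("lucene"), doc("other"), doc("other"), doc("other")));
        SubIndex b = new SubIndex(Arrays.asList(doc("lucene"), doc("lucene")));
        MultiIndex multi = new MultiIndex(Arrays.asList(a, b));
        TermQuery query = new TermQuery("lucene");

        System.out.println("one global weight: " + multi.search(query));
        System.out.println("per-index weights: " + a.search(query.createWeight(a))
                + " " + b.search(query.createWeight(b)));
    }
}
```

With one global weight, documents with the same term frequency get the same score regardless of which sub-index holds them; with per-index weights the scores diverge, which is the kind of docFreq inconsistency the bug title refers to. The query object itself is never changed in either case.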
DO NOT REPLY [Bug 31841] - [PATCH] MultiSearcher problems with Similarity.docFreq()
http://issues.apache.org/bugzilla/show_bug.cgi?id=31841

--- Additional Comments From [EMAIL PROTECTED] 2005-02-22 19:17 ---
This looks great to me! +1 Thanks again for patiently working through this rather extensive change.
Re: How to proceed with Bug 31841 - MultiSearcher problems with Similarity.docFreq() ?
Wolf Siberski wrote:
> The price is an extension (or modification) of the Searchable interface. I've added corresponding search(Weight...) methods to the existing search(Query...) methods and deprecated the latter.

I think this is the right solution.

> If Searchable is meant to be Lucene internal, then IMHO these 'duplicates' should be removed.

Searchable should be public, so that other RPC mechanisms may be used, rather than RMI. Thus the architecture supports distributed search and RMI is just one potential platform. Searchable is meant to be the abstract network protocol. Queries, filters and sort criteria are designed to be compact so that they may be efficiently passed to a remote index, with only the top-scoring hits returned, rather than every non-zero scoring hit. HitCollector-based access to remote indexes is discouraged. HitCollectors are rather primarily meant to be used to implement queries, sorting and filtering.

The deprecated methods should be removed in Lucene 2.0. We could probably remove them now without breaking anyone, but it's better to be safe.

> Regarding your other comments: I've been a bit too eager in refactoring, not giving enough thought to backward compatibility issues. Now I've reverted to existing API and behavior as far as (IMHO) possible, and that was pretty far. The only API change necessary is createWeight() _throws IOException_, because the idfs have to be computed in the Weight constructors.

I think that's okay. Thanks for all your work!

> An improved patch is attached to the Bugzilla issue.

This patch now looks great to me. +1

Does anyone object to committing this patch?

http://issues.apache.org/bugzilla/show_bug.cgi?id=31841

Doug
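The point about returning only the top-scoring hits from a remote index, rather than every non-zero scoring hit, can be sketched with a bounded priority queue. This is a generic illustration under stated assumptions: the Hit class and the topHits method are hypothetical names invented here, and the sketch does not reproduce how Lucene or the Searchable interface actually collects hits.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TopHitsSketch {

    /** A (document number, score) pair; hypothetical, for illustration only. */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /**
     * Returns at most n hits with the highest positive scores, best first.
     * scores[i] is the score of document i; zero means "no match".
     */
    static List<Hit> topHits(float[] scores, int n) {
        if (n <= 0) return new ArrayList<>();
        Comparator<Hit> byScore = Comparator.comparingDouble((Hit h) -> h.score);
        // Min-heap: the weakest retained hit sits on top and is evicted first.
        PriorityQueue<Hit> queue = new PriorityQueue<>(n, byScore);
        for (int doc = 0; doc < scores.length; doc++) {
            float score = scores[doc];
            if (score <= 0f) continue;
            if (queue.size() < n) {
                queue.add(new Hit(doc, score));
            } else if (score > queue.peek().score) {
                queue.poll();
                queue.add(new Hit(doc, score));
            }
        }
        List<Hit> result = new ArrayList<>(queue);
        result.sort(byScore.reversed());
        return result;
    }

    public static void main(String[] args) {
        float[] scores = {0.1f, 0.9f, 0f, 0.5f, 0.7f};
        for (Hit hit : topHits(scores, 2)) {
            System.out.println("doc=" + hit.doc + " score=" + hit.score);
        }
    }
}
```

Only n small records cross the wire per remote index in such a scheme, whereas a collector callback invoked for every matching document would need one remote call, or one transferred record, per hit.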
DO NOT REPLY [Bug 33678] - More javadocs for Weight
http://issues.apache.org/bugzilla/show_bug.cgi?id=33678

--- Additional Comments From [EMAIL PROTECTED] 2005-02-22 20:45 ---
Created an attachment (id=14345)
 -- (http://issues.apache.org/bugzilla/attachment.cgi?id=14345&action=view)
Javadoc additions to Searchable and HitCollector

Derived from Doug's lucene-dev reply of 22 Feb 2004 about bug 31841
DO NOT REPLY [Bug 33019] - [PATCH] BooleanScorer can score documents in non increasing order
http://issues.apache.org/bugzilla/show_bug.cgi?id=33019

--- Additional Comments From [EMAIL PROTECTED] 2005-02-22 21:18 ---
Created an attachment (id=14347)
 -- (http://issues.apache.org/bugzilla/attachment.cgi?id=14347&action=view)
Control allowSkipTo() on 1.4 scorer from BooleanQuery
DO NOT REPLY [Bug 33019] - [PATCH] BooleanScorer can score documents in non increasing order
http://issues.apache.org/bugzilla/show_bug.cgi?id=33019

--- Additional Comments From [EMAIL PROTECTED] 2005-02-22 21:22 ---
Created an attachment (id=14348)
 -- (http://issues.apache.org/bugzilla/attachment.cgi?id=14348&action=view)
Adapted TestBoolean2.java to use skipTo on the 1.4 scorer

These two patches allow experiments with 3 versions of BooleanScorer:
- the 1.4 scorer (almost) unmodified,
- the 1.4 scorer implementing skipTo() and scoring docs in order,
- the new default scorer.
[Jakarta Lucene Wiki] Updated: PoweredBy
Date: 2005-02-22T14:06:15
Editor: JunLu
Wiki: Jakarta Lucene Wiki
Page: PoweredBy
URL: http://wiki.apache.org/jakarta-lucene/PoweredBy

add http://www.docjar.com, a javadoc search engine powered by Lucene.

Change Log:
--
@@ -4,6 +4,7 @@
  * [http://aduna.biz/products/metadataserver/index.html Aduna Metadata Server] - RDF-based indexing server for metadata and full text
  * [http://www.celoxis.com/ Celoxis] - web based project management tool
  * [http://www.cvmail.com/ CvMail] - web based tool for recruiters (to manage job-applications by mail)
+ * [http://www.docjar.com/ DocJar] - search engine of thousand open source Java API document.
  * [http://www.eclipse.org Eclipse] - the Eclipse IDE uses Lucene for searching its documentation
  * [http://www.yawah.com/ eRez Imaging Server] - Dynamic Imaging Server
  * [http://eyebrowse.tigris.org/ Eyebrowse] - a browser for Unix mbox format mail archives