Re: WriteLineDocTask does not release resources

2009-04-11 Thread Michael McCandless
Actually, in LUCENE-1516 I added an empty PerfTask.close() for exactly this purpose. (You probably need to svn up.) (The NearRealtimeReaderTask needed to close its reader.) Why do we need a separate Map to track separate classes that close resources? Why not simply ask each task to implement close

Re: WriteLineDocTask does not release resources

2009-04-11 Thread Shai Erera
That explains everything. I started to work on 1591 before you committed 1516, and was looking for something like this close() method the entire morning ... :) I agree with everything you say. This PerfTask.close() also allows one to write a Close task, in case calling that close() is necessary in
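
The close() idea discussed in this exchange can be sketched roughly as follows. This is only an illustration under stated assumptions, not the benchmark code: SketchPerfTask, SketchWriteLineTask and SketchRunner are hypothetical stand-ins, and only the notion of a no-op close() on the base task that resource-holding tasks override comes from the thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;

// Hypothetical, simplified stand-in for the benchmark's PerfTask (not the real class).
abstract class SketchPerfTask {
    // Performs the task's work and returns how many items it processed.
    abstract int doLogic();

    // Empty by default, so tasks that hold no resources need no extra code.
    void close() {}
}

// Hypothetical task that owns a Writer and must release it when the run ends.
final class SketchWriteLineTask extends SketchPerfTask {
    private final Writer out;
    private boolean closed;

    SketchWriteLineTask(Writer out) {
        this.out = out;
    }

    @Override
    int doLogic() {
        try {
            out.write("doc line\n");
            return 1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    void close() {
        try {
            out.close();
            closed = true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isClosed() {
        return closed;
    }
}

// Hypothetical runner: no separate Map of closeable classes, it simply asks every task to close.
final class SketchRunner {
    static int runAll(List<SketchPerfTask> tasks) {
        int count = 0;
        try {
            for (SketchPerfTask task : tasks) {
                count += task.doLogic();
            }
        } finally {
            for (SketchPerfTask task : tasks) {
                task.close();
            }
        }
        return count;
    }
}
```

With this shape, a dedicated Close task would just be another caller of close(); the runner does not need to know which tasks actually hold resources.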

[jira] Commented: (LUCENE-1591) Enable bzip compression in benchmark

2009-04-11 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698069#action_12698069 ] Shai Erera commented on LUCENE-1591: I wonder why does EnwikiDocMaker extend LineDocMa

[jira] Commented: (LUCENE-1591) Enable bzip compression in benchmark

2009-04-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698074#action_12698074 ] Michael McCandless commented on LUCENE-1591: bq. I wonder why does EnwikiDocMa

[jira] Commented: (LUCENE-1591) Enable bzip compression in benchmark

2009-04-11 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698077#action_12698077 ] Shai Erera commented on LUCENE-1591: resetInputs() is called from PerfRunData's ctor (

[jira] Commented: (LUCENE-1591) Enable bzip compression in benchmark

2009-04-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698079#action_12698079 ] Michael McCandless commented on LUCENE-1591: OK sounds good! > Enable bzip co

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698089#action_12698089 ] Michael McCandless commented on LUCENE-831: bq. Iterate iterate iterate I suppos

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-11 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698096#action_12698096 ] Mark Miller commented on LUCENE-831: Thanks Mike! Everything makes sense on first read,

(Benchmark) Split DocMaker into DocCollector and DocMaker

2009-04-11 Thread Shai Erera
Hi, I would like to propose some refactoring to the benchmark package. Today, DocMaker has two roles: collecting documents from a collection and preparing a Document object. I think these two should be split into DocCollector and DocMaker, where DocMaker uses a DocCollector instance. DocColle

Re: (Benchmark) Split DocMaker into DocCollector and DocMaker

2009-04-11 Thread Michael McCandless
Sounds great! As long as LineDocMaker still has very low overhead :) But how about the name RawContentSource (or maybe ContentSource) instead of DocCollector? Ie, it's the thing that pulls raw content from somewhere, and then DocMaker creates documents from it? Mike On Sat, Apr 11, 2009 at 11:

Re: (Benchmark) Split DocMaker into DocCollector and DocMaker

2009-04-11 Thread Shai Erera
ContentSource is also a good name. I was thinking that its main API will be getNextDocData (which already exists today) and will return DocData (as it does today). Then BasicDocMaker or DocStateDocMaker will translate it into DocState. getNextDocData will receive a DocData object to reuse (something
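
A minimal sketch of the split proposed in this thread, under loudly stated assumptions: the names ContentSource, DocMaker, DocData and getNextDocData come from the messages, but the field names, the null-on-exhaustion convention, ListContentSource and SketchDocMaker are hypothetical, and a plain Map stands in for a Lucene Document so the example has no dependencies.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the benchmark's DocData; the fields are assumptions.
final class DocData {
    String title;
    String body;
}

// Assumed shape of the proposed source: it only pulls raw content.
interface ContentSource {
    // Fills and returns the reuse object, or returns null when exhausted (an assumption).
    DocData getNextDocData(DocData reuse);
}

// Hypothetical in-memory source; each row is {title, body}.
final class ListContentSource implements ContentSource {
    private final Iterator<String[]> rows;

    ListContentSource(List<String[]> rows) {
        this.rows = rows.iterator();
    }

    @Override
    public DocData getNextDocData(DocData reuse) {
        if (!rows.hasNext()) {
            return null;
        }
        DocData data = (reuse != null) ? reuse : new DocData();
        String[] row = rows.next();
        data.title = row[0];
        data.body = row[1];
        return data;
    }
}

// Hypothetical doc maker: it turns raw DocData into a document and owns no I/O.
final class SketchDocMaker {
    private final ContentSource source;
    private final DocData scratch = new DocData(); // reused on every call to avoid allocation

    SketchDocMaker(ContentSource source) {
        this.source = source;
    }

    // Returns the next document as field name to value, or null when the source is exhausted.
    Map<String, String> makeDocument() {
        DocData data = source.getNextDocData(scratch);
        if (data == null) {
            return null;
        }
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", data.title);
        doc.put("body", data.body);
        return doc;
    }
}
```

Passing the same DocData back in on every call is one way to keep per-document overhead low, which is the concern raised about LineDocMaker earlier in the thread.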

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-11 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698101#action_12698101 ] Uwe Schindler commented on LUCENE-831: {quote} How about we create a ValueSource abst

[jira] Created: (LUCENE-1595) Split DocMaker into ContentSource and DocMaker

2009-04-11 Thread Shai Erera (JIRA)
Split DocMaker into ContentSource and DocMaker -- Key: LUCENE-1595 URL: https://issues.apache.org/jira/browse/LUCENE-1595 Project: Lucene - Java Issue Type: Improvement Components: contri

[jira] Commented: (LUCENE-1570) QueryParser.setAllowLeadingWildcard could provide finer granularity

2009-04-11 Thread Jonathan Watt (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698103#action_12698103 ] Jonathan Watt commented on LUCENE-1570: Err. s/Deki/Lucene/ :-) > QueryParser.setA

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698104#action_12698104 ] Michael McCandless commented on LUCENE-831: bq. E.g.: One have a CSF and a TrieF

[jira] Created: (LUCENE-1596) optimize MultiTermEnum/MultiTermDocs

2009-04-11 Thread Yonik Seeley (JIRA)
optimize MultiTermEnum/MultiTermDocs Key: LUCENE-1596 URL: https://issues.apache.org/jira/browse/LUCENE-1596 Project: Lucene - Java Issue Type: Improvement Components: Search Reporte

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-11 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698135#action_12698135 ] Mark Miller commented on LUCENE-831: Any ideas on where parser fits in with valuesource

[jira] Updated: (LUCENE-1596) optimize MultiTermEnum/MultiTermDocs

2009-04-11 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated LUCENE-1596: Attachment: LUCENE-1596.patch Attaching optimization patch. Results up front: random seeks to

[jira] Issue Comment Edited: (LUCENE-1596) optimize MultiTermEnum/MultiTermDocs

2009-04-11 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698136#action_12698136 ] Yonik Seeley edited comment on LUCENE-1596 at 4/11/09 2:16 PM: -

[jira] Issue Comment Edited: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-11 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698135#action_12698135 ] Mark Miller edited comment on LUCENE-831 at 4/11/09 2:28 PM: - A

[jira] Commented: (LUCENE-1596) optimize MultiTermEnum/MultiTermDocs

2009-04-11 Thread Paul Elschot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698140#action_12698140 ] Paul Elschot commented on LUCENE-1596: Do I interpret correctly that getting the doc

[jira] Commented: (LUCENE-1596) optimize MultiTermEnum/MultiTermDocs

2009-04-11 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698142#action_12698142 ] Yonik Seeley commented on LUCENE-1596: Yes, *if* you are doing low level stuff direc

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-11 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698151#action_12698151 ] Mark Miller commented on LUCENE-831: or parsing is just done by the FieldValue implemen

[jira] Commented: (LUCENE-1591) Enable bzip compression in benchmark

2009-04-11 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698161#action_12698161 ] Shai Erera commented on LUCENE-1591: Before I post a patch I wanted to test reading th

[jira] Updated: (LUCENE-1591) Enable bzip compression in benchmark

2009-04-11 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-1591: Attachment: ant-1.7.1.jar LUCENE-1591.patch The patch touches LineDocMaker, EnwikiDo