[jira] Updated: (LUCENE-1539) Improve Benchmark
[ https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1539:

Attachment: LUCENE-1539.patch

Attached new patch; fixed a bunch of silly issues (e.g. we had broken parsing of the readOnly option to OpenReaderTask; deletepercent.alg was opening readOnly readers to do the deletes; the readOnly option was ignored if you specified userData; etc.). I also switched the default for autoCommit to false when creating an IndexWriter.

I think it's ready to commit... I'll commit soon.

Improve Benchmark

Key: LUCENE-1539
URL: https://issues.apache.org/jira/browse/LUCENE-1539
Project: Lucene - Java
Issue Type: Improvement
Components: contrib/benchmark
Affects Versions: 2.4
Reporter: Jason Rutherglen
Assignee: Michael McCandless
Priority: Minor
Fix For: 2.9
Attachments: LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, sortBench2.py, sortCollate2.py
Original Estimate: 336h
Remaining Estimate: 336h

Benchmark can be improved by incorporating recent suggestions posted on java-dev. M. McCandless' Python scripts that execute multiple rounds of tests can either be incorporated into the codebase or converted to Java.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-1539) Improve Benchmark
Jason Rutherglen updated LUCENE-1539:

Attachment: LUCENE-1539.patch

Keeps previous deletes (no longer calls undeleteAll). When existing deletes are over 50%, we loop through TermDocs instead.
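One way to picture the two strategies in this message is the sketch below. It is not the patch's code: a `BitSet` stands in for the reader's deletions, the class and method names are hypothetical, and the reason given for the 50% switch (random draws increasingly hit docs that are already deleted) is an assumption on my part.

```java
import java.util.BitSet;
import java.util.Random;

/** Hypothetical model of adding deletes on top of existing ones; not Lucene code. */
final class IncrementalDeletes {
    /**
     * Deletes up to toDelete additional docs, keeping the existing deletes.
     * Returns the number of docs newly deleted.
     */
    static int deleteMore(BitSet deleted, int maxDoc, int toDelete, Random random) {
        int live = maxDoc - deleted.cardinality();
        int remaining = Math.min(toDelete, live);
        int done = 0;
        if (deleted.cardinality() * 2 > maxDoc) {
            // More than half the docs are gone: walk the live docs in order
            // (the role the message gives to TermDocs) instead of sampling.
            for (int doc = deleted.nextClearBit(0);
                 doc < maxDoc && done < remaining;
                 doc = deleted.nextClearBit(doc + 1)) {
                deleted.set(doc);
                done++;
            }
        } else {
            // Mostly live index: random draws rarely hit a deleted doc.
            while (done < remaining) {
                int doc = random.nextInt(maxDoc);
                if (!deleted.get(doc)) {
                    deleted.set(doc);
                    done++;
                }
            }
        }
        return done;
    }
}
```

For example, with docs 0 to 59 of 100 already deleted, `deleteMore(deleted, 100, 10, new Random())` takes the in-order branch and deletes docs 60 to 69.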
[jira] Updated: (LUCENE-1539) Improve Benchmark
Jason Rutherglen updated LUCENE-1539:

Attachment: LUCENE-1539.patch

Implemented the changes. Wasn't sure how to floor it.
[jira] Updated: (LUCENE-1539) Improve Benchmark
Michael McCandless updated LUCENE-1539:

Attachment: LUCENE-1539.patch

Updated patch:

* Switched to TermDocs to pick the deletes; I think this is sufficient (no floor is needed)
* Beefed up CHANGES
* Added a few more copyrights

I think it's ready to commit! I'll wait a day or two...
[jira] Updated: (LUCENE-1539) Improve Benchmark
Michael McCandless updated LUCENE-1539:

Attachment: LUCENE-1539.patch

Added a call to undeleteAll when you try to delete to an absolute percentage that is lower than the current percentage of deletions.
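The rule described in this message can be sketched roughly as follows. This is an illustration only, not the code in the patch: the deletions are modeled with a plain `BitSet`, the class and method names are hypothetical, and docs are deleted in doc-ID order purely to keep the example deterministic.

```java
import java.util.BitSet;

/** Hypothetical stand-in for a reader's deletion state; not Lucene code. */
final class DeleteToPercent {
    /**
     * Brings the deleted set to pct percent of maxDoc (rounded down).
     * Returns true if the existing deletes had to be cleared first.
     */
    static boolean deleteTo(BitSet deleted, int maxDoc, double pct) {
        int target = (int) (maxDoc * pct / 100.0);
        boolean reset = false;
        if (target < deleted.cardinality()) {
            // The target is below the current count, so start over
            // (the "undelete all" step from the message).
            deleted.clear();
            reset = true;
        }
        // Delete live docs in doc-ID order until the target count is reached.
        for (int doc = deleted.nextClearBit(0);
             doc < maxDoc && deleted.cardinality() < target;
             doc = deleted.nextClearBit(doc + 1)) {
            deleted.set(doc);
        }
        return reset;
    }
}
```

Calling `deleteTo` with 20, then 50, then 10 percent on a 100-doc index keeps the earlier deletes for the first two calls and resets only on the third.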
[jira] Updated: (LUCENE-1539) Improve Benchmark
Jason Rutherglen updated LUCENE-1539:

Attachment: LUCENE-1539.patch

Changed the deletes to be random and cleaned up the code. Running multiple passes of deletepercent.alg fails; I may have time to figure out why. As is, though, the patch works.
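As a rough illustration of picking deletes at random for a given percentage, here is a minimal sketch. It does not reproduce the patch: `RandomDeletes` and `pick` are hypothetical names, and a `BitSet` again stands in for the reader's deleted docs.

```java
import java.util.BitSet;
import java.util.Random;

/** Hypothetical helper that chooses which docs to delete; not Lucene code. */
final class RandomDeletes {
    /** Marks pct percent of maxDoc (rounded down) as deleted, at random. */
    static BitSet pick(int maxDoc, double pct, Random random) {
        // Clamp so a percentage above 100 cannot loop forever.
        int target = Math.min(maxDoc, (int) (maxDoc * pct / 100.0));
        BitSet deleted = new BitSet(maxDoc);
        while (deleted.cardinality() < target) {
            // A draw that hits an already deleted doc is a no-op.
            deleted.set(random.nextInt(maxDoc));
        }
        return deleted;
    }
}
```

Passing a seeded `Random` makes the chosen docs repeatable across benchmark rounds, which may or may not be what a given benchmark wants.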
[jira] Updated: (LUCENE-1539) Improve Benchmark
Jason Rutherglen updated LUCENE-1539:

Attachment: LUCENE-1539.patch

Above-mentioned issues fixed.

It seems a bit awkward that DeleteByPercentTask needs to call IR.undeleteAll before executing the deletes. It is also awkward that subsequent delete-by-percent calls in deletepercent.alg need to open the latest version of the index rather than the original (which does not have deletes). This is due to DirectoryIndexReader.acquireWriteLock checking to ensure that the latest version of the index is the one being locked. Perhaps we can relax this? I would rather be able to open a commit point, delete from the reader, and then flush as the latest version. Perhaps in flexible indexing we can have more customizability with the versioning?
[jira] Updated: (LUCENE-1539) Improve Benchmark
Jason Rutherglen updated LUCENE-1539:

Attachment: LUCENE-1539.patch

Fixed the above-mentioned problems. Once LUCENE-1516 is in, should we add the near-real-time benchmarks here?
[jira] Updated: (LUCENE-1539) Improve Benchmark
Jason Rutherglen updated LUCENE-1539:

Attachment: LUCENE-1539.patch

* Added deletepercent.alg as an example of these tasks
* CommitIndexTask commits an IndexWriter using a commit name
* OpenReaderTask opens a specific commit point by name
* FlushReaderTask flushes a reader using a commit name
* DeleteByPercentTask deletes a percentage of the reader's documents
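The idea of committing under a name and later opening that commit by name can be modeled as below. This is a toy in-memory sketch under my own assumptions, not the benchmark tasks themselves: `NamedCommits`, `Commit`, `commit` and `find` are hypothetical names, and a real implementation would look up the index's actual commit points rather than a list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of named commit points; not Lucene code. */
final class NamedCommits {
    static final class Commit {
        final long generation;
        final String name;

        Commit(long generation, String name) {
            this.generation = generation;
            this.name = name;
        }
    }

    private final List<Commit> commits = new ArrayList<Commit>();

    /** Records a new commit under the given name. */
    Commit commit(String name) {
        Commit c = new Commit(commits.size() + 1, name);
        commits.add(c);
        return c;
    }

    /** Returns the most recent commit with that name, or null if none exists. */
    Commit find(String name) {
        for (int i = commits.size() - 1; i >= 0; i--) {
            if (name.equals(commits.get(i).name)) {
                return commits.get(i);
            }
        }
        return null;
    }
}
```

With commits named "original" and then "5%", `find("original")` returns the first commit even though it is no longer the latest, which is the behavior an open-by-name task needs.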
[jira] Updated: (LUCENE-1539) Improve Benchmark
Jason Rutherglen updated LUCENE-1539:

Attachment: LUCENE-1539.patch

The patch adds CreateWikiIndex, which creates enwiki indexes with multiple percentages of deletes. It probably needs to be made into a task, or multiple tasks, along with an .alg file. One goal is to evolve this patch to enable concurrent indexing and searching.

I can see the elegance of using Python scripts: they are easy to edit, and the pickling is nice. Equivalent Java code could be fairly lengthy. However, since this is a Java project and we have a framework with the .alg files for defining some level of external operations, it seems we may want to figure out a way to put the Python script functionality into tasks that are defined by .alg files.
[jira] Updated: (LUCENE-1539) Improve Benchmark
Jason Rutherglen updated LUCENE-1539:

Attachment: sortCollate2.py
            sortBench2.py

Python scripts attached.