[jira] Updated: (LUCENE-1539) Improve Benchmark

2009-06-14 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1539:
---

Attachment: LUCENE-1539.patch

Attached new patch; fixed a bunch of silly issues (e.g., we had broken
parsing of the readOnly option to OpenReaderTask; deletepercent.alg
was opening readOnly readers to do the deletes; the readOnly option
was ignored if you specified userData; etc.).
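
The readOnly/userData interaction mentioned above can be illustrated with
a small parser. This is only a sketch: the ReaderParams class, the
"readOnly[,userData]" parameter layout, and the read-only default are
assumptions made for illustration, not the code in the patch.

```java
// Hypothetical illustration; not the code from LUCENE-1539.patch.
final class ReaderParams {
    final boolean readOnly;
    final String userData; // null when no commit user data was given

    private ReaderParams(boolean readOnly, String userData) {
        this.readOnly = readOnly;
        this.userData = userData;
    }

    // Assumed format: "readOnly[,userData]", e.g. "false,original".
    static ReaderParams parse(String params) {
        boolean readOnly = true; // assumed default when nothing is specified
        String userData = null;
        if (params != null && !params.trim().isEmpty()) {
            String[] parts = params.split(",", 2);
            // The flag is read whether or not user data follows it.
            readOnly = Boolean.parseBoolean(parts[0].trim());
            if (parts.length > 1 && !parts[1].trim().isEmpty()) {
                userData = parts[1].trim();
            }
        }
        return new ReaderParams(readOnly, userData);
    }
}
```

With this layout, ReaderParams.parse("false,original") yields a writable
reader request for the commit labelled "original", so a task that deletes
documents is not handed a read-only reader.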

I also switched the default for autoCommit to false when creating an
IndexWriter.

I think it's ready to commit... I'll commit soon.

 Improve Benchmark
 -

 Key: LUCENE-1539
 URL: https://issues.apache.org/jira/browse/LUCENE-1539
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/benchmark
Affects Versions: 2.4
Reporter: Jason Rutherglen
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, 
 LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, 
 LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, sortBench2.py, 
 sortCollate2.py

   Original Estimate: 336h
  Remaining Estimate: 336h

 Benchmark can be improved by incorporating recent suggestions posted
 on java-dev. M. McCandless' Python scripts that execute multiple
 rounds of tests can either be incorporated into the codebase or
 converted to Java.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-1539) Improve Benchmark

2009-06-12 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated LUCENE-1539:
-

Attachment: LUCENE-1539.patch

Keeps previous deletes (doesn't call undeleteAll).  When existing deletes are 
over 50%, we loop through TermDocs instead.




[jira] Updated: (LUCENE-1539) Improve Benchmark

2009-06-12 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated LUCENE-1539:
-

Attachment: LUCENE-1539.patch

Implemented the changes.  Wasn't sure how to floor it.  




[jira] Updated: (LUCENE-1539) Improve Benchmark

2009-06-12 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1539:
---

Attachment: LUCENE-1539.patch

Updated patch:

  * Switched to TermDocs to pick the deletes; I think this is sufficient (no 
floor is needed)

  * Beefed up CHANGES

  * Added a few more copyrights

I think it's ready to commit!  I'll wait a day or two...




[jira] Updated: (LUCENE-1539) Improve Benchmark

2009-06-12 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1539:
---

Attachment: LUCENE-1539.patch

Added a call to undeleteAll if you try to delete to an absolute percentage 
lower than the current percentage of deletions.
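
The behaviour described above (keep earlier deletes, delete random
documents up to an absolute percentage, and undelete everything first when
the target is lower than what is already deleted) can be sketched without
Lucene. The DeleteByPercentSketch class and its BitSet model of deleted
documents are assumptions for illustration; the patch itself works against
an IndexReader.

```java
import java.util.BitSet;
import java.util.Random;

// Hypothetical in-memory stand-in for a reader's deletion state.
final class DeleteByPercentSketch {
    private final int maxDoc;
    private final BitSet deleted;
    private final Random random;

    DeleteByPercentSketch(int maxDoc, long seed) {
        this.maxDoc = maxDoc;
        this.deleted = new BitSet(maxDoc);
        this.random = new Random(seed);
    }

    int numDeleted() {
        return deleted.cardinality();
    }

    boolean isDeleted(int doc) {
        return deleted.get(doc);
    }

    // Brings the deleted fraction to an absolute percentage of maxDoc.
    void deleteToPercent(double percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in [0, 100]");
        }
        int target = (int) Math.round(maxDoc * percent / 100.0);
        if (target < numDeleted()) {
            deleted.clear(); // plays the role of undeleteAll
        }
        while (numDeleted() < target) {
            // Picking an already-deleted doc leaves the count unchanged,
            // so the loop simply draws again.
            deleted.set(random.nextInt(maxDoc));
        }
    }
}
```

Going from 5% to 20% only adds deletes on top of the existing ones, while
going from 20% back to 10% starts over from a clean slate.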




[jira] Updated: (LUCENE-1539) Improve Benchmark

2009-06-11 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated LUCENE-1539:
-

Attachment: LUCENE-1539.patch

Changed the deletes to be random and cleaned up the code.

Multiple passes of deletepercent.alg fail. I may have time to figure out why; 
as is, though, the patch works.




[jira] Updated: (LUCENE-1539) Improve Benchmark

2009-04-08 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated LUCENE-1539:
-

Attachment: LUCENE-1539.patch

Above-mentioned issues fixed.

It seems a bit awkward that DeleteByPercentTask needs to call
IndexReader.undeleteAll before executing the deletes. It is also
awkward that subsequent delete-by-percent calls in deletepercent.alg
need to open the latest version of the index rather than the original
(which does not have deletes). This is due to
DirectoryIndexReader.acquireWriteLock checking to ensure the
latest version of the index is locked. Perhaps we can relax
this? I would rather be able to open a commit point, delete
from the reader, and then flush as the latest version.
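
To make the constraint concrete, here is a toy model of the version check
described above. Everything in it (the StaleCheckSketch, Index, and Reader
classes and the single version counter) is an assumption for illustration;
it is not Lucene's implementation, only the shape of the rule that a reader
opened on an older commit cannot take the write lock.

```java
// Hypothetical model of the version check; not Lucene's implementation.
final class StaleCheckSketch {
    // Stands in for the index directory: tracks the newest committed version.
    static final class Index {
        long latestVersion = 1;
    }

    static final class Reader {
        private final Index index;
        private long version;
        private boolean dirty;

        Reader(Index index, long version) {
            this.index = index;
            this.version = version;
        }

        // Deleting requires the reader to be on the latest version.
        void delete() {
            if (version != index.latestVersion) {
                throw new IllegalStateException("reader is stale: opened on version "
                        + version + ", latest is " + index.latestVersion);
            }
            dirty = true;
        }

        // Flushing pending deletes publishes a new latest version.
        void flush() {
            if (dirty) {
                index.latestVersion++;
                version = index.latestVersion;
                dirty = false;
            }
        }
    }
}
```

In this model, once one reader has flushed its deletes, a second reader
opened on the original version fails in delete(), which is why each later
step in the alg file has to open the latest version instead.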

Perhaps in flexible indexing we can have more customizability
with the versioning? 




[jira] Updated: (LUCENE-1539) Improve Benchmark

2009-04-07 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated LUCENE-1539:
-

Attachment: LUCENE-1539.patch

Fixed the above-mentioned problems.  When LUCENE-1516 is in, should we add the 
near-real-time benchmarks here?




[jira] Updated: (LUCENE-1539) Improve Benchmark

2009-03-12 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated LUCENE-1539:
-

Attachment: LUCENE-1539.patch

* Added deletepercent.alg as an example of these tasks
* CommitIndexTask commits an IndexWriter using a commit name
* OpenReaderTask opens a specific commit point by name
* FlushReaderTask flushes a reader using a commit name
* DeleteByPercentTask deletes a percentage of reader documents
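
The commit-name idea behind the tasks listed above can be sketched as a
lookup over labelled commits. The NamedCommitsSketch class, its generation
counter, and the choice to return the most recent match are assumptions for
illustration; the patch works with Lucene's own commit points.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of commits labelled with a name.
final class NamedCommitsSketch {
    static final class Commit {
        final long generation;
        final String name;

        Commit(long generation, String name) {
            this.generation = generation;
            this.name = name;
        }
    }

    private final List<Commit> commits = new ArrayList<>();
    private long nextGeneration = 1;

    // Records a new commit under the given name and returns it.
    Commit commit(String name) {
        Commit c = new Commit(nextGeneration++, name);
        commits.add(c);
        return c;
    }

    // Returns the most recent commit with this name, or null if none exists.
    Commit find(String name) {
        for (int i = commits.size() - 1; i >= 0; i--) {
            if (commits.get(i).name.equals(name)) {
                return commits.get(i);
            }
        }
        return null;
    }
}
```

A run might commit the freshly built index as "original", commit again as
"5%" after deleting, and later ask find("original") to reopen the index as
it was before any deletes.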





[jira] Updated: (LUCENE-1539) Improve Benchmark

2009-02-13 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated LUCENE-1539:
-

Attachment: LUCENE-1539.patch

The patch adds CreateWikiIndex, which creates enwiki indexes with
multiple percentages of deletes. It probably needs to be made into a
task or multiple tasks along with an alg file. One goal is to evolve
this patch to enable concurrent indexing and searching. 

I can see the elegance of using Python scripts because they're easy to
edit, and the pickling is nice. Equivalent Java code could be fairly
lengthy. However, since this is a Java project and we have a framework
with the .alg files for defining some level of external operations,
it seems we may want to figure out a way to put the Python script
functionality into tasks defined by .alg files. 
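
As a rough idea of what the Java side of a multi-round run could look
like, here is a minimal sketch. The attached scripts are not reproduced
here; the RoundsSketch class, its method names, and the choice to report
the best round are assumptions for illustration only.

```java
// Hypothetical multi-round timing helper; not taken from the attached scripts.
final class RoundsSketch {
    // Runs the task `rounds` times and returns each round's elapsed nanoseconds.
    static long[] runRounds(Runnable task, int rounds) {
        if (rounds <= 0) {
            throw new IllegalArgumentException("rounds must be positive");
        }
        long[] elapsed = new long[rounds];
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            task.run();
            elapsed[i] = System.nanoTime() - start;
        }
        return elapsed;
    }

    // Returns the fastest round, one common way to summarize repeated runs.
    static long best(long[] elapsedNanos) {
        long best = Long.MAX_VALUE;
        for (long e : elapsedNanos) {
            best = Math.min(best, e);
        }
        return best;
    }
}
```

Whether the summary should be the best round, the mean, or something else
is a design choice this sketch leaves open.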





[jira] Updated: (LUCENE-1539) Improve Benchmark

2009-02-13 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated LUCENE-1539:
-

Attachment: sortCollate2.py
sortBench2.py

Python scripts attached.  
