Re: threads & benchmark contrib

2007-06-22 Thread Doron Cohen
Mike, I didn't anticipate this use case and I think it would not work correctly. I'll look into this. Anyhow, I think it would not work as you expect. It seems what you want is to have 4 threads, adding docs in parallel, until the doc maker is exhausted. But this line: {[AddDoc(4000)]: 4} : *

[jira] Created: (LUCENE-941) Benchmark alg line - {[AddDoc(4000)]: 4} : * - causes an infinite loop

2007-06-22 Thread Doron Cohen (JIRA)
Benchmark alg line - {[AddDoc(4000)]: 4} : * - causes an infinite loop -- Key: LUCENE-941 URL: https://issues.apache.org/jira/browse/LUCENE-941 Project: Lucene - Java Issue

[jira] Created: (LUCENE-940) SimpleDateFormat used in a non thread safe manner

2007-06-22 Thread Doron Cohen (JIRA)
SimpleDateFormat used in a non thread safe manner - Key: LUCENE-940 URL: https://issues.apache.org/jira/browse/LUCENE-940 Project: Lucene - Java Issue Type: Bug Components: contrib/be
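
For context on the issue title: a SimpleDateFormat instance keeps mutable internal state, so sharing one instance across threads without synchronization can produce wrong output. The sketch below shows one common workaround, a per-thread instance. It is not the fix attached to LUCENE-940; the class name PerThreadDateFormat, the date pattern, and the UTC time zone are assumptions chosen for illustration, and it uses ThreadLocal.withInitial, which needs a newer Java than the versions discussed in these threads.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper for illustration; not taken from the Lucene benchmark code.
final class PerThreadDateFormat {
    // Each thread lazily creates and then reuses its own SimpleDateFormat,
    // so no formatter instance is ever shared between threads.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
            f.setTimeZone(TimeZone.getTimeZone("UTC")); // assumed zone, for stable output
            return f;
        });

    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    public static void main(String[] args) throws InterruptedException {
        // Two threads format the same instant, each through its own formatter.
        Runnable task = () -> System.out.println(format(new Date(0L)));
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

Synchronizing on a single shared formatter would also work; the per-thread variant simply avoids lock contention when many threads format dates at once.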

[jira] Commented: (LUCENE-937) Make CachingTokenFilter faster

2007-06-22 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507560 ] Michael Busch commented on LUCENE-937: -- Mark, the new patch looks good to me! I'm going to commit this soon. A

Build failed in Hudson: Lucene-Nightly #130

2007-06-22 Thread hudson
See http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/130/changes Changes: [buschmi] add 2.2 release to doap file -- [...truncated 4124 lines...] [junit] Writing files byte by byte [junit] 1551 total milliseconds to read, 4885 kb/s

[jira] Closed: (LUCENE-673) Exceptions when using Lucene over NFS

2007-06-22 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless closed LUCENE-673. - Resolution: Fixed Fix Version/s: 2.2 This issue is now resolved by both LUCENE-701

threads & benchmark contrib

2007-06-22 Thread Michael McCandless
Hi, I'm trying to test LUCENE-843 (IndexWriter speedups) on Wikipedia using the benchmark contrib framework plus the patch from LUCENE-848. I downloaded an older Wikipedia export (the "latest" doesn't seem to exist) and got it un-tar'd. The test I'd like to run is to use 4 threads to index

Re: [jira] Commented: (LUCENE-843) improve how IndexWriter uses RAM to buffer added documents

2007-06-22 Thread Michael McCandless
Hi Grant, The benchmarking code I've been using is in all but the first & last patches I attached on LUCENE-843. Really it's just a modified version of the demo IndexFiles code, plus a new analyzer (SimpleSpaceAnalyzer) that is the same as WhitespaceAnalyzer except it re-uses Token/String instea

Re: [jira] Commented: (LUCENE-843) improve how IndexWriter uses RAM to buffer added documents

2007-06-22 Thread Grant Ingersoll
Hi Michael, I know you've got your hands full, but was wondering if you could either post your benchmark code, or better yet, hook it into the benchmarker contrib (it is quite easy). Let me know if I can help, Grant On Jun 21, 2007, at 10:01 AM, Michael McCandless (JIRA) wrote: [ ht

[jira] Commented: (LUCENE-892) CompoundFileReader's openInput produces streams that may do an extra buffer copy

2007-06-22 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507466 ] Michael McCandless commented on LUCENE-892: --- It looks like the double copying only happens in certain limit

Re: [jira] Commented: (LUCENE-937) Make CachingTokenFilter faster

2007-06-22 Thread Sean Timm
HashMap defaults to 16, but ArrayList defaults to 10. He probably got those confused. I'm not sure how Sun came up with the default values. -Sean Mark Miller (JIRA) wrote: A friend was recently telling me that ArrayList defaulted to 16, but it does not -- it defaults to 10. He must have bee

Re: [ANN] Luke 0.7.1 released

2007-06-22 Thread Andrzej Bialecki
Steven Rowe wrote: Hi Andrzej, Andrzej Bialecki wrote: Luke still requires 1.5, because that's what Lucene requires. Lucene core requires 1.4, not 1.5. Indeed! I had a vague recollection that it requires 1.5, probably due to the gdata-server contrib module ... but I just checked it, and yo

Re: [ANN] Luke 0.7.1 released

2007-06-22 Thread Steven Rowe
Hi Andrzej, Andrzej Bialecki wrote: > Luke still requires 1.5, because that's what Lucene requires. Lucene core requires 1.4, not 1.5. Steve -- Steve Rowe Center for Natural Language Processing http://www.cnlp.org/tech/lucene.asp ---

Re: [ANN] Luke 0.7.1 released

2007-06-22 Thread Andrzej Bialecki
Andrzej Bialecki wrote: Mark Miller wrote: I think it was probably compiled with Java 1.6. 1.5 does not work, but 1.6 does. Ah, yes - sorry, I forgot that 1.6 is the default in my environment. There is nothing specific in Luke that would require 1.6 - I'll recompile it and upload an updated

[jira] Updated: (LUCENE-937) Make CachingTokenFilter faster

2007-06-22 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-937: --- Attachment: CachingTokenFilterRev2.patch I have the reset method check if the cache is null before cr

[jira] Updated: (LUCENE-937) Make CachingTokenFilter faster

2007-06-22 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-937: --- Description: The LinkedList used by CachingTokenFilter is accessed using the get() method. Direct a
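
To illustrate the access pattern the description refers to: calling get(i) on a java.util.LinkedList walks the list on every call, so an index-based loop over n cached items costs on the order of n^2 steps, while an iterator visits each node once. The sketch below is not the LUCENE-937 patch and does not use CachingTokenFilter itself; the class CacheReplay and its String elements are hypothetical stand-ins for a cache of tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for replaying a token cache; not Lucene code.
final class CacheReplay {
    // Index-based replay: every get(i) traverses the linked list,
    // so the whole loop is quadratic in the cache size.
    static List<String> replayWithGet(LinkedList<String> cache) {
        List<String> out = new ArrayList<>(cache.size());
        for (int i = 0; i < cache.size(); i++) {
            out.add(cache.get(i));
        }
        return out;
    }

    // Iterator-based replay: a single pass over the nodes, linear in the cache size.
    static List<String> replayWithIterator(LinkedList<String> cache) {
        List<String> out = new ArrayList<>(cache.size());
        Iterator<String> it = cache.iterator();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        LinkedList<String> cache = new LinkedList<>(Arrays.asList("quick", "brown", "fox"));
        // Both strategies return the same elements in the same order.
        System.out.println(replayWithGet(cache).equals(replayWithIterator(cache)));
    }
}
```

Both methods return the same sequence; only the cost of producing it differs, and the difference grows with the number of cached items.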

[jira] Commented: (LUCENE-937) Make CachingTokenFilter faster

2007-06-22 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507334 ] Mark Miller commented on LUCENE-937: Well this is embarrassing. I messed up the implementation of the iterator a

Re: [ANN] Luke 0.7.1 released

2007-06-22 Thread Andrzej Bialecki
Mark Miller wrote: I think it was probably compiled with Java 1.6. 1.5 does not work, but 1.6 does. Ah, yes - sorry, I forgot that 1.6 is the default in my environment. There is nothing specific in Luke that would require 1.6 - I'll recompile it and upload an updated package. -- Best regar

Re: [ANN] Luke 0.7.1 released

2007-06-22 Thread Mark Miller
I think it was probably compiled with Java 1.6. 1.5 does not work, but 1.6 does. - Mark Ard Schrijvers wrote: Hello I seem to get an error running the latest lukeall jar. Previous versions just work. See error below (java version "1.5.0_12"). lukeall-0.7.1.jar runs fine regards Ard ps sry

RE: [ANN] Luke 0.7.1 released

2007-06-22 Thread Ard Schrijvers
Hello I seem to get an error running the latest lukeall jar. Previous versions just work. See error below (java version "1.5.0_12"). lukeall-0.7.1.jar runs fine regards Ard ps sry for not properly indenting mail due to webmail C:\Tools>java -jar lukeall-0.7.1.jar Exception in thread "main" ja

[ANN] Luke 0.7.1 released

2007-06-22 Thread Andrzej Bialecki
Hi all, I just released Luke 0.7.1, the Lucene Index Toolbox. As usual, you can get it here: http://www.getopt.org/luke/ This minor release is mostly an upgrade to the official Lucene 2.2.0 release JARs. The following changes have been made in this release: * Added a term distributio

[jira] Commented: (LUCENE-937) Make CachingTokenFilter faster

2007-06-22 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507233 ] Mark Miller commented on LUCENE-937: I am testing on Java 1.5 I tested with LinkedList using get, LinkedList usi