Thanks for bringing closure!

Mike

On Wed, Jun 10, 2009 at 4:42 AM, Mateusz Berezecki<mateu...@gmail.com> wrote:
> Hi list!
>
> I'm forwarding as somehow I did not put the list in the CC but the
> answer I think is noteworthy, so here it is. Please remember to use
> StringBuffer before blaming lucene ;-)
>
> Actual time consumed by lucene is now ~130 minutes as opposed to 20
> hours which is neat. I can do much more passes during the day.
>
> Mateusz
>
> On Wed, Jun 10, 2009 at 2:31 AM, Michael
> McCandless<luc...@mikemccandless.com> wrote:
>> Oh actually could you send this response to the full list (so they see
>> closure too)? Thanks.
>>
>> Mike
>>
>> On Tue, Jun 9, 2009 at 6:32 PM, Mateusz Berezecki<mateu...@gmail.com> wrote:
>>> Hi Michael,
>>>
>>> It took a while but thanks to your suggestions I've started poking
>>> around and it turned out that the bottleneck was with GC for JVM and
>>> in my use of String instead of StringBuffer. The thing is now over 20
>>> times faster !
>>>
>>> Thanks a lot !
>>>
>>> Mateusz
>>>
>>> On Mon, Jun 8, 2009 at 2:13 PM, Michael
>>> McCandless<luc...@mikemccandless.com> wrote:
>>>> On Mon, Jun 8, 2009 at 7:54 AM, Mateusz Berezecki<mateu...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks for a prompt response.
>>>>
>>>> You're welcome!
>>>>
>>>>>> A mergeFactor of 150 is way too high; I'd put that back to 10 and see
>>>>>> if the problem persists. Also make sure you're using
>>>>>> autoCommit=false, and try the suggestions here:
>>>>>>
>>>>>> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
>>>>>
>>>>> I've set mergeFactor to 10, 15 and 20 before trying out 150 and the
>>>>> problem persisted, although I have to admit that 2.9 gives some
>>>>> serious speed improvements as compared to 2.4.1 which I believe is a
>>>>> good sign, i.e. it reaches the same document that causes deadlock much
>>>>> faster than 2.4.1 does
>>>>
>>>> Hmm: do you know for certain that a particular document causes this?
>>>> If you make a standalone test indexing only that document, does the
>>>> problem happen?
>>>>
>>>>>> You're sure the JRE's heap size is big enough?
>>>>>
>>>>> I've set it to 3.8 GB and I'm running it on a desktop with 4 GB of RAM.
>>>>
>>>> OK sounds like plenty, though likely the OS won't give you 3.8 GB (if
>>>> the JRE is 32-bit).
>>>>
>>>>>> If the problem persists... can you turn on IndexWriter's infoStream
>>>>>> and post the resulting output leading up to the 100% CPU? You might
>>>>>> also try "kill -QUIT" when the 100% CPU problem is happening, to catch
>>>>>> the stack trace of all threads, and post that too...
>>>>>
>>>>> Not sure how do I turn on the infoStream and autoCommit? WRT to
>>>>> autoCommit I did not use the deprecated API with autoCommit flags in
>>>>> constructors, so assuming I used the recommended API is the autoCommit
>>>>> on/off by default?
>>>>
>>>> For infoStream, eg: IndexWriter.setInfoStream(System.out);
>>>>
>>>> And, yes, since you're using a non-deprecated ctor of IndexWriter, you
>>>> are getting autoCommit=false, so that's good.
>>>>
>>>> Mike
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
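
For anyone finding this thread later, here is a minimal sketch of the String vs StringBuffer difference Mateusz describes. His actual code was not posted, so the class and method names below (ConcatDemo, joinWithString, joinWithBuffer) are hypothetical and only illustrate the general pattern: building up text with += on a String allocates a new String on every step, which can keep the GC very busy on large inputs, while StringBuffer appends into one growing buffer. On Java 5 or later, StringBuilder is the unsynchronized equivalent and is usually preferred when only one thread touches the buffer.

```java
import java.util.Arrays;
import java.util.List;

public class ConcatDemo {

    // Slow pattern: every += copies the whole text built so far into a
    // brand-new String, so the work and the garbage grow quadratically.
    public static String joinWithString(List<String> parts) {
        String text = "";
        for (String part : parts) {
            text += part;
            text += " ";
        }
        return text;
    }

    // Faster pattern: StringBuffer appends into one internal buffer and
    // only creates the final String once, in toString().
    public static String joinWithBuffer(List<String> parts) {
        StringBuffer buffer = new StringBuffer();
        for (String part : parts) {
            buffer.append(part).append(' ');
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Illustrative input only; real field text would be much larger.
        List<String> parts = Arrays.asList("some", "field", "text");
        // Both methods build the same text; only the allocation behaviour differs.
        System.out.println(joinWithString(parts).equals(joinWithBuffer(parts)));
    }
}
```

The two methods return identical text, so swapping one for the other should not change what gets indexed, only how much garbage is created while building it.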