[ https://issues.apache.org/jira/browse/LUCENE-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-1119:
---------------------------------------

    Attachment: LUCENE-1119.patch

Attached patch. I plan to commit in a day or two.

> Optimize TermInfosWriter.add
> ----------------------------
>
>                 Key: LUCENE-1119
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1119
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-1119.patch
>
>
> I found one more optimization, in how terms are written in
> TermInfosWriter. Previously, each term required a new Term() and a
> new String(). Looking at the cpu time (using YourKit), I could see
> this was adding a non-trivial cost to flush() when indexing Wikipedia.
> I changed TermInfosWriter.add to accept char[] directly, instead.
> I ran a quick test building first 200K docs of Wikipedia. With this
> fix it took 231.31 sec (best of 3) and without the fix it took 236.05
> sec (best of 3) = ~2% speedup.
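
For readers who want to see the general idea behind the change described in the issue, here is a minimal sketch of a writer whose add method takes a char[] and a length instead of a freshly allocated Term and String. It is not the code in LUCENE-1119.patch and not Lucene's TermInfosWriter: the class name PrefixTermWriter, the sharedPrefix helper, and the prefix-sharing output layout are all assumptions made for illustration. The point it tries to show is that the previous term can be kept in one reusable buffer, so adding a term allocates nothing in the common case.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical sketch only; this is not Lucene's TermInfosWriter. */
final class PrefixTermWriter {
    private final DataOutputStream out;

    // Reusable buffer holding the previously written term (assumed design).
    private char[] lastText = new char[16];
    private int lastLength = 0;

    PrefixTermWriter(DataOutputStream out) {
        this.out = out;
    }

    /** Number of leading chars that a[0..aLen) and b[0..bLen) have in common. */
    static int sharedPrefix(char[] a, int aLen, char[] b, int bLen) {
        int limit = Math.min(aLen, bLen);
        int i = 0;
        while (i < limit && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    /**
     * Writes text[0..length) as (prefix length, suffix length, suffix chars).
     * No Term or String object is created for the incoming term.
     */
    void add(char[] text, int length) throws IOException {
        int prefix = sharedPrefix(lastText, lastLength, text, length);
        out.writeInt(prefix);
        out.writeInt(length - prefix);
        for (int i = prefix; i < length; i++) {
            out.writeChar(text[i]);
        }

        // Grow the buffer only when the new term does not fit, then copy in place.
        if (lastText.length < length) {
            lastText = new char[Math.max(length, lastText.length * 2)];
        }
        System.arraycopy(text, 0, lastText, 0, length);
        lastLength = length;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PrefixTermWriter writer = new PrefixTermWriter(new DataOutputStream(bytes));
        char[] buffer = "apple".toCharArray();
        writer.add(buffer, buffer.length);
        buffer[4] = 'y'; // the caller reuses its own buffer: now "apply"
        writer.add(buffer, buffer.length);
        System.out.println("bytes written: " + bytes.size());
    }
}
```

In this sketch the caller can also reuse its own char[] between calls, because add copies the term into lastText before returning. Whether the actual patch handles the previous term this way is not stated in the issue.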