[ https://issues.apache.org/jira/browse/LUCENE-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836352#action_12836352 ]
Michael McCandless commented on LUCENE-2269:
--------------------------------------------

Patch looks great! This speeds up the test from 16.8 sec -> 1.5 sec for me.

Only thing is, I think you don't have to unzip yourself -- benchmark can decompress .bz2 itself on the fly.

> don't download/extract 20,000 files when doing the build
> --------------------------------------------------------
>
>                 Key: LUCENE-2269
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2269
>             Project: Lucene - Java
>          Issue Type: Test
>          Components: Build
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>            Priority: Trivial
>             Fix For: 3.1
>
>         Attachments: LUCENE-2269.patch, reuters.578.lines.zip
>
>
> When you build lucene, it downloads and extracts some data for
> contrib/benchmark, especially the 20,000+ files for the reuters corpus.
> this is only needed for one test, and these 20,000 files drive IDEs and such
> crazy.
> instead of doing this by default, we should only download/extract data if you
> specifically ask (like wikipedia, collation do, etc)
> for the qualityrun test, instead use a linedoc formatted 587-line text file,
> similar to reuters.first20.lines.txt already used by benchmark.