[
https://issues.apache.org/jira/browse/LUCENE-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved LUCENE-947.
---------------------------------------
Resolution: Fixed
Fix Version/s: 2.3
Lucene Fields: [New, Patch Available] (was: [Patch Available, New])
> Some improvements to contrib/benchmark
> --------------------------------------
>
> Key: LUCENE-947
> URL: https://issues.apache.org/jira/browse/LUCENE-947
> Project: Lucene - Java
> Issue Type: Improvement
> Components: contrib/benchmark
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Priority: Minor
> Fix For: 2.3
>
> Attachments: LUCENE-947.patch, LUCENE-947.take2.patch,
> LUCENE-947.take3.patch, LUCENE-947.take4.patch, LUCENE-947.take5.patch
>
>
> I've made some small improvements to the contrib/benchmark, mostly
> merging in the ad-hoc benchmarking code I've been using in LUCENE-843:
> - Fixed thread safety of DirDocMaker's usage of SimpleDateFormat
> - Print the props in sorted order
> - Added new config "autocommit=true|false" to CreateIndexTask
> - Added new config "ram.flush.mb=int" to AddDocTask
> - Added new configs "doc.term.vector.positions=true|false" and
> "doc.term.vector.offsets=true|false" to BasicDocMaker
> - Added WriteLineDocTask.java, so you can make an alg that uses this
> to build up a single file containing one document per line in a
> single file. EG this alg converts the reuters-out tree into a
> single file that has ~1000 bytes per body field, saved to
> work/reuters.1000.txt:
> docs.dir=reuters-out
> doc.maker=org.apache.lucene.benchmark.byTask.feeds.DirDocMaker
> line.file.out=work/reuters.1000.txt
> doc.maker.forever=false
> {WriteLineDoc(1000)}: *
> Each line has tab-separted TITLE, DATE, BODY fields.
> - Created feeds/LineDocMaker.java that creates documents read from
> the file created by WriteLineDocTask.java. EG this alg indexes
> all documents created above:
> analyzer=org.apache.lucene.analysis.SimpleAnalyzer
> directory=FSDirectory
> doc.add.log.step=500
> docs.file=work/reuters.1000.txt
> doc.maker=org.apache.lucene.benchmark.byTask.feeds.LineDocMaker
> doc.tokenized=true
> doc.maker.forever=false
> ResetSystemErase
> CreateIndex
> {AddDoc}: *
> CloseIndex
> RepSumByPref AddDoc
> I'll attach initial patch shortly.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]