[ https://issues.apache.org/jira/browse/LUCENE-849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doron Cohen resolved LUCENE-849. -------------------------------- Resolution: Fixed Lucene Fields: [Patch Available] (was: [Patch Available, New]) Committed. > contrib/benchmark: configurable HTML Parser, external classes to path, > exhaustive doc maker > -------------------------------------------------------------------------------------------- > > Key: LUCENE-849 > URL: https://issues.apache.org/jira/browse/LUCENE-849 > Project: Lucene - Java > Issue Type: Improvement > Components: contrib/benchmark > Reporter: Doron Cohen > Assigned To: Doron Cohen > Priority: Minor > Attachments: 849-bench-parse-exhaust.patch > > > "doc making" enhancements: > 1. Allow configurable html parser, with a new html.parser property. > Currently TrecDocMaker is using the Demo html parser. With this new property > this can be overridden. > 2. allow to add external class path, so the benchmark can be used with > modified makers/parsers without having to add code to Lucene. > Run benchmark with e.g. "ant run-task > -Dbenchmark.ext.classpath=/myproj/myclasses" > 3. allow to crawl a doc maker until exhausting all its files/docs once, > without having to know in advance how many docs it can make. > This can be useful for instance if the input data is in zip files. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]