[
https://issues.apache.org/jira/browse/LUCENE-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated LUCENE-2923:
---------------------------------------
Attachment: LUCENE-2923.patch
OK, new patch, fixing a number of things:
* I close the Reader (thanks Mark).
* I cut over to NumericField (and stopped using DateTools) for the
"modified" field.
* I added a -create option to IndexFiles, so you can see how to
use CREATE vs CREATE_OR_APPEND.
* I left commented-out optional things -- calling optimize,
increasing IW's RAM buffer.
* Don't use Version.LUCENE_CURRENT.
* I sucked in test files from Lucene in Action 2E's tests (open
source licenses).
* I use addDocument or updateDocument depending on -create.
* I made the "demo html parser" private to modules/benchmark, which
had a dependency on it. Can someone look over my changes to the
build xml files? (Especially the Maven part, where I completely
guessed!).
* IndexHTML is gone, and the webapp (src/jsp/*) is gone too.
To apply the patch you first have to do this:
{noformat}
svn mv lucene/contrib/benchmark/src/java/org/apache/lucene/demo/html \
  modules/benchmark/src/java/org/apache/lucene/benchmark/byTask/feeds/demohtml
svn mv lucene/contrib/demo/src/test/org/apache/lucene/demo/html \
  modules/benchmark/src/test/org/apache/lucene/benchmark/byTask/feeds/demohtml
{noformat}
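For anyone skimming without the patch applied, the -create handling in the list above can be sketched roughly as follows in plain Java. This is only an illustration and not code from the patch: the OpenMode enum is a local stand-in for Lucene's IndexWriterConfig.OpenMode, and CreateFlagSketch, parseOpenMode and writeCall are hypothetical names.

```java
import java.util.Arrays;

public class CreateFlagSketch {
    // Local stand-in for Lucene's IndexWriterConfig.OpenMode (assumption, for illustration only).
    enum OpenMode { CREATE, CREATE_OR_APPEND }

    // -create on the command line means: build a new index, replacing any existing one.
    // Without it, an existing index is opened and appended to.
    static OpenMode parseOpenMode(String[] args) {
        return Arrays.asList(args).contains("-create")
            ? OpenMode.CREATE
            : OpenMode.CREATE_OR_APPEND;
    }

    // A brand-new index cannot contain an older copy of a file, so a plain add is enough.
    // When appending, an update replaces any earlier document for the same file.
    static String writeCall(OpenMode mode) {
        return mode == OpenMode.CREATE ? "addDocument" : "updateDocument";
    }

    public static void main(String[] args) {
        OpenMode mode = parseOpenMode(args);
        System.out.println(mode + " -> " + writeCall(mode));
    }
}
```

In the real demo the chosen mode would be set on the writer's config and the returned name would be an actual IndexWriter call; the strings here only make the branch visible.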
> cleanup contrib/demo
> --------------------
>
> Key: LUCENE-2923
> URL: https://issues.apache.org/jira/browse/LUCENE-2923
> Project: Lucene - Java
> Issue Type: Improvement
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2923.patch, LUCENE-2923.patch
>
>
> I don't think we should include optimize in the demo; many people start from
> the demo and may think you must optimize to do searching, and that's clearly
> not the case.
> I think we should also use a buffered reader in FileDocument?
> And... I'm tempted to remove IndexHTML (and the html parser) entirely. It's
> ancient, and we now have Tika to extract text from many doc formats.
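On the buffered reader question in the description, a minimal plain-Java sketch of reading a file through a BufferedReader that is always closed might look like the following. BufferedReaderSketch and countChars are hypothetical names, UTF-8 is an assumed encoding, and try-with-resources needs Java 7 or later; none of this is taken from FileDocument itself.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BufferedReaderSketch {
    // Reads the whole file through a BufferedReader and returns the number of chars read.
    // try-with-resources closes the reader, and with it the underlying stream,
    // whether the loop finishes normally or throws.
    static long countChars(Path file) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8))) {
            long total = 0;
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                total += n;
            }
            return total;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countChars(Paths.get(args[0])));
    }
}
```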