[jira] Updated: (LUCENE-2923) cleanup contrib/demo

Michael McCandless (JIRA) Thu, 17 Feb 2011 02:52:54 -0800

     [ https://issues.apache.org/jira/browse/LUCENE-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Michael McCandless updated LUCENE-2923:
---------------------------------------

    Attachment: LUCENE-2923.patch

OK, new patch, fixing a number of things:

  * I close the Reader (thanks Mark).

  * I cutover to NumericField (and stopped using DateTools) for the
    "modified" field.

  * I added a -create option to IndexFiles, so you can see how to
    use CREATE vs CREATE_OR_APPEND.

  * I left commented-out optional things -- calling optimize,
    increasing IW's RAM buffer.

  * I stopped using Version.LUCENE_CURRENT.

  * I sucked in test files from Lucene in Action 2E's tests (open
    source licenses).

  * I use addDocument or updateDocument depending on -create.

  * I made the "demo html parser" private to modules/benchmark, which
    had a dependency on it.  Can someone look over my changes to the
    build xml files?  (Especially the Maven part, where I completely
    guessed!)

  * IndexHTML is gone, and the webapp (src/jsp/*) is gone too.
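
For anyone skimming, here is a rough sketch of how the -create switch
could drive both the open mode and the choice between addDocument and
updateDocument.  It is not the code from the patch.  The class name
CreateFlagSketch and its helper methods are made up for illustration,
and the local OpenMode enum is only a stand-in for Lucene's
IndexWriterConfig.OpenMode so the sketch compiles without Lucene on the
classpath.  The mapping (CREATE goes with addDocument, CREATE_OR_APPEND
goes with updateDocument) is an assumption based on the bullets above.

```java
import java.util.Arrays;

// Hypothetical sketch, not the IndexFiles code from the patch.
class CreateFlagSketch {

    // Stand-in for Lucene's IndexWriterConfig.OpenMode (only the two values discussed here).
    enum OpenMode { CREATE, CREATE_OR_APPEND }

    // "-create" asks for a fresh index; without it, an existing index is appended to.
    static OpenMode parseOpenMode(String[] args) {
        return Arrays.asList(args).contains("-create")
                ? OpenMode.CREATE
                : OpenMode.CREATE_OR_APPEND;
    }

    // Assumption: a fresh index cannot hold an older copy of a file, so a plain
    // add is enough. When appending, an older copy may exist, so the document
    // is replaced (in Lucene, IndexWriter.updateDocument keyed on a Term).
    static String writeCall(OpenMode mode) {
        return mode == OpenMode.CREATE ? "addDocument" : "updateDocument";
    }

    public static void main(String[] args) {
        OpenMode mode = parseOpenMode(args);
        System.out.println("open mode: " + mode + ", write call: " + writeCall(mode));
    }
}
```

With real Lucene classes the mode would be set on the IndexWriterConfig
before opening the writer, and the update case needs a Term that
uniquely identifies the file (for example its path).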

To apply the patch you first have to do this:

{noformat}
svn mv lucene/contrib/benchmark/src/java/org/apache/lucene/demo/html \
    modules/benchmark/src/java/org/apache/lucene/benchmark/byTask/feeds/demohtml
svn mv lucene/contrib/demo/src/test/org/apache/lucene/demo/html \
    modules/benchmark/src/test/org/apache/lucene/benchmark/byTask/feeds/demohtml
{noformat}


> cleanup contrib/demo
> --------------------
>
>                 Key: LUCENE-2923
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2923
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2923.patch, LUCENE-2923.patch
>
>
> I don't think we should include optimize in the demo; many people start from 
> the demo and may think you must optimize to do searching, and that's clearly 
> not the case.
> I think we should also use a buffered reader in FileDocument?
> And... I'm tempted to remove IndexHTML (and the html parser) entirely.  It's 
> ancient, and we now have Tika to extract text from many doc formats.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

