[jira] Commented: (LUCENE-971) Create enwiki indexable data as line-per-article rather than file-per-article

Michael McCandless (JIRA) Wed, 01 Aug 2007 09:34:15 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12517007
 ]


Michael McCandless commented on LUCENE-971:
-------------------------------------------


> I can look at what it would take to avoid the line file ... but
> ... what about the overhead of the XML parser? I don't tend to think
> of XML parsers as "light". Would bundling that into the test be a
> concern?

Right, I too would not consider XML parsing overhead "light".  So tests
that are sensitive to the XML parsing cost should first create a line
file.

But this is the case regardless of which approach we use (i.e., both
approaches allow you to use a line file -- the WriteLineDocTask writes
a line file from any DocMaker).  It's just that the new approach would
buy us more flexibility for those people who don't need (or want) to
use the line file as an intermediary.
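
To make the "line file as an intermediary" idea concrete, here is a
minimal sketch of the pattern, not the actual WriteLineDocTask code.
It assumes one document per line with tab-separated title, date, and
body fields; the DocSource interface, the field order, and all method
names are hypothetical stand-ins, not Lucene's DocMaker API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;

public class LineFileSketch {

    /** Hypothetical stand-in for a document source; NOT Lucene's DocMaker API. */
    public interface DocSource {
        /** Returns {title, date, body}, or null when the source is exhausted. */
        String[] next();
    }

    /** Assumed field separator; one document occupies exactly one line. */
    static final char SEP = '\t';

    /** Replaces tabs and line breaks so a field cannot break the line format. */
    public static String clean(String field) {
        return field.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ');
    }

    /** Serializes one document as a single tab-separated line (no newline). */
    public static String toLine(String title, String date, String body) {
        return clean(title) + SEP + clean(date) + SEP + clean(body);
    }

    /** Parses a line back into {title, date, body}; limit 3 keeps empty fields. */
    public static String[] fromLine(String line) {
        return line.split("\t", 3);
    }

    /** Drains any source into the writer, one line per document; returns the count. */
    public static int writeLineFile(DocSource source, Writer out) throws IOException {
        int count = 0;
        for (String[] doc = source.next(); doc != null; doc = source.next()) {
            out.write(toLine(doc[0], doc[1], doc[2]));
            out.write('\n');
            count++;
        }
        out.flush();
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample documents, standing in for parsed articles.
        Iterator<String[]> docs = Arrays.asList(
            new String[] {"Title one", "date one", "Body with\ta tab\nand a newline"},
            new String[] {"Title two", "date two", "Second body"}).iterator();
        DocSource source = () -> docs.hasNext() ? docs.next() : null;

        StringWriter out = new StringWriter();
        int written = writeLineFile(source, out);
        System.out.println(written + " docs written");

        // Reading back needs no XML parser, only a line reader and a split.
        BufferedReader in = new BufferedReader(new StringReader(out.toString()));
        for (String line = in.readLine(); line != null; line = in.readLine()) {
            System.out.println(Arrays.toString(fromLine(line)));
        }
    }
}
```

The point of the sketch is that the expensive parsing happens once,
while the line file is written; a test that reads the line file
afterwards pays only for reading and splitting lines.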


> Create enwiki indexable data as line-per-article rather than file-per-article
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-971
>                 URL: https://issues.apache.org/jira/browse/LUCENE-971
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Steven Parkes
>         Attachments: LUCENE-971.patch.txt
>
>
> Create a line per article rather than a file. Consume with indexLineFile task.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


