[ 
https://issues.apache.org/jira/browse/LUCENE-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005653#comment-13005653
 ] 

Shai Erera commented on LUCENE-2958:
------------------------------------

No, the flexibility is in the ability to have a TrecContentSource emitting the 
TREC documents, and multiple DocMakers that consume them and build Lucene 
documents out of them.

For example, one DocMaker can decide to split each doc into N tiny docs. 
Another can choose to add facets to it. Yet another can do complex analysis on 
it and produce richer documents.

Before that, you'd have to write a DocMaker for every such combination. E.g., 
if you wanted to add facets, you'd need to write a DocMaker per source of data 
with the same impl.
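
As a rough illustration of the separation argued for above, here is a minimal, self-contained sketch. It does not use the real benchmark classes: RawDoc, Source, Maker, FixedSource, SplittingMaker and FacetMaker are hypothetical stand-ins for DocData, a content source and DocMakers, and the index-ready "documents" are plain Strings. The point is only that one source can feed any number of makers without a class per (source, behavior) combination.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class DocMakerSketch {
    /** Hypothetical stand-in for DocData: a thin holder over existing Strings. */
    static final class RawDoc {
        final String title;
        final String body;
        RawDoc(String title, String body) { this.title = title; this.body = body; }
    }

    /** Hypothetical source: knows how to read one data format and nothing else. */
    interface Source {
        List<RawDoc> read();
    }

    /** Hypothetical maker: turns one raw doc into one or more index-ready docs. */
    interface Maker {
        List<String> make(RawDoc raw);
    }

    /** A source with two hard-coded docs, standing in for a TREC reader. */
    static final class FixedSource implements Source {
        public List<RawDoc> read() {
            return Arrays.asList(new RawDoc("alpha", "abcdef"), new RawDoc("beta", "xy"));
        }
    }

    /** Splits each body into at most n roughly equal pieces. */
    static final class SplittingMaker implements Maker {
        private final int n;
        SplittingMaker(int n) { this.n = n; }
        public List<String> make(RawDoc raw) {
            List<String> parts = new ArrayList<>();
            int len = raw.body.length();
            int chunk = Math.max(1, (len + n - 1) / n); // ceiling division
            for (int i = 0; i < len; i += chunk) {
                parts.add(raw.title + ": " + raw.body.substring(i, Math.min(len, i + chunk)));
            }
            return parts;
        }
    }

    /** Adds a toy "facet" (the first letter of the title) to each doc. */
    static final class FacetMaker implements Maker {
        public List<String> make(RawDoc raw) {
            String facet = raw.title.isEmpty() ? "none" : raw.title.substring(0, 1);
            return Collections.singletonList(raw.title + " [facet=" + facet + "] " + raw.body);
        }
    }

    /** Any maker can be paired with any source. */
    static List<String> run(Source source, Maker maker) {
        List<String> out = new ArrayList<>();
        for (RawDoc raw : source.read()) {
            out.addAll(maker.make(raw));
        }
        return out;
    }

    public static void main(String[] args) {
        Source source = new FixedSource();
        System.out.println(run(source, new SplittingMaker(3)));
        System.out.println(run(source, new FacetMaker()));
    }
}
```

Adding a second data format in this sketch means writing one more Source; both makers work with it unchanged.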

DocData as an intermediary object is not expensive, considering it's only a bin 
over some already allocated Strings. And we always reuse it, so you don't even 
allocate it more than once ...
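
To make the reuse point concrete, here is a small sketch under stated assumptions: Holder and LineSource are hypothetical names, not the benchmark API, and the tab-separated "title, body" line format is invented for the example. The source fills a caller-supplied holder instead of allocating a new one per document, so the holder is created once for the whole run.

```java
public class ReuseSketch {
    /** Hypothetical DocData-like holder: just references to existing Strings. */
    static final class Holder {
        String title;
        String body;
    }

    /** Hypothetical source over in-memory lines of the form "title<TAB>body". */
    static final class LineSource {
        private final String[] lines;
        private int next = 0;

        LineSource(String[] lines) { this.lines = lines; }

        /** Fills the given holder and returns it, or returns null when exhausted. */
        Holder fill(Holder reuse) {
            if (next >= lines.length) {
                return null;
            }
            String line = lines[next++];
            int tab = line.indexOf('\t');
            reuse.title = tab < 0 ? "" : line.substring(0, tab);
            reuse.body = tab < 0 ? line : line.substring(tab + 1);
            return reuse;
        }
    }

    /** Consumes the whole source with a single Holder allocation. */
    static int countDocs(LineSource source) {
        Holder holder = new Holder(); // allocated once, reused for every doc
        int count = 0;
        while (source.fill(holder) != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        LineSource source = new LineSource(new String[] {"a\tfirst", "b\tsecond"});
        System.out.println(countDocs(source));
    }
}
```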

I would hate to lose that flexibility.

> WriteLineDocTask improvements
> -----------------------------
>
>                 Key: LUCENE-2958
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2958
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/benchmark
>            Reporter: Doron Cohen
>            Assignee: Doron Cohen
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-2958.patch, LUCENE-2958.patch
>
>
> Make WriteLineDocTask and LineDocSource more flexible/extendable:
> * allow to emit lines also for empty docs (keep current behavior as default)
> * allow more/less/other fields
