[ https://issues.apache.org/jira/browse/LUCENE-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005653#comment-13005653 ]
Shai Erera commented on LUCENE-2958:
------------------------------------

No, the flexibility is in the ability to have a TrecContentSource emitting the TREC documents, and multiple DocMakers that consume them and build Lucene documents out of them. For example, one DocMaker can decide to split each doc into N tiny docs. Another can choose to add facets to it. Yet another can do complex analysis on it and produce richer documents.

Before that, you'd have to write a DocMaker for every such combination. E.g., if you wanted to add facets, you'd need to write a DocMaker per source of data with the same impl.

DocData as an intermediary object is not expensive, considering it's only a bin over some already allocated Strings. And we reuse it always, so you don't even allocate it more than once ...

I would hate to lose that flexibility.

> WriteLineDocTask improvements
> -----------------------------
>
>                 Key: LUCENE-2958
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2958
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/benchmark
>            Reporter: Doron Cohen
>            Assignee: Doron Cohen
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-2958.patch, LUCENE-2958.patch
>
>
> Make WriteLineDocTask and LineDocSource more flexible/extendable:
> * allow to emit lines also for empty docs (keep current behavior as default)
> * allow more/less/other fields
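
To make the decoupling argued for in the comment concrete, here is a minimal sketch in Java. It does not use the real contrib/benchmark API: SimpleDocData, SimpleContentSource, SimpleDocMaker, ArraySource, SplittingMaker, TaggingMaker and Pipeline are all hypothetical stand-ins invented for illustration, and output documents are plain Strings rather than Lucene documents. The point it tries to show is that one source can feed any maker, and that the intermediary object is allocated once and refilled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an intermediary object: a holder over existing Strings.
final class SimpleDocData {
    String title;
    String body;
}

// A source only knows how to parse its input format into SimpleDocData.
interface SimpleContentSource {
    // Refills and returns the reused instance, or returns null when exhausted.
    SimpleDocData next(SimpleDocData reuse);
}

// A maker only knows how to turn SimpleDocData into output documents.
interface SimpleDocMaker {
    List<String> make(SimpleDocData data);
}

// Toy source backed by an array of {title, body} rows.
final class ArraySource implements SimpleContentSource {
    private final String[][] rows;
    private int pos = 0;

    ArraySource(String[][] rows) { this.rows = rows; }

    @Override
    public SimpleDocData next(SimpleDocData reuse) {
        if (pos >= rows.length) return null;
        reuse.title = rows[pos][0];
        reuse.body = rows[pos][1];
        pos++;
        return reuse;
    }
}

// Splits each input document into one tiny document per word.
final class SplittingMaker implements SimpleDocMaker {
    @Override
    public List<String> make(SimpleDocData data) {
        List<String> docs = new ArrayList<>();
        String body = data.body.trim();
        if (body.isEmpty()) return docs;
        String[] words = body.split("\\s+");
        for (int i = 0; i < words.length; i++) {
            docs.add(data.title + "#" + i + ":" + words[i]);
        }
        return docs;
    }
}

// Emits one document per input, enriched with a simple derived tag.
final class TaggingMaker implements SimpleDocMaker {
    @Override
    public List<String> make(SimpleDocData data) {
        String body = data.body.trim();
        int words = body.isEmpty() ? 0 : body.split("\\s+").length;
        List<String> docs = new ArrayList<>();
        docs.add(data.title + " | " + data.body + " | words=" + words);
        return docs;
    }
}

final class Pipeline {
    // Drives any source through any maker; the holder is allocated once.
    static List<String> run(SimpleContentSource source, SimpleDocMaker maker) {
        List<String> out = new ArrayList<>();
        SimpleDocData reuse = new SimpleDocData();
        SimpleDocData data;
        while ((data = source.next(reuse)) != null) {
            out.addAll(maker.make(data));
        }
        return out;
    }
}

class DocMakerSketch {
    public static void main(String[] args) {
        String[][] rows = {{"a", "x y"}, {"b", "z"}};
        System.out.println(Pipeline.run(new ArraySource(rows), new SplittingMaker()));
        System.out.println(Pipeline.run(new ArraySource(rows), new TaggingMaker()));
    }
}
```

In this sketch, adding a new input format means writing one more source, and adding a new document shape means writing one more maker; neither requires touching the other side, which is the combination problem the comment describes.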