Alexandre Rafalovitch created SOLR-9601:
-------------------------------------------
Summary: DIH: Radically simplify Tika example to only show
relevant configuration
Key: SOLR-9601
URL: https://issues.apache.org/jira/browse/SOLR-9601
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Components: contrib - DataImportHandler, contrib - Solr Cell (Tika
extraction)
Affects Versions: 6.x, master (7.0)
Reporter: Alexandre Rafalovitch
Assignee: Alexandre Rafalovitch
Solr DIH examples are legacy examples that show how DIH works. However, they
include full configurations that may obscure the teaching points. This is no
longer needed, as we have 3 full-blown examples in the configsets.
Specifically for Tika, the field type definitions were at some point
simplified to have fewer support files in the configuration directory. This,
however, means that we now have field definitions that have the same names as
those in other examples, but different definitions.
Importantly, Tika does not use most (any?) of those modified definitions. They
are there just for completeness. Similarly, the solrconfig.xml includes the
extract handler even though we are demonstrating a different path of using
Tika. Somebody grepping through the config files may get confused about which
configuration aspect contributes to which behavior.
I am planning to significantly simplify the configuration and schema of the
Tika example to **only** show the DIH Tika extraction path. It will end up as
a very short and focused example.
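
For illustration only, a rough sketch of what a configuration limited to the
DIH Tika path might look like. This is not the proposed patch: the config file
name, the baseDir path, and the target field names (id, title, text) are
assumptions, and the <lib> directives that load the DIH and extraction jars
are omitted. The first fragment registers the DIH handler in solrconfig.xml,
with no extract handler present:

```xml
<!-- solrconfig.xml (fragment): only the DIH handler is registered -->
<requestHandler name="/dataimport" class="solr.DataImportHandler">
  <lst name="defaults">
    <!-- assumed file name for the DIH configuration -->
    <str name="config">tika-data-config.xml</str>
  </lst>
</requestHandler>
```

The second fragment is the DIH configuration itself, where an outer entity
lists files and an inner entity hands each file to Tika:

```xml
<!-- tika-data-config.xml: walk a directory and run each PDF through Tika -->
<dataConfig>
  <dataSource type="BinFileDataSource"/>
  <document>
    <!-- baseDir is a hypothetical path; point it at real documents -->
    <entity name="file" processor="FileListEntityProcessor" dataSource="null"
            baseDir="/path/to/docs" fileName=".*pdf" rootEntity="false">
      <field column="file" name="id"/>
      <entity name="pdf" processor="TikaEntityProcessor"
              url="${file.fileAbsolutePath}" format="text">
        <!-- meta="true" reads the value from the document metadata -->
        <field column="title" name="title" meta="true"/>
        <!-- "text" is the extracted body content -->
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

With a setup along these lines, the schema would only need the few fields the
entities actually populate, which is the kind of narrowing described above.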
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)