[ https://issues.apache.org/jira/browse/SOLR-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269247#comment-13269247 ]
Jack Krupansky commented on SOLR-3439:
--------------------------------------
Right, so if it is the double indexing that is a serious concern, maybe having
"content" stored but not indexed is a reasonable compromise. It would still be
searchable through the "text" field thanks to the copyField, but it would not
be double-indexed. This would still give a reasonably friendly out-of-the-box
experience (default search works and content is returned), and users can
hand-tune the schema if they need more specific control.

But if "content" is stored but not indexed, the user can't simply add "content"
to "qf" - they would first need to make it indexed, which is what my
preliminary patch does.
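
In schema.xml terms, the compromise might look roughly like the fragment
below. This is only a sketch and not taken from the patch: the "text_general"
field type and the multiValued setting are assumptions about the example
schema.

```xml
<!-- Compromise: "content" is stored, so it is returned in results,
     but it is not indexed on its own (assumed type: text_general). -->
<field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/>

<!-- Searching still works because the raw value is copied into the
     indexed, unstored catch-all "text" field. -->
<copyField source="content" dest="text"/>
```

With this setup, a query against "content" directly (for example via "qf")
would not match anything, since only "text" carries the indexed terms.
Changing indexed="false" to indexed="true" gives the double-indexed variant.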
> Add "content" field to example schema to make SolrCell easier to use out of
> the box
> -----------------------------------------------------------------------------------
>
> Key: SOLR-3439
> URL: https://issues.apache.org/jira/browse/SOLR-3439
> Project: Solr
> Issue Type: Improvement
> Components: contrib - Solr Cell (Tika extraction), Schema and
> Analysis
> Reporter: Jack Krupansky
> Priority: Minor
> Fix For: 4.0
>
> Attachments: Lincoln-Gettysburg-Address.docx,
> Lincoln-Gettysburg-Address.pdf
>
>
> Currently, SolrCell is configured to map Tika "content" (the main body of a
> document) to the "text" field which is the indexed-only (not stored)
> catch-all for default queries. That searches fine, but doesn't show the
> document content in the results, sometimes leading users to think that
> something is wrong. Sure, the user can easily add the field (and this is
> documented), but it would be a better user experience to have such a basic
> feature work right out of the box without any config editing and without the
> need for the user to read the fine print in the documentation.
> I propose that we add the "content" field to the example schema in the
> section of fields already defined to support SolrCell metadata. It would be
> stored and indexed.
> I further propose that a copyField be added for the "title", "description",
> (and maybe a couple of others) and "content" fields to add them to the "text"
> field for searching. Again, trying to improve the out of the box user
> experience. It also simplifies testing - less setup.
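
For reference, the original proposal quoted above could be sketched as the
following schema.xml additions. The field type is an assumption, and the
"title" and "description" fields are assumed to be defined already among the
SolrCell metadata fields of the example schema.

```xml
<!-- Proposed: "content" both stored and indexed (assumed type: text_general). -->
<field name="content" type="text_general" indexed="true" stored="true" multiValued="true"/>

<!-- Proposed: copy the main SolrCell fields into the catch-all "text" field
     so that default queries find them without extra configuration. -->
<copyField source="title" dest="text"/>
<copyField source="description" dest="text"/>
<copyField source="content" dest="text"/>
```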