I have an index with various fields, and I want to highlight query matches in the "title" and "content" fields. These fields can contain HTML tags, so I've configured HtmlFormatter for highlighting. The problem is that when the query doesn't match the text of a field, Solr returns the value of the configured alternate field without encoding it. Is there any way to get an encoded value for alternate fields as well? And more generally, is there a way to HTML-escape values returned by a response writer?

I'm using Solr 3.1, and here is an excerpt from the requestHandler configuration:

[...]
<str name="wt">json</str>
<str name="hl">true</str>
<str name="hl.fl">title,content</str>
<str name="hl.simple.pre"><![CDATA[<b>]]></str>
<str name="hl.simple.post"><![CDATA[</b>]]></str>
<str name="f.title.hl.fragsize">1024</str>
<str name="f.title.hl.alternateField">title</str>
<str name="f.title.hl.maxAlternateFieldLength">512</str>
<int name="f.title.hl.snippets">1</int>
<str name="f.content.hl.alternateField">content</str>
<str name="f.content.hl.maxAlternateFieldLength">512</str>
<int name="f.content.hl.snippets">2</int>
[...]

and from the highlighting configuration:

[...]
<highlighting>
<formatter name="html" class="org.apache.solr.highlight.HtmlFormatter" default="true">
</formatter>
<encoder name="html" class="org.apache.solr.highlight.HtmlEncoder" default="true" />
<fragmentsBuilder name="default" class="org.apache.solr.highlight.ScoreOrderFragmentsBuilder" default="true" />
</highlighting>
[...]
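
To make the desired result concrete, here is a minimal client-side sketch in Python of the fallback behaviour I'm after. It is not a Solr feature: the function name `snippets_with_fallback` is hypothetical, and it assumes the `hl.alternateField` settings are removed from the handler, that the unique key field is called "id", and that the field is stored and returned in the document list. When Solr returns no highlighted snippet for a field, the client takes the stored value, truncates it, and escapes it itself.

```python
from html import escape


def snippets_with_fallback(response, field, max_len=512, id_field="id"):
    """Return {doc id: list of HTML-safe snippets} for one field.

    `response` is the parsed JSON body of a Solr query (wt=json).
    Hypothetical helper, not part of Solr or any Solr client library.
    """
    highlighting = response.get("highlighting", {})
    result = {}
    for doc in response["response"]["docs"]:
        # Keys of the "highlighting" section are strings.
        doc_id = str(doc[id_field])
        snippets = highlighting.get(doc_id, {}).get(field)
        if snippets:
            # Highlighted fragments: assumed to be already encoded by the
            # configured HtmlEncoder, so they are passed through unchanged.
            result[doc_id] = snippets
        else:
            # No match in this field: fall back to the stored value,
            # truncate it, then escape &, <, >, " and ' on the client.
            raw = doc.get(field, "")
            if isinstance(raw, list):  # multi-valued field
                raw = raw[0] if raw else ""
            result[doc_id] = [escape(raw[:max_len])]
    return result
```

Truncating before escaping keeps the cut from landing in the middle of an entity such as `&amp;`; the drawback is that the escaped string can end up longer than `max_len` characters.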

Thanks
Massimo
