I want to retain the formatted HTML in the results, but ignore (or filter
out) HTML tags when searching, if that makes sense.
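
A minimal sketch of that goal in plain Java, with no Lucene dependency: the
searchable terms are built from tag-stripped text, while the stored HTML is
left untouched for display. The class and method names (HtmlSearchSketch,
stripTags, indexTerms, matches) are hypothetical, and the regex is only a
crude stand-in for HTMLStripCharFilter, which as far as I understand also
deals with things like character entities and offsets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

class HtmlSearchSketch {
    // Crude tag matcher: an assumption for illustration, not HTMLStripCharFilter's logic.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Text used for indexing: each tag becomes a space so adjacent words stay separated. */
    static String stripTags(String html) {
        return TAG.matcher(html).replaceAll(" ");
    }

    /** Lowercased word tokens of the stripped text, i.e. what would be searchable. */
    static List<String> indexTerms(String html) {
        List<String> terms = new ArrayList<>();
        for (String t : stripTags(html).toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    /** A single-term query matches only if it equals one of the indexed terms. */
    static boolean matches(String html, String query) {
        return indexTerms(html).contains(query.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String stored = "<p>Lucene is <strong>fast</strong></p>";
        System.out.println(indexTerms(stored));        // terms built from visible text only
        System.out.println(matches(stored, "strong")); // tag name was never indexed
        System.out.println(matches(stored, "fast"));   // visible word was indexed
        System.out.println(stored);                    // stored HTML is unchanged
    }
}
```

The point of the sketch is the split between what is indexed (stripped text)
and what is stored (original HTML); with Lucene itself, that split would come
from the field's analyzer rather than from a regex.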

On Thu, Apr 5, 2012 at 3:44 PM, Steven A Rowe <[email protected]> wrote:

> okayndc,
>
> A field configured to use HTMLStripCharFilter as part of its index-time
> analyzer will strip out HTML tags before index terms are created by the
> tokenizer, so HTML tags will not be put into the index.  As a result,
> queries for HTML tags cannot match the original documents' HTML tags (in
> the field configured to use HTMLStripCharFilter, anyway).
>
> So HTMLStripCharFilter should do what you want.
>
> Steve
>
> From: okayndc [mailto:[email protected]]
> Sent: Thursday, April 05, 2012 3:36 PM
> To: Steven A Rowe
> Cc: [email protected]
> Subject: Re: HTML tags and Lucene highlighting
>
> Hello,
>
> I want to ignore HTML tags within a search.  I should not be able to
> search for an HTML tag (ex. <strong>) and get back the highlighted HTML tag
> (ex. <span class="highlighted"><strong></span>) in a result set.
>
> Thanks
>
> On Thu, Apr 5, 2012 at 3:24 PM, Steven A Rowe <[email protected]> wrote:
> Hi okayndc,
>
> What *do* you want?
>
> Steve
>
> -----Original Message-----
> From: okayndc [mailto:[email protected]]
> Sent: Thursday, April 05, 2012 1:34 PM
> To: [email protected]
> Subject: HTML tags and Lucene highlighting
>
> Hello,
>
> I currently use Lucene version 3.0 and probably need to upgrade to a more
> current version soon.
> The problem is that when I test a search for an HTML tag (ex. <strong>),
> Lucene returns the highlighted HTML tag, which is what I DO NOT want.  Is
> there a way to "filter" HTML tags?
> I have read up on HTMLStripCharFilter (packaged with Solr) and wondered
> if this is the way to go?
>
> Any help will be greatly appreciated,
> Thanks
