Re: [MarkLogic Dev General] Avoiding HTML tags in searches

David Ennis Mon, 21 Sep 2015 12:43:35 -0700

You're likely looking for the "Phrase Around" feature:

https://docs.marklogic.com/admin-help/phrase-around



In a nutshell, content inside these elements (and any contained content,
including nested elements and their children) is simply not indexed. So,
having the actual structure embedded in the document would give you more
granular control.

The challenge you will have is that the HTML in the CDATA is likely indexed
as plain text, so the feature above would need to be set on the element
containing the CDATA.
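
To illustrate why the CDATA content behaves as plain text, here is a minimal
sketch (not a MarkLogic API, just Python's standard library) of stripping tag
names and attributes from an HTML fragment before it is loaded. The function
name visible_text and the sample fragment are hypothetical, and this assumes
you are able to preprocess documents at ingest time:

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect only text nodes; tag names and attributes are dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags (note: this also includes the
        # contents of <script> and <style> elements).
        self.chunks.append(data)


def visible_text(html_fragment):
    """Return the text of a (possibly malformed) HTML fragment."""
    parser = VisibleTextExtractor()
    parser.feed(html_fragment)
    parser.close()
    # Collapse runs of whitespace left behind by the removed tags.
    return " ".join(" ".join(parser.chunks).split())


# Hypothetical CDATA payload: to a text indexer, "class" and "id" are
# ordinary words here.
cdata_payload = '<p class="intro" id="first">Hello <b>world</b></p>'
print(visible_text(cdata_payload))
```

Storing the stripped text in a sibling element and searching that element
instead would be one way to keep words such as "class" or "id" out of the
results, at the cost of duplicating the content.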





Kind Regards,
David Ennis


On 21 September 2015 at 21:32, Travis Raybold <[email protected]> wrote:

> Howdy,
>
>
>
> We have a content base full of documents that contain some HTML fields in
> them. This content is all escaped in CDATA tags because it is generally not
> valid XML. When we search, though, we really don't want to find text inside
> of the HTML tags - "class", "id" or "name", for example.
>
>
>
> Is there a way to tell the indexer to avoid content inside of HTML tags?
>
>
>
> Thanks,
>
>
>
> --Travis
>
>
>
> _______________________________________________
> General mailing list
> [email protected]
> Manage your subscription at:
> http://developer.marklogic.com/mailman/listinfo/general
>
>
