You're likely looking for the "Phrase Around" feature: https://docs.marklogic.com/admin-help/phrase-around
In a nutshell, content inside these elements (and any content they contain, including nested elements and their children) is simply not indexed. So, having the actual structure embedded in the document would give you more granular control. The challenge you will have is that the HTML in the CDATA is likely indexed as text, so the feature listed needs to be configured on the element containing the CDATA.

Kind Regards,
David Ennis

David Ennis
Content Engineer
HintTech <http://www.hinttech.com/>

On 21 September 2015 at 21:32, Travis Raybold <[email protected]> wrote:

> Howdy,
>
> We have a content base full of documents that contain some HTML fields in
> them. This content is all escaped in CDATA tags because it is generally not
> valid XML. When we search, though, we really don't want to find text inside
> of the HTML tags - "class", "id" or "name", for example.
>
> Is there a way to tell the indexer to avoid content inside of HTML tags?
>
> Thanks,
>
> --Travis
>
> _______________________________________________
> General mailing list
> [email protected]
> Manage your subscription at:
> http://developer.marklogic.com/mailman/listinfo/general
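
P.S. To illustrate the CDATA point from the reply: an XML parser treats everything inside a CDATA section as plain character data, so the escaped HTML tags never become elements that an element-level setting could target. The sketch below uses Python's standard library rather than MarkLogic, and the document shape (a `doc` element wrapping a `body` field) is a made-up assumption for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shape: an HTML field escaped in CDATA.
ESCAPED = '<doc><body><![CDATA[<p class="intro">Hello</p>]]></body></doc>'
# The same field stored as real, well-formed markup.
EMBEDDED = '<doc><body><p class="intro">Hello</p></body></doc>'


def describe(xml_source):
    """Return (child element names, text directly inside <body>)."""
    body = ET.fromstring(xml_source).find("body")
    return [child.tag for child in body], body.text


escaped_children, escaped_text = describe(ESCAPED)
embedded_children, embedded_text = describe(EMBEDDED)

# CDATA case: no child elements, the markup is just a string of text.
print(escaped_children, repr(escaped_text))
# Embedded case: <p> is a real element that can be addressed by name.
print(embedded_children, repr(embedded_text))
```

In the CDATA case the only element available to configure is the one wrapping the CDATA (`body` in this made-up example), and words such as "class" are part of its text.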
