I've noticed that the snippets returned by Nutch's search have the highlight formatting added to them first, and are only then escaped into XML strings. How would I go about changing the process so that the content is escaped, then the formatting is added, and then the whole snippet is escaped?
The reason I want this is so that I can return valid XML with the formatting as XML entities, but with the actual snippet text already escaped underneath it.

Example of how Nutch does it:
original text: "red fox & lazy dog"
formatting applied: "red <span class="highlight">fox</span> & lazy dog"
escaped: "red &lt;span class="highlight"&gt;fox&lt;/span&gt; &amp; lazy dog"

Example of what I'm after:
original text: "red fox & lazy dog"
escaped text: "red fox &amp; lazy dog"
formatting applied: "red <span class="highlight">fox</span> &amp; lazy dog"
escaped: "red &lt;span class="highlight"&gt;fox&lt;/span&gt; &amp;amp; lazy dog"
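
To make the order I'm after concrete, here is a minimal sketch in plain Java. It does not use any Nutch classes. The class name SnippetEscapeSketch and both method names are hypothetical, and the highlighting is a simple case-insensitive substring match rather than whatever Nutch's summarizer actually does. It escapes only &, < and >, which I'm assuming is enough for element text content.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SnippetEscapeSketch {

    // Escape the characters that matter in XML element content.
    // '&' must be replaced first so the entities added afterwards are not re-escaped.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Escape the raw text piece by piece and wrap each match of `term` in a
    // highlight span. Matching is done on the raw text, so a term can never
    // match inside an entity such as "&amp;".
    static String highlightEscaped(String raw, String term) {
        if (term.isEmpty()) {
            return escapeXml(raw);
        }
        Matcher m = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE).matcher(raw);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(escapeXml(raw.substring(last, m.start())));
            out.append("<span class=\"highlight\">")
               .append(escapeXml(m.group()))
               .append("</span>");
            last = m.end();
        }
        out.append(escapeXml(raw.substring(last)));
        return out.toString();
    }

    public static void main(String[] args) {
        // Steps 1 and 2: escape the content, add the formatting.
        String formatted = highlightEscaped("red fox & lazy dog", "fox");
        System.out.println(formatted);
        // Step 3: escape the whole snippet so it can be embedded as an XML string.
        System.out.println(escapeXml(formatted));
    }
}
```

Decoding the final string once gives back well-formed markup (the span plus "&amp;"), which is the property I want. In Nutch the equivalent change would presumably have to happen wherever the summary fragments are turned into a string, before the response writer escapes the result.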
