Responding to myself because there was indeed something I was "fundamentally misunderstanding":
On Sun, Jul 25, 2010 at 8:09 PM, Ethan Jewett <[email protected]> wrote:
>
> I think if the user sends non-entity-encoded strings to the API for
> the metadata, then we need to return the same string. We're returning
> UTF-8 XML here, not HTML, so my thought was that no one would be
> expected entity-encoded output. If we return JSON or other formats in
> the future (and I do plan to do that), then we'll have to encode
> however necessary for that format.
>
> Maybe I'm fundamentally misunderstanding how this output is supposed
> to be encoded. Encoding is definitely not my area of expertise.
>
> If I'm not misunderstanding, then yes, I think we need to decode the
> entity-encoding so that we see " < and > in the output. I'm not sure
> if it is currently getting encoded because of something we do or
> because of Scala's XML handling.

My fundamental misunderstanding was around how you deal with the
characters <, >, ", ', and & in XML:

http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Predefined_entities_in_XML

They indeed must be encoded (though they are a much smaller subset of
entities that must be encoded than in HTML), so I suppose that we
should return them in encoded form.

Do you agree?

Ethan
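
P.S. To make the point concrete, here is a minimal sketch of encoding and decoding the five predefined XML entities. It is written in Python using the standard library's xml.sax.saxutils, not in Scala, and it is not our API code; the function names encode_for_xml and decode_from_xml and the sample string are made up purely for illustration.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; the two quote characters
# are supplied as extra entities so all five predefined ones are covered.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}
QUOTE_CHARS = {"&quot;": '"', "&apos;": "'"}


def encode_for_xml(text):
    """Hypothetical helper: entity-encode text for use in XML output."""
    return escape(text, QUOTE_ENTITIES)


def decode_from_xml(text):
    """Hypothetical helper: reverse encode_for_xml."""
    return unescape(text, QUOTE_CHARS)


raw = 'He said "a < b & c > d"'
encoded = encode_for_xml(raw)
print(encoded)                    # He said &quot;a &lt; b &amp; c &gt; d&quot;
print(decode_from_xml(encoded))   # He said "a < b & c > d"
```

Any XML-aware client that parses the response should get the original string back after decoding, so returning the encoded form should not lose information.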
