Branko Čibej wrote:
> Bhuvaneswaran A wrote:
> > On Fri, 2009-12-04 at 12:36 +0000, Julian Foad wrote:
> >
> >> Bhuvaneswaran A wrote:
> >>
> >>> Please find attached the revised patch. I incorporated following
> >>> feedback:
> >>> a) Fix the array slicing part
> >>> b) Escape using ord() instead of removing those characters
> >>> c) Handle "]]>" in CDATA section
> >>> d) Define the ascii table globally (once) and re-use
> >>>
> >>> I also verified this fix by generating the junit files for tests having
> >>> special characters and simulating a test that has "]]>" in failure text.
> >>> With this patch, it generates valid junit file.
> >>>
> >> It looks great. You could also move the definition of 'chars_to_remove'
> >> out of the function, but either way it's fine. Go on, commit it!
> >>
> >
> > Branko, Julian: Thank you for the review comments.

Branko wrote (elsewhere in the thread):
> Julian Foad wrote:
> > I searched on the web and didn't find a really really simple way to
> > escape a set of characters. I think something like
> >
> >   for c in chars_to_remove:
> >     data = data.replace(c, '%%%0x' % ord(c))
> >
> > would do it.
>
> Please read what I wrote earlier. Second, this is URL escaping, not XML
> quoting. But first, there is no way to represent such control chars in
> XML. Only CR and LF are valid according to the XML spec. Others are
> not; and you can't use character references, e.g.,  to represent
> ESC. That's not valid XML.

Sure. The idea is to make the occasional unexpected control character
appear as something that is valid in an XML CDATA section and is
human-readable. URL escaping rules seemed a fine choice. Have I missed
something?

> > Incorporated the above suggestion and committed in r887178.
>
> Wait, you committed a script that does URL quoting on XML contents? Did
> you look at the output?

I didn't look at any real output, only a hand-crafted test string. Did
you? Why do you ask?

- Julian
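
For illustration only, here is a minimal sketch of the approach discussed above: make stray control characters human-readable with URL-style %XX escapes, and split any "]]>" so the text can sit inside a CDATA section. This is not the code committed in r887178; the names escape_control_chars, wrap_cdata and _INVALID_XML_CHARS are hypothetical. It assumes only the C0 control characters need handling (XML 1.0 permits TAB, LF and CR among them) and it does not escape a literal '%' in the input, so the mapping is not reversible.

```python
import re

# Assumption: only C0 control characters are a concern. XML 1.0 allows
# TAB (0x09), LF (0x0A) and CR (0x0D); every other character below 0x20
# is not allowed in a document.
_INVALID_XML_CHARS = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def escape_control_chars(data):
    """Replace each disallowed control character with a URL-style %XX escape."""
    return _INVALID_XML_CHARS.sub(lambda m: '%%%02X' % ord(m.group(0)), data)


def wrap_cdata(data):
    """Wrap text in a CDATA section after escaping control characters.

    An embedded "]]>" would end the section early, so it is split across
    two adjacent CDATA sections: "]]" closes the first and ">" opens the next.
    """
    safe = escape_control_chars(data).replace(']]>', ']]]]><![CDATA[>')
    return '<![CDATA[' + safe + ']]>'
```

With this sketch, wrap_cdata('fail: \x1b[31m') returns a CDATA section in which the ESC byte appears as the three visible characters %1B, while tabs and newlines pass through unchanged.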