On 26 August 2016 at 16:10, Frank Millman <[email protected]> wrote:
> "Joonas Liik" wrote in message
> news:cab1gnpqnjdenaa-gzgt0tbcvwjakngd3yroixgyy+mim7fw...@mail.gmail.com...
>
>> On 26 August 2016 at 08:22, Frank Millman <[email protected]> wrote:
>> >
>> > So this is my conversion routine -
>> >
>> > lines = string.split('"')  # split on attributes
>> > for pos, line in enumerate(lines):
>> >     if pos%2:  # every 2nd line is an attribute
>> >         lines[pos] = line.replace('<', '&lt;').replace('>', '&gt;')
>> > return '"'.join(lines)
>> >
>>
>> or.. you could just escape all & as &amp; before escaping the > and <,
>> and do the reverse on decode
>>
>
> Thanks, Joonas, but I have not quite grasped that.
>
> Would you mind explaining how it would work?
>
> Just to confirm that we are talking about the same thing -
>
> This is not allowed - '<root><fld name="<new>"/></root>' [A]
>
>>>> import xml.etree.ElementTree as etree
>>>> x = '<root><fld name="<new>"/></root>'
>>>> y = etree.fromstring(x)
>
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File
> "C:\Users\User\AppData\Local\Programs\Python\Python35\lib\xml\etree\ElementTree.py",
> line 1320, in XML
> parser.feed(text)
> xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1,
> column 17
>
> You have to escape it like this - '<root><fld name="&lt;new&gt;"/></root>'
> [B]
>
>>>> x = '<root><fld name="&lt;new&gt;"/></root>'
>>>> y = etree.fromstring(x)
>>>> y.find('fld').get('name')
>
> '<new>'
>>>>
>>>>
>
> I want to convert the string from [B] to [A] for editing, and then back to
> [B] before saving.
>
> Thanks
>
> Frank
>
>
> --
> https://mail.python.org/mailman/listinfo/python-list
something like.. (untested)
def escape(untrusted_string):
    '''Use on user-provided strings to render them inert for storage.

    Escaping & first ensures that the user can't type something like
    '&gt;' in the source and have it magically decode as '>'.
    '''
    return (untrusted_string.replace("&", "&amp;")
            .replace("<", "&lt;").replace(">", "&gt;"))

def unescape(escaped_string):
    '''Once the user string is retrieved from storage, use this
    function to restore it to its original form.'''
    return (escaped_string.replace("&lt;", "<")
            .replace("&gt;", ">").replace("&amp;", "&"))

I should note, though, that this example is very ad hoc. I'm no XML expert;
I just know a bit about XML entities.
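
Since the functions above are untested, here is a quick round-trip check, offered as a sketch rather than a definitive implementation. It shows why & has to be escaped first and unescaped last. The name unescape_wrong_order is hypothetical and exists only as a counter-example.

```python
def escape(text):
    # '&' goes first, so the '&' in the entities produced next is not escaped again
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def unescape(text):
    # '&amp;' goes last, so an entity the user typed literally stays literal
    return text.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&")


def unescape_wrong_order(text):
    # hypothetical counter-example: decoding '&amp;' first corrupts the data
    return text.replace("&amp;", "&").replace("&lt;", "<").replace("&gt;", ">")


original = "a &lt; b"        # the user literally typed the entity
stored = escape(original)    # only the '&' changes, giving 'a &amp;lt; b'

assert unescape(stored) == original            # round trip is lossless
assert unescape_wrong_order(stored) == "a < b"  # wrong order loses the entity
```

With the correct order, whatever the user typed comes back unchanged, including text that already looks like an entity.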
If you decide to go this route, there are probably much better tested
functions out there for escaping text for storage in XML documents.
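
One such pair ships with the standard library: xml.sax.saxutils.escape and xml.sax.saxutils.unescape. Below is a minimal sketch of how they might be applied to the attribute value from [A] and [B]. The helper names to_storage and from_storage are made up for illustration, and the extra mapping for the double quote assumes the attribute values are always delimited by double quotes.

```python
import xml.etree.ElementTree as etree
from xml.sax.saxutils import escape, unescape


def to_storage(value):
    # hypothetical helper: escape &, < and > plus the double quote,
    # assuming the value will sit inside a double-quoted attribute
    return escape(value, {'"': "&quot;"})


def from_storage(value):
    # hypothetical helper: the reverse mapping, for editing
    return unescape(value, {"&quot;": '"'})


name = "<new>"                    # what the user edits, as in [A]
stored = to_storage(name)         # the escaped form, as in [B]
doc = '<root><fld name="{}"/></root>'.format(stored)

root = etree.fromstring(doc)      # parses without a ParseError
assert root.find("fld").get("name") == name
assert from_storage(stored) == name
```

Note that this escapes a single value, not a whole document string, so the attribute values would still need to be located first, for example with the split on '"' from the original routine.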