Tim Nyborg has got the solution:
It's a bug in yatl/sanitizer.py, which can be fixed as described:
https://stackoverflow.com/questions/60176267/webp2y-xml-helper-sanitize-line-breaks-under-python3
Thanks Tim!
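
For anyone landing here later, this is a minimal, standalone sketch of the escaping step discussed in the quoted thread below (no web2py needed to run it). It does not fix the sanitizer bug itself; it only shows the preprocessing done before the string is handed to XML(..., sanitize=True). The archive stripped the character entities from the quoted messages, so "&#10;" for a line break is an assumption, and chained_replace is a hypothetical helper name used only for illustration.

```python
import re

# Assumed mapping: "&#10;" for line breaks is a guess (the entity was lost
# in the archived thread); "&lt;" and "&gt;" are the standard HTML entities.
REPLACEMENTS = {"\r\n": "&#10;", "\n": "&#10;", "<": "&lt;", ">": "&gt;"}


def str_replace(string, replacement_dict):
    """Replace every key of replacement_dict found in string, in one pass."""
    if not isinstance(string, str):
        string = str(string)
    # Longest keys first, so "\r\n" is tried before "\n" in the alternation.
    keys = sorted(replacement_dict, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: replacement_dict[m.group(0)], string)


def chained_replace(string):
    """Same mapping expressed as chained str.replace calls."""
    return (string.replace("\r\n", "&#10;")
                  .replace("\n", "&#10;")
                  .replace("<", "&lt;")
                  .replace(">", "&gt;"))


text = "Header\r\nLine1\nLine2 <no description>"
escaped = str_replace(text, REPLACEMENTS)
print(escaped)
```

Both helpers give the same result for this mapping, so the choice between them is mostly a matter of taste. The escaped string is what would be passed as the first argument to the XML helper.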
On Wednesday, February 12, 2020 at 5:17:31 PM UTC+1, Clemens wrote:
>
> Hi Chris,
>
> thanks a lot for your help! But the problem still exists even after
> replacing my str_replace routine with str.replace() as you proposed. Yes,
> I had the same problem with line breaks crashing the view, and replacing
> the line breaks with a character entity fixed it. But switching from
> Python 2.7 to 3.6 raises the new problem that the sanitizer can't process
> entity-coded line breaks. Without sanitize=True (i.e. with the default,
> False) it also works with Python 3.6. But sanitize=True doesn't work for
> entity-coded line breaks under Python 3.6. This is the case only for line
> breaks; all other special characters are no problem.
>
> I really think that the XML sanitizer under Python 3.6 is the problem. Do
> you have an idea for a workaround other than eliminating all line breaks?
> I can't do that.
>
> Best regards
> Clemens
>
>
> On Wednesday, February 12, 2020 at 4:42:53 PM UTC+1, Christian Varas wrote:
>>
>> Hi Clemens,
>>
>> Replace can handle big text; it does not matter whether it is 1 or 1000
>> lines or more. It will replace all the occurrences in the text, and it is
>> also fast. Chaining "replace" is faster than other methods.
>>
>> description = this_item.description.replace("\n"," ").replace("\r"," ")
>> description = description.replace("<","&lt;").replace(">","&gt;")
>> XML(description, sanitize=True)
>>
>> or in one line
>>
>> XML(this_item.description.replace("\n"," ").replace("\r"," ")
>> .replace("<","&lt;").replace(">","&gt;"), sanitize=True)
>>
>>
>> A(this_item.title,
>>   callback=URL('item', 'select',
>>                vars=dict(uuid=this_item.uuid), user_signature=True),
>>   _title=XML(this_item.description.replace("\n"," ").replace("\r"," ")
>>              .replace("<","&lt;").replace(">","&gt;"), sanitize=True))
>>
>> I had this issue with line breaks and the XML helper too: input
>> containing line breaks was breaking my view, and replacing the bad
>> characters before passing it to the helper fixed my problem.
>>
>> Try in a console with a custom text and see the results.
>>
>> Hope this helps
>> Cheers.
>> Chris.
>>
>> On Wed, Feb 12, 2020 at 10:08, Clemens (<[email protected]>)
>> wrote:
>>
>>> Hello Chris,
>>>
>>> thanks for your answer! But just kicking out all line breaks is a little
>>> harsh, since in my case the description is usually a few lines long with 2
>>> or 3 paragraphs. And I had already solved the problem with this procedure
>>> and the call described in my question:
>>>
>>> import re
>>>
>>> def str_replace(string, replacement_dict):
>>>     # Coerce non-string input, then replace all keys in a single pass.
>>>     if not isinstance(string, str):
>>>         string = str(string)
>>>     pattern = re.compile('|'.join(re.escape(k) for k in replacement_dict), re.M)
>>>     return pattern.sub(lambda x: replacement_dict[x.group(0)], string)
>>>
>>>
>>> And this solution worked very well with Python 2.7, even with line
>>> breaks in link titles. Then I moved to Python 3.6 and the problem
>>> appeared. Thus, I think that the XML sanitizer under Python 3.6 is the
>>> problem, since it can't handle the entity-coded line breaks.
>>>
>>> Do you have any other ideas?
>>>
>>> Best regards
>>> Clemens
>>>
>>>
>>> On Wednesday, February 12, 2020 at 12:08:17 PM UTC+1, Christian Varas
>>> wrote:
>>>>
>>>> I had an issue with line breaks too. I remove line breaks like this with
>>>> Python 3.7:
>>>>
>>>> some_string = some_string.replace("\n", "").replace("\r", "")
>>>>
>>>> XML(some_string, sanitize=True)
>>>>
>>>> Cheers
>>>> Chris
>>>>
>>>> On Wed, Feb 12, 2020 at 04:37, Clemens <
>>>> [email protected]> wrote:
>>>>
>>>>> Hello!
>>>>>
>>>>> In my web2py app I'm processing a list of items, where the user can
>>>>> click on a link for each item to select it. An item has a UUID, a
>>>>> title and a description. For better orientation the item description
>>>>> is also displayed as the link title. To prevent injections and to
>>>>> escape tags in the description I'm using the XML sanitizer as follows:
>>>>>
>>>>> A(this_item.title,
>>>>>   callback=URL('item', 'select',
>>>>>                vars=dict(uuid=this_item.uuid), user_signature=True),
>>>>>   _title=XML(str_replace(this_item.description, {'\r\n':' ',
>>>>>              '<':'&lt;', '>':'&gt;'}), sanitize=True))
>>>>>
>>>>> Using Python 2.7 everything was fine. Since I switched to Python
>>>>> 3.6 I have the following problem: when the description contains line
>>>>> breaks, the sanitizer no longer works. For example, the following string
>>>>> produced by my str_replace routine is sanitized fine by the XML
>>>>> helper under Python 2.7 but not under Python 3.6:
>>>>>
>>>>> Header Line1 Line2 Line3
>>>>>>
>>>>>
>>>>> Sanitizing line breaks escaped as character entities is the problem
>>>>> with Python 3 (but not with Python 2). Everything else is no problem
>>>>> for the XML helper to sanitize (e.g. less than or greater than; I need
>>>>> these, since if there is no description it is generated as
>>>>> <no description>).
>>>>>
>>>>> How can line breaks be sanitized by the XML helper when running web2py
>>>>> under Python 3?
>>>>>
>>>>> Thanks for any support!
>>>>>
>>>>> Best regards Clemens
>>>>>
>>>>>
>>>>> --
>>>>> Resources:
>>>>> - http://web2py.com
>>>>> - http://web2py.com/book (Documentation)
>>>>> - http://github.com/web2py/web2py (Source code)
>>>>> - https://code.google.com/p/web2py/issues/list (Report Issues)
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "web2py-users" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/web2py/319d22e0-d1be-452c-8c25-d1ec76df1a5e%40googlegroups.com
>>>>>
>>>>> <https://groups.google.com/d/msgid/web2py/319d22e0-d1be-452c-8c25-d1ec76df1a5e%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
>>>
>>