Are you sure your input is UTF-8? The web2py markmin_serializer is in gluon/html.py and it is relatively straightforward, so not much can go wrong there. I suspect your input has not been parsed into the web2py object representation at all.
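One way to rule out the encoding question is to normalize the input before it reaches the parser. This is only a minimal sketch using the standard library: the helper name to_utf8 and the latin-1 fallback are assumptions for illustration, not part of web2py, and the fallback should be whatever encoding your editor or form actually submits.

```python
def to_utf8(raw, fallback="latin-1"):
    """Return `raw` as UTF-8 encoded bytes.

    `fallback` is an assumed source encoding, used only when the
    bytes are not already valid UTF-8.
    """
    if not isinstance(raw, bytes):
        # Already a text (unicode) object: just encode it.
        return raw.encode("utf-8")
    try:
        # Valid UTF-8 bytes pass through unchanged.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Not UTF-8: reinterpret with the fallback encoding, then re-encode.
        return raw.decode(fallback).encode("utf-8")
```

The result of to_utf8(html) can then be handed to TAG(...).flatten(render=markmin_serializer) as before. If the output changes, the original input was not UTF-8.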
The parsing is done by TAG(input) (not by XML(input)), and it is based on the Python built-in XML parser, which chokes on non-UTF-8 characters. It may not be parsing the XML at all and may instead be returning it as a single string.

Massimo

On Sep 11, 2:01 pm, jotbe <[email protected]> wrote:
> Hi List,
>
> I just started my first Web2Py sample project (the Wiki from the book)
> and even managed to integrate the HTML5 editor
> Aloha: http://aloha-editor.org/
>
> My pages should use Markmin instead of HTML and therefore I am
> converting the HTML to Markmin using TAG().flatten() and
> markmin_serializer. In general it is working and the content is stored
> as Markmin code, but when using e.g. German umlauts like 'öä', TAG()
> seems to get confused and doesn't handle the encoding properly.
>
> On the other hand, when trying to use
> XML().flatten(render=markmin_serializer) instead of
> TAG().flatten(render=markmin_serializer), nothing changes at all.
> XML().flatten(render=markmin_serializer) will return the input HTML
> string as is, instead of converting it to Markmin.
>
> I have been trying to solve this issue for two days now and have read
> lots of posts regarding handling of UTF-8 in Python, tried lots of
> third-party modules to work around this issue, but had no luck so far.
> I really appreciate your help/tips. :)
>
> Various sample code using the Web2Py
> Shell: https://gist.github.com/caec7bd5b41624d50b01
>
> Thanks in advance!

