Hi, > > Is there some method provided in python standard library to sanitize > > strings used as input to xml documents? (=remove form-feeds and whatever > > else). I've searched docs and google, found only 4Suite project. I > > cannot rely on something not in standard lib, so I'm wondering if I've > > just overlooked something in the docs to this (imo) important task...
> > What do you mean by "sanitize strings used as input to xml > documents"? > Do you have a string that you're going to parse as an XML document? Either > it is valid XML, or not. I would not "sanitize" it, you risk changing the > data inside. Thanks for response and sorry for I wasn't clear first time. I have a heap of data (logs), from which I build a XML document using xml.dom.minidom. In this data, some xml invalid characters may occur - form feed (\x0c) character is one example. I don't know what else is illegal in xml, so I've searched if there's some method how to prepare strings for insertion to a xml doc before I start research on a xml spec and write such function on my own. Petr -- http://mail.python.org/mailman/listinfo/python-list