Just for interest, here is how I do this encoding/decoding for CGI input. There are two encodings defined: the user's preferred encoding (taken from the browser's Accept-Charset header, which CGI exposes as HTTP_ACCEPT_CHARSET) and the application encoding (as used by the database). Form values are decoded from the preferred encoding and re-encoded in the application encoding. For output, I use codecs.EncodedFile (with the preferred and application encodings reversed) to encode the response before it's sent to the browser. I hope this helps others understand how encoding/decoding works with web applications; it took me a while to figure it out!

The solution is: Know where you have byte strings and where you have unicode objects. If you have a form, the parameters will be byte strings encoded with the encoding of the HTML page. The database stores byte strings and has an encoding as well. As a general rule, use unicode objects in your program and know the boundaries where data comes in (forms) or gets serialized (database). Encode/decode at those boundaries and you are safe.
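A minimal sketch of that boundary rule, written for Python 3 (where str is the unicode type). The helper names and the two encodings are assumptions chosen for illustration, not part of the code in this post:

```python
# Hypothetical helpers illustrating "decode on the way in, encode on the way out".

def decode_input(raw, page_encoding):
    """Inbound boundary: bytes from a form -> unicode text."""
    return raw.decode(page_encoding)

def encode_for_storage(text, db_encoding):
    """Outbound boundary: unicode text -> bytes for the database."""
    return text.encode(db_encoding)

# "café" as a browser would send it from an ISO-8859-1 page (assumed encoding).
form_bytes = b"caf\xe9"

# Inside the program, work only with unicode text.
text = decode_input(form_bytes, "iso-8859-1")

# Encode again only when the data leaves the program (assumed UTF-8 database).
stored = encode_for_storage(text.upper(), "utf-8")
```

Everything between the two calls works on text, so operations such as upper() behave correctly regardless of how the bytes arrived.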
Cheers,
Peter
import cgi

def cgiescape(s, encodeAmp=False):
    # Escape HTML special characters; non-string values pass through unchanged.
    if isinstance(s, basestring):
        if encodeAmp:
            # Must run first so the entities added below are not escaped again.
            s = s.replace("&", "&amp;")
        s = s.replace("<", "&lt;")
        s = s.replace(">", "&gt;")
        s = s.replace('"', "&quot;")
        s = s.replace("'", "&#39;")
    return s

class Request:
    def __init__(self, input, environment, applicationEncoding='ascii'):
        preferredEncoding = self.__determinePreferredEncoding(environment, applicationEncoding)
        fieldStorage = cgi.FieldStorage(input, environ=environment)
        self.__properties = {}
        for key in fieldStorage.keys():
            field = fieldStorage[key]
            if isinstance(field, list):
                # Several values under one key: recode each item in place.
                for item in field:
                    unicodeValue = unicode(item.value, preferredEncoding)
                    # 'replace' is an assumed error handler; the second
                    # argument must be a handler name, not the value itself.
                    appValue = unicodeValue.encode(applicationEncoding, 'replace')
                    item.value = appValue
                self.__properties[key] = field
            elif field.filename:
                # File uploads are kept as-is, without recoding.
                self.__properties[key] = field
            else:
                unicodeValue = unicode(field.value, preferredEncoding)
                appValue = unicodeValue.encode(applicationEncoding, 'replace')
                self.__properties[key] = appValue
        self.environ = environment

    def __determinePreferredEncoding(self, environment, applicationEncoding):
        try:
            encodingHeader = environment['HTTP_ACCEPT_CHARSET']
        except KeyError:
            preferredEncoding = applicationEncoding
        else:
            try:
                # Take the first charset listed, ignoring any ";q=..." part.
                elements = encodingHeader.split(';')
                encodings = elements[0].split(',')
                preferredEncoding = encodings[0].strip()
            except Exception:
                # If we get dodgy input here it may be a hack attempt, best to ignore
                preferredEncoding = applicationEncoding
        return preferredEncoding

    def get(self, attr, escapeAmp=True):
        # Return the recoded value for attr, or None if it was not submitted.
        try:
            value = self.__properties[attr]
        except KeyError:
            return None
        if escapeAmp:
            value = cgiescape(value, escapeAmp)
        return value
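The output side (codecs.EncodedFile with the two encodings reversed) is described in the post but not shown in the code. Here is a minimal sketch of how it might look, written for Python 3 so it can be run today. The BytesIO object standing in for the browser connection, the variable names, and the two encodings are all assumptions for illustration:

```python
import codecs
import io

application_encoding = "utf-8"      # assumed: the encoding the database uses
preferred_encoding = "iso-8859-1"   # assumed: first entry of Accept-Charset

# Stand-in for the byte stream that goes back to the browser.
browser = io.BytesIO()

# Bytes written to `out` are decoded as application_encoding and
# re-encoded as preferred_encoding before reaching `browser`.
out = codecs.EncodedFile(
    browser,
    data_encoding=application_encoding,
    file_encoding=preferred_encoding,
)

# The page as the application holds it: bytes in the application encoding.
page = "<p>café</p>".encode(application_encoding)
out.write(page)

sent = browser.getvalue()
```

Here the application only ever writes bytes in its own encoding, and the wrapper performs the conversion at the output boundary.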