Ok, I made some changes and things appear to be working. My intention was to receive URLs, parse them to get the base URL, store them in a database (Postgres), and then, on an HTTP request, retrieve these URLs through the Django interface (via psycopg2) and display them to the user in the browser in a table.
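To make the "parse them to get the base URL" step concrete, here is a minimal sketch using only the standard library. The helper name base_url is hypothetical, and I am assuming "base URL" means scheme plus host; this is not the code from my project. The import fallback is there so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2


def base_url(url):
    """Return scheme://host for a URL, dropping path, query and fragment.

    Hypothetical helper: assumes "base URL" means scheme plus network location.
    """
    parts = urlsplit(url)
    return u"%s://%s" % (parts.scheme, parts.netloc)


# One of the sample URLs from this post, written with an escape for the o-umlaut
sample = u"http://\u00f6strogenfrei.de/verhuetung.html"
print(base_url(sample))
```

Note that the non-ASCII host name passes through untouched here; nothing in this step needs to encode or decode it.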
I do not know whether I should really be worrying about encoding/decoding here, as I just wanted to get the chars as they were coming. Hence, I tried the changes below and they worked fine. Here are some samples of the URLs that I processed and that work with these changes:

http://003-sexo-mulheres-nuas.ck7.net
http://live.žšcr.com/host  <<some chars before cr can not be copied here.
http://östrogenfrei.de/verhuetung.html

I needed the changes below -

- Keep the Postgres client encoding set to sql_ascii.
- Make the changes below in the following Django modules -

./python2.6.1/lib/python2.6/site-packages/django//contrib/syndication/feeds.py

Changed the calls that apply Unicode conversion to our URLs and data, which is probably unnecessary. iri_to_uri is harmless and works for us.

135,136c135
< url = iri_to_uri(enc_url),
< #url = smart_unicode(enc_url),
---
> url = smart_unicode(enc_url),
138,139c137
< mime_type = iri_to_uri(self.__get_dynamic_attr('item_enclosure_mime_type', item))
< #mime_type = smart_unicode(self.__get_dynamic_attr('item_enclosure_mime_type', item))
---
> mime_type = smart_unicode(self.__get_dynamic_attr('item_enclosure_mime_type', item))

./python2.6.1/lib/python2.6/site-packages/django//db/backends/postgresql/base.py

Same philosophy as above. Additionally, using sql_ascii as the character set wherever possible.

46,47c46
< #result[smart_str(key, charset)] = smart_str(value, charset)
< result[smart_str(key, charset)] = iri_to_uri(value)
---
> result[smart_str(key, charset)] = smart_str(value, charset)
50,51c49
< return tuple([iri_to_uri(p) for p in params])
< #return tuple([smart_str(p, self.charset, True) for p in params])
---
> return tuple([smart_str(p, self.charset, True) for p in params])
54,55c52
< return self.cursor.execute(iri_to_uri(sql), self.format_params(params))
< #return self.cursor.execute(smart_str(sql, self.charset), self.format_params(params))
---
> return self.cursor.execute(smart_str(sql, self.charset), self.format_params(params))
128c125
< cursor = UnicodeCursorWrapper(cursor, 'sql_ascii')
---
> cursor = UnicodeCursorWrapper(cursor, 'utf-8')
137,138c134
< #return smart_unicode(s)
< return iri_to_uri(s)
---
> return smart_unicode(s)

./python2.6.1/lib/python2.6/site-packages/django//db/backends/postgresql_psycopg2/base.py

Need to disable the psycopg2 UNICODE extension, as it is not needed. We can safely expect whatever data we get from the Django interface to be usable as it is.

25c25
< #psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
---
> psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)

The change below looks redundant now; things work even without it.

./python2.6.1/lib/python2.6/site-packages/django//db/models/base.py

Setting the encoding to ascii.

277,278c277
< return force_unicode(self).encode('ascii')
< #return force_unicode(self).encode('utf-8')
---
> return force_unicode(self).encode('utf-8')

For the purpose of processing, my views.py needed to handle the URLs slightly differently before rendering the response back to HTML -

import urllib
url = urllib.quote_plus(received_url)

Also, in the HTML file where I was processing the URL, I needed to 'unescape' my URL.

Here is my request/query - Can you please review these changes and this approach? Do you see any major issue here?
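For reference, the quote_plus / unescape round trip mentioned above can be sketched as follows. This is only an illustration under my own assumptions: the variable names are made up, the URL is one of the samples from this post, and I encode to UTF-8 bytes before quoting so the result is the same on Python 2 and Python 3 (my actual views.py just calls urllib.quote_plus on the received value).

```python
try:
    from urllib.parse import quote_plus, unquote_plus  # Python 3
except ImportError:
    from urllib import quote_plus, unquote_plus  # Python 2

# Sample URL from this post, written with an escape for the o-umlaut
received_url = u"http://\u00f6strogenfrei.de/verhuetung.html"

# Percent-encode the UTF-8 bytes; ':' and '/' are escaped too because
# quote_plus treats no extra characters as safe by default.
escaped = quote_plus(received_url.encode("utf-8"))

# The reverse step, i.e. the 'unescape' done before displaying the URL.
restored = unquote_plus(escaped)

print(escaped)
```

The escaped form is plain ASCII, which is presumably why it survives the sql_ascii settings without any further conversion.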
I am sure there must be some purpose in not having this approach earlier, but just wondering why?

Thanks,

