The same thing you would normally do when handling databases with
encodings.

a) always set the table encoding explicitly (just like you do now with
MySQL)
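
For (a), a minimal sketch of what "explicit table encoding" means in the
MySQL dialect. The helper name and column format here are my own
illustration, not web2py's actual DDL builder:

```python
# Hypothetical DDL builder: the point is simply that the CREATE TABLE
# statement names its character set instead of inheriting the server default.
def create_table_sql(name, columns, charset="utf8"):
    cols = ", ".join("%s %s" % (col, ctype) for col, ctype in columns)
    return ("CREATE TABLE %s (%s) ENGINE=InnoDB CHARACTER SET %s;"
            % (name, cols, charset))

print(create_table_sql("test", [("id", "INT"), ("v", "VARCHAR(50)")]))
# CREATE TABLE test (id INT, v VARCHAR(50)) ENGINE=InnoDB CHARACTER SET utf8;
```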

b) always set the client encoding explicitly
PostgreSQL example:
     >>> from pyPgSQL import PgSQL
     >>> cx = PgSQL.connect(database="mydb", client_encoding="utf-8",
     ...                    unicode_results=1)
     >>> cu = cx.cursor()
     >>> cu.execute("set client_encoding to unicode")
     >>> cu.execute("insert into test(v) values (%s)", (u'\xd6sterreich',))
     >>> cu.execute("select v from test")
     >>> cu.fetchone()
     [u'\xd6sterreich']

MySQL uses "SET NAMES 'utf8'" instead of client_encoding.
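
For (b), the dialect difference boils down to which statement you issue
right after connecting. A sketch covering only the two dialects discussed
here (the function name is mine, not a DAL API):

```python
def client_encoding_sql(dialect, encoding="utf8"):
    # MySQL and PostgreSQL spell "use this client encoding" differently.
    if dialect == "mysql":
        return "SET NAMES '%s'" % encoding
    if dialect == "postgres":
        # PostgreSQL's name for utf8 in SET client_encoding is UNICODE.
        name = "UNICODE" if encoding == "utf8" else encoding
        return "SET client_encoding TO %s" % name
    raise ValueError("dialect not covered: %s" % dialect)

print(client_encoding_sql("mysql"))     # SET NAMES 'utf8'
print(client_encoding_sql("postgres"))  # SET client_encoding TO UNICODE
```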

c) allow a collation to be specified for tables, something like
SQLTable('name', SQLField(), collation='collationname')
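
For (c), the collation would end up as a COLLATE clause in the generated
field DDL. A sketch of the idea — the rendering function and the
example collation name are hypothetical, not the SQLField API:

```python
def field_sql(name, ftype="VARCHAR(50)", collation=None):
    # Append COLLATE only when the caller asked for one, so the
    # database's default collation still applies otherwise.
    sql = "%s %s" % (name, ftype)
    if collation:
        sql += " COLLATE %s" % collation
    return sql

print(field_sql("name", collation="utf8_hungarian_ci"))
# name VARCHAR(50) COLLATE utf8_hungarian_ci
```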

Obviously these need a bit of SQL dialect exploration, as different
databases handle this differently :( I am familiar with PostgreSQL and
MySQL in this regard, but not with the rest. Django already does this;
we can't have that, can we? :)

On Dec 16, 12:49 pm, mdipierro <[email protected]> wrote:
> Point received.  So what do you think we need to do?
>
> Massimo
>
> On Dec 16, 4:40 am, achipa <[email protected]> wrote:
>
> > Anytime somebody stores a string in a wrong encoding knowingly, a
> > puppy dies. The database HAS to know about the encoding, or you can
> > kiss most string functions goodbye. LENGTH(strcolumn) won't work, LIKE
> > won't work, SUBSTR(...) won't work, ORDER BY won't work, etc, etc.
> > Worst of all, you WILL get random truncations because of the declared
> > column length (VARCHAR(50) in latin1 does NOT hold 50 utf8 characters). So
> > let's save the puppies and do encoding correctly (I reckon half of my
> > grey hair is due to encoding issues - the other half most likely due
> > to IE), that will also make it easier to hunt down any potential bugs.
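
(The truncation point above is easy to verify: character count and byte
count diverge as soon as non-ASCII characters appear.)

```python
# 50 characters of text containing 'Ö': it fits in VARCHAR(50) only if
# the column counts length in the right encoding.
s = u"\xd6sterreich" * 5   # "Österreich" repeated, 50 characters
assert len(s) == 50
assert len(s.encode("latin-1")) == 50  # latin1: one byte per character
assert len(s.encode("utf-8")) == 55    # utf8: 'Ö' takes two bytes
```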
>
> > On Dec 16, 6:15 am, mdipierro <[email protected]> wrote:
>
> > > This is not supposed to be a problem because web2py only passes utf8
> > > encoded data to the database, which means that, even if the database
> > > does not know it is utf8, it is just a string with regular characters.
> > > The database does not need to know which encoding is used, as long as
> > > one uses it consistently.
>
> > > This is not to say that there may not be a bug. The email from Lorena
> > > seems to indicate that there may be one. I need to learn more to
> > > understand what is causing this.
>
> > > If you feel I am wrong about collations could you provide an example
> > > that breaks the current system?
>
> > > Massimo
>
> > > On Dec 15, 8:22 pm, achipa <[email protected]> wrote:
>
> > > > In European languages, "Ã..." is usually a sign that you have utf8
> > > > characters in latin-1 8-bit fields.
>
> > > > Massimo, I can see that for MySQL you do other=' ENGINE=InnoDB
> > > > CHARACTER SET utf8;' but I fail to see a similar statement for MSSQL
> > > > or other databases. How do you make sure they create tables in utf8?
>
> > > > Also, what is perhaps missing is the collation. Character sets are
> > > > cool, but if you want to use the DAL with 'orderby' on a non-English
> > > > (>128 ASCII, actually) language table, you're in trouble without
> > > > collations.
>
> > > > On Dec 15, 8:27 pm, mdipierro <[email protected]> wrote:
>
> > > > > Can you be more explicit? How are you inserting the character and
> > > > > getting it out?
>
> > > > > Massimo
>
> > > > > On Dec 15, 9:37 am, Lorena <[email protected]> wrote:
>
> > > > > > Hello, I have a problem with accented characters like "à". Using
> > > > > > SQLFORM, my MSSQL table shows the character "Ã". Can anyone help
> > > > > > me?
> > > > > > Thanks a thousand, Lorena
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"web2py Web Framework" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/web2py?hl=en
-~----------~----~----~----~------~----~------~--~---
