> >> no, only pump. unicode uses up to 4 bytes per character.
> >> Charset NONE uses only 1 byte per character.
> >> So, field char/varchar(20) can store 20 characters for NONE,
> >> and from 20 to 5 characters for UTF8, depending on how many bytes
> >> each character has.
> >> Thus, you may need to increase your character fields size.
> >
> > Is that really the case?
> > Shouldn't the length remain the same and "just" the size of the database 
> > become larger?
> 
> If you are talking about a database with characterset NONE, then a 
> (VAR)CHAR(100) will accept 100 bytes worth of characters. For UTF-8 it 
> > would then mean it will accept up to 25 characters (I think that isn't 
> > entirely true, as UTF-8 is 1 to 4 bytes per character, so you might be 
> able to store more if you also use connection characterset NONE and send 
> the data as UTF-8).
> 
> Now if the default characterset is UTF-8 (or if the column itself is 
> UTF-8), then (VAR)CHAR(100) will accept 100 characters, but it will 
> store up to 400 bytes of data.

Ah, ok, didn't consider that part about the characterset NONE.

But if I have a database with ISO8859_1 as character set, there isn't going to 
be a problem with the lengths of (var)char fields when pumping the data into an 
UTF8 database, into (var)char fields defined with the same length / number of 
characters, right?
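
To make the byte arithmetic behind that question concrete, here is a minimal Python sketch. It assumes Python's "iso-8859-1" and "utf-8" codecs are a fair stand-in for the corresponding database character sets; the helper name and the sample string are made up for illustration, and it says nothing about how the pump tool itself behaves.

```python
# Illustrative only: Python codecs stand in for the database character sets.

def utf8_byte_length(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))

# Each byte value 0x00-0xFF decodes to exactly one ISO 8859-1 character.
latin1_chars = bytes(range(256)).decode("iso-8859-1")

# Largest UTF-8 size of any single ISO 8859-1 character.
max_bytes_per_char = max(utf8_byte_length(ch) for ch in latin1_chars)

# Hypothetical sample value containing two non-ASCII characters.
sample = "Müller-Lüdenscheidt"
char_count = len(sample)                          # length in characters
latin1_bytes = len(sample.encode("iso-8859-1"))   # one byte per character
utf8_bytes = utf8_byte_length(sample)             # non-ASCII takes two bytes

print(max_bytes_per_char, char_count, latin1_bytes, utf8_bytes)
```

If a UTF8 (var)char(n) really is limited by characters and not bytes, as described in the quoted reply, then only char_count matters for the field length, and it is the same in both encodings; only the number of stored bytes grows.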
