> >> no, only pump. unicode uses up to 4 bytes per character.
> >> Charset NONE uses only 1 byte per character.
> >> So, field char/varchar(20) can store 20 characters for NONE,
> >> and from 20 to 5 characters for UTF8, depending on how many bytes
> >> each character has.
> >> Thus, you may need to increase your character fields size.
> >
> > Is that really the case?
> > Shouldn't the length remain the same and "just" the size of the database
> > become larger?
>
> If you are talking about a database with characterset NONE, then a
> (VAR)CHAR(100) will accept 100 bytes worth of characters. For UTF-8 it
> would then mean it will accept up to 25 characters (I think that isn't
> entirely true, as UTF-8 is 1 to 4 bytes per character, so you might be
> able to store more if you also use connection characterset NONE and send
> the data as UTF-8).
>
> Now if the default characterset is UTF-8 (or if the column itself is
> UTF-8), then (VAR)CHAR(100) will accept 100 characters, but it will
> store up to 400 bytes of data.

Ah, OK, I didn't consider the part about characterset NONE. But if I have a database with ISO8859_1 as its character set, there isn't going to be a problem with the lengths of (var)char fields when pumping the data into a UTF8 database, into (var)char fields defined with the same length / number of characters, right?
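
To make the characters-versus-bytes distinction concrete, here is a rough sketch in plain Python (not Firebird itself; the helper name `lengths` and the sample string are made up for illustration). It only shows that transcoding ISO8859_1 text to UTF-8 keeps the character count the same while the byte count can grow, which is the case asked about above. How a particular pump tool or column definition behaves would still need checking against the actual database.

```python
# Illustration only: plain Python, not Firebird. The helper name and the
# sample text are hypothetical and chosen just for this sketch.

def lengths(text: str) -> dict:
    """Return the character count and the encoded byte counts of text."""
    return {
        "chars": len(text),                                # Unicode code points
        "latin1_bytes": len(text.encode("iso-8859-1")),    # 1 byte per character
        "utf8_bytes": len(text.encode("utf-8")),           # 1 to 4 bytes per character
    }

# German word with two non-ASCII letters (o-umlaut and sharp s),
# written with escapes so the source stays pure ASCII.
sample = "Gr\u00f6\u00dfe"
print(lengths(sample))

# Worst case for UTF-8: a character outside the Basic Multilingual Plane
# takes 4 bytes, so a 20-byte buffer holds only 20 // 4 = 5 of them.
emoji = "\U0001F600"
print(len(emoji), len(emoji.encode("utf-8")), 20 // len(emoji.encode("utf-8")))
```

The sample has the same number of characters in both encodings; only the stored byte count differs, since each non-ASCII ISO8859_1 character becomes two bytes in UTF-8.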
