Re: [Wikitech-l] Historical use of latin1 fields in MySQL

2017-05-02 Thread Tei
On 2 May 2017 at 19:10, Mark Clements (HappyDog) wrote: > Hi all, > > I seem to recall that a long, long time ago MediaWiki was using UTF-8 > internally but storing the data in 'latin1' fields in MySQL. > I remember an old thread from 2009. https://lists.gt.net/wiki/wikitech/160875 -- -- ℱin

Re: [Wikitech-l] Historical use of latin1 fields in MySQL

2017-05-02 Thread Tim Starling
On 03/05/17 03:10, Mark Clements (HappyDog) wrote: > Can anyone confirm that MediaWiki used to behave in this manner, and > if so why? In MySQL 4.0, MySQL didn't really have character sets, it only had collations. Text was stored as 8-bit clean binary, and was only interpreted as a character seque
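
A minimal sketch of what "8-bit clean" storage implies, in Python rather than MySQL: UTF-8 bytes read back through a single-byte charset look like mojibake, but no byte is lost, so the original text can be recovered. Python's latin-1 codec stands in for a 'latin1' column here, which is an assumption (MySQL's latin1 is closer to cp1252), and both helper names are made up for illustration.

```python
def view_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then read those bytes as if they were latin-1.

    Illustrative helper (not MediaWiki code): this is roughly what a client
    would see if UTF-8 bytes were stored untouched in a single-byte column.
    """
    return text.encode("utf-8").decode("latin-1")


def recover_utf8(mojibake: str) -> str:
    """Undo view_as_latin1: turn the characters back into bytes, decode as UTF-8."""
    return mojibake.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    original = "Gergő"
    stored = view_as_latin1(original)
    # The non-ASCII character becomes two latin-1 characters...
    print(len(original), len(stored))
    # ...but the round trip is lossless because every byte value is preserved.
    print(recover_utf8(stored) == original)
```

The round trip only works because latin-1 in Python maps all 256 byte values; a charset conversion that rejects or rewrites some bytes would break it.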

Re: [Wikitech-l] Historical use of latin1 fields in MySQL

2017-05-02 Thread bawolff
On Tue, May 2, 2017 at 8:05 PM, Jaime Crespo wrote: > On Tue, May 2, 2017 at 9:24 PM, Brian Wolff wrote: > >> . >> > >> > On the latest discussions, there are proposals to increase the minimum >> > mediawiki requirements to MySQL/MariaDB 5.5 and allow binary or utf8mb4 >> > (not utf8, 3 byte utf8

Re: [Wikitech-l] Historical use of latin1 fields in MySQL

2017-05-02 Thread Jaime Crespo
On Tue, May 2, 2017 at 9:24 PM, Brian Wolff wrote: > . > > > > On the latest discussions, there are proposals to increase the minimum > > mediawiki requirements to MySQL/MariaDB 5.5 and allow binary or utf8mb4 > > (not utf8, 3 byte utf8), https://phabricator.wikimedia.org/T161232. > Utf8mb4 > > s

Re: [Wikitech-l] Historical use of latin1 fields in MySQL

2017-05-02 Thread Brian Wolff
. > > On the latest discussions, there are proposals to increase the minimum > mediawiki requirements to MySQL/MariaDB 5.5 and allow binary or utf8mb4 > (not utf8, 3 byte utf8), https://phabricator.wikimedia.org/T161232. Utf8mb4 > should be enough for most uses (utf8 will not allow for emojis, for
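
To illustrate the 3-byte limit mentioned above: MySQL's 'utf8' holds at most three bytes per character, while characters outside the Basic Multilingual Plane (emoji among them) need four bytes in UTF-8. A rough Python check follows; the helper name is hypothetical, and it only counts bytes, it does not talk to a database.

```python
def fits_three_byte_utf8(text: str) -> bool:
    """Return True if every character of text encodes to at most 3 UTF-8 bytes.

    Illustrative helper: approximates whether text would fit a 3-byte 'utf8'
    column, as opposed to needing utf8mb4.
    """
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


if __name__ == "__main__":
    for sample in ["MediaWiki", "日本語", "😀"]:
        widest = max(len(ch.encode("utf-8")) for ch in sample)
        print(sample, widest, fits_three_byte_utf8(sample))
```

ASCII and CJK text pass the check; the emoji does not, which is the case utf8mb4 (or a binary column) is meant to cover.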

Re: [Wikitech-l] Historical use of latin1 fields in MySQL

2017-05-02 Thread Jaime Crespo
Mark, On Tue, May 2, 2017 at 7:10 PM, Mark Clements (HappyDog) < gm...@kennel17.co.uk> wrote: > Hi all, > > I seem to recall that a long, long time ago MediaWiki was using UTF-8 > internally but storing the data in 'latin1' fields in MySQL. > > I notice that there is now the option to use either

Re: [Wikitech-l] Historical use of latin1 fields in MySQL

2017-05-02 Thread Gergő Tisza
On Tue, May 2, 2017 at 7:10 PM, Mark Clements (HappyDog) < gm...@kennel17.co.uk> wrote: > I seem to recall that a long, long time ago MediaWiki was using UTF-8 > internally but storing the data in 'latin1' fields in MySQL. > Indeed. See $wgLegacyEncoding

Re: [Wikitech-l] Historical use of latin1 fields in MySQL

2017-05-02 Thread James Hare
I thought MediaWiki, by default, stored data as binary blobs, rather than something of a particular encoding? On May 2, 2017 at 10:11:38 AM, Mark Clements (HappyDog) ( gm...@kennel17.co.uk) wrote: Hi all, I seem to recall that a long, long time ago MediaWiki was using UTF-8 internally but storin

[Wikitech-l] Historical use of latin1 fields in MySQL

2017-05-02 Thread Mark Clements (HappyDog)
Hi all, I seem to recall that a long, long time ago MediaWiki was using UTF-8 internally but storing the data in 'latin1' fields in MySQL. I notice that there is now the option to use either 'utf8' or 'binary' columns (via the $wgDBmysql5 setting), and the default appears to be 'binary'.[1]