https://bugzilla.wikimedia.org/show_bug.cgi?id=47368
Andre Klapper aklap...@wikimedia.org changed:
What|Removed |Added
CC|
https://bugzilla.wikimedia.org/show_bug.cgi?id=47368
--- Comment #8 from Ori Livneh o...@wikimedia.org ---
It turns out the reason we don't use MySQL's 'utf8' character set is that it
can only encode BMP characters, which have a maximum width of three bytes in
UTF-8. Supplementary
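
To make the three-byte limit concrete, here is a minimal Python sketch of how UTF-8 widths differ between BMP and supplementary-plane code points. The helper names `utf8_width` and `fits_three_byte_utf8` are hypothetical and chosen only for illustration; they are not part of EventLogging or MySQL.

```python
def utf8_width(ch: str) -> int:
    """Return the number of bytes UTF-8 needs for a single code point."""
    return len(ch.encode("utf-8"))


def fits_three_byte_utf8(text: str) -> bool:
    """Return True if every code point is in the BMP (U+0000..U+FFFF),
    i.e. encodes to at most three UTF-8 bytes."""
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    # One sample per UTF-8 width: ASCII, Latin-1 supplement, BMP symbol,
    # and a supplementary-plane character (outside the BMP).
    for ch in ["a", "\u00e9", "\u20ac", "\U0001F600"]:
        print(f"U+{ord(ch):04X}: {utf8_width(ch)} byte(s), "
              f"fits in three bytes: {fits_three_byte_utf8(ch)}")
```

A check like this could be run over incoming strings to see whether any of them would fall outside what a three-byte encoding can store.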
https://bugzilla.wikimedia.org/show_bug.cgi?id=47368
--- Comment #6 from Ori Livneh o...@wikimedia.org ---
(In reply to comment #0)
Inconsistency is causing issues with scripts which now have to treat the log
database specially.
This isn't EventLogging exceptionalism, you know: URIs are UTF-8,
https://bugzilla.wikimedia.org/show_bug.cgi?id=47368
--- Comment #7 from Yuvi Panda yuvipa...@gmail.com ---
*shrug* I should open a new bug to have the slaves of the production databases
set to utf8, but I am unsure how productive that would be.
--
You are receiving this mail because:
You are
https://bugzilla.wikimedia.org/show_bug.cgi?id=47368
--- Comment #1 from Yuvi Panda yuvipa...@gmail.com ---
Related is https://bugzilla.wikimedia.org/show_bug.cgi?id=45718 - if we change
charset for the database we should also ensure that the tables are changed too.
https://bugzilla.wikimedia.org/show_bug.cgi?id=47368
--- Comment #2 from Dario Taraborelli dtarabore...@wikimedia.org ---
Yuvi, I am not fully persuaded the charset should be consistent with MW's
default binary. Other databases used for data analysis on s1 host datasets
coming from various
https://bugzilla.wikimedia.org/show_bug.cgi?id=47368
--- Comment #3 from Yuvi Panda yuvipa...@gmail.com ---
So the problem is with Python. I'm using a bunch of scripts to manually 'join'
data from the production slaves (commons) and eventlogging, and am having to
treat the connections to log /
https://bugzilla.wikimedia.org/show_bug.cgi?id=47368
--- Comment #4 from Yuvi Panda yuvipa...@gmail.com ---
As per IRC, I think being internally consistent inside MySQL (all MySQL
databases have the same encoding, so you don't need to check it every time) is
more important than having *some* MySQL
https://bugzilla.wikimedia.org/show_bug.cgi?id=47368
--- Comment #5 from Yuvi Panda yuvipa...@gmail.com ---
https://gerrit.wikimedia.org/r/#/c/59880/ is the use case I am talking about.
DB handling code should not have to care about which database it is connecting
to...