Re: [Xmldatadumps-l] Encoding issue in the last ZH dump

2013-01-08 Thread Ariel T. Glenn
The issue is that the bad character was added in 2004 (see https://zh.wikipedia.org/w/index.php?title=Wikipedia:%E6%96%B0%E9%97%BB%E7%A8%BF/2004%E5%B9%B42%E6%9C%88_%28%E7%AE%80%29&action=edit&oldid=386385), before there were aggressive checks for that sort of thing. Garbage in, garbage out...

Re: [Xmldatadumps-l] Encoding issue in the last ZH dump

2013-01-08 Thread Federico Leva (Nemo)
Ariel T. Glenn, 08/01/2013 09:26:
> The issue is that the bad character was added in 2004, see
> https://zh.wikipedia.org/w/index.php?title=Wikipedia:%E6%96%B0%E9%97%BB%E7%A8%BF/2004%E5%B9%B42%E6%9C%88_%28%E7%AE%80%29&action=edit&oldid=386385

I've requested removal and revdeletion: