[Bug 27849] API: add normalized info also for unicode normalization of titles

2014-03-11 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Andre Klapper changed:
What | Removed | Added
Priority | High | Normal
--- Comment #29 from Andre Klap

[Bug 27849] API: add normalized info also for unicode normalization of titles

2014-02-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #28 from Gerrit Notification Bot ---
Change 22831 abandoned by Hashar:
(bug 27849) Add normalized info for Unicode normalization of titles
Reason: Cleaning up very old change. Feel free to resurrect if there is any interest in fini

[Bug 27849] API: add normalized info also for unicode normalization of titles

2013-03-07 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Nemo changed:
What | Removed | Added
CC | | federicol...@tiscali.it
See Also |

[Bug 27849] API: add normalized info also for unicode normalization of titles

2012-09-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Mark A. Hershberger changed:
What | Removed | Added
Target Milestone | 1.20.0 release | Future release
--- Comment #26 f

[Bug 27849] API: add normalized info also for unicode normalization of titles

2012-09-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #25 from Roan Kattouw 2012-09-05 21:40:50 UTC ---
Moved patch into Gerrit, see https://gerrit.wikimedia.org/r/#/c/22831/ . It doesn't actually work yet, because the unnormalized data needs to be armored to bypass ApiResult::cleanUp

[Bug 27849] API: add normalized info also for unicode normalization of titles

2012-09-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Roan Kattouw changed:
What | Removed | Added
AssignedTo | roan.katt...@gmail.com | wikibugs-l@lists.wikimedia.

[Bug 27849] API: add normalized info also for unicode normalization of titles

2012-02-08 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Sam Reed (reedy) changed:
What | Removed | Added
Target Milestone | 1.19.0 release | 1.20.0 release

[Bug 27849] API: add normalized info also for unicode normalization of titles

2012-01-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Mark A. Hershberger changed:
What | Removed | Added
Blocks | 29097 |
Target Milestone | ---

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-12-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Aaron Schulz changed:
What | Removed | Added
CC | | schulzaaro...@yahoo.de
--- Comment #24

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-11-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Sumana Harihareswara changed:
What | Removed | Added
Keywords | | reviewed
CC |

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-07-14 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Mark A. Hershberger changed:
What | Removed | Added
Blocks | 29068 | 29097
AssignedTo | bawolff

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-07-13 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Mark A. Hershberger changed:
What | Removed | Added
AssignedTo | roan.katt...@gmail.com | bawolff...@gmail.com

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-06-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #22 from Bawolff 2011-06-30 03:30:48 UTC ---
(In reply to comment #21)
> (In reply to comment #20)
> > leaving this as a deployment blocker since all that seems to be needed here
> > is a SMOP.
>
> This could potentially lead

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-06-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #21 from Bawolff 2011-06-30 03:29:49 UTC ---
(In reply to comment #20)
> leaving this as a deployment blocker since all that seems to be needed here is
> a SMOP.
This could potentially lead to invalid output for XML formats (since

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-06-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Mark A. Hershberger changed:
What | Removed | Added
Blocks | | 29068
--- Comment #20 from Mark

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-06-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #19 from Mark A. Hershberger 2011-06-29 16:42:09 UTC ---
Bryan, Bawolff, Could one of you take this and make the necessary changes to close the bug?

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-06-28 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Reedy changed:
What | Removed | Added
Blocks | 29068 |

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-06-16 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Mark A. Hershberger changed:
What | Removed | Added
Blocks | | 29068

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #18 from Bryan Tong Minh 2011-05-06 07:30:31 UTC ---
(In reply to comment #15)
> (In reply to comment #14)
> > Can't you do something like
> > $string2 = $string
> > UtfNormal::quickIsNFCVerify( $string2 );
> > $stringIsValidUTF8 =

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #17 from Bawolff 2011-05-06 00:48:34 UTC ---
btw, if I recall, we do some other normalization beyond NFC for ml and ar wikis (which is done only on wikis with those content languages for performance reasons, so if you get an interwik

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #16 from merl 2011-05-05 23:35:47 UTC ---
Just some statistics from my interwiki bot: Each of my API requests normally contains 50 title values. The title values themselves are the result of other API requests, so it should all be valid u

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
p858snake changed:
What | Removed | Added
Keywords | | patch

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #15 from Brion Vibber 2011-05-05 22:35:30 UTC ---
(In reply to comment #14)
> Can't you do something like
> $string2 = $string
> UtfNormal::quickIsNFCVerify( $string2 );
> $stringIsValidUTF8 = $string === $string2 ? true : false;
>

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Bawolff changed:
What | Removed | Added
CC | | bawolff...@gmail.com
--- Comment #14 from Ba

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #13 from Brion Vibber 2011-05-05 17:43:15 UTC ---
Honestly I don't think we have a good way to do that right now; UtfNormal combines it with the NFC stuff in quickIsNFCVerify(), and our fallbacks mean that a call to iconv() or mv_c

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #12 from Roan Kattouw 2011-05-05 17:32:41 UTC ---
(In reply to comment #11)
> So in short: don't worry about representing invalid UTF-8 byte sequences:
> either use a 'before' value that's been validated as UTF-8, or let the API
>

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #11 from Brion Vibber 2011-05-05 17:28:14 UTC ---
There are essentially two layers of work here, which our input validation merges into a single step:
1) invalid UTF-8 sequences must be found and replaced with valid placeholder ch
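
The two layers described in comment #11 can be kept apart, as in the rough Python sketch below. It uses Python's standard library as a stand-in for MediaWiki's PHP UtfNormal code, and `clean_input` is a hypothetical name. The excerpt is cut off before the second layer is stated, so treating it as NFC normalization is an assumption based on the bug title.

```python
import unicodedata


def clean_input(raw: bytes) -> str:
    """Hypothetical helper: validate first, then normalize."""
    # Layer 1: replace invalid UTF-8 sequences with the placeholder U+FFFD.
    text = raw.decode("utf-8", errors="replace")
    # Layer 2 (assumed): bring the now-valid text into NFC.
    return unicodedata.normalize("NFC", text)


# "e" followed by a combining acute accent composes to a single code point.
composed = clean_input(b"Cafe\xcc\x81")
# A stray 0xFF byte is not valid UTF-8 and becomes U+FFFD.
repaired = clean_input(b"ab\xffcd")
```

Keeping the steps separate means a caller could stop after layer 1 and still hold a valid string that has not yet been normalized.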

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #10 from Roan Kattouw 2011-05-05 17:21:54 UTC ---
(In reply to comment #9)
> Invalid UTF-8 is essentially random binary data and should thus be encoded, for
> example in base64.
Yeah. But I think it's fair not to offer this feat
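
As an illustration of the base64 idea quoted above, here is a minimal Python sketch. It is not taken from the patch; `describe_raw_value` and the `encoding`/`value` keys are made-up names for the example.

```python
import base64


def describe_raw_value(raw: bytes) -> dict:
    """Hypothetical helper: report a raw parameter value in a text-safe form."""
    try:
        # Valid UTF-8 can be passed through as ordinary text.
        return {"encoding": "utf-8", "value": raw.decode("utf-8")}
    except UnicodeDecodeError:
        # Invalid UTF-8 is treated as opaque binary data and base64-encoded.
        encoded = base64.b64encode(raw).decode("ascii")
        return {"encoding": "base64", "value": encoded}


valid = describe_raw_value(b"Main Page")
invalid = describe_raw_value(b"\xff\xfe")
```

The base64 form survives any later text normalization unchanged, because it contains only ASCII characters.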

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #9 from Bryan Tong Minh 2011-05-05 17:00:52 UTC ---
(In reply to comment #6)
> I could armor the from value to protect it from Unicode normalization (I've
> written code for that before; I threw it out but I should be able to repro

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Roan Kattouw changed:
What | Removed | Added
Attachment #8504 is obsolete | 0 | 1

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #7 from Roan Kattouw 2011-05-05 16:09:39 UTC ---
Created attachment 8504 --> https://bugzilla.wikimedia.org/attachment.cgi?id=8504
Stashing my work-in-progress changes here, this is as good a place as any

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-05-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Roan Kattouw changed:
What | Removed | Added
CC | | br...@wikimedia.org,

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-04-12 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
merl changed:
What | Removed | Added
CC | | bugrepor...@to.mabomuja.de
--- Comment #5 from

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-03-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
Mark A. Hershberger changed:
What | Removed | Added
Priority | Normal | High
CC |

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-03-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #4 from Bryan Tong Minh 2011-03-05 14:42:33 UTC ---
The normalization is done in getGPCValue. Just add a boolean parameter $normalize.
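
The suggestion in comment #4 amounts to making normalization optional at the point where request values are read. A loose Python sketch of that shape follows. `Request` and `get_value` are hypothetical stand-ins, not MediaWiki's actual WebRequest or getGPCValue signature.

```python
import unicodedata


class Request:
    """Hypothetical request wrapper holding raw parameter values."""

    def __init__(self, params):
        self._params = dict(params)

    def get_value(self, name, default=None, normalize=True):
        # Look up the raw value, falling back to the default if it is absent.
        value = self._params.get(name, default)
        # Normalize to NFC unless the caller explicitly asks for the raw form.
        if normalize and isinstance(value, str):
            value = unicodedata.normalize("NFC", value)
        return value


request = Request({"titles": "Cafe\u0301"})
normalized_title = request.get_value("titles")
original_title = request.get_value("titles", normalize=False)
```

Defaulting `normalize` to true keeps existing callers unchanged, while code that needs the original value can opt out.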

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-03-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #3 from Reedy 2011-03-05 14:40:56 UTC ---
Looks like we might need to cache it earlier... As it looks like whenever the normalize is called, it just overrides them all..

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-03-05 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #2 from Bryan Tong Minh 2011-03-05 14:28:30 UTC ---
We can add a function to WebRequest to return the original value instead of the normalized.

[Bug 27849] API: add normalized info also for unicode normalization of titles

2011-03-04 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=27849
--- Comment #1 from Brion Vibber 2011-03-05 00:22:10 UTC ---
IIRC, this normalization is applied on raw input in WebRequest, so the API code would only ever see the NFC form in the first place. For it to know anything had changed, it would hav
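
To make the point in comment #1 concrete: code can only report a change if it still has the raw value to compare against the NFC form. The Python sketch below shows that comparison under the assumption that the raw titles are available. `normalization_info` and the `from`/`to` keys are illustrative names and do not reproduce the API's real output.

```python
import unicodedata


def normalization_info(raw_titles):
    """Hypothetical helper: normalize titles and record which ones changed."""
    normalized = []
    changes = []
    for raw in raw_titles:
        nfc = unicodedata.normalize("NFC", raw)
        normalized.append(nfc)
        # Only titles whose NFC form differs from the input are reported.
        if nfc != raw:
            changes.append({"from": raw, "to": nfc})
    return normalized, changes


# The first title uses "e" plus a combining accent; the second is already NFC.
titles = ["Cafe\u0301", "Caf\u00e9"]
normalized, changes = normalization_info(titles)
```

If the input has already been normalized upstream, `changes` is always empty, which is the situation the comment describes.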