[Pywikipedia-bugs] [Maniphest] [Commented On] T202189: diff-checker.py counts bytes not unicodes

2018-08-19 Thread Dalba
Dalba added a comment. There are some packages like wcwidth that can detect the display width of a line, but even they are not 100% accurate. The real display width depends on other factors like the user's installed fonts and Unicode version. There are also some characters that when put next to each

[Pywikipedia-bugs] [Maniphest] [Commented On] T202189: diff-checker.py counts bytes not unicodes

2018-08-19 Thread Xqt
Xqt added a comment. There are some hidden characters inside like "়া" (without the last one) which are counted with len(), but I have 5 characters available until the right side of an 80 column page. Maybe this is such a minor problem that we could just decline it. I guess it is not worth to
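
The mismatch described above, where len() counts marks that take up no column of their own, can be approximated with the standard library alone. This is only a rough sketch and not what diff-checker.py does: the helper name approx_columns is hypothetical, and treating the categories Mn, Me and Cf as zero-width is an assumption that will not match every font or terminal.

```python
import unicodedata

# Categories assumed to take no column of their own (an approximation):
# Mn = nonspacing mark, Me = enclosing mark, Cf = format character.
ZERO_WIDTH_CATEGORIES = {"Mn", "Me", "Cf"}


def approx_columns(text):
    """Count code points, skipping those assumed to be zero-width."""
    return sum(
        1 for ch in text
        if unicodedata.category(ch) not in ZERO_WIDTH_CATEGORIES
    )


# The first Bengali month name from the reported line, written with
# escapes so the exact code points are unambiguous.
word = "\u099c\u09be\u09a8\u09c1\u09af\u09bc\u09be\u09b0\u09bf"

code_points = len(word)        # every code point, marks included
columns = approx_columns(word)  # nonspacing marks left out
```

Here len() sees all nine code points, while the approximation skips the two nonspacing marks (the vowel sign U+09C1 and the nukta U+09BC).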

[Pywikipedia-bugs] [Maniphest] [Commented On] T202189: diff-checker.py counts bytes not unicodes

2018-08-18 Thread Dalba
Dalba added a comment. Could you be more specific about the lines with false positives? I tested the first error (line 557 of PS2) and it does indeed seem to be longer than 79 characters: >>> len(u"'bn': lambda v: slh(v, ['জানুয়ারি', 'ফেব্রুয়ারি', 'মার্চ', 'এপ্রিল', 'মে',") 84
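
A small sketch of the distinction the task title raises, bytes versus Unicode code points, assuming Python 3 and UTF-8 encoding. It uses the last month name from the tested line and is only an illustration, not code from diff-checker.py.

```python
# "মে" written with escapes: BENGALI LETTER MA + BENGALI VOWEL SIGN E.
month = "\u09ae\u09c7"

# len() on a str counts code points ...
code_point_count = len(month)

# ... while len() on the encoded bytes counts UTF-8 bytes
# (three bytes for each of these code points).
byte_count = len(month.encode("utf-8"))
```

So a check based on len() of a Unicode string already counts code points rather than bytes; what remains is the gap between code points and displayed columns.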