[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2021-11-05 Thread zhuyifei1999
zhuyifei1999 added a comment. (trying to reproduce it again) TASK DETAIL https://phabricator.wikimedia.org/T181443 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: zhuyifei1999 Cc: Count_Count, Dvorapa, zhuyifei1999, TheSandDoctor, Mpaa, Aklapper,

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2021-11-04 Thread zhuyifei1999
zhuyifei1999 added a comment. Sorry, I think I was working on it last year and then forgot about this ticket. I'll check what I was doing back then. Is this bug still reproducible?

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2021-11-03 Thread TheSandDoctor
TheSandDoctor added a comment. @zhuyifei1999 any word?

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-09-12 Thread TheSandDoctor
TheSandDoctor added a comment. Thanks @zhuyifei1999 ! Have you been able to work on this any?

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-03-06 Thread Dvorapa
Dvorapa added a comment. Yeah, it seems like PoolCounter response

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-03-06 Thread Count_Count
Count_Count added a comment. In T181443#5931256 , @TheSandDoctor wrote: > I just discovered that rcwatcher.py crashed at some point within the past couple of days. Interesting. > [...] > requests.exceptions.HTTPError: 429 Client

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-03-01 Thread TheSandDoctor
TheSandDoctor added a comment. I just discovered that rcwatcher.py crashed at some point within the past couple of days. Interesting. Traceback (most recent call last): File "rcwatcher.py", line 65, in main() File "rcwatcher.py", line 57, in main

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-28 Thread TheSandDoctor
TheSandDoctor added a comment. In T181443#5927865, @zhuyifei1999 wrote: > Whichever is simplest. That would be test_rc.py.

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-28 Thread zhuyifei1999
zhuyifei1999 added a comment. Whichever is simplest.

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-28 Thread TheSandDoctor
TheSandDoctor added a comment. In T181443#5927759, @zhuyifei1999 wrote: > I will be running the script with pdb + save all sseclient trace over the weekend. Running test_rc.py, right?

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-28 Thread zhuyifei1999
zhuyifei1999 added a comment. I will be running the script with pdb + save all sseclient trace over the weekend.

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-28 Thread TheSandDoctor
TheSandDoctor added a comment. @Dvorapa I saw that myself & just updated my above comment prior to seeing your response. It appears to have slipped through, will have to add a catch for that. Others still relevant though. In T181443#5927703

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-28 Thread Dvorapa
Dvorapa added a comment. > pywikibot.exceptions.Error: Title contains illegal char (\uFFFD 'REPLACEMENT CHARACTER') This should not happen after the patch anymore?

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-28 Thread Dvorapa
Dvorapa added a comment. I tried this file and a random file to compare and Pywikibot is correct. The file has no imageinfo as it is a redirect. See the file page: https://commons.wikimedia.org/wiki/File:Vanessa_Mai_at_Gruenspan_2019_(3).png

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-28 Thread TheSandDoctor
TheSandDoctor added a comment. Since updating to the latest master version of sseclient (post-fix merges) more of the workers crashed than usual (4 of the 5). 3 of the 4 crashes were due to the same issue and the 4th was: 2020-02-27 23:07:01,324 __main__: DEBUG None 2020-02-27

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-28 Thread Dvorapa
Dvorapa added a comment. Hopefully this helps. There were more issues with sseclient 0.0.23 and 0.0.24 (https://github.com/wikimedia/pywikibot/commit/6cf25c0fe51991a892486a32211a434c695357b6), hopefully 0.0.25 will solve them all

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread TheSandDoctor
TheSandDoctor added a comment. @zhuyifei1999 requests has been updated & the workers/feeder all restarted. I have re-started test_rc.py and will post back here if anything crashes. If it is good in a few days/week or something like that I think we could consider this resolved. Thanks for

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread zhuyifei1999
zhuyifei1999 added a comment. Yes, what I was saying was, the first and the third are two separate consumers, so events on the first should also be received on the third. If there were something fundamentally wrong with the event data, then both would crash. Since this is not the case, there is

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread TheSandDoctor
TheSandDoctor added a comment. @zhuyifei1999 Yes, first and third are two separate consumers. The first and second are working with the same consumer. The third (code

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread zhuyifei1999
zhuyifei1999 added a comment. In T181443#5916753 , @TheSandDoctor wrote: > @zhuyifei1999 the first and second traceback are from "production" worker instances and pop items off of the same redis queue (all fed by a single instance of

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread TheSandDoctor
TheSandDoctor added a comment. @zhuyifei1999 the first and second traceback are from "production" worker instances and pop items off of the same redis queue (all fed by a single instance of rcwatcher.py), thus they wouldn't get the same image. So it isn't feasible that they would crash all

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread zhuyifei1999
zhuyifei1999 added a comment. In T181443#5916068 , @TheSandDoctor wrote: > The first traceback above > 2020-02-22 20:02:09 <- start > 2020-02-24 08:42:16 <- crash > The second traceback I posted above > 2020-02-22 20:02:20

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread TheSandDoctor
TheSandDoctor added a comment. 2020-02-22 20:02:20

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread TheSandDoctor
TheSandDoctor added a comment. In T181443#5915796, @Dvorapa wrote: > Also, what is the output of `echo $LC_CTYPE` and `echo $LANG`? It is weird that this happens only for some letters. I cannot reproduce it on my machine. If you use

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread Dvorapa
Dvorapa added a comment. Also, what is the output of `echo $LC_CTYPE` and `echo $LANG`? It is weird that this happens only for some letters. I cannot reproduce it on my machine. If you use Pywikibot packaged with scripts, could you share the output of `python pwb.py version`?
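
The same locale and encoding details can also be gathered from inside Python, which may be handy when the bot runs under a service manager whose environment differs from the login shell. This is only a sketch using the standard library; the function name `encoding_report` is made up for illustration and is not part of Pywikibot.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the locale/encoding settings the interpreter actually sees."""
    return {
        # Environment variables asked about in the comment (None if unset).
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
        "LANG": os.environ.get("LANG"),
        # Encoding Python uses by default for text files.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Encoding used when printing to the terminal or a log pipe.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        # Encoding used for file names.
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for key, value in encoding_report().items():
        print(f"{key}: {value}")
```

Running it in the same environment as the crashing worker shows whether anything other than UTF-8 is in play.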

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread TheSandDoctor
TheSandDoctor added a comment. @Dvorapa python 3

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread Dvorapa
Dvorapa added a comment. In T181443#5915665 , @TheSandDoctor wrote: > @Dvorapa all encoding is set to UTF-8. > @zhuyifei1999 Oops, thought I had. see here

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread TheSandDoctor
TheSandDoctor added a comment. @zhuyifei1999 Oops, thought I had. see here

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread zhuyifei1999
zhuyifei1999 added a comment. Would you mind posting the code of the 'minimal test case' somewhere?

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-25 Thread Dvorapa
Dvorapa added a comment. This seems like an encoding issue

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-24 Thread TheSandDoctor
TheSandDoctor added a comment. @zhuyifei1999 For both of these, it is worth noting that I have //not// updated to the latest version with the change in behaviour that this task merged. From a normal worker: 2020-02-24 08:42:16,590 __main__: INFO File:�রপক্ষ মন্দির থেকে ভক্তদের বের
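
A title that starts with U+FFFD, as in the log line above, is consistent with a multibyte UTF-8 sequence being cut at a chunk boundary and each chunk being decoded on its own. The thread does not confirm that this is the root cause, so the following is only an illustrative sketch of that failure mode using the standard library; the sample title and the helper names are made up for the example.

```python
import codecs

REPLACEMENT = "\ufffd"


def decode_chunks_naively(chunks):
    """Decode every chunk separately; a split character becomes U+FFFD."""
    return "".join(chunk.decode("utf-8", errors="replace") for chunk in chunks)


def decode_chunks_incrementally(chunks):
    """Carry incomplete byte sequences over to the next chunk."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    text = "".join(decoder.decode(chunk) for chunk in chunks)
    # Flush the decoder so a truly truncated stream is still reported.
    return text + decoder.decode(b"", final=True)


# Hypothetical title; U+00E9 is encoded as the two bytes C3 A9.
title = "File:Mus\u00e9e du quai Branly.jpg"
raw = title.encode("utf-8")
split_at = raw.index(b"\xc3\xa9") + 1  # cut between the two bytes
chunks = [raw[:split_at], raw[split_at:]]

naive = decode_chunks_naively(chunks)
safe = decode_chunks_incrementally(chunks)

print(REPLACEMENT in naive)  # the naive result contains U+FFFD
print(safe == title)         # the incremental result matches the original
```

If the stream client decodes per chunk, the replacement character would appear only when a boundary happens to fall inside a character, which would match the "happens at random" observation.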

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-22 Thread TheSandDoctor
TheSandDoctor added a comment. @zhuyifei1999 Unknown at this point. Implemented and running alongside it now. If either crashes will report back here.

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-22 Thread zhuyifei1999
zhuyifei1999 added a comment. > rcworker.py would just be reduced to a few lines Is redis queue critical to reproducing the issue? If not, that is an extra layer of complexity and a minimal reproducible test case does not need that.

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-22 Thread TheSandDoctor
TheSandDoctor added a comment. @zhuyifei1999 The only log currently available is as follows (and linked above): 2020-02-18 18:45:16,662 __main__: INFO File:PICT0430 - 301032 - onroerenderfgoed.jpg :Not corrupt. Stored Traceback (most recent call last): File

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-22 Thread zhuyifei1999
zhuyifei1999 added a comment. What is the traceback? Any minimal reproducible test case (even if it takes a long time to reproduce that's still something)?

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-22 Thread Dvorapa
Dvorapa added a comment. Now we need to know why this is happening, because I don't think there are that many new files on Commons with this character

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-22 Thread TheSandDoctor
TheSandDoctor added a comment. @Dvorapa Commons. The issue appears to happen at random. I have improved the ordering of my logs so next time it happens it should //hopefully// actually tell me the issue. Given that the files are only run from recent changes if they are new uploads, this

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-22 Thread Dvorapa
Dvorapa added a comment. I don't understand it either. Could you find out which file causes this (or does it happen for random files?) and also tell me which wiki you check? (Commons?)

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-22 Thread TheSandDoctor
TheSandDoctor added a comment. @Dvorapa grabs the file from recent changes using site_rc_listener and then sends it to the other script for processing. site_rc_listener is what must be giving

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-22 Thread Dvorapa
Dvorapa added a comment. I'm sorry, I don't understand what your script does and how. There should be no title on Commons with that character, and if there are links to a page with that character, the newly raised InvalidTitle error should always be caught in scripts. So what else

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-22 Thread TheSandDoctor
TheSandDoctor added a comment. @Dvorapa Just trying to create a file page reference given a name of a file from recent changes. Here is the error log. The line that causes the problem is `file_page = pywikibot.FilePage(site,

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-22 Thread gerritbot
gerritbot added a comment. Change 574111 **merged** by jenkins-bot: [pywikibot/core@master] [bugfix] Raise InvalidTitle for title containing illegal char https://gerrit.wikimedia.org/r/574111
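
With the illegal character reported through a dedicated exception, a worker can skip the offending title and continue instead of stopping. The sketch below shows that pattern in a self-contained form. The `InvalidTitle` class and `make_file_reference` function here are local stand-ins written for this example (assumptions, not Pywikibot's code); in a real script the `try`/`except` would wrap the page construction and catch the exception named in the change.

```python
ILLEGAL_CHAR = "\ufffd"  # U+FFFD 'REPLACEMENT CHARACTER'


class InvalidTitle(Exception):
    """Local stand-in for the exception named in the merged change."""


def make_file_reference(title):
    """Hypothetical stand-in for building a file page object from a title."""
    if ILLEGAL_CHAR in title:
        raise InvalidTitle(f"Title contains illegal char ({ILLEGAL_CHAR!r})")
    return {"title": title}


def process_titles(titles):
    """Build references for valid titles and collect the ones that were skipped."""
    accepted, skipped = [], []
    for title in titles:
        try:
            accepted.append(make_file_reference(title))
        except InvalidTitle:
            # Record the bad title and keep going instead of crashing.
            skipped.append(title)
    return accepted, skipped


if __name__ == "__main__":
    ok, bad = process_titles(["File:Example.jpg", "File:\ufffdExample.jpg"])
    print(len(ok), "accepted;", len(bad), "skipped")
```

Logging the skipped titles (rather than dropping them silently) keeps the evidence needed to find out where the character comes from.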

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-22 Thread Dvorapa
Dvorapa added a comment. In T181443#5909015 , @TheSandDoctor wrote: > Also: thanks for switching it to a catchable exception! :) Yeah, the error should be now better. > @Dvorapa But what would cause it to return it then when

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-21 Thread TheSandDoctor
TheSandDoctor added a comment. @Dvorapa But what would cause it to return it then when looking at images? I am sort of confused here.

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-21 Thread gerritbot
gerritbot added a comment. Change 574111 had a related patch set uploaded (by Dvorapa; owner: Dvorapa): [pywikibot/core@master] [bugfix] Throw InvalidTitle for title containing illegal char https://gerrit.wikimedia.org/r/574111

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2020-02-21 Thread Dvorapa
Dvorapa added a comment. In T181443#5907172 , @TheSandDoctor wrote: > @zhuyifei1999 Do you think that such a raise could be made? The problem that I see with both handlings though is that the titles are //not// "invalid" as they are

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER' in category add

2020-02-21 Thread TheSandDoctor
TheSandDoctor added a comment. @Fructibus Was this resolved or?

[Pywikipedia-bugs] [Maniphest] [Commented On] T181443: Pywikibot stops when finding the character \uFFFD - 'REPLACEMENT CHARACTER'

2017-11-27 Thread Mpaa
Mpaa added a comment. It is [[Category:Mus�e du quai Branly]] giving problems.