[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread gerritbot
gerritbot added a comment. Change 378072 merged by jenkins-bot: [mediawiki/extensions/EventBus@master] Allow unicode in serialized events. https://gerrit.wikimedia.org/r/378072

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread Pchelolo
Pchelolo added a comment. In T175316#3608922, @GWicke wrote: "Looks like adding the JSON_UNESCAPED_UNICODE flag should do it: http://php.net/manual/en/function.json-encode.php" We use the JsonFormatter class to encode JSON, so the solution is a bit different, but the above patch takes care of

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread gerritbot
gerritbot added a comment. Change 378072 had a related patch set uploaded (by Ppchelko; owner: Ppchelko): [mediawiki/extensions/EventBus@master] Allow unicode in serialized events. https://gerrit.wikimedia.org/r/378072

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread GWicke
GWicke added a comment. Looks like adding the JSON_UNESCAPED_UNICODE flag should do it: http://php.net/manual/en/function.json-encode.php
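
A minimal sketch of why the escaping flag matters for event size, written in Python rather than PHP. It assumes Python's `ensure_ascii=False` as a stand-in for PHP's JSON_UNESCAPED_UNICODE; the `encoded_sizes` helper and the sample titles are hypothetical and are not taken from the EventBus patch.

```python
import json


def encoded_sizes(payload):
    """Return (escaped_bytes, unescaped_bytes) for one JSON payload."""
    # ensure_ascii=True writes every non-ASCII character as a \uXXXX escape.
    escaped = json.dumps(payload, ensure_ascii=True).encode("utf-8")
    # ensure_ascii=False keeps the characters as literal UTF-8.
    unescaped = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return len(escaped), len(unescaped)


# Hypothetical non-ASCII page titles, standing in for a job's "pages" parameter.
titles = ["Москва", "Санкт-Петербург", "東京都"]
escaped_size, unescaped_size = encoded_sizes({"pages": titles})
print(escaped_size, unescaped_size)
```

The saving depends on the script: each escaped character costs six bytes, against two or three bytes as literal UTF-8 for the titles above, so the actual ratio for a real event may differ from any single estimate.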

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread daniel
daniel added a comment. For now, let's just try to get the size of the jobs below the 4MB mark? :) If you fix your encoding ;)

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread mobrovac
mobrovac added a comment. In T175316#3608388, @daniel wrote: "@mobrovac how about a very large number of very small jobs? e.g. a million jobs to purge a million pages from cdn?" Note that we introduced batching only a few weeks ago, at the explicit request of the performance folks. We had one job

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread daniel
daniel added a comment. @mobrovac how about a very large number of very small jobs? e.g. a million jobs to purge a million pages from cdn?

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread Pchelolo
Pchelolo added a comment. @daniel wrote: "@Pchelolo actually, can you confirm how many entries there were in the "pages" parameter? With the latest patches deployed, there should be no more than 20. Perhaps this is an old job getting retried, because it failed earlier?" There's definitely more than 20 entries

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread daniel
daniel added a comment. @Pchelolo actually, can you confirm how many entries there were in the "pages" parameter? With the latest patches deployed, there should be no more than 20. Perhaps this is an old job getting retried, because it failed earlier?

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread mobrovac
mobrovac added a comment. In T175316#3608364, @daniel wrote: "We can tweak the chunk size - more jobs, or larger jobs, your pick." Since in the new JQ system all jobrunners will run all jobs, a higher number of smaller jobs is preferred over a smaller number of big jobs.

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread daniel
daniel added a comment. @Pchelolo disabling escaping of non-ascii characters would probably reduce the size to a fourth... I'm not sure what kind of improvement you mean. We can tweak the chunk size - more jobs, or larger jobs, your pick.

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread Pchelolo
Pchelolo added a comment. After the patch was deployed the situation improved a lot, but we've got a 5 MB event today: https://people.wikimedia.org/~ppchelko/event 5 MB is not critically large; we can increase the limit in Kafka to 8 MB, I think, but maybe we should do some more improvements in

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-14 Thread gerritbot
gerritbot added a comment. Change 377812 merged by jenkins-bot: [mediawiki/extensions/Wikibase@wmf/1.30.0-wmf.18] Split page set before constructing InjectRCRecordsJob https://gerrit.wikimedia.org/r/377812

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-13 Thread Stashbot
Stashbot added a comment. Mentioned in SAL (#wikimedia-operations) [2017-09-13T23:29:57Z] Synchronized php-1.30.0-wmf.18/extensions/Wikidata/extensions/Wikibase/client: Split page set before constructing InjectRCRecordsJob (T175316) (duration: 00m 57s)

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-13 Thread gerritbot
gerritbot added a comment. Change 377897 merged by jenkins-bot: [mediawiki/extensions/Wikidata@wmf/1.30.0-wmf.18] Split page set before constructing InjectRCRecordsJob https://gerrit.wikimedia.org/r/377897

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-13 Thread gerritbot
gerritbot added a comment. Change 377897 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani): [mediawiki/extensions/Wikidata@wmf/1.30.0-wmf.18] Split page set before constructing InjectRCRecordsJob https://gerrit.wikimedia.org/r/377897

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-13 Thread gerritbot
gerritbot added a comment. Change 377811 merged by jenkins-bot: [mediawiki/extensions/Wikibase@master] Split page set before constructing InjectRCRecordsJob https://gerrit.wikimedia.org/r/377811

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-13 Thread gerritbot
gerritbot added a comment. Change 377812 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler): [mediawiki/extensions/Wikibase@wmf/1.30.0-wmf.18] Split page set before constructing InjectRCRecordsJob https://gerrit.wikimedia.org/r/377812

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-13 Thread gerritbot
gerritbot added a comment. Change 377811 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler): [mediawiki/extensions/Wikibase@master] Split page set before constructing InjectRCRecordsJob https://gerrit.wikimedia.org/r/377811

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-13 Thread Pchelolo
Pchelolo added a comment. Here's an example of a very large event: https://people.wikimedia.org/~ppchelko/large_event It's not an event itself, it's a log message from #eventbus but the event is embedded in the log message.

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-13 Thread daniel
daniel added a comment. InjectRCRecords batches inserts when running the job, but doesn't chop the batch up before scheduling the job. I can easily fix that. The patch should be back-portable, too. Give me a minute...
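
The idea of chopping the page set up before scheduling, rather than only when the job runs, can be sketched roughly as follows. This is Python, not the actual Wikibase PHP patch; `chunk_pages`, `build_jobs`, the job dictionary layout and the default chunk size of 20 (the figure mentioned elsewhere in this thread) are all assumptions made for illustration.

```python
from itertools import islice


def chunk_pages(pages, chunk_size=20):
    """Yield consecutive lists of at most chunk_size pages."""
    iterator = iter(pages)
    while True:
        batch = list(islice(iterator, chunk_size))
        if not batch:
            return
        yield batch


def build_jobs(pages, chunk_size=20):
    """Build one hypothetical job description per chunk of pages."""
    # Each job carries only its own chunk, so no single serialized job
    # grows with the total number of affected pages.
    return [
        {"type": "InjectRCRecords", "params": {"pages": batch}}
        for batch in chunk_pages(pages, chunk_size)
    ]


jobs = build_jobs(list(range(45)), chunk_size=20)
print([len(job["params"]["pages"]) for job in jobs])
```

Splitting at scheduling time trades one large job for several small ones, which matches the preference stated in this thread for a higher number of smaller jobs.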

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-13 Thread Ladsgroup
Ladsgroup added a comment. Can I examine the job logs in more depth? The pages param can't have more than 100 entries (the old setting), which we changed to 50 and now to 20.

[Wikidata-bugs] [Maniphest] [Commented On] T175316: Very large jobs posted by Wikidata

2017-09-08 Thread Pchelolo
Pchelolo added a comment. In T175316#3591889, @GWicke wrote: "@Pchelolo, based on our previous conversation about this I am assuming that the bulk of the task is a very large list of pages. Is this correct?" Yes, in the actual event the params.pages array contains millions and millions of