[Wikidata-bugs] [Maniphest] [Commented On] T85970: Some Wikidata XML dumps are failing
ArielGlenn added a comment. Wikidata has been moved to the list of "big" wikis, which means jobs now run in parallel, cutting down on processing time. It truly is growing by leaps and bounds. We should be able to do two runs a month, as we just did in August: one full run including revision history, which will start at the beginning of the month, and one without revision history, which will probably start around the 20th of the month. Stubs and current articles should be available by the end of the first week of the month for the first run. Will this cover most folks' needs? TASK DETAIL https://phabricator.wikimedia.org/T85970 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: ArielGlenn Cc: Hydriz, Liuxinyu970226, JanZerebecki, ezachte, jeremyb, Krenair, aude, hoo, Lydia_Pintscher, ArielGlenn, MZMcBride, Aklapper, Wikidata-bugs, Svick, Malyacko ___ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
[Wikidata-bugs] [Maniphest] [Commented On] T85970: Some Wikidata XML dumps are failing
Hydriz added a subscriber: Hydriz. Hydriz added a comment.

Just an update on the dump progress for the last few wikidatawiki dumps:

mysql> SELECT subject,dumpdate,progress FROM archive WHERE subject="wikidatawiki";
+--------------+------------+----------+
| subject      | dumpdate   | progress |
+--------------+------------+----------+
| wikidatawiki | 2014-12-05 | error    |
| wikidatawiki | 2014-12-08 | error    |
| wikidatawiki | 2015-01-13 | progress |
| wikidatawiki | 2015-02-04 | progress |
| wikidatawiki | 2015-02-07 | done     |
| wikidatawiki | 2015-03-07 | done     |
| wikidatawiki | 2015-03-30 | done     |
| wikidatawiki | 2015-04-23 | error    |
| wikidatawiki | 2015-05-26 | progress |
| wikidatawiki | 2015-06-03 | done     |
| wikidatawiki | 2015-07-04 | done     |
| wikidatawiki | 2015-08-06 | done     |
| wikidatawiki | 2015-08-26 | progress |
+--------------+------------+----------+
13 rows in set (0.01 sec)

The 20150603 dump was the first successful parallel dump; hopefully we are on the right track in reducing the failures.
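For anyone who wants to tally the outcomes in the table above without access to that database, here is a minimal Python sketch. The rows are copied by hand from the query output; the names `ROWS` and `tally` are illustrative only and are not part of any dump tooling.

```python
from collections import Counter

# (dumpdate, progress) pairs copied by hand from the query output above.
ROWS = [
    ("2014-12-05", "error"),
    ("2014-12-08", "error"),
    ("2015-01-13", "progress"),
    ("2015-02-04", "progress"),
    ("2015-02-07", "done"),
    ("2015-03-07", "done"),
    ("2015-03-30", "done"),
    ("2015-04-23", "error"),
    ("2015-05-26", "progress"),
    ("2015-06-03", "done"),
    ("2015-07-04", "done"),
    ("2015-08-06", "done"),
    ("2015-08-26", "progress"),
]


def tally(rows):
    """Count how many dumps ended in each progress state."""
    return Counter(status for _, status in rows)


counts = tally(ROWS)
for status, n in sorted(counts.items()):
    print(f"{status}: {n}")
```

Note that "progress" only means the archive table never recorded a final state for that dump, so it should probably not be read as a failure on its own.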
[Wikidata-bugs] [Maniphest] [Commented On] T85970: Some Wikidata XML dumps are failing
ezachte added a comment. See also https://phabricator.wikimedia.org/T89273 for why synchronized dumps can help us get early monthly stats.
[Wikidata-bugs] [Maniphest] [Commented On] T85970: Some Wikidata XML dumps are failing
ezachte added a comment. If budget allows, let's run dumps more often. But one monthly cycle starting on the first day of each month is better than a 3-week continuous cycle (which grows in length every month anyway). The current scheme frustrates all those users who want monthly stats with only a reasonable delay (instead of 4 weeks after a month closes), as well as other users who require updates per full month.
[Wikidata-bugs] [Maniphest] [Commented On] T85970: Some Wikidata XML dumps are failing
Lydia_Pintscher added a comment. Agreed :)
[Wikidata-bugs] [Maniphest] [Commented On] T85970: Some Wikidata XML dumps are failing
hoo added a comment. We have to distinguish here: our JSON dumps will keep running on a weekly schedule, but the other dumps are apparently monthly (and we need those rather more often than less often).
[Wikidata-bugs] [Maniphest] [Commented On] T85970: Some Wikidata XML dumps are failing
ezachte added a comment. @Lydia_Pintscher, are you referring to Wikidata? For all practical purposes the current rate is once a month for Wikidata anyway. One exception since June 2014: two runs completed in Aug.
[Wikidata-bugs] [Maniphest] [Commented On] T85970: Some Wikidata XML dumps are failing
Lydia_Pintscher added a comment. We are talking about moving from how often to once a month?
[Wikidata-bugs] [Maniphest] [Commented On] T85970: Some Wikidata XML dumps are failing
Lydia_Pintscher added a comment. I don't think it is ok for our users to do it less often than it is at the moment.
[Wikidata-bugs] [Maniphest] [Commented On] T85970: Some Wikidata XML dumps are failing
ArielGlenn added a comment. I had a look at the previous failed runs to get a sense of what was going on. The causes are various: the dataset1001 host or the snapshot host being rebooted for security updates; the db server being either hung or having been depooled (I didn't check which); a fatal caused somewhere in the wikidatabase code. The lesson to be learned from this is that 20 days for a run is simply too long to guarantee a clean run without something else going wrong in the interim. This is another reason that I think Wikidata runs should be parallelized as we do for other large wikis, and moved in the short term to run after en wiki every month, and in the medium term to a new server along with other not-so-small wikis, if we need more than one run a month.
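The point that a 20-day run is too long to expect a clean finish can be illustrated with a back-of-the-envelope model. This is only a sketch under a loud assumption that is not from the thread: each day of a run has the same independent chance of being interrupted (reboot, depooled db server, fatal). The function name and the 3% daily rate are hypothetical, chosen purely for illustration.

```python
def clean_run_probability(daily_interrupt_rate: float, run_days: int) -> float:
    """Chance that a run of `run_days` days sees no interruption.

    Assumes (hypothetically) independent days with a constant
    interruption probability, so the result is (1 - p) ** n.
    """
    if not 0.0 <= daily_interrupt_rate <= 1.0:
        raise ValueError("daily_interrupt_rate must be between 0 and 1")
    if run_days < 0:
        raise ValueError("run_days must be non-negative")
    return (1.0 - daily_interrupt_rate) ** run_days


# Hypothetical 3% chance per day of something interrupting the run.
ASSUMED_DAILY_RATE = 0.03

for days in (5, 10, 20):
    p = clean_run_probability(ASSUMED_DAILY_RATE, days)
    print(f"{days:2d}-day run: {p:.0%} chance of finishing cleanly")
```

Whatever the real rate is, the chance of a clean run under this model shrinks geometrically with run length, which is the argument for parallelizing the jobs to shorten each run.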
[Wikidata-bugs] [Maniphest] [Commented On] T85970: Some Wikidata XML dumps are failing
ArielGlenn added a comment. Wikidata needs to be moved to the 'big wikis' queue at some point, and there are other not-so-small wikis that should be moved over as well. A question for Wikidata dumps users: is once a month often enough for the run, or do people need two complete runs? Once a month could be set up now, to run in the second half of each month after the en wiki dumps complete.
[Wikidata-bugs] [Maniphest] [Commented On] T85970: Some Wikidata XML dumps are failing
ezachte added a subscriber: ezachte. ezachte added a comment.

Here are recent dump times and outcomes:

wiki,date,run time in hms,run time in secs,,result
wikidatawiki,20140612,195:38:06,704286,-,done
wikidatawiki,20140705,278:35:10,1002910,-,done
wikidatawiki,20140731,78:04:51,281091,-,failed
wikidatawiki,20140804,293:44:58,1057498,-,done
wikidatawiki,20140823,307:52:31,1108351,-,done
wikidatawiki,20140912,312:18:05,1124285,-,done
wikidatawiki,20141009,345:25:42,1243542,-,done
wikidatawiki,20141106,367:02:42,1321362,-,done
wikidatawiki,20141205,65:27:10,235630,-,failed
wikidatawiki,20141208,54:19:11,195551,-,failed
wikidatawiki,20150113,111:13:44,400424,-,failed
wikidatawiki,20150204,0:01:28,88,-,failed
wikidatawiki,20150207,96:52:58,348778,-,running

Is wikidatawiki in the same queue as the small wikis? For many of those, the data are still from December.
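To make the growth in run time easier to see, the h:m:s column above can be converted to seconds and days. The following is a minimal Python sketch; the rows are copied by hand from the list, and the helper names `hms_to_seconds` and `mismatched_rows` are illustrative, not taken from any existing script.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# (date, run time in hms, run time in secs, result), copied from the list above.
ROWS = [
    ("20140612", "195:38:06", 704286, "done"),
    ("20140705", "278:35:10", 1002910, "done"),
    ("20140731", "78:04:51", 281091, "failed"),
    ("20140804", "293:44:58", 1057498, "done"),
    ("20140823", "307:52:31", 1108351, "done"),
    ("20140912", "312:18:05", 1124285, "done"),
    ("20141009", "345:25:42", 1243542, "done"),
    ("20141106", "367:02:42", 1321362, "done"),
    ("20141205", "65:27:10", 235630, "failed"),
    ("20141208", "54:19:11", 195551, "failed"),
    ("20150113", "111:13:44", 400424, "failed"),
    ("20150204", "0:01:28", 88, "failed"),
    ("20150207", "96:52:58", 348778, "running"),
]


def mismatched_rows(rows):
    """Return dates whose hms and secs columns disagree (a sanity check)."""
    return [date for date, hms, secs, _ in rows if hms_to_seconds(hms) != secs]


print("rows with inconsistent times:", mismatched_rows(ROWS))

# Show how long each completed run took, in days.
for date, hms, secs, result in ROWS:
    if result == "done":
        print(f"{date}: {secs / 86400:.1f} days")
```

Only the completed runs are printed, since the duration of a failed or still-running job says when it stopped or was sampled, not how long a full run takes.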