Addshore added a project: Wikidata-Campsite.
Addshore moved this task from incoming to needs discussion or investigation on 
the Wikidata board.
Addshore added a comment.


  > Is batch+sleep the best approach? Do we need offset+limit?
  
  So, looking at the maint script and the job code that is actually run, batching 
already occurs.
  Firstly the maint script runs a job for each property synchronously.
  Within those jobs some batching, or something like batching, seems to be 
happening, but it is not configurable and doesn't look perfect.
  
https://github.com/wikimedia/mediawiki-extensions-WikibaseQualityConstraints/blob/26846fd7320993f13570392232b6e358a3378a4f/src/UpdateConstraintsTableJob.php#L131-L134
  Although batch size is 10, more than 10 things could actually end up being in 
a single batch in theory.
  
  The most important thing here is probably to wait for replication 
somewhere, which neither the script nor the job appears to do.
  It probably makes sense to do this just after the inserts, e.g. 
https://github.com/wikimedia/mediawiki-extensions-WikibaseQualityConstraints/blob/26846fd7320993f13570392232b6e358a3378a4f/src/UpdateConstraintsTableJob.php#L132
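
  To make the suggestion concrete, here is a minimal sketch of "insert a batch, then wait for replication". It is written in Python only for brevity and is not the extension's code: `chunked`, `insert_in_batches`, `insert_batch` and `wait_for_replication` are hypothetical names standing in for whatever the job and the database layer actually provide.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists holding at most batch_size rows each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


def insert_in_batches(
    rows: Iterable[T],
    batch_size: int,
    insert_batch: Callable[[List[T]], None],   # hypothetical: one INSERT per call
    wait_for_replication: Callable[[], None],  # hypothetical: blocks until replicas catch up
) -> int:
    """Insert rows batch by batch, waiting for replicas after every insert.

    Returns the number of batches written.
    """
    batches = 0
    for batch in chunked(rows, batch_size):
        insert_batch(batch)
        wait_for_replication()
        batches += 1
    return batches
```

  With this shape no batch can exceed `batch_size`, and the wait happens once per insert, so an extra arbitrary sleep should not be needed for the database's sake.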
  
  In terms of arbitrary sleeps that a runner of the maint script could pass in, 
we could do this, but DB wise there is probably no need.
  
  > What should the default batch size and sleep time be?
  
  I answered the sleep question above: it can just be a wait for replication.
  
  As for batch size, it looks like calling insertBatch only results in a single 
insert, which is nice.
  And the current default in the job means that right now we are mostly working 
with batches of 10.
  
https://github.com/wikimedia/mediawiki/blob/master/maintenance/Maintenance.php#L1709
 currently defines the default batch size for maintenance scripts as 200; it's 
probably fine to use that default and pass it down to the job.
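
  As a rough illustration of "use the script's default and pass it down to the job", the sketch below builds one set of job parameters per property and carries the batch size along. It is Python for brevity, not MediaWiki code: `UpdateJobParams` and `build_jobs` are hypothetical names, and the only value taken from the comment above is the default of 200.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Default quoted above for maintenance scripts.
DEFAULT_BATCH_SIZE = 200


@dataclass(frozen=True)
class UpdateJobParams:
    """Hypothetical parameters handed to one per-property job."""
    property_id: str
    batch_size: int = DEFAULT_BATCH_SIZE


def build_jobs(
    property_ids: Iterable[str],
    batch_size: Optional[int] = None,
) -> List[UpdateJobParams]:
    """Create one job per property, passing the script's batch size down.

    When the caller gives no batch size, the script-wide default is used.
    """
    size = DEFAULT_BATCH_SIZE if batch_size is None else batch_size
    if size < 1:
        raise ValueError("batch_size must be at least 1")
    return [UpdateJobParams(pid, size) for pid in property_ids]
```

  A runner who passes nothing gets batches of 200, and one who passes a smaller value gets that value in every job, so the job no longer needs its own hard-coded size.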

TASK DETAIL
  https://phabricator.wikimedia.org/T226635

WORKBOARD
  https://phabricator.wikimedia.org/project/board/71/
