[Wikidata-bugs] [Maniphest] [Commented On] T226093: Capacity planning for Commons Structured Data

2020-07-07 Thread ArielGlenn
ArielGlenn added a comment.


  Updated. F31919691: commons_slots_new.png


TASK DETAIL
  https://phabricator.wikimedia.org/T226093

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Ladsgroup, Abit, matthiasmullie, Marostegui, Mholloway, Addshore, 
Ramsey-WMF, jcrespo, Yann, MarkTraceur, ArielGlenn, Aklapper, lmata, jannee_e, 
CBogen, Akuckartz, darthmon_wmde, Legado_Shulgin, Nandana, JKSTNK, 
Davinaclare77, Qtn1293, Techguru.pc, Lahi, PDrouin-WMF, Gq86, E1presidente, 
Cparle, Anooprao, SandraF_WMF, GoranSMilovanovic, Lunewa, Th3d3v1ls, Hfbn0, 
QZanden, Tramullas, Acer, LawExplorer, Salgo60, Zppix, Silverfish, _jensen, 
rosalieper, Scott_WUaS, Susannaanas, Wong128hk, gnosygnu, Jane023, 
Wikidata-bugs, Base, aude, Ricordisamoa, Wesalius, Lydia_Pintscher, 
Fabrice_Florin, Raymond, faidon, Steinsplitter, Mbch331, Rxy, Jay8g, fgiunchedi
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T226093: Capacity planning for Commons Structured Data

2019-12-11 Thread ArielGlenn
ArielGlenn added a comment.


  F31470388: commons_slots.png
generated via a quickie one-off script:
https://github.com/apergos/misc-wmf-crap/blob/master/sdc-growth/get_slot_growth.py


[Wikidata-bugs] [Maniphest] [Commented On] T226093: Capacity planning for Commons Structured Data

2019-11-27 Thread ArielGlenn
ArielGlenn added a comment.


  Do we have a meeting scheduled to talk about capacity needs?


[Wikidata-bugs] [Maniphest] [Commented On] T226093: Capacity planning for Commons Structured Data

2019-11-13 Thread matthiasmullie
matthiasmullie added a comment.


  In T226093#5657414, @Ramsey-WMF wrote:
  
  > Matthias will look into discrepancies between number of files with 
  > mediainfo slots vs. what's indexed in Cirrus.
  
  It looks like the difference is mostly revisions vs. files.
  
  The 5.5+ million figure closely matches the number of revisions with a 
mediainfo `slots` record:
  `SELECT COUNT(*) FROM slots WHERE slot_role_id = 2` = **6161688**
  That number includes all edits, though (SDC edits as well as regular file 
page edits made after the page got its first structured data).
  
  mediainfo slots grouped by page are closer to the results we get from 
Cirrus:
  `SELECT COUNT(DISTINCT rev_page) FROM slots INNER JOIN revision ON 
slot_revision_id = rev_id WHERE slot_role_id = 2` = **2918327**
  (This number also includes pages that once had structured data that has 
since been deleted.)
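
  To illustrate the revisions-vs-pages difference, here is a minimal sketch 
that runs the same two queries against a toy in-memory SQLite database. The 
cut-down schema, the sample rows, and the helper names are assumptions made up 
for illustration; only the two SELECT statements come from this comment.

```python
import sqlite3

# Toy schema: only the columns the two queries touch (an assumption,
# not the real table layout).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER);
    CREATE TABLE slots (slot_revision_id INTEGER, slot_role_id INTEGER);
""")

# Role ids are assumptions; 2 follows the slot_role_id = 2 filter above.
MAIN, MEDIAINFO = 1, 2

# Made-up data: page 10 has three revisions carrying a mediainfo slot,
# page 20 has one, and page 30 has a revision with a main slot only.
conn.executemany(
    "INSERT INTO revision VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 10), (4, 20), (5, 30)],
)
conn.executemany(
    "INSERT INTO slots VALUES (?, ?)",
    [(1, MAIN), (1, MEDIAINFO), (2, MAIN), (2, MEDIAINFO),
     (3, MAIN), (3, MEDIAINFO), (4, MAIN), (4, MEDIAINFO), (5, MAIN)],
)


def count_mediainfo_revisions(db):
    """Count slot rows with the mediainfo role: one per revision."""
    return db.execute(
        "SELECT COUNT(*) FROM slots WHERE slot_role_id = 2"
    ).fetchone()[0]


def count_mediainfo_pages(db):
    """Count distinct pages that have at least one such revision."""
    return db.execute(
        "SELECT COUNT(DISTINCT rev_page) FROM slots "
        "INNER JOIN revision ON slot_revision_id = rev_id "
        "WHERE slot_role_id = 2"
    ).fetchone()[0]


print(count_mediainfo_revisions(conn))  # 4
print(count_mediainfo_pages(conn))      # 2
```

  In the toy data, four revisions collapse to two pages, which is the same 
kind of gap as between the two counts above.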


[Wikidata-bugs] [Maniphest] [Commented On] T226093: Capacity planning for Commons Structured Data

2019-11-12 Thread Marostegui
Marostegui added a comment.


  I would like to know if there is some work going on to be able to split those 
tables from s4 into their own set of servers. My understanding is that it 
wasn't possible and that's why the tables were created directly on s4 (where 
Commons lives).
  Sharing the same set of servers with commonswiki means that, if growth of the 
SDC-related tables continues, sooner or later those tables will need to be 
moved out into their own set of servers (as we advised when we were first 
involved in the conversations about SDC).


[Wikidata-bugs] [Maniphest] [Commented On] T226093: Capacity planning for Commons Structured Data

2019-11-08 Thread ArielGlenn
ArielGlenn added a comment.


  As evidenced by https://graphite.wikimedia.org/S/i we already have 5.5 
million images with contents in the MediaInfo slot. With two months to go until 
the end of the year, we can already see how low the prediction was compared to 
the actual number.


[Wikidata-bugs] [Maniphest] [Commented On] T226093: Capacity planning for Commons Structured Data

2019-07-08 Thread ArielGlenn
ArielGlenn added a comment.


  I've commented about this over on the other ticket. Let's see what they say.


[Wikidata-bugs] [Maniphest] [Commented On] T226093: Capacity planning for Commons Structured Data

2019-07-05 Thread ArielGlenn
ArielGlenn added a comment.


  I keep pretty decent tabs on wikidata growth, because of the dumps. I don't 
do that for commons entities because I can't even find the proper wikibase 
tables. I checked the wb_* tables on commonswiki and they all appear to be 
empty (?!)
  
  I can do some very rough numbers gathering by periodically getting the max 
slotid and the max revid, which would at least let us track those two trends. 
You don't by any chance track those two numbers already?
  
  I think we'll have bots soon enough that help with the issue of adding 
captions for already uploaded files, and then you'll see huge growth in the 
number of slots. The captions are in a separate slot, right?  It would be nice 
to be able to track the growth in the number of specific slots (depicts, 
caption, anything else on the short-to-mid-term horizon).
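
  The rough tracking idea above (sample a maximum id periodically, then look at 
the trend) could be sketched as follows. This is only an illustration: the 
function name `daily_growth` and the sample values are made up, and how the 
maximum ids would actually be fetched is left out.

```python
from datetime import date


def daily_growth(samples):
    """Return (date, ids_per_day) for each pair of consecutive samples.

    samples: list of (date, max_id) tuples sorted by date.
    """
    rates = []
    for (d0, v0), (d1, v1) in zip(samples, samples[1:]):
        days = (d1 - d0).days
        if days <= 0:
            raise ValueError("samples must be strictly increasing in date")
        rates.append((d1, (v1 - v0) / days))
    return rates


# Made-up weekly samples of a maximum revision id (not real measurements).
rev_samples = [
    (date(2019, 7, 1), 1000),
    (date(2019, 7, 8), 1700),
    (date(2019, 7, 15), 3100),
]

for day, rate in daily_growth(rev_samples):
    print(day.isoformat(), rate)
# 2019-07-08 100.0
# 2019-07-15 200.0
```

  The same function would work unchanged for a second series of slot-related 
ids, so the two trends could be compared side by side.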


[Wikidata-bugs] [Maniphest] [Commented On] T226093: Capacity planning for Commons Structured Data

2019-06-20 Thread Addshore
Addshore added a comment.


  So there are some details on 
https://wikitech.wikimedia.org/wiki/WMDE/Wikidata/Growth, but I haven't written 
too much about MediaInfo yet.
  
  Details about number of entities: 
https://wikitech.wikimedia.org/wiki/WMDE/Wikidata/Growth#MediaInfo
  Details about number of revisions: 
https://wikitech.wikimedia.org/wiki/WMDE/Wikidata/Growth#Commons
  
  I didn't bother predicting the revision count for commons too much yet as 
wikidata will likely hit the big int mark before commons.
  
  Naturally edit rate on commons is likely going to be heading up (not sure if 
anyone is tracking this yet).
  For Wikidata currently we have 
https://grafana.wikimedia.org/d/00170/wikidata-edits
  Even with an extremely high edit rate I think we should be able to spot most 
capacity issues quite a while before they happen.
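
  As a rough way to put a number on "quite a while", here is a minimal sketch 
of a linear projection of when an id counter would reach a ceiling. The 
function name, the example figures, and the choice of a signed 32-bit ceiling 
are all assumptions for illustration; the actual limit and edit rate would 
need to be filled in from real data.

```python
import math


def days_until_limit(current_id, ids_per_day, limit):
    """Days until current_id reaches limit, assuming constant growth."""
    if ids_per_day <= 0:
        return math.inf  # no growth: the limit is never reached
    remaining = limit - current_id
    if remaining <= 0:
        return 0  # already at or past the limit
    return math.ceil(remaining / ids_per_day)


# Assumption: a signed 32-bit ceiling, used here only as an example limit.
SIGNED_32BIT_MAX = 2**31 - 1

# Made-up figures: one billion ids used, one million new ids per day.
print(days_until_limit(1_000_000_000, 1_000_000, SIGNED_32BIT_MAX))  # 1148
```

  A constant rate is the simplest possible model; if the edit rate keeps 
rising, the real date would be earlier than this estimate.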


[Wikidata-bugs] [Maniphest] [Commented On] T226093: Capacity planning for Commons Structured Data

2019-06-20 Thread Ramsey-WMF
Ramsey-WMF added a comment.


  @ArielGlenn here's a *tentative* roadmap that provides a high-level view of 
the SDC work we have planned for the rest of the calendar year. Anything beyond 
Dec. 31 is still uncertain at this time. 
https://docs.google.com/presentation/d/1hdqodLhi9Ym-BtLNyhfHKAnTcPMcLOV6hocNQqwiqLE/edit?usp=sharing


[Wikidata-bugs] [Maniphest] [Commented On] T226093: Capacity planning for Commons Structured Data

2019-06-20 Thread MarkTraceur
MarkTraceur added a comment.


  @ArielGlenn 
https://grafana.wikimedia.org/d/00175/wikidata-datamodel-statements?refresh=30m=4=1
 <-- average statements per item on Wikidata
  
  Let me find an up to date roadmap for you.
