[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-05-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE changed the task status from "Open" to "Stalled".

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-05-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE changed the task status from "Open" to "Stalled".

TASK DETAIL
  https://phabricator.wikimedia.org/T361203



[Wikidata-bugs] [Maniphest] T360296: [Analytics] Implement data process to identify missing Wiktionary entries

2024-05-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360296



[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Thanks for taking care of this, @Lucas_Werkmeister_WMDE! We'll be able to 
close both this and T351072 <https://phabricator.wikimedia.org/T351072> after 
Tuesday next week if/when the Puppet change is deployed :)

TASK DETAIL
  https://phabricator.wikimedia.org/T364965



[Wikidata-bugs] [Maniphest] T365457: Bring in all Purdue Program PRs and upload Mismatch Finder mismatches

2024-05-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T365457



[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  @BTullis, checking in on this, as your help in T358311 
<https://phabricator.wikimedia.org/T358311> reminded me of it and it's all 
related to the same user. Would you also be able to remove the 
`statistics/manifests/wmde/wdcm.pp` file and any related processes (including 
those now on stat1011)?

TASK DETAIL
  https://phabricator.wikimedia.org/T364965



[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Thank you, @BTullis! Yeah, I wasn't happy with the solution either. Appreciate 
your willingness to help!

TASK DETAIL
  https://phabricator.wikimedia.org/T358311



[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  I'm realizing also that I don't have admin rights and thus can't move files 
to your directory. I'll copy these files over to my directory, download them 
and send you a link to a zipped directory on Google Drive once we have the 
above figured out.

TASK DETAIL
  https://phabricator.wikimedia.org/T358311



[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Hi @Manuel, checking further as it's still not clear what you'd like. The 
double "except" is confusing. I'll only transfer files from `stat1005`. Could 
you answer the following questions:
  
  1. Do you want **data files** (.csv, .tsv, etc) __before 2020__? (assumption: 
no)
  2. Do you want **data files** __after 2020__? (as of now unclear)
  3. Do you want **non data files** (.py, .R, etc) __before 2020__? (as of now 
unclear)
  4. Do you want **non data files** __after 2020__? (assumption: yes)
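
  The four questions above amount to a two-way split by file type and by date. 
A minimal Python sketch of that split follows. It assumes that only `.csv` and 
`.tsv` count as data files (the "etc" in the list is not expanded here), that a 
file's modification time is the relevant date, and that 2020 itself falls on 
the "after" side; `DATA_EXTENSIONS`, `bucket` and `bucket_tree` are 
hypothetical names, not part of any existing tooling.

```python
from datetime import datetime, timezone
from pathlib import Path

# Assumption: the thread only names .csv and .tsv explicitly as data files.
DATA_EXTENSIONS = {".csv", ".tsv"}
# Assumption: files modified in 2020 or later are treated as "after 2020".
CUTOFF_YEAR = 2020


def bucket(name: str, modified: datetime) -> str:
    """Return which of the four questions a file falls under."""
    kind = "data" if Path(name).suffix.lower() in DATA_EXTENSIONS else "non-data"
    period = "before 2020" if modified.year < CUTOFF_YEAR else "after 2020"
    return f"{kind}, {period}"


def bucket_tree(root: Path) -> dict:
    """Group every regular file under root by bucket, using its mtime."""
    groups = {}
    for path in root.rglob("*"):
        if path.is_file():
            modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
            groups.setdefault(bucket(path.name, modified), []).append(path)
    return groups
```

  Calling `bucket_tree(Path("some/home/dir"))` would give one list of paths per 
question, which makes it easy to answer each question with a concrete file list.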

TASK DETAIL
  https://phabricator.wikimedia.org/T358311



[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Hi @Manuel - sending along a summary of what I'll be getting for you:
  
== stat1004 ==
Jul 25  2020 Analytics
Jun 23  2020 Experiments
Jul 25  2020 wdUsagePerPage

== stat1005 ==
All non data files

== stat1007 ==
Aug 23  2020 Analytics
Jan 27  2020 Experiments
Aug 23  2020 RScripts

== stat1008 ==
Oct 11  2021 Analytics
Jun 23  2020 R

== HDFS ==
2021-11-02 17:37 /user/goransm/dewiki_revisions
2021-04-11 16:51 /user/goransm/wdtranslationsb
No other files, as everything after 2020 is a data file or ORES related 
(this is coming in the stat server files anyway)
  
  TSVs, CSVs and other data file types will not be included in the transfer. 
For convenience, I'm going to transfer the files into your directory on the 
given server.
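
  A transfer that leaves out data files could be sketched as below. This is 
only an illustration, assuming that "data file types" means `.csv` and `.tsv` 
and that a plain recursive copy is acceptable; `skip_data_files` and 
`copy_without_data` are hypothetical helper names.

```python
import shutil
from pathlib import Path

# Assumption: only these extensions are treated as data files.
DATA_EXTENSIONS = {".csv", ".tsv"}


def skip_data_files(directory, names):
    """`ignore` callback for shutil.copytree: return the names to leave out."""
    return {
        name
        for name in names
        if Path(name).suffix.lower() in DATA_EXTENSIONS
        and not (Path(directory) / name).is_dir()
    }


def copy_without_data(src: Path, dst: Path) -> None:
    """Copy src to dst recursively, skipping files that look like data."""
    shutil.copytree(src, dst, ignore=skip_data_files)
```

  Directories are never skipped by the callback, so a folder that happens to 
end in `.csv` would still be walked and its scripts copied.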

TASK DETAIL
  https://phabricator.wikimedia.org/T358311



[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Ok then!
  
  So the check of the files above is complete, as shown by its status. General 
summaries of each stat machine and HDFS are provided under the subsections 
above. `stat1005` has some files that @Manuel may find interesting given that 
they're for prior tasks of his. Queries that looked like they could be 
interesting, or that were in files with interesting-sounding names, but turned 
out not to be are printed above for documentation.
  
  Overall I can say that anything from the above would be easier to rebuild from 
scratch via the docs and by checking with WMDE engineers or WMF Data 
Engineering/Analytics than to go through and re-implement. I 
personally would not keep anything, and will delete the files I copied over to 
my `stat1005` directory once this is closed :)
  
  Thanks again @JAllemandou for the file lists, and thanks @brouberol for the 
ping!

TASK DETAIL
  https://phabricator.wikimedia.org/T358311



[Wikidata-bugs] [Maniphest] T351072: Remove the WDCM clone (stats1007)

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  So basically removing the wdcm.pp related file on GitHub and its Puppet 
workflows will close both tasks :)

TASK DETAIL
  https://phabricator.wikimedia.org/T351072



[Wikidata-bugs] [Maniphest] T351072: Remove the WDCM clone (stats1007)

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Ah looking at this, I'm realizing I restated myself as the work that's left 
in T364965: stat1007 to stat1011 migration pipeline output check 
<https://phabricator.wikimedia.org/T364965> is a duplicate of what we want to 
do here :)

TASK DETAIL
  https://phabricator.wikimedia.org/T351072



[Wikidata-bugs] [Maniphest] T351072: Remove the WDCM clone (stats1007)

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Hey @Arian_Bozorg! Yes, we do still need to check this out. I was thinking 
that @Lucas_Werkmeister_WMDE and I could discuss this when we chat about what 
else is needed in T364965: stat1007 to stat1011 migration pipeline output check 
<https://phabricator.wikimedia.org/T364965>. In that one we've now confirmed 
that the data is coming in from stat1011, so at this point it'd be good to 
delete statistics/manifests/wmde/wdcm.pp 
<https://github.com/wikimedia/operations-puppet/blob/production/modules/statistics/manifests/wmde/wdcm.pp>
 and also remove its workflow from Puppet (I'm just not quite sure if I have 
access or how to go about the Puppet work).
  
  I'm hopeful that another 25min call would be enough to get the work done for 
both tasks, and I can document it for my learning/our processes and report back. 
Let me know if sometime later in the week could work for this!

TASK DETAIL
  https://phabricator.wikimedia.org/T351072



[Wikidata-bugs] [Maniphest] T365457: Bring in all Purdue Program PRs and upload Mismatch Finder mismatches

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T365457



[Wikidata-bugs] [Maniphest] T365457: Bring in all Purdue Program PRs and upload Mismatch Finder mismatches

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Wikidata Analytics (Kanban), Wikidata.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  Making this task as a means of noting that there is still work to be done to 
close out the Purdue Data Mine program. Specifically all pull requests in the 
repo <https://github.com/Wikidata/Purdue-Data-Mine-2024/pulls> need to be 
brought in, and the resulting mismatches should be uploaded to Mismatch Finder 
using upload_mismatches.py 
<https://github.com/Wikidata/Purdue-Data-Mine-2024/blob/main/upload_mismatches.py>.

TASK DETAIL
  https://phabricator.wikimedia.org/T365457

WORKBOARD
  https://phabricator.wikimedia.org/project/board/6546/



[Wikidata-bugs] [Maniphest] T356618: [EPIC] Check of legacy wmde analytics infrastructure

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T356618



[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  ⚠️ Currently WIP ⚠️
  ===
  
  Going through the files sent by @JAllemandou above 
<https://phabricator.wikimedia.org/T358311#9648470>. This message will be saved 
as I go so that I don't lose my progress. If I do find something worth 
documenting, then I'll also include it below so that this task can serve as a 
reference for later if need be.
  
  stat1004
  
  
  None of the files are worth keeping. See descriptions and reasoning below:
  
total 28

Analytics
└─ NewEditors 
└─ adHoc (nothing of interest)
└─ Compaigns
└─ 2019 and 2020 email campaigns with R based analysis (nothing of 
interest)
└─ WDCM
└─ WDCM_Output 
└─ Lots of directories of CSVs (nothing of interest)
└─ WDCM_Scripts
└─ R based scripts that would be archived on Gerrit if they were 
ever in production (nothing of interest)
└─ Wikidata
└─ misc
└─ Some ad hoc work (nothing of interest)
└─ WD_languagesLandscape
└─ R based scripts that would be archived on Gerrit if they were 
ever in production (nothing of interest)
└─ WD_ORES_ItemQuality (nothing of interest given Lift Wing migration)
└─ WD_UsageCoverage
└─ R and Python scripts that are doubtless versions of the WDCM 
UsageCoverage dashboard that's archived on Gerrit (nothing of interest)
Experiments
└─ Empty
_miscWMDE
└─ summerBannerCampaign2017_DataOUT
└─ TSV files (nothing of interest)
└─ TWLBanner_2017
└─ TSV files and simple HQL queries from `wmf.webrequest` for 
banner campaign hits (nothing of interest, easy to learn as needed)

Example query:

SELECT count(*)
FROM wmf.webrequest
WHERE uri_host = 'de.wikipedia.org'
  AND uri_query LIKE "$/wiki/Wikipedia:Umfragen/Technische_Wünsche_2017$"
  AND http_method = 'GET'
  AND is_pageview = TRUE
  AND YEAR = 2017
  AND MONTH = 6
  AND DAY = 1
  AND HOUR = 20;

└─ TWLBanner_2017_DataOUT
└─ TSV files (nothing of interest)
_miscWMDE_1004
└─ TWLBanner_2017
└─ One HQL and one TSV file that are similar to the above (nothing 
of interest)
R
└─ x86_64-pc-linux-gnu-library (nothing of interest)
Research
└─ DydimusZengenene
└─ Note: work to support a researcher (nothing of interest)
└─ _analytics
└─ _data
└─ DydimusZengenene.Rproj
└─ ParseTargetPage.R
wdUsagePerPage
└─ Related to the percentage usage dashboard, so would be archived on 
Gerrit if they were ever in production (nothing of interest)
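
  The `TWLBanner_2017` example query in the listing above counts pageviews for 
one host within a single hourly partition of `wmf.webrequest`. Should such a 
query ever be needed again, a parameterised version could be sketched as 
follows. `hourly_count_query` is a hypothetical helper, and the `LIKE` pattern 
is passed through unchanged because the delimiters used in the original query 
are not explained in the files.

```python
def hourly_count_query(uri_host: str, uri_query_like: str,
                       year: int, month: int, day: int, hour: int) -> str:
    """Build a HiveQL pageview count limited to one hourly partition."""
    for value in (uri_host, uri_query_like):
        # Refuse quotes rather than guess at escaping rules.
        if "'" in value or '"' in value:
            raise ValueError(f"quote characters are not allowed: {value!r}")
    if not (1 <= month <= 12 and 1 <= day <= 31 and 0 <= hour <= 23):
        raise ValueError("month, day or hour out of range")
    return "\n".join([
        "SELECT count(*)",
        "FROM wmf.webrequest",
        f"WHERE uri_host = '{uri_host}'",
        f"  AND uri_query LIKE '{uri_query_like}'",
        "  AND http_method = 'GET'",
        "  AND is_pageview = TRUE",
        f"  AND year = {int(year)}",
        f"  AND month = {int(month)}",
        f"  AND day = {int(day)}",
        f"  AND hour = {int(hour)};",
    ])
```

  Keeping the year, month, day and hour filters in every generated query limits 
the scan to a single partition, which is what the original query does as well.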
  
  
  
  stat1005
  
  
total 964

Analytics
BotEdits_perProject.ipynb
crontabstat1005.txt
DataModelTerms_20210228_Updates.ipynb
dewiki_NewEds_2021.ipynb
QCF_M2_Test.ipynb
QuratorCuriousFacts_Separators.ipynb
Qurator_M1.ipynb
R
snapshot_query.hql
Untitled1.ipynb
untitled1.txt
Untitled2.ipynb
Untitled3.ipynb
Untitled4.ipynb
Untitled5.ipynb
Untitled.ipynb
untitled.txt
venv
wd_cluster_fetch_items_M2.ipynb
wd_cluster_fetch_items_M3.ipynb
WDCM_ETL_OTHER_TEST.ipynb
WDCM_Statements_Test.ipynb
WD_HumanEditsPerClass_RevisionTags.ipynb
WD_Inequality_Intake.ipynb
WD_Languages_Datamodel_CollectInit.ipynb
WD_Languages_Datamodel_EXP.ipynb
WD_MonthlyEditors.ipynb
WD_Sitelinks_WDAHP_202108.ipynb
wd_statements_HiveQL_Query.hql
WD_Translations.ipynb
WHEIP_exps.ipynb
wikidata_analytics_examples
WikidataRevisions_November2020.csv
  
  
  
  stat1006
  
  
total 48

misc_projects
myTemp
NewEds
nohup.out
R
RPckg
RScripts
sqlIn
sqlOut
WDCM_Credentials
WDCM_DataIN
WDCM_DataOUT
WDCM_sql
  
  
  
  stat1007
  
  
total 28

Analytics
crontabstat1007.txt
Experiments
Python3
R
RScripts
venv
  
  
  
  stat1008
  
  
total 16

Analytics
R
renv
venv
  
  
  
  stat1009
  
  
total 0
  
  
  
  stat1010
  ---

[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-05-17 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362849



[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-05-17 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Note that MR#700 
<https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/700>
 has been opened with the work for this :)

TASK DETAIL
  https://phabricator.wikimedia.org/T361203



[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-05-17 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Note that MR#700 
<https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/700>
 has been opened with the work for this :)

TASK DETAIL
  https://phabricator.wikimedia.org/T362849



[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-16 Thread AndrewTavis_WMDE
AndrewTavis_WMDE claimed this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T358311



[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362849



[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Confirming that data's still coming in as well. @BTullis, what should we do 
about statistics/manifests/wmde/wdcm.pp 
<https://github.com/wikimedia/operations-puppet/blob/production/modules/statistics/manifests/wmde/wdcm.pp>?
 Remove the file? And could you also remove it from Puppet entirely on 
stat1011? Anything else?

TASK DETAIL
  https://phabricator.wikimedia.org/T364965



[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Quick note that the word used by @BTullis was `disabled` instead of `removed` 
for the stat1007 timers, so apologies if this caused some confusion. I figure 
not, but just wanted to be clear :)
  
  @BTullis, would you be able to check the journal for them and paste the 
output here so we can check it? It seems like I can't access it on my end 
either.
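
  For reference, checking the timers and their journal could look roughly like 
the sketch below. The thread does not name the timer units, so the unit and 
pattern arguments are placeholders, and reading the journal may need elevated 
privileges, as noted above. The helpers only build `systemctl list-timers` and 
`journalctl` command lines; `run` executes one of them.

```python
import subprocess
from typing import List


def list_timers_command(pattern: str) -> List[str]:
    """Command listing systemd timers (active or not) matching a pattern."""
    return ["systemctl", "list-timers", "--all", pattern]


def journal_command(unit: str, since: str = "yesterday") -> List[str]:
    """Command printing journal entries for one unit since a given time."""
    return ["journalctl", "--unit", unit, "--since", since, "--no-pager"]


def run(command: List[str]) -> str:
    """Run a command and return its stdout, or stderr if stdout is empty."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout or result.stderr
```

  For example, `run(journal_command("some-wmde-job.service"))` would return the 
recent log lines for a unit of that (hypothetical) name, ready to paste into 
the task.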

TASK DETAIL
  https://phabricator.wikimedia.org/T364965



[Wikidata-bugs] [Maniphest] T363583: Generate historical weekly segments of Wikidata item sitelink segmentations

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE changed the task status from "Open" to "Stalled".

TASK DETAIL
  https://phabricator.wikimedia.org/T363583



[Wikidata-bugs] [Maniphest] T363583: Generate historical weekly segments of Wikidata item sitelink segmentations

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T363583



[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE renamed this task from "stat1007 migration output check" to 
"stat1007 to stat1011 migration pipeline output check".

TASK DETAIL
  https://phabricator.wikimedia.org/T364965

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Lucas_Werkmeister_WMDE, BTullis, Manuel, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T364965: stat1007 migration output check

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Wikidata Analytics (Kanban), Wikidata, 
Wikidata Dev Team.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  Context
  ---
  
  Recently WMF has been migrating from legacy stat servers that are being 
deprecated - specifically stat1004, 1005, 1006 and 1007. WMDE has a few 
pipelines that were running on stat1007 that have since been migrated over to 
stat1011:
  
  - statistics/manifests/wmde/graphite.pp 
<https://github.com/wikimedia/operations-puppet/blob/production/modules/statistics/manifests/wmde/graphite.pp>
  - statistics/manifests/wmde/wdcm.pp 
<https://github.com/wikimedia/operations-puppet/blob/production/modules/statistics/manifests/wmde/wdcm.pp>
  
  At first glance, the latter doesn't appear to do anything: it sets the 
environment variables and clones the repository, but the rest is `TODO`. The 
former is more expansive and leads into our Graphite/Grafana workflows.
  
  Further directions
  --
  
  > You should be able to find the required files and the clone of 
https://gerrit.wikimedia.org/g/analytics/wmde/scripts 
<https://gerrit.wikimedia.org/g/analytics/wmde/scripts> beneath 
`stat1011:/srv/analytics-wmde`.
  
  The assumption is that they're working, and the timers for stat1007 have been 
removed.
  
  Goals
  -
  
  Check the pipeline in statistics/manifests/wmde/graphite.pp 
<https://github.com/wikimedia/operations-puppet/blob/production/modules/statistics/manifests/wmde/graphite.pp>
 to ensure that everything is working properly after the stat1007 -> stat1011 
migration.

TASK DETAIL
  https://phabricator.wikimedia.org/T364965

To: AndrewTavis_WMDE
Cc: Lucas_Werkmeister_WMDE, BTullis, Manuel, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T342559: [Analytics] Monthly repeating tasks (next: June 2024)

2024-05-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Sheet updated with the numbers for April. There is a higher number of user 
agents but fewer IPs (though IPs are still much higher than in February).

TASK DETAIL
  https://phabricator.wikimedia.org/T342559

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Jdforrester-WMF, Mbch331


[Wikidata-bugs] [Maniphest] T342559: [Analytics] Monthly repeating tasks (next: June 2024)

2024-05-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE renamed this task from "[Analytics] Monthly repeating tasks 
(next: May 2024)" to "[Analytics] Monthly repeating tasks (next: June 2024)".
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T342559

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Jdforrester-WMF, Mbch331


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-05-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE claimed this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T361203

To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm

2024-05-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Hey @brouberol  Just getting back from two weeks off today :) I'll check 
into this and get back to you all! Thanks for the ping!

TASK DETAIL
  https://phabricator.wikimedia.org/T358311

To: AndrewTavis_WMDE
Cc: brouberol, JAllemandou, MoritzMuehlenhoff, Manuel, Aklapper, 
AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, BTullis, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T363583: Generate historical weekly segments of Wikidata item sitelink segmentations

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE renamed this task from "Generate historical weekly segments of 
Wikidata item sitelinks segmentations" to "Generate historical weekly segments 
of Wikidata item sitelink segmentations".

TASK DETAIL
  https://phabricator.wikimedia.org/T363583

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T363583: Generate historical weekly segments of Wikidata item sitelinks segmentations

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE renamed this task from "Generate weekly historical segments of 
Wikidata item sitelinks segmentations" to "Generate historical weekly segments 
of Wikidata item sitelinks segmentations".

TASK DETAIL
  https://phabricator.wikimedia.org/T363583

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T363583: Generate weekly historical segments of Wikidata item sitelinks segmentations

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Wikidata, Wikidata Analytics (Kanban).
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  Purpose
  ---
  
  In T362849: [Analytics] Segments of Wikidata's data over time 
<https://phabricator.wikimedia.org/T362849> we need to calculate historical 
segments of Wikidata's items based on their relation to sitelinks.
  
  Purpose from that ticket:
  
  > As Wikidata Product Managers, we would like to understand how different 
segments of Wikidata's data developed over time, so we can inform our 
projections.
  
  This task would encompass the historical data that's needed to achieve this.
  
  Scope
  -
  
  From T362849 <https://phabricator.wikimedia.org/T362849>:
  
  > How did the number of Items of the following types develop over time?
  >
  >   A) Items that contain a sitelink to one of the Wikimedia projects (e.g. 
about a notable person)
  >   B) Items that are needed to build A (used in A Items for example in a 
statement or reference; e.g. the non-notable father of that notable person)
  >   C) All other Items
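
  A minimal sketch of how the three populations relate, assuming they are 
disjoint (an Item with a sitelink counts as A even if another A Item links to 
it). The toy `items` mapping and its `sitelinks` / `linked_items` keys are 
hypothetical and are not the Data Lake schema:

```python
def segment_items(items):
    """Split item IDs into populations A, B and C (assumed to be disjoint)."""
    # A: Items with at least one sitelink.
    pop_a = {qid for qid, data in items.items() if data["sitelinks"]}

    # Everything that A Items point to in statements or references.
    needed = set()
    for qid in pop_a:
        needed.update(items[qid]["linked_items"])

    # B: Items needed to build A that are not themselves in A.
    pop_b = {qid for qid in needed if qid in items} - pop_a

    # C: All other Items.
    pop_c = set(items) - pop_a - pop_b
    return pop_a, pop_b, pop_c


# Hypothetical toy data, not real Wikidata content.
toy_items = {
    "Q1": {"sitelinks": ["enwiki"], "linked_items": ["Q2", "Q3"]},
    "Q2": {"sitelinks": [], "linked_items": []},
    "Q3": {"sitelinks": ["dewiki"], "linked_items": []},
    "Q4": {"sitelinks": [], "linked_items": ["Q5"]},
    "Q5": {"sitelinks": [], "linked_items": []},
}

pop_a, pop_b, pop_c = segment_items(toy_items)
print(sorted(pop_a), sorted(pop_b), sorted(pop_c))
```

  In this toy case Q3 is linked from Q1 but stays in A because it has its own 
sitelink, and Q5 stays in C because only a non-A Item links to it.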
  
  
  
  - In order to do this, T363451: Add job to create Wikidata partition to 
wmf.mediawiki_wikitext_history <https://phabricator.wikimedia.org/T363451> was 
made to recreate the Wikidata partition of wmf.mediawiki_wikitext_history 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Content/Mediawiki_wikitext_history>
  - Once this task is complete, work can then begin to use this partition to 
generate all data from when Wikidata was created to the most recent weekly data 
generated by the DAG created in T362849 
<https://phabricator.wikimedia.org/T362849>
  
  Desired Output
  --
  
  - Weekly stats of the number of Items in category A, B and C
  
  Acceptance criteria:
  
  [ ] Weekly historical breakdowns of populations A, B and C
- These would be in the Data Lake and the published datasets
  
  ---
  
  **Information below this point is filled out by the Wikidata Analytics team.**
  
  General Planning
  
  
  Information is filled out by the analytics product manager.
  
  Assignee Planning
  -
  
  Information is filled out by the assignee of this task.
  
  Estimation
  --
  
  Estimate:
  Actual:
  
  Sub Tasks
  -
  
  Full breakdown of the steps to complete this task:
  
  [ ] Step
  
  Data to be used
  ---
  
  See Analytics/Data_Lake 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake> for the breakdown of 
the data lake databases and tables.
  
  The following tables will be referenced in this task:
  
  - wmf.mediawiki_wikitext_history 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Content/Mediawiki_wikitext_history>
  
  Notes and Questions
  ---
  
  Things that came up during the completion of this task, questions to be 
answered and follow up tasks:
  
  - Note

TASK DETAIL
  https://phabricator.wikimedia.org/T363583

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  See T362849_wd_item_sitelink_segments.ipynb 
<https://gitlab.wikimedia.org/repos/wmde/analytics/-/blob/main/tasks/wikidata/2024/T362849_wd_item_sitelink_segments/T362849_wd_item_sitelink_segments.ipynb?ref_type=heads>
 for the work to derive the segments :)

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Ok, so the new numbers after the change in scope for the max `2024-04-15` 
snapshot are:
  
items_with_sitelinks: 32,231,861
items_items_with_sitelinks_link_to: 2,980,388
all_other_items: 72,910,679
  
  For documentation, the numbers for the original Population B definition for 
the min `2024-02-26` snapshot were:
  
items_with_sitelinks: 31,978,738
linked_to_items_with_sitelinks: 75,221,879
all_other_items: 242,565
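
  As a quick sanity check on the two sets of numbers above, a small sketch 
that sums each set and derives the share of each population. The counts are 
copied from this comment; the variable and function names are only 
illustrative, and the two totals are not expected to match as they come from 
different snapshots:

```python
# Counts copied from the comment above (max 2024-04-15 snapshot, new scope).
new_scope_2024_04_15 = {
    "items_with_sitelinks": 32_231_861,
    "items_items_with_sitelinks_link_to": 2_980_388,
    "all_other_items": 72_910_679,
}

# Counts for the original Population B definition (min 2024-02-26 snapshot).
original_scope_2024_02_26 = {
    "items_with_sitelinks": 31_978_738,
    "linked_to_items_with_sitelinks": 75_221_879,
    "all_other_items": 242_565,
}


def total_and_shares(counts):
    """Return the total item count and each population's share in percent."""
    total = sum(counts.values())
    shares = {name: round(100 * n / total, 2) for name, n in counts.items()}
    return total, shares


for label, counts in (
    ("new scope", new_scope_2024_04_15),
    ("original scope", original_scope_2024_02_26),
):
    total, shares = total_and_shares(counts)
    print(label, f"{total:,}", shares)
```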
  
  Status on the rest of this:
  
  - The weekly DAG is written and also includes an export to the 
published datasets repo
- I've also included the work for T361203 
<https://phabricator.wikimedia.org/T361203> in this
  - We need to confirm the numbers above and the method that generates them
  - I'll then rewrite the DAG job that runs the query
  - Then testing, I'll need the table `wmde.wd_item_sitelink_segments_weekly` 
to be made in HDFS by an admin, and then we can go into production
  - Should all be done by Tuesday/Wednesday evening after I'm back in a few 
weeks depending on folks' availability
  - I'll make a new task for the historic data generation process, which will 
depend on T363451 <https://phabricator.wikimedia.org/T363451>

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Moved this to `In progress` as I'm adding the job to export everything to the 
published datasets folder to the DAG as I work on the same for T362849 
<https://phabricator.wikimedia.org/T362849>.

TASK DETAIL
  https://phabricator.wikimedia.org/T361203

To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-25 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  See T363451 <https://phabricator.wikimedia.org/T363451> for the task about 
bringing back the partition (hopefully via another job). I added a note there 
asking whether we want to turn this job on when WMDE needs historical data. 
Let me know what you all think on that :)

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Another note on this is: if we don't expect to be needing a Wikidata 
partition of `wmf.mediawiki_wikitext_history` for other tasks, then we could 
work directly from the XML dump for the data backdate. We wouldn't be able to 
leverage PySpark for the querying though, so I worry about how long all of this 
would take...

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a subscriber: JAllemandou.
AndrewTavis_WMDE added a comment.


  Thanks for all of the information, @mpopov!
  
  I talked this over in my bi-weekly with @JAllemandou, and would like to bring 
some further context to this particular situation :)
  
  The go-to table for this would be wmf.wikidata_entity 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Content/Wikidata_entity>
 for the following reasons:
  
  - It has the `sitelinks` column for Population A above
  - It has the `claims` column for Population B above
  
  It thus has everything we need for the given task for future data. One change 
to the output for this though would be the frequency of the DAG, as 
`wmf.wikidata_entity` is a weekly data dump, so it'd make sense to do a weekly 
DAG. If we still want to do a monthly job, then the best option would be to do 
a DAG that runs on the first Monday of every month (in the docs for 
`wmf.wikidata_entity` it mentions the `2020-01-20` snapshot, which was a 
Monday).
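
  For the monthly option, a minimal sketch of how the "first Monday of every 
month" date could be derived with the Python standard library. The 
`first_monday` helper is hypothetical and is not part of any existing DAG:

```python
from datetime import date, timedelta


def first_monday(year, month):
    """Return the date of the first Monday in the given month."""
    first_day = date(year, month, 1)
    # date.weekday() returns 0 for Monday, so this is the number of days
    # to move forward from the 1st to reach a Monday.
    days_ahead = (0 - first_day.weekday()) % 7
    return first_day + timedelta(days=days_ahead)


# The snapshot mentioned in the docs falls on a Monday.
print(date(2020, 1, 20).weekday() == 0)
print(first_monday(2024, 6))
```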
  
  Now we get to the question of the historical data... This is a situation that 
cannot be solved at this time given the current makeup of the Data Lake. As 
mentioned on Mattermost: we currently do not have Wikidata as a partition 
within wmf.mediawiki_wikitext_history 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Content/Mediawiki_wikitext_history>,
 so we do not have historical versions of Wikidata items with which we'd be 
able to rebuild the history. The assumption we're making on this is that the 
legacy version of these metrics was made using `wmf.mediawiki_wikitext_history` 
at a time when Wikidata was still an available partition. The change removing 
Wikidata from the `wmf.mediawiki_wikitext_history` dump process was made in 
`2024-02` - see T357859 <https://phabricator.wikimedia.org/T357859>, where ~12 
of 25 days of the dump generation were for the Wikidata XML dump. This was 
slowing down metrics delivery for WMF Movements Insights.
  
  Steps forward on this:
  
  - I'll begin work on a DAG based on `wmf.wikidata_entity`, as even if we do 
get a Wikidata partition within `wmf.mediawiki_wikitext_history`, it would not 
be used for recent data updates
- Are we fine with a weekly DAG?
  - A decision needs to be made on whether WMDE is requesting that Wikidata data 
again be an output of the `wmf.mediawiki_wikitext_history` snapshot creation 
process
- The preferred solution here would be to not revert the changes made in 
T357859 <https://phabricator.wikimedia.org/T357859>, but rather to make a new 
job that adds a new partition to the table via the Wikidata XML dump
- The reason for this is to ensure that WMF Movements Insights can maintain 
the current speed of delivery
- @JAllemandou has said that bringing the Wikidata partition back is fine 
if we need it (again, preferably in the above way)
  - If the request is being made, a new task should be made for it
  - We'd then do what I'd argue would be a separate task whereby the new 
`wmf.mediawiki_wikitext_history` Wikidata partition would be used to recompute 
the historical populations above
  
  Let me know what thoughts are on the above!

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: mpopov, AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE claimed this task.
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: mpopov, AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Summary on your end sounds great, @Ifrahkhanyaree_WMDE!  Let me know if 
sending along some empty new item revisions from 2024 would be helpful :)

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Notebook with the work that was done for this is: 
wmde/analytics/tasks/product_platform/2024/T360761_empty_wikidata_items/T360761_empty_wikidata_items.ipynb
 
<https://gitlab.wikimedia.org/repos/wmde/analytics/-/blob/main/tasks/product_platform/2024/T360761_empty_wikidata_items/T360761_empty_wikidata_items.ipynb>.
 Will update this if further work is needed :)

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE moved this task from Needs product input to Product 
verification on the Wikidata Analytics (Kanban) board.
AndrewTavis_WMDE added a comment.


  Further insights on this, and moving it to `Product verification` at this 
point :) I've now changed the query to use a range of byte sizes within which 
an item could plausibly be empty. I added 10 bytes to the calculated max to get 
`170`, but also tried `180` and `190`, and the drop-off over time in items that 
are empty on first revision is maintained.
  
  Basic finding: it used to be way more common, but still does happen today
  
  New query is the following:
  
    SELECT DISTINCT
        event_user_text AS editor,
        substring(event_timestamp, 1, 7) AS event_year_month,
        page_title AS created_empty_qid

    FROM
        wmf.mediawiki_history

    WHERE
        wiki_db = 'wikidatawiki'
        AND page_namespace_is_content = True
        AND snapshot = '2024-03'
        AND event_entity = 'revision'
        AND event_type = 'create'
        AND page_revision_count = 1
        -- Factor in bytes that are within a range small enough to be an empty first edit.
        AND 148 < revision_text_bytes
        AND revision_text_bytes < 170
    ;
  
  Task 1.1 - Number of Items in population A that were created empty: 
`5,075,471`
  Task 1.2 - Number of editors who are creating empty items: `27,61`
  
  Of the above items, I did a test of `50,000` to see if they were empty on 
deletion using the `https://www.wikidata.org/wiki/Special:EntityData/` 
endpoint. `49,579` returned valid JSON responses, and of those `99.65%` were 
found to be empty.
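
  A minimal sketch of what such an emptiness check could look like once the 
JSON responses are in hand. This is not the code that was run for the test 
above: treating an item as empty when its `labels`, `descriptions`, `aliases`, 
`claims` and `sitelinks` are all empty is an assumption, and the QIDs and 
payloads below are made up:

```python
import json

# Assumption: an item is "empty" if none of these fields has any content.
CONTENT_FIELDS = ("labels", "descriptions", "aliases", "claims", "sitelinks")


def is_empty_entity(entity):
    """True if every content field is missing or empty."""
    return not any(entity.get(field) for field in CONTENT_FIELDS)


def count_valid_and_empty(responses):
    """responses: iterable of (qid, raw response text) pairs.

    Returns (number of valid JSON responses, number of those that are empty).
    """
    valid = 0
    empty = 0
    for qid, raw in responses:
        try:
            entity = json.loads(raw)["entities"][qid]
        except (ValueError, KeyError, TypeError):
            # Not JSON, or not the expected entity structure: skip it.
            continue
        valid += 1
        if is_empty_entity(entity):
            empty += 1
    return valid, empty


# Hypothetical responses, not real Wikidata items.
sample_responses = [
    ("Q100", '{"entities": {"Q100": {"labels": {}, "descriptions": {}, '
             '"aliases": {}, "claims": {}, "sitelinks": {}}}}'),
    ("Q101", '{"entities": {"Q101": {"labels": {"en": {"language": "en", '
             '"value": "example"}}, "claims": {}}}}'),
    ("Q102", "<html>Not Found</html>"),
]

print(count_valid_and_empty(sample_responses))
```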
  
  I also checked empty item creation over time. The following two plots are 
based on the above definition of the population in the query (148-170 bytes 
being "empty"):
  
  F48099515: total_empty_qids_created_per_month_v3_definition.png 
<https://phabricator.wikimedia.org/F48099515>
  
  F48099542: 
total_empty_qids_created_per_month_in_2023_and_2024_v3_definition.png 
<https://phabricator.wikimedia.org/F48099542>
  
  Again, I also tried raising the max byte size to `180` and `190`, and the 
plots above were not noticeably different.
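
  To illustrate the kind of sensitivity check described here, a small sketch 
that applies the same strict bounds as the query (`148 < revision_text_bytes` 
and `revision_text_bytes < upper`) for several upper limits. The byte sizes 
are made-up toy values, not query results:

```python
def count_empty_candidates(sizes, lower=148, upper=170):
    """Count revisions whose byte size is strictly between lower and upper."""
    return sum(1 for size in sizes if lower < size < upper)


# Hypothetical first-revision sizes in bytes.
toy_sizes = [148, 149, 160, 169, 170, 175, 185, 200]

for upper in (170, 180, 190):
    print(upper, count_empty_candidates(toy_sizes, upper=upper))
```

  Note that both bounds are exclusive, so with the default limits only sizes 
from 149 to 169 bytes are counted.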
  
  Task 2 - Number of Items in population B that are currently deleted: `44,385` 
(`0.87%`)
  
  I switched around the 3.x tasks a bit with a focus on visualization because, 
as I said, I basically wasn't seeing items that were created empty and were 
still empty.
  
  Task 3.1 - no further edits ever on items that are not deleted: `0` (they all 
have at least one more edit)
  
  Query for this:
  
-- Items that are not deleted and whose first revision falls in the
-- "empty" byte range.
WITH not_deleted_created_empty_qids_v3 AS (
    SELECT DISTINCT
        page_title AS not_deleted_created_empty_qid

    FROM
        wmf.mediawiki_history

    WHERE
        wiki_db = 'wikidatawiki'
        AND page_namespace_is_content = True
        AND snapshot = '2024-03'
        AND event_entity = 'revision'
        AND event_type = 'create'
        AND page_revision_count = 1
        -- Factor in bytes that are within a range small enough to be an
        -- empty first edit.
        AND 148 < revision_text_bytes
        AND revision_text_bytes < 170
        AND page_is_deleted = False
)

-- Count all revisions for each of those items.
SELECT
    h.page_title AS not_deleted_created_empty_qid,
    count(h.revision_id) AS number_of_revisions

FROM
    wmf.mediawiki_history AS h

JOIN
    not_deleted_created_empty_qids_v3 AS e

ON
    h.page_title = e.not_deleted_created_empty_qid

WHERE
    h.wiki_db = 'wikidatawiki'
    AND h.page_namespace_is_content = True
    AND h.snapshot = '2024-03'
    AND h.event_entity = 'revision'
    AND h.event_type = 'create'

GROUP BY
    h.page_title
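  
  As a rough sketch of how the output of this query maps onto Tasks 3.1 to 
3.3, assuming the result is available as `(qid, number_of_revisions)` pairs: 
an item with exactly one revision has had no further edits. The function name 
and the row format are hypothetical, not taken from the actual notebook.

```python
from collections import Counter


def summarize_revision_counts(rows):
    """Summarize (qid, number_of_revisions) pairs from the query result.

    Hypothetical helper: the row format is an assumption for illustration.
    """
    counts = [n for _, n in rows]
    return {
        # Task 3.1: only the creating revision exists.
        "no_further_edits": sum(1 for n in counts if n == 1),
        # Task 3.2: at least one additional edit.
        "at_least_one_more_edit": sum(1 for n in counts if n > 1),
        # Task 3.3: how many items have each number of revisions.
        "items_per_revision_count": Counter(counts),
    }
```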
  
  Task 3.2 - at least one additional edit (=the rest): `5,031,086`
  
  - Check: `5,031,086 + 44,385 = 5,075,471`
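  
  The check and the deleted share reported for Task 2 can be reproduced with a 
few lines. This assumes that Population B is exactly the sum of the deleted 
items and the not deleted items with further edits; the variable names are 
illustrative.

```python
deleted = 44_385                          # Task 2
not_deleted_with_more_edits = 5_031_086   # Task 3.2

# Assumption: Population B is the sum of the two groups above.
population_b = deleted + not_deleted_with_more_edits
deleted_share_percent = round(100 * deleted / population_b, 2)
```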
  
  New and hopefully a bit more helpful (my assumption) Task 3.3 - graphs of the 
number of edits the items have had
  
  F48100783: 
not_deleted_empty_on_creation_items_per_edit_amount_max_100_-_v3_definition.png 
<https://phabricator.wikimedia.org/F48100783>
  
  F48100788: number_of_revisions_on_empty_on_creation_items_v3_definition.png 
<https://phabricator.wikimedia.org/F48100788>
  
  Let me know if anything else would be helpful here, @Ifrahkhanyaree_WMDE!

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

WORKBOARD
  https://phabricator.wikimedia.org/project/board/6546/

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___

[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-19 Thread AndrewTavis_WMDE
AndrewTavis_WMDE moved this task from In progress to Needs product input on the 
Wikidata Analytics (Kanban) board.
AndrewTavis_WMDE added a comment.


  The thread on Mattermost 
<https://mattermost.wikimedia.de/swe/pl/gsr9b485x7geby79t4sg151j7c> for 
discussing this has a lot of comments on the data restrictions we're dealing 
with here, as there is no text table for Wikidata in the Data Lake. A 
workaround using `revision_text_bytes` to determine the minimum size that an 
item could be (i.e. empty) has been used so far with acceptable results, but 
there are definite drawbacks and it's not exact.
  
  What I can say here is that:
  
  - There are lots of items being created empty (from one subset `3,540,260`)
  - They're not normally deleted (from the same subset only `0.95%` were)
  - They usually receive further edits (I've yet to see an item that was created 
empty and is still empty, but please note that this is an eye test on ~30 items)
  
  Moving this to `Needs product input` for now. A basic thing that can be done 
that won't take too much time is to use a range instead of the `CASE WHEN` 
for determining when an item is empty via the length of its QID and the 
`revision_text_bytes` size. We would then not be getting empty on creation 
items 100% of the time, but I could also find the ratio and we could agree on 
what an acceptable accuracy would be (say `> 90%`). Time estimate on 
this is 1/2 a day.
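  
  A small sketch of the range idea, under stated assumptions: the byte bounds 
are parameters to be agreed on, the ratio is the share of range-matched items 
that turn out to be empty when checked, and both function names are 
hypothetical.

```python
def in_empty_byte_range(revision_text_bytes: int, low: int, high: int) -> bool:
    """True if the first revision size lies strictly between the two bounds."""
    return low < revision_text_bytes < high


def meets_threshold(confirmed_empty: int, checked: int, threshold: float = 0.90) -> bool:
    """True if the share of confirmed empty items is above the agreed threshold."""
    if checked == 0:
        raise ValueError("no items were checked")
    return confirmed_empty / checked > threshold
```

  The `0.90` default mirrors the `> 90%` suggestion above; the bounds have no 
defaults because they still need to be chosen.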

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

WORKBOARD
  https://phabricator.wikimedia.org/project/board/6546/

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-19 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-17 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-17 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-17 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Prioritizing this now. Initial exploration of the data sources indicates that 
we need to use the full `mediawiki_history` rather than 
`mediawiki_history_reduced` as the latter doesn't have a distinct 
`page_is_deleted` field for Population B.

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-17 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T362643: Mismatch Finder gadget: visisted link text icon doesn't change color with link

2024-04-16 Thread AndrewTavis_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added a project: Wikidata.org.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  Note that I did this Phabricator tasks search 
<https://phabricator.wikimedia.org/search/query/5BIk7a7RSJzT/#R> before making 
this task :)
  
  **Steps to replicate the issue** (include links if applicable):
  
  - Go to https://mismatch-finder.toolforge.org/
  - Click on `Random mismatches`
  - Click on the label and QID header of any element displayed
  - Click on `Inspect` in the Mismatch Finder gadget with the text `There 
is/are NUM_MISMATCHES mismatch/es for this item.`
  - Wait for the page to load such that the link you clicked now has the status 
visited
  - Navigate back to the Wikidata item page you were on
  
  **What happens?**:
  
  You'll see that the link text is colored given the visited status, but the 
link icon is still the default link text color
  
  **What should have happened instead?**:
  
  My expectation would be that the icon for the external link would have the 
same color as the link it's associated with.
  
  **Software version** (on `Special:Version` page; skip for WMF-hosted wikis 
like Wikipedia):
  
  Currently deployed version of the gadget. Not sure :)
  
  **Other information** (browser name/version, screenshots, etc.):
  
  Browser is Firefox 124.0.2 (64-bit).
  
  Screenshot of the assumed discoloration is below:
  
  F46968414: Screenshot from 2024-04-16 12-54-59.png 
<https://phabricator.wikimedia.org/F46968414>
  
  Minor comment: the link icon doesn't necessarily convey that what the user is 
clicking on is an external link. Would it make sense to shift the icon over to 
the right of `Inspect` and use the external link icon - arrow pointing to the 
top right from a box - for this?

TASK DETAIL
  https://phabricator.wikimedia.org/T362643

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, KimKelting, Wikidata-bugs
___


[Wikidata-bugs] [Maniphest] T362641: [MSMF] Button texts are not centered in various places

2024-04-16 Thread AndrewTavis_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Mismatch Finder, Wikidata.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  Note that I did the following Phabricator search 
<https://phabricator.wikimedia.org/search/query/FxDRSlmcrOEQ/#R> before writing 
this :)
  
  **Steps to replicate the issue**:
  
  - Go to https://mismatch-finder.toolforge.org/
  
  **What happens?**:
  
  Seems like the buttons on the page don't have their texts centered? See 
screenshots below:
  
  F46965326: Screenshot from 2024-04-16 13-23-44.png 
<https://phabricator.wikimedia.org/F46965326>
  
  F46965344: Screenshot from 2024-04-16 13-23-32.png 
<https://phabricator.wikimedia.org/F46965344>
  
  F46965357: Screenshot from 2024-04-16 13-23-17.png 
<https://phabricator.wikimedia.org/F46965357>
  
  F46965370: Screenshot from 2024-04-16 13-23-06.png 
<https://phabricator.wikimedia.org/F46965370>
  
  I've loaded each of the above screenshots into Figma to check the dimensions 
and there's extra space beneath the label in all of them except one. For the 
language selector the space is equal, but then there's a lowercase g, so maybe 
the text should be a bit lower still?
  
  **What should have happened instead?**:
  
  The text should be centered.
  
  **Software version** (on `Special:Version` page; skip for WMF-hosted wikis 
like Wikipedia):
  
  Currently deployed version of Mismatch Finder.
  
  **Other information** (browser name/version, screenshots, etc.):
  
  Browser is Firefox 124.0.2 (64-bit).

TASK DETAIL
  https://phabricator.wikimedia.org/T362641

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, Mattia_Capozzi_WMDE, ItamarWMDE, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T362301: [MSMF] Add mismatch file upload scripts to Mismatch Finder repo

2024-04-11 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362301

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, luca.favorido, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, Mattia_Capozzi_WMDE, 
ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331
___


[Wikidata-bugs] [Maniphest] T362301: [MSMF] Add mismatch file upload scripts to Mismatch Finder repo

2024-04-11 Thread AndrewTavis_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Mismatch Finder, Wikidata, wmde-wikidata-tech.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  Context
  ---
  
  A part of the WMDE x Purdue University program where students have been 
looking for mismatches 
<https://www.wikidata.org/wiki/Wikidata:Mismatch_Finder/Collaboration/Purdue_Summer_of_Data_2024>
 has been the creation of scripts to more easily upload mismatch files. These 
scripts can be found in the root of the wikidata/Purdue-Data-Mine-2024 
<https://github.com/Wikidata/Purdue-Data-Mine-2024> repo on GitHub. The files 
and descriptions of their use are:
  
  1. check_mismatch_file.py 
<https://github.com/Wikidata/Purdue-Data-Mine-2024/blob/main/check_mismatch_file.py>
- Loads a target CSV into a pandas DataFrame
- Includes the function `check_mf_formatting` that will check the validity 
of the file for upload given the Mismatch Finder user guide 
<https://github.com/wmde/wikidata-mismatch-finder/blob/main/docs/UserGuide.md#creating-a-mismatches-import-file>
- Says that the file is ready for upload, or if the file is not valid, 
steps to fix it are printed
- At the start of the process, will also warn the user if the file is 
larger than the upload file size limit of 10 MB (see next file)
  2. split_mismatch_file.py 
<https://github.com/Wikidata/Purdue-Data-Mine-2024/blob/main/split_mismatch_file.py>
- Written in response to the upload limit of 10 MB for the Mismatch Finder 
API (see T360436 <https://phabricator.wikimedia.org/T360436>)
- A path to a CSV is passed, and if the file is greater than the upload 
limit, then CSV subsets are created in a directory that are below the upload 
limit
- A path to where the subset CSVs should be saved can be passed, and the 
resulting directory is checked to make sure it only has CSVs
- Whether the original CSV should be deleted can also be passed as an 
argument
  3. upload_mismatches.py 
<https://github.com/Wikidata/Purdue-Data-Mine-2024/blob/main/upload_mismatches.py>
- A path to a CSV or directory of CSVs is passed
- Python `requests` is used to execute the cURL request, with the 
`r.raise_for_status()` raising an error and printing the errors if the upload 
is unsuccessful
- Arguments further include the needed access token, a description, the 
external source, the URL for the external source, and verbosity
- Assertions are made to assure that arguments are correct
  
  Open questions
  --
  
  I've found the process of using these scripts for uploading mismatches to be 
much easier than using cURL, where the errors were not returned, or figuring out 
where all the needed arguments should go within an interface to make the request 
like Postman. Whether or not the second script should be included in the third 
is definitely something that should be considered based on end user feedback.
  
  Please let me know if there are any questions!

TASK DETAIL
  https://phabricator.wikimedia.org/T362301

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, luca.favorido, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, Mattia_Capozzi_WMDE, 
ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331
___


[Wikidata-bugs] [Maniphest] T356659: [QB] Remove references of broken tool from Mismatch Finder and Query Builder

2024-04-11 Thread AndrewTavis_WMDE
AndrewTavis_WMDE renamed this task from "[MSMF] [QB] Remove references of 
broken tool from Mismatch Finder and Query Builder" to "[QB] Remove references 
of broken tool from Mismatch Finder and Query Builder".

TASK DETAIL
  https://phabricator.wikimedia.org/T356659

To: AndrewTavis_WMDE
Cc: Aklapper, karapayneWMDE, AndrewTavis_WMDE, luca.favorido, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, Invadibot, maantietaja, 
Mattia_Capozzi_WMDE, ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T362151: [SW] The mismatch file description should be more visibly apparent in the Mismatch Finder UI

2024-04-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362151

To: AndrewTavis_WMDE
Cc: Aklapper, Sarai-WMDE, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, Mattia_Capozzi_WMDE, 
ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331
___


[Wikidata-bugs] [Maniphest] T362217: Mismatch finder long description modal doesn't close on X press

2024-04-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362217

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, Mattia_Capozzi_WMDE, ItamarWMDE, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T362217: Mismatch finder long description modal doesn't close on X press

2024-04-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362217

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, Mattia_Capozzi_WMDE, ItamarWMDE, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T362217: Mismatch finder long description modal doesn't close on X press

2024-04-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362217

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, Mattia_Capozzi_WMDE, ItamarWMDE, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T362217: Mismatch finder long description modal doesn't close on X press

2024-04-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Mismatch Finder, Wikidata.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  **Steps to replicate the issue**:
  
  In looking at the mismatches on Mismatch Finder 
<https://mismatch-finder.toolforge.org/results?ids=Q2804125%7CQ1400789%7CQ117225388%7CQ109394737%7CQ22964628%7CQ374855%7CQ6939795%7CQ16541128%7CQ1887363%7CQ6437641%7CQ24959108%7CQ30309997%7CQ109408429%7CQ27533146%7CQ110360032>,
 I'm seeing a minor bug :ladybug: For the mismatches that have a long 
description and a read full description element, when you open the modal to 
view the full description you can only close it with `Confirm` as the `X` in 
the top right doesn't function on my end.
  
  **What happens?**:
  
  The close modal `X` receives the focus state when it is clicked.
  
  **What should have happened instead?**:
  
  The modal should close.
  
  **Software version**:
  
  Currently deployed version of Mismatch Finder.
  
  **Other information**:
  
  Browser is Firefox 124.0.2 (64-bit)
  
  Screenshot below:
  
  F45566218: Screenshot from 2024-04-09 12-38-08.png 
<https://phabricator.wikimedia.org/F45566218>

TASK DETAIL
  https://phabricator.wikimedia.org/T362217

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, Mattia_Capozzi_WMDE, ItamarWMDE, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T362151: [SW] The mismatch file description should be more visibly apparent in the Mismatch Finder UI

2024-04-09 Thread AndrewTavis_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Mismatch Finder, Wikidata, Wikidata Dev Team.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  Problem
  ---
  
  Upon uploading some new mismatches, something that I'm realizing is that the 
description field for the mismatch file isn't very apparent within the Mismatch 
Finder UI. This doesn't allow for the person uploading the data to provide 
information about the context of the upload that would help a Wikidata editor 
fix the mismatches. To me there are as of now two situations for mismatches:
  
  1. It's a simple mismatch and one value should be chosen
  2. The mismatch AND other things on the item in question should be addressed
  
  Examples are:
  
  1. The mismatch is pretty clear on one property: external says date of birth 
for a person is 1969, and Wikidata says 1970
  2. The mismatch has led to an understanding that there are more problems: 
external source says date of birth of a person is 1969, Wikidata says 1970, and 
it's because there's a wrong identifier on the item that is leading to a soccer 
player also being a chess player so the items need to be split
  
  A screenshot of the current UI is:
  
  F45354311: Screenshot from 2024-04-09 13-54-12.png 
<https://phabricator.wikimedia.org/F45354311>
  
  F45354315: Screenshot from 2024-04-09 13-54-21.png 
<https://phabricator.wikimedia.org/F45354315>
  
  Having a more apparent description that could also be renamed `Description / 
Directions` or something along those lines would allow an uploader to provide 
more context so that issues in the second case could be addressed.
  
  Solution
  ---
  
  There are various ways that the description could be made more apparent. To 
me marking it also as "directions" in the UI would be helpful, but I'm 
definitely not suggesting that another field should be added to the upload API. 
Description and directions should be in one text to simplify the work to be 
done. We could also leave the description in the last column and add some 
spacing between the username for the upload and date above it.
  
  Interested to see what UX thinks on this!
  
  Open questions
  --
  
  How to best deal with the space considerations for the Mismatch Finder UI is 
definitely something that needs to be accounted for.
  
  Acceptance criteria
  ---
  
  [ ] The description field is a bit more apparent such that users would be 
able to see that there might be hints on how to best deal with the mismatch

TASK DETAIL
  https://phabricator.wikimedia.org/T362151

To: AndrewTavis_WMDE
Cc: Aklapper, Sarai-WMDE, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, Mattia_Capozzi_WMDE, 
ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331
___


[Wikidata-bugs] [Maniphest] T356618: [EPIC] Check of legacy wmde analytics infrastructure

2024-04-09 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Note that in checking the `tmp` directory just now, there still are 
files/directories in there, meaning that parts of the process are likely still 
running (maybe parts that don't need private data access). We'll be checking 
this again in a month once the VPS machines are shut down.

TASK DETAIL
  https://phabricator.wikimedia.org/T356618

To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, Invadibot, maantietaja, 
ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331
___


[Wikidata-bugs] [Maniphest] T342559: [Analytics] Monthly repeating tasks (next: April 2024)

2024-04-08 Thread AndrewTavis_WMDE
AndrewTavis_WMDE renamed this task from "[Analytics] Monthly repeating tasks 
(next: March 2024)" to "[Analytics] Monthly repeating tasks (next: April 2024)".

TASK DETAIL
  https://phabricator.wikimedia.org/T342559

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Jdforrester-WMF, Mbch331
___


[Wikidata-bugs] [Maniphest] T342559: [Analytics] Monthly repeating tasks (next: March 2024)

2024-04-03 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Sheet has been updated for March via a query of 
`wmde.wd_rest_api_metrics_monthly`, which is generated by Airflow. Slightly 
fewer user agents than last month, but IPs doubled.

TASK DETAIL
  https://phabricator.wikimedia.org/T342559

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Jdforrester-WMF, Mbch331
___


[Wikidata-bugs] [Maniphest] T342559: [Analytics] Monthly repeating tasks (next: March 2024)

2024-04-03 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T342559

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Jdforrester-WMF, Mbch331
___


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-03-29 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T361203

To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T356659: [MSMF] [QB] Remove references of broken tool from Mismatch Finder and Query Builder

2024-03-28 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T356659

To: AndrewTavis_WMDE
Cc: Aklapper, karapayneWMDE, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, Invadibot, maantietaja, Mattia_Capozzi_WMDE, ItamarWMDE, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___


[Wikidata-bugs] [Maniphest] T356659: [MSMF] [QB] Remove references of broken tool from Mismatch Finder and Query Builder

2024-03-28 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Updated the description as wmde/wikidata-mismatch-finder#878 
<https://github.com/wmde/wikidata-mismatch-finder/pull/878> fixed the problem 
for Mismatch Finder. At time of writing Curious Facts is still referenced in 
the Query Builder footer.

TASK DETAIL
  https://phabricator.wikimedia.org/T356659

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, karapayneWMDE, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, Invadibot, maantietaja, Mattia_Capozzi_WMDE, ItamarWMDE, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T356659: [MSMF] [QB] Remove references of broken tool from Mismatch Finder and Query Builder

2024-03-28 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T356659

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, karapayneWMDE, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, Invadibot, maantietaja, Mattia_Capozzi_WMDE, ItamarWMDE, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T356618: [EPIC] Check of legacy wmde analytics infrastructure

2024-03-28 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T356618

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, Invadibot, maantietaja, 
ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T342559: [Analytics] Monthly repeating tasks (next: March 2024)

2024-03-28 Thread AndrewTavis_WMDE
AndrewTavis_WMDE lowered the priority of this task from "Medium" to "Low".

TASK DETAIL
  https://phabricator.wikimedia.org/T342559

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Jdforrester-WMF, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T342559: [Analytics] Monthly repeating tasks (next: March 2024)

2024-03-28 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  I've added the numbers for February to the sheet based on the first DAG run 
and also just went through the query job one final time to check. The queries 
that are being run by the job are taken directly from the original queries, 
with only a few minor changes:
  
  For counting the filtered user agents we're doing the following:
  
    count(
        DISTINCT CASE
            WHEN user_agent NOT LIKE 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/% (KHTML, like Gecko) Chrome/% Safari/%'
            THEN user_agent
        END
    ) AS total_filtered_user_agents,
  
  ... instead of:
  
    SELECT
        count(DISTINCT user_agent) AS total_filtered_user_agents

    ...

    WHERE
        AND user_agent NOT LIKE 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/% (KHTML, like Gecko) Chrome/% Safari/%'
  
  Within the `WHERE` clause we are also adding `webrequest_source = 'text'` 
as discussed. This was suggested by WMF data engineering and means that we are 
not losing any information; rather, we are querying a subset of the data that 
still includes our original results.
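  
  The equivalence of the two forms above can be checked with a minimal sketch. 
It assumes an in-memory SQLite table named `requests` with made-up rows (both 
are hypothetical stand-ins, not the real webrequest data or the job's actual 
query), and it guesses that the `CASE` form is used so that the filtered and 
unfiltered counts can come from a single `SELECT`:

```python
import sqlite3

# Pattern copied from the queries above; everything else here is illustrative.
UA_PATTERN = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/% "
    "(KHTML, like Gecko) Chrome/% Safari/%"
)

# Hypothetical sample rows: (user_agent, ip).
rows = [
    (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/122.0 Safari/537.36",
        "10.0.0.1",
    ),
    ("my-bot/1.0", "10.0.0.2"),
    ("my-bot/1.0", "10.0.0.3"),
    ("curl/8.4.0", "10.0.0.3"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (user_agent TEXT, ip TEXT)")
conn.executemany("INSERT INTO requests VALUES (?, ?)", rows)

# Conditional aggregation: the CASE yields NULL for excluded user agents,
# and count(DISTINCT ...) ignores NULLs, so no WHERE filter is needed.
single_pass = conn.execute(
    """
    SELECT
        count(DISTINCT user_agent) AS total_user_agents,
        count(
            DISTINCT CASE
                WHEN user_agent NOT LIKE ? THEN user_agent
            END
        ) AS total_filtered_user_agents,
        count(DISTINCT ip) AS total_ips
    FROM requests
    """,
    (UA_PATTERN,),
).fetchone()

# The original form: filter in WHERE, then count.
where_filtered = conn.execute(
    "SELECT count(DISTINCT user_agent) FROM requests "
    "WHERE user_agent NOT LIKE ?",
    (UA_PATTERN,),
).fetchone()[0]

# Both approaches agree on the filtered count.
assert single_pass[1] == where_filtered
print(single_pass, where_filtered)
```

  A `WHERE` filter would also shrink the other counts in the same `SELECT`, 
which is why the conditional form is convenient when several metrics are 
computed in one pass.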
  
  I'll update the numbers for March once the next DAG run is finished at the 
start of next week!

TASK DETAIL
  https://phabricator.wikimedia.org/T342559

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Jdforrester-WMF, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-03-28 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Note that this task is dependent on whether a standardized system that would 
not require the published datasets is created. Such a system is discussed in 
T361214: Public dashboard process <https://phabricator.wikimedia.org/T361214>.

TASK DETAIL
  https://phabricator.wikimedia.org/T361203

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-03-28 Thread AndrewTavis_WMDE
AndrewTavis_WMDE removed AndrewTavis_WMDE as the assignee of this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T361203

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public dashboard pilot

2024-03-28 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Note that I've made T361214: Public dashboard process 
<https://phabricator.wikimedia.org/T361214> to explain our use case of a 
standardized public dashboard process :)

TASK DETAIL
  https://phabricator.wikimedia.org/T360298

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Aklapper, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-03-28 Thread AndrewTavis_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Wikidata Analytics (Kanban), Wikidata.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  In T341330:  [Analytics] Airflow implementation of unique ips accessing 
Wikidata's REST API metrics <https://phabricator.wikimedia.org/T341330> WMDE 
Analytics created its first Airflow DAG and the needed jobs for it. As a 
requirement for T360298:  [Analytics] Public dashboard pilot 
<https://phabricator.wikimedia.org/T360298> it seems that another step would be 
needed in order to have the data be on a publicly available dashboard - 
specifically that we need to add the published datasets 
<https://analytics.wikimedia.org/published/datasets/> as a target of the jobs 
such that the data is saved to HDFS and in TSV format in a place where it can 
be ingested by a dashboarding software like Turnilo 
<https://wikitech.wikimedia.org/wiki/Analytics/Systems/Turnilo>.
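
  As a rough illustration of the TSV output described above, the sketch below 
serializes one row of monthly metrics to tab-separated text. The column names 
and values are taken from the `wmde.wd_rest_api_metrics_monthly` output posted 
in T341330; the `to_tsv` helper is hypothetical, and writing the result to HDFS 
or to the published datasets directory is deliberately left out, since that 
step depends on the Airflow job setup:

```python
import csv
import io

# One row as reported for the monthly metrics table in T341330.
rows = [
    {
        "month": "2024-02-01",
        "total_user_agents": 458,
        "total_filtered_user_agents": 424,
        "total_ips": 14539,
    },
]


def to_tsv(records):
    """Serialize a list of dicts to TSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


tsv_text = to_tsv(rows)
print(tsv_text, end="")
```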

TASK DETAIL
  https://phabricator.wikimedia.org/T361203

WORKBOARD
  https://phabricator.wikimedia.org/project/board/6546/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T341330: [Analytics] Airflow implementation of unique ips accessing Wikidata's REST API metrics

2024-03-27 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Merge request 
<https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/631>
 has been brought in, and we've successfully deployed!  An output from the new 
`wmde.wd_rest_api_metrics_monthly` table is:
  
| month      | total_user_agents | total_filtered_user_agents | total_ips |
|------------|-------------------|----------------------------|-----------|
| 2024-02-01 | 458               | 424                        | 14539     |

TASK DETAIL
  https://phabricator.wikimedia.org/T341330

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T341330: [Analytics] Airflow implementation of unique ips accessing Wikidata's REST API metrics

2024-03-27 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T341330

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public dashboard pilot

2024-03-27 Thread AndrewTavis_WMDE
AndrewTavis_WMDE renamed this task from " [Analytics] Public Superset dashboard 
pilot" to " [Analytics] Public dashboard pilot".
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360298

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Aklapper, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public Superset dashboard pilot

2024-03-27 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Following a large discussion about this in the `data-engineering-collab` 
channel on Slack, the general findings are:
  
  - The public Superset instance isn't suitable for this at this time and 
there's no timetable for it to be (see above comments)
  - A suggestion of putting this information on Wikistats 
<https://stats.wikimedia.org/#/all-projects> was agreed to be too complex to 
set up and manage
    - We would need to use AQS 2 (Analytics Query Service) to make a 
service/API for this
  - An initial suggestion from WMDE to target Prometheus with the DAG was 
decided against
    - It is possible to push data to Prometheus, but there are many 
complications with this
  - A new suggestion is to leverage Turnilo 
<https://wikitech.wikimedia.org/wiki/Analytics/Systems/Turnilo> for this
    - There is a private instance at turnilo.wikimedia.org 
<https://turnilo.wikimedia.org/>
    - There are also public instances of this as seen at 
wiki-search-referrals.wmcloud.org <https://wiki-search-referrals.wmcloud.org/>
      - Wikitech docs for this can be found at 
wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/referrer_daily/Dashboard 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/referrer_daily/Dashboard>
      - The Turnilo dashboard is hosted on Cloud VPS 
<https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS>
      - The code for the Turnilo instance can be found at 
github.com/wikimedia/research-api-endpoint-template/turnilo-druid 
<https://github.com/wikimedia/research-api-endpoint-template/tree/turnilo-druid>
    - The way this would be achieved is that we would have the published 
datasets <https://analytics.wikimedia.org/published/datasets/> folder be 
another target of the DAG jobs, and we'd then ingest this data via the Turnilo 
instance
  
  This sounds like a good way forward, but the question of setting up the 
Turnilo instance and maintaining it then comes to mind. A big question is: how 
often are data pipelines supposed to be public, and would putting it all on a 
single Turnilo instance work well for our requirements?

TASK DETAIL
  https://phabricator.wikimedia.org/T360298

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Aklapper, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public Superset dashboard pilot

2024-03-27 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Further checks on this: the dashboarding process for the public Superset 
seems to be based on a few preset databases that have the data from Wikimedia 
projects (see SQL Lab <https://superset.wmcloud.org/sqllab/>). As of now I'm 
doubting whether we'd be able to have active rights over one of these such that 
tables we'd generate in Airflow could be added to one and used for 
visualizations. I've asked in the WMDE data channel if there are people with 
domain knowledge of Graphite who could help with setting up a process where 
it would be one of the targets of the Airflow jobs. This seems simpler to me, 
with the end result being that we use the main Superset instance for 
data processes that rely on the data lake/private data access, and then use 
Grafana for dashboards that are meant to be public facing.

TASK DETAIL
  https://phabricator.wikimedia.org/T360298

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Aklapper, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public Superset dashboard pilot

2024-03-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Note that from the most recent discussions with WMF data engineering, there 
isn't a set workflow for getting information into a place where it can be 
accessed via the Public Superset instance. We would need to edit the DAG to 
include an export step that gets the data to a place where the public 
instance can access it. This would require some more research.
  
  Maybe another thing to consider is whether we'd prefer to have Graphite be 
the end export location for the data and then make a Grafana dashboard for 
this? Grafana does serve as the current public facing data dashboards for 
Wikidata, so it might make sense to leverage it more.

TASK DETAIL
  https://phabricator.wikimedia.org/T360298

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Aklapper, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T348999: Add linter and formatter to wmfdata-python (and link check)

2024-03-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Exciting! I'll play around a bit towards the end of next week and send along 
a PR with the workflow, docs and changes given the local run warnings. Will 
let you know if anything comes up before then. Have a nice weekend when it 
comes along!

TASK DETAIL
  https://phabricator.wikimedia.org/T348999

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: nshahquinn-wmf, xcollazo, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, Mohamed-Awnallah, Astuthiodit_1, lbowmaker, BTullis, 
karapayneWMDE, Invadibot, Ywats0ns, maantietaja, ItamarWMDE, Akuckartz, 
Mayakp.wiki, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-03-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE claimed this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T348999: Add linter and formatter to wmfdata-python (and link check)

2024-03-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  @nshahquinn-wmf, @xcollazo: checking in on this one again. I would have some 
time in the next two weeks or so to implement a PR workflow check of linting 
and code formatting. If folks are fine with Ruff 
<https://github.com/astral-sh/ruff> that'd be easiest on my end, but also happy 
to consider others! I'd also suggest adding in a `.vscode/extensions.json` file 
that would allow us to suggest VS Code extensions like the Ruff extension 
<https://marketplace.visualstudio.com/items?itemName=charliermarsh.ruff> so 
people are getting the appropriate warnings during editing. This would of 
course also include some documentation on how to run the checks locally before 
a PR.
  
  Let me know if this would be of interest on your all's end!
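  
  For reference, a minimal sketch of what the suggested 
`.vscode/extensions.json` could contain, generated here with Python so the 
JSON is guaranteed to be well formed. The `recommendations` key is the standard 
VS Code workspace setting, and the extension ID is taken from the marketplace 
link above; whether wmfdata-python would want further entries is an open 
assumption:

```python
import json

# Workspace extension recommendations; the ID matches the `itemName`
# in the marketplace URL for the Ruff extension.
extensions = {"recommendations": ["charliermarsh.ruff"]}

# This text would be saved as `.vscode/extensions.json` in the repository root.
extensions_json = json.dumps(extensions, indent=2) + "\n"
print(extensions_json, end="")
```

  Local checks before a PR could then be run with `ruff check .` for linting 
and `ruff format --check .` for formatting, assuming Ruff is the tool that ends 
up being chosen.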

TASK DETAIL
  https://phabricator.wikimedia.org/T348999

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: nshahquinn-wmf, xcollazo, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, Mohamed-Awnallah, Astuthiodit_1, lbowmaker, BTullis, 
karapayneWMDE, Invadibot, Ywats0ns, maantietaja, ItamarWMDE, Akuckartz, 
Mayakp.wiki, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T341330: [Analytics] Airflow implementation of unique ips accessing Wikidata's REST API metrics

2024-03-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Merge request for this has been sent and can be found here 
<https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/631>
 :) Requested WMF's review on this first one, but we'll need to take over from 
there unless there are problems with it all.

TASK DETAIL
  https://phabricator.wikimedia.org/T341330

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T341330: [Analytics] Airflow implementation of unique ips accessing Wikidata's REST API metrics

2024-03-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T341330

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T356618: [EPIC] Check of legacy wmde analytics infrastructure

2024-03-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T356618

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, Astuthiodit_1, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T356618: [EPIC] Check of legacy wmde analytics infrastructure

2024-03-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T356618

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, Astuthiodit_1, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T357697: Archive WMDE analytics Gerrit repositories

2024-03-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE closed this task as "Resolved".
AndrewTavis_WMDE claimed this task.
AndrewTavis_WMDE added a comment.


  Fantastic! Thank you both again for the help here :) Really is great to be 
winding down these processes and moving onto the next steps! 

TASK DETAIL
  https://phabricator.wikimedia.org/T357697

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: hashar, brouberol, Manuel, Aklapper, AndrewTavis_WMDE, 
Baeisvar52braevincent, Danny_Benjafield_WMDE, Astuthiodit_1, YoutacrsVARs, 
MajaWiki82, karapayneWMDE, Invadibot, maantietaja, Peteosx1x, ItamarWMDE, 
Mgagat, Akuckartz, Totolinototo3, Hassoonbxl, Zanziii, Sadisticturd, Nandana, 
Zylc, Reari, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, Pppery, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Luke081515, Wikidata-bugs, aude, 
Dinoguy1000, Jdforrester-WMF, Mbch331, Jay8g, Krenair
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T357697: Archive WMDE analytics Gerrit repositories

2024-03-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Thank you both so much! Let me know when the GitHub repos have been deleted 
and I'll resolve this and update the greater epic.

TASK DETAIL
  https://phabricator.wikimedia.org/T357697

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: hashar, brouberol, Manuel, Aklapper, AndrewTavis_WMDE, 
Baeisvar52braevincent, Danny_Benjafield_WMDE, Astuthiodit_1, YoutacrsVARs, 
MajaWiki82, karapayneWMDE, Invadibot, maantietaja, Peteosx1x, ItamarWMDE, 
Mgagat, Akuckartz, Totolinototo3, Hassoonbxl, Zanziii, Sadisticturd, Nandana, 
Zylc, Reari, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, Pppery, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Luke081515, Wikidata-bugs, aude, 
Dinoguy1000, Jdforrester-WMF, Mbch331, Jay8g, Krenair
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T357697: Archive WMDE analytics Gerrit repositories

2024-03-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE edited projects, added Wikidata Analytics (Kanban); removed 
Wikidata Analytics.

TASK DETAIL
  https://phabricator.wikimedia.org/T357697

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: hashar, brouberol, Manuel, Aklapper, AndrewTavis_WMDE, 
Baeisvar52braevincent, Danny_Benjafield_WMDE, Astuthiodit_1, YoutacrsVARs, 
MajaWiki82, karapayneWMDE, Invadibot, maantietaja, Peteosx1x, ItamarWMDE, 
Mgagat, Akuckartz, Totolinototo3, Hassoonbxl, Zanziii, Sadisticturd, Nandana, 
Zylc, Reari, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, Pppery, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Luke081515, Wikidata-bugs, aude, 
Dinoguy1000, Jdforrester-WMF, Mbch331, Jay8g, Krenair
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T360436: [MSMF] Add upload file limit to Mismatch Finder documentation

2024-03-20 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Per suggestion from @noarave I reran the curl command 
<https://github.com/wmde/wikidata-mismatch-finder/blob/main/docs/UserGuide.md#example-with-curl>
 with `-v` at the end for verbose output. Of note, the first line of the output 
is `Note: Unnecessary use of -X or --request, POST is already inferred.`. 
Aside from that, I still got an empty string for `"message"` at the end and 
nothing indicating that the file size limit was exceeded. The end of the 
response is:
  
{
"message": ""
* Connection #0 to host mismatch-finder.toolforge.org left intact
}%

TASK DETAIL
  https://phabricator.wikimedia.org/T360436

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: noarave, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, Mattia_Capozzi_WMDE, ItamarWMDE, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org

