[Wikidata-bugs] [Maniphest] T356618: [EPIC] Check of legacy wmde analytics infrastructure

2024-06-12 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a project: Epic.

TASK DETAIL
  https://phabricator.wikimedia.org/T356618

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE, me, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, BeautifulBold, Suran38, 
Invadibot, maantietaja, Peteosx1x, NavinRizwi, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Dinoguy1000, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T356618: [EPIC] Check of legacy wmde analytics infrastructure

2024-06-12 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T356618

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, Invadibot, maantietaja, 
ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T366621: [Analytics] Analysis of REST API user agents for May 2024

2024-06-11 Thread AndrewTavis_WMDE
AndrewTavis_WMDE moved this task from In progress to Product verification on 
the Wikidata Analytics (Kanban) board.
AndrewTavis_WMDE added a comment.


  @Manuel and @Lydia_Pintscher, just shared a folder with the two CSVs on 
Wolke. Let me know if there's anything else needed, and I will set a reminder 
that they should be deleted on my end in 89 days (they were generated 
yesterday). Sharing has been disabled on the directory, so if others need 
access, then let me know :)
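  The deletion date behind that reminder can be checked with a few lines of Python. This is only a sketch: the 90-day retention period comes from the request quoted elsewhere in this task, and the generation date is assumed to be 2024-06-10, the day before this comment.

```python
from datetime import date, timedelta

RETENTION_DAYS = 90                 # retention period requested in the task
generated_on = date(2024, 6, 10)    # assumption: files generated the day before this comment
shared_on = date(2024, 6, 11)       # assumption: date of this comment

# Last day the files may exist, and how many days remain from the sharing date.
delete_by = generated_on + timedelta(days=RETENTION_DAYS)
days_left = (delete_by - shared_on).days

print(delete_by.isoformat(), days_left)
```

  Under these assumptions the reminder falls 89 days after the files were shared.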

TASK DETAIL
  https://phabricator.wikimedia.org/T366621

WORKBOARD
  https://phabricator.wikimedia.org/project/board/6546/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T366621: [Analytics] Analysis of REST API user agents for May 2024

2024-06-11 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T366621

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T360296: [Analytics] Implement data process to identify missing Wiktionary entries

2024-06-11 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Hi @MarcoSwart 👋 Thanks for the communication here :) I guess I'm a bit 
confused by how the other one would be used. You're roughly talking about:
  
  | word_that_is_missing_from_a_wiktionary | number_of_wiktionaries_that_do_have_it |
  | MOST_MISSING_WORD                      | 156                                    |
  | NEXT_MOST_MISSING_WORD                 | 155                                    |
  | ...                                    | ...                                    |
  
  With that we're missing the `Wiktionary` column, so then editors wouldn't 
have the ability to easily know if their Wiktionary needed that word or not? 
Maybe it can be gotten from another part of the data process. Let me explain :)
  
  What's planned for this data process at this point is two outputs:
  
  - Missing Entries (I miss you ...) as described above 
<https://phabricator.wikimedia.org/T360296#9879652> - per Wiktionary what are 
the 1,000 most popular missing words
  - Most Popular - the most popular entries across all Wiktionaries
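  As a rough illustration of how these two outputs relate, here is a minimal Python sketch. It is not the planned Airflow/Hive process. It assumes that "popularity" means the number of Wiktionaries that have an entry, and the function names and toy data are hypothetical.

```python
from collections import Counter


def count_coverage(entries_by_wiktionary):
    """Count how many Wiktionaries have each word."""
    coverage = Counter()
    for words in entries_by_wiktionary.values():
        coverage.update(set(words))
    return coverage


def missing_entries(entries_by_wiktionary, top_n=1000):
    """Rows of (wiktionary, missing_word, number_of_wiktionaries_that_do_have_it)."""
    coverage = count_coverage(entries_by_wiktionary)
    rows = []
    for wiktionary, words in sorted(entries_by_wiktionary.items()):
        have = set(words)
        missing = [(w, n) for w, n in coverage.items() if w not in have]
        # Most widely covered words first; ties broken alphabetically.
        missing.sort(key=lambda pair: (-pair[1], pair[0]))
        rows.extend((wiktionary, w, n) for w, n in missing[:top_n])
    return rows


def most_popular(entries_by_wiktionary, top_n=10000):
    """Rows of (word, number_of_wiktionaries_that_have_it) across all Wiktionaries."""
    coverage = count_coverage(entries_by_wiktionary)
    return sorted(coverage.items(), key=lambda pair: (-pair[1], pair[0]))[:top_n]


# Hypothetical toy data: three Wiktionaries and the words they have.
toy = {
    "enwiktionary": ["cat", "dog", "tree"],
    "dewiktionary": ["cat", "dog"],
    "nlwiktionary": ["cat", "tree", "fiets"],
}
print(missing_entries(toy, top_n=2))
print(most_popular(toy, top_n=2))
```

  Keeping the Wiktionary as the first column of Missing Entries is what lets editors see whether their own Wiktionary lacks a given word.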
  
  Maybe Most Popular would serve your interests above? This would be a CSV with, 
say, the 10,000 or 100,000 (or however many you all would need) most popular 
entries across all Wiktionaries. All of this would update on a daily basis. 
Would that work for you?
  
  Please let me know if I'm understanding correctly, by the way! Appreciate 
your feedback :)

TASK DETAIL
  https://phabricator.wikimedia.org/T360296

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred, 
Lydia_Pintscher, MarcoSwart, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T366621: [Analytics] Analysis of REST API user agents for May 2024

2024-06-11 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  @Manuel, my assumption was that you could help any non-analytics PMs or go 
through the results with them as you have the needed access. Using Google for 
PII is not something we're supposed to do if it can be avoided, but I have no 
experience with Wolke. Please let me know if you'd like me to look into Wolke 
or send the files over Drive.

TASK DETAIL
  https://phabricator.wikimedia.org/T366621

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T360296: [Analytics] Implement data process to identify missing Wiktionary entries

2024-06-11 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Talked further with WMF about this just now. One basic question for the end 
users: would it make it more convenient for you all if the exported datasets 
were per Wiktionary? There are two options here, with missing entries being 
used as an example:
  
  1. We export one file that has all missing entries for all Wiktionaries
     - 188,000 rows x 3 columns
     - 188,000 rows = the 1,000 most popular missing entries for each 
       Wiktionary (there are 188 in the data)
     - 3 columns
       - The Wiktionary
       - The word that's missing from it
       - The total of the other Wiktionaries that have it
  2. We export 188 CSVs, each of length 1,000 with the above columns
  
  Reason for option 1 or 2 and not both is that we don't want to keep the data 
in duplicate both in the published datasets directories and in the data lake. 
Option 1 is easier, but we can figure out Option 2 if that would be your all's 
preference.
  
  So the baseline question for each option is:
  
  1. If you're only working on one Wiktionary, would you be ok with getting it 
as a subset from the whole dataset?
  2. If you're working on more than one Wiktionary, would you be ok with 
getting the separate datasets and combining them?
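  To make the two workflows concrete, here is a minimal Python sketch using only the standard library. The column names and the in-memory sample data are assumptions for illustration. The real export format has not been decided.

```python
import csv
import io


def subset_for_wiktionary(combined_csv, wiktionary):
    """Option 1: filter one combined export down to a single Wiktionary."""
    reader = csv.DictReader(combined_csv)
    return [row for row in reader if row["wiktionary"] == wiktionary]


def combine_exports(per_wiktionary_csvs):
    """Option 2: concatenate several per-Wiktionary exports into one list of rows."""
    rows = []
    for handle in per_wiktionary_csvs:
        rows.extend(csv.DictReader(handle))
    return rows


# Hypothetical header; the three columns follow the description above.
HEADER = "wiktionary,missing_word,n_other_wiktionaries\n"

# Option 1: one file with every Wiktionary in it.
combined = io.StringIO(HEADER + "enwiktionary,fiets,1\nnlwiktionary,dog,2\n")
nl_rows = subset_for_wiktionary(combined, "nlwiktionary")

# Option 2: one file per Wiktionary, combined by the user.
en_file = io.StringIO(HEADER + "enwiktionary,fiets,1\n")
nl_file = io.StringIO(HEADER + "nlwiktionary,dog,2\n")
all_rows = combine_exports([en_file, nl_file])

print(nl_rows)
print(len(all_rows))
```

  With real files, each `io.StringIO` object would be replaced by a file opened with `open(path, newline="")`. Either direction is a few lines of code, so the choice is mostly about which case is more common for editors.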
  
  Let us know which would be better for your workflow! And thanks for your 
continued interest in this. Great talks today about the various options we have 
😊

TASK DETAIL
  https://phabricator.wikimedia.org/T360296

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred, 
Lydia_Pintscher, MarcoSwart, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T366621: [Analytics] Analysis of REST API user agents for May 2024

2024-06-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T366621

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T366621: [Analytics] Analysis of REST API user agents for May 2024

2024-06-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  I can also prepare a notebook with quick functions to load and explore the 
data, if that would make the option I suggested a bit easier.

TASK DETAIL
  https://phabricator.wikimedia.org/T366621

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T366621: [Analytics] Analysis of REST API user agents for May 2024

2024-06-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  > Would it be possible to send us a spreadsheet (and schedule it for deletion 
after 90 days)?
  
  I'd prefer to transfer via the servers if possible given the comment here 
<https://phabricator.wikimedia.org/T358311#9820450> from WMF Engineering. I'm 
also not sure how to schedule a spreadsheet for deletion, but can look into 
this if this would be preferable.

TASK DETAIL
  https://phabricator.wikimedia.org/T366621

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T366621: [Analytics] Analysis of REST API user agents for May 2024

2024-06-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Base queries for all of this are ready :) Let me know on the above and I'll 
finalize them.
  
  Re how to send the files: my suggestion would be that I put them into my 
`stat1010` and then @Manuel can migrate them to his. From there I'll delete my 
copy and he can delete his once he and @Lydia_Pintscher are done checking them. 
Suggesting this as I can't move the files into another user's directory myself.
  
  Generally, from one's home directory the command would be:
  
    # The last . is the current directory, and autocomplete should work.
    cp ../andrewtavis-wmde/wikidata/2024/T366621_rest_api_user_agents/FILE_NAME.csv .
  
  Let me know how this sounds!

TASK DETAIL
  https://phabricator.wikimedia.org/T366621

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T366621: [Analytics] Analysis of REST API user agents for May 2024

2024-06-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Checking on the numbers here really quick: the request is for the top `1000` 
user agents and then a sample of `1000` user agents, but the total is `1221`. 
Would an ordered list of all of them make more sense, as we're talking about a 
sample of 82%? There really isn't going to be a difference between the first 
two sets. How about an ordered list of all of them, plus another ordered list 
of all that were active in May and not in April?
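  For reference, the 82% figure and the two proposed lists can be sketched in Python as below. This is an illustration only: the user agent names and request counts are made up, and the real data comes from the analytics queries rather than in-memory dictionaries.

```python
def rank_user_agents(request_counts):
    """Order user agents by request count (descending), ties broken by name."""
    return sorted(request_counts.items(), key=lambda pair: (-pair[1], pair[0]))


def new_in_month(current, previous):
    """Ordered list of user agents active in `current` but absent from `previous`."""
    only_current = {ua: n for ua, n in current.items() if ua not in previous}
    return rank_user_agents(only_current)


# Share of all user agents that a 1000-row sample would cover.
sample_share = 1000 / 1221  # about 0.819, i.e. roughly 82%

# Hypothetical toy data: user agent -> number of requests in the month.
april = {"bot-a/1.0": 50, "script-b": 5}
may = {"bot-a/1.0": 40, "script-b": 7, "tool-c/2.1": 12, "app-d": 12}

print(round(sample_share * 100))
print(rank_user_agents(may))
print(new_in_month(may, april))
```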

TASK DETAIL
  https://phabricator.wikimedia.org/T366621

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T363583: Generate historical weekly segments of Wikidata item sitelink segmentations

2024-06-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Status is open as T364045 <https://phabricator.wikimedia.org/T364045> has 
been resolved :)

TASK DETAIL
  https://phabricator.wikimedia.org/T363583

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T363583: Generate historical weekly segments of Wikidata item sitelink segmentations

2024-06-10 Thread AndrewTavis_WMDE
AndrewTavis_WMDE changed the task status from "Stalled" to "Open".

TASK DETAIL
  https://phabricator.wikimedia.org/T363583

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-06-06 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Unstalled as the plan for the data export has been approved in T365699 
<https://phabricator.wikimedia.org/T365699> :)

TASK DETAIL
  https://phabricator.wikimedia.org/T361203

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-06-06 Thread AndrewTavis_WMDE
AndrewTavis_WMDE changed the task status from "Stalled" to "Open".

TASK DETAIL
  https://phabricator.wikimedia.org/T361203

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Items that contain a sitelink to one of the Wikimedia projects over time

2024-06-06 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Unstalled as the table has been created :)

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T343019: [EPIC] Segments of Wikidata's data over time [up to milestone 3]

2024-06-06 Thread AndrewTavis_WMDE
AndrewTavis_WMDE changed the status of subtask T362849: [Analytics] Items that 
contain a sitelink to one of the Wikimedia projects over time  from 
"Stalled" to "Open".

TASK DETAIL
  https://phabricator.wikimedia.org/T343019

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Manuel, AndrewTavis_WMDE
Cc: Aklapper, Manuel, me, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
BeautifulBold, Suran38, karapayneWMDE, Invadibot, maantietaja, Peteosx1x, 
NavinRizwi, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Dinoguy1000, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Items that contain a sitelink to one of the Wikimedia projects over time

2024-06-06 Thread AndrewTavis_WMDE
AndrewTavis_WMDE changed the task status from "Stalled" to "Open".

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360296: [Analytics] Implement data process to identify missing Wiktionary entries

2024-06-06 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Hi @MarcoSwart, sorry for changing the status without explanation. Was in a 
meeting and we were moving things around, but obviously context should have 
been added. This is stalled for now as we're waiting for WMF to advise us on 
the best way forward on migrating data from MariaDB to HDFS. The data processes 
we need to use for this cannot be run directly on MariaDB in a sustainable way 
that's in line with long term supported data practices, so first we need to 
migrate the data to the private data cluster, and then our normal workflows 
take over. This migration is non-standard, and they're looking into how best to 
support/guide us.
  
  By the sounds of it they're allotting the budget of a Staff Engineer to help 
with this soon. The data pipeline and the needed queries are basically done, so 
what we're waiting on is the process to migrate the data as a final step. From 
there we'll get the process up and running such that the data at the very least 
will be exported to the published datasets folders 
<https://analytics.wikimedia.org/published/datasets/> on a daily basis.
  
  As far as a dashboard is concerned, we're also in the midst of looking into a 
more sustainable solution for presenting information to the public. This is 
similarly tied to WMF's efforts on this front. For now we hope that an export 
to the published datasets will suffice such that the community can then take 
the data and model it as they wish. I'd be happy to help people with simple 
Python scripts to get the data loaded into data frames and more workable states 
once that's done! I'd put an estimate on the data process as end of month if 
things work out with WMF's resources, but if not then it's August as I'm away 
for most of July (no later than that though).
  
  Please let me know if you have further questions, and again sorry for the 
confusion!

TASK DETAIL
  https://phabricator.wikimedia.org/T360296

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred, 
Lydia_Pintscher, MarcoSwart, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T363583: Generate historical weekly segments of Wikidata item sitelink segmentations

2024-06-06 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Note, work that will unblock this task is being done in T364045: [Bug?] Can't 
find wikidatawiki on wmf.mediawiki_wikitext_history 
<https://phabricator.wikimedia.org/T364045>.

TASK DETAIL
  https://phabricator.wikimedia.org/T363583

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T366621: [Analytics] Analysis of REST API user agents for May 2024

2024-06-06 Thread AndrewTavis_WMDE
AndrewTavis_WMDE claimed this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T366621

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T366621: [Analytics] Analysis of REST API user agents for May 2024

2024-06-06 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Quick note on this from discussion: something to check as well would be those 
user agents that were present in May 2024 but were not active in April 2024 :)

TASK DETAIL
  https://phabricator.wikimedia.org/T366621

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T332899: [EPIC] Migrate selected R-based Wikidata products

2024-06-06 Thread AndrewTavis_WMDE
AndrewTavis_WMDE changed the status of subtask T360296: [Analytics] Implement 
data process to identify missing Wiktionary entries  from "Open" to 
"Stalled".

TASK DETAIL
  https://phabricator.wikimedia.org/T332899

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Manuel, AndrewTavis_WMDE
Cc: MarcoSwart, Lydia_Pintscher, JeanFred, AndrewTavis_WMDE, Pamputt, Aklapper, 
Manuel, me, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, BeautifulBold, 
Suran38, karapayneWMDE, Invadibot, maantietaja, Peteosx1x, NavinRizwi, 
ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Dinoguy1000, Mbch331


[Wikidata-bugs] [Maniphest] T360296: [Analytics] Implement data process to identify missing Wiktionary entries

2024-06-06 Thread AndrewTavis_WMDE
AndrewTavis_WMDE changed the task status from "Open" to "Stalled".

TASK DETAIL
  https://phabricator.wikimedia.org/T360296

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred, 
Lydia_Pintscher, MarcoSwart, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360296: [Analytics] Implement data process to identify missing Wiktionary entries

2024-06-04 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  There's now a draft for the DAGs 
<https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/725/diffs#96f15bf21ce9c18b6638c53402e35a2654aeeff6>
 open on GitLab. There's still lots to do as WMF wants to sync on suggestions 
they'll give me on how to do the MariaDB to HDFS data transfer, but the DAGs 
are mapped out and the Hive queries they're calling have been prepared :)

TASK DETAIL
  https://phabricator.wikimedia.org/T360296

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred, 
Lydia_Pintscher, MarcoSwart, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T356618: [EPIC] Check of legacy wmde analytics infrastructure

2024-06-04 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T356618

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, Invadibot, maantietaja, 
ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T356618: [EPIC] Check of legacy wmde analytics infrastructure

2024-06-04 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T356618

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, Invadibot, maantietaja, 
ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-06-04 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Thanks so much for the support here, @BTullis! I'll update the epic 
<https://phabricator.wikimedia.org/T356618> with this being done. So close to 
being finished with all this :)

TASK DETAIL
  https://phabricator.wikimedia.org/T358311

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: BTullis, AndrewTavis_WMDE
Cc: BTullis, brouberol, JAllemandou, MoritzMuehlenhoff, Manuel, Aklapper, 
AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360296: [Analytics] Implement data process to identify missing Wiktionary entries

2024-06-03 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360296

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred, 
Lydia_Pintscher, MarcoSwart, Manuel, me, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, BeautifulBold, Suran38, karapayneWMDE, Invadibot, maantietaja, 
Peteosx1x, NavinRizwi, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Dinoguy1000, Mbch331


[Wikidata-bugs] [Maniphest] T360296: [Analytics] Implement data process to identify missing Wiktionary entries

2024-06-03 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360296

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred, 
Lydia_Pintscher, MarcoSwart, Manuel, me, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, BeautifulBold, Suran38, karapayneWMDE, Invadibot, maantietaja, 
Peteosx1x, NavinRizwi, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Dinoguy1000, Mbch331


[Wikidata-bugs] [Maniphest] T360296: [Analytics] Implement data process to identify missing Wiktionary entries

2024-06-03 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360296

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred, 
Lydia_Pintscher, MarcoSwart, Manuel, me, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, BeautifulBold, Suran38, karapayneWMDE, Invadibot, maantietaja, 
Peteosx1x, NavinRizwi, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Dinoguy1000, Mbch331


[Wikidata-bugs] [Maniphest] T360296: [Analytics] Implement data process to identify missing Wiktionary entries

2024-06-03 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  wmde/analytics/hql/airflow_jobs/wiktionary_cognate 
<https://gitlab.wikimedia.org/repos/wmde/analytics/-/tree/main/hql/airflow_jobs/wiktionary_cognate?ref_type=heads>
 on GitLab now has all the needed queries for missing entries, most popular 
entries and comparing Wiktionaries. Was easier to write all three at once 
rather than lose some context later. Note that these are Hive queries as the 
goal is to
  
  I've discussed the further infrastructure needs at length with a data 
engineer at WMF, with the steps from here being:
  
  - I need to write a PySpark job that gets the `cognate_wiktionary` tables 
from the MariaDB instance and puts them on HDFS on a daily basis
    - This will go in wmde/analytics/spark 
<https://gitlab.wikimedia.org/repos/wmde/analytics/-/tree/main/spark?ref_type=heads>
    - Note that this is relatively uncharted territory (it can be done with 
current long term supported tools, but will be a new type of job)
  - From there we need a DAG that will eventually include all three processes 
discussed above
    - The reason we'll do a DAG for all three is that each will rely on the 
PySpark job to migrate the data from MariaDB to HDFS
    - We can start with just doing missing entries as an output for this task, 
and then other tasks can add the other two to the DAG
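  
  A minimal sketch of the dependency structure described above, using only the 
Python standard library rather than Airflow itself. All task names are 
hypothetical placeholders, as the real DAG and job names are not given here. It 
only illustrates that every query depends on the same MariaDB to HDFS transfer, 
and that the DAG can start with missing entries and grow later:

```python
from graphlib import TopologicalSorter

# Hypothetical task names; the real DAG and job names are not given in this thread.
TRANSFER = "transfer_cognate_mariadb_to_hdfs"  # stands in for the daily PySpark job
DEPENDENTS = [
    "missing_entries",       # the output for this task
    "most_popular_entries",  # could be added by a later task
    "compare_wiktionaries",  # could be added by a later task
]


def build_dag(enabled):
    """Map each enabled query to the single transfer job it depends on."""
    graph = {TRANSFER: set()}
    for name in enabled:
        graph[name] = {TRANSFER}
    return graph


def run_order(graph):
    """Return one valid execution order, dependencies first."""
    return list(TopologicalSorter(graph).static_order())


# First iteration: only missing entries. Later: all three processes.
first_iteration = run_order(build_dag(DEPENDENTS[:1]))
full_dag = run_order(build_dag(DEPENDENTS))
print(first_iteration)
print(full_dag)
```

  In both orders the transfer job comes first, which is the reason for keeping 
all three processes in one DAG.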

TASK DETAIL
  https://phabricator.wikimedia.org/T360296

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred, 
Lydia_Pintscher, MarcoSwart, Manuel, me, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, BeautifulBold, Suran38, karapayneWMDE, Invadibot, maantietaja, 
Peteosx1x, NavinRizwi, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Dinoguy1000, Mbch331


[Wikidata-bugs] [Maniphest] T342559: [Analytics] Monthly repeating tasks (next: July 2024)

2024-06-03 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Table has been updated with the new data from the most recent DAG run. Lots 
more user agents - almost a 3x increase. Noting this for now as maybe grounds 
for further investigation later, but IPs are also increasing (just not by as 
much).
  
  Note that we need to do some work on the `wmde.wd_rest_api_metrics_monthly` 
table at this point as directed in T365699: Published datasets data release 
request for Wikidata REST API metrics 
<https://phabricator.wikimedia.org/T365699>. Specifically, as of now we have 
all the outputs of this table being `bigint` values. As this type of data 
is classified under users, we need to ensure that data points less than 25 are 
recorded as `"<25"`. All columns will thus need to be converted over to being 
strings. Goal on this would be of course to not have any data loss in this 
process. The query has already been updated locally, and will be changed with 
the next deploy to add the published datasets as a deployment target (both the 
DAG and the jobs need to be updated at this point).
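  
  A minimal sketch of the masking rule described above, written in plain Python 
rather than as the actual Hive query (which is not shown here). The column 
names in the example row are hypothetical; only the threshold of 25 and the 
`"<25"` string come from the requirement:

```python
THRESHOLD = 25  # data points below this are published as "<25"


def mask_small_count(value, threshold=THRESHOLD):
    """Convert a bigint-style count to its published string form."""
    if value is None:
        return None  # keep missing values missing rather than inventing a count
    if value < threshold:
        return f"<{threshold}"
    return str(value)


def mask_row(row, count_columns):
    """Apply the masking to the count columns of one row, leaving others as-is."""
    return {
        key: mask_small_count(val) if key in count_columns else val
        for key, val in row.items()
    }


# Hypothetical row; the real column names of the table are not listed here.
example = {"month": "2024-05", "total_user_agents": 1200, "total_ips": 7}
masked = mask_row(example, count_columns={"total_user_agents", "total_ips"})
print(masked)
```

  Values at or above the threshold are converted with `str`, so they can be 
cast back to integers without loss. Only the masked values lose precision, 
which is the intent of the rule.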
  
  As the DAG has already been run for this month, I'm going to update the jobs 
now. Another thing to consider is whether in this update we can also backdate 
the table with the information from months before the DAG was functional.

TASK DETAIL
  https://phabricator.wikimedia.org/T342559

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Jdforrester-WMF, Mbch331


[Wikidata-bugs] [Maniphest] T342559: [Analytics] Monthly repeating tasks (next: July 2024)

2024-06-03 Thread AndrewTavis_WMDE
AndrewTavis_WMDE renamed this task from "[Analytics] Monthly repeating tasks 
(next: June 2024)" to "[Analytics] Monthly repeating tasks (next: July 2024)".
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T342559

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Jdforrester-WMF, Mbch331


[Wikidata-bugs] [Maniphest] T351072: Remove the WDCM clone (stats1007)

2024-05-29 Thread AndrewTavis_WMDE
AndrewTavis_WMDE closed this task as "Resolved".
AndrewTavis_WMDE claimed this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T351072

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Arian_Bozorg, karapayneWMDE, Aklapper, Lucas_Werkmeister_WMDE, 
AndrewTavis_WMDE, Michael, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
Djdungti, LawExplorer, _jensen, rosalieper, Scott_WUaS, Izno, Nastoshka, 
Wikidata-bugs, aude, Dinoguy1000, scfc, Mbch331, Jay8g


[Wikidata-bugs] [Maniphest] T351070: [EPIC] Clean up Wikidata Grafana cronjobs

2024-05-29 Thread AndrewTavis_WMDE
AndrewTavis_WMDE closed subtask T351072: Remove the WDCM clone (stats1007) as 
"Resolved".

TASK DETAIL
  https://phabricator.wikimedia.org/T351070

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Michael, Manuel, Aklapper, me, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, BeautifulBold, Suran38, karapayneWMDE, Invadibot, 
maantietaja, Peteosx1x, NavinRizwi, ItamarWMDE, Akuckartz, Dringsim, Nandana, 
Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Wikidata-bugs, aude, Dinoguy1000, Lydia_Pintscher, 
Mbch331


[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-29 Thread AndrewTavis_WMDE
AndrewTavis_WMDE closed subtask T351072: Remove the WDCM clone (stats1007) as 
"Resolved".

TASK DETAIL
  https://phabricator.wikimedia.org/T364965

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Lucas_Werkmeister_WMDE, BTullis, Manuel, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T351072: Remove the WDCM clone (stats1007)

2024-05-29 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Perfect, @Lucas_Werkmeister_WMDE! Glad to have this all cleared up :)

TASK DETAIL
  https://phabricator.wikimedia.org/T351072

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Arian_Bozorg, karapayneWMDE, Aklapper, Lucas_Werkmeister_WMDE, 
AndrewTavis_WMDE, Michael, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
Djdungti, LawExplorer, _jensen, rosalieper, Scott_WUaS, Izno, Nastoshka, 
Wikidata-bugs, aude, Dinoguy1000, scfc, Mbch331, Jay8g


[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-29 Thread AndrewTavis_WMDE
AndrewTavis_WMDE closed this task as "Resolved".
AndrewTavis_WMDE claimed this task.
AndrewTavis_WMDE added a comment.


  Sounds good to me! :) Thanks for the help here, @Lucas_Werkmeister_WMDE and 
@BTullis!

TASK DETAIL
  https://phabricator.wikimedia.org/T364965

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Lucas_Werkmeister_WMDE, BTullis, Manuel, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T321666: Wiktionary Cognate Dashboard is not accessible [timeboxed 0.5 days]

2024-05-29 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Hi @Bicolino34 👋 Thanks for reaching out :) We are still working on tasks 
related to this dashboard - at least bringing back some of the data processes.

TASK DETAIL
  https://phabricator.wikimedia.org/T321666

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Bicolino34, Lepticed7, XANA000, VIGNERON, AndrewTavis_WMDE, 
Lydia_Pintscher, WMDE-leszek, Pamputt, MarcoSwart, GoranSMilovanovic, Otourly, 
ItamarWMDE, Aklapper, Manuel, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, Akuckartz, Dringsim, Nandana, Lahi, 
Gq86, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Thibaut120094, Wikidata-bugs, aude, Darkdadaah, Mbch331, Ltrlg


[Wikidata-bugs] [Maniphest] T351072: Remove the WDCM clone (stats1007)

2024-05-29 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Moving this to verification given the work in T364965 
<https://phabricator.wikimedia.org/T364965>. Thanks for all of this, 
@Lucas_Werkmeister_WMDE! Maybe we can resolve this and leave T364965 
<https://phabricator.wikimedia.org/T364965> until `stat1007` is deprecated, or 
resolve both?

TASK DETAIL
  https://phabricator.wikimedia.org/T351072

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Arian_Bozorg, karapayneWMDE, Aklapper, Lucas_Werkmeister_WMDE, 
AndrewTavis_WMDE, Michael, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
Djdungti, LawExplorer, _jensen, rosalieper, Scott_WUaS, Izno, Nastoshka, 
Wikidata-bugs, aude, Dinoguy1000, scfc, Mbch331, Jay8g


[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-29 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  None of the files listed in your comment above 
<https://phabricator.wikimedia.org/T364965#9838579> look like things we should 
worry about, @Lucas_Werkmeister_WMDE. The same goes for there being a different 
commit for this, as to my knowledge `stat1005` was the main server for the 
related work.
  
  So sounds like our work for this is finalized? Do we want to resolve this or 
keep this open until `stat1005` is fully deprecated?

TASK DETAIL
  https://phabricator.wikimedia.org/T364965

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Lucas_Werkmeister_WMDE, BTullis, Manuel, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360296: [Analytics] Implement data process to identify missing Wiktionary entries

2024-05-28 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  I've been asking around about the data source and connecting the tables and 
have yet to get concrete answers. Based on general assumptions of the names of 
the tables/columns though, the path forward for getting missing entries for a 
Wiktionary will be to:
  
  - Start with `cognate_wiktionary.cognate_sites`
  - Join to `cognate_wiktionary.cognate_pages` (`cognate_sites.cgsi_key = 
cognate_pages.cgpa_site`)
  - Join to `cognate_wiktionary.cognate_titles` (`cognate_pages.cgpa_title = 
cognate_titles.cgti_raw_key` - note the use of `cgti_raw_key`)
  - Use `cognate_titles.cgti_normalized_key` as a means of checking which 
Wiktionary entries are shared/missing across projects
  
  Putting this here as documentation :)
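  
  A minimal sketch of the join path above, run against an in-memory SQLite 
database with toy rows instead of the real MariaDB tables. The join columns are 
the ones listed above. The `cgsi_dbname` column, the integer keys and the 
sample rows are assumptions for illustration only, and like the steps above the 
query rests on assumptions about what the columns mean:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Toy schema: only the join columns come from the notes above.
-- cgsi_dbname is an assumed column used to name each Wiktionary.
CREATE TABLE cognate_sites  (cgsi_key INTEGER, cgsi_dbname TEXT);
CREATE TABLE cognate_pages  (cgpa_site INTEGER, cgpa_title INTEGER);
CREATE TABLE cognate_titles (cgti_raw_key INTEGER, cgti_normalized_key INTEGER);

INSERT INTO cognate_sites  VALUES (1, 'enwiktionary'), (2, 'dewiktionary');
INSERT INTO cognate_titles VALUES (10, 100), (11, 100), (20, 200);
INSERT INTO cognate_pages  VALUES (1, 10), (2, 11), (1, 20);
""")

MISSING_ENTRIES_QUERY = """
WITH entries AS (
    SELECT s.cgsi_dbname AS wiki, t.cgti_normalized_key AS norm_key
    FROM cognate_sites AS s
    JOIN cognate_pages AS p ON s.cgsi_key = p.cgpa_site
    JOIN cognate_titles AS t ON p.cgpa_title = t.cgti_raw_key
)
SELECT DISTINCT norm_key
FROM entries
WHERE norm_key NOT IN (SELECT norm_key FROM entries WHERE wiki = ?)
ORDER BY norm_key
"""


def missing_entries(connection, wiki):
    """Normalized title keys that exist on some Wiktionary but not on `wiki`."""
    return [row[0] for row in connection.execute(MISSING_ENTRIES_QUERY, (wiki,))]


print(missing_entries(conn, "dewiktionary"))
print(missing_entries(conn, "enwiktionary"))
```

  In the toy data, normalized key 100 exists on both projects under different 
raw keys, which is why the comparison is done on `cgti_normalized_key` and not 
on the raw key.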

TASK DETAIL
  https://phabricator.wikimedia.org/T360296

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred, 
Lydia_Pintscher, MarcoSwart, Manuel, me, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, BeautifulBold, Suran38, karapayneWMDE, Invadibot, maantietaja, 
Peteosx1x, NavinRizwi, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Dinoguy1000, Mbch331


[Wikidata-bugs] [Maniphest] T356618: [EPIC] Check of legacy wmde analytics infrastructure

2024-05-28 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T356618

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, Invadibot, maantietaja, 
ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-05-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE changed the task status from "Open" to "Stalled".

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-05-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE changed the task status from "Open" to "Stalled".

TASK DETAIL
  https://phabricator.wikimedia.org/T361203

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360296: [Analytics] Implement data process to identify missing Wiktionary entries

2024-05-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360296

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred, 
Lydia_Pintscher, MarcoSwart, Manuel, me, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, BeautifulBold, Suran38, karapayneWMDE, Invadibot, maantietaja, 
Peteosx1x, NavinRizwi, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Dinoguy1000, Mbch331


[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Thanks for taking care of this, @Lucas_Werkmeister_WMDE! We'll be able to 
close both this and T351072 <https://phabricator.wikimedia.org/T351072> after 
Tuesday next week if/when the Puppet change is deployed :)

TASK DETAIL
  https://phabricator.wikimedia.org/T364965

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Lucas_Werkmeister_WMDE, BTullis, Manuel, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, Isabelladantes1983, Themindcoder, Adamm71, S8321414, 
Hellket777, LisafBia6531, Astuthiodit_1, 786, Biggs657, karapayneWMDE, 
Invadibot, maantietaja, Juan90264, Alter-paule, Beast1978, ItamarWMDE, Un1tY, 
Akuckartz, Dringsim, Hook696, Kent7301, CucyNoiD, Nandana, Gaboe420, 
Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, GoranSMilovanovic, 
QZanden, KimKelting, LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, 
Neuronton, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T365457: Bring in all Purdue Program PRs and upload Mismatch Finder mismatches

2024-05-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T365457

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, Manuel, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  @BTullis, checking in on this as your help in T358311 
<https://phabricator.wikimedia.org/T358311> reminded me of it, since it's all 
related to the same user. Would you be able to remove the 
`statistics/manifests/wmde/wdcm.pp` file and any related processes (now 
including stat1011) as well?

TASK DETAIL
  https://phabricator.wikimedia.org/T364965

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Lucas_Werkmeister_WMDE, BTullis, Manuel, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Thank you, @BTullis! Ya I wasn't happy with the solution either. Appreciate 
your willingness to help!

TASK DETAIL
  https://phabricator.wikimedia.org/T358311

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: BTullis, brouberol, JAllemandou, MoritzMuehlenhoff, Manuel, Aklapper, 
AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  I'm realizing also that I don't have admin rights and thus can't move files 
to your directory. I'll copy these files over to my directory, download them 
and send you a link to a zipped directory on Google Drive once we have the 
above figured out.

TASK DETAIL
  https://phabricator.wikimedia.org/T358311

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: brouberol, JAllemandou, MoritzMuehlenhoff, Manuel, Aklapper, 
AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, BTullis, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Hi @Manuel, checking further as it's still not clear what you'd like. The 
double except is confusing. I'll only transfer files from `stat1005`, and could 
you answer the following questions:
  
  1. Do you want **data files** (.csv, .tsv, etc) __before 2020__? (assumption 
no)
  2. Do you want **data files** __after 2020__? (as of now unclear)
  3. Do you want **non data files** (.py, .R, etc) __before 2020__? (as of now 
unclear)
  4. Do you want **non data files** __after 2020__? (assumption yes)

TASK DETAIL
  https://phabricator.wikimedia.org/T358311

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: brouberol, JAllemandou, MoritzMuehlenhoff, Manuel, Aklapper, 
AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, BTullis, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Hi @Manuel - sending along a summary of what I'll be getting for you:
  
== stat1004 ==
Jul 25  2020 Analytics
Jun 23  2020 Experiments
Jul 25  2020 wdUsagePerPage

== stat1005 ==
All non data files

== stat1007 ==
Aug 23  2020 Analytics
Jan 27  2020 Experiments
Aug 23  2020 RScripts

== stat1008 ==
Oct 11  2021 Analytics
Jun 23  2020 R

== HDFS ==
2021-11-02 17:37 /user/goransm/dewiki_revisions
2021-04-11 16:51 /user/goransm/wdtranslationsb
No other files, as everything after 2020 is a data file or ORES related 
(this is coming in the stat server files anyway)
  
  TSVs, CSVs and data file types will not be included in the transfer. Out of 
convenience, I'm going to transfer the files into your directory on the given 
server.
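  
  A minimal sketch of the filter implied above, assuming that data files can be 
recognized by their extension. The suffix set and the sample paths are 
hypothetical and would need to be extended to cover the other data file types:

```python
from pathlib import PurePosixPath

# Assumption: only these suffixes count as data files; extend as needed.
DATA_SUFFIXES = {".csv", ".tsv"}


def is_data_file(path):
    """True if the path's extension marks it as a data file (case-insensitive)."""
    return PurePosixPath(path).suffix.lower() in DATA_SUFFIXES


def files_to_transfer(paths):
    """Keep only the non data files, preserving the input order."""
    return [p for p in paths if not is_data_file(p)]


# Hypothetical file names for illustration only.
sample = ["Analytics/report.R", "Analytics/output.CSV", "RScripts/job.py", "data.tsv"]
print(files_to_transfer(sample))
```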

TASK DETAIL
  https://phabricator.wikimedia.org/T358311

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: brouberol, JAllemandou, MoritzMuehlenhoff, Manuel, Aklapper, 
AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, BTullis, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Ok then!
  
  So the check of the files above is complete, as shown by its status. General 
summaries of each stat machine and HDFS are provided under the subsections 
above. `stat1005` has some files that @Manuel may find interesting given that 
they're for prior tasks of his. Any queries that looked like they could be 
interesting, or that were in files with interesting-sounding names but turned 
out not to be, are printed above for documentation.
  
  Overall I can say that anything from the above would be easier to build from 
scratch via the docs and by checking with WMDE engineers or WMF Data 
Engineering/Analytics than to go through and re-implement. I 
personally would not keep anything, and will delete the files I copied over to 
my `stat1005` once this is closed :)
  
  Thanks again @JAllemandou for the file lists, and thanks @brouberol for the 
ping!

TASK DETAIL
  https://phabricator.wikimedia.org/T358311

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: brouberol, JAllemandou, MoritzMuehlenhoff, Manuel, Aklapper, 
AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, BTullis, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T351072: Remove the WDCM clone (stats1007)

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  So basically removing the wdcm.pp related file on GitHub and its Puppet 
workflows will close both tasks :)

TASK DETAIL
  https://phabricator.wikimedia.org/T351072

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Arian_Bozorg, karapayneWMDE, Aklapper, Lucas_Werkmeister_WMDE, 
AndrewTavis_WMDE, Michael, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
Djdungti, LawExplorer, _jensen, rosalieper, Scott_WUaS, Izno, Nastoshka, 
Wikidata-bugs, aude, Dinoguy1000, scfc, Mbch331, Jay8g


[Wikidata-bugs] [Maniphest] T351072: Remove the WDCM clone (stats1007)

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Ah, looking at this, I'm realizing I repeated myself, as the work that's left 
in T364965: stat1007 to stat1011 migration pipeline output check 
<https://phabricator.wikimedia.org/T364965> duplicates what we want to 
do here :)

TASK DETAIL
  https://phabricator.wikimedia.org/T351072

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Arian_Bozorg, karapayneWMDE, Aklapper, Lucas_Werkmeister_WMDE, 
AndrewTavis_WMDE, Michael, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
Djdungti, LawExplorer, _jensen, rosalieper, Scott_WUaS, Izno, Nastoshka, 
Wikidata-bugs, aude, Dinoguy1000, scfc, Mbch331, Jay8g


[Wikidata-bugs] [Maniphest] T351072: Remove the WDCM clone (stats1007)

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Hey @Arian_Bozorg 👋 Yes, we do still need to check this out. I was thinking 
that @Lucas_Werkmeister_WMDE and I could discuss this when we chat about what 
else is needed in T364965: stat1007 to stat1011 migration pipeline output check 
<https://phabricator.wikimedia.org/T364965>. In that one we've confirmed now 
that the data is coming in from stat1011, so at this point it'd be good to 
delete the statistics/manifests/wmde/wdcm.pp 
<https://github.com/wikimedia/operations-puppet/blob/production/modules/statistics/manifests/wmde/wdcm.pp>
 and also remove its workflow from Puppet (I'm just not quite sure if I have 
access or how to go about the Puppet work).
  
  I'm hopeful that another 25min call would be enough to get the work done for 
both tasks, and I can document it for my learning/our processes and report back. 
Let me know if sometime later in the week could work for this!

TASK DETAIL
  https://phabricator.wikimedia.org/T351072

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Arian_Bozorg, karapayneWMDE, Aklapper, Lucas_Werkmeister_WMDE, 
AndrewTavis_WMDE, Michael, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
Djdungti, LawExplorer, _jensen, rosalieper, Scott_WUaS, Izno, Nastoshka, 
Wikidata-bugs, aude, Dinoguy1000, scfc, Mbch331, Jay8g


[Wikidata-bugs] [Maniphest] T365457: Bring in all Purdue Porgram PRs and upload Mismatch Finder mismatches

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T365457

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, Manuel, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T365457: Bring in all Purdue Porgram PRs and upload Mismatch Finder mismatches

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Wikidata Analytics (Kanban), Wikidata.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  Making this task as a means of noting that there is still work to be done to 
close out the Purdue Data Mine program. Specifically, all pull requests in the 
repo <https://github.com/Wikidata/Purdue-Data-Mine-2024/pulls> need to be 
brought in, and the resulting mismatches should be uploaded to Mismatch Finder 
using upload_mismatches.py 
<https://github.com/Wikidata/Purdue-Data-Mine-2024/blob/main/upload_mismatches.py>.

TASK DETAIL
  https://phabricator.wikimedia.org/T365457

WORKBOARD
  https://phabricator.wikimedia.org/project/board/6546/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, Manuel, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T356618: [EPIC] Check of legacy wmde analytics infrastructure

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T356618

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, Invadibot, maantietaja, 
ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  ⚠️ Currently WIP ⚠️
  ===
  
  Going through the files sent by @JAllemandou above 
<https://phabricator.wikimedia.org/T358311#9648470>. This message will be saved 
as I go so that I don't lose my progress 😊 If I do find something worth 
documenting, then I'll also include it below so that this task can serve as a 
reference for later if need be.
  
  stat1004
  
  
  None of the files are worth keeping. See descriptions and reasoning below:
  
total 28

Analytics
└─ NewEditors 
└─ adHoc (nothing of interest)
└─ Compaigns
└─ 2019 and 2020 email campaigns with R based analysis (nothing of interest)
└─ WDCM
└─ WDCM_Output 
└─ Lots of directories of CSVs (nothing of interest)
└─ WDCM_Scripts
└─ R based scripts that would be archived on Gerrit if they were 
ever in production (nothing of interest)
└─ Wikidata
└─ misc
└─ Some ad hoc work (nothing of interest)
└─ WD_languagesLandscape
└─ R based scripts that would be archived on Gerrit if they were 
ever in production (nothing of interest)
└─ WD_ORES_ItemQuality (nothing of interest given Lift Wing migration)
└─ WD_UsageCoverage
└─ R and Python scripts that are doubtless versions of the WDCM 
UsageCoverage dashboard that's archived on Gerrit (nothing of interest)
Experiments
└─ Empty
_miscWMDE
└─ summerBannerCampaign2017_DataOUT
└─ TSV files (nothing of interest)
└─ TWLBanner_2017
└─ TSV files and simple HQL queries from `wmf.webrequest` for 
banner campaign hits (nothing of interest, easy to learn as needed)

Example query:

SELECT count(*)
FROM wmf.webrequest
WHERE uri_host = 'de.wikipedia.org'
  AND uri_query LIKE "$/wiki/Wikipedia:Umfragen/Technische_Wünsche_2017$"
  AND http_method = 'GET'
  AND is_pageview = TRUE
  AND YEAR = 2017
  AND MONTH = 6
  AND DAY = 1
  AND HOUR = 20;

└─ TWLBanner_2017_DataOUT
└─ TSV files (nothing of interest)
_miscWMDE_1004
└─ TWLBanner_2017
└─ One HQL and one TSV file that are similar to the above (nothing 
of interest)
R
└─ x86_64-pc-linux-gnu-library (nothing of interest)
Research
└─ DydimusZengenene
└─ Note: work to support a researcher (nothing of interest)
└─ _analytics
└─ _data
└─ DydimusZengenene.Rproj
└─ ParseTargetPage.R
wdUsagePerPage
└─ Related to the percentage usage dashboard, so would be archived on 
Gerrit if they were ever in production (nothing of interest)
  
  
  
  stat1005
  
  
total 964

Analytics
└─ 
BotEdits_perProject.ipynb
└─ 
crontabstat1005.txt
└─ 
DataModelTerms_20210228_Updates.ipynb
└─ 
dewiki_NewEds_2021.ipynb
└─ 
QCF_M2_Test.ipynb
└─ 
QuratorCuriousFacts_Separators.ipynb
└─ 
Qurator_M1.ipynb
└─ 
R
└─ 
snapshot_query.hql
└─ 
Untitled1.ipynb
└─ 
untitled1.txt
└─ 
Untitled2.ipynb
└─ 
Untitled3.ipynb
└─ 
Untitled4.ipynb
└─ 
Untitled5.ipynb
└─ 
Untitled.ipynb
└─ 
untitled.txt
└─ 
venv
└─ 
wd_cluster_fetch_items_M2.ipynb
└─ 
wd_cluster_fetch_items_M3.ipynb
└─ 
WDCM_ETL_OTHER_TEST.ipynb
└─ 
WDCM_Statements_Test.ipynb
└─ 
WD_HumanEditsPerClass_RevisionTags.ipynb
└─ 
WD_Inequality_Intake.ipynb
└─ 
WD_Languages_Datamodel_CollectInit.ipynb
└─ 
WD_Languages_Datamodel_EXP.ipynb
└─ 
WD_MonthlyEditors.ipynb
└─ 
WD_Sitelinks_WDAHP_202108.ipynb
└─ 
wd_statements_HiveQL_Query.hql
└─ 
WD_Translations.ipynb
└─ 
WHEIP_exps.ipynb
└─ 
wikidata_analytics_examples
└─ 
WikidataRevisions_November2020.csv
└─ 
  
  
  
  stat1006
  
  
total 48

misc_projects
└─ 
myTemp
└─ 
NewEds
└─ 
nohup.out
└─ 
R
└─ 
RPckg
└─ 
RScripts
└─ 
sqlIn
└─ 
sqlOut
└─ 
WDCM_Credentials
└─ 
WDCM_DataIN
└─ 
WDCM_DataOUT
└─ 
WDCM_sql
└─ 
  
  
  
  stat1007
  
  
total 28

Analytics
└─ 
crontabstat1007.txt
└─ 
Experiments
└─ 
Python3
└─ 
R
└─ 
RScripts
└─ 
venv
└─ 
  
  
  
  stat1008
  
  
total 16

Analytics
└─ 
R
└─ 
renv
└─ 
venv
└─ 
  
  
  
  stat1009
  
  
to

[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-05-17 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-05-17 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Note that MR#700 
<https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/700>
 has been opened with the work for this :)

TASK DETAIL
  https://phabricator.wikimedia.org/T361203

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-05-17 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Note that MR#700 
<https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/700>
 has been opened with the work for this :)

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm (timeboxed 0,5 days)

2024-05-16 Thread AndrewTavis_WMDE
AndrewTavis_WMDE claimed this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T358311

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: brouberol, JAllemandou, MoritzMuehlenhoff, Manuel, Aklapper, 
AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, BTullis, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Confirming that data's still coming in as well. @BTullis, what should we do 
about statistics/manifests/wmde/wdcm.pp 
<https://github.com/wikimedia/operations-puppet/blob/production/modules/statistics/manifests/wmde/wdcm.pp>?
 Remove the file? And could you also remove it from Puppet entirely on 
stat1011? Anything else?

TASK DETAIL
  https://phabricator.wikimedia.org/T364965

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Lucas_Werkmeister_WMDE, BTullis, Manuel, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Quick note that the word used by @BTullis was `disabled` instead of `removed` 
for the stat1007 timers, so apologies if this caused some confusion. I figure 
not, but just wanted to be clear :)
  
  @BTullis, would you be able to check the journal for them and paste the 
output here so we can check it? It seems like I can't access it on my end 
either.

TASK DETAIL
  https://phabricator.wikimedia.org/T364965

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Lucas_Werkmeister_WMDE, BTullis, Manuel, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T363583: Generate historical weekly segments of Wikidata item sitelink segmentations

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE changed the task status from "Open" to "Stalled".

TASK DETAIL
  https://phabricator.wikimedia.org/T363583

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T363583: Generate historical weekly segments of Wikidata item sitelink segmentations

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T363583

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T364965: stat1007 to stat1011 migration pipeline output check

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE renamed this task from "stat1007 migration output check" to 
"stat1007 to stat1011 migration pipeline output check".

TASK DETAIL
  https://phabricator.wikimedia.org/T364965

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Lucas_Werkmeister_WMDE, BTullis, Manuel, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T364965: stat1007 migration output check

2024-05-15 Thread AndrewTavis_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Wikidata Analytics (Kanban), Wikidata, 
Wikidata Dev Team.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  Context
  ---
  
  Recently WMF has been migrating from legacy stat servers that are being 
deprecated - specifically stat1004, 1005, 1006 and 1007. WMDE has a few 
pipelines that were running on stat1007 that have since been migrated over to 
stat1011:
  
  - statistics/manifests/wmde/graphite.pp 
<https://github.com/wikimedia/operations-puppet/blob/production/modules/statistics/manifests/wmde/graphite.pp>
  - statistics/manifests/wmde/wdcm.pp 
<https://github.com/wikimedia/operations-puppet/blob/production/modules/statistics/manifests/wmde/wdcm.pp>
  
  The latter at first glance doesn't appear to do anything as it sets the 
environment variables and clones, but then the rest is `TODO`. The former is 
more expansive and leads into our Graphite/Grafana workflows.
  
  Further directions
  --
  
  > You should be able to find the required files and the clone of 
https://gerrit.wikimedia.org/g/analytics/wmde/scripts 
<https://gerrit.wikimedia.org/g/analytics/wmde/scripts> beneath 
`stat1011:/srv/analytics-wmde`.
  
  The assumption is that they're working, and the timers for stat1007 have been 
removed.
  
  Goals
  -
  
  Check the pipeline in statistics/manifests/wmde/graphite.pp 
<https://github.com/wikimedia/operations-puppet/blob/production/modules/statistics/manifests/wmde/graphite.pp>
 to ensure that everything is working properly after the stat1007 -> stat1011 
migration.
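
  One way to sanity check that a pipeline is still producing output is a freshness test on its most recent datapoints. The sketch below is hedged: it assumes the datapoints are available as `[value, timestamp]` pairs (the shape Graphite's JSON render output uses), and the `latest_timestamp` and `is_fresh` helpers and the age threshold are hypothetical names and values, not part of the WMDE scripts.

```python
def latest_timestamp(datapoints):
    """Return the timestamp of the most recent non-null datapoint, or None."""
    stamps = [timestamp for value, timestamp in datapoints if value is not None]
    return max(stamps) if stamps else None


def is_fresh(datapoints, now, max_age_seconds):
    """True if a non-null datapoint exists within max_age_seconds of now."""
    latest = latest_timestamp(datapoints)
    return latest is not None and now - latest <= max_age_seconds


# Toy series with one gap; timestamps are plain integers in seconds.
sample = [[1.0, 100], [None, 160], [2.0, 220]]
fresh_soon_after = is_fresh(sample, now=300, max_age_seconds=100)
fresh_much_later = is_fresh(sample, now=400, max_age_seconds=100)
```

  Null values are skipped so that a series that only emits empty points after the migration is not mistaken for a working one.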

TASK DETAIL
  https://phabricator.wikimedia.org/T364965

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Lucas_Werkmeister_WMDE, BTullis, Manuel, Aklapper, AndrewTavis_WMDE, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T342559: [Analytics] Monthly repeating tasks (next: June 2024)

2024-05-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Sheet updated with the numbers for April. There's a higher number of user 
agents but a lower number of IPs (though IPs are still much higher than in Feb).

TASK DETAIL
  https://phabricator.wikimedia.org/T342559

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Jdforrester-WMF, Mbch331


[Wikidata-bugs] [Maniphest] T342559: [Analytics] Monthly repeating tasks (next: June 2024)

2024-05-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE renamed this task from "[Analytics] Monthly repeating tasks 
(next: May 2024)" to "[Analytics] Monthly repeating tasks (next: June 2024)".
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T342559

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Jdforrester-WMF, Mbch331


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-05-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE claimed this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T361203

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T358311: Check home/HDFS leftovers of goransm

2024-05-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Hey @brouberol 👋 Just getting back from two weeks off today :) I'll check 
into this and get back to you all! Thanks for the ping!

TASK DETAIL
  https://phabricator.wikimedia.org/T358311

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: brouberol, JAllemandou, MoritzMuehlenhoff, Manuel, Aklapper, 
AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, BTullis, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T363583: Generate historical weekly segments of Wikidata item sitelink segmentations

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE renamed this task from "Generate historical weekly segments of 
Wikidata item sitelinks segmentations" to "Generate historical weekly segments 
of Wikidata item sitelink segmentations".

TASK DETAIL
  https://phabricator.wikimedia.org/T363583

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T363583: Generate historical weekly segments of Wikidata item sitelinks segmentations

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE renamed this task from "Generate weekly historical segments of 
Wikidata item sitelinks segmentations" to "Generate historical weekly segments 
of Wikidata item sitelinks segmentations".

TASK DETAIL
  https://phabricator.wikimedia.org/T363583

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T363583: Generate weekly historical segments of Wikidata item sitelinks segmentations

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Wikidata, Wikidata Analytics (Kanban).
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  Purpose
  ---
  
  In T362849: [Analytics] Segments of Wikidata's data over time 
<https://phabricator.wikimedia.org/T362849> we need to calculate historical 
segments of Wikidata's items based on their relation to sitelinks.
  
  Purpose from that ticket:
  
  > As Wikidata Product Managers, we would like to understand how different 
segments of Wikidata's data developed over time, so we can inform our 
projections.
  
  This task would encompass the historical data that's needed to achieve this.
  
  Scope
  -
  
  From T362849 <https://phabricator.wikimedia.org/T362849>:
  
  > How did the number of Items of the following types develop over time?
  >
  >   A) Items that contain a sitelink to one of the Wikimedia projects (e.g. 
about a notable person)
  >   B) Items that are needed to build A (used in A Items for example in a 
statement or reference; e.g. the non-notable father of that notable person)
  >   C) All other Items
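
  The A/B/C split quoted above can be illustrated on toy data. This is a minimal sketch under stated assumptions, not the implementation in the task's notebook or DAG: the `segment_items` helper and the input shape (an item ID mapped to a sitelink count and the set of item IDs it uses) are hypothetical, and B is taken to mean items used directly by an A item, with no transitive closure.

```python
def segment_items(items):
    """Split items into segments A, B and C.

    `items` maps an item ID to a dict with `sitelinks` (a count) and
    `uses` (IDs referenced in its statements or references).
    """
    # A: items with at least one sitelink.
    segment_a = {qid for qid, item in items.items() if item["sitelinks"] > 0}

    # Collect every item that an A item uses directly.
    used_by_a = set()
    for qid in segment_a:
        used_by_a |= set(items[qid]["uses"])

    # B: known items without sitelinks that an A item uses.
    segment_b = (used_by_a & set(items)) - segment_a
    # C: everything else.
    segment_c = set(items) - segment_a - segment_b
    return segment_a, segment_b, segment_c


# Toy data: Q1 is a notable person, Q2 their non-notable father.
toy_items = {
    "Q1": {"sitelinks": 3, "uses": {"Q2"}},
    "Q2": {"sitelinks": 0, "uses": set()},
    "Q3": {"sitelinks": 0, "uses": set()},
    "Q4": {"sitelinks": 0, "uses": {"Q3"}},
}
seg_a, seg_b, seg_c = segment_items(toy_items)
```

  Here `Q3` stays in C because it is only used by `Q4`, which itself has no sitelink.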
  
  
  
  - In order to do this, T363451: Add job to create Wikidata partition to 
wmf.mediawiki_wikitext_history <https://phabricator.wikimedia.org/T363451> was 
made to recreate the Wikidata partition of wmf.mediawiki_wikitext_history 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Content/Mediawiki_wikitext_history>
  - Once this task is complete, work can then begin to use this partition to 
generate all data from when Wikidata was created to the most recent weekly data 
generated by the DAG created in T362849 
<https://phabricator.wikimedia.org/T362849>
  
  Desired Output
  --
  
  - Weekly stats of the number of Items in category A, B and C
  
  Acceptance criteria:
  
  [ ] Weekly historical breakdowns of populations A, B and C
- These would be in the Data Lake and the published datasets
  
  ---
  
  **Information below this point is filled out by the Wikidata Analytics team.**
  
  General Planning
  
  
  Information is filled out by the analytics product manager.
  
  Assignee Planning
  -
  
  Information is filled out by the assignee of this task.
  
  Estimation
  --
  
  Estimate:
  Actual:
  
  Sub Tasks
  -
  
  Full breakdown of the steps to complete this task:
  
  [ ] Step
  
  Data to be used
  ---
  
  See Analytics/Data_Lake 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake> for the breakdown of 
the data lake databases and tables.
  
  The following tables will be referenced in this task:
  
  - wmf.mediawiki_wikitext_history 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Content/Mediawiki_wikitext_history>
  
  Notes and Questions
  ---
  
  Things that came up during the completion of this task, questions to be 
answered and follow up tasks:
  
  - Note

TASK DETAIL
  https://phabricator.wikimedia.org/T363583

To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Dringsim, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  See T362849_wd_item_sitelink_segments.ipynb 
<https://gitlab.wikimedia.org/repos/wmde/analytics/-/blob/main/tasks/wikidata/2024/T362849_wd_item_sitelink_segments/T362849_wd_item_sitelink_segments.ipynb?ref_type=heads>
 for the work to derive the segments :)

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Ok, so the new numbers after the change in scope for the max `2024-04-15` 
snapshot are:
  
items_with_sitelinks: 32,231,861
items_items_with_sitelinks_link_to: 2,980,388
all_other_items: 72,910,679
  
  For documentation, the numbers for the original Population B definition for 
the min `2024-02-26` snapshot were:
  
items_with_sitelinks: 31,978,738
linked_to_items_with_sitelinks: 75,221,879
all_other_items: 242,565
  
  Status on the rest of this:
  
  - The weekly DAG is written and also includes an export to the 
published datasets repo
- I've also included the work for T361203 
<https://phabricator.wikimedia.org/T361203> in this
  - We need to confirm the numbers above and the method that generates them
  - I'll then rewrite the DAG job that runs the query
  - Then testing, I'll need the table `wmde.wd_item_sitelink_segments_weekly` 
to be made in HDFS by an admin, and then we can go into production
  - Should all be done by Tuesday/Wednesday evening after I'm back in a few 
weeks depending on folks' availability
  - I'll make a new task for the historic data generation process, which will 
depend on T363451 <https://phabricator.wikimedia.org/T363451>

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T361203: [Analytics] Add the published datasets directories as a target for the REST API Airflow jobs

2024-04-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Moved this to `In progress` as I'm adding the job to export everything to the 
published datasets folder to the DAG as I work on the same for T362849 
<https://phabricator.wikimedia.org/T362849>.

TASK DETAIL
  https://phabricator.wikimedia.org/T361203

To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-25 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  See T363451 <https://phabricator.wikimedia.org/T363451> for the task about 
bringing back the partition (hopefully via another job). I added a note there 
asking whether we might want to turn this job on only when WMDE needs 
historical data. Let me know what you all think on that :)

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Another note on this is: if we don't expect to be needing a Wikidata 
partition of `wmf.mediawiki_wikitext_history` for other tasks, then we could 
work directly from the XML dump for the data backdate. We wouldn't be able to 
leverage PySpark for the querying though, so I worry about how long all of this 
would take...

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a subscriber: JAllemandou.
AndrewTavis_WMDE added a comment.


  Thanks for all of the information, @mpopov!
  
  I talked this over in my bi-weekly with @JAllemandou, and would like to bring 
some further context to this particular situation :)
  
  The go-to table for this would be wmf.wikidata_entity 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Content/Wikidata_entity>
 for the following reasons:
  
  - It has the `sitelinks` column for Population A above
  - It has the `claims` column for Population B above
  
  It thus has everything we need for the given task for future data. One change 
to the output for this though would be the frequency of the DAG, as 
`wmf.wikidata_entity` is a weekly data dump, so it'd make sense to do a weekly 
DAG. If we still want to do a monthly job, then the best option would be to do 
a DAG that runs on the first Monday of every month (in the docs for 
`wmf.wikidata_entity` it mentions the `2020-01-20` snapshot, which was a 
Monday).
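  
  For the monthly option, the run date could be derived as in the sketch 
below. It uses only the Python standard library; the function name 
`first_monday` is made up for this example, and how the date would be wired 
into an actual DAG schedule is left open.

```python
from datetime import date, timedelta


def first_monday(year: int, month: int) -> date:
    """Return the date of the first Monday in the given month."""
    first_day = date(year, month, 1)
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    days_until_monday = (7 - first_day.weekday()) % 7
    return first_day + timedelta(days=days_until_monday)


# The snapshot mentioned in the docs, 2020-01-20, falls on a Monday.
print(date(2020, 1, 20).weekday() == 0)
print(first_monday(2024, 5))
```

  A month that starts on a Monday returns its first day, so the job would 
never skip a month.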
  
  Now we get to the question of the historical data... This is a situation that 
cannot be solved at this time given the current makeup of the Data Lake. As 
mentioned on Mattermost: we currently do not have Wikidata as a partition 
within wmf.mediawiki_wikitext_history 
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Content/Mediawiki_wikitext_history>,
 so we do not have historical versions of Wikidata items with which we'd be 
able to rebuild the history. The assumption we're making on this is that the 
legacy version of these metrics was made using `wmf.mediawiki_wikitext_history` 
at a time when Wikidata was still an available partition. Wikidata was 
removed from the `wmf.mediawiki_wikitext_history` dump process in `2024-02`; 
see T357859 <https://phabricator.wikimedia.org/T357859>, where ~12 of the 25 
days of dump generation were spent on the Wikidata XML dump. This was 
slowing down metrics delivery for WMF Movements Insights.
  
  Steps forward on this:
  
  - I'll begin work on a DAG based on `wmf.wikidata_entity`, as even if we do 
get a Wikidata partition within `wmf.mediawiki_wikitext_history`, it would not 
be used for recent data updates
- Are we fine with a weekly DAG?
  - A decision needs to be made on whether WMDE is requesting that Wikidata 
again be an output of the `wmf.mediawiki_wikitext_history` snapshot creation 
process
- The preferred solution here would be to not revert the changes made in 
T357859 <https://phabricator.wikimedia.org/T357859>, but rather make a new job 
that adds a new partition to the table via the Wikidata XML dump
- Reason for this is to ensure that WMF Movements Insights can maintain the 
current speed of delivery
- @JAllemandou has said that bringing the Wikidata partition back is fine 
if we need it (again, preferably in the above way)
  - If the request is being made, a new task should be made for it
  - We'd then do what I'd argue would be a separate task whereby the new 
`wmf.mediawiki_wikitext_history` Wikidata partition would be used to recompute 
the historical populations above
  
  Let me know what thoughts are on the above!

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper, 
Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Dringsim, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, KimKelting, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: mpopov, AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T362849: [Analytics] Segments of Wikidata's data over time

2024-04-23 Thread AndrewTavis_WMDE
AndrewTavis_WMDE claimed this task.
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T362849

To: AndrewTavis_WMDE
Cc: mpopov, AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE, 
S8321414, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, 
Akuckartz, Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Summary on your end sounds great, @Ifrahkhanyaree_WMDE! 😊 Let me know if 
sending along some empty new item revisions from 2024 would be helpful :)

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Notebook with the work that was done for this is: 
wmde/analytics/tasks/product_platform/2024/T360761_empty_wikidata_items/T360761_empty_wikidata_items.ipynb
 
<https://gitlab.wikimedia.org/repos/wmde/analytics/-/blob/main/tasks/product_platform/2024/T360761_empty_wikidata_items/T360761_empty_wikidata_items.ipynb>.
 Will update this if further work is needed :)

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-22 Thread AndrewTavis_WMDE
AndrewTavis_WMDE moved this task from Needs product input to Product 
verification on the Wikidata Analytics (Kanban) board.
AndrewTavis_WMDE added a comment.


  Further insights on this, and moving it to `Product verification` at this 
point :) I've now changed the query to use a range of byte sizes within which 
an item can be considered empty. I added 10 bytes to the calculated max to get 
`170`, but also tried `180` and `190`, and the trend of empty-on-first-revision 
items dropping off is maintained.
  
  Basic finding: it used to be way more common, but still does happen today
  
  New query is the following:
  
    SELECT DISTINCT
        event_user_text AS editor,
        substring(event_timestamp, 1, 7) AS event_year_month,
        page_title AS created_empty_qid

    FROM
        wmf.mediawiki_history

    WHERE
        wiki_db = 'wikidatawiki'
        AND page_namespace_is_content = True
        AND snapshot = '2024-03'
        AND event_entity = 'revision'
        AND event_type = 'create'
        AND page_revision_count = 1
        -- Factor in bytes that are within a range small enough to be an
        -- empty first edit.
        AND 148 < revision_text_bytes
        AND revision_text_bytes < 170
    ;
  
  Task 1.1 - Number of Items in population A that were created empty: 
`5,075,471`
  Task 1.2 - Number of editors who are creating empty items: `27,61`
  
  Of the above items, I did a test of `50,000` to see if they were empty on 
deletion using the `https://www.wikidata.org/wiki/Special:EntityData/` 
endpoint. `49,579` returned valid JSON responses, and of those `99.65%` were 
found to be empty.
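  
  A check along these lines can be sketched in Python as follows. This is not 
the code that was run: the helper names are hypothetical, the sample responses 
are made up, and the definition of "empty" used here (no labels, descriptions, 
aliases, claims or sitelinks) is an assumption. The sketch works on response 
bodies that have already been fetched and relies on the 
`{"entities": {<QID>: {...}}}` shape of the endpoint's JSON output.

```python
import json
from typing import Any, Dict, Optional

# Assumption: an item counts as "empty" when none of these fields has content.
CONTENT_KEYS = ("labels", "descriptions", "aliases", "claims", "sitelinks")


def parse_entity(raw_response: str, qid: str) -> Optional[Dict[str, Any]]:
    """Return the entity dict for qid, or None if the body is not usable JSON."""
    try:
        payload = json.loads(raw_response)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    entities = payload.get("entities")
    if not isinstance(entities, dict):
        return None
    entity = entities.get(qid)
    return entity if isinstance(entity, dict) else None


def is_empty_entity(entity: Dict[str, Any]) -> bool:
    """True when every content field is missing or empty ({} or [])."""
    return all(not entity.get(key) for key in CONTENT_KEYS)


# Made-up bodies standing in for Special:EntityData/<QID>.json responses.
responses = {
    "Q1": '{"entities": {"Q1": {"id": "Q1", "labels": {}, "claims": {}}}}',
    "Q2": '{"entities": {"Q2": {"id": "Q2", "labels": {"en": {"value": "x"}}}}}',
    "Q3": "<html>not json</html>",
}

# Keep only the responses that parsed, then compute the share that is empty.
valid: Dict[str, Dict[str, Any]] = {}
for qid, raw in responses.items():
    entity = parse_entity(raw, qid)
    if entity is not None:
        valid[qid] = entity

empty_share = sum(is_empty_entity(e) for e in valid.values()) / len(valid)
print(f"{len(valid)} valid responses, {empty_share:.2%} empty")
```

  Treating a missing key the same as an empty one keeps the check tolerant of 
both `{}` and `[]` serializations of empty fields.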
  
  I also checked empty item creation over time. The following two plots are 
based on the above definition of the population in the query (148-170 bytes 
being "empty"):
  
  F48099515: total_empty_qids_created_per_month_v3_definition.png 
<https://phabricator.wikimedia.org/F48099515>
  
  F48099542: 
total_empty_qids_created_per_month_in_2023_and_2024_v3_definition.png 
<https://phabricator.wikimedia.org/F48099542>
  
  Again, I also tried raising the max byte size to `180` and `190` and the 
plots above were not noticeably different.
  
  Task 2 - Number of Items in population B that are currently deleted: `44,385` 
(`0.87%`)
  
  I switched around the 3.x tasks a bit with a focus on visualization since, 
as I said, I basically wasn't seeing items that were created empty and were 
still empty.
  
  Task 3.1 - no further edits ever on items that are not deleted: `0` (they all 
have at least one more edit)
  
  Query for this:
  
    WITH not_deleted_created_empty_qids_v3 AS (
        SELECT DISTINCT
            page_title AS not_deleted_created_empty_qid

        FROM
            wmf.mediawiki_history

        WHERE
            wiki_db = 'wikidatawiki'
            AND page_namespace_is_content = True
            AND snapshot = '2024-03'
            AND event_entity = 'revision'
            AND event_type = 'create'
            AND page_revision_count = 1
            -- Factor in bytes that are within a range small enough to be an
            -- empty first edit.
            AND 148 < revision_text_bytes
            AND revision_text_bytes < 170
            AND page_is_deleted = False
    )

    SELECT
        h.page_title AS not_deleted_created_empty_qid,
        count(h.revision_id) AS number_of_revisions

    FROM
        wmf.mediawiki_history AS h

    JOIN
        not_deleted_created_empty_qids_v3 AS e

    ON
        h.page_title = e.not_deleted_created_empty_qid

    WHERE
        h.wiki_db = 'wikidatawiki'
        AND h.page_namespace_is_content = True
        AND h.snapshot = '2024-03'
        AND h.event_entity = 'revision'
        AND h.event_type = 'create'

    GROUP BY
        h.page_title
  
  Task 3.2 - at least one additional edit (=the rest): `5,031,086`
  
  - Check: `5,031,086 + 44,385 = 5,075,471`
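  
  The same check, together with the deleted share reported for Task 2, can be 
reproduced with a few lines of Python. The variable names are made up for this 
example; the figures are the ones reported in this comment.

```python
created_empty = 5_075_471       # Task 1.1: items created empty
deleted = 44_385                # Task 2: of those, currently deleted
with_further_edits = 5_031_086  # Task 3.2: at least one additional edit

# The deleted and further-edited items should add up to the full population.
totals_match = with_further_edits + deleted == created_empty

# Share of created-empty items that are currently deleted, in percent.
deleted_pct = round(100 * deleted / created_empty, 2)

print(totals_match, f"{deleted_pct}%")
```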
  
  New and hopefully a bit more helpful (my assumption) Task 3.3 - graphs of the 
number of edits the items have had
  
  F48100783: 
not_deleted_empty_on_creation_items_per_edit_amount_max_100_-_v3_definition.png 
<https://phabricator.wikimedia.org/F48100783>
  
  F48100788: number_of_revisions_on_empty_on_creation_items_v3_definition.png 
<https://phabricator.wikimedia.org/F48100788>
  
  Let me know if anything else would be helpful here, @Ifrahkhanyaree_WMDE!

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

WORKBOARD
  https://phabricator.wikimedia.org/project/board/6546/

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, G

[Wikidata-bugs] [Maniphest] T360761: [Analytics] Analysis of empty new Wikidata Items

2024-04-19 Thread AndrewTavis_WMDE
AndrewTavis_WMDE moved this task from In progress to Needs product input on the 
Wikidata Analytics (Kanban) board.
AndrewTavis_WMDE added a comment.


  The thread on Mattermost 
<https://mattermost.wikimedia.de/swe/pl/gsr9b485x7geby79t4sg151j7c> for 
discussing this has a lot of comments on the data restrictions we're dealing 
with here because there is no text table for Wikidata in the Data Lake. A 
workaround using `revision_text_bytes` to determine the minimum size that an 
item could be (i.e. = empty) has been used so far with acceptable results, but 
there are definite drawbacks and it's not exact.
  
  What I can say here is that:
  
  - There are lots of items being created empty (`3,540,260` in one subset)
  - They're not normally deleted (only `0.95%` of the same subset were)
  - They usually receive further edits (I've yet to see an item that was 
created empty and is still empty, but please note that this is an eye test on 
~30 items)
  
  Moving this to `Needs product input` for now. A basic thing that can be done 
that won't take too much time is that I can use a range instead of the case 
when for determining when an item is empty via the length of its QID and the 
`revision_text_bytes` size. We would then not be getting empty-on-creation 
items 100% of the time, but I could also find the ratio and we could agree on 
what an acceptable margin of error would be (say `> 90%`). Time estimate on 
this is half a day.

TASK DETAIL
  https://phabricator.wikimedia.org/T360761

WORKBOARD
  https://phabricator.wikimedia.org/project/board/6546/

To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Dringsim, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331

