[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public Superset dashboard pilot

2024-03-27 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  After a long discussion about this in the `data-engineering-collab` channel
on Slack, the general findings are:

  - The public Superset instance isn't suitable for this at this time and
    there's no timetable for it to be (see above comments)
  - A suggestion of putting this information on Wikistats was agreed to be
    too complex to set up and manage
    - We would need to use AQS 2 (Analytics Query Service) to make a
      service/API for this
  - An initial suggestion from WMDE to target Prometheus with the DAG was
    decided against
    - It is possible to push data to Prometheus, but there are many
      complications with this
  - A new suggestion is to leverage Turnilo for this
    - There is a private instance at turnilo.wikimedia.org
    - There are also public instances of this, as seen at
      wiki-search-referrals.wmcloud.org
      - Wikitech docs for this can be found at
        wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/referrer_daily/Dashboard
      - The Turnilo dashboard is hosted on Cloud VPS
      - The code for the Turnilo instance can be found at
        github.com/wikimedia/research-api-endpoint-template/turnilo-druid
    - The way this would be achieved is that the published datasets folder
      would be another target of the DAG jobs, and we'd then ingest this data
      via the Turnilo instance
  
  This sounds like a good way forward, but the question of setting up the 
Turnilo instance and maintaining it then comes to mind. A big question is: how 
often are data pipelines supposed to be public, and would putting it all on a 
single Turnilo instance work well for our requirements?
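
  As a rough illustration of the "published datasets folder as another DAG
target" idea above, here is a minimal sketch in plain Python. It only shows
the shape of an export step: the function name, the file name, and the column
names are hypothetical, and the real step would run as a task inside the
Airflow DAG and write to whatever path the published datasets folder actually
uses.

```python
import csv
from pathlib import Path


def export_metrics(rows, out_dir, filename="wd_rest_api_metrics.tsv"):
    """Write a list of dict rows as a TSV file in out_dir and return its path.

    `filename` and the row keys are placeholders for illustration only.
    """
    if not rows:
        raise ValueError("no rows to export")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / filename

    # Use the keys of the first row as the header, in insertion order.
    fieldnames = list(rows[0].keys())
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(
            f, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
    return out_path


if __name__ == "__main__":
    import tempfile

    # Stand-in for the published datasets folder.
    target = tempfile.mkdtemp()
    path = export_metrics(
        [{"date": "2024-03-01", "unique_ips": 10}],  # made-up sample row
        target,
    )
    print(path.read_text(encoding="utf-8"))
```

  A flat TSV file is used here only because it is easy to inspect; whichever
format the Turnilo/Druid ingestion expects would take its place.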

TASK DETAIL
  https://phabricator.wikimedia.org/T360298

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Aklapper, Manuel, Danny_Benjafield_WMDE, S8321414, 
Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, KimKelting, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public Superset dashboard pilot

2024-03-27 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Further checks on this: dashboards on the public Superset instance seem to be
built on a few preset databases that hold data from Wikimedia projects (see SQL
Lab). At the moment I doubt that we'd be able to have active rights over one of
these such that tables we generate in Airflow could be added to it and used for
visualizations. I've asked in the WMDE data channel whether anyone has domain
knowledge of Graphite and could help set up a process where it would be one of
the targets of the Airflow jobs. This seems simpler to me: we would use the
main Superset instance for data processes that rely on the data lake/private
data access, and Grafana for dashboards that are meant to be public-facing.



[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public Superset dashboard pilot

2024-03-26 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Note that, based on the most recent discussions with WMF data engineering,
there isn't a set workflow for getting information into a place where it can be
accessed via the public Superset instance. We would need to edit the DAG to
include an export step that moves the data to a location the public instance
can access. This would require some more research.
  
  Another thing to consider is whether we'd prefer to have Graphite be the end
export location for the data and then build a Grafana dashboard on top of it.
Grafana already hosts the current public-facing data dashboards for Wikidata,
so it might make sense to leverage it more.
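
  For context on what a Graphite export could involve, here is a minimal
sketch assuming Graphite's plaintext protocol, in which each data point is a
line of the form `<metric path> <value> <timestamp>` sent to the Carbon
listener (port 2003 by default). The metric path, host, and function names are
hypothetical, and the Graphite setup in question may require a different
route (for example via statsd), so treat this as an illustration only.

```python
import socket
from datetime import datetime, timezone


def format_graphite_line(metric_path, value, when):
    """Return one plaintext-protocol line: '<path> <value> <epoch seconds>\\n'.

    `when` must be a timezone-aware datetime.
    """
    if when.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return f"{metric_path} {value} {int(when.timestamp())}\n"


def send_to_graphite(lines, host, port=2003, timeout=5.0):
    """Send already formatted lines to a Carbon plaintext listener over TCP."""
    payload = "".join(lines).encode("utf-8")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)


if __name__ == "__main__":
    line = format_graphite_line(
        "wikidata.analytics.rest_api.unique_ips",  # hypothetical metric path
        10,                                        # made-up value
        datetime(2024, 3, 26, tzinfo=timezone.utc),
    )
    print(line, end="")
    # send_to_graphite([line], "graphite.example.org")  # hypothetical host
```

  In an Airflow setting, the formatting and sending would be the last task of
the DAG, run after the aggregation that produces the values.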



[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public Superset dashboard pilot

2024-03-18 Thread AndrewTavis_WMDE
AndrewTavis_WMDE claimed this task.
AndrewTavis_WMDE updated the task description.



[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public Superset dashboard pilot

2024-03-18 Thread Manuel
Manuel added a subtask: T341330: [Analytics] Airflow implementation of unique 
ips accessing Wikidata's REST API metrics.



[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public Superset dashboard pilot

2024-03-18 Thread Manuel
Manuel added a parent task: T342331: [EPIC] Set up a sustainable tech stack for 
Wikidata Analytics.



[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public Superset dashboard pilot

2024-03-18 Thread Manuel
Manuel edited projects, added Wikidata Analytics (Kanban); removed Wikidata 
Analytics.



[Wikidata-bugs] [Maniphest] T360298: [Analytics] Public Superset dashboard pilot

2024-03-18 Thread Manuel
Manuel renamed this task from "[Analytics] Please add a descriptive title for 
the task!" to "[Analytics] Public Superset dashboard pilot".
Manuel updated the task description.
