[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-10-31 Thread Gehel
Gehel added a comment. In T337021#9115572, @Nikki wrote: > I realise this ticket is already closed (I only just noticed it) but please bear in mind when making any decisions about how to split the data that introducing `mul` will

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-08-24 Thread Manuel
Manuel added a comment. Thank you, Nikki, and great to see your estimates! I am mainly responsible for our analytics tasks these days, so no worries, I already made the point in the evaluation. :) TASK DETAIL https://phabricator.wikimedia.org/T337021 EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-08-23 Thread Nikki
Nikki added a comment. I realise this ticket is already closed (I only just noticed it) but please bear in mind when making any decisions about how to split the data that introducing `mul` will hopefully result in a huge change to these numbers (at least for labels/aliases). I've been

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-26 Thread Manuel
Manuel added a comment. @tfmorris: > What does "planned" mean in this context? It is something that we decided to do but I do not know when we can prioritize this over other work. > My naive reading of the ticket gives the impression that it's been stalled without action for

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-21 Thread tfmorris
tfmorris added a comment. @Manuel when you write: > A new feature that would solve this problem is already planned, but it does not exist yet (see T303677). Thanks for the pointer! What does "planned" mean in this context? How do I find

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-21 Thread Manuel
Manuel added a comment. @tfmorris: > Is triple count the only important parameter? It seems likely that the descriptions could be larger, on average, than labels. This task is about Blazegraph, so triple counts are what matter for this specifically. But we are also concerned about

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. Updated the totals given the most recent dump to test my connection to it in relation to T342416. As expected, no major changes in terms of percentages :)

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. Thanks for writing, @tfmorris! :) > Is triple count the only important parameter? It seems likely that the descriptions could be larger, on average, than labels. Descriptions are something we could definitely look into in relation to this. This

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-20 Thread tfmorris
tfmorris added a comment. I have a theory as to where a big chunk of the machine generated descriptions are from. They are the phrase "Wikimedia category" in hundreds of languages as a textual transcription of the triple `instanceOf Q4167836`. For example, Catégorie:Naissance à Seri Menanti

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-20 Thread tfmorris
tfmorris added a comment. Is triple count the only important parameter? It seems likely that the descriptions could be larger, on average, than labels. It seems odd that there are more descriptions (19% of total) than labels (5%), although that agrees with what the previous study found

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-20 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. Great, @Manuel! Let me know what you want to do for the documentation of this. Happy to set up a repo for us on GitHub in the coming days if that would help :)

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-20 Thread Manuel
Manuel closed this task as "Resolved". Manuel added a comment. For documentation: - No high-level code review is required as we followed AKhatun's approach. - The notebook will still be documented. We are done here! \o/

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-20 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-19 Thread Manuel
Manuel added a parent task: T337799: [EPIC] Analytics support for WDQS product decisions.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-19 Thread Manuel
Manuel added a comment. Some thoughts about the notebook: **Double checking** Triples should always be distinct, correct? But the number 15 billion seems lower than I have read elsewhere. **Size calculations** The predicates look correct to me for this analysis.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-18 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. > we can change the structure later, correct? Yes, no stress on that whatsoever > tasks can sometimes be grouped by topic (e.g. content/contributors/etc or epics like T337799) We could also have

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-18 Thread Manuel
Manuel added a comment. Yes, let's think about the structure some more and then just try something out, we can change the structure later, correct? Some thoughts: - all of our work for Wikidata is Wikidata Analytics - tasks can sometimes be grouped by topic (e.g.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-18 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. > We might create a little report for this and the coming WDQS task. We could do this, yes. Do you mean something in Google Docs? (I mention a readme with the code below as another alternative) > In terms of code documentation GitHub would make

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-18 Thread Manuel
Manuel added a comment. Cool, thank you! > the next dump will be made on the 19th, so I'll go ahead and rerun the process then I don't think that there should be significant changes on that scale. So no need to rerun. > How do we want to document this? We might create a

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-18 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-18 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-18 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment. And would a table output be preferable for easier comparison?

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-18 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-18 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a subscriber: Manuel. AndrewTavis_WMDE added a comment. @Manuel, the task description has now been updated with the aggregate values for the dump from `2023-7-10`. As this is weekly, the next dump will be made on the 19th, so I'll go ahead and rerun the process then so

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-18 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-18 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-17 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-14 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-05 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-05 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-05 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-05 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-04 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-04 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-04 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-04 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-04 Thread AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-04 Thread Manuel
Manuel reassigned this task from Andrew-WMDE to AndrewTavis_WMDE. Manuel added a subscriber: Andrew-WMDE.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-04 Thread Manuel
Manuel assigned this task to Andrew-WMDE.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-04 Thread Manuel
Manuel moved this task from Needs PM work to Kanban on the Wikidata Analytics board. Manuel edited projects, added Wikidata Analytics (Kanban); removed Wikidata Analytics. WORKBOARD https://phabricator.wikimedia.org/project/board/5408/

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-07-04 Thread Manuel
Manuel updated the task description.

[Wikidata-bugs] [Maniphest] T337021: [Analytics] Find out size of term subgraph

2023-05-30 Thread Manuel
Manuel renamed this task from "Find out size of term subgraph" to "[Analytics] Find out size of term subgraph".