[Wikidata-bugs] [Maniphest] T362508: WDQS updater misbehaving in codfw

2024-04-24 Thread dcausse
dcausse moved this task from Ready for Dev -- SWE to Needs review on the Discovery-Search (Current work) board. dcausse claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T362508 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T362060: Generalize ScholarlyArticleSplitter

2024-04-23 Thread dcausse
dcausse claimed this task. dcausse moved this task from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T362060 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T362977: WDQS updater missed some updates

2024-04-19 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T362977 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: bking, dcausse, Aklapper, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, AWesterinen

[Wikidata-bugs] [Maniphest] T362977: WDQS updater missed some updates

2024-04-19 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Reported at https://www.wikidata.org/wiki/Wikidata:Report_a_technical_problem/WDQS_and_Search#Stale_values_in_SparQL_query_result - Q968274

[Wikidata-bugs] [Maniphest] T362508: WDQS updater misbehaving in codfw

2024-04-16 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T362508 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dr0ptp4kt, bking, dcausse, Aklapper, Danny_Benjafield_WMDE, Isabelladantes1983, Themindcoder

[Wikidata-bugs] [Maniphest] T362508: WDQS updater misbehaving in codfw

2024-04-16 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T362508 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dr0ptp4kt, bking, dcausse, Aklapper, Danny_Benjafield_WMDE, Isabelladantes1983, Themindcoder

[Wikidata-bugs] [Maniphest] T362508: WDQS updater misbehaving in codfw

2024-04-16 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T362508 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dr0ptp4kt, bking, dcausse, Aklapper, Danny_Benjafield_WMDE, Isabelladantes1983, Themindcoder

[Wikidata-bugs] [Maniphest] T362508: WDQS updater misbehaving in codfw

2024-04-15 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION The updater is misbehaving in codfw, apparently processing too many `reconciliations` which triggers a //slow// update mode and thus is not able

[Wikidata-bugs] [Maniphest] T361935: Adapt the WDQS Streaming Updater to update multiple WDQS subgraphs

2024-04-09 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T361935 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Daniel_Mietchen, dr0ptp4kt, pfischer, dcausse, Aklapper, Danny_Benjafield_WMDE, S8321414

[Wikidata-bugs] [Maniphest] T361935: Adapt the WDQS Streaming Updater to update multiple WDQS subgraphs

2024-04-09 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T361935 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Daniel_Mietchen, dr0ptp4kt, pfischer, dcausse, Aklapper, Danny_Benjafield_WMDE, S8321414

[Wikidata-bugs] [Maniphest] T362074: WDQS wikibase:around sometimes ignore exact matches

2024-04-08 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION (originally reported https://www.wikidata.org/wiki/Wikidata:Report_a_technical_problem/WDQS_and_Search#WDQS_wikibase:around_issue) It might

[Wikidata-bugs] [Maniphest] T337013: [Epic] Splitting the graph in WDQS

2024-04-08 Thread dcausse
dcausse added a subtask: T362060: Generalize ScholarlyArticleSplitter. TASK DETAIL https://phabricator.wikimedia.org/T337013 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Daniel_Mietchen, Kanashimi, SEgt-WMF, dr0ptp4kt, RKemper, bking

[Wikidata-bugs] [Maniphest] T362060: Generalize ScholarlyArticleSplitter

2024-04-08 Thread dcausse
dcausse added a parent task: T337013: [Epic] Splitting the graph in WDQS. TASK DETAIL https://phabricator.wikimedia.org/T362060 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, AWesterinen, Namenlos314, Gq86

[Wikidata-bugs] [Maniphest] T362060: Generalize ScholarlyArticleSplitter

2024-04-08 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION The spark job ScholarlyArticleSplitter should be generalized to support the general case with //n// subgraphs, a wider variety of rules and stubs

[Wikidata-bugs] [Maniphest] T349911: Explore the feasibility of using SPARQL federation for scholia queries

2024-04-05 Thread dcausse
dcausse moved this task from Blocked/Waiting to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. Two scholia queries were rewritten: - https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split/Federated_Queries_Examples

[Wikidata-bugs] [Maniphest] T361950: Ensure that WDQS query throttling does not interfere with federation

2024-04-05 Thread dcausse
dcausse added a parent task: T337013: [Epic] Splitting the graph in WDQS. TASK DETAIL https://phabricator.wikimedia.org/T361950 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, Danny_Benjafield_WMDE, S8321414

[Wikidata-bugs] [Maniphest] T337013: [Epic] Splitting the graph in WDQS

2024-04-05 Thread dcausse
dcausse added a subtask: T361950: Ensure that WDQS query throttling does not interfere with federation. TASK DETAIL https://phabricator.wikimedia.org/T337013 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Kanashimi, SEgt-WMF, dr0ptp4kt

[Wikidata-bugs] [Maniphest] T361950: Ensure that WDQS query throttling does not interfere with federation

2024-04-05 Thread dcausse
dcausse renamed this task from "Ensure that WDQS query throttling do not interfere with federation" to "Ensure that WDQS query throttling does not interfere with federation". TASK DETAIL https://phabricator.wikimedia.org/T361950 EMAIL PREFERENCES https://phabricator.wi

[Wikidata-bugs] [Maniphest] T361950: Ensure that WDQS query throttling do not interfere with federation

2024-04-05 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION When we exposed the 3 experimental endpoints to test the first version of the graph split we disabled query throttling to avoid impacting

[Wikidata-bugs] [Maniphest] T361935: Adapt the WDQS Streaming Updater to update multiple WDQS subgraphs

2024-04-05 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T361935 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dr0ptp4kt, pfischer, dcausse, Aklapper, AWesterinen, Namenlos314, Gq86, Lucas_Werkmeister_WMDE

[Wikidata-bugs] [Maniphest] T337013: [Epic] Splitting the graph in WDQS

2024-04-05 Thread dcausse
dcausse added a subtask: T361935: Adapt the WDQS Streaming Updater to update multiple WDQS subgraphs. TASK DETAIL https://phabricator.wikimedia.org/T337013 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Kanashimi, SEgt-WMF, dr0ptp4kt

[Wikidata-bugs] [Maniphest] T361935: Adapt the WDQS Streaming Updater to update multiple WDQS subgraphs

2024-04-05 Thread dcausse
dcausse added a parent task: T337013: [Epic] Splitting the graph in WDQS. TASK DETAIL https://phabricator.wikimedia.org/T361935 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dr0ptp4kt, pfischer, dcausse, Aklapper, AWesterinen

[Wikidata-bugs] [Maniphest] T361935: Adapt the WDQS Streaming Updater to update multiple WDQS subgraphs

2024-04-05 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION In order to support updating the subgraphs defined in Wikidata:SPARQL_query_service/WDQS_graph_split <https://www.wikidata.org/w

[Wikidata-bugs] [Maniphest] T361114: Alert Search Platform and/or DPE SRE when Wikidata is lagged

2024-04-04 Thread dcausse
dcausse added a comment. Thanks! I'm not very familiar with alerts being set from grafana either. I'll try to get more info on this; worst case, we can always set up a new one directly in alertmanager just for the wdqs lag and send it to the search team using the same formula used

[Wikidata-bugs] [Maniphest] T361114: Alert Search Platform and/or DPE SRE when Wikidata is lagged

2024-04-04 Thread dcausse
dcausse removed dcausse as the assignee of this task. dcausse added a comment. @Lucas_Werkmeister_WMDE thanks! Do you know where we could update this to include our alert email for such alerts? TASK DETAIL https://phabricator.wikimedia.org/T361114 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T353683: Unable to find a file by filename while adding a Commons media file statement

2024-04-03 Thread dcausse
dcausse moved this task from Needs review to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. Should be working properly now TASK DETAIL https://phabricator.wikimedia.org/T353683 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL

[Wikidata-bugs] [Maniphest] T361106: Restore wdqs1013 with a data transfer

2024-03-29 Thread dcausse
dcausse closed this task as "Declined". dcausse added a comment. won't be required after all TASK DETAIL https://phabricator.wikimedia.org/T361106 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: bking, dcausse Cc: dcausse, Aklap

[Wikidata-bugs] [Maniphest] T360993: WDQS lag propagation to wikidata not working as intended

2024-03-29 Thread dcausse
dcausse closed subtask T361106: Restore wdqs1013 with a data transfer as Declined. TASK DETAIL https://phabricator.wikimedia.org/T360993 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: bking, Aklapper, dcausse, Danny_Benjafield_WMDE

[Wikidata-bugs] [Maniphest] T361246: scap deploy should not repool a wdqs node that is depooled

2024-03-28 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T361246 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, AWesterinen, karapayneWMDE

[Wikidata-bugs] [Maniphest] T361106: Restore wdqs1013 with a data transfer

2024-03-28 Thread dcausse
dcausse moved this task from Backlog to Blocked / Waiting on the Data-Platform-SRE (2024.03.25 - 2024.04.14) board. dcausse added a comment. I restarted the updater on wdqs1013 and it's catching up. I have a note to check the status tomorrow and will repool it if necessary. TASK DETAIL

[Wikidata-bugs] [Maniphest] T361246: scap deploy should not repool a wdqs node that is depooled

2024-03-28 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T361246 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1, AWesterinen, karapayneWMDE

[Wikidata-bugs] [Maniphest] T361246: scap deploy should not repool a wdqs node that is depooled

2024-03-28 Thread dcausse
dcausse added a project: Wikidata-Query-Service. TASK DETAIL https://phabricator.wikimedia.org/T361246 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, AWesterinen, Namenlos314, Gq86, Lucas_Werkmeister_WMDE, EBjune

[Wikidata-bugs] [Maniphest] T360993: WDQS lag propagation to wikidata not working as intended

2024-03-28 Thread dcausse
dcausse added a comment. I could re-enable puppet on wdqs1013 and restart the updater to catch up on updates. But apparently this machine was repooled yesterday (as part of the wdqs scap deploy, I suppose) and thus started to serve stale data without triggering any maxlag. It's when re

[Wikidata-bugs] [Maniphest] T360993: WDQS lag propagation to wikidata not working as intended

2024-03-28 Thread dcausse
dcausse added a comment. After depooling the node, we can see the query rate actually going down to 0. The request rate is generally very low on codfw, so we might have to tune the threshold to around 0.2. F43663858: image.png <https://phabricator.wikimedia.org/F43663858> TASK DETAIL
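
The thresholding idea in this comment can be sketched roughly as follows. This is only an illustration, not the actual WDQS or Wikidata implementation: the host names are hypothetical, the unit of the rate is assumed to be queries per second (the comment does not state it), and taking the maximum lag over pooled hosts is an assumption about how the values would be aggregated. Only the 0.2 value comes from the comment.

```python
from typing import Mapping

# Value floated in the comment above; the unit (assumed: queries/second)
# is not stated there, so treat this as a tunable placeholder.
POOLED_RATE_THRESHOLD = 0.2


def is_pooled(query_rate: float, threshold: float = POOLED_RATE_THRESHOLD) -> bool:
    """Treat a host as pooled when its user query rate exceeds the threshold."""
    return query_rate > threshold


def lag_to_propagate(lags: Mapping[str, float], query_rates: Mapping[str, float]) -> float:
    """Return the highest lag among pooled hosts, or 0.0 if no host is pooled.

    Hosts without a known query rate are treated as depooled (assumption).
    """
    pooled_lags = [lag for host, lag in lags.items()
                   if is_pooled(query_rates.get(host, 0.0))]
    return max(pooled_lags, default=0.0)


if __name__ == "__main__":
    # Hypothetical hosts: "host-b" is depooled, so its large lag is ignored.
    lags = {"host-a": 12.0, "host-b": 86400.0}
    rates = {"host-a": 3.1, "host-b": 0.0}
    print(lag_to_propagate(lags, rates))
```

With a threshold this low, a depooled host that still receives a trickle of monitoring traffic could be misclassified, which is presumably why the thread also discusses separating monitoring queries from user queries.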

[Wikidata-bugs] [Maniphest] T336352: Update maxlag calculation maintenance script to reflect new prometheus queries

2024-03-26 Thread dcausse
dcausse removed a project: Patch-For-Review. TASK DETAIL https://phabricator.wikimedia.org/T336352 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: hoo, dcausse Cc: Lucas_Werkmeister_WMDE, Aklapper, ItamarWMDE, dcausse, Danny_Benjafield_WMDE

[Wikidata-bugs] [Maniphest] T360993: WDQS lag propagation to wikidata not working as intended

2024-03-26 Thread dcausse
dcausse added a comment. The approach taken is: - from nginx control a new header named 'x-monitoring-query' set to true if a list of criteria is met (currently using user-agent strings but could be extended to using source IPs as well I suppose) - from blazegraph, do not log query
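
The first step of this approach (tagging a request as a monitoring query based on its user agent) could look roughly like the sketch below. The real logic lives in the nginx configuration and is not shown in this thread; the user-agent patterns and the function name here are hypothetical, and only the header name `x-monitoring-query` and the value "true" come from the comment.

```python
import re
from typing import Dict, Iterable, Pattern

# Hypothetical patterns for illustration only; the actual criteria list
# is not given in the comment.
MONITORING_UA_PATTERNS = (
    re.compile(r"^check_http"),
    re.compile(r"prometheus", re.IGNORECASE),
)


def monitoring_headers(
    user_agent: str,
    patterns: Iterable[Pattern[str]] = MONITORING_UA_PATTERNS,
) -> Dict[str, str]:
    """Return the extra header to attach when the user agent looks like monitoring."""
    if any(p.search(user_agent) for p in patterns):
        return {"x-monitoring-query": "true"}
    return {}


if __name__ == "__main__":
    print(monitoring_headers("check_http/v2.3 (monitoring-plugins)"))
    print(monitoring_headers("Mozilla/5.0"))
```

As the comment notes, the criteria could be extended beyond user-agent strings, for example to source IPs; that would mean passing more request attributes into the check.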

[Wikidata-bugs] [Maniphest] T360993: WDQS lag propagation to wikidata not working as intended

2024-03-26 Thread dcausse
dcausse moved this task from Incoming to Needs review on the Discovery-Search (Current work) board. dcausse claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T360993 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T360993: WDQS lag propagation to wikidata not working as intended

2024-03-26 Thread dcausse
dcausse added a comment. Here are the UAs seen in one hour on a depooled server: +--+-+ |UA|count

[Wikidata-bugs] [Maniphest] T360993: WDQS lag propagation to wikidata not working as intended

2024-03-26 Thread dcausse
dcausse triaged this task as "High" priority. dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T360993 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, A

[Wikidata-bugs] [Maniphest] T360993: WDQS lag propagation to wikidata not working as intended

2024-03-26 Thread dcausse
dcausse added a comment. Mitigation: - blazegraph stopped - updater stopped with the `/srv/wdqs/data_loaded` flag removed - puppet disabled TASK DETAIL https://phabricator.wikimedia.org/T360993 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences

[Wikidata-bugs] [Maniphest] T360993: WDQS lag propagation to wikidata not working as intended

2024-03-26 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Propagating the lag of a wdqs host should only be done if this host is ''pooled'' (actually serving user traffic). Determining the ''pooling

[Wikidata-bugs] [Maniphest] T357966: Document limitations of blazegraph federation

2024-03-21 Thread dcausse
dcausse moved this task from In Progress to Needs review on the Discovery-Search (Current work) board. dcausse added a comment. draft page: https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split/Federation_Limits TASK DETAIL https://phabricator.wikimedia.org/T357966

[Wikidata-bugs] [Maniphest] T357966: Document limitations of blazegraph federation

2024-03-05 Thread dcausse
dcausse claimed this task. dcausse moved this task from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T357966 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T353683: Unable to find a file by filename while adding a Commons media file statement

2024-03-05 Thread dcausse
dcausse moved this task from In Progress to Needs review on the Discovery-Search (Current work) board. dcausse added a comment. changed the layout of the query a bit by moving the logistic function introduced in T271799 <https://phabricator.wikimedia.org/T271799> to the top-le

[Wikidata-bugs] [Maniphest] T357980: Compile a set of queries rewritten with federation across the two graph splits

2024-03-04 Thread dcausse
dcausse claimed this task. dcausse moved this task from In Progress to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. Compiled 10 real world examples at https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split

[Wikidata-bugs] [Maniphest] T355040: Compare the results of sparql queries between the fullgraph and the subgraphs

2024-03-04 Thread dcausse
dcausse added a comment. final report available at https://wikitech.wikimedia.org/wiki/Wikidata_Query_Service/WDQS_Graph_Split_Impact_Analysis TASK DETAIL https://phabricator.wikimedia.org/T355040 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences

[Wikidata-bugs] [Maniphest] T356773: [tracking] Community feedback for the WDQS Split the Graph project

2024-03-04 Thread dcausse
dcausse added a comment. @Physikerwelt thanks for your feedback. Blazegraph is definitely not the best solution and the work to move off of blazegraph should be tracked under https://phabricator.wikimedia.org/T330525 (see the initial exploration <https://www.wikidata.org/w

[Wikidata-bugs] [Maniphest] T356773: [tracking] Community feedback for the WDQS Split the Graph project

2024-03-04 Thread dcausse
dcausse added a comment. In T356773#9531179 <https://phabricator.wikimedia.org/T356773#9531179>, @EgonWillighagen wrote: > I tried to get the federation working, but got time outs too. The problem is that the current setup makes splits at a statement level. That is, given s

[Wikidata-bugs] [Maniphest] T353683: Unable to find a file by filename while adding a Commons media file statement

2024-03-04 Thread dcausse
dcausse moved this task from To Be Deployed to In Progress on the Discovery-Search (Current work) board. dcausse added a comment. The new builder moved the result to #4 which is better but still not enough and it's beaten by 3 other images because other criteria

[Wikidata-bugs] [Maniphest] T357980: Compile a set of queries rewritten with federation across the two graph splits

2024-02-26 Thread dcausse
dcausse added a comment. WIP at https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split/Federated_Queries_Examples TASK DETAIL https://phabricator.wikimedia.org/T357980 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse

[Wikidata-bugs] [Maniphest] T337013: [Epic] Splitting the graph in WDQS

2024-02-21 Thread dcausse
dcausse added a subtask: T357980: Compile a set of queries rewritten with federation across the two graph splits. TASK DETAIL https://phabricator.wikimedia.org/T337013 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: SEgt-WMF, dr0ptp4kt

[Wikidata-bugs] [Maniphest] T357980: Compile a set of queries rewritten with federation across the two graph splits

2024-02-21 Thread dcausse
dcausse added a parent task: T337013: [Epic] Splitting the graph in WDQS. TASK DETAIL https://phabricator.wikimedia.org/T357980 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, AWesterinen, Namenlos314, Gq86

[Wikidata-bugs] [Maniphest] T357980: Compile a set of queries rewritten with federation across the two graph splits

2024-02-21 Thread dcausse
dcausse renamed this task from "Compile a set of queries rewritten with federation accross the two graph splits" to "Compile a set of queries rewritten with federation across the two graph splits". TASK DETAIL https://phabricator.wikimedia.org/T357980 EMAIL

[Wikidata-bugs] [Maniphest] T357980: Compile a set of queries rewritten with federation accross the two graph splits

2024-02-21 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Having a set of examples might be helpful for users experimenting with the graph split. A subpage under https://www.wikidata.org/wiki

[Wikidata-bugs] [Maniphest] T337013: [Epic] Splitting the graph in WDQS

2024-02-21 Thread dcausse
dcausse added a subtask: T357966: Document limitations of blazegraph federation. TASK DETAIL https://phabricator.wikimedia.org/T337013 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: SEgt-WMF, dr0ptp4kt, RKemper, bking, tfmorris, elal

[Wikidata-bugs] [Maniphest] T357966: Document limitations of blazegraph federation

2024-02-21 Thread dcausse
dcausse added a parent task: T337013: [Epic] Splitting the graph in WDQS. TASK DETAIL https://phabricator.wikimedia.org/T357966 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, AWesterinen, Namenlos314, Gq86

[Wikidata-bugs] [Maniphest] T357966: Document limitations of blazegraph federation

2024-02-21 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Writing a query that federates multiple SPARQL endpoints can be challenging if the intermediate results that have to be shared are big. Better

[Wikidata-bugs] [Maniphest] T355040: Compare the results of sparql queries between the fullgraph and the subgraphs

2024-02-08 Thread dcausse
dcausse moved this task from In Progress to Needs review on the Discovery-Search (Current work) board. dcausse added a comment. Draft report up at https://wikitech.wikimedia.org/wiki/User:DCausse/WDQS_Graph_Split_Impact_Analysis TASK DETAIL https://phabricator.wikimedia.org/T355040

[Wikidata-bugs] [Maniphest] T353453: [Analytics] Impact of Scholia on WDQS

2024-02-08 Thread dcausse
dcausse added a comment. In T353453#9524925 <https://phabricator.wikimedia.org/T353453#9524925>, @AndrewTavis_WMDE wrote: > Quick note on this: > > There are two ways that need to be factored in to deriving if a query is from Scholia. Some queries do start with

[Wikidata-bugs] [Maniphest] T355040: Compare the results of sparql queries between the fullgraph and the subgraphs

2024-02-02 Thread dcausse
dcausse added a comment. WIP: - included the new 100k queries sample named `QUERY-Q4` from T349512 <https://phabricator.wikimedia.org/T349512> (random sample that is representative of the query length and runtime) - the % of affected queries (deduplicated) per tool is (//

[Wikidata-bugs] [Maniphest] T355037: Compare the performance of sparql queries between the full graph and the subgraphs

2024-02-02 Thread dcausse
dcausse added a comment. @dr0ptp4kt thanks! is the difference in the number of successful queries only explained by the improvement in query time or are there some improvements in the number of queries that time out as well? TASK DETAIL https://phabricator.wikimedia.org/T355037 EMAIL

[Wikidata-bugs] [Maniphest] T355888: Enable cross federation between experimental WDQS endpoints

2024-01-31 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T355888 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: RKemper, dcausse, Aklapper, Danny_Benjafield_WMDE, Isabelladantes1983, Themindcoder, Adamm71

[Wikidata-bugs] [Maniphest] T356243: process_sparql_query_hourly sometimes fails on the jena sparql parser

2024-01-31 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Failure seen while `org.wikidata.query.rdf.spark.transform.queries.sparql.QueryExtractor` was processing the dataset

[Wikidata-bugs] [Maniphest] T356161: WikibaseMediaInfo seems to reuse statement identifiers from other entities

2024-01-30 Thread dcausse
dcausse added a comment. Scanning dumps from 2024/01/21 we can find 1623 duplicated statement ids (full list here: https://people.wikimedia.org/~dcausse/T356161_sdc_duplicated_statement_ids.csv) TASK DETAIL https://phabricator.wikimedia.org/T356161 EMAIL PREFERENCES https
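
A scan of this kind can be sketched as below, assuming the dump has already been reduced to (entity id, statement id) pairs. That input shape, the function name and the sample identifiers are all hypothetical; the actual job used to scan the dumps is not described in this thread.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def duplicated_statement_ids(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each statement id used by more than one entity to those entities."""
    owners: Dict[str, Set[str]] = defaultdict(set)
    for entity_id, statement_id in pairs:
        owners[statement_id].add(entity_id)
    # Keep only statement ids that appear on at least two distinct entities.
    return {sid: ents for sid, ents in owners.items() if len(ents) > 1}


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    sample = [
        ("M1", "M1$aaaa"),
        ("M2", "M1$aaaa"),  # reuses a statement id from another entity
        ("M2", "M2$bbbb"),
    ]
    for statement_id, entities in duplicated_statement_ids(sample).items():
        print(statement_id, sorted(entities))
```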

[Wikidata-bugs] [Maniphest] T356161: WikibaseMediaInfo seems to reuse statement identifiers from other entities

2024-01-30 Thread dcausse
dcausse renamed this task from "WikibaseMediaInfo (or Wikibase?) seems to reuse statement identifiers from other entities" to "WikibaseMediaInfo seems to reuse statement identifiers from other entities". dcausse updated the task description. TASK DETAIL https://phabr

[Wikidata-bugs] [Maniphest] T356161: WikibaseMediaInfo (or Wikibase?) seems to reuse statement identifiers from other entities

2024-01-30 Thread dcausse
dcausse added a comment. @Lucas_Werkmeister_WMDE thanks for all the context! I get that it only affects WikibaseMediaInfo. Can we exclude Wikibase as a culprit possibly affecting wikidata or should we run a quick investigation to find possible duplicated statement identifiers

[Wikidata-bugs] [Maniphest] T356161: WikibaseMediaInfo (or Wikibase?) seems to reuse statement identifiers from other entities

2024-01-30 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T356161 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Lucas_Werkmeister_WMDE, dcausse, Aklapper, Danny_Benjafield_WMDE, Astuthiodit_1, AWesterinen

[Wikidata-bugs] [Maniphest] T356161: WikibaseMediaInfo (or Wikibase?) seems to reuse statement identifiers from other entities

2024-01-30 Thread dcausse
dcausse created this task. dcausse added projects: WikibaseMediaInfo, Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Structured-Data-Backlog. TASK DESCRIPTION Seen on M130887689 <https://commons.wikimedia.org/w

[Wikidata-bugs] [Maniphest] T355040: Compare the results of sparql queries between the fullgraph and the subgraphs

2024-01-26 Thread dcausse
dcausse added a comment. WIP: https://people.wikimedia.org/~dcausse/T355040_EARLY_DRAFT_wdqs_query_results_analysis.html (UA redacted for now) TL;DR: - added support for identifying true positives (queries with a scientific article in the sparql query or in the results

[Wikidata-bugs] [Maniphest] T351650: Expose 3 new dedicated WDQS endpoints

2024-01-25 Thread dcausse
dcausse added a subtask: T355888: Enable cross federation between experimental WDQS endpoints. TASK DETAIL https://phabricator.wikimedia.org/T351650 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: RKemper, dcausse Cc: Gehel, bking, dcausse

[Wikidata-bugs] [Maniphest] T355888: Enable cross federation between experimental WDQS endpoints

2024-01-25 Thread dcausse
dcausse added a parent task: T351650: Expose 3 new dedicated WDQS endpoints. TASK DETAIL https://phabricator.wikimedia.org/T355888 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: RKemper, dcausse, Aklapper, AWesterinen, BTullis

[Wikidata-bugs] [Maniphest] T355888: Enable cross federation between experimental WDQS endpoints

2024-01-25 Thread dcausse
dcausse created this task. dcausse added projects: Data-Platform-SRE, Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION Experimental endpoints `query-main-experimental` and `query-scholarly-experimental` must allow cross federation. A simple way

[Wikidata-bugs] [Maniphest] T355040: Compare the results of sparql queries between the fullgraph and the subgraphs

2024-01-19 Thread dcausse
dcausse added a comment. Quick report on the progress being made: - Our query logs do not only contain sparql queries and the sparql client used to collect the data has to be adapted to support these (ASK, CONSTRUCT, DESCRIBE) (https://gerrit.wikimedia.org/r/c/wikidata/query/rdf

[Wikidata-bugs] [Maniphest] T353683: Unable to find a file by filename while adding a Commons media file statement

2024-01-18 Thread dcausse
dcausse claimed this task. dcausse moved this task from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T353683 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T355040: Compare the results of sparql queries between the fullgraph and the subgraphs

2024-01-15 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata, Wikidata-Query-Service. TASK DESCRIPTION By using a tool to compare the differences of two results of the same sparql query we should evaluate how many queries might "break" when running against the wikidata main gra
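
One simple way to compare two results of the same query is sketched below, assuming each result is a list of rows mapping variable names to string values. The row representation and the choice to ignore row order are assumptions made for illustration; this is not the comparison tool referred to in the task, and a query with ORDER BY would need an order-sensitive check instead.

```python
from collections import Counter
from typing import Dict, List

Row = Dict[str, str]


def as_multiset(rows: List[Row]) -> Counter:
    """Represent a result set as a multiset of rows, ignoring row order."""
    return Counter(frozenset(row.items()) for row in rows)


def results_differ(full_graph_rows: List[Row], subgraph_rows: List[Row]) -> bool:
    """Return True when the two result sets differ as multisets of rows."""
    return as_multiset(full_graph_rows) != as_multiset(subgraph_rows)


if __name__ == "__main__":
    # Made-up bindings, for illustration only.
    full = [{"item": "Q1", "label": "a"}, {"item": "Q2", "label": "b"}]
    split = [{"item": "Q2", "label": "b"}]
    print(results_differ(full, split))
```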

[Wikidata-bugs] [Maniphest] T352538: [EPIC] Evaluate the impact of the graph split

2024-01-15 Thread dcausse
dcausse added a subtask: T355037: Compare the performance of sparql queries between the full graph and the subgraphs. TASK DETAIL https://phabricator.wikimedia.org/T352538 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, Gehel

[Wikidata-bugs] [Maniphest] T355037: Compare the performance of sparql queries between the full graph and the subgraphs

2024-01-15 Thread dcausse
dcausse added a parent task: T352538: [EPIC] Evaluate the impact of the graph split. TASK DETAIL https://phabricator.wikimedia.org/T355037 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, AWesterinen, Namenlos314, Gq86

[Wikidata-bugs] [Maniphest] T355037: Compare the performance of sparql queries between the full graph and the subgraphs

2024-01-15 Thread dcausse
dcausse renamed this task from "Com" to "Compare the performance of sparql queries between the full graph and the subgraphs". dcausse added a project: Wikidata-Query-Service. dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T355037 EMAIL

[Wikidata-bugs] [Maniphest] T353683: Unable to find a file by filename while adding a Commons media file statement

2024-01-09 Thread dcausse
dcausse added subscribers: Cparle, dcausse. dcausse added a project: SDAW-MediaSearch. dcausse added a comment. Restricted Application added a project: Structured-Data-Backlog. Selecting only namespace=6 does trigger the MediaSearch query profile which does not include the `all_near_match

[Wikidata-bugs] [Maniphest] T354142: 502 error on some Lingua Libre federated queries

2024-01-04 Thread dcausse
dcausse added a comment. Closed as a duplicate of T299290 <https://phabricator.wikimedia.org/T299290>, quickly testing it seems that the 502 is triggered depending on the query size: select * { service <https://lingualibre.org/sparql> { ?e <https://lingu

[Wikidata-bugs] [Maniphest] T354142: 502 error on some Lingua Libre federated queries

2024-01-04 Thread dcausse
dcausse closed this task as a duplicate of T299290: Unexpected behavior in federated queries with LinguaLibre in WDQS. TASK DETAIL https://phabricator.wikimedia.org/T354142 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, Nikki

[Wikidata-bugs] [Maniphest] T354043: Decide the name, domain and logo of WDQS for scholarly articles

2024-01-04 Thread dcausse
dcausse edited projects, added Wikidata-Query-Service; removed Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T354043 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, Midleading, AWesterinen

[Wikidata-bugs] [Maniphest] T350464: Expose SPARQL endpoints with full wikidata data set and with split graph to enable experimentation on federation with a split graph

2023-12-19 Thread dcausse
dcausse added a subtask: T352878: Troubleshoot recurring systemd unit failures and availability issues for wdqs1022-24. TASK DETAIL https://phabricator.wikimedia.org/T350464 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Gehel, Aklapper

[Wikidata-bugs] [Maniphest] T350784: Identify/complete post-migration tasks after rdf-streaming-updater migrates to flink operator

2023-12-15 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T350784 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, JMeybohm, Aklapper, bking, Danny_Benjafield_WMDE, Isabelladantes1983, Themindcoder

[Wikidata-bugs] [Maniphest] T353453: [Analytics] QUERY-Q3: Extract a set of queries known to be used by scholia

2023-12-14 Thread dcausse
dcausse added a comment. Note that scholia queries generally start with the comment: # tool: scholia TASK DETAIL https://phabricator.wikimedia.org/T353453 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper
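
The marker mentioned in this comment could be used to pick scholia queries out of a larger set. The following is only a minimal sketch, not the method used in the task: the function names are hypothetical, the input is assumed to be a plain list of query strings, and because the comment says queries "generally" start with the marker, some scholia queries may be missed.

```python
from typing import Iterable, List

# Marker quoted in the comment above; matching is case-insensitive here,
# which is an assumption of this sketch.
SCHOLIA_MARKER = "# tool: scholia"


def is_scholia_query(query: str) -> bool:
    """Return True if the first non-blank line starts with the scholia marker."""
    for line in query.splitlines():
        stripped = line.strip()
        if stripped:
            return stripped.lower().startswith(SCHOLIA_MARKER)
    # Empty or whitespace-only input has no first line to inspect.
    return False


def extract_scholia_queries(queries: Iterable[str]) -> List[str]:
    """Keep only the queries whose leading comment identifies scholia."""
    return [q for q in queries if is_scholia_query(q)]
```

For example, `extract_scholia_queries(["# tool: scholia\nSELECT * {}", "SELECT * {}"])` keeps only the first query.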

[Wikidata-bugs] [Maniphest] T249989: Disable native OpenSearch suggestions in Wikibase wikis

2023-12-11 Thread dcausse
dcausse removed projects: wmde-wikidata-tech, Web-Team-Backlog. dcausse added a comment. @Jdlrobson in T249989#9330640 <https://phabricator.wikimedia.org/T249989#9330640> a user reports that a similar problem sometimes happens with the new vector 2022 search completion box on

[Wikidata-bugs] [Maniphest] T351942: wbstatementquantity search keyword seems broken

2023-12-11 Thread dcausse
dcausse added a comment. In T351942#9395775 <https://phabricator.wikimedia.org/T351942#9395775>, @Michael wrote: > Mh, the tickets associated with T191633: Implement searching of 'depicts' on commons <https://phabricator.wikimedia.org/T191633> are interesting, I can't s

[Wikidata-bugs] [Maniphest] T350784: Identify/complete post-migration tasks after rdf-streaming-updater migrates to flink operator

2023-11-30 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T350784 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, JMeybohm, Aklapper, bking, Danny_Benjafield_WMDE, Astuthiodit_1, AWesterinen, BTullis

[Wikidata-bugs] [Maniphest] T351942: wbstatementquantity search keyword seems broken

2023-11-28 Thread dcausse
dcausse added a comment. I think this feature was originally meant to be used on commons and it appears that it was never properly configured anywhere (unless I'm missing something). For a quantity to be searchable via this keyword a statement must have a qualifier of type

[Wikidata-bugs] [Maniphest] T351819: Create a tool that records and compares a set of sparql query results

2023-11-28 Thread dcausse
dcausse claimed this task. dcausse moved this task from Incoming to In Progress on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T351819 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T351819: Create a tool that records and compares a set of sparql query results

2023-11-28 Thread dcausse
dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T351819 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, Danny_Benjafield_WMDE, Astuthiodit_1, AWesterinen

[Wikidata-bugs] [Maniphest] T351894: Write a tool that converts IGUANA test results into tabular data suited for analysis needs

2023-11-27 Thread dcausse
dcausse moved this task from Incoming to Needs review on the Discovery-Search (Current work) board. dcausse claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T351894 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T351894: Write a tool that converts IGUANA test results into tabular data suited for analysis needs

2023-11-27 Thread dcausse
dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T351894 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, Danny_Benjafield_WMDE, Astuthiodit_1, AWesterinen

[Wikidata-bugs] [Maniphest] T351894: Write a tool that converts IGUANA test results into tabular data suited for analysis needs

2023-11-23 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION IGUANA does output its test results as RDF (e.g. result.nt <https://gitlab.wikimedia.org/repos/search-platform/IGUANA/-/blob/main/wdqs-example-su

[Wikidata-bugs] [Maniphest] T349519: Determine if IGUANA and TFT would fit our query analysis needs

2023-11-23 Thread dcausse
dcausse moved this task from In Progress to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. **TFT** does not seem appropriate for the kind of tests we have to make within the scope of the graph split project, it does not provide anything to ease

[Wikidata-bugs] [Maniphest] T337013: [Epic] Splitting the graph in WDQS

2023-11-22 Thread dcausse
dcausse added a subtask: T351819: Create a tool that records and compares a set of sparql query results. TASK DETAIL https://phabricator.wikimedia.org/T337013 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dr0ptp4kt, RKemper, bking

[Wikidata-bugs] [Maniphest] T351819: Create a tool that records and compares a set of sparql query results

2023-11-22 Thread dcausse
dcausse added a parent task: T337013: [Epic] Splitting the graph in WDQS. TASK DETAIL https://phabricator.wikimedia.org/T351819 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, AWesterinen, Namenlos314, Gq86

[Wikidata-bugs] [Maniphest] T351819: Create a tool that records and compares a set of sparql query results

2023-11-22 Thread dcausse
dcausse renamed this task from "Create a tool that records and compare a set of sparql results" to "Create a tool that records and compares a set of sparql query results". TASK DETAIL https://phabricator.wikimedia.org/T351819 EMAIL PREFERENCES https://phabricator.wi

[Wikidata-bugs] [Maniphest] T351819: Create a tool that records and compare a set of sparql results

2023-11-22 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION In order to evaluate the impact of splitting the wikidata graph we want to compare the outcome of some queries against different endpoint

[Wikidata-bugs] [Maniphest] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2023-11-21 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T241128 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: MPhamWMF, Gehel, Addshore, dcausse, Aklapper, me, Danny_Benjafield_WMDE, Astuthiodit_1
