[Wikidata-bugs] [Maniphest] T272120: Deleted item still gets shown in WDQS query results

2021-01-27 Thread dcausse
dcausse reopened this task as "Open". dcausse added a comment. Re-opening as there still seems to be a problem related to deletes, and the fix done in T267175 <https://phabricator.wikimedia.org/T267175> was not effective. All servers seem to have missed the deletion

[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2021-01-27 Thread dcausse
dcausse added a comment. Thanks for the comments. Inconsistencies for edits prior to Jan 20 (time when the last fix was deployed) are //expected// and will be fixed by the reload. Inconsistency on Q104982840 is more troubling as the delete was done after this date. I'll re-open

[Wikidata-bugs] [Maniphest] T272994: Ensure scalastyle import order rules are verified

2021-01-26 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a developer of the scala projects in the wdqs repo I want scalastyle to check the import

[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2021-01-26 Thread dcausse
dcausse merged a task: T272120: Deleted item still gets shown in WDQS query results. dcausse added a subscriber: Mbch331. TASK DETAIL https://phabricator.wikimedia.org/T267175 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: RKemper, dcausse Cc

[Wikidata-bugs] [Maniphest] T272120: Deleted item still gets shown in WDQS query results

2021-01-26 Thread dcausse
dcausse closed this task as a duplicate of T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem). TASK DETAIL https://phabricator.wikimedia.org/T272120

[Wikidata-bugs] [Maniphest] T272120: Deleted item still gets shown in WDQS query results

2021-01-26 Thread dcausse
dcausse added a comment. I think this kind of inconsistency was related to the problems reported in T267175 <https://phabricator.wikimedia.org/T267175>. Thanks for the report, I manually resynced but please let me know via this ticket or T267175 <https://phabricator.wiki

[Wikidata-bugs] [Maniphest] T267927: Reload wikidata journal from fresh dumps

2021-01-25 Thread dcausse
dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T267927

[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2021-01-25 Thread dcausse
dcausse moved this task from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. Tried to find more inconsistencies using the query provided by @Multichill (https://w.wiki/ugf) and could not spot any while it was very easy to find one

[Wikidata-bugs] [Maniphest] T272447: Extract a list of the 200 most viewed black historical figures from WDQS

2021-01-20 Thread dcausse
dcausse added a comment. @Miriam this would be great indeed! I think we can make the query more precise (generalize the ethnicity and filter on Q5): SELECT ?item WHERE { ?item wdt:P31 wd:Q5 ; wdt:P172/wdt:P279* wd:Q817393 . } TASK DETAIL https
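The query quoted in this comment can be turned into a request URL for a SPARQL endpoint. The sketch below is a minimal illustration only: the endpoint URL, the `format=json` parameter, and the helper name `build_request_url` are assumptions and are not taken from the thread. It builds the URL but does not send any request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Query text as quoted in the comment: instances of Q5 (P31) whose P172 value
# is Q817393 or any subclass of it (P279*).
QUERY = """SELECT ?item WHERE {
  ?item wdt:P31 wd:Q5 ;
        wdt:P172/wdt:P279* wd:Q817393 .
}"""

# Assumption: the public WDQS SPARQL endpoint; the thread does not name it.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Return a GET URL that asks the endpoint for JSON results (hypothetical helper)."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    url = build_request_url(QUERY)
    # Decode the URL again to confirm the query survives URL encoding intact.
    print(parse_qs(urlsplit(url).query)["query"][0])
```

The property path `wdt:P172/wdt:P279*` is what generalizes the match: it follows P172 once and then zero or more P279 links.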

[Wikidata-bugs] [Maniphest] T267644: Update Wikidata unit conversion config (normalized quantities)

2021-01-20 Thread dcausse
dcausse moved this task from Blocked (from outside the team) to Waiting on the Discovery-Search (Current work) board. dcausse added a comment. Moving to waiting as T267175 <https://phabricator.wikimedia.org/T267175> is the last ticket still blocking this and is on the search team's

[Wikidata-bugs] [Maniphest] T267644: Update Wikidata unit conversion config (normalized quantities)

2021-01-20 Thread dcausse
dcausse added a comment. In T267644#6758340 <https://phabricator.wikimedia.org/T267644#6758340>, @Lucas_Werkmeister_WMDE wrote: > Alright, I uploaded a new unitConversionConfig.json at https://gerrit.wikimedia.org/r/657131; I’ve -2ed it for now due to the unclar

[Wikidata-bugs] [Maniphest] T267644: Update Wikidata unit conversion config (normalized quantities)

2021-01-19 Thread dcausse
dcausse added a comment. Thanks for fixing this script! A few notes on the next steps and how we should synchronize our efforts: Once the script has updated the json file read by wikibase we will have to re-import the wdqs machines using a dump generated based on the new units

[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2021-01-18 Thread dcausse
dcausse added a comment. Checked a couple of these inconsistencies and they all appear to be out of order in the kafka topics. I suggest disabling `async imports`, as I believe it might be a possible cause of these inconsistencies. TASK DETAIL https://phabricator.wikimedia.org/T267175
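One way to spot events that are out of order for an entity is to track the highest revision seen so far and flag anything lower. The sketch below is only an illustration: the `(entity_id, revision_id)` event shape and the function name are hypothetical and do not reflect the actual topic schema or the tooling used in this task.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (entity_id, revision_id), listed in the order in
# which the events appear in the topic.
Event = Tuple[str, int]


def find_out_of_order(events: Iterable[Event]) -> List[Event]:
    """Return events whose revision is lower than one already seen for the same entity."""
    highest_seen: Dict[str, int] = {}
    late: List[Event] = []
    for entity_id, revision in events:
        if revision < highest_seen.get(entity_id, -1):
            # A newer revision of this entity was already consumed.
            late.append((entity_id, revision))
        else:
            highest_seen[entity_id] = revision
    return late


if __name__ == "__main__":
    sample = [("Q1", 10), ("Q2", 5), ("Q1", 9), ("Q2", 6)]
    print(find_out_of_order(sample))
```

A consumer that applies events strictly in arrival order would end up with the older revision for any entity flagged this way.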

[Wikidata-bugs] [Maniphest] T239931: Reduce the impact of the sanitizer on wikidata

2021-01-13 Thread dcausse
dcausse assigned this task to Lucas_Werkmeister_WMDE. dcausse moved this task from To Be Deployed to Needs Reporting on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T239931 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/

[Wikidata-bugs] [Maniphest] T271851: Clean up gui from the wdqs deploy repo and puppet

2021-01-13 Thread dcausse
dcausse assigned this task to Ladsgroup. dcausse added a project: Discovery-Search (Current work). Restricted Application added a project: User-Ladsgroup. TASK DETAIL https://phabricator.wikimedia.org/T271851

[Wikidata-bugs] [Maniphest] T239931: Reduce the impact of the sanitizer on wikidata

2021-01-11 Thread dcausse
dcausse added a comment. In T239931#6736120 <https://phabricator.wikimedia.org/T239931#6736120>, @Lucas_Werkmeister_WMDE wrote: > Agreed. Should we (Wikidata team) do the config change or leave it to you? :) We can ship the config change no worries :) TASK DETAI

[Wikidata-bugs] [Maniphest] T239931: Reduce the impact of the sanitizer on wikidata

2021-01-11 Thread dcausse
dcausse added a comment. In T239931#6719994 <https://phabricator.wikimedia.org/T239931#6719994>, @EBernhardson wrote: > With the holidays over and everyone back, i think we can turn this on? Sounds good to me! TASK DETAIL https://phabricator.wikimedia.org/T2399

[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2021-01-05 Thread dcausse
dcausse merged a task: T270975: Some lexemes cannot be obtained by SPARQL query. dcausse added subscribers: Strepon, Skim. TASK DETAIL https://phabricator.wikimedia.org/T267175

[Wikidata-bugs] [Maniphest] T270975: Some lexemes cannot be obtained by SPARQL query

2021-01-05 Thread dcausse
dcausse closed this task as a duplicate of T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem). TASK DETAIL https://phabricator.wikimedia.org/T270975

[Wikidata-bugs] [Maniphest] T270975: Some lexemes cannot be obtained by SPARQL query

2021-01-05 Thread dcausse
dcausse added a comment. I manually refreshed the entities mentioned here. The root cause is already being worked on in a separate issue, so I am closing this ticket as a duplicate of it. TASK DETAIL https://phabricator.wikimedia.org/T270975

[Wikidata-bugs] [Maniphest] T270614: Automatically depool wdqs servers that are "lagged"

2020-12-21 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a wdqs user I would like servers that are lagged to be depooled so that I don't experience

[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2020-12-17 Thread dcausse
dcausse added a comment. In T269619#6696071 <https://phabricator.wikimedia.org/T269619#6696071>, @Ottomata wrote: > It depends on what you want to do :) EventGate will handle multi DC, filling some default values, and topic prefixes for you, but is an extra hop to Kafka.

[Wikidata-bugs] [Maniphest] T270371: wikimedia-event-utilities should provide tools for JVM based apps producing directly to kafka

2020-12-17 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata-Query-Service, Analytics. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a user of the event platform I want wikimedia-event-utilities to have all

[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2020-12-16 Thread dcausse
dcausse added a comment. In T269619#6695454 <https://phabricator.wikimedia.org/T269619#6695454>, @Ottomata wrote: > @dcausse, will these be POSTed to an EventGate, or produced directly to Kafka? I plan to POST them to EventGate using a very naive SinkFunction an

[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2020-12-16 Thread dcausse
dcausse added a comment. In T269619#6693136 <https://phabricator.wikimedia.org/T269619#6693136>, @Ottomata wrote: > @dcausse for retrieving schemas, https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master might help. :) Thanks! it wo

[Wikidata-bugs] [Maniphest] T270245: Jmx metrics for blazegraph are no longer visible in grafana

2020-12-16 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a wdqs maintainer I want to see metrics exported from the jmx prometheus exporter in grafana so

[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-14 Thread dcausse
dcausse assigned this task to RKemper. dcausse moved this task from Ready for Development to Needs Reporting on the Discovery-Search (Current work) board. dcausse added a comment. Seems to be resolved now: dcausse@mwmaint1002:~$ mwscript extensions/Wikidata.org/maintenance

[Wikidata-bugs] [Maniphest] T239931: Reduce the impact of the sanitizer on wikidata

2020-12-14 Thread dcausse
dcausse added a comment. The sanitizer is working OK with increased concurrency (T266762 <https://phabricator.wikimedia.org/T266762>), we might try to enable it again on wikidata and see how it performs. TASK DETAIL https://phabricator.wikimedia.org/T239931

[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2020-12-07 Thread dcausse
dcausse moved this task from Needs review to In Progress on the Discovery-Search (Current work) board. dcausse reassigned this task from Zbyszko to RKemper. dcausse added subscribers: RKemper, Zbyszko. dcausse added a comment. We will re-enable the kafka poller that was disabled for security

[Wikidata-bugs] [Maniphest] T269331: wdqs.data-reload cookbook fails when deleting the old namespace

2020-12-04 Thread dcausse
dcausse renamed this task from "wdqs.data-reload cookbook fails when switching" to "wdqs.data-reload cookbook fails when deleting the old namespace". TASK DETAIL https://phabricator.wikimedia.org/T269331

[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2020-12-04 Thread dcausse
dcausse added a subtask: T269451: Possible flink optimizations/cleanups. TASK DETAIL https://phabricator.wikimedia.org/T244590

[Wikidata-bugs] [Maniphest] T269451: Possible flink optimizations/cleanups

2020-12-04 Thread dcausse
dcausse added a parent task: T244590: [Epic] Rework the WDQS updater as an event driven application. TASK DETAIL https://phabricator.wikimedia.org/T269451

[Wikidata-bugs] [Maniphest] T269451: Possible flink optimizations/cleanups

2020-12-04 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As part of the flink review with ververica here are the few points we agreed to experiment before

[Wikidata-bugs] [Maniphest] T233204: Mixup of unicode characters in Query Service

2020-12-04 Thread dcausse
dcausse removed dcausse as the assignee of this task. dcausse removed a project: Discovery-Search (Current work). dcausse added a comment. Moving back to the backlog to re-evaluate the priority TASK DETAIL https://phabricator.wikimedia.org/T233204

[Wikidata-bugs] [Maniphest] T269421: Upgrade blazegraph to recent ICU4J version

2020-12-04 Thread dcausse
dcausse removed dcausse as the assignee of this task. TASK DETAIL https://phabricator.wikimedia.org/T269421

[Wikidata-bugs] [Maniphest] T269421: Upgrade blazegraph to recent ICU4J version

2020-12-04 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata-Query-Service, Wikidata. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION As a wdqs user I want the service to not conflate unrelated characters so that I can see what is actually stored in wikidata. Note

[Wikidata-bugs] [Maniphest] T233204: Mixup of unicode characters in Query Service

2020-12-04 Thread dcausse
dcausse added a comment. I did some experiments using one chunk of our dumps, which accounts for 31,883,361 triples, or ~3‰ of the dump size. The journal size using the default //tertiary// strength is 154Gb; it grows to 174Gb using //identical//, which is close to a 13% increase
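The figures in this comment can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the helper name `relative_increase` is hypothetical, and the implied total triple count is a rough extrapolation from the "~3‰" estimate, not a measured value.

```python
def relative_increase(before: float, after: float) -> float:
    """Return the growth from `before` to `after` as a fraction of `before`."""
    return (after - before) / before


# Journal sizes quoted in the comment (units as written there).
TERTIARY_SIZE = 154
IDENTICAL_SIZE = 174

# Chunk size and its approximate share of the dump, as quoted.
CHUNK_TRIPLES = 31_883_361
CHUNK_SHARE = 0.003  # ~3 per mille

if __name__ == "__main__":
    growth = relative_increase(TERTIARY_SIZE, IDENTICAL_SIZE)
    print(f"journal growth: {growth:.1%}")
    # Rough extrapolation only: the share is itself an approximation.
    print(f"implied total triples: {CHUNK_TRIPLES / CHUNK_SHARE:.3g}")
```

The growth works out to 20/154, which rounds to the 13% mentioned in the comment.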

[Wikidata-bugs] [Maniphest] T269204: Some wdqs metrics changed when switching to python3

2020-12-04 Thread dcausse
dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T269204

[Wikidata-bugs] [Maniphest] T269331: wdqs.data-reload cookbook fails when switching

2020-12-03 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a wdqs maintainer I want to reload all categories on wdqs machines without errors

[Wikidata-bugs] [Maniphest] T269204: Some wdqs metrics changed when switching to python3

2020-12-02 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T269204

[Wikidata-bugs] [Maniphest] T269204: Some wdqs metrics changed when switching to python3

2020-12-02 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a WDQS maintainer I want all metrics that wdqs reports to prometheus to have a consistent name

[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2020-12-01 Thread dcausse
dcausse merged a task: T268408: Query returns outdated results. dcausse added a subscriber: Epidosis. TASK DETAIL https://phabricator.wikimedia.org/T267175

[Wikidata-bugs] [Maniphest] T268408: Query returns outdated results

2020-12-01 Thread dcausse
dcausse closed this task as a duplicate of T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem). TASK DETAIL https://phabricator.wikimedia.org/T268408

[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2020-12-01 Thread dcausse
dcausse merged a task: T267924: WDQS Updater (based on recent changes) missed some updates. TASK DETAIL https://phabricator.wikimedia.org/T267175

[Wikidata-bugs] [Maniphest] T267924: WDQS Updater (based on recent changes) missed some updates

2020-12-01 Thread dcausse
dcausse closed this task as a duplicate of T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem). TASK DETAIL https://phabricator.wikimedia.org/T267924

[Wikidata-bugs] [Maniphest] T233204: Mixup of unicode characters in Query Service

2020-12-01 Thread dcausse
dcausse added a comment. Ꜵ being conflated with  is a bug in the version of ICU4J we use; switching to ICU 68.1 (we currently use 4.8 from 2011) solves the problem. Other issues related to similar chars (⑫ vs ⓬) do indeed require switching the collation strength to //identical

[Wikidata-bugs] [Maniphest] T233204: Mixup of unicode characters in Query Service

2020-11-30 Thread dcausse
dcausse claimed this task. dcausse moved this task from Ready for Development to In Progress on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T233204 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/

[Wikidata-bugs] [Maniphest] T250140: icinga: WDQS high update lag should alert when the service times out

2020-11-30 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T250140

[Wikidata-bugs] [Maniphest] T267644: Update Wikidata unit conversion config (normalized quantities)

2020-11-27 Thread dcausse
dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T267644

[Wikidata-bugs] [Maniphest] T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS

2020-11-25 Thread dcausse
dcausse added a comment. In T244341#6646657 <https://phabricator.wikimedia.org/T244341#6646657>, @ericP wrote: > For reasons that I believe have to do with additional data not changing already inferred facts (AKA monotonicity), certain OWL constructs MUST be expressed as a b

[Wikidata-bugs] [Maniphest] T268408: Query returns outdated results

2020-11-23 Thread dcausse
dcausse moved this task from All WDQS-related tasks to Tracking on the Wikidata-Query-Service board. dcausse triaged this task as "Medium" priority. TASK DETAIL https://phabricator.wikimedia.org/T268408 WORKBOARD https://phabricator.wikimedia.org/project/board/891/

[Wikidata-bugs] [Maniphest] T268408: Query returns outdated results

2020-11-23 Thread dcausse
dcausse added a comment. This particular problem should be fixed after the reload tracked in T267927 <https://phabricator.wikimedia.org/T267927>. The root cause of this issue is most likely the recent change poller, which may miss some updates; it should be fully addressed once we

[Wikidata-bugs] [Maniphest] T267927: Reload wikidata journal from fresh dumps

2020-11-23 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T267927

[Wikidata-bugs] [Maniphest] T267029: The streaming-updater-producer should handle backfills gracefully

2020-11-23 Thread dcausse
dcausse added a comment. There were no new inconsistent events found in the past two days. TASK DETAIL https://phabricator.wikimedia.org/T267029

[Wikidata-bugs] [Maniphest] T267029: The streaming-updater-producer should handle backfills gracefully

2020-11-20 Thread dcausse
dcausse moved this task from In Progress to Needs review on the Discovery-Search (Current work) board. dcausse added a comment. The new approach seems to work. - Backfill period: `2020-11-06T23:00:01` -> `2020-11-20T13:40:00` - Dump reconciliation: `2020-11-06T23:00:01` -> `2

[Wikidata-bugs] [Maniphest] T145712: Use RDF statement counts from entity data, not page props ( wikibase:identifiers, wikibase:statements and wikibase:sitelinks )

2020-11-20 Thread dcausse
dcausse added a comment. In T145712#6636776 <https://phabricator.wikimedia.org/T145712#6636776>, @Lucas_Werkmeister_WMDE wrote: > Alright, scheduled for Monday’s EU backport+config window – if I read the cron config <https://gerrit.wikimedia.org/g/opera

[Wikidata-bugs] [Maniphest] T145712: Use RDF statement counts from entity data, not page props ( wikibase:identifiers, wikibase:statements and wikibase:sitelinks )

2020-11-20 Thread dcausse
dcausse added a comment. In T145712#6634207 <https://phabricator.wikimedia.org/T145712#6634207>, @Lucas_Werkmeister_WMDE wrote: > I wonder if we should backport at least the main change to wmf.18? Otherwise, I believe it will only start showing up in the full RDF dumps of 7

[Wikidata-bugs] [Maniphest] T267029: The streaming-updater-producer should handle backfills gracefully

2020-11-19 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T267029

[Wikidata-bugs] [Maniphest] T267029: The streaming-updater-producer should handle backfills gracefully

2020-11-19 Thread dcausse
dcausse added a comment. After a test run it seems that we are able to backfill; unfortunately, we skip a non-negligible number of revisions: | y | m | d | h | inconsistency | status | event_type | count

[Wikidata-bugs] [Maniphest] T267029: The streaming-updater-producer should handle backfills gracefully

2020-11-17 Thread dcausse
dcausse renamed this task from "Tune the streaming-updater-producer to limit late events" to "The streaming-updater-producer should handle backfills gracefully". dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T267029

[Wikidata-bugs] [Maniphest] T267029: Tune the streaming-updater-producer to limit late events

2020-11-16 Thread dcausse
dcausse claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T267029

[Wikidata-bugs] [Maniphest] T267927: Reload wikidata journal from fresh dumps

2020-11-16 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T267927

[Wikidata-bugs] [Maniphest] T267927: Reload wikidata journal from fresh dumps

2020-11-16 Thread dcausse
dcausse added a subtask: T267644: Update Wikidata unit conversion config (normalized quantities). TASK DETAIL https://phabricator.wikimedia.org/T267927

[Wikidata-bugs] [Maniphest] T267644: Update Wikidata unit conversion config (normalized quantities)

2020-11-16 Thread dcausse
dcausse added a parent task: T267927: Reload wikidata journal from fresh dumps. TASK DETAIL https://phabricator.wikimedia.org/T267644

[Wikidata-bugs] [Maniphest] T267644: Update Wikidata unit conversion config (normalized quantities)

2020-11-16 Thread dcausse
dcausse added a comment. We don't yet have a way to fix a journal where changes are performed without a new revision; all changes of this kind require some special handling and might affect a huge number of triples. Therefore a full reload is the sole solution we have at the moment

[Wikidata-bugs] [Maniphest] T267927: Reload wikidata journal from fresh dumps

2020-11-16 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a wdqs user I want to query a dataset that is coherent with the state of wikidata. During

[Wikidata-bugs] [Maniphest] T267924: WDQS Updater (based on recent changes) missed some updates

2020-11-16 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a wdqs user I want all updates made to wikidata visible in the query service so that I query

[Wikidata-bugs] [Maniphest] T267029: Tune the streaming-updater-producer to limit late events

2020-11-09 Thread dcausse
dcausse moved this task from Scaling to Current work on the Wikidata-Query-Service board. dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T267029 WORKBOARD https://phabricator.wikimedia.org/project/board/891/

[Wikidata-bugs] [Maniphest] T267001: Compute page properties information at munge time

2020-11-09 Thread dcausse
dcausse closed this task as a duplicate of T145712: Use RDF statement counts from entity data, not page props ( wikibase:identifiers, wikibase:statements and wikibase:sitelinks ). TASK DETAIL https://phabricator.wikimedia.org/T267001

[Wikidata-bugs] [Maniphest] T145712: Use RDF statement counts from entity data, not page props ( wikibase:identifiers, wikibase:statements and wikibase:sitelinks )

2020-11-09 Thread dcausse
dcausse merged a task: T267001: Compute page properties information at munge time. TASK DETAIL https://phabricator.wikimedia.org/T145712

[Wikidata-bugs] [Maniphest] T266751: The streaming updater should identify all shared statements properly

2020-11-09 Thread dcausse
dcausse claimed this task. dcausse moved this task from Needs review to To Be Deployed on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T266751 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/

[Wikidata-bugs] [Maniphest] T266751: The streaming updater should identify all shared statements properly

2020-11-03 Thread dcausse
dcausse added a comment. Detailed exception is: java.lang.IllegalArgumentException: Cannot add/delete the same triple [(https://ce.wikipedia.org/wiki/%D0%92%D0%B5%D1%80%D0%B8%D0%BD_%D0%A5%D0%BE%D1%82%D0%B0%D0%BD%D0%B0%D0%BD, http://schema.org/inLanguage, "ce"^^<http://ww
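The exception message suggests that a triple appearing on both the add side and the delete side of an update is rejected. The sketch below illustrates such a check in isolation; it is not the updater's actual code, and the function names, the tuple representation of a triple, and the sample data are all hypothetical.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation of a triple: (subject, predicate, object).
Triple = Tuple[str, str, str]


def conflicting_triples(to_add: Iterable[Triple], to_delete: Iterable[Triple]) -> Set[Triple]:
    """Return the triples that appear in both the add set and the delete set."""
    return set(to_add) & set(to_delete)


def check_patch(to_add: Iterable[Triple], to_delete: Iterable[Triple]) -> None:
    """Raise ValueError if any triple would be both added and deleted."""
    conflicts = conflicting_triples(to_add, to_delete)
    if conflicts:
        raise ValueError(f"Cannot add/delete the same triple {sorted(conflicts)}")


if __name__ == "__main__":
    # Made-up triples for illustration only.
    shared = ("ex:page", "ex:inLanguage", "xx")
    try:
        check_patch([shared, ("ex:page", "ex:about", "ex:Q1")], [shared])
    except ValueError as err:
        print(err)
```

A check like this only reports the conflict; deciding which side should win is a separate question that the sketch does not address.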

[Wikidata-bugs] [Maniphest] T239931: Reduce the impact of the sanitizer on wikidata

2020-11-02 Thread dcausse
dcausse added a comment. Moving to blocked until we know what's causing T266762 <https://phabricator.wikimedia.org/T266762> TASK DETAIL https://phabricator.wikimedia.org/T239931

[Wikidata-bugs] [Maniphest] T267029: Tune the streaming-updater-producer to limit late events

2020-11-02 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a maintainer of the wdqs updater pipeline I want to tune the flink application to discard very

[Wikidata-bugs] [Maniphest] T267001: Compute page properties information at munge time

2020-11-02 Thread dcausse
dcausse added a comment. @Lucas_Werkmeister_WMDE indeed, thanks for the link, I was not aware of this ticket! :) I think we agree that most of this data can be computed from the data available in the entity without relying on page properties; the only one that remains difficult

[Wikidata-bugs] [Maniphest] T267001: Compute page properties information at munge time

2020-11-02 Thread dcausse
dcausse added a parent task: T244590: [Epic] Rework the WDQS updater as an event driven application. TASK DETAIL https://phabricator.wikimedia.org/T267001

[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2020-11-02 Thread dcausse
dcausse added a subtask: T267001: Compute page properties information at munge time. TASK DETAIL https://phabricator.wikimedia.org/T244590

[Wikidata-bugs] [Maniphest] T266999: Special:EntityData RDF Dump should output the values for page properties that are relevant to the revision being requested

2020-11-02 Thread dcausse
dcausse renamed this task from "Special:EntityData RDF Dump should output the values for page properties that are relevant to revision being asked" to "Special:EntityData RDF Dump should output the values for page properties that are relevant to the revision being requested

[Wikidata-bugs] [Maniphest] T267001: Compute page properties information at munge time

2020-11-02 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T267001 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE

[Wikidata-bugs] [Maniphest] T267001: Compute page properties information at munge time

2020-11-02 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a maintainer of wdqs I want to work around T266999 <https://phabricator.wikimedia.org/T266

[Wikidata-bugs] [Maniphest] T266999: Special:EntityData RDF Dump should output the values for page properties that are relevant to revision being asked

2020-11-02 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION As a user of the RDF dump provided by `Special:EntityData` I want the data it generates to be coherent with the `revision` param so that I can detect what

[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2020-11-01 Thread dcausse
dcausse added a subtask: T266986: The metrics of the streaming-updater-consumer should be visible in grafana. TASK DETAIL https://phabricator.wikimedia.org/T244590 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Thadguidry, tfmorris

[Wikidata-bugs] [Maniphest] T266986: The metrics of the streaming-updater-consumer should be visible in grafana

2020-11-01 Thread dcausse
dcausse added a parent task: T244590: [Epic] Rework the WDQS updater as an event driven application. TASK DETAIL https://phabricator.wikimedia.org/T266986 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, CBogen

[Wikidata-bugs] [Maniphest] T266986: The metrics of the streaming-updater-consumer should be visible in grafana

2020-11-01 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a maintainer of the wdqs streaming-updater I want to see the metrics it produces in grafana so

[Wikidata-bugs] [Maniphest] T266850: CategoryChangesAsRdfTest::testCategorization: Failed asserting that two strings are equal.

2020-10-30 Thread dcausse
dcausse removed a project: Wikidata. TASK DETAIL https://phabricator.wikimedia.org/T266850 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: DannyS712, Reedy, dcausse, Umherirrender, RhinosF1, Jdforrester-WMF, Ammarpad, Aklapper, AndreCstr

[Wikidata-bugs] [Maniphest] T266850: CategoryChangesAsRdfTest::testCategorization: Failed asserting that two strings are equal.

2020-10-30 Thread dcausse
dcausse edited projects, added Wikidata-Query-Service; removed Discovery. Restricted Application added a project: Wikidata. TASK DETAIL https://phabricator.wikimedia.org/T266850 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: DannyS712

[Wikidata-bugs] [Maniphest] T266750: The streaming updater consumer should stop accumulating patches if it cannot handle them

2020-10-29 Thread dcausse
dcausse renamed this task from "The streaming updater consumer should stop accumulating patches if it cannot handle" to "The streaming updater consumer should stop accumulating patches if it cannot handle them". TASK DETAIL https://phabricator.wikimedia.org/T266750 EMAIL

[Wikidata-bugs] [Maniphest] T266750: The streaming updater consumer should stop accumulating patches if it cannot handle

2020-10-29 Thread dcausse
dcausse claimed this task. dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T266750 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, CBogen, Akuckartz, Nandana

[Wikidata-bugs] [Maniphest] T266751: The streaming updater should identify all shared statements properly

2020-10-29 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a wdqs user I want triples shared by multiple entities to be treated separately in the streaming

[Wikidata-bugs] [Maniphest] T266750: The streaming updater consumer should stop accumulating patches if it cannot handle

2020-10-29 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION The KafkaStreamConsumer should stop accumulation of patches if it is going to fail

[Wikidata-bugs] [Maniphest] T262942: PoC on anomaly detection with Flink

2020-10-28 Thread dcausse
dcausse added a comment. Yes, it definitely can support such queries, e.g. (extract all API requests from mediawiki.apiaction, grouped by their action param and database, where the avg backend time is > 100ms over a 1-minute window). SELECT TUMBLE_START(dt, INTERVAL '1' MIN

[Wikidata-bugs] [Maniphest] T255657: Strange result in Wikidata query (full URLs given instead of identifiers)

2020-10-26 Thread dcausse
dcausse claimed this task. dcausse moved this task from In Progress to Needs review on the Discovery-Search (Current work) board. TASK DETAIL https://phabricator.wikimedia.org/T255657 WORKBOARD https://phabricator.wikimedia.org/project/board/1227/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T266070: wdqs updater failing on parse error

2020-10-26 Thread dcausse
dcausse closed this task as "Declined". dcausse added a comment. There's nothing to fix in the updater related to this ticket; the cause was a bad response from one mw machine. TASK DETAIL https://phabricator.wikimedia.org/T266070 EMAIL PREFERENCES https://phabricator.wik

[Wikidata-bugs] [Maniphest] T266470: Expose wdqs1009 to wdqs users and gather feedback

2020-10-26 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a wdqs maintainer I would like to expose some test servers to wdqs users so that I can collect

[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2020-10-23 Thread dcausse
dcausse added a subtask: T266321: Determine flink metrics configuration and backend when running from k8s. TASK DETAIL https://phabricator.wikimedia.org/T244590 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Thadguidry, tfmorris, revi

[Wikidata-bugs] [Maniphest] T266321: Determine flink metrics configuration and backend when running from k8s

2020-10-23 Thread dcausse
dcausse added a parent task: T244590: [Epic] Rework the WDQS updater as an event driven application. TASK DETAIL https://phabricator.wikimedia.org/T266321 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, CBogen

[Wikidata-bugs] [Maniphest] T266321: Determine flink metrics configuration and backend when running from k8s

2020-10-23 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a streaming updater maintainer I want to set up the flink metrics system so that I can have

[Wikidata-bugs] [Maniphest] T266318: Clarify dependencies on codehale dropwizards

2020-10-23 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T266318 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2020-10-23 Thread dcausse
dcausse added a subtask: T266318: Clarify dependencies on codehale dropwizards. TASK DETAIL https://phabricator.wikimedia.org/T244590 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Thadguidry, tfmorris, revi, Ladsgroup, Multichill

[Wikidata-bugs] [Maniphest] T266318: Clarify dependencies on codehale dropwizards

2020-10-23 Thread dcausse
dcausse added a parent task: T244590: [Epic] Rework the WDQS updater as an event driven application. TASK DETAIL https://phabricator.wikimedia.org/T266318 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Aklapper, CBogen
