Zbyszko created this task.
Zbyszko added a project: Wikidata-Query-Service.
As a WDQS maintainer I want to verify whether it's possible to create an alert
that would trigger on an incorrect number of statements in any Blazegraph
instance, so that I can react and remedy the issue.
Currently we rely far too much on user input when it comes to potential
inconsistencies in Blazegraph (by which we mean meaningful differences between
WD and WDQS). The new Streaming Updater should remedy some issues we have with
the current updater, but it would be good to be able to know early if the data
is corrupt or incomplete.
The idea is to create an alert that would periodically compare the number of
statements in Wikidata and any given Blazegraph instance. The main difficulty
(which may in fact be a blocking one) is to come up with a threshold that makes
sense. These numbers do not match on purpose, due to intentionally dropped
statements, shared ones, etc.
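To make the comparison concrete, here is a minimal sketch of what such a check could look like. It is only an illustration, not a proposed threshold algorithm: the function name should_alert and the parameters expected_ratio and tolerance are hypothetical, and the sketch assumes the intentional mismatch can be summarised as a roughly stable ratio between the two counts, which is exactly the open question of this task.

```python
def should_alert(wikidata_count: int, blazegraph_count: int,
                 expected_ratio: float, tolerance: float) -> bool:
    """Return True when the Blazegraph count strays too far from the baseline.

    expected_ratio is a hypothetical baseline for
    blazegraph_count / wikidata_count (the counts differ on purpose).
    tolerance is the hypothetical allowed relative deviation from that
    baseline, e.g. 0.01 for 1%.
    """
    if wikidata_count <= 0:
        raise ValueError("wikidata_count must be positive")
    if expected_ratio <= 0:
        raise ValueError("expected_ratio must be positive")

    observed_ratio = blazegraph_count / wikidata_count
    # Relative deviation of the observed ratio from the baseline ratio.
    deviation = abs(observed_ratio - expected_ratio) / expected_ratio
    return deviation > tolerance
```

If the ratio turns out not to be stable enough over time to pick an expected_ratio and tolerance, that would itself be an argument for the second outcome listed below.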
- Proposal for an algorithm of calculating the threshold, or
- explanation why it doesn't make sense to calculate one.