[Wikidata-bugs] [Maniphest] T269451: Possible flink optimizations/cleanups

2021-01-12 Thread Zbyszko
Zbyszko updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T269451 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: dcausse, Aklapper, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T269451: Possible flink optimizations/cleanups

2021-01-12 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T269451 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: dcausse, Aklapper, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE

[Wikidata-bugs] [Maniphest] T267825: Outdated data still present in WCQS a month after statement update

2021-01-12 Thread Zbyszko
Zbyszko added a comment. In T267825#6721180 <https://phabricator.wikimedia.org/T267825#6721180>, @Mstyles wrote: > On the wcqs beta host, `curl -d "query=select * { sdc:M8979671 wdt:P571 ?o . }" localhost/bigdata/namespace/wcq/sparql` returns an item with the cor

[Wikidata-bugs] [Maniphest] T267825: Outdated data still present in WCQS a month after statement update

2021-01-11 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T267825 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Mstyles, CBogen, JeanFred, Aklapper, GFontenelle_WMF, MPhamWMF, Nintendofan885, Akuckartz, Nandana, JKSTNK

[Wikidata-bugs] [Maniphest] T271412: Investigate an alert on statements volume in Blazegraph instances

2021-01-07 Thread Zbyszko
Zbyszko created this task. Zbyszko added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a WDQS maintainer I want to verify if it's possible to create an alert that would trigger

[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2020-12-04 Thread Zbyszko
Zbyszko added a comment. A few details on the issue: - For extensively modified entities (e.g. by bots), logs show that information provided by the RecentChanges API isn't always up to date. This can lead to lost revisions - if the last change was among the ones not yet provided by the API

[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2020-12-02 Thread Zbyszko
Zbyszko added a comment. I started to investigate the issue, but had to get back to some previous issue - I should have some update before the end of the week, though. TASK DETAIL https://phabricator.wikimedia.org/T267175 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings

[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2020-12-01 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T267175 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Tagishsimon, Lydia_Pintscher, CBogen, Z_thomas, agray, Gehel, Lucas_Werkmeister_WMDE, Aklapper, M2k_dewiki

[Wikidata-bugs] [Maniphest] T264447: Examine cases where Blazegraph generates results that timeout and don’t make it back to the user

2020-12-01 Thread Zbyszko
Zbyszko added a comment. For now, we use status_code = 500 and query_time > 60s to assert that query timed out. Example notebook:F33929802: timed out queries (1).ipynb <https://phabricator.wikimedia.org/F33929802> TASK DETAIL https://phabricator.wikimedia.org/T264447 EMAIL PR
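The heuristic from this comment could be expressed roughly as follows. This is a minimal sketch, not the code from the linked notebook: the `QueryLogEntry` type and its field names are hypothetical, and only the two conditions (status code 500, query time above 60 s) come from the comment.

```python
from dataclasses import dataclass

# Threshold taken from the comment: queries running longer than 60 s.
TIMEOUT_THRESHOLD_MS = 60_000


@dataclass
class QueryLogEntry:
    """Hypothetical shape of one query log record (field names are assumptions)."""
    status_code: int
    query_time_ms: int


def is_timed_out(entry: QueryLogEntry) -> bool:
    """Treat a query as timed out when it returned HTTP 500 and ran longer than 60 s."""
    return entry.status_code == 500 and entry.query_time_ms > TIMEOUT_THRESHOLD_MS
```

Both conditions are needed: a 500 on its own can be any server error, and a slow query with a 200 response did make it back to the user.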

[Wikidata-bugs] [Maniphest] T264447: Examine cases where Blazegraph generates results that timeout and don’t make it back to the user

2020-11-30 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T264447 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: CBogen, Lydia_Pintscher, JAllemandou, Gehel, Aklapper, Akuckartz, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-11-19 Thread Zbyszko
Zbyszko added a comment. I'm happy to report that I managed to get swift-flink integration running - after reimplementing authorization (tempauth wasn't supported in the original swift plugin). Unfortunately, this implementation also suffers from lack of implementation of recoverable

[Wikidata-bugs] [Maniphest] T267310: Allow job name configuration for Streaming Updater

2020-11-05 Thread Zbyszko
Zbyszko created this task. Zbyszko added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a WDQS maintainer I want to be able to change the name of the streaming updater job so that I

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-11-04 Thread Zbyszko
Zbyszko added a comment. Unfortunately, there are issues with S3, too. One implementation (Presto) doesn't seem to work anymore since the introduction of RecoverableWriters; the other (Hadoop) is producing class loading errors (I'll post them once I'll

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-10-30 Thread Zbyszko
Zbyszko added a comment. Progress so far: I managed to connect to Swift via command line and create a container. Unfortunately, we use V1 auth (or at least I don't know about any newer method) and the Swift client used by Flink only supports V2+. I'll try S3 <ht

[Wikidata-bugs] [Maniphest] T266318: Clarify dependencies on codehale dropwizards

2020-10-27 Thread Zbyszko
Zbyszko added a comment. After investigation I found that most metric plugins do not shade in codehale metrics; graphite plugin does this because it actually uses them in the code (other reporters access flink provided abstraction). Since it is the exception, I decided to shade

[Wikidata-bugs] [Maniphest] T265504: Create Blubberfile in WDQS repo

2020-10-27 Thread Zbyszko
Zbyszko added a comment. > I hope my answer helps as well. Yes it did, thank you! It might've all come from the fact I wasn't present during your first meeting, but now I have a much better perspective on how to review the code. TASK DETAIL https://phabricator.wikimedia.org/T265

[Wikidata-bugs] [Maniphest] T265504: Create Blubberfile in WDQS repo

2020-10-27 Thread Zbyszko
Zbyszko added a comment. > Could you elaborate on that a bit? Sure, here goes: We are using Apache Flink[1] as the platform for the event processing that feeds Wikidata Query Service. We want to move the Flink deployment to Kubernetes, hence this ticket. Apache Flink provides i

[Wikidata-bugs] [Maniphest] T265504: Create Blubberfile in WDQS repo

2020-10-27 Thread Zbyszko
Zbyszko added a comment. @akosiaris I see, makes sense. I still would like to solve the issue with replicating the original dockerfile - can we deploy Flink images to our registry - even if we'd need to fork Flink docker repo? TASK DETAIL https://phabricator.wikimedia.org/T265504 EMAIL

[Wikidata-bugs] [Maniphest] T266318: Clarify dependencies on codehale dropwizards

2020-10-26 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T266318 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: dcausse, Aklapper, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-10-26 Thread Zbyszko
Zbyszko added a comment. Great! thanks - I'll get on that. TASK DETAIL https://phabricator.wikimedia.org/T246004 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: elukey, EBernhardson, JMeybohm, fgiunchedi, CBogen, #analytics, dcausse

[Wikidata-bugs] [Maniphest] T264659: Update BAG & BRT SPARQL endpoint in the whitelist

2020-10-26 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T264659 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Gehel, Denengelse, RhinosF1, Aklapper, Multichill, Alter-paule, Beast1978, CBogen, Un1tY, Akuckartz, Hook696

[Wikidata-bugs] [Maniphest] T265504: Create Blubberfile in WDQS repo

2020-10-26 Thread Zbyszko
Zbyszko added a comment. @akosiaris Can we base a Blubber-enabled project on a 3rd party Docker image provided on Docker Hub? I was wondering if we have to replicate the original Dockerfile here (I'd rather base off their image to reduce future maintenance). TASK DETAIL https

[Wikidata-bugs] [Maniphest] T264042: Query service returns weird Commons links when asking for P50 (author) statement value

2020-10-22 Thread Zbyszko
Zbyszko added a comment. @Lucas_Werkmeister_WMDE a little follow-up to this - is it possible that this could be the result of opcache issues, like those described here - https://phabricator.wikimedia.org/T255282 ? The dates match, so we are wondering if those could be related. TASK DETAIL https

[Wikidata-bugs] [Maniphest] T264042: Query service returns weird Commons links when asking for P50 (author) statement value

2020-10-21 Thread Zbyszko
Zbyszko added a comment. A little bit more on what I found: - Both eqiad and codfw were affected, but the lists of Q-entities were greatly different - While almost all issues were reported for truthy and reified statements, there was a single case for a reference. Interestingly, that case

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-10-21 Thread Zbyszko
Zbyszko added a comment. Thank you all for swift (pun intended) action! TASK DETAIL https://phabricator.wikimedia.org/T246004 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: elukey, EBernhardson, JMeybohm, fgiunchedi, CBogen

[Wikidata-bugs] [Maniphest] T264042: Query service returns weird Commons links when asking for P50 (author) statement value

2020-10-20 Thread Zbyszko
Zbyszko added a comment. All the entities affected were refreshed and this: SELECT ?p (COUNT(*) AS ?count) WHERE { ?s ?p <http://commons.wikimedia.org/wiki/Special:FilePath/>. } GROUP BY ?p ORDER BY DESC(?count) no longer returns any results. All af

[Wikidata-bugs] [Maniphest] T264042: Query service returns weird Commons links when asking for P50 (author) statement value

2020-10-20 Thread Zbyszko
Zbyszko added a comment. According to this query: SELECT DISTINCT ?item ?rev ?date WHERE { { ?st ps:P50|ps:P106|ps:P136|ps:P275|pq:P512|pq:P106 <http://commons.wikimedia.org/wiki/Special:FilePath/>. ?item ?p ?st }

[Wikidata-bugs] [Maniphest] T265891: Add Wikimedia Commons Query Service endpoint in the WDQS federation whitelist

2020-10-19 Thread Zbyszko
Zbyszko added a comment. It's currently not possible - as long as Wikimedia Commons Query Service is in beta, OAuth authorization will not work with federation. We want to reevaluate this federation once we productionize WCQS. TASK DETAIL https://phabricator.wikimedia.org/T265891 EMAIL

[Wikidata-bugs] [Maniphest] T264042: Query service returns weird Commons links when asking for P50 (author) statement value

2020-10-19 Thread Zbyszko
Zbyszko added a comment. First look at the issue - the usual culprits don't seem to apply here: - The munged dump is correct for one of the affected entities - The query, after loading the data into Blazegraph, is fine, too. Interestingly, every single entity I found with this was updated

[Wikidata-bugs] [Maniphest] T264042: Query service returns weird Commons links when asking for P50 (author) statement value

2020-10-19 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T264042 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Zbyszko, CBogen, Lucas_Werkmeister_WMDE, Aklapper, matej_suchanek, Vojtech.dostal, Akuckartz, darthmon_wmde

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-10-19 Thread Zbyszko
Zbyszko added a comment. I'm fine with the thanos cluster option - we can proceed with that. @Ottomata do you know if thanos swift cluster is accessible from hadoop? TASK DETAIL https://phabricator.wikimedia.org/T246004 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-10-09 Thread Zbyszko
Zbyszko added a comment. We lack precise data for production - we haven't really optimised yet and the complete functionality isn't ready yet (it will be soon, though). Rarely, we get checkpoints of around 8-9GB (when bootstrapping, for example), but they do not happen regularly. Normally, checkpoints

[Wikidata-bugs] [Maniphest] T264042: Query service returns weird Commons links when asking for P50 (author) statement value

2020-10-08 Thread Zbyszko
Zbyszko removed Zbyszko as the assignee of this task. TASK DETAIL https://phabricator.wikimedia.org/T264042 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Zbyszko, CBogen, Lucas_Werkmeister_WMDE, Aklapper, matej_suchanek, Vojtech.dostal

[Wikidata-bugs] [Maniphest] T256949: The streaming updater should support suppressed deletes

2020-10-06 Thread Zbyszko
Zbyszko removed a project: Patch-For-Review. TASK DETAIL https://phabricator.wikimedia.org/T256949 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Bugreporter, dcausse, Aklapper, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314

[Wikidata-bugs] [Maniphest] T264042: Query service returns weird Commons links when asking for P50 (author) statement value

2020-10-06 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T264042 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: CBogen, Lucas_Werkmeister_WMDE, Aklapper, matej_suchanek, Vojtech.dostal, Akuckartz, darthmon_wmde, Nandana

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-10-06 Thread Zbyszko
Zbyszko added a comment. @fgiunchedi Currently, the Flink pipeline resides on the Analytics Hadoop cluster. As for the question whether Flink creates its containers - I think not; it did complain when there was no container, so I assume it expects one. TASK DETAIL https

[Wikidata-bugs] [Maniphest] T263855: mvn package fails for wikidata-query-rdf on Mac OS 10.13.6 High Sierra

2020-10-02 Thread Zbyszko
Zbyszko added a comment. The error you get: Referenced from: /private/var/folders/2t/2g54bjr10830rv00508_y13wgn/T/flink-io-6ec39247-613e-410a-a83f-712f841ce3a8/rocksdb-lib-396871de50f5fa7595c1071b59c34498/librocksdbjni-osx.jnilib (which was built for Mac OS X 10.15) would

[Wikidata-bugs] [Maniphest] T256882: The streaming updater should support page undeletes

2020-09-29 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T256882 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Aklapper, dcausse, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-09-29 Thread Zbyszko
Zbyszko added a comment. @fgiunchedi unfortunately, there is no Docker on stat instances, so I'm unable to test Swift that way. I'd still prefer to have some container on an already running service (whichever is accessible from the analytics cluster). The test we want to set up would involve longer

[Wikidata-bugs] [Maniphest] T261119: Architecture review of Flink based WDQS Streaming Updater

2020-09-29 Thread Zbyszko
Zbyszko added a comment. Architecture review will be done at the beginning of November. TASK DETAIL https://phabricator.wikimedia.org/T261119 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Aklapper, Zbyszko, dcausse, Gehel, CBogen

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-09-29 Thread Zbyszko
Zbyszko added a comment. @Ottomata There's some confusion on how to access Swift from the Hadoop cluster - I understood that it isn't doable, but from what I hear, search pipeline results go there. Can we reuse the same mechanism here? I'd rather have a setup done with the current updater, so we

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-09-29 Thread Zbyszko
Zbyszko updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T246004 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: EBernhardson, JMeybohm, fgiunchedi, CBogen, #analytics, dcausse, Gehel, Zbyszko, Aklapper

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-09-29 Thread Zbyszko
Zbyszko added a comment. @fgiunchedi We estimate we'd need around 500GB of storage for the streaming updater (not accounting for replicas). Our use case is almost always write-only (checkpoints are read only on pipeline restarts, which ideally will be done rarely) - but we have an elasticity

[Wikidata-bugs] [Maniphest] T252124: Scap configuration for WDQS should get server groups from a known source or truth

2020-09-28 Thread Zbyszko
Zbyszko added a comment. @thcipriani - your proposal sounds reasonable (we don't really care if we're deploying public service before private one). One issue - we want to enable parallel deployments (https://phabricator.wikimedia.org/T207676) - to shorten the time and division available

[Wikidata-bugs] [Maniphest] T252124: Scap configuration for WDQS should get server groups from a known source or truth

2020-09-25 Thread Zbyszko
Zbyszko added a comment. @thcipriani I have the change in puppet (https://gerrit.wikimedia.org/r/630081). What else would I need to do to guarantee that generated lists in /etc/dsh/groups are picked up? I'd rather leave everything else configured as it is right now (https://github.com

[Wikidata-bugs] [Maniphest] T252124: Scap configuration for WDQS should get server groups from a known source or truth

2020-09-24 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T252124 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: thcipriani, Aklapper, RKemper, Gehel, lmata, CBogen, Akuckartz, darthmon_wmde, Legado_Shulgin, Nandana

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-09-22 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T246004 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: CBogen, #analytics, dcausse, Gehel, Zbyszko, Aklapper, JAllemandou, Smalyshev, Iamamz3, Ottomata, NavinRizwi

[Wikidata-bugs] [Maniphest] T261841: Tag WDQS query log with the source of the query (UI vs direct access)

2020-09-22 Thread Zbyszko
Zbyszko added a comment. It turns out that request headers already provide referrer (where available) in http.request-headers map. Is it required to provide it elsewhere to fulfill the need of analysts? TASK DETAIL https://phabricator.wikimedia.org/T261841 EMAIL PREFERENCES https
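If the referrer in the request-headers map is enough, the UI-versus-direct tag could be derived downstream instead of being logged separately. The sketch below is only an illustration under assumptions: the event is modelled as a nested dict, and the key names `http`, `request_headers` and `referer` are guesses at the layout, not the actual schema. The UI host `query.wikidata.org` is the public WDQS front end.

```python
from urllib.parse import urlparse

# Host serving the WDQS query UI; requests referred from it are treated as UI traffic.
UI_HOST = "query.wikidata.org"


def query_source(event: dict) -> str:
    """Classify a query log event as 'ui' or 'direct' from its referrer header.

    The nested keys used here are hypothetical stand-ins for the
    http.request-headers map mentioned in the comment.
    """
    headers = event.get("http", {}).get("request_headers", {})
    referer = headers.get("referer", "")
    # A missing or unparsable referrer counts as direct access.
    if referer and urlparse(referer).hostname == UI_HOST:
        return "ui"
    return "direct"
```

A missing referrer is classified as direct access here, which matches the "where available" caveat in the comment but would also misclassify UI requests whose referrer was stripped.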

[Wikidata-bugs] [Maniphest] T262265: Provide real-time updates for WCQS

2020-09-21 Thread Zbyszko
Zbyszko moved this task from Scaling to Current work on the Wikidata-Query-Service board. Zbyszko added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T262265 WORKBOARD https://phabricator.wikimedia.org/project/board/891/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T256882: The streaming updater should support page undeletes

2020-09-21 Thread Zbyszko
Zbyszko moved this task from Scaling to Current work on the Wikidata-Query-Service board. Zbyszko added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T256882 WORKBOARD https://phabricator.wikimedia.org/project/board/891/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T246004: Spike: Can/should Swift be used as Flink checkpoint backend?

2020-09-21 Thread Zbyszko
Zbyszko moved this task from Scaling to Current work on the Wikidata-Query-Service board. Zbyszko added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T246004 WORKBOARD https://phabricator.wikimedia.org/project/board/891/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T262942: PoC on anomaly detection with Flink

2020-09-21 Thread Zbyszko
Zbyszko moved this task from All WDQS-related tasks to Current work on the Wikidata-Query-Service board. Zbyszko added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T262942 WORKBOARD https://phabricator.wikimedia.org/project/board/891/ EMAIL

[Wikidata-bugs] [Maniphest] T263110: Investigate the cause of: ChecksumError: offset=517789868032, nbytes=16, expected=-58390144, actual=535102966 while importing wikidata dumps

2020-09-21 Thread Zbyszko
Zbyszko moved this task from All WDQS-related tasks to Current work on the Wikidata-Query-Service board. Zbyszko added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T263110 WORKBOARD https://phabricator.wikimedia.org/project/board/891/ EMAIL

[Wikidata-bugs] [Maniphest] T263125: Check for errors on wdqs1009 disks

2020-09-21 Thread Zbyszko
Zbyszko moved this task from All WDQS-related tasks to Current work on the Wikidata-Query-Service board. Zbyszko added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T263125 WORKBOARD https://phabricator.wikimedia.org/project/board/891/ EMAIL

[Wikidata-bugs] [Maniphest] T261937: Add CPU load and query concurrency as context to event logging from WDQS

2020-09-21 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T261937 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Zbyszko, JAllemandou, Aklapper, Gehel, Alter-paule, Beast1978, CBogen, Un1tY, Akuckartz, Hook696

[Wikidata-bugs] [Maniphest] T261841: Tag WDQS query log with the source of the query (UI vs direct access)

2020-09-21 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T261841 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Aklapper, Gehel, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE

[Wikidata-bugs] [Maniphest] T261937: Add CPU load and query concurrency as context to event logging from WDQS

2020-09-17 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T261937 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: JAllemandou, Aklapper, Gehel, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T261937: Add CPU load and query concurrency as context to event logging from WDQS

2020-09-17 Thread Zbyszko
Zbyszko removed Zbyszko as the assignee of this task. TASK DETAIL https://phabricator.wikimedia.org/T261937 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Zbyszko, JAllemandou, Aklapper, Gehel, CBogen, Akuckartz, darthmon_wmde, Nandana

[Wikidata-bugs] [Maniphest] T262828: Near zero downtime Data reload for WCQS

2020-09-15 Thread Zbyszko
Zbyszko triaged this task as "High" priority. TASK DETAIL https://phabricator.wikimedia.org/T262828 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Gehel, Bugreporter, Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde

[Wikidata-bugs] [Maniphest] T262828: Near zero downtime Data reload for WCQS

2020-09-15 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T262828 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Gehel, Bugreporter, Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T262828: Near zero downtime Data reload for WCQS

2020-09-14 Thread Zbyszko
Zbyszko created this task. Zbyszko added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a user I want the shortest possible downtime for WCQS in case of data reload, so that I can

[Wikidata-bugs] [Maniphest] T262828: Near zero downtime Data reload for WCQS

2020-09-14 Thread Zbyszko
Zbyszko moved this task from All WDQS-related tasks to Current work on the Wikidata-Query-Service board. Zbyszko added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T262828 WORKBOARD https://phabricator.wikimedia.org/project/board/891/ EMAIL

[Wikidata-bugs] [Maniphest] T261119: Architecture review of Flink based WDQS Streaming Updater

2020-09-14 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T261119 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Aklapper, Zbyszko, dcausse, Gehel, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T258240: Refactor Options handling in Streaming Updater

2020-09-10 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T258240 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: dcausse, Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T261097: WDQS Categories reload is failing on thankyouwiki

2020-09-10 Thread Zbyszko
Zbyszko added a comment. @RKemper we should be able to retry categories reload after deploying this. TASK DETAIL https://phabricator.wikimedia.org/T261097 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: dcausse, RKemper, Gehel

[Wikidata-bugs] [Maniphest] T261097: WDQS Categories reload is failing on thankyouwiki

2020-09-10 Thread Zbyszko
Zbyszko added a comment. I went with the fix for wikidata/query/rdf scripts - it made most sense to me, since issues with a single wiki shouldn't block updates for others. Once it's merged, I'll add an entry on that to the runbook. TASK DETAIL https://phabricator.wikimedia.org/T261097 EMAIL

[Wikidata-bugs] [Maniphest] T262265: Provide real-time updates for WCQS

2020-09-08 Thread Zbyszko
Zbyszko added a comment. Just to be my own devil's advocate or to provide alternatives, we can solve both downtime and real-time updates with the old updater. Additionally, we can eliminate the downtime by having two Blazegraph instances in an active/standby setup. TASK DETAIL https

[Wikidata-bugs] [Maniphest] T260568: [EPIC] Productionize WCQS

2020-09-08 Thread Zbyszko
Zbyszko added a subtask: T262265: Provide real-time updates for WCQS. TASK DETAIL https://phabricator.wikimedia.org/T260568 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: CBogen, Gehel, Aklapper, NavinRizwi, Akuckartz, darthmon_wmde

[Wikidata-bugs] [Maniphest] T262265: Provide real-time updates for WCQS

2020-09-08 Thread Zbyszko
Zbyszko added a parent task: T260568: [EPIC] Productionize WCQS. TASK DETAIL https://phabricator.wikimedia.org/T262265 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde, Nandana

[Wikidata-bugs] [Maniphest] T262265: Provide real-time updates for WCQS

2020-09-08 Thread Zbyszko
Zbyszko added a comment. If we go by the solution of having an additional pipeline for SDC - https://phabricator.wikimedia.org/T262020 should be done first. TASK DETAIL https://phabricator.wikimedia.org/T262265 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel

[Wikidata-bugs] [Maniphest] T262020: Make sure that transaction.id assigned by Flink is unique for concurrent runs of Streaming Updater pipeline

2020-09-08 Thread Zbyszko
Zbyszko added a comment. If we go by the solution of having an additional pipeline for SDC - https://phabricator.wikimedia.org/T262020 should be done first. TASK DETAIL https://phabricator.wikimedia.org/T262020 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel

[Wikidata-bugs] [Maniphest] T262265: Provide real-time updates for WCQS

2020-09-08 Thread Zbyszko
Zbyszko updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T262265 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T262265: Provide real-time updates for WCQS

2020-09-08 Thread Zbyszko
Zbyszko created this task. Zbyszko added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a user of WCQS I want to have real-time updates to WCQS so that I can see the changes soon

[Wikidata-bugs] [Maniphest] T262178: One week after SDC edits the data still shows up in WCQS queries

2020-09-07 Thread Zbyszko
Zbyszko added a comment. Today's reload happened and if this (https://tinyurl.com/y5vd95rm) query is correct, there are no duplicates. @Jarekt, can you confirm? TASK DETAIL https://phabricator.wikimedia.org/T262178 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel

[Wikidata-bugs] [Maniphest] T262178: One week after SDC edits the data still shows up in WCQS queries

2020-09-07 Thread Zbyszko
Zbyszko added a comment. The process we have right now is that we use SDC dumps to reload the data each week. Dumps are made each Sunday, which means that all the changes made between Aug 30th and Sep 6th will only show up in the dump released on Sep 6th. I pushed the update time from
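The dump schedule described above can be worked through with a small date calculation. This is a sketch under one assumption that the comment does not settle: an edit made on a Sunday is taken to miss that day's dump and land in the following week's. The function name is hypothetical.

```python
from datetime import date, timedelta

SUNDAY = 6  # date.weekday() numbers Monday as 0 and Sunday as 6


def first_dump_containing(edit_day: date) -> date:
    """Return the first Sunday strictly after edit_day.

    Assumes dumps are taken every Sunday and that a same-day edit
    is not captured until the next weekly dump.
    """
    days_ahead = (SUNDAY - edit_day.weekday()) % 7
    if days_ahead == 0:
        days_ahead = 7
    return edit_day + timedelta(days=days_ahead)
```

With this, an edit made on 2020-09-01 maps to the dump of 2020-09-06, matching the example in the comment; the data becomes queryable only once that dump has been loaded.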

[Wikidata-bugs] [Maniphest] T262020: Make sure that transaction.id assigned by Flink is unique for concurrent runs of Streaming Updater pipeline

2020-09-04 Thread Zbyszko
Zbyszko added a project: Wikidata-Query-Service. Restricted Application added a project: Wikidata. TASK DETAIL https://phabricator.wikimedia.org/T262020 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Aklapper, Zbyszko, CBogen, Akuckartz

[Wikidata-bugs] [Maniphest] T261097: WDQS Categories reload is failing on thankyouwiki

2020-09-03 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T261097 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: dcausse, RKemper, Gehel, Aklapper, lmata, CBogen, Akuckartz, darthmon_wmde, Legado_Shulgin, Nandana

[Wikidata-bugs] [Maniphest] T261840: Jetty startup logs in /var/log/wdqs

2020-09-02 Thread Zbyszko
Zbyszko created this task. Zbyszko added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION As a WDQS/WCQS maintainer I want to be able to access Jetty startup logs from a file in /var/log

[Wikidata-bugs] [Maniphest] T252503: Create automatically updated CI test environment

2020-09-02 Thread Zbyszko
Zbyszko updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T252503 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Gehel, Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, jijiki

[Wikidata-bugs] [Maniphest] T251096: [WDQS Streaming Updater] Organise module structure

2020-08-28 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T251096 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T248449: [WDQS Streaming Updater] Add error handling for Streaming Updater

2020-08-26 Thread Zbyszko
Zbyszko added a comment. In T248449#6411542 <https://phabricator.wikimedia.org/T248449#6411542>, @dcausse wrote: > In T248449#6382230 <https://phabricator.wikimedia.org/T248449#6382230>, @Zbyszko wrote: > >> We need to decide our approach on possible data c

[Wikidata-bugs] [Maniphest] T248449: [WDQS Streaming Updater] Add error handling for Streaming Updater

2020-08-24 Thread Zbyszko
Zbyszko updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T248449 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: dcausse, Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T251515: Automate data reload for SPARQL Endpoint for Commons

2020-08-18 Thread Zbyszko
Zbyszko added a comment. During the first data reload, the data was for some reason not restored properly. I couldn't find the root cause, so I'm making some small changes to better understand the issue if it happens again. TASK DETAIL https://phabricator.wikimedia.org

[Wikidata-bugs] [Maniphest] T248449: [WDQS Streaming Updater] Add error handling for Streaming Updater

2020-08-17 Thread Zbyszko
Zbyszko updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T248449 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: dcausse, Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T248449: [WDQS Streaming Updater] Add error handling for Streaming Updater

2020-08-17 Thread Zbyszko
Zbyszko updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T248449 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: dcausse, Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T248449: [WDQS Streaming Updater] Add error handling for Streaming Updater

2020-08-13 Thread Zbyszko
Zbyszko added a comment. We need to decide our approach on possible data corruption issues as well. One that comes to mind is a rev create with revid higher than previous page delete. TASK DETAIL https://phabricator.wikimedia.org/T248449 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T259637: Gather information about the volume of queries on WCQS

2020-08-11 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T259637 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Zbyszko, EBernhardson, Aklapper, dcausse, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T259637: Gather information about the volume of queries on WCQS

2020-08-11 Thread Zbyszko
Zbyszko removed Zbyszko as the assignee of this task. TASK DETAIL https://phabricator.wikimedia.org/T259637 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Zbyszko, EBernhardson, Aklapper, dcausse, CBogen, Akuckartz, darthmon_wmde

[Wikidata-bugs] [Maniphest] T259637: Gather information about the volume of queries on WCQS

2020-08-11 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T259637 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: EBernhardson, Aklapper, dcausse, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T258625: Querying WCQS should allow me to use prefixes for MediaInfo items

2020-08-11 Thread Zbyszko
Zbyszko added a comment. Prefixes, as defined in mediainfo dumps, are available in WCQS. TASK DETAIL https://phabricator.wikimedia.org/T258625 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Jheald, dcausse, Aklapper, CBogen

[Wikidata-bugs] [Maniphest] T251515: Automate data reload for SPARQL Endpoint for Commons

2020-08-10 Thread Zbyszko
Zbyszko claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T251515 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: CBogen, Mstyles, dcausse, Zbyszko, Aklapper, Lea_Lacroix_WMDE, Gehel, Alter-paule, Beast1978, Un1tY

[Wikidata-bugs] [Maniphest] T248449: [WDQS Streaming Updater] Add error handling for Streaming Updater

2020-08-10 Thread Zbyszko
Zbyszko updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T248449 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: dcausse, Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T251096: [WDQS Streaming Updater] Organise module structure

2020-08-10 Thread Zbyszko
Zbyszko updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T251096 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T258240: Refactor Options handling in Streaming Updater

2020-08-10 Thread Zbyszko
Zbyszko added a comment. We should do that before the architecture review, so the reviewers can focus on the architecture instead of working out how things that are unimportant (for the architecture) work. In the future, this kind of thing should be done alongside standard development

[Wikidata-bugs] [Maniphest] T258240: Refactor Options handling in Streaming Updater

2020-08-10 Thread Zbyszko
Zbyszko updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T258240 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: Aklapper, Zbyszko, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86

[Wikidata-bugs] [Maniphest] T248449: [WDQS Streaming Updater] Add error handling for Streaming Updater

2020-08-10 Thread Zbyszko
Zbyszko moved this task from Scaling to Current work on the Wikidata-Query-Service board. Zbyszko added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T248449 WORKBOARD https://phabricator.wikimedia.org/project/board/891/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T258240: Refactor Options handling in Streaming Updater

2020-08-10 Thread Zbyszko
Zbyszko moved this task from Scaling to Current work on the Wikidata-Query-Service board. Zbyszko added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T258240 WORKBOARD https://phabricator.wikimedia.org/project/board/891/ EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T259587: Wikibase Turtle RDF dump should only emit the used prefixes in its header

2020-08-10 Thread Zbyszko
Zbyszko added a comment. This doesn't affect any of our current technical solutions and is only about the shape of the data itself - unused prefixes do not affect the update process. TASK DETAIL https://phabricator.wikimedia.org/T259587 EMAIL PREFERENCES https

[Wikidata-bugs] [Maniphest] T251515: Automate data reload for SPARQL Endpoint for Commons

2020-08-10 Thread Zbyszko
Zbyszko set the point value for this task to "2". TASK DETAIL https://phabricator.wikimedia.org/T251515 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko Cc: CBogen, Mstyles, dcausse, Zbyszko, Aklapper, Lea_Lacroix_WMDE, Gehel,
