[Wikidata-bugs] [Maniphest] [Lowered Priority] T219364: Wikidata search lagging behind

2019-03-27 Thread dcausse
dcausse lowered the priority of this task from "Unbreak Now!" to "High". dcausse edited projects, added Discovery-Search (Current work), CirrusSearch, Operations; removed Discovery. dcausse added a comment. Restricted Application edited projects, added Discovery-Search; remov

[Wikidata-bugs] [Maniphest] [Retitled] T219364: Elasticsearch indices went read-only causing huge lag

2019-03-27 Thread dcausse
dcausse renamed this task from "Wikidata search lagging behind" to "Elasticsearch indices went read-only causing huge lag". dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T219364 EMAIL PREFERENCES https://phabricator.wikimedi

[Wikidata-bugs] [Maniphest] [Updated] T219364: Elasticsearch indices went read-only causing huge lag

2019-03-27 Thread dcausse
dcausse edited projects, added Discovery-Search (Current work); removed Discovery-Search. TASK DETAIL https://phabricator.wikimedia.org/T219364 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Lucas_Werkmeister_WMDE, Smalyshev

[Wikidata-bugs] [Maniphest] [Changed Project Column] T219364: Elasticsearch indices went read-only causing huge lag

2019-03-28 Thread dcausse
dcausse moved this task from in progress to Done on the Discovery-Search (Current work) board. dcausse added a comment. The backlog of updates is now completely absorbed and a script has been run to catch up on lost updates; there is nothing we can do at this point except wait for the maint script to stop

[Wikidata-bugs] [Maniphest] [Commented On] T124196: Fatal "cannot perform this operation with arrays" from CirrusSearch/ElasticaWrite (using JobQueueDB)

2019-04-01 Thread dcausse
dcausse added a comment. > E.g. avoid queuing updates of this type or this size (possibly configurable), or run them differently, or to try it as today and then catch/suppress the failure - maybe logging a warning in its stead. Imo the JobQueue should raise an error if it's not

[Wikidata-bugs] [Maniphest] [Updated] T220823: Use ElasticSearch for bulk Wikidata entity term lookup

2019-04-12 Thread dcausse
dcausse edited projects, added Discovery-Search; removed Discovery. TASK DETAIL https://phabricator.wikimedia.org/T220823

[Wikidata-bugs] [Maniphest] [Commented On] T206613: Search of wikidata string property values using haswbstatement is case sensitive

2019-06-03 Thread dcausse
dcausse added a comment. @Smalyshev switching the main field for statements to `lowercase_keyword` won't break anything: it's like a new field and will be taken into account right after the next reindex. I would advise against a new field here; the cardinality would nearly doubl

[Wikidata-bugs] [Maniphest] [Commented On] T206613: Search of wikidata string property values using haswbstatement is case sensitive

2019-06-03 Thread dcausse
dcausse added a comment. we should also note we index this data in the main filter field which means that for searches that are unlikely to be ambiguous (IDs and such) one could simply search for 10.1371/journal.pcbi.1002947 <https://www.wikidata.org/w/index.php?search=10.1371/journal.p

[Wikidata-bugs] [Maniphest] [Commented On] T206613: Search of wikidata string property values using haswbstatement is case sensitive

2019-06-04 Thread dcausse
dcausse added a comment. @Smalyshev I totally agree, I was suggesting a UX where a first attempt search would try to match using the haswbstatement keyword (switched to case insensitive) and then a second try could be made using the fulltext mode if the first attempt is unsuccessful. TASK
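The two-step flow suggested in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: `keyword_search` and `fulltext_search` are hypothetical callables standing in for the real search backends, `search_with_fallback` is not an existing CirrusSearch or Wikibase function, and the item IDs are made up.

```python
from typing import Callable, List

SearchFn = Callable[[str], List[str]]


def search_with_fallback(
    prop: str,
    value: str,
    keyword_search: SearchFn,
    fulltext_search: SearchFn,
) -> List[str]:
    """Try a haswbstatement keyword query first, then fall back to fulltext."""
    # First attempt: statement match via the haswbstatement keyword
    # (assumed here to already be case insensitive on the backend side).
    results = keyword_search(f"haswbstatement:{prop}={value}")
    if results:
        return results
    # Second attempt: plain fulltext search on the raw value.
    return fulltext_search(value)


# Toy backends for demonstration only (hypothetical data).
def fake_keyword(query: str) -> List[str]:
    return ["Q1"] if query == "haswbstatement:P356=10.1000/abc" else []


def fake_fulltext(query: str) -> List[str]:
    return ["Q2", "Q3"]


first = search_with_fallback("P356", "10.1000/abc", fake_keyword, fake_fulltext)
second = search_with_fallback("P356", "unknown", fake_keyword, fake_fulltext)
```

The fulltext query is only issued when the keyword attempt returns nothing, so unambiguous identifier lookups stay cheap.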

[Wikidata-bugs] [Maniphest] [Merged] T215615: Stop using negative scores for deboosting statements

2019-06-04 Thread dcausse
dcausse closed this task as a duplicate of T209859: Wikidata autocomplete (wbsearchentities) results with score <= 0. TASK DETAIL https://phabricator.wikimedia.org/T215615

[Wikidata-bugs] [Maniphest] [Merged] T209859: Wikidata autocomplete (wbsearchentities) results with score <= 0

2019-06-04 Thread dcausse
dcausse merged a task: T215615: Stop using negative scores for deboosting statements. TASK DETAIL https://phabricator.wikimedia.org/T209859

[Wikidata-bugs] [Maniphest] [Changed Status] T202254: Use ExtensionRegistry instead of class_exists to check for CirrusSearch in Wikibase

2019-06-20 Thread dcausse
dcausse changed the task status from "Stalled" to "Open". TASK DETAIL https://phabricator.wikimedia.org/T202254

[Wikidata-bugs] [Maniphest] [Claimed] T186037: Need mvn build mode that does not build gui

2019-07-16 Thread dcausse
dcausse claimed this task. dcausse moved this task from Backlog to In progress on the Discovery-Wikidata-Query-Service-Sprint board. TASK DETAIL https://phabricator.wikimedia.org/T186037 WORKBOARD https://phabricator.wikimedia.org/project/board/1239/

[Wikidata-bugs] [Maniphest] [Updated] T186037: Need mvn build mode that does not build gui

2019-07-16 Thread dcausse
dcausse added a project: Discovery-Wikidata-Query-Service-Sprint. TASK DETAIL https://phabricator.wikimedia.org/T186037

[Wikidata-bugs] [Maniphest] [Commented On] T186037: Need mvn build mode that does not build gui

2019-07-17 Thread dcausse
dcausse added a comment. We could also use `mvn -pl -gui`, which does not require any changes. TASK DETAIL https://phabricator.wikimedia.org/T186037

[Wikidata-bugs] [Maniphest] [Commented On] T173248: Convert blank nodes to “unknown value”

2019-07-17 Thread dcausse
dcausse added a comment. I see that the response is t1514691780 t1514691780 Would that

[Wikidata-bugs] [Maniphest] [Created] T229329: WDQS Updater: java.lang.StringIndexOutOfBoundsException: String index out of range: -8

2019-07-30 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION #logback.classic pattern: %d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg %mdc%n

[Wikidata-bugs] [Maniphest] [Edited] T229329: WDQS Updater: java.lang.StringIndexOutOfBoundsException: String index out of range: -8

2019-07-30 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T229329

[Wikidata-bugs] [Maniphest] [Commented On] T229329: WDQS Updater: java.lang.StringIndexOutOfBoundsException: String index out of range: -8

2019-07-30 Thread dcausse
dcausse added a comment. If `uris.entity().length()` is greater than `entityId.length()` by 8 chars, it'll cause this exception. Since it's a test server, it's perhaps misconfigured. TASK DETAIL https://phabricator.wikimedia.org/T229329
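To illustrate how a length mismatch of 8 characters can produce the `-8` in the message, here is a small Python model of the bounds check performed by Java 8's `String.substring(int)`. That the updater strips the entity prefix with `substring` is an assumption, not something stated in the task; `java_substring` and both example URIs are illustrative only.

```python
def java_substring(s: str, begin_index: int) -> str:
    """Mimic the bounds check of Java 8's String.substring(int beginIndex)."""
    if begin_index < 0:
        raise IndexError(f"String index out of range: {begin_index}")
    sub_len = len(s) - begin_index
    if sub_len < 0:
        # Java 8 reports the negative remaining length in the message.
        raise IndexError(f"String index out of range: {sub_len}")
    return s[begin_index:]


# Assumed configuration: the expected entity prefix is longer than the
# entity URI actually received (hypothetical values).
entity_prefix = "http://www.wikidata.org/entity/"
entity_id = "http://test.example/Q42"

try:
    java_substring(entity_id, len(entity_prefix))
    message = None
except IndexError as exc:
    message = str(exc)
```

With a matching prefix the same call simply returns the local ID, which is consistent with the misconfiguration hypothesis for the test server.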

[Wikidata-bugs] [Maniphest] [Created] T240334: \Wikibase\EntityContent::getTextForSearchIndex no longer includes textual properties

2019-12-10 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata, CirrusSearch. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Discovery-Search. TASK DESCRIPTION It appears that all textual properties are being removed from the indexed text search

[Wikidata-bugs] [Maniphest] [Triaged] T240334: \Wikibase\EntityContent::getTextForSearchIndex no longer includes textual properties

2019-12-10 Thread dcausse
dcausse triaged this task as "High" priority. TASK DETAIL https://phabricator.wikimedia.org/T240334

[Wikidata-bugs] [Maniphest] [Updated] T240334: \Wikibase\EntityContent::getTextForSearchIndex no longer includes textual properties

2019-12-10 Thread dcausse
dcausse added a project: Regression. TASK DETAIL https://phabricator.wikimedia.org/T240334

[Wikidata-bugs] [Maniphest] [Retitled] T240334: Evaluate adding all/more textual properties to the text field

2019-12-10 Thread dcausse
dcausse renamed this task from "\Wikibase\EntityContent::getTextForSearchIndex no longer includes textual properties" to "Evaluate adding all/more textual properties to the text field". dcausse lowered the priority of this task from "High" to "Medium"

[Wikidata-bugs] [Maniphest] [Edited] T240334: Evaluate adding all/more textual properties to the text field

2019-12-10 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T240334

[Wikidata-bugs] [Maniphest] [Retitled] T240334: Evaluate adding all/some textual properties to the text field

2019-12-10 Thread dcausse
dcausse renamed this task from "Evaluate adding all/more textual properties to the text field" to "Evaluate adding all/some textual properties to the text field". TASK DETAIL https://phabricator.wikimedia.org/T240334

[Wikidata-bugs] [Maniphest] [Closed] T239898: Investigate triple counts difference between dumps and what blazegraph reports

2019-12-10 Thread dcausse
dcausse closed this task as "Invalid". dcausse added a comment. I properly recounted (using an RDF parser) the triples in the dump after the munge operation and found 8.9B triples, closing as invalid. TASK DETAIL https://phabricator.wikimedia.org/T239898

[Wikidata-bugs] [Maniphest] [Commented On] T240328: Slow indexing for wbsearchentities

2019-12-10 Thread dcausse
dcausse added a comment. Could you specify what search string you are using? `wbsearchentities` should be using the MySQL database when searching by entity ID, so the lag should be relatively small. On the other hand, the search index will take some time to update (job queue lag + elasticsearch

[Wikidata-bugs] [Maniphest] [Unassigned] T105427: Need a way for WDQS updater to become aware of suppressed deletes

2019-12-10 Thread dcausse
dcausse removed Smalyshev as the assignee of this task. TASK DETAIL https://phabricator.wikimedia.org/T105427

[Wikidata-bugs] [Maniphest] [Updated] T240453: EPIC: Improve completion search on wikidata

2019-12-11 Thread dcausse
dcausse added a project: Wikidata. dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T240453

[Wikidata-bugs] [Maniphest] [Updated] T240328: Slow indexing for wbsearchentities

2019-12-11 Thread dcausse
dcausse added a comment. @Fnielsen thanks for letting us know, if search by entity ID is slow again please re-open this issue with a link to the entity you created so that we can correlate with the metrics we monitor. For label search we are currently experiencing recurrent lag on the

[Wikidata-bugs] [Maniphest] [Claimed] T239338: Manually purge obsolete entites from WDQS

2019-12-11 Thread dcausse
dcausse claimed this task. dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T239338

[Wikidata-bugs] [Maniphest] [Triaged] T239338: Manually purge obsolete entites from WDQS

2019-12-11 Thread dcausse
dcausse triaged this task as "Medium" priority. TASK DETAIL https://phabricator.wikimedia.org/T239338

[Wikidata-bugs] [Maniphest] [Closed] T239338: Manually purge obsolete entites from WDQS

2019-12-11 Thread dcausse
dcausse closed this task as "Resolved". TASK DETAIL https://phabricator.wikimedia.org/T239338

[Wikidata-bugs] [Paste] [Updated] P9859: Number of blank nodes used as object and grouped by predicate (wdqs2006)

2019-12-11 Thread dcausse
dcausse changed the title of this paste from "Blank node grouped by predicate (wdqs2006)" to "Number of blank nodes used as object and grouped by predicate (wdqs2006)". dcausse added a project: Wikidata-Query-Service. PASTE DETAIL https://phabricator.wikimedia.org/P985

[Wikidata-bugs] [Maniphest] [Updated] T239414: Investigate how blank nodes are used and synced between wikibase and wdqs

2019-12-11 Thread dcausse
dcausse added a comment. P9859 <https://phabricator.wikimedia.org/P9859> contains the output of `select ?p (count(*) as ?cnt) { ?s ?p ?o . filter (isBlank(?o)) } group by ?p`, run on wdqs2006. Will run the same query but with a filter on the subject as

[Wikidata-bugs] [Maniphest] [Updated] T239414: Investigate how blank nodes are used and synced between wikibase and wdqs

2019-12-12 Thread dcausse
dcausse added a comment. The output of `select ?p (count(*) as ?cnt) { ?s ?p ?o . filter (isBlank(?s)) } group by ?p` is at P9862 <https://phabricator.wikimedia.org/P9862> and, as expected, we only see the corresponding subjects of the owl constraint on `owl:comple
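For readers less familiar with SPARQL, the two queries in this thread (blank nodes in object position, then in subject position, grouped by predicate) can be mimicked over an in-memory list of triples. This is a toy model only: the sample triples are invented, and blank nodes are assumed to be written with the Turtle-style `_:` label prefix.

```python
from collections import Counter
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]


def is_blank(term: str) -> bool:
    # Turtle/N-Triples style blank node labels start with "_:".
    return term.startswith("_:")


def count_blank_by_predicate(triples: Iterable[Triple], position: str) -> Counter:
    """Count triples per predicate whose subject or object is a blank node."""
    index = {"subject": 0, "object": 2}[position]
    return Counter(t[1] for t in triples if is_blank(t[index]))


# Hypothetical sample data, not taken from wdqs2006.
sample = [
    ("wd:Q1", "wdt:P2", "_:b0"),
    ("wd:Q2", "wdt:P2", "_:b1"),
    ("wdno:P109", "owl:complementOf", "_:b2"),
    ("_:b2", "rdf:type", "owl:Restriction"),
    ("wd:Q3", "wdt:P2", "wd:Q4"),
]

as_object = count_blank_by_predicate(sample, "object")
as_subject = count_blank_by_predicate(sample, "subject")
```

In this toy data the only blank subject is the restriction node of the OWL constraint, which mirrors the observation in the comment.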

[Wikidata-bugs] [Maniphest] [Unassigned] T239908: Extract more metrics from blazegraph sparql update response

2019-12-12 Thread dcausse
dcausse removed Zbyszko as the assignee of this task. dcausse added a subscriber: Zbyszko. dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T239908

[Wikidata-bugs] [Maniphest] [Updated] T239908: Extract more metrics from blazegraph sparql update response

2019-12-12 Thread dcausse
dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T239908

[Wikidata-bugs] [Maniphest] [Assigned] T239908: Extract more metrics from blazegraph sparql update response

2019-12-12 Thread dcausse
dcausse assigned this task to Zbyszko. TASK DETAIL https://phabricator.wikimedia.org/T239908

[Wikidata-bugs] [Maniphest] [Triaged] T239908: Extract more metrics from blazegraph sparql update response

2019-12-12 Thread dcausse
dcausse triaged this task as "Medium" priority. TASK DETAIL https://phabricator.wikimedia.org/T239908

[Wikidata-bugs] [Maniphest] [Commented On] T238002: WDQS Munger should be multi threaded

2019-12-12 Thread dcausse
dcausse added a comment. Separating parsing, munging, and writing into multiple threads nearly doubled the speed of the munger. old: real 1371m34.618s, user 1854m48.672s, sys 24m44.480s; new: real 731m20.495s, user 1798m42.176s, sys
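As a quick check of the timings quoted above, the wall-clock speedup can be computed from the `real` values, and the `user` values show that total CPU work stayed roughly the same. The helper name `to_seconds` is just for this sketch.

```python
def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a `time`-style XmY.YYYs value to seconds."""
    return minutes * 60 + seconds


old_real = to_seconds(1371, 34.618)
new_real = to_seconds(731, 20.495)
old_user = to_seconds(1854, 48.672)
new_user = to_seconds(1798, 42.176)

# Wall-clock speedup of the multi-threaded munger over the old one.
speedup = old_real / new_real
# Ratio of CPU time consumed; close to 1 means the same work, run in parallel.
cpu_ratio = old_user / new_user

print(f"wall-clock speedup: {speedup:.2f}x, user CPU ratio: {cpu_ratio:.2f}")
```

The wall-clock time drops by a factor of about 1.9 while user CPU time changes by only a few percent, which is what one would expect from splitting the pipeline stages across threads.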

[Wikidata-bugs] [Maniphest] [Claimed] T239750: org.wikidata.query.rdf.tool.Updater - Importer error: ConcurrentModificationException: KafkaConsumer is not safe for multi-threaded access

2019-12-16 Thread dcausse
dcausse claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T239750

[Wikidata-bugs] [Maniphest] [Updated] T240540: Investigate usage of the query service & queries that are run

2019-12-17 Thread dcausse
dcausse added a comment. Also T239852 <https://phabricator.wikimedia.org/T239852> TASK DETAIL https://phabricator.wikimedia.org/T240540

[Wikidata-bugs] [Maniphest] [Created] T241125: Import wikidata RDF dump to hadoop

2019-12-19 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION We currently have no easy way to run large scale analysis on the wikidata graph. WDQS and

[Wikidata-bugs] [Maniphest] [Created] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2019-12-19 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION Tracking task to collect all the efforts made in this direction. | start | dump

[Wikidata-bugs] [Maniphest] [Edited] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2019-12-19 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T241128

[Wikidata-bugs] [Maniphest] [Edited] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2019-12-19 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T241128

[Wikidata-bugs] [Maniphest] [Edited] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2019-12-19 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T241128

[Wikidata-bugs] [Maniphest] [Commented On] T241213: Organize and improve integration test coverage for WDQS Updater

2019-12-20 Thread dcausse
dcausse added a comment. The most annoying integration test (and probably slowest) is org.wikidata.query.rdf.tool.wikibase.WikibaseRepositoryIntegrationTest: - it generates anonymous edits to test.wikidata.org in order to test the RecentChange api - Concurrent runs of this test will

[Wikidata-bugs] [Maniphest] [Triaged] T240453: EPIC: Improve completion search on wikidata

2020-01-02 Thread dcausse
dcausse triaged this task as "Medium" priority. TASK DETAIL https://phabricator.wikimedia.org/T240453

[Wikidata-bugs] [Maniphest] [Edited] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2020-01-06 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T241128

[Wikidata-bugs] [Maniphest] [Created] T242453: wdqs1005 stopped to handle updates

2020-01-10 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION Apparently a deadlock inside blazegraph itself: Found one Java-level deadlock

[Wikidata-bugs] [Maniphest] [Created] T242640: query/wikidata/gui jenkins build broken

2020-01-13 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION Seen on https://gerrit.wikimedia.org/r/c/wikidata/query/gui/+/564056 17:41:06 + npm install

[Wikidata-bugs] [Maniphest] [Triaged] T242640: query/wikidata/gui jenkins build broken

2020-01-13 Thread dcausse
dcausse triaged this task as "High" priority. TASK DETAIL https://phabricator.wikimedia.org/T242640

[Wikidata-bugs] [Maniphest] [Updated] T242640: query/wikidata/gui jenkins build broken

2020-01-13 Thread dcausse
dcausse added a comment. very similar to T242587 <https://phabricator.wikimedia.org/T242587> TASK DETAIL https://phabricator.wikimedia.org/T242640

[Wikidata-bugs] [Maniphest] [Updated] T242453: wdqs1005 stopped to handle updates

2020-01-16 Thread dcausse
dcausse added a comment. icinga check showed: `CHECK_NRPE STATE UNKNOWN: Socket timeout after 10 seconds.` for `Query Service HTTP Port` and `NaN` for `WDQS high update lag`. We should probably alert in case of timeouts. Stackdumps from blazegraph: P10185 <ht

[Wikidata-bugs] [Maniphest] [Updated] T243270: Test commons RDF dumps on sdcquery.wmflabs.org

2020-01-21 Thread dcausse
dcausse added projects: Wikidata-Query-Service, Discovery-Search (Current work). Restricted Application added a project: Wikidata. TASK DETAIL https://phabricator.wikimedia.org/T243270

[Wikidata-bugs] [Maniphest] [Created] T243292: Fix the munger to support commons RDF dump

2020-01-21 Thread dcausse
dcausse created this task. dcausse added a project: Wikidata-Query-Service. Restricted Application added a project: Wikidata. TASK DESCRIPTION When trying to munge the dumps the process is filtering many triples saying: 15:03:28.962 [org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler

[Wikidata-bugs] [Maniphest] [Created] T243431: Grant more rights to wikidata/query/rdf for the group wikidata/query (similar to search)

2020-01-22 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata-Query-Service, Gerrit-Privilege-Requests. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION In order to use the mvn release plugin on `wikidata/query/service` we

[Wikidata-bugs] [Maniphest] [Updated] T243431: Grant more rights to wikidata/query/rdf for the group wikidata/query (similar to search)

2020-01-23 Thread dcausse
dcausse added a project: Release-Engineering-Team. TASK DETAIL https://phabricator.wikimedia.org/T243431

[Wikidata-bugs] [Maniphest] [Retitled] T243431: Grant more rights to wikidata/query/rdf for the group wikidata-query (similar to search)

2020-01-23 Thread dcausse
dcausse renamed this task from "Grant more rights to wikidata/query/rdf for the group wikidata/query (similar to search)" to "Grant more rights to wikidata/query/rdf for the group wikidata-query (similar to search)". TASK DETAIL https://phabricator.wikimedia.org/T24343

[Wikidata-bugs] [Maniphest] [Created] T244341: Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints

2020-02-05 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata, Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. TASK DESCRIPTION The use of blank nodes makes an update process always a challenging operation (http://www.aidanhogan.com/docs/blank_nodes_jws.pdf). The use

[Wikidata-bugs] [Maniphest] [Commented On] T244341: Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints

2020-02-05 Thread dcausse
dcausse added a comment. In T244341#5852014 <https://phabricator.wikimedia.org/T244341#5852014>, @Lucas_Werkmeister_WMDE wrote: > If the problem is just the blank nodes themselves, why not use this new `wdunk:P2` in the same way, as in `wd:Q3 wdt:P2 wdunk:P2`? That’s still wo

[Wikidata-bugs] [Maniphest] [Triaged] T221709: scap service restarts for WDQS are inconsistent

2020-02-06 Thread dcausse
dcausse triaged this task as "High" priority. TASK DETAIL https://phabricator.wikimedia.org/T221709

[Wikidata-bugs] [Maniphest] [Commented On] T244341: Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints

2020-02-06 Thread dcausse
dcausse added a comment. Yes, the issue with blank nodes is that they are not "reference-able", so point delete queries, which are what we want to achieve with the next gen updater, are impossible. I did some tests and isBlank is a lot faster (I suppose because this info

[Wikidata-bugs] [Maniphest] [Edited] T244341: Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints

2020-02-07 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T244341

[Wikidata-bugs] [Maniphest] [Edited] T244341: Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints

2020-02-07 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T244341

[Wikidata-bugs] [Maniphest] [Edited] T244341: Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints

2020-02-07 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T244341

[Wikidata-bugs] [Maniphest] [Created] T244590: EPIC: Rework the WDQS updater as an event driven application

2020-02-07 Thread dcausse
dcausse created this task. dcausse added projects: Epic, Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: Wikidata. TASK DESCRIPTION The current merging strategy for applying updates requires sending all the entity data on

[Wikidata-bugs] [Maniphest] [Commented On] T244341: Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints

2020-02-17 Thread dcausse
dcausse added a comment. Thanks for all the feedback. I'll discard the "constant" option. A note on the motivations: we plan to redesign the update process as a set of trivial mutations to the graph, as far as I can see updating a graph with blank nodes cannot

[Wikidata-bugs] [Maniphest] [Retitled] T244341: Wikibase RDF dump: stop using blank nodes for encoding SomeValue and OWL constraints

2020-02-17 Thread dcausse
dcausse renamed this task from "Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints" to "Wikibase RDF dump: stop using blank nodes for encoding SomeValue and OWL constraints". dcausse updated the task description.

[Wikidata-bugs] [Maniphest] [Updated] T203397: Provide more useful redirect for statement nodes (wds:…)

2020-02-17 Thread dcausse
dcausse added a project: Discovery-Search (Current work). dcausse added a comment. @Lea_Lacroix_WMDE no, we just need to deploy it, sorry for the delay. TASK DETAIL https://phabricator.wikimedia.org/T203397

[Wikidata-bugs] [Maniphest] [Commented On] T196165: Commons image: when pasting the exact title, get the correct file first in the suggester

2020-02-17 Thread dcausse
dcausse added a comment. I believe that because the file name has many words, the score on the tokenized text fields is very high (since we sum all token scores); the exact match has only one word, so despite its high weight its score is not enough to compete with the lo

[Wikidata-bugs] [Maniphest] [Commented On] T244341: Wikibase RDF dump: stop using blank nodes for encoding SomeValue and OWL constraints

2020-02-18 Thread dcausse
dcausse added a comment. In T244341#5890517 <https://phabricator.wikimedia.org/T244341#5890517>, @Lucas_Werkmeister_WMDE wrote: >> I haven't checked but I hope that at most one blank node can be attached to the same subject/predicate, if not this makes the sync alg

[Wikidata-bugs] [Maniphest] [Commented On] T244590: EPIC: Rework the WDQS updater as an event driven application

2020-02-18 Thread dcausse
dcausse added a comment. In T244590#5893018 <https://phabricator.wikimedia.org/T244590#5893018>, @Ottomata wrote: > COOL! :) > >> it's important to note that the state of step 3 is tightly coupled with its dump and thus we will have to instantiate a new stream

[Wikidata-bugs] [Maniphest] [Commented On] T244341: Wikibase RDF dump: stop using blank nodes for encoding SomeValue and OWL constraints

2020-02-18 Thread dcausse
dcausse added a comment. To move this forward I propose the following plan: 1. add a `wikibase:isSomeValue` custom function configurable to work as a proxy to `isBlank()` or `STRSTARTS( STR(?o), 'http://www.wikidata.org/prop/somevalue/' )` and announce it 2. instead of ch
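Step 1 of this plan, a single predicate that can be switched between the two detection strategies, might look roughly like the sketch below. It is written in Python purely for illustration; the real function would be a SPARQL custom function in the query service, and the `Term` class and the `use_blank_nodes` switch are hypothetical names.

```python
from dataclasses import dataclass

SOMEVALUE_PREFIX = "http://www.wikidata.org/prop/somevalue/"


@dataclass(frozen=True)
class Term:
    """Minimal stand-in for an RDF term: a blank node label or an IRI string."""
    value: str
    is_blank: bool = False


def is_some_value(term: Term, use_blank_nodes: bool) -> bool:
    """Proxy to either isBlank() or a STRSTARTS() check on the IRI."""
    if use_blank_nodes:
        # Legacy mode: "unknown value" is encoded as a blank node.
        return term.is_blank
    # New mode: "unknown value" is an IRI under the somevalue prefix.
    # STR() of a blank node is an error in SPARQL, so blank nodes never match.
    return (not term.is_blank) and term.value.startswith(SOMEVALUE_PREFIX)


blank = Term("b0", is_blank=True)
placeholder = Term(SOMEVALUE_PREFIX + "abc123")
regular = Term("http://www.wikidata.org/entity/Q42")
```

Queries written against such a function would keep working across the transition, since only the configured mode changes.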

[Wikidata-bugs] [Maniphest] [Created] T245533: Add a custom wikibase:isSomeValue() function

2020-02-18 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata, Wikidata-Query-Service. TASK DESCRIPTION In order to allow a "smooth" transition from blank nodes to IRI placeholders the `wikibase:isSomeValue` function will be added to the set of custom functions offered by the //que

[Wikidata-bugs] [Maniphest] [Created] T245541: Add a new munge option to do blank node skolemization

2020-02-18 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata, Wikidata-Query-Service. TASK DESCRIPTION This munge option will transform all blank nodes into placeholder IRIs using the following rules: wdno:P109 a owl:Class ; owl:complementOf _:1 . _:1 a owl:Restriction

[Wikidata-bugs] [Maniphest] [Commented On] T244341: Wikibase RDF dump: stop using blank nodes for encoding SomeValue and OWL constraints

2020-02-18 Thread dcausse
dcausse added a comment. In T244341#5893723 <https://phabricator.wikimedia.org/T244341#5893723>, @Lucas_Werkmeister_WMDE wrote: > Well, I’d like to see what the IRIs for unknown value in qualifiers and references look like before we move ahead with this plan. Sure, I tri

[Wikidata-bugs] [Maniphest] [Edited] T245533: Add a custom wikibase:isSomeValue() function

2020-02-18 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T245533 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic

[Wikidata-bugs] [Maniphest] [Edited] T245533: Add a custom function to identify wikibase "somevalue"

2020-02-19 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T245533 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Lucas_Werkmeister_WMDE, Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic

[Wikidata-bugs] [Maniphest] [Retitled] T245533: Add a custom function to identify wikibase "somevalue"

2020-02-19 Thread dcausse
dcausse renamed this task from "Add a custom wikibase:isSomeValue() function" to "Add a custom function to identify wikibase "somevalue"". dcausse added a subscriber: Lucas_Werkmeister_WMDE. dcausse updated the task description. TASK DETAIL https://phabricato

[Wikidata-bugs] [Maniphest] [Changed Project Column] T239687: Rework how value and reference changes are handled

2020-02-19 Thread dcausse
dcausse moved this task from In Progress to Done on the Discovery-Search (Current work) board. dcausse added a comment. The munger has been reworked so that it does not deal with this cleanup. The next gen updater will address this cleanup in a different way. For the current updater one

[Wikidata-bugs] [Maniphest] [Updated] T241125: Import wikidata RDF dump to hadoop

2020-02-19 Thread dcausse
dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T241125 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Daniel_Mietchen, Aklapper, dcausse, JAllemandou, darthmon_wmde, Nandana, Lahi

[Wikidata-bugs] [Maniphest] [Changed Project Column] T239908: Extract more metrics from blazegraph sparql update response

2020-02-19 Thread dcausse
dcausse moved this task from To Be Deployed to Done on the Discovery-Search (Current work) board. dcausse added a comment. Dashboard created here: https://grafana.wikimedia.org/d/dSksY08Zk/wikidata-query-service-updater?orgId=1 TASK DETAIL https://phabricator.wikimedia.org/T239908

[Wikidata-bugs] [Maniphest] [Claimed] T203397: Provide more useful redirect for statement nodes (wds:…)

2020-02-19 Thread dcausse
dcausse claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T203397 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: dcausse, Lea_Lacroix_WMDE, Lucas_Werkmeister_WMDE, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86

[Wikidata-bugs] [Maniphest] [Merged] T244590: EPIC: Rework the WDQS updater as an event driven application

2020-02-19 Thread dcausse
dcausse merged a task: T229544: Create RDF diff for WDQS updating. dcausse added subscribers: Smalyshev, Iamamz3. TASK DETAIL https://phabricator.wikimedia.org/T244590 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Iamamz3, Smalyshev

[Wikidata-bugs] [Maniphest] [Updated] T229544: Create RDF diff for WDQS updating

2020-02-19 Thread dcausse
dcausse closed this task as a duplicate of T244590: EPIC: Rework the WDQS updater as an event driven application. TASK DETAIL https://phabricator.wikimedia.org/T229544 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: Iamamz3, Aklapper

[Wikidata-bugs] [Maniphest] [Updated] T244341: Wikibase RDF dump: stop using blank nodes for encoding SomeValue and OWL constraints

2020-02-19 Thread dcausse
dcausse added a comment. @Lucas_Werkmeister_WMDE thanks! Indeed this becomes a bit more challenging as the statement identifier alone cannot be used to identify a bnode under a particular statement. I'll continue to discuss this specific issue in T245541 &

[Wikidata-bugs] [Maniphest] [Retitled] T231515: Duplicate blank nodes on edited properties

2020-02-20 Thread dcausse
dcausse renamed this task from "Duplicate wdno: clauses on edited properties" to "Duplicate blank nodes on edited properties". dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T231515 EMAIL PREFERENCES https://phabricator.wikimedi

[Wikidata-bugs] [Maniphest] [Updated] T245541: Add a new munge option to do blank node skolemization

2020-02-20 Thread dcausse
dcausse added a comment. In https://www.wikidata.org/wiki/Q4115189#Q4115189$7d68afee-408d-1c1e-946b-43d8d37a17b5 @Lucas_Werkmeister_WMDE added more "somevalue" to the graph (references and qualifiers) which outputs the following graph: wd:Q4115189 p:P370 s:Q4115189-7d68afee

[Wikidata-bugs] [Maniphest] [Created] T245727: Create a streaming-updater submodule under query/wikidata/rdf

2020-02-20 Thread dcausse
dcausse created this task. dcausse added projects: Epic, Wikidata-Query-Service, Wikidata. TASK DESCRIPTION Using Flink and Scala, ideally with a small test case. TASK DETAIL https://phabricator.wikimedia.org/T245727 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel

[Wikidata-bugs] [Maniphest] [Created] T245728: Add a component to generate a diff between two entity revisions

2020-02-20 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata-Query-Service, Wikidata. TASK DESCRIPTION This component will take the lists of triples of an entity at revisions X and Y and generate a diff between the two. The diff should be the list of triples to add and the ones to delete. For
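The diff described in this task can be sketched as two set differences. This is only an illustration under simplifying assumptions: triples are plain `(subject, predicate, object)` string tuples, and the names `TripleDiff` and `diff_revisions` are hypothetical. It ignores blank nodes, which cannot be compared by label across revisions, and any handling the truncated task description may go on to specify.

```python
from typing import Iterable, NamedTuple, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleDiff(NamedTuple):
    to_add: Set[Triple]     # present at revision Y but not at X
    to_delete: Set[Triple]  # present at revision X but not at Y


def diff_revisions(rev_x: Iterable[Triple], rev_y: Iterable[Triple]) -> TripleDiff:
    """Compute the triples to add and delete to go from revision X to Y."""
    old, new = set(rev_x), set(rev_y)
    return TripleDiff(to_add=new - old, to_delete=old - new)


if __name__ == "__main__":
    x = [("wd:Q1", "rdfs:label", '"old"'), ("wd:Q1", "wdt:P31", "wd:Q5")]
    y = [("wd:Q1", "rdfs:label", '"new"'), ("wd:Q1", "wdt:P31", "wd:Q5")]
    diff = diff_revisions(x, y)
    print(sorted(diff.to_add))
    print(sorted(diff.to_delete))
```

Triples common to both revisions appear in neither set, so unchanged statements generate no writes.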

[Wikidata-bugs] [Maniphest] [Updated] T245727: Create a streaming-updater submodule under query/wikidata/rdf

2020-02-20 Thread dcausse
dcausse removed a project: Epic. TASK DETAIL https://phabricator.wikimedia.org/T245727 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Zbyszko, dcausse Cc: Gehel, Zbyszko, Aklapper, JAllemandou, Ottomata, Smalyshev, Iamamz3, dcausse, darthmon_wmde

[Wikidata-bugs] [Maniphest] [Created] T246237: Extract some statistics on the use of the isBlank() function in wdqs query logs

2020-02-26 Thread dcausse
dcausse created this task. dcausse added projects: Wikidata, Wikidata-Query-Service. TASK DESCRIPTION It would be nice to have an idea of the percentage of queries that use the `isBlank` function. It might be interesting to know if we can identify tools using this function in order to contact

[Wikidata-bugs] [Maniphest] [Edited] T246237: Extract some statistics on the use of the isBlank() function in wdqs query logs

2020-02-26 Thread dcausse
dcausse updated the task description. TASK DETAIL https://phabricator.wikimedia.org/T246237 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: dcausse Cc: JAllemandou, Aklapper, Lucas_Werkmeister_WMDE, dcausse, darthmon_wmde, Nandana, Lahi, Gq86

[Wikidata-bugs] [Maniphest] [Assigned] T246238: Investigate common qualifiers for “unknown value” statement main snaks

2020-02-27 Thread dcausse
dcausse assigned this task to JAllemandou. dcausse added a subscriber: JAllemandou. dcausse added a comment. @JAllemandou did some work and could extract some numbers from a dump imported to hadoop: SELECT ?property (COUNT(*) AS ?count) WHERE { ?statement ps:P20 ?unknown

[Wikidata-bugs] [Maniphest] [Updated] T246238: Investigate common qualifiers for “unknown value” statement main snaks

2020-02-27 Thread dcausse
dcausse added a project: Discovery-Search (Current work). TASK DETAIL https://phabricator.wikimedia.org/T246238 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: JAllemandou, dcausse Cc: JAllemandou, Lea_Lacroix_WMDE, Gehel, Aklapper, dcausse

[Wikidata-bugs] [Maniphest] [Assigned] T246237: Extract some statistics on the use of the isBlank() function in wdqs query logs

2020-02-27 Thread dcausse
dcausse assigned this task to JAllemandou. dcausse added a project: Discovery-Search (Current work). dcausse added a subscriber: Lea_Lacroix_WMDE. dcausse added a comment. @Lea_Lacroix_WMDE the use of `isBlank` seems pretty low. Do you think we should still try to identify bots by grouping by

[Wikidata-bugs] [Maniphest] [Triaged] T246237: Extract some statistics on the use of the isBlank() function in wdqs query logs

2020-02-27 Thread dcausse
dcausse triaged this task as "Medium" priority. TASK DETAIL https://phabricator.wikimedia.org/T246237 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: JAllemandou, dcausse Cc: Lea_Lacroix_WMDE, JAllemandou, Aklapper, Lucas_Werkmeister_WMD
