[Wikidata-bugs] [Maniphest] [Lowered Priority] T219364: Wikidata search lagging behind

2019-03-27 Thread dcausse
dcausse lowered the priority of this task from "Unbreak Now!" to "High".
dcausse edited projects, added Discovery-Search (Current work), CirrusSearch, 
Operations; removed Discovery.
dcausse added a comment.
Restricted Application edited projects, added Discovery-Search; removed 
Discovery-Search (Current work).


  The backlog of updates is being processed. Once we have caught up on these 
updates, we will run a maintenance script to reindex the lost updates.
  Lowering to High as the immediate actions have been taken. It may now take a 
few days to fully sync the index and the database for the affected wikis.

TASK DETAIL
  https://phabricator.wikimedia.org/T219364

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Lucas_Werkmeister_WMDE, Smalyshev, Lea_Lacroix_WMDE, Gehel, dcausse, 
TerraCodes, Liuxinyu970226, Aklapper, Addshore, alaa_wmde, Legado_Shulgin, 
Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, 
GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, LawExplorer, Zppix, 
_jensen, rosalieper, Wong128hk, Wikidata-bugs, aude, jayvdb, faidon, Mbch331, 
Jay8g, fgiunchedi, jeremyb, ET4Eva, Darkminds3113, Avner, FloNight
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Retitled] T219364: Elasticsearch indices went read-only causing huge lag

2019-03-27 Thread dcausse
dcausse renamed this task from "Wikidata search lagging behind" to 
"Elasticsearch indices went read-only causing huge lag".
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T219364

To: dcausse
Cc: Lucas_Werkmeister_WMDE, Smalyshev, Lea_Lacroix_WMDE, Gehel, dcausse, 
TerraCodes, Liuxinyu970226, Aklapper, Addshore, alaa_wmde, Legado_Shulgin, 
Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, 
GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, LawExplorer, Zppix, 
_jensen, rosalieper, Wong128hk, Wikidata-bugs, aude, jayvdb, faidon, Mbch331, 
Jay8g, fgiunchedi, jeremyb


[Wikidata-bugs] [Maniphest] [Updated] T219364: Elasticsearch indices went read-only causing huge lag

2019-03-27 Thread dcausse
dcausse edited projects, added Discovery-Search (Current work); removed 
Discovery-Search.

TASK DETAIL
  https://phabricator.wikimedia.org/T219364

To: dcausse
Cc: Lucas_Werkmeister_WMDE, Smalyshev, Lea_Lacroix_WMDE, Gehel, dcausse, 
TerraCodes, Liuxinyu970226, Aklapper, Addshore, alaa_wmde, Legado_Shulgin, 
Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, 
GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, LawExplorer, Zppix, 
_jensen, rosalieper, Wong128hk, Wikidata-bugs, aude, jayvdb, faidon, Mbch331, 
Jay8g, fgiunchedi, jeremyb


[Wikidata-bugs] [Maniphest] [Changed Project Column] T219364: Elasticsearch indices went read-only causing huge lag

2019-03-28 Thread dcausse
dcausse moved this task from in progress to Done on the Discovery-Search 
(Current work) board.
dcausse added a comment.


  The backlog of updates is now completely absorbed and a script has been 
started to catch up on lost updates. There is nothing more we can do at this 
point except wait for the maint script to finish, so moving this to Done.

TASK DETAIL
  https://phabricator.wikimedia.org/T219364

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

To: dcausse
Cc: Mholloway, Lucas_Werkmeister_WMDE, Smalyshev, Lea_Lacroix_WMDE, Gehel, 
dcausse, TerraCodes, Liuxinyu970226, Aklapper, Addshore, alaa_wmde, 
Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, 
GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, LawExplorer, Zppix, 
_jensen, rosalieper, Wong128hk, Wikidata-bugs, aude, jayvdb, faidon, Mbch331, 
Jay8g, fgiunchedi, jeremyb


[Wikidata-bugs] [Maniphest] [Commented On] T124196: Fatal "cannot perform this operation with arrays" from CirrusSearch/ElasticaWrite (using JobQueueDB)

2019-04-01 Thread dcausse
dcausse added a comment.


  > E.g. avoid queuing updates of this type or this size (possibly 
configurable), or run them differently, or to try it as today and then 
catch/suppress the failure - maybe logging a warning in its stead.
  
  IMO the JobQueue should raise an error if it is not able to save the message 
correctly. Since the queue owns the way the message is serialized, it is hard 
for an extension to determine what the actual size of the stored message will be.
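
  The point that only the queue knows the stored size can be sketched as 
follows. This is a minimal Python illustration, not MediaWiki's JobQueue code: 
`SizeCheckedQueue`, `MessageTooLargeError`, the JSON encoding, and the 64-byte 
limit are all hypothetical choices made for the example.

```python
import json


class MessageTooLargeError(Exception):
    """Raised when a serialized message would not fit in the backing store."""


class SizeCheckedQueue:
    """Toy queue that owns serialization and refuses oversized messages."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self._items = []

    def push(self, message):
        # Only the queue knows the final encoded form, so the size check
        # happens here rather than in the caller.
        payload = json.dumps(message).encode("utf-8")
        if len(payload) > self.max_bytes:
            raise MessageTooLargeError(
                f"serialized message is {len(payload)} bytes, "
                f"limit is {self.max_bytes}"
            )
        self._items.append(payload)
        return len(payload)


queue = SizeCheckedQueue(max_bytes=64)
stored = queue.push({"job": "small"})

try:
    queue.push({"job": "x" * 100})
    rejected = False
except MessageTooLargeError:
    rejected = True
```

  With this shape the caller gets an explicit failure it can log or handle, 
instead of a silently truncated or corrupted message.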

TASK DETAIL
  https://phabricator.wikimedia.org/T124196

To: dcausse
Cc: dcausse, GTirloni, debt, EBernhardson, aaron, Krinkle, Rudloff, 
Physikerwelt, GFXDude2010, Zoglun, hoo, aude, Aklapper, alaa_wmde, ET4Eva, 
Nandana, Lahi, Gq86, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, 
LawExplorer, Avner, Gehel, _jensen, rosalieper, FloNight, Wikidata-bugs, 
jayvdb, Jdforrester-WMF, Mbch331, Jay8g, Krenair, jeremyb


[Wikidata-bugs] [Maniphest] [Updated] T220823: Use ElasticSearch for bulk Wikidata entity term lookup

2019-04-12 Thread dcausse
dcausse edited projects, added Discovery-Search; removed Discovery.

TASK DETAIL
  https://phabricator.wikimedia.org/T220823

To: dcausse
Cc: dcausse, alaa_wmde, Addshore, Aklapper, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Wikidata-bugs, aude, Mbch331, ET4Eva, Darkminds3113, Avner, Gehel, FloNight


[Wikidata-bugs] [Maniphest] [Commented On] T206613: Search of wikidata string property values using haswbstatement is case sensitive

2019-06-03 Thread dcausse
dcausse added a comment.


  @Smalyshev switching the main field for statements to `lowercase_keyword` 
won't break anything. It behaves like a new field: it will only be taken into 
account after the next reindex. I would advise against adding a separate field 
here, as the cardinality would nearly double.
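
  The trade-off between switching the existing field and adding a second one 
can be illustrated with a small Python sketch. It is not the CirrusSearch 
mapping or analysis chain: the `lowercase_keyword` function below only mimics 
the idea of "one term per value, lowercased", and the sample statement strings 
are made up for illustration.

```python
def lowercase_keyword(value):
    """Treat the whole value as a single term, lowercased (no tokenization)."""
    return value.lower()


# Made-up sample statement values.
statements = [
    "P356=10.1371/JOURNAL.PCBI.1002947",
    "P356=10.1000/Abc",
]

# Option 1: switch the existing field to the lowercased form.
single_field_terms = {lowercase_keyword(s) for s in statements}

# Option 2: keep the exact field and add a lowercased one next to it.
two_field_terms = {("exact", s) for s in statements} | {
    ("lower", lowercase_keyword(s)) for s in statements
}


def matches(query):
    # The query is normalized the same way as the indexed value.
    return lowercase_keyword(query) in single_field_terms
```

  In this toy model the two-field option stores twice as many distinct terms 
as the single-field option, which is the cost the comment warns about.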

TASK DETAIL
  https://phabricator.wikimedia.org/T206613

To: dcausse
Cc: EBernhardson, WMDE-leszek, Multichill, Aklapper, Lydia_Pintscher, aude, 
debt, Smalyshev, Lea_Lacroix_WMDE, ArthurPSmith, Esc3300, dcausse, Mvolz, 
E.S.A-Sheild, darthmon_wmde, Premeditated, joker88john, ET4Eva, CucyNoiD, 
Nandana, NebulousIris, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, 
Adrian1985, Cpaulf30, Lahi, Gq86, Baloch007, Darkminds3113, Bsandipan, Lordiis, 
GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, EBjune, 
LawExplorer, WSH1906, Avner, Lewizho99, Maathavan, Gehel, _jensen, rosalieper, 
FloNight, Wikidata-bugs, jayvdb, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Commented On] T206613: Search of wikidata string property values using haswbstatement is case sensitive

2019-06-03 Thread dcausse
dcausse added a comment.


  We should also note that we index this data in the main filter field, which 
means that for searches that are unlikely to be ambiguous (IDs and such) one 
could simply search for 10.1371/journal.pcbi.1002947 
<https://www.wikidata.org/w/index.php?search=10.1371/journal.pcbi.1002947&title=Special%3ASearch&profile=default&fulltext=1>.
  The benefit is that it is tolerant of small variations in punctuation and 
also accepts partial searches like journal.pcbi.1002947 
<https://www.wikidata.org/w/index.php?search=journal.pcbi.1002947&title=Special%3ASearch&profile=default&fulltext=1>
 or even small variations such as journal pcbi 1002947 
<https://www.wikidata.org/w/index.php?search=journal+pcbi+1002947&title=Special%3ASearch&profile=default&fulltext=1>.
  
  So instead of giving up with no results, this kind of search could be tried 
if a human is there to select/accept/validate a result.
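
  Why a tokenized field tolerates punctuation changes and partial values can 
be sketched in a few lines of Python. This is only an assumption about the 
general mechanism (split on non-alphanumeric characters, then require every 
query token to be present); it is not the analyzer CirrusSearch actually 
uses, and `tokenize` and `fulltext_match` are hypothetical helper names.

```python
import re


def tokenize(text):
    """Lowercase and split on anything that is not an ASCII letter or digit."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def fulltext_match(query, indexed_value):
    """True if the query has tokens and all of them occur in the value."""
    indexed = set(tokenize(indexed_value))
    tokens = tokenize(query)
    return bool(tokens) and all(token in indexed for token in tokens)


doi = "10.1371/journal.pcbi.1002947"
doi_tokens = tokenize(doi)
```

  Under this model `journal.pcbi.1002947` and `journal pcbi 1002947` both 
reduce to the same three tokens, so both match the full DOI.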

TASK DETAIL
  https://phabricator.wikimedia.org/T206613

To: dcausse
Cc: EBernhardson, WMDE-leszek, Multichill, Aklapper, Lydia_Pintscher, aude, 
debt, Smalyshev, Lea_Lacroix_WMDE, ArthurPSmith, Esc3300, dcausse, Mvolz, 
E.S.A-Sheild, darthmon_wmde, Premeditated, joker88john, ET4Eva, CucyNoiD, 
Nandana, NebulousIris, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, 
Adrian1985, Cpaulf30, Lahi, Gq86, Baloch007, Darkminds3113, Bsandipan, Lordiis, 
GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, EBjune, 
LawExplorer, WSH1906, Avner, Lewizho99, Maathavan, Gehel, _jensen, rosalieper, 
FloNight, Wikidata-bugs, jayvdb, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Commented On] T206613: Search of wikidata string property values using haswbstatement is case sensitive

2019-06-04 Thread dcausse
dcausse added a comment.


  @Smalyshev I totally agree, I was suggesting a UX where a first attempt 
search would try to match using the haswbstatement keyword (switched to case 
insensitive) and then a second try could be made using the fulltext mode if the 
first attempt is unsuccessful.
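
  The two-step flow suggested here could look roughly like the Python sketch 
below. The function names and the `demo_*` stand-ins are hypothetical; a real 
client would call the search API for each step instead.

```python
def search_with_fallback(value, statement_search, fulltext_search):
    """Try the precise statement search first, then fall back to fulltext.

    Returns (mode, results) so a UI can tell which mode produced the
    results and ask a human to confirm hits that came from the fallback.
    """
    results = statement_search(value)
    if results:
        return "statement", results
    return "fulltext", fulltext_search(value)


# Hypothetical stand-ins for the two search back ends.
def demo_statement_search(value):
    known = {"10.1371/journal.pcbi.1002947": ["Q-example"]}
    # Case-insensitive exact match on the whole value.
    return known.get(value.lower(), [])


def demo_fulltext_search(value):
    return ["Q-example"] if "1002947" in value else []


first = search_with_fallback(
    "10.1371/JOURNAL.PCBI.1002947", demo_statement_search, demo_fulltext_search
)
second = search_with_fallback(
    "journal pcbi 1002947", demo_statement_search, demo_fulltext_search
)
```

  Returning the mode alongside the results keeps the less precise fulltext 
hits distinguishable, so they can be shown as suggestions rather than as 
confirmed matches.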

TASK DETAIL
  https://phabricator.wikimedia.org/T206613

To: dcausse
Cc: EBernhardson, WMDE-leszek, Multichill, Aklapper, Lydia_Pintscher, aude, 
debt, Smalyshev, Lea_Lacroix_WMDE, ArthurPSmith, Esc3300, dcausse, Mvolz, 
E.S.A-Sheild, darthmon_wmde, Premeditated, joker88john, ET4Eva, CucyNoiD, 
Nandana, NebulousIris, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, 
Adrian1985, Cpaulf30, Lahi, Gq86, Baloch007, Darkminds3113, Bsandipan, Lordiis, 
GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, EBjune, 
LawExplorer, WSH1906, Avner, Lewizho99, Maathavan, Gehel, _jensen, rosalieper, 
FloNight, Wikidata-bugs, jayvdb, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Merged] T215615: Stop using negative scores for deboosting statements

2019-06-04 Thread dcausse
dcausse closed this task as a duplicate of T209859: Wikidata autocomplete 
(wbsearchentities) results with score <= 0.

TASK DETAIL
  https://phabricator.wikimedia.org/T215615

To: dcausse
Cc: dcausse, Aklapper, darthmon_wmde, Premeditated, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Wikidata-bugs, aude, jayvdb, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Merged] T209859: Wikidata autocomplete (wbsearchentities) results with score <= 0

2019-06-04 Thread dcausse
dcausse merged a task: T215615: Stop using negative scores for deboosting 
statements.

TASK DETAIL
  https://phabricator.wikimedia.org/T209859

To: dcausse
Cc: Liuxinyu970226, dcausse, Smalyshev, EBernhardson, Aklapper, darthmon_wmde, 
Premeditated, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, 
LawExplorer, _jensen, rosalieper, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] [Changed Status] T202254: Use ExtensionRegistry instead of class_exists to check for CirrusSearch in Wikibase

2019-06-20 Thread dcausse
dcausse changed the task status from "Stalled" to "Open".

TASK DETAIL
  https://phabricator.wikimedia.org/T202254

To: dcausse
Cc: Addshore, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, 
Jayprakash12345, QZanden, LawExplorer, _jensen, rosalieper, Izno, 
Wikidata-bugs, aude, Dinoguy1000, Mbch331, Jay8g


[Wikidata-bugs] [Maniphest] [Claimed] T186037: Need mvn build mode that does not build gui

2019-07-16 Thread dcausse
dcausse claimed this task.
dcausse moved this task from Backlog to In progress on the 
Discovery-Wikidata-Query-Service-Sprint board.

TASK DETAIL
  https://phabricator.wikimedia.org/T186037

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1239/

To: dcausse
Cc: Gehel, Aklapper, Smalyshev, darthmon_wmde, ET4Eva, Nandana, Lahi, Gq86, 
Darkminds3113, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, 
merbst, LawExplorer, Avner, _jensen, rosalieper, Cirdan, Jonas, FloNight, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T186037: Need mvn build mode that does not build gui

2019-07-16 Thread dcausse
dcausse added a project: Discovery-Wikidata-Query-Service-Sprint.

TASK DETAIL
  https://phabricator.wikimedia.org/T186037

To: dcausse
Cc: Gehel, Aklapper, Smalyshev, darthmon_wmde, ET4Eva, Nandana, Lahi, Gq86, 
Darkminds3113, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, 
merbst, LawExplorer, Avner, _jensen, rosalieper, Cirdan, Jonas, FloNight, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Commented On] T186037: Need mvn build mode that does not build gui

2019-07-17 Thread dcausse
dcausse added a comment.


  We could also use `mvn -pl -gui` which does not require any changes

TASK DETAIL
  https://phabricator.wikimedia.org/T186037

To: dcausse
Cc: Gehel, Aklapper, Smalyshev, darthmon_wmde, ET4Eva, Nandana, Lahi, Gq86, 
Darkminds3113, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, 
merbst, LawExplorer, Avner, _jensen, rosalieper, Cirdan, Jonas, FloNight, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Commented On] T173248: Convert blank nodes to “unknown value”

2019-07-17 Thread dcausse
dcausse added a comment.


  I see that the response is
  
  t1514691780
  t1514691780
  
  Would that work if the API again returned a blank node instead of trying to 
deal with the string?
  
  t1514691780
  t1514691780
  
  The UI could do something special when it encounters blank nodes.

TASK DETAIL
  https://phabricator.wikimedia.org/T173248

To: dcausse
Cc: dcausse, Jonas, Aklapper, Smalyshev, Lucas_Werkmeister_WMDE, PokestarFan, 
darthmon_wmde, ET4Eva, Nandana, Lahi, Gq86, Darkminds3113, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, Avner, Gehel, _jensen, rosalieper, 
FloNight, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Created] T229329: WDQS Updater: java.lang.StringIndexOutOfBoundsException: String index out of range: -8

2019-07-30 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION

  
  #logback.classic pattern: %d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - 
%msg %mdc%n
  13:09:09.985 [main] INFO  org.wikidata.query.rdf.tool.Update - Starting 
Updater 0.3.2-SNAPSHOT (2629afc6287b660a4576d795debea6781879afff 
<https://phabricator.wikimedia.org/rWDQR2629afc6287b660a4576d795debea6781879afff>)
  13:09:10.699 [main] INFO  o.w.q.r.t.change.ChangeSourceContext - Checking 
where we left off
  13:09:10.699 [main] INFO  o.w.query.rdf.tool.rdf.RdfRepository - Checking for 
left off time from the updater
  13:09:10.854 [main] INFO  o.w.query.rdf.tool.rdf.RdfRepository - Found left 
off time from the updater
  13:09:10.855 [main] INFO  o.w.q.r.t.change.ChangeSourceContext - Found start 
time in the RDF store: 2019-07-30T12:06:13Z
  13:09:10.881 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Creating 
consumer wdqs1009
  13:09:11.157 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Subscribed to 6 
topics
  13:09:11.158 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Set topic 
codfw.mediawiki.revision-create-0 to (timestamp=1564488373000, offset=56972315)
  13:09:11.158 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Set topic 
eqiad.mediawiki.page-undelete-0 to (timestamp=1564488373000, offset=190052)
  13:09:11.158 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Set topic 
codfw.mediawiki.page-undelete-0 to (timestamp=1564488373000, offset=6581)
  13:09:11.158 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Set topic 
eqiad.mediawiki.revision-create-0 to (timestamp=1564488373000, 
offset=1346546990)
  13:09:11.158 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Set topic 
codfw.mediawiki.page-delete-0 to (timestamp=1564488373000, offset=409877)
  13:09:11.159 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Set topic 
eqiad.mediawiki.page-delete-0 to (timestamp=1564488373000, offset=9775382)
  13:09:12.626 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Found 903 
changes
  13:09:12.809 [main] ERROR org.wikidata.query.rdf.tool.Update - Error during 
updater run.
  java.lang.StringIndexOutOfBoundsException: String index out of range: -8
at java.lang.String.substring(String.java:1931)
at 
org.wikidata.query.rdf.tool.Updater.getRevisionUpdates(Updater.java:289)
at org.wikidata.query.rdf.tool.Updater.handleChanges(Updater.java:228)
at org.wikidata.query.rdf.tool.Updater.run(Updater.java:150)
at org.wikidata.query.rdf.tool.Update.run(Update.java:173)
at org.wikidata.query.rdf.tool.Update.main(Update.java:97)
  #logback.classic pattern: %d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - 
%msg %mdc%n
  13:09:24.357 [main] INFO  org.wikidata.query.rdf.tool.Update - Starting 
Updater 0.3.2-SNAPSHOT (2629afc6287b660a4576d795debea6781879afff 
<https://phabricator.wikimedia.org/rWDQR2629afc6287b660a4576d795debea6781879afff>)
  13:09:25.141 [main] INFO  o.w.q.r.t.change.ChangeSourceContext - Checking 
where we left off
  13:09:25.141 [main] INFO  o.w.query.rdf.tool.rdf.RdfRepository - Checking for 
left off time from the updater
  13:09:25.258 [main] INFO  o.w.query.rdf.tool.rdf.RdfRepository - Found left 
off time from the updater
  13:09:25.258 [main] INFO  o.w.q.r.t.change.ChangeSourceContext - Found start 
time in the RDF store: 2019-07-30T12:06:13Z
  13:09:25.279 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Creating 
consumer wdqs1009
  13:09:25.523 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Subscribed to 6 
topics
  13:09:25.524 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Set topic 
codfw.mediawiki.revision-create-0 to (timestamp=1564488373000, offset=56972315)
  13:09:25.524 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Set topic 
eqiad.mediawiki.page-undelete-0 to (timestamp=1564488373000, offset=190052)
  13:09:25.524 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Set topic 
codfw.mediawiki.page-undelete-0 to (timestamp=1564488373000, offset=6581)
  13:09:25.524 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Set topic 
eqiad.mediawiki.revision-create-0 to (timestamp=1564488373000, 
offset=1346546990)
  13:09:25.524 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Set topic 
codfw.mediawiki.page-delete-0 to (timestamp=1564488373000, offset=409877)
  13:09:25.524 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Set topic 
eqiad.mediawiki.page-delete-0 to (timestamp=1564488373000, offset=9775382)
  13:09:27.043 [main] INFO  o.w.q.rdf.tool.change.KafkaPoller - Found 903 
changes
  13:09:27.229 [main] ERROR org.wikidata.query.rdf.tool.Update - Error during 
updater run.
  java.lang.StringIndexOutOfBoundsException: String index out of range: -8
at java.lang.String.substring(String.java:1931)
at 
org.wikidata.query.rdf.tool.Updater.getRevisionUpdates(Updater.java:289

[Wikidata-bugs] [Maniphest] [Edited] T229329: WDQS Updater: java.lang.StringIndexOutOfBoundsException: String index out of range: -8

2019-07-30 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T229329

To: dcausse
Cc: dcausse, Aklapper, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Jonas, Xmlizer, jkroll, Smalyshev, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Commented On] T229329: WDQS Updater: java.lang.StringIndexOutOfBoundsException: String index out of range: -8

2019-07-30 Thread dcausse
dcausse added a comment.


  If `uris.entity().length()` is greater than `entityId.length()` by 8 
characters, it will cause this exception. Since it's a test server, it is 
perhaps misconfigured.
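
  The arithmetic behind the `-8` can be reproduced with a small Python model. 
It assumes, as I understand the Java 8 behavior, that `String.substring(int)` 
reports the negative remaining length (`length - beginIndex`) in the exception 
message. The helper names and the example URIs are hypothetical and are not 
taken from the Updater code.

```python
def java_substring(s, begin_index):
    """Mimic the bounds check of Java 8's String.substring(int)."""
    if begin_index < 0:
        raise IndexError(f"String index out of range: {begin_index}")
    remaining = len(s) - begin_index
    if remaining < 0:
        # The message carries the (negative) remaining length.
        raise IndexError(f"String index out of range: {remaining}")
    return s[begin_index:]


def strip_prefix(entity_uri, prefix):
    # Deliberately does not check that entity_uri starts with prefix,
    # to model the failing case described above.
    return java_substring(entity_uri, len(prefix))


configured_prefix = "http://example.org/entity/"  # 26 characters
short_value = "http://ex.org/e/Q1"  # 18 characters, 8 fewer than the prefix

ok = strip_prefix("http://example.org/entity/Q42", configured_prefix)

try:
    strip_prefix(short_value, configured_prefix)
    message = None
except IndexError as exc:
    message = str(exc)
```

  A value that is 8 characters shorter than the configured prefix produces 
exactly the message seen in the log, which fits a host that is configured 
with a different entity URI prefix than the data it receives.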

TASK DETAIL
  https://phabricator.wikimedia.org/T229329

To: dcausse
Cc: Lucas_Werkmeister_WMDE, dcausse, Aklapper, darthmon_wmde, DannyS712, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, 
_jensen, rosalieper, Jonas, Xmlizer, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Created] T240334: \Wikibase\EntityContent::getTextForSearchIndex no longer includes textual properties

2019-12-10 Thread dcausse
dcausse created this task.
dcausse added projects: Wikidata, CirrusSearch.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Discovery-Search.

TASK DESCRIPTION
  It appears that all textual properties are being removed from the indexed 
text search content.
  As far as I understand, `repo/config/Wikibase.searchindex.php` is responsible 
for this, and it looks like it is no longer registered as a 
`WikibaseTextForSearchIndex` hook.
  
  Original report 
<https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team#Search_doesn't_include_subtitle_field?>

TASK DETAIL
  https://phabricator.wikimedia.org/T240334

To: dcausse
Cc: Aklapper, dcausse, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, jayvdb, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Triaged] T240334: \Wikibase\EntityContent::getTextForSearchIndex no longer includes textual properties

2019-12-10 Thread dcausse
dcausse triaged this task as "High" priority.

TASK DETAIL
  https://phabricator.wikimedia.org/T240334

To: dcausse
Cc: Aklapper, dcausse, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, jayvdb, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Updated] T240334: \Wikibase\EntityContent::getTextForSearchIndex no longer includes textual properties

2019-12-10 Thread dcausse
dcausse added a project: Regression.

TASK DETAIL
  https://phabricator.wikimedia.org/T240334

To: dcausse
Cc: Aklapper, dcausse, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
GoranSMilovanovic, Jayprakash12345, QZanden, EBjune, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Wong128hk, Wikidata-bugs, aude, jayvdb, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Retitled] T240334: Evaluate adding all/more textual properties to the text field

2019-12-10 Thread dcausse
dcausse renamed this task from "\Wikibase\EntityContent::getTextForSearchIndex 
no longer includes textual properties" to "Evaluate adding all/more textual 
properties to the text field".
dcausse lowered the priority of this task from "High" to "Medium".
dcausse removed a project: Regression.
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T240334

To: dcausse
Cc: Aklapper, dcausse, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, jayvdb, Mbch331, jeremyb, Jayprakash12345, 
Wong128hk


[Wikidata-bugs] [Maniphest] [Edited] T240334: Evaluate adding all/more textual properties to the text field

2019-12-10 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T240334

To: dcausse
Cc: Aklapper, dcausse, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, jayvdb, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Retitled] T240334: Evaluate adding all/some textual properties to the text field

2019-12-10 Thread dcausse
dcausse renamed this task from "Evaluate adding all/more textual properties to 
the text field" to "Evaluate adding all/some textual properties to the text 
field".

TASK DETAIL
  https://phabricator.wikimedia.org/T240334

To: dcausse
Cc: Aklapper, dcausse, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, jayvdb, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Closed] T239898: Investigate triple counts difference between dumps and what blazegraph reports

2019-12-10 Thread dcausse
dcausse closed this task as "Invalid".
dcausse added a comment.


  I properly recounted (using an RDF parser) the triples in the dump after the 
munge operation and found 8.9B triples. Closing as invalid.

TASK DETAIL
  https://phabricator.wikimedia.org/T239898

To: dcausse
Cc: JAllemandou, Gehel, elukey, dcausse, Aklapper, darthmon_wmde, DannyS712, 
Nandana, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, 
Mbch331


[Wikidata-bugs] [Maniphest] [Commented On] T240328: Slow indexing for wbsearchentities

2019-12-10 Thread dcausse
dcausse added a comment.


  Could you specify what search string you are using?
  `wbsearchentities` should be using the MySQL database when searching by 
entity ID, so the lag should be relatively small.
  On the other hand, the search index will take some time to update (job queue 
lag + Elasticsearch refresh), so searches based on labels/aliases may not react 
immediately after an entity is added.
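
  The distinction between the two lookup paths can be sketched in Python as 
below. The regular expression for "looks like an entity ID" and the 
`lookup_path` helper are assumptions made for illustration; they do not 
reproduce how `wbsearchentities` actually recognizes IDs.

```python
import re

# Hypothetical pattern for strings that look like entity IDs (e.g. Q42, P31).
ENTITY_ID = re.compile(r"^[QPL][1-9][0-9]*$", re.IGNORECASE)


def lookup_path(search_string):
    """Return which store a lookup would hit under the assumption above."""
    if ENTITY_ID.match(search_string.strip()):
        # ID lookups read from the database, so lag should be small.
        return "database"
    # Label/alias lookups depend on the job queue and the index refresh.
    return "search index"


paths = [lookup_path(s) for s in ("Q42", " q42 ", "Douglas Adams")]
```

  Knowing which path a given search string takes helps decide whether a 
delay points at database replication or at the search update pipeline.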

TASK DETAIL
  https://phabricator.wikimedia.org/T240328

To: dcausse
Cc: dcausse, Daniel_Mietchen, WMDE-leszek, Lydia_Pintscher, Aklapper, Fnielsen, 
darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
EBjune, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] [Unassigned] T105427: Need a way for WDQS updater to become aware of suppressed deletes

2019-12-10 Thread dcausse
dcausse removed Smalyshev as the assignee of this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T105427

To: dcausse
Cc: dcausse, Bugreporter, Sjoerddebruin, Krenair, gerritbot, JanZerebecki, 
Deskana, daniel, Legoktm, Aklapper, Smalyshev, darthmon_wmde, ET4Eva, 
DannyS712, Nandana, Lahi, Gq86, Darkminds3113, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Avner, Gehel, _jensen, 
rosalieper, Scott_WUaS, Jonas, FloNight, Xmlizer, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T240453: EPIC: Improve completion search on wikidata

2019-12-11 Thread dcausse
dcausse added a project: Wikidata.
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T240453

To: dcausse
Cc: dcausse, Aklapper, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Dinoguy1000, jayvdb, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Updated] T240328: Slow indexing for wbsearchentities

2019-12-11 Thread dcausse
dcausse added a comment.


  @Fnielsen thanks for letting us know. If search by entity ID is slow again, 
please re-open this issue with a link to the entity you created so that we can 
correlate it with the metrics we monitor.
  For label search we are currently experiencing recurrent lag on the jobqueue 
that can delay updates considerably (several minutes, per T224425 
<https://phabricator.wikimedia.org/T224425>).

TASK DETAIL
  https://phabricator.wikimedia.org/T240328

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Fnielsen, dcausse
Cc: dcausse, Daniel_Mietchen, WMDE-leszek, Lydia_Pintscher, Aklapper, Fnielsen, 
darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
EBjune, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] [Claimed] T239338: Manually purge obsolete entites from WDQS

2019-12-11 Thread dcausse
dcausse claimed this task.
dcausse added a project: Discovery-Search (Current work).

TASK DETAIL
  https://phabricator.wikimedia.org/T239338

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Lea_Lacroix_WMDE, Lydia_Pintscher, Gehel, SCIdude, Aklapper, 
MisterSynergy, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Triaged] T239338: Manually purge obsolete entites from WDQS

2019-12-11 Thread dcausse
dcausse triaged this task as "Medium" priority.

TASK DETAIL
  https://phabricator.wikimedia.org/T239338

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Lea_Lacroix_WMDE, Lydia_Pintscher, Gehel, SCIdude, Aklapper, 
MisterSynergy, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Closed] T239338: Manually purge obsolete entites from WDQS

2019-12-11 Thread dcausse
dcausse closed this task as "Resolved".

TASK DETAIL
  https://phabricator.wikimedia.org/T239338

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Lea_Lacroix_WMDE, Lydia_Pintscher, Gehel, SCIdude, Aklapper, 
MisterSynergy, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Paste] [Updated] P9859: Number of blank nodes used as object and grouped by predicate (wdqs2006)

2019-12-11 Thread dcausse
dcausse changed the title of this paste from "Blank node grouped by predicate 
(wdqs2006)" to "Number of blank nodes used as object and grouped by predicate 
(wdqs2006)".
dcausse added a project: Wikidata-Query-Service.

PASTE DETAIL
  https://phabricator.wikimedia.org/P9859

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Gq86, Lucas_Werkmeister_WMDE, EBjune, merbst, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles


[Wikidata-bugs] [Maniphest] [Updated] T239414: Investigate how blank nodes are used and synced between wikibase and wdqs

2019-12-11 Thread dcausse
dcausse added a comment.


  P9859 <https://phabricator.wikimedia.org/P9859> contains the output of
  
select ?p (count(*) as ?cnt) {
  ?s ?p ?o .
  filter (isBlank(?o))
}
group by ?p
  
  run on wdqs2006.
  
  I will run the same query with a filter on the subject instead, as asked. The 
expectation there is to find only those related to `owl:complementOf`, around 
42K.

TASK DETAIL
  https://phabricator.wikimedia.org/T239414

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Smalyshev, Lucas_Werkmeister_WMDE, Igorkim78, dcausse, Aklapper, 
darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T239414: Investigate how blank nodes are used and synced between wikibase and wdqs

2019-12-12 Thread dcausse
dcausse added a comment.


select ?p (count(*) as ?cnt) {
  ?s ?p ?o .
  filter (isBlank(?s))
}
group by ?p
  
  The output is at P9862 <https://phabricator.wikimedia.org/P9862>.
  
  As expected, the only blank-node subjects are those of the OWL restriction 
referenced by `owl:complementOf`, with the predicates `rdf:type`, 
`owl:onProperty` and `owl:someValuesFrom`, as exported by Wikibase today:
  
wdno:P31 a owl:Class ;
    owl:complementOf _:genid1 .

_:genid1 a owl:Restriction ;
    owl:onProperty wdt:P31 ;
    owl:someValuesFrom owl:Thing .
  
  @Igorkim78 could you take a look?
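  
  To make the two aggregations concrete, here is a minimal Python sketch that 
applies the same grouping to the five triples of the Turtle snippet above. It 
is only an illustration: the triples are hand-copied, and `is_blank` and 
`count_by_predicate` are hypothetical helpers, not part of WDQS or Blazegraph.

```python
from collections import Counter

# Toy triples mirroring the Turtle snippet above. Blank nodes use the "_:"
# prefix, as in N-Triples. The data is illustrative only.
TRIPLES = [
    ("wdno:P31", "rdf:type", "owl:Class"),
    ("wdno:P31", "owl:complementOf", "_:genid1"),
    ("_:genid1", "rdf:type", "owl:Restriction"),
    ("_:genid1", "owl:onProperty", "wdt:P31"),
    ("_:genid1", "owl:someValuesFrom", "owl:Thing"),
]


def is_blank(term):
    """Return True for blank-node labels such as '_:genid1'."""
    return term.startswith("_:")


def count_by_predicate(triples, position):
    """Count triples per predicate where the term at `position` is blank.

    `position` is 0 for the subject and 2 for the object.
    """
    return Counter(t[1] for t in triples if is_blank(t[position]))


blank_objects = count_by_predicate(TRIPLES, 2)   # filter (isBlank(?o))
blank_subjects = count_by_predicate(TRIPLES, 0)  # filter (isBlank(?s))
print(dict(blank_objects))
print(dict(blank_subjects))
```

  On this sample the object filter yields only `owl:complementOf`, and the 
subject filter yields only `rdf:type`, `owl:onProperty` and 
`owl:someValuesFrom`, each once.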

TASK DETAIL
  https://phabricator.wikimedia.org/T239414

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Smalyshev, Lucas_Werkmeister_WMDE, Igorkim78, dcausse, Aklapper, 
darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Unassigned] T239908: Extract more metrics from blazegraph sparql update response

2019-12-12 Thread dcausse
dcausse removed Zbyszko as the assignee of this task.
dcausse added a subscriber: Zbyszko.
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T239908

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Zbyszko, Aklapper, dcausse, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T239908: Extract more metrics from blazegraph sparql update response

2019-12-12 Thread dcausse
dcausse added a project: Discovery-Search (Current work).

TASK DETAIL
  https://phabricator.wikimedia.org/T239908

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Zbyszko, dcausse
Cc: Zbyszko, Aklapper, dcausse, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Assigned] T239908: Extract more metrics from blazegraph sparql update response

2019-12-12 Thread dcausse
dcausse assigned this task to Zbyszko.

TASK DETAIL
  https://phabricator.wikimedia.org/T239908

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Zbyszko, dcausse
Cc: Zbyszko, Aklapper, dcausse, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Triaged] T239908: Extract more metrics from blazegraph sparql update response

2019-12-12 Thread dcausse
dcausse triaged this task as "Medium" priority.

TASK DETAIL
  https://phabricator.wikimedia.org/T239908

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Zbyszko, dcausse
Cc: Zbyszko, Aklapper, dcausse, darthmon_wmde, DannyS712, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Commented On] T238002: WDQS Munger should be multi threaded

2019-12-12 Thread dcausse
dcausse added a comment.


  Separating
  
  - parsing
  - munging
  - writing
  
  into multiple threads nearly doubled the speed of the munger (about 1.9x on 
wall-clock time):
  
old:
real    1371m34.618s
user    1854m48.672s
sys     24m44.480s

new:
real    731m20.495s
user    1798m42.176s
sys     30m7.888s
  
  I should have linked 
https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/553758 to this task.
  Since the RDF parser is the limiting factor, I think we will have to do the 
entity delimitation without an RDF parser if we want to further improve the 
speed of this step.
  We could also consider switching to the `nt` format, which I'm sure will be a 
lot faster to parse, if the size overhead is acceptable.
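  
  For readers unfamiliar with this kind of split, the following is a minimal 
Python sketch of a three-stage pipeline (parse, munge, write) connected by 
bounded queues. It is not the munger's actual Java code: the stage functions, 
queue sizes and the `run_pipeline` name are assumptions made for illustration. 
It shows the structure only; in CPython the GIL would prevent a comparable 
speed-up for CPU-bound work.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream


def parse_stage(lines, parse, outbox):
    """Parse each input line and hand the result to the next stage."""
    for line in lines:
        outbox.put(parse(line))
    outbox.put(SENTINEL)


def munge_stage(munge, inbox, outbox):
    """Transform parsed items until the sentinel arrives, then forward it."""
    for item in iter(inbox.get, SENTINEL):
        outbox.put(munge(item))
    outbox.put(SENTINEL)


def write_stage(inbox, sink):
    """Append munged items to `sink` (a stand-in for the output file)."""
    for item in iter(inbox.get, SENTINEL):
        sink.append(item)


def run_pipeline(lines, parse, munge):
    """Run the three stages in their own threads and return what was written."""
    parsed = queue.Queue(maxsize=100)  # bounded, so a slow stage applies back-pressure
    munged = queue.Queue(maxsize=100)
    sink = []
    threads = [
        threading.Thread(target=parse_stage, args=(lines, parse, parsed)),
        threading.Thread(target=munge_stage, args=(munge, parsed, munged)),
        threading.Thread(target=write_stage, args=(munged, sink)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sink


def parse_triple(line):
    """Toy parser: split a whitespace-separated line into a tuple."""
    return tuple(line.split())


def mark_munged(triple):
    """Toy munger: tag the triple so the transformation is visible."""
    return triple + ("munged",)


result = run_pipeline(["s1 p1 o1", "s2 p2 o2"], parse_triple, mark_munged)
print(result)
```

  With one thread per stage and FIFO queues, items are written in the order 
they were read.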

TASK DETAIL
  https://phabricator.wikimedia.org/T238002

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Smalyshev, Gehel, Aklapper, darthmon_wmde, DannyS712, Nandana, 
Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Claimed] T239750: org.wikidata.query.rdf.tool.Updater - Importer error: ConcurrentModificationException: KafkaConsumer is not safe for multi-threaded access

2019-12-16 Thread dcausse
dcausse claimed this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T239750

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T240540: Investigate usage of the query service & queries that are run

2019-12-17 Thread dcausse
dcausse added a comment.


  Also T239852 <https://phabricator.wikimedia.org/T239852>

TASK DETAIL
  https://phabricator.wikimedia.org/T240540

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Simon_Villeneuve, Lucas_Werkmeister_WMDE, Lydia_Pintscher, 
Addshore, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Created] T241125: Import wikidata RDF dump to hadoop

2019-12-19 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  We currently have no easy way to run large-scale analysis on the Wikidata 
graph. WDQS and Blazegraph are not suited for this scenario; Hadoop seems to be 
a better fit. After discussing with @JAllemandou, we believe that a simple 
Parquet file with quads might be sufficient for now.

TASK DETAIL
  https://phabricator.wikimedia.org/T241125

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, JAllemandou, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Created] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2019-12-19 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  Tracking task to collect all the efforts made in this direction.
  
  | start      | dump                               | node     | munge time | import time   | initial lag | time to catchup |
  | 2019-12-04 | wikidata-20191202-all-BETA.ttl.bz2 | wdqs1010 | 22.85h[1]  | 191h (8 days) | 2 weeks     | //in progress// |
  
  [1] munge time improved to 12.18 hours in T238002 
<https://phabricator.wikimedia.org/T238002>
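  
  As a cross-check of the munge times, the `real` values reported on T238002 
(`1371m34.618s` before, `731m20.495s` after) can be converted to hours. The 
snippet below is a small Python sketch for that arithmetic; `time_to_hours` is 
a hypothetical helper and only handles the `NmS.SSSs` form shown in that 
comment.

```python
import re


def time_to_hours(value):
    """Convert a shell `time` value such as '1371m34.618s' to hours."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unexpected time format: {value!r}")
    minutes = int(match.group(1))
    seconds = float(match.group(2))
    return (minutes * 60 + seconds) / 3600


old_hours = time_to_hours("1371m34.618s")  # single-threaded munger
new_hours = time_to_hours("731m20.495s")   # multi-threaded munger
speedup = old_hours / new_hours
print(f"old={old_hours:.2f}h new={new_hours:.2f}h speedup={speedup:.2f}x")
```

  This gives roughly 22.86h and 12.19h, a speed-up of about 1.88x; the 22.85h 
and 12.18h figures above appear to be the same values truncated rather than 
rounded.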

TASK DETAIL
  https://phabricator.wikimedia.org/T241128

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Edited] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2019-12-19 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T241128

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Commented On] T241213: Organize and improve integration test coverage for WDQS Updater

2019-12-20 Thread dcausse
dcausse added a comment.


  The most annoying integration test (and probably the slowest) is 
org.wikidata.query.rdf.tool.wikibase.WikibaseRepositoryIntegrationTest:
  
  - It generates anonymous edits to test.wikidata.org in order to test the 
RecentChange API.
  - Concurrent runs of this test cause failures: the test expects to see the 
timestamp of the edits it makes, so when it runs concurrently (e.g. two 
patches in CI) the runs race and can fail.
  - It adds a lot of complexity to test the robustness (retries) by launching a 
custom Proxy prior to running the integration tests (start-proxy and 
org.wikidata.query.rdf.tool.Proxy).

TASK DETAIL
  https://phabricator.wikimedia.org/T241213

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Zbyszko, dcausse
Cc: dcausse, Aklapper, Zbyszko, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Triaged] T240453: EPIC: Improve completion search on wikidata

2020-01-02 Thread dcausse
dcausse triaged this task as "Medium" priority.

TASK DETAIL
  https://phabricator.wikimedia.org/T240453

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Lea_Lacroix_WMDE, dcausse, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Dinoguy1000, jayvdb, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] [Edited] T241128: EPIC: Reduce the time needed to do the initial WDQS import

2020-01-06 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T241128

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Created] T242453: wdqs1005 stopped to handle updates

2020-01-10 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  Apparently a deadlock inside blazegraph itself:
  
Found one Java-level deadlock:
=
"GASEngine4":
  waiting for ownable synchronizer 0x7fcbf9dbc3c0, (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "com.bigdata.journal.Journal.executorService1539347"
"com.bigdata.journal.Journal.executorService1539347":
  waiting to lock monitor 0x7fc555798e18 (object 0x7fcfda000320, a 
java.lang.Object),
  which is held by "GASEngine2"
"GASEngine2":
  waiting to lock monitor 0x7fc57c22e358 (object 0x7fcbf9b97710, a 
java.lang.Object),
  which is held by "com.bigdata.journal.Journal.executorService1539347"
  
  full stack: P10117 <https://phabricator.wikimedia.org/P10117>
  
  The problem went undetected by monitoring, but started around 
2020-01-10T15:44.
  The machine stopped handling updates and queries, and the lag stopped being 
reported as well.
  Blazegraph was restarted around 19:44.
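  
  The wait-for relationships in the dump above can be checked mechanically. 
The following is a minimal Python sketch, not Blazegraph code: the edges are 
copied by hand from the dump (waiter to holder), and `find_cycle` is a 
hypothetical helper that assumes each thread waits on at most one other 
thread.

```python
# Wait-for edges copied from the thread dump above: waiter -> holder.
WAITS_FOR = {
    "GASEngine4": "com.bigdata.journal.Journal.executorService1539347",
    "com.bigdata.journal.Journal.executorService1539347": "GASEngine2",
    "GASEngine2": "com.bigdata.journal.Journal.executorService1539347",
}


def find_cycle(waits_for, start):
    """Follow waiter -> holder edges from `start`.

    Returns the list of threads forming the cycle that is reached, or None
    if the chain ends at a thread that is not waiting on anything.
    """
    path = []
    current = start
    while current in waits_for and current not in path:
        path.append(current)
        current = waits_for[current]
    if current in path:
        return path[path.index(current):]
    return None


cycle = find_cycle(WAITS_FOR, "GASEngine4")
print(cycle)
```

  Starting from `GASEngine4`, the cycle found consists of 
`com.bigdata.journal.Journal.executorService1539347` and `GASEngine2`, each 
holding what the other is waiting for; `GASEngine4` is blocked behind them but 
is not itself part of the cycle.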

TASK DETAIL
  https://phabricator.wikimedia.org/T242453

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Created] T242640: query/wikidata/gui jenkins build broken

2020-01-13 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  Seen on https://gerrit.wikimedia.org/r/c/wikidata/query/gui/+/564056
  
17:41:06 + npm install --no-progress
17:41:07 npm WARN deprecated vis@4.21.0: Please consider using 
https://github.com/visjs
17:41:10 npm WARN deprecated grunt-filerev@2.3.1: Deprecated
17:41:23 npm WARN deprecated hawk@3.1.3: This module moved to @hapi/hawk. 
Please make sure to switch over as this distribution is no longer supported and 
may contain bugs and critical security issues.
17:41:23 npm WARN deprecated boom@2.10.1: This version has been deprecated 
in accordance with the hapi support policy (hapi.im/support). Please upgrade to 
the latest version to get the best features, bug fixes, and security patches. 
If you are unable to upgrade at this time, paid support is available for older 
versions (hapi.im/commercial).
17:41:23 npm WARN deprecated sntp@1.0.9: This module moved to @hapi/sntp. 
Please make sure to switch over as this distribution is no longer supported and 
may contain bugs and critical security issues.
17:41:23 npm WARN deprecated hoek@2.16.3: This version has been deprecated 
in accordance with the hapi support policy (hapi.im/support). Please upgrade to 
the latest version to get the best features, bug fixes, and security patches. 
If you are unable to upgrade at this time, paid support is available for older 
versions (hapi.im/commercial).
17:41:23 npm WARN deprecated cryptiles@2.0.5: This version has been 
deprecated in accordance with the hapi support policy (hapi.im/support). Please 
upgrade to the latest version to get the best features, bug fixes, and security 
patches. If you are unable to upgrade at this time, paid support is available 
for older versions (hapi.im/commercial).
17:41:25 npm WARN deprecated jscs-preset-wikimedia@1.0.1: No longer 
maintained. We recomment migrating to ESLint with eslint-config-wikimedia.
17:41:26 npm WARN deprecated core-js@2.6.11: core-js@<3 is no longer 
maintained and not recommended for usage due to the number of issues. Please, 
upgrade your dependencies to the actual version of core-js@3.
17:41:26 npm WARN deprecated nomnom@1.8.1: Package no longer supported. 
Contact supp...@npmjs.com for more info.
17:41:26 npm ERR! Linux 4.9.0-11-amd64
17:41:26 npm ERR! argv "/usr/bin/nodejs" "/usr/local/bin/npm" "install" 
"--no-progress"
17:41:26 npm ERR! node v6.11.0
17:41:26 npm ERR! npm  v3.8.3
17:41:26 npm ERR! code EMISSINGARG
17:41:26 
17:41:26 npm ERR! typeerror Error: Missing required argument #1
17:41:26 npm ERR! typeerror at andLogAndFinish 
(/usr/local/lib/node_modules/npm/lib/fetch-package-metadata.js:31:3)
17:41:26 npm ERR! typeerror at fetchPackageMetadata 
(/usr/local/lib/node_modules/npm/lib/fetch-package-metadata.js:51:22)
17:41:26 npm ERR! typeerror at resolveWithNewModule 
(/usr/local/lib/node_modules/npm/lib/install/deps.js:455:12)
17:41:26 npm ERR! typeerror at 
/usr/local/lib/node_modules/npm/lib/install/deps.js:456:7
17:41:26 npm ERR! typeerror at 
/usr/local/lib/node_modules/npm/node_modules/iferr/index.js:13:50
17:41:26 npm ERR! typeerror at 
/usr/local/lib/node_modules/npm/lib/fetch-package-metadata.js:37:12
17:41:26 npm ERR! typeerror at addRequestedAndFinish 
(/usr/local/lib/node_modules/npm/lib/fetch-package-metadata.js:82:5)
17:41:26 npm ERR! typeerror at returnAndAddMetadata 
(/usr/local/lib/node_modules/npm/lib/fetch-package-metadata.js:117:7)
17:41:26 npm ERR! typeerror at pickVersionFromRegistryDocument 
(/usr/local/lib/node_modules/npm/lib/fetch-package-metadata.js:134:20)
17:41:26 npm ERR! typeerror at 
/usr/local/lib/node_modules/npm/node_modules/iferr/index.js:13:50
17:41:26 npm ERR! typeerror This is an error with npm itself. Please report 
this error at:
17:41:26 npm ERR! typeerror <http://github.com/npm/npm/issues>
17:41:26 
17:41:26 npm ERR! Please include the following file with any support 
request:
17:41:26 npm ERR! /src/npm-debug.log

TASK DETAIL
  https://phabricator.wikimedia.org/T242640

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Triaged] T242640: query/wikidata/gui jenkins build broken

2020-01-13 Thread dcausse
dcausse triaged this task as "High" priority.

TASK DETAIL
  https://phabricator.wikimedia.org/T242640

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T242640: query/wikidata/gui jenkins build broken

2020-01-13 Thread dcausse
dcausse added a comment.


  very similar to T242587 <https://phabricator.wikimedia.org/T242587>

TASK DETAIL
  https://phabricator.wikimedia.org/T242640

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T242453: wdqs1005 stopped to handle updates

2020-01-16 Thread dcausse
dcausse added a comment.


  The Icinga check showed `CHECK_NRPE STATE UNKNOWN: Socket timeout after 10 
seconds.` for `Query Service HTTP Port` and `NaN` for `WDQS high update lag`.
  
  We should probably alert in case of timeouts.
  
  Stack dumps from Blazegraph: P10185 <https://phabricator.wikimedia.org/P10185>
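  
  One way to read "alert in case of timeouts" is to treat a missing or `NaN` 
lag reading as a failure rather than as an unknown state. The snippet below is 
a hedged Python sketch of that rule only; it is not Icinga or NRPE 
configuration, and the `lag_state` name and the thresholds are assumptions.

```python
import math

OK, WARNING, CRITICAL = "OK", "WARNING", "CRITICAL"


def lag_state(lag_seconds, warn=1800.0, crit=3600.0):
    """Map an update-lag reading to a check state.

    A missing (None) or NaN reading is reported as CRITICAL instead of
    UNKNOWN, so a host that stopped reporting still raises an alert.
    The thresholds are hypothetical defaults.
    """
    if lag_seconds is None or math.isnan(lag_seconds):
        return CRITICAL
    if lag_seconds >= crit:
        return CRITICAL
    if lag_seconds >= warn:
        return WARNING
    return OK


print(lag_state(float("nan")))
print(lag_state(42.0))
```

  With such a rule, the `NaN` reported for `WDQS high update lag` during the 
deadlock would have been flagged instead of passing silently.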

TASK DETAIL
  https://phabricator.wikimedia.org/T242453

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Addshore, dcausse, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T243270: Test commons RDF dumps on sdcquery.wmflabs.org

2020-01-21 Thread dcausse
dcausse added projects: Wikidata-Query-Service, Discovery-Search (Current work).
Restricted Application added a project: Wikidata.

TASK DETAIL
  https://phabricator.wikimedia.org/T243270

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Created] T243292: Fix the munger to support commons RDF dump

2020-01-21 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  When trying to munge the dumps, the process filters out many triples, logging:
  
15:03:28.962 
[org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  
o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: 
s:http://commons.wikimedia.org/entity/statement/M51372-16FD5B4C-7B40-4FCC-984C-4DAA9A8D00CA
 p:http://wikiba.se/ontology#rank o:http://wikiba.se/ontology#NormalRank
15:03:28.962 
[org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  
o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: 
s:http://commons.wikimedia.org/entity/statement/M51372-16FD5B4C-7B40-4FCC-984C-4DAA9A8D00CA
 p:http://www.wikidata.org/prop/statement/P7482 
o:http://www.wikidata.org/entity/Q66458942
15:03:28.962 
[org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  
o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized subjects: 
[http://commons.wikimedia.org/entity/statement/M51376-4B8D8CD4-0783-433F-B0A2-1DD667F8FBAB]
 while processing http://commons.wikimedia.org/entity/M51376.  Expected only 
sitelinks and subjects starting with 
http://commons.wikimedia.org/wiki/Special:EntityData/ and 
[http://www.wikidata.org/entity/, http://commons.wikimedia.org/entity/]
15:03:28.962 
[org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  
o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: 
s:http://commons.wikimedia.org/entity/statement/M51376-4B8D8CD4-0783-433F-B0A2-1DD667F8FBAB
 p:http://www.w3.org/1999/02/22-rdf-syntax-ns#type 
o:http://wikiba.se/ontology#BestRank
15:03:28.962 
[org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  
o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: 
s:http://commons.wikimedia.org/entity/statement/M51376-4B8D8CD4-0783-433F-B0A2-1DD667F8FBAB
 p:http://wikiba.se/ontology#rank o:http://wikiba.se/ontology#NormalRank
15:03:28.962 
[org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  
o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: 
s:http://commons.wikimedia.org/entity/statement/M51376-4B8D8CD4-0783-433F-B0A2-1DD667F8FBAB
 p:http://www.wikidata.org/prop/statement/P7482 
o:http://www.wikidata.org/entity/Q66458942
15:03:28.962 
[org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  
o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized subjects: 
[http://commons.wikimedia.org/entity/statement/M51389-FE3B5391-E9F2-45E2-B353-84FD0ED8FDC8]
 while processing http://commons.wikimedia.org/entity/M51389.  Expected only 
sitelinks and subjects starting with 
http://commons.wikimedia.org/wiki/Special:EntityData/ and 
[http://www.wikidata.org/entity/, http://commons.wikimedia.org/entity/]
  
  The munger is run with the following options: `-w commons.wikimedia.org -U 
http://www.wikidata.org --commonsUri http://commons.wikimedia.org`.

TASK DETAIL
  https://phabricator.wikimedia.org/T243292

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Created] T243431: Grant more rights to wikidata/query/rdf for the group wikidata/query (similar to search)

2020-01-22 Thread dcausse
dcausse created this task.
dcausse added projects: Wikidata-Query-Service, Gerrit-Privilege-Requests.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  In order to use the mvn release plugin on `wikidata/query/service` we need 
special rights to the repo.
  
  For search <https://gerrit.wikimedia.org/r/admin/projects/search,access> we 
ALLOW:
  
  - Create Signed Tag
  - Create Annotated Tag
  - Push (no-force-push)

TASK DETAIL
  https://phabricator.wikimedia.org/T243431

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, Zbyszko, Mstyles, Gehel, dcausse, darthmon_wmde, DannyS712, 
Nandana, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, 
Mbch331, Legoktm, MarcoAurelio
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Updated] T243431: Grant more rights to wikidata/query/rdf for the group wikidata/query (similar to search)

2020-01-23 Thread dcausse
dcausse added a project: Release-Engineering-Team.

TASK DETAIL
  https://phabricator.wikimedia.org/T243431

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, Zbyszko, Mstyles, Gehel, dcausse, darthmon_wmde, DannyS712, 
Nandana, NebulousIris, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Liudvikas, 
Scott_WUaS, Jonas, Xmlizer, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles, Mbch331, Legoktm
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Retitled] T243431: Grant more rights to wikidata/query/rdf for the group wikidata-query (similar to search)

2020-01-23 Thread dcausse
dcausse renamed this task from "Grant more rights to wikidata/query/rdf for the 
group wikidata/query (similar to search)" to "Grant more rights to 
wikidata/query/rdf for the group wikidata-query (similar to search)".

TASK DETAIL
  https://phabricator.wikimedia.org/T243431

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, Zbyszko, Mstyles, Gehel, dcausse, darthmon_wmde, DannyS712, 
Nandana, NebulousIris, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Liudvikas, 
Scott_WUaS, Jonas, Xmlizer, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles, Mbch331, Legoktm
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Created] T244341: Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints

2020-02-05 Thread dcausse
dcausse created this task.
dcausse added projects: Wikidata, Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  The use of blank nodes always makes an update process a challenging operation 
(http://www.aidanhogan.com/docs/blank_nodes_jws.pdf). The use of blank nodes by 
wikibase is very limited, and thus I propose to remove them to simplify the WDQS 
update strategy.
  
  In wikibase we use blank nodes for two purposes:
  
  - denote an //unknown value// (originally discussed in T95441 
<https://phabricator.wikimedia.org/T95441>)
  - owl constraints of wdno property
  
  For the unknown value use-case we seem to only use the blank node as a way to 
//filter// such unknown values.
  For the OWL constraints it's unclear if they are actually used/useful.
  
  For unknown values I suggest:
  
wd:Q3 a wikibase:Item, wdunk:P2 .
wds:Q3-45abf5ca-4ebf-eb52-ca26-811152eb067c a wikibase:Statement, wdunk:P2 ;
wikibase:rank wikibase:NormalRank .
  
  A query like
  
SELECT ?human
WHERE {
?human wdt:P106 ?o
FILTER isBLANK(?o) .
}
  
  Would become
  
SELECT ?human
WHERE { ?human a wdunk:P106 }
  
  And
  
SELECT ?human
WHERE { ?human wdt:P106 ?o }
  
  Would now mean: //All entities with a known occupation//
  As opposed to //All entities with a known or unknown occupation//
  which should be written as:
  
SELECT ?human
WHERE { {?human wdt:P106 ?o} union {?human a wdunk:P106} }
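
  To make the proposed encoding concrete, here is a minimal Python sketch (not 
part of wikibase or the munger) that rewrites triples whose object is a blank 
node into the class-membership form suggested above. The tuple representation, 
the `_:` blank node convention and the `rewrite_unknown_values` helper are 
assumptions made for illustration only.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as prefixed names


def rewrite_unknown_values(triples: Iterable[Triple]) -> List[Triple]:
    """Replace `s wdt:Pxx _:blank` with `s a wdunk:Pxx`; keep other triples."""
    rewritten = []
    for s, p, o in triples:
        if p.startswith("wdt:") and o.startswith("_:"):
            # The unknown value becomes a type assertion on the subject.
            rewritten.append((s, "a", "wdunk:" + p[len("wdt:"):]))
        else:
            rewritten.append((s, p, o))
    return rewritten


example = rewrite_unknown_values([
    ("wd:Q3", "wdt:P2", "_:genid1"),
    ("wd:Q3", "wdt:P31", "wd:Q5"),
])
```

  With this encoding a plain `?human wdt:P106 ?o` pattern no longer matches 
unknown values, which is the behaviour change described above.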
  
  For OWL constraints I simply suggest to remove them or materialize the blank 
node.
  
wdno:P109 a owl:Class ;
owl:complementOf wdowl:P109 .

wdowl:P109 a owl:Restriction ;
owl:onProperty wdt:P109 ;
owl:someValuesFrom owl:Thing .
  
  This is a breaking change to the RDF dump format 
(https://www.mediawiki.org/w/index.php?title=Wikibase/Indexing/RDF_Dump_Format). 
If this is accepted, I suggest a transition period during which blank nodes 
would be kept and the use of //isBlank// in the query service could start 
emitting a deprecation warning.

TASK DETAIL
  https://phabricator.wikimedia.org/T244341

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T244341: Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints

2020-02-05 Thread dcausse
dcausse added a comment.


  In T244341#5852014 <https://phabricator.wikimedia.org/T244341#5852014>, 
@Lucas_Werkmeister_WMDE wrote:
  
  > If the problem is just the blank nodes themselves, why not use this new 
`wdunk:P2` in the same way, as in `wd:Q3 wdt:P2 wdunk:P2`? That’s still worse 
than the blank nodes (multiple “unknown value” statements collapse into one 
triple, just as is currently the case for “no value” statements), but at least 
it shouldn’t break as many queries.
  
  Yes, the problem is the blank nodes themselves, as there is no way to mutate 
the graph without querying it.
  I'm OK with your suggestion, but it makes two unrelated unknown values equal.
  
  Would something like
  
wd:Q2 wdt:P2 wdunk:Q2-6657d0b5-4aa4-b465-12ed-d1b8a04ef658
  
  be acceptable?
  
  This would be very similar to the previous approach using blank nodes.
  
  Distinct unknown values would not be collapsed. The drawback is that to 
extract unknown values one would have to rely on a URI prefix filter using 
`STRSTARTS`.
  
SELECT ?human
WHERE {
?human wdt:P106 ?o
FILTER isBLANK(?o) .
}
  
  would become
  
PREFIX wdunk: <http://www.wikidata.org/prop/unknown/> 

SELECT ?human
WHERE {
?human wdt:P106 ?o
FILTER STRSTARTS( STR(?o), 'http://www.wikidata.org/prop/unknown/' ) .
}
  
  Any other suggestions?
  Ideally I'd like to find a structure that does not require running 
filters.
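
  As a rough illustration of the statement-identifier approach, the Python 
sketch below builds one placeholder IRI per statement and applies the same 
prefix test that the `STRSTARTS` filter performs. The helper names are 
hypothetical; only the prefix and the statement identifiers come from the 
examples in this thread.

```python
# Prefix taken from the query above; the helper names are made up for this sketch.
UNKNOWN_PREFIX = "http://www.wikidata.org/prop/unknown/"


def placeholder_iri(statement_id: str) -> str:
    """Build a stable IRI for an unknown value from its statement identifier."""
    return UNKNOWN_PREFIX + statement_id


def is_unknown_value(term: str) -> bool:
    """Python equivalent of STRSTARTS(STR(?o), prefix)."""
    return term.startswith(UNKNOWN_PREFIX)


iri_a = placeholder_iri("Q2-6657d0b5-4aa4-b465-12ed-d1b8a04ef658")
iri_b = placeholder_iri("Q3-45abf5ca-4ebf-eb52-ca26-811152eb067c")
```

  Two statements yield two different IRIs, so unrelated unknown values stay 
distinct, at the cost of a string comparison for every candidate object.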

TASK DETAIL
  https://phabricator.wikimedia.org/T244341

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Lucas_Werkmeister_WMDE, Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Triaged] T221709: scap service restarts for WDQS are inconsistent

2020-02-06 Thread dcausse
dcausse triaged this task as "High" priority.

TASK DETAIL
  https://phabricator.wikimedia.org/T221709

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Gehel, Aklapper, Smalyshev, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, thcipriani, 
jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331, Jay8g, 
Krenair, fgiunchedi, jeremyb
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T244341: Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints

2020-02-06 Thread dcausse
dcausse added a comment.


  Yes, the issue with blank nodes is that they are not "reference-able", so 
point delete queries are impossible, and point deletes are what we want to 
achieve with the next gen updater.
  
  I did some tests and isBlank is a lot faster (I suppose because this 
information is inlined, as opposed to the IRI, which has to be fetched from its 
dictionary). So by materializing the unknown value with the statement 
identifier we risk encountering timeouts more frequently.
  
  So unless we have a third alternative we have two choices:
  
  - use a constant value: probably very fast but we now say: all unknown values 
are equal.
  - use the statement identifier: very close to the previous semantics but a lot 
slower
  
  I think I prefer the first approach you suggested: dealing with perf issues 
seems more annoying than a less precise graph.
  The use cases that I can think of that could be affected are:
  
  - queries based on equality: find entities which share the same value. Such 
queries will have to filter out explicitly the "unknown value"
  - queries based on the number of unknown values on a particular property? 
Examples would help here I think.
  - other use cases?

TASK DETAIL
  https://phabricator.wikimedia.org/T244341

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Lucas_Werkmeister_WMDE, Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Edited] T244341: Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints

2020-02-07 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T244341

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Lucas_Werkmeister_WMDE, Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Created] T244590: EPIC: Rework the WDQS updater as an event driven application

2020-02-07 Thread dcausse
dcausse created this task.
dcausse added projects: Epic, Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  The current merging strategy for applying updates requires sending all the 
entity data on every update. The goal of this task is to design a new updater 
that will be able to send the minimal number of triples to the RDF store to 
synchronize the graph with the state of wikibase.
  Note: this is a very rough plan and many details will probably change as 
implementation-specific requirements pop up.
  
  The proposed approach relies on a system able to do stateful computation over 
data streams (flink).
  F31553995: updater_v2.png <https://phabricator.wikimedia.org/F31553995>
  
  Based on a set of source event streams populated by mediawiki and change 
propagation the steps are:
  
  1. filter: filter events related to wikibase and its entities
  2. event time reordering: reorder the events and assemble them to a single 
partitioned stream
  3. rev state evaluation: determine what command needs to be applied to mutate 
the graph
- this step requires holding a state of previously seen revisions and other 
actions (e.g. visibility change)
- the output of this is a simple event without any data saying: do a diff 
between rev X and Y, fully delete entity QXYZ, ...
- the initial state will be populated using the revisions present in the 
RDF dump
- seen revisions (after a fresh import) will be easy to discard
  4. rdf diff generation: materialize the command, fetch the data from 
wikibase and send it over an RDF stream
- it's probable that in some cases (suppressed delete) the exact set of 
triples to be deleted will be unknown and thus will require a special delete 
command to be applied to the backend
  5. rdf import: The components reading this stream will be very similar to the 
current updater: a process running locally on the wdqs nodes pushing data to 
blazegraph
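
  Step 3 can be pictured with the following Python sketch. It is only an 
illustration of the idea, not the planned flink job: the `Command` and 
`RevisionState` names, the command kinds and the in-memory dictionary are all 
assumptions. The state maps each entity to its last seen revision and emits a 
data-free command for each incoming event.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Command:
    kind: str                      # "import", "diff" or "delete" (hypothetical names)
    entity: str
    from_rev: Optional[int] = None
    to_rev: Optional[int] = None


class RevisionState:
    """Keeps the last seen revision per entity, seeded from the dump."""

    def __init__(self, initial: Optional[Dict[str, int]] = None):
        self.last_seen: Dict[str, int] = dict(initial or {})

    def on_revision(self, entity: str, rev: int) -> Optional[Command]:
        previous = self.last_seen.get(entity)
        if previous is not None and rev <= previous:
            return None  # already covered by the dump or an earlier event
        self.last_seen[entity] = rev
        if previous is None:
            return Command("import", entity, to_rev=rev)
        return Command("diff", entity, from_rev=previous, to_rev=rev)

    def on_delete(self, entity: str) -> Command:
        self.last_seen.pop(entity, None)
        return Command("delete", entity)


state = RevisionState({"Q1": 10})  # revisions present in the RDF dump
```

  Revisions at or below the one recorded from the dump are dropped, which is 
how seen revisions after a fresh import can be discarded.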
  
  For the first iteration no cleanups will be performed: orphaned values and 
references will remain in the RDF store. This will be mitigated by more 
frequent reloads of the dump.
  Since such a system is prone to deviation, frequent reloads will be 
important. Note that the state of step 3 is tightly coupled with its dump, and 
thus we will have to instantiate a new stream per imported dump. In other words, 
a wdqs system imported using dump Y will have to consume the RDF stream 
generated from an initial state based on this same dump. This means that the 
RDF stream will be named after a particular dump instance.
  
  Note on event time reordering:
  
  - seems to be relatively easy in flink: e.g. 
https://github.com/ververica/flink-training-exercises/blob/master/src/main/java/com/ververica/flinktraining/examples/datastream_java/process/CarEventSort.java
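
  For readers unfamiliar with the technique, here is a plain Python analogy of 
event time reordering. It does not use the flink API; the `reorder` function 
and the `max_delay` parameter are assumptions for this sketch. Events are 
buffered in a heap and released once a watermark (highest event time seen so 
far minus the allowed delay) has passed them.

```python
import heapq
from typing import Iterable, Iterator, List, Optional, Tuple

Event = Tuple[int, str]  # (event_time, payload)


def reorder(events: Iterable[Event], max_delay: int) -> Iterator[Event]:
    """Yield events sorted by event time, tolerating lateness up to max_delay."""
    heap: List[Event] = []
    watermark: Optional[int] = None
    for event in events:
        heapq.heappush(heap, event)
        candidate = event[0] - max_delay
        watermark = candidate if watermark is None else max(watermark, candidate)
        # Release everything that can no longer be preceded by a late event.
        while heap and heap[0][0] <= watermark:
            yield heapq.heappop(heap)
    # End of input: flush whatever is still buffered, in order.
    while heap:
        yield heapq.heappop(heap)


ordered = list(reorder([(1, "a"), (3, "c"), (2, "b"), (6, "e"), (5, "d")], max_delay=2))
```

  An event arriving later than `max_delay` after the watermark would still be 
emitted out of order, so the delay has to be chosen with the real lag of the 
source streams in mind.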
  
  Note on state management:
  
  - RocksDB offers incremental checkpoints and seems to support high 
cardinality quite well 
(https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/state_backends.html#the-rocksdbstatebackend). 
The operation where we need a large state seems to be partitionable, and thus 
the state can be split into multiple buckets.
  
  Note on initial state:
  
  - seems to be allowed by flink using its state-processor-api: 
https://flink.apache.org/feature/2019/09/13/state-processor-api.html

TASK DETAIL
  https://phabricator.wikimedia.org/T244590

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, Zbyszko, Gehel, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Dinoguy1000, Manybubbles, 
Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T244341: Wikibase RDF dump: stop using blank nodes for encoding unknown values and OWL constraints

2020-02-17 Thread dcausse
dcausse added a comment.


  Thanks for all the feedback.
  I'll discard the "constant" option.
  
  A note on the motivations:
  We plan to redesign the update process as a set of trivial mutations to the 
graph. As far as I can see, updating a graph with blank nodes cannot be a 
"trivial operation"; citing
  http://www.aidanhogan.com/docs/blank_nodes_jws.pdf (page 10 //Issues with 
blank nodes//):
  
  > Given a fixed, serialised RDF graph (i.e., a document), labelling of blank 
nodes can vary across parsers and across time. Checking if two representations 
originate from the same data thus often requires an isomorphism check, for 
which in general, no polynomial algorithms are known.
  
  By making some assumptions on the wikibase RDF model I believe that 
generating a diff between two entity revisions should be relatively easy even 
if blank nodes are involved. The problem arises when applying this diff to the 
RDF backend: if it involves blank nodes, it cannot be a set of trivial 
mutations (here trivial means using `INSERT|DELETE DATA` statements). E.g. if 
the diff indicates that we need to remove:
  
wd:Q2 wdt:P576 _:genid1
  
  because `DELETE DATA` is not possible with blank nodes we have to send 
something like
  
DELETE { ?s ?p ?o }
WHERE {
  wd:Q2 wdt:P576 ?o .
  FILTER(isBlank(?o))
  ?s ?p ?o
}
  
  Which will delete all blank nodes attached to `wd:Q2` by `wdt:P576`. I 
haven't checked, but I hope that at most one blank node can be attached to the 
same subject/predicate; if not, this makes the sync algorithm a bit more complex.
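
  The notion of "trivial mutations" can be sketched in Python as follows. This 
is an illustration under simplifying assumptions, not the updater's code: 
triples are tuples of prefixed names, blank nodes are written `_:label`, and 
the `diff_to_update` helper is hypothetical. It turns a set difference into 
`DELETE DATA` / `INSERT DATA` statements and refuses any deletion that mentions 
a blank node, since `DELETE DATA` does not permit them.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


def is_blank(term: str) -> bool:
    return term.startswith("_:")


def render(triples: List[Triple]) -> str:
    return " ".join(f"{s} {p} {o} ." for s, p, o in triples)


def diff_to_update(old: Set[Triple], new: Set[Triple]) -> str:
    """Express the change from `old` to `new` as DELETE DATA / INSERT DATA."""
    removed = sorted(old - new)
    added = sorted(new - old)
    if any(is_blank(term) for triple in removed for term in triple):
        # A blank node cannot be addressed in DELETE DATA; a query is needed.
        raise ValueError("DELETE DATA cannot reference blank nodes")
    parts = []
    if removed:
        parts.append("DELETE DATA { " + render(removed) + " }")
    if added:
        parts.append("INSERT DATA { " + render(added) + " }")
    return " ;\n".join(parts)


update = diff_to_update(
    {("wd:Q2", "wdt:P31", "wd:Q5")},
    {("wd:Q2", "wdt:P31", "wd:Q6")},
)
```

  Statements produced this way can be concatenated and sent in one request, 
which is what stops being possible once a blank node shows up in the removed 
set.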

TASK DETAIL
  https://phabricator.wikimedia.org/T244341

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Jheald, Daniel_Mietchen, mkroetzsch, Denny, Lucas_Werkmeister_WMDE, 
Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Retitled] T244341: Wikibase RDF dump: stop using blank nodes for encoding SomeValue and OWL constraints

2020-02-17 Thread dcausse
dcausse renamed this task from "Wikibase RDF dump: stop using blank nodes for 
encoding unknown values and OWL constraints" to "Wikibase RDF dump: stop using 
blank nodes for encoding SomeValue and OWL constraints".
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T244341

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Jheald, Daniel_Mietchen, mkroetzsch, Denny, Lucas_Werkmeister_WMDE, 
Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Updated] T203397: Provide more useful redirect for statement nodes (wds:…)

2020-02-17 Thread dcausse
dcausse added a project: Discovery-Search (Current work).
dcausse added a comment.


  @Lea_Lacroix_WMDE no, we just need to deploy it, sorry for the delay.

TASK DETAIL
  https://phabricator.wikimedia.org/T203397

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Lea_Lacroix_WMDE, Lucas_Werkmeister_WMDE, Aklapper, Beast1978, 
Un1tY, Hook696, Daryl-TTMG, RomaAmorRoma, 0010318400, E.S.A-Sheild, 
darthmon_wmde, Meekrab2012, joker88john, CucyNoiD, Nandana, NebulousIris, 
Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, 
Lahi, Gq86, Af420, Darkminds3113, Bsandipan, Lordiis, GoranSMilovanovic, 
Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, EBjune, LawExplorer, WSH1906, 
Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T196165: Commons image: when pasting the exact title, get the correct file first in the suggester

2020-02-17 Thread dcausse
dcausse added a comment.


  I believe that because the file name has many words, the score on the 
tokenized text fields is very high (since we sum all token scores). The exact 
match has only one word, so despite its high weight its score is not enough to 
make up for the loss of the text matches, which are discarded because of the 
negation.
  
  In general I suggest using autocomplete APIs (opensearch/prefixsearch) for 
type-ahead searches: this is faster and the list of results does not change 
unexpectedly as you type. What's done in the mobile app is a two-step search: 
first send a prefixsearch, then a fulltext search if no results are found.
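
  The two-step search can be sketched like this in Python. The `suggest` 
function and the two callables are hypothetical stand-ins for whatever client 
code actually calls the prefixsearch and fulltext APIs; no real HTTP request is 
made here.

```python
from typing import Callable, List

SearchFn = Callable[[str], List[str]]


def suggest(query: str, prefix_search: SearchFn, fulltext_search: SearchFn) -> List[str]:
    """Try the completion API first and fall back to fulltext when it is empty."""
    results = prefix_search(query)
    if results:
        return results
    return fulltext_search(query)


# Fake backends used only to exercise the fallback logic.
titles = ["File:Example one.jpg", "File:Example two.jpg"]


def fake_prefix(query: str) -> List[str]:
    return [t for t in titles if t.startswith(query)]


def fake_fulltext(query: str) -> List[str]:
    return [t for t in titles if query.lower() in t.lower()]
```

  The fulltext backend is only reached when the completion step returns 
nothing, so results for a correctly typed prefix stay stable while typing.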
  
  When using the fulltext search (list=search), if the user is not aware that 
it's using the fulltext engine, the UI should escape the search syntax; 
otherwise some chars may trigger a special syntax (negation in this case).
  
  The proper way to fix this issue is imo to:
  
  - use a completion API + fulltext search fallback
  - escape the fulltext search syntax from the UI: AND, OR, NOT, ||, &&, -, !, 
", :, \?, *

TASK DETAIL
  https://phabricator.wikimedia.org/T196165

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Silvan_WMDE, dcausse
Cc: hoo, EBernhardson, TJones, dcausse, Ladsgroup, Silvan_WMDE, Addshore, 
Bencemac, Aklapper, Ayack, Liuxinyu970226, Smalyshev, Lydia_Pintscher, 
Lea_Lacroix_WMDE, Beast1978, Un1tY, Hook696, Daryl-TTMG, RomaAmorRoma, 
0010318400, E.S.A-Sheild, Iflorez, darthmon_wmde, alaa_wmde, Meekrab2012, 
joker88john, CucyNoiD, Nandana, NebulousIris, Gaboe420, Versusxo, 
Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, Gq86, Af420, 
Darkminds3113, Bsandipan, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, 
Ramalepe, Liugev6, QZanden, LawExplorer, WSH1906, Lewizho99, Maathavan, 
_jensen, rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T244341: Wikibase RDF dump: stop using blank nodes for encoding SomeValue and OWL constraints

2020-02-18 Thread dcausse
dcausse added a comment.


  In T244341#5890517 <https://phabricator.wikimedia.org/T244341#5890517>, 
@Lucas_Werkmeister_WMDE wrote:
  
  >> I haven't checked but I hope that at most one blank node can be attached 
to the same subject/predicate, if not this makes the sync algorithm a bit more 
complex.
  >
  > At least currently, this is not the case. I added a second “partner: 
unknown value” statement to the sandbox item 
<https://www.wikidata.org/wiki/Q4115189>, and now wd:Q4115189 wdt:P451 ?v 
<https://query.wikidata.org/#SELECT%20%2a%20%7B%20wd%3AQ4115189%20wdt%3AP451%20%3Fv.%20%7D>
 produces two blank nodes as result.
  
  Thanks for checking. This makes the diff process and the update query a bit 
more complex, as we now need to track the number of blank nodes attached to a 
particular subject/predicate. As for the update query I believe this is still 
possible with:
  
DELETE { ?s ?p ?o }
WHERE {
  SELECT ?s ?p ?o {
wd:Q4115189 wdt:P451 ?o .
FILTER(isBlank(?o))
?s ?p ?o
  } LIMIT 1 # number of blank nodes to delete
}
  
  But overall this makes updating a triple with a blank node a completely 
separate operation that cannot be batched with statements like `INSERT DATA` 
or `DELETE DATA`.
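
  Tracking the number of blank nodes per subject/predicate could look like the 
Python sketch below. It is an assumption-laden illustration (tuples of 
prefixed names, `_:` labels, hypothetical helper names): because blank node 
labels are not stable between two serialisations, only the counts are 
compared.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]


def blank_counts(triples: Iterable[Triple]) -> Counter:
    """Count blank node objects per (subject, predicate) pair."""
    return Counter((s, p) for s, p, o in triples if o.startswith("_:"))


def blank_deletions(old: Iterable[Triple], new: Iterable[Triple]) -> Dict[Tuple[str, str], int]:
    """How many blank nodes have to be removed for each (subject, predicate)."""
    before, after = blank_counts(old), blank_counts(new)
    return {key: before[key] - after[key] for key in before if before[key] > after[key]}


old_rev = [
    ("wd:Q4115189", "wdt:P451", "_:b1"),
    ("wd:Q4115189", "wdt:P451", "_:b2"),
]
new_rev = [("wd:Q4115189", "wdt:P451", "_:x")]  # label differs, count drops by one
to_delete = blank_deletions(old_rev, new_rev)
```

  Each entry of the result would translate into one separate query-based 
delete, which is the extra bookkeeping described above.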
  
  > Once we stop using blank nodes for OWL constraints, though, I believe you 
can at least assume that blank nodes are never the subject of a triple – would 
that help? (I feel like this ought to eliminate the need for a full isomorphism 
check from your quote.)
  
  Indeed, this and the fact that for SomeValue all blank nodes are unique: even 
the same "SomeValue" statement used as wdt and ps currently yields different 
blank nodes 
<https://query.wikidata.org/#SELECT%20%2a%20%7B%0A%20%20%7B%20wd%3AQ4115189%20wdt%3AP451%20%3Fv.%20%7D%0A%20%20UNION%0A%20%20%7B%0A%20%20%20%20wd%3AQ4115189%20p%3AP451%20%3Fs%20.%0A%20%20%20%20%3Fs%20ps%3AP451%20%3Fv%0A%20%20%7D%0A%7D>.
 
  From the point of view of a "simple diff operation" this is a fortunate 
situation, as it makes the update process simpler in the scenario where we 
decline this task and stick with blank nodes. If we decide to move forward 
with IRI placeholders, the objects of the wdt and ps predicates of the same 
statement will become identical for SomeValue.

TASK DETAIL
  https://phabricator.wikimedia.org/T244341

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Jheald, Daniel_Mietchen, mkroetzsch, Denny, Lucas_Werkmeister_WMDE, 
Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T244590: EPIC: Rework the WDQS updater as an event driven application

2020-02-18 Thread dcausse
dcausse added a comment.


  In T244590#5893018 <https://phabricator.wikimedia.org/T244590#5893018>, 
@Ottomata wrote:
  
  > COOL! :)
  >
  >> it's important to note that the state of step 3 is tightly coupled with 
its dump and thus we will have to instantiate a new stream per imported dump. 
In other words a wdqs system imported using dump Y will have to consume the RDF 
stream generated from an initial state based on this same dump. This means that 
the RDF stream will be named against a particular dump instance.
  >
  > Hm.  Would it be possible instead to lambda architecture this part?  
Instead of having to reload from a full dump and then recreate a new stream, 
could accomplish the same cleanups by backfilling from a batch job in Hadoop?  
I'm not sure I fully understand the 'cleanups' here.  Are they not do-able with 
the stream because events representing some of the state changes don't exist 
(yet)?
  
  Yes, I hope that in the future, once the stream has been stabilized, 
reloading the system might become less necessary and that a fresh and 
consistent dump can be reconstructed (daily?) using the stream itself.
  Reloading from the dump generated by MW is something we need anyway in order 
to bootstrap the system, and at the beginning it will be needed to address:
  
  - bug fixes (bug where the data is simply lost)
  - lost events (undetected failures or bugs in MW)
  - cleanup
  
  The cleanup operation mentioned here is a sort of "garbage collection": to 
simplify, we need to detect unused resources (subgraphs) in the graph, and the 
stream itself does not know this unless we keep another large state doing 
reference counting.
  The solution proposed here is to simply spawn a new system from time to time 
(the dump generated by MW is clean) so that we do cleanup and fix lost events 
at the same time. I agree with you that this is not ideal, and leveraging more 
batch jobs and/or more state in the stream will help minimize the need to do a 
full reload.
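
  To give an idea of what the extra "reference counting" state would involve, 
here is a minimal Python sketch. The `RefCounter` class and the node names are 
hypothetical; a real implementation would have to keep this state for every 
shared value and reference node in the graph.

```python
from collections import Counter


class RefCounter:
    """Counts how many statements point at each shared node."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def add(self, node: str) -> None:
        self.counts[node] += 1

    def remove(self, node: str) -> bool:
        """Drop one reference; return True when the node became orphaned."""
        self.counts[node] -= 1
        if self.counts[node] <= 0:
            del self.counts[node]
            return True
        return False


refs = RefCounter()
refs.add("wdref:a1")  # used by a first statement
refs.add("wdref:a1")  # and by a second one
first = refs.remove("wdref:a1")   # still referenced once
second = refs.remove("wdref:a1")  # last reference gone, node can be deleted
```

  Only when `remove` reports an orphan would the corresponding subgraph be 
deleted from the store.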

TASK DETAIL
  https://phabricator.wikimedia.org/T244590

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Ottomata, JAllemandou, Aklapper, Zbyszko, Gehel, dcausse, darthmon_wmde, 
Nandana, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Dinoguy1000, 
Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T244341: Wikibase RDF dump: stop using blank nodes for encoding SomeValue and OWL constraints

2020-02-18 Thread dcausse
dcausse added a comment.


  To move this forward I propose the following plan:
  
  1. add a `wikibase:isSomeValue` custom function configurable to work as a 
proxy to `isBlank()` or  `STRSTARTS( STR(?o), 
'http://www.wikidata.org/prop/somevalue/' )` and announce it
  2. instead of changing the RDF representation generated by wikibase, add a 
new option to the updater/munger to transform (on the fly) blank nodes into 
IRI placeholders
  3. setup a test instance of the query service using this proposal and ask for 
feedback
  4. if no major blockers are encountered we can announce that the RDF 
representation is about to change
  5. start emitting deprecation warnings when seeing `isBlank`
  6. after a deprecation period activate placeholder IRIs everywhere
  7. change the wikibase RDF representation

TASK DETAIL
  https://phabricator.wikimedia.org/T244341

To: dcausse
Cc: Jheald, Daniel_Mietchen, mkroetzsch, Denny, Lucas_Werkmeister_WMDE, 
Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Created] T245533: Add a custom wikibase:isSomeValue() function

2020-02-18 Thread dcausse
dcausse created this task.
dcausse added projects: Wikidata, Wikidata-Query-Service.

TASK DESCRIPTION
  In order to allow a "smooth" transition from blank nodes to IRI placeholders 
the `wikibase:isSomeValue` function will be added to the set of custom 
functions offered by the //query service//.
  A new option, read at Blazegraph startup, will instruct this function to 
behave like either:
  
  - `isBlank()`
  - or `STRSTARTS( STR(?o), 'http://www.wikidata.org/prop/somevalue/' )`
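
  As a rough illustration of the two behaviours listed above, here is a small 
Python sketch (not the Blazegraph implementation). The mode names `"blank"` and 
`"skolem"`, the function `make_is_some_value` and the representation of RDF 
terms as plain strings are all assumptions made for the example.

```python
SOMEVALUE_PREFIX = "http://www.wikidata.org/prop/somevalue/"


def make_is_some_value(mode):
    """Return a predicate on RDF terms, chosen once at startup.

    Terms are modelled as strings: blank nodes look like "_:b1",
    IRIs are given in full. Both conventions are assumptions.
    """
    if mode == "blank":
        # behaves like isBlank(?o)
        return lambda term: term.startswith("_:")
    if mode == "skolem":
        # behaves like STRSTARTS(STR(?o), '<somevalue prefix>')
        return lambda term: term.startswith(SOMEVALUE_PREFIX)
    raise ValueError("unknown mode: " + mode)


is_some_value = make_is_some_value("skolem")
print(is_some_value(SOMEVALUE_PREFIX + "Q2-6657d0b5"))
```

  Queries written against the custom function would keep working when the 
option is flipped, which is the point of the "smooth" transition.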

TASK DETAIL
  https://phabricator.wikimedia.org/T245533

To: dcausse
Cc: Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Created] T245541: Add a new munge option to do blank node skolemization

2020-02-18 Thread dcausse
dcausse created this task.
dcausse added projects: Wikidata, Wikidata-Query-Service.

TASK DESCRIPTION
  This munge option will transform all blank nodes into placeholder IRIs using 
the following rules:
  
wdno:P109 a owl:Class ;
owl:complementOf _:1 .

_:1 a owl:Restriction ;
owl:onProperty wdt:P109 ;
owl:someValuesFrom owl:Thing .
  
  to:
  
wdno:P109 a owl:Class ;
owl:complementOf wdowl:P109 .

wdowl:P109 a owl:Restriction ;
owl:onProperty wdt:P109 ;
owl:someValuesFrom owl:Thing .
  
  Introducing a new prefix:
  
@prefix wdowl: <http://www.wikidata.org/owl/> .
  
  -
  
wd:Q2 wdt:P576 _:genid1 ;
 p:P576 s:Q2-6657d0b5-4aa4-b465-12ed-d1b8a04ef658 .

s:Q2-6657d0b5-4aa4-b465-12ed-d1b8a04ef658 a wikibase:Statement,
wikibase:BestRank ;
wikibase:rank wikibase:NormalRank ;
ps:P576 _:genid2 ;
pq:P805 wd:Q2003654 .
  
  to
  
wd:Q2 wdt:P576 wdsome:Q2-6657d0b5-4aa4-b465-12ed-d1b8a04ef658 ;
 p:P576 s:Q2-6657d0b5-4aa4-b465-12ed-d1b8a04ef658 .

s:Q2-6657d0b5-4aa4-b465-12ed-d1b8a04ef658 a wikibase:Statement,
wikibase:BestRank ;
wikibase:rank wikibase:NormalRank ;
ps:P576 wdsome:Q2-6657d0b5-4aa4-b465-12ed-d1b8a04ef658 ;
pq:P805 wd:Q2003654 .
  
  introducing a new prefix:
  
@prefix wdsome: <http://www.wikidata.org/prop/somevalue/> .
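
  The two rewriting rules above can be sketched in Python as follows. This is 
only an illustration: triples are modelled as tuples of prefixed-name strings, 
the function name `skolemize` is made up, and the truthy (`wdt:`) blank node is 
deliberately left untouched because linking it back to its statement needs more 
context than a single triple provides.

```python
def skolemize(triples):
    """Rewrite blank nodes into placeholder IRIs (illustrative only).

    `triples` is a list of (subject, predicate, object) strings using
    prefixed names, with blank nodes written as "_:label".
    """
    mapping = {}
    for s, p, o in triples:
        if not o.startswith("_:"):
            continue
        if s.startswith("wdno:") and p == "owl:complementOf":
            # OWL restriction node: reuse the property id
            mapping[o] = "wdowl:" + s[len("wdno:"):]
        elif s.startswith("s:") and p.startswith("ps:"):
            # statement value: reuse the statement id
            mapping[o] = "wdsome:" + s[len("s:"):]
    # apply the mapping to subjects and objects; unmapped terms are kept
    return [(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in triples]


example = [
    ("wdno:P109", "owl:complementOf", "_:1"),
    ("_:1", "a", "owl:Restriction"),
]
for triple in skolemize(example):
    print(triple)
```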
  
  Question: https://www.w3.org/2011/rdf-wg/wiki/Skolemisation mentions using 
//well known// IRIs (rfc5785 <https://tools.ietf.org/html/rfc5785>), but since 
this proposal is not finished (and unlikely ever to be finished? It has been 
stalled since 2011), I wonder if we should follow it.
  The placeholder name has not been decided, but arbitrarily choosing 
//bnode//, the placeholder IRIs would become:
  
  
`http://www.wikidata.org/.well-known/bnode/Q2-6657d0b5-4aa4-b465-12ed-d1b8a04ef658`

TASK DETAIL
  https://phabricator.wikimedia.org/T245541

To: dcausse
Cc: Aklapper, Lucas_Werkmeister_WMDE, mkroetzsch, Daniel_Mietchen, Jheald, 
dcausse, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, 
Mbch331


[Wikidata-bugs] [Maniphest] [Commented On] T244341: Wikibase RDF dump: stop using blank nodes for encoding SomeValue and OWL constraints

2020-02-18 Thread dcausse
dcausse added a comment.


  In T244341#5893723 <https://phabricator.wikimedia.org/T244341#5893723>, 
@Lucas_Werkmeister_WMDE wrote:
  
  > Well, I’d like to see what the IRIs for unknown value in qualifiers and 
references look like before we move ahead with this plan.
  
  Sure, I tried to add some but could not find my way in the UI. Could you try 
to update the sandbox item so that we can have a look?

TASK DETAIL
  https://phabricator.wikimedia.org/T244341

To: dcausse
Cc: Jheald, Daniel_Mietchen, mkroetzsch, Denny, Lucas_Werkmeister_WMDE, 
Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Edited] T245533: Add a custom wikibase:isSomeValue() function

2020-02-18 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T245533

To: dcausse
Cc: Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Edited] T245533: Add a custom function to identify wikibase "somevalue"

2020-02-19 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T245533

To: dcausse
Cc: Lucas_Werkmeister_WMDE, Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Retitled] T245533: Add a custom function to identify wikibase "somevalue"

2020-02-19 Thread dcausse
dcausse renamed this task from "Add a custom wikibase:isSomeValue() function" 
to "Add a custom function to identify wikibase "somevalue"".
dcausse added a subscriber: Lucas_Werkmeister_WMDE.
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T245533

To: dcausse
Cc: Lucas_Werkmeister_WMDE, Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Smalyshev, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Changed Project Column] T239687: Rework how value and reference changes are handled

2020-02-19 Thread dcausse
dcausse moved this task from In Progress to Done on the Discovery-Search 
(Current work) board.
dcausse added a comment.


  The munger has been reworked so that it does not deal with this cleanup. The 
next gen updater will address this cleanup in a different way. For the current 
updater, one thing to keep in mind is that the ref cleanup was disabled some 
time ago (while investigating T194325 
<https://phabricator.wikimedia.org/T194325>: 
https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/437362) and never 
re-enabled since then. We could imagine disabling the values cleanup as well; 
this could give us some room with the current updater.

TASK DETAIL
  https://phabricator.wikimedia.org/T239687

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

To: dcausse
Cc: Daniel_Mietchen, Lucas_Werkmeister_WMDE, dcausse, Aklapper, darthmon_wmde, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Smalyshev, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T241125: Import wikidata RDF dump to hadoop

2020-02-19 Thread dcausse
dcausse added a project: Discovery-Search (Current work).

TASK DETAIL
  https://phabricator.wikimedia.org/T241125

To: dcausse
Cc: Daniel_Mietchen, Aklapper, dcausse, JAllemandou, darthmon_wmde, Nandana, 
Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Changed Project Column] T239908: Extract more metrics from blazegraph sparql update response

2020-02-19 Thread dcausse
dcausse moved this task from To Be Deployed to Done on the Discovery-Search 
(Current work) board.
dcausse added a comment.


  Dashboard created here: 
https://grafana.wikimedia.org/d/dSksY08Zk/wikidata-query-service-updater?orgId=1

TASK DETAIL
  https://phabricator.wikimedia.org/T239908

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

To: Zbyszko, dcausse
Cc: Daniel_Mietchen, Zbyszko, Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Claimed] T203397: Provide more useful redirect for statement nodes (wds:…)

2020-02-19 Thread dcausse
dcausse claimed this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T203397

To: dcausse
Cc: dcausse, Lea_Lacroix_WMDE, Lucas_Werkmeister_WMDE, Aklapper, darthmon_wmde, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] [Merged] T244590: EPIC: Rework the WDQS updater as an event driven application

2020-02-19 Thread dcausse
dcausse merged a task: T229544: Create RDF diff for WDQS updating.
dcausse added subscribers: Smalyshev, Iamamz3.

TASK DETAIL
  https://phabricator.wikimedia.org/T244590

To: dcausse
Cc: Iamamz3, Smalyshev, Ottomata, JAllemandou, Aklapper, Zbyszko, Gehel, 
dcausse, darthmon_wmde, Nandana, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Dinoguy1000, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T229544: Create RDF diff for WDQS updating

2020-02-19 Thread dcausse
dcausse closed this task as a duplicate of T244590: EPIC: Rework the WDQS 
updater as an event driven application.

TASK DETAIL
  https://phabricator.wikimedia.org/T229544

To: dcausse
Cc: Iamamz3, Aklapper, Smalyshev, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T244341: Wikibase RDF dump: stop using blank nodes for encoding SomeValue and OWL constraints

2020-02-19 Thread dcausse
dcausse added a comment.


  @Lucas_Werkmeister_WMDE thanks!
  
  Indeed this becomes a bit more challenging, as the statement identifier alone 
cannot be used to identify a bnode under a particular statement. I'll continue 
discussing this specific issue in T245541 
<https://phabricator.wikimedia.org/T245541> to limit noise on this ticket.
  
  @Jheald about blank node usage: in T239414 
<https://phabricator.wikimedia.org/T239414> we investigated how blank nodes are 
currently used and extracted some numbers here: P9859 
<https://phabricator.wikimedia.org/P9859> (count per predicate where a blank 
node is used as an object).
  
  Sadly such counts won't be faster using this newly proposed approach.

TASK DETAIL
  https://phabricator.wikimedia.org/T244341

To: dcausse
Cc: Jheald, Daniel_Mietchen, mkroetzsch, Denny, Lucas_Werkmeister_WMDE, 
Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Retitled] T231515: Duplicate blank nodes on edited properties

2020-02-20 Thread dcausse
dcausse renamed this task from "Duplicate wdno: clauses on edited properties" 
to "Duplicate blank nodes on edited properties".
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T231515

To: dcausse
Cc: Igorkim78, Gehel, Aklapper, Smalyshev, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T245541: Add a new munge option to do blank node skolemization

2020-02-20 Thread dcausse
dcausse added a comment.


  In 
https://www.wikidata.org/wiki/Q4115189#Q4115189$7d68afee-408d-1c1e-946b-43d8d37a17b5
 @Lucas_Werkmeister_WMDE added more "somevalue" snaks (in references and 
qualifiers), which produces the following graph:
  
wd:Q4115189 p:P370 s:Q4115189-7d68afee-408d-1c1e-946b-43d8d37a17b5 .

s:Q4115189-7d68afee-408d-1c1e-946b-43d8d37a17b5 a wikibase:Statement,
wikibase:BestRank ;
wikibase:rank wikibase:NormalRank ;
ps:P370 _:genid6 ;
pq:P2315 "this is a demo for T244341, if possible please don’t remove it before, say, 2020-02-26 :)"@en ;
pq:P370 _:genid7 ;
pq:P1106 _:genid8 ;
prov:wasDerivedFrom ref:6c8b1cd1c3cd814ab99e3c40580f12024ceff994 .

ref:6c8b1cd1c3cd814ab99e3c40580f12024ceff994 a wikibase:Reference ;
pr:P370 _:genid9 ;
pr:P855 _:genid10 .
  
  //A first observation is that our current update strategy is not able to do a 
clean change on this entity: existing blank nodes are leaked (updated T231515 
<https://phabricator.wikimedia.org/T231515>).//
  
  The proposed solution for encoding bnodes, as currently stated, does not work 
well as it will conflate all bnodes attached to a statement.
  
  One obvious solution would be to encode more information into this made-up 
IRI by prefixing/suffixing the predicate:
  
wd:Q4115189 p:P370 s:Q4115189-7d68afee-408d-1c1e-946b-43d8d37a17b5 .

s:Q4115189-7d68afee-408d-1c1e-946b-43d8d37a17b5 a wikibase:Statement,
wikibase:BestRank ;
wikibase:rank wikibase:NormalRank ;
ps:P370 wdsome:Q4115189-7d68afee-408d-1c1e-946b-43d8d37a17b5-PS-P370 ;
pq:P2315 "this is a demo for T244341, if possible please don’t remove it before, say, 2020-02-26 :)"@en ;
pq:P370 wdsome:Q4115189-7d68afee-408d-1c1e-946b-43d8d37a17b5-PQ-P370 ;
pq:P1106 wdsome:Q4115189-7d68afee-408d-1c1e-946b-43d8d37a17b5-PQ-P1106 ;
prov:wasDerivedFrom ref:6c8b1cd1c3cd814ab99e3c40580f12024ceff994 .

ref:6c8b1cd1c3cd814ab99e3c40580f12024ceff994 a wikibase:Reference ;
pr:P370 wdsome:ref-6c8b1cd1c3cd814ab99e3c40580f12024ceff994-PR-P370 ;
pr:P855 wdsome:ref-6c8b1cd1c3cd814ab99e3c40580f12024ceff994-PR-P855 .
  
  This is a bit ugly, but it would ensure the uniqueness of the IRIs. I'm also 
not a big fan of propagating information into IDs, as I'm afraid that some 
process may want to make assumptions about the structure of the ID itself. 
Here the only things we want to encode are:
  
  - uniqueness of the node
  - a common IRI prefix to detect that these are skolem IRIs.
  
  I wonder if we could not simply hash things. 
`wdsome:Q4115189-7d68afee-408d-1c1e-946b-43d8d37a17b5-PS-P370` would become 
`wdsome:e81da6d67fa0cbf0e1daf440c31cf138ffe565c8`
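
  A minimal Python sketch of this hashing idea follows. It assumes SHA-1 over 
the UTF-8 bytes of the readable ID; the choice of hash function, the exact 
string being hashed and the helper names are assumptions, and the digest it 
produces is not claimed to match the example value above.

```python
import hashlib

WDSOME = "http://www.wikidata.org/prop/somevalue/"


def readable_id(owner_id, role, prop):
    """Build the predicate-qualified ID, e.g. '<statement id>-PS-P370'."""
    return f"{owner_id}-{role}-{prop}"


def hashed_iri(owner_id, role, prop):
    """Return an opaque placeholder IRI under the common wdsome prefix.

    Hashing hides the structure of the ID while keeping the IRI
    deterministic and (barring collisions) unique per node.
    """
    raw = readable_id(owner_id, role, prop).encode("utf-8")
    return WDSOME + hashlib.sha1(raw).hexdigest()


stmt = "Q4115189-7d68afee-408d-1c1e-946b-43d8d37a17b5"
print(hashed_iri(stmt, "PS", "P370"))
print(hashed_iri(stmt, "PQ", "P370"))
```

  The two printed IRIs differ, so the main value and the qualifier of the same 
statement are no longer conflated, and consumers can only rely on the common 
prefix.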

TASK DETAIL
  https://phabricator.wikimedia.org/T245541

To: dcausse
Cc: Aklapper, Lucas_Werkmeister_WMDE, mkroetzsch, Daniel_Mietchen, Jheald, 
dcausse, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, 
Mbch331


[Wikidata-bugs] [Maniphest] [Created] T245727: Create a streaming-updater submodule under query/wikidata/rdf

2020-02-20 Thread dcausse
dcausse created this task.
dcausse added projects: Epic, Wikidata-Query-Service, Wikidata.

TASK DESCRIPTION
  Using Flink and Scala, ideally with a small test case.

TASK DETAIL
  https://phabricator.wikimedia.org/T245727

To: Zbyszko, dcausse
Cc: Gehel, Zbyszko, Aklapper, JAllemandou, Ottomata, Smalyshev, Iamamz3, 
dcausse, darthmon_wmde, Nandana, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Dinoguy1000, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Created] T245728: Add a component to generate a diff between two entity revisions

2020-02-20 Thread dcausse
dcausse created this task.
dcausse added projects: Wikidata-Query-Service, Wikidata.

TASK DESCRIPTION
  This component will take the list of triples of an entity at revisions X and 
Y and generate a diff between the two.
  The diff should be the list of triples to add and the list of triples to 
delete.
  For now we assume that no blank nodes are present.
  Depending on the outcome of the discussion about blank nodes, whether we diff 
before or after munging might vary. It might be safer to consider that this 
component takes the unmunged version of the triples, even if for now the very 
first step will be to munge the inputs.
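
  As a sketch of what such a component computes, here is a small Python 
example using set differences. It assumes triples are hashable tuples and, as 
stated above, that no blank nodes are present; the function name 
`diff_triples` is made up for the illustration.

```python
def diff_triples(old, new):
    """Return (to_add, to_delete) between two revisions of an entity.

    `old` and `new` are iterables of (subject, predicate, object) tuples.
    Results are sorted only to make the output deterministic.
    """
    old_set, new_set = set(old), set(new)
    to_add = sorted(new_set - old_set)
    to_delete = sorted(old_set - new_set)
    return to_add, to_delete


rev_x = [("wd:Q42", "schema:version", '"1"'), ("wd:Q42", "wdt:P31", "wd:Q5")]
rev_y = [("wd:Q42", "schema:version", '"2"'), ("wd:Q42", "wdt:P31", "wd:Q5")]
to_add, to_delete = diff_triples(rev_x, rev_y)
print(to_add)
print(to_delete)
```

  With blank nodes this simple comparison would not be enough, since two blank 
nodes with different labels may denote the same value; that is why the 
assumption matters.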

TASK DETAIL
  https://phabricator.wikimedia.org/T245728

To: dcausse
Cc: Gehel, Zbyszko, Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Updated] T245727: Create a streaming-updater submodule under query/wikidata/rdf

2020-02-20 Thread dcausse
dcausse removed a project: Epic.

TASK DETAIL
  https://phabricator.wikimedia.org/T245727

To: Zbyszko, dcausse
Cc: Gehel, Zbyszko, Aklapper, JAllemandou, Ottomata, Smalyshev, Iamamz3, 
dcausse, darthmon_wmde, Nandana, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331, Dinoguy1000


[Wikidata-bugs] [Maniphest] [Created] T246237: Extract some statistics on the use of the isBlank() function in wdqs query logs

2020-02-26 Thread dcausse
dcausse created this task.
dcausse added projects: Wikidata, Wikidata-Query-Service.

TASK DESCRIPTION
  It would be nice to have an idea of the percentage of queries that use the 
`isBlank` function. It might be interesting to know if we can identify tools 
using this function in order to contact their maintainers if we were to 
introduce a new function to replace `isBlank`.
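
  A rough Python sketch of how such numbers could be extracted is given below. 
The input format (pairs of user agent and query string) is an assumption, not 
the actual layout of the wdqs query logs, and the regular expression is a crude 
textual match that would also count `isBlank(` inside comments or string 
literals.

```python
import re
from collections import Counter

# case-insensitive textual match on "isBlank(" (SPARQL function names are
# case-insensitive)
IS_BLANK = re.compile(r"\bisblank\s*\(", re.IGNORECASE)


def isblank_stats(log_entries):
    """Return (percentage of queries using isBlank, per-user-agent counts).

    `log_entries` is an iterable of (user_agent, query) pairs.
    """
    total = 0
    agents = Counter()
    for user_agent, query in log_entries:
        total += 1
        if IS_BLANK.search(query):
            agents[user_agent] += 1
    matched = sum(agents.values())
    percentage = 100.0 * matched / total if total else 0.0
    return percentage, agents


logs = [
    ("tool-a", "SELECT * WHERE { ?s ?p ?o FILTER(isBlank(?o)) }"),
    ("tool-b", "SELECT * WHERE { ?s ?p ?o }"),
]
print(isblank_stats(logs))
```

  Grouping the matches by user agent is what would allow contacting the 
maintainers of the tools concerned.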

TASK DETAIL
  https://phabricator.wikimedia.org/T246237

To: dcausse
Cc: JAllemandou, Aklapper, Lucas_Werkmeister_WMDE, dcausse, darthmon_wmde, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Smalyshev, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Edited] T246237: Extract some statistics on the use of the isBlank() function in wdqs query logs

2020-02-26 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T246237

To: dcausse
Cc: JAllemandou, Aklapper, Lucas_Werkmeister_WMDE, dcausse, darthmon_wmde, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Smalyshev, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Assigned] T246238: Investigate common qualifiers for “unknown value” statement main snaks

2020-02-27 Thread dcausse
dcausse assigned this task to JAllemandou.
dcausse added a subscriber: JAllemandou.
dcausse added a comment.


  @JAllemandou did some work and could extract some numbers from a dump 
imported to hadoop:
  
SELECT ?property (COUNT(*) AS ?count) WHERE {
  ?statement ps:P20 ?unknown.
  FILTER(ISBLANK(?unknown))
  ?statement ?pq ?qualifier.
  ?property wikibase:qualifier ?pq.
}
GROUP BY ?property
ORDER BY DESC(?count)
  
  
  
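+------------------------------------+-----+
|s1                                  |count|
+------------------------------------+-----+
|http://www.wikidata.org/entity/P1319|7    |
|http://www.wikidata.org/entity/P17  |5    |
|http://www.wikidata.org/entity/P131 |4    |
|http://www.wikidata.org/entity/P1476|1    |
+------------------------------------+-----+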
  
  
  
SELECT ?mainProperty ?qualifierProperty (COUNT(*) AS ?count) WHERE {
  ?mainProperty wikibase:claim ?p;
wikibase:statementProperty ?ps.
  ?qualifierProperty wikibase:qualifier ?pq.
  ?subject ?p ?statement.
  ?statement ?ps ?unknown. FILTER(isBlank(?unknown))
  ?statement ?pq ?qualifier.
}
GROUP BY ?mainProperty ?qualifierProperty
  
  
  

+------------------------------------+------------------------------------+-----+
|mp                                  |qp                                  |count|
+------------------------------------+------------------------------------+-----+
|http://www.wikidata.org/entity/P1343|http://www.wikidata.org/entity/P1810|77855|
|http://www.wikidata.org/entity/P123 |http://www.wikidata.org/entity/P1932|56081|
|http://www.wikidata.org/entity/P1343|http://www.wikidata.org/entity/P459 |29573|
|http://www.wikidata.org/entity/P1343|http://www.wikidata.org/entity/P3519|24827|
|http://www.wikidata.org/entity/P1343|http://www.wikidata.org/entity/P3382|15795|
|http://www.wikidata.org/entity/P1343|http://www.wikidata.org/entity/P352 |9593 |
|http://www.wikidata.org/entity/P1343|http://www.wikidata.org/entity/P2926|7310 |
|http://www.wikidata.org/entity/P570 |http://www.wikidata.org/entity/P1319|4143 |
|http://www.wikidata.org/entity/P1343|http://www.wikidata.org/entity/P973 |2233 |
|http://www.wikidata.org/entity/P98  |http://www.wikidata.org/entity/P1932|2192 |
|http://www.wikidata.org/entity/P98  |http://www.wikidata.org/entity/P1545|1630 |
|http://www.wikidata.org/entity/P393 |http://www.wikidata.org/entity/P1932|1356 |
|http://www.wikidata.org/entity/P110 |http://www.wikidata.org/entity/P1932|1334 |
|http://www.wikidata.org/entity/P110 |http://www.wikidata.org/entity/P1545|1302 |
|http://www.wikidata.org/entity/P559 |http://www.wikidata.org/entity/P131 |957  |
|http://www.wikidata.org/entity/P655 |http://www.wikidata.org/entity/P1932|952  |
|http://www.wikidata.org/entity/P655 |http://www.wikidata.org/entity/P1545|879  |
|http://www.wikidata.org/entity/P570 |http://www.wikidata.org/entity/P1326|864  |
|http://www.wikidata.org/entity/P569 |http://www.wikidata.org/entity/P1326|690  |
|http://www.wikidata.org/entity/P5202|http://www.wikidata.org/entity/P1932|627  |
|http://www.wikidata.org/entity/P5202|http://www.wikidata.org/entity/P1545|615  |
|http://www.wikidata.org/entity/P50  |http://www.wikidata.org/entity/P1932|542  |
|http://www.wikidata.org/entity/P569 |http://www.wikidata.org/entity/P1319|384  |
|http://www.wikidata.org/entity/P2093|http://www.wikidata.org/entity/P1545|380  |
|http://www.wikidata.org/entity/P1343|http://www.wikidata.org/entity/P3523|369  |
|http://www.wikidata.org/entity/P571 |http://www.wikidata.org/entity/P1326|336  |
|http://www.wikidata.org/entity/P571 |http://www.wikidata.org/entity/P1319|273  |
|http://www.wikidata.org/entity/P3872|http://www.wikidata.org/entity/P137 |223  |
|http://www.wikidata.org/entity/P3872|http://www.wikidata.org/entity/P580 |206  |
|http://www.wikidata.org/entity/P3872|http://www.wikidata.org/entity/P582 |206  |
|http://www.wikidata.org/entity/P26  |http://www.wikidata.org/entity/P580 |199  |
|http://www.wikidata.org/entity/P816 |http://www.wikidata.org/entity/P817 |192  |
|http://www.wikidata.org/entity/P921 |http://www.wikidata.org/entity/P1545|158  |
|http://www.wikidata.org/entity/P816 |http://www.wikidata.org/entity/P1107|139  |
|http://www.wikidata.org/entity/P2679|http://www.wikidata.org/entity/P1932|137  |
|http://www.wikidata.org/entity/P3383|http://www.wikidata.org/entity/P805 |135  |
|http://www.wikidata.org/entity/P921 |http://www.wikidata.org/entity/P1559|131  |
|http://www.wikidata.org/entity/P625 |http://www.wikidata.org/entity/P828 |126  |
|http://www.wikidata.org/entity/P629 |http://www.wikidata.org/entity/P407 |114  |
|http://www.wikidata.org/entity/P1181

[Wikidata-bugs] [Maniphest] [Updated] T246238: Investigate common qualifiers for “unknown value” statement main snaks

2020-02-27 Thread dcausse
dcausse added a project: Discovery-Search (Current work).

TASK DETAIL
  https://phabricator.wikimedia.org/T246238

To: JAllemandou, dcausse
Cc: JAllemandou, Lea_Lacroix_WMDE, Gehel, Aklapper, dcausse, Igorkim78, 
Lucas_Werkmeister_WMDE, Jheald, darthmon_wmde, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Jonas, Xmlizer, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] [Assigned] T246237: Extract some statistics on the use of the isBlank() function in wdqs query logs

2020-02-27 Thread dcausse
dcausse assigned this task to JAllemandou.
dcausse added a project: Discovery-Search (Current work).
dcausse added a subscriber: Lea_Lacroix_WMDE.
dcausse added a comment.


  @Lea_Lacroix_WMDE the use of `isBlank` seems pretty low, do you think we 
should still try to identify bots by grouping by user-agent and see if 
something is identifiable?

TASK DETAIL
  https://phabricator.wikimedia.org/T246237

To: JAllemandou, dcausse
Cc: Lea_Lacroix_WMDE, JAllemandou, Aklapper, Lucas_Werkmeister_WMDE, dcausse, 
darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Triaged] T246237: Extract some statistics on the use of the isBlank() function in wdqs query logs

2020-02-27 Thread dcausse
dcausse triaged this task as "Medium" priority.

TASK DETAIL
  https://phabricator.wikimedia.org/T246237

To: JAllemandou, dcausse
Cc: Lea_Lacroix_WMDE, JAllemandou, Aklapper, Lucas_Werkmeister_WMDE, dcausse, 
darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331

