[Wikidata-bugs] [Maniphest] T275133: Limit query parallelism from Flink based WDQS updater to Wikidata

2021-04-15 Thread dcausse
dcausse added a comment.
Restricted Application added a project: wdwb-tech.


  Since we are going to use envoy to contact MW application servers I wonder 
if this kind of limit could be enforced by it?
  
  Today I think that wdqs updaters are talking to the edge caches and some 
requests might not reach app servers but when using envoy we will always hit 
the app servers.
  
  I have no clue what would be a reasonable limit here. I collected some stats 
on backend timings for the first 7 days of April 2021 (time_firstbyte on cache 
misses for `/wiki/Special:EntityData/QXYZ.ttl?flavor=dump=XYZ`):
  
  | day of April | count   | p50   | p75   | p95   | p99   |
  | 1            | 1241154 | 0.083 | 0.104 | 0.157 | 0.212 |
  | 2            | 1570675 | 0.084 | 0.105 | 0.156 | 0.210 |
  | 3            | 1315251 | 0.083 | 0.103 | 0.153 | 0.209 |
  | 4            | 1064852 | 0.081 | 0.102 | 0.155 | 0.209 |
  | 5            | 1232205 | 0.081 | 0.103 | 0.154 | 0.209 |
  | 6            | 1242875 | 0.082 | 0.103 | 0.156 | 0.209 |
  | 7            | 1257607 | 0.082 | 0.103 | 0.157 | 0.212 |
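
  One rough way to turn these numbers into a candidate limit is Little's law 
(average requests in flight = arrival rate x time per request). The sketch 
below is only illustrative and rests on assumptions that are not stated above: 
that time_firstbyte is in seconds, that it approximates the whole request 
duration, and that requests are spread evenly over the day (bursts would push 
the real peak higher). `mean_concurrency` is a hypothetical helper name.

```python
# Sketch: estimate average in-flight requests with Little's law (L = lambda * W).
# Assumptions (not from the thread): latencies are in seconds and the daily
# request count is spread uniformly over 24 hours.
SECONDS_PER_DAY = 86_400

# (day of April, request count, p50, p99) copied from the table above.
STATS = [
    (1, 1241154, 0.083, 0.212),
    (2, 1570675, 0.084, 0.210),
    (3, 1315251, 0.083, 0.209),
    (4, 1064852, 0.081, 0.209),
    (5, 1232205, 0.081, 0.209),
    (6, 1242875, 0.082, 0.209),
    (7, 1257607, 0.082, 0.212),
]


def mean_concurrency(count: int, latency_s: float) -> float:
    """Average number of requests in flight for one day of traffic."""
    arrival_rate = count / SECONDS_PER_DAY  # requests per second
    return arrival_rate * latency_s


for day, count, p50, p99 in STATS:
    typical = mean_concurrency(count, p50)
    pessimistic = mean_concurrency(count, p99)
    print(f"April {day}: ~{typical:.2f} in flight at p50, ~{pessimistic:.2f} at p99")
```

  Under these assumptions the average concurrency is small, so any limit 
would mostly be sized by burstiness rather than by the daily averages.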

TASK DETAIL
  https://phabricator.wikimedia.org/T275133

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Gehel, Aklapper, Invadibot, MPhamWMF, maantietaja, wkandek, 
JMeybohm, CBogen, Akuckartz, Nandana, Namenlos314, jijiki, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Addshore, Mbch331, Dzahn
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T273098: High Availability Flink

2021-04-14 Thread dcausse
dcausse added a comment.


  In T273098#6997661 <https://phabricator.wikimedia.org/T273098#6997661>, 
@JMeybohm wrote:
  
  > I do see that using the configmap election method is appealing as it is 
built in and does not require additional software to function. Unfortunately I 
was not able to understand (by briefly reading the docs) if this uses a 
separate configmap or the one that is actually used for configuring flink.
  > While the former would be okay-ish I guess, the latter will potentially 
cause problems as every deployment will result in a re-creation of said 
configmap by helm, resetting it to whatever state the chart has defined.
  
  My understanding is that it is a separate config map named 
`flink-config-${clusterId}` where clusterId is being set via 
kubernetes.cluster-id 
<https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/ha/kubernetes_ha.html#configuration>.

TASK DETAIL
  https://phabricator.wikimedia.org/T273098

To: Mstyles, dcausse
Cc: Mstyles, dcausse, JMeybohm, jijiki, Aklapper, Gehel, akosiaris, Invadibot, 
MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T279698: WDQS should retry when getting 404s

2021-04-08 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T279698

To: dcausse
Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T279698: WDQS should retry when getting 404s

2021-04-08 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As a maintainer of the wdqs streaming updater I want requests to 
Special:EntityData receiving a 404 response to be retried so that there are 
fewer items to reconcile (T279541 <https://phabricator.wikimedia.org/T279541>).
  
  There is a race between the events flowing to kafka and mysql replication. 
This race might cause the events to be processed before the data they point to 
is available on the mysql replica being reached.
  
  One simple way to circumvent the problem would be to retry on 404. The retry 
could be guarded by a check on the difference between the processing time and 
the event time: if the difference is less than e.g. 10 seconds then a retry is 
performed.
  
  Looking at the side output data of the streaming updater for the first seven 
days of April we see (//range// is the delta between the ingestion time vs the 
processing time):
  
+-------+------+
|range  |events|
+-------+------+
|0: 0-1s|65    |
|1: 1-3s|137   |
|2: 3-5s|38    |
|3: 5-7s|9     |
+-------+------+
  
  which translates to: over the 8 days of wikidata edits, 249 events failed with 
a 404 although the data is actually available now (most probably due to 
replication lag); these events were ingested between 0 and 7 seconds after 
their event time.
  
  There are 141 events for which we received a 404 that is still a 404 now:
  
+--------+------+
|range   |events|
+--------+------+
|1: 1-3s |4     |
|3: 5-7s |1     |
|4: 7-10s|2     |
|5: >10s |134   |
+--------+------+
  
  So retrying 404s for events with `processing_time - event_time < 10 
seconds` seems to be the right threshold: it will add extra latency only for a 
few hundred events per week.
  
  AC:
  
  - retry 404 until the event time is 10sec older than the processing time
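
  The guard described above could be sketched as follows. This is not the 
updater's actual code: `should_retry_404`, `fetch_with_404_retry`, the 
`fetch`/`now`/`sleep` callables and the one-second pause are hypothetical names 
and values chosen for illustration; only the 10-second window comes from the 
description.

```python
from typing import Callable, Optional, Tuple

RETRY_WINDOW_S = 10.0  # threshold proposed in the task description


def should_retry_404(event_time: float, processing_time: float,
                     window_s: float = RETRY_WINDOW_S) -> bool:
    """True while the event is still younger than the retry window."""
    return processing_time - event_time < window_s


def fetch_with_404_retry(
    fetch: Callable[[], Tuple[int, Optional[str]]],
    event_time: float,
    now: Callable[[], float],
    sleep: Callable[[float], None],
    pause_s: float = 1.0,  # illustrative pause between attempts
) -> Tuple[int, Optional[str]]:
    """Call fetch() until it stops returning 404 or the event gets too old.

    fetch returns (http_status, body); now returns the current processing
    time in the same unit (seconds) as event_time.
    """
    while True:
        status, body = fetch()
        if status != 404 or not should_retry_404(event_time, now()):
            return status, body
        sleep(pause_s)
```

  Passing the clock and the sleep function as parameters keeps the sketch 
testable without real waiting; a production version would more likely rely on 
the timers of the streaming framework.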

TASK DETAIL
  https://phabricator.wikimedia.org/T279698

To: dcausse
Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T279639: Items sometimes repeat in the Search and Item dropdowns

2021-04-08 Thread dcausse
dcausse added a comment.


  Another `weird` behavior is that you can expand the 7 results without asking 
for more:
  
  Steps to reproduce:
  
  - copy "te" into your paste buffer
  - enter the search widget
  - paste and wait for the 7 results to be suggested
  - clear the search box (e.g. hitting Ctrl-Backspace)
  - do not wait for the search box to refresh and rapidly paste "te" again
  - it should expand to more results instead of displaying the same 7 results
  - repeat this operation 8 times: it will cycle and reset to the 7 
initial results
  
  I could not reproduce the duplicates but it seems to me that the UI behaves 
differently depending on the timing of the actions you take.
  
  note: the same outcome can be obtained if you rapidly add and delete a 
letter in your search string.

TASK DETAIL
  https://phabricator.wikimedia.org/T279639

To: dcausse
Cc: valerio.bozzolan, Lea_Lacroix_WMDE, dcausse, Gehel, Aklapper, Moebeus, 
Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T279639: Items sometimes repeat in the Search and Item dropdowns

2021-04-08 Thread dcausse
dcausse added a comment.


  @Moebeus thanks for the report, do you know if the duplicates appear after 
clicking `more` to display the remaining results or directly?
  If they appear directly could you check by scrolling down if all the first 7 
results are duplicated?
  By default only 7 items are searched and shown, more can be displayed only if 
you hit the `more` button.

TASK DETAIL
  https://phabricator.wikimedia.org/T279639

To: dcausse
Cc: valerio.bozzolan, Lea_Lacroix_WMDE, dcausse, Gehel, Aklapper, Moebeus, 
Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T279639: Items sometimes repeat in the Search and Item dropdowns

2021-04-08 Thread dcausse
dcausse edited projects, added Discovery-Search; removed Discovery.

TASK DETAIL
  https://phabricator.wikimedia.org/T279639

To: dcausse
Cc: Lea_Lacroix_WMDE, dcausse, Gehel, Aklapper, Moebeus, Invadibot, MPhamWMF, 
maantietaja, CBogen, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, 
aude, Mbch331, ET4Eva, Darkminds3113, Avner, FloNight


[Wikidata-bugs] [Maniphest] T279541: Add a reconciliation strategy to the wdqs streaming updater

2021-04-07 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T279541

To: dcausse
Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2021-04-07 Thread dcausse
dcausse added a subtask: T279541: Add a reconciliation strategy to the wdqs 
streaming updater.

TASK DETAIL
  https://phabricator.wikimedia.org/T244590

To: dcausse
Cc: MPhamWMF, Daniel_Mietchen, Thadguidry, tfmorris, revi, Ladsgroup, 
Multichill, darthmon_wmde, Iamamz3, Smalyshev, Ottomata, JAllemandou, Aklapper, 
Zbyszko, Gehel, dcausse, Invadibot, maantietaja, NavinRizwi, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Dinoguy1000, 
Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T279541: Add a reconciliation strategy to the wdqs streaming updater

2021-04-07 Thread dcausse
dcausse added a parent task: T244590: [Epic] Rework the WDQS updater as an 
event driven application.

TASK DETAIL
  https://phabricator.wikimedia.org/T279541

To: dcausse
Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T279541: Add a reconciliation strategy to the wdqs streaming updater

2021-04-07 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As a maintainer of WDQS I want the streaming updater to be able to reconcile 
a wikibase item so that I can fix some inconsistencies without reloading the 
full database.
  
  This can be achieved by introducing a new topic that the streaming updater 
would consume and that would contain two types of messages:
  
  - reconcile a specific item revision
  - reconcile a deleted item
  
  This can be used to reconcile missed events (MW bugs, missing events, late 
events), the third mode will be used on fetch failures.
  When a delete is required existing code will be used.
  When the item exists the mutation message will contain all the entity 
data and the consumer will work like the old updater and will perform a full 
reconciliation.
  
  Automatic reconciliation (probably via a batch running from the analytics 
cluster) should be possible reading side-outputs:
  
  - late events 
<https://schema.wikimedia.org/repositories//secondary/jsonschema/rdf_streaming_updater/lapsed_action/latest.yaml>
  - failed events 
<https://schema.wikimedia.org/repositories/secondary/jsonschema/rdf_streaming_updater/fetch_failure/latest.yaml>
  
  Ad-hoc reconciliation should be possible via a script (or possibly from 
wikibase itself if this is deemed necessary).
  
  The schema of this new topic should be as follows:
  
  - meta: typical event metadata
  - item: string the wikibase item to update
  - revision: long the revision with
  - type: enum: create or delete
  
  The decide mutation operation should be changed to support a new operation:
  
  - if the revision in the message is older than the one seen in the state then 
an operation corresponding to the state is emitted:
    - `reconcile` if the state is `CREATED` using the revision seen and fetch 
the data from this revision
    - `delete` if the state is `DELETED`
  - if the revision in the message is newer than the one seen in the state (or 
never seen) then an operation corresponding to the message is emitted:
    - `reconcile` if the message has a type `create` using the revision from 
the message
    - `delete` if the message has a type `delete`
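
  The decision rules above can be sketched as a small pure function. All 
class and function names here (`ReconcileMessage`, `SeenState`, `Operation`, 
`decide`) are hypothetical and only mirror the fields listed in the schema; 
this is not the streaming updater's real API. Two points are assumptions 
because the description does not cover them: a message whose revision equals 
the one in the state is handled like an older one, and a state-driven `delete` 
carries the revision seen in the state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ItemState(Enum):
    CREATED = "CREATED"
    DELETED = "DELETED"


class ReconcileType(Enum):
    CREATE = "create"
    DELETE = "delete"


@dataclass(frozen=True)
class ReconcileMessage:
    item: str            # the wikibase item to update, e.g. "Q42"
    revision: int
    type: ReconcileType


@dataclass(frozen=True)
class SeenState:
    revision: int        # last revision seen for this item
    state: ItemState


@dataclass(frozen=True)
class Operation:
    name: str            # "reconcile" or "delete"
    item: str
    revision: int


def decide(msg: ReconcileMessage, seen: Optional[SeenState]) -> Operation:
    """Pick the operation to emit for a reconciliation message."""
    if seen is None or msg.revision > seen.revision:
        # Message is newer than the state (or the item was never seen):
        # the operation follows the message.
        name = "reconcile" if msg.type is ReconcileType.CREATE else "delete"
        return Operation(name, msg.item, msg.revision)
    # Message is older (or equal, by assumption): the operation follows the state.
    name = "reconcile" if seen.state is ItemState.CREATED else "delete"
    return Operation(name, msg.item, seen.revision)
```

  For example, a `create` message for revision 5 arriving when the state 
already records a deletion at revision 10 yields a `delete` operation, so a 
stale reconciliation request cannot resurrect a deleted item.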
  
  AC:
  
  - a new type of operation `reconcile` is added to MutationEventData
  - streaming-updater-producer operators are adapted to support this new message
  - a new schema is added to 
https://schema.wikimedia.org/repositories/secondary/jsonschema/rdf_streaming_updater
  - the streaming-updater-consumer supports the `reconcile` operation

TASK DETAIL
  https://phabricator.wikimedia.org/T279541

To: dcausse
Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T270476: Linked Data Fragments endpoint returns IllegalStateException

2021-03-31 Thread dcausse
dcausse claimed this task.
dcausse moved this task from Ready for Development to In Progress on the 
Discovery-Search (Current work) board.

TASK DETAIL
  https://phabricator.wikimedia.org/T270476

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

To: dcausse
Cc: RKemper, Gehel, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, 
Akuckartz, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T278693: Manually purge obsolete/outdated entites from WDQS (2021-03)

2021-03-30 Thread dcausse
dcausse closed this task as "Resolved".
dcausse claimed this task.
dcausse added a comment.


  In T278693#6956808 <https://phabricator.wikimedia.org/T278693#6956808>, 
@MisterSynergy wrote:
  
  > I read the announcement and I am pretty excited about the improvements. The 
query-preview servers do not seem to have the problem that I have reported 
here, but I am not sure right now whether you have reloaded the entities there 
as well.
  
  No I did not, nice to hear you did not find the inconsistencies there, thanks 
for checking!
  
  > Until now I have not used query-preview except for some tests, but I will 
have a closer look in the next time.
  
  Thanks!
  
  > The production query service seems fine again, at least with respect to the 
reported entities. We can close this ticket from my opinion.
  
  Sure will do.

TASK DETAIL
  https://phabricator.wikimedia.org/T278693

To: dcausse
Cc: dcausse, Aklapper, MisterSynergy, Invadibot, MPhamWMF, maantietaja, CBogen, 
Akuckartz, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T278693: Manually purge obsolete/outdated entites from WDQS (2021-03)

2021-03-30 Thread dcausse
dcausse moved this task from Ready for Development to Needs Reporting on the 
Discovery-Search (Current work) board.
dcausse added a comment.


  @MisterSynergy thanks for the report!
  
  Note that we're testing a new system to update wdqs entities and we hope it's 
more reliable. If you have time and if it's not too complicated on your side I 
wonder if you could extract the same list out of this test endpoint: 
https://query-preview.wikidata.org/ ? thanks!

TASK DETAIL
  https://phabricator.wikimedia.org/T278693

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

To: dcausse
Cc: dcausse, Aklapper, MisterSynergy, Invadibot, MPhamWMF, maantietaja, CBogen, 
Akuckartz, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T274982: Disable fetching constraints from the wdqs updater

2021-03-29 Thread dcausse
dcausse added a comment.


  Thanks for bringing this here, this link is generated from 
https://www.wikidata.org/wiki/Module:Constraints/SPARQL and seems to be added 
to all properties except the few that define no constraint.
  Digging more into the impact: of the 370 queries using 
`wikibase:hasViolationForConstraint` in March (1st -> 28th):
  
  - 200 come from one of these links
  - 112 are from 
https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/Pi_bot_13
  - 28 are from 
https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/Pi_bot_11
  - 8 are from example queries taken from the announcement: 
https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_property_constraints/Archive_2#You_can_now_query_the_constraint_violations_with_the_Query_Service
  - no obvious categories for the remaining 22 queries

TASK DETAIL
  https://phabricator.wikimedia.org/T274982

To: dcausse
Cc: Tacsipacsi, Mohammed_Sadat_WMDE, Lea_Lacroix_WMDE, Addshore, MPhamWMF, 
Lydia_Pintscher, Lucas_Werkmeister_WMDE, WMDE-leszek, dcausse, Aklapper, 
Invadibot, maantietaja, Alter-paule, Beast1978, CBogen, Un1tY, Akuckartz, 
Hook696, Kent7301, joker88john, CucyNoiD, Nandana, Gaboe420, Giuliamocci, 
Cpaulf30, Lahi, Gq86, Af420, Bsandipan, GoranSMilovanovic, QZanden, EBjune, 
LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T278385: Streaming Updater must make all requests to proxy endpoints

2021-03-25 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T278385

To: dcausse
Cc: JMeybohm, jijiki, Aklapper, Gehel, akosiaris, Mstyles, Invadibot, MPhamWMF, 
maantietaja, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T277637: Report latency metric to the wdqs-ui from the wdqs streaming updater

2021-03-23 Thread dcausse
dcausse added a comment.


  Suggestion for a better long term solution here: T278246 
<https://phabricator.wikimedia.org/T278246>

TASK DETAIL
  https://phabricator.wikimedia.org/T277637

To: dcausse
Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, Alter-paule, 
Beast1978, CBogen, Un1tY, Akuckartz, Hook696, Kent7301, joker88john, CucyNoiD, 
Nandana, Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, 
Bsandipan, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T278246: Report WDQS update latency when displaying/serving results

2021-03-23 Thread dcausse
dcausse renamed this task from "WDQS latency " to "Report WDQS update latency 
when displaying/serving results ".
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T278246

To: dcausse
Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T278246: WDQS latency

2021-03-23 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As a WDQS user I want to know if the results I see come from a server that is 
potentially lagged so that I can evaluate if they are stale.
  
  Currently the WDQS UI reports a latency metric by querying the backend for 
the following triple:
  
    select ?lastUpdated where {
      <http://www.wikidata.org> schema:dateModified ?lastUpdated
    }
  
  This technique assumes a couple of things that might not be true:
  
  - the server this metric is obtained from is the same as the one serving the 
results
  - the latency is stable, without out-of-order events
  
  Addressing the first assumption might be possible by changing how the UI 
presents this data to the user: instead of decoupling the display of this 
metric from the display of the results, it could fetch this data from an HTTP 
header set by the WDQS backend.
  The WDQS backend could be extended with an additional servlet filter to 
maintain this latency metric in memory.
  The updater would set a custom header with the latency information that the 
filter would use to update its in-memory structure.
  The WDQS backend would set a custom header with the latency information it 
has in memory.
  The UI would compare the latency information with the client's "current time" 
giving a rough estimate of whether or not the results might be stale according 
to user expectations.
  
  This solution works for cached responses as well.
  
  Solving the second point is more difficult but using the average of the event 
times found in the batch being currently persisted might help.
  
  AC:
  
  - The WDQS UI reports the latency it finds in the response
  - The WDQS UI no longer reports latency information that is not related to 
the server serving the results
  - The latency information is no longer stored in the triple store
  - The WDQS backend reports latency information from a custom HTTP header 
`X-Last-Event-Time` (name TBD)
  - The WDQS backend updates its latency information from a custom HTTP header 
`X-Set-Last-Event-Time` on  write requests
  - The event time is an average of the batch being processed
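
  A minimal sketch of the two computations involved: averaging the event 
times of a batch (updater side) and comparing the reported time with the 
client's clock (UI side). The header name is the placeholder from the AC above 
(still TBD); carrying the value as an ISO 8601 timestamp is an assumption made 
for this example, and `average_event_time` and `staleness_seconds` are 
hypothetical helper names.

```python
from datetime import datetime, timezone
from typing import Iterable, Mapping

# Placeholder header name from the acceptance criteria (name TBD).
LAST_EVENT_TIME_HEADER = "X-Last-Event-Time"


def average_event_time(event_times: Iterable[datetime]) -> datetime:
    """Average the event times of the batch currently being persisted."""
    stamps = [t.timestamp() for t in event_times]
    if not stamps:
        raise ValueError("cannot average an empty batch")
    return datetime.fromtimestamp(sum(stamps) / len(stamps), tz=timezone.utc)


def staleness_seconds(headers: Mapping[str, str], now: datetime) -> float:
    """Seconds between the client's current time and the reported event time.

    Assumes the header value is an ISO 8601 timestamp with a UTC offset.
    """
    last_event = datetime.fromisoformat(headers[LAST_EVENT_TIME_HEADER])
    return (now - last_event).total_seconds()
```

  The UI could then warn the user when the value returned by 
`staleness_seconds` exceeds whatever lag it considers acceptable.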

TASK DETAIL
  https://phabricator.wikimedia.org/T278246

To: dcausse
Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T277637: Report latency metric to the wdqs-ui from the wdqs streaming updater

2021-03-23 Thread dcausse
dcausse claimed this task.
dcausse moved this task from incoming to in progress on the Wikidata board.

TASK DETAIL
  https://phabricator.wikimedia.org/T277637

WORKBOARD
  https://phabricator.wikimedia.org/project/board/71/

To: dcausse
Cc: dcausse, Aklapper, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T277691: Argument 1 passed to DataValues\Geo\Values\LatLongValue::__construct() must be of the type float, string given

2021-03-22 Thread dcausse
dcausse removed a project: GeoData.

TASK DETAIL
  https://phabricator.wikimedia.org/T277691

To: dcausse
Cc: Addshore, WMDE-leszek, Aklapper, brennen, Invadibot, maantietaja, 
Akuckartz, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Verdy_p, Wikidata-bugs, 
aude, Lydia_Pintscher, Jdforrester-WMF, Mbch331, Jay8g, MaxSem


[Wikidata-bugs] [Maniphest] T274354: rdf munger and hence wdqs-updater requires siteLinks to be formed using a specific articlePath

2021-03-19 Thread dcausse
dcausse added a comment.


  @despens could you provide a reproducible test case (a small RDF file that 
triggers the problem would be great). I don't see how site links could be 
involved in the problem you raise and a test case will definitely help. Thanks!

TASK DETAIL
  https://phabricator.wikimedia.org/T274354

To: dcausse
Cc: dcausse, Addshore, despens, Aklapper, MPhamWMF, maantietaja, CBogen, 
Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, abian, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Lydia_Pintscher, Mbch331


[Wikidata-bugs] [Maniphest] T263427: Unable to process a particular wikibase dump using munge.sh (localised namespace name)

2021-03-19 Thread dcausse
dcausse added a comment.


  I see two ways to fix this:
  
  - wikibase should always use `Special:EntityData` and not the localized page 
name for its RDF output (similar to the workaround suggested)
  - wdqs accepts new options to the munger/updater/loader to indicate the 
localized version it has to look for
  
  I'm fine either way, the second option will require wdqs admins to configure 
an additional option.
  
  @Addshore do you have an opinion on this?

TASK DETAIL
  https://phabricator.wikimedia.org/T263427

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Addshore, Andrawaag, JeroenDeDauw, WMDE-leszek, dcausse, Aklapper, 
Nikerabbit, MPhamWMF, maantietaja, CBogen, Samantha_Alipio_WMDE, Akuckartz, 
Jelabra, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Jonas, Xmlizer, Asahiko, abian, despens, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Nemo_bis, Mbch331


[Wikidata-bugs] [Maniphest] T258776: Add Structured Data on Commons M-ID to Wikidata dumps

2021-03-17 Thread dcausse
dcausse added a comment.


  Note: in an attempt to unblock the status quo I created T277665 
<https://phabricator.wikimedia.org/T277665> with some practical solutions (esp. 
the first one suggested in T258769#6332430 
<https://phabricator.wikimedia.org/T258769#6332430>).

TASK DETAIL
  https://phabricator.wikimedia.org/T258776

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Fuzheado, Jane023, Multichill, Lydia_Pintscher, Spinster, FRomeo_WMF, 
GFontenelle_WMF, dcausse, Jarekt, Librarian_lena, Lucas_Werkmeister_WMDE, 
Jheald, Aklapper, Tpt, MPhamWMF, maantietaja, CBogen, Nintendofan885, 
Akuckartz, Nandana, JKSTNK, Namenlos314, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, abian, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, 
Mbch331


[Wikidata-bugs] [Maniphest] T258769: ImageGrid for WCQS

2021-03-17 Thread dcausse
dcausse added a comment.


  Note: in an attempt to unblock the status quo I created T277665 
<https://phabricator.wikimedia.org/T277665> with some practical solutions (esp. 
the first one suggested in T258769#6332430 
<https://phabricator.wikimedia.org/T258769#6332430>).

TASK DETAIL
  https://phabricator.wikimedia.org/T258769

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Ainali, Nikki, Zbyszko, Librarian_lena, Lucas_Werkmeister_WMDE, 
Jheald, Husky, Rachmat04, Gehel, Aklapper, MPhamWMF, maantietaja, Muchiri124, 
CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, Ramsey-WMF, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Poyekhali, _jensen, 
rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, Ixocactus, Wong128hk, 
abian, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, El_Grafo, 
Dinoguy1000, Manybubbles, Steinsplitter, Mbch331, Keegan


[Wikidata-bugs] [Maniphest] T277637: Report latency metric to the wdqs-ui from the wdqs streaming updater

2021-03-17 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As a user of the WDQS UI I want to know the lag of the servers so that I can 
more easily evaluate if the results I see are stale.
  
  The way it is designed today is to use a dedicated triple in the store:
  
<http://www.wikidata.org> schema:dateModified 
"2021-03-12T20:59:57Z"^^xsd:dateTime
  
  Because this triple is an artifact of the update process, it was decided not 
to maintain it with the new updater. The reason was that a triple seemed the 
wrong choice for storing metrics regarding the health of the system.
  
  On the other hand, the WDQS UI currently reports the lag metric using this 
triple despite it being inaccurate (there is no guarantee that the reported lag 
comes from the same server as the one that will serve the results displayed).
  
  A short-term solution is perhaps to keep maintaining this triple from the new 
updater and to figure out a better long-term solution to properly inform users 
of the latency of the results they see.
  
  AC:
  
  - the triple `<http://www.wikidata.org> schema:dateModified TIMESTAMP` is 
maintained by the new updater
  - create a ticket describing a better long term solution
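  
  To illustrate how a client could turn this triple into a lag value, here is a 
minimal sketch. It assumes the literal keeps the `xsd:dateTime` form shown 
above; the function name `lag_seconds` is hypothetical and this is not the 
actual WDQS UI code.

```python
from datetime import datetime, timezone


def lag_seconds(date_modified: str, now: datetime) -> float:
    """Return the lag in seconds between `now` and a schema:dateModified literal."""
    # Assumes the literal looks like "2021-03-12T20:59:57Z" (UTC, second precision).
    last_update = datetime.strptime(date_modified, "%Y-%m-%dT%H:%M:%SZ")
    last_update = last_update.replace(tzinfo=timezone.utc)
    return (now - last_update).total_seconds()


# Hypothetical "current time", one minute after the timestamp from the description.
now = datetime(2021, 3, 12, 21, 0, 57, tzinfo=timezone.utc)
print(lag_seconds("2021-03-12T20:59:57Z", now))  # prints 60.0
```

  Note that this only measures the lag of whichever server answered the query, 
which is the inaccuracy described above.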

TASK DETAIL
  https://phabricator.wikimedia.org/T277637

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
abian, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T277565: Misleading markup placements when querying items located around the 180th meridian

2021-03-16 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  When running:
  
#defaultView:Map
SELECT ?item ?itemLabel ?district ?districtLabel ?coords ?FindaGrave WHERE {
  ?item wdt:P31 wd:Q39614.
  ?item wdt:P17 wd:Q664 .
  ?item wdt:P131 ?district.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" .
 ?item rdfs:label ?itemLabel . 
 ?district rdfs:label ?districtLabel .}
  optional {?item wdt:P625 ?coords.}
  optional {?item wdt:P2025 ?FindaGrave.}
  filter(!strstarts(?itemLabel,"Burial"))
  filter(!strstarts(?itemLabel,"Midden"))
  filter (?item NOT IN (wd:Q79309412, wd:Q79309112, wd:Q79309436, 
wd:Q79311062, wd:Q23073922))
} order by ?districtLabel ?itemLabel
  
  Instead of being placed close to each other, the points are displayed on the 
"same" map:
  
  F34164210: wdqs_map_issue.png <https://phabricator.wikimedia.org/F34164210>
  
  Ideal placement should be spread around the 180th meridian.
  Might be related: 
https://stackoverflow.com/questions/38820724/how-to-display-leaflet-markers-near-the-180-meridian
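  
  One possible approach, sketched below under assumptions, is to shift 
longitudes by multiples of 360 so that all markers fall in a window centered on 
the 180th meridian before handing them to the map. The helper `wrap_longitude` 
is hypothetical and is not part of the WDQS UI or of Leaflet.

```python
def wrap_longitude(lon: float, center: float = 180.0) -> float:
    """Shift `lon` by multiples of 360 so it falls in [center - 180, center + 180)."""
    return (lon - center + 180.0) % 360.0 + center - 180.0


# Illustrative longitudes on both sides of the 180th meridian.
lons = [179.5, -179.8, 178.9, -178.2]
print([round(wrap_longitude(lon), 1) for lon in lons])
# prints [179.5, 180.2, 178.9, 181.8]
```

  With the default center the western points end up just past 180 instead of at 
the far left of the map, so the markers stay next to each other.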

TASK DETAIL
  https://phabricator.wikimedia.org/T277565

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Lea_Lacroix_WMDE, dcausse, Aklapper, MPhamWMF, maantietaja, CBogen, 
Akuckartz, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Jonas, Xmlizer, abian, jkroll, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T277108: Query service throws exception for non-English wikis

2021-03-16 Thread dcausse
dcausse closed this task as a duplicate of T263427: Unable to process a 
particular wikibase dump using munge.sh (localised namespace name).

TASK DETAIL
  https://phabricator.wikimedia.org/T277108

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Andrawaag, Aklapper, JeroenDeDauw, Addshore, MPhamWMF, 
maantietaja, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, abian, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Lydia_Pintscher, Mbch331


[Wikidata-bugs] [Maniphest] T263427: Unable to process a particular wikibase dump using munge.sh (localised namespace name)

2021-03-16 Thread dcausse
dcausse merged a task: T277108: Query service throws exception for non-English 
wikis.
dcausse added subscribers: JeroenDeDauw, Andrawaag, Addshore.
Restricted Application added a project: wdwb-tech-focus.

TASK DETAIL
  https://phabricator.wikimedia.org/T263427

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Addshore, Andrawaag, JeroenDeDauw, WMDE-leszek, dcausse, Aklapper, 
Nikerabbit, MPhamWMF, maantietaja, CBogen, Samantha_Alipio_WMDE, Akuckartz, 
Jelabra, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Jonas, Xmlizer, Asahiko, abian, despens, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Nemo_bis, Mbch331


[Wikidata-bugs] [Maniphest] T277108: Query service throws exception for non-English wikis

2021-03-16 Thread dcausse
dcausse added a comment.


  Tentatively closing as a duplicate of T263427 
<https://phabricator.wikimedia.org/T263427> as this sounds very similar. Please 
re-open if you think it's completely different or if the workaround mentioned 
there does not work for you.

TASK DETAIL
  https://phabricator.wikimedia.org/T277108

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Andrawaag, Aklapper, JeroenDeDauw, Addshore, MPhamWMF, 
maantietaja, CBogen, Akuckartz, darthmon_wmde, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, abian, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Lydia_Pintscher, Mbch331


[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2021-03-15 Thread dcausse
dcausse added a subtask: T277443: The streaming updater consumer should log 
information when divergences are detected.

TASK DETAIL
  https://phabricator.wikimedia.org/T244590

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Daniel_Mietchen, Thadguidry, tfmorris, revi, Ladsgroup, Multichill, 
darthmon_wmde, Iamamz3, Smalyshev, Ottomata, JAllemandou, Aklapper, Zbyszko, 
Gehel, dcausse, MPhamWMF, maantietaja, NavinRizwi, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
abian, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Dinoguy1000, 
Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T277443: The streaming updater consumer should log information when divergences are detected

2021-03-15 Thread dcausse
dcausse added a parent task: T244590: [Epic] Rework the WDQS updater as an 
event driven application.

TASK DETAIL
  https://phabricator.wikimedia.org/T277443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
abian, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T277443: The streaming updater consumer should log information when divergences are detected

2021-03-15 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As a maintainer of the rdf-streaming-updater I want information to be logged 
when divergences are detected on patch application so that I can more easily 
debug the cause of these divergences.
  
  When applying an RDF patch to the triple store (blazegraph) some divergences 
may occur for the following reasons:
  
  - the state of the store is not what is expected by the flink pipeline 
(actual divergences)
  - false positives: some triples/literals are modified on the fly by 
blazegraph (Unicode normalization, cutoff of large values, precision). These 
should amount to a couple to a dozen triples per hour.
  
  Finding the cause of a non-negligible bump in the number of divergences is 
not straightforward today; adding more logs to the streaming consumer will help 
such investigations.
  
  AC:
  
  - meaningful logs that make it possible to trace which changes are involved 
in a bump of divergences
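  
  As a rough illustration of the kind of log line that could help, the sketch 
below compares the number of mutations a patch was expected to apply with the 
number actually applied, and logs the entity and revision on a mismatch. 
Everything here (the function `check_divergence`, its parameters, the logger 
name) is an assumption for illustration, not the consumer's actual code.

```python
import logging

logger = logging.getLogger("streaming-updater-consumer")  # hypothetical logger name


def check_divergence(entity_id: str, revision: int, expected: int, actual: int) -> int:
    """Log context when a patch did not mutate as many triples as expected."""
    divergence = expected - actual
    if divergence != 0:
        # Keep the entity and revision in the message so a bump can be traced
        # back to the changes that caused it.
        logger.warning(
            "Divergence on %s (rev %d): expected %d mutations, got %d",
            entity_id, revision, expected, actual,
        )
    return divergence


print(check_divergence("Q42", 123, expected=5, actual=3))  # prints 2 and logs a warning
```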

TASK DETAIL
  https://phabricator.wikimedia.org/T277443

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
abian, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T276784: Recover lexemes on wdqs1009

2021-03-09 Thread dcausse
dcausse moved this task from In Progress to Needs review on the 
Discovery-Search (Current work) board.
dcausse added a comment.


  Reprocessed all updates related to lexemes on wdqs1009 using a custom build 
with https://gerrit.wikimedia.org/r/670090

TASK DETAIL
  https://phabricator.wikimedia.org/T276784

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, MPhamWMF, maantietaja, Alter-paule, Beast1978, CBogen, 
Un1tY, Akuckartz, Hook696, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, abian, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, 
Mbch331


[Wikidata-bugs] [Maniphest] T276784: Recover lexemes on wdqs1009

2021-03-08 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  Lexemes were not imported on wdqs1009, and when the streaming-updater-consumer 
started consuming lexemes from the flink output streams the triple store did not 
match what was expected, causing diffs to generate a lot of inconsistencies.
  
  The full flink output is still retained in kafka (1 month retention) and we 
could try to recover lexemes without having to do a full import again.
  
  1. add an option to the streaming-updater-consumer to ingest only lexemes
  2. import the lexemes from `/srv/wdqs/lex-munged/`
  3. start a manual streaming-updater-consumer re-reading the whole mutation 
stream filtered on lexeme from offset 12080891
  4. stop it when it catches up with the normal consumer
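  
  Steps 3 and 4 can be sketched as a filter over the mutation stream. This is 
a simplified model under assumptions: events are `(offset, entity_id)` pairs 
ordered by offset, lexemes are recognised by their `L` prefix, and 
`replay_lexemes` and `stop_offset` are hypothetical names rather than options 
of the real consumer.

```python
from typing import Iterable, Iterator, Tuple


def replay_lexemes(
    events: Iterable[Tuple[int, str]], start_offset: int, stop_offset: int
) -> Iterator[Tuple[int, str]]:
    """Yield lexeme mutations with start_offset <= offset < stop_offset."""
    for offset, entity_id in events:
        if offset < start_offset:
            continue  # before the point where the replay starts
        if offset >= stop_offset:
            break  # caught up with the normal consumer
        if entity_id.startswith("L"):
            yield offset, entity_id


# Illustrative events; only the starting offset comes from the description.
stream = [(12080890, "L1"), (12080891, "Q42"), (12080892, "L7"), (12080893, "L8")]
print(list(replay_lexemes(stream, 12080891, 12080893)))
# prints [(12080892, 'L7')]
```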

TASK DETAIL
  https://phabricator.wikimedia.org/T276784

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
abian, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2021-03-08 Thread dcausse
dcausse added a subtask: T276750: Add a mean to upgrade the flink code even 
when incompatible serialization changes are involved.

TASK DETAIL
  https://phabricator.wikimedia.org/T244590

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Daniel_Mietchen, Thadguidry, tfmorris, revi, Ladsgroup, Multichill, 
darthmon_wmde, Iamamz3, Smalyshev, Ottomata, JAllemandou, Aklapper, Zbyszko, 
Gehel, dcausse, MPhamWMF, maantietaja, NavinRizwi, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
abian, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Dinoguy1000, 
Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T276750: Add a mean to upgrade the flink code even when incompatible serialization changes are involved

2021-03-08 Thread dcausse
dcausse added a parent task: T244590: [Epic] Rework the WDQS updater as an 
event driven application.

TASK DETAIL
  https://phabricator.wikimedia.org/T276750

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
abian, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T276750: Add a mean to upgrade the flink code even when incompatible serialization changes are involved

2021-03-08 Thread dcausse
dcausse claimed this task.
dcausse added projects: Wikidata-Query-Service, Discovery-Search (Current work).
dcausse updated the task description.
Restricted Application added a project: Wikidata.

TASK DETAIL
  https://phabricator.wikimedia.org/T276750

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
abian, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS

2021-03-03 Thread dcausse
dcausse added a comment.


  In T244341#6871123 <https://phabricator.wikimedia.org/T244341#6871123>, 
@Lucas_Werkmeister_WMDE wrote:
  
  > It’s probably worth mentioning in that documentation that this change 
applies not just to the query service but also to the RDF dumps and 
Special:EntityData. Otherwise, it looks good to me :)
  
  Good point, I added some notes about this, thanks!

TASK DETAIL
  https://phabricator.wikimedia.org/T244341

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: ericP, JMinor, TomT0m, Gehel, Multichill, Pfps, Mmarx, Dipsacus_fullonum, 
Luitzen, VladimirAlexiev, Lea_Lacroix_WMDE, Jheald, Daniel_Mietchen, 
mkroetzsch, Denny, Lucas_Werkmeister_WMDE, Aklapper, dcausse, MPhamWMF, CBogen, 
Akuckartz, Viztor, 94rain, Nandana, Namenlos314, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, abian, jkroll, Wikidata-bugs, Jdouglas, Snowolf, aude, Tobias1984, 
Manybubbles, Shizhao, Mbch331, Rxy


[Wikidata-bugs] [Maniphest] T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS

2021-03-01 Thread dcausse
dcausse added a comment.


  Started some documentation about the change at 
https://www.mediawiki.org/wiki/Wikidata_Query_Service/Blank_Node_Skolemization, 
comments/suggestions are welcome.

TASK DETAIL
  https://phabricator.wikimedia.org/T244341

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: ericP, JMinor, TomT0m, Gehel, Multichill, Pfps, Mmarx, Dipsacus_fullonum, 
Luitzen, VladimirAlexiev, Lea_Lacroix_WMDE, Jheald, Daniel_Mietchen, 
mkroetzsch, Denny, Lucas_Werkmeister_WMDE, Aklapper, dcausse, MPhamWMF, CBogen, 
Akuckartz, Viztor, 94rain, Nandana, Namenlos314, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, abian, jkroll, Wikidata-bugs, Jdouglas, Snowolf, aude, Tobias1984, 
Manybubbles, Shizhao, Mbch331, Rxy


[Wikidata-bugs] [Maniphest] T274982: Disable fetching constraints from the wdqs updater

2021-02-17 Thread dcausse
dcausse added a subscriber: MPhamWMF.
dcausse added a comment.


  If we agree to stop fetching constraints from the updater, this would 
effectively mean that we do not fetch new violations for new edits; existing 
ones will stay until the next reload (T267927 
<https://phabricator.wikimedia.org/T267927>). After the reload wdqs won't be 
usable for querying constraint violations. Proper solutions will have to be 
found to expose constraint violations (ideally in another graph).
  
  If we decide not to stop fetching constraint violations and keep the system 
as-is, they will disappear after the reload and only new edits will repopulate 
violations. When enabling the new updater we will have to keep the old one 
running just for fetching violations on edits (status quo).

TASK DETAIL
  https://phabricator.wikimedia.org/T274982

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: MPhamWMF, Lydia_Pintscher, Lucas_Werkmeister_WMDE, WMDE-leszek, dcausse, 
Aklapper, Alter-paule, Beast1978, CBogen, Un1tY, Akuckartz, Hook696, Kent7301, 
joker88john, CucyNoiD, Nandana, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, 
Af420, Bsandipan, GoranSMilovanovic, QZanden, EBjune, LawExplorer, Lewizho99, 
Maathavan, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T199219: WDQS should use internal endpoint to communicate to Wikidata

2021-02-17 Thread dcausse
dcausse added a comment.


  The new updater is currently running on the analytics network (working on 
getting the k8s deployment ready); we could set it up to use appservers-ro but 
I think a hole needs to be opened between the two networks (see similar issue 
in T274951 <https://phabricator.wikimedia.org/T274951>).

TASK DETAIL
  https://phabricator.wikimedia.org/T199219

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Ladsgroup, akosiaris, BBlack, Aklapper, Smalyshev, Gehel, 
MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Jony, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, Vali.matei, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T274982: Disable fetching constraints from the wdqs updater

2021-02-17 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  The way constraints are updated using the old updater is suboptimal and 
partial:
  
  - does not take the revision into account
  - only fetched on item edits
  - they all disappear after a data-reload
  - the new streaming updater does not support these
  - user impact does seem small: 370 out of 230,492,864 queries for Jan 2021 
are using the `wikibase:hasViolationForConstraint` predicate
  
  I suggest disabling them while a proper solution is found and put in place 
on the wikibase side (T201150 <https://phabricator.wikimedia.org/T201150>, 
T201147 <https://phabricator.wikimedia.org/T201147>)
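  
  To put the query counts above in perspective, the share can be worked out 
directly; it comes to roughly 1.6 queries per million. The variable names are 
only for illustration.

```python
# Figures taken from the task description (queries for January 2021).
violation_queries = 370
total_queries = 230_492_864

share = violation_queries / total_queries
print(f"{share * 1_000_000:.2f} queries per million use the predicate")
```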

TASK DETAIL
  https://phabricator.wikimedia.org/T274982

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T272120: Deleted item still gets shown in WDQS query results

2021-02-16 Thread dcausse
dcausse added a comment.


  FTR, deletes will be re-synced regularly using an ad-hoc script available at 
https://people.wikimedia.org/~dcausse/wdqs_manual_deletes/.
  This will be done until the new updater is shipped to production or until 
the root cause in the current updater is found and fixed.

TASK DETAIL
  https://phabricator.wikimedia.org/T272120

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Gehel, dcausse
Cc: M2k_dewiki, Fnielsen, Gehel, MPhamWMF, Mahir256, dcausse, Aklapper, 
Mbch331, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles


[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2021-02-16 Thread dcausse
dcausse added a comment.


  Mitigation for deletes will be made using a script that polls the deletion 
log and resyncs the items, ref: 
https://people.wikimedia.org/~dcausse/wdqs_manual_deletes/ .
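  
  For readers unfamiliar with the idea, the sketch below shows one way the 
"find what was deleted" half could look. It is not the script linked above: it 
assumes a response shaped like the MediaWiki `list=logevents` API output, and 
`deleted_entity_ids` and the sample data are made up for illustration.

```python
from typing import Any, Dict, List


def deleted_entity_ids(response: Dict[str, Any]) -> List[str]:
    """Extract titles of deleted pages from a logevents-style API response."""
    events = response.get("query", {}).get("logevents", [])
    # Keep only actual deletions; restores and other log actions are ignored.
    return [e["title"] for e in events if e.get("action") == "delete" and "title" in e]


# Hypothetical sample response, only for illustration.
sample = {"query": {"logevents": [
    {"title": "Q100", "action": "delete", "timestamp": "2021-02-16T10:00:00Z"},
    {"title": "Q200", "action": "restore", "timestamp": "2021-02-16T10:05:00Z"},
]}}
print(deleted_entity_ids(sample))  # prints ['Q100']
```

  The returned IDs would then be handed to whatever mechanism resyncs the 
items on the servers.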

TASK DETAIL
  https://phabricator.wikimedia.org/T267175

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RKemper, dcausse
Cc: Mbch331, Skim, Strepon, Multichill, Zbyszko, RKemper, Epidosis, dcausse, 
Tagishsimon, Lydia_Pintscher, CBogen, Z_thomas, agray, Gehel, 
Lucas_Werkmeister_WMDE, Aklapper, M2k_dewiki, MPhamWMF, Alter-paule, Beast1978, 
Un1tY, Akuckartz, Hook696, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Lewizho99, Maathavan, 
_jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles


[Wikidata-bugs] [Maniphest] T274519: timeout on geospatial query; resolved by OPTIONAL?

2021-02-15 Thread dcausse
dcausse closed this task as "Declined".
dcausse added a comment.


  Definitely a blazegraph optimization issue.
  Disabling the optimizer seems to also help:
  
SELECT ?place ?placeLabel ?page ?location ?dist WHERE
{
  hint:Query hint:optimizer "None".
  wd:Q84 wdt:P625 ?loc .
  SERVICE wikibase:around {
  ?place wdt:P625 ?location .
  bd:serviceParam wikibase:center ?loc .
  bd:serviceParam wikibase:radius "1" .
  }
  ?page schema:about ?place;
schema:isPartOf <https://en.wikipedia.org/> ###
  BIND(geof:distance(?loc, ?location) as ?dist)
  SERVICE wikibase:label {
bd:serviceParam wikibase:language "en" .
  }
} order by ?dist
  
  I'm closing since I think it is unlikely that we have the expertise/bandwidth 
to address such a complex issue related to blazegraph internals.

TASK DETAIL
  https://phabricator.wikimedia.org/T274519

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, VladimirAlexiev, MPhamWMF, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T267927: Reload wikidata journal from fresh dumps

2021-02-04 Thread dcausse
dcausse added a comment.


  I pre-fetched the dumps required for the reload on wdqs1010 & wdqs1009.
  
  - `wdqs1009` needs to be reloaded using `--reuse-downloaded-dump 
--reload-data wikidata --skolemize`
  - `wdqs1010` with `--reuse-downloaded-dump --reload-data wikidata`

TASK DETAIL
  https://phabricator.wikimedia.org/T267927

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RKemper, dcausse
Cc: Aklapper, dcausse, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T273636: Blazegraph journal for wcqs is too big

2021-02-03 Thread dcausse
dcausse moved this task from Incoming to Ready for Development on the 
Discovery-Search (Current work) board.
dcausse set the point value for this task to "3".

TASK DETAIL
  https://phabricator.wikimedia.org/T273636

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T273636: Blazegraph journal for wcqs is too big

2021-02-03 Thread dcausse
dcausse triaged this task as "High" priority.
dcausse added a project: Discovery-Search (Current work).

TASK DETAIL
  https://phabricator.wikimedia.org/T273636

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T273636: Blazegraph journal for wcqs is too big

2021-02-02 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  The blazegraph journal on wcqs-beta-01 has grown too big:
  
-rw-rw-r-- 1 blazegraph blazegraph 3.0T Feb  2 15:56 wcqs.jnl
  
  AC:
  
  - The blazegraph journal on this machine should be proportional to the 
dataset it serves.

TASK DETAIL
  https://phabricator.wikimedia.org/T273636

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T272120: Deleted item still gets shown in WDQS query results

2021-01-27 Thread dcausse
dcausse reopened this task as "Open".
dcausse added a comment.


  Re-opening as there still seems to be a problem related to deletes, and the 
fix done in T267175 <https://phabricator.wikimedia.org/T267175> was not 
effective.
  
  All servers seem to have missed the deletion of //Q104982840//.
  Last edit for revision 1348777513 was done on `2021-01-26T18:33:39Z`, delete 
was done at `2021-01-27T07:48:16Z`.
  
  Note that the server running the new updater has properly handled the delete.

TASK DETAIL
  https://phabricator.wikimedia.org/T272120

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, Mbch331, MPhamWMF, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2021-01-27 Thread dcausse
dcausse added a comment.


  Thanks for the comments.
  Inconsistencies for edits prior to Jan 20 (time when the last fix was 
deployed) are //expected// and will be fixed by the reload.
  Inconsistency on Q104982840 is more troubling as the delete was done after 
this date. I'll re-open this specific ticket as I wonder if there is not 
something specific to deletes happening here.

TASK DETAIL
  https://phabricator.wikimedia.org/T267175

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RKemper, dcausse
Cc: Mbch331, Skim, Strepon, Multichill, Zbyszko, RKemper, Epidosis, dcausse, 
Tagishsimon, Lydia_Pintscher, CBogen, Z_thomas, agray, Gehel, 
Lucas_Werkmeister_WMDE, Aklapper, M2k_dewiki, MPhamWMF, Alter-paule, Beast1978, 
Un1tY, Akuckartz, Hook696, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Lewizho99, Maathavan, 
_jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T272994: Ensure scalastyle import order rules are verified

2021-01-26 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As a developer of the scala projects in the wdqs repo I want scalastyle to 
check the import order so that the import order remains consistent.
  
  There seems to be a check in our rules but it does not seem effective:
  

  
java,scala,other
javax?\..*
scala\..*
.*
  

  
  We also have an IntelliJ config 
<https://repo1.maven.org/maven2/org/wikimedia/discovery/discovery-maven-tool-configs/1.14/discovery-maven-tool-configs-1.14-intellij-config.jar>
 that can be imported to assist the developer; it contains the proper import order 
<https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia/discovery/discovery-maven-tool-configs/+/refs/heads/master/src/editors/intellij/codestyles/scala_imports.xml>.
  
  AC:
  
  - import order is checked during the build run in our CI
  - import orders are consistent with the config provided through 
discovery-maven-tool-configs
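
  To illustrate what the acceptance criteria ask for, here is a minimal 
sketch of a group-order check. It is written in Python for brevity and is not 
scalastyle's implementation. The group names and patterns are the ones quoted 
in the description; the function names, the full-match semantics and the 
"first matching group wins" rule are assumptions made for this example. It 
only checks the order of groups, not the ordering inside a group.

```python
import re

# Group names and patterns copied from the rule quoted in the task description.
GROUPS = [
    ("java", re.compile(r"javax?\..*")),
    ("scala", re.compile(r"scala\..*")),
    ("other", re.compile(r".*")),
]


def group_index(import_name):
    """Return the position of the first group whose pattern matches fully (assumed semantics)."""
    for index, (_, pattern) in enumerate(GROUPS):
        if pattern.fullmatch(import_name):
            return index
    raise ValueError(f"no group matches {import_name!r}")


def out_of_order(imports):
    """List imports that appear after an import belonging to a later group."""
    offenders = []
    highest_seen = 0
    for name in imports:
        index = group_index(name)
        if index < highest_seen:
            offenders.append(name)
        highest_seen = max(highest_seen, index)
    return offenders
```

  A CI step built on this idea would fail the build whenever `out_of_order` 
returns a non-empty list for a source file.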

TASK DETAIL
  https://phabricator.wikimedia.org/T272994

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2021-01-26 Thread dcausse
dcausse merged a task: T272120: Deleted item still gets shown in WDQS query 
results.
dcausse added a subscriber: Mbch331.

TASK DETAIL
  https://phabricator.wikimedia.org/T267175

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RKemper, dcausse
Cc: Mbch331, Skim, Strepon, Multichill, Zbyszko, RKemper, Epidosis, dcausse, 
Tagishsimon, Lydia_Pintscher, CBogen, Z_thomas, agray, Gehel, 
Lucas_Werkmeister_WMDE, Aklapper, M2k_dewiki, MPhamWMF, Alter-paule, Beast1978, 
Un1tY, Akuckartz, Hook696, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Lewizho99, Maathavan, 
_jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T272120: Deleted item still gets shown in WDQS query results

2021-01-26 Thread dcausse
dcausse closed this task as a duplicate of T267175: SPARQL-Query shows entries, 
which should be filter out; number of entries in result set might change when 
executed repeatedly (possible caching/indexing problem).

TASK DETAIL
  https://phabricator.wikimedia.org/T272120

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, Mbch331, MPhamWMF, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T272120: Deleted item still gets shown in WDQS query results

2021-01-26 Thread dcausse
dcausse added a comment.


  I think this kind of inconsistency was related to the problems reported in 
T267175 <https://phabricator.wikimedia.org/T267175>.
  Thanks for the report, I manually resynced but please let me know via this 
ticket or T267175 <https://phabricator.wikimedia.org/T267175> if you see other 
stale entities.

TASK DETAIL
  https://phabricator.wikimedia.org/T272120

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, Mbch331, MPhamWMF, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, 
jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T267927: Reload wikidata journal from fresh dumps

2021-01-25 Thread dcausse
dcausse added a project: Discovery-Search (Current work).

TASK DETAIL
  https://phabricator.wikimedia.org/T267927

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2021-01-25 Thread dcausse
dcausse moved this task from To Be Deployed to Needs Reporting on the 
Discovery-Search (Current work) board.
dcausse added a comment.


  Tried to find more inconsistencies using the query provided by @Multichill 
(https://w.wiki/ugf) and could not spot any while it was very easy to find one 
previously. I'm assuming the problem is resolved and that we can proceed with 
the full reload based on new units 
(https://phabricator.wikimedia.org/T267644#6758238).
  If no new inconsistencies are reported in the meantime, here is the expected 
schedule:
  
  - units are reloaded on Friday 29 (2021-01-29)
  - full data-reload of one machine starts next Friday (2021-02-05)
  - depending on the time needed, and in the best-case scenario (no import 
failures), the reimport of all the wdqs machines should be finished by the end 
of February

TASK DETAIL
  https://phabricator.wikimedia.org/T267175

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RKemper, dcausse
Cc: Skim, Strepon, Multichill, Zbyszko, RKemper, Epidosis, dcausse, 
Tagishsimon, Lydia_Pintscher, CBogen, Z_thomas, agray, Gehel, 
Lucas_Werkmeister_WMDE, Aklapper, M2k_dewiki, MPhamWMF, Alter-paule, Beast1978, 
Un1tY, Akuckartz, Hook696, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Lewizho99, Maathavan, 
_jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T272447: Extract a list of the 200 most viewed black historical figures from WDQS

2021-01-20 Thread dcausse
dcausse added a comment.


  @Miriam this would be great indeed!
  I think we can refine the query as follows (generalize the ethnicity and 
filter on Q5):
  
    SELECT ?item WHERE {
      # instance of (P31) human (Q5)
      ?item wdt:P31 wd:Q5 ;
            # ethnic group (P172), generalized through subclass of (P279)
            wdt:P172/wdt:P279* wd:Q817393 .
    }
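
  For readers who want to run the query outside the web UI, the following is 
a minimal Python sketch that only builds the request URL. The endpoint 
`https://query.wikidata.org/sparql` and the `format=json` parameter are 
assumptions added here (they are not part of the comment), and `build_url` is 
a hypothetical helper name.

```python
from urllib.parse import urlencode

# Public WDQS SPARQL endpoint (assumed here; not named in the comment above).
ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q5 ;
        wdt:P172/wdt:P279* wd:Q817393 .
}
"""


def build_url(query, endpoint=ENDPOINT):
    """Return a GET URL asking the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query.strip(), "format": "json"})
```

  The resulting URL can be fetched with any HTTP client; the sketch stops 
short of sending the request so that it stays free of network access.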

TASK DETAIL
  https://phabricator.wikimedia.org/T272447

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: MPhamWMF, CBogen, Miriam, dcausse, Mstyles, Gehel, Aklapper, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T267644: Update Wikidata unit conversion config (normalized quantities)

2021-01-20 Thread dcausse
dcausse moved this task from Blocked (from outside the team) to Waiting on the 
Discovery-Search (Current work) board.
dcausse added a comment.


  Moving to waiting as T267175 <https://phabricator.wikimedia.org/T267175> is 
the last ticket still blocking this and is on the search team's plate.

TASK DETAIL
  https://phabricator.wikimedia.org/T267644

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Zbyszko, Aklapper, Lucas_Werkmeister_WMDE, Gehel, Toni_001, 
MPhamWMF, CBogen, Akuckartz, Iflorez, alaa_wmde, Nandana, Namenlos314, Lahi, 
Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles, Lydia_Pintscher, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T267644: Update Wikidata unit conversion config (normalized quantities)

2021-01-20 Thread dcausse
dcausse added a comment.


  In T267644#6758340 <https://phabricator.wikimedia.org/T267644#6758340>, 
@Lucas_Werkmeister_WMDE wrote:
  
  > Alright, I uploaded a new unitConversionConfig.json at 
https://gerrit.wikimedia.org/r/657131; I’ve -2ed it for now due to the 
unclarity about synchronization issues. Lexeme dumps are currently blocked, 
though (T220883 <https://phabricator.wikimedia.org/T220883>).
  
  Thanks!
  
  Lexeme RDF dumps are functional and are the ones wdqs is using.

TASK DETAIL
  https://phabricator.wikimedia.org/T267644

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Zbyszko, Aklapper, Lucas_Werkmeister_WMDE, Gehel, Toni_001, 
MPhamWMF, CBogen, Akuckartz, Iflorez, alaa_wmde, Nandana, Namenlos314, Lahi, 
Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles, Lydia_Pintscher, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T267644: Update Wikidata unit conversion config (normalized quantities)

2021-01-19 Thread dcausse
dcausse added a comment.


  Thanks for fixing this script!
  
  A few notes on the next steps and how we should synchronize our efforts:
  
  Once the script has updated the json file read by wikibase we will have to 
re-import the wdqs machines using a dump generated based on the new units.
  Given current schedules I think the ideal plan is:
  
  - update and deploy unitConversionConfig.json on a Friday before the lexeme 
dump is started (23:00 UTC on Fridays)
  - once the ttl dumps are available (lexemes and all), generally on the next 
Thursday, start the cookbook to reimport one wdqs machine
  - once the import is done and the lag is absorbed (one/two weeks) use the 
data-transfer cookbook to replicate the fresh journal to other wdqs machines
  
  As to when we should schedule:
  
  - we are still waiting on T267175 <https://phabricator.wikimedia.org/T267175> 
to make sure we fix all outstanding synchronization issues before moving 
forward with a fresh import
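
  The first two steps of the plan can be turned into dates with a small 
sketch. This is Python, assuming only what the plan states: the config goes 
out on a Friday and the dumps are generally available the following Thursday. 
The function names are hypothetical, and the sketch does not cover the import 
itself or the one to two weeks of lag absorption.

```python
from datetime import date, timedelta

FRIDAY, THURSDAY = 4, 3  # values returned by date.weekday()


def next_weekday(start, weekday):
    """First date on or after start that falls on the given weekday."""
    return start + timedelta(days=(weekday - start.weekday()) % 7)


def reload_timeline(ready):
    """Earliest deploy Friday and the Thursday on which dumps are generally available."""
    deploy = next_weekday(ready, FRIDAY)
    dumps = next_weekday(deploy + timedelta(days=1), THURSDAY)
    return {"deploy_config": deploy, "dumps_available": dumps}
```

  For example, `reload_timeline(date(2021, 1, 25))` places the config 
deployment on 2021-01-29 and the start of the reimport on 2021-02-04.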

TASK DETAIL
  https://phabricator.wikimedia.org/T267644

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Zbyszko, Aklapper, Lucas_Werkmeister_WMDE, Gehel, Toni_001, 
MPhamWMF, CBogen, Akuckartz, Iflorez, alaa_wmde, Nandana, Namenlos314, Lahi, 
Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles, Lydia_Pintscher, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2021-01-18 Thread dcausse
dcausse added a comment.


  Checked a couple of these inconsistencies and they all appear to be out of 
order in the kafka topics. I suggest disabling `async imports` as I believe it 
might be a possible cause of these inconsistencies.
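
  As an illustration of what "out of order" means here, the following Python 
sketch scans a sequence of events in topic order and reports those whose 
revision is older than one already seen for the same entity. The `Event` 
shape and the function name are assumptions made for this example; they do 
not reflect the actual event schema or the updater code.

```python
from typing import NamedTuple


class Event(NamedTuple):
    entity: str    # entity id, e.g. "Q1" (illustrative value)
    revision: int  # revision id carried by the event


def find_out_of_order(events):
    """Return events whose revision is lower than one already seen for the entity."""
    latest = {}
    stale = []
    for event in events:
        seen = latest.get(event.entity)
        if seen is not None and event.revision < seen:
            stale.append(event)
        else:
            latest[event.entity] = event.revision
    return stale
```

  A consumer that applies events strictly in arrival order would overwrite 
newer data with the stale events this function returns, which is one way the 
reported inconsistencies could arise.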

TASK DETAIL
  https://phabricator.wikimedia.org/T267175

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RKemper, dcausse
Cc: Skim, Strepon, Multichill, Zbyszko, RKemper, Epidosis, dcausse, 
Tagishsimon, Lydia_Pintscher, CBogen, Z_thomas, agray, Gehel, 
Lucas_Werkmeister_WMDE, Aklapper, M2k_dewiki, MPhamWMF, Alter-paule, Beast1978, 
Un1tY, Akuckartz, Hook696, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Lewizho99, Maathavan, 
_jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T239931: Reduce the impact of the sanitizer on wikidata

2021-01-13 Thread dcausse
dcausse assigned this task to Lucas_Werkmeister_WMDE.
dcausse moved this task from To Be Deployed to Needs Reporting on the 
Discovery-Search (Current work) board.

TASK DETAIL
  https://phabricator.wikimedia.org/T239931

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Lucas_Werkmeister_WMDE, dcausse
Cc: Gehel, Lucas_Werkmeister_WMDE, tfmorris, CBogen, EBernhardson, Ladsgroup, 
Addshore, Aklapper, dcausse, MPhamWMF, Wilmanbeno, Alter-paule, Beast1978, 
Un1tY, Akuckartz, Hook696, Iflorez, Kent7301, alaa_wmde, joker88john, CucyNoiD, 
Nandana, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, Lewizho99, Maathavan, _jensen, 
rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, jayvdb, Lydia_Pintscher, 
Mbch331, jeremyb
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T271851: Clean up gui from the wdqs deploy repo and puppet

2021-01-13 Thread dcausse
dcausse assigned this task to Ladsgroup.
dcausse added a project: Discovery-Search (Current work).
Restricted Application added a project: User-Ladsgroup.

TASK DETAIL
  https://phabricator.wikimedia.org/T271851

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Ladsgroup, dcausse
Cc: Addshore, dcausse, RKemper, Gehel, Aklapper, Ladsgroup, MPhamWMF, 
Alter-paule, Hazizibinmahdi, Beast1978, CBogen, Un1tY, Akuckartz, Hook696, 
Kent7301, joker88john, CucyNoiD, Nandana, Namenlos314, Gaboe420, Giuliamocci, 
Cpaulf30, Lahi, Gq86, Af420, Bsandipan, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, Mahir256, QZanden, EBjune, merbst, LawExplorer, Salgo60, 
Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Lydia_Pintscher, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T239931: Reduce the impact of the sanitizer on wikidata

2021-01-11 Thread dcausse
dcausse added a comment.


  In T239931#6736120 <https://phabricator.wikimedia.org/T239931#6736120>, 
@Lucas_Werkmeister_WMDE wrote:
  
  > Agreed. Should we (Wikidata team) do the config change or leave it to you? 
:)
  
  We can ship the config change no worries :)

TASK DETAIL
  https://phabricator.wikimedia.org/T239931

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Gehel, Lucas_Werkmeister_WMDE, tfmorris, CBogen, EBernhardson, Ladsgroup, 
Addshore, Aklapper, dcausse, MPhamWMF, Wilmanbeno, Alter-paule, Beast1978, 
Un1tY, Akuckartz, Hook696, Iflorez, Kent7301, alaa_wmde, joker88john, CucyNoiD, 
Nandana, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, Lewizho99, Maathavan, _jensen, 
rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, jayvdb, Lydia_Pintscher, 
Mbch331, jeremyb
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T239931: Reduce the impact of the sanitizer on wikidata

2021-01-11 Thread dcausse
dcausse added a comment.


  In T239931#6719994 <https://phabricator.wikimedia.org/T239931#6719994>, 
@EBernhardson wrote:
  
  > With the holidays over and everyone back, i think we can turn this on?
  
  Sounds good to me!

TASK DETAIL
  https://phabricator.wikimedia.org/T239931

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Gehel, Lucas_Werkmeister_WMDE, tfmorris, CBogen, EBernhardson, Ladsgroup, 
Addshore, Aklapper, dcausse, MPhamWMF, Wilmanbeno, Alter-paule, Beast1978, 
Un1tY, Akuckartz, Hook696, Iflorez, Kent7301, alaa_wmde, joker88john, CucyNoiD, 
Nandana, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, Lewizho99, Maathavan, _jensen, 
rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, jayvdb, Lydia_Pintscher, 
Mbch331, jeremyb
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2021-01-05 Thread dcausse
dcausse merged a task: T270975: Some lexemes cannot be obtained by SPARQL query.
dcausse added subscribers: Strepon, Skim.

TASK DETAIL
  https://phabricator.wikimedia.org/T267175

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RKemper, dcausse
Cc: Skim, Strepon, Multichill, Zbyszko, RKemper, Epidosis, dcausse, 
Tagishsimon, Lydia_Pintscher, CBogen, Z_thomas, agray, Gehel, 
Lucas_Werkmeister_WMDE, Aklapper, M2k_dewiki, MPhamWMF, Alter-paule, Beast1978, 
Un1tY, Akuckartz, Hook696, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Lewizho99, Maathavan, 
_jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T270975: Some lexemes cannot be obtained by SPARQL query

2021-01-05 Thread dcausse
dcausse closed this task as a duplicate of T267175: SPARQL-Query shows entries, 
which should be filter out; number of entries in result set might change when 
executed repeatedly (possible caching/indexing problem).

TASK DETAIL
  https://phabricator.wikimedia.org/T270975

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Skim, Strepon, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, 
Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Mahir256, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Bodhisattwa, Scott_WUaS, 
Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, 
Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T270975: Some lexemes cannot be obtained by SPARQL query

2021-01-05 Thread dcausse
dcausse added a comment.


  I manually refreshed the entities mentioned here, the root cause is already 
being worked on in a separate issue and thus closing this ticket as duplicate 
of it.

TASK DETAIL
  https://phabricator.wikimedia.org/T270975

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Skim, Strepon, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, 
Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Mahir256, QZanden, 
EBjune, merbst, LawExplorer, _jensen, rosalieper, Bodhisattwa, Scott_WUaS, 
Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, 
Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T270614: Automatically depool wdqs servers that are "lagged"

2020-12-21 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As a wdqs user I would like servers that are lagged to be depooled so that I 
don't experience stale results.
  
  We (wdqs maintainers) often have to depool wdqs servers manually because they 
are heavily lagged; this has several drawbacks:
  
  - it relies on a manual intervention
  - the operator that depooled the server in the first place must remember to 
repool the server once the lag is back to acceptable values
  
  AC:
  
  - a server should be automatically depooled if the lag reaches a certain 
threshold (re-use the same threshold used by icinga?)
  - a server should be automatically repooled when its lag is back to normal 
values
  - do not automatically depool so many servers that the remaining ones cannot 
serve the current traffic
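
  The three criteria above can be sketched as a single stateless decision 
function. This is a Python illustration under stated assumptions: the host 
names, the threshold value and the `min_pooled` parameter are hypothetical, 
and the real pooling mechanism is not shown.

```python
def plan_pool_state(lags, threshold, min_pooled):
    """Return {server: True if it should be pooled, False if depooled}.

    lags:       mapping of server name to update lag in seconds
    threshold:  lag above which a server counts as lagged
    min_pooled: minimum number of servers that must stay pooled
    """
    pooled = {server: True for server in lags}
    # Depool the most lagged servers first, but never more than capacity allows.
    lagged = sorted(
        (server for server in lags if lags[server] > threshold),
        key=lambda server: lags[server],
        reverse=True,
    )
    allowed = max(0, len(lags) - min_pooled)
    for server in lagged[:allowed]:
        pooled[server] = False
    return pooled
```

  Because the function is recomputed from the current lag values on every 
run, a server is repooled as soon as its lag drops back under the threshold, 
without anyone having to remember to do it.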

TASK DETAIL
  https://phabricator.wikimedia.org/T270614

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Aklapper, dcausse, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2020-12-17 Thread dcausse
dcausse added a comment.


  In T269619#6696071 <https://phabricator.wikimedia.org/T269619#6696071>, 
@Ottomata wrote:
  
  > It depends on what you want to do :) EventGate will handle multi DC, 
filling some default values, and topic prefixes for you, but is an extra hop to 
Kafka.  As a prod system in a language with a good Kafka client, producing to 
Kafka is totally allowed.  You'd be the first main user of event platform not 
going through EventGate, but it is definitely something we want to support.
  >
  > Perhaps we'll want to build in some logic for doing some of the things 
EventGate is doing (as a proxy) into wikimedia-event-utilities (including 
validation, as you suggested).
  
  I think the most important thing for me is that these events get stored in 
HDFS properly and I'm not sure what the requirements are for this. I went with 
event-gate because I know it works :)
  
  But using the flink kafka producers makes more sense: it's robust and easy 
to use, but I was worried about missing important parts of the process. If this 
kind of use case is something we'd like to support then I think I should go 
with this approach. I created T270371 
<https://phabricator.wikimedia.org/T270371> to discuss all the tools that 
wikimedia-event-utilities should provide to help this use case.

TASK DETAIL
  https://phabricator.wikimedia.org/T269619

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Ottomata, Aklapper, Gehel, Mstyles, MPhamWMF, Alter-paule, Beast1978, 
CBogen, Un1tY, Akuckartz, Hook696, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T270371: wikimedia-event-utilities should provide tools for JVM based apps producing directly to kafka

2020-12-17 Thread dcausse
dcausse created this task.
dcausse added projects: Wikidata-Query-Service, Analytics.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As a user of the event platform I want wikimedia-event-utilities to have all 
the functionalities that event-gate provides so that I can use my own kafka 
producers to ship sane events.
  
  In the context of a flink application it is generally simpler to use an 
existing kafka producer, but there need to be tools available to make sure the 
json events that are produced are valid with respect to the event-platform 
guidelines, and wikimedia-event-utilities seems to be the best place to put 
these tools for JVM based applications.
  
  AC:
  
  - there should be a way to validate an event against its json schema
  - there should be utilities to help prefix the kafka topic with the DC name
  - anything that would help to prevent shipping invalid events
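
  To make the first two criteria concrete, here is a minimal Python sketch of 
the kind of helpers being asked for. It is not the wikimedia-event-utilities 
API: the reduced schema, the field names and both function names are 
hypothetical, and the hand-written type check stands in for a real JSON 
Schema validator.

```python
# Hypothetical, heavily reduced schema: required field name -> expected Python type.
SCHEMA = {"$schema": str, "meta": dict, "entity": str, "revision": int}


def validation_errors(event, schema=SCHEMA):
    """Return a list of human-readable problems; an empty list means the event passes."""
    errors = []
    for field, expected in schema.items():
        if field not in event:
            errors.append(f"missing field: {field}")
        elif not isinstance(event[field], expected):
            errors.append(f"wrong type for {field}: {type(event[field]).__name__}")
    return errors


def prefixed_topic(datacenter, stream):
    """Prefix a stream name with the datacenter name, e.g. 'dc1' + 'x.y' -> 'dc1.x.y'."""
    return f"{datacenter}.{stream}"
```

  A producer would call `validation_errors` before sending and refuse to 
ship any event for which the list is non-empty.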

TASK DETAIL
  https://phabricator.wikimedia.org/T270371

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Ottomata, dcausse, Aklapper, MPhamWMF, CBogen, Akuckartz, 4748kitoko, 
Nandana, Namenlos314, Akovalyov, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Jonas, Xmlizer, JAllemandou, terrrydactyl, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Mbch331, jeremyb
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2020-12-16 Thread dcausse
dcausse added a comment.


  In T269619#6695454 <https://phabricator.wikimedia.org/T269619#6695454>, 
@Ottomata wrote:
  
  > @dcausse, will these be POSTed to an EventGate, or produced directly to 
Kafka?
  
  I plan to POST them to event gate using a very naive SinkFunction and see how 
it behaves but I can push directly to kafka if it's preferable?

TASK DETAIL
  https://phabricator.wikimedia.org/T269619

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Ottomata, Aklapper, Gehel, Mstyles, MPhamWMF, Alter-paule, Beast1978, 
CBogen, Un1tY, Akuckartz, Hook696, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T269619: Create pipelines for late/spurious/failed events

2020-12-16 Thread dcausse
dcausse added a comment.


  In T269619#6693136 <https://phabricator.wikimedia.org/T269619#6693136>, 
@Ottomata wrote:
  
  > @dcausse for retrieving schemas, 
https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master
 might help. :)
  
  Thanks!
  It worked like a charm, but I still had to pull 
`com.github.java-json-tools:json-schema-validator:2.2.14` to do schema 
validation. Would it make sense to add some helper functions to 
`wikimedia-event-utilities` for validating a JSON document against its schema? 
The use case is a unit test that makes sure the JSON produced is compliant with 
the schema it's referencing.

TASK DETAIL
  https://phabricator.wikimedia.org/T269619

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Ottomata, Aklapper, Gehel, Mstyles, MPhamWMF, Alter-paule, Beast1978, 
CBogen, Un1tY, Akuckartz, Hook696, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T270245: Jmx metrics for blazegraph are no longer visible in grafana

2020-12-16 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As a wdqs maintainer I want to see metrics exported from the jmx prometheus 
exporter in grafana so that I can monitor the health of blazegraph.
  
  These metrics seem to have stopped being collected some time ago while still 
being properly exported from the jmx agent itself:
  
dcausse@wdqs1010:~$ curl -s localhost:9102 | grep jvm_threads_current
# HELP jvm_threads_current Current thread count of a JVM
# TYPE jvm_threads_current gauge
jvm_threads_current 151.0
  
  They are not reaching prometheus: according to netstat on one wdqs machine, 
prometheus does not even seem to try to connect to the agent and poll the metrics.
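  
  The exporter-side check can be scripted. The sketch below parses the text 
output shown above and returns the value of one unlabelled metric; the function 
name is hypothetical and samples with labels are deliberately ignored.

```python
from typing import Optional

# Exporter output copied from the curl command in this task.
SAMPLE = """\
# HELP jvm_threads_current Current thread count of a JVM
# TYPE jvm_threads_current gauge
jvm_threads_current 151.0
"""


def read_metric(exposition: str, name: str) -> Optional[float]:
    """Return the value of the unlabelled sample `name`, or None if absent."""
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        # Exact name match only: 'name{label="x"} 1' is not handled here.
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None


if __name__ == "__main__":
    print(read_metric(SAMPLE, "jvm_threads_current"))
```

  A `None` result would point at the exporter, while a valid number combined 
with empty graphs points at the scraping side, which is the situation described 
here.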
  
  AC:
  
  - the graphs: //Throttled requests//, //Banned requests//, //heap used// and 
//thread count// should display data on 
https://grafana-rw.wikimedia.org/d/00489/wikidata-query-service?orgId=1=1m

TASK DETAIL
  https://phabricator.wikimedia.org/T270245

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, MPhamWMF, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-14 Thread dcausse
dcausse assigned this task to RKemper.
dcausse moved this task from Ready for Development to Needs Reporting on the 
Discovery-Search (Current work) board.
dcausse added a comment.


  Seems to be resolved now:
  
dcausse@mwmaint1002:~$ mwscript 
extensions/Wikidata.org/maintenance/updateQueryServiceLag.php --wiki 
wikidatawiki --cluster wdqs --prometheus prometheus.svc.eqiad.wmnet 
--prometheus prometheus.svc.codfw.wmnet
Done.

TASK DETAIL
  https://phabricator.wikimedia.org/T269693

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RKemper, dcausse
Cc: dcausse, Dzahn, RLazarus, RKemper, Aklapper, Ramtin0071, Devnull, lmata, 
Muchiri124, CBogen, Akuckartz, Legado_Shulgin, Nandana, Namenlos314, 
Davinaclare77, Qtn1293, Techguru.pc, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, 
Zppix, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, Wong128hk, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Rxy, 
Jay8g, fgiunchedi


[Wikidata-bugs] [Maniphest] T239931: Reduce the impact of the sanitizer on wikidata

2020-12-14 Thread dcausse
dcausse added a comment.


  The sanitizer is working OK with increased concurrency (T266762 
<https://phabricator.wikimedia.org/T266762>), so we might try to enable it again 
on wikidata and see how it performs.

TASK DETAIL
  https://phabricator.wikimedia.org/T239931

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Gehel, Lucas_Werkmeister_WMDE, tfmorris, CBogen, EBernhardson, Ladsgroup, 
Addshore, Aklapper, dcausse, Wilmanbeno, Alter-paule, Beast1978, Un1tY, 
Akuckartz, Hook696, Iflorez, Kent7301, alaa_wmde, joker88john, CucyNoiD, 
Nandana, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, Lewizho99, Maathavan, _jensen, 
rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, jayvdb, Lydia_Pintscher, 
Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2020-12-07 Thread dcausse
dcausse moved this task from Needs review to In Progress on the 
Discovery-Search (Current work) board.
dcausse reassigned this task from Zbyszko to RKemper.
dcausse added subscribers: RKemper, Zbyszko.
dcausse added a comment.


  We will re-enable the kafka poller that was disabled for security reasons 
back in January. The plan is as follows:
  
  - merge the test patch https://gerrit.wikimedia.org/r/646631 and verify for 
one day that wdqs1010 behaves correctly
  - merge https://gerrit.wikimedia.org/r/646632 to enable the kafka poller on 
all the machines
  
  Moving back to in progress and re-assigning to @RKemper for merging the test 
patch.

TASK DETAIL
  https://phabricator.wikimedia.org/T267175

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RKemper, dcausse
Cc: Zbyszko, RKemper, Epidosis, dcausse, Tagishsimon, Lydia_Pintscher, CBogen, 
Z_thomas, agray, Gehel, Lucas_Werkmeister_WMDE, Aklapper, M2k_dewiki, 
Alter-paule, Beast1978, Un1tY, Akuckartz, Hook696, Kent7301, joker88john, 
CucyNoiD, Nandana, Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, 
Af420, Bsandipan, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, 
Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T269331: wdqs.data-reload cookbook fails when deleting the old namespace

2020-12-04 Thread dcausse
dcausse renamed this task from "wdqs.data-reload cookbook fails when switching" 
to "wdqs.data-reload cookbook fails when deleting the old namespace".

TASK DETAIL
  https://phabricator.wikimedia.org/T269331

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: RKemper, dcausse, Aklapper, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T244590: [Epic] Rework the WDQS updater as an event driven application

2020-12-04 Thread dcausse
dcausse added a subtask: T269451: Possible flink optimizations/cleanups.

TASK DETAIL
  https://phabricator.wikimedia.org/T244590

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Thadguidry, tfmorris, revi, Ladsgroup, Multichill, darthmon_wmde, Iamamz3, 
Smalyshev, Ottomata, JAllemandou, Aklapper, Zbyszko, Gehel, dcausse, 
NavinRizwi, CBogen, Akuckartz, DannyS712, Nandana, Namenlos314, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Dinoguy1000, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T269451: Possible flink optimizations/cleanups

2020-12-04 Thread dcausse
dcausse added a parent task: T244590: [Epic] Rework the WDQS updater as an 
event driven application.

TASK DETAIL
  https://phabricator.wikimedia.org/T269451

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T269451: Possible flink optimizations/cleanups

2020-12-04 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As part of the flink review with ververica here are the few points we agreed 
to experiment with before our next meeting in January:
  
  - remove unnecessary chained operators (e.g. routing to side outputs can be 
done directly inside the same process function)
  - cleanup unnecessary serialization of the Patch class, might just be 
necessary to declare the Statement interface to Kryo
  - use of DataStreamUtils#reinterpretAsKeyedStream(DataStream, 
KeySelector) and possibly use the KeyedStream signature a bit more
  - try to drop custom parallelism
  - Enable Object reuse
  - Test unaligned checkpoints on backfills

TASK DETAIL
  https://phabricator.wikimedia.org/T269451

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T233204: Mixup of unicode characters in Query Service

2020-12-04 Thread dcausse
dcausse removed dcausse as the assignee of this task.
dcausse removed a project: Discovery-Search (Current work).
dcausse added a comment.


  Moving back to the backlog to re-evaluate the priority

TASK DETAIL
  https://phabricator.wikimedia.org/T233204

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Unjoanqualsevol, Nikki, CamelCaseNick, Smalyshev, Aklapper, 
Lucas_Werkmeister_WMDE, Igorkim78, Gehel, Lea_Lacroix_WMDE, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T269421: Upgrade blazegraph to recent ICU4J version

2020-12-04 Thread dcausse
dcausse removed dcausse as the assignee of this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T269421

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T269421: Upgrade blazegraph to recent ICU4J version

2020-12-04 Thread dcausse
dcausse created this task.
dcausse added projects: Wikidata-Query-Service, Wikidata.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  As a wdqs user I want the service to not conflate unrelated characters so that 
I can see what is actually stored in wikidata.
  
  Note that upgrading this will likely require rebuilding the journal.
  
  AC:
  
  -  should not be conflated with Ꜵ

TASK DETAIL
  https://phabricator.wikimedia.org/T269421

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T233204: Mixup of unicode characters in Query Service

2020-12-04 Thread dcausse
dcausse added a comment.


  I did some experiments using one chunk of our dumps which accounts for 
31,883,361 triples which is ~3‰ of the dump size.
  The journal size using the default //tertiary// strength is 154Gb; it grows 
to 174Gb using //identical//, which is close to a 13% increase in size. Assuming 
that this increase remains linear we would jump from 886Gb to 1Tb (114Gb 
increase) on a current production machine.
  As for the benefit (the terms that are no longer conflated): //identical// 
allows storing 9855953 terms vs 9855878 for //tertiary//, which means that out 
of the 9855953 terms I inspected only **75** are conflated.
  Using collation strength //Identical// does not seem to be the right approach 
to me (cost vs benefit).
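  
  The arithmetic behind these figures can be replayed as follows. The inputs 
are the numbers quoted above; the linear scaling to the production journal is 
the same assumption as in the text, and the variable names are made up for the 
sketch.

```python
# Figures quoted in this comment (sizes in Gb, as written there).
TERTIARY_GB = 154
IDENTICAL_GB = 174
PRODUCTION_GB = 886
TERMS_IDENTICAL = 9855953
TERMS_TERTIARY = 9855878

# Relative growth of the test journal when switching collation strength.
growth = IDENTICAL_GB / TERTIARY_GB - 1

# Assumption: the production journal grows by the same ratio.
projected_gb = PRODUCTION_GB * (1 + growth)
extra_gb = projected_gb - PRODUCTION_GB

# Terms that are distinct with identical strength but merged with tertiary.
no_longer_conflated = TERMS_IDENTICAL - TERMS_TERTIARY

if __name__ == "__main__":
    print(f"relative growth: {growth:.1%}")
    print(f"projected production journal: {projected_gb:.0f} Gb")
    print(f"extra space: {extra_gb:.0f} Gb")
    print(f"terms no longer conflated: {no_longer_conflated}")
```

  Without rounding the projection to 1Tb, the extra space comes out at roughly 
115Gb, which is in line with the 114Gb estimate above.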
  
  I believe we should at least fix the obvious ICU issues by upgrading the 
version used by blazegraph but concerning the symbols (P13502 
<https://phabricator.wikimedia.org/P13502>) we should try to find an 
alternative at the blazegraph level that does not involve a 13% increase in 
journal size.
  
  I wonder for instance why blazegraph is using collation for building its keys 
here: is the term index used for sorting or doing range queries? If not, maybe 
there would be a way to add a custom key generator that just does NFC 
normalization and uses UTF-8 for the Term2ID index, a bit like what lucene does.
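  
  A rough illustration of that idea, written in Python rather than against the 
blazegraph key builder: the key is the UTF-8 encoding of the NFC-normalized 
term. The function name `term_key` is hypothetical.

```python
import unicodedata


def term_key(term: str) -> bytes:
    """Build an index key from NFC normalization followed by UTF-8 encoding."""
    return unicodedata.normalize("NFC", term).encode("utf-8")


if __name__ == "__main__":
    # The two circled-twelve symbols stay distinct because NFC only applies
    # canonical (not compatibility) mappings.
    print(term_key("\u246b") != term_key("\u24ec"))
    # Canonically equivalent spellings of "é" still map to the same key.
    print(term_key("e\u0301") == term_key("\u00e9"))
```

  Keys built this way compare byte-wise, so they would not provide 
locale-aware ordering; that only matters if the term index is in fact used for 
sorting or range queries, which is the open question above.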
  
  To summarize:
  
  - using //Identical// does not seem to be a viable solution to solve this issue
  - upgrading blazegraph to a newer version of ICU will solve **some** of the 
problems
  - evaluate other approaches for computing the Term2ID keys to stop conflating 
symbols
  
  Given that blazegraph is unmaintained I'm pessimistic about the third point; 
the second point sounds more approachable.

TASK DETAIL
  https://phabricator.wikimedia.org/T233204

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Unjoanqualsevol, Nikki, CamelCaseNick, Smalyshev, Aklapper, 
Lucas_Werkmeister_WMDE, Igorkim78, Gehel, Lea_Lacroix_WMDE, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T269204: Some wdqs metrics changed when switching to python3

2020-12-04 Thread dcausse
dcausse added a project: Discovery-Search (Current work).

TASK DETAIL
  https://phabricator.wikimedia.org/T269204

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RKemper, dcausse
Cc: Gehel, RKemper, dcausse, Aklapper, lmata, CBogen, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, herron, Lucas_Werkmeister_WMDE, GoranSMilovanovic, 
Chicocvenancio, QZanden, EBjune, merbst, LawExplorer, Volans, _jensen, 
rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles, Mbch331, fgiunchedi


[Wikidata-bugs] [Maniphest] T269331: wdqs.data-reload cookbook fails when switching

2020-12-03 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As a wdqs maintainer I want to reload all categories on wdqs machines without 
errors.
  
Processing zuwiktionary...
http://www.w3.org/TR/html4/loose.dtd;>blazegraph by SYSTAPtotalElapsed=1ms, elapsed=1ms, connFlush=0ms, batchResolve=0, 
whereClause=0ms, deleteClause=0ms, insertClause=0msCOMMIT: totalElapsed=29ms, commitTime=1606887972763, 
mutationCount=197DELETE NAMESPACE: namespace=categories20200416
java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
com.bigdata.rdf.sail.webapp.DatasetNotFoundException: 
namespace=categories20200416

at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:206)
at 
com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:292)
at 
com.bigdata.rdf.sail.webapp.MultiTenancyServlet.doDeleteNamespace(MultiTenancyServlet.java:622)

[...]
Caused by: java.lang.RuntimeException: 
com.bigdata.rdf.sail.webapp.DatasetNotFoundException: 
namespace=categories20200416

at 
com.bigdata.rdf.sail.BigdataSail.getUnisolatedConnectionLocksAndRunLambda(BigdataSail.java:1459)
[...]
... 1 more
Caused by: com.bigdata.rdf.sail.webapp.DatasetNotFoundException: 
namespace=categories20200416
[...]
... 12 more
2020-12-02T05:46:12+00:00 categories reload done

PASS |███| 100% (1/1) [1:33:15<00:00, 
5595.63s/hosts]
FAIL | |   0% (0/1) 
[1:33:15= 100.0% threshold) for command: 
'/usr/local/bin/r...tegories.sh wdqs'.
100.0% (1/1) success ratio (>= 100.0% threshold) of nodes successfully 
executed all commands.
Categories loaded in 1:33:46.955369
Enabling Puppet with reason "T259588: reload categories on all wdqs - 
ryankemper@cumin1001 - T259588" on 1 hosts: wdqs1006.eqiad.wmnet
- OUTPUT of 'enable-puppet "T...n1001 - T259588"' -

PASS |███| 100% (1/1) [00:02<00:00, 
 2.86s/hosts]
FAIL |   |   0% (0/1) 
[00:02= 100.0% threshold) for command: 
'enable-puppet "T...n1001 - T259588"'.
100.0% (1/1) success ratio (>= 100.0% threshold) of nodes successfully 
executed all commands.
- OUTPUT of 'awk '/^\s*comman...cinga/icinga.cfg' -
/var/lib/icinga/rw/icinga.cmd

PASS |███| 100% (1/1) [00:00<00:00, 
 2.08hosts/s]
FAIL |   |   0% (0/1) 
[00:00= 100.0% threshold) for command: 'awk 
'/^\s*comman...cinga/icinga.cfg'.
100.0% (1/1) success ratio (>= 100.0% threshold) of nodes successfully 
executed all commands.
- OUTPUT of 'echo -n "[160688...ga/rw/icinga.cmd' -

PASS |███| 100% (1/1) [00:00<00:00, 
 1.73hosts/s]
FAIL |   |   0% (0/1) 
[00:00= 100.0% threshold) for command: 'echo -n 
"[160688...ga/rw/icinga.cmd'.
100.0% (1/1) success ratio (>= 100.0% threshold) of nodes successfully 
executed all commands.
END (PASS) - Cookbook sre.wdqs.data-reload (exit_code=0)
done!
  
  AC:
  
  - reloading categories works without errors

TASK DETAIL
  https://phabricator.wikimedia.org/T269331

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: RKemper, dcausse, Aklapper, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T269204: Some wdqs metrics changed when switching to python3

2020-12-02 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T269204

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: RKemper, dcausse, Aklapper, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T269204: Some wdqs metrics changed when switching to python3

2020-12-02 Thread dcausse
dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  As a WDQS maintainer I want all metrics that wdqs reports to prometheus to have 
a consistent name no matter what version of python is being used so that I can 
have the same dashboards for all WDQS nodes.
  
  In 
https://github.com/prometheus/client_python/commit/a4dd93bcc6a0422e10cfa585048d1813909c6786
 counter metrics were forcibly suffixed with //_total//. Since the switch to 
python3 (buster?) and a new prometheus client all the counter metrics produced 
by this new python client will now have //_total// appended and notably the 
//blazegraph_lastupdated// counter which is used to monitor the update lag.
  
  One solution could be to explicitly append the //_total// suffix to existing 
counters and migrate dependent tools/dashboards to it.
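  
  A sketch of the renaming rule, assuming the fix is applied to the metric 
names themselves before they are registered. The helper names are hypothetical 
and nothing here calls the prometheus client.

```python
def counter_name(name: str) -> str:
    """Return `name` with the `_total` suffix, adding it only when missing."""
    return name if name.endswith("_total") else name + "_total"


def rename_plan(existing_counters):
    """Map each current counter name to the name dashboards should query."""
    return {name: counter_name(name) for name in existing_counters}


if __name__ == "__main__":
    print(rename_plan(["blazegraph_lastupdated"]))
```

  Because the function is idempotent, the same code could run on hosts with 
either client version and yield identical names.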
  
  AC
  
  - update lag is properly monitored on wdqs1011-wdqs1013
  - counter metrics are consistent even in mixed stretch/buster env

TASK DETAIL
  https://phabricator.wikimedia.org/T269204

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: RKemper, dcausse, Aklapper, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2020-12-01 Thread dcausse
dcausse merged a task: T268408: Query returns outdated results .
dcausse added a subscriber: Epidosis.

TASK DETAIL
  https://phabricator.wikimedia.org/T267175

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Zbyszko, dcausse
Cc: Epidosis, dcausse, Tagishsimon, Lydia_Pintscher, CBogen, Z_thomas, agray, 
Gehel, Lucas_Werkmeister_WMDE, Aklapper, M2k_dewiki, Akuckartz, Nandana, 
Namenlos314, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T268408: Query returns outdated results

2020-12-01 Thread dcausse
dcausse closed this task as a duplicate of T267175: SPARQL-Query shows entries, 
which should be filter out; number of entries in result set might change when 
executed repeatedly (possible caching/indexing problem).

TASK DETAIL
  https://phabricator.wikimedia.org/T268408

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Aklapper, Epidosis, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T267175: SPARQL-Query shows entries, which should be filter out; number of entries in result set might change when executed repeatedly (possible caching/indexing problem)

2020-12-01 Thread dcausse
dcausse merged a task: T267924: WDQS Updater (based on recent changes) missed 
some updates.

TASK DETAIL
  https://phabricator.wikimedia.org/T267175

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Zbyszko, dcausse
Cc: dcausse, Tagishsimon, Lydia_Pintscher, CBogen, Z_thomas, agray, Gehel, 
Lucas_Werkmeister_WMDE, Aklapper, M2k_dewiki, Akuckartz, Nandana, Namenlos314, 
Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T267924: WDQS Updater (based on recent changes) missed some updates

2020-12-01 Thread dcausse
dcausse closed this task as a duplicate of T267175: SPARQL-Query shows entries, 
which should be filter out; number of entries in result set might change when 
executed repeatedly (possible caching/indexing problem).

TASK DETAIL
  https://phabricator.wikimedia.org/T267924

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Gehel, Aklapper, dcausse, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T233204: Mixup of unicode characters in Query Service

2020-12-01 Thread dcausse
dcausse added a comment.


  Ꜵ being conflated with  is a bug in the version of ICU4j we use, switching 
to ICU 68.1 (we currently use 4.8 from 2011) solves the problem.
  
  Other issues related to similar chars (⑫ vs ⓬) do indeed require switching 
the collation strength to //identical// which will increase the key 
sizes by ~80%. Hard to tell what is the actual impact on the blazegraph journal 
size. As discussed in https://github.com/blazegraph/database/issues/93 it does 
seem that query perf should not be affected too much.
  The user impact is hard to evaluate as well. While it's clearly 
wrong when two terms are conflated, we have no idea how useful it can 
be when the terms are not ambiguous. There are queries that are perhaps relying 
on this to find results.
  In P13502 <https://phabricator.wikimedia.org/P13502> I've listed (via a 
brute-force search) the characters that would no longer be conflated using 
//identical//. This sadly does not take into account sequences (like emojis and 
the angola flag) for which I don't have great ideas on how to evaluate the 
impact, this particular problem could well be very isolated.
  
  Concerning the version of ICU we currently use, I believe that using 
//identical// will solve most of these problems but it's probable that we might 
be affected by other bugs, esp. when sorting. This probably deserves its own 
ticket and is more related to blazegraph's tech debt.
  
  To move this ticket forward it does seem clear that we can't enable this 
option on production machines without prior testing on sizes but also on user 
impact.
  We don't have enough machines to run multiple tests at the same time and we 
might have to either:
  
  - wait for the planned tests (blank node removal with the streaming updater) 
to finish
  - or do it at the same time.
  
  I'll prepare some puppet patches in the meantime

TASK DETAIL
  https://phabricator.wikimedia.org/T233204

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Unjoanqualsevol, Nikki, CamelCaseNick, Smalyshev, Aklapper, 
Lucas_Werkmeister_WMDE, Igorkim78, Gehel, Lea_Lacroix_WMDE, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T233204: Mixup of unicode characters in Query Service

2020-11-30 Thread dcausse
dcausse claimed this task.
dcausse moved this task from Ready for Development to In Progress on the 
Discovery-Search (Current work) board.

TASK DETAIL
  https://phabricator.wikimedia.org/T233204

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: Unjoanqualsevol, Nikki, CamelCaseNick, Smalyshev, Aklapper, 
Lucas_Werkmeister_WMDE, Igorkim78, Gehel, Lea_Lacroix_WMDE, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T250140: icinga: WDQS high update lag should alert when the service times out

2020-11-30 Thread dcausse
dcausse updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T250140

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: William_Avery, Aklapper, Addshore, dcausse, lmata, CBogen, Akuckartz, 
Nandana, Namenlos314, Lahi, Gq86, herron, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, Chicocvenancio, QZanden, EBjune, merbst, LawExplorer, 
Volans, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles, Mbch331, fgiunchedi

