[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-09-07 Thread EBernhardson
EBernhardson added a comment.


  To do something about this some number needs to be decided on. What is the 
appropriate amount of acceptable lag?  From the search engine perspective a 
minute or even five is considered fully working.

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Gehel, EBernhardson
Cc: Esc3300, So9q, Pchelolo, hnowlan, EBernhardson, dcausse, Nikki, Aklapper, 
Lydia_Pintscher, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, 
Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Wikidata-bugs, aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-09-07 Thread Esc3300
Esc3300 added a comment.


  I'm not sure if everybody knows what "60s" or "162s" (what may seem quick) 
actually means to editors: it's the time they wait between creating the item 
and adding it as a statement value (which there seems long).
  
  Personally, I paste qids as search isn't efficient for this.
  
  An alternative could be to finally build a feature that allows creating new 
items directly from statement entry (e.g. type a label, click "create person 
item").

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Gehel, Esc3300
Cc: Esc3300, So9q, Pchelolo, hnowlan, EBernhardson, dcausse, Nikki, Aklapper, 
Lydia_Pintscher, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, 
Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Wikidata-bugs, aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-09-07 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.


  We have editors experiencing problems like this again. Can you have another 
look please?

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Gehel, Lydia_Pintscher
Cc: So9q, Pchelolo, hnowlan, EBernhardson, dcausse, Nikki, Aklapper, 
Lydia_Pintscher, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, 
Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Wikidata-bugs, aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-08-23 Thread Gehel
Gehel closed this task as "Resolved".
Gehel claimed this task.

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Gehel
Cc: So9q, Pchelolo, hnowlan, EBernhardson, dcausse, Nikki, Aklapper, 
Lydia_Pintscher, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, 
Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Wikidata-bugs, aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-08-20 Thread So9q
So9q added a comment.


  @EBernhardson That is a very nice script! Thanks for annotating it, so I 
almost understand everything going on.  
  I just ran it for a few minutes and I saw lags of 70-162 seconds. The highest 
being of a lexeme.

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: So9q
Cc: So9q, Pchelolo, hnowlan, EBernhardson, dcausse, Nikki, Aklapper, 
Lydia_Pintscher, Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, 
Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Wikidata-bugs, aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-08-17 Thread EBernhardson
EBernhardson added a comment.


  I wrote a quick script (P17040 ) to 
read the public event streams to get revision timestamps, and then poll the 
mediawiki api's to find out which revision we have indexed in search. I haven't 
run any fancy stats over it, but the output over a few minutes of running is 
all numbers in the 10-60s range.  Plausibly the concept of this script could be 
expanded into a prometheus metric that can go on a dashboard somewhere.

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: EBernhardson
Cc: Pchelolo, hnowlan, EBernhardson, dcausse, Nikki, Aklapper, Lydia_Pintscher, 
Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-08-17 Thread EBernhardson
EBernhardson added a comment.


  Current rates and backlog on ElasticaWrite look reasonable, p99 backlog is 
under 10m, roughly comparable to the numbers we saw in eqiad. Overall looks 
healthy.
  
  Quasi relatedly though, the backlog on cirrusSearchIncomingLinkCount has 
increased recently. The backlog is typically under 2 min but it's been 2 hours 
for the last couple days. This job is less worrying though, it's updating some 
statistics used for search relevance and lagged updates won't cause any visible 
regressions. This job should perhaps be monitored, but nothing to do now.
  
  Overall this seems resolved from our side, but I'm not sure how to really 
call it done. Ideally we'd have a metric indicating the user-perceived update 
lag but that doesn't exist today.

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: EBernhardson
Cc: Pchelolo, hnowlan, EBernhardson, dcausse, Nikki, Aklapper, Lydia_Pintscher, 
Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-08-16 Thread MPhamWMF
MPhamWMF moved this task from needs triage to Current work on the 
Discovery-Search board.
MPhamWMF edited projects, added Discovery-Search (Current work); removed 
Discovery-Search.

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1849/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: MPhamWMF
Cc: Pchelolo, hnowlan, EBernhardson, dcausse, Nikki, Aklapper, Lydia_Pintscher, 
Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-08-04 Thread hnowlan
hnowlan added a subscriber: Pchelolo.
hnowlan added a comment.


  In T287563#7246465 , 
@dcausse wrote:
  
  > Hi @hnowlan in case you have ideas, we're investigating why the 
`cirrusSearchElasticaWrite` job is being backlogged more frequently since the 
switch. There does not seem to be more messages produced to it  nor we see more 
indexing requests sent to elastic (this job is responsible for sending data to 
elasticsearch) but the processing rate suggests that we're processing a lot 
more than before the switch.
  
  Hey, apologies for the delay, I've been PTO. Unfortunately I don't have any 
good explanations for the time being. It appears that there's been some 
contention on the k8s cluster leading to evictions, and one jobqueue process 
might have been in a bad state due to memory usage issues. I've done a full 
redeploy to iron out any potential issues that might be there, and I'm going to 
monitor the usage throughout the day today to see if we need to increase our 
resource thresholds a little more. It would be odd if we need to as both eqiad 
and codfw have the same, but similar has happened before with changeprop. 
Tagging @Pchelolo to see if he has any ideas too

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: hnowlan
Cc: Pchelolo, hnowlan, EBernhardson, dcausse, Nikki, Aklapper, Lydia_Pintscher, 
Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-07-29 Thread dcausse
dcausse added a subscriber: hnowlan.
dcausse added a comment.


  Hi @hnowlan in case you have ideas, we're investigating why the 
`cirrusSearchElasticaWrite` job is being backlogged more frequently since the 
switch. There does not seem to be more messages produced to it  nor we see more 
indexing requests sent to elastic (this job is responsible for sending data to 
elasticsearch) but the processing rate suggests that we're processing a lot 
more than previous the switch.

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: hnowlan, EBernhardson, dcausse, Nikki, Aklapper, Lydia_Pintscher, 
Invadibot, MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-07-29 Thread dcausse
dcausse added a comment.


  The ElasticaWrite job seems to be receiving roughly the same amount of 
messages (150/s per partition on average) before and after the switch.
  Looking at the partitioned topic ElasticWrite it's heavily backlogged since 
the switch:
  
  F34569267: Capture d’écran du 2021-07-29 10-57-33.png 

  
  A huge backlog of 5 million messages have been accumulated from jun 28 and 
starts to be absorbed since jul 2. It matches a big bump in processing rates of 
changeprop consumers which is explained by a restart of some of the cpjobqueue 
pods that were behaving poorly :
  
  F34569457: Capture d’écran du 2021-07-29 15-15-03.png 

  
  Job timings as reported by changeprop do not suggest an increase (quite the 
opposite) but looking at cirrus backend logs for `send_data_write` there is 
clearly a bump in request_time (see P16924 
, roughly + 15ms) but could perhaps 
be explained because jobrunners running in codfw now have to write 2 distant 
elasticsearch clusters (prod-eqiad and cloudelastic) as opposed to one. Also 
the jobqueue reports a processing time of 250 to 300ms both before and after 
the switch so I'm not sure that +15ms between mw and elastic could cause such a 
change.
  
  The consumer group lag suggests that we lack processing power but we don't 
seem to produce more ElasticaWrite jobs or I can't find any evidence...
  The increase in processing rate could perhaps be explained by the backlog 
causing the jobqueue to consume messages in bursts?
  
  So possible cause of the backlog:
  
  - more messages produced to cirrusElasticWrite after the switch?
- the week before the switch this topic ingested 128,916,000 messages the 
week after 147,295,000 (+14%).
  - processing time of the job increased?
- can't find any evidence of this except a small increase in mw <-> elastic 
timings but not visible in the job processing time
  - changeprop not giving enough room to this job?
- seemed to have been the case just after switch but then the processing 
rates went higher then usual

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: EBernhardson, dcausse, Nikki, Aklapper, Lydia_Pintscher, Invadibot, 
MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-07-28 Thread EBernhardson
EBernhardson added a comment.


  Checked a few things, looking for why this job would have increased it's 
backlog. Most notable metric change is that eqiad was processing ElasticaWrites 
at a rate of ~150 for the partitioned topic. codfw is currently averaging over 
250, Where did the extra 100 comes from (and what unit are they?).
  
  Sources ruled out:
  
  - ElasticaWrite jobs can retry, and these aren't marked as part of the normal 
job queue retries. Observability comes through logsmatching `ElasticaWrite job 
reported failure on cluster`. July logs show we peak at a few hundred per hour, 
not large enough to influence this.

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: EBernhardson
Cc: EBernhardson, dcausse, Nikki, Aklapper, Lydia_Pintscher, Invadibot, 
MPhamWMF, maantietaja, CBogen, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, EBjune, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-07-28 Thread dcausse
dcausse added a comment.


  Looking at the metrics of the `cirrusElasticaWrite` job it seems that its 
backlog time was greatly degraded just after the DC switch (June 28). I don't 
see anything particular in other dashboards that could explain such difference 
in latency.

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse
Cc: dcausse, Nikki, Aklapper, Lydia_Pintscher, Invadibot, MPhamWMF, 
maantietaja, CBogen, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, 
aude, Gryllida, Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-07-28 Thread Addshore
Addshore added a project: wdwb-tech.

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Addshore
Cc: Nikki, Aklapper, Lydia_Pintscher, Invadibot, MPhamWMF, maantietaja, CBogen, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Gryllida, 
Addshore, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-07-28 Thread Nikki
Nikki added a comment.


  Lexemes are also taking longer than they used to to show up, although I 
haven't checked whether it's taking as long as items.

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Nikki
Cc: Nikki, Aklapper, Lydia_Pintscher, Invadibot, MPhamWMF, maantietaja, CBogen, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Gryllida, 
Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T287563: slow indexing of new Items on Wikidata?

2021-07-28 Thread Lydia_Pintscher
Lydia_Pintscher created this task.
Lydia_Pintscher added projects: Elasticsearch, Wikidata.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Discovery-Search.

TASK DESCRIPTION
  We have some anecdodal evidence that recently indexing of new Items in the 
search index on Wikidata takes longer - in the order of 10 to 15 minutes. Do we 
have any stats to confirm this?
  
  In the past we have added special handling to get newly created Items added 
to the search index fast. This was done because the search index powers the 
entity selector for making new statements. The workflow of creating a new Item 
and then immediately wanting to select that Item as the value of a new 
statement is frequent and hindered by a large lack. (It's still possible to add 
the value by using the Item's ID but it's not great.)

TASK DETAIL
  https://phabricator.wikimedia.org/T287563

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Lydia_Pintscher
Cc: Nikki, Aklapper, Lydia_Pintscher, Invadibot, MPhamWMF, maantietaja, CBogen, 
Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Gryllida, 
Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org