Greetings,

This is the weekly update from the Search Platform team for the weeks
starting 2019-03-25 and 2019-04-01.

As always, feedback and questions are welcome.

== Discussions ==

=== Search ===
* Elasticsearch upgrade to v6:
** incident [0]
* Trey finished a deep dive into the performance of language
identification for cross-wiki searching [1] (example [2]) and
punctuation-related problems, and found that things are working pretty
well overall, but the Chinese language model is a bit off.
* Erik noticed that the inlabel / incaption keywords should highlight
the label/caption but did not [3]
* David worked on fixing Elasticsearch 6 deprecations: nested_path and
nested_filter are deprecated [4], and _retry_on_conflict is
deprecated as well [5]
* We worked on migrating mjolnir to stdout/syslog/cee logging output [6]
* The team worked on the upgrade to Elasticsearch 6.5.4 for cirrus,
specifically in codfw [7] and in eqiad [8]
* Erik worked on the implementation and testing of glent m0
integration with wmf infrastructure [9]
* David did a lot of work to update the mw-config to use the psi and
omega elastic clusters [10]
* David found that auto_generate_phrase_queries is deprecated and
ineffective [11]
* The team fixed an old bug that caused fatal errors ("cannot perform
this operation with arrays") from CirrusSearch/ElasticaWrite (using
JobQueueDB) [12]
* Gehel worked to make spicerack more robust when unfreezing writes to
elasticsearch / cirrus [13] as well as creating a cookbook to reset
frozen write state on elasticsearch / cirrus [14]
* Stas moved WikibaseLexeme search code to the
WikibaseLexemeCirrusSearch extension [15]
* We noticed that Elasticsearch indices went read-only, causing a huge lag [16]
* We also saw that search exception handling was printing response
information on the screen [17]
* The team fixed an issue where mwgrep was not working [18]
* We also silenced Elasticsearch 6 deprecation warnings to avoid
logspam [19]
* We needed to create extra elasticsearch clusters in the beta cluster [20]
* We also needed some alerts so we know if mjolnir starts misbehaving [21]
* We also converted the check_elasticsearch.py icinga plugin to py3 [22]
* We needed to start using a local nginx reverse proxy for connection
reuse [23]
* The version of curator that we were using (5.2.0) isn't compatible
with elasticsearch 6, which causes issues in a few cron jobs on the
logstash servers. Version 5.6.0 supports both elasticsearch 5 and 6,
so we updated to it [24]
* We also did some cleanup of the reprepro configuration for
elasticsearch-curator [25]
* Getting a centralized way to inspect the content of the search
profiles might be helpful when investigating search behaviors. In the
same vein as other dump debug APIs (mapping/settings/cirrusdoc) David
suggested that we should add a new simple API to dump the profiles
(cirrus-profiles-dump) [26]
* David also found and fixed a "call to a member function toArray() on
a non-object (null)" error in
vendor/ruflin/elastica/lib/Elastica/Client.php:736 [27]

[0] https://wikitech.wikimedia.org/wiki/Incident_documentation/20190327-elasticsearch (report)
[1] https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Review_of_Language_Identification_in_Production,_with_a_Special_Focus_on_Stupid_Identification_Tricks
[2] https://en.wikipedia.org/w/index.php?search=%D0%93%D0%B0%D1%80%D1%80%D0%B8+%D0%9F%D0%BE%D1%82%D1%82%D0%B5%D1%80%D0%B5
[3] https://phabricator.wikimedia.org/T217809
[4] https://phabricator.wikimedia.org/T219266
[5] https://phabricator.wikimedia.org/T219265
[6] https://phabricator.wikimedia.org/T218833
[7] https://phabricator.wikimedia.org/T218878
[8] https://phabricator.wikimedia.org/T218879
[9] https://phabricator.wikimedia.org/T218164
[10] https://phabricator.wikimedia.org/T210381
[11] https://phabricator.wikimedia.org/T219267
[12] https://phabricator.wikimedia.org/T124196
[13] https://phabricator.wikimedia.org/T219640
[14] https://phabricator.wikimedia.org/T219638
[15] https://phabricator.wikimedia.org/T216206
[16] https://phabricator.wikimedia.org/T219364
[17] https://phabricator.wikimedia.org/T216959
[18] https://phabricator.wikimedia.org/T219162
[19] https://phabricator.wikimedia.org/T219269
[20] https://phabricator.wikimedia.org/T213940
[21] https://phabricator.wikimedia.org/T214494
[22] https://phabricator.wikimedia.org/T215439
[23] https://phabricator.wikimedia.org/T215491
[24] https://phabricator.wikimedia.org/T218991
[25] https://phabricator.wikimedia.org/T216235
[26] https://phabricator.wikimedia.org/T218682
[27] https://phabricator.wikimedia.org/T217402

----

Subscribe to receive on-wiki (or opt-in email) notifications of the
Discovery weekly update.

https://www.mediawiki.org/wiki/Newsletter:Discovery_Weekly

The archive of all past updates can be found on MediaWiki.org:

https://www.mediawiki.org/wiki/Discovery/Status_updates

Interested in getting involved? See tasks marked as "Easy" or
"Volunteer needed" in Phabricator.

[1] https://phabricator.wikimedia.org/maniphest/query/qW51XhCCd8.7/#R
[2] https://phabricator.wikimedia.org/maniphest/query/5KEPuEJh9TPS/#R

Yours,
Chris Koerner (he/him)
Community Relations Specialist
Wikimedia Foundation
