Hello again. It's been a while.

This is the weekly update from the Search Platform team, covering
2018-09-17 through 2018-10-01.

As always, feedback and questions welcome.

== Discussions ==

=== Search ===
* Implemented indexing statement values as part of main data in
Wikidata, so that statement values are now searchable without special
syntax [0]
* Reindexed Wikidata, which also enables qualifier indexing [1]
* Mathew worked on resolving an Elasticsearch shard size alert by
doing an in-place reindex [2]
* A lot of work went into investigating a brief outage of
CirrusSearch (mw exception spike for api.php) [3]; it is resolved
well enough for now.
* Gehel and others worked on refactoring Puppet to support multiple
Elasticsearch instances on the same node [4]
* Erik worked on an issue where the text content of a wiki page in
the search index can merge words, making them unfindable [5]
* Stas updated the search engine of Wikidata to enable searching by
author name string [6]
* David and Erik worked together on evaluating adding an image quality
score to media search result ranking [7]
* Stas added X-Search-Id to WikidataCompletionSearchClicks events [8]
* David added a way to configure timeouts of autocomplete queries [9]
* Erik upgraded saneitizer to constantly re-index documents [10]
* David investigated why interwiki cache hits/misses were no longer
reported (since 2017) and decided to drop support for caching
interwiki queries [11]
* Mathew and Gehel worked on raising the alert level on disk space for
old Elasticsearch servers [12]
* Erik worked to correct issues where the Cirrus MLT cache had a 0%
hit rate on switchover [13]

=== WDQS ===

* Added a new NTriples RDF dump (which makes it easier to do per-line
processing) [14]
* The internal cluster switched to Kafka events as its change source;
the public cluster is next [15]
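
For anyone curious what "per-line processing" of an NTriples dump can
look like, here is a minimal sketch in Python. It relies only on the
N-Triples rule that each line holds one complete statement; the function
names and the sample lines are illustrative assumptions, not part of the
dump tooling, and the splitting is deliberately naive (it does not
handle a trailing comment after the final ".").

```python
import gzip
from collections import Counter
from typing import Iterable, Tuple


def split_triple(line: str) -> Tuple[str, str, str]:
    """Split one N-Triples statement into (subject, predicate, object).

    Subject and predicate never contain whitespace in N-Triples, so two
    splits are enough; the object (possibly a literal with spaces) is
    whatever remains once the terminating '.' is removed.
    """
    subject, predicate, rest = line.strip().split(None, 2)
    obj = rest.rstrip()
    if not obj.endswith("."):
        raise ValueError(f"not a complete N-Triples statement: {line!r}")
    return subject, predicate, obj[:-1].rstrip()


def count_predicates(lines: Iterable[str]) -> Counter:
    """Count how often each predicate occurs, one line at a time."""
    counts: Counter = Counter()
    for line in lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        _, predicate, _ = split_triple(line)
        counts[predicate] += 1
    return counts


def count_predicates_in_dump(path: str) -> Counter:
    """Stream a gzip-compressed dump without loading it into memory.

    `path` is a hypothetical local file name supplied by the caller.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return count_predicates(handle)


# Illustrative sample lines in N-Triples syntax.
sample = [
    '<http://www.wikidata.org/entity/Q42> '
    '<http://www.w3.org/2000/01/rdf-schema#label> "Douglas Adams"@en .',
    '<http://www.wikidata.org/entity/Q42> '
    '<http://www.wikidata.org/prop/direct/P31> '
    '<http://www.wikidata.org/entity/Q5> .',
    '# a comment line',
    '',
]

if __name__ == "__main__":
    for predicate, n in count_predicates(sample).items():
        print(n, predicate)
```

Because every statement is self-contained on its line, the same loop
works on a stream, so the dump never has to fit in memory.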

== Did you know? ==
* Different languages can have a different number of sounds they use;
the set of sounds used in a particular language is called its
“phonemic inventory”. [16] The number of sounds can range from 11 to
over 140! Having more sounds than letters, or different sounds than
the usual sound associated with a letter, can be the source of unusual
orthographies and/or transliteration schemes, including "q" formerly
being used as a vowel in Natqgu (now Natügu), a language of the
Solomon Islands.

[0] https://phabricator.wikimedia.org/T163642
[1] https://phabricator.wikimedia.org/T193407
[2] https://phabricator.wikimedia.org/T204362
[3] https://phabricator.wikimedia.org/T204776
[4] https://phabricator.wikimedia.org/T198351
[5] https://phabricator.wikimedia.org/T195389
[6] https://phabricator.wikimedia.org/T179815
[7] https://phabricator.wikimedia.org/T202339
[8] https://phabricator.wikimedia.org/T205597
[9] https://phabricator.wikimedia.org/T204959
[10] https://phabricator.wikimedia.org/T203622
[11] https://phabricator.wikimedia.org/T191961
[12] https://phabricator.wikimedia.org/T204361
[13] https://phabricator.wikimedia.org/T204148
[14] https://phabricator.wikimedia.org/T144103
[15] https://phabricator.wikimedia.org/T189458
[16] https://en.wikipedia.org/wiki/Phonemic_inventory

----

Subscribe to receive on-wiki (or opt-in email) notifications of the
Discovery weekly update.

https://www.mediawiki.org/wiki/Newsletter:Discovery_Weekly

The archive of all past updates can be found on MediaWiki.org:

https://www.mediawiki.org/wiki/Discovery/Status_updates

Interested in getting involved? See tasks marked as "Easy" or
"Volunteer needed" in Phabricator.

[1] https://phabricator.wikimedia.org/maniphest/query/qW51XhCCd8.7/#R
[2] https://phabricator.wikimedia.org/maniphest/query/5KEPuEJh9TPS/#R


Yours,
Chris Koerner
Community Relations Specialist
Wikimedia Foundation
