[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-15 Thread Ladsgroup
Ladsgroup added a comment. Made tickets for those ^ Task detail: https://phabricator.wikimedia.org/T188993

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-15 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. Still to be done: enable those config settings on wikidata (and testwikidata, beta, etc.); if everything goes okay, disable the “force writing” setting again; run maintenance script to kill term_search_key and term_weight. I think rebuildTermSqlIndex.php migh

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-15 Thread gerritbot
gerritbot added a comment. Change 418968 merged by jenkins-bot: [mediawiki/extensions/Wikibase@master] Add options to control term_search_key+term_weight use https://gerrit.wikimedia.org/r/418968

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-12 Thread gerritbot
gerritbot added a comment. Change 418968 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)): [mediawiki/extensions/Wikibase@master] Add options to control term_search_key+term_weight use https://gerrit.wikimedia.org/r/418968

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-12 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. Also, I have no idea what to do in the client, where we also use TermSqlIndex but don’t have a useCirrus setting… Perhaps this should be a separate setting and not automatically inferred from useCirrus, so that we can have it in both repo and client?

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-12 Thread Ladsgroup
Ladsgroup added a comment. In T188993#4043154, @Lucas_Werkmeister_WMDE wrote: I’d feel a lot more comfortable with this change if we split it into two phases – first stop using the columns, then stop writing them. We don’t have to maintain this separation forever, but for the initial deployment I’

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-12 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. I’d feel a lot more comfortable with this change if we split it into two phases – first stop using the columns, then stop writing them. We don’t have to maintain this separation forever, but for the initial deployment I’d like to be able to roll back this cha
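The two-phase rollout described in this comment (first stop using the columns, then stop writing them) can be sketched with two independent flags. This is only an illustration: the names `TermIndexSettings`, `read_search_fields`, `force_write_search_fields` and `row_for` are hypothetical and are not the actual Wikibase settings or code, and the lowercase key and constant weight are placeholders for whatever the real columns contain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermIndexSettings:
    # Hypothetical flags, not the real Wikibase option names.
    read_search_fields: bool         # queries still rely on the two columns
    force_write_search_fields: bool  # keep populating them even if unread


def row_for(term_text, settings):
    """Build a wb_terms-like row; the search fields are emptied when unused."""
    write = settings.read_search_fields or settings.force_write_search_fields
    return {
        "term_text": term_text,
        # Placeholder values: a lowercase key and a constant weight.
        "term_search_key": term_text.lower() if write else "",
        "term_weight": 1.0 if write else 0.0,
    }


# Phase 1: stop reading, keep writing, so a rollback finds intact data.
phase1 = TermIndexSettings(read_search_fields=False, force_write_search_fields=True)
# Phase 2: stop writing as well; new rows carry empty values.
phase2 = TermIndexSettings(read_search_fields=False, force_write_search_fields=False)

print(row_for("Foo", phase1))
print(row_for("Foo", phase2))
```

Keeping the write flag separate from the read flag is what makes phase 1 reversible: switching reads back on finds the columns still populated.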

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-12 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. TermPropertyLabelResolver also uses a TermIndex, so if I understand correctly that means there’s a small chance that a Lua module referring to a property by its non-normalized label will stop working…

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-12 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. Yeah, at least ZWNJ, ZWJ and ZWSP are in Cf and widely used. Also, due to our outdated PHP version some recently added codepoints for regular characters are misclassified as Cn (unassigned); we shouldn’t remove those either. It’s probably acceptable to use t

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-09 Thread Ladsgroup
Ladsgroup added a comment. In T188993#4036412, @Ladsgroup wrote: Looking at the list of Cf characters and searching for them in Wikidata doesn't return anything. We are not using them and I think they should be dropped too. I couldn't have been more wrong: ZWNJ is one of them.

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-08 Thread Ladsgroup
Ladsgroup added a comment. Looking at the list of Cf characters and searching for them in Wikidata doesn't return anything. We are not using them and I think they should be dropped too.

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-08 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. Okay, that’s because StringNormalizer only removes category Cc (other, control) whereas TermSqlIndex also removes Cf (other, format), Cn (other, not assigned) and Cs (other, surrogate). I doubt that’s intentional… perhaps we should clean that up anyways.
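To make the difference between the two category sets concrete, here is a small Python sketch (not the Wikibase PHP code) that strips characters by Unicode general category. The function `strip_categories` and the constants `NARROW` and `BROAD` are illustrative names; the sets mirror the categories named in the comment above. The zero-width non-joiner (U+200C) is in Cf, so only the broader set removes it.

```python
import unicodedata

# Category sets as named in the comment: Cc only, versus Cc, Cf, Cn and Cs.
NARROW = {"Cc"}
BROAD = {"Cc", "Cf", "Cn", "Cs"}


def strip_categories(text, categories):
    """Drop every character whose Unicode general category is in `categories`."""
    return "".join(
        ch for ch in text if unicodedata.category(ch) not in categories
    )


# "a", ZWNJ (Cf), "b", BEL (Cc)
sample = "a\u200cb\u0007"

print(repr(strip_categories(sample, NARROW)))  # keeps the ZWNJ
print(repr(strip_categories(sample, BROAD)))   # drops the ZWNJ as well
```

Because Cn membership depends on the Unicode version shipped with the runtime, removing Cn can also delete characters that are assigned in newer Unicode versions, which is the concern raised later in the thread.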

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-08 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. Yet more information :) @Ladsgroup pointed out that some of these normalizations are also performed by Wikibase on any term (not just for the search key), specifically Unicode normalization and whitespace stripping. However, removal of control characters does

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-07 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. Before we can wipe the column, we need to decide what to do with this – we can either add another LabelConflictFinder implementation based on Cirrus/Elastic, or use term_text instead of term_search_key in conflict detection (and accept that “foo” and “foO” wi

[Wikidata-bugs] [Maniphest] [Commented On] T188993: Replace term_search_key and term_weight with empty values when wb_terms is not used for search

2018-03-07 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment. Currently, term_search_key is still used for TermSqlIndex’ implementation of the LabelConflictFinder interface – to find entities with the same label and description (you’re not allowed to produce such conflicts when editing or creating entities). Before we c
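The trade-off between matching on a normalized search key and matching on the raw term text can be illustrated with a minimal Python sketch. It assumes the search key is case-insensitive, Unicode-normalized and whitespace-trimmed; `search_key`, `exact` and `find_conflicts` are hypothetical helpers, not the LabelConflictFinder API, and the entity data is made up.

```python
import unicodedata


def search_key(text):
    # Assumed normalization: NFC, trimmed whitespace, lowercase.
    return unicodedata.normalize("NFC", text).strip().lower()


def exact(text):
    # Comparison on the stored text as-is (the term_text option).
    return text


def find_conflicts(terms, label, description, normalize):
    """Return ids of entities whose label and description both match."""
    wanted = (normalize(label), normalize(description))
    return [
        entity_id
        for entity_id, (lbl, desc) in terms.items()
        if (normalize(lbl), normalize(desc)) == wanted
    ]


# Made-up example data: entity id -> (label, description)
terms = {"Q1": ("foo", "a thing"), "Q2": ("bar", "a thing")}

print(find_conflicts(terms, "foO", "a thing", search_key))  # ['Q1']
print(find_conflicts(terms, "foO", "a thing", exact))       # []
```

Under these assumptions, switching the comparison to the raw text means labels that differ only in case are no longer reported as conflicts.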