[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-15 Thread gerritbot
gerritbot added a comment.

Change 218308 had a related patch set uploaded (by Aude):
Increase max conflicts returned for conflict detections

https://gerrit.wikimedia.org/r/218308


TASK DETAIL
  https://phabricator.wikimedia.org/T102148

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: aude, gerritbot
Cc: Candalua, Bugreporter, gerritbot, aude, Bene, daniel, Aklapper, 
Lydia_Pintscher, Wikidata-bugs



___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-15 Thread gerritbot
gerritbot added a comment.

Change 218308 merged by jenkins-bot:
Increase max conflicts returned for conflict detections

https://gerrit.wikimedia.org/r/218308




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-15 Thread aude
aude added a comment.

deployed the fix to wikidata.




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-15 Thread gerritbot
gerritbot added a comment.

Change 218351 merged by jenkins-bot:
Update Wikidata - fix property label constraint bug

https://gerrit.wikimedia.org/r/218351




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-15 Thread gerritbot
gerritbot added a comment.

Change 218351 had a related patch set uploaded (by Aude):
Update Wikidata - fix property label constraint bug

https://gerrit.wikimedia.org/r/218351




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-14 Thread aude
aude added a comment.

@daniel can backport it tomorrow, and totally agree about the *real* fix.




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-12 Thread daniel
daniel added a comment.

I can only add that I, too, am quite confused. I can't reproduce the issue 
locally, nor can I think of a way for this to happen, except for extreme slave 
lag. The uniqueness checks are run against the slave database. This is 
something we could change, and it has the potential to fix this. But it's a 
shot in the dark, really.




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-12 Thread Bene
Bene added a comment.

See also http://quarry.wmflabs.org/query/3972: it is possible that the same 
label gets tracked in wb_terms twice.
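
A rough way to look for such duplicates, sketched in Python against an in-memory SQLite table. This is not the Quarry query linked above (its text is not reproduced in this thread); the table is a cut-down stand-in for wb_terms using the column names from the query quoted elsewhere in this thread, and the sample rows and entity ids are made up for illustration.

```python
import sqlite3

# Cut-down stand-in for wb_terms; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wb_terms ("
    " term_entity_id INTEGER, term_entity_type TEXT, term_language TEXT,"
    " term_type TEXT, term_text TEXT)"
)
conn.executemany(
    "INSERT INTO wb_terms VALUES (?, ?, ?, ?, ?)",
    [
        (1, "property", "en", "label", "title"),
        (2, "property", "en", "label", "title"),
        (2, "property", "de", "label", "Titel"),
    ],
)

# Property labels that appear on more than one entity in the same language.
duplicates = conn.execute(
    "SELECT term_language, term_text, COUNT(DISTINCT term_entity_id)"
    " FROM wb_terms"
    " WHERE term_type = 'label' AND term_entity_type = 'property'"
    " GROUP BY term_language, term_text"
    " HAVING COUNT(DISTINCT term_entity_id) > 1"
).fetchall()
print(duplicates)
```

Only the English label shared by the two sample entities is reported; the German label belongs to a single entity and is left out.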




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-12 Thread aude
aude added a subscriber: aude.
aude added a comment.

this is the query that is run and it is quite fast:

  select term_entity_type, term_type, term_language, term_text, term_entity_id
  FROM `wb_terms`
  WHERE ((term_language='en' AND term_search_key='title' AND
  term_type='label' AND term_entity_type='property'))
  LIMIT 10 \G;
  *** 1. row ***
  term_entity_type: property
         term_type: label
     term_language: en
         term_text: title
    term_entity_id: 1476
  1 row in set (0.01 sec)




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-12 Thread Bene
Bene added a comment.

The query posted by @aude gives us the expected result (cf. 
http://quarry.wmflabs.org/query/3973)

It seems that label-description duplicates are also possible: 
https://www.wikidata.org/w/index.php?title=Property%3AP357&type=revision&diff=222062411&oldid=222000959




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-12 Thread daniel
daniel added a comment.

@bene: yes, this is no database level constraint, so it's possible on that 
level. The question is just, how does it happen? Our business logic (the 
uniqueness validators) should prevent that.




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-12 Thread daniel
daniel added a comment.

@aude I think you got that exactly right! Ouch. That was my fault. I somehow 
got it in my head that 10 conflicting items would always be sufficient. But it 
was 10 conflicting terms, which of course may all be self-conflicts.

The reason I introduced the post-filtering was a) to reduce the complexity of 
the already complex query and b) to keep validator knowledge out of the 
TermIndex interface. But that interface should be broken up anyway.




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-12 Thread gerritbot
gerritbot added a subscriber: gerritbot.
gerritbot added a comment.

Change 217849 had a related patch set uploaded (by Aude):
Increase max conflicts returned for conflict detections

https://gerrit.wikimedia.org/r/217849




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-12 Thread aude
aude added a comment.

What I see is that LabelUniquenessValidator requests all conflicting labels (in 
all languages), including self-conflicts on the same item, then filters out 
self-conflicts.

TermSqlIndex is what detects conflicts, and it returns a maximum of 10 
conflicts. If these are all self-conflicts (that is, the results are cut off 
before reaching any non-self conflicts), then they all get filtered out 
afterwards and LabelUniquenessValidator finds no conflict.

The easiest solution is to increase max conflicts, to say 500.

Also, is there a need at all for the $ignoreEntityId option in 
LabelDescriptionDuplicateDetector::detectTermConflicts? If we always end up 
filtering them, could this be done as part of the query, so that we can get 
rid of the option + post filtering?
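
The failure mode described above can be reproduced with a small sketch. It is written in Python rather than Wikibase's PHP, and every name in it (find_term_conflicts, label_conflicts, the tuple layout, the sample labels) is a hypothetical stand-in, not the real TermSqlIndex or LabelUniquenessValidator code. It only shows why capping the results before removing self-conflicts can hide a real conflict.

```python
# Hypothetical stand-ins; not the real Wikibase classes or signatures.
MAX_CONFLICTS = 10  # the cap mentioned in the comment above


def find_term_conflicts(index, labels, limit):
    """Return up to `limit` (language, text, entity_id) rows matching a label."""
    hits = [row for row in index if labels.get(row[0]) == row[1]]
    return hits[:limit]


def label_conflicts(index, labels, own_entity_id, limit=MAX_CONFLICTS):
    """Fetch capped matches first, then drop self-conflicts (the buggy order)."""
    hits = find_term_conflicts(index, labels, limit)
    return [row for row in hits if row[2] != own_entity_id]


# Entity 357 has 12 labels already indexed; entity 1476 shares the last one.
labels = {f"lang{i}": f"label{i}" for i in range(12)}
index = [(lang, text, 357) for lang, text in labels.items()]
index.append(("lang11", "label11", 1476))

print(label_conflicts(index, labels, own_entity_id=357))             # cap of 10
print(label_conflicts(index, labels, own_entity_id=357, limit=500))  # cap of 500
```

With the cap of 10, the first ten matches all belong to the entity being edited, so nothing is left after filtering. With the larger cap the conflicting row survives, which is why raising the limit works as a stopgap, although an entity with more labels than the cap could still hit the same problem.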




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-12 Thread Bene
Bene added a comment.

What are those self-conflicts actually? How can a label be found more than 
once? Don't we filter by entity type? Also, why don't we just filter out 
self-conflicts in the where clause?
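
To make the last question concrete, here is a minimal sketch of filtering self-conflicts in the WHERE clause, using Python and an in-memory SQLite table. The table is a cut-down stand-in for wb_terms with column names taken from the query quoted elsewhere in this thread; the function name, its parameters, and the sample rows are assumptions for illustration, and this is not how TermSqlIndex actually builds its query.

```python
import sqlite3

# Cut-down stand-in for wb_terms; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wb_terms ("
    " term_entity_id INTEGER, term_entity_type TEXT, term_language TEXT,"
    " term_type TEXT, term_text TEXT)"
)
rows = [(357, "property", f"l{i}", "label", f"title{i}") for i in range(12)]
rows.append((1476, "property", "l11", "label", "title11"))
conn.executemany("INSERT INTO wb_terms VALUES (?, ?, ?, ?, ?)", rows)


def conflicts_excluding_self(conn, labels, own_entity_id, limit=10):
    """Match labels by language and text, skipping the entity being edited."""
    if not labels:
        return []
    per_label = " OR ".join(["(term_language = ? AND term_text = ?)"] * len(labels))
    sql = (
        "SELECT term_language, term_text, term_entity_id FROM wb_terms"
        " WHERE term_type = 'label' AND term_entity_type = 'property'"
        f" AND term_entity_id <> ? AND ({per_label}) LIMIT ?"
    )
    params = [own_entity_id]
    for language, text in labels.items():
        params += [language, text]
    params.append(limit)
    return conn.execute(sql, params).fetchall()


labels = {f"l{i}": f"title{i}" for i in range(12)}
print(conflicts_excluding_self(conn, labels, own_entity_id=357))
```

Because the entity's own rows never enter the result set, the limit of 10 applies only to genuine conflicts, and no post-filtering is needed.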




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-12 Thread gerritbot
gerritbot added a comment.

Change 217849 merged by jenkins-bot:
Increase max conflicts returned for conflict detections

https://gerrit.wikimedia.org/r/217849




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-12 Thread aude
aude added a comment.

@bene terms being searched for conflicts included:

  array (
    'fr' => 'titre (OBSOLETE - P1476)',
    'en' => '(OBSOLETE) title (use P1476, title)',
    'ja' => '題名 *廃止(P1476を使用)',
    'nl' => '(VEROUDERD) titel van publicatie',
    'it' => '(obsoleta) titolo (usare P1476)',
    'pt' => '(OBSOLETO) título (usar P1476)',
    'pt-br' => 'título (OBSOLETO)',
    'es' => 'título',
    'ko' => '원제목',
    'ca' => 'títol (OBSOLET, utilitzeu P1476)',
    'cs' => '(ZASTARALÉ) titul originálu',
    'gl' => 'título~',
    'hu' => '(ELAVULT) műcím (használd a P1476-ot)',
    'fa' => 'عنوان',
    'ro' => 'titlu',
    'vi' => 'tựa đề',
    'sv' => 'titel~',
    'pl' => 'tytuł oryginalny',
    'de' => '(VERALTET) Titel',
    'el' => 'τίτλος',
    'zh-hans' => '(停用)标题字符串(请改用P1476)',
    'zh-hant' => '標題(字串)',
    'bs' => 'naslov (slovima)',
    'he' => 'שם מקורי',
    'uk' => 'назва мовою оригіналу',
    'nds' => 'Titel',
    'fi' => 'alkuperäisotsikko',
    'ka' => 'ორიგინალური დასახელება',
    'ru' => '(УСТАРЕЛО) название (используйте P1476)',
    'bn' => 'মূল শিরোনাম',
    'be' => 'назва на мове арыгінала',
    'sh' => 'originalni naslov',
    'nn' => '(FORELDA) originaltittel',
    'eo' => '(MALNOVA) originala titolo',
    'da' => '(FORÆLDET) titel',
    'sr' => 'наслов',
    'sr-ec' => 'наслов',
    'is' => 'upprunalegur titill',
    'mk' => 'изворен наслов',
    'oc' => 'títol',
    'zh' => '标题字符串(已废弃,请使用P1476)',
    'be-tarask' => 'назва',
    'ms' => 'tajuk',
    'nb' => '(FORELDET) tittel',
    'zh-tw' => '標題(字符串)',
    'lv' => 'nosaukums',
    'gu' => 'શીર્ષક~',
    'et' => '(VANANENUD) pealkiri',
    'zh-hk' => '標題(字符串)',
    'zh-cn' => '标题字符串',
    'hi' => 'शीर्षक',
    'te' => 'శీర్షిక',
    'or' => 'ନାମ',
    'sr-el' => 'naslov',
    'sco' => 'title',
    'sl' => 'naslov (string)',
    'ia' => 'titulo',
    'la' => 'titulus',
    'scn' => 'tìtulu',
    'eu' => '(ZAHARKITUA) izenburua (erabili P1476)',
    'mr' => 'शीर्षक',
    'yi' => 'טיטל (פארעלטערט)',
  )

I got back results like:

  array (
    0 =>
    Wikibase\Term::__set_state(array(
      'fields' =>
      array (
        'entityType' => 'property',
        'termType' => 'label',
        'termLanguage' => 'ko',
        'termText' => '원제목',
        'entityId' => 357,
      ),
    )),
    1 =>
    Wikibase\Term::__set_state(array(
      'fields' =>
      array (
        'entityType' => 'property',
        'termType' => 'label',
        'termLanguage' => 'ro',
        'termText' => 'titlu',
        'entityId' => 357,
      ),
    )),
    2 =>
    Wikibase\Term::__set_state(array(
      'fields' =>
      array (
        'entityType' => 'property',
        'termType' => 'label',
        'termLanguage' => 'vi',
        'termText' => 'tựa đề',
        'entityId' => 357,
      ),
    )),

as conflicts. (got 10 of these, all entityId = 357)

I think those are questions for Daniel. It might be for performance reasons or 
something, but I think there must be a better way.




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-11 Thread Bene
Bene added a comment.

As far as I can see the terms table contains the correct terms and search keys 
so I wonder how my edits could pass the uniqueness filters.




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-11 Thread Bene
Bene added a comment.

Note that this issue does not happen on test.wikidata.org. You can play around 
with https://test.wikidata.org/wiki/Property:P155 and 
https://test.wikidata.org/wiki/Property:P164. When entering the same label on 
both properties, I get the expected error message.




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-11 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.

Is it possibly related to the fact that we are trying to edit an already 
conflicting label?




[Wikidata-bugs] [Maniphest] [Commented On] T102148: possible to violate label uniqueness constraint of property labels

2015-06-11 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.

Do I read those queries correctly that the terms table has them differently?

