[Bug 48706] Duplicated properties

2013-07-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

--- Comment #11 from Jamie Thingelstad ja...@thingelstad.com ---
Created attachment 12770
  -- https://bugzilla.wikimedia.org/attachment.cgi?id=12770&action=edit
Another example of duplicated properties

Another example of duplicated properties.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 48706] Duplicated properties

2013-07-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

Jamie Thingelstad ja...@thingelstad.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ja...@thingelstad.com

--- Comment #12 from Jamie Thingelstad ja...@thingelstad.com ---
I'm still seeing this issue on all the current master releases of SMW. I have
duplication for a good number of properties as well. 

http://wikiapiary.com/wiki/Special:Properties

Attaching a screenshot of the duplication for Has bot segment. If I can help
with debugging I would be happy to do so.

(Sorry for two entries, didn't know I could put that with the image
attachment.)



[Bug 48706] Duplicated properties

2013-07-06 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

--- Comment #13 from Jamie Thingelstad ja...@thingelstad.com ---
Just noting that this duplication does not get counted when SMWInfo is used to
ask for properties. For example, on Special:Properties it shows 210 properties:

http://wikiapiary.com/w/index.php?title=Special:Properties&limit=500&offset=0

SMWInfo shows 169 and 160:

http://wikiapiary.com/w/api.php?action=smwinfo



[Bug 48706] Duplicated properties

2013-05-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

Alexander Mashin alex.mas...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alex.mas...@gmail.com

--- Comment #2 from Alexander Mashin alex.mas...@gmail.com ---
It happens when the edited article is accessed under two different names
considered the same due to database collation and capitalisation.



[Bug 48706] Duplicated properties

2013-05-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

--- Comment #3 from Vicente Aguilar vicente.agui...@ext.oami.europa.eu ---
Well, that could be it because one of the things I noticed while trying to
diagnose this issue and bug 48707 is that we have different charsets and
collations on some of our wikis (created at different times with different MW
versions, but if the default changes shouldn't it be updated when running
update.php? Oh well, that's another issue anyway.)

I'll try to unify all our charset/collations when I get the time, it's
something I wanted to look into anyway.

In any case, if this really was the origin of the duplicates: why does the
issue sometimes not go away after a SMW_refreshData -ftpv? If I got your
reasoning right, the dup gets to the DB the moment someone accesses an article
with a different - but equal according to the collation - name. But right after
a refresh, shouldn't everything be OK then?



[Bug 48706] Duplicated properties

2013-05-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

--- Comment #4 from Alexander Mashin alex.mas...@gmail.com ---
> the dup gets to the DB the moment someone accesses an article with a
> different - but equal according to the collation - name.
Not just accesses but edits.

SMW_refreshData.php -ftpv is wrong. It purges type and property pages but after
that you should run SMW_refreshData.php -fv to rebuild the pages themselves.
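
The two-pass sequence described above could be scripted roughly as follows. This is only a sketch: the script path and the `php` executable name are assumptions about a typical installation, and only the `-ftpv` and `-fv` flags come from the comment itself.

```python
import subprocess

# Assumption: location of the SMW maintenance script relative to the wiki
# root; adjust this to your own installation.
SCRIPT = "extensions/SemanticMediaWiki/maintenance/SMW_refreshData.php"


def refresh_commands(script=SCRIPT, php="php"):
    """Return the two invocations in the order given in the comment."""
    return [
        [php, script, "-ftpv"],  # first pass: type and property pages
        [php, script, "-fv"],    # second pass: rebuild the pages themselves
    ]


def run_refresh(dry_run=True):
    """Print the commands, or execute them in order when dry_run is False."""
    for cmd in refresh_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sequence if a pass exits with an error
            subprocess.run(cmd, check=True)


run_refresh()  # dry run: only prints the two command lines
```

Calling `run_refresh(dry_run=False)` from the wiki root would actually execute both passes, stopping if the first one fails.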



[Bug 48706] Duplicated properties

2013-05-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

--- Comment #5 from MWJames jamesin.hongkon...@gmail.com ---
(In reply to comment #2)
> It happens when the edited article is accessed under two different names
> considered the same due to database collation and capitalisation.

Are you talking about redirects? Normally, an article has one specific name, so
how can it be that I can access an article under two different names unless it
is a redirect?



[Bug 48706] Duplicated properties

2013-05-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

--- Comment #6 from Vicente Aguilar vicente.agui...@ext.oami.europa.eu ---
(In reply to comment #4)
> Not just accesses but edits.

Well, ok, but my point remains: right after a full refresh, shouldn't the DB be
clean of dupes?

> SMW_refreshData.php -ftpv is wrong. It purges type and property pages but
> after that you should run SMW_refreshData.php -fv to rebuild the pages
> themselves.

Yes, that's what we do, -ftpv and then -v.



[Bug 48706] Duplicated properties

2013-05-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

--- Comment #7 from Vicente Aguilar vicente.agui...@ext.oami.europa.eu ---
(In reply to comment #5)
> Are you talking about redirects? Normally, an article has one specific name
> so how can it be that I can access an article under two different names
> unless it is a redirect?

No, he means DB collation, the way the DB (not MediaWiki, but MySQL) compares
two strings. It has to do with different charsets and different languages, e.g.
whether capital and lower case letters are considered equal, whether accents
are ignored, etc. All this is configured on a per-table and per-field basis.

http://dev.mysql.com/doc/refman/5.0/en/charset-collation-effect.html
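
To make the idea concrete, here is a minimal Python sketch of a case- and accent-insensitive comparison. It is not MySQL's algorithm and does not reproduce any particular collation (which characters compare equal depends on the collation actually configured); `fold`, `collation_equal` and `binary_equal` are hypothetical helper names used only for illustration.

```python
import unicodedata


def fold(text):
    """Drop combining marks and case; a rough stand-in for a loose collation."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        c for c in decomposed if not unicodedata.combining(c)
    )
    return without_marks.casefold()


def collation_equal(a, b):
    """Compare the way a case- and accent-insensitive collation might."""
    return fold(a) == fold(b)


def binary_equal(a, b):
    """Compare byte for byte, the way application code usually does."""
    return a == b


for a, b in [("Muller", "MULLER"), ("Muller", "Müller"), ("Muller", "Miller")]:
    print(a, b, "binary:", binary_equal(a, b), "loose:", collation_equal(a, b))
```

The point of the sketch is that two strings can differ for the application while being equal for the database, so a lookup by one name can return a row stored under another.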



[Bug 48706] Duplicated properties

2013-05-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

--- Comment #8 from MWJames jamesin.hongkon...@gmail.com ---
(In reply to comment #7)
> (In reply to comment #5)
> > Are you talking about redirects? Normally, an article has one specific name
> > so how can it be that I can access an article under two different names
> > unless it is a redirect?
>
> http://dev.mysql.com/doc/refman/5.0/en/charset-collation-effect.html

So, according to the link above, a [[Has property::Muffler]] annotation stored
with a latin1_swedish_ci collation would, after a switch to a latin1_german2_ci
collation, be understood as [[Has property::Müller]]?

Which would lead to [[Has property::Muffler]] and [[Has property::Müller]] for
the same article?



[Bug 48706] Duplicated properties

2013-05-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

--- Comment #9 from Vicente Aguilar vicente.agui...@ext.oami.europa.eu ---
(In reply to comment #8)
> So, according to the link above, a [[Has property::Muffler]] annotation stored
> with a latin1_swedish_ci collation would, after a switch to a latin1_german2_ci
> collation, be understood as [[Has property::Müller]]?

That 1st example is about sorting, not comparison.

But yes, if MW/SMW is not doing any more checks and is relying only on the DB
(which I can't tell, I haven't looked at the code that closely), depending on
the collation Bar == bär == BAR.

Is that really the cause of this issue? I don't know. But my wikis do have a
different charset/collation configuration so... maybe.



[Bug 48706] Duplicated properties

2013-05-23 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

--- Comment #10 from Alexander Mashin alex.mas...@gmail.com ---
No, I don't think so. It's the collation of the article title that causes the
duplication.

If you have an article called Müller, set any property [[has
property::value]] on it, and then open http://your.site/wiki/Muller?action=edit,
and your DB collation treats u and ü as the same (that is, it will not allow
Muller to be created if Müller already exists), then there will be two of each
property, one for Müller and one for Muller, and both of them (one red) will
appear in any SMW query for those properties
({{#ask:[[has property::value]]|format=list}} will give Müller, Muller).

You don't even need to change the DB collation.

Similar artifacts will be observable in the MW logs: they will show the name
under which the page was accessed (red if different), not the name under which
it is stored.
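
The scenario above can be imitated with a small, self-contained toy model. This is a sketch under loud assumptions: it uses SQLite with a custom collation rather than MySQL, and the `page` and `prop` tables and the `save_annotation` helper are hypothetical; they are not SMW's schema or code. It only illustrates how a title lookup that goes through a loose collation, combined with property rows keyed by the requested name, could yield two subjects for one page.

```python
import sqlite3
import unicodedata


def fold(text):
    """Drop combining marks and case (rough stand-in for a loose collation)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(
        c for c in decomposed if not unicodedata.combining(c)
    ).casefold()


def ai_ci(a, b):
    """Collation callback: negative, zero or positive, like a comparator."""
    ka, kb = fold(a), fold(b)
    return (ka > kb) - (ka < kb)


conn = sqlite3.connect(":memory:")
conn.create_collation("ai_ci", ai_ci)
# Hypothetical schema: titles compare under the loose collation, while
# property rows keep the subject as a plain, binary-compared string.
conn.execute("CREATE TABLE page (title TEXT COLLATE ai_ci PRIMARY KEY)")
conn.execute("CREATE TABLE prop (subject TEXT, name TEXT, value TEXT)")
conn.execute("INSERT INTO page VALUES (?)", ("Müller",))

# The database refuses a second page whose title is "equal" to the first.
try:
    conn.execute("INSERT INTO page VALUES (?)", ("Muller",))
    created = True
except sqlite3.IntegrityError:
    created = False


def save_annotation(requested_title, name, value):
    """Store a property under the requested title (the assumed flaw)."""
    row = conn.execute(
        "SELECT title FROM page WHERE title = ?", (requested_title,)
    ).fetchone()
    if row is None:
        raise LookupError(requested_title)
    # Assumed flaw: requested_title is used instead of the stored row[0].
    conn.execute(
        "INSERT INTO prop VALUES (?, ?, ?)", (requested_title, name, value)
    )


save_annotation("Müller", "has property", "value")
save_annotation("Muller", "has property", "value")  # edit via the other name

subjects = [
    row[0]
    for row in conn.execute(
        "SELECT subject FROM prop WHERE name = ? AND value = ? ORDER BY rowid",
        ("has property", "value"),
    )
]
print("second page created:", created)
print("subjects returned by the query:", subjects)
```

In this toy model only one page row exists, yet the property query returns both spellings, which mirrors the "Müller, Muller" result described above. Using the stored title (`row[0]`) when writing the property row would avoid the second subject in the sketch.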



[Bug 48706] Duplicated properties

2013-05-22 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48706

--- Comment #1 from Vicente Aguilar vicente.agui...@ext.oami.europa.eu ---
Created attachment 12369
  -- https://bugzilla.wikimedia.org/attachment.cgi?id=12369&action=edit
Duplicated properties screenshot
