[Bug 21926] HTML entities do not work with inline queries of n-ary properties

2010-09-15 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21926

Markus Krötzsch  changed:

   What|Removed |Added

 CC||cparmstr...@liberty.edu

--- Comment #2 from Markus Krötzsch  2010-09-15 
07:48:30 UTC ---
*** Bug 25178 has been marked as a duplicate of this bug. ***

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 21926] HTML entities do not work with inline queries of n-ary properties

2009-12-29 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=21926


Markus Krötzsch  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
Summary|Single quote in PAGENAME|HTML entities do not work
   |breaks inline queries of n- |with inline queries of n-ary
   |ary properties  |properties




--- Comment #1 from Markus Krötzsch   2009-12-29 
18:01:41 UTC ---
The actual problem is that {{PAGENAME}} returns an HTML entity in this example,
and this entity is terminated by a semicolon that is mistaken by SMW for the
separator of an n-ary value. The example page has been updated to illustrate
this.

Alas, this gets us into encoding of special characters. SMW is rather faithful
now in this respect: it really makes a difference whether you write "ö" or
"&ouml;" in an annotation. In general, SMW sticks to what it gets from
MediaWiki and does not do additional encoding or decoding. The problem is that
neither encoding nor decoding is idempotent: if you execute either action
twice, you may get different results from doing it just once. And doing it
twice usually leads to wrong results. For example, if you write "&amp;ouml;"
then you would expect a Unicode equivalent of "&ouml;" which would be rendered
in HTML again as "&amp;ouml;" but never as "ö".
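
A minimal sketch of why neither step is idempotent, using Python's standard
html module as a stand-in for whatever MediaWiki or SMW actually call (the
function names decode and encode are made up for this illustration):

```python
import html


def decode(text):
    """Replace HTML entities with the characters they stand for."""
    return html.unescape(text)


def encode(text):
    """Escape the characters that are special in HTML text (&, <, >)."""
    return html.escape(text, quote=False)


typed = "&amp;ouml;"         # the user wants the literal text "&ouml;"
once = decode(typed)         # "&ouml;"  (correct)
twice = decode(once)         # "ö"       (wrong: one decode too many)

shown = encode(once)         # "&amp;ouml;"      (correct HTML for "&ouml;")
shown_twice = encode(shown)  # "&amp;amp;ouml;"  (wrong: one encode too many)

print(once, twice, shown, shown_twice)
```

Applying either function a second time changes the result, so a caller has to
know whether its input has already been processed.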

This is why we cannot blindly decode references unless we are sure that it has
not been done before. Ideally, we would treat HTML entity inputs just the same
as the characters they encode. In fact, SMW used to do this for annotations at
some point. It seems that the behaviour of MediaWiki has changed since then; or
some change in SMW has led to the new behaviour. When trying to preserve
compatibility with existing MW versions, it is necessary to understand what
changed and when. More investigations are needed to find out whether or not
MediaWiki decodes/encodes any entities in strings that various parts of SMW
receive (clearly, this could be different for parser functions like #ask and
for parsing extensions like our semantic links). Depending on this information,
we can try to normalise SMW's stored data in such a way that it is feasible to
apply entity decoding to inputs that properties in #ask receive (note that
decoding, splitting at the remaining ";", and encoding the string again does
not lead to the same input that the user has given; e.g. a given "ö" would
turn into "&ouml;" which would no longer match unless the DB stores "ö" and
"&ouml;" in the same way, which it currently does not).
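
The round trip described in the parenthesis can be sketched as follows. This
is an illustration under stated assumptions, not SMW's code: html.unescape
stands in for the decoding step, and encode_named is a hypothetical encoder
that writes non-ASCII characters as named entities.

```python
import html
from html.entities import codepoint2name


def encode_named(text):
    """Hypothetical encoder: write non-ASCII characters as named entities."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if ord(ch) > 127 and name:
            out.append("&" + name + ";")
        else:
            out.append(ch)
    return "".join(out)


def decode_split_encode(raw):
    """Decode entities, split at the remaining ';', then encode each part."""
    decoded = html.unescape(raw)
    return [encode_named(part.strip()) for part in decoded.split(";")]


# The same n-ary value typed two ways ends up in one encoded form.
print(decode_split_encode("Köln; 5"))
print(decode_split_encode("K&ouml;ln; 5"))
```

Both inputs come out with "K&ouml;ln" as the first component, so a value the
user typed as "Köln" no longer equals its processed form. A lookup would only
succeed if the stored data were normalised to the same form.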

