Well, I'm not affiliated with Linked Geo Data, but have already looked
at way too many RDF-related encoding problems in my life, so why not
look at one more ...
It is indeed a problem in Linked Geo Data.
The Moscow resource
http://linkedgeodata.org/triplify/node/27503927
has the following value for the :name property, in N-Triples:
"\u00D0\u009C\u00D0\u00BE\u00D1\u0081\u00D0\u00BA\u00D0\u00B2\u00D0\u00B0"
These are characters escaped with the \u notation of N-Triples. If one
decodes the characters, the result is garbage: ÐÐ¾ÑÐºÐ²Ð° (plus two
invisible control characters, U+009C and U+0081).
I guess the problem is that the Linked Geo Data code mishandles a
UTF-8 encoded input stream that comes from the input dataset. It looks
like the original stream contained these bytes (hexadecimal):
D0 9C D0 BE D1 81 D0 BA D0 B2 D0 B0
If interpreted as a UTF-8 encoded Unicode string, this is: Москва
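To make the two readings of those bytes concrete, here is a small Python sketch. It is only an illustration of the diagnosis above, not code from Linked Geo Data; the variable names are made up for the example.

```python
# The byte sequence quoted above, written as hexadecimal.
raw = bytes.fromhex("D0 9C D0 BE D1 81 D0 BA D0 B2 D0 B0")

# Read as UTF-8, each pair of bytes forms one Cyrillic character.
as_utf8 = raw.decode("utf-8")
print(as_utf8)  # Москва

# Read one byte at a time (Latin-1 maps byte N to code point N), the
# same bytes turn into twelve unrelated characters. This is what
# treating every byte as its own character amounts to.
as_latin1 = raw.decode("latin-1")
print(len(as_utf8), len(as_latin1))

# As long as no byte was dropped, the damage can be undone by going
# back to bytes and decoding them properly.
repaired = as_latin1.encode("latin-1").decode("utf-8")
print(repaired == as_utf8)
```

The last step can serve as a client-side workaround until the data itself is fixed, assuming the garbled string still contains all twelve characters.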
Now apparently in Linked Geo Data this byte sequence was escaped into
\u notation simply by prepending \u00 to every byte. That doesn't
work, because a \u escape denotes a Unicode code point, not a byte.
One actually has to decode the UTF-8 into Unicode characters
first, and then escape them one by one, resulting in:
"\u041C\u043E\u0441\u043A\u0432\u0430"
A well-tested PHP implementation of this string escaping for N-Triples
is available in DBpedia as RDFliteral::escape():
http://dbpedia.svn.sourceforge.net/viewvc/dbpedia/extraction/core/RDFliteral.php?view=markup
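For readers who don't have PHP at hand, the difference between the two approaches can be sketched in Python. This is a rough illustration under my own assumptions, not the DBpedia or Linked Geo Data code: the function names escape_bytewise and escape_ntriples are hypothetical, and the sketch escapes everything outside printable ASCII.

```python
# Bytes of the UTF-8 encoded name, as quoted earlier in this message.
raw = bytes.fromhex("D0 9C D0 BE D1 81 D0 BA D0 B2 D0 B0")


def escape_bytewise(data: bytes) -> str:
    # The faulty approach: treat every byte as a character and
    # prepend \u00 to its hexadecimal value.
    return "".join(f"\\u00{b:02X}" for b in data)


def escape_ntriples(data: bytes) -> str:
    # Decode the UTF-8 bytes into Unicode characters first,
    # then escape the characters one by one.
    out = []
    for ch in data.decode("utf-8"):
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch == "\t":
            out.append("\\t")
        elif 0x20 <= cp <= 0x7E:
            out.append(ch)  # printable ASCII is passed through as-is
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")
        else:
            out.append(f"\\U{cp:08X}")  # characters beyond the BMP
    return "".join(out)


print(escape_bytewise(raw))
print(escape_ntriples(raw))
```

The first call reproduces the broken literal served for the Moscow resource; the second yields the six-character form shown above.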
Best,
Richard
On 22 Mar 2010, at 10:33, Hugh Williams wrote:
Hi Mitko/Alexander,
Perhaps someone on the Linked Geo Data group, which I have added to
this reply, can comment?
Best Regards
Hugh Williams
Professional Services
OpenLink Software
Web: http://www.openlinksw.com
Support: http://support.openlinksw.com
Forums: http://boards.openlinksw.com/support
Twitter: http://twitter.com/OpenLink
On 22 Mar 2010, at 10:21, Mitko Iliev wrote:
The problem is in the LinkedGeoData dataset. It can be reproduced with:
ttlp (http_get ('http://linkedgeodata.org/triplify/node/27503927'),
'', 'http://linkedgeodata.org/triplify/node/27503927');
and the query:
select * where { <http://linkedgeodata.org/triplify/node/27503927#id> ?y ?z . }
Best Regards,
Mitko
On Mar 20, 2010, at 9:33 PM, Alexander Sidorov wrote:
Hm... Look at the results of this query:
SELECT ?s ?p ?o ?name
WHERE
{
?s ?p ?o .
?s a <http://linkedgeodata.org/vocabulary#city> .
?o bif:contains '"moscow"' .
OPTIONAL
{
?s <http://linkedgeodata.org/vocabulary#name> ?name
}
}
Do you see "Москва" as the name? I see some strange symbols, even
though I see correct Cyrillic symbols in your query results.
Looks like a LinkedGeoData-specific problem.
2010/3/17 Mitko Iliev wrote:
Hi Alexander,
The SPARQL endpoint returns UTF-8, and our experiments show proper
encoding. For example, try executing:
SELECT ?o WHERE { <http://dbpedia.org/resource/Moscow> rdfs:label ?o .
filter (lang(?o) = 'ru') }
or
SELECT ?o WHERE { ?s ?p ?o . ?o bif:contains '"Москва"' }
limit 100
against http://lod.openlinksw.com/sparql . Both return readable
content.
If your query, executed on the endpoint above, returns bad UTF-8, please
give us the query so we can debug what happens. Otherwise, the
problem is possibly on the client side: re-coding the response or
reading it as a narrow charset.
Best Regards,
Mitko
On Mar 17, 2010, at 3:54 AM, Alexander Sidorov wrote:
Hi Hugh,
As far as I remember, the ADO.NET encoding bug was fixed (I haven't
checked, because there is no point while the other Entity Framework bug
you know about is not fixed).
But this problem is unrelated to ADO.NET. As I haven't yet
deployed my application to Amazon EC2, I execute geo queries
against the lod.openlinksw.com/sparql endpoint via the SPARQL protocol
(not by using the database directly). Here are my screenshots:
1. Manchester: http://img171.imageshack.us/img171/5568/manchesterk.png
2. Moscow: http://img204.imageshack.us/img204/7850/moscow.png
Regards,
Alexander
2010/3/17 Hugh Williams wrote:
Hi Alexander,
Is this the encoding issue with the ADO.Net Provider that you reported
previously? That is the only one I am aware of, and it is still
to be resolved.
Note, there is a 40K limit on the size of emails to this mailing
list, so your mail with attachments, which exceeded this limit, was
initially withheld pending approval. Please place such
attachments on a remote server and provide links in your mails in
future ...
Best Regards
Hugh Williams
Professional Services
OpenLink Software
Web: http://www.openlinksw.com
Support: http://support.openlinksw.com
Forums: http://boards.openlinksw.com/support
Twitter: http://twitter.com/OpenLink
On 17 Mar 2010, at 00:27, Alexander Sidorov wrote:
Hello!
I have asked about LOD encoding problems before, but received no
feedback. To illustrate the problem, I have attached my
application's screenshots with information about Manchester
(English symbols - everything is okay) and Moscow (Russian
symbols are displayed incorrectly).
Regards,
Alexander
<Manchester.png>
<Moscow.png>
Virtuoso-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/virtuoso-users
--
Mitko Iliev
Developer Virtuoso Team
OpenLink Software
http://www.openlinksw.com/virtuoso
Cross Platform Web Services Middleware
--
You received this message because you are subscribed to the Google
Groups "Linked Geo Data" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/linked-geo-data?hl=en.