Enrico,

On 17 Mar 2010, at 17:35, Enrico Daga wrote:

> Hi Hugh,
> first of all thank you for your reply :)
> 
> On 17 March 2010 18:05, Hugh Williams wrote:
>> Hi Enrico,
>> 
>> On 17 Mar 2010, at 11:50, Enrico Daga wrote:
>> 
>>> Hi
>>> I am a Virtuoso fan (and newbie). I have installed it and played with it
>>> a bit, and I think it is great!
>>> 
>>> Now I am experiencing some problems with IRI names that contain non-ASCII
>>> characters.
>>> I have collected as much information as I could; I hope it is enough
>>> to figure out the problem.
>>> These are the two cases:
>>> 
>>> 1) Differences between the isql command-line tool and the Conductor's
>>> 'Sparql Execution' tool
>>> When I run the following statements, the triple displays correctly in
>>> the interface I used for the insert, but wrongly in the other.
>>> For example, from isql I do:
>>> 
>>> SQL> sparql insert into <http://localhost/test/charsets> {<http://ààà>
>>> rdf:type owl:Thing};
>>> callret-0
>>> VARCHAR
>>> _______________________________________________________________________________
>>> 
>>> Insert into <http://localhost/test/charsets>, 1 triples -- done
>>> 
>>> 1 Rows. -- 11 msec.
>>> SQL> sparql select * from <http://localhost/test/charsets> where {?a ?b ?c};
>>> a
>>>           b
>>>                      c
>>> VARCHAR
>>>           VARCHAR
>>>                      VARCHAR
>>> _______________________________________________________________________________
>>> 
>>> http://ààà
>>>           http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>>                      http://www.w3.org/2002/07/owl#Thing
>>> 
>>> 1 Rows. -- 2 msec.
>>> 
>>> Then I query the same graph from Conductor, and this is what I get:
>>> 
>>> a     b       c
>>> http://???    http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>>       http://www.w3.org/2002/07/owl#Thing
>>> 
>>> The same example in the opposite order. Now I do the insert from
>>> Conductor:
>>> 
>>> insert into <http://localhost/test/charsets> {<http://ààà2> rdf:type 
>>> owl:Thing}
>>> 
>>> and
>>> 
>>> select * from <http://localhost/test/charsets> where {?a ?b ?c}
>>> 
>>> a     b       c
>>> http://???    http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>>       http://www.w3.org/2002/07/owl#Thing
>>> http://ààà2   http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>>       http://www.w3.org/2002/07/owl#Thing
>>> 
>>> The triple inserted from Conductor displays correctly there, but not in
>>> isql:
>>> 
>>> SQL> sparql select * from <http://localhost/test/charsets> where {?a ?b ?c};
>>> a
>>>           b
>>>                      c
>>> VARCHAR
>>>           VARCHAR
>>>                      VARCHAR
>>> _______________________________________________________________________________
>>> 
>>> http://ààà
>>>           http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>>                      http://www.w3.org/2002/07/owl#Thing
>>> http://Ã Ã Ã 2
>>>           http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>>                      http://www.w3.org/2002/07/owl#Thing
>>> 
>>> 2 Rows. -- 2 msec.
>>> 
>> 
>> [Hugh] I have been able to recreate this issue, which we shall be looking
>> into. It seems the Conductor is performing some recoding of characters from
>> UTF-8. Note that via HTTP you can use the Virtuoso SPARQL endpoint
>> (http://localhost:8890/sparql) to perform such operations, as it does not
>> have the problem ...
> 
> I have tried the same 'insert' statement from the /sparql endpoint, but
> it behaves exactly like the Conductor. I can see the IRIs correctly from
> there, but from isql the IRIs are displayed wrongly.

[Hugh] OK, we shall check this also ...
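
For readers following the thread, both symptoms reported so far ("http://???" in one tool, "Ã" followed by a blank in the other) look like ordinary charset mismatches. The Python sketch below only reproduces that appearance under this assumption. It does not call Virtuoso, it is not a diagnosis of what isql or the Conductor do internally, and the function names are made up for the illustration.

```python
# Illustration only: reproduces the *look* of the garbled IRIs in this thread.
# Nothing here talks to Virtuoso; both helper names are hypothetical.

def misread_utf8_as_latin1(text: str) -> str:
    """Encode as UTF-8, then (wrongly) decode those bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


def squeeze_into_ascii(text: str) -> str:
    """Encode to ASCII, replacing each unrepresentable character with '?'."""
    return text.encode("ascii", errors="replace").decode("ascii")


iri = "http://\u00e0\u00e0\u00e02"  # http://ààà2

# Each 'à' is the two UTF-8 bytes C3 A0. Read as ISO-8859-1 they become
# U+00C3 ('Ã') followed by U+00A0 (a no-break space).
garbled = misread_utf8_as_latin1(iri)
print(repr(garbled))

# A lossy conversion to a charset without 'à' gives one '?' per character.
print(squeeze_into_ascii("http://\u00e0\u00e0\u00e0"))
```

If the mismatch really is of this kind, the stored triple may be intact and only one side's display conversion is wrong; checking the stored bytes would be needed to tell the two cases apart.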

> 
>> 
>>> 
>>> 2) Wrong characters when using 'load <IRI>' statement from both interfaces
>>> 
>>> In both interfaces, when I use the sparql load <IRI> statement, IRI names
>>> are not displayed correctly when they contain characters such as 'à' or 'ò'.
>>> The public RDF/XML file is correct and its encoding is UTF-8. The encoding
>>> is not declared in the XML declaration at the top (I have tried adding it
>>> manually and got the same behaviour).
>>> IRIs are written in two ways inside the RDF/XML:
>>> - http://someasciicharsà
>>> - http://someasciichars&#224;
>>> In both cases the IRI displays wrongly in both interfaces.
>> 
>> [Hugh] Can you please provide more specific steps to recreate the issue you
>> are seeing? I can only see recoding issues in the Conductor similar to
>> those in 1) above, with isql working fine when using the load function to
>> load triples.
> 
> In this case neither of the two interfaces (isql, Conductor) works
> correctly. In both, the IRIs come out corrupted.
> Attached is a test RDF/XML file.
> I have tried this:
> SQL> sparql load <http://sem-dev.src.cnr.it/testIRIencoding.rdf>;
> callret-0
> VARCHAR
> _______________________________________________________________________________
> 
> Load <http://myserver/testIRIencoding.rdf> into graph
> <http://myserver/testIRIencoding.rdf> -- done
> 
> 1 Rows. -- 10 msec.
> SQL> sparql select * from <http://myserver/testIRIencoding.rdf> where
> {?a ?b ?c};
> a
>           b
>                      c
> VARCHAR
>           VARCHAR
>                      VARCHAR
> _______________________________________________________________________________
> 
> http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleInterno/MATRICOLA3
>      http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>                 http://www.cnr.it/ontology/cnr/personale.owl#UnitÃÂ
> DiPersonaleInterno
> http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleInterno/MATRICOLA5
>      http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>                 http://www.cnr.it/ontology/cnr/personale.owl#UnitÃÂ
> DiPersonaleInterno

[Hugh] The RDF in the file testIRIencoding.rdf is:

<rdf:Description
rdf:about="http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleInterno/MATRICOLA3">
<rdf:type
rdf:resource="http://www.cnr.it/ontology/cnr/personale.owl#UnitàDiPersonaleInterno"/>
</rdf:Description>
<rdf:Description
rdf:about="http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleInterno/MATRICOLA5">
<rdf:type
rdf:resource="http://www.cnr.it/ontology/cnr/personale.owl#UnitàDiPersonaleInterno"/>
</rdf:Description>
</rdf:RDF>

Which does not seem to correspond to your output above or what I get when I 
load and query the graph:

$ /opt/virtuoso/bin/isql 1111
Connected to OpenLink Virtuoso
Driver: 06.00.3127 OpenLink Virtuoso ODBC Driver
OpenLink Interactive SQL (Virtuoso), version 0.9849b.
Type HELP; for help and EXIT; to exit.
SQL>  sparql load <http://sem-dev.src.cnr.it/testIRIencoding.rdf>;
callret-0
VARCHAR
_______________________________________________________________________________

Load <http://sem-dev.src.cnr.it/testIRIencoding.rdf> into graph 
<http://sem-dev.src.cnr.it/testIRIencoding.rdf> -- done

1 Rows. -- 725 msec.
SQL> sparql select * from <http://sem-dev.src.cnr.it/testIRIencoding.rdf> where 
{?s ?p ?o};
s                                                                               
  p                                                                             
    o
VARCHAR                                                                         
  VARCHAR                                                                       
    VARCHAR
_______________________________________________________________________________

http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleInterno/MATRICOLA3     
  http://www.w3.org/1999/02/22-rdf-syntax-ns#type                               
    http://www.cnr.it/ontology/cnr/personale.owl#Unità DiPersonaleInterno
http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleInterno/MATRICOLA5     
  http://www.w3.org/1999/02/22-rdf-syntax-ns#type                               
    http://www.cnr.it/ontology/cnr/personale.owl#Unità DiPersonaleInterno

2 Rows. -- 6 msec.

Although there is some apparent corruption ...
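
As a way to narrow this down, the sketch below checks two things in plain Python, independent of Virtuoso. First, that a conforming XML parser yields the same IRI whether 'à' is written literally in UTF-8 or as the character reference &#224;. Second, what one and two rounds of reading UTF-8 as ISO-8859-1 would produce. The example.org IRIs and the helper names are hypothetical, and the idea that the "UnitÃÂ" output quoted above comes from two such rounds is an assumption, not something established in this thread.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical miniature of the test file; "{}" is where 'à' gets written.
TEMPLATE = (
    '<rdf:RDF xmlns:rdf="' + RDF_NS + '">'
    '<rdf:Description rdf:about="http://example.org/s">'
    '<rdf:type rdf:resource="http://example.org/o#Unit{}Di"/>'
    "</rdf:Description></rdf:RDF>"
)


def type_iri(rdf_xml: bytes) -> str:
    """Return the rdf:resource value of the first rdf:type element."""
    root = ET.fromstring(rdf_xml)  # no XML declaration, so UTF-8 is assumed
    node = root.find(f".//{{{RDF_NS}}}type")
    return node.get(f"{{{RDF_NS}}}resource")


def misread(text: str) -> str:
    """One round of encoding as UTF-8 and wrongly decoding as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


def undo_misread(text: str) -> str:
    """Reverse one round of misread(); raises if the text was not garbled."""
    return text.encode("latin-1").decode("utf-8")


literal = type_iri(TEMPLATE.format("\u00e0").encode("utf-8"))
escaped = type_iri(TEMPLATE.format("&#224;").encode("utf-8"))
print(literal == escaped)  # both spellings denote the same IRI

once = misread("\u00e0")   # U+00C3, U+00A0
twice = misread(once)      # U+00C3, U+0083, U+00C2, U+00A0
print([hex(ord(c)) for c in once])
print([hex(ord(c)) for c in twice])
print(undo_misread(undo_misread(twice)) == "\u00e0")
```

Comparing the code points actually returned by a query with the lists printed here would indicate how many conversions the IRI went through, if any.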

Regards
Hugh

> 
> Can you try this?
> Thank you for your help!
> 
> Enrico
> 
>> 
>> Best Regards
>> Hugh Williams
>> OpenLink Software
>> 
>>> 
>>> Other notes:
>>> * I have seen some documentation about the CHARSET parameter of the
>>> connection and tried to change it through isql, but I got the same
>>> behaviour in both 1) and 2).
>>> * I have noticed that the HTTP header of the Conductor says UTF-8 while
>>> the HTML meta tag says ISO-8859-1, but I do not know whether this has any
>>> influence on the general case (maybe I am conflating several separate
>>> problems ;) )
>>> * The Virtuoso.ini file contains the following configuration:
>>> [HTTPServer]
>>> Charset = UTF-8
>>> 
>>> Do I need to configure something?
>>> Can anybody help me figure out the problem?
>>> 
>>> Thank you in advance
>>> 
>>> Enrico
>>> 
>>> 
>>> 
>>> --
>>> Enrico Daga
>>> Technology Expert
>>> --
>>> Ufficio Sistemi Informativi  (DCSPI-USI)
>>> National Research Council (CNR)
>>> P.le Aldo Moro 7 - Rome, Italy
>>> Tel +39 4993 3321
>>> --
>>> Semantic Technology Laboratory (STLab)
>>> Institute for Cognitive Science and Technology (ISTC-CNR)
>>> Via Nomentana 56, Rome - Italy
>>> --
>>> http://stlab.istc.cnr.it/stlab/User:EnricoDaga
>>> http://www.enridaga.net
>>> skype: enri-pan
>>> 
>>> _______________________________________________
>>> Virtuoso-users mailing list
>>> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>> 
>> 
> 
> 
> 
> <testIRIencoding.rdf>

