Hugh, yes, I sent the wrong file (it is a snippet of the actual data I have to manage) instead of the 'test snippet' I had prepared for you... :) But the problem it shows is the same: accented Latin characters are not recognised.
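For illustration: the two corruption patterns in the output quoted below are what one would get from a plain charset mismatch. An 'à' shown as 'Ã' followed by a blank is what UTF-8 bytes look like when decoded as ISO-8859-1, and '???' is what appears when the characters are forced into a charset that lacks them. That this is what actually happens between isql and the Conductor is only an assumption, not something confirmed in Virtuoso. A minimal Python sketch of the two effects (the function names are made up for this example):

```python
# Illustration only: reproduces the two corruption patterns seen in the
# thread, ASSUMING (not confirmed) that a charset mismatch is the cause.
# The helper names below are hypothetical and not part of any Virtuoso API.


def misread_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def squeeze_into_ascii(text: str) -> str:
    """Encode into a charset that lacks the accented characters.

    Each character that cannot be represented is replaced by '?'.
    """
    return text.encode("ascii", errors="replace").decode("ascii")


iri = "http://ààà2"

# Each 'à' (UTF-8 bytes C3 A0) becomes 'Ã' plus a no-break space (U+00A0).
garbled = misread_utf8_as_latin1(iri)
print(repr(garbled))

# Each 'à' becomes a single '?'.
print(squeeze_into_ascii("http://ààà"))

# If the bytes were only misread (not replaced), the damage is reversible.
repaired = garbled.encode("iso-8859-1").decode("utf-8")
print(repaired == iri)
```

The first pattern is reversible because no bytes are lost, while the '?' replacement discards the original characters for good.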
Regards
Enrico

On 18 March 2010 04:02, Hugh Williams <[email protected]> wrote:
> Enrico,
>
> On 17 Mar 2010, at 17:35, Enrico Daga wrote:
>
>> Hi Hugh,
>> first of all thank you for your reply :)
>>
>> On 17 March 2010 18:05, Hugh Williams <[email protected]> wrote:
>>> Hi Enrico,
>>>
>>> On 17 Mar 2010, at 11:50, Enrico Daga wrote:
>>>
>>>> Hi
>>>> I am a virtuoso fan (and newbie), I have installed and played a bit
>>>> with it and I think it is great!
>>>>
>>>> Now I am experiencing some problems in IRI names when using non ASCII
>>>> characters.
>>>> I have collected as much information as I could, hope they are enough
>>>> to figure out the problem:
>>>> These are the two cases:
>>>>
>>>> 1) Differences between isql command line tool and Conductor's 'Sparql
>>>> Execution' tool
>>>> When I do the following statement I can see the triple correctly from
>>>> the same interface, but wrongly in the other.
>>>> For example, I do from isql
>>>>
>>>> SQL> sparql insert into <http://localhost/test/charsets> {<http://ààà>
>>>> rdf:type owl:Thing};
>>>> callret-0
>>>> VARCHAR
>>>> _______________________________________________________________________________
>>>>
>>>> Insert into <http://localhost/test/charsets>, 1 triples -- done
>>>>
>>>> 1 Rows. -- 11 msec.
>>>> SQL> sparql select * from <http://localhost/test/charsets> where {?a ?b
>>>> ?c};
>>>> a
>>>> b
>>>> c
>>>> VARCHAR
>>>> VARCHAR
>>>> VARCHAR
>>>> _______________________________________________________________________________
>>>>
>>>> http://ààà
>>>> http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>>> http://www.w3.org/2002/07/owl#Thing
>>>>
>>>> 1 Rows. -- 2 msec.
>>>>
>>>> Then I try to see the result from Conductor, but this is the result:
>>>>
>>>> a b c
>>>> http://??? http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>>> http://www.w3.org/2002/07/owl#Thing
>>>>
>>>> The same example but in the opposite order, now I do the insert from
>>>> Conductor:
>>>>
>>>> insert into <http://localhost/test/charsets> {<http://ààà2> rdf:type
>>>> owl:Thing}
>>>>
>>>> and
>>>>
>>>> select * from <http://localhost/test/charsets> where {?a ?b ?c}
>>>>
>>>> a b c
>>>> http://??? http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>>> http://www.w3.org/2002/07/owl#Thing
>>>> http://ààà2 http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>>> http://www.w3.org/2002/07/owl#Thing
>>>>
>>>> The triple inserted from Conductor displays correctly. But it is not from
>>>> isql:
>>>>
>>>> SQL> sparql select * from <http://localhost/test/charsets> where {?a ?b
>>>> ?c};
>>>> a
>>>> b
>>>> c
>>>> VARCHAR
>>>> VARCHAR
>>>> VARCHAR
>>>> _______________________________________________________________________________
>>>>
>>>> http://ààà
>>>> http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>>> http://www.w3.org/2002/07/owl#Thing
>>>> http://à à à 2
>>>> http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>>> http://www.w3.org/2002/07/owl#Thing
>>>>
>>>> 2 Rows. -- 2 msec.
>>>>
>>>
>>> [Hugh] I have been able to recreate this issue which we shall be looking
>>> into. Seems the conductor is performing some recoding of chars from utf8.
>>> Note via HTTP you can use the Virtuoso sparql endpoint
>>> (http://localhost:8890/sparql) to perform such operations as it does not
>>> have the problem ...
>>
>> I have tried the same 'insert' statement from the /sparql endpoint but
>> it behaves exactly as the conductor. I can see the IRIs correctly from
>> there, but then, from ISQL, the IRIs are displayed wrongly.
>
> [Hugh] OK, we shall check this also ...
>
>>
>>>
>>>>
>>>> 2) Wrong characters when using 'load <IRI>' statement from both interfaces
>>>>
>>>> In both interfaces, when I use the sparql load <IRI> statement, I
>>>> cannot see IRI names correctly when some 'à', 'ò' etc... chars are in.
>>>> The public rdf/xml file is correct, its encoding is UTF-8. This is not
>>>> declared in the xml top declaration (but I have tried to add it
>>>> manually, and I obtained the same behaviour).
>>>> IRI are written in two ways, inside the rdf/xml:
>>>> - http://someasciicharsà
>>>> - http://someasciicharsà
>>>> In both cases the IRI displays wrong in both interfaces.
>>>
>>> [Hugh] Can you please provide more specific steps to recreate the issue you
>>> are seeing, as I can only see similar recoding issues in the conductor to
>>> those in 1) above, with isql working fine when using the load function to
>>> load triples?
>>
>> In this case none of the two interfaces (ISQL, Conductor) are working
>> fine. In both IRIs result corrupted.
>> Attached is a test RDF/XML file.
>> I have tried this:
>> SQL> sparql load <http://sem-dev.src.cnr.it/testIRIencoding.rdf>;
>> callret-0
>> VARCHAR
>> _______________________________________________________________________________
>>
>> Load <http://myserver/testIRIencoding.rdf> into graph
>> <http://myserver/testIRIencoding.rdf> -- done
>>
>> 1 Rows. -- 10 msec.
>> SQL> sparql select * from <http://myserver/testIRIencoding.rdf> where
>> {?a ?b ?c};
>> a
>> b
>> c
>> VARCHAR
>> VARCHAR
>> VARCHAR
>> _______________________________________________________________________________
>>
>> http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleInterno/MATRICOLA3
>> http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>> http://www.cnr.it/ontology/cnr/personale.owl#Unità
>> DiPersonaleInterno
>> http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleInterno/MATRICOLA5
>> http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>> http://www.cnr.it/ontology/cnr/personale.owl#Unità
>> DiPersonaleInterno
>
> [Hugh] the RDF in the file testIRIencoding.rdf is:
>
> <rdf:Description
> rdf:about="http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleInterno/MATRICOLA3">
> <rdf:type
> rdf:resource="http://www.cnr.it/ontology/cnr/personale.owl#UnitàDiPersonaleInterno"/>
> </rdf:Description>
>
> <rdf:Description
> rdf:about="http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleInterno/MATRICOLA5">
> <rdf:type
> rdf:resource="http://www.cnr.it/ontology/cnr/personale.owl#UnitàDiPersonaleInterno"/>
> </rdf:Description>
> </rdf:RDF>
>
> Which does not seem to correspond to your output above or what I get when I
> load and query the graph:
>
> $ /opt/virtuoso/bin/isql 1111
> Connected to OpenLink Virtuoso
> Driver: 06.00.3127 OpenLink Virtuoso ODBC Driver
> OpenLink Interactive SQL (Virtuoso), version 0.9849b.
> Type HELP; for help and EXIT; to exit.
> SQL> sparql load <http://sem-dev.src.cnr.it/testIRIencoding.rdf>;
> callret-0
> VARCHAR
> _______________________________________________________________________________
>
> Load <http://sem-dev.src.cnr.it/testIRIencoding.rdf> into graph
> <http://sem-dev.src.cnr.it/testIRIencoding.rdf> -- done
>
> 1 Rows. -- 725 msec.
> SQL> sparql select * from <http://sem-dev.src.cnr.it/testIRIencoding.rdf>
> where {?s ?p ?o};
> s
> p
> o
> VARCHAR
> VARCHAR
> VARCHAR
> _______________________________________________________________________________
>
> http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleInterno/MATRICOLA3
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type
> http://www.cnr.it/ontology/cnr/personale.owl#Unità DiPersonaleInterno
> http://www.cnr.it/ontology/cnr/individuo/unitaDiPersonaleInterno/MATRICOLA5
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type
> http://www.cnr.it/ontology/cnr/personale.owl#Unità DiPersonaleInterno
>
> 2 Rows. -- 6 msec.
>
> Although there is some apparent corruption ...
>
> Regards
> Hugh
>
>>
>> Can you try this?
>> Thank you for your help!
>>
>> Enrico
>>
>>>
>>> Best Regards
>>> Hugh Williams
>>> OpenLink Software
>>>
>>>>
>>>> Other notes:
>>>> * I have seen some documentation about CHARSET parameter of the
>>>> connection, and tried to change it through ISQL, but I had the same
>>>> behaviour both in 1) and 2).
>>>> * I have noted that the HTTP header of the Conductor says UTF-8 while
>>>> the HTML meta tag says ISO-8859-1, but I do not know if this has some
>>>> influence on the general case (maybe there are multiple problems that
>>>> I collapse ;) )
>>>> * Virtuoso.ini file contains the following configuration
>>>> [HTTPServer]
>>>> Charset = UTF-8
>>>>
>>>> Do I need to configure something?
>>>> Can anybody help me figure out the problem?
>>>>
>>>> Thank you in advance
>>>>
>>>> Enrico
>>>>
>>>> --
>>>> Enrico Daga
>>>> Technology Expert
>>>> --
>>>> Ufficio Sistemi Informativi (DCSPI-USI)
>>>> National Research Council (CNR)
>>>> P.le Aldo Moro 7 - Rome, Italy
>>>> Tel +39 4993 3321
>>>> --
>>>> Semantic Technology Laboratory (STLab)
>>>> Institute for Cognitive Science and Technology (ISTC-CNR)
>>>> Via Nomentana 56, Rome - Italy
>>>> --
>>>> http://stlab.istc.cnr.it/stlab/User:EnricoDaga
>>>> http://www.enridaga.net
>>>> skype: enri-pan
>>>>
>>>> _______________________________________________
>>>> Virtuoso-users mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>>>
>>
>> <testIRIencoding.rdf>
>

--
Enrico Daga
Technology Expert
--
Ufficio Sistemi Informativi (DCSPI-USI)
National Research Council (CNR)
P.le Aldo Moro 7 - Rome, Italy
Tel +39 4993 3321
--
Semantic Technology Laboratory (STLab)
Institute for Cognitive Science and Technology (ISTC-CNR)
Via Nomentana 56, Rome - Italy
--
http://stlab.istc.cnr.it/stlab/User:EnricoDaga
http://www.enridaga.net
skype: enri-pan
