Hi Philip,
Thanks for your quick reply. The following is the SPARQL query:

    SELECT DISTINCT ?Name ?Country ?Cls
    WHERE {
      ?Country a ?Cls ;
               rdfs:label ?Name ;
               <http://dbpedia.org/property/capital> ?capital .
      OPTIONAL { ?Country dbpedia-owl:dissolutionYear ?year } .
      FILTER (!BOUND(?year))
      FILTER (?Cls = <http://dbpedia.org/ontology/Country>)
      FILTER (langMatches(lang(?Name), "es"))
    }
    ORDER BY (?Name)

The DBpedia SPARQL endpoint (http://dbpedia.org/sparql) returns a total of 329 country names for the query above, whilst the same query returns only 305 country names within the Large KB Gazetteer.

From the tests conducted, I noticed that the Spanish country names that do not contain any special character, such as Austria, Australia, etc., are all recognised (since they have been populated in the gazetteer), whilst the ones containing special characters, such as Brunéi, Camerún, etc., are not recognised as countries, even though some of those names are present in the gazetteer.

I can't figure out why the names containing special characters are not being recognised by the Large KB Gazetteer, even though some of them are listed in it.

Regards,
Keith

From: Philip Alexiev [mailto:philip.alex...@ontotext.com]
Sent: 18 June 2012 11:11
To: Keith Cortis
Cc: KIM discussion
Subject: Re: [Kim-discussion] Fwd: Large KB Gazetteer

Hi Keith,

Most probably the gazetteer query is not matching the RDF for those labels. Please provide the RDF for some of the missed countries and also the gazetteer query, in case you customized it.

Regards,
Philip Alexiev
Software Engineer, KIM team

On 18 Jun 2012, at 12:55 PM, Philip Alexiev wrote:

Begin forwarded message:

I have been testing out the Large KB Gazetteer module in GATE (v 7.0), and I noticed that country names containing a special character are not being imported into the newly created gazetteer.
For example, if I want to create a gazetteer containing all the countries in the world in Spanish (rdfs:label with language "es"), the gazetteer loads only 299 instances out of a possible 324. Country names such as Afganistán, Azerbaiyán, Benín, Brunéi, etc. are not being loaded, and are therefore not recognised as Country entities.

The same problem occurs for city names: the names are imported into the gazetteer, but the ones containing any special character (like the examples above) are not recognised as entities.

Do you know what might be causing this issue, please?

Thanks a lot for your help.

Regards,
Keith

----------------
Keith Cortis
Digital Enterprise Research Institute (DERI) Galway,
Semantic Collaborative Software Unit (USCS)
National University of Ireland, Galway
Lower Dangan
Galway, Ireland

_______________________________________________
Kim-discussion mailing list
Kim-discussion@ontotext.com
http://ontomail.semdata.org/cgi-bin/mailman/listinfo/kim-discussion
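One way to narrow down a mismatch like the one described in this thread (329 labels from the endpoint versus 305 in the gazetteer) is to compare the two label lists directly. The Python sketch below is only a diagnostic aid, not part of GATE or KIM: the function name `compare_labels` and the idea of exporting both label lists to plain Python lists are assumptions made for illustration. It also checks one possible cause that the thread does not confirm, namely that an accented label is stored in a different Unicode normalisation form (for example "é" as one code point versus "e" plus a combining accent), which would make two visually identical strings compare as different.

```python
import unicodedata


def compare_labels(endpoint_labels, gazetteer_labels):
    """Classify endpoint labels that have no exact match in the gazetteer.

    Hypothetical helper: both arguments are plain lists of label strings
    exported by whatever means is available.
    """
    gaz_exact = set(gazetteer_labels)
    # Gazetteer labels brought to a single normalisation form (NFC).
    gaz_nfc = {unicodedata.normalize("NFC", g) for g in gazetteer_labels}

    report = {"missing": [], "normalisation_only": []}
    for label in endpoint_labels:
        if label in gaz_exact:
            continue  # exact match, nothing to report
        if unicodedata.normalize("NFC", label) in gaz_nfc:
            # Present in the gazetteer, but in a different Unicode form.
            report["normalisation_only"].append(label)
        else:
            # Not present in the gazetteer in any form.
            report["missing"].append(label)
    return report


# Illustrative data only: "Brunéi" is stored in decomposed form in the
# second list, and "Camerún" is absent from it altogether.
endpoint = ["Austria", "Brun\u00e9i", "Camer\u00fan"]
gazetteer = ["Austria", "Brune\u0301i"]
result = compare_labels(endpoint, gazetteer)
```

If most of the unmatched labels end up under `normalisation_only`, the labels are reaching the gazetteer but in a different byte form; if they end up under `missing`, the gazetteer query or the import step is the more likely place to look, as suggested in the reply above.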