Re: text to lod object matcher
Thanks to all for your quick replies. While I have heard of or used a few of the services above, I was looking for something that could give me a LOD URI for a few words at most, and often for a single word (e.g. china or moon), rather than full-text extraction. If the word is ambiguous, I am OK with a list of URIs as well. Meanwhile I will look up the references provided to see if anything fits the bill.

Thanks,
Ravinder Thakur
Re: text to lod object matcher
How about simply http://lookup.dbpedia.org/ ?

Juan Sequeda, Ph.D Student
Dept. of Computer Sciences
The University of Texas at Austin
www.juansequeda.com
www.semanticwebaustin.org

2009/11/9 रविंदर ठाकुर (ravinder thakur) ravindertha...@gmail.com
> Thanks to all for your quick replies. While I have heard of or used a few of the services above, I was looking for something that could give me a LOD URI for a few words at most, and often for a single word (e.g. china or moon), rather than full-text extraction. If the word is ambiguous, I am OK with a list of URIs as well. Meanwhile I will look up the references provided to see if anything fits the bill.
>
> Thanks,
> Ravinder Thakur
Re: text to lod object matcher
Again, thanks to all. I think the Yahoo term extractor will do the job for me. E.g. for the string

Italian sculptors and painters of the renaissance favored the Virgin Mary for inspiration

it extracts the following terms, which is what I needed:

italian sculptors
virgin mary
painters
renaissance
inspiration

Thanks a lot again. public-lod rocks!!!

1) http://developer.yahoo.com/search/content/V1/termExtraction.html
Re: text to lod object matcher
रविंदर ठाकुर (ravinder thakur) wrote:
> Thanks to all for your quick replies. While I have heard of or used a few of the services above, I was looking for something that could give me a LOD URI for a few words at most, and often for a single word (e.g. china or moon), rather than full-text extraction. If the word is ambiguous, I am OK with a list of URIs as well. Meanwhile I will look up the references provided to see if anything fits the bill.
>
> Thanks,
> Ravinder Thakur

You can try http://lod.openlinksw.com . You will get URIs for entities associated with your full text pattern. You can filter by Entity Type or Entity Properties. This, I suspect, addresses part of your question.

The ideal situation would be for an NLP handler to process the text and then filter your results accordingly (basically, you do less filtering by Type / Properties en route to the desired Entity URIs). If this is the case, confirm as part of this conversation.

--
Regards,
Kingsley Idehen
Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Re: text to lod object matcher
रविंदर ठाकुर (ravinder thakur) wrote:
> Again, thanks to all. I think the Yahoo term extractor will do the job for me. E.g. for the string
>
> Italian sculptors and painters of the renaissance favored the Virgin Mary for inspiration
>
> it extracts the following terms, which is what I needed:
>
> italian sculptors
> virgin mary
> painters
> renaissance
> inspiration
>
> Thanks a lot again. public-lod rocks!!!
>
> 1) http://developer.yahoo.com/search/content/V1/termExtraction.html

Note, Orchestr8's AlchemyAPI should also do the same, but with the additional benefit of URIs for the extracts, courtesy of DBpedia lookups.

--
Regards,
Kingsley Idehen
Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Re: text to lod object matcher
2009/11/9 Kingsley Idehen kide...@openlinksw.com
> रविंदर ठाकुर (ravinder thakur) wrote:
>> Again, thanks to all. I think the Yahoo term extractor will do the job for me. E.g. for the string
>>
>> Italian sculptors and painters of the renaissance favored the Virgin Mary for inspiration
>>
>> it extracts the following terms, which is what I needed:
>>
>> italian sculptors
>> virgin mary
>> painters
>> renaissance
>> inspiration
>>
>> Thanks a lot again. public-lod rocks!!!
>>
>> 1) http://developer.yahoo.com/search/content/V1/termExtraction.html
>
> Note, Orchestr8's AlchemyAPI should also do the same, but with the additional benefit of URIs for the extracts, courtesy of DBpedia lookups.

Yes, exactly: DBpedia, MusicBrainz, GeoNames and Freebase, afaik.

cheers,
Davide

> --
> Regards,
> Kingsley Idehen
> Weblog: http://www.openlinksw.com/blog/~kidehen
> President & CEO
> OpenLink Software
> Web: http://www.openlinksw.com
Re: Ontology modules and namespaces
On Nov 8, 2009, at 7:03 PM, Alan Ruttenberg wrote:
> On 11/4/09, Holger Knublauch hol...@knublauch.com wrote:
>> Since TopBraid Composer [1] was criticized here, please allow me to explain that it can very well be used in the scenario below. I will let the people on this list decide whether it behaves well or not. The mechanism it uses has been stable for the last three years, and I think it has worked quite well so far.
>
> It does not.

Thanks for sharing *your opinion*.

The original question was about modularizing ontologies so that resources from the same namespace can be organized across multiple files. Many users are confused about the difference between ontology URIs and namespaces, and I was addressing this.

You seem to be switching topics now to whether base URIs are a valid mechanism to identify those (multiple) files or whether the URIs of owl:Ontologies should be used only.

>> If users are editing files from their hard drive, TBC will associate each file with a base URI. This base URI is later used to resolve owl:imports, so that the system can figure out whether it has local copies of web resources without going to the web. The base URI is retrieved from the files by looking into the first few lines - if it's an RDF/XML file then it uses the declared xml:base,
>
> This is simply wrong and causes problems in practice. Thankfully it is finally being fixed in Protege.

Comparing TopBraid and Protege is like comparing apples with oranges. Protege 4 has been designed as a native OWL 2 tool, and it generally cannot correctly handle RDF files. TopBraid has been designed as a semantic web technology tool with a focus on RDF-based languages including, but not limited to, OWL. Many RDF files do not even declare owl:Ontologies, making your suggested solution not attractive in general.

In practice however TopBraid makes efforts to make sure that the base URI (written as xml:base in RDF/XML) and the owl:Ontology remain synchronized. It will add missing owl:Ontology triples if a file gets saved from the web, to maximize OWL compatibility. TopBraid also provides a warning if there is more than one owl:Ontology in a file, and has a button to fix this scenario. For most back-ends (such as databases), TopBraid also checks for the owl:Ontology to learn about the base URI.

So I don't think there are substantial practical differences between what you outline and what we have implemented.

BTW I just downloaded Protege to see how it handles the case of multiple base URIs (owl:Ontology URIs) across multiple files. With 4.0.1, I did the following:

- Create file test.owl with URI http://example.org/test
- Add a class Person and save
- Create file test2.owl with same base URI as above
- Protege opens the old (!) file test.owl and no file (or warning) gets created!
- Since Protege does not allow me to create two files with the same URI, I create a file test2.owl with http://example.org/test2
- Protege opens the file
- Close Protege
- Manually edit test2.owl so that it has the same base URI
- Open test2.owl and add owl:imports to file test.owl
- Imports view now claims to import test.owl, but none of its triples show up

All this shows that the issue is not fixed at all in Protege, and the same kind of base URI/ontology confusion may arise as in TopBraid. IMHO it is still better to allow working with multiple files with the same base URI than just silently ignoring them and hoping for the best.

- I also tried to import http://rdfex.org/foaf/Person,firstName which TopBraid handles without problems, but Protege fails completely because there is no owl:Ontology declared. I guess such a strict interpretation of the OWL spec is not helpful if the OWL community wants to interoperate with RDF-based ontologies. And why should all RDF snippets in the world be forced to declare an extra triple only because some OWL tools are inflexible? OWL is based on RDF, not the other way around.

> The xml:base has no status whatsoever in OWL. owl:imports in both OWL 1 and OWL 2 are based on the ontology URI. The only way to determine the ontology URI is to fully parse an OWL file. In doing so one must recognize that certain :x rdf:type owl:Ontology triples are the result of serialization of owl:import statements and so their subject is not the name of the ontology. Once these are discounted, there should be a single triple of the above form, and whatever is in the place of the :x is the name of the ontology.

How is this supposed to work in practice? My humble understanding of the owl:imports mechanism is that it is supposed to support importing ontologies, in particular from the web. The URL being imported should therefore align with the physical location of that file, following best linked data practice. If they are different (like in the infamous case of the SWRL ontology) significant problems arise. What is the use case of having distinct
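
A minimal sketch of the procedure Alan describes in the quoted passage above (collect the subjects typed owl:Ontology, discount the ones that only appear because of owl:imports, and expect exactly one to remain). It is only an illustration under stated assumptions: triples are plain Python tuples rather than the output of a real RDF parser, and "result of serialization of owl:import statements" is interpreted here as "the subject is also the object of some owl:imports triple". The function name ontology_name and the example data are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_ONTOLOGY = "http://www.w3.org/2002/07/owl#Ontology"
OWL_IMPORTS = "http://www.w3.org/2002/07/owl#imports"


def ontology_name(triples):
    """Return the single ontology URI in a fully parsed file, or raise.

    `triples` is an iterable of (subject, predicate, object) string tuples.
    """
    triples = list(triples)
    # Every subject declared as an owl:Ontology.
    typed = {s for s, p, o in triples if p == RDF_TYPE and o == OWL_ONTOLOGY}
    # Assumption: an ontology that is the target of owl:imports is an
    # imported one, so its type triple does not name this document.
    imported = {o for s, p, o in triples if p == OWL_IMPORTS}
    candidates = typed - imported
    if len(candidates) != 1:
        raise ValueError(
            f"expected exactly one ontology URI, found {sorted(candidates)}"
        )
    return next(iter(candidates))


# Hypothetical data modelled on the test.owl / test2.owl experiment above.
example = [
    ("http://example.org/test2", RDF_TYPE, OWL_ONTOLOGY),
    ("http://example.org/test2", OWL_IMPORTS, "http://example.org/test"),
    ("http://example.org/test", RDF_TYPE, OWL_ONTOLOGY),
]

print(ontology_name(example))  # http://example.org/test2
```

Note that a file with no owl:Ontology triple at all (the plain RDF case Holger raises) makes this sketch raise an error, which is exactly the point of disagreement in the thread.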
Re: Ontology modules and namespaces
On Mon, Nov 9, 2009 at 2:14 PM, Holger Knublauch hol...@knublauch.com wrote:
> On Nov 8, 2009, at 7:03 PM, Alan Ruttenberg wrote:
>> On 11/4/09, Holger Knublauch hol...@knublauch.com wrote:
>>> Since TopBraid Composer [1] was criticized here, please allow me to explain that it can very well be used in the scenario below. I will let the people on this list decide whether it behaves well or not. The mechanism it uses has been stable for the last three years, and I think it has worked quite well so far.
>>
>> It does not.
>
> Thanks for sharing *your opinion*.

You are *welcome*.

> The original question was about modularizing ontologies so that resources from the same namespace can be organized across multiple files. Many users are confused about the difference between ontology URIs and namespaces, and I was addressing this.

You have addressed it poorly. There is no connection between the two. Suggesting otherwise does a disservice.

> You seem to be switching topics now to whether base URIs are a valid mechanism to identify those (multiple) files or whether the URIs of owl:Ontologies should be used only.

As far as any of the semantic web technologies go, xml:base *does not exist*. The specs know *nothing* about it. Nor should they.

>>> If users are editing files from their hard drive, TBC will associate each file with a base URI. This base URI is later used to resolve owl:imports, so that the system can figure out whether it has local copies of web resources without going to the web. The base URI is retrieved from the files by looking into the first few lines - if it's an RDF/XML file then it uses the declared xml:base,
>>
>> This is simply wrong and causes problems in practice. Thankfully it is finally being fixed in Protege.
>
> Comparing TopBraid and Protege is like comparing apples with oranges.

They were compared because they share the same bug, and because parts of the Protege code base were written by the same developer. Protege 3 and Protege 4 have the same problem in this regard.

> Protege 4 has been designed as a native OWL 2 tool, and it generally cannot correctly handle RDF files. TopBraid has been designed as a semantic web technology tool with a focus on RDF-based languages including, but not limited to, OWL. Many RDF files do not even declare owl:Ontologies, making your suggested solution not attractive in general.

I only commented on the OWL case, but since you mention it, the use of xml:base in RDF is similarly unjustified.

> In practice however TopBraid makes efforts to make sure that the base URI (written as xml:base in RDF/XML) and the owl:Ontology remain synchronized.

Tools need to live in the world defined by the specification, so that artifacts that are generated according to the specification will work with them. I, and I presume many others, am not interested in idiosyncratic, tool-specific solutions.

> It will add missing owl:Ontology triples if a file gets saved from the web, to maximize OWL compatibility. TopBraid also provides a warning if there is more than one owl:Ontology in a file, and has a button to fix this scenario. For most back-ends (such as databases), TopBraid also checks for the owl:Ontology to learn about the base URI.

These have *nothing* to do with each other.

> So I don't think there are substantial practical differences between what you outline and what we have implemented. BTW I just downloaded Protege

As far as I know these fixes have not yet been pushed out.

-Alan
Re: Need help mapping two letter country code to URI
Hi Aldo,

Note that there are multiple branches of the ISO 3166 family of codes. See pages 23 and 24 of the GoodRelations Technical Report (http://www.heppnetz.de/projects/goodrelations/GoodRelations-TR-final.pdf) for a more detailed discussion.

I am still not aware of any authoritative URI schema for ISO 3166, which is why GoodRelations uses string literals for that code. The key ISO page http://www.iso.org/iso/country_codes.htm also does not refer to any established http or URN URI schema for the ISO 3166 family of codes.

I assume that DBpedia URIs may be well suited, but they are not as authoritative. If they have ISO 3166 codes attached via properties, entity consolidation on that basis may be relatively simple.

Below, please find an excerpt from the discussion re identifiers for countries in the GoodRelations Technical Report:

Country or Region
...
GoodRelations could reuse several approaches for ontologies of regions and places for specifying Countries and Regions. However, we suggest a more pragmatic approach of reusing the ISO Standard 3166, in particular ISO 3166-1 (ISO, 2006) and ISO 3166-2 (ISO, 1998). The first defines 2- or 3-letter identifiers for existing countries and a few independent geopolitical entities. ISO 3166-1 alpha-2 defines 2-letter codes for most countries. There exist alternative standards with 3-letter codes and a numerical representation.

For the following reasons, we suggest using the 2-letter codes: First, they are well established and people are likely more familiar with them (they are also used for most top-level domains). Second, and more important, the 2-letter variant is the basis for ISO 3166-2, which breaks down the countries from ISO 3166-1 into administrative subdivisions (ISO, 1998). The code elements used in ISO 3166-2 consist of “the alpha-2 code element from ISO 3166-1 followed by a separator and a further string of up to three alphanumeric characters e.g.” (from: http://www.iso.org/iso/en/prods-services/iso3166ma/04background-on-iso-3166/iso3166-2.html).

This allows using simple string operations on the respective ISO 3166 codes in order to handle administrative subdivisions. For example, if a certain Offering is said to be valid for Canada (ISO 3166-1 two-letter code “CA”), then one can infer that any longer search string specifying an administrative subdivision of Canada (e.g. British Columbia, ISO 3166-2 “CA-BC”) is also an eligible region.

Examples: Canada (CA), Austria (AT), Canada: British Columbia (CA-BC), Italy (IT), Italy: Province of Milano (IT-MI)

Note: More complex modeling of Countries and Regions may be useful in some scenarios, and GoodRelations can be imported and extended if necessary. However, most offerings on the Web contain statements on the level of countries only, for which ISO 3166-1 is sufficient and very common.

Martin

Aldo Bucchi wrote:
> Hi,
>
> I found a dataset that represents countries as two letter country codes: DK, FI, NO, SE, UK. I would like to turn these into URIs of the actual countries they represent. ( I have no idea on whether this follows an ISO standard or is just some private key in this system ).
>
> Any ideas on a set of candidate URIs? I would like to run a complete coverage test and take care I don't introduce distortion ( that is pretty easy by doing some heuristic tests against labels, etc ).
>
> There are some border cases that suggest this isn't ISO3166-1, but I am not sure yet. ( and if it were, which widely used URIs are based on this standard? ).
>
> Thanks!
> A

--
martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen

e-mail: h...@ebusiness-unibw.org
phone: +49-(0)89-6004-4217
fax: +49-(0)89-6004-4620
www: http://www.unibw.de/ebusiness/ (group)
     http://www.heppnetz.de/ (personal)
skype: mfhepp
twitter: mfhepp

Check out GoodRelations for E-Commerce on the Web of Linked Data!

Webcast: http://www.heppnetz.de/projects/goodrelations/webcast/
Recipe for Yahoo SearchMonkey: http://www.ebusiness-unibw.org/wiki/GoodRelations_and_Yahoo_SearchMonkey
Talk at the Semantic Technology Conference 2009: Semantic Web-based E-Commerce: The GoodRelations Ontology http://www.slideshare.net/mhepp/semantic-webbased-ecommerce-the-goodrelations-ontology-1535287
Overview article on Semantic Universe: http://www.semanticuniverse.com/articles-semantic-web-based-e-commerce-webmasters-get-ready.html
Project page: http://purl.org/goodrelations/
Resources for developers: http://www.ebusiness-unibw.org/wiki/GoodRelations
Tutorial materials: CEC'09 2009 Tutorial: The Web of Data for E-Commerce: A Hands-on Introduction to the GoodRelations Ontology, RDFa, and Yahoo! SearchMonkey http://www.ebusiness-unibw.org/wiki/Web_of_Data_for_E-Commerce_Tutorial_IEEE_CEC%2709
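
The "simple string operations" mentioned in the GoodRelations excerpt in this message can be sketched roughly as follows. This is an illustration only, not code from GoodRelations: the function name covers is hypothetical, and it assumes the separator between the country code and the subdivision part is a hyphen, as in the examples CA-BC and IT-MI.

```python
def covers(offer_region: str, query_region: str) -> bool:
    """True if query_region is offer_region itself or one of its subdivisions.

    Both arguments are ISO 3166-1 alpha-2 codes ("CA") or
    ISO 3166-2 codes ("CA-BC").
    """
    offer = offer_region.strip().upper()
    query = query_region.strip().upper()
    # Require the separator so that only true subdivisions match the prefix.
    return query == offer or query.startswith(offer + "-")


print(covers("CA", "CA-BC"))  # True: British Columbia is part of Canada
print(covers("CA", "CA"))     # True: same region
print(covers("IT", "CA-BC"))  # False: different country
print(covers("CA-BC", "CA"))  # False: an offer for a province does not cover the whole country
```

The check is deliberately one-directional: an offering valid for a country covers its subdivisions, but not the other way around.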
Re: Need help mapping two letter country code to URI
On Mon, Nov 9, 2009 at 10:47 PM, Aldo Bucchi aldo.buc...@gmail.com wrote:
> Hi,
>
> I found a dataset that represents countries as two letter country codes: DK, FI, NO, SE, UK. I would like to turn these into URIs of the actual countries they represent. ( I have no idea on whether this follows an ISO standard or is just some private key in this system ).
>
> Any ideas on a set of candidate URIs? I would like to run a complete coverage test and take care I don't introduce distortion ( that is pretty easy by doing some heuristic tests against labels, etc ).
>
> There are some border cases that suggest this isn't ISO3166-1, but I am not sure yet. ( and if it were, which widely used URIs are based on this standard? ).

http://www.fao.org/countryprofiles/geoinfo.asp might have something useful for you?

Dan
Re: Need help mapping two letter country code to URI
There are quite a few, but I don't know which other ones follow ISO 3166-1.

http://sameas.org/?uri=http://dbpedia.org/resource/Austria

gives a selection. Or also:

http://unlocode.rkbexplorer.com/id/AT
http://ontologi.es/place/AT

Our site, http://unlocode.rkbexplorer.com/id/AT, is our capture of UN/LOCODE 2009-1, the United Nations Code for Trade and Transport Locations, which uses the 2-letter country codes from ISO 3166-1, as well as the 1-3 letter subdivision codes of ISO 3166-2. See http://www.unece.org/cefact/locode/

It also gives inclusion and coords, etc. We need to do more coref to sources other than ontologi.es.

Best
Hugh

On 09/11/2009 21:47, Aldo Bucchi aldo.buc...@gmail.com wrote:
> Hi,
>
> I found a dataset that represents countries as two letter country codes: DK, FI, NO, SE, UK. I would like to turn these into URIs of the actual countries they represent. ( I have no idea on whether this follows an ISO standard or is just some private key in this system ).
>
> Any ideas on a set of candidate URIs? I would like to run a complete coverage test and take care I don't introduce distortion ( that is pretty easy by doing some heuristic tests against labels, etc ).
>
> There are some border cases that suggest this isn't ISO3166-1, but I am not sure yet. ( and if it were, which widely used URIs are based on this standard? ).
>
> Thanks!
> A
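
For anyone who wants to script this, here is a small sketch that builds the candidate URIs and the sameAs lookup URL from the patterns shown in this message. Only the AT examples appear above; applying the same patterns to other country codes is an assumption, and the names PATTERNS, candidate_uris and sameas_lookup_url are made up for illustration. Nothing is fetched over the network.

```python
from urllib.parse import urlencode

# URI patterns copied from the AT examples in the message above.
# Assumption: other two-letter codes follow the same pattern.
PATTERNS = {
    "unlocode": "http://unlocode.rkbexplorer.com/id/{code}",
    "ontologi.es": "http://ontologi.es/place/{code}",
}


def candidate_uris(code: str) -> dict:
    """Map each source name to its candidate URI for a two-letter code."""
    code = code.strip().upper()
    return {name: pattern.format(code=code) for name, pattern in PATTERNS.items()}


def sameas_lookup_url(uri: str) -> str:
    """Build a sameas.org lookup URL, percent-encoding the URI parameter."""
    return "http://sameas.org/?" + urlencode({"uri": uri})


for name, uri in candidate_uris("at").items():
    print(name, uri)
print(sameas_lookup_url("http://dbpedia.org/resource/Austria"))
```

The lookup URL is percent-encoded here, which differs from the unencoded form pasted above but should denote the same query.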
Need help mapping two letter country code to URI
Hi,

I found a dataset that represents countries as two letter country codes: DK, FI, NO, SE, UK. I would like to turn these into URIs of the actual countries they represent. ( I have no idea on whether this follows an ISO standard or is just some private key in this system ).

Any ideas on a set of candidate URIs? I would like to run a complete coverage test and take care I don't introduce distortion ( that is pretty easy by doing some heuristic tests against labels, etc ).

There are some border cases that suggest this isn't ISO3166-1, but I am not sure yet. ( and if it were, which widely used URIs are based on this standard? ).

Thanks!
A

--
Aldo Bucchi
skype:aldo.bucchi
http://www.univrz.com/
http://aldobucchi.com/
Re: Need help mapping two letter country code to URI
On 09/11/2009 21:47, Aldo Bucchi aldo.buc...@gmail.com wrote:
> I found a dataset that represents countries as two letter country codes: DK, FI, NO, SE, UK.

http://dbpedia.org/resource/ISO_3166-2:DK
http://dbpedia.org/resource/ISO_3166-2:FI
http://dbpedia.org/resource/ISO_3166-2:NO
http://dbpedia.org/resource/ISO_3166-2:SE
http://dbpedia.org/resource/ISO_3166-2:GB (UK)

?
Re: Need help mapping two letter country code to URI
On Mon, Nov 9, 2009 at 23:59, Nathan nat...@webr3.org wrote:
> On 09/11/2009 21:47, Aldo Bucchi aldo.buc...@gmail.com wrote:
>> I found a dataset that represents countries as two letter country codes: DK, FI, NO, SE, UK.
>
> http://dbpedia.org/resource/ISO_3166-2:DK
> http://dbpedia.org/resource/ISO_3166-2:FI
> http://dbpedia.org/resource/ISO_3166-2:NO
> http://dbpedia.org/resource/ISO_3166-2:SE
> http://dbpedia.org/resource/ISO_3166-2:GB (UK)
>
> ?

With '-1' instead of '-2', these all dbpprop:redirect to their respective countries:

http://dbpedia.org/resource/ISO_3166-1:DK
http://dbpedia.org/resource/ISO_3166-1:FI
http://dbpedia.org/resource/ISO_3166-1:NO
http://dbpedia.org/resource/ISO_3166-1:SE
http://dbpedia.org/resource/ISO_3166-1:GB

I guess this pattern is quite reliable, because some people at Wikipedia were rather diligent: http://en.wikipedia.org/wiki/Category:Redirects_from_ISO_3166
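
A quick sketch of how the '-1' pattern above could be applied to the codes from the original question. The URI template is taken from the list in this message; the UK to GB substitution follows Nathan's "(UK)" note and is an assumption about what the dataset means by UK. The names NON_ISO_ALIASES and dbpedia_iso_uri are hypothetical, and whether each URI actually redirects to the country would still have to be checked against DBpedia.

```python
# Assumption: the dataset writes "UK" where ISO 3166-1 alpha-2 uses "GB".
NON_ISO_ALIASES = {"UK": "GB"}


def dbpedia_iso_uri(code: str) -> str:
    """Build the DBpedia ISO_3166-1 redirect URI for a two-letter code."""
    code = code.strip().upper()
    code = NON_ISO_ALIASES.get(code, code)
    return f"http://dbpedia.org/resource/ISO_3166-1:{code}"


codes = ["DK", "FI", "NO", "SE", "UK"]
for code in codes:
    print(code, dbpedia_iso_uri(code))
```

Keeping the alias table separate makes it easy to add further non-ISO border cases once the coverage test turns them up.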