Re: text to lod object matcher

2009-11-09 Thread ravinder thakur
Thanks to all for your quick replies.


While I have heard of or used a few of the services above, I was looking for
something that could give me a LOD URI for a few words at most, and often a
single word (e.g. china or moon), rather than full text extraction. If the word
is ambiguous, I am OK with a list of URIs as well.


Meanwhile I will look up the references provided to see if there is anything
that fits the bill.


thanks
ravinder thakur
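
To make the request above concrete, here is a minimal sketch of the desired behaviour: one short term in, a list of candidate URIs out, with more than one entry when the term is ambiguous. The index contents and the function name are illustrative assumptions, not output from any of the services mentioned in this thread.

```python
# Toy label index. The entries are illustrative assumptions only;
# a real matcher would query a LOD lookup service or a label dump.
LABEL_INDEX = {
    "moon": ["http://dbpedia.org/resource/Moon"],
    "china": [
        "http://dbpedia.org/resource/China",
        "http://dbpedia.org/resource/Republic_of_China",
    ],
}


def lookup_uris(term, index=LABEL_INDEX):
    """Return every candidate URI for a short term (empty list if unknown)."""
    key = " ".join(term.lower().split())  # normalise case and whitespace
    return list(index.get(key, []))
```

An ambiguous term such as "china" returns both candidates, and an unknown term returns an empty list rather than raising.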


Re: text to lod object matcher

2009-11-09 Thread Juan Sequeda
How about just simply http://lookup.dbpedia.org/


Juan Sequeda, Ph.D Student
Dept. of Computer Sciences
The University of Texas at Austin
www.juansequeda.com
www.semanticwebaustin.org


2009/11/9 रविंदर ठाकुर (ravinder thakur) ravindertha...@gmail.com

 thanks to all for your quick reply.


 while i have heard/used about few of the services above, i was looking for
 something that could give me LOD URI for few words at most and often a
 single word(eg china or moon) rather than full text extraction. if the word
 is ambiguous i am ok with a list of URIs as well.


 meanwhile i will look up the references provided to see if there is
 anything that fits my bill


 thanks
 ravinder thakur






Re: text to lod object matcher

2009-11-09 Thread ravinder thakur
Thanks again to all. I think the Yahoo term extractor will do the job for me.


E.g. for the string "Italian sculptors and painters of the renaissance favored the
Virgin Mary for inspiration", it extracts the following terms, which is what I
needed.



italian sculptors
virgin mary
painters
renaissance
inspiration


thanks a lot again
public-lod rocks!!!


1) http://developer.yahoo.com/search/content/V1/termExtraction.html


Re: text to lod object matcher

2009-11-09 Thread Kingsley Idehen

रविंदर ठाकुर (ravinder thakur) wrote:

thanks to all for your quick reply.


while i have heard/used about few of the services above, i was looking 
for something that could give me LOD URI for few words at most and 
often a single word(eg china or moon) rather than full text 
extraction. if the word is ambiguous i am ok with a list of URIs as well.



meanwhile i will look up the references provided to see if there is 
anything that fits my bill



thanks
ravinder thakur



You can try: http://lod.openlinksw.com . You will get URIs for entities 
associated with your full text pattern. You can filter by Entity Type or 
Entity Properties. This, I suspect, addresses part of your question. The 
ideal situation would be for an NLP handler to process the text and then 
filter your results accordingly (basically, you do less filtering by Type 
/ Properties en route to the desired Entity URIs). If this is the case, 
please confirm as part of this conversation.


--


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software Web: http://www.openlinksw.com








Re: text to lod object matcher

2009-11-09 Thread Kingsley Idehen

रविंदर ठाकुर (ravinder thakur) wrote:
again thanks to all i think yahoo term extractor will do the job 
for me.


eg for string Italian sculptors and painters of the renaissance 
favored the Virgin Mary for inspiration it extracts following terms 
which is what i needed.




italian sculptors
virgin mary
painters
renaissance
inspiration


thanks a lot again
public-lod rocks!!!


1) http://developer.yahoo.com/search/content/V1/termExtraction.html

Note, Orchestr8's AlchemyAPI should also do the same, but with the 
additional benefit of URIs for the extracts, courtesy of DBpedia lookups.


--


Regards,

Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software Web: http://www.openlinksw.com








Re: text to lod object matcher

2009-11-09 Thread Davide Palmisano
2009/11/9 Kingsley Idehen kide...@openlinksw.com

 रविंदर ठाकुर (ravinder thakur) wrote:

 again thanks to all i think yahoo term extractor will do the job for
 me.

 eg for string Italian sculptors and painters of the renaissance favored
 the Virgin Mary for inspiration it extracts following terms which is what i
 needed.



 italian sculptors
 virgin mary
 painters
 renaissance
 inspiration


 thanks a lot again
 public-lod rocks!!!


 1) http://developer.yahoo.com/search/content/V1/termExtraction.html

  Note, Orchestr8's AlchemyAPI should also do the same, but with the
 additional benefit of URIs for the extracts, courtesy of DBpedia lookups.


Yes, exactly: DBpedia, MusicBrainz, GeoNames and Freebase, afaik.

cheers,

Davide



 --


 Regards,

 Kingsley Idehen   Weblog: http://www.openlinksw.com/blog/~kidehen
 President & CEO OpenLink Software Web: http://www.openlinksw.com








Re: Ontology modules and namespaces

2009-11-09 Thread Holger Knublauch


On Nov 8, 2009, at 7:03 PM, Alan Ruttenberg wrote:


On 11/4/09, Holger Knublauch hol...@knublauch.com wrote:

Since TopBraid Composer [1] was criticized here, please allow me to
explain that it can very well be used in the scenario below. I will
let the people on this list decide whether it behaves well or not. The
mechanism it uses has been stable for the last three years, and I
think it has worked quite well so far.


It does not.


Thanks for sharing *your opinion*. The original question was about  
modularizing ontologies so that resources from the same namespace can  
be organized across multiple files. Many users are confused about the  
difference between ontology URIs and namespaces, and I was addressing  
this. You seem to be switching topics now to whether base URIs are a  
valid mechanism to identify those (multiple) files or whether the URIs  
of owl:Ontologies should be used only.





If users are editing files from their hard drive, TBC will associate
each file with a base URI. This base URI is later used to resolve
owl:imports, so that the system can figure out whether it has local
copies of web resources without going to the web. The base URI is
retrieved from the files by looking into the first few lines - if it's
an RDF/XML file then it uses the declared xml:base,


This is simply wrong and causes problems in practice. Thankfully it is
finally being fixed in Protege.


Comparing TopBraid and Protege is like comparing apples with oranges.  
Protege 4 has been designed as a native OWL 2 tool, and it generally  
cannot correctly handle RDF files. TopBraid has been designed as a  
semantic web technology tool with a focus on RDF-based languages  
including, but not limited to, OWL. Many RDF files do not even declare  
owl:Ontologies, making your suggested solution not attractive in  
general. In practice however TopBraid makes efforts to make sure that  
base URI (written as xml:base in RDF/XML) and the owl:Ontology remain  
synchronized. It will add missing owl:Ontology triples if a file gets  
saved from the web, to maximize OWL compatibility. TopBraid also  
provides a warning if there is more than one owl:Ontology in a  
file, and has a button to fix this scenario. For most back-ends (such  
as databases), TopBraid also checks for the owl:Ontology to learn  
about the base URI. So I don't think there are substantial practical  
differences between what you outline and what we have implemented.
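
For readers unfamiliar with the lookup discussed above (reading xml:base from the first few lines of an RDF/XML file), a rough sketch follows. It is not TopBraid's actual code; the function name, the regular expression and the 20-line limit are assumptions made for illustration.

```python
import re
from itertools import islice

# Matches xml:base="..." or xml:base='...' in the opening lines of a file.
XML_BASE = re.compile(r"""xml:base\s*=\s*(["'])(.*?)\1""")


def sniff_base_uri(lines, max_lines=20):
    """Return the first xml:base declared in the first max_lines lines, or None."""
    head = "\n".join(islice(lines, max_lines))
    match = XML_BASE.search(head)
    return match.group(2) if match else None
```

The function only inspects the head of the file, so it stays cheap for large files, but it will miss a declaration that appears later and it does not parse the XML.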


BTW I just downloaded Protege to see how it handles the case of  
multiple base URIs (owl:Ontology URIs) across multiple files. With  
4.0.1, I did the following:

- Create file test.owl with URI http://example.org/test
- Add a class Person and save
- Create file test2.owl with same base URI as above
- Protege opens the old (!) file test.owl and no file (or warning) gets created!
- Since Protege does not allow me to create two files with the same URI, I create a file test2.owl with http://example.org/test2
- Protege opens the file
- Close Protege
- Manually edit test2.owl so that it has the same base URI
- Open test2.owl and add owl:imports to file test.owl
- Imports view now claims to import test.owl, but none of its triples show up


All this shows that the issue is not fixed at all in Protege, and the  
same kind of base URI/ontology confusion may arise like in TopBraid.  
IMHO it is still better to allow working with multiple files with the  
same base URI than just silently ignoring them and hoping for the best.


- I also tried to import http://rdfex.org/foaf/Person,firstName which  
TopBraid handles without problems, but Protege fails completely  
because there is no owl:Ontology declared. I guess such a strict  
interpretation of the OWL spec is not helpful if the OWL community  
wants to interoperate with RDF-based ontologies. And why should all  
RDF snippets in the world be forced to declare an extra triple only  
because some OWL tools are inflexible? OWL is based on RDF, not the  
other way around.



The xml:base has no status whatsoever
in OWL. owl:imports in both OWL 1 and OWL 2 are based on the ontology
URI. The only way to determine the ontology URI is to fully parse an
OWL file. In doing so one must recognize that certain

:x rdf:type owl:Ontology

triples are the result of serialization of owl:import statements and
so their subject is not the name of the ontology. Once these are
discounted, there should be a single triple of the above form, and
whatever is in the place of the :x is the name of the ontology.
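
The procedure in the quoted paragraph can be sketched over already-parsed triples. This is one reading of it, under the assumption that the owl:Ontology typings to discount are exactly those whose subject also appears as the object of an owl:imports statement. The function name is hypothetical, and the triples are plain (subject, predicate, object) tuples of URI strings rather than the output of any particular parser.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_ONTOLOGY = "http://www.w3.org/2002/07/owl#Ontology"
OWL_IMPORTS = "http://www.w3.org/2002/07/owl#imports"


def ontology_uri(triples):
    """Return the single ontology name found in an iterable of (s, p, o) triples."""
    triples = list(triples)
    # Every subject typed as owl:Ontology is a candidate name.
    typed = {s for s, p, o in triples if p == RDF_TYPE and o == OWL_ONTOLOGY}
    # Targets of owl:imports are imported ontologies, not the name of this one.
    imported = {o for s, p, o in triples if p == OWL_IMPORTS}
    candidates = typed - imported
    if len(candidates) != 1:
        raise ValueError(f"expected exactly one ontology, found {sorted(candidates)}")
    return candidates.pop()
```

Note that nothing here looks at xml:base or at the file location; only the triples are consulted.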


How is this supposed to work in practice? My humble understanding of  
the owl:imports mechanism is that it is supposed to support importing  
ontologies, in particular from the web. The URL being imported should  
therefore align with the physical location of that file, following  
best linked data practice. If they are different (like in the infamous  
case of the SWRL ontology) significant problems arise. What is the use  
case of having distinct 

Re: Ontology modules and namespaces

2009-11-09 Thread Alan Ruttenberg
On Mon, Nov 9, 2009 at 2:14 PM, Holger Knublauch hol...@knublauch.com wrote:

 On Nov 8, 2009, at 7:03 PM, Alan Ruttenberg wrote:

 On 11/4/09, Holger Knublauch hol...@knublauch.com wrote:

 Since TopBraid Composer [1] was criticized here, please allow me to
 explain that it can very well be used in the scenario below. I will
 let the people on this list decide whether it behaves well or not. The
 mechanism it uses has been stable for the last three years, and I
 think it has worked quite well so far.

 It does not.

 Thanks for sharing *your opinion*.

You are *welcome*.

 The original question was about
 modularizing ontologies so that resources from the same namespace can be
 organized across multiple files. Many users are confused about the
 difference between ontology URIs and namespaces, and I was addressing this.

You have addressed it poorly. There is no connection between the two.
Suggesting otherwise does a disservice.

 You seem to be switching topics now to whether base URIs are a valid
 mechanism to identify those (multiple) files or whether the URIs of
 owl:Ontologies should be used only.

As far as any of the semantic web technologies go xml:base *does not
exist*. The specs know *nothing* about it. Nor should they.

 If users are editing files from their hard drive, TBC will associate
 each file with a base URI. This base URI is later used to resolve
 owl:imports, so that the system can figure out whether it has local
 copies of web resources without going to the web. The base URI is
 retrieved from the files by looking into the first few lines - if it's
 an RDF/XML file then it uses the declared xml:base,

 This is simply wrong and causes problems in practice. Thankfully it is
 finally being fixed in Protege.

 Comparing TopBraid and Protege is like comparing apples with oranges.

They were compared because they share the same bug, and because parts
of the protege code base were written by the same developer. Protege 3
and Protege 4 have the same problem in this regard.

 Protege 4 has been designed as a native OWL 2 tool, and it generally cannot
 correctly handle RDF files. TopBraid has been designed as a semantic web
 technology tool with a focus on RDF-based languages including, but not
 limited to, OWL. Many RDF files do not even declare owl:Ontologies, making
 your suggested solution not attractive in general.

I only commented on the OWL case, but since you mention it, the use of
xml base in RDF is similarly unjustified.

 In practice however
 TopBraid makes efforts to make sure that base URI (written as xml:base in
 RDF/XML) and the owl:Ontology remain synchronized.

Tools need to live in the world defined by the specification, so that
artifacts that are generated according to the specification will work
with them. I, and I presume many others, are not interested in
idiosyncratic, tool specific, solutions.

 It will add missing
 owl:Ontology triples if a file gets saved from the web, to maximize OWL
 compatibility. TopBraid also provides a warning if there are more than one
 owl:Ontologies in a file, and has a button to fix this scenario. For most
 back-ends (such as databases), TopBraid also checks for the owl:Ontology to
 learn about the base URI.

These have *nothing* to do with each other.

 So I don't think there are substantial practical
 differences between what you outline and what we have implemented.
 BTW I just downloaded Protege

As far as I know these fixes have not yet been pushed out.

-Alan



Re: Need help mapping two letter country code to URI

2009-11-09 Thread Martin Hepp (UniBW)

Hi Aldo,

Note that there are multiple branches of the ISO 3166 family of codes. 
See pages 23 and 24 of the GoodRelations Technical Report 
(http://www.heppnetz.de/projects/goodrelations/GoodRelations-TR-final.pdf) 
for a more detailed discussion. I am still not aware of any 
authoritative URI schema for ISO 3166, which is why GoodRelations uses 
string literals for that code.


The key ISO page http://www.iso.org/iso/country_codes.htm also does not 
refer to any established http or URN URI schema for the ISO 3166 family 
of codes.


I assume that DBpedia URIs may be well suited, but they are not as 
authoritative. If they have ISO 3166 codes attached via properties, 
entity consolidation on that basis may be relatively simple.


Below, please find an excerpt from the discussion re identifiers for 
countries in the GoodRelations Technical Report:


Country or Region

...

GoodRelations could reuse several approaches for ontologies of regions and places for
specifying Countries and Regions. However, we suggest a more pragmatic approach of
reusing the ISO Standard 3166, in particular ISO 3166-1 (ISO, 2006) and ISO 3166-2
(ISO, 1998). The first defines 2- or 3-letter identifiers for existing countries and a few
independent geopolitical entities. ISO 3166-1 alpha-2 defines 2-letter codes for most
countries. There exist alternative standards with 3-letter codes and a numerical
representation. For the following reasons, we suggest using the 2-letter codes: First, they
are well established and people are likely more familiar with them (they are also used for
most top-level domains). Second, and more important, the 2-letter variant is the basis for
ISO 3166-2, which breaks down the countries from ISO 3166-1 into administrative
subdivisions (ISO, 1998). The code elements used in ISO 3166-2 consist of “the alpha-2
code element from ISO 3166-1 followed by a separator and a further string of up to three
alphanumeric characters e. g.”
(from: http://www.iso.org/iso/en/prods-services/iso3166ma/04background-on-iso-3166/iso3166-2.html).
This allows using simple string operations on the respective ISO 3166 codes in order to
handle administrative subdivisions. For example, if a certain Offering is said to be valid
for Canada (ISO 3166-1 two-letter code “CA”), then one can infer that any longer search
string specifying an administrative subdivision of Canada (e.g. British Columbia, ISO
3166-2 “CA-BC”) is also an eligible region.

Examples: Canada (CA), Austria (AT), Canada: British Columbia (CA-BC), Italy (IT),
Italy: Province of Milano (IT-MI)

Note: More complex modeling of Countries and Regions may be useful in some
scenarios, and GoodRelations can be imported and extended if necessary. However,
most offerings on the Web contain statements on the level of countries only, for which
ISO 3166-1 is sufficient and very common.
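
The "simple string operations" mentioned in the excerpt could look roughly like the sketch below. The function name is hypothetical and the check is only the prefix rule from the text (a country code followed by a hyphen), not a validation against the actual ISO 3166 code lists.

```python
def is_eligible(region_code, offering_code):
    """True if region_code equals offering_code or is one of its subdivisions."""
    region = region_code.strip().upper()
    offering = offering_code.strip().upper()
    # A subdivision code is the country code, a separator, then a short suffix.
    return region == offering or region.startswith(offering + "-")
```

Including the separator in the prefix test matters: a bare startswith on "CA" would also accept any unrelated longer string that happens to begin with those two letters.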

Martin



Aldo Bucchi wrote:

Hi,

I found a dataset that represents countries as two letter country
codes: DK, FI, NO, SE, UK.
I would like to turn these into URIs of the actual countries they represent.

( I have no idea on whether this follows an ISO standard or is just
some private key in this system ).

Any ideas on a set of candidate URIs? I would like to run a complete
coverage test and take care I don't introduce distortion ( that is
pretty easy by doing some heuristic tests against labels, etc ).

There are some border cases that suggest this isn't ISO3166-1, but I
am not sure yet. ( and if it were, which widely used URIs are based on
this standard? ).

Thanks!
A

  


--
--
martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen

e-mail:  h...@ebusiness-unibw.org
phone:   +49-(0)89-6004-4217
fax: +49-(0)89-6004-4620
www: http://www.unibw.de/ebusiness/ (group)
http://www.heppnetz.de/ (personal)
skype:   mfhepp 
twitter: mfhepp



Re: Need help mapping two letter country code to URI

2009-11-09 Thread Dan Brickley
On Mon, Nov 9, 2009 at 10:47 PM, Aldo Bucchi aldo.buc...@gmail.com wrote:
 Hi,

 I found a dataset that represents countries as two letter country
 codes: DK, FI, NO, SE, UK.
 I would like to turn these into URIs of the actual countries they represent.

 ( I have no idea on whether this follows an ISO standard or is just
 some private key in this system ).

 Any ideas on a set of candidate URIs? I would like to run a complete
 coverage test and take care I don't introduce distortion ( that is
 pretty easy by doing some heuristic tests against labels, etc ).

 There are some border cases that suggest this isn't ISO3166-1, but I
 am not sure yet. ( and if it were, which widely used URIs are based on
 this standard? ).

http://www.fao.org/countryprofiles/geoinfo.asp might have something
useful for you?

Dan



Re: Need help mapping two letter country code to URI

2009-11-09 Thread Hugh Glaser
There are quite a few, but I don't know which other ones follow ISO 3166-1.
http://sameas.org/?uri=http://dbpedia.org/resource/Austria
Gives a selection.
Or also
http://unlocode.rkbexplorer.com/id/AT
http://ontologi.es/place/AT

Our site, http://unlocode.rkbexplorer.com/id/AT
is our capture of UN/LOCODE 2009-1, the United Nations Code for Trade and
Transport Locations, which uses the 2-letter country codes from ISO 3166-1,
as well as the 1-3 letter subdivision codes of ISO 3166-2.
See http://www.unece.org/cefact/locode/
It also gives inclusion and coords, etc.
We need to do more coref to sites other than ontologi.es.

Best
Hugh
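
A small sketch of building such a sameas.org lookup URL for an arbitrary resource URI. It assumes the service accepts a percent-encoded value in the uri query parameter, which is the standard encoding but was not confirmed in this thread; the function name is hypothetical.

```python
from urllib.parse import urlencode


def sameas_lookup_url(uri):
    """Build a sameas.org lookup URL for the given resource URI."""
    # urlencode percent-encodes the URI so it is safe inside a query string.
    return "http://sameas.org/?" + urlencode({"uri": uri})
```

The result differs from the hand-written link above only in that the embedded URI is percent-encoded.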

On 09/11/2009 21:47, Aldo Bucchi aldo.buc...@gmail.com wrote:

 Hi,
 
 I found a dataset that represents countries as two letter country
 codes: DK, FI, NO, SE, UK.
 I would like to turn these into URIs of the actual countries they represent.
 
 ( I have no idea on whether this follows an ISO standard or is just
 some private key in this system ).
 
 Any ideas on a set of candidate URIs? I would like to run a complete
 coverage test and take care I don't introduce distortion ( that is
 pretty easy by doing some heuristic tests against labels, etc ).
 
 There are some border cases that suggest this isn't ISO3166-1, but I
 am not sure yet. ( and if it were, which widely used URIs are based on
 this standard? ).
 
 Thanks!
 A




Need help mapping two letter country code to URI

2009-11-09 Thread Aldo Bucchi
Hi,

I found a dataset that represents countries as two letter country
codes: DK, FI, NO, SE, UK.
I would like to turn these into URIs of the actual countries they represent.

( I have no idea on whether this follows an ISO standard or is just
some private key in this system ).

Any ideas on a set of candidate URIs? I would like to run a complete
coverage test and take care I don't introduce distortion ( that is
pretty easy by doing some heuristic tests against labels, etc ).

There are some border cases that suggest this isn't ISO3166-1, but I
am not sure yet. ( and if it were, which widely used URIs are based on
this standard? ).

Thanks!
A
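
The coverage test mentioned above could be sketched as follows, assuming some candidate mapping from codes to URIs is already at hand. The function name is hypothetical, and any mapping passed in is supplied by the caller rather than taken from a particular dataset.

```python
def coverage_report(codes, code_to_uri):
    """Split the dataset's codes into a mapped dict and a sorted unmapped list."""
    seen = {c.strip().upper() for c in codes}
    mapped = {c: code_to_uri[c] for c in seen if c in code_to_uri}
    unmapped = sorted(seen - set(mapped))
    return mapped, unmapped
```

Running it with a mapping keyed on ISO 3166-1 alpha-2 codes would leave UK in the unmapped list, since the ISO code for the United Kingdom is GB; this is the kind of border case worth checking by hand.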

-- 
Aldo Bucchi
skype:aldo.bucchi
http://www.univrz.com/
http://aldobucchi.com/




Re: Need help mapping two letter country code to URI

2009-11-09 Thread Nathan
 On 09/11/2009 21:47, Aldo Bucchi aldo.buc...@gmail.com wrote:
 I found a dataset that represents countries as two letter country
 codes: DK, FI, NO, SE, UK.

http://dbpedia.org/resource/ISO_3166-2:DK
http://dbpedia.org/resource/ISO_3166-2:FI
http://dbpedia.org/resource/ISO_3166-2:NO
http://dbpedia.org/resource/ISO_3166-2:SE
http://dbpedia.org/resource/ISO_3166-2:GB (UK)

?



Re: Need help mapping two letter country code to URI

2009-11-09 Thread Jona Christopher Sahnwaldt
On Mon, Nov 9, 2009 at 23:59, Nathan nat...@webr3.org wrote:
 On 09/11/2009 21:47, Aldo Bucchi aldo.buc...@gmail.com wrote:
 I found a dataset that represents countries as two letter country
 codes: DK, FI, NO, SE, UK.

 http://dbpedia.org/resource/ISO_3166-2:DK
 http://dbpedia.org/resource/ISO_3166-2:FI
 http://dbpedia.org/resource/ISO_3166-2:NO
 http://dbpedia.org/resource/ISO_3166-2:SE
 http://dbpedia.org/resource/ISO_3166-2:GB (UK)

 ?



With '-1' instead of '-2', these all dbpprop:redirect to their
respective countries:

http://dbpedia.org/resource/ISO_3166-1:DK
http://dbpedia.org/resource/ISO_3166-1:FI
http://dbpedia.org/resource/ISO_3166-1:NO
http://dbpedia.org/resource/ISO_3166-1:SE
http://dbpedia.org/resource/ISO_3166-1:GB

I guess this pattern is quite reliable, because some
people at Wikipedia were rather diligent:

http://en.wikipedia.org/wiki/Category:Redirects_from_ISO_3166
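
The pattern above can be turned into a small helper. This is only a sketch: the function name and the alias table are assumptions (in particular that UK in the dataset means GB), and it builds the redirect URI without checking that the resource actually exists or following the redirect.

```python
# Non-ISO codes seen in the dataset, mapped to ISO 3166-1 alpha-2.
# Assumption: "UK" in the source data stands for "GB".
ALIASES = {"UK": "GB"}


def dbpedia_iso_uri(code):
    """Build the DBpedia ISO_3166-1 redirect URI for a two-letter country code."""
    iso = code.strip().upper()
    iso = ALIASES.get(iso, iso)
    if len(iso) != 2 or not (iso.isascii() and iso.isalpha()):
        raise ValueError(f"not a two-letter code: {code!r}")
    return f"http://dbpedia.org/resource/ISO_3166-1:{iso}"
```

Resolving the redirect to the country resource itself would still require a query against DBpedia.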