Re: Help with modeling my ontology

2013-02-28 Thread Dave Reynolds
Just on the question of representing measurements: one approach is the 
RDF Data Cube vocabulary [1]. In that model, each observation has a 
measure (the thing you are measuring, such as canopyHeight), dimensions 
giving the where/when/etc. that the measurement applies to, and 
attributes that allow you to interpret the measurement.


So you would normally make the unit of measure an attribute.

If the method doesn't fundamentally change the nature of the thing you 
are measuring, then you could make the method another attribute. If it 
does, then you should have a different measure property for each method 
(possibly with some common super-property).
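
For example, here is a minimal sketch of a single observation along 
these lines (qb: and sdmx-attribute: are the standard Data Cube and SDMX 
namespaces; co:plant, co:method, the unit URI and the instance URIs are 
purely illustrative, reusing the co: prefix from the question below):

 @prefix qb: <http://purl.org/linked-data/cube#> .
 @prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#> .

 co:canopyHeight a qb:MeasureProperty .
 co:method a qb:AttributeProperty .

 :obs1
   a qb:Observation ;
   qb:dataSet :canopyHeightMeasurements ;  # the dataset this observation belongs to
   co:plant :groundnut1 ;                  # dimension: what the measurement applies to
   co:canopyHeight 9.5 ;                   # the measure value itself
   sdmx-attribute:unitMeasure <http://example.org/unit/centimetre> ;  # attribute: unit
   co:method :baseToTipOfMainStem .        # attribute: methodology, if it is one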


Dave

[1] http://www.w3.org/TR/vocab-data-cube/

On 27/02/13 20:58, Luca Matteis wrote:

Hello all,

At http://www.cropontology.org/ I'm trying to make things a little more
RDF friendly. For example, we have an ontology about Groundnut here:
http://www.cropontology.org/ontology/CO_337/Groundnut/ttl

I'm generating this from a somewhat flat list of names/concepts, so it's
still a work in progress. But I'm having issues making sense of it all
so that the ontology can be used by people that actually have Groundnut
data.

For example, in that Turtle dump, search for "Canopy height". This is a
concept that people might use to describe the height of the canopy of
their groundnut plant, as the comment describes (this should be a
Property, not a Class, but like I said, it's still work-in-progress).
Let's try with some sample data someone might have about groundnut, and
see if I can further explain my issue (I assume co: is a prefix for my
http://cropontology.org site; also, the URIs are different, but it's
just an example):

 :groundnut1
   a co:Groundnut;
   co:canopyHeight xxx .

OK, here's the issue: we know that `canopyHeight` is measured using
different methodologies. For example, it might be measured using a
methodology that we found described as "Measuring the distance from the
base to the tip of the main stem", but it might also be some other
method. And, funny enough, we also realized that it is measured in
centimeters, with a minimum of 0 and a maximum of 10cm.

So how should I make this easier on the people that are using my
ontology? Should it be:

 :groundnut1
   a co:Groundnut;
   co:canopyHeight "9.5cm" .

or should it be:

 :groundnut1
   a co:Groundnut;
   co:canopyHeight [
     co:method "Measuring the distance from the base to the tip of the main stem";
     co:scale "9.5cm"
   ] .

Maybe I'm going about this the wrong way and should think more about how
this ontology is going to be used by people that have data about it...
but I'm not sure. Any advice would be great. And here's the actual
browsable list of concepts, in a tree sort of interface:
http://www.cropontology.org/terms/CO_337:039/

As you can see, there's this kind of thing happening all over the
ontology, where we have the property, the method it was measured with,
and finally the scale. Any help? Thanks!






2nd Announcement: DC-2013 call for participation

2013-02-28 Thread DCMI Announce
*** Please excuse the cross-posting ***

LINKING TO THE FUTURE
International Conference on Dublin Core and Metadata Applications
2-6 September 2013, Lisbon, Portugal

=
2nd ANNOUNCEMENT: DC-2013 CALL FOR PARTICIPATION
=

DC-2013 will explore questions regarding the persistence, maintenance, and
preservation of metadata and descriptive vocabularies. The need for stable
representations and descriptions spans all sectors including cultural
heritage and scientific data, eGovernment, finance and commerce. Thus, the
maintenance and management of metadata is essential to address the long
term availability of information of legal, cultural and economic value.  On
the web, data—and especially descriptive vocabularies—can change or vanish
from one moment to the next. Nonetheless, the web increasingly forms the
ecosystem for our vocabularies and our data. DC-2013 will bring together in
Lisbon the community of metadata scholars and practitioners to engage in
the exchange of knowledge and best practices in developing a sustainable
metadata ecosystem.

DC-2013 will be co-located and run simultaneously with iPRES 2013, providing 
a rich environment for synergistic exploration of issues common to both 
communities.

=
IMPORTANT DEADLINES & DATES:
--SUBMISSION DEADLINE: 29 March 2013
--AUTHOR NOTIFICATION: 7 June 2013
--FINAL COPY: 5 July 2013
-
IMPORTANT URLS:
--ONLINE CFP: http://purl.org/dcevents/dc-2013/cfp
--CONFERENCE WEBSITE: http://purl.org/dcevents/dc-2013
--SUBMISSION URL:
http://dcevents.dublincore.org/index.php/IntConf/dc-2013/author/submit?requiresAuthor=1
--ORGANIZING COMMITTEE:
http://dcevents.dublincore.org/index.php/IntConf/dc-2013/about/organizingTeam
=

Beyond the conference theme, papers, reports, and poster submissions are
welcome on a wide range of metadata topics, such as:

-- Metadata principles, guidelines, and best practices
-- Metadata quality (methods, tools, and practices)
-- Conceptual models and frameworks (e.g., RDF, DCAM, OAIS)
-- Application profiles
-- Metadata generation (methods, tools, and practices)
-- Metadata interoperability across domains, languages,
   time, structures, and scales.
-- Cross-domain metadata uses (e.g., recordkeeping, preservation,
   curation, institutional repositories, publishing)
-- Domain metadata (e.g., for corporations, cultural memory
   institutions, education, government, and scientific fields)
-- Bibliographic standards (e.g., RDA, FRBR, subject headings)
   as Semantic Web vocabularies
-- Accessibility metadata
-- Metadata for scientific data, e-Science and grid applications
-- Social tagging and user participation in building metadata
-- Usage data (paradata/attention metadata)
-- Knowledge Organization Systems (e.g., ontologies, taxonomies,
   authority files, folksonomies, and thesauri) and Simple Knowledge
   Organization Systems (SKOS)
-- Ontology design and development
-- Integration of metadata and ontologies
-- Search engines and metadata
-- Linked data and the Semantic Web (metadata and applications)
-- Vocabulary registries and registry services

-
SUBMISSIONS

--All submissions must be in English.
--All submissions will be peer-reviewed by the International Program
Committee.
--Unless previously arranged, accepted papers, project reports and posters
must be presented in Lisbon by at least one of their authors.

Submissions for Asynchronous Participation:  With prior arrangement, a few
exceptional papers, project reports and extended poster abstracts will be
accepted for asynchronous presentation by their authors. Submissions
accepted for asynchronous presentation must follow both the general author
guidelines for submission as well as additional instructions located at
http://dcevents.dublincore.org/IntConf/index/pages/view/remote.

-
PUBLICATION

-- Accepted papers, project reports and poster abstracts will be published
in the permanent online conference proceedings and in DCMI Publications (
http://dcpapers.dublincore.org/).
-- Special session and community workshop session abstracts will be
published in the online conference proceedings.
-- Papers, research reports and poster abstracts must conform to the
appropriate formatting template available through the DCMI Peer Review
System.
-- Submitting authors in all categories must provide basic information
regarding current professional positions and affiliations as a condition of
acceptance and publication.

-
SUBMISSION CATEGORIES

FULL PAPERS (8-10 pages; peer-reviewed): Full papers either describe
innovative work in detail or provide critical, well-referenced overviews of
key developments or good practice in the areas outlined above.

Re: two datasets for DBLP

2013-02-28 Thread Hugh Glaser
Hi Kalpa,
As the person responsible for the second site, here is an explanation.
It's quite long, but you did ask, and maybe some people will find it useful.
Firstly, DBLP is a stunning resource, and so for the rkbexplorer (and now 
other) services, we were keen to have their data.
Let me say that again - DBLP is a stunning resource.

So why do we take a copy of their data (which they helpfully provide) and 
publish it as Linked Data?
Well, we wanted it as Linked Data. But in fact there is another Linked Data 
site with the same data, and my best recollection is that it was already in 
existence when we brought up our site, in what must have been about 2005.
We didn't really want to duplicate it, but there was a problem with the data 
at source [1].
DBLP is essentially for searching. So for their purpose, they prefer to have 
high recall when the name of an author is put in. That is, they are quite 
liberal (it seems) about whether two authors of the same name are the same 
person, because they don't want to miss out on any cases (false negatives).
NLP people will tell you that the price of high recall is low precision - there 
will be more cases where they incorrectly conflate two authors (false 
positives).
For the beginnings of this discussion, see 
http://eprints.soton.ac.uk/id/eprint/264361 .
In fact we did some analysis of the extent of the problem 
(http://eprints.soton.ac.uk/id/eprint/265181 ), and without too much trouble we 
found in source [1] one author URI that was a conflation of 15 different 
people (as best we could tell).
I am not certain whether the problem came from their version of the DBLP data, 
or was introduced by the process of building source [1].

Our purposes were more complex - we were using the information as part of a 
more involved knowledge processing system, which included inferring information 
based on the semantic relationships, and any false positives caused a knock-on 
effect.
For example (as best I recall, and in fact the thing that first raised the 
problem for us), there was a conflation of two Prof Tom Andersons - one at the 
University of Newcastle, UK, and another in California. So when you looked at 
the UK Tom Anderson, we inferred that he was funded to a large extent by the US 
government, and indeed we therefore inferred that the University of Newcastle 
was also funded by the US government to a much greater extent than it was. 
Further author problems then would have caused us to deduce that the University 
of Newcastle, UK was the same institution as the University of Newcastle, NSW, 
Australia.
You will therefore understand that the precision/recall needs of our 
application were very different from those of the DBLP site.

This situation was and is not unique to DBLP - it has been true of almost every 
source we have tried to use. Last time I looked, the ACM library had conflated 
the two Universities of Newcastle. And it is also a problem for other sites - 
Microsoft Academic Search has me as the same Glaser as someone who published 
before I was born. And last time I tried to check, I found that "Hugh Glaser" 
was Google-unique.

So we now (periodically) download the DBLP dump and convert it to RDF and 
publish it as Linked Data.
But with our completely independent view of author disambiguation (we call it 
co-reference).
In fact, since we were doing it, we used the AKT ontology, which was more 
convenient for us (note to Kingsley - it isn't just another publication of the 
same RDF; it actually uses a completely different ontology).
So source [2] is DBLP data (which does not have URIs for authors at all, it 
just has strings), with our own URIs.
We generate a new, unique URI for every author on every paper, and then do our 
own analysis to conflate them.

Finally, the sameAs relations with source [1]: since the source [1] URIs for 
papers are safe, we establish sameAs with them. But for authors, we can't 
safely do that, as the follow-your-nose would suck in the incorrect 
information; so our system is explicitly fixed to reject such Linked Data from 
source [1]. And in fact, when I do http://sameas.org harvesting I avoid source 
[1].
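
In Turtle terms, here is a hypothetical sketch of the resulting links (all 
of these URIs are invented purely for illustration):

 @prefix owl: <http://www.w3.org/2002/07/owl#> .

 # one fresh author URI per (paper, author) pair, conflated only by our own analysis
 <http://dblp.rkbexplorer.com/id/paper42-author-1>
     owl:sameAs <http://dblp.rkbexplorer.com/id/paper57-author-3> .

 # paper URIs are safe to link to source [1]; author URIs deliberately are not
 <http://dblp.rkbexplorer.com/id/paper42>
     owl:sameAs <http://dblp.l3s.de/d2r/resource/publications/paper42> .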

It may be that things are different now - I haven't done any checking for quite 
a few years.

As I say, I have gone on at some length here, but I think this is an instance 
of a very important issue for Linked Data applications - some would argue that 
much of the Linked Data cloud is derived from similar data that has been set to 
prefer recall over precision.

Thanks for reminding me to refresh source [2], it was very out of date!

Best
Hugh



On 27 Feb 2013, at 12:10, Kalpa Gunaratna kalpagunara...@gmail.com wrote:

 Hi,
I am trying to do an alignment task between LOD datasets and came to see 
 that DBLP has two different datasets hosted in two places, possibly with 
 different schemas. The following are their two URLs:
 
 http://dblp.l3s.de/d2r/ [1]
 
 http://dblp.rkbexplorer.com/ [2]
 
 both these datasets have 

Call for Applications: IESD Challenge 2013 (***LAST DAY***)

2013-02-28 Thread Dhavalkumar Thakker
[Apologies for cross-posting]


CALL FOR APPLICATIONS:

*Intelligent Exploration of Semantic Data (IESD) Challenge 2013*

http://imash.leeds.ac.uk/event/2013/challenge.html
Part of the IESD International Workshop at Hypertext 2013, Paris, France. May 
1, 2013

IMPORTANT DATES
===
- Submissions due: 1 March 2013
- Notification of acceptance: 22 March 2013
- Notification of winner: during the workshop

OVERVIEW

Application submissions are now invited for the IESD Challenge 2013. The IESD 
Challenge aims to attract participation from the Semantic Web community, 
particularly focusing on semantic data exploration. The Challenge is open to 
everyone from industry and academia.

The authors of the best application will be awarded a prize at IESD Workshop.

CHALLENGE APPLICATION ENTRY REQUIREMENTS

We invite applications particularly focusing on semantic data exploration. The 
application should meet the minimal requirements listed below:
1. The application provides an end-user interface, i.e. either to general Web 
users or to domain users.
2. The application is implemented using Semantic Web technologies (such as RDF, 
linked open data and other related Semantic Web technologies).
3. The application should support semantic data exploration by addressing the 
three key themes of the IESD workshop: human factors, computational models and 
application domains. See http://imash.leeds.ac.uk/event/2013/topics.html for 
more details.

HOW TO PARTICIPATE
==
Step 1. Visit http://imash.leeds.ac.uk/event/2013/challenge.html and register 
for the IESD Challenge 2013 by submitting the required information (see Step 2).

Step 2. Provide the following information when submitting:
1. Abstract: no more than 200 words.
2. A short description with the following details about the application:
a) What is the key novelty of the system?
b) Who are the likely users?
c) URL/demo video of the system?
d) How does the application address the key themes of the IESD Workshop: human 
factors, computational models and application domains?
e) Architecture/key components of the system.

Papers should not exceed 4 pages. All submissions should be formatted according 
to the official ACM SIG proceedings template and submitted via EasyChair at:

https://www.easychair.org/conferences/?conf=iesd2013

In EasyChair, when asked for a category, please select "IESD Challenge".

Step 3. Present the application at the workshop (10 minutes) to address the 
evaluation criteria below.

EVALUATION CRITERIA
===
The submitted application will be evaluated on how well it addresses the three 
key themes of the IESD Workshop in helping users explore semantic data, with 
the following features:

Computational models:
• Novel contributions to methods and techniques for semantic data exploration
• Scalable system architecture (in terms of the amount of data used and 
performance of the system components)
Domain and applications:
• Meet the needs of the problem domain
• Uptake and adaptability of the system on other domains
Human factors:
• Support people dealing with information overload
• Support people understanding complex/large-scale data through exploration
• Support learning/knowledge-discovery through exploration
• Support personalization/adaptation

JUDGING AND PRIZES
==
A jury consisting of experts from the three workshop themes will be appointed 
to evaluate the best systems before the workshop. The jury will take into 
consideration the descriptions submitted, the online demos, the presentation 
of the application at the workshop, and the evaluation criteria specified 
above. During the workshop, attendees are encouraged to provide feedback to 
the jury after the presentations. The winner will be announced at the end of 
the workshop.

PRIZE SPONSOR
=
The winner of the IESD evaluation challenge will be awarded a prize sponsored 
by the Dicode project (http://dicode-project.eu/).



--

Dr Dhaval Thakker
Knowledge Engineering Research Fellow
University of Leeds
Leeds LS2 9JT
(O) +44 113-343-6797
(E) d.thak...@leeds.ac.uk
(W) http://tinyurl.com/68bla9p