Re: Help with modeling my ontology
Just on the question of representing measurements: one approach is the RDF Data Cube vocabulary [1]. In that model, each observation has a measure (the thing you are measuring, such as canopyHeight), the dimensions that say where/when/etc. the measurement applies, and the attributes that allow you to interpret the measurement. So you would normally make the unit of measure an attribute. If the method doesn't fundamentally change the nature of the thing you are measuring, you could make that another attribute; if it does, you should have a different measure property for each method (possibly with a common super-property).

Dave

[1] http://www.w3.org/TR/vocab-data-cube/

On 27/02/13 20:58, Luca Matteis wrote:

Hello all,

At http://www.cropontology.org/ I'm trying to make things a little more RDF-friendly. For example, we have an ontology about Groundnut here: http://www.cropontology.org/ontology/CO_337/Groundnut/ttl

I'm generating this from a somewhat flat list of names/concepts, so it's still a work in progress. But I'm having trouble making sense of it all so that the ontology can be used by people who actually have Groundnut data. For example, in that Turtle dump, search for "Canopy height". This is a concept that people might use to describe the height of the canopy of their groundnut plant, as the comment describes (it should be a Property, not a Class, but like I said, it's still work in progress).

Let's try some sample data someone might have about groundnut, and see if I can explain my issue further (assume co: is a prefix for my cropontology.org site; the URIs are different, but it's just an example):

    :groundnut1 a co:Groundnut ;
        co:canopyHeight xxx .

OK, here's the issue: we know that `canopyHeight` is measured using different methodologies. For example, it might be measured using a methodology that we found described as "Measuring the distance from the base to the tip of the main stem", but it might also be some other method. And, funny enough, we also realized that it is measured in centimeters, with a minimum of 0 and a maximum of 10 cm. So how should I make this easier on the people using my ontology? Should it be:

    :groundnut1 a co:Groundnut ;
        co:canopyHeight "9.5cm" .

or should it be:

    :groundnut1 a co:Groundnut ;
        co:canopyHeight [
            co:method "Measuring the distance from the base to the tip of the main stem" ;
            co:scale "9.5cm"
        ] .

Maybe I'm going about this the wrong way and should think more about how this ontology is going to be used by the people who have data about it... but I'm not sure. Any advice would be great.

And here's the actual browsable list of concepts, in a tree sort of interface: http://www.cropontology.org/terms/CO_337:039/

As you can see, this kind of thing happens all over the ontology: we have the property, the method it was measured with, and finally the scale. Any help? Thanks!
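Dave's Data Cube suggestion could be sketched in Turtle roughly as follows. This is only an illustrative sketch: the eg: terms, the dataset URI, and the expansion of the co: prefix are invented here for the example; only the qb: terms come from the Data Cube vocabulary.

```turtle
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix eg: <http://example.org/ns#> .          # hypothetical namespace
@prefix co: <http://www.cropontology.org/ns#> . # assumed prefix expansion

# The measure is the thing being measured.
co:canopyHeight a qb:MeasureProperty .

# Unit and method are attributes that tell you how to interpret the value.
eg:unit   a qb:AttributeProperty .
eg:method a qb:AttributeProperty .

# One observation of one plant.
eg:obs1 a qb:Observation ;
    qb:dataSet eg:groundnutTrials ;
    eg:plant eg:groundnut1 ;   # a dimension: what the value applies to
    co:canopyHeight 9.5 ;
    eg:unit eg:centimetre ;
    eg:method "Measuring the distance from the base to the tip of the main stem" .
```

If the method instead changes what is actually being measured, the sketch would use a distinct measure property per method (e.g. a hypothetical co:canopyHeightMainStem) under a common super-property, as Dave suggests.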
2nd Announcement: DC-2013 call for participation
*** Please excuse the cross-posting ***

LINKING TO THE FUTURE
International Conference on Dublin Core and Metadata Applications
2-6 September 2013, Lisbon, Portugal

= 2nd ANNOUNCEMENT: DC-2013 CALL FOR PARTICIPATION =

DC-2013 will explore questions regarding the persistence, maintenance, and preservation of metadata and descriptive vocabularies. The need for stable representations and descriptions spans all sectors, including cultural heritage and scientific data, eGovernment, finance, and commerce. The maintenance and management of metadata is thus essential to the long-term availability of information of legal, cultural, and economic value. On the web, data - and especially descriptive vocabularies - can change or vanish from one moment to the next. Nonetheless, the web increasingly forms the ecosystem for our vocabularies and our data. DC-2013 will bring together in Lisbon the community of metadata scholars and practitioners to exchange knowledge and best practices for developing a sustainable metadata ecosystem. DC-2013 will be co-located and run simultaneously with iPRES 2013, providing a rich environment for synergistic exploration of issues common to both communities.
= IMPORTANT DATES =
-- SUBMISSION DEADLINE: 29 March 2013
-- AUTHOR NOTIFICATION: 7 June 2013
-- FINAL COPY: 5 July 2013

= IMPORTANT URLS =
-- ONLINE CFP: http://purl.org/dcevents/dc-2013/cfp
-- CONFERENCE WEBSITE: http://purl.org/dcevents/dc-2013
-- SUBMISSION URL: http://dcevents.dublincore.org/index.php/IntConf/dc-2013/author/submit?requiresAuthor=1
-- ORGANIZING COMMITTEE: http://dcevents.dublincore.org/index.php/IntConf/dc-2013/about/organizingTeam

= TOPICS =
Beyond the conference theme, papers, reports, and poster submissions are welcome on a wide range of metadata topics, such as:
-- Metadata principles, guidelines, and best practices
-- Metadata quality (methods, tools, and practices)
-- Conceptual models and frameworks (e.g., RDF, DCAM, OAIS)
-- Application profiles
-- Metadata generation (methods, tools, and practices)
-- Metadata interoperability across domains, languages, time, structures, and scales
-- Cross-domain metadata uses (e.g., recordkeeping, preservation, curation, institutional repositories, publishing)
-- Domain metadata (e.g., for corporations, cultural memory institutions, education, government, and scientific fields)
-- Bibliographic standards (e.g., RDA, FRBR, subject headings) as Semantic Web vocabularies
-- Accessibility metadata
-- Metadata for scientific data, e-Science, and grid applications
-- Social tagging and user participation in building metadata
-- Usage data (paradata/attention metadata)
-- Knowledge Organization Systems (e.g., ontologies, taxonomies, authority files, folksonomies, and thesauri) and Simple Knowledge Organization System (SKOS)
-- Ontology design and development
-- Integration of metadata and ontologies
-- Search engines and metadata
-- Linked data and the Semantic Web (metadata and applications)
-- Vocabulary registries and registry services

= SUBMISSIONS =
-- All submissions must be in English.
-- All submissions will be peer-reviewed by the International Program Committee.
-- Unless previously arranged, accepted papers, project reports, and posters must be presented in Lisbon by at least one of their authors.

Submissions for Asynchronous Participation: With prior arrangement, a few exceptional papers, project reports, and extended poster abstracts will be accepted for asynchronous presentation by their authors. Submissions accepted for asynchronous presentation must follow both the general author guidelines for submission and the additional instructions at http://dcevents.dublincore.org/IntConf/index/pages/view/remote.

= PUBLICATION =
-- Accepted papers, project reports, and poster abstracts will be published in the permanent online conference proceedings and in DCMI Publications (http://dcpapers.dublincore.org/).
-- Special session and community workshop session abstracts will be published in the online conference proceedings.
-- Papers, research reports, and poster abstracts must conform to the appropriate formatting template available through the DCMI Peer Review System.
-- Submitting authors in all categories must provide basic information regarding current professional positions and affiliations as a condition of acceptance and publication.

= SUBMISSION CATEGORIES =
FULL PAPERS (8-10 pages; peer reviewed): Full papers either describe innovative work in detail or provide critical, well-referenced overviews of key developments or good practice in the areas outlined above. Full papers
Re: two datasets for DBLP
Hi Kalpa,

As the person responsible for the second site, here is an explanation. It's quite long, but you did ask, and maybe some people will find it useful.

Firstly, DBLP is a stunning resource, and so for the rkbexplorer (and now other) services we were keen to have their data. Let me say that again - DBLP is a stunning resource.

So why do we take a copy of their data (which they helpfully provide) and publish it as Linked Data? Well, we wanted it as Linked Data. But in fact there is another Linked Data site with the same data, and my best recollection is that it was already in existence when we brought up our site, in what must have been about 2005. We didn't really want to duplicate it, but there was a problem with the data at source [1].

DBLP is essentially for searching. So for their purpose, they prefer to have high recall when the name of an author is put in. That is, they are quite liberal (it seems) about whether two authors of the same name are the same person, because they don't want to miss any cases (false negatives). NLP people will tell you that the price of high recall is low precision - there will be more cases where they incorrectly conflate two authors (false positives). See http://eprints.soton.ac.uk/id/eprint/264361 for the beginnings of this discussion. In fact we did some analysis of the extent of the problem (http://eprints.soton.ac.uk/id/eprint/265181), and without too much trouble we found in source [1] one author URI that was a conflation of 15 different people (as best we could tell). I am not certain whether the problem came from their version of the DBLP data or was introduced by the process of building source [1].

Our purposes were more complex - we were using the information as part of a more involved knowledge processing system, which included inferring information based on the semantic relationships, and any false positives caused a knock-on effect.
For example (as best I recall, and in fact the thing that first raised the problem for us), there was a conflation of two Prof. Tom Andersons - one at the University of Newcastle, UK, and another in California. So when you looked at the UK Tom Anderson, we inferred that he was funded to a large extent by the US government, and therefore that the University of Newcastle was funded by the US government to a much greater extent than it was. Further author problems would then have led us to deduce that the University of Newcastle, UK was the same institution as the University of Newcastle, NSW, Australia.

You will therefore understand that the precision/recall needs of our application were very different from those of the DBLP site. This situation was, and is, not unique to DBLP - it has been true of almost every source we have tried to use. Last time I looked, the ACM library had conflated the two Universities of Newcastle. And it is also a problem for other sites - Microsoft Academic Search has me as the same Glaser as someone who published before I was born. And last time I tried to check, I found that "Hugh Glaser" was Google-unique.

So we now (periodically) download the DBLP dump, convert it to RDF, and publish it as Linked Data - but with our completely independent view of author disambiguation (we call it co-reference). In fact, since we were doing it ourselves, we used the AKT ontology, which was more convenient for us (note to Kingsley - it isn't just another publication of the same RDF; it actually uses a completely different ontology). So source [2] is DBLP data (which does not have URIs for authors at all, just strings) with our own URIs: we generate a new, unique URI for every author on every paper, and then do our own analysis to conflate them.

Finally, the sameAs relations with source [1]: since the source [1] URIs for papers are safe, we establish sameAs with them.
But for authors we can't safely do that, as follow-your-nose would suck in the incorrect information; so our system is explicitly fixed to reject such Linked Data from source [1]. And in fact, when I do http://sameas.org harvesting, I avoid source [1]. It may be that things are different now - I haven't done any checking for quite a few years.

As I say, I have gone on at some length here, but I think this is an instance of a very important issue for Linked Data applications - some would argue that much of the Linked Data cloud is derived from similar data that has been set up to prefer recall over precision.

Thanks for reminding me to refresh source [2], it was very out of date!

Best
Hugh

On 27 Feb 2013, at 12:10, Kalpa Gunaratna kalpagunara...@gmail.com wrote:

Hi, I am trying to do an alignment task between LOD datasets and came to see that DBLP has two different datasets hosted in two places, possibly with different schemas. Following are their two URLs:
http://dblp.l3s.de/d2r/ [1]
http://dblp.rkbexplorer.com/ [2]
both these datasets have
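The selective linking policy Hugh describes - sameAs for papers, but fresh per-paper author URIs conflated only by their own co-reference analysis - might be sketched in Turtle like this. All the resource URIs below are invented for illustration; only the two hostnames come from the thread, and the AKT property names are assumptions.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix akt: <http://www.aktors.org/ontology/portal#> . # assumed prefix

# Papers: the source [1] URIs are safe, so assert sameAs.
<http://dblp.rkbexplorer.com/id/paper-123>
    owl:sameAs <http://dblp.l3s.de/d2r/resource/publications/paper-123> .

# Authors: a new, unique URI per (author, paper), e.g. person-abc for the
# author string on paper-123 and person-def for the same string on paper-456.
<http://dblp.rkbexplorer.com/id/person-abc> a akt:Person .
<http://dblp.rkbexplorer.com/id/person-def> a akt:Person ;
    # Conflated only where our own co-reference analysis says so -
    # never by sameAs links harvested from source [1].
    owl:sameAs <http://dblp.rkbexplorer.com/id/person-abc> .
```

The asymmetry is the point: a bad author sameAs would let follow-your-nose pull in another person's funding and affiliation facts, whereas a paper sameAs only merges descriptions of the same publication.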
Call for Applications: IESD Challenge 2013 (**LAST DAY***)
[Apologies for cross-posting]

CALL FOR APPLICATIONS: *Intelligent Exploration of Semantic Data (IESD) Challenge 2013*
http://imash.leeds.ac.uk/event/2013/challenge.html
Part of the IESD International Workshop at Hypertext 2013, Paris, France, May 1, 2013

IMPORTANT DATES
===
- Submissions due: March 1st 2013
- Notification of acceptance: 22 March 2013
- Notification of winner: during the workshop

OVERVIEW
===
Application submissions are now invited for the IESD Challenge 2013. The IESD Challenge aims to attract participation from the Semantic Web community, particularly focusing on semantic data exploration. The Challenge is open to everyone from industry and academia. The authors of the best application will be awarded a prize at the IESD Workshop.

CHALLENGE APPLICATION ENTRY REQUIREMENTS
===
We invite applications particularly focusing on semantic data exploration. An application should meet the minimal requirements listed below:
1. The application provides an end-user interface, i.e. either to general Web users or to domain users.
2. The application is implemented using Semantic Web technologies (such as RDF, linked open data, and other Semantic Web technologies).
3. The application should support semantic data exploration by addressing the three key themes of the IESD Workshop: human factors, computational models, and application domains. See http://imash.leeds.ac.uk/event/2013/topics.html for more details.

HOW TO PARTICIPATE
===
Step 1. Visit http://imash.leeds.ac.uk/event/2013/challenge.html to register for the IESD Challenge 2013 by submitting the required information (in step 2).
Step 2. Provide the following information when submitting:
1. Abstract: no more than 200 words.
2. A short description with the following details about the application:
a) What is the key novelty of the system?
b) Who are the likely users?
c) URL/demo video of the system?
d) How does the application address the key themes of the IESD Workshop: human factors, computational models, and application domains?
e) Architecture/key components of the system.

Papers should not exceed 4 pages. All submissions should be formatted according to the official ACM SIG proceedings template and submitted via EasyChair at https://www.easychair.org/conferences/?conf=iesd2013. In EasyChair, when asked for a category, please select "IESD Challenge".

Step 3. Present the application at the workshop (10 minutes), addressing the evaluation criteria below.

EVALUATION CRITERIA
===
The submitted application will be evaluated on how well it addresses the three key themes of the IESD Workshop to help users explore semantic data, with the following features:

Computational models:
• Novel contributions to methods and techniques for semantic data exploration
• Scalable system architecture (in terms of the amount of data used and the performance of the system components)

Domain and applications:
• Meets the needs of the problem domain
• Uptake and adaptability of the system in other domains

Human factors:
• Supports people dealing with information overload
• Supports people understanding complex/large-scale data through exploration
• Supports learning/knowledge discovery through exploration
• Supports personalization/adaptation

JUDGING AND PRIZES
===
A jury consisting of experts from the three workshop themes will be appointed to evaluate the best systems before the workshop. The jury will take into consideration the descriptions submitted, the online demos, the presentation of the application at the workshop, and the evaluation criteria specified above. During the workshop, attendees are encouraged to provide feedback to the jury after each presentation. The winner will be announced at the end of the workshop.

PRIZE SPONSOR
===
The winner of the IESD evaluation challenge will be awarded a prize sponsored by the Dicode project (http://dicode-project.eu/).
--
Dr Dhaval Thakker
Knowledge Engineering Research Fellow
University of Leeds
Leeds LS2 9JT
(O) +44 113-343-6797
(E) d.thak...@leeds.ac.uk
(W) http://tinyurl.com/68bla9p