Hello Dominic, thank you very much for your detailed reply which answers my questions.
Reading this brought up a new question that is not directly related to
CIDOC CRM: When mapping RDF vocabularies in practice, how often will I have
to use rule-based systems instead of reasoning because RDFS/OWL is not
expressive enough for the mappings?

Regards,

Michael Brunnbauer

On Tue, Jun 25, 2013 at 07:10:32AM +0100, Dominic Oldman wrote:
> Hi Michael.
>
> Thanks for the question. I am glad you asked :-) but sorry in advance for the
> long answer.
>
> Before I go further though, a fundamental project objective is to make
> mapping to the CRM straightforward and simple for organisations that
> understand their own data, and the CRM-SIG (Special Interest Group) has
> already made inroads.
>
> The CIDOC CRM is an ontology that was formed in the 1990s and became an ISO
> standard in 2006, so it was around long before many other ontologies
> existed. In fact, the Europeana Data Model, used for aggregating both in
> Europe and the US and covering over 20 million items in Europe, borrows the
> concepts of time, place and people from the CRM despite also borrowing from
> Dublin Core, SKOS and FOAF! The EDM doesn't describe a museum object
> record, being too general.
>
> The CRM is used in the method I described in my other email without
> specialisation but with vocabulary plugins that type the events and reify
> properties. If we specialised them we would extend the CRM by another 200
> properties and entities - and that would make it complicated. (See the Web
> as Literature presentation at
> http://www.researchspace.org/project-updates/webasliterature-britishlibrary10thjune)
>
> The ontology was created bottom up by examining hundreds of data models in
> the cultural heritage world and abstracting an ontology that would harmonise
> them. The reality is that museum and cultural heritage data is highly
> complex.
> The CIDOC CRM, instead of avoiding it, addresses this complexity, and
> by mapping data to it provides an extremely accessible and easy way to search
> rich data sets even when they are sourced from many different organisations.
> While it is comprehensive in its treatment of cultural heritage (it is not
> confined to museums) it has been careful not to over-generalise like, for
> example, Dublin Core. It is completely contextual and does not mandate a
> common set of fields and thereby does not strip down data from its original
> sources - it provides a complete knowledge representation. (See
> http://www.oldman.me.uk/blog/costsofculturalheritage/)
>
> Museums and other cultural heritage organisations all have collection
> catalogues that may be based on standards, but these standards are wide and
> are in any event highly customised and employ very different vocabularies. In
> this respect they probably represent one of the hardest use cases for linked
> data. The British Museum have internal thesauri and authorities that cover
> object types, materials, techniques, cultures, subjects, periods, languages
> as well as bibliographies, biographies and places (both modern and archaic).
> There are specialist thesauri for particular objects like 'wares' and clocks
> and watches. Moreover there are terminologies that describe processes that
> are unique to the objects. The model itself reflects particular institutional
> priorities, policies and customs formed over very many years, and we have
> been digitising the collection for over 30 years. All of this information is
> important for data harmonisation purposes (particularly for research).
>
> The processes involved in the production of an object could involve a large
> number of different methods with different influences and with associations
> with different groups (artistic or otherwise) and different types of
> production environment.
> All these things are addressed differently in
> different cultural heritage organisations. The use of dates and periods can
> be particularly complicated for different object types and with different
> opinions about what a period means and how an object fits within a period.
> Our dating runs from 2 million years BC to the present day, and date
> classifications can be varied and non-standard.
>
> In reality the CRM is not a complicated ontology but merely generalises over
> an extremely rich and complicated data environment to the extent that it can,
> for example, harmonise the artificial objects of a museum of antiquities with,
> say, a Natural History museum with natural objects and with classification
> following a completely different set of standards. Equally it can be used for
> HEI research (King's College have started using it for their specialised
> research like a prosopography). There are no other ontologies that describe
> a British Museum object to the level at which it would be useful for research
> as well as education and engagement. Take object production:
>
> Production events typically record the process in relation to the person,
> place, techniques and time spans, with the following variations:
>
> 1. Production by specific process (place and actor)
> 2. Production by closely related group or pupil
> 3. Production with no specific process (place and actor)
> 4. Production by different ethnic groups
> 5. Production involving likelihood and probability (place and actor) (i.e.
>    attributed to or assigned to).
> 6. Production with parts which may have been part of the overall process or
>    created as part of a separate production process.
> 7. Production authorities (the motivation for a production).
> 8. Production influences.
> 9. Production made for a particular place or for a particular person.
>
> These different variations for production are determined by a British Museum
> specific set of internal codes, which are different in other museums. For
> example:
>
> 5: Drawn by
> AU: Author
> BC: Block cut by
> CA: Calligrapher
> D: Designed by, DM: Medal designed and made by
> DE: Decorated by
> E: Engraved by
> I: Issuer
> ID: Intermediary draughtsman
> J: Modelled by
> L: Lustred by
> M: Made by, DM: Medal designed and made by
> P: Painted by
> PH: Photographed by
> SC: Scribe
> WR: Written by
> Z: Published by
> G: Moneyer
> T: Mint
> PA: Print artist, PM: Print made by, R: Printed by
>
> AG: Office/studio of
> AJ: Circle/School of
> F: Factory of
> O: Official/Office/Dept
> W: Workshop of
> A: Attributed to
> AA: Attributed to an Apprentice/Pupil of
> AB: Ascribed to
> AC: Attributed to the Circle of
> AD: Assigned to
> AW: Attributed to the Workshop of
> CB: Claimed to be by
> AE: Formerly attributed to
> IR: Inscription by
> LE: Lettering engraved by
> MB: Bell made by
> MC: Case made by
> MD: Dial made by
> ME: Ebauche maker
> MM: Movement made by
> MP: Watch pendant made by
> MQ: Dust-cap maker
> AF: Attributed to a Follower of
> AI: Attributed to an Imitator of
> AL: Manner/Style of
> AM: Attributed to the Manner of
> AT: After
> C: Close to
> CF: Compare with
> CM: Connected with the Manner of
> CW: Connected with
> S: School of/style of
> RE: Related to
> NE: Near
> RC: Recalls
>
> These principles apply to other parts of the object record including
> acquisition, inscription, visual representations and so on. I have attached a
> construct for "Acquisition From", which is one acquisition construct along
> with 'Acquisition Through', 'Custody From', 'Acquisition Motivated by',
> 'Legislation', 'Found By' etc., all of which have different semantics that
> are very important to represent.
>
> The ontology allows us to take densities of data that have been contextually
> harmonised and infer new knowledge, correct data that is wrong and
> co-reference all the different terms and vocabularies. The Museum has terms,
> people and places that don't appear in central authorities, Getty, VIAF etc.
> More obscure artists or artisans, for example, will simply never be
> co-referenced without the contextualisation that CRM provides.
>
> Here (attached) is the record (not the largest) of a significant object - The
> Rosetta Stone. One object out of 2 million records that represent a large
> number of object types from art history to archaeology, representing
> different internal approaches to classification and knowledge
> representation. I would suggest that CRM is more of a miracle than an
> ontology and it would surely take 15 years for anyone to start from scratch
> and produce something equivalent. :-)
>
> Hope this answers the questions a little bit. Sorry for being so long-winded.
>
> Cheers Dominic
>
> ________________________________
> From: Michael Brunnbauer <[email protected]>
> To: [email protected]
> Sent: Sunday, 23 June 2013, 18:21
> Subject: Re: Big data applications for general users based on RDF - where
> are they?
>
> Hello Dominic,
>
> On Sun, Jun 23, 2013 at 09:35:53AM +0100, Dominic Oldman wrote:
> > I take the point about the ability to set things up quickly, but this just
> > points to the fact that we have some way to go on a number of strands. But
> > we all know we are on the right path. I would say that focusing in on some
> > of the huge range of potential applications that you couldn't do with a
> > relational database will help move things along more.
> >
> > On ontology here is my experience.
> > You need a solid ontology that describes
> > your domain at precisely the right level to represent domain knowledge to
> > establish key relationships but which supports specialisation below this
> > level. This level is just at the point above which the domain varies.
> > However, after going down the specialisation route we have found a more
> > accessible and portable approach.
> >
> > We have used an ontology that does precisely the above but used it to
> > create a set of ready-made constructs for key domain concepts that are
> > uncontentious. A particular concept may have a number of alternative
> > constructs from which an organisation can select as appropriate. We then
> > avoid the need to specialise the constructs using sub classes and sub
> > properties and instead provide a mechanism for plugging in local
> > vocabularies. This transfers the issue of co-referencing ontology
> > extensions to co-referencing vocabularies. This is far more accessible for
> > two reasons. Firstly, the contextualisation of the non-specialised elements
> > provides enough knowledge representation to perform the co-referencing.
> > Secondly, there are many vocabulary co-referencing initiatives that are
> > becoming more mature and accessible. The plugin approach is supported by
> > typing whole event constructs and reification of key properties with local
> > terminology, people and place authorities, but also terminology unique to
> > the organisation (institutional context).
> >
> > For example, the production of something may have a generalised property of
> > "carried out by". This could be specialised in a large number of ways.
> > Instead we can look at the local specialisations and use them as a
> > vocabulary to either type the full event or to reify the property itself.
> > E.g. "designed by".
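
A minimal sketch of the two options described just above (typing the full event, or reifying the property), written as plain Python tuples rather than with any RDF library. The "ex:" resources and the "bmv:" vocabulary prefix are hypothetical, and the CRM names are used as illustrative labels only, so this is one possible reading of the approach and not the actual British Museum mapping.

```python
# Illustrative CURIE strings only. The "ex:" resources and the "bmv:" local
# vocabulary prefix are hypothetical; the CRM names follow the published
# class and property labels but are not tied to a specific RDF encoding.
RDF_TYPE = "rdf:type"
PRODUCTION = "crm:E12_Production"
CARRIED_OUT_BY = "crm:P14_carried_out_by"
HAS_TYPE = "crm:P2_has_type"
IN_THE_ROLE_OF = "crm:P14.1_in_the_role_of"


def type_whole_event(event, actor, term):
    """Option 1: keep the generic property and type the whole event."""
    return {
        (event, RDF_TYPE, PRODUCTION),
        (event, HAS_TYPE, term),
        (event, CARRIED_OUT_BY, actor),
    }


def reify_property(event, actor, term, statement):
    """Option 2: keep the generic property and attach the local term to a
    node that stands for this one "carried out by" statement."""
    return {
        (event, RDF_TYPE, PRODUCTION),
        (event, CARRIED_OUT_BY, actor),
        (statement, RDF_TYPE, "rdf:Statement"),
        (statement, "rdf:subject", event),
        (statement, "rdf:predicate", CARRIED_OUT_BY),
        (statement, "rdf:object", actor),
        (statement, IN_THE_ROLE_OF, term),
    }


def actors_of(triples, event):
    """Query on the generic property only; no local term is needed."""
    return sorted(o for (s, p, o) in triples
                  if s == event and p == CARRIED_OUT_BY)


option1 = type_whole_event("ex:production1", "ex:actorA", "bmv:designed_by")
option2 = reify_property("ex:production1", "ex:actorA", "bmv:designed_by",
                         "ex:statement1")
print(actors_of(option1, "ex:production1"))
print(actors_of(option2, "ex:production1"))
```

In both variants the generic "carried out by" triple is still present, so a query that knows nothing about the local term finds the actor, while the local term remains available for co-referencing against other vocabularies.
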
> >
> > This process avoids a whole range of issues and also has the potential to
> > be built into accessible implementation tools useful for organisations
> > without technical resources. It means that we can start producing the
> > applications that we can't do with relational databases and which operate
> > across many different datasets robustly.
> >
> > How does this sound?
>
> This sounds complicated :-)
>
> That the cultural heritage crowd seems to have a need for its own upper level
> ontology underlines my point about schematic and structural heterogeneity.
>
> Why does CIDOC define general concepts such as place, event and spatial
> coordinates? Are there no suitable existing ontologies for this?
>
> Can CIDOC also be used without specialization of properties and classes?
>
> I think museums have used controlled vocabularies for quite a while. Can you
> give an example that illustrates why the additional effort required for your
> project is justified?
>
> Regards,
>
> Michael Brunnbauer
>
> --
> ++ Michael Brunnbauer
> ++ netEstate GmbH
> ++ Geisenhausener Straße 11a
> ++ 81379 München
> ++ Tel +49 89 32 19 77 80
> ++ Fax +49 89 32 19 77 89
> ++ E-Mail [email protected]
> ++ http://www.netestate.de/
> ++
> ++ Sitz: München, HRB Nr.142452 (Handelsregister B München)
> ++ USt-IdNr. DE221033342
> ++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
> ++ Prokurist: Dipl. Kfm. (Univ.) Markus Hendel

--
++ Michael Brunnbauer
++ netEstate GmbH
++ Geisenhausener Straße 11a
++ 81379 München
++ Tel +49 89 32 19 77 80
++ Fax +49 89 32 19 77 89
++ E-Mail [email protected]
++ http://www.netestate.de/
++
++ Sitz: München, HRB Nr.142452 (Handelsregister B München)
++ USt-IdNr. DE221033342
++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++ Prokurist: Dipl. Kfm. (Univ.) Markus Hendel
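
P.S. As an illustration of the kind of mapping the question at the top is about, here is a minimal sketch in Python. It assumes a source that uses a flat dc:creator triple and a target that wants an event-centric structure. As far as I can see, rdfs:subPropertyOf can only relate one property to another and cannot introduce the intermediate event node, so a mapping like this seems to need a rule (or something like a SPARQL CONSTRUCT). All "ex:" names and the generated "_:production" identifiers are hypothetical.

```python
def creator_to_production(triples):
    """Rule sketch: (?obj dc:creator ?actor) => a new production event
    that links ?obj and ?actor. The event node does not exist in the
    source data, so it has to be minted by the rule."""
    out = set()
    creators = sorted((s, o) for (s, p, o) in triples if p == "dc:creator")
    for n, (obj, actor) in enumerate(creators, start=1):
        event = f"_:production{n}"  # hypothetical minted identifier
        out |= {
            (event, "rdf:type", "crm:E12_Production"),
            (event, "crm:P108_has_produced", obj),
            (event, "crm:P14_carried_out_by", actor),
        }
    return out


source = {
    ("ex:object1", "dc:creator", "ex:actorA"),
    ("ex:object1", "dc:title", "An example title"),
}
for triple in sorted(creator_to_production(source)):
    print(triple)
```

Triples that the rule does not match (the dc:title triple here) are simply left out of the result; a real mapping would need further rules for them.
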
