Hello Dominic,

Thank you very much for your detailed reply, which answers my questions.

Reading this brought up a new question that is not directly related to the
CIDOC CRM:

When mapping RDF vocabularies in practice, how often will I have to use
rule-based systems instead of reasoning because RDFS/OWL is not expressive
enough for the mappings?
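
To make the distinction in this question concrete, here is a minimal sketch in plain Python (no RDF library). It contrasts a mapping that RDFS can express (the subproperty rule, rdfs7) with one that typically needs a rule, because it has to mint a new intermediate node. All `ex:` names, the blank-node naming scheme and both function names are assumptions made for illustration; the `crm:` and `dc:` identifiers are used only as labels.

```python
from itertools import count

SUBPROP = "rdfs:subPropertyOf"


def rdfs_subproperty_closure(triples):
    """Apply RDFS rule rdfs7 until nothing new is inferred:
    (p subPropertyOf q) and (s p o)  =>  (s q o)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        sub = {(s, o) for s, p, o in inferred if p == SUBPROP}
        for s, p, o in list(inferred):
            for p1, q in sub:
                if p == p1 and (s, q, o) not in inferred:
                    inferred.add((s, q, o))
                    changed = True
    return inferred


def expand_creator_to_event(triples):
    """Rule-style mapping: (obj dc:creator who) becomes a production event
    that links obj and who. The event node is newly created, which a plain
    subproperty or subclass axiom cannot do."""
    ids = count(1)
    out = set(triples)
    for s, p, o in sorted(triples):
        if p == "dc:creator":
            event = f"_:production{next(ids)}"  # hypothetical naming scheme
            out |= {
                (s, "crm:P108i_was_produced_by", event),
                (event, "rdf:type", "crm:E12_Production"),
                (event, "crm:P14_carried_out_by", o),
            }
    return out


data = {
    ("ex:obj1", "ex:designedBy", "ex:alice"),
    ("ex:designedBy", SUBPROP, "ex:madeBy"),
}
closure = rdfs_subproperty_closure(data)
expanded = expand_creator_to_event({("ex:obj1", "dc:creator", "ex:alice")})
```

The first function only ever adds triples between existing resources; the second has to invent an event resource, which is the kind of structural mapping where a rule language or a SPARQL CONSTRUCT query is the usual tool.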

Regards,

Michael Brunnbauer

On Tue, Jun 25, 2013 at 07:10:32AM +0100, Dominic Oldman wrote:
> Hi Michael.
> 
> Thanks for the question. I am glad you asked :-) but sorry in advance for the 
> long answer.
> 
> 
> Before I go further, though: a fundamental project objective is to make 
> mapping to the CRM straightforward and simple for organisations that 
> understand their own data, and the CRM-SIG (Special Interest Group) has 
> already made inroads. 
> 
> 
> The CIDOC CRM is an ontology formed in the 1990s that became an ISO 
> standard in 2006, so it had been around for a long time before many 
> other ontologies existed. In fact, the Europeana Data Model (EDM), used for 
> aggregation both in Europe and the US and covering over 20 million items in 
> Europe, borrows the concepts of time, place and people from the CRM, despite 
> also borrowing from Dublin Core, SKOS and FOAF! The EDM doesn't describe a 
> museum object record, being too general.
> 
> The CRM is used in the method I described in my other email without 
> specialisation, relying instead on vocabulary plugins that type the events 
> and reify properties. If we specialised them we would extend the CRM by 
> another 200 properties and entities, and that would make it complicated. 
> (See the web as literature presentation at 
> http://www.researchspace.org/project-updates/webasliterature-britishlibrary10thjune)
>  
> 
> 
> The ontology was created bottom-up by examining hundreds of data models in 
> the cultural heritage world and abstracting an ontology that would harmonise 
> them. The reality is that museum and cultural heritage data is highly 
> complex. The CIDOC CRM addresses this complexity instead of avoiding it, and 
> mapping data to it provides an extremely accessible and easy way to search 
> rich data sets even when they are sourced from many different organisations. 
> While it is comprehensive in its treatment of cultural heritage (it is not 
> confined to museums), it has been careful not to over-generalise like, for 
> example, Dublin Core. It is completely contextual and does not mandate a 
> common set of fields, and so does not strip down data from its original 
> sources: it provides a complete knowledge representation. (See 
> http://www.oldman.me.uk/blog/costsofculturalheritage/ )
> 
> 
> Museums and other cultural heritage organisations all have collection 
> catalogues that may be based on standards, but these standards are wide, 
> are in any event highly customised, and employ very different vocabularies. 
> In this respect they probably represent one of the hardest use cases for 
> linked data. The British Museum has internal thesauri and authorities that 
> cover object types, materials, techniques, cultures, subjects, periods and 
> languages, as well as bibliographies, biographies and places (both modern 
> and archaic). There are specialist thesauri for particular objects like 
> 'wares' and clocks and watches. Moreover, there are terminologies that 
> describe processes that are unique to the objects. The model itself reflects 
> particular institutional priorities, policies and customs formed over very 
> many years, and we have been digitising the collection for over 30 years. 
> All of this information is important for data harmonisation purposes 
> (particularly for research).
> 
> 
> The processes involved in the production of an object could involve a large 
> number of different methods with different influences and with associations 
> with different groups (artistic or otherwise) and different types of 
> production environment. All these things are addressed differently in 
> different cultural heritage organisations. The use of dates and periods can 
> be particularly complicated for different object types and with different 
> opinions about what a period means and how an object fits within a period. 
> Our dating runs from 2 million years BC to the present day, and date 
> classifications can be varied and non-standard.
> 
> 
> In reality the CRM is not a complicated ontology but merely generalises over 
> an extremely rich and complicated data environment, to the extent that it 
> can, for example, harmonise the artificial objects of a museum of 
> antiquities with, say, a natural history museum holding natural objects 
> classified according to a completely different set of standards. Equally it 
> can be used for HEI research (King's College have started using it for their 
> specialised research, such as a prosopography). There are no other 
> ontologies that describe a British Museum object to the level at which it 
> would be useful for research as well as education and engagement. Take 
> object production: 
> 
> 
> Production events typically record the process in relation to the person, 
> place, techniques and time spans, with the following variations:
> 
> 1. Production by specific process (place and actor).
> 2. Production by closely related group or pupil.
> 3. Production with no specific process (place and actor).
> 4. Production by different ethnic groups.
> 5. Production involving likelihood and probability (place and actor), i.e. 
>    attributed to or assigned to.
> 6. Production with parts which may have been part of the overall process or 
>    created as part of a separate production process.
> 7. Production authorities (the motivation for a production).
> 8. Production influences.
> 9. Production made for a particular place or for a particular person.
> 
> These different variations for production are determined by a British 
> Museum specific set of internal codes, which differ in other museums. For 
> example:
> 
> 5: Drawn by  
> AU: Author  
> BC: Block cut by  
> CA: Calligrapher  
> D: Designed by, DM: Medal designed and made by  
> DE: Decorated by  
> E: Engraved by  
> I: Issuer  
> ID: Intermediary draughtsman  
> J: Modelled by  
> L: Lustred by  
> M: Made by, DM: Medal designed and made by  
> P: Painted by  
> PH: Photographed by  
> SC: Scribe  
> WR: Written by  
> Z: Published by  
> G: Moneyer 
> T: Mint  
> PA: Print artist, PM: Print made by, R: Printed by  
>   
> AG: Office/studio of  
> AJ: Circle/School of  
> F: Factory of  
> O: Official/Office/Dept  
> W: Workshop of  
> A: Attributed to  
> AA: Attributed to an Apprentice/Pupil of  
> AB: Ascribed to  
> AC: Attributed to the Circle of  
> AD: Assigned to  
> AW: Attributed to the Workshop of  
> CB: Claimed to be by  
> AE: Formerly attributed to  
> IR: Inscription by  
> LE: Lettering engraved by  
> MB: Bell made by  
> MC: Case made by  
> MD: Dial made by  
> ME: Ebauche maker  
> MM: Movement made by  
> MP: Watch pendant made by  
> MQ: Dust-cap maker  
> AF: Attributed to a Follower of  
> AI: Attributed to an Imitator of  
> AL: Manner/Style of  
> AM: Attributed to the Manner of  
> AT: After  
> C: Close to  
> CF: Compare with  
> CM: Connected with the Manner of  
> CW: Connected with  
> S: School of/style of  
> RE: Related to  
> NE: Near  
> RC: Recalls  
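
A code list like the one above can be treated as a small controlled vocabulary. The following is a minimal sketch in Python, assuming the listing is available as plain text in the `CODE: Label` form shown, with several entries allowed on one comma-separated line. The function names and the `http://example.org/` base URI are hypothetical and are not part of any British Museum system.

```python
def parse_codes(listing):
    """Parse lines such as 'D: Designed by, DM: Medal designed and made by'
    into a dict mapping each code to its label."""
    codes = {}
    for line in listing.splitlines():
        for entry in line.split(","):
            code, sep, label = entry.partition(":")
            if sep and code.strip():
                codes[code.strip()] = label.strip()
    return codes


def term_uri(code, base="http://example.org/association/"):
    """Build a vocabulary term identifier for a code (hypothetical base URI)."""
    return base + code


# A few entries copied from the list above.
SAMPLE = """\
D: Designed by, DM: Medal designed and made by
E: Engraved by
A: Attributed to
AW: Attributed to the Workshop of
"""

codes = parse_codes(SAMPLE)
uri = term_uri("AW")
```

Once each code has a stable identifier, it can be co-referenced against other institutions' terms without changing the ontology itself.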
> 
> 
> These principles apply to other parts of the object record including 
> acquisition, inscription, visual representations and so on. I have attached a 
> construct for "Acquisition From", which is one acquisition construct along 
> with 'Acquisition Through', 'Custody From', 'Acquisition Motivated by', 
> 'Legislation', 'Found By' etc., all of which have different semantics that 
> are very important to represent. 
> 
> 
> The ontology allows us to take densities of data that have been contextually 
> harmonised and infer new knowledge, correct data that is wrong and 
> co-reference all the different terms and vocabularies. The Museum has terms, 
> people and places that don't appear in central authorities such as Getty, 
> VIAF etc. More obscure artists or artisans, for example, will simply never be 
> co-referenced without the contextualisation that CRM provides. 
> 
> 
> Here (attached) is the record (not the largest) of a significant object: the 
> Rosetta Stone. It is one object out of 2 million records that represent a 
> large number of object types, from art history to archaeology, reflecting 
> different internal approaches to classification and knowledge 
> representation. I would suggest that CRM is more of a miracle than an 
> ontology, and it would surely take 15 years for anyone to start from scratch 
> and produce something equivalent. :-)
> 
> Hope this answers the questions a little bit. Sorry for being so long-winded.
> 
> 
> Cheers Dominic
> 
> ________________________________
> From: Michael Brunnbauer <[email protected]>
> To: [email protected]
> Sent: Sunday, 23 June 2013, 18:21
> Subject: Re: Big data applications for general users based on RDF - where 
> are they?
> 
> 
> Hello Dominic,
> 
> On Sun, Jun 23, 2013 at 09:35:53AM +0100, Dominic Oldman wrote:
> > I take the point about the ability to set things up quickly, but this just 
> > points to the fact that we have some way to go on a number of strands. But 
> > we all know we are on the right path. I would say that focusing in on some 
> > of the huge range of potential applications that you couldn't do with a 
> > relational database will help move things along more.
> > 
> > On ontology, here is my experience. You need a solid ontology that 
> > describes your domain at precisely the right level to represent domain 
> > knowledge to establish key relationships but which supports specialisation 
> > below this level. This level is just at the point above which the domain 
> > varies. However, after going down the specialisation route we have found a 
> > more accessible and portable approach. 
> > 
> > We have used an ontology that does precisely the above but used it to 
> > create a set of ready-made constructs for key domain concepts that are 
> > uncontentious. A particular concept may have a number of alternative 
> > constructs from which an organisation can select as appropriate. We then 
> > avoid the need to specialise the constructs using subclasses and 
> > subproperties and instead provide a mechanism for plugging in local 
> > vocabularies. This transfers the issue of co-referencing ontology 
> > extensions to co-referencing vocabularies. This is far more accessible for 
> > two reasons. Firstly, the contextualisation of the non-specialised elements 
> > provides enough knowledge representation to perform the co-referencing. 
> > Secondly, there are many vocabulary co-referencing initiatives that are 
> > becoming more mature and accessible. The plugin approach is supported by 
> > typing whole event constructs and reifying key properties with local 
> > terminology, people and place authorities, but also terminology unique to 
> > the organisation (institutional context).
> > 
> > For example, the production of something may have a generalised property of 
> > "carried out by". This could be specialised in a large number of ways. 
> > Instead we can look at the local specialisations and use them as a 
> > vocabulary to either type the full event or to reify the property itself. 
> > E.g. "designed by".
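
The two options in the quoted example (typing the full event, or reifying the property) can be sketched as follows. This is only an illustration in plain Python tuples, not the project's actual mapping: the `ex:` identifiers and function names are hypothetical, standard RDF reification (`rdf:Statement`) is assumed as the reification mechanism, and attaching `crm:P2_has_type` to the statement node is likewise an assumption.

```python
def type_event(obj, event, actor, local_term):
    """Option 1: keep the generic 'carried out by' property and attach the
    local term (e.g. 'designed by') as the type of the whole event."""
    return [
        (obj, "crm:P108i_was_produced_by", event),
        (event, "rdf:type", "crm:E12_Production"),
        (event, "crm:P2_has_type", local_term),
        (event, "crm:P14_carried_out_by", actor),
    ]


def reify_property(event, actor, local_term, stmt="_:stmt1"):
    """Option 2: keep the generic triple and describe that single
    'carried out by' link with the local term via RDF reification."""
    return [
        (event, "crm:P14_carried_out_by", actor),
        (stmt, "rdf:type", "rdf:Statement"),
        (stmt, "rdf:subject", event),
        (stmt, "rdf:predicate", "crm:P14_carried_out_by"),
        (stmt, "rdf:object", actor),
        (stmt, "crm:P2_has_type", local_term),  # assumed way to attach the term
    ]


typed = type_event("ex:obj1", "ex:prod1", "ex:alice", "ex:designed_by")
reified = reify_property("ex:prod1", "ex:alice", "ex:designed_by")
```

In both variants a query for the generic `crm:P14_carried_out_by` still finds the actor without any reasoning over subproperties, while the local term remains available for finer filtering.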
> > 
> > This process avoids a whole range of issues and also has the potential to 
> > be built into accessible implementation tools useful for organisations 
> > without technical resources. It means that we can start producing the 
> > applications that we can't do with relational databases and which operate 
> > across many different datasets robustly.
> > 
> > How does this sound?
> 
> This sounds complicated :-)
> 
> That the cultural heritage crowd seems to have a need for its own upper-level
> ontology underlines my point about schematic and structural heterogeneity.
> 
> Why does CIDOC define general concepts such as place, event, spatial
> coordinates? Are there no suitable existing ontologies for this?
> 
> Can CIDOC also be used without specialization of properties and classes?
> 
> I think museums have used controlled vocabularies for quite a while. Can you
> give an example that illustrates why the additional effort required for your 
> project is justified?
> 
> Regards,
> 
> Michael Brunnbauer
> 
> -- 
> ++  Michael Brunnbauer
> ++  netEstate GmbH
> ++  Geisenhausener Straße 11a
> ++  81379 München
> ++  Tel +49 89 32 19 77 80
> ++  Fax +49 89 32 19 77 89 
> ++  E-Mail [email protected]
> ++  http://www.netestate.de/
> ++
> ++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
> ++  USt-IdNr. DE221033342
> ++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
> ++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel

