Re: [Crm-sig] Issue 230 Co-reference

martin Thu, 27 Mar 2014 11:35:39 +0200

Hi Maximilian,

Yes, sure! It is self-evident that you would not state derived links manually. The process we describe here is based on scholarly insight. Only those links are described which result directly from insight and are not already deduced by transitivity.
The minimal number of links is always the same.

The deductions, the rest of the n*(n-1), can be managed by a program, or even be calculated at query time from the original statements.
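
To illustrate the point, here is a minimal sketch in Python. The function names and the "ref:..." identifiers are purely hypothetical and not part of the CRM or of any existing tool. It assumes that only the links resulting from insight are stored, and that every derived pair is computed on demand with a union-find structure.

```python
from itertools import permutations


def find(parent, x):
    """Return the representative of x, compressing the path on the way."""
    parent.setdefault(x, x)
    root = x
    while parent[root] != root:
        root = parent[root]
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root


def coreference_classes(asserted_links):
    """Group references into classes using only the asserted links."""
    parent = {}
    for a, b in asserted_links:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_a] = root_b
    classes = {}
    for x in list(parent):
        classes.setdefault(find(parent, x), set()).add(x)
    return list(classes.values())


def derived_links(asserted_links):
    """All ordered pairs implied by transitivity, asserted ones included."""
    pairs = set()
    for cls in coreference_classes(asserted_links):
        pairs.update(permutations(sorted(cls), 2))
    return pairs


# Hypothetical example: three asserted links connect four references.
asserted = [("ref:A", "ref:B"), ("ref:B", "ref:C"), ("ref:C", "ref:D")]
implied = derived_links(asserted)
```

With four references, three asserted links are enough, and the remaining pairs of the n*(n-1) = 12 ordered links never have to be entered by hand.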


A "high error rate in manual curation" does not make sense for the rest:

We talk here about primary knowledge from scholarly research. There is no other source than curatorial knowledge, regardless of whether it is supported by an automated "instance matching algorithm", which produces identity assumptions, or detected manually.

See also:

1. Meghini, C., Doerr, M., & Spyratos, N. (2009). Managing Co-reference
   Knowledge for Data Integration. /Proceedings of the 2009 conference
   on Information Modelling and Knowledge Bases XX/, (pp. 224-244).
   Amsterdam, The Netherlands: IOS Press (978-1-58603-957-8), (pdf
   <http://www.ics.forth.gr/_publications/ejc08-final.pdf>).
2. Doerr, M., Meghini, C., & Spyratos, N. (2007). Leveraging on
   Associations - a New Challenge for Digital Libraries. /In Proc. of
   the First International Workshop on Digital Libraries Foundations, in
   conjunction with ACM IEEE Joint Conference on Digital Libraries
   (JCDL 2007)/, Vancouver, Canada, 23 June. (pdf
   <http://www.ics.forth.gr/_publications/Martin%20Canada%20paper%20-%20Leveraging.pdf>).

See VIAF: they have created over 30% of the co-reference network manually. Only 69% could be done by an algorithm that was manually(!) confirmed to be sufficiently reliable.

Simple references to "unknown objects" miss the point: Who said that they are the same?


I'll take your point to add this intention to the scope note!

Best,

Martin


On 26/3/2014 10:15, Maximilian Schich wrote:

Whoever collects co-references: The number of co-reference links explodes as n*(n-1) with the number of references to the reference object. Imagine co-reference links between all books citing the bible. This is likely to result in a high error rate, especially in manual curation.
Workaround: Use simple references to an "unknown/potential reference object" and put a probability on those links => scales with n.
Max Schich
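
The scaling argument above can be sketched as follows, in Python. The function names, the "book:..." identifiers, and the probability values are hypothetical and only serve to illustrate the two link counts.

```python
def pairwise_link_count(n):
    """Ordered co-reference links between all n references: n*(n-1)."""
    return n * (n - 1)


def hub_link_count(n):
    """One link per reference to a shared potential reference object: n."""
    return n


# Hypothetical hub model: each reference points to the same
# "unknown/potential reference object" with a probability on the link.
hub_links = {"book:1": 0.9, "book:2": 0.6, "book:3": 0.95}

for n in (3, 10, 1000):
    print(n, pairwise_link_count(n), hub_link_count(n))
```

For 1000 references this gives 999000 pairwise links against 1000 hub links.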
On 2014-03-26 14:26, martin wrote:
Dear All,

Here my homework:


      E91 Co-Reference Assignment

Subclass of: E13 Attribute Assignment
Scope note: This class comprises actions of making the assertion whether two or more particular instances of E89 Propositional Object refer to the same instance of E1 CRM Entity. The assertion is based on the assumption that this was an implicit fact being made explicit by this assignment. Use of this class allows for the full description of the context of this assignment. (MD will write an extension about the levels of belief)
A co-reference assertion may admit a certain degree or strength of belief, such as "possibly", "most likely" etc. This can be modelled using the property /P2 has type/ with a suitable terminology. However, this degree of belief will be common to all statements asserted by one instance of E91 Co-Reference Assignment. Otherwise, the assertion must be broken down into a suitable number of instances with different degrees of belief.
If there exists a document describing particular evidence, this can be referred to by using /P used specific object/. There may be nothing more known about the instance of E1 CRM Entity to which the described statements are assumed to refer than the facts expressed by these very statements.
Frequently, scholars may like to contradict a co-reference statement or point to frequent confusions. This can be modelled using the property /P154 assigned non co-reference to/.
The property /P155 has co-reference target/ allows for associating an ???
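
The rule in the scope note draft above, that one instance of E91 Co-Reference Assignment carries a single degree of belief, can be illustrated with a small Python sketch. The class, its field names, and the "text:..." identifiers are hypothetical and not a CRM encoding. The sketch further assumes that all input statements concern the same referent, so that grouping by belief alone is sufficient.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CoReferenceAssignment:
    """Hypothetical stand-in for one instance of E91 Co-Reference Assignment."""
    belief: str  # stands in for P2 has type, e.g. "possibly"
    referring_objects: set = field(default_factory=set)  # E89 instances


def split_by_belief(statements):
    """Break (object_a, object_b, belief) statements into one assignment
    per degree of belief, assuming they all concern the same referent."""
    grouped = defaultdict(set)
    for obj_a, obj_b, belief in statements:
        grouped[belief].update((obj_a, obj_b))
    return [CoReferenceAssignment(b, objs) for b, objs in grouped.items()]


# Hypothetical input: two statements held with different strength.
statements = [
    ("text:1", "text:2", "most likely"),
    ("text:2", "text:3", "possibly"),
]
assignments = split_by_belief(statements)
```

Two degrees of belief in the input yield two assignments, each with one common degree of belief for everything it asserts.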
In the end, I got confused: The range of P155 can be interpreted as a URI used within the same knowledge base as the instance of E91. Then, it would correspond to a co-reference between some text element and the knowledge base in which we implement the CRM, the "local truth". In that case, also one instance of P153 would make sense, or even two instances of P155 only.
In case we talk about Linked Open Data, the issue becomes more obscure. We could regard the co-reference to be between some text element and the document the URI resolves into. If, however, someone uses this very URI in another context, the question of co-reference is there again.
It appears as if we need a construct to refer to the use of a URI within a knowledge base or RDF document as an instance of Propositional Object. If we follow this line, then the interpretation of P155 pointing to a "self co-reference" would be consistent, and any other meaning of referring to a URI would need a contextualization of the URI to be discussed.
Opinions?

Best,

Martin
--

--------------------------------------------------------------
  Dr. Martin Doerr              |  Vox:+30(2810)391625        |
  Research Director             |  Fax:+30(2810)391638        |
                                |  Email:[email protected]  |
                                                              |
                Center for Cultural Informatics               |
                Information Systems Laboratory                |
                 Institute of Computer Science                |
    Foundation for Research and Technology - Hellas (FORTH)   |
                                                              |
                N.Plastira 100, Vassilika Vouton,             |
                 GR70013 Heraklion,Crete,Greece               |
                                                              |
              Web-site:http://www.ics.forth.gr/isl            |
--------------------------------------------------------------



_______________________________________________
Crm-sig mailing list
[email protected]
http://lists.ics.forth.gr/mailman/listinfo/crm-sig



