Hi Maximilian,

Yes, sure! It is self-evident that you would not state derived links manually. The process we describe here is based on scholarly insight. Only those links are described that result
directly from insight and cannot already be deduced by transitivity.
The minimal number of links is always the same.
The deductions, the rest of the n*(n-1) links, can be managed by a program, or even be calculated at query time from the original statements.
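
To illustrate the point about deductions, here is a minimal sketch of how a program could derive the remaining links from the asserted ones. It assumes co-reference is treated as symmetric and transitive; the function names and the "ref:" identifiers are hypothetical and not part of the CRM.

```python
from itertools import permutations


def coreference_classes(asserted_links):
    """Group references into classes using only the asserted links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in asserted_links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    classes = {}
    for ref in list(parent):
        classes.setdefault(find(ref), set()).add(ref)
    return list(classes.values())


def derived_links(asserted_links):
    """Yield every ordered pair implied by symmetry and transitivity."""
    for members in coreference_classes(asserted_links):
        yield from permutations(sorted(members), 2)


# Hypothetical data: four references to one entity, three asserted links.
asserted = [("ref:a", "ref:b"), ("ref:b", "ref:c"), ("ref:c", "ref:d")]
all_pairs = set(derived_links(asserted))
print(len(asserted), len(all_pairs))  # prints "3 12"
```

With n = 4 references, the curator states n-1 = 3 links and the program produces all n*(n-1) = 12 ordered pairs on demand, so nothing derived has to be stored or curated.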

A "high error rate in manual curation" does not make sense for the rest:
We are talking here about primary knowledge from scholarly research. There is no source other than curatorial knowledge, regardless of whether an identity is detected manually or proposed
by an automated "instance matching algorithm", which produces only identity assumptions.

See also:

1. Meghini, C., Doerr, M., & Spyratos, N. (2009). Managing Co-reference
   Knowledge for Data Integration. /Proceedings of the 2009 conference
   on Information Modelling and Knowledge Bases XX/, (pp. 224-244).
   Amsterdam, The Netherlands: IOS Press
   (ISBN 978-1-58603-957-8), (pdf
   <http://www.ics.forth.gr/_publications/ejc08-final.pdf>).
2. Doerr, M., Meghini, C., & Spyratos, N. (2007). Leveraging on
   Associations - a New Challenge for Digital Libraries. /In Proc. of
   the First International Workshop on Digital Libraries Foundations, in
   conjunction with ACM IEEE Joint Conference on Digital Libraries
   (JCDL 2007)/, Vancouver, Canada, 23 June. (pdf
   <http://www.ics.forth.gr/_publications/Martin%20Canada%20paper%20-%20Leveraging.pdf>).


See VIAF: they have created over 30% of the co-reference network manually. Only 69% could be done by an algorithm, and that algorithm was manually (!) confirmed to be sufficiently reliable. Simple references to "unknown objects" miss the point: who said that they are the same?

I take your point and will add this intention to the scope note!

Best,

Martin


On 26/3/2014 10:15, Maximilian Schich wrote:
A note for whoever collects co-references: the number of co-reference links explodes as n*(n-1) with the number n of references to the referenced object. Imagine co-reference links between all books citing the Bible. This is likely to result in a high error rate, especially in manual curation.

Workaround: Use simple references to an "unknown/potential reference object" and put a probability on those links => scales with n.
Max Schich
On 2014-03-26 14:26, martin wrote:
Dear All,

Here is my homework:


      E91 Co-Reference Assignment

Subclass of: E13 Attribute Assignment

Scope note: This class comprises actions of making the assertion whether two or more particular instances of E89 Propositional Object refer to the same instance of E1 CRM Entity. The assertion is based on the assumption that this was an implicit fact being made explicit by this assignment. Use of this class allows for the full description of the context of this assignment. (MD will write an extension about the levels of belief)

A co-reference assertion may admit a certain degree or strength of belief, such as "possibly", "most likely" etc. This can be modelled using the property /P2 has type/ with a suitable terminology. However, this degree of belief will be common to all statements asserted by one instance of E91 Co-Reference Assignment. Otherwise, the assertion must be broken down into a suitable number of instances with different degrees of belief.
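
As a rough sketch of the "broken down into a suitable number of instances" rule: the class and function below are hypothetical stand-ins, not CRM definitions, and they assume each input statement names a reference, a key for the entity it is believed to denote, and a belief term that would be attached via /P2 has type/.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CoReferenceAssignment:
    """Hypothetical stand-in for one instance of E91 Co-Reference Assignment."""
    target: str   # assumed key for the entity the references denote
    belief: str   # the single degree of belief shared by the whole instance
    references: set = field(default_factory=set)


def split_by_belief(statements):
    """Create one assignment per (target, belief) combination."""
    groups = defaultdict(set)
    for reference, target, belief in statements:
        groups[(target, belief)].add(reference)
    return [
        CoReferenceAssignment(target, belief, refs)
        for (target, belief), refs in groups.items()
    ]


# Hypothetical statements with mixed degrees of belief about one entity.
statements = [
    ("text:1", "entity:x", "most likely"),
    ("text:2", "entity:x", "most likely"),
    ("text:3", "entity:x", "possibly"),
]
assignments = split_by_belief(statements)
for assignment in assignments:
    print(assignment.belief, sorted(assignment.references))
```

Here the three statements end up in two assignment instances, because "possibly" and "most likely" cannot share one instance.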

If there exists a document describing particular evidence, this can be referred to by using /P used specific object/. Nothing more may be known about the instance of E1 CRM Entity to which the described statements are assumed to refer than the facts expressed by these very statements.

Frequently, scholars may like to contradict a co-reference statement or point to frequent confusions. This can be modelled using the property /P154 assigned non co-reference to/.

The property /P155 has co-reference target/ allows for associating an ???


In the end, I got confused: The range of P155 can be interpreted as a URI used within the same knowledge base as the instance of E91. Then, it would correspond to a co-reference between some text element and the knowledge base in which we implement the CRM, the "local truth". In that case, a single instance of P153 would also make sense, or even two instances of P155 alone. In case we talk about Linked Open Data, the issue becomes more obscure. We could regard the co-reference to be between some text element and the document the URI resolves to. If, however, someone uses this very URI in another context, the question of co-reference arises again.

It appears as if we need a construct to refer to the use of a URI within a knowledge base or RDF document as an instance of Propositional Object. If we follow this line, then the interpretation of P155 pointing to a "self co-reference" would be consistent, and any other meaning of referring to a URI would need a contextualization of the URI to be discussed.

Opinions?

Best,

Martin
--

--------------------------------------------------------------
  Dr. Martin Doerr              |  Vox:+30(2810)391625        |
  Research Director             |  Fax:+30(2810)391638        |
                                |  Email:[email protected]  |
                                                              |
                Center for Cultural Informatics               |
                Information Systems Laboratory                |
                 Institute of Computer Science                |
    Foundation for Research and Technology - Hellas (FORTH)   |
                                                              |
                N.Plastira 100, Vassilika Vouton,             |
                 GR70013 Heraklion,Crete,Greece               |
                                                              |
              Web-site:http://www.ics.forth.gr/isl            |
--------------------------------------------------------------



_______________________________________________
Crm-sig mailing list
[email protected]
http://lists.ics.forth.gr/mailman/listinfo/crm-sig


