Dear Robert,

I may have lost track. I published my final version of the guidelines before January. A final approval may be pending, but I have elaborated these properties in much detail. Yes, of course, they have to be defined; that was the idea of the deprecation. I thought it was accepted already...:-)

See:

"Whereas the CRM regards that intervals of primitive values are primitive values by themselves, there is currently no corresponding practice in RDF. Therefore, in analogy to the properties of E52 Time-Span, we define in CRM RDFS two more subproperties of P90 has value: “P90a_has_lower_value_limit” and “P90b_has_upper_value_limit”. Even if we regard complex matrices of numbers as one value for an instance of E54 Dimension, such as RGB images, we can argue that minimal and maximal values exist as two separate matrices of the same structure. The precise guidelines for using these properties are given in the section “Guidelines for using P90a, P90, P90b” below."

"


 *Guidelines for using P90a, P90, P90b*

The CRM recommends approximating numerical values of Dimensions with intervals. The range of the respective property "P90 has value" is defined in the CRM as E60 Number. Whereas the CRM regards that intervals of primitive values are primitive values by themselves, there is currently no corresponding practice in RDF. Therefore, in analogy to the properties of E52 Time-Span, we define in CRM RDFS two more subproperties of P90 has value: “/P90a_has_lower_value_limit/” and “/P90b_has_upper_value_limit/”.

The reasons for recommending this approximation are the following: All scientific measurements of non-discrete values are imprecise because of the tolerances of the measurement devices, shortcomings in applying the procedures and the indeterminacy of the measured effect itself. In natural sciences, important results of measurements are associated with possibly complex probabilistic distributions for the true value of the measured effect.

The most complex case relevant for cultural-historical data is that of the so-called “battleship curves” for calibrated C14 dating data. Many of these distribution models actually extend to infinity with non-zero probability, which is neither practical nor always justified. In the case of C14, however, the actual width of the distribution is often underestimated. Nevertheless, even data with a given probabilistic uncertainty to infinity are typically associated by scientists with narrower “confidence intervals” at one to three “standard deviations”, i.e., with a probability of some 68% – 99.7% for the value to be in the given range (https://en.wikipedia.org/wiki/Standard_deviation).

Whereas querying a very large global aggregation of cultural-historical data by time intervals is highly relevant, querying and reasoning with different approximations of dimensions is normally restricted to quite narrow questions. For many cases, a medium value without explicit limits is sufficient for the application, such as the length of a museum object in millimeters for packaging it in a box. Nevertheless, querying explicit representations of actual outer limits, or at least of reasonably wide confidence intervals, is computationally highly effective, and therefore a good way to ensure recall at query time, i.e., that the relevant results are contained in the answer to the query, even if it also contains irrelevant ones.

We therefore recommend using /P90_has_value/ for documenting a medium value or a value without error estimates, when the precision appears to be self-evident or irrelevant.

We recommend using /P90a_has_lower_value_limit/ for documenting the highest explicit lower limit available for the respective value, even if it provides very wide margins. It is an error to omit the lower limit even if it appears to be overly pessimistic.

We recommend using /P90b_has_upper_value_limit/ for documenting the lowest explicit upper limit available for the respective value, even if it provides very wide margins. It is an error to omit the upper limit even if it appears to be overly pessimistic.

When approximating probabilistic distributions, we recommend setting the lower and upper limits at two standard deviations, or so that they enclose the true value with 95% probability.

/P90a_has_lower_value_limit/ should always be used together with /P90b_has_upper_value_limit/. If they are used, the property /P90_has_value/ may be used as well or be omitted."
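
To illustrate the guidelines quoted above, here is a minimal Python sketch. It derives the two limits from a measurement reported as a mean and a standard deviation, enforces the pairing of P90a with P90b, and shows the recall-oriented overlap test that querying on the limits allows. The class `DimensionValue`, the helper functions and the sample numbers are hypothetical; only the property names come from the guidelines. The sketch assumes a symmetric two-standard-deviation interval, and two of its checks (marked in the comments) are assumptions that the guidelines do not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DimensionValue:
    """Hypothetical in-memory stand-in for the values of one E54 Dimension."""
    value: Optional[float] = None  # P90_has_value (may be omitted)
    lower: Optional[float] = None  # P90a_has_lower_value_limit
    upper: Optional[float] = None  # P90b_has_upper_value_limit

    def __post_init__(self):
        # From the guidelines: P90a and P90b should always be used together.
        if (self.lower is None) != (self.upper is None):
            raise ValueError("P90a and P90b must be given together")
        # Assumption of this sketch: at least one value is documented.
        if self.lower is None and self.value is None:
            raise ValueError("give P90, or P90a and P90b, or all three")
        # Assumption of this sketch: a sanity check on the interval.
        if self.lower is not None and self.lower > self.upper:
            raise ValueError("lower limit exceeds upper limit")


def from_mean_and_sigma(mean: float, sigma: float) -> DimensionValue:
    """Keep the limits at two standard deviations around the mean."""
    return DimensionValue(value=mean,
                          lower=mean - 2 * sigma,
                          upper=mean + 2 * sigma)


def may_fall_within(dim: DimensionValue, query_low: float, query_high: float) -> bool:
    """Recall-oriented test: does the documented interval overlap the query range?"""
    low = dim.lower if dim.lower is not None else dim.value
    high = dim.upper if dim.upper is not None else dim.value
    return low <= query_high and high >= query_low


length = from_mean_and_sigma(mean=120.0, sigma=1.5)
print(length.lower, length.upper)              # 117.0 123.0
print(may_fall_within(length, 122.0, 130.0))   # True
```

The overlap test returns a hit whenever the true value could lie in the queried range, so it may also return irrelevant results, which is the trade-off in favour of recall that the guidelines describe.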


On 6/11/2019 12:56 PM, Robert Sanderson wrote:

Apologies for missing this back in February …

Before the deprecation of P83 and P84 in favor of P191, it was possible to say that a TimeSpan had a minimum duration of 2 days and a maximum duration of 4 days by using P83 and P84.

Now there is only a single Dimension related via P191, with the intent that the value can be an interval.

Given that in the RDF projection of CRM, the value of a Dimension is a single number (and similarly, the dates are single dates), it is not possible to express the above without some additional constructions in that projection.

Thus it seems like we need at least to define P90a_has_minimum_value and P90b_has_maximum_value as properties of Dimension to be able to express the interval value. This would be more consistent, and provide access to the construction for other uses of Dimension, so I’m happy with the deprecation of the last SIG … but we need to follow through with the corresponding RDF definitions.

I propose the following properties, which could be defined in the same document as P81a/b and P82a/b:

P90a_has_minimum_value

This property allows the lowest possible value of an E54 Dimension to be approximated by an E60 Number primitive.

P90b_has_maximum_value

This property allows the greatest possible value of an E54 Dimension to be approximated by an E60 Number primitive.

Rob

*From: *Martin Doerr <[email protected]>
*Date: *Saturday, February 23, 2019 at 4:59 PM
*To: *Robert Sanderson <[email protected]>, crm-sig <[email protected]>
*Subject: *Re: [Crm-sig] Issue 397

Dear Robert,

On 2/23/2019 1:09 AM, Robert Sanderson wrote:

    This becomes problematic, unfortunately, in RDF which does not
    have a way to natively express a Number that is actually an
    interval.  The resolution would be to do the same as P81a/b …
    which would have the same effect as maintaining P83 and P84, just
    not in the model directly.

    While I appreciate the theoretical consistency that this change
    would add, from an implementation perspective, this would bring
    more complexity than value.

I do not understand what increases the complexity: if I have in RDFS two paths, P83-E54-P90 AND P84-E54-P90, plus the ambiguity of how to use P90a, P90b together with these paths, OR I have a single path Pxxx-E54 that splits into P90a, P90b, then in the end I again have two paths, Pxxx-E54-P90a AND Pxxx-E54-P90b, and no ambiguity about whether to use P83 or P90a.

So where is the added complexity? I see it only reduced, but I may be wrong!
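
The path count in this argument can be made concrete with a small sketch. The Python below writes both modellings of the Battle of Issos example (at least 1 day, at most 2 days) as plain subject-property-object tuples and lists the paths from the time-span to each number. The node names `ts`, `d_min`, `d_max` and `d` are hypothetical, `Pxxx` is the placeholder used in this thread, and no RDF library is assumed.

```python
# Old modelling: two Dimensions, reached through P83 and P84.
old = [
    ("ts", "P83", "d_min"), ("d_min", "P91", "day"), ("d_min", "P90", 1),
    ("ts", "P84", "d_max"), ("d_max", "P91", "day"), ("d_max", "P90", 2),
]

# Proposed modelling: one Dimension whose value splits into P90a and P90b.
new = [
    ("ts", "Pxxx", "d"), ("d", "P91", "day"),
    ("d", "P90a", 1), ("d", "P90b", 2),
]

VALUE_PROPERTIES = {"P90", "P90a", "P90b"}


def value_paths(triples, start="ts"):
    """List the property paths from the time-span to each numeric value."""
    paths = []
    for s, p, o in triples:
        if s != start:
            continue
        # Follow the link to the Dimension node and collect its values.
        for s2, p2, o2 in triples:
            if s2 == o and p2 in VALUE_PROPERTIES:
                paths.append((p, p2, o2))
    return paths


print(value_paths(old))  # [('P83', 'P90', 1), ('P84', 'P90', 2)]
print(value_paths(new))  # [('Pxxx', 'P90a', 1), ('Pxxx', 'P90b', 2)]
```

Both modellings yield two value paths. In this sketch the second one needs a single Dimension node and a single unit statement, so four statements instead of six.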

My second question was if, since we have bound the Dimension already to temporal durations in the definition of Pxxx, we should express that by a subclass of E54.

Best,

martin

    Overall, I’m not in favor of the deprecation, but am not averse to
    adding had_duration separately, with the potential to deprecate 83
    and 84 if a holistic approach to date and number intervals can be
    devised.

    Thanks!

    Rob

    *From: *Crm-sig <[email protected]> on behalf of Martin Doerr
    <[email protected]>
    *Date: *Friday, February 15, 2019 at 9:18 AM
    *To: *crm-sig <[email protected]>
    *Subject: *[Crm-sig] Issue 397

    Dear All

    As discussed in Berlin, I proposed to deprecate P83, P84, because
    they compete with an interval interpretation of P90, and:

    Introduce instead Pxxx had duration, Domain:  E52 Time-Span,
    Range: E54 Dimension
    and use the P90, P90a, P90b as adequate

    or introduce  an Exxx Temporal Duration , subclass of E54
    Dimension, and define subproperties in RDFS ending in xsd:duration.

    Here my definition:

    *Pxxx had duration (was duration of)*

    Domain: E52 Time-Span

    Range: E54 Dimension

    Quantification: one to one (1,1:1,1)

    Scope note:         This property describes the length of time
    covered by an E52 Time-Span. It allows an E52 Time-Span to be
    associated with an E54 Dimension representing duration (i.e. its
    inner boundary) independent from the actual beginning and end.
    Indeterminacy of the duration value can be expressed by assigning
    a numerical interval to the property P90 has value of E54 Dimension.

    Examples:

    §  the time span of the Battle of Issos 333 B.C.E. (E52) /had
    duration/ Battle of Issos minimum duration (E54) has unit (P91)
    day (E58) has value (P90) (E60)

    In First Order Logic:

    Pxxx(x,y) ⊃ E52(x)

    Pxxx(x,y) ⊃E54(y)

    *Comments?*

    
------------------------------------------------------------------------------------------------------

    See:

    P83 had at least duration (was minimum duration of)

    Domain: E52 Time-Span

    Range: E54 Dimension

    Quantification: one to one (1,1:1,1)

    Scope note:         This property describes the minimum length of
    time covered by an E52 Time-Span.

    It allows an E52 Time-Span to be associated with an E54 Dimension
    representing its minimum duration (i.e. its inner boundary)
    independent from the actual beginning and end.

    Examples:

    §  the time span of the Battle of Issos 333 B.C.E. (E52) had at
    least duration Battle of Issos minimum duration (E54) has unit
    (P91) day (E58) has value (P90) 1 (E60)

    In First Order Logic:

    P83(x,y) ⊃ E52(x)

    P83(x,y) ⊃ E54(y)


    P84 had at most duration (was maximum duration of)

    Domain: E52 Time-Span

    Range: E54 Dimension

    Quantification: one to one (1,1:1,1)

    Scope note:         This property describes the maximum length of
    time covered by an E52 Time-Span.

    It allows an E52 Time-Span to be associated with an E54 Dimension
    representing its maximum duration (i.e. its outer boundary)
    independent from the actual beginning and end.

    Examples:

    §  the time span of the Battle of Issos 333 B.C.E. (E52) had at
    most duration Battle of Issos maximum duration (E54) has unit
    (P91) day (E58) has value (P90) 2 (E60)

    In First Order Logic:

    P84(x,y) ⊃ E52(x)

    P84(x,y) ⊃ E54(y)

--
    ------------------------------------
      Dr. Martin Doerr
      Honorary Head of the
      Center for Cultural Informatics
      Information Systems Laboratory
      Institute of Computer Science
      Foundation for Research and Technology - Hellas (FORTH)
      N.Plastira 100, Vassilika Vouton,
      GR70013 Heraklion,Crete,Greece
      Vox:+30(2810)391625
      Email:[email protected]
      Web-site:http://www.ics.forth.gr/isl
_______________________________________________
Crm-sig mailing list
[email protected]
http://lists.ics.forth.gr/mailman/listinfo/crm-sig


