Dear Robert,
I may have lost track. I published my final version of the
guidelines before January. A final approval may be pending, but I have
elaborated these properties in much detail. Yes, of course, they have
to be defined; that was the idea of the deprecation. I thought it was
accepted already...:-)
See:
"Whereas the CRM regards that intervals of primitive values are
primitive values by themselves, there is currently no corresponding
practice in RDF. Therefore, in analogy to the properties of E52
Time-Span, we define in CRM RDFS two more subproperties of P90 has
value: “P90a_has_lower_value_limit” and “P90b_has_upper_value_limit”.
Even if we regard complex matrices of numbers as one value for an
instance of E54 Dimension, such as RGB images, we can argue that minimal
and maximal values exist as two separate matrices of the same structure.
The precise guidelines for using these properties are given in the
section “Guidelines for using P90a, P90, P90b” below."
"
*Guidelines for using P90a, P90, P90b*
The CRM recommends to approximate numerical values of Dimensions with
intervals. The range of the respective property "P90 has value" is
defined in the CRM as E60 Number. Whereas the CRM regards that intervals
of primitive values are primitive values by themselves, there is
currently no corresponding practice in RDF. Therefore, in analogy to the
properties of E52 Time-Span, we define in CRM RDFS two more
subproperties of P90 has value: “/P90a_has_lower_value_limit/” and
“/P90b_has_upper_value_limit/”.
The reasons for recommending this approximation are the following: All
scientific measurements of non-discrete values are imprecise because of
the tolerances of the measurement devices, shortcomings in applying the
procedures and the indeterminacy of the measured effect itself. In
natural sciences, important results of measurements are associated with
possibly complex probabilistic distributions for the true value of the
measured effect.
The most complex case relevant for cultural-historical data are the
so-called “battleship curves” for calibrated C14 dating data. Many of
these distribution models actually extend to infinity with non-zero
probability, which is neither practical nor always justified. In the
case of C14 however, the actual width of the distribution is often
underestimated. Nevertheless, even data with a given probabilistic
uncertainty to infinity are typically associated by scientists with
narrower “confidence intervals” at one to three “standard deviations”,
i.e., with a probability of some 68% – 99.7% for the value to be in the
given range (https://en.wikipedia.org/wiki/Standard_deviation).
Whereas querying globally a very large aggregation of
cultural-historical data by time intervals is highly relevant, querying
and reasoning with different approximations of dimensions is normally
restricted to quite narrow questions. For many cases, a medium value
without explicit limits is sufficient for the application, such as the
length of a museum object in millimeters for packaging it in a box.
Nevertheless, querying explicit representation of actual outer limits or
at least reasonably wide confidence intervals is computationally highly
effective, and therefore a good way to ensure recall at query time,
i.e., that the relevant results are contained in the answer to the
query, even if it also contains irrelevant ones.
We therefore recommend using /P90_has_value/ for documenting a medium
value or a value without error estimates, when the precision appears to
be self-evident or irrelevant.
We recommend using /P90a_has_lower_value_limit/ for documenting the
highest explicit lower limit available for the respective value, even if
it provides very wide margins. It is an error to omit the lower limit
even if it appears to be overly pessimistic.
We recommend using /P90b_has_upper_value_limit/ for documenting the
lowest explicit upper limit available for the respective value, even if
it provides very wide margins. It is an error to omit the upper limit
even if it appears to be overly pessimistic.
When approximating probabilistic distributions, we recommend setting
the lower and upper limits at two standard deviations, or such that they
enclose the true value with 95% probability.
/P90a_has_lower_value_limit/ should always be used together with
/P90b_has_upper_value_limit/. If they are used, the property
/P90_has_value/ may be used as well or be omitted."
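To make the recall argument concrete, here is a minimal Python sketch of how the three properties could be held and queried. It is only an illustration under my own assumptions: the class `Dimension` and the helpers `from_gaussian` and `may_match` are hypothetical names, not part of the CRM RDFS, and the Gaussian case simply places the limits at two standard deviations as recommended above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dimension:
    """Hypothetical in-memory stand-in for an instance of E54 Dimension."""
    unit: str
    value: Optional[float] = None  # P90_has_value (optional medium value)
    lower: Optional[float] = None  # P90a_has_lower_value_limit
    upper: Optional[float] = None  # P90b_has_upper_value_limit

    def __post_init__(self):
        # P90a and P90b should always be used together.
        if (self.lower is None) != (self.upper is None):
            raise ValueError("lower and upper limits must be given together")
        if self.lower is not None and self.lower > self.upper:
            raise ValueError("lower limit exceeds upper limit")


def from_gaussian(mean: float, sd: float, unit: str, k: float = 2.0) -> Dimension:
    """Approximate a Gaussian estimate by limits at k standard deviations."""
    return Dimension(unit=unit, value=mean, lower=mean - k * sd, upper=mean + k * sd)


def may_match(dim: Dimension, query_low: float, query_high: float) -> bool:
    """Recall-oriented test: True if the true value could lie in the query range."""
    if dim.lower is not None:
        # Interval overlap: keeps every candidate, possibly with irrelevant ones.
        return dim.lower <= query_high and dim.upper >= query_low
    if dim.value is not None:
        # Only a medium value is documented; fall back to a point test.
        return query_low <= dim.value <= query_high
    return False


length = from_gaussian(mean=100.0, sd=5.0, unit="mm")
print(length.lower, length.upper, may_match(length, 108.0, 120.0))
```

The overlap test deliberately errs on the side of returning too much: a dimension documented with wide limits is still found, which is the point of recording the limits even when they look pessimistic.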
On 6/11/2019 12:56 PM, Robert Sanderson wrote:
Apologies for missing this back in February …
Before the deprecation of P83 and P84 in favor of P191, it was
possible to say that a TimeSpan had a minimum duration of 2 days and a
maximum duration of 4 days by using P83 and P84.
Now there is only a single Dimension related via P191, with the intent
that the value can be an interval.
Given that in the RDF projection of CRM, the value of a Dimension is a
single number (and similarly, the dates are single dates), it is not
possible to express the above without some additional constructions in
that projection.
Thus it seems like we need at least to define P90a_has_minimum_value
and P90b_has_maximum_value as properties of Dimension to be able to
express the interval value. This would be more consistent, and provide
access to the construction for other uses of Dimension, so I’m happy
with the deprecation of the last SIG … but we need to follow through
with the corresponding RDF definitions.
I propose the following properties, which could be defined in the same
document as P81a/b and P82a/b:
P90a_has_minimum_value
This property allows the lowest possible value of an E54 Dimension to
be approximated by an E60 Number primitive.
P90b_has_maximum_value
This property allows the greatest possible value of an E54 Dimension
to be approximated by an E60 Number primitive.
Rob
*From: *Martin Doerr <[email protected]>
*Date: *Saturday, February 23, 2019 at 4:59 PM
*To: *Robert Sanderson <[email protected]>, crm-sig
<[email protected]>
*Subject: *Re: [Crm-sig] Issue 397
Dear Robert,
On 2/23/2019 1:09 AM, Robert Sanderson wrote:
This becomes problematic, unfortunately, in RDF which does not
have a way to natively express a Number that is actually an
interval. The resolution would be to do the same as P81a/b …
which would have the same effect as maintaining P83 and P84, just
not in the model directly.
While I appreciate the theoretical consistency that this change
would add, from an implementation perspective, this would bring
more complexity than value.
I do not understand what increases the complexity: If I have in RDFS
two paths P83-E54-P90 AND P84-E54-P90, and the ambiguity of how to use
P90a, P90b together with these paths, OR I have a single path Pxxx-E54
that splits into P90a, P90b, then, in the end I have again two paths:
Pxxx-E54-P90a AND Pxxx-E54-P90b and no ambiguity about whether to use
P83 or P90a. So where is the added complexity? I see it only reduced,
but I may be wrong!
My second question was if, since we have bound the Dimension already
to temporal durations in the definition of Pxxx, we should express
that by a subclass of E54.
Best,
martin
Overall, I’m not in favor of the deprecation, but am not averse to
adding had_duration separately, with the potential to deprecate 83
and 84 if a holistic approach to date and number intervals can be
devised.
Thanks!
Rob
*From: *Crm-sig <[email protected]> on behalf of Martin Doerr
<[email protected]>
*Date: *Friday, February 15, 2019 at 9:18 AM
*To: *crm-sig <[email protected]>
*Subject: *[Crm-sig] Issue 397
Dear All
As discussed in Berlin, I proposed to deprecate P83, P84, because
it competes with an interval interpretation of P90, and:
Introduce instead Pxxx had duration, Domain: E52 Time-Span,
Range: E54 Dimension
and use P90, P90a, P90b as appropriate
or introduce an Exxx Temporal Duration, subclass of E54
Dimension, and define subproperties in RDFS ending in xsd:duration.
Here is my definition:
*Pxxx had duration (was duration of)*
Domain: E52 Time-Span
Range: E54 Dimension
Quantification: one to one (1,1:1,1)
Scope note: This property describes the length of time
covered by an E52 Time-Span. It allows an E52 Time-Span to be
associated with an E54 Dimension representing duration (i.e. its
inner boundary) independent of the actual beginning and end.
Indeterminacy of the duration value can be expressed by assigning
a numerical interval to the property P90 has value of E54 Dimension.
Examples:
§ the time span of the Battle of Issos 333 B.C.E. (E52) /had
duration/ Battle of Issos minimum duration (E54) has unit (P91)
day (E58) has value (P90) (E60)
In First Order Logic:
Pxxx(x,y) ⊃ E52(x)
Pxxx(x,y) ⊃ E54(y)
*Comments?*
------------------------------------------------------------------------------------------------------
See:
P83 had at least duration (was minimum duration of)
Domain: E52 Time-Span
Range: E54 Dimension
Quantification: one to one (1,1:1,1)
Scope note: This property describes the minimum length of
time covered by an E52 Time-Span.
It allows an E52 Time-Span to be associated with an E54 Dimension
representing its minimum duration (i.e. its inner boundary)
independent of the actual beginning and end.
Examples:
§ the time span of the Battle of Issos 333 B.C.E. (E52) had at
least duration Battle of Issos minimum duration (E54) has unit
(P91) day (E58) has value (P90) 1 (E60)
In First Order Logic:
P83(x,y) ⊃ E52(x)
P83(x,y) ⊃ E54(y)
P84 had at most duration (was maximum duration of)
Domain: E52 Time-Span
Range: E54 Dimension
Quantification: one to one (1,1:1,1)
Scope note: This property describes the maximum length of
time covered by an E52 Time-Span.
It allows an E52 Time-Span to be associated with an E54 Dimension
representing its maximum duration (i.e. its outer boundary)
independent of the actual beginning and end.
Examples:
§ the time span of the Battle of Issos 333 B.C.E. (E52) had at
most duration Battle of Issos maximum duration (E54) has unit
(P91) day (E58) has value (P90) 2 (E60)
In First Order Logic:
P84(x,y) ⊃ E52(x)
P84(x,y) ⊃ E54(y)
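As a rough illustration of the proposal, the two Battle of Issos examples above (P83 with value 1 day, P84 with value 2 days) could be folded into a single Dimension carrying both limits. The following Python sketch is only an assumption of how such a conversion might look: the function `merge_duration_bounds` is a hypothetical helper, and the dictionary keys merely borrow the property labels discussed in this thread.

```python
def merge_duration_bounds(minimum, maximum):
    """Combine a P83-style and a P84-style duration into one interval record.

    Each argument is a (value, unit) pair, e.g. (1, "day").
    """
    min_value, min_unit = minimum
    max_value, max_unit = maximum
    if min_unit != max_unit:
        # No unit conversion is attempted in this sketch.
        raise ValueError("units differ; convert before merging")
    if min_value > max_value:
        raise ValueError("minimum duration exceeds maximum duration")
    return {
        "P91_has_unit": min_unit,
        "P90a_has_lower_value_limit": min_value,
        "P90b_has_upper_value_limit": max_value,
    }


# Battle of Issos: at least 1 day (P83 example), at most 2 days (P84 example).
issos_duration = merge_duration_bounds((1, "day"), (2, "day"))
print(issos_duration)
```

The result is one Dimension reached by a single Pxxx link from the Time-Span, instead of two Dimensions reached by P83 and P84.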
--
------------------------------------
Dr. Martin Doerr
Honorary Head of the
Center for Cultural Informatics
Information Systems Laboratory
Institute of Computer Science
Foundation for Research and Technology - Hellas (FORTH)
N.Plastira 100, Vassilika Vouton,
GR70013 Heraklion,Crete,Greece
Vox:+30(2810)391625
Email: [email protected]
Web-site: http://www.ics.forth.gr/isl