Dear Richard, Robert,
It is simply wrong that encoding structured data into an rdfs:Literal
makes it invisible to SPARQL. That is exactly what xsd:dateTime does: the
year, month, etc. are individually available to SPARQL queries, not
by magic but through a standard extension mechanism. It is a question for
IT experts to tell us how to make the respective string functions for
other compounds available to SPARQL. If we decide on one standard way to
encode the person name compounds, that would be quite feasible.
Interoperability is in any case given with a trivial mapping, because
standard SPARQL accepts literals of any custom datatype. Of course we
would also provide standard string functions to take the compound apart.
For this discussion we need a fully informed decision.
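To make the idea of a "string function that takes the compound apart" concrete, here is a minimal sketch in Python. It assumes a purely hypothetical encoding in which each name component is introduced by "$" plus a one-letter code; the codes and the function names are invented for this illustration and are not the real MARC 21 assignments. On an RDF platform the equivalent logic would have to be registered through the engine's own extension-function mechanism, which is platform-specific and not shown here.

```python
import re

# Hypothetical subfield codes, chosen only for this illustration;
# they are NOT the real MARC 21 assignments.
COMPONENT_NAMES = {
    "h": "honorific",
    "t": "title",
    "f": "forename",
    "s": "surname",
    "x": "suffix",
}


def split_name_compound(literal):
    """Split a '$'-delimited compound name into (component, value) pairs."""
    parts = []
    # Each component is '$', one code character, then text up to the next '$'.
    for code, value in re.findall(r"\$(\w)([^$]*)", literal):
        name = COMPONENT_NAMES.get(code, "unknown")
        parts.append((name, value.strip()))
    return parts


def components(literal, name):
    """Return all values of one component, e.g. every forename."""
    return [v for n, v in split_name_compound(literal) if n == name]


example = "$hHis Majesty $tDr. $fSnoopy $fHickup $sMiller $xJr"
print(split_name_compound(example))
print(components(example, "forename"))
```

With such a function, a query for all names with forename "Snoopy" becomes a filter on components(literal, "forename") instead of a traversal over several properties.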
We must really be more aware of how badly current RDF platforms still
perform with longer property paths. There are *good reasons* why time,
geometry and others are not encoded with RDF properties.
The first question we have to answer is A) how many compounds we need
that must be queried component-wise. Then we should find B) the *best XML
representation regardless of platforms*. Then we discuss C) how that
should go into RDFS.
I propose for A):
1) miles-yards.... American Standard Lengths: "A mile is *exactly
1.609344* kilometers. Yes, the mile has a metric definition."
(https://www.mathsisfun.com/measure/us-standard-length.html; "metric"
links to https://www.mathsisfun.com/measure/metric-length.html)
2) Person Name compounds,
3) Street address compounds
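For compound 1), the arithmetic behind a component-wise value can be sketched as follows. This is only an illustration, assuming a compound of whole miles plus yards; the function name is hypothetical. The conversion factors are the exact metric definitions (1 yard = 0.9144 m, 1 mile = 1760 yards = 1609.344 m).

```python
from fractions import Fraction

# Exact metric definitions of the international yard and mile.
METRES_PER_YARD = Fraction(9144, 10000)     # 0.9144 m
METRES_PER_MILE = 1760 * METRES_PER_YARD    # 1609.344 m


def miles_yards_to_metres(miles, yards):
    """Convert a miles-plus-yards compound to metres, without rounding."""
    return miles * METRES_PER_MILE + yards * METRES_PER_YARD


print(float(miles_yards_to_metres(1, 0)))   # 1609.344
print(float(miles_yards_to_metres(2, 440)))
```

Fractions are used so that the stated "exactly 1.609344 kilometers" is preserved without floating-point error until the final output.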
I propose for B)
2) following either TEI or RDA guidelines. I do not propose to use MARC
tags as is. The translation into XML elements is trivial syntactic sugar
(and exists, I think). The relevant question is whether the *analysis* is
effective or not.
I propose for C)
to find out if anybody has solved the problem already.
So, does anybody propose a good-practice analysis of name compounds?
Best,
Martin
On 11/22/2018 10:21 AM, Richard Light wrote:
On 21/11/2018 22:43, Robert Sanderson wrote:
All,
My concern with this approach is that standard mechanisms for
interacting with the data will not expect these sorts of compound
values. This would also affect other ongoing discussions, such as
compound monetary amounts or other dimensions.
For example, if there are subfield indicators or XML elements
embedded within a literal, rather than using the model to manage this
information, queries at the model level will not work. If “Dr” is not
a separate Appellation from “Snoopy”, with an appropriate Type
associated with it to ensure it is known to be a prefix rather than a
first name, it will be invisible to SPARQL or any other graph query
language.
For names, which already support partitioning, the answer seems
obvious to me that we should continue to use the model as intended.
The consistency for compound dimensions needs further discussion.
Similarly the value range for dimensions should follow existing
patterns (P81a anyone?) rather than trying to embed one format within
another.
Martin's original suggestion involved identifying contexts where we
could express compound values as a single string. This approach
potentially has merit where such a string, as a whole, is in a format
which is meaningful within existing systems and processible by
existing software. As you say, there is a direct trade-off between
the convenience and structural simplicity of having a single string
(and an associated single 'unit') and the [lack of] potential for
native RDF querying of the contents of that string.
I think it is more of a loss to be unable to query on people with
forename "Richard" than to be unable to query all dimensions involving
'6 inches'. So I agree that we should not pursue this particular line
of thought.
As regards using an XML encoding within a literal, I think this would
be a /really/ bad idea. It would require the provision of an XML
parser and support tools within the context of all RDF serializations
(Turtle, JSON, ...). RDF/XML has provision for embedded XML, but this
wouldn't help for any other serialization of the RDF.
Richard
Rob
*From: *Crm-sig <[email protected]> on behalf of Martin
Doerr <[email protected]>
*Date: *Wednesday, November 21, 2018 at 2:11 PM
*To: *"[email protected]" <[email protected]>
*Subject: *Re: [Crm-sig] ISSUE: representing compound name strings
Dear Richard,
XML is even better. The distinction between XML tags and MARC
subfield markers is not so substantial. An XML file is still a
string. The question is about RDF, putting a compound into rdfs:Literal.
So, again, is there a good practice with XML elements?
Cheers,
Martin
On 11/21/2018 6:58 PM, Richard Light wrote:
On 15/11/2018 21:28, Martin Doerr wrote:
Dear All,
I would expect that the library or archival community does have
a good practice for how to "squeeze" a compound name, such as
"His Majesty Dr. Snoopy Hickup Miller Jr", with respective
separators, into a machine-readable string that could be used
as a custom datatype in an rdfs:Literal as one instance of
Appellation, rather than defining all possible name
constituents as individual RDF properties.
Could be a MARC string? XML? TEI?
This would be very helpful for our users.
Martin,
I'm pretty sure that the most recent attempt at doing this will
be the subfield markers ($a, etc.) in MARC, which date from the
era of punched cards. The requirement that all of the name
appears in a single string will rule out anything that might have
been done in XML (where you might typically use attributes or
subelements) or TEI (which is, after all, simply an XML application).
It's a nice idea, which follows the approach of encoding one
'compound' value as a single string, but I don't think we will
find a ready-made standard for it.
Richard
Best,
Martin
--
*Richard Light*
_______________________________________________
Crm-sig mailing list
[email protected] <mailto:[email protected]>
http://lists.ics.forth.gr/mailman/listinfo/crm-sig
--
------------------------------------
Dr. Martin Doerr
Honorary Head of the
Center for Cultural Informatics
Information Systems Laboratory
Institute of Computer Science
Foundation for Research and Technology - Hellas (FORTH)
N.Plastira 100, Vassilika Vouton,
GR70013 Heraklion,Crete,Greece
Vox:+30(2810)391625
Email:[email protected] <mailto:[email protected]>
Web-site:http://www.ics.forth.gr/isl