Re: [RDA-L] Linked data

Ed Jones Thu, 03 Feb 2011 09:08:56 -0800

The transcribed fields correspond to ISBD areas 1-4 and 6 (245, 250, 362 [for 
serials; other fields for some other formats], 260, and 4XX). Note fields may 
also contain transcribed data in some cases, but note fields typically consist 
of a single subfield and are already "consolidated" in this sense.  "Dual 
function" fields that serve both as notes and as access points (e.g., 246, 
760-785) might benefit from having these functions disentangled.  For example,


http://lccn.loc.gov/81642892, http://marc21.info/element/246, "Issues for 
<2000-> have also acronym title: CCQ."
http://lccn.loc.gov/81642892, http://marc21.info/element/740, "CCQ"

I would be hesitant to combine data from all transcribed fields into a single 
field, if only because different applications might want the freedom to display 
different subsets of this data.
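
A minimal sketch of that idea in Python, assuming the statements are held as 
plain (record, element, value) tuples. The two triples are the ones given 
above; the select helper and its tag matching are hypothetical, for 
illustration only, and not part of any existing tool.

```python
# Triples copied from the example above: (record, element, value).
TRIPLES = [
    ("http://lccn.loc.gov/81642892", "http://marc21.info/element/246",
     "Issues for <2000-> have also acronym title: CCQ."),
    ("http://lccn.loc.gov/81642892", "http://marc21.info/element/740", "CCQ"),
]


def select(triples, record, wanted_tags):
    """Return the values for one record whose element URI ends in a wanted tag.

    Hypothetical helper: it assumes the MARC tag is the last path segment of
    the element URI, as in the example triples.
    """
    return [
        value
        for subject, element, value in triples
        if subject == record and element.rsplit("/", 1)[-1] in wanted_tags
    ]


record = "http://lccn.loc.gov/81642892"
note_display = select(TRIPLES, record, {"246"})   # a display that shows the note
access_points = select(TRIPLES, record, {"740"})  # an index that wants the access point
```

Because each element stays a separate statement, one application can pick 
the note while another picks only the access point.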

Ed

-----Original Message-----
From: Resource Description and Access / Resource Description and Access 
[mailto:[email protected]] On Behalf Of Karen Coyle
Sent: Thursday, February 03, 2011 8:17 AM
To: [email protected]
Subject: Re: [RDA-L] Linked data

Ed, that is a very interesting approach. If we treat

   New York, NY, Random House, c1998

as a simple string with no data "capabilities" it also emphasizes  
those areas where we would need to create separately actual data if we  
wish to provide services, e.g. around place or date. In fact, this is  
approximately what MARC already does by having data in fixed fields  
that replicates information that is either transcribed or purely  
textual. Taking this further, the entire set of transcribed elements,  
from title proper through date, could be considered a single unit,  
ISBD encoded for display. Everything else could be represented as  
separate data elements.

The downside to this is that it requires some information to be coded  
and carried twice - once as text and once as data. The way to satisfy  
both data and display considerations is to generate displays from data  
(the other way around doesn't work). So a coded date that represents a  
copyright date could be displayed as "c1998".
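
As a rough sketch of generating the display from the data, in Python: the 
CodedDate class and its "kind" codes are assumptions made up for this 
example, not an existing MARC or RDA encoding.

```python
from dataclasses import dataclass


@dataclass
class CodedDate:
    """A date held as data. The 'kind' values are hypothetical codes."""
    year: int
    kind: str  # assumed values: "publication" or "copyright"


def display_date(date):
    """Build the display string from the coded date, never the reverse."""
    prefix = "c" if date.kind == "copyright" else ""
    return f"{prefix}{date.year}"


print(display_date(CodedDate(1998, "copyright")))  # prints c1998
```

The coded year remains usable for sorting or limiting, and the "c" prefix 
exists only in the generated display.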

It seems to me that the number of strictly transcribed fields is very  
small. Is this a full list?

- title proper
- subtitle
- statement of responsibility
- place
- publisher

kc

Quoting Ed Jones <[email protected]>:

> "What we need to capture" may be the key phrase here. There are some  
> MARC fields that would not suffer a loss of information if they were  
> treated as single elements.  For example, while the 260 field  
> consists of several separately delimited elements, these elements  
> are all transcribed (more or less) and the transcribed data is by  
> definition non-standard, dependent entirely on what appears on the  
> source from which they are transcribed.  From a machine (or Semantic  
> Web) point of view, treating such component elements separately just  
> introduces a temptation to treat them as though the data they  
> contained was standardized in some way and so reliable for creating  
> record sets. Treating these transcribed fields as single elements  
> would also obviate the need for relating them to one another in RDA  
> triplets.  If the information in one of the subfields in a  
> transcribed field is really considered useful for creating record  
> sets, it should be coded separately in a standardized way elsewhere  
> in the record.  Otherwise, something like the following should  
> suffice:
>
> http://lccn.loc.gov/75647252, http://marc21.info/element/260, "[New  
> York, etc. Elsevier Inc., etc.]"
>
> Ed Jones
>
>
> -----Original Message-----
> From: Resource Description and Access / Resource Description and  
> Access [mailto:[email protected]] On Behalf Of Karen Coyle
> Sent: Thursday, January 20, 2011 8:37 AM
> To: [email protected]
> Subject: Re: [RDA-L] Linked data
>
> Quoting Jonathan Rochkind <[email protected]>:
>
>
>>
>> What we need is a "data schema" (aka "data dictionary", aka "data
>> vocabulary") that actually semantically captures what we need to
>> capture.
>
> And I will again say what I have said before: I have set up a wiki
> page for such a project, if anyone wants to join in. I don't expect
> that we will be able to actually transform MARC in this informal way,
> but I see it as a way to explore some of the issues (like the one I
> brought up about the uniform title, and which I will add there).
>
> http://futurelib.pbworks.com/w/page/29114548/MARC-elements
>
> If you want to add info, comment, or edit the page, I will need to set
> you up with an ID, I think. Also, I'm trying to figure out how to
> allow comments...
>
> kc
>
>
>
>> That's the hard part, and it necessarily will not be round-trip
>> backwards compatible with MARC.  If we have that, whether we put it
>> in XML or something else doesn't matter. The serialization format
>> itself is, to a large extent, an implementation issue. This is my
>> contention.
>>
>> If you have that, then you can, as Bernhard says 'make it a snap to
>> extract the "title" of the piece represented, unambiguously and
>> independent of context inside the record that only a human reader
>> can unravel.' And, sure, you can do that from an XML format. Just
>> not AACR2-style MarcXML.
>>
>> Jonathan
>>
>>> In the light of this, what we need is a real data format. It may look
>>> not all that different from MARC, but it needs to be understood in
>>> a markedly different way (and RDA supports this view more than AACR2 in
>>> that it clearly leaves textual display (ISBD) outside the rules).
>>> What we do not need, however, is an RDB sort of format, consisting
>>> of a set of interrelated tables. This seems to be what Thomale
>>> understands best. And for many developers, RDB is synonymous with
>>> "database". And that's the other trap into which we ought not fall.
>>>
>>> A true format must, for one thing, make it a snap to extract the
>>> "title" of the piece represented, unambiguously and independent of
>>> context inside the record that only a human reader can unravel.
>>> OTOH, it will never be easy to say and pin down what the title of a
>>> thing is, no matter what syntax you use to record it. In MARC, the
>>> 245 is the most confounded element - no, textual paragraph.
>>>
>>> B.Eversberg
>>
>
>
>
> --
> Karen Coyle
> [email protected] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
>



-- 
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
