Re: [RDA-L] libraries, society and RDA

2008-11-17 Thread Stephens, Owen
  Unless something has changed recently, it seems to be the trend in
 several of the more recent projects. For example, VuFind, Koha,
 Evergreen, at least by default or in their demo configurations, use
 the 1xx heading rather than 245$c statement of responsibility, like
 Worldcat.org as described by Adam Schiff.
 
  Also lacking in many of the new projects is the ability to do
 alphabetical heading browses like those possible at
 http://authorities.loc.gov/ and http://catalog.loc.gov/.

Sometimes the browse options are not so obvious - for instance VuFind
does offer browse in its demo system
(http://vufind.org/demo/Browse/Author), but the simple 'Search' is so
upfront that you might overlook the browse and advanced search options.
Admittedly it looks like the VuFind browse function needs some more
work.

In other cases I've seen examples where you can do 'browse' by first
searching for a blank string (returning the whole catalogue), and then
using the browse facility. This may not be obvious, but there is no
reason why in an implementation this shouldn't be made clearer.
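
To make the idea of an alphabetical heading browse concrete, here is a
minimal sketch in Python. It assumes an in-memory list of headings and a
crude sort key; the function names (normalise, browse) and the sample
headings are made up for illustration and say nothing about how VuFind
or the LoC catalogues actually implement browsing.

```python
import bisect


def normalise(heading: str) -> str:
    """Crude sort key: lowercase, keep only letters, digits and spaces.

    This is an assumption for the sketch, not a real filing rule.
    """
    return "".join(ch for ch in heading.lower() if ch.isalnum() or ch == " ")


def browse(headings, start, before=1, after=3):
    """Return the part of the alphabetical list around where `start` files."""
    ordered = sorted(headings, key=normalise)
    keys = [normalise(h) for h in ordered]
    # Position at which `start` would be inserted to keep the list sorted.
    pos = bisect.bisect_left(keys, normalise(start))
    return ordered[max(0, pos - before): pos + after]


# Sample headings, invented for the example.
headings = [
    "Stephens, Owen",
    "Schiff, Adam",
    "Weinheimer, Jim",
    "Mastraccio, Mary L.",
    "Steinbeck, John",
]

print(browse(headings, "Ste"))
```

The point of the sketch is that the user lands in the middle of the
ordered list and can see the neighbouring headings, rather than getting
a ranked result set back from a keyword search.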

What I'd be more interested in is how the browse indexes like
authorities.loc.gov might be integrated into the user's search
experience. This comes back to how we link our data together, and the
move towards linked data is one of the reasons that I feel RDA moves us
in vaguely the right direction, even if it doesn't get us all the way.

Owen


Re: [RDA-L] libraries, society and RDA

2008-11-14 Thread Stephens, Owen
 Three years or so ago I thought, finally the significance of what
 authority data can do for improving data management is understood but
 more recently it seems to have been lost in the dust. I would add to
 Bernhard and Jim's comments that the rules governing the construction
 of authority data for automated management are long overdue. Too much
 of the data and rules are designed for human intervention. So much of
 the focus in current discussions is on bibliographic records rather
 than authority records, which is really backwards. A name authority
 record should have as much data as possible on a person. Ideally, all
 known works should be added to a name record, with additions over
 time. The relationship to the work should be provided (author,
 director, actor, performer, etc.). Birth and death dates, and other
 identifying information should all be provided in a manner to help
 identify other outside resources like online biography sources.

 Mary L. Mastraccio
 Cataloging  Authorities Librarian
 MARCIVE, Inc.
 San Antonio Texas 78265
 1-800-531-7678
 [EMAIL PROTECTED]

Thanks for highlighting this Mary. I hope the following makes sense (and
is correct!)

I think this is absolutely correct, and chimes extremely well with the
work that Karen, Diane et al. have been doing with RDA - the realisation
is that what is true for Authors or Subjects is true for other
(possibly less complex) aspects of metadata.

Even something as simple as 'physical description' can benefit from a
separation between the metadata record and the detail of the physical
description.

For example, consider 300$b (other physical details), which maps to a
number of RDA elements, including 'Production Method'
(http://www.collectionscanada.gc.ca/jsc/docs/5rda-parta-ch3rev.pdf, p.26
of the document, not as numbered). Karen, Diane et al. have created a
relevant vocabulary at
http://metadataregistry.org/vocabulary/show/id/33.html - click on the
'Concepts' tab to see the list of values. So, rather than inserting the
'literal' value 'lithograph' into 300$b, these vocabularies open up the
possibility of linking to
http://RDVocab.info/termLIst/RDAproductionMethod/1007 - which is the URI
for the term 'lithograph'. RDA refers to this as a 'non-literal value
surrogate' - this comes from the language of the Dublin Core Metadata
Initiative, but basically means that you point at the value using a URI,
rather than using the literal string 'lithograph' (or whatever) as we do
with MARC.

As with using Authority files for Authors etc., this use of separated
vocabularies opens up the possibility of saying more about lithographs -
what they are, alternative terms, translations etc.
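
A rough sketch of the difference, in Python, may help. The URI is the
one quoted above; everything else (the Concept class, the field name
production_method, the alternative label and the French translation) is
invented for illustration and is not taken from the registry or from
RDA itself.

```python
from dataclasses import dataclass, field

# URI for the term 'lithograph', as quoted in the message.
LITHOGRAPH_URI = "http://RDVocab.info/termLIst/RDAproductionMethod/1007"


@dataclass
class Concept:
    """One vocabulary entry: what we can say about a term beyond its label."""
    uri: str
    pref_label: str
    alt_labels: list = field(default_factory=list)
    translations: dict = field(default_factory=dict)


# Hypothetical local copy of a single vocabulary entry.
# The alternative label and translation are illustrative only.
VOCAB = {
    LITHOGRAPH_URI: Concept(
        uri=LITHOGRAPH_URI,
        pref_label="lithograph",
        alt_labels=["lithographic print"],
        translations={"fr": "lithographie"},
    )
}

# MARC-style: the record holds the literal string.
literal_record = {"production_method": "lithograph"}
# Linked style: the record points at the term by URI.
linked_record = {"production_method": LITHOGRAPH_URI}


def index_terms(record, vocab):
    """Return the strings an indexer could use for the production method."""
    value = record["production_method"]
    concept = vocab.get(value)
    if concept is None:
        # A literal: all we know is the string itself.
        return [value]
    # A URI: we can pull in everything the vocabulary says about the term.
    return [concept.pref_label, *concept.alt_labels,
            *concept.translations.values()]


print(index_terms(literal_record, VOCAB))
print(index_terms(linked_record, VOCAB))
```

With the literal, the index only ever sees the one string; with the
URI, a change or addition made once in the vocabulary is available to
every record that points at it.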

A weakness of MARC21 is that it doesn't make use of a reference to the
Authority record in the metadata record - we rely on 'literals' too
much - making it more difficult to ensure consistency, make changes, or
draw into our indexing information not held directly in the MARC record.

I have to admit that my understanding of RDA gets a bit hazy at this
point, but from what I can see it also treats elements such as 'Creator'
as non-literal value surrogates - and anywhere this is true, we can
treat it as a possible point to establish a link to an Authority file.

Owen


Re: [RDA-L] libraries, society and RDA

2008-11-14 Thread Stephens, Owen
Jim,

 

I recognise that you disagree with the report (and me!) about whether
consistency in metadata of judgement is feasible. I would say that
whatever consistency has been achieved in the past (and I'm not
convinced it is as consistent as you suggest here - if it were, then
issues such as FRBRisation, deduping etc. would not be half the problem
they are today) has been achieved by trying to create a 'controlled
environment', with partial success - I believe that the advent of the
web fundamentally changed our ability to control our environment.

 

However, all this aside (which we could I'm sure discuss for some time,
preferably over a glass or several of something nice), what I think we
should be clear on is that the report under discussion does not make an
argument for automatic creation of metadata of judgement - such as
subject headings. The report says explicitly about deep/broad records
relating to metadata of judgement that computer technology is not yet
at a stage to replace human effort in this regard. The report is also
doubtful that automatic creation of 'brief' records of metadata of
judgement is possible at the current time, saying brief judgemental
records are the domain of humans (and maybe computers).

 

Owen

Owen Stephens
Assistant Director: eStrategy and Information Resources
Central Library
Imperial College London
South Kensington Campus
London
SW7 2AZ

t: +44 (0)20 7594 8829
e: [EMAIL PROTECTED]

From: Resource Description and Access / Resource Description and Access
[mailto:[EMAIL PROTECTED] On Behalf Of Weinheimer Jim
Sent: 14 November 2008 12:56
To: RDA-L@INFOSERV.NLC-BNC.CA
Subject: Re: [RDA-L] libraries, society and RDA

 

Owen Stephens wrote:
 
 The question of 'feasibility' takes us beyond a question of whether it
 is 'worth it' to whether it can be done. What the report says is that
 the authors do not believe it is possible to achieve consistency with
 metadata of judgement except within a tightly controlled, narrow and
 consistent environment - and the repository environment is not any of
 these things. This is not just about cost, but about people and their
 behaviour.

No argument here. I was being generous in my original message, but I
went on to say that it is possible to achieve consistency in metadata
of judgement. I will go on to say that people have relied precisely on
this consistency for over a hundred years, if not far longer, and for
someone to say that it isn't feasible is an unjustified conclusion, in
my opinion.

Certainly, this consistency is not 100%, and people must be trained to
do it correctly (I fear that current training in subject analysis and
heading assignment is not improving). For many years, studies have shown
that two different, highly-trained people will assign different subjects
to the same item. My reply is: so what? This ignores the power of the
syndetic structure of the catalog, where users can find related terms
and therefore find everything. Perhaps one cataloger assigns 'Despotism'
while another assigns 'Authoritarianism'; users can still use the
syndetic structure to find the works. Humans may not hit the
bull's-eye each time, but they will come close, and with the use of the
structures, things should be found.
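
The Despotism / Authoritarianism case can be sketched in a few lines of
Python. This is only an illustration: the see-also link, the three
catalogue entries and the function names are invented here and are not
taken from LCSH or from any real catalog.

```python
# Hypothetical see-also link between two headings (not real LCSH data).
SEE_ALSO = {
    "Despotism": {"Authoritarianism"},
}

# Hypothetical catalogue: title -> subject headings assigned by a cataloger.
CATALOGUE = {
    "Book A": {"Despotism"},
    "Book B": {"Authoritarianism"},
    "Book C": {"Military art and science"},
}


def related_headings(term, see_also):
    """Return the term plus every heading linked to it in either direction."""
    found = {term}
    for heading, links in see_also.items():
        if heading == term:
            found |= links
        elif term in links:
            found.add(heading)
    return found


def search(term, catalogue, see_also):
    """Find titles filed under the term or under any related heading."""
    wanted = related_headings(term, see_also)
    return sorted(title for title, subjects in catalogue.items()
                  if subjects & wanted)


print(search("Despotism", CATALOGUE, SEE_ALSO))
print(search("Authoritarianism", CATALOGUE, SEE_ALSO))
```

Whichever of the two headings the user starts from, both works are
found, while the item with the unrelated heading is not.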

Compare this to computer systems automatically assigning terms that are
completely off the mark. Instead of either of the headings above, a
computer may come up with 'Military art and science' or 'x-ray
photography'. I realize that the use and importance of the syndetic
structure are not generally appreciated, and this is probably because
it is so poorly implemented in our current catalogs.

Before concluding that something that has been relied upon for such a
long time is not feasible, a little more work should take place and
the alternatives need to be explored in depth. I will be the first to
agree that deep and profound changes are needed and that automated
subject assignment is improving and may actually work someday.

But not today.

Jim Weinheimer



Re: [RDA-L] libraries, society and RDA

2008-11-14 Thread Stephens, Owen
  A weakness of MARC21 is that it doesn't make use of a reference to
  the Authority record in the metadata record - we rely on 'literals'
  too much - making it more difficult to ensure consistency, make
  changes, or draw into our indexing information not held directly in
  the MARC record.

 MM: No doubt new technologies and formats could allow us to do better
 data management more easily; however, I find arguments that include
 the phrase 'weakness of MARC21' usually look at the current AACR2
 rules or implementation of MARC21 rather than MARC21 itself. There are
 many things that could currently be done in MARC if 1. systems were
 designed to use the data, 2. the data was consistently entered, 3.
 [not required but would make life easier] some modifications were made
 to the MARC fields to allow the same type of data to be entered all in
 one place at one time. For example: Catalogers now have several places
 in the fixed fields to enter form/format information and then again in
 245$h (not to mention other fields with subfield $h), possibly 300,
 6xx subfield $v and then in field 655. This is what causes
 inconsistency in data input and is a major reason why systems are not
 designed to harvest that data. This problem isn't with MARC, so the
 same issues are important when designing and implementing a new
 format.


I think that in this case I am talking about a weakness in MARC21 rather
than AACR2. Possibly it is the implementation - but isn't MARC21 an
implementation of MARC?

As an example, I cannot see how MARC could support a linked data
approach to information stored in fixed fields - one of the places where
it would benefit from it.

However, I do agree that new technologies and formats could allow us to
do better data management more easily! I also believe that even if it is
possible to 'tweak' MARC to allow more use of non-literals this wouldn't
necessarily be easier than designing a new format.

Moving on to your example of authority records - this looks fine to me,
but is this covering the ground that FRBR and FRAD have started on -
which in turn RDA is building on?

Owen


Re: [RDA-L] libraries, society and RDA

2008-11-14 Thread Stephens, Owen
  OS: As an example, I cannot see how MARC could support a linked data
  approach to information stored in fixed fields - one of the places
  where it would benefit from it.

 MM: Systems already use MARC fixed fields information to display icons
 related to format and to refine searches and there are drop down boxes
 with additional information. There is no reason a system design cannot
 use the code in the fixed fields to provide additional information to
 a user.


This is slightly different - I agree systems can be intelligent about
how they use coded information, but I very strongly believe that we need
to use linked data, not codes that need system interpretation. We may be
in danger of violently agreeing on the basic premise here :)
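
To illustrate the distinction, here is a small Python sketch using the
MARC 21 Leader/06 'type of record' code. The handful of codes shown are
standard MARC 21 values; the example.org URIs, the sample leader and the
function names are hypothetical and exist only for this example.

```python
# A few MARC 21 Leader/06 (type of record) codes and their meanings.
# Every system that reads the record needs its own copy of a table like this.
TYPE_OF_RECORD = {
    "a": "Language material",
    "e": "Cartographic material",
    "j": "Musical sound recording",
    "m": "Computer file",
}

# Hypothetical URIs standing in for published vocabulary terms.
TYPE_URIS = {
    code: f"http://example.org/record-type/{code}" for code in TYPE_OF_RECORD
}


def describe_coded(leader: str) -> str:
    """Interpret the one-character code at Leader/06 via the local table."""
    code = leader[6]
    return TYPE_OF_RECORD.get(code, f"unknown code {code!r}")


def to_linked(leader: str):
    """Swap the code for a URI that anything on the web could follow."""
    return TYPE_URIS.get(leader[6])


# Made-up 24-character leader; position 6 holds the type-of-record code.
sample_leader = "00000nam a2200000 a 4500"

print(describe_coded(sample_leader))
print(to_linked(sample_leader))
```

In the coded case the meaning lives in the software that reads the
record; in the linked case the record itself points at a place where
the meaning (and anything else said about the term) can be found.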

 This IS covering the ground that FRBR and FRAD claim to be covering
 but I and many others are growing frustrated in not seeing any
 progress beyond the theorizing that began years ago and virtually
 nothing on FRAD (where some of us think we need to start). Progress
 might be happening in secret somewhere but those of us waiting eagerly
 in the wings are beginning to think it isn't going to happen, or if it
 does, it will not be as well designed as it could be. I do think RDA
 is trying to build on the FRBR/FRAD effort but their work might be
 easier if they had a clearer structure around which to build their
 documentation. Yes, I know that RDA is supposed to free us from
 structure, but we all know there has to be some structure or they can
 reduce RDA to 'put in what you want'.


I suspect that many would agree - it is frustratingly slow. However, as
we can see from discussion on this list, there is still much work to be
done to persuade people that (a) change is needed and (b) RDA is a step
in vaguely the right direction.

As you probably know, the Working Group on the Future of Bibliographic
Control seemed to recommend that more fundamental work needed to be
done to 'test' FRBR, and that work on RDA should be suspended until
this was done:

3.2.5.1 JSC: Suspend further new development work on RDA
until a) the use and business cases for moving to RDA have
been satisfactorily articulated, b) the presumed benefits of
RDA have been convincingly demonstrated, and c) more,
large-scale, comprehensive testing of FRBR as it relates to
proposed provisions of RDA has been carried out against real
cataloging data, and the results of those tests have been
analyzed (see 4.2.1 below)

However, LoC together with NAL and NLM responded by saying that:

Until the completion of the rules and the availability of the RDA online
tool, reviewers will not be able fully to assess their impact on:
- Description, access, and navigation practices for a broad array of
  users and types of materials
- Current and future electronic carriers and information management
  systems to support RDA goals
- Estimated costs for implementation and maintenance during a time of
  flat, even reduced, budgets

This essentially says that RDA should be completed, and then an
assessment carried out. Now that it has been announced that there will
be a delay to the 'online tool', I'm not sure whether this will change
the statement above (i.e. will an assessment of impact be possible with
only the text available?).

It feels like a real chicken and egg scenario - we are unlikely to see
substantial effort going into systems supporting RDA unless it is
adopted by the community. The community is unlikely to adopt RDA if
there are no systems that support it...

Owen