On Dec 13, 2007, at 9:10 AM, Don Gourley wrote:

Put another way, if I want to use Net::OAI::Harvester
to read repository data in a form other than DC, will
I need to write an additional module such as
Net::OAI::Record::MARCXML?

I don't know if this is the only way to do it, but that is how
I use Net::OAI::Harvester to handle DIDL metadata that I get
out of a DSpace repository via a custom crosswalk plugin.  The
harvest script simply invokes the Harvester like this:

my $rec = $harvester->getRecord('identifier' => $oaiid,
                                'metadataPrefix' => 'oai_didl',
                                'metadataHandler' => 'DOC_DIDL',
                                'set' => $set);



On Dec 13, 2007, at 9:48 AM, Ed Summers wrote:

Net::OAI::Record::OAI_DC is an example of a SAX filter
which receives SAX events for each metadata record in a
response and builds up a representation of the record.
Since oai_dc is standard in oai-pmh-land it's assumed
as a default a lot of the time.

So if you want to retrieve another kind of metadata you
have to write a SAX filter for it, and then reference
it when you are calling getRecord(), listRecords() or
listAllRecords()....

And here's a barely functional MODSHandler that just
pulls out the title:

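package MODSHandler;

use strict;
use warnings;

# "use base" loads XML::SAX::Base and sets up inheritance in one step.
use base qw(XML::SAX::Base);

sub new {
    my $class = shift;
    # "inside" is true while the parser is between <title> and </title>.
    return bless {inside => 0}, ref($class) || $class;
}

# Accessor for the title text collected so far.
sub title {
    return shift->{title};
}

# Note: Name is the qualified element name, so a prefixed element
# such as <mods:title> will not match 'title' here.
sub start_element {
    my ($self, $element) = @_;
    if ($element->{Name} eq 'title') {$self->{inside} = 1;}
}

sub end_element {
    my ($self, $element) = @_;
    if ($element->{Name} eq 'title') {$self->{inside} = 0;}
}

# Text can arrive in several chunks, so append rather than assign.
sub characters {
    my ($self, $chars) = @_;
    if ($self->{inside}) {
        $self->{title} .= $chars->{Data};
    }
}

1;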
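
A handler like this can be tried out without contacting a repository by
pushing a sample record through a SAX parser locally. The sketch below is
only an illustration under a few assumptions: the package name
MODSTitleHandler and the sample record are made up, XML::SAX::PurePerl
(shipped with XML::SAX) is used as the parser, and the handler matches on
LocalName instead of Name so that prefixed elements such as <mods:title>
are also caught.

```perl
use strict;
use warnings;

# Hypothetical variant of the handler above; not part of Net::OAI::Harvester.
package MODSTitleHandler;
use base qw(XML::SAX::Base);

sub new {
    my $class = shift;
    return bless { inside => 0, title => '' }, ref($class) || $class;
}

sub title { return shift->{title} }

sub start_element {
    my ($self, $element) = @_;
    # LocalName is the element name with any namespace prefix removed.
    $self->{inside} = 1 if $element->{LocalName} eq 'title';
}

sub end_element {
    my ($self, $element) = @_;
    $self->{inside} = 0 if $element->{LocalName} eq 'title';
}

sub characters {
    my ($self, $chars) = @_;
    $self->{title} .= $chars->{Data} if $self->{inside};
}

package main;
use XML::SAX::PurePerl;

# Made-up sample record, used only to exercise the handler.
my $xml = <<'XML';
<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:titleInfo>
    <mods:title>Walden</mods:title>
  </mods:titleInfo>
</mods:mods>
XML

my $handler = MODSTitleHandler->new;
XML::SAX::PurePerl->new(Handler => $handler)->parse_string($xml);
print $handler->title, "\n";
```

Once the handler behaves as expected on sample data, its package name can
be passed as the metadataHandler when calling getRecord(), listRecords()
or listAllRecords(), as described above.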



Thank you for the prompt replies, and y'all have confirmed what I
believed. The "best" way to accomplish my goal is to write a SAX
filter for the metadata schema I desire.

But I'm lazy, and even though it is not the best solution, I will
explore another option. Specifically, I will use oai_dump (which
comes with N::O::H), change the metadata format from oai_dc to
marc21, run the script, and parse the resulting XML. If I'm lucky, I
will be able to write my parser as a SAX filter that can be added to
the N::O::H distribution. In the meantime, at least I will have the
data. Wish me luck.
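
For what it is worth, a parser for the dumped XML could already be shaped
as a SAX handler, so that it can later be tried as a metadataHandler. The
following is a rough sketch only, assuming the dump contains MARCXML
records whose title sits in datafield 245, subfield a. The package name
MARCXMLTitleHandler and the sample record are hypothetical, and the exact
output of oai_dump is not shown here.

```perl
use strict;
use warnings;

# Hypothetical handler; not part of the Net::OAI::Harvester distribution.
package MARCXMLTitleHandler;
use base qw(XML::SAX::Base);

sub new {
    my $class = shift;
    return bless { tag => '', in_title => 0, title => '' },
        ref($class) || $class;
}

sub title { return shift->{title} }

# Look an attribute up by local name, whatever key it is stored under.
sub _attr {
    my ($element, $name) = @_;
    for my $attr (values %{ $element->{Attributes} || {} }) {
        return $attr->{Value} if $attr->{LocalName} eq $name;
    }
    return '';
}

sub start_element {
    my ($self, $element) = @_;
    my $local = $element->{LocalName};
    if ($local eq 'datafield') {
        # Remember which MARC field we are in.
        $self->{tag} = _attr($element, 'tag');
    }
    elsif ($local eq 'subfield') {
        # Only 245 $a is collected in this sketch.
        $self->{in_title} =
            ($self->{tag} eq '245' && _attr($element, 'code') eq 'a') ? 1 : 0;
    }
}

sub end_element {
    my ($self, $element) = @_;
    my $local = $element->{LocalName};
    $self->{in_title} = 0  if $local eq 'subfield';
    $self->{tag}      = '' if $local eq 'datafield';
}

sub characters {
    my ($self, $chars) = @_;
    $self->{title} .= $chars->{Data} if $self->{in_title};
}

package main;
use XML::SAX::PurePerl;

# Made-up MARCXML record standing in for one record from the dump.
my $xml = <<'XML';
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Thoreau, Henry David</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Walden</subfield>
  </datafield>
</record>
XML

my $handler = MARCXMLTitleHandler->new;
XML::SAX::PurePerl->new(Handler => $handler)->parse_string($xml);
print $handler->title, "\n";
```

Keeping the state in the handler object (current field tag, whether we are
inside the wanted subfield) is what lets the same code run against a saved
file now and against a live harvest later.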

--
Eric Lease Morgan
University Libraries of Notre Dame
