Hi,
have a look at the Catmandu framework for Perl.
Install Catmandu (https://metacpan.org/pod/Catmandu) and Catmandu::OAI
(https://metacpan.org/pod/Catmandu::OAI).
# in the Perl script:
use Catmandu::Importer::OAI;

my $importer = Catmandu::Importer::OAI->new(
    url            => "...",
    metadataPrefix => "...",
);

$importer->each(sub {
    my $hashref = $_[0];
    # do something with $hashref...
});
Or directly from the command line:
$ catmandu convert OAI --url http://pub.uni-bielefeld.de/oai to JSON
(The arXiv OAI interface seems to be very slow.)
There's also an importer for arxiv.org: Catmandu::ArXiv
(https://metacpan.org/pod/Catmandu::ArXiv).
Everything is also on GitHub: https://github.com/LibreCat
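
For the "do something with $hashref" part of the script, here is a minimal sketch of a callback. It assumes each record is a hash with an `_id` key and, for oai_dc, array-valued fields such as `title`. Both are assumptions about the record layout, so dump one record first to check. The helper name summarize_record and the sample record are made up for illustration, and the snippet runs without touching the network:

```perl
use strict;
use warnings;

# Hypothetical helper: turn one harvested record into a single
# tab-separated line. The keys "_id" and "title" are assumptions
# about the record layout; adjust them to what your records contain.
sub summarize_record {
    my ($rec) = @_;

    my $id = defined $rec->{_id} ? $rec->{_id} : '(no id)';

    # oai_dc fields may be repeated, so accept a scalar or an array ref
    # and use the first value.
    my $title = $rec->{title};
    $title = $title->[0] if ref($title) eq 'ARRAY';
    $title = '(no title)' unless defined $title && length $title;

    return "$id\t$title";
}

# Made-up sample record, only for trying the helper offline.
my $sample = {
    _id   => 'oai:example.org:1',
    title => ['An example title'],
};
print summarize_record($sample), "\n";
```

With the importer from the script above, the same helper could be plugged in as $importer->each(sub { print summarize_record($_[0]), "\n" });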
Cheers,
Vitali
On 14.01.2014 21:01, Eka Grguric wrote:
Hi,
I am a complete newbie to Perl (and to Code4Lib) and am trying to set up a
harvester to get complete metadata records from OAI-PMH repositories. My
current approach is to use things already built as much as possible -
specifically the Net::OAI::Harvester module
(http://search.cpan.org/~esummers/OAI-Harvester-1.0/lib/Net/OAI/Harvester.pm).
The code I'm using is located in the synopsis and specific parts of it seem to
work with some samples I've tried. For example, if I submit a request for a
list of sets to the OAI URL for arXiv.org (http://arXiv.org/oai2) I get the
correct list.
The error I run into reads "can't call listRecords() on an undefined value in
*filename* line *#*". listRecords() seems to have been an issue in past iterations
but I'm not sure how to get around it.
At the moment it looks like this:
## list all the records in a repository
my $list = $harvester->listRecords(
metadataPrefix = 'oai_dc'
);
Any help (or Perl resources) would be appreciated!
Thanks,
Eka
MLIS Candidate, UBC iSchool
--
Vitali Peil
Fachreferent
PUB <pub.uni-bielefeld.de>
Raum E1-144, Tel. 0521 106 6125
Universitätsbibliothek Bielefeld