Hi,

Have a look at the Catmandu framework for Perl.

Install Catmandu (https://metacpan.org/pod/Catmandu) and Catmandu::OAI (https://metacpan.org/pod/Catmandu::OAI).

# in the Perl script:

use Catmandu::Importer::OAI;

my $importer = Catmandu::Importer::OAI->new(
    url            => "...",
    metadataPrefix => "...",
);

$importer->each(sub {
    my $hashref = $_[0];
    # do something with $hashref...
});
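
For a first look at what the importer returns, a minimal sketch along these lines might help. It assumes the Bielefeld URL from the command-line example and the oai_dc prefix from your snippet; both are only illustrative values, and the record layout is dumped as-is rather than assumed:

```perl
use strict;
use warnings;
use Catmandu::Importer::OAI;
use Data::Dumper;

# Assumed values: the URL is the one from the command-line example,
# oai_dc is the prefix every OAI-PMH repository has to support.
my $importer = Catmandu::Importer::OAI->new(
    url            => "http://pub.uni-bielefeld.de/oai",
    metadataPrefix => "oai_dc",
);

# take(5) limits the run to the first five records;
# each() returns the number of records it has processed.
my $count = $importer->take(5)->each(sub {
    my $record = $_[0];
    print Dumper($record);    # inspect the hashref before relying on its keys
});

print "records seen: $count\n";
```

Once the structure is clear, the Dumper call can be replaced with whatever processing you need.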


or directly from the command line:
$ catmandu convert OAI --url http://pub.uni-bielefeld.de/oai to JSON

(The arXiv OAI interface seems to be very slow.)

There's also an importer for arxiv.org: Catmandu::ArXiv (https://metacpan.org/pod/Catmandu::ArXiv)

Everything is also on GitHub: https://github.com/LibreCat
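
As for the error in your script: Perl reports an "undefined value" when the variable the method is called on (here $harvester) was never assigned an object, so it is worth checking that the constructor call from the synopsis runs before listRecords(). A minimal sketch, assuming Net::OAI::Harvester is installed and using the arXiv base URL from your mail; note that named arguments are written with => rather than =:

```perl
use strict;
use warnings;
use Net::OAI::Harvester;

# Create the harvester first; without this line $harvester stays undefined
# and any method call on it fails.
my $harvester = Net::OAI::Harvester->new(
    baseURL => 'http://arXiv.org/oai2',
);

# Key/value pairs use the fat comma (=>), not an assignment (=).
my $list = $harvester->listRecords(
    metadataPrefix => 'oai_dc',
);

# Walk through the records of this response and print each identifier.
while ( my $record = $list->next() ) {
    print $record->header()->identifier(), "\n";
}
```

As far as I know, listRecords() only fetches the first batch of records; the module also offers listAllRecords() for following resumption tokens, which may take a long time against arXiv.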

Cheers,
Vitali

On 14.01.2014 21:01, Eka Grguric wrote:
Hi,

I am a complete newbie to Perl (and to Code4Lib) and am trying to set up a
harvester to get complete metadata records from OAI-PMH repositories. My
current approach is to use things already built as much as possible,
specifically Net::OAI::Harvester
(http://search.cpan.org/~esummers/OAI-Harvester-1.0/lib/Net/OAI/Harvester.pm).
The code I'm using is taken from the synopsis, and specific parts of it seem to
work with some samples I've tried. For example, if I submit a request for a
list of sets to the OAI URL for arXiv.org (http://arXiv.org/oai2) I get the
correct list.

The error I run into reads "can't call listRecords() on an undefined value in 
*filename* line *#*". listRecords() seems to have been an issue in past iterations 
but I'm not sure how to get around it.

At the moment it looks like this:
  ## list all the records in a repository
      my $list = $harvester->listRecords(
                metadataPrefix = 'oai_dc'
         );

Any help (or Perl resources) would be appreciated!

Thanks,

Eka
MLIS Candidate, UBC iSchool


--
Vitali Peil
Fachreferent
PUB <pub.uni-bielefeld.de>
Raum E1-144, Tel. 0521 106 6125
Universitätsbibliothek Bielefeld
