I think you have to check for warnings as you read each record, so try moving your error-handling code to right after the $batch->next() call. But Robin's suggestion is good advice, and is probably a more robust way to handle the crud that can show up in a file of MARC records.
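Here's a rough, untested sketch of what I mean, assuming the records arrive on STDIN and the output goes to STDOUT. The name process_batch and the counters are made up for illustration; strict_off(), warnings_off(), warnings() and next() are the MARC::Batch methods. As I understand the docs, warnings() returns whatever has accumulated and clears the buffer, so calling it right after next() reports on the record just read.

```perl
use strict;
use warnings;
use MARC::Batch;

# Hypothetical helper: walk a batch, report warnings per record, and
# pass records without an 008 through unchanged.
# Returns (records seen, records lacking an 008).
sub process_batch {
    my ($batch, $out) = @_;

    $batch->strict_off();      # keep going past records with bad data
    $batch->warnings_off();    # don't auto-print; we report below

    my ($seen, $no_008) = (0, 0);
    while ( my $record = $batch->next() ) {
        $seen++;

        # Check warnings here, per record, rather than once at the end.
        if ( my @warnings = $batch->warnings() ) {
            print STDERR "record $seen: $_\n" for @warnings;
        }

        # Guard before calling as_string() on a field that may be absent.
        my $field = $record->field('008');
        unless ($field) {
            print STDERR "record $seen: no 008 field\n";
            $no_008++;
            print {$out} $record->as_usmarc();    # write it unchanged
            next;
        }

        my $field_008 = $field->as_string();
        # ... leader / 008 checks and changes would go here ...

        print {$out} $record->as_usmarc();
    }
    return ( $seen, $no_008 );
}

binmode STDIN;
binmode STDOUT;
# A glob reference is used instead of the bareword STDIN so that this
# also compiles under "use strict".
my ( $seen, $no_008 ) =
    process_batch( MARC::Batch->new( 'USMARC', \*STDIN ), \*STDOUT );
print STDERR "$seen records read, $no_008 without an 008\n";
```

Pipe your dump command into it and redirect STDOUT to the output file. The diagnostics go to STDERR so they don't end up mixed into the MARC output.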
-Tim

On Fri, May 30, 2014 at 5:20 AM, Stefano Bargioni <bargi...@pusc.it> wrote:

> If I'm not wrong,
>     $batch->strict_off();
> will keep your loop from printing warnings and from stopping record
> processing.
> HTH. Stefano
>
> On 29 May 2014, at 23:13, John E Guillory wrote:
>
> > Thanks Timothy for your help.
> >
> > When processing about 5 million records I would expect some crazy
> > records. The new script (incorporating Timothy’s suggestions) exited
> > prematurely on record 85,877 with: “Warnings detected: Entirely empty
> > subfield found in tag 260”. I know 260 is publication stuff, but it’s
> > not “required”. I’m deliberately printing warnings, but again the
> > script exited prematurely.
> >
> > Thanks for assistance.
> >
> > John
> >
> > *From:* Timothy Prettyman [mailto:timo...@umich.edu]
> > *Sent:* Thursday, May 29, 2014 11:23 AM
> > *To:* John E Guillory
> > *Cc:* perl4lib@perl.org
> > *Subject:* Re: sending marc records into a script that uses MARC::Batch
> >
> > For your first question, instead of:
> >
> >     $batch = MARC::Batch->new('USMARC',<STDIN>);
> >
> > use:
> >
> >     $batch = MARC::Batch->new('USMARC',STDIN);
> >
> > For your second, the error is likely caused when a field you're using
> > as_string() on doesn't exist in the record.
> >
> > So, you could do something like the following:
> >
> >     $field = $record->field('008');
> >     $field or do {                          # check for existence of field
> >         print "no 008 field for record\n";  # no field
> >         next;                               # skip the record (or whatever)
> >     };
> >     $field_008 = $field->as_string();
> >
> > Hope this helps
> >
> > -Tim
> >
> > Timothy Prettyman
> > LIT/Library Systems
> > University of Michigan
> >
> > On Thu, May 29, 2014 at 12:08 PM, John E Guillory <jo...@lsu.edu> wrote:
> >
> > Hello,
> >
> > Two questions please:
> >
> > 1. I’ve written a script that opens a MARC file for reading using this
> > syntax:
> >
> >     $file = $ARGV[0];
> >     $batch = MARC::Batch->new('USMARC',$file);
> >
> > It then loops through the records using this syntax:
> >
> >     while ( $record = $batch->next()) {
> >         ….. check position 6, 7 of leader and position 23 of 008 and make some changes
> >     }
> >
> > This works great. However, instead of accessing the file this way, I
> > want to pipe the output of a previously run marc dump command directly
> > into this script.
> >
> > I understand that this can be done using this syntax:
> > while ($line = <STDIN>) { … }, but I don’t understand how to use that
> > STDIN with “MARC::Batch->new('USMARC',$file);”. This does not work:
> >
> >     $batch = MARC::Batch->new('USMARC',<STDIN>);
> >
> > 2. My current script successfully reads and processes a MARC file of
> > over 5 gigs! ...but it exits entirely on record 160,585 with the error
> > from MARC::Batch, “Can't call method "as_string" on an undefined value
> > at ./marc_batch.pl”. Documentation on using MARC::Batch says that to
> > tell it to continue processing even when errors are encountered one
> > should use strict_off(), then print/report warnings at the bottom of
> > the script. I don’t think my particular error is being handled by the
> > strict_off() setting. Does anybody know what causes, and how to fix,
> > the “Can’t call method as_string” error? Full script below. It’s
> > pretty short, thanks to MARC::Batch.
> >
> > Thanks for insights!
> >
> >     use MARC::Batch;
> >
> >     $file = $ARGV[0];
> >     chomp($file);
> >
> >     $batch = MARC::Batch->new('USMARC',$file);
> >     $batch->strict_off(); # otherwise script exits when encounters errors
> >
> >     open(OUT,'>new_marc');
> >
> >     while ( $record = $batch->next()) {
> >         $leader = $record->leader();
> >         $leader_pos_6 = substr($leader,6,1);
> >         $leader_pos_7 = substr($leader,7,1);
> >
> >         $field = $record->field('008');
> >         $field_008 = $field->as_string();
> >         $field_008_position_23 = substr($field_008,23,1);
> >
> >         if ( ($leader_pos_6 eq "a") && ($leader_pos_7 eq "m") && ($field_008_position_23 eq "o") || ($field_008_position_23 eq "s") ) {
> >
> >             $control_num = $record->field('001');
> >             $control_num = $control_num->as_string();
> >
> >             print "008 position 23: $field_008_position_23 \n";
> >             print "OLD leader: $leader \n";
> >             $old_leader = $leader;
> >             substr($leader,6,1) = 'm';
> >             print "NEW leader: $leader \n";
> >
> >             print OUT $record->as_usmarc();
> >             print "$control_num|$old_leader|$leader|$field_008\n";
> >
> >         } else { # not a match so just print this one unchanged…
> >             print OUT $record->as_usmarc();
> >         }
> >
> >     }
> >
> >     # handles errors:
> >     if (@warnings = $batch->warnings()) {
> >         print "\n Warnings detected: \n", @warnings;
> >     }
> >
> >     close(OUT);
> >     close(LOG);
> >
> > John Guillory
> > Louisiana Library Network
> > 225.578.3758
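One more thing about the quoted script, separate from the as_string() error: in the if condition, && binds more tightly than ||, so the final test on 008 position 23 stands on its own. A small sketch of the difference, using made-up sub names and assuming the intent was "leader is 'am' AND 008/23 is 'o' or 's'" (that intent is my guess, not something stated in the thread):

```perl
use strict;
use warnings;

# The condition as written in the script: the last "eq 's'" test is
# OR-ed with everything before it.
sub matches_as_written {
    my ( $leader_6, $leader_7, $pos_23 ) = @_;
    return ( ( $leader_6 eq 'a' ) && ( $leader_7 eq 'm' )
          && ( $pos_23 eq 'o' ) || ( $pos_23 eq 's' ) ) ? 1 : 0;
}

# The presumed intent (an assumption): both 008/23 tests grouped together.
sub matches_grouped {
    my ( $leader_6, $leader_7, $pos_23 ) = @_;
    return ( $leader_6 eq 'a' && $leader_7 eq 'm'
          && ( $pos_23 eq 'o' || $pos_23 eq 's' ) ) ? 1 : 0;
}

# A record whose leader is NOT "am" but whose 008/23 is "s":
print matches_as_written( 'c', 's', 's' ), "\n";    # prints 1
print matches_grouped( 'c', 's', 's' ), "\n";       # prints 0
```

Also, the script changes its own copy of the leader with substr() but never hands it back to the record, so as far as I can tell the record written to OUT still carries the old leader. Calling $record->leader($leader) before as_usmarc() should store the modified value.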