Re: Deleting a subfield using MARC::Record
On May 3, 2006, at 11:25 AM, Mark Jordan wrote:
> For example, in a given batch, most but not all records have an 856 subfield 3, followed by multiple subfield u's. If you ask to delete the first u using pos, then your target will differ depending on whether subfield 3 is present. If you know that you want to eliminate u's (without regard to what else is in the field) then your target would be easier to hit.

Ok this I like, having a use case like this makes it much easier to decide about the API. How about we go back to occurrence and remove pos?

> However, you raise a good point -- how much functionality do people need? Maybe some actual examples from the wild would be useful. I can supply some but probably not until tomorrow afternoon since I have a presentation to prepare for tomorrow. If other users have some examples of real records or use cases they might clarify the most common usage. I'll see what I can find tomorrow.

Yeah, you know if you have the interest/time it would be great if you could add a couple of tests to the existing test file. The tests need not pass, but they should illustrate the use. Hop onto #code4lib and I can walk you through how to do this, and get your access set up if you are interested.

//Ed
Re: Question about MARC::RECORD usage
Bryan,

Many thanks for the quick response.

> There are times when the proper order would be $a, $n, $p, $b, $c, as well, aren't there?

Thanks for the forewarning - I haven't been told that yet - I'm not involved in the production of the data, just in extracting it for publishing! This is proving to be something of a baptism by fire.

> According to the POD in MARC::Field: "Or if you think there might be more than one you can get all of them by calling in a list context:
>
>     my @subfields = $field->subfield( 'a' );"
>
> Alternatively, get all subfields in the field and parse as needed:
>
>     my $field245 = $record->field('245');
>     my @subfields = $field245->subfields();
>     while (my $subfield = pop(@subfields)) {
>         my ($code, $data) = @$subfield;
>         #do something with data
>         #or add code and data to array
>         unshift (@newsubfields, $code, $data);
>     } # while

This looks like something I can sit down with this evening and try to understand. Many thanks again - any other suggestions welcome.
Re: Deleting a subfield using MARC::Record
Ed, the only problem I can see with position in the field is if a preceding subfield does not exist in every record. For example, in a given batch, most but not all records have an 856 subfield 3, followed by multiple subfield u's. If you ask to delete the first u using pos, then your target will differ depending on whether subfield 3 is present. If you know that you want to eliminate u's (without regard to what else is in the field) then your target would be easier to hit.

However, you raise a good point -- how much functionality do people need? Maybe some actual examples from the wild would be useful. I can supply some but probably not until tomorrow afternoon since I have a presentation to prepare for tomorrow. If other users have some examples of real records or use cases they might clarify the most common usage. I'll see what I can find tomorrow.

Mark

Edward Summers wrote:
> On May 3, 2006, at 8:55 AM, Mark Jordan wrote:
>> I think it should mean "the zeroth occurrence of subfield 'u'", since specifying which of a repeated group of subfields is a realistic task, as you say. For example, each record has two 'u's but all of the first ones are garbage.
>
> Actually 'pos' as implemented will remove the subfield u if it is at position n in the field. So we could have occurrence too. I feel like I'm chasing windmills a bit. Do y'all really *need* all this functionality in delete_subfield() :-) I guess you do or else you wouldn't be so interested in asking for it. I didn't implement the -1 behavior because I wasn't quite sure how to do it quickly, and it seemed like too much somehow.
>
> //Ed

--
Mark Jordan
Head of Library Systems
W.A.C. Bennett Library, Simon Fraser University
Burnaby, British Columbia, V5A 1S6, Canada
Phone (604) 291 5753 / Fax (604) 291 3023
[EMAIL PROTECTED] / http://www.sfu.ca/~mjordan/
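The difference between "position in the field" and "occurrence of a code" described above can be sketched in plain Perl. This is only an illustration: subfields are modelled as [code, data] pairs (the shape MARC::Field's subfields() returns), and delete_at_position() and delete_occurrence() are hypothetical helpers, not part of MARC::Field. The URLs are made up.

```perl
use strict;
use warnings;

# Two hypothetical 856 fields, one with a leading subfield 3 and one without.
my @with_3 = (
    [3 => 'Table of contents'],
    [u => 'http://example.org/a'],
    [u => 'http://example.org/b'],
);
my @without_3 = (
    [u => 'http://example.org/a'],
    [u => 'http://example.org/b'],
);

# Interpretation 1: drop the subfield at field position $pos, but only if
# it carries the given code.
sub delete_at_position {
    my ($code, $pos, @subfields) = @_;
    my @keep;
    for my $i (0 .. $#subfields) {
        next if $i == $pos && $subfields[$i][0] eq $code;
        push @keep, $subfields[$i];
    }
    return @keep;
}

# Interpretation 2: drop the $occur-th subfield that carries the given code,
# wherever it sits in the field.
sub delete_occurrence {
    my ($code, $occur, @subfields) = @_;
    my $seen = 0;
    my @keep;
    for my $subfield (@subfields) {
        next if $subfield->[0] eq $code && $seen++ == $occur;
        push @keep, $subfield;
    }
    return @keep;
}

for my $case (['with $3', \@with_3], ['without $3', \@without_3]) {
    my ($label, $subfields) = @$case;
    my @by_pos   = delete_at_position('u', 0, @$subfields);
    my @by_occur = delete_occurrence('u', 0, @$subfields);
    printf "%-10s pos=0 leaves %d, occur=0 leaves %d\n",
        $label, scalar @by_pos, scalar @by_occur;
}
```

In this model, pos => 0 deletes nothing when subfield 3 is present, because position 0 holds the 3 rather than a u, while occur => 0 removes the first u in both fields.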
Re: Deleting a subfield using MARC::Record
Edward Summers wrote:
> On May 3, 2006, at 8:55 AM, Mark Jordan wrote:
>> I think it should mean "the zeroth occurrence of subfield 'u'", since specifying which of a repeated group of subfields is a realistic task, as you say. For example, each record has two 'u's but all of the first ones are garbage.
>
> Actually 'pos' as implemented will remove the subfield u if it is at position n in the field. So we could have occurrence too. I feel like I'm chasing windmills a bit. Do y'all really *need* all this functionality in delete_subfield() :-) I guess you do or else you wouldn't be so interested in asking for it.

Well, maybe this IS getting a little out of hand! I could live with the old-fashioned way myself. Being a newbie to the list I was surprised how fast you jumped in and provided the new functionality.

Mike

--
Michael Kreyche
Systems Librarian
Associate Professor
Kent State University Libraries and Media Services
http://www.personal.kent.edu/~mkreyche
330-672-1918
Re: Deleting a subfield using MARC::Record
On May 3, 2006, at 8:55 AM, Mark Jordan wrote:
> I think it should mean "the zeroth occurrence of subfield 'u'", since specifying which of a repeated group of subfields is a realistic task, as you say. For example, each record has two 'u's but all of the first ones are garbage.

Actually 'pos' as implemented will remove the subfield u if it is at position n in the field. So we could have occurrence too. I feel like I'm chasing windmills a bit. Do y'all really *need* all this functionality in delete_subfield() :-) I guess you do or else you wouldn't be so interested in asking for it. I didn't implement the -1 behavior because I wasn't quite sure how to do it quickly, and it seemed like too much somehow.

//Ed
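One possible way to get the -1 behavior is to collect the field indices of every subfield with the requested code and let Perl's own negative subscripts pick from that list. The sketch below is an assumption about how it could be done, not the code in CVS; delete_occurrences() is a hypothetical helper working on [code, data] pairs.

```perl
use strict;
use warnings;

# Delete the listed occurrences of $code. Occurrence numbers may be
# negative, as in ordinary Perl subscripts, so -1 means "the last subfield
# with this code".
sub delete_occurrences {
    my ($code, $occurs, @subfields) = @_;

    # Field indices of every subfield carrying $code, in field order.
    my @hits  = grep { $subfields[$_][0] eq $code } 0 .. $#subfields;
    my $count = @hits;

    # Translate each requested occurrence into a field index; occurrences
    # outside the available range are ignored.
    my %drop;
    for my $n (@$occurs) {
        next if $n >= $count || $n < -$count;
        $drop{ $hits[$n] } = 1;
    }

    return map { $subfields[$_] } grep { !$drop{$_} } 0 .. $#subfields;
}

# Made-up 856 field with two subfield u's separated by a subfield z.
my @f856 = (
    [u => 'http://example.org/first'],
    [z => 'Note'],
    [u => 'http://example.org/last'],
);

my @kept = delete_occurrences('u', [-1], @f856);
print join(' ', map { "\$$_->[0] $_->[1]" } @kept), "\n";
```

Because the occurrence list is resolved against the matching subfields only, the result does not depend on what other subfields sit between them.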
RE: Question about MARC::RECORD usage
On Wednesday, May 03, 2006 9:28 AM, Ed @ Go Britain wrote:
>In the 245 record it is possible to have numerous $n and $p fields which need to be output with formatting between the fields.
>
>My knowledge of Perl isn't too good and I'm struggling to know how to extract these repeated subfields and place formatting between the subfields in the prescribed order $a, $b, $n, $p, $c. Both n and p could be repeated several times.

There are times when the proper order would be $a, $n, $p, $b, $c, as well, aren't there?

>At the moment I take each field into a variable eg
>
>    $Field245c = $record->subfield('245','c');
>
>and then output these as follows
>
>    if ($Field245c)
>    {
>        $EntryBody = $EntryBody . " -- " . $Field245c;
>    }
>
>However, this approach assigns the first occurrence of a subfield and I haven't yet discovered a technique for accessing further subfields.

According to the POD in MARC::Field: "Or if you think there might be more than one you can get all of them by calling in a list context:

    my @subfields = $field->subfield( 'a' );"

Alternatively, get all subfields in the field and parse as needed:

    my $field245 = $record->field('245');
    my @subfields = $field245->subfields();
    my @newsubfields;
    # each element of @subfields is an array ref: [code, data]
    while (my $subfield = pop(@subfields)) {
        my ($code, $data) = @$subfield;
        #do something with data
        #or add code and data to array
        unshift (@newsubfields, $code, $data);
    } # while

I hope this helps,

Bryan Baldus
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://home.inwave.com/eija
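As a rough sketch of the "get all subfields and parse as needed" approach, the code below walks a list of [code, data] pairs in field order and puts a separator in front of each one, so repeated $n and $p come out in whatever sequence the record has them. The %separator table and format_title() are hypothetical; only the " -- " before $c comes from the original question, and the sample data is invented.

```perl
use strict;
use warnings;

# Hypothetical separator table: the text to emit before each subfield code.
# Only " -- " before $c is taken from the thread; the rest are placeholders.
my %separator = (a => '', b => ' ', n => '. ', p => ', ', c => ' -- ');

# Build the entry from [code, data] pairs in the order they occur in the
# field, so repeated $n and $p keep their original sequence.
sub format_title {
    my (@subfields) = @_;
    my $entry = '';
    for my $subfield (@subfields) {
        my ($code, $data) = @$subfield;
        next unless exists $separator{$code};   # skip codes we do not publish
        $entry .= $separator{$code} . $data;
    }
    return $entry;
}

# With MARC::Record this list would come from
# $record->field('245')->subfields(); here it is hard-coded sample data.
my @subfields = (
    [a => 'Annual report'],
    [n => 'Part 1'],
    [p => 'Finance'],
    [n => 'Part 2'],
    [p => 'Operations'],
    [c => 'Example Society'],
);

print format_title(@subfields), "\n";
```

Real 245 data usually carries its own trailing punctuation, so the separators would need adjusting to the records in hand.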
Question about MARC::RECORD usage
I've been using MARC::Record for a while to extract data using Perl to prepare it for a publishing package (Ventura). This has all worked well for about a year until it was spotted that a repeated subfield has been omitted. In the 245 record it is possible to have numerous $n and $p fields which need to be output with formatting between the fields.

My knowledge of Perl isn't too good and I'm struggling to know how to extract these repeated subfields and place formatting between the subfields in the prescribed order $a, $b, $n, $p, $c. Both n and p could be repeated several times.

At the moment I take each field into a variable eg

    $Field245c = $record->subfield('245','c');

and then output these as follows

    if ($Field245c)
    {
        $EntryBody = $EntryBody . " -- " . $Field245c;
    }

However, this approach assigns the first occurrence of a subfield and I haven't yet discovered a technique for accessing further subfields.

All suggestions and approaches welcomed.

Ed Brown
www.go-britain.com
www.solid-us.com
0870 752
Re: Deleting a subfield using MARC::Record
Brad Baxter wrote:
> On 5/3/06, Michael Kreyche <[EMAIL PROTECTED]> wrote:
>> The term "position" ("pos") seems a little ambiguous to me on the face of it. Does (code => 'u', pos => 0) mean "the first subfield u" (which is what I take it to mean) or "subfield u if it's the first subfield" (which it might sound like outside the context of this discussion)?
>
> I had the same thought. To me, 'occur' has a clearer meaning in that context, "the zeroth occurrence of subfield 'u'", while 'pos' has more of the ambiguity described above. So of the two, I'd prefer 'occur', but I can live with 'pos'. Synonyms perhaps? (Unless someone has a need to delete the third subfield regardless of code? I never have, so perhaps not.)
>
> -- Brad

I think it should mean "the zeroth occurrence of subfield 'u'", since specifying which of a repeated group of subfields is a realistic task, as you say. For example, each record has two 'u's but all of the first ones are garbage.

Mark

--
Mark Jordan
Head of Library Systems
W.A.C. Bennett Library, Simon Fraser University
Burnaby, British Columbia, V5A 1S6, Canada
Phone (604) 291 5753 / Fax (604) 291 3023
[EMAIL PROTECTED] / http://www.sfu.ca/~mjordan/
Re: Deleting a subfield using MARC::Record
On 5/3/06, Michael Kreyche <[EMAIL PROTECTED]> wrote:
> The term "position" ("pos") seems a little ambiguous to me on the face of it. Does (code => 'u', pos => 0) mean "the first subfield u" (which is what I take it to mean) or "subfield u if it's the first subfield" (which it might sound like outside the context of this discussion)?

I had the same thought. To me, 'occur' has a clearer meaning in that context, "the zeroth occurrence of subfield 'u'", while 'pos' has more of the ambiguity described above. So of the two, I'd prefer 'occur', but I can live with 'pos'. Synonyms perhaps? (Unless someone has a need to delete the third subfield regardless of code? I never have, so perhaps not.)

-- Brad
Re: Deleting a subfield using MARC::Record
Edward Summers wrote:
> The current documentation for the new method reads like this:
>
> --
>
> delete_subfield() allows you to remove subfields from a field:
>
>     # delete any subfield a in the field
>     $field->delete_subfield(code => 'a');
>
>     # delete any subfield a or u in the field
>     $field->delete_subfield(code => ['a', 'u']);
>
> If you want to only delete subfields at a particular position you can use the position parameter:
>
>     # delete subfield u at the first position
>     $field->delete_subfield(code => 'u', position => 0);
>
>     # delete subfield u at first or second position
>     $field->delete_subfield(code => 'u', position => [0,1]);

If you implemented negative indexes, it would be nice to add an example:

    # delete subfield u at last position
    $field->delete_subfield(code => 'u', pos => [-1]);

> You can specify a regex to delete only subfields that match:
>
>     # delete any subfield u that matches zombo.com
>     $field->delete_subfield(code => 'u', match => qr/zombo.com/);

The term "position" ("pos") seems a little ambiguous to me on the face of it. Does (code => 'u', pos => 0) mean "the first subfield u" (which is what I take it to mean) or "subfield u if it's the first subfield" (which it might sound like outside the context of this discussion)?

Mike

--
Michael Kreyche
Systems Librarian
Associate Professor
Kent State University Libraries and Media Services
http://www.personal.kent.edu/~mkreyche
330-672-1918
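A side note on the match example: in a Perl regex an unescaped "." matches any character, so qr/zombo.com/ is looser than it looks. The sketch below compares it with an escaped pattern and with \Q...\E quoting; the URLs and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $loose  = qr/zombo.com/;     # "." matches any single character
my $strict = qr/zombo\.com/;    # "\." matches only a literal dot
my $host   = 'zombo.com';
my $quoted = qr/\Q$host\E/;     # \Q...\E escapes metacharacters in $host

for my $url ('http://zombo.com/', 'http://zombo-community.org/') {
    printf "%-30s loose=%d strict=%d quoted=%d\n",
        $url,
        ($url =~ $loose  ? 1 : 0),
        ($url =~ $strict ? 1 : 0),
        ($url =~ $quoted ? 1 : 0);
}
```

The second URL contains "zombo-com", which the loose pattern accepts because the hyphen satisfies the ".".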
Re: Deleting a subfield using MARC::Record
On May 3, 2006, at 6:28 AM, Edward Summers wrote:
>     $field->delete_subfield(pos => 2);
>
> won't work because 'pos' is a perl keyword--

I should've tried it before I said this -- it works fine in that context, even though my perl syntax highlighter indicates otherwise. So I've changed the parameter name from 'position' to 'pos', keeping with Leif's original suggestion.

//Ed
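The reason this works is that the => operator quotes a bareword on its left, so pos => 2 passes the string 'pos' and never calls the pos() builtin. A minimal sketch, with named_args() as a hypothetical stand-in for a method that takes named parameters:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method taking named parameters.
sub named_args {
    my (%args) = @_;
    return \%args;
}

# The fat comma quotes the bareword on its left, so this passes the
# string 'pos' rather than calling the pos() builtin.
my $args = named_args(code => 'u', pos => 2);

print join(', ', map { "$_ => $args->{$_}" } sort keys %$args), "\n";
```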
Re: Deleting a subfield using MARC::Record
On May 1, 2006, at 4:41 PM, Leif Andersson wrote:
> +1
> "count" can possibly be complemented or replaced with occurrence as suggested. It'd be nice to be able to denote last occurrence [-1]. And I suppose the indexing should be based on ordinary perl subscript indexing - i.e. governed by the value of special variable $[
>
>     $field->delete_subfield(
>         code  => $code,      # of course
>         occur => [0,2,3],    # "occur" or "pos" or whatever...
>         match => qr/pat/,    # doesn't need to be repeatable
>     );

I actually like 'pos' better than 'occur' -- but alas

    $field->delete_subfield(pos => 2);

won't work because 'pos' is a perl keyword--which is why I like using it I suppose :-) How about:

    $field->delete_subfield(position => 2);

A bit more wordy I guess, but I still like it better than occur. Nice tip on the use of $[ by the way!

I also like Tim's suggestion to allow 'code' to take multiple values too:

    $field->delete_subfield(code => ['a','b','c'])

So if you check out the CVS you should find this implemented. If you are interested in adding any tests or documentation let me know and I'll add you as a sf.net developer. The current documentation for the new method reads like this:

--

delete_subfield() allows you to remove subfields from a field:

    # delete any subfield a in the field
    $field->delete_subfield(code => 'a');

    # delete any subfield a or u in the field
    $field->delete_subfield(code => ['a', 'u']);

If you want to only delete subfields at a particular position you can use the position parameter:

    # delete subfield u at the first position
    $field->delete_subfield(code => 'u', position => 0);

    # delete subfield u at first or second position
    $field->delete_subfield(code => 'u', position => [0,1]);

You can specify a regex to delete only subfields that match:

    # delete any subfield u that matches zombo.com
    $field->delete_subfield(code => 'u', match => qr/zombo.com/);

--

Sound ok?

//Ed
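To make the documented semantics concrete, here is a small pure-Perl model of a delete that accepts either a single code or an array reference of codes, plus an optional match pattern. It is a sketch under assumptions, not the code in CVS: delete_matching() is a hypothetical function operating on [code, data] pairs, and the sample field is invented.

```perl
use strict;
use warnings;

# Return the subfields that survive the delete. 'code' may be a single
# code or an array ref of codes; 'match' is an optional compiled regex.
sub delete_matching {
    my ($subfields, %args) = @_;

    # Normalise 'code' to a lookup hash whether it came in as 'a' or ['a','u'].
    my $codes  = $args{code};
    my @codes  = ref($codes) eq 'ARRAY' ? @$codes : ($codes);
    my %wanted = map { $_ => 1 } @codes;
    my $match  = $args{match};

    return grep {
        my ($code, $data) = @$_;
        # Keep the subfield unless its code is listed and, when a pattern
        # was given, its data matches the pattern too.
        !( $wanted{$code} && (!defined $match || $data =~ $match) )
    } @$subfields;
}

my @f856 = (
    [u => 'http://zombo.com/'],
    [u => 'http://example.org/'],
    [z => 'Link'],
);

my @kept = delete_matching(\@f856, code => ['u', 'z'], match => qr/zombo\.com/);
print scalar(@kept), " subfields kept\n";
```

Normalising the code parameter up front keeps the filtering logic the same for the scalar and list forms.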