Re: Deleting a subfield using MARC::Record

2006-05-03 Thread Edward Summers


On May 3, 2006, at 11:25 AM, Mark Jordan wrote:

For example, in a given batch, most but not all records have an 856
subfield 3, followed by multiple subfield u's. If you ask to delete
the first u using pos, then your target will differ depending on
whether subfield 3 is present. If you know that you want to eliminate
u's (without regard to what else is in the field) then your target
would be easier to hit.


Ok this I like, having a use case like this makes it much easier to  
decide about the API. How about we go back to occurrence and remove pos?




However, you raise a good point -- how much functionality do people  
need? Maybe some actual examples from the wild would be useful. I  
can supply some but probably not until tomorrow afternoon since I  
have a presentation to prepare for tomorrow. If other users have  
some examples of real records or use cases they might clarify the  
most common usage. I'll see what I can find tomorrow.


Yeah, you know if you have the interest/time it would be great if you
could add a couple of tests to the existing test file. The tests need
not pass, but they should illustrate the use.
Hop onto #code4lib and I can walk you through how to do this, and get
your access set up if you are interested.


//Ed


Re: Question about MARC::RECORD usage

2006-05-03 Thread Ed @ Go Britain

Bryan,

Many thanks for the quick response.

There are times when the proper order would be $a, $n, $p, $b, $c, as well,
aren't there?

Thanks for the forewarning - I haven't been told that yet - I'm not involved 
in the production of the data, just in extracting it for publishing! This is 
proving to be something of a baptism by fire.



According to the POD in MARC::Field:
"Or if you think there might be more than one you can get all of them by
calling in a list context:

   my @subfields = $field->subfield( 'a' );"

Alternatively, get all subfields in the field and parse as needed:

my $field245 = $record->field('245');
my @subfields = $field245->subfields();
my @newsubfields;

while (my $subfield = pop(@subfields)) {
   my ($code, $data) = @$subfield;
   #do something with data

   #or add code and data to array
   unshift (@newsubfields, $code, $data);
} # while

This looks like something I can sit down with this evening and work on to try 
and understand.


Many thanks again - any other suggestions welcome.



Re: Deleting a subfield using MARC::Record

2006-05-03 Thread Mark Jordan
Ed, the only problem I can see with position in the field is if a 
preceding subfield does not exist in every record. For example, in a 
given batch, most but not all records have an 856 subfield 3, followed 
by multiple subfield u's. If you ask to delete the first u using pos, 
then your target will differ depending on whether subfield 3 is 
present. If you know that you want to eliminate u's (without regard 
to what else is in the field) then your target would be easier to hit.
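
The two readings of "first u" can be compared in a small sketch. It does not use MARC::Record at all: subfields are modelled as plain [code, value] pairs (the shape subfields() returns), and the helper names delete_at_position and delete_nth_occurrence, along with the sample URLs, are made up for illustration only.

```perl
use strict;
use warnings;

# "Position" reading: drop the subfield at index $pos in the field,
# but only if that subfield carries the requested code.
sub delete_at_position {
    my ($code, $pos, @subfields) = @_;
    my @kept;
    for my $i (0 .. $#subfields) {
        next if $i == $pos && $subfields[$i][0] eq $code;
        push @kept, $subfields[$i];
    }
    return @kept;
}

# "Occurrence" reading: drop the $n-th subfield with the requested code,
# counting from zero and ignoring every other subfield.
sub delete_nth_occurrence {
    my ($code, $n, @subfields) = @_;
    my $seen = 0;
    my @kept;
    for my $sf (@subfields) {
        next if $sf->[0] eq $code && $seen++ == $n;
        push @kept, $sf;
    }
    return @kept;
}

# Helper for display: the subfield codes that remain, in order.
sub codes { return join '', map { $_->[0] } @_ }

# Two hypothetical 856 fields: one with subfield 3, one without.
my @with_3    = (['3', 'Table of contents'],
                 ['u', 'http://example.org/bad'],
                 ['u', 'http://example.org/good']);
my @without_3 = (['u', 'http://example.org/bad'],
                 ['u', 'http://example.org/good']);

print "position 0, with 3:      ", codes(delete_at_position('u', 0, @with_3)), "\n";
print "position 0, without 3:   ", codes(delete_at_position('u', 0, @without_3)), "\n";
print "occurrence 0, with 3:    ", codes(delete_nth_occurrence('u', 0, @with_3)), "\n";
print "occurrence 0, without 3: ", codes(delete_nth_occurrence('u', 0, @without_3)), "\n";
```

With the position reading, the field that has a subfield 3 is left untouched, because index 0 is not a u. With the occurrence reading, the first u is removed from both fields.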


However, you raise a good point -- how much functionality do people 
need? Maybe some actual examples from the wild would be useful. I can 
supply some but probably not until tomorrow afternoon since I have a 
presentation to prepare for tomorrow. If other users have some examples 
of real records or use cases they might clarify the most common usage. 
I'll see what I can find tomorrow.


Mark

Edward Summers wrote:


On May 3, 2006, at 8:55 AM, Mark Jordan wrote:
I think it should mean "the zeroth occurrence of subfield 'u'", since 
specifying which of a repeated group of subfields is a realistic task, 
as you say. For example, each record has two 'u's but all of the first 
ones are garbage.


Actually 'pos' as implemented will remove the subfield u if it is at 
position n in the field. So we could have occurrence too. I feel like 
I'm chasing windmills a bit. Do y'all really *need* all this 
functionality in delete_subfield() :-) I guess you do or else you 
wouldn't be so interested in asking for it.


I didn't implement the -1 behavior because I wasn't quite sure how to do 
it quickly, and it seemed like too much somehow.


//Ed


--
Mark Jordan
Head of Library Systems
W.A.C. Bennett Library, Simon Fraser University
Burnaby, British Columbia, V5A 1S6, Canada
Phone (604) 291 5753 / Fax (604) 291 3023
[EMAIL PROTECTED] / http://www.sfu.ca/~mjordan/


Re: Deleting a subfield using MARC::Record

2006-05-03 Thread Michael Kreyche

Edward Summers wrote:


On May 3, 2006, at 8:55 AM, Mark Jordan wrote:
I think it should mean "the zeroth occurrence of subfield 'u'", since 
specifying which of a repeated group of subfields is a realistic task, 
as you say. For example, each record has two 'u's but all of the first 
ones are garbage.


Actually 'pos' as implemented will remove the subfield u if it is at 
position n in the field. So we could have occurrence too. I feel like 
I'm chasing windmills a bit. Do y'all really *need* all this 
functionality in delete_subfield() :-) I guess you do or else you 
wouldn't be so interested in asking for it.


Well, maybe this IS getting a little out of hand! I could live with the 
old-fashioned way myself. Being a newbie to the list I was surprised how 
fast you jumped in and provided the new functionality.


Mike
--
Michael Kreyche
Systems Librarian
Associate Professor
Kent State University Libraries and Media Services
http://www.personal.kent.edu/~mkreyche
330-672-1918



Re: Deleting a subfield using MARC::Record

2006-05-03 Thread Edward Summers


On May 3, 2006, at 8:55 AM, Mark Jordan wrote:
I think it should mean "the zeroth occurrence of subfield 'u'",  
since specifying which of a repeated group of subfields is a  
realistic task, as you say. For example, each record has two 'u's  
but all of the first ones are garbage.


Actually 'pos' as implemented will remove the subfield u if it is at  
position n in the field. So we could have occurrence too. I feel like  
I'm chasing windmills a bit. Do y'all really *need* all this  
functionality in delete_subfield() :-) I guess you do or else you  
wouldn't be so interested in asking for it.


I didn't implement the -1 behavior because I wasn't quite sure how to
do it quickly, and it seemed like too much somehow.
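
One possible way to handle negative indexes, sketched outside MARC::Record on plain [code, value] pairs. The helper delete_occurrences is hypothetical, it uses the "occurrence" reading rather than 'pos' as implemented, and it assumes zero-based indexing (it does not consult $[).

```perl
use strict;
use warnings;

# Hypothetical helper: delete the occurrences of $code listed in @$occur.
# Negative numbers count back from the last occurrence, so -1 is the last.
sub delete_occurrences {
    my ($code, $occur, @subfields) = @_;

    # Indexes into @subfields of every subfield with the wanted code.
    my @hits = grep { $subfields[$_][0] eq $code } 0 .. $#subfields;

    # Resolve each requested occurrence to a field index;
    # requests that fall outside the range are silently ignored.
    my %drop;
    for my $n (@$occur) {
        my $k = $n < 0 ? $n + scalar(@hits) : $n;
        $drop{ $hits[$k] } = 1 if $k >= 0 && $k < @hits;
    }

    return map { $subfields[$_] } grep { !$drop{$_} } 0 .. $#subfields;
}

# Sample data, invented for the example.
my @field = (['3', 'x'], ['u', 'first'], ['u', 'second'], ['u', 'third']);

my @no_last  = delete_occurrences('u', [-1],    @field);
my @only_mid = delete_occurrences('u', [0, -1], @field);

print join(' ', map { $_->[1] } @no_last),  "\n";
print join(' ', map { $_->[1] } @only_mid), "\n";
```

The idea is to collect the indexes of the matching subfields first, so that -1 can be resolved against that list rather than against the whole field.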


//Ed


RE: Question about MARC::RECORD usage

2006-05-03 Thread Bryan Baldus
On Wednesday, May 03, 2006 9:28 AM, Ed @ Go Britain wrote:
>In the 245 record it is 
>possible to have numerous $n and $p fields which need to be 
>output with formatting between the fields.
>
>My knowledge of Perl isn't too good and I'm struggling to know 
>how to extract these repeated subfields and place formatting 
>between the subfields in the prescribed order $a, $b, $n, $p, 
>$c. Both n and p could be repeated several times. 

There are times when the proper order would be $a, $n, $p, $b, $c, as well,
aren't there?

>At the moment I take each field into a variable eg 
>
>$Field245c = $record->subfield('245','c');
>
>and then output these as follows
>
>   if ($Field245c)
>{
>$EntryBody = $EntryBody . " -- " . $Field245c;
>}
>
>However, this approach assigns the first occurrence of a 
>subfield and I haven't yet discovered a technique for 
>accessing further subfields.
>

According to the POD in MARC::Field:
"Or if you think there might be more than one you can get all of them by
calling in a list context:

my @subfields = $field->subfield( 'a' );"

Alternatively, get all subfields in the field and parse as needed:

my $field245 = $record->field('245');
my @subfields = $field245->subfields();
my @newsubfields;

while (my $subfield = pop(@subfields)) {
   my ($code, $data) = @$subfield;
   #do something with data

   #or add code and data to array
   unshift (@newsubfields, $code, $data);
} # while
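
As a rough sketch of the "parse as needed" step, the loop below walks the subfields in the order they are stored and picks a separator by code, so repeated $n and $p come out wherever they occur. It runs on a hand-built list of [code, data] pairs instead of a real record. The format_title helper, the sample title, and every separator except the " -- " before $c (taken from the question) are assumptions for illustration.

```perl
use strict;
use warnings;

# Separator to emit before each subfield code. Placeholder values only;
# ' -- ' before $c follows the snippet in the original question.
my %separator = (a => '', b => ' : ', n => '. ', p => ', ', c => ' -- ');

# Hypothetical helper: join the [code, data] pairs in stored order,
# skipping any subfield code that has no separator defined.
sub format_title {
    my @subfields = @_;
    my $out = '';
    for my $sf (@subfields) {
        my ($code, $data) = @$sf;
        next unless exists $separator{$code};
        $out .= $separator{$code} if length $out;
        $out .= $data;
    }
    return $out;
}

# Made-up 245 content with repeated $n and $p.
my @subfields = (
    ['a', 'Works'],
    ['n', 'Part 1'], ['p', 'Poems'],
    ['n', 'Part 2'], ['p', 'Letters'],
    ['c', 'Jane Doe'],
);

my $EntryBody = format_title(@subfields);
print "$EntryBody\n";
```

With a real record the list would come from $record->field('245')->subfields(), as above. Because the loop follows the stored order, it copes with both $a, $b, $n, $p, $c and $a, $n, $p, $b, $c.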


###

I hope this helps,

Bryan Baldus
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://home.inwave.com/eija
 


Question about MARC::RECORD usage

2006-05-03 Thread Ed @ Go Britain
I've been using MARC::Record for a while to extract data using Perl to prepare 
it for a publishing package (Ventura). This has all worked well for about a 
year until it was spotted that a repeated subfield has been omitted. In the 245 
record it is possible to have numerous $n and $p fields which need to be output 
with formatting between the fields.

My knowledge of Perl isn't too good and I'm struggling to know how to extract 
these repeated subfields and place formatting between the subfields in the 
prescribed order $a, $b, $n, $p, $c. Both n and p could be repeated several 
times. 

At the moment I take each field into a variable eg 

$Field245c = $record->subfield('245','c');

and then output these as follows

   if ($Field245c)
{
$EntryBody = $EntryBody . " -- " . $Field245c;
}

However, this approach assigns the first occurrence of a subfield and I haven't 
yet discovered a technique for accessing further subfields.

All suggestions and approaches welcomed.




Ed Brown
www.go-britain.com
www.solid-us.com
0870 752

Re: Deleting a subfield using MARC::Record

2006-05-03 Thread Mark Jordan

Brad Baxter wrote:

On 5/3/06, Michael Kreyche <[EMAIL PROTECTED]> wrote:


The term "position" ("pos") seems a little ambiguous to me on the face
of it. Does (code => 'u', pos => 0) mean "the first subfield u" (which
is what  I take it to mean) or "subfield u if it's the first subfield"
(which it might sound like outside the context of this discussion)?


I had the same thought.  To me, 'occur' has a clearer meaning in
that context, "the zeroth occurrence of subfield 'u'", while 'pos' has
more of the ambiguity described above.  So of the two, I'd prefer
'occur', but I can live with 'pos'.  Synonyms perhaps?  (Unless someone
has a need to delete the third subfield regardless of code?  I never
have, so perhaps not.)

--
Brad


I think it should mean "the zeroth occurrence of subfield 'u'", since 
specifying which of a repeated group of subfields is a realistic task, 
as you say. For example, each record has two 'u's but all of the first 
ones are garbage.


Mark

--
Mark Jordan
Head of Library Systems
W.A.C. Bennett Library, Simon Fraser University
Burnaby, British Columbia, V5A 1S6, Canada
Phone (604) 291 5753 / Fax (604) 291 3023
[EMAIL PROTECTED] / http://www.sfu.ca/~mjordan/


Re: Deleting a subfield using MARC::Record

2006-05-03 Thread Brad Baxter

On 5/3/06, Michael Kreyche <[EMAIL PROTECTED]> wrote:


The term "position" ("pos") seems a little ambiguous to me on the face
of it. Does (code => 'u', pos => 0) mean "the first subfield u" (which
is what  I take it to mean) or "subfield u if it's the first subfield"
(which it might sound like outside the context of this discussion)?


I had the same thought.  To me, 'occur' has a clearer meaning in
that context, "the zeroth occurrence of subfield 'u'", while 'pos' has
more of the ambiguity described above.  So of the two, I'd prefer
'occur', but I can live with 'pos'.  Synonyms perhaps?  (Unless someone
has a need to delete the third subfield regardless of code?  I never
have, so perhaps not.)

--
Brad


Re: Deleting a subfield using MARC::Record

2006-05-03 Thread Michael Kreyche

Edward Summers wrote:


The current documentation for the new method reads like this:

--

   delete_subfield() allows you to remove subfields from a field:

   # delete any subfield a in the field
   $field->delete_subfield(code => 'a');

   # delete any subfield a or u in the field
   $field->delete_subfield(code => ['a', 'u']);

   If you want to only delete subfields at a particular position you can
   use the position parameter:

   # delete subfield u at the first position
   $field->delete_subfield(code => 'u', position => 0);

   # delete subfield u at first or second position
   $field->delete_subfield(code => 'u', position => [0,1]);


If you implemented negative indexes, it would be nice to add an example:

# delete subfield u at last position
$field->delete_subfield(code => 'u', pos => [-1]);


   You can specify a regex to delete only subfields that match:

  # delete any subfield u that matches zombo.com
  $field->delete_subfield(code => 'u', match => qr/zombo.com/);


The term "position" ("pos") seems a little ambiguous to me on the face 
of it. Does (code => 'u', pos => 0) mean "the first subfield u" (which 
is what  I take it to mean) or "subfield u if it's the first subfield" 
(which it might sound like outside the context of this discussion)?


Mike
--
Michael Kreyche
Systems Librarian
Associate Professor
Kent State University Libraries and Media Services
http://www.personal.kent.edu/~mkreyche
330-672-1918



Re: Deleting a subfield using MARC::Record

2006-05-03 Thread Edward Summers

On May 3, 2006, at 6:28 AM, Edward Summers wrote:

$field->delete_subfield(pos => 2);

won't work because 'pos' is a perl keyword--


I should've tried it before I said this -- it works fine in that  
context, even though my perl syntax highlighter indicates otherwise.  
So I've changed the parameter name from 'position' to 'pos' keeping  
with Leif's original suggestion.
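
For anyone curious why it works: the => operator quotes a bare word on its left, so 'pos' is read as a plain string in that spot. A minimal check, using an ordinary hash rather than the real method call:

```perl
use strict;
use warnings;

# The fat comma (=>) turns the identifier on its left into a string,
# so the built-in name "pos" is not called as a function here.
my %args = (code => 'u', pos => 2);

print join(', ', map { "$_=$args{$_}" } sort keys %args), "\n";
```

This runs cleanly under strict and warnings, whatever a syntax highlighter makes of it.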


//Ed


Re: Deleting a subfield using MARC::Record

2006-05-03 Thread Edward Summers


On May 1, 2006, at 4:41 PM, Leif Andersson wrote:


+1

"count" can possibly be complemented or replaced with occurrence as  
suggested.

It'd be nice to be able to denote last occurrence [-1].
And I suppose the indexing should be based on ordinary perl  
subscript indexing - i.e. governed by the value of special variable $[


$field->delete_subfield( code  => $code,      # of course
                         occur => [0,2,3],    # "occur" or "pos" or whatever...
                         match => qr/pat/,    # doesn't need to be repeatable
                       );


I actually like 'pos' better than 'occur' -- but alas

$field->delete_subfield(pos => 2);

won't work because 'pos' is a perl keyword--which is why I like using  
it I suppose :-) How about:


$field->delete_subfield(position => 2);

A bit more wordy I guess, but I still like it better than occur. Nice  
tip on the use of $[ by the way! I also like Tim's suggestion to  
allow 'code' to take multiple values too:


$field->delete_subfield(code => ['a','b','c'])

So if you check out the CVS you should find this implemented. If you  
are interested in adding any tests or documentation let me know and  
I'll add you as a sf.net developer.


The current documentation for the new method reads like this:

--

   delete_subfield() allows you to remove subfields from a field:

   # delete any subfield a in the field
   $field->delete_subfield(code => 'a');

   # delete any subfield a or u in the field
   $field->delete_subfield(code => ['a', 'u']);

   If you want to only delete subfields at a particular position you can
   use the position parameter:

   # delete subfield u at the first position
   $field->delete_subfield(code => 'u', position => 0);

   # delete subfield u at first or second position
   $field->delete_subfield(code => 'u', position => [0,1]);

   You can specify a regex to delete only subfields that match:


  # delete any subfield u that matches zombo.com
  $field->delete_subfield(code => 'u', match => qr/zombo.com/);

--
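
To make the intended semantics of 'code' and 'match' concrete, here is a stand-alone sketch that mimics them on plain [code, data] pairs. It is not the code in CVS: the delete_matching helper is hypothetical, and returning the surviving subfields (rather than changing a field in place) is an assumption made to keep the example short.

```perl
use strict;
use warnings;

# Hypothetical stand-in: remove subfields whose code is in the given list
# and, when a pattern is supplied, whose data also matches that pattern.
sub delete_matching {
    my ($args, @subfields) = @_;

    # Accept either a single code or an array reference of codes.
    my $codes  = ref($args->{code}) eq 'ARRAY' ? $args->{code} : [ $args->{code} ];
    my %wanted = map { $_ => 1 } @$codes;
    my $match  = $args->{match};

    return grep {
        my ($code, $data) = @$_;
        !( $wanted{$code} && (!defined($match) || $data =~ $match) );
    } @subfields;
}

# Sample data, invented for the example.
my @field = (
    ['u', 'http://zombo.com/'],
    ['u', 'http://example.org/'],
    ['z', 'note'],
);

my @no_zombo = delete_matching({ code => 'u', match => qr/zombo\.com/ }, @field);
my @nothing  = delete_matching({ code => ['u', 'z'] }, @field);

print scalar(@no_zombo), " subfields left after the regex delete\n";
print scalar(@nothing),  " subfields left after deleting u and z\n";
```

The sketch escapes the dot in the pattern, so only a literal "zombo.com" matches.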

Sound ok?

//Ed