You could also use negative lookahead to capture the 856 fields where subfield 
9 is not at the end.

regexp_replace(marc,E'<datafield[^>]+?tag="856".+?<subfield 
code="9">HCC</subfield>(?:(?!</datafield>).)*</datafield>','','g') 

I did not test this much either, but it does pick up only your HCC fields in 
the data you provided below.  That being said, I was always surprised when 
using regex on large sets of data.  A good way to test this is to run it as a 
SELECT first to try and pull records with that regex, then grep through them to 
make sure you are only getting rid of the data you want.  Then for 
completeness, do another select for 856 fields with a WHERE clause using your 
regex to exclude the items you just verified for completeness, and grep through 
those results to make sure you are not missing any 9 = HCC fields.

Liam


On 2013-04-02, at 10:14 AM, Mike Rylander <[email protected]> wrote:

> On Tue, Apr 2, 2013 at 9:43 AM, Tim Spindler <[email protected]> wrote:
> This is the regex I was trying but in testing it wasn't working.
> 
> regexp_replace(marc,E'<datafield[^>]*?tag="856"+?<subfield 
> code="9">HCC</subfield></datafield>','','g') 
> 
> 
> Perhaps something like: 
> regexp_replace(marc,E'<datafield[^>]+?tag="856".+?<subfield 
> code="9">HCC</subfield></datafield>','','g')
> 
> As long as subfield 9 is at the end of all the datafields, that should work.  
> It's not heavily tested.
>  
> --miker
> 
> 
> On Tue, Apr 2, 2013 at 9:41 AM, Tim Spindler <[email protected]> wrote:
> Mike, 
> 
> For the batch update, it is about 6000 records with bucket, I didn't think 
> Evergreen could handle that many in a bucket.
> 
> Tim
> 
> 
> On Tue, Apr 2, 2013 at 9:20 AM, Mike Rylander <[email protected]> wrote:
> If your removal code is regexp-based, you'll just need to match on HCC in the 
> middle of the "find for removal" regexp.  Alternatively, you could apply the 
> batch MARC update function to a bucket or CSV file full of the records.  That 
> has the ability to modify records conditionally, based on subfield content 
> within a datafield.
> 
> --miker
> 
> 
> 
> On Tue, Apr 2, 2013 at 9:12 AM, Tim Spindler <[email protected]> wrote:
> We have a library that wants to reload its Films on Demand records.  These 
> records have multiple 856s for other institutions with the subfield 9 
> associated with their shortname.  What would the best way be to remove these 
> for instance.   I have created a list of the records needing updating in the 
> staging table.  Here is an example of the of he 856.  I have code that can 
> remove all the 856s but am having trouble removing just one.  We just want to 
> remove the one for HCC (although I'll take suggests for removing the 
> duplicate QCC lines also)
> 
> <datafield tag="856" ind1="4" ind2="0"><subfield code="y">Click here to 
> access this film online (MCLA students, faculty, and staff 
> only)</subfield><subfield 
> code="u">http://libproxy.mcla.edu:2048/login?url=http://digital.films.com/PortalPlaylists.aspx?aid=11580&amp;xtid=152</subfield><subfield
>  code="9">MCLA</subfield></datafield>
> <datafield tag="856" ind1="4" ind2="0"><subfield 
> code="u">http://ezhc.ez.cwmars.org:4000/login?url=http://digital.films.com/PortalPlaylists.aspx?aid=9206&amp;xtid=152</subfield><subfield
>  code="y">Click here to access this film online (HCC Students and Staff 
> only)</subfield><subfield code="9">HCC</subfield></datafield>
> <datafield tag="856" ind1="4" ind2="0"><subfield code="y">Click here to 
> access this film online (QCC studetns, faculty, &amp; staff 
> only)</subfield><subfield 
> code="u">http://ezqc.ez.cwmars.org:5200/login?url=http://digital.films.com/PortalPlaylists.aspx?aid=19277&amp;xtid=152</subfield><subfield
>  code="9">QCC</subfield></datafield>
> <datafield tag="856" ind1="4" ind2="0"><subfield 
> code="u">http://ezhc.ez.cwmars.org:4000/login?url=http://digital.films.com/PortalPlaylists.aspx?aid=9206&amp;xtid=152</subfield><subfield
>  code="y">Click here to access this film online (HCC Students and Staff 
> only)</subfield><subfield code="9">HCC</subfield></datafield>
> <datafield tag="856" ind1="4" ind2="0"><subfield code="y">Click here to 
> access this film online (QCC studetns, faculty, &amp; staff 
> only)</subfield><subfield 
> code="u">http://ezqc.ez.cwmars.org:5200/login?url=http://digital.films.com/PortalPlaylists.aspx?aid=19277&amp;xtid=152</subfield><subfield
>  code="9">QCC</subfield></datafield>
> 
> 
> -- 
> Tim Spindler
> [email protected]
> 
> P   Go Green - Save a tree! Please don't print this e-mail unless it's really 
> necessary.
>  
> 
> 
> 
> -- 
> Mike Rylander
>  | Director of Research and Development
>  | Equinox Software, Inc. / Your Library's Guide to Open Source
>  | phone:  1-877-OPEN-ILS (673-6457)
>  | email:  [email protected]
>  | web:  http://www.esilibrary.com
> 
> 
> 
> -- 
> Tim Spindler
> [email protected]
> 
> P   Go Green - Save a tree! Please don't print this e-mail unless it's really 
> necessary.
>  
> 
> 
> 
> -- 
> Tim Spindler
> [email protected]
> 
> P   Go Green - Save a tree! Please don't print this e-mail unless it's really 
> necessary.
>  
> 
> 
> 
> -- 
> Mike Rylander
>  | Director of Research and Development
>  | Equinox Software, Inc. / Your Library's Guide to Open Source
>  | phone:  1-877-OPEN-ILS (673-6457)
>  | email:  [email protected]
>  | web:  http://www.esilibrary.com

Reply via email to