Yes, the YMMV was intentional because people do do it and we all have our own paths to follow, but there are a lot of "if"s that need to be in place (a few of which you mention) to make it a painless process.
Kevin

-----Original message-----
> From: Roy Tennant
> Sent: Tuesday, January 10 2017, 6:07 pm
> To: [email protected]
> Subject: Re: [CODE4LIB] MARCXML help again
>
> Well, I think that's a *bit* harsh. But the "YMMV" addition was appreciated,
> because it can and will. That is, if the XML is completely consistent, then
> Kyle's suggestion is an excellent one. If it isn't, then Kevin's link
> applies, IMHO. Since it appears from what we have been told that the records
> are consistent, I think Kyle's solution is not only workable but the most
> efficient. Given the caveat stated above.
> Roy
>
>
> On Jan 10, 2017, at 5:57 PM, Kevin S. Clarke <[email protected]> wrote:
> >
> > On the mention of parsing XML with string operations, I'm compelled to post
> > one of my favorite StackOverflow responses:
> >
> > http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
> >
> > YMMV of course...
> >
> > Kevin
> >
> >
> >
> > -----Original message-----
> >> From: Kyle Banerjee
> >> Sent: Tuesday, January 10 2017, 5:44 pm
> >> To: [email protected]
> >> Subject: Re: [CODE4LIB] MARCXML help again
> >>
> >> Howdy Julie,
> >>
> >> Depending on your specific needs, it's often easier/faster to use string
> >> rather than XML operations to work with XML.
> >>
> >> Especially if you have a large number of files and/or the files are very
> >> big, stripping the whitespace between elements and then performing a simple
> >> string substitution would be a fast low tech way to remove the unwanted
> >> fields.
> >>
> >> kyle
> >>
> >> On Tue, Jan 10, 2017 at 1:13 PM, Julie Swierczek <[email protected]>
> >> wrote:
> >>
> >>> Thanks to all who responded to my earlier plea for help. I now have a new
> >>> problem. I'm not sure if I can do this with find and replace in Oxygen, or
> >>> if this requires XSLT, or what.
> >>>
> >>> I have a project of MARCXML records like this:
> >>>
> >>> <?xml version="1.0" encoding="UTF-8" ?>
> >>> <marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim"
> >>>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >>>     xsi:schemaLocation="http://www.loc.gov/MARC21/slim
> >>>     http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
> >>>   <marc:record>
> >>>     <!--Lots of other datafields here -->
> >>>     <marc:datafield tag="710" ind1="2" ind2=" ">
> >>>       <marc:subfield code="a">Faux College</marc:subfield>
> >>>       <marc:subfield code="b">Special Collections</marc:subfield>
> >>>     </marc:datafield>
> >>>   </marc:record>
> >>> </marc:collection>
> >>>
> >>> I want to strip out all instances of:
> >>>
> >>> <marc:datafield tag="710" ind1="2" ind2=" ">
> >>>   <marc:subfield code="a">Faux College</marc:subfield>
> >>>   <marc:subfield code="b">Special Collections</marc:subfield>
> >>> </marc:datafield>
> >>>
> >>> but I want to leave other <marc:datafield tag="710" ind1="2" ind2=" ">
> >>> instances intact. I only want to delete ones with both the Faux College
> >>> and Special Collections text in the subfields.
> >>>
> >>> Where would I go from here? I thought of doing an xsl:template match in an
> >>> XSL stylesheet, and then not providing any instructions for replacing the
> >>> match, but I don't know how to select for that specific text. My attempts
> >>> to figure that out have not worked. You can only read so much W3C
> >>> documentation and Stack Overflow before you need to just sit quietly and
> >>> stare at a wall for a while.
> >>>
> >>> Thanks in advance --
> >>>
> >>> Julie
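
For anyone weighing the two approaches discussed in this thread, here is a rough idea of what the parser-based route could look like in Python's standard library. It is only a minimal sketch and not something anyone in the thread posted: the function names `is_unwanted` and `strip_unwanted` are made up for illustration, and it assumes each subfield code occurs at most once within a 710 field and that the subfield text matches exactly apart from surrounding whitespace.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
NS = {"marc": MARC_NS}

# Keep the original prefixes on output instead of ElementTree's "ns0:", "ns1:".
ET.register_namespace("marc", MARC_NS)
ET.register_namespace("xsi", XSI_NS)


def is_unwanted(datafield):
    """True for a 710 field whose $a and $b hold the text to be removed.

    Assumption: each subfield code appears at most once in the field;
    if a code repeats, only its last occurrence is compared.
    """
    if datafield.get("tag") != "710":
        return False
    subfields = {
        sf.get("code"): (sf.text or "").strip()
        for sf in datafield.findall("marc:subfield", NS)
    }
    return (
        subfields.get("a") == "Faux College"
        and subfields.get("b") == "Special Collections"
    )


def strip_unwanted(xml_text):
    """Return the collection as a string with matching 710 fields removed."""
    root = ET.fromstring(xml_text)
    for record in root.iter(f"{{{MARC_NS}}}record"):
        # Iterate over a copy: removing children from a live list skips items.
        for datafield in list(record.findall("marc:datafield", NS)):
            if is_unwanted(datafield):
                record.remove(datafield)
    return ET.tostring(root, encoding="unicode")
```

Because the comparison is done on parsed elements, differences in whitespace, attribute order, or line wrapping between records do not matter, which is the main "if" that the string-substitution approach depends on. Other 710 fields, including ones that share only one of the two subfield values, are left untouched.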
