[CODE4LIB] convert MODS XML into CSV or tab-delimited text
Hello,

Does anyone out there have an XSL stylesheet to transform MODS XML into a CSV or tab-delimited text file? Even if it's highly localized to your own institution/project, it would probably still be useful.

Thanks in advance,

Eben English
Web Services Developer
Boston Public Library
700 Boylston St.
Boston, MA 02116
617.859.2238
eengl...@bpl.org
Re: [CODE4LIB] convert MODS XML into CSV or tab-delimited text
Given that you'll most likely have to deal with elements that are missing and/or repeat a variable number of times, conditional mappings, and data that needs to be transformed, it may be easier to use a string parsing routine to do what you need.

kyle

On Tue, Apr 22, 2014 at 11:35 AM, English, Eben eengl...@bpl.org wrote:
Does anyone out there have an XSL stylesheet to transform MODS XML into a CSV or tab-delimited text file? Even if it's highly localized to your own institution/project, it would probably still be useful.
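The missing and repeating elements mentioned in this reply can be handled in a short script. The following is a minimal sketch in Python using only the standard library, not anything posted to the list. The column-to-path mapping, the `|` separator for repeated values, and the sample records are all assumptions made up for illustration; a real project would substitute its own mapping.

```python
import csv
import io
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical column-to-path mapping; adjust to local MODS practice.
COLUMNS = {
    "title": "mods:titleInfo/mods:title",
    "creator": "mods:name/mods:namePart",
    "date": "mods:originInfo/mods:dateIssued",
    "subject": "mods:subject/mods:topic",
}


def mods_to_rows(xml_text, repeat_sep="|"):
    """Return one dict per MODS record, with one key per column."""
    root = ET.fromstring(xml_text)
    # Accept either a single <mods> root or a <modsCollection> wrapper.
    if root.tag == "{http://www.loc.gov/mods/v3}mods":
        records = [root]
    else:
        records = root.findall("mods:mods", MODS_NS)
    rows = []
    for record in records:
        row = {}
        for column, path in COLUMNS.items():
            values = [(el.text or "").strip() for el in record.findall(path, MODS_NS)]
            # Missing elements yield an empty cell; repeats are joined.
            row[column] = repeat_sep.join(v for v in values if v)
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(COLUMNS), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample data, for illustration only.
SAMPLE = """<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods>
    <titleInfo><title>Map of Boston</title></titleInfo>
    <subject><topic>Maps</topic></subject>
    <subject><topic>Boston (Mass.)</topic></subject>
  </mods>
  <mods>
    <titleInfo><title>Harbor view</title></titleInfo>
    <originInfo><dateIssued>1890</dateIssued></originInfo>
  </mods>
</modsCollection>"""

if __name__ == "__main__":
    print(rows_to_csv(mods_to_rows(SAMPLE)))
```

Joining repeated values into one cell keeps one row per record; if one row per value is preferred, the join would need to be replaced with row duplication. Conditional mappings and value transformations would go inside the per-column loop.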
Re: [CODE4LIB] convert MODS XML into CSV or tab-delimited text
The easiest approach to converting XML to flat formats is to use a tool like Oxygen or Altova MapForce. Unfortunately, MapForce Enterprise (the version you would need) runs around $1k, and Oxygen XML Editor is also pricey, although you might qualify for a discount.

You can also open XML files directly in Excel or OpenOffice, but the result might be a mess.

Cary
Re: [CODE4LIB] convert MODS XML into CSV or tab-delimited text
Try XMLSpy. It has the capability to export to text files (http://manual.altova.com/XMLSpy/spyenterprise/index.html?exporttotextfiles.htm) and should also work for XSL.

On Tue, Apr 22, 2014 at 1:37 PM, Cary Gordon listu...@chillco.com wrote:
The easiest approach to converting XML to flat formats is to use a tool like Oxygen or Altova MapForce. Unfortunately, MapForce Enterprise (the version you would need) runs around $1k, and Oxygen XML Editor is also pricey, although you might qualify for a discount. You can also open XML files directly in Excel or OpenOffice, but the result might be a mess.
Re: [CODE4LIB] convert MODS XML into CSV or tab-delimited text
There is also the XML package for R, which has tools for parsing XML and is completely free. It even has a function called xmlToDataFrame that converts an XML document to a data frame, which can then be written to .csv. From the function's documentation:

"This function can be used to extract data from an XML document (or sub-document) that has a simple, shallow structure that does appear reasonably commonly. The idea is that there is a collection of nodes which have the same fields (or a subset of common fields) which contain primitive values, i.e. numbers, strings, etc. Each node corresponds to an observation and each of its sub-elements correspond to a variable. This function then builds the corresponding data frame, using the union of the variables in the different observation nodes. This can handle the case where the nodes do not all have all of the variables."

--
Simon Brown
simoncbr...@gmail.com
simoncharlesbrown (Skype)
831.440.7466 (Phone)
*Following our will and wind we may just go where no one's been -- MJK*
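The "union of the variables" idea described in that documentation can be illustrated outside R as well. The sketch below is a rough Python analogue written for this purpose; it is not the XML package's code and makes no claim to match xmlToDataFrame's behavior in detail. The function names and the sample document are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def shallow_xml_to_rows(xml_text):
    """Treat each child of the root as an observation and each of its
    children as a variable. Returns (fieldnames, rows), where fieldnames
    is the union of variable names in first-seen order."""
    root = ET.fromstring(xml_text)
    rows, fieldnames = [], []
    for node in root:
        row = {}
        for child in node:
            name = child.tag.split("}")[-1]  # drop any "{namespace}" prefix
            # Assumption: if a variable repeats, the last value wins.
            row[name] = (child.text or "").strip()
            if name not in fieldnames:
                fieldnames.append(name)
        rows.append(row)
    return fieldnames, rows


def to_csv(fieldnames, rows):
    """Write the rows as CSV text; variables absent from a row become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample data, for illustration only.
SAMPLE = """<records>
  <record><title>Map of Boston</title><date>1850</date></record>
  <record><title>Harbor view</title><genre>Photographs</genre></record>
</records>"""

if __name__ == "__main__":
    names, rows = shallow_xml_to_rows(SAMPLE)
    print(to_csv(names, rows))
```

Note that this only suits shallow documents. MODS nests most values (for example, title inside titleInfo), so MODS records would likely need to be flattened or mapped to simple fields first.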
Re: [CODE4LIB] convert MODS XML into CSV or tab-delimited text
On Tuesday, April 22, 2014 1:36 PM, Eben English wrote:
Does anyone out there have an XSL stylesheet to transform MODS XML into a CSV or tab-delimited text file? Even if it's highly localized to your own institution/project, it would probably still be useful.

I'm not sure how well it would work, but MarcEdit [1] has a MODS=>MARC XML conversion option, and an option to Export Tab Delimited Records.

[1] http://marcedit.reeset.net/

I hope this helps,

Bryan Baldus
Senior Cataloger
Quality Books Inc.
The Best of America's Independent Presses
1-800-323-4241x402
bryan.bal...@quality-books.com
eij...@cpan.org
http://home.comcast.net/~eijabb/
Re: [CODE4LIB] convert MODS XML into CSV or tab-delimited text
LoC has XSLT stylesheets to convert MODS to DC, HTML, and MARCXML: http://www.loc.gov/standards/mods/mods-conversions.html

There are also XML to CSV XSLT scripts out there, and there's this app, which I tested on a MODS 3.0 record and it didn't look too bad: https://code.google.com/p/xml2csv-conv/

On Wed, Apr 23, 2014 at 5:04 AM, Bryan Baldus bryan.bal...@quality-books.com wrote:
I'm not sure how well it would work, but MarcEdit [1] has a MODS=>MARC XML conversion option, and an option to Export Tab Delimited Records. [1] http://marcedit.reeset.net/