Hello Andrew, since you are starting from a clean slate in handling XML files anyway, you could also take a look at XSLT processing (another off-topic area for this list). Free tools are available, and there are many examples of "XML to CSV XSLT" on StackOverflow.
HTH,
Gabriele

-----Original Message-----
On January 4, 2017 12:45:08 PM PST, Ben Tupper <btup...@bigelow.org> wrote:
>Hi,
>
>You should keep replies on the list - you never know when someone will
>swoop in with the right answer to make your life easier.
>
>Below is a simple example that uses xpath syntax to identify (and in
>this case retrieve) children that match your xpath expression. xpath
>expressions are sort of like /a/directory/structure/description so you
>can visualize elements of XML like nested folders or subdirectories.
>
>Hopefully this will get you started. A lot more on xpath here
>http://www.w3schools.com/xml/xml_xpath.asp There are other extraction
>tools in xml2 - just type ?xml2 at the command prompt to see more.
>
>Since you have more deeply nested elements you'll need to play with
>this a bit first.
>
>library(xml2)
># read the sample XML document
>uri = 'http://www.w3schools.com/xml/simple.xml'
>x = read_xml(uri)
>
># "//name" matches every <name> element, at any depth
>name_nodes = xml_find_all(x, "//name")
>name = xml_text(name_nodes)
>
>price_nodes = xml_find_all(x, "//price")
>price = xml_text(price_nodes)
>
># xml_double() converts the node text to numeric
>calories_nodes = xml_find_all(x, "//calories")
>calories = xml_double(calories_nodes)
>
># one row per item, one column per extracted field
>X = data.frame(name, price, calories, stringsAsFactors = FALSE)
>write.csv(X, file = 'foo.csv')
>
>Cheers,
>Ben
>
>> On Jan 4, 2017, at 2:13 PM, Andrew Lachance <alach...@bates.edu> wrote:
>>
>> Hello Ben,
>>
>> Thank you for the advice. I am extremely new to any sort of coding so
>> I have learned a lot already. Essentially, I was given an XML file and
>> was told to convert all of it to a csv so that it could be uploaded
>> into a database. Unfortunately the information I am working with is
>> medical information and I can't really share it. I initially tried to
>> convert it using online programs, but that produced a large number of
>> blank spaces, which were not useful for uploading into the database.
>>
>> So essentially, my goal is to parse all the data in the XML to a
>> coherent, succinct CSV that could be uploaded.
>> In the document, there are 361 patient files with 13 subcategories for
>> each patient, which further branch off to around 150 categories in
>> total. Since I am so new, I have been having a hard time seeing the
>> bigger picture or knowing if there are any intermediary steps that will
>> prevent all the blank spaces that the online conversion programs created.
>>
>> I will look through the information on the xml2 package. Any advice
>> or recommendations would be greatly appreciated as I have felt fairly
>> stuck. Once again, thank you very much for your help.
>>
>> Best,
>> Andrew
>>
>> On Tue, Jan 3, 2017 at 2:29 PM, Ben Tupper <btup...@bigelow.org> wrote:
>> Hi,
>>
>> It's hard to know what to advise - much depends upon the XML data you
>> have and what you want to extract from it. Without knowing about those
>> two things there is little anyone could do to help. Can you post
>> example data to the internet and provide the link here? Then state
>> explicitly what you want to have in hand at the end.
>>
>> If you are just starting out I suggest that you try the xml2 package
>> ( https://cran.r-project.org/web/packages/xml2/ ) rather than the XML
>> package ( https://cran.r-project.org/web/packages/XML/ ). I have been
>> using it much more since the authors added the ability to create xml
>> nodes (rather than just extracting data from existing xml nodes).
>>
>> Cheers,
>> Ben
>>
>> P.S. Hello to my niece Olivia S on the Bates EMS team.
>>
>> > On Jan 3, 2017, at 11:27 AM, Andrew Lachance <alach...@bates.edu> wrote:
>> >
>> > <http://stats.stackexchange.com/questions/254328/how-to-convert-a-large-xml-file-to-a-csv-file-using-r?noredirect=1#>
>> >
>> > I am completely new to R and have tried to use several functions
>> > within the xml packages to convert an XML to a csv and have had
>> > little success. Since I am so new, I am not sure what the necessary
>> > steps are to complete this conversion without a lot of NA.
>> >
>> > --
>> > Andrew D. Lachance
>> > Chief of Service, Bates Emergency Medical Service
>> > Residence Coordinator, Hopkins House
>> > Bates College Class of 2017
>> > alach...@bates.edu
>> > (207) 620-4854
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>> Ben Tupper
>> Bigelow Laboratory for Ocean Sciences
>> 60 Bigelow Drive, P.O. Box 380
>> East Boothbay, Maine 04544
>> http://www.bigelow.org
>>
>> --
>> Andrew D. Lachance
>> Chief of Service, Bates Emergency Medical Service
>> Residence Coordinator, Hopkins House
>> Bates College Class of 2017
>> alach...@bates.edu
>> (207) 620-4854
>
>Ben Tupper
>Bigelow Laboratory for Ocean Sciences
>60 Bigelow Drive, P.O. Box 380
>East Boothbay, Maine 04544
>http://www.bigelow.org
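
For the more deeply nested case discussed in the thread (one record per patient, with some fields missing for some patients), here is a minimal sketch of one way to avoid misaligned or blank-filled columns with xml2. The element names (patients, patient, vitals, pulse, temp) and the output file name are invented for illustration, since the real file cannot be shared; the idea is to select the patient nodes once and then use xml_find_first(), which gives back exactly one result per patient and yields NA where a field is absent.

```r
library(xml2)

# Hypothetical stand-in for the real file: all element names are invented.
doc <- read_xml('
<patients>
  <patient id="1">
    <name>Alice</name>
    <vitals><pulse>72</pulse><temp>36.8</temp></vitals>
  </patient>
  <patient id="2">
    <name>Bob</name>
    <vitals><pulse>80</pulse></vitals>
  </patient>
</patients>')

# Select the repeating record first: one node per patient.
patients <- xml_find_all(doc, "//patient")

# xml_find_first() returns one result per patient (a missing node where
# nothing matches), so every column has exactly one entry per patient
# and absent fields become NA instead of shifting the rows.
X <- data.frame(
  id    = xml_attr(patients, "id"),
  name  = xml_text(xml_find_first(patients, "./name")),
  pulse = as.numeric(xml_text(xml_find_first(patients, "./vitals/pulse"))),
  temp  = as.numeric(xml_text(xml_find_first(patients, "./vitals/temp"))),
  stringsAsFactors = FALSE
)

# na = "" writes missing values as empty cells rather than the string "NA".
write.csv(X, file = "patients.csv", row.names = FALSE, na = "")
```

Each further nested field would get its own relative xpath in the same way; whether an empty cell or "NA" is preferable in the CSV depends on what the target database expects.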