Hello,

R is good at handling XML, but in this case I would rather do the first
step with an XSLT transformation, e.g. with Saxon, possibly to a CSV
file.
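
If you would rather stay inside R, the flattening can be roughly sketched as below. This is only a sketch under assumptions, not a tested solution for the SEC filing: the tiny inline XML string and the helper name flatten_xml_doc are hypothetical stand-ins, and the real document has attributes and empty elements that would need extra care (unlist() drops NULL entries, and xmlToList() adds ".attrs" entries). The idea is to turn the nested list into a named character vector, number repeated names sequentially, and build a one-row data frame.

```r
library(XML)

# Hypothetical miniature stand-in for the real filing (which is much larger).
xml_text <- paste0(
  "<ownershipDocument>",
  "<issuer><issuerName>ACME</issuerName></issuer>",
  "<nonDerivativeTable>",
  "<nonDerivativeTransaction>",
  "<transactionPricePerShare><value>10.5</value></transactionPricePerShare>",
  "</nonDerivativeTransaction>",
  "<nonDerivativeTransaction>",
  "<transactionPricePerShare><value>11.0</value></transactionPricePerShare>",
  "</nonDerivativeTransaction>",
  "</nonDerivativeTable>",
  "</ownershipDocument>"
)

# Hypothetical helper: one parsed XML document in, one-row data frame out.
flatten_xml_doc <- function(doc) {
  # Nested list -> named character vector; nested names are joined with "."
  flat <- unlist(xmlToList(doc))
  nm <- names(flat)

  # Number repeated field names sequentially: name_1, name_2, ...
  repeated <- nm %in% nm[duplicated(nm)]
  position <- ave(seq_along(nm), nm, FUN = seq_along)
  nm[repeated] <- paste0(nm[repeated], "_", position[repeated])
  names(flat) <- nm

  # One column per field, one row per document
  as.data.frame(as.list(flat), stringsAsFactors = FALSE)
}

doc <- xmlParse(xml_text, asText = TRUE)
one_row <- flatten_xml_doc(doc)
print(names(one_row))
```

The resulting data frame could then be written out with write.csv(), or passed to a database package, if that is the eventual goal. Combining rows from several documents would need extra handling for fields that are missing in some of them.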

HTH,
Gabriele


-----Original Message-----
From: Anika Masters [mailto:anika.mast...@gmail.com] 
Sent: Tuesday, January 29, 2013 3:01 AM
To: r-help@r-project.org
Subject: [R] converting XML document to table or dataframe

I am relatively new to R, and I am trying to learn more about
converting data in an XML document into a "2-dimensional format" such as a
table or array.  I might eventually wish to export this data into a
relational (SQL) database, and/or to work with this data within R.

My sample XML document is located at
"http://www.sec.gov/Archives/edgar/data/743988/000124636013000561/form.xml"

I have successfully imported the XML document and then converted it
to a list.

I am "stuck" trying to convert the document into a "2-dimensional" table
or dataframe.

What is a "good" way to convert the XML document to a 2-dimensional
table or data.frame?  Ideally, I'd like a table with 1 row for each XML
document, and unique fieldnames.  If fieldnames repeat, I'd like the
names to be numbered sequentially, e.g.:

$nonDerivativeTable$nonDerivativeTransaction$transactionAmounts$transactionPricePerShare$value_1
$nonDerivativeTable$nonDerivativeTransaction$transactionAmounts$transactionPricePerShare$value_2
$nonDerivativeTable$nonDerivativeTransaction$transactionAmounts$transactionPricePerShare$value_3

etc




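library(XML)

# Parse the filing; the result is stored as mydoc, the name used below
mydoc <- xmlParse("http://www.sec.gov/Archives/edgar/data/743988/000124636013000561/form.xml")
mylist <- xmlToList(mydoc)

# Attempts to get a 2-dimensional structure (this is where I am stuck)
mydf <- xmlToDataFrame(mydoc)
mydf2 <- data.frame(mylist)
mytable <- as.table(mylist)
mydf2 <- data.frame(mydoc)
mytable <- as.table(mydoc)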
