Ryan, it sounds like you simply want to pull two relational tables into drupal using referenced entities in a one (question) to many (answers) relationship?

This can be accomplished, albeit unintuitively (it is Drupal afterall), by using the Feeds CSV parser with Feeds Tamper, and I would skip what seems like an unnecessary conversion to XML step.

First, you import your questions and set a GUID target (typically your db id). Then import your answers -- each should have a reference to the id mapped using the "Entity Reference by Feeds GUID". I have done it in reverse order too, in which case you'd import all your answers first. In this case, your second (questions) import needs to contain a single column of delimited answer GUIDs into a single column. Use the Feeds Tamper explode plugin on that field during mapping as explained (rather vaguely) here:

http://drupal.stackexchange.com/questions/32234/how-to-use-feeds-module-to-import-multi-value-fields

If that's not what you're trying to do, can you clarify?

-Shaun

On 6/16/14 3:35 PM, Joshua Welker wrote:
Sorry, the last line got messed up by outlook.

#now save the whole thing as an xml file

with open('myfile.xml', 'wb') as file
  ElementTree(rootNode).write(file)


Josh Welker


-----Original Message-----
From: Joshua Welker [mailto:[email protected]]
Sent: Monday, June 16, 2014 2:32 PM
To: Code for Libraries
Subject: RE: [CODE4LIB] Excel to XML (for a Drupal Feeds import)

This should be quite doable in most programming languages with
out-of-the-box tools and no tricky parsing code. The gist is to save in
Excel as a delimited text file (tab is a good choice), then have your script
ingest the document and turn it into an array, and then turn the array into
XML. In Python, it could be something like the code below (not tested but
the principles should be sound):

import 'csv'
from elementtree.ElementTree import Element, SubElement

#create a list
mylist = []

#open your delimited file with a csv reader with open('myfile.txt', 'rb') as
textfile:
   reader = csv.reader( textfile, delimiter='\t', quotechat='"')     #this
assumes your file is tab-delimited (\t)

   #loop through rows in your file and save each row as a key/value pair
(dictionary)
   for row in textfile:
     fields = {
       'field1': row[0]
       'field2': row[1]
       'field3': row[2]
       'field4': row[3]
     }

     #append this row to our master list
     mylist.append( fields )


#create an xml root node
rootNode = Element("XmlRoot")

#loop through our list of "rows" from the text file and create xml nodes for
row in mylist:
   rowNode = Element("record")

   #loop through all the fields on this "row" and turn them into xml nodes
   for fieldName, fieldValue in row:
     fieldNode = Element(fieldName)
     fieldNode.text = fieldValue

     #append each field node to the parent row node
     rowNode.append(fieldNode)

   #append each row node to the document root node
   rootNode.append(rowNode)

#now save the whole thing as an xml file with open('myfile.xml', 'wb') as
file
  ElementTree(rootNode).write(file)



Josh Welker

-----Original Message-----
From: Code for Libraries [mailto:[email protected]] On Behalf Of Kyle
Banerjee
Sent: Monday, June 16, 2014 1:04 PM
To: [email protected]
Subject: Re: [CODE4LIB] Excel to XML (for a Drupal Feeds import)

I'd just do this the old fashioned way. Awk is great for problems like this.
For example, if your file is tab delimited, the following should work

awk '{FS="\t"}{if ($2 != "") question = $2;}{print $1,question,$3}''
yourfile

In the example above, I just print the fields but you could easily encase
them in tags.

kyle


On Mon, Jun 16, 2014 at 9:29 AM, Ryan Engel <[email protected]> wrote:

Thanks for the responses, on the list and off, so far.

As I'm sure is true for so many of us, my interest in learning more
about how to solve this type of problem is balanced against my need to
just get the project done so I can move on to other things.  One of
the great things about this list is the ability to learn from the
collective experiences of colleagues.  For this project specifically,
even clues about better search terms is useful; as Chris Gray pointed
out, basic Google searches present too many hits.

I did try following the "Create an XML data file and XML schema file
from worksheet data" instructions on the Microsoft site.  And it did
produce an XML document, but it wasn't able to transform this:
Row1    Question1    Q1Answer1
Row2                        Q1Answer2

...into something like this:
<row1>Row One Data</row1>
<question1>This is a question</question1> <answers> <q1answer1>Answer
1</q1answer1> <q1answer2>Answer2</q1answer2> </answers

Instead, I could get it to either convert every row into its own XML
entry, meaning that I had a lot of answers with no associated
questions, or I got an XML file that had 1 question with EVERY SINGLE
answer nested beneath it -- effectively all questions after the first
question were ignored.  Based on those results, I wasn't sure if there
is more tweaking I could do in Excel, or if there is some programmed
logic in Excel that can't be accounted for when associating a schema.


Another suggestion I received was to "fill" the question column so
that every row had a question listed.  I did consider this, but the
problem then is during the data import, I'd have to convince my CMS to
put all the answers back together based on the question, something I'm
sure Drupal COULD do, but I'm not sure how to do that either.


Finally, this project is a spreadsheet with 225,270 rows, so you can
imagine why I'd like a process that is reasonably trustworthy AND that
can run locally.


Anyway, any/all additional suggestions appreciated, even if they are
"try searching for "blah blah python parser", or "I made something
that solves a similar process, and you can download it from Git".

Ryan
___

Ryan Engel
Web Stuff
UW-Madison

Dana Pearson <mailto:[email protected]> June 13, 2014 at 7:14
PM I don't use Excel but a client did who wanted to use XSL I had
created ONIX to MARC to transform bibliographic metadata in Excel to
XML. The built in Excel XML converter was not very helpful since
empty cells were skipped so that it was impossible to use that
result.

There is an add on that allow you to map your data to XML elements by
creating a schema which is pretty cool.

http://bit.ly/1jpwtqM

This might be helpful.

regards,
dana





Terry Brady <mailto:[email protected]> June 13, 2014 at 6:53 PM
The current version of Excel offers a save as XML option.

It will produce something like this. There is other wrapping
metadata, but the table is pretty easy to parse.

<Table ss:ExpandedColumnCount="3" ss:ExpandedRowCount="7"
x:FullColumns="1"
x:FullRows="1" ss:DefaultRowHeight="15"> <Row> <Cell
ss:StyleID="s62"><Data ss:Type="String">row 1</Data></Cell>
<Cell><Data ss:Type="String">question 1</Data></Cell> <Cell><Data
ss:Type="String">answer 1</Data></Cell> </Row> <Row> <Cell
ss:StyleID="s62"><Data ss:Type="String">row 2</Data></Cell> <Cell
ss:Index="3"><Data ss:Type="String">answer 2</Data></Cell> </Row>
<Row> <Cell ss:StyleID="s62"><Data ss:Type="String">row
3</Data></Cell> <Cell ss:Index="3"><Data ss:Type="String">answer
3</Data></Cell> </Row> <Row> <Cell ss:StyleID="s62"><Data
ss:Type="String">row 4</Data></Cell> <Cell><Data
ss:Type="String">question 2</Data></Cell> <Cell><Data
ss:Type="String">answer 1</Data></Cell> </Row> <Row> <Cell
ss:StyleID="s62"><Data ss:Type="String">row 5 </Data></Cell> <Cell
ss:Index="3"><Data ss:Type="String">answer 2</Data></Cell> </Row>
<Row> <Cell ss:StyleID="s62"><Data ss:Type="String">row
6</Data></Cell> <Cell><Data ss:Type="String">quest </Data></Cell>
<Cell><Data ss:Type="String">answer 3</Data></Cell> </Row> <Row>
<Cell ss:StyleID="s62"/> </Row> </Table>





Ryan Engel <mailto:[email protected]>
June 13, 2014 at 4:28 PM
Hello -

I have an Excel spreadsheet that, for the purposes of an easy import
into a Drupal site, I'd like to convert to XML.  I know people more
knowledgeable than I could code up something in Python or Perl to
convert a CSV version of the data to XML (and I have a colleague who
offered to do just that for me), but I am looking for recommendations
for something more immediately accessible.

Here's an idea of how the spreadsheet is structured:

Row1    Question1    Q1Answer1
Row2                        Q1Answer2
Row3                        Q1Answer3
Row4    Question2    Q2Answer1
Row5                        Q2Answer2
Row6    Question3    Q3Answer1
etc.

How do other people approach this?  Import the data to an SQL
database, write some clever queries, and then export that to XML?
Work some wizardry in GoogleRefine/OpenRefine?  Are scripting
languages really the best all around solution?  Excel's built in XML
mapping function wasn't able to process the one-to-many relationship
of questions to answers, though maybe I just don't know how to build
the mapping structure correctly.

In the interest immanent deadlines, I have handed the spreadsheet off
to my Perl-writing colleague.  But as a professional growth
opportunity, I'm interested in suggestions from Libraryland about
ways others have approached this successfully.

Thanks!

Ryan Engel
Web Stuff
UW-Madison


--




--
Shaun Ellis
User Interface Developer, Digital Initiatives
Princeton University Library
609.258.1698

Reply via email to