Re: [Dataperf] Example of Merging DP data into MS Word

Brian Hancock Mon, 16 Jun 2008 20:05:42 -0700

Hi Piotr and others,

This is the penultimate part of my posting about merging DP XML data into Word 
2003.  This part shows how to create the Word template document and how to 
create the seed document.

If you have followed the first two parts you should now have data from DP
(DPNews.doc) and have loaded the DPNews Schema into Word. You should also have
opened an existing document or created a new one, and chosen to use the DPNews
Schema with this document. Remember to select the option to allow saving the
document even if it is not valid.

You should have a Task Pane visible showing the XML Structure, and have ticked
to show XML tags in the document and, to make life a little easier, ticked the
"List only child elements of the current element" option.

You can create the document as you would a normal document. The original sample
I created is quite complex, so you might want to take baby steps and create a
very simple document, without headers, footers, text boxes etc. These just add
a little extra complexity where you have to think about where you are in the
XML structure.

In the XML Structure pane of the Task Pane, you should be able to see a list
of the elements in the document. Even though you have ticked to only show
child elements, you should see all of the elements. Click the root element,
i.e. the "dpnews" element tag. A popup box will ask you whether to apply this
to a section or to the entire document. Apply it to the entire document.

This should surround your text with the dpnews tags. In general (unless you
have good reason otherwise) all your content should go in between these tags.
If your cursor is in between these tags, the Task Pane should update to only
show the child elements. If you want to include the month at a particular
point, move the cursor to the insertion point and then click the "month"
element in the Task Pane.

I must say that I generally would create a sample document as my template, and
then just add the XML at the required places. But you can do it the other way
around, by adding tags, text and formatting as you go.

One thing to think about is that some elements contain the data you want to
use, and some contain other elements. Think of this tree of data as being
either branches (containers) or leaves (the data). Remember also that you can
use the same element multiple times in the document; however, if you need to
do this, you will have to hand edit the final file, as Word thinks you want to
use the next instance of that element in the XML data source. (This can be a
wonderful thing, as you can have different formatting for each item.) For
example, if I had 4 articles in my newsletter and I wanted the lead story to
have different formatting from the rest, I would add the Article element
twice. In the first one I would set the formatting for the lead article, and
the second would set the formatting for the subsequent articles.
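
For anyone who wants to see the branch/leaf idea outside Word, here is a
minimal Python sketch using only the standard library. The sample data is
hypothetical (the real DPNews file will differ, and a real file would normally
carry a namespace, which is ignored here). An element with child elements is
treated as a container, and one without as a leaf.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names follow the posting, content is made up.
SAMPLE = """
<dpnews>
  <month>June</month>
  <article>
    <heading>Lead story</heading>
  </article>
  <article>
    <heading>Second story</heading>
  </article>
</dpnews>
"""

def classify(element, path=""):
    """Return (path, kind) pairs, where kind is 'container' or 'leaf'."""
    here = f"{path}/{element.tag}"
    children = list(element)
    if not children:
        return [(here, "leaf")]
    result = [(here, "container")]
    for child in children:
        result.extend(classify(child, here))
    return result

kinds = classify(ET.fromstring(SAMPLE))
for path, kind in kinds:
    print(f"{kind:9} {path}")
```

Here dpnews and article come out as containers, while month and heading come
out as leaves.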

Suppose, for example, you wanted to include a table of data. Let's say you
wanted the customer information; my DPNews.doc XML file also includes contact
name, phone number, email and fax. The easiest way to do this is to create a
row or two of the table structure and then select the entire row before
applying the "customer" element from the Task Pane. (This step is omitted from
the Microsoft documentation, although it shows a diagram which can only have
been created that way.) If you do not select the row to start off with, and
instead put the customer information in a single cell, and then add the
contactname, phone and other fields inside each instance, it will look like it
should work. However, Word will think you wish to use the 2nd, 3rd and 4th
instances etc. of the "customer" element from the data file, and depending on
the file you might get a result, but it might be quite jumbled up.

When you place your cursor inside a container element, you are setting what is
called the current CONTEXT. If you think of data as a hierarchical tree, like a
directory structure, then placing your cursor inside an element is like setting
that as the current directory. The way data is referred to in the resulting
template file is using a technology called XPath. To specify a particular piece
of data in the file, Word will create an XPath expression to find it. For very
simple Word documents you will not even need to think about the XPath, and in
the Newsletter sample, I didn't really need to think about the XPath. The way
Word shows the structure in the Task Pane sorts out most of that for you.
There are times when you might need to think about the XPath, so I will show
you a couple of examples.

"/dpnews" refers to the root node called dpnews. This XPath expression would
refer to all the content of the document encased in the "dpnews" element, even
including the dpnews tag.

If I was working inside this context, then the XPath expression "month" would
refer to the month element visible at this context.

The expression "/dpnews/month" would refer to the same element but is
independent of the context, as it specifies the path from the root.

The expression "/dpnews/article[1]" refers to the first article in the
DPNews.doc file.

The expression "/dpnews/article[2]/heading/text()" would refer to the heading
text of the second article. The text() node test selects only the text, so the
tag itself is not returned.

The expression "/dpnews/article" refers to the set of articles, i.e. all the
articles and not a specific one.
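
The expressions above can be tried out with a short Python sketch. This is only
an illustration under assumptions: the sample data is made up, namespaces are
ignored, and Python's xml.etree.ElementTree supports only a subset of XPath. In
particular it evaluates paths relative to a context element, so the dpnews root
is used as the context and the leading "/dpnews" is dropped from each path.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real DPNews file will differ.
SAMPLE = """<dpnews>
  <month>June</month>
  <article><heading>Lead story</heading></article>
  <article><heading>Second story</heading></article>
</dpnews>"""

# The context is the dpnews root element, i.e. /dpnews.
root = ET.fromstring(SAMPLE)

# "month", relative to the current context.
month = root.find("month").text

# /dpnews/article[1]: the first article element.
first_article = root.find("article[1]")

# /dpnews/article[2]/heading/text(): .text plays the role of text() here.
second_heading = root.find("article[2]/heading").text

# /dpnews/article: the whole set of articles.
all_articles = root.findall("article")

print(month, second_heading, len(all_articles))
```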

In Word, when you use an element more than once in the template, it adds the
following XPath expressions:
"/dpnews/article[position() = 1]" for the first instance, and
"/dpnews/article[position() > 1]" for all the other instances.
"/dpnews/article[position() = last()]" refers to just the last article.
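
To see which articles each of these position tests picks out, here is a small
Python sketch on made-up data. Note the assumptions: this is not what Word
generates, and Python's xml.etree.ElementTree has no position() comparisons, so
list slicing stands in for "position() = 1" and "position() > 1"; only
"[last()]" is used directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data with three articles.
SAMPLE = """<dpnews>
  <article><heading>Lead story</heading></article>
  <article><heading>Second story</heading></article>
  <article><heading>Third story</heading></article>
</dpnews>"""

root = ET.fromstring(SAMPLE)
articles = root.findall("article")

# Equivalent of /dpnews/article[position() = 1]
lead = [a.findtext("heading") for a in articles[:1]]

# Equivalent of /dpnews/article[position() > 1]
rest = [a.findtext("heading") for a in articles[1:]]

# Equivalent of /dpnews/article[position() = last()]
last = root.find("article[last()]").findtext("heading")

print("lead:", lead)
print("rest:", rest)
print("last:", last)
```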

The bit inside the square brackets is called a PREDICATE, and it allows a range
of selection criteria to be included, even conditional ones. For example, in a
statement of outstanding invoice balances you might have an XPath expression
"/statement/customer/invoice[balance/text() > 0]" to pick out the invoices that
are unpaid, so that you could apply different formatting or include different
text.
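
As a rough illustration of what that predicate selects, here is a Python sketch
on made-up statement data. The "number" element is a hypothetical addition so
the invoices can be told apart, and because xml.etree.ElementTree cannot
evaluate a numeric comparison inside a predicate, the "balance > 0" test is
done in plain Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical statement data; "number" is an invented element for illustration.
SAMPLE = """<statement>
  <customer>
    <invoice><number>1001</number><balance>0</balance></invoice>
    <invoice><number>1002</number><balance>125.50</balance></invoice>
    <invoice><number>1003</number><balance>40</balance></invoice>
  </customer>
</statement>"""

root = ET.fromstring(SAMPLE)

# Stand-in for /statement/customer/invoice[balance/text() > 0]
unpaid = [
    invoice.findtext("number")
    for invoice in root.findall("customer/invoice")
    if float(invoice.findtext("balance")) > 0
]

print(unpaid)
```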

Although hand coding these XPath statements is probably jumping ahead too
much, it shows the level of sophistication that can be applied to merging XML
data into a document.

When you save the Template document, you need to save it as XML, and you must
remember to uncheck the Save Data Only and the Apply Transform checkboxes.

But back to more mundane things. The template and markup you create is referred
to by Microsoft as a Seed document. It is not ready yet, but it is only a
couple of steps away.

To get there you need the Microsoft WordProcessingML Transformation Inference
Tool, which you can get as a free download as part of the Word 2003 SDK, or
directly from this link:
http://www.microsoft.com/downloads/details.aspx?familyid=2cb5b04e-61d9-4f16-9b18-223ec626080e&displaylang=en

When installed you will have a command-line utility, WML2XSLT, which is used to
create an XSLT file from your seed document. (I added the path to this utility
to the Windows environment Path so I can access it from any folder.)

To use this utility, open up a command-line window (DOS window), and then run
the following:

wml2xslt SeedDocument.XML -db

You will be asked which namespace to process; for my example you would check
the "DPNews" namespace. If everything goes well, it will create a file with the
same name as your seed document but with the extension .XSL. This is the file
that you can use to transform your data, as in the example right back at Part 1.

By the way, the -db option to wml2xslt is needed when you have a mixture of
leaf and container objects. A simple flat XML file (imitating a typical
mailmerge file) would not need it, as other than the root there would be no
other container objects.
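
If you are unsure which kind of file you have, a quick check along these lines
may help. This is a minimal Python sketch that simply follows the description
above (it is not taken from the tool's documentation): the helper name and both
sample files are hypothetical, and a file counts as "nested" if any element
below the root has child elements of its own.

```python
import xml.etree.ElementTree as ET

def has_nested_containers(xml_text):
    """True if any direct child of the root itself contains child elements."""
    root = ET.fromstring(xml_text)
    return any(len(child) > 0 for child in root)

# Hypothetical flat, mailmerge-style data: the root is the only container.
FLAT = "<letter><name>A. Customer</name><city>Sydney</city></letter>"

# Hypothetical nested data: article is a container below the root.
NESTED = """<dpnews>
  <month>June</month>
  <article><heading>Lead story</heading></article>
</dpnews>"""

print("flat file has nested containers:", has_nested_containers(FLAT))
print("nested file has nested containers:", has_nested_containers(NESTED))
```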

Generally this is all that is needed for simple Word merging, and although it
looks like there are many steps, after you understand the concepts it only
takes a short while to create simple XML merge templates. The DP Newsletter
example only took about an hour or so to both plan and write.

The next posting, which will be the last, will give some other ideas to use.

Regards
Brian

_______________________________________________
Dataperf mailing list
[email protected]
http://lists.dataperfect.nl/mailman/listinfo/dataperf
