Hi Piotr and others,

This is the penultimate part of my posting about merging DP XML data into Word 
2003.  This part shows how to create the Word template document and how to 
create the seed document.

If you have followed the first two parts, you should now have a data file from 
DP, DPNews.doc, and have loaded the DPNews Schema into Word. You should also 
have opened an existing document or created a new one, and chosen to use the 
DPNews Schema with this document. Remember to select the option to allow saving 
the document even if it is not valid.

You should have a Task Pane visible showing the XML Structure, and have ticked 
the option to show XML tags in the document and, to make life a little easier, 
ticked "List only child elements of the current element".

You can create the document as you would a normal document. The original sample 
I created is quite complex, so you might want to take baby steps and create a 
very simple document, without headers, footers, text boxes etc. These just add 
a little extra complexity where you have to think about where you are in the 
XML structure.

In the XML Structure pane, of the Task Pane, you should be able to see a list 
of the elements in the document.  Even though you have ticked to only show 
child elements you should see all of the elements. Click the root element, i.e. 
the "dpnews" element tag. A popup box will ask you to apply this to a section 
or the entire document. Apply it to the entire document. 

This should surround your text with the dpnews tags. In general (unless you 
have good reason otherwise) all your content should go in between these tags. 
If your cursor is in between these tags, the Task Pane should update to only 
show the child elements. If you want to include the month at a particular 
point, move the cursor to the insertion point and then click the "month" 
element in the task pane. 

I must say that I generally would create a sample document as my template, and 
then just add the XML at the required places. But you can do it the other way 
around, by adding tags, text and formatting as you go. 

One thing to think about is that some elements contain the data you want to 
use, and some contain other elements. Think of this tree of data as being made 
up of branches (containers) and leaves (the data). Remember also that you can 
use the same element multiple times in the document; however, if you need to do 
this, you will have to hand edit the final file, as Word thinks you want to use 
the next instance of that element in the XML data source. (This can be a 
wonderful thing, as you can have different formatting for each item. For 
example, if I had 4 articles in my newsletter and I wanted the lead story to 
have different formatting from the rest, I would add the Article element twice. 
On the first one I would set the formatting for the lead article, and on the 
second I would set the formatting for the subsequent articles.)
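
To make the branch and leaf idea concrete, here is a minimal sketch in Python 
(not part of the Word workflow itself). The sample XML is hypothetical: only 
the element names dpnews, month, article and heading come from this posting, 
while "body" and all the values are made up, and the namespace is left out for 
brevity. An element with child elements is treated as a container, anything 
else as a leaf.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: "body" and all values are invented for illustration.
SAMPLE = """
<dpnews>
  <month>March</month>
  <article>
    <heading>Lead story</heading>
    <body>First article text</body>
  </article>
  <article>
    <heading>Second story</heading>
    <body>Second article text</body>
  </article>
</dpnews>
"""

def classify(element):
    """Return 'container' if the element has child elements, else 'leaf'."""
    return "container" if len(element) > 0 else "leaf"

root = ET.fromstring(SAMPLE)
kinds = {}
for element in root.iter():
    # Repeated tags (the two articles) just store the same kind again.
    kinds[element.tag] = classify(element)

for tag, kind in kinds.items():
    print(f"{tag}: {kind}")
```

In this sample dpnews and article come out as containers, and month, heading 
and body as leaves.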

Suppose, for example, you wanted to include a table of data, say the customer 
information (my DPNews.doc XML file also includes contact name, phone number, 
email and fax). The easiest way to do this is to create a row or two of the 
table structure and then select the entire row before applying the "customer" 
element from the Task Pane. (This step is omitted from the Microsoft 
documentation, although it shows a diagram which can only have been created 
that way.) If you do not select the row to start off with, and instead put the 
customer information in a single cell, and then add the contactname, phone and 
other fields inside each instance, it will look like it should work. However, 
Word will think you wish to use the 2nd, 3rd and 4th instances etc. of the 
"customer" element from the data file, and depending on the file you might get 
a result, but it might be quite jumbled up.
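
The effect you are after, one table row per "customer" instance, can be 
sketched outside Word in a few lines of Python. This is only an illustration 
under assumptions: the sample data is invented, and I am assuming "customer" 
sits directly under "dpnews" with "contactname" and "phone" as leaf elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical data; the real DPNews file may be structured differently.
SAMPLE = """
<dpnews>
  <customer>
    <contactname>Ann Example</contactname>
    <phone>555-0100</phone>
  </customer>
  <customer>
    <contactname>Bob Example</contactname>
    <phone>555-0101</phone>
  </customer>
</dpnews>
"""

root = ET.fromstring(SAMPLE)

# Each customer element becomes exactly one row; the leaf values fill the cells.
rows = [
    (customer.findtext("contactname"), customer.findtext("phone"))
    for customer in root.findall("customer")
]

for name, phone in rows:
    print(f"{name:<15}{phone}")
```

Tagging the whole row as "customer" corresponds to the loop over customers; 
tagging the cells corresponds to picking the leaf values inside each one.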

When you place your cursor inside a container element, you are setting what is 
called the current CONTEXT. If you think of data as a hierarchical tree, like a 
directory structure, then placing your cursor inside an element is like setting 
that as the current directory.  The way data is referred to in the resulting 
template file is using a technology called XPath.  To specify a particular 
piece of data in the file, Word will create an XPath expression to find it. For 
very simple Word documents you will not even need to think about the XPath, and 
in the Newsletter sample, I didn't really need to think about the XPath. The 
way Word shows the structure in the Task Pane sorts out most of that for you.  
There are times when you might need to think about the XPath, so I will show 
you a couple of examples.

"/dpnews" refers to the root node called dpnews. This XPath expression would 
refer to all the content of the document encased in the "dpnews" element, even 
including the dpnews tag.

If I was working inside this context, then the XPath expression "month" would 
refer to the month element visible at this context.

The expression "/dpnews/month" would refer to the same element but is 
independent of the context, as it specifies the path from the root.

The expression "/dpnews/article[1]" refers to the first article in the 
DPNews.doc file

The expression "/dpnews/article[2]/heading/text()" would refer to the heading 
text of the second article. The text() function means that the tag itself is 
not returned.

The expression "/dpnews/article" refers to the set of articles, i.e. all the 
articles and not a specific one.
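
If you want to try expressions like these outside Word, here is a minimal 
sketch using Python's standard xml.etree.ElementTree module. Two caveats: the 
sample data is hypothetical (made-up values, no namespace), and ElementTree 
only supports a subset of XPath. It evaluates paths relative to a context 
element rather than from "/", and it has no text() function, so the root 
element plays the part of the dpnews context and the .text attribute stands in 
for text().

```python
import xml.etree.ElementTree as ET

# Hypothetical data: values are invented for illustration.
SAMPLE = """
<dpnews>
  <month>March</month>
  <article>
    <heading>Lead story</heading>
  </article>
  <article>
    <heading>Second story</heading>
  </article>
</dpnews>
"""

# root is the dpnews element, so it acts as the current context.
root = ET.fromstring(SAMPLE)

# "month" evaluated in the dpnews context.
month = root.find("month")

# "article[1]" is the first article (XPath positions start at 1).
first_article = root.find("article[1]")

# "article[2]/heading" is the heading element of the second article.
second_heading = root.find("article[2]/heading")

# "article" with findall() returns the whole set of articles.
all_articles = root.findall("article")

print(month.text)
print(first_article.find("heading").text)
print(second_heading.text)  # .text gives the content without the tag
print(len(all_articles))
```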

In Word, when you use an element more than once in the template, it adds the 
following XPath expressions:
"/dpnews/article[position()=1]" for the first instance, and
"/dpnews/article[position() > 1]" for all the other instances.
Similarly, "/dpnews/article[position() = last()]" refers to just the last 
article.
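
Here is a rough Python sketch of what those three position tests select, 
using the 4 article newsletter mentioned earlier. The headings are made up. 
Python's ElementTree does not understand position() comparisons, so the first 
two are imitated with list slices; it does support [last()] directly.

```python
import xml.etree.ElementTree as ET

# Build a hypothetical newsletter with four articles.
headings = ["Lead story", "Second story", "Third story", "Fourth story"]
root = ET.Element("dpnews")
for text in headings:
    article = ET.SubElement(root, "article")
    ET.SubElement(article, "heading").text = text

articles = root.findall("article")

lead = articles[:1]                   # like article[position() = 1]
rest = articles[1:]                   # like article[position() > 1]
last = root.find("article[last()]")   # like article[position() = last()]

print([a.findtext("heading") for a in lead])
print([a.findtext("heading") for a in rest])
print(last.findtext("heading"))
```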

The bit inside the square brackets is called a PREDICATE, and it allows a range 
of selection criteria to be included, even conditional ones. For example, in a 
statement of outstanding invoice balances you might have an XPath expression 
"/statement/customer/invoice[balance/text() > 0]" to select the invoices that 
are unpaid, so that you could apply different formatting or include different 
text. 
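
As an illustration of what that predicate selects, here is a small Python 
sketch. The statement data is entirely hypothetical, including the "number" 
element. ElementTree cannot evaluate a numeric comparison inside a predicate, 
so the balance > 0 test is done in Python instead; a full XPath 1.0 engine 
would do it inside the expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical statement data, invented for illustration.
SAMPLE = """
<statement>
  <customer>
    <invoice><number>1001</number><balance>0</balance></invoice>
    <invoice><number>1002</number><balance>125.50</balance></invoice>
    <invoice><number>1003</number><balance>40</balance></invoice>
  </customer>
</statement>
"""

root = ET.fromstring(SAMPLE)

# Equivalent in effect to /statement/customer/invoice[balance/text() > 0]:
# keep only the invoices whose balance is greater than zero.
unpaid = [
    invoice
    for invoice in root.findall("customer/invoice")
    if float(invoice.findtext("balance")) > 0
]

unpaid_numbers = [invoice.findtext("number") for invoice in unpaid]
print(unpaid_numbers)
```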

Although handcoding to add these XPath statements is probably jumping ahead too 
much, it shows the level of sophistication that can be applied to merging XML 
data into a document.

When you save the Template document, you need to save it as XML, and you must 
remember to uncheck the Save Data Only and the Apply Transform checkboxes.

But back to more mundane things. The template and markup you create is referred 
to by Microsoft as a Seed document. It is not ready yet, but it is only a 
couple of steps away.

The next step uses the Microsoft WordProcessingML Transformation Inference 
Tool, which you can get as a free download as part of the Word 2003 SDK, or 
directly from this link:
http://www.microsoft.com/downloads/details.aspx?familyid=2cb5b04e-61d9-4f16-9b18-223ec626080e&displaylang=en

When installed you will have a command-line utility, WML2XSLT, which is used to 
create an XSLT file from your seed document. (I added a path to this utility in 
the Windows environment Path so I can access it from any folder).  

To use this utility open up a Command line window (DOS window), and then run 
the following

wml2xslt SeedDocument.XML -db  

You will be asked which namespace to process; for my example you would check 
the "DPNews" namespace. If everything goes well, it will create a file with the 
same name as your SeedDocument but with the extension .XSL. This is the file 
that you can use to transform your data, as in the example right back at Part 1.

By the way, the -db option to wml2xslt is needed when you have a mixture of 
leaf and container objects. A simple flat XML file (imitating a typical 
mailmerge file) would not need it, as other than the root there would be no 
other container objects.

Generally this is all that is needed for a simple Word merging, and although it 
looks like there are many steps, after you understand the concepts it only 
takes a short while to create simple XML merge templates. The DP Newsletter 
example only took about an hour or so, to both plan and write.

The next posting, which will be the last, will give some other ideas to use.

Regards
Brian









