Hi Piotr,
In my previous posting I showed an example of merging DP with XML data. In this
post I will describe the XML data structures and how to prepare them for use
with Word 2003+.
My DP application has two main panels:
Customers
Articles
It also has one other dummy panel with one record, which I always use for
housing reports. I call it Configuration for want of a better name.
Next I mapped out a rough idea of the structure of the XML file I wanted to
create:
<dpnews>
<month>June</month>
<customer>
<company>Acme Mouse Trap Company</company>
<contact>Sylvester Felis</contact>
<!-- etc etc -->
</customer>
<article>
<heading>DP Does XML</heading>
<content>Mauris ullamcorper metus eu arcu? Donec
pellentesque!</content>
</article>
<article>
<!-- etc etc -->
.
.
</article>
<!-- etc etc -->
</dpnews>
As it is for a personalised newsletter application, instead of this structure I
could have used something more like this:
<dpnews>
<month></month>
<customer>
<article></article>
<article></article>
</customer>
</dpnews>
where the articles were wrapped up in the customer element. This would be a
little easier to work with in Word; however, I expected that more than one
customer might want the same newsletter, so I left it with the possibility of
adding multiple customers for the same set of articles. Eventually I could
have a structure like:
<dpnews>
<month></month>
<customer></customer>
<customer></customer>
<article></article>
<article></article>
<article></article>
</dpnews>
Instead of putting "month" in as a separate element, I could have done
something like:
<dpnews month="June">
This would have added "month" as an ATTRIBUTE of the "dpnews" ELEMENT;
however, it would have made life very difficult when creating the template in
Word.
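To make the element versus attribute distinction concrete, here is a minimal sketch in Python. Python is not part of the DP and Word workflow described in this post; it is only used here as a neutral way to show how the two variants differ for any program that reads the data.

```python
import xml.etree.ElementTree as ET

# Variant used in this post: "month" is a child ELEMENT of "dpnews".
as_element = ET.fromstring("<dpnews><month>June</month></dpnews>")

# Alternative: "month" is an ATTRIBUTE of the "dpnews" element.
as_attribute = ET.fromstring('<dpnews month="June"></dpnews>')

# A child element is found by name and its text is read ...
month_from_element = as_element.findtext("month")

# ... whereas an attribute is read from the element that carries it.
month_from_attribute = as_attribute.get("month")

print(month_from_element, month_from_attribute)
```

Both variants carry the same information; the difference is only in where a consumer such as Word has to look for it.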
Creating the XML document in DP really is a piece of cake, far easier than with
most XML-capable database products. Other products might be smarter, but they
usually have their own way of doing things. DP can work with XML with almost
the same total flexibility as using Notepad; after all, XML is only a text
file. As an aside, the whole project took about an hour or so to complete;
however, I spent at least 2 hours debugging a small problem, which I eventually
tracked down to my forgetting to set the margins and page lengths in the DP
report to zero.
Before Word 2003 can create a template document for this XML document it has to
understand its structure. If it is a very simple structure, similar to say a
standard mail merge file where it is essentially a flat file, Word can deduce
the structure and life is really quite easy. For anything more than this you
will need to create a Schema. This is about the hardest part of the project,
and you need to get it right. I am a bit of a weakling at writing Schemas, so I
take a simpler approach: I let other (free) software do it for me.
Schemas can be pretty powerful, as a Schema describes how the data is
structured and the rules for the content. It can specify things like the type
of data that will be found in an element, etc etc. Complex Schemas give me a
headache. If I let Schema creation software analyse the DP XML file, even
though it is very simple I get a very complex Schema, and that's going to give
me a headache, and not necessarily just me: it will possibly give Microsoft
Word 2003 something of a headache too. Very fortunately, before Schemas came
out there was a much simpler way of describing simple XML documents, called a
DTD (Document Type Definition). A DTD is far more basic and less likely to give
me a headache, but unfortunately Word cannot use a DTD. However, since a DTD
contains more basic information than, say, a sample of my DP XML file, Schema
software can usually convert a DTD into a much simpler Schema which, with just
a little modification, is ideal for Word.
I used a piece of freeware, WMHelp XMLPad 3, to create the DTD, which looked
like this:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT dpnews (month, customer, article+)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT customer (company+, contact?, phone?, email?, fax?)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT contact (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT fax (#PCDATA)>
<!ELEMENT article (heading, content)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT content (#PCDATA | br | b | i | u)*>
<!ELEMENT br EMPTY>
<!ELEMENT b (#PCDATA | i)*>
<!ELEMENT u (#PCDATA | b)*>
<!ELEMENT i (#PCDATA | b)*>
Just a super quick overview of this: the element "dpnews" must contain exactly
one "month" element, exactly one "customer" element and one or more "article"
elements (the "+" sign at the end signifies one or more). The "month" element
can contain text (#PCDATA). The "customer" element must contain one or more
"company" elements ("+") and optionally, ie zero or one, "contact" element
(signified by the "?").
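As a rough illustration of what these content models mean, the sketch below checks the child elements of a document against three of the declarations. It is not how XMLPad or Word validate anything; it is a hand-made Python approximation in which each DTD content model is translated by hand into a regular expression over the child element names. The names CONTENT_MODELS, children_match and follows_dtd are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated from the DTD: every child name is followed by a space,
# "+" means one or more, "?" means zero or one.
CONTENT_MODELS = {
    "dpnews": r"(month )(customer )(article )+",
    "customer": r"(company )+(contact )?(phone )?(email )?(fax )?",
    "article": r"(heading )(content )",
}

def children_match(element):
    """True if the element's children fit its content model (if it has one)."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return True  # elements not listed above are not checked in this sketch
    names = "".join(child.tag + " " for child in element)
    return re.fullmatch(model, names) is not None

def follows_dtd(xml_text):
    """Check every element in the document against CONTENT_MODELS."""
    root = ET.fromstring(xml_text)
    return all(children_match(element) for element in root.iter())

good = (
    "<dpnews><month>June</month>"
    "<customer><company>Acme Mouse Trap Company</company>"
    "<contact>Sylvester Felis</contact></customer>"
    "<article><heading>DP Does XML</heading><content>Mauris</content></article>"
    "</dpnews>"
)
# Same document with the article removed: "article+" requires at least one.
no_article = (
    "<dpnews><month>June</month>"
    "<customer><company>Acme Mouse Trap Company</company></customer>"
    "</dpnews>"
)

print(follows_dtd(good), follows_dtd(no_article))
```

The second document fails because "article+" demands at least one article, while the missing "contact" in its customer is fine because of the "?".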
The interesting one, which might need hand coding, is the element "content",
which can contain text mixed with zero, one or more of the elements "br", "b",
"i" or "u". Since my content comes from an AxAy Memotype field and there can be
line breaks, italics or underline characters, I want to include them in the
specification of the structure. Note that I have specified the "br" element as
EMPTY, as in <br />, and I have had to specify the "b", "i" and "u" elements as
potentially including each other. I could have gone further by letting bold or
underlined text include a line break, but I didn't.
By the way, to create the <br />, <i>, <u> or <b> elements you will need to
use the latest DP2.6Y version of the software, in either the U or I versions
(note that I added the specification for both "u" and "i" elements to the DTD).
Line breaks are easy to handle, but the other text attributes must be correctly
nested, so unless it is imperative to the application I would avoid anything
but a line break: DP does not enforce well-formed code, so it is up to the
person editing the AxAy memo field to get it right. Which is the reason I
left it out of the DTD.
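Since DP does not check the nesting, one way to catch a badly nested memo before it reaches Word is to run the text through any XML parser. The sketch below uses Python's standard library purely as an illustration; the function name is_well_formed is made up, and wrapping the memo text in a "content" element is my assumption about how the memo ends up in the XML file.

```python
import xml.etree.ElementTree as ET

def is_well_formed(memo_text):
    """True if the memo text parses as XML once wrapped in <content>."""
    try:
        ET.fromstring("<content>" + memo_text + "</content>")
    except ET.ParseError:
        return False
    return True

# Correctly nested: the <i> closes before the <b> that contains it.
nested_ok = "Line one<br />Line two with <b>bold <i>and italic</i></b>"

# Overlapping tags: <b> is closed while <i> is still open.
overlapping = "<b>bold <i>overlapping</b> italic</i>"

print(is_well_formed(nested_ok), is_well_formed(overlapping))
```

Note that this only tests well-formedness; it says nothing about whether the elements are the ones allowed by the DTD.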
After this I used the same XMLPad software to create a Schema from the DTD.
The Schema is implemented as an XML file, usually with the file extension .XSD.
The Schema XMLPad created looks like this:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="DPNEWS.dtd"
xmlns:wmh="http://www.wmhelp.com/2003/eGenerator"
elementFormDefault="qualified" targetNamespace="DPNEWS.dtd">
<xs:element name="dpnews">
<xs:complexType>
<xs:sequence>
<xs:element ref="month"/>
<xs:element ref="customer"/>
<xs:element ref="article" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="month" type="xs:string"/>
<xs:element name="customer">
<xs:complexType>
<xs:sequence>
<xs:element ref="company" maxOccurs="unbounded"/>
<xs:element ref="contact" minOccurs="0"/>
<xs:element ref="phone" minOccurs="0"/>
<xs:element ref="email" minOccurs="0"/>
<xs:element ref="fax" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="company" type="xs:string"/>
<xs:element name="contact" type="xs:string"/>
<xs:element name="phone" type="xs:string"/>
<xs:element name="email" type="xs:string"/>
<xs:element name="fax" type="xs:string"/>
<xs:element name="article">
<xs:complexType>
<xs:sequence>
<xs:element ref="heading"/>
<xs:element ref="content"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="heading" type="xs:string"/>
<xs:element name="content">
<xs:complexType mixed="true">
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:element ref="br"/>
<xs:element ref="b"/>
<xs:element ref="i"/>
<xs:element ref="u"/>
</xs:choice>
</xs:complexType>
</xs:element>
<xs:element name="br">
<xs:complexType/>
</xs:element>
<xs:element name="b">
<xs:complexType mixed="true">
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:element ref="i"/>
</xs:choice>
</xs:complexType>
</xs:element>
<xs:element name="i">
<xs:complexType mixed="true">
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:element ref="b"/>
</xs:choice>
</xs:complexType>
</xs:element>
<xs:element name="u">
<xs:complexType mixed="true">
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:element ref="b"/>
</xs:choice>
</xs:complexType>
</xs:element>
</xs:schema>
Everything is good in this file except for the opening <xs:schema> tag starting
on the 2nd line:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns="DPNEWS-schema3.dtd" xmlns:wmh="http://www.wmhelp.com/2003/eGenerator"
elementFormDefault="qualified" targetNamespace="DPNEWS-schema3.dtd">
In fact I hand edited this, so it looks like:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="DPNews"
targetNamespace="DPNews">
(a lot simpler)
Namespaces, indicated by the xmlns="DPNews" and targetNamespace attributes, are
quite complex, but in a nutshell they allow you to combine XML files from
different sources, and the different parts of the resultant document can be
identified by their namespace. Because Word uses a large range of namespaces,
it needs to be able to identify which parts of the markup belong to data from
the DPNews file, and which are Word markup. (Remember, though, to save the
new Schema.)
Namespaces usually have a prefix, so in the XML Word produces you will see in
the root element:
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
xmlns:v="urn:schemas-microsoft-com:vml" etc etc
and throughout the document you will see tags like <w:tblBorders>. In this case
there is a namespace prefix "w", so <w:tblBorders> means the element belongs to
the namespace http://schemas.microsoft.com/office/word/2003/wordml
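The prefix itself is only shorthand; what identifies the element is the namespace the prefix is bound to. A small Python sketch (again just an illustration, not part of the Word workflow) shows that a parser treats two different prefixes bound to the same namespace as the same element. The "x" prefix is made up for the comparison.

```python
import xml.etree.ElementTree as ET

WORDML = "http://schemas.microsoft.com/office/word/2003/wordml"

# The same element written with two different prefixes for one namespace.
with_w = ET.fromstring('<w:tblBorders xmlns:w="%s"/>' % WORDML)
with_x = ET.fromstring('<x:tblBorders xmlns:x="%s"/>' % WORDML)

# ElementTree replaces the prefix with "{namespace}localname".
print(with_w.tag)
print(with_w.tag == with_x.tag)
```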
In our DP XML we want all the data to belong to the same namespace, so we leave
out the prefix and the namespace becomes the default namespace. To add a
namespace to our XML we need to revisit the DP report that creates the XML data
source and add a default namespace to that document. I will also add an XML
prolog to make things a little more formal, so the first few lines of the XML
look like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<dpnews xmlns="DPNews">
<month>June</month>
.
.
etc
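To see what the default namespace does to the data, here is a minimal Python sketch that parses a cut-down version of the file (only the "month" element is included, which is my simplification). Every element without a prefix, including the children, ends up in the "DPNews" namespace.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the DP output; bytes are used so that the
# encoding declaration in the prolog is honoured by the parser.
xml_bytes = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    b'<dpnews xmlns="DPNews">\n'
    b'<month>June</month>\n'
    b'</dpnews>'
)

root = ET.fromstring(xml_bytes)

# The root and its children all belong to the default namespace.
print(root.tag)

# Looking up the child therefore needs the namespace as well ...
month = root.findtext("{DPNews}month")

# ... and the bare name no longer matches anything.
unqualified = root.find("month")

print(month, unqualified)
```

This is why the Schema's targetNamespace and the xmlns value in the data file have to be exactly the same string: it is the namespace, not the element name alone, that ties the two together.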
Ok, now we have a DP application which creates an XML file with a uniquely
named namespace, "DPNews". (Often namespaces use a URL, so I might create a URL
like www.brileigh.com/DPNews so that it is not confused with another user's
DPNews Schema, but for this exercise "DPNews" is enough. If you do use a URL it
does not have to lead to a web page; the page is never accessed. Sometimes,
however, if you are creating a document that others will use, you might
actually provide a valid address so people can look up information about the
document and how it is used.)
At this point we have an XML data document and a Schema which can be added to
Word's library of schemas. Ok, so back to Word 2003 and our future template
document.
After creating a blank document, or opening a document which we want to convert
into a template, go to the Tools menu in Word 2003, choose "Templates and
Add-Ins" and select the "XML Schema" tab. Click the "Add Schema" button, then
browse to and select the Schema file you created. It will be added to the
library of Available XML Schemas; check the one you just added. Before leaving
the "XML Schema" tab we need to do one more thing, and that is to check the box
"Allow saving as XML even if not valid". This is important because we are not
using Word to create the XML data source, but rather to create a document which
might use only a subset of fields from the DP XML file, and which is unlikely
to be valid according to our DPNews Schema.
One last thing before marking up our template. If the Task Pane is not visible
you will need to display it, and select the "XML Structure" pane. This will
show the structure. Check the box "Show XML tags in the document" otherwise
you will not see what you are doing. It might also be easier for complex data
structures to check the box "List only Child Elements of the Current element".
We are now ready to create the Template document. We can make many different
templates from the same schema, and we do not need to go through any of the
above process again, unless we change our structure.
In the next part I will show how the Word 2003 template is created and how
markup is added.
You can download the DTD and Schema I used from this link
http://www.brileigh.com/DPWeb/DPNewsPart2.zip
Regards
Brian