Hi Piotr,

Oh my, what a complex subject. It's good to be an Aussie :-)
In a DP report, what happens to the representation of these diacriticals in the resultant XML file? Can the resultant XML file be opened without errors, e.g. in Firefox or IE? Are the documents well formed, or do the diacritics cause a parsing problem? If you can open them, what gets displayed when you open them in a browser?

From DP I would generally use the encoding ISO-8859-1, which is an 8-bit encoding, although sometimes I used UTF-8 (which is really no benefit, as DP cannot produce a Unicode file). So my normal XML prolog is <?xml version="1.0" encoding="ISO-8859-1"?>. Similarly, in an 8-bit format you can use <?xml version="1.0" encoding="ISO-8859-16"?> which, if I understand it correctly, allows you to represent the Slavic diacritical characters. Does that make any difference, or am I barking totally up the wrong tree...

Regards
Brian

----- Original Message -----
From: piotrzyk Gazeta.pl
To: Dataperfect Users Discussion Group
Sent: Wednesday, June 25, 2008 7:35 PM
Subject: Re: [Dataperf] Pt4 Example of Merging DP data into MS Word

Hi Brian,

Well, diacritic (national) characters are letters too :-)
http://en.wikipedia.org/wiki/Diacritic

You're right: ASCII is 7-bit, and the 8th bit is used to provide different extensions (code pages).
http://en.wikipedia.org/wiki/Code_page

One of them is Central Europe, cp=852.
http://en.wikipedia.org/wiki/Code_page_852

For the Polish language we use 18 diacritic letters (9 lowercase, 9 uppercase), e.g. A is Dec65; Polish A_with_tail is Dec165 and is generated with the Alt-A key. In the DOS environment this is done by (one of many) TSR programs. In Windows the same happens after setting Regional Options in the Control Panel (no need to change config.sys); diacritics are generated/displayed in the same correct way in Windows applications as in DP.

In WordPerfect, to have diacritics we use key assignments in Keyboard Layout / Map: lowercase diacritics with Alt, uppercase with the Control key, e.g.
Ctrl-A is Compose 1,94 (Character Set 1 = Multinational 1).

Maybe there is a better way, but I usually use macros to convert files_with_diacritics from WordPerfect to Word and vice versa. Anyway, it seems DP uses the upper 128 characters of extended ASCII differently, so translating national characters is not like 'tigers like the best' :-) Of course, that is a rather small problem, which can be bypassed with the macros.

Thank you for your response and attention.
Have a really good day!
Piotr Barancewicz

2008/6/22 Brian Hancock <[EMAIL PROTECTED]>:

Hi Piotr,

I know very little about code pages. Australia being so isolated from the rest of the world, we never need to incorporate international character sets, and so I have never developed an understanding of how to work with them.

Just so I can understand: out of the 256 characters representable in a one-byte character, the 852 code page uses the normal 128-character ASCII set for the first 128 characters, and then replaces the next 128, which are normally the line-draw characters etc., with the Central European characters, right?

How do you usually work with these in DP? Do you have to make changes to Config.sys to support the code page characters at the operating system level? Does DP display the characters as they normally would be displayed?

XML solves the problem of international language characters with Unicode, which encodes characters as 2 or even 4 bytes instead of ASCII's 1 byte. I guess WordPerfect solved it by a similar method with their multiple-byte special characters and character sets. Does Microsoft Word support these foreign characters if you use a WordPerfect secondary file as the merge data source?
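To make the code page question above concrete, here is a minimal Python sketch (not something from the thread) that decodes the same byte values under several 8-bit encodings. The codec names are Python's own; treating Windows-1250 as the code page a Windows application might assume is only an assumption for illustration.

```python
import unicodedata


def decode_byte(value: int, codec: str) -> str:
    """Decode a single byte value (0-255) under the given 8-bit codec."""
    return bytes([value]).decode(codec)


def describe(value: int, codec: str) -> str:
    """Return a printable description of what a byte means in a codec."""
    ch = decode_byte(value, codec)
    return f"{codec:10s} byte {value:3d} -> U+{ord(ch):04X} {unicodedata.name(ch)}"


# The lower half (0-127) is plain ASCII in every one of these code pages.
for codec in ("cp852", "cp1250", "iso8859_2", "iso8859_16", "latin_1"):
    assert decode_byte(65, codec) == "A"

# The upper half (128-255) is where they differ: the same byte value
# stands for a different character depending on the assumed code page.
for codec in ("cp852", "cp1250", "latin_1"):
    print(describe(165, codec))

# Unicode encodings spend more than one byte on such letters.
a_ogonek = "\u0105"  # LATIN SMALL LETTER A WITH OGONEK
print(len(a_ogonek.encode("utf-8")), "bytes in UTF-8")
print(len(a_ogonek.encode("utf-16-le")), "bytes in UTF-16")
```

Under CP852, byte 165 is the lowercase a-with-ogonek and byte 164 the uppercase one, while a program assuming a different 8-bit code page will show some other character for the same byte.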
It might be possible to use XML (HTML) character entities. Before Unicode came out, there was some support for international characters in HTML. These characters can also be used with XML; however, other than the standard 5 predefined entities, you need to define the others in the DTD, either internal to the XML document or externally. You would then need to convert the characters in your DP XML to their entity representation, perhaps using a text utility like AWK.

Alternatively, the XML encoding ISO-8859-16 might be of help:
http://en.wikipedia.org/wiki/ISO/IEC_8859

Sorry I can't be of more help with this topic.

Brian

----- Original Message -----
From: "piotrzyk Gazeta.pl" <[EMAIL PROTECTED]>
To: "Dataperfect Users Discussion Group" <[email protected]>
Sent: Friday, June 20, 2008 8:25 PM
Subject: Re: [Dataperf] Pt4 Example of Merging DP data into MS Word

Hi Brian,

It seems to be a good solution for me. Thank you for these guidelines on the way I have to go. Now what I need is to understand / learn more XML-XSLT, to be able to format each field imported/merged from DP differently (size, appearance).

One thing is still unclear to me: I'm using Windows XP Pro with CP=852 to have our national characters properly entered and displayed in Windows. These characters are properly entered, displayed and sorted in DP2.6x. In your example, a DP2.6Y report generates the file DPNEWS.DOC. I've entered some records with national characters (properly entered and displayed in DP2.6Y). After opening the file in MS Word, the national characters are distorted.

To experiment, I've entered some national characters into DPNEWS.DOC with Notepad (properly entered and displayed in Notepad). After opening the file in MS Word I can see these 'Notepad_entered' characters correctly. Was the distortion made by DP2.6Y? MEIRTE Danny was writing something similar about using multilingual codes. Is there any bypass to have DP2.6Y with national characters?
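One possible bypass, sketched here under the assumption that DP writes the national characters as CP852 bytes (the thread suggests this but does not confirm it), is to rewrite the file as pure ASCII with numeric character references. Unlike named entities, numeric references such as &#261; need no DTD declarations, and an all-ASCII file is valid under any of the prologs mentioned earlier. The function name and the sample record are made up for illustration.

```python
def cp852_to_char_refs(raw: bytes, source_codec: str = "cp852") -> bytes:
    """Re-encode 8-bit text as ASCII, replacing every non-ASCII character
    with an XML numeric character reference of the form &#NNN;.

    source_codec is an assumption about how the input bytes were written.
    """
    text = raw.decode(source_codec)
    return text.encode("ascii", errors="xmlcharrefreplace")


if __name__ == "__main__":
    # Made-up sample: a Polish name as it would be stored in CP852 bytes
    # (0x88 is l-with-stroke, 0xA5 is a-with-ogonek in that code page).
    sample = b"<name>Ma\x88gorzata D\xa5browska</name>"
    print(cp852_to_char_refs(sample).decode("ascii"))
```

The same conversion could be done by a Word macro or a text utility; the point is only that the translation table depends on knowing which code page produced the bytes.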
If not, I'll write an MS Word macro to replace what was distorted. Any suggestions?

Have a good day,
Piotr Barancewicz

2008/6/17, Brian Hancock <[EMAIL PROTECTED]>:

Hi Piotr, and others,

The previous 3 postings have been about how to create a DP XML file, and how to use it with Word 2003 for an XML "mail merge" type application. In the first part, we used Word to open the DPNews.doc file and apply a transformation within Word to generate the new document. The second part showed how to convert the structure of the XML file to a Schema which could be used by Word to create a template document; the third posting showed how to create a template document with Word.

Once you have created a template document, it is not essential to use Word to create new documents based on this template. It is possible to use other programs (some free) to generate the merged Word 2003 document as an ordinary document that an everyday Word user could open without having any understanding of how it was created. To them it is a native Word document.

There are many programs that can perform XSLT transformations. Word of course is one of them, but you can also use many other applications, including, say, the free WMHelp XMLPad I used to create the DTD and Schema files in my second posting. There are many others, such as XMLSpy and Oxygen, but my favourite is an open source command-line utility, xsltproc.exe. It is available for both Linux/Unix and Windows operating systems.
You can find details and documentation for it at
http://xmlsoft.org/XSLT/xsltproc2.html
and download the Windows version from
http://www.zlatkovic.com/pub/libxml/libxslt-1.1.23+.win32.zip

Going back to the DPNews.doc file created by DP in the first posting, if you wanted to create the final native (XML) Word 2003 document you would simply use the command:

    xsltproc -o Newfile.doc DPNews.doc

and it would create the file Newfile.doc, which could be opened directly by Word without it ever being obvious that it was not created in Word. If you preferred, you could leave the new file with the extension XML instead of DOC. It is likely that on a Windows platform it would still be opened correctly by Word, as there is a processing instruction in the XML file that tells the operating system that it is a Word file.

The above command would use the associated XSLT sheet which was specified in the DP XML file. If you wanted to use a different XSLT template you could override the default by using the command

    xsltproc -o Newfile.doc DPNews.xsl DPNews.doc

One really cool thing with this is that, if you use the STDOUT facility from the new DP2.6Y, you could transform a file like this:

    dp myapp.str /EI=mylog.log | xsltproc

(with or without the -o Newfile.doc) and it would take the output directly from DP and pipe it to the XSLT processing utility and either save the results to a new file or send the output to STDOUT. This means, for example, that from a webserver you could merge the data with a template and deliver the result as a file download to the user's browser, all without ever having to use Word to generate the document. Note, of course, that you need Word to create the template, and you need Word (or the free Word Viewer) to view the final document.
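For anyone who wants to drive the two xsltproc commands above from a script rather than by hand, a minimal Python sketch might look like the following. It only assembles the same argument lists shown above and hands them to the operating system; the function names are made up, and it assumes xsltproc is installed and on the PATH.

```python
import subprocess
from typing import List, Optional


def build_xsltproc_command(xml_file: str,
                           output: Optional[str] = None,
                           stylesheet: Optional[str] = None) -> List[str]:
    """Assemble an xsltproc argument list.

    With no stylesheet given, xsltproc relies on the stylesheet named
    inside the XML file itself; with no output given, it writes to STDOUT.
    """
    cmd = ["xsltproc"]
    if output:
        cmd += ["-o", output]
    if stylesheet:
        cmd.append(stylesheet)
    cmd.append(xml_file)
    return cmd


def run_transform(xml_file: str,
                  output: Optional[str] = None,
                  stylesheet: Optional[str] = None) -> bytes:
    """Run xsltproc and return whatever it wrote to STDOUT.

    Raises CalledProcessError if xsltproc reports a failure, and
    FileNotFoundError if the xsltproc executable cannot be found.
    """
    cmd = build_xsltproc_command(xml_file, output, stylesheet)
    completed = subprocess.run(cmd, check=True, capture_output=True)
    return completed.stdout


if __name__ == "__main__":
    # Show the command lines without running them.
    print(" ".join(build_xsltproc_command("DPNews.doc", output="Newfile.doc")))
    print(" ".join(build_xsltproc_command("DPNews.doc", output="Newfile.doc",
                                          stylesheet="DPNews.xsl")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.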
To make it a file download rather than have it read as a web page, all that is needed is to add the appropriate MIME header before the document is output, i.e.

    Content-disposition: attachment; filename=defaultfilename.doc

If you have created your own DPNewsletter, you have probably found that embedded line breaks in a memo field were ignored when you merged them with Word, whereas in my original example the line breaks appeared. This is because I had edited the DPNews.xsl template file and added another template rule:

    <xsl:template match="ns0:br">
      <w:br/>
    </xsl:template>

The "ns0:" is a namespace prefix that Word added to the document (and it might have a different value, so watch out). This rule says that when you encounter a <br /> in the DP XML file, replace it in the Word document with <w:br />, which is the Word 2003 instruction for a line break. It would be possible to add rules to cover bold, underlined or italic text; however, that is not quite so easy, and I wanted to make the example as quickly as possible, so I omitted them.

My example only output a single customer's newsletter. If I wanted to create a number of customers' newsletters in one merge it would have been possible; however, I would probably have had to hand-tune the Word XSLT file. Instead of just applying the root "dpnews" element to the whole document, I would have had to apply the "customer" element to the whole document as well. This would have meant that within the document my context would have been set at the "customer" level, so the "article" elements would not have been so obvious. I would have had to hand-tune the XPath, so from within the "customer" context I would have had to specify the XPath as either "/dpnews/article" or perhaps "../article" (XPaths really do behave a lot like directory paths).

I hope this gives an insight into merging DP and Word.
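The directory-path behaviour of XPath mentioned above can be tried out with a small Python sketch. The element names dpnews, customer and article come from the discussion; the child elements, the sample text and the missing namespace prefix are simplifications made up for illustration. Python's ElementTree supports only a subset of XPath and does not accept absolute paths on an element, so this stands in for what a full XSLT processor would do with "../article".

```python
import xml.etree.ElementTree as ET

# Simplified, made-up stand-in for the DP XML file (no namespaces).
SAMPLE = """<dpnews>
  <customer><name>First Customer</name></customer>
  <customer><name>Second Customer</name></customer>
  <article><title>Article One</title></article>
  <article><title>Article Two</title></article>
</dpnews>"""

root = ET.fromstring(SAMPLE)

# From the root "dpnews" context the articles are direct children,
# like listing a subdirectory of the current directory.
from_root = [a.findtext("title") for a in root.findall("article")]

# Stepping down into "customer" and back up with ".." reaches the same
# articles, much like "cd customer" followed by "ls ../article".
via_parent = [a.findtext("title") for a in root.findall("customer/../article")]

if __name__ == "__main__":
    print(from_root)
    print(via_parent)
```

Both paths select the same article elements, which is why the choice of context in the template decides which form of the path has to be written.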
Regards
Brian

_______________________________________________
Dataperf mailing list
[email protected]
http://lists.dataperfect.nl/mailman/listinfo/dataperf
