RE: [U2] XML Processing...
Yup - I usually just set them to 1, 2, etc. so they correspond to the attribute number the data is extracted to.

With regard to the OP's original question, I have only ever used prepare/open/read XML with a file on disk. So I would use, say, the HTTP API to get a feed, write it to disk, then prepare and open, etc. As you say, it all works pretty well. (Except for UTF-8 encoded files with characters over 0x7F - my little gripe!)

Rgds
Symeon.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Boydell, Stuart
Sent: 08 June 2007 03:11
To: u2-users@listserver.u2ug.org
Subject: RE: [U2] XML Processing...

As far as I have discovered, it's just dumb - there's no positional or extract logic relationship between the EXT fields and dictionary - except that the fields are required in the EXT. Maybe they intend to implement it one day.

---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/
RE: [U2] XML Processing...
As far as I have discovered, it's just dumb - there's no positional or extract logic relationship between the EXT fields and dictionary - except that the fields are required in the EXT. Maybe they intend to implement it one day.

Am I being a clod and missing something? Inquiring Minds Need to Know!
Re: [U2] XML processing
Thanks guys - One of the files in question does have ISO-8859-1 as the encoding in the XML header. However, I played around with this, including using the Linux iconv function to convert to UTF-8, then back to ISO-8859-1. The other is UTF-8, which I changed to ISO-8859-1.

However, no luck I am afraid. I even wrote a UniBasic prog to change each char over 7F to character encoded form (i.e. &#x7f;) but it still crashed out on this. So for the time being I have a prog to delete each char over 7F and it is OK - not a solution tho!

I shall continue my discussions with IBM. Interestingly, they cannot replicate it on Windows except every now and then, and I did run a test on a Solaris box and it was fine, so maybe it is something to do with lang settings or kernel/udtconfig setup as well??

On 05/04/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

Did you try a different encoding? How did you go?

--
*From:* [EMAIL PROTECTED] [mailto: [EMAIL PROTECTED] *On Behalf Of *Symeon Breen [EMAIL PROTECTED]
*Sent:* Monday, 3 April 2006 19:06
*To:* u2 users group u2-users@listserver.u2ug.org
*Subject:* [U2] XML processing

Hi Guys - I am currently working on a problem I have, with JayJay at IBM, but thought I would throw it open to see if anyone else out there has seen something similar.

I am using PREPAREXML/OPENXMLDATA/READXML to process some fairly large XML files (over 30 meg) very successfully. However, I do have some instances where particular elements have foreign language characters in (over char 127). This seems to be giving me a segmentation fault in the open stage if I use an EXT that tries to extract data from that element or any elements after the element in question.

Has anyone had any similar kind of problem at all??

Cheers
Symeon.
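For readers who want to see what the "character encoded form" workaround amounts to, here is a minimal sketch in Python rather than UniBasic. It assumes the real source encoding is known, and the function name `to_ascii_xml` is made up for illustration. Note that Symeon reports this approach did not avoid the crash on his system, so treat it as an illustration of the transformation, not as a fix.

```python
import re

def to_ascii_xml(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode an XML document as pure ASCII using numeric character references."""
    # Decode using the encoding the bytes are really in (assumed known by the caller).
    text = raw.decode(source_encoding)
    # Rewrite the declared encoding so the prolog matches the new bytes.
    text = re.sub(
        r'(<\?xml[^>]*encoding=["\'])[^"\']+(["\'])',
        r'\1US-ASCII\2',
        text,
        count=1,
    )
    # Every character above 0x7F becomes a reference of the form &#NNN;
    return text.encode("ascii", errors="xmlcharrefreplace")

sample = '<?xml version="1.0" encoding="ISO-8859-1"?><name>café</name>'.encode("iso-8859-1")
result = to_ascii_xml(sample, "iso-8859-1")
print(result)
```

Character references are only legal in text and attribute values, so this only works when the element and attribute names themselves are plain ASCII.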
RE: [U2] XML processing
I had a similar thing happen: we were getting some accented characters (> 7F/127) and the parser would jerk to a halt (sometimes). Our issue was that the XML was actually written in ISO-8859-1 encoding but the encoding declaration in the XML said UTF-8: <?xml version="1.0" encoding="utf-8"?>.

I think the way it works is that UTF-8 expects characters in the range 0000-007F to be single byte, for ASCII compatibility, and anything above that to be a multi-byte sequence. Mostly this is fine if all the characters are under 007F (127). In this case, when the parser got to the accented character it would throw its hands in the air (like it did in fact care), saying "okay - you've told me it's UTF-8, but then I read this character that should have another byte with it, and it doesn't - what is going on!?"

Check that your encoding declaration is correct. If the declaration says UTF-8 and the text is actually in a single-byte encoding such as ISO-8859-1, then the parser may be having difficulty when it sees an over-127 character and thinks that the text should be Unicode. Try changing the encoding to ISO-8859-1 (or something suitable) and see what happens.

Stuart
__

some instances where particular elements have foreign language characters in (over char 127). This seems to be giving me a segmentation faults in the open stage, if i use an EXT that tries to extract data from that element or any elements after the element in question.
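The mismatch Stuart describes can be checked before the file ever reaches the U2 parser. The following is a rough sketch in Python, not anything from the U2 toolset. The helper names `declared_encoding` and `encoding_matches` are hypothetical, and it assumes the file starts directly with the XML declaration (no byte order mark).

```python
import re

def declared_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, or the XML default."""
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', raw)
    return match.group(1).decode("ascii").lower() if match else default

def encoding_matches(raw: bytes) -> bool:
    """True if the bytes actually decode under the encoding the prolog claims."""
    try:
        raw.decode(declared_encoding(raw))
        return True
    except (UnicodeDecodeError, LookupError):
        # LookupError covers encoding names Python does not recognise.
        return False

# ISO-8859-1 bytes (0xE9 = e-acute) mislabelled as UTF-8, as in Stuart's case.
mislabelled = b'<?xml version="1.0" encoding="utf-8"?><name>caf\xe9</name>'
# The same document genuinely encoded as UTF-8.
correct = '<?xml version="1.0" encoding="utf-8"?><name>café</name>'.encode("utf-8")

print(encoding_matches(mislabelled))
print(encoding_matches(correct))
```

This only catches files wrongly labelled as UTF-8. Any byte sequence decodes as ISO-8859-1, so a UTF-8 file mislabelled as ISO-8859-1 would pass the check and simply show garbled characters.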
Re: [U2] XML processing
At 04:05 AM 4/3/2006, you wrote:

However i do have some instances where particular elements have foreign language characters in (over char 127).

We just spent a few weeks with an end user and IBM on this very issue. Bottom line is that the UV XML parser does not handle UTF-8 characters. Due to the needs of our customer, IBM has agreed to add an enhancement in a near-future patch release. The enhancement will do the following:

The partial solution, where the XML tags must be in ASCII but text and attribute values can be double byte character set (using UTF-8 encoding), will be implemented as a patch release.

In a future major release, there is talk of a complete re-working of IBM's XML implementation. If IBM were to do this, their XML processor would finally be encoding agnostic.

I urge you to let IBM know of your XML parser needs now so that they can weigh the priority of scheduling the XML parser changes accordingly. As I stated, this is already on the board for some changes but is currently scheduled a few months off.

Doug Miller
[EMAIL PROTECTED]
Manager of Technical Services
Strategy 7
Dallas TX