Ack! Please do not use outdated versions of Xerces libraries. It makes it difficult to assure you that your problem is not some sort of bug.
First I would try upgrading. If upgrading does not work, perhaps you could search the user and dev lists for FFFE and FEFF, and also try the xml.apache.org JIRA database. They might have a record of this being a fixed bug, etc.

HTH!
Matt

-----Original Message-----
From: Xiaofan Zhou [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 26, 2005 3:06 PM
To: c-dev@xerces.apache.org
Subject: RE: Invalid character 0xFFFe

Matt,

That's what I did, and I couldn't find FFFE at all. BTW, I am using Xerces-c 1.7, could that be a problem?

Thanks.
Frank

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 26, 2005 2:52 PM
To: c-dev@xerces.apache.org
Subject: RE: Invalid character 0xFFFe

Can't you write a routine that does a trivial O(n) search for that byte sequence, or do a memory search with GDB? Anything that could tell you where it is would be a good idea, just to find out whether it's actually there or whether something else is wrong. Maybe print out your document array as ASCIIfied hex with printf("...%x...") and so forth, then grep it for 0xFFFE. Or dump it into a file and look at it with a hex editor.

--Matt

-----Original Message-----
From: Xiaofan Zhou [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 26, 2005 2:48 PM
To: c-dev@xerces.apache.org
Subject: RE: Invalid character 0xFFFe

Thanks much for the reply. But in my case, the XML is created in memory, and I don't see the BOM char at the beginning when I look at it in memory in the debugger. In fact, I don't see FFFE at all in memory. Also, if I make the XML smaller, then everything works fine. So I would think that the FFFE is introduced when I dump my DOM tree into a string (all this is done in memory), but I just don't know where. Any suggestion is very much appreciated.

Thanks.
Frank

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 26, 2005 2:34 PM
To: c-dev@xerces.apache.org
Subject: RE: Invalid character 0xFFFe

FFFE is the little endian rendition of the Byte Order Mark. Byte Order Marks are not generally valid in in-memory Unicode data, only in Unicode files. It's probably on byte 0 of the input array. Can you strip it out?

See this quotation from http://www.xencraft.com/resources/unicodebom.html :

The BOM character

The Unicode Character Standard designated two characters as an aid to distinguish big-endian data from little-endian data. "Endianness" is not a problem for UTF-8 since it is a serialized byte stream. However, to process data encoded in UTF-16 or UTF-32, an application must first determine if the data being read is in the same or different "endianness" from the architecture that the application runs on. Unicode designated the character U+FEFF as the "Byte Order Mark" (BOM) and reserved U+FFFE as an illegal character. If an application detects a U+FFFE it can therefore presume that the data is in the opposite endianness of the architecture and that the data should be byte swapped. (A 32-bit architecture should also be word swapped.)

HTH!
Matt

-----Original Message-----
From: Xiaofan Zhou [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 26, 2005 1:27 PM
To: xerces-c-dev@xml.apache.org
Subject: Invalid character 0xFFFe

Hi, All,

I am encountering an XML parsing error that says something like the following:

XML Parser failed, Error: Invalid character (Unicode 0xFFFe)

The XML file size is about 100K on disk. The application works roughly like this: the XML is created from a DOM tree in memory and then passed to a SAX parser. The error is reported by the SAX parser. I am not sure where the Unicode 0xFFFe comes from. If I save the XML to a file and then load it into Internet Explorer, it is fine. Also, if I reduce the size, it works fine.

Any suggestions? Thanks much in advance.
Frank

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
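
For anyone landing on this thread later: the "trivial O(n) search" suggested above could look roughly like the sketch below. It is only an illustration under assumptions, not code from Xerces. The function names findCodeUnit and findBytePair are made up for this example, and it assumes the serialized document is available either as 16-bit code units (which is what Xerces-C's XMLCh normally is, but check your build) or as a raw byte buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Xerces): returns the index of every
// 16-bit code unit in `text` that equals `target`, e.g. 0xFFFE.
std::vector<std::size_t> findCodeUnit(const std::uint16_t* text,
                                      std::size_t length,
                                      std::uint16_t target) {
    assert(text != nullptr || length == 0);
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < length; ++i) {
        if (text[i] == target) {
            hits.push_back(i);
        }
    }
    return hits;
}

// Hypothetical helper (not part of Xerces): returns the byte offset of every
// adjacent pair (first, second) in a raw buffer, e.g. 0xFF followed by 0xFE.
std::vector<std::size_t> findBytePair(const unsigned char* data,
                                      std::size_t size,
                                      unsigned char first,
                                      unsigned char second) {
    assert(data != nullptr || size == 0);
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i + 1 < size; ++i) {
        if (data[i] == first && data[i + 1] == second) {
            hits.push_back(i);
        }
    }
    return hits;
}
```

Run findCodeUnit over the string you get back from serializing the DOM tree, and findBytePair (with both FF FE and FE FF) over the exact bytes you hand to the SAX parser. If neither reports a hit, the character is probably not in your data at all, and the length passed to the parser or the library version becomes the more likely suspect.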