Hi,

I have come across some of our sensor data files that are encoded in Unicode.

If you open them in a text editor, they look fine:
  <?xml version="1.0" encoding="UTF-8"?>
  <sensor>
    <entry tstamp="1113819173814" ....
   ....

However, if you open one in a hex editor, you see:
  FF FE 3C 00 3F 00 78 00 6D 00 6C 00....

FFFE: (my guess) Unicode byte order mark (BOM)
3C00: <
3F00: ?
7800: x
6D00: m
6C00: l

So the file is clearly UTF-16 encoded (little-endian, given the FF FE
order), even though its XML declaration says UTF-8.
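
For reference, the check I did by eye in the hex editor can be sketched
in Java roughly as follows. This is only a minimal sketch: the class name
BomSniffer and the method detect are made up for illustration (they are
not part of JDOM or the JDK), and it only looks for the two UTF-16 byte
order marks.

```java
// Hypothetical helper: guesses the UTF-16 byte order of a file from its
// first two bytes. It does not handle other encodings (for example, a
// UTF-32LE file also starts with FF FE and would be misreported here).
final class BomSniffer {
    static String detect(byte[] head) {
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF; // mask to get an unsigned byte value
            int b1 = head[1] & 0xFF;
            if (b0 == 0xFF && b1 == 0xFE) return "UTF-16LE";
            if (b0 == 0xFE && b1 == 0xFF) return "UTF-16BE";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // The first bytes of the file as seen in the hex editor.
        byte[] head = {(byte) 0xFF, (byte) 0xFE, 0x3C, 0x00, 0x3F, 0x00};
        System.out.println(detect(head));
    }
}
```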

The problem appears when I use JDOM to parse it:
  Document doc = new SAXBuilder().build(fileName);

It throws an exception:
  "Error on line 1: Document root element is missing."

I think JDOM is confused by "FFFE" at the beginning of the file.
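
One thing that might be worth trying is to decode the bytes explicitly
as UTF-16 before they reach the parser. The sketch below uses only the
JDK and shows that Java's "UTF-16" charset consumes the leading FF FE
itself. The class name Utf16Decode is made up for illustration. Whether
wrapping the file in an InputStreamReader with this charset and passing
that Reader to SAXBuilder actually fixes the JDOM error is an assumption
I have not verified.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes raw bytes with the JDK's UTF-16 charset,
// which reads a leading byte order mark to pick the byte order and does
// not include the mark in the decoded text.
final class Utf16Decode {
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // FF FE followed by "<?xml" in little-endian UTF-16.
        byte[] raw = {
            (byte) 0xFF, (byte) 0xFE,
            0x3C, 0x00, 0x3F, 0x00, 0x78, 0x00, 0x6D, 0x00, 0x6C, 0x00
        };
        System.out.println(decode(raw)); // prints "<?xml"
    }
}
```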

Does anybody know how to solve the problem?

Thanks

Cedric
