Well, everything indicates that there are no hidden characters at the
beginning of the file. Both the "debug" command you suggested (see the
results below) and reading the InputStream character by character up to
the first '<' show that '<' is indeed the first character encountered.
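For reference, the second check (looking at what precedes the first '<')
can be sketched in Java roughly as follows. This is only an illustration:
the class name LeadingBytes and its helper methods are made up for this
example and are not part of Xerces or any other library. It deliberately
works on raw bytes rather than on decoded characters, since a Reader may
hide or mangle a byte order mark.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class LeadingBytes {
    /** Returns the index of the first '<' (0x3C) byte, or -1 if there is none. */
    public static int indexOfFirstLt(byte[] data) {
        for (int i = 0; i < data.length; i++) {
            if (data[i] == 0x3C) {
                return i;
            }
        }
        return -1;
    }

    /** Formats the first 'length' bytes as space-separated upper-case hex. */
    public static String toHex(byte[] data, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(length, data.length); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    /** Reads at most 'max' bytes from the start of the stream. */
    public static byte[] readPrefix(InputStream in, int max) throws IOException {
        byte[] buf = new byte[max];
        int total = 0;
        while (total < max) {
            int n = in.read(buf, total, max - total);
            if (n < 0) {
                break; // end of stream reached before 'max' bytes
            }
            total += n;
        }
        return Arrays.copyOf(buf, total);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the XML file to inspect.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            byte[] head = readPrefix(in, 16);
            System.out.println(toHex(head, head.length));
            System.out.println("bytes before first '<': " + indexOfFirstLt(head));
        }
    }
}
```

A result of 0 means nothing precedes the first '<' in the bytes that
were read; anything else is worth looking at in the hex output.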
Could the problem come from the encoding of the file? I have set
"iso-8859-1" everywhere I could, though, so that every aspect of the
parsing is consistent.
Where does the prolog start and end? Maybe the problem comes from the
end of the prolog?
...
PS: the extract from debug.exe
0D49:0100  3C 3F 78 6D 6C 20 76 65-72 73 69 6F 6E 3D 22 31   <?xml version="1
0D49:0110  2E 30 22 20 65 6E 63 6F-64 69 6E 67 3D 22 69 73   .0" encoding="is
0D49:0120  6F 2D 38 38 35 39 2D 31-22 3F 3E 0A 0A 3C 21 44   o-8859-1"?>..<!D
0D49:0130  4F 43 54 59 50 45 20 55-6E 69 74 2D 6F 66 2D 73   OCTYPE Unit-of-s
0D49:0140  74 75 64 79 0A 20 20 50-55 42 4C 49 43 20 22 2D   tudy.  PUBLIC "-
0D49:0150  2F 2F 4F 55 4E 4C 2F 2F-44 54 44 20 45 4D 4C 2F   //OUNL//DTD EML/
0D49:0160  58 4D 4C 20 62 69 6E 64-69 6E 67 20 31 2E 30 2F   XML binding 1.0/
0D49:0170  31 2E 30 2F 2F 45 4E 22-20 22 68 74 74 70 3A 2F   1.0//EN" "http:/
Robert Houben wrote:
This may not be your problem, but I've wasted tons of time in the past
because of these symptoms, so here is why it happened to me...
I have seen this happen when a file is read that contains a byte order
mark at the beginning. Most editors strip it out and get the encoding
right, so you don't know this is happening. If you are writing your own
file reader to get an InputStream, you may need to skip a few bytes at
the beginning, setting the encoding value correctly based on them, prior
to setting up the reader. To tell if this is happening to you on a
Windows system, use the debug.exe command from the command line:
C:\>debug test.xml
-d
1480:0100  FF FE 3C 00 74 00 65 00-73 00 74 00 3E 00 74 00   ..<.t.e.s.t.>.t.
1480:0110  65 00 73 00 74 00 3C 00-2F 00 74 00 65 00 73 00   e.s.t.<./.t.e.s.
1480:0120  74 00 3E 00 0D 00 0A 00-00 00 00 00 00 00 00 00   t.>.............
1480:0130  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
1480:0140  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
1480:0150  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
1480:0160  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
1480:0170  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
-q
C:\>
Note that the file starts with "FF FE", which is the UTF-16
little-endian byte order mark (BOM). If you create your own file reader
and try to pull this in, you will encounter the error you mention.
Notepad will show this as normal text, so you'll never see the funny
stuff.
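A minimal Java sketch of the "skip a few bytes and set the encoding"
idea might look like the following. The class BomSniffer and its methods
are hypothetical names invented for this example, not an existing API.
It recognizes only the UTF-8 and UTF-16 marks; UTF-32 is ignored, and
its little-endian mark (which also begins with FF FE) would be
misreported as UTF-16LE.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class BomSniffer {
    /** Returns the encoding implied by a leading BOM, or null if none is recognized. */
    public static String detect(byte[] head, int len) {
        if (len >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (len >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        if (len >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        return null;
    }

    /** Number of bytes occupied by the BOM of the given encoding (0 if null). */
    public static int bomLength(String encoding) {
        if (encoding == null) {
            return 0;
        }
        return encoding.equals("UTF-8") ? 3 : 2;
    }

    /** Wraps the stream so that a leading BOM, if present, is consumed. */
    public static InputStream skipBom(InputStream raw) throws IOException {
        PushbackInputStream in = new PushbackInputStream(raw, 3);
        byte[] head = new byte[3];
        int len = 0;
        while (len < 3) {
            int n = in.read(head, len, 3 - len);
            if (n < 0) {
                break; // stream shorter than three bytes
            }
            len += n;
        }
        int skip = bomLength(detect(head, len));
        if (len > skip) {
            // Push back everything that was read beyond the BOM.
            in.unread(head, skip, len - skip);
        }
        return in;
    }
}
```

The stream returned by skipBom starts at the first byte after the mark,
so a reader built on top of it with the detected encoding should no
longer see the stray bytes.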
HTH,
-----Original Message-----
From: Andy Clark [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 27, 2005 5:46 PM
To: [email protected]
Subject: Re: going crazy with this: org.xml.sax.SAXParseException:
Content is not allowed in prolog
Paul Ekeland wrote:
My problem is that I cannot see any whitespace or strange characters
before the root element of the document. I have used several
different hexadecimal editors to check that, with no success! Do you
have a different way to find out whether such things exist?
Can you attach the first few lines of the file to a
followup message? (Attach, not paste.)