On Thursday, 23 June 2005 10:10, Dominik Stadler wrote:
> Hi,
>
> We have a file that contains multiple XML messages in the form of:
>
> <FIRSTMESSAGE>...</FIRSTMESSAGE><SECONDMESSAGE>...</SECONDMESSAGE>
>
> I know this looks broken from the beginning, but we cannot change the
> application that generates this kind of data, so we need a way to
> cope with it.
>
> How would I go about reading/splitting this? I don't think there is
> functionality available to do this with Xerces, right?
>
> I thought about using SAX to try to parse the complete text (we
> should get an error at the second message) and then read the
> char/line information from the error message, but this sounds like a
> hack to me. Is there some other way?

Hi Dominik,

you could try to make your data stream look like legal XML content by
preceding the stream with an XML declaration and an opening root
element. It would then look like this:

<?xml version="1.0"?>
<data-stream>
<FIRSTMESSAGE>...</FIRSTMESSAGE><SECONDMESSAGE>...</SECONDMESSAGE>
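
This can be parsed with a SAX parser even though the root element is
never closed: the parser reports the events for each message as it
reads them, and only raises a fatal error once it reaches the end of
the stream.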
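
A minimal sketch of this idea, assuming Java and the standard JAXP SAX
API (which Xerces-J implements). The class and method names
(WrappedStreamDemo, topLevelMessages) are made up for illustration, and
the sketch assumes the raw data is UTF-8 and carries no XML declaration
or DOCTYPE of its own:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Xerces.
public class WrappedStreamDemo {

    /** Collects the element names of the messages directly below the artificial root. */
    static List<String> topLevelMessages(InputStream raw) throws Exception {
        // Prepend an XML declaration and an opening root element to the raw data.
        byte[] prefix = "<?xml version=\"1.0\"?>\n<data-stream>\n"
                .getBytes(StandardCharsets.UTF_8);
        InputStream wrapped =
                new SequenceInputStream(new ByteArrayInputStream(prefix), raw);

        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0; // 0 = outside root, 1 = inside <data-stream>

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (depth == 1) {
                    names.add(qName); // a new message starts here
                }
                depth++;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--;
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser().parse(wrapped, handler);
        } catch (SAXParseException e) {
            // Expected at end of input, because <data-stream> is never closed.
            // A malformed message raises the same exception type, so real code
            // should inspect the exception instead of ignoring it.
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String data = "<FIRSTMESSAGE>...</FIRSTMESSAGE><SECONDMESSAGE>...</SECONDMESSAGE>";
        InputStream raw = new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8));
        System.out.println(topLevelMessages(raw)); // prints [FIRSTMESSAGE, SECONDMESSAGE]
    }
}
```

If you would rather avoid the error at the end altogether, you could
append a third stream containing </data-stream> to the
SequenceInputStream, so that the parser sees a complete document.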
Cheers,
Axel
--
Humboldt-Universität zu Berlin
Institut für Informatik
Signalverarbeitung und Mustererkennung
Dipl.-Inf. Axel Weiß
Rudower Chaussee 25
12489 Berlin-Adlershof
+49-30-2093-3050
** www.freesp.de **