Kjære Christian, [Dear Christian,]
for meg følgende code virker: [the following code works for me:]
' Open the file as a raw byte stream
oSFA = createUnoService("com.sun.star.ucb.SimpleFileAccess")
oInpStream = oSFA.openFileRead(sUrl)
' Wrap it in a TextInputStream so that the encoding can be set explicitly
oTextInpStream = createUnoService("com.sun.star.io.TextInputStream")
oTextInpStream.setInputStream(oInpStream)
oTextInpStream.setEncoding("iso-8859-1")
' Parse from the wrapped stream instead of the raw one
oDB = createUnoService("com.sun.star.xml.dom.DocumentBuilder")
domDoc = oDB.parse(oTextInpStream)
oInpStream.closeInput()
Sorry for my bad Norwegian, but it has been a long time since I was there.
About the code:
You have to use a TextInputStream to be able to set the encoding.
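If the TextInputStream route should not work in your setup, the last-resort workaround you describe (writing a copy with the declaration in front) could be sketched roughly like this. This is only a sketch: the routine name PrependXmlDeclaration and the parameters sSrcUrl and sDstUrl are made up for illustration, and it assumes the source file really is iso-8859-1 and has no declaration yet.

```basic
' Hypothetical helper: copies sSrcUrl to sDstUrl and puts an XML
' declaration in front. Assumes the source is iso-8859-1 encoded.
Sub PrependXmlDeclaration(sSrcUrl As String, sDstUrl As String)
    Dim oSFA, oIn, oTextIn, oOut, oTextOut

    oSFA = createUnoService("com.sun.star.ucb.SimpleFileAccess")
    ' Remove an old copy so that no stale content is left behind
    If oSFA.exists(sDstUrl) Then oSFA.kill(sDstUrl)

    ' Read the original as iso-8859-1 text
    oIn = oSFA.openFileRead(sSrcUrl)
    oTextIn = createUnoService("com.sun.star.io.TextInputStream")
    oTextIn.setInputStream(oIn)
    oTextIn.setEncoding("iso-8859-1")

    ' Write the copy with the same encoding
    oOut = oSFA.openFileWrite(sDstUrl)
    oTextOut = createUnoService("com.sun.star.io.TextOutputStream")
    oTextOut.setOutputStream(oOut)
    oTextOut.setEncoding("iso-8859-1")

    ' Declaration first, then the original content line by line
    oTextOut.writeString("<?xml version=""1.0"" encoding=""iso-8859-1""?>" & Chr(10))
    Do While Not oTextIn.isEOF()
        oTextOut.writeString(oTextIn.readLine() & Chr(10))
    Loop

    oTextIn.closeInput()
    oTextOut.closeOutput()
End Sub
```

The copy at sDstUrl can then be opened with openFileRead and passed to oDB.parse exactly as in your original code.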
Hope it helps.
Ha det bra, [Take care,]
Christoph
Christian Andersson wrote:
> I have a small problem. In StarBasic I'm using (almost) the following
> code (there might be small mistakes since I'm writing this from memory)
> to read and parse an xml document
>
> oSFA = createUNOService ("com.sun.star.ucb.SimpleFileAccess")
> oInpStream = oSFA.openFileRead(sUrl)
> oDB = createUnoService("com.sun.star.xml.dom.DocumentBuilder")
> domDoc = oDB.parse(oInpStream)
> oInpStream.closeInput
>
> this works for me almost perfectly, and I say almost, since there are
> some xml documents that it cannot read.
>
> the problem I am having is that some documents (which are being
> generated by a third-party system that I cannot change)
> do not declare that they are xml documents like this
> <?xml version="1.0" encoding="utf-8" ?>
>
> they just start with the xml tags directly, like this
>
> <test>
> <test2>
> .....
> </test2>
> </test>
>
> this is all fine, I have other xml documents that also look like this,
> and OpenOffice can read and parse them.
> however, these problematic documents contain national
> characters (åæø) encoded in iso-8859-1, and this is the problem.
> if they were encoded in utf-8, OpenOffice could read the document without
> any encoding declaration, but with iso-8859-1 the oDB.parse
> function just returns null. no errors/exceptions or anything, just null.
>
> if I in that file manually add <?xml version="1.0" encoding="iso-8859-1"
> ?> at the start, openoffice can read it perfectly..
>
> so is there some way I can force the dom parser to use iso-8859-1
> instead of utf-8 ?
> it would be great if I could do
> domDoc = oDB.parse(oInpStream, "iso-8859-1")
> and it would work, but from what I can see there is no function for this
> in the DocumentBuilder, nor is there anything like this in the
> inputstream object or the simplefileaccess object.
>
> I should be able to get around this problem by programmatically making a
> copy of the file with the <?... part inserted first, and then reading
> the xml from my modified file, but this is only a last-resort
> solution.
>
>