Re: [api-dev] problems reading xml file with com.sun.star.xml.dom.DocumentBuilder

Christoph Jopp Thu, 21 Sep 2006 14:31:00 -0700

Dear Christian (Kjære Christian),
the following code works for me:

    ' sUrl holds the file URL of the XML document
    oSFA = createUnoService("com.sun.star.ucb.SimpleFileAccess")
    oInpStream = oSFA.openFileRead(sUrl)
    ' Wrap the raw stream so that an encoding can be set on it
    oTextInpStream = createUnoService("com.sun.star.io.TextInputStream")
    oTextInpStream.setInputStream(oInpStream)
    oTextInpStream.setEncoding("iso-8859-1")
    oDB = createUnoService("com.sun.star.xml.dom.DocumentBuilder")
    domDoc = oDB.parse(oTextInpStream)
    oInpStream.closeInput()


Sorry for my bad Norwegian, but it has been a long time since I was there.
About the code:
You have to use a TextInputStream to be able to set the encoding.
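
To illustrate the point about setting the encoding, here is a minimal sketch in plain Python (standard library only, not the UNO API). The sample document and the helper name parse_with_encoding are made up for the illustration. It assumes, as in the quoted report, a file without an XML declaration whose national characters are stored as iso-8859-1: a parser that falls back to UTF-8 rejects the bytes, while decoding them with the known encoding first lets the same content parse.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical sample: no XML declaration, national characters in iso-8859-1.
raw = "<test><test2>åæø</test2></test>".encode("iso-8859-1")


def parse_with_encoding(data: bytes, encoding: str):
    """Decode the bytes with a known encoding, then parse the resulting text."""
    return minidom.parseString(data.decode(encoding))


# Without a declaration the parser assumes UTF-8 and rejects these bytes.
try:
    minidom.parseString(raw)
    parsed_raw = True
except ExpatError:
    parsed_raw = False

# Telling the reader which encoding to use makes the same content parse.
doc = parse_with_encoding(raw, "iso-8859-1")
text = doc.documentElement.firstChild.firstChild.data
```

The StarBasic code above plays the same role inside OpenOffice: the encoding is supplied on the stream instead of being read from the document.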

Hope it helps.
Ha det bra,
    Christoph


Christian Andersson wrote:
> I have a small problem. In StarBasic I'm using (almost) the following
> code (there might be small mistakes since I'm writing this from memory)
> to read and parse an XML document:
>
>       oSFA = createUNOService ("com.sun.star.ucb.SimpleFileAccess")
>       oInpStream = oSFA.openFileRead(sUrl)
>       oDB = createUnoService("com.sun.star.xml.dom.DocumentBuilder")
>       domDoc = oDB.parse(oInpStream)
>       oInpStream.closeInput
>
> This works almost perfectly for me; I say almost because there are
> some XML documents that it cannot read.
>
> The problem is that some documents (generated by a third-party
> system which I cannot change) do not start with an XML declaration
> like this:
> <?xml version="1.0" encoding="utf-8" ?>
>
> They just start with the XML tags directly, like this:
>
> <test>
>   <test2>
>     .....
>   </test2>
> </test>
>
> That in itself is fine: I have other XML documents that also look like
> this, and OpenOffice can read and parse them.
> However, the problematic documents contain national
> characters (åæø) encoded in iso-8859-1, and this is the problem.
> If they were encoded in utf-8, OpenOffice could read the document without
> any encoding declaration, but with iso-8859-1 the oDB.parse
> function just returns null. No errors, no exceptions, just null.
>
> If I manually add <?xml version="1.0" encoding="iso-8859-1" ?>
> at the start of that file, OpenOffice can read it perfectly.
>
> So is there some way I can force the DOM parser to use iso-8859-1
> instead of utf-8?
> It would be great if I could do
>       domDoc = oDB.parse(oInpStream, "iso-8859-1")
> and it would work, but from what I can see there is no function for this
> in the DocumentBuilder, nor is there anything like it in the
> input stream object or the SimpleFileAccess object.
>
> I should be able to get around this problem by programmatically making a
> copy of the file, inserting the <?... part first, and then reading the XML
> from my modified file, but this is only a last-resort
> solution.
>
>   
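
The last-resort workaround described in the quoted message (adding a declaration before parsing) can be sketched as follows. This is again plain Python with the standard library rather than the UNO API, and it works on the bytes in memory instead of writing a modified copy of the file. The function name ensure_declaration and the sample data are assumptions made for the illustration, and the prefix check is deliberately simple.

```python
from xml.dom import minidom


def ensure_declaration(data: bytes, encoding: str) -> bytes:
    """Prepend an XML declaration naming `encoding` unless one is already there."""
    # Simple prefix check; a byte order mark or leading whitespace is not handled.
    if data.startswith(b"<?xml "):
        return data
    header = '<?xml version="1.0" encoding="%s"?>\n' % encoding
    return header.encode("ascii") + data


# Hypothetical sample: declaration missing, content stored as iso-8859-1.
raw = "<test><test2>åæø</test2></test>".encode("iso-8859-1")

patched = ensure_declaration(raw, "iso-8859-1")
doc = minidom.parseString(patched)
value = doc.getElementsByTagName("test2")[0].firstChild.data
```

Calling ensure_declaration on data that already starts with a declaration returns it unchanged, so the helper can be applied to every incoming document regardless of its origin.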
