Re: [api-dev] problems reading xml file with com.sun.star.xml.dom.DocumentBuilder

2006-09-26 Thread Christoph Jopp
Sorry, this seems to be a platform-independent problem after all.
I could reproduce it on the Linux machine as well; at first I simply
had no truly ISO-8859-1 encoded XML test document.
In the code I sent, the TextInputStream does in fact supply the correct
character encoding, but it turns out that the DocumentBuilder seems to
look only inside the stream itself for the encoding. So it doesn't help
to set the correct character encoding on the stream; you must provide
the encoding declaration inside the stream (here, in the first line of
your XML document).
The only ways I can think of to work around this problem are:
1. Write this declaration into your file (as you stated).
2. Somehow write this declaration into your stream first (I don't know
yet how to do this).
3. Convert the encoding of your stream (maybe by reading bytes from the
input stream and writing UTF-8 to the parser, but how?).
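Options 2 and 3 can be sketched outside OpenOffice. The following is only an illustration using Python's standard library (xml.dom.minidom), not the UNO DocumentBuilder; the function names and the sample document are made up for the example, and how to do the same with UNO streams remains the open question above.

```python
import xml.dom.minidom

# Declaration to put in front of documents that lack one (option 2).
XML_DECL = b'<?xml version="1.0" encoding="iso-8859-1"?>\n'


def parse_with_prepended_declaration(raw: bytes):
    """Option 2: prepend the encoding declaration to the raw bytes."""
    return xml.dom.minidom.parseString(XML_DECL + raw)


def parse_after_transcoding(raw: bytes):
    """Option 3: decode ISO-8859-1, re-encode as UTF-8, then parse."""
    utf8_bytes = raw.decode("iso-8859-1").encode("utf-8")
    return xml.dom.minidom.parseString(utf8_bytes)


# Hypothetical sample: a document without declaration, containing the
# characters a-ring, ae, o-slash, stored as ISO-8859-1 bytes.
raw = "<test><test2>\u00e5\u00e6\u00f8</test2></test>".encode("iso-8859-1")

text_a = parse_with_prepended_declaration(raw).getElementsByTagName("test2")[0].firstChild.data
text_b = parse_after_transcoding(raw).getElementsByTagName("test2")[0].firstChild.data
```

Both variants leave the file on disk untouched; they only change the bytes handed to the parser.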

Sorry again for not really helping you.

Maybe somebody else?

By the way: to get the build number without writing code, you can open
the About box from the Help menu and type sdt while keeping the Ctrl key
pressed for all three letters.

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [api-dev] problems reading xml file with com.sun.star.xml.dom.DocumentBuilder

2006-09-25 Thread Christian Andersson
Hmm, this is not working for me; I still get a null object from oDB.parse...

What system did you test this on?
I am running this on Windows 2003 Server and OpenOffice 2.0.
(I know that there is a way to get the build number, but I keep forgetting it.)

-- 
Christian Andersson - [EMAIL PROTECTED]

Configuration and Collaboration for OpenOffice.org
Open Framework Systems AS http://www.ofs.no




Re: [api-dev] problems reading xml file with com.sun.star.xml.dom.DocumentBuilder

2006-09-25 Thread Christoph Jopp
The system I tested it on was a Linux machine ;-) so it may well be that
there is a difference.
To check it on a Windows (XP) machine I have to wait until the evening.
But what I found in the IDL reference might help:
It says the character encoding name is used according to this document:
http://www.iana.org/assignments/character-sets
So a different spelling might be needed, and you could try some of
these possibilities I found there:

Name: ISO_8859-1:1987[RFC1345,KXS2]
MIBenum: 4
Source: ECMA registry
Alias: iso-ir-100
Alias: ISO_8859-1
Alias: ISO-8859-1 (preferred MIME name)
Alias: latin1
Alias: l1
Alias: IBM819
Alias: CP819
Alias: csISOLatin1

If any of these works, please tell me. Otherwise I'll check it this
evening on my Windows machine.
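As a side illustration that these IANA names all denote the same encoding, here is a small sketch using Python's codec registry. This is an assumption-laden aside: it shows how Python resolves the names and says nothing about which spellings OpenOffice's setEncoding accepts.

```python
import codecs

# Alias names copied from the IANA entry quoted above.
IANA_NAMES = [
    "iso-ir-100", "ISO_8859-1", "ISO-8859-1", "latin1",
    "l1", "IBM819", "CP819", "csISOLatin1",
]

# Map every alias to the canonical codec name Python uses for it.
resolved = {name: codecs.lookup(name).name for name in IANA_NAMES}
all_same = len(set(resolved.values())) == 1

# Decoding the same ISO-8859-1 bytes with each alias gives one result.
sample = "\u00e5\u00e6\u00f8".encode("iso-8859-1")
decoded = {sample.decode(name) for name in IANA_NAMES}
```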






Re: [api-dev] problems reading xml file with com.sun.star.xml.dom.DocumentBuilder

2006-09-22 Thread Christian Andersson
Thank you, I'll try that at once. And don't worry about the Norwegian,
I'm not good at it either.

-- 
Christian Andersson - [EMAIL PROTECTED]

Configuration and Collaboration for OpenOffice.org
Open Framework Systems AS http://www.ofs.no




Re: [api-dev] problems reading xml file with com.sun.star.xml.dom.DocumentBuilder

2006-09-21 Thread Christoph Jopp
Kjære Christian,
for meg følgende code virker (Dear Christian, the following code works for me):

oSFA = createUnoService("com.sun.star.ucb.SimpleFileAccess")
oInpStream = oSFA.openFileRead(sUrl)
' Wrap the raw stream so that a character encoding can be set on it
oTextInpStream = createUnoService("com.sun.star.io.TextInputStream")
oTextInpStream.setInputStream(oInpStream)
oTextInpStream.setEncoding("iso-8859-1")
oDB = createUnoService("com.sun.star.xml.dom.DocumentBuilder")
domDoc = oDB.parse(oTextInpStream)
oInpStream.closeInput()

Sorry for my bad Norwegian, but it has been a long time since I was there.
About the code:
You have to use a TextInputStream to be able to set the encoding.

Hope it helps.
Ha det bra (take care),
Christoph


Christian Andersson wrote:
 I have a small problem. In StarBasic I'm using (almost) the following
 code (there might be small mistakes since I'm writing this from memory)
 to read and parse an XML document:

   oSFA = createUnoService("com.sun.star.ucb.SimpleFileAccess")
   oInpStream = oSFA.openFileRead(sUrl)
   oDB = createUnoService("com.sun.star.xml.dom.DocumentBuilder")
   domDoc = oDB.parse(oInpStream)
   oInpStream.closeInput()

 This works almost perfectly for me; I say almost because there are
 some XML documents that it cannot read.

 The problem is that some documents (generated by a third-party system
 which I cannot change) do not start with an XML declaration like this:
 <?xml version="1.0" encoding="utf-8" ?>

 They just start with the XML tags directly, like this:

 <test>
   <test2>
 .
   </test2>
 </test>

 That in itself is fine; I have other XML documents that also look like
 this, and OpenOffice can read and parse them.
 However, these problematic documents contain national characters (åæø)
 encoded as ISO-8859-1, and that is the problem.
 If they were encoded as UTF-8, OpenOffice could read the document without
 any encoding declaration, but with ISO-8859-1 the oDB.parse function
 just returns null. No errors, exceptions or anything, just null.

 If I manually add <?xml version="1.0" encoding="iso-8859-1" ?>
 at the start of that file, OpenOffice can read it perfectly.

 So is there some way I can force the DOM parser to use ISO-8859-1
 instead of UTF-8?
 It would be great if I could do
   domDoc = oDB.parse(oInpStream, "iso-8859-1")
 and it would work, but from what I can see there is no function for this
 in the DocumentBuilder, nor is there anything like it in the
 input stream object or the SimpleFileAccess object.

 I should be able to get around this problem by programmatically making a
 copy of the file, inserting the <?...> part first and then using my
 modified file for reading the XML, but this is only a last-resort
 solution.
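
The behaviour described in this question (UTF-8 assumed when no declaration is present) can be reproduced outside OpenOffice. The following is only a sketch with Python's standard library, not the UNO DocumentBuilder; the helper name and the sample document are assumptions for the example. Note one difference: Python's parser raises an error where oDB.parse is reported to return null.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError

# Hypothetical sample document without an XML declaration; the escapes
# are the characters a-ring, ae and o-slash.
BODY = "<test><test2>\u00e5\u00e6\u00f8</test2></test>"
DECL = b'<?xml version="1.0" encoding="iso-8859-1" ?>'


def parses(raw: bytes) -> bool:
    """Return True if the bytes parse as XML, False on a parse error."""
    try:
        xml.dom.minidom.parseString(raw)
        return True
    except ExpatError:
        return False


# No declaration, UTF-8 bytes: the parser's default assumption holds.
utf8_ok = parses(BODY.encode("utf-8"))
# No declaration, ISO-8859-1 bytes: not valid UTF-8, so parsing fails.
latin1_ok = parses(BODY.encode("iso-8859-1"))
# Same ISO-8859-1 bytes with the declaration in front: parsing works.
declared_ok = parses(DECL + BODY.encode("iso-8859-1"))
```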

   
