Marc Santhoff wrote:
On Tuesday, 29.11.2005, at 09:56 +0100, Stephan Bergmann wrote:

Marc Santhoff wrote:

On Monday, 28.11.2005, at 10:29 +0100, Stephan Bergmann wrote:


Marc Santhoff wrote:


Hi,

I'm using dictionaries from Basic code and noticed a problem. When the
search word from a dictionary entry is inserted into a Writer doc, its
special characters are not displayed correctly.

Try this in a German-localized version:

sub encError
        dls = createUnoService("com.sun.star.linguistic2.DictionaryList")
        dic = dls.getDictionaryByName("soffice.dic")
        entries = dic.getEntries()
        msgbox entries(16).getDictionaryWord()
end sub

In a German-language version of OO.o 1.1.x this should read
"Bemaßungslinien", but the character "ß" is not converted correctly.
The same holds for the German OO.o 2.0-RC1/Windows.

Is this worth filing an issue, or is it pilot error?

It sure sounds like an error (so please file an issue): XDictionaryEntry.getDictionaryWord returns a UNO string, which is Unicode, so there is no excuse for garbling an "ß" (and Basic's msgbox command should also be fully Unicode...).


Thanks for replying.

I just thought I was missing some conversion function or the like,
because all umlauts are garbled too. Each one is shown as two characters
in a Writer doc. From the GUI, everything works as expected ...

You mean, adding text to a writer doc via some Basic code (where the text to be added is represented as a literal Basic string) leads to garbled characters? That's strange. Maybe Andreas Bregas knows whether there is some part of Basic or the Basic IDE that works with locale-dependent text encodings instead of Unicode?


Yes, that's what I wanted to say.

Another test for the German-localized OO.o:

sub encError2
        ' CreateNewDocument comes from the "Tools" library
        BasicLibraries.LoadLibrary("Tools")
        dls = createUnoService("com.sun.star.linguistic2.DictionaryList")
        dic = dls.getDictionaryByName("soffice.dic")
        entries = dic.getEntries()
        tmpDoc = CreateNewDocument("swriter")
        ' replace the document text with a word containing "ß"
        tmpDoc.Text.String = entries(16).getDictionaryWord()
        ' append a word containing "ö" at the end of the text
        tEnd = tmpDoc.Text.getEnd()
        tEnd.String = entries(46).getDictionaryWord()
end sub

This garbles the special characters, too.

Regards,
Marc

Two things I noticed when trying to reproduce this:

1. You must be using a non-UTF-8 locale (probably ISO 8859-1); check the environment variable LANG. If you set LANG to something like "de_DE.UTF-8", the problem should go away.

2. If you modify the Basic script by adding

    tEnd = tmpDoc.Text.getEnd()
    tEnd.String = "äöü"
  end sub

to the end, you see that Basic is not the culprit, as the umlauts show up correctly in the Writer doc regardless of the LANG setting.

I suspect that the OOo dictionary implementation erroneously uses osl_getThreadTextEncoding() (which depends on LANG) to translate the (obviously UTF-8 encoded) strings within the dictionary database to Unicode. Please update the issue (have you already filed one?) accordingly.
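
To illustrate the suspected mechanism, here is a minimal Python sketch. It is not the OOo dictionary code; it only assumes that the word is stored as UTF-8 bytes and then decoded with a locale-dependent single-byte encoding (ISO 8859-1 is taken as an example). The helper name decode_stored_word is hypothetical.

```python
# Sketch of the suspected bug: UTF-8 bytes decoded with the wrong encoding.
# Assumption: the dictionary stores words as UTF-8; ISO 8859-1 stands in for
# whatever encoding a non-UTF-8 LANG setting would select.

def decode_stored_word(stored: bytes, encoding: str) -> str:
    """Hypothetical helper: turn stored bytes into a Unicode string."""
    return stored.decode(encoding)

word = "Bemaßungslinien"
stored = word.encode("utf-8")  # "ß" becomes the two bytes 0xC3 0x9F

correct = decode_stored_word(stored, "utf-8")
garbled = decode_stored_word(stored, "iso-8859-1")

# ascii() escapes non-ASCII characters, so control characters stay visible.
print(ascii(correct), len(correct))
print(ascii(garbled), len(garbled))
```

With the wrong encoding, each of the two UTF-8 bytes of "ß" becomes a character of its own, which matches the observation that every umlaut shows up as two characters.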

-Stephan
