I've been seeing a strange Unicode issue that appeared a couple of days
ago. I'm sorry I can't be more exact, but it's pretty fresh anyway. I
didn't report it immediately when I saw it, because I was trying to
gather the minimum info needed to reproduce it, but I haven't managed
to do that yet.

So this is what I have:

current trunk of leo (I do bzr pull on a daily basis)
python 2.6.2
pyqt-4.4.4
slackware linux 13.0
non-utf8 locale (it's koi8-r)

My workflow where I observe the problem looks like this:

- beginning of the day
- I open a big outline in leo that I work with on a daily basis
- a node body contains mixed English and Russian text
- the Russian text is garbled (I keep a line at the top of the body
and use it as a reference)
- if I look at that part of the outline file with a text editor, it
doesn't look like proper utf8 to me
- here is the hex dump of it:

$ xxd aaa
0000000: 41c3 90c2 bdc3 90c2 b0c3 91c2 82c3 90c2  A...............
0000010: bec3 90c2 bbc3 90c2 b8c3 90c2 b920 c390  ............. ..
0000020: c29a c390 c2be c390 c2b3 c390 c2b0 c390  ................
0000030: c2bd 202d 20c3 90c2 92c3 90c2 bec3 90c2  .. - ...........
0000040: b9c3 90c2 bac3 91c2 832c 20c3 91c2 81c3  ........., .....
0000050: 91c2 8bc3 90c2 bd20 c390 c2a2 c391 c283  ....... ........
0000060: c390 c2b4 c390 c2be c391 c280 c390 c2b0  ................
0000070: 0a0a 0a

- I manually replace the garbage by retyping that line in Russian
- I save the outline
- I look at that part of the outline file again in the text editor; it
is shorter now and looks like proper utf8. I'm sure because the output
of "$ echo <Russian text in koi8-r> | iconv -f koi8-r -t utf8" looks
the same as what I see in the text editor
- here is the hex dump of it:

$ xxd aaa
0000000: d090 d0bd d0b0 d182 d0be d0bb d0b8 d0b9  ................
0000010: 20d0 9ad0 bed0 b3d0 b0d0 bd20 2d20 d092   .......... - ..
0000020: d0be d0b9 d0ba d183 2c20 d181 d18b d0bd  ........, ......
0000030: 20d0 a2d1 83d0 b4d0 bed1 80d0 b00a 0a     ..............

- I close leo
- I look at the outline file with a text editor again, the line in
question stays proper utf8
- I launch leo again and open the outline one more time
- the line in question looks ok in leo
- it looks proper utf8 when I look at the outline file with the text
editor again
- I keep the outline open for the whole day, occasionally modifying it
- I close leo at the end of the day
- the beginning of the next day is similar: when I open the outline for
the first time, I see garbage again instead of Russian
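
For what it's worth, the two dumps look consistent with the good utf8
bytes having been read as Latin-1 and then encoded to utf8 a second
time (for example d0 bd in the good dump shows up as c3 90 c2 bd in the
garbled one). That is only a guess from the bytes, not something
confirmed in leo's code. A minimal sketch that checks it on three
letters copied from the dumps, written for Python 3 rather than the
2.6 I run leo with; the helper names are made up for illustration:

```python
# Python 3 sketch; the helper names below are made up for illustration.

# Three Cyrillic letters as they appear in the second (good) dump.
good = bytes.fromhex("d0bd d0b0 d182")
# The same three letters as they appear in the first (garbled) dump.
garbled = bytes.fromhex("c390c2bd c390c2b0 c391c282")


def double_encode(data):
    """Misread UTF-8 bytes as Latin-1, then encode the result as UTF-8 again."""
    return data.decode("latin-1").encode("utf-8")


def undo_double_encode(data):
    """Reverse the step above: decode as UTF-8, map each char back to one byte."""
    return data.decode("utf-8").encode("latin-1")


print(double_encode(good) == garbled)       # True for these bytes
print(undo_double_encode(garbled) == good)  # True for these bytes
```

If the guess is right, something on the read or write path is treating
already-encoded utf8 as a single-byte string and encoding it again.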

What I don't understand is why the issue doesn't show up right away
when, at the beginning of the day, I fix that line, save, and then
reopen the outline. I could use some suggestions on how to start with
an empty outline and do something that makes the broken Russian text
appear immediately.
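
One thing that might help narrow it down is checking the outline file
after every single step (save, close, reopen, and so on) instead of
eyeballing it in a text editor. Below is a rough Python 3 sketch that
flags lines which look like utf8 that was read as Latin-1 and encoded
to utf8 again, which is what the c3 90 c2 bd patterns in the first dump
suggest. The function names are hypothetical, and the check is only a
heuristic: legitimate text made purely of Latin-1 accented characters
could in rare cases be flagged too.

```python
import sys


def looks_double_encoded(line):
    """Return True if a line of bytes looks like UTF-8 encoded twice via Latin-1."""
    if max(line, default=0) < 0x80:
        return False  # pure ASCII, nothing to garble
    try:
        # Garbled text decodes to chars <= U+00FF whose byte values
        # form valid UTF-8 again; properly encoded Cyrillic fails the
        # Latin-1 step because its code points are above U+00FF.
        line.decode("utf-8").encode("latin-1").decode("utf-8")
    except UnicodeError:
        return False
    return True


def suspicious_lines(data):
    """Return 1-based numbers of the lines in data that look double-encoded."""
    return [
        number
        for number, line in enumerate(data.split(b"\n"), start=1)
        if looks_double_encoded(line)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check.py <path to the outline file>
    with open(sys.argv[1], "rb") as handle:
        print(suspicious_lines(handle.read()))
```

Running this on the file right after each step should show at which
point the garbled line first appears on disk.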

Edward, if you think it would be easier to track my feedback on this
separately, feel free to move this post to a new topic and we'll
continue the discussion there.