Re: very small idea

Pablo Saratxaga Thu, 19 Sep 2002 13:16:27 -0700

Kaixo!

On Thu, Sep 19, 2002 at 07:31:30PM +0200, Mike Fabian wrote:
> "Maiorana, Jason" <[EMAIL PROTECTED]> writes:
> 
> > If anyone using vim6 wants UTF8 copy/paste, I have a patch to it
> > that adds support for UTF8_STRING. It hasnt been added to the main
> > vim yet. It only works when multibyte is enabled and iconv is available.
> 
> This appears to work already for me with gvim (GTK-1 version).
> 
> I tried to cut and paste between gvim, mlterm, xterm, XEmacs, kedit.
> Worked in all directions without problems with UTF-8 encoded
> Japanese text.


Are you sure it is UTF8_STRING and not COMPOUND_TEXT?
CTEXT uses iso-2022, which in theory can cover the whole of Unicode and even
more.
In theory.
In practice, both applications must support the iso-2022 escape sequences
and encodings used. That is not much of a problem for Japanese, which should
be supported by almost all programs; but what if you copy a text
that has Lao Unicode chars? Or Hindi? Or Tigrinya? Or ...

With the arrival of iso-8859-15 locales, a lot of people got, for simple
Latin accented letters, a lot of trash in the cut and paste buffer;
things like "rêve" being put as "\e\xiso-8859-15\yrêve"
(\x and \y being some control codes I don't remember right now)
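
To make that "trash" a bit more concrete, here is a rough Python sketch of
how such a segment could be built. It assumes the Compound Text "extended
segment" layout as I understand it: ESC % / 1, two length bytes with the
high bit set, the encoding name, an STX byte, then the raw text. The
function name and the exact encoding name string are made up for
illustration; check the Compound Text spec before relying on the details.

```python
# Hypothetical helper, not taken from any real program: builds what a
# Compound Text "extended segment" is assumed to look like for an encoding
# that has no standard iso-2022 escape sequence.

def ctext_extended_segment(encoding_name: str, data: bytes) -> bytes:
    # The length covers everything after the two length bytes:
    # encoding name + STX separator + the encoded text.
    payload = encoding_name.encode("ascii") + b"\x02" + data
    if len(payload) >= 128 * 128:
        raise ValueError("payload too long for a single segment")
    m = 0x80 | (len(payload) >> 7)    # high 7 bits of the length
    l = 0x80 | (len(payload) & 0x7F)  # low 7 bits of the length
    # ESC % / 1 is assumed to mean "one byte per character".
    return b"\x1b%/1" + bytes([m, l]) + payload

# "r\u00eave" is "reve" with a circumflex on the first e.
raw = "r\u00eave".encode("iso8859-15")
segment = ctext_extended_segment("iso8859-15", raw)
print(segment)
```

A program that does not know this segment type has little choice but to
show the escape, the length bytes and the encoding name as literal
characters in front of the text.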

The problem with CTEXT is that it is very complex (in fact it can be of
arbitrary complexity; nothing forbids using a different encoding for each
letter of the string you copy and paste, for example).
And the number of encodings actually supported varies from program to
program; there is no certainty about what will work and what will not.

With UTF8_STRING, however, you pass a simple string without any extra
stuff. It is much simpler to implement: it is as simple as the STRING
one, but not limited to latin1.
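
As a small illustration of that difference, here is a Python sketch that
treats STRING as plain iso-8859-1 bytes and UTF8_STRING as plain UTF-8
bytes. The function names are invented for the example, and real STRING
handling has a few more rules about control characters that are ignored
here.

```python
# Hypothetical helpers: model the two "simple" targets as bare encodings.

def encode_for_target(text: str, target: str) -> bytes:
    if target == "UTF8_STRING":
        return text.encode("utf-8")      # any Unicode text fits
    if target == "STRING":
        return text.encode("iso8859-1")  # latin1 only, may raise
    raise ValueError("unknown target: " + target)

def fits_in_string(text: str) -> bool:
    # True when the text survives the latin1-only STRING target.
    try:
        encode_for_target(text, "STRING")
        return True
    except UnicodeEncodeError:
        return False

french = "r\u00eave"               # accented Latin text
japanese = "\u65e5\u672c\u8a9e"    # three kanji
print(fits_in_string(french), fits_in_string(japanese))
```

Both targets hand over the bytes as they are, with no escape sequences;
the only thing that changes is which characters can be expressed at all.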

[root@test root]# strings /usr/X11R6/bin/gvim | grep 'STRING\|TEXT'
_VIM_TEXT
COMPOUND_TEXT
STRING
[root@test root]# strings /usr/bin/yudit | grep 'STRING\|TEXT'
UTF8_STRING
COMPOUND_TEXT

As you can see, gvim supports COMPOUND_TEXT and STRING for cut and paste,
while yudit supports UTF8_STRING and COMPOUND_TEXT.

Cut and paste between yudit and gvim will be possible, as COMPOUND_TEXT is
common to both, but most likely the range covered by COMPOUND_TEXT will be
less than the full Unicode range.
The old "STRING" is mostly useless, as it only covers iso-8859-1 and nothing
else; note that yudit doesn't even bother to support it.
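
The choice of a common target can be sketched in a few lines of Python.
This is a simplification: in real X11 the requestor asks the selection
owner for its list of targets and then picks one, and the preference order
below is my own assumption. The two sets are just the names found by the
strings commands above.

```python
# Assumed preference order, best first (not mandated by any spec).
PREFERENCE = ("UTF8_STRING", "COMPOUND_TEXT", "STRING")

def pick_target(offered, understood):
    # Return the most preferred text target both sides know, or None.
    common = set(offered) & set(understood)
    for target in PREFERENCE:
        if target in common:
            return target
    return None

# Target names seen in the two binaries.
GVIM = {"_VIM_TEXT", "COMPOUND_TEXT", "STRING"}
YUDIT = {"UTF8_STRING", "COMPOUND_TEXT"}

print(pick_target(GVIM, YUDIT))
print(pick_target(YUDIT, YUDIT))
```

With these two lists the only shared target is COMPOUND_TEXT, so the
transfer falls back to it even though yudit could do better.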

Both Gtk2 and Qt3 toolkits support UTF8_STRING and COMPOUND_TEXT.

UTF8_STRING is the way to go; COMPOUND_TEXT is still needed for compatibility
with old programs not supporting UTF8_STRING yet.
"STRING" does not seem to be used much nowadays.

> And when Vim running in a terminal, it depends on the terminal whether
> it works or not.

Of course.

xterm and gnome-terminal support both COMPOUND_TEXT and UTF8_STRING,
rxvt supports only COMPOUND_TEXT, etc.

Note that program A and program B both supporting COMPOUND_TEXT doesn't mean
cut and paste will work.
Maybe program A only supports iso-8859-{1,2,9} and not Cyrillic.
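
A rough way to picture this in Python: since CTEXT may switch encoding at
any point, a character gets through only if at least one charset the
receiver knows can represent it. The helper names and the charset list for
"program A" are hypothetical; Python's codec names stand in for the
iso-2022 charsets.

```python
# Hypothetical check of which characters a receiver cannot represent.

def encodable(ch: str, charset: str) -> bool:
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

def unsupported_chars(text: str, charsets) -> list:
    # Characters that none of the receiver's charsets can represent.
    return [ch for ch in text
            if not any(encodable(ch, cs) for cs in charsets)]

# "Program A" from the text: iso-8859-{1,2,9}, no Cyrillic.
PROGRAM_A = ("iso8859-1", "iso8859-2", "iso8859-9")

accented = "r\u00eave"
cyrillic = "\u041f\u0440\u0438"   # three Cyrillic letters
print(unsupported_chars(accented, PROGRAM_A))
print(unsupported_chars(cyrillic, PROGRAM_A))
print(unsupported_chars(cyrillic, PROGRAM_A + ("iso8859-5",)))
```

Both programs can "support COMPOUND_TEXT" and the paste still loses every
character that falls outside the receiver's own list.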

> Can you tell me how to reproduce a situation where it doesn't work and
> where the patch helps?

Try the sample file coming with yudit that has semi-graphics, IPA, Greek
polytonic, Thai, Tamil, etc. and do a cut and paste of the whole text.

Note that some programs implement UTF-8 strings as an encoding of
COMPOUND_TEXT.
That is possible, but nothing guarantees that the other program will
understand it.
 
-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://chanae.stben.be/pablo/           PGP Key available, key ID: 0xD9B85466
[you can write me in Walloon, Spanish, French, English, Italian or Portuguese]
