Re: Dear Bram

Ben Schmidt Sat, 19 Feb 2011 07:04:12 -0800

Plus, it's only the queue of incoming keypresses - that queue isn't
going to stay very big for very long.


It's not just the input queue that's in question here, it's everywhere
in Vim where keypresses are represented. For instance, the right hand
sides of mappings are not primarily characters, but lists of keypresses.
They need the same amount of expressive power to work properly.


Yes. And the scheme I created is no less expressive than the byte
queues. In fact it is more expressive, able to disambiguate neatly
situations that cannot normally be represented - see also my other mail
on the thread a few minutes ago.


Yeah, there was never anything lacking in your scheme in terms of
expressive power. It's just that it's not only the input queue that
would need to use it. It's a lot more.

When macros are recorded, registers, which usually are primarily lists
of characters, are used to store keypresses. Likewise, for feedkeys() to
work, its input, a string, needs to be able to represent keypresses.

And I'm sure there are plenty more subtleties.


Yes - again, my point of these structures is that they exactly _do_
represent these keypresses.


Yes. But what happens when you then edit that macro by putting the
register into a buffer, changing it, and yanking it again? This is not
uncommonly done. How should the registers be stored in .viminfo? How do
you write the input to the feedkeys function as a string in vimscript?
Etc. These are the kinds of issues I was trying to raise.

The bottom line, though, is that changing to a struct-based approach
could make the job absolutely huge, requiring reworking and/or
redesigning how maps, registers, etc. all work. And it might not even be
possible since, e.g. registers need to be able to do both characters and
keys.


But every character -can- be represented as a key - namely, the key that
generates it.


When there is one. Or a codepoint in some encoding. Yeah, your struct
does allow for that. The struct isn't inexpressive, it's just that it's
big, and if it needs to be used in a LOT of places (which seem to just
be increasing...input queue, mappings, registers, strings, buffers?!),
suddenly not just one thing is getting a bit bigger--a lot of things are
getting bigger--potentially a lot bigger! If every time you yank text
into a register it gets 12 times larger, that may cause some problems!

Let me again repeat my structure:

   struct keypress {
     unsigned int is_special : 1;
     unsigned int modifiers  : 10;
     unsigned int codepoint  : 21;
   };

Any character has a Unicode codepoint (cp). That's represented by


Well, a codepoint in whatever encoding you are using.

   { 0, 0, cp }

Any keypress has a symbolic key number, taken from some arbitrary
enumeration - I care not where.

  { 1, mod, key }

Now, we can also represent those modified Unicode keys such as Ctrl-I
and Shift-Space which previously were impossible:

  { 0, MOD_CTRL, 'i' }    /* Ctrl-I */
  { 0, MOD_SHIFT, ' ' }   /* Shift-Space */


Yeah. The benefit here is that different keyboard layouts can be
represented, e.g. keyboards which have single keys for accented
characters can represent those keys with modifiers, easily.

OK - well, if you feel confident that the existing prefix-escape
mechanism can completely and unambiguously represent all these possible
keystrokes, then sure, that way might result in less code change
overall.


I think with some careful design it could, which may well require some
reworking of how it's currently done. There is the issue, of course,
that <80> is a valid character in some encodings, too, and I don't know
if this is accounted for. It needs to be. So maybe the escape mechanism
needs to be a bit more formal (if it can be--but maybe there's no way
around this, or maybe that's a longer-term consideration). Definitely
the issues have to be bashed out. I still think this would be the
easiest and best way forward, though. But it's far from my call!

Regarding that whole meta issue that you raised earlier--I think, yes,
using an 8-bit-high representation for meta is completely out of the
question. Part of the input code should be transforming that to a proper
Vim-internal escape sequence for terminals that use it. Same for other
things. The input code should transform input into Vim's internal
representation which should be carefully designed not to be ambiguous,
etc., but which, nevertheless, is at its most fundamental, a byte stream
so it can be used in the input queue, mappings, registers, strings and
even buffers.

But do make quite sure it can represent them. Specifically consider the
two tricky special-cases I suggested - repeated here again:

   Tab                      { 1, 0, KEY_TAB }
   Ctrl-I                   { 0, MOD_CTRL, 'i' }
   Ctrl-Shift-I             { 0, MOD_CTRL, 'I' }

   Escape C                 { 1, 0, KEY_ESCAPE }, { 0, 0, 'C' }
   Alt+C                    { 0, MOD_ALT, 'C' }
   é                        { 0, 0, 0xe9 }
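
As a rough illustration of the cases above, here is how the three-field
struct could be filled in and compared in C. The MOD_* and KEY_* names
come from this thread, but their numeric values, and the helper names
keypress_char() and keypress_equal(), are assumptions made up for this
sketch; none of it is existing Vim code.

```c
#include <assert.h>
#include <stdbool.h>

/* Names from the thread; the numeric values are placeholders. */
enum { MOD_SHIFT = 1 << 0, MOD_CTRL = 1 << 1, MOD_ALT = 1 << 2 };
enum { KEY_TAB = 1, KEY_ESCAPE = 2 };

struct keypress {
    unsigned int is_special : 1;   /* 1: codepoint holds a symbolic key number */
    unsigned int modifiers  : 10;  /* bitwise OR of MOD_* values */
    unsigned int codepoint  : 21;  /* wide enough for U+10FFFF */
};

/* Build the keypress for a plain, unmodified character. */
static struct keypress keypress_char(unsigned int cp)
{
    struct keypress k = { 0, 0, 0 };
    assert(cp <= 0x10FFFF);        /* must be a valid Unicode codepoint */
    k.codepoint = cp;
    return k;
}

/* Compare field by field, so no padding bits are ever inspected. */
static bool keypress_equal(struct keypress a, struct keypress b)
{
    return a.is_special == b.is_special
        && a.modifiers  == b.modifiers
        && a.codepoint  == b.codepoint;
}

/* The tricky cases from the table above. */
static const struct keypress KP_TAB    = { 1, 0, KEY_TAB };
static const struct keypress KP_CTRL_I = { 0, MOD_CTRL, 'i' };
static const struct keypress KP_ALT_C  = { 0, MOD_ALT, 'C' };
```

With these definitions keypress_equal(KP_TAB, KP_CTRL_I) is false, and
Alt+C is a single value distinct from both a plain 'C' and from é.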


Yes. This all has to be carefully accounted for.

Maybe the way to do it is to have printable characters represented just
as printable characters, 'default' control characters optionally just as
themselves (e.g. <09> for tab), to use when a terminal that can't
distinguish between this and Ctrl-I is used, and (whenever possible),
use <80>+flag_byte+character for control/special keys. One of the
flag_byte flags could be a 'special' flag, which means the 'character'
in the third byte would be some kind of mnemonic for the key (e.g. u for
the up arrow). When the key is not 'special' the character it produces
would be the third byte (and perhaps later for multibyte characters,
e.g. if ctrl is used with an accented character on keyboards that have a
specific key for this). How all this works depends a bit on how the
keyboard interfacing works--maybe it should just use keycodes if
characters are not readily obtainable. And maybe you need two flag bytes
if there are more than 7 modifier keys that need to be distinguished.
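
To make the <80>+flag_byte+character idea concrete, here is a minimal
sketch of an encoder in C. The flag layout (KF_*), the name
encode_key(), and the restriction to a single-byte character are all
assumptions for illustration; this is not how Vim's existing escape
sequences are laid out.

```c
#include <assert.h>
#include <stddef.h>

#define KS_ESCAPE  0x80u   /* the <80> introducer byte */

/* Hypothetical flag-byte layout: one 'special' bit plus modifiers. */
#define KF_SPECIAL 0x01u   /* third byte is a key mnemonic, e.g. 'u' for Up */
#define KF_SHIFT   0x02u
#define KF_CTRL    0x04u
#define KF_ALT     0x08u

/*
 * Encode one key into out. Returns the number of bytes written
 * (1 or 3), or 0 if the buffer is too small.
 */
static size_t encode_key(unsigned char flags, unsigned char ch,
                         unsigned char *out, size_t outlen)
{
    assert(out != NULL);

    /* Plain characters and 'default' control characters such as <09>
     * are stored as themselves. A literal <80> must not be, or it
     * would be mistaken for the introducer. */
    if (flags == 0 && ch != KS_ESCAPE) {
        if (outlen < 1)
            return 0;
        out[0] = ch;
        return 1;
    }

    /* Everything else becomes <80> flag_byte character. Note that a
     * literal <80> ends up with a zero flag byte here, which a real
     * design would have to avoid in NUL-terminated strings. */
    if (outlen < 3)
        return 0;
    out[0] = KS_ESCAPE;
    out[1] = flags;
    out[2] = ch;
    return 3;
}
```

For example, encode_key(KF_CTRL, 'i', ...) yields the three bytes
<80> <04> i, while encode_key(0, 0x09, ...) yields the single byte <09>
that a terminal unable to tell Tab from Ctrl-I would deliver.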

Then the behaviour of mappings needs to be defined--if there is a
mapping for ^I (<09>) and I push tab, will it be triggered? If in a
terminal which can't distinguish control-I and tab, and a ^I is
received, should the mapping for Tab or control-I be triggered? If
there's a mapping for ^I as well as Tab, which has precedence?


I propose a pair of boolean settings, with the following defaults:

  :set nomodifiedunicode
  :set altisescape

Under these settings,

   :map <Ctrl-I> ... vs :map <Tab> ... shall have the same effect, each
     overwriting the effects of the other; last one wins.

   Pressing the Tab key or Ctrl-I shall both invoke the last mapping
     registered.

   :map <Escape>C ... vs :map <Alt+C> ... shall behave similarly, each
     overwrites the other. Typing either key sequence will invoke the
     last map to be registered.

If a user decides "I want to use those extra Ctrl- keys", they can set
in their .vimrc

   :set modifiedunicode

At this point,

   :map <Ctrl-I> and :map <Tab> shall fill two -different-
     slots of the mapping list, and typing either key will activate the
     indicated mapping.
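
One way to picture the effect of 'modifiedunicode' is as a
normalisation step applied both when a mapping is registered and when a
key is looked up. The sketch below is only an assumption about how that
could work: the function name map_slot() and the MOD_*/KEY_* values are
invented here, and it folds only Ctrl-I (and Ctrl-Shift-I) onto Tab.

```c
#include <assert.h>
#include <stdbool.h>

/* Names from the thread; the numeric values are placeholders. */
enum { MOD_SHIFT = 1 << 0, MOD_CTRL = 1 << 1, MOD_ALT = 1 << 2 };
enum { KEY_TAB = 1 };

struct keypress {
    unsigned int is_special : 1;
    unsigned int modifiers  : 10;
    unsigned int codepoint  : 21;
};

/*
 * Return the key under which a mapping is stored and looked up.
 * With modifiedunicode off, Ctrl-I collapses onto Tab, so both share
 * one slot and the last :map wins. With it on, keys are left alone.
 */
static struct keypress map_slot(struct keypress k, bool modifiedunicode)
{
    assert(k.codepoint <= 0x10FFFF);

    if (!modifiedunicode && !k.is_special && k.modifiers == MOD_CTRL
        && (k.codepoint == 'i' || k.codepoint == 'I')) {
        struct keypress tab = { 1, 0, KEY_TAB };
        return tab;
    }
    return k;
}
```

Under this sketch, map_slot() of Ctrl-I and of Tab give the same struct
when the option is off and different structs when it is on.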


What should Vim do in a terminal that can't distinguish between those
keys when this option is in effect? This is the same question as below,
really.

Maybe we need an annotation to the mapping, like <default> or something
that says "this mapping is acceptable to use when we're not sure if this
exact combo was pressed, but we got a code that 'could be it' in some
terminal."

If a user further decides "I want the Alt modifier to not mean Escape
prefixing", then

   :set noaltisescape

At which point,

   :map <Escape>C and :map <Alt+C> shall also be different.


I think there might already be an option for this. Or something in
termcap? I seem to remember reading some stuff about it in the Vim
manual. I guess, though, the 'non-alt prefixing' of Vim probably means
to use 8-bit-high for meta, which, as we already said, is pretty useless.

I'm not sure two options are needed. Probably a single option suffices
to deal with both situations.

The only remaining ambiguity to be answered is, what happens in the
following case:

   :set nomodifiedunicode
   :map <Tab> ONE
   :map <Ctrl-I> TWO
   :set modifiedunicode

   now press Tab or Ctrl-I

I don't have an easy answer for this case; no particular behaviour jumps
out at me as "obviously correct".


I agree. If we go with the <default> annotation to mappings (preferably
with a better keyword), I guess that is the one that wins and takes
precedence. It may not even need to delete the other ones, so both slots
could remain active, but it would just run the 'default' one.

All these kinds of questions need clear answers, and sensible
specifications and design need to address them, avoid ambiguity, and
take care to require as little work as possible for users, plugin
authors, etc. to update their code and mappings.

Then the input code needs to be reworked in all the GUIs and in the
terminal handling to generate the appropriate internal codes,
consistently across all the different GUIs, etc., in line with that
specification/design.


I propose that the structure approach makes this simpler. It is now
clearly obvious how e.g. GTK should fill in these structures, as GDK
keypress events already have unicode / modifier bits or symbolic keys.
It's also obvious how to render such a structure as a string, possibly
wrapping <> around it, possibly prefixing C- or A- or whatever as
appropriate.
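
A minimal sketch of such a renderer, assuming the struct from earlier in
the thread. The function names, the key-name table and the MOD_*/KEY_*
values are hypothetical, and the U+XXXX fallback for non-ASCII
characters is just a placeholder choice.

```c
#include <assert.h>
#include <stdio.h>

/* Names from the thread; the numeric values are placeholders. */
enum { MOD_SHIFT = 1 << 0, MOD_CTRL = 1 << 1, MOD_ALT = 1 << 2 };
enum { KEY_TAB = 1, KEY_ESCAPE = 2 };

struct keypress {
    unsigned int is_special : 1;
    unsigned int modifiers  : 10;
    unsigned int codepoint  : 21;
};

/* Hypothetical name table for symbolic keys. */
static const char *special_name(unsigned int key)
{
    switch (key) {
    case KEY_TAB:    return "Tab";
    case KEY_ESCAPE: return "Esc";
    default:         return "Unknown";
    }
}

/* Write a printable form of k into buf; returns snprintf's result. */
static int keypress_format(struct keypress k, char *buf, size_t len)
{
    char name[16];
    int plain;

    assert(buf != NULL && len > 0);

    if (k.is_special)
        snprintf(name, sizeof name, "%s", special_name(k.codepoint));
    else if (k.codepoint == ' ')
        snprintf(name, sizeof name, "Space");
    else if (k.codepoint > 0x20 && k.codepoint < 0x7f)
        snprintf(name, sizeof name, "%c", (int)k.codepoint);
    else
        snprintf(name, sizeof name, "U+%04X", (unsigned int)k.codepoint);

    /* An unmodified printable ASCII character is shown as itself. */
    plain = !k.is_special && k.modifiers == 0
            && k.codepoint > 0x20 && k.codepoint < 0x7f;
    if (plain)
        return snprintf(buf, len, "%s", name);

    /* Everything else gets <> around it, with modifier prefixes. */
    return snprintf(buf, len, "<%s%s%s%s>",
                    (k.modifiers & MOD_CTRL)  ? "C-" : "",
                    (k.modifiers & MOD_SHIFT) ? "S-" : "",
                    (k.modifiers & MOD_ALT)   ? "A-" : "",
                    name);
}
```

Here { 0, MOD_CTRL, 'i' } comes out as <C-i>, { 1, 0, KEY_TAB } as
<Tab>, and { 0, MOD_SHIFT, ' ' } as <S-Space>.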


That's good for mappings, and I guess the \< escape in vimscript
strings, but it doesn't help for registers (which can end up in
buffers), etc.

Maybe it's a sensible 'middle point', though, for the input queue. The
GUI specific code could go into a structure like this, which generic
code can then translate into internal byte-stream representation as
appropriate, and then that byte-stream representation is the standard
throughout the rest of Vim. Basically the structure would only be used
as a mechanism to deal with the code that deals with hardware
keypresses, making it consistent between GUIs, platforms, etc.

(( This sort of behaviour is already implemented by libtermkey, even
    going so far as to support

     char buffer[256];
     termkey_snprint_key(tk, buffer, sizeof buffer, &key, TERMKEY_FORMAT_VIM);

    for exactly this purpose. :) ))


Ben.



--
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
