commit 818ec746f4caae453d09368b101c3e841cf39870
Author:     Hiltjo Posthuma <[email protected]>
AuthorDate: Wed Jun 17 21:35:39 2020 +0200
Commit:     Hiltjo Posthuma <[email protected]>
CommitDate: Wed Jun 17 21:35:39 2020 +0200

    fix unicode glitch in DCS strings, patch by Tim Allen
    
    Reported on the mailing list:
    
    "
    I discovered recently that if an application running inside st tries to
    send a DCS string, subsequent Unicode characters get messed up. For
    example, consider the following test-case:
    
        printf '\303\277\033P\033\\\303\277'
    
    ...where:
    
      - \303\277 is the UTF-8 encoding of U+00FF LATIN SMALL LETTER Y WITH
        DIAERESIS (ÿ).
      - \033P is ESC P, the token that begins a DCS string.
      - \033\\ is ESC \, a token that ends a DCS string.
      - \303\277 is the same ÿ character again.
    
    If I run the above command in a VTE-based terminal, or xterm, or
    QTerminal, or pterm (PuTTY), I get the output:
    
        ÿÿ
    
    ...which is to say, the empty DCS string is ignored. However, if I run
    that command inside st (as of commit 9ba7ecf), I get:
    
        ÿÃ¿
    
    ...where those last two characters are \303\277 interpreted as ISO8859-1
    characters, instead of UTF-8.
    
    I spent some time tracing through the state machines in st.c, and so far
    as I can tell, this is how it works currently:
    
      - ESC P sets the "ESC_DCS" and "ESC_STR" flags, indicating that
        incoming bytes should be collected into the strescseq buffer, rather
        than being interpreted.
      - ESC \ sets the "ESC_STR_END" flag (when ESC is received), and then
        calls strhandle() (when \ is received) to interpret the collected
        bytes.
      - If the collected bytes begin with 'P' (i.e. if this was a DCS
        string) strhandle() sets the "ESC_DCS" flag again, confusing the
        state machine.
    
    If my understanding is correct, fixing the problem should be as easy as
    removing the line that sets ESC_DCS from strhandle():
    
    diff --git a/st.c b/st.c
    index ef8abd5..b5b805a 100644
    --- a/st.c
    +++ b/st.c
    @@ -1897,7 +1897,6 @@ strhandle(void)
                    xsettitle(strescseq.args[0]);
                    return;
            case 'P': /* DCS -- Device Control String */
    -               term.mode |= ESC_DCS;
            case '_': /* APC -- Application Program Command */
            case '^': /* PM -- Privacy Message */
                    return;
    
    I've tried the above patch and it fixes my problem, but I don't know if
    it introduces any others.
    "

diff --git a/st.c b/st.c
index ef8abd5..b5b805a 100644
--- a/st.c
+++ b/st.c
@@ -1897,7 +1897,6 @@ strhandle(void)
                xsettitle(strescseq.args[0]);
                return;
        case 'P': /* DCS -- Device Control String */
-               term.mode |= ESC_DCS;
        case '_': /* APC -- Application Program Command */
        case '^': /* PM -- Privacy Message */
                return;
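
The state-machine behaviour described in the report can be illustrated with a toy model. This is a minimal sketch, not st's actual code: the flag values, the `toy_*` names, the `buggy` switch, and the assumption that a stale ESC_DCS flag causes each following byte to be drawn as its own character are all hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical flag values for this toy model; not copied from st.c. */
enum {
	ESC_START   = 1,
	ESC_STR     = 2,
	ESC_STR_END = 4,
	ESC_DCS     = 8
};

struct toy_term {
	int esc;       /* current escape-state flags */
	int buggy;     /* nonzero: re-set ESC_DCS when a DCS string ends */
	char type;     /* type byte of the string being collected */
	size_t nchars; /* number of characters "drawn" so far */
};

/* Called when ESC \ terminates a string sequence. */
static void
toy_strhandle(struct toy_term *t)
{
	t->esc &= ~(ESC_STR | ESC_STR_END | ESC_DCS);
	if (t->type == 'P' && t->buggy)
		t->esc |= ESC_DCS; /* models the line removed by the patch */
}

static void
toy_putbyte(struct toy_term *t, unsigned char b)
{
	if (b == 0x1b) {
		/* ESC inside a string may end it; otherwise it starts a sequence */
		t->esc |= (t->esc & ESC_STR) ? ESC_STR_END : ESC_START;
		return;
	}
	if (t->esc & ESC_STR_END) {
		if (b == '\\')
			toy_strhandle(t);
		else
			t->esc &= ~ESC_STR_END;
		return;
	}
	if (t->esc & ESC_START) {
		t->esc &= ~ESC_START;
		if (b == 'P') { /* ESC P begins a DCS string */
			t->type = 'P';
			t->esc |= ESC_STR | ESC_DCS;
		}
		return;
	}
	if (t->esc & ESC_STR)
		return; /* byte is collected, not drawn */
	if (t->esc & ESC_DCS)
		t->nchars++; /* assumption: stale flag, every byte is one character */
	else if ((b & 0xC0) != 0x80)
		t->nchars++; /* UTF-8: count only non-continuation bytes */
}

/* Feed n bytes of s through the toy machine; return characters drawn. */
static size_t
toy_count_chars(const char *s, size_t n, int buggy)
{
	struct toy_term t = { 0, buggy, 0, 0 };
	size_t i;

	for (i = 0; i < n; i++)
		toy_putbyte(&t, (unsigned char)s[i]);
	return t.nchars;
}
```

Fed the eight bytes of the test case from the report, `toy_count_chars(seq, 8, 0)` yields 2 characters (ÿÿ), while `toy_count_chars(seq, 8, 1)` yields 3, matching the ÿÃ¿ symptom in this simplified model.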
