>>>> In response to the following comment made by Bram on Aug 2, 2007:
>>>> (can be viewed at
>>>> http://groups.google.com/group/vim_dev/browse_thread/thread/3b73a504c77ba803/)
>>>>
>>>>> I hesitate removing the Hangul support without knowing for sure that it
>>>>> is not needed. Browsing through the messages I do see remarks that it
>>>>> might still be useful to a few people.
>>>>>
>>>>> Perhaps the Hangul support can be changed to also work for UTF-8?
>>>>
>>>> I made (finally) a patch that enables hangul-input module to work for
>>>> UTF-8.
>>>
>>> Thanks. I'm glad to finally see this implemented.
>>> It still needs some work though.
>>>
>>>> Finally, hg diff:
>>>> ... It is too long. But I cannot find a way to attach a file, so, here
>>>> goes the diff:
>>>
>>> Please do send this as an attachment. Long lines got wrapped, making it
>>> impossible to apply.
>>>
>>> The change to getchar.c should not be there. Perhaps you are not
>>> encoding the strings that go into the input buffer correctly? A CSI
>>> should be put there as three characters: CSI KS_EXTRA KE_CSI.
>>> I guess fix_input_buffer() can be used in push_raw_key().
>>
>> 1. I took a look into fix_input_buffer() and used it to "fix" hangul input
>> buffer.
>> But fix_input_buffer() function did not do anything.
>> It escapes CSI into K_SPECIAL KS_EXTRA KE_CSI sequence
>> only when the first byte of the input buffer is CSI.
>> But the hangul codes in question have 0x9b in the middle or at the end,
>> e.g) EB A0 9B.
>> The function does not have any chance to "fix" the buffer.
>
> I think that when CSI appears halfway a utf-8 byte sequence it doesn't
> need to be escaped. That only happens when it's at the start of a
> character, it needs to be escaped to avoid it being interpreted as a
> special key byte sequence.
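
To make the EB A0 9B example concrete, here is a small stand-alone C sketch. It is not Vim code, and the name utf8_encode3() is made up for this mail. It only shows how a three-byte UTF-8 sequence is built, and why 0x9b can turn up as the last byte of a perfectly valid hangul syllable (U+B81B here).

```c
#include <assert.h>

/* Hypothetical helper, not part of Vim: encode a code point in the
 * range U+0800..U+FFFF as three UTF-8 bytes. */
static void utf8_encode3(unsigned int cp, unsigned char out[3])
{
    assert(cp >= 0x0800 && cp <= 0xFFFF);
    out[0] = (unsigned char)(0xE0 | (cp >> 12));         /* lead byte  1110xxxx */
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F)); /* trail byte 10xxxxxx */
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));        /* trail byte 10xxxxxx */
}
```

Calling utf8_encode3(0xB81B, buf) fills buf with EB A0 9B: the low six bits of the code point are 0x1B, and the trail-byte marker 0x80 turns that into 0x9B.
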
Yes, I also believe that a 0x9b in the middle of an encoded character does not need to be escaped. It is part of a valid code.

>> 2. 0x9b in hangul codes is valid code. I encoded the strings correctly.
>> 0x9b(CSI) is part of utf-8 encoded hangul code.
>
> The encoding in the input buffer is a bit weird, it includes special
> byte sequences, and then what the user types has to be escaped to avoid
> that byte sequence being handled in the wrong way.
>
>> 3. Question: I guest that the CSI is some kind of special character that
>> indicates subsequent characters have some special meaning, right? Then,
>> in gui mode, in what case a user can generate CSI code?
>> If I knew what does the CSI do and when the CSI is generated, it would be
>> much easier for me to do the job.
>
> In the GUI it's a bit different, we don't read raw bytes from what the
> user types, but create a byte stream from events. E.g. in
> src/gui_gtk_x11.c in key_press_event().

The hangul input automaton is started from THAT routine.
The following is the call stack when the hangul input automaton is in action:

  src/gui_gtk_x11.c: key_press_event()
  --> src/ui.c: add_to_input_buffer()
  --> src/hangulin.c: hangul_input_process() (the automaton)

or

  src/gui_x11.c: gui_x11_key_hit_cb()
  --> src/ui.c: add_to_input_buffer()
  --> src/hangulin.c: hangul_input_process() (the automaton)

hangul_input_process() creates the hangul code from what the user has typed,
and then puts that code into the "inbuf" buffer by calling push_raw_key().

Somewhere along the way, "inbuf" is processed by vgetc() in src/getchar.c.
That function finds the 0x9b (CSI) in the middle of the code, and the
routine I commented out (in src/getchar.c: vgetc()) interprets the 0x9b as a
special code and modifies "inbuf". Here the byte should not be interpreted as
a special key; it should be preserved as it is.

Am I missing something?
And what should I do to avoid 0x9b being interpreted as CSI?
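
To show the difference I mean, here is a stand-alone C sketch. It is not the vgetc() logic and uses no Vim functions; every demo_* name is made up for this mail. It compares a scan that treats every 0x9b byte as a special key introducer with a scan that only looks at the first byte of each UTF-8 character, which is the behaviour I would expect from the description above.

```c
#include <stddef.h>

#define DEMO_CSI 0x9B /* the byte value discussed in this thread */

/* Length of a UTF-8 sequence judged from its lead byte; 1 for anything
 * else (ASCII, a stray trail byte, or an invalid lead byte). */
static size_t demo_utf8_len(unsigned char lead)
{
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Naive scan: every 0x9B byte counts as the start of a special key. */
static size_t demo_count_csi_naive(const unsigned char *buf, size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++)
        if (buf[i] == DEMO_CSI)
            n++;
    return n;
}

/* Character-aware scan: only a 0x9B at the start of a character counts;
 * the trail bytes of a multi-byte character are skipped. */
static size_t demo_count_csi_at_char_start(const unsigned char *buf, size_t len)
{
    size_t i = 0, n = 0;

    while (i < len) {
        size_t clen = demo_utf8_len(buf[i]);

        if (buf[i] == DEMO_CSI)
            n++;
        if (clen > len - i)
            clen = len - i; /* truncated sequence: stay inside the buffer */
        i += clen;
    }
    return n;
}
```

For the bytes EB A0 9B the naive scan reports one CSI, while the character-aware scan reports none, because the 0x9b is a trail byte of the hangul syllable. A lone 0x9b at the start of the buffer is still counted by both.
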
Please consider that the hangul input routine is meaningful only when both the MULTIBYTE and GUI options are enabled.

>
>> Now I'm working on the advices you made before :-)
>> As soon as you shed some light on the secret of CSI, I will work on it.
>>
>> Looking forward to your kind advice.
>
> I hope this helps.
>
> --
> hundred-and-one symptoms of being an internet addict:
> 120. You ask a friend, "What's that big shiny thing?" He says, "It's the sun."
>
> /// Bram Moolenaar -- [email protected] -- http://www.Moolenaar.net \\\
> /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
> \\\ an exciting new programming language -- http://www.Zimbu.org ///
> \\\ help me help AIDS victims -- http://ICCF-Holland.org ///

Regards,
Shawn.
