Shawn Y.H. Kim wrote:

> >> In response to the following comment made by Bram on Aug 2, 2007:
> >> (can be viewed at
> >> http://groups.google.com/group/vim_dev/browse_thread/thread/3b73a504c77ba803/)
> >>
> >>> I hesitate removing the Hangul support without knowing for sure that it
> >>> is not needed. Browsing through the messages I do see remarks that it
> >>> might still be useful to a few people.
> >>>
> >>> Perhaps the Hangul support can be changed to also work for UTF-8?
> >>
> >> I made (finally) a patch that enables hangul-input module to work for
> >> UTF-8.
> >
> > Thanks. I'm glad to finally see this implemented.
> > It still needs some work though.
> >
> >> Finally, hg diff:
> >> ... It is too long. But I cannot find a way to attach a file, so, here
> >> goes the diff:
> >
> > Please do send this as an attachment. Long lines got wrapped, making it
> > impossible to apply.
> >
> > The change to getchar.c should not be there. Perhaps you are not
> > encoding the strings that go into the input buffer correctly? A CSI
> > should be put there as three characters: CSI KS_EXTRA KE_CSI.
> > I guess fix_input_buffer() can be used in push_raw_key().
>
> 1. I took a look into fix_input_buffer() and used it to "fix" hangul input
>    buffer.
>    But fix_input_buffer() function did not do anything.
>    It escapes CSI into K_SPECIAL KS_EXTRA KE_CSI sequence
>    only when the first byte of the input buffer is CSI.
>    But the hangul codes in question have 0x9b in the middle or at the end,
>    e.g) EB A0 9B.
>    The function does not have any chance to "fix" the buffer.

I think that when CSI appears halfway through a UTF-8 byte sequence it
doesn't need to be escaped. Escaping is only needed when CSI is at the
start of a character, to avoid it being interpreted as the start of a
special key byte sequence.

> 2. 0x9b in hangul codes is valid code. I encoded the strings correctly.
>    0x9b(CSI) is part of utf-8 encoded hangul code.

The encoding in the input buffer is a bit weird: it includes special byte
sequences, so what the user types has to be escaped to avoid those bytes
being handled as one of the special sequences.

> 3. Question: I guess that the CSI is some kind of special character that
>    indicates subsequent characters have some special meaning, right? Then,
>    in gui mode, in what case can a user generate the CSI code?
>    If I knew what the CSI does and when the CSI is generated, it would be
>    much easier for me to do the job.

In the GUI it's a bit different: we don't read raw bytes from what the user
types, but create a byte stream from events. See, for example,
key_press_event() in src/gui_gtk_x11.c.

> Now I'm working on the advice you gave before :-)
> As soon as you shed some light on the secret of CSI, I will work on it.
>
> Looking forward to your kind advice.

I hope this helps.

-- 
hundred-and-one symptoms of being an internet addict:
120. You ask a friend, "What's that big shiny thing?" He says, "It's the sun."

 /// Bram Moolenaar -- http://www.Moolenaar.net \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\  an exciting new programming language -- http://www.Zimbu.org        ///
 \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///
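
A minimal sketch of the rule suggested in the reply above: a 0x9b byte is
escaped only when it stands at the start of a character, and is copied
unchanged when it is a trailing byte of a UTF-8 sequence such as EB A0 9B.
This is not Vim's code. The function names are hypothetical, the numeric
values for KS_EXTRA and KE_CSI are placeholders rather than values taken
from Vim's headers, and the three-byte form CSI KS_EXTRA KE_CSI is assumed
from the earlier quoted reply.

```c
#include <stddef.h>

/* 0x9b is the CSI byte discussed in the thread.  The other two values
 * are placeholders for illustration, NOT copied from Vim's headers. */
#define DEMO_CSI      0x9b
#define DEMO_KS_EXTRA 0xfd
#define DEMO_KE_CSI   0x51

/* Hypothetical helper: length of the UTF-8 sequence introduced by "lead".
 * Bytes that are not a UTF-8 lead byte (ASCII, or a stray continuation
 * byte such as 0x9b) count as a one-byte character. */
static size_t demo_utf8_len(unsigned char lead)
{
    if (lead >= 0xc0 && lead <= 0xdf) return 2;
    if (lead >= 0xe0 && lead <= 0xef) return 3;
    if (lead >= 0xf0 && lead <= 0xf7) return 4;
    return 1;
}

/* Hypothetical helper: copy "src" to "dst", replacing a CSI byte by the
 * three bytes CSI KS_EXTRA KE_CSI only when it is the first byte of a
 * character.  Trailing bytes of a multi-byte character are copied as-is.
 * Returns the number of bytes written; 0 means "src" was empty or "dst"
 * was too small. */
static size_t demo_escape_csi(const unsigned char *src, size_t len,
                              unsigned char *dst, size_t cap)
{
    size_t i = 0, o = 0;

    while (i < len) {
        size_t n = demo_utf8_len(src[i]);

        if (n > len - i)
            n = len - i;            /* truncated sequence: copy what is there */

        if (src[i] == DEMO_CSI) {   /* CSI at the start of a character */
            if (cap - o < 3)
                return 0;
            dst[o++] = DEMO_CSI;
            dst[o++] = DEMO_KS_EXTRA;
            dst[o++] = DEMO_KE_CSI;
            i++;
            continue;
        }

        if (cap - o < n)
            return 0;
        while (n-- > 0)             /* copy the whole character unchanged */
            dst[o++] = src[i++];
    }
    return o;
}
```

With this sketch, the three bytes EB A0 9B (the UTF-8 encoding of U+B81B,
a Hangul syllable) pass through unchanged, while a buffer that begins with
a bare 0x9b grows by two bytes. Whether Vim's input buffer really accepts
an unescaped 0x9b inside a multi-byte character is the open question in
this thread, so treat this as an illustration of the idea only.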
