Re: utf8 patch for mc, slang 2 version
Dear All, On Wed, 14 Jun 2006, Egmont Koblinger wrote: On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote: BTW: is anyone working on UTF-8 support for the ncurses(w) backend? I don't think so, and I don't think it is worth it. Maintaining two backends IMHO just causes headaches while it doesn't make mc better. I still can't see why developers do not decide which one to use and drop the other one. With Unicode support maintaining the two will be much harder since AFAIK slang works with UTF-8 while ncurses uses UCS-4. Hence a completely different patch would be required for the two cases. Why not follow the approach used in non-UTF mc and use curses-like calls (translated to ncurses calls or emulated by slang calls in myslang.h)? I have made mc 4.6.1 with the UTF patches compile and run when linked against ncurses 5.4 by replacing SLsmg_write_nwchars with addnwstr. Thus, it may be a good idea to use addnwstr as the wide character output routine, with addnwstr being either an ncurses call or a wrapper around SLsmg_write_nwchars (along the lines of how addch is implemented now). Sincerely, Michail ___ Mc-devel mailing list http://mail.gnome.org/mailman/listinfo/mc-devel
Re: utf8 patch for mc, slang 2 version
On Thu, 15 Jun 2006, Egmont Koblinger wrote: Anyway, there are plenty of apps (e.g. vim, joe) that perfectly support UTF-8 and use ncurses (not the w version). I wonder how it is possible... Because they only use the terminfo (low-level) parts of ncurses, see man 3ncurses terminfo. Those parts do not care about UTF-8. MC uses a higher level part of ncurses (the display routines), but restricted to stdscr. There's an even higher level part of ncurses that supports managing (overlapping) windows, but MC does not use it. SLang (without the newt library) does not support that. So one advantage of using ncurses over slang is that MC could make use of ncurses' windowing code, thereby simplifying its own code. In the way that MC uses SLang and ncurses right now there is very little difference between the two libraries. If you google (groups) for it (Miguel's posts) you'll find that around '95/'96 SLang was preferred over ncurses because it was faster, smaller, and less buggy. But that is no longer the case, there is not much difference now. Bart
Re: utf8 patch for mc, slang 2 version
Hi, BTW: is anyone working on UTF-8 support for the ncurses(w) backend? I don't think so, and I don't think it is worth it. Maintaining two backends IMHO just causes headaches while it doesn't make mc better. I still can't see why developers do not decide which one to use and drop the other one. Maybe compatibility with older UN*Xes with curses but no slang? Asking these sysadmins to install slang (or to compile mc with its bundled slang) is IMHO easier than doing double work in mc. Last time I played with it ncursesw (but not plain ncurses) handled UTF-8 just fine. e.g. you can pass a UTF-8 encoded string to addstr(), and provided the locale is set correctly, ncursesw will compute its width correctly. It is *also* possible to use addwstr() with UCS-4, but not compulsory. It's clear now, thanks! Anyway, there are plenty of apps (e.g. vim, joe) that perfectly support UTF-8 and use ncurses (not the w version). I wonder how it is possible... -- Egmont
Re: utf8 patch for mc, slang 2 version
On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote: BTW: is anyone working on UTF-8 support for the ncurses(w) backend? I don't think so, and I don't think it is worth it. Maintaining two backends IMHO just causes headaches while it doesn't make mc better. I still can't see why developers do not decide which one to use and drop the other one. With Unicode support maintaining the two will be much harder since AFAIK slang works with UTF-8 while ncurses uses UCS-4. Hence a completely different patch would be required for the two cases. BTW2: a few months ago there was a discussion on this list about converting all po/*.po files in the source tree to UTF-8; it is still not done. Any reasons? It can be performed by: [mc]$ for i in po/*.po; do msgconv -t UTF-8 $i -o $i; done Are you sure it is safe to use the same output file? I'd rather use a tmp file. It depends on msgconv's internal implementation, but I can easily imagine a situation where the writing file descriptor's position exceeds the reading position (since UTF-8 is longer than the 8-bit version) and this may cause invalid results. Due to buffered read and write requests I guess it needs larger files (with approx. 4096 -- 8192 accented characters) for this bug to occur. I'm not sure, though, just reasonably paranoid :-) -- Egmont
Re: utf8 patch for mc, slang 2 version
On Monday 12 June 2006 10:55, Jindrich Novy wrote: Please take notice. It would probably be good if Vladimir and you could keep your UTF-8 patch sets in sync. It just seems wasted effort to do this in two places independently. Some inter-distro communication is not going to hurt. Maybe these patches should be centrally maintained in a contribs tree. The problem is that Vladimir and I use different versions of mc. My approach is to stay more in sync with upstream, so for now Fedora carries a more recent CVS snapshot because mc releases are very infrequent. This helps me to make only minimal changes when I want to send a patch upstream. SuSE uses the stable mc-4.6.1, AFAIK. It would not be a problem to have a more recent mc snapshot in openSUSE Factory. Even better if it is synced with Fedora. +1 for storing useful patches in contrib. It would be quite hard to keep them in sync with devel mc though. Good idea. However, we need a long-term solution. We should discuss what must be done to have UTF support in upstream. -- Vladimir Nadvornik
Re: utf8 patch for mc, slang 2 version
On Wed, 14 Jun 2006, Egmont Koblinger wrote: On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote: BTW: is anyone working on UTF-8 support for the ncurses(w) backend? I don't think so, and I don't think it is worth it. Maintaining two backends IMHO just causes headaches while it doesn't make mc better. I still can't see why developers do not decide which one to use and drop the other one. Maybe compatibility with older UN*Xes with curses but no slang? With Unicode support maintaining the two will be much harder since AFAIK slang works with UTF-8 while ncurses uses UCS-4. Hence a completely different patch would be required for the two cases. Last time I played with it ncursesw (but not plain ncurses) handled UTF-8 just fine. e.g. you can pass a UTF-8 encoded string to addstr(), and provided the locale is set correctly, ncursesw will compute its width correctly. It is *also* possible to use addwstr() with UCS-4, but not compulsory. Bart
Re: utf8 patch for mc, slang 2 version
Hello! On Thu, 2006-06-15 at 09:45 +1200, Bart Oldeman wrote: On Wed, 14 Jun 2006, Egmont Koblinger wrote: On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote: BTW: is anyone working on UTF-8 support for the ncurses(w) backend? I don't think so, and I don't think it is worth it. Maintaining two backends IMHO just causes headaches while it doesn't make mc better. I still can't see why developers do not decide which one to use and drop the other one. Maybe compatibility with older UN*Xes with curses but no slang? It's a bogus argument. UNIX curses was removed long ago, and it had never worked well anyway. I don't remember a single person complaining. Besides, S-Lang is included with mc and it's quite portable. -- Regards, Pavel Roskin
Re: utf8 patch for mc, slang 2 version
On Thu, Jun 15, 2006 at 09:45:26AM +1200, Bart Oldeman wrote: On Wed, 14 Jun 2006, Egmont Koblinger wrote: On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote: BTW: is anyone working on UTF-8 support for the ncurses(w) backend? I don't think so, and I don't think it is worth it. Maintaining two backends IMHO just causes headaches while it doesn't make mc better. I still can't see why developers do not decide which one to use and drop the other one. Maybe compatibility with older UN*Xes with curses but no slang? that doesn't sound too convincing. With Unicode support maintaining the two will be much harder since AFAIK slang works with UTF-8 while ncurses uses UCS-4. Hence a completely different patch would be required for the two cases. Last time I played with it ncursesw (but not plain ncurses) handled UTF-8 just fine. good. i'm all for killing slang support. why that one? libslang is twice as big as libncursesw. probably because it was meant to be much more than just a display lib. -- Hi! I'm a .signature virus! Copy me into your ~/.signature, please! -- Chaos, panic, and disorder - my work here is done.
Re: utf8 patch for mc, slang 2 version
On Wed, 14 Jun 2006, Egmont Koblinger wrote: On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote: BTW: is anyone working on UTF-8 support for the ncurses(w) backend? I don't think so, and I don't think it is worth it. Maintaining two backends IMHO just causes headaches while it doesn't make mc better. I still can't see why developers do not decide which one to use and drop the other one. If someone wants to ask which backend is more important and would be better to keep as the main one, IMO the answer to this kind of question is very simple:
# rpm -q --whatrequires libslang.so.2 | grep -v slang | wc -l
4
# rpm -q --whatrequires libncurses.so.5 libncursesw.so.5 | grep -v ncurses | wc -l
55
The slang list above contains only packages which currently have no ncurses backend and are not as important as mc :) The current state also has other sore points on Linux: using mc with gpm causes runtime linking against more than one terminal toolkit library (slang or internal slang, plus ncurses used by libgpm). [..] [mc]$ for i in po/*.po; do msgconv -t UTF-8 $i -o $i; done Are you sure it is safe to use the same output file? I'd rather use a tmp file. I'm sure. msgconv accumulates the output in memory and, after finishing the conversion, writes it in a single step to the file name passed in the -o parameter. kloczek -- --- *Ludzie nie mają problemów, tylko sobie sami je stwarzają* --- Tomasz Kłoczko, sys adm @zie.pg.gda.pl|*e-mail: [EMAIL PROTECTED]
Re: utf8 patch for mc, slang 2 version
On Wed, 14 Jun 2006, Pavel Roskin wrote: Hello! On Thu, 2006-06-15 at 09:45 +1200, Bart Oldeman wrote: On Wed, 14 Jun 2006, Egmont Koblinger wrote: On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote: BTW: is anyone working on UTF-8 support for the ncurses(w) backend? I don't think so, and I don't think it is worth it. Maintaining two backends IMHO just causes headaches while it doesn't make mc better. I still can't see why developers do not decide which one to use and drop the other one. Maybe compatibility with older UN*Xes with curses but no slang? It's a bogus argument. UNIX curses was removed long ago, and it had never worked well anyway. I don't remember a single person complaining. Besides, S-Lang is included with mc and it's quite portable. We had that argument once already. curses is not that old and it is standard. And it has Unicode support. It's just a matter of using it.
Re: utf8 patch for mc, slang 2 version
On Sun, Jun 11, 2006 at 01:08:46PM +0200, Leonard den Ottolander wrote: Egmont's patches are worth looking at. I've blatantly ignored pushing them to you as I'd expected you to integrate them over time. I've probably not made it clear enough to you before that these patches are worth considering. These patches hardly change nowadays. They only change if:
- I face a new utf8 related bug in mc. It didn't happen in the last ~1.5 years (except for the off-by-one fix), and I always use the same small subset of mc's features, so it's unlikely for this to happen.
- Something else (e.g. slang) is upgraded which introduces or triggers new bugs.
- mc is upgraded in our distro. This will only happen if a new mainstream version is released. I don't want to bother with CVS snapshots, 4.6.1 is working reasonably well.
After all, I'll try not to forget to mention it here if anything noticeable changes in these patches. It's much simpler this way than for anyone to keep track of our changes. Finally, note that while these patches fix many issues with single-width UTF-8 characters, they may be really buggy with double-width (CJK) or zero-width Unicode characters, since I often assume that each Unicode entity occupies one column. I know it is totally false and I already knew it when I created these patches, but doing these things right would have required much more effort. -- Egmont
Re: utf8 patch for mc, slang 2 version
Hi Leonard, On Sun, 2006-06-11 at 13:08 +0200, Leonard den Ottolander wrote: On Thu, 2006-06-08 at 20:13 +0200, Egmont Koblinger wrote: I've just upgraded to slang-2 in our distro and also updated the UTF-8 patches to mc-4.6.1 (based on SUSE's version). It was easier than I thought it would be. I was glad to see that you had applied plenty of my patches in SUSE (those named 00-*). Please take a look here again: https://svn.uhulinux.hu/packages/dev/mc/patches/ The 00-74 and 00-78 patches were helpful for me as well, thanks Egmont. They are now applied in devel FC. Please take notice. It would probably be good if Vladimir and you could keep your UTF-8 patch sets in sync. It just seems wasted effort to do this in two places independently. Some inter-distro communication is not going to hurt. Maybe these patches should be centrally maintained in a contribs tree. The problem is that Vladimir and I use different versions of mc. My approach is to stay more in sync with upstream, so for now Fedora carries a more recent CVS snapshot because mc releases are very infrequent. This helps me to make only minimal changes when I want to send a patch upstream. SuSE uses the stable mc-4.6.1, AFAIK. +1 for storing useful patches in contrib. It would be quite hard to keep them in sync with devel mc though. Cheers, Jindrich
Re: utf8 patch for mc, slang 2 version
Hello Jindrich, On Thu, 2006-06-08 at 20:13 +0200, Egmont Koblinger wrote: I've just upgraded to slang-2 in our distro and also updated the UTF-8 patches to mc-4.6.1 (based on SUSE's version). It was easier than I thought it would be. I was glad to see that you had applied plenty of my patches in SUSE (those named 00-*). Please take a look here again: https://svn.uhulinux.hu/packages/dev/mc/patches/ Please take notice. It would probably be good if Vladimir and you could keep your UTF-8 patch sets in sync. It just seems wasted effort to do this in two places independently. Some inter-distro communication is not going to hurt. Maybe these patches should be centrally maintained in a contribs tree. Egmont's patches are worth looking at. I've blatantly ignored pushing them to you as I'd expected you to integrate them over time. I've probably not made it clear enough to you before that these patches are worth considering. Leonard. -- mount -t life -o ro /dev/dna /genetic/research
Re: utf8 patch for mc, slang 2 version
In non-UTF-8 mode slang2 behaves a bit differently from the patched slang1. As a result, mc does not work with 8-bit encodings like 8859-2 or KOI8. The attached patch fixes the SLsmg_write_nwchars() function to be fully compatible with the slang1 version and uses it consistently instead of SLsmg_write_char(). It should be applied on top of all the previous patches. Thanks for the patch. The last hunk didn't apply as there's no view_add_character used in editdraw.c. My patch is based on mc-4.6.1. The idea is to replace all occurrences of SLsmg_write_char with the now fixed SLsmg_write_nwchars, because it is the only way that works in all locales. -- Vladimir Nadvornik
Re: utf8 patch for mc, slang 2 version
Hi Vladimir (and others), I've just upgraded to slang-2 in our distro and also updated the UTF-8 patches to mc-4.6.1 (based on SUSE's version). It was easier than I thought it would be. I was glad to see that you had applied plenty of my patches in SUSE (those named 00-*). Please take a look here again: https://svn.uhulinux.hu/packages/dev/mc/patches/ 00-77 had to be updated for slang-2. When a user searches for a file with ^S, panel->search_buffer is filled one byte at a time, with every single byte pressed. Hence it often contains a partial UTF-8 string. Displaying it just happened to work with slang-1, but slang-2 prints the partial UTF-8 sequence as <C3> or similar. As a result, the cyan box overflows: if you search in the left panel for an existing accented filename, two cyan blocks appear in the right panel. The updated patch first finds the longest valid UTF-8 prefix of the string and only prints that part. You might find 00-79 useful too: it fixes an off-by-one bug introduced by the UTF-8 patches that causes Alt+Backspace to behave differently (erase a whole word and one more character, usually a space) than in bash or in vanilla mc (erase only the word). bye, Egmont
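The prefix search that the updated 00-77 patch is described as doing could look roughly like the sketch below. This is only an illustration in standard C and is not taken from the patch itself; the helper name utf8_valid_prefix_len is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the 00-77 patch): return how many leading
 * bytes of buf[0..len) consist of complete, well-formed UTF-8 sequences. */
size_t utf8_valid_prefix_len(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        unsigned char c = buf[i];
        size_t need;        /* continuation bytes expected after the lead byte */
        unsigned long cp;   /* code point decoded so far */
        unsigned long min;  /* smallest code point allowed for this length */
        size_t k;

        if (c < 0x80) {     /* plain ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            break;          /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need)
            break;          /* sequence is cut off, e.g. a lone C3 */
        for (k = 1; k <= need; k++) {
            if ((buf[i + k] & 0xC0) != 0x80)
                break;
            cp = (cp << 6) | (buf[i + k] & 0x3F);
        }
        if (k <= need)
            break;          /* a continuation byte was malformed */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            break;          /* overlong form, out of range, or surrogate */
        i += need + 1;
    }
    return i;
}
```

With a search buffer holding the bytes 61 C3 A9 C3 ("aé" followed by the first byte of another accented letter), the function returns 3, so only "aé" would be printed and the dangling C3 would be held back until its continuation byte arrives.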
Re: utf8 patch for mc, slang 2 version
On Saturday 12 November 2005 20:59, Leonard den Ottolander wrote: Hi Bart, list, On Tue, 2005-09-20 at 08:14 +1200, Bart Oldeman wrote: Basically you'd need to apply Fedora's patch, and then (sorry no proper patch here but it's not a big deal) Attached you'll find a proper patch. I've moved the hunk from global.h to myslang.h (and dropped the inclusion of slang.h as it is redundant). Jindrich, please merge this patch with the UTF8 patch. You might even consider to build the next mc for FC --with-screen=mcslang, as I see the FC development tree does not yet feature slang-2. Hi all, In non-UTF-8 mode slang2 behaves a bit differently from the patched slang1. As a result, mc does not work with 8-bit encodings like 8859-2 or KOI8. The attached patch fixes the SLsmg_write_nwchars() function to be fully compatible with the slang1 version and uses it consistently instead of SLsmg_write_char(). It should be applied on top of all the previous patches. -- Vladimir Nadvornik developer - SuSE CR, s.r.o. e-mail: [EMAIL PROTECTED] Drahobejlova 27 tel:+420 2 9654 2373 190 00 Praha 9 fax:+420 2 9654 2374 Ceska republika http://www.suse.cz
diff -ruN mc-4.6.1.orig/edit/editdraw.c mc-4.6.1/edit/editdraw.c
--- mc-4.6.1.orig/edit/editdraw.c 2006-06-07 11:57:19.0 +0200
+++ mc-4.6.1/edit/editdraw.c 2006-06-07 11:56:30.0 +0200
@@ -234,7 +234,7 @@
         lowlevel_set_color (color);
     }
 #ifdef UTF8
-    SLsmg_write_char(textchar);
+    SLsmg_write_nwchars(&textchar, 1);
 #else
     addch (textchar);
 #endif
diff -ruN mc-4.6.1.orig/src/help.c mc-4.6.1/src/help.c
--- mc-4.6.1.orig/src/help.c 2006-06-07 11:57:19.0 +0200
+++ mc-4.6.1/src/help.c 2006-06-07 11:56:30.0 +0200
@@ -461,7 +461,7 @@
             len = mbrtowc(&wc, p, MB_CUR_MAX, &mbs);
             if (len <= 0)
                 len = 1; /* skip broken multibyte chars */
-            SLsmg_write_char(wc);
+            SLsmg_write_nwchars(&wc, 1);
             p += len - 1;
         } else
 #endif
diff -ruN mc-4.6.1.orig/src/util.c mc-4.6.1/src/util.c
--- mc-4.6.1.orig/src/util.c 2006-06-07 11:57:19.0 +0200
+++ mc-4.6.1/src/util.c 2006-06-07 11:56:30.0 +0200
@@ -58,8 +58,26 @@
 #if SLANG_VERSION >= 2
 void SLsmg_write_nwchars(wchar_t *s, size_t n)
 {
-    while(n--)
-        SLsmg_write_char(*s++);
+if (SLsmg_is_utf8_mode()) { /* slang can handle it directly */
+    while(n-- && *s)
+        SLsmg_write_char(*s++);
+}
+else { /* convert wchars back to 8bit encoding */
+mbstate_t mbs;
+    memset (&mbs, 0, sizeof (mbs));
+    while (n-- && *s) {
+        char buf[MB_LEN_MAX + 1]; /* should use 1 char, but to be sure */
+        if (*s < 0x80) {
+            SLsmg_write_char(*s++); /* ASCII */
+        }
+        else {
+            if (wcrtomb(buf, *s++, &mbs) == 1)
+                SLsmg_write_char((wchar_t)(buf[0]));
+            else
+                SLsmg_write_char('?'); /* should not happen */
+        }
+    }
+}
 }
 #endif
diff -ruN mc-4.6.1.orig/src/view.c mc-4.6.1/src/view.c
--- mc-4.6.1.orig/src/view.c 2006-06-07 11:57:19.0 +0200
+++ mc-4.6.1/src/view.c 2006-06-07 11:56:30.0 +0200
@@ -852,7 +852,7 @@
 #ifndef UTF8
 #define view_add_character(view,c) addch (c)
 #else /* UTF8 */
-#define view_add_character(view,c) SLsmg_write_char(c)
+#define view_add_character(view,c) {wchar_t tmp=c; SLsmg_write_nwchars(&tmp, 1);}
 #endif /* UTF8 */
 #define view_add_one_vline() one_vline()
 #define view_add_string(view,s)addstr (s)
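The non-UTF-8 branch of the util.c hunk turns each wide character back into one byte of the locale's 8-bit encoding. The sketch below isolates just that step in standard C so it can be tried without S-Lang. The function name wide_to_8bit is hypothetical, and returning an unsigned char (rather than casting a plain char as the patch does) is a choice made here to avoid sign extension for bytes above 0x7F.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: map one wide character to a single byte of the
 * current locale's encoding, or '?' if it has no one-byte form there. */
unsigned char wide_to_8bit(wchar_t wc)
{
    char buf[MB_LEN_MAX];
    mbstate_t mbs;

    /* ASCII maps to itself in the encodings discussed in this thread */
    if ((unsigned long) wc < 0x80)
        return (unsigned char) wc;

    memset(&mbs, 0, sizeof mbs);    /* start from the initial shift state */
    assert(MB_LEN_MAX >= 1);
    if (wcrtomb(buf, wc, &mbs) == 1)
        return (unsigned char) buf[0];
    return '?';                     /* not representable as one byte */
}
```

The result depends on the active locale: after a successful setlocale() to an ISO-8859-2 locale, a character such as U+0151 would be expected to come back as a single byte, while in the default "C" locale a character like U+20AC has no one-byte form and yields '?'.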
Re: utf8 patch for mc, slang 2 version
Hi, On Wed, 2006-06-07 at 12:18 +0200, Vladimir Nadvornik wrote: On Saturday 12 November 2005 20:59, Leonard den Ottolander wrote: Hi Bart, list, On Tue, 2005-09-20 at 08:14 +1200, Bart Oldeman wrote: Basically you'd need to apply Fedora's patch, and then (sorry no proper patch here but it's not a big deal) Attached you'll find a proper patch. I've moved the hunk from global.h to myslang.h (and dropped the inclusion of slang.h as it is redundant). Jindrich, please merge this patch with the UTF8 patch. You might even consider to build the next mc for FC --with-screen=mcslang, as I see the FC development tree does not yet feature slang-2. slang-2.0.5 has been used since FC5, so only FC4 still uses the old slang-1.4.9. Hi all, In non-UTF-8 mode slang2 behaves a bit differently from the patched slang1. As a result, mc does not work with 8-bit encodings like 8859-2 or KOI8. The attached patch fixes the SLsmg_write_nwchars() function to be fully compatible with the slang1 version and uses it consistently instead of SLsmg_write_char(). It should be applied on top of all the previous patches. Thanks for the patch. The last hunk didn't apply as there's no view_add_character used in editdraw.c. Jindrich
Re: utf8 patch for mc, slang 2 version
Hi Bart, list, On Tue, 2005-09-20 at 08:14 +1200, Bart Oldeman wrote: Basically you'd need to apply Fedora's patch, and then (sorry no proper patch here but it's not a big deal) Attached you'll find a proper patch. I've moved the hunk from global.h to myslang.h (and dropped the inclusion of slang.h as it is redundant). Jindrich, please merge this patch with the UTF8 patch. You might even consider building the next mc for FC --with-screen=mcslang, as I see the FC development tree does not yet feature slang-2. Leonard. -- mount -t life -o ro /dev/dna /genetic/research
diff -pruN mc-4.6.1a/src/help.c mc-4.6.1a_utf8_slang2/src/help.c
--- mc-4.6.1a/src/help.c 2005-11-12 17:46:32.0 +0100
+++ mc-4.6.1a_utf8_slang2/src/help.c 2005-11-12 17:52:11.0 +0100
@@ -449,7 +449,7 @@ static void help_show (Dlg_head *h, cons
 #ifndef HAVE_SLANG
             addch (acs_map [c]);
 #else
-#ifdef UTF8
+#if defined(UTF8) && SLANG_VERSION < 2
             SLsmg_draw_object (h->y + line + 2, h->x + col + 2, acs_map [c]);
 #else
             SLsmg_draw_object (h->y + line + 2, h->x + col + 2, c);
diff -pruN mc-4.6.1a/src/myslang.h mc-4.6.1a_utf8_slang2/src/myslang.h
--- mc-4.6.1a/src/myslang.h 2005-11-12 17:46:32.0 +0100
+++ mc-4.6.1a_utf8_slang2/src/myslang.h 2005-11-12 18:09:27.0 +0100
@@ -11,6 +11,12 @@
 #endif /* HAVE_SLANG_SLANG_H */
 #endif
 
+#if SLANG_VERSION >= 2
+#define UTF8 1
+#define SLsmg_Is_Unicode SLsmg_is_utf8_mode()
+void SLsmg_write_nwchars(wchar_t *s, size_t n);
+#endif
+
 #ifdef UTF8
 #include <wchar.h>
 #endif
diff -pruN mc-4.6.1a/src/slint.c mc-4.6.1a_utf8_slang2/src/slint.c
--- mc-4.6.1a/src/slint.c 2005-09-05 04:14:29.0 +0200
+++ mc-4.6.1a_utf8_slang2/src/slint.c 2005-11-12 17:49:21.0 +0100
@@ -141,7 +141,9 @@ void slang_init (void)
 {
     SLtt_get_terminfo ();
-
+#if SLANG_VERSION >= 2
+    SLutf8_enable (-1);
+#endif
     /*
      * If the terminal in not in terminfo but begins with a well-known
      * string such as "linux" or "xterm" S-Lang will go on, but the
diff -pruN mc-4.6.1a/src/util.c mc-4.6.1a_utf8_slang2/src/util.c
--- mc-4.6.1a/src/util.c 2005-11-12 17:46:32.0 +0100
+++ mc-4.6.1a_utf8_slang2/src/util.c 2005-11-12 17:55:17.0 +0100
@@ -56,6 +56,14 @@ static const char app_text [] = "Midnight-Commander";
 int easy_patterns = 1;
 
+#if SLANG_VERSION >= 2
+void SLsmg_write_nwchars(wchar_t *s, size_t n)
+{
+    while(n--)
+        SLsmg_write_char(*s++);
+}
+#endif
+
 extern void str_replace(char *s, char from, char to)
 {
     for (; *s != '\0'; s++) {
Re: utf8 patch for mc, slang 2 version
Hi, config.h @ 3 +#ifdef __APPLE__ +#define unix 1 +#endif I guess developers prefer patches created with diff -u rather than just some pseudo-code. Extract the original source code to a directory called mc-4.6.1.orig (or actually you can call it whatever you want), copy it to mc-4.6.1 (or whatever you like), make your modifications under the latter one and then run diff -Naurdp mc-4.6.1.orig mc-4.6.1 or something similar. Then attach it to the mail; it's easier to handle attachments than to cut parts out of the message body. Developers will correct me if I'm wrong. Hebrew (Ivrit) and others like Chinese and Japanese show up correctly, but they break the windowing because they are multi-byte chars. Question: how can the length of these multi-byte chars be determined? Does anyone have any idea? If I understand you, here by multi-byte and length you actually mean how many character cells they occupy on the screen. It's usually called the width of a character. See the manual of wcwidth() and wcswidth(). -- Egmont
Re: utf8 patch for mc, slang 2 version
Hi, I have patches for the NFC / NFD issue and two other patches for the Darwin/Mac platform for the current UTF-8 version, with all patches applied (I don't know where to post them, so I post here, sorry): config.h @ 3 +#ifdef __APPLE__ +#define unix 1 +#endif I guess developers prefer patches created with diff -u rather than just some pseudo-code. Extract the original source code to a directory called mc-4.6.1.orig (or actually you can call it whatever you want), copy it to mc-4.6.1 (or whatever you like), make your modifications under the latter one and then run diff -Naurdp mc-4.6.1.orig mc-4.6.1 or something similar. Then attach it to the mail; it's easier to handle attachments than to cut parts out of the message body. Developers will correct me if I'm wrong. Hebrew (Ivrit) and others like Chinese and Japanese show up correctly, but they break the windowing because they are multi-byte chars. Question: how can the length of these multi-byte chars be determined? Does anyone have any idea? If I understand you, here by multi-byte and length you actually mean how many character cells they occupy on the screen. It's usually called the width of a character. See the manual of wcwidth() and wcswidth(). -- Egmont
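To make the distinction between bytes, characters and screen columns concrete, here is a small sketch built on wcwidth(). The helper name columns_needed is made up for this example, and counting unprintable characters as one cell is an arbitrary assumption, not something mc is known to do.

```c
#define _XOPEN_SOURCE 700   /* asks glibc to declare wcwidth() */
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: number of terminal columns needed for a wide string. */
int columns_needed(const wchar_t *s)
{
    int total = 0;

    assert(s != NULL);
    for (; *s != L'\0'; s++) {
        /* wcwidth() reports 0 for combining marks, 2 for most CJK
         * characters and -1 for characters it considers unprintable */
        int w = wcwidth(*s);

        /* assumption: an unprintable character is shown as one placeholder cell */
        total += (w < 0) ? 1 : w;
    }
    return total;
}
```

The widths come from the current locale, so a real program would call setlocale(LC_ALL, "") first and convert its multi-byte text with mbstowcs() or mbrtowc() before measuring. Padding a panel column by this value rather than by the byte count or the character count is what keeps double-width and zero-width characters from shifting the layout.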
Re: utf8 patch for mc, slang 2 version
Hi, 1) thank you for your answer, I'll post a diff once everything works as expected :) 2) the multi-byte char Slang2 bug came up at Debian too, see http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=316010 . I applied this patch (080_wide_chars.patch), but it still miscounts. Still no Asian and/or Mac developers here? :) Regards, Balint On 25/09/05, Koblinger Egmont [EMAIL PROTECTED] wrote: Question: how can the length of these multi-byte chars be determined? Does anyone have any idea? If I understand you, here by multi-byte and length you actually mean how many character cells they occupy on the screen. It's usually called the width of a character. See the manual of wcwidth() and wcswidth().
Re: utf8 patch for mc, slang 2 version
Hi, I have patches for the NFC / NFD issue and two other patches for the Darwin/Mac platform for the current UTF-8 version, with all patches applied (I don't know where to post them, so I post here, sorry):
config.h @ 3
+#ifdef __APPLE__
+#define unix 1
+#endif
config.h @ 53
+#include <wchar.h>
vfs.c @ 507
+#ifdef __APPLE__
+else
+{
+gchar* gp2 = g_utf8_normalize (result->d_name,-1,G_NORMALIZE_ALL_COMPOSE);
+ strncpy(result->d_name,gp2,strlen(gp2)+1);
+ g_free(gp2);
+ }
+#endif
// this is the main conversion routine - from the Glib. This patch makes it possible to display all 2-byte UTF-8 encoded chars (all West and Central European languages and Cyrillic letters as well) on any HFS/HFS+ filesystem. Hebrew (Ivrit) and others like Chinese and Japanese show up correctly, but they break the windowing because they are multi-byte chars. Question: how can the length of these multi-byte chars be determined? Does anyone have any idea? Two issues remain open: - the panel header and the footer command line. Is there anyone here who knows where I can find these items (screen.c)? - are there any other Mac users who can test these changes, please? Huge kudos go to Akos Huszti for great help, research and support! Regards, Balint On 21/09/05, Pavel Tsekov [EMAIL PROTECTED] wrote: Hello, Maybe this discussion should be moved to general list - mc at gnome dot org.
Re: utf8 patch for mc, slang 2 version
On Tue, Sep 20, 2005 at 10:11:28PM +0200, Bálint Kardos wrote: But even with all patches and stuff, I see the following Unicode glitches: - the utf-8 chars are not displayed in the dir list (on Ubuntu, everything is OK); for ÉÁŰŐÚÖÜÓ I see EAUOUOUO (upper, lowercase all wrong) - the files/dirs that contain the unicode chars are still not properly aligned to the grid What could cause Darwin to behave so unpredictably? In the filesystem, there's another error: if you do 'ls', the alignment of the columns after the unicode chars is broken as well. Unices use NFC, while MacOS uses the NFD representation of accents (at least for filenames, I don't know about file contents). NFC means each accented character has its own composed value, that is, one Unicode entity, which is usually stored as two (maybe three) bytes in UTF-8. NFD composes the characters from two Unicode entities, first the unaccented letter, followed by an accent on its own. Its UTF-8 representation hence takes three bytes (one for the unaccented letter and two more for the accent). There are different levels of Unicode specified, I guess supporting NFD requires a higher level of conformance since it's a harder job than supporting NFC. I bet mc's UTF-8 patch only supports NFC. -- Egmont
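The byte counts mentioned above can be checked with a few lines of standard C. This is only an illustration: the helper utf8_code_points and the two example strings are made up here, and the counter assumes its input is well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count code points in a NUL-terminated UTF-8 string
 * by skipping continuation bytes (those of the form 10xxxxxx). */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    return n;
}

/* The letter e with acute accent in both forms, written as explicit bytes */
const char e_acute_nfc[] = "\xC3\xA9";   /* U+00E9: one code point, two bytes */
const char e_acute_nfd[] = "e\xCC\x81";  /* U+0065 U+0301: two code points, three bytes */
```

Both strings are meant to render as the same single-column letter, yet they differ in byte length and in code point count, and a byte-wise comparison treats them as different names. Code that assumes one code point per column therefore counts the NFD form as two cells, which matches the misaligned columns described in this message.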
Re: utf8 patch for mc, slang 2 version
Hi, I've debugged it from the beginning, and:
screen.c 762:
// inserted
wchar_t lonaka[100];
memcpy(lonaka,buffer,txtlen*sizeof(wchar_t));
lonaka[txtlen]=0;
// end inserted
printw ("%*s", still, "");
printw("%ls",lonaka);
//SLsmg_write_nwchars ((wchar_t *) buffer, txtlen);
printw ("%*s", len - txtwidth - still, "");
printw(/ls,lonaka);
This handles everything as expected; the right UTF-8 chars appeared on the screen. So the text in the buffer is properly encoded. It is a Slang2 issue, but it's too complicated to figure out at first glance; the problem is in the slsmg.c file. Does anyone know why unix is not defined as 1 for darwin-ppc in mc (and/or) slang? regards, Bálint Unices use NFC, while MacOS uses the NFD representation of accents (at least for filenames, I don't know about file contents). NFC means each accented character has its own composed value, that is, one Unicode entity, which is usually stored as two (maybe three) bytes in UTF-8. NFD composes the characters from two Unicode entities, first the unaccented letter, followed by an accent on its own. Its UTF-8 representation hence takes three bytes (one for the unaccented letter and two more for the accent). There are different levels of Unicode specified, I guess supporting NFD requires a higher level of conformance since it's a harder job than supporting NFC. I bet mc's UTF-8 patch only supports NFC. -- Egmont
Re: utf8 patch for mc, slang 2 version
Hello,

Maybe this discussion should be moved to the general list - mc at gnome dot org.

On Wed, 21 Sep 2005, Bálint Kardos wrote:

> Hi,
>
> I've debugged it from the beginning, and:
>
> screen.c 762:
>
>     // inserted (debug copy; assumes txtlen < 100)
>     wchar_t lonaka[100];
>     memcpy(lonaka, buffer, txtlen * sizeof(wchar_t));
>     lonaka[txtlen] = 0;
>     // end inserted
>     printw("%*s", still, "");
>     printw("%ls", lonaka);
>     //SLsmg_write_nwchars ((wchar_t *) buffer, txtlen);
>     printw("%*s", len - txtwidth - still, "");
>
> printw("%ls", lonaka); handles everything as expected: the right UTF-8 chars appeared on the screen. So the text in the buffer is properly encoded. It is a slang 2 issue, but it is too complicated to figure out at first glance; the problem is in the slsmg.c file.
>
> Does anyone know why the unix macro is not defined as 1 for darwin-ppc in mc (and/or) slang?
>
> regards,
> Bálint
>
>> Unices generally use the NFC representation of accents, while MacOS uses NFD (at least for filenames; I don't know about file contents).
>>
>> NFC means each accented character has its own composed value, that is, one Unicode code point, which is usually stored as two (maybe three) bytes in UTF-8. NFD decomposes each such character into two Unicode code points: first the unaccented letter, followed by a combining accent on its own. Its UTF-8 representation hence takes three bytes (one for the unaccented letter and two more for the accent).
>>
>> Different levels of Unicode support are specified; I guess supporting NFD requires a higher level of conformance, since it is a harder job than supporting NFC. I bet mc's UTF-8 patch only supports NFC.
>>
>> --
>> Egmont
Re: utf8 patch for mc, slang 2 version
Hi Folks,

Sorry for bothering you; I'm just a newbie who is trying to get a UTF-8 MC on his OS X.

I've downloaded the latest source and compiled it, but the result was faulty. I have Fedora, and their MC is far better, but I just cannot reproduce it on my Mac. I've downloaded all FC4 patches from the CVS and applied them to 4.6.1 (according to the sources file, they used 4.6.1-pre5), BUT nothing really changed. The GUI is/was properly encoded (LANG=hu_HU.UTF-8), but the folder list is still garbled.

I have Ubuntu too; I converted the FC4 mc rpm with alien and installed it there, but:

On 19/09/05, Bart Oldeman [EMAIL PROTECTED] wrote:

> On Mon, 19 Sep 2005, Arkadiusz Miskiewicz wrote:
>
>> There are a few patches adding UTF-8 support to mc available, but they require a modified slang1 library. There is already a stable slang-2 version with native UTF-8 support.
>>
>> Does anyone know of an mc utf8 patch that uses slang-2?
>
> No, but it's not too hard to change the existing patch.
>
> Basically you'd need to apply Fedora's patch, and then (sorry, no proper patch here, but it's not a big deal)
>
> in slint.c, use
>
>      SLtt_get_terminfo()
>     +#if SLANG_VERSION >= 20000
>     +    SLutf8_enable (-1);
>     +#endif
>
> in help.c, use
>
>     #if defined(UTF8) && SLANG_VERSION < 20000
>         SLsmg_draw_object (h->y + line + 2, h->x + col + 2, acs_map [c]);
>     #else
>         SLsmg_draw_object (h->y + line + 2, h->x + col + 2, c);
>     #endif /* UTF8 */
>
> and then (possibly global.h isn't the best place though):
>
>     --- mc/src/util.c       2005-09-18 22:36:30.0 +1200
>     +++ mc-utf8/src/util.c  2005-08-26 13:04:45.0 +1200
>     @@ -48,6 +51,14 @@
>      static const char app_text [] = "Midnight-Commander";
>      int easy_patterns = 1;
>
>     +#if SLANG_VERSION >= 20000
>     +void SLsmg_write_nwchars(wchar_t *s, size_t n)
>     +{
>     +    while(n--)
>     +        SLsmg_write_char(*s++);
>     +}
>     +#endif
>     +
>      extern void str_replace(char *s, char from, char to)
>      {
>          for (; *s != '\0'; s++) {
>     @@ -78,9 +89,40 @@
>     --- mc/src/global.h       2005-08-27 15:51:32.0 +1200
>     +++ mc-utf8/src/global.h  2005-07-13 01:15:40.0 +1200
>     @@ -146,6 +146,13 @@
>      # define N_(String) (String)
>      #endif /* !ENABLE_NLS */
>
>     +#include <slang.h>
>     +#if SLANG_VERSION >= 20000
>     +#define UTF8 1
>     +#define SLsmg_Is_Unicode SLsmg_is_utf8_mode()
>     +void SLsmg_write_nwchars(wchar_t *s, size_t n);
>     +#endif
>     +
>      #include "fs.h"
>      #include "util.h"
Re: utf8 patch for mc, slang 2 version
On Monday 19 of September 2005 22:14, Bart Oldeman wrote:

>> Does anyone know of an mc utf8 patch that uses slang-2?
>
> No, but it's not too hard to change the existing patch.
>
> Basically you'd need to apply Fedora's patch, and then (sorry, no proper patch here, but it's not a big deal)

Thanks, works.

--
Arkadiusz Miśkiewicz
PLD/Linux Team
http://www.t17.ds.pwr.wroc.pl/~misiek/
http://ftp.pld-linux.org/
Re: utf8 patch for mc, slang 2 version
hi,

Arkadiusz was kind enough to provide me with the source code, and I managed to compile it on my Mac (Darwin 8.2.0).

But even with all patches and stuff, I see the following Unicode glitches:

- the UTF-8 chars are not displayed in the dir list (on Ubuntu, everything is OK); for ÉÁŰŐÚÖÜÓ I see EAUOUOUO (upper, lowercase all wrong)
- the files/dirs that contain the Unicode chars are still not properly aligned to the grids
  (What could cause Darwin to behave so unpredictably? In the filesystem, there's another error: if you do 'ls', the alignment of the columns after the Unicode chars is broken as well. Where can I find the source code for the ls command? Which package contains it on other Linux distributions?)
- I've lost the functionality of the arrow keys; they just type ABCD (so I use screen)

A problem on both Linux and Darwin: if you cd into a Unicode-named directory, the blinking command prompt is shifted to the right by as many spaces as there are Unicode chars in the dir name. For example:

~/ÉÁŰŐÚÖÜÓ []

Kind regards,
Balint
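One possible explanation for the alignment and cursor-offset symptoms, which this thread does not confirm, is that some code path measures a name in bytes rather than in screen columns. The C sketch below only illustrates that difference and is not taken from mc or S-Lang: utf8_columns and dir_name are hypothetical names, the input is assumed to be well-formed UTF-8, and the width rule is deliberately simplified (real code would more likely convert to wide characters and use the POSIX wcwidth() function).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Estimate how many terminal columns a well-formed UTF-8 string occupies.
 * Simplification (an assumption of this sketch): every code point takes
 * one column, except combining marks in U+0300..U+036F, which take none. */
size_t utf8_columns(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;
    size_t cols = 0;

    assert(s != NULL);
    while (*p != '\0') {
        unsigned long cp;
        int extra;

        /* Decode the lead byte and note how many continuation bytes follow. */
        if (*p < 0x80)      { cp = *p;        extra = 0; }
        else if (*p < 0xE0) { cp = *p & 0x1F; extra = 1; }
        else if (*p < 0xF0) { cp = *p & 0x0F; extra = 2; }
        else                { cp = *p & 0x07; extra = 3; }
        p++;

        /* Fold in the continuation bytes (10xxxxxx). */
        while (extra-- > 0 && (*p & 0xC0) == 0x80)
            cp = (cp << 6) | (*p++ & 0x3F);

        if (cp < 0x0300 || cp > 0x036F)
            cols++;
    }
    return cols;
}

/* The directory name ÉÁŰŐÚÖÜÓ in precomposed (NFC) UTF-8. */
const char dir_name[] =
    "\xC3\x89\xC3\x81\xC5\xB0\xC5\x90\xC3\x9A\xC3\x96\xC3\x9C\xC3\x93";
```

For dir_name, strlen() reports 16 bytes while utf8_columns() reports 8 columns. A layout routine that pads or positions the cursor by byte count would therefore be off by 8, one position per accented character, which matches the offset described for the prompt. For a decomposed name such as "e\xCC\x81" the gap is larger: 3 bytes for 1 column.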