Re: utf8 patch for mc, slang 2 version

2006-06-29 Thread Michail Vidiassov
Dear All,

On Wed, 14 Jun 2006, Egmont Koblinger wrote:

 On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote:

 BTW: is anyone working on UTF-8 support for the ncurses(w) backend?

 I don't think so, and I don't think it is worth it. Maintaining two backends
 IMHO just causes headaches while it doesn't make mc better. I still can't
 see why developers do not decide which one to use and drop the other one.

 With Unicode support maintaining the two will be much harder since AFAIK
 slang works with UTF-8 while ncurses uses UCS-4. Hence a completely
 different patch would be required for the two cases.


Why not follow the approach used in non-UTF mc
and use curses-like calls (translated to ncurses calls, or emulated by
slang calls in myslang.h)?

   I have made mc 4.6.1 with UTF patches compile and run when
   linked against ncurses 5.4 by replacing
   SLsmg_write_nwchars by addnwstr.

Thus, it may be a good idea to use addnwstr as the wide character output
routine, addnwstr being either an ncurses call or a wrapper around
SLsmg_write_nwchars (along the lines of how addch is implemented now).
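
A minimal sketch of the wrapper idea, under loud assumptions: `my_addnwstr` and `fake_SLsmg_write_nwchars` are hypothetical names, and the fake backend only records characters in a buffer so the sketch runs without a terminal. In a real build the curses side would be ncursesw's `addnwstr` and the S-Lang side the `SLsmg_write_nwchars` from the UTF-8 patches.

```c
#include <stddef.h>
#include <wchar.h>

#define FAKE_SCREEN_MAX 128

/* Stand-in for the S-Lang output routine: it appends to a buffer
 * instead of drawing, so the sketch needs no terminal library. */
static wchar_t fake_screen[FAKE_SCREEN_MAX];
static size_t fake_len = 0;

static void fake_SLsmg_write_nwchars(const wchar_t *s, size_t n)
{
    while (n-- && *s && fake_len < FAKE_SCREEN_MAX - 1)
        fake_screen[fake_len++] = *s++;
    fake_screen[fake_len] = L'\0';
}

/* curses-style entry point (hypothetical name). As with ncurses'
 * addnwstr, a negative n means "write up to the terminating NUL". */
static int my_addnwstr(const wchar_t *ws, int n)
{
    size_t len = (n < 0) ? wcslen(ws) : (size_t) n;

    fake_SLsmg_write_nwchars(ws, len);
    return 0; /* curses OK */
}
```

With a shim like this, calling code would only ever see the curses-style name, and the choice of backend stays in one header.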

Sincerely, Michail

___
Mc-devel mailing list
http://mail.gnome.org/mailman/listinfo/mc-devel


Re: utf8 patch for mc, slang 2 version

2006-06-16 Thread Bart Oldeman
On Thu, 15 Jun 2006, Egmont Koblinger wrote:

 Anyway, there are plenty of apps (e.g. vim, joe) that perfectly support
 UTF-8 and use ncurses (not the w version). I wonder how it is possible...

Because they only use the terminfo (low-level) parts of ncurses, see man 
3ncurses terminfo. Those parts do not care about UTF-8.

MC uses a higher level part of ncurses (the display routines), but 
restricted to stdscr.

There's an even higher level part of ncurses that supports managing
(overlapping) windows, but MC does not use it. SLang (without the newt 
library) does not support that. So one advantage of using ncurses over 
slang is that MC could make use of ncurses' windowing code, thereby 
simplifying its own code. In the way that MC uses SLang and ncurses 
right now there is very little difference between the two libraries.

If you google (groups) for it (Miguel's posts) you'll find that around
'95/'96 SLang was preferred over ncurses because it was faster, smaller,
and less buggy. But that is no longer the case; there is not much
difference now.

Bart





Re: utf8 patch for mc, slang 2 version

2006-06-15 Thread Egmont Koblinger
Hi,

  BTW: is anyone working on UTF-8 support for the ncurses(w) backend?
 
  I don't think so, and I don't think it is worth it. Maintaining two backends
  IMHO just causes headaches while it doesn't make mc better. I still can't
  see why developers do not decide which one to use and drop the other one.
 
 Maybe compatibility with older UN*Xes with curses but no slang?

Asking these sysadmins to install slang (or compile mc with its bundled slang)
is IMHO easier than doing double work in mc.

 Last time I played with it ncursesw (but not plain ncurses) handled UTF-8 
 just fine.
 
 e.g. you can pass a UTF-8 encoded string to addstr(), and provided the
 locale is set correctly, ncursesw will compute its width correctly. It is 
 *also* possible to use addwstr() with UCS-4, but not compulsory.

It's clear now, thanks!

Anyway, there are plenty of apps (e.g. vim, joe) that perfectly support
UTF-8 and use ncurses (not the w version). I wonder how it is possible...


-- 
Egmont


Re: utf8 patch for mc, slang 2 version

2006-06-14 Thread Egmont Koblinger
On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote:

 BTW: is anyone working on UTF-8 support for the ncurses(w) backend?

I don't think so, and I don't think it is worth it. Maintaining two backends
IMHO just causes headaches while it doesn't make mc better. I still can't
see why developers do not decide which one to use and drop the other one.

With Unicode support maintaining the two will be much harder since AFAIK
slang works with UTF-8 while ncurses uses UCS-4. Hence a completely
different patch would be required for the two cases.

 BTW2: a few months ago there was a thread on this list about converting all
 po/*.po files in the source tree to UTF-8 .. still not done. Any reasons?
 
 It can be performed by:
 
 [mc]$ for i in po/*.po; do msgconv -t UTF-8 $i -o $i; done

Are you sure it is safe to use the same output file? I'd rather use a tmp
file. It depends on msgconv's internal implementation, but I can easily
imagine a situation where the writing file descriptor's position exceeds the
reading position (since UTF-8 is longer than the 8-bit version) and this may
cause invalid results. Due to buffered read and write requests I guess it
needs larger files (with approx. 4096 -- 8192 accented characters) for this
bug to occur. I'm not sure, though, just reasonably paranoid :-)



-- 
Egmont


Re: utf8 patch for mc, slang 2 version

2006-06-14 Thread Vladimir Nadvornik
On Monday 12 June 2006 10:55, Jindrich Novy wrote:

  Please take notice. It would probably be good if Vladimir and you could
  keep your UTF-8 patch sets in sync. It just seems wasted effort to do
  this in two places independently. Some inter-distro communication is not
  going to hurt. Maybe these patches should be centrally maintained in a
  contribs tree.

 The problem is that Vladimir and me use different versions of mc. My
 approach is to be more in sync with upstream so that there's a more
 recent CVS snapshot in Fedora for now because of a very rare release
 period of mc. This helps me to do only minimal changes when I want to
 send a patch to upstream. SuSE uses the stable mc-4.6.1, AFAIK.

It would not be a problem to have a more recent mc snapshot in 
openSUSE Factory. Even better, if it is synced with Fedora.

 +1 for storing useful patches in contrib. It would be quite hard to keep
 them in sync with devel mc though.

Good idea.

However we need a long-term solution. We should discuss what must be done
to have UTF support in upstream.

-- 
Vladimir Nadvornik


Re: utf8 patch for mc, slang 2 version

2006-06-14 Thread Bart Oldeman
On Wed, 14 Jun 2006, Egmont Koblinger wrote:

 On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote:

 BTW: is anyone working on UTF-8 support for the ncurses(w) backend?

 I don't think so, and I don't think it is worth it. Maintaining two backends
 IMHO just causes headaches while it doesn't make mc better. I still can't
 see why developers do not decide which one to use and drop the other one.

Maybe compatibility with older UN*Xes with curses but no slang?

 With Unicode support maintaining the two will be much harder since AFAIK
 slang works with UTF-8 while ncurses uses UCS-4. Hence a completely
 different patch would be required for the two cases.

Last time I played with it ncursesw (but not plain ncurses) handled UTF-8 
just fine.

e.g. you can pass a UTF-8 encoded string to addstr(), and provided the
locale is set correctly, ncursesw will compute its width correctly. It is 
*also* possible to use addwstr() with UCS-4, but not compulsory.

Bart



Re: utf8 patch for mc, slang 2 version

2006-06-14 Thread Pavel Roskin
Hello!

On Thu, 2006-06-15 at 09:45 +1200, Bart Oldeman wrote:
 On Wed, 14 Jun 2006, Egmont Koblinger wrote:
 
  On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote:
 
  BTW: is anyone working on UTF-8 support for the ncurses(w) backend?
 
  I don't think so, and I don't think it is worth it. Maintaining two backends
  IMHO just causes headaches while it doesn't make mc better. I still can't
  see why developers do not decide which one to use and drop the other one.
 
 Maybe compatibility with older UN*Xes with curses but no slang?

It's a bogus argument.  UNIX curses was removed long ago, and it had
never worked well anyway.  I don't remember a single person complaining.
Besides, S-Lang is included with mc and it's quite portable.

-- 
Regards,
Pavel Roskin



Re: utf8 patch for mc, slang 2 version

2006-06-14 Thread Oswald Buddenhagen
On Thu, Jun 15, 2006 at 09:45:26AM +1200, Bart Oldeman wrote:
 On Wed, 14 Jun 2006, Egmont Koblinger wrote:
  On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote:
  BTW: is anyone working on UTF-8 support for the ncurses(w) backend?
 
  I don't think so, and I don't think it is worth it. Maintaining two backends
  IMHO just causes headaches while it doesn't make mc better. I still can't
  see why developers do not decide which one to use and drop the other one.
 
 Maybe compatibility with older UN*Xes with curses but no slang?
 
that doesn't sound too convincing.

  With Unicode support maintaining the two will be much harder since AFAIK
  slang works with UTF-8 while ncurses uses UCS-4. Hence a completely
  different patch would be required for the two cases.
 
 Last time I played with it ncursesw (but not plain ncurses) handled UTF-8 
 just fine.
 
good.

i'm all for killing slang support. why that one? libslang is twice as
big as libncursesw. probably because it was meant to be much more than
just a display lib.

-- 
Hi! I'm a .signature virus! Copy me into your ~/.signature, please!
--
Chaos, panic, and disorder - my work here is done.


Re: utf8 patch for mc, slang 2 version

2006-06-14 Thread Tomasz Kłoczko

On Wed, 14 Jun 2006, Egmont Koblinger wrote:


On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote:


BTW: is anyone working on UTF-8 support for the ncurses(w) backend?


I don't think so, and I don't think it is worth it. Maintaining two backends
IMHO just causes headaches while it doesn't make mc better. I still can't
see why developers do not decide which one to use and drop the other one.


If someone wants to ask which backend is more important and should be
kept as the primary one, IMO the answer to that kind of question is very simple:


# rpm -q --whatrequires libslang.so.2 | grep -v slang | wc -l
4
# rpm -q --whatrequires libncurses.so.5 libncursesw.so.5 | grep -v ncurses | wc -l
55

The slang list above contains only packages which currently have no ncurses
backend and are not as important as mc :)
The current state also has other sore points on Linux. When mc is used with
gpm, it ends up linked at runtime against more than one terminal toolkit
library (slang or the internal slang, plus the ncurses used by libgpm).


[..]

[mc]$ for i in po/*.po; do msgconv -t UTF-8 $i -o $i; done


Are you sure it is safe to use the same output file? I'd rather use a tmp
file.


I'm sure. msgconv accumulates its output in memory and, after finishing the
conversion, writes it in a single step to the file name passed in the -o parameter.


kloczek
--
---
*People don't have problems, they just create them for themselves*
---
Tomasz Kłoczko, sys adm @zie.pg.gda.pl|*e-mail: [EMAIL PROTECTED]


Re: utf8 patch for mc, slang 2 version

2006-06-14 Thread Pavel Tsekov
On Wed, 14 Jun 2006, Pavel Roskin wrote:

 Hello!

 On Thu, 2006-06-15 at 09:45 +1200, Bart Oldeman wrote:
 On Wed, 14 Jun 2006, Egmont Koblinger wrote:

 On Tue, Jun 13, 2006 at 07:14:41PM +0200, Tomasz Kłoczko wrote:

 BTW: is anyone working on UTF-8 support for the ncurses(w) backend?

 I don't think so, and I don't think it is worth it. Maintaining two backends
 IMHO just causes headaches while it doesn't make mc better. I still can't
 see why developers do not decide which one to use and drop the other one.

 Maybe compatibility with older UN*Xes with curses but no slang?

 It's a bogus argument.  UNIX curses was removed long ago, and it had
 never worked well anyway.  I don't remember a single person complaining.
 Besides, S-Lang is included with mc and it's quite portable.

We had that argument once already. curses is not that old and it is
standard. And it has unicode support. It's just a matter of using it.



Re: utf8 patch for mc, slang 2 version

2006-06-13 Thread Egmont Koblinger
On Sun, Jun 11, 2006 at 01:08:46PM +0200, Leonard den Ottolander wrote:

 Egmonts patches are worth looking at. I've blatantly ignored pushing
 them to you as I'd expected you to integrate them over time. I've
 probably not made it clear enough to you before that these patches are
 worth considering.

These patches hardly change nowadays. They only change if:

 - I face a new utf8 related bug in mc. It didn't happen in the last ~1.5
   years (except for the off-by-one fix), and I always use the same small
   subset of mc's features, so it's unlikely for this to happen.

 - Something else (e.g. slang) is upgraded which introduces or
   triggers new bugs.

 - mc is upgraded in our distro. This will only happen if a new mainstream
   version is released. I don't want to bother with CVS snapshots, 4.6.1 is
   working reasonably well.

After all, I'll try not to forget to mention it here if anything noticeable
changes in these patches. It's much simpler this way than for anyone to keep
track of our changes.

Finally, note that while these patches fix many issues with single-width
UTF-8 characters, they may be really buggy with double-width (CJK) or
zero-width Unicode characters, since I often assume that each Unicode entity
occupies one column. I know it is totally false and I already knew it when I
created these patches, but doing these things right would have required much
more effort.



-- 
Egmont


Re: utf8 patch for mc, slang 2 version

2006-06-12 Thread Jindrich Novy
Hi Leonard,

On Sun, 2006-06-11 at 13:08 +0200, Leonard den Ottolander wrote:
 On Thu, 2006-06-08 at 20:13 +0200, Egmont Koblinger wrote:
  I've just upgraded to slang-2 in our distro and also updated the UTF-8
  patches to mc-4.6.1 (based on SUSE's version). It was easier than I thought
  it would be. I was glad to see that you had applied plenty of my patches in
  SUSE (those named 00-*). Please take a look at here again:
  
  https://svn.uhulinux.hu/packages/dev/mc/patches/

00-74, 00-78 patches were helpful for me as well, thanks Egmont. They
are now applied in devel FC.

 Please take notice. It would probably be good if Vladimir and you could
 keep your UTF-8 patch sets in sync. It just seems wasted effort to do
 this in two places independently. Some inter-distro communication is not
 going to hurt. Maybe these patches should be centrally maintained in a
 contribs tree.

The problem is that Vladimir and I use different versions of mc. My
approach is to stay more in sync with upstream, so for now there is a more
recent CVS snapshot in Fedora, because mc releases are very infrequent.
This helps me to make only minimal changes when I want to
send a patch upstream. SuSE uses the stable mc-4.6.1, AFAIK.

+1 for storing useful patches in contrib. It would be quite hard to keep
them in sync with devel mc though.

Cheers,
Jindrich




Re: utf8 patch for mc, slang 2 version

2006-06-11 Thread Leonard den Ottolander
Hello Jindrich,

On Thu, 2006-06-08 at 20:13 +0200, Egmont Koblinger wrote:
 I've just upgraded to slang-2 in our distro and also updated the UTF-8
 patches to mc-4.6.1 (based on SUSE's version). It was easier than I thought
 it would be. I was glad to see that you had applied plenty of my patches in
 SUSE (those named 00-*). Please take a look at here again:
 
 https://svn.uhulinux.hu/packages/dev/mc/patches/

Please take notice. It would probably be good if Vladimir and you could
keep your UTF-8 patch sets in sync. It just seems wasted effort to do
this in two places independently. Some inter-distro communication is not
going to hurt. Maybe these patches should be centrally maintained in a
contribs tree.

Egmont's patches are worth looking at. I've blatantly ignored pushing
them to you as I'd expected you to integrate them over time. I've
probably not made it clear enough to you before that these patches are
worth considering.

Leonard.

-- 
mount -t life -o ro /dev/dna /genetic/research




Re: utf8 patch for mc, slang 2 version

2006-06-08 Thread Vladimir Nadvornik
 
  In non-UTF-8 mode slang2 behaves a bit differently than the patched slang1.
  As a result, mc does not work with 8bit encodings, like 8859-2 or KOI8.
  The attached patch fixes the SLsmg_write_nwchars() function to be fully
  compatible with the slang1 version and uses it consistently instead of
  SLsmg_write_char(). It should be applied on top of all the previous
  patches.

 Thanks for the patch. The last hunk didn't apply as there's no
 view_add_character used in editdraw.c.

My patch is based on mc-4.6.1
The idea is to replace all occurrences of SLsmg_write_char with
the now fixed SLsmg_write_nwchars, because it is the only way
that works in all locales.

-- 
Vladimir Nadvornik


Re: utf8 patch for mc, slang 2 version

2006-06-08 Thread Egmont Koblinger
Hi Vladimir (and others),

I've just upgraded to slang-2 in our distro and also updated the UTF-8
patches to mc-4.6.1 (based on SUSE's version). It was easier than I thought
it would be. I was glad to see that you had applied plenty of my patches in
SUSE (those named 00-*). Please take a look at here again:

https://svn.uhulinux.hu/packages/dev/mc/patches/

00-77 had to be updated to slang-2. When a user searches for a file by ^S,
panel->search_buffer is filled up individually with every single byte
pressed. Hence it often contains a partial UTF-8 string. Displaying it just
happened to work with slang-1, but slang-2 prints the partial UTF-8 as
<C3> or similar. As a result, the cyan box overflows: if you search in the
left panel for an existing accented filename, two cyan blocks appear in the
right panel. The updated patch first finds the longest valid UTF-8 prefix of
the string and only prints that part.
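
To illustrate the "longest valid UTF-8 prefix" idea, here is a minimal sketch. It is not the code from 00-77; the function name `utf8_valid_prefix_len` is made up for this example, and it works on raw bytes so that it does not depend on the locale.

```c
#include <stddef.h>

/* Return how many leading bytes of s[0..n) consist only of complete,
 * well-formed UTF-8 sequences. A trailing partial character (for
 * example a lone 0xC3 lead byte) is not counted. */
static size_t utf8_valid_prefix_len(const char *s, size_t n)
{
    const unsigned char *p = (const unsigned char *) s;
    size_t i = 0;

    while (i < n) {
        unsigned char c = p[i];
        unsigned long cp, min;
        size_t len, k;

        if (c < 0x80)                { len = 1; cp = c;        min = 0; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; min = 0x10000; }
        else
            break;              /* stray continuation or invalid lead byte */

        if (len > n - i)
            break;              /* sequence is cut off: partial character */

        for (k = 1; k < len; k++) {
            if ((p[i + k] & 0xC0) != 0x80)
                break;          /* not a continuation byte */
            cp = (cp << 6) | (p[i + k] & 0x3F);
        }
        if (k < len)
            break;
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            break;              /* overlong form, out of range, or surrogate */

        i += len;
    }
    return i;
}
```

A caller in the situation described above would print only the first `utf8_valid_prefix_len(buf, len)` bytes of the search buffer and leave the pending partial character for the next keypress.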

You might find 00-79 useful too, it fixes an off-by-one bug introduced by
the UTF-8 patches that causes Alt+Backspace to behave differently (erase a
whole word and one more character, usually a space) than in bash or in
vanilla mc (erase only the word).


bye,

Egmont


Re: utf8 patch for mc, slang 2 version

2006-06-07 Thread Vladimir Nadvornik
On Saturday 12 November 2005 20:59, Leonard den Ottolander wrote:
 Hi Bart, list,

 On Tue, 2005-09-20 at 08:14 +1200, Bart Oldeman wrote:
  Basically you'd need to apply Fedora's patch, and then (sorry no proper
  patch here but it's not a big deal)

 Attached you'll find a proper patch. I've moved the hunk from global.h
 to myslang.h (and dropped the inclusion of slang.h as it is redundant).

 Jindrich, please merge this patch with the UTF8 patch. You might even
 consider to build the next mc for FC --with-screen=mcslang, as I see the
 FC development tree does not yet feature slang-2.

Hi all,

In non-UTF-8 mode slang2 behaves a bit differently than the patched slang1.
As a result, mc does not work with 8bit encodings, like 8859-2 or KOI8.
The attached patch fixes the SLsmg_write_nwchars() function to be fully
compatible with the slang1 version and uses it consistently instead of
SLsmg_write_char(). It should be applied on top of all the previous patches.
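
The per-character conversion back to an 8-bit encoding can be sketched as follows. This is an illustration only, not the attached patch: `wchar_to_locale_byte` is a hypothetical helper, and it assumes the program has already called `setlocale()` so that `wcrtomb()` uses the user's encoding. The sketch converts through `unsigned char`, since on platforms where plain `char` is signed a direct widening of a byte above 0x7F would sign-extend; whether that matters for `SLsmg_write_char` is not settled in this thread.

```c
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Map one wide character to the single byte that represents it in the
 * current locale's encoding. Returns '?' when the character has no
 * one-byte representation there. */
static int wchar_to_locale_byte(wchar_t wc, mbstate_t *state)
{
    char buf[MB_LEN_MAX];

    if (wc >= 0 && wc < 0x80)
        return (int) wc;                  /* ASCII passes through */

    if (wcrtomb(buf, wc, state) == 1)
        return (unsigned char) buf[0];    /* avoid sign extension */

    /* After a failed conversion the state is unspecified: reset it. */
    memset(state, 0, sizeof *state);
    return '?';
}
```

The caller owns the `mbstate_t`, zero-initialises it once with `memset`, and passes it to every call for the same string.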


-- 
Vladimir Nadvornik
developer
-  
SuSE CR, s.r.o. e-mail: [EMAIL PROTECTED]
Drahobejlova 27 tel:+420 2 9654 2373 
190 00 Praha 9  fax:+420 2 9654 2374   
Ceska republika http://www.suse.cz
diff -ruN mc-4.6.1.orig/edit/editdraw.c mc-4.6.1/edit/editdraw.c
--- mc-4.6.1.orig/edit/editdraw.c	2006-06-07 11:57:19.0 +0200
+++ mc-4.6.1/edit/editdraw.c	2006-06-07 11:56:30.0 +0200
@@ -234,7 +234,7 @@
 	lowlevel_set_color (color);
 	}
 #ifdef UTF8
-	SLsmg_write_char(textchar);
+	SLsmg_write_nwchars(&textchar, 1);
 #else
 	addch (textchar);
 #endif
diff -ruN mc-4.6.1.orig/src/help.c mc-4.6.1/src/help.c
--- mc-4.6.1.orig/src/help.c	2006-06-07 11:57:19.0 +0200
+++ mc-4.6.1/src/help.c	2006-06-07 11:56:30.0 +0200
@@ -461,7 +461,7 @@
 		len = mbrtowc(&wc, p, MB_CUR_MAX, &mbs);
 		if (len <= 0) len = 1; /* skip broken multibyte chars */
 
-	SLsmg_write_char(wc);
+	SLsmg_write_nwchars(&wc, 1);
 		p += len - 1;
 		} else
 #endif
diff -ruN mc-4.6.1.orig/src/util.c mc-4.6.1/src/util.c
--- mc-4.6.1.orig/src/util.c	2006-06-07 11:57:19.0 +0200
+++ mc-4.6.1/src/util.c	2006-06-07 11:56:30.0 +0200
@@ -58,8 +58,26 @@
 #if SLANG_VERSION >= 20000
 void SLsmg_write_nwchars(wchar_t *s, size_t n)
 {
-  while(n--)
-  SLsmg_write_char(*s++);
+if (SLsmg_is_utf8_mode()) { /* slang can handle it directly */
+	while(n-- && *s)
+	SLsmg_write_char(*s++);
+}
+else { /* convert wchars back to 8bit encoding */
+mbstate_t mbs;
+	memset (&mbs, 0, sizeof (mbs));
+	while (n-- && *s) {
+	char buf[MB_LEN_MAX + 1]; /* should use 1 char, but to be sure */
+	if (*s < 0x80) {
+		SLsmg_write_char(*s++); /* ASCII */
+	}
+	else {
+		if (wcrtomb(buf, *s++, &mbs) == 1)
+		SLsmg_write_char((wchar_t)(buf[0]));
+		else
+		SLsmg_write_char('?'); /* should not happen */
+	}
+	}
+}
 }
 #endif
 
diff -ruN mc-4.6.1.orig/src/view.c mc-4.6.1/src/view.c
--- mc-4.6.1.orig/src/view.c	2006-06-07 11:57:19.0 +0200
+++ mc-4.6.1/src/view.c	2006-06-07 11:56:30.0 +0200
@@ -852,7 +852,7 @@
 #ifndef UTF8
 #define view_add_character(view,c) addch (c)
 #else /* UTF8 */
-#define view_add_character(view,c) SLsmg_write_char(c)
+#define view_add_character(view,c) {wchar_t tmp=c; SLsmg_write_nwchars(&tmp, 1);}
 #endif /* UTF8 */
 #define view_add_one_vline()   one_vline()
 #define view_add_string(view,s)addstr (s)


Re: utf8 patch for mc, slang 2 version

2006-06-07 Thread Jindrich Novy
Hi,

On Wed, 2006-06-07 at 12:18 +0200, Vladimir Nadvornik wrote:
 On Saturday 12 November 2005 20:59, Leonard den Ottolander wrote:
  Hi Bart, list,
 
  On Tue, 2005-09-20 at 08:14 +1200, Bart Oldeman wrote:
   Basically you'd need to apply Fedora's patch, and then (sorry no proper
   patch here but it's not a big deal)
 
  Attached you'll find a proper patch. I've moved the hunk from global.h
  to myslang.h (and dropped the inclusion of slang.h as it is redundant).
 
  Jindrich, please merge this patch with the UTF8 patch. You might even
  consider to build the next mc for FC --with-screen=mcslang, as I see the
  FC development tree does not yet feature slang-2.
 

slang-2.0.5 has been used since FC5, so only FC4 now uses the old slang-1.4.9.

 Hi all,
 
 In non-UTF-8 mode slang2 behaves a bit differently than the patched slang1.
 As a result, mc does not work with 8bit encodings, like 8859-2 or KOI8.
 The attached patch fixes the SLsmg_write_nwchars() function to be fully
 compatible with the slang1 version and uses it consistently instead of
 SLsmg_write_char(). It should be applied on top of all the previous patches.

Thanks for the patch. The last hunk didn't apply as there's no
view_add_character used in editdraw.c.

Jindrich



Re: utf8 patch for mc, slang 2 version

2005-11-12 Thread Leonard den Ottolander
Hi Bart, list,

On Tue, 2005-09-20 at 08:14 +1200, Bart Oldeman wrote:
 Basically you'd need to apply Fedora's patch, and then (sorry no proper 
 patch here but it's not a big deal)

Attached you'll find a proper patch. I've moved the hunk from global.h
to myslang.h (and dropped the inclusion of slang.h as it is redundant).

Jindrich, please merge this patch with the UTF8 patch. You might even
consider to build the next mc for FC --with-screen=mcslang, as I see the
FC development tree does not yet feature slang-2.

Leonard.

-- 
mount -t life -o ro /dev/dna /genetic/research

diff -pruN mc-4.6.1a/src/help.c mc-4.6.1a_utf8_slang2/src/help.c
--- mc-4.6.1a/src/help.c	2005-11-12 17:46:32.0 +0100
+++ mc-4.6.1a_utf8_slang2/src/help.c	2005-11-12 17:52:11.0 +0100
@@ -449,7 +449,7 @@ static void help_show (Dlg_head *h, cons
 #ifndef HAVE_SLANG
 			addch (acs_map [c]);
 #else
-#ifdef UTF8
+#if defined(UTF8) && SLANG_VERSION < 20000
 			SLsmg_draw_object (h->y + line + 2, h->x + col + 2, acs_map [c]);
 #else
 			SLsmg_draw_object (h->y + line + 2, h->x + col + 2, c);
diff -pruN mc-4.6.1a/src/myslang.h mc-4.6.1a_utf8_slang2/src/myslang.h
--- mc-4.6.1a/src/myslang.h	2005-11-12 17:46:32.0 +0100
+++ mc-4.6.1a_utf8_slang2/src/myslang.h	2005-11-12 18:09:27.0 +0100
@@ -11,6 +11,12 @@
 #endif	/* HAVE_SLANG_SLANG_H */
 #endif
 
+#if SLANG_VERSION >= 20000
+#define UTF8 1
+#define SLsmg_Is_Unicode SLsmg_is_utf8_mode()
+void SLsmg_write_nwchars(wchar_t *s, size_t n);
+#endif
+
 #ifdef UTF8
 #include <wchar.h>
 #endif
diff -pruN mc-4.6.1a/src/slint.c mc-4.6.1a_utf8_slang2/src/slint.c
--- mc-4.6.1a/src/slint.c	2005-09-05 04:14:29.0 +0200
+++ mc-4.6.1a_utf8_slang2/src/slint.c	2005-11-12 17:49:21.0 +0100
@@ -141,7 +141,9 @@ void
 slang_init (void)
 {
 SLtt_get_terminfo ();
-
+#if SLANG_VERSION >= 20000
+SLutf8_enable (-1);
+#endif
/*
 * If the terminal in not in terminfo but begins with a well-known
 * string such as linux or xterm S-Lang will go on, but the
diff -pruN mc-4.6.1a/src/util.c mc-4.6.1a_utf8_slang2/src/util.c
--- mc-4.6.1a/src/util.c	2005-11-12 17:46:32.0 +0100
+++ mc-4.6.1a_utf8_slang2/src/util.c	2005-11-12 17:55:17.0 +0100
@@ -56,6 +56,14 @@
 static const char app_text [] = "Midnight-Commander";
 int easy_patterns = 1;
 
+#if SLANG_VERSION >= 20000
+void SLsmg_write_nwchars(wchar_t *s, size_t n)
+{
+while(n--)
+	SLsmg_write_char(*s++);
+}
+#endif
+
 extern void str_replace(char *s, char from, char to)
 {
 for (; *s != '\0'; s++) {


Re: utf8 patch for mc, slang 2 version

2005-09-26 Thread Koblinger Egmont
Hi,

 I have patches for the NFC / NFD issue and two other patches for the
 Darwin/Mac Platform for the current UTF-8 version, with all patches
 applied (I don't know where to post it, so I post here, sorry):
 
 config.h @ 3
 +#ifdef __APPLE__
 +#define unix 1
 +#endif

I guess developers prefer patches created with diff -u rather than just
some pseudo-code. Extract the original source code to a directory called
mc-4.6.1.orig (or actually you can call it whatever you want), copy it to
mc-4.6.1 (or whatever you like), make your modifications under the latter
one and then run diff -Naurdp mc-4.6.1.orig mc-4.6.1 or something similar.
Then attach it to the mail, it's easier to handle attachments than cutting
parts of the message body. Developers will correct me if I'm wrong.

 The Israeli (ivrit) and others like Chinese and Japanese are showing
 up correctly, but they crash the windowing, because they are
 multi-byte chars.
 
 Question: how the length of these multi-byte chars can be decided?
 Does anyone have any idea?

If I understand you, here by multi-byte and length you actually mean how
many character cells they occupy on the screen. It's usually called the
width of a character. See the manual of wcwidth() and wcswidth().
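
A minimal sketch of measuring display width with `wcwidth()`. The helper name `display_columns` is hypothetical, and counting non-printable characters as one column is an arbitrary choice made for this example. The feature-test macro is what glibc needs in order to declare `wcwidth()`; other systems may differ.

```c
#define _XOPEN_SOURCE 700 /* asks glibc to declare wcwidth() */
#include <stddef.h>
#include <wchar.h>

/* Number of terminal columns needed to draw ws. wcwidth() reports 2
 * for double-width characters and 0 for combining ones when the locale
 * knows about them; it returns -1 for non-printable characters, which
 * this sketch counts as one column. */
static int display_columns(const wchar_t *ws)
{
    int cols = 0;

    for (; *ws != L'\0'; ws++) {
        int w = wcwidth(*ws);

        cols += (w < 0) ? 1 : w;
    }
    return cols;
}
```

The results depend on the active locale, so a program would normally call `setlocale(LC_ALL, "")` once at startup before relying on the widths of non-ASCII characters.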



-- 
Egmont


Re: utf8 patch for mc, slang 2 version

2005-09-26 Thread Bálint Kardos
Hi,

1) thank you for your answer, I'll post a diff once everything works
as expected :)
2) the multi-byte char Slang2 bug came up at Debian too, see
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=316010 . I applied
this patch (080_wide_chars.patch), but it still miscounts.

Still no Asian and/or Mac developers here? :)

Regards,

Balint


On 25/09/05, Koblinger Egmont [EMAIL PROTECTED] wrote:

  Question: how the length of these multi-byte chars can be decided?
  Does anyone have any idea?

 If I understand you, here by multi-byte and length you actually mean how
 many character cells they occupy on the screen. It's usually called the
 width of a character. See the manual of wcwidth() and wcswidth().


Re: utf8 patch for mc, slang 2 version

2005-09-23 Thread Bálint Kardos
Hi,

I have patches for the NFC / NFD issue and two other patches for the
Darwin/Mac Platform for the current UTF-8 version, with all patches
applied (I don't know where to post it, so I post here, sorry):

config.h @ 3
+#ifdef __APPLE__
+#define unix 1
+#endif

config.h @ 53
+#include <wchar.h>


vfs.c @ 507
+#ifdef __APPLE__
+else
+{
+gchar* gp2 = g_utf8_normalize (result->d_name,-1,G_NORMALIZE_ALL_COMPOSE);
+   strncpy(result->d_name,gp2,strlen(gp2)+1);
+   g_free(gp2);
+   }
+#endif

// this is the main conversion routine - from the Glib.

This patch makes it possible to display all 2-byte UTF-8 encoded chars
(all West and Central European languages and Cyrillic letters as well)
on any HFS/HFS+ filesystem.
Hebrew (Ivrit) and others like Chinese and Japanese show
up correctly, but they break the windowing, because they are
multi-byte chars.

Question: how the length of these multi-byte chars can be decided?
Does anyone have any idea?

Two issues remained open:

- the panel header and the footer command line.
Does anyone here know where I can find these items (screen.c)?
- are there any other Mac users who could test these changes, please?


Huge Kudos goes to Akos Huszti for great help, research and support!

Regards,

Balint


On 21/09/05, Pavel Tsekov [EMAIL PROTECTED] wrote:
 Hello,

 Maybe this discussion should be moved to general list - mc at gnome dot
 org.


Re: utf8 patch for mc, slang 2 version

2005-09-21 Thread Koblinger Egmont
On Tue, Sep 20, 2005 at 10:11:28PM +0200, Bálint Kardos wrote:

 But even with all patches and stuff, I see the following Unicode glitches:
 
 - the utf-8 chars are not displayed in the dir list (on Ubuntu, everything is
 OK)
 for ÉÁŰŐÚÖÜÓ I see EAUOUOUO (upper, lowercase all wrong)
 
 - the files/dirs that contain the unicode chars, are still not
 properly aligned to the grids
 
 What could cause Darwin to behave so unpredictably?
 In the filesystem, there's another error:
 if you do 'ls', the alignment of the columns after the unicode chars
 are broken as well.

Unices use NFC, while MacOS uses the NFD representation of accents (at least for
filenames, I don't know about file contents). NFC means each accented
character has its own composed value, that is, one Unicode entity, which
is usually stored as two (maybe three) bytes in UTF-8. NFD decomposes the
character into two Unicode entities, first the unaccented letter, followed
by an accent on its own. Its UTF-8 representation hence takes three bytes
(one for the unaccented letter and two more for the accent).

There are different levels of Unicode specified, I guess supporting NFD
requires a higher level of conformance since it's a harder job than
supporting NFC. I bet mc's UTF-8 patch only supports NFC.
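
To make the byte counts concrete, here is a small sketch using "é" as the example letter. The constant and function names are hypothetical and only serve the illustration.

```c
#include <string.h>

/* "é" in NFC: the single code point U+00E9, two bytes in UTF-8. */
static const char e_acute_nfc[] = "\xC3\xA9";

/* "é" in NFD: 'e' (U+0065) followed by COMBINING ACUTE ACCENT (U+0301),
 * one byte plus two bytes in UTF-8. */
static const char e_acute_nfd[] = "e\xCC\x81";

/* Hypothetical helper: byte-wise comparison, as any strcmp-based file
 * name handling would do it. Returns 1 if the strings are identical. */
static int same_bytes(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Here strlen(e_acute_nfc) is 2 and strlen(e_acute_nfd) is 3, and same_bytes() reports the two as different although they denote the same letter. A display routine that ignores combining characters would show the NFD form as a bare "e", which would be consistent with the EAUOUOUO output reported above.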



-- 
Egmont


Re: utf8 patch for mc, slang 2 version

2005-09-21 Thread Bálint Kardos
Hi, I've debugged it from the beginning, and:

screen.c 762:


// inserted

wchar_t lonaka[100];   // debugging copy; assumes txtlen < 100
memcpy(lonaka,buffer,txtlen*sizeof(wchar_t));
lonaka[txtlen]=0;

// end inserted

printw ("%*s", still, "");

printw("%ls",lonaka);
   //SLsmg_write_nwchars ((wchar_t *) buffer, txtlen);

printw ("%*s", len - txtwidth - still, "");


printw("%ls",lonaka); handles everything as expected, the right UTF-8
chars appeared on the screen. So the text in the buffer is properly
encoded.
It is an Slang2 issue, but it's too complicated to figure out at
first glance; the problem is in the slsmg.c file.

Does anyone know why unix is not #defined as 1 for darwin-ppc in mc (and/or) slang?
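
A possible explanation, offered as an assumption rather than something verified on darwin-ppc: Apple's compilers predefine __APPLE__ and __MACH__ but not unix or __unix__, so code guarded only by the latter is skipped there. A sketch of a broader check follows; MY_UNIX_LIKE and is_unix_like are hypothetical names that neither mc nor slang defines.

```c
/* Hypothetical feature macro; neither mc nor slang defines it. */
#if defined(unix) || defined(__unix) || defined(__unix__) || \
    (defined(__APPLE__) && defined(__MACH__))
# define MY_UNIX_LIKE 1
#else
# define MY_UNIX_LIKE 0
#endif

/* Returns 1 when compiled on a Unix-like system, Darwin included. */
static int is_unix_like(void)
{
    return MY_UNIX_LIKE;
}
```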


regards,

Bálint






Re: utf8 patch for mc, slang 2 version

2005-09-21 Thread Pavel Tsekov
Hello,

Maybe this discussion should be moved to general list - mc at gnome dot
org.




Re: utf8 patch for mc, slang 2 version

2005-09-20 Thread Bálint Kardos
Hi Folks,

sorry for bothering, I'm just a newbie, who tries to have UTF-8 MC on his OS X.
I've downloaded the latest source and compiled it, but the result was erroneous.
I have Fedora, and their MC is far better, but I just cannot reproduce it on my Mac.
I've downloaded all FC4 patches from the CVS and applied them to 4.6.1 (according to the sources
file, they used 4.6.1-pre5), BUT nothing really changed.
The GUI is/was properly encoded (LANG=hu_HU.UTF-8), but the folder list is still broken.
I have Ubuntu too; I used alien to install the FC4 mc rpm on it, but:
On 19/09/05, Bart Oldeman [EMAIL PROTECTED] wrote:
 On Mon, 19 Sep 2005, Arkadiusz Miskiewicz wrote:

  There are few patches adding UTF-8 support to mc available but they
  require modified slang1 library. There is already stable slang-2 version
  with native UTF-8 support. Does anyone know mc utf8 patch that uses
  slang-2 ?

 No, but it's not too hard to change the existing patch.

 Basically you'd need to apply Fedora's patch, and then (sorry no proper
 patch here but it's not a big deal)

 in slint.c, use
  SLtt_get_terminfo()
 +#if SLANG_VERSION >= 20000
 +    SLutf8_enable (-1);
 +#endif

 in help.c, use
 #if defined(UTF8) && SLANG_VERSION < 20000
     SLsmg_draw_object (h->y + line + 2, h->x + col + 2, acs_map [c]);
 #else
     SLsmg_draw_object (h->y + line + 2, h->x + col + 2, c);
 #endif /* UTF8 */

 and then (possibly global.h isn't the best place though):

 --- mc/src/util.c 2005-09-18 22:36:30.0 +1200
 +++ mc-utf8/src/util.c 2005-08-26 13:04:45.0 +1200
 @@ -48,6 +51,14 @@
  static const char app_text [] = "Midnight-Commander";
  int easy_patterns = 1;

 +#if SLANG_VERSION >= 20000
 +void SLsmg_write_nwchars(wchar_t *s, size_t n)
 +{
 +    while(n--)
 +        SLsmg_write_char(*s++);
 +}
 +#endif
 +
  extern void str_replace(char *s, char from, char to)
  {
      for (; *s != '\0'; s++) {
 @@ -78,9 +89,40 @@
 --- mc/src/global.h 2005-08-27 15:51:32.0 +1200
 +++ mc-utf8/src/global.h 2005-07-13 01:15:40.0 +1200
 @@ -146,6 +146,13 @@
  # define N_(String) (String)
  #endif /* !ENABLE_NLS */

 +#include <slang.h>
 +#if SLANG_VERSION >= 20000
 +#define UTF8 1
 +#define SLsmg_Is_Unicode SLsmg_is_utf8_mode()
 +void SLsmg_write_nwchars(wchar_t *s, size_t n);
 +#endif
 +
  #include "fs.h"
  #include "util.h"


Re: utf8 patch for mc, slang 2 version

2005-09-20 Thread Arkadiusz Miskiewicz
On Monday 19 of September 2005 22:14, Bart Oldeman wrote:

  Does anyone know mc utf8 patch that uses slang-2 ?

 No, but it's not too hard to change the existing patch.

 Basically you'd need to apply Fedora's patch, and then (sorry no proper
 patch here but it's not a big deal)

Thanks, works.

-- 
Arkadiusz Miśkiewicz                    PLD/Linux Team
http://www.t17.ds.pwr.wroc.pl/~misiek/  http://ftp.pld-linux.org/


Re: utf8 patch for mc, slang 2 version

2005-09-20 Thread Bálint Kardos
hi,

Arkadiusz was kind enough to provide me the sourcecode, and I managed
to compile it on my Mac (Darwin 8.2.0).

But even with all patches and stuff, I see the following Unicode glitches:

- the utf-8 chars are not displayed in the dir list (on Ubuntu, everything is OK)
for ÉÁŰŐÚÖÜÓ I see EAUOUOUO (upper, lowercase all wrong)

- the files/dirs that contain the unicode chars, are still not
properly aligned to the grids

(

What could cause Darwin to behave so unpredictably?
In the filesystem, there's another error:
if you do 'ls', the alignment of the columns after the unicode chars
is broken as well.

(where can I find the source code for the ls command? what package on
other linux distributions?)

)

- I've lost the functionality of the arrow keys; they just type ABCD.
(so I use screen)

A problem on both Linux and Darwin: if you cd into a unicode-named
directory, the blinking command prompt is shifted to the right by as
many spaces as there are unicode chars in the dir name.

for example:

~/ÉÁŰŐÚÖÜÓ []
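
A likely cause, stated as an assumption rather than a confirmed diagnosis: the prompt position is computed from the byte length of the directory name instead of its character count. The sketch below shows the difference for the name above; the helper names and the constant are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* The directory name from the example, spelled out as UTF-8 bytes
 * (every letter in it takes two bytes). */
static const char example_dir[] =
    "\xC3\x89\xC3\x81\xC5\xB0\xC5\x90\xC3\x9A\xC3\x96\xC3\x9C\xC3\x93";

/* Hypothetical helper: count code points in a valid UTF-8 string by
 * skipping continuation bytes (those of the form 10xxxxxx). */
static size_t utf8_count_chars(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Hypothetical helper: how many columns too far a cursor would land
 * if it were positioned by byte length instead of character count. */
static size_t extra_columns(const char *s)
{
    return strlen(s) - utf8_count_chars(s);
}
```

For example_dir the byte length is 16 and the character count is 8, so extra_columns() gives 8, one stray space per 2-byte character, which matches the behaviour described.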


Kind regards,

Balint