Re: [PATCH] Use wcwidth(1) to calculate character width

2015-05-18 Thread Nicholas Marriott
Hi

I think you will need another table of ambiguous-width characters in
utf8.c that it can check (maybe depending on the locale; I guess that is
how wcwidth does it).
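
A minimal sketch of what such a second table might look like, assuming a sorted array of ranges searched with bsearch(3). The names ambiguous_range, ambiguous_table and is_ambiguous are hypothetical and not part of tmux; the ranges are only a handful of the entries marked "A" in EastAsianWidth.txt, not the complete list.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical range entry; not tmux's actual structure. */
struct ambiguous_range {
	unsigned int	first;
	unsigned int	last;
};

/*
 * A few illustrative ranges marked "A" in EastAsianWidth.txt.
 * Sorted by code point and deliberately incomplete.
 */
static const struct ambiguous_range ambiguous_table[] = {
	{ 0x00a1, 0x00a1 },
	{ 0x00a4, 0x00a4 },
	{ 0x00a7, 0x00a8 },
	{ 0x0391, 0x03a1 },
	{ 0x03a3, 0x03a9 },
	{ 0x03b1, 0x03c1 },
	{ 0x03c3, 0x03c9 },
	{ 0x2460, 0x24e9 },
};

/* bsearch comparator: compare a code point against one range. */
static int
ambiguous_cmp(const void *key, const void *entry)
{
	unsigned int			 uc = *(const unsigned int *)key;
	const struct ambiguous_range	*r = entry;

	if (uc < r->first)
		return (-1);
	if (uc > r->last)
		return (1);
	return (0);
}

/* Return 1 if uc falls in one of the listed ranges, 0 otherwise. */
int
is_ambiguous(unsigned int uc)
{
	return (bsearch(&uc, ambiguous_table,
	    sizeof ambiguous_table / sizeof ambiguous_table[0],
	    sizeof ambiguous_table[0], ambiguous_cmp) != NULL);
}
```

Because the lookup never consults the C library's locale, the caller stays free to decide separately whether an ambiguous character should count as one or two columns.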


On Mon, May 18, 2015 at 11:23:27PM +0900, Kohei Suzuki wrote:
The character ranges are listed in
http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt .
"A" means ambiguous width, as described in
http://www.unicode.org/reports/tr11/#Definitions .
Kohei Suzuki
eagle...@gmail.com

Re: [PATCH] Use wcwidth(1) to calculate character width

2015-05-18 Thread Nicholas Marriott
Hi

How does wcwidth figure this out? What are the character ranges?


On Mon, May 18, 2015 at 09:24:37PM +0900, Kohei Suzuki wrote:
Thank you for reviewing.
Okay, then how about adding a new option, ambiguous-width, which controls
the width of East Asian ambiguous-width characters?
It would be set to 1 by default, and users could set it to 2 with
`set-option ambiguous-width 2`.
Is that acceptable?
Kohei Suzuki
eagle...@gmail.com

Re: [PATCH] Use wcwidth(1) to calculate character width

2015-05-18 Thread Kohei Suzuki
The character ranges are listed in
http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt .
"A" means ambiguous width, as described in
http://www.unicode.org/reports/tr11/#Definitions .

Kohei Suzuki
eagle...@gmail.com
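
For illustration, a rough sketch of how one data line of EastAsianWidth.txt could be read. Each data line is a code point or a first..last range, a semicolon, the property value, and an optional # comment. The function names parse_eaw_line and eaw_line_is_ambiguous are hypothetical helpers, and the parser assumes well-formed input.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one data line such as "03B1..03C1;A # ..." or "00A1;A".
 * On success, store the range and the property value (at most two
 * letters) and return 0. Return -1 for comments, blank lines, or
 * anything else that does not match. On failure the outputs may have
 * been partially written and should be ignored.
 */
int
parse_eaw_line(const char *line, unsigned int *first, unsigned int *last,
    char prop[3])
{
	/* Range form: first..last;property */
	if (sscanf(line, "%x..%x ; %2[A-Za-z]", first, last, prop) == 3)
		return (0);
	/* Single code point form: cp;property */
	if (sscanf(line, "%x ; %2[A-Za-z]", first, prop) == 2) {
		*last = *first;
		return (0);
	}
	return (-1);
}

/* Return 1 if the line describes an ambiguous-width ("A") range. */
int
eaw_line_is_ambiguous(const char *line, unsigned int *first,
    unsigned int *last)
{
	char	prop[3];

	if (parse_eaw_line(line, first, last, prop) != 0)
		return (0);
	return (strcmp(prop, "A") == 0);
}
```

Running something like this over the whole file at build time would be one way to generate a static table of ambiguous ranges, rather than maintaining it by hand.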

2015-05-18 23:16 GMT+09:00 Nicholas Marriott nicholas.marri...@gmail.com:

 Hi

 How does wcwidth figure this out? What are the character ranges?



Re: [PATCH] Use wcwidth(1) to calculate character width

2015-05-18 Thread Kohei Suzuki
Thank you for reviewing.
Okay, then how about adding a new option, ambiguous-width, which controls
the width of East Asian ambiguous-width characters?
It would be set to 1 by default, and users could set it to 2 with
`set-option ambiguous-width 2`.
Is that acceptable?

Kohei Suzuki
eagle...@gmail.com
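
A rough sketch of how a width lookup could take such an option into account. This is an assumption about one possible shape, not code from tmux: width_entry, WIDTH_AMBIGUOUS and char_width are hypothetical names, the ambiguous_width argument stands in for the value of the proposed option, and the table holds only three ranges copied from the existing utf8.c table plus the Greek small letters as the one ambiguous example.

```c
#include <stddef.h>

/* Marker for ranges whose width is decided by the option. */
#define WIDTH_AMBIGUOUS (-1)

/* Hypothetical table entry; a simplification for this sketch. */
struct width_entry {
	unsigned int	first;
	unsigned int	last;
	int		width;
};

static const struct width_entry width_table[] = {
	{ 0x00591, 0x005bd, 0 },		/* combining, zero width */
	{ 0x003b1, 0x003c1, WIDTH_AMBIGUOUS },	/* Greek small letters */
	{ 0x01100, 0x0115f, 2 },		/* always wide */
	{ 0x0ffe0, 0x0ffe6, 2 },		/* always wide */
};

/*
 * Return the column width of uc. ambiguous_width is the option value;
 * anything other than 2 is treated as the default of 1.
 */
int
char_width(unsigned int uc, int ambiguous_width)
{
	size_t	i;

	if (ambiguous_width != 2)
		ambiguous_width = 1;
	for (i = 0; i < sizeof width_table / sizeof width_table[0]; i++) {
		if (uc < width_table[i].first || uc > width_table[i].last)
			continue;
		if (width_table[i].width == WIDTH_AMBIGUOUS)
			return (ambiguous_width);
		return (width_table[i].width);
	}
	/* Anything not listed is assumed to be one column wide. */
	return (1);
}
```

With this shape, U+03B1 would be reported as one column by default and as two columns after `set-option ambiguous-width 2`, while characters that are unambiguously wide or zero-width are unaffected by the option.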

2015-05-18 2:08 GMT+09:00 Nicholas Marriott nicholas.marri...@gmail.com:

 Hi

 We can't do this because tmux must know the width of the UTF-8 character
 but the locale may not be UTF-8.




Re: [PATCH] Use wcwidth(1) to calculate character width

2015-05-17 Thread Nicholas Marriott
Hi

We can't do this because tmux must know the width of the UTF-8 character
but the locale may not be UTF-8.
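
To make the constraint concrete: wcwidth(3) takes a wchar_t and consults the current LC_CTYPE, whereas the bytes arriving from applications are UTF-8 whatever the locale is. A minimal sketch of a locale-independent path decodes the bytes by hand into a code point, which a built-in width table can then look up. The function name utf8_decode is hypothetical and is not tmux's own decoder.

```c
#include <stddef.h>

/*
 * Decode one UTF-8 sequence from buf (len bytes available) into *uc.
 * Returns the number of bytes consumed, or 0 if the input is invalid,
 * overlong, a surrogate, out of range, or truncated.
 */
size_t
utf8_decode(const unsigned char *buf, size_t len, unsigned int *uc)
{
	size_t		need, i;
	unsigned int	cp, min;

	if (len == 0)
		return (0);
	if (buf[0] < 0x80) {
		*uc = buf[0];
		return (1);
	}
	if ((buf[0] & 0xe0) == 0xc0) {
		need = 2; cp = buf[0] & 0x1f; min = 0x80;
	} else if ((buf[0] & 0xf0) == 0xe0) {
		need = 3; cp = buf[0] & 0x0f; min = 0x800;
	} else if ((buf[0] & 0xf8) == 0xf0) {
		need = 4; cp = buf[0] & 0x07; min = 0x10000;
	} else
		return (0);
	if (len < need)
		return (0);
	for (i = 1; i < need; i++) {
		/* Continuation bytes must look like 10xxxxxx. */
		if ((buf[i] & 0xc0) != 0x80)
			return (0);
		cp = (cp << 6) | (buf[i] & 0x3f);
	}
	if (cp < min || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff))
		return (0);
	*uc = cp;
	return (need);
}
```

For example, the two bytes 0xCE 0xB1 decode to U+03B1 regardless of what LANG or LC_ALL are set to, so the width decision never depends on the C library's idea of the current character set.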



On Mon, May 18, 2015 at 01:59:11AM +0900, Kohei Suzuki wrote:
The widths of several characters depend on the locale; Unicode describes
this with the East Asian Width property. For instance, U+03B1 (GREEK SMALL
LETTER ALPHA) has width 1 in most locales, but width 2 in some East Asian
locales (e.g. Japanese).
 
Kohei Suzuki
eagle...@gmail.com

 From 6324eb0bef76c9ec0d214cefa433efa1493f1845 Mon Sep 17 00:00:00 2001
 From: Kohei Suzuki <eagle...@gmail.com>
 Date: Mon, 18 May 2015 01:28:29 +0900
 Subject: [PATCH] Use wcwidth(1) to calculate character width
 
 The widths of several characters depend on the locale; Unicode describes
 this with the East Asian Width property. For instance, U+03B1 (GREEK SMALL
 LETTER ALPHA) has width 1 in most locales, but width 2 in some East Asian
 locales (e.g. Japanese).
 ---
  server.c |   1 -
  tmux.h   |   1 -
  utf8.c   | 239 ++-
  3 files changed, 6 insertions(+), 235 deletions(-)
 
 diff --git a/server.c b/server.c
 index 738abe1..3c164ff 100644
 --- a/server.c
 +++ b/server.c
 @@ -145,7 +145,6 @@ server_start(int lockfd, char *lockfile)
   TAILQ_INIT(&session_groups);
   mode_key_init_trees();
   key_bindings_init();
 - utf8_build();
  
   start_time = time(NULL);
   log_debug("socket path %s", socket_path);
 diff --git a/tmux.h b/tmux.h
 index 054a859..7a6265a 100644
 --- a/tmux.h
 +++ b/tmux.h
 @@ -2286,7 +2286,6 @@ void session_group_synchronize1(struct session *, struct session *);
  void  session_renumber_windows(struct session *);
  
  /* utf8.c */
 -void  utf8_build(void);
  void  utf8_set(struct utf8_data *, u_char);
  int   utf8_open(struct utf8_data *, u_char);
  int   utf8_append(struct utf8_data *, u_char);
 diff --git a/utf8.c b/utf8.c
 index 76b4846..4a84f20 100644
 --- a/utf8.c
 +++ b/utf8.c
 @@ -20,184 +20,10 @@
  
  #include <stdlib.h>
  #include <string.h>
 +#include <wchar.h>
  
  #include "tmux.h"
  
 -struct utf8_width_entry {
 - u_int   first;
 - u_int   last;
 -
 - int width;
 -
 - struct utf8_width_entry *left;
 - struct utf8_width_entry *right;
 -};
 -
 -/* Random order. Not optimal but it'll do for now... */
 -struct utf8_width_entry utf8_width_table[] = {
 - { 0x00951, 0x00954, 0, NULL, NULL },
 - { 0x00ccc, 0x00ccd, 0, NULL, NULL },
 - { 0x0fff9, 0x0fffb, 0, NULL, NULL },
 - { 0x2, 0x2fffd, 2, NULL, NULL },
 - { 0x00ebb, 0x00ebc, 0, NULL, NULL },
 - { 0x01932, 0x01932, 0, NULL, NULL },
 - { 0x0070f, 0x0070f, 0, NULL, NULL },
 - { 0x00a70, 0x00a71, 0, NULL, NULL },
 - { 0x02329, 0x02329, 2, NULL, NULL },
 - { 0x00acd, 0x00acd, 0, NULL, NULL },
 - { 0x00ac7, 0x00ac8, 0, NULL, NULL },
 - { 0x00a3c, 0x00a3c, 0, NULL, NULL },
 - { 0x009cd, 0x009cd, 0, NULL, NULL },
 - { 0x00591, 0x005bd, 0, NULL, NULL },
 - { 0x01058, 0x01059, 0, NULL, NULL },
 - { 0x0ffe0, 0x0ffe6, 2, NULL, NULL },
 - { 0x01100, 0x0115f, 2, NULL, NULL },
 - { 0x0fe20, 0x0fe23, 0, NULL, NULL },
 - { 0x0302a, 0x0302f, 0, NULL, NULL },
 - { 0x01772, 0x01773, 0, NULL, NULL },
 - { 0x005bf, 0x005bf, 0, NULL, NULL },
 - { 0x006ea, 0x006ed, 0, NULL, NULL },
 - { 0x00bc0, 0x00bc0, 0, NULL, NULL },
 - { 0x00962, 0x00963, 0, NULL, NULL },
 - { 0x01732, 0x01734, 0, NULL, NULL },
 - { 0x00d41, 0x00d43, 0, NULL, NULL },
 - { 0x01b42, 0x01b42, 0, NULL, NULL },
 - { 0x00a41, 0x00a42, 0, NULL, NULL },
 - { 0x00eb4, 0x00eb9, 0, NULL, NULL },
 - { 0x00b01, 0x00b01, 0, NULL, NULL },
 - { 0x00e34, 0x00e3a, 0, NULL, NULL },
 - { 0x03040, 0x03098, 2, NULL, NULL },
 - { 0x0093c, 0x0093c, 0, NULL, NULL },
 - { 0x00c4a, 0x00c4d, 0, NULL, NULL },
 - { 0x01032, 0x01032, 0, NULL, NULL },
 - { 0x00f37, 0x00f37, 0, NULL, NULL },
 - { 0x00901, 0x00902, 0, NULL, NULL },
 - { 0x00cbf, 0x00cbf, 0, NULL, NULL },
 - { 0x0a806, 0x0a806, 0, NULL, NULL },
 - { 0x00dd2, 0x00dd4, 0, NULL, NULL },
 - { 0x00f71, 0x00f7e, 0, NULL, NULL },
 - { 0x01752, 0x01753, 0, NULL, NULL },
 - { 0x1d242, 0x1d244, 0, NULL, NULL },
 - { 0x005c1, 0x005c2, 0, NULL, NULL },
 - { 0x0309b, 0x0a4cf, 2, NULL, NULL },
 - { 0xe0100, 0xe01ef, 0, NULL, NULL },
 - { 0x017dd, 0x017dd, 0, NULL, NULL },
 - { 0x00600, 0x00603, 0, NULL, NULL },
 - { 0x009e2, 0x009e3, 0, NULL, NULL },
 - { 0x00cc6, 0x00cc6, 0, NULL, NULL },
 - { 0x0a80b, 0x0a80b, 0, NULL, NULL },
 - { 0x01712, 0x01714, 0, NULL, NULL },
 - { 0x00b3c, 0x00b3c, 0, NULL, NULL },
 - { 0x01b00, 0x01b03, 0, NULL, NULL },
 - { 0x007eb, 0x007f3, 0, NULL, NULL },
 - { 0xe0001, 0xe0001, 0, NULL, NULL },
 - { 0x1d185, 0x1d18b, 0, NULL, NULL },
 - { 0x0feff,