(moved from elinks-users to elinks-dev because of the patch)

Karl Ove Hufthammer <k...@huftis.org> writes:

> When I use ELinks interactively, table borders are drawn using nice
> line-drawing characters. However, when I use ‘links --dump’, these are
> replaced by ugly -, | and + ASCII characters, even if I dump to UTF-8.
> Is there a way to retain the nice borders when dumping a Web page?

Not at the moment.  The attached patch for elinks-0.12
(20dfdb284f9a23742800fb5b4023bef54c6ad982) implements this, but
I'm not sure it is the right solution, because e.g. KOI8-R also
supports line-drawing characters, so the fix should preferably
not be specific to UTF-8.  Comments?
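One way to picture a fix that is not specific to UTF-8 is a small
per-charset table of frame glyphs with the ASCII characters as the
fallback.  The sketch below is not ELinks code and not part of the
patch: the names sketch_frame, sketch_sets and sketch_frame_bytes()
are hypothetical, only three glyphs are covered, and the charset
names are assumed to arrive in lowercase.

```c
#include <string.h>

/* Hypothetical frame kinds; a real table would cover all 48 cells. */
enum sketch_frame { SK_HLINE, SK_VLINE, SK_CROSS, SK_FRAME_COUNT };

struct sketch_frame_set {
	const char *charset;                 /* lowercase charset name */
	const char *glyph[SK_FRAME_COUNT];   /* encoded bytes per frame kind */
};

/* U+2500, U+2502 and U+253C as encoded in each charset. */
static const struct sketch_frame_set sketch_sets[] = {
	{ "utf-8",  { "\xE2\x94\x80", "\xE2\x94\x82", "\xE2\x94\xBC" } },
	{ "koi8-r", { "\x80",         "\x81",         "\x8A" } },
};

/* Fallback for charsets without line-drawing characters. */
static const char *const sketch_ascii[SK_FRAME_COUNT] = { "-", "|", "+" };

/* Return the bytes to write for one frame cell in the given charset. */
static const char *
sketch_frame_bytes(const char *charset, enum sketch_frame f)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_sets) / sizeof(sketch_sets[0]); i++) {
		if (!strcmp(sketch_sets[i].charset, charset))
			return sketch_sets[i].glyph[f];
	}

	return sketch_ascii[f];
}
```

With such a table, a KOI8-R dump would get the single byte 0x80 for a
horizontal line, while a charset the table does not know would still
get "-".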
From 827a77a6e5fad1f4dc69909090bf07fb7b84ee51 Mon Sep 17 00:00:00 2001
From: Kalle Olavi Niemitalo <k...@iki.fi>
Date: Tue, 9 Jun 2009 01:48:42 +0300
Subject: [PATCH] Line-drawing characters in UTF-8 dumps

When dumping the document to a file, ELinks used to represent lines in
tables and HR elements as ASCII -+| characters.  Now, if the output
charset is UTF-8, it uses Unicode line-drawing characters instead.
This change affects elinks --dump and File -> Save formatted document,
but not the Lua current_document_formatted function.
---
 NEWS                               |    2 +
 src/terminal/screen.c              |    2 +-
 src/terminal/terminal.h            |    1 +
 src/viewer/dump/dump-specialized.h |   39 +++++++++++++++++++++++------------
 4 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/NEWS b/NEWS
index a84f3f9..f407c7c 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,8 @@ includes the changes listed under ``ELinks 0.11.6.GIT now'' below.
 * minor bug 1017: To work around HTTP server bugs, disable
   protocol.http.compression by default, until ELinks can report
   decompression errors or automatically retry the connection.
+* enhancement: ``--dump'' and ``Save formatted document'' output
+  line-drawing characters if using UTF-8.
 
 ELinks 0.12pre4:
 ----------------
diff --git a/src/terminal/screen.c b/src/terminal/screen.c
index 8f838a6..34c93d8 100644
--- a/src/terminal/screen.c
+++ b/src/terminal/screen.c
@@ -41,7 +41,7 @@ static const unsigned char frame_vt100[48] = "aaaxuuukkuxkjjjkmvwtqnttmlvwtqnvvw
  * characters encoded in CP437.
  * When UTF-8 I/O is enabled, ELinks uses this array instead of
  * ::frame_vt100[], and converts the characters from CP437 to UTF-8.  */
-static const unsigned char frame_vt100_u[48] = {
+const unsigned char frame_vt100_u[48] = {
 	177, 177, 177, 179, 180, 180, 180, 191,
 	191, 180, 179, 191, 217, 217, 217, 191,
 	192, 193, 194, 195, 196, 197, 195, 195,
diff --git a/src/terminal/terminal.h b/src/terminal/terminal.h
index c2c1d79..3bf9d19 100644
--- a/src/terminal/terminal.h
+++ b/src/terminal/terminal.h
@@ -166,6 +166,7 @@ extern LIST_OF(struct terminal) terminals;
 
 
 extern const unsigned char frame_dumb[];
+extern const unsigned char frame_vt100_u[];
 
 struct terminal *init_term(int, int);
 void destroy_terminal(struct terminal *);
diff --git a/src/viewer/dump/dump-specialized.h b/src/viewer/dump/dump-specialized.h
index f60aeed..6d21839 100644
--- a/src/viewer/dump/dump-specialized.h
+++ b/src/viewer/dump/dump-specialized.h
@@ -41,6 +41,9 @@ DUMP_FUNCTION_SPECIALIZED(struct document *document, int fd,
 	unsigned char *background = &color[3];
 	int width = get_opt_int("document.dump.width");
 #endif /* DUMP_COLOR_MODE_TRUE */
+#ifdef DUMP_CHARSET_UTF8
+	const int cp437 = get_cp_index("cp437");
+#endif
 
 	for (y = 0; y < document->height; y++) {
 		int white = 0;
@@ -105,23 +108,11 @@ DUMP_FUNCTION_SPECIALIZED(struct document *document, int fd,
 
 			c = document->data[y].chars[x].data;
 
+#ifndef DUMP_CHARSET_UTF8
 			if ((attr & SCREEN_ATTR_FRAME)
 			    && c >= 176 && c < 224)
 				c = frame_dumb[c - 176];
-#ifdef DUMP_CHARSET_UTF8
-			else {
-				unsigned char *utf8_buf = encode_utf8(c);
-
-				while (*utf8_buf) {
-					if (write_char(*utf8_buf++,
-						       fd, buf, &bptr)) return -1;
-				}
-
-				x += unicode_to_cell(c) - 1;
-
-				continue;
-			}
-#endif /* DUMP_CHARSET_UTF8 */
+#endif /* !DUMP_CHARSET_UTF8 */
 
 			if (c <= ' ') {
 				/* Count spaces. */
@@ -136,10 +127,30 @@ DUMP_FUNCTION_SPECIALIZED(struct document *document, int fd,
 				white--;
 			}
 
+#ifdef DUMP_CHARSET_UTF8
+			if ((attr & SCREEN_ATTR_FRAME)
+			    && c >= 176 && c < 224)
+				c = cp2u(cp437, frame_vt100_u[c - 176]);
+
+			{
+				unsigned char *utf8_buf = encode_utf8(c);
+
+				while (*utf8_buf) {
+					if (write_char(*utf8_buf++,
+						       fd, buf, &bptr)) return -1;
+				}
+
+				x += unicode_to_cell(c) - 1;
+
+				continue;
+			}
+#else /* !DUMP_CHARSET_UTF8 */
 			/* Print normal char. */
 			if (write_char(c, fd, buf, &bptr))
 				return -1;
+#endif /* !DUMP_CHARSET_UTF8 */
 		}
+
 #if defined(DUMP_COLOR_MODE_16) || defined(DUMP_COLOR_MODE_256) || defined(DUMP_COLOR_MODE_TRUE)
 		for (;x < width; x++) {
 			if (write_char(' ', fd, buf, &bptr))
-- 
1.6.3.1.25.g0f3af
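For readers who do not want to trace the hunks: for a frame cell, the
UTF-8 branch of the patch does two things.  It translates the CP437
frame byte to a Unicode code point, then writes that code point as
UTF-8.  Below is a standalone sketch of those two steps.  It does not
call the ELinks functions cp2u() or encode_utf8();
sketch_cp437_frame_to_unicode() and sketch_encode_utf8() are
hypothetical stand-ins, and the table covers only a few single-line
frame bytes.

```c
#include <stddef.h>

/* Hypothetical stand-in for cp2u(cp437, c), limited to a few
 * single-line frame bytes.  Returns 0 for bytes not covered here. */
static unsigned int
sketch_cp437_frame_to_unicode(unsigned char c)
{
	switch (c) {
	case 179: return 0x2502; /* vertical line */
	case 180: return 0x2524; /* vertical and left */
	case 191: return 0x2510; /* down and left corner */
	case 192: return 0x2514; /* up and right corner */
	case 193: return 0x2534; /* up and horizontal */
	case 194: return 0x252C; /* down and horizontal */
	case 195: return 0x251C; /* vertical and right */
	case 196: return 0x2500; /* horizontal line */
	case 197: return 0x253C; /* cross */
	case 217: return 0x2518; /* up and left corner */
	case 218: return 0x250C; /* down and right corner */
	default:  return 0;
	}
}

/* Hypothetical stand-in for encode_utf8(): encode a code point below
 * U+10000 into buf (at least 3 bytes) and return the byte count. */
static size_t
sketch_encode_utf8(unsigned int u, unsigned char *buf)
{
	if (u < 0x80) {
		buf[0] = (unsigned char) u;
		return 1;
	}
	if (u < 0x800) {
		buf[0] = (unsigned char) (0xC0 | (u >> 6));
		buf[1] = (unsigned char) (0x80 | (u & 0x3F));
		return 2;
	}
	buf[0] = (unsigned char) (0xE0 | (u >> 12));
	buf[1] = (unsigned char) (0x80 | ((u >> 6) & 0x3F));
	buf[2] = (unsigned char) (0x80 | (u & 0x3F));
	return 3;
}
```

For example, frame byte 196 maps to U+2500, which is encoded as the
three bytes E2 94 80.  Without the patch, the same cell was written as
a single ASCII "-".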
_______________________________________________
elinks-dev mailing list
elinks-dev@linuxfromscratch.org
http://linuxfromscratch.org/mailman/listinfo/elinks-dev