(moved from elinks-users to elinks-dev because of the patch)
Karl Ove Hufthammer <k...@huftis.org> writes:
> When I use ELinks interactively, table borders are drawn using nice
> line-drawing characters. However, when I use ‘links --dump’, these are
> replaced by ugly -, | and + ASCII characters, even if I dump to UTF-8.
> Is there a way to retain the nice borders when dumping a Web page?
Not at the moment. The attached patch for elinks-0.12
(20dfdb284f9a23742800fb5b4023bef54c6ad982) implements this, but
I'm not sure it is the right solution, because e.g. KOI8-R also
supports line-drawing characters so the fix should preferably
not be specific to UTF-8. Comments?
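
To make the charset question concrete, here is a minimal standalone sketch (not part of the patch, and not ELinks code) of the two steps involved: mapping a CP437 frame byte to a Unicode code point, then encoding that code point for the output charset, UTF-8 in this case. The names cp437_frame_to_unicode and encode_utf8_bmp are hypothetical, and the table covers only a handful of the 48 frame characters. A charset-neutral fix would presumably keep the first step and replace only the second with a per-charset conversion.

```c
#include <stddef.h>

/* Hypothetical helper: map a few CP437 line-drawing bytes to Unicode
 * code points.  Returns 0 for bytes this sketch does not cover. */
static unsigned int
cp437_frame_to_unicode(unsigned char c)
{
	switch (c) {
	case 179: return 0x2502; /* vertical line */
	case 180: return 0x2524; /* vertical and left */
	case 191: return 0x2510; /* down and left corner */
	case 192: return 0x2514; /* up and right corner */
	case 193: return 0x2534; /* up and horizontal */
	case 194: return 0x252C; /* down and horizontal */
	case 195: return 0x251C; /* vertical and right */
	case 196: return 0x2500; /* horizontal line */
	case 197: return 0x253C; /* cross */
	case 217: return 0x2518; /* up and left corner */
	case 218: return 0x250C; /* down and right corner */
	default:  return 0;
	}
}

/* Hypothetical helper: encode a code point below U+10000 as UTF-8.
 * Writes a NUL-terminated sequence to out and returns its length
 * in bytes, not counting the terminator. */
static size_t
encode_utf8_bmp(unsigned int u, unsigned char out[4])
{
	size_t n = 0;

	if (u < 0x80) {
		out[n++] = (unsigned char) u;
	} else if (u < 0x800) {
		out[n++] = (unsigned char) (0xC0 | (u >> 6));
		out[n++] = (unsigned char) (0x80 | (u & 0x3F));
	} else {
		out[n++] = (unsigned char) (0xE0 | (u >> 12));
		out[n++] = (unsigned char) (0x80 | ((u >> 6) & 0x3F));
		out[n++] = (unsigned char) (0x80 | (u & 0x3F));
	}
	out[n] = 0;
	return n;
}
```

For a KOI8-R dump, the same code points would instead go through a Unicode-to-KOI8-R lookup, which is why keying the change on UTF-8 alone may be too narrow.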
From 827a77a6e5fad1f4dc69909090bf07fb7b84ee51 Mon Sep 17 00:00:00 2001
From: Kalle Olavi Niemitalo <k...@iki.fi>
Date: Tue, 9 Jun 2009 01:48:42 +0300
Subject: [PATCH] Line-drawing characters in UTF-8 dumps
When dumping the document to a file, ELinks used to represent lines in
tables and HR elements as ASCII -+| characters. Now, if the output
charset is UTF-8, it uses Unicode line-drawing characters instead.
This change affects elinks --dump and File -> Save formatted document,
but not the Lua current_document_formatted function.
---
 NEWS                               |    2 +
 src/terminal/screen.c              |    2 +-
 src/terminal/terminal.h            |    1 +
 src/viewer/dump/dump-specialized.h |   39 +++
 4 files changed, 29 insertions(+), 15 deletions(-)
diff --git a/NEWS b/NEWS
index a84f3f9..f407c7c 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,8 @@ includes the changes listed under ``ELinks 0.11.6.GIT now''
below.
* minor bug 1017: To work around HTTP server bugs, disable
protocol.http.compression by default, until ELinks can report
decompression errors or automatically retry the connection.
+* enhancement: ``--dump'' and ``Save formatted document'' output
+ line-drawing characters if using UTF-8.
ELinks 0.12pre4:
diff --git a/src/terminal/screen.c b/src/terminal/screen.c
index 8f838a6..34c93d8 100644
--- a/src/terminal/screen.c
+++ b/src/terminal/screen.c
@@ -41,7 +41,7 @@ static const unsigned char frame_vt100[48] =
aaaxuuukkuxkjjjkmvwtqnttmlvwtqnvvw
* characters encoded in CP437.
* When UTF-8 I/O is enabled, ELinks uses this array instead of
* ::frame_vt100[], and converts the characters from CP437 to UTF-8. */
-static const unsigned char frame_vt100_u[48] = {
+const unsigned char frame_vt100_u[48] = {
177, 177, 177, 179, 180, 180, 180, 191,
191, 180, 179, 191, 217, 217, 217, 191,
192, 193, 194, 195, 196, 197, 195, 195,
diff --git a/src/terminal/terminal.h b/src/terminal/terminal.h
index c2c1d79..3bf9d19 100644
--- a/src/terminal/terminal.h
+++ b/src/terminal/terminal.h
@@ -166,6 +166,7 @@ extern LIST_OF(struct terminal) terminals;
extern const unsigned char frame_dumb[];
+extern const unsigned char frame_vt100_u[];
struct terminal *init_term(int, int);
void destroy_terminal(struct terminal *);
diff --git a/src/viewer/dump/dump-specialized.h b/src/viewer/dump/dump-specialized.h
index f60aeed..6d21839 100644
--- a/src/viewer/dump/dump-specialized.h
+++ b/src/viewer/dump/dump-specialized.h
@@ -41,6 +41,9 @@ DUMP_FUNCTION_SPECIALIZED(struct document *document, int fd,
unsigned char *background = color[3];
int width = get_opt_int("document.dump.width");
#endif /* DUMP_COLOR_MODE_TRUE */
+#ifdef DUMP_CHARSET_UTF8
+ const int cp437 = get_cp_index("cp437");
+#endif
for (y = 0; y < document->height; y++) {
int white = 0;
@@ -105,23 +108,11 @@ DUMP_FUNCTION_SPECIALIZED(struct document *document, int fd,
c = document->data[y].chars[x].data;
+#ifndef DUMP_CHARSET_UTF8
if ((attr & SCREEN_ATTR_FRAME)
&& c >= 176 && c < 224)
c = frame_dumb[c - 176];
-#ifdef DUMP_CHARSET_UTF8
- else {
- unsigned char *utf8_buf = encode_utf8(c);
-
- while (*utf8_buf) {
- if (write_char(*utf8_buf++,
- fd, buf, bptr)) return -1;
- }
-
- x += unicode_to_cell(c) - 1;
-
- continue;
- }
-#endif /* DUMP_CHARSET_UTF8 */
+#endif /* !DUMP_CHARSET_UTF8 */
if (c <= ' ') {
/* Count spaces. */
@@ -136,10 +127,30 @@ DUMP_FUNCTION_SPECIALIZED(struct document *document, int fd,
white--;
}
+#ifdef DUMP_CHARSET_UTF8
+ if ((attr & SCREEN_ATTR_FRAME)
+ && c >= 176 && c < 224)
+ c = cp2u(cp437, frame_vt100_u[c - 176]);
+
+ {
+