(moved from elinks-users to elinks-dev because of the patch)

Karl Ove Hufthammer <k...@huftis.org> writes:

> When I use ELinks interactively, table borders are drawn using nice 
> line-drawing characters. However, when I use ‘links --dump’, these are 
> replaced by ugly -, | and + ASCII characters, even if I dump to UTF-8. 
> Is there a way to retain the nice borders when dumping a Web page?

Not at the moment.  The attached patch for elinks-0.12
(20dfdb284f9a23742800fb5b4023bef54c6ad982) implements this, but
I'm not sure it is the right solution: other charsets such as
KOI8-R also support line-drawing characters, so the fix should
preferably not be specific to UTF-8.  Comments?
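
For readers who want the idea without the ELinks internals, here is a
minimal standalone sketch of the conversion the patch performs: a CP437
frame byte is mapped to a Unicode box-drawing code point and then
encoded as UTF-8.  The names frame_to_unicode and encode_utf8_sketch
are hypothetical and are not ELinks functions; the patch itself goes
through frame_vt100_u[], cp2u() and encode_utf8().  Only a handful of
glyphs are covered, and the '+' fallback is an assumption made for
this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of ELinks: map a CP437 frame byte to
 * the matching Unicode box-drawing code point.  Only a few glyphs are
 * listed; anything else falls back to ASCII '+' (an assumption made
 * for this sketch). */
uint32_t
frame_to_unicode(unsigned char c)
{
	switch (c) {
	case 179: return 0x2502; /* vertical line */
	case 191: return 0x2510; /* upper right corner */
	case 192: return 0x2514; /* lower left corner */
	case 196: return 0x2500; /* horizontal line */
	case 197: return 0x253C; /* cross */
	case 217: return 0x2518; /* lower right corner */
	case 218: return 0x250C; /* upper left corner */
	default:  return '+';
	}
}

/* Hypothetical helper, not the ELinks encode_utf8(): write code point
 * u as a NUL-terminated UTF-8 sequence into out and return its length
 * in bytes.  Assumes u <= 0x10FFFF and does not check for surrogates. */
size_t
encode_utf8_sketch(uint32_t u, char out[5])
{
	size_t n = 0;

	if (u < 0x80) {
		out[n++] = (char) u;
	} else if (u < 0x800) {
		out[n++] = (char) (0xC0 | (u >> 6));
		out[n++] = (char) (0x80 | (u & 0x3F));
	} else if (u < 0x10000) {
		out[n++] = (char) (0xE0 | (u >> 12));
		out[n++] = (char) (0x80 | ((u >> 6) & 0x3F));
		out[n++] = (char) (0x80 | (u & 0x3F));
	} else {
		out[n++] = (char) (0xF0 | (u >> 18));
		out[n++] = (char) (0x80 | ((u >> 12) & 0x3F));
		out[n++] = (char) (0x80 | ((u >> 6) & 0x3F));
		out[n++] = (char) (0x80 | (u & 0x3F));
	}
	out[n] = '\0';
	return n;
}
```

A dump loop would call frame_to_unicode() only for cells that carry the
frame attribute and then write the bytes from encode_utf8_sketch() to
the output.  A charset-independent fix, as wished for above, would
replace the second step with a conversion to whatever the output
charset is.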

From 827a77a6e5fad1f4dc69909090bf07fb7b84ee51 Mon Sep 17 00:00:00 2001
From: Kalle Olavi Niemitalo <k...@iki.fi>
Date: Tue, 9 Jun 2009 01:48:42 +0300
Subject: [PATCH] Line-drawing characters in UTF-8 dumps

When dumping the document to a file, ELinks used to represent lines in
tables and HR elements as ASCII -+| characters.  Now, if the output
charset is UTF-8, it uses Unicode line-drawing characters instead.
This change affects elinks --dump and File -> Save formatted document,
but not the Lua current_document_formatted function.
---
 NEWS                               |    2 +
 src/terminal/screen.c              |    2 +-
 src/terminal/terminal.h            |    1 +
 src/viewer/dump/dump-specialized.h |   39 +++++++++++++++++++++++------------
 4 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/NEWS b/NEWS
index a84f3f9..f407c7c 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,8 @@ includes the changes listed under ``ELinks 0.11.6.GIT now'' below.
 * minor bug 1017: To work around HTTP server bugs, disable
   protocol.http.compression by default, until ELinks can report
   decompression errors or automatically retry the connection.
+* enhancement: ``--dump'' and ``Save formatted document'' output
+  line-drawing characters if using UTF-8.
 
 ELinks 0.12pre4:
 ----------------
diff --git a/src/terminal/screen.c b/src/terminal/screen.c
index 8f838a6..34c93d8 100644
--- a/src/terminal/screen.c
+++ b/src/terminal/screen.c
@@ -41,7 +41,7 @@ static const unsigned char frame_vt100[48] = "aaaxuuukkuxkjjjkmvwtqnttmlvwtqnvvw
  * characters encoded in CP437.
  * When UTF-8 I/O is enabled, ELinks uses this array instead of
  * ::frame_vt100[], and converts the characters from CP437 to UTF-8.  */
-static const unsigned char frame_vt100_u[48] = {
+const unsigned char frame_vt100_u[48] = {
        177, 177, 177, 179, 180, 180, 180, 191,
        191, 180, 179, 191, 217, 217, 217, 191,
        192, 193, 194, 195, 196, 197, 195, 195,
diff --git a/src/terminal/terminal.h b/src/terminal/terminal.h
index c2c1d79..3bf9d19 100644
--- a/src/terminal/terminal.h
+++ b/src/terminal/terminal.h
@@ -166,6 +166,7 @@ extern LIST_OF(struct terminal) terminals;
 
 
 extern const unsigned char frame_dumb[];
+extern const unsigned char frame_vt100_u[];
 
 struct terminal *init_term(int, int);
 void destroy_terminal(struct terminal *);
diff --git a/src/viewer/dump/dump-specialized.h b/src/viewer/dump/dump-specialized.h
index f60aeed..6d21839 100644
--- a/src/viewer/dump/dump-specialized.h
+++ b/src/viewer/dump/dump-specialized.h
@@ -41,6 +41,9 @@ DUMP_FUNCTION_SPECIALIZED(struct document *document, int fd,
        unsigned char *background = &color[3];
        int width = get_opt_int("document.dump.width");
 #endif /* DUMP_COLOR_MODE_TRUE */
+#ifdef DUMP_CHARSET_UTF8
+       const int cp437 = get_cp_index("cp437");
+#endif
 
        for (y = 0; y < document->height; y++) {
                int white = 0;
@@ -105,23 +108,11 @@ DUMP_FUNCTION_SPECIALIZED(struct document *document, int fd,
 
                        c = document->data[y].chars[x].data;
 
+#ifndef DUMP_CHARSET_UTF8
                        if ((attr & SCREEN_ATTR_FRAME)
                            && c >= 176 && c < 224)
                                c = frame_dumb[c - 176];
-#ifdef DUMP_CHARSET_UTF8
-                       else {
-                               unsigned char *utf8_buf = encode_utf8(c);
-
-                               while (*utf8_buf) {
-                                       if (write_char(*utf8_buf++,
-                                               fd, buf, &bptr)) return -1;
-                               }
-
-                               x += unicode_to_cell(c) - 1;
-
-                               continue;
-                       }
-#endif /* DUMP_CHARSET_UTF8 */
+#endif /* !DUMP_CHARSET_UTF8 */
 
                        if (c <= ' ') {
                                /* Count spaces. */
@@ -136,10 +127,30 @@ DUMP_FUNCTION_SPECIALIZED(struct document *document, int fd,
                                white--;
                        }
 
+#ifdef DUMP_CHARSET_UTF8
+                       if ((attr & SCREEN_ATTR_FRAME)
+                           && c >= 176 && c < 224)
+                               c = cp2u(cp437, frame_vt100_u[c - 176]);
+
+                       {
+                               unsigned char *utf8_buf = encode_utf8(c);
+
+                               while (*utf8_buf) {
+                                       if (write_char(*utf8_buf++,
+                                               fd, buf, &bptr)) return -1;
+                               }
+
+                               x += unicode_to_cell(c) - 1;
+
+                               continue;
+                       }
+#else  /* !DUMP_CHARSET_UTF8 */
                        /* Print normal char. */
                        if (write_char(c, fd, buf, &bptr))
                                return -1;
+#endif  /* !DUMP_CHARSET_UTF8 */
                }
+
 #if defined(DUMP_COLOR_MODE_16) || defined(DUMP_COLOR_MODE_256) || defined(DUMP_COLOR_MODE_TRUE)
                for (;x < width; x++) {
                        if (write_char(' ', fd, buf, &bptr))
-- 
1.6.3.1.25.g0f3af

_______________________________________________
elinks-dev mailing list
elinks-dev@linuxfromscratch.org
http://linuxfromscratch.org/mailman/listinfo/elinks-dev