Launchpad has imported 19 comments from the remote bug at
http://bugs.gentoo.org/show_bug.cgi?id=96376.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2005-06-17T06:39:14+00:00 World-root wrote:

I've had to convert a few RTF documents to text, and I noticed that unrtf
outputs an exclamation mark in place of each accented character.

Here's a patch that makes it produce valid UTF-8 text for any ANSI RTF
input file. Please test :-)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/0

------------------------------------------------------------------------
On 2005-06-17T06:40:38+00:00 World-root wrote:

Created attachment 61385
Patch to output ANSI RTF characters correctly

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/1

------------------------------------------------------------------------
On 2005-06-17T06:42:22+00:00 World-root wrote:

Created attachment 61386
Patch for the ebuild

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/2

------------------------------------------------------------------------
On 2005-06-20T13:25:38+00:00 Tove wrote:

Robin, do you want to take this bug?

Joël, did you send the patch to the upstream developers?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/3

------------------------------------------------------------------------
On 2005-06-20T14:54:12+00:00 World-root wrote:

No, not yet. Should I send it ?

(I suppose unrtf was written before a common encoding, UTF-8, was created. So now
that many people use UTF-8, I guess it's nice to put the extended characters to
good use)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/5

------------------------------------------------------------------------
On 2005-06-20T15:26:31+00:00 Tove wrote:

Let's wait for robbat2's comment. He's travelling for the next 2 weeks.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/6

------------------------------------------------------------------------
On 2005-07-02T14:05:04+00:00 Robin H. Johnson wrote:

please send this to upstream.
if they are unresponsive, then i'll just patch our ebuild, but i'd prefer it if 
they took it first.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/7

------------------------------------------------------------------------
On 2005-07-03T02:31:49+00:00 World-root wrote:

Robin,

Thanks for your response ! I'm trying to do it.

Three remarks though:
- I've just found a newer version: http://ftp.gnu.org/gnu/unrtf/0.19.7/
- [email protected] does not work
- there is a patch (text_french.patch) in the 0.19.7 package, which is similar
to mine, but only handles a few accents. I'll try to contact its author.

I'll let you know when I get something !

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/8

------------------------------------------------------------------------
On 2006-01-11T02:09:35+00:00 Gentoo-bugger wrote:

Any news on this?  I'm just trying the 3rd party kat ebuilds and they
contain an ebuild with this patch.  Would be cool if I needed one ebuild
less in my overlay :)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/9

------------------------------------------------------------------------
On 2006-01-11T02:14:24+00:00 Gentoo-bugger wrote:

I just saw that there's a new version 0.19.9 from last week, from the changelog:
| 0.19.4: added unicode support
| 0.19.5: removed defective PS support and non-free text files
|         more unicode support
|         improved symbol font support - no longer puts entities in latex output
|         Bug#266020 concerning double slashes fixed
|         Bug#269054 concerning Doctype fixed
|         Bug#287038 security breach fixed
|                 (thanks to Joey Hess <[email protected]>)
| 0.19.6: fix some latex problems
| 0.19.7: updated FSF address
| 0.19.8: minor fixes
| 0.19.9: included verbose mode

So it might be fixed in that version...

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/10

------------------------------------------------------------------------
On 2006-01-11T02:33:28+00:00 World-root wrote:

Hi,

Actually (before I made the patch) the authors did put an _unused_
"text_french.patch" file in unrtf 0.19.7 -- but their patch is
incomplete (see comment #7).

I sent an email containing the information, as well as a link to this bugzilla 
page, to the upstream developers on 3rd July 2005:
TO: [email protected], [email protected]
CC: [email protected]

I got no response so far.

I haven't looked at (or tried) unrtf 0.19.9 -- could you have a quick look
at the text.c file, to see what characters they added in the tables ?

Best Regards

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/11

------------------------------------------------------------------------
On 2006-01-11T03:03:28+00:00 Gentoo-bugger wrote:

unrtf has a project page at savannah, here [1].  There's both a bug and
a patch tracker; maybe you'll have more luck there.

[1] http://savannah.gnu.org/projects/unrtf/

It seems they added some, but not all, characters, and in a different way from
your solution:
mss@otherland ~/tmp $ diff -u unrtf-0.19.3/text.c unrtf_0.19.9/text.c
--- unrtf-0.19.3/text.c 2004-02-19 00:35:04.000000000 +0100
+++ unrtf_0.19.9/text.c 2006-01-06 22:56:06.000000000 +0100
@@ -1,7 +1,6 @@
-
 /*=============================================================================
    GNU UnRTF, a command-line program to convert RTF documents to other formats.
-   Copyright (C) 2000,2001 Zachary Thayer Smith
+   Copyright (C) 2000,2001,2004 by Zachary Smith

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -15,20 +14,25 @@

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
-   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+   Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

-   The author is reachable by electronic mail at [email protected].
+   The maintainer is reachable by electronic mail at [email protected]
 =============================================================================*/


 /*----------------------------------------------------------------------
  * Module name:    text
- * Author name:    Zach Smith
+ * Author name:    Zachary Smith
  * Create date:    19 Sep 01
  * Purpose:        Plain text output module
  *----------------------------------------------------------------------
  * Changes:
  * 22 Sep 01, [email protected]: added function-level comment blocks
+ * 29 Mar 05, [email protected]: changes requested by ZT Smith
+ * 14 Jun 05, [email protected]: higher Iso-Latin-1 characters
+ *             added - thanks to [email protected] and
+ *             [email protected]
+ * 23 Jul 05, [email protected]: added endash, emdash and bullet
  *--------------------------------------------------------------------*/


@@ -59,22 +63,24 @@

 static char*
 upper_translation_table [128] = {
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
-       "?", "?", "?", "?", "?", "?", "?", "?",
+/*        0    1    2    3    4    5    6    7 */
+/* 80 */ "?", "?", "?", "?", "?", "?", "?", "?",
+/* 88 */ "?", "?", "?", "?", "?", "?", "?", "?",
+/* 90 */ "?", "?", "?", "?", "?", "?", "?", "?",
+/* 98 */ "?", "?", "?", "?", "?", "?", "?", "?",
+/* A0 */ " ", "¡", "¢", "£", "¤", "¥", "¦", "§",
+/* A8 */ "¨", "©", "ª", "«", "¬", "­", "®", "¯",
+/* B0 */ "°", "±", "²", "³", "´", "µ", "¶", "·",
+/* B8 */ "¸", "¹", "º", "»", "¼", "½", "¾", "¿",
+/* C0 */ "À", "Á", "Â", "Ã", "Ä", "Å", "Æ", "Ç",
+/* C8 */ "È", "É", "Ê", "Ë", "Ì", "Í", "Î", "Ï",
+/* D0 */ "Ð", "Ñ", "Ò", "Ó", "Ô", "Õ", "Ö", "×",
+/* D8 */ "Ø", "Ù", "Ú", "Û", "Ü", "Ý", "Þ", "ß",
+/* E0 */ "à", "á", "â", "ã", "ä", "å", "æ", "ç",
+/* E8 */ "è", "é", "ê", "ë", "ì", "í", "î", "ï",
+/* F0 */ "ð", "ñ", "ò", "ó", "ô", "õ", "ö", "÷",
+/* F8 */ "ø", "ù", "ú", "û", "ü", "ý", "þ", "ÿ",
+/*        8    9    A    B    C    D    E    F */
 };


@@ -255,6 +261,11 @@
        text_op->chars.left_quote = "`";
        text_op->chars.right_dbl_quote = "''";
        text_op->chars.left_dbl_quote = "``";
+#if 1 /* daved - 0.19.8 */
+       text_op->chars.endash = "­"; /* not ASCII */
+       text_op->chars.emdash = "-";
+       text_op->chars.bullet = "·"; /* not ASCII */
+#endif

        return text_op;
 }


Reply at: 
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/13

------------------------------------------------------------------------
On 2006-01-11T07:10:37+00:00 World-root wrote:

Ah, this new patch looks good :-)

It handles everything, excluding values 0x80..0x9F. This may be because
that range of values is forbidden/reserved and cannot be found in
ANSI RTF anyway (I have no idea what the deal is with these 0x80..0x9F
values).

My only concern: filling the array in a C file with characters (instead
of hex value) could be a bit dangerous, depending on the compiler's
character set support (?)
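
To illustrate both points, here is a minimal sketch that keeps the C source
pure ASCII by storing numeric code points and computing the UTF-8 bytes,
rather than embedding literal characters. It assumes the "ANSI" input is
Windows-1252, where 0x80..0x9F are printable characters (Euro sign, curly
quotes, dashes) rather than the control codes ISO-8859-1 assigns to that
range. The names cp1252_high and cp1252_to_utf8 are hypothetical and are not
part of unrtf.

```c
#include <stddef.h>

/* Windows-1252 code points for bytes 0x80..0x9F; 0 marks an unassigned byte. */
static const unsigned short cp1252_high[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
};

/* Hypothetical helper: encode one Windows-1252 byte as UTF-8.
   Writes at most 3 bytes plus a terminating NUL into out and
   returns the number of bytes written (excluding the NUL). */
static size_t cp1252_to_utf8(unsigned char c, char out[4])
{
    unsigned cp = c;   /* 0x00..0x7F and 0xA0..0xFF map straight to Unicode */
    size_t n = 0;

    if (c >= 0x80 && c <= 0x9F) {
        cp = cp1252_high[c - 0x80];
        if (cp == 0)
            cp = '?';  /* unassigned byte: fall back to a question mark */
    }

    if (cp < 0x80) {
        out[n++] = (char)cp;
    } else if (cp < 0x800) {
        out[n++] = (char)(0xC0 | (cp >> 6));
        out[n++] = (char)(0x80 | (cp & 0x3F));
    } else {
        out[n++] = (char)(0xE0 | (cp >> 12));
        out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[n++] = (char)(0x80 | (cp & 0x3F));
    }
    out[n] = '\0';
    return n;
}
```

With this approach the result does not depend on the encoding in which the
compiler reads text.c, at the cost of always emitting UTF-8.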

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/14

------------------------------------------------------------------------
On 2006-02-16T17:38:41+00:00 Robin H. Johnson wrote:

I've just committed 0.19.9 to the tree; is the patch from this bug still
needed?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/15

------------------------------------------------------------------------
On 2006-02-17T04:30:35+00:00 World-root wrote:

I've just tried the 0.19.9 version.

Indeed, the patch I posted is not needed anymore, *but* please note that
unrtf will always output ISO-8859-1 text, regardless of the user's $LANG
setting. Not very good for pure UTF-8 users IMHO.

Ideal workaround: unrtf should iconv() the whole text at runtime, so the
output obeys the user's preferred encoding.
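
A rough sketch of what such a runtime conversion could look like, assuming
POSIX iconv(3) and nl_langinfo(3) are available. The helper names
latin1_to_charset and locale_charset_name are hypothetical; nothing like
them exists in unrtf 0.19.9 as far as I know.

```c
#include <iconv.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: convert inlen bytes of ISO-8859-1 text to the
   charset named by `to`. The result is NUL-terminated in out.
   Returns the number of bytes written, or (size_t)-1 on any error. */
static size_t latin1_to_charset(const char *to, const char *in, size_t inlen,
                                char *out, size_t outcap)
{
    iconv_t cd;
    char *inp = (char *)in;   /* iconv() takes char **, but does not modify the input */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft;
    size_t rc;

    if (outcap == 0)
        return (size_t)-1;
    outleft = outcap - 1;     /* keep one byte for the terminating NUL */

    cd = iconv_open(to, "ISO-8859-1");
    if (cd == (iconv_t)-1)
        return (size_t)-1;    /* conversion not supported on this system */

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;    /* output too small or unconvertible character */

    *outp = '\0';
    return (size_t)(outp - out);
}

/* Hypothetical helper: name of the charset selected by $LANG / $LC_ALL.
   Note that this changes the process-wide LC_CTYPE locale. */
static const char *locale_charset_name(void)
{
    setlocale(LC_CTYPE, "");
    return nl_langinfo(CODESET);
}
```

The text output module would then pass each translated string through
latin1_to_charset(locale_charset_name(), ...) before writing it, so the
encoding follows whoever runs unrtf rather than whoever built it.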

In the meantime, I suggest adding this as a first line in src_compile():

src_compile() {
    iconv -f ISO-8859-15 text.c >text.c.new && mv text.c.new text.c

This would detect the user's encoding at emerge time, which is better
than ignoring it completely. With this line added, unrtf outputs proper
UTF-8 text for me.

Since iconv is called without '-t' (target encoding) argument, it
*should* convert to the user's preferred encoding. It works for UTF-8 --
can someone please test with an ISO-8859 $LANG/$LC_ALL ? I have
userlocales and only UTF-8 locales built.

Thanks

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/16

------------------------------------------------------------------------
On 2006-02-20T01:06:24+00:00 Robin H. Johnson wrote:

I don't agree with using iconv like that.
My root user runs in a different $LANG than my regular user.
unrtf really must be made encoding-aware.

I'm going to close this for now, and I'd ask you to take it upstream
again. If you diff the old release with the new one, you'll see there is
a new maintainer, and hopefully he will be more responsive.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/17

------------------------------------------------------------------------
On 2006-02-20T08:13:41+00:00 World-root wrote:

He's from Australia, right ?

Ok, e-mail is sent (including of course, a link to this page) :-)

When something happens I'll report it here.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/comments/18


** Changed in: gentoo
       Status: Unknown => Fix Released

** Changed in: gentoo
   Importance: Unknown => Wishlist

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/290503

Title:
  Unrtf does not handle UTF-8 correctly. The version is rather old

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/unrtf/+bug/290503/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
