Hi François, Santiago!

Hopefully you won't mind if I add another comment here.
Note that with the program below, after the error, task->input.cursor
still points to the beginning of task->input.buffer, even though part of
the string has already been translated. AFAICS, this makes it impossible
to tell exactly which character caused the error, and which part of the
string has already been translated. Also, the info page says "Its value
will internally advance as the recoding goes, until it reaches the value
of `input.limit'.", which doesn't seem in line with this behaviour.

Hopefully, you'll be able to fix this (or point out to me what I'm doing
wrong).

Thanks!
Bas.

You wrote:

> Hello François.
>
> I just received this from the Debian bug system.
>
> What about the new recode release you were preparing? It would be wonderful
> if you could take a look at the list of bugs reported to Debian:
>
> http://bugs.debian.org/recode
>
> [ Note: The only bug not in forwarded state is #313925, you can ignore it,
> as I expect it to be fixed by the Free Translation Team ].
>
> ---------- Forwarded message ----------
> From: Bas Zoetekouw <[EMAIL PROTECTED]>
> To: Debian Bug Tracking System <[EMAIL PROTECTED]>
> Date: Thu, 19 Jan 2006 21:28:26 +0100
> Subject: Bug#348909: recode_perform_task() returns wrong error code
>
> Package: librecode0
> Version: 3.6-12
> Severity: normal
>
> According to the info page, recode_perform_task() should return the
> error code RECODE_UNTRANSLATABLE in task->error_so_far if the input
> contains characters that cannot be represented in the output charset.
>
> However, it returns RECODE_INVALID_INPUT when trying to translate
> certain chars from utf8 to latin1, even if the input is valid utf8.
>
> Below's an example C program that shows the bug. It tries to translate
> the string "á ç α ζ" from utf8 into latin1. The á and ç work fine,
> but it chokes on the alpha (as it should, because latin1 doesn't
> contain an alpha). However, the error code it returns is 4
> (==RECODE_INVALID_INPUT) instead of 3 (==RECODE_UNTRANSLATABLE).
>
> This bug obviously makes it impossible to distinguish between invalid
> inputs (which, in a user application, should throw an error) or
> characters that simply cannot be represented in the desired charset
> (which could be replaced by a ? for example).
>
> #include <stdio.h>
> #include <stdbool.h>
> #include <recodext.h>
> #include <string.h>
>
> int
> main ()
> {
>   /* utf8 test string: 2 chars ('a, ,c) representable in latin1,
>    * followed by 2 chars (alpha, zeta) that cannot be represented
>    * in latin1 */
>   char greek_utf_str[] = "\303\241 \303\247 \316\261 \316\266";
>   char buf[100] = "";
>
>   RECODE_OUTER outer = recode_new_outer (false);
>   RECODE_REQUEST request = recode_new_request (outer);
>   RECODE_TASK task;
>   bool success;
>
>   recode_scan_request (request, "utf-8..latin1");
>
>   task = recode_new_task (request);
>   task->input.buffer = &(greek_utf_str[0]);
>   task->input.cursor = task->input.buffer;
>   task->input.limit = task->input.buffer + sizeof(greek_utf_str);
>   task->output.buffer = &(buf[0]);
>   task->output.cursor = task->output.buffer;
>   task->output.limit = task->output.buffer + sizeof(buf);
>
>   success = recode_perform_task (task);
>
>   printf("task completed with error %i\n", task->error_so_far);
>   printf("output buffer: ");
>   while (task->output.buffer < task->output.cursor) {
>     printf("%02X ", (unsigned char) *(task->output.buffer++));
>   }
>   printf("\n");
>
>   return 0;
> }
>
>
> -- System Information:
> Debian Release: testing/unstable
>   APT prefers unstable
>   APT policy: (500, 'unstable'), (1, 'experimental')
> Architecture: i386 (i686)
> Shell: /bin/sh linked to /bin/bash
> Kernel: Linux 2.6.14.3
> Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
>
> Versions of packages librecode0 depends on:
> ii  libc6  2.3.5-6  GNU C Library: Shared libraries an
>
> librecode0 recommends no packages.
>
> -- no debconf information
>

-- 
+--------------------------------------------------------------------+
| Bas Zoetekouw              | GPG key: 0644fab7                     |
|----------------------------| Fingerprint: c1f5 f24c d514 3fec 8bf6 |
| [EMAIL PROTECTED], [EMAIL PROTECTED] | a2b1 2bae e41f 0644 fab7 |
+--------------------------------------------------------------------+
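
P.S. To make the expected behaviour concrete, here is a minimal standalone
sketch. It does not use librecode at all: the status codes, the result
struct and the function name are hypothetical and exist only for this
illustration. It shows a UTF-8 to Latin-1 conversion that reports
"untranslatable" separately from "invalid input", and that records how far
the input was consumed, so the caller can locate the offending character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch; NOT librecode's enum. */
enum sketch_status {
  SKETCH_OK,
  SKETCH_UNTRANSLATABLE,  /* valid UTF-8, but no Latin-1 equivalent */
  SKETCH_INVALID_INPUT,   /* malformed UTF-8 */
  SKETCH_OUTPUT_FULL
};

struct sketch_result {
  enum sketch_status status;
  size_t consumed;  /* input bytes translated before stopping */
  size_t produced;  /* output bytes written */
};

/* Convert UTF-8 to Latin-1, stopping at the first problem.  On return,
 * in + consumed points at the first character that was NOT translated. */
static struct sketch_result
utf8_to_latin1_sketch (const unsigned char *in, size_t in_len,
                       unsigned char *out, size_t out_cap)
{
  struct sketch_result r = { SKETCH_OK, 0, 0 };

  assert (in != NULL && out != NULL);

  while (r.consumed < in_len) {
    unsigned char b0 = in[r.consumed];
    unsigned long cp;
    size_t len, i;

    /* Decode the lead byte: sequence length and initial payload bits. */
    if (b0 < 0x80)                     { cp = b0;        len = 1; }
    else if (b0 >= 0xC2 && b0 <= 0xDF) { cp = b0 & 0x1F; len = 2; }
    else if (b0 >= 0xE0 && b0 <= 0xEF) { cp = b0 & 0x0F; len = 3; }
    else if (b0 >= 0xF0 && b0 <= 0xF4) { cp = b0 & 0x07; len = 4; }
    else { r.status = SKETCH_INVALID_INPUT; return r; }

    /* A sequence cut off by the end of the input is malformed. */
    if (in_len - r.consumed < len) {
      r.status = SKETCH_INVALID_INPUT;
      return r;
    }

    /* Every continuation byte must look like 10xxxxxx. */
    for (i = 1; i < len; i++) {
      unsigned char b = in[r.consumed + i];
      if ((b & 0xC0) != 0x80) {
        r.status = SKETCH_INVALID_INPUT;
        return r;
      }
      cp = (cp << 6) | (b & 0x3F);
    }

    /* Reject overlong forms, surrogates and out-of-range values. */
    if ((len == 3 && cp < 0x800) || (len == 4 && cp < 0x10000)
        || (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
      r.status = SKETCH_INVALID_INPUT;
      return r;
    }

    /* Well-formed, but outside Latin-1: a different kind of error. */
    if (cp > 0xFF) {
      r.status = SKETCH_UNTRANSLATABLE;
      return r;
    }

    if (r.produced == out_cap) {
      r.status = SKETCH_OUTPUT_FULL;
      return r;
    }

    out[r.produced++] = (unsigned char) cp;
    r.consumed += len;
  }
  return r;
}
```

Called on the test string from the report (without the trailing NUL), this
sketch stops with SKETCH_UNTRANSLATABLE and consumed == 6, i.e. pointing
at the alpha, after writing the four bytes E1 20 E7 20. That is the kind
of information I'd hope to get from task->error_so_far and
task->input.cursor.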

