Thank you very much for reporting this problem. I have fixed it, though not in the same way you did. Please see the attached patch.

There were two problems. You are correct that the loop was going one ku too far: although ku is 1-origin, the loop index is 0-origin, so the loop should use a < test instead of a <= test. The same applies to the ten loop.
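To make the off-by-one concrete, here is a minimal sketch that is not part of the patch. The helper name `count_out_of_range` and its parameters are hypothetical; it only assumes, as the loops in utf8.c do, that the table is indexed as `ku * max_ten + ten` and holds `max_ku * max_ten` entries.

```c
#include <stddef.h>

/* Hypothetical helper, not from utf8.c: counts the (ku, ten) pairs whose
 * flattened index ku * max_ten + ten lands outside a table holding
 * max_ku * max_ten entries.  inclusive != 0 mimics the old <= tests,
 * inclusive == 0 mimics the corrected < tests. */
static size_t count_out_of_range(unsigned max_ku, unsigned max_ten,
                                 int inclusive)
{
    size_t size = (size_t) max_ku * max_ten;
    unsigned ku_end = inclusive ? max_ku + 1 : max_ku;
    unsigned ten_end = inclusive ? max_ten + 1 : max_ten;
    size_t bad = 0;
    unsigned ku, ten;

    for (ku = 0; ku < ku_end; ku++)
        for (ten = 0; ten < ten_end; ten++)
            if ((size_t) ku * max_ten + ten >= size)
                bad++;          /* index is past the end of the table */
    return bad;
}
```

For a 2 x 3 table, `count_out_of_range(2, 3, 0)` is 0, while `count_out_of_range(2, 3, 1)` is 5. With the <= tests, `ten == max_ten` also lands on the first entry of the following row, so even the in-range reads can pick up the wrong character.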

The other problem was that the technique that is appropriate for Japanese EUC was also being used for Korean and Taiwanese EUC. You worked around this by replacing the addition of 0x8080 with a bitwise OR. Although this solves the problem for standard Korean characters, it is not quite right and is definitely wrong for Taiwanese.
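The difference between the add and the OR can be seen in a small sketch. It is not code from utf8.c: `fold_add` and `fold_or` are hypothetical helpers, and the sketch assumes that the first byte is held as a 7-bit value while the second byte may already be an 8-bit value or may legitimately sit in the lower half.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not functions from utf8.c. */

/* Set the high bits by adding 0x8080 (the original technique). */
static uint16_t fold_add(uint16_t code)
{
    return (uint16_t) (code + 0x8080u);
}

/* Set the high bits by OR-ing in 0x8080 (the workaround). */
static uint16_t fold_or(uint16_t code)
{
    return (uint16_t) (code | 0x8080u);
}
```

When both bytes are 7-bit, the two agree: `fold_add(0x2121)` and `fold_or(0x2121)` are both 0xa1a1. When the second byte is already in the upper half, the add carries into the first byte (`fold_add(0x21a1)` is 0xa221) while the OR gives 0xa1a1. When the second byte belongs in the lower half, the OR forces it into the upper half: `fold_or(0x2141)` is 0xa1c1 rather than 0xa141.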

The correct fix is to add 0x8080 for Japanese and to add 0x8000 for Korean and Taiwanese (since the second byte can be in both halves in Korean and Taiwanese).
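As a rough sketch of that rule, separate from the patch itself: the enum `sketch_type` and the function `sketch_encode` below are hypothetical stand-ins for the CT_EUC and CT_DBYTE/CT_DBYTE2 cases. The sketch assumes that the first byte is held as a 7-bit value, and that for the double-byte types the second byte is held at its real 8-bit value.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the CT_EUC and CT_DBYTE/CT_DBYTE2 cases. */
enum sketch_type { SKETCH_EUC, SKETCH_DBYTE };

/* Combine a first and second byte into a 16-bit code.  EUC sets the high
 * bit of both bytes; the double-byte types set it on the first byte only,
 * leaving the second byte free to fall in either half. */
static uint16_t sketch_encode(enum sketch_type type,
                              unsigned lead, unsigned trail)
{
    uint16_t offset = (type == SKETCH_EUC) ? 0x8080u : 0x8000u;

    return (uint16_t) ((lead << 8) + trail + offset);
}
```

Under those assumptions, `sketch_encode(SKETCH_EUC, 0x21, 0x21)` is 0xa1a1, `sketch_encode(SKETCH_DBYTE, 0x21, 0x41)` is 0xa141, and `sketch_encode(SKETCH_DBYTE, 0x21, 0xa1)` is 0xa1a1, so a second byte in either half survives unchanged.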

Could you please test this updated patch? I believe that it is correct, but since I do not understand Korean I can't be certain.

-- Mark --

http://staff.washington.edu/mrc
Science does not emerge from voting, party politics, or public debate.
Si vis pacem, para bellum.
*** utf8.c.old  Thu Apr  7 17:44:41 2005
--- utf8.c      Mon May  9 20:46:31 2005
***************
*** 10,16 ****
   *            Internet: [EMAIL PROTECTED]
   *
   * Date:      11 June 1997
!  * Last Edited:       7 April 2005
   * 
   * The IMAP toolkit provided in this Distribution is
   * Copyright 1988-2005 University of Washington.
--- 10,16 ----
   *            Internet: [EMAIL PROTECTED]
   *
   * Date:      11 June 1997
!  * Last Edited:       9 May 2005
   * 
   * The IMAP toolkit provided in this Distribution is
   * Copyright 1988-2005 University of Washington.
***************
*** 430,458 ****
        if (tab[i] != UBOGON) rmap[tab[i]] = (unsigned short) i;
        break;
      case CT_EUC:              /* 2 byte ASCII + utf8_eucparam base/CS2/CS3 */
-     case CT_DBYTE:            /* 2 byte ASCII + utf8_eucparam */
        for (param = (struct utf8_eucparam *) cs->tab,
           tab = (unsigned short *) param->tab,
!          ku = 0; ku <= param->max_ku; ku++)
!       for (ten = 0; ten <= param->max_ten; ten++)
          if ((u = tab[(ku * param->max_ten) + ten]) != UBOGON)
            rmap[u] = ((ku + param->base_ku) << 8) + (ten + param->base_ten) +
              0x8080;
        break;
      case CT_DBYTE2:           /* 2 byte ASCII + utf8_eucparam plane1/2 */
        for (param = (struct utf8_eucparam *) cs->tab,
           tab = (unsigned short *) param->tab,
!          ku = 0; ku <= param->max_ku; ku++)
!       for (ten = 0; ten <= param->max_ten; ten++)
          if ((u = tab[(ku * param->max_ten) + ten]) != UBOGON)
            rmap[u] = ((ku + param->base_ku) << 8) + (ten + param->base_ten) +
!             0x8080;
        param++;
!       for (ku = 0; ku <= param->max_ku; ku++)
!       for (ten = 0; ten <= param->max_ten; ten++)
          if ((u = tab[(ku * param->max_ten) + ten]) != UBOGON)
            rmap[u] = ((ku + param->base_ku) << 8) + (ten + param->base_ten) +
!             0x8080;
        break;
      case CT_SJIS:             /* 2 byte Shift-JIS */
        for (ku = 0; ku <= MAX_JIS0208_KU; ku++)
--- 430,466 ----
        if (tab[i] != UBOGON) rmap[tab[i]] = (unsigned short) i;
        break;
      case CT_EUC:              /* 2 byte ASCII + utf8_eucparam base/CS2/CS3 */
        for (param = (struct utf8_eucparam *) cs->tab,
           tab = (unsigned short *) param->tab,
!          ku = 0; ku < param->max_ku; ku++)
!       for (ten = 0; ten < param->max_ten; ten++)
          if ((u = tab[(ku * param->max_ten) + ten]) != UBOGON)
            rmap[u] = ((ku + param->base_ku) << 8) + (ten + param->base_ten) +
              0x8080;
        break;
+     case CT_DBYTE:            /* 2 byte ASCII + utf8_eucparam */
+       for (param = (struct utf8_eucparam *) cs->tab,
+          tab = (unsigned short *) param->tab,
+          ku = 0; ku < param->max_ku; ku++)
+       for (ten = 0; ten < param->max_ten; ten++)
+         if ((u = tab[(ku * param->max_ten) + ten]) != UBOGON)
+           rmap[u] = ((ku + param->base_ku) << 8) + (ten + param->base_ten) +
+             0x8000;
+       break;
      case CT_DBYTE2:           /* 2 byte ASCII + utf8_eucparam plane1/2 */
        for (param = (struct utf8_eucparam *) cs->tab,
           tab = (unsigned short *) param->tab,
!          ku = 0; ku < param->max_ku; ku++)
!       for (ten = 0; ten < param->max_ten; ten++)
          if ((u = tab[(ku * param->max_ten) + ten]) != UBOGON)
            rmap[u] = ((ku + param->base_ku) << 8) + (ten + param->base_ten) +
!             0x8000;
        param++;
!       for (ku = 0; ku < param->max_ku; ku++)
!       for (ten = 0; ten < param->max_ten; ten++)
          if ((u = tab[(ku * param->max_ten) + ten]) != UBOGON)
            rmap[u] = ((ku + param->base_ku) << 8) + (ten + param->base_ten) +
!             0x8000;
        break;
      case CT_SJIS:             /* 2 byte Shift-JIS */
        for (ku = 0; ku <= MAX_JIS0208_KU; ku++)
