>Unfortunately, I have a lot of experience and troubles with character
>set conversion.

Well, if you just bit the bullet and switched to UTF-8, you wouldn't have
all of these problems! :-)

>> Should we return the original bytes?
>
>It is not the best idea. Some sequences of bytes are control sequences
>for terminal. This sometimes set terminal in unusable state.

Seems fine to me.

>> An error? [..] Some string which says, "We cannot convert
>> klingon-8842 to us-ascii" or the equivalent?
>>
>
>In practice it means a spam in exotic language and at this point I know
>that I do not want to read such a message.

I can see that, but I'm not sure that's an appropriate choice for all
cases (like, for instance, MIME parameters).

>> - What to do when we cannot convert a particular character. This is a
>> little more clear; the general trend is to use a substitution
>> character.
>
>This is very frequent and causes a lot of troubles. Entire message in
>English and one foreign family name in original. Message send in utf-8
>but (suppose) my terminal support only ASCII. Conversion would fail.

Errr ... really? In the case I'm thinking, the one foreign family name
would have the offending character output as a '?' (or whatever). The
conversion would go through fine.

>In my personal opinion a very good choice is conversion into
>html-entities, like ą or ł . It remains quite readable and
>is still unique enough to convert it back in case of need.

Um, ouch. Unless there's a common library that already implements that
behavior, that's not on the table at all.

--Ken
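
For illustration only: the three outcomes discussed above (conversion
fails outright, the offending character becomes '?', or it becomes an
entity-style escape) can be sketched with Python's built-in codec error
handlers. This is not nmh code and not how nmh does its conversion; the
function name `convert_all_ways` and the sample name are made up for the
example. Note also that `xmlcharrefreplace` emits numeric character
references (such as `&#322;`), not named HTML entities.

```python
def convert_all_ways(text: str, charset: str = "ascii") -> dict:
    """Encode `text` into `charset` under three error policies.

    Hypothetical helper for illustration; the policy names are just
    labels for the options raised in the thread.
    """
    results = {}

    # Policy 1: refuse the whole conversion if any character is unmappable.
    try:
        results["strict"] = text.encode(charset, errors="strict")
    except UnicodeEncodeError:
        results["strict"] = None  # the conversion failed

    # Policy 2: substitute '?' for each unmappable character.
    results["replace"] = text.encode(charset, errors="replace")

    # Policy 3: substitute a numeric character reference, which stays
    # readable and can be converted back later.
    results["entities"] = text.encode(charset, errors="xmlcharrefreplace")

    return results


# An English message containing one non-ASCII family name.
demo = convert_all_ways("Hello from Mr. Wałęsa")
for policy, value in demo.items():
    print(policy, value)
```

With the substitution policy the rest of the message survives intact and
only the two non-ASCII letters are lost; with the reference policy they
can be recovered afterwards.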
