Abel Cheung wrote:
> >    (because there are very few
> >    meaningful strings which look like UTF-8 but aren't).
>
> Yes, that's rare, though real world case has really happened before,
> especially for multibyte characters. Here is a sample:
>
> http://qa.mandrakesoft.com/show_bug.cgi?id=3935

Yes. It's a heuristic, and heuristics are always buggy. The programmer has
to weigh the benefit for the many users for whom it "just works" against
the problems it will cause for a few. In this case, when the heuristic
fails, the result is a garbage filename, just a different garbage than
if no heuristic had been applied.
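
To make the trade-off concrete, here is a minimal sketch of this kind of
heuristic in Python. The function name decode_filename and the choice of
Latin-1 as the fallback encoding are assumptions made for illustration;
they are not taken from any particular program discussed in this thread.

```python
def decode_filename(raw: bytes, legacy: str = "latin-1") -> str:
    """Guess a filename's encoding: UTF-8 if it validates, else `legacy`.

    Hypothetical helper; the fallback encoding is an assumption.
    """
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(legacy)


# Common case: a real Latin-1 name is not valid UTF-8, so the fallback applies.
print(decode_filename("café".encode("latin-1")))

# Rare false positive: the Latin-1 string "Ã©" is the bytes C3 A9, which also
# form valid UTF-8, so the heuristic picks UTF-8 and returns a different name.
print(decode_filename("Ã©".encode("latin-1")))
```

The second call is the failure mode: the bytes pass the UTF-8 check even
though they were written in the legacy encoding, so the user sees a wrong
name either way, only a different wrong name.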

Bruno


--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
