On 8 May 2001, Not Zed wrote:
> On 05 May 2001 08:16:45 -0400, Richard Schoeller wrote:
> > Better still would be to treat invalid utf-8 as though it were
> > in the codeset of the current global locale. It is very likely
> > that end users outside of Western Europe will exchange most of
> > their email with users of the same codeset as themselves.
>
> We can't even do that, at least not entirely. Another hint we have (at
> least in mail messages), is the charset used for the body content, which
> is often mirrored in the subject and whatnot. So I don't think gal should
> be doing this at all.
There should also be UI controls that let the user force a particular
charset for the current message. With that in place, there is no need to
guess the language or encoding. All sensible browsers let you select the
encoding for the current page (since there are a lot of broken pages that
don't specify one), and so do some reasonable MUAs. Evolution should do
this too.
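
To make the precedence concrete, here is a minimal sketch in C. The name
pick_charset and its three parameters are hypothetical, not an existing gal
or Evolution API, and the final ISO-8859-1 default is only an assumption
borrowed from Chris's latin1 suggestion further down the thread. The idea is
simply: an explicit user choice wins, then the charset declared for the body,
then the locale.

```c
#include <stddef.h>

/* Hypothetical helper: choose the charset used to interpret raw text
 * that is not valid utf-8.
 * Precedence: user's explicit choice > body charset > locale charset. */
const char *
pick_charset (const char *user_override,
              const char *body_charset,
              const char *locale_charset)
{
    if (user_override != NULL && user_override[0] != '\0')
        return user_override;      /* forced from the UI */
    if (body_charset != NULL && body_charset[0] != '\0')
        return body_charset;       /* hint from the message body */
    if (locale_charset != NULL && locale_charset[0] != '\0')
        return locale_charset;     /* current global locale */
    return "ISO-8859-1";           /* assumed last resort */
}
```

A mailer would pass NULL for user_override until the user actually picks
something from the menu, so the other hints only matter in that case.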
> > There may be other mechanisms for establishing a context in
> > which to interpret the invalid utf-8, but those will be specific
> > to the application and can't be resolved in gal.
>
> Exactly.
>
> The utf8 functions just need to fail in some meaningful way. Chances
> are, if we have invalid input at that point, we are unlikely to get
> anything workable afterwards, but crashing is definitely not on.
>
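
As an illustration of "fail in some meaningful way", here is a rough sketch
of a checker that reports where the input goes wrong instead of walking off
into garbage. The name utf8_check is hypothetical and this is not the code in
gal. It assumes the strict definition of utf-8 (sequences of at most four
bytes, no overlong forms, no surrogates), which may be stricter than what the
existing g_utf8_* functions accept.

```c
#include <stddef.h>

/* Hypothetical checker: returns 1 if the first len bytes of s are
 * well-formed utf-8, 0 otherwise.  If bad_offset is not NULL it receives
 * the offset of the first offending byte, or len on success. */
int
utf8_check (const unsigned char *s, size_t len, size_t *bad_offset)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need, k;        /* need = continuation bytes expected */
        unsigned long cp, min; /* min = smallest code point for this length */

        if (c < 0x80) {        /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; min = 0x80;    cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; min = 0x800;   cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; min = 0x10000; cp = c & 0x07;
        } else {
            goto bad;          /* stray continuation or invalid lead byte */
        }

        if (len - i <= need)
            goto bad;          /* sequence is cut off by end of input */

        for (k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                goto bad;      /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        /* reject overlong forms, out-of-range values and surrogates */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            goto bad;

        i += need + 1;
    }

    if (bad_offset != NULL)
        *bad_offset = len;
    return 1;

bad:
    if (bad_offset != NULL)
        *bad_offset = i;
    return 0;
}
```

The caller gets a plain yes/no plus an offset, so it can decide for itself
whether to warn, truncate, or reinterpret the bytes in another charset.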
> > Dick
> >
> > On 04 May 2001 21:43:08 -0400, Christopher James Lahey wrote:
> > > On 04 May 2001 12:50:59 -0400, Jon Trowbridge wrote:
> > > > OK, I've just committed some tweaks to gal's g_utf8_* functions that make
> > > > them check all their inputs and print a warning if they are applied to bad
> > > > utf-8.
> > > >
> > > > The checks are macroified, but I've turned them on by default. This way,
> > > > there will at least be some warning when we are doing something broken.
> > > >
> > > > -JT
> > >
> > > > The problem is that these strings often come from the outside world.
> > > > I'm not sure whether checking in real-life situations should be left
> > > > to the caller or to the internal functions. It would probably be less
> > > > programmer work to fix the internal functions to handle incorrect
> > > > utf8, ideally by assuming that incorrect utf8 is latin1, since
> > > > that's what it will be most of the time.
> > > Chris
> > >
Best regards,
-Vlad