Re: [Gal-hackers] Re: [Evolution-hackers] utf-8 requires vigilance

Vlad Harchev Wed, 09 May 2001 00:39:10 -0700
On 9 May 2001, Not Zed wrote:

> No, you don't understand.
> 
> This is totally apart from the UI, and must be handled separately in
> many cases.

 This thread was about which charset to assume when a string is not valid UTF-8.
Somebody even proposed autodetecting the charset. My point was that if the user
is given a UI option to force a particular charset, that is the most flexible
and usable solution, and it makes charset autodetection unnecessary.

PS: Mozilla has nice charset autodetection functions - they could be borrowed
sometime in the future.
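
A rough sketch of the fallback order discussed in this thread, written in
Python for brevity rather than in gal's C. The function name decode_text and
its parameters are made up for illustration; they are not gal or Evolution
API, and the iso-8859-1 default only stands in for "the codeset of the current
locale". The idea: a charset forced by the user wins, otherwise try strict
UTF-8, then the charset hint from the message body, then the locale codeset,
and never crash on bad input.

```python
def decode_text(raw, forced_charset=None, hint_charset=None,
                locale_charset="iso-8859-1"):
    """Decode bytes to text; return (text, charset_used).

    Hypothetical helper, not part of any real library.
    """
    # 1. A charset forced by the user through the UI overrides everything.
    if forced_charset:
        try:
            return raw.decode(forced_charset, errors="replace"), forced_charset
        except LookupError:
            pass  # unknown charset name: fall through to the guesses below

    # 2. Strict UTF-8 first, then the body charset hint, then the locale.
    for charset in ("utf-8", hint_charset, locale_charset):
        if not charset:
            continue
        try:
            return raw.decode(charset), charset
        except (UnicodeDecodeError, LookupError):
            continue

    # 3. Last resort: fail in a visible way (replacement characters)
    #    instead of raising.
    return raw.decode("utf-8", errors="replace"), "utf-8"
```

For example, decode_text(b"caf\xe9") is not valid UTF-8, so with no hint it
falls back to the locale codeset, while passing forced_charset="koi8-r" skips
the guessing entirely.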

> On 08 May 2001 11:22:11 +0500, Vlad Harchev wrote:
> > On 8 May 2001, Not Zed wrote:
> > 
> > > On 05 May 2001 08:16:45 -0400, Richard Schoeller wrote:
> > > > Better still would be to treat invalid utf-8 as though it were
> > > > in the codeset of the current global locale.  It is very likely
> > > > that end users outside of western europe will exchange most of
> > > > their email with users of the same codeset as themselves.
> > > 
> > > We can't even do that, at least not entirely.  Another hint we have (at
> > > least in mail messages), is the charset used for the body content, which
> > > > is often mirrored in the subject and whatnot.  So I don't think gal should
> > > be doing this at all.
> > 
> >  There should also be UI controls that let the user force a particular
> > charset for the current message. With that in place, there is no need to
> > guess the language or encoding. All sensible browsers let you select an
> > encoding for the current page (since a lot of broken pages don't specify
> > one), and so do some reasonable MUAs. Evolution should do this too.
> > 
> > > > There may be other mechanisms for establishing a context in
> > > > which to interpret the invalid utf-8, but those will be specific
> > > > to the application and can't be resolved in gal.
> > > 
> > > Exactly.
> > > 
> > > The utf8 functions just need to fail in some meaningful way.  Chances
> > > are, if we have invalid input at that point, we are unlikely to get
> > > > anything workable afterwards, but crashing is definitely not on.
> > > 
> > > > Dick
> > > > 
> > > > On 04 May 2001 21:43:08 -0400, Christopher James Lahey wrote:
> > > > > On 04 May 2001 12:50:59 -0400, Jon Trowbridge wrote:
> > > > > > OK, I've just committed some tweaks to gal's g_utf8_* functions that make
> > > > > > them check all their inputs and print a warning if they are applied to bad
> > > > > > utf-8.
> > > > > > 
> > > > > > The checks are macroified, but I've turned them on by default.  This way,
> > > > > > there will at least be some warning when we are doing something broken.
> > > > > > 
> > > > > > -JT
> > > > > 
> > > > > The problem is that these are often outside world generated strings.
> > > > > I'm not sure if we should leave dealing with checking in real life
> > > > > situations to the caller or to the internal functions.  It would be less
> > > > > programmer work probably to fix the internal functions to handle
> > > > > incorrect utf8.  Ideally to assume that incorrect utf8 is latin1 since
> > > > > that's what it will be most of the time.
> > > > >     Chris
> > > > > 
> > 
> >  Best regards,
> >   -Vlad
> 

 Best regards,
  -Vlad


_______________________________________________
evolution-hackers maillist  -  [EMAIL PROTECTED]
http://lists.helixcode.com/mailman/listinfo/evolution-hackers