On Tue, 2003-04-01 at 00:12, Alex Jiang wrote:
> Thank you for your comments.
>
> >On Mon, 2003-03-31 at 03:23, Alex Jiang wrote:
> >
> >
> >>Hi,
> >>
> >>I am now working on i18n issues of evolution.
> >>
> >>There are several things I think we should do to improve the i18n
> >>performace of evolution.
> >>First, c the patch for bug 40519. which make sure whenever we call
> >>iconv, there is a
> >>charset(locale_charset for default), so a uniform manner is applied.
> >>
> >>
> >
> >I don't like this patch.
> >
> >
> OK , I c what you mean. However, I've checked almost all the e_iconv
> calles in gal/evolution,
> I will try to not touch e_iconv... things, but add necessary things on the
> application-level. (I mean, add more if (charset == NULL), charset =
> e_iconv_locale_charset())
this should already be handled everywhere in camel afaik.
>
> >
> >1) US-ASCII does not ever need to be passed in to iconv(), and so
> >therefor e_iconv_charset_name() is perfectly fine returning NULL.
> >
> >
> Yes, since the funtion returns NULL, there shall be hidden problem.
> There are codes like "
> if (iconv_open(to, from) == (iconv_t)-1) {
> g_warning("%s%s ...", to, from);
> }", which may lead to core dumps.
> It's kind of style or philosophy things, however, I respect your
> original style, and I shall make
> patches on application only.
ok, then e_iconv_open() needs to check that the resulting to/from are
not NULL, and if they are - return -1 and set errno appropriately.
>
> >2) (e_iconv_init): The second parameter of setlocale() from NULL to "".
> >
> >does passing NULL cause it to crash on Solaris? or what is the reasoning
> >for this change? If it is to prevent a crash or something, I'm fine with
> >it.
> >
> >
> In fact, I don't understand why call setlocale with NULL. In manual, the
> consequent behavior
> is not defined.
eh?
char *setlocale(int category, const char * locale);
DESCRIPTION
The setlocale() function is used to set or query the pro�
gram's current locale.
If locale is not NULL, the program's current locale is
modified according to the arguments. The argument cate�
gory determines which parts of the program's current
locale should be modified.
looks like the behaviour is defined to me ;-)
since we don't want to change the LC_ALL value, we pass in NULL as the
second argument. passing in "" (like your patch does) will clobber the
old value. this is not what we want.
>
> >3) I don't see why you want to always set the locale charset. if the
> >default locale charset is just going to be US-ASCII (which is what it
> >HAS to be), what difference does it make? Just leave it as NULL.
> >
> >4) I also don't think we should do a default locale_lang. First, we
> >don't even use it yet - but that is irrelevant. The reason I left it
> >NULL is that it will be useful in the composer in that we will clearly
> >see that the user does not have a locale language setup - and therefor
> >we should skip setting a lang on the message (or prompt the user or
> >something, which we should maybe do anyway).
> >
> >
> OK for (3) and (4)
>
> >5) No need to have e_iconv_charset_guess() so that e_iconv_init() can
> >parse it and construct a iconv-friendly name for it because it should
> >ALREADY be in whatever form the system's iconv() will want. Besides,
> >e_iconv_open() will always pass both to/from charsets into
> >e_iconv_charset_name() and so even if we use the locale_charset value,
> >it will still be transformed into the iconv-friendly charset name.
> >
> >
> I am to just make it more elegant(each occurence of camel lib call(),
> will always generate iconv_friendly
> charset name. while, I don't think it's nessary fro e_iconv_open() to
> call e_iconv_charset_name()
> again. However, again I respect your paradigm.
it's extremely little overhead and acts as a "just in case".
>
> >>Secondly, display charset should be stored in gconf(for each
> >>folder(vfolder?)). Or, when
> >>we change the display charset, functions like
> >>process_header()/header_decode_string() will
> >>not be affected(and it's not easy to pass the chosen charset to them),
> >>which may lead
> >>to bug 40521 . i.e. those multibyte subject/sender... which not
> >>following rfc2047, will not
> >>be readable(But this happens very often)
> >>
> >>
> >
> >I don't understand your proposal here?
> >
> OKay, when there is a multibyted subject or sender ...(broken mailer),
> even we chose a right display
> charset, the mail will be reparsed. However, the mail header, processed
> by process_header()...
> will not be affected by the rechose charset.
>
> As to NotZed's comment, the subject is already wrongly converted from a
> wrong charset to utf-8,
> what can I do with display. The best and most elegant way is to pass
> display charset to everywhere,
> but perhaps that is worse.
the message is not re-parsed when you change the display charset.
what you will actually want to do is not use
camel_mime_message_get_subject() when display the subjct string, but
instead call camel_medium_get_header() to get the Subject header, and
then decode yourself. this is yuck, but it would work.
>
> >>Third, when we open evolution twice in different locale, the summary
> >>file will not be rebuilt.
> >> So, improperly converted subject/title will remain unreadable. One
> >>possibly runnable way is
> >>to make summary in a "mbox.ev-summary.gb2312","mbox.ev-summary.en"
> >>way. But the final
> >>solution is that we be ablet ot change the summary file when we change
> >>the charset of a mail
> >>(c the next item) or of a folder.
> >>
> >>
> >
> >No. This is The Wrong Way (tm) to fix it. the mbox.ev-summary files
> >store everything in UTF-8 so there only ever has to be ONE.
> >
> When converting from other encoded charset to UTF-8, the charset is
> wrong. the suffix is
> just a suggestion of using what charset to convert to UTF-8.
> It's an intermedia solution, but I hope you understand what I mean.
I know what you mean, but it is not the right fix. as notzed says, this
is definetely not the way to do it. the Right Fix would be:
- invent a way to do better auto-detection of charsets given the raw
8bit text... but as well all well know, this is a very difficult task
and is still not 100% reliable.
the only other thing we can do is ignore it and say "fix the other
mailers" which is the Real Fix (tm).
>
> >
> >The only solution here is to ban people from using broken crap mailers
> >that don't set the charset(s) correctly. :-)
> >
> : -). Dear Jeffery, many many mailers do not follow rfc2047 in process
> headers. Including some
> mail servers used by IBM and other big companies. This is so often, in
> us, mutlibyte charset
> contries.
I'm fully aware.
>
> >it should be possible to change the code to render the message header
> >block in the "preview pane" according to the user's picked charset (if
> >it doesn't already) by grabbing the raw headers from the mesage object I
> >suppose? kinda yuck, but oh well.
> >
> >changing the message-list is not doable.
> >
> >
> OK. but the summary file is just to speedup the listing things...
it's for more than that...
>
> >>Fouth, we should be able to madatorily change the charset of a single
> >>mail. so perhaps a UI
> >>method will be introduces first(e.g. when we change the charset in the
> >>mail window, the mail's
> >>charset will be changed, while we change in the folder view, folder
> >>charset will be affected).
> >>Perhaps, to keep the wholesome of the summary file, a user-defined tag
> >>in the CamelMessageInfo
> >>struture will be used to define the charset of a single mail.
> >>
> >>
> >
> >I don't really like this idea...
> >
> >
> Even malformed mails are using different charsets. Storing mandatory
> charset somewhere is a sooner or
> later thing.
nah, eventually all the other mailers will be fixed :-)
Jeff
--
Jeffrey Stedfast
Evolution Hacker - Ximian, Inc.
[EMAIL PROTECTED] - www.ximian.com
_______________________________________________
evolution-hackers maillist - [EMAIL PROTECTED]
http://lists.ximian.com/mailman/listinfo/evolution-hackers