Aaron Stone wrote:
> On Sat, May 12, 2007, Anne <[EMAIL PROTECTED]> said:
>> Does this mean when someone is still at 2.2.2, one should wait for 2.2.6 ?

Maybe, but with some luck not. Read on.

> The encoding issues won't be a regression in 2.2.5, just an ongoing issue
> that will take too long to fix to meet the intended schedule.

Well, it turns out there *was* a regression that looked like something else at first.

> Something Paul may need to clarify is if there will be any differences in
> how data is stored in 2.2.2 vs. the upcoming 2.2.5 -- would there be a
> problem if a site upgraded to 2.2.5, found bugs, then reverted back to
> 2.2.2?

Ok. Reverting back will be possible. The changes affect the headercache only, which can easily be rebuilt using dbmail-util. However, building on patches offered by Anton and others, we now have reliable utf8 support in the headercache and the code that hooks into that.

Some history:

Before 2.2.3, both 2.1.x and 2.2.x stored headervalues 'as-is'. This meant that headers were encoded in utf7 if they contained non-us-ascii characters. But quite early on Lars and others complained this broke searching and sorting on the headers. They were right: storing headers as-is does not work for searching and sorting. It also fails for another reason: quite often widely used mailclients send email that contains illegal non-ascii 8bit characters, and there is simply no way you can store those reliably in a us-ascii database and expect good things to come of it.

Still, that was the status-quo ante in dbmail, and users have complained about it from the beginning, especially if they were using postgres with a non-US-ASCII encoded database.

Since 2.2.3 however, all headervalues are stored in the encoding specified in the 'ENCODING' configuration field, which is supposed to map one-to-one to the encoding the database uses internally. This means users can now store all headers using UTF8 encoding. And that is very good.

Today, I managed to fix the subtle bug that broke the FETCH response the OP mentions.
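For readers who want to see the idea in miniature: the sketch below decodes a header value to UTF-8 for storage and recodes it into a 7-bit-safe form on the way out. It is an illustration only, not dbmail's code. It uses Python's standard `email.header` module rather than gmime, it assumes the 7-bit wire form is RFC 2047 encoded-words, and the function names `header_to_utf8` and `utf8_to_wire` are made up for this example.

```python
from email.header import Header, decode_header, make_header


def header_to_utf8(raw_value: str) -> bytes:
    """Decode a raw header value (possibly RFC 2047 encoded) to UTF-8 bytes.

    Hypothetical helper: stands in for the "store in the database
    encoding" step, with UTF-8 assumed as that encoding.
    """
    text = str(make_header(decode_header(raw_value)))
    return text.encode("utf-8")


def utf8_to_wire(stored: bytes) -> str:
    """Recode a stored UTF-8 value into a 7-bit-safe header value.

    Hypothetical helper: stands in for the "recode for the mail client"
    step, using RFC 2047 encoded-words as the 7-bit form.
    """
    return Header(stored.decode("utf-8"), charset="utf-8").encode()


raw = "=?iso-8859-1?q?Caf=E9?="      # what a client might send
stored = header_to_utf8(raw)          # what would go into the headercache
wire = utf8_to_wire(stored)           # what would go back out to a client

print(stored)
print(wire)
```

The point of the round trip is that searching and sorting operate on the decoded UTF-8 value, while clients still receive a pure-ASCII header. Any extra transformation applied to the stored string before it is recoded has to be reversible, or the outgoing value no longer matches what was received.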
Turns out Anton's original patch did some utf8 magic on the strings before storing them to improve collation (used in sorting). That was not required, since the database can handle collation just fine, and it also broke recoding the headervalues back into utf7 (which is what the mailclients want to see). I just landed a couple of patches that clean up the utf8 conversion framework and remove the bug.

All is well again, but...

Mysql and postgresql users using anything but ENCODING=utf8 should take notice. Dbmail uses utf8 internally because gmime uses utf8 internally. You should too. Convert your databases if you can, or plan to do so soon. Unicode and its UTF8 representation are the future.

Converting your mysql database is trivial, but takes time and table locks (making the database read-only while it runs). A couple of 'ALTER' statements got me going just fine.

Converting your postgres database is non-trivial. In fact I'm not even sure how hard it is. Someone fill me in please.

I'll hold 2.2.5 for just a day or two extra to see how this latest patch works out. If there are unforeseen problems, I'll rewind my tree a bit and release svn revision 2564 as 2.2.5, or branch out from there if anything needs to go in.

So testing and time will tell. Stay tuned.

--
  ________________________________________________________________
  Paul Stevens                                      paul at nfg.nl
  NET FACILITIES GROUP                     GPG/PGP: 1024D/11F8CD31
  The Netherlands________________________________http://www.nfg.nl
