Hi! ----
While doing some GB18030 testing I found a disturbing issue: a lot of calls to the |*mb*()| functions are made without taking the current shift state into account.

The problem is that this state is a hidden global variable and is easily overlooked. It gets worse because UTF-8 can recover from invalid shift states, so UTF-8 locales never suffer from the problem, while shift-state-dependent encodings like GBK/GB18030/Shift_JIS do.

My preferred solution would be to change the current libast mb API to always take an |mbstate_t| argument. This would:
- fix this issue, by making the shift state explicit
- fix issues with nested calls, e.g. if we are in a specific shift state and then call a utility function which operates on a different string
- fix thread-safety issues with the "hidden" global variable containing the current shift state

----

Bye,
Roland

-- 
MPEG specialist, C&&JAVA&&Sun&&Unix programmer
TEL +49 641 3992797
_______________________________________________
ast-developers mailing list
http://lists.research.att.com/mailman/listinfo/ast-developers
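
To illustrate the "explicit shift state" idea, here is a minimal sketch using only the standard C restartable functions from <wchar.h> (|mbrtowc()|, |mbsinit()|, |mbstate_t|). It is not the libast API and not a proposal for its exact shape: the names |count_chars()| and |count_string()| are hypothetical and exist only for this example. The point is just that the caller owns the state, so a nested call on another string cannot disturb it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of libast): count the characters in the
 * first n bytes of s, using a conversion state owned by the caller.
 * Stops early at a NUL character.
 * Returns (size_t)-1 on an invalid or truncated multibyte sequence.
 */
size_t count_chars(const char *s, size_t n, mbstate_t *st)
{
    size_t count = 0;

    assert(st != NULL);          /* the state must always be explicit */
    while (n > 0) {
        wchar_t wc;
        size_t len = mbrtowc(&wc, s, n, st);

        if (len == (size_t)-1 || len == (size_t)-2)
            return (size_t)-1;   /* invalid or incomplete sequence */
        if (len == 0)
            break;               /* NUL character: end of string */
        s += len;
        n -= len;
        count++;
    }
    return count;
}

/*
 * Hypothetical convenience wrapper: every string gets its own state,
 * starting in the initial conversion state, so it can be called while
 * the caller is in the middle of scanning a different string.
 */
size_t count_string(const char *s)
{
    mbstate_t st;

    memset(&st, 0, sizeof st);   /* zeroed mbstate_t == initial state */
    return count_chars(s, strlen(s), &st);
}
```

A caller that is halfway through one string with its own |mbstate_t| can call |count_string()| on a second string and then continue, because no state is shared between the two scans. The results depend on the locale selected with |setlocale()|; in the default "C" locale only plain ASCII input is portable.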
