Hi!

----

While doing some GB18030 testing I found a disturbing issue:
Many calls to the |*mb*()| functions are made without considering
the current shift state. This state is a hidden global variable and
is easily overlooked. UTF-8 makes this worse: it can recover from an
invalid shift state, so UTF-8 locales do not suffer from the problem
and the bug goes unnoticed there. It does cause problems for
shift-state-dependent encodings such as GBK/GB18030/Shift-JIS.

My preferred solution would be to change the current libast mb API to
always take an |mbstate_t| argument. Making the shift state explicit
would fix this issue. It would also fix problems with nested calls,
e.g. when we are in a specific shift state and then call a utility
function which operates on a different string, and it would fix the
thread-safety issues caused by the "hidden" global variable that
holds the current shift state.
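
To illustrate the idea, here is a minimal sketch using the standard C
|mbrtowc()| function, which already takes an explicit |mbstate_t|.
This is not the libast API; |count_chars()| and |count_two()| are
hypothetical helpers made up for this example. Each scan owns its
state, so a helper working on a different string cannot disturb the
caller's state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of libast: counts the characters in
 * the first n bytes of s. The conversion state lives in *st, which
 * the caller owns, so nothing is shared between calls or threads.
 * Returns (size_t)-1 on an invalid or truncated sequence. */
static size_t count_chars(const char *s, size_t n, mbstate_t *st)
{
    size_t count = 0;

    assert(s != NULL && st != NULL);
    while (n > 0) {
        wchar_t wc;
        size_t len = mbrtowc(&wc, s, n, st);

        if (len == (size_t)-1 || len == (size_t)-2)
            return (size_t)-1;  /* invalid or incomplete sequence */
        if (len == 0)
            len = 1;            /* an embedded NUL occupies one byte */
        s += len;
        n -= len;
        count++;
    }
    return count;
}

/* Hypothetical caller: two independent scans, each with its own
 * state object, so neither scan can affect the other. */
static size_t count_two(const char *a, const char *b)
{
    mbstate_t sa, sb;
    size_t na, nb;

    memset(&sa, 0, sizeof sa);  /* zeroed mbstate_t = initial state */
    memset(&sb, 0, sizeof sb);
    na = count_chars(a, strlen(a), &sa);
    nb = count_chars(b, strlen(b), &sb);
    if (na == (size_t)-1 || nb == (size_t)-1)
        return (size_t)-1;
    return na + nb;
}
```

The same pattern would apply to a libast API that takes the state as
an argument: the caller decides which state object belongs to which
string, and a thread can keep its state on its own stack.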

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) [email protected]
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers
