> String indexing *must* not be fixed. As I said, there are a number of plugins that need *exactly* bytes: any plugin implementing a hash function. char2nr(s[i]) is guaranteed to return a value between 0x00 and 0xFF (inclusive) (0x00 is returned only if s[i] is an empty string).
I see basically two ways to mitigate this situation: either (1) a new data type like unicode() in python-2*, with old strings staying the same as str() in python-2*, plus a function to convert from str() to unicode(), or (2) a set of mb*() functions.

The first variant has the advantage that, by using the same representation as python-3* str(), you can have O(1) indexing operations (if you keep UTF-8 strings, indexing will be O(N)). The second variant has the advantage of being far easier to implement: you just add a few function definitions, without having to add a big bunch of unicode()->str() conversions in a number of places, support unicode() objects in the regex engine (python-3* str() objects are stored as ASCII, latin1, UCS-2 or UCS-4 depending on the highest code point in the string, but vim only accepts ASCII-compatible encodings) and so on. To make the first variant perform better you need to modify *each* function that uses strings to work with unicode(), or waste lots of time on unicode()->str()[->unicode()] conversions.
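To illustrate the second variant, here is a minimal sketch in Python 3, where bytes plays the role of the existing byte-indexed string. The name mb_index() is made up for this example and is not an existing Vim or Python function. It assumes the input is valid UTF-8 and shows why character indexing on a UTF-8 string is O(N) while plain s[i] stays a byte lookup:

```python
def mb_index(s: bytes, i: int) -> bytes:
    """Return the i-th character of the UTF-8 byte string s, as bytes.

    Hypothetical helper: assumes s is valid UTF-8 and i >= 0.
    Runs in O(N) because it must scan from the start of the string.
    """
    if i < 0:
        raise IndexError("negative index not supported in this sketch")
    count = -1      # index of the character whose lead byte was seen last
    start = None    # byte offset where character i begins
    for pos, byte in enumerate(s):
        # Continuation bytes look like 0b10xxxxxx; anything else starts
        # a new character.
        if byte & 0xC0 != 0x80:
            count += 1
            if count == i:
                start = pos
            elif count == i + 1:
                # Lead byte of the next character: character i ends here.
                return s[start:pos]
    if start is None:
        raise IndexError("character index out of range")
    # Character i is the last one in the string.
    return s[start:]


sample = "a\u00f1\u20acz".encode("utf-8")  # 1-, 2-, 3- and 1-byte characters
first_byte_of_second_char = sample[1]       # plain indexing still yields one byte
second_char = mb_index(sample, 1)           # the whole 2-byte character
```

Plain indexing keeps returning a single byte in the 0x00..0xFF range, so hash-style plugins are unaffected, and only callers that want characters pay for the scan.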
