On 18.11.2011 17:58, Andrea Fontana wrote:
I built a data access layer in C++. This layer works with MongoDB, where
strings are always encoded using UTF-8. I've ported this layer to D using
SWIG. Strings are written correctly to the console, but when I use std.regex
it sometimes throws an exception:

core.exception.UnicodeException@src/rt/util/utf.d(290): invalid
UTF-8 sequence

The byte sequence (for better understanding) is:
[83, 195, 179, 32]

And the string was "Sò " (with accented o and a space)

I'm not a UTF expert, so is it a wrong UTF-8 encoding, or is it a bug in
utf.d?


Which version of std.regex are you using: the one from git master or the one in the latest release? If it's the former, I'm willing to look into this over the weekend, provided you can get hold of a string + pattern pair that fails like this.
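
As a quick sanity check on the encoding question, the quoted bytes can be decoded by hand. Below is a minimal standalone sketch in C++ (the language of the original data layer), not the code from utf.d or std.regex. The helper name decodeUtf8 is hypothetical and only for illustration. It assumes the four bytes quoted above are the complete string handed over to D.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decodes `bytes` as UTF-8 into code points.
// Returns false at the first malformed sequence, true otherwise.
bool decodeUtf8(const std::vector<std::uint8_t>& bytes,
                std::vector<std::uint32_t>& out)
{
    // Smallest code point that legitimately needs a sequence of each length.
    static const std::uint32_t minForLen[] = {0, 0, 0x80, 0x800, 0x10000};

    out.clear();
    std::size_t i = 0;
    while (i < bytes.size()) {
        const std::uint8_t lead = bytes[i];
        std::uint32_t cp = 0;
        std::size_t len = 0;

        // The lead byte determines the sequence length.
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else return false;  // stray continuation byte or invalid lead byte

        if (i + len > bytes.size()) return false;  // truncated sequence

        // Every following byte must look like 10xxxxxx.
        for (std::size_t k = 1; k < len; ++k) {
            const std::uint8_t b = bytes[i + k];
            if ((b & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (b & 0x3F);
        }

        if (len > 1 && cp < minForLen[len]) return false;  // overlong form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;    // surrogate
        if (cp > 0x10FFFF) return false;                   // out of range

        out.push_back(cp);
        i += len;
    }
    return true;
}
```

Calling decodeUtf8 on {83, 195, 179, 32} returns true and yields U+0053, U+00F3, U+0020, so the sequence as quoted is well-formed UTF-8. Note that U+00F3 is "ó" (acute accent); "ò" (grave accent) would be the bytes 195, 178. If the string really arrives in D as exactly these bytes, the encoding itself does not look like the culprit, which is why a failing string + pattern pair would help.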


--
Dmitry Olshansky
