On 01/01/2014 07:58 AM, Dick Hollenbeck wrote:
On 11/21/2013 02:16 PM, Dick Hollenbeck wrote:
1) wx >= 2.9 has these constructors:
wxString( const char* )
wxString( std::string )
whereas wx 2.8 does not.
Both offer:
wxString( const char*, wxConvUTF8 );
but this cannot be used in a default "type promotion" situation; this constructor must be
invoked explicitly.
2) The above type promotion constructors treat the input as being encoded in the current
locale, which is not guaranteed to be UTF8.
The type promotion constructors are important if you want to allow the compiler to promote
an 8 bit string to a wxString for you without special syntax.
3) If you decide to keep 8 bit strings in memory, encoded in the current locale, then
someday when you load a Chinese board file, you will not be able to hold those strings in
a deficient 8 bit encoding. (UTF8 is not a deficient 8 bit encoding, some others are.)
The software breaks at that point. This argues for always using UTF8 as the internal 8
bit encoding. But then the above two constructors are broken, since the current locale's
encoding cannot be assumed to be UTF8, even though it often is on Linux.
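To make point 3 concrete, here is a minimal sketch of why UTF8 can hold a Chinese character while a single-byte encoding such as ISO-8859-1 cannot. The helpers encode_utf8() and fits_latin1() are hypothetical names for illustration only; they are not part of wxWidgets or KiCad, and the encoder does not reject surrogates or values above U+10FFFF.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (illustration only): encode one Unicode code point
// as UTF-8 bytes. No validation of surrogates or out-of-range values.
std::string encode_utf8( std::uint32_t cp )
{
    std::string out;

    if( cp < 0x80 )             // 1 byte: 0xxxxxxx
    {
        out += static_cast<char>( cp );
    }
    else if( cp < 0x800 )       // 2 bytes: 110xxxxx 10xxxxxx
    {
        out += static_cast<char>( 0xC0 | ( cp >> 6 ) );
        out += static_cast<char>( 0x80 | ( cp & 0x3F ) );
    }
    else if( cp < 0x10000 )     // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    {
        out += static_cast<char>( 0xE0 | ( cp >> 12 ) );
        out += static_cast<char>( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
        out += static_cast<char>( 0x80 | ( cp & 0x3F ) );
    }
    else                        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    {
        out += static_cast<char>( 0xF0 | ( cp >> 18 ) );
        out += static_cast<char>( 0x80 | ( ( cp >> 12 ) & 0x3F ) );
        out += static_cast<char>( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
        out += static_cast<char>( 0x80 | ( cp & 0x3F ) );
    }

    return out;
}

// ISO-8859-1 maps each byte to one code point, so it tops out at U+00FF.
bool fits_latin1( std::uint32_t cp )
{
    return cp <= 0xFF;
}
```

For U+4E2D, encode_utf8() yields the three bytes E4 B8 AD, while fits_latin1() returns false: the character simply has no representation in that locale encoding.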
In summary, I don't see any easy immediate relief from the boat anchor we know as
wxString, even with wx 3.0. But I will continue to think about it.
Dick
Attached is a patch needing a good look. It shows off a new class UTF8 that I wrote, which
solves the problems described above by providing conversion operators to and from
wxString, yet holds UTF8 data in what is basically a std::string.
Please say how it impacts you, realizing its usage scope can be trimmed or expanded from
this sampling.
I am especially interested in:
a) how it compiles on gcc >= 4.8
b) how it compiles using clang.
c) what it does to any benchmarks of sane-ness and speed for stroke_font.h
Lorenzo, Marco, Orson, your feedback in particular is wanted.
class UTF8 will likely allow the removal of many more calls to TO_UTF8() and
FROM_UTF8(), not in this patch.
Plus code size will likely be reduced because I put the size-expensive stuff out of line
in a lean call interface.
Dick
Everything compiles & works fine with gcc 4.8.1 and wx 3.0. As there is
not much code contributed by us that works with strings - I do not see
anything that I could be missing. After some simple performance tests, I
confirmed my expectations (and Lorenzo's as well) that it does not
affect rendering speed noticeably.
I was wondering about some modifications of the uni_iter class to make
it usable with the functions available in <algorithm> in the standard
library (e.g. https://gist.github.com/jeetsukumaran/307264/). It should
not require a lot of changes; if you want, I can try it out.
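As a rough sketch of what that could look like (this is not the uni_iter from the patch; CodePointIter, cp_begin(), cp_end() and count_code_point() are hypothetical names, and the input is assumed to be valid UTF8), an iterator mainly needs the five standard member typedefs plus ++, * and == to work with <algorithm>:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <string>

// Hypothetical sketch: iterates over the code points of a UTF-8 buffer.
// Tagged as an input iterator because operator* returns by value.
class CodePointIter
{
public:
    using iterator_category = std::input_iterator_tag;
    using value_type        = std::uint32_t;
    using difference_type   = std::ptrdiff_t;
    using pointer           = const std::uint32_t*;
    using reference         = std::uint32_t;

    CodePointIter() : m_ptr( nullptr ) {}

    explicit CodePointIter( const char* aPtr ) :
        m_ptr( reinterpret_cast<const unsigned char*>( aPtr ) ) {}

    // Decode the code point starting at the current position.
    std::uint32_t operator*() const
    {
        int           len = seqLen( *m_ptr );
        std::uint32_t cp  = ( len == 1 ) ? *m_ptr : ( *m_ptr & ( 0xFF >> ( len + 1 ) ) );

        for( int i = 1; i < len; ++i )
            cp = ( cp << 6 ) | ( m_ptr[i] & 0x3F );

        return cp;
    }

    // Advance by one whole code point, not by one byte.
    CodePointIter& operator++() { m_ptr += seqLen( *m_ptr ); return *this; }
    CodePointIter  operator++( int ) { CodePointIter tmp = *this; ++*this; return tmp; }

    bool operator==( const CodePointIter& aOther ) const { return m_ptr == aOther.m_ptr; }
    bool operator!=( const CodePointIter& aOther ) const { return m_ptr != aOther.m_ptr; }

private:
    // Sequence length implied by the lead byte (valid UTF-8 assumed).
    static int seqLen( unsigned char aLead )
    {
        if( aLead < 0x80 )             return 1;
        if( ( aLead & 0xE0 ) == 0xC0 ) return 2;
        if( ( aLead & 0xF0 ) == 0xE0 ) return 3;
        return 4;
    }

    const unsigned char* m_ptr;
};

CodePointIter cp_begin( const std::string& s ) { return CodePointIter( s.data() ); }
CodePointIter cp_end( const std::string& s )   { return CodePointIter( s.data() + s.size() ); }

// Example use of a standard algorithm on code points.
std::ptrdiff_t count_code_point( const std::string& s, std::uint32_t aCodePoint )
{
    return std::count( cp_begin( s ), cp_end( s ), aCodePoint );
}
```

With the typedefs in place, std::distance(), std::find() and std::count() accept the iterator pair directly, so no special-case loops are needed at the call site.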
One trap that I can see is having both iterator (from std::string) and
uni_iter. It may lead to situations where someone uses std::string::iterator
(just by habit, or without being aware of how it works) when what they really
meant was uni_iter. In my opinion, if the class is specifically designed for
UTF8, we could drop the std::string iterator.
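A small sketch of the trap, using plain std::string rather than the UTF8 class from the patch (count_bytes() and count_code_points() are hypothetical helpers, and valid UTF8 input is assumed): the std::string iterator walks bytes, so anything that counts or indexes "characters" with it gives the wrong answer for non-ASCII text.

```cpp
#include <cstddef>
#include <string>

// Walks the data the way std::string::iterator does: one step per byte.
std::size_t count_bytes( const std::string& s )
{
    std::size_t n = 0;

    for( std::string::const_iterator it = s.begin(); it != s.end(); ++it )
        ++n;

    return n;
}

// Counts code points by skipping UTF-8 continuation bytes (10xxxxxx).
std::size_t count_code_points( const std::string& s )
{
    std::size_t n = 0;

    for( unsigned char c : s )
    {
        if( ( c & 0xC0 ) != 0x80 )
            ++n;
    }

    return n;
}
```

For the two-character string "\xE4\xB8\xAD\xE6\x96\x87" (U+4E2D U+6587), count_bytes() reports 6 while count_code_points() reports 2.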
Besides that - everything is fine with me.
Regards,
Orson