> That's true, but still, for conversions let's say in a tight loop
> (like when looping over all elements of a large xml document with
> msxml for
> example) I'm not really comfortable with using the stack for that.
> It's probably just a wrong gut feeling that I have, I'm always afraid
> that I'll run out of stack space one day.

Yes, ATL developers have a solution for this case as well.  If you use X2YEX
(instead of X2Y), it uses a fixed-size buffer whose size you set via a
template argument for strings that fit into it (that buffer is stack-based,
of course), and falls back to the free store for larger strings.  Nifty,
isn't it?  And you won't run out of stack in loops/recursive calls.

Make sure you read the "ATL and MFC String Conversion Macros" page on MSDN
completely.  You'll find everything you need to know there.  I prefer the
conversion macros as my character-conversion tool, and maybe you will too
once you've read that page.

> Yes that is absolutely true. STL is the most portable way at the
> moment; the point that I was trying to make is that I no longer think
> of std::string as the be-all and end-all of string classes. I've been
> thinking of starting to use Glib::ustring
> (http://gtkmm.sourceforge.net/gtkmm2/docs/reference/html/classGlib_1_1ustring.html,
> a utf-8 string class with a std::string interface) for my projects
> that may some day need cross-platformness but I have to investigate
> just how fast a utf-8 string can be. But as I mentioned before, I'm
> still new to the whole unicode/ internationalization thing, so I may
> be wrong on this one too :)

First of all, thanks a million for letting us know of this class.  I have
always missed exactly this utility, but I'd never looked for it (don't know
why!).  I use UTF-8 a lot, and it's a precious resource for me at least.

I can't comment on the speed of this class, but as a general note on UTF-8:
remember that 7-bit ASCII characters are encoded unchanged in UTF-8.  If a
byte's highest bit is set, the following byte(s) must be examined to decode
the character.  UTF-8 encodes a character in up to 4 bytes, and I would
guess the encode/decode routines can be pretty fast, since they don't do
anything tougher than a handful of bitwise shifts and masks.
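For instance, a minimal decoder needs nothing beyond masks and shifts.  This is a hypothetical decode_utf8 helper of my own (it does no validation of malformed input, so it's a sketch, not production code):

```cpp
#include <cstddef>

// Sketch: decode one code point from a UTF-8 byte sequence using only
// shifts and masks.  The lead byte's high bits say how many continuation
// bytes (each of the form 10xxxxxx) follow.  No error checking.
inline char32_t decode_utf8(const unsigned char* s, std::size_t* consumed) {
    unsigned char b = s[0];
    if (b < 0x80) { *consumed = 1; return b; }             // 7-bit ASCII as-is
    std::size_t n;      // number of continuation bytes
    char32_t cp;        // payload bits taken from the lead byte
    if      ((b & 0xE0) == 0xC0) { n = 1; cp = b & 0x1F; } // 110xxxxx
    else if ((b & 0xF0) == 0xE0) { n = 2; cp = b & 0x0F; } // 1110xxxx
    else                         { n = 3; cp = b & 0x07; } // 11110xxx
    for (std::size_t i = 1; i <= n; ++i)
        cp = (cp << 6) | (s[i] & 0x3F);    // splice in 6 bits per byte
    *consumed = n + 1;
    return cp;
}
```

So "é" (U+00E9) comes out of the two bytes C3 A9, and the euro sign (U+20AC) out of E2 82 AC, with just a couple of shifts each.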

If you need rocket speed, encoding your strings as UTF-16 or UTF-32 may be
your best bet.  Remember that depending on the contents of the string,
UTF-16 and UTF-32 may require less or more memory than UTF-8.
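A quick sketch makes that trade-off visible (hypothetical utf8_bytes/utf16_bytes helpers of mine, giving bytes per code point in each encoding):

```cpp
#include <cstddef>

// Bytes needed to encode one code point.  Which encoding is smaller
// depends entirely on what the text contains.
inline std::size_t utf8_bytes(char32_t cp) {
    if (cp < 0x80)    return 1;    // ASCII
    if (cp < 0x800)   return 2;    // Latin supplements, Greek, Cyrillic, ...
    if (cp < 0x10000) return 3;    // rest of the Basic Multilingual Plane
    return 4;                      // supplementary planes
}
inline std::size_t utf16_bytes(char32_t cp) {
    return cp < 0x10000 ? 2 : 4;   // surrogate pair above the BMP
}
```

An ASCII 'A' costs 1 byte in UTF-8 but 2 in UTF-16; a CJK character such as U+4E2D costs 3 in UTF-8 but only 2 in UTF-16; and UTF-32 is always 4 bytes regardless.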

-------------
Ehsan Akhgari

Farda Technology (http://www.farda-tech.com/)

List Owner: [EMAIL PROTECTED]

[ Email: [EMAIL PROTECTED] ]
[ WWW: http://www.beginthread.com/Ehsan ]

"In Heaven an angel is nobody in particular."
 George Bernard Shaw




