On Aug 2, 2006, at 7:28 PM, Theodore H. Smith wrote:
From: Brad Rhine <[EMAIL PROTECTED]>
Date: Wed, 2 Aug 2006 14:40:43 -0400
On Aug 2, 2006, at 2:37 PM, Theodore H. Smith wrote:
Why make your code do all sorts of awkward tricks with encodings,
(including but not limited to auto-convert on append), when you can
just assume all your data is UTF-8?
Because assumptions are dangerous. ;)
What if it's a guideline and not an assumption? Something like "use
utf-8 for most data processing, and utf-16 simply for input/output"?
Part of the speed increase my FastString class gets over Charles's
class-based approach is that I don't do anything with encodings.
Why should I? I've never had a problem with it and no users have
reported it to me.
And you're writing in C, while I'm writing mostly in REALbasic. Now I
find it fastest to use Split and Join.
By skipping a case that might occur less than 1% of the time, I
can get maybe 30% extra speed. And even that 1% turns out to be a
design error on the developer's part, because he'd get better speed
by using UTF-8 throughout his app.
If there's anything I've learnt about string processing, it's that
it's really best to use one model for your data. Whether that's C++
or RB or anything.
In C++ we have so many string classes: CString (via MFC), the STL's
string, char*, and then most libraries tend to have their own
string class, like CFString or NSString. Then you need to write an
app using libraries, some of which use char*, others string,
others NSString... it becomes a mess, complex and slow, doing all
the interconversion.
Far quicker to just use one model, where possible.
Sure, but RS has chosen to opt for convenience, and it works pretty
well for most situations.
Just the same for encodings. UTF-8 does everything so there's no
advantage in using anything other than UTF-8 except for input and
output.
It should be considered a design error to be processing strings in
more than one encoding, except to convert them to and from the
dominant encoding.
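
A rough sketch of that principle, in Python rather than REALbasic or
C, and only as an illustration: convert once at the input boundary,
do all processing on UTF-8 bytes, and convert once more at the output
boundary. The function names here are hypothetical and not taken from
FastString or any other library mentioned in this thread.

```python
def read_utf16_as_utf8(raw):
    """Input boundary: turn UTF-16 bytes (with a BOM) into UTF-8 bytes."""
    return raw.decode("utf-16").encode("utf-8")


def write_utf8_as_utf16(data):
    """Output boundary: turn UTF-8 bytes back into UTF-16 bytes."""
    return data.decode("utf-8").encode("utf-16")


def join_fields(fields, sep=b","):
    """Processing step: every field is assumed to already be UTF-8,
    so plain byte concatenation needs no encoding checks."""
    return sep.join(fields)


# Usage: one conversion in, byte-level work in the middle, one conversion out.
incoming = "naïve café".encode("utf-16")
field = read_utf16_as_utf8(incoming)
joined = join_fields([field, b"x"])
outgoing = write_utf8_as_utf16(joined)
```

The middle step never inspects an encoding, which is the point of the
guideline: the conversion cost is paid only at the two boundaries.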
Well, I think you can assume that people should stick to your
suggested design principles.
Probably Apple has some good developers, and they think UTF-16 is the
better choice.
Charles Yeomans