On 01/01/2012 08:01 PM, Chad J wrote:
On 01/01/2012 10:39 AM, Timon Gehr wrote:
On 01/01/2012 04:13 PM, Chad J wrote:
On 01/01/2012 07:59 AM, Timon Gehr wrote:
On 01/01/2012 05:53 AM, Chad J wrote:
If you haven't been educated about Unicode or how D handles it, you
might write this:
char[] str;
... load str ...
for ( int i = 0; i < str.length; i++ )
{
    font.render(str[i]); // Ewww.
    ...
}
That actually looks like a bug that might happen in real world code.
What is the signature of font.render?
In my mind it's defined something like this:
class Font
{
    ...

    /** Render the given code point at
        the current (x,y) cursor position. */
    void render( dchar c )
    {
        ...
    }
}
(Of course I don't know minute details like where the "cursor position"
comes from, but I figure it doesn't matter.)
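For illustration, here is a minimal sketch of how the original loop could be written under that assumed render(dchar) signature (renderAll is just a hypothetical wrapper name); in D, foreach with a dchar loop variable decodes a char[] (UTF-8) into code points, so render never sees a lone code unit:

void renderAll(Font font, char[] str)
{
    // foreach with a dchar loop variable decodes the UTF-8 code units
    // into code points before each iteration.
    foreach (dchar c; str)
    {
        font.render(c);
    }
}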
I probably wrote some code like that loop a very long time ago, but I
don't have that code around anymore, or at least it isn't easy to find.
I think the main issue here is that char implicitly converts to dchar:
this is an implicit reinterpret cast that is nonsensical if the
character is outside the ASCII range.
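To make that reinterpret cast concrete, here is a minimal sketch (the 'é' literal and the output noted in the comments are just for demonstration):

import std.stdio;

void main()
{
    string s = "é";          // U+00E9, encoded in UTF-8 as 0xC3 0xA9
    char c = s[0];           // first code unit: 0xC3
    dchar d = c;             // implicit char -> dchar just widens the value
    writefln("U+%04X", cast(uint) d); // prints U+00C3, which is not 'é' (U+00E9)
}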
I agree.
Perhaps the compiler should insert a check on the 8th bit in cases like
these?
I suppose it's possible someone could declare a bunch of individual
chars and then start manipulating code units that way, and such an
8th-bit check could thwart those manipulations, but I would also counter
that such low-level manipulations should be done on ubytes instead.
I don't know how much this would help, though. It seems like too little,
too late.
I think the conversion char -> dchar should just require an explicit
cast. The runtime check is better left to std.conv.to.
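A minimal sketch of what such a checked conversion might look like as a library routine (checkedToDchar is a hypothetical name, not an existing std.conv function):

import std.exception : enforce;

/// Accept only code units that form a complete code point on their
/// own, i.e. ASCII code units with the highest bit clear.
dchar checkedToDchar(char c)
{
    enforce((c & 0x80) == 0,
        "char is a fragment of a multi-byte UTF-8 sequence");
    return c; // an ASCII code unit is also a valid code point
}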
The bigger problem is that a char is being taken from a char[] and
thereby loses its context as (potentially) being part of a larger
code point.
If it is part of a larger code point, then it has its highest bit set.
Any individual char that has its highest bit set does not carry a
character on its own. If it is not set, then it is a single ASCII character.
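A small sketch illustrating that property (the "héllo" literal is just for demonstration):

import std.stdio;

void main()
{
    char[] str = "héllo".dup;

    // The two code units of 'é' (0xC3, 0xA9) each have the highest bit
    // set, so neither is a character on its own; the ASCII units are
    // below 0x80 and each stand for a character by themselves.
    foreach (char c; str)
        writefln("unit 0x%02X  high bit set: %s",
                 cast(ubyte) c, (c & 0x80) != 0);
}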