At 4:44 PM -0800 11/15/00, Markus Scherer wrote:

>In the case of Java, the equivalent course of action would be to 
>stick with a 16-bit char as the base type for strings. The int type 
>could be used in _additional_ APIs for single Unicode code points, 
>deprecating the old APIs with char.
>

It's not quite that simple. Many of the key APIs in Java already use 
ints instead of chars where chars are expected. In particular, the 
Reader and Writer classes in java.io do this.
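
To make that concrete, here is a minimal sketch of the int-based calls in java.io. The class name IntCharDemo and the copy() helper are made up for illustration. Reader.read() returns an int so that -1 can signal end of stream, and Writer.write(int) uses only the low 16 bits of its argument, so in both cases the int still carries a single 16-bit char rather than a full code point.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;

// Hypothetical demo class; only the java.io calls are real API.
class IntCharDemo {
    // Copies text one char at a time through the int-based read()/write(int).
    static String copy(String text) {
        StringReader in = new StringReader(text);
        StringWriter out = new StringWriter();
        try {
            int c;
            // read() yields 0..65535 for a char, or -1 at end of stream.
            while ((c = in.read()) != -1) {
                // write(int) takes an int but ignores the high 16 bits.
                out.write(c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(copy("abc"));
    }
}
```

A surrogate pair passes through this loop as two separate int values, one per surrogate char, so the int signatures alone do not give these classes code point semantics.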

I do agree that it makes sense to use strings rather than characters. 
I'm just wondering how bad the transition is going to be. Could we 
get away with eliminating (or at least deprecating) the char data 
type completely and all methods that use it? And can we do that 
without breaking all existing code and redesigning the language?

For example, consider the charAt() method in java.lang.String:

public char charAt(int index)

This method is used to walk strings, looking at each character in 
turn, a useful thing to do. Clearly it would be possible to replace 
it with a method with a String return type like this:

public String charAt(int index)

The returned string would contain a single character (which might be 
composed of two surrogate chars). However, we can't simply add that 
method because Java can't overload on return type. So we have to give 
that method a new name like:

public String characterAt(int index)

OK. That one's not too bad, maybe even more intelligible than what 
we're replacing. But we have to do this in hundreds of places in the 
API!  Some will be much worse than this.  Is it really going to be 
possible to make this sort of change everywhere? Or is it time to 
bite the bullet and break backwards compatibility? Or should we 
simply admit that non-BMP characters aren't that important and stick 
with the current API? Or perhaps provide special classes that handle 
non-BMP characters as an ugly bolt-on to the language that will be 
used by a few Unicode aficionados but ignored by most programmers, 
just like wchar_t is ignored in C to this day?
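
For what it's worth, the characterAt() idea above could be sketched roughly as follows. This is only an illustration under assumptions of my own: it is written as a static helper in a made-up class CharacterAtSketch rather than as a method of java.lang.String, the index is taken to be a char index (not a count of characters), and an unpaired surrogate is simply returned by itself.

```java
// Hypothetical sketch; not part of any Java API.
class CharacterAtSketch {
    // Surrogate ranges as defined by Unicode/UTF-16.
    static boolean isHighSurrogate(char c) {
        return c >= 0xD800 && c <= 0xDBFF;
    }

    static boolean isLowSurrogate(char c) {
        return c >= 0xDC00 && c <= 0xDFFF;
    }

    // Returns the character starting at the given char index as a String
    // of length 1, or of length 2 when a surrogate pair starts there.
    static String characterAt(String s, int index) {
        char first = s.charAt(index);
        if (isHighSurrogate(first)
                && index + 1 < s.length()
                && isLowSurrogate(s.charAt(index + 1))) {
            return s.substring(index, index + 2);
        }
        return s.substring(index, index + 1);
    }

    public static void main(String[] args) {
        String s = "a\uD800\uDC00b"; // 'a', U+10000 as a surrogate pair, 'b'
        // Walk the string, advancing by the length of each returned character.
        for (int i = 0; i < s.length(); ) {
            String ch = characterAt(s, i);
            System.out.println(ch.length());
            i += ch.length();
        }
    }
}
```

Note that walking a string now means advancing by the length of each returned String instead of incrementing the index by one, which is exactly the sort of change existing loops would need.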

None of these options is attractive. It may take the next 
post-Java language to really solve the problem.
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | [EMAIL PROTECTED] | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|                  The XML Bible (IDG Books, 1999)                   |
|              http://metalab.unc.edu/xml/books/bible/               |
|   http://www.amazon.com/exec/obidos/ISBN=0764532367/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://metalab.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://metalab.unc.edu/xml/     |
+----------------------------------+---------------------------------+
