You can just say, "Screw the number 8, let's use 21-bit bytes."
★じゅういっちゃん★
EKYWY TXLY NPZ P MPVD XPHYV LPWWQY NKT ZPN XT WYPZTX PE PMM ET HPWWD "EYX EKTSZPXV'Z HTWY GSX P XSHOYW EKPX TXY PXV LTHHQEHYXE, ET HY, QZ RSQEY ZLPWD"

--- Original Message ---
From: "Carl W. Brown" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc:
Date: 01/05/30 0:46
Subject: RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

>Ken,
>
>I suspect that Oracle is specifically pushing for this standard because of
>its unique data base design.  In a sense Oracle almost picks itself up by
>its own bootstraps.  It has always tried to minimize actual code.  Therefore
>it was a natural choice to implement Unicode with UTF-8 because it is easy
>to reuse the multibyte support with minor changes to handle a different
>character length algorithm.  This has been one of the reasons that Oracle
>has been successful.  Its tinker-toy-like design has enabled them to quickly
>adapt and add new features.  Now however, they should take the time to "do
>it right".  Its UTF-8 storage creates problems for database designers
>because they cannot predict field sizes.  This is a problem with MBCS code
>pages but UTF-8s will make it worse.  There will be lots of wasted storage
>when characters can vary in size from 1 to 6 bytes.
>
>Most other database systems require specific code to support Unicode.  As a
>consequence most have implemented using UCS-2.  Their migration is obviously
>to use UTF-16.  UTF-8s buys them nothing but headaches.
>
>Carl
>
>-----Original Message-----
>From: Kenneth Whistler [mailto:[EMAIL PROTECTED]]
>Sent: Tuesday, May 29, 2001 3:47 PM
>To: [EMAIL PROTECTED]
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
>Subject: RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and
>email)
>
>
>Carl,
>
>> Ken,
>>
>> UTF-8s is essentially a way to ignore surrogate processing.  It allows a
>> company to encode UTF-16 with UCS-2 logic.
>>
>> The problem is that by not implementing surrogate support you can introduce
>> subtle errors.  For example it is common to break buffers apart into
>> segments.  These segments may be reconcatenated but they may be processed
>> individually.
>
>You are preaching to the choir here. I didn't state that *I* was in
>favor of UTF-8S -- only that we have to be careful not to assume that
>UTC will obviously not support it. The proponents of UTF-8S are
>vigorously and actively campaigning for their proposal. In
>standardization committees, proposals that have committed, active
>proponents who can aim for the long haul often have a way of getting
>adopted in one form or another, unless there are equally committed
>and active opponents of the proposal. It is just the nature of
>consensus politicking in these committees, whether corporate based
>or national body based.
>
>Also, I consider the stated position of "near-universal agreement
>among the database vendors" to be largely a rhetorical device by
>the proponents. Oracle is clearly pushing the proposal. NCR has
>stated it is not in favor of the proposal. The other big enterprise
>database vendors are hedging their positions somewhat -- in
>particular, the standards people in those companies may not be
>entirely in agreement with some of their database engine developers, for
>example. And the small database vendors are either not playing
>in this space or are part of desktop systems that will just follow
>the behavior of the platforms.
>
>--Ken
>
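
To make the quoted points about field sizes and split buffers concrete, here is a minimal sketch in Python. It assumes UTF-8S behaves as Carl describes it, i.e. each UTF-16 code unit is encoded on its own as if it were a UCS-2 character. The helper names (utf16_units, utf8s_encode, splits_surrogate_pair) are made up for illustration and are not from any vendor's implementation or from the proposal text.

```python
def utf16_units(text):
    """Return the UTF-16 code units of `text` as a list of ints."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def utf8s_encode(text):
    """Hypothetical UTF-8S-style encoder (assumption: per-code-unit encoding).

    Each UTF-16 code unit is turned into UTF-8 bytes independently, so a
    surrogate pair becomes two 3-byte sequences (6 bytes) instead of the
    single 4-byte sequence that standard UTF-8 uses.
    """
    return b"".join(
        chr(unit).encode("utf-8", "surrogatepass") for unit in utf16_units(text)
    )


def splits_surrogate_pair(units, cut):
    """True if cutting the buffer before index `cut` separates a high
    surrogate from the low surrogate that follows it."""
    if not 0 < cut < len(units):
        return False
    return 0xD800 <= units[cut - 1] <= 0xDBFF and 0xDC00 <= units[cut] <= 0xDFFF


if __name__ == "__main__":
    ch = "\U00010400"  # a character outside the BMP
    print(len(ch.encode("utf-8")))   # standard UTF-8 length in bytes
    print(len(utf8s_encode(ch)))     # length under the per-code-unit scheme
    units = utf16_units("a" + ch)
    print([splits_surrogate_pair(units, i) for i in range(len(units) + 1)])
```

Code that segments a buffer with UCS-2 logic would need a check like splits_surrogate_pair before cutting, which is the kind of surrogate handling the per-code-unit approach tries to avoid.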