On Tue, Jun 12, 2001 at 11:46:30AM -0500, William A. Rowe, Jr. wrote:
> From: "Luke Kenneth Casson Leighton" <[EMAIL PROTECTED]>
> Sent: Tuesday, June 12, 2001 10:22 AM
>
> > for various reasons i am prompted to ask,
> >
> > how would the idea of having an apr_ucs16 set of routines,
> > apr_wstrcat, apr_wstrcpy, apr_wtolower, apr_wtoupper etc.,
> > be received?
>
> Well, since apr_isfoo apr_tofoo was 'reinvented', I don't see a
> huge problem.

cool.

> > on nt, it's easy: straightforward usage of the NT
> > wstrcat, wstrcpy etc. lines.
>
> These are the folks who never read the "Security Implications" of ucs-8
> leaving 40% of all IIS webservers still vulnerable, so I'm dubious :-)

*grin*.  btw, samba #defines strcpy to ERROR_USE_SAFE_STRCPY_INSTEAD etc.

sorry, forgot about this.  okay, rewrite that: how about an
equivalent apr_pwstrcat, apr_pwstrcpy with all the safety /
security / paranoia therein?

> Well, how about a simple question.  Why restrain ourselves to ucs2?

because it's what NT has: NT doesn't have 32-bit (ucs4?) unicode,
afaik, only 16-bit (ucs2?)

writing your own ucs4 library, forget it, might as well adopt
the glib one.  but iirc, the glib one _only_ does ucs4, not ucs2.

> (No such thing as ucs16/32, it's ucs2/4).
>

ack.

> Can iconv/apr_iconv provide this in a charset-opaque manner?  That is, if
> I want three 'characters' in shift-jis, can it give me the right number
> of bytes?  The reason is simple, Unicode is already splintered into a
> multi-word character set anyways.  I suspect it's easier to just get it
> right, knowing the apr_xlate that's been opened, and asking for the char
> len v.s. the byte len (sizeof) and providing the strcpy/cmp, etc.

you need to be able to wtoupper, wtolower etc.  that requires a lookup
table.  samba has an optimised lookup table of the standard ucs2
upper/lower conversion tables that is small enough to fit into the
2nd-level cache of an intel processor.

luke
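
A minimal sketch, in plain C, of the two ideas raised in this message: a bounded copy over 16-bit code units, and a table-driven uppercase conversion. All names here (ucs2_t, sketch_wstrlcpy, sketch_wtoupper) are hypothetical and are neither APR nor samba APIs. The table is an assumption for illustration only: it covers ASCII plus the Latin-1 letters whose uppercase form also lies in Latin-1, whereas a real table would span the full 16-bit range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: one UCS-2 code unit (not an APR type). */
typedef uint16_t ucs2_t;

/* Illustrative table size: code points 0x0000..0x00FF only. */
#define UPPER_TABLE_SIZE 256

static ucs2_t upper_table[UPPER_TABLE_SIZE];
static int upper_table_ready = 0;

/* Fill the table: identity by default, then the simple case pairs. */
static void init_upper_table(void)
{
    unsigned c;

    for (c = 0; c < UPPER_TABLE_SIZE; c++)
        upper_table[c] = (ucs2_t)c;

    for (c = 0x61; c <= 0x7A; c++)          /* a..z -> A..Z */
        upper_table[c] = (ucs2_t)(c - 0x20);

    for (c = 0xE0; c <= 0xFE; c++)          /* Latin-1 lowercase letters */
        if (c != 0xF7)                      /* U+00F7 is the division sign */
            upper_table[c] = (ucs2_t)(c - 0x20);

    upper_table_ready = 1;
}

/* Table lookup; code units outside the table are returned unchanged. */
ucs2_t sketch_wtoupper(ucs2_t c)
{
    if (!upper_table_ready)
        init_upper_table();
    return c < UPPER_TABLE_SIZE ? upper_table[c] : c;
}

/*
 * Bounded copy: writes at most dstlen - 1 code units and always
 * NUL-terminates when dstlen > 0.  Returns the length of src in
 * code units, so a result >= dstlen means the copy was truncated.
 */
size_t sketch_wstrlcpy(ucs2_t *dst, const ucs2_t *src, size_t dstlen)
{
    size_t i = 0;

    if (dstlen > 0) {
        for (; i + 1 < dstlen && src[i] != 0; i++)
            dst[i] = src[i];
        dst[i] = 0;
    }
    while (src[i] != 0)                     /* finish measuring src */
        i++;
    return i;
}
```

A caller would compare the return value of sketch_wstrlcpy against the buffer size to detect truncation. The lazy table initialisation is not thread-safe; a library version would presumably build the table once at startup, and a pool-based variant in the style of the apr_p* string routines would allocate the destination instead of taking a caller buffer.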
