On Fri, Mar 19, 2004 at 05:08:00PM -0500, Jeff Trawick wrote:
> Joe Orton wrote:
> >Hmm. Is this sbcs thing really safe at all? Just because a character
> >set translation gives a particular mapping for 0x00-0xff in that order
> >why is it guaranteed that it will for any other ordering of bytes?
> >
> >e.g. invent a mapping which does "0xff <end>" -> "0xff" but "0xff 0xf1"
> >-> "0x42". I'd be surprised if a mapping between two real charsets does
> >*not* exist which does something like this, given the range of extremely
> >weird and wonderful charsets out there.
>
> to rule out this issue completely, 256 calls to iconv() would be required
> in check_sbcs() to test each proposed byte/char individually ;)

My example is a bit bogus, since iconv presumably can't handle a charset
where "0xff" and "0xff 0xf1" represent different characters, but the
point stands.

Yes, logically, it would be necessary to do 256 individual iconv() calls
to accurately deduce a single-byte lookup table for a particular mapping,
if one is available. So it seems no optimisation is really possible
here; what performance problem is this code solving? The overhead of
iconv is surely in iconv_open() anyway: I think the right thing to do is
remove the check_sbcs() functions.

joe
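
For reference, the "256 individual iconv() calls" approach could look
roughly like the sketch below. This is not the existing check_sbcs()
code; build_sbcs_table() is a hypothetical name, and the sketch assumes
the POSIX iconv() prototype with a char ** input argument (some
platforms declare it const char **). Each byte is converted on its own,
and the table is accepted only if every byte yields exactly one output
byte.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not part of APR): fill table[] with the
 * single-byte mapping from charset `from` to charset `to`.
 * Returns 0 on success, -1 if any byte fails to convert or does not
 * map to exactly one output byte. */
int build_sbcs_table(const char *to, const char *from,
                     unsigned char table[256])
{
    iconv_t cd;
    int i;

    assert(table != NULL);

    cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;

    for (i = 0; i < 256; i++) {
        char in = (char)i;
        char out[8];
        char *inp = &in;
        char *outp = out;
        size_t inleft = 1;
        size_t outleft = sizeof out;

        /* Convert this one byte in isolation. */
        if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1
            || inleft != 0) {
            iconv_close(cd);
            return -1;
        }

        /* Flush any pending shift sequence and reset the descriptor
         * to its initial state before the next byte. */
        if (iconv(cd, NULL, NULL, &outp, &outleft) == (size_t)-1) {
            iconv_close(cd);
            return -1;
        }

        /* A single-byte mapping must emit exactly one byte. */
        if (outp - out != 1) {
            iconv_close(cd);
            return -1;
        }
        table[i] = (unsigned char)out[0];
    }

    iconv_close(cd);
    return 0;
}
```

With this, an identity conversion such as ISO-8859-1 to ISO-8859-1
should yield a table, while ISO-8859-1 to UTF-8 should be rejected as
soon as a byte above 0x7f expands to two output bytes. Note that the
iconv_open() cost is still paid once per table, which is the overhead
mentioned above.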