Re: Large string patch

David & Lisa Jacobs Mon, 31 Dec 2001 08:51:06 -0800

From: "Dan Sugalski" <[EMAIL PROTECTED]>
> >Agreed.  I'll probably have the encoding structure provide the terminating
> >bytes.  As a side note don't we also have to split UTF-16 into UTF-16BE and
> >UTF-16LE (big endian and little endian)?
>
> I think UTF-16 can be a single encoding. The little/big endian issue can be
> dealt with by an I/O filter.


Will an I/O filter have an opportunity to inject itself when we mmap a file?
It was because you said you wanted this capability that I thought we were
maintaining the serialized forms of Unicode encodings.  Otherwise, I would
be highly tempted to convert the internal representation of all Unicode
strings into an array of 4-byte ints (which allows for much faster processing).
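
As a rough illustration of the trade-off, the following is a minimal C sketch
of the conversion such a scheme would need: serialized UTF-16 bytes of a known
byte order (for example, bytes taken straight from an mmap'ed region) are
decoded into an array of 4-byte code points. The names utf16_to_utf32 and
read16 are hypothetical and do not come from the patch under discussion; the
byte order is assumed to be known by the caller rather than detected.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read one 16-bit code unit in the given byte order. */
static uint32_t read16(const unsigned char *p, int big_endian)
{
    return big_endian ? ((uint32_t)p[0] << 8) | p[1]
                      : ((uint32_t)p[1] << 8) | p[0];
}

/* Hypothetical sketch, not the patch's API: decode in_bytes bytes of UTF-16
 * into 4-byte code points.  Returns the number of code points written, or
 * (size_t)-1 if the input is malformed or out is too small. */
size_t utf16_to_utf32(const unsigned char *in, size_t in_bytes,
                      int big_endian, uint32_t *out, size_t out_cap)
{
    size_t i = 0, n = 0;

    if (in_bytes % 2 != 0)
        return (size_t)-1;              /* truncated code unit */

    while (i < in_bytes) {
        uint32_t u = read16(in + i, big_endian);
        i += 2;

        if (u >= 0xD800 && u <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow. */
            uint32_t lo;
            if (i >= in_bytes)
                return (size_t)-1;
            lo = read16(in + i, big_endian);
            i += 2;
            if (lo < 0xDC00 || lo > 0xDFFF)
                return (size_t)-1;
            u = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            return (size_t)-1;          /* low surrogate with no high one */
        }

        if (n >= out_cap)
            return (size_t)-1;          /* output buffer too small */
        out[n++] = u;
    }
    return n;
}
```

Once decoded, the n-th character is simply out[n], with no surrogate
handling at lookup time. The cost is a copy of the data, so an mmap'ed file
could no longer be used in place.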

David
