Morning all,
be gentle with me, I'm not all that good a developer! ;-)
Given the problem with varchars being defined in bytes but needing to
store chars, how feasible would it be to allow the definition of a
column (or variable) in a manner similar to how Oracle does it?
create table ...(
a_c
> Unfortunately the implementation of UTF-8 in Firebird is annoying
> because it reduces the maximum allowed number of characters to 1/4 of
> that for single byte character sets making it necessary to switch to
> blobs sooner.
IIRC, old Interbase versions defined maximum Varchar length in *byte
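
To make the 1/4 figure in the quoted text concrete, here is a rough sketch of the arithmetic. The 32765-byte VARCHAR limit and the helper name max_varchar_chars are assumptions for illustration; neither is stated in this thread.

```python
# Assumed byte limit for a VARCHAR column (not stated in the thread).
MAX_VARCHAR_BYTES = 32765


def max_varchar_chars(max_bytes_per_char, byte_limit=MAX_VARCHAR_BYTES):
    """Characters that fit when every character reserves its worst-case size."""
    return byte_limit // max_bytes_per_char


if __name__ == "__main__":
    # 1 = single byte character set, 2 = fixed 2-byte set, 4 = UTF-8 worst case
    for width in (1, 2, 4):
        print(width, "byte(s) per char ->", max_varchar_chars(width), "chars")
```

Under that assumption, reserving 4 bytes per character leaves roughly a quarter of the single byte capacity, and a 2-byte reservation leaves roughly half.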
On Mon, Sep 02, 2013 at 03:30:26PM +0200, Stefan Heymann wrote:
> >>> I'd prefer to have an option to use UTF-16 (treated as a 2-byte
> >>> character set with surrogate pairs) as that will only halve the
> >>> maximum allowed number of characters.
>
> The maximum allowed number of characters in Un
>>> I'd prefer to have an option to use UTF-16 (treated as a 2-byte
>>> character set with surrogate pairs) as that will only halve the
>>> maximum allowed number of characters.
The maximum allowed number of characters in Unicode is about 1
million, which can be perfectly represented by either UTF
On Aug 31, 2013, at 4:55 AM, Mark Rotteveel wrote:
> On 29-8-2013 17:41, Jim Starkey wrote:
>> Paradoxically, Japanese strings tend to be shorter in UTF-8 than in 16-bit
>> Unicode. The reason is simple: enough characters -- punctuation, control
>> characters, and digits --
31.08.2013 13:53, Mark Rotteveel wrote:
> On 31-8-2013 13:38, Dimitry Sibiryakov wrote:
>> 31.08.2013 13:29, Mark Rotteveel wrote:
>>> As most languages don't need those surrogate pairs for their
>>> codepoints/glyphs, it is easier to consider UTF-16 to be 2 byte. As far
>>> as I know this is how m
On 31-8-2013 13:38, Dimitry Sibiryakov wrote:
> 31.08.2013 13:29, Mark Rotteveel wrote:
>> As most languages don't need those surrogate pairs for their
>> codepoints/glyphs, it is easier to consider UTF-16 to be 2 byte. As far
>> as I know this is how most UTF-16 implementations handle it.
>
>
31.08.2013 13:29, Mark Rotteveel wrote:
> As most languages don't need those surrogate pairs for their
> codepoints/glyphs, it is easier to consider UTF-16 to be 2 byte. As far
> as I know this is how most UTF-16 implementations handle it.
In that case UTF-16 is no different from UCS-2.
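
A small sketch of where the two differ, using Python's built-in codecs. Characters in the Basic Multilingual Plane take one 16-bit code unit in UTF-16, while characters above U+FFFF take a surrogate pair, which UCS-2 cannot represent. The helper name utf16_code_units is hypothetical.

```python
def utf16_code_units(ch):
    """Number of 16-bit code units UTF-16 needs for a single character."""
    # "utf-16-le" emits no byte order mark, so the length is payload only.
    return len(ch.encode("utf-16-le")) // 2


if __name__ == "__main__":
    samples = ["A", "\u65e5", "\U0001F600"]  # ASCII, CJK (BMP), outside the BMP
    for ch in samples:
        units = utf16_code_units(ch)
        print(f"U+{ord(ch):04X}: {units} code unit(s), {units * 2} byte(s)")
```

Treating UTF-16 as a fixed 2-byte set is only safe while no character outside the BMP is stored; once surrogate pairs are counted, the worst case is 4 bytes per character.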
On 31-8-2013 13:15, Dimitry Sibiryakov wrote:
> 31.08.2013 10:55, Mark Rotteveel wrote:
>> I'd prefer to have an option to use UTF-16 (treated as a 2-byte
>> character set with surrogate pairs) as that will only halve the maximum
>> allowed number of characters.
>
> Nope. If you take into accou
31.08.2013 10:55, Mark Rotteveel wrote:
> I'd prefer to have an option to use UTF-16 (treated as a 2-byte
> character set with surrogate pairs) as that will only halve the maximum
> allowed number of characters.
Nope. If you take into account surrogates, UTF-16 will have the same maximum
of 4
On 29-8-2013 17:41, Jim Starkey wrote:
> Paradoxically, Japanese strings tend to be shorter in UTF-8 than in 16-bit
> Unicode. The reason is simple: enough characters -- punctuation, control
> characters, and digits -- stay as single bytes,
> double byte characters are a wash, a
Paradoxically, Japanese strings tend to be shorter in UTF-8 than in 16-bit
Unicode. The reason is simple: enough characters -- punctuation, control
characters, and digits -- stay as single bytes,
double byte characters are a wash, and the single byte characters
generally bal
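
One way to check this on your own data is to compare encoded sizes directly. The sketch below uses two made-up sample strings and a hypothetical helper, encoded_sizes; which encoding comes out shorter depends entirely on the mix of single byte and CJK characters in the text.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, without a byte order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


if __name__ == "__main__":
    samples = [
        "2013年8月31日 12:30, Firebird 2.5",  # mostly digits and punctuation
        "こんにちは",                          # kana only
    ]
    for text in samples:
        utf8_len, utf16_len = encoded_sizes(text)
        print(f"{text!r}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

In these two samples the mixed string is shorter in UTF-8 and the kana-only string is shorter in UTF-16, so real data should be measured rather than assumed.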
On 9-8-2013 01:18, Adriano dos Santos Fernandes wrote:
> On 08-08-2013 13:30, Mark Rotteveel wrote:
>> Looking in the source of intl_builtin.cpp I noticed that there is
>> support for UTF16, UTF32 and UNICODE_UCS2, for UNICODE_UCS2 there is
>> also a constant (=8) defined in charsets.h
>>
>> These
Adriano wrote:
>> Looking in the source of intl_builtin.cpp I noticed that there is
>> support for UTF16, UTF32 and UNICODE_UCS2, for UNICODE_UCS2 there is
>> also a constant (=8) defined in charsets.h
>>
>> These definitions are missing from RDB$CHARACTER_SETS. Can these be used
>> as a connec
On 08-08-2013 13:30, Mark Rotteveel wrote:
> Looking in the source of intl_builtin.cpp I noticed that there is
> support for UTF16, UTF32 and UNICODE_UCS2, for UNICODE_UCS2 there is
> also a constant (=8) defined in charsets.h
>
> These definitions are missing from RDB$CHARACTER_SETS. Can these
Looking in the source of intl_builtin.cpp I noticed that there is
support for UTF16, UTF32 and UNICODE_UCS2, for UNICODE_UCS2 there is
also a constant (=8) defined in charsets.h
These definitions are missing from RDB$CHARACTER_SETS. Can these be used
as a connection or column character set? If