[firebird-support] COLLATE usage for blob and varchar fields on an UTF8 database

patrick_marten Mon, 02 Jul 2012 06:45:10 -0700

Hello,

currently I have a database with ISO8859_1 as character set. This character set 
applies to most fields, but some have a different one, so that a table looks 
like this:



CREATE TABLE MY_TABLE
(
  ...,
  DESCRIPTION1                                          BLOB SUB_TYPE 1         
SEGMENT SIZE 80,
  DESCRIPTION2                                          BLOB SUB_TYPE 1         
SEGMENT SIZE 80 CHARACTER SET ISO8859_5,
  SHORT_DESCRIPTION1                            VARCHAR(   100),
  SHORT_DESCRIPTION2                            VARCHAR(   100)                 
CHARACTER SET ISO8859_5,
  SOME_TEXTFIELD                                        VARCHAR(    50)         
COLLATE ISO8859_1,
  ...;
);

We are planning to convert everything to unicode and have created a database 
script + a database with UTF8 as character set, so that we can continue with 
development and then see, how to convert existing customer databases etc. (this 
will probably lead to a separate topic here with a lot of questions ;) )

Unfortunately I don't remember how we had created our current testing-database, 
so that a few things need to be clarified.

Another step will be, to change some blob fields from SUB_TYPE 1 to SUB_TYPE 0, 
because the component we will be using on the UTF8 version is working with a 
binary format for descriptions etc., so that the similar part of the new script 
(UTF8 database) looks like this:

CREATE TABLE MY_TABLE
(
  ...,
  DESCRIPTION1                                          BLOB SUB_TYPE 0         
        SEGMENT SIZE 80 COLLATE UNICODE,
  DESCRIPTION2                                          BLOB SUB_TYPE 0         
        SEGMENT SIZE 80 COLLATE UNICODE,
  SHORT_DESCRIPTION1                            VARCHAR(   100)         COLLATE 
UNICODE,
  SHORT_DESCRIPTION2                            VARCHAR(   100)         COLLATE 
UNICODE,
  SOME_TEXTFIELD                                        VARCHAR(    50)         
COLLATE UNICODE,
  ...;
);

IBExpert shows me current UTF8 table as follows:

Field Name | Field Type | Size | Subtype | Charset | Collate |
-------------------------------------------------------------
DESCRIPTION1 | BLOB | 80 | Binary | UTF8 | UTF8 (*) |
DESCRIPTION2 | BLOB | 80 | Binary | UTF8 | UTF8 (*) |
SHORT_DESCRIPTION1 | VARCHAR | 100 |  | UTF8 | UNICODE |
SHORT_DESCRIPTION1 | VARCHAR | 100 |  | UTF8 | UNICODE |
SOME_TEXTFIELD | VARCHAR | 50 |  | UTF8 | UNICODE |

Collate for the first two fields is different in IBExpert than in the script.

I've needed to add some fields to the table and tried this, based on the UTF8 
script:

alter table MY_TABLE add  ANOTHER_DESCRIPTION BLOB SUB_TYPE 0  SEGMENT SIZE 80  
COLLATE UNICODE;

but this fails with an error like "Invalid datatype. Incorrect usage of 
Character set or Collate".

In order to work it had to be:

alter table MY_TABLE add  ANOTHER_DESCRIPTION BLOB SUB_TYPE 0  SEGMENT SIZE 80;

So the first question is, how is it possible, that IBExpert shows me UTF8 for 
the "Collate" column (see above at the (*) symbol).

Because of the error with alter table, I thought that I must have created the 
table without collate and just forgott to edit the script.
I thought, that IBXpert automatically shows UTF8 as character set and collate 
in such a case, i.e. if nothing was specified for those fields.

But for the newely added field "ANOTHER_DESCRIPTION" IBExpert shows this:
Field Name | Field Type | Size | Subtype | Charset | Collate |
-------------------------------------------------------------
DESCRIPTION1 | BLOB | 80 | Binary |  |  |

So now I'm somehwat confused and am not sure, what is correct and why there are 
such differences and how to use collate properly in that case.

Please put some light into the dark...

Kind regards,
Patrick

[firebird-support] COLLATE usage for blob and varchar fields on an UTF8 database

Reply via email to