On 2011-04-01 08:48, Dmitry Yemanov wrote:
> All,

Well, I'm not a developer, so feel free to ignore me. I'm answering 
from the point of view of an FB user.

[snip examples and text about altering charset/collation on existing 
columns]

> However, now I doubt whether it's correct at all. By design, our DDL
> doesn't touch data, so every table change must be valid a priori. We
> have a number of validations for data types and string lengths, but I
> believe they're missing for charsets/collations.
>
> What if someone changes a column with existing utf8 encoded data to
> win1251, but the data belongs to a different code page? It may make
> the table unreadable. It's easy to reproduce by changing a win1251
> column with some Cyrillic text into an ascii one.
>
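If I understand the scenario right, it's something like this (an 
untested sketch of my own; table and column names are made up):

  CREATE TABLE t (name VARCHAR(20) CHARACTER SET WIN1251);
  INSERT INTO t VALUES ('Привет');  -- Cyrillic, fine in WIN1251

  -- Only the declared charset changes, the stored bytes don't:
  ALTER TABLE t ALTER COLUMN name TYPE VARCHAR(20) CHARACTER SET ASCII;

  SELECT name FROM t;  -- the column has become unreadable
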
> Collations fortunately don't affect record storage, but they may
> affect indices. What if there's a unique index on a column and its
> collation is changed to be case-insensitive? Doesn't that make the
> index invalid (uniqueness violated)?
>
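Again, just to check my understanding (my own sketch; I'm using 
UNICODE_CI as the case-insensitive collation):

  CREATE TABLE u (code VARCHAR(10) CHARACTER SET UTF8);
  CREATE UNIQUE INDEX u_code ON u (code);
  INSERT INTO u VALUES ('abc');
  INSERT INTO u VALUES ('ABC');  -- distinct under the default collation

  -- If this is accepted, the unique index suddenly holds duplicates:
  ALTER TABLE u ALTER COLUMN code TYPE VARCHAR(10) CHARACTER SET UTF8
    COLLATE UNICODE_CI;
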
> I believe we need to brainstorm this. Should we roll back all these
> fixes? Should we allow them but add the proper consistency checks,
> e.g. allow ASCII->any-single-byte-or-utf8 and any->UTF8 conversions
> and disallow everything else? Should we add a DFW job to validate the
> existing data like we did for NOT NULL? Any other suggestions?
>
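If I read the ASCII/UTF8 rule right, it would mean something like this 
(my own sketch, assuming column a is ASCII and column b is WIN1251):

  -- Allowed: ASCII data is valid in any single-byte charset and UTF8
  ALTER TABLE t ALTER COLUMN a TYPE VARCHAR(20) CHARACTER SET WIN1252;

  -- Allowed: any -> UTF8
  ALTER TABLE t ALTER COLUMN b TYPE VARCHAR(20) CHARACTER SET UTF8;

  -- Disallowed: WIN1251 data is not necessarily valid ASCII
  ALTER TABLE t ALTER COLUMN b TYPE VARCHAR(20) CHARACTER SET ASCII;
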
> Right now our behaviour is very inconsistent, to say the least.

Yes. As a user I find there are pros and cons to all the available 
"options", ranging from

"allow everything and do all necessary data conversions and validations 
automatically, raising errors only when that fails, e.g. because a 
varchar col changed to int contains non-numeric values"

to

"disallow everything that potentially breaks data or indices, resulting 
in no need for data and constraint validation at all".
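
For the first option, here's a sketch of the varchar-to-int case I 
mean (I don't think FB allows this particular change today at all):

  CREATE TABLE t (amount VARCHAR(10));
  INSERT INTO t VALUES ('123');  -- would convert fine
  INSERT INTO t VALUES ('n/a');  -- would make the ALTER below fail

  -- Under "allow everything", this would succeed or raise an error
  -- depending on the existing data:
  ALTER TABLE t ALTER COLUMN amount TYPE INTEGER;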

Frankly, I don't really know what's best.

In general, I perceive the philosophy of FB to be that the person 
writing DSQL/PSQL/DDL etc. should know what (s)he's doing and the 
possible implications, including possibly heavy work (long execution 
times, bad performance, etc.). In other words, if you write a query 
that can't use an index and performs a natural scan on a large table, 
it's your fault, not Firebird's, and Firebird has no responsibility to 
warn you unless you explicitly request it (by viewing the query plan).

It also seems to me that Firebird is very complete regarding features 
specified by the SQL standard(s).

So, I'd say, following these "guidelines": if there are paragraphs in 
the relevant standard documents that state which datatype changes 
should be allowed or not, try to comply with those as much as possible.

But most important is to be consistent, of course: if something works 
in one "place" it should work "everywhere", and vice versa; and no 
matter what you do, FB should make sure you don't end up with invalid 
data (data that violates existing constraints) or a crashed database.

So, if charset changes can be done via a domain, it should be possible 
without using a domain as well, but the data has to at least be 
validated to confirm it doesn't require a transliteration, or a 
transliteration should happen automatically. Also, any indices that 
need to be rebuilt must be rebuilt automatically.
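
E.g., something like this, if I've understood the domain route 
correctly (my own sketch):

  CREATE DOMAIN d_name AS VARCHAR(20) CHARACTER SET ASCII;
  CREATE TABLE person (name d_name);

  -- Changes the charset of person.name as well; the existing ASCII
  -- data is valid UTF8, so no transliteration is needed:
  ALTER DOMAIN d_name TYPE VARCHAR(20) CHARACTER SET UTF8;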

I'm not sure I made much sense... Tell me, does the standard specify 
anything regarding datatype change features? If so, what does it say?

Kjell
-- 
--------------------------------------
Kjell Rilbe
DataDIA AB
E-mail: kj...@datadia.se
Phone: 08-761 06 55
Mobile: 0733-44 24 64

