On 2011-04-01 08:48, Dmitry Yemanov wrote:
> All,

Well, I'm not a developer, so feel free to ignore me. I answer from the
point of view of an FB user.
[snip examples and text about altering charset/collation on existing columns]

> However, now I doubt whether it's correct at all. By design, our DDL
> doesn't touch data, so every table change must be valid a priori. We
> have a number of validations for data types and string lengths, but I
> believe they're missing for charsets/collations.
>
> What if someone changes the column with existing utf8 encoded data to
> win1251 but the data belongs to a different code page? It may make the
> table unreadable. It's easy to reproduce by changing a win1251 column
> with some Cyrillic text into an ascii one.
>
> Collations fortunately don't affect record storage, but they may affect
> indices. What if there's a unique index on a column and its collation
> is changed to be case-insensitive? Doesn't it make the index invalid
> (uniqueness violated)?
>
> I believe we need to brainstorm this. Should we roll back all these
> fixes? Should we allow them but add the proper consistency checks, e.g.
> allow ASCII->any-single-byte-or-utf8 and any->UTF8 conversions and
> disallow everything else? Should we add a DFW job to validate the
> existing data like we did for NOT NULL? Any other suggestions?
>
> Right now our behaviour is very inconsistent, to say the least.

Yes. As a user, I find there are pros and cons with all the available
"options", from "allow everything and do all necessary data conversions
and validations automatically, raising errors only when that fails,
e.g. because a varchar column changed to int contains non-numeric
values" to "disallow everything that potentially breaks data or
indices, resulting in no need for data or constraint validation at
all". Frankly, I don't really know what's best.

In general, I perceive the philosophy of FB to be that the person
writing DSQL/PSQL/DDL etc. should know what (s)he's doing and the
possible implications, including possibly heavy work (long execution
times, bad performance, etc.). In other words, if you write a query
that can't use an index and performs a natural scan on a large table,
it's your fault, not Firebird's, and Firebird has no responsibility to
warn you unless you explicitly request it (by viewing the query plan).
It also seems to me that Firebird is very complete regarding features
specified by the SQL standard(s).

So, following these "guidelines", I'd say: if there are paragraphs in
the relevant standard documents that state which datatype changes
should be allowed or not, try to comply with those as much as possible.
But most important is to be consistent, of course: if something works
in one "place" it should work "everywhere" and vice versa, and no
matter what you do, FB should make sure you don't end up with invalid
data (data that violates existing constraints) or a crashed database.

So, if charset changes can be done via a domain, it should be possible
without using a domain as well, but the data has to at least be
validated to confirm it doesn't require a transliteration, or a
transliteration should happen automatically (see the sketches at the
end). Also, any indices that need to be rebuilt must be rebuilt
automatically.

I'm not sure I made much sense... Tell me, does the standard specify
anything regarding datatype change features? If so, what does it say?
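P.S. To make Dmitry's first failure mode concrete, here's roughly the
repro I understand him to describe. This is an untested sketch, and I'm
assuming the ALTER TABLE ... ALTER ... TYPE syntax from the fixes under
discussion accepts a new character set:

  create table t (s varchar(20) character set win1251);

  -- Cyrillic text, perfectly valid as WIN1251 bytes (assuming the
  -- connection charset can represent it)
  insert into t (s) values ('Привет');

  -- declare the column ASCII; by design the DDL leaves the stored
  -- WIN1251 bytes untouched
  alter table t alter s type varchar(20) character set ascii;

  -- the stored bytes > 127 are now read as ASCII, so any fetch that
  -- needs transliteration to the connection charset should fail with
  -- "Cannot transliterate character between character sets"
  select s from t;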
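And the second failure mode, with the unique index. Again just a
sketch; I picked UTF8 with UNICODE_CI only because it's a built-in
case-insensitive collation:

  create table u (
    name varchar(10) character set utf8,
    constraint uq_name unique (name)
  );

  -- distinct keys under the default (binary) collation
  insert into u (name) values ('abc');
  insert into u (name) values ('ABC');

  -- switch to a case-insensitive collation; unless the unique index is
  -- rebuilt and re-validated, it now holds two keys that compare equal
  alter table u alter name type varchar(10) character set utf8
    collate unicode_ci;

  -- uniqueness is already violated at this point
  select count(*) from u where name = 'abc';  -- 2 rows match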
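And the domain route I mentioned, for comparison. Also a sketch,
assuming ALTER DOMAIN ... TYPE accepts a charset change the same way;
consistency would mean this behaves exactly like the direct ALTER TABLE
variant, including whatever validation or automatic transliteration we
settle on:

  create domain d_name as varchar(20) character set win1251;
  create table t2 (s d_name);

  -- the same charset change as above, but through the domain
  alter domain d_name type varchar(20) character set ascii;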
Kjell

--
--------------------------------------
Kjell Rilbe
DataDIA AB
E-mail: kj...@datadia.se
Phone: 08-761 06 55
Mobile: 0733-44 24 64