Hi Sylvain,

I didn't know the system tables are purged/recreated at startup, and I didn't know about SystemTable, either. So, I think I'll pick up what's left of my pride and go back to reading the code some more :)
I'll note that I am adding a column to one of the tables (system_columnfamiles), so I won't get into the remove column problem you mention.

Thanks for your help,

-Jason

On Fri, Jan 25, 2013 at 10:19 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:

> The System tables schema is not really saved on disk. That is, they do
> appear in the System.schema* tables if you try to read them, but they are
> entirely removed and rewritten at startup: see SystemTable.finishStartup.
> So I'm not sure there is a real need for what you're suggesting, but I may
> have misunderstood you.
>
> That being said, I do note that when updates are made to a System table
> schema, we obviously need to be wary of backward compatibility. So adding
> a new column is always fine, but we might not want to remove columns in
> minor releases, now that users have easy access to those tables.
>
> That being said, what I'm saying here stands for those System tables that
> are local to the node (local, peers, hints, etc...). It's more complicated
> with the distributed ones unfortunately, where I don't think there is a
> safe way currently to remove any column even in a major upgrade, because
> that would break rolling upgrades (we could be able to remove a column in
> 2 major upgrades, provided we are willing to say you can't skip major
> versions).
>
> --
> Sylvain
>
>
> On Fri, Jan 25, 2013 at 6:23 PM, Jason Brown <jasedbr...@gmail.com> wrote:
>
> > Hi all,
> >
> > I'm working on a ticket that will require an update to one of the system
> > tables. Now that we're using system tables with defined schema, starting
> > with 1.2, the question becomes how to update existing tables. Currently,
> > most, if not all, of the system tables have their CREATE TABLE statements
> > hard-coded in the CFMetaData class.
> > It seems to me that adding an update statement to this class every time
> > we make a system change will be a bit awkward, and working in the logic
> > to determine which update to apply when upgrading c* versions may get
> > hairy.
> >
> > Hence, I propose a schema migration system where we keep all system
> > schema changes stored as flat files (perhaps in a new 'conf/migrations'
> > directory). Each file would have a unique numeric id (monotonically
> > incrementing) to identify the migration version, thus implying order
> > between the migrations. The migration version could be stored in the file
> > itself, but it is probably easier to store it in the file name, preferably
> > as the first token in the file name to allow easier sorting. (This is how
> > I've done it before.)
> >
> > We would then have a new system table to track the current migration
> > status, system.migration, which would look something like this:
> >
> >   use system;
> >   create table migration (version int);
> >
> > At startup, c* would read the system.migration.version value and compare
> > it against the migration files' ids. If the stored value is lower, the
> > higher numbered migration files would be applied. This would be a minor
> > cost at upgrade time (most of which you would have to incur, anyways,
> > because of the upgrade), but virtually free any other time. It allows us
> > to have better visibility into system schema change over time, as well.
> >
> > I'm happy to jump in and start working on this if the community thinks it
> > a worthwhile addition.
> >
> > Thanks,
> >
> > -Jason Brown
> > Netflix
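
For illustration only, the startup check proposed in the original message could be sketched roughly as below. This is Python rather than Cassandra's Java, purely for brevity, and it is not code from the project. The file naming pattern ("<numeric id>_<description>.cql"), the function names, and the `execute` callback are all assumptions made for the example; the returned version stands in for whatever would be written back to system.migration.version.

```python
import re
from pathlib import Path

# Assumed naming convention (hypothetical): "<numeric id>_<description>.cql",
# e.g. "0002_add_column.cql". The id is the first token of the file name.
MIGRATION_NAME = re.compile(r"^(\d+)_.*\.cql$")


def pending_migrations(migrations_dir, current_version):
    """Return (version, path) pairs with version > current_version, ascending."""
    found = []
    for path in Path(migrations_dir).iterdir():
        match = MIGRATION_NAME.match(path.name)
        if match and path.is_file():
            found.append((int(match.group(1)), path))
    # Sort on the parsed integer so that 10 comes after 9, not after 1.
    found.sort(key=lambda pair: pair[0])
    return [(version, path) for version, path in found if version > current_version]


def apply_pending(migrations_dir, current_version, execute):
    """Run each pending migration through `execute`; return the resulting version.

    `execute` is a hypothetical callback that takes the file's text.
    """
    for version, path in pending_migrations(migrations_dir, current_version):
        execute(path.read_text())
        # Stand-in for updating the stored system.migration.version value.
        current_version = version
    return current_version
```

When the stored version already matches the highest file id, `pending_migrations` returns an empty list and nothing is executed, which is the "virtually free any other time" case. The sketch does not handle duplicate ids or a migration that fails partway through; a real implementation would need a policy for both.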