Re: [GSoC 2012] Schema Alteration API proposal
On 06/04/12 06:34, j4nu5 wrote:
> Actually I am not planning to mess with syncdb and other management commands. I will only refactor django.db.backends creation functions like sql_create_model etc. to use the new API. Behaviour and functionality will be the same after the refactor, so management commands like syncdb will not notice a difference.

Alright, that's at least going to leave things in a good working state, then.

> Currently, I can only think of things like the unique index on SQLite and oddities in MySQL, mostly again from South's test suite. I will give another update before today's deadline.

There's a few other ones that South handles - like booleans in SQLite - but a look through the codebase would hopefully give you hints to most of those.

> Are you referring to the fake orm? Well, if you are satisfied with my above explanation, there would be no need for it, since we will be using django's orm.

Well, the "fake ORM" is exactly what you described above - models loaded and then cleared from the app cache. I'm not saying it's a bad thing - it beats what South had before (nothing) - but there could be alternatives.

> Well, you said it yourself above that "the models API in Django is not designed with models as moveable, dynamic objects". That is why I used a column-based approach. The advantage will be felt in live migrations. As for using Django fields for type information, I frankly cannot think of a major valid negative point for now; I will revert later today.

> > If you plan to continue using Django fields as type information (as South does), what potential issues do you see there?
> The only issue I can think of is the case of custom fields created by the user.

That's one big issue; one of South's biggest issues today is custom fields, though that's arguably more the serialisation side of them.
Still, I'd at least like to see how you would want something like, say, GeoDjango to fit in, even though this GSOC wouldn't cover it - it has a lot of custom creation code, and alteration types that differ from creation types (much like SERIAL in postgres, which you _will_ have to address) and room would have to be made for these kinds of problems. Andrew -- You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@googlegroups.com. To unsubscribe from this group, send email to django-developers+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
Re: [GSoC 2012] Schema Alteration API proposal
On Thursday, 5 April 2012 21:25:19 UTC+5:30, Andrew Godwin wrote:
> If you plan to continue using Django fields as type information (as South does), what potential issues do you see there?

The only issue I can think of is the case of custom fields created by the user.
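To make the custom-field concern concrete, here is a toy sketch. These classes are illustrative stand-ins, not Django's real field API (Django's Field.db_type() takes a connection object, not a vendor string): a built-in field maps to a well-known column type per backend, while a user-defined field can emit DDL that a generic alteration API has never seen and therefore cannot safely ALTER, index, or compare.

```python
# Illustrative sketch only -- simplified stand-ins for Django fields,
# not the real API.
class TextField:
    def db_type(self, vendor):
        # Built-in fields map to well-known column types per backend.
        return {"postgresql": "text", "mysql": "longtext", "sqlite": "text"}[vendor]


class PointField(TextField):
    # A hypothetical user-defined field: it can emit backend-specific
    # DDL that a generic alteration API has no rules for.
    def db_type(self, vendor):
        return "geometry(Point,4326)" if vendor == "postgresql" else None


print(TextField().db_type("mysql"))        # longtext
print(PointField().db_type("postgresql"))  # geometry(Point,4326)
```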
Re: [GSoC 2012] Schema Alteration API proposal
On Thursday, 5 April 2012 21:25:19 UTC+5:30, Andrew Godwin wrote:
> Just thought I'd chime in now I've had a chance to look over the current proposal (I looked at the current one you have in the GSOC system):
>
> - When you describe feeding things in from local_fields, are you referring to that being the method by which you're planning to implement things like syncdb?

Actually I am not planning to mess with syncdb and other management commands. I will only refactor django.db.backends creation functions like sql_create_model etc. to use the new API. Behaviour and functionality will be the same after the refactor, so management commands like syncdb will not notice a difference.

> - I'd like to see a bit more detail about how you plan to test the code - specifically, there are some backend-specific tests you may need, as well as some detailed introspection in order to make sure things have applied correctly.

Currently, I can only think of things like the unique index on SQLite and oddities in MySQL, mostly again from South's test suite. I will give another update before today's deadline.

> - Russ is correct about your models approach - as I've said before in other places, the models API in Django is not designed with models as moveable, dynamic objects.

I have taken care of clearing the app cache after migrations. Actually, the entire point of using these 'Django code' based tests is that I wanted to doubly ensure that Django will behave the way it's supposed to after the migrations. I could have gone with a SQL-only approach, e.g. 'SELECT table' after calling db.delete_table, but testing using Django code seemed a bit more comprehensive. Now, to mimic migrations, I needed to alter model definitions. The closest way to resemble an actual migration scenario seemed to be to change the definitions in models.py itself. File rename/rewrite is ugly and OS dependent; that's why I used a 'temporary setting' based approach.
I know that messing with the app cache looks a bit hackish, but I cannot think of anything else for now.

> South has one approach to these sorts of tests, but I'd love to see a cleaner suggestion.

Are you referring to the fake orm? Well, if you are satisfied with my above explanation, there would be no need for it, since we will be using django's orm.

> - There's been some discussion on south-users about the benefits of a column-based alteration API versus a field/model-based alteration API - why have you picked a column-based one? If you plan to continue using Django fields as type information (as South does), what potential issues do you see there?

Well, you said it yourself above that "the models API in Django is not designed with models as moveable, dynamic objects". That is why I used a column-based approach. The advantage will be felt in live migrations. As for using Django fields for type information, I frankly cannot think of a major valid negative point for now; I will revert later today.

> - Some more detail on your background would be nice - what's your specific experience with the 3 main databases you'll be handling (postgres, mysql, sqlite)? What was a "high voltage database migration"?

Sure. I will update it.

> Sorry for the late feedback, I've been far too busy.

No problem, as long as you reply to this before the deadline :D
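The "models loaded and then cleared from the app cache" idea can be sketched with a toy registry. Django's real app cache is internal and version-specific, so this only shows the shape of the approach, not actual Django calls:

```python
# Toy stand-in for Django's app cache: a registry mapping model names
# to classes. The real cache lives in internal modules and differs
# between Django versions.
app_cache = {}


def register(name, cls):
    app_cache[name] = cls


def clear_app_cache():
    # Forget every registered model so a redefinition can take effect.
    app_cache.clear()


class MyModelV1:
    fields = ["f1", "f2"]


register("MyModel", MyModelV1)

# Simulate a migration test: purge the cache, then re-register the
# post-migration definition of the "same" model.
clear_app_cache()


class MyModelV2:
    fields = ["f1"]


register("MyModel", MyModelV2)

print(app_cache["MyModel"].fields)  # ['f1']
```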
Re: [GSoC 2012] Schema Alteration API proposal
Just thought I'd chime in now I've had a chance to look over the current proposal (I looked at the current one you have in the GSOC system):

- When you describe feeding things in from local_fields, are you referring to that being the method by which you're planning to implement things like syncdb?
- I'd like to see a bit more detail about how you plan to test the code - specifically, there are some backend-specific tests you may need, as well as some detailed introspection in order to make sure things have applied correctly.
- Russ is correct about your models approach - as I've said before in other places, the models API in Django is not designed with models as moveable, dynamic objects. South has one approach to these sorts of tests, but I'd love to see a cleaner suggestion.
- There's been some discussion on south-users about the benefits of a column-based alteration API versus a field/model-based alteration API - why have you picked a column-based one? If you plan to continue using Django fields as type information (as South does), what potential issues do you see there?
- Some more detail on your background would be nice - what's your specific experience with the 3 main databases you'll be handling (postgres, mysql, sqlite)? What was a "high voltage database migration"?

Sorry for the late feedback, I've been far too busy.

Andrew
Re: [GSoC 2012] Schema Alteration API proposal
On 04/04/2012, at 11:50 PM, j4nu5 wrote:
> Hi Russell,
> Thanks for your immense patience :-)
>
> These are some additions to my proposal above, based on your inputs:
>
> Status of current 'creation' code in django:
> The current code, e.g. sql_create_model in django.db.backends.creation, is a mix of *inspection* part and *sql generation* part. Since the sql generation part will (should) now be handled by our new CRUD API, I will refactor django.db.backends.creation (and other backends' creation modules) to continue using their inspection part but using our new CRUD API for sql generation. The approach will be to get the fields using model._meta.local_fields and feed them to our new CRUD API. This will serve as a proof of concept for my API.

Hrm - not exactly ideal, but better than nothing I suppose. Ideally, there would actually be some migration task involved in your proof of concept.

> As for testing using Django code, my models will be something like:
>
> class UnchangedModel(models.Model):
>     eg = models.TextField()
>
> if BEFORE_MIGRATION:
>     class MyModel(models.Model):
>         f1 = models.TextField()
>         f2 = models.TextField()
> # Deletion of a field
> else:
>     class MyModel(models.Model):
>         f1 = models.TextField()
>
> The value of BEFORE_MIGRATION will be controlled by the testing code. A temporary environment variable can be used for this purpose.

Unless your plan also includes writing a lot of extra code to purge and repopulate the app cache, this approach won't work. Just changing a setting doesn't change the class that has already been parsed and processed.
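The point about already-parsed classes holds in plain Python, independent of Django: the `if` guarding a class definition runs exactly once, when the module is imported, so flipping the flag afterwards does not redefine the class. A minimal, runnable demonstration:

```python
import os

os.environ["BEFORE_MIGRATION"] = "1"

# The branch is evaluated once, at class-definition time -- exactly as
# it would be at module import.
if os.environ["BEFORE_MIGRATION"] == "1":
    class MyModel:
        fields = ["f1", "f2"]
else:
    class MyModel:
        fields = ["f1"]

# Changing the environment variable later does NOT re-run the `if`
# or redefine the class; the old definition is still in effect.
os.environ["BEFORE_MIGRATION"] = "0"
assert MyModel.fields == ["f1", "f2"]
```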
> Also a revised schedule:
>
> Bonding period before GSoC: Discussion on API design
> Week 1: Writing tests (using 2-part checks (checking the actual database and using Django models), as discussed above)
> Week 2: Developing the base migration API
> Week 3: Developing extensions and overrides for PostgreSQL
> Weeks 4-5: Developing extensions and overrides for MySQL
> Weeks 6-7: Developing extensions and overrides for SQLite (may be shorter or longer (by 0.5 week) depending on how much of xtrqt's code is considered acceptable)
> Weeks 8-10: Refactoring django.db.backends.creation (and the PostgreSQL, MySQL, SQLite creation modules) to use the new API for SQL generation (approach discussed above)
> Week 11: Writing documentation and leftover tests, if any
> Week 12: Buffer week for the unexpected

This looks a bit more convincing.

Yours,
Russ Magee %-)
Re: [GSoC 2012] Schema Alteration API proposal
Hi Russell,
Thanks for your immense patience :-)

These are some additions to my proposal above, based on your inputs:

Status of current 'creation' code in django:
The current code, e.g. sql_create_model in django.db.backends.creation, is a mix of *inspection* part and *sql generation* part. Since the sql generation part will (should) now be handled by our new CRUD API, I will refactor django.db.backends.creation (and other backends' creation modules) to continue using their inspection part but using our new CRUD API for sql generation. The approach will be to get the fields using model._meta.local_fields and feed them to our new CRUD API. This will serve as a proof of concept for my API.

As for testing using Django code, my models will be something like:

class UnchangedModel(models.Model):
    eg = models.TextField()

if BEFORE_MIGRATION:
    class MyModel(models.Model):
        f1 = models.TextField()
        f2 = models.TextField()
# Deletion of a field
else:
    class MyModel(models.Model):
        f1 = models.TextField()

The value of BEFORE_MIGRATION will be controlled by the testing code. A temporary environment variable can be used for this purpose.
Also a revised schedule:

Bonding period before GSoC: Discussion on API design
Week 1: Writing tests (using 2-part checks (checking the actual database and using Django models), as discussed above)
Week 2: Developing the base migration API
Week 3: Developing extensions and overrides for PostgreSQL
Weeks 4-5: Developing extensions and overrides for MySQL
Weeks 6-7: Developing extensions and overrides for SQLite (may be shorter or longer (by 0.5 week) depending on how much of xtrqt's code is considered acceptable)
Weeks 8-10: Refactoring django.db.backends.creation (and the PostgreSQL, MySQL, SQLite creation modules) to use the new API for SQL generation (approach discussed above)
Week 11: Writing documentation and leftover tests, if any
Week 12: Buffer week for the unexpected

On Tuesday, 3 April 2012 06:39:37 UTC+5:30, Russell Keith-Magee wrote:
> On 03/04/2012, at 5:06 AM, j4nu5 wrote:
> > Hi Russell,
> >
> > Thanks for the prompt reply.
> >
> > * You aren't ever going to eat your own dogfood. You're spending the GSoC building an API that is intended for use with schema migration, but you're explicitly not looking at any part of the migration process that would actually use that API. How will we know that the API you build is actually fit for the purpose it is intended? How do we know that the requirements of "step 2" of schema migration will be met by your API? I'd almost prefer to see more depth, and less breadth -- i.e., show me a fully functioning schema migration stack on just one database, rather than a fully functioning API on all databases that hasn't actually been shown to work in practice.
> >
> > 'Eating my own dogfood' to check whether my low level migration primitives are actually *usable*, I believe can be done by:
> > 1. Developing a working fork of South to use these primitives, as I mentioned in my project goals, or
> > 2. Aiming for less 'breadth' and more 'depth', as you suggested.
> >
> > I did not opt for 2, since creating the '2nd level' of the migration framework (the caller of the lower level API) is a huge beast by itself. Any reasonable solution will have to take care of 'Pythonic' as well as 'pseudo-SQL' migrations as discussed above. Not to mention taking care of versioning + dependency management + backwards migrations. I am against the development of a half baked and/or inconsistent 2nd level API layer. Trying to fully develop such a solution even for one database will exceed the GSoC timeline, in my humble opinion.
>
> Ok - there's two problems with what you've said here:
>
> 1) You don't make any reference in your schedule to implementing a "working fork of South". This isn't a trivial activity, so if you're planning on doing this, you should tell us how this is factored into your schedule.
>
> 2) You're making the assumption that you need to "fully develop" a solution. A proof of concept would be more than adequate. For example, in the 2010 GSoC, Alex Gaynor's project was split into two bits; a bunch of modifications to the core query engine, and a completely separate project, not intended for merging to trunk, that demonstrated that his core query changes would do what was necessary. You could take exactly the same approach here; don't try to deliver a fully functioning schema migration tool, just enough of a tool to demonstrate that your API is sufficient.
>
> > * It feels like there's a lot of padding in your schedule.
> >
> > - A week of discussion at the start
> > - 2 weeks for a "base" migration API
> > - 2.5 weeks to write documentation
> > - 2 "buffer" weeks
> >
> > Your pro
Re: [GSoC 2012] Schema Alteration API proposal
On 03/04/2012, at 5:06 AM, j4nu5 wrote:
> Hi Russell,
>
> Thanks for the prompt reply.
>
> * You aren't ever going to eat your own dogfood. You're spending the GSoC building an API that is intended for use with schema migration, but you're explicitly not looking at any part of the migration process that would actually use that API. How will we know that the API you build is actually fit for the purpose it is intended? How do we know that the requirements of "step 2" of schema migration will be met by your API? I'd almost prefer to see more depth, and less breadth -- i.e., show me a fully functioning schema migration stack on just one database, rather than a fully functioning API on all databases that hasn't actually been shown to work in practice.
>
> 'Eating my own dogfood' to check whether my low level migration primitives are actually *usable*, I believe can be done by:
> 1. Developing a working fork of South to use these primitives, as I mentioned in my project goals, or
> 2. Aiming for less 'breadth' and more 'depth', as you suggested.
>
> I did not opt for 2, since creating the '2nd level' of the migration framework (the caller of the lower level API) is a huge beast by itself. Any reasonable solution will have to take care of 'Pythonic' as well as 'pseudo-SQL' migrations as discussed above. Not to mention taking care of versioning + dependency management + backwards migrations. I am against the development of a half baked and/or inconsistent 2nd level API layer. Trying to fully develop such a solution even for one database will exceed the GSoC timeline, in my humble opinion.

Ok - there's two problems with what you've said here:

1) You don't make any reference in your schedule to implementing a "working fork of South". This isn't a trivial activity, so if you're planning on doing this, you should tell us how this is factored into your schedule.

2) You're making the assumption that you need to "fully develop" a solution. A proof of concept would be more than adequate. For example, in the 2010 GSoC, Alex Gaynor's project was split into two bits; a bunch of modifications to the core query engine, and a completely separate project, not intended for merging to trunk, that demonstrated that his core query changes would do what was necessary. You could take exactly the same approach here; don't try to deliver a fully functioning schema migration tool, just enough of a tool to demonstrate that your API is sufficient.

> * It feels like there's a lot of padding in your schedule.
>
> - A week of discussion at the start
> - 2 weeks for a "base" migration API
> - 2.5 weeks to write documentation
> - 2 "buffer" weeks
>
> Your project is proposing the development of a low level database API. While this should certainly be documented, if it's not going to be "user facing", the documentation requirements aren't as high. Also, because it's a low level database API, I'm not sure what common tools will exist -- yet your schedule estimates 1/6 of your overall time, and 1/3 of your active coding time, will be spent building these common tools. Having 1/6 of your project schedule as contingency is very generous; and you don't mention what you plan to look at if you don't have to use that contingency.
>
> I think the problem is that the 1st part - development of a lower level migrations API - is a little bit small for the GSoC timeline but the 2nd part - the caller of the API - is way big for GSoC. As I said, I did not want to create a half baked solution. That's why the explicit skipping of the 2nd level and thus the *padding*. I am still open for discussion and suggestions regarding this matter though.

So, to summarize: What you're telling us is that you know, a priori, that your project isn't 12 weeks of work. This doesn't give us a lot of incentive to pick up your proposal for the GSoC. We have an opportunity to get Google to pay for 12 weeks development. Given that we have that opportunity, why would we select a project that will only yield 6 weeks of output? The goal here isn't to pick a project, and then make it fit 12 weeks by any means necessary. It's to pick something that will actually be 12 weeks of work. A little contingency is fine, but if you start padding too much, your proposal isn't going to be taken seriously.

My suggestion -- work out some small aspect of part 2 that you *can* deliver. Not necessarily the whole thing, but a skeleton, and try to deliver a fully fleshed out part on that skeleton. If you're smart about it, this can also double as your dogfood requirement.

> * Your references to testing are a bit casual for my taste. From my experience, testing schema migration code is hard. Normal view code and utilities are easy to test -- you set up a test database, insert some data, and check functionality. However, schema migration code is explicitly about
Re: [GSoC 2012] Schema Alteration API proposal
Hi Kushagra,

On the whole, I think this proposal is looking fairly good. Your high-level explanation of the problem is solid, and you've given enough detail of the direction you intend to take the project that it gives me some confidence that you understand what you're proposing to do. I have a couple of small concerns:

* You aren't ever going to eat your own dogfood. You're spending the GSoC building an API that is intended for use with schema migration, but you're explicitly not looking at any part of the migration process that would actually use that API. How will we know that the API you build is actually fit for the purpose it is intended? How do we know that the requirements of "step 2" of schema migration will be met by your API? I'd almost prefer to see more depth, and less breadth -- i.e., show me a fully functioning schema migration stack on just one database, rather than a fully functioning API on all databases that hasn't actually been shown to work in practice.

* It feels like there's a lot of padding in your schedule.

- A week of discussion at the start
- 2 weeks for a "base" migration API
- 2.5 weeks to write documentation
- 2 "buffer" weeks

Your project is proposing the development of a low level database API. While this should certainly be documented, if it's not going to be "user facing", the documentation requirements aren't as high. Also, because it's a low level database API, I'm not sure what common tools will exist -- yet your schedule estimates 1/6 of your overall time, and 1/3 of your active coding time, will be spent building these common tools. Having 1/6 of your project schedule as contingency is very generous; and you don't mention what you plan to look at if you don't have to use that contingency.

* Your references to testing are a bit casual for my taste. From my experience, testing schema migration code is hard. Normal view code and utilities are easy to test -- you set up a test database, insert some data, and check functionality. However, schema migration code is explicitly about making database changes, so the things that Django normally considers "static" -- the database models -- are subject to change, and that isn't always an easy thing to accommodate. I'd be interested to see your thoughts on how you plan to test your API.

* Your proposal doesn't make any reference to the existing "migration-like" tasks in Django's codebase. For example, we already have code for creating tables and adding indices. How will your migration code use, modify or augment these existing capabilities?

Yours,
Russ Magee %-)

On 01/04/2012, at 5:02 PM, j4nu5 wrote:
> Less than a week remains for the student application deadline. Can someone please comment on the above revised proposal. Thanks a lot.
>
> On Monday, 26 March 2012 01:29:35 UTC+5:30, j4nu5 wrote:
> Here is a revised proposal.
>
> Abstract
> --
> A database migration helper has been one of the most long-standing feature requests in Django. Though Django has an excellent database creation helper, when faced with schema design changes, developers have to resort to either writing raw SQL and manually performing the migrations, or using third party apps like South[1] and Nashvegas[2].
>
> [1] http://south.aeracode.org/
> [2] https://github.com/paltman/nashvegas/
>
> Clearly Django will benefit from having a database migration helper as an integral part of its codebase.
>
> From the summary on the django-developers mailing list[3], the task of building a migrations framework will involve:
> 1. Add a db.backends module to provide an abstract interface to migration primitives (add column, add index, rename column, rename table, and so on).
> 2. Add a contrib app that performs the high level accounting of "has migration X been applied", and management commands to "apply all outstanding migrations"
> 3. Provide an API that allows end users to define raw-SQL migrations, or native Python migrations using the backend primitives.
> 4. Leave the hard task of determining dependencies, introspection of database models and so on to the toolset contributed by the broader community.
>
> [3] http://groups.google.com/group/django-developers/msg/cf379a4f353a37f8
>
> I would like to work on the 1st step as part of this year's GSoC.
>
> Implementation plan
> --
> The idea is to have a CRUD interface to the database schema (with some additional utility functions for indexing etc.) with functions like:
> * create_table
> * rename_table
> * delete_table
> * add_column
> and so on, which will have the *explicit* names of the table/column to be modified as parameters. It will be the responsibility of the higher level API caller (will not be undertaken as part of GSoC) to translate model/field names to ex
Re: [GSoC 2012] Schema Alteration API proposal
Less than a week remains for the student application deadline. Can someone please comment on the above revised proposal. Thanks a lot.

On Monday, 26 March 2012 01:29:35 UTC+5:30, j4nu5 wrote:
> Here is a revised proposal.
>
> Abstract
> --
> A database migration helper has been one of the most long-standing feature requests in Django. Though Django has an excellent database creation helper, when faced with schema design changes, developers have to resort to either writing raw SQL and manually performing the migrations, or using third party apps like South[1] and Nashvegas[2].
>
> [1] http://south.aeracode.org/
> [2] https://github.com/paltman/nashvegas/
>
> Clearly Django will benefit from having a database migration helper as an integral part of its codebase.
>
> From the summary on the django-developers mailing list[3], the task of building a migrations framework will involve:
> 1. Add a db.backends module to provide an abstract interface to migration primitives (add column, add index, rename column, rename table, and so on).
> 2. Add a contrib app that performs the high level accounting of "has migration X been applied", and management commands to "apply all outstanding migrations"
> 3. Provide an API that allows end users to define raw-SQL migrations, or native Python migrations using the backend primitives.
> 4. Leave the hard task of determining dependencies, introspection of database models and so on to the toolset contributed by the broader community.
>
> [3] http://groups.google.com/group/django-developers/msg/cf379a4f353a37f8
>
> I would like to work on the 1st step as part of this year's GSoC.
>
> Implementation plan
> --
> The idea is to have a CRUD interface to the database schema (with some additional utility functions for indexing etc.) with functions like:
> * create_table
> * rename_table
> * delete_table
> * add_column
> and so on, which will have the *explicit* names of the table/column to be modified as parameters. It will be the responsibility of the higher level API caller (will not be undertaken as part of GSoC) to translate model/field names to explicit table/column names. These functions will be directly responsible for modifying the schema, and any interaction with the database schema will take place by calling these functions. Most of these functions will come from South.
>
> These API functions will also have a "dry-run" or test mode, in which they will output a raw SQL representation of the migration or display errors if they occur. This will be useful in:
> 1. The MySQL backend. MySQL does not have transaction support for schema modification, and hence the migrations will be run in dry run mode first so that any errors can be captured before altering the schema.
> 2. The django-admin commands sql and sqlall that return the SQL (for creation and indexing) for an app. They will capture the SQL returned from the API running in dry run mode.
>
> As for the future of the current Django creation API, it will have to be refactored (not under GSoC) to make use of the 'create' part of our new CRUD interface, for consistency purposes.
>
> The GeoDjango backends will also have to be refactored to use the new API. Since they build upon the base code in db.backends, db.backends will have to be refactored first.
>
> Last year xtrqt had written, documented and tested code for at least the SQLite backend[4]. As per Andrew's suggestion, I would not be relying too much on that code but some parts can still be salvaged.
>
> [4] https://groups.google.com/forum/?fromgroups#!searchin/django-developers/xtrqt/django-developers/pSICNJBJRy8/Hl7frp-O-dMJ
>
> Schedule and Goal
> --
> Week 1: Discussion on API design and writing tests
> Week 2-3: Developing the base migration API
> Week 4: Developing extensions and overrides for PostgreSQL
> Week 5-6: Developing extensions and overrides for MySQL
> Week 7-8.5: Developing extensions and overrides for SQLite (may be shorter or longer (by 0.5 week) depending on how much of xtrqt's code is considered acceptable)
> Week 8.5-10: Writing documentation and leftover regression tests, if any
> Week 11-12: Buffer weeks for the unexpected
>
> I will consider my project to be successful when we have working, tested and documented migration primitives for Postgres, MySQL and SQLite. If we can develop a working fork of South to use these primitives, that will be a strong indicator of the project's success.
>
> About me and my inspiration for the project
Re: [GSoC 2012] Schema Alteration API proposal
Here is a revised proposal. Abstract -- A database migration helper has been one of the most long standing feature requests in Django. Though Django has an excellent database creation helper, when faced with schema design changes, developers have to resort to either writing raw SQL and manually performing the migrations, or using third party apps like South[1] and Nashvegas[2]. [1] http://south.aeracode.org/ [2] https://github.com/paltman/nashvegas/ Clearly Django will benefit from having a database migration helper as an integral part of its codebase. >From the summary on django-developers mailing list[3], the task of building a migrations framework will involve: 1. Add a db.backends module to provide an abstract interface to migration primitives (add column, add index, rename column, rename table, and so on). 2. Add a contrib app that performs the high level accounting of "has migration X been applied", and management commands to "apply all outstanding migrations" 3. Provide an API that allows end users to define raw-SQL migrations, or native Python migrations using the backend primitives. 4. Leave the hard task of determining dependencies, introspection of database models and so on to the toolset contributed by the broader community. [3] http://groups.google.com/group/django-developers/msg/cf379a4f353a37f8 I would like to work on the 1st step as part of this year's GSoC. Implementation plan -- The idea is to have a CRUD interface to database schema (with some additional utility functions for indexing etc.) with functions like: * create_table * rename_table * delete_table * add_column and so on, which will have the *explicit* names of the table/column to be modified as its parameter. It will be the responsibility of the higher level API caller (will not be undertaken as part of GSoC) to translate model/field names to explicit table/column names. 
These functions will be directly responsible for modifying the schema, and any interaction with the database schema will take place by calling these functions. Most of these functions will come from South.

These API functions will also have a "dry-run" or test mode, in which they will output a raw SQL representation of the migration, or display errors if they occur. This will be useful in:

1. The MySQL backend. MySQL does not have transaction support for schema modification, so migrations will be run in dry-run mode first, so that any errors can be caught before the schema is altered.
2. The django-admin commands sql and sqlall that return the SQL (for creation and indexing) for an app. They will capture the SQL returned from the API running in dry-run mode.

As for the future of the current Django creation API, it will have to be refactored (not under GSoC) to make use of the 'create' part of our new CRUD interface, for consistency. The GeoDjango backends will also have to be refactored to use the new API; since they build upon the base code in db.backends, db.backends will have to be refactored first.

Last year xtrqt wrote, documented and tested code for at least the SQLite backend[4]. As per Andrew's suggestion, I will not rely too heavily on that code, but some parts can still be salvaged.
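The dry-run behaviour described above could look something like the following sketch (MigrationRunner is a made-up name): in dry-run mode SQL is collected instead of executed, which is how the sql/sqlall commands could capture output, and how errors could be surfaced before MySQL, which cannot roll back DDL, alters anything.

```python
# Illustrative sketch only -- not the proposed API. A runner either
# executes statements against a cursor or, in dry-run mode, collects
# them for inspection without touching the schema.

class MigrationRunner:
    def __init__(self, cursor=None, dry_run=False):
        self.cursor = cursor
        self.dry_run = dry_run
        self.collected_sql = []

    def execute(self, sql):
        if self.dry_run:
            # Capture the SQL; nothing is sent to the database.
            self.collected_sql.append(sql)
        else:
            self.cursor.execute(sql)

runner = MigrationRunner(dry_run=True)
runner.execute('ALTER TABLE "person" ADD COLUMN "age" integer')
```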
[4] https://groups.google.com/forum/?fromgroups#!searchin/django-developers/xtrqt/django-developers/pSICNJBJRy8/Hl7frp-O-dMJ

Schedule and Goal
--
Week 1     : Discussion on API design and writing tests
Week 2-3   : Developing the base migration API
Week 4     : Developing extensions and overrides for PostgreSQL
Week 5-6   : Developing extensions and overrides for MySQL
Week 7-8.5 : Developing extensions and overrides for SQLite (may be shorter or longer (by 0.5 week) depending on how much of xtrqt's code is considered acceptable)
Week 8.5-10: Writing documentation and leftover regression tests, if any
Week 11-12 : Buffer weeks for the unexpected

I will consider my project to be successful when we have working, tested and documented migration primitives for Postgres, MySQL and SQLite. If we can develop a working fork of South to use these primitives, that will be a strong indicator of the project's success.

About me and my inspiration for the project
--
I am Kushagra Sinha, a pre-final year student at the Institute of Technology (about to be converted to an Indian Institute of Technology), Banaras Hindu University, Varanasi, India. I can be reached at:

Gmail: sinha.kushagra
Alternative email: kush [at] j4nu5 [dot] com
IRC: Nick j4nu5 on #django-dev and #django
Twitter: @j4nu5
github: j4nu5

I was happily using PHP for nearly all of my webdev work since my high school days (Cak
Re: [GSoC 2012] Schema Alteration API proposal
19.3.2012 13:15, Andrew Godwin wrote:
> On 19/03/12 11:08, Jonathan French wrote:
>> On 18 March 2012 23:33, Russell Keith-Magee <russ...@keith-magee.com> wrote:
>>>> 2. An inspection tool that generates the appropriate python code after
>>>>    inspecting models and current state of database.
>>> The current consensus is that this shouldn't be Django's domain -- at
>>> least, not in the first instance. It might be appropriate to expose an
>>> API to extract the current model state in a Pythonic form, but not a
>>> fully-fledged, user-accessible "tool".
> I would, however, definitely recommend not touching the Oracle or MSSQL
> backends - three is already a lot of work, and they're harder databases
> to get a hold of for testing.

Here I would like to raise my concern - especially as a long-time Django and Oracle user. =)

First of all, everyone can get their hands on the Oracle Express database free of charge, and standard Django stuff works in it very well. GeoDjango doesn't work with it. AFAIK MSSQL is not officially supported by Django, so leaving it untouched shouldn't be much of a problem.

Secondly, Django has in the past been very consistent in its support of four databases: SQLite, PostgreSQL, MySQL and Oracle - all supported as well as possible. I'm aware that doing migrations for all databases is a time-consuming challenge, given all the peculiarities of the different backends. So hopefully that consistency is kept even with new features like this.

And yes, the second thing is of course the GeoDjango part, which takes the complexity to a whole new level.

--
Jani Tiainen

--
You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@googlegroups.com. To unsubscribe from this group, send email to django-developers+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
Re: [GSoC 2012] Schema Alteration API proposal
On 19/03/12 11:08, Jonathan French wrote:
> On 18 March 2012 23:33, Russell Keith-Magee <russ...@keith-magee.com> wrote:
>>> 2. An inspection tool that generates the appropriate python code after
>>>    inspecting models and current state of database.
>> The current consensus is that this shouldn't be Django's domain -- at
>> least, not in the first instance. It might be appropriate to expose an
>> API to extract the current model state in a Pythonic form, but not a
>> fully-fledged, user-accessible "tool".
> Is there a writeup anywhere of why this is the consensus? AFAICT it looks
> like Django already provides half of this in the form of
> DatabaseIntrospection, which e.g. South actually uses, and which generates
> a model class from the current state of the database. Doing the diff as
> well doesn't seem like much of a stretch, and might make it more likely
> for third party custom fields to be made migrateable, if the interface
> for doing so is in Django core.

No writeup that I know of - however, the main part of the work here would be the "model differencing" code, which means creating a versioned ORM, being able to load and save model definitions to some kind of format, and the actual difference-creating code, which is all too much to stick into Django.

I've long maintained that I want South to become just that automatic differencing code, and to just move the actual database API across; this is mostly because I see there being scope for other kinds of migration systems apart from the kind South is (for example, a very declarative one, whose model states are retrieved using the combination of all migrations, rather than a lump on the bottom of the last one).

As for your proposal, Kushagra, Russ has said most of the points I would have thought of, and a few more - I'd recommend a good look through previous discussions on this mailing list for most of the current views on how we want the schema alteration API to work.
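At its simplest, the "model differencing" step described above amounts to comparing two snapshots of a model's columns and emitting abstract actions for a migration API to replay. A toy illustration (not South's actual code; the snapshot format here is a plain dict, purely an assumption):

```python
# Toy model-differencing sketch: diff two column snapshots and emit
# (action, ...) tuples. A real tool would serialise Django field
# instances rather than db-type strings.

old = {"name": "varchar(100)", "age": "integer"}
new = {"name": "varchar(200)", "email": "varchar(75)"}

actions = []
for col in sorted(set(old) - set(new)):
    actions.append(("drop_column", col))            # column removed
for col in sorted(set(new) - set(old)):
    actions.append(("add_column", col, new[col]))   # column added
for col in sorted(set(old) & set(new)):
    if old[col] != new[col]:
        actions.append(("alter_column", col, new[col]))  # type changed
```

The hard parts Andrew alludes to (versioned model state, serialisation, dependency ordering) are exactly what this toy version leaves out.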
I would, however, definitely recommend not touching the Oracle or MSSQL backends - three is already a lot of work, and they're harder databases to get a hold of for testing.

Andrew
Re: [GSoC 2012] Schema Alteration API proposal
On 18 March 2012 23:33, Russell Keith-Magee wrote:
>> 2. An inspection tool that generates the appropriate python code after
>>    inspecting models and current state of database.
>
> The current consensus is that this shouldn't be Django's domain -- at
> least, not in the first instance. It might be appropriate to expose an API
> to extract the current model state in a Pythonic form, but not a
> fully-fledged, user-accessible "tool".

Is there a writeup anywhere of why this is the consensus? AFAICT it looks like Django already provides half of this in the form of DatabaseIntrospection, which e.g. South actually uses, and which generates a model class from the current state of the database. Doing the diff as well doesn't seem like much of a stretch, and might make it more likely for third party custom fields to be made migrateable, if the interface for doing so is in Django core.

- ojno
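For readers unfamiliar with what DatabaseIntrospection recovers, here is a standalone sqlite3 illustration of the same kind of data (table names and column descriptions). Real Django code would go through connection.introspection rather than raw queries; this sketch is only an analogy.

```python
# Standalone illustration of database introspection: list tables and
# describe columns, roughly what Django's DatabaseIntrospection
# table_names()/get_table_description() methods return.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")

# Table names, via the sqlite_master catalog:
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]

# Column (name, declared type) pairs, via PRAGMA table_info:
columns = [(row[1], row[2])
           for row in conn.execute("PRAGMA table_info(person)")]
```

Generating a model class (as South's inspectdb-style code does) is then a matter of mapping these descriptions back onto field types.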
Re: [GSoC 2012] Schema Alteration API proposal
On 18/03/2012, at 7:38 PM, Kushagra Sinha wrote:
> Abstract
> --
> A database migration helper has been one of the most long standing feature
> requests in Django. Though Django has an excellent database creation helper,
> when faced with schema design changes, developers have to resort to either
> writing raw SQL and manually performing the migrations, or using third party
> apps like South[1] and Nashvegas[2].
>
> Clearly Django will benefit from having a database migration helper as an
> integral part of its codebase.
>
> From [3], the consensus seems to be on building a Ruby on Rails ActiveRecord
> Migrations[4] like framework, which will essentially emit python code after
> inspecting user models and current state of the database.

Check the edit dates on that wiki -- most of the content on that page is historical, reflecting discussions that were happening over 3 years ago. There have been many more recent discussions. The "current consensus" (at least, the consensus of what the core team is likely to accept) is better reflected by the GSoC project that was accepted, but not completed, last year. I posted to django-developers about this a week or so ago [1]; there were some follow-up conversations in that thread, too [2].

[1] http://groups.google.com/group/django-developers/msg/cf379a4f353a37f8
[2] http://groups.google.com/group/django-developers/msg/2f287e5e3dc9f459

> The python code generated will then be fed to a 'migrations API' that will
> actually handle the task of migration. This is the approach followed by
> South (as opposed to Nashvegas's approach of generating raw SQL migration
> files). This ensures modularity, one of the trademarks of Django.

I don't think you're going to be able to ignore raw SQL migrations quite that easily. Just like the ORM isn't able to express every query, there will be migrations that you can't express in any schema migration abstraction.
Raw SQL migrations will always need to be an option (even if they're feature limited).

> Third party developers can create their own inspection and ORM versioning
> tools, provided the inspection tool emits python code conforming to our new
> migrations API.
>
> To sum up, the complete migrations framework will need, at the highest level:
> 1. A migrations API that accepts python code and actually performs the
>    migrations.

This is certainly needed. I'm a little concerned by your phrasing of an "API that accepts python code", though. An API is something that Python code can invoke, not the other way around. We're looking for django.db.backends.migration as an analog of django.db.backends.creation, not a code-consuming utility library.

> 2. An inspection tool that generates the appropriate python code after
>    inspecting models and current state of database.

The current consensus is that this shouldn't be Django's domain -- at least, not in the first instance. It might be appropriate to expose an API to extract the current model state in a Pythonic form, but not a fully-fledged, user-accessible "tool".

> 3. A versioning tool to keep track of migrations. This will allow 'backward'
>    migrations.

If backward migrations are the only reason to have a versioning tool, then I'd argue you don't need versioning. However, that's not the only reason to have versioning, is it :-)

> South's syncdb:
> class Command(NoArgsCommand):
>     def handle_noargs(self, migrate_all=False, **options):

As a guide for the future -- large wads of code like this aren't very compelling as part of a proposal unless you're trying to demonstrate something specific. In this case, you're just duplicating some of South's internals -- "I'm going to take South's lead" is all you really needed to say.

> If migrations become a core part of Django, every user app will have a
> migrations folder (module) under it, created at the time of issuing
> django-admin.py startapp.
> Thus by modifying the startapp command to create a migrations module for
> every app it creates, we will be able to use South's syncdb code as is and
> will also save the user from issuing schemamigration --initial for all
> his/her apps.
>
> Now that we have a guaranteed migrations history for every user app, the
> migrate command will also be more or less a copy of South's migrate command.

What does this "history" look like? Are migrations named? Are they dated? Numbered? How do you handle dependencies? Ordering? Collisions between parallel development? *This* is the sort of thing a proposal should be elaborating.

> As much as I would have liked to use the Django creation API's code for
> creating and destroying models, we cannot. The reason for this is that
> Django's creation API uses its inspection tools to generate *SQL* which is
> then directly fed to cursor.execute. What we need is a migrations API which
> gobbles up *python* code generated by the inspection tool. Moreover deprecating
[GSoC 2012] Schema Alteration API proposal
Abstract
--
A database migration helper has been one of the most long standing feature requests in Django. Though Django has an excellent database creation helper, when faced with schema design changes, developers have to resort to either writing raw SQL and manually performing the migrations, or using third party apps like South[1] and Nashvegas[2].

Clearly Django will benefit from having a database migration helper as an integral part of its codebase.

From [3], the consensus seems to be on building a framework like Ruby on Rails' ActiveRecord Migrations[4], which will essentially emit python code after inspecting user models and the current state of the database. The python code generated will then be fed to a 'migrations API' that will actually handle the task of migration. This is the approach followed by South (as opposed to Nashvegas's approach of generating raw SQL migration files). This ensures modularity, one of the trademarks of Django. Third party developers can create their own inspection and ORM versioning tools, provided the inspection tool emits python code conforming to our new migrations API.

To sum up, the complete migrations framework will need, at the highest level:

1. A migrations API that accepts python code and actually performs the migrations.
2. An inspection tool that generates the appropriate python code after inspecting models and the current state of the database.
3. A versioning tool to keep track of migrations. This will allow 'backward' migrations.
4. Glue code to tie the above three together.

Implementation plan
--
Before discussing the implementation plan for the migrations framework, I would like to digress for a moment and discuss the final state of the migrations framework once it is implemented. For the user, syncing and migrating databases will consist of issuing the syncdb command and a new 'migrate' command. syncdb will have to be rewritten and a new migrate command will be written.
South's syncdb:

class Command(NoArgsCommand):
    def handle_noargs(self, migrate_all=False, **options):
        ...
        apps_needing_sync = []
        apps_migrated = []
        for app in models.get_apps():
            app_label = get_app_label(app)
            if migrate_all:
                apps_needing_sync.append(app_label)
            else:
                try:
                    migrations = migration.Migrations(app_label)
                except NoMigrations:
                    # It needs syncing
                    apps_needing_sync.append(app_label)
                else:
                    # This is a migrated app, leave it
                    apps_migrated.append(app_label)
        verbosity = int(options.get('verbosity', 0))
        # Run the original syncdb procedure for apps_needing_sync
        # If migrate is passed as a parameter, run the migrate command for the rest

The above code is from South's override of the syncdb command. It basically divides INSTALLED_APPS into apps that have a migration history, which will be handled by the migrations framework, and those that do not, which will be handled by Django's syncdb. South expects users to manually run a 'schemamigration --initial' command for every app they want handled by South's migration framework.

If migrations become a core part of Django, every user app will have a migrations folder (module) under it, created at the time of issuing django-admin.py startapp. Thus, by modifying the startapp command to create a migrations module for every app it creates, we will be able to use South's syncdb code as is, and will also save the user from issuing schemamigration --initial for all his/her apps.

Now that we have a guaranteed migrations history for every user app, the migrate command will also be more or less a copy of South's migrate command.

Coming back to the migrations API, there are three fundamental operations that can be performed during a migration:

1. Creation of a new model.
2. Alteration of an existing model.
3. Deletion of an existing model.

As much as I would have liked to use the Django creation API's code for creating and destroying models, we cannot.
The reason for this is that Django's creation API uses its inspection tools to generate *SQL*, which is then directly fed to cursor.execute. What we need is a migrations API which gobbles up *python* code generated by the inspection tool. Moreover, deprecating/removing Django's creation API in favour of the new migrations API everywhere would give rise to performance issues, since time would be wasted generating python code and then converting that python to SQL for Django's core apps, which will never have migrations anyway. The creation API and the code that depends on it (syncdb, sql, django.test.simple and django.contrib.gis.db.backends) will be left as is.

Therefore much of the code for our new migra