Re: [GSoC 2012] Schema Alteration API proposal
On 06/04/12 06:34, j4nu5 wrote:
> Actually I am not planning to mess with syncdb and other management commands. I will only refactor django.db.backends creation functions like sql_create_model etc. to use the new API. Behaviour and functionality will be the same after the refactor, so management commands like syncdb will not notice a difference.

Alright, that's at least going to leave things in a good working state, then.

> Currently, I can only think of things like the unique index on SQLite and oddities in MySQL, mostly again from South's test suite. I will give another update before today's deadline.

There's a few other ones that South handles - like booleans in SQLite - but a look through the codebase would hopefully give you hints to most of those.

> Are you referring to the fake orm? Well, if you are satisfied with my above explanation, there would be no need for it, since we will be using django's orm.

Well, the "fake ORM" is exactly what you described above - models loaded and then cleared from the app cache. I'm not saying it's a bad thing - it beats what South had before (nothing) - but there could be alternatives.

> Well, you said it yourself above that "the models API in Django is not designed with models as moveable, dynamic objects". That is why I used a column-based approach. The advantage will be felt in live migrations. As for using Django fields for type information, I frankly cannot think of a major valid negative point for now; I will revert later today.

> > If you plan to continue using Django fields as type information (as South does), what potential issues do you see there?
> The only issue I can think of is the case of custom fields created by the user.

That's one big issue; one of South's biggest issues today is custom fields, though that's arguably more the serialisation side of them.
Still, I'd at least like to see how you would want something like, say, GeoDjango to fit in, even though this GSOC wouldn't cover it - it has a lot of custom creation code, and alteration types that differ from creation types (much like SERIAL in postgres, which you _will_ have to address) and room would have to be made for these kinds of problems. Andrew -- You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@googlegroups.com. To unsubscribe from this group, send email to django-developers+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
Re: [GSoC 2012] Schema Alteration API proposal
On Thursday, 5 April 2012 21:25:19 UTC+5:30, Andrew Godwin wrote:
> If you plan to continue using Django fields as type information (as South does), what potential issues do you see there?

The only issue I can think of is the case of custom fields created by the user.
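To make the custom-field concern concrete, here is a toy sketch. These classes are illustrative stand-ins, not Django's real field API (Django's Field.db_type() takes a connection object, not a vendor string): a built-in field maps to a well-known column type per backend, while a user-defined field can emit DDL that a generic alteration API has never seen and therefore cannot safely ALTER, index, or compare.

```python
# Illustrative sketch only -- simplified stand-ins for Django fields,
# not the real API.
class TextField:
    def db_type(self, vendor):
        # Built-in fields map to well-known column types per backend.
        return {"postgresql": "text", "mysql": "longtext", "sqlite": "text"}[vendor]


class PointField(TextField):
    # A hypothetical user-defined field: it can emit backend-specific
    # DDL that a generic alteration API has no rules for.
    def db_type(self, vendor):
        return "geometry(Point,4326)" if vendor == "postgresql" else None


print(TextField().db_type("mysql"))        # longtext
print(PointField().db_type("postgresql"))  # geometry(Point,4326)
```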
Re: [GSoC 2012] Schema Alteration API proposal
On Thursday, 5 April 2012 21:25:19 UTC+5:30, Andrew Godwin wrote:
> Just thought I'd chime in now I've had a chance to look over the current proposal (I looked at the current one you have in the GSOC system):
>
> - When you describe feeding things in from local_fields, are you referring to that being the method by which you're planning to implement things like syncdb?

Actually I am not planning to mess with syncdb and other management commands. I will only refactor django.db.backends creation functions like sql_create_model etc. to use the new API. Behaviour and functionality will be the same after the refactor, so management commands like syncdb will not notice a difference.

> - I'd like to see a bit more detail about how you plan to test the code - specifically, there are some backend-specific tests you may need, as well as some detailed introspection in order to make sure things have applied correctly.

Currently, I can only think of things like the unique index on SQLite and oddities in MySQL, mostly again from South's test suite. I will give another update before today's deadline.

> - Russ is correct about your models approach - as I've said before in other places, the models API in Django is not designed with models as moveable, dynamic objects.

I have taken care of clearing the app cache after migrations. Actually, the entire point of using these 'Django code' based tests is that I wanted to doubly ensure that Django will behave the way it's supposed to after the migrations. I could have gone with a SQL-only approach, e.g. 'SELECT table' after calling db.delete_table, but testing using Django code seemed a bit more comprehensive. Now, to mimic migrations, I needed to alter model definitions. The closest way to resemble an actual migration scenario seemed to be to change the definitions in models.py itself. File rename/rewrite is ugly and OS dependent; that's why I used a 'temporary setting' based approach.
I know that messing with the app cache looks a bit hackish, but I cannot think of anything else for now.

> South has one approach to these sorts of tests, but I'd love to see a cleaner suggestion.

Are you referring to the fake orm? Well, if you are satisfied with my above explanation, there would be no need for it, since we will be using django's orm.

> - There's been some discussion on south-users about the benefits of a column-based alteration API versus a field/model-based alteration API - why have you picked a column-based one? If you plan to continue using Django fields as type information (as South does), what potential issues do you see there?

Well, you said it yourself above that "the models API in Django is not designed with models as moveable, dynamic objects". That is why I used a column-based approach. The advantage will be felt in live migrations. As for using Django fields for type information, I frankly cannot think of a major valid negative point for now; I will revert later today.

> - Some more detail on your background would be nice - what's your specific experience with the 3 main databases you'll be handling (postgres, mysql, sqlite)? What was a "high voltage database migration"?

Sure. I will update it.

> Sorry for the late feedback, I've been far too busy.

No problem, as long as you reply to this before the deadline :D
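The "models loaded and then cleared from the app cache" idea can be sketched with a toy registry. Django's real app cache is internal and version-specific, so this only shows the shape of the approach, not actual Django calls:

```python
# Toy stand-in for Django's app cache: a registry mapping model names
# to classes. The real cache lives in internal modules and differs
# between Django versions.
app_cache = {}


def register(name, cls):
    app_cache[name] = cls


def clear_app_cache():
    # Forget every registered model so a redefinition can take effect.
    app_cache.clear()


class MyModelV1:
    fields = ["f1", "f2"]


register("MyModel", MyModelV1)

# Simulate a migration test: purge the cache, then re-register the
# post-migration definition of the "same" model.
clear_app_cache()


class MyModelV2:
    fields = ["f1"]


register("MyModel", MyModelV2)

print(app_cache["MyModel"].fields)  # ['f1']
```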
Re: [GSoC 2012] Schema Alteration API proposal
Just thought I'd chime in now I've had a chance to look over the current proposal (I looked at the current one you have in the GSOC system):

- When you describe feeding things in from local_fields, are you referring to that being the method by which you're planning to implement things like syncdb?
- I'd like to see a bit more detail about how you plan to test the code - specifically, there are some backend-specific tests you may need, as well as some detailed introspection in order to make sure things have applied correctly.
- Russ is correct about your models approach - as I've said before in other places, the models API in Django is not designed with models as moveable, dynamic objects. South has one approach to these sorts of tests, but I'd love to see a cleaner suggestion.
- There's been some discussion on south-users about the benefits of a column-based alteration API versus a field/model-based alteration API - why have you picked a column-based one? If you plan to continue using Django fields as type information (as South does), what potential issues do you see there?
- Some more detail on your background would be nice - what's your specific experience with the 3 main databases you'll be handling (postgres, mysql, sqlite)? What was a "high voltage database migration"?

Sorry for the late feedback, I've been far too busy.

Andrew
Re: [GSoC 2012] Schema Alteration API proposal
On 04/04/2012, at 11:50 PM, j4nu5 wrote:
> Hi Russell,
> Thanks for your immense patience :-)
>
> These are some additions to my proposal above, based on your inputs:
>
> Status of current 'creation' code in django:
> The current code, e.g. sql_create_model in django.db.backends.creation, is a mix of *inspection* part and *sql generation* part. Since the sql generation part will (should) now be handled by our new CRUD API, I will refactor django.db.backends.creation (and other backends' creation modules) to continue using their inspection part but using our new CRUD API for sql generation. The approach will be to get the fields using model._meta.local_fields and feed them to our new CRUD API. This will serve as a proof of concept for my API.

Hrm - not exactly ideal, but better than nothing I suppose. Ideally, there would actually be some migration task involved in your proof of concept.

> As for testing using Django code, my models will be something like:
>
> class UnchangedModel(models.Model):
>     eg = models.TextField()
>
> if BEFORE_MIGRATION:
>     class MyModel(models.Model):
>         f1 = models.TextField()
>         f2 = models.TextField()
> # Deletion of a field
> else:
>     class MyModel(models.Model):
>         f1 = models.TextField()
>
> The value of BEFORE_MIGRATION will be controlled by the testing code. A temporary environment variable can be used for this purpose.

Unless your plan also includes writing a lot of extra code to purge and repopulate the app cache, this approach won't work. Just changing a setting doesn't change the class that has already been parsed and processed.
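The point about already-parsed classes holds in plain Python, independent of Django: the `if` guarding a class definition runs exactly once, when the module is imported, so flipping the flag afterwards does not redefine the class. A minimal, runnable demonstration:

```python
import os

os.environ["BEFORE_MIGRATION"] = "1"

# The branch is evaluated once, at class-definition time -- exactly as
# it would be at module import.
if os.environ["BEFORE_MIGRATION"] == "1":
    class MyModel:
        fields = ["f1", "f2"]
else:
    class MyModel:
        fields = ["f1"]

# Changing the environment variable later does NOT re-run the `if`
# or redefine the class; the old definition is still in effect.
os.environ["BEFORE_MIGRATION"] = "0"
assert MyModel.fields == ["f1", "f2"]
```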
> Also a revised schedule:
>
> Bonding period before GSoC: Discussion on API design
> Week 1: Writing tests (using 2-part checks (checking the actual database and using Django models), as discussed above)
> Week 2: Developing the base migration API
> Week 3: Developing extensions and overrides for PostgreSQL
> Weeks 4-5: Developing extensions and overrides for MySQL
> Weeks 6-7: Developing extensions and overrides for SQLite (may be shorter or longer (by 0.5 week) depending on how much of xtrqt's code is considered acceptable)
> Weeks 8-10: Refactoring django.db.backends.creation (and the PostgreSQL, MySQL, SQLite creation modules) to use the new API for SQL generation (approach discussed above)
> Week 11: Writing documentation and leftover tests, if any
> Week 12: Buffer week for the unexpected

This looks a bit more convincing.

Yours,
Russ Magee %-)
Re: [GSoC 2012] Schema Alteration API proposal
Hi Russell,
Thanks for your immense patience :-)

These are some additions to my proposal above, based on your inputs:

Status of current 'creation' code in django:
The current code, e.g. sql_create_model in django.db.backends.creation, is a mix of *inspection* part and *sql generation* part. Since the sql generation part will (should) now be handled by our new CRUD API, I will refactor django.db.backends.creation (and other backends' creation modules) to continue using their inspection part but using our new CRUD API for sql generation. The approach will be to get the fields using model._meta.local_fields and feed them to our new CRUD API. This will serve as a proof of concept for my API.

As for testing using Django code, my models will be something like:

class UnchangedModel(models.Model):
    eg = models.TextField()

if BEFORE_MIGRATION:
    class MyModel(models.Model):
        f1 = models.TextField()
        f2 = models.TextField()
# Deletion of a field
else:
    class MyModel(models.Model):
        f1 = models.TextField()

The value of BEFORE_MIGRATION will be controlled by the testing code. A temporary environment variable can be used for this purpose.
Also a revised schedule:

Bonding period before GSoC: Discussion on API design
Week 1: Writing tests (using 2-part checks (checking the actual database and using Django models), as discussed above)
Week 2: Developing the base migration API
Week 3: Developing extensions and overrides for PostgreSQL
Weeks 4-5: Developing extensions and overrides for MySQL
Weeks 6-7: Developing extensions and overrides for SQLite (may be shorter or longer (by 0.5 week) depending on how much of xtrqt's code is considered acceptable)
Weeks 8-10: Refactoring django.db.backends.creation (and the PostgreSQL, MySQL, SQLite creation modules) to use the new API for SQL generation (approach discussed above)
Week 11: Writing documentation and leftover tests, if any
Week 12: Buffer week for the unexpected

On Tuesday, 3 April 2012 06:39:37 UTC+5:30, Russell Keith-Magee wrote:
> On 03/04/2012, at 5:06 AM, j4nu5 wrote:
> > Hi Russell,
> >
> > Thanks for the prompt reply.
> >
> > * You aren't ever going to eat your own dogfood. You're spending the GSoC building an API that is intended for use with schema migration, but you're explicitly not looking at any part of the migration process that would actually use that API. How will we know that the API you build is actually fit for the purpose it is intended? How do we know that the requirements of "step 2" of schema migration will be met by your API? I'd almost prefer to see more depth, and less breadth -- i.e., show me a fully functioning schema migration stack on just one database, rather than a fully functioning API on all databases that hasn't actually been shown to work in practice.
> >
> > 'Eating my own dogfood' to check whether my low level migration primitives are actually *usable*, I believe can be done by:
> > 1. Developing a working fork of South to use these primitives, as I mentioned in my project goals, or
> > 2. Aiming for less 'breadth' and more 'depth', as you suggested.
> >
> > I did not opt for 2, since creating the '2nd level' of the migration framework (the caller of the lower level API) is a huge beast by itself. Any reasonable solution will have to take care of 'Pythonic' as well as 'pseudo-SQL' migrations as discussed above. Not to mention taking care of versioning + dependency management + backwards migrations. I am against the development of a half baked and/or inconsistent 2nd level API layer. Trying to fully develop such a solution even for one database will exceed the GSoC timeline, in my humble opinion.
>
> Ok - there's two problems with what you've said here:
>
> 1) You don't make any reference in your schedule to implementing a "working fork of South". This isn't a trivial activity, so if you're planning on doing this, you should tell us how this is factored into your schedule.
>
> 2) You're making the assumption that you need to "fully develop" a solution. A proof of concept would be more than adequate. For example, in the 2010 GSoC, Alex Gaynor's project was split into two bits; a bunch of modifications to the core query engine, and a completely separate project, not intended for merging to trunk, that demonstrated that his core query changes would do what was necessary. You could take exactly the same approach here; don't try to deliver a fully functioning schema migration tool, just enough of a tool to demonstrate that your API is sufficient.
>
> > * It feels like there's a lot of padding in your schedule.
> >
> > - A week of discussion at the start
> > - 2 weeks for a "base" migration API
> > - 2.5 weeks to write documentation
> > - 2 "buffer" weeks
> >
> > Your pro
Re: [GSoC 2012] Schema Alteration API proposal
On 03/04/2012, at 5:06 AM, j4nu5 wrote:
> Hi Russell,
>
> Thanks for the prompt reply.
>
> * You aren't ever going to eat your own dogfood. You're spending the GSoC building an API that is intended for use with schema migration, but you're explicitly not looking at any part of the migration process that would actually use that API. How will we know that the API you build is actually fit for the purpose it is intended? How do we know that the requirements of "step 2" of schema migration will be met by your API? I'd almost prefer to see more depth, and less breadth -- i.e., show me a fully functioning schema migration stack on just one database, rather than a fully functioning API on all databases that hasn't actually been shown to work in practice.
>
> 'Eating my own dogfood' to check whether my low level migration primitives are actually *usable*, I believe can be done by:
> 1. Developing a working fork of South to use these primitives, as I mentioned in my project goals, or
> 2. Aiming for less 'breadth' and more 'depth', as you suggested.
>
> I did not opt for 2, since creating the '2nd level' of the migration framework (the caller of the lower level API) is a huge beast by itself. Any reasonable solution will have to take care of 'Pythonic' as well as 'pseudo-SQL' migrations as discussed above. Not to mention taking care of versioning + dependency management + backwards migrations. I am against the development of a half baked and/or inconsistent 2nd level API layer. Trying to fully develop such a solution even for one database will exceed the GSoC timeline, in my humble opinion.

Ok - there's two problems with what you've said here:

1) You don't make any reference in your schedule to implementing a "working fork of South". This isn't a trivial activity, so if you're planning on doing this, you should tell us how this is factored into your schedule.

2) You're making the assumption that you need to "fully develop" a solution. A proof of concept would be more than adequate. For example, in the 2010 GSoC, Alex Gaynor's project was split into two bits; a bunch of modifications to the core query engine, and a completely separate project, not intended for merging to trunk, that demonstrated that his core query changes would do what was necessary. You could take exactly the same approach here; don't try to deliver a fully functioning schema migration tool, just enough of a tool to demonstrate that your API is sufficient.

> * It feels like there's a lot of padding in your schedule.
>
> - A week of discussion at the start
> - 2 weeks for a "base" migration API
> - 2.5 weeks to write documentation
> - 2 "buffer" weeks
>
> Your project is proposing the development of a low level database API. While this should certainly be documented, if it's not going to be "user facing", the documentation requirements aren't as high. Also, because it's a low level database API, I'm not sure what common tools will exist -- yet your schedule estimates 1/6 of your overall time, and 1/3 of your active coding time, will be spent building these common tools. Having 1/6 of your project schedule as contingency is very generous; and you don't mention what you plan to look at if you don't have to use that contingency.
>
> I think the problem is that the 1st part - development of a lower level migrations API - is a little bit small for the GSoC timeline but the 2nd part - the caller of the API - is way big for GSoC. As I said, I did not want to create a half baked solution. That's why the explicit skipping of the 2nd level and thus the *padding*. I am still open for discussion and suggestions regarding this matter though.

So, to summarize: What you're telling us is that you know, a priori, that your project isn't 12 weeks of work. This doesn't give us a lot of incentive to pick up your proposal for the GSoC. We have an opportunity to get Google to pay for 12 weeks development. Given that we have that opportunity, why would we select a project that will only yield 6 weeks of output? The goal here isn't to pick a project, and then make it fit 12 weeks by any means necessary. It's to pick something that will actually be 12 weeks of work. A little contingency is fine, but if you start padding too much, your proposal isn't going to be taken seriously.

My suggestion -- work out some small aspect of part 2 that you *can* deliver. Not necessarily the whole thing, but a skeleton, and try to deliver a fully fleshed out part on that skeleton. If you're smart about it, this can also double as your dogfood requirement.

> * Your references to testing are a bit casual for my taste. From my experience, testing schema migration code is hard. Normal view code and utilities are easy to test -- you set up a test database, insert some data, and check functionality. However, schema migration code is explicitly about
Re: [GSoC 2012] Schema Alteration API proposal
Hi Kushagra,

On the whole, I think this proposal is looking fairly good. Your high-level explanation of the problem is solid, and you've given enough detail of the direction you intend to take the project that it gives me some confidence that you understand what you're proposing to do. I have a couple of small concerns:

* You aren't ever going to eat your own dogfood. You're spending the GSoC building an API that is intended for use with schema migration, but you're explicitly not looking at any part of the migration process that would actually use that API. How will we know that the API you build is actually fit for the purpose it is intended? How do we know that the requirements of "step 2" of schema migration will be met by your API? I'd almost prefer to see more depth, and less breadth -- i.e., show me a fully functioning schema migration stack on just one database, rather than a fully functioning API on all databases that hasn't actually been shown to work in practice.

* It feels like there's a lot of padding in your schedule.

- A week of discussion at the start
- 2 weeks for a "base" migration API
- 2.5 weeks to write documentation
- 2 "buffer" weeks

Your project is proposing the development of a low level database API. While this should certainly be documented, if it's not going to be "user facing", the documentation requirements aren't as high. Also, because it's a low level database API, I'm not sure what common tools will exist -- yet your schedule estimates 1/6 of your overall time, and 1/3 of your active coding time, will be spent building these common tools. Having 1/6 of your project schedule as contingency is very generous; and you don't mention what you plan to look at if you don't have to use that contingency.

* Your references to testing are a bit casual for my taste. From my experience, testing schema migration code is hard. Normal view code and utilities are easy to test -- you set up a test database, insert some data, and check functionality. However, schema migration code is explicitly about making database changes, so the things that Django normally considers "static" -- the database models -- are subject to change, and that isn't always an easy thing to accommodate. I'd be interested to see your thoughts on how you plan to test your API.

* Your proposal doesn't make any reference to the existing "migration-like" tasks in Django's codebase. For example, we already have code for creating tables and adding indices. How will your migration code use, modify or augment these existing capabilities?

Yours,
Russ Magee %-)

On 01/04/2012, at 5:02 PM, j4nu5 wrote:
> Less than a week remains for the student application deadline. Can someone please comment on the above revised proposal. Thanks a lot.
>
> On Monday, 26 March 2012 01:29:35 UTC+5:30, j4nu5 wrote:
> Here is a revised proposal.
>
> Abstract
> --
> A database migration helper has been one of the most long-standing feature requests in Django. Though Django has an excellent database creation helper, when faced with schema design changes, developers have to resort to either writing raw SQL and manually performing the migrations, or using third party apps like South[1] and Nashvegas[2].
>
> [1] http://south.aeracode.org/
> [2] https://github.com/paltman/nashvegas/
>
> Clearly Django will benefit from having a database migration helper as an integral part of its codebase.
>
> From the summary on the django-developers mailing list[3], the task of building a migrations framework will involve:
> 1. Add a db.backends module to provide an abstract interface to migration primitives (add column, add index, rename column, rename table, and so on).
> 2. Add a contrib app that performs the high level accounting of "has migration X been applied", and management commands to "apply all outstanding migrations"
> 3. Provide an API that allows end users to define raw-SQL migrations, or native Python migrations using the backend primitives.
> 4. Leave the hard task of determining dependencies, introspection of database models and so on to the toolset contributed by the broader community.
>
> [3] http://groups.google.com/group/django-developers/msg/cf379a4f353a37f8
>
> I would like to work on the 1st step as part of this year's GSoC.
>
> Implementation plan
> --
> The idea is to have a CRUD interface to the database schema (with some additional utility functions for indexing etc.) with functions like:
> * create_table
> * rename_table
> * delete_table
> * add_column
> and so on, which will have the *explicit* names of the table/column to be modified as parameters. It will be the responsibility of the higher level API caller (will not be undertaken as part of GSoC) to translate model/field names to ex
Re: [GSoC 2012] Schema Alteration API proposal
Less than a week remains for the student application deadline. Can someone please comment on the above revised proposal. Thanks a lot.

On Monday, 26 March 2012 01:29:35 UTC+5:30, j4nu5 wrote:
> Here is a revised proposal.
>
> Abstract
> --
> A database migration helper has been one of the most long-standing feature requests in Django. Though Django has an excellent database creation helper, when faced with schema design changes, developers have to resort to either writing raw SQL and manually performing the migrations, or using third party apps like South[1] and Nashvegas[2].
>
> [1] http://south.aeracode.org/
> [2] https://github.com/paltman/nashvegas/
>
> Clearly Django will benefit from having a database migration helper as an integral part of its codebase.
>
> From the summary on the django-developers mailing list[3], the task of building a migrations framework will involve:
> 1. Add a db.backends module to provide an abstract interface to migration primitives (add column, add index, rename column, rename table, and so on).
> 2. Add a contrib app that performs the high level accounting of "has migration X been applied", and management commands to "apply all outstanding migrations"
> 3. Provide an API that allows end users to define raw-SQL migrations, or native Python migrations using the backend primitives.
> 4. Leave the hard task of determining dependencies, introspection of database models and so on to the toolset contributed by the broader community.
>
> [3] http://groups.google.com/group/django-developers/msg/cf379a4f353a37f8
>
> I would like to work on the 1st step as part of this year's GSoC.
>
> Implementation plan
> --
> The idea is to have a CRUD interface to the database schema (with some additional utility functions for indexing etc.) with functions like:
> * create_table
> * rename_table
> * delete_table
> * add_column
> and so on, which will have the *explicit* names of the table/column to be modified as parameters. It will be the responsibility of the higher level API caller (will not be undertaken as part of GSoC) to translate model/field names to explicit table/column names. These functions will be directly responsible for modifying the schema, and any interaction with the database schema will take place by calling these functions. Most of these functions will come from South.
>
> These API functions will also have a "dry-run" or test mode, in which they will output a raw SQL representation of the migration or display errors if they occur. This will be useful in:
> 1. The MySQL backend. MySQL does not have transaction support for schema modification, and hence the migrations will be run in dry run mode first so that any errors can be captured before altering the schema.
> 2. The django-admin commands sql and sqlall that return the SQL (for creation and indexing) for an app. They will capture the SQL returned from the API running in dry run mode.
>
> As for the future of the current Django creation API, it will have to be refactored (not under GSoC) to make use of the 'create' part of our new CRUD interface, for consistency purposes.
>
> The GeoDjango backends will also have to be refactored to use the new API. Since they build upon the base code in db.backends, db.backends will have to be refactored first.
>
> Last year xtrqt had written, documented and tested code for at least the SQLite backend[4]. As per Andrew's suggestion, I would not be relying too much on that code but some parts can still be salvaged.
>
> [4] https://groups.google.com/forum/?fromgroups#!searchin/django-developers/xtrqt/django-developers/pSICNJBJRy8/Hl7frp-O-dMJ
>
> Schedule and Goal
> --
> Week 1: Discussion on API design and writing tests
> Week 2-3: Developing the base migration API
> Week 4: Developing extensions and overrides for PostgreSQL
> Week 5-6: Developing extensions and overrides for MySQL
> Week 7-8.5: Developing extensions and overrides for SQLite (may be shorter or longer (by 0.5 week) depending on how much of xtrqt's code is considered acceptable)
> Week 8.5-10: Writing documentation and leftover regression tests, if any
> Week 11-12: Buffer weeks for the unexpected
>
> I will consider my project to be successful when we have working, tested and documented migration primitives for Postgres, MySQL and SQLite. If we can develop a working fork of South to use these primitives, that will be a strong indicator of the project's success.
>
> About me and my inspiration for the project
Re: [GSoC 2012] Schema Alteration API proposal
Here is a revised proposal. Abstract -- A database migration helper has been one of the most long standing feature requests in Django. Though Django has an excellent database creation helper, when faced with schema design changes, developers have to resort to either writing raw SQL and manually performing the migrations, or using third party apps like South[1] and Nashvegas[2]. [1] http://south.aeracode.org/ [2] https://github.com/paltman/nashvegas/ Clearly Django will benefit from having a database migration helper as an integral part of its codebase. >From the summary on django-developers mailing list[3], the task of building a migrations framework will involve: 1. Add a db.backends module to provide an abstract interface to migration primitives (add column, add index, rename column, rename table, and so on). 2. Add a contrib app that performs the high level accounting of "has migration X been applied", and management commands to "apply all outstanding migrations" 3. Provide an API that allows end users to define raw-SQL migrations, or native Python migrations using the backend primitives. 4. Leave the hard task of determining dependencies, introspection of database models and so on to the toolset contributed by the broader community. [3] http://groups.google.com/group/django-developers/msg/cf379a4f353a37f8 I would like to work on the 1st step as part of this year's GSoC. Implementation plan -- The idea is to have a CRUD interface to database schema (with some additional utility functions for indexing etc.) with functions like: * create_table * rename_table * delete_table * add_column and so on, which will have the *explicit* names of the table/column to be modified as its parameter. It will be the responsibility of the higher level API caller (will not be undertaken as part of GSoC) to translate model/field names to explicit table/column names. 
These functions will be directly responsible for modifying the schema, and any interaction with the database schema will take place by calling these functions. Most of these functions will come from South.

These API functions will also have a "dry-run" or test mode, in which they will output a raw SQL representation of the migration, or display errors if they occur. This will be useful in:

1. The MySQL backend. MySQL does not have transaction support for schema modification, so migrations will be run in dry-run mode first, so that any errors can be caught before the schema is altered.
2. The django-admin commands sql and sqlall that return the SQL (for creation and indexing) for an app. They will capture the SQL returned from the API running in dry-run mode.

As for the future of the current Django creation API, it will have to be refactored (not under GSoC) to make use of the 'create' part of our new CRUD interface, for consistency. The GeoDjango backends will also have to be refactored to use the new API; since they build upon the base code in db.backends, db.backends will have to be refactored first.

Last year xtrqt wrote, documented and tested code for at least the SQLite backend[4]. As per Andrew's suggestion, I will not rely too heavily on that code, but some parts can still be salvaged.
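The dry-run behaviour described above could look something like the following sketch (MigrationRunner is a made-up name): in dry-run mode SQL is collected instead of executed, which is how the sql/sqlall commands could capture output, and how errors could be surfaced before MySQL, which cannot roll back DDL, alters anything.

```python
# Illustrative sketch only -- not the proposed API. A runner either
# executes statements against a cursor or, in dry-run mode, collects
# them for inspection without touching the schema.

class MigrationRunner:
    def __init__(self, cursor=None, dry_run=False):
        self.cursor = cursor
        self.dry_run = dry_run
        self.collected_sql = []

    def execute(self, sql):
        if self.dry_run:
            # Capture the SQL; nothing is sent to the database.
            self.collected_sql.append(sql)
        else:
            self.cursor.execute(sql)

runner = MigrationRunner(dry_run=True)
runner.execute('ALTER TABLE "person" ADD COLUMN "age" integer')
```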
[4] https://groups.google.com/forum/?fromgroups#!searchin/django-developers/xtrqt/django-developers/pSICNJBJRy8/Hl7frp-O-dMJ

Schedule and Goal
--
Week 1     : Discussion on API design and writing tests
Week 2-3   : Developing the base migration API
Week 4     : Developing extensions and overrides for PostgreSQL
Week 5-6   : Developing extensions and overrides for MySQL
Week 7-8.5 : Developing extensions and overrides for SQLite (may be shorter or longer (by 0.5 week) depending on how much of xtrqt's code is considered acceptable)
Week 8.5-10: Writing documentation and leftover regression tests, if any
Week 11-12 : Buffer weeks for the unexpected

I will consider my project to be successful when we have working, tested and documented migration primitives for Postgres, MySQL and SQLite. If we can develop a working fork of South to use these primitives, that will be a strong indicator of the project's success.

About me and my inspiration for the project
--
I am Kushagra Sinha, a pre-final year student at the Institute of Technology (about to be converted to an Indian Institute of Technology), Banaras Hindu University, Varanasi, India. I can be reached at:

Gmail: sinha.kushagra
Alternative email: kush [at] j4nu5 [dot] com
IRC: Nick j4nu5 on #django-dev and #django
Twitter: @j4nu5
github: j4nu5

I was happily using PHP for nearly all of my webdev work since my high school days (Cak
Re: [GSoC 2012] Schema Alteration API proposal
19.3.2012 13:15, Andrew Godwin wrote:
> On 19/03/12 11:08, Jonathan French wrote:
>> On 18 March 2012 23:33, Russell Keith-Magee <russ...@keith-magee.com> wrote:
>>>> 2. An inspection tool that generates the appropriate python code after
>>>>    inspecting models and current state of database.
>>> The current consensus is that this shouldn't be Django's domain -- at
>>> least, not in the first instance. It might be appropriate to expose an
>>> API to extract the current model state in a Pythonic form, but not a
>>> fully-fledged, user-accessible "tool".
> I would, however, definitely recommend not touching the Oracle or MSSQL
> backends - three is already a lot of work, and they're harder databases
> to get a hold of for testing.

Here I would like to raise my concern - especially as a long-time Django and Oracle user. =)

First of all, everyone can get their hands on the Oracle Express database free of charge, and standard Django stuff works in it very well. GeoDjango doesn't work with it. AFAIK MSSQL is not officially supported by Django, so leaving it untouched shouldn't be much of a problem.

Secondly, Django has in the past been very consistent in its support of four databases: SQLite, PostgreSQL, MySQL and Oracle - all supported as well as possible. I'm aware that doing migrations for all databases is a time-consuming challenge, given all the peculiarities of the different backends. So hopefully that consistency is kept even with new features like this.

And yes, the second thing is of course the GeoDjango part, which takes the complexity to a whole new level.

--
Jani Tiainen

--
You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@googlegroups.com. To unsubscribe from this group, send email to django-developers+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
Re: [GSoC 2012] Schema Alteration API proposal
On 19/03/12 11:08, Jonathan French wrote:
> On 18 March 2012 23:33, Russell Keith-Magee <russ...@keith-magee.com> wrote:
>>> 2. An inspection tool that generates the appropriate python code after
>>>    inspecting models and current state of database.
>> The current consensus is that this shouldn't be Django's domain -- at
>> least, not in the first instance. It might be appropriate to expose an
>> API to extract the current model state in a Pythonic form, but not a
>> fully-fledged, user-accessible "tool".
> Is there a writeup anywhere of why this is the consensus? AFAICT it looks
> like Django already provides half of this in the form of
> DatabaseIntrospection, which e.g. South actually uses, and which generates
> a model class from the current state of the database. Doing the diff as
> well doesn't seem like much of a stretch, and might make it more likely
> for third party custom fields to be made migrateable, if the interface
> for doing so is in Django core.

No writeup that I know of - however, the main part of the work here would be the "model differencing" code, which means creating a versioned ORM, being able to load and save model definitions to some kind of format, and the actual difference-creating code, which is all too much to stick into Django.

I've long maintained that I want South to become just that automatic differencing code, and to just move the actual database API across; this is mostly because I see there being scope for other kinds of migration systems apart from the kind South is (for example, a very declarative one, whose model states are retrieved using the combination of all migrations, rather than a lump on the bottom of the last one).

As for your proposal, Kushagra, Russ has said most of the points I would have thought of, and a few more - I'd recommend a good look through previous discussions on this mailing list for most of the current views on how we want the schema alteration API to work.
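At its simplest, the "model differencing" step described above amounts to comparing two snapshots of a model's columns and emitting abstract actions for a migration API to replay. A toy illustration (not South's actual code; the snapshot format here is a plain dict, purely an assumption):

```python
# Toy model-differencing sketch: diff two column snapshots and emit
# (action, ...) tuples. A real tool would serialise Django field
# instances rather than db-type strings.

old = {"name": "varchar(100)", "age": "integer"}
new = {"name": "varchar(200)", "email": "varchar(75)"}

actions = []
for col in sorted(set(old) - set(new)):
    actions.append(("drop_column", col))            # column removed
for col in sorted(set(new) - set(old)):
    actions.append(("add_column", col, new[col]))   # column added
for col in sorted(set(old) & set(new)):
    if old[col] != new[col]:
        actions.append(("alter_column", col, new[col]))  # type changed
```

The hard parts Andrew alludes to (versioned model state, serialisation, dependency ordering) are exactly what this toy version leaves out.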
I would, however, definitely recommend not touching the Oracle or MSSQL backends - three is already a lot of work, and they're harder databases to get a hold of for testing.

Andrew
Re: [GSoC 2012] Schema Alteration API proposal
On 18 March 2012 23:33, Russell Keith-Magee wrote:
>> 2. An inspection tool that generates the appropriate python code after
>>    inspecting models and current state of database.
>
> The current consensus is that this shouldn't be Django's domain -- at
> least, not in the first instance. It might be appropriate to expose an API
> to extract the current model state in a Pythonic form, but not a
> fully-fledged, user-accessible "tool".

Is there a writeup anywhere of why this is the consensus? AFAICT it looks like Django already provides half of this in the form of DatabaseIntrospection, which e.g. South actually uses, and which generates a model class from the current state of the database. Doing the diff as well doesn't seem like much of a stretch, and might make it more likely for third party custom fields to be made migrateable, if the interface for doing so is in Django core.

- ojno
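For readers unfamiliar with what DatabaseIntrospection recovers, here is a standalone sqlite3 illustration of the same kind of data (table names and column descriptions). Real Django code would go through connection.introspection rather than raw queries; this sketch is only an analogy.

```python
# Standalone illustration of database introspection: list tables and
# describe columns, roughly what Django's DatabaseIntrospection
# table_names()/get_table_description() methods return.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")

# Table names, via the sqlite_master catalog:
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]

# Column (name, declared type) pairs, via PRAGMA table_info:
columns = [(row[1], row[2])
           for row in conn.execute("PRAGMA table_info(person)")]
```

Generating a model class (as South's inspectdb-style code does) is then a matter of mapping these descriptions back onto field types.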
Re: [GSoC 2012] Schema Alteration API proposal
On 18/03/2012, at 7:38 PM, Kushagra Sinha wrote:
> Abstract
> --
> A database migration helper has been one of the most long standing feature
> requests in Django. Though Django has an excellent database creation helper,
> when faced with schema design changes, developers have to resort to either
> writing raw SQL and manually performing the migrations, or using third party
> apps like South[1] and Nashvegas[2].
>
> Clearly Django will benefit from having a database migration helper as an
> integral part of its codebase.
>
> From [3], the consensus seems to be on building a Ruby on Rails ActiveRecord
> Migrations[4] like framework, which will essentially emit python code after
> inspecting user models and current state of the database.

Check the edit dates on that wiki -- most of the content on that page is historical, reflecting discussions that were happening over 3 years ago. There have been many more recent discussions. The "current consensus" (at least, the consensus of what the core team is likely to accept) is better reflected by the GSoC project that was accepted, but not completed, last year. I posted to django-developers about this a week or so ago [1]; there were some follow-up conversations in that thread, too [2].

[1] http://groups.google.com/group/django-developers/msg/cf379a4f353a37f8
[2] http://groups.google.com/group/django-developers/msg/2f287e5e3dc9f459

> The python code generated will then be fed to a 'migrations API' that will
> actually handle the task of migration. This is the approach followed by
> South (as opposed to Nashvegas's approach of generating raw SQL migration
> files). This ensures modularity, one of the trademarks of Django.

I don't think you're going to be able to ignore raw SQL migrations quite that easily. Just like the ORM isn't able to express every query, there will be migrations that you can't express in any schema migration abstraction.
Raw SQL migrations will always need to be an option (even if they're feature limited).

> Third party developers can create their own inspection and ORM versioning
> tools, provided the inspection tool emits python code conforming to our new
> migrations API.
>
> To sum up, the complete migrations framework will need, at the highest level:
> 1. A migrations API that accepts python code and actually performs the
>    migrations.

This is certainly needed. I'm a little concerned by your phrasing of an "API that accepts python code", though. An API is something that Python code can invoke, not the other way around. We're looking for django.db.backends.migration as an analog of django.db.backends.creation, not a code-consuming utility library.

> 2. An inspection tool that generates the appropriate python code after
>    inspecting models and current state of database.

The current consensus is that this shouldn't be Django's domain -- at least, not in the first instance. It might be appropriate to expose an API to extract the current model state in a Pythonic form, but not a fully-fledged, user-accessible "tool".

> 3. A versioning tool to keep track of migrations. This will allow 'backward'
>    migrations.

If backward migrations are the only reason to have a versioning tool, then I'd argue you don't need versioning. However, that's not the only reason to have versioning, is it :-)

> South's syncdb:
> class Command(NoArgsCommand):
>     def handle_noargs(self, migrate_all=False, **options):

As a guide for the future -- large wads of code like this aren't very compelling as part of a proposal unless you're trying to demonstrate something specific. In this case, you're just duplicating some of South's internals -- "I'm going to take South's lead" is all you really needed to say.

> If migrations become a core part of Django, every user app will have a
> migrations folder (module) under it, created at the time of issuing
> django-admin.py startapp.
> Thus by modifying the startapp command to create a migrations module for
> every app it creates, we will be able to use South's syncdb code as is and
> will also save the user from issuing schemamigration --initial for all
> his/her apps.
>
> Now that we have a guaranteed migrations history for every user app, the
> migrate command will also be more or less a copy of South's migrate command.

What does this "history" look like? Are migrations named? Are they dated? Numbered? How do you handle dependencies? Ordering? Collisions between parallel development? *This* is the sort of thing a proposal should be elaborating.

> As much as I would have liked to use the Django creation API's code for
> creating and destroying models, we cannot. The reason for this is that
> Django's creation API uses its inspection tools to generate *SQL* which is
> then directly fed to cursor.execute. What we need is a migrations API which
> gobbles up *python* code generated by the inspection tool. Moreover deprecating
[GSoC 2012] Schema Alteration API proposal
Abstract
--
A database migration helper has been one of the most long standing feature requests in Django. Though Django has an excellent database creation helper, when faced with schema design changes, developers have to resort to either writing raw SQL and manually performing the migrations, or using third party apps like South[1] and Nashvegas[2].

Clearly Django will benefit from having a database migration helper as an integral part of its codebase.

From [3], the consensus seems to be on building a framework like Ruby on Rails' ActiveRecord Migrations[4], which will essentially emit python code after inspecting user models and the current state of the database. The python code generated will then be fed to a 'migrations API' that will actually handle the task of migration. This is the approach followed by South (as opposed to Nashvegas's approach of generating raw SQL migration files). This ensures modularity, one of the trademarks of Django. Third party developers can create their own inspection and ORM versioning tools, provided the inspection tool emits python code conforming to our new migrations API.

To sum up, the complete migrations framework will need, at the highest level:

1. A migrations API that accepts python code and actually performs the migrations.
2. An inspection tool that generates the appropriate python code after inspecting models and the current state of the database.
3. A versioning tool to keep track of migrations. This will allow 'backward' migrations.
4. Glue code to tie the above three together.

Implementation plan
--
Before discussing the implementation plan for the migrations framework, I would like to digress for a moment and discuss the final state of the migrations framework once it is implemented. For the user, syncing and migrating databases will consist of issuing the syncdb command and a new 'migrate' command. syncdb will have to be rewritten and a new migrate command will be written.
South's syncdb:

class Command(NoArgsCommand):
    def handle_noargs(self, migrate_all=False, **options):
        ...
        apps_needing_sync = []
        apps_migrated = []
        for app in models.get_apps():
            app_label = get_app_label(app)
            if migrate_all:
                apps_needing_sync.append(app_label)
            else:
                try:
                    migrations = migration.Migrations(app_label)
                except NoMigrations:
                    # It needs syncing
                    apps_needing_sync.append(app_label)
                else:
                    # This is a migrated app, leave it
                    apps_migrated.append(app_label)
        verbosity = int(options.get('verbosity', 0))
        # Run the original syncdb procedure for apps_needing_sync
        # If migrate is passed as a parameter, run the migrate command for the rest

The above code is from South's override of the syncdb command. It basically divides INSTALLED_APPS into apps that have a migration history, which will be handled by the migrations framework, and those that do not, which will be handled by Django's syncdb. South expects users to manually run a 'schemamigration --initial' command for every app they want handled by South's migration framework.

If migrations become a core part of Django, every user app will have a migrations folder (module) under it, created at the time of issuing django-admin.py startapp. Thus, by modifying the startapp command to create a migrations module for every app it creates, we will be able to use South's syncdb code as is, and will also save the user from issuing schemamigration --initial for all his/her apps.

Now that we have a guaranteed migrations history for every user app, the migrate command will also be more or less a copy of South's migrate command.

Coming back to the migrations API, there are three fundamental operations that can be performed during a migration:

1. Creation of a new model.
2. Alteration of an existing model.
3. Deletion of an existing model.

As much as I would have liked to use the Django creation API's code for creating and destroying models, we cannot.
The reason for this is that Django's creation API uses its inspection tools to generate *SQL*, which is then directly fed to cursor.execute. What we need is a migrations API which gobbles up *python* code generated by the inspection tool. Moreover, deprecating/removing Django's creation API in favour of the new migrations API everywhere would give rise to performance issues, since time would be wasted generating python code and then converting that python to SQL for Django's core apps, which will never have migrations anyway. The creation API and the code that depends on it (syncdb, sql, django.test.simple and django.contrib.gis.db.backends) will be left as is.

Therefore much of the code for our new migra