Re: Discussion related to ticket #26822 (new migrations, --keepdb and --parallel option)

2016-07-05 Thread Romain Garrigues

Markus, I like the idea, which is definitely better than my idea of new 
option to recreate it manually when we know we have to.

I can try to investigate a bit if you think that could lead to something 
that makes sense.
Two ideas I have in mind after a quick look at migrate command line code:
1/ Extract the code related to the "plan" (created by 
executor.migration_plan(targets) function) to be also used somewhere else 
(in clone_test_db for example...)
2/ Make "migrate" command return a sort of report (number of migrations 
applied, ...) of what happened during a "migrate" call, that could then be 
used in db.backends.base.creation.BaseDatabaseCreation.create_test_db and 
passed to the connection.creation.clone_test_db loop block, moving from 
"django.test.runner.setup_databases" to 
"db.backends.base.creation.BaseDatabaseCreation.create_test_db" (as cloned 
databases have a link with the state of the default one, it can justify 
this move).

This parallel option is really great and coming with some environment 
constraints, as you said Aymeric, but for big projects, the gain is so 
impressive that I will do all I can to help on that!

I have the benchmark in my todo list, do you think it makes sense to update 
the current PR with one of these 2 propositions explained above?

Romain.

Le mardi 5 juillet 2016 12:48:38 UTC+1, Markus Holtermann a écrit :
>
> Hi, 
>
> it might be a shot in the dark, but can't we check if Django's 
> testrunner applied new migrations in which case we drop the cloned 
> databases and recreate them. If all migrations already existed we keep 
> the clones the way they are? 
>
> /Markus 
>
> On Tue, Jul 05, 2016 at 09:00:25AM +0200, Aymeric Augustin wrote: 
> >Hello, 
> > 
> >I’ll try to clarify what I said in the PR below.. 
> > 
> >The main reason for the `--parallel` option was to make Django’s own test 
> suite faster. A full run went down from ~8m to ~1m30 when I committed that 
> patch, which really helps the development cycle on invasive patches. 
> > 
> >Since Django’s own test runner bypasses migrations, whenever you make 
> changes to a model in Django’s test suite, you need a run without 
> `--keepdb`. So the problem you’re describing doesn’t exist for the primary 
> use case. 
> > 
> >Now let’s talk about models and migrations in users’ projects, which is 
> the logical next step and the use case you’re trying to improve. 
> > 
> >Note that the `--parallel` option is experimental and often 
> non-functional in this context: as soon as two tests hit a resource other 
> than the database — say, the cache — they can stomp upon one another. 
> > 
> >> On 05 Jul 2016, at 00:22, Romain Garrigues  > wrote: 
> >> 
> >> We could have just documented this limitation, but I don't think that 
> my situation is a really rare edge case in terms of process, so I was 
> suggesting to add a new option to be able to reset the cloned databases if 
> needed (let's name it --parallel-clone-reset). 
> > 
> >When I make changes to models, usually I keep removing and recreating a 
> single migration, which is incompatible with using the `--keepdb` option. 
> Whenever I make changes, I run without `--keepdb` once. 
> > 
> >> I don't really like the idea of adding a new option, as it impacts the 
> test runner, the clone_test_db function signature, ... but I have not found 
> a better idea to at the same time keep the performances with --keepdb and 
> --parallel, and handle these newly added migrations to a project. 
> > 
> >I’m not a fan of a new option either… 
> > 
> >> To summarize my proposal, this option (--parallel-clone-reset, or any 
> other name) should be used only if you are using --keepdb and --parallel 
> options at the same time, and when you have added a new migration between 2 
> test run. 
> > 
> >IIRC this will more than double the run time of Django’s own test suite 
> on MySQL: it will increase from ~2m to ~4m (give or take 30s) because 
> cloning databases is slow on MySQL. 
> > 
> >I’m quoting all these figures from memory and I may mix them up. As I 
> said on the ticket it would be useful to redo the benchmark on a first run 
> and subsequent run of `./runtests.py` on PostgreSQL and MySQL. 
> > 
> >You could argue that it’s best to degrade the experience of a few Django 
> contributors (original use case, Django’s test suite) for the benefits of 
> the wider community (new use case, projects’ test suites). However the 
> original use case is known to work and I don’t believe that the new use 
> case works well enough in general, at least not without some engineering to 
> isolate tests from one another. For this reason I’m not convinced by this 
> argument. 
> > 
> >I hope this clarifies the context of the trade-off we’re discussing. 
> > 
> >Best regards, 
> > 
> >-- 
> >Aymeric. 
> > 
> >-- 
> >You received this message because you are subscribed to the Google Groups 
> "Django developers  (Contributions to Django itself)" group. 
> >To 

Fellow Report - June 25, 2016

2016-07-05 Thread Tim Graham


(I neglected to post this last week.)


Triaged

---

https://code.djangoproject.com/ticket/26779 - i18n_javascript should take 
extra_context as argument (accepted)

https://code.djangoproject.com/ticket/26781 - SQLite crashes when changing 
the case of db_table (accepted)

https://code.djangoproject.com/ticket/26794 - Migrating ForeignKey to 
OneToOneField creates extra index on PostgreSQL (duplicate)

https://code.djangoproject.com/ticket/25292 - "'str' object has no 
attribute '_meta'" crash in ManyToManyField.through_fields check (accepted)

https://code.djangoproject.com/ticket/26799 - Model related signals' 
receiver could not receive its proxy models (duplicate)

Authored



https://github.com/django/django/pull/6809 - Refs #22384 -- Readded 
RegexURLResolver.reverse().

https://github.com/django/django/pull/6819 - Refs #21379, #26719 -- Moved 
username normalization to AbstractBaseUser.

https://github.com/django/django/pull/6820 - Fixed #26787 -- Documented 
deleting and reloading of model instance fields.

https://github.com/django/django/pull/6791 - Fixed #26791 -- Replaced 
LiveServerTestCase port ranges with binding to port 0.

https://github.com/jazzband/django-hosts/pull/59 - Simplified and completed 
changes for Django 1.10 support.

Reviewed/committed

--

https://github.com/django/django/pull/6713 - Refs #2 -- Added 
ALLOWED_HOSTS validation when running tests.

https://github.com/django/django/pull/6811 - Fixed #26783 -- Fixed 
SessionMiddleware's empty cookie deletion when using SESSION_COOKIE_PATH.

https://github.com/django/django/pull/6756 - Fixed #25940 -- Added 
OGRGeometry.from_gml() and GEOSGeometry.from_gml().

https://github.com/django/django/pull/6804 - Fixed #26729 -- Allowed 
overriding a form field's label/help_text in Form.__init__() for 
TabularInline.

https://github.com/django/django/pull/6693 - Fixed #25920 -- Added support 
for non-uniform NUMBER_GROUPING.

https://github.com/django/django/pull/6830 - Fixed #26795 -- Factor out 
"get_changes" method in test_autodetector

https://github.com/django/django/pull/6788 - Fixed #26719 -- Normalized 
email in AbstractUser.clean().

Reviews of core dev work



https://github.com/django/django/pull/6814 - Fixed #26781 -- Made table 
name case change a noop on SQLite.

https://github.com/django/django/pull/6827 - Refs #24829 -- Made 
TemplateResponse.content available sooner in exception context
https://github.com/django/django/pull/6798 - Fixed #5897 -- Added the 
Content-Length response header in CommonMiddleware

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/dc347fde-4b2c-4347-90dc-e5263fe26009%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [Feature Request] Orderable ArrayAgg and StringAgg in contrib.postgres.aggregates

2016-07-05 Thread Floris den Hengst
I investigated the GROUP BY requirements when using an ORDER BY within 
these aggregates, but there appear to be none.
Further details can be found in the discussion in Trac: 
https://code.djangoproject.com/ticket/26067.

Based on this observation, I went on to create a 
PR: https://github.com/django/django/pull/6886
If anyone is up for it, please have a look!

On Saturday, July 2, 2016 at 10:36:35 AM UTC+2, Floris den Hengst wrote:
>
> @Josh, thanks for pointing out the Expressions API
>
> I will look into it and see if I can come up with a solution that looks 
> reasonable (including an investigation into requirements on contributing to 
> the GROUP BY) and report here / in Trac.
>
>
> On Monday, March 7, 2016 at 1:14:06 PM UTC+1, Tim Graham wrote:
>>
>> Ticket for JSON_AGG: https://code.djangoproject.com/ticket/26327
>>
>> On Thursday, January 7, 2016 at 5:45:03 PM UTC-5, Josh Smeaton wrote:
>>>
>>> Seems reasonable enough to me. Expressions already support generating an 
>>> ORDER BY clause by calling .asc() or .desc() on them. That'd allow your 
>>> proposed API to support:
>>>
>>> Model.objects.aggregate(ArrayAgg(some_field, order_by=
>>> F('some_field').asc()))
>>>
>>> Or any other expression. Consider that any ordering added within *may* 
>>> need to be contributed to GROUP BY, but I haven't read the documentation to 
>>> say that is a requirement.
>>>
>>> Feel free to open a ticket in Trac to track the feature. If you're also 
>>> up for implementing it (..and testing.. and documenting..) I'm quietly 
>>> confident that it'll be fairly straight forward to add, given familiarity 
>>> with Expressions API.
>>>
>>> Additionally, if you think json_agg and json_object_agg should be 
>>> implemented, you can open a feature ticket on Trac for those too. 
>>>
>>> Regards,
>>>
>>> On Friday, 8 January 2016 00:46:52 UTC+11, Floris den Hengst wrote:

 The excellent ArrayAgg and StringAgg Postgres-specific aggregates were 
 introduced in contrib.postgres in Django 1.9 and I've been quite happy 
 using them here and there.
 Thanks for the keeping Django awesome!

 The documentation 
  
 of Postgres 9.0 first mentions the possiblity of ordering the results 
 within such aggregations.
 This could be useful in some cases.
 For example: it could make sense to perform the StringAgg in 
 lexicographical order in some cases.

 The simple format of ordering the aggregation result in SQL is quite 
 simple:
 SELECT ARRAY_AGG(some_field ORDER BY some_field ASC) FROM table;
 SELECT ARRAY_AGG(some_field ORDER BY some_field DESC) FROM table;
 SELECT ARRAY_AGG(some_field ORDER BY other_field ASC) FROM table;

 It would be nice if the above would be supported as follows:
 Model.objects.aggregate(ArrayAgg(some_field, order_by='some_field'))
 Model.objects.aggregate(ArrayAgg(some_field, order_by='-some_field'))
 Model.objects.aggregate(ArrayAgg(some_field, order_by='other_field'))
 where order_by is an optional parameter. If it not specified, behavior 
 can remain unchanged from the current implementation.

 Note that the documentation for Postgres >= 9.3 also mentions 
 json_agg ordering and Postgres >= 9.4 mentions json_object_agg as well.
 I'm unsure whether these should be included as well as I couldn't find 
 the matching implementation in the ORM atm and have no knowledge on plans 
 in that direction.

>>>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/20fb672b-1d44-4f78-ba44-e19e844ee444%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Discussion related to ticket #26822 (new migrations, --keepdb and --parallel option)

2016-07-05 Thread Markus Holtermann

Hi,

it might be a shot in the dark, but can't we check if Django's
testrunner applied new migrations in which case we drop the cloned
databases and recreate them. If all migrations already existed we keep
the clones the way they are?

/Markus

On Tue, Jul 05, 2016 at 09:00:25AM +0200, Aymeric Augustin wrote:

Hello,

I’ll try to clarify what I said in the PR below..

The main reason for the `--parallel` option was to make Django’s own test suite 
faster. A full run went down from ~8m to ~1m30 when I committed that patch, 
which really helps the development cycle on invasive patches.

Since Django’s own test runner bypasses migrations, whenever you make changes 
to a model in Django’s test suite, you need a run without `--keepdb`. So the 
problem you’re describing doesn’t exist for the primary use case.

Now let’s talk about models and migrations in users’ projects, which is the 
logical next step and the use case you’re trying to improve.

Note that the `--parallel` option is experimental and often non-functional in 
this context: as soon as two tests hit a resource other than the database — 
say, the cache — they can stomp upon one another.


On 05 Jul 2016, at 00:22, Romain Garrigues  
wrote:

We could have just documented this limitation, but I don't think that my 
situation is a really rare edge case in terms of process, so I was suggesting 
to add a new option to be able to reset the cloned databases if needed (let's 
name it --parallel-clone-reset).


When I make changes to models, usually I keep removing and recreating a single 
migration, which is incompatible with using the `--keepdb` option. Whenever I 
make changes, I run without `--keepdb` once.


I don't really like the idea of adding a new option, as it impacts the test 
runner, the clone_test_db function signature, ... but I have not found a better 
idea to at the same time keep the performances with --keepdb and --parallel, 
and handle these newly added migrations to a project.


I’m not a fan of a new option either…


To summarize my proposal, this option (--parallel-clone-reset, or any other 
name) should be used only if you are using --keepdb and --parallel options at 
the same time, and when you have added a new migration between 2 test run.


IIRC this will more than double the run time of Django’s own test suite on 
MySQL: it will increase from ~2m to ~4m (give or take 30s) because cloning 
databases is slow on MySQL.

I’m quoting all these figures from memory and I may mix them up. As I said on 
the ticket it would be useful to redo the benchmark on a first run and 
subsequent run of `./runtests.py` on PostgreSQL and MySQL.

You could argue that it’s best to degrade the experience of a few Django 
contributors (original use case, Django’s test suite) for the benefits of the 
wider community (new use case, projects’ test suites). However the original use 
case is known to work and I don’t believe that the new use case works well 
enough in general, at least not without some engineering to isolate tests from 
one another. For this reason I’m not convinced by this argument.

I hope this clarifies the context of the trade-off we’re discussing.

Best regards,

--
Aymeric.

--
You received this message because you are subscribed to the Google Groups "Django 
developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/BD5FB0D4-C9E1-4681-A5A1-CCD6D6BB84B7%40polytechnique.org.
For more options, visit https://groups.google.com/d/optout.


--
You received this message because you are subscribed to the Google Groups "Django 
developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/20160705114828.GA3506%40inel.local.
For more options, visit https://groups.google.com/d/optout.


signature.asc
Description: PGP signature


Re: Discussion related to ticket #26822 (new migrations, --keepdb and --parallel option)

2016-07-05 Thread Aymeric Augustin
Hello,

I’ll try to clarify what I said in the PR below..

The main reason for the `--parallel` option was to make Django’s own test suite 
faster. A full run went down from ~8m to ~1m30 when I committed that patch, 
which really helps the development cycle on invasive patches.

Since Django’s own test runner bypasses migrations, whenever you make changes 
to a model in Django’s test suite, you need a run without `--keepdb`. So the 
problem you’re describing doesn’t exist for the primary use case.

Now let’s talk about models and migrations in users’ projects, which is the 
logical next step and the use case you’re trying to improve.

Note that the `--parallel` option is experimental and often non-functional in 
this context: as soon as two tests hit a resource other than the database — 
say, the cache — they can stomp upon one another.

> On 05 Jul 2016, at 00:22, Romain Garrigues  
> wrote:
> 
> We could have just documented this limitation, but I don't think that my 
> situation is a really rare edge case in terms of process, so I was suggesting 
> to add a new option to be able to reset the cloned databases if needed (let's 
> name it --parallel-clone-reset).

When I make changes to models, usually I keep removing and recreating a single 
migration, which is incompatible with using the `--keepdb` option. Whenever I 
make changes, I run without `--keepdb` once.

> I don't really like the idea of adding a new option, as it impacts the test 
> runner, the clone_test_db function signature, ... but I have not found a 
> better idea to at the same time keep the performances with --keepdb and 
> --parallel, and handle these newly added migrations to a project.

I’m not a fan of a new option either…

> To summarize my proposal, this option (--parallel-clone-reset, or any other 
> name) should be used only if you are using --keepdb and --parallel options at 
> the same time, and when you have added a new migration between 2 test run.

IIRC this will more than double the run time of Django’s own test suite on 
MySQL: it will increase from ~2m to ~4m (give or take 30s) because cloning 
databases is slow on MySQL.

I’m quoting all these figures from memory and I may mix them up. As I said on 
the ticket it would be useful to redo the benchmark on a first run and 
subsequent run of `./runtests.py` on PostgreSQL and MySQL.

You could argue that it’s best to degrade the experience of a few Django 
contributors (original use case, Django’s test suite) for the benefits of the 
wider community (new use case, projects’ test suites). However the original use 
case is known to work and I don’t believe that the new use case works well 
enough in general, at least not without some engineering to isolate tests from 
one another. For this reason I’m not convinced by this argument.

I hope this clarifies the context of the trade-off we’re discussing.

Best regards,

-- 
Aymeric.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/BD5FB0D4-C9E1-4681-A5A1-CCD6D6BB84B7%40polytechnique.org.
For more options, visit https://groups.google.com/d/optout.


Re: Discussion related to ticket #26822 (new migrations, --keepdb and --parallel option)

2016-07-05 Thread Andreas Pelme
Hello,

> On 5 juli 2016, at 00:22, Romain Garrigues  
> wrote:
> After some investigation, I have seen that, in case of keepdb context, in 
> django/db/backends/*/creation.py, if the cloned databases already exist, we 
> don't touch them, which leads to this new field not created in cloned ones.
> 
> I have proposed in the PR to rebuild the cloned databases, even with keepdb 
> option, to be sure that we always have the cloned databases with the latest 
> migration state.
> 
> The problem with this method is that it will increase test database 
> initialization time, as we will now systematically copy all cloned databases, 
> even with --keepdb option (except the default one).

We’ve been doing similar things in pytest-django (with the --reuse-db option 
and pytest-xdist) and faces similar problems. Currently you have to 
force-recreate the databases and then all processes will run migrations and it 
is very slow.

I’ve been playing around with a solution to this: In my own project I create a 
template database and call it `test_myproject_`. 
Whenever a migration file changes (an existing file or a new migration file) - 
a new database is created and all clones are recreated.

Currently this lives as a hack in my own code base, but I would like to explore 
this further and it could be a way forward. Here is my scripts that calculates 
the hash and creates the databases:
https://gist.github.com/pelme/4b3dac475cd6b1dec4fd67d25d2e7cdc
https://gist.github.com/pelme/4a3ad3a62b6244068ff63736342f9509

This method could be refined: It is not necessary to create a database with a 
new name every time migrations change. I.e. we could create a private table 
with a single row that contains the hash.

This approach hashes only the migration files directly involved in migrations, 
if you are using a 3rd-party library that’s imported, that will not trigger a 
new migration run.

As an end user of this the experience is quite nice: You only experience the 
migration/cloning slowness whenever migrations actually changed, otherwise 
everything is fast. You don’t have to remember any special command line options.

Cheers,
Andreas

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/5A553E81-F6FF-4A85-ADB3-01D66AE072B9%40pelme.se.
For more options, visit https://groups.google.com/d/optout.


Re: Google Summer of Code Updates - Class based indexes

2016-07-05 Thread akki

   
   - Restructure index migrations
   PR  - https://github.com/django/django/pull/6866
   Modified the internal working of migrations/schema methods related to 
   indexes. This mainly aims at stabilising the newly added migrations for 
   Index class.


   - Introduce Meta.indexes
   PR ready for review at https://github.com/django/django/pull/6857.
   Ticket - https://code.djangoproject.com/ticket/26808


On Tuesday, 28 June 2016 18:44:37 UTC+5:30, thinkwel...@gmail.com wrote:
>
> Wonderful!  I make heavy use of btree_gin in one of my projects; will be 
> great to have these new Index classes!
>
> On Monday, June 27, 2016 at 6:51:31 PM UTC-4, Tim Graham wrote:
>>
>> Support for more index types is planned. This is only part 1 of the work. 
>> Please see https://gist.github.com/akki/7fd50505928dac58dc350e6cb186a404 
>> for the timeline.
>>
>> On Monday, June 27, 2016 at 5:02:06 PM UTC-4, thinkwel...@gmail.com 
>> wrote:
>>>
>>> Will it be possible to create indexes other than btree? In reviewing the 
>>> code it doesn't look like it but I hope I'm wrong. It'd be awesome to 
>>> support the many more specific index options without using run_sql.
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/26b952b5-ae2c-4504-b064-00c0de60a693%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.