Re: A Django Async Roadmap

2019-06-25 Thread Andrew Godwin
The DEP is drafted and in the DEPs repo, and awaiting approval by the
freshly-elected Technical Board once I submit it. In the meantime, we
landed the ASGI patch, as well.

Andrew

On Tue, Jun 25, 2019 at 3:30 PM Chris Barry wrote:

> Hey all,
>
> Just wondering what the future of this is looking like?
>
> CB
>
> On Friday, 12 April 2019 07:33:35 UTC+1, Shaggy wrote:
>>
>> And how is it going?
>> Is there some interest from the Django devs?
>>
>> On Monday, 4 June 2018 15:18:23 UTC+2, Andrew Godwin wrote:
>>>
>>> Hello everyone,
>>>
>>> For a while now I have been working on potential plans for making Django
>>> async-capable, and I finally have a plan I am reasonably happy with and
>>> which I think we can actually do.
>>>
>>> This proposed roadmap, in its great length, is here:
>>>
>>> https://www.aeracode.org/2018/06/04/django-async-roadmap/
>>>
>>> I'd like to invite discussion on this potential plan - including:
>>>
>>>  - Do we think async is worth going after? Note that this is just async
>>> HTTP capability, not WebSockets (that would remain in Channels)
>>>
>>>  - Can we do this in a reasonable timeframe? If not, is there a way
>>> around that?
>>>
>>>  - Are the proposed modifications to how Django runs sensible?
>>>
>>>  - How should we fund this?
>>>
>>> There are many more potential questions, and I would really love feedback
>>> on this. I'm personally pretty convinced that we can and should do this,
>>> but this is a decision we cannot take lightly, and I would love to hear
>>> what you have to say.
>>>
>>> Andrew
>>>
>> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at https://groups.google.com/group/django-developers.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/6ea76507-4041-4850-ac6c-bb13a09af941%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>



Re: A Django Async Roadmap

2019-06-25 Thread Chris Barry
Hey all,

Just wondering what the future of this is looking like?

CB




Re: When filtering on a subquery expression, the subquery appears twice in the SELECT and the WHERE

2019-06-25 Thread Haki Benita
Thanks for the quick reply Simon.

I'm currently unable to test the [0] commit for this specific use case 
because of some compat issues in my project. I'm also not sure how the 
solution can work in SQL.

The PR to allow using Exists directly in filter sounds like it can solve 
the problem for this particular use case.

Anyway, I'm back to `__in` for now ;)
I won't bother the group with it...
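
For readers following along: the exact `__in` query is not shown in this thread, so the following is only a guess at its shape. An ORM filter such as `Payment.objects.filter(pk__in=Record.objects.values('payment_id'))` would be expected to compile to an `IN (SELECT ...)` condition. The sketch below uses Python's built-in sqlite3 module with a made-up miniature schema to check that this shape selects the same rows as an EXISTS condition that appears only in the WHERE clause.

```python
import sqlite3

# Hypothetical miniature schema; table contents are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payment (id INTEGER PRIMARY KEY);
    CREATE TABLE record (id INTEGER PRIMARY KEY, payment_id INTEGER);
    INSERT INTO payment (id) VALUES (1), (2), (3);
    INSERT INTO record (payment_id) VALUES (1), (3), (3);
""")

# Assumed SQL shape of the `__in` fallback (not taken from the thread).
in_sql = (
    "SELECT id FROM payment "
    "WHERE id IN (SELECT payment_id FROM record) ORDER BY id"
)

# The EXISTS form, with the subquery in WHERE only.
exists_sql = (
    "SELECT p.id FROM payment p "
    "WHERE EXISTS (SELECT 1 FROM record r WHERE r.payment_id = p.id) "
    "ORDER BY p.id"
)

in_ids = [row[0] for row in conn.execute(in_sql)]
exists_ids = [row[0] for row in conn.execute(exists_sql)]

print(in_ids == exists_ids)
```

Neither form places the subquery in the SELECT list, which is the point of the fallback.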

Thanks,
Haki.


Re: When filtering on a subquery expression, the subquery appears twice in the SELECT and the WHERE

2019-06-25 Thread charettes
Hello Haki,

This should probably have been posted to the django-users list, given that
this one is for the development of Django itself, but since there has been
some recent and ongoing development in this area I'll reply here.

First, there has been work to reuse annotation aliases that is in the master
branch but hasn't been released yet[0]. I'm pretty sure it would reuse the
"reconciled" alias in your case, but if it doesn't I'd encourage you to submit
an optimization ticket to make it so.

Secondly, there's an active PR to allow expressions resolving to a boolean
field to be passed directly to .filter()[1]. That would allow you to pass your
Exists expression directly to filter() instead of annotating it first, and
thus avoid the double evaluation issue you are experiencing.

Finally, there's an accepted feature request for adding an alias() method that
would work how you'd expect `annotate(foo).defer('foo')` to[2]. Maybe
annotate().defer() is more intuitive than adding yet another method.

Unfortunately, I can't think of another way of performing a single EXISTS()
using any of the currently released versions of Django.

Cheers,
Simon

[0] https://github.com/django/django/commit/1ca825e4dc186da2b93292b5c848a3e5445968d7
[1] https://github.com/django/django/pull/8119
[2] https://code.djangoproject.com/ticket/27719



When filtering on a subquery expression, the subquery appears twice in the SELECT and the WHERE

2019-06-25 Thread Haki Benita

Hey,
I'm trying to filter on a subquery expression using Exists and I ran into
a weird issue.
According to the docs, to filter on a subquery you first need to annotate it,
and then filter on it. This causes the subquery to appear both in the WHERE
and in the SELECT, which can lead to poor performance.

For example, my queryset:

    from django.db.models import Exists, OuterRef

    Payment.objects.annotate(
        reconciled=Exists(
            Record.objects
            .filter(payment_id=OuterRef('pk'))
            .values_list('payment_id')
        )
    ).filter(reconciled=True)

In the resulting SQL, the subquery is used in WHERE and appears in the 
SELECT as well:

    SELECT
        "payment"."id",
        -- ... many more fields
        EXISTS (
            SELECT U0."id"
            FROM "record" U0
            WHERE U0."payment_id" = ("payment"."id")
        ) AS "reconciled"
    FROM
        "payment"
    WHERE
        EXISTS (
            SELECT U0."id"
            FROM "leumicard_record" U0
            WHERE U0."payment_id" = ("payment"."id")
        ) = False


This causes the "exists" query to be evaluated twice:

 Seq Scan on payment  (cost=0.00..63982927.73 rows=4772627 width=155)
   Filter: (NOT (SubPlan 2))
   SubPlan 1
     ->  Index Only Scan using record_payment_id_058ca67f on leumicard_record u0  (cost=0.42..8.47 rows=2 width=0)
           Index Cond: (payment_id = payment.id)
   SubPlan 2
     ->  Index Only Scan using record_payment_id_058ca67f on leumicard_record u0_1  (cost=0.42..8.47 rows=2 width=0)
           Index Cond: (payment_id = payment.id)

This is most likely because the subquery is "annotated" and so the ORM adds 
it to the values list. 

I tried to `defer` the field, and got the following error:

    django.core.exceptions.FieldDoesNotExist: Payment has no field named 'reconciled'

To make sure that the query can be executed as intended without the
subquery in the select list, I tried adding an explicit `values_list`:

    Payment.objects.annotate(
        reconciled=Exists(
            Record.objects
            .filter(payment_id=OuterRef('pk'))
            .values_list('payment_id')
        )
    ).filter(reconciled=True).values_list('pk')

Query is now as intended:

    SELECT
        "payment"."id"
    FROM
        "payment"
    WHERE
        EXISTS(
            SELECT U0."payment_id"
            FROM "record" U0
            WHERE U0."payment_id" = ("payment"."id")
        ) = False

Plan is cheaper:


 Seq Scan on payment  (cost=0.00..42747797.33 rows=4772627 width=4)
   Filter: (NOT (SubPlan 1))
   SubPlan 1
     ->  Index Only Scan using record_payment_id_058ca67f on leumicard_record u0  (cost=0.42..8.47 rows=2 width=0)
           Index Cond: (payment_id = payment.id)
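
The two query shapes above can also be compared outside the ORM. The following is a minimal sketch using Python's built-in sqlite3 module rather than the PostgreSQL database from this post. The miniature schema and sample rows are made up for illustration, and the condition is a plain `EXISTS (...)` rather than the `= False` comparison in the SQL above. It only shows that both shapes select the same rows while one of them carries the subquery twice; it says nothing about PostgreSQL's cost estimates.

```python
import sqlite3

# Hypothetical miniature schema; names and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payment (id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE record (id INTEGER PRIMARY KEY, payment_id INTEGER);
    CREATE INDEX record_payment_id ON record (payment_id);
    INSERT INTO payment (id, amount) VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO record (payment_id) VALUES (1), (3), (3);
""")

EXISTS_SQL = "EXISTS (SELECT 1 FROM record U0 WHERE U0.payment_id = payment.id)"

# Shape 1: annotate() then filter(); the subquery is in SELECT and in WHERE.
annotated_sql = (
    f"SELECT payment.id, {EXISTS_SQL} AS reconciled "
    f"FROM payment WHERE {EXISTS_SQL} ORDER BY payment.id"
)

# Shape 2: the subquery is in WHERE only.
where_only_sql = (
    f"SELECT payment.id FROM payment WHERE {EXISTS_SQL} ORDER BY payment.id"
)

annotated_ids = [row[0] for row in conn.execute(annotated_sql)]
where_only_ids = [row[0] for row in conn.execute(where_only_sql)]

# Same rows either way; only the number of subquery occurrences differs.
print(annotated_ids, where_only_ids)
print(annotated_sql.count("EXISTS"), where_only_sql.count("EXISTS"))
```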

My questions are:

*-  Am I doing it right?*
*- Is this intended?*
*- What is the best way to exclude the subquery from the query SELECT list?*

Thanks,
*Haki Benita. *



Re: [Feature Request][Model, ORM] Disabling a field before removal to support continuous delivery

2019-06-25 Thread Matthieu Rudelle

>
> How does the "safe" field of migrations work with other migrations related 
> commands, such as squashmigrations? 
>

Squashmigrations typically targets and produces migrations that are old 
enough to be assumed safe.
 

> You can check out our repository. We've been pretty happy with how it's 
> working for us. https://github.com/aspiredu/django-safemigrate


django-safemigrate is a neat solution, although it lacks support for rollbacks:
as soon as "migrate" is run, the app is stuck on the current release.

If migrations are seen as a rolling window, you want a window of X releases
(2 in our case) to play well together (minus discrepancies due to updates
to the application logic), so being able to postpone a column removal to a
later release is important. The "disabled" feature (or anything similar to
a flag written in the codebase) has this added benefit.
Squashmigrations would be used to squash everything up to, but excluding,
the current sliding window.
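
To illustrate why the older release in such a window breaks once a column is gone: Django's default querysets name every column of the model in the SELECT, so a release that still declares the field fails even if no application code reads it. The sketch below mimics that with Python's built-in sqlite3 module. The table name and the two SELECT strings are assumptions standing in for what each release would issue, not output captured from Django.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mymodel (id INTEGER PRIMARY KEY, name TEXT, my_unused_field TEXT);
    INSERT INTO mymodel (name, my_unused_field) VALUES ('a', 'x');
""")

# Assumed query shapes: each release lists the columns of the model as that
# release defines it.
OLD_RELEASE_SQL = "SELECT id, name, my_unused_field FROM mymodel"
NEW_RELEASE_SQL = "SELECT id, name FROM mymodel"

# The new release's migration removes the column. A table rebuild is used here
# so the sketch runs on any SQLite version.
conn.executescript("""
    CREATE TABLE mymodel_new (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO mymodel_new SELECT id, name FROM mymodel;
    DROP TABLE mymodel;
    ALTER TABLE mymodel_new RENAME TO mymodel;
""")

# The new release still works against the migrated schema.
new_rows = conn.execute(NEW_RELEASE_SQL).fetchall()

# The old release fails, although its code never reads my_unused_field.
try:
    conn.execute(OLD_RELEASE_SQL)
    old_error = None
except sqlite3.OperationalError as exc:
    old_error = str(exc)

print(new_rows)
print(old_error)
```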

> There are issues similar to `alter table drop column`: `alter table rename
> column`, `drop table`, `rename table` (honestly, `alter table drop column`
> and `drop table` are a bit different from `alter table rename column` and
> `rename table`).
>

   - `alter table rename column` can be reduced to `alter table add column`
   and, later, `alter table drop column`, plus some temporary update replication
   implemented in the app (likewise for `rename table`)
   - `drop table` follows the same logic, with the SQL actions delayed to a
   later release
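
The rename case in the first bullet can be sketched as follows, using Python's built-in sqlite3 module. The table `customer`, the columns `fullname` and `display_name`, and the helper `save_customer` are hypothetical names chosen for illustration. The "temporary update replication" is modelled here as the application writing both columns while two releases overlap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, fullname TEXT);
    INSERT INTO customer (fullname) VALUES ('Ada'), ('Grace');
""")

# Release N: add the new column and backfill it. Code from release N-1 keeps
# reading and writing fullname and is unaffected.
conn.execute("ALTER TABLE customer ADD COLUMN display_name TEXT")
conn.execute("UPDATE customer SET display_name = fullname")


def save_customer(connection, name):
    """Temporary replication: write both columns while both releases run."""
    connection.execute(
        "INSERT INTO customer (fullname, display_name) VALUES (?, ?)",
        (name, name),
    )


save_customer(conn, "Linus")
conn.commit()

# Release N+1: no running code reads fullname any more, so it can be dropped.
# ALTER TABLE ... DROP COLUMN needs SQLite 3.35+; older versions rebuild.
if sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE customer DROP COLUMN fullname")
else:
    conn.executescript("""
        CREATE TABLE customer_new (id INTEGER PRIMARY KEY, display_name TEXT);
        INSERT INTO customer_new SELECT id, display_name FROM customer;
        DROP TABLE customer;
        ALTER TABLE customer_new RENAME TO customer;
    """)

columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
names = [row[0] for row in conn.execute(
    "SELECT display_name FROM customer ORDER BY id"
)]
print(columns, names)
```

At no point does a running release see a column it expects disappear, which is what makes the two-step form compatible with a rolling window of releases.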

> But if you mix migrations for `alter table add column` and `alter table
> drop column` - you cannot safely apply both migrations simultaneously, and
> your proposal sounds pretty reasonable there.

 

> However, if you add the disabled option, then make a migration, remove the
> field, then make a migration, then apply these changes to your environment,
> you will get the same issue. So to automate the deployment process and get
> the result you want, I think you need something more complex: deploying
> atomic changes. But to me that sounds more about deployment than Django itself.
>

As long as the columns are different these actions can happen in parallel. 
The flag will just delay the DB migration to a later release and the 
deployment needs this kind of flexibility from django to avoid direct 
access to the DB by the deployment process.

On Tuesday, June 25, 2019 at 7:06:38 AM UTC+2, Paveł Tyślacki wrote:
>
> There are issues similar to `alter table drop column`: `alter table
> rename column`, `drop table`, `rename table` (honestly, `alter table drop
> column` and `drop table` are a bit different from `alter table rename column`
> and `rename table`).
>
> It looks like your general flow for migrations is:
> 1. change code and create migrations
> 2. apply migrations
> 3. apply code to all your instances
>
> The issue in these cases, as I would describe it: we have code that is
> still in use.
>
> For `alter table drop column` and `drop table`, the following scenario works
> fine (as does your temporary solution):
>
> 1. remove column/table usage from the codebase
> 2. apply this code to all of your instances
> 3. apply the migration with the field/table removal
>
> But if you mix migrations for `alter table add column` and `alter table
> drop column` - you cannot safely apply both migrations simultaneously, and
> your proposal sounds pretty reasonable there.
>
> However, if you add the disabled option, then make a migration, remove the
> field, then make a migration, then apply these changes to your environment,
> you will get the same issue. So to automate the deployment process and get
> the result you want, I think you need something more complex: deploying
> atomic changes. But to me that sounds more about deployment than Django itself.
>
>
>
>
> On Monday, 24 June 2019 16:15:04 UTC+3, Matthieu Rudelle wrote:
>>
>> Hi there, 
>>
>> I can't find any previous ticket proposing a solution to this problem so 
>> here are my findings: 
>>
>> **Use case**:
>> When using continuous delivery, several versions of the code can be
>> running in parallel on the same DB. Say, for instance, that release 2.42 is in
>> production, 2.43 is about to be rolled out, and in this release one field
>> (say ''MyModel.my_unused_field'') is not used anymore and was removed.
>> Before rolling out 2.43 the DB is migrated and column ''my_unused_field''
>> of ''MyModel'' is removed. This makes 2.42 crash saying that one column is
>> not found, even though 2.42 does not use the field anywhere in the code.
>>
>> **Temporary solution**:
>> Do not run makemigrations until the 2.44 release, but this does not scale well
>> with many contributors, and CI tools (doing their awesome job of making sure
>> migrations and models are in sync) will complain.
>>
>> **Proposed solution**:
>> Have a ''disabled'' param on Field. When activated this field is not 
>> fetched from the DB but replaced by a hardcoded value. 
>> In our use case, ''disabled'' is added at the 2.42 release, then when 
>> 2.43 rolls out and migrates the DB