Re: Strange query when using annotate and count

2017-12-06 Thread Cristiano Coelho
Update. Found a work around that gets rid of the unnecessary group by (and 
hence speeds up the query, from ~200ms to ~100ms in my use case). Simply 
adding a .values('pk').
Bad thing, every model needs a custom manager and it will still use an 
inner query, still trying to figure out the side effects of it, nothing so 
far...

class FasterCountQuerySet(QuerySet):
> def count(self):
> return super(FasterCountQuerySet, self.values('pk')).count()
>


Makes me wonder if the original count code can be made "smarter" in order 
to prevent the unnecessary group by, or if this simply a bug


El miércoles, 6 de diciembre de 2017, 10:28:22 (UTC-3), Cristiano Coelho 
escribió:
>
> After testing for a while, there really doesn't seem to be a good way to 
> do this. .
>
> Annotation is required since the query is filtered based on the annotated 
> value, so any attempt to clear annotations would fail. Although a lookup 
> could be used for the filtering.
> But a custom lookup is not possible since the query also sorts by the 
> annotated value, so an annotation is required.
>
> This means that I would need to use a custom lookup for count queries, and 
> annotation for result queries, which would mean a lot of duplicated code 
> and also that the Django Paginator class can't be used since it expects 
> only one queryset.
>
>
> Are there really no options to make a count with a custom function to skip 
> the group by clause (and so the inner query)? I mean there's no group by at 
> all with the original query, it's just when count is used.
>
> To summarize, the original query is a bit more complex than the one 
> described on top, reason annotate is needed, and it looks more like:
>
> q = 
> Vulnerability.objects.annotate(score=WordTrigramCustomSimilarity('test','summary')).filter(score__gt=0.6).order_by('-score')
>
> Which is then sent to a paginator, that ends up almost adding 3 times more 
> database query time do to the horrid count query generated.
>
>
> El viernes, 24 de noviembre de 2017, 18:04:21 (UTC-3), Cristiano Coelho 
> escribió:
>>
>> Hello Simon,
>>
>> You are right, the score is really not meant to be attached so a custom 
>> lookup might work, if it wasn't for the issue that an order_by clause would 
>> fail without the annotation. 
>>
>> So it seems it's either update and duplicate a lot of code or find a very 
>> ugly work around, I was hoping I missed some flag or implementation detail 
>> on the custom function to tell the query builder to avoid all the grouping 
>> and subquery stuff, will research more...
>>
>>
>> El viernes, 24 de noviembre de 2017, 14:39:27 (UTC-3), Simon Charette 
>> escribió:
>>>
>>> Hello Cristiano,
>>>
>>> If you are solely using this annotation for querying purpose, that means 
>>> you are
>>> not expecting a `score` attribute to be attached to your `Vulnerability` 
>>> instances,
>>> you could try defining your function as a lookup on CharField/TextField 
>>> instead.
>>>
>>> You could then use it only when required by passing a tuple of values to 
>>> it.
>>>
>>> e.g.
>>>
>>> Vulnerability.objects.filter(summary__wordsimilarity_gt=('test', 
>>> threshold))
>>>
>>> If you need the value annotated I'm afraid the best solution would be to 
>>> use
>>> a custom paginator class. You could give a try at detecting whether or 
>>> not
>>> an annotation is referenced before removing it from query.annotations and
>>> report your finding on the aforementioned ticket but I expect this to be
>>> relatively hard to do correctly. Still it would make implementing 
>>> `alias()`
>>> way easier and provide help for users in the situation as you.
>>>
>>> Best,
>>> Simon
>>>
>>> Le vendredi 24 novembre 2017 08:14:10 UTC-5, Cristiano Coelho a écrit :
>>>>
>>>> Hello Simon,
>>>>
>>>> That private API is good to know, but now that I think of it would 
>>>> still not work for me, since my queryset is passed to a paginator and 
>>>> that's the class that does both the count and actual queryset execution, 
>>>> so 
>>>> need a queryset that can have both the annotation but also clears it if 
>>>> count is called so it doesn't create the redundant sub queries.
>>>>
>>>> I'm wondering what's better, should I try to resolve this at the 
>>>> manager level overriding count? I feel like this might cause issues if the 
>>>> annotation is actual

Re: Strange query when using annotate and count

2017-12-06 Thread Cristiano Coelho
After testing for a while, there really doesn't seem to be a good way to do 
this. .

Annotation is required since the query is filtered based on the annotated 
value, so any attempt to clear annotations would fail. Although a lookup 
could be used for the filtering.
But a custom lookup is not possible since the query also sorts by the 
annotated value, so an annotation is required.

This means that I would need to use a custom lookup for count queries, and 
annotation for result queries, which would mean a lot of duplicated code 
and also that the Django Paginator class can't be used since it expects 
only one queryset.


Are there really no options to make a count with a custom function to skip 
the group by clause (and so the inner query)? I mean there's no group by at 
all with the original query, it's just when count is used.

To summarize, the original query is a bit more complex than the one 
described on top, reason annotate is needed, and it looks more like:

q = 
Vulnerability.objects.annotate(score=WordTrigramCustomSimilarity('test','summary')).filter(score__gt=0.6).order_by('-score')

Which is then sent to a paginator, that ends up almost adding 3 times more 
database query time do to the horrid count query generated.


El viernes, 24 de noviembre de 2017, 18:04:21 (UTC-3), Cristiano Coelho 
escribió:
>
> Hello Simon,
>
> You are right, the score is really not meant to be attached so a custom 
> lookup might work, if it wasn't for the issue that an order_by clause would 
> fail without the annotation. 
>
> So it seems it's either update and duplicate a lot of code or find a very 
> ugly work around, I was hoping I missed some flag or implementation detail 
> on the custom function to tell the query builder to avoid all the grouping 
> and subquery stuff, will research more...
>
>
> El viernes, 24 de noviembre de 2017, 14:39:27 (UTC-3), Simon Charette 
> escribió:
>>
>> Hello Cristiano,
>>
>> If you are solely using this annotation for querying purpose, that means 
>> you are
>> not expecting a `score` attribute to be attached to your `Vulnerability` 
>> instances,
>> you could try defining your function as a lookup on CharField/TextField 
>> instead.
>>
>> You could then use it only when required by passing a tuple of values to 
>> it.
>>
>> e.g.
>>
>> Vulnerability.objects.filter(summary__wordsimilarity_gt=('test', 
>> threshold))
>>
>> If you need the value annotated I'm afraid the best solution would be to 
>> use
>> a custom paginator class. You could give a try at detecting whether or not
>> an annotation is referenced before removing it from query.annotations and
>> report your finding on the aforementioned ticket but I expect this to be
>> relatively hard to do correctly. Still it would make implementing 
>> `alias()`
>> way easier and provide help for users in the situation as you.
>>
>> Best,
>> Simon
>>
>> Le vendredi 24 novembre 2017 08:14:10 UTC-5, Cristiano Coelho a écrit :
>>>
>>> Hello Simon,
>>>
>>> That private API is good to know, but now that I think of it would still 
>>> not work for me, since my queryset is passed to a paginator and that's the 
>>> class that does both the count and actual queryset execution, so need a 
>>> queryset that can have both the annotation but also clears it if count is 
>>> called so it doesn't create the redundant sub queries.
>>>
>>> I'm wondering what's better, should I try to resolve this at the manager 
>>> level overriding count? I feel like this might cause issues if the 
>>> annotation is actually legit (ie with an aggregate) and it needs the 
>>> subquery after all.
>>> The other option is to subclass the paginator class with a special one 
>>> that does this annotation clearing before running count.
>>>
>>> Even with these cases, if the annotated value is used later with a 
>>> filter query I can't really simply removed, but the sub queries and extra 
>>> function calls really doesn't make sense to be there when doing a count, so 
>>> it seems that all the options are quite bad and hackish.
>>>
>>> Any other options?
>>>
>>>
>>> El viernes, 24 de noviembre de 2017, 1:12:07 (UTC-3), Simon Charette 
>>> escribió:
>>>>
>>>> Hello Cristiano,
>>>>
>>>> > Isn't there a way (even if it's hackish) to simply clear the 
>>>> annotations (and order by)? I know querysets are smart enough to not 
>>>> include the order by clause if there's a count.
>>>>
>>>> The ordering is cleared on `count()

Re: Strange query when using annotate and count

2017-11-24 Thread Cristiano Coelho
Hello Simon,

You are right, the score is really not meant to be attached so a custom 
lookup might work, if it wasn't for the issue that an order_by clause would 
fail without the annotation. 

So it seems it's either update and duplicate a lot of code or find a very 
ugly work around, I was hoping I missed some flag or implementation detail 
on the custom function to tell the query builder to avoid all the grouping 
and subquery stuff, will research more...


El viernes, 24 de noviembre de 2017, 14:39:27 (UTC-3), Simon Charette 
escribió:
>
> Hello Cristiano,
>
> If you are solely using this annotation for querying purpose, that means 
> you are
> not expecting a `score` attribute to be attached to your `Vulnerability` 
> instances,
> you could try defining your function as a lookup on CharField/TextField 
> instead.
>
> You could then use it only when required by passing a tuple of values to 
> it.
>
> e.g.
>
> Vulnerability.objects.filter(summary__wordsimilarity_gt=('test', 
> threshold))
>
> If you need the value annotated I'm afraid the best solution would be to 
> use
> a custom paginator class. You could give a try at detecting whether or not
> an annotation is referenced before removing it from query.annotations and
> report your finding on the aforementioned ticket but I expect this to be
> relatively hard to do correctly. Still it would make implementing `alias()`
> way easier and provide help for users in the situation as you.
>
> Best,
> Simon
>
> Le vendredi 24 novembre 2017 08:14:10 UTC-5, Cristiano Coelho a écrit :
>>
>> Hello Simon,
>>
>> That private API is good to know, but now that I think of it would still 
>> not work for me, since my queryset is passed to a paginator and that's the 
>> class that does both the count and actual queryset execution, so need a 
>> queryset that can have both the annotation but also clears it if count is 
>> called so it doesn't create the redundant sub queries.
>>
>> I'm wondering what's better, should I try to resolve this at the manager 
>> level overriding count? I feel like this might cause issues if the 
>> annotation is actually legit (ie with an aggregate) and it needs the 
>> subquery after all.
>> The other option is to subclass the paginator class with a special one 
>> that does this annotation clearing before running count.
>>
>> Even with these cases, if the annotated value is used later with a filter 
>> query I can't really simply removed, but the sub queries and extra function 
>> calls really doesn't make sense to be there when doing a count, so it seems 
>> that all the options are quite bad and hackish.
>>
>> Any other options?
>>
>>
>> El viernes, 24 de noviembre de 2017, 1:12:07 (UTC-3), Simon Charette 
>> escribió:
>>>
>>> Hello Cristiano,
>>>
>>> > Isn't there a way (even if it's hackish) to simply clear the 
>>> annotations (and order by)? I know querysets are smart enough to not 
>>> include the order by clause if there's a count.
>>>
>>> The ordering is cleared on `count()` because it shouldn't interfere with 
>>> the result in any way. You can clear annotations using 
>>> `queryset.query.annotations.clear()` but be aware that it is a private API 
>>> that could change under your feet. Make sure to make a copy of the queryset 
>>> (e.g. copy = queryset.all()) before performing this change as it will alter 
>>> the queryset in place.
>>>
>>> Best,
>>> Simon
>>>  
>>> Le jeudi 23 novembre 2017 22:41:41 UTC-5, Cristiano Coelho a écrit :
>>>>
>>>> Hello Simon, thanks for the response.
>>>>
>>>> The above code is just an example, the reason behind the annotate 
>>>> because there's some complicated code that builds a queryset and annotates 
>>>> it so it can easily be reused. It works fine 99% of the time except when 
>>>> there's a count involved and it ends up being redundant. The solution 
>>>> would 
>>>> be to not annotate anything and replace the code in multiple places to add 
>>>> the annotate call (or similar using a custom queryset or manager I guess), 
>>>> but that's quite painful and will end up with a lot of duplicated code 
>>>> since there's also an order_by that follows the annotate that needs to be 
>>>> moved over as well
>>>>
>>>> Isn't there a way (even if it's hackish) to simply clear the 
>>>> annotations (and order by)? I know querysets are smart enough to not 
>>>> include the order by clause if 

Re: Strange query when using annotate and count

2017-11-24 Thread Cristiano Coelho
Hello Simon,

That private API is good to know, but now that I think of it would still 
not work for me, since my queryset is passed to a paginator and that's the 
class that does both the count and actual queryset execution, so need a 
queryset that can have both the annotation but also clears it if count is 
called so it doesn't create the redundant sub queries.

I'm wondering what's better, should I try to resolve this at the manager 
level overriding count? I feel like this might cause issues if the 
annotation is actually legit (ie with an aggregate) and it needs the 
subquery after all.
The other option is to subclass the paginator class with a special one that 
does this annotation clearing before running count.

Even with these cases, if the annotated value is used later with a filter 
query I can't really simply removed, but the sub queries and extra function 
calls really doesn't make sense to be there when doing a count, so it seems 
that all the options are quite bad and hackish.

Any other options?


El viernes, 24 de noviembre de 2017, 1:12:07 (UTC-3), Simon Charette 
escribió:
>
> Hello Cristiano,
>
> > Isn't there a way (even if it's hackish) to simply clear the annotations 
> (and order by)? I know querysets are smart enough to not include the order 
> by clause if there's a count.
>
> The ordering is cleared on `count()` because it shouldn't interfere with 
> the result in any way. You can clear annotations using 
> `queryset.query.annotations.clear()` but be aware that it is a private API 
> that could change under your feet. Make sure to make a copy of the queryset 
> (e.g. copy = queryset.all()) before performing this change as it will alter 
> the queryset in place.
>
> Best,
> Simon
>  
> Le jeudi 23 novembre 2017 22:41:41 UTC-5, Cristiano Coelho a écrit :
>>
>> Hello Simon, thanks for the response.
>>
>> The above code is just an example, the reason behind the annotate because 
>> there's some complicated code that builds a queryset and annotates it so it 
>> can easily be reused. It works fine 99% of the time except when there's a 
>> count involved and it ends up being redundant. The solution would be to not 
>> annotate anything and replace the code in multiple places to add the 
>> annotate call (or similar using a custom queryset or manager I guess), but 
>> that's quite painful and will end up with a lot of duplicated code since 
>> there's also an order_by that follows the annotate that needs to be moved 
>> over as well
>>
>> Isn't there a way (even if it's hackish) to simply clear the annotations 
>> (and order by)? I know querysets are smart enough to not include the order 
>> by clause if there's a count.
>>
>> You could also suggest using two separate calls or a flag to pass down to 
>> the internal code so it doesn't include the additional stuff, but that 
>> wouldn't work since paginators accept only one query set for example and 
>> internall uses it for both count and results.
>>
>>
>>
>> El viernes, 24 de noviembre de 2017, 0:05:29 (UTC-3), Simon Charette 
>> escribió:
>>>
>>> Hello Cristiano,
>>>
>>> I understand your frustration but please avoid using the developer 
>>> mailing list
>>> as a second tier support channel. I suggest you try the IRC #django 
>>> channel if
>>> you need to want to get faster support.
>>>
>>> What's happening here is that annotate() really means "select this 
>>> field" while
>>> in your other case you use a lookup (summary__icontains) which are only 
>>> going to
>>> be added to the WHERE clause of your query.
>>>
>>> I'm not sure why you are annotating your queryset without referring to 
>>> it in
>>> a filter clause later on but the ORM cannot simply ignore it when you are
>>> performing your `count()` because some annotations could interfere with 
>>> grouping
>>> somehow.
>>>
>>> There is an open ticket[0] to add support for an `alias()` method that 
>>> would
>>> allow the ORM to clear/ignore the specified expressions if it's not 
>>> referenced
>>> in the query.
>>>
>>> In the mean time I think the best approach would be to avoid annotating 
>>> the
>>> queryset if your don't need to reference the score.
>>>
>>> Cheers,
>>> Simon
>>>
>>> [0] https://code.djangoproject.com/ticket/27719
>>>
>>> Le mardi 21 novembre 2017 08:46:21 UTC-5, Cristiano Coelho a écrit :
>>>>
>>>> Hmm, should I try with the dev mailing list? Guess it's something no 
>

Re: Strange query when using annotate and count

2017-11-23 Thread Cristiano Coelho
Hello Simon, thanks for the response.

The above code is just an example, the reason behind the annotate because 
there's some complicated code that builds a queryset and annotates it so it 
can easily be reused. It works fine 99% of the time except when there's a 
count involved and it ends up being redundant. The solution would be to not 
annotate anything and replace the code in multiple places to add the 
annotate call (or similar using a custom queryset or manager I guess), but 
that's quite painful and will end up with a lot of duplicated code since 
there's also an order_by that follows the annotate that needs to be moved 
over as well

Isn't there a way (even if it's hackish) to simply clear the annotations 
(and order by)? I know querysets are smart enough to not include the order 
by clause if there's a count.

You could also suggest using two separate calls or a flag to pass down to 
the internal code so it doesn't include the additional stuff, but that 
wouldn't work since paginators accept only one query set for example and 
internall uses it for both count and results.



El viernes, 24 de noviembre de 2017, 0:05:29 (UTC-3), Simon Charette 
escribió:
>
> Hello Cristiano,
>
> I understand your frustration but please avoid using the developer mailing 
> list
> as a second tier support channel. I suggest you try the IRC #django 
> channel if
> you need to want to get faster support.
>
> What's happening here is that annotate() really means "select this field" 
> while
> in your other case you use a lookup (summary__icontains) which are only 
> going to
> be added to the WHERE clause of your query.
>
> I'm not sure why you are annotating your queryset without referring to it 
> in
> a filter clause later on but the ORM cannot simply ignore it when you are
> performing your `count()` because some annotations could interfere with 
> grouping
> somehow.
>
> There is an open ticket[0] to add support for an `alias()` method that 
> would
> allow the ORM to clear/ignore the specified expressions if it's not 
> referenced
> in the query.
>
> In the mean time I think the best approach would be to avoid annotating the
> queryset if your don't need to reference the score.
>
> Cheers,
> Simon
>
> [0] https://code.djangoproject.com/ticket/27719
>
> Le mardi 21 novembre 2017 08:46:21 UTC-5, Cristiano Coelho a écrit :
>>
>> Hmm, should I try with the dev mailing list? Guess it's something no one 
>> faced before?
>>
>> El martes, 14 de noviembre de 2017, 22:54:23 (UTC-3), Cristiano Coelho 
>> escribió:
>>>
>>> I'm getting some very odd query when combining annotate with count. See 
>>> the following:
>>>
>>> >>> q = 
>>>> Vulnerability.objects.annotate(score=WordTrigramCustomSimilarity('test','summary'))
>>>> >>> q.count()
>>>> 3094
>>>> >>> print connection.queries[-1]
>>>> 'SELECT COUNT(*) 
>>>> FROM (
>>>> SELECT "vulnerabilities_vulnerability"."id" AS Col1, 
>>>> custom_word_similarity(\'test\', 
>>>> "vulnerabilities_vulnerability"."summary") 
>>>> AS "score" 
>>>> FROM "vulnerabilities_vulnerability" 
>>>> GROUP BY "vulnerabilities_vulnerability"."id", 
>>>> custom_word_similarity(\'test\', "vulnerabilities_vulnerability"."summary")
>>>> ) subquery
>>>> >>> q2 = Vulnerability.objects.filter(summary__icontains='test')
>>>> >>> q2.count()
>>>> 33
>>>> >>> print connection.queries[-1]
>>>> 'SELECT COUNT(*) AS "__count" 
>>>> FROM "vulnerabilities_vulnerability" 
>>>> WHERE UPPER("vulnerabilities_vulnerability"."summary"::text) LIKE 
>>>> UPPER(\'%test%\')
>>>
>>>
>>>
>>> Custom function code, is this what's causing the odd count behavior? Did 
>>> I miss anything?
>>>
>>> class WordTrigramCustomSimilarity(Func):
>>>> function = 'custom_word_similarity' 
>>>> def __init__(self, string, expression, **extra):
>>>> if not hasattr(string, 'resolve_expression'):
>>>> string = Value(string)
>>>> super(WordTrigramCustomSimilarity, self).__init__(string, 
>>>> expression, output_field=FloatField(), **extra)
>>>
>>>
>>> I would expect for the query to be a simple count, rather than a nested 
>>> query with a useless group by (correct me if I'm wrong).

Re: Strange query when using annotate and count

2017-11-21 Thread Cristiano Coelho
Hmm, should I try with the dev mailing list? Guess it's something no one 
faced before?

El martes, 14 de noviembre de 2017, 22:54:23 (UTC-3), Cristiano Coelho 
escribió:
>
> I'm getting some very odd query when combining annotate with count. See 
> the following:
>
> >>> q = 
>> Vulnerability.objects.annotate(score=WordTrigramCustomSimilarity('test','summary'))
>> >>> q.count()
>> 3094
>> >>> print connection.queries[-1]
>> 'SELECT COUNT(*) 
>> FROM (
>> SELECT "vulnerabilities_vulnerability"."id" AS Col1, 
>> custom_word_similarity(\'test\', "vulnerabilities_vulnerability"."summary") 
>> AS "score" 
>> FROM "vulnerabilities_vulnerability" 
>> GROUP BY "vulnerabilities_vulnerability"."id", 
>> custom_word_similarity(\'test\', "vulnerabilities_vulnerability"."summary")
>> ) subquery
>> >>> q2 = Vulnerability.objects.filter(summary__icontains='test')
>> >>> q2.count()
>> 33
>> >>> print connection.queries[-1]
>> 'SELECT COUNT(*) AS "__count" 
>> FROM "vulnerabilities_vulnerability" 
>> WHERE UPPER("vulnerabilities_vulnerability"."summary"::text) LIKE 
>> UPPER(\'%test%\')
>
>
>
> Custom function code, is this what's causing the odd count behavior? Did I 
> miss anything?
>
> class WordTrigramCustomSimilarity(Func):
>> function = 'custom_word_similarity' 
>> def __init__(self, string, expression, **extra):
>> if not hasattr(string, 'resolve_expression'):
>> string = Value(string)
>> super(WordTrigramCustomSimilarity, self).__init__(string, 
>> expression, output_field=FloatField(), **extra)
>
>
> I would expect for the query to be a simple count, rather than a nested 
> query with a useless group by (correct me if I'm wrong).
> The issue gets even worse if the function is expensive, since it gets 
> called when it's not needed at all, more than once.
> Also the issue behaves pretty much the same if the queryset includes 
> filtering and ordering but I didn't include it here for simplicity.
>
> Using Django 1.11.7 + postgres (psycopg) backend.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/b6e37c84-c09d-4486-af8a-e23c399cf975%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Strange query when using annotate and count

2017-11-14 Thread Cristiano Coelho
I'm getting some very odd query when combining annotate with count. See the 
following:

>>> q = 
> Vulnerability.objects.annotate(score=WordTrigramCustomSimilarity('test','summary'))
> >>> q.count()
> 3094
> >>> print connection.queries[-1]
> 'SELECT COUNT(*) 
> FROM (
> SELECT "vulnerabilities_vulnerability"."id" AS Col1, 
> custom_word_similarity(\'test\', "vulnerabilities_vulnerability"."summary") 
> AS "score" 
> FROM "vulnerabilities_vulnerability" 
> GROUP BY "vulnerabilities_vulnerability"."id", 
> custom_word_similarity(\'test\', "vulnerabilities_vulnerability"."summary")
> ) subquery
> >>> q2 = Vulnerability.objects.filter(summary__icontains='test')
> >>> q2.count()
> 33
> >>> print connection.queries[-1]
> 'SELECT COUNT(*) AS "__count" 
> FROM "vulnerabilities_vulnerability" 
> WHERE UPPER("vulnerabilities_vulnerability"."summary"::text) LIKE 
> UPPER(\'%test%\')



Custom function code, is this what's causing the odd count behavior? Did I 
miss anything?

class WordTrigramCustomSimilarity(Func):
> function = 'custom_word_similarity' 
> def __init__(self, string, expression, **extra):
> if not hasattr(string, 'resolve_expression'):
> string = Value(string)
> super(WordTrigramCustomSimilarity, self).__init__(string, 
> expression, output_field=FloatField(), **extra)


I would expect for the query to be a simple count, rather than a nested 
query with a useless group by (correct me if I'm wrong).
The issue gets even worse if the function is expensive, since it gets 
called when it's not needed at all, more than once.
Also the issue behaves pretty much the same if the queryset includes 
filtering and ordering but I didn't include it here for simplicity.

Using Django 1.11.7 + postgres (psycopg) backend.

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/a8ef80d5-26a5-40b4-b2b1-5cbf3bfa8c64%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: About channels and scaling up

2017-08-06 Thread Cristiano Coelho
I thought that the fact redis was being used (or any distributed store) 
wouldn't make this an issue, so basically redis is used to be able to scale 
out the workers but the web sockets listener can't currently be scaled out.

The main reason of the question was about being able to deploy everything 
together, a django application and also web sockets through the same AWS 
Elastic Beanstlak deployment, which would scale automatically and spawn new 
servers in the process, this would actually save some money compared to 
what we currently use which is basically a separate elastic beanstalk 
deployment with a nodejs + web sockets server as a single instance since it 
has the exact same issue, but since node is quite fast (way faster than 
python sadly) scaling is not really an issue.

Is there any place to look into the technical challenges and progress for 
this issue? 




El domingo, 6 de agosto de 2017, 20:43:41 (UTC-3), Andrew Godwin escribió:
>
> Yes, this is a problem, and as it says it would only work for some site 
> designs (where you're OK with sharding the live userbase). We're fixing 
> this architectural issue in the next major Channels release, but it's going 
> to take a little time.
>
> Andrew
>
> On Mon, Aug 7, 2017 at 4:21 AM, Cristiano Coelho <cristia...@gmail.com 
> > wrote:
>
>> About docs here: http://channels.readthedocs.io/en/stable/deploying.html
>>
>> "The fundamental difference is that the group mechanic requires all 
>> servers serving the same site to be able to see each other*; if you 
>> separate the site up and run it in a few, large clusters, messages to 
>> groups will only deliver to WebSockets connected to the same cluster.* 
>> For some site designs this will be fine, and if you think you can live with 
>> this and design around it (which means never designing anything around 
>> global notifications or events), this may be a good way to go."
>>
>> It's not clear to me, the exact meaning of this. Channels won't work 
>> properly if there are multiple servers (load balanced) running the 
>> ASGI/listener (interface server?) code? Does this mean a broadcast 
>> happening at one node will only deliver to all the clients connected to 
>> that node? Isn't the point of using redis to be able to broadcast to 
>> everyone across every server?
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Django users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to django-users...@googlegroups.com .
>> To post to this group, send email to django...@googlegroups.com 
>> .
>> Visit this group at https://groups.google.com/group/django-users.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/django-users/ef02b7bb-9d36-42ae-9cff-454bd2f3a558%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/django-users/ef02b7bb-9d36-42ae-9cff-454bd2f3a558%40googlegroups.com?utm_medium=email_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/b397d73a-850c-44f2-b5a6-959846fa45c7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Django Channels and multiple servers

2017-08-06 Thread Cristiano Coelho
For some reason my original question got lost.

http://channels.readthedocs.io/en/stable/deploying.html

I don't quite understand this part: 

The fundamental difference is that the group mechanic requires all servers 
serving the same site to be able to see each other; *if you separate the 
site up and run it in a few, large clusters, messages to groups will only 
deliver to WebSockets connected to the same cluster.* For some site designs 
this will be fine, and if you think you can live with this and design 
around it (which means never designing anything around global notifications 
or events), this may be a good way to go.

Does this mean if I run the ASGI/Interface server on multiple servers (load 
balanced), channels won't properly work since any broadcast to a group will 
only send messages to clients connected to that cluster and not others? 
What's the point of using redis? I would assume the whole groups concept 
would be able to broadcast messages back to every client regardless of how 
many web sockets servers you have running.

What I'm trying to find out is, if I run multiple servers that are deployed 
automatically and load balanced (such as amazon elastic beanstalk), where 
each server runs a standard django app, and also django channels both 
interface and workers, would it still work, or the fact that there might be 
multiple servers listening to the web sockets connections will pretty much 
break any messages broad casting? The wording from the documentation seems 
a bit unclear.

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/0d7b2371-7db9-491b-a7db-b51469748beb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


About channels and scaling up

2017-08-06 Thread Cristiano Coelho
About docs here: http://channels.readthedocs.io/en/stable/deploying.html

"The fundamental difference is that the group mechanic requires all servers 
serving the same site to be able to see each other*; if you separate the 
site up and run it in a few, large clusters, messages to groups will only 
deliver to WebSockets connected to the same cluster.* For some site designs 
this will be fine, and if you think you can live with this and design 
around it (which means never designing anything around global notifications 
or events), this may be a good way to go."

It's not clear to me, the exact meaning of this. Channels won't work 
properly if there are multiple servers (load balanced) running the 
ASGI/listener (interface server?) code? Does this mean a broadcast 
happening at one node will only deliver to all the clients connected to 
that node? Isn't the point of using redis to be able to broadcast to 
everyone across every server?

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/ef02b7bb-9d36-42ae-9cff-454bd2f3a558%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.