Re: Improving aggregate support (#14030)

2014-03-30 Thread Josh Smeaton
Code complete: https://github.com/django/django/pull/2496

- Aggregates have learnt arithmetic and how to interact with F() expressions
- Annotations can now accept non-aggregate expressions

I'm in the process of writing documentation, some of which I've committed 
and is available within the PR. I plan to have the documentation in a ready 
for review state next weekend.

I'd really appreciate some reviewing of the code and the approach though, 
ideally before master diverges too much and makes merging too difficult.

Cheers,

Josh

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/6d07962d-7fc4-4ca2-9fa6-564a4e759346%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Improving aggregate support (#14030)

2014-01-19 Thread Michael Manfre
There are a few other things that I need to try and get in before the 1.7
feature freeze. Since this is not slated for 1.7, I'll take a look a the PR
after the freeze and see if the comments I had still apply.

Regards,
Michael Manfre


On Sun, Jan 19, 2014 at 8:00 AM, Josh Smeaton wrote:

> I've finally sent a PR, so if you're still able, I'd love to see the
> specific comments you had.
>
> https://github.com/django/django/pull/2184
>
> Cheers,
>
> - Josh
>
> On Sunday, 12 January 2014 10:35:58 UTC+11, Michael Manfre wrote:
>
>> With minor tweaks to django-mssql's SQLCompiler, I am able to pass the
>> aggregation and aggregation_regress tests. If you create the pull request,
>> comments can be attached to specific lines of code.
>>
>> I haven't had time to do a full review of your changes, but my initial
>> questions and comments are:
>>
>> - Why were the aggregate specific fields, properties, etc. added to
>> ExpressionNode, instead of being on django.db.models.aggregates?
>> - Why does an ExpressionNode containing any Aggregates need to pollute
>> the entire tree with that knowledge?
>> - Instead of using is_ordinal and is_computed (confusing name, why not
>> is_float), you could eliminate those and define output_type for each
>> Aggregate that does something special.
>> - What does is_summary mean?
>>
>> Regards,
>> Michael Manfre
>>
>>
>> On Sat, Jan 11, 2014 at 3:06 AM, Josh Smeaton wrote:
>>
>>> A milestone has been reached. The diff is here: https://github.com/
>>> jarshwah/django/compare/akaariai:lookups_3...aggregates-14030?expand=1
>>>
>>>- Existing annotations and aggregations working as they were
>>>- Complex annotations and complex aggregations now work
>>>- Complex annotations can now be referenced by an aggregation
>>>- Cleaned up the logic relating to aggregating over an annotation
>>>
>>> See the tests for the new functionality here: https://github.com/
>>> jarshwah/django/blob/aggregates-14030/tests/aggregation/tests.py#L662
>>>
>>> Things I still need to tackle:
>>>
>>>- GIS
>>>- Custom backend implementation of aggregates and annotations
>>>- Annotations that aren't aggregates. I think this should be
>>>extremely easy, but I might not be aware of some underlying difficulties.
>>>
>>> Things that should be done in the future, as I don't think there will be
>>> time for this just now:
>>>
>>>- Change order_by (and other functions) to accept Expressions.
>>>
>>> I would think that, as it stands, this code is ready for review with the
>>> above caveats. If there are any backend authors reading (mssql,
>>> django-nonrel), I would encourage you to look at the implementation here,
>>> and let me know if there's anything glaringly obvious (to you) that is
>>> missing or broken.
>>>
>>> Regards,
>>>
>>> - Josh
>>>
>>> On Thursday, 9 January 2014 00:33:08 UTC+11, Josh Smeaton wrote:

 I'll preface by saying it's late for me, and I'm about to sign off, so
 any mistakes below I will correct tomorrow after some sleep.


> To write a custom aggregate one needed two classes, one for user
> facing API, one for implementing that API for SQL queries.


 I've taken note of this already, and that is the reason I've kept the
 sql.Aggregate class where it is, though it's not used internally anymore.
 There are two options that I can see right now:

 - The non-compatible way involves moving the sql_function attribute to
 the Aggregate from the SQLAggregate, and everything should work fine
 - The backward-compatible way might involve re-adding the
 "add_to_query" function, and using the SQLAggregate to monkeypatch the
 Aggregate by copying over the sql_function attribute. I would have to
 experiment and test this though. I assume this would be the more
 appropriate option. We can include a DeprecationWarning to phase it out.


> Another case you should check is how GIS works *without* rewriting
> anything in django.contrib.gis


 Good idea, that's how I'll approach the aggregate stuff. Not sure that
 I can leave it completely alone due to Expressions though, but we'll see
 (and hope).

 I'll try to check your work in detail as soon as possible.
> Unfortunately I am very busy at the moment, so I can't guarantee anything.


 Totally understandable, I appreciate your comments regardless. It's
 just unfortunate there aren't more people deeply familiar with the ORM -
 it's bloody complicated :P

 I'm now to the point where all tests in aggregate, aggregate_regress,
 expressions, expressions_regress, and lookup pass. I've got a small problem
 with .count() regressions failing, but am working through that now.
 Tomorrow I should be able to start playing with new functionality and
 writing new tests. I'll also take a look into running tests on GIS and
 fixing any 

Re: Improving aggregate support (#14030)

2014-01-19 Thread Josh Smeaton
I've finally sent a PR, so if you're still able, I'd love to see the 
specific comments you had.

https://github.com/django/django/pull/2184

Cheers,

- Josh

On Sunday, 12 January 2014 10:35:58 UTC+11, Michael Manfre wrote:
>
> With minor tweaks to django-mssql's SQLCompiler, I am able to pass the 
> aggregation and aggregation_regress tests. If you create the pull request, 
> comments can be attached to specific lines of code.
>
> I haven't had time to do a full review of your changes, but my initial 
> questions and comments are:
>
> - Why were the aggregate specific fields, properties, etc. added to 
> ExpressionNode, instead of being on django.db.models.aggregates?
> - Why does an ExpressionNode containing any Aggregates need to pollute the 
> entire tree with that knowledge?
> - Instead of using is_ordinal and is_computed (confusing name, why not 
> is_float), you could eliminate those and define output_type for each 
> Aggregate that does something special.
> - What does is_summary mean?
>
> Regards,
> Michael Manfre
>
>
> On Sat, Jan 11, 2014 at 3:06 AM, Josh Smeaton 
>  > wrote:
>
>> A milestone has been reached. The diff is here: 
>> https://github.com/jarshwah/django/compare/akaariai:lookups_3...aggregates-14030?expand=1
>>
>>- Existing annotations and aggregations working as they were
>>- Complex annotations and complex aggregations now work
>>- Complex annotations can now be referenced by an aggregation
>>- Cleaned up the logic relating to aggregating over an annotation
>>
>> See the tests for the new functionality here: 
>> https://github.com/jarshwah/django/blob/aggregates-14030/tests/aggregation/tests.py#L662
>>
>> Things I still need to tackle:
>>
>>- GIS
>>- Custom backend implementation of aggregates and annotations 
>>- Annotations that aren't aggregates. I think this should be 
>>extremely easy, but I might not be aware of some underlying difficulties.
>>
>> Things that should be done in the future, as I don't think there will be 
>> time for this just now:
>>
>>- Change order_by (and other functions) to accept Expressions.
>>
>> I would think that, as it stands, this code is ready for review with the 
>> above caveats. If there are any backend authors reading (mssql, 
>> django-nonrel), I would encourage you to look at the implementation here, 
>> and let me know if there's anything glaringly obvious (to you) that is 
>> missing or broken.
>>
>> Regards,
>>
>> - Josh
>>
>> On Thursday, 9 January 2014 00:33:08 UTC+11, Josh Smeaton wrote:
>>>
>>> I'll preface by saying it's late for me, and I'm about to sign off, so 
>>> any mistakes below I will correct tomorrow after some sleep.
>>>  
>>>
 To write a custom aggregate one needed two classes, one for user facing 
 API, one for implementing that API for SQL queries.
>>>
>>>
>>> I've taken note of this already, and that is the reason I've kept the 
>>> sql.Aggregate class where it is, though it's not used internally anymore. 
>>> There are two options that I can see right now:
>>>
>>> - The non-compatible way involves moving the sql_function attribute to 
>>> the Aggregate from the SQLAggregate, and everything should work fine
>>> - The backward-compatible way might involve re-adding the "add_to_query" 
>>> function, and using the SQLAggregate to monkeypatch the Aggregate by 
>>> copying over the sql_function attribute. I would have to experiment and 
>>> test this though. I assume this would be the more appropriate option. We 
>>> can include a DeprecationWarning to phase it out.
>>>  
>>>
 Another case you should check is how GIS works *without* rewriting 
 anything in django.contrib.gis
>>>
>>>
>>> Good idea, that's how I'll approach the aggregate stuff. Not sure that I 
>>> can leave it completely alone due to Expressions though, but we'll see (and 
>>> hope). 
>>>
>>> I'll try to check your work in detail as soon as possible. Unfortunately 
 I am very busy at the moment, so I can't guarantee anything.
>>>
>>>
>>> Totally understandable, I appreciate your comments regardless. It's just 
>>> unfortunate there aren't more people deeply familiar with the ORM - it's 
>>> bloody complicated :P
>>>
>>> I'm now to the point where all tests in aggregate, aggregate_regress, 
>>> expressions, expressions_regress, and lookup pass. I've got a small problem 
>>> with .count() regressions failing, but am working through that now. 
>>> Tomorrow I should be able to start playing with new functionality and 
>>> writing new tests. I'll also take a look into running tests on GIS and 
>>> fixing any errors that rear their heads there.
>>>
>>> Cheers,
>>>
>>> - Josh
>>>  
>>>
>>> On Thursday, 9 January 2014 00:05:23 UTC+11, Anssi Kääriäinen wrote:

 On Friday, January 3, 2014 6:34:13 PM UTC+2, Josh Smeaton wrote:
>
> I now have all tests passing on Postgres and SQLite (See 
> 

Re: Improving aggregate support (#14030)

2014-01-16 Thread Josh Smeaton
That's fine, I knew I was cutting it close when I started. At least I can start 
getting to bed at a reasonable time now :)

I was wondering where the dev Irc channel was, so I'll make use of that too. 
Thanks for the pointers so far. 

- Josh

> On 16 Jan 2014, at 19:36, Anssi Kääriäinen  wrote:
> 
> I'm sorry I haven't had enough time to review your work. Unfortunately I 
> won't have any more time in the near future either. So, it looks like this 
> has to wait for 1.8. Sorry for that.
> 
> If you want to ask specific questions, #django-dev IRC channel is a good 
> place to ask those questions. It is often easier to solve design issues 
> real-time than on a mailing list.
> 
>  - Anssi
> 
>> On Wednesday, January 15, 2014 3:57:33 PM UTC+2, Josh Smeaton wrote:
>> Quick update. GIS support is nearly done, I just have two failing tests to 
>> take care of (one to do with lookups setting srid, and another that is 
>> producing the wrong value). I have a few validation checks to make inside 
>> prepare, and then it's on to writing tests that make use of aggregates and 
>> expressions together. I'm unsure what kind of expressions I can write with 
>> GIS queries though, so if anyone has any suggestions, I'd be glad to 
>> implement them.
>> 
>> Examples of tests with regular aggregates would be:
>> 
>> .annotate(Sum('field')+Sum('anotherfield'))
>> .annotate(Max('field')*2)
>> 
>> I'm not sure how you'd compose GeoAggregates as I've never used them before.
>> 
>> I should mention that I've only run GIS tests against PostGIS. I've noticed 
>> that Oracle is special cased in a number of areas, so I'd appreciate any 
>> help I could receive in testing against Oracle spatial.
>> 
>> The exact commit is: 
>> https://github.com/jarshwah/django/commit/c892c0076398e6aad134fc8df549a54a94ad9dd1
>> 
>> There was a bug in gis/sql/compiler that passed the quote_name function 
>> instead of itself to the expression. Whether or not this change is accepted, 
>> that bug should be fixed in master as it breaks the query expression API.
>> 
>> Cheers,
>> 
>> - Josh
> 
> -- 
> You received this message because you are subscribed to a topic in the Google 
> Groups "Django developers" group.
> To unsubscribe from this topic, visit 
> https://groups.google.com/d/topic/django-developers/8vEAwSwJGMc/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to 
> django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/7c5a1866-a414-48c6-af4a-6477b8b5b79a%40googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/B0BAF75C-52D4-49A8-8939-70D2E96E32C2%40gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2014-01-16 Thread Anssi Kääriäinen
I'm sorry I haven't had enough time to review your work. Unfortunately I 
won't have any more time in the near future either. So, it looks like this 
has to wait for 1.8. Sorry for that.

If you want to ask specific questions, #django-dev IRC channel is a good 
place to ask those questions. It is often easier to solve design issues 
real-time than on a mailing list.

 - Anssi

On Wednesday, January 15, 2014 3:57:33 PM UTC+2, Josh Smeaton wrote:
>
> Quick update. GIS support is nearly done, I just have two failing tests to 
> take care of (one to do with lookups setting srid, and another that is 
> producing the wrong value). I have a few validation checks to make inside 
> prepare, and then it's on to writing tests that make use of aggregates and 
> expressions together. I'm unsure what kind of expressions I can write with 
> GIS queries though, so if anyone has any suggestions, I'd be glad to 
> implement them.
>
> Examples of tests with regular aggregates would be:
>
> .annotate(Sum('field')+Sum('anotherfield'))
> .annotate(Max('field')*2)
>
> I'm not sure how you'd compose GeoAggregates as I've never used them 
> before.
>
> I should mention that I've only run GIS tests against PostGIS. I've 
> noticed that Oracle is special cased in a number of areas, so I'd 
> appreciate any help I could receive in testing against Oracle spatial.
>
> The exact commit is: 
> https://github.com/jarshwah/django/commit/c892c0076398e6aad134fc8df549a54a94ad9dd1
>
> There was a bug in gis/sql/compiler that passed the quote_name function 
> instead of itself to the expression. Whether or not this change is 
> accepted, that bug should be fixed in master as it breaks the query 
> expression API.
>
> Cheers,
>
> - Josh
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/7c5a1866-a414-48c6-af4a-6477b8b5b79a%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2014-01-15 Thread Josh Smeaton
Quick update. GIS support is nearly done, I just have two failing tests to 
take care of (one to do with lookups setting srid, and another that is 
producing the wrong value). I have a few validation checks to make inside 
prepare, and then it's on to writing tests that make use of aggregates and 
expressions together. I'm unsure what kind of expressions I can write with 
GIS queries though, so if anyone has any suggestions, I'd be glad to 
implement them.

Examples of tests with regular aggregates would be:

.annotate(Sum('field')+Sum('anotherfield'))
.annotate(Max('field')*2)

I'm not sure how you'd compose GeoAggregates as I've never used them before.

I should mention that I've only run GIS tests against PostGIS. I've noticed 
that Oracle is special cased in a number of areas, so I'd appreciate any 
help I could receive in testing against Oracle spatial.

The exact commit is: 
https://github.com/jarshwah/django/commit/c892c0076398e6aad134fc8df549a54a94ad9dd1

There was a bug in gis/sql/compiler that passed the quote_name function 
instead of itself to the expression. Whether or not this change is 
accepted, that bug should be fixed in master as it breaks the query 
expression API.

Cheers,

- Josh

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/21368cac-3b12-4545-a634-9b501bf4f74c%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2014-01-14 Thread Josh Smeaton
I've also written a basic hack (there's no other word for it really) that 
should support the majority of custom aggregates out in the wild. The exact 
commit is here:

https://github.com/jarshwah/django/commit/74c945db2e12444dd012cf89dbf65dddce84ba7b

Basically, it checks to see if add_to_query exists, call it with a known 
alias so that we can then extract the SQLAggregate from query.aggregates, 
and steals the sql_function and sql_template attributes. That said, to 
support the "new" way, the user just needs to copy those two attributes 
onto the models/aggregate, and they now support both the old and new way.

Unfortunately, this doesn't work for GIS. I'll need to port GeoAggregate to 
work like a standard Aggregate, but the work there is extremely minimal. It 
would also gain the support I've built in above for custom Geo Aggregates. 
The "hard" work for GIS is changing the models/query and models/sql/query 
as I've done for core.

Cheers,

- Josh

On Tuesday, 14 January 2014 17:56:29 UTC+11, Josh Smeaton wrote:
>
> Vendor specific SQL is now possible, and uses the monkey patching 
> described above. I worked around the problem of temporarily changing the 
> template, by putting the template and the function into the extra dict on 
> init. Any modifications made to the dict are then made for that instance 
> only. Complicated overrides are possible, giving full control when needed, 
> but calling super after modifying the dicts should be the preferred way of 
> patching.
>
> Relevant code:
>
>
> https://github.com/jarshwah/django/blob/a595a87ea9d90a62ffdf03ced42f1b0904f25e25/django/db/models/expressions.py#L269
>
> https://github.com/jarshwah/django/blob/a595a87ea9d90a62ffdf03ced42f1b0904f25e25/django/db/models/expressions.py#L280
>
> And the tests:
>
>
> https://github.com/jarshwah/django/blob/a595a87ea9d90a62ffdf03ced42f1b0904f25e25/tests/aggregation/tests.py#L800
>
> One of the relevant tests, showing the most simple way of overriding an 
> aggregate:
>
> def lower_case_function_super(self, qn, connection):
> self.extra['sql_function'] = self.extra['sql_function'].lower()
> return super(Sum, self).as_sql(qn, connection)
> setattr(Sum, 'as_%s' % connection.vendor, 
> lower_case_function_super)
>
> Cheers,
>
> - Josh
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/e2f6cd30-63d3-4c50-8b2e-3fa3bff5c0c1%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2014-01-13 Thread Josh Smeaton
Vendor specific SQL is now possible, and uses the monkey patching described 
above. I worked around the problem of temporarily changing the template, by 
putting the template and the function into the extra dict on init. Any 
modifications made to the dict are then made for that instance only. 
Complicated overrides are possible, giving full control when needed, but 
calling super after modifying the dicts should be the preferred way of 
patching.

Relevant code:

https://github.com/jarshwah/django/blob/a595a87ea9d90a62ffdf03ced42f1b0904f25e25/django/db/models/expressions.py#L269
https://github.com/jarshwah/django/blob/a595a87ea9d90a62ffdf03ced42f1b0904f25e25/django/db/models/expressions.py#L280

And the tests:

https://github.com/jarshwah/django/blob/a595a87ea9d90a62ffdf03ced42f1b0904f25e25/tests/aggregation/tests.py#L800

One of the relevant tests, showing the most simple way of overriding an 
aggregate:

def lower_case_function_super(self, qn, connection):
self.extra['sql_function'] = self.extra['sql_function'].lower()
return super(Sum, self).as_sql(qn, connection)
setattr(Sum, 'as_%s' % connection.vendor, lower_case_function_super)

Cheers,

- Josh

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/725aab28-d513-4820-ba78-e8a42d493842%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2014-01-13 Thread Josh Smeaton
I've rebased off your current lookups pull request to drag in all the 
changes made recently, and I pulled against upstream/master which has made 
looking at the diff between our branches extremely painful. Once your PR is 
applied to master, I'll be able to submit a proper PR though (after some 
squashing).

Onto the detail.

Maybe both Aggregate and Expression could implement the query expression 
> API (that is, those things Col class has in the lookups patch). This way 
> F('author__age') + F('author__books_sold') would translate to:
>   Expression(+):
> Col(author, age)
> Col(author, books_sold)


Absolutely that should be the goal. I've already written Expressions to 
support the query expression API (I had to, since lookups expected 
aggregates to support it), but I still can't use the Transforms because, as 
you said, build_filter() does all the work currently.

Internally this would mean that Expression just generates the connector 
> between nodes, but doesn't produce any SQL for the nodes itself. It seems 
> Expression is trying to do too much currently. It produces connector 
> between nodes when it is internal node and is also generates SQL for the 
> leaf nodes, too.


You were right, it was doing way too much. I've refactored Expressions 
significantly so that it does much much less. All the leaf nodes have 
changed their get_sql() methods to as_sql() meaning that they now support 
as_vendorname(), since the compiler is what generates the SQL. I still need 
to test as_vendorname to confirm I haven't missed anything, but it should 
work with little modification now. I'll write those tests tonight.

I started this refactoring with the hope that it'd be able to make the 
January 20th cutoff, but that's not going to be possible if I aim for 
getting non-aggregate annotations in. It needs more thought, and 
potentially a lot of work. However, I feel the work done up until this 
point gives a lot more power to aggregates that was not possible before. 
I'd like to propose submitting aggregate improvements now, closing 14030, 
and opening another ticket for supporting non-aggregate annotations.

Getting this change included relies on a few things though;

- There needs to be time for you (or someone else) to properly review the 
change. With only 6 or so days to go, I think this is asking too much.
- Docs need to be written
- I still need to sort out GIS
- Support for 3rd party backends (or non-sql backends) needs to be 
evaluated. The maintainer of django-mssql has taken a cursory look, but 
review seemed mostly positive (it worked with minimal changes). This does 
not necessarily mean that non-sql backends will be fine though.
- Tests need to be run on mysql and oracle. I couldn't get the tests 
running on mysql (errors regarding long index names when creating the DB), 
or oracle, since I only have oracle cluster at my disposal, which doesn't 
support creating the test database. I can run up an Oracle XE, but I won't 
do that just now if 1.8 is going to be the target.
- Cleaning up the commits
- Waiting for Lookups to land on master so a clean PR can be submitted

What are your thoughts?

Whether or not this is picked up for 1.7 or 1.8, I'll continue working on 
the patch and keeping it up to date with master so it doesn't go stale. 
That's happened enough times :P

Cheers

- Josh



On Sunday, 12 January 2014 21:28:33 UTC+11, Anssi Kääriäinen wrote:
>
> On Sunday, January 12, 2014 10:52:03 AM UTC+2, Josh Smeaton wrote:
>>
>> Any idea how to do this?
>>>
>>> The new way in custom lookups is trying to first call as_vendorname(qn, 
>>> connection), if that doesn't exist then call the standard implementation, 
>>> that is as_sql(). For example on sqlite Django will first try to call 
>>> as_sqlite(), if that doesn't exist then as_sql() will be called.
>>>
>>
>> I've been following your PR and noted the change (and made some comments 
>> too) away from @add_implementation. I'm having a bit of an internal debate 
>> about how I've structured expressions with respect to as_sql though. Each 
>> expression also has a get_sql() method that as_sql() calls internally on 
>> leaf nodes. I'm thinking it'd be best for the extension point to be on 
>> get_sql(), but it'd be inconsistent with the Lookups API. The problem with 
>> overriding as_sql() would be that the extender would also need to know to 
>> recurse the entire tree which is not a good idea at all. Would the 
>> inconsistency be acceptable? When I spend some time actually trying out 
>> swappable implementations, this may all become clearer to me.
>>
>
> Maybe both Aggregate and Expression could implement the query expression 
> API (that is, those things Col class has in the lookups patch). This way 
> F('author__age') + F('author__books_sold') would translate to:
>   Expression(+):
> Col(author, age)
> Col(author, books_sold)
>
> If you write F('author__age') + F(Sum('author__books__pages'))) this would 
> translate to:
>   

Re: Improving aggregate support (#14030)

2014-01-12 Thread Anssi Kääriäinen
On Sunday, January 12, 2014 10:52:03 AM UTC+2, Josh Smeaton wrote:
>
> Any idea how to do this?
>>
>> The new way in custom lookups is trying to first call as_vendorname(qn, 
>> connection), if that doesn't exist then call the standard implementation, 
>> that is as_sql(). For example on sqlite Django will first try to call 
>> as_sqlite(), if that doesn't exist then as_sql() will be called.
>>
>
> I've been following your PR and noted the change (and made some comments 
> too) away from @add_implementation. I'm having a bit of an internal debate 
> about how I've structured expressions with respect to as_sql though. Each 
> expression also has a get_sql() method that as_sql() calls internally on 
> leaf nodes. I'm thinking it'd be best for the extension point to be on 
> get_sql(), but it'd be inconsistent with the Lookups API. The problem with 
> overriding as_sql() would be that the extender would also need to know to 
> recurse the entire tree which is not a good idea at all. Would the 
> inconsistency be acceptable? When I spend some time actually trying out 
> swappable implementations, this may all become clearer to me.
>

Maybe both Aggregate and Expression could implement the query expression 
API (that is, those things Col class has in the lookups patch). This way 
F('author__age') + F('author__books_sold') would translate to:
  Expression(+):
Col(author, age)
Col(author, books_sold)

If you write F('author__age') + F(Sum('author__books__pages'))) this would 
translate to:
  Expression(+):
Col(author, age)
Sum(Col(books, pages))

The SQL would be generated by:
Expression.as_sql(qn, connection):
params = []
sql = []
for node in self.children:
node_sql, node_params = qn.compile(node)
sql.append(node_sql)
params.extend(node_params)
return ' + '.join(sql), params

Note that it doesn't matter at all if the children is an aggregate, a col 
reference of another Expression.

You would also gain support for F('author__birthdate__year'), that is your 
extract lookups are available. An extract is after all just another class 
implementing the query expression API. Full support needs a bit of 
refactoring to the Query class as expressions are currently only resolved 
in build_filter(). A generic get_ref(lookup_path) method is needed.

Internally this would mean that Expression just generates the connector 
between nodes, but doesn't produce any SQL for the nodes itself. It seems 
Expression is trying to do too much currently. It produces connector 
between nodes when it is internal node and is also generates SQL for the 
leaf nodes, too.

This is just an idea. To me it seems this is cleaner, as now Expression 
node is just connecting two parts implementing the sql expression API. OTOH 
this is really large refactoring, so maybe it is better to go with the 
current patch and then check if things can be improved further later on.

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/ad2fbf0a-4b0b-4bab-9611-f0c915e4a361%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2014-01-12 Thread Josh Smeaton

>
> Any idea how to do this?
>
> The new way in custom lookups is trying to first call as_vendorname(qn, 
> connection), if that doesn't exist then call the standard implementation, 
> that is as_sql(). For example on sqlite Django will first try to call 
> as_sqlite(), if that doesn't exist then as_sql() will be called.
>

I've been following your PR and noted the change (and made some comments 
too) away from @add_implementation. I'm having a bit of an internal debate 
about how I've structured expressions with respect to as_sql though. Each 
expression also has a get_sql() method that as_sql() calls internally on 
leaf nodes. I'm thinking it'd be best for the extension point to be on 
get_sql(), but it'd be inconsistent with the Lookups API. The problem with 
overriding as_sql() would be that the extender would also need to know to 
recurse the entire tree which is not a good idea at all. Would the 
inconsistency be acceptable? When I spend some time actually trying out 
swappable implementations, this may all become clearer to me.

>
>>- Annotations that aren't aggregates. I think this should be 
>>extremely easy, but I might not be aware of some underlying difficulties.
>>
>> I don't think it is that easy. Django assumes that if any annotation 
>> exist, then the query is an aggregation query. But of course that doesn't 
>> hold if the above is done.
>
>
You're right. I was hoping that I could just append nodes into the 
query.select list, but that was a dead end. 

I've been playing around with non-aggregate annotations today, and I think 
it's almost fine to just put them directly into the query.aggregates 
dictionary (and rename it to annotations). It works in trivial cases only, 
by checking the is_aggregate property. F() expressions over annotations are 
forced into the HAVING part. Aggregations over annotations fail, because 
the annotation is promoted to the select list. I think both of these 
problems I can overcome fairly easy, but that doesn't mean there won't be 
more issues.

An alternative would be to special case non-aggregate annotations in a 
similar way that aggregates are treated in sql/query now. I don't like this 
idea at all, because too much needs to change. I'm trying to change as 
little as possible in sql/query and sql/compiler.

order_by(RefSQL("case when {{author__birthdate__year}} > %s then 1 else 0 
> end", [1981])). Joins generated by Django, and you can use any custom 
> lookups you have. No need for .extra() any more!


That's quite interesting. I've already introduced the concept of a 
ValueNode, which is an expression that outputs whatever value you give it. 
Already one could write .annotate(ValueNode("case when ...")) as a string, 
and it'd work, but that misses the {{author__birthdate__year}} stuff in 
your example. Once I've got my main points implemented, I was going to 
experiment with something like:

.annotate(Case(condition=Q(author__birthdate__year__gt=1891), true=1, 
false=0))

The actual use case I had in mind was in reference to #11305 (Conditional 
Aggregates):

.annotate(Count(Case(condition=Q(author__birthdate__year__gt=1891), true=1, 
false=0))

But that's just speculation at the moment. More work needs to be done to 
see what that'd look like in annotations, but then the path is clear for 
implementing it with order_by and friends.

Cheers,

- Josh

On Sunday, 12 January 2014 19:11:11 UTC+11, Anssi Kääriäinen wrote:
>
> On Saturday, January 11, 2014 10:06:14 AM UTC+2, Josh Smeaton wrote:
>>
>>
>>- Custom backend implementation of aggregates and annotations
>>
>> Any idea how to do this?
>
> The new way in custom lookups is trying to first call as_vendorname(qn, 
> connection), if that doesn't exist then call the standard implementation, 
> that is as_sql(). For example on sqlite Django will first try to call 
> as_sqlite(), if that doesn't exist then as_sql() will be called.
>
> For aggregates I guess monkey-patch addition of as_vendorname() will work. 
> It is a bit ugly to write the custom implementations. And as the function 
> template for an aggregate is defined as a variable in the aggregate itself 
> changing that isn't going to be that easy. You can't just change the 
> template and call super - if the same query is executed on different 
> backend then things will not work. I guess what is needed is some way of 
> non-permanent change of the template. Possibilities seem to be addition of 
> template kwarg to as_sql() so you can just do:
> def as_mysql():
> super(qn, connection, template='MYSQLCOUNT')
>
> Alternatively there could be some way to clone the aggregate into backend 
> specific class if needed.
>
>>
>>- Annotations that aren't aggregates. I think this should be 
>>extremely easy, but I might not be aware of some underlying difficulties.
>>
>> I don't think it is that easy. Django assumes that if any annotation 
> exist, then the query is an aggregation query. But of course that doesn't 
> hold 

Re: Improving aggregate support (#14030)

2014-01-12 Thread Anssi Kääriäinen
On Saturday, January 11, 2014 10:06:14 AM UTC+2, Josh Smeaton wrote:
>
>
>- Custom backend implementation of aggregates and annotations
>
> Any idea how to do this?

The new way in custom lookups is trying to first call as_vendorname(qn, 
connection), if that doesn't exist then call the standard implementation, 
that is as_sql(). For example on sqlite Django will first try to call 
as_sqlite(), if that doesn't exist then as_sql() will be called.

For aggregates I guess monkey-patch addition of as_vendorname() will work. 
It is a bit ugly to write the custom implementations. And as the function 
template for an aggregate is defined as a variable in the aggregate itself 
changing that isn't going to be that easy. You can't just change the 
template and call super - if the same query is executed on different 
backend then things will not work. I guess what is needed is some way of 
non-permanent change of the template. Possibilities seem to be addition of 
template kwarg to as_sql() so you can just do:
def as_mysql():
super(qn, connection, template='MYSQLCOUNT')

Alternatively there could be some way to clone the aggregate into backend 
specific class if needed.

>
>- Annotations that aren't aggregates. I think this should be extremely 
>easy, but I might not be aware of some underlying difficulties.
>
> I don't think it is that easy. Django assumes that if any annotation 
exist, then the query is an aggregation query. But of course that doesn't 
hold if the above is done.
 

> Things that should be done in the future, as I don't think there will be 
> time for this just now:
>
>- Change order_by (and other functions) to accept Expressions.
>
> Not just Expressions but anything implementing query expression API. That 
is, it would be very useful to be able to do 
qs.order_by('birthdate__year'). And if this is done it will be actually 
pretty easy to inject custom SQL into the queries. It will be relatively 
easy to write a RefSQL class which allows one to do: .order_by(RefSQL("case 
when {{author__birthdate__year}} > %s then 1 else 0 end", [1981])). Joins 
generated by Django, and you can use any custom lookups you have. No need 
for .extra() any more!

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/82973329-da5f-451a-a3d0-c45b3d1ef850%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2014-01-11 Thread Josh Smeaton
Hi Michael,

Thanks for taking a look, much appreciated. I haven't created a PR yet 
because I've rebased on top of Ansii's Lookups refactor which hasn't hit 
Master yet (and is changing slightly based on feedback of his PR). To 
answer your questions:

Why were the aggregate specific fields, properties, etc. added to 
> ExpressionNode, instead of being on django.db.models.aggregates?
>

Originally they were, but then Sum('field') + Sum('anotherfield') didn't 
work as that expression produces an ExpressionNode with AggregateNode 
children. That ExpressionNode (+ (Sum, Sum)) has to behave like an 
Aggregate. The previous SQLEvaluator had specific methods that related to 
F() expressions and DateModifierNodes for exactly this reason, which have 
now been pushed down to the ExpressionNode.

- Why does an ExpressionNode containing any Aggregates need to pollute the 
> entire tree with that knowledge?
>

As mentioned above. The ExpressionNode becomes an Aggregate to the outside 
world and should be treated like one. The leaves of the Expression tree 
form the actual aggregate functions, but the root of the tree forms the 
entire Aggregate expression. Using the example above:

.annotate(label=Sum('field') + Sum('anotherfield')) ->

ExpressionNode(+):
: Sum(field)
: Sum(anotherfield)

The framework needs to know several things about an Aggregate, like its 
output type and the type of columns that are wrapped. When there are 
multiple aggregate functions in a single expression, the root needs to 
check its children and make a decision on what to present.

Instead of using is_ordinal and is_computed (confusing name, why not 
> is_float), you could eliminate those and define output_type for each 
> Aggregate that does something special.
>

These are remnants of existing aggregates, and can probably be replaced by 
setting the field_type property directly. I will look into doing that.

What does is_summary mean?
>

It means that .aggregate() has been invoked which evaluates immediately. 
Internally the aggregate is prepared differently though as it has to check 
if it refers to an aggregate annotation. I don't like how the query sets 
the is_summary property directly though, and might grow .prepare() to also 
accept an is_summary parameter.

Cheers,

- Josh

On Sunday, 12 January 2014 10:35:58 UTC+11, Michael Manfre wrote:
>
> With minor tweaks to django-mssql's SQLCompiler, I am able to pass the 
> aggregation and aggregation_regress tests. If you create the pull request, 
> comments can be attached to specific lines of code.
>
> I haven't had time to do a full review of your changes, but my initial 
> questions and comments are:
>
> - Why were the aggregate specific fields, properties, etc. added to 
> ExpressionNode, instead of being on django.db.models.aggregates?
> - Why does an ExpressionNode containing any Aggregates need to pollute the 
> entire tree with that knowledge?
> - Instead of using is_ordinal and is_computed (confusing name, why not 
> is_float), you could eliminate those and define output_type for each 
> Aggregate that does something special.
> - What does is_summary mean?
>
> Regards,
> Michael Manfre
>
>
> On Sat, Jan 11, 2014 at 3:06 AM, Josh Smeaton 
>  > wrote:
>
>> A milestone has been reached. The diff is here: 
>> https://github.com/jarshwah/django/compare/akaariai:lookups_3...aggregates-14030?expand=1
>>
>>- Existing annotations and aggregations working as they were
>>- Complex annotations and complex aggregations now work
>>- Complex annotations can now be referenced by an aggregation
>>- Cleaned up the logic relating to aggregating over an annotation
>>
>> See the tests for the new functionality here: 
>> https://github.com/jarshwah/django/blob/aggregates-14030/tests/aggregation/tests.py#L662
>>
>> Things I still need to tackle:
>>
>>- GIS
>>- Custom backend implementation of aggregates and annotations 
>>- Annotations that aren't aggregates. I think this should be 
>>extremely easy, but I might not be aware of some underlying difficulties.
>>
>> Things that should be done in the future, as I don't think there will be 
>> time for this just now:
>>
>>- Change order_by (and other functions) to accept Expressions.
>>
>> I would think that, as it stands, this code is ready for review with the 
>> above caveats. If there are any backend authors reading (mssql, 
>> django-nonrel), I would encourage you to look at the implementation here, 
>> and let me know if there's anything glaringly obvious (to you) that is 
>> missing or broken.
>>
>> Regards,
>>
>> - Josh
>>
>> On Thursday, 9 January 2014 00:33:08 UTC+11, Josh Smeaton wrote:
>>>
>>> I'll preface by saying it's late for me, and I'm about to sign off, so 
>>> any mistakes below I will correct tomorrow after some sleep.
>>>  
>>>
 To write a custom aggregate one needed two classes, one for user facing 
 API, one for implementing that API for SQL queries.
>>>

Re: Improving aggregate support (#14030)

2014-01-11 Thread Michael Manfre
With minor tweaks to django-mssql's SQLCompiler, I am able to pass the
aggregation and aggregation_regress tests. If you create the pull request,
comments can be attached to specific lines of code.

I haven't had time to do a full review of your changes, but my initial
questions and comments are:

- Why were the aggregate specific fields, properties, etc. added to
ExpressionNode, instead of being on django.db.models.aggregates?
- Why does an ExpressionNode containing any Aggregates need to pollute the
entire tree with that knowledge?
- Instead of using is_ordinal and is_computed (confusing name, why not
is_float), you could eliminate those and define output_type for each
Aggregate that does something special.
- What does is_summary mean?

Regards,
Michael Manfre


On Sat, Jan 11, 2014 at 3:06 AM, Josh Smeaton wrote:

> A milestone has been reached. The diff is here:
> https://github.com/jarshwah/django/compare/akaariai:lookups_3...aggregates-14030?expand=1
>
>- Existing annotations and aggregations working as they were
>- Complex annotations and complex aggregations now work
>- Complex annotations can now be referenced by an aggregation
>- Cleaned up the logic relating to aggregating over an annotation
>
> See the tests for the new functionality here:
> https://github.com/jarshwah/django/blob/aggregates-14030/tests/aggregation/tests.py#L662
>
> Things I still need to tackle:
>
>- GIS
>- Custom backend implementation of aggregates and annotations
>- Annotations that aren't aggregates. I think this should be extremely
>easy, but I might not be aware of some underlying difficulties.
>
> Things that should be done in the future, as I don't think there will be
> time for this just now:
>
>- Change order_by (and other functions) to accept Expressions.
>
> I would think that, as it stands, this code is ready for review with the
> above caveats. If there are any backend authors reading (mssql,
> django-nonrel), I would encourage you to look at the implementation here,
> and let me know if there's anything glaringly obvious (to you) that is
> missing or broken.
>
> Regards,
>
> - Josh
>
> On Thursday, 9 January 2014 00:33:08 UTC+11, Josh Smeaton wrote:
>>
>> I'll preface by saying it's late for me, and I'm about to sign off, so
>> any mistakes below I will correct tomorrow after some sleep.
>>
>>
>>> To write a custom aggregate one needed two classes, one for user facing
>>> API, one for implementing that API for SQL queries.
>>
>>
>> I've taken note of this already, and that is the reason I've kept the
>> sql.Aggregate class where it is, though it's not used internally anymore.
>> There are two options that I can see right now:
>>
>> - The non-compatible way involves moving the sql_function attribute to
>> the Aggregate from the SQLAggregate, and everything should work fine
>> - The backward-compatible way might involve re-adding the "add_to_query"
>> function, and using the SQLAggregate to monkeypatch the Aggregate by
>> copying over the sql_function attribute. I would have to experiment and
>> test this though. I assume this would be the more appropriate option. We
>> can include a DeprecationWarning to phase it out.
>>
>>
>>> Another case you should check is how GIS works *without* rewriting
>>> anything in django.contrib.gis
>>
>>
>> Good idea, that's how I'll approach the aggregate stuff. Not sure that I
>> can leave it completely alone due to Expressions though, but we'll see (and
>> hope).
>>
>> I'll try to check your work in detail as soon as possible. Unfortunately
>>> I am very busy at the moment, so I can't guarantee anything.
>>
>>
>> Totally understandable, I appreciate your comments regardless. It's just
>> unfortunate there aren't more people deeply familiar with the ORM - it's
>> bloody complicated :P
>>
>> I'm now to the point where all tests in aggregate, aggregate_regress,
>> expressions, expressions_regress, and lookup pass. I've got a small problem
>> with .count() regressions failing, but am working through that now.
>> Tomorrow I should be able to start playing with new functionality and
>> writing new tests. I'll also take a look into running tests on GIS and
>> fixing any errors that rear their heads there.
>>
>> Cheers,
>>
>> - Josh
>>
>>
>> On Thursday, 9 January 2014 00:05:23 UTC+11, Anssi Kääriäinen wrote:
>>>
>>> On Friday, January 3, 2014 6:34:13 PM UTC+2, Josh Smeaton wrote:

 I now have all tests passing on Postgres and SQLite (See 
 here).
 I still haven't touched GIS, but I believe it should be as simple as
 changing the query to directly call prepare on the expression, and removing
 references to SQLEvaluator.

 The big problem is that if we drop the SQLEvaluator and similarly the
> models/aggregates.py <-> models/sql/aggregates.py ways of writing
> Expressions and Aggregates, then there 

Re: Improving aggregate support (#14030)

2014-01-11 Thread Josh Smeaton
A milestone has been reached. The diff is here: 
https://github.com/jarshwah/django/compare/akaariai:lookups_3...aggregates-14030?expand=1

   - Existing annotations and aggregations working as they were
   - Complex annotations and complex aggregations now work
   - Complex annotations can now be referenced by an aggregation
   - Cleaned up the logic relating to aggregating over an annotation
   
See the tests for the new functionality here: 
https://github.com/jarshwah/django/blob/aggregates-14030/tests/aggregation/tests.py#L662

Things I still need to tackle:

   - GIS
   - Custom backend implementation of aggregates and annotations
   - Annotations that aren't aggregates. I think this should be extremely 
   easy, but I might not be aware of some underlying difficulties.

Things that should be done in the future, as I don't think there will be 
time for this just now:

   - Change order_by (and other functions) to accept Expressions.

I would think that, as it stands, this code is ready for review with the 
above caveats. If there are any backend authors reading (mssql, 
django-nonrel), I would encourage you to look at the implementation here, 
and let me know if there's anything glaringly obvious (to you) that is 
missing or broken.

Regards,

- Josh

On Thursday, 9 January 2014 00:33:08 UTC+11, Josh Smeaton wrote:
>
> I'll preface by saying it's late for me, and I'm about to sign off, so any 
> mistakes below I will correct tomorrow after some sleep.
>  
>
>> To write a custom aggregate one needed two classes, one for user facing 
>> API, one for implementing that API for SQL queries.
>
>
> I've taken note of this already, and that is the reason I've kept the 
> sql.Aggregate class where it is, though it's not used internally anymore. 
> There are two options that I can see right now:
>
> - The non-compatible way involves moving the sql_function attribute to the 
> Aggregate from the SQLAggregate, and everything should work fine
> - The backward-compatible way might involve re-adding the "add_to_query" 
> function, and using the SQLAggregate to monkeypatch the Aggregate by 
> copying over the sql_function attribute. I would have to experiment and 
> test this though. I assume this would be the more appropriate option. We 
> can include a DeprecationWarning to phase it out.
>  
>
>> Another case you should check is how GIS works *without* rewriting 
>> anything in django.contrib.gis
>
>
> Good idea, that's how I'll approach the aggregate stuff. Not sure that I 
> can leave it completely alone due to Expressions though, but we'll see (and 
> hope). 
>
> I'll try to check your work in detail as soon as possible. Unfortunately I 
>> am very busy at the moment, so I can't guarantee anything.
>
>
> Totally understandable, I appreciate your comments regardless. It's just 
> unfortunate there aren't more people deeply familiar with the ORM - it's 
> bloody complicated :P
>
> I'm now to the point where all tests in aggregate, aggregate_regress, 
> expressions, expressions_regress, and lookup pass. I've got a small problem 
> with .count() regressions failing, but am working through that now. 
> Tomorrow I should be able to start playing with new functionality and 
> writing new tests. I'll also take a look into running tests on GIS and 
> fixing any errors that rear their heads there.
>
> Cheers,
>
> - Josh
>  
>
> On Thursday, 9 January 2014 00:05:23 UTC+11, Anssi Kääriäinen wrote:
>>
>> On Friday, January 3, 2014 6:34:13 PM UTC+2, Josh Smeaton wrote:
>>>
>>> I now have all tests passing on Postgres and SQLite (See 
>>> here).
>>>  
>>> I still haven't touched GIS, but I believe it should be as simple as 
>>> changing the query to directly call prepare on the expression, and removing 
>>> references to SQLEvaluator.
>>>
>>> The big problem is that if we drop the SQLEvaluator and similarly the 
 models/aggregates.py <-> models/sql/aggregates.py ways of writing 
 Expressions and Aggregates, then there will be a lot of broken 3rd party 
 aggregates and expressions. Even if the APIs are private I am not willing 
 to just drop the old way. So, some backwards compatibility is required.

>>>
>>> What level of backwards compatibility would be required? I like the idea 
>>> of using the @add_implementation decorator to extend Expressions and 
>>> Aggregates, but would that be enough? I've not used django-nonrel or any 
>>> other non-official backend, so I'm really not sure what would be required 
>>> or where a line should be drawn. The SQLEvaluator API (mostly) has moved 
>>> down to ExpressionNode, so the callers only change by calling .prepare on 
>>> the value rather than wrapping it in an Evaluator.
>>>
>>
>> The biggest problem are 3rd party Aggregates. To write a custom aggregate 
>> one needed two classes, one for user facing API, one for implementing that 
>> API for SQL queries.
>>
>> Taking first 

Re: Improving aggregate support (#14030)

2014-01-08 Thread Anssi Kääriäinen
On Friday, January 3, 2014 6:34:13 PM UTC+2, Josh Smeaton wrote:
>
> I now have all tests passing on Postgres and SQLite (See 
> here).
>  
> I still haven't touched GIS, but I believe it should be as simple as 
> changing the query to directly call prepare on the expression, and removing 
> references to SQLEvaluator.
>
> The big problem is that if we drop the SQLEvaluator and similarly the 
>> models/aggregates.py <-> models/sql/aggregates.py ways of writing 
>> Expressions and Aggregates, then there will be a lot of broken 3rd party 
>> aggregates and expressions. Even if the APIs are private I am not willing 
>> to just drop the old way. So, some backwards compatibility is required.
>>
>
> What level of backwards compatibility would be required? I like the idea 
> of using the @add_implementation decorator to extend Expressions and 
> Aggregates, but would that be enough? I've not used django-nonrel or any 
> other non-official backend, so I'm really not sure what would be required 
> or where a line should be drawn. The SQLEvaluator API (mostly) has moved 
> down to ExpressionNode, so the callers only change by calling .prepare on 
> the value rather than wrapping it in an Evaluator.
>

The biggest problem are 3rd party Aggregates. To write a custom aggregate 
one needed two classes, one for user facing API, one for implementing that 
API for SQL queries.

Taking first example I found from Google, from 
http://voices.canonical.com/michael.hudson/2012/09/02/using-postgres-array_agg-from-django/:

class SQLArrayAgg(SQLAggregate):
sql_function = 'array_agg'

class ArrayAgg(Aggregate):
name = 'ArrayAgg'
def add_to_query(self, query, alias, col, source, is_summary):
klass = SQLArrayAgg
aggregate = klass(
col, source=source, is_summary=is_summary, **self.extra)
aggregate.field = DecimalField() # vomit
query.aggregates[alias] = aggregate

Then doing:
qs = SomeModel.objects.annotate(ArrayAgg('id'))
should produce results using the ArrayAgg class. This is using private API, but 
there is a lot of code relying on this. We should support this if at all 
possible.

Another case you should check is how GIS works *without* rewriting anything in 
django.contrib.gis. If that works, then that shows fairly good backwards 
compatibility.


Lets try to find the common use cases and try to make sure that they work 
without any modification during deprecation period, and they are as easy as 
possible to rewrite to use the new way.

If this turns out to be too painful to support backwards compatibility we 
have to consider either dropping compatibility or maybe supporting both 
NewAggregate and old style Aggregate simultaneously.

I am not sure how much custom ExpressionNodes there are, at least such 
nodes that rely on having SQLEvaluator available. If anybody reading this 
knows of any please mention them here.

I'll try to check your work in detail as soon as possible. Unfortunately I 
am very busy at the moment, so I can't guarantee anything.

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/f115b7b9-80d5-435d-9d14-cd38e66e3a28%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2014-01-06 Thread Josh Smeaton
An update. I've had some success with refactoring Aggregates as 
ExpressionNodes. The aggregates test suite is now passing, but I still have 
some failures with aggregates_regress test suite. The refactoring should 
allow behaviour like below possible:

Model.objects.annotate(total_time=Sum( F('timer1') + F('timer2') / 2 ))

An issue that the first patch ran into was deducing the python return types 
since there was no field type to "attach" to. I've introduced some basic 
behaviour that checks that all nested expressions are the same type, and 
then grabs the first column type. If the types are different, an error is 
thrown. To get around this, I introduced a field_type parameter to 
aggregates:

Model.objects.annotate(total_time=Sum( F('timer1') + F('timer2') / 2, 
field_type=FloatField() ))

Feedback appreciated and welcome.

 My plan is:

- Make sure all existing tests pass
- Create a bunch of new tests that stress "complex annotations/aggregations"
- Figure out how to allow F() in annotations. Annotation currently accepts 
arbitrary expressions, but does nothing with them.
- Clean up anything that looks too messy

I think I will be done with the above list inside the next 7 days. I would 
welcome help (or pointers) on testing against GIS once I'm done, as that 
should be all that's left to tackle.

- Josh

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/9c16ccdc-9d0d-4804-91ba-550de6148c1a%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2014-01-03 Thread Josh Smeaton
I now have all tests passing on Postgres and SQLite (See 
here).
 
I still haven't touched GIS, but I believe it should be as simple as 
changing the query to directly call prepare on the expression, and removing 
references to SQLEvaluator.

The big problem is that if we drop the SQLEvaluator and similarly the 
> models/aggregates.py <-> models/sql/aggregates.py ways of writing 
> Expressions and Aggregates, then there will be a lot of broken 3rd party 
> aggregates and expressions. Even if the APIs are private I am not willing 
> to just drop the old way. So, some backwards compatibility is required.
>

What level of backwards compatibility would be required? I like the idea of 
using the @add_implementation decorator to extend Expressions and 
Aggregates, but would that be enough? I've not used django-nonrel or any 
other non-official backend, so I'm really not sure what would be required 
or where a line should be drawn. The SQLEvaluator API (mostly) has moved 
down to ExpressionNode, so the callers only change by calling .prepare on 
the value rather than wrapping it in an Evaluator.

I'll make a start on refactoring Aggregate to inherit from ExpressionNode 
regardless. Hopefully that will make clear some uncertainties.

Cheers,

Josh

On Sunday, 29 December 2013 04:15:44 UTC+11, Anssi Kääriäinen wrote:
>
> On Wednesday, December 25, 2013 5:52:10 PM UTC+2, Josh Smeaton wrote:
>>
>> So I've gone ahead and tried to remove SQLEvaluator by pushing its 
>> functions down into ExpressionNode. For the most part, it's working well. I 
>> rebased on akaariai/lookups_3 to make sure that any changes to lookups 
>> would be reflected in this change as they are *somewhat* related. The diff 
>> between lookups and my patch is below:
>>
>>
>> https://github.com/jarshwah/django/compare/akaariai:lookups_3...aggregates-14030?expand=1
>>
>> There is still a lot missing (I haven't touched GIS yet), and some of the 
>> tests are failing. I'm running tests against SQLITE and Postgres at the 
>> moment, but I'll add in Oracle and MySQL once I've got something closer to 
>> finished.
>>
>> I'm having some trouble with the test that is currently failing, and if 
>> someone could point out where I'm going wrong that'd be awesome. 
>> Specifically, ExpressionTests.test_filter() is failing on updates spanning 
>> relationships (line 
>> 179).
>>  
>> The problem is that relabeled_clone isn't being called or used at the 
>> appropriate time. I did some debugging, and I *think* that 
>> Lookup.process_rhs() is being called before Lookup.relabeled_clone(), which 
>> results in the ExpressionNode using the incorrect alias. I assume this is a 
>> problem with how I've refactored ExpressionNode since the same tests pass 
>> when I checkout lookups_3. ExpressionNode.relabeled_clone is called, but 
>> after the SQL has already been generated. Should I be providing a method or 
>> attribute for Lookup to inspect maybe?
>>
>>
> To me it seems the problem is in the relabeled_clone method. The children 
> are relabeled_clone()'d, but the return value (the relabeled clone) is just 
> discarded.
>
> In addition avoid defining custom __deepcopy__. Instead you could 
> copy.copy() self in start, and then alter self's mutable values manually.
>
> I'd appreciate some feedback on the patch as it stands too if possible. I 
>> think this is the right direction to head towards, as it simplifies the 
>> existing F() evaluation, and should allow a straightforward implementation 
>> of Aggregates on top. I'll be away until the 2nd of January, but will pick 
>> this up again then.
>>
>
> I have always found the F() -> SQLEvaluator way of producing SQL limiting 
> and confusing, so +1 from me for getting rid of that.
>
> The idea of F() evaluated by SQLEvaluator (or models.aggregates evaluated 
> by models.sql.aggregates) is that this deduplicates the user facing API 
> from the implementation. This is needed if you have custom Query class (for 
> example for nonrelational engine).
>
> The implementation (that is, SQLEvaluator) is used in two stages. First, 
> when adding the Expression to the query and then when compiling the actual 
> query string. If you need to customize how Expressions are added to the 
> query, then it means you have a custom Query class. That again means that 
> you can wrap the F()/Aggregate object in any way you like, as you control 
> the Query object. For the second part the new query expression API offers 
> you a direct way to alter query string compilation for different backends 
> (see add_implementation() in 
> https://github.com/django/django/pull/2019/files#diff-986a7aa3149cc32d19b751d7ab166082R222
> ).
>
> So, I think we should be safe as far as feature parity goes. If the new 
> way of writing expressions is actually better for non-sql 

Re: Improving aggregate support (#14030)

2013-12-28 Thread Anssi Kääriäinen
On Wednesday, December 25, 2013 5:52:10 PM UTC+2, Josh Smeaton wrote:
>
> So I've gone ahead and tried to remove SQLEvaluator by pushing its 
> functions down into ExpressionNode. For the most part, it's working well. I 
> rebased on akaariai/lookups_3 to make sure that any changes to lookups 
> would be reflected in this change as they are *somewhat* related. The diff 
> between lookups and my patch is below:
>
>
> https://github.com/jarshwah/django/compare/akaariai:lookups_3...aggregates-14030?expand=1
>
> There is still a lot missing (I haven't touched GIS yet), and some of the 
> tests are failing. I'm running tests against SQLITE and Postgres at the 
> moment, but I'll add in Oracle and MySQL once I've got something closer to 
> finished.
>
> I'm having some trouble with the test that is currently failing, and if 
> someone could point out where I'm going wrong that'd be awesome. 
> Specifically, ExpressionTests.test_filter() is failing on updates spanning 
> relationships (line 
> 179).
>  
> The problem is that relabeled_clone isn't being called or used at the 
> appropriate time. I did some debugging, and I *think* that 
> Lookup.process_rhs() is being called before Lookup.relabeled_clone(), which 
> results in the ExpressionNode using the incorrect alias. I assume this is a 
> problem with how I've refactored ExpressionNode since the same tests pass 
> when I checkout lookups_3. ExpressionNode.relabeled_clone is called, but 
> after the SQL has already been generated. Should I be providing a method or 
> attribute for Lookup to inspect maybe?
>
>
To me it seems the problem is in the relabeled_clone method. The children 
are relabeled_clone()'d, but the return value (the relabeled clone) is just 
discarded.

In addition avoid defining custom __deepcopy__. Instead you could 
copy.copy() self in start, and then alter self's mutable values manually.

I'd appreciate some feedback on the patch as it stands too if possible. I 
> think this is the right direction to head towards, as it simplifies the 
> existing F() evaluation, and should allow a straightforward implementation 
> of Aggregates on top. I'll be away until the 2nd of January, but will pick 
> this up again then.
>

I have always found the F() -> SQLEvaluator way of producing SQL limiting 
and confusing, so +1 from me for getting rid of that.

The idea of F() evaluated by SQLEvaluator (or models.aggregates evaluated 
by models.sql.aggregates) is that this deduplicates the user facing API 
from the implementation. This is needed if you have custom Query class (for 
example for nonrelational engine).

The implementation (that is, SQLEvaluator) is used in two stages. First, 
when adding the Expression to the query and then when compiling the actual 
query string. If you need to customize how Expressions are added to the 
query, then it means you have a custom Query class. That again means that 
you can wrap the F()/Aggregate object in any way you like, as you control 
the Query object. For the second part the new query expression API offers 
you a direct way to alter query string compilation for different backends 
(see add_implementation() in 
https://github.com/django/django/pull/2019/files#diff-986a7aa3149cc32d19b751d7ab166082R222).

So, I think we should be safe as far as feature parity goes. If the new way 
of writing expressions is actually better for non-sql backend support isn't 
clear to me. I just haven't worked with non-SQL backends enough to say.

The big problem is that if we drop the SQLEvaluator and similarly the 
models/aggregates.py <-> models/sql/aggregates.py ways of writing 
Expressions and Aggregates, then there will be a lot of broken 3rd party 
aggregates and expressions. Even if the APIs are private I am not willing 
to just drop the old way. So, some backwards compatibility is required.

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/f2e215f2-1acb-4cf6-8e5c-70e136ff5fe2%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2013-12-25 Thread Josh Smeaton
So I've gone ahead and tried to remove SQLEvaluator by pushing its 
functions down into ExpressionNode. For the most part, it's working well. I 
rebased on akaariai/lookups_3 to make sure that any changes to lookups 
would be reflected in this change as they are *somewhat* related. The diff 
between lookups and my patch is below:

https://github.com/jarshwah/django/compare/akaariai:lookups_3...aggregates-14030?expand=1

There is still a lot missing (I haven't touched GIS yet), and some of the 
tests are failing. I'm running tests against SQLITE and Postgres at the 
moment, but I'll add in Oracle and MySQL once I've got something closer to 
finished.

I'm having some trouble with the test that is currently failing, and if 
someone could point out where I'm going wrong that'd be awesome. 
Specifically, ExpressionTests.test_filter() is failing on updates spanning 
relationships (line 
179).
 
The problem is that relabeled_clone isn't being called or used at the 
appropriate time. I did some debugging, and I *think* that 
Lookup.process_rhs() is being called before Lookup.relabeled_clone(), which 
results in the ExpressionNode using the incorrect alias. I assume this is a 
problem with how I've refactored ExpressionNode since the same tests pass 
when I checkout lookups_3. ExpressionNode.relabeled_clone is called, but 
after the SQL has already been generated. Should I be providing a method or 
attribute for Lookup to inspect maybe?

I'd appreciate some feedback on the patch as it stands too if possible. I 
think this is the right direction to head towards, as it simplifies the 
existing F() evaluation, and should allow a straightforward implementation 
of Aggregates on top. I'll be away until the 2nd of January, but will pick 
this up again then.

Cheers

- Josh


On Tuesday, 24 December 2013 11:41:19 UTC+11, Josh Smeaton wrote:
>
> So, after hacking away a little bit yesterday, I came to the same 
> conclusion that Anssi did back when the first PR was sent. Namely, that the 
> ExpressionNode <-> SQLEvaluator structure seems overly complex, and makes 
> it difficult to create custom "ExpressionNodes".
>
> To build a custom Expression, one needs to:
>
> - Subclass ExpressionNode
> - Build in SQL into the backend (sometimes)
> - Modify SQLEvaluator (or create a new one)
> - Modify sql/query to learn about the different types of 
> ExpressionNodes
>
> I think it'd be better if SQLEvaluator was dropped altogether, and 
> ExpressionNode was given the evaluator responsibilities. In this world, 
> creating a custom Expression would look like:
>
> - Subclass ExpressionNode
> - The subclass contains SQL for each supported backend (and has 
> access to the 'connection' instance)
> - sql/query asks ExpressionNode to evaluate itself
>
> Provided that ExpressionNode (or one of its subclasses) had the 
> appropriate methods, it would allow library authors to create their own 
> custom functions. #11305 (Conditional Aggregates) could then be implemented 
> quite easily, without touching the rest of the stack.
>
> So this is what I'm going to attempt first. I'll "merge" the functions of 
> SQLEvaluator and ExpressionNode, and see how far along that gets me. If 
> that turns out to simplify the handling of F(), then it should be equally 
> useful when refactoring Aggregate as an ExpressionNode. As a POC, I'll also 
> implement a version of Conditional Aggregates to confirm the API. Now, this 
> is definitely more complex than porting nateb's original changes, but it's 
> probably going to be a more simplified implementation. As such, my hopes 
> for inclusion in 1.7 are probably far-fetched.
>
> Will report back when I've got something a little more concrete to talk 
> about.
>
> Regards,
>
> - Josh
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/a5643365-2bc4-4240-981a-85706b44408c%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2013-12-23 Thread Josh Smeaton
So, after hacking away a little bit yesterday, I came to the same 
conclusion that Anssi did back when the first PR was sent. Namely, that the 
ExpressionNode <-> SQLEvaluator structure seems overly complex, and makes 
it difficult to create custom "ExpressionNodes".

To build a custom Expression, one needs to:

- Subclass ExpressionNode
- Build in SQL into the backend (sometimes)
- Modify SQLEvaluator (or create a new one)
- Modify sql/query to learn about the different types of 
ExpressionNodes

I think it'd be better if SQLEvaluator was dropped altogether, and 
ExpressionNode was given the evaluator responsibilities. In this world, 
creating a custom Expression would look like:

- Subclass ExpressionNode
- The subclass contains SQL for each supported backend (and has 
access to the 'connection' instance)
- sql/query asks ExpressionNode to evaluate itself

Provided that ExpressionNode (or one of its subclasses) had the appropriate 
methods, it would allow library authors to create their own custom 
functions. #11305 (Conditional Aggregates) could then be implemented quite 
easily, without touching the rest of the stack.

So this is what I'm going to attempt first. I'll "merge" the functions of 
SQLEvaluator and ExpressionNode, and see how far along that gets me. If 
that turns out to simplify the handling of F(), then it should be equally 
useful when refactoring Aggregate as an ExpressionNode. As a POC, I'll also 
implement a version of Conditional Aggregates to confirm the API. Now, this 
is definitely more complex than porting nateb's original changes, but it's 
probably going to be a more simplified implementation. As such, my hopes 
for inclusion in 1.7 are probably far-fetched.

Will report back when I've got something a little more concrete to talk 
about.

Regards,

- Josh

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/b4d333f5-0184-4b2b-9946-73b365144306%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2013-12-22 Thread Josh Smeaton
Thanks for the response. I've noticed the JoinPromoter recently, and I have 
been looking into the custom lookups, but wasn't sure what the purpose was 
until now.

I'll see what I can do with this patch, and try to avoid as much complexity 
as I can. Then it just comes down to review on whether it makes it in or 
not (and my ability of course).

Regards,

Josh

On Sunday, 22 December 2013 22:04:36 UTC+11, Anssi Kääriäinen wrote:
>
> On Saturday, December 21, 2013 3:15:12 AM UTC+2, Josh Smeaton wrote:
>>
>> I'm interested in tackling 
>> ticket#14030, 
>> ideally before the Jan 20th cutoff, but I'm looking for some guidance. 
>> There have been a lot of references to Anssi Kääriäinen attempting an ORM 
>> refactor before trying to fix aggregates, but I'm not sure if this is still 
>> a goal or not.
>>
>
> Yes, more refactoring is still the plan. A lot has already been done. The 
> main part of the work has been bug fixing, code cleanup and join generation 
> changes.
>
> The next part is introducing custom lookups for 1.7 (see 
> https://github.com/django/django/pull/2019), and later on in 1.8 allow 
> usage of custom lookups in every part of the ORM. This means one can do 
> queries such as .order_by(' jsonfield__user__firstname'), or more relevant 
> to this ticket, .annotate(avg_year=Avg('datefield__year')).
>
> To make the above possible Django's ORM needs to switch to using unified 
> query expression API. For example aggregates, custom lookups and Expression 
> nodes will all respond to the same API. The API deals mainly with how to 
> produce SQL for the expressions, how to convert db values to Python values 
> and things like that. As a side effect Avg(F('somefield') + 
> F('anotherfield')) should just work.
>  
>
>> The work done by nateb at https://github.com/django/django/pull/46 was 
>> very similar to the structure I believed would be the right way forward - 
>> refactoring Aggregates to inherit from ExpressionNode. Unfortunately, the 
>> patch submitted has been closed because it no longer applies cleanly. There 
>> was also discussion regarding type coercion and a lack of real world tests 
>> that should be considered.
>>
>> What I'd like to do is attempt to rewrite the above PR so that it applies 
>> cleanly. The reviews that had already happened on that patch would then 
>> mostly apply to the rewrite, eliminating the need for a start to finish 
>> review of the concept in general. Is this something that I should go 
>> through with, or are we better off waiting for the ORM refactor then 
>> reevaluating the way forward?
>>
>
> Making Aggregates inherit from ExpressionNode shouldn't make the unified 
> query expression API significantly easier or harder to achieve. If the code 
> complexity stays manageable, I don't see a reason to not merge the changes. 
> The change will be needed for Avg('somefield')  + Avg('otherfield') support 
> even when unified query expression API was implemented.
>
> One of the main problems for this ticket has been the code complexity. 
> Both Aggregates and Expressions are somewhat complicated in the ORM. The 
> combination of the two results in even worse code complexity. On the other 
> hand this feature is clearly needed by many users, so maybe some code 
> complexity can be tolerated.
>
>  - Anssi
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/1eeaa1b5-fe7a-4b7a-8fb7-8792bf45a49b%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Improving aggregate support (#14030)

2013-12-22 Thread Anssi Kääriäinen
On Saturday, December 21, 2013 3:15:12 AM UTC+2, Josh Smeaton wrote:
>
> I'm interested in tackling 
> ticket#14030, 
> ideally before the Jan 20th cutoff, but I'm looking for some guidance. 
> There have been a lot of references to Anssi Kääriäinen attempting an ORM 
> refactor before trying to fix aggregates, but I'm not sure if this is still 
> a goal or not.
>

Yes, more refactoring is still the plan. A lot has already been done. The 
main part of the work has been bug fixing, code cleanup and join generation 
changes.

The next part is introducing custom lookups for 1.7 (see 
https://github.com/django/django/pull/2019), and later on in 1.8 allow 
usage of custom lookups in every part of the ORM. This means one can do 
queries such as .order_by(' jsonfield__user__firstname'), or more relevant 
to this ticket, .annotate(avg_year=Avg('datefield__year')).

To make the above possible Django's ORM needs to switch to using unified 
query expression API. For example aggregates, custom lookups and Expression 
nodes will all respond to the same API. The API deals mainly with how to 
produce SQL for the expressions, how to convert db values to Python values 
and things like that. As a side effect Avg(F('somefield') + 
F('anotherfield')) should just work.
 

> The work done by nateb at https://github.com/django/django/pull/46 was 
> very similar to the structure I believed would be the right way forward - 
> refactoring Aggregates to inherit from ExpressionNode. Unfortunately, the 
> patch submitted has been closed because it no longer applies cleanly. There 
> was also discussion regarding type coercion and a lack of real world tests 
> that should be considered.
>
> What I'd like to do is attempt to rewrite the above PR so that it applies 
> cleanly. The reviews that had already happened on that patch would then 
> mostly apply to the rewrite, eliminating the need for a start to finish 
> review of the concept in general. Is this something that I should go 
> through with, or are we better off waiting for the ORM refactor then 
> reevaluating the way forward?
>

Making Aggregates inherit from ExpressionNode shouldn't make the unified 
query expression API significantly easier or harder to achieve. If the code 
complexity stays manageable, I don't see a reason to not merge the changes. 
The change will be needed for Avg('somefield')  + Avg('otherfield') support 
even when unified query expression API was implemented.

One of the main problems for this ticket has been the code complexity. Both 
Aggregates and Expressions are somewhat complicated in the ORM. The 
combination of the two results in even worse code complexity. On the other 
hand this feature is clearly needed by many users, so maybe some code 
complexity can be tolerated.

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/1272168b-84f1-4968-a604-84141c174483%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Improving aggregate support (#14030)

2013-12-20 Thread Josh Smeaton
I'm interested in tackling 
ticket#14030, 
ideally before the Jan 20th cutoff, but I'm looking for some guidance. 
There have been a lot of references to Anssi Kääriäinen attempting an ORM 
refactor before trying to fix aggregates, but I'm not sure if this is still 
a goal or not.

The work done by nateb at https://github.com/django/django/pull/46 was very 
similar to the structure I believed would be the right way forward - 
refactoring Aggregates to inherit from ExpressionNode. Unfortunately, the 
patch submitted has been closed because it no longer applies cleanly. There 
was also discussion regarding type coercion and a lack of real world tests 
that should be considered.

What I'd like to do is attempt to rewrite the above PR so that it applies 
cleanly. The reviews that had already happened on that patch would then 
mostly apply to the rewrite, eliminating the need for a start to finish 
review of the concept in general. Is this something that I should go 
through with, or are we better off waiting for the ORM refactor then 
reevaluating the way forward?

Regards,

Josh

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/600ab641-0eac-4c82-b5fd-544ff0240025%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.