On 3/8/06, Rock <[EMAIL PROTECTED]> wrote:
> It seems that no one else has any thoughts about this, so it's just you
> and me.
I read it to mean that everyone was concentrating on getting magic-removal finished. MR is all about getting the backwards incompatible changes out of the way in one big jump; once those are sorted out, we can worry about the backwards compatible changes (like adding new API for aggregates). If we don't focus, magic-removal runs the risk of becoming the branch that will never merge...
However - that said, a few comments to give you something to ruminate upon:
> Next, I disagree about "average". Using the abbreviated SQLish name for
> all of the functions except one is bad. Either all of the names should
> be expanded or none. I prefer none since the non-standard SQL aggregate
> functions must match the SQL provided name precisely. Might as well
> make the standard function names do the same.
You have inadvertently made my point. The only reason avg is a good name is so that you don't have to remember the difference in names when you use your get_aggregate(s) method, and like I said in my last email, I _really_ don't like that function precisely because it leaks raw SQL into the Django API.
Show me somewhere in core (or near core) Python libraries that are not SQL related where avg() is used to describe an arithmetic mean, and I'll back down on this one. There is precedent in the Python core for the abbreviations min and max:
http://docs.python.org/lib/typesseq.html#l2h-167
If you take SciPy as an example of good maths + stats libraries, it would dictate that std() and mean() be the names for STDDEV and AVG.
http://www.scipy.org/doc/api_docs/scipy.stats.stats.html
There is also at least one ASPN recipe to support the idea of stddev() and mean():
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/409413
But a quick search didn't reveal a single example (that isn't SQL bound) that promotes avg().
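To make the naming point concrete, here is a minimal sketch in plain Python. The mean() and stddev() helpers are hypothetical names written for illustration only (they are not an existing Django API); min() and max() are the existing builtins.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


def stddev(values):
    """Population standard deviation of a non-empty sequence of numbers."""
    values = list(values)
    centre = mean(values)
    return math.sqrt(sum((v - centre) ** 2 for v in values) / len(values))


ages = [2, 4, 4, 4, 5, 5, 7, 9]

# All four names read as ordinary Python; none of them is SQL vocabulary.
summary = {
    "min": min(ages),
    "max": max(ages),
    "mean": mean(ages),
    "stddev": stddev(ages),
}
```

Nothing here needs the reader to know that SQL happens to spell the mean as AVG.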
Look back at the archives when I first brought up aggregates:
http://groups.google.com/group/django-developers/browse_thread/thread/897711c8b86bc5e8/ef491ce3be3e1989
It was made clear to me then that 'SQL does it like X, so let's add X to Django' wouldn't win me any points. Django's ORM isn't about finding a way of representing SQL as Python - it's about getting a consistent, expressive object model, that just happens to be backed by a SQL database. Keep in mind that it could just as well be backed by an object database, or some other persistent store. What will happen to SQL notation if SQL isn't available?
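A rough sketch of what "not tied to SQL" could look like, assuming a hypothetical Mean description object and two hypothetical backends (as_sql and in_memory). None of these names exist in Django; they only illustrate that the SQL spelling can stay inside one backend.

```python
class Mean:
    """Hypothetical aggregate description: 'the mean of this field'."""

    def __init__(self, field):
        self.field = field


def as_sql(aggregate, table):
    # SQL backend: the only place where SQL's own name (AVG) appears.
    return 'SELECT AVG("%s") FROM "%s"' % (aggregate.field, table)


def in_memory(aggregate, rows):
    # Non-SQL backend: evaluate the same description over plain dicts.
    values = [row[aggregate.field] for row in rows]
    return sum(values) / len(values)


spec = Mean("price")
sql = as_sql(spec, "shop_product")
result = in_memory(spec, [{"price": 10}, {"price": 20}, {"price": 30}])
```

The calling code only ever says Mean("price"); whether that becomes AVG or a loop is the backend's business.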
> The get_aggregate() function is really just the shared code behind
> sum(), min(), max(), stddev() and so forth. Arguably it could be replaced
> by get_aggregates() (or whatever better name you might suggest),
The name isn't the only problem. In no particular order:
- It requires the use of raw SQL as a parameter, specified as a string
- It requires the use of strings to identify column names (which string to use? column name? field name? verbose name? plural name? what about error checking?)
- It doesn't provide a way to get the minimum of one field, and the maximum of another using a single SQL call
- It doesn't exhibit good polymorphism - there is no reason that get_aggregate couldn't accept a list as its first parameter as well as a single item
- The method name is get_ in an API moving towards descriptors
Not to put too fine a point on it, but I find very little in get_aggregates that is appealing.
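For contrast, here is one possible shape that avoids most of the problems listed above. It is only a sketch under stated assumptions: Aggregate, Min, Max, aggregate() and the field_names argument are all hypothetical, and the rows are plain dicts rather than model instances.

```python
class Aggregate:
    """Hypothetical base class: an aggregate over one named field."""

    def __init__(self, field):
        self.field = field

    def compute(self, values):
        raise NotImplementedError


class Min(Aggregate):
    def compute(self, values):
        return min(values)


class Max(Aggregate):
    def compute(self, values):
        return max(values)


def aggregate(rows, field_names, **requested):
    """Evaluate any number of named aggregates in a single call."""
    # Field names are checked up front instead of being passed through blindly.
    for spec in requested.values():
        if spec.field not in field_names:
            raise ValueError("unknown field: %r" % spec.field)
    return {
        alias: spec.compute([row[spec.field] for row in rows])
        for alias, spec in requested.items()
    }


people = [
    {"age": 31, "height": 180},
    {"age": 25, "height": 165},
    {"age": 47, "height": 172},
]

# Minimum of one field and maximum of another, requested together.
stats = aggregate(
    people, {"age", "height"}, youngest=Min("age"), tallest=Max("height")
)
```

No raw SQL strings are involved, a misspelled field fails loudly, and one aggregate or several go through the same call. A SQL backend could presumably turn such a request into one query, but that part is not shown here.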
> all, is how I do the aggregate functions today.) Once we have a good
> 80/20 design, then a "kitchen sink" approach would be fun to explore
> and your suggestions are a good starting point.
I'm not opposed to the idea of a simple min/max etc API in addition to some mega-aggregate API. However, I don't want to start working on something as big as aggregates until we have a stable base to work on, and we have the attention of all the big players. Finalizing magic-removal has everyone pretty busy at the moment.
_Please_ can we defer this discussion until after magic-removal? Trust me, I won't forget about it - it's the single biggest item I personally want in Django, and I'd rather it be done right than right now :-)
Russ Magee %-)