On 2/22/07, Tim Chase <[EMAIL PROTECTED]> wrote:
>
> Below I've pasted my first attempt at an aggregate function
> class.  Its __init__ takes a queryset (and an optional list of
> aggregate functions to perform if, say, you only want the "sum"s
> rather than min/max/avg/sum).  Thus you should be able to do
> things like
>
>  >>>class Foo(Model):
> ...  blah = CharField(maxlength=42)
> ...  xxx = IntegerField()
> ...  yyy = SmallIntegerField()
> ...  zzz = PositiveIntegerField()
> ...
> [add some data to Foo]
>  >>> stats = Aggregate(Foo.objects.filter(xxx__gt=21))
>
> The "stats" object now has properties
>         xxx_sum
>         xxx_max
>         xxx_min
>         xxx_average
>         yyy_sum
>         yyy_max
>         yyy_min
>         yyy_average
>         zzz_sum
>         zzz_max
>         zzz_min
>         zzz_average
>
> which can be accessed to provide the associated stats for the
> given model.  If you make your own field types, if they're
> numeric, you'll want to add them to the set().
>
> The constructor can also be called with the list of functions you
> want:
>
>  >>> stats = Aggregate(Foo.objects.all(), ['sum'])
>
> in case you just want the *_sum properties.
>
> Future enhancements might include aggregates that don't involve
> sum/avg, so one might get the min/max of a date/time/char field.
>
> I haven't yet figured out a way to suppress the order_by portion,
> so what's currently in there is an ugly hack.  But it would need
> to prevent the standard methods from inserting an ORDER BY clause
> against a non-aggregated field.

if you add an empty call (no parameters) to order_by(), it will drop
the ORDER BY clause

>
> At the moment, it meets my needs, but I'd gladly consider any
> feedback on this first draft version of things.  Also, if you
> find it useful, feel free to shred it into your own purposes as
> desired.

I have another proposition, how about:

>>> quseryset = Model.objects.all()
>>> queryset.aggregate( ( 'name', 'city' ), sum=( 'pay',
'some_other_field' ), avg=( 'pay', 'age' ), count=True )
>>> [
  {
    'name' : 'some name',
    'city' : 'Prague',
    'sum' : { 'pay' : 30000, 'some_other_field' : 10000 },
    'avg' : { 'pay' : 10000, 'age' : 30 },
    'count' : 5,
  },
  ......
]

or something like:

{
  ( 'some name', 'Prague') : {
    'sum' : { 'pay' : 50000, 'some_other_field' : 10000 },
    'avg' : { 'pay' : 10000, 'age' : 30 },
    'count' : 5,
  },
  ( 'other name', 'other city') : {
    ....
  }
}

it seems reasonably clean interface to me:
fist argument is a tuple of fields on which to call GROUP BY, then
there are several optional kwargs: avg, sum, min, max and count

I am willing to code up a patch for this, it should be pretty
straightforward. If we would want to take it to the next level, we
could allow filter() on such aggregated queryset, that would manifest
itself in the HAVING clause...

what do you think??


>
> -tkc
>
>
> #############################################
> from django.db import backend, connection
>
> NUMERIC_FIELDS = set((
>    'FloatField',
>    'IntegerField',
>    'PositiveIntegerField',
>    'PositiveSmallIntegerField',
>    'SmallIntegerField',
>    ))
>
> FUNCTIONS = [
>    'min',
>    'max',
>    'average',
>    'sum',
>    ]
>
> def is_numeric_field_type(field):
>    return field.get_internal_type() in NUMERIC_FIELDS
>
> class Aggregate(object):
>    def __init__(self, queryset, only_type=None):
>      self._cache = None
>      self.queryset = queryset.select_related(False)
>      self.only_type = only_type or FUNCTIONS
>
>    def all(self):
>      self._get_results()
>
>    def _get_aggregate_sql(self):
>      q = self.queryset._clone()
>      q._order_by = '?' # can't use None as it gets
>      # overridden if a default Meta.ordering is specified
>      # Should do some magic to prevent it from adding
>      # the Meta.ordering if possible
>      select, sql, params = q._get_sql_clause()
>      meta = self.queryset.model._meta
>      # build the SELECT contents:
>      # fn(table.field) AS field_fn
>      selects = [[
>        "%s(%s.%s) AS %s" % (
>          fn.lower(),
>          backend.quote_name(meta.db_table),
>          backend.quote_name(f.column),
>          backend.quote_name("%s_%s" % (f.column, fn.lower())),
>          ) for f in meta.fields
>          if is_numeric_field_type(f)
>          ] for fn in self.only_type]
>      # the sum(selects, []) flattens
>      # the list-of-lists into just a list
>      return sum(selects, []), sql, params
>
>    def _get_results(self):
>      if self._cache is None:
>        select, sql, params =  self._get_aggregate_sql()
>        cursor = connection.cursor()
>        sql = "SELECT " + ",".join(select) + sql
>        cursor.execute(sql, params)
>        results = cursor.fetchone()
>        field_names = [d[0] for d in cursor.description]
>        self._cache = dict(zip(field_names, results))
>      return self._cache
>
>    def __getattribute__(self, k):
>      try:
>        return object.__getattribute__(self, k)
>      except:
>        results = self._get_results()
>        return results[k]
>
>
> >
>


-- 
Honza Kr�l
E-Mail: [EMAIL PROTECTED]
ICQ#:   107471613
Phone:  +420 606 678585

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to