I'd like to find an efficient method to retrieve a list of Django DB
records (model has a DateTimeField `timestamp`) between datetime
'start' and datetime 'end' with a few conditions. First, we're looking
for a record once every 'interval' minutes. Second, I only want a
record if it's timestamp within 'fudge' minutes of the target time.
Last, if such a record doesn't exist for an interval, put None in the
list instead.

For example:
'start' = 2008-10-01 00:00:00
'end' = 2008-10-01 12:00:00
'interval' = 60 minutes
'fudge' = 5 minutes

I would expect to get thirteen items in a list (each either a record
or None).
The first record to save is the first record between 23:55:00 and
00:05:00. The second record to be saved would be the first record
betwen 00:55:00 and 01:05:00. And so on.

At the moment I have attacked this with a brute force approach by
processing a QuerySet of all records between 'start' and 'end'. Here
is the QuerySet:

qs = Weather.objects.filter(timestamp__year=date.year,
                          timestamp__month=date.month,
 
timestamp__day=date.day).order_by('timestamp')

Here is the processing code:

def periodic_samples(qs, start, fudge, interval, periods):
    '''
    Returns a list of arbitrary type records (containing attribute
'timestamp')
    from `qs`, one record for each target window during a total of
'periods'
    windows beginning at 'start'.

    `target` = `start` plus a (0 to `periods`) multiple of `interval`.
    A target window is defined as: `target`-`fudge` to `target`
+`fudge`.
    The first record found in a target window is saved in the list and
all other
    records in that window are ignored. If no record is found in the
target
    window then None is placed in the list instead of a record.

    For instance if `start`=12:00, `fudge`=5 minutes, `interval`=30
minutes,
    and `periods`=2, record timestamps must fall in the ranges 11:55 -
12:05
    and 12:25 - 12:35.

    Parameter types:
    `qs` = Queryset of records which have a "timestamp" field
    `start` = datetime.datetime
    `fudge` = datetime.timedelta
    `interval` = datetime.timedelta
    `periods` = integer

    '''
    dataset = []
    if len(qs):
        target = start
        end = start + (periods * interval)
        for rec in qs:
            if target >= end:
                break
            ts = rec.timestamp
            while ts > (target + fudge):
                dataset.append(None)
                target += interval
            if ts < (target - fudge):
                pass
            else:
                dataset.append(rec)
                target += interval
        # no more records, fill out the dataset with None values
        while target < end:
            dataset.append(None)
            target += interval
    return dataset

Regardless of my finicky needs as outlined above, does anyone know of
a method for efficiently extracting a bunch of records, say one record
an hour, over some range of time? It seems awfully inefficient to look
at each record in the queryset to find what I want.
---
Graham Ullrich
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to