I want to build a popularity metric on a tree of entities. Say my popularity
model looks like this:
    from google.appengine.ext import db

    class Popularity(db.Model):
        hourly = db.FloatProperty()
        daily = db.FloatProperty()
        weekly = db.FloatProperty()
        monthly = db.FloatProperty()
        yearly = db.FloatProperty()
Now say I want to aggregate the downloads of a set of files. Doing that
globally is easy: the key name is the filename (or some unique id for each
file if names are not unique) and I update the counter on each download.
When I want the most popular downloads in the last hour I can query for them
easily:
    Popularity.all(keys_only=True).order('-hourly').fetch(N)
returns the keys (and so the filenames) of the first N.
So far so good, except that each property gets 2 built-in indexes (ascending
and descending), which is overkill since I only need 1.
Now let's make the problem a bit more interesting. I want to aggregate
globally, but also for each region, country and city. So I need to set up a
"tree" of popularity metrics and update each of them individually. The key
names are filenames, so I cannot use the same key for all the popularity
entities. But I can give each one an ancestor whose key name is the path down
the tree (the ancestor entity need not actually exist, since only its key
matters). The key for the 'eu' counter will be
    db.Key.from_path('_DoesNotMatter', 'eu', 'Popularity', <filename>)
and for eu/Sweden it will be
    db.Key.from_path('_DoesNotMatter', 'eu/Sweden', 'Popularity', <filename>)
and so on. Now I can query for the most popular downloads in Sweden like so:
    Popularity.all(keys_only=True).order('-hourly').ancestor(
        db.Key.from_path('_DoesNotMatter', 'eu/Sweden')).fetch(N)
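To make the set of counters touched by one download concrete, here is a rough sketch in plain Python (no datastore calls). The helper names ancestor_names and popularity_key_paths are made up for illustration, and I am assuming the location arrives as a list such as ['eu', 'Sweden'].

```python
def ancestor_names(location):
    """Return the ancestor key name for every level of a location path.

    location is a list such as ['eu', 'Sweden', 'Stockholm'] (assumed shape).
    """
    return ['/'.join(location[:i]) for i in range(1, len(location) + 1)]


def popularity_key_paths(location, filename, ancestor_kind='_DoesNotMatter'):
    """Return one from_path-style argument tuple per counter to update."""
    # The global counter has no ancestor: its key is just (kind, filename).
    paths = [('Popularity', filename)]
    # One counter per level of the tree, each under its own ancestor key.
    for name in ancestor_names(location):
        paths.append((ancestor_kind, name, 'Popularity', filename))
    return paths


paths = popularity_key_paths(['eu', 'Sweden'], 'setup.exe')
```

Each tuple is meant to be expanded into a key with db.Key.from_path(*path), so one download in Sweden updates three entities: the global one, the 'eu' one and the 'eu/Sweden' one.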
But this will require a composite index on (ancestor, hourly descending). To
cover the hourly, daily, ... queries I will need 5 composite indexes on top
of the 10 built-in ones, for a total of 15 indexes per write/update.
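The arithmetic behind those numbers, spelled out as a small sketch. The function name indexes_per_write is hypothetical, and the count assumes every property keeps its 2 built-in indexes and that each ancestor query needs exactly 1 composite index per property.

```python
PERIODS = ['hourly', 'daily', 'weekly', 'monthly', 'yearly']
BUILTIN_INDEXES_PER_PROPERTY = 2  # (property, ascending) and (property, descending)


def indexes_per_write(composite_per_property):
    """Count the index entries touched when one Popularity entity is written."""
    builtin = len(PERIODS) * BUILTIN_INDEXES_PER_PROPERTY
    composite = len(PERIODS) * composite_per_property
    return builtin + composite


ancestor_design = indexes_per_write(composite_per_property=1)  # with ancestor queries
builtin_only = indexes_per_write(composite_per_property=0)     # no composite indexes
```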
The question is how to reduce the number of indexes required to do the
above. Since the model name of the popularity class does not matter at all,
I came up with the following:
eu popularity: db.Key.from_path('Popularity_eu', <filename>)
Sweden popularity: db.Key.from_path('Popularity_eu_Sweden', <filename>)
global popularity: db.Key.from_path('Popularity', <filename>)
This way I can use the following query for top N downloads in Sweden:
    db.Query('_'.join(['Popularity', 'eu', 'Sweden']),
             keys_only=True).order('-hourly').fetch(N)
This still requires 10 indexes (5 of them useless, but still better than 15),
but the number of kinds will be quite large (on the order of the number of
cities in the world).
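The kind naming scheme can be sketched in plain Python like this. The helper names kind_for and kinds_to_update are made up for illustration, and the sketch assumes no location component contains an underscore (otherwise two different paths could map to the same kind name).

```python
def kind_for(location, base='Popularity'):
    """Build the kind name for one level of the tree, e.g. Popularity_eu_Sweden."""
    return '_'.join([base] + list(location))


def kinds_to_update(location, base='Popularity'):
    """Return the kind of every counter a download in `location` must update."""
    # Prefixes of the path: [] is global, then region, country, city, ...
    return [kind_for(location[:i], base) for i in range(len(location) + 1)]


kinds = kinds_to_update(['eu', 'Sweden'])
```

For each kind in the list, the key to update would then be db.Key.from_path(kind, <filename>), and no ancestor is involved in the query.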
- alkis
2010/4/6 Nick Johnson (Google) <[email protected]>
> Hi Alkis,
>
> 2010/4/5 Alkis Evlogimenos ('Αλκης Ευλογημένος) <[email protected]>
>
>> Is there a limit in the number of model kinds per application? I am
>> considering a design where the number of models is going to be in the order
>> of 500k. I understand the model viewer in the admin console will be
>> completely unusable but other than that is there going to be a problem in
>> general?
>
>
> No. Other than the admin console datastore viewer likely timing out,
> there's no problem with doing this - though I can't think of a reason why
> you would actually want to do this.
>
> -Nick
>
>
>>
>> If that won't work, is there a plan to make unindexed properties able to
>> participate in composite indexes? There are cases where you only want an
>> (ancestor, property, descending) index and you do not care about the base
>> (property, ascending) and (property, descending) indexes but the current
>> design/api forces you to have at least 3 indexes where only 1 would just
>> work.
>>
>> - alkis
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Google App Engine" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected]<google-appengine%[email protected]>
>> .
>> For more options, visit this group at
>> http://groups.google.com/group/google-appengine?hl=en.
>>
>
>
>
> --
> Nick Johnson, Developer Programs Engineer, App Engine
> Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number:
> 368047