Hi, Doug.

I've written an app that does something similar: it keeps lists of
tags on persisted objects.  You may run into exploding indices and
performance problems, depending on how many objects are in your
datastore and how many tags each list holds on average.

I propose a different design.  Keep tags in lists on your objects (or
"documents"), but create a new model, "bucket."  Each bucket stores a
list of keys to documents, and when you create a bucket, you name its
key after the tag it represents.

Consider indexing a document for a particular tag.  Do two things:  
store that tag on the list of tags in the document object, then insert  
that document's key into the document key list stored in the bucket  
corresponding to that tag (creating that bucket if it doesn't already  
exist).
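
A minimal sketch of that indexing step, using plain Python dicts in
place of the datastore.  The names documents, buckets and
index_document are made up for illustration; they are not the names
used in my app, and a real version would read and write datastore
entities instead.

```python
# In-memory stand-ins for the datastore; all names here are hypothetical.
documents = {}  # document key -> {"tags": [...]}
buckets = {}    # tag (doubles as the bucket's key) -> list of document keys


def index_document(doc_key, tag):
    """Record `tag` on the document and the document's key in the tag's bucket."""
    doc = documents.setdefault(doc_key, {"tags": []})
    if tag not in doc["tags"]:
        doc["tags"].append(tag)

    # Create the bucket on first use, then add the document key once.
    bucket = buckets.setdefault(tag, [])
    if doc_key not in bucket:
        bucket.append(doc_key)


index_document("doc1", "moe")
index_document("doc1", "curly")
index_document("doc2", "moe")
print(buckets)
```

Indexing the same document for the same tag twice changes nothing,
which keeps the bucket's key list free of duplicates.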

Consider querying for all documents corresponding to a particular  
tag.  Simply get the bucket corresponding to that tag by key (super  
fast!), then resolve that bucket's list of document keys to document  
objects.  :-)
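
Here is a rough sketch of that lookup, again with plain Python dicts
standing in for the datastore.  The data and the function names are
hypothetical.  The second function, which intersects several buckets
to answer an "AND" query like yours, is my assumption about how you
might extend the idea; it is not something described above.

```python
# Hypothetical sample data: dicts stand in for datastore entities.
documents = {
    "doc1": {"title": "First", "tags": ["moe", "curly"]},
    "doc2": {"title": "Second", "tags": ["moe"]},
}
buckets = {"moe": ["doc1", "doc2"], "curly": ["doc1"]}


def documents_for_tag(tag):
    """Fetch the tag's bucket by key, then resolve its keys to documents."""
    keys = buckets.get(tag, [])
    return [documents[k] for k in keys if k in documents]


def documents_for_all_tags(tags):
    """Return documents whose keys appear in every requested tag's bucket."""
    key_lists = [buckets.get(tag, []) for tag in tags]
    if not key_lists:
        return []
    common = set(key_lists[0]).intersection(*key_lists[1:])
    # Walk the first bucket so results keep its order.
    return [documents[k] for k in key_lists[0] if k in common and k in documents]


print([d["title"] for d in documents_for_tag("moe")])
print([d["title"] for d in documents_for_all_tags(["moe", "curly"])])
```

A tag with no bucket simply yields an empty result, so no special
case is needed for unknown tags.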

You're scratching the surface of search engine design.  This design  
has worked well for me and may work well for you too.  For a reference  
implementation, see the code for my app:

Example models ("document" corresponds to "Bookmark" and "bucket"  
corresponds to "Keychain"):
http://code.google.com/p/grab-it/source/browse/trunk/models.py

Example querying (the _search_bookmarks_generic function):
http://code.google.com/p/grab-it/source/browse/trunk/logic.py#114

Example indexing and unindexing (the _index_bookmark and  
_unindex_bookmark methods):
http://code.google.com/p/grab-it/source/browse/trunk/handlers.py#316

Have-a-lot-of-fun-ly yours,
Raj  ;-)

On Aug 28, 2009, at 11:56 AM, Doug wrote:

>
> Hi,
>
> I'm very new to GAE and was curious if there is a performance penalty
> with regards to queries and exploding indices, etc. etc... if one
> implements tags (a list of category names for a record if you will) as
> dynamic properties vs.  just list<db.Category>.
>
> As a dynamic property I was thinking a possibility would be:
> obj.tag_moe="moe"
> obj.tag_curly="curly"
> (Of course, this assumes I can specify the name of a dynamic tag at
> creation time)
>
> so if I wanted to query a database for tags "moe" and "curly", could I
> do:
> SELECT * FROM myModel WHERE tag_moe = "moe" AND tag_curly = "curly"
> ORDER BY date
>
>
> Or... as a list of tags a possibility would be:
> obj.tags = ["moe", "curly"]
>
> and the query
> SELECT * FROM myModel WHERE tags = "moe" AND tags = "curly" ORDER BY
> date
>
>
> Thanks for any insight
> -d
>
