Oops, the task queue code should be:

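keys = gen_keys(document)  # Builds a list of db.Key instances based on the document
indexes = db.get(keys)  # One entity (or None) per key, in the same order as keys
upserts = []
for key, index in zip(keys, indexes):
    if index is None:
        # No index entry for this key yet: create one holding this document
        upserts.append(I(key=key, v=array('I', [document_id])))
    elif document_id not in index.v:
        # Existing entry that does not list this document yet
        index.v.append(document_id)
        upserts.append(index)
db.put(upserts)  # Write only the new or changed entities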
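
For anyone who wants to step through the logic without the datastore, here is a minimal sketch of the same get / modify / put pattern. It assumes a plain dict in place of the datastore; `store`, `update_index` and the "term:..." keys are made-up names for illustration, not part of the App Engine API.

```python
from array import array

# Hypothetical in-memory stand-in for the datastore:
# maps an index key to an array of unsigned document ids.
store = {}

def update_index(store, keys, document_id):
    """Add document_id to the entry for every key and return how many
    entries had to be created or changed (i.e. would need a put)."""
    written = 0
    for key in keys:
        ids = store.get(key)
        if ids is None:
            # No entry yet: create one holding just this document
            store[key] = array('I', [document_id])
            written += 1
        elif document_id not in ids:
            # Entry exists but does not list this document yet
            ids.append(document_id)
            written += 1
    return written

first = update_index(store, ["term:a", "term:b"], 1)
second = update_index(store, ["term:b", "term:c"], 2)
repeat = update_index(store, ["term:b", "term:c"], 2)
```

Running the same document through twice writes nothing the second time, which is the point of the `not in` check. Note that the check is a linear scan over the array.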

On 6 January 2011 19:36, Donovan <[email protected]> wrote:
> Hi,
>
> I'm using a very simple model to store arrays of document ids for an
> inverted index based on 3 million documents.
>
> class I(db.Model):
>    v=ArrayProperty(typecode="I",required=True)
>
> which uses:
>
> http://appengine-cookbook.appspot.com/recipe/store-arrays-of-numeric-values-efficiently-in-the-datastore/
>
> I have a simple task queue that includes the following piece of logic
> which loops 3,000 times a day, for new incoming documents which
> generate on average 3,500 keys each, to update the index:
>
> keys = gen_keys(document)  # Builds a list of db.Key instances based on the document
> indexes=db.get(keys)
> upserts=[]
> for i,key in enumerate(indexes):
>    if indexes[i] is None:
>        upserts.append(I(key=keys[i],v=array('I',[document_id])))
>    elif news_article_id not in indexes[i].v:
>         indexes[i].v.append(document_id)
>         upserts.append(indexes[i])
> db.put(upserts)
>
> This loop leads to datastore CPU usage of 48 hours per 1000 documents
> which means a daily spend of $16.80 just for the datastore updates,
> which seems quite expensive given how something like Kyoto Cabinet
> running on conventional hosting could easily deal with this load. Does
> anyone have any ideas for minimizing the datastore CPU usage? My hunch
> is that the datastore CPU usage is a bit overpriced :(
>
> Cheers,
> Donovan.
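
For scale, a rough back-of-envelope using only the figures quoted above. The variable names are mine, and the per-hour rate is simply what the two quoted numbers imply when divided, not a published price.

```python
# Figures taken from the quoted message
docs_per_day = 3000
keys_per_doc = 3500
cpu_hours_per_1000_docs = 48
daily_spend_usd = 16.80

# Index entities fetched (and possibly written) per day
entities_per_day = docs_per_day * keys_per_doc

# Datastore CPU hours per day at the quoted rate of consumption
cpu_hours_per_day = cpu_hours_per_1000_docs * docs_per_day / 1000.0

# Average datastore CPU time attributed to each key, in milliseconds
cpu_ms_per_entity = cpu_hours_per_day * 3600 * 1000 / entities_per_day

# Price per CPU hour implied by the quoted daily spend
implied_usd_per_cpu_hour = daily_spend_usd / cpu_hours_per_day

print(entities_per_day, cpu_hours_per_day)
print(round(cpu_ms_per_entity, 1), round(implied_usd_per_cpu_hour, 4))
```

That works out to roughly 10.5 million index entities and 144 datastore CPU hours a day, or a little under 50 ms of billed CPU per key.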
