Same issue here :/ It seems to me that db.put() uses a huge amount of CPU time. Is there a way to avoid this, or does anyone know why db.put() counts toward "CPU Time" instead of "Datastore CPU Time"?
On 6 Jan., 20:36, Donovan <[email protected]> wrote:
> Hi,
>
> I'm using a very simple model to store arrays of document ids for an
> inverted index based on 3 million documents.
>
> class I(db.Model):
>     v = ArrayProperty(typecode="I", required=True)
>
> which uses:
>
> http://appengine-cookbook.appspot.com/recipe/store-arrays-of-numeric-...
>
> I have a simple task queue that includes the following piece of logic,
> which loops 3,000 times a day for new incoming documents. Each document
> generates on average 3,500 keys to update in the index:
>
> keys = gen_keys(document)  # Builds a list of db.Key instances based on the document
> indexes = db.get(keys)
> upserts = []
> for i, key in enumerate(indexes):
>     if indexes[i] is None:
>         upserts.append(I(key=keys[i], v=array('I', [document_id])))
>     elif document_id not in indexes[i].v:
>         indexes[i].v.append(document_id)
>         upserts.append(indexes[i])
> db.put(upserts)
>
> This loop leads to datastore CPU usage of 48 hours per 1,000 documents,
> which means a daily spend of $16.80 just for the datastore updates.
> That seems quite expensive, given that something like Kyoto Cabinet
> running on conventional hosting could easily deal with this load. Does
> anyone have any ideas for minimizing the datastore CPU usage? My hunch
> is that the datastore CPU usage is a bit overpriced :(
>
> Cheers,
> Donovan.
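
For anyone who wants to poke at the upsert logic without a datastore, here is a minimal sketch of the quoted loop in plain Python. It is not Donovan's actual code: `plan_upserts` and the `store` dict are hypothetical names, and the dict only stands in for db.get()/db.put(). It shows which entities really need rewriting, since each key that is skipped is one less entity in the put.

```python
from array import array


def plan_upserts(keys, fetched, document_id):
    """Return the (key, postings) pairs that actually need to be written.

    keys    -- index keys generated for one document
    fetched -- postings arrays aligned with keys; None where no entity exists
    """
    upserts = []
    for key, postings in zip(keys, fetched):
        if postings is None:
            # No entity for this key yet: create one holding only this document.
            upserts.append((key, array('I', [document_id])))
        elif document_id not in postings:
            # Existing entity that lacks this document: append and rewrite it.
            postings.append(document_id)
            upserts.append((key, postings))
        # Otherwise the entity already lists the document, so no write is needed.
    return upserts


# In-memory stand-in for the datastore (an assumption for illustration only).
store = {"apple": array('I', [1, 2]), "pear": array('I', [7])}
keys = ["apple", "pear", "plum"]

fetched = [store.get(k) for k in keys]   # plays the role of db.get(keys)
writes = plan_upserts(keys, fetched, 7)
store.update(writes)                     # plays the role of db.put(upserts)
```

Here "pear" already lists document 7, so only "apple" and "plum" end up in the write batch. Whether this changes the billed CPU time on App Engine is something you would have to measure.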
