[google-appengine] Re: 1000 mcycles to update a single entity

Josh Heitzman Thu, 16 Oct 2008 12:22:46 -0700

I've experimented with pickling and found that about 810 mcycles
are consumed getting the entity (just one BlobProperty), unpickling,
pickling, and then putting the entity.  Of those mcycles, most (~730)
go into the put operation.
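
For anyone who wants to separate the serialization cost from the put
cost, here is a minimal sketch that times only the pickle/unpickle
round trip using the standard library.  The sample dict is a
hypothetical stand-in for the real BlobProperty payload, and wall-clock
time is not the same thing as the mcycles reported in the GAE logs, so
treat the result as a rough indication only.

```python
import pickle
import timeit

# Hypothetical stand-in for the data kept in the BlobProperty.
sample = {"name": "example", "scores": list(range(15)), "tags": ["a", "b", "c"]}


def round_trip(value, protocol=pickle.HIGHEST_PROTOCOL):
    """Pickle and then unpickle value, returning the reconstructed copy."""
    return pickle.loads(pickle.dumps(value, protocol))


# Time only the serialization step; the datastore get and put are not included.
calls = 1000
seconds = timeit.timeit(lambda: round_trip(sample), number=calls)
print("pickle round trip: %.6f s per call" % (seconds / calls))
```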


This leaves only 190 mcycles to actually do the work of parsing the
request, processing data, and constructing a result.  Even if you
manage that, you can only ever put one entity per request and stay
under 1000 mcycles, and I found that putting a second entity basically
doubles the get, pickling, and put times.
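
The arithmetic can be sketched as follows, using the 1000 mcycle soft
cap and the ~810 mcycle per-entity figure from above.  The function
names are illustrative, and the linear scaling is an assumption based
only on the observation that a second entity roughly doubles the cost.

```python
SOFT_CAP = 1000        # mcycles per request (the soft cap discussed in this thread)
PER_ENTITY_COST = 810  # measured get + unpickle + pickle + put for one entity


def remaining_budget(entities, per_entity=PER_ENTITY_COST, cap=SOFT_CAP):
    """Mcycles left for request handling, assuming cost scales linearly."""
    return cap - entities * per_entity


def max_entities(per_entity=PER_ENTITY_COST, cap=SOFT_CAP):
    """Largest number of entities that fits under the cap with no other work."""
    return cap // per_entity


print(remaining_budget(1))  # 190
print(remaining_budget(2))  # -620, already over the cap
print(max_entities())       # 1
```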

The sad thing is the actual runtime of putting multiple entities isn't
very long.  I've seen 15000 mcycles consumed by a request that took
less than two seconds to actually get rendered in my browser.

So this 1000 mcycle soft cap seems out of whack with the reality of
how many mcycles are consumed by putting data into the store.

While the soft cap can be gotten around by breaking up the processing
into multiple requests behind the scenes, this will result in higher
mcycle consumption to process the user command as well as longer
response time, which isn't good for my bottom line (since we'll be
paying for overall mcycle consumption) or for user satisfaction, as
anyone on a high-latency connection will wait many seconds longer for
a page to render.  For example, someone on a satellite connection will
probably see about a 4 second delay for every chunk the processing is
broken into.

On Oct 16, 12:20 am, Andy Freeman <[EMAIL PROTECTED]> wrote:
> Does this do what you'd like?
>
> import copy
> import pickle
>
> from google.appengine.ext import db
>
> # should work on any picklable datatype
> # tested with both dicts and lists, works with unicode keys, data,
> # elements
>
> # empty() is arguably wrong, but what's correct?
> # choices probably doesn't make any sense here, but ....
> # readable=True is for folks who want to look at pickle encoded data.
> class PickleProperty(db.Property):
>     def __init__(self, readable=False, *args, **kwds):
>         if readable:
>             self._readable = 0
>             self.data_type = db.Text
>         else:
>             self._readable = -1
>             self.data_type = db.Blob
>         super(PickleProperty, self).__init__(*args, **kwds)
>
>     # as with db.ListProperty, don't want any static sharing
>     def default_value(self):
>         return copy.deepcopy(self.default)
>
>     # if value is true or the same type as the default value, it is
>     # assumed to be reasonable even if required is set.
>     def empty(self, value):
>         return not (value or (isinstance(value, type(self.default))))
>
>     def get_value_for_datastore(self, model_instance):
>         v = super(PickleProperty, self).get_value_for_datastore(
>             model_instance)
>         r = pickle.dumps(v, self._readable)
>         return self.data_type(r)
>
>     def make_value_from_datastore(self, value):
>         v = super(PickleProperty, self).make_value_from_datastore(
>             value)
>         r = pickle.loads(str(v))
>         return r
>
> class XX(db.Model):
>     data = PickleProperty(default={})
>
> xx = XX()
> xx.data['a'] = 7
> xx.put()
> yy = XX.get(xx.key())
> assert xx.data == yy.data
>
> On Oct 15, 12:25 am, Josh Heitzman <[EMAIL PROTECTED]> wrote:
>
> > There are no indexes in index.yaml for these entity kinds and not very
> > many of the properties are being changed at one time (no idea if that
> > matters or not).
>
> > If updating the implicit indexes is the majority of the cost of doing
> > these updates, then I definitely agree that either--
> > 1) an attribute for disabling the implicit indexing of properties
> > should be added, or
> > 2) native serialization needs to be provided as part of the runtime so
> > we can quickly (de)serialize our data (from)into a blob.
>
> > On Oct 15, 12:04 am, djidjadji <[EMAIL PROTECTED]> wrote:
>
> > > For this entity at least 44 (10+1+2+15+1+15) index updates have to be done
> > > in 16 different index tables (10+1+2+1+1+1). Every attribute has its
> > > implicit index
> > > and you get an implicit index for the 'product' of the property lists.
> > > Not to mention the index tables mentioned in the index.yaml file that
> > > this entity uses.
> > > It can grow big when you have the ListProperties used in the
> > > index.yaml file, 15 extra updates
> > > for every mention of the string list property.
>
> > > > I'm sure not every property of an entity is used in a query to
> > > > retrieve objects.
> > > To reduce the number of index updates it could be useful to have a
> > > non-index version of every property type. Just like we have for the
> > > StringProperty. The TextProperty does not have an index to be updated.
>
> > > > A possible syntax to tell AppEngine NOT to create and update an index
> > > > for a property would be to add an attribute to the Property constructor.
> > > > The default value of the attribute is True.
>
> > > > class MyModel(db.Model):
> > >   id = db.IntegerProperty(required=True)
> > >   num1 = db.IntegerProperty(need_index=False)
>
> > > > This would also help avoid hitting the entity-index-update-limit
> > > > ('exploding' index) as often.
>
> > > Are the index updates counted in the mcycles used?
>
> > > 2008/10/15 Josh Heitzman <[EMAIL PROTECTED]>:
>
> > > > Regarding the first question, those mcycle numbers are from logs on
> > > > > GAE, not from local profiling.  But if you mean are lots of people
> > > > > using it, no.  I was the only user with any data when I did the test.
>
> > > > Regarding the second question, the entities are not what I would
> > > > consider large. For example, one has 10 integer properties, 1 string
> > > > property, 2 datetime properties, one string list property (15 strings
> > > > > with none more than 30 characters long), and one int list property (only 1
> > > > value at the moment).
>
> > > > The entity group had 4 entities in it when I generated those numbers.
>
> > > > There is no contention involved, as the data is user specific and I
> > > > was the only user with data when I did the test.
>
> > > > On Oct 14, 8:31 pm, "David Symonds" <[EMAIL PROTECTED]> wrote:
> > > > >> On Wed, Oct 15, 2008 at 1:50 PM, Josh Heitzman <[EMAIL PROTECTED]> wrote:
> > > > >> > Actually, I'm seeing it take about 1500 mcycles to update one entity
> > > > >> > and then about an additional 1000 mcycles per additional entity (each
> > > > >> > a different kind in this case) that is updated via the same db.put
> > > > >> > call.
>
> > > >> Is this in production? What size is the entity? Is it in a large
> > > >> entity group? How much contention do you think is involved?
>
> > > > >> Dave.