I have noticed similar comments / posts.
On Sat, Oct 30, 2010 at 10:44, Jamie H <[email protected]> wrote:
> I could be wrong but I believe I have heard 10 MB rpc calls are coming
> in a release soon.
>
> On Oct 30, 9:16 am, Stephen <[email protected]> wrote:
>> On Oct 30, 1:39 pm, Stephen <[email protected]> wrote:
>>> On Oct 29, 6:24 pm, Joshua Smith <[email protected]> wrote:
>>>> I'm running into a too-large exception when I bulk put a bunch of
>>>> entities. So obviously, I need to break up my puts into batches. I
>>>> want to do something like this pseudo code:
>>>>
>>>> size = 0
>>>> for o in objects:
>>>>     if size + o.size() > 1MB:
>>>>         db.put(list)
>>>>         size = 0
>>>>         list = []
>>>>     list.append(o)
>>>>     size += o.size()
>>>>
>>>> Any idea what I could use for the "o.size()" method? I could crawl
>>>> through all the fields and build up an estimate, but it seems likely to
>>>> me that there is a way to get the API-size of an entity more elegantly.
>>>
>>> How about something like:
>>>
>>> from google.appengine.api import datastore
>>> from google.appengine.runtime import apiproxy_errors
>>>
>>> def put_all(entities, **kw):
>>>     try:
>>>         return datastore.Put(entities, **kw)
>>>     except apiproxy_errors.RequestTooLargeError:
>>>         n = len(entities) // 2
>>>         a, b = entities[:n], entities[n:]
>>>         return put_all(a, **kw) + put_all(b, **kw)
>>
>> Although the general idea of the above code is to rely on the
>> apiproxy_stub to accurately measure rpc size and split if too big, if
>> you regularly try to put() large batch sizes you suffer the same
>> overhead already mentioned: converting from model to entity to
>> protobuf multiple times.
>> So how about something like this (untested...):
>>
>> import logging
>>
>> from google.appengine.api import datastore
>> from google.appengine.ext.db import Model
>> from google.appengine.runtime import apiproxy_errors
>>
>> def put_all(models, **kw):
>>     rpc = datastore.GetRpcFromKwargs(kw)
>>     models, multiple = datastore.NormalizeAndTypeCheck(models, Model)
>>     assert multiple
>>     entities = [model._populate_internal_entity(_entity_class=_CachedEntity)
>>                 for model in models]
>>     return _put_or_split(entities, rpc, **kw)
>>
>> def _put_or_split(entities, rpc, **kw):
>>     try:
>>         return datastore.Put(entities, rpc=rpc, **kw)
>>     except apiproxy_errors.RequestTooLargeError:
>>         n = len(entities) // 2
>>         a, b = entities[:n], entities[n:]
>>         logging.warn('batch put of %d entities failed,'
>>                      ' trying batches of %d and %d',
>>                      len(entities), len(a), len(b))
>>         return _put_or_split(a, rpc, **kw) + _put_or_split(b, rpc, **kw)
>>
>> class _CachedEntity(datastore.Entity):
>>     # Cache the protobuf so a retried batch doesn't re-serialize the entity.
>>     def _ToPb(self, **kw):
>>         if getattr(self, '_cached_pb', None) is None:
>>             self._cached_pb = super(_CachedEntity, self)._ToPb(**kw)
>>         return self._cached_pb

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
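The split-on-RequestTooLargeError pattern discussed in the thread can be sketched without the App Engine SDK. In the sketch below, `fake_put`, `MAX_BATCH`, and the local `RequestTooLargeError` are hypothetical stand-ins for `datastore.Put`, the 1 MB RPC payload limit, and `apiproxy_errors.RequestTooLargeError`; only the retry logic mirrors the thread.

```python
MAX_BATCH = 3  # hypothetical stand-in for the 1 MB RPC limit


class RequestTooLargeError(Exception):
    """Stand-in for apiproxy_errors.RequestTooLargeError."""


def fake_put(entities):
    # Stand-in backend: reject batches over the limit, otherwise
    # "write" them and return the written entities (as Put returns keys).
    if len(entities) > MAX_BATCH:
        raise RequestTooLargeError()
    return list(entities)


def put_all(entities):
    # Try the whole batch; on a too-large error, split in half and recurse.
    try:
        return fake_put(entities)
    except RequestTooLargeError:
        n = len(entities) // 2
        return put_all(entities[:n]) + put_all(entities[n:])


print(put_all(list(range(10))))  # all ten written, order preserved
```

The recursion terminates because each split halves the batch, and a single entity is always under the stand-in limit; the real code has the same shape, with the server deciding what is "too large".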

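Joshua's size-based batching pseudocode can likewise be sketched generically. `estimate_size` below is a hypothetical placeholder for the missing "o.size()" method; on App Engine one might measure the serialized protobuf (e.g. `entity._ToPb().ByteSize()`), but that call is an assumption, not something the thread confirms.

```python
MAX_BYTES = 1 << 20  # 1 MB RPC payload limit


def estimate_size(obj):
    # Hypothetical: stand-in for measuring an entity's serialized size.
    return len(repr(obj))


def batches(objects, limit=MAX_BYTES):
    # Yield lists of objects whose combined estimated size stays under
    # `limit`. An object larger than the limit still gets its own batch,
    # since an empty batch is never yielded early.
    batch, size = [], 0
    for o in objects:
        o_size = estimate_size(o)
        if batch and size + o_size > limit:
            yield batch
            batch, size = [], 0
        batch.append(o)
        size += o_size
    if batch:
        yield batch
```

Each yielded batch would then go to a single `db.put(batch)` call; unlike the split-on-error approach, this serializes (or estimates) each entity once up front.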