I could be wrong, but I believe I've heard that 10 MB RPC calls are coming in a release soon.
On Oct 30, 9:16 am, Stephen <[email protected]> wrote:
> On Oct 30, 1:39 pm, Stephen <[email protected]> wrote:
>
> > On Oct 29, 6:24 pm, Joshua Smith <[email protected]> wrote:
>
> > > I'm running into a too-large exception when I bulk put a bunch of
> > > entities. So obviously, I need to break up my puts into batches. I want
> > > to do something like this pseudo code:
>
> > > size = 0
> > > list = []
> > > for o in objects:
> > >     if size + o.size() > 1MB:
> > >         db.put(list)
> > >         size = 0
> > >         list = []
> > >     list.append(o)
> > >     size += o.size()
> > > db.put(list)
>
> > > Any idea what I could use for the "o.size()" method? I could crawl
> > > through all the fields and build up an estimate, but it seems likely to
> > > me that there is a way to get the API-size of an entity more elegantly.
>
> > How about something like:
>
> > from google.appengine.api import datastore
> > from google.appengine.runtime import apiproxy_errors
>
> > def put_all(entities, **kw):
> >     # Try the whole batch; if the RPC is too large, halve it and retry.
> >     try:
> >         return datastore.Put(entities, **kw)
> >     except apiproxy_errors.RequestTooLargeError:
> >         if len(entities) <= 1:
> >             raise  # a single entity is too large; splitting cannot help
> >         n = len(entities) // 2
> >         a, b = entities[:n], entities[n:]
> >         # list.extend() returns None, so concatenate the key lists instead
> >         return put_all(a, **kw) + put_all(b, **kw)
>
> Although the general idea of the above code is to rely on the
> apiproxy_stub to accurately measure rpc size and split if too big, if
> you regularly try to put() large batch sizes you suffer the same
> overhead already mentioned: converting from model to entity to
> protobuf multiple times.
>
> So how about something like this (untested...):
>
> import logging
>
> from google.appengine.api import datastore
> from google.appengine.ext.db import Model
> from google.appengine.runtime import apiproxy_errors
>
> def put_all(models, **kw):
>     rpc = datastore.GetRpcFromKwargs(kw)
>     models, multiple = datastore.NormalizeAndTypeCheck(models, Model)
>     assert multiple
>     # Convert each model to an entity once, so retries reuse the result.
>     entities = [
>         model._populate_internal_entity(_entity_class=_CachedEntity)
>         for model in models]
>     return _put_or_split(entities, rpc, **kw)
>
> def _put_or_split(entities, rpc, **kw):
>     try:
>         return datastore.Put(entities, rpc=rpc, **kw)
>     except apiproxy_errors.RequestTooLargeError:
>         if len(entities) <= 1:
>             raise  # a single entity is too large; splitting cannot help
>         n = len(entities) // 2
>         a, b = entities[:n], entities[n:]
>         logging.warning('batch put of %d entities failed,'
>                         ' trying batches of %d and %d',
>                         len(entities), len(a), len(b))
>         # list.extend() returns None, so concatenate the key lists instead
>         return (_put_or_split(a, rpc, **kw) +
>                 _put_or_split(b, rpc, **kw))
>
> class _CachedEntity(datastore.Entity):
>     def _ToPb(self, **kw):
>         # A single leading underscore avoids name mangling, which would
>         # make the getattr() lookup miss the cached value every time.
>         if getattr(self, '_cached_pb', None) is None:
>             self._cached_pb = super(_CachedEntity, self)._ToPb(**kw)
>         return self._cached_pb
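
For comparison, the size-based batching from the original pseudocode could be sketched roughly as below. This is only an illustration, not something from the SDK: batch_by_size, size_of and max_bytes are made-up names, and size_of stands in for whatever per-entity size estimate you settle on (for example, the encoded length of the entity's protobuf, assuming you can get at it cheaply).

```python
def batch_by_size(items, size_of, max_bytes):
    """Yield lists of items whose combined size_of() stays within max_bytes.

    size_of is a caller-supplied (hypothetical) size estimator. An item that
    is larger than max_bytes on its own is still yielded, alone in its batch.
    """
    batch, batch_size = [], 0
    for item in items:
        item_size = size_of(item)
        # Flush the current batch before adding an item that would overflow it.
        if batch and batch_size + item_size > max_bytes:
            yield batch
            batch, batch_size = [], 0
        batch.append(item)
        batch_size += item_size
    # Emit whatever is left over after the loop.
    if batch:
        yield batch
```

Each yielded batch could then be handed to a single put call. Note that the estimate only covers the entities themselves, so leaving some headroom below the real RPC limit is probably wise.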
