I could be wrong, but I believe I've heard that 10 MB RPC calls are
coming in a release soon.

On Oct 30, 9:16 am, Stephen <[email protected]> wrote:
> On Oct 30, 1:39 pm, Stephen <[email protected]> wrote:
>
>
>
> > On Oct 29, 6:24 pm, Joshua Smith <[email protected]> wrote:
>
> > > I'm running into a too-large exception when I bulk put a bunch of 
> > > entities.  So obviously, I need to break up my puts into batches.  I want 
> > > to do something like this pseudo code:
>
> > > size = 0
> > > for o in objects:
> > >   if size + o.size() > 1MB:
> > >     db.put(list)
> > >     size = 0
> > >     list = []
> > >   list.append(o)
>
> > > Any idea what I could use for the "o.size()" method?  I could crawl 
> > > through all the fields and build up an estimate, but it seems likely to 
> > > me that there is a way to get the API-size of an entity more elegantly.
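The batching loop in the pseudocode above can be written generically. A minimal sketch, assuming a caller-supplied `size_of` function (on App Engine you would plug in something that measures the encoded entity; `len(pickle.dumps(o))` in the usage below is only a stand-in estimate, not the real API size):

```python
import pickle

MAX_BATCH_BYTES = 1000000  # roughly the 1 MB API limit

def batch_by_size(objects, size_of, limit=MAX_BATCH_BYTES):
    """Yield lists of objects whose estimated total size stays under limit.

    A single object larger than the limit is still yielded alone (and
    would have to be handled or rejected by the caller).
    """
    batch, total = [], 0
    for o in objects:
        s = size_of(o)
        if batch and total + s > limit:
            yield batch
            batch, total = [], 0
        batch.append(o)
        total += s
    if batch:
        yield batch

# Usage sketch: size_of = lambda o: len(pickle.dumps(o)), then
# for batch in batch_by_size(objects, size_of): db.put(batch)
```

Note this only estimates; the authoritative size check still happens server-side, which is why the split-on-exception approaches below are a useful complement.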
>
> > How about something like:
>
> > from google.appengine.api import datastore
> > from google.appengine.runtime import apiproxy_errors
>
> > def put_all(entities, **kw):
> >     try:
> >         return datastore.Put(entities, **kw)
> >     except apiproxy_errors.RequestTooLargeError:
> >         n = len(entities) // 2
> >         a, b = entities[:n], entities[n:]
> >         return put_all(a, **kw) + put_all(b, **kw)
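The split-and-retry pattern above can be exercised without the SDK. In this sketch, `fake_put` is a hypothetical stand-in for `datastore.Put` that rejects batches over a byte budget, and `RequestTooLargeError` is redefined locally so the snippet is self-contained:

```python
class RequestTooLargeError(Exception):
    pass

def fake_put(entities, limit=10):
    # Stand-in for datastore.Put: reject batches whose total size
    # exceeds the limit, otherwise return one fake key per entity.
    if sum(len(e) for e in entities) > limit:
        raise RequestTooLargeError()
    return [id(e) for e in entities]

def put_all(entities):
    """Put a batch, recursively halving it whenever it is too large."""
    try:
        return fake_put(entities)
    except RequestTooLargeError:
        if len(entities) <= 1:
            raise  # a single oversize entity cannot be split further
        n = len(entities) // 2
        return put_all(entities[:n]) + put_all(entities[n:])
```

Note the base case: without it, a single entity over the limit would recurse forever. The real `datastore.Put` returns a list of keys for a multiple-entity call, so concatenating the two halves' results with `+` preserves that shape.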
>
> Although the general idea of the code above is to rely on the
> apiproxy stub to measure the RPC size accurately and split when it is
> too big, if you regularly put() large batches you suffer the same
> overhead already mentioned: converting from model to entity to
> protobuf multiple times.
>
> So how about something like this (untested...):
>
> import logging
>
> from google.appengine.api import datastore
> from google.appengine.ext.db import Model
> from google.appengine.runtime import apiproxy_errors
>
> def put_all(models, **kw):
>     rpc = datastore.GetRpcFromKwargs(kw)
>     models, multiple = datastore.NormalizeAndTypeCheck(models, Model)
>     assert multiple
>     entities = [model._populate_internal_entity(_entity_class=_CachedEntity)
>                 for model in models]
>     return _put_or_split(entities, rpc, **kw)
>
> def _put_or_split(entities, rpc, **kw):
>     try:
>         return datastore.Put(entities, rpc=rpc, **kw)
>     except apiproxy_errors.RequestTooLargeError:
>         n = len(entities) // 2
>         a, b = entities[:n], entities[n:]
>         logging.warn('batch put of %d entities failed,'
>                      ' trying batches of %d and %d',
>                      len(entities), len(a), len(b))
>         return _put_or_split(a, rpc, **kw) + _put_or_split(b, rpc, **kw)
>
> class _CachedEntity(datastore.Entity):
>     def _ToPb(self, **kw):
>         # Use a single underscore: a double-underscore name would be
>         # mangled on assignment but not inside the getattr() string,
>         # so the cache would never be hit.
>         if getattr(self, '_cached_pb', None) is None:
>             self._cached_pb = super(_CachedEntity, self)._ToPb(**kw)
>         return self._cached_pb
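The caching idea behind `_CachedEntity` is just memoizing an expensive serialization so repeated put attempts reuse the first encoding. A generic sketch, with a hypothetical `_do_encode` standing in for `Entity._ToPb`:

```python
class CachedEncoder(object):
    """Memoize an expensive encode so retries reuse the first result."""

    def __init__(self, payload):
        self.payload = payload
        self._cached = None

    def encode(self):
        # Encode once; subsequent calls return the cached bytes.
        if self._cached is None:
            self._cached = self._do_encode()
        return self._cached

    def _do_encode(self):
        # Stand-in for the real (expensive) protobuf serialization.
        return repr(self.payload).encode('utf-8')
```

With this in place, a split-and-retry loop that calls `encode()` on every attempt only pays the serialization cost once per object.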

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en.