This isn't rocket science, guys. It's pretty silly for an API to enforce a limit when an application has no practical way to know whether it is exceeding it. Counting to 500 is easy. But capping a request at X bytes across the API, and then not providing an efficient "sizeof()" method, is a design flaw.
I'm going to live with the hack of simply writing my own sizeof() that does an overestimate. This will work, but it is suboptimal, and a total hack. (Heck, I'm assuming that the blob properties are going to serialize to about the len() of the blob content, but really, that isn't even guaranteed at the API level.)

-Joshua

On Oct 30, 2010, at 3:37 PM, Jeff Schwartz wrote:

> You have control over where, what and how you persist your data so you should
> be able to find a way to get around this issue. Since Google docs are pretty
> straightforward regarding datastore limits you could incorporate them into
> your code. As an example, don't batch put more than 500 entities at a time -
> break your puts up into smaller batches; if you are creating entities
> whose property sizes can exceed the 1 megabyte limit break up your entities
> into multiple entities or use the blob store; if you need to store more than
> 500 entries in a multi-value property break up the entity into multiple
> entities. Yes, it is a lot of work coding-wise but what other choice do you
> have? Even SQL has limits that we must live with so the datastore isn't
> really different in that regard.
>
> Good luck.
>
> Jeff
>
> On Sat, Oct 30, 2010 at 3:12 PM, Joshua Smith <[email protected]>
> wrote:
> I understand the cause of this error. Like I said, I have a bunch of large
> entities to push into the datastore, and I want to do it as efficiently as
> possible.
>
> But it seems there is no efficient way to find out how big an entity is going
> to be when crossing the API transom, so there is no way to do these puts
> optimally.
>
> For now, I've added a .size() method to my model, which generates an
> over-estimate using some heuristics. But that's a hack, and this really
> should happen under the covers.
>
> -Joshua
>
> On Oct 30, 2010, at 1:54 PM, Jeff Schwartz wrote:
>
>> The maximum size for an entity is 1 megabyte. The maximum number of entities
>> in a batch put or delete is 500.
>> These limits can be found at
>> http://code.google.com/appengine/docs/python/datastore/overview.html which
>> also provides information on other datastore limits.
>>
>> So it appears that you are hitting the 1 megabyte limit, either for the
>> total of all entities you are batch putting or for at least one of them.
>>
>> Try using logging while putting the entities individually to isolate and
>> report the offending entity. Catch the exception and dump whatever the
>> entity contains that will identify either where or how it was created in
>> your workflow.
>>
>> Jeff
>>
>> On Sat, Oct 30, 2010 at 1:10 PM, Joshua Smith <[email protected]>
>> wrote:
>> It was a lot of big entities. The exception said it was the size, not the
>> quantity.
>>
>> On Oct 30, 2010, at 9:51 AM, Jeff Schwartz wrote:
>>
>>> How many entities were there when the batch put failed?
>>>
>>> Was it the size of the entities or the number of entities that caused the
>>> batch put to fail?
>>>
>>> Jeff
>>>
>>> On Sat, Oct 30, 2010 at 8:39 AM, Stephen <[email protected]> wrote:
>>>
>>>
>>> On Oct 29, 6:24 pm, Joshua Smith <[email protected]> wrote:
>>> > I'm running into a too-large exception when I bulk put a bunch of
>>> > entities. So obviously, I need to break up my puts into batches. I want
>>> > to do something like this pseudo code:
>>> >
>>> > size = 0
>>> > batch = []
>>> > for o in objects:
>>> >     if batch and size + o.size() > 1MB:
>>> >         db.put(batch)
>>> >         size = 0
>>> >         batch = []
>>> >     batch.append(o)
>>> >     size += o.size()
>>> > if batch:
>>> >     db.put(batch)
>>> >
>>> > Any idea what I could use for the "o.size()" method? I could crawl
>>> > through all the fields and build up an estimate, but it seems likely to
>>> > me that there is a way to get the API-size of an entity more elegantly.
>>>
>>>
>>> How about something like:
>>>
>>>
>>> from google.appengine.api import datastore
>>> from google.appengine.runtime import apiproxy_errors
>>>
>>> def put_all(entities, **kw):
>>>     # Try the whole list; if the request is too large, split it in half.
>>>     try:
>>>         return datastore.Put(entities, **kw)
>>>     except apiproxy_errors.RequestTooLargeError:
>>>         if len(entities) <= 1:
>>>             raise  # a single entity is too large; splitting cannot help
>>>         n = len(entities) // 2
>>>         a, b = entities[:n], entities[n:]
>>>         # list.extend() returns None, so concatenate the results instead
>>>         return put_all(a, **kw) + put_all(b, **kw)
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "Google App Engine" group.
>>> To post to this group, send email to [email protected].
>>> To unsubscribe from this group, send email to
>>> [email protected].
>>> For more options, visit this group at
>>> http://groups.google.com/group/google-appengine?hl=en.
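
For the size-aware batching discussed in this thread, a minimal sketch might look like the following. It is only an illustration, not anything the SDK provides: batch_by_size and its size_of parameter are hypothetical names, size_of stands in for whatever over-estimate you write yourself, and the 1000000-byte default is an assumed stand-in for the 1 megabyte limit (you would probably want a safety margin below it). The 500-entity count limit mentioned above is enforced as well.

```python
def batch_by_size(items, size_of, max_bytes=1000000, max_count=500):
    """Yield lists of items whose estimated total size stays within max_bytes.

    size_of is a caller-supplied estimator (hypothetical; an over-estimate
    is the safe choice). max_bytes and max_count are assumptions based on
    the limits quoted in this thread. An item that exceeds max_bytes on its
    own is still yielded, alone, so the caller sees the failure for it.
    """
    batch, total = [], 0
    for item in items:
        n = size_of(item)
        # Flush the current batch if adding this item would break a limit.
        if batch and (total + n > max_bytes or len(batch) >= max_count):
            yield batch
            batch, total = [], 0
        batch.append(item)
        total += n
    if batch:
        yield batch  # flush whatever is left over


# Usage with plain strings standing in for entities and len() as the estimator.
blobs = ["a" * 400, "b" * 400, "c" * 400, "d" * 100]
batches = list(batch_by_size(blobs, len, max_bytes=1000))
```

With real entities you would loop over the batches and call db.put() on each one. Because the estimator is only an estimate, it may still be worth catching the too-large exception around each put as a fallback.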
