Nick, thank you for the response.
I have tens of thousands of records to load. If I load them all at once or "rate-limit" the load, won't I run out of the short-term quotas just the same? Or did you mean that I ought to "rate-limit" my load over a number of days or weeks?

I am trying to determine whether, with large datasets, GAE is an adequate platform on which to host the application I have in mind. Currently I am doing an evaluation; I have not yet built the application because I want to know first whether GAE has adequate performance. I have already rewritten the client-side code that extracts the data from the protocol layer and achieved a 20% performance increase over the shipped 1.2.2 SDK on the production GAE server (my new code was only 12% to 15% faster on the local development server, so 20% was unexpected). So performance is critical for me: performance against large datasets.

I don't know if the Python bulkloader would be an improvement. I ship the data up as CSV blocks, which are parsed into Entities and then stored. Pretty simple.

Concerning the speed of deleting existing data: you suggested using key-only queries. In my initial email, to which you responded, I had a short code snippet where, indeed, I set the query to use keys only. So, was that code incorrect?

Richard Emberson

Nick Johnson (Google) wrote:
> Hi Richard,
>
> You're running into short term quotas, which are designed to prevent
> you exhausting your entire quota for the day in one go. You need to
> rate-limit your bulk loading code, and/or pay for additional quota.
> Even enabling billing without setting a high limit will increase your
> short term quotas automatically.
>
> You should also look at your bulk loading code and make sure it's as
> efficient as possible. One possibility is to use the Python
> bulkloader.
>
> As far as deletion goes, make sure you are doing key-only queries to
> get the key to delete, which will save on CPU time and timeouts.
>
> -Nick Johnson
>
> On Wed, Jul 15, 2009 at 12:11 AM, richard
> emberson<[email protected]> wrote:
>>
>> So, once again, I've tried to upload some data.
>>
>> After a couple, I guess, thousand records I start
>> getting HttpServletResponse.SC_FORBIDDEN from
>> the app engine server.
>>
>> On the Dashboard it says:
>>
>> Your application is exceeding a quota: CPU Time
>> Your application is exceeding a quota: Datastore CPU Time
>>
>> but under Resources, CPU Time usage is at 34%
>> and Stored Data usage is at 4%.
>>
>> I am trying to develop an application on GAE.
>> I will need to load tens of thousands or
>> a couple of hundred thousand entities as part
>> of testing the application. I will then want
>> to delete those entities.
>>
>> Currently, I can only load a couple of hundred
>> before the app engine starts rejecting additional
>> uploads. And I cannot delete any of them - I
>> keep getting timeouts - even if I try to delete only
>> 10.
>>
>> Is there some uploads-per-minute quota or something?
>> And what's the magic to delete stuff?
>>
>> The following code causes timeouts:
>>
>> DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
>> final Query q = new Query(kindName);
>> q.setKeysOnly();
>>
>> final Iterable<Entity> entities = ds.prepare(q).asIterable(
>>     FetchOptions.Builder.withLimit(count));
>> KeyIterable ki = new KeyIterable(entities);
>> ds.delete(ki);
>> int numberDeleted = ki.getCount();
>> return numberDeleted;
>>
>> Richard
>>
>> --
>> Quis custodiet ipsos custodes

--
Quis custodiet ipsos custodes
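P.S. Since the quotas in question are short-term burst limits, Nick's advice amounts to chunking the work and pausing between chunks rather than spreading it over days. Below is a minimal sketch of that batching logic in plain Java. The batch size of 500, the one-second pause, and the commented-out `ds.put(...)` call site are assumptions for illustration, not measured or documented values; the real datastore calls from the snippet above would replace the comments.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {

    // Split a list of parsed records into datastore-sized batches.
    static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<List<T>>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }

    // Rate-limited load loop: each batch would become one datastore put,
    // with a pause between calls so the short-term quota can recover.
    // batchSize=500 and pauseMs=1000 are assumed starting points to tune.
    static <T> int loadRateLimited(List<T> records, int batchSize, long pauseMs)
            throws InterruptedException {
        int stored = 0;
        for (List<T> batch : toBatches(records, batchSize)) {
            // ds.put(convertToEntities(batch));  // hypothetical datastore call
            stored += batch.size();
            Thread.sleep(pauseMs);  // back off before the next burst
        }
        return stored;
    }
}
```

The same loop shape would apply to the delete path: fetch keys with the keys-only query in pages of a few hundred, call `ds.delete(batch)`, pause, and repeat until the query returns nothing, instead of deleting everything in one request that then times out.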
