Nick,

Thank you for the response.

I have tens of thousands of records to load. Whether I load
them all at once or "rate-limit" the load, won't I run
through the short-term quotas just the same? Or did you mean
that I ought to "rate-limit" my load over a number of days
or weeks?
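For concreteness, the client-side rate-limiting I could add would
look roughly like the sketch below; the batch size and
records-per-second target are guesses on my part, not documented
GAE limits, and uploadBatch() is a placeholder for my csv POST:

```java
import java.util.List;

// Sketch of client-side rate limiting: send records in small
// batches and sleep between batches so the sustained rate stays
// under a target. Both numbers are assumptions, not GAE limits.
public class RateLimitedLoader {
    private final int batchSize;
    private final long millisBetweenBatches;

    public RateLimitedLoader(int recordsPerSecond, int batchSize) {
        this.batchSize = batchSize;
        this.millisBetweenBatches = (batchSize * 1000L) / recordsPerSecond;
    }

    // How many upload calls a record list will be split into.
    public int batchCount(int totalRecords) {
        return (totalRecords + batchSize - 1) / batchSize;
    }

    // Upload one batch at a time, pausing between batches.
    public void load(List<String> records) throws InterruptedException {
        for (int i = 0; i < records.size(); i += batchSize) {
            uploadBatch(records.subList(i,
                    Math.min(i + batchSize, records.size())));
            Thread.sleep(millisBetweenBatches);
        }
    }

    protected void uploadBatch(List<String> batch) {
        // placeholder: POST this csv block to the app
    }
}
```
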

I am trying to determine whether, with large datasets, GAE
is an adequate platform for hosting the application I have
in mind. Currently, I am doing an evaluation: I have not yet
built the application because I first want to know whether
GAE's performance is adequate.

I have already rewritten the client-side code that extracts
the data from the protocol layer and achieved a 20%
performance increase over the shipped 1.2.2 SDK on the
production GAE server (my new code was only 12% to 15%
faster on the local development server, so 20% was
unexpected). So, performance is critical for me -
performance against large datasets.

I don't know whether the Python bulkloader would be an
improvement. I ship the data up as csv blocks which are
parsed into Entities and then stored. Pretty simple.
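For reference, the server-side csv handling is essentially this
sketch; the column layout is hypothetical, and in the real handler
each String[] would become a datastore Entity:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of csv-block parsing: split the uploaded block
// into lines, and each line into fields. In the real handler each
// String[] is turned into an Entity and stored.
public class CsvBlockParser {
    public static List<String[]> parse(String block) {
        List<String[]> rows = new ArrayList<String[]>();
        for (String line : block.split("\r?\n")) {
            if (line.trim().isEmpty()) continue;  // skip blank lines
            rows.add(line.split(",", -1));  // keep trailing empty fields
        }
        return rows;
    }
}
```
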

Concerning the speed of deleting existing data: you
suggested using keys-only queries. In my initial email that
you responded to, I had a short code snippet where, indeed,
I set the query to use keys only. So, was that code
incorrect?
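If the problem is the size of each delete call rather than the
query itself, I could chop the keys-only result into fixed-size
chunks and pass each chunk to ds.delete(...) in its own request,
along these lines; the 500-key chunk size is my assumption about a
reasonable batch cap, not a number I have verified:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: partition keys into fixed-size chunks so that no single
// delete call is large enough to time out. Each chunk would then
// be passed to ds.delete(chunk) in its own request.
public class KeyBatcher {
    public static <K> List<List<K>> chunks(List<K> keys, int chunkSize) {
        List<List<K>> out = new ArrayList<List<K>>();
        for (int i = 0; i < keys.size(); i += chunkSize) {
            out.add(new ArrayList<K>(
                    keys.subList(i, Math.min(i + chunkSize, keys.size()))));
        }
        return out;
    }
}
```
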

Richard Emberson


Nick Johnson (Google) wrote:
> Hi Richard,
> 
> You're running into short term quotas, which are designed to prevent
> you exhausting your entire quota for the day in one go. You need to
> rate-limit your bulk loading code, and/or pay for additional quota.
> Even enabling billing without setting a high limit will increase your
> short term quotas automatically.
> 
> You should also look at your bulk loading code and make sure it's as
> efficient as possible. One possibility is to use the Python
> bulkloader.
> 
> As far as deletion goes, make sure you are doing key-only queries to
> get the key to delete, which will save on CPU time and timeouts.
> 
> -Nick Johnson
> 
> On Wed, Jul 15, 2009 at 12:11 AM, richard
> emberson<[email protected]> wrote:
>>
>> So, once again, I've tried to upload some data.
>>
>> After a couple, I guess, thousand records I start
>> getting HttpServletResponse.SC_FORBIDDEN from
>> the app engine server.
>>
>> On the Dashboard it says:
>>
>> Your application is exceeding a quota: CPU Time
>> Your application is exceeding a quota: Datastore CPU Time
>>
>> but under Resource, CPU Time usage is at 34%
>> and Stored Data usage is at 4%.
>>
>> I am trying to develop an application on GAE.
>> I will need to load tens of thousands or
>> a couple of hundred thousand entities as part
>> of testing the application. I will then want
>> to delete those entities.
>>
>> Currently, I can only load a couple of hundred
>> before the app engine starts rejecting additional
>> uploads. And I cannot delete any of them - I
>> keep getting timeouts - even if I try to delete only
>> 10.
>>
>> Is there some upload-per-minute quota or something?
>> And, what's the magic to delete stuff?
>>
>> The following code causes timeouts:
>>
>>     DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
>>     final Query q = new Query(kindName);
>>     q.setKeysOnly();
>>
>>     final Iterable<Entity> entities = ds.prepare(q).asIterable(
>>                 FetchOptions.Builder.withLimit(count));
>>     KeyIterable ki = new KeyIterable(entities);
>>     ds.delete(ki);
>>     int numberDeleted = ki.getCount();
>>     return numberDeleted;
>>
>>
>>
>> Richard
>>
>> --
>> Quis custodiet ipsos custodes
>>
> 
> 
> 

-- 
Quis custodiet ipsos custodes

You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en