On Wed, Jul 15, 2009 at 6:14 PM, richard
emberson<[email protected]> wrote:
>
> Nick,
>
> Thank you for the response.
>
> I have tens of thousands of records to load. If I load them
> all at once or "rate-limit" the load, won't I run out of the
> short-term quotas just the same? Or did you mean that I
> ought to "rate-limit" my load over a number of days or weeks?

The problem you're running into is that you're loading so rapidly
you're hitting the very short-term quotas intended to prevent you from
consuming your entire daily quota at once. If you rate-limit enough,
you can avoid hitting the short-term quotas while still staying within
the daily quotas. Whether you need to rate-limit enough to spread the
load over more than one day, or buy extra quota, is another issue.
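For example, the client-side throttle can be as simple as sleeping between
batched requests. A rough sketch (the batch size and request rate here are
illustrative numbers, not actual App Engine limits, and sendBatch stands in
for whatever upload call you're making):

```java
public class RateLimitedUploader {
    static final int BATCH_SIZE = 50;              // records per request (illustrative)
    static final double BATCHES_PER_SECOND = 2.0;  // target request rate (illustrative)

    // Uploads totalRecords in batches, sleeping between requests so we
    // never exceed BATCHES_PER_SECOND. Returns the number of batches sent.
    static int uploadAll(int totalRecords) throws InterruptedException {
        long pauseMillis = (long) (1000.0 / BATCHES_PER_SECOND);
        int batches = (totalRecords + BATCH_SIZE - 1) / BATCH_SIZE;
        for (int i = 0; i < batches; i++) {
            // sendBatch(i) would POST one block of records here.
            if (i < batches - 1) {
                Thread.sleep(pauseMillis);
            }
        }
        return batches;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(uploadAll(120)); // 120 records -> prints 3
    }
}
```

Tuning BATCHES_PER_SECOND down spreads the same load over a longer wall-clock
window, which is exactly what keeps you under the short-term quota.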

>
> I am trying to determine if with large datasets, GAE is an
> adequate platform onto which the application I have in mind
> can be hosted. Currently, I am doing an evaluation. I've
> not yet built the application because I want to know if
> GAE has adequate performance.
>
> I have already rewritten the client-side code that
> extracts the data from the protocol layer and achieved a
> 20% performance increase over the shipped 1.2.2 SDK on the
> production GAE server (my new code was only 12% to 15%
> faster on the local development server, so 20% was unexpected).
> So, performance is critical for me - performance against
> large datasets.
>
> I don't know if the Python bulkloader will be an improvement.
> I ship the data up as CSV blocks which are parsed into Entities
> and then stored. Pretty simple.

The Python bulk loader does all the translation into entities on the
client side, and then uses remote_api to send the encoded data over.
That inevitably uses less server-side CPU than parsing the data
yourself on the server. The main reason I recommended the Python bulk
loader, though, is that it has support for concurrency and
rate-limiting built right in.
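For reference, the server-side work you're paying CPU for on every request is
roughly this kind of parsing. A generic sketch (the field handling is
simplified - no quoting or escaping - and the Entity construction itself is
omitted, since that part is specific to your kinds):

```java
import java.util.ArrayList;
import java.util.List;

public class CsvBlockParser {
    // Splits one uploaded CSV block into rows of fields. In the real
    // handler, each row would be turned into an Entity and the whole
    // list stored with a single batched put.
    static List<String[]> parseBlock(String block) {
        List<String[]> rows = new ArrayList<>();
        for (String line : block.split("\n")) {
            if (!line.isEmpty()) {
                rows.add(line.split(",", -1)); // -1 keeps trailing empty fields
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> rows = parseBlock("a,b,c\nd,e,f\n");
        System.out.println(rows.size());    // prints 2
        System.out.println(rows.get(1)[2]); // prints f
    }
}
```

With the Python bulk loader, that per-row work happens on your own machine
instead, so the request handler only pays for the datastore writes.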

>
> Concerning the speed of deleting existing data: you suggested
> using key-only queries. In my initial email that you responded
> to, I had a short code snippet where, indeed, I set the
> query to use keys only. So, was the code incorrect?

Sorry, I didn't read the snippet in enough detail.
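One other thing that usually helps with delete timeouts, independent of
keys-only queries, is deleting in small batches rather than one large call.
The partitioning itself is generic - a sketch (the batch size of 10 is
illustrative; each sub-list would go to its own ds.delete() call):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchDelete {
    // Splits a list of keys into batches of at most batchSize, so each
    // datastore delete call stays small enough to finish before timing out.
    static <T> List<List<T>> partition(List<T> keys, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += batchSize) {
            batches.add(new ArrayList<>(
                keys.subList(i, Math.min(i + batchSize, keys.size()))));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 23; i++) keys.add(i);
        // In a real handler: for each batch, call ds.delete(batch).
        System.out.println(partition(keys, 10).size()); // prints 3
    }
}
```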

-Nick Johnson

>
> Richard Emberson
>
>
> Nick Johnson (Google) wrote:
>> Hi Richard,
>>
>> You're running into short-term quotas, which are designed to prevent
>> you from exhausting your entire quota for the day in one go. You need
>> to rate-limit your bulk loading code, and/or pay for additional quota.
>> Even enabling billing without setting a high limit will increase your
>> short-term quotas automatically.
>>
>> You should also look at your bulk loading code and make sure it's as
>> efficient as possible. One possibility is to use the Python
>> bulkloader.
>>
>> As far as deletion goes, make sure you are doing key-only queries to
>> get the key to delete, which will save on CPU time and timeouts.
>>
>> -Nick Johnson
>>
>> On Wed, Jul 15, 2009 at 12:11 AM, richard
>> emberson<[email protected]> wrote:
>>>
>>> So, once again, I've tried to upload some data.
>>>
>>> After a couple thousand records, I guess, I start
>>> getting HttpServletResponse.SC_FORBIDDEN from
>>> the App Engine server.
>>>
>>> On the Dashboard it says:
>>>
>>> Your application is exceeding a quota: CPU Time
>>> Your application is exceeding a quota: Datastore CPU Time
>>>
>>> but under Resource, CPU Time usage is at 34%
>>> and Stored Data usage is at 4%.
>>>
>>> I am trying to develop an application on GAE.
>>> I will need to load tens of thousands or
>>> a couple of hundred thousand entities as part
>>> of testing the application. I will then want
>>> to delete those entities.
>>>
>>> Currently, I can only load a couple of hundred
>>> before App Engine starts rejecting additional
>>> uploads. And I cannot delete any of them - I
>>> keep getting timeouts - even if I try to delete only
>>> 10.
>>>
>>> Is there some uploads-per-minute quota or something?
>>> And what's the magic to delete stuff?
>>>
>>> The following code causes timeouts:
>>>
>>>     // Keys-only query: fetch just the keys, not the full entities.
>>>     DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
>>>     final Query q = new Query(kindName);
>>>     q.setKeysOnly();
>>>
>>>     final Iterable<Entity> entities = ds.prepare(q).asIterable(
>>>                 FetchOptions.Builder.withLimit(count));
>>>     // KeyIterable wraps the entities, yielding each entity's key and
>>>     // counting them as they are consumed by delete().
>>>     KeyIterable ki = new KeyIterable(entities);
>>>     ds.delete(ki);
>>>     int numberDeleted = ki.getCount();
>>>     return numberDeleted;
>>>
>>>
>>>
>>> Richard
>>>
>>> --
>>> Quis custodiet ipsos custodes
>>>
>>
>>
>>
>
> --
> Quis custodiet ipsos custodes
>
>



-- 
Nick Johnson, App Engine Developer Programs Engineer
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration
Number: 368047
