Hey Josh,

It seems you got some pretty good answers in the Stack Overflow thread. I'll add my thoughts:
- You can make a feature request in the public issue tracker <https://code.google.com/p/googleappengine/issues/list> with an explanation of your use-case if you'd like to see something implemented.
- You can also look into using Datastore <https://cloud.google.com/datastore/docs/concepts/overview?hl=en> to store the temporary results of your process, since it has better rate-limiting quotas than Cloud Storage, which isn't really meant for rapid writes like this. You could also look into Bigtable <https://cloud.google.com/bigtable/docs/>, or any number of distributed stores such as memcached <http://memcached.org/>, to resolve your issue of temporary file storage.

I hope this has helped. Feel free to ask any questions you may have, or to go ahead and create a feature request / quota increase request in the public issue tracker.

Best wishes,
Nick

On Wednesday, July 29, 2015 at 2:09:04 PM UTC-4, Josh Whelchel (Loudr) wrote:
>
> We use Cloud Storage to store large Elasticsearch results (from
> aggregations - so scan+scroll isn't going to work here).
>
> To handle these large aggregations in parallel, we store them as multiline
> JSON dumps sourced from a managed VM.
>
> As a result, to perform *parallel processing*, many *App Engine* instances
> will open this file at once and *hit the URLFetch rate limit* because of
> this documented limitation:
>
>> and the calls count against your URL fetch quota, as the library uses the
>> URL Fetch service to interact with Cloud Storage.
>
> - https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/
>
> *Here's the resulting exception:*
>
> <https://lh3.googleusercontent.com/-WbU1UiwCB2s/VbkWhCRDMjI/AAAAAAAAAH0/Ta3WBGEC0n0/s1600/Screenshot%2B2015-07-28%2B17.07.40.png>
>
> *Here's the code that opens the file:*
>
> import cloudstorage as gcs
>
> def open_file(path, mode, **kwargs):
>     f = gcs.open(path, mode=mode, **kwargs)
>     if not f:
>         raise Exception("File could not be opened: %s" % path)
>     return f
>
> --
>
> We need a method of communicating with Cloud Storage that bypasses the
> URLFetch quotas and rate limits, or it becomes impossible for us to
> effectively execute parallel processing.
>
> *Is there a method of reading GCS files from App Engine that does not
> route through URLFetch*, much like the Datastore API does not incur URL
> fetch rate limits?
>
> I've detailed this question on Stack Overflow as well:
> http://stackoverflow.com/questions/31707961/urlfetch-rate-limits-with-google-cloud-storage
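For anyone hitting the same limit while a quota increase or feature request is pending, one stopgap (it does not fix the underlying quota) is to retry rate-limited opens with jittered exponential backoff; the cloudstorage library also ships its own RetryParams mechanism, but a generic wrapper looks roughly like the sketch below. The retry predicate, delays, and attempt counts here are illustrative assumptions, not part of any Google API:

```python
import random
import time


def retry_with_backoff(fn, is_retryable, max_attempts=5, base_delay=0.2):
    """Call fn(), retrying retryable failures with jittered exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            if attempt == max_attempts - 1 or not is_retryable(exc):
                raise
            # Sleep base_delay * 2^attempt plus jitter, to de-synchronize
            # parallel instances that are all hammering the same file.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Each parallel worker would then call something like retry_with_backoff(lambda: gcs.open(path), is_quota_error), where is_quota_error is a hypothetical predicate matching the exception type shown in the screenshot above.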
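Another angle on Josh's setup: rather than every App Engine instance opening the same multiline JSON dump (where each buffered read translates into URLFetch calls), the managed VM could write one smaller object per worker, so each instance only reads its own shard. A minimal sketch of the shard-assignment side, assuming a hypothetical naming scheme like results-00000.json:

```python
def shard_lines(lines, num_shards):
    """Split a list of JSON lines round-robin into num_shards payloads,
    one per parallel worker, so each worker reads a smaller object."""
    shards = [[] for _ in range(num_shards)]
    for i, line in enumerate(lines):
        shards[i % num_shards].append(line)
    return ["\n".join(s) for s in shards]


def shard_object_name(prefix, shard_index):
    # Hypothetical naming scheme; any unique per-shard object name works.
    return "%s-%05d.json" % (prefix, shard_index)
```

The VM would write each payload to its shard_object_name, and worker k would open only its own object, cutting the per-instance URLFetch traffic roughly by the shard count.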
