[google-appengine] Re: Alternate to mapreduce

'Jordan (Cloud Platform Support)' via Google App Engine Fri, 21 Jul 2017 13:24:28 -0700

You can instead use push Task Queues 
<https://cloud.google.com/appengine/docs/standard/python/taskqueue/> to 
replicate a Map-Reduce job. Have a master method shard the job and 'Map' 
the shards to other instances by enqueuing one task per shard. The 
instances that accept the shards (aka tasks) perform the work and write 
their results to the Datastore. The master method then polls the Datastore 
on a timer until all shards have finished computing, and finally performs 
the 'Reduce' phase.
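
The flow above can be sketched roughly as follows. This is only an 
in-process illustration, not App Engine code: a deque stands in for the 
push queue, a dict stands in for the Datastore, and every name here 
(shard, start_job, map_worker, try_reduce, run_job) is made up for the 
example. The per-shard work (a sum of squares) is a placeholder too. On 
App Engine, the append to the queue would be a task enqueue and 
map_worker would be the task's request handler.

```python
import time
from collections import deque


def shard(items, num_shards):
    """Split items into at most num_shards contiguous chunks."""
    size = max(1, -(-len(items) // num_shards))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def start_job(job_id, items, num_shards, task_queue):
    """Master, step 1: shard the job and enqueue one task per shard."""
    shards = shard(items, num_shards)
    for shard_id, chunk in enumerate(shards):
        # Stand-in for enqueuing a push task on App Engine.
        task_queue.append((job_id, shard_id, chunk))
    return len(shards)


def map_worker(task, datastore):
    """Worker: do the shard's work and record the result."""
    job_id, shard_id, chunk = task
    # Placeholder computation; the dict write stands in for a Datastore put.
    datastore[(job_id, shard_id)] = sum(x * x for x in chunk)


def try_reduce(job_id, expected_shards, datastore):
    """Master, step 2: reduce only once every shard has reported."""
    results = [v for (jid, _), v in datastore.items() if jid == job_id]
    if len(results) < expected_shards:
        return None  # some shards are still running
    return sum(results)


def run_job(items, num_shards, poll_seconds=0.01):
    """Simulate the whole job: enqueue, let workers run, poll, reduce."""
    task_queue, datastore = deque(), {}
    expected = start_job("job-1", items, num_shards, task_queue)
    while True:
        if task_queue:
            # Simulates an instance picking up one pending task.
            map_worker(task_queue.popleft(), datastore)
        result = try_reduce("job-1", expected, datastore)
        if result is not None:
            return result
        time.sleep(poll_seconds)  # the 'looped timer' from the text
```

Calling run_job(list(range(10)), 3) splits the ten numbers into three 
shards and returns the combined sum of squares once all three results 
are present. A real master would also want a deadline, so that a failed 
shard does not leave it polling forever.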


You can also experiment with deploying different services (aka separate 
groups of instances) and sharding (aka enqueuing tasks) across these 
services, so that a shard never waits in a pending queue for an available 
instance. Alternatively, you can lower the min-pending-latency of a single 
service to achieve the same goal: less time spent in the pending queue and 
earlier creation of new instances, for a faster Map-Reduce. 
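
One simple way to spread shards across several services is round-robin 
assignment. The sketch below only computes which service each shard would 
go to. The function name and the service names in the usage note are 
hypothetical, and the sketch assumes the chosen name would then be 
supplied as the task's target when it is enqueued.

```python
from itertools import cycle


def assign_shards_to_services(shard_ids, services):
    """Pair each shard id with a service name, cycling through services."""
    if not services:
        raise ValueError("at least one service is required")
    # zip stops when shard_ids is exhausted; cycle repeats services forever.
    return list(zip(shard_ids, cycle(services)))
```

For example, five shards over the two hypothetical services "worker-a" 
and "worker-b" alternate between them, with "worker-a" receiving shards 
0, 2 and 4.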

- As for performing Dataflow work via the Console, I am not aware of this 
becoming available any time soon. If this is a real showstopper for you, I 
recommend filing a feature request 
<https://cloud.google.com/support/docs/issue-trackers> with them, 
describing your exact use case in detail. 

