Hello,

The Cloud Datastore SLA <https://cloud.google.com/datastore/sla> deliberately leaves many of the questions posed here unanswered: it's extremely hard to predict whether downtime will happen all at once or intermittently, since such events are by nature mostly unplanned. Indeed, a quick glance at previous incidents <https://status.cloud.google.com/summary#cloud-datastore> reveals that both kinds have occurred in the past year. When designing your application, it's probably better to abstract away such unknowns and implement general fail-safe mechanisms. For instance, if a write fails, you can catch the Datastore exception and enqueue a task to retry it later.
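To make the "catch and enqueue" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `TransientStoreError` stands in for whatever exception your Datastore client actually raises, `retry_queue` is an in-memory stand-in for a durable task queue, and `write_fn` is whatever function performs your write. It is not tied to any real client library API.

```python
import collections


class TransientStoreError(Exception):
    """Hypothetical stand-in for the exception your Datastore client raises."""


# Hypothetical in-memory stand-in for a durable task queue.
# A real app would use a persistent queue so deferred writes survive restarts.
retry_queue = collections.deque()


def write_with_fallback(write_fn, entity):
    """Try the write once; if it fails, defer it instead of failing the request.

    Returns True if the write succeeded now, False if it was queued for later.
    """
    try:
        write_fn(entity)
        return True
    except TransientStoreError:
        retry_queue.append(entity)
        return False


def drain_retry_queue(write_fn):
    """Retry each deferred write once; writes that fail again stay queued.

    Returns the number of writes that succeeded on this pass.
    """
    succeeded = 0
    # Only look at the items queued when the pass starts, so a write that
    # fails and is re-queued is not retried again in the same pass.
    for _ in range(len(retry_queue)):
        entity = retry_queue.popleft()
        try:
            write_fn(entity)
            succeeded += 1
        except TransientStoreError:
            retry_queue.append(entity)
    return succeeded
```

The request handler would call `write_with_fallback` and can tell the user the change was accepted either way, while a background job calls `drain_retry_queue` periodically. Note that this only works if the deferred writes are safe to apply late and more than once.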
That being said, given the small downtime budget allocated for Cloud Datastore (and its generally reliable track record), issues are more commonly caused by an implementation that doesn't follow the general best practices <https://cloud.google.com/datastore/docs/best-practices> or by sub-optimal design <https://cloud.google.com/appengine/articles/scalability>. You will gain more in overall reliability for your app by giving those topics proper attention during development.

On Friday, April 26, 2019 at 12:21:50 PM UTC-4, dir Ls wrote:
>
> Cloud datastore has 99.95% monthly uptime SLA for multi-region which
> translates to slightly above 20 minutes per month. Is this downtime likely
> to happen all at once or intermittently? What kind of errors are to be
> expected during the downtime? I am trying to figure out the strategy
> required to be put in place on how the app should respond to end users
> during the downtime. Would it be possible that it works for data related to
> some users but not the others at a given time? I am looking for a best
> practice guidance for an app that is expected to be usable 24/7 with
> graceful downgrading based on the underlying services. For example, if the
> downtime is intermittent, users might just reload the page and won't even
> know something wrong happened. But if the downtime is prolonged, explicitly
> displaying that the system is currently inaccessible and asking them to
> visit after sometime might be better.
>
