Hi team,

This may be related to this topic:
<https://groups.google.com/forum/#!searchin/google-appengine/Unexpected$20error$20during$20VM$20startup%7Csort:date/google-appengine/hLn3bW7ix4Y/bBS-_qVyCQAJ>

We've recently experienced delays deploying our PHP flexible app, where 
"gcloud app deploy" times out waiting for the new instance to start. Until 
now we have managed this by manually deleting the failing VM and starting 
the version again, since it only affected deployments. Today we discovered 
that our production application hit the same problem during what appears 
to be scheduled maintenance.

Digging a little deeper, this entry appeared in our log at 06:16:33 this 
morning:

2018-02-26 06:16:33.396 GMT
App Engine
vm_administrative_task_started
default:29-1
appengine-admin-nore...@google.com
Starting periodic restart of VM version.

Followed by:

2018-02-26 06:17:12.579 GMT
App Engine
vm_administrative_task_started
default:29-1
appengine-admin-nore...@google.com
Restarting batch of VMs as part of rolling restart.

The VM running our production application then proceeded to shut down 
following this log entry:

2018-02-26 06:17:20.469 GMT
------------------------VM shutdown initiated-----------------------

After this point, when the VM tries to start up again, it appears to go 
into a *perpetual restart cycle*. Note the following error messages:

<https://lh3.googleusercontent.com/-NejSOyDok8w/WpP1L-RmJQI/AAAAAAAAAA0/4Nh--iAAda0XdSOldb32aZO0NrpdPvZDACLcBGAs/s1600/unexpected_error_during_vm_startup.png>

We had no idea the system was down until a user reported the issue to us, 
which is very bad for us. We've learnt the following from this:
- We clearly need an uptime check. We've now set up Stackdriver to do this 
for us.
- Our flexible environment currently uses only a single instance. We 
clearly need to add more for redundancy.
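
For anyone wanting an independent probe alongside Stackdriver, the uptime 
check idea above can be sketched roughly as follows. This is only a minimal 
illustration, not what we run in production: the function names (is_up, 
consecutive_failures) and the injectable opener parameter are hypothetical, 
and the URL you pass in would be your own app's health endpoint.

```python
import urllib.error
import urllib.request


def is_up(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if `url` answers with a 2xx status within `timeout` seconds.

    `opener` is a hypothetical injection point so the check can be tested
    without touching the network; it defaults to urllib's urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False


def consecutive_failures(results):
    """Count failed checks at the end of `results` (booleans, oldest first)."""
    count = 0
    for ok in reversed(results):
        if ok:
            break
        count += 1
    return count
```

Alerting only after a few consecutive failures, rather than on the first 
one, is one way to avoid being paged for a single dropped request.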

We initially thought that our App Engine issues were isolated to 
deployment and did not affect ongoing maintenance, which is clearly not 
the case.

Having fallen back to our previous production version (which thankfully did 
start), I still can't start our latest version on the platform. This is 
causing us a lot of concern (and headache).

Will post some logs shortly.

Thanks for looking into this.

Karl
