Hey Vinay Chitlangia,
Thanks for doing some preliminary troubleshooting and for linking that
interesting article. App Engine runs Nginx processes to route requests to
your application's handlers. Handlers for static assets, for instance, are
served directly by this Nginx process, bypassing the application
altogether to save precious application resources.
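For illustration, such a static handler is declared in app.yaml; the URL and directory below are invented for the example, and the exact syntax depends on which environment you are running in:

```yaml
# Hypothetical app.yaml fragment: requests under /static are served
# directly by the fronting infrastructure and never reach application code.
handlers:
- url: /static
  static_dir: static
```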
The Nginx process will often serve a *502* if the application raises an
exception, if an internal API call raises an exception, or if the request
simply takes too long. As such, the status code by itself does not tell us
much.
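As a side note, one way to make the first case (an application exception) easier to diagnose is to catch it inside the handler and return an explicit 500 with a logged cause; an uncaught exception that kills the request mid-flight leaves Nginx nothing to proxy, which is when clients see the bare 502 page. A minimal sketch, with class and method names invented for the example:

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

public class SafeHandler {
    private static final Logger LOG = Logger.getLogger(SafeHandler.class.getName());

    /** Minimal stand-in for an HTTP response: status code plus body. */
    public static final class Response {
        public final int status;
        public final String body;
        Response(int status, String body) { this.status = status; this.body = body; }
    }

    /** Runs the handler body; maps success to 200 and any exception to 500. */
    public static Response handle(Callable<String> body) {
        try {
            return new Response(200, body.call());
        } catch (Exception e) {
            // Logged here, so the failure is visible in the app logs even
            // though the client only sees a 5xx.
            LOG.log(Level.SEVERE, "handler failed", e);
            return new Response(500, "internal error");
        }
    }
}
```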
Looking at the GAE logs for your application, I found the *502*s you
mentioned. One thing I noticed is that they all come from the */read*
endpoint. Judging by the name, I assume this endpoint reads some data from
BigTable. To investigate further, could you provide some additional
information:
- What exactly is happening at the */read* endpoint? A code sample
would be ideal if that's not too sensitive.
- What kind of error handling exists in said endpoint if the BigTable
API returns non-success responses?
- Can you log the various steps in the */read* endpoint? This might help
identify how far a request gets before the *502* is served. It would also
help confirm that your application is actually receiving the request,
which I can't currently confirm from the logs.
- If said endpoint does in fact read from BigTable, which API and Java
client library are you using?
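To illustrate the logging suggestion above, here is a hedged sketch of per-stage tracing; the stage names and the helper class are invented for the example, not taken from your code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.logging.Logger;

public class ReadEndpointTracer {
    private static final Logger LOG = Logger.getLogger(ReadEndpointTracer.class.getName());
    private final List<String> stages = new ArrayList<>();

    /** Logs entry and exit around one stage so the logs show how far a request got. */
    public <T> T stage(String name, Supplier<T> work) {
        LOG.info("stage start: " + name);
        stages.add(name);
        T result = work.get();
        LOG.info("stage done: " + name);
        return result;
    }

    public List<String> completedStages() { return stages; }
}
```

Called as, say, `tracer.stage("parse-request", ...)` followed by `tracer.stage("bigtable-read", ...)`, the last "stage start" line without a matching "stage done" in the logs tells you where the request died before the *502* was served.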
Regarding the article you linked: while the configuration of an HTTPS load
balancer and nginx.conf can be very important, both the load-balancing
component and nginx.conf are out of the developer's hands on App Engine.
The scaling settings, health check settings, and handlers in your app.yaml
are the only controls you have that affect load balancing and nginx
behavior.
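For reference, the knobs I mean live in app.yaml and look roughly like this for the flexible environment; all values below are illustrative, not recommendations:

```yaml
# Illustrative app.yaml fragment for the flexible environment; tune the
# numbers for your own traffic pattern.
runtime: java
env: flex

automatic_scaling:
  min_num_instances: 2
  max_num_instances: 10
  cool_down_period_sec: 120
  cpu_utilization:
    target_utilization: 0.6

health_check:
  enable_health_check: true
  check_interval_sec: 5
  timeout_sec: 4
  unhealthy_threshold: 2
  healthy_threshold: 2
```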
On Wednesday, February 8, 2017 at 11:27:43 AM UTC-5, Vinay Chitlangia wrote:
>
> Might be related:
>
> https://blog.percy.io/tuning-nginx-behind-google-cloud-platform-http-s-load-balancer-305982ddb340#.6k2laoada
>
> The symptoms mentioned in this blog (somewhat moderate request rates, no
> logs) match our observations.
>
> I do not see the "backend_connection_closed_before_data_sent_to_client"
> status in the logs.
>
> The error message for a failed request received by the client is:
> 11:12:44.549com.yawasa.server.storage.RpcStorageService LogError:
> <html><head><title>502 Bad Gateway</title></head><body
> bgcolor="white"><center><h1>502 Bad
> Gateway</h1></center><hr><center>nginx</center></body></html> (
> RpcStorageService.java:137
> <https://console.cloud.google.com/debug/fromlog?appModule=default&appVersion=1&file=RpcStorageService.java&line=137&logInsertId=589569d9000e7bf6825479e4&logNanos=1486186963359794000&nestedLogIndex=0&project=village-test>
> )
>
> The mention of nginx in the log message appears promising. We are not
> using nginx deliberately, so I am assuming this is something happening
> under the hood.
>
> On Tuesday, February 7, 2017 at 11:08:55 AM UTC+5:30, Vinay Chitlangia
> wrote:
>>
>> Hi,
>> We are seeing intermittent occurrences of 502 Bad Gateway error in our
>> server.
>> About 0.5% requests fail with this error.
>>
>> Our setup is:
>> Flex running jetty9-compat
>> F1 machine
>> 1 server
>>
>> Our request pattern is bursty, so the server gets ~30 requests in
>> parallel. The failures, when they happen, are clustered: over a period of
>> about 10 seconds one would see 3-4 errors.
>>
>> The requests that complete successfully finish in 50-100 ms, so it does
>> not appear that the server is under major load and unable to keep up. To
>> rule out this possibility, I started the servers with 5 replicas.
>> However, the failure percentage did not change.
>>
>> From the looks of it, there seems to be some throttling or quota issue at
>> play. I tried tweaking the max-concurrent-requests param, setting it to
>> 300, but that did not make any difference either.
>>
>> I do not see new instances being created at the time of failure either.
>>
>>
>> The request log for the failed request:
>> 09:57:30.686  POST  502  262 B  4 ms  AppEngine-Google; (+
>> http://code.google.com/appengine; appid: s~village-test)  /read
>> 107.178.194.3 - - [07/Feb/2017:09:57:30 +0530] "POST /read HTTP/1.1" 502
>> 262 - "AppEngine-Google; (+http://code.google.com/appengine; ms=4
>> cpu_ms=0 cpm_usd=2.9279999999999998e-8 loading_request=0 instance=-
>> app_engine_release=1.9.48 trace_id=-
>> {
>> protoPayload: {…}
>> insertId: "58994cb30002335cb47fd364"
>> httpRequest: {…}
>> resource: {…}
>> timestamp: "2017-02-07T04:27:30.686052Z"
>> labels: {…}
>>
>> operation: {…}
>> }
>>
>> Looking at other logs from around the time of the failure, I see:
>> 09:57:30.000[error] 32#32: *35107 recv() failed (104: Connection reset by
>> peer) while reading response header from upstream, client: 169.254.160.2,
>> server: , request: "POST /read HTTP/1.1", upstream: "
>> http://172.17.0.4:8080/read", host: "bigtable-dev.appspot.com"
>> AFAICT this request never made it to our servlet.
>>
>
--
You received this message because you are subscribed to the Google Groups
"Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/google-appengine.
To view this discussion on the web visit
https://groups.google.com/d/msgid/google-appengine/ea48946b-fbd9-47af-a7b4-136493f0d583%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.