Re: Does Tomcat/Java get around the problem of 64K maximum client source ports?
On 27.03.2020 21:39, Eric Robinson wrote:

FYI, I don't have 1800 tomcat instances on one server. I have about 100 instances on each of 18 servers.

When one of these (attempted) connections fails, do you not get some error message which gives a clue as to what the failure is due to? (There should be a log somewhere, no?)

Also, just for info: in the past, I have run into problems under Linux (no more connections accepted, neither incoming nor outgoing) whenever the actual number of TCP connections went above a certain number (maybe it was 64K). A TCP connection goes through a number of states (which you see with a netstat display), such as "ESTABLISHED" but also "TIME_WAIT", "CLOSE_WAIT" etc. In some of these states, the connection no longer has any link to any process, but the connection still counts against the limit (of the OS/TCP stack).

The case I'm talking about was a bit like yours: a webapp running under tomcat was making a connection to a remote host, but this connection was wrapped inside an object of some kind. When the webapp no longer needed the connection, it just discarded the wrapping object, which was left without references to it, and was thus a candidate for destruction at some point. But the discarded object never explicitly closed the underlying connection.

Over a period of time, this left an accumulation of (no longer used) connections in the "CLOSE_WAIT" state (closed by the remote host side, but not by the webapp side), which just sat there until a GC happened, at which time the destruction of these objects really happened, and some implicit close was done at the OS level, which eliminated these pending underlying CLOSE_WAIT connections. And since the available heap was quite large, it took a long time before a GC happened, which allowed such CLOSE_WAIT connections to accumulate in the hundreds or thousands before being "recycled". Until a certain number was reached, and then the host became all but unreachable and very slow.
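To illustrate the leak described above, here is a minimal Java sketch of the opposite habit: closing the wrapped connection explicitly rather than waiting for a GC to do it. The names `RemoteClient` and `ExplicitCloseSketch` are hypothetical stand-ins for the "wrapping object" and the webapp code; they are not taken from the application discussed in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

// Hypothetical wrapper, standing in for the "object of some kind" that hides the connection.
final class RemoteClient implements AutoCloseable {
    private final Socket socket;

    RemoteClient(Socket socket) {
        this.socket = socket;
    }

    boolean isConnected() {
        return socket.isConnected();
    }

    // Closing the wrapper closes the underlying socket, so the OS can release it
    // immediately instead of leaving it in CLOSE_WAIT until a GC runs.
    @Override
    public void close() throws IOException {
        socket.close();
    }
}

final class ExplicitCloseSketch {
    // Uses the wrapper inside try-with-resources and reports whether the
    // underlying socket is closed once the block has been left.
    static boolean useAndClose(Socket socket) {
        try (RemoteClient client = new RemoteClient(socket)) {
            client.isConnected(); // stand-in for the real request/response exchange
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return socket.isClosed();
    }
}
```

With this pattern the close happens when the block is left, including when an exception is thrown, so it does not depend on heap size or GC timing.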
That was a long time ago, and thus a lot of Java versions and Linux versions ago, so maybe something has changed since then to avoid such a situation. But maybe you are also suffering from some similar phenomenon. You could try to use netstat some more, and when you are having the problem, you should count ALL the TCP connections, including the ones in CLOSE_WAIT, and just check that you do not have an obscene number of them in total. There is definitely some limit past which the OS starts acting funny.

(Note: unlike for TIME_WAIT e.g., there is no time limit for a connection in the CLOSE_WAIT state; it will stay in that state as long as the client side has not explicitly closed it, in some kind of zombie half-life.)

See e.g.: https://users.cs.northwestern.edu/~agupta/cs340/project2/TCPIP_State_Transition_Diagram.pdf
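As a rough aid for the counting suggested above, the following Java sketch tallies TCP connections per state from netstat output. It assumes the usual Linux `netstat -ant` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); the class name `TcpStateCounter` is hypothetical, and the caller is expected to supply the output lines.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class TcpStateCounter {
    // Counts connections per TCP state from lines in `netstat -ant` format.
    // Header lines and non-TCP rows are skipped.
    static Map<String, Integer> countStates(List<String> netstatLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : netstatLines) {
            String[] cols = line.trim().split("\\s+");
            // Assumed layout: Proto Recv-Q Send-Q Local Foreign State
            if (cols.length >= 6 && cols[0].startsWith("tcp")) {
                counts.merge(cols[5], 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

A large CLOSE_WAIT count that keeps growing between samples would be consistent with the unclosed-connection pattern described in this message.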
Re: HttpServletRequest.getRemoteAddr() sometimes returns NULL on Tomcat 9.0.30 and HTTP/2 secure requests
Hi Mark,

we're now on the latest 9.0.33 release and we still see this issue intermittently in our logs, only on HTTP/2 secure requests. Please see the attached access logs (these represent all the cases for one whole day on a single high-volume server).

Some of the following request fields are NULL (or -1) in these examples:

- remoteAddr
- remotePort
- serverPort
- requestURI
- User-Agent

Some requests are missing some of the fields; other requests are missing others. What is particularly interesting is that the errors are clustered around particular timestamps, which points to a likely issue with object sharing across several requests.

Please note that this is not just an issue at the AccessLogValve level. These fields contain invalid data while the request is being processed, and that is causing unexpected exceptions in our production code. The cases are few and isolated, but this should still be looked into.

Any thoughts?

*Manuel Dominguez Sarmiento*

On 05/02/2020 14:12, Manuel Dominguez Sarmiento wrote:

Our filter is not doing anything fancy (and it has always worked correctly before we ran into this bug). In pseudo-code:

    public doFilter(request, response, chain) {
        // getRemoteAddr() is called on the original request, before any wrapping
        String ip = request.getRemoteAddr();
        boolean isProxy = isProxy(ip);
        if (isProxy) {
            // behind a known proxy: expose the originating IP instead
            String unwrappedIP = unwrapXForwardedFor(request);
            chain.doFilter(new MobileProxyHidingServletRequestWrapper(request, unwrappedIP), response);
        } else {
            chain.doFilter(request, response);
        }
    }

All that MobileProxyHidingServletRequestWrapper does is override getRemoteAddr() to return unwrappedIP instead of delegating to the actual request, while unwrapXForwardedFor() does what the name suggests: it processes X-Forwarded-For to obtain the originating IP before it hit the detected proxy.
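The body of unwrapXForwardedFor() is not shown in this thread. As an illustration only, here is one common way such a helper could work, assuming the originating client is the leftmost entry of the X-Forwarded-For header. The class and method names are hypothetical, and the sketch works on plain strings so it does not depend on the servlet API.

```java
final class ForwardedForSketch {
    // Returns the leftmost non-empty entry of an X-Forwarded-For header value,
    // or the given fallback (which may itself be null) when the header is
    // missing or contains no usable entry.
    static String originatingIp(String xForwardedFor, String fallbackRemoteAddr) {
        if (xForwardedFor != null) {
            for (String part : xForwardedFor.split(",")) {
                String candidate = part.trim();
                if (!candidate.isEmpty()) {
                    return candidate;
                }
            }
        }
        return fallbackRemoteAddr;
    }
}
```

Note that the fallback is passed through unchanged, so a caller still has to handle a null result if getRemoteAddr() itself returned null, which is the symptom reported here.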
*Manuel Dominguez Sarmiento*

On 05/02/2020 10:28, Mark Thomas wrote:

On 04/02/2020 22:27, Manuel Dominguez Sarmiento wrote:

We are getting the NPEs in a top-of-the-chain servlet filter which decorates HttpServletRequest.getRemoteAddr() before actual servlet processing. Only on HTTP/2 and in a very small number of cases. Perhaps we should test 9.0.31 and see what happens. When is this new version due for release?

I'm just working through back-porting some changes and then I'll be starting the release process. 9.0.31 should be available towards the beginning of next week.

Can you expand on what your filter is doing? When is the call made to HttpServletRequest.getRemoteAddr() on the original request?

Mark

LOOKING FOR ALL ISSUE INSTANCES:

[root@optimus ~]# cat /home/wap/logs/access.2020-03-27.log | grep "^-"
- -1 443 [27/Mar/2020:07:53:12 -0300] "GET /us/en/country.do?method=list HTTP/2.0" 400 762 "-" "Mozilla/5.0 (Linux; Android 6.0; vivo 1609) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.93 Mobile Safari/537.36"
- -1 443 [27/Mar/2020:10:48:12 -0300] "GET /pe/es/subscriptionPlanDetail.do?id=4483=false=2181=46045=true=419634618870==ojo.pe=d=EAIaIQobChMIif6cyOW66AIVKAa5Bh3eRgI6EAEYASAAEgJuRPD_BwE HTTP/2.0" 400 637 "https://ojo.pe/; "Mozilla/5.0 (Linux; Android 9; LM-X520 Build/PKQ1.190223.001; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/80.0.3987.119 Mobile Safari/537.36 [FB_IAB/FB4A;FBAV/260.0.0.42.118;]"
- -1 443 [27/Mar/2020:14:39:36 -0300] "GET /cl/es/subscriptionPlanDetail.do?id=4120=false=2131=17450=false=45663=true=380011499904==mobileapp%3A%3A2-com.appstar.callrecorder=EAIaIQobChMI-aa0_8uN6AIVZga5Bh1UBwQ5EAEYASAAEgKAxPD_BwE_cl_smd_ok=32320413=32320413=705b26c82e98b8401b74a463a68180d6=1584044911681=CELLULAR=EFFECTIVE_4G=true HTTP/2.0" 400 637 "https://wap.renxo.com/cl/es/subscriptionPlanDetail.do?id=4120=false=2131=17450=false=45663=true=380011499904==mobileapp%3A%3A2-com.appstar.callrecorder=EAIaIQobChMI-aa0_8uN6AIVZga5Bh1UBwQ5EAEYASAAEgKAxPD_BwE_cl_smd_ok=32320413=32320413=705b26c82e98b8401b74a463a68180d6=1584044911681=CELLULAR=EFFECTIVE_4G; "Mozilla/5.0 (Linux; Android 8.1.0; SAMSUNG SM-J710MN Build/M1AJQ) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/9.4 Chrome/67.0.3396.87 Mobile Safari/537.36"
- -1 443 [27/Mar/2020:17:18:55 -0300] "GET /ar/es/subscriptionPlanDetail.do?id=4328=16242=2403=48008=true=409370554249=%2Farts%20%26%20entertainment=cuttsite.website=d=EAIaIQobChMI98zj67y76AIVT4p3Ch3riAVXEAEYASAAEgLGavD_BwE HTTP/2.0" 400 637