Re: Does Tomcat/Java get around the problem of 64K maximum client source ports?

2020-03-28 Thread tomcat/perl

On 27.03.2020 21:39, Eric Robinson wrote:

FYI, I don't have 1800 tomcat instances on one server. I have about 100 
instances on each of 18 servers.


When one of these (attempted) connections fails, do you not get some error message that 
gives a clue as to the cause of the failure?

(should be a log somewhere, no ?)

Also, just for info :
in the past, I have run into problems under Linux (no more connections accepted, neither 
incoming nor outgoing) whenever the actual number of TCP connections went above a certain 
number (maybe it was 64K).
A TCP connection goes through a number of states (which you can see in a netstat display), 
such as "ESTABLISHED" but also "TIME_WAIT", "CLOSE_WAIT", etc. In some of these states, 
the connection no longer has any link to any process, but it still counts against the 
limit (of the OS/TCP stack).


The case I'm talking about was a bit like yours : a webapp running under tomcat was making 
a connection to a remote host, but this connection was wrapped inside an object of some 
kind. When the webapp no longer needed the connection, it just discarded the wrapping 
object, which was left without references to it, and thus candidate for destruction at 
some point. But the discarded object never explicitly closed the underlying connection.


Over a period of time, this left an accumulation of (no longer used) connections in the 
"CLOSE_WAIT" state (closed by the remote host side, but not by the webapp side), which 
just sat there until a GC happened, at which time the destruction of these objects really 
happened, and some implicit close was done at the OS level, which eliminated these pending 
underlying CLOSE_WAIT connections.
And since the available heap was quite large, it took a long time before a GC happened, 
which allowed such CLOSE_WAIT connections to accumulate in the hundreds or thousands 
before being "recycled".
Until a certain number was reached, and then the host became all but unreachable and very 
slow.
That was a long time ago, and thus a lot of Java versions and Linux versions ago, so maybe 
something happened since then to avoid such a situation.
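
For illustration, the accumulation described above does not depend on the GC when the 
wrapping object closes its socket deterministically. What follows is only a minimal Java 
sketch under stated assumptions: the class name RemoteLink and the demo() helper are 
hypothetical (they are not from the original webapp), and a throwaway local listener 
stands in for the remote host.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical wrapper, for illustration only: it owns a socket and
// implements AutoCloseable so callers can release it deterministically.
class RemoteLink implements AutoCloseable {
    private final Socket socket;

    RemoteLink(InetAddress host, int port) throws IOException {
        this.socket = new Socket(host, port);
    }

    boolean isClosed() {
        return socket.isClosed();
    }

    @Override
    public void close() throws IOException {
        // Explicit close: the OS-level connection is released now,
        // not whenever the garbage collector finalizes the wrapper.
        socket.close();
    }

    // Opens a link to a local listener, closes it via try-with-resources,
    // and reports whether the underlying socket ended up closed.
    static boolean demo() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
            RemoteLink link = new RemoteLink(loopback, server.getLocalPort());
            try (RemoteLink inUse = link) {
                // ... talk to the remote host here ...
            }
            return link.isClosed();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("closed after use: " + demo());
    }
}
```

The point of the try-with-resources block is that close() runs as soon as the webapp is 
done with the link, even if an exception is thrown in between, so the socket does not 
linger in CLOSE_WAIT waiting for a collection.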

But maybe you are also suffering from some similar phenomenon.
You could try to use netstat some more, and when you are having the problem, you should 
count ALL the TCP connections, including the ones in CLOSE_WAIT, and just check that you 
do not have an obscene number of them in total.  There is definitely some limit 
past which the OS starts acting funny.
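
As a rough aid for that count, here is a small Java sketch that tallies connections per 
TCP state from netstat output read on standard input. It assumes the usual Linux 
net-tools layout of "netstat -ant" (six columns, with the state in the sixth); the class 
name TcpStateCount is made up for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class TcpStateCount {
    // Tallies the sixth column (the state) of every line whose first
    // column starts with "tcp"; header lines are skipped by that test.
    static Map<String, Integer> countStates(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length >= 6 && cols[0].startsWith("tcp")) {
                counts.merge(cols[5], 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        List<String> lines = in.lines().collect(Collectors.toList());
        int total = 0;
        for (Map.Entry<String, Integer> e : countStates(lines).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
            total += e.getValue();
        }
        System.out.println("TOTAL " + total);
    }
}
```

With Java 11 or later this can be run as "netstat -ant | java TcpStateCount.java". The 
TOTAL line is the figure to watch, since connections in every state are included.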


(Note : unlike for TIME_WAIT e.g., there is no time limit for a connection in the 
CLOSE_WAIT state; it will stay in that state as long as the local side (here the webapp, 
acting as the client) has not explicitly closed it, in some kind of zombie half-life)
See e.g. : 
https://users.cs.northwestern.edu/~agupta/cs340/project2/TCPIP_State_Transition_Diagram.pdf




-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: HttpServletRequest.getRemoteAddr() sometimes returns NULL on Tomcat 9.0.30 and HTTP/2 secure requests

2020-03-28 Thread Manuel Dominguez Sarmiento
Hi Mark, we're now on the latest 9.0.33 release and we still see this 
issue intermittently in our logs. Only on HTTP/2 secure requests.


Please see the attached access logs (these represent all the cases for 
one whole day on a single high-volume server).

Some of the following request fields are NULL (or -1) in these examples:
- remoteAddr
- remotePort
- serverPort
- requestURI
- User-Agent

Which fields are missing varies from request to request. What is 
particularly interesting is that the errors are clustered around 
particular timestamps, which points to a likely issue with objects 
being shared across several requests.


Please note that this is not just an issue at the AccessLogValve level. 
These fields contain invalid data while the request is being processed, 
so that is causing unexpected exceptions in our production code. The 
cases are few and isolated, but still this should be looked into.


Any thoughts?

*Manuel Dominguez Sarmiento*

On 05/02/2020 14:12, Manuel Dominguez Sarmiento wrote:
Our filter is not doing anything fancy (and it has always worked 
correctly before we ran into this bug). In pseudo-code:


public void doFilter(request, response, chain) {
    // getRemoteAddr() is the call that intermittently returns null
    String ip = request.getRemoteAddr();
    boolean isProxy = isProxy(ip);
    if (isProxy) {
        // Hide the proxy: expose the originating IP to the rest of the chain
        String unwrappedIP = unwrapXForwardedFor(request);
        chain.doFilter(new MobileProxyHidingServletRequestWrapper(request, unwrappedIP), response);
    } else {
        chain.doFilter(request, response);
    }
}

All that MobileProxyHidingServletRequestWrapper does is override 
getRemoteAddr(), returning unwrappedIP instead of delegating to the 
actual request, while unwrapXForwardedFor() does what the name 
suggests: it processes X-Forwarded-For to obtain the originating IP 
before it hit the detected proxy.
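
To make the header processing concrete, here is a minimal Java sketch of one way such an 
unwrapping step could look. The real unwrapXForwardedFor() is not shown in this thread, 
so everything here is an assumption: the class and method names are hypothetical, the 
method works on the raw header string rather than on the request, and it assumes the 
conventional "client, proxy1, proxy2" ordering, where the leftmost entry is the 
originating address.

```java
class ForwardedFor {
    // Returns the first (leftmost) non-empty entry of an X-Forwarded-For
    // header value, or null when the header is absent or has no entries.
    // Assumption: the leftmost entry is the originating client address.
    static String firstAddress(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        for (String part : headerValue.split(",")) {
            String candidate = part.trim();
            if (!candidate.isEmpty()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Addresses from the documentation ranges, used as sample input only.
        System.out.println(firstAddress("203.0.113.7, 198.51.100.2"));
    }
}
```

Note that the header is supplied by the client side and can be forged, which is 
presumably why the filter only consults it after isProxy() has recognised the peer.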


*Manuel Dominguez Sarmiento*

On 05/02/2020 10:28, Mark Thomas wrote:

On 04/02/2020 22:27, Manuel Dominguez Sarmiento wrote:

We are getting the NPEs in a top-of-the-chain servlet filter which
decorates HttpServletRequest.getRemoteAddr() before actual servlet
processing. Only on HTTP/2 and in a very small number of cases. Perhaps
we should test 9.0.31 and see what happens. When is this new version due
for release?

I'm just working through back-porting some changes and then I'll be
starting the release process. 9.0.31 should be available towards the
beginning of next week.

Can you expand on what your filter is doing? When is the call made to
HttpServletRequest.getRemoteAddr() on the original request?

Mark



LOOKING FOR ALL ISSUE INSTANCES:

[root@optimus ~]# cat /home/wap/logs/access.2020-03-27.log | grep "^-"
- -1 443 [27/Mar/2020:07:53:12 -0300] "GET /us/en/country.do?method=list 
HTTP/2.0" 400 762 "-" "Mozilla/5.0 (Linux; Android 6.0; vivo 1609) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.93 Mobile Safari/537.36"
- -1 443 [27/Mar/2020:10:48:12 -0300] "GET 
/pe/es/subscriptionPlanDetail.do?id=4483=false=2181=46045=true=419634618870==ojo.pe=d=EAIaIQobChMIif6cyOW66AIVKAa5Bh3eRgI6EAEYASAAEgJuRPD_BwE
 HTTP/2.0" 400 637 "https://ojo.pe/" "Mozilla/5.0 (Linux; Android 9; LM-X520 
Build/PKQ1.190223.001; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 
Chrome/80.0.3987.119 Mobile Safari/537.36 [FB_IAB/FB4A;FBAV/260.0.0.42.118;]"
- -1 443 [27/Mar/2020:14:39:36 -0300] "GET 
/cl/es/subscriptionPlanDetail.do?id=4120=false=2131=17450=false=45663=true=380011499904==mobileapp%3A%3A2-com.appstar.callrecorder=EAIaIQobChMI-aa0_8uN6AIVZga5Bh1UBwQ5EAEYASAAEgKAxPD_BwE_cl_smd_ok=32320413=32320413=705b26c82e98b8401b74a463a68180d6=1584044911681=CELLULAR=EFFECTIVE_4G=true
 HTTP/2.0" 400 637 
"https://wap.renxo.com/cl/es/subscriptionPlanDetail.do?id=4120=false=2131=17450=false=45663=true=380011499904==mobileapp%3A%3A2-com.appstar.callrecorder=EAIaIQobChMI-aa0_8uN6AIVZga5Bh1UBwQ5EAEYASAAEgKAxPD_BwE_cl_smd_ok=32320413=32320413=705b26c82e98b8401b74a463a68180d6=1584044911681=CELLULAR=EFFECTIVE_4G"
 "Mozilla/5.0 (Linux; Android 8.1.0; SAMSUNG SM-J710MN Build/M1AJQ) 
AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/9.4 Chrome/67.0.3396.87 
Mobile Safari/537.36"
- -1 443 [27/Mar/2020:17:18:55 -0300] "GET 
/ar/es/subscriptionPlanDetail.do?id=4328=16242=2403=48008=true=409370554249=%2Farts%20%26%20entertainment=cuttsite.website=d=EAIaIQobChMI98zj67y76AIVT4p3Ch3riAVXEAEYASAAEgLGavD_BwE
 HTTP/2.0" 400 637