[
https://issues.apache.org/jira/browse/TS-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Leif Hedstrom updated TS-4897:
------------------------------
Fix Version/s: 7.1.0
> Unbound growth of number of memory maps for traffic_server under SSL
> termination load when ssl_ticket_enabled=0
> ---------------------------------------------------------------------------------------------------------------
>
> Key: TS-4897
> URL: https://issues.apache.org/jira/browse/TS-4897
> Project: Traffic Server
> Issue Type: Bug
> Components: TLS
> Reporter: Can Selcik
> Fix For: 7.1.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The number of {{\[anon\]}} memory regions mapped into the {{traffic_server}}
> process grows without bound until the kernel thresholds are reached and
> the process is terminated.
> This happens when ATS is used to terminate SSL and {{ssl_ticket_enabled=0}}
> in {{ssl_multicert.config}}.
> We've experienced this issue on our staging and production hosts and were
> able to reproduce it with the above configuration under high-volume HTTPS
> load. We didn't experience this with {{5.2.x}}; the reason becomes clear at
> the end.
> While generating {{https}} traffic with {{siege}} or {{ab}}, the issue can be
> observed with:
> {{watch "pmap $(pidof traffic_server) | wc -l"}}
> {{git bisect}} pointed us to: <TS-3883: Fix madvise>
> Turns out a no-op {{ats_madvise}} hides the symptoms of the issue.
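
The observation above can be reproduced in miniature. What follows is only a sketch under stated assumptions, not ATS code: it assumes Linux with a readable /proc/self/maps and a kernel that honors MADV_DONTDUMP, and the helper names count_mappings and dontdump_mapping_counts are made up for illustration. It counts the process's mappings (the in-process analogue of pmap | wc -l), marks every other page of one anonymous region with MADV_DONTDUMP, and counts again. On such a kernel the second count would be expected to be higher, since neighboring ranges with different advice flags generally cannot share a single mapping.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count lines in /proc/self/maps, i.e. the number of mappings of this
 * process (Linux-specific). Returns -1 if the file cannot be opened. */
static long count_mappings(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    long lines = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n')
            lines++;
    }
    fclose(f);
    return lines;
}

/* Hypothetical demo: map `pages` anonymous pages, advise every other page
 * with MADV_DONTDUMP, and report the mapping count before and after.
 * Returns 0 on success, -1 on failure. */
static int dontdump_mapping_counts(size_t pages, long *before, long *after)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    size_t len = pages * (size_t)page;
    char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    *before = count_mappings();
#ifdef MADV_DONTDUMP
    /* Alternate advised and non-advised pages within the same region. */
    for (size_t i = 0; i < pages; i += 2)
        (void)madvise(base + i * (size_t)page, (size_t)page, MADV_DONTDUMP);
#endif
    *after = count_mappings();

    munmap(base, len);
    return (*before < 0 || *after < 0) ? -1 : 0;
}
```

Calling dontdump_mapping_counts(64, &before, &after) and printing both values gives a quick check on a given host; the exact numbers depend on the kernel and the process layout.
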
> Digging deeper, we realized that the {{ssl_ticket_enabled}} option is
> relevant: with the {{ssl.session_cache}} debug tag enabled, we see that when
> tickets are enabled ATS doesn't manage its own session cache for SSL; the
> library does it instead. In that case, the code path doing the problematic
> allocation within ATS doesn't get executed often, since OpenSSL takes care
> of the session tokens.
> But why does this happen? {{MADV_DONTDUMP}} is passed to {{posix_madvise}},
> even though it is not a valid flag for {{posix_madvise}}, which is not a
> drop-in replacement for {{madvise}}.
> Looking at {{<bits/mman.h>}}:
> {noformat}
> /* Advice to `madvise'. */
> #ifdef __USE_BSD
> # define MADV_NORMAL      0   /* No further special treatment. */
> # define MADV_RANDOM      1   /* Expect random page references. */
> # define MADV_SEQUENTIAL  2   /* Expect sequential page references. */
> # define MADV_WILLNEED    3   /* Will need these pages. */
> # define MADV_DONTNEED    4   /* Don't need these pages. */
> # define MADV_REMOVE      9   /* Remove these pages and resources. */
> # define MADV_DONTFORK    10  /* Do not inherit across fork. */
> # define MADV_DOFORK      11  /* Do inherit across fork. */
> # define MADV_MERGEABLE   12  /* KSM may merge identical pages. */
> # define MADV_UNMERGEABLE 13  /* KSM may not merge identical pages. */
> # define MADV_DONTDUMP    16  /* Explicity exclude from the core dump,
>                                overrides the coredump filter bits. */
> # define MADV_DODUMP      17  /* Clear the MADV_DONTDUMP flag. */
> # define MADV_HWPOISON    100 /* Poison a page for testing. */
> #endif
> {noformat}
> However {{posix_madvise}} takes:
> {noformat}
> # define POSIX_MADV_NORMAL     0 /* No further special treatment. */
> # define POSIX_MADV_RANDOM     1 /* Expect random page references. */
> # define POSIX_MADV_SEQUENTIAL 2 /* Expect sequential page references. */
> # define POSIX_MADV_WILLNEED   3 /* Will need these pages. */
> # define POSIX_MADV_DONTNEED   4 /* Don't need these pages. */
> {noformat}
> Also, {{posix_madvise}} and {{madvise}} can both be present on the same
> system, but they do not have the same capabilities. That's why the
> {{Explicity exclude from the core dump, overrides the coredump filter bits}}
> functionality isn't achievable through {{posix_madvise}}.
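
To make the distinction concrete, here is a minimal sketch of keeping the two interfaces apart. It is not the PR mentioned in the issue and not ATS's ats_madvise; the helper names advise_dontdump and advise_willneed are hypothetical. The assumption is a Linux-like system where <sys/mman.h> may or may not define MADV_DONTDUMP: Linux-only advice goes to madvise() directly, and posix_madvise() only ever receives POSIX_MADV_* values.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: ask the kernel to exclude [addr, addr + len) from
 * core dumps. Returns 0 on success or an errno value on failure. */
static int advise_dontdump(void *addr, size_t len)
{
#if defined(MADV_DONTDUMP)
    /* MADV_DONTDUMP is a madvise() flag, so call madvise() itself. */
    return madvise(addr, len, MADV_DONTDUMP) == 0 ? 0 : errno;
#else
    /* No equivalent advice is available on this platform. */
    (void)addr;
    (void)len;
    return ENOTSUP;
#endif
}

/* Hypothetical helper: a portable hint that uses a POSIX_MADV_* value only. */
static int advise_willneed(void *addr, size_t len)
{
    /* posix_madvise() returns the error number directly, not through errno. */
    return posix_madvise(addr, len, POSIX_MADV_WILLNEED);
}
```

A caller would pass a page-aligned address, such as one returned by mmap(), and treat a nonzero return as "advice not applied" rather than as a fatal error, since the advice is only a hint.
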
> Will post a PR momentarily.