[GitHub] trafficserver pull request #1503: client cert should be added to netvcoption...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1503 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] trafficserver issue #1503: client cert should be added to netvcoptions only ...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1503 Something that should be considered moving forward is how (or whether) server session sharing needs to be augmented to take the client cert requirements into account.
[GitHub] trafficserver issue #1532: ATS 7.1 release running out of memory
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1532 Built with jemalloc and installed on the same machine. It does not crash anymore, but the load average is crazy (25-30). spinlock shows up as 25% CPU in perf top, and IOBuffer::read_avail shows up in second spot at 7-12% CPU. I did notice that as part of the fix for TS-1822, @PSUdaemon removed a call to mallopt. Perhaps we still need to set a constraint for the straight glibc malloc?
[GitHub] trafficserver issue #1532: ATS 7.1 release running out of memory
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1532 I installed on a machine with a higher load, and it fails in ats_memalign from freelist_new within 2 minutes of traffic. Watching top, the amount of memory used by the process is much less than the amount of available memory (10% in use). It could be that top is delayed in getting updated. The same build (or a very similar build) has run for days on a machine with about 1/3 of the traffic load.
[GitHub] trafficserver pull request #1565: Fix Assertion failure in the regex_revalid...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1565 Fix Assertion failure in the regex_revalidate plugin. Since TS-4387, Calls to TSContSchedule/TSContScheduleEvery(), require that the continuation associated with the TSCont parameter must have a mutex. (cherry picked from commit 0b1f28b53174baf5cfff54a2d224ffbe09a64374) You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver issue-1561 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1565.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1565 commit a1755d6cc2bb43547962b0e4e4f61d62a2243150 Author: John J. Rushford <jrushf...@apache.org> Date: 2017-02-01T20:34:44Z Fix Assertion failure in the regex_revalidate plugin. Since TS-4387, Calls to TSContSchedule/TSContScheduleEvery(), require that the continuation associated with the TSCont parameter must have a mutex. (cherry picked from commit 0b1f28b53174baf5cfff54a2d224ffbe09a64374)
[GitHub] trafficserver issue #1561: A new 7.1 Crash with regex_revalidate config upda...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1561 @jrushford fixed this for 6.2.x. I'm putting up a PR with a cherry-pick of his fix.
[GitHub] trafficserver issue #1561: A new 7.1 Crash
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1561

A new 7.1 Crash

After running in production for 2.5 days without incident (with a couple of fixes now merged back to 7.1.x), I got a core with the following stack trace:

```
(gdb) bt
#0 0x2ba487c1c625 in raise () from /lib64/libc.so.6
#1 0x2ba487c1dd8d in abort () from /lib64/libc.so.6
#2 0x2ba48576b5a9 in ink_abort (message_format=0x2ba48577d75c "%s:%d: failed assertion `%s`") at ink_error.cc:99
#3 0x2ba485768cda in _ink_assert (expression=0x7b3d70 "((INKContInternal *)contp)->mutex", file=0x7b2dad "InkAPI.cc", line=4372) at ink_assert.cc:37
#4 0x0052bab7 in _TSReleaseAssert (text=0x7b3d70 "((INKContInternal *)contp)->mutex", file=0x7b2dad "InkAPI.cc", line=4372) at InkAPI.cc:401
#5 0x00534f77 in TSContSchedule (contp=0x2aad1801bd60, timeout=30, tp=TS_THREAD_POOL_TASK) at InkAPI.cc:4372
#6 0x2aaabbc39132 in config_handler (cont=0x1fe9ca0, event=Unhandled dwarf expression opcode 0xf3
) at regex_revalidate/regex_revalidate.c:371
#7 0x0052c9fb in INKContInternal::handle_event (this=0x1fe9ca0, event=2, edata=0x2aad10079b80) at InkAPI.cc:1049
#8 0x005184f0 in Continuation::handleEvent (this=0x1fe9ca0, event=2, data=0x2aad10079b80) at /home/shinrich/yats_build/trafficserver/iocore/eventsystem/I_Continuation.h:153
#9 0x007aac3f in EThread::process_event (this=0x109abc90, e=0x2aad10079b80, calling_code=2) at UnixEThread.cc:143
#10 0x007aaf57 in EThread::execute (this=0x109abc90) at UnixEThread.cc:225
#11 0x007aa303 in spawn_thread_internal (a=0x105b0280) at Thread.cc:84
#12 0x2ba487f86aa1 in start_thread () from /lib64/libpthread.so.0
#13 0x2ba487cd293d in clone () from /lib64/libc.so.6
(gdb)
```

The problem is that TSContSchedule calls FORCE_PLUGIN_SCOPED_MUTEX, which asserts that the continuation has a mutex. In this case, the continuation does not have a mutex. This check was added last May to solve TS-4387.

The offending plugin is regex_revalidate. If the configuration file is updated, it creates a continuation and spins it off on a task thread. But it is calling TSContCreate with a second argument of NULL. Indeed, the timestamp of the config file is a minute before the timestamp on the core file.

The straightforward fix would be to create a mutex to pass into TSContCreate. Not sure what the downside of that would be. It appears that this is not a frequently run path, so I doubt the creation of an additional object would affect much one way or the other.
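To make the failure mode concrete, here is a small self-contained model of the rule described above. It is not ATS code: `ModelCont`, `model_cont_create`, and `model_cont_schedule` are hypothetical stand-ins for `TSCont`, `TSContCreate`, and `TSContSchedule`, and the model throws an exception where the real code hits a release assert and aborts.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for a TSCont: an event handler plus an optional mutex.
struct ModelCont {
  std::function<int(int)> handler;
  std::shared_ptr<std::mutex> mutex; // null models TSContCreate(handler, NULL)
};

// Stand-in for TSContCreate(); like the real call, it accepts a null mutex.
inline ModelCont model_cont_create(std::function<int(int)> handler, std::shared_ptr<std::mutex> mutex)
{
  return ModelCont{std::move(handler), std::move(mutex)};
}

// Stand-in for TSContSchedule(): rejects a continuation that has no mutex.
// The real code asserts and aborts; this model throws so it can be tested.
inline int model_cont_schedule(ModelCont &cont, int event)
{
  if (!cont.mutex) {
    throw std::logic_error("continuation has no mutex");
  }
  std::lock_guard<std::mutex> lock(*cont.mutex); // handler runs under the lock
  return cont.handler(event);
}
```

In the plugin itself, the analogous change would presumably be to pass the result of `TSMutexCreate()` as the second argument to `TSContCreate` instead of NULL; the exact call site in regex_revalidate is not shown here.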
[GitHub] trafficserver issue #1544: AddressSanitizer failed to deallocate
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1544 Also reported via issue #1498. I think there is a problem with freeing memory if the free list is deallocated (-f). When I accidentally left the -f flag on after I reinstalled a non-ASAN build, my system crashed due to out of memory within minutes. Without the -f flag, my production box has been running for days without appreciable memory growth.
[GitHub] trafficserver pull request #1547: Fix ssl hook state logic.
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1547 Fix ssl hook state logic. Was not correctly resetting the ssl hook state after the servername hooks. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver fix_ssl_hook_state_transition Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1547.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1547 commit 0d3801cb39b0101b44365fb3d2e990ffc1287858 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-03-07T19:04:12Z Fix ssl hook state logic. Was not correctly resetting the ssl hook state after the servername hooks.
[GitHub] trafficserver pull request #1446: Need dedicated TS_SSL_SERVERNAME_HOOK
Github user shinrich commented on a diff in the pull request: https://github.com/apache/trafficserver/pull/1446#discussion_r104750433

--- Diff: iocore/net/SSLNetVConnection.cc ---
@@ -1438,17 +1438,22 @@ bool SSLNetVConnection::callHooks(TSEvent eventId)
 {
   // Only dealing with the SNI/CERT hook so far.
-  ink_assert(eventId == TS_EVENT_SSL_CERT);
+  ink_assert(eventId == TS_EVENT_SSL_CERT || eventId == TS_EVENT_SSL_SERVERNAME);
   Debug("ssl", "callHooks sslHandshakeHookState=%d", this->sslHandshakeHookState);
   // First time through, set the type of the hook that is currently being invoked
-  if (HANDSHAKE_HOOKS_PRE == sslHandshakeHookState) {
+  if ((this->sslHandshakeHookState == HANDSHAKE_HOOKS_PRE || this->sslHandshakeHookState == HANDSHAKE_HOOKS_DONE) &&
--- End diff --

For SSL_CTX_set_early_cb, perhaps as we move to openssl1.1 we should make that change. For the sslHandshakeHookState == DONE, probably not needed since we reset the hook state to PRE at the end of the servername chain. There is a flaw with this PR, and I am in the process of putting up another one. I'll probably leave the DONE check for now. I'm in the process of writing a test suite for this logic since it has bitten me multiple times. I'll do the tidy up once the test suite is there to check me.
[GitHub] trafficserver issue #1522: Ignore read and write errors if vio has been clea...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1522 I ran my production box overnight with the latest workaround. I think this change is a step in the right direction and we should merge it and bring it back to 7.1.
[GitHub] trafficserver issue #1522: Ignore read and write errors if vio has been clea...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1522 I think the issue raised by @scw00 is different. It looks like what @zwoop identified in issue #1531. In that case the write vio errored, so the write vio is being sent up as data to a handler expecting only a read vio. In the error case it shouldn't matter whether it is a read or a write vio and the error clean up should occur regardless.
[GitHub] trafficserver issue #1456: Add TCP accept metric which tracks the total numb...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1456 http2 connections are tracked by proxy.process.http2.current_client_sessions and proxy.process.http2.total_client_connections. All successfully negotiated SSL/TCP connections would be the sum of proxy.process.http2.total_client_connections and proxy.process.http.total_client_connections
[GitHub] trafficserver issue #1532: ATS 7.1 release running out of memory
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1532 I saw stacks like this when I was running with the free list disabled (-f).
[GitHub] trafficserver issue #1476: Crash in get_client_addr()
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1476 I had two more asserts once I added back in the accept thread. Upon deeper inspection, the last alloc time and the UA_BEGIN time were off by half a ms, so I am attributing that to clock skew between threads. I think the best action for 7.1.0 is to revert commit c1ac5f8bf87fd4bc3a8e06507219970d83965acd and investigate how to safely bubble up errors for 7.1.1.
[GitHub] trafficserver issue #1476: Crash in get_client_addr()
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1476

I have been adding member variables to NetVConnection and adding an assert to HttpSM::state_api_callout. In the cores it seemed that the crash always happened in the first hook called from HttpSM::state_read_server_response_header. As noted in the previous comment, the event came in on the server side VC, and state_read_server_response_header is getting ready to send the response on the UA side VC.

I added allocate thread, free_thread, and last_alloc time to the netvc. And added the following assert at the beginning of NetVConnection

{code}
ink_release_assert(milestones[TS_MILESTONE_UA_CLOSE] > 0 || ua_session->get_netvc() == NULL || (ua_session->get_create_time() != 0 && ink_hrtime_to_msec(ua_session->get_create_time()) <= ink_hrtime_to_msec(milestones[TS_MILESTONE_UA_BEGIN])));
{code}

This will trigger if there is no last_alloc_time set (it has been freed) or the create time of the VC is newer than the start of the transaction (it has been reallocated).

This assert would trigger before we called the hooks, implying that the client_vc has been freed in a previous stack but the ua_session reference has not been cleaned up. The only place I could see this happening is in read_signal_and_update and write_signal_and_update if the corresponding vio._cont is NULL. I added warnings in the NULL path, and they triggered a handful of times with event=EVENT_ERROR (3).

The error bubble addition (which we were concerned about from issue #1401) adds calls to read_signal_error and write_signal_error. So I reverted that commit (c1ac5f8bf87fd4bc3a8e06507219970d83965acd) and removed my workarounds for issue #1401.

I let it run overnight and it crashed twice on my assert, but in the accept stack, not the send response stack. The timestamps varied in the low microseconds, so I am doubting the accuracy of our timestamping at the micro and nano seconds. I added the ink_hrtime_to_msec to the asserts, and kicked it off to run today.
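The timestamp part of that assert can be sketched in isolation. This is a simplified model, not the ATS code: it assumes the high-resolution timestamps are 64-bit nanosecond counts and that the millisecond conversion is a plain integer division, and the names `hrtime_ns`, `to_msec`, and `vc_predates_transaction` are made up for illustration.

```cpp
#include <cstdint>

// Assumption: timestamps are signed 64-bit nanosecond counts.
using hrtime_ns = std::int64_t;

constexpr hrtime_ns NS_PER_MSEC = 1000000;

// Truncating conversion to whole milliseconds (models ink_hrtime_to_msec).
constexpr std::int64_t to_msec(hrtime_ns t)
{
  return t / NS_PER_MSEC;
}

// True when the VC has a create time and, at millisecond granularity,
// was created no later than the start of the transaction.
constexpr bool vc_predates_transaction(hrtime_ns vc_create, hrtime_ns ua_begin)
{
  return vc_create != 0 && to_msec(vc_create) <= to_msec(ua_begin);
}
```

With these definitions, a create time of 5000000300 ns and a UA_BEGIN of 5000000100 ns pass the check even though the raw nanosecond values are out of order by 200 ns. Truncation only hides skew that stays inside one millisecond bucket: 5000000100 ns against 4999999900 ns still fails, because the two values land in different milliseconds.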
[GitHub] trafficserver issue #1532: ATS 7.1 release running out of memory
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1532 Any indication of object buildup in a particular type of object pool?
[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1531 Oh, never mind. I saw vio = NULL in the stack, but that is the local variable. It makes sense that the event would be passing up the write vio. It is a write event we are processing. But it looks like the handler is only expecting a read event. I guess a broken pipe shows up as a write event.
[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1531 The server_entry->read_vio is NULL. From what I can tell this value only gets set due to do_io_read() calls in attach_server_session() and setup_server_read_response_header(). Presumably one of these do_io_read operations failed. Looking at the stack, we had a failure which is getting triggered via the write path (errno 32, broken pipe). I'm guessing that we should have started handling the failure where the do_io_read failed, presumably in setup_server_read_response_header().
[GitHub] trafficserver issue #1476: Crash in get_client_addr()
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1476 Still struggling with this issue. Running into it every 2-4 hours in production. At this point, I think it is a use-after-free, but as noted above, ASAN fails in this case. Looking at a core more closely, it seems to be an incoming client VC that is being messed up. In the core, the remote_addr of the VC did not match the src_addr in the HttpSM. Both IP addresses show up in our squid.log at roughly the same time. We are running with accept threads. I'm going to turn that off on my test machine; I assume that will eliminate this race condition. If so, then I will need to press on and see how we are freeing the VC early.
[GitHub] trafficserver issue #1498: Problems with ASAN builds in 7.1
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1498 Actually it looks like the issue is just tracking down the particular issue in #1476 / #1480. I enabled ASAN while merging in another issue, and it caught the double free in there just fine. So I think we are good with the ASAN and I am just running into a flaw in ASAN.
[GitHub] trafficserver issue #1527: More and slower active connections in 7.1.x
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1527 So do you think this is making a real performance difference? Or are we just not timing out connections as quickly now? I hadn't noticed a difference in number of active connections, but I'll keep an eye out for that in my testing.
[GitHub] trafficserver pull request #1526: When SSL connect fails, we return 502 succ...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1526
[GitHub] trafficserver issue #1526: When SSL connect fails, we return 502 success
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1526 Yes, I suppose since it is a latent issue and not a regression it doesn't need to be in the 7.1 train. We will separately pull it into our version. Set version to 7.2
[GitHub] trafficserver issue #1444: issue #1401: Potential fix to the write_to_io_net...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1444 Closing in deference to the more complete solution in PR #1522
[GitHub] trafficserver pull request #1444: issue #1401: Potential fix to the write_to...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1444
[GitHub] trafficserver pull request #1526: When SSL connect fails, we return 502 succ...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1526

When SSL connect fails, we return 502 success

While debugging a failure in origin cert verification, we saw that the response header included

```
HTTP/1.1 502 Success
Date: Mon, 27 Feb 2017 15:18:29 GMT
Connection: keep-alive
```

502 Success seemed peculiar. Tracked it down to the fact that errno was not set in the SSL_connect failure case. Added a check to stuff in ENET_CONNECT_FAILED if errno is 0 so we get a reason that is in the ballpark.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver success_on_502_pr Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1526.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1526 commit 085a5983b706a875229a2ac03d76174b9c79a2f5 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-03-01T21:33:23Z When SSL connect fails, we return 502 success
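A rough sketch of the idea behind that check, assuming the reason text is derived from an errno-style code: on glibc, `strerror(0)` returns "Success", which would explain the odd status line. `HYPOTHETICAL_CONNECT_FAILED` below is a stand-in for ATS's ENET_CONNECT_FAILED (here simply mapped to ECONNREFUSED), and both function names are invented for this example.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Stand-in for ENET_CONNECT_FAILED; the real ATS value is not reproduced here.
constexpr int HYPOTHETICAL_CONNECT_FAILED = ECONNREFUSED;

// If the failing call left errno at 0, substitute a generic connect failure
// so the derived reason text does not describe success.
inline int connect_error_code(int saved_errno)
{
  return saved_errno != 0 ? saved_errno : HYPOTHETICAL_CONNECT_FAILED;
}

// Reason text for an error code. The string is copied immediately because
// strerror() may reuse its buffer on a later call.
inline std::string reason_phrase(int err)
{
  return std::string(std::strerror(err));
}
```

Real error codes pass through unchanged, so `connect_error_code(EPIPE)` is still EPIPE; only the zero case is replaced.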
[GitHub] trafficserver issue #1522: Ignore read and write errors if vio has been clea...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1522 Looks reasonable to me.
[GitHub] trafficserver issue #1525: Should allow control on whether default cert path...
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1525

Should allow control on whether default cert paths/files are included for verification

When creating the SSL_CTX for ATS initiating connections to origin, we always call SSL_CTX_set_default_verify_paths, which adds the default trusted root packages on the system. You can also set your own via settings, but the default case is also added.

For a reverse proxy, the default trusted root set is probably not desirable. You probably just want to verify that your origins are signed with your small set of trusted roots. Adding more trusted roots just allows for the possibility that you accept a cert signed by someone else entirely.

There are a couple of options to fix this:
1. Add a new setting to ignore the default trusted roots.
2. Don't call SSL_CTX_set_default_verify_paths if a CA file or CA directory is explicitly defined.
3. The reverse proxy folks should just move the default trusted root files out of the way if they care (which is accidentally what we did).

No option is technically difficult, but probably worth a bit of discussion.
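Options 1 and 2 above boil down to a small decision, sketched here without any OpenSSL dependency. The `VerifyConfig` struct, its field names, and `should_load_default_roots` are hypothetical and do not correspond to existing ATS settings.

```cpp
#include <string>

// Hypothetical view of the relevant origin-verification settings.
struct VerifyConfig {
  std::string ca_file;       // explicit CA bundle path, empty if unset
  std::string ca_dir;        // explicit CA directory, empty if unset
  bool ignore_default_roots; // option 1: an explicit opt-out switch
};

// Decide whether the system default trusted roots should be loaded,
// i.e. whether SSL_CTX_set_default_verify_paths() would be called.
inline bool should_load_default_roots(const VerifyConfig &cfg)
{
  if (cfg.ignore_default_roots) {
    return false; // option 1: the operator asked to skip the defaults
  }
  const bool explicit_roots = !cfg.ca_file.empty() || !cfg.ca_dir.empty();
  return !explicit_roots; // option 2: explicit roots replace the defaults
}
```

A caller would load any explicit roots first (for instance with `SSL_CTX_load_verify_locations`) and only call `SSL_CTX_set_default_verify_paths` when this function returns true. Note that option 2 on its own changes behavior for anyone who currently relies on both sets being loaded, which is presumably part of the discussion the issue asks for.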
[GitHub] trafficserver pull request #1446: Need dedicated TS_SSL_SERVERNAME_HOOK
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1446
[GitHub] trafficserver pull request #1445: Issue #1443 - Fix early or duplicate 404 e...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1445
[GitHub] trafficserver issue #1459: Mysterious uptick in user_agent SSL errors moving...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1459 Yes, this is still an issue. I need to do some more testing on PR #1446. Some variant of that will need to be added to 7.1.
[GitHub] trafficserver issue #1511: heap-use-after-free: Access ua_session after Http...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1511 Looks reasonable to me.
[GitHub] trafficserver issue #1480: Crash in Http1ClientSession::set_inactivity_timeo...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1480 I think this is the same underlying issue as issue #1476. Another case of calling a virtual method into a VC.
[GitHub] trafficserver issue #1476: Crash in get_client_addr()
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1476 Very odd bug. Most of the crashes are in get_client_addr or get_remote_addr. In both cases, the problem is at the point of the virtual method call into the corresponding test. The RAX register is either 0 or 1. But when you go through the assembly instructions based on the current register values, the function pointer should be valid. Tried this on two different machines in case it was a HW problem. Still see the problem on both machines. Set counters to see if the crash happens when calling into the set method or on return. The crash happens before the set method is called. At this point, I think it is a use-after-free problem. If I do the function calculation off a VC object (which would be the pointer in the virtual function table after free), the value is a 0 or 1. I've tried to set up ASAN builds but they are not working for me at the moment (issue filed).
[GitHub] trafficserver issue #1498: Problems with ASAN builds in 7.1
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1498 The gcc in devtoolset-3 (gcc 4.9.2) and devtoolset-4 (gcc 5.3.1).
[GitHub] trafficserver issue #1498: Problems with ASAN builds in 7.1
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1498 Problems with ASAN builds in 7.1 When running ASAN builds with the 7.1.x codebase I no longer see nice use-after-free reports (for the most part). Instead I see repeats of the following pair of lines. {code} ==22259==ERROR: AddressSanitizer failed to deallocate 0x8000 (32768) bytes at address 0x7ffd87426000 ==22259==AddressSanitizer CHECK failed: ../../../../libsanitizer/sanitizer_common/sanitizer_posix.cc:77 "(("unable to unmap" && 0)) != (0)" (0x0, 0x0) {code} Is anyone else seeing this? Any ideas on how to fix?
[GitHub] trafficserver issue #1401: Segfault in write_to_net_io with 7.1.x
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1401 Finally get a use-after-free ASAN stack in this area. Anyone else having problems with ASAN in newer builds? Looks like it is showing a use after free in the case of the error bubbling. {code} ==30868==ERROR: AddressSanitizer: heap-use-after-free on address 0x624001933448 at pc 0x5afa20 bp 0x7fffeaefe7e0 sp 0x7fffeaefe7d8 READ of size 8 at 0x624001933448 thread T17 ([ET_NET 15]) #0 0x5afa1f in Continuation::handleEvent(int, void*) ../../../../trafficserver/iocore/eventsystem/I_Continuation.h:153 #1 0xae0c33 in write_signal_and_update ../../../../trafficserver/iocore/net/UnixNetVConnection.cc:176 #2 0xae10ac in write_signal_done ../../../../trafficserver/iocore/net/UnixNetVConnection.cc:218 #3 0xae11b2 in write_signal_error ../../../../trafficserver/iocore/net/UnixNetVConnection.cc:237 #4 0xae2a1e in write_to_net_io(NetHandler*, UnixNetVConnection*, EThread*) ../../../../trafficserver/iocore/net/UnixNetVConnection.cc:458 #5 0xae25e5 in write_to_net(NetHandler*, UnixNetVConnection*, EThread*) ../../../../trafficserver/iocore/net/UnixNetVConnection.cc:430 #6 0xace638 in NetHandler::mainNetEvent(int, Event*) ../../../../trafficserver/iocore/net/UnixNet.cc:526 #7 0x5afb30 in Continuation::handleEvent(int, void*) ../../../../trafficserver/iocore/eventsystem/I_Continuation.h:153 #8 0xb32866 in EThread::process_event(Event*, int) ../../../../trafficserver/iocore/eventsystem/UnixEThread.cc:143 #9 0xb33487 in EThread::execute() ../../../../trafficserver/iocore/eventsystem/UnixEThread.cc:270 #10 0xb3101b in spawn_thread_internal ../../../../trafficserver/iocore/eventsystem/Thread.cc:84 #11 0x7568aaa0 in start_thread (/lib64/libpthread.so.0+0x7aa0) #12 0x74fbd93c in clone (/lib64/libc.so.6+0xe893c) 0x624001933448 is located 4936 bytes inside of 7728-byte region [0x624001932100,0x624001933f30) freed by thread T17 ([ET_NET 15]) here: #0 0x549cb7 in free 
(/home/y/bin64/traffic_server+0x549cb7) #1 0x77b96c79 in ats_memalign_free ../../../../trafficserver/lib/ts/ink_memory.cc:141 #2 0x77b989be in malloc_free ../../../../trafficserver/lib/ts/ink_queue.cc:322 #3 0x77b986e8 in ink_freelist_free ../../../../trafficserver/lib/ts/ink_queue.cc:276 #4 0x75bc20 in ClassAllocator::free(HttpSM*) /var/builds/workspace/163866-v3-component/BUILD_CONTAINER/rhel6-gcc5_5/label/DOCKER-HIGH/app_root/_build/asan_build/../../trafficserver/lib/ts/Allocator.h:135 #5 0x708afe in HttpSM::destroy() ../../../../trafficserver/proxy/http/HttpSM.cc:365 #6 0x7459ad in HttpSM::kill_this() ../../../../trafficserver/proxy/http/HttpSM.cc:6951 #7 0x71dcb9 in HttpSM::main_handler(int, void*) ../../../../trafficserver/proxy/http/HttpSM.cc:2678 #8 0x5afb30 in Continuation::handleEvent(int, void*) ../../../../trafficserver/iocore/eventsystem/I_Continuation.h:153 #9 0x7f50f6 in HttpTunnel::main_handler(int, void*) ../../../../trafficserver/proxy/http/HttpTunnel.cc:1662 #10 0x5afb30 in Continuation::handleEvent(int, void*) ../../../../trafficserver/iocore/eventsystem/I_Continuation.h:153 #11 0xae0c33 in write_signal_and_update ../../../../trafficserver/iocore/net/UnixNetVConnection.cc:176 #12 0xae10ac in write_signal_done ../../../../trafficserver/iocore/net/UnixNetVConnection.cc:218 #13 0xae3588 in write_to_net_io(NetHandler*, UnixNetVConnection*, EThread*) ../../../../trafficserver/iocore/net/UnixNetVConnection.cc:596 #14 0xae25e5 in write_to_net(NetHandler*, UnixNetVConnection*, EThread*) ../../../../trafficserver/iocore/net/UnixNetVConnection.cc:430 #15 0xace638 in NetHandler::mainNetEvent(int, Event*) ../../../../trafficserver/iocore/net/UnixNet.cc:526 #16 0x5afb30 in Continuation::handleEvent(int, void*) ../../../../trafficserver/iocore/eventsystem/I_Continuation.h:153 #17 0xb32866 in EThread::process_event(Event*, int) ../../../../trafficserver/iocore/eventsystem/UnixEThread.cc:143 #18 0xb33487 in EThread::execute() 
../../../../trafficserver/iocore/eventsystem/UnixEThread.cc:270 #19 0xb3101b in spawn_thread_internal ../../../../trafficserver/iocore/eventsystem/Thread.cc:84 #20 0x7568aaa0 in start_thread (/lib64/libpthread.so.0+0x7aa0) previously allocated by thread T17 ([ET_NET 15]) here: #0 0x54a42b in posix_memalign (/home/y/bin64/traffic_server+0x54a42b) #1 0x77b96afa in ats_memalign ../../../../trafficserver/lib/ts/ink_memory.cc:102 #2 0x77b984a5 in malloc_new ../../../../trafficserver/lib/ts/ink_queue.cc:260 #3 0x77b97e57 in ink_freelist_new ../../../../trafficserver/lib/ts/ink_queue.cc:183 #4 0x648f31
[GitHub] trafficserver pull request #1488: This allows old ssl_multicert.config to st...
Github user shinrich commented on a diff in the pull request: https://github.com/apache/trafficserver/pull/1488#discussion_r103023456 --- Diff: iocore/net/SSLUtils.cc --- @@ -2007,7 +2007,10 @@ SSLParseCertificateConfiguration(const SSLConfigParams *params, SSLCertLookup *l if (ssl_extract_certificate(_info, sslMultiCertSettings)) { // There must be a certificate specified unless the tunnel action is set if (sslMultiCertSettings.cert || sslMultiCertSettings.opt != SSLCertContext::OPT_TUNNEL) { -ssl_store_ssl_context(params, lookup, ); +if (ssl_store_ssl_context(params, lookup, ) == nullptr) { + Error("failed to load SSL server contexts"); + return false; +} } else { --- End diff -- I'm a bit confused why this fixes anything. I don't see any callers checking the return value.
[GitHub] trafficserver issue #1480: Crash in Http1ClientSession::set_inactivity_timeo...
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1480 Crash in Http1ClientSession::set_inactivity_timeout While testing 7.1, we see the following crash and stack track ``` #0 0x in ?? () #1 0x005d6794 in Http1ClientSession::set_inactivity_timeout (this=0x2ad091121980, timeout_in=300) at Http1ClientSession.h:161 #2 0x005d71d1 in Http1ClientTransaction::set_inactivity_timeout (this=0x2ad091121c60, timeout_in=300) at Http1ClientTransaction.h:156 #3 0x005f56b5 in HttpSM::do_setup_post_tunnel (this=0x2ad0c06a35f0, to_vc_type=HTTP_SERVER_VC) at HttpSM.cc:5726 #4 0x005e7b30 in HttpSM::state_send_server_request_header (this=0x2ad0c06a35f0, event=103, data=0x2ad048ebed40) at HttpSM.cc:2001 #5 0x005ea331 in HttpSM::main_handler (this=0x2ad0c06a35f0, event=103, data=0x2ad048ebed40) at HttpSM.cc:2662 #6 0x005160f2 in Continuation::handleEvent (this=0x2ad0c06a35f0, event=103, data=0x2ad048ebed40) at /home/shinrich/yats_build/trafficserver/iocore/eventsystem/I_Continuation.h:153 #7 0x00784a70 in write_signal_and_update (event=103, vc=0x2ad048ebebb0) at UnixNetVConnection.cc:176 #8 0x00784c76 in write_signal_done (event=103, nh=0x2ad01ce11e60, vc=0x2ad048ebebb0) at UnixNetVConnection.cc:218 #9 0x00785f09 in write_to_net_io (nh=0x2ad01ce11e60, vc=0x2ad048ebebb0, thread=0x2ad01ce0e010) at UnixNetVConnection.cc:596 #10 0x00785724 in write_to_net (nh=0x2ad01ce11e60, vc=0x2ad048ebebb0, thread=0x2ad01ce0e010) at UnixNetVConnection.cc:430 #11 0x0077d30d in NetHandler::mainNetEvent (this=0x2ad01ce11e60, event=5, e=0x13863a0) at UnixNet.cc:526 #12 0x005160f2 in Continuation::handleEvent (this=0x2ad01ce11e60, event=5, data=0x13863a0) at /home/shinrich/yats_build/trafficserver/iocore/eventsystem/I_Continuation.h:153 #13 0x007a69c5 in EThread::process_event (this=0x2ad01ce0e010, e=0x13863a0, calling_code=5) at UnixEThread.cc:143 #14 0x007a6eb1 in EThread::execute (this=0x2ad01ce0e010) at UnixEThread.cc:270 #15 0x007a6089 in spawn_thread_internal 
(a=0x1272fe0) at Thread.cc:84 #16 0x2ad016cafaa1 in start_thread () from /lib64/libpthread.so.0 #17 0x2ad0169fc93d in clone () from /lib64/libc.so.6 ``` I assume that the compiler has inlined the call to client_vc->set_inactivity_timeout(timeout_in) and the crash is someplace in the UnixNetVConnection::set_inactivity_timeout function. client_vc is non-null and appears ok.
[GitHub] trafficserver issue #1446: Need dedicated TS_SSL_SERVERNAME_HOOK
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1446 Updated PR to address issue #1459
[GitHub] trafficserver issue #1476: Crash in get_client_addr()
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1476 Crash in get_client_addr() After working through issues #1401 and #1443, we now see the following crash in our copy of 7.1. The crash occurs once every couple hours. ``` #0 0x2b46debb3867 in ?? () from /lib64/libgcc_s.so.1 #1 0x2b46debb4119 in _Unwind_Backtrace () from /lib64/libgcc_s.so.1 #2 0x2b46deeb9936 in backtrace () from /lib64/libc.so.6 #3 0x2b46dc70fe72 in ink_stack_trace_dump () at ink_stack_trace.cc:61 #4 0x2b46dc71213a in signal_crash_handler (signo=11) at signals.cc:186 #5 0x00513414 in crash_logger_invoke (signo=11, info=0x2b46e39dc130, ctx=0x2b46e39dc000) at Crash.cc:169 #6 #7 0x0001 in ?? () at ../../lib/ts/Diags.h:142 #8 0x005126d5 in NetVConnection::get_remote_addr (this=0x2aad1cc4d0a0) at /home/shinrich/yats_build/trafficserver/iocore/net/P_NetVConnection.h:30 #9 0x0055deea in ProxyClientSession::get_client_addr (this=0x2ab0dc25bba0) at ProxyClientSession.h:221 #10 0x005352bf in TSHttpSsnClientAddrGet (ssnp=0x2ab0dc25bba0) at InkAPI.cc:5439 #11 0x0053530d in TSHttpTxnClientAddrGet (txnp=0x2b474440c9f0) at InkAPI.cc:5447 #12 0x2e4008f6 in http_hook (contp=0x1e42f40, event=60006, edata=0x2b474440c9f0) at INKPluginInit.cc:174 #13 0x0052a5fd in INKContInternal::handle_event (this=0x1e42f40, event=60006, edata=0x2b474440c9f0) at InkAPI.cc:1048 #14 0x005160f2 in Continuation::handleEvent (this=0x1e42f40, event=60006, data=0x2b474440c9f0) at /home/shinrich/yats_build/trafficserver/iocore/eventsystem/I_Continuation.h:153 #15 0x0052adf3 in APIHook::invoke (this=0x1e44f40, event=60006, edata=0x2b474440c9f0) at InkAPI.cc:1267 #16 0x005e5e16 in HttpSM::state_api_callout (this=0x2b474440c9f0, event=0, data=0x0) at HttpSM.cc:1462 #17 0x005f3a11 in HttpSM::do_api_callout_internal (this=0x2b474440c9f0) at HttpSM.cc:5171 #18 0x00601b85 in HttpSM::do_api_callout (this=0x2b474440c9f0) at HttpSM.cc:438 #19 0x005e78d6 in HttpSM::state_read_server_response_header 
(this=0x2b474440c9f0, event=100, data=0x2b4744cac508) at HttpSM.cc:1962 #20 0x005ea331 in HttpSM::main_handler (this=0x2b474440c9f0, event=100, data=0x2b4744cac508) at HttpSM.cc:2662 #21 0x005160f2 in Continuation::handleEvent (this=0x2b474440c9f0, event=100, data=0x2b4744cac508) at /home/shinrich/yats_build/trafficserver/iocore/eventsystem/I_Continuation.h:153 #22 0x0078489f in read_signal_and_update (event=100, vc=0x2b4744cac3e0) at UnixNetVConnection.cc:145 #23 0x00787a10 in UnixNetVConnection::readSignalAndUpdate (this=0x2b4744cac3e0, event=100) at UnixNetVConnection.cc:1125 #24 0x0076a098 in SSLNetVConnection::net_read_io (this=0x2b4744cac3e0, nh=0x2b46e18ade60, lthread=0x2b46e18aa010) at SSLNetVConnection.cc:587 #25 0x0077d20d in NetHandler::mainNetEvent (this=0x2b46e18ade60, event=5, e=0x1873f20) at UnixNet.cc:509 #26 0x005160f2 in Continuation::handleEvent (this=0x2b46e18ade60, event=5, data=0x1873f20) at /home/shinrich/yats_build/trafficserver/iocore/eventsystem/I_Continuation.h:153 #27 0x007a69c5 in EThread::process_event (this=0x2b46e18aa010, e=0x1873f20, calling_code=5) at UnixEThread.cc:143 #28 0x007a6eb1 in EThread::execute (this=0x2b46e18aa010) at UnixEThread.cc:270 #29 0x007a6089 in spawn_thread_internal (a=0x1762fe0) at Thread.cc:84 #30 0x2b46df156aa1 in start_thread () from /lib64/libpthread.so.0 #31 0x2b46deea393d in clone () from /lib64/libc.so.6 ``` I assume that there has been a goodly amount of inlining here so it is hard to tell how we got from NetVConnection::get_remote_addr to the Diags::on method. Digging through the core, the cached member addresses seem ok, and I don't see evidence of thread races. I think I have seen this stack reported before, but couldn't find the issue.
[GitHub] trafficserver issue #1369: proxy.config.http.server_max_connections over lim...
Github user shinrich closed the issue at: https://github.com/apache/trafficserver/issues/1369
[GitHub] trafficserver issue #1369: proxy.config.http.server_max_connections over lim...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1369 Change committed.
[GitHub] trafficserver issue #1421: Segmentation fault on TLS when destination server...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1421 Related discussion on issue #1401. I have a hacky PR related to that issue, which has enabled us to continue on. It is definitely triggered by the error bubbling changes. I didn't pin it to a server reset though. Makes our servers crash within a minute.
[GitHub] trafficserver issue #1459: Mysterious uptick in user_agent SSL errors moving...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1459 Found this was an issue in the changes made in our version of 7.1 shown in PR #1446. I will update the PR with the fix. As it stands, if there is no servername hook, the cert hook will never be called. This means that certificates specified by the SNI name will never be selected.
[GitHub] trafficserver issue #1459: Mysterious uptick in user_agent SSL errors moving...
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1459 Mysterious uptick in user_agent SSL errors moving to 7.1 Comparing a machine running 7.1.x against its peer running our version of 5.3.x. A number of the proxy.process.ssl.user_agent_* metrics started increasing in the 7.1 build. Namely proxy.process.ssl.user_agent_unknown_cert and proxy.process.ssl.user_agent_bad_cert. I did packet captures for a few seconds on both machines to verify that this wasn't just a change in logging behavior. On the 7.1.x box with 5000 TLS handshakes captured we saw 81 Certificate Unknown alerts and 5 Bad Cert alerts. On the 5.3.x box with 23000 handshakes captured, 1 Bad Cert alert (from an internal IP) and 4 Certificate Unknown alerts (3 from the same IP address). After running for a few minutes, the rate of alerts in the 7.1 build does not go down. It isn't huge, but the difference is alarming me enough that I am not expanding my testing until I have a good story for this. Will go back and run 7.1.x with ASAN. Perhaps the cert buffers are getting corrupted in some cases?
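Because the two captures cover very different numbers of handshakes, the raw alert counts above are easier to compare once normalized. The sketch below does that arithmetic with the figures quoted in the issue; `CaptureSample` and `alerts_per_thousand` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cmath>

// One packet capture, using the counts quoted in the issue text.
// (Hypothetical type, for illustration only.)
struct CaptureSample {
  long handshakes;          // TLS handshakes seen in the capture
  long cert_unknown_alerts; // Certificate Unknown alerts
  long bad_cert_alerts;     // Bad Cert alerts
};

// Total certificate-related alerts per 1000 handshakes.
inline double alerts_per_thousand(const CaptureSample &s) {
  const double alerts = static_cast<double>(s.cert_unknown_alerts + s.bad_cert_alerts);
  return 1000.0 * alerts / static_cast<double>(s.handshakes);
}
```

With the quoted numbers, the 7.1.x capture (86 alerts in 5000 handshakes) works out to 17.2 alerts per thousand, while the 5.3.x capture (5 alerts in 23000 handshakes) works out to roughly 0.22 per thousand, a ratio of about 79 to 1. Both captures lasted only a few seconds, so these are rough figures rather than steady-state rates.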
[GitHub] trafficserver issue #1444: issue #1401: Potential fix to the write_to_io_net...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1444 Yes, as I said not an entirely satisfying solution. I do think the issue is triggered by the changes from TS-4796 where @jacksontj added logic to bubble up these errors. I'm guessing in some error cases the VC is already partially freed up by this point. I just checked and the fixes for TS-4796 entered between 7.0 (not there) and 7.1 (is there).
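The hazard being guessed at here, where an error is signalled to an object that has already been torn down, can be modelled in a few lines. This is a toy sketch only: `ToySM`, `ToyNetVC`, and the use of `std::weak_ptr` are assumptions made for illustration and are not how Traffic Server tracks continuations. It just shows the shape of a guard that drops an event when the receiver is gone rather than calling into freed memory.

```cpp
#include <cassert>
#include <memory>

// Hypothetical state machine that receives events from a connection.
struct ToySM {
  int events_seen = 0;
  void handle_event(int /*event*/) { ++events_seen; }
};

// Hypothetical connection holding a non-owning reference to its
// continuation. A raw pointer here would dangle once the state machine
// is destroyed; weak_ptr lets the connection detect that instead.
struct ToyNetVC {
  std::weak_ptr<ToySM> cont;

  // Deliver an event if the continuation is still alive.
  // Returns false when the event had to be dropped.
  bool signal(int event) {
    if (std::shared_ptr<ToySM> sm = cont.lock()) {
      sm->handle_event(event);
      return true;
    }
    return false;
  }
};
```

In this model, a write-complete event delivered while the state machine is alive succeeds, and a later error event after the state machine has been destroyed returns `false` instead of touching freed memory. Whether the real fix belongs in the signalling path or in the teardown path is left open in the discussion above.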
[GitHub] trafficserver issue #1452: ssl_callback_ocsp_stapling spew of messages in di...
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1452 ssl_callback_ocsp_stapling spew of messages in diags.log on start On system start with 7.1, I see lots of the following messages in diags.log for a few minutes. It eventually stops (presumably once we get a response), but it leaves a mess in the logs.
[GitHub] trafficserver issue #1445: Issue #1443 - Fix early or duplicate 404 error ha...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1445 An assert triggers. A stack trace is in the issue description: https://github.com/apache/trafficserver/issues/1443
[GitHub] trafficserver pull request #1446: Need dedicated TS_SSL_SERVERNAME_HOOK
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1446 Need dedicated TS_SSL_SERVERNAME_HOOK If you have a plugin that needs to trigger on every TLS client hello, this won't work with the TS_SSL_SNI_HOOK under openssl-1.0.2. In that version, TS_SSL_SNI_HOOK is shared with the TS_SSL_CERT_HOOK, which does not trigger on session reuse. I propose adding a dedicated TS_SSL_SERVERNAME_HOOK and deprecating the TS_SSL_SNI_HOOK, to be dropped in the next major release. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver ts_ssl_servername_hook Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1446.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1446 commit 2507909055ca96ff09458c8ffac73d912b262c06 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-02-11T00:35:01Z Need dedicated TS_SSL_SERVERNAME_HOOK If you have a plugin that needs to trigger on every TLS client hello, this won't work with the TS_SSL_SNI_HOOK under openssl-1.0.2. In that version, TS_SSL_SNI_HOOK is shared with the TS_SSL_CERT_HOOK, which does not trigger on session reuse. I propose adding a dedicated TS_SSL_SERVERNAME_HOOK and deprecating the TS_SSL_SNI_HOOK, to be dropped in the next major release.
[GitHub] trafficserver pull request #1445: Issue #1443 - Fix early or duplicate 404 e...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1445 Issue #1443 - Fix early or duplicate 404 error handling You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver issue_1443 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1445.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1445 commit 866c00aa085cb1579b48d48f293676bd712006f3 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-02-14T20:39:52Z Issue #1443 - Fix early or duplicate 404 error handling
[GitHub] trafficserver pull request #1444: issue #1401: Potential fix to the write_to...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1444 issue #1401: Potential fix to the write_to_io_net crash. Not a super satisfying solution. Without this change, ats 7.1 would crash immediately in our production environment. With this patch, we run for a few minutes before hitting the next issue. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver issue_1401 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1444.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1444 commit 5f65583b3f04ac381e2878c64f50f461288454b9 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-02-14T12:32:04Z issue #1401: Potential fix to the write_to_io_net crash.
[GitHub] trafficserver issue #1443: Assert on null t_state.transact_return_point
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1443 Tracked it down some more. In this case, new remap rule is not found, and build_error_response is called to set the status of 404 with the reason of "Not Found". But somehow the t_state.transact_return_point is not updated.
[GitHub] trafficserver issue #1443: Assert on null t_state.transact_return_point
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1443 Assert on null t_state.transact_return_point Seeing this crash now after putting in work around for issue #1401. ``` #0 0x2b20159b7625 in raise () from /lib64/libc.so.6 #1 0x2b20159b8d8d in abort () from /lib64/libc.so.6 #2 0x2b20132c8a02 in ink_abort (message_format=0x2b20132de4f0 "%s:%d: failed assertion `%s`") at ink_error.cc:99 #3 0x2b20132c6106 in _ink_assert (expression=0x7d8b20 "t_state.transact_return_point != nullptr", file=0x7d5e2c "HttpSM.cc", line=7197) at ink_assert.cc:37 #4 0x005fbbeb in HttpSM::call_transact_and_set_next_state (this=0x2b20c8096c00, f=0) at HttpSM.cc:7197 #5 0x005fbefe in HttpSM::set_next_state (this=0x2b20c8096c00) at HttpSM.cc:7253 #6 0x005fbd43 in HttpSM::call_transact_and_set_next_state (this=0x2b20c8096c00, f=0) at HttpSM.cc:7206 #7 0x005e7623 in HttpSM::handle_api_return (this=0x2b20c8096c00) at HttpSM.cc:1606 #8 0x005e7454 in HttpSM::state_api_callout (this=0x2b20c8096c00, event=6, data=0x0) at HttpSM.cc:1544 #9 0x005e6a22 in HttpSM::state_api_callback (this=0x2b20c8096c00, event=6, data=0x0) at HttpSM.cc:1340 #10 0x00536236 in TSHttpTxnReenable (txnp=0x2b20c8096c00, event=6) at InkAPI.cc:5883 #11 0x2e401263 in http_hook (contp=0x2204f40, event=60016, edata=0x2b20c8096c00) at INKPluginInit.cc:408 #12 0x0052a4d3 in INKContInternal::handle_event (this=0x2204f40, event=60016, edata=0x2b20c8096c00) at InkAPI.cc:1048 #13 0x0051592e in Continuation::handleEvent (this=0x2204f40, event=60016, data=0x2b20c8096c00) at /home/shinrich/yats_build/trafficserver/iocore/eventsystem/I_Continuation.h:153 #14 0x0052acd3 in APIHook::invoke (this=0x2206f80, event=60016, edata=0x2b20c8096c00) at InkAPI.cc:1267 #15 0x005e7134 in HttpSM::state_api_callout (this=0x2b20c8096c00, event=0, data=0x0) at HttpSM.cc:1462 #16 0x005f4d91 in HttpSM::do_api_callout_internal (this=0x2b20c8096c00) at HttpSM.cc:5168 #17 0x00602fa3 in HttpSM::do_api_callout (this=0x2b20c8096c00) 
at HttpSM.cc:438 #18 0x005fbdb0 in HttpSM::set_next_state (this=0x2b20c8096c00) at HttpSM.cc:7239 #19 0x005fbd43 in HttpSM::call_transact_and_set_next_state (this=0x2b20c8096c00, f=0) at HttpSM.cc:7206 #20 0x005e7623 in HttpSM::handle_api_return (this=0x2b20c8096c00) at HttpSM.cc:1606 #21 0x005e7454 in HttpSM::state_api_callout (this=0x2b20c8096c00, event=6, data=0x0) at HttpSM.cc:1544 #22 0x005e6a22 in HttpSM::state_api_callback (this=0x2b20c8096c00, event=6, data=0x0) at HttpSM.cc:1340 #23 0x00536236 in TSHttpTxnReenable (txnp=0x2b20c8096c00, event=6) at InkAPI.cc:5883 #24 0x2fd35999 in carpPluginHook (contp=0x2204dc0, event=60002, edata=0x2b20c8096c00) at carp.cc:542 #25 0x0052a4d3 in INKContInternal::handle_event (this=0x2204dc0, event=60002, edata=0x2b20c8096c00) at InkAPI.cc:1048 #26 0x0051592e in Continuation::handleEvent (this=0x2204dc0, event=60002, data=0x2b20c8096c00) at /home/shinrich/yats_build/trafficserver/iocore/eventsystem/I_Continuation.h:153 #27 0x0052acd3 in APIHook::invoke (this=0x2206f00, event=60002, edata=0x2b20c8096c00) at InkAPI.cc:1267 #28 0x005e7134 in HttpSM::state_api_callout (this=0x2b20c8096c00, event=6, data=0x0) at HttpSM.cc:1462 #29 0x005e6a22 in HttpSM::state_api_callback (this=0x2b20c8096c00, event=6, data=0x0) at HttpSM.cc:1340 #30 0x00536236 in TSHttpTxnReenable (txnp=0x2b20c8096c00, event=6) at InkAPI.cc:5883 #31 0x2e401263 in http_hook (contp=0x2204f40, event=60002, edata=0x2b20c8096c00) at INKPluginInit.cc:408 #32 0x0052a4d3 in INKContInternal::handle_event (this=0x2204f40, event=60002, edata=0x2b20c8096c00) at InkAPI.cc:1048 #33 0x0051592e in Continuation::handleEvent (this=0x2204f40, event=60002, data=0x2b20c8096c00) at /home/shinrich/yats_build/trafficserver/iocore/eventsystem/I_Continuation.h:153 #34 0x0052acd3 in APIHook::invoke (this=0x2206fa0, event=60002, edata=0x2b20c8096c00) at InkAPI.cc:1267 #35 0x005e7134 in HttpSM::state_api_callout (this=0x2b20c8096c00, event=0, data=0x0) at HttpSM.cc:1462 #36 0x005f4d91 in 
HttpSM::do_api_callout_internal (this=0x2b20c8096c00) at HttpSM.cc:5168 #37 0x00602fa3 in HttpSM::do_api_callout (this=0x2b20c8096c00) at HttpSM.cc:438 #38 0x005fbdb0 in HttpSM::set_next_state (this=0x2b20c8096c00) at HttpSM.cc:7239 #39 0x005fbd43 in HttpSM::call_transact_and_set_next_state (this=0x2b20c8096c00, f= 0x60e8f0 <HttpTransact::
[GitHub] trafficserver issue #1401: Segfault in write_to_net_io with 7.1.x
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1401 BTW I'm currently testing without HTTP/2.
[GitHub] trafficserver issue #1401: Segfault in write_to_net_io with 7.1.x
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1401 @PSUdaemon would be cool to see symbols with the stack to verify that is the same thing. My hacky fix made it go away only to be immediately replaced by a new crash. (New issue to be posted).
[GitHub] trafficserver issue #1401: Segfault in write_to_net_io with 7.1.x
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1401 I've finally got my environment running and I see the same stack very quickly as well. In the cases I've seen it looks like there was a write error, but for some reason the write vio has been cleared out (or was never set?) ``` (gdb) frame 2 #2 0x00788350 in write_to_net_io (nh=0x2af588003e60, vc=0x2aad1401b800, thread=0x2af58810) at UnixNetVConnection.cc:440 440 UnixNetVConnection.cc: No such file or directory. in UnixNetVConnection.cc (gdb) print *s $1 = {enabled = 0, error = 1, vio = {_cont = 0x0, nbytes = 0, ndone = 0, op = 0, buffer = {mbuf = 0x0, entry = 0x0}, vc_server = 0x0, mutex = {m_ptr = 0x0}}, ready_link = {<SLink> = {next = 0x0}, prev = 0x0}, enable_link = {next = 0x0}, in_enabled_list = 0, triggered = 1} ``` The write.error stuff was added by Thomas in TS-4796, but if this just showed up between 7.0 and 7.1, it is unlikely that this was the culprit. I think it has been a while (since 9/3/2016). Seems more likely that someone has cleared the vio, or we are bouncing an error. In the short term I'm adding a NULL check at the beginning of write_to_net_io, but that just seems to be masking the failure case rather than identifying the root cause. ``` + if (!s->vio.mutex) { +ink_release_assert(s->vio._cont == NULL && vc->write.error); +return; + } ```
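To make the intent of the short-term guard easier to see outside the full source tree, here is a self-contained model of it. The types `ToyVIO` and `ToyNetState` and the function `check_write_state` are hypothetical stand-ins that copy only the fields visible in the gdb dump above; they are not the real Traffic Server definitions, and the plain `assert` stands in for `ink_release_assert`.

```cpp
#include <cassert>

// Hypothetical mirror of the fields shown in the gdb dump.
struct ToyVIO {
  void *cont = nullptr;  // corresponds to vio._cont in the dump
  void *mutex = nullptr; // corresponds to vio.mutex.m_ptr in the dump
  long nbytes = 0;
  long ndone = 0;
};

struct ToyNetState {
  int enabled = 0;
  int error = 0;
  int triggered = 0;
  ToyVIO vio;
};

enum class WriteAction { Skip, Proceed };

// Model of the early-return guard: if the write VIO has no mutex, the
// state has been cleared (or never set), so skip the write entirely.
inline WriteAction check_write_state(const ToyNetState &s) {
  if (s.vio.mutex == nullptr) {
    // The patch expects this case only with no continuation and the
    // error flag set; any other combination would be a new surprise.
    assert(s.vio.cont == nullptr && s.error != 0);
    return WriteAction::Skip;
  }
  return WriteAction::Proceed;
}
```

Feeding in the state from the dump (`enabled = 0`, `error = 1`, `triggered = 1`, all VIO pointers null) yields `Skip`. As the comment itself notes, a guard like this avoids the crash without explaining why the VIO was cleared in the first place.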
[GitHub] trafficserver pull request #1248: TS-5070 Add configuration variables to set...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1248
[GitHub] trafficserver pull request #1365: TS-4896: TSHttp***ClientAddrGet/TSHttp***I...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1365
[GitHub] trafficserver pull request #1387: Add diags log message when cache wraps.
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1387 Add diags log message when cache wraps. Requested by our operations guys to better track wraps. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver cache_wrap_diags_msg Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1387.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1387 commit 2f419f998e8480c8ab0c42203701957a5cb1f7c1 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-01-27T22:28:00Z Add diags log message when cache wraps.
[GitHub] trafficserver pull request #1375: Incorrectly freeing Http1ClientSession set...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1375
[GitHub] trafficserver issue #1365: TS-4896: TSHttp***ClientAddrGet/TSHttp***Incoming...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1365 Replaced NULL references with nullptr.
[GitHub] trafficserver pull request #1379: Make sure to schedule connect event on cor...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1379
[GitHub] trafficserver pull request #1370: Issue #1369: proxy.config.http.server_max_...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1370
[GitHub] trafficserver issue #1375: Incorrectly freeing Http1ClientSession setting up...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1375 The client session gets cleaned up as normal after the response is sent. The problem is that the original code was cleaning up the client session via the kill tunnel before sending the error.
[GitHub] trafficserver pull request #1376: TSStringPercentDecode null-termination cli...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1376
[GitHub] trafficserver pull request #1379: Make sure to schedule connect event on cor...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1379 Make sure to schedule connect event on correct thread type. Cannot blindly schedule on current thread. It may not be the right type. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver schedule_thread_type Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1379.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1379 commit 7e8fe6e308f6c79d17713e7ecbb23f61562d7111 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-01-26T00:38:17Z Make sure to schedule connect event on correct thread type.
[GitHub] trafficserver pull request #1376: TSStringPercentDecode null-termination cli...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1376 TSStringPercentDecode null-termination clipping on overflow. Actually fixed by @petar last summer, but we forgot to get it pushed up. Long URLs would cause a buffer overflow write. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver tsstringpercentdecode_overflow Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1376.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1376 commit 1f73eac3997eb99f142b4e2dec0ecef5b55b29af Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-01-25T21:49:38Z TSStringPercentDecode null-termination clipping on overflow. Added Regression Test
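To illustrate the kind of overflow described above, here is a minimal sketch of a percent decoder that clips its output so the terminating NUL always fits in the destination buffer. This is not the Traffic Server code or the patch in this pull request: `percent_decode`, `hex_value`, and their parameters are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for this sketch: value of a hex digit, or -1 if not hex.
static int hex_value(unsigned char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Hypothetical bounded decoder (an assumption, not TSStringPercentDecode itself).
// Decodes %XX escapes from src[0..src_len) into dst, which holds dst_cap bytes.
// At most dst_cap - 1 bytes are written, followed by a NUL terminator.
// Returns false if there is no room for the NUL or the output was clipped.
bool percent_decode(const char *src, std::size_t src_len, char *dst, std::size_t dst_cap, std::size_t *out_len)
{
  assert(src != nullptr || src_len == 0);
  if (dst == nullptr || dst_cap == 0) {
    if (out_len) *out_len = 0;
    return false;
  }
  std::size_t in = 0, out = 0;
  bool clipped = false;
  while (in < src_len) {
    // Stop while one byte is still free for the terminator.
    if (out + 1 >= dst_cap) {
      clipped = true;
      break;
    }
    if (src[in] == '%' && in + 2 < src_len) {
      int hi = hex_value(static_cast<unsigned char>(src[in + 1]));
      int lo = hex_value(static_cast<unsigned char>(src[in + 2]));
      if (hi >= 0 && lo >= 0) {
        dst[out++] = static_cast<char>(hi * 16 + lo);
        in += 3;
        continue;
      }
    }
    dst[out++] = src[in++]; // literal byte, or a malformed escape copied as-is
  }
  dst[out] = '\0'; // always in bounds because out <= dst_cap - 1
  if (out_len) *out_len = out;
  return !clipped;
}
```

The design choice worth noting is that the bounds check runs before every write and reserves the last byte, so a long URL is truncated rather than written past the end of the buffer.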
[GitHub] trafficserver issue #1370: Issue #1369: proxy.config.http.server_max_connect...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1370 Addressed Alan's comment.
[GitHub] trafficserver pull request #1375: Incorrectly freeing Http1ClientSession set...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1375

Incorrectly freeing Http1ClientSession setting up to return an error

Saw a very deep stack. The following is the top of the stack.

{code}
#0 0x in ?? ()
#1 0x005e05ab in Http1ClientTransaction::do_io_write (this=0x2b512412e3d8, c=0x2b4ffd082470, nbytes=123, buf=0x2b50b4075e60, owner=false) at Http1ClientTransaction.h:45
#2 0x0063c7f5 in HttpTunnel::producer_run (this=0x2b4ffd082470, p=0x2b4ffd082670) at HttpTunnel.cc:916
#3 0x0063c103 in HttpTunnel::tunnel_run (this=0x2b4ffd082470, p_arg=0x2b4ffd082670) at HttpTunnel.cc:734
#4 0x005ff225 in HttpSM::setup_internal_transfer (this=0x2b4ffd080fd0, handler_arg=(int (HttpSM::)(HttpSM *, int, void *)) 0x5f383c <HttpSM::tunnel_handler(int, void)>) at HttpSM.cc:6222
#5 0x005ef6ae in HttpSM::handle_api_return (this=0x2b4ffd080fd0) at HttpSM.cc:1721
#6 0x005ef30b in HttpSM::state_api_callout (this=0x2b4ffd080fd0, event=6, data=0x0) at HttpSM.cc:1596
#7 0x005ee9e6 in HttpSM::state_api_callback (this=0x2b4ffd080fd0, event=6, data=0x0) at HttpSM.cc:1394
#8 0x00534ea1 in TSHttpTxnReenable (txnp=0x2b4ffd080fd0, event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5652
{code}

I spent some time grubbing through the core. It is interesting that the whole transaction is on the stack. The Http1ClientSession gets created in frame 67. We crash because it has been deleted.

I think the problem is in frame 4 in HttpSM::setup_internal_transfer. This is an error case. Specifically, ATS is trying to return "HTTP/1.0 500 INKApi Error\r\nDate: Wed, 18 May 2016 20:21:46 GMT\r\nConnection: close\r\nServer: ATS/5.3.0\r\nContent-Length: 0\r\n\r\n", which gets set from HttpTransact::HandleApiErrorJump. But in setup_internal_transfer, we call tunnel.kill_tunnel to remove any previous static producers. If there was a previous tunnel set up with our Http1ClientTransaction/Session as a consumer, kill_tunnel would call do_io_close on it, which would likely free the Http1ClientSession object.

Replaced the kill_tunnel with a reset, which doesn't free the consumer/producer. The crash went away.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver early-client-free Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1375.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1375 commit 95010edee4c5378b4f1f46174100e7a6e6115d13 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-01-25T20:46:56Z Incorrectly freeing Http1ClientSession while setting up to return a error header.
[GitHub] trafficserver pull request #1370: Issue #1369: proxy.config.http.server_max_...
Github user shinrich commented on a diff in the pull request: https://github.com/apache/trafficserver/pull/1370#discussion_r97871781

--- Diff: proxy/http/HttpSM.cc ---
@@ -4933,8 +4933,10 @@ HttpSM::do_http_server_open(bool raw)
     // between the statement above and the check below.
     // If this happens, we might go over the max by 1 but this is ok.
     if (sum >= t_state.http_config_param->server_max_connections) {
-      ink_assert(pending_action == nullptr);
-      pending_action = eventProcessor.schedule_in(this, HRTIME_MSECONDS(100));
+      // Eventually may want to have a queue as the origin_max_connection does to allow for a combination
+      // of retries and errors. But at this point, we are just going to allow the error case.
+      t_state.current.state = HttpTransact::CONNECTION_ERROR;
+      call_transact_and_set_next_state(HttpTransact::HandleResponse);
       httpSessionManager.purge_keepalives();
--- End diff --

This is the order we are currently running in Yahoo production. The purge_keepalives is just clearing out the global pool, so it should not interfere with the set next state. This was pre-existing code. I'm guessing the rationale is that if the system is loaded, we should be shutting down idle connections. Could do the purge first, I suppose.
[GitHub] trafficserver issue #1370: Issue #1369: proxy.config.http.server_max_connect...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1370 Ok. Still figuring out the new world order :-)
[GitHub] trafficserver pull request #1366: Issue #1360: REGRESSION_TEST(SDK_API_OVERR...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1366
[GitHub] trafficserver issue #1366: Issue #1360: REGRESSION_TEST(SDK_API_OVERRIDABLE_...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1366 Probably. We will be pulling this back into our 7.1.
[GitHub] trafficserver issue #1367: HdrHeap potential corruption
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1367 Hopefully this fixes what you saw, @maskit. Looking at the discussion on TS-2792, had it been solved before?
[GitHub] trafficserver pull request #1368: Issue #1367: HdrHeap potential corruption
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1368
[GitHub] trafficserver pull request #1370: Issue #1369: proxy.config.http.server_max_...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1370 Issue #1369: proxy.config.http.server_max_connections over limit fail You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver issue-1369 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1370.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1370 commit e3b586998a9906de3540ce575a550eb72b3ce206 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-01-24T03:43:59Z Issue #1369: proxy.config.http.server_max_connections over limit should fail immediately
[GitHub] trafficserver issue #1369: proxy.config.http.server_max_connections over lim...
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1369 proxy.config.http.server_max_connections over limit should fail immediately The way it is currently coded, requests over the connection limit delay 100ms and try again. This doesn't help ATS shed load. Change this to return an error to the requester immediately, much like the max_queue changes to origin_max_connections allow.
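The fail-fast behavior requested here can be sketched with a small counter that refuses new connections once the limit is reached, instead of scheduling a retry. This is only an illustration: `ConnectionLimiter`, `OpenResult`, and their members are hypothetical names, and the atomic compare-and-swap is stricter than the existing check in HttpSM.cc, whose comment accepts going over the max by 1.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical result type for this sketch.
enum class OpenResult { Opened, OverLimit };

// Hypothetical limiter (an assumption, not the ATS implementation).
// try_open() never waits: the caller gets an answer immediately and can
// return an error to the requester when the limit is reached.
class ConnectionLimiter
{
public:
  explicit ConnectionLimiter(int max_connections) : max_(max_connections) { assert(max_connections >= 0); }

  OpenResult try_open()
  {
    int current = count_.load();
    while (current < max_) {
      // On failure compare_exchange_weak reloads current, so the limit is re-checked.
      if (count_.compare_exchange_weak(current, current + 1)) {
        return OpenResult::Opened;
      }
    }
    return OpenResult::OverLimit; // shed load now rather than retry in 100ms
  }

  // Call once for every successful try_open() when the connection ends.
  void close() { count_.fetch_sub(1); }

  int active() const { return count_.load(); }

private:
  const int max_;
  std::atomic<int> count_{0};
};
```

A caller that receives `OverLimit` would map it to an error response right away, which is what lets an overloaded proxy drop work instead of queueing it.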
[GitHub] trafficserver pull request #1368: Issue #1367: HdrHeap potential corruption
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1368 Issue #1367: HdrHeap potential corruption You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver issue-1367 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1368.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1368 commit b8e2f720ba42b74a5bd119d451d8a19a555b7152 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-01-24T03:21:53Z Issue #1367: HdrHeap potential corruption
[GitHub] trafficserver issue #1367: HdrHeap potential corruption
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1367 HdrHeap potential corruption Would see this sometimes during the host header manipulation. The original code assumes that the string pointer from the field will stay static over other header manipulation calls. But thanks to coalescing, the string pointers may move around, causing bad behavior.
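The hazard described in this issue can be shown with a toy model: a heap that stores all strings in one growable buffer, so any later insertion may move earlier strings. The sketch below is an assumption-laden stand-in, not HdrHeap: `ToyHeap`, `Ref`, and `rewrite_host` are hypothetical names, and vector reallocation stands in for coalescing. The safe pattern is to take an owned copy of the value before making any call that can move the heap.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy string heap for this sketch only (not HdrHeap).
class ToyHeap
{
public:
  // Stable handle: an offset and length, valid across moves of the storage.
  struct Ref {
    std::size_t offset;
    std::size_t length;
  };

  Ref add(const std::string &s)
  {
    Ref r{storage_.size(), s.size()};
    // May reallocate, which moves every previously stored string.
    storage_.insert(storage_.end(), s.begin(), s.end());
    return r;
  }

  // The returned pointer is only valid until the next add().
  const char *data(Ref r) const { return storage_.data() + r.offset; }

private:
  std::vector<char> storage_;
};

// Copy first, mutate second: host_copy owns its bytes, so the add() below
// cannot invalidate it even if the heap storage moves.
std::string rewrite_host(ToyHeap &heap, ToyHeap::Ref host, const std::string &suffix)
{
  std::string host_copy(heap.data(host), host.length);
  heap.add(suffix);
  return host_copy + suffix;
}
```

Holding the raw `const char *` from `data()` across the `add()` call would be the bug pattern from the issue; re-fetching through the handle or copying as above avoids it.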
[GitHub] trafficserver pull request #1366: Issue #1360: REGRESSION_TEST(SDK_API_OVERR...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1366 Issue #1360: REGRESSION_TEST(SDK_API_OVERRIDABLE_CONFIGS) crash sometimes You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver issue-1360 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1366.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1366 commit 4b878c1498c79ff470d88bbc7bb5e4e11839b591 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-01-24T01:58:41Z Issue #1360: REGRESSION_TEST(SDK_API_OVERRIDABLE_CONFIGS) sometimes crashes
[GitHub] trafficserver issue #1364: Issue #1359: Flaw in TS-2157 port in server addre...
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1364 Would be reasonable to backport. We will be pulling it back into our version of 7.1.
[GitHub] trafficserver pull request #1365: TS-4896: TSHttp***ClientAddrGet/TSHttp***I...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1365 TS-4896: TSHttp***ClientAddrGet/TSHttp***IncomingAddrGet may return NULL Several of our plugins would experience crashes once we tightened up cleanup on session shutdown. They assumed that the TSHttp{Txn|Ssn}ClientAddrGet and TSHttp{Txn|Ssn}IncomingAddrGet calls would never return NULL. If the client initiates the shutdown, the underlying netvc may get cleaned up before the sessions, transactions, and state machines are completely shut down. The original code would return NULL in that case. In our version, we added this logic to cache the address information and use that data if the vc has been cleaned up. This avoided spinning changes in the plugins. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver ts-4896 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1365.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1365 commit 1c8703c9d678f74daaf0e1184affb7eb428076e3 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-01-24T01:46:15Z TS-4896: TSHttpTxnClientAddrGet and TSHttpTxnIncomingAddrGet may return NULL
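The caching approach described in this pull request can be sketched as follows. This is a simplified model, not the actual patch: `ToyNetVC`, `ToySession`, and their members are hypothetical, and a `std::string` stands in for the socket address. The session copies the address when the connection is attached and falls back to that copy once the connection has been cleaned up.

```cpp
#include <cassert>
#include <string>

// Stand-in for the network connection object (hypothetical).
struct ToyNetVC {
  std::string remote_addr;
};

// Stand-in for a client session that outlives its connection (hypothetical).
class ToySession
{
public:
  // Cache the address while the connection is still alive.
  void attach(ToyNetVC *vc)
  {
    vc_ = vc;
    if (vc != nullptr) {
      cached_addr_ = vc->remote_addr;
      has_cached_  = true;
    }
  }

  // The client closed first: the connection is gone, the session is not.
  void vc_closed() { vc_ = nullptr; }

  // Returns nullptr only if no connection was ever attached.
  const std::string *client_addr() const
  {
    if (vc_ != nullptr) {
      return &vc_->remote_addr;
    }
    return has_cached_ ? &cached_addr_ : nullptr;
  }

private:
  ToyNetVC *vc_ = nullptr;
  std::string cached_addr_;
  bool has_cached_ = false;
};
```

Callers that never checked for a null result keep working after the connection is torn down, which matches the stated goal of avoiding changes in the plugins. Checking the return value is still the safer habit for new code.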
[GitHub] trafficserver issue #1358: Should add server_connect_end to the slowlog
Github user shinrich closed the issue at: https://github.com/apache/trafficserver/issues/1358
[GitHub] trafficserver issue #1358: Should add server_connect_end to the slowlog
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1358 Fix committed.
[GitHub] trafficserver pull request #1364: Issue #1359: Flaw in TS-2157 port in serve...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1364 Issue #1359: Flaw in TS-2157 port in server address may be unset You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver issue-1359 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1364.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1364 commit 92f7a6ed50595a16aba247b4addda89695662bce Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-01-24T01:30:30Z Issue #1359: Flaw in TS-2157 port in server address may be unset
[GitHub] trafficserver pull request #1361: Issue #1358: Add server_connect_end in slo...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1361 Issue #1358: Add server_connect_end in slow log You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver issue-1358 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1361.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1361 commit 5932ac66c403352820f53c0362464029529fad48 Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-01-23T22:28:56Z Issue #1358: Add server_connect_end milestone.
[GitHub] trafficserver issue #1360: REGRESSION_TEST(SDK_API_OVERRIDABLE_CONFIGS) some...
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1360 REGRESSION_TEST(SDK_API_OVERRIDABLE_CONFIGS) sometimes crashes Due to timing changes in cleaning up sessions and state machines, REGRESSION_TEST(SDK_API_OVERRIDABLE_CONFIGS) sometimes crashes.
[GitHub] trafficserver issue #1359: Flaw in TS-2157 means that port in server address...
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1359 Flaw in TS-2157 means that port in server address is unset This is only an issue if the request to origin is to a non-standard port.
[GitHub] trafficserver issue #1358: Should add server_connect_end to the milestones
GitHub user shinrich opened an issue: https://github.com/apache/trafficserver/issues/1358 Should add server_connect_end to the milestones We added this milestone entry a while back to help debug a performance problem. Would like to contribute this back.
[GitHub] trafficserver pull request #1347: TS-5022: nullptr check
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1347
[GitHub] trafficserver issue #1347: TS-5022: nullptr check
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1347 Looks good to me. Explicitly checking that both the path and file name are NULL should avoid bogus load attempts. It is still odd that the relative layout logic uses getcwd if the path is NULL.
[GitHub] trafficserver issue #1315: TS-5022: fix silent exit problem
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/pull/1315 Looks good.
[GitHub] trafficserver pull request #1315: TS-5022: fix silent exit problem
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1315
[GitHub] trafficserver issue #1308: CID 1368316 & 1368315: Leaks and NULL references
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1308 Both issues have been addressed.
[GitHub] trafficserver issue #1308: CID 1368316 & 1368315: Leaks and NULL references
Github user shinrich closed the issue at: https://github.com/apache/trafficserver/issues/1308
[GitHub] trafficserver pull request #1314: Fix Http2Stream *stream variable shadow re...
Github user shinrich closed the pull request at: https://github.com/apache/trafficserver/pull/1314
[GitHub] trafficserver issue #1308: CID 1368316 & 1368315: Leaks and NULL references
Github user shinrich commented on the issue: https://github.com/apache/trafficserver/issues/1308 PR #1314 should fix CID 1368315.
[GitHub] trafficserver pull request #1314: Fix Http2Stream *stream variable shadow re...
GitHub user shinrich opened a pull request: https://github.com/apache/trafficserver/pull/1314 Fix Http2Stream *stream variable shadow reported by issue 1308. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shinrich/trafficserver fix-issue-1308 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/1314.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1314 commit 7981d28c7c74fd843be21d265b44e83aec4820db Author: Susan Hinrichs <shinr...@ieee.org> Date: 2017-01-09T18:40:52Z Fix Http2Stream *stream variable shadow reported by issue 1308.