[jira] [Commented] (TS-974) TS should have a mode to hold partial objects in cache
[ https://issues.apache.org/jira/browse/TS-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605500#comment-14605500 ] ASF subversion and git services commented on TS-974: Commit 528eab64a26e869ce69a4bb3729c7441cd9b4906 in trafficserver's branch refs/heads/poc-6-0-x from [~amc] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=528eab6 ] TS-974: Partial Object Caching. TS should have a mode to hold partial objects in cache -- Key: TS-974 URL: https://issues.apache.org/jira/browse/TS-974 Project: Traffic Server Issue Type: Improvement Components: Cache Affects Versions: 3.0.1 Reporter: William Bardwell Assignee: Alan M. Carroll Labels: A Fix For: 6.1.0 For ATS to do an excellent job caching large files like video, it would need to be able to hold partial objects for a large file. This could be done in a plugin or in the core. It would need to be integrated with the Range handling code, both to serve requests out of the partial objects and to fetch more parts of a file to satisfy a Range request. An intermediate step (also doable in the core or in a plugin) would be to add settings that let the Range handling code trigger a full file download, either asynchronously when a Range response indicates that the file isn't larger than some threshold, or synchronously when a Range request could reasonably be answered quickly from a full request. (Right now Range requests are tunneled if there is no full cached content, as far as I can tell.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
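To make the "serve requests out of the partial objects" idea concrete, here is a minimal sketch of the check a Range handler would need: whether a requested byte range is fully covered by the spans already held in cache. This is not taken from the TS-974 commit. The `Span` type, the `range_is_cached` name, and the assumption that cached spans are kept sorted and half-open are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical representation: a cached span is the half-open byte range
// [first, second). Spans are assumed to be sorted by start offset.
typedef std::pair<std::int64_t, std::int64_t> Span;

// Returns true if every byte in the inclusive HTTP range [first, last]
// is covered by the cached spans, so no origin fetch would be needed.
inline bool range_is_cached(const std::vector<Span> &spans, std::int64_t first, std::int64_t last) {
  std::int64_t need = first; // next byte offset still uncovered
  for (const Span &s : spans) {
    if (s.first > need) {
      break; // gap before this span: the range cannot be served from cache
    }
    if (s.second > need) {
      need = s.second; // this span extends the covered prefix
    }
    if (need > last) {
      return true;
    }
  }
  return need > last;
}
```

With spans `[0,100)`, `[100,200)` and `[300,400)`, a request for bytes 50-150 is covered, while 150-350 is not because of the gap at 200-299; in the second case a real implementation would have to fetch the missing part from the origin.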
[jira] [Updated] (TS-3661) cache broken during 3xx redirect follow response
[ https://issues.apache.org/jira/browse/TS-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3661: Backport to Version: (was: 5.3.1) cache broken during 3xx redirect follow response Key: TS-3661 URL: https://issues.apache.org/jira/browse/TS-3661 Project: Traffic Server Issue Type: Bug Components: HTTP Affects Versions: 5.2.1 Reporter: Sudheer Vinukonda Assignee: Sudheer Vinukonda Fix For: 5.3.1, 6.0.0

During a 3xx redirect follow, the current TS implementation creates a new cache key for the redirect follow request and stores the response against the new cache key. There's some logic in *HttpCacheSM::open_write* that checks that a txn is not stuck in an open_write loop. https://github.com/apache/trafficserver/blob/master/proxy/http/HttpCacheSM.cc#L289 The logic checks the current *open_write_tries* for a given *cache_sm* object against *proxy.config.http.cache.max_open_write_retries* and against *redirection_tries* for that txn. The assumption here is that we allow at least one *open_write_try* per redirect follow attempt.

{code}
if (open_write_tries > master_sm->redirection_tries &&
    open_write_tries > master_sm->t_state.http_config_param->max_cache_open_write_retries) {
  master_sm->handleEvent(CACHE_EVENT_OPEN_WRITE_FAILED, (void *)-ECACHE_DOC_BUSY);
  return ACTION_RESULT_DONE;
}
{code}

However, the *open_write_tries* counter is incremented before checking the condition, while *redirection_tries* is only incremented after receiving the server response, which is too late. As a result, open_write_tries runs ahead and the check always fails (except for a non-default value (> 1) of *proxy.config.http.cache.max_open_write_retries*).

Update: While the above is *a* possible issue, the real reason for this regression is not directly related to it. The above problem actually ends up helping the cause (so to speak), since it fails all open_writes with the new (location based) cache key during redirect follow. The issue seems to have resulted from the commits in TS-3140, which close the original cache_sm before doing a redirect follow. Without the TS-3140 change, the original cache_sm (opened against the original client URL as the key) is still open and is used to write the final (2xx) response from the origin. However, closing the cache_sm before the redirect follow, as TS-3140 does, means the object never gets cached. Fixing this should be very simple: either reset the *open_write_tries* counter in *cache_sm.close_write()*, which is performed during the redirect follow (before open_write with the new cache key), or adjust the logic a bit (e.g. swap around the point at which the counters are incremented). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
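The counter interaction described in TS-3661 can be modelled in a few lines. The following is a toy sketch, not the real `HttpCacheSM`: the `RetryGuard` struct is hypothetical, the comparison operators are assumed to be `>` joined by `&&` (consistent with the description of the check), and the default of 1 for the retry limit is inferred from the "non-default value" remark.

```cpp
#include <cassert>

// Toy model of the open_write retry guard; names mirror the report,
// but this struct is hypothetical and not Traffic Server code.
struct RetryGuard {
  int open_write_tries;       // bumped on every open_write attempt
  int redirection_tries;      // bumped only after a 3xx response arrives
  int max_open_write_retries; // stand-in for the config value (assumed default: 1)

  RetryGuard() : open_write_tries(0), redirection_tries(0), max_open_write_retries(1) {}

  // Returns true if this open_write attempt may proceed.
  bool try_open_write() {
    ++open_write_tries; // incremented before the check, as the report describes
    return !(open_write_tries > redirection_tries && open_write_tries > max_open_write_retries);
  }

  // The reset proposed in the report: clear the counter on close_write.
  void close_write() { open_write_tries = 0; }
};
```

In this model the first open_write passes. After one redirect (`redirection_tries == 1`) the second attempt is rejected because the counter has already reached 2, whereas calling `close_write()` before the second attempt lets it through.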
[jira] [Updated] (TS-3719) HPACK error in lowering table size
[ https://issues.apache.org/jira/browse/TS-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3719: Backport to Version: (was: 5.3.1) HPACK error in lowering table size -- Key: TS-3719 URL: https://issues.apache.org/jira/browse/TS-3719 Project: Traffic Server Issue Type: Bug Components: HTTP/2 Reporter: Bryan Call Assignee: Bryan Call Fix For: 5.3.1, 6.0.0 The table size is reduced by (max size - new size) instead of (current size - new size). This causes the table to try to delete items that don't exist. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
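The difference between the two subtractions in TS-3719 is easiest to see with a small model. The sketch below is not the ATS HPACK code; `DynTable` and its members are hypothetical and track only entry sizes. It evicts based on the current size, which is the behaviour the report says is correct.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical dynamic table that records only the size of each entry.
struct DynTable {
  std::deque<std::size_t> entries; // front = newest, back = oldest
  std::size_t current_size;
  std::size_t max_size;

  DynTable() : current_size(0), max_size(4096) {}

  // Lower (or raise) the limit. Eviction is driven by how far the
  // *current* size exceeds the new limit, never by the old maximum.
  // Returns the number of entries evicted.
  std::size_t set_max_size(std::size_t new_max) {
    std::size_t evicted = 0;
    while (current_size > new_max && !entries.empty()) {
      current_size -= entries.back();
      entries.pop_back();
      ++evicted;
    }
    max_size = new_max;
    return evicted;
  }

  void add(std::size_t entry_size) {
    entries.push_front(entry_size);
    current_size += entry_size;
    set_max_size(max_size); // evict oldest entries if the table overflowed
  }
};
```

With 300 bytes in a 4096-byte table, lowering the limit to 1000 must evict nothing. A computation based on max size - new size would ask for 3096 bytes to be freed, which is how the table ends up trying to delete items that do not exist.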
[jira] [Updated] (TS-3665) Redirect logic causing debug asserts and leaking cache_vc's
[ https://issues.apache.org/jira/browse/TS-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3665: Backport to Version: (was: 5.3.1) Redirect logic causing debug asserts and leaking cache_vc's --- Key: TS-3665 URL: https://issues.apache.org/jira/browse/TS-3665 Project: Traffic Server Issue Type: Bug Components: Cache Reporter: Susan Hinrichs Assignee: Susan Hinrichs Fix For: 5.3.1, 6.0.0 Attachments: ts-3665-2.diff, ts-3665.diff

This is related to TS-3140 and TS-3661. I spent this morning reviewing the issue addressed by TS-3140 after the fixes for TS-3661 were put in place. TS-3140 addresses the issue when the 301 is in cache, but I'm seeing asserts both for 301's in cache and for 301's not in cache.

My first assert was at line 109 of HttpCacheSM.cc, ink_assert(cache_read_vc == NULL). While this is only a debug assert, if we ignore it we will reassign cache_read_vc without freeing the previous one. I addressed this by adding cache_sm.close_read() to the HttpTransact::SM_ACTION_REDIRECT_READ case of HttpSM::handle_api_return.

My second assert is in HttpSM::do_cache_prepare_action (line 4446 of HttpSM.cc). Before the changes for TS-3661, it was expressing itself in the SM_ACTION_CACHE_ISSUE_WRITE case of HttpSM::cache_write_state(). In this case, do_cache_prepare_action will open a new cache_write_vc, overwriting the original and losing the cache_vc memory. The original fix to TS-3140 addressed this by adding a cache_sm.close_write in the SM_ACTION_REDIRECT_READ case of HttpSM::handle_api_return. That caused the TS-3661 problem of the originally selected cache key being lost, but if you pass through this logic, I assume that the original cache write vc will be lost anyway. [~sudheerv] and [~zwoop], does this situation not happen in your redirect use cases? I'm afraid that I'm not following how the original cache key is preserved in the second cache open only if the first cache write open is not cleaned up.

My test URLs are: curl -v --proxy localhost:80 http://whos.amung.us/cwidget/4s62rme9/007071fecc4e.png and curl -v --proxy localhost:80 http://docs.trafficserver.apache.org -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3721) Remove scoping on enum for HTTP2_FRAME_TYPE_MAX
[ https://issues.apache.org/jira/browse/TS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3721: Fix Version/s: 5.3.1 Remove scoping on enum for HTTP2_FRAME_TYPE_MAX --- Key: TS-3721 URL: https://issues.apache.org/jira/browse/TS-3721 Project: Traffic Server Issue Type: Bug Components: HTTP/2 Reporter: Bryan Call Assignee: Bryan Call Fix For: 5.3.1, 6.0.0 Seeing this on our RHEL 6.5 build server: Http2ConnectionState.cc: In member function ‘int Http2ConnectionState::main_event_handler(int, void*)’: Http2ConnectionState.cc:643: error: ‘Http2FrameType’ is not a class or namespace Http2ConnectionState.cc:644: error: ‘Http2FrameType’ is not a class or namespace -- This message was sent by Atlassian JIRA (v6.3.4#6332)
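The compiler errors in TS-3721 are what an older compiler typically reports when an enumerator of a plain (unscoped) enum is qualified with the enum's name, a form that only became valid in C++11. The sketch below shows the portable spelling that "removing the scoping" presumably refers to. The enum is a hypothetical two-entry subset, and `is_known_frame_type` is an illustrative helper, not a function from `Http2ConnectionState.cc`.

```cpp
#include <cassert>

// Hypothetical subset of the frame type enum. DATA and HEADERS use the
// HTTP/2 frame type codes 0x0 and 0x1; the real enum has more entries.
enum Http2FrameType {
  HTTP2_FRAME_TYPE_DATA    = 0,
  HTTP2_FRAME_TYPE_HEADERS = 1,
  HTTP2_FRAME_TYPE_MAX
};

// Enumerators of an unscoped enum live in the enclosing scope, so they can
// be named without a qualifier. Writing Http2FrameType::HTTP2_FRAME_TYPE_MAX
// is accepted only from C++11 on; pre-C++11 compilers reject it with
// "is not a class or namespace".
inline bool is_known_frame_type(unsigned type) {
  return type < HTTP2_FRAME_TYPE_MAX;
}
```

A bounds check of this shape is also the usual guard before indexing a per-frame-type handler array, which appears to be the subject of TS-3697.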
[jira] [Updated] (TS-3697) Fix frame type array check for http/2 is invalid
[ https://issues.apache.org/jira/browse/TS-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3697: Backport to Version: (was: 5.3.1) Fix frame type array check for http/2 is invalid Key: TS-3697 URL: https://issues.apache.org/jira/browse/TS-3697 Project: Traffic Server Issue Type: Bug Components: HTTP/2 Reporter: Bryan Call Assignee: Bryan Call Fix For: 5.3.1, 6.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3719) HPACK error in lowering table size
[ https://issues.apache.org/jira/browse/TS-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3719: Fix Version/s: 5.3.1 HPACK error in lowering table size -- Key: TS-3719 URL: https://issues.apache.org/jira/browse/TS-3719 Project: Traffic Server Issue Type: Bug Components: HTTP/2 Reporter: Bryan Call Assignee: Bryan Call Fix For: 5.3.1, 6.0.0 The table size is reduced by (max size - new size) instead of (current size - new size). This causes the table to try to delete items that don't exist. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3665) Redirect logic causing debug asserts and leaking cache_vc's
[ https://issues.apache.org/jira/browse/TS-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3665: Fix Version/s: 5.3.1 Redirect logic causing debug asserts and leaking cache_vc's --- Key: TS-3665 URL: https://issues.apache.org/jira/browse/TS-3665 Project: Traffic Server Issue Type: Bug Components: Cache Reporter: Susan Hinrichs Assignee: Susan Hinrichs Fix For: 5.3.1, 6.0.0 Attachments: ts-3665-2.diff, ts-3665.diff

This is related to TS-3140 and TS-3661. I spent this morning reviewing the issue addressed by TS-3140 after the fixes for TS-3661 were put in place. TS-3140 addresses the issue when the 301 is in cache, but I'm seeing asserts both for 301's in cache and for 301's not in cache.

My first assert was at line 109 of HttpCacheSM.cc, ink_assert(cache_read_vc == NULL). While this is only a debug assert, if we ignore it we will reassign cache_read_vc without freeing the previous one. I addressed this by adding cache_sm.close_read() to the HttpTransact::SM_ACTION_REDIRECT_READ case of HttpSM::handle_api_return.

My second assert is in HttpSM::do_cache_prepare_action (line 4446 of HttpSM.cc). Before the changes for TS-3661, it was expressing itself in the SM_ACTION_CACHE_ISSUE_WRITE case of HttpSM::cache_write_state(). In this case, do_cache_prepare_action will open a new cache_write_vc, overwriting the original and losing the cache_vc memory. The original fix to TS-3140 addressed this by adding a cache_sm.close_write in the SM_ACTION_REDIRECT_READ case of HttpSM::handle_api_return. That caused the TS-3661 problem of the originally selected cache key being lost, but if you pass through this logic, I assume that the original cache write vc will be lost anyway. [~sudheerv] and [~zwoop], does this situation not happen in your redirect use cases? I'm afraid that I'm not following how the original cache key is preserved in the second cache open only if the first cache write open is not cleaned up.

My test URLs are: curl -v --proxy localhost:80 http://whos.amung.us/cwidget/4s62rme9/007071fecc4e.png and curl -v --proxy localhost:80 http://docs.trafficserver.apache.org -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3721) Remove scoping on enum for HTTP2_FRAME_TYPE_MAX
[ https://issues.apache.org/jira/browse/TS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3721: Backport to Version: (was: 5.3.1) Remove scoping on enum for HTTP2_FRAME_TYPE_MAX --- Key: TS-3721 URL: https://issues.apache.org/jira/browse/TS-3721 Project: Traffic Server Issue Type: Bug Components: HTTP/2 Reporter: Bryan Call Assignee: Bryan Call Fix For: 5.3.1, 6.0.0 Seeing this on our RHEL 6.5 build server: Http2ConnectionState.cc: In member function ‘int Http2ConnectionState::main_event_handler(int, void*)’: Http2ConnectionState.cc:643: error: ‘Http2FrameType’ is not a class or namespace Http2ConnectionState.cc:644: error: ‘Http2FrameType’ is not a class or namespace -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3697) Fix frame type array check for http/2 is invalid
[ https://issues.apache.org/jira/browse/TS-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3697: Fix Version/s: 5.3.1 Fix frame type array check for http/2 is invalid Key: TS-3697 URL: https://issues.apache.org/jira/browse/TS-3697 Project: Traffic Server Issue Type: Bug Components: HTTP/2 Reporter: Bryan Call Assignee: Bryan Call Fix For: 5.3.1, 6.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3643) TS-2944 changes logging format / breaks compatibility
[ https://issues.apache.org/jira/browse/TS-3643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3643: Fix Version/s: (was: 5.3.1) TS-2944 changes logging format / breaks compatibility - Key: TS-3643 URL: https://issues.apache.org/jira/browse/TS-3643 Project: Traffic Server Issue Type: Bug Components: Configuration, Logging Affects Versions: 5.3.0 Reporter: David Carlin TS-2944 broke our log processing by changing cache result code from TCP_HIT to TCP_MEM_HIT for the majority of our responses. Additionally, TS-3036 is better. https://issues.apache.org/jira/browse/TS-3036?focusedCommentId=14561404&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14561404 Should TS-2944 be reverted since it's redundant and breaks compatibility? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3486) Segfault in do_io_write with plugin (??)
[ https://issues.apache.org/jira/browse/TS-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3486: Backport to Version: 5.3.1 Segfault in do_io_write with plugin (??) Key: TS-3486 URL: https://issues.apache.org/jira/browse/TS-3486 Project: Traffic Server Issue Type: Bug Affects Versions: 5.2.0, 5.3.0 Reporter: Qiang Li Assignee: Susan Hinrichs Labels: crash Fix For: sometime Attachments: ts-3266-2.diff, ts-3266-complete.diff, ts3486-ptrace.txt.gz

{code}
(gdb) bt
#0 0x005bdb8b in HttpServerSession::do_io_write (this=<value optimized out>, c=0x2aaadccc4bf0, nbytes=576, buf=0x2aaafc2ffee8, owner=false) at HttpServerSession.cc:104
#1 0x005acc1d in HttpSM::setup_server_send_request (this=0x2aaadccc4bf0) at HttpSM.cc:5686
#2 0x005b3f85 in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1520
#3 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1455
#4 0x005b980b in HttpSM::state_api_callback (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1275
#5 0x004d7a1b in TSHttpTxnReenable (txnp=0x2aaadccc4bf0, event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5614
#6 0x2ba118441c89 in cachefun (contp=<value optimized out>, event=<value optimized out>, edata=0x2aaadccc4bf0) at main.cpp:1876
#7 0x005b4466 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=<value optimized out>, data=<value optimized out>) at HttpSM.cc:1381
#8 0x005b627d in HttpSM::do_http_server_open (this=0x2aaadccc4bf0, raw=<value optimized out>) at HttpSM.cc:4639
#9 0x005baa04 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7021
#10 0x005b25a3 in HttpSM::state_cache_open_write (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2442
#11 0x005b5b28 in HttpSM::main_handler (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2554
#12 0x0059338a in handleEvent (this=0x2aaadccc6618, event=<value optimized out>, data=0x2aab1c3b6800) at ../../iocore/eventsystem/I_Continuation.h:145
#13 HttpCacheSM::state_cache_open_write (this=0x2aaadccc6618, event=<value optimized out>, data=0x2aab1c3b6800) at HttpCacheSM.cc:167
#14 0x00697223 in handleEvent (this=0x2aab1c3b6800, event=<value optimized out>) at ../../iocore/eventsystem/I_Continuation.h:145
#15 CacheVC::callcont (this=0x2aab1c3b6800, event=<value optimized out>) at ../../iocore/cache/P_CacheInternal.h:662
#16 0x00715940 in Cache::open_write (this=<value optimized out>, cont=<value optimized out>, key=0x2ba0ff762d70, info=<value optimized out>, apin_in_cache=46914401429576, type=CACHE_FRAG_TYPE_HTTP, hostname=0x2aaadd281078 www.mifangba.comhttpapi.phpwww.mifangba.comhttp://www.mifangba.com/api.php?op=countid=4modelid=12;, host_len=16) at CacheWrite.cc:1788
#17 0x006e5765 in open_write (this=<value optimized out>, cont=0x2aaadccc6618, expected_size=<value optimized out>, url=0x2aaadccc5310, cluster_cache_local=<value optimized out>, request=<value optimized out>, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at P_CacheInternal.h:1093
#18 CacheProcessor::open_write (this=<value optimized out>, cont=0x2aaadccc6618, expected_size=<value optimized out>, url=0x2aaadccc5310, cluster_cache_local=<value optimized out>, request=<value optimized out>, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at Cache.cc:3622
#19 0x005936f0 in HttpCacheSM::open_write (this=0x2aaadccc6618, url=<value optimized out>, request=<value optimized out>, old_info=<value optimized out>, pin_in_cache=<value optimized out>, retry=<value optimized out>, allow_multiple=false) at HttpCacheSM.cc:298
#20 0x005a022e in HttpSM::do_cache_prepare_action (this=0x2aaadccc4bf0, c_sm=0x2aaadccc6618, object_read_info=0x0, retry=true, allow_multiple=false) at HttpSM.cc:4511
#21 0x005babd9 in do_cache_prepare_write (this=0x2aaadccc4bf0) at HttpSM.cc:4436
#22 HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7098
#23 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517
#24 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=0, data=0x0) at HttpSM.cc:1455
#25 0x005ba712 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6876
#26 0x005ba702 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6919
#27 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517
#28 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1455
#29 0x005b980b in
[jira] [Commented] (TS-3642) proxy.config.http.share_server_sessions not working
[ https://issues.apache.org/jira/browse/TS-3642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605913#comment-14605913 ] Phil Sorber commented on TS-3642: - [~amc] said he is going to fix this today. proxy.config.http.share_server_sessions not working --- Key: TS-3642 URL: https://issues.apache.org/jira/browse/TS-3642 Project: Traffic Server Issue Type: Bug Components: Configuration Affects Versions: 5.3.0 Reporter: David Carlin Assignee: Phil Sorber Fix For: 5.3.1 Testing 5.3.0 and I noticed proxy.config.http.share_server_sessions = 1 no longer works. Saw a 10-15x increase in origin connections; there appears to be some reuse, I am seeing approximately 1.2-1.3 requests per origin connection. Setting proxy.config.http.server_session_sharing.pool = global restored expected behavior (Thanks [~sudheerv]!) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (TS-3643) TS-2944 changes logging format / breaks compatibility
[ https://issues.apache.org/jira/browse/TS-3643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber closed TS-3643. --- Resolution: Won't Fix TS-2944 was added in 5.1.0, so this has been around for a while. Furthermore, [~dcarlin] has worked around this. TS-2944 changes logging format / breaks compatibility - Key: TS-3643 URL: https://issues.apache.org/jira/browse/TS-3643 Project: Traffic Server Issue Type: Bug Components: Configuration, Logging Affects Versions: 5.3.0 Reporter: David Carlin TS-2944 broke our log processing by changing cache result code from TCP_HIT to TCP_MEM_HIT for the majority of our responses. Additionally, TS-3036 is better. https://issues.apache.org/jira/browse/TS-3036?focusedCommentId=14561404&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14561404 Should TS-2944 be reverted since it's redundant and breaks compatibility? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3643) TS-2944 changes logging format / breaks compatibility
[ https://issues.apache.org/jira/browse/TS-3643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3643: Backport to Version: (was: 5.3.1) TS-2944 changes logging format / breaks compatibility - Key: TS-3643 URL: https://issues.apache.org/jira/browse/TS-3643 Project: Traffic Server Issue Type: Bug Components: Configuration, Logging Affects Versions: 5.3.0 Reporter: David Carlin TS-2944 broke our log processing by changing cache result code from TCP_HIT to TCP_MEM_HIT for the majority of our responses. Additionally, TS-3036 is better. https://issues.apache.org/jira/browse/TS-3036?focusedCommentId=14561404&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14561404 Should TS-2944 be reverted since it's redundant and breaks compatibility? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-2150) Add Milestone log tags
[ https://issues.apache.org/jira/browse/TS-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605991#comment-14605991 ] ASF GitHub Bot commented on TS-2150: Github user fpesce commented on the pull request: https://github.com/apache/trafficserver/pull/229#issuecomment-116774590 Looks like a good least worst idea to me. I'll update the PR. Add Milestone log tags -- Key: TS-2150 URL: https://issues.apache.org/jira/browse/TS-2150 Project: Traffic Server Issue Type: New Feature Components: Logging Reporter: Leif Hedstrom Assignee: John Rushford Labels: yahoo Fix For: sometime We have a notion of milestones in the core, and plugin APIs (TSHttpTxnMilestoneGet() ). It'd be useful to expose these milestone timers as a log tag, something like: {code} %{UA_BEGIN}mtms {code} mtms is just an example / suggestion, MilestoneTimeMilliSecond, we can make it whatever we like. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3683) Add a tag to log SSL Session/Ticket HIT as well as TCP connection reused
[ https://issues.apache.org/jira/browse/TS-3683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] François Pesce updated TS-3683: --- Labels: yahoo (was: ) Add a tag to log SSL Session/Ticket HIT as well as TCP connection reused Key: TS-3683 URL: https://issues.apache.org/jira/browse/TS-3683 Project: Traffic Server Issue Type: Improvement Components: Logging Reporter: François Pesce Assignee: Alan M. Carroll Labels: yahoo Fix For: 6.1.0 These tags would be useful for performance metrics collection:
%cqtr The TCP reused status; indicates if this request went through an already established connection.
%cqssr The SSL session/ticket reused status; indicates if this request hit the SSL session/ticket and avoided a full SSL handshake.
Each tag would display 0 if not reused and 1 if reused. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-2150) Add Milestone log tags
[ https://issues.apache.org/jira/browse/TS-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605982#comment-14605982 ] ASF GitHub Bot commented on TS-2150: Github user SolidWallOfCode commented on the pull request: https://github.com/apache/trafficserver/pull/229#issuecomment-116773852 I discussed this with Bryan and Susan and we think the least worst option is to put two milestone indices in the `LogField` class. The code in `LogFormat.cc` that creates the `LogField` instances has no provisions for creating subclasses, which IMHO is a problem but outside the scope of this bug. Therefore we recommend adding two milestone indices to `LogField`, storing the appropriate milestone indices there in the constructor (by doing the lookup in the map), and then using those (instead of doing the lookup) during actual logging operations. Add Milestone log tags -- Key: TS-2150 URL: https://issues.apache.org/jira/browse/TS-2150 Project: Traffic Server Issue Type: New Feature Components: Logging Reporter: Leif Hedstrom Assignee: John Rushford Labels: yahoo Fix For: sometime We have a notion of milestones in the core, and plugin APIs (TSHttpTxnMilestoneGet() ). It'd be useful to expose these milestone timers as a log tag, something like: {code} %{UA_BEGIN}mtms {code} mtms is just an example / suggestion, MilestoneTimeMilliSecond, we can make it whatever we like. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
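The recommendation in the comment above (resolve milestone names once in the constructor, then use the stored indices while logging) can be sketched as follows. This is not the real `LogField` class. `MilestoneField`, `milestone_index`, the three-entry `Milestone` enum, and the reading of the two indices as the start and end of an interval are all assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical subset of the milestone identifiers.
enum Milestone { UA_BEGIN = 0, UA_FIRST_READ, SM_FINISH, MILESTONE_COUNT };

// Name-to-index lookup; this is the map search that should happen once
// at construction time rather than on every logged transaction.
inline int milestone_index(const std::string &name) {
  static std::map<std::string, int> table;
  if (table.empty()) {
    table["UA_BEGIN"]      = UA_BEGIN;
    table["UA_FIRST_READ"] = UA_FIRST_READ;
    table["SM_FINISH"]     = SM_FINISH;
  }
  std::map<std::string, int>::const_iterator it = table.find(name);
  return it == table.end() ? -1 : it->second;
}

// Stand-in for a log field that carries two pre-resolved milestone indices.
struct MilestoneField {
  int first;
  int second;

  MilestoneField(const std::string &a, const std::string &b)
    : first(milestone_index(a)), second(milestone_index(b)) {}

  bool valid() const { return first >= 0 && second >= 0; }

  // Hot path: plain array indexing, no string lookup.
  long elapsed(const long (&times)[MILESTONE_COUNT]) const {
    assert(valid());
    return times[second] - times[first];
  }
};
```

A field built as `MilestoneField("UA_BEGIN", "UA_FIRST_READ")` pays for the two map lookups once; each later call to `elapsed` only indexes into the transaction's milestone array.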
[jira] [Commented] (TS-3714) TS seems to stall between ua_begin and ua_first_read on some transactions resulting in high response times.
[ https://issues.apache.org/jira/browse/TS-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606731#comment-14606731 ] ASF subversion and git services commented on TS-3714: - Commit a88d6adfa9729558b0368d9bc636593b66fba813 in trafficserver's branch refs/heads/ts3714 from [~sudheerv] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=a88d6ad ] [TS-3714]: replace ink_get_hrtime_internal() with Thread::get_hrtime() TS seems to stall between ua_begin and ua_first_read on some transactions resulting in high response times. --- Key: TS-3714 URL: https://issues.apache.org/jira/browse/TS-3714 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 5.3.0 Reporter: Sudheer Vinukonda Assignee: Sudheer Vinukonda

An example slow log showing very high *ua_first_read*.

{code}
ERROR: [8624075] Slow Request: client_ip: xx.xx.xx.xxx url: http://xx status: 200 unique id: bytes: 86 fd: 0 client state: 0 server state: 9 ua_begin: 0.000 ua_first_read: 42.224 ua_read_header_done: 42.224 cache_open_read_begin: 42.224 cache_open_read_end: 42.224 dns_lookup_begin: 42.224 dns_lookup_end: 42.224 server_connect: 42.224 server_first_read: 42.229 server_read_header_done: 42.229 server_close: 42.229 ua_begin_write: 42.229 ua_close: 42.229 sm_finish: 42.229
{code}

Initially, I suspected that it might be caused by browsers connecting early, before sending any bytes to TS. However, this seems to be happening too frequently and with an unrealistically high delay between *ua_begin* and *ua_first_read*. I suspect it's caused by the code that disables the read temporarily before calling the *TXN_START* hook and re-enables it after the API callout. The read disable is done via a 0-byte *do_io_read* on the client vc, but the problem is that a valid *mbuf* is still passed. Based on what I am seeing, it's possible this actually enables the *read_vio* all the way up to *ssl_read_from_net*, for instance (if there's a race condition and there were already bytes from the client resulting in an epoll read event), which would finally disable the read since the *ntodo* (nbytes) is 0. However, this may result in the epoll event being lost until new i/o arrives from the client. I'm trying out further experiments to confirm the theory. In most cases, the read buffer already has some bytes by the time the http session and http sm are created, which makes it just work. But if there's a slight delay in receiving bytes after the connection is made, the epoll mechanism should read them, yet due to the way the temporary read disable is done, the event may be lost (this is because ATS uses the epoll edge-triggered mode).

Some history on this issue - This issue has been a problem for a long time and affects both http and https requests. When this issue was first reported, our router operations team eventually closed it, indicating that disabling *accept* threads resolved it ([~yzlai] also reported similar observations and conclusions). It's possible that the race condition window is slightly reduced by disabling accept threads, but accept threads are very important to overall performance and throughput. I now notice that the issue still exists (regardless of whether accept threads are enabled or disabled) and am testing further to confirm the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
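The "lost event" theory in TS-3714 rests on how edge-triggered epoll behaves: readiness is reported once per edge, so if the handler returns without consuming the data, no further event arrives until new data does. The following Linux-only sketch demonstrates just that property with a pipe. It does not involve any ATS code, and `EdgeDemo` and `run_edge_demo` are hypothetical names.

```cpp
#include <cassert>
#include <sys/epoll.h>
#include <unistd.h>

// Result of two back-to-back, non-blocking epoll_wait calls.
struct EdgeDemo {
  int first;  // events reported right after the write
  int second; // events reported again while the byte is still unread
};

// One byte is written to a pipe whose read end is registered with EPOLLET,
// and the byte is deliberately never read, mimicking a handler that was
// woken up while its read was disabled.
inline EdgeDemo run_edge_demo() {
  EdgeDemo out = {-1, -1};
  int fds[2];
  if (pipe(fds) != 0) {
    return out;
  }
  int epfd = epoll_create1(0);
  if (epfd < 0) {
    close(fds[0]);
    close(fds[1]);
    return out;
  }

  epoll_event ev = {};
  ev.events  = EPOLLIN | EPOLLET; // edge-triggered read interest
  ev.data.fd = fds[0];
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) == 0 && write(fds[1], "x", 1) == 1) {
    epoll_event got;
    out.first  = epoll_wait(epfd, &got, 1, 0); // timeout 0: do not block
    out.second = epoll_wait(epfd, &got, 1, 0); // data is still pending here
  }

  close(epfd);
  close(fds[0]);
  close(fds[1]);
  return out;
}
```

On Linux the first wait should report one event and the second none, even though the byte is still sitting in the pipe. With level-triggered registration (no `EPOLLET`) the second wait would report the descriptor again, which is why a missed read is harmless there but can stall a connection under edge-triggered mode.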
[jira] [Updated] (TS-3266) core dump in UnixNetProcessor::connect_re_internal
[ https://issues.apache.org/jira/browse/TS-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Sorber updated TS-3266: Backport to Version: (was: 5.3.1) core dump in UnixNetProcessor::connect_re_internal -- Key: TS-3266 URL: https://issues.apache.org/jira/browse/TS-3266 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 5.2.0, 5.3.0 Reporter: Sudheer Vinukonda Assignee: Susan Hinrichs Labels: crash Attachments: ts-3266.diff

See a new core dump in v5.2.0 after running stable for over 48 hours. Below is the bt and some gdb info.

{code}
(gdb) bt
#0 0x00773056 in EThread::is_event_type (this=0x0, et=2) at UnixEThread.cc:121
#1 0x00750cfe in UnixNetProcessor::connect_re_internal (this=0x1032fc0, cont=0x2aad4590d080, target=0x2aad4590d728, opt=0x2aac8b367600) at UnixNetProcessor.cc:247
#2 0x0052b498 in NetProcessor::connect_re (this=0x1032fc0, cont=0x2aad4590d080, addr=0x2aad4590d728, opts=0x2aac8b367600) at ../iocore/net/P_UnixNetProcessor.h:85
#3 0x005e64e5 in HttpSM::do_http_server_open (this=0x2aad4590d080, raw=false) at HttpSM.cc:4796
#4 0x005edec7 in HttpSM::set_next_state (this=0x2aad4590d080) at HttpSM.cc:7141
#5 0x005ed2f2 in HttpSM::call_transact_and_set_next_state (this=0x2aad4590d080, f=0x607320 <HttpTransact::HandleResponse(HttpTransact::State*)>) at HttpSM.cc:6961
#6 0x005e7b72 in HttpSM::handle_server_setup_error (this=0x2aad4590d080, event=104, data=0x2aade42edae8) at HttpSM.cc:5308
#7 0x005dc57c in HttpSM::state_send_server_request_header (this=0x2aad4590d080, event=104, data=0x2aade42edae8) at HttpSM.cc:1989
#8 0x005de6a2 in HttpSM::main_handler (this=0x2aad4590d080, event=104, data=0x2aade42edae8) at HttpSM.cc:2570
#9 0x00502eae in Continuation::handleEvent (this=0x2aad4590d080, event=104, data=0x2aade42edae8) at ../iocore/eventsystem/I_Continuation.h:146
#10 0x007524c3 in read_signal_and_update (event=104, vc=0x2aade42ed9d0) at UnixNetVConnection.cc:138
#11 0x0075261e in read_signal_done (event=104, nh=0x2aac89a53ad0, vc=0x2aade42ed9d0) at UnixNetVConnection.cc:169
#12 0x00754cd4 in UnixNetVConnection::readSignalDone (this=0x2aade42ed9d0, event=104, nh=0x2aac89a53ad0) at UnixNetVConnection.cc:922
#13 0x0073e088 in SSLNetVConnection::net_read_io (this=0x2aade42ed9d0, nh=0x2aac89a53ad0, lthread=0x2aac89a50010) at SSLNetVConnection.cc:596
#14 0x0074c50d in NetHandler::mainNetEvent (this=0x2aac89a53ad0, event=5, e=0x282fb30) at UnixNet.cc:399
#15 0x00502eae in Continuation::handleEvent (this=0x2aac89a53ad0, event=5, data=0x282fb30) at ../iocore/eventsystem/I_Continuation.h:146
#16 0x00773172 in EThread::process_event (this=0x2aac89a50010, e=0x282fb30, calling_code=5) at UnixEThread.cc:144
#17 0x0077367c in EThread::execute (this=0x2aac89a50010) at UnixEThread.cc:268
#18 0x0077272d in spawn_thread_internal (a=0x2e1b740) at Thread.cc:88
#19 0x2aabd3d04851 in start_thread () from /lib64/libpthread.so.0
#20 0x003296ee890d in clone () from /lib64/libc.so.6
{code}

{code}
(gdb) frame 1
#1 0x00750cfe in UnixNetProcessor::connect_re_internal (this=0x1032fc0, cont=0x2aad4590d080, target=0x2aad4590d728, opt=0x2aac8b367600) at UnixNetProcessor.cc:247
247 UnixNetProcessor.cc: No such file or directory.
 in UnixNetProcessor.cc
(gdb) print mutex
$28 = (ProxyMutex *) 0x2aadf004d070
(gdb) print *mutex
$29 = {<RefCountObj> = {<ForceVFPTToTop> = {_vptr.ForceVFPTToTop = 0x77e890}, m_refcount = 16}, the_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, thread_holding = 0x0, nthread_holding = 0}
(gdb) print t
$30 = (EThread *) 0x0
(gdb) print cont
$31 = (Continuation *) 0x2aad4590d080
(gdb) print *cont
$32 = {<force_VFPT_to_top> = {_vptr.force_VFPT_to_top = 0x7aaef0}, handler = (int (Continuation::*)(Continuation *, int, void *)) 0x5de4ce <HttpSM::main_handler(int, void*)>, mutex = {m_ptr = 0x2aadf004d070}, link = {<SLink<Continuation>> = {next = 0x0}, prev = 0x0}}
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3642) proxy.config.http.share_server_sessions not working
[ https://issues.apache.org/jira/browse/TS-3642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606593#comment-14606593 ] ASF GitHub Bot commented on TS-3642: GitHub user SolidWallOfCode opened a pull request: https://github.com/apache/trafficserver/pull/236 TS-3642: Make server session sharing configuration work as documented. You can merge this pull request into a Git repository by running: $ git pull https://github.com/SolidWallOfCode/trafficserver ts-3642 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/236.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #236 commit e449570f5419daa90b080e5f275f63794bb41a73 Author: Alan M. Carroll a...@apache.org Date: 2015-05-29T21:09:52Z TS-3650: Track configuration variable source. commit 26cd2558aab8c94d4c28b88cf2fa203ddc5780a5 Author: Phil Sorber sor...@apache.org Date: 2015-06-17T01:33:44Z TS-3650: clang-format commit 3f4cc21d3fcd8623cf4180e5a17a8711822a5e22 Author: Alan M. Carroll a...@apache.org Date: 2015-06-29T22:48:27Z TS-3642: Make server session sharing config values worked as documented. proxy.config.http.share_server_sessions not working --- Key: TS-3642 URL: https://issues.apache.org/jira/browse/TS-3642 Project: Traffic Server Issue Type: Bug Components: Configuration Affects Versions: 5.3.0 Reporter: David Carlin Assignee: Phil Sorber Fix For: 5.3.1 Testing 5.3.0 and I noticed proxy.config.http.share_server_sessions = 1 no longer works. Saw a 10-15x increase in origin connections; there appears to be some reuse, I am seeing approximately 1.2-1.3 requests per origin connection. Setting proxy.config.http.server_session_sharing.pool = global restored expected behavior (Thanks [~sudheerv]!) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
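For reference, the workaround mentioned in the report would look roughly like this in records.config. This is a sketch of the one setting named above, assuming the usual `CONFIG <name> <type> <value>` line format; no other session-sharing settings are implied.

```
# Workaround from the report: share origin sessions across all threads.
CONFIG proxy.config.http.server_session_sharing.pool STRING global
```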
[jira] [Commented] (TS-3486) Segfault in do_io_write with plugin (??)
[ https://issues.apache.org/jira/browse/TS-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606454#comment-14606454 ] ASF GitHub Bot commented on TS-3486: Github user asfgit closed the pull request at: https://github.com/apache/trafficserver/pull/235 Segfault in do_io_write with plugin (??) Key: TS-3486 URL: https://issues.apache.org/jira/browse/TS-3486 Project: Traffic Server Issue Type: Bug Affects Versions: 5.2.0, 5.3.0 Reporter: Qiang Li Assignee: Susan Hinrichs Labels: crash Fix For: 6.0.0 Attachments: ts-3266-2.diff, ts-3266-complete.diff, ts3486-ptrace.txt.gz
{code}
(gdb) bt
#0 0x005bdb8b in HttpServerSession::do_io_write (this=<value optimized out>, c=0x2aaadccc4bf0, nbytes=576, buf=0x2aaafc2ffee8, owner=false) at HttpServerSession.cc:104
#1 0x005acc1d in HttpSM::setup_server_send_request (this=0x2aaadccc4bf0) at HttpSM.cc:5686
#2 0x005b3f85 in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1520
#3 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1455
#4 0x005b980b in HttpSM::state_api_callback (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1275
#5 0x004d7a1b in TSHttpTxnReenable (txnp=0x2aaadccc4bf0, event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5614
#6 0x2ba118441c89 in cachefun (contp=<value optimized out>, event=<value optimized out>, edata=0x2aaadccc4bf0) at main.cpp:1876
#7 0x005b4466 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=<value optimized out>, data=<value optimized out>) at HttpSM.cc:1381
#8 0x005b627d in HttpSM::do_http_server_open (this=0x2aaadccc4bf0, raw=<value optimized out>) at HttpSM.cc:4639
#9 0x005baa04 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7021
#10 0x005b25a3 in HttpSM::state_cache_open_write (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2442
#11 0x005b5b28 in HttpSM::main_handler (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2554
#12 0x0059338a in handleEvent (this=0x2aaadccc6618, event=<value optimized out>, data=0x2aab1c3b6800) at ../../iocore/eventsystem/I_Continuation.h:145
#13 HttpCacheSM::state_cache_open_write (this=0x2aaadccc6618, event=<value optimized out>, data=0x2aab1c3b6800) at HttpCacheSM.cc:167
#14 0x00697223 in handleEvent (this=0x2aab1c3b6800, event=<value optimized out>) at ../../iocore/eventsystem/I_Continuation.h:145
#15 CacheVC::callcont (this=0x2aab1c3b6800, event=<value optimized out>) at ../../iocore/cache/P_CacheInternal.h:662
#16 0x00715940 in Cache::open_write (this=<value optimized out>, cont=<value optimized out>, key=0x2ba0ff762d70, info=<value optimized out>, apin_in_cache=46914401429576, type=CACHE_FRAG_TYPE_HTTP, hostname=0x2aaadd281078 www.mifangba.comhttpapi.phpwww.mifangba.comhttp://www.mifangba.com/api.php?op=countid=4modelid=12;, host_len=16) at CacheWrite.cc:1788
#17 0x006e5765 in open_write (this=<value optimized out>, cont=0x2aaadccc6618, expected_size=<value optimized out>, url=0x2aaadccc5310, cluster_cache_local=<value optimized out>, request=<value optimized out>, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at P_CacheInternal.h:1093
#18 CacheProcessor::open_write (this=<value optimized out>, cont=0x2aaadccc6618, expected_size=<value optimized out>, url=0x2aaadccc5310, cluster_cache_local=<value optimized out>, request=<value optimized out>, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at Cache.cc:3622
#19 0x005936f0 in HttpCacheSM::open_write (this=0x2aaadccc6618, url=<value optimized out>, request=<value optimized out>, old_info=<value optimized out>, pin_in_cache=<value optimized out>, retry=<value optimized out>, allow_multiple=false) at HttpCacheSM.cc:298
#20 0x005a022e in HttpSM::do_cache_prepare_action (this=0x2aaadccc4bf0, c_sm=0x2aaadccc6618, object_read_info=0x0, retry=true, allow_multiple=false) at HttpSM.cc:4511
#21 0x005babd9 in do_cache_prepare_write (this=0x2aaadccc4bf0) at HttpSM.cc:4436
#22 HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7098
#23 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517
#24 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=0, data=0x0) at HttpSM.cc:1455
#25 0x005ba712 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6876
#26 0x005ba702 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6919
#27 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517
#28 0x005b45f8
[jira] [Commented] (TS-3486) Segfault in do_io_write with plugin (??)
[ https://issues.apache.org/jira/browse/TS-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606452#comment-14606452 ] ASF subversion and git services commented on TS-3486: - Commit a638d97019e1b3ef4e6ec53b93d81a96885cbb6e in trafficserver's branch refs/heads/master from shinrich [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=a638d97 ] TS-3486: Crashes due to race condition on server sessions moving between threads. This closes #235 Segfault in do_io_write with plugin (??) Key: TS-3486 URL: https://issues.apache.org/jira/browse/TS-3486 Project: Traffic Server Issue Type: Bug Affects Versions: 5.2.0, 5.3.0 Reporter: Qiang Li Assignee: Susan Hinrichs Labels: crash Fix For: 6.0.0 Attachments: ts-3266-2.diff, ts-3266-complete.diff, ts3486-ptrace.txt.gz
[jira] [Updated] (TS-3486) Segfault in do_io_write with plugin (??)
[ https://issues.apache.org/jira/browse/TS-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Susan Hinrichs updated TS-3486: --- Fix Version/s: (was: sometime) 6.0.0 Segfault in do_io_write with plugin (??) Key: TS-3486 URL: https://issues.apache.org/jira/browse/TS-3486 Project: Traffic Server Issue Type: Bug Affects Versions: 5.2.0, 5.3.0 Reporter: Qiang Li Assignee: Susan Hinrichs Labels: crash Fix For: 6.0.0 Attachments: ts-3266-2.diff, ts-3266-complete.diff, ts3486-ptrace.txt.gz {code} (gdb) bt #0 0x005bdb8b in HttpServerSession::do_io_write (this=value optimized out, c=0x2aaadccc4bf0, nbytes=576, buf=0x2aaafc2ffee8, owner=false) at HttpServerSession.cc:104 #1 0x005acc1d in HttpSM::setup_server_send_request (this=0x2aaadccc4bf0) at HttpSM.cc:5686 #2 0x005b3f85 in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1520 #3 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1455 #4 0x005b980b in HttpSM::state_api_callback (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1275 #5 0x004d7a1b in TSHttpTxnReenable (txnp=0x2aaadccc4bf0, event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5614 #6 0x2ba118441c89 in cachefun (contp=value optimized out, event=value optimized out, edata=0x2aaadccc4bf0) at main.cpp:1876 #7 0x005b4466 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=value optimized out, data=value optimized out) at HttpSM.cc:1381 #8 0x005b627d in HttpSM::do_http_server_open (this=0x2aaadccc4bf0, raw=value optimized out) at HttpSM.cc:4639 #9 0x005baa04 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7021 #10 0x005b25a3 in HttpSM::state_cache_open_write (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2442 #11 0x005b5b28 in HttpSM::main_handler (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2554 #12 0x0059338a in handleEvent (this=0x2aaadccc6618, event=value optimized out, data=0x2aab1c3b6800) at 
../../iocore/eventsystem/I_Continuation.h:145 #13 HttpCacheSM::state_cache_open_write (this=0x2aaadccc6618, event=value optimized out, data=0x2aab1c3b6800) at HttpCacheSM.cc:167 #14 0x00697223 in handleEvent (this=0x2aab1c3b6800, event=value optimized out) at ../../iocore/eventsystem/I_Continuation.h:145 #15 CacheVC::callcont (this=0x2aab1c3b6800, event=value optimized out) at ../../iocore/cache/P_CacheInternal.h:662 #16 0x00715940 in Cache::open_write (this=value optimized out, cont=value optimized out, key=0x2ba0ff762d70, info=value optimized out, apin_in_cache=46914401429576, type=CACHE_FRAG_TYPE_HTTP, hostname=0x2aaadd281078 www.mifangba.comhttpapi.phpwww.mifangba.comhttp://www.mifangba.com/api.php?op=countid=4modelid=12;, host_len=16) at CacheWrite.cc:1788 #17 0x006e5765 in open_write (this=value optimized out, cont=0x2aaadccc6618, expected_size=value optimized out, url=0x2aaadccc5310, cluster_cache_local=value optimized out, request=value optimized out, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at P_CacheInternal.h:1093 #18 CacheProcessor::open_write (this=value optimized out, cont=0x2aaadccc6618, expected_size=value optimized out, url=0x2aaadccc5310, cluster_cache_local=value optimized out, request=value optimized out, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at Cache.cc:3622 #19 0x005936f0 in HttpCacheSM::open_write (this=0x2aaadccc6618, url=value optimized out, request=value optimized out, old_info=value optimized out, pin_in_cache=value optimized out, retry=value optimized out, allow_multiple=false) at HttpCacheSM.cc:298 #20 0x005a022e in HttpSM::do_cache_prepare_action (this=0x2aaadccc4bf0, c_sm=0x2aaadccc6618, object_read_info=0x0, retry=true, allow_multiple=false) at HttpSM.cc:4511 #21 0x005babd9 in do_cache_prepare_write (this=0x2aaadccc4bf0) at HttpSM.cc:4436 #22 HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7098 #23 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517 #24 
0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=0, data=0x0) at HttpSM.cc:1455 #25 0x005ba712 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6876 #26 0x005ba702 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6919 #27 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517 #28 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1455
[jira] [Commented] (TS-3486) Segfault in do_io_write with plugin (??)
[ https://issues.apache.org/jira/browse/TS-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606536#comment-14606536 ] ASF subversion and git services commented on TS-3486: - Commit 1a160e13e4931fbaa522339c99abfad1ce9d638c in trafficserver's branch refs/heads/master from [~psudaemon] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=1a160e1 ] TS-3486: clang-format Segfault in do_io_write with plugin (??) Key: TS-3486 URL: https://issues.apache.org/jira/browse/TS-3486 Project: Traffic Server Issue Type: Bug Affects Versions: 5.2.0, 5.3.0 Reporter: Qiang Li Assignee: Susan Hinrichs Labels: crash Fix For: 6.1.0 Attachments: ts-3266-2.diff, ts-3266-complete.diff, ts3486-ptrace.txt.gz
[jira] [Commented] (TS-3714) TS seems to stall between ua_begin and ua_first_read on some transactions resulting in high response times.
[ https://issues.apache.org/jira/browse/TS-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606538#comment-14606538 ] Sudheer Vinukonda commented on TS-3714: --- Debug logs showing pending data being read with SSL_read() right after the SSL handshake completes. Without an additional SSL_read() at that point, this data would not be read until a subsequent i/o, which may not happen quickly enough.
{code}
[Jun 29 22:18:02.378] Server {0x2ab980504700} ERROR: ssl read right after handshake, read 1218, pending 0 bytes, for vc 0x2abb04a58c10
[Jun 29 22:18:03.442] Server {0x2ab97be71700} ERROR: ssl read right after handshake, read 1202, pending 0 bytes, for vc 0x2abaf46401a0
[Jun 29 22:18:09.150] Server {0x2ab980706700} ERROR: ssl read right after handshake, read 1699, pending 0 bytes, for vc 0x2abadc412460
[Jun 29 22:18:10.230] Server {0x2ab980a09700} ERROR: ssl read right after handshake, read 504, pending 0 bytes, for vc 0x2abadc4284a0
[Jun 29 22:18:13.923] Server {0x2ab97bf72700} ERROR: ssl read right after handshake, read 1411, pending 0 bytes, for vc 0x2aba7c0036b0
[Jun 29 22:18:15.739] Server {0x2ab980100700} ERROR: ssl read right after handshake, read 1370, pending 0 bytes, for vc 0x2abac43d3a60
[Jun 29 22:18:15.845] Server {0x2ab980a09700} ERROR: ssl read right after handshake, read 1841, pending 0 bytes, for vc 0x2abadc426910
[Jun 29 22:18:18.337] Server {0x2ab980a09700} ERROR: ssl read right after handshake, read 2838, pending 0 bytes, for vc 0x2abac43e72c0
[Jun 29 22:18:22.181] Server {0x2ab980a09700} ERROR: ssl read right after handshake, read 897, pending 0 bytes, for vc 0x2aba94011020
[Jun 29 22:18:23.458] Server {0x2ab97b669700} ERROR: ssl read right after handshake, read 1285, pending 0 bytes, for vc 0x2aba7c00d900
[Jun 29 22:18:25.109] Server {0x2ab980504700} ERROR: ssl read right after handshake, read 448, pending 0 bytes, for vc 0x2aba54002d80
{code}
TS seems to stall between ua_begin and ua_first_read on
some transactions resulting in high response times. --- Key: TS-3714 URL: https://issues.apache.org/jira/browse/TS-3714 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 5.3.0 Reporter: Sudheer Vinukonda Assignee: Sudheer Vinukonda An example slow log showing a very high *ua_first_read*:
{code}
ERROR: [8624075] Slow Request: client_ip: xx.xx.xx.xxx url: http://xx status: 200 unique id: bytes: 86 fd: 0 client state: 0 server state: 9 ua_begin: 0.000 ua_first_read: 42.224 ua_read_header_done: 42.224 cache_open_read_begin: 42.224 cache_open_read_end: 42.224 dns_lookup_begin: 42.224 dns_lookup_end: 42.224 server_connect: 42.224 server_first_read: 42.229 server_read_header_done: 42.229 server_close: 42.229 ua_begin_write: 42.229 ua_close: 42.229 sm_finish: 42.229
{code}
Initially, I suspected this was caused by browsers connecting early, before sending any bytes to TS. However, it happens too frequently, and with an unrealistically long delay between *ua_begin* and *ua_first_read*. I suspect the cause is the code that temporarily disables the read before calling the *TXN_START* hook and re-enables it after the API callout. The read is disabled via a 0-byte *do_io_read* on the client vc, but a valid *mbuf* is still passed. Based on what I am seeing, this may actually enable the *read_vio* all the way up to *ssl_read_from_net*, for instance (if there is a race and bytes have already arrived from the client, producing an epoll read event), which finally disables the read because *ntodo* (nbytes) is 0. However, the epoll event may then be lost until new i/o arrives from the client. I'm running further experiments to confirm the theory. In most cases, the read buffer already has some bytes by the time the http session and http sm are created, which makes it just work. But if there's a slight delay in receiving bytes after the connection is made, the epoll mechanism should read them; because of the way the temporary read disable is done, the event may be lost (ATS uses epoll in edge-triggered mode). Some history on this issue: it has been a problem for a long time and affects both http and https requests. When it was first reported, our router operations team eventually closed it, indicating that disabling *accept* threads resolved it ([~yzlai] also reported similar observations and conclusions). It's possible that disabling accept threads slightly narrows the race window, but, to the overall performance and throughput, accept
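The lost-event theory above rests on edge-triggered epoll semantics: readiness is reported once per edge, so data that is left unread produces no further event until more bytes arrive. A minimal Linux-only sketch of that behaviour follows. It is not Traffic Server code; a socketpair stands in for the client connection, and the names `EdgeDemoResult`, `ready_now` and `run_edge_demo` are made up for illustration.

```cpp
// Sketch only: edge-triggered epoll reports a readiness edge once.
// Linux-specific; none of these names come from Traffic Server.
#include <cassert>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

struct EdgeDemoResult {
  int first_wait;  // events reported right after the peer writes
  int second_wait; // events reported while that data is still unread
  int after_write; // events reported after the peer writes again
};

// Number of ready events reported right now (timeout 0, never blocks).
static int ready_now(int epfd)
{
  struct epoll_event ev;
  return epoll_wait(epfd, &ev, 1, 0);
}

EdgeDemoResult run_edge_demo()
{
  EdgeDemoResult r = {-1, -1, -1};
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
    return r;
  }
  int epfd = epoll_create1(0);
  if (epfd >= 0) {
    struct epoll_event ev = {};
    ev.events = EPOLLIN | EPOLLET; // edge-triggered read interest
    ev.data.fd = sv[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == 0) {
      ssize_t n = write(sv[1], "GET /", 5); // "client" sends its first bytes
      assert(n == 5);
      r.first_wait = ready_now(epfd);  // the edge is delivered here
      r.second_wait = ready_now(epfd); // nothing was read, yet no new edge
      n = write(sv[1], " HTTP", 5);    // only new i/o produces another edge
      assert(n == 5);
      (void)n;
      r.after_write = ready_now(epfd);
    }
    close(epfd);
  }
  close(sv[0]);
  close(sv[1]);
  return r;
}
```

On Linux, calling `run_edge_demo()` should yield 1, 0 and 1 for the three fields: the consumed edge with unread data corresponds to the stall described above, and the final 1 corresponds to the "new i/o from the client" that eventually unblocks the transaction.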
[jira] [Updated] (TS-3722) Remove the old tstop symlink from install and packages
[ https://issues.apache.org/jira/browse/TS-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leif Hedstrom updated TS-3722: -- Backport to Version: 6.0.0 Remove the old tstop symlink from install and packages Key: TS-3722 URL: https://issues.apache.org/jira/browse/TS-3722 Project: Traffic Server Issue Type: Improvement Components: Tools Reporter: Leif Hedstrom Assignee: Leif Hedstrom Fix For: 6.0.0 We renamed tstop to traffic_top a while ago, but kept around a symlink for it. We should eliminate this symlink as part of the 6.0.0 release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (TS-3266) core dump in UnixNetProcessor::connect_re_internal
[ https://issues.apache.org/jira/browse/TS-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Susan Hinrichs closed TS-3266. -- Resolution: Duplicate The fix is on TS-3486. These were two expressions of the same underlying issue. core dump in UnixNetProcessor::connect_re_internal -- Key: TS-3266 URL: https://issues.apache.org/jira/browse/TS-3266 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 5.2.0, 5.3.0 Reporter: Sudheer Vinukonda Assignee: Susan Hinrichs Labels: crash Attachments: ts-3266.diff -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (TS-3722) Remove the old tstop symlink from install and packages
[ https://issues.apache.org/jira/browse/TS-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leif Hedstrom reassigned TS-3722: - Assignee: Leif Hedstrom Remove the old tstop symlink from install and packages Key: TS-3722 URL: https://issues.apache.org/jira/browse/TS-3722 Project: Traffic Server Issue Type: Improvement Components: Tools Reporter: Leif Hedstrom Assignee: Leif Hedstrom Fix For: 6.0.0 We renamed tstop to traffic_top a while ago, but kept around a symlink for it. We should eliminate this symlink as part of the 6.0.0 release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3313) New World order for connection management and timeouts
[ https://issues.apache.org/jira/browse/TS-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606375#comment-14606375 ] Leif Hedstrom commented on TS-3313: --- Is this completed? New World order for connection management and timeouts -- Key: TS-3313 URL: https://issues.apache.org/jira/browse/TS-3313 Project: Traffic Server Issue Type: New Feature Components: Core Reporter: Leif Hedstrom Assignee: Bryan Call Labels: Umbrella Fix For: 6.0.0 This is an umbrella ticket for all issues related to connection management and timeouts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (TS-3107) Remove RHEL5 from supported OS
[ https://issues.apache.org/jira/browse/TS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leif Hedstrom resolved TS-3107. --- Resolution: Fixed Remove RHEL5 from supported OS -- Key: TS-3107 URL: https://issues.apache.org/jira/browse/TS-3107 Project: Traffic Server Issue Type: Task Components: Docs Reporter: Phil Sorber Assignee: Leif Hedstrom Priority: Blocker Labels: compatibility Fix For: 6.0.0 We want to drop support for RHEL5 (and clones) when we release 6.0. We need more modern supporting libs like Flex and Bison. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3486) Segfault in do_io_write with plugin (??)
[ https://issues.apache.org/jira/browse/TS-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Susan Hinrichs updated TS-3486: --- Fix Version/s: (was: 6.0.0) 6.0.1 Segfault in do_io_write with plugin (??) Key: TS-3486 URL: https://issues.apache.org/jira/browse/TS-3486 Project: Traffic Server Issue Type: Bug Affects Versions: 5.2.0, 5.3.0 Reporter: Qiang Li Assignee: Susan Hinrichs Labels: crash Fix For: 6.0.1 Attachments: ts-3266-2.diff, ts-3266-complete.diff, ts3486-ptrace.txt.gz {code} (gdb) bt #0 0x005bdb8b in HttpServerSession::do_io_write (this=value optimized out, c=0x2aaadccc4bf0, nbytes=576, buf=0x2aaafc2ffee8, owner=false) at HttpServerSession.cc:104 #1 0x005acc1d in HttpSM::setup_server_send_request (this=0x2aaadccc4bf0) at HttpSM.cc:5686 #2 0x005b3f85 in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1520 #3 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1455 #4 0x005b980b in HttpSM::state_api_callback (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1275 #5 0x004d7a1b in TSHttpTxnReenable (txnp=0x2aaadccc4bf0, event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5614 #6 0x2ba118441c89 in cachefun (contp=value optimized out, event=value optimized out, edata=0x2aaadccc4bf0) at main.cpp:1876 #7 0x005b4466 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=value optimized out, data=value optimized out) at HttpSM.cc:1381 #8 0x005b627d in HttpSM::do_http_server_open (this=0x2aaadccc4bf0, raw=value optimized out) at HttpSM.cc:4639 #9 0x005baa04 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7021 #10 0x005b25a3 in HttpSM::state_cache_open_write (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2442 #11 0x005b5b28 in HttpSM::main_handler (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2554 #12 0x0059338a in handleEvent (this=0x2aaadccc6618, event=value optimized out, data=0x2aab1c3b6800) at 
../../iocore/eventsystem/I_Continuation.h:145 #13 HttpCacheSM::state_cache_open_write (this=0x2aaadccc6618, event=value optimized out, data=0x2aab1c3b6800) at HttpCacheSM.cc:167 #14 0x00697223 in handleEvent (this=0x2aab1c3b6800, event=value optimized out) at ../../iocore/eventsystem/I_Continuation.h:145 #15 CacheVC::callcont (this=0x2aab1c3b6800, event=value optimized out) at ../../iocore/cache/P_CacheInternal.h:662 #16 0x00715940 in Cache::open_write (this=value optimized out, cont=value optimized out, key=0x2ba0ff762d70, info=value optimized out, apin_in_cache=46914401429576, type=CACHE_FRAG_TYPE_HTTP, hostname=0x2aaadd281078 www.mifangba.comhttpapi.phpwww.mifangba.comhttp://www.mifangba.com/api.php?op=countid=4modelid=12;, host_len=16) at CacheWrite.cc:1788 #17 0x006e5765 in open_write (this=value optimized out, cont=0x2aaadccc6618, expected_size=value optimized out, url=0x2aaadccc5310, cluster_cache_local=value optimized out, request=value optimized out, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at P_CacheInternal.h:1093 #18 CacheProcessor::open_write (this=value optimized out, cont=0x2aaadccc6618, expected_size=value optimized out, url=0x2aaadccc5310, cluster_cache_local=value optimized out, request=value optimized out, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at Cache.cc:3622 #19 0x005936f0 in HttpCacheSM::open_write (this=0x2aaadccc6618, url=value optimized out, request=value optimized out, old_info=value optimized out, pin_in_cache=value optimized out, retry=value optimized out, allow_multiple=false) at HttpCacheSM.cc:298 #20 0x005a022e in HttpSM::do_cache_prepare_action (this=0x2aaadccc4bf0, c_sm=0x2aaadccc6618, object_read_info=0x0, retry=true, allow_multiple=false) at HttpSM.cc:4511 #21 0x005babd9 in do_cache_prepare_write (this=0x2aaadccc4bf0) at HttpSM.cc:4436 #22 HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7098 #23 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517 #24 
0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=0, data=0x0) at HttpSM.cc:1455 #25 0x005ba712 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6876 #26 0x005ba702 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6919 #27 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517 #28 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1455 #29
[jira] [Updated] (TS-3486) Segfault in do_io_write with plugin (??)
[ https://issues.apache.org/jira/browse/TS-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Susan Hinrichs updated TS-3486: --- Backport to Version: 5.3.1, 6.0.0 (was: 5.3.1) Segfault in do_io_write with plugin (??) Key: TS-3486 URL: https://issues.apache.org/jira/browse/TS-3486 Project: Traffic Server Issue Type: Bug Affects Versions: 5.2.0, 5.3.0 Reporter: Qiang Li Assignee: Susan Hinrichs Labels: crash Fix For: 6.0.1 Attachments: ts-3266-2.diff, ts-3266-complete.diff, ts3486-ptrace.txt.gz {code} (gdb) bt #0 0x005bdb8b in HttpServerSession::do_io_write (this=value optimized out, c=0x2aaadccc4bf0, nbytes=576, buf=0x2aaafc2ffee8, owner=false) at HttpServerSession.cc:104 #1 0x005acc1d in HttpSM::setup_server_send_request (this=0x2aaadccc4bf0) at HttpSM.cc:5686 #2 0x005b3f85 in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1520 #3 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1455 #4 0x005b980b in HttpSM::state_api_callback (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1275 #5 0x004d7a1b in TSHttpTxnReenable (txnp=0x2aaadccc4bf0, event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5614 #6 0x2ba118441c89 in cachefun (contp=value optimized out, event=value optimized out, edata=0x2aaadccc4bf0) at main.cpp:1876 #7 0x005b4466 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=value optimized out, data=value optimized out) at HttpSM.cc:1381 #8 0x005b627d in HttpSM::do_http_server_open (this=0x2aaadccc4bf0, raw=value optimized out) at HttpSM.cc:4639 #9 0x005baa04 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7021 #10 0x005b25a3 in HttpSM::state_cache_open_write (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2442 #11 0x005b5b28 in HttpSM::main_handler (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2554 #12 0x0059338a in handleEvent (this=0x2aaadccc6618, event=value optimized out, data=0x2aab1c3b6800) at 
../../iocore/eventsystem/I_Continuation.h:145 #13 HttpCacheSM::state_cache_open_write (this=0x2aaadccc6618, event=value optimized out, data=0x2aab1c3b6800) at HttpCacheSM.cc:167 #14 0x00697223 in handleEvent (this=0x2aab1c3b6800, event=value optimized out) at ../../iocore/eventsystem/I_Continuation.h:145 #15 CacheVC::callcont (this=0x2aab1c3b6800, event=value optimized out) at ../../iocore/cache/P_CacheInternal.h:662 #16 0x00715940 in Cache::open_write (this=value optimized out, cont=value optimized out, key=0x2ba0ff762d70, info=value optimized out, apin_in_cache=46914401429576, type=CACHE_FRAG_TYPE_HTTP, hostname=0x2aaadd281078 www.mifangba.comhttpapi.phpwww.mifangba.comhttp://www.mifangba.com/api.php?op=countid=4modelid=12;, host_len=16) at CacheWrite.cc:1788 #17 0x006e5765 in open_write (this=value optimized out, cont=0x2aaadccc6618, expected_size=value optimized out, url=0x2aaadccc5310, cluster_cache_local=value optimized out, request=value optimized out, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at P_CacheInternal.h:1093 #18 CacheProcessor::open_write (this=value optimized out, cont=0x2aaadccc6618, expected_size=value optimized out, url=0x2aaadccc5310, cluster_cache_local=value optimized out, request=value optimized out, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at Cache.cc:3622 #19 0x005936f0 in HttpCacheSM::open_write (this=0x2aaadccc6618, url=value optimized out, request=value optimized out, old_info=value optimized out, pin_in_cache=value optimized out, retry=value optimized out, allow_multiple=false) at HttpCacheSM.cc:298 #20 0x005a022e in HttpSM::do_cache_prepare_action (this=0x2aaadccc4bf0, c_sm=0x2aaadccc6618, object_read_info=0x0, retry=true, allow_multiple=false) at HttpSM.cc:4511 #21 0x005babd9 in do_cache_prepare_write (this=0x2aaadccc4bf0) at HttpSM.cc:4436 #22 HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7098 #23 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517 #24 
0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=0, data=0x0) at HttpSM.cc:1455 #25 0x005ba712 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6876 #26 0x005ba702 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6919 #27 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517 #28 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1455 #29
[jira] [Closed] (TS-3266) core dump in UnixNetProcessor::connect_re_internal
[ https://issues.apache.org/jira/browse/TS-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Susan Hinrichs closed TS-3266. -- Resolution: Fixed See notes on TS-3486 core dump in UnixNetProcessor::connect_re_internal -- Key: TS-3266 URL: https://issues.apache.org/jira/browse/TS-3266 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 5.2.0, 5.3.0 Reporter: Sudheer Vinukonda Assignee: Susan Hinrichs Labels: crash Attachments: ts-3266.diff See a new core dump in v5.2.0 after running stable for over 48 hours. Below is the bt and some gdb info. {code} (gdb) bt #0 0x00773056 in EThread::is_event_type (this=0x0, et=2) at UnixEThread.cc:121 #1 0x00750cfe in UnixNetProcessor::connect_re_internal (this=0x1032fc0, cont=0x2aad4590d080, target=0x2aad4590d728, opt=0x2aac8b367600) at UnixNetProcessor.cc:247 #2 0x0052b498 in NetProcessor::connect_re (this=0x1032fc0, cont=0x2aad4590d080, addr=0x2aad4590d728, opts=0x2aac8b367600) at ../iocore/net/P_UnixNetProcessor.h:85 #3 0x005e64e5 in HttpSM::do_http_server_open (this=0x2aad4590d080, raw=false) at HttpSM.cc:4796 #4 0x005edec7 in HttpSM::set_next_state (this=0x2aad4590d080) at HttpSM.cc:7141 #5 0x005ed2f2 in HttpSM::call_transact_and_set_next_state (this=0x2aad4590d080, f=0x607320 HttpTransact::HandleResponse(HttpTransact::State*)) at HttpSM.cc:6961 #6 0x005e7b72 in HttpSM::handle_server_setup_error (this=0x2aad4590d080, event=104, data=0x2aade42edae8) at HttpSM.cc:5308 #7 0x005dc57c in HttpSM::state_send_server_request_header (this=0x2aad4590d080, event=104, data=0x2aade42edae8) at HttpSM.cc:1989 #8 0x005de6a2 in HttpSM::main_handler (this=0x2aad4590d080, event=104, data=0x2aade42edae8) at HttpSM.cc:2570 #9 0x00502eae in Continuation::handleEvent (this=0x2aad4590d080, event=104, data=0x2aade42edae8) at ../iocore/eventsystem/I_Continuation.h:146 #10 0x007524c3 in read_signal_and_update (event=104, vc=0x2aade42ed9d0) at UnixNetVConnection.cc:138 #11 0x0075261e in read_signal_done 
(event=104, nh=0x2aac89a53ad0, vc=0x2aade42ed9d0) at UnixNetVConnection.cc:169 #12 0x00754cd4 in UnixNetVConnection::readSignalDone (this=0x2aade42ed9d0, event=104, nh=0x2aac89a53ad0) at UnixNetVConnection.cc:922 #13 0x0073e088 in SSLNetVConnection::net_read_io (this=0x2aade42ed9d0, nh=0x2aac89a53ad0, lthread=0x2aac89a50010) at SSLNetVConnection.cc:596 #14 0x0074c50d in NetHandler::mainNetEvent (this=0x2aac89a53ad0, event=5, e=0x282fb30) at UnixNet.cc:399 #15 0x00502eae in Continuation::handleEvent (this=0x2aac89a53ad0, event=5, data=0x282fb30) at ../iocore/eventsystem/I_Continuation.h:146 #16 0x00773172 in EThread::process_event (this=0x2aac89a50010, e=0x282fb30, calling_code=5) at UnixEThread.cc:144 #17 0x0077367c in EThread::execute (this=0x2aac89a50010) at UnixEThread.cc:268 #18 0x0077272d in spawn_thread_internal (a=0x2e1b740) at Thread.cc:88 #19 0x2aabd3d04851 in start_thread () from /lib64/libpthread.so.0 #20 0x003296ee890d in clone () from /lib64/libc.so.6 {code} {code} (gdb) frame 1 #1 0x00750cfe in UnixNetProcessor::connect_re_internal (this=0x1032fc0, cont=0x2aad4590d080, target=0x2aad4590d728, opt=0x2aac8b367600) at UnixNetProcessor.cc:247 247 UnixNetProcessor.cc: No such file or directory. 
in UnixNetProcessor.cc (gdb) print mutex $28 = (ProxyMutex *) 0x2aadf004d070 (gdb) print *mutex $29 = {RefCountObj = {ForceVFPTToTop = {_vptr.ForceVFPTToTop = 0x77e890}, m_refcount = 16}, the_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' repeats 39 times, __align = 0}, thread_holding = 0x0, nthread_holding = 0} (gdb) print t $30 = (EThread *) 0x0 (gdb) print cont $31 = (Continuation *) 0x2aad4590d080 (gdb) print *cont $32 = {force_VFPT_to_top = {_vptr.force_VFPT_to_top = 0x7aaef0}, handler = (int (Continuation::*)(Continuation *, int, void *)) 0x5de4ce HttpSM::main_handler(int, void*), mutex = { m_ptr = 0x2aadf004d070}, link = {SLinkContinuation = {next = 0x0}, prev = 0x0}} {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
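The trace above shows EThread::is_event_type entered with this=0x0, and frame 1 prints t as a null EThread pointer. As a minimal sketch of that failure mode only (not the contents of ts-3266.diff, which is not reproduced here), the following assumes a hypothetical stand-in type FakeThread and a hypothetical helper can_run_inline; neither name comes from the Traffic Server source.

```cpp
#include <cassert>

// Hypothetical stand-in for an event thread; NOT the real EThread class.
struct FakeThread {
  unsigned event_types; // bit mask of the event types this thread serves

  bool is_event_type(int et) const { return (event_types & (1u << et)) != 0; }
};

// Hypothetical helper: returns true only when a holding thread exists and
// serves event type `et`. A null `t` (as printed in frame 1 of the trace)
// is reported as "cannot run inline" instead of being dereferenced.
bool can_run_inline(const FakeThread *t, int et) {
  return t != nullptr && t->is_event_type(et);
}
```

With a guard of this shape, a caller that finds no holding thread would take whatever fallback path it has (for example, scheduling the work on another thread) rather than calling a member function through a null pointer.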
[jira] [Resolved] (TS-3486) Segfault in do_io_write with plugin (??)
[ https://issues.apache.org/jira/browse/TS-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Susan Hinrichs resolved TS-3486. Resolution: Fixed This fix has been running in our production on 4 boxes over the weekend without incident. Previous 5.3 builds on those boxes would core at least every couple hours. Segfault in do_io_write with plugin (??) Key: TS-3486 URL: https://issues.apache.org/jira/browse/TS-3486 Project: Traffic Server Issue Type: Bug Affects Versions: 5.2.0, 5.3.0 Reporter: Qiang Li Assignee: Susan Hinrichs Labels: crash Fix For: 6.0.1 Attachments: ts-3266-2.diff, ts-3266-complete.diff, ts3486-ptrace.txt.gz {code} (gdb) bt #0 0x005bdb8b in HttpServerSession::do_io_write (this=value optimized out, c=0x2aaadccc4bf0, nbytes=576, buf=0x2aaafc2ffee8, owner=false) at HttpServerSession.cc:104 #1 0x005acc1d in HttpSM::setup_server_send_request (this=0x2aaadccc4bf0) at HttpSM.cc:5686 #2 0x005b3f85 in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1520 #3 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1455 #4 0x005b980b in HttpSM::state_api_callback (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1275 #5 0x004d7a1b in TSHttpTxnReenable (txnp=0x2aaadccc4bf0, event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5614 #6 0x2ba118441c89 in cachefun (contp=value optimized out, event=value optimized out, edata=0x2aaadccc4bf0) at main.cpp:1876 #7 0x005b4466 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=value optimized out, data=value optimized out) at HttpSM.cc:1381 #8 0x005b627d in HttpSM::do_http_server_open (this=0x2aaadccc4bf0, raw=value optimized out) at HttpSM.cc:4639 #9 0x005baa04 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7021 #10 0x005b25a3 in HttpSM::state_cache_open_write (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2442 #11 0x005b5b28 in HttpSM::main_handler (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at 
HttpSM.cc:2554 #12 0x0059338a in handleEvent (this=0x2aaadccc6618, event=value optimized out, data=0x2aab1c3b6800) at ../../iocore/eventsystem/I_Continuation.h:145 #13 HttpCacheSM::state_cache_open_write (this=0x2aaadccc6618, event=value optimized out, data=0x2aab1c3b6800) at HttpCacheSM.cc:167 #14 0x00697223 in handleEvent (this=0x2aab1c3b6800, event=value optimized out) at ../../iocore/eventsystem/I_Continuation.h:145 #15 CacheVC::callcont (this=0x2aab1c3b6800, event=value optimized out) at ../../iocore/cache/P_CacheInternal.h:662 #16 0x00715940 in Cache::open_write (this=value optimized out, cont=value optimized out, key=0x2ba0ff762d70, info=value optimized out, apin_in_cache=46914401429576, type=CACHE_FRAG_TYPE_HTTP, hostname=0x2aaadd281078 www.mifangba.comhttpapi.phpwww.mifangba.comhttp://www.mifangba.com/api.php?op=countid=4modelid=12;, host_len=16) at CacheWrite.cc:1788 #17 0x006e5765 in open_write (this=value optimized out, cont=0x2aaadccc6618, expected_size=value optimized out, url=0x2aaadccc5310, cluster_cache_local=value optimized out, request=value optimized out, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at P_CacheInternal.h:1093 #18 CacheProcessor::open_write (this=value optimized out, cont=0x2aaadccc6618, expected_size=value optimized out, url=0x2aaadccc5310, cluster_cache_local=value optimized out, request=value optimized out, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at Cache.cc:3622 #19 0x005936f0 in HttpCacheSM::open_write (this=0x2aaadccc6618, url=value optimized out, request=value optimized out, old_info=value optimized out, pin_in_cache=value optimized out, retry=value optimized out, allow_multiple=false) at HttpCacheSM.cc:298 #20 0x005a022e in HttpSM::do_cache_prepare_action (this=0x2aaadccc4bf0, c_sm=0x2aaadccc6618, object_read_info=0x0, retry=true, allow_multiple=false) at HttpSM.cc:4511 #21 0x005babd9 in do_cache_prepare_write (this=0x2aaadccc4bf0) at HttpSM.cc:4436 #22 HttpSM::set_next_state 
(this=0x2aaadccc4bf0) at HttpSM.cc:7098 #23 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517 #24 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=0, data=0x0) at HttpSM.cc:1455 #25 0x005ba712 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6876 #26 0x005ba702 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6919 #27 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at
[jira] [Reopened] (TS-3266) core dump in UnixNetProcessor::connect_re_internal
[ https://issues.apache.org/jira/browse/TS-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Susan Hinrichs reopened TS-3266: core dump in UnixNetProcessor::connect_re_internal -- Key: TS-3266 URL: https://issues.apache.org/jira/browse/TS-3266 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 5.2.0, 5.3.0 Reporter: Sudheer Vinukonda Assignee: Susan Hinrichs Labels: crash Attachments: ts-3266.diff See a new core dump in v5.2.0 after running stable for over 48 hours. Below is the bt and some gdb info. {code} (gdb) bt #0 0x00773056 in EThread::is_event_type (this=0x0, et=2) at UnixEThread.cc:121 #1 0x00750cfe in UnixNetProcessor::connect_re_internal (this=0x1032fc0, cont=0x2aad4590d080, target=0x2aad4590d728, opt=0x2aac8b367600) at UnixNetProcessor.cc:247 #2 0x0052b498 in NetProcessor::connect_re (this=0x1032fc0, cont=0x2aad4590d080, addr=0x2aad4590d728, opts=0x2aac8b367600) at ../iocore/net/P_UnixNetProcessor.h:85 #3 0x005e64e5 in HttpSM::do_http_server_open (this=0x2aad4590d080, raw=false) at HttpSM.cc:4796 #4 0x005edec7 in HttpSM::set_next_state (this=0x2aad4590d080) at HttpSM.cc:7141 #5 0x005ed2f2 in HttpSM::call_transact_and_set_next_state (this=0x2aad4590d080, f=0x607320 HttpTransact::HandleResponse(HttpTransact::State*)) at HttpSM.cc:6961 #6 0x005e7b72 in HttpSM::handle_server_setup_error (this=0x2aad4590d080, event=104, data=0x2aade42edae8) at HttpSM.cc:5308 #7 0x005dc57c in HttpSM::state_send_server_request_header (this=0x2aad4590d080, event=104, data=0x2aade42edae8) at HttpSM.cc:1989 #8 0x005de6a2 in HttpSM::main_handler (this=0x2aad4590d080, event=104, data=0x2aade42edae8) at HttpSM.cc:2570 #9 0x00502eae in Continuation::handleEvent (this=0x2aad4590d080, event=104, data=0x2aade42edae8) at ../iocore/eventsystem/I_Continuation.h:146 #10 0x007524c3 in read_signal_and_update (event=104, vc=0x2aade42ed9d0) at UnixNetVConnection.cc:138 #11 0x0075261e in read_signal_done (event=104, nh=0x2aac89a53ad0, 
vc=0x2aade42ed9d0) at UnixNetVConnection.cc:169 #12 0x00754cd4 in UnixNetVConnection::readSignalDone (this=0x2aade42ed9d0, event=104, nh=0x2aac89a53ad0) at UnixNetVConnection.cc:922 #13 0x0073e088 in SSLNetVConnection::net_read_io (this=0x2aade42ed9d0, nh=0x2aac89a53ad0, lthread=0x2aac89a50010) at SSLNetVConnection.cc:596 #14 0x0074c50d in NetHandler::mainNetEvent (this=0x2aac89a53ad0, event=5, e=0x282fb30) at UnixNet.cc:399 #15 0x00502eae in Continuation::handleEvent (this=0x2aac89a53ad0, event=5, data=0x282fb30) at ../iocore/eventsystem/I_Continuation.h:146 #16 0x00773172 in EThread::process_event (this=0x2aac89a50010, e=0x282fb30, calling_code=5) at UnixEThread.cc:144 #17 0x0077367c in EThread::execute (this=0x2aac89a50010) at UnixEThread.cc:268 #18 0x0077272d in spawn_thread_internal (a=0x2e1b740) at Thread.cc:88 #19 0x2aabd3d04851 in start_thread () from /lib64/libpthread.so.0 #20 0x003296ee890d in clone () from /lib64/libc.so.6 {code} {code} (gdb) frame 1 #1 0x00750cfe in UnixNetProcessor::connect_re_internal (this=0x1032fc0, cont=0x2aad4590d080, target=0x2aad4590d728, opt=0x2aac8b367600) at UnixNetProcessor.cc:247 247 UnixNetProcessor.cc: No such file or directory. 
in UnixNetProcessor.cc (gdb) print mutex $28 = (ProxyMutex *) 0x2aadf004d070 (gdb) print *mutex $29 = {RefCountObj = {ForceVFPTToTop = {_vptr.ForceVFPTToTop = 0x77e890}, m_refcount = 16}, the_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' repeats 39 times, __align = 0}, thread_holding = 0x0, nthread_holding = 0} (gdb) print t $30 = (EThread *) 0x0 (gdb) print cont $31 = (Continuation *) 0x2aad4590d080 (gdb) print *cont $32 = {force_VFPT_to_top = {_vptr.force_VFPT_to_top = 0x7aaef0}, handler = (int (Continuation::*)(Continuation *, int, void *)) 0x5de4ce HttpSM::main_handler(int, void*), mutex = { m_ptr = 0x2aadf004d070}, link = {SLinkContinuation = {next = 0x0}, prev = 0x0}} {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3486) Segfault in do_io_write with plugin (??)
[ https://issues.apache.org/jira/browse/TS-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Susan Hinrichs updated TS-3486: --- Fix Version/s: (was: 6.0.1) 6.1.0 Segfault in do_io_write with plugin (??) Key: TS-3486 URL: https://issues.apache.org/jira/browse/TS-3486 Project: Traffic Server Issue Type: Bug Affects Versions: 5.2.0, 5.3.0 Reporter: Qiang Li Assignee: Susan Hinrichs Labels: crash Fix For: 6.1.0 Attachments: ts-3266-2.diff, ts-3266-complete.diff, ts3486-ptrace.txt.gz {code} (gdb) bt #0 0x005bdb8b in HttpServerSession::do_io_write (this=value optimized out, c=0x2aaadccc4bf0, nbytes=576, buf=0x2aaafc2ffee8, owner=false) at HttpServerSession.cc:104 #1 0x005acc1d in HttpSM::setup_server_send_request (this=0x2aaadccc4bf0) at HttpSM.cc:5686 #2 0x005b3f85 in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1520 #3 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1455 #4 0x005b980b in HttpSM::state_api_callback (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1275 #5 0x004d7a1b in TSHttpTxnReenable (txnp=0x2aaadccc4bf0, event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5614 #6 0x2ba118441c89 in cachefun (contp=value optimized out, event=value optimized out, edata=0x2aaadccc4bf0) at main.cpp:1876 #7 0x005b4466 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=value optimized out, data=value optimized out) at HttpSM.cc:1381 #8 0x005b627d in HttpSM::do_http_server_open (this=0x2aaadccc4bf0, raw=value optimized out) at HttpSM.cc:4639 #9 0x005baa04 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7021 #10 0x005b25a3 in HttpSM::state_cache_open_write (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2442 #11 0x005b5b28 in HttpSM::main_handler (this=0x2aaadccc4bf0, event=1108, data=0x2aab1c3b6800) at HttpSM.cc:2554 #12 0x0059338a in handleEvent (this=0x2aaadccc6618, event=value optimized out, data=0x2aab1c3b6800) at 
../../iocore/eventsystem/I_Continuation.h:145 #13 HttpCacheSM::state_cache_open_write (this=0x2aaadccc6618, event=value optimized out, data=0x2aab1c3b6800) at HttpCacheSM.cc:167 #14 0x00697223 in handleEvent (this=0x2aab1c3b6800, event=value optimized out) at ../../iocore/eventsystem/I_Continuation.h:145 #15 CacheVC::callcont (this=0x2aab1c3b6800, event=value optimized out) at ../../iocore/cache/P_CacheInternal.h:662 #16 0x00715940 in Cache::open_write (this=value optimized out, cont=value optimized out, key=0x2ba0ff762d70, info=value optimized out, apin_in_cache=46914401429576, type=CACHE_FRAG_TYPE_HTTP, hostname=0x2aaadd281078 www.mifangba.comhttpapi.phpwww.mifangba.comhttp://www.mifangba.com/api.php?op=countid=4modelid=12;, host_len=16) at CacheWrite.cc:1788 #17 0x006e5765 in open_write (this=value optimized out, cont=0x2aaadccc6618, expected_size=value optimized out, url=0x2aaadccc5310, cluster_cache_local=value optimized out, request=value optimized out, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at P_CacheInternal.h:1093 #18 CacheProcessor::open_write (this=value optimized out, cont=0x2aaadccc6618, expected_size=value optimized out, url=0x2aaadccc5310, cluster_cache_local=value optimized out, request=value optimized out, old_info=0x0, pin_in_cache=0, type=CACHE_FRAG_TYPE_HTTP) at Cache.cc:3622 #19 0x005936f0 in HttpCacheSM::open_write (this=0x2aaadccc6618, url=value optimized out, request=value optimized out, old_info=value optimized out, pin_in_cache=value optimized out, retry=value optimized out, allow_multiple=false) at HttpCacheSM.cc:298 #20 0x005a022e in HttpSM::do_cache_prepare_action (this=0x2aaadccc4bf0, c_sm=0x2aaadccc6618, object_read_info=0x0, retry=true, allow_multiple=false) at HttpSM.cc:4511 #21 0x005babd9 in do_cache_prepare_write (this=0x2aaadccc4bf0) at HttpSM.cc:4436 #22 HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:7098 #23 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517 #24 
0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=0, data=0x0) at HttpSM.cc:1455 #25 0x005ba712 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6876 #26 0x005ba702 in HttpSM::set_next_state (this=0x2aaadccc4bf0) at HttpSM.cc:6919 #27 0x005b3f5f in HttpSM::handle_api_return (this=0x2aaadccc4bf0) at HttpSM.cc:1517 #28 0x005b45f8 in HttpSM::state_api_callout (this=0x2aaadccc4bf0, event=6, data=0x0) at HttpSM.cc:1455 #29
[jira] [Comment Edited] (TS-3164) Why the load of trafficserver occurs an abrupt rise on a occasion ?
[ https://issues.apache.org/jira/browse/TS-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605204#comment-14605204 ] hahayaoniming edited comment on TS-3164 at 6/29/15 6:46 AM: I find some phenomena relating to this problem: was (Author: hahayaoniming): I find some phenomena relating to this problem: Why the load of trafficserver occurs an abrupt rise on a occasion ? --- Key: TS-3164 URL: https://issues.apache.org/jira/browse/TS-3164 Project: Traffic Server Issue Type: Bug Components: Core Environment: CentOS 6.3 64bit, 8 cores, 128G mem Reporter: taoyunxing Fix For: sometime I use Tsar to monitor the traffic status of ATS 4.2.0, and came across the following problem:
{code}
Time            ---cpu--  ---mem--  ---tcp--  -----traffic------  --sda---  --sdb---  --sdc---  ---load-
Time            util      util      retran    bytin     bytout    util      util      util      load1
03/11/14-18:20  40.67     87.19     3.36      24.5M     43.9M     13.02     94.68     0.00      5.34
03/11/14-18:25  40.30     87.20     3.27      22.5M     42.6M     12.38     94.87     0.00      5.79
03/11/14-18:30  40.84     84.67     3.44      21.4M     42.0M     13.29     95.37     0.00      6.28
03/11/14-18:35  43.63     87.36     3.21      23.8M     45.0M     13.23     93.99     0.00      7.37
03/11/14-18:40  42.25     87.37     3.09      24.2M     44.8M     12.84     95.77     0.00      7.25
03/11/14-18:45  42.96     87.44     3.46      23.3M     46.0M     12.96     95.84     0.00      7.10
03/11/14-18:50  44.00     87.42     3.49      22.3M     43.0M     14.17     94.99     0.00      6.57
03/11/14-18:55  42.20     87.44     3.46      22.3M     43.6M     13.19     96.05     0.00      6.09
03/11/14-19:00  44.90     87.53     3.60      23.6M     46.5M     13.61     96.67     0.00      8.06
03/11/14-19:05  46.26     87.73     3.24      25.8M     49.1M     15.39     94.05     0.00      9.98
03/11/14-19:10  43.85     87.69     3.19      25.4M     50.9M     12.88     97.80     0.00      7.99
03/11/14-19:15  45.28     87.69     3.36      25.6M     49.6M     13.10     96.86     0.00      7.47
03/11/14-19:20  44.11     85.20     3.29      24.1M     47.8M     14.24     96.75     0.00      5.82
03/11/14-19:25  45.26     87.78     3.52      24.4M     47.7M     13.21     95.44     0.00      7.61
03/11/14-19:30  44.83     87.80     3.64      25.7M     50.8M     13.27     98.02     0.00      6.85
03/11/14-19:35  44.89     87.78     3.61      23.9M     49.0M     13.34     97.42     0.00      7.04
03/11/14-19:40  69.21     88.88     0.55      18.3M     33.7M     11.39     71.23     0.00      65.80
03/11/14-19:45  72.47     88.66     0.27      15.4M     31.6M     11.51     72.31     0.00      11.56
03/11/14-19:50  44.87     88.72     4.11      22.7M     46.3M     12.99     97.33     0.00      8.29
{code}
In addition, the top command shows:
{code}
hi:0 ni:0 si:45.56 st:0 sy:13.92 us:12.58 wa:14.3 id:15.96
{code}
Can anyone help me? Thanks in advance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
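The load1 column in the Tsar output above jumps from about 7 to 65.80 at 19:40. As a rough sketch of how such a jump could be flagged automatically, the following compares each sample with the median of all earlier samples. The function name first_spike and the threshold factor are assumptions chosen for illustration; they are not part of Tsar or Traffic Server.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of the first sample that exceeds
// `factor` times the (upper) median of all earlier samples, or -1 if no
// sample does. The threshold rule is an arbitrary illustrative choice.
int first_spike(const std::vector<double> &load, double factor) {
  for (std::size_t i = 1; i < load.size(); ++i) {
    // Copy the earlier samples so nth_element can reorder them freely.
    std::vector<double> prev(load.begin(), load.begin() + i);
    std::nth_element(prev.begin(), prev.begin() + prev.size() / 2, prev.end());
    const double median = prev[prev.size() / 2];
    if (load[i] > factor * median) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

Fed the load1 values from the table in order, with a factor of 3, this would point at the 19:40 row, which is also where si in the top output suggests the time is going.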
[jira] [Commented] (TS-3164) Why the load of trafficserver occurs an abrupt rise on a occasion ?
[ https://issues.apache.org/jira/browse/TS-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605204#comment-14605204 ] hahayaoniming commented on TS-3164: --- I find some phenomena relating to this problem: Why the load of trafficserver occurs an abrupt rise on a occasion ? --- Key: TS-3164 URL: https://issues.apache.org/jira/browse/TS-3164 Project: Traffic Server Issue Type: Bug Components: Core Environment: CentOS 6.3 64bit, 8 cores, 128G mem Reporter: taoyunxing Fix For: sometime I use Tsar to monitor the traffic status of ATS 4.2.0, and came across the following problem:
{code}
Time            ---cpu--  ---mem--  ---tcp--  -----traffic------  --sda---  --sdb---  --sdc---  ---load-
Time            util      util      retran    bytin     bytout    util      util      util      load1
03/11/14-18:20  40.67     87.19     3.36      24.5M     43.9M     13.02     94.68     0.00      5.34
03/11/14-18:25  40.30     87.20     3.27      22.5M     42.6M     12.38     94.87     0.00      5.79
03/11/14-18:30  40.84     84.67     3.44      21.4M     42.0M     13.29     95.37     0.00      6.28
03/11/14-18:35  43.63     87.36     3.21      23.8M     45.0M     13.23     93.99     0.00      7.37
03/11/14-18:40  42.25     87.37     3.09      24.2M     44.8M     12.84     95.77     0.00      7.25
03/11/14-18:45  42.96     87.44     3.46      23.3M     46.0M     12.96     95.84     0.00      7.10
03/11/14-18:50  44.00     87.42     3.49      22.3M     43.0M     14.17     94.99     0.00      6.57
03/11/14-18:55  42.20     87.44     3.46      22.3M     43.6M     13.19     96.05     0.00      6.09
03/11/14-19:00  44.90     87.53     3.60      23.6M     46.5M     13.61     96.67     0.00      8.06
03/11/14-19:05  46.26     87.73     3.24      25.8M     49.1M     15.39     94.05     0.00      9.98
03/11/14-19:10  43.85     87.69     3.19      25.4M     50.9M     12.88     97.80     0.00      7.99
03/11/14-19:15  45.28     87.69     3.36      25.6M     49.6M     13.10     96.86     0.00      7.47
03/11/14-19:20  44.11     85.20     3.29      24.1M     47.8M     14.24     96.75     0.00      5.82
03/11/14-19:25  45.26     87.78     3.52      24.4M     47.7M     13.21     95.44     0.00      7.61
03/11/14-19:30  44.83     87.80     3.64      25.7M     50.8M     13.27     98.02     0.00      6.85
03/11/14-19:35  44.89     87.78     3.61      23.9M     49.0M     13.34     97.42     0.00      7.04
03/11/14-19:40  69.21     88.88     0.55      18.3M     33.7M     11.39     71.23     0.00      65.80
03/11/14-19:45  72.47     88.66     0.27      15.4M     31.6M     11.51     72.31     0.00      11.56
03/11/14-19:50  44.87     88.72     4.11      22.7M     46.3M     12.99     97.33     0.00      8.29
{code}
In addition, the top command shows:
{code}
hi:0 ni:0 si:45.56 st:0 sy:13.92 us:12.58 wa:14.3 id:15.96
{code}
Can anyone help me? Thanks in advance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3723) Failed remap shouldn't log as ERR_CONNECT_FAIL
[ https://issues.apache.org/jira/browse/TS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605223#comment-14605223 ] ASF subversion and git services commented on TS-3723: - Commit 8dbe5601d668e183d69a63f041fe10dacd4e3978 in trafficserver's branch refs/heads/master from [~briang] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=8dbe560 ] TS-3723: Failed remap should log as ERR_INVALID_URL Failed remap shouldn't log as ERR_CONNECT_FAIL -- Key: TS-3723 URL: https://issues.apache.org/jira/browse/TS-3723 Project: Traffic Server Issue Type: Bug Components: Core, Logging Reporter: Brian Geffon Fix For: 6.1.0 It appears that a failed remap (when remap is required) is logged as ERR_CONNECT_FAIL. This is confusing, as no connection attempt was actually made. ERR_INVALID_URL makes more sense: remap was required, so technically this is an invalid URL. Nothing else will change; the response code will still be 404 with the mapping-failed body factory page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
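The rule described in the ticket can be sketched as a small decision function. This is an illustration under stated assumptions, not the code from commit 8dbe560: the struct TxnOutcome, the function log_code, and the "OK" placeholder are hypothetical names; only ERR_INVALID_URL and ERR_CONNECT_FAIL come from the ticket text.

```cpp
#include <string>

// Hypothetical summary of how a transaction ended; NOT an ATS structure.
struct TxnOutcome {
  bool remap_required;    // a remap rule must match for the request to proceed
  bool remap_succeeded;   // a remap rule actually matched
  bool connect_attempted; // an origin connection was tried
  bool connect_succeeded; // the origin connection was established
};

// Hypothetical helper: choose the log result code for a transaction.
// A required-but-failed remap is checked first, because in that case no
// connection was ever attempted and ERR_CONNECT_FAIL would be misleading.
std::string log_code(const TxnOutcome &o) {
  if (o.remap_required && !o.remap_succeeded) {
    return "ERR_INVALID_URL";
  }
  if (o.connect_attempted && !o.connect_succeeded) {
    return "ERR_CONNECT_FAIL";
  }
  return "OK"; // placeholder for every other case; not a real ATS code
}
```

Ordering the remap check ahead of the connection check is the whole point of the sketch: the client-facing response (404 with the mapping-failed page, per the ticket) is unaffected, and only the logged code changes.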
[jira] [Resolved] (TS-3723) Failed remap shouldn't log as ERR_CONNECT_FAIL
[ https://issues.apache.org/jira/browse/TS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Geffon resolved TS-3723. -- Resolution: Fixed Failed remap shouldn't log as ERR_CONNECT_FAIL -- Key: TS-3723 URL: https://issues.apache.org/jira/browse/TS-3723 Project: Traffic Server Issue Type: Bug Components: Core, Logging Reporter: Brian Geffon Assignee: Brian Geffon Fix For: 6.1.0 It appears that a failed remap (when remap is required) is logged as ERR_CONNECT_FAIL. This is confusing, as no connection attempt was actually made. ERR_INVALID_URL makes more sense: remap was required, so technically this is an invalid URL. Nothing else will change; the response code will still be 404 with the mapping failed body factory page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (TS-3723) Failed remap shouldn't log as ERR_CONNECT_FAIL
[ https://issues.apache.org/jira/browse/TS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Geffon reassigned TS-3723: Assignee: Brian Geffon Failed remap shouldn't log as ERR_CONNECT_FAIL -- Key: TS-3723 URL: https://issues.apache.org/jira/browse/TS-3723 Project: Traffic Server Issue Type: Bug Components: Core, Logging Reporter: Brian Geffon Assignee: Brian Geffon Fix For: 6.1.0 It appears that a failed remap (when remap is required) is logged as ERR_CONNECT_FAIL. This is confusing, as no connection attempt was actually made. ERR_INVALID_URL makes more sense: remap was required, so technically this is an invalid URL. Nothing else will change; the response code will still be 404 with the mapping failed body factory page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (TS-3723) Failed remap shouldn't log as ERR_CONNECT_FAIL
[ https://issues.apache.org/jira/browse/TS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on TS-3723 started by Brian Geffon. Failed remap shouldn't log as ERR_CONNECT_FAIL -- Key: TS-3723 URL: https://issues.apache.org/jira/browse/TS-3723 Project: Traffic Server Issue Type: Bug Components: Core, Logging Reporter: Brian Geffon Assignee: Brian Geffon Fix For: 6.1.0 It appears that a failed remap (when remap is required) is logged as ERR_CONNECT_FAIL. This is confusing, as no connection attempt was actually made. ERR_INVALID_URL makes more sense: remap was required, so technically this is an invalid URL. Nothing else will change; the response code will still be 404 with the mapping failed body factory page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TS-3164) Why the load of trafficserver occurs an abrupt rise on a occasion ?
[ https://issues.apache.org/jira/browse/TS-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605204#comment-14605204 ] hahayaoniming edited comment on TS-3164 at 6/29/15 6:51 AM: I found some phenomena relating to this problem: 1. It always happens when memory is under heavy usage. 2. It won't happen if I restart traffic_server every day. 3. The memory usage grows slowly, about 2% every day. I suspect there's a memory leak, but I can't prove this. was (Author: hahayaoniming): I found some phenomena relating to this problem: 1. It always happens when memory is under heavy usage. 2. It won't happen if I restart the machine every day. 3. The memory usage grows slowly, about 2% every day. I suspect there's a memory leak, but I can't prove this. Why the load of trafficserver occurs an abrupt rise on a occasion ? --- Key: TS-3164 URL: https://issues.apache.org/jira/browse/TS-3164 Project: Traffic Server Issue Type: Bug Components: Core Environment: CentOS 6.3 64bit, 8 cores, 128G mem Reporter: taoyunxing Fix For: sometime I use Tsar to monitor the traffic status of ATS 4.2.0 and came across the following problem:
{code}
Time           ---cpu-- ---mem-- ---tcp-- -traffic          --sda--- --sdb--- --sdc--- ---load-
Time               util     util   retran    bytin   bytout     util     util     util    load1
03/11/14-18:20    40.67    87.19     3.36    24.5M    43.9M    13.02    94.68     0.00     5.34
03/11/14-18:25    40.30    87.20     3.27    22.5M    42.6M    12.38    94.87     0.00     5.79
03/11/14-18:30    40.84    84.67     3.44    21.4M    42.0M    13.29    95.37     0.00     6.28
03/11/14-18:35    43.63    87.36     3.21    23.8M    45.0M    13.23    93.99     0.00     7.37
03/11/14-18:40    42.25    87.37     3.09    24.2M    44.8M    12.84    95.77     0.00     7.25
03/11/14-18:45    42.96    87.44     3.46    23.3M    46.0M    12.96    95.84     0.00     7.10
03/11/14-18:50    44.00    87.42     3.49    22.3M    43.0M    14.17    94.99     0.00     6.57
03/11/14-18:55    42.20    87.44     3.46    22.3M    43.6M    13.19    96.05     0.00     6.09
03/11/14-19:00    44.90    87.53     3.60    23.6M    46.5M    13.61    96.67     0.00     8.06
03/11/14-19:05    46.26    87.73     3.24    25.8M    49.1M    15.39    94.05     0.00     9.98
03/11/14-19:10    43.85    87.69     3.19    25.4M    50.9M    12.88    97.80     0.00     7.99
03/11/14-19:15    45.28    87.69     3.36    25.6M    49.6M    13.10    96.86     0.00     7.47
03/11/14-19:20    44.11    85.20     3.29    24.1M    47.8M    14.24    96.75     0.00     5.82
03/11/14-19:25    45.26    87.78     3.52    24.4M    47.7M    13.21    95.44     0.00     7.61
03/11/14-19:30    44.83    87.80     3.64    25.7M    50.8M    13.27    98.02     0.00     6.85
03/11/14-19:35    44.89    87.78     3.61    23.9M    49.0M    13.34    97.42     0.00     7.04
03/11/14-19:40    69.21    88.88     0.55    18.3M    33.7M    11.39    71.23     0.00    65.80
03/11/14-19:45    72.47    88.66     0.27    15.4M    31.6M    11.51    72.31     0.00    11.56
03/11/14-19:50    44.87    88.72     4.11    22.7M    46.3M    12.99    97.33     0.00     8.29
{code}
In addition, the top command shows:
{code}
hi:0 ni:0 si:45.56 st:0 sy:13.92 us:12.58 wa:14.3 id:15.96
{code}
Can anyone help me? Thanks in advance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TS-3164) Why the load of trafficserver occurs an abrupt rise on a occasion ?
[ https://issues.apache.org/jira/browse/TS-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605204#comment-14605204 ] hahayaoniming edited comment on TS-3164 at 6/29/15 6:50 AM: I found some phenomena relating to this problem: 1. It always happens when memory is under heavy usage. 2. It won't happen if I restart the machine every day. 3. The memory usage grows slowly, about 2% every day. I suspect there's a memory leak, but I can't prove this. was (Author: hahayaoniming): I found some phenomena relating to this problem: Why the load of trafficserver occurs an abrupt rise on a occasion ? --- Key: TS-3164 URL: https://issues.apache.org/jira/browse/TS-3164 Project: Traffic Server Issue Type: Bug Components: Core Environment: CentOS 6.3 64bit, 8 cores, 128G mem Reporter: taoyunxing Fix For: sometime I use Tsar to monitor the traffic status of ATS 4.2.0 and came across the following problem:
{code}
Time           ---cpu-- ---mem-- ---tcp-- -traffic          --sda--- --sdb--- --sdc--- ---load-
Time               util     util   retran    bytin   bytout     util     util     util    load1
03/11/14-18:20    40.67    87.19     3.36    24.5M    43.9M    13.02    94.68     0.00     5.34
03/11/14-18:25    40.30    87.20     3.27    22.5M    42.6M    12.38    94.87     0.00     5.79
03/11/14-18:30    40.84    84.67     3.44    21.4M    42.0M    13.29    95.37     0.00     6.28
03/11/14-18:35    43.63    87.36     3.21    23.8M    45.0M    13.23    93.99     0.00     7.37
03/11/14-18:40    42.25    87.37     3.09    24.2M    44.8M    12.84    95.77     0.00     7.25
03/11/14-18:45    42.96    87.44     3.46    23.3M    46.0M    12.96    95.84     0.00     7.10
03/11/14-18:50    44.00    87.42     3.49    22.3M    43.0M    14.17    94.99     0.00     6.57
03/11/14-18:55    42.20    87.44     3.46    22.3M    43.6M    13.19    96.05     0.00     6.09
03/11/14-19:00    44.90    87.53     3.60    23.6M    46.5M    13.61    96.67     0.00     8.06
03/11/14-19:05    46.26    87.73     3.24    25.8M    49.1M    15.39    94.05     0.00     9.98
03/11/14-19:10    43.85    87.69     3.19    25.4M    50.9M    12.88    97.80     0.00     7.99
03/11/14-19:15    45.28    87.69     3.36    25.6M    49.6M    13.10    96.86     0.00     7.47
03/11/14-19:20    44.11    85.20     3.29    24.1M    47.8M    14.24    96.75     0.00     5.82
03/11/14-19:25    45.26    87.78     3.52    24.4M    47.7M    13.21    95.44     0.00     7.61
03/11/14-19:30    44.83    87.80     3.64    25.7M    50.8M    13.27    98.02     0.00     6.85
03/11/14-19:35    44.89    87.78     3.61    23.9M    49.0M    13.34    97.42     0.00     7.04
03/11/14-19:40    69.21    88.88     0.55    18.3M    33.7M    11.39    71.23     0.00    65.80
03/11/14-19:45    72.47    88.66     0.27    15.4M    31.6M    11.51    72.31     0.00    11.56
03/11/14-19:50    44.87    88.72     4.11    22.7M    46.3M    12.99    97.33     0.00     8.29
{code}
In addition, the top command shows:
{code}
hi:0 ni:0 si:45.56 st:0 sy:13.92 us:12.58 wa:14.3 id:15.96
{code}
Can anyone help me? Thanks in advance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-3723) Failed remap shouldn't log as ERR_CONNECT_FAIL
Brian Geffon created TS-3723: Summary: Failed remap shouldn't log as ERR_CONNECT_FAIL Key: TS-3723 URL: https://issues.apache.org/jira/browse/TS-3723 Project: Traffic Server Issue Type: Bug Components: Core, Logging Reporter: Brian Geffon It appears that a failed remap (when remap is required) is logged as ERR_CONNECT_FAIL. This is confusing, as no connection attempt was actually made. ERR_INVALID_URL makes more sense: remap was required, so technically this is an invalid URL. Nothing else will change; the response code will still be 404 with the mapping failed body factory page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3714) TS seems to stall between ua_begin and ua_first_read on some transactions resulting in high response times.
[ https://issues.apache.org/jira/browse/TS-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606710#comment-14606710 ] ASF GitHub Bot commented on TS-3714: GitHub user sudheerv opened a pull request: https://github.com/apache/trafficserver/pull/237 [TS-3714]: Summary of changes below: a) Issue a SSL_read right after SSL handshake to ensure data already in the SSL buffers is not lost. b) Add vc to net thread's read_ready_list immediately after accept, to ensure data already in the socket buffers is not lost. c) Add a timer for SSL handshake with default to no timer (as today). d) Fix a bunch of error cases to correctly release resources. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/trafficserver ts3714 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/trafficserver/pull/237.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #237 commit 8525ab9b69fd41072b3107d2512cc43eb4917738 Author: Sudheer Vinukonda sudhe...@yahoo-inc.com Date: 2015-06-30T00:07:24Z [TS-3714]: Summary of changes below: a) Issue a SSL_read right after SSL handshake to ensure data already in the SSL buffers is not lost. b) Add vc to net thread's read_ready_list immediately after accept, to ensure data already in the socket buffers is not lost. c) Add a timer for SSL handshake with default to no timer (as today). d) Fix a bunch of error cases to correctly release resources. TS seems to stall between ua_begin and ua_first_read on some transactions resulting in high response times. --- Key: TS-3714 URL: https://issues.apache.org/jira/browse/TS-3714 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 5.3.0 Reporter: Sudheer Vinukonda Assignee: Sudheer Vinukonda An example slow log showing very high *ua_first_read*. 
{code} ERROR: [8624075] Slow Request: client_ip: xx.xx.xx.xxx url: http://xx status: 200 unique id: bytes: 86 fd: 0 client state: 0 server state: 9 ua_begin: 0.000 ua_first_read: 42.224 ua_read_header_done: 42.224 cache_open_read_begin: 42.224 cache_open_read_end: 42.224 dns_lookup_begin: 42.224 dns_lookup_end: 42.224 server_connect: 42.224 server_first_read: 42.229 server_read_header_done: 42.229 server_close: 42.229 ua_begin_write: 42.229 ua_close: 42.229 sm_finish: 42.229 {code} Initially, I suspected that it might be caused by browsers connecting early before sending any bytes to TS. However, this seems to be happening too frequently and with an unrealistically high delay between *ua_begin* and *ua_first_read*. I suspect it's caused by the code that disables the read temporarily before calling the *TXN_START* hook and re-enables it after the API callout. The read disable is done via a 0-byte *do_io_read* on the client vc, but the problem is that a valid *mbuf* is still passed. Based on what I am seeing, it's possible this results in actually enabling the *read_vio* all the way up to *ssl_read_from_net*, for instance (if there's a race condition and there were already bytes from the client, resulting in an epoll read event), which would finally disable the read since the *ntodo* (nbytes) is 0. However, this may result in the epoll event being lost until new i/o happens from the client. I'm trying out further experiments to confirm the theory. In most cases, the read buffer already has some bytes by the time the http session and http sm are created, which makes it just work. But if there's a slight delay in receiving bytes after making a connection, the epoll mechanism should read them; due to the way the temporary read disable is being done, however, the event may be lost (this is because ATS uses epoll in edge-triggered mode). Some history on this issue - This issue has been a problem for a long time and affects both http and https requests.
When this issue was first reported, our router operations team eventually closed it, indicating that disabling *accept* threads resolved it ([~yzlai] also reported similar observations and conclusions). It's possible that the race condition window may be slightly reduced by disabling accept threads, but accept threads are very important to overall performance and throughput. I now notice that the issue still exists (regardless of whether accept threads are enabled or disabled) and am testing further to confirm the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
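The edge-triggered behaviour that the theory above relies on can be reproduced outside ATS. The following is a minimal Linux-only sketch using a pipe instead of a client socket; `events_on_second_wait` is a hypothetical helper written for this illustration and is unrelated to the ATS net code. It writes one byte, takes the first readiness notification without reading, and reports how many events a second `epoll_wait()` returns.

```cpp
#include <sys/epoll.h>
#include <unistd.h>

// Returns the number of events reported by a second epoll_wait() when the
// first notification was consumed but the data was never read.
// flags is EPOLLIN, optionally combined with EPOLLET. Returns -1 on setup errors.
int events_on_second_wait(unsigned int flags)
{
  int fds[2];
  if (pipe(fds) != 0) {
    return -1;
  }
  int ep = epoll_create1(0);
  if (ep < 0) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }

  epoll_event ev{};
  ev.events  = flags;
  ev.data.fd = fds[0];

  int result = -1;
  // Register the read end first, then make it readable by writing one byte.
  if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == 0 && write(fds[1], "x", 1) == 1) {
    epoll_event out{};
    // First wait: the readiness edge is delivered here.
    if (epoll_wait(ep, &out, 1, 0) == 1) {
      // Deliberately skip read(); ask again without any new i/o.
      result = epoll_wait(ep, &out, 1, 0);
    }
  }

  close(ep);
  close(fds[0]);
  close(fds[1]);
  return result;
}
```

With `EPOLLIN | EPOLLET` the second wait reports nothing although the byte is still unread, while plain `EPOLLIN` (level-triggered) keeps reporting it. That is the sense in which a notification that arrives while the read is disabled can be lost until the client sends more data.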
[jira] [Commented] (TS-3720) HostDB no longer keeps track of host failures
[ https://issues.apache.org/jira/browse/TS-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606842#comment-14606842 ] ASF GitHub Bot commented on TS-3720: Github user jacksontj commented on the pull request: https://github.com/apache/trafficserver/pull/234#issuecomment-116911339 Merged HostDB no longer keeps track of host failures - Key: TS-3720 URL: https://issues.apache.org/jira/browse/TS-3720 Project: Traffic Server Issue Type: Bug Affects Versions: 5.3.0, 6.0.0 Reporter: Thomas Jackson Assignee: Thomas Jackson If ATS's DNS response contains N A records and one of them is unavailable, ATS will always send traffic to the down real regardless of configuration. This bug seems to be due to how hostdb keeps track of DNS and the health of the real. The regression was introduced in https://issues.apache.org/jira/browse/TS-3237 (https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=a8d1862). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-3724) ATS doesn't properly check last_failure time for most RR methods
[ https://issues.apache.org/jira/browse/TS-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Jackson updated TS-3724: --- Fix Version/s: 6.0.0 ATS doesn't properly check last_failure time for most RR methods Key: TS-3724 URL: https://issues.apache.org/jira/browse/TS-3724 Project: Traffic Server Issue Type: Bug Reporter: Thomas Jackson Assignee: Thomas Jackson Fix For: 6.0.0 Right now strict and timed RR don't check the last_failure time of the real. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3720) HostDB no longer keeps track of host failures
[ https://issues.apache.org/jira/browse/TS-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606841#comment-14606841 ] ASF GitHub Bot commented on TS-3720: Github user jacksontj closed the pull request at: https://github.com/apache/trafficserver/pull/234 HostDB no longer keeps track of host failures - Key: TS-3720 URL: https://issues.apache.org/jira/browse/TS-3720 Project: Traffic Server Issue Type: Bug Affects Versions: 5.3.0, 6.0.0 Reporter: Thomas Jackson Assignee: Thomas Jackson If ATS's DNS response contains N A records and one of them is unavailable, ATS will always send traffic to the down real regardless of configuration. This bug seems to be due to how hostdb keeps track of DNS and the health of the real. The regression was introduced in https://issues.apache.org/jira/browse/TS-3237 (https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=a8d1862). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3724) ATS doesn't properly check last_failure time for most RR methods
[ https://issues.apache.org/jira/browse/TS-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606844#comment-14606844 ] Thomas Jackson commented on TS-3724: Code merged, leaving the ticket around until I commit a test case ATS doesn't properly check last_failure time for most RR methods Key: TS-3724 URL: https://issues.apache.org/jira/browse/TS-3724 Project: Traffic Server Issue Type: Bug Reporter: Thomas Jackson Assignee: Thomas Jackson Right now strict and timed RR don't check the last_failure time of the real. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3724) ATS doesn't properly check last_failure time for most RR methods
[ https://issues.apache.org/jira/browse/TS-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606838#comment-14606838 ] ASF subversion and git services commented on TS-3724: - Commit be68bd8f47f7ecde5403d9a63dbf81604d9bdf56 in trafficserver's branch refs/heads/master from [~jacksontj] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=be68bd8 ] Make HostDBRoundRobin::select_best_http take last_failure time into consideration for all RR types In the current setup it only checks the status of the reals if you use default RR (which is actually consistent hashing... but we'll let that slide). This patch consolidates the alive() check into the HostDBInfo struct, and then calls it from all 3 LB mechanisms. Since you can control if/when a host is marked as down in ATS there is no reason to not check. Issue: TS-3724 ATS doesn't properly check last_failure time for most RR methods Key: TS-3724 URL: https://issues.apache.org/jira/browse/TS-3724 Project: Traffic Server Issue Type: Bug Reporter: Thomas Jackson Assignee: Thomas Jackson Right now strict and timed RR don't check the last_failure time of the real. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
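The idea in the commit message (one shared liveness check used by every selection mode) can be sketched as follows. This is not the actual `HostDBInfo` or `HostDBRoundRobin` code: `RealInfo`, `first_alive_from`, `select_strict`, `select_by_hash` and the `fail_window` parameter are hypothetical names chosen for illustration, and only two of the three selection modes are shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-real record.
struct RealInfo {
  std::uint32_t last_failure; // 0 = never failed, otherwise time of last failure in seconds

  // Single liveness check shared by all selection modes (assumed semantics:
  // a real is usable again once fail_window seconds have passed).
  bool alive(std::uint32_t now, std::uint32_t fail_window) const
  {
    return last_failure == 0 || now - last_failure > fail_window;
  }
};

// Walks the list once, starting at start, and returns the first live index or -1.
int first_alive_from(const std::vector<RealInfo> &reals, std::size_t start, std::uint32_t now, std::uint32_t fail_window)
{
  if (reals.empty()) {
    return -1;
  }
  const std::size_t n = reals.size();
  start %= n;
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t idx = (start + i) % n;
    if (reals[idx].alive(now, fail_window)) {
      return static_cast<int>(idx);
    }
  }
  return -1;
}

// Strict round robin: advance past whichever real was handed out.
int select_strict(const std::vector<RealInfo> &reals, std::size_t &cursor, std::uint32_t now, std::uint32_t fail_window)
{
  const int idx = first_alive_from(reals, cursor, now, fail_window);
  if (idx >= 0) {
    cursor = static_cast<std::size_t>(idx) + 1;
  }
  return idx;
}

// Hash-based selection: same liveness check, different starting point.
int select_by_hash(const std::vector<RealInfo> &reals, std::size_t client_hash, std::uint32_t now, std::uint32_t fail_window)
{
  return first_alive_from(reals, client_hash, now, fail_window);
}
```

Because both selectors go through `alive()`, a real that failed recently is skipped no matter which mode is configured, which is the behaviour the ticket asks for.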
[jira] [Updated] (TS-3720) HostDB no longer keeps track of host failures
[ https://issues.apache.org/jira/browse/TS-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Jackson updated TS-3720: --- Backport to Version: 5.3.0, 6.0.0 HostDB no longer keeps track of host failures - Key: TS-3720 URL: https://issues.apache.org/jira/browse/TS-3720 Project: Traffic Server Issue Type: Bug Affects Versions: 5.3.0, 6.0.0 Reporter: Thomas Jackson Assignee: Thomas Jackson If ATS's DNS response contains N A records and one of them is unavailable, ATS will always send traffic to the down real regardless of configuration. This bug seems to be due to how hostdb keeps track of DNS and the health of the real. The regression was introduced in https://issues.apache.org/jira/browse/TS-3237 (https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=a8d1862). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-3725) ATS's use of host files is broken
Thomas Jackson created TS-3725: -- Summary: ATS's use of host files is broken Key: TS-3725 URL: https://issues.apache.org/jira/browse/TS-3725 Project: Traffic Server Issue Type: Bug Reporter: Thomas Jackson The previous implementation actually loaded up the hosts file (in a background thread), did a get (getbyname_imm), and then set the IPs to match. This works, but it does cause a problem since the port set on it is 0. What we need to do is have this thread maintain an in-memory mapping of name -> IP and have it override resolution within hostdb; that way we can use host files and still track whether reals are up or down. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
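A rough sketch of the proposed approach, assuming the usual hosts-file layout (an address followed by one or more names, with `#` starting a comment). `HostsMap`, `parse_hosts` and `lookup_override` are hypothetical names for illustration; nothing here is the hostdb implementation, and reloading from a background thread is left out.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>

// In-memory name -> IP mapping built from hosts-file text.
using HostsMap = std::unordered_map<std::string, std::string>;

HostsMap parse_hosts(const std::string &text)
{
  HostsMap map;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    // Drop everything after a '#' comment marker.
    line = line.substr(0, line.find('#'));
    std::istringstream fields(line);
    std::string ip, name;
    if (!(fields >> ip)) {
      continue; // blank or comment-only line
    }
    while (fields >> name) {
      map.emplace(name, ip); // the first mapping for a name wins
    }
  }
  return map;
}

// Returns the overriding address, or nullptr when normal DNS resolution
// should be used instead.
const std::string *lookup_override(const HostsMap &map, const std::string &name)
{
  const auto it = map.find(name);
  return it == map.end() ? nullptr : &it->second;
}
```

Consulting such a map before the DNS lookup keeps the resolved entry inside the normal resolution path, so the up/down state of each real can still be tracked as the ticket describes.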
[jira] [Commented] (TS-3440) Connect_retries re-connects even if request made it to origin
[ https://issues.apache.org/jira/browse/TS-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606818#comment-14606818 ] Thomas Jackson commented on TS-3440: Yeah, we can fix the test case once the feature works properly; the intent was to show that it's broken ;) Connect_retries re-connects even if request made it to origin - Key: TS-3440 URL: https://issues.apache.org/jira/browse/TS-3440 Project: Traffic Server Issue Type: Bug Reporter: Thomas Jackson Assignee: Brian Geffon Fix For: sometime While trying to work around TS-3439 I decided to test out the connect retries option. During testing I found a case where it retries when it should not. The scenario is as follows:
- ATS makes a connection to an origin
- the origin acks the entire request
- the origin starts to send back a response (let's say the first line of the header)
- the origin sends an RST
In this scenario ATS will re-connect to the origin, which is bad since we have already sent the request (and we aren't sure if the URL is re-entrant). Test case: https://github.com/jacksontj/trafficserver/commit/28059ccb93f9fb173792aeebf90062882dfdf9d5#diff-06f9ddbe6cc45d76ebb2cb21479dc805R182 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
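The rule implied by the scenario above (do not re-connect once the request has gone out) could be expressed along these lines. This is a sketch under assumptions, not the ATS retry logic: `OriginAttempt` and `may_reconnect` are hypothetical names, and treating any sent request byte as disqualifying is a deliberately conservative choice made for the example.

```cpp
#include <cstddef>

// Hypothetical per-attempt bookkeeping for one origin connection.
struct OriginAttempt {
  std::size_t request_bytes_sent; // bytes of the request already written to the origin
  int retries_done;               // connect retries used so far
};

// A connect retry is only safe while nothing has been sent to the origin;
// after that the origin may already be acting on the request.
bool may_reconnect(const OriginAttempt &attempt, int max_connect_retries)
{
  if (attempt.request_bytes_sent > 0) {
    return false;
  }
  return attempt.retries_done < max_connect_retries;
}
```

In the RST scenario from the ticket the request was fully acked, so `request_bytes_sent` would be non-zero and the sketch would refuse the retry.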
[jira] [Commented] (TS-3724) ATS doesn't properly check last_failure time for most RR methods
[ https://issues.apache.org/jira/browse/TS-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606830#comment-14606830 ] Thomas Jackson commented on TS-3724: Patch in https://github.com/apache/trafficserver/pull/234 ATS doesn't properly check last_failure time for most RR methods Key: TS-3724 URL: https://issues.apache.org/jira/browse/TS-3724 Project: Traffic Server Issue Type: Bug Reporter: Thomas Jackson Assignee: Thomas Jackson Right now strict and timed RR don't check the last_failure time of the real. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-3724) ATS doesn't properly check last_failure time for most RR methods
Thomas Jackson created TS-3724: -- Summary: ATS doesn't properly check last_failure time for most RR methods Key: TS-3724 URL: https://issues.apache.org/jira/browse/TS-3724 Project: Traffic Server Issue Type: Bug Reporter: Thomas Jackson Right now strict and timed RR don't check the last_failure time of the real. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3724) ATS doesn't properly check last_failure time for most RR methods
[ https://issues.apache.org/jira/browse/TS-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606833#comment-14606833 ] Thomas Jackson commented on TS-3724: From looking around it doesn't seem like a *regression* so much as a feature that never worked correctly ;) ATS doesn't properly check last_failure time for most RR methods Key: TS-3724 URL: https://issues.apache.org/jira/browse/TS-3724 Project: Traffic Server Issue Type: Bug Reporter: Thomas Jackson Assignee: Thomas Jackson Right now strict and timed RR don't check the last_failure time of the real. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (TS-3724) ATS doesn't properly check last_failure time for most RR methods
[ https://issues.apache.org/jira/browse/TS-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Jackson reassigned TS-3724: -- Assignee: Thomas Jackson ATS doesn't properly check last_failure time for most RR methods Key: TS-3724 URL: https://issues.apache.org/jira/browse/TS-3724 Project: Traffic Server Issue Type: Bug Reporter: Thomas Jackson Assignee: Thomas Jackson Right now strict and timed RR don't check the last_failure time of the real. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3714) TS seems to stall between ua_begin and ua_first_read on some transactions resulting in high response times.
[ https://issues.apache.org/jira/browse/TS-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606708#comment-14606708 ] ASF subversion and git services commented on TS-3714: - Commit 8525ab9b69fd41072b3107d2512cc43eb4917738 in trafficserver's branch refs/heads/ts3714 from [~sudheerv] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=8525ab9 ] [TS-3714]: Summary of changes below: a) Issue a SSL_read right after SSL handshake to ensure data already in the SSL buffers is not lost. b) Add vc to net thread's read_ready_list immediately after accept, to ensure data already in the socket buffers is not lost. c) Add a timer for SSL handshake with default to no timer (as today). d) Fix a bunch of error cases to correctly release resources. TS seems to stall between ua_begin and ua_first_read on some transactions resulting in high response times. --- Key: TS-3714 URL: https://issues.apache.org/jira/browse/TS-3714 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 5.3.0 Reporter: Sudheer Vinukonda Assignee: Sudheer Vinukonda An example slow log showing very high *ua_first_read*. {code} ERROR: [8624075] Slow Request: client_ip: xx.xx.xx.xxx url: http://xx status: 200 unique id: bytes: 86 fd: 0 client state: 0 server state: 9 ua_begin: 0.000 ua_first_read: 42.224 ua_read_header_done: 42.224 cache_open_read_begin: 42.224 cache_open_read_end: 42.224 dns_lookup_begin: 42.224 dns_lookup_end: 42.224 server_connect: 42.224 server_first_read: 42.229 server_read_header_done: 42.229 server_close: 42.229 ua_begin_write: 42.229 ua_close: 42.229 sm_finish: 42.229 {code} Initially, I suspected that it might be caused by browsers connecting early before sending any bytes to TS. However, this seems to be happening too frequently and with an unrealistically high delay between *ua_begin* and *ua_first_read*.
I suspect it's caused by the code that disables the read temporarily before calling the *TXN_START* hook and re-enables it after the API callout. The read disable is done via a 0-byte *do_io_read* on the client vc, but the problem is that a valid *mbuf* is still passed. Based on what I am seeing, it's possible this results in actually enabling the *read_vio* all the way up to *ssl_read_from_net*, for instance (if there's a race condition and there were already bytes from the client, resulting in an epoll read event), which would finally disable the read since the *ntodo* (nbytes) is 0. However, this may result in the epoll event being lost until new i/o happens from the client. I'm trying out further experiments to confirm the theory. In most cases, the read buffer already has some bytes by the time the http session and http sm are created, which makes it just work. But if there's a slight delay in receiving bytes after making a connection, the epoll mechanism should read them; due to the way the temporary read disable is being done, however, the event may be lost (this is because ATS uses epoll in edge-triggered mode). Some history on this issue - This issue has been a problem for a long time and affects both http and https requests. When this issue was first reported, our router operations team eventually closed it, indicating that disabling *accept* threads resolved it ([~yzlai] also reported similar observations and conclusions). It's possible that the race condition window may be slightly reduced by disabling accept threads, but accept threads are very important to overall performance and throughput. I now notice that the issue still exists (regardless of whether accept threads are enabled or disabled) and am testing further to confirm the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3714) TS seems to stall between ua_begin and ua_first_read on some transactions resulting in high response times.
[ https://issues.apache.org/jira/browse/TS-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606848#comment-14606848 ] ASF subversion and git services commented on TS-3714: - Commit 8c76e6aca9be8f71b714183ee861cabee4bac84d in trafficserver's branch refs/heads/ts3714 from [~sudheerv] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=8c76e6a ] [TS-3714]: adjust protocol probe to adjust for early read TS seems to stall between ua_begin and ua_first_read on some transactions resulting in high response times. --- Key: TS-3714 URL: https://issues.apache.org/jira/browse/TS-3714 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 5.3.0 Reporter: Sudheer Vinukonda Assignee: Sudheer Vinukonda An example slow log showing very high *ua_first_read*. {code} ERROR: [8624075] Slow Request: client_ip: xx.xx.xx.xxx url: http://xx status: 200 unique id: bytes: 86 fd: 0 client state: 0 server state: 9 ua_begin: 0.000 ua_first_read: 42.224 ua_read_header_done: 42.224 cache_open_read_begin: 42.224 cache_open_read_end: 42.224 dns_lookup_begin: 42.224 dns_lookup_end: 42.224 server_connect: 42.224 server_first_read: 42.229 server_read_header_done: 42.229 server_close: 42.229 ua_begin_write: 42.229 ua_close: 42.229 sm_finish: 42.229 {code} Initially, I suspected that it might be caused by browsers connecting early before sending any bytes to TS. However, this seems to be happening too frequently and with an unrealistically high delay between *ua_begin* and *ua_first_read*. I suspect it's caused by the code that disables the read temporarily before calling the *TXN_START* hook and re-enables it after the API callout. The read disable is done via a 0-byte *do_io_read* on the client vc, but the problem is that a valid *mbuf* is still passed.
Based on what I am seeing, it's possible this results in actually enabling the *read_vio* all the way up to *ssl_read_from_net*, for instance (if there's a race condition and there were already bytes from the client, resulting in an epoll read event), which would finally disable the read since the *ntodo* (nbytes) is 0. However, this may result in the epoll event being lost until new i/o happens from the client. I'm trying out further experiments to confirm the theory. In most cases, the read buffer already has some bytes by the time the http session and http sm are created, which makes it just work. But if there's a slight delay in receiving bytes after making a connection, the epoll mechanism should read them; due to the way the temporary read disable is being done, however, the event may be lost (this is because ATS uses epoll in edge-triggered mode). Some history on this issue - This issue has been a problem for a long time and affects both http and https requests. When this issue was first reported, our router operations team eventually closed it, indicating that disabling *accept* threads resolved it ([~yzlai] also reported similar observations and conclusions). It's possible that the race condition window may be slightly reduced by disabling accept threads, but accept threads are very important to overall performance and throughput. I now notice that the issue still exists (regardless of whether accept threads are enabled or disabled) and am testing further to confirm the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3720) HostDB no longer keeps track of host failures
[ https://issues.apache.org/jira/browse/TS-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606846#comment-14606846 ] Thomas Jackson commented on TS-3720: Specifically we need to backport https://github.com/jacksontj/trafficserver/commit/6a8cb988fb19ab36140da6be737b88bf4e99d63d HostDB no longer keeps track of host failures - Key: TS-3720 URL: https://issues.apache.org/jira/browse/TS-3720 Project: Traffic Server Issue Type: Bug Affects Versions: 5.3.0, 6.0.0 Reporter: Thomas Jackson Assignee: Thomas Jackson If ATS's DNS response contains N A records and one of them is unavailable, ATS will always send traffic to the down real regardless of configuration. This bug seems to be due to how hostdb keeps track of DNS and the health of the real. The regression was introduced in https://issues.apache.org/jira/browse/TS-3237 (https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=a8d1862). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3697) Fix frame type array check for http/2 is invalid
[ https://issues.apache.org/jira/browse/TS-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606963#comment-14606963 ] ASF subversion and git services commented on TS-3697: - Commit 81a6807a419051712feb2b7f93e62e3140f4001c in trafficserver's branch refs/heads/5.3.x from [~bcall] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=81a6807 ] TS-3697: Fix frame type check for http/2 (cherry picked from commit 45c1133b9ea45003ae6491f4bd03edcf3ab8be37) Fix frame type array check for http/2 is invalid Key: TS-3697 URL: https://issues.apache.org/jira/browse/TS-3697 Project: Traffic Server Issue Type: Bug Components: HTTP/2 Reporter: Bryan Call Assignee: Bryan Call Fix For: 5.3.1, 6.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3237) DNS host entries are tied to ports
[ https://issues.apache.org/jira/browse/TS-3237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14607510#comment-14607510 ] ASF subversion and git services commented on TS-3237: - Commit c31e12b4910deefb4c71d0d028e9a2baaec4a797 in trafficserver's branch refs/heads/master from [~zwoop] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=c31e12b ] Revert Revert TS-3237: Don't segregate DNS results by port. This reverts commit 6a56fd27228994fcea38c29580c73f704fe01e55. I'm reverting this, since it restores the CHANGES file. Please redo this commit, and also remember to run the clang-format before commit. DNS host entries are tied to ports -- Key: TS-3237 URL: https://issues.apache.org/jira/browse/TS-3237 Project: Traffic Server Issue Type: Bug Components: HostDB Reporter: Alan M. Carroll Assignee: Alan M. Carroll Fix For: 5.3.0 When HostDB does a DNS resolution of an FQDN it stores it in the HostDB as associated with a specific port. This means if the same FQDN is accessed via multiple ports (e.g., 80 and 443) there is a duplicate record for each port. For normal (that is, non SRV) resolution the port should be fixed to 0 because the data will always be identical. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
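The keying rule described in TS-3237 can be sketched as follows. This is only an illustration under stated assumptions: `HostKey` and `make_host_key` are hypothetical names, not the HostDB API, and the real HostDB key is more involved than a (name, port) pair.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical lookup key: (FQDN, port). Not the actual HostDB key type.
typedef std::pair<std::string, std::uint16_t> HostKey;

// For non-SRV lookups the port is forced to 0, so that the same FQDN
// reached on ports 80 and 443 shares a single record. SRV lookups keep
// the port as given.
HostKey make_host_key(const std::string &fqdn, std::uint16_t port, bool is_srv)
{
  return HostKey(fqdn, is_srv ? port : static_cast<std::uint16_t>(0));
}
```

With this rule, `make_host_key("example.com", 80, false)` and `make_host_key("example.com", 443, false)` compare equal, so only one record is stored for the host.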
[jira] [Commented] (TS-3719) HPACK error in lowering table size
[ https://issues.apache.org/jira/browse/TS-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606964#comment-14606964 ] ASF subversion and git services commented on TS-3719: - Commit 4281f20ad9e9316710d427628250ce58bd15ea09 in trafficserver's branch refs/heads/5.3.x from [~bcall] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=4281f20 ] TS-3719: HPACK error in lowering table size (cherry picked from commit 96bd1fa785c3e6adbca8e11f1ad7a578e945625e) HPACK error in lowering table size -- Key: TS-3719 URL: https://issues.apache.org/jira/browse/TS-3719 Project: Traffic Server Issue Type: Bug Components: HTTP/2 Reporter: Bryan Call Assignee: Bryan Call Fix For: 5.3.1, 6.0.0 The table size is reduced by (max size - new size) instead of (current size - new size). This causes the table to try to delete items that don't exist. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
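The difference between the two computations in TS-3719 can be illustrated with a small stand-alone sketch. `DynamicTable`, `buggy_bytes_to_free` and `correct_bytes_to_free` are hypothetical names made up for this example and are not the ATS HPACK classes; the entry-size rule (name length + value length + 32) and the eviction behaviour follow RFC 7541.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// What the bug report describes: the amount to free is derived from the
// old maximum, even if the table currently holds far less than that.
std::size_t buggy_bytes_to_free(std::size_t old_max, std::size_t new_max)
{
  return old_max > new_max ? old_max - new_max : 0;
}

// The amount to free should be derived from what the table currently holds.
std::size_t correct_bytes_to_free(std::size_t current, std::size_t new_max)
{
  return current > new_max ? current - new_max : 0;
}

// Hypothetical stand-in for an HPACK dynamic table (illustration only).
class DynamicTable
{
public:
  explicit DynamicTable(std::size_t max_size) : max_size_(max_size), current_size_(0) {}

  // RFC 7541: entry size is name length + value length + 32 octets.
  static std::size_t entry_size(const std::string &name, const std::string &value)
  {
    return name.size() + value.size() + 32;
  }

  void add(const std::string &name, const std::string &value)
  {
    const std::size_t sz = entry_size(name, value);
    // Make room first; an entry larger than the maximum empties the table
    // and is not added.
    evict_until(max_size_ >= sz ? max_size_ - sz : 0);
    if (sz <= max_size_) {
      entries_.push_front(std::make_pair(name, value));
      current_size_ += sz;
    }
  }

  // Lowering the maximum evicts the oldest entries until the current size fits.
  void set_max_size(std::size_t new_max)
  {
    max_size_ = new_max;
    evict_until(new_max);
  }

  std::size_t size() const { return current_size_; }
  std::size_t count() const { return entries_.size(); }

private:
  void evict_until(std::size_t limit)
  {
    while (current_size_ > limit && !entries_.empty()) {
      current_size_ -= entry_size(entries_.back().first, entries_.back().second);
      entries_.pop_back();
    }
  }

  std::size_t max_size_;
  std::size_t current_size_;
  std::deque<std::pair<std::string, std::string> > entries_;
};
```

For a table with a 4096-byte maximum holding 68 bytes, lowering the maximum to 1024 requires no eviction at all, whereas the max-based computation asks for 3072 bytes to be freed, which is more than the table contains.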
[jira] [Commented] (TS-3665) Redirect logic causing debug asserts and leaking cache_vc's
[ https://issues.apache.org/jira/browse/TS-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606962#comment-14606962 ] ASF subversion and git services commented on TS-3665: - Commit 9031516b71ff5780d89af0ed4da536757b6ab2aa in trafficserver's branch refs/heads/5.3.x from shinrich [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=9031516 ] TS-3665: Update CHANGES (cherry picked from commit 8d16cdbe0db48b508297cec6f1ffc8e2cb9a6106) Conflicts: CHANGES Redirect logic causing debug asserts and leaking cache_vc's --- Key: TS-3665 URL: https://issues.apache.org/jira/browse/TS-3665 Project: Traffic Server Issue Type: Bug Components: Cache Reporter: Susan Hinrichs Assignee: Susan Hinrichs Fix For: 5.3.1, 6.0.0 Attachments: ts-3665-2.diff, ts-3665.diff This is related to TS-3140 and TS-3661. I spent this morning reviewing the issue addressed by TS-3140 after the fixes for TS-3661 were put in place. TS-3140 addresses the issue when the 301 is in cache, but I'm seeing asserts both for 301's in cache and for 301's not in cache. My first assert was at line 109 of HttpCacheSM.cc, ink_assert(cache_read_vc == NULL). While this is only a debug assert, if we ignore it we will reassign cache_read_vc without freeing the previous one. I addressed this by adding cache_sm.close_read() to the SM_ACTION_REDIRECT_READ case of HttpSM::handle_api_return. My second assert is in HttpSM::do_cache_prepare_action (line 4446 of HttpSM.cc). Before the changes for TS-3661, it showed up in the SM_ACTION_CACHE_ISSUE_WRITE case of HttpSM::cache_write_state(). In this case, do_cache_prepare_action will open a new cache_write_vc, overwriting the original and losing the cache_vc memory. The original fix to TS-3140 addressed this by adding a cache_sm.close_write in the SM_ACTION_REDIRECT_READ case of HttpSM::handle_api_return. 
But this caused the problem reported in TS-3661, where the originally selected cache key is lost; if you pass through this logic, though, I assume that the original cache write vc will be lost anyway. [~sudheerv] and [~zwoop], does this situation not happen in your redirect use cases? I'm afraid that I'm not following how the original cache key is preserved in the second cache open only if the first cache write open is not cleaned up. My test URLs are: curl -v --proxy localhost:80 http://whos.amung.us/cwidget/4s62rme9/007071fecc4e.png and curl -v --proxy localhost:80 http://docs.trafficserver.apache.org -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-3721) Remove scoping on enum for HTTP2_FRAME_TYPE_MAX
[ https://issues.apache.org/jira/browse/TS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606965#comment-14606965 ] ASF subversion and git services commented on TS-3721: - Commit 15fed8a01537a780ca433edc76fc1dd8da5af3f1 in trafficserver's branch refs/heads/5.3.x from [~bcall] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=15fed8a ] TS-3721: Remove scoping on enum for HTTP2_FRAME_TYPE_MAX (cherry picked from commit 4377dabd0f6c8ec9275ec456d295907a06a1b591) Remove scoping on enum for HTTP2_FRAME_TYPE_MAX --- Key: TS-3721 URL: https://issues.apache.org/jira/browse/TS-3721 Project: Traffic Server Issue Type: Bug Components: HTTP/2 Reporter: Bryan Call Assignee: Bryan Call Fix For: 5.3.1, 6.0.0 Seeing this on our RHEL 6.5 build server: Http2ConnectionState.cc: In member function ‘int Http2ConnectionState::main_event_handler(int, void*)’: Http2ConnectionState.cc:643: error: ‘Http2FrameType’ is not a class or namespace Http2ConnectionState.cc:644: error: ‘Http2FrameType’ is not a class or namespace -- This message was sent by Atlassian JIRA (v6.3.4#6332)
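The compile error above can be reproduced with a small sketch. Qualifying an enumerator of a plain (unscoped) enum with the enum's name is only accepted from C++11 onward; older compilers, or compilers not in C++11 mode, reject it with "is not a class or namespace". The enum below is illustrative: the enumerator names are assumptions, while the numeric frame type values are those defined by RFC 7540. `is_valid_frame_type` is a hypothetical helper, not the ATS function.

```cpp
// Plain (unscoped) enum; enumerator names are illustrative assumptions.
enum Http2FrameType {
  HTTP2_FRAME_TYPE_DATA          = 0,
  HTTP2_FRAME_TYPE_HEADERS       = 1,
  HTTP2_FRAME_TYPE_PRIORITY      = 2,
  HTTP2_FRAME_TYPE_RST_STREAM    = 3,
  HTTP2_FRAME_TYPE_SETTINGS      = 4,
  HTTP2_FRAME_TYPE_PUSH_PROMISE  = 5,
  HTTP2_FRAME_TYPE_PING          = 6,
  HTTP2_FRAME_TYPE_GOAWAY        = 7,
  HTTP2_FRAME_TYPE_WINDOW_UPDATE = 8,
  HTTP2_FRAME_TYPE_CONTINUATION  = 9,
  HTTP2_FRAME_TYPE_MAX
};

// Hypothetical helper: true if the wire value names a known frame type.
bool is_valid_frame_type(unsigned type)
{
  // Pre-C++11 compilers reject the qualified form:
  //   return type < Http2FrameType::HTTP2_FRAME_TYPE_MAX;
  // The unqualified form is accepted by every C++ standard:
  return type < static_cast<unsigned>(HTTP2_FRAME_TYPE_MAX);
}
```

Dropping the `Http2FrameType::` qualifier keeps the code portable to compilers that predate C++11 without changing its meaning.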
[jira] [Commented] (TS-3714) TS seems to stall between ua_begin and ua_first_read on some transactions resulting in high response times.
[ https://issues.apache.org/jira/browse/TS-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14607523#comment-14607523 ] ASF subversion and git services commented on TS-3714: - Commit 2f41a41e92e8d85c7c581664fdc587d1d8860f95 in trafficserver's branch refs/heads/ts3714 from [~sudheerv] [ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=2f41a41 ] [TS-3714]: Fix ssl's first byte iobuf leak in error cases TS seems to stall between ua_begin and ua_first_read on some transactions resulting in high response times. --- Key: TS-3714 URL: https://issues.apache.org/jira/browse/TS-3714 Project: Traffic Server Issue Type: Bug Components: Core Affects Versions: 5.3.0 Reporter: Sudheer Vinukonda Assignee: Sudheer Vinukonda An example slow log showing very high *ua_first_read*. {code} ERROR: [8624075] Slow Request: client_ip: xx.xx.xx.xxx url: http://xx status: 200 unique id: bytes: 86 fd: 0 client state: 0 server state: 9 ua_begin: 0.000 ua_first_read: 42.224 ua_read_header_done: 42.224 cache_open_read_begin: 42.224 cache_open_read_end: 42.224 dns_lookup_begin: 42.224 dns_lookup_end: 42.224 server_connect: 42.224 server_first_read: 42.229 server_read_header_done: 42.229 server_close: 42.229 ua_begin_write: 42.229 ua_close: 42.229 sm_finish: 42.229 {code} Initially, I suspected that it might be caused by browsers connecting early, before sending any bytes to TS. However, this seems to be happening too frequently and with an unrealistically high delay between *ua_begin* and *ua_first_read*. I suspect it's caused by the code that disables the read temporarily before calling the *TXN_START* hook and re-enables it after the API call out. The read disable is done via a 0-byte *do_io_read* on the client vc, but the problem is that a valid *mbuf* is still passed. 
Based on what I am seeing, it's possible this results in actually enabling the *read_vio* all the way up to *ssl_read_from_net*, for instance (if there's a race condition and bytes from the client had already arrived, resulting in an epoll read event), which would finally disable the read since the *ntodo* (nbytes) is 0. However, this may result in the epoll event being lost until new I/O arrives from the client. I'm trying out further experiments to confirm the theory. In most cases, the read buffer already has some bytes by the time the http session and http sm are created, which makes it just work. But if there's a slight delay in receiving bytes after the connection is made, the epoll mechanism should read them; due to the way the temporary read disable is being done, however, the event may be lost (this is because ATS uses the epoll edge-triggered mode). Some history on this issue: it has been a problem for a long time and affects both http and https requests. When this issue was first reported, our router operations team eventually closed it, indicating that disabling *accept* threads resolved it ([~yzlai] also reported similar observations and conclusions). It's possible that the race condition window may be slightly reduced by disabling accept threads, but accept threads are very important to overall performance and throughput. I now notice that the issue still exists (regardless of whether accept threads are enabled or disabled) and am testing further to confirm the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
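The edge-triggered behaviour that the theory above relies on can be observed outside ATS with a minimal Linux-only sketch. It uses only the standard `socketpair`, `epoll_create1`, `epoll_ctl` and `epoll_wait` calls; `EdgeDemoResult` and `run_edge_triggered_demo` are hypothetical names for this example, and nothing here reproduces the ATS code path itself.

```cpp
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

// Number of ready events reported by three successive epoll_wait() calls.
struct EdgeDemoResult {
  int first_wait;  // after one byte has arrived
  int second_wait; // byte still unread, no new I/O
  int third_wait;  // after another byte has arrived
};

EdgeDemoResult run_edge_triggered_demo()
{
  EdgeDemoResult r = {-1, -1, -1};
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
    return r;
  }
  int ep = epoll_create1(0);
  if (ep >= 0) {
    epoll_event ev = {};
    ev.events  = EPOLLIN | EPOLLET; // edge-triggered read interest
    ev.data.fd = sv[0];
    if (epoll_ctl(ep, EPOLL_CTL_ADD, sv[0], &ev) == 0) {
      epoll_event out = {};
      ssize_t n = write(sv[1], "x", 1); // the "client" sends a byte
      (void)n;
      // The readiness edge is reported once.
      r.first_wait = epoll_wait(ep, &out, 1, 0);
      // Nobody read the byte, yet the event is not reported again.
      r.second_wait = epoll_wait(ep, &out, 1, 0);
      n = write(sv[1], "y", 1); // new I/O from the "client"
      (void)n;
      // New data produces a new edge.
      r.third_wait = epoll_wait(ep, &out, 1, 0);
    }
    close(ep);
  }
  close(sv[0]);
  close(sv[1]);
  return r;
}
```

If the single notification is consumed while the read is disabled and the data is left unread, nothing wakes the reader again until the peer sends more bytes, which matches the stall between *ua_begin* and *ua_first_read* described above.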
[jira] [Commented] (TS-3179) ats_scoped_fd should not provide boolean operators
[ https://issues.apache.org/jira/browse/TS-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606208#comment-14606208 ] Alan M. Carroll commented on TS-3179: - Commit - 8a980c0b9a713426ad8f1b65d7c059e908cdf685. Typo in the commit message, it falsely indicates TS-3719. ats_scoped_fd should not provide boolean operators -- Key: TS-3179 URL: https://issues.apache.org/jira/browse/TS-3179 Project: Traffic Server Issue Type: Bug Components: Core Reporter: Alan M. Carroll Assignee: Alan M. Carroll Fix For: 5.2.0 The boolean operators for {{ats_scoped_fd}} are a compile problem in some supported compilers. They are also inappropriate for a file descriptor as they encourage uses which are different (and wrong) for file descriptors, which should be checked for being negative rather than directly as booleans. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
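Why a boolean test is wrong for a file descriptor can be shown with a small sketch. `ScopedFd` below is a hypothetical, simplified stand-in and not the actual `ats_scoped_fd` class; it assumes a C++11 compiler and POSIX `close`. Descriptor 0 is a valid descriptor but converts to `false`, while the invalid value -1 converts to `true`.

```cpp
#include <unistd.h>

// Hypothetical simplified scoped descriptor; deliberately has no
// operator bool or operator!.
class ScopedFd
{
public:
  explicit ScopedFd(int fd = -1) : fd_(fd) {}
  ~ScopedFd()
  {
    if (fd_ >= 0) {
      ::close(fd_);
    }
  }
  ScopedFd(const ScopedFd &) = delete;
  ScopedFd &operator=(const ScopedFd &) = delete;

  // The correct validity check for a descriptor: non-negative.
  bool is_valid() const { return fd_ >= 0; }

  int get() const { return fd_; }

  // Give up ownership without closing the descriptor.
  int release()
  {
    int fd = fd_;
    fd_    = -1;
    return fd;
  }

private:
  int fd_;
};
```

A caller writes `if (fd.is_valid())` (or compares `fd.get()` against 0) instead of `if (fd)`, so that descriptor 0 is treated as valid and -1 as invalid.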