[jira] [Resolved] (TS-5106) Assert on ParentSelection.h line 337, there is no selection_strategy for default parent proxy.
[ https://issues.apache.org/jira/browse/TS-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu resolved TS-5106. -- Resolution: Fixed Assignee: Oknet Xu Fix Version/s: 7.0.0 > Assert on ParentSelection.h line 337, there is no selection_strategy for > default parent proxy. > -- > > Key: TS-5106 > URL: https://issues.apache.org/jira/browse/TS-5106 > Project: Traffic Server > Issue Type: Bug > Components: Parent Proxy >Reporter: Oknet Xu >Assignee: Oknet Xu > Fix For: 7.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > {code} > FATAL: ParentSelection.h:337: failed assertion > `result->rec->selection_strategy != NULL` > traffic_server: Aborted (Signal sent by tkill() 21363 65534) > traffic_server - STACK TRACE: > ../../bin/traffic_server(crash_logger_invoke(int, siginfo_t*, > void*)+0x99)[0x4c1be9] > /lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7f8b2a8ba8d0] > /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f8b29b15107] > /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f8b29b164e8] > /usr/local/ats/lib/libtsutil.so.7(+0x2ac31)[0x7f8b2c1adc31] > /usr/local/ats/lib/libtsutil.so.7(+0x28c95)[0x7f8b2c1abc95] > ../../bin/traffic_server(ParentConfigParams::findParent(HttpRequestData*, > ParentResult*)+0x56e)[0x4f677e] > ../../bin/traffic_server(SocksEntry::init(Ptr&, > UnixNetVConnection*, unsigned char, unsigned char)+0x59b)[0x762dfb] > ../../bin/traffic_server(UnixNetProcessor::connect_re_internal(Continuation*, > sockaddr const*, NetVCOptions*)+0x251)[0x74eaf1] > ../../bin/traffic_server(HttpSM::do_http_server_open(bool)+0x850)[0x5bb5d0] > ../../bin/traffic_server(HttpSM::set_next_state()+0x4a3)[0x5bc7b3] > ../../bin/traffic_server(HttpSM::call_transact_and_set_next_state(void > (*)(HttpTransact::State*))+0x3a)[0x5ae28a] > ../../bin/traffic_server(HttpSM::state_cache_open_write(int, > void*)+0x1ce)[0x5b083e] > ../../bin/traffic_server(HttpSM::main_handler(int, void*)+0xc8)[0x5b6eb8] > 
../../bin/traffic_server(HttpCacheSM::state_cache_open_write(int, > void*)+0x1d1)[0x599221] > ../../bin/traffic_server(CacheVC::callcont(int)+0x5b)[0x6a1b2b] > ../../bin/traffic_server(Cache::open_write(Continuation*, ats::CryptoHash > const*, HTTPInfo*, long, ats::CryptoHash const*, CacheFragType, char const*, > int)+0x56b)[0x713f0b] > ../../bin/traffic_server(HttpCacheSM::open_write(HttpCacheKey const*, URL*, > HTTPHdr*, HTTPInfo*, long, bool, bool)+0xcd)[0x598fed] > ../../bin/traffic_server(HttpSM::do_cache_prepare_action(HttpCacheSM*, > HTTPInfo*, bool, bool)+0x15d)[0x5a90dd] > ../../bin/traffic_server(HttpSM::set_next_state()+0x8b6)[0x5bcbc6] > ../../bin/traffic_server(HttpSM::call_transact_and_set_next_state(void > (*)(HttpTransact::State*))+0x3a)[0x5ae28a] > ../../bin/traffic_server(HttpSM::handle_api_return()+0xe7)[0x5b95f7] > ../../bin/traffic_server(HttpSM::set_next_state()+0x16b)[0x5bc47b] > ../../bin/traffic_server(HttpSM::call_transact_and_set_next_state(void > (*)(HttpTransact::State*))+0x3a)[0x5ae28a] > ../../bin/traffic_server(HttpSM::state_hostdb_lookup(int, > void*)+0xa0)[0x5ba540] > ../../bin/traffic_server(HttpSM::main_handler(int, void*)+0xc8)[0x5b6eb8] > ../../bin/traffic_server[0x690386] > ../../bin/traffic_server(HostDBContinuation::do_dns()+0x1d7)[0x691fc7] > ../../bin/traffic_server(HostDBContinuation::probeEvent(int, > Event*)+0x228)[0x6946a8] > ../../bin/traffic_server(EThread::process_event(Event*, int)+0x8d)[0x77980d] > ../../bin/traffic_server(EThread::execute()+0x73d)[0x77a4cd] > ../../bin/traffic_server[0x778c4a] > /lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7f8b2a8b30a4] > /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f8b29bc607d] > Aborted > {code} > The default parent proxy is set by ParentRecord::DefaultInit(char *val), but > it is not create selection_strategy for default parent proxy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (TS-5105) Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal
[ https://issues.apache.org/jira/browse/TS-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu resolved TS-5105. -- Resolution: Fixed > Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal > - > > Key: TS-5105 > URL: https://issues.apache.org/jira/browse/TS-5105 > Project: Traffic Server > Issue Type: Bug > Components: SOCKS >Reporter: Oknet Xu >Assignee: Oknet Xu > Fix For: 7.0.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > {code} > traffic_server: using root directory '/usr/local' > traffic_server: Abort trap > traffic_server - STACK TRACE: > 0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at > /usr/local/bin/traffic_server > 0x80275ab37at /lib/libthr.so.3 > 0x80275a22c at /lib/libthr.so.3 > [Dec 21 17:27:34.840] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (Socks) Socks Version 4 > [Dec 21 17:27:34.840] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (Socks) server connect timeout: 10 socks respnonse > timeout 100 > [Dec 21 17:27:34.840] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (SocksProxy) Read SocksProxy info: accept_enabled = > 0 accept_port = 1080 http_port = 80 > [Dec 21 17:27:34.841] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (Socks) Socks Config File: > /usr/local/etc/trafficserver/socks.config > [Dec 21 17:27:34.842] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (Socks) Socks Turned on > [Dec 21 17:27:35.052] Server {0x804008000} DEBUG: (connect_re_internal)> (Socks) Using Socks ip: 216.58.192.142:80 > Assertion failed: (ats_is_ip4(_addr)), function init, file Socks.cc, > line 67. > traffic_server: using root directory '/usr/local' > traffic_server: Abort trap > traffic_server - STACK TRACE: > 0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at > /usr/local/bin/traffic_server > 0x80275ab37 at /lib/libthr.so.3 > 0x80275a22c at /lib/libthr.so.3 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (TS-5105) Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal
[ https://issues.apache.org/jira/browse/TS-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on TS-5105 started by Oknet Xu. > Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal > - > > Key: TS-5105 > URL: https://issues.apache.org/jira/browse/TS-5105 > Project: Traffic Server > Issue Type: Bug > Components: SOCKS >Reporter: Oknet Xu >Assignee: Oknet Xu > Fix For: 7.0.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > {code} > traffic_server: using root directory '/usr/local' > traffic_server: Abort trap > traffic_server - STACK TRACE: > 0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at > /usr/local/bin/traffic_server > 0x80275ab37at /lib/libthr.so.3 > 0x80275a22c at /lib/libthr.so.3 > [Dec 21 17:27:34.840] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (Socks) Socks Version 4 > [Dec 21 17:27:34.840] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (Socks) server connect timeout: 10 socks respnonse > timeout 100 > [Dec 21 17:27:34.840] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (SocksProxy) Read SocksProxy info: accept_enabled = > 0 accept_port = 1080 http_port = 80 > [Dec 21 17:27:34.841] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (Socks) Socks Config File: > /usr/local/etc/trafficserver/socks.config > [Dec 21 17:27:34.842] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (Socks) Socks Turned on > [Dec 21 17:27:35.052] Server {0x804008000} DEBUG: (connect_re_internal)> (Socks) Using Socks ip: 216.58.192.142:80 > Assertion failed: (ats_is_ip4(_addr)), function init, file Socks.cc, > line 67. > traffic_server: using root directory '/usr/local' > traffic_server: Abort trap > traffic_server - STACK TRACE: > 0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at > /usr/local/bin/traffic_server > 0x80275ab37 at /lib/libthr.so.3 > 0x80275a22c at /lib/libthr.so.3 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-5105) Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal
[ https://issues.apache.org/jira/browse/TS-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-5105: - Assignee: Oknet Xu Backport to Version: 6.2.1 Fix Version/s: 7.0.0 > Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal > - > > Key: TS-5105 > URL: https://issues.apache.org/jira/browse/TS-5105 > Project: Traffic Server > Issue Type: Bug > Components: SOCKS >Reporter: Oknet Xu >Assignee: Oknet Xu > Fix For: 7.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > {code} > traffic_server: using root directory '/usr/local' > traffic_server: Abort trap > traffic_server - STACK TRACE: > 0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at > /usr/local/bin/traffic_server > 0x80275ab37at /lib/libthr.so.3 > 0x80275a22c at /lib/libthr.so.3 > [Dec 21 17:27:34.840] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (Socks) Socks Version 4 > [Dec 21 17:27:34.840] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (Socks) server connect timeout: 10 socks respnonse > timeout 100 > [Dec 21 17:27:34.840] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (SocksProxy) Read SocksProxy info: accept_enabled = > 0 accept_port = 1080 http_port = 80 > [Dec 21 17:27:34.841] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (Socks) Socks Config File: > /usr/local/etc/trafficserver/socks.config > [Dec 21 17:27:34.842] Server {0x804006400} DEBUG: (loadSocksConfiguration)> (Socks) Socks Turned on > [Dec 21 17:27:35.052] Server {0x804008000} DEBUG: (connect_re_internal)> (Socks) Using Socks ip: 216.58.192.142:80 > Assertion failed: (ats_is_ip4(_addr)), function init, file Socks.cc, > line 67. 
> traffic_server: using root directory '/usr/local' > traffic_server: Abort trap > traffic_server - STACK TRACE: > 0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at > /usr/local/bin/traffic_server > 0x80275ab37 at /lib/libthr.so.3 > 0x80275a22c at /lib/libthr.so.3 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-5106) Assert on ParentSelection.h line 337, there is no selection_strategy for default parent proxy.
Oknet Xu created TS-5106:

Summary: Assert on ParentSelection.h line 337, there is no selection_strategy for default parent proxy.
Key: TS-5106
URL: https://issues.apache.org/jira/browse/TS-5106
Project: Traffic Server
Issue Type: Bug
Components: Parent Proxy
Reporter: Oknet Xu

{code}
FATAL: ParentSelection.h:337: failed assertion `result->rec->selection_strategy != NULL`
traffic_server: Aborted (Signal sent by tkill() 21363 65534)
traffic_server - STACK TRACE:
../../bin/traffic_server(crash_logger_invoke(int, siginfo_t*, void*)+0x99)[0x4c1be9]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7f8b2a8ba8d0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f8b29b15107]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f8b29b164e8]
/usr/local/ats/lib/libtsutil.so.7(+0x2ac31)[0x7f8b2c1adc31]
/usr/local/ats/lib/libtsutil.so.7(+0x28c95)[0x7f8b2c1abc95]
../../bin/traffic_server(ParentConfigParams::findParent(HttpRequestData*, ParentResult*)+0x56e)[0x4f677e]
../../bin/traffic_server(SocksEntry::init(Ptr&, UnixNetVConnection*, unsigned char, unsigned char)+0x59b)[0x762dfb]
../../bin/traffic_server(UnixNetProcessor::connect_re_internal(Continuation*, sockaddr const*, NetVCOptions*)+0x251)[0x74eaf1]
../../bin/traffic_server(HttpSM::do_http_server_open(bool)+0x850)[0x5bb5d0]
../../bin/traffic_server(HttpSM::set_next_state()+0x4a3)[0x5bc7b3]
../../bin/traffic_server(HttpSM::call_transact_and_set_next_state(void (*)(HttpTransact::State*))+0x3a)[0x5ae28a]
../../bin/traffic_server(HttpSM::state_cache_open_write(int, void*)+0x1ce)[0x5b083e]
../../bin/traffic_server(HttpSM::main_handler(int, void*)+0xc8)[0x5b6eb8]
../../bin/traffic_server(HttpCacheSM::state_cache_open_write(int, void*)+0x1d1)[0x599221]
../../bin/traffic_server(CacheVC::callcont(int)+0x5b)[0x6a1b2b]
../../bin/traffic_server(Cache::open_write(Continuation*, ats::CryptoHash const*, HTTPInfo*, long, ats::CryptoHash const*, CacheFragType, char const*, int)+0x56b)[0x713f0b]
../../bin/traffic_server(HttpCacheSM::open_write(HttpCacheKey const*, URL*, HTTPHdr*, HTTPInfo*, long, bool, bool)+0xcd)[0x598fed]
../../bin/traffic_server(HttpSM::do_cache_prepare_action(HttpCacheSM*, HTTPInfo*, bool, bool)+0x15d)[0x5a90dd]
../../bin/traffic_server(HttpSM::set_next_state()+0x8b6)[0x5bcbc6]
../../bin/traffic_server(HttpSM::call_transact_and_set_next_state(void (*)(HttpTransact::State*))+0x3a)[0x5ae28a]
../../bin/traffic_server(HttpSM::handle_api_return()+0xe7)[0x5b95f7]
../../bin/traffic_server(HttpSM::set_next_state()+0x16b)[0x5bc47b]
../../bin/traffic_server(HttpSM::call_transact_and_set_next_state(void (*)(HttpTransact::State*))+0x3a)[0x5ae28a]
../../bin/traffic_server(HttpSM::state_hostdb_lookup(int, void*)+0xa0)[0x5ba540]
../../bin/traffic_server(HttpSM::main_handler(int, void*)+0xc8)[0x5b6eb8]
../../bin/traffic_server[0x690386]
../../bin/traffic_server(HostDBContinuation::do_dns()+0x1d7)[0x691fc7]
../../bin/traffic_server(HostDBContinuation::probeEvent(int, Event*)+0x228)[0x6946a8]
../../bin/traffic_server(EThread::process_event(Event*, int)+0x8d)[0x77980d]
../../bin/traffic_server(EThread::execute()+0x73d)[0x77a4cd]
../../bin/traffic_server[0x778c4a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7f8b2a8b30a4]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f8b29bc607d]
Aborted
{code}

The default parent proxy is set up by ParentRecord::DefaultInit(char *val), but that function does not create a selection_strategy for the default parent proxy.
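Note on TS-5106: the sketch below illustrates the kind of fix the description implies, where the default-record initialization path also installs a selection strategy. It is not the Traffic Server code. `SelectionStrategy`, `FirstParentStrategy`, `Record`, `default_init` and `find_parent` are all hypothetical names chosen for illustration, and the ';'-separated parent list is an assumption.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration; not the real Traffic Server classes.
struct SelectionStrategy {
  virtual ~SelectionStrategy() = default;
  virtual std::string pick(const std::string &parents) const = 0;
};

struct FirstParentStrategy : SelectionStrategy {
  std::string pick(const std::string &parents) const override {
    // Return the first entry of an assumed ';'-separated parent list.
    return parents.substr(0, parents.find(';'));
  }
};

struct Record {
  std::string parents;
  std::unique_ptr<SelectionStrategy> selection_strategy;

  // Default-record path: besides storing the parents, install a strategy,
  // so that later lookups never see a null selection_strategy.
  bool default_init(const std::string &val) {
    if (val.empty()) {
      return false;
    }
    parents = val;
    selection_strategy.reset(new FirstParentStrategy());
    return true;
  }
};

// Lookup that reports "no parent" instead of asserting on a missing strategy.
std::string find_parent(const Record &rec) {
  if (!rec.selection_strategy) {
    return std::string();
  }
  return rec.selection_strategy->pick(rec.parents);
}
```

Whether the lookup should tolerate a missing strategy or keep asserting is a design choice; the sketch shows the tolerant variant only to make the failure mode visible.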
[jira] [Created] (TS-5105) Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal
Oknet Xu created TS-5105:

Summary: Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal
Key: TS-5105
URL: https://issues.apache.org/jira/browse/TS-5105
Project: Traffic Server
Issue Type: Bug
Components: SOCKS
Reporter: Oknet Xu

{code}
traffic_server: using root directory '/usr/local'
traffic_server: Abort trap
traffic_server - STACK TRACE:
0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at /usr/local/bin/traffic_server
0x80275ab37 at /lib/libthr.so.3
0x80275a22c at /lib/libthr.so.3
[Dec 21 17:27:34.840] Server {0x804006400} DEBUG: (Socks) Socks Version 4
[Dec 21 17:27:34.840] Server {0x804006400} DEBUG: (Socks) server connect timeout: 10 socks respnonse timeout 100
[Dec 21 17:27:34.840] Server {0x804006400} DEBUG: (SocksProxy) Read SocksProxy info: accept_enabled = 0 accept_port = 1080 http_port = 80
[Dec 21 17:27:34.841] Server {0x804006400} DEBUG: (Socks) Socks Config File: /usr/local/etc/trafficserver/socks.config
[Dec 21 17:27:34.842] Server {0x804006400} DEBUG: (Socks) Socks Turned on
[Dec 21 17:27:35.052] Server {0x804008000} DEBUG: (Socks) Using Socks ip: 216.58.192.142:80
Assertion failed: (ats_is_ip4(_addr)), function init, file Socks.cc, line 67.
traffic_server: using root directory '/usr/local'
traffic_server: Abort trap
traffic_server - STACK TRACE:
0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at /usr/local/bin/traffic_server
0x80275ab37 at /lib/libthr.so.3
0x80275a22c at /lib/libthr.so.3
{code}
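Note on TS-5105: the assertion fires because the address handed to the SOCKS setup was never filled in, so its family is not IPv4. A minimal sketch of that condition follows, assuming a POSIX system. `Conn`, `is_ip4` and `prepare_socks4` are hypothetical names and do not mirror the real connect_re_internal or SocksEntry code.

```cpp
#include <cstring>
#include <netinet/in.h>
#include <sys/socket.h>

// Hypothetical helper: true only for a non-null address whose family is IPv4.
inline bool is_ip4(const sockaddr *addr) {
  return addr != nullptr && addr->sa_family == AF_INET;
}

// Hypothetical connection object holding the remote address.
struct Conn {
  sockaddr_storage remote_addr;
};

// Copy the target into remote_addr before any SOCKS4 setup runs.
// Returns false instead of asserting when the target is not a valid IPv4 address.
inline bool prepare_socks4(Conn &c, const sockaddr *target) {
  std::memset(&c.remote_addr, 0, sizeof(c.remote_addr)); // family is unspecified here
  if (!is_ip4(target)) {
    return false;
  }
  std::memcpy(&c.remote_addr, target, sizeof(sockaddr_in));
  return is_ip4(reinterpret_cast<const sockaddr *>(&c.remote_addr));
}
```

A zeroed `sockaddr_storage` has an unspecified family, which is the state an IPv4 check would reject if the copy step were skipped.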
[jira] [Updated] (TS-5104) Wrong retry count for OSDNSLookup if url_expansions enabled
[ https://issues.apache.org/jira/browse/TS-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-5104: - Affects Version/s: 6.2.0 > Wrong retry count for OSDNSLookup if url_expansions enabled > --- > > Key: TS-5104 > URL: https://issues.apache.org/jira/browse/TS-5104 > Project: Traffic Server > Issue Type: Bug > Components: HTTP >Affects Versions: 6.2.0 >Reporter: Oknet Xu > Fix For: 6.2.1 > > > {code} > 1674HttpTransact::OSDNSLookup(State *s) > 1675 { > 1676static int max_dns_lookups = 3 + > s->http_config_param->num_url_expansions; > 1677++s->dns_info.attempts; > {code} > The max_dns_lookups include : > - 1 for origin domain, for example: oknet > - 1 for default expansion, for example: www.oknet.com > - n for url_expansions_string list, for example: oknet.org, oknet.net > Thus, max_dns_lookups should be ```2 + > s->http_config_param->num_url_expansions``` > {code} > HttpTransact::HostNameExpansionError_t > 6614 HttpTransact::try_to_expand_host_name(State *s) > a165134@andrewhsuInitial commit. > andrewhsu authored on 30 Oct 2009 > 6615 { > 6616static int max_dns_lookups = 2 + > s->http_config_param->num_url_expansions; > 6617static int last_expansion = max_dns_lookups - 2; > {code} > In the HttpTransact::try_to_expand_host_name, it is correct. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-5104) Wrong retry count for OSDNSLookup if url_expansions enabled
[ https://issues.apache.org/jira/browse/TS-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-5104: - Fix Version/s: 6.2.1 > Wrong retry count for OSDNSLookup if url_expansions enabled > --- > > Key: TS-5104 > URL: https://issues.apache.org/jira/browse/TS-5104 > Project: Traffic Server > Issue Type: Bug > Components: HTTP >Affects Versions: 6.2.0 >Reporter: Oknet Xu > Fix For: 6.2.1 > > > {code} > 1674HttpTransact::OSDNSLookup(State *s) > 1675 { > 1676static int max_dns_lookups = 3 + > s->http_config_param->num_url_expansions; > 1677++s->dns_info.attempts; > {code} > The max_dns_lookups include : > - 1 for origin domain, for example: oknet > - 1 for default expansion, for example: www.oknet.com > - n for url_expansions_string list, for example: oknet.org, oknet.net > Thus, max_dns_lookups should be ```2 + > s->http_config_param->num_url_expansions``` > {code} > HttpTransact::HostNameExpansionError_t > 6614 HttpTransact::try_to_expand_host_name(State *s) > a165134@andrewhsuInitial commit. > andrewhsu authored on 30 Oct 2009 > 6615 { > 6616static int max_dns_lookups = 2 + > s->http_config_param->num_url_expansions; > 6617static int last_expansion = max_dns_lookups - 2; > {code} > In the HttpTransact::try_to_expand_host_name, it is correct. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4583) CID 1021958: Null-pointer dereference after check
[ https://issues.apache.org/jira/browse/TS-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4583: - Fix Version/s: (was: 6.2.1) 7.1.0 > CID 1021958: Null-pointer dereference after check > - > > Key: TS-4583 > URL: https://issues.apache.org/jira/browse/TS-4583 > Project: Traffic Server > Issue Type: Bug >Reporter: Jari Alhonen >Assignee: Jari Alhonen > Fix For: 7.1.0 > > Time Spent: 2h > Remaining Estimate: 0h > > HttpSM.cc checks server_entry against NULL, suggesting it might be NULL, > before calling release_server_session(), which dereferences server_entry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4583) CID 1021958: Null-pointer dereference after check
[ https://issues.apache.org/jira/browse/TS-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4583: - Affects Version/s: (was: 6.2.0) > CID 1021958: Null-pointer dereference after check > - > > Key: TS-4583 > URL: https://issues.apache.org/jira/browse/TS-4583 > Project: Traffic Server > Issue Type: Bug >Reporter: Jari Alhonen >Assignee: Jari Alhonen > Fix For: 7.1.0 > > Time Spent: 2h > Remaining Estimate: 0h > > HttpSM.cc checks server_entry against NULL, suggesting it might be NULL, > before calling release_server_session(), which dereferences server_entry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4583) CID 1021958: Null-pointer dereference after check
[ https://issues.apache.org/jira/browse/TS-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4583: - Affects Version/s: 6.2.0 > CID 1021958: Null-pointer dereference after check > - > > Key: TS-4583 > URL: https://issues.apache.org/jira/browse/TS-4583 > Project: Traffic Server > Issue Type: Bug >Affects Versions: 6.2.0 >Reporter: Jari Alhonen >Assignee: Jari Alhonen > Fix For: 6.2.1 > > Time Spent: 2h > Remaining Estimate: 0h > > HttpSM.cc checks server_entry against NULL, suggesting it might be NULL, > before calling release_server_session(), which dereferences server_entry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4583) CID 1021958: Null-pointer dereference after check
[ https://issues.apache.org/jira/browse/TS-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4583: - Fix Version/s: (was: 7.1.0) 6.2.1 > CID 1021958: Null-pointer dereference after check > - > > Key: TS-4583 > URL: https://issues.apache.org/jira/browse/TS-4583 > Project: Traffic Server > Issue Type: Bug >Affects Versions: 6.2.0 >Reporter: Jari Alhonen >Assignee: Jari Alhonen > Fix For: 6.2.1 > > Time Spent: 2h > Remaining Estimate: 0h > > HttpSM.cc checks server_entry against NULL, suggesting it might be NULL, > before calling release_server_session(), which dereferences server_entry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
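Note on TS-4583: the defect class reported here is a null check followed by a call that dereferences the same pointer outside the guarded branch. The sketch below shows the guarded form, under the assumption that the releasing function dereferences the pointer unconditionally. `ServerEntry`, `StateMachine`, `releases` and `cleanup` are hypothetical names, not the real HttpSM members.

```cpp
// Hypothetical stand-ins for illustration; not the real HttpSM types.
struct ServerEntry {
  bool in_tunnel = false;
};

struct StateMachine {
  ServerEntry *server_entry = nullptr;
  int releases = 0;

  // Dereferences server_entry unconditionally, so callers must guard the call.
  void release_server_session() {
    server_entry->in_tunnel = false;
    ++releases;
  }

  // The dereferencing call stays inside the branch that proved the pointer non-null.
  void cleanup() {
    if (server_entry != nullptr) {
      release_server_session();
      server_entry = nullptr;
    }
  }
};
```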
[jira] [Created] (TS-5104) Wrong retry count for OSDNSLookup if url_expansions enabled
Oknet Xu created TS-5104:

Summary: Wrong retry count for OSDNSLookup if url_expansions enabled
Key: TS-5104
URL: https://issues.apache.org/jira/browse/TS-5104
Project: Traffic Server
Issue Type: Bug
Components: HTTP
Reporter: Oknet Xu

{code}
HttpTransact::OSDNSLookup(State *s)
{
  static int max_dns_lookups = 3 + s->http_config_param->num_url_expansions;
  ++s->dns_info.attempts;
{code}

max_dns_lookups should count:
- 1 for the original domain, for example: oknet
- 1 for the default expansion, for example: www.oknet.com
- n for the url_expansions_string list, for example: oknet.org, oknet.net

Thus, max_dns_lookups should be ```2 + s->http_config_param->num_url_expansions```

{code}
HttpTransact::HostNameExpansionError_t
HttpTransact::try_to_expand_host_name(State *s)
{
  static int max_dns_lookups = 2 + s->http_config_param->num_url_expansions;
  static int last_expansion = max_dns_lookups - 2;
{code}

HttpTransact::try_to_expand_host_name already uses the correct value.
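Note on TS-5104: the count in the description can be checked by listing the candidate names explicitly. The sketch below is only an illustration of that arithmetic. `lookup_candidates` and `max_dns_lookups` are hypothetical helpers, and the way names are composed ("www." + host + ".com", host + "." + suffix) is inferred from the examples in the description rather than taken from the Traffic Server source.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: the names that would be tried for an unqualified host.
std::vector<std::string> lookup_candidates(const std::string &host,
                                           const std::vector<std::string> &expansions) {
  std::vector<std::string> out;
  out.push_back(host);                     // 1: the original domain
  out.push_back("www." + host + ".com");   // 1: the default expansion
  for (const std::string &suffix : expansions) {
    out.push_back(host + "." + suffix);    // n: one per configured expansion
  }
  return out;
}

// The bound proposed in the issue: 2 fixed lookups plus one per expansion.
int max_dns_lookups(int num_url_expansions) {
  return 2 + num_url_expansions;
}
```

With the expansions `org` and `net`, the candidate list has four entries, which matches `2 + 2` and not `3 + 2`.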
[jira] [Updated] (TS-5103) Always tunnel non-keepalive HTTP request if tr-pass enabled
[ https://issues.apache.org/jira/browse/TS-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-5103: - Summary: Always tunnel non-keepalive HTTP request if tr-pass enabled (was: Always tunnel pipeline non-keepalive HTTP request if tr-pass enabled) > Always tunnel non-keepalive HTTP request if tr-pass enabled > --- > > Key: TS-5103 > URL: https://issues.apache.org/jira/browse/TS-5103 > Project: Traffic Server > Issue Type: Bug > Components: HTTP >Reporter: Oknet Xu > Time Spent: 10m > Remaining Estimate: 0h > > Should use ua_buffer_reader instead of ua_raw_buffer_reader. > {code} > // If we had a GET request that has data after the > // get request, do blind tunnel > } else if (state == PARSE_DONE && > t_state.hdr_info.client_request.method_get_wksidx() == HTTP_WKSIDX_GET && >ua_raw_buffer_reader->read_avail() > 0 && > !t_state.hdr_info.client_request.is_keep_alive_set()) { > do_blind_tunnel = true; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-5103) Always tunnel pipeline non-keepalive HTTP request if tr-pass enabled
Oknet Xu created TS-5103:

Summary: Always tunnel pipeline non-keepalive HTTP request if tr-pass enabled
Key: TS-5103
URL: https://issues.apache.org/jira/browse/TS-5103
Project: Traffic Server
Issue Type: Bug
Components: HTTP
Reporter: Oknet Xu

Should use ua_buffer_reader instead of ua_raw_buffer_reader.

{code}
  // If we had a GET request that has data after the
  // get request, do blind tunnel
} else if (state == PARSE_DONE && t_state.hdr_info.client_request.method_get_wksidx() == HTTP_WKSIDX_GET &&
           ua_raw_buffer_reader->read_avail() > 0 && !t_state.hdr_info.client_request.is_keep_alive_set()) {
  do_blind_tunnel = true;
}
{code}
[jira] [Commented] (TS-5082) Compile error: undefined reference to IOBufferReader::is_read_avail_more_than
[ https://issues.apache.org/jira/browse/TS-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731084#comment-15731084 ]

Oknet Xu commented on TS-5082:

From my understanding:
- {code}TS_INLINE{code} is defined as {code}inline{code} in lib/ts/ink_apidefs.h, only for libts. In libts.so, functions declared with TS_INLINE are inline functions.
- {code}TS_INLINE{code} is defined as empty in iocore/*/Inline.cc. In the sub-module static libraries (*.a), functions declared with TS_INLINE are not inline functions.
I don't know the reasoning behind the TS_INLINE macro; this change only fixes the compile error. [~zwoop]

> Compile error: undefined reference to IOBufferReader::is_read_avail_more_than
>
> Key: TS-5082
> URL: https://issues.apache.org/jira/browse/TS-5082
> Project: Traffic Server
> Issue Type: Bug
> Components: Core
> Reporter: Oknet Xu
> Time Spent: 40m
> Remaining Estimate: 0h
>
> autoreconf -fi
> ./configure
> make
> got some error message like below:
> {code}
> CXXLD traffic_server
> undefined reference to IOBufferReader::is_read_avail_more_than(...)
> {code}
> file: P_IOBuffer.h
> {code}
> inline bool
> IOBufferReader::is_read_avail_more_than(int64_t size)
> {code}
> should be
> {code}
> TS_INLINE bool
> IOBufferReader::is_read_avail_more_than(int64_t size)
> {code}
[jira] [Comment Edited] (TS-5082) Compile error: undefined reference to IOBufferReader::is_read_avail_more_than
[ https://issues.apache.org/jira/browse/TS-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731084#comment-15731084 ]

Oknet Xu edited comment on TS-5082 at 12/8/16 4:49 AM:

From my understanding:
- *TS_INLINE* is defined as *inline* in lib/ts/ink_apidefs.h, only for libts. In libts.so, functions declared with *TS_INLINE* are inline functions.
- *TS_INLINE* is defined as empty in iocore/\*/Inline.cc. In the sub-module static libraries (\*.a), functions declared with *TS_INLINE* are *NOT* inline functions.
I don't know the reasoning behind the TS_INLINE macro; this change only fixes the compile error. [~zwoop]

was (Author: oknet):
From my understand:
- The {code}TS_INLINE{code} defined as {code}inline{code} in lib/ts/ink_apidefs.h only for libts, for libts.so these functions declared by TS_INLINE is inline function.
- And {code}TS_INLINE{code} defined as empty in iocore/*/Inline.cc, for a sub module static libs (*.a) these functions declared by TS_INLINE is not inline function.
I don't know the reasons about the TS_INLINE macro, this case only fix the compile error. [~zwoop]

> Compile error: undefined reference to IOBufferReader::is_read_avail_more_than
>
> Key: TS-5082
> URL: https://issues.apache.org/jira/browse/TS-5082
> Project: Traffic Server
> Issue Type: Bug
> Components: Core
> Reporter: Oknet Xu
> Time Spent: 40m
> Remaining Estimate: 0h
>
> autoreconf -fi
> ./configure
> make
> got some error message like below:
> {code}
> CXXLD traffic_server
> undefined reference to IOBufferReader::is_read_avail_more_than(...)
> {code}
> file: P_IOBuffer.h
> {code}
> inline bool
> IOBufferReader::is_read_avail_more_than(int64_t size)
> {code}
> should be
> {code}
> TS_INLINE bool
> IOBufferReader::is_read_avail_more_than(int64_t size)
> {code}
[jira] [Created] (TS-5082) Compile error: undefined reference to IOBufferReader::is_read_avail_more_than
Oknet Xu created TS-5082:

Summary: Compile error: undefined reference to IOBufferReader::is_read_avail_more_than
Key: TS-5082
URL: https://issues.apache.org/jira/browse/TS-5082
Project: Traffic Server
Issue Type: Bug
Components: Core
Reporter: Oknet Xu

Building with

autoreconf -fi
./configure
make

fails with an error message like the one below:
{code}
CXXLD traffic_server
undefined reference to IOBufferReader::is_read_avail_more_than(...)
{code}
In file P_IOBuffer.h,
{code}
inline bool
IOBufferReader::is_read_avail_more_than(int64_t size)
{code}
should be
{code}
TS_INLINE bool
IOBufferReader::is_read_avail_more_than(int64_t size)
{code}
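Note on TS-5082: a minimal sketch of the macro convention this issue relies on, assuming a header that defaults `TS_INLINE` to `inline` and lets one translation unit per library predefine it as empty to emit out-of-line definitions. The `Reader` class is a hypothetical stand-in for IOBufferReader. Only the macro name comes from the issue.

```cpp
#include <cstdint>

// Assumed convention: default to `inline`. A translation unit that wants
// out-of-line symbols defines TS_INLINE as empty before including the header.
#ifndef TS_INLINE
#define TS_INLINE inline
#endif

// Hypothetical stand-in for the reader class named in the issue.
class Reader {
public:
  explicit Reader(int64_t avail) : avail_(avail) {}
  bool is_read_avail_more_than(int64_t size) const;

private:
  int64_t avail_;
};

// Using the macro instead of a hard-coded `inline` keeps this definition
// under the control of whichever translation unit sets TS_INLINE.
TS_INLINE bool
Reader::is_read_avail_more_than(int64_t size) const
{
  return avail_ > size;
}
```

With a hard-coded `inline`, a translation unit that defines `TS_INLINE` as empty would still not emit an external symbol for this function, which is consistent with the undefined reference reported at link time.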
[jira] [Commented] (TS-4206) Stalled connections show a client request but no HTTP response
[ https://issues.apache.org/jira/browse/TS-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721202#comment-15721202 ] Oknet Xu commented on TS-4206: -- please try to fix it with TS-5076 patch. > Stalled connections show a client request but no HTTP response > -- > > Key: TS-4206 > URL: https://issues.apache.org/jira/browse/TS-4206 > Project: Traffic Server > Issue Type: Bug > Components: Core, HTTP >Reporter: Eric Sproul >Assignee: Bryan Call >Priority: Blocker > Labels: A, regression > Fix For: 7.1.0 > > > Have been discussing this one on IRC but wanted to capture everything here > since it seems like a fairly serious issue. Since upgrading from 5.3.2 to > 6.1.1 we have witnessed connections that, from the client perspective, seem > to stall after the client sends the request. TS logs the connection only > after it seemingly hits {{proxy.config.net.default_inactivity_timeout}} (5 > minutes), but logs a response code of 000, despite the presence of a request > (e.g. GET with a request URL logged). > For the time being we have failed back to 5.3.2 but I was able to capture a > sample of this situation in the slow log. [This > paste|http://apaste.info/ds1] shows the slow log as well as the corresponding > squid.blog entry (default format). > This issue feels similar to TS-3456 but we are not using tunneling, though we > are using SSL/TLS in this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-5076) NetVC is lost from read or write enable_list
[ https://issues.apache.org/jira/browse/TS-5076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-5076: - Backport to Version: 5.3.3, 6.2.1, 7.0.1 > NetVC is lost from read or write enable_list > > > Key: TS-5076 > URL: https://issues.apache.org/jira/browse/TS-5076 > Project: Traffic Server > Issue Type: Bug > Components: Network >Reporter: Oknet Xu >Assignee: Oknet Xu > Fix For: 7.1.0 > > Time Spent: 50m > Remaining Estimate: 0h > > The related code here: > {code} > void > UnixNetVConnection::reenable(VIO *vio) > { > if (STATE_FROM_VIO(vio)->enabled) > return; > set_enabled(vio); > if (!thread) > return; > EThread *t = vio->mutex->thread_holding; > ink_assert(t == this_ethread()); > ink_release_assert(!closed); > if (nh->mutex->thread_holding == t) { > ... > MUTEX_TRY_LOCK(lock, nh->mutex, t); > if (!lock.is_locked()) { > if (vio == ) { > if (!read.in_enabled_list) {// ---> the condition check > is not atomic > read.in_enabled_list = 1; // ---> the variable set is not > atomic > nh->read_enable_list.push(this); > } > } else { > if (!write.in_enabled_list) { // ---> the write side > write.in_enabled_list = 1; // ---> the write side > nh->write_enable_list.push(this); > } > } > if (nh->trigger_event && nh->trigger_event->ethread->signal_hook) > nh->trigger_event->ethread->signal_hook(nh->trigger_event->ethread); > } else { > ... > } > } > } > {code} > Due to the unstable condition check code, the nh->read_enable_list.push(this) > would push a netvc into atomic queue that is already inside a queue. > It leads the elements in atomic queue after the netvc will be lost. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-5076) NetVC is lost from read or write enable_list
Oknet Xu created TS-5076:
-------------------------

Summary: NetVC is lost from read or write enable_list
Key: TS-5076
URL: https://issues.apache.org/jira/browse/TS-5076
Project: Traffic Server
Issue Type: Bug
Components: Core
Reporter: Oknet Xu

The related code is here:
{code}
void
UnixNetVConnection::reenable(VIO *vio)
{
  if (STATE_FROM_VIO(vio)->enabled)
    return;
  set_enabled(vio);
  if (!thread)
    return;
  EThread *t = vio->mutex->thread_holding;
  ink_assert(t == this_ethread());
  ink_release_assert(!closed);
  if (nh->mutex->thread_holding == t) {
    ...
    MUTEX_TRY_LOCK(lock, nh->mutex, t);
    if (!lock.is_locked()) {
      if (vio == &read.vio) {
        if (!read.in_enabled_list) { // ---> the condition check is not atomic
          read.in_enabled_list = 1;  // ---> the variable set is not atomic
          nh->read_enable_list.push(this);
        }
      } else {
        if (!write.in_enabled_list) { // ---> the write side
          write.in_enabled_list = 1;  // ---> the write side
          nh->write_enable_list.push(this);
        }
      }
      if (nh->trigger_event && nh->trigger_event->ethread->signal_hook)
        nh->trigger_event->ethread->signal_hook(nh->trigger_event->ethread);
    } else {
      ...
    }
  }
}
{code}
Because the check and the set of in_enabled_list are not atomic, nh->read_enable_list.push(this) can push a NetVC onto the atomic queue while it is already in that queue.
When that happens, the elements that follow the NetVC in the atomic queue are lost.
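To illustrate the race described in TS-5076, here is a minimal sketch of one way to make the "check, then set" on in_enabled_list a single atomic step. It is not taken from the TS-5076 patch. ToyNetState, ToyEnableList, try_mark_enabled() and enable_once() are hypothetical names invented for this example, and the std::vector stands in for the real atomic queue only so the example stays self-contained.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the per-direction state of a NetVC (not the ATS type).
struct ToyNetState {
  std::atomic<bool> in_enabled_list{false};
};

// Hypothetical stand-in for an enable list. A plain vector is NOT thread-safe;
// it only records how many times a state was pushed.
struct ToyEnableList {
  std::vector<ToyNetState *> items;
  void push(ToyNetState *s) { items.push_back(s); }
};

// Returns true only for the caller that flipped the flag from false to true.
// exchange() reads the old value and writes the new one as one atomic step,
// so two racing callers cannot both observe "not yet in the list".
inline bool
try_mark_enabled(ToyNetState &s)
{
  return !s.in_enabled_list.exchange(true);
}

// Pushes the state at most once, however often it is called.
inline void
enable_once(ToyEnableList &list, ToyNetState &s)
{
  if (try_mark_enabled(s)) {
    list.push(&s);
  }
}
```

Calling enable_once() twice on the same state leaves exactly one entry in the list, which is the property the non-atomic check in the quoted code cannot guarantee when two threads race.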
[jira] [Closed] (TS-4910) We should get vc->mutex before do_io when the vc is acquired from session pool
[ https://issues.apache.org/jira/browse/TS-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu closed TS-4910.
------------------------
Resolution: Not A Bug
Assignee: Oknet Xu

> We should get vc->mutex before do_io when the vc is acquired from session pool
> ------------------------------------------------------------------------------
>
> Key: TS-4910
> URL: https://issues.apache.org/jira/browse/TS-4910
> Project: Traffic Server
> Issue Type: Bug
> Reporter: Oknet Xu
> Assignee: Oknet Xu
> Fix For: 7.1.0
>
> Component: ServerSessionPool
> (I could not find ServerSessionPool in JIRA)
> source: proxy/http/HttpSessionManager.cc
> {code}
> // Now check to see if we have a connection in our shared connection pool
> EThread *ethread = this_ethread();
> ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) ?
>                            ethread->server_session_pool->mutex.get() :
>                            m_g_pool->mutex.get();
> MUTEX_TRY_LOCK(lock, pool_mutex, ethread);
> if (lock.is_locked()) {
>   if (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) {
>     retval = ethread->server_session_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return);
>     Debug("http_ss", "[acquire session] thread pool search %s", to_return ? "successful" : "failed");
>   } else {
>     retval = m_g_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return);
>     Debug("http_ss", "[acquire session] global pool search %s", to_return ? "successful" : "failed");
>     // At this point to_return has been removed from the pool. Do we need to move it
>     // to the same thread?
>     if (to_return) {
>       UnixNetVConnection *server_vc = dynamic_cast<UnixNetVConnection *>(to_return->get_netvc());
>       if (server_vc) {
>         UnixNetVConnection *new_vc = server_vc->migrateToCurrentThread(sm, ethread);
> {code}
> In the code above:
> 1. we get pool_mutex first
> 2. then acquire a vc from the session pool
> 3. then migrate the vc to the current thread without getting vc->mutex
> According to the comments, a SM only accesses the VIO & VC that are returned with a callback.
> The mutex of a ServerSession may differ from that of its server_vc once it is acquired from the ServerSessionPool and attached to an HttpSM.
> HttpSM should get server_vc->nh->mutex and server_vc->mutex before calling ServerSession->do_io(). Currently HttpSM does not.
> How the mutexes are created and used, in timeline order:
> 1. ClientVC is accepted from NetAccept with a newly allocated mutex.
> 2. ClientSession is created and shares the same mutex with ClientVC.
> 3. HttpSM is created and shares the same mutex with ClientVC.
> 4. NetHandler gets HttpSM->mutex and calls back READ_READY to HttpSM through ClientVC->read.vio._cont.
> {code}
> ClientVC->nh->mutex is locked by EventSystem
> HttpSM->mutex is locked by NetHandler
> ClientVC->mutex is locked because it shares the same mutex with HttpSM
> To access & modify a NetVC, vc->nh->mutex and vc->mutex should both be locked simultaneously.
> {code}
> 5. Scenario 1:
> HttpSM creates ServerVC with HttpSM->mutex by netProcessor.connect_re().
> Then HttpSM creates ServerSession with ServerVC, and it shares the same mutex with HttpSM.
> 5. Scenario 2:
> HttpSM acquires a ServerSession from SessionPool and ServerVC->thread is not the current EThread.
> ServerSession->mutex is set to HttpSM->mutex when it is attached to HttpSM.
> But ServerVC->mutex is still the old one.
> The first bug:
> Before VC Migration was merged:
> - ServerSession->do_io() is called and directly calls ServerVC->do_io() without getting ServerVC->mutex first.
> After VC Migration was merged:
> - ServerVC is migrated into the current thread without getting ServerVC->mutex first.
> The second bug:
> Before VC Migration was merged:
> - Any do_io() should lock vc->nh->mutex and vc->mutex simultaneously.
> Suggestion:
> 1. Revert VC Migration
> 2. Re-design ServerSession
> To re-design ServerSession:
> 1. Add NetHandler *servervc_nh to save ServerVC->nh when VC_EVENT_NET_OPEN for ServerVC is called back to HttpSM.
> 2. For do_io(), check servervc_nh ==? get_NetHandler(this_ethread())
> 3a. equal: directly call ServerVC->do_io() and return ServerVC->VIO
> 3b. not equal: try to lock servervc_nh->mutex and ServerVC->mutex
> 4a. if locked: directly call ServerVC->do_io() and return ServerVC->VIO
> 4b. if not: create a Cont and schedule it into servervc_nh->trigger_event->thread. The Cont will call ServerVC->do_io() later.
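The do_io() dispatch proposed in the re-design steps above (3a, 3b, 4a, 4b) can be sketched as a toy model. This is only an illustration of the proposed control flow, not ATS code, and the issue itself was later closed as "Not A Bug". ToyMutex, ToyNetHandler, ToyServerVC, try_lock_both() and toy_do_io() are all hypothetical names; the deferred vector stands in for scheduling a continuation on the owning thread and is not thread-safe.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical try-lock only mutex, so the sketch can be exercised from one thread.
struct ToyMutex {
  std::atomic<bool> held{false};
  bool try_lock() { return !held.exchange(true, std::memory_order_acquire); }
  void unlock() { held.store(false, std::memory_order_release); }
};

// Hypothetical stand-in for a NetHandler: a mutex plus work queued for its thread.
struct ToyNetHandler {
  ToyMutex mutex;
  std::vector<std::function<void()>> deferred; // not thread-safe, illustration only
};

// Hypothetical stand-in for a server-side NetVC.
struct ToyServerVC {
  ToyMutex mutex;
  int io_calls = 0;
  void do_io() { ++io_calls; }
};

enum class Dispatch { Direct, Locked, Deferred };

// Take both locks or neither, without blocking.
inline bool
try_lock_both(ToyMutex &a, ToyMutex &b)
{
  if (!a.try_lock())
    return false;
  if (!b.try_lock()) {
    a.unlock();
    return false;
  }
  return true;
}

inline Dispatch
toy_do_io(ToyServerVC &vc, ToyNetHandler &servervc_nh, ToyNetHandler &current_nh)
{
  if (&servervc_nh == &current_nh) { // 3a: the VC belongs to this thread's handler
    vc.do_io();
    return Dispatch::Direct;
  }
  if (try_lock_both(servervc_nh.mutex, vc.mutex)) { // 3b + 4a: both locks acquired
    vc.do_io();
    vc.mutex.unlock();
    servervc_nh.mutex.unlock();
    return Dispatch::Locked;
  }
  // 4b: could not lock, so hand the call to the owning handler to run later
  servervc_nh.deferred.push_back([&vc] { vc.do_io(); });
  return Dispatch::Deferred;
}
```

When the owning handler's mutex is already held elsewhere, toy_do_io() returns Dispatch::Deferred and the call only happens once the queued function is run, which mirrors step 4b.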
[jira] [Commented] (TS-4910) We should get vc->mutex before do_io when the vc is acquired from session pool
[ https://issues.apache.org/jira/browse/TS-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717396#comment-15717396 ]

Oknet Xu commented on TS-4910:
------------------------------
Sorry, this is not a bug.
NetVC->mutex is unused once the NetVC has been pushed into the NetHandler. What must be acquired is nh->mutex, before accessing a NetVC that does not belong to the current thread.

> We should get vc->mutex before do_io when the vc is acquired from session pool
> ------------------------------------------------------------------------------
>
> Key: TS-4910
> URL: https://issues.apache.org/jira/browse/TS-4910
> Project: Traffic Server
> Issue Type: Bug
> Reporter: Oknet Xu
> Fix For: 7.1.0
[jira] [Updated] (TS-4879) NetVC leaks while hyper emergency occur on check_emergency_throttle()
[ https://issues.apache.org/jira/browse/TS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu updated TS-4879:
------------------------
Backport to Version: 6.2.1

> NetVC leaks while hyper emergency occur on check_emergency_throttle()
> ---------------------------------------------------------------------
>
> Key: TS-4879
> URL: https://issues.apache.org/jira/browse/TS-4879
> Project: Traffic Server
> Issue Type: Bug
> Components: Core
> Reporter: Oknet Xu
> Assignee: Oknet Xu
> Fix For: 7.0.0
>
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> The con may be closed if a hyper emergency occurs in check_emergency_throttle().
> But we do not check con.fd after check_emergency_throttle() returns.
> In a hyper emergency:
> - The socket fd is removed from epoll when it is closed.
> - A NetVC with a closed socket fd is created and NET_EVENT_OPEN is called back to the SM.
> Thus:
> - The NetVC will never be triggered by NetHandler.
> - Only InactivityCop can handle the NetVC, and the default timeout value is 86400 secs.
> For the counter net_connections_currently_open_stat:
> - It is increased in "connect_re_internal()".
> - It is not decreased when con.fd is set to NO_FD due to a hyper emergency,
> - because close_UnixNetVConnection() decreases it only if con.fd != NO_FD. (TS-4178)
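The missing check described in TS-4879 can be sketched with a toy model: after the throttle check returns, look at the fd again and undo the counter increment if the socket was closed. This is a sketch under stated assumptions, not the actual fix. ToyConnection, toy_check_emergency_throttle(), toy_connect() and the hyper_emergency_fd threshold are hypothetical; only the name NO_FD and the "counter goes up in connect, must come down on failure" behaviour come from the report.

```cpp
#include <cassert>

constexpr int NO_FD = -1;

// Hypothetical stand-in for the connection object ("con" in the report).
struct ToyConnection {
  int fd = NO_FD;
};

// Toy throttle check: in a "hyper emergency" the socket is closed and the fd
// is reset to NO_FD. Returns true when the emergency path was taken.
inline bool
toy_check_emergency_throttle(ToyConnection &con, int hyper_emergency_fd)
{
  if (con.fd > hyper_emergency_fd) {
    // a real implementation would close the socket here
    con.fd = NO_FD;
    return true;
  }
  return false;
}

enum class OpenResult { Opened, Failed };

// Toy connect path: count the connection, run the throttle check, then
// re-check the fd instead of reporting an open connection on a closed socket.
inline OpenResult
toy_connect(ToyConnection &con, int hyper_emergency_fd, int &currently_open)
{
  ++currently_open; // mirrors the increment done in connect_re_internal()
  toy_check_emergency_throttle(con, hyper_emergency_fd);
  if (con.fd == NO_FD) {
    --currently_open; // undo the increment so the counter does not drift
    return OpenResult::Failed;
  }
  return OpenResult::Opened;
}
```

With this shape, a connection whose fd was closed by the throttle check is reported as failed immediately, rather than waiting for an inactivity timeout.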
[jira] [Updated] (TS-4885) Incorrect checking of fds_throttle and fds_limit
[ https://issues.apache.org/jira/browse/TS-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu updated TS-4885:
------------------------
Backport to Version: 6.2.1

> Incorrect checking of fds_throttle and fds_limit
> ------------------------------------------------
>
> Key: TS-4885
> URL: https://issues.apache.org/jira/browse/TS-4885
> Project: Traffic Server
> Issue Type: Bug
> Components: Core
> Reporter: Oknet Xu
> Assignee: Oknet Xu
> Fix For: 7.0.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> {code}
> static void
> check_fd_limit()
> {
>   int fds_throttle = -1;
>   REC_ReadConfigInteger(fds_throttle, "proxy.config.net.connections_throttle");
>   if (fds_throttle > fds_limit + THROTTLE_FD_HEADROOM) { // ---> Incorrect
>     int new_fds_throttle = fds_limit - THROTTLE_FD_HEADROOM;
>     if (new_fds_throttle < 1) {
>       ink_abort("too few file descriptors (%d) available", fds_limit);
>     }
>     char msg[256];
>     snprintf(msg, sizeof(msg), "connection throttle too high, "
>                                "%d (throttle) + %d (internal use) > %d (file descriptor limit), "
>                                "using throttle of %d",
>              fds_throttle, THROTTLE_FD_HEADROOM, fds_limit, new_fds_throttle);
>     SignalWarning(MGMT_SIGNAL_SYSTEM_ERROR, msg);
>   }
> }
> {code}
> {code}
> static void
> adjust_sys_settings(void)
> {
>   ...
>   REC_ReadConfigInteger(fds_throttle, "proxy.config.net.connections_throttle");
>
>   if (getrlimit(RLIMIT_NOFILE, &lim) == 0) {
>     if (fds_throttle > (int)(lim.rlim_cur + THROTTLE_FD_HEADROOM)) { // --> Incorrect
>       lim.rlim_cur = (lim.rlim_max = (rlim_t)fds_throttle);
>       if (setrlimit(RLIMIT_NOFILE, &lim) == 0 && getrlimit(RLIMIT_NOFILE, &lim) == 0) {
>         fds_limit = (int)lim.rlim_cur;
>         syslog(LOG_NOTICE, "NOTE: RLIMIT_NOFILE(%d):cur(%d),max(%d)", RLIMIT_NOFILE, (int)lim.rlim_cur, (int)lim.rlim_max);
>       }
>     }
>   }
>   ...
> }
> {code}
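The report marks the comparisons as incorrect without spelling out the intended form. Going by the warning text in the quoted code ("throttle + internal use > file descriptor limit") and by new_fds_throttle = fds_limit - THROTTLE_FD_HEADROOM, the intended test appears to be that the throttle plus the headroom must fit inside the limit. A minimal sketch under that assumption follows; TOY_THROTTLE_FD_HEADROOM, throttle_too_high() and effective_throttle() are hypothetical, and the headroom value is chosen only for illustration.

```cpp
#include <cassert>

// Assumed value for illustration only; the real constant lives in the ATS source.
constexpr int TOY_THROTTLE_FD_HEADROOM = 128;

// True when throttle + headroom would exceed the file descriptor limit.
// Written as a subtraction on the limit side, matching how the quoted code
// derives new_fds_throttle.
constexpr bool
throttle_too_high(int fds_throttle, int fds_limit, int headroom = TOY_THROTTLE_FD_HEADROOM)
{
  return fds_throttle > fds_limit - headroom;
}

// Throttle that would actually be used, or -1 when too few descriptors remain
// (the quoted code aborts in that case).
constexpr int
effective_throttle(int fds_throttle, int fds_limit, int headroom = TOY_THROTTLE_FD_HEADROOM)
{
  if (!throttle_too_high(fds_throttle, fds_limit, headroom)) {
    return fds_throttle;
  }
  return (fds_limit - headroom < 1) ? -1 : fds_limit - headroom;
}
```

For example, with a limit of 1000 and a headroom of 128, a throttle of 900 is flagged by this check, whereas the quoted comparison against fds_limit + THROTTLE_FD_HEADROOM (1128) would let it pass.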
[jira] [Updated] (TS-2482) Problems with SOCKS
[ https://issues.apache.org/jira/browse/TS-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu updated TS-2482:
------------------------
Backport to Version: 6.2.1

> Problems with SOCKS
> -------------------
>
> Key: TS-2482
> URL: https://issues.apache.org/jira/browse/TS-2482
> Project: Traffic Server
> Issue Type: Bug
> Components: Core, SOCKS
> Reporter: Radim Kolar
> Assignee: Oknet Xu
> Fix For: 7.0.0
>
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> There are several problems with using SOCKS. I am interested in the case where TS is the SOCKS client: the client sends an HTTP request and TS uses a SOCKS server to connect to the internet.
> a/ Not documented enough in the default configs
> From the comments in the default configs, it seems that to run TS 4.1.2 as a SOCKS client it is sufficient to add one line to socks.config:
> dest_ip=0.0.0.0-255.255.255.255 parent="10.0.0.7:9050"
> but the SOCKS proxy is not used. If I run tcpdump to sniff packets, TS never tries to connect to that SOCKS server.
> From the source code (https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc) it looks like "proxy.config.socks.socks_needed" must be set to activate SOCKS support. This should be documented in both sample files: socks.config and records.config.
> b/ After enabling SOCKS, I hit this assert:
> Assertion failed: (ats_is_ip4(_addr)), function init, file Socks.cc, line 65.
> I run on a dual-stack system (IPv4, IPv6).
> Is this code setting the default destination for the SOCKS request? Could you not just use 127.0.0.1 when the client connects over IPv6?
> https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc#L66
[jira] [Updated] (TS-4910) We should get vc->mutex before do_io when the vc is acquired from session pool
[ https://issues.apache.org/jira/browse/TS-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4910: - Description: Component: ServerSessionPool (I cound not found ServerSessionPool from JIRA) source: proxy/http/HttpSessionManager.cc {code} 309 // Now check to see if we have a connection in our shared connection pool 310 EThread *ethread = this_ethread(); 311 ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) ? 312ethread->server_session_pool->mutex.get() : 313m_g_pool->mutex.get(); 314 MUTEX_TRY_LOCK(lock, pool_mutex, ethread); 315 if (lock.is_locked()) { 316 if (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) { 317 retval = ethread->server_session_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return); 318 Debug("http_ss", "[acquire session] thread pool search %s", to_return ? "successful" : "failed"); 319 } else { 320 retval = m_g_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return); 321 Debug("http_ss", "[acquire session] global pool search %s", to_return ? "successful" : "failed"); 322 // At this point to_return has been removed from the pool. Do we need to move it 323 // to the same thread? 324 if (to_return) { 325 UnixNetVConnection *server_vc = dynamic_cast(to_return->get_netvc()); 326 if (server_vc) { 327 UnixNetVConnection *new_vc = server_vc->migrateToCurrentThread(sm, ethread); {code} As the code above: 1. we get pool_mutex first 2. then acquire a vc from session pool 3. then migrate the vc to current thread without get vc->mutex Depend on the comments, a SM only access VIO & VC that returned with callback. The mutex of ServerSession may be different from server_vc while it is acquired from ServerSessionPool and attached to HttpSM. HttpSM should get the server_vc->nh->mutex and server_vc->mutex first before call ServerSession->do_io(). 
Currently HttpSM does not. The mutex create & usage list at below by timeline: 1. ClientVC is accepted from NetAccept with a new allocated mutex. 2. ClientSession is created and share the same mutex with ClientVC. 3. HttpSM is created and share the same mutex with ClientVC. 4. NetHandler get HttpSM->mutex and callback READ_READY to HttpSM by ClientVC->read.vio._cont. {code} ClientVC->nh->mutex is locked by EventSystem HttpSM->mutex is locked by NetHandler ClientVC->mutex is locked due share the same mutex with HttpSM To access & modify a NetVC should get vc->nh->mutex and vc->mutex locked simultaneously. {code} 5. Scenes1: HttpSM create ServerVC with HttpSM->mutex by netProcessor.connect_re() Then HttpSM create ServerSession with ServerVC and share the same mutex with HttpSM. 5. Scenes2: HttpSM acquire a ServerSession from SessionPool and the ServerVC->thread is not current EThread. The ServerSession->mutex is set to HttpSM->mutex while it is attached to HttpSM. But the ServerVC->mutex is old one. The first bug: Before VC Migration merged: - ServerSession->do_io() is called and directly call ServerVC->do_io() without get ServerVC->mutex first. After VC Migration merged: - Migrate ServerVC into current thread without get ServerVC->mutex first. The second bug: Before VC Migration merged: - Any do_io() should lock vc->nh->mutex and vc->mutex simultaneously. Suggestion: 1. Recall VC Migration 2. Re-design ServerSession To re-design ServerSession: 1. Add NetHandler *servervc_nh to save ServerVC->nh while ServerVC is callbacked VC_EVENT_NET_OPEN to HttpSM. 2. For do_io(), check servervc_nh ==? get_NetHandler(this_ethread()) 3a. equal, directly call ServerVC->do_io() and return ServerVC->VIO 3b. not equal, try lock servervc_nh->mutex and ServerVC->mutex 4a. if locked, directly call ServerVC->do_io() and return ServerVC->VIO 4b. if not, create a Cont and schedule it into servervc_nh->trigger_event->thread. The Cont will call ServerVC->do_io() later. 
[jira] [Updated] (TS-4910) We should get vc->mutex before do_io when the vc is acquired from session pool
[ https://issues.apache.org/jira/browse/TS-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4910: - Description: Component: ServerSessionPool (I cound not found ServerSessionPool from JIRA) source: proxy/http/HttpSessionManager.cc {code} 309 // Now check to see if we have a connection in our shared connection pool 310 EThread *ethread = this_ethread(); 311 ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) ? 312ethread->server_session_pool->mutex.get() : 313m_g_pool->mutex.get(); 314 MUTEX_TRY_LOCK(lock, pool_mutex, ethread); 315 if (lock.is_locked()) { 316 if (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) { 317 retval = ethread->server_session_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return); 318 Debug("http_ss", "[acquire session] thread pool search %s", to_return ? "successful" : "failed"); 319 } else { 320 retval = m_g_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return); 321 Debug("http_ss", "[acquire session] global pool search %s", to_return ? "successful" : "failed"); 322 // At this point to_return has been removed from the pool. Do we need to move it 323 // to the same thread? 324 if (to_return) { 325 UnixNetVConnection *server_vc = dynamic_cast(to_return->get_netvc()); 326 if (server_vc) { 327 UnixNetVConnection *new_vc = server_vc->migrateToCurrentThread(sm, ethread); {code} As the code above: 1. we get pool_mutex first 2. then acquire a vc from session pool 3. then migrate the vc to current thread without get vc->mutex Depend on the comments, a SM only access VIO & VC that returned with callback. The mutex of ServerSession may be different from server_vc while it is acquired from ServerSessionPool and attached to HttpSM. HttpSM should get the server_vc->nh->mutex and server_vc->mutex first before call ServerSession->do_io(). 
Currently HttpSM does not. The mutex create & usage list at below by timeline: 1. ClientVC is accepted from NetAccept with a new allocated mutex. 2. ClientSession is created and share the same mutex with ClientVC. 3. HttpSM is created and share the same mutex with ClientVC. 4. NetHandler get HttpSM->mutex and callback READ_READY to HttpSM by ClientVC->read.vio._cont. {code} ClientVC->nh->mutex is locked by EventSystem HttpSM->mutex is locked by NetHandler ClientVC->mutex is locked due share the same mutex with HttpSM To access & modify a NetVC should get vc->nh->mutex and vc->mutex locked simultaneously. {code} 5. Scenes1: HttpSM create ServerVC with HttpSM->mutex by netProcessor.connect_re() Then HttpSM create ServerSession with ServerVC and share the same mutex with HttpSM. 5. Scenes2: HttpSM acquire a ServerSession from SessionPool and the ServerVC->thread is not current EThread. The ServerSession->mutex is set to HttpSM->mutex while it is attached to HttpSM. But the ServerVC->mutex is old one. The first bug: Before VC Migration merged: - ServerSession->do_io() is called and directly call ServerVC->do_io() without get ServerVC->mutex first. After VC Migration merged: - Migrate ServerVC into current thread without get ServerVC->mutex first. The second bug: Before VC Migration merged: - Any do_io() should lock vc->nh->mutex and vc->mutex simultaneously. Suggestion: 1. Recall VC Migration 2. Re-design ServerSession To re-design ServerSession: 1. Add NetHandler *servervc_nh to save ServerVC->nh while ServerVC is callbacked VC_EVENT_NET_OPEN to HttpSM. 2. For do_io(), check servervc_nh ==? get_NetHandler(this_ethread()) 3a. equal, directly call ServerVC->do_io() and return ServerVC->VIO 3b. not equal, try lock servervc_nh->mutex and ServerVC->mutex 4a. if locked, directly call ServerVC->do_io() and return ServerVC->VIO 4b. if not, create a Cont and schedule it into servervc_nh->trigger_event->thread. The Cont will call ServerVC->do_io() later. 
[jira] [Created] (TS-4910) We should get vc->mutex before do_io when the vc is acquired from session pool
Oknet Xu created TS-4910: Summary: We should get vc->mutex before do_io when the vc is acquired from session pool Key: TS-4910 URL: https://issues.apache.org/jira/browse/TS-4910 Project: Traffic Server Issue Type: Bug Reporter: Oknet Xu Component: ServerSessionPool (I cound not found ServerSessionPool from JIRA) source: proxy/http/HttpSessionManager.cc ``` 309 // Now check to see if we have a connection in our shared connection pool 310 EThread *ethread = this_ethread(); 311 ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) ? 312ethread->server_session_pool->mutex.get() : 313m_g_pool->mutex.get(); 314 MUTEX_TRY_LOCK(lock, pool_mutex, ethread); 315 if (lock.is_locked()) { 316 if (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) { 317 retval = ethread->server_session_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return); 318 Debug("http_ss", "[acquire session] thread pool search %s", to_return ? "successful" : "failed"); 319 } else { 320 retval = m_g_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return); 321 Debug("http_ss", "[acquire session] global pool search %s", to_return ? "successful" : "failed"); 322 // At this point to_return has been removed from the pool. Do we need to move it 323 // to the same thread? 324 if (to_return) { 325 UnixNetVConnection *server_vc = dynamic_cast(to_return->get_netvc()); 326 if (server_vc) { 327 UnixNetVConnection *new_vc = server_vc->migrateToCurrentThread(sm, ethread); ``` As the code above: 1. we get pool_mutex first 2. then acquire a vc from session pool 3. then migrate the vc to current thread without get vc->mutex Depend on the comments, a SM only access VIO & VC that returned with callback. The mutex of ServerSession may be different from server_vc while it is acquired from ServerSessionPool and attached to HttpSM. 
HttpSM should get the server_vc->nh->mutex and server_vc->mutex first before call ServerSession->do_io(). Currently HttpSM does not. ClientVC is accepted from NetAccept with a new allocated mutex. Then ClientSession is created and share the same mutex with ClientVC. Then HttpSM is created and share the same mutex with ClientVC. Then NetHandler get HttpSM->mutex and callback READ_READY to HttpSM by ClientVC->read.vio._cont. ** ClientVC->nh->mutex is locked by EventSystem ** HttpSM->mutex is locked by NetHandler ** ClientVC->mutex is locked due share the same mutex with HttpSM Scenes1: HttpSM create ServerVC with HttpSM->mutex by netProcessor.connect_re() Then HttpSM create ServerSession with ServerVC and share the same mutex with HttpSM. Scenes2: HttpSM acquire a ServerSession from SessionPool and the ServerVC->thread is not current EThread. The ServerSession->mutex is set to HttpSM->mutex while it is attached to HttpSM. But the ServerVC->mutex is old one. The first bug: Before VC Migration merged: - ServerSession->do_io() is called and directly call ServerVC->do_io() without get ServerVC->mutex first. After VC Migration merged: - Migrate ServerVC into current thread without get ServerVC->mutex first. The second bug: Before VC Migration merged: - Any do_io() should lock vc->nh->mutex and vc->mutex simultaneously. Suggestion: 1. Recall VC Migration 2. Re-design ServerSession To re-design ServerSession: 1. Add NetHandler *servervc_nh to save ServerVC->nh while ServerVC is callbacked VC_EVENT_NET_OPEN to HttpSM. 2. For do_io(), check servervc_nh ==? get_NetHandler(this_ethread()) 3a. equal, directly call ServerVC->do_io() and return ServerVC->VIO 3b. not equal, try lock servervc_nh->mutex and ServerVC->mutex 4a. if locked, directly call ServerVC->do_io() and return ServerVC->VIO 4b. if not, create a Cont and schedule it into servervc_nh->trigger_event->thread. The Cont will call ServerVC->do_io() later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (TS-4895) CID 1021743: Uninitialized members in iocore/net/UnixNet.cc
[ https://issues.apache.org/jira/browse/TS-4895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu closed TS-4895.
-----------------------

> CID 1021743: Uninitialized members in iocore/net/UnixNet.cc
> -----------------------------------------------------------
>
> Key: TS-4895
> URL: https://issues.apache.org/jira/browse/TS-4895
> Project: Traffic Server
> Issue Type: Bug
> Components: Network
> Reporter: Leif Hedstrom
> Assignee: Oknet Xu
> Labels: coverity
> Fix For: 7.1.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> {code}
> *** CID 1021743: Uninitialized members (UNINIT_CTOR)
> /iocore/net/UnixNet.cc: 269 in NetHandler::NetHandler()()
> 263     max_connections_active_in(0),
> 264     inactive_threashold_in(0),
> 265     transaction_no_activity_timeout_in(0),
> 266     keep_alive_no_activity_timeout_in(0)
> 267 {
> 268   SET_HANDLER((NetContHandler)&NetHandler::startNetEvent);
> >>> CID 1021743: Uninitialized members (UNINIT_CTOR)
> >>> Non-static class member "default_inactivity_timeout" is not initialized in this constructor nor in any functions that it calls.
> 269 }
> 270
> 271 int
> 272 update_nethandler_config(const char *name, RecDataT data_type ATS_UNUSED, RecData data, void *cookie)
> 273 {
> 274   NetHandler *nh = static_cast<NetHandler *>(cookie);
> {code}
[jira] [Resolved] (TS-4895) CID 1021743: Uninitialized members in iocore/net/UnixNet.cc
[ https://issues.apache.org/jira/browse/TS-4895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu resolved TS-4895.
-------------------------
Resolution: Fixed

> CID 1021743: Uninitialized members in iocore/net/UnixNet.cc
> -----------------------------------------------------------
>
> Key: TS-4895
> URL: https://issues.apache.org/jira/browse/TS-4895
> Project: Traffic Server
> Issue Type: Bug
> Components: Network
> Reporter: Leif Hedstrom
> Assignee: Oknet Xu
> Labels: coverity
> Fix For: 7.1.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> {code}
> *** CID 1021743: Uninitialized members (UNINIT_CTOR)
> /iocore/net/UnixNet.cc: 269 in NetHandler::NetHandler()()
> 263     max_connections_active_in(0),
> 264     inactive_threashold_in(0),
> 265     transaction_no_activity_timeout_in(0),
> 266     keep_alive_no_activity_timeout_in(0)
> 267 {
> 268   SET_HANDLER((NetContHandler)&NetHandler::startNetEvent);
> >>> CID 1021743: Uninitialized members (UNINIT_CTOR)
> >>> Non-static class member "default_inactivity_timeout" is not initialized in this constructor nor in any functions that it calls.
> 269 }
> 270
> 271 int
> 272 update_nethandler_config(const char *name, RecDataT data_type ATS_UNUSED, RecData data, void *cookie)
> 273 {
> 274   NetHandler *nh = static_cast<NetHandler *>(cookie);
> {code}
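The kind of change that silences an UNINIT_CTOR finding like the one quoted above is to give every member a value in the constructor. The sketch below shows the pattern on a toy class; it is not the committed fix. ToyNetHandler is hypothetical, the member names are copied from the Coverity excerpt, and the uint32_t types and zero defaults are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for NetHandler, reduced to the configuration members
// named in the Coverity excerpt. Types are assumed for illustration.
struct ToyNetHandler {
  std::uint32_t max_connections_active_in;
  std::uint32_t inactive_threashold_in;
  std::uint32_t transaction_no_activity_timeout_in;
  std::uint32_t keep_alive_no_activity_timeout_in;
  std::uint32_t default_inactivity_timeout;

  // Every member appears in the initializer list, including
  // default_inactivity_timeout, the one the report says was left out.
  ToyNetHandler()
    : max_connections_active_in(0),
      inactive_threashold_in(0),
      transaction_no_activity_timeout_in(0),
      keep_alive_no_activity_timeout_in(0),
      default_inactivity_timeout(0)
  {
  }
};
```

An alternative with the same effect is a default member initializer on each field, which keeps the constructor from drifting out of sync when a new member is added.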
[jira] [Closed] (TS-4539) The mutex of server_vc is not set while server_session reuse
[ https://issues.apache.org/jira/browse/TS-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu closed TS-4539.

Resolution: Not A Bug

> The mutex of server_vc is not set while server_session reuse
>
> Key: TS-4539
> URL: https://issues.apache.org/jira/browse/TS-4539
> Project: Traffic Server
> Issue Type: Bug
> Components: HTTP
> Reporter: Oknet Xu
> Assignee: Oknet Xu
> Fix For: 7.1.0
>
> NetAccept gets a client_vc and calls new_ProxyMutex() to assign it a mutex.
> The HttpClientSession, HttpSM, HttpServerSession, and server_vc all share that same mutex.
> The HttpServerSession and server_vc are put into the ServerSessionPool and may be assigned to the next new client_vc.
> After getting an HttpServerSession from the ServerSessionPool, HttpSM::attach_server_session() only sets the mutex of the HttpServerSession to the mutex of the HttpSM.
> It forgets to set the mutex of server_vc to the mutex of the HttpSM.
>
> {code}
> void
> HttpSM::attach_server_session(HttpServerSession *s)
> {
>   hsm_release_assert(server_session == NULL);
>   hsm_release_assert(server_entry == NULL);
>   hsm_release_assert(s->state == HSS_ACTIVE);
>   server_session = s;
>   server_session->transact_count++;
>   // Set the mutex so that we have something to update
>   // stats with
>   server_session->mutex = this->mutex;
> {code}
> But I cannot find any resulting issue. Is this by design?
> Or is the problem just hard to locate, given my limited knowledge of HttpSM?
[jira] [Commented] (TS-4539) The mutex of server_vc is not set while server_session reuse
[ https://issues.apache.org/jira/browse/TS-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15528725#comment-15528725 ]

Oknet Xu commented on TS-4539:

This is not a bug, but I found another bug in ServerSessionPool and HttpSessionManager.
A VC cannot be migrated to other EThreads while it is allocated. It is managed by the NetHandler running in the same EThread.
The NetHandler owns the VC, and the VC is only freed by the NetHandler.
InactivityCop is effectively part of NetHandler because they share the same mutex; therefore, like NetHandler, it can free the VC.
[jira] [Updated] (TS-4612) Proposal: InactivityCop Optimize
[ https://issues.apache.org/jira/browse/TS-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu updated TS-4612:

Issue Type: Improvement (was: Bug)

> Proposal: InactivityCop Optimize
>
> Key: TS-4612
> URL: https://issues.apache.org/jira/browse/TS-4612
> Project: Traffic Server
> Issue Type: Improvement
> Components: Core, Network
> Reporter: Oknet Xu
> Assignee: Oknet Xu
> Fix For: 7.1.0
>
> Time Spent: 5h 50m
> Remaining Estimate: 0h
>
> From reviewing the processing in InactivityCop::check_inactivity():
> 1. Get all local VCs from open_list.
> 2. Put them into cop_list.
> 3. Check every VC in cop_list to see whether it has already timed out.
> 4. Call back vc->handleEvent to close the VC if it has timed out.
> InactivityCop and NetHandler share one mutex.
> InactivityCop runs every second and NetHandler runs every 10 ms, which means NetHandler runs 100 times before the next InactivityCop run.
> If a VC has read/write activity in a NetHandler call, it will not time out in the next InactivityCop run.
> Thus, if the VC has read/write activity in NetHandler, we move it out of cop_list, and the InactivityCop runs would then get better performance.
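The idea in the proposal can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch attached to TS-4612: `VCSketch`, `CopSketch`, `on_io`, and the use of `std::unordered_set` are hypothetical, and only the names `open_list` and `cop_list` come from the issue text. A VC that sees I/O is dropped from `cop_list`, so the once-per-second pass only examines VCs that were idle since the previous pass.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified VC: only the fields needed for the sketch.
struct VCSketch {
  int64_t timeout_at;
  bool closed;
  explicit VCSketch(int64_t t) : timeout_at(t), closed(false) {}
};

struct CopSketch {
  std::vector<VCSketch *> open_list;       // every VC owned by this thread
  std::unordered_set<VCSketch *> cop_list; // VCs the next cop pass must examine

  // Called from the NetHandler side when a VC has read/write activity:
  // extend its deadline and take it out of the next cop pass.
  void on_io(VCSketch *vc, int64_t now, int64_t timeout)
  {
    vc->timeout_at = now + timeout;
    cop_list.erase(vc);
  }

  // One InactivityCop pass; returns how many VCs were examined.
  int check_inactivity(int64_t now)
  {
    int examined = 0;
    for (VCSketch *vc : cop_list) {
      ++examined;
      if (!vc->closed && vc->timeout_at <= now) {
        vc->closed = true; // stands in for the vc->handleEvent() callback
      }
    }
    // Rebuild cop_list from open_list for the next pass.
    cop_list.clear();
    for (VCSketch *vc : open_list) {
      if (!vc->closed) {
        cop_list.insert(vc);
      }
    }
    return examined;
  }
};
```

In this sketch a busy VC costs one set removal per I/O event instead of one timeout comparison per cop pass; whether that trade pays off in practice depends on the workload.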
[jira] [Resolved] (TS-4612) Proposal: InactivityCop Optimize
[ https://issues.apache.org/jira/browse/TS-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu resolved TS-4612.

Resolution: Fixed
Assignee: Oknet Xu
Fix Version/s: (was: sometime) 7.1.0
[jira] [Resolved] (TS-4879) NetVC leaks while hyper emergency occur on check_emergency_throttle()
[ https://issues.apache.org/jira/browse/TS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu resolved TS-4879.

Resolution: Fixed
Fix Version/s: 7.1.0

> NetVC leaks while hyper emergency occur on check_emergency_throttle()
>
> Key: TS-4879
> URL: https://issues.apache.org/jira/browse/TS-4879
> Project: Traffic Server
> Issue Type: Bug
> Components: Core
> Reporter: Oknet Xu
> Assignee: Oknet Xu
> Fix For: 7.1.0
>
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> The con may be closed if a hyper emergency occurs in check_emergency_throttle().
> But we do not check con.fd after returning from check_emergency_throttle().
> For a hyper emergency:
> - The socket fd is removed from epoll when it is closed.
> - A NetVC with a closed socket fd is created, and NET_EVENT_OPEN is called back to the SM.
> Thus:
> - The NetVC will never be triggered by NetHandler.
> - Only InactivityCop can handle the NetVC, and the default timeout value is 86400 secs.
> For the counter net_connections_currently_open_stat:
> - It is increased in "connect_re_internal()".
> - It is not decreased when con.fd is set to NO_FD due to a hyper emergency,
> - because it is decreased in close_UnixNetVConnection() only if con.fd != NO_FD. (TS-4178)
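A rough sketch of the missing check described above: after the throttle check returns, look at the fd before creating a NetVC or touching the open-connections counter. Everything here is hypothetical scaffolding (`ConnectionSketch`, `check_emergency_throttle_sketch`, `open_netvc_sketch`, and the value assumed for `NO_FD`); it only models the control flow and is not the fix committed for TS-4879.

```cpp
const int NO_FD = -1; // sentinel name taken from the issue text; the value is assumed

// Hypothetical stand-in for the connection object ("con").
struct ConnectionSketch {
  int fd;
};

// Models check_emergency_throttle(): in a hyper emergency the socket is closed.
void
check_emergency_throttle_sketch(ConnectionSketch &con, bool hyper_emergency)
{
  if (hyper_emergency) {
    con.fd = NO_FD;
  }
}

// Returns true only when a NetVC should be created for this connection.
bool
open_netvc_sketch(ConnectionSketch &con, bool hyper_emergency, int &currently_open)
{
  check_emergency_throttle_sketch(con, hyper_emergency);
  if (con.fd == NO_FD) {
    // The socket is already gone: no NetVC, no NET_EVENT_OPEN, no counter change.
    return false;
  }
  ++currently_open;
  return true;
}
```

Bailing out before the counter is incremented keeps the counter balanced, since the decrement path described in the issue only runs when con.fd != NO_FD.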
[jira] [Resolved] (TS-4705) Proposal: NetVC Context
[ https://issues.apache.org/jira/browse/TS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu resolved TS-4705.

Resolution: Fixed

> Proposal: NetVC Context
>
> Key: TS-4705
> URL: https://issues.apache.org/jira/browse/TS-4705
> Project: Traffic Server
> Issue Type: Improvement
> Components: Core
> Reporter: Oknet Xu
> Assignee: Oknet Xu
> Fix For: 7.1.0
>
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> Goal 1:
> NetVConnection has the get_local_addr() and get_remote_addr() methods.
> It also has the members local_addr, remote_addr, and netvc->con.addr.
> Thus, we should use netvc->con.addr or remote_addr to replace the member server_addr in UnixNetVConnection.
> Goal 2:
> SSLNetVConnection has the member sslClientConnection with two methods, setSSLClientConnection() and getSSLClientConnection(), to indicate whether ATS is the client or the server in an SSL session.
> To abstract the two goals above, I designed the NetVC context function.
> A proxy has two sides: the client side ( Client <-> Proxy ) and the server side ( Proxy <-> Server ). The NetVC context function indicates which side the NetVC is working on.
> Goal 3:
> Fix a minor bug in NetAccept::do_blocking_accept: call check_emergency_throttle(con) first, then allocate the vc.
> Goal 4:
> Optimize NetAccept, remove duplicated code, etc.
[jira] [Work started] (TS-4705) Proposal: NetVC Context
[ https://issues.apache.org/jira/browse/TS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on TS-4705 started by Oknet Xu.
[jira] [Work started] (TS-4879) NetVC leaks while hyper emergency occur on check_emergency_throttle()
[ https://issues.apache.org/jira/browse/TS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on TS-4879 started by Oknet Xu.
[jira] [Updated] (TS-4886) ATS HostDB Crash
[ https://issues.apache.org/jira/browse/TS-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4886: - Affects Version/s: 6.2.0 > ATS HostDB Crash > > > Key: TS-4886 > URL: https://issues.apache.org/jira/browse/TS-4886 > Project: Traffic Server > Issue Type: Bug > Components: HostDB >Affects Versions: 6.2.0 >Reporter: song > > HostDB crash for every week. > FATAL: MultiCache.cc:1073: failed assert `0` > traffic_server: using root directory '/opt/fusion/cdn/trafficserver' > traffic_server: Aborted (Signal sent by tkill() 31504 33) > traffic_server - STACK TRACE: > /opt/fusion/cdn/trafficserver/bin/traffic_server(crash_logger_invoke(int, > siginfo_t*, void*)+0xc3)[0x50963a] > /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x2b4e38494330] > /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x2b4e390fcc37] > /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x2b4e39100028] > /opt/fusion/cdn/trafficserver/lib/libtsutil.so.6(ink_fatal_va(char const*, > __va_list_tag*)+0x0)[0x2b4e37433a04] > /opt/fusion/cdn/trafficserver/lib/libtsutil.so.6(ink_fatal(char const*, > ...)+0x0)[0x2b4e37433acf] > /opt/fusion/cdn/trafficserver/lib/libtsutil.so.6(ink_pfatal(char const*, > ...)+0x0)[0x2b4e37433b6e] > /opt/fusion/cdn/trafficserver/lib/libtsutil.so.6(ats_base64_encode(unsigned > char const*, unsigned long, char*, unsigned long, unsigned > long*)+0x0)[0x2b4e374317e0] > /opt/fusion/cdn/trafficserver/bin/traffic_server(MultiCacheBase::fixup_heap_offsets(int, > int, UnsunkPtrRegistry*, int)+0x16f)[0x6e0ab3] > /opt/fusion/cdn/trafficserver/bin/traffic_server(MultiCacheSync::mcEvent(int, > Event*)+0x108)[0x6e2408] > /opt/fusion/cdn/trafficserver/bin/traffic_server(Continuation::handleEvent(int, > void*)+0x72)[0x50c5f8] > /opt/fusion/cdn/trafficserver/bin/traffic_server(EThread::process_event(Event*, > int)+0x136)[0x7aa646] > /opt/fusion/cdn/trafficserver/bin/traffic_server(EThread::execute()+0xdc)[0x7aa8a6] > 
/opt/fusion/cdn/trafficserver/bin/traffic_server[0x7a9c27] > /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x2b4e3848c184] > /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2b4e391c037d] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS
[ https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513834#comment-15513834 ]

Oknet Xu commented on TS-4468:

For HTTP session reuse, following my suggestion (release/acquire the server session based on server_request->get_host()):

If proxy.config.url_remap.pristine_host_hdr is false:
- t_state.current.server->name == server_request->get_host() == "origin.example.com"
- no difference

If proxy.config.url_remap.pristine_host_hdr is true:
- t_state.current.server->name == "origin.example.com"
- server_request->get_host() == "example.com" or "www.example.com"
- reduces the value of server session reuse (but without any negative effects)

With the option enabled:
- The results of match=ip and match=ip+FQDN are almost the same.
- "match=ip" already meets our requirements, because the FQDN resolves to multiple IPs and the content on each IP is the same.
- The result of match=ip+Host is more accurate (and smaller) than the result of match=ip+FQDN.

For HTTP session reuse:
- match=ip is enough <==> match = IP
- match=FQDN is acceptable and improves the value when a FQDN has multiple IPs <==> match = HOST
- match=ip+FQDN is almost the same as match=ip <==> match = BOTH
- match=Host is acceptable and improves the value, but less than FQDN
- match=ip+Host is acceptable but reduces the value of reuse

For HTTPS session reuse:
- match=ip is unacceptable, against RFC 6066
- match=FQDN is unacceptable, against RFC 6066
- match=ip+FQDN is unacceptable, against RFC 6066
- match=Host(SNI) is acceptable and improves the value
- match=ip+Host(SNI) is required <==> match = IP
- match=FQDN+Host(SNI) is acceptable and no different from ip+Host <==> match = HOST
- match=ip+FQDN+Host(SNI) is acceptable and no different from ip+Host <==> match = BOTH

Your patch implements the additional SNI match for SSLNetVC.

Based on the analysis above, to get the maximum value from reuse:
- to reuse a server session connected to a parent proxy, we prefer match=ip
- to reuse a server session that reverse proxies to an HTTP origin server, we prefer match=ip
- to reuse a server session that reverse proxies to an HTTPS origin server, we prefer match=ip+sni (with the patch)
- to reuse a server session that forward proxies to an HTTP origin server, we prefer match=host
- to reuse a server session that forward proxies to an HTTPS origin server, we prefer match=host+sni (with the patch)

Currently, the ATS default setting is match=both, which is a middle-ground solution (not bad, but not the best).

Thanks for your explanation; I now fully understand the reuse. However, I will reserve my opinion about match=IP+FQDN <==> match=BOTH.

> http.server_session_sharing.match = both unsafe with HTTPS
>
> Key: TS-4468
> URL: https://issues.apache.org/jira/browse/TS-4468
> Project: Traffic Server
> Issue Type: Bug
> Components: HTTP, SSL
> Affects Versions: 6.1.1
> Reporter: Jered Floyd
> Assignee: Susan Hinrichs
> Fix For: 7.0.0
>
> Attachments: TS-4468.patch
>
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> proxy.config.http.server_session_sharing.match has a default value of "both", which compares IP address, port, and FQDN when determining whether a connection can be reused for further user agent requests.
> The "host" (FQDN) matching does not behave safely when ATS is operating as a reverse proxy. The compared value is the origin server FQDN after mapping, rather than the initial "Host" target.
> If multiple Hosts map to the same origin server and the scheme is HTTPS, ATS will attempt to reuse a connection that may have an SNI Host that does not match the HTTP Host. With Apache 2.4 origin servers this results in 400 Bad Request to the user agent.
> PROBLEM REPRODUCTION:
> You can observe this behavior with two mapping rules such as:
> map https://example.com/ https://origin.example.com/
> map https://www.example.com/ https://origin.example.com/
> Non-caching clients alternately fetching URIs from the two targets will see 400 Bad Request responses intermittently.
> WORKAROUND:
> proxy.config.http.server_session_sharing.match should have a default value of "none" when proxy.config.reverse_proxy.enabled is "1"
> SUGGESTED FIXES:
> In order of completeness:
> 1) Do not share server sessions on reverse_proxy requests.
> 2) Do not share server sessions on reverse_proxy requests where scheme is HTTPS.
> 3) Compare target host (SNI host) rather than replacement host when determining if reuse of server session is allowed (when server_session_sharing.match is set to "host" or "both").
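The matching rules discussed in this comment can be sketched as a single predicate. This is only an illustration of the reasoning, not the ATS session manager: `MatchMode`, `SessionKey`, and `can_reuse` are hypothetical names, the port comparison is left out for brevity, and the extra SNI comparison for HTTPS reflects what the comment attributes to the attached patch.

```cpp
#include <string>

// Hypothetical mirror of the "none / ip / host / both" setting.
enum MatchMode { MATCH_NONE, MATCH_IP, MATCH_HOST, MATCH_BOTH };

// Hypothetical description of a pooled or requested server session.
struct SessionKey {
  std::string ip;   // origin address
  std::string fqdn; // origin name after remapping
  std::string sni;  // SNI sent on the TLS connection (unused for plain HTTP)
};

// Decide whether a pooled session may serve the wanted request.
bool
can_reuse(MatchMode mode, const SessionKey &pooled, const SessionKey &wanted, bool is_https)
{
  if (mode == MATCH_NONE) {
    return false;
  }
  const bool ip_ok   = pooled.ip == wanted.ip;
  const bool host_ok = pooled.fqdn == wanted.fqdn;
  bool ok = (mode == MATCH_IP) ? ip_ok : (mode == MATCH_HOST) ? host_ok : (ip_ok && host_ok);
  if (ok && is_https) {
    // For HTTPS the SNI must match as well, whatever the configured mode.
    ok = pooled.sni == wanted.sni;
  }
  return ok;
}
```

With the two remap rules from the issue, both requests share the same IP and origin FQDN, so every mode except none would allow reuse over HTTP, while over HTTPS the differing SNI values block it.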
[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS
[ https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513474#comment-15513474 ]

Oknet Xu commented on TS-4468:

Could you please do a test for the issue with the option disabled?
[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS
[ https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513105#comment-15513105 ]

Oknet Xu commented on TS-4468:

From reviewing the code, the key point is "proxy.config.url_remap.pristine_host_hdr".

If proxy.config.url_remap.pristine_host_hdr is false:
t_state.current.server->name == server_request->get_host() == "origin.example.com"

If proxy.config.url_remap.pristine_host_hdr is true:
t_state.current.server->name == "origin.example.com"
server_request->get_host() == "example.com" or "www.example.com"

ATS always:
- sets the SNI based on server_request->get_host()
- releases/acquires the server session using t_state.current.server->name as the hostname

My suggestion is:
- release/acquire the server session based on server_request->get_host()
- no need to check SNI in ServerSessionManager.
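To make the mismatch concrete, the sketch below models which host name ends up as the session pool key and which as the SNI, depending on pristine_host_hdr. The struct and function names (`HostNames`, `current_behaviour`, `suggested_behaviour`) are hypothetical, and the mapping is taken only from the description in this comment, not from the ATS source.

```cpp
#include <string>

// Hypothetical pair of names used for one outgoing request.
struct HostNames {
  std::string pool_key; // name used to release/acquire the server session
  std::string sni;      // name sent as SNI to the origin
};

// Behaviour as described in the comment: pool by t_state.current.server->name,
// SNI from server_request->get_host().
HostNames
current_behaviour(const std::string &client_host, const std::string &origin_host, bool pristine_host_hdr)
{
  // With pristine_host_hdr the outgoing request keeps the client's Host.
  const std::string request_host = pristine_host_hdr ? client_host : origin_host;
  return HostNames{origin_host, request_host};
}

// Suggested behaviour: use server_request->get_host() for both.
HostNames
suggested_behaviour(const std::string &client_host, const std::string &origin_host, bool pristine_host_hdr)
{
  const std::string request_host = pristine_host_hdr ? client_host : origin_host;
  return HostNames{request_host, request_host};
}
```

Under the current behaviour with pristine_host_hdr enabled, requests for example.com and www.example.com share one pool key while carrying different SNI values; under the suggestion the pool key always equals the SNI, which is why no separate SNI check would be needed.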
[jira] [Commented] (TS-4886) ATS HostDB Crash
[ https://issues.apache.org/jira/browse/TS-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513065#comment-15513065 ]

Oknet Xu commented on TS-4886:

[~jacksontj] Could you please have a look?
[jira] [Created] (TS-4885) Incorrect checking of fds_throttle and fds_limit
Oknet Xu created TS-4885:

Summary: Incorrect checking of fds_throttle and fds_limit
Key: TS-4885
URL: https://issues.apache.org/jira/browse/TS-4885
Project: Traffic Server
Issue Type: Bug
Components: Core
Reporter: Oknet Xu

{code}
static void
check_fd_limit()
{
  int fds_throttle = -1;
  REC_ReadConfigInteger(fds_throttle, "proxy.config.net.connections_throttle");
  if (fds_throttle > fds_limit + THROTTLE_FD_HEADROOM) { // ---> Incorrect
    int new_fds_throttle = fds_limit - THROTTLE_FD_HEADROOM;
    if (new_fds_throttle < 1) {
      ink_abort("too few file descriptors (%d) available", fds_limit);
    }
    char msg[256];
    snprintf(msg, sizeof(msg), "connection throttle too high, "
                               "%d (throttle) + %d (internal use) > %d (file descriptor limit), "
                               "using throttle of %d",
             fds_throttle, THROTTLE_FD_HEADROOM, fds_limit, new_fds_throttle);
    SignalWarning(MGMT_SIGNAL_SYSTEM_ERROR, msg);
  }
}
{code}

{code}
static void
adjust_sys_settings(void)
{
  ...
  REC_ReadConfigInteger(fds_throttle, "proxy.config.net.connections_throttle");

  if (getrlimit(RLIMIT_NOFILE, &lim) == 0) {
    if (fds_throttle > (int)(lim.rlim_cur + THROTTLE_FD_HEADROOM)) { // --> Incorrect
      lim.rlim_cur = (lim.rlim_max = (rlim_t)fds_throttle);
      if (setrlimit(RLIMIT_NOFILE, &lim) == 0 && getrlimit(RLIMIT_NOFILE, &lim) == 0) {
        fds_limit = (int)lim.rlim_cur;
        syslog(LOG_NOTICE, "NOTE: RLIMIT_NOFILE(%d):cur(%d),max(%d)", RLIMIT_NOFILE, (int)lim.rlim_cur, (int)lim.rlim_max);
      }
    }
  }
  ...
}
{code}
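The issue marks the comparisons as incorrect without spelling out the intended form. One plausible reading, taken from the warning text in check_fd_limit() ("throttle + internal use > file descriptor limit"), is that the headroom should be subtracted from the limit rather than added to it. The sketch below shows that reading only; the helper name `effective_fds_throttle`, the `-1` error return, and the placeholder headroom value are assumptions for illustration, and the fix eventually committed for TS-4885 may differ.

```cpp
// Placeholder value for illustration; the real constant is defined in the ATS source.
const int THROTTLE_FD_HEADROOM = 128;

// Returns the connection throttle to use, or -1 when too few descriptors
// are available (where the original code calls ink_abort()).
int
effective_fds_throttle(int fds_throttle, int fds_limit)
{
  // throttle + headroom must fit within the descriptor limit
  if (fds_throttle > fds_limit - THROTTLE_FD_HEADROOM) {
    int new_fds_throttle = fds_limit - THROTTLE_FD_HEADROOM;
    if (new_fds_throttle < 1) {
      return -1;
    }
    return new_fds_throttle;
  }
  return fds_throttle;
}
```

For example, with a limit of 1024 and a configured throttle of 1000, the comparison quoted in the issue (1000 > 1024 + 128) is false and the throttle stays at 1000, leaving only 24 descriptors for internal use; the sketch lowers the throttle to 896 instead.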
[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS
[ https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15512361#comment-15512361 ]

Oknet Xu commented on TS-4468:

[~jered] do you have proxy.config.url_remap.pristine_host_hdr enabled?
[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS
[ https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15510070#comment-15510070 ]

Oknet Xu commented on TS-4468:
------------------------------

{code}
One thing that's not clear is in what situations `t_state.current.server->name` is not the same as `server_request.get_host`.
{code}

According to the code, the two are kept in sync, so any case where they differ would be a bug. Therefore, we should revert the commit and fix that bug instead.

> http.server_session_sharing.match = both unsafe with HTTPS
> ----------------------------------------------------------
>
>                 Key: TS-4468
>                 URL: https://issues.apache.org/jira/browse/TS-4468
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: HTTP, SSL
>    Affects Versions: 6.1.1
>            Reporter: Jered Floyd
>            Assignee: Susan Hinrichs
>             Fix For: 7.0.0
>
>         Attachments: TS-4468.patch
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> proxy.config.http.server_session_sharing.match has a default value of "both", which compares IP address, port, and FQDN when determining whether a connection can be reused for further user agent requests.
> The "host" (FQDN) matching does not behave safely when ATS is operating as a reverse proxy. The compared value is the origin server FQDN after mapping, rather than the initial "Host" target.
> If multiple Hosts map to the same origin server and the scheme is HTTPS, ATS will attempt to reuse a connection that may have an SNI Host that does not match the HTTP Host. With Apache 2.4 origin servers this results in 400 Bad Request to the user agent.
> PROBLEM REPRODUCTION:
> You can observe this behavior with two mapping rules such as:
> map https://example.com/ https://origin.example.com/
> map https://www.example.com/ https://origin.example.com/
> Non-caching clients alternately fetching URIs from the two targets will see 400 Bad Request responses intermittently.
> WORKAROUND:
> proxy.config.http.server_session_sharing.match should have a default value of "none" when proxy.config.reverse_proxy.enabled is "1"
> SUGGESTED FIXES:
> In order of completeness:
> 1) Do not share server sessions on reverse_proxy requests.
> 2) Do not share server sessions on reverse_proxy requests where scheme is HTTPS.
> 3) Compare target host (SNI host) rather than replacement host when determining if reuse of server session is allowed (when server_session_sharing.match is set to "host" or "both").

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
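The matching rule in the TS-4468 report can be sketched as a small predicate. This is an illustration only, not the ATS implementation: `MatchMode`, `ServerSession`, `Request`, and `can_reuse` are hypothetical names, and a session is modelled as an address string plus two host names. It contrasts comparing the post-remap origin FQDN with comparing the SNI/target host, as in suggested fix 3.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration; these are not ATS types.
enum class MatchMode { None, Ip, Host, Both };

struct ServerSession {
  std::string addr;        // "ip:port" of the origin connection
  std::string origin_host; // FQDN after remap (replacement host)
  std::string sni_host;    // name sent in the TLS SNI extension
};

struct Request {
  std::string addr;        // resolved "ip:port" for this request
  std::string origin_host; // FQDN after remap
  std::string target_host; // original Host target before remap
};

// Decide whether an idle session may be reused for a request.
// If compare_sni is true, the host check uses the SNI/target host
// instead of the post-remap origin host.
bool can_reuse(const ServerSession &s, const Request &r, MatchMode mode, bool compare_sni)
{
  const bool ip_ok   = s.addr == r.addr;
  const bool host_ok = compare_sni ? s.sni_host == r.target_host : s.origin_host == r.origin_host;
  switch (mode) {
  case MatchMode::None:
    return false;
  case MatchMode::Ip:
    return ip_ok;
  case MatchMode::Host:
    return host_ok;
  case MatchMode::Both:
    return ip_ok && host_ok;
  }
  return false;
}
```

With the two `map` rules from the report, a session opened for example.com and a request for www.example.com share the same address and origin FQDN, so the origin-host comparison permits reuse while the SNI comparison rejects it.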
[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS
[ https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15510077#comment-15510077 ]

Oknet Xu commented on TS-4468:
------------------------------

{code}
At most, if we decided we needed to be stringent in enforcing SNI/host matching on client side, I would want the ability to opt out.
{code}

I agree with that: there should be an option to control it. [~jered]
[jira] [Comment Edited] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS
[ https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509978#comment-15509978 ]

Oknet Xu edited comment on TS-4468 at 9/21/16 2:01 PM:
------------------------------------------------------

RFC 6066 conflicts with RFC 7540:

{code}
RFC 6066

If the server_name is established in the TLS session handshake, the client SHOULD NOT attempt to request a different server name at the application layer.
{code}

{code}
RFC 7540

9.1.1 Connection Reuse

Connections that are made to an origin server, either directly or through a tunnel created using the CONNECT method (Section 8.3), MAY be reused for requests with multiple different URI authority components. A connection can be reused as long as the origin server is authoritative (Section 10.1). For TCP connections without TLS, this depends on the host having resolved to the same IP address.

For https resources, connection reuse additionally depends on having a certificate that is valid for the host in the URI. The certificate presented by the server MUST satisfy any checks that the client would perform when forming a new TLS connection for the host in the URI.

An origin server might offer a certificate with multiple subjectAltName attributes or names with wildcards, one of which is valid for the authority in the URI. For example, a certificate with a subjectAltName of *.example.com might permit the use of the same connection for requests to URIs starting with https://a.example.com/ and https://b.example.com/.
{code}

RFC 7540 (HTTP/2) allows connection reuse as long as the server has a certificate that is valid for the host in the URI. RFC 6066, however, only allows connection reuse when the SNI is the same as the host in the URI.

According to a survey of Firefox, Chrome, Edge, and Safari (https://daniel.haxx.se/blog/2016/08/18/http2-connection-coalescing/):

- Firefox reuses connections by IP address.
- Chrome reuses connections by both.
- Edge and Safari reuse connections by hostname.
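The RFC 7540 example of a `*.example.com` subjectAltName covering both a.example.com and b.example.com can be illustrated with a simplified name matcher. This is a sketch under stated assumptions: `san_matches` is a hypothetical helper, it only treats a whole leftmost `*` label as a wildcard, and it is not a substitute for the certificate verification a TLS library performs.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Lower-case an ASCII host name so the comparison is case-insensitive.
static std::string lower(std::string s)
{
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s;
}

// Simplified check: does one subjectAltName dNSName cover `host`?
// Only a whole leftmost label "*" is treated as a wildcard, and it
// must match exactly one non-empty label.
bool san_matches(const std::string &san_in, const std::string &host_in)
{
  const std::string san = lower(san_in), host = lower(host_in);
  if (san.compare(0, 2, "*.") != 0) {
    return san == host; // no wildcard: exact match only
  }
  const std::string suffix = san.substr(1); // e.g. ".example.com"
  if (host.size() <= suffix.size()) {
    return false; // nothing left over for the wildcard label
  }
  const std::string::size_type label_len = host.size() - suffix.size();
  // The host must end with the suffix, and its first dot must be the
  // one that starts the suffix (so the wildcard spans a single label).
  return host.compare(label_len, std::string::npos, suffix) == 0 && host.find('.') == label_len;
}
```

Under this model a connection opened for a.example.com could also carry requests for b.example.com if the certificate lists `*.example.com`, which is the certificate-based reuse that RFC 7540 describes.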
[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS
[ https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509978#comment-15509978 ]

Oknet Xu commented on TS-4468:
------------------------------

RFC 6066 conflicts with RFC 7540:

{code}
RFC 6066

If the server_name is established in the TLS session handshake, the client SHOULD NOT attempt to request a different server name at the application layer.
{code}

{code}
RFC 7540

9.1.1 Connection Reuse

Connections that are made to an origin server, either directly or through a tunnel created using the CONNECT method (Section 8.3), MAY be reused for requests with multiple different URI authority components. A connection can be reused as long as the origin server is authoritative (Section 10.1). For TCP connections without TLS, this depends on the host having resolved to the same IP address.

For https resources, connection reuse additionally depends on having a certificate that is valid for the host in the URI. The certificate presented by the server MUST satisfy any checks that the client would perform when forming a new TLS connection for the host in the URI.

An origin server might offer a certificate with multiple subjectAltName attributes or names with wildcards, one of which is valid for the authority in the URI. For example, a certificate with a subjectAltName of *.example.com might permit the use of the same connection for requests to URIs starting with https://a.example.com/ and https://b.example.com/.
{code}

RFC 7540 (HTTP/2) allows connection reuse as long as the server has a certificate that is valid for the host in the URI. RFC 6066, however, only allows connection reuse when the SNI is the same as the host in the URI.
[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS
[ https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506700#comment-15506700 ]

Oknet Xu commented on TS-4468:
------------------------------

{code}
shared_result = httpSessionManager.acquire_session(this,                                 // state machine
                                                   &t_state.current.server->dst_addr.sa, // ip + port
                                                   t_state.current.server->name,         // hostname
                                                   ua_session,                           // has ptr to bound ua sessions
                                                   this                                  // sm
                                                   );
{code}

t_state.current.server->name should not be used to acquire a server session; it is only used to look up hostdb and obtain dst_addr. We should replace it with server_request.get_host() here to comply with RFC 6066.
[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS
[ https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506637#comment-15506637 ]

Oknet Xu commented on TS-4468:
------------------------------

As I understand RFC 6066:

- As a server, we should always verify the SNI when we receive a new request. (This is not yet complete in ATS.)
- In other words, the SNI of client_vc should always be in sync with the Host header in t_state.hdr_info.client_request.
- As a client, we should always set the SNI from the application layer.
- In other words, the SNI of server_vc should always be in sync with the Host header in t_state.hdr_info.server_request.

Thus, acquiring the server session based on server_request.get_host() is enough, and there is no need to compare the SNI. If the two are ever out of sync, we should simply fix that bug.
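The idea of acquiring a server session by the outbound request's host can be sketched with a toy idle-session pool. `SessionPool` and its integer session ids are hypothetical and far simpler than the ATS session manager; the sketch also assumes the outbound requests carry distinct Host values for the two front-end names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Toy idle-session pool keyed by (address, Host of the outbound request).
// Not an ATS class; session ids stand in for real connection objects.
class SessionPool
{
public:
  using Key = std::pair<std::string, std::string>;

  // Return a finished session to the pool under its address and host.
  void release(const std::string &addr, const std::string &host, int session_id)
  {
    idle_.emplace(Key(addr, host), session_id);
  }

  // Take an idle session that matches both address and host.
  // Returns the session id, or -1 when no idle session matches.
  int acquire(const std::string &addr, const std::string &host)
  {
    auto it = idle_.find(Key(addr, host));
    if (it == idle_.end()) {
      return -1;
    }
    const int id = it->second;
    idle_.erase(it);
    return id;
  }

private:
  std::multimap<Key, int> idle_;
};
```

Because the host is part of the key, a session released for one Host value is never handed to a request carrying a different one, even when both resolve to the same address.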
[jira] [Commented] (TS-803) Fix SOCKS breakage and allow for setting next-hop SOCKS
[ https://issues.apache.org/jira/browse/TS-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15502680#comment-15502680 ]

Oknet Xu commented on TS-803:
-----------------------------

From my understanding:

- Compared to a parent HTTP proxy server, a SOCKS proxy server used as a parent proxy is a lower-level setting.
- The SOCKS proxy is designed to proxy all outgoing connections, including those to a parent HTTP proxy server.

So we can set a SOCKS proxy manually with the new API in a remap or a plugin.

About the API name:

- proxy/api/ts/ts.h:tsapi void TSHttpTxnParentProxySet(TSHttpTxn txnp, const char *hostname, int port);
- We already have the ParentProxySet API, which is named without "Addr".
- The SOCKS proxy is not set only for the HTTP protocol; it is set for a VConnection.

Can we name it TSVConnSocksParentSet?

We also need more parameters for the SOCKS server:

- SOCKS version
- username (optional)
- password (optional)

> Fix SOCKS breakage and allow for setting next-hop SOCKS
> -------------------------------------------------------
>
>                 Key: TS-803
>                 URL: https://issues.apache.org/jira/browse/TS-803
>             Project: Traffic Server
>          Issue Type: New Feature
>          Components: Network, SOCKS
>    Affects Versions: 3.0.0
>         Environment: Wherever ATS might run
>            Reporter: M. Nunberg
>
> Here is a patch I drew up a few months ago against a snapshot of ATS/2.1.7 unstable/git. There are some quirks here, and I'm not that sure any more what this patch does exactly. However it:
> 1) Does fix SOCKS connections in general
> 2) Allows setting next-hop SOCKS proxy via the API
> Problems:
> See https://issues.apache.org/jira/browse/TS-802
> This has no effect on connections which are drawn from the connection pool, as it seems ATS currently doesn't maintain unique identities for peripheral connection params (source IP, SOCKS etc); i.e. this only affects new TCP connections to an OS.
> diff -x '*.o' -ru tsorig/iocore/net/I_NetVConnection.h > tsgit217/iocore/net/I_NetVConnection.h > --- tsorig/iocore/net/I_NetVConnection.h2011-03-09 21:43:58.0 > + > +++ tsgit217/iocore/net/I_NetVConnection.h2011-03-17 14:37:18.0 > + > @@ -120,6 +120,13 @@ >/// Version of SOCKS to use. >unsigned char socks_version; > + struct { > + unsigned int ip; > + int port; > + char *username; > + char *password; > + } socks_override; > + >int socket_recv_bufsize; >int socket_send_bufsize; > Only in tsgit217/iocore/net: Makefile > Only in tsgit217/iocore/net: Makefile.in > diff -x '*.o' -ru tsorig/iocore/net/P_Socks.h tsgit217/iocore/net/P_Socks.h > --- tsorig/iocore/net/P_Socks.h2011-03-09 21:43:58.0 + > +++ tsgit217/iocore/net/P_Socks.h2011-03-17 13:17:20.0 + > @@ -126,7 +126,7 @@ >unsigned char version; >bool write_done; > - > + bool manual_parent_selection; >SocksAuthHandler auth_handler; >unsigned char socks_cmd; > @@ -145,7 +145,8 @@ > SocksEntry():Continuation(NULL), netVConnection(0), > ip(0), port(0), server_ip(0), server_port(0), nattempts(0), > -lerrno(0), timeout(0), version(5), write_done(false), > auth_handler(NULL), socks_cmd(NORMAL_SOCKS) > +lerrno(0), timeout(0), version(5), write_done(false), > manual_parent_selection(false), > +auth_handler(NULL), socks_cmd(NORMAL_SOCKS) >{ >} > }; > diff -x '*.o' -ru tsorig/iocore/net/Socks.cc tsgit217/iocore/net/Socks.cc > --- tsorig/iocore/net/Socks.cc2011-03-09 21:43:58.0 + > +++ tsgit217/iocore/net/Socks.cc2011-03-17 13:46:07.0 + > @@ -73,7 +73,8 @@ >nattempts = 0; >findServer(); > - timeout = this_ethread()->schedule_in(this, > HRTIME_SECONDS(netProcessor.socks_conf_stuff->server_connect_timeout)); > +// timeout = this_ethread()->schedule_in(this, > HRTIME_SECONDS(netProcessor.socks_conf_stuff->server_connect_timeout)); > + timeout = this_ethread()->schedule_in(this, HRTIME_SECONDS(5)); >write_done = false; > } > @@ -81,6 +82,15 @@ > SocksEntry::findServer() > { >nattempts++; > + if(manual_parent_selection) { > + 
if(nattempts > 1) { > + //Nullify IP and PORT > + server_ip = -1; > + server_port = 0; > + } > + Debug("mndebug(Socks)", "findServer() is a noop with manual socks > selection"); > + return; > + } > #ifdef SOCKS_WITH_TS >if (nattempts == 1) { > @@ -187,7 +197,6 @@ > } > Debug("Socks", "Failed to connect to %u.%u.%u.%u:%d", > PRINT_IP(server_ip), server_port); > - > findServer(); > if (server_ip == (uint32_t) - 1) { > diff -x '*.o' -ru tsorig/iocore/net/UnixNetProcessor.cc > tsgit217/iocore/net/UnixNetProcessor.cc > --- tsorig/iocore/net/UnixNetProcessor.cc2011-03-09 21:43:58.0 > + > +++ tsgit217/iocore/net/UnixNetProcessor.cc2011-03-17 15:48:38.0 > + > @@
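The extra SOCKS parameters requested in the TS-803 comment (version, optional username, optional password) could be carried in a small structure like the one below. `SocksParent` and `build_auth_request` are hypothetical names for illustration and are not part of the ATS API or of the quoted patch. The byte layout follows the RFC 1929 username/password sub-negotiation, in which each length field is one byte (1 to 255).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical parameter bundle for a SOCKS parent; not an ATS structure.
struct SocksParent {
  std::string host;
  uint16_t port;
  uint8_t version;      // 4 or 5
  std::string username; // empty means "no authentication"
  std::string password;
};

// Build the RFC 1929 username/password request:
//   VER(0x01) | ULEN | UNAME | PLEN | PASSWD
// Returns an empty vector when the parameters do not describe a valid
// SOCKS5 credential pair (wrong version, or a length outside 1..255).
std::vector<uint8_t> build_auth_request(const SocksParent &p)
{
  std::vector<uint8_t> out;
  if (p.version != 5 || p.username.empty() || p.username.size() > 255 ||
      p.password.empty() || p.password.size() > 255) {
    return out;
  }
  out.push_back(0x01); // sub-negotiation version
  out.push_back(static_cast<uint8_t>(p.username.size()));
  out.insert(out.end(), p.username.begin(), p.username.end());
  out.push_back(static_cast<uint8_t>(p.password.size()));
  out.insert(out.end(), p.password.begin(), p.password.end());
  return out;
}
```

Treating an empty username as "no authentication" keeps both credentials optional, as the comment proposes, without needing a separate flag.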
[jira] [Updated] (TS-4879) NetVC leaks while hyper emergency occur on check_emergency_throttle()
[ https://issues.apache.org/jira/browse/TS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu updated TS-4879:
-------------------------
    Description:
The con could be closed if a hyper emergency occurs in check_emergency_throttle(), but we do not check con.fd after check_emergency_throttle() returns.

In a hyper emergency:
- The socket fd is removed from epoll when it is closed.
- A NetVC with a closed socket fd is created, and NET_EVENT_OPEN is called back to the SM.

Thus:
- The NetVC will never be triggered by NetHandler.
- Only InactivityCop can handle the NetVC, and the default timeout value is 86400 secs.

For the counter net_connections_currently_open_stat:
- It is increased in “connect_re_internal()”.
- It is not decreased when con.fd is set to NO_FD due to a hyper emergency,
- because it is decreased in close_UnixNetVConnection() only if con.fd != NO_FD. (TS-4178)

> NetVC leaks while hyper emergency occur on check_emergency_throttle()
> ---------------------------------------------------------------------
>
>                 Key: TS-4879
>                 URL: https://issues.apache.org/jira/browse/TS-4879
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core
>            Reporter: Oknet Xu
>            Assignee: Oknet Xu
[jira] [Created] (TS-4879) NetVC leaks while hyper emergency occur on check_emergency_throttle()
Oknet Xu created TS-4879:
----------------------------

             Summary: NetVC leaks while hyper emergency occur on check_emergency_throttle()
                 Key: TS-4879
                 URL: https://issues.apache.org/jira/browse/TS-4879
             Project: Traffic Server
          Issue Type: Bug
          Components: Core
            Reporter: Oknet Xu

The con could be closed if a hyper emergency occurs in check_emergency_throttle(), but we do not check con.fd after check_emergency_throttle() returns.

In a hyper emergency:
- The socket fd is removed from epoll when it is closed.
- A NetVC with a closed socket fd is created, and NET_EVENT_OPEN is called back to the SM.

Thus:
- The NetVC will never be triggered by NetHandler.
- Only InactivityCop can handle the NetVC, and the default timeout value is 86400 secs.

For the counter net_connections_currently_open_stat:
- It is increased in “connect_re_internal()”.
- It is not decreased when con.fd is set to NO_FD due to a hyper emergency,
- because it is decreased in close_UnixNetVConnection() only if con.fd != NO_FD.
[jira] [Assigned] (TS-4879) NetVC leaks while hyper emergency occur on check_emergency_throttle()
[ https://issues.apache.org/jira/browse/TS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu reassigned TS-4879:
----------------------------

    Assignee: Oknet Xu
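The counter imbalance described in TS-4879 can be reproduced in a toy model. Everything here is a simplified stand-in: `Connection`, `check_throttle`, and `finish_connect` are hypothetical, `open_count` plays the role of net_connections_currently_open_stat, and the sketch is not the fix that was committed. It shows one way to check the descriptor after the throttle call and undo the increment when the socket has been closed.

```cpp
#include <cassert>

constexpr int NO_FD = -1;

// Minimal stand-in for a connection that owns a socket descriptor.
struct Connection {
  int fd = NO_FD;
};

// Toy throttle check: in a "hyper emergency" the socket is closed and
// the descriptor is marked invalid.
void check_throttle(Connection &con, bool hyper_emergency)
{
  if (hyper_emergency) {
    con.fd = NO_FD; // a real implementation would also close() the fd
  }
}

// Toy connect path. Returns true when the caller may go on to deliver
// an "open" event for this connection.
bool finish_connect(Connection &con, bool hyper_emergency, int &open_count)
{
  ++open_count; // counted as open before the throttle check runs
  check_throttle(con, hyper_emergency);
  if (con.fd == NO_FD) {
    --open_count; // the close path skips NO_FD, so undo the increment here
    return false; // do not hand a dead connection to the state machine
  }
  return true;
}
```

Without the `con.fd == NO_FD` branch, the counter would stay incremented and the caller would receive a connection that no event loop will ever wake up, which matches the leak described in the report.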
[jira] [Commented] (TS-803) Fix SOCKS breakage and allow for setting next-hop SOCKS
[ https://issues.apache.org/jira/browse/TS-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500410#comment-15500410 ]

Oknet Xu commented on TS-803:
-----------------------------

[~jpe...@apache.org]

The patch implements a new API, TSHttpTxnSocksProxySet, for setting the SOCKS server manually from a plugin.

I cannot find any code related to “Does fix SOCKS connections in general”.

Since this is a new API, does it need an API REVIEW?

> Fix SOCKS breakage and allow for setting next-hop SOCKS
> -------------------------------------------------------
>
>                 Key: TS-803
>                 URL: https://issues.apache.org/jira/browse/TS-803
>             Project: Traffic Server
>          Issue Type: New Feature
>          Components: Network, SOCKS
>    Affects Versions: 3.0.0
>         Environment: Wherever ATS might run
>            Reporter: M. Nunberg
>
> Here is a patch I drew up a few months ago against a snapshot of ATS/2.1.7 unstable/git. There are some quirks here, and I'm not that sure any more what this patch does exactly. However it:
> 1) Does fix SOCKS connections in general
> 2) Allows setting next-hop SOCKS proxy via the API
> Problems:
> See https://issues.apache.org/jira/browse/TS-802
> This has no effect on connections which are drawn from the connection pool, as it seems ATS currently doesn't maintain unique identities for peripheral connection params (source IP, SOCKS etc); i.e. this only affects new TCP connections to an OS.
> diff -x '*.o' -ru tsorig/iocore/net/I_NetVConnection.h > tsgit217/iocore/net/I_NetVConnection.h > --- tsorig/iocore/net/I_NetVConnection.h2011-03-09 21:43:58.0 > + > +++ tsgit217/iocore/net/I_NetVConnection.h2011-03-17 14:37:18.0 > + > @@ -120,6 +120,13 @@ >/// Version of SOCKS to use. 
>unsigned char socks_version; > + struct { > + unsigned int ip; > + int port; > + char *username; > + char *password; > + } socks_override; > + >int socket_recv_bufsize; >int socket_send_bufsize; > Only in tsgit217/iocore/net: Makefile > Only in tsgit217/iocore/net: Makefile.in > diff -x '*.o' -ru tsorig/iocore/net/P_Socks.h tsgit217/iocore/net/P_Socks.h > --- tsorig/iocore/net/P_Socks.h2011-03-09 21:43:58.0 + > +++ tsgit217/iocore/net/P_Socks.h2011-03-17 13:17:20.0 + > @@ -126,7 +126,7 @@ >unsigned char version; >bool write_done; > - > + bool manual_parent_selection; >SocksAuthHandler auth_handler; >unsigned char socks_cmd; > @@ -145,7 +145,8 @@ > SocksEntry():Continuation(NULL), netVConnection(0), > ip(0), port(0), server_ip(0), server_port(0), nattempts(0), > -lerrno(0), timeout(0), version(5), write_done(false), > auth_handler(NULL), socks_cmd(NORMAL_SOCKS) > +lerrno(0), timeout(0), version(5), write_done(false), > manual_parent_selection(false), > +auth_handler(NULL), socks_cmd(NORMAL_SOCKS) >{ >} > }; > diff -x '*.o' -ru tsorig/iocore/net/Socks.cc tsgit217/iocore/net/Socks.cc > --- tsorig/iocore/net/Socks.cc2011-03-09 21:43:58.0 + > +++ tsgit217/iocore/net/Socks.cc2011-03-17 13:46:07.0 + > @@ -73,7 +73,8 @@ >nattempts = 0; >findServer(); > - timeout = this_ethread()->schedule_in(this, > HRTIME_SECONDS(netProcessor.socks_conf_stuff->server_connect_timeout)); > +// timeout = this_ethread()->schedule_in(this, > HRTIME_SECONDS(netProcessor.socks_conf_stuff->server_connect_timeout)); > + timeout = this_ethread()->schedule_in(this, HRTIME_SECONDS(5)); >write_done = false; > } > @@ -81,6 +82,15 @@ > SocksEntry::findServer() > { >nattempts++; > + if(manual_parent_selection) { > + if(nattempts > 1) { > + //Nullify IP and PORT > + server_ip = -1; > + server_port = 0; > + } > + Debug("mndebug(Socks)", "findServer() is a noop with manual socks > selection"); > + return; > + } > #ifdef SOCKS_WITH_TS >if (nattempts == 1) { > @@ -187,7 +197,6 @@ > } > Debug("Socks", "Failed 
to connect to %u.%u.%u.%u:%d", > PRINT_IP(server_ip), server_port); > - > findServer(); > if (server_ip == (uint32_t) - 1) { > diff -x '*.o' -ru tsorig/iocore/net/UnixNetProcessor.cc > tsgit217/iocore/net/UnixNetProcessor.cc > --- tsorig/iocore/net/UnixNetProcessor.cc2011-03-09 21:43:58.0 > + > +++ tsgit217/iocore/net/UnixNetProcessor.cc2011-03-17 15:48:38.0 > + > @@ -228,6 +228,11 @@ >!socks_conf_stuff->ip_range.match(ip)) > #endif > ); > + if(opt->socks_override.ip >= 1) { > + using_socks = true; > + Debug("mndebug", "trying to set using_socks to true"); > + } > + >SocksEntry *socksEntry = NULL; > #endif >NET_SUM_GLOBAL_DYN_STAT(net_connections_currently_open_stat, 1); > @@ -242,6 +247,16 @@ >if (using_socks) { > Debug("Socks", "Using Socks ip: %u.%u.%u.%u:%d\n", PRINT_IP(ip), port); >
[jira] [Commented] (TS-2889) Crash in FetchSM related to spdy FetchSM changes in 5.0.x
[ https://issues.apache.org/jira/browse/TS-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15470041#comment-15470041 ]

Oknet Xu commented on TS-2889:
------------------------------

Could you explain the code that copies data from resp_reader to resp_buffer?

{code}
+
+while (total_bytes_copied < bytes) {
+  int64_t actual_bytes_copied;
+  actual_bytes_copied = resp_buffer->write(resp_reader, bytes, 0);
+  Debug(DEBUG_TAG, "[%s] copied %" PRId64 " bytes", __FUNCTION__, actual_bytes_copied);
+  if (actual_bytes_copied <= 0) {
+    break;
+  }
+  total_bytes_copied += actual_bytes_copied;
+}
+Debug(DEBUG_TAG, "[%s] total copied %" PRId64 " bytes", __FUNCTION__, total_bytes_copied);
+resp_reader->consume(total_bytes_copied);
+
{code}

Why copy the data and then consume the old copy?

> Crash in FetchSM related to spdy FetchSM changes in 5.0.x
> ---------------------------------------------------------
>
>                 Key: TS-2889
>                 URL: https://issues.apache.org/jira/browse/TS-2889
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, SPDY
>    Affects Versions: 5.0.0
>            Reporter: Brian Geffon
>            Assignee: Brian Geffon
>              Labels: yahoo
>             Fix For: 5.1.0
>
>         Attachments: ts2889.diff
>
>
> I'm seeing a crash in the FetchSM on 5.0.x, this is surely because of the changes that were made to the FetchSM as a result of SPDY.
> Sample bt: > #0 0x00377c632925 in raise () from /lib64/libc.so.6 > #1 0x00377c634105 in abort () from /lib64/libc.so.6 > #2 0x2b09b0693ef0 in ink_die_die_die (retval=1) at ink_error.cc:43 > #3 0x2b09b0693fbd in ink_fatal_va(int, const char *, typedef > __va_list_tag __va_list_tag *) (return_code=1, > message_format=0x2b09b06a1358 "%s:%d: failed assert `%s`", > ap=0x2b09b8806710) at ink_error.cc:65 > #4 0x2b09b0694086 in ink_fatal (return_code=1, > message_format=0x2b09b06a1358 "%s:%d: failed assert `%s`") > at ink_error.cc:73 > #5 0x2b09b0692d40 in _ink_assert (expression=0x761f2f "header_done", > file=0x761ede "FetchSM.cc", line=160) at ink_assert.cc:37 > #6 0x004fa5c0 in FetchSM::check_chunked (this=0x2b09f8012240) > at FetchSM.cc:160 > #7 0x004fac82 in FetchSM::get_info_from_buffer (this=0x2b09f8012240, > the_reader=0x2b09f4004818) at FetchSM.cc:313 > #8 0x004fb18b in FetchSM::process_fetch_read (this=0x2b09f8012240, > event=104) at FetchSM.cc:402 > #9 0x004fb42d in FetchSM::fetch_handler (this=0x2b09f8012240, > event=104, edata=0x2b09f8002768) at FetchSM.cc:449 > #10 0x004fc43e in Continuation::handleEvent (this=0x2b09f8012240, > event=104, data=0x2b09f8002768) > at ../iocore/eventsystem/I_Continuation.h:146 > ---Type to continue, or q to quit--- > #11 0x00537f2e in PluginVC::process_read_side (this=0x2b09f8002670, > other_side_call=false) at PluginVC.cc:637 > #12 0x00536856 in PluginVC::main_handler (this=0x2b09f8002670, > event=1, data=0x2b0a340293e0) at PluginVC.cc:208 > #13 0x004fc43e in Continuation::handleEvent (this=0x2b09f8002670, > event=1, data=0x2b0a340293e0) at > ../iocore/eventsystem/I_Continuation.h:146 > #14 0x0075d2e6 in EThread::process_event (this=0x2b09b23cc010, > e=0x2b0a340293e0, calling_code=1) at UnixEThread.cc:145 > #15 0x0075d4b4 in EThread::execute (this=0x2b09b23cc010) > at UnixEThread.cc:196 > #16 0x0075c844 in spawn_thread_internal (a=0x1428b10) at Thread.cc:88 > #17 0x00377ce079d1 in start_thread () from /lib64/libpthread.so.0 -- 
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-4804) Incorrect write.vio.ndone
[ https://issues.apache.org/jira/browse/TS-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15455845#comment-15455845 ] Oknet Xu commented on TS-4804: -- [~zwoop] the bug exists only in 7.0.0. > Incorrect write.vio.ndone > - > > Key: TS-4804 > URL: https://issues.apache.org/jira/browse/TS-4804 > Project: Traffic Server > Issue Type: Bug > Components: Core >Reporter: Oknet Xu >Assignee: Oknet Xu > Fix For: 7.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > {code} > int64_t > UnixNetVConnection::load_buffer_and_write(int64_t towrite, MIOBufferAccessor > &buf, int64_t &total_written, int &needs) > { > ... > if (r > 0) { > buf.reader()->consume(r); > } > total_written += r; > ... > return r; > } > {code} > 'r' is the value returned from socketManager.writev(). > 'total_written += r;' should be inside the if statement, because 'r' may > be negative, in which case total_written becomes incorrect. > {code} > void > write_to_net_io(NetHandler *nh, UnixNetVConnection *vc, EThread *thread) > { > ... > int64_t r = vc->load_buffer_and_write(towrite, buf, > total_written, needs); > if (total_written > 0) { > NET_SUM_DYN_STAT(net_write_bytes_stat, total_written); > s->vio.ndone += total_written; > } > ... > } > {code} > An incorrect total_written in turn makes write.vio.ndone incorrect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-4804) Incorrect write.vio.ndone
Oknet Xu created TS-4804: Summary: Incorrect write.vio.ndone Key: TS-4804 URL: https://issues.apache.org/jira/browse/TS-4804 Project: Traffic Server Issue Type: Bug Components: Core Reporter: Oknet Xu {code} int64_t UnixNetVConnection::load_buffer_and_write(int64_t towrite, MIOBufferAccessor &buf, int64_t &total_written, int &needs) { ... if (r > 0) { buf.reader()->consume(r); } total_written += r; ... return r; } {code} 'r' is the value returned from socketManager.writev(). 'total_written += r;' should be inside the if statement, because 'r' may be negative, in which case total_written becomes incorrect. {code} void write_to_net_io(NetHandler *nh, UnixNetVConnection *vc, EThread *thread) { ... int64_t r = vc->load_buffer_and_write(towrite, buf, total_written, needs); if (total_written > 0) { NET_SUM_DYN_STAT(net_write_bytes_stat, total_written); s->vio.ndone += total_written; } ... } {code} An incorrect total_written in turn makes write.vio.ndone incorrect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
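The accounting problem described in TS-4804 can be illustrated with a small standalone sketch. This is not the Traffic Server code: accumulate_written() and its vector of results are hypothetical stand-ins, where each element plays the role of one writev() return value (bytes written, or a negative errno). It only shows the idea of adding 'r' to the running total when 'r' is positive.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the write loop: each element of `results` is
// treated as one writev() return value (bytes written, or a negative errno).
// Returns the last result seen, mirroring the "return r;" in the report.
int64_t accumulate_written(const std::vector<int64_t> &results, int64_t &total_written)
{
  int64_t r = 0;
  for (int64_t value : results) {
    r = value;
    if (r > 0) {
      total_written += r; // count only bytes that were actually written
    } else {
      break;              // 0 or -errno: stop, and do not add it to the total
    }
  }
  return r;
}
```

With results of 100, 50 and -EAGAIN, the total stays at 150. Adding 'r' unconditionally would instead subtract the errno value from the total, which is the kind of error that would then propagate into vio.ndone.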
[jira] [Commented] (TS-2482) Problems with SOCKS
[ https://issues.apache.org/jira/browse/TS-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441427#comment-15441427 ] Oknet Xu commented on TS-2482: -- The socks proxy is broken by “TS-919: Make iocore IPv6 capable ,https://github.com/oknet/trafficserver/commit/8247bcac9e326746132d6526469c6b30146c0baf” at 19 Aug 2011. server_addr is socks server. target_addr is remote_addr that to connect with. > Problems with SOCKS > --- > > Key: TS-2482 > URL: https://issues.apache.org/jira/browse/TS-2482 > Project: Traffic Server > Issue Type: Bug > Components: Core >Reporter: Radim Kolar >Assignee: Oknet Xu > Fix For: sometime > > Time Spent: 10m > Remaining Estimate: 0h > > There are several problems with using SOCKS. I am interested in case when TF > is sock client. Client sends HTTP request and TF uses SOCKS server to make > connection to internet. > a/ - not documented enough in default configs > From default configs comments it seems that for running > TF 4.1.2 as socks client, it is sufficient to add one line to socks.config: > dest_ip=0.0.0.0-255.255.255.255 parent="10.0.0.7:9050" > but socks proxy is not used. If i run tcpdump sniffing packets TF never > tries to connect to that SOCKS. > From source code - > https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc it > looks that is needed to set "proxy.config.socks.socks_needed" to activate > socks support. This should be documented in both sample files: socks.config > and record.config > b/ > after enabling socks, i am hit by this assert: > Assertion failed: (ats_is_ip4(_addr)), function init, file Socks.cc, > line 65. > i run on dual stack system (ip4,ip6). > This code is setting default destination for SOCKS request? Can not you use > just 127.0.0.1 for case if client gets connected over IP6? > https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc#L66 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4613) Set an independent thread_data_used for each thread group instead of sharing one thread_data_used
[ https://issues.apache.org/jira/browse/TS-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4613: - Priority: Minor (was: Major) > Set an independent thread_data_used for each thread group instead of sharing > one thread_data_used > - > > Key: TS-4613 > URL: https://issues.apache.org/jira/browse/TS-4613 > Project: Traffic Server > Issue Type: Improvement > Components: Core >Reporter: Oknet Xu >Assignee: Oknet Xu >Priority: Minor > Labels: Optimization > Fix For: 7.0.0 > > > "thread_data_used" indicate the usage of EThread::thread_private[ ]. > The EThread::thread_private[ ] saved thread specific data e.g. : > - stat system arrays > - NetHandler object > - PollCont object > However, the private data of thread group are different. > Sharing thread_data_used cause the waste of space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
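The space argument in TS-4613 can be sketched as follows. This is only an illustration under assumptions, not the Traffic Server implementation: ThreadGroup, GroupOffsets and allocate() are hypothetical names, and the 16-byte rounding is chosen for the example. The point is that with one counter per thread group, data registered by one group does not reserve space in the private area of threads belonging to another group.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical thread groups; the real event system defines its own types.
enum ThreadGroup { GROUP_NET = 0, GROUP_DISK = 1, GROUP_COUNT = 2 };

// One "used" counter per group instead of a single shared counter.
struct GroupOffsets {
  std::array<std::size_t, GROUP_COUNT> used{}; // value-initialized to 0

  // Reserve `size` bytes (rounded up to a multiple of 16) in the private
  // area of `group` and return the offset where the reservation starts.
  std::size_t allocate(ThreadGroup group, std::size_t size)
  {
    std::size_t offset = used[group];
    used[group] += (size + 15) & ~static_cast<std::size_t>(15);
    return offset;
  }
};
```

With a single shared counter, a reservation made for the network group would also advance the offset seen by the disk group, so every thread would carry space it never uses.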
[jira] [Updated] (TS-4613) Set an independent thread_data_used for each thread group instead of sharing one thread_data_used
[ https://issues.apache.org/jira/browse/TS-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4613: - Issue Type: Improvement (was: Bug) > Set an independent thread_data_used for each thread group instead of sharing > one thread_data_used > - > > Key: TS-4613 > URL: https://issues.apache.org/jira/browse/TS-4613 > Project: Traffic Server > Issue Type: Improvement > Components: Core >Reporter: Oknet Xu >Assignee: Oknet Xu > Labels: Optimization > Fix For: 7.0.0 > > > "thread_data_used" indicate the usage of EThread::thread_private[ ]. > The EThread::thread_private[ ] saved thread specific data e.g. : > - stat system arrays > - NetHandler object > - PollCont object > However, the private data of thread group are different. > Sharing thread_data_used cause the waste of space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4613) Set an independent thread_data_used for each thread group instead of sharing one thread_data_used
[ https://issues.apache.org/jira/browse/TS-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4613: - Labels: Optimization (was: ) > Set an independent thread_data_used for each thread group instead of sharing > one thread_data_used > - > > Key: TS-4613 > URL: https://issues.apache.org/jira/browse/TS-4613 > Project: Traffic Server > Issue Type: Bug > Components: Core >Reporter: Oknet Xu >Assignee: Oknet Xu > Labels: Optimization > Fix For: 7.0.0 > > > "thread_data_used" indicate the usage of EThread::thread_private[ ]. > The EThread::thread_private[ ] saved thread specific data e.g. : > - stat system arrays > - NetHandler object > - PollCont object > However, the private data of thread group are different. > Sharing thread_data_used cause the waste of space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4613) Set an independent thread_data_used for each thread group instead of sharing one thread_data_used
[ https://issues.apache.org/jira/browse/TS-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4613: - Description: "thread_data_used" indicate the usage of EThread::thread_private[ ]. The EThread::thread_private[ ] saved thread specific data e.g. : - stat system arrays - NetHandler object - PollCont object However, the private data of thread group are different. Sharing thread_data_used cause the waste of space. was: NetHandler has a method: _close_vc , It is called by InactivityCop. first, create a dummy Event in stack, then call UnixNetVConnection::mainEvent by vc->handleEvent(EVENT_IMMEDIATE, ); the handleEvent is mainEvent here. In the UnixNetVConnection::mainEvent code: ``` int UnixNetVConnection::mainEvent(int event, Event *e) { ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL); ink_assert(thread == this_ethread()); MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread); MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, e->ethread); MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : e->ethread->mutex, e->ethread); if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() || (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) || (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) { #ifdef INACTIVITY_TIMEOUT if (e == active_timeout) #endif e->schedule_in(HRTIME_MSECONDS(net_retry_delay)); return EVENT_CONT; } ``` the dummy Event would be schedule_in into Event System by e->schedule_in(HRTIME_MSECONDS(net_retry_delay)); I think we should move the schedule_in into the INACTIVITY_TIMEOUT macro. ``` #ifdef INACTIVITY_TIMEOUT if (e == active_timeout) e->schedule_in(HRTIME_MSECONDS(net_retry_delay)); #endif ``` I'm try to allocate a Event instead dummy Event, but meet Event System callback on a deallocated UnixNetVConnection. due to NetHandler called close_UnixNetVConnection before Event System callback the Event by schedule_in. 
In NetHandler::_close_vc, depend the return value (EVENT_DONE or EVENT_CONT) from UnixNetVConnection::mainEvent, to do ++handle_event; or not. ``` if (vc->handleEvent(EVENT_IMMEDIATE, ) == EVENT_DONE) ++handle_event; ``` the 3 MUTEX_TRY_LOCK not always success on InactivityCop callback, due to the mutex of ServerSessionVC may different from ClientSessionVC. Only mutex of ServerSession is set to HttpSM when HttpSM pick up a Server Session from SessionPool. ServerSessionVC still keep the old mutex. > Set an independent thread_data_used for each thread group instead of sharing > one thread_data_used > - > > Key: TS-4613 > URL: https://issues.apache.org/jira/browse/TS-4613 > Project: Traffic Server > Issue Type: Bug > Components: Core >Reporter: Oknet Xu >Assignee: Oknet Xu > Fix For: 7.0.0 > > > "thread_data_used" indicate the usage of EThread::thread_private[ ]. > The EThread::thread_private[ ] saved thread specific data e.g. : > - stat system arrays > - NetHandler object > - PollCont object > However, the private data of thread group are different. > Sharing thread_data_used cause the waste of space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4613) Set an independent thread_data_used for each thread group instead of sharing one thread_data_used
[ https://issues.apache.org/jira/browse/TS-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4613: - Summary: Set an independent thread_data_used for each thread group instead of sharing one thread_data_used (was: In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback ) > Set an independent thread_data_used for each thread group instead of sharing > one thread_data_used > - > > Key: TS-4613 > URL: https://issues.apache.org/jira/browse/TS-4613 > Project: Traffic Server > Issue Type: Bug > Components: Core >Reporter: Oknet Xu >Assignee: Oknet Xu > Fix For: 7.0.0 > > > NetHandler has a method: _close_vc , It is called by InactivityCop. > first, create a dummy Event in stack, > then call UnixNetVConnection::mainEvent by vc->handleEvent(EVENT_IMMEDIATE, > ); > the handleEvent is mainEvent here. > In the UnixNetVConnection::mainEvent code: > ``` > int > UnixNetVConnection::mainEvent(int event, Event *e) > { > ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL); > ink_assert(thread == this_ethread()); > MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread); > MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, > e->ethread); > MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : > e->ethread->mutex, e->ethread); > if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() || > (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) || > (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) { > #ifdef INACTIVITY_TIMEOUT > if (e == active_timeout) > #endif > e->schedule_in(HRTIME_MSECONDS(net_retry_delay)); > return EVENT_CONT; > } > ``` > the dummy Event would be schedule_in into Event System by > e->schedule_in(HRTIME_MSECONDS(net_retry_delay)); > I think we should move the schedule_in into the INACTIVITY_TIMEOUT macro. 
> ``` > #ifdef INACTIVITY_TIMEOUT > if (e == active_timeout) > e->schedule_in(HRTIME_MSECONDS(net_retry_delay)); > #endif > ``` > I'm try to allocate a Event instead dummy Event, but meet Event System > callback on a deallocated UnixNetVConnection. > due to NetHandler called close_UnixNetVConnection before Event System > callback the Event by schedule_in. > In NetHandler::_close_vc, depend the return value (EVENT_DONE or EVENT_CONT) > from UnixNetVConnection::mainEvent, to do ++handle_event; or not. > ``` > if (vc->handleEvent(EVENT_IMMEDIATE, ) == EVENT_DONE) > ++handle_event; > ``` > the 3 MUTEX_TRY_LOCK not always success on InactivityCop callback, > due to the mutex of ServerSessionVC may different from ClientSessionVC. > Only mutex of ServerSession is set to HttpSM when HttpSM pick up a Server > Session from SessionPool. ServerSessionVC still keep the old mutex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4613) Set an independent thread_data_used for each thread group instead of sharing one thread_data_used
[ https://issues.apache.org/jira/browse/TS-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4613: - Component/s: (was: Network) > Set an independent thread_data_used for each thread group instead of sharing > one thread_data_used > - > > Key: TS-4613 > URL: https://issues.apache.org/jira/browse/TS-4613 > Project: Traffic Server > Issue Type: Bug > Components: Core >Reporter: Oknet Xu >Assignee: Oknet Xu > Fix For: 7.0.0 > > > NetHandler has a method: _close_vc , It is called by InactivityCop. > first, create a dummy Event in stack, > then call UnixNetVConnection::mainEvent by vc->handleEvent(EVENT_IMMEDIATE, > ); > the handleEvent is mainEvent here. > In the UnixNetVConnection::mainEvent code: > ``` > int > UnixNetVConnection::mainEvent(int event, Event *e) > { > ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL); > ink_assert(thread == this_ethread()); > MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread); > MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, > e->ethread); > MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : > e->ethread->mutex, e->ethread); > if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() || > (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) || > (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) { > #ifdef INACTIVITY_TIMEOUT > if (e == active_timeout) > #endif > e->schedule_in(HRTIME_MSECONDS(net_retry_delay)); > return EVENT_CONT; > } > ``` > the dummy Event would be schedule_in into Event System by > e->schedule_in(HRTIME_MSECONDS(net_retry_delay)); > I think we should move the schedule_in into the INACTIVITY_TIMEOUT macro. > ``` > #ifdef INACTIVITY_TIMEOUT > if (e == active_timeout) > e->schedule_in(HRTIME_MSECONDS(net_retry_delay)); > #endif > ``` > I'm try to allocate a Event instead dummy Event, but meet Event System > callback on a deallocated UnixNetVConnection. 
> due to NetHandler called close_UnixNetVConnection before Event System > callback the Event by schedule_in. > In NetHandler::_close_vc, depend the return value (EVENT_DONE or EVENT_CONT) > from UnixNetVConnection::mainEvent, to do ++handle_event; or not. > ``` > if (vc->handleEvent(EVENT_IMMEDIATE, ) == EVENT_DONE) > ++handle_event; > ``` > the 3 MUTEX_TRY_LOCK not always success on InactivityCop callback, > due to the mutex of ServerSessionVC may different from ClientSessionVC. > Only mutex of ServerSession is set to HttpSM when HttpSM pick up a Server > Session from SessionPool. ServerSessionVC still keep the old mutex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4337) Pickup wrong dest ip or port while parsing CONNECT Method with use_client_target_addr = 2
[ https://issues.apache.org/jira/browse/TS-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4337: - Description: add config line into records.config {code} CONFIG proxy.config.http.use_client_target_addr INT 2 {code} Setup ATS working on bridge mode with ebtables/iptables, and enable tr-full on 8080 port. send a HTTP CONNECT request. {code} telnet 200.x.y.10 8080 CONNECT 220.181.111.188:443 HTTP/1.1 {code} the ip address 200.x.y.10 is a public http proxy address. Snip contents from traffic.out {code} + Proxy's Request + -- State Machine Id: 578 CONNECT HTTP/1.1 Client-ip: 172.22.70.66 X-Forwarded-For: 172.22.70.66 Via: http/1.0 debian[AC166F6E] (ApacheTrafficServer/6.0.0) Host: 220.181.111.188:443 [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http_trans) Next action next; HttpTransact::HandleResponse [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) [578] State Transition: SM_ACTION_API_OS_DNS -> SM_ACTION_ORIGIN_SERVER_RAW_OPEN [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http_track) entered inside do_http_server_open ][IPv4] [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) [578] open connection to 220.181.111.188: 200.x.y.10:443 [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http_seq) [HttpSM::do_http_server_open] Sending request to server [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) calling netProcessor.connect_s {code} please notice on the below line: {code} [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) [578] open connection to 220.181.111.188: 200.x.y.10:443 {code} with "use_client_target_addr INT 2", ATS does not do name resolve and pickup dest ip directly from TCP layer but still pickup dest port from HTTP request. In a tr-full mode, does ATS should tunnel the CONNECT method to a remote proxy ? just think it is a one shot parent proxy, only for this tcp connection. or other behaviour ? 
was: add config line into records.config {code} CONFIG proxy.config.http.use_client_target_addr INT 2 {code} Setup ATS working on bridge mode with ebtables/iptables, and enable tr-full on 8080 port. send a HTTP CONNECT request. {code} telnet 200.x.y.10 8080 CONNECT 220.181.111.188:443 HTTP/1.1 {code} the ip address 200.x.y.10 is a public http proxy address. Snip contents from traffic.out {code} + Proxy's Request + -- State Machine Id: 578 CONNECT HTTP/1.1 Client-ip: 172.22.70.66 X-Forwarded-For: 172.22.70.66 Via: http/1.0 debian[AC166F6E] (ApacheTrafficServer/6.0.0) Host: 220.181.111.188:443 [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http_trans) Next action next; HttpTransact::HandleResponse [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) [578] State Transition: SM_ACTION_API_OS_DNS -> SM_ACTION_ORIGIN_SERVER_RAW_OPEN [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http_track) entered inside do_http_server_open ][IPv4] [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) [578] open connection to 220.181.111.188: 111.13.56.28:443 [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http_seq) [HttpSM::do_http_server_open] Sending request to server [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) calling netProcessor.connect_s {code} please notice on the below line: {code} [Apr 7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) [578] open connection to 220.181.111.188: 200.x.y.10:443 {code} with "use_client_target_addr INT 2", ATS does not do name resolve and pickup dest ip directly from TCP layer but still pickup dest port from HTTP request. In a tr-full mode, does ATS should tunnel the CONNECT method to a remote proxy ? just think it is a one shot parent proxy, only for this tcp connection. or other behaviour ? 
> Pickup wrong dest ip or port while parsing CONNECT Method with > use_client_target_addr = 2 > - > > Key: TS-4337 > URL: https://issues.apache.org/jira/browse/TS-4337 > Project: Traffic Server > Issue Type: Bug > Components: HTTP >Reporter: Oknet Xu > Fix For: 7.1.0 > > > add config line into records.config > {code} > CONFIG proxy.config.http.use_client_target_addr INT 2 > {code} > Setup ATS working on bridge mode with ebtables/iptables, > and enable tr-full on 8080 port. > send a HTTP CONNECT request. > {code} > telnet 200.x.y.10 8080 > CONNECT 220.181.111.188:443 HTTP/1.1 > {code} > the ip address 200.x.y.10 is a public http proxy address. > Snip contents from traffic.out > {code} > + Proxy's Request + > -- State Machine Id: 578 > CONNECT HTTP/1.1 > Client-ip: 172.22.70.66 > X-Forwarded-For: 172.22.70.66 > Via: http/1.0 debian[AC166F6E] (ApacheTrafficServer/6.0.0) > Host: 220.181.111.188:443 > [Apr 7 16:58:49.996] Server {0x2b6176b8a700}
[jira] [Updated] (TS-4522) Should signal SM with EVENT_ERROR on error in write_to_net_io(), EVENT_EOS only signaled from read_from_net()
[ https://issues.apache.org/jira/browse/TS-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4522: - Description: The 1st Problem: "r" holds the return value from write(). "r == 0" (or "!r") does not mean EOS, because on a closed socket fd: - read(socketfd) returns 0 - write(socketfd) fails with EPIPE In write_to_net_io, we check the return value of write() the same way as for read(). {code} if (!r || r == -ECONNRESET) { {code} It is a copy & paste bug. Because of the bug, no VC_EVENT_EOS is called back from write_to_net_io on a closed socket; VC_EVENT_ERROR is signaled instead. full code here: {code} int64_t r = vc->load_buffer_and_write(towrite, buf, total_written, needs); if (total_written > 0) { NET_SUM_DYN_STAT(net_write_bytes_stat, total_written); s->vio.ndone += total_written; } // check for errors if (r <= 0) { // if the socket was not ready,add to WaitList if (r == -EAGAIN || r == -ENOTCONN) { NET_INCREMENT_DYN_STAT(net_calls_to_write_nodata_stat); if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { vc->write.triggered = 0; nh->write_ready_list.remove(vc); write_reschedule(nh, vc); } if ((needs & EVENTIO_READ) == EVENTIO_READ) { vc->read.triggered = 0; nh->read_ready_list.remove(vc); read_reschedule(nh, vc); } return; } if (!r || r == -ECONNRESET) { vc->write.triggered = 0; write_signal_done(VC_EVENT_EOS, nh, vc); return; } vc->write.triggered = 0; write_signal_error(nh, vc, (int)-total_written); return; {code} The 2nd Problem: In iocore/net/I_NetVConnection.h, the comments for do_io_write: {code} /** Initiates write. Thread-safe, may be called when not handling an event from the NetVConnection, or the NetVConnection creation callback. Callbacks: non-reentrant, c's lock taken during callbacks. c->handleEvent(VC_EVENT_WRITE_READY, vio) signifies data has written from the reader or there are no bytes available for the reader to write. 
c->handleEvent(VC_EVENT_WRITE_COMPLETE, vio) signifies the amount of data indicated by nbytes has been read from the buffer c->handleEvent(VC_EVENT_ERROR, vio) signified that error occurred during write. The vio returned during callbacks is the same as the one returned by do_io_write(). The vio can be changed only during call backs from the vconnection. The vconnection deallocates the reader when it is destroyed. @param c continuation to be called back after (partial) write @param nbytes no of bytes to write, if unknown must be set to INT64_MAX @param buf source of data @param owner @return vio pointer */ virtual VIO *do_io_write(Continuation *c, int64_t nbytes, IOBufferReader *buf, bool owner = false) = 0; {code} Only 3 events are documented: - VC_EVENT_WRITE_READY - VC_EVENT_WRITE_COMPLETE - VC_EVENT_ERROR The code {code}write_signal_done(VC_EVENT_EOS, nh, vc);{code} should not be here (write_to_net_io). was: On a closed socket fd: read(socketfd) return 0 write(socketfd) return EPIPE In the write_to_net_io, we check the return value of write() with the same way to read(). {code} if (!r || r == -ECONNRESET) { {code} The bug makes no VC_EVENT_EOS callbacked while write_to_net_io, but VC_EVENT_ERROR instead. 
full code here: {code} int64_t r = vc->load_buffer_and_write(towrite, buf, total_written, needs); if (total_written > 0) { NET_SUM_DYN_STAT(net_write_bytes_stat, total_written); s->vio.ndone += total_written; } // check for errors if (r <= 0) { // if the socket was not ready,add to WaitList if (r == -EAGAIN || r == -ENOTCONN) { NET_INCREMENT_DYN_STAT(net_calls_to_write_nodata_stat); if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { vc->write.triggered = 0; nh->write_ready_list.remove(vc); write_reschedule(nh, vc); } if ((needs & EVENTIO_READ) == EVENTIO_READ) { vc->read.triggered = 0; nh->read_ready_list.remove(vc); read_reschedule(nh, vc); } return; } if (!r || r == -ECONNRESET) { vc->write.triggered = 0; write_signal_done(VC_EVENT_EOS, nh, vc); return; } vc->write.triggered = 0; write_signal_error(nh, vc, (int)-total_written); return; {code} > Should signal SM with EVENT_ERROR on error in write_to_net_io(), EVENT_EOS > only signaled from read_from_net() > - > > Key: TS-4522 > URL: https://issues.apache.org/jira/browse/TS-4522 > Project:
[jira] [Updated] (TS-4522) Should signal SM with EVENT_ERROR on error in write_to_net_io(), EVENT_EOS only signaled from read_from_net()
[ https://issues.apache.org/jira/browse/TS-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4522: - Summary: Should signal SM with EVENT_ERROR on error in write_to_net_io(), EVENT_EOS only signaled from read_from_net() (was: did not check EPIPE on write_to_net_io) > Should signal SM with EVENT_ERROR on error in write_to_net_io(), EVENT_EOS > only signaled from read_from_net() > - > > Key: TS-4522 > URL: https://issues.apache.org/jira/browse/TS-4522 > Project: Traffic Server > Issue Type: Bug > Components: Core, Network >Reporter: Oknet Xu >Assignee: Oknet Xu > Fix For: 7.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > On a closed socket fd: > read(socketfd) return 0 > write(socketfd) return EPIPE > In the write_to_net_io, we check the return value of write() with the same > way to read(). > {code} > if (!r || r == -ECONNRESET) { > {code} > The bug makes no VC_EVENT_EOS callbacked while write_to_net_io, but > VC_EVENT_ERROR instead. > full code here: > {code} > int64_t r = vc->load_buffer_and_write(towrite, buf, total_written, needs); > if (total_written > 0) { > NET_SUM_DYN_STAT(net_write_bytes_stat, total_written); > s->vio.ndone += total_written; > } > // check for errors > if (r <= 0) { // if the socket was not ready,add to WaitList > if (r == -EAGAIN || r == -ENOTCONN) { > NET_INCREMENT_DYN_STAT(net_calls_to_write_nodata_stat); > if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { > vc->write.triggered = 0; > nh->write_ready_list.remove(vc); > write_reschedule(nh, vc); > } > if ((needs & EVENTIO_READ) == EVENTIO_READ) { > vc->read.triggered = 0; > nh->read_ready_list.remove(vc); > read_reschedule(nh, vc); > } > return; > } > if (!r || r == -ECONNRESET) { > vc->write.triggered = 0; > write_signal_done(VC_EVENT_EOS, nh, vc); > return; > } > vc->write.triggered = 0; > write_signal_error(nh, vc, (int)-total_written); > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
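The behaviour asked for in the TS-4522 summary (errors on the write side are reported as errors, and end-of-stream is left to the read side) can be sketched as a small classifier. This is a sketch under assumptions, not the actual fix: WriteOutcome and classify_write_result() are hypothetical names, and treating a result of 0 as an error is an assumption made for the example.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>

// Hypothetical outcome labels; the real code signals VC_EVENT_* values.
enum class WriteOutcome { Progress, Retry, Error };

// `r` is the result of a write attempt: bytes written, or a negative errno.
WriteOutcome classify_write_result(int64_t r)
{
  if (r > 0) {
    return WriteOutcome::Progress; // some bytes were written
  }
  if (r == -EAGAIN || r == -ENOTCONN) {
    return WriteOutcome::Retry;    // socket not ready, reschedule the write
  }
  // 0, -ECONNRESET, -EPIPE and any other failure: report an error.
  // The write side never reports end-of-stream in this sketch.
  return WriteOutcome::Error;
}
```

Keeping the classification in one function makes it visible that no branch on the write path maps to an EOS event, which matches the three events documented for do_io_write.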
[jira] [Commented] (TS-4322) ProfileSM Proposal
[ https://issues.apache.org/jira/browse/TS-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426171#comment-15426171 ] Oknet Xu commented on TS-4322: -- Yes, VC migration logic could. But the ProfileSM is a better design for ATS, and it does not conflict with the VC migration logic. From my understanding of the code, the NetVC has the below inheritance: SSLNetVConnection <-- UnixNetVConnection <-- NetVConnection <-- VConnection A long time ago, there was a class NTNetVConnection (it no longer exists); I guess its inheritance was: SSLNetVConnection <-- NTNetVConnection <-- NetVConnection <-- VConnection Thus SSLNetVConnection had 2 versions, one derived from UnixNetVConnection and another derived from NTNetVConnection. I noticed there are two kinds of header file prefix in iocore: "P_" and "I_". The P_ prefix means the header file defines private interfaces and variables only. The I_ prefix means the header file defines public interfaces and variables. The class NetVConnection is defined in I_NetVConnection.h, thus it is an interface used by HttpSM. The class UnixNetVConnection is defined in P_UnixNetVConnection.h, thus it is one implementation of NetVConnection. Likewise, NTNetVConnection was another implementation. This design lets HttpSM use NetVConnection directly regardless of the operating system (Windows or Unix). But SSLNetVConnection breaks the design (6.0.x branch). NetVConnection is designed to be a resource handle, but it has no abstraction for I/O operations. The ProfileSM is designed to abstract I/O operations on NetVConnection. The TcpProfileSM is the implementation for NetVConnection over TCP. The SslProfileSM is the implementation for NetVConnection over TCP with SSL. I have already completed TcpProfileSM and SslProfileSM. Now I am working on the SocksProfileSM, which is used to verify this design. 
> ProfileSM Proposal > -- > > Key: TS-4322 > URL: https://issues.apache.org/jira/browse/TS-4322 > Project: Traffic Server > Issue Type: Improvement > Components: Core, Network >Reporter: Oknet Xu > Fix For: sometime > > Attachments: ATS SslProfileSMv1.png, ATS TcpProfileSM.png > > > Preface > === > NetVConnection is a base class for all NetIO derived classes: > - NetVConnection > - UnixNetVConnection for TCP NetIO > - SSLNetVConnection for SSL NetIO > The code below tests whether a NetVC is an SSLNetVC: > {code} > sslvc = dynamic_cast<SSLNetVConnection *>(netvc); > if (sslvc != NULL) > { > // netvc is a SSLNetVConnection > } else { > // netvc is a UnixNetVConnection > } > {code} > ATS supports the HTTP, SPDY and H2 protocols, and also supports them over SSL/TLS. > Sometimes we want to talk HTTP/TCP first, and then talk HTTP/SSL. > Example: HTTPS over HTTP CONNECT method > {code} > Client send a CONNECT method request > C->P: CONNECT www.example.com:443 HTTP/1.1 > C->P: Host: www.example.com:443 > C->P: > ATS reply a HTTP 200/OK, then build a TCP tunnel to www.example.com:443 > P->C: 200 OK > P->C: > Client send a SSL Handshake Client Hello message > C->P: > ATS tunnel the message > P->S: > Server response a SSL Handshake Server Hello message > P<-S: > ATS tunnel the message > C<-P: > Server send a Certificate to ATS > P<-S: > ATS tunnel the message > C<-P: > etc . . . > {code} > Currently, there is no easy way to upgrade from UnixNetVConnection to > SSLNetVConnection. > The ProfileSM is designed to set up a pluggable mechanism for > UnixNetVConnection to handle (abstract) different types of I/O operation. > So we will have TcpProfileSM and UdpProfileSM as low level ProfileSMs and > SslProfileSM as a high level ProfileSM. 
> How to implement
>
> Introduce the new classes ProfileSM, TcpProfileSM and SslProfileSM:
> ProfileSM is a class derived from Continuation
> - Has handleEvent() function
> - Has mutex member
> TcpProfileSM is a class derived from ProfileSM
> SslProfileSM is a class derived from ProfileSM
> handshakeEvent(int event, void *data) function
> - only defined in SslProfileSM
> - the SSL handshake handler function
> - `event' can be IOCORE_EVENTS_READ or IOCORE_EVENTS_WRITE
> - it is called back from NetHandler::mainNetEvent()
> - `data' is a pointer to the NetHandler type
> - it implements NPN/ALPN support and replaces SSLNextProtocolAccept & SSLNextProtocolTrampoline, picking some code from sslvc->net_read_io() and write_to_net_io()
> - sets Continuation->handler to mainEvent() when the handshake is done.
> mainEvent(int event, void *data) function
> - the first entry point
> - `event' can be IOCORE_EVENTS_READ or IOCORE_EVENTS_WRITE
> - it is called back from NetHandler::mainNetEvent()
> - `data' is a pointer to the NetHandler type
> handle_read(NetHandler *nh, EThread *lthread)
> -
[jira] [Commented] (TS-4475) Crash in Log-Collation client after using inactivity-cop.
[ https://issues.apache.org/jira/browse/TS-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420710#comment-15420710 ]

Oknet Xu commented on TS-4475:
--

Log-Collation has LogCollationHostSM and LogCollationClientSM: LogCollationHostSM handles the server-side netvc and LogCollationClientSM handles the client side.
We should do the same thing for LogCollationHostSM. What do you think, [~sudheerv]? Can you include it in your PR#831?

> Crash in Log-Collation client after using inactivity-cop.
> -
>
> Key: TS-4475
> URL: https://issues.apache.org/jira/browse/TS-4475
> Project: Traffic Server
> Issue Type: Bug
> Components: Logging
> Affects Versions: 6.1.1
> Reporter: Peter Chou
> Fix For: sometime
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Background: We recently tried making use of inactivity-cop by setting it to
> 300s instead of the default one-day setting. This was to address an issue
> where, under heavy load, ATS would become un-responsive to client requests,
> and the condition would persist after traffic was stopped with the active
> queue saying 0 connections but 'netstat -na' showing a bunch of established
> connections (up to the throttle limit approximately).
> Inactivity cop seemed to help ATS handle this situation, but we have since
> experienced a couple of core dumps over the last four day period. It seems
> occasionally the Log Collation Client State Machine will have event value 105
> or VC_EVENT_INACTIVITY_TIMEOUT, but when it reaches read_signal_and_update()
> it tries to call the continuation handler which down the line does not know
> about this event thus causing core dump !"unexpcted state" [sic].
> Here is the back-trace --
> (gdb) bt
> #0 0x2b67cd5405f7 in raise () from /lib64/libc.so.6
> #1 0x2b67cd541e28 in abort () from /lib64/libc.so.6
> #2 0x2b67cb032921 in ink_die_die_die () at ink_error.cc:43
> #3 0x2b67cb0329da in ink_fatal_va (fmt=0x2b67cb0442dc "%s:%d: failed assert `%s`", ap=0x7ffc690e7ba8) at ink_error.cc:65
> #4 0x2b67cb032a79 in ink_fatal (message_format=0x2b67cb0442dc "%s:%d: failed assert `%s`") at ink_error.cc:73
> #5 0x2b67cb0305a6 in _ink_assert (expression=0x7fb422 "!\"unexpcted state\"", file=0x7fb35b "LogCollationClientSM.cc", line=445) at ink_assert.cc:37
> #6 0x0069c86b in LogCollationClientSM::client_idle (this=0x2b681400bb00, event=105) at LogCollationClientSM.cc:445
> #7 0x0069b427 in LogCollationClientSM::client_handler (this=0x2b681400bb00, event=105, data=0x2b680c017020) at LogCollationClientSM.cc:119
> #8 0x00502cc6 in Continuation::handleEvent (this=0x2b681400bb00, event=105, data=0x2b680c017020) at ../iocore/eventsystem/I_Continuation.h:153
> #9 0x00783d40 in read_signal_and_update (event=105, vc=0x2b680c016f00) at UnixNetVConnection.cc:150
> #10 0x00787a22 in UnixNetVConnection::mainEvent (this=0x2b680c016f00, event=1, e=0x127ad60) at UnixNetVConnection.cc:1188
> #11 0x00502cc6 in Continuation::handleEvent (this=0x2b680c016f00, event=1, data=0x127ad60) at ../iocore/eventsystem/I_Continuation.h:153
> #12 0x0077d943 in InactivityCop::check_inactivity (this=0x1209a00, event=2, e=0x127ad60) at UnixNet.cc:102
> #13 0x00502cc6 in Continuation::handleEvent (this=0x1209a00, event=2, data=0x127ad60) at ../iocore/eventsystem/I_Continuation.h:153
> #14 0x007a5df6 in EThread::process_event (this=0x2b67cf7bb010, e=0x127ad60, calling_code=2) at UnixEThread.cc:128
> #15 0x007a61f5 in EThread::execute (this=0x2b67cf7bb010) at UnixEThread.cc:207
> #16 0x00534430 in main (argv=0x7ffc690e82e8) at Main.cc:1918
> I believe it takes a wrong turn here --
> #9 0x00783d40 in read_signal_and_update (event=105, vc=0x2b680c016f00) at UnixNetVConnection.cc:150
> 150 vc->read.vio._cont->handleEvent(event, &vc->read.vio);
> (gdb) list
> 145 static inline int
> 146 read_signal_and_update(int event, UnixNetVConnection *vc)
> 147 {
> 148   vc->recursion++;
> 149   if (vc->read.vio._cont) {
> 150     vc->read.vio._cont->handleEvent(event, &vc->read.vio);
> 151   } else {
> 152     switch (event) {
> 153     case VC_EVENT_EOS:
> 154     case VC_EVENT_ERROR:
> (gdb) list
> 155     case VC_EVENT_ACTIVE_TIMEOUT:
> 156     case VC_EVENT_INACTIVITY_TIMEOUT:
> 157       Debug("inactivity_cop", "event %d: null read.vio cont, closing vc %p", event, vc);
> 158       vc->closed = 1;
> 159       break;
> 160     default:
> 161       Error("Unexpected event %d for vc %p", event, vc);
> 162       ink_release_assert(0);
> 163       break;
> 164     }
> Note: I understand that
[jira] [Commented] (TS-4475) Crash in Log-Collation client after using inactivity-cop.
[ https://issues.apache.org/jira/browse/TS-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405696#comment-15405696 ]

Oknet Xu commented on TS-4475:
--

[~pbchou] You should handle the Active Timeout and Inactivity Timeout events in Log-Collation; ignoring a timeout event would leave a dead netvc in ATS.
You should also specify a standalone timeout value for Log-Collation.

> Crash in Log-Collation client after using inactivity-cop.
> -
>
> Key: TS-4475
> URL: https://issues.apache.org/jira/browse/TS-4475
> Project: Traffic Server
> Issue Type: Bug
> Components: Logging
> Affects Versions: 6.1.1
> Reporter: Peter Chou
> Fix For: sometime
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Note: I understand that there were several issues related to TS-3196
> concerning inactivity_cop and this section of
[jira] [Commented] (TS-4475) Crash in Log-Collation client after using inactivity-cop.
[ https://issues.apache.org/jira/browse/TS-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401947#comment-15401947 ]

Oknet Xu commented on TS-4475:
--

According to the source file /proxy/logging/LogCollationClientSM.cc, LogCollationClientSM::client_idle() handles the LOG_COLL_CLIENT_IDLE state.
The netvc managed by LogCollationClientSM is a persistent connection; client_idle() handles EOS for do_io_read and ERROR for do_io_write.
Thus there is no timeout event, only an idle state, and InactivityCop should not check the timeout for a netvc managed by LogCollationClientSM.

Consider that LogCollationHostSM::read_hdr() would receive the TIMEOUT event from InactivityCop too, but it calls host_handler() with the LOG_COLL_EVENT_ERROR event and does not assert.
host_handler() only logs the error event and calls host_done() to close the netvc. Thus there is no crash report from LogCollationHostSM.

A possible solution:
Step 1. Handle the timeout event in LogCollationClientSM::client_idle() and some other states, and call client_fail() to close the timed-out netvc.
Step 2. Set a standalone inactivity timeout value for the log-collation netvc to keep the connection in an idle state.

PR831 does not call client_fail() to close the timed-out netvc.
[~pbchou], can you replace "return EVENT_CONT" with "return client_fail(LOG_COLL_EVENT_SWITCH, NULL);" for VC_EVENT_INACTIVITY_TIMEOUT?

> Crash in Log-Collation client after using inactivity-cop.
> -
>
> Key: TS-4475
> URL: https://issues.apache.org/jira/browse/TS-4475
> Project: Traffic Server
> Issue Type: Bug
> Components: Logging
> Affects Versions: 6.1.1
> Reporter: Peter Chou
> Fix For: sometime
>
> Time Spent: 50m
> Remaining Estimate: 0h
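The "Step 1" idea from the TS-4475 comment can be sketched with a mock state machine. This is not the LogCollationClientSM source: MockClientSM, its fields, and every event value except 105 (taken from the back-trace) are assumptions for illustration. The point is only that an idle-state handler treats timeouts like EOS or ERROR and closes the connection instead of asserting.

```cpp
#include <cassert>

// Mock event codes. Only 105 (inactivity timeout) comes from the back-trace;
// the other values are arbitrary placeholders.
enum MockEvent {
  MOCK_EVENT_EOS                = 1,
  MOCK_EVENT_ERROR              = 2,
  MOCK_EVENT_ACTIVE_TIMEOUT     = 3,
  MOCK_EVENT_UNKNOWN            = 4,
  MOCK_EVENT_INACTIVITY_TIMEOUT = 105
};

enum MockState { MOCK_IDLE, MOCK_FAIL };

// Hypothetical stand-in for a client state machine that owns one connection.
struct MockClientSM {
  MockState state;
  bool netvc_open;
  int unexpected; // counts events the idle state does not know about

  MockClientSM() : state(MOCK_IDLE), netvc_open(true), unexpected(0) {}

  // Close the connection and move to the failure state.
  int client_fail()
  {
    netvc_open = false;
    state      = MOCK_FAIL;
    return 0;
  }

  // Idle-state handler: timeouts are handled like EOS/ERROR instead of
  // falling through to an "unexpected state" assertion.
  int client_idle(int event)
  {
    switch (event) {
    case MOCK_EVENT_EOS:
    case MOCK_EVENT_ERROR:
    case MOCK_EVENT_ACTIVE_TIMEOUT:
    case MOCK_EVENT_INACTIVITY_TIMEOUT:
      return client_fail();
    default:
      ++unexpected;
      return 0;
    }
  }
};
```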
[jira] [Created] (TS-4705) Proposal: NetVC Context
Oknet Xu created TS-4705:

Summary: Proposal: NetVC Context
Key: TS-4705
URL: https://issues.apache.org/jira/browse/TS-4705
Project: Traffic Server
Issue Type: Improvement
Components: Core
Reporter: Oknet Xu

Goal 1: NetVConnection has the get_local_addr() and get_remote_addr() methods, and also the members local_addr, remote_addr and netvc->con.addr. Thus we should use netvc->con.addr or remote_addr to replace the member server_addr in UnixNetVConnection.

Goal 2: SSLNetVConnection has the member sslClientConnection with two methods, setSSLClientConnection() and getSSLClientConnection(), to indicate whether ATS is the client or the server in an SSL session.

To abstract the two goals above, I am designing the netvc context function. A proxy has two sides: the client side (Client <-> Proxy) and the server side (Proxy <-> Server). The netvc context function indicates which side the NetVC is working on.

Goal 3: Fix a minor bug in NetAccept::do_blocking_accept: call check_emergency_throttle(con) first, then allocate the vc.

Goal 4: Optimize NetAccept, remove duplicate code, etc.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
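A rough sketch of what such a context could look like follows. The names NetVCContext, MockNetVC, set_context() and acts_as_ssl_client() are hypothetical, and the rule that the server side implies an SSL client role is an assumption drawn from Goal 2, not something taken from the ATS source.

```cpp
#include <cassert>

// Hypothetical context values: which side of the proxy a connection serves.
enum class NetVCContext {
  Unset,      // not decided yet
  ClientSide, // Client <-> Proxy: the proxy accepted this connection
  ServerSide  // Proxy <-> Server: the proxy opened this connection
};

// Mock connection carrying a context instead of a separate SSL-client flag.
class MockNetVC {
public:
  MockNetVC() : ctx_(NetVCContext::Unset) {}

  NetVCContext context() const { return ctx_; }

  // The context is set once; a second attempt is rejected.
  bool set_context(NetVCContext c)
  {
    if (ctx_ != NetVCContext::Unset || c == NetVCContext::Unset) {
      return false;
    }
    ctx_ = c;
    return true;
  }

  // Assumption: on the server side the proxy initiates the session,
  // so it plays the client role in an SSL handshake.
  bool acts_as_ssl_client() const { return ctx_ == NetVCContext::ServerSide; }

private:
  NetVCContext ctx_;
};
```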
[jira] [Commented] (TS-4614) In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback
[ https://issues.apache.org/jira/browse/TS-4614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397182#comment-15397182 ]

Oknet Xu commented on TS-4614:
--

What should I do for the backport?

> In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback
> ---
>
> Key: TS-4614
> URL: https://issues.apache.org/jira/browse/TS-4614
> Project: Traffic Server
> Issue Type: Bug
> Components: Cop
> Reporter: Oknet Xu
> Assignee: Oknet Xu
> Fix For: 7.0.0
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> NetHandler has a method, _close_vc, which is called by InactivityCop.
> It first creates a dummy Event on the stack,
> then calls UnixNetVConnection::mainEvent via vc->handleEvent(EVENT_IMMEDIATE, &event);
> handleEvent is mainEvent here.
> In the UnixNetVConnection::mainEvent code:
> ```
> int
> UnixNetVConnection::mainEvent(int event, Event *e)
> {
>   ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL);
>   ink_assert(thread == this_ethread());
>   MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread);
>   MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, e->ethread);
>   MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : e->ethread->mutex, e->ethread);
>   if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() ||
>       (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) ||
>       (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) {
> #ifdef INACTIVITY_TIMEOUT
>     if (e == active_timeout)
> #endif
>       e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
>     return EVENT_CONT;
>   }
> ```
> The dummy Event would be scheduled into the Event System by
> e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> I think we should move the schedule_in inside the INACTIVITY_TIMEOUT macro.
> ```
> #ifdef INACTIVITY_TIMEOUT
>     if (e == active_timeout)
>       e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> #endif
> ```
> I tried to allocate an Event instead of using the dummy Event, but then the Event System
> called back on a deallocated UnixNetVConnection,
> because NetHandler called close_UnixNetVConnection before the Event System
> called back the Event scheduled by schedule_in.
> In NetHandler::_close_vc, the return value (EVENT_DONE or EVENT_CONT)
> from UnixNetVConnection::mainEvent decides whether to do ++handle_event; or not.
> ```
> if (vc->handleEvent(EVENT_IMMEDIATE, &event) == EVENT_DONE)
>   ++handle_event;
> ```
> The three MUTEX_TRY_LOCKs do not always succeed on the InactivityCop callback,
> because the mutex of the ServerSessionVC may differ from that of the ClientSessionVC.
> Only the mutex of the ServerSession is set to that of HttpSM when HttpSM picks up a
> ServerSession from the SessionPool. The ServerSessionVC still keeps the old mutex.
[jira] [Created] (TS-4697) MIOBuffer did not free if failed on ipallow check in HttpSessionAccept::accept()
Oknet Xu created TS-4697:

Summary: MIOBuffer did not free if failed on ipallow check in HttpSessionAccept::accept()
Key: TS-4697
URL: https://issues.apache.org/jira/browse/TS-4697
Project: Traffic Server
Issue Type: Bug
Components: HTTP, Network
Reporter: Oknet Xu

{code}
void
HttpSessionAccept::accept(NetVConnection *netvc, MIOBuffer *iobuf, IOBufferReader *reader)
{
  sockaddr const *client_ip   = netvc->get_remote_addr();
  const AclRecord *acl_record = NULL;
  ip_port_text_buffer ipb;
  IpAllow::scoped_config ipallow;

  // The backdoor port is now only bound to "localhost", so no
  // reason to check for if it's incoming from "localhost" or not.
  if (backdoor) {
    acl_record = IpAllow::AllMethodAcl();
  } else if (ipallow && (((acl_record = ipallow->match(client_ip)) == NULL) || (acl_record->isEmpty()))) {
    // if client address forbidden, close immediately //
    Warning("client '%s' prohibited by ip-allow policy", ats_ip_ntop(client_ip, ipb, sizeof(ipb)));
    netvc->do_io_close();
    return; // -> MIOBuffer did not free.
  }
  ...
{code}
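One way to avoid this kind of leak is to tie the buffer to a scope guard so that every early return releases it. The sketch below uses mock types only; MockIOBuffer, new_buffer(), free_buffer() and accept_connection() are hypothetical names, and this is not necessarily how the issue was fixed in ATS.

```cpp
#include <cassert>
#include <memory>

// Mock buffer plus a live-object counter so leaks are observable.
struct MockIOBuffer {
};
static int live_buffers = 0;

MockIOBuffer *
new_buffer()
{
  ++live_buffers;
  return new MockIOBuffer;
}

void
free_buffer(MockIOBuffer *b)
{
  if (b) {
    --live_buffers;
    delete b;
  }
}

// Deleter so std::unique_ptr releases the buffer through free_buffer().
struct BufferDeleter {
  void operator()(MockIOBuffer *b) const { free_buffer(b); }
};
typedef std::unique_ptr<MockIOBuffer, BufferDeleter> BufferGuard;

// Returns true when the session takes ownership of the buffer.
// On the rejection path the guard frees the buffer automatically.
bool
accept_connection(bool client_allowed, MockIOBuffer *iobuf, MockIOBuffer **session_buf)
{
  BufferGuard guard(iobuf);
  if (!client_allowed) {
    return false; // guard goes out of scope here and frees iobuf
  }
  *session_buf = guard.release(); // hand ownership to the session
  return true;
}
```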
[jira] [Updated] (TS-4612) Proposal: InactivityCop Optimize
[ https://issues.apache.org/jira/browse/TS-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu updated TS-4612:
-
Description:
Reviewing the processing of InactivityCop::check_inactivity():
1. get all local vcs from open_list
2. put them into cop_list
3. check whether each vc in cop_list has already timed out
4. call back vc->handleEvent to close the vc if it has timed out

InactivityCop and NetHandler share one mutex.
InactivityCop runs every second and NetHandler runs every 10 ms, which means NetHandler runs 100 times before the next InactivityCop run.
If a vc has a read/write in a NetHandler call, it will not time out in the next InactivityCop run.
Thus, if the vc has a read/write in NetHandler, we can move it out of cop_list, and the InactivityCop run would get better performance.

> Proposal: InactivityCop Optimize
> 
>
> Key: TS-4612
> URL: https://issues.apache.org/jira/browse/TS-4612
> Project: Traffic Server
> Issue Type: Bug
> Components: Cop
> Reporter: Oknet Xu
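The TS-4612 proposal (drop a vc from the cop list as soon as it sees I/O, so the next InactivityCop pass examines fewer entries) can be illustrated with a small mock. MockVC, MockCop and their methods are hypothetical names, time is modelled as plain integers, and this is one reading of the proposal rather than the ATS implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Mock connection with an absolute inactivity deadline (arbitrary time units).
struct MockVC {
  int id;
  long next_inactivity_timeout_at;
  bool closed;
  MockVC(int i, long t) : id(i), next_inactivity_timeout_at(t), closed(false) {}
};

class MockCop {
public:
  // Steps 1-2: snapshot the open list into the cop list.
  void fill(const std::vector<MockVC *> &open_list)
  {
    cop_list_.clear();
    cop_list_.insert(open_list.begin(), open_list.end());
  }

  // Called by the (mock) net handler when a vc had a read or write:
  // push the deadline forward and take the vc off the cop list.
  void on_io(MockVC *vc, long now, long timeout)
  {
    vc->next_inactivity_timeout_at = now + timeout;
    cop_list_.erase(vc);
  }

  // Steps 3-4: examine only the vcs still listed and close the expired ones.
  // Returns how many vcs were examined.
  std::size_t check(long now)
  {
    std::size_t examined = 0;
    for (MockVC *vc : cop_list_) {
      ++examined;
      if (vc->next_inactivity_timeout_at <= now) {
        vc->closed = true;
      }
    }
    return examined;
  }

private:
  std::unordered_set<MockVC *> cop_list_;
};
```

In this sketch a connection that was active between two cop passes costs nothing in the second pass, which is where the expected saving comes from.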
[jira] [Updated] (TS-4612) Proposal: InactivityCop Optimize
[ https://issues.apache.org/jira/browse/TS-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oknet Xu updated TS-4612:
-
Summary: Proposal: InactivityCop Optimize (was: In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback)

> Proposal: InactivityCop Optimize
> 
>
> Key: TS-4612
> URL: https://issues.apache.org/jira/browse/TS-4612
> Project: Traffic Server
> Issue Type: Bug
> Components: Cop
> Reporter: Oknet Xu
[jira] [Created] (TS-4614) In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback
Oknet Xu created TS-4614:

Summary: In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback
Key: TS-4614
URL: https://issues.apache.org/jira/browse/TS-4614
Project: Traffic Server
Issue Type: Bug
Components: Cop
Reporter: Oknet Xu
[jira] [Created] (TS-4612) In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback
Oknet Xu created TS-4612:

Summary: In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback
Key: TS-4612
URL: https://issues.apache.org/jira/browse/TS-4612
Project: Traffic Server
Issue Type: Bug
Components: Cop
Reporter: Oknet Xu
[jira] [Created] (TS-4613) In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback
Oknet Xu created TS-4613: Summary: In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback Key: TS-4613 URL: https://issues.apache.org/jira/browse/TS-4613 Project: Traffic Server Issue Type: Bug Components: Cop Reporter: Oknet Xu NetHandler has a method, _close_vc, which is called by InactivityCop. It first creates a dummy Event on the stack and then calls UnixNetVConnection::mainEvent through vc->handleEvent(EVENT_IMMEDIATE, ); here handleEvent dispatches to mainEvent. In the UnixNetVConnection::mainEvent code: ``` int UnixNetVConnection::mainEvent(int event, Event *e) { ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL); ink_assert(thread == this_ethread()); MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread); MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, e->ethread); MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : e->ethread->mutex, e->ethread); if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() || (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) || (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) { #ifdef INACTIVITY_TIMEOUT if (e == active_timeout) #endif e->schedule_in(HRTIME_MSECONDS(net_retry_delay)); return EVENT_CONT; } ``` When any of the locks cannot be taken, the dummy Event is handed to the Event System by e->schedule_in(HRTIME_MSECONDS(net_retry_delay)), even though it lives on the caller's stack. I think we should move the schedule_in inside the INACTIVITY_TIMEOUT #ifdef block. ``` #ifdef INACTIVITY_TIMEOUT if (e == active_timeout) e->schedule_in(HRTIME_MSECONDS(net_retry_delay)); #endif ``` I tried allocating an Event instead of using the dummy Event, but then the Event System called back on a deallocated UnixNetVConnection, because NetHandler had called close_UnixNetVConnection before the Event System delivered the Event scheduled by schedule_in. NetHandler::_close_vc uses the return value (EVENT_DONE or EVENT_CONT) of UnixNetVConnection::mainEvent to decide whether to do ++handle_event:
``` if (vc->handleEvent(EVENT_IMMEDIATE, ) == EVENT_DONE) ++handle_event; ``` The three MUTEX_TRY_LOCK calls do not always succeed in the InactivityCop callback, because the mutex of the ServerSessionVC may differ from that of the ClientSessionVC. When HttpSM picks up a Server Session from the SessionPool, only the mutex of the ServerSession is set to the HttpSM's mutex; the ServerSessionVC still keeps its old mutex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
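The hazard described in TS-4613 can be modeled outside Traffic Server. What follows is only a minimal sketch under stated assumptions: FakeEvent, FakeVC, main_event and close_vc are hypothetical stand-ins, not the real iocore classes, and the sketch models the case where INACTIVITY_TIMEOUT is defined. The retry is scheduled only when the event passed in is the VC's own timeout event, so a dummy event living on the caller's stack is never handed to the scheduler.

```cpp
// Hypothetical stand-ins for Event and UnixNetVConnection; not the real ATS types.
struct FakeEvent {
  bool scheduled = false; // set when the event would have been passed to schedule_in()
};

enum Result { RESULT_CONT, RESULT_DONE };

struct FakeVC {
  FakeEvent *active_timeout = nullptr; // the VC's own timer event, if any
  bool locks_available      = true;    // models the outcome of the three MUTEX_TRY_LOCK calls

  Result main_event(FakeEvent *e) {
    if (!locks_available) {
      // Retry through the scheduler only when 'e' is the VC's own timer.
      // A dummy event on the caller's stack must not outlive this call.
      if (e == active_timeout) {
        e->scheduled = true; // stands in for e->schedule_in(...)
      }
      return RESULT_CONT;
    }
    return RESULT_DONE;
  }
};

// Models NetHandler::_close_vc: a dummy event on the stack, counted only on DONE.
inline int close_vc(FakeVC &vc, bool &dummy_was_scheduled) {
  FakeEvent dummy;
  int handled = 0;
  if (vc.main_event(&dummy) == RESULT_DONE) {
    ++handled;
  }
  dummy_was_scheduled = dummy.scheduled;
  return handled;
}
```

With locks_available set to false, close_vc() returns 0 and reports that the dummy event was not scheduled, which is the behaviour the issue asks for.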
[jira] [Updated] (TS-4487) haven't check the change of lock after return from wbe callback
[ https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4487: - Summary: haven't check the change of lock after return from wbe callback (was: Don't reschedule read depend on needs & did not check the change of lock at the return callback with wbe.) > haven't check the change of lock after return from wbe callback > --- > > Key: TS-4487 > URL: https://issues.apache.org/jira/browse/TS-4487 > Project: Traffic Server > Issue Type: Improvement > Components: SSL >Reporter: Oknet Xu >Assignee: Oknet Xu > Fix For: 7.0.0 > > > the code: > {code} > int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, > buf, needs); > {code} > At the end of write_to_net_io, > {code} > if (!buf.reader()->read_avail()) { // should check needs==0 > write_disable(nh, vc); > return; > } > if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { > write_reschedule(nh, vc); > } > if ((needs & EVENTIO_READ) == EVENTIO_READ) { > read_reschedule(nh, vc); > } > return; > {code} > another issue in write_to_net_io(): did not check the change of lock at the > return callback with wbe. > {code} > if (s->vio.ntodo() <= 0) { > write_signal_done(VC_EVENT_WRITE_COMPLETE, nh, vc); > return; > } else if (signalled && (wbe_event != vc->write_buffer_empty_event)) { > // @a signalled means we won't send an event, and the event values > differing means we > // had a write buffer trap and cleared it, so we need to send it now. > if (write_signal_and_update(wbe_event, vc) != EVENT_CONT) > return; > // > did not check the change of lock at the return callback > with wbe. > } else if (!signalled) { > if (write_signal_and_update(VC_EVENT_WRITE_READY, vc) != EVENT_CONT) { > return; > } > // change of lock... don't look at shared variables! > if (lock.get_mutex() != s->vio.mutex.get()) { > write_reschedule(nh, vc); > return; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
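The "change of lock" check mentioned in TS-4487 can be sketched in isolation. This is a hedged illustration, not the write_to_net_io code: VioSketch, Next and signal_and_check are hypothetical names, and std::mutex held in a std::shared_ptr stands in for ProxyMutex. After the continuation has been signalled, the caller compares the mutex it locked with the mutex the VIO now carries, and stops touching shared state if they differ.

```cpp
#include <functional>
#include <memory>
#include <mutex>

using MutexPtr = std::shared_ptr<std::mutex>; // stands in for Ptr<ProxyMutex>

struct VioSketch {
  MutexPtr mutex;
};

enum class Next { Stop, Reschedule, Continue };

// Signals the continuation, then re-checks whether the VIO still uses the
// mutex that the caller locked ('held') before making the callback.
inline Next signal_and_check(VioSketch &vio, const std::mutex *held,
                             const std::function<bool(VioSketch &)> &callback) {
  if (!callback(vio)) {
    return Next::Stop; // models a return value other than EVENT_CONT
  }
  if (held != vio.mutex.get()) {
    return Next::Reschedule; // lock changed: do not look at shared variables
  }
  return Next::Continue;
}
```

The same comparison would apply after either callback path (the write-buffer-empty event or VC_EVENT_WRITE_READY); which paths actually need it is for the patch to decide.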
[jira] [Created] (TS-4590) INKVConnInternal didn't set m_free_magic to DEAD as INKContInternal
Oknet Xu created TS-4590: Summary: INKVConnInternal didn't set m_free_magic to DEAD as INKContInternal Key: TS-4590 URL: https://issues.apache.org/jira/browse/TS-4590 Project: Traffic Server Issue Type: Improvement Components: TS API Reporter: Oknet Xu
The class INKContInternal is the base class of INKVConnInternal. INKVConnInternal overrides destroy() and handle_event(), but forgets to set m_free_magic, which is a debug flag, to DEAD.
I will add two methods to INKContInternal and INKVConnInternal:
- clear()
  - clear variables
- free()
  - call clear() first
  - call this->mutex.clear();
  - set m_free_magic
  - call xxxAllocator.free(this)
and rewrite destroy() to call free(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
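The clear()/free()/destroy() split proposed in TS-4590 might look roughly like the sketch below. Everything here is an assumption made for illustration: ContSketch, VConnSketch, the Magic values and the released flag are hypothetical, a std::shared_ptr stands in for the ProxyMutex pointer, and the allocator call is replaced by a flag so the example stays self-contained.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical magic values; the real constants live in the TS API internals.
enum Magic : std::uint32_t { MAGIC_ALIVE = 0x11112222u, MAGIC_DEAD = 0xDEADDEADu };

struct ContSketch {
  Magic free_magic = MAGIC_ALIVE;
  std::shared_ptr<int> mutex = std::make_shared<int>(0); // stands in for the ProxyMutex pointer
  int event_count = 0;
  bool released   = false; // stands in for xxxAllocator.free(this)

  virtual ~ContSketch() = default;

  virtual void clear() { event_count = 0; }

  virtual void free() {
    clear();                 // 1. reset member variables
    mutex.reset();           // 2. drop the mutex reference
    free_magic = MAGIC_DEAD; // 3. mark the object dead for debugging
    released   = true;       // 4. hand the object back to its allocator
  }

  virtual void destroy() { free(); }
};

struct VConnSketch : ContSketch {
  int pending_io = 0;

  void clear() override {
    pending_io = 0;
    ContSketch::clear();
  }

  // Overridden because the derived class would use its own allocator.
  void free() override {
    clear();
    mutex.reset();
    free_magic = MAGIC_DEAD;
    released   = true;
  }

  void destroy() override { free(); }
};
```

Because destroy() only forwards to free(), the derived class cannot skip the magic-flag update the way the issue describes.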
[jira] [Commented] (TS-4539) the mutex of server_vc is not set while server_session reuse.
[ https://issues.apache.org/jira/browse/TS-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15333670#comment-15333670 ] Oknet Xu commented on TS-4539: -- Client_vc->mutex is allocated in NetAccept by {code}vc->mutex = new_ProxyMutex();{code} and Server_vc->mutex is set by {code}vc->mutex = cont->mutex;{code} in UnixNetProcessor::connect_re_internal(), which is called by NetProcessor::connect_re(). {code} if (scheme_to_use == URL_WKSIDX_HTTPS) { DebugSM("http", "calling sslNetProcessor.connect_re"); int len = 0; const char *host = t_state.hdr_info.server_request.host_get(); if (host && len > 0) opt.set_sni_servername(host, len); connect_action_handle = sslNetProcessor.connect_re(this, // state machine _state.current.server->dst_addr.sa, // addr + port ); } else { if (t_state.method != HTTP_WKSIDX_CONNECT) { DebugSM("http", "calling netProcessor.connect_re"); connect_action_handle = netProcessor.connect_re(this, // state machine _state.current.server->dst_addr.sa, // addr + port ); } else { {code} According to the above code in HttpSM::do_http_server_open(), cont is the HttpSM, so Server_vc->mutex is set to HttpSM->mutex. It is not set to EThread->mutex as you said. > the mutex of server_vc is not set while server_session reuse. > - > > Key: TS-4539 > URL: https://issues.apache.org/jira/browse/TS-4539 > Project: Traffic Server > Issue Type: Bug > Components: HTTP >Reporter: Oknet Xu > > NetAccept got a client_vc and call new_ProxyMutex() to assign a mutex. > And the HttpClientSession, HttpSM, HttpServerSession, server_vc also share > the same mutex. > The HttpServerSession and server_vc will put into ServerSessionPool and may > assign to next new client_vc. > The HttpSM::attach_server_session() only set the mutex of HttpServerSession > to the mutex of HttpSM after get a HttpServerSession from ServerSessionPool. > But it forget to set the mutex of server_vc to the mutex of HttpSM. 
> > {code} > void > HttpSM::attach_server_session(HttpServerSession *s) > { > hsm_release_assert(server_session == NULL); > hsm_release_assert(server_entry == NULL); > hsm_release_assert(s->state == HSS_ACTIVE); > server_session = s; > server_session->transact_count++; > // Set the mutex so that we have something to update > // stats with > server_session->mutex = this->mutex; > {code} > But I can not found any issue, Is it by design? > Or it is hard to locate the problem, due to my limited knowedge on HttpSM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-4539) the mutex of server_vc is not set while server_session reuse.
Oknet Xu created TS-4539: Summary: the mutex of server_vc is not set while server_session reuse. Key: TS-4539 URL: https://issues.apache.org/jira/browse/TS-4539 Project: Traffic Server Issue Type: Bug Components: HTTP Reporter: Oknet Xu NetAccept gets a client_vc and calls new_ProxyMutex() to assign it a mutex. The HttpClientSession, HttpSM, HttpServerSession and server_vc then share the same mutex. The HttpServerSession and server_vc are put into the ServerSessionPool and may later be assigned to a new client_vc. After getting an HttpServerSession from the ServerSessionPool, HttpSM::attach_server_session() only sets the mutex of the HttpServerSession to the mutex of the HttpSM; it forgets to set the mutex of server_vc to the mutex of the HttpSM. {code} void HttpSM::attach_server_session(HttpServerSession *s) { hsm_release_assert(server_session == NULL); hsm_release_assert(server_entry == NULL); hsm_release_assert(s->state == HSS_ACTIVE); server_session = s; server_session->transact_count++; // Set the mutex so that we have something to update // stats with server_session->mutex = this->mutex; {code} However, I could not find any problem caused by this. Is it by design? Or is the problem just hard to locate, given my limited knowledge of HttpSM? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
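The mismatch described in TS-4539 can be reproduced with a small model. This is a sketch under assumptions, not Traffic Server code: VCSketch, ServerSessionSketch, StateMachineSketch and vc_mutex_is_stale are hypothetical names, and std::mutex in a std::shared_ptr stands in for ProxyMutex. The attach step copies only the state machine's mutex into the session, as in the quoted attach_server_session(), so the pooled VC keeps the mutex of the state machine that originally opened it.

```cpp
#include <memory>
#include <mutex>

using MutexPtr = std::shared_ptr<std::mutex>; // stands in for Ptr<ProxyMutex>

struct VCSketch {
  MutexPtr mutex;
};

struct ServerSessionSketch {
  MutexPtr mutex;
  VCSketch *vc = nullptr;
};

struct StateMachineSketch {
  MutexPtr mutex = std::make_shared<std::mutex>();
  ServerSessionSketch *server_session = nullptr;

  // Mirrors the quoted attach_server_session(): only the session mutex is updated.
  void attach_server_session(ServerSessionSketch *s) {
    server_session        = s;
    server_session->mutex = mutex;
  }
};

// True when the pooled VC still holds a mutex different from its session's.
inline bool vc_mutex_is_stale(const ServerSessionSketch &s) {
  return s.vc != nullptr && s.vc->mutex != s.mutex;
}
```

Whether a stale VC mutex is harmful in practice is exactly the open question of the issue; the model only shows that the two mutexes diverge after reuse.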
[jira] [Commented] (TS-4521) compile error on build proxy/http2/test_HPACK
[ https://issues.apache.org/jira/browse/TS-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326884#comment-15326884 ] Oknet Xu commented on TS-4521: -- Thanks! It worked. > compile error on build proxy/http2/test_HPACK > - > > Key: TS-4521 > URL: https://issues.apache.org/jira/browse/TS-4521 > Project: Traffic Server > Issue Type: Bug >Reporter: Oknet Xu >Assignee: Masakazu Kitajo > Fix For: 7.0.0 > > > OS: Debian 7 (wheezy) > ATS Branch: master > GCC: 4.7.2(Debian 4.7.2-5) > {code} > /usr/bin/ld: ../../proxy/hdrs/libhdrs.a(HttpCompat.o): undefined reference to > symbol 'Tcl_NextHashEntry' > /usr/bin/ld: note: 'Tcl_NextHashEntry' is defined in DSO > /usr/lib/libtcl8.5.so.0 so try adding it to the linker command line > /usr/lib/libtcl8.5.so.0: could not read symbols: Invalid operation > collect2: error: ld returned 1 exit status > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-4522) did not check EPIPE on write_to_net_io
Oknet Xu created TS-4522: Summary: did not check EPIPE on write_to_net_io Key: TS-4522 URL: https://issues.apache.org/jira/browse/TS-4522 Project: Traffic Server Issue Type: Bug Components: Core, Network Reporter: Oknet Xu On a socket fd whose peer has closed the connection, read(socketfd) returns 0, while write(socketfd) fails with EPIPE. In write_to_net_io, we check the return value of write() in the same way as that of read(): {code} if (!r || r == -ECONNRESET) { {code} Because of this bug, write_to_net_io never calls back with VC_EVENT_EOS in this case; it calls back with VC_EVENT_ERROR instead. Full code here: {code} int64_t r = vc->load_buffer_and_write(towrite, buf, total_written, needs); if (total_written > 0) { NET_SUM_DYN_STAT(net_write_bytes_stat, total_written); s->vio.ndone += total_written; } // check for errors if (r <= 0) { // if the socket was not ready,add to WaitList if (r == -EAGAIN || r == -ENOTCONN) { NET_INCREMENT_DYN_STAT(net_calls_to_write_nodata_stat); if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { vc->write.triggered = 0; nh->write_ready_list.remove(vc); write_reschedule(nh, vc); } if ((needs & EVENTIO_READ) == EVENTIO_READ) { vc->read.triggered = 0; nh->read_ready_list.remove(vc); read_reschedule(nh, vc); } return; } if (!r || r == -ECONNRESET) { vc->write.triggered = 0; write_signal_done(VC_EVENT_EOS, nh, vc); return; } vc->write.triggered = 0; write_signal_error(nh, vc, (int)-total_written); return; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
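The branch structure quoted in TS-4522 can be condensed into a small classifier. This is a sketch only: WriteOutcome and classify_write_result are hypothetical names, and treating -EPIPE as end-of-stream is an assumption about the intended fix, since the issue does not show a patch. The return value follows the convention of the quoted code, where errors are reported as negated errno values.

```cpp
#include <cerrno>
#include <cstdint>

enum class WriteOutcome { Progress, Retry, EndOfStream, Error };

// Maps the result of a write attempt (bytes written, or a negated errno)
// to the action the caller would take.
inline WriteOutcome classify_write_result(std::int64_t r) {
  if (r > 0) {
    return WriteOutcome::Progress;
  }
  if (r == -EAGAIN || r == -ENOTCONN) {
    return WriteOutcome::Retry; // socket not ready: reschedule
  }
  if (r == 0 || r == -ECONNRESET || r == -EPIPE) {
    return WriteOutcome::EndOfStream; // would signal VC_EVENT_EOS
  }
  return WriteOutcome::Error; // would signal VC_EVENT_ERROR
}
```

Without the -EPIPE comparison, a write to a connection closed by the peer falls through to the Error branch, which matches the behaviour reported above.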
[jira] [Comment Edited] (TS-4488) Segmentation fault at HttpSM::tunnel_handler_ua
[ https://issues.apache.org/jira/browse/TS-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326419#comment-15326419 ] Oknet Xu edited comment on TS-4488 at 6/12/16 12:34 PM: meet the similar crash, only has stack trace: {code} traffic_server: Segmentation fault (Address not mapped to object [(nil)]) traffic_server - STACK TRACE: /usr/bin/traffic_server(crash_logger_invoke(int, siginfo*, void*)+0xa2)[0x2ad121596e32] /lib/x86_64-linux-gnu/libc.so.6(+0x321a0)[0x2ad1252711a0] /usr/bin/traffic_server(HttpTunnel::consumer_handler(int, HttpTunnelConsumer*)+0xca)[0x2ad1216ab19a] /usr/bin/traffic_server(HttpTunnel::main_handler(int, void*)+0x5e)[0x2ad1216ab58e] /usr/bin/traffic_server(HttpSM::state_send_server_request_header(int, void*)+0xfd)[0x2ad12166ea9d] /usr/bin/traffic_server(HttpSM::main_handler(int, void*)+0x80)[0x2ad12166cb10] /usr/bin/traffic_server(UnixNetVConnection::mainEvent(int, Event*)+0x4e7)[0x2ad121808687] /usr/bin/traffic_server(InactivityCop::check_inactivity(int, Event*)+0x287)[0x2ad1217fe8d7] /usr/bin/traffic_server(EThread::process_event(Event*, int)+0x90)[0x2ad12182a020] /usr/bin/traffic_server(EThread::execute()+0x69e)[0x2ad12182ac4e] /usr/bin/traffic_server(+0x39938a)[0x2ad12182938a] /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2ad1255d3b50] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2ad12531d30d]{code} {code} traffic_server: Segmentation fault (Address not mapped to object [0x3d]) traffic_server - STACK TRACE: /usr/bin/traffic_server(crash_logger_invoke(int, siginfo*, void*)+0xa2)[0x2afd2459fe32] /lib/x86_64-linux-gnu/libc.so.6(+0x321a0)[0x2afd2827a1a0] /usr/bin/traffic_server(HttpSM::tunnel_handler_transform_write(int, HttpTunnelConsumer*)+0x1a4)[0x2afd24663f94] /usr/bin/traffic_server(HttpTunnel::consumer_handler(int, HttpTunnelConsumer*)+0xaf)[0x2afd246b417f] /usr/bin/traffic_server(HttpTunnel::main_handler(int, void*)+0x5e)[0x2afd246b458e] /usr/lib/trafficserver/modules/skg-spe.so(+0xc179)[0x2afd7b2a1179] 
/usr/bin/traffic_server(EThread::process_event(Event*, int)+0x90)[0x2afd24833020] /usr/bin/traffic_server(EThread::execute()+0x67f)[0x2afd24833c2f] /usr/bin/traffic_server(+0x39938a)[0x2afd2483238a] /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2afd285dcb50] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2afd2832630d] traffic_server: using root directory '/usr' {code} was (Author: oknet): meet the similar crash, only has stack trace: {code} traffic_server: Segmentation fault (Address not mapped to object [(nil)]) traffic_server - STACK TRACE: /usr/bin/traffic_server(_Z19crash_logger_invokeiP7siginfoPv+0xa2)[0x2ad121596e32] /lib/x86_64-linux-gnu/libc.so.6(+0x321a0)[0x2ad1252711a0] /usr/bin/traffic_server(_ZN10HttpTunnel16consumer_handlerEiP18HttpTunnelConsumer+0xca)[0x2ad1216ab19a] /usr/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0x5e)[0x2ad1216ab58e] /usr/bin/traffic_server(_ZN6HttpSM32state_send_server_request_headerEiPv+0xfd)[0x2ad12166ea9d] /usr/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x80)[0x2ad12166cb10] /usr/bin/traffic_server(_ZN18UnixNetVConnection9mainEventEiP5Event+0x4e7)[0x2ad121808687] /usr/bin/traffic_server(_ZN13InactivityCop16check_inactivityEiP5Event+0x287)[0x2ad1217fe8d7] /usr/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x90)[0x2ad12182a020] /usr/bin/traffic_server(_ZN7EThread7executeEv+0x69e)[0x2ad12182ac4e] /usr/bin/traffic_server(+0x39938a)[0x2ad12182938a] /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2ad1255d3b50] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2ad12531d30d] {code} {code} traffic_server: Segmentation fault (Address not mapped to object [0x3d]) traffic_server - STACK TRACE: /usr/bin/traffic_server(_Z19crash_logger_invokeiP7siginfoPv+0xa2)[0x2afd2459fe32] /lib/x86_64-linux-gnu/libc.so.6(+0x321a0)[0x2afd2827a1a0] /usr/bin/traffic_server(_ZN6HttpSM30tunnel_handler_transform_writeEiP18HttpTunnelConsumer+0x1a4)[0x2afd24663f94] 
/usr/bin/traffic_server(_ZN10HttpTunnel16consumer_handlerEiP18HttpTunnelConsumer+0xaf)[0x2afd246b417f] /usr/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0x5e)[0x2afd246b458e] /usr/lib/trafficserver/modules/skg-spe.so(+0xc179)[0x2afd7b2a1179] /usr/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x90)[0x2afd24833020] /usr/bin/traffic_server(_ZN7EThread7executeEv+0x67f)[0x2afd24833c2f] /usr/bin/traffic_server(+0x39938a)[0x2afd2483238a] /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2afd285dcb50] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2afd2832630d] traffic_server: using root directory '/usr' {code} > Segmentation fault at HttpSM::tunnel_handler_ua > --- > > Key: TS-4488 > URL: https://issues.apache.org/jira/browse/TS-4488 > Project: Traffic Server > Issue Type: Bug >Reporter: Pavlo Yatsukhnenko >Assignee: Pavlo Yatsukhnenko >
[jira] [Commented] (TS-4488) Segmentation fault at HttpSM::tunnel_handler_ua
[ https://issues.apache.org/jira/browse/TS-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326419#comment-15326419 ] Oknet Xu commented on TS-4488: -- meet the similar crash, only has stack trace: {code} traffic_server: Segmentation fault (Address not mapped to object [(nil)]) traffic_server - STACK TRACE: /usr/bin/traffic_server(_Z19crash_logger_invokeiP7siginfoPv+0xa2)[0x2ad121596e32] /lib/x86_64-linux-gnu/libc.so.6(+0x321a0)[0x2ad1252711a0] /usr/bin/traffic_server(_ZN10HttpTunnel16consumer_handlerEiP18HttpTunnelConsumer+0xca)[0x2ad1216ab19a] /usr/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0x5e)[0x2ad1216ab58e] /usr/bin/traffic_server(_ZN6HttpSM32state_send_server_request_headerEiPv+0xfd)[0x2ad12166ea9d] /usr/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x80)[0x2ad12166cb10] /usr/bin/traffic_server(_ZN18UnixNetVConnection9mainEventEiP5Event+0x4e7)[0x2ad121808687] /usr/bin/traffic_server(_ZN13InactivityCop16check_inactivityEiP5Event+0x287)[0x2ad1217fe8d7] /usr/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x90)[0x2ad12182a020] /usr/bin/traffic_server(_ZN7EThread7executeEv+0x69e)[0x2ad12182ac4e] /usr/bin/traffic_server(+0x39938a)[0x2ad12182938a] /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2ad1255d3b50] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2ad12531d30d] {code} {code} traffic_server: Segmentation fault (Address not mapped to object [0x3d]) traffic_server - STACK TRACE: /usr/bin/traffic_server(_Z19crash_logger_invokeiP7siginfoPv+0xa2)[0x2afd2459fe32] /lib/x86_64-linux-gnu/libc.so.6(+0x321a0)[0x2afd2827a1a0] /usr/bin/traffic_server(_ZN6HttpSM30tunnel_handler_transform_writeEiP18HttpTunnelConsumer+0x1a4)[0x2afd24663f94] /usr/bin/traffic_server(_ZN10HttpTunnel16consumer_handlerEiP18HttpTunnelConsumer+0xaf)[0x2afd246b417f] /usr/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0x5e)[0x2afd246b458e] /usr/lib/trafficserver/modules/skg-spe.so(+0xc179)[0x2afd7b2a1179] 
/usr/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x90)[0x2afd24833020] /usr/bin/traffic_server(_ZN7EThread7executeEv+0x67f)[0x2afd24833c2f] /usr/bin/traffic_server(+0x39938a)[0x2afd2483238a] /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2afd285dcb50] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2afd2832630d] traffic_server: using root directory '/usr' {code} > Segmentation fault at HttpSM::tunnel_handler_ua > --- > > Key: TS-4488 > URL: https://issues.apache.org/jira/browse/TS-4488 > Project: Traffic Server > Issue Type: Bug >Reporter: Pavlo Yatsukhnenko >Assignee: Pavlo Yatsukhnenko > Labels: Crash > Fix For: 7.0.0 > > > From Github PR #674 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TS-4521) compile error on build proxy/http2/test_HPACK
Oknet Xu created TS-4521: Summary: compile error on build proxy/http2/test_HPACK Key: TS-4521 URL: https://issues.apache.org/jira/browse/TS-4521 Project: Traffic Server Issue Type: Bug Reporter: Oknet Xu OS: Debian 7 (wheezy) ATS Branch: master GCC: 4.7.2(Debian 4.7.2-5) {code} /usr/bin/ld: ../../proxy/hdrs/libhdrs.a(HttpCompat.o): undefined reference to symbol 'Tcl_NextHashEntry' /usr/bin/ld: note: 'Tcl_NextHashEntry' is defined in DSO /usr/lib/libtcl8.5.so.0 so try adding it to the linker command line /usr/lib/libtcl8.5.so.0: could not read symbols: Invalid operation collect2: error: ld returned 1 exit status {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TS-4483) NetAccept & SSLNetAccept Optimize
[ https://issues.apache.org/jira/browse/TS-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305195#comment-15305195 ] Oknet Xu commented on TS-4483: -- This optimization also fixes a bug in SSLNetAccept::init_accept_per_thread(bool isTransparent): {code} PollDescriptor *pd = get_PollDescriptor(t); if (ep.start(pd, this, EVENTIO_READ) < 0) // ==> should be a->ep.start(pd, a, EVENTIO_READ) Debug("iocore_net", "error starting EventIO"); a->mutex = get_NetHandler(t)->mutex; t->schedule_every(a, period, etype); {code} > NetAccept & SSLNetAccept Optimize > - > > Key: TS-4483 > URL: https://issues.apache.org/jira/browse/TS-4483 > Project: Traffic Server > Issue Type: Improvement > Components: Core >Reporter: Oknet Xu >Assignee: Oknet Xu > Fix For: 7.0.0 > > > replace getEtype() with member etype. > NetAccept has a member named 'etype' and it is set by upgradeEtype before > NetAccept running. > Thus, we can replace getEtype() with member etype and make the SSLNetAccept > code clearer. > Should we remove the getEtype() method? Nothing calls it after the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
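The correction noted in the comment above (a->ep.start(pd, a, ...) instead of ep.start(pd, this, ...)) can be illustrated with a toy model. All names here are hypothetical (PollSketch, EventIOSketch, AcceptSketch, init_accept_per_thread_sketch), and the sketch assumes that a is a separate accept object created for each thread, which the quoted fragment suggests but does not show.

```cpp
#include <cstddef>
#include <vector>

struct AcceptSketch;

// Stands in for a per-thread PollDescriptor: records who registered with it.
struct PollSketch {
  std::vector<const AcceptSketch *> registered;
};

// Stands in for EventIO: remembers which accept object it reports events to.
struct EventIOSketch {
  const AcceptSketch *data = nullptr;

  int start(PollSketch &pd, const AcceptSketch *owner) {
    data = owner;
    pd.registered.push_back(owner);
    return 0;
  }
};

struct AcceptSketch {
  EventIOSketch ep;
};

// Registers each per-thread accept object with its own thread's poll set.
// Requires per_thread.size() >= polls.size().
inline void init_accept_per_thread_sketch(std::vector<AcceptSketch> &per_thread,
                                          std::vector<PollSketch> &polls) {
  for (std::size_t i = 0; i < polls.size(); ++i) {
    AcceptSketch *a = &per_thread[i];
    // Use the per-thread object's own EventIO and pass that object as the
    // owner, rather than the prototype's 'ep' and 'this'.
    a->ep.start(polls[i], a);
  }
}
```

In the buggy shape every thread would register the same prototype object, so events on any thread would be attributed to it rather than to that thread's accept object.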
[jira] [Updated] (TS-4487) Don't reschedule read depend on needs & did not check the change of lock at the return callback with wbe.
[ https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4487: - Summary: Don't reschedule read depend on needs & did not check the change of lock at the return callback with wbe. (was: Don't reschedule read depend on needs) > Don't reschedule read depend on needs & did not check the change of lock at > the return callback with wbe. > - > > Key: TS-4487 > URL: https://issues.apache.org/jira/browse/TS-4487 > Project: Traffic Server > Issue Type: Improvement > Components: SSL >Reporter: Oknet Xu > > the code: > {code} > int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, > buf, needs); > {code} > At the end of write_to_net_io, > {code} > if (!buf.reader()->read_avail()) { // should check needs==0 > write_disable(nh, vc); > return; > } > if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { > write_reschedule(nh, vc); > } > if ((needs & EVENTIO_READ) == EVENTIO_READ) { > read_reschedule(nh, vc); > } > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4487) Don't reschedule read depend on needs
[ https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4487: - Description: the code: {code} int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, needs); {code} At the end of write_to_net_io, {code} if (!buf.reader()->read_avail()) { // should check needs==0 write_disable(nh, vc); return; } if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { write_reschedule(nh, vc); } if ((needs & EVENTIO_READ) == EVENTIO_READ) { read_reschedule(nh, vc); } return; {code} was: the code: {code} int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, needs); {code} At the end of write_to_net_io, {code} if (!buf.reader()->read_avail()) { // should check needs==0 write_disable(nh, vc); return; } if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { write_reschedule(nh, vc); } // here r>0, don't need to check the needs. if ((needs & EVENTIO_READ) == EVENTIO_READ) { read_reschedule(nh, vc); } return; {code} > Don't reschedule read depend on needs > - > > Key: TS-4487 > URL: https://issues.apache.org/jira/browse/TS-4487 > Project: Traffic Server > Issue Type: Improvement > Components: SSL >Reporter: Oknet Xu > > the code: > {code} > int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, > buf, needs); > {code} > At the end of write_to_net_io, > {code} > if (!buf.reader()->read_avail()) { // should check needs==0 > write_disable(nh, vc); > return; > } > if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { > write_reschedule(nh, vc); > } > if ((needs & EVENTIO_READ) == EVENTIO_READ) { > read_reschedule(nh, vc); > } > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4487) Don't reschedule read depend on needs
[ https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4487: - Description: the code: {code} int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, needs); {code} At the end of write_to_net_io, {code} if (!buf.reader()->read_avail()) { // should check needs==0 write_disable(nh, vc); return; } if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { write_reschedule(nh, vc); } // here r>0, don't need to check the needs. if ((needs & EVENTIO_READ) == EVENTIO_READ) { read_reschedule(nh, vc); } return; {code} was: the code: {code} int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, needs); {code} In the SSLNetVConnection::load_buffer_and_write(), only set needs |= EVENTIO_WRITE on r>0. At the end of write_to_net_io, {code} if (!buf.reader()->read_avail()) { write_disable(nh, vc); return; } if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { write_reschedule(nh, vc); } // here r>0, don't need to check the needs. if ((needs & EVENTIO_READ) == EVENTIO_READ) { read_reschedule(nh, vc); } return; {code} > Don't reschedule read depend on needs > - > > Key: TS-4487 > URL: https://issues.apache.org/jira/browse/TS-4487 > Project: Traffic Server > Issue Type: Improvement > Components: SSL >Reporter: Oknet Xu > > the code: > {code} > int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, > buf, needs); > {code} > At the end of write_to_net_io, > {code} > if (!buf.reader()->read_avail()) { // should check needs==0 > write_disable(nh, vc); > return; > } > if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { > write_reschedule(nh, vc); > } > // here r>0, don't need to check the needs. > if ((needs & EVENTIO_READ) == EVENTIO_READ) { > read_reschedule(nh, vc); > } > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4487) Don't reschedule read depend on needs
[ https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4487: - Summary: Don't reschedule read depend on needs (was: Don't need reschedule read depend on needs) > Don't reschedule read depend on needs > - > > Key: TS-4487 > URL: https://issues.apache.org/jira/browse/TS-4487 > Project: Traffic Server > Issue Type: Improvement > Components: SSL >Reporter: Oknet Xu > > the code: > {code} > int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, > buf, needs); > {code} > In the SSLNetVConnection::load_buffer_and_write(), only set needs |= > EVENTIO_WRITE on r>0. > At the end of write_to_net_io, > {code} > if (!buf.reader()->read_avail()) { > write_disable(nh, vc); > return; > } > if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { > write_reschedule(nh, vc); > } > // here r>0, don't need to check the needs. > if ((needs & EVENTIO_READ) == EVENTIO_READ) { > read_reschedule(nh, vc); > } > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4487) Don't need reschedule read depend on needs
[ https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4487: - Description: the code: {code} int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, needs); {code} In the SSLNetVConnection::load_buffer_and_write(), only set needs |= EVENTIO_WRITE on r>0. At the end of write_to_net_io, {code} if (!buf.reader()->read_avail()) { write_disable(nh, vc); return; } if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { write_reschedule(nh, vc); } // here r>0, don't need to check the needs. if ((needs & EVENTIO_READ) == EVENTIO_READ) { read_reschedule(nh, vc); } return; {code} was: the code: {code} int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, needs); {code} In the SSLNetVConnection::load_buffer_and_write(), only set needs |= EVENTIO_WRITE on r>0. At the end of write_to_net_io, {code} if (!buf.reader()->read_avail()) { write_disable(nh, vc); return; <-- return from here, but don't reschedule read } if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { write_reschedule(nh, vc); } if ((needs & EVENTIO_READ) == EVENTIO_READ) { read_reschedule(nh, vc); } return; {code} > Don't need reschedule read depend on needs > -- > > Key: TS-4487 > URL: https://issues.apache.org/jira/browse/TS-4487 > Project: Traffic Server > Issue Type: Improvement > Components: SSL >Reporter: Oknet Xu > > the code: > {code} > int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, > buf, needs); > {code} > In the SSLNetVConnection::load_buffer_and_write(), only set needs |= > EVENTIO_WRITE on r>0. > At the end of write_to_net_io, > {code} > if (!buf.reader()->read_avail()) { > write_disable(nh, vc); > return; > } > if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { > write_reschedule(nh, vc); > } > // here r>0, don't need to check the needs. 
> if ((needs & EVENTIO_READ) == EVENTIO_READ) { > read_reschedule(nh, vc); > } > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4487) Don't need reschedule read depend on needs
[ https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4487: - Issue Type: Improvement (was: Bug) > Don't need reschedule read depend on needs > -- > > Key: TS-4487 > URL: https://issues.apache.org/jira/browse/TS-4487 > Project: Traffic Server > Issue Type: Improvement > Components: SSL >Reporter: Oknet Xu > > the code: > {code} > int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, > buf, needs); > {code} > In the SSLNetVConnection::load_buffer_and_write(), only set needs |= > EVENTIO_WRITE on r>0. > At the end of write_to_net_io, > {code} > if (!buf.reader()->read_avail()) { > write_disable(nh, vc); > return; <-- return from here, but don't reschedule read > } > if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { > write_reschedule(nh, vc); > } > if ((needs & EVENTIO_READ) == EVENTIO_READ) { > read_reschedule(nh, vc); > } > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4487) Don't reschedule read depend on needs while write buffer empty
[ https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4487: - Description: the code: {code} int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, needs); {code} In the SSLNetVConnection::load_buffer_and_write(), only set needs |= EVENTIO_WRITE on r>0. At the end of write_to_net_io, {code} if (!buf.reader()->read_avail()) { write_disable(nh, vc); return; <-- return from here, but don't reschedule read } if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { write_reschedule(nh, vc); } if ((needs & EVENTIO_READ) == EVENTIO_READ) { read_reschedule(nh, vc); } return; {code} was: At the end of write_to_net_io {code} if (!buf.reader()->read_avail()) { write_disable(nh, vc); return; <-- return from here, but don't reschedule read } if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { write_reschedule(nh, vc); } if ((needs & EVENTIO_READ) == EVENTIO_READ) { read_reschedule(nh, vc); } return; {code} > Don't reschedule read depend on needs while write buffer empty > -- > > Key: TS-4487 > URL: https://issues.apache.org/jira/browse/TS-4487 > Project: Traffic Server > Issue Type: Bug > Components: SSL >Reporter: Oknet Xu > > the code: > {code} > int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, > buf, needs); > {code} > In the SSLNetVConnection::load_buffer_and_write(), only set needs |= > EVENTIO_WRITE on r>0. > At the end of write_to_net_io, > {code} > if (!buf.reader()->read_avail()) { > write_disable(nh, vc); > return; <-- return from here, but don't reschedule read > } > if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { > write_reschedule(nh, vc); > } > if ((needs & EVENTIO_READ) == EVENTIO_READ) { > read_reschedule(nh, vc); > } > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TS-4487) Don't need reschedule read depend on needs
[ https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oknet Xu updated TS-4487: - Summary: Don't need reschedule read depend on needs (was: Don't reschedule read depend on needs while write buffer empty) > Don't need reschedule read depend on needs > -- > > Key: TS-4487 > URL: https://issues.apache.org/jira/browse/TS-4487 > Project: Traffic Server > Issue Type: Bug > Components: SSL >Reporter: Oknet Xu > > the code: > {code} > int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, > buf, needs); > {code} > In the SSLNetVConnection::load_buffer_and_write(), only set needs |= > EVENTIO_WRITE on r>0. > At the end of write_to_net_io, > {code} > if (!buf.reader()->read_avail()) { > write_disable(nh, vc); > return; <-- return from here, but don't reschedule read > } > if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) { > write_reschedule(nh, vc); > } > if ((needs & EVENTIO_READ) == EVENTIO_READ) { > read_reschedule(nh, vc); > } > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)