[jira] [Resolved] (TS-5106) Assert on ParentSelection.h line 337, there is no selection_strategy for default parent proxy.

2017-01-02 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu resolved TS-5106.
--
   Resolution: Fixed
 Assignee: Oknet Xu
Fix Version/s: 7.0.0

> Assert on ParentSelection.h line 337, there is no selection_strategy for 
> default parent proxy.
> --
>
> Key: TS-5106
> URL: https://issues.apache.org/jira/browse/TS-5106
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Parent Proxy
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code}
> FATAL: ParentSelection.h:337: failed assertion 
> `result->rec->selection_strategy != NULL`
> traffic_server: Aborted (Signal sent by tkill() 21363 65534)
> traffic_server - STACK TRACE: 
> ../../bin/traffic_server(crash_logger_invoke(int, siginfo_t*, 
> void*)+0x99)[0x4c1be9]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7f8b2a8ba8d0]
> /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f8b29b15107]
> /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f8b29b164e8]
> /usr/local/ats/lib/libtsutil.so.7(+0x2ac31)[0x7f8b2c1adc31]
> /usr/local/ats/lib/libtsutil.so.7(+0x28c95)[0x7f8b2c1abc95]
> ../../bin/traffic_server(ParentConfigParams::findParent(HttpRequestData*, 
> ParentResult*)+0x56e)[0x4f677e]
> ../../bin/traffic_server(SocksEntry::init(Ptr&, 
> UnixNetVConnection*, unsigned char, unsigned char)+0x59b)[0x762dfb]
> ../../bin/traffic_server(UnixNetProcessor::connect_re_internal(Continuation*, 
> sockaddr const*, NetVCOptions*)+0x251)[0x74eaf1]
> ../../bin/traffic_server(HttpSM::do_http_server_open(bool)+0x850)[0x5bb5d0]
> ../../bin/traffic_server(HttpSM::set_next_state()+0x4a3)[0x5bc7b3]
> ../../bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
> (*)(HttpTransact::State*))+0x3a)[0x5ae28a]
> ../../bin/traffic_server(HttpSM::state_cache_open_write(int, 
> void*)+0x1ce)[0x5b083e]
> ../../bin/traffic_server(HttpSM::main_handler(int, void*)+0xc8)[0x5b6eb8]
> ../../bin/traffic_server(HttpCacheSM::state_cache_open_write(int, 
> void*)+0x1d1)[0x599221]
> ../../bin/traffic_server(CacheVC::callcont(int)+0x5b)[0x6a1b2b]
> ../../bin/traffic_server(Cache::open_write(Continuation*, ats::CryptoHash 
> const*, HTTPInfo*, long, ats::CryptoHash const*, CacheFragType, char const*, 
> int)+0x56b)[0x713f0b]
> ../../bin/traffic_server(HttpCacheSM::open_write(HttpCacheKey const*, URL*, 
> HTTPHdr*, HTTPInfo*, long, bool, bool)+0xcd)[0x598fed]
> ../../bin/traffic_server(HttpSM::do_cache_prepare_action(HttpCacheSM*, 
> HTTPInfo*, bool, bool)+0x15d)[0x5a90dd]
> ../../bin/traffic_server(HttpSM::set_next_state()+0x8b6)[0x5bcbc6]
> ../../bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
> (*)(HttpTransact::State*))+0x3a)[0x5ae28a]
> ../../bin/traffic_server(HttpSM::handle_api_return()+0xe7)[0x5b95f7]
> ../../bin/traffic_server(HttpSM::set_next_state()+0x16b)[0x5bc47b]
> ../../bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
> (*)(HttpTransact::State*))+0x3a)[0x5ae28a]
> ../../bin/traffic_server(HttpSM::state_hostdb_lookup(int, 
> void*)+0xa0)[0x5ba540]
> ../../bin/traffic_server(HttpSM::main_handler(int, void*)+0xc8)[0x5b6eb8]
> ../../bin/traffic_server[0x690386]
> ../../bin/traffic_server(HostDBContinuation::do_dns()+0x1d7)[0x691fc7]
> ../../bin/traffic_server(HostDBContinuation::probeEvent(int, 
> Event*)+0x228)[0x6946a8]
> ../../bin/traffic_server(EThread::process_event(Event*, int)+0x8d)[0x77980d]
> ../../bin/traffic_server(EThread::execute()+0x73d)[0x77a4cd]
> ../../bin/traffic_server[0x778c4a]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7f8b2a8b30a4]
> /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f8b29bc607d]
> Aborted
> {code}
> The default parent proxy is set by ParentRecord::DefaultInit(char *val), but 
> it does not create a selection_strategy for the default parent proxy.
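
The failure mode described above can be sketched in isolation. The types and function names below are hypothetical stand-ins, not the real Traffic Server classes: a record whose default initialisation sets the parents but never creates a strategy object leaves the lookup path with a null pointer, while an initialisation that also creates the strategy does not.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration only; not the real ATS types.
struct SelectionStrategy {
  std::string pick() const { return "parent-1"; }
};

struct Record {
  std::string parents;
  std::unique_ptr<SelectionStrategy> selection_strategy;

  // Mirrors the reported problem: parents are set, the strategy is not.
  void default_init_buggy(const std::string &val) { parents = val; }

  // One possible remedy (an assumption, not the actual patch):
  // always create a strategy together with the parents.
  void default_init_fixed(const std::string &val)
  {
    parents = val;
    selection_strategy.reset(new SelectionStrategy());
  }
};

// A lookup that reports the missing strategy instead of asserting.
bool find_parent(const Record &rec, std::string &out)
{
  if (!rec.selection_strategy) {
    return false;
  }
  out = rec.selection_strategy->pick();
  return true;
}
```

With this sketch, a record prepared by `default_init_buggy` makes `find_parent` return false, which corresponds to the state in which the real code asserts.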



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (TS-5105) Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal

2017-01-02 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu resolved TS-5105.
--
Resolution: Fixed

> Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal
> -
>
> Key: TS-5105
> URL: https://issues.apache.org/jira/browse/TS-5105
> Project: Traffic Server
>  Issue Type: Bug
>  Components: SOCKS
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> {code}
> traffic_server: using root directory '/usr/local'
> traffic_server: Abort trap
> traffic_server - STACK TRACE:
> 0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at 
> /usr/local/bin/traffic_server
> 0x80275ab37  at /lib/libthr.so.3
> 0x80275a22c  at /lib/libthr.so.3
> [Dec 21 17:27:34.840] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (Socks) Socks Version 4
> [Dec 21 17:27:34.840] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (Socks) server connect timeout: 10 socks respnonse 
> timeout 100
> [Dec 21 17:27:34.840] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (SocksProxy) Read SocksProxy info: accept_enabled = 
> 0 accept_port = 1080 http_port = 80
> [Dec 21 17:27:34.841] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (Socks) Socks Config File: 
> /usr/local/etc/trafficserver/socks.config
> [Dec 21 17:27:34.842] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (Socks) Socks Turned on
> [Dec 21 17:27:35.052] Server {0x804008000} DEBUG:  (connect_re_internal)> (Socks) Using Socks ip: 216.58.192.142:80
> Assertion failed: (ats_is_ip4(_addr)), function init, file Socks.cc, 
> line 67.
> traffic_server: using root directory '/usr/local'
> traffic_server: Abort trap
> traffic_server - STACK TRACE:
> 0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at 
> /usr/local/bin/traffic_server
> 0x80275ab37  at /lib/libthr.so.3
> 0x80275a22c  at /lib/libthr.so.3
> {code}





[jira] [Work started] (TS-5105) Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal

2017-01-02 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on TS-5105 started by Oknet Xu.

> Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal
> -
>
> Key: TS-5105
> URL: https://issues.apache.org/jira/browse/TS-5105
> Project: Traffic Server
>  Issue Type: Bug
>  Components: SOCKS
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> {code}
> traffic_server: using root directory '/usr/local'
> traffic_server: Abort trap
> traffic_server - STACK TRACE:
> 0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at 
> /usr/local/bin/traffic_server
> 0x80275ab37  at /lib/libthr.so.3
> 0x80275a22c  at /lib/libthr.so.3
> [Dec 21 17:27:34.840] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (Socks) Socks Version 4
> [Dec 21 17:27:34.840] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (Socks) server connect timeout: 10 socks respnonse 
> timeout 100
> [Dec 21 17:27:34.840] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (SocksProxy) Read SocksProxy info: accept_enabled = 
> 0 accept_port = 1080 http_port = 80
> [Dec 21 17:27:34.841] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (Socks) Socks Config File: 
> /usr/local/etc/trafficserver/socks.config
> [Dec 21 17:27:34.842] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (Socks) Socks Turned on
> [Dec 21 17:27:35.052] Server {0x804008000} DEBUG:  (connect_re_internal)> (Socks) Using Socks ip: 216.58.192.142:80
> Assertion failed: (ats_is_ip4(_addr)), function init, file Socks.cc, 
> line 67.
> traffic_server: using root directory '/usr/local'
> traffic_server: Abort trap
> traffic_server - STACK TRACE:
> 0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at 
> /usr/local/bin/traffic_server
> 0x80275ab37  at /lib/libthr.so.3
> 0x80275a22c  at /lib/libthr.so.3
> {code}





[jira] [Updated] (TS-5105) Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal

2016-12-27 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-5105:
-
   Assignee: Oknet Xu
Backport to Version: 6.2.1
  Fix Version/s: 7.0.0

> Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal
> -
>
> Key: TS-5105
> URL: https://issues.apache.org/jira/browse/TS-5105
> Project: Traffic Server
>  Issue Type: Bug
>  Components: SOCKS
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> traffic_server: using root directory '/usr/local'
> traffic_server: Abort trap
> traffic_server - STACK TRACE:
> 0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at 
> /usr/local/bin/traffic_server
> 0x80275ab37  at /lib/libthr.so.3
> 0x80275a22c  at /lib/libthr.so.3
> [Dec 21 17:27:34.840] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (Socks) Socks Version 4
> [Dec 21 17:27:34.840] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (Socks) server connect timeout: 10 socks respnonse 
> timeout 100
> [Dec 21 17:27:34.840] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (SocksProxy) Read SocksProxy info: accept_enabled = 
> 0 accept_port = 1080 http_port = 80
> [Dec 21 17:27:34.841] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (Socks) Socks Config File: 
> /usr/local/etc/trafficserver/socks.config
> [Dec 21 17:27:34.842] Server {0x804006400} DEBUG:  (loadSocksConfiguration)> (Socks) Socks Turned on
> [Dec 21 17:27:35.052] Server {0x804008000} DEBUG:  (connect_re_internal)> (Socks) Using Socks ip: 216.58.192.142:80
> Assertion failed: (ats_is_ip4(_addr)), function init, file Socks.cc, 
> line 67.
> traffic_server: using root directory '/usr/local'
> traffic_server: Abort trap
> traffic_server - STACK TRACE:
> 0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at 
> /usr/local/bin/traffic_server
> 0x80275ab37  at /lib/libthr.so.3
> 0x80275a22c  at /lib/libthr.so.3
> {code}





[jira] [Created] (TS-5106) Assert on ParentSelection.h line 337, there is no selection_strategy for default parent proxy.

2016-12-22 Thread Oknet Xu (JIRA)
Oknet Xu created TS-5106:


 Summary: Assert on ParentSelection.h line 337, there is no 
selection_strategy for default parent proxy.
 Key: TS-5106
 URL: https://issues.apache.org/jira/browse/TS-5106
 Project: Traffic Server
  Issue Type: Bug
  Components: Parent Proxy
Reporter: Oknet Xu


{code}
FATAL: ParentSelection.h:337: failed assertion `result->rec->selection_strategy 
!= NULL`
traffic_server: Aborted (Signal sent by tkill() 21363 65534)
traffic_server - STACK TRACE: 
../../bin/traffic_server(crash_logger_invoke(int, siginfo_t*, 
void*)+0x99)[0x4c1be9]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7f8b2a8ba8d0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f8b29b15107]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f8b29b164e8]
/usr/local/ats/lib/libtsutil.so.7(+0x2ac31)[0x7f8b2c1adc31]
/usr/local/ats/lib/libtsutil.so.7(+0x28c95)[0x7f8b2c1abc95]
../../bin/traffic_server(ParentConfigParams::findParent(HttpRequestData*, 
ParentResult*)+0x56e)[0x4f677e]
../../bin/traffic_server(SocksEntry::init(Ptr&, 
UnixNetVConnection*, unsigned char, unsigned char)+0x59b)[0x762dfb]
../../bin/traffic_server(UnixNetProcessor::connect_re_internal(Continuation*, 
sockaddr const*, NetVCOptions*)+0x251)[0x74eaf1]
../../bin/traffic_server(HttpSM::do_http_server_open(bool)+0x850)[0x5bb5d0]
../../bin/traffic_server(HttpSM::set_next_state()+0x4a3)[0x5bc7b3]
../../bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
(*)(HttpTransact::State*))+0x3a)[0x5ae28a]
../../bin/traffic_server(HttpSM::state_cache_open_write(int, 
void*)+0x1ce)[0x5b083e]
../../bin/traffic_server(HttpSM::main_handler(int, void*)+0xc8)[0x5b6eb8]
../../bin/traffic_server(HttpCacheSM::state_cache_open_write(int, 
void*)+0x1d1)[0x599221]
../../bin/traffic_server(CacheVC::callcont(int)+0x5b)[0x6a1b2b]
../../bin/traffic_server(Cache::open_write(Continuation*, ats::CryptoHash 
const*, HTTPInfo*, long, ats::CryptoHash const*, CacheFragType, char const*, 
int)+0x56b)[0x713f0b]
../../bin/traffic_server(HttpCacheSM::open_write(HttpCacheKey const*, URL*, 
HTTPHdr*, HTTPInfo*, long, bool, bool)+0xcd)[0x598fed]
../../bin/traffic_server(HttpSM::do_cache_prepare_action(HttpCacheSM*, 
HTTPInfo*, bool, bool)+0x15d)[0x5a90dd]
../../bin/traffic_server(HttpSM::set_next_state()+0x8b6)[0x5bcbc6]
../../bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
(*)(HttpTransact::State*))+0x3a)[0x5ae28a]
../../bin/traffic_server(HttpSM::handle_api_return()+0xe7)[0x5b95f7]
../../bin/traffic_server(HttpSM::set_next_state()+0x16b)[0x5bc47b]
../../bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
(*)(HttpTransact::State*))+0x3a)[0x5ae28a]
../../bin/traffic_server(HttpSM::state_hostdb_lookup(int, void*)+0xa0)[0x5ba540]
../../bin/traffic_server(HttpSM::main_handler(int, void*)+0xc8)[0x5b6eb8]
../../bin/traffic_server[0x690386]
../../bin/traffic_server(HostDBContinuation::do_dns()+0x1d7)[0x691fc7]
../../bin/traffic_server(HostDBContinuation::probeEvent(int, 
Event*)+0x228)[0x6946a8]
../../bin/traffic_server(EThread::process_event(Event*, int)+0x8d)[0x77980d]
../../bin/traffic_server(EThread::execute()+0x73d)[0x77a4cd]
../../bin/traffic_server[0x778c4a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7f8b2a8b30a4]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f8b29bc607d]
Aborted
{code}

The default parent proxy is set by ParentRecord::DefaultInit(char *val), but it 
does not create a selection_strategy for the default parent proxy.





[jira] [Created] (TS-5105) Assert on Socks.cc line 67, due to remote_addr not set in connect_re_internal

2016-12-22 Thread Oknet Xu (JIRA)
Oknet Xu created TS-5105:


 Summary: Assert on Socks.cc line 67, due to remote_addr not set in 
connect_re_internal
 Key: TS-5105
 URL: https://issues.apache.org/jira/browse/TS-5105
 Project: Traffic Server
  Issue Type: Bug
  Components: SOCKS
Reporter: Oknet Xu


{code}
traffic_server: using root directory '/usr/local'
traffic_server: Abort trap
traffic_server - STACK TRACE:
0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at 
/usr/local/bin/traffic_server
0x80275ab37  at /lib/libthr.so.3
0x80275a22c  at /lib/libthr.so.3
[Dec 21 17:27:34.840] Server {0x804006400} DEBUG:  (Socks) Socks Version 4
[Dec 21 17:27:34.840] Server {0x804006400} DEBUG:  (Socks) server connect timeout: 10 socks respnonse 
timeout 100
[Dec 21 17:27:34.840] Server {0x804006400} DEBUG:  (SocksProxy) Read SocksProxy info: accept_enabled = 0 
accept_port = 1080 http_port = 80
[Dec 21 17:27:34.841] Server {0x804006400} DEBUG:  (Socks) Socks Config File: 
/usr/local/etc/trafficserver/socks.config
[Dec 21 17:27:34.842] Server {0x804006400} DEBUG:  (Socks) Socks Turned on
[Dec 21 17:27:35.052] Server {0x804008000} DEBUG:  (Socks) Using Socks ip: 216.58.192.142:80
Assertion failed: (ats_is_ip4(_addr)), function init, file Socks.cc, 
line 67.
traffic_server: using root directory '/usr/local'
traffic_server: Abort trap
traffic_server - STACK TRACE:
0x4b07a9 <_Z19crash_logger_invokeiP9__siginfoPv+0x69> at 
/usr/local/bin/traffic_server
0x80275ab37  at /lib/libthr.so.3
0x80275a22c  at /lib/libthr.so.3
{code}
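
The summary attributes the assertion to remote_addr not being set before the SOCKS entry is initialised. A minimal sketch of why an unset address fails an IPv4 check follows; `is_ip4` and `make_ip4` are hypothetical helpers written for this example and only mimic the kind of test the assertion performs.

```cpp
#include <arpa/inet.h>
#include <cstring>
#include <netinet/in.h>
#include <sys/socket.h>

// Hypothetical helper: true only when the address family is IPv4.
inline bool is_ip4(const sockaddr *addr)
{
  return addr != nullptr && addr->sa_family == AF_INET;
}

// Hypothetical helper: build an IPv4 address from host-order values.
// A zero-filled sockaddr has family AF_UNSPEC and would fail is_ip4().
inline sockaddr_in make_ip4(unsigned int host_order_ip, unsigned short port)
{
  sockaddr_in sin;
  std::memset(&sin, 0, sizeof(sin));
  sin.sin_family      = AF_INET;
  sin.sin_addr.s_addr = htonl(host_order_ip);
  sin.sin_port        = htons(port);
  return sin;
}
```

Under these assumptions, the address has to be copied into the connection before anything checks its family.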





[jira] [Updated] (TS-5104) Wrong retry count for OSDNSLookup if url_expansions enabled

2016-12-22 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-5104:
-
Affects Version/s: 6.2.0

> Wrong retry count for OSDNSLookup if url_expansions enabled
> ---
>
> Key: TS-5104
> URL: https://issues.apache.org/jira/browse/TS-5104
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP
>Affects Versions: 6.2.0
>Reporter: Oknet Xu
> Fix For: 6.2.1
>
>
> {code}
> HttpTransact::OSDNSLookup(State *s)
> {
>   static int max_dns_lookups = 3 + s->http_config_param->num_url_expansions;
>   ++s->dns_info.attempts;
> {code}
> max_dns_lookups should count:
> - 1 for the original domain, for example: oknet
> - 1 for the default expansion, for example: www.oknet.com
> - n for the url_expansions_string list, for example: oknet.org, oknet.net
> Thus, max_dns_lookups should be ```2 + s->http_config_param->num_url_expansions```.
> {code}
> HttpTransact::HostNameExpansionError_t
> HttpTransact::try_to_expand_host_name(State *s)
> {
>   static int max_dns_lookups = 2 + s->http_config_param->num_url_expansions;
>   static int last_expansion  = max_dns_lookups - 2;
> {code}
> HttpTransact::try_to_expand_host_name already uses the correct value.





[jira] [Updated] (TS-5104) Wrong retry count for OSDNSLookup if url_expansions enabled

2016-12-22 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-5104:
-
Fix Version/s: 6.2.1

> Wrong retry count for OSDNSLookup if url_expansions enabled
> ---
>
> Key: TS-5104
> URL: https://issues.apache.org/jira/browse/TS-5104
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP
>Affects Versions: 6.2.0
>Reporter: Oknet Xu
> Fix For: 6.2.1
>
>
> {code}
> HttpTransact::OSDNSLookup(State *s)
> {
>   static int max_dns_lookups = 3 + s->http_config_param->num_url_expansions;
>   ++s->dns_info.attempts;
> {code}
> max_dns_lookups should count:
> - 1 for the original domain, for example: oknet
> - 1 for the default expansion, for example: www.oknet.com
> - n for the url_expansions_string list, for example: oknet.org, oknet.net
> Thus, max_dns_lookups should be ```2 + s->http_config_param->num_url_expansions```.
> {code}
> HttpTransact::HostNameExpansionError_t
> HttpTransact::try_to_expand_host_name(State *s)
> {
>   static int max_dns_lookups = 2 + s->http_config_param->num_url_expansions;
>   static int last_expansion  = max_dns_lookups - 2;
> {code}
> HttpTransact::try_to_expand_host_name already uses the correct value.





[jira] [Updated] (TS-4583) CID 1021958: Null-pointer dereference after check

2016-12-22 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4583:
-
Fix Version/s: (was: 6.2.1)
   7.1.0

> CID 1021958: Null-pointer dereference after check
> -
>
> Key: TS-4583
> URL: https://issues.apache.org/jira/browse/TS-4583
> Project: Traffic Server
>  Issue Type: Bug
>Reporter: Jari Alhonen
>Assignee: Jari Alhonen
> Fix For: 7.1.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> HttpSM.cc checks server_entry against NULL, suggesting it might be NULL, 
> before calling release_server_session(), which dereferences server_entry.
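
The pattern flagged here can be sketched with hypothetical types (not the real HttpSM members): a pointer is compared against null in one place, yet a function that dereferences it is called outside that check. Keeping the dereference inside the guard removes the inconsistency.

```cpp
// Hypothetical types for illustration only; not the real HttpSM members.
struct ServerEntry {
  bool in_tunnel = false;
};

struct Session {
  ServerEntry *server_entry = nullptr;
  int released              = 0;

  // Dereferences server_entry, so it is only safe when the pointer is non-null.
  void release_unchecked()
  {
    server_entry->in_tunnel = false;
    ++released;
  }

  // Keeps the dereference inside the null check.
  void release_checked()
  {
    if (server_entry != nullptr) {
      release_unchecked();
    }
  }
};
```

Whether the real code should guard the call or drop the redundant check depends on whether server_entry can actually be null at that point, which the issue text does not settle.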





[jira] [Updated] (TS-4583) CID 1021958: Null-pointer dereference after check

2016-12-22 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4583:
-
Affects Version/s: (was: 6.2.0)

> CID 1021958: Null-pointer dereference after check
> -
>
> Key: TS-4583
> URL: https://issues.apache.org/jira/browse/TS-4583
> Project: Traffic Server
>  Issue Type: Bug
>Reporter: Jari Alhonen
>Assignee: Jari Alhonen
> Fix For: 7.1.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> HttpSM.cc checks server_entry against NULL, suggesting it might be NULL, 
> before calling release_server_session(), which dereferences server_entry.





[jira] [Updated] (TS-4583) CID 1021958: Null-pointer dereference after check

2016-12-22 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4583:
-
Affects Version/s: 6.2.0

> CID 1021958: Null-pointer dereference after check
> -
>
> Key: TS-4583
> URL: https://issues.apache.org/jira/browse/TS-4583
> Project: Traffic Server
>  Issue Type: Bug
>Affects Versions: 6.2.0
>Reporter: Jari Alhonen
>Assignee: Jari Alhonen
> Fix For: 6.2.1
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> HttpSM.cc checks server_entry against NULL, suggesting it might be NULL, 
> before calling release_server_session(), which dereferences server_entry.





[jira] [Updated] (TS-4583) CID 1021958: Null-pointer dereference after check

2016-12-22 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4583:
-
Fix Version/s: (was: 7.1.0)
   6.2.1

> CID 1021958: Null-pointer dereference after check
> -
>
> Key: TS-4583
> URL: https://issues.apache.org/jira/browse/TS-4583
> Project: Traffic Server
>  Issue Type: Bug
>Affects Versions: 6.2.0
>Reporter: Jari Alhonen
>Assignee: Jari Alhonen
> Fix For: 6.2.1
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> HttpSM.cc checks server_entry against NULL, suggesting it might be NULL, 
> before calling release_server_session(), which dereferences server_entry.





[jira] [Created] (TS-5104) Wrong retry count for OSDNSLookup if url_expansions enabled

2016-12-22 Thread Oknet Xu (JIRA)
Oknet Xu created TS-5104:


 Summary: Wrong retry count for OSDNSLookup if url_expansions 
enabled
 Key: TS-5104
 URL: https://issues.apache.org/jira/browse/TS-5104
 Project: Traffic Server
  Issue Type: Bug
  Components: HTTP
Reporter: Oknet Xu


{code}
HttpTransact::OSDNSLookup(State *s)
{
  static int max_dns_lookups = 3 + s->http_config_param->num_url_expansions;
  ++s->dns_info.attempts;
{code}

max_dns_lookups should count:
- 1 for the original domain, for example: oknet
- 1 for the default expansion, for example: www.oknet.com
- n for the url_expansions_string list, for example: oknet.org, oknet.net

Thus, max_dns_lookups should be ```2 + s->http_config_param->num_url_expansions```.

{code}
HttpTransact::HostNameExpansionError_t
HttpTransact::try_to_expand_host_name(State *s)
{
  static int max_dns_lookups = 2 + s->http_config_param->num_url_expansions;
  static int last_expansion  = max_dns_lookups - 2;
{code}

HttpTransact::try_to_expand_host_name already uses the correct value.
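
The count of 2 + n can be checked with a small worked example. The helper below is hypothetical, and the way names are formed (a `www.` prefix with a `.com` suffix, then one suffix per configured expansion) is an assumption taken from the examples in this report, not from the Traffic Server source.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: list the names that would be tried for a short host name.
std::vector<std::string> candidate_names(const std::string &host, const std::vector<std::string> &expansions)
{
  std::vector<std::string> names;
  names.push_back(host);                   // 1 lookup: the name as given
  names.push_back("www." + host + ".com"); // 1 lookup: the default expansion
  for (const std::string &suffix : expansions) {
    names.push_back(host + "." + suffix);  // n lookups: one per configured expansion
  }
  return names;
}
```

For the host `oknet` with expansions `org` and `net`, the helper yields four names, which matches 2 + 2 rather than 3 + 2.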





[jira] [Updated] (TS-5103) Always tunnel non-keepalive HTTP request if tr-pass enabled

2016-12-22 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-5103:
-
Summary: Always tunnel non-keepalive HTTP request if tr-pass enabled  (was: 
Always tunnel pipeline non-keepalive HTTP request if tr-pass enabled)

> Always tunnel non-keepalive HTTP request if tr-pass enabled
> ---
>
> Key: TS-5103
> URL: https://issues.apache.org/jira/browse/TS-5103
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP
>Reporter: Oknet Xu
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The check below should use ua_buffer_reader instead of ua_raw_buffer_reader.
> {code}
>   // If we had a GET request that has data after the
>   // get request, do blind tunnel
> } else if (state == PARSE_DONE && t_state.hdr_info.client_request.method_get_wksidx() == HTTP_WKSIDX_GET &&
>            ua_raw_buffer_reader->read_avail() > 0 && !t_state.hdr_info.client_request.is_keep_alive_set()) {
>   do_blind_tunnel = true;
> }
> {code}





[jira] [Created] (TS-5103) Always tunnel pipeline non-keepalive HTTP request if tr-pass enabled

2016-12-22 Thread Oknet Xu (JIRA)
Oknet Xu created TS-5103:


 Summary: Always tunnel pipeline non-keepalive HTTP request if 
tr-pass enabled
 Key: TS-5103
 URL: https://issues.apache.org/jira/browse/TS-5103
 Project: Traffic Server
  Issue Type: Bug
  Components: HTTP
Reporter: Oknet Xu


The check below should use ua_buffer_reader instead of ua_raw_buffer_reader.

{code}
  // If we had a GET request that has data after the
  // get request, do blind tunnel
} else if (state == PARSE_DONE && t_state.hdr_info.client_request.method_get_wksidx() == HTTP_WKSIDX_GET &&
           ua_raw_buffer_reader->read_avail() > 0 && !t_state.hdr_info.client_request.is_keep_alive_set()) {
  do_blind_tunnel = true;
}
{code}
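
A rough sketch of why the choice of reader matters, under an assumption the report does not spell out: the raw reader keeps the whole request and is never advanced by the header parser, so it always reports bytes available, whereas the parsing reader only has bytes left when data really follows the header. The `Reader` type and `has_trailing_data` are hypothetical and far simpler than the real IOBufferReader.

```cpp
#include <cstddef>
#include <string>

// Hypothetical minimal reader: an independent read offset into shared bytes.
struct Reader {
  const std::string *data;
  std::size_t offset;

  std::size_t read_avail() const { return data->size() - offset; }
  void consume(std::size_t n) { offset += n; }
};

// True when bytes remain, as seen through the given reader.
inline bool has_trailing_data(const Reader &r)
{
  return r.read_avail() > 0;
}
```

If the parser consumes the full header through one reader and a second reader over the same bytes is left at offset zero, only the first one answers the question "is there data after the GET request" correctly.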





[jira] [Commented] (TS-5082) Compile error: undefined reference to IOBufferReader::is_read_avail_more_than

2016-12-07 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731084#comment-15731084
 ] 

Oknet Xu commented on TS-5082:
--

From my understanding:

- {code}TS_INLINE{code} is defined as {code}inline{code} in 
lib/ts/ink_apidefs.h only for libts; for libts.so, functions declared with 
TS_INLINE are inline functions.
- {code}TS_INLINE{code} is defined as empty in iocore/*/Inline.cc; for the 
submodule static libraries (*.a), functions declared with TS_INLINE are not 
inline functions.

I don't know the reasoning behind the TS_INLINE macro; this issue only fixes 
the compile error.

[~zwoop]

> Compile error: undefined reference to IOBufferReader::is_read_avail_more_than
> -
>
> Key: TS-5082
> URL: https://issues.apache.org/jira/browse/TS-5082
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Oknet Xu
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> autoreconf -fi
> ./configure
> make
> The build fails with an error like the one below:
> {code}
> CXXLD traffic_server
> undefined reference to IOBufferReader::is_read_avail_more_than(...)
> {code}
> file: P_IOBuffer.h
> {code}
> inline bool
> IOBufferReader::is_read_avail_more_than(int64_t size)
> {code}
> should be
> {code}
> TS_INLINE bool
> IOBufferReader::is_read_avail_more_than(int64_t size)
> {code}





[jira] [Comment Edited] (TS-5082) Compile error: undefined reference to IOBufferReader::is_read_avail_more_than

2016-12-07 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731084#comment-15731084
 ] 

Oknet Xu edited comment on TS-5082 at 12/8/16 4:49 AM:
---

From my understanding:

- *TS_INLINE* is defined as *inline* in lib/ts/ink_apidefs.h only for libts; 
for libts.so, functions declared with *TS_INLINE* are inline functions.
- *TS_INLINE* is defined as empty in iocore/\*/Inline.cc; for the submodule 
static libraries (\*.a), functions declared with *TS_INLINE* are *NOT* inline 
functions.

I don't know the reasoning behind the TS_INLINE macro; this issue only fixes 
the compile error.

[~zwoop]


was (Author: oknet):
From my understanding:

- {code}TS_INLINE{code} is defined as {code}inline{code} in 
lib/ts/ink_apidefs.h only for libts; for libts.so, functions declared with 
TS_INLINE are inline functions.
- {code}TS_INLINE{code} is defined as empty in iocore/*/Inline.cc; for the 
submodule static libraries (*.a), functions declared with TS_INLINE are not 
inline functions.

I don't know the reasoning behind the TS_INLINE macro; this issue only fixes 
the compile error.

[~zwoop]

> Compile error: undefined reference to IOBufferReader::is_read_avail_more_than
> -
>
> Key: TS-5082
> URL: https://issues.apache.org/jira/browse/TS-5082
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Oknet Xu
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> autoreconf -fi
> ./configure
> make
> The build fails with an error like the one below:
> {code}
> CXXLD traffic_server
> undefined reference to IOBufferReader::is_read_avail_more_than(...)
> {code}
> file: P_IOBuffer.h
> {code}
> inline bool
> IOBufferReader::is_read_avail_more_than(int64_t size)
> {code}
> should be
> {code}
> TS_INLINE bool
> IOBufferReader::is_read_avail_more_than(int64_t size)
> {code}





[jira] [Created] (TS-5082) Compile error: undefined reference to IOBufferReader::is_read_avail_more_than

2016-12-07 Thread Oknet Xu (JIRA)
Oknet Xu created TS-5082:


 Summary: Compile error: undefined reference to 
IOBufferReader::is_read_avail_more_than
 Key: TS-5082
 URL: https://issues.apache.org/jira/browse/TS-5082
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Oknet Xu


autoreconf -fi
./configure
make

The build fails with an error like the one below:
{code}
CXXLD traffic_server
undefined reference to IOBufferReader::is_read_avail_more_than(...)
{code}

file: P_IOBuffer.h
{code}
inline bool
IOBufferReader::is_read_avail_more_than(int64_t size)
{code}
should be
{code}
TS_INLINE bool
IOBufferReader::is_read_avail_more_than(int64_t size)
{code}
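
The mechanism behind the proposed change can be sketched with a stand-in macro. `DEMO_INLINE` and the `Reader` type are hypothetical names chosen so the example does not clash with the real TS_INLINE. The general C++ rule is that a function marked `inline` must be defined in every translation unit that calls it, so a hard-coded `inline` in a header that callers do not include can leave the linker without a definition. A macro that one source file defines as empty lets that file emit an ordinary out-of-line definition.

```cpp
// Hypothetical macro; defaults to `inline` unless a source file
// defines it as empty before including this code.
#ifndef DEMO_INLINE
#define DEMO_INLINE inline
#endif

struct Reader {
  long avail;
  bool is_read_avail_more_than(long size) const;
};

// With DEMO_INLINE empty this becomes a normal out-of-line definition
// that other translation units can link against.
DEMO_INLINE bool
Reader::is_read_avail_more_than(long size) const
{
  return avail > size;
}
```

A file that wants the out-of-line copy would place `#define DEMO_INLINE` before including the header, which is the role the comments on this issue ascribe to the Inline.cc files.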






[jira] [Commented] (TS-4206) Stalled connections show a client request but no HTTP response

2016-12-04 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721202#comment-15721202
 ] 

Oknet Xu commented on TS-4206:
--

Please try the patch from TS-5076.

> Stalled connections show a client request but no HTTP response
> --
>
> Key: TS-4206
> URL: https://issues.apache.org/jira/browse/TS-4206
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core, HTTP
>Reporter: Eric Sproul
>Assignee: Bryan Call
>Priority: Blocker
>  Labels: A, regression
> Fix For: 7.1.0
>
>
> Have been discussing this one on IRC but wanted to capture everything here 
> since it seems like a fairly serious issue.  Since upgrading from 5.3.2 to 
> 6.1.1 we have witnessed connections that, from the client perspective, seem 
> to stall after the client sends the request.  TS logs the connection only 
> after it seemingly hits {{proxy.config.net.default_inactivity_timeout}} (5 
> minutes), but logs a response code of 000, despite the presence of a request 
> (e.g. GET with a request URL logged).
> For the time being we have failed back to 5.3.2 but I was able to capture a 
> sample of this situation in the slow log.  [This 
> paste|http://apaste.info/ds1] shows the slow log as well as the corresponding 
> squid.blog entry (default format).
> This issue feels similar to TS-3456 but we are not using tunneling, though we 
> are using SSL/TLS in this case.





[jira] [Updated] (TS-5076) NetVC is lost from read or write enable_list

2016-12-02 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-5076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-5076:
-
Backport to Version: 5.3.3, 6.2.1, 7.0.1

> NetVC is lost from read or write enable_list
> 
>
> Key: TS-5076
> URL: https://issues.apache.org/jira/browse/TS-5076
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Network
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.1.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The related code here:
> {code}
> void
> UnixNetVConnection::reenable(VIO *vio)
> {
>   if (STATE_FROM_VIO(vio)->enabled)
>     return;
>   set_enabled(vio);
>   if (!thread)
>     return;
>   EThread *t = vio->mutex->thread_holding;
>   ink_assert(t == this_ethread());
>   ink_release_assert(!closed);
>   if (nh->mutex->thread_holding == t) {
>     ...
>     MUTEX_TRY_LOCK(lock, nh->mutex, t);
>     if (!lock.is_locked()) {
>       if (vio == &read.vio) {
>         if (!read.in_enabled_list) {      // ---> the condition check is not atomic
>           read.in_enabled_list = 1;       // ---> the variable set is not atomic
>           nh->read_enable_list.push(this);
>         }
>       } else {
>         if (!write.in_enabled_list) {     // ---> the write side
>           write.in_enabled_list = 1;      // ---> the write side
>           nh->write_enable_list.push(this);
>         }
>       }
>       if (nh->trigger_event && nh->trigger_event->ethread->signal_hook)
>         nh->trigger_event->ethread->signal_hook(nh->trigger_event->ethread);
>     } else {
>       ...
>     }
>   }
> }
> {code}
> Because the check and the set of in_enabled_list are not atomic,
> nh->read_enable_list.push(this) can push a netvc into the atomic queue
> while it is already in that queue.
> As a result, the elements that follow the netvc in the atomic queue are lost.





[jira] [Created] (TS-5076) NetVC is lost from read or write enable_list

2016-12-02 Thread Oknet Xu (JIRA)
Oknet Xu created TS-5076:


 Summary: NetVC is lost from read or write enable_list
 Key: TS-5076
 URL: https://issues.apache.org/jira/browse/TS-5076
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Oknet Xu


The related code here:

{code}
void
UnixNetVConnection::reenable(VIO *vio)
{
  if (STATE_FROM_VIO(vio)->enabled)
    return;
  set_enabled(vio);
  if (!thread)
    return;
  EThread *t = vio->mutex->thread_holding;
  ink_assert(t == this_ethread());
  ink_release_assert(!closed);
  if (nh->mutex->thread_holding == t) {
    ...
    MUTEX_TRY_LOCK(lock, nh->mutex, t);
    if (!lock.is_locked()) {
      if (vio == &read.vio) {
        if (!read.in_enabled_list) {      // ---> the condition check is not atomic
          read.in_enabled_list = 1;       // ---> the variable set is not atomic
          nh->read_enable_list.push(this);
        }
      } else {
        if (!write.in_enabled_list) {     // ---> the write side
          write.in_enabled_list = 1;      // ---> the write side
          nh->write_enable_list.push(this);
        }
      }
      if (nh->trigger_event && nh->trigger_event->ethread->signal_hook)
        nh->trigger_event->ethread->signal_hook(nh->trigger_event->ethread);
    } else {
      ...
    }
  }
}
{code}

Because the check and the set of in_enabled_list are not atomic,
nh->read_enable_list.push(this) can push a netvc into the atomic queue
while it is already in that queue.

As a result, the elements that follow the netvc in the atomic queue are lost.
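
One way to close this window is to test and set the flag in a single atomic
step. The following is only a minimal sketch of that idea, not the actual
TS-5076 patch: Conn, EnableList and enable_once are hypothetical stand-ins
for the NetVC, the NetHandler enable list and the enqueue path, and a
mutex-protected vector stands in for the atomic list.

```cpp
#include <atomic>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a NetVC: only the "already queued" flag matters here.
struct Conn {
  std::atomic<int> in_enabled_list{0};
};

// Hypothetical stand-in for the NetHandler enable list.
struct EnableList {
  std::mutex lock;
  std::vector<Conn *> items;

  void push(Conn *c) {
    std::lock_guard<std::mutex> guard(lock);
    items.push_back(c);
  }
};

// Test and set the flag in one atomic step, so that only the caller that
// flips it from 0 to 1 pushes the connection. Returns true if this call pushed.
bool enable_once(EnableList &list, Conn &c) {
  if (c.in_enabled_list.exchange(1) == 0) {
    list.push(&c);
    return true;
  }
  return false;
}
```

With this shape, a second enable_once() on the same Conn returns false
without touching the list, so the connection cannot be queued twice until
the consumer clears the flag again.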





[jira] [Closed] (TS-4910) We should get vc->mutex before do_io when the vc is acquired from session pool

2016-12-02 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu closed TS-4910.

Resolution: Not A Bug
  Assignee: Oknet Xu

> We should get vc->mutex before do_io when the vc is acquired from session pool
> --
>
> Key: TS-4910
> URL: https://issues.apache.org/jira/browse/TS-4910
> Project: Traffic Server
>  Issue Type: Bug
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.1.0
>
>
> Component: ServerSessionPool
> (I could not find ServerSessionPool in JIRA)
> source: proxy/http/HttpSessionManager.cc
> {code}
> // Now check to see if we have a connection in our shared connection pool
> EThread *ethread       = this_ethread();
> ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) ?
>                            ethread->server_session_pool->mutex.get() :
>                            m_g_pool->mutex.get();
> MUTEX_TRY_LOCK(lock, pool_mutex, ethread);
> if (lock.is_locked()) {
>   if (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) {
>     retval = ethread->server_session_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return);
>     Debug("http_ss", "[acquire session] thread pool search %s", to_return ? "successful" : "failed");
>   } else {
>     retval = m_g_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return);
>     Debug("http_ss", "[acquire session] global pool search %s", to_return ? "successful" : "failed");
>     // At this point to_return has been removed from the pool. Do we need to move it
>     // to the same thread?
>     if (to_return) {
>       UnixNetVConnection *server_vc = dynamic_cast<UnixNetVConnection *>(to_return->get_netvc());
>       if (server_vc) {
>         UnixNetVConnection *new_vc = server_vc->migrateToCurrentThread(sm, ethread);
> {code}
> As the code above shows:
> 1. we get pool_mutex first
> 2. then acquire a vc from the session pool
> 3. then migrate the vc to the current thread without getting vc->mutex
> According to the comments, an SM only accesses the VIO & VC that are
> returned with a callback.
> The mutex of the ServerSession may differ from that of server_vc once the
> session is acquired from ServerSessionPool and attached to HttpSM.
> HttpSM should get server_vc->nh->mutex and server_vc->mutex before
> calling ServerSession->do_io(). Currently HttpSM does not.
> Mutex creation & usage, listed below in timeline order:
> 1. ClientVC is accepted from NetAccept with a newly allocated mutex.
> 2. ClientSession is created and shares the same mutex with ClientVC.
> 3. HttpSM is created and shares the same mutex with ClientVC.
> 4. NetHandler gets HttpSM->mutex and calls back READ_READY to HttpSM via
> ClientVC->read.vio._cont.
> {code}
> ClientVC->nh->mutex is locked by EventSystem
> HttpSM->mutex is locked by NetHandler
> ClientVC->mutex is locked because it shares the same mutex with HttpSM
> To access & modify a NetVC, vc->nh->mutex and vc->mutex should both be
> locked simultaneously.
> {code}
> 5. Scenario 1:
> HttpSM creates ServerVC with HttpSM->mutex via netProcessor.connect_re().
> Then HttpSM creates ServerSession with ServerVC, and it shares the same
> mutex with HttpSM.
> 5. Scenario 2:
> HttpSM acquires a ServerSession from SessionPool and ServerVC->thread is
> not the current EThread.
> ServerSession->mutex is set to HttpSM->mutex when it is attached to
> HttpSM.
> But ServerVC->mutex is still the old one.
> The first bug:
> Before VC Migration was merged:
> - ServerSession->do_io() is called and directly calls ServerVC->do_io()
> without getting ServerVC->mutex first.
> After VC Migration was merged:
> - ServerVC is migrated into the current thread without getting
> ServerVC->mutex first.
> The second bug:
> Before VC Migration was merged:
> - Any do_io() should lock vc->nh->mutex and vc->mutex simultaneously.
> Suggestion:
> 1. Revert VC Migration
> 2. Re-design ServerSession
> To re-design ServerSession:
> 1. Add NetHandler *servervc_nh to save ServerVC->nh when ServerVC is
> called back to HttpSM with VC_EVENT_NET_OPEN.
> 2. For do_io(), check whether servervc_nh == get_NetHandler(this_ethread())
> 3a. if equal, directly call ServerVC->do_io() and return ServerVC->VIO
> 3b. if not equal, try to lock servervc_nh->mutex and ServerVC->mutex
> 4a. if locked, directly call ServerVC->do_io() and return ServerVC->VIO
> 4b. if not, create a Cont and schedule it on
> servervc_nh->trigger_event->thread. The Cont will call ServerVC->do_io()
> later.





[jira] [Commented] (TS-4910) We should get vc->mutex before do_io when the vc is acquired from session pool

2016-12-02 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717396#comment-15717396
 ] 

Oknet Xu commented on TS-4910:
--

Sorry, this is not a bug.

The NetVC->mutex is unused after the NetVC is pushed into NetHandler.
And nh->mutex should be acquired before accessing a netvc that does not
belong to the current thread.

> We should get vc->mutex before do_io when the vc is acquired from session pool
> --
>
> Key: TS-4910
> URL: https://issues.apache.org/jira/browse/TS-4910
> Project: Traffic Server
>  Issue Type: Bug
>Reporter: Oknet Xu
> Fix For: 7.1.0
>
>
> Component: ServerSessionPool
> (I could not find ServerSessionPool in JIRA)
> source: proxy/http/HttpSessionManager.cc
> {code}
> // Now check to see if we have a connection in our shared connection pool
> EThread *ethread       = this_ethread();
> ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) ?
>                            ethread->server_session_pool->mutex.get() :
>                            m_g_pool->mutex.get();
> MUTEX_TRY_LOCK(lock, pool_mutex, ethread);
> if (lock.is_locked()) {
>   if (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) {
>     retval = ethread->server_session_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return);
>     Debug("http_ss", "[acquire session] thread pool search %s", to_return ? "successful" : "failed");
>   } else {
>     retval = m_g_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return);
>     Debug("http_ss", "[acquire session] global pool search %s", to_return ? "successful" : "failed");
>     // At this point to_return has been removed from the pool. Do we need to move it
>     // to the same thread?
>     if (to_return) {
>       UnixNetVConnection *server_vc = dynamic_cast<UnixNetVConnection *>(to_return->get_netvc());
>       if (server_vc) {
>         UnixNetVConnection *new_vc = server_vc->migrateToCurrentThread(sm, ethread);
> {code}
> As the code above shows:
> 1. we get pool_mutex first
> 2. then acquire a vc from the session pool
> 3. then migrate the vc to the current thread without getting vc->mutex
> According to the comments, an SM only accesses the VIO & VC that are
> returned with a callback.
> The mutex of the ServerSession may differ from that of server_vc once the
> session is acquired from ServerSessionPool and attached to HttpSM.
> HttpSM should get server_vc->nh->mutex and server_vc->mutex before
> calling ServerSession->do_io(). Currently HttpSM does not.
> Mutex creation & usage, listed below in timeline order:
> 1. ClientVC is accepted from NetAccept with a newly allocated mutex.
> 2. ClientSession is created and shares the same mutex with ClientVC.
> 3. HttpSM is created and shares the same mutex with ClientVC.
> 4. NetHandler gets HttpSM->mutex and calls back READ_READY to HttpSM via
> ClientVC->read.vio._cont.
> {code}
> ClientVC->nh->mutex is locked by EventSystem
> HttpSM->mutex is locked by NetHandler
> ClientVC->mutex is locked because it shares the same mutex with HttpSM
> To access & modify a NetVC, vc->nh->mutex and vc->mutex should both be
> locked simultaneously.
> {code}
> 5. Scenario 1:
> HttpSM creates ServerVC with HttpSM->mutex via netProcessor.connect_re().
> Then HttpSM creates ServerSession with ServerVC, and it shares the same
> mutex with HttpSM.
> 5. Scenario 2:
> HttpSM acquires a ServerSession from SessionPool and ServerVC->thread is
> not the current EThread.
> ServerSession->mutex is set to HttpSM->mutex when it is attached to
> HttpSM.
> But ServerVC->mutex is still the old one.
> The first bug:
> Before VC Migration was merged:
> - ServerSession->do_io() is called and directly calls ServerVC->do_io()
> without getting ServerVC->mutex first.
> After VC Migration was merged:
> - ServerVC is migrated into the current thread without getting
> ServerVC->mutex first.
> The second bug:
> Before VC Migration was merged:
> - Any do_io() should lock vc->nh->mutex and vc->mutex simultaneously.
> Suggestion:
> 1. Revert VC Migration
> 2. Re-design ServerSession
> To re-design ServerSession:
> 1. Add NetHandler *servervc_nh to save ServerVC->nh when ServerVC is
> called back to HttpSM with VC_EVENT_NET_OPEN.
> 2. For do_io(), check whether servervc_nh == get_NetHandler(this_ethread())
> 3a. if equal, directly call ServerVC->do_io() and return ServerVC->VIO
> 3b. if not equal, try to lock servervc_nh->mutex and ServerVC->mutex
> 4a. if locked, directly call ServerVC->do_io() and return ServerVC->VIO
> 4b. if not, create a Cont and schedule it on
> servervc_nh->trigger_event->thread. The Cont will call ServerVC->do_io()
> later.




[jira] [Updated] (TS-4879) NetVC leaks while hyper emergency occur on check_emergency_throttle()

2016-11-06 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4879:
-
Backport to Version: 6.2.1

> NetVC leaks while hyper emergency occur on check_emergency_throttle()
> -
>
> Key: TS-4879
> URL: https://issues.apache.org/jira/browse/TS-4879
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> The con could be closed if a hyper emergency occurs in
> check_emergency_throttle().
> But we do not check con.fd after returning from
> check_emergency_throttle().
> For a hyper emergency:
> - The socket fd is removed from epoll when it is closed.
> - A NetVC with a closed socket fd is created and NET_EVENT_OPEN is called
> back to the SM.
> Thus:
> - The NetVC will never be triggered by NetHandler.
> - Only InactivityCop can handle the NetVC, and the default timeout value is
> 86400 secs.
> For the counter net_connections_currently_open_stat:
> - It is increased in "connect_re_internal()".
> - It isn't decreased when con.fd is set to NO_FD due to a hyper emergency,
> - because it is decreased in close_UnixNetVConnection() only if
> con.fd != NO_FD (TS-4178).
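
The missing check described above can be illustrated with a small sketch.
This is not the actual fix: Connection, the two threshold parameters and
accept_connection are hypothetical simplifications, and
check_emergency_throttle here is a reduced stand-in for the real function.
It only shows the idea of re-checking con.fd after the throttle call before
a NetVC is created or the open-connection counter is touched.

```cpp
constexpr int NO_FD = -1;

// Hypothetical, simplified connection: close() only resets the descriptor.
struct Connection {
  int fd = NO_FD;
  void close() { fd = NO_FD; }
};

// Reduced stand-in for the throttle check: past the (assumed) hyper-emergency
// threshold the connection is closed. Returns true whenever throttling fired.
bool check_emergency_throttle(Connection &con, int emergency_fd, int hyper_emergency_fd) {
  if (con.fd > emergency_fd) {
    if (con.fd > hyper_emergency_fd) {
      con.close();
    }
    return true;
  }
  return false;
}

// Returns true if a NetVC may be created for this connection. The counter is
// incremented only for a connection that is still open after the check.
bool accept_connection(Connection &con, int emergency_fd, int hyper_emergency_fd, int &currently_open) {
  check_emergency_throttle(con, emergency_fd, hyper_emergency_fd);
  if (con.fd == NO_FD) {
    return false; // closed by the hyper-emergency path: no NetVC, no counter change
  }
  ++currently_open;
  return true;
}
```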





[jira] [Updated] (TS-4885) Incorrect checking of fds_throttle and fds_limit

2016-11-06 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4885:
-
Backport to Version: 6.2.1

> Incorrect checking of fds_throttle and fds_limit
> 
>
> Key: TS-4885
> URL: https://issues.apache.org/jira/browse/TS-4885
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> static void
> check_fd_limit()
> {
>   int fds_throttle = -1;
>   REC_ReadConfigInteger(fds_throttle, "proxy.config.net.connections_throttle");
>   if (fds_throttle > fds_limit + THROTTLE_FD_HEADROOM) {  // ---> Incorrect
>     int new_fds_throttle = fds_limit - THROTTLE_FD_HEADROOM;
>     if (new_fds_throttle < 1) {
>       ink_abort("too few file descriptors (%d) available", fds_limit);
>     }
>     char msg[256];
>     snprintf(msg, sizeof(msg), "connection throttle too high, "
>                                "%d (throttle) + %d (internal use) > %d (file descriptor limit), "
>                                "using throttle of %d",
>              fds_throttle, THROTTLE_FD_HEADROOM, fds_limit, new_fds_throttle);
>     SignalWarning(MGMT_SIGNAL_SYSTEM_ERROR, msg);
>   }
> }
> {code}
> {code}
> static void
> adjust_sys_settings(void)
> {
> ...
>   REC_ReadConfigInteger(fds_throttle, "proxy.config.net.connections_throttle");
>
>   if (getrlimit(RLIMIT_NOFILE, &lim) == 0) {
>     if (fds_throttle > (int)(lim.rlim_cur + THROTTLE_FD_HEADROOM)) {  // --> Incorrect
>       lim.rlim_cur = (lim.rlim_max = (rlim_t)fds_throttle);
>       if (setrlimit(RLIMIT_NOFILE, &lim) == 0 && getrlimit(RLIMIT_NOFILE, &lim) == 0) {
>         fds_limit = (int)lim.rlim_cur;
>         syslog(LOG_NOTICE, "NOTE: RLIMIT_NOFILE(%d):cur(%d),max(%d)", RLIMIT_NOFILE, (int)lim.rlim_cur, (int)lim.rlim_max);
>       }
>     }
>   }
> ...
> }
> {code}
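
The warning text in check_fd_limit() states the intended condition:
throttle + internal use > file descriptor limit. A minimal sketch of a
comparison that matches that message follows. It is an illustration, not
the committed patch: throttle_too_high and effective_throttle are
hypothetical helper names, the headroom value is a placeholder, and the
"too few file descriptors" abort path is left out.

```cpp
// Placeholder value for illustration; the real THROTTLE_FD_HEADROOM is
// defined by Traffic Server.
constexpr int THROTTLE_FD_HEADROOM = 128;

// True when the throttle plus the reserved headroom exceeds the fd limit,
// i.e. fds_throttle + THROTTLE_FD_HEADROOM > fds_limit.
bool throttle_too_high(int fds_throttle, int fds_limit) {
  return fds_throttle > fds_limit - THROTTLE_FD_HEADROOM;
}

// The throttle to use: the configured one if it fits, otherwise the limit
// minus the headroom (the value the warning message reports).
int effective_throttle(int fds_throttle, int fds_limit) {
  if (throttle_too_high(fds_throttle, fds_limit)) {
    return fds_limit - THROTTLE_FD_HEADROOM;
  }
  return fds_throttle;
}
```

For example, with a limit of 1000 and a throttle of 900, the comparison
marked "Incorrect" (900 > 1000 + 128) does not fire, while the one above
does and lowers the throttle to 872.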





[jira] [Updated] (TS-2482) Problems with SOCKS

2016-11-06 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-2482:
-
Backport to Version: 6.2.1

> Problems with SOCKS
> ---
>
> Key: TS-2482
> URL: https://issues.apache.org/jira/browse/TS-2482
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core, SOCKS
>Reporter: Radim Kolar
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> There are several problems with using SOCKS. I am interested in the case
> when TS is a SOCKS client: the client sends an HTTP request and TS uses a
> SOCKS server to make the connection to the internet.
> a/ - not documented enough in default configs
> From the comments in the default configs it seems that for running
> TS 4.1.2 as a SOCKS client, it is sufficient to add one line to socks.config:
> dest_ip=0.0.0.0-255.255.255.255 parent="10.0.0.7:9050"
> but the SOCKS proxy is not used. If I run tcpdump to sniff packets, TS never
> tries to connect to that SOCKS server.
> From the source code -
> https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc - it
> looks like "proxy.config.socks.socks_needed" must be set to activate
> SOCKS support. This should be documented in both sample files: socks.config
> and records.config
> b/
> After enabling SOCKS, I am hit by this assert:
> Assertion failed: (ats_is_ip4(_addr)), function init, file Socks.cc, 
> line 65.
> I run on a dual-stack system (IPv4, IPv6).
> Is this code setting the default destination for the SOCKS request? Could you
> not just use 127.0.0.1 for the case where the client connects over IPv6?
> https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc#L66





[jira] [Updated] (TS-4910) We should get vc->mutex before do_io when the vc is acquired from session pool

2016-09-29 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4910:
-
Description: 
Component: ServerSessionPool
(I could not find ServerSessionPool in JIRA)

source: proxy/http/HttpSessionManager.cc
{code}
// Now check to see if we have a connection in our shared connection pool
EThread *ethread       = this_ethread();
ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) ?
                           ethread->server_session_pool->mutex.get() :
                           m_g_pool->mutex.get();
MUTEX_TRY_LOCK(lock, pool_mutex, ethread);
if (lock.is_locked()) {
  if (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) {
    retval = ethread->server_session_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return);
    Debug("http_ss", "[acquire session] thread pool search %s", to_return ? "successful" : "failed");
  } else {
    retval = m_g_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return);
    Debug("http_ss", "[acquire session] global pool search %s", to_return ? "successful" : "failed");
    // At this point to_return has been removed from the pool. Do we need to move it
    // to the same thread?
    if (to_return) {
      UnixNetVConnection *server_vc = dynamic_cast<UnixNetVConnection *>(to_return->get_netvc());
      if (server_vc) {
        UnixNetVConnection *new_vc = server_vc->migrateToCurrentThread(sm, ethread);
{code}

As the code above shows:

1. we get pool_mutex first
2. then acquire a vc from the session pool
3. then migrate the vc to the current thread without getting vc->mutex

According to the comments, an SM only accesses the VIO & VC that are returned
with a callback.

The mutex of the ServerSession may differ from that of server_vc once the
session is acquired from ServerSessionPool and attached to HttpSM.

HttpSM should get server_vc->nh->mutex and server_vc->mutex before
calling ServerSession->do_io(). Currently HttpSM does not.

Mutex creation & usage, listed below in timeline order:

1. ClientVC is accepted from NetAccept with a newly allocated mutex.
2. ClientSession is created and shares the same mutex with ClientVC.
3. HttpSM is created and shares the same mutex with ClientVC.
4. NetHandler gets HttpSM->mutex and calls back READ_READY to HttpSM via
ClientVC->read.vio._cont.

{code}
ClientVC->nh->mutex is locked by EventSystem
HttpSM->mutex is locked by NetHandler
ClientVC->mutex is locked because it shares the same mutex with HttpSM

To access & modify a NetVC, vc->nh->mutex and vc->mutex should both be locked
simultaneously.
{code}

5. Scenario 1:
HttpSM creates ServerVC with HttpSM->mutex via netProcessor.connect_re().
Then HttpSM creates ServerSession with ServerVC, and it shares the same mutex
with HttpSM.

5. Scenario 2:
HttpSM acquires a ServerSession from SessionPool and ServerVC->thread is not
the current EThread.
ServerSession->mutex is set to HttpSM->mutex when it is attached to HttpSM.
But ServerVC->mutex is still the old one.

The first bug:
Before VC Migration was merged:
- ServerSession->do_io() is called and directly calls ServerVC->do_io() without
getting ServerVC->mutex first.

After VC Migration was merged:
- ServerVC is migrated into the current thread without getting ServerVC->mutex
first.

The second bug:
Before VC Migration was merged:
- Any do_io() should lock vc->nh->mutex and vc->mutex simultaneously.


Suggestion:
1. Revert VC Migration
2. Re-design ServerSession

To re-design ServerSession:
1. Add NetHandler *servervc_nh to save ServerVC->nh when ServerVC is
called back to HttpSM with VC_EVENT_NET_OPEN.
2. For do_io(), check whether servervc_nh == get_NetHandler(this_ethread())
3a. if equal, directly call ServerVC->do_io() and return ServerVC->VIO
3b. if not equal, try to lock servervc_nh->mutex and ServerVC->mutex
4a. if locked, directly call ServerVC->do_io() and return ServerVC->VIO
4b. if not, create a Cont and schedule it on
servervc_nh->trigger_event->thread. The Cont will call ServerVC->do_io() later.


  was:
Component: ServerSessionPool
(I cound not found ServerSessionPool from JIRA)

source: proxy/http/HttpSessionManager.cc
{code}
309 // Now check to see if we have a connection in our shared connection 
pool
310 EThread *ethread   = this_ethread();
311 ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD == 
sm->t_state.http_config_param->server_session_sharing_pool) ?
312ethread->server_session_pool->mutex.get() :
313m_g_pool->mutex.get();
314 MUTEX_TRY_LOCK(lock, pool_mutex, ethread);
315 if (lock.is_locked()) {
316   if (TS_SERVER_SESSION_SHARING_POOL_THREAD == 
sm->t_state.http_config_param->server_session_sharing_pool) {
317 retval = 

[jira] [Updated] (TS-4910) We should get vc->mutex before do_io when the vc is acquired from session pool

2016-09-29 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4910:
-
Description: 
Component: ServerSessionPool
(I could not find ServerSessionPool in JIRA)

source: proxy/http/HttpSessionManager.cc
{code}
// Now check to see if we have a connection in our shared connection pool
EThread *ethread       = this_ethread();
ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) ?
                           ethread->server_session_pool->mutex.get() :
                           m_g_pool->mutex.get();
MUTEX_TRY_LOCK(lock, pool_mutex, ethread);
if (lock.is_locked()) {
  if (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) {
    retval = ethread->server_session_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return);
    Debug("http_ss", "[acquire session] thread pool search %s", to_return ? "successful" : "failed");
  } else {
    retval = m_g_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return);
    Debug("http_ss", "[acquire session] global pool search %s", to_return ? "successful" : "failed");
    // At this point to_return has been removed from the pool. Do we need to move it
    // to the same thread?
    if (to_return) {
      UnixNetVConnection *server_vc = dynamic_cast<UnixNetVConnection *>(to_return->get_netvc());
      if (server_vc) {
        UnixNetVConnection *new_vc = server_vc->migrateToCurrentThread(sm, ethread);
{code}

As the code above shows:

1. we get pool_mutex first
2. then acquire a vc from the session pool
3. then migrate the vc to the current thread without getting vc->mutex

According to the comments, an SM only accesses the VIO & VC that are returned
with a callback.

The mutex of the ServerSession may differ from that of server_vc once the
session is acquired from ServerSessionPool and attached to HttpSM.

HttpSM should get server_vc->nh->mutex and server_vc->mutex before
calling ServerSession->do_io(). Currently HttpSM does not.

Mutex creation & usage, listed below in timeline order:

1. ClientVC is accepted from NetAccept with a newly allocated mutex.
2. ClientSession is created and shares the same mutex with ClientVC.
3. HttpSM is created and shares the same mutex with ClientVC.
4. NetHandler gets HttpSM->mutex and calls back READ_READY to HttpSM via
ClientVC->read.vio._cont.

{code}
ClientVC->nh->mutex is locked by EventSystem
HttpSM->mutex is locked by NetHandler
ClientVC->mutex is locked because it shares the same mutex with HttpSM

To access & modify a NetVC, vc->nh->mutex and vc->mutex should both be locked
simultaneously.
{code}

5. Scenario 1:
HttpSM creates ServerVC with HttpSM->mutex via netProcessor.connect_re().
Then HttpSM creates ServerSession with ServerVC, and it shares the same mutex
with HttpSM.

5. Scenario 2:
HttpSM acquires a ServerSession from SessionPool and ServerVC->thread is not
the current EThread.
ServerSession->mutex is set to HttpSM->mutex when it is attached to HttpSM.
But ServerVC->mutex is still the old one.

The first bug:
Before VC Migration was merged:
- ServerSession->do_io() is called and directly calls ServerVC->do_io() without
getting ServerVC->mutex first.
After VC Migration was merged:
- ServerVC is migrated into the current thread without getting ServerVC->mutex
first.

The second bug:
Before VC Migration was merged:
- Any do_io() should lock vc->nh->mutex and vc->mutex simultaneously.


Suggestion:
1. Revert VC Migration
2. Re-design ServerSession

To re-design ServerSession:
1. Add NetHandler *servervc_nh to save ServerVC->nh when ServerVC is
called back to HttpSM with VC_EVENT_NET_OPEN.
2. For do_io(), check whether servervc_nh == get_NetHandler(this_ethread())
3a. if equal, directly call ServerVC->do_io() and return ServerVC->VIO
3b. if not equal, try to lock servervc_nh->mutex and ServerVC->mutex
4a. if locked, directly call ServerVC->do_io() and return ServerVC->VIO
4b. if not, create a Cont and schedule it on
servervc_nh->trigger_event->thread. The Cont will call ServerVC->do_io() later.


  was:
Component: ServerSessionPool
(I cound not found ServerSessionPool from JIRA)

source: proxy/http/HttpSessionManager.cc
```
309 // Now check to see if we have a connection in our shared connection 
pool
310 EThread *ethread   = this_ethread();
311 ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD == 
sm->t_state.http_config_param->server_session_sharing_pool) ?
312ethread->server_session_pool->mutex.get() :
313m_g_pool->mutex.get();
314 MUTEX_TRY_LOCK(lock, pool_mutex, ethread);
315 if (lock.is_locked()) {
316   if (TS_SERVER_SESSION_SHARING_POOL_THREAD == 
sm->t_state.http_config_param->server_session_sharing_pool) {
317 retval = 

[jira] [Created] (TS-4910) We should get vc->mutex before do_io when the vc is acquired from session pool

2016-09-29 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4910:


 Summary: We should get vc->mutex before do_io when the vc is 
acquired from session pool
 Key: TS-4910
 URL: https://issues.apache.org/jira/browse/TS-4910
 Project: Traffic Server
  Issue Type: Bug
Reporter: Oknet Xu


Component: ServerSessionPool
(I could not find ServerSessionPool in JIRA)

source: proxy/http/HttpSessionManager.cc
```
// Now check to see if we have a connection in our shared connection pool
EThread *ethread       = this_ethread();
ProxyMutex *pool_mutex = (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) ?
                           ethread->server_session_pool->mutex.get() :
                           m_g_pool->mutex.get();
MUTEX_TRY_LOCK(lock, pool_mutex, ethread);
if (lock.is_locked()) {
  if (TS_SERVER_SESSION_SHARING_POOL_THREAD == sm->t_state.http_config_param->server_session_sharing_pool) {
    retval = ethread->server_session_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return);
    Debug("http_ss", "[acquire session] thread pool search %s", to_return ? "successful" : "failed");
  } else {
    retval = m_g_pool->acquireSession(ip, hostname_hash, match_style, sm, to_return);
    Debug("http_ss", "[acquire session] global pool search %s", to_return ? "successful" : "failed");
    // At this point to_return has been removed from the pool. Do we need to move it
    // to the same thread?
    if (to_return) {
      UnixNetVConnection *server_vc = dynamic_cast<UnixNetVConnection *>(to_return->get_netvc());
      if (server_vc) {
        UnixNetVConnection *new_vc = server_vc->migrateToCurrentThread(sm, ethread);
```

As the code above shows:

1. we get pool_mutex first
2. then acquire a vc from the session pool
3. then migrate the vc to the current thread without getting vc->mutex

According to the comments, an SM only accesses the VIO & VC that are returned
with a callback.

The mutex of the ServerSession may differ from that of server_vc once the
session is acquired from ServerSessionPool and attached to HttpSM.

HttpSM should get server_vc->nh->mutex and server_vc->mutex before
calling ServerSession->do_io(). Currently HttpSM does not.

ClientVC is accepted from NetAccept with a newly allocated mutex.
Then ClientSession is created and shares the same mutex with ClientVC.
Then HttpSM is created and shares the same mutex with ClientVC.
Then NetHandler gets HttpSM->mutex and calls back READ_READY to HttpSM via
ClientVC->read.vio._cont.
** ClientVC->nh->mutex is locked by EventSystem
** HttpSM->mutex is locked by NetHandler
** ClientVC->mutex is locked because it shares the same mutex with HttpSM

Scenario 1:
HttpSM creates ServerVC with HttpSM->mutex via netProcessor.connect_re().
Then HttpSM creates ServerSession with ServerVC, and it shares the same mutex
with HttpSM.

Scenario 2:
HttpSM acquires a ServerSession from SessionPool and ServerVC->thread is not
the current EThread.
ServerSession->mutex is set to HttpSM->mutex when it is attached to HttpSM.
But ServerVC->mutex is still the old one.

The first bug:
Before VC Migration was merged:
- ServerSession->do_io() is called and directly calls ServerVC->do_io() without
getting ServerVC->mutex first.
After VC Migration was merged:
- ServerVC is migrated into the current thread without getting ServerVC->mutex
first.

The second bug:
Before VC Migration was merged:
- Any do_io() should lock vc->nh->mutex and vc->mutex simultaneously.


Suggestion:
1. Revert VC Migration
2. Re-design ServerSession

To re-design ServerSession:
1. Add NetHandler *servervc_nh to save ServerVC->nh when ServerVC is
called back to HttpSM with VC_EVENT_NET_OPEN.
2. For do_io(), check whether servervc_nh == get_NetHandler(this_ethread())
3a. if equal, directly call ServerVC->do_io() and return ServerVC->VIO
3b. if not equal, try to lock servervc_nh->mutex and ServerVC->mutex
4a. if locked, directly call ServerVC->do_io() and return ServerVC->VIO
4b. if not, create a Cont and schedule it on
servervc_nh->trigger_event->thread. The Cont will call ServerVC->do_io() later.
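
The do_io() dispatch in steps 2 to 4b can be sketched roughly as follows.
This is only an illustration of the proposed control flow, not Traffic
Server code: TryMutex, NetHandler, ServerVC, IoPath and session_do_io are
hypothetical stand-ins, the try-lock replaces MUTEX_TRY_LOCK, and a plain
queue of callbacks replaces scheduling a Cont on the handler's thread.

```cpp
#include <atomic>
#include <functional>
#include <queue>

// Hypothetical try-lock, standing in for MUTEX_TRY_LOCK on a ProxyMutex.
struct TryMutex {
  std::atomic<bool> held{false};
  bool try_lock() { return !held.exchange(true); }
  void unlock() { held.store(false); }
};

// Hypothetical stand-in for a NetHandler: a mutex plus work scheduled onto its thread.
struct NetHandler {
  TryMutex mutex;
  std::queue<std::function<void()>> deferred;
};

// Hypothetical stand-in for the ServerVC; do_io() only counts calls.
struct ServerVC {
  TryMutex mutex;
  int io_calls = 0;
  void do_io() { ++io_calls; }
};

enum class IoPath { SameHandler, Locked, Deferred };

// servervc_nh: the handler saved when the ServerVC was opened.
// current_nh:  the handler of the calling thread.
IoPath session_do_io(ServerVC &vc, NetHandler &servervc_nh, NetHandler &current_nh) {
  if (&servervc_nh == &current_nh) {   // 3a: same handler, call directly
    vc.do_io();
    return IoPath::SameHandler;
  }
  if (servervc_nh.mutex.try_lock()) {  // 3b: try to take both locks
    if (vc.mutex.try_lock()) {         // 4a: both held, call directly
      vc.do_io();
      vc.mutex.unlock();
      servervc_nh.mutex.unlock();
      return IoPath::Locked;
    }
    servervc_nh.mutex.unlock();
  }
  // 4b: hand the call over to the owning handler's thread. A real
  // implementation would need a thread-safe queue here.
  servervc_nh.deferred.push([&vc] { vc.do_io(); });
  return IoPath::Deferred;
}
```

In the deferred case the caller gets no VIO back immediately, which is the
main cost of this design: the session would have to deliver the result
later, once the owning thread has run the queued call.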






[jira] [Closed] (TS-4895) CID 1021743: Uninitialized members in iocore/net/UnixNet.cc

2016-09-28 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu closed TS-4895.


> CID 1021743:  Uninitialized members in iocore/net/UnixNet.cc
> 
>
> Key: TS-4895
> URL: https://issues.apache.org/jira/browse/TS-4895
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Network
>Reporter: Leif Hedstrom
>Assignee: Oknet Xu
>  Labels: coverity
> Fix For: 7.1.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> *** CID 1021743:  Uninitialized members  (UNINIT_CTOR)
> /iocore/net/UnixNet.cc: 269 in NetHandler::NetHandler()()
> 263 max_connections_active_in(0),
> 264 inactive_threashold_in(0),
> 265 transaction_no_activity_timeout_in(0),
> 266 keep_alive_no_activity_timeout_in(0)
> 267 {
> 268   SET_HANDLER((NetContHandler)&NetHandler::startNetEvent);
>CID 1021743:  Uninitialized members  (UNINIT_CTOR)
>Non-static class member "default_inactivity_timeout" is not initialized in 
> this constructor nor in any functions that it calls.
> 269 }
> 270 
> 271 int
> 272 update_nethandler_config(const char *name, RecDataT data_type 
> ATS_UNUSED, RecData data, void *cookie)
> 273 {
> 274   NetHandler *nh = static_cast<NetHandler *>(cookie);
> {code}
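
One common way to address this kind of UNINIT_CTOR finding is to give every scalar member a default member initializer, so no constructor can leave one unset. The sketch below is only an illustration: NetHandlerSketch is a hypothetical struct, the member names are copied from the report, and the uint32_t types are an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reduction of the class in the report; types are assumed.
struct NetHandlerSketch {
  uint32_t max_connections_active_in          = 0;
  uint32_t inactive_threashold_in             = 0; // spelling as in the report
  uint32_t transaction_no_activity_timeout_in = 0;
  uint32_t keep_alive_no_activity_timeout_in  = 0;
  uint32_t default_inactivity_timeout         = 0; // the member Coverity flagged
};
```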





[jira] [Resolved] (TS-4895) CID 1021743: Uninitialized members in iocore/net/UnixNet.cc

2016-09-28 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu resolved TS-4895.
--
Resolution: Fixed

> CID 1021743:  Uninitialized members in iocore/net/UnixNet.cc
> 
>
> Key: TS-4895
> URL: https://issues.apache.org/jira/browse/TS-4895
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Network
>Reporter: Leif Hedstrom
>Assignee: Oknet Xu
>  Labels: coverity
> Fix For: 7.1.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> *** CID 1021743:  Uninitialized members  (UNINIT_CTOR)
> /iocore/net/UnixNet.cc: 269 in NetHandler::NetHandler()()
> 263 max_connections_active_in(0),
> 264 inactive_threashold_in(0),
> 265 transaction_no_activity_timeout_in(0),
> 266 keep_alive_no_activity_timeout_in(0)
> 267 {
> 268   SET_HANDLER((NetContHandler)&NetHandler::startNetEvent);
>CID 1021743:  Uninitialized members  (UNINIT_CTOR)
>Non-static class member "default_inactivity_timeout" is not initialized in 
> this constructor nor in any functions that it calls.
> 269 }
> 270 
> 271 int
> 272 update_nethandler_config(const char *name, RecDataT data_type 
> ATS_UNUSED, RecData data, void *cookie)
> 273 {
> 274   NetHandler *nh = static_cast<NetHandler *>(cookie);
> {code}





[jira] [Closed] (TS-4539) The mutex of server_vc is not set while server_session reuse

2016-09-28 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu closed TS-4539.

Resolution: Not A Bug

> The mutex of server_vc is not set while server_session reuse
> 
>
> Key: TS-4539
> URL: https://issues.apache.org/jira/browse/TS-4539
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.1.0
>
>
> NetAccept gets a client_vc and calls new_ProxyMutex() to assign it a mutex.
> The HttpClientSession, HttpSM, HttpServerSession, and server_vc all share 
> that same mutex.
> The HttpServerSession and server_vc are then put into the ServerSessionPool 
> and may be assigned to the next new client_vc.
> HttpSM::attach_server_session() only sets the mutex of the HttpServerSession 
> to the mutex of HttpSM after getting an HttpServerSession from the 
> ServerSessionPool.
> It forgets to set the mutex of server_vc to the mutex of HttpSM.
>  
> {code}
> void
> HttpSM::attach_server_session(HttpServerSession *s)
> {
>   hsm_release_assert(server_session == NULL);
>   hsm_release_assert(server_entry == NULL);
>   hsm_release_assert(s->state == HSS_ACTIVE);
>   server_session = s; 
>   server_session->transact_count++;
>   // Set the mutex so that we have something to update
>   //   stats with
>   server_session->mutex = this->mutex;
> {code}
> But I cannot find any resulting issue. Is this by design?
> Or is the problem just hard to locate, given my limited knowledge of HttpSM?





[jira] [Commented] (TS-4539) The mutex of server_vc is not set while server_session reuse

2016-09-28 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15528725#comment-15528725
 ] 

Oknet Xu commented on TS-4539:
--

This is not a bug, but I found another bug in ServerSessionPool and 
HttpSessionManager.

A VC cannot be migrated to another EThread while it is allocated.
It is managed by the NetHandler running in the same EThread.
The NetHandler owns the VC, and the VC is only freed by the NetHandler.

InactivityCop is effectively part of NetHandler because they share the same 
mutex; therefore, like NetHandler, it can free the VC.

> The mutex of server_vc is not set while server_session reuse
> 
>
> Key: TS-4539
> URL: https://issues.apache.org/jira/browse/TS-4539
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.1.0
>
>
> NetAccept gets a client_vc and calls new_ProxyMutex() to assign it a mutex.
> The HttpClientSession, HttpSM, HttpServerSession, and server_vc all share 
> that same mutex.
> The HttpServerSession and server_vc are then put into the ServerSessionPool 
> and may be assigned to the next new client_vc.
> HttpSM::attach_server_session() only sets the mutex of the HttpServerSession 
> to the mutex of HttpSM after getting an HttpServerSession from the 
> ServerSessionPool.
> It forgets to set the mutex of server_vc to the mutex of HttpSM.
>  
> {code}
> void
> HttpSM::attach_server_session(HttpServerSession *s)
> {
>   hsm_release_assert(server_session == NULL);
>   hsm_release_assert(server_entry == NULL);
>   hsm_release_assert(s->state == HSS_ACTIVE);
>   server_session = s; 
>   server_session->transact_count++;
>   // Set the mutex so that we have something to update
>   //   stats with
>   server_session->mutex = this->mutex;
> {code}
> But I cannot find any resulting issue. Is this by design?
> Or is the problem just hard to locate, given my limited knowledge of HttpSM?





[jira] [Updated] (TS-4612) Proposal: InactivityCop Optimize

2016-09-23 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4612:
-
Issue Type: Improvement  (was: Bug)

> Proposal: InactivityCop Optimize
> 
>
> Key: TS-4612
> URL: https://issues.apache.org/jira/browse/TS-4612
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Core, Network
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.1.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Reviewing the processing in InactivityCop::check_inactivity():
> 1. get all local VCs from open_list
> 2. put them into cop_list
> 3. check every VC in cop_list to see whether it has already timed out
> 4. call back vc->handleEvent to close the VC if it has timed out
> InactivityCop and NetHandler share one mutex.
> InactivityCop runs every second and NetHandler runs every 10 ms, which means 
> NetHandler runs 100 times before the next InactivityCop run.
> If a VC has a read/write during a NetHandler call, it won't time out in the 
> next InactivityCop run.
> Thus, if a VC has a read/write in NetHandler, we move it out of cop_list so 
> that the InactivityCop run performs better.
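
The proposal above can be sketched as follows. This is only an illustration of the idea: VCStub and InactivityCopSketch are hypothetical types, timestamps are plain integers in milliseconds, and the real open_list and cop_list are intrusive lists rather than hash sets.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a NetVC with an absolute inactivity deadline.
struct VCStub {
  int id;
  int64_t timeout_at; // milliseconds
};

class InactivityCopSketch
{
public:
  // A newly opened VC is tracked and checked on the next pass.
  void
  add(VCStub *vc)
  {
    open_list_.insert(vc);
    cop_list_.insert(vc);
  }

  // Called from the NetHandler side on read/write: push the deadline out and
  // drop the VC from this pass, since it cannot time out before the next one.
  void
  on_io(VCStub *vc, int64_t now, int64_t timeout)
  {
    vc->timeout_at = now + timeout;
    cop_list_.erase(vc);
  }

  // One cop pass: scan only cop_list, close expired VCs, then refill cop_list
  // from open_list for the following pass. Returns the ids that were closed.
  std::vector<int>
  check_inactivity(int64_t now)
  {
    std::vector<VCStub *> expired;
    for (VCStub *vc : cop_list_) {
      if (vc->timeout_at <= now) {
        expired.push_back(vc);
      }
    }
    std::vector<int> closed_ids;
    for (VCStub *vc : expired) {
      open_list_.erase(vc);
      closed_ids.push_back(vc->id);
    }
    cop_list_ = open_list_;
    return closed_ids;
  }

private:
  std::unordered_set<VCStub *> open_list_;
  std::unordered_set<VCStub *> cop_list_;
};
```

The saving comes from the scan in check_inactivity() touching only the VCs that saw no I/O since the previous pass.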





[jira] [Resolved] (TS-4612) Proposal: InactivityCop Optimize

2016-09-23 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu resolved TS-4612.
--
   Resolution: Fixed
 Assignee: Oknet Xu
Fix Version/s: (was: sometime)
   7.1.0

> Proposal: InactivityCop Optimize
> 
>
> Key: TS-4612
> URL: https://issues.apache.org/jira/browse/TS-4612
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core, Network
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.1.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Reviewing the processing in InactivityCop::check_inactivity():
> 1. get all local VCs from open_list
> 2. put them into cop_list
> 3. check every VC in cop_list to see whether it has already timed out
> 4. call back vc->handleEvent to close the VC if it has timed out
> InactivityCop and NetHandler share one mutex.
> InactivityCop runs every second and NetHandler runs every 10 ms, which means 
> NetHandler runs 100 times before the next InactivityCop run.
> If a VC has a read/write during a NetHandler call, it won't time out in the 
> next InactivityCop run.
> Thus, if a VC has a read/write in NetHandler, we move it out of cop_list so 
> that the InactivityCop run performs better.





[jira] [Resolved] (TS-4879) NetVC leaks while hyper emergency occur on check_emergency_throttle()

2016-09-23 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu resolved TS-4879.
--
   Resolution: Fixed
Fix Version/s: 7.1.0

> NetVC leaks while hyper emergency occur on check_emergency_throttle()
> -
>
> Key: TS-4879
> URL: https://issues.apache.org/jira/browse/TS-4879
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.1.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> The con could be closed if a hyper emergency occurs in 
> check_emergency_throttle().
> But we do not check con.fd after returning from 
> check_emergency_throttle().
> In a hyper emergency:
> - The socket fd is removed from epoll when it is closed.
> - A NetVC with a closed socket fd is created, and NET_EVENT_OPEN is called 
> back to the SM.
> Thus:
> - The NetVC will never be triggered by NetHandler.
> - Only InactivityCop can handle the NetVC, and the default timeout value is 
> 86400 secs.
> For the counter net_connections_currently_open_stat:
> - It is increased in "connect_re_internal()".
> - It isn't decreased when con.fd is set to NO_FD due to a hyper emergency, 
> because it is decreased in close_UnixNetVConnection() only if con.fd != NO_FD 
> (TS-4178).
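
A minimal sketch of the missing check described above, under stated assumptions: ConnectionStub, NetStatsStub, check_emergency_throttle_stub and finish_connect are hypothetical names, and the stub throttle is reduced to a single fd comparison rather than the real throttle logic.

```cpp
#include <cassert>

constexpr int NO_FD = -1;

// Hypothetical stand-in for the connection object ("con").
struct ConnectionStub {
  int fd = NO_FD;
  void close() { fd = NO_FD; }
};

// Hypothetical stand-in for check_emergency_throttle(): in a hyper emergency
// it closes the connection, which leaves con.fd == NO_FD.
inline bool
check_emergency_throttle_stub(ConnectionStub &con, int hyper_emergency_fd)
{
  if (con.fd > hyper_emergency_fd) {
    con.close();
    return true;
  }
  return false;
}

// Hypothetical counter standing in for net_connections_currently_open_stat.
struct NetStatsStub {
  int connections_currently_open = 0;
};

// Returns false when the socket was closed by the throttle, so the caller
// does not create a NetVC on a dead fd, and keeps the counter balanced.
inline bool
finish_connect(ConnectionStub &con, int hyper_emergency_fd, NetStatsStub &stats)
{
  ++stats.connections_currently_open;
  check_emergency_throttle_stub(con, hyper_emergency_fd);
  if (con.fd == NO_FD) { // the check the report says is missing
    --stats.connections_currently_open;
    return false;
  }
  return true;
}
```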





[jira] [Resolved] (TS-4705) Proposal: NetVC Context

2016-09-23 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu resolved TS-4705.
--
Resolution: Fixed

> Proposal: NetVC Context
> ---
>
> Key: TS-4705
> URL: https://issues.apache.org/jira/browse/TS-4705
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.1.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Goal 1:
> NetVConnection has the get_local_addr() and get_remote_addr() methods.
> It also has the members local_addr, remote_addr, and netvc->con.addr.
> Thus, we should use netvc->con.addr or remote_addr to replace the member 
> server_addr in UnixNetVConnection.
> Goal 2:
> SSLNetVConnection has the member sslClientConnection, with two methods, 
> setSSLClientConnection() and getSSLClientConnection(), to indicate whether 
> ATS is a client or a server in an SSL session.
> To abstract the two goals above, I designed the netvc context function.
> A proxy has two sides: the client side ( Client <-> Proxy ) and the server 
> side ( Proxy <-> Server ). The netvc context function indicates which 
> side the NetVC is working on.
> Goal 3:
> Fix a minor bug in NetAccept::do_blocking_accept: call 
> check_emergency_throttle(con) first, then allocate the vc.
> Goal 4:
> Optimize NetAccept, remove duplicate code, etc.
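
The "netvc context" idea from Goal 2 might look roughly like this. It is a sketch only: NetVCContext, NetVCStub, acts_as_tls_client and side_name are names invented here for illustration and are not taken from the actual change.

```cpp
#include <cassert>

// Which side of the proxy a NetVC works on (hypothetical enum).
enum class NetVCContext {
  Unset, // not yet decided
  In,    // client side: Client <-> Proxy, the proxy accepted the connection
  Out    // server side: Proxy <-> Server, the proxy opened the connection
};

// Hypothetical stand-in for a NetVC carrying its context.
struct NetVCStub {
  NetVCContext context = NetVCContext::Unset;
};

// On an outbound connection the proxy is the TLS client, so one context
// field can replace a separate "is SSL client connection" flag.
inline bool
acts_as_tls_client(const NetVCStub &vc)
{
  return vc.context == NetVCContext::Out;
}

inline const char *
side_name(const NetVCStub &vc)
{
  switch (vc.context) {
  case NetVCContext::In:
    return "client side";
  case NetVCContext::Out:
    return "server side";
  default:
    return "unset";
  }
}
```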





[jira] [Work started] (TS-4705) Proposal: NetVC Context

2016-09-23 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on TS-4705 started by Oknet Xu.

> Proposal: NetVC Context
> ---
>
> Key: TS-4705
> URL: https://issues.apache.org/jira/browse/TS-4705
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.1.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Goal 1:
> NetVConnection has the get_local_addr() and get_remote_addr() methods.
> It also has the members local_addr, remote_addr, and netvc->con.addr.
> Thus, we should use netvc->con.addr or remote_addr to replace the member 
> server_addr in UnixNetVConnection.
> Goal 2:
> SSLNetVConnection has the member sslClientConnection, with two methods, 
> setSSLClientConnection() and getSSLClientConnection(), to indicate whether 
> ATS is a client or a server in an SSL session.
> To abstract the two goals above, I designed the netvc context function.
> A proxy has two sides: the client side ( Client <-> Proxy ) and the server 
> side ( Proxy <-> Server ). The netvc context function indicates which 
> side the NetVC is working on.
> Goal 3:
> Fix a minor bug in NetAccept::do_blocking_accept: call 
> check_emergency_throttle(con) first, then allocate the vc.
> Goal 4:
> Optimize NetAccept, remove duplicate code, etc.





[jira] [Work started] (TS-4879) NetVC leaks while hyper emergency occur on check_emergency_throttle()

2016-09-23 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on TS-4879 started by Oknet Xu.

> NetVC leaks while hyper emergency occur on check_emergency_throttle()
> -
>
> Key: TS-4879
> URL: https://issues.apache.org/jira/browse/TS-4879
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> The con could be closed if a hyper emergency occurs in 
> check_emergency_throttle().
> But we do not check con.fd after returning from 
> check_emergency_throttle().
> In a hyper emergency:
> - The socket fd is removed from epoll when it is closed.
> - A NetVC with a closed socket fd is created, and NET_EVENT_OPEN is called 
> back to the SM.
> Thus:
> - The NetVC will never be triggered by NetHandler.
> - Only InactivityCop can handle the NetVC, and the default timeout value is 
> 86400 secs.
> For the counter net_connections_currently_open_stat:
> - It is increased in "connect_re_internal()".
> - It isn't decreased when con.fd is set to NO_FD due to a hyper emergency, 
> because it is decreased in close_UnixNetVConnection() only if con.fd != NO_FD 
> (TS-4178).





[jira] [Updated] (TS-4886) ATS HostDB Crash

2016-09-23 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4886:
-
Affects Version/s: 6.2.0

> ATS HostDB Crash
> 
>
> Key: TS-4886
> URL: https://issues.apache.org/jira/browse/TS-4886
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HostDB
>Affects Versions: 6.2.0
>Reporter: song
>
> HostDB crashes every week.
> FATAL: MultiCache.cc:1073: failed assert `0`
> traffic_server: using root directory '/opt/fusion/cdn/trafficserver'
> traffic_server: Aborted (Signal sent by tkill() 31504 33)
> traffic_server - STACK TRACE:
> /opt/fusion/cdn/trafficserver/bin/traffic_server(crash_logger_invoke(int, 
> siginfo_t*, void*)+0xc3)[0x50963a]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x2b4e38494330]
> /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x2b4e390fcc37]
> /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x2b4e39100028]
> /opt/fusion/cdn/trafficserver/lib/libtsutil.so.6(ink_fatal_va(char const*, 
> __va_list_tag*)+0x0)[0x2b4e37433a04]
> /opt/fusion/cdn/trafficserver/lib/libtsutil.so.6(ink_fatal(char const*, 
> ...)+0x0)[0x2b4e37433acf]
> /opt/fusion/cdn/trafficserver/lib/libtsutil.so.6(ink_pfatal(char const*, 
> ...)+0x0)[0x2b4e37433b6e]
> /opt/fusion/cdn/trafficserver/lib/libtsutil.so.6(ats_base64_encode(unsigned 
> char const*, unsigned long, char*, unsigned long, unsigned 
> long*)+0x0)[0x2b4e374317e0]
> /opt/fusion/cdn/trafficserver/bin/traffic_server(MultiCacheBase::fixup_heap_offsets(int,
>  int, UnsunkPtrRegistry*, int)+0x16f)[0x6e0ab3]
> /opt/fusion/cdn/trafficserver/bin/traffic_server(MultiCacheSync::mcEvent(int, 
> Event*)+0x108)[0x6e2408]
> /opt/fusion/cdn/trafficserver/bin/traffic_server(Continuation::handleEvent(int,
>  void*)+0x72)[0x50c5f8]
> /opt/fusion/cdn/trafficserver/bin/traffic_server(EThread::process_event(Event*,
>  int)+0x136)[0x7aa646]
> /opt/fusion/cdn/trafficserver/bin/traffic_server(EThread::execute()+0xdc)[0x7aa8a6]
> /opt/fusion/cdn/trafficserver/bin/traffic_server[0x7a9c27]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x2b4e3848c184]
> /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2b4e391c037d]





[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS

2016-09-22 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513834#comment-15513834
 ] 

Oknet Xu commented on TS-4468:
--

For HTTP session reuse under my suggestion:

release/acquire the server session based on server_request->get_host()


if proxy.config.url_remap.pristine_host_hdr is false:

- t_state.current.server->name == server_request->get_host() == 
"origin.example.com"

- no difference

if proxy.config.url_remap.pristine_host_hdr is true:

- t_state.current.server->name == "origin.example.com"

- server_request->get_host() == "example.com" or "www.example.com"

- reduces the value of server session reuse (but without any negative effects)


With the option enabled:

- The results of match=ip and match=ip+FQDN are almost the same.
- "match=ip" already meets our requirements, because the FQDN resolves to 
multiple IPs and the content on each IP is the same.
- The result of match=ip+Host is more accurate, but matches less often, than 
the result of match=ip+FQDN.


For HTTP session reuse:

- match=ip is enough <==> match = IP
- match=FQDN is acceptable and improves reuse when a FQDN has multiple IPs 
<==> match = HOST
- match=ip+FQDN is almost the same as match=ip <==> match = BOTH
- match=Host is acceptable and improves reuse, but less than FQDN
- match=ip+Host is acceptable but reduces the value of reuse


For HTTPS session reuse:

- match=ip is unacceptable, against RFC 6066
- match=FQDN is unacceptable, against RFC 6066
- match=ip+FQDN is unacceptable, against RFC 6066
- match=Host(SNI) is acceptable and improves reuse
- match=ip+Host(SNI) is required <==> match = IP
- match=FQDN+Host(SNI) is acceptable and no different from ip+Host <==> 
match = HOST
- match=ip+FQDN+Host(SNI) is acceptable and no different from ip+Host <==> 
match = BOTH


Your patch implements the additional SNI match for SSLNetVC.

Based on the analysis above, to get the maximum value from reuse:

- to reuse a server session connected to a parent proxy, we prefer match=ip
- to reuse a server session that reverse proxies to an HTTP origin server, we 
prefer match=ip
- to reuse a server session that reverse proxies to an HTTPS origin server, we 
prefer match=ip+sni (with the patch)
- to reuse a server session that forward proxies to an HTTP origin server, we 
prefer match=host
- to reuse a server session that forward proxies to an HTTPS origin server, we 
prefer match=host+sni (with the patch)

Currently, the ATS default setting is match=both, which is a middle-ground 
solution (not bad, but not the best).

Thanks for your explanation; I now fully understand the reuse logic.

However, I will reserve my opinion about match=IP+FQDN <==> match=BOTH.
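
To make the matching rules discussed in this comment concrete, here is a minimal sketch of a reuse predicate with an additional SNI comparison. It is an illustration only: SharingMatch, SessionKey and can_reuse are hypothetical names, not the ServerSessionPool code, and plain HTTP sessions are modelled with an empty SNI.

```cpp
#include <cassert>
#include <string>

enum class SharingMatch { None, Ip, Host, Both };

// Hypothetical description of a pooled or requested server session.
struct SessionKey {
  std::string addr; // "ip:port" of the origin
  std::string fqdn; // origin FQDN after remapping
  std::string sni;  // SNI sent in the TLS handshake, empty for plain HTTP
};

inline bool
can_reuse(const SessionKey &pooled, const SessionKey &wanted, SharingMatch match)
{
  const bool ip_ok   = pooled.addr == wanted.addr;
  const bool host_ok = pooled.fqdn == wanted.fqdn;

  bool base_ok = false;
  switch (match) {
  case SharingMatch::None:
    base_ok = false;
    break;
  case SharingMatch::Ip:
    base_ok = ip_ok;
    break;
  case SharingMatch::Host:
    base_ok = host_ok;
    break;
  case SharingMatch::Both:
    base_ok = ip_ok && host_ok;
    break;
  }
  // Extra SNI comparison: a TLS session is only reused when the SNI it was
  // opened with equals the SNI the new request would send.
  return base_ok && pooled.sni == wanted.sni;
}
```

With the two remap rules from the issue description, a session opened with SNI "example.com" is not handed to a request for "www.example.com", even though both map to the same origin address and FQDN.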

> http.server_session_sharing.match = both unsafe with HTTPS
> --
>
> Key: TS-4468
> URL: https://issues.apache.org/jira/browse/TS-4468
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP, SSL
>Affects Versions: 6.1.1
>Reporter: Jered Floyd
>Assignee: Susan Hinrichs
> Fix For: 7.0.0
>
> Attachments: TS-4468.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> proxy.config.http.server_session_sharing.match has a default value of "both", 
> which compares IP address, port, and FQDN when determining whether a 
> connection can be reused for further user agent requests.
> The "host" (FQDN) matching does not behave safely when ATS is operating as a 
> reverse proxy.  The compared value is the origin server FQDN after mapping, 
> rather than the initial "Host" target.
> If multiple Hosts map to the same origin server and the scheme is HTTPS, ATS 
> will attempt to reuse a connection that may have an SNI Host that does not 
> match the HTTP Host.  With Apache 2.4 origin servers this results in 400 Bad 
> Request to the user agent.
> PROBLEM REPRODUCTION:
> You can observe this behavior with two mapping rules such as:
> map https://example.com/ https://origin.example.com/
> map https://www.example.com/ https://origin.example.com/
> Non-caching clients alternately fetching URIs from the two targets will see 
> 400 Bad Request responses intermittently.
> WORKAROUND:
> proxy.config.http.server_session_sharing.match should have a default value of 
> "none" when proxy.config.reverse_proxy.enabled is "1"
> SUGGESTED FIXES:
> In order of completeness:
> 1) Do not share server sessions on reverse_proxy requests.
> 2) Do not share server sessions on reverse_proxy requests where scheme is 
> HTTPS.
> 3) Compare target host (SNI host) rather than replacement host when 
> determining if reuse of server session is allowed (when 
> server_session_sharing.match is set to "host" or "both").



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS

2016-09-22 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513474#comment-15513474
 ] 

Oknet Xu commented on TS-4468:
--

Could you please test the issue with the option disabled?

> http.server_session_sharing.match = both unsafe with HTTPS
> --
>
> Key: TS-4468
> URL: https://issues.apache.org/jira/browse/TS-4468
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP, SSL
>Affects Versions: 6.1.1
>Reporter: Jered Floyd
>Assignee: Susan Hinrichs
> Fix For: 7.0.0
>
> Attachments: TS-4468.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> proxy.config.http.server_session_sharing.match has a default value of "both", 
> which compares IP address, port, and FQDN when determining whether a 
> connection can be reused for further user agent requests.
> The "host" (FQDN) matching does not behave safely when ATS is operating as a 
> reverse proxy.  The compared value is the origin server FQDN after mapping, 
> rather than the initial "Host" target.
> If multiple Hosts map to the same origin server and the scheme is HTTPS, ATS 
> will attempt to reuse a connection that may have an SNI Host that does not 
> match the HTTP Host.  With Apache 2.4 origin servers this results in 400 Bad 
> Request to the user agent.
> PROBLEM REPRODUCTION:
> You can observe this behavior with two mapping rules such as:
> map https://example.com/ https://origin.example.com/
> map https://www.example.com/ https://origin.example.com/
> Non-caching clients alternately fetching URIs from the two targets will see 
> 400 Bad Request responses intermittently.
> WORKAROUND:
> proxy.config.http.server_session_sharing.match should have a default value of 
> "none" when proxy.config.reverse_proxy.enabled is "1"
> SUGGESTED FIXES:
> In order of completeness:
> 1) Do not share server sessions on reverse_proxy requests.
> 2) Do not share server sessions on reverse_proxy requests where scheme is 
> HTTPS.
> 3) Compare target host (SNI host) rather than replacement host when 
> determining if reuse of server session is allowed (when 
> server_session_sharing.match is set to "host" or "both").





[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS

2016-09-22 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513105#comment-15513105
 ] 

Oknet Xu commented on TS-4468:
--

From reviewing the code, the key point is 
"proxy.config.url_remap.pristine_host_hdr".

if proxy.config.url_remap.pristine_host_hdr is false:

  t_state.current.server->name == server_request->get_host() == 
"origin.example.com"

if proxy.config.url_remap.pristine_host_hdr is true:

  t_state.current.server->name == "origin.example.com"

  server_request->get_host() == "example.com"  or  "www.example.com"


ATS always:

- sets SNI from server_request->get_host()
- releases/acquires the server session using t_state.current.server->name as 
the hostname

My suggestion is:

- release/acquire the server session based on server_request->get_host()
- no need to check SNI in ServerSessionManager.
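
The effect of pristine_host_hdr described above can be modelled in a few lines. This is a sketch under stated assumptions: OutboundHosts and hosts_after_remap are hypothetical, and the two fields only model t_state.current.server->name and server_request->get_host() as plain strings.

```cpp
#include <cassert>
#include <string>

// Hypothetical pair of host values seen on the outbound side.
struct OutboundHosts {
  std::string server_name;  // models t_state.current.server->name
  std::string request_host; // models server_request->get_host(), used for SNI
};

// With pristine_host_hdr off, the request host is rewritten to the remap
// target; with it on, the client's original Host value is preserved.
inline OutboundHosts
hosts_after_remap(const std::string &client_host, const std::string &remap_target, bool pristine_host_hdr)
{
  return OutboundHosts{remap_target, pristine_host_hdr ? client_host : remap_target};
}
```

With the option on, "example.com" and "www.example.com" produce the same server_name but different request_host values, which is why keying the session pool on server_name can pair a request with a session opened under a different SNI.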


> http.server_session_sharing.match = both unsafe with HTTPS
> --
>
> Key: TS-4468
> URL: https://issues.apache.org/jira/browse/TS-4468
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP, SSL
>Affects Versions: 6.1.1
>Reporter: Jered Floyd
>Assignee: Susan Hinrichs
> Fix For: 7.0.0
>
> Attachments: TS-4468.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> proxy.config.http.server_session_sharing.match has a default value of "both", 
> which compares IP address, port, and FQDN when determining whether a 
> connection can be reused for further user agent requests.
> The "host" (FQDN) matching does not behave safely when ATS is operating as a 
> reverse proxy.  The compared value is the origin server FQDN after mapping, 
> rather than the initial "Host" target.
> If multiple Hosts map to the same origin server and the scheme is HTTPS, ATS 
> will attempt to reuse a connection that may have an SNI Host that does not 
> match the HTTP Host.  With Apache 2.4 origin servers this results in 400 Bad 
> Request to the user agent.
> PROBLEM REPRODUCTION:
> You can observe this behavior with two mapping rules such as:
> map https://example.com/ https://origin.example.com/
> map https://www.example.com/ https://origin.example.com/
> Non-caching clients alternately fetching URIs from the two targets will see 
> 400 Bad Request responses intermittently.
> WORKAROUND:
> proxy.config.http.server_session_sharing.match should have a default value of 
> "none" when proxy.config.reverse_proxy.enabled is "1"
> SUGGESTED FIXES:
> In order of completeness:
> 1) Do not share server sessions on reverse_proxy requests.
> 2) Do not share server sessions on reverse_proxy requests where scheme is 
> HTTPS.
> 3) Compare target host (SNI host) rather than replacement host when 
> determining if reuse of server session is allowed (when 
> server_session_sharing.match is set to "host" or "both").





[jira] [Commented] (TS-4886) ATS HostDB Crash

2016-09-22 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513065#comment-15513065
 ] 

Oknet Xu commented on TS-4886:
--

[~jacksontj] Could you please have a look?

> ATS HostDB Crash
> 
>
> Key: TS-4886
> URL: https://issues.apache.org/jira/browse/TS-4886
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HostDB
>Reporter: song
>
> HostDB crashes every week.
> FATAL: MultiCache.cc:1073: failed assert `0`
> traffic_server: using root directory '/opt/fusion/cdn/trafficserver'
> traffic_server: Aborted (Signal sent by tkill() 31504 33)
> traffic_server - STACK TRACE:
> /opt/fusion/cdn/trafficserver/bin/traffic_server(crash_logger_invoke(int, 
> siginfo_t*, void*)+0xc3)[0x50963a]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x2b4e38494330]
> /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x2b4e390fcc37]
> /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x2b4e39100028]
> /opt/fusion/cdn/trafficserver/lib/libtsutil.so.6(ink_fatal_va(char const*, 
> __va_list_tag*)+0x0)[0x2b4e37433a04]
> /opt/fusion/cdn/trafficserver/lib/libtsutil.so.6(ink_fatal(char const*, 
> ...)+0x0)[0x2b4e37433acf]
> /opt/fusion/cdn/trafficserver/lib/libtsutil.so.6(ink_pfatal(char const*, 
> ...)+0x0)[0x2b4e37433b6e]
> /opt/fusion/cdn/trafficserver/lib/libtsutil.so.6(ats_base64_encode(unsigned 
> char const*, unsigned long, char*, unsigned long, unsigned 
> long*)+0x0)[0x2b4e374317e0]
> /opt/fusion/cdn/trafficserver/bin/traffic_server(MultiCacheBase::fixup_heap_offsets(int,
>  int, UnsunkPtrRegistry*, int)+0x16f)[0x6e0ab3]
> /opt/fusion/cdn/trafficserver/bin/traffic_server(MultiCacheSync::mcEvent(int, 
> Event*)+0x108)[0x6e2408]
> /opt/fusion/cdn/trafficserver/bin/traffic_server(Continuation::handleEvent(int,
>  void*)+0x72)[0x50c5f8]
> /opt/fusion/cdn/trafficserver/bin/traffic_server(EThread::process_event(Event*,
>  int)+0x136)[0x7aa646]
> /opt/fusion/cdn/trafficserver/bin/traffic_server(EThread::execute()+0xdc)[0x7aa8a6]
> /opt/fusion/cdn/trafficserver/bin/traffic_server[0x7a9c27]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x2b4e3848c184]
> /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2b4e391c037d]





[jira] [Created] (TS-4885) Incorrect checking of fds_throttle and fds_limit

2016-09-22 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4885:


 Summary: Incorrect checking of fds_throttle and fds_limit
 Key: TS-4885
 URL: https://issues.apache.org/jira/browse/TS-4885
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Oknet Xu


{code}
static void
check_fd_limit()
{
  int fds_throttle = -1;
  REC_ReadConfigInteger(fds_throttle, "proxy.config.net.connections_throttle");
  if (fds_throttle > fds_limit + THROTTLE_FD_HEADROOM) {  // ---> Incorrect
    int new_fds_throttle = fds_limit - THROTTLE_FD_HEADROOM;
    if (new_fds_throttle < 1) {
      ink_abort("too few file descriptors (%d) available", fds_limit);
    }
    char msg[256];
    snprintf(msg, sizeof(msg), "connection throttle too high, "
                               "%d (throttle) + %d (internal use) > %d (file descriptor limit), "
                               "using throttle of %d",
             fds_throttle, THROTTLE_FD_HEADROOM, fds_limit, new_fds_throttle);
    SignalWarning(MGMT_SIGNAL_SYSTEM_ERROR, msg);
  }
}
{code}


{code}
static void
adjust_sys_settings(void)
{
...
  REC_ReadConfigInteger(fds_throttle, "proxy.config.net.connections_throttle");

  if (getrlimit(RLIMIT_NOFILE, &lim) == 0) {
    if (fds_throttle > (int)(lim.rlim_cur + THROTTLE_FD_HEADROOM)) {  // --> Incorrect
      lim.rlim_cur = (lim.rlim_max = (rlim_t)fds_throttle);
      if (setrlimit(RLIMIT_NOFILE, &lim) == 0 && getrlimit(RLIMIT_NOFILE, &lim) == 0) {
        fds_limit = (int)lim.rlim_cur;
        syslog(LOG_NOTICE, "NOTE: RLIMIT_NOFILE(%d):cur(%d),max(%d)", RLIMIT_NOFILE, (int)lim.rlim_cur, (int)lim.rlim_max);
      }
    }
  }
...
}
{code}
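
Judging by the warning text ("%d (throttle) + %d (internal use) > %d (file descriptor limit)"), the intended comparison is presumably throttle plus headroom against the limit. The sketch below contrasts the two forms; reported_check and headroom_check are hypothetical helper names, and the headroom value used in the example is an assumption, not taken from the source.

```cpp
#include <cassert>

// Comparison as it appears in the report: the headroom is added to the
// limit, so a throttle that already eats into the headroom is not caught.
inline bool
reported_check(int fds_throttle, int fds_limit, int headroom)
{
  return fds_throttle > fds_limit + headroom;
}

// Comparison suggested by the warning message: the throttle plus the
// descriptors reserved for internal use must fit under the limit.
inline bool
headroom_check(int fds_throttle, int fds_limit, int headroom)
{
  return fds_throttle + headroom > fds_limit;
}
```

For example, with a limit of 1000 descriptors, an assumed headroom of 192 and a throttle of 900, the reported form stays silent (900 > 1192 is false) while the second form fires (1092 > 1000 is true).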





[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS

2016-09-22 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15512361#comment-15512361
 ] 

Oknet Xu commented on TS-4468:
--

[~jered] Do you have proxy.config.url_remap.pristine_host_hdr enabled?

> http.server_session_sharing.match = both unsafe with HTTPS
> --
>
> Key: TS-4468
> URL: https://issues.apache.org/jira/browse/TS-4468
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP, SSL
>Affects Versions: 6.1.1
>Reporter: Jered Floyd
>Assignee: Susan Hinrichs
> Fix For: 7.0.0
>
> Attachments: TS-4468.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> proxy.config.http.server_session_sharing.match has a default value of "both", 
> which compares IP address, port, and FQDN when determining whether a 
> connection can be reused for further user agent requests.
> The "host" (FQDN) matching does not behave safely when ATS is operating as a 
> reverse proxy.  The compared value is the origin server FQDN after mapping, 
> rather than the initial "Host" target.
> If multiple Hosts map to the same origin server and the scheme is HTTPS, ATS 
> will attempt to reuse a connection that may have an SNI Host that does not 
> match the HTTP Host.  With Apache 2.4 origin servers this results in 400 Bad 
> Request to the user agent.
> PROBLEM REPRODUCTION:
> You can observe this behavior with two mapping rules such as:
> map https://example.com/ https://origin.example.com/
> map https://www.example.com/ https://origin.example.com/
> Non-caching clients alternately fetching URIs from the two targets will see 
> 400 Bad Request responses intermittently.
> WORKAROUND:
> proxy.config.http.server_session_sharing.match should have a default value of 
> "none" when proxy.config.reverse_proxy.enabled is "1"
> SUGGESTED FIXES:
> In order of completeness:
> 1) Do not share server sessions on reverse_proxy requests.
> 2) Do not share server sessions on reverse_proxy requests where scheme is 
> HTTPS.
> 3) Compare target host (SNI host) rather than replacement host when 
> determining if reuse of server session is allowed (when 
> server_session_sharing.match is set to "host" or "both").
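
The difference between the reported behaviour and suggested fix 3 can be sketched as follows. This is a simplified model, not the ATS session manager: the types, field names and the `compare_sni` switch are hypothetical, and it assumes each pooled session remembers the name it sent as SNI when it was opened.

```cpp
#include <cassert>
#include <string>

// Stand-ins for the values of server_session_sharing.match.
enum class MatchMode { None, Ip, Host, Both };

// Describes either a pooled session or the needs of a new request.
struct Endpoint {
  std::string addr;        // origin "ip:port"
  std::string mapped_host; // origin FQDN after remap
  std::string sni_host;    // name sent (or to be sent) as SNI
};

// compare_sni == false models the reported behaviour (post-remap FQDN);
// compare_sni == true models suggested fix 3 (target/SNI host).
bool session_matches(MatchMode mode, const Endpoint &session, const Endpoint &request, bool compare_sni)
{
  const bool ip_ok   = session.addr == request.addr;
  const bool host_ok = compare_sni ? session.sni_host == request.sni_host
                                   : session.mapped_host == request.mapped_host;
  switch (mode) {
  case MatchMode::None:
    return false;
  case MatchMode::Ip:
    return ip_ok;
  case MatchMode::Host:
    return host_ok;
  case MatchMode::Both:
    return ip_ok && host_ok;
  }
  return false;
}

// The two remap rules from the reproduction: both hosts share one origin.
void example_usage()
{
  const Endpoint pooled{"192.0.2.10:443", "origin.example.com", "example.com"};
  const Endpoint wanted{"192.0.2.10:443", "origin.example.com", "www.example.com"};

  // Comparing the mapped FQDN reuses the session despite the SNI mismatch.
  assert(session_matches(MatchMode::Both, pooled, wanted, false));
  // Comparing the SNI host refuses the reuse.
  assert(!session_matches(MatchMode::Both, pooled, wanted, true));
}
```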





[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS

2016-09-21 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15510070#comment-15510070
 ] 

Oknet Xu commented on TS-4468:
--

{code}
One thing that's not clear is in what situations `t_state.current.server->name` 
is not the same as `server_request.get_host`.
{code}

According to the code, they are kept in sync.

If they are ever out of sync, that is a bug.

Therefore, we should revert the commit and fix that bug instead.



[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS

2016-09-21 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15510077#comment-15510077
 ] 

Oknet Xu commented on TS-4468:
--

{code}
At most, if we decided we needed to be stringent in enforcing SNI/host matching 
on client side, I would want the ability to opt out. 
{code}

I agree with that: there should be an option to control it. [~jered]



[jira] [Comment Edited] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS

2016-09-21 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509978#comment-15509978
 ] 

Oknet Xu edited comment on TS-4468 at 9/21/16 2:01 PM:
---

RFC 6066 conflicts with RFC 7540:

{code}
RFC 6066

If the server_name is
   established in the TLS session handshake, the client SHOULD NOT
   attempt to request a different server name at the application layer.

{code}


{code}
RFC 7540

9.1.1 Connection Reuse

Connections that are made to an origin server, either directly or through a 
tunnel created using the CONNECT method (Section 8.3), MAY be reused for 
requests with multiple different URI authority components. A connection can be 
reused as long as the origin server is authoritative (Section 10.1). For TCP 
connections without TLS, this depends on the host having resolved to the same 
IP address.

For https resources, connection reuse additionally depends on having a 
certificate that is valid for the host in the URI. The certificate presented by 
the server MUST satisfy any checks that the client would perform when forming a 
new TLS connection for the host in the URI.

An origin server might offer a certificate with multiple subjectAltName 
attributes or names with wildcards, one of which is valid for the authority in 
the URI. For example, a certificate with a subjectAltName of *.example.com 
might permit the use of the same connection for requests to URIs starting with 
https://a.example.com/ and https://b.example.com/.

{code}


RFC 7540 (HTTP/2) allows connection reuse as long as the certificate is valid 
for the host in the URI.
RFC 6066, however, only allows connection reuse when the SNI is the same as the 
host in the URI.

According to a survey of Firefox, Chrome, Edge and Safari:
https://daniel.haxx.se/blog/2016/08/18/http2-connection-coalescing/

- Firefox reuses connections by IP address.
- Chrome reuses connections by both.
- Edge and Safari reuse connections by hostname.
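
The wildcard subjectAltName example quoted from RFC 7540 above can be made concrete with a small sketch. It is a simplified name check written for illustration (one wildcard label, case-sensitive comparison, no IP or IDN handling), not the validation any TLS library or browser actually performs. The function names are made up.

```cpp
#include <cassert>
#include <string>
#include <vector>

// True when `host` is covered by `pattern`. A pattern is either an exact
// name or "*.suffix", where the wildcard stands for exactly one label.
bool name_covers(const std::string &pattern, const std::string &host)
{
  if (pattern.rfind("*.", 0) != 0) {
    return pattern == host; // no wildcard: exact match only
  }
  const std::string suffix          = pattern.substr(1); // e.g. ".example.com"
  const std::string::size_type dot  = host.find('.');
  if (dot == std::string::npos || dot == 0) {
    return false; // host has no left-most label to substitute
  }
  return host.substr(dot) == suffix;
}

// True when any of the certificate's names covers `host`.
bool cert_valid_for(const std::vector<std::string> &cert_names, const std::string &host)
{
  for (const std::string &name : cert_names) {
    if (name_covers(name, host)) {
      return true;
    }
  }
  return false;
}

void example_usage()
{
  const std::vector<std::string> names{"*.example.com"};
  assert(cert_valid_for(names, "a.example.com"));
  assert(cert_valid_for(names, "b.example.com"));
  assert(!cert_valid_for(names, "example.com"));     // wildcard needs a label
  assert(!cert_valid_for(names, "a.b.example.com")); // only one label is covered
}
```

Under the RFC 7540 rule a connection opened for a.example.com could carry requests for b.example.com with such a certificate, whereas a strict reading of the RFC 6066 sentence would require a new connection because the SNI name differs.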



was (Author: oknet):
RFC 6066 conflicts with RFC 7540:

{code}
RFC 6066

If the server_name is
   established in the TLS session handshake, the client SHOULD NOT
   attempt to request a different server name at the application layer.

{code}


{code}
RFC 7540

9.1.1 Connection Reuse

Connections that are made to an origin server, either directly or through a 
tunnel created using the CONNECT method (Section 8.3), MAY be reused for 
requests with multiple different URI authority components. A connection can be 
reused as long as the origin server is authoritative (Section 10.1). For TCP 
connections without TLS, this depends on the host having resolved to the same 
IP address.

For https resources, connection reuse additionally depends on having a 
certificate that is valid for the host in the URI. The certificate presented by 
the server MUST satisfy any checks that the client would perform when forming a 
new TLS connection for the host in the URI.

An origin server might offer a certificate with multiple subjectAltName 
attributes or names with wildcards, one of which is valid for the authority in 
the URI. For example, a certificate with a subjectAltName of *.example.com 
might permit the use of the same connection for requests to URIs starting with 
https://a.example.com/ and https://b.example.com/.

{code}


RFC 7540 (HTTP/2) allows connection reuse as long as the certificate is valid 
for the host in the URI.
RFC 6066, however, only allows connection reuse when the SNI is the same as the 
host in the URI.



[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS

2016-09-21 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509978#comment-15509978
 ] 

Oknet Xu commented on TS-4468:
--

RFC 6066 conflicts with RFC 7540:

{code}
RFC 6066

If the server_name is
   established in the TLS session handshake, the client SHOULD NOT
   attempt to request a different server name at the application layer.

{code}


{code}
RFC 7540

9.1.1 Connection Reuse

Connections that are made to an origin server, either directly or through a 
tunnel created using the CONNECT method (Section 8.3), MAY be reused for 
requests with multiple different URI authority components. A connection can be 
reused as long as the origin server is authoritative (Section 10.1). For TCP 
connections without TLS, this depends on the host having resolved to the same 
IP address.

For https resources, connection reuse additionally depends on having a 
certificate that is valid for the host in the URI. The certificate presented by 
the server MUST satisfy any checks that the client would perform when forming a 
new TLS connection for the host in the URI.

An origin server might offer a certificate with multiple subjectAltName 
attributes or names with wildcards, one of which is valid for the authority in 
the URI. For example, a certificate with a subjectAltName of *.example.com 
might permit the use of the same connection for requests to URIs starting with 
https://a.example.com/ and https://b.example.com/.

{code}


RFC 7540 (HTTP/2) allows connection reuse as long as the certificate is valid 
for the host in the URI.
RFC 6066, however, only allows connection reuse when the SNI is the same as the 
host in the URI.




[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS

2016-09-20 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506700#comment-15506700
 ] 

Oknet Xu commented on TS-4468:
--

{code}
shared_result = httpSessionManager.acquire_session(this,                                 // state machine
                                                   &t_state.current.server->dst_addr.sa, // ip + port
                                                   t_state.current.server->name,         // hostname
                                                   ua_session,                           // has ptr to bound ua sessions
                                                   this                                  // sm
                                                   );
{code}

t_state.current.server->name should not be used to acquire a server session. It 
is only used to look up HostDB and get dst_addr.

We should replace it with server_request.get_host() here to comply with RFC 6066.



[jira] [Commented] (TS-4468) http.server_session_sharing.match = both unsafe with HTTPS

2016-09-20 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506637#comment-15506637
 ] 

Oknet Xu commented on TS-4468:
--

As I understand RFC 6066:

  - As a server, we should always verify the SNI whenever we get a new request.
    (This is not yet complete in ATS.)
    - In other words, the SNI of client_vc should always be in sync with the
      Host header in t_state.hdr_info.client_request.
  - As a client, we should always set the SNI based on the application layer.
    - In other words, the SNI of server_vc should always be in sync with the
      Host header in t_state.hdr_info.server_request.

Thus, acquiring the server session by server_request.get_host() is enough, and 
there is no need to compare the SNI.

We only need to fix the bug if they turn out not to be in sync.
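
The invariant described above (the SNI of a connection staying in sync with the Host header of the request sent on it) could be checked with something like the sketch below. The helper names are hypothetical and nothing here is ATS code. It assumes the Host value may carry a ":port" suffix and that names compare case-insensitively; bracketed IPv6 literals are not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Strip an optional ":port" suffix from a Host header value.
std::string host_without_port(const std::string &host_header)
{
  const std::string::size_type colon = host_header.rfind(':');
  return colon == std::string::npos ? host_header : host_header.substr(0, colon);
}

// ASCII lower-casing, since DNS names compare case-insensitively.
std::string to_lower(std::string s)
{
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s;
}

// True when the SNI name and the Host header refer to the same name.
bool sni_in_sync_with_host(const std::string &sni, const std::string &host_header)
{
  return to_lower(sni) == to_lower(host_without_port(host_header));
}

void example_usage()
{
  assert(sni_in_sync_with_host("example.com", "Example.COM:443"));
  assert(!sni_in_sync_with_host("example.com", "www.example.com"));
}
```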




[jira] [Commented] (TS-803) Fix SOCKS breakage and allow for setting next-hop SOCKS

2016-09-19 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15502680#comment-15502680
 ] 

Oknet Xu commented on TS-803:
-

From my understanding:

  - compared to a parent HTTP proxy server, a SOCKS proxy server used as a 
    parent proxy is a lower-level setting
  - the SOCKS proxy is designed to proxy all outgoing connections, including 
    those to a parent HTTP proxy server.

So we can set a SOCKS proxy manually with the new API, in a remap or plugin.

About the API name:

  - proxy/api/ts/ts.h: tsapi void TSHttpTxnParentProxySet(TSHttpTxn txnp, const char *hostname, int port);
  - we already have the ParentProxySet API, which is named without "Addr"
  - the SOCKS proxy is not specific to the HTTP protocol; it is set on a 
    VConnection. Can we name it TSVConnSocksParentSet?

And we need more parameters for the SOCKS server: 

  - SOCKS version
  - username (optional)
  - password (optional)
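
To make the parameter list above concrete, here is a rough sketch of what such a parameter bundle and a sanity check might look like. Everything in it is hypothetical: the struct, its field names and `is_valid()` do not exist in the ATS API, and an empty string is used to mean "not set". The length limit follows RFC 1929, which encodes the SOCKS5 username and password lengths in one octet each.

```cpp
#include <cassert>
#include <string>

// Hypothetical parameter bundle for a SOCKS parent; not part of ts.h.
struct SocksParentParams {
  std::string host;
  int port;             // 1..65535
  int version;          // 4 or 5
  std::string username; // optional, empty means "not set"
  std::string password; // optional, empty means "not set"
};

bool is_valid(const SocksParentParams &p)
{
  if (p.host.empty() || p.port < 1 || p.port > 65535) {
    return false;
  }
  if (p.version == 4) {
    // SOCKS4 carries a user id but has no password field.
    return p.password.empty();
  }
  if (p.version == 5) {
    // RFC 1929 username/password auth needs both fields, each 1..255 bytes.
    if (p.username.size() > 255 || p.password.size() > 255) {
      return false;
    }
    return p.username.empty() == p.password.empty();
  }
  return false; // unknown SOCKS version
}

void example_usage()
{
  const SocksParentParams anonymous{"socks.example.net", 1080, 5, "", ""};
  const SocksParentParams half_auth{"socks.example.net", 1080, 5, "user", ""};
  assert(is_valid(anonymous));
  assert(!is_valid(half_auth));
}
```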



> Fix SOCKS breakage and allow for setting next-hop SOCKS
> ---
>
> Key: TS-803
> URL: https://issues.apache.org/jira/browse/TS-803
> Project: Traffic Server
>  Issue Type: New Feature
>  Components: Network, SOCKS
>Affects Versions: 3.0.0
> Environment: Wherever ATS might run
>Reporter: M. Nunberg
>
> Here is a patch I drew up a few months ago against a snapshot of ATS/2.1.7 
> unstable/git. There are some quirks here, and I'm not that sure any more what 
> this patch does exactly. However it:
> 1) Does fix SOCKS connections in general
> 2) Allows setting next-hop SOCKS proxy via the API
> Problems:
> See https://issues.apache.org/jira/browse/TS-802
> This has no effect on connections which are drawn from the connection pool, 
> as it seems ATS currently doesn't maintain unique identities for peripheral 
> connection params (source IP, SOCKS etc); i.e. this only affects new TCP 
> connections to an OS.
> diff -x '*.o' -ru tsorig/iocore/net/I_NetVConnection.h 
> tsgit217/iocore/net/I_NetVConnection.h
> --- tsorig/iocore/net/I_NetVConnection.h2011-03-09 21:43:58.0 
> +
> +++ tsgit217/iocore/net/I_NetVConnection.h2011-03-17 14:37:18.0 
> +
> @@ -120,6 +120,13 @@
>/// Version of SOCKS to use.
>unsigned char socks_version;
> +  struct {
> +  unsigned int ip;
> +  int port;
> +  char *username;
> +  char *password;
> +  } socks_override;
> +
>int socket_recv_bufsize;
>int socket_send_bufsize;
> Only in tsgit217/iocore/net: Makefile
> Only in tsgit217/iocore/net: Makefile.in
> diff -x '*.o' -ru tsorig/iocore/net/P_Socks.h tsgit217/iocore/net/P_Socks.h
> --- tsorig/iocore/net/P_Socks.h2011-03-09 21:43:58.0 +
> +++ tsgit217/iocore/net/P_Socks.h2011-03-17 13:17:20.0 +
> @@ -126,7 +126,7 @@
>unsigned char version;
>bool write_done;
> -
> +  bool manual_parent_selection;
>SocksAuthHandler auth_handler;
>unsigned char socks_cmd;
> @@ -145,7 +145,8 @@
>  SocksEntry():Continuation(NULL), netVConnection(0),
>  ip(0), port(0), server_ip(0), server_port(0), nattempts(0),
> -lerrno(0), timeout(0), version(5), write_done(false), 
> auth_handler(NULL), socks_cmd(NORMAL_SOCKS)
> +lerrno(0), timeout(0), version(5), write_done(false), 
> manual_parent_selection(false),
> +auth_handler(NULL), socks_cmd(NORMAL_SOCKS)
>{
>}
>  };
> diff -x '*.o' -ru tsorig/iocore/net/Socks.cc tsgit217/iocore/net/Socks.cc
> --- tsorig/iocore/net/Socks.cc2011-03-09 21:43:58.0 +
> +++ tsgit217/iocore/net/Socks.cc2011-03-17 13:46:07.0 +
> @@ -73,7 +73,8 @@
>nattempts = 0;
>findServer();
> -  timeout = this_ethread()->schedule_in(this, 
> HRTIME_SECONDS(netProcessor.socks_conf_stuff->server_connect_timeout));
> +//  timeout = this_ethread()->schedule_in(this, 
> HRTIME_SECONDS(netProcessor.socks_conf_stuff->server_connect_timeout));
> +  timeout = this_ethread()->schedule_in(this, HRTIME_SECONDS(5));
>write_done = false;
>  }
> @@ -81,6 +82,15 @@
>  SocksEntry::findServer()
>  {
>nattempts++;
> +  if(manual_parent_selection) {
> +  if(nattempts > 1) {
> +  //Nullify IP and PORT
> +  server_ip = -1;
> +  server_port = 0;
> +  }
> +  Debug("mndebug(Socks)", "findServer() is a noop with manual socks 
> selection");
> +  return;
> +  }
>  #ifdef SOCKS_WITH_TS
>if (nattempts == 1) {
> @@ -187,7 +197,6 @@
>  }
>  Debug("Socks", "Failed to connect to %u.%u.%u.%u:%d", 
> PRINT_IP(server_ip), server_port);
> -
>  findServer();
>  if (server_ip == (uint32_t) - 1) {
> diff -x '*.o' -ru tsorig/iocore/net/UnixNetProcessor.cc 
> tsgit217/iocore/net/UnixNetProcessor.cc
> --- tsorig/iocore/net/UnixNetProcessor.cc2011-03-09 21:43:58.0 
> +
> +++ tsgit217/iocore/net/UnixNetProcessor.cc2011-03-17 15:48:38.0 
> +
> @@ 

[jira] [Updated] (TS-4879) NetVC leaks while hyper emergency occur on check_emergency_throttle()

2016-09-19 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4879:
-
Description: 
The con may be closed if a hyper emergency occurs in check_emergency_throttle().

But we do not check con.fd after check_emergency_throttle() returns.

In a hyper emergency:

- The socket fd is removed from epoll when it is closed.
- A NetVC with a closed socket fd is still created, and NET_EVENT_OPEN is 
called back to the SM.

Thus:

- The NetVC will never be triggered by NetHandler.
- Only InactivityCop can handle the NetVC, and the default timeout value is 
86400 secs.

For the counter net_connections_currently_open_stat:

- It is incremented in “connect_re_internal()”.
- It is not decremented when con.fd is set to NO_FD due to a hyper emergency, 
because close_UnixNetVConnection() decrements it only if con.fd != NO_FD. 
(TS-4178)

  was:
The con may be closed if a hyper emergency occurs in check_emergency_throttle().

But we do not check con.fd after check_emergency_throttle() returns.

In a hyper emergency:

- The socket fd is removed from epoll when it is closed.
- A NetVC with a closed socket fd is still created, and NET_EVENT_OPEN is 
called back to the SM.

Thus:

- The NetVC will never be triggered by NetHandler.
- Only InactivityCop can handle the NetVC, and the default timeout value is 
86400 secs.

For the counter net_connections_currently_open_stat:

- It is incremented in “connect_re_internal()”.
- It is not decremented when con.fd is set to NO_FD due to a hyper emergency, 
because close_UnixNetVConnection() decrements it only if con.fd != NO_FD.




[jira] [Created] (TS-4879) NetVC leaks while hyper emergency occur on check_emergency_throttle()

2016-09-18 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4879:


 Summary: NetVC leaks while hyper emergency occur on 
check_emergency_throttle()
 Key: TS-4879
 URL: https://issues.apache.org/jira/browse/TS-4879
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Oknet Xu


The con may be closed if a hyper emergency occurs in check_emergency_throttle().

But we do not check con.fd after check_emergency_throttle() returns.

In a hyper emergency:

- The socket fd is removed from epoll when it is closed.
- A NetVC with a closed socket fd is still created, and NET_EVENT_OPEN is 
called back to the SM.

Thus:

- The NetVC will never be triggered by NetHandler.
- Only InactivityCop can handle the NetVC, and the default timeout value is 
86400 secs.

For the counter net_connections_currently_open_stat:

- It is incremented in “connect_re_internal()”.
- It is not decremented when con.fd is set to NO_FD due to a hyper emergency, 
because close_UnixNetVConnection() decrements it only if con.fd != NO_FD.
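
The missing check described above can be illustrated with a toy model. This is not the ATS code: the types, the `toy_` functions and the fd ceiling are invented for the example, and "closing" the socket is reduced to setting the fd to NO_FD. It only shows the shape of the fix, which is to look at con.fd after the throttle check returns, undo the counter, and skip the open callback.

```cpp
#include <cassert>

constexpr int NO_FD = -1;

// Toy stand-in for the connection object.
struct Con {
  int fd = NO_FD;
};

// Toy stand-in for net_connections_currently_open_stat.
struct Stats {
  int currently_open = 0;
};

// Toy throttle check: above the (made-up) ceiling the socket is closed.
// Returns true when the emergency path was taken.
bool toy_check_emergency_throttle(Con &con, int hyper_emergency_ceiling)
{
  if (con.fd > hyper_emergency_ceiling) {
    con.fd = NO_FD; // stands in for closing the socket
    return true;
  }
  return false;
}

// Returns true when the connection is usable and the state machine should
// be told about it (the NET_EVENT_OPEN step in the description).
bool toy_connect(Con &con, Stats &stats, int new_fd, int hyper_emergency_ceiling)
{
  ++stats.currently_open; // counted as soon as the connect starts
  con.fd = new_fd;
  toy_check_emergency_throttle(con, hyper_emergency_ceiling);
  if (con.fd == NO_FD) {    // the check the issue says is missing
    --stats.currently_open; // undo the count for the closed socket
    return false;           // do not hand a dead connection to the SM
  }
  return true;
}

void example_usage()
{
  Stats stats;
  Con ok, throttled;
  assert(toy_connect(ok, stats, 10, 100));          // normal case
  assert(!toy_connect(throttled, stats, 500, 100)); // hyper emergency
  assert(stats.currently_open == 1);                // only the live one is counted
}
```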





[jira] [Assigned] (TS-4879) NetVC leaks while hyper emergency occur on check_emergency_throttle()

2016-09-18 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu reassigned TS-4879:


Assignee: Oknet Xu



[jira] [Commented] (TS-803) Fix SOCKS breakage and allow for setting next-hop SOCKS

2016-09-18 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500410#comment-15500410
 ] 

Oknet Xu commented on TS-803:
-

[~jpe...@apache.org] The patch implements a new API, TSHttpTxnSocksProxySet, to 
set the SOCKS server manually in a plugin. I could not find any code related to 
“Does fix SOCKS connections in general”.

Does it need an API review since it is a new API?

> Fix SOCKS breakage and allow for setting next-hop SOCKS
> ---
>
> Key: TS-803
> URL: https://issues.apache.org/jira/browse/TS-803
> Project: Traffic Server
>  Issue Type: New Feature
>  Components: Network, SOCKS
>Affects Versions: 3.0.0
> Environment: Wherever ATS might run
>Reporter: M. Nunberg
>
> Here is a patch I drew up a few months ago against a snapshot of ATS/2.1.7 
> unstable/git. There are some quirks here, and I'm not that sure any more what 
> this patch does exactly. However it:
> 1) Does fix SOCKS connections in general
> 2) Allows setting next-hop SOCKS proxy via the API
> Problems:
> See https://issues.apache.org/jira/browse/TS-802
> This has no effect on connections which are drawn from the connection pool, 
> as it seems ATS currently doesn't maintain unique identities for peripheral 
> connection params (source IP, SOCKS etc); i.e. this only affects new TCP 
> connections to an OS.
> diff -x '*.o' -ru tsorig/iocore/net/I_NetVConnection.h 
> tsgit217/iocore/net/I_NetVConnection.h
> --- tsorig/iocore/net/I_NetVConnection.h2011-03-09 21:43:58.0 
> +
> +++ tsgit217/iocore/net/I_NetVConnection.h2011-03-17 14:37:18.0 
> +
> @@ -120,6 +120,13 @@
>/// Version of SOCKS to use.
>unsigned char socks_version;
> +  struct {
> +  unsigned int ip;
> +  int port;
> +  char *username;
> +  char *password;
> +  } socks_override;
> +
>int socket_recv_bufsize;
>int socket_send_bufsize;
> Only in tsgit217/iocore/net: Makefile
> Only in tsgit217/iocore/net: Makefile.in
> diff -x '*.o' -ru tsorig/iocore/net/P_Socks.h tsgit217/iocore/net/P_Socks.h
> --- tsorig/iocore/net/P_Socks.h2011-03-09 21:43:58.0 +
> +++ tsgit217/iocore/net/P_Socks.h2011-03-17 13:17:20.0 +
> @@ -126,7 +126,7 @@
>unsigned char version;
>bool write_done;
> -
> +  bool manual_parent_selection;
>SocksAuthHandler auth_handler;
>unsigned char socks_cmd;
> @@ -145,7 +145,8 @@
>  SocksEntry():Continuation(NULL), netVConnection(0),
>  ip(0), port(0), server_ip(0), server_port(0), nattempts(0),
> -lerrno(0), timeout(0), version(5), write_done(false), 
> auth_handler(NULL), socks_cmd(NORMAL_SOCKS)
> +lerrno(0), timeout(0), version(5), write_done(false), 
> manual_parent_selection(false),
> +auth_handler(NULL), socks_cmd(NORMAL_SOCKS)
>{
>}
>  };
> diff -x '*.o' -ru tsorig/iocore/net/Socks.cc tsgit217/iocore/net/Socks.cc
> --- tsorig/iocore/net/Socks.cc2011-03-09 21:43:58.0 +
> +++ tsgit217/iocore/net/Socks.cc2011-03-17 13:46:07.0 +
> @@ -73,7 +73,8 @@
>nattempts = 0;
>findServer();
> -  timeout = this_ethread()->schedule_in(this, 
> HRTIME_SECONDS(netProcessor.socks_conf_stuff->server_connect_timeout));
> +//  timeout = this_ethread()->schedule_in(this, 
> HRTIME_SECONDS(netProcessor.socks_conf_stuff->server_connect_timeout));
> +  timeout = this_ethread()->schedule_in(this, HRTIME_SECONDS(5));
>write_done = false;
>  }
> @@ -81,6 +82,15 @@
>  SocksEntry::findServer()
>  {
>nattempts++;
> +  if(manual_parent_selection) {
> +  if(nattempts > 1) {
> +  //Nullify IP and PORT
> +  server_ip = -1;
> +  server_port = 0;
> +  }
> +  Debug("mndebug(Socks)", "findServer() is a noop with manual socks 
> selection");
> +  return;
> +  }
>  #ifdef SOCKS_WITH_TS
>if (nattempts == 1) {
> @@ -187,7 +197,6 @@
>  }
>  Debug("Socks", "Failed to connect to %u.%u.%u.%u:%d", 
> PRINT_IP(server_ip), server_port);
> -
>  findServer();
>  if (server_ip == (uint32_t) - 1) {
> diff -x '*.o' -ru tsorig/iocore/net/UnixNetProcessor.cc 
> tsgit217/iocore/net/UnixNetProcessor.cc
> --- tsorig/iocore/net/UnixNetProcessor.cc2011-03-09 21:43:58.0 
> +
> +++ tsgit217/iocore/net/UnixNetProcessor.cc2011-03-17 15:48:38.0 
> +
> @@ -228,6 +228,11 @@
>!socks_conf_stuff->ip_range.match(ip))
>  #endif
>  );
> +  if(opt->socks_override.ip >= 1) {
> +  using_socks = true;
> +  Debug("mndebug", "trying to set using_socks to true");
> +  }
> +
>SocksEntry *socksEntry = NULL;
>  #endif
>NET_SUM_GLOBAL_DYN_STAT(net_connections_currently_open_stat, 1);
> @@ -242,6 +247,16 @@
>if (using_socks) {
>  Debug("Socks", "Using Socks ip: %u.%u.%u.%u:%d\n", PRINT_IP(ip), port);
>  

[jira] [Commented] (TS-2889) Crash in FetchSM related to spdy FetchSM changes in 5.0.x

2016-09-07 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15470041#comment-15470041
 ] 

Oknet Xu commented on TS-2889:
--

Could you explain the code that copies data from resp_reader to resp_buffer?

{code}
+
+while (total_bytes_copied < bytes) {
+   int64_t actual_bytes_copied;
+   actual_bytes_copied = resp_buffer->write(resp_reader, bytes, 0);
+   Debug(DEBUG_TAG, "[%s] copied %" PRId64 " bytes", __FUNCTION__, 
actual_bytes_copied);
+   if (actual_bytes_copied <= 0) {
+   break;
+   }
+   total_bytes_copied += actual_bytes_copied;
+}
+Debug(DEBUG_TAG, "[%s] total copied %" PRId64 " bytes", __FUNCTION__, 
total_bytes_copied);
+resp_reader->consume(total_bytes_copied);
+
{code}

Why copy the data and then consume the old copy?

> Crash in FetchSM related to spdy FetchSM changes in 5.0.x
> -
>
> Key: TS-2889
> URL: https://issues.apache.org/jira/browse/TS-2889
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core, SPDY
>Affects Versions: 5.0.0
>Reporter: Brian Geffon
>Assignee: Brian Geffon
>  Labels: yahoo
> Fix For: 5.1.0
>
> Attachments: ts2889.diff
>
>
> I'm seeing a crash in the FetchSM on 5.0.x, this is surely because of the 
> changes that were made to the FetchSM as a result of SPDY. 
> Sample bt:
> #0  0x00377c632925 in raise () from /lib64/libc.so.6
> #1  0x00377c634105 in abort () from /lib64/libc.so.6
> #2  0x2b09b0693ef0 in ink_die_die_die (retval=1) at ink_error.cc:43
> #3  0x2b09b0693fbd in ink_fatal_va(int, const char *, typedef 
> __va_list_tag __va_list_tag *) (return_code=1, 
> message_format=0x2b09b06a1358 "%s:%d: failed assert `%s`", 
> ap=0x2b09b8806710) at ink_error.cc:65
> #4  0x2b09b0694086 in ink_fatal (return_code=1, 
> message_format=0x2b09b06a1358 "%s:%d: failed assert `%s`")
> at ink_error.cc:73
> #5  0x2b09b0692d40 in _ink_assert (expression=0x761f2f "header_done", 
> file=0x761ede "FetchSM.cc", line=160) at ink_assert.cc:37
> #6  0x004fa5c0 in FetchSM::check_chunked (this=0x2b09f8012240)
> at FetchSM.cc:160
> #7  0x004fac82 in FetchSM::get_info_from_buffer (this=0x2b09f8012240, 
> the_reader=0x2b09f4004818) at FetchSM.cc:313
> #8  0x004fb18b in FetchSM::process_fetch_read (this=0x2b09f8012240, 
> event=104) at FetchSM.cc:402
> #9  0x004fb42d in FetchSM::fetch_handler (this=0x2b09f8012240, 
> event=104, edata=0x2b09f8002768) at FetchSM.cc:449
> #10 0x004fc43e in Continuation::handleEvent (this=0x2b09f8012240, 
> event=104, data=0x2b09f8002768)
> at ../iocore/eventsystem/I_Continuation.h:146
> #11 0x00537f2e in PluginVC::process_read_side (this=0x2b09f8002670, 
> other_side_call=false) at PluginVC.cc:637
> #12 0x00536856 in PluginVC::main_handler (this=0x2b09f8002670, 
> event=1, data=0x2b0a340293e0) at PluginVC.cc:208
> #13 0x004fc43e in Continuation::handleEvent (this=0x2b09f8002670, 
> event=1, data=0x2b0a340293e0) at 
> ../iocore/eventsystem/I_Continuation.h:146
> #14 0x0075d2e6 in EThread::process_event (this=0x2b09b23cc010, 
> e=0x2b0a340293e0, calling_code=1) at UnixEThread.cc:145
> #15 0x0075d4b4 in EThread::execute (this=0x2b09b23cc010)
> at UnixEThread.cc:196
> #16 0x0075c844 in spawn_thread_internal (a=0x1428b10) at Thread.cc:88
> #17 0x00377ce079d1 in start_thread () from /lib64/libpthread.so.0





[jira] [Commented] (TS-4804) Incorrect write.vio.ndone

2016-09-01 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15455845#comment-15455845
 ] 

Oknet Xu commented on TS-4804:
--

[~zwoop] The bug exists only in 7.0.0.

> Incorrect write.vio.ndone
> -
>
> Key: TS-4804
> URL: https://issues.apache.org/jira/browse/TS-4804
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> int64_t
> UnixNetVConnection::load_buffer_and_write(int64_t towrite, MIOBufferAccessor &buf, 
> int64_t &total_written, int &needs)
> {
> ...
> if (r > 0) {
>   buf.reader()->consume(r);
> }
> total_written += r;
> ...
>   return r;
> }
> {code}
> 'r' is the value returned from socketManager.writev().
> 'total_written += r;' should be moved inside the if statement, because 'r' may 
> be negative, in which case total_written becomes incorrect.
> {code}
> void
> write_to_net_io(NetHandler *nh, UnixNetVConnection *vc, EThread *thread)
> {
> ...
>   int64_t r = vc->load_buffer_and_write(towrite, buf, 
> total_written, needs);
>   if (total_written > 0) {
> NET_SUM_DYN_STAT(net_write_bytes_stat, total_written);
> s->vio.ndone += total_written;
>   }
> ...
> }
> {code}
> An incorrect total_written in turn makes write.vio.ndone incorrect.





[jira] [Created] (TS-4804) Incorrect write.vio.ndone

2016-09-01 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4804:


 Summary: Incorrect write.vio.ndone
 Key: TS-4804
 URL: https://issues.apache.org/jira/browse/TS-4804
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Oknet Xu


{code}
int64_t
UnixNetVConnection::load_buffer_and_write(int64_t towrite, MIOBufferAccessor &buf, 
int64_t &total_written, int &needs)
{
...
if (r > 0) {
  buf.reader()->consume(r);
}
total_written += r;
...
  return r;
}
{code}

'r' is the value returned from socketManager.writev().
'total_written += r;' should be moved inside the if statement, because 'r' may be 
negative, in which case total_written becomes incorrect.

{code}
void
write_to_net_io(NetHandler *nh, UnixNetVConnection *vc, EThread *thread)
{
...
  int64_t r = vc->load_buffer_and_write(towrite, buf, 
total_written, needs);

  if (total_written > 0) {
NET_SUM_DYN_STAT(net_write_bytes_stat, total_written);
s->vio.ndone += total_written;
  }
...
}
{code}

An incorrect total_written in turn makes write.vio.ndone incorrect.
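
The guarded accumulation proposed above can be sketched on its own. `account_write` is a hypothetical helper, not an ATS function; it assumes the convention from the snippet that `r` is a byte count when positive and a negated errno otherwise.

```cpp
#include <cstdint>

// Hypothetical helper, not the ATS function: folds one writev()-style
// result into the running byte count.
inline int64_t account_write(int64_t r, int64_t &total_written)
{
  if (r > 0) {
    total_written += r; // only successful writes contribute
  }
  return r; // the caller still sees the raw result for error handling
}
```

With this shape, a failed write after a successful one leaves the byte count untouched, so the value later added to vio.ndone reflects only bytes that were actually written.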






[jira] [Commented] (TS-2482) Problems with SOCKS

2016-08-27 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441427#comment-15441427
 ] 

Oknet Xu commented on TS-2482:
--

SOCKS proxy support was broken by “TS-919: Make iocore IPv6 capable” 
(https://github.com/oknet/trafficserver/commit/8247bcac9e326746132d6526469c6b30146c0baf) 
on 19 Aug 2011.

server_addr is the address of the SOCKS server.
target_addr is the remote address to connect to through that server.
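
To make the two roles concrete, here is a minimal sketch of the SOCKS5 CONNECT request layout from RFC 1928 for an IPv4 target. The function name `socks5_connect_request` is hypothetical and this is not the ATS implementation; it only shows that the target address travels inside the request, while the TCP connection itself goes to the SOCKS server.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Builds a SOCKS5 CONNECT request (RFC 1928) for an IPv4 target.
// The TCP connection is opened to the SOCKS server (server_addr);
// the bytes below tell that server where to connect (target_addr).
inline std::vector<uint8_t> socks5_connect_request(const std::array<uint8_t, 4> &target_ip,
                                                   uint16_t target_port)
{
  std::vector<uint8_t> req;
  req.push_back(0x05); // VER: SOCKS version 5
  req.push_back(0x01); // CMD: CONNECT
  req.push_back(0x00); // RSV: reserved
  req.push_back(0x01); // ATYP: IPv4 address
  req.insert(req.end(), target_ip.begin(), target_ip.end());
  req.push_back(static_cast<uint8_t>(target_port >> 8));   // port, high byte first
  req.push_back(static_cast<uint8_t>(target_port & 0xff)); // port, low byte
  return req;
}
```

In a real exchange the method negotiation (and any authentication) precedes this request; that part is omitted here.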

> Problems with SOCKS
> ---
>
> Key: TS-2482
> URL: https://issues.apache.org/jira/browse/TS-2482
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Radim Kolar
>Assignee: Oknet Xu
> Fix For: sometime
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There are several problems with using SOCKS. I am interested in the case where 
> Traffic Server is the SOCKS client: the client sends an HTTP request and 
> Traffic Server uses a SOCKS server to connect to the internet.
> a/ - not documented enough in default configs
> From the comments in the default configs, it seems that to run 
> Traffic Server 4.1.2 as a SOCKS client it is sufficient to add one line to 
> socks.config:
> dest_ip=0.0.0.0-255.255.255.255 parent="10.0.0.7:9050"
> but the SOCKS proxy is not used. If I sniff packets with tcpdump, Traffic 
> Server never tries to connect to that SOCKS server.
> From the source code - 
> https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc - it 
> looks like "proxy.config.socks.socks_needed" must be set to activate 
> SOCKS support. This should be documented in both sample files: socks.config 
> and records.config
> b/
> After enabling SOCKS, I hit this assert:
> Assertion failed: (ats_is_ip4(_addr)), function init, file Socks.cc, 
> line 65.
> I run on a dual-stack system (IPv4 and IPv6). 
> Is this code setting the default destination for the SOCKS request? Could you 
> not just use 127.0.0.1 when the client connects over IPv6?
> https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc#L66





[jira] [Updated] (TS-4613) Set an independent thread_data_used for each thread group instead of sharing one thread_data_used

2016-08-26 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4613:
-
Priority: Minor  (was: Major)

> Set an independent thread_data_used for each thread group instead of sharing 
> one thread_data_used
> -
>
> Key: TS-4613
> URL: https://issues.apache.org/jira/browse/TS-4613
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
>Priority: Minor
>  Labels: Optimization
> Fix For: 7.0.0
>
>
> "thread_data_used" indicates how much of EThread::thread_private[ ] is in use.
> EThread::thread_private[ ] stores thread-specific data, e.g.:
>   - stat system arrays
>   - NetHandler object
>   - PollCont object
> However, each thread group has different private data.
> Sharing one thread_data_used across all groups therefore wastes space.





[jira] [Updated] (TS-4613) Set an independent thread_data_used for each thread group instead of sharing one thread_data_used

2016-08-26 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4613:
-
Issue Type: Improvement  (was: Bug)

> Set an independent thread_data_used for each thread group instead of sharing 
> one thread_data_used
> -
>
> Key: TS-4613
> URL: https://issues.apache.org/jira/browse/TS-4613
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
>  Labels: Optimization
> Fix For: 7.0.0
>
>
> "thread_data_used" indicates how much of EThread::thread_private[ ] is in use.
> EThread::thread_private[ ] stores thread-specific data, e.g.:
>   - stat system arrays
>   - NetHandler object
>   - PollCont object
> However, each thread group has different private data.
> Sharing one thread_data_used across all groups therefore wastes space.





[jira] [Updated] (TS-4613) Set an independent thread_data_used for each thread group instead of sharing one thread_data_used

2016-08-26 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4613:
-
Labels: Optimization  (was: )

> Set an independent thread_data_used for each thread group instead of sharing 
> one thread_data_used
> -
>
> Key: TS-4613
> URL: https://issues.apache.org/jira/browse/TS-4613
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
>  Labels: Optimization
> Fix For: 7.0.0
>
>
> "thread_data_used" indicates how much of EThread::thread_private[ ] is in use.
> EThread::thread_private[ ] stores thread-specific data, e.g.:
>   - stat system arrays
>   - NetHandler object
>   - PollCont object
> However, each thread group has different private data.
> Sharing one thread_data_used across all groups therefore wastes space.





[jira] [Updated] (TS-4613) Set an independent thread_data_used for each thread group instead of sharing one thread_data_used

2016-08-26 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4613:
-
Description: 
"thread_data_used" indicates how much of EThread::thread_private[ ] is in use.

EThread::thread_private[ ] stores thread-specific data, e.g.:

  - stat system arrays
  - NetHandler object
  - PollCont object

However, each thread group has different private data.

Sharing one thread_data_used across all groups therefore wastes space.
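
The space cost described above can be illustrated with a small model. This is a sketch under assumptions, not the ATS implementation: `PrivateAreaLayout` and `allocate` are hypothetical names, and the 16-byte alignment is chosen only for the example.

```cpp
#include <cstddef>

// Hypothetical model of reserving slots in a per-thread private area.
// The names are illustrative and do not match the ATS sources.
struct PrivateAreaLayout {
  std::size_t used = 0; // plays the role of thread_data_used

  // Reserve `size` bytes, rounded up to 16-byte alignment, and return
  // the offset of the reserved slot.
  std::size_t allocate(std::size_t size)
  {
    const std::size_t offset = used;
    used += (size + 15) & ~static_cast<std::size_t>(15);
    return offset;
  }
};
```

If one layout object is shared by every thread group, each thread's private area has to be sized for the sum of all reservations, even for data its group never uses. With one layout object per group, each group's threads only need room for that group's own reservations.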

  was:
NetHandler has a method, _close_vc, which is called by InactivityCop.

It first creates a dummy Event on the stack,
then calls UnixNetVConnection::mainEvent via vc->handleEvent(EVENT_IMMEDIATE, 
);

Here the handler invoked by handleEvent is mainEvent.

The relevant code in UnixNetVConnection::mainEvent:

```
int
UnixNetVConnection::mainEvent(int event, Event *e)
{
  ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL);
  ink_assert(thread == this_ethread());

  MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread);
  MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, e->ethread);
  MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : e->ethread->mutex, e->ethread);

  if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() ||
      (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) ||
      (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) {
#ifdef INACTIVITY_TIMEOUT
    if (e == active_timeout)
#endif
      e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
    return EVENT_CONT;
  }
```

If any lock is missed, the dummy Event is scheduled into the Event System by 
e->schedule_in(HRTIME_MSECONDS(net_retry_delay));

I think we should move the schedule_in inside the INACTIVITY_TIMEOUT #ifdef block.

```
#ifdef INACTIVITY_TIMEOUT
if (e == active_timeout)
  e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
#endif
```

I tried allocating a real Event instead of the dummy Event, but then the Event 
System called back on a deallocated UnixNetVConnection, because NetHandler had 
called close_UnixNetVConnection before the Event System called back the Event 
scheduled by schedule_in.

NetHandler::_close_vc decides whether to do ++handle_event; based on the return 
value (EVENT_DONE or EVENT_CONT) of UnixNetVConnection::mainEvent.

```
if (vc->handleEvent(EVENT_IMMEDIATE, ) == EVENT_DONE)
  ++handle_event;
```

The three MUTEX_TRY_LOCK calls do not always succeed in the InactivityCop callback, 
because the mutex of the ServerSessionVC may differ from that of the ClientSessionVC.

When HttpSM picks up a ServerSession from the SessionPool, only the mutex of the 
ServerSession is set to that of the HttpSM. The ServerSessionVC still keeps its old mutex.
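
The rescheduling rule proposed above can be modelled in a few lines. This is a toy sketch, not ATS code: `ToyEvent`, `HandlerResult` and `on_lock_miss` are hypothetical names, and the sketch models only the variant in which the reschedule is guarded by the `e == active_timeout` check.

```cpp
// Toy model of the proposal; all names are hypothetical.
struct ToyEvent {
  bool scheduled = false;
  void schedule_in() { scheduled = true; } // stands in for Event::schedule_in
};

enum class HandlerResult { Done, Cont };

// Called when one of the try-locks was missed. Only the connection's own
// timeout event is handed back to the scheduler; a caller-owned dummy
// event on the stack is left alone, so no pointer to it outlives the call.
inline HandlerResult on_lock_miss(ToyEvent *e, ToyEvent *active_timeout)
{
  if (e == active_timeout) {
    e->schedule_in();
  }
  return HandlerResult::Cont; // tells the caller the event was not handled
}
```

The caller, such as the code in _close_vc, can then rely on the return value alone to decide what to do with a connection whose locks could not be taken.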


> Set an independent thread_data_used for each thread group instead of sharing 
> one thread_data_used
> -
>
> Key: TS-4613
> URL: https://issues.apache.org/jira/browse/TS-4613
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>
> "thread_data_used" indicate the usage of EThread::thread_private[ ].
> The EThread::thread_private[ ] saved thread specific data e.g. :
>   - stat system arrays
>   - NetHandler object
>   - PollCont object
> However, the private data of thread group are different.
> Sharing thread_data_used cause the waste of space.





[jira] [Updated] (TS-4613) Set an independent thread_data_used for each thread group instead of sharing one thread_data_used

2016-08-26 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4613:
-
Summary: Set an independent thread_data_used for each thread group instead 
of sharing one thread_data_used  (was: In UnixNetVConnection::mainEvent should 
not do e->schedule_in for dummy event callback )

> Set an independent thread_data_used for each thread group instead of sharing 
> one thread_data_used
> -
>
> Key: TS-4613
> URL: https://issues.apache.org/jira/browse/TS-4613
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>
> NetHandler has a method, _close_vc, which is called by InactivityCop.
> It first creates a dummy Event on the stack,
> then calls UnixNetVConnection::mainEvent via vc->handleEvent(EVENT_IMMEDIATE, 
> );
> Here the handler invoked by handleEvent is mainEvent.
> The relevant code in UnixNetVConnection::mainEvent:
> ```
> int
> UnixNetVConnection::mainEvent(int event, Event *e)
> {
>   ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL);
>   ink_assert(thread == this_ethread());
>   MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread);
>   MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, 
> e->ethread);
>   MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : 
> e->ethread->mutex, e->ethread);
>   if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() ||
>   (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) ||
>   (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) {
> #ifdef INACTIVITY_TIMEOUT
> if (e == active_timeout)
> #endif
>   e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> return EVENT_CONT;
>   }
> ```
> If any lock is missed, the dummy Event is scheduled into the Event System by 
> e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> I think we should move the schedule_in inside the INACTIVITY_TIMEOUT #ifdef block.
> ```
> #ifdef INACTIVITY_TIMEOUT
> if (e == active_timeout)
>   e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> #endif
> ```
> I tried allocating a real Event instead of the dummy Event, but then the Event 
> System called back on a deallocated UnixNetVConnection, because NetHandler had 
> called close_UnixNetVConnection before the Event System called back the Event 
> scheduled by schedule_in.
> NetHandler::_close_vc decides whether to do ++handle_event; based on the return 
> value (EVENT_DONE or EVENT_CONT) of UnixNetVConnection::mainEvent.
> ```
> if (vc->handleEvent(EVENT_IMMEDIATE, ) == EVENT_DONE)
>   ++handle_event;
> ```
> The three MUTEX_TRY_LOCK calls do not always succeed in the InactivityCop callback, 
> because the mutex of the ServerSessionVC may differ from that of the ClientSessionVC.
> When HttpSM picks up a ServerSession from the SessionPool, only the mutex of the 
> ServerSession is set to that of the HttpSM. The ServerSessionVC still keeps its old mutex.





[jira] [Updated] (TS-4613) Set an independent thread_data_used for each thread group instead of sharing one thread_data_used

2016-08-26 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4613:
-
Component/s: (was: Network)

> Set an independent thread_data_used for each thread group instead of sharing 
> one thread_data_used
> -
>
> Key: TS-4613
> URL: https://issues.apache.org/jira/browse/TS-4613
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>
> NetHandler has a method, _close_vc, which is called by InactivityCop.
> It first creates a dummy Event on the stack,
> then calls UnixNetVConnection::mainEvent via vc->handleEvent(EVENT_IMMEDIATE, 
> );
> Here the handler invoked by handleEvent is mainEvent.
> The relevant code in UnixNetVConnection::mainEvent:
> ```
> int
> UnixNetVConnection::mainEvent(int event, Event *e)
> {
>   ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL);
>   ink_assert(thread == this_ethread());
>   MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread);
>   MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, 
> e->ethread);
>   MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : 
> e->ethread->mutex, e->ethread);
>   if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() ||
>   (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) ||
>   (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) {
> #ifdef INACTIVITY_TIMEOUT
> if (e == active_timeout)
> #endif
>   e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> return EVENT_CONT;
>   }
> ```
> If any lock is missed, the dummy Event is scheduled into the Event System by 
> e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> I think we should move the schedule_in inside the INACTIVITY_TIMEOUT #ifdef block.
> ```
> #ifdef INACTIVITY_TIMEOUT
> if (e == active_timeout)
>   e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> #endif
> ```
> I tried allocating a real Event instead of the dummy Event, but then the Event 
> System called back on a deallocated UnixNetVConnection, because NetHandler had 
> called close_UnixNetVConnection before the Event System called back the Event 
> scheduled by schedule_in.
> NetHandler::_close_vc decides whether to do ++handle_event; based on the return 
> value (EVENT_DONE or EVENT_CONT) of UnixNetVConnection::mainEvent.
> ```
> if (vc->handleEvent(EVENT_IMMEDIATE, ) == EVENT_DONE)
>   ++handle_event;
> ```
> The three MUTEX_TRY_LOCK calls do not always succeed in the InactivityCop callback, 
> because the mutex of the ServerSessionVC may differ from that of the ClientSessionVC.
> When HttpSM picks up a ServerSession from the SessionPool, only the mutex of the 
> ServerSession is set to that of the HttpSM. The ServerSessionVC still keeps its old mutex.





[jira] [Updated] (TS-4337) Pickup wrong dest ip or port while parsing CONNECT Method with use_client_target_addr = 2

2016-08-18 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4337:
-
Description: 
Add this config line to records.config:
{code}
CONFIG proxy.config.http.use_client_target_addr INT 2
{code}

Set up ATS in bridge mode with ebtables/iptables,
and enable tr-full on port 8080.

Send an HTTP CONNECT request:

{code}
telnet 200.x.y.10 8080
CONNECT 220.181.111.188:443 HTTP/1.1

{code}

The IP address 200.x.y.10 is a public HTTP proxy address.

Snippet from traffic.out:
{code}
+ Proxy's Request +
-- State Machine Id: 578
CONNECT  HTTP/1.1
Client-ip: 172.22.70.66
X-Forwarded-For: 172.22.70.66
Via: http/1.0 debian[AC166F6E] (ApacheTrafficServer/6.0.0)
Host: 220.181.111.188:443

[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http_trans) Next action 
next; HttpTransact::HandleResponse
[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) [578] State 
Transition: SM_ACTION_API_OS_DNS -> SM_ACTION_ORIGIN_SERVER_RAW_OPEN
[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http_track) entered 
inside do_http_server_open ][IPv4]
[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) [578] open 
connection to 220.181.111.188: 200.x.y.10:443
[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http_seq) 
[HttpSM::do_http_server_open] Sending request to server
[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) calling 
netProcessor.connect_s
{code}

Please note the line below:
{code}
[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) [578] open 
connection to 220.181.111.188: 200.x.y.10:443
{code}

With "use_client_target_addr INT 2", ATS does not resolve the name and picks up 
the dest IP directly from the TCP layer, but still picks up the dest port from 
the HTTP request.

In tr-full mode, should ATS tunnel the CONNECT method to the remote proxy, 
treating it as a one-shot parent proxy for this TCP connection only?

Or should it behave some other way?
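
The mismatch in the log line can be sketched as two ways of choosing the outbound endpoint. The types and function names below are hypothetical and do not come from the ATS sources; the sketch only contrasts the mixed selection reported above with one consistent alternative, and does not claim which behaviour ATS should adopt.

```cpp
#include <cstdint>
#include <string>

// Hypothetical endpoint type for illustration only.
struct Endpoint {
  std::string ip;
  uint16_t port;
};

// Mixed selection as described in the report: the IP comes from the
// intercepted TCP destination, the port from the CONNECT request.
inline Endpoint mixed_target(const Endpoint &tcp_dst, const Endpoint &request_target)
{
  return Endpoint{tcp_dst.ip, request_target.port};
}

// One consistent alternative: take both IP and port from the TCP layer.
inline Endpoint tcp_layer_target(const Endpoint &tcp_dst)
{
  return tcp_dst;
}
```

With a client that connected to a proxy on port 8080 and asked for port 443, the mixed selection yields the proxy's IP with port 443, an endpoint the proxy may not be listening on.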

  was:
Add this config line to records.config:
{code}
CONFIG proxy.config.http.use_client_target_addr INT 2
{code}

Set up ATS in bridge mode with ebtables/iptables,
and enable tr-full on port 8080.

Send an HTTP CONNECT request:

{code}
telnet 200.x.y.10 8080
CONNECT 220.181.111.188:443 HTTP/1.1

{code}

The IP address 200.x.y.10 is a public HTTP proxy address.

Snippet from traffic.out:
{code}
+ Proxy's Request +
-- State Machine Id: 578
CONNECT  HTTP/1.1
Client-ip: 172.22.70.66
X-Forwarded-For: 172.22.70.66
Via: http/1.0 debian[AC166F6E] (ApacheTrafficServer/6.0.0)
Host: 220.181.111.188:443

[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http_trans) Next action 
next; HttpTransact::HandleResponse
[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) [578] State 
Transition: SM_ACTION_API_OS_DNS -> SM_ACTION_ORIGIN_SERVER_RAW_OPEN
[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http_track) entered 
inside do_http_server_open ][IPv4]
[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) [578] open 
connection to 220.181.111.188: 111.13.56.28:443
[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http_seq) 
[HttpSM::do_http_server_open] Sending request to server
[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) calling 
netProcessor.connect_s
{code}

Please note the line below:
{code}
[Apr  7 16:58:49.996] Server {0x2b6176b8a700} DEBUG: (http) [578] open 
connection to 220.181.111.188: 200.x.y.10:443
{code}

With "use_client_target_addr INT 2", ATS does not resolve the name and picks up 
the dest IP directly from the TCP layer, but still picks up the dest port from 
the HTTP request.

In tr-full mode, should ATS tunnel the CONNECT method to the remote proxy, 
treating it as a one-shot parent proxy for this TCP connection only?

Or should it behave some other way?


> Pickup wrong dest ip or port while parsing CONNECT Method with 
> use_client_target_addr = 2
> -
>
> Key: TS-4337
> URL: https://issues.apache.org/jira/browse/TS-4337
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP
>Reporter: Oknet Xu
> Fix For: 7.1.0
>
>
> Add this config line to records.config:
> {code}
> CONFIG proxy.config.http.use_client_target_addr INT 2
> {code}
> Set up ATS in bridge mode with ebtables/iptables,
> and enable tr-full on port 8080.
> Send an HTTP CONNECT request:
> {code}
> telnet 200.x.y.10 8080
> CONNECT 220.181.111.188:443 HTTP/1.1
> {code}
> The IP address 200.x.y.10 is a public HTTP proxy address.
> Snippet from traffic.out:
> {code}
> + Proxy's Request +
> -- State Machine Id: 578
> CONNECT  HTTP/1.1
> Client-ip: 172.22.70.66
> X-Forwarded-For: 172.22.70.66
> Via: http/1.0 debian[AC166F6E] (ApacheTrafficServer/6.0.0)
> Host: 220.181.111.188:443
> [Apr  7 16:58:49.996] Server {0x2b6176b8a700} 

[jira] [Updated] (TS-4522) Should signal SM with EVENT_ERROR on error in write_to_net_io(), EVENT_EOS only signaled from read_from_net()

2016-08-18 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4522:
-
Description: 
The 1st Problem:

"r" holds the return value of write(). "r == 0" (or "!r") does not mean EOS. 

On a socket whose connection has been closed by the peer: 
- read(socketfd) returns 0
- write(socketfd) fails with EPIPE

In write_to_net_io, we check the return value of write() the same way 
as for read().

{code}
if (!r || r == -ECONNRESET) {
{code}

It is a copy & paste bug.
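
The read/write asymmetry above can be observed with a short POSIX experiment. This is a sketch assuming a Unix-like system; `ClosedPeerResult` and `probe_closed_peer` are hypothetical names, and the code is independent of ATS.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

struct ClosedPeerResult {
  long read_result;  // return value of read()
  long write_result; // return value of write()
  int write_errno;   // errno observed after write()
};

// Creates a connected pair of stream sockets, closes one end, then reads
// from and writes to the other end to observe the two results.
inline ClosedPeerResult probe_closed_peer()
{
  ClosedPeerResult out{-2, -2, 0};
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
    return out; // could not set up the experiment
  }
  signal(SIGPIPE, SIG_IGN); // process-wide: report EPIPE via errno, not a signal
  close(sv[1]);             // the peer goes away

  char byte = 'x';
  out.read_result = read(sv[0], &byte, 1); // end of stream
  errno = 0;
  out.write_result = write(sv[0], &byte, 1);
  out.write_errno = errno;
  close(sv[0]);
  return out;
}
```

A return value of 0 is therefore an end-of-stream signal only on the read side; on the write side the closed peer shows up as an error.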

As a result, a closed connection does not produce a VC_EVENT_EOS callback in 
write_to_net_io; VC_EVENT_ERROR is signaled instead. 

Full code here:
{code}
  int64_t r = vc->load_buffer_and_write(towrite, buf, total_written, needs);

  if (total_written > 0) {
    NET_SUM_DYN_STAT(net_write_bytes_stat, total_written);
    s->vio.ndone += total_written;
  }

  // check for errors
  if (r <= 0) { // if the socket was not ready,add to WaitList
    if (r == -EAGAIN || r == -ENOTCONN) {
      NET_INCREMENT_DYN_STAT(net_calls_to_write_nodata_stat);
      if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
        vc->write.triggered = 0;
        nh->write_ready_list.remove(vc);
        write_reschedule(nh, vc);
      }
      if ((needs & EVENTIO_READ) == EVENTIO_READ) {
        vc->read.triggered = 0;
        nh->read_ready_list.remove(vc);
        read_reschedule(nh, vc);
      }
      return;
    }
    if (!r || r == -ECONNRESET) {
      vc->write.triggered = 0;
      write_signal_done(VC_EVENT_EOS, nh, vc);
      return;
    }
    vc->write.triggered = 0;
    write_signal_error(nh, vc, (int)-total_written);
    return;
{code}

The 2nd Problem:

In iocore/net/I_NetVConnection.h, the comment for do_io_write reads:
{code}
  /**
    Initiates write. Thread-safe, may be called when not handling
    an event from the NetVConnection, or the NetVConnection creation
    callback.

    Callbacks: non-reentrant, c's lock taken during callbacks.

    c->handleEvent(VC_EVENT_WRITE_READY, vio)
      signifies data has written from the reader or there are no 
      bytes available for the reader to write.

    c->handleEvent(VC_EVENT_WRITE_COMPLETE, vio)
      signifies the amount of data indicated by nbytes has been read 
      from the buffer

    c->handleEvent(VC_EVENT_ERROR, vio)
      signified that error occured during write.

    The vio returned during callbacks is the same as the one returned
    by do_io_write(). The vio can be changed only during call backs
    from the vconnection. The vconnection deallocates the reader
    when it is destroyed.

    @param c continuation to be called back after (partial) write
    @param nbytes no of bytes to write, if unknown msut be set to INT64_MAX
    @param buf source of data
    @param owner
    @return vio pointer

  */
  virtual VIO *do_io_write(Continuation *c, int64_t nbytes, IOBufferReader *buf, bool owner = false) = 0;
{code}

Only 3 events are documented:

- VC_EVENT_WRITE_READY
- VC_EVENT_WRITE_COMPLETE
- VC_EVENT_ERROR

The code {code}write_signal_done(VC_EVENT_EOS, nh, vc);{code} should not be 
here (write_to_net_io).
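
A minimal sketch of the mapping this implies, assuming the return convention of the excerpt (bytes written, or a negative errno). `WriteOutcome` and `classify_write_result` are hypothetical names used only for illustration; they are not ATS code.

```cpp
#include <cerrno>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not ATS types.
enum class WriteOutcome {
  Progress,   // r > 0: some bytes were written
  Reschedule, // socket not ready, wait for the next write-ready notification
  Error       // everything else, including r == 0, -ECONNRESET and -EPIPE
};

// r follows the convention of the excerpt: bytes written, or -errno.
inline WriteOutcome
classify_write_result(int64_t r)
{
  if (r > 0) {
    return WriteOutcome::Progress;
  }
  if (r == -EAGAIN || r == -ENOTCONN) {
    return WriteOutcome::Reschedule;
  }
  // Deliberately no EOS branch: in this sketch EOS belongs to the read path only.
  return WriteOutcome::Error;
}
```

With this shape, a caller on the write path would only ever signal WRITE_READY, WRITE_COMPLETE or ERROR, which matches the three events listed in the header comment.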


  was:
On a closed socket fd:
read(socketfd) returns 0
write(socketfd) fails with EPIPE
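
The difference can be observed with a small probe. This is only a sketch: it uses a local AF_UNIX socket pair (an assumption made so the result is immediate), and `probe_closed_peer` and `ClosedPeerResult` are hypothetical names. On a real TCP connection the first write after the peer closes may still succeed before EPIPE or ECONNRESET shows up.

```cpp
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

struct ClosedPeerResult {
  ssize_t read_ret;  // return value of read() after the peer closed
  ssize_t write_ret; // return value of send() after the peer closed
  int write_errno;   // errno captured right after the failed send()
};

// Creates a connected stream socket pair, closes one end, then reads from and
// writes to the surviving end. MSG_NOSIGNAL keeps send() from raising SIGPIPE.
inline ClosedPeerResult
probe_closed_peer()
{
  ClosedPeerResult res{-1, -1, 0};
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
    res.write_errno = errno;
    return res;
  }
  close(fds[1]); // the peer goes away

  char buf[16];
  res.read_ret    = read(fds[0], buf, sizeof(buf));
  res.write_ret   = send(fds[0], "x", 1, MSG_NOSIGNAL);
  res.write_errno = (res.write_ret < 0) ? errno : 0;
  close(fds[0]);
  return res;
}
```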

In write_to_net_io, we check the return value of write() the same way as 
that of read().

{code}
if (!r || r == -ECONNRESET) {
{code}

Because of this bug, write_to_net_io never calls back with VC_EVENT_EOS for a 
closed socket; it signals VC_EVENT_ERROR instead. 

Full code here:
{code}
  int64_t r = vc->load_buffer_and_write(towrite, buf, total_written, needs);

  if (total_written > 0) {
    NET_SUM_DYN_STAT(net_write_bytes_stat, total_written);
    s->vio.ndone += total_written;
  }

  // check for errors
  if (r <= 0) { // if the socket was not ready, add to WaitList
    if (r == -EAGAIN || r == -ENOTCONN) {
      NET_INCREMENT_DYN_STAT(net_calls_to_write_nodata_stat);
      if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
        vc->write.triggered = 0;
        nh->write_ready_list.remove(vc);
        write_reschedule(nh, vc);
      }
      if ((needs & EVENTIO_READ) == EVENTIO_READ) {
        vc->read.triggered = 0;
        nh->read_ready_list.remove(vc);
        read_reschedule(nh, vc);
      }
      return;
    }
    if (!r || r == -ECONNRESET) {
      vc->write.triggered = 0;
      write_signal_done(VC_EVENT_EOS, nh, vc);
      return;
    }
    vc->write.triggered = 0;
    write_signal_error(nh, vc, (int)-total_written);
    return;
{code}


> Should signal SM with EVENT_ERROR on error in write_to_net_io(), EVENT_EOS 
> only signaled from read_from_net()
> -
>
> Key: TS-4522
> URL: https://issues.apache.org/jira/browse/TS-4522
> Project: 

[jira] [Updated] (TS-4522) Should signal SM with EVENT_ERROR on error in write_to_net_io(), EVENT_EOS only signaled from read_from_net()

2016-08-18 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4522:
-
Summary: Should signal SM with EVENT_ERROR on error in write_to_net_io(), 
EVENT_EOS only signaled from read_from_net()  (was: did not check EPIPE on 
write_to_net_io)

> Should signal SM with EVENT_ERROR on error in write_to_net_io(), EVENT_EOS 
> only signaled from read_from_net()
> -
>
> Key: TS-4522
> URL: https://issues.apache.org/jira/browse/TS-4522
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core, Network
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> On a closed socket fd:
> read(socketfd) returns 0
> write(socketfd) fails with EPIPE
> In write_to_net_io, we check the return value of write() the same 
> way as that of read().
> {code}
> if (!r || r == -ECONNRESET) {
> {code}
> Because of this bug, write_to_net_io never calls back with VC_EVENT_EOS for a 
> closed socket; it signals VC_EVENT_ERROR instead. 
> Full code here:
> {code}
>   int64_t r = vc->load_buffer_and_write(towrite, buf, total_written, needs);
>   if (total_written > 0) {
>     NET_SUM_DYN_STAT(net_write_bytes_stat, total_written);
>     s->vio.ndone += total_written;
>   }
>   // check for errors
>   if (r <= 0) { // if the socket was not ready, add to WaitList
>     if (r == -EAGAIN || r == -ENOTCONN) {
>       NET_INCREMENT_DYN_STAT(net_calls_to_write_nodata_stat);
>       if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
>         vc->write.triggered = 0;
>         nh->write_ready_list.remove(vc);
>         write_reschedule(nh, vc);
>       }
>       if ((needs & EVENTIO_READ) == EVENTIO_READ) {
>         vc->read.triggered = 0;
>         nh->read_ready_list.remove(vc);
>         read_reschedule(nh, vc);
>       }
>       return;
>     }
>     if (!r || r == -ECONNRESET) {
>       vc->write.triggered = 0;
>       write_signal_done(VC_EVENT_EOS, nh, vc);
>       return;
>     }
>     vc->write.triggered = 0;
>     write_signal_error(nh, vc, (int)-total_written);
>     return;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-4322) ProfileSM Proposal

2016-08-18 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426171#comment-15426171
 ] 

Oknet Xu commented on TS-4322:
--

Yes, the VC migration logic could.

But ProfileSM is a better design for ATS, and it does not conflict with the VC 
migration logic.


From my understanding of the code, NetVC has the following inheritance:

  SSLNetVConnection <-- UnixNetVConnection <-- NetVConnection <-- VConnection

A long time ago there was a class NTNetVConnection (it no longer exists). I 
guess its inheritance was: 

  SSLNetVConnection <-- NTNetVConnection <-- NetVConnection <-- VConnection

Thus SSLNetVConnection would have 2 versions, one derived from 
UnixNetVConnection and another derived from NTNetVConnection.

I noticed there are two kinds of header file prefix in iocore: "P_" and "I_".

The P_ prefix means the header file defines private interfaces and 
variables only.
The I_ prefix means the header file defines public interfaces and 
variables.

The class NetVConnection is defined in I_NetVConnection.h, thus it is an 
interface used by HttpSM.
The class UnixNetVConnection is defined in P_UnixNetVConnection.h, thus it is 
one implementation of NetVConnection.
NTNetVConnection was another implementation. This design lets 
HttpSM use NetVConnection directly regardless of the operating system 
(Windows or Unix).

But SSLNetVConnection breaks this design (6.0.x branch).

NetVConnection is designed to be a resource handle, but it has no abstraction 
for I/O operations.

ProfileSM is designed to abstract the I/O operations on a NetVConnection.

TcpProfileSM is the implementation for a NetVConnection over TCP.
SslProfileSM is the implementation for a NetVConnection over TCP with SSL.

I have already finished TcpProfileSM and SslProfileSM.
Now I am working on SocksProfileSM, which is used to verify this design.



> ProfileSM Proposal
> --
>
> Key: TS-4322
> URL: https://issues.apache.org/jira/browse/TS-4322
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Core, Network
>Reporter: Oknet Xu
> Fix For: sometime
>
> Attachments: ATS SslProfileSMv1.png, ATS TcpProfileSM.png
>
>
> Preface
> ===
> NetVConnection is a base class for all NetIO derived classes:
>   - NetVConnection
> - UnixNetVConnection for TCP NetIO
>   - SSLNetVConnection for SSL NetIO
> The code below tests whether a NetVC is a SSLNetVC:
> {code}
>   sslvc = dynamic_cast<SSLNetVConnection *>(netvc);
>   if (sslvc != NULL)
>   {
>  // netvc is a SSLNetVConnection
>   } else {
>  // netvc is a UnixNetVConnection
>   }
> {code}
> ATS supports the HTTP, SPDY and H2 protocols, and also supports them over SSL/TLS.
> Sometimes we want to talk HTTP over TCP first, and then HTTP over SSL.
> Example: HTTPS over the HTTP CONNECT method
> {code}
> Client send a CONNECT method request
> C->P: CONNECT www.example.com:443 HTTP/1.1
> C->P: Host: www.example.com:443
> C->P:
> ATS reply a HTTP 200/OK, then build a TCP tunnel to www.example.com:443
> P->C: 200 OK
> P->C: 
> Client send a SSL Handshake Client Hello message
> C->P: 
> ATS tunnel the message
> P->S: 
> Server response a SSL Handshake Server Hello message
> P<-S: 
> ATS tunnel the message
> C<-P: 
> Server send a Certificate to ATS
> P<-S: 
> ATS tunnel the message
> C<-P: 
> etc . . .
> {code}
> Currently, there is no easy way to upgrade from UnixNetVConnection to 
> SSLNetVConnection.
> ProfileSM is designed to set up a pluggable mechanism for 
> UnixNetVConnection to handle (abstract) different types of I/O operation.
> So we will have TcpProfileSM and UdpProfileSM as low-level ProfileSMs and 
> SslProfileSM as a high-level ProfileSM.
> How to implement
> 
> Introduce a new class ProfileSM & TcpProfileSM & SslProfileSM :
> It is a derived class from Continuation
> - Has handleEvent() function
> - Has mutex member
> TcpProfileSM is a derived class from ProfileSM
> SslProfileSM is a derived class from ProfileSM
> handshakeEvent(int event, void *data) function
> - only defined in SslProfileSM
> - the SSL handshake handle function
> - `event' can be IOCORE_EVENTS_READ or IOCORE_EVENTS_WRITE
> - it is callback from NetHandler::mainNetEvent()
> - `data' is a pointer to Nethandler type
> - it is implement NPN/ALPN support and replace SSLNextProtocolAccept & 
> SSLNextProtocolTrampoline, pick some codes from sslvc->net_read_io(), 
> write_to_net_io()
> - set Continuation->handler to mainEvent() when HandShake done.
> mainEvent(int event, void *data) function
> - the first entrance
> - `event' can be IOCORE_EVENTS_READ or IOCORE_EVENTS_WRITE
> - it is callback from NetHandler::mainNetEvent()
> - `data' is a pointer to Nethandler type
> handle_read(NetHandler *nh, EThread *lthread)
> - 

[jira] [Commented] (TS-4475) Crash in Log-Collation client after using inactivity-cop.

2016-08-15 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420710#comment-15420710
 ] 

Oknet Xu commented on TS-4475:
--

Log-Collation has LogCollationHostSM and LogCollationClientSM: 
LogCollationHostSM handles the server-side netvc and LogCollationClientSM 
handles the client side.

We should do the same thing for LogCollationHostSM. What do you think, 
[~sudheerv]?

Can you include it in your PR#831?

> Crash in Log-Collation client after using inactivity-cop.
> -
>
> Key: TS-4475
> URL: https://issues.apache.org/jira/browse/TS-4475
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 6.1.1
>Reporter: Peter Chou
> Fix For: sometime
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Background: We recently tried making use of inactivity-cop by setting it to 
> 300s instead of the default one-day setting. This was to address an issue 
> where, under heavy load, ATS would become un-responsive to client requests, 
> and the condition would persist after traffic was stopped with the active 
> queue saying 0 connections but 'netstat -na' showing a bunch of established 
> connections (up to the throttle limit approximately).
> Inactivity cop seemed to help ATS handle this situation, but we have since 
> experienced a couple of core dumps over the last four day period. It seems 
> occasionally the Log Collation Client State Machine will have event value 105 
> or VC_EVENT_INACTIVITY_TIMEOUT, but when it reaches read_signal_and_update() 
> it tries to call the continuation handler which down the line does not know 
> about this event thus causing core dump !"unexpcted state" [sic].
> Here is the back-trace --
> (gdb) bt
> #0  0x2b67cd5405f7 in raise () from /lib64/libc.so.6
> #1  0x2b67cd541e28 in abort () from /lib64/libc.so.6
> #2  0x2b67cb032921 in ink_die_die_die () at ink_error.cc:43
> #3  0x2b67cb0329da in ink_fatal_va (fmt=0x2b67cb0442dc "%s:%d: failed 
> assert `%s`", ap=0x7ffc690e7ba8) at ink_error.cc:65
> #4  0x2b67cb032a79 in ink_fatal (message_format=0x2b67cb0442dc "%s:%d: 
> failed assert `%s`") at ink_error.cc:73
> #5  0x2b67cb0305a6 in _ink_assert (expression=0x7fb422 "!\"unexpcted 
> state\"", file=0x7fb35b "LogCollationClientSM.cc",
> line=445) at ink_assert.cc:37
> #6  0x0069c86b in LogCollationClientSM::client_idle 
> (this=0x2b681400bb00, event=105) at LogCollationClientSM.cc:445
> #7  0x0069b427 in LogCollationClientSM::client_handler 
> (this=0x2b681400bb00, event=105, data=0x2b680c017020)
> at LogCollationClientSM.cc:119
> #8  0x00502cc6 in Continuation::handleEvent (this=0x2b681400bb00, 
> event=105, data=0x2b680c017020)
> at ../iocore/eventsystem/I_Continuation.h:153
> #9  0x00783d40 in read_signal_and_update (event=105, 
> vc=0x2b680c016f00) at UnixNetVConnection.cc:150
> #10 0x00787a22 in UnixNetVConnection::mainEvent (this=0x2b680c016f00, 
> event=1, e=0x127ad60) at UnixNetVConnection.cc:1188
> #11 0x00502cc6 in Continuation::handleEvent (this=0x2b680c016f00, 
> event=1, data=0x127ad60)
> at ../iocore/eventsystem/I_Continuation.h:153
> #12 0x0077d943 in InactivityCop::check_inactivity (this=0x1209a00, 
> event=2, e=0x127ad60) at UnixNet.cc:102
> #13 0x00502cc6 in Continuation::handleEvent (this=0x1209a00, event=2, 
> data=0x127ad60)
> at ../iocore/eventsystem/I_Continuation.h:153
> #14 0x007a5df6 in EThread::process_event (this=0x2b67cf7bb010, 
> e=0x127ad60, calling_code=2) at UnixEThread.cc:128
> #15 0x007a61f5 in EThread::execute (this=0x2b67cf7bb010) at 
> UnixEThread.cc:207
> #16 0x00534430 in main (argv=0x7ffc690e82e8) at Main.cc:1918
> I believe it takes a wrong turn here --
> #9  0x00783d40 in read_signal_and_update (event=105, 
> vc=0x2b680c016f00) at UnixNetVConnection.cc:150
> 150 vc->read.vio._cont->handleEvent(event, &vc->read.vio);
> (gdb) list
> 145 static inline int
> 146 read_signal_and_update(int event, UnixNetVConnection *vc)
> 147 {
> 148   vc->recursion++;
> 149   if (vc->read.vio._cont) {
> 150 vc->read.vio._cont->handleEvent(event, &vc->read.vio);
> 151   } else {
> 152 switch (event) {
> 153 case VC_EVENT_EOS:
> 154 case VC_EVENT_ERROR:
> (gdb) list
> 155 case VC_EVENT_ACTIVE_TIMEOUT:
> 156 case VC_EVENT_INACTIVITY_TIMEOUT:
> 157   Debug("inactivity_cop", "event %d: null read.vio cont, closing 
> vc %p", event, vc);
> 158   vc->closed = 1;
> 159   break;
> 160 default:
> 161   Error("Unexpected event %d for vc %p", event, vc);
> 162   ink_release_assert(0);
> 163   break;
> 164 }
> Note: I understand that 

[jira] [Commented] (TS-4475) Crash in Log-Collation client after using inactivity-cop.

2016-08-03 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405696#comment-15405696
 ] 

Oknet Xu commented on TS-4475:
--

[~pbchou] You should handle the Active Timeout and Inactivity Timeout events 
in Log-Collation; ignoring a timeout event would leave a dead netvc in ATS.
Also specify a standalone timeout value for Log-Collation.

> Crash in Log-Collation client after using inactivity-cop.
> -
>
> Key: TS-4475
> URL: https://issues.apache.org/jira/browse/TS-4475
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 6.1.1
>Reporter: Peter Chou
> Fix For: sometime
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Background: We recently tried making use of inactivity-cop by setting it to 
> 300s instead of the default one-day setting. This was to address an issue 
> where, under heavy load, ATS would become un-responsive to client requests, 
> and the condition would persist after traffic was stopped with the active 
> queue saying 0 connections but 'netstat -na' showing a bunch of established 
> connections (up to the throttle limit approximately).
> Inactivity cop seemed to help ATS handle this situation, but we have since 
> experienced a couple of core dumps over the last four day period. It seems 
> occasionally the Log Collation Client State Machine will have event value 105 
> or VC_EVENT_INACTIVITY_TIMEOUT, but when it reaches read_signal_and_update() 
> it tries to call the continuation handler which down the line does not know 
> about this event thus causing core dump !"unexpcted state" [sic].
> Here is the back-trace --
> (gdb) bt
> #0  0x2b67cd5405f7 in raise () from /lib64/libc.so.6
> #1  0x2b67cd541e28 in abort () from /lib64/libc.so.6
> #2  0x2b67cb032921 in ink_die_die_die () at ink_error.cc:43
> #3  0x2b67cb0329da in ink_fatal_va (fmt=0x2b67cb0442dc "%s:%d: failed 
> assert `%s`", ap=0x7ffc690e7ba8) at ink_error.cc:65
> #4  0x2b67cb032a79 in ink_fatal (message_format=0x2b67cb0442dc "%s:%d: 
> failed assert `%s`") at ink_error.cc:73
> #5  0x2b67cb0305a6 in _ink_assert (expression=0x7fb422 "!\"unexpcted 
> state\"", file=0x7fb35b "LogCollationClientSM.cc",
> line=445) at ink_assert.cc:37
> #6  0x0069c86b in LogCollationClientSM::client_idle 
> (this=0x2b681400bb00, event=105) at LogCollationClientSM.cc:445
> #7  0x0069b427 in LogCollationClientSM::client_handler 
> (this=0x2b681400bb00, event=105, data=0x2b680c017020)
> at LogCollationClientSM.cc:119
> #8  0x00502cc6 in Continuation::handleEvent (this=0x2b681400bb00, 
> event=105, data=0x2b680c017020)
> at ../iocore/eventsystem/I_Continuation.h:153
> #9  0x00783d40 in read_signal_and_update (event=105, 
> vc=0x2b680c016f00) at UnixNetVConnection.cc:150
> #10 0x00787a22 in UnixNetVConnection::mainEvent (this=0x2b680c016f00, 
> event=1, e=0x127ad60) at UnixNetVConnection.cc:1188
> #11 0x00502cc6 in Continuation::handleEvent (this=0x2b680c016f00, 
> event=1, data=0x127ad60)
> at ../iocore/eventsystem/I_Continuation.h:153
> #12 0x0077d943 in InactivityCop::check_inactivity (this=0x1209a00, 
> event=2, e=0x127ad60) at UnixNet.cc:102
> #13 0x00502cc6 in Continuation::handleEvent (this=0x1209a00, event=2, 
> data=0x127ad60)
> at ../iocore/eventsystem/I_Continuation.h:153
> #14 0x007a5df6 in EThread::process_event (this=0x2b67cf7bb010, 
> e=0x127ad60, calling_code=2) at UnixEThread.cc:128
> #15 0x007a61f5 in EThread::execute (this=0x2b67cf7bb010) at 
> UnixEThread.cc:207
> #16 0x00534430 in main (argv=0x7ffc690e82e8) at Main.cc:1918
> I believe it takes a wrong turn here --
> #9  0x00783d40 in read_signal_and_update (event=105, 
> vc=0x2b680c016f00) at UnixNetVConnection.cc:150
> 150 vc->read.vio._cont->handleEvent(event, &vc->read.vio);
> (gdb) list
> 145 static inline int
> 146 read_signal_and_update(int event, UnixNetVConnection *vc)
> 147 {
> 148   vc->recursion++;
> 149   if (vc->read.vio._cont) {
> 150 vc->read.vio._cont->handleEvent(event, &vc->read.vio);
> 151   } else {
> 152 switch (event) {
> 153 case VC_EVENT_EOS:
> 154 case VC_EVENT_ERROR:
> (gdb) list
> 155 case VC_EVENT_ACTIVE_TIMEOUT:
> 156 case VC_EVENT_INACTIVITY_TIMEOUT:
> 157   Debug("inactivity_cop", "event %d: null read.vio cont, closing 
> vc %p", event, vc);
> 158   vc->closed = 1;
> 159   break;
> 160 default:
> 161   Error("Unexpected event %d for vc %p", event, vc);
> 162   ink_release_assert(0);
> 163   break;
> 164 }
> Note: I understand that there were several issues related to TS-3196 
> concerning inactivity_cop and this section of 

[jira] [Commented] (TS-4475) Crash in Log-Collation client after using inactivity-cop.

2016-08-01 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401947#comment-15401947
 ] 

Oknet Xu commented on TS-4475:
--

According to the source file /proxy/logging/LogCollationClientSM.cc, 
LogCollationClientSM::client_idle() handles the LOG_COLL_CLIENT_IDLE state.
The netvc managed by LogCollationClientSM is a persistent connection, and 
client_idle() handles EOS for do_io_read and ERROR for do_io_write. Thus there 
is no timeout event, only an idle state, and the InactivityCop should not check 
timeouts for the netvc managed by LogCollationClientSM.

Consider that LogCollationHostSM::read_hdr() would receive the TIMEOUT event 
from InactivityCop too, but it calls host_handler() with the 
LOG_COLL_EVENT_ERROR event and does not assert.
host_handler() only logs the error event and calls host_done() to close the 
netvc.
Thus there is no crash report from LogCollationHostSM.

A possible solution:
Step 1. Handle the timeout event in LogCollationClientSM::client_idle() and the 
other handlers, and call client_fail() to close the timed-out netvc.
Step 2. Set a standalone inactivity timeout value for Log-Collation's netvc to 
keep the connection in an idle state.

PR831 does not call client_fail() to close the timed-out netvc. 
[~pbchou], can you replace "return EVENT_CONT" with "return 
client_fail(LOG_COLL_EVENT_SWITCH, NULL);" for VC_EVENT_INACTIVITY_TIMEOUT?
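
Roughly, Step 1 could look like the sketch below. It is a simplified stand-in, not the real LogCollationClientSM: the `Sketch*` names and return codes are hypothetical, and only the value 105 for the inactivity timeout is taken from the back-trace in this issue; the other event values are placeholders.

```cpp
// Illustrative event codes. 105 is the inactivity-timeout value seen in the
// back-trace; the remaining values are placeholders for this sketch.
enum SketchEvent {
  SKETCH_EVENT_ERROR              = 3,
  SKETCH_EVENT_EOS                = 104,
  SKETCH_EVENT_INACTIVITY_TIMEOUT = 105,
  SKETCH_EVENT_ACTIVE_TIMEOUT     = 106
};

enum SketchResult { SKETCH_CONT, SKETCH_FAILED, SKETCH_UNEXPECTED };

struct SketchClientSM {
  bool netvc_open = true;

  // Stand-in for client_fail(): closes the netvc so it is not left dead.
  SketchResult
  client_fail()
  {
    netvc_open = false;
    return SKETCH_FAILED;
  }

  // Stand-in for client_idle(): timeouts take the same path as EOS and ERROR
  // instead of falling through to the "unexpected state" default.
  SketchResult
  client_idle(int event)
  {
    switch (event) {
    case SKETCH_EVENT_EOS:
    case SKETCH_EVENT_ERROR:
    case SKETCH_EVENT_INACTIVITY_TIMEOUT:
    case SKETCH_EVENT_ACTIVE_TIMEOUT:
      return client_fail();
    default:
      return SKETCH_UNEXPECTED; // the original code asserts here
    }
  }
};
```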

> Crash in Log-Collation client after using inactivity-cop.
> -
>
> Key: TS-4475
> URL: https://issues.apache.org/jira/browse/TS-4475
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 6.1.1
>Reporter: Peter Chou
> Fix For: sometime
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Background: We recently tried making use of inactivity-cop by setting it to 
> 300s instead of the default one-day setting. This was to address an issue 
> where, under heavy load, ATS would become un-responsive to client requests, 
> and the condition would persist after traffic was stopped with the active 
> queue saying 0 connections but 'netstat -na' showing a bunch of established 
> connections (up to the throttle limit approximately).
> Inactivity cop seemed to help ATS handle this situation, but we have since 
> experienced a couple of core dumps over the last four day period. It seems 
> occasionally the Log Collation Client State Machine will have event value 105 
> or VC_EVENT_INACTIVITY_TIMEOUT, but when it reaches read_signal_and_update() 
> it tries to call the continuation handler which down the line does not know 
> about this event thus causing core dump !"unexpcted state" [sic].
> Here is the back-trace --
> (gdb) bt
> #0  0x2b67cd5405f7 in raise () from /lib64/libc.so.6
> #1  0x2b67cd541e28 in abort () from /lib64/libc.so.6
> #2  0x2b67cb032921 in ink_die_die_die () at ink_error.cc:43
> #3  0x2b67cb0329da in ink_fatal_va (fmt=0x2b67cb0442dc "%s:%d: failed 
> assert `%s`", ap=0x7ffc690e7ba8) at ink_error.cc:65
> #4  0x2b67cb032a79 in ink_fatal (message_format=0x2b67cb0442dc "%s:%d: 
> failed assert `%s`") at ink_error.cc:73
> #5  0x2b67cb0305a6 in _ink_assert (expression=0x7fb422 "!\"unexpcted 
> state\"", file=0x7fb35b "LogCollationClientSM.cc",
> line=445) at ink_assert.cc:37
> #6  0x0069c86b in LogCollationClientSM::client_idle 
> (this=0x2b681400bb00, event=105) at LogCollationClientSM.cc:445
> #7  0x0069b427 in LogCollationClientSM::client_handler 
> (this=0x2b681400bb00, event=105, data=0x2b680c017020)
> at LogCollationClientSM.cc:119
> #8  0x00502cc6 in Continuation::handleEvent (this=0x2b681400bb00, 
> event=105, data=0x2b680c017020)
> at ../iocore/eventsystem/I_Continuation.h:153
> #9  0x00783d40 in read_signal_and_update (event=105, 
> vc=0x2b680c016f00) at UnixNetVConnection.cc:150
> #10 0x00787a22 in UnixNetVConnection::mainEvent (this=0x2b680c016f00, 
> event=1, e=0x127ad60) at UnixNetVConnection.cc:1188
> #11 0x00502cc6 in Continuation::handleEvent (this=0x2b680c016f00, 
> event=1, data=0x127ad60)
> at ../iocore/eventsystem/I_Continuation.h:153
> #12 0x0077d943 in InactivityCop::check_inactivity (this=0x1209a00, 
> event=2, e=0x127ad60) at UnixNet.cc:102
> #13 0x00502cc6 in Continuation::handleEvent (this=0x1209a00, event=2, 
> data=0x127ad60)
> at ../iocore/eventsystem/I_Continuation.h:153
> #14 0x007a5df6 in EThread::process_event (this=0x2b67cf7bb010, 
> e=0x127ad60, calling_code=2) at UnixEThread.cc:128
> #15 0x007a61f5 in EThread::execute (this=0x2b67cf7bb010) at 
> UnixEThread.cc:207
> #16 0x00534430 in main (argv=0x7ffc690e82e8) at Main.cc:1918
> I believe it takes a wrong turn here --
> #9  0x00783d40 in read_signal_and_update (event=105, 
> vc=0x2b680c016f00) 

[jira] [Created] (TS-4705) Proposal: NetVC Context

2016-07-28 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4705:


 Summary: Proposal: NetVC Context
 Key: TS-4705
 URL: https://issues.apache.org/jira/browse/TS-4705
 Project: Traffic Server
  Issue Type: Improvement
  Components: Core
Reporter: Oknet Xu


Goal 1:
NetVConnection has the get_local_addr() and get_remote_addr() methods.
It also has the members local_addr and remote_addr, plus netvc->con.addr.

Thus we should use netvc->con.addr or remote_addr to replace the member 
server_addr in UnixNetVConnection.

Goal 2:
SSLNetVConnection has the member sslClientConnection with 2 methods, 
setSSLClientConnection() and getSSLClientConnection(), to indicate whether ATS 
is a client or a server in an SSL session.

To cover the two goals above, I am designing the netvc context function.

A proxy has two sides: the client side ( Client <-> Proxy ) and the server 
side ( Proxy <-> Server ). The netvc context function indicates which 
side the NetVC is working on.
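
As a rough illustration of what such a context could look like (the proposal does not fix an API, so `NetVCContext`, `ContextNetVC` and the method names here are hypothetical):

```cpp
// Hypothetical names for illustration; not an existing ATS API.
enum class NetVCContext {
  Unset,
  In, // client side: Client <-> Proxy
  Out // server side: Proxy <-> Server
};

class ContextNetVC
{
public:
  NetVCContext get_context() const { return context_; }

  // The context is set once; re-setting the same value is allowed,
  // switching sides is rejected.
  bool
  set_context(NetVCContext c)
  {
    if (context_ != NetVCContext::Unset && context_ != c) {
      return false;
    }
    context_ = c;
    return true;
  }

  // On the server side of the proxy, ATS is the one initiating the session,
  // so it acts as the client there (for example in an SSL handshake).
  bool acts_as_client() const { return context_ == NetVCContext::Out; }

private:
  NetVCContext context_ = NetVCContext::Unset;
};
```

A single context value like this could stand in for both the separate server_addr bookkeeping and the sslClientConnection flag.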

Goal 3:
Fix a minor bug in NetAccept::do_blocking_accept: call 
check_emergency_throttle(con) first, then allocate the vc.

Goal 4:
Optimize NetAccept: remove duplicated code, etc.





[jira] [Commented] (TS-4614) In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback

2016-07-28 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397182#comment-15397182
 ] 

Oknet Xu commented on TS-4614:
--

What should I do for the backport?

> In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event 
> callback 
> ---
>
> Key: TS-4614
> URL: https://issues.apache.org/jira/browse/TS-4614
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cop
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> NetHandler has a method, _close_vc, which is called by InactivityCop.
> First, it creates a dummy Event on the stack,
> then calls UnixNetVConnection::mainEvent via vc->handleEvent(EVENT_IMMEDIATE, 
> &event);
> handleEvent is mainEvent here.
> In the UnixNetVConnection::mainEvent code:
> ```
> int
> UnixNetVConnection::mainEvent(int event, Event *e)
> {
>   ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL);
>   ink_assert(thread == this_ethread());
>   MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread);
>   MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, e->ethread);
>   MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : e->ethread->mutex, e->ethread);
>   if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() ||
>       (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) ||
>       (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) {
> #ifdef INACTIVITY_TIMEOUT
>     if (e == active_timeout)
> #endif
>       e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
>     return EVENT_CONT;
>   }
> ```
> The dummy Event would be scheduled into the Event System by 
> e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> I think we should move the schedule_in call inside the INACTIVITY_TIMEOUT 
> #ifdef block.
> ```
> #ifdef INACTIVITY_TIMEOUT
> if (e == active_timeout)
>   e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> #endif
> ```
> I tried to allocate an Event instead of using the dummy Event, but then the 
> Event System called back on a deallocated UnixNetVConnection,
> because NetHandler had called close_UnixNetVConnection before the Event System 
> called back the Event scheduled by schedule_in.
> NetHandler::_close_vc depends on the return value (EVENT_DONE or EVENT_CONT) 
> of UnixNetVConnection::mainEvent to decide whether to do ++handle_event;
> ```
> if (vc->handleEvent(EVENT_IMMEDIATE, &event) == EVENT_DONE)
>   ++handle_event;
> ```
> The 3 MUTEX_TRY_LOCK calls do not always succeed on the InactivityCop callback,
> because the mutex of ServerSessionVC may differ from that of ClientSessionVC.
> Only the mutex of ServerSession is set to that of HttpSM when HttpSM picks up a 
> Server Session from the SessionPool; ServerSessionVC still keeps the old mutex.





[jira] [Created] (TS-4697) MIOBuffer did not free if failed on ipallow check in HttpSessionAccept::accept()

2016-07-23 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4697:


 Summary: MIOBuffer did not free if failed on ipallow check in 
HttpSessionAccept::accept()
 Key: TS-4697
 URL: https://issues.apache.org/jira/browse/TS-4697
 Project: Traffic Server
  Issue Type: Bug
  Components: HTTP, Network
Reporter: Oknet Xu


{code}
void
HttpSessionAccept::accept(NetVConnection *netvc, MIOBuffer *iobuf, IOBufferReader *reader)
{
  sockaddr const *client_ip = netvc->get_remote_addr();
  const AclRecord *acl_record = NULL;
  ip_port_text_buffer ipb;
  IpAllow::scoped_config ipallow;

  // The backdoor port is now only bound to "localhost", so no
  // reason to check for if it's incoming from "localhost" or not.
  if (backdoor) {
    acl_record = IpAllow::AllMethodAcl();
  } else if (ipallow && (((acl_record = ipallow->match(client_ip)) == NULL) || (acl_record->isEmpty()))) {

    // if client address forbidden, close immediately //

    Warning("client '%s' prohibited by ip-allow policy", ats_ip_ntop(client_ip, ipb, sizeof(ipb)));
    netvc->do_io_close();

    return;   // ->  the MIOBuffer is never freed on this path.
  }
...

{code}
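
One possible shape of a fix, sketched with stand-in types rather than the real ATS classes: release the buffer on the early-return path before bailing out. `SketchBuffer`, `free_sketch_buffer` and `accept_sketch` are hypothetical names; in ATS the release would presumably go through the existing MIOBuffer free routine.

```cpp
// Stand-in for MIOBuffer that counts live instances through a caller-owned counter.
struct SketchBuffer {
  explicit SketchBuffer(int &live) : live_(live) { ++live_; }
  ~SketchBuffer() { --live_; }
  int &live_;
};

// Stand-in for the routine that releases a buffer.
inline void
free_sketch_buffer(SketchBuffer *b)
{
  delete b;
}

// Returns true when the session would be started. On the normal path the
// buffer ownership passes to the new session (not modeled here).
inline bool
accept_sketch(bool client_allowed, SketchBuffer *iobuf)
{
  if (!client_allowed) {
    // Rejected by the ip-allow check: free the buffer handed to accept()
    // before returning, otherwise nobody else will.
    if (iobuf != nullptr) {
      free_sketch_buffer(iobuf);
    }
    return false;
  }
  return true;
}
```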





[jira] [Updated] (TS-4612) Proposal: InactivityCop Optimize

2016-06-29 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4612:
-
Description: 
Reviewing the processing in InactivityCop::check_inactivity():

1. get all local vcs from open_list
2. put them into cop_list
3. check every vc in cop_list to see whether it has already timed out
4. call back vc->handleEvent to close the vc if it has timed out

InactivityCop and NetHandler share one mutex.
InactivityCop runs every second and NetHandler runs every 10ms, which means 
NetHandler runs 100 times before the next InactivityCop run.

If a vc has read/write activity in a NetHandler call, it will not time out in 
the next InactivityCop run.

Thus, if a vc has read/write activity in NetHandler, we can move it out of 
cop_list, and the InactivityCop run would perform better.
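
A small sketch of the idea, using integer ids and a deadline map instead of real vcs. `SketchInactivityCop` and its methods are hypothetical and only illustrate pruning the cop list on activity, so the periodic check looks at fewer entries.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

class SketchInactivityCop
{
public:
  // Steps 1-2: snapshot the ids of all open connections into the cop list.
  // open_list maps a connection id to its inactivity deadline.
  void
  begin_round(const std::unordered_map<int, int64_t> &open_list)
  {
    cop_list_.clear();
    for (const auto &entry : open_list) {
      cop_list_.insert(entry.first);
    }
  }

  // Called from the net handler when a connection had read/write activity:
  // it cannot time out in the next check, so drop it from the cop list.
  void note_activity(int vc_id) { cop_list_.erase(vc_id); }

  // Steps 3-4: only connections still on the cop list are examined.
  std::vector<int>
  expired(const std::unordered_map<int, int64_t> &open_list, int64_t now) const
  {
    std::vector<int> out;
    for (int id : cop_list_) {
      auto it = open_list.find(id);
      if (it != open_list.end() && it->second <= now) {
        out.push_back(id);
      }
    }
    return out;
  }

  std::size_t pending() const { return cop_list_.size(); }

private:
  std::unordered_set<int> cop_list_;
};
```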


  was:
NetHandler has a method, _close_vc, which is called by InactivityCop.

First, it creates a dummy Event on the stack,
then calls UnixNetVConnection::mainEvent via vc->handleEvent(EVENT_IMMEDIATE, 
&event);

handleEvent is mainEvent here.

In the UnixNetVConnection::mainEvent code:

```
int
UnixNetVConnection::mainEvent(int event, Event *e)
{
  ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL);
  ink_assert(thread == this_ethread());

  MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread);
  MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, e->ethread);
  MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : e->ethread->mutex, e->ethread);

  if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() ||
      (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) ||
      (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) {
#ifdef INACTIVITY_TIMEOUT
    if (e == active_timeout)
#endif
      e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
    return EVENT_CONT;
  }
```

The dummy Event would be scheduled into the Event System by 
e->schedule_in(HRTIME_MSECONDS(net_retry_delay));

I think we should move the schedule_in call inside the INACTIVITY_TIMEOUT 
#ifdef block.

```
#ifdef INACTIVITY_TIMEOUT
if (e == active_timeout)
  e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
#endif
```

I tried to allocate an Event instead of using the dummy Event, but then the 
Event System called back on a deallocated UnixNetVConnection,

because NetHandler had called close_UnixNetVConnection before the Event System 
called back the Event scheduled by schedule_in.

NetHandler::_close_vc depends on the return value (EVENT_DONE or EVENT_CONT) 
of UnixNetVConnection::mainEvent to decide whether to do ++handle_event;

```
if (vc->handleEvent(EVENT_IMMEDIATE, &event) == EVENT_DONE)
  ++handle_event;
```

The 3 MUTEX_TRY_LOCK calls do not always succeed on the InactivityCop callback,
because the mutex of ServerSessionVC may differ from that of ClientSessionVC.

Only the mutex of ServerSession is set to that of HttpSM when HttpSM picks up a 
Server Session from the SessionPool; ServerSessionVC still keeps the old mutex.


> Proposal: InactivityCop Optimize
> 
>
> Key: TS-4612
> URL: https://issues.apache.org/jira/browse/TS-4612
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cop
>Reporter: Oknet Xu
>
> Reviewing the processing in InactivityCop::check_inactivity():
> 1. get all local vcs from open_list
> 2. put them into cop_list
> 3. check every vc in cop_list to see whether it has already timed out
> 4. call back vc->handleEvent to close the vc if it has timed out
> InactivityCop and NetHandler share one mutex.
> InactivityCop runs every second and NetHandler runs every 10ms, which means 
> NetHandler runs 100 times before the next InactivityCop run.
> If a vc has read/write activity in a NetHandler call, it will not time out in 
> the next InactivityCop run.
> Thus, if a vc has read/write activity in NetHandler, we can move it out of 
> cop_list, and the InactivityCop run would perform better.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TS-4612) Proposal: InactivityCop Optimize

2016-06-29 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4612:
-
Summary: Proposal: InactivityCop Optimize  (was: In 
UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event 
callback )

> Proposal: InactivityCop Optimize
> 
>
> Key: TS-4612
> URL: https://issues.apache.org/jira/browse/TS-4612
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cop
>Reporter: Oknet Xu
>
> NetHandler has a method, _close_vc, which is called by InactivityCop.
> It first creates a dummy Event on the stack,
> then calls UnixNetVConnection::mainEvent via vc->handleEvent(EVENT_IMMEDIATE, 
> );
> Here handleEvent resolves to mainEvent.
> The relevant code in UnixNetVConnection::mainEvent:
> ```
> int
> UnixNetVConnection::mainEvent(int event, Event *e)
> {
>   ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL);
>   ink_assert(thread == this_ethread());
>   MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread);
>   MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, 
> e->ethread);
>   MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : 
> e->ethread->mutex, e->ethread);
>   if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() ||
>   (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) ||
>   (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) {
> #ifdef INACTIVITY_TIMEOUT
> if (e == active_timeout)
> #endif
>   e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> return EVENT_CONT;
>   }
> ```
> When the try-locks fail and INACTIVITY_TIMEOUT is not defined, the dummy Event 
> is scheduled into the Event System by 
> e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> I think we should move the schedule_in inside the INACTIVITY_TIMEOUT #ifdef block.
> ```
> #ifdef INACTIVITY_TIMEOUT
> if (e == active_timeout)
>   e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
> #endif
> ```
> I tried allocating a real Event instead of the dummy Event, but then the Event 
> System called back on a deallocated UnixNetVConnection, because NetHandler had 
> called close_UnixNetVConnection before the Event System delivered the Event 
> queued by schedule_in.
> NetHandler::_close_vc should therefore use the return value of 
> UnixNetVConnection::mainEvent (EVENT_DONE or EVENT_CONT) to decide whether to 
> do ++handle_event.
> ```
> if (vc->handleEvent(EVENT_IMMEDIATE, ) == EVENT_DONE)
>   ++handle_event;
> ```
> The 3 MUTEX_TRY_LOCK calls do not always succeed in the InactivityCop callback,
> because the mutex of the ServerSessionVC may differ from that of the ClientSessionVC.
> When HttpSM picks up a ServerSession from the SessionPool, only the mutex of the 
> ServerSession is set to the HttpSM mutex. The ServerSessionVC still keeps its old mutex.





[jira] [Created] (TS-4614) In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback

2016-06-29 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4614:


 Summary: In UnixNetVConnection::mainEvent should not do 
e->schedule_in for dummy event callback 
 Key: TS-4614
 URL: https://issues.apache.org/jira/browse/TS-4614
 Project: Traffic Server
  Issue Type: Bug
  Components: Cop
Reporter: Oknet Xu


NetHandler has a method, _close_vc, which is called by InactivityCop.

It first creates a dummy Event on the stack,
then calls UnixNetVConnection::mainEvent via vc->handleEvent(EVENT_IMMEDIATE, 
);

Here handleEvent resolves to mainEvent.

The relevant code in UnixNetVConnection::mainEvent:

```
int
UnixNetVConnection::mainEvent(int event, Event *e)
{
  ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL);
  ink_assert(thread == this_ethread());

  MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread);
  MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, e->ethread);
  MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : e->ethread->mutex, e->ethread);

  if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() ||
      (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) ||
      (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) {
#ifdef INACTIVITY_TIMEOUT
    if (e == active_timeout)
#endif
      e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
    return EVENT_CONT;
  }
```

When the try-locks fail and INACTIVITY_TIMEOUT is not defined, the dummy Event is 
scheduled into the Event System by 
e->schedule_in(HRTIME_MSECONDS(net_retry_delay));

I think we should move the schedule_in inside the INACTIVITY_TIMEOUT #ifdef block.

```
#ifdef INACTIVITY_TIMEOUT
if (e == active_timeout)
  e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
#endif
```

I tried allocating a real Event instead of the dummy Event, but then the Event 
System called back on a deallocated UnixNetVConnection, because NetHandler had 
called close_UnixNetVConnection before the Event System delivered the Event 
queued by schedule_in.

NetHandler::_close_vc should therefore use the return value of 
UnixNetVConnection::mainEvent (EVENT_DONE or EVENT_CONT) to decide whether to 
do ++handle_event.

```
if (vc->handleEvent(EVENT_IMMEDIATE, ) == EVENT_DONE)
  ++handle_event;
```

The 3 MUTEX_TRY_LOCK calls do not always succeed in the InactivityCop callback,
because the mutex of the ServerSessionVC may differ from that of the ClientSessionVC.

When HttpSM picks up a ServerSession from the SessionPool, only the mutex of the 
ServerSession is set to the HttpSM mutex. The ServerSessionVC still keeps its old mutex.
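
To make the two suggestions concrete, here is a toy model, assuming nothing about the real classes beyond what is quoted above. ToyEvent, ToyVC, CloseOutcome, close_vc and the TOY_EVENT_* values are all hypothetical names. The handler reschedules only an event the VC owns, so a caller's stack-allocated dummy is never queued, and the caller counts the VC as handled only on a "done" result.

```cpp
#include <cassert>

// Local stand-ins for the result codes; only their distinctness matters.
enum ToyResult { TOY_EVENT_DONE, TOY_EVENT_CONT };

struct ToyEvent {
  bool scheduled = false;
  void schedule_in(int /*delay_ms*/) { scheduled = true; }
};

struct ToyVC {
  ToyEvent *active_timeout = nullptr; // the VC's own long-lived timeout event
  bool locks_ok = true;               // stands in for the three try-locks

  int mainEvent(ToyEvent *e) {
    if (!locks_ok) {
      // Retry later only for an event the VC owns; a stack dummy passed in
      // by the caller must not be handed to the scheduler.
      if (e == active_timeout)
        e->schedule_in(10);
      return TOY_EVENT_CONT;
    }
    return TOY_EVENT_DONE;
  }
};

struct CloseOutcome {
  int ret;
  bool dummy_scheduled;
};

// Same shape as the _close_vc call site: a dummy event on the stack, and
// the counter is bumped only when the VC was really handled.
inline CloseOutcome close_vc(ToyVC &vc, int &handle_event) {
  ToyEvent dummy; // lives only for the duration of this call
  int ret = vc.mainEvent(&dummy);
  if (ret == TOY_EVENT_DONE)
    ++handle_event;
  return CloseOutcome{ret, dummy.scheduled};
}
```

When the locks are unavailable, close_vc() reports TOY_EVENT_CONT, the dummy is left unscheduled and handle_event stays unchanged; the cop can simply try again on its next pass.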





[jira] [Created] (TS-4612) In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback

2016-06-29 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4612:


 Summary: In UnixNetVConnection::mainEvent should not do 
e->schedule_in for dummy event callback 
 Key: TS-4612
 URL: https://issues.apache.org/jira/browse/TS-4612
 Project: Traffic Server
  Issue Type: Bug
  Components: Cop
Reporter: Oknet Xu


NetHandler has a method, _close_vc, which is called by InactivityCop.

It first creates a dummy Event on the stack,
then calls UnixNetVConnection::mainEvent via vc->handleEvent(EVENT_IMMEDIATE, 
);

Here handleEvent resolves to mainEvent.

The relevant code in UnixNetVConnection::mainEvent:

```
int
UnixNetVConnection::mainEvent(int event, Event *e)
{
  ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL);
  ink_assert(thread == this_ethread());

  MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread);
  MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, e->ethread);
  MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : e->ethread->mutex, e->ethread);

  if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() ||
      (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) ||
      (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) {
#ifdef INACTIVITY_TIMEOUT
    if (e == active_timeout)
#endif
      e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
    return EVENT_CONT;
  }
```

When the try-locks fail and INACTIVITY_TIMEOUT is not defined, the dummy Event is 
scheduled into the Event System by 
e->schedule_in(HRTIME_MSECONDS(net_retry_delay));

I think we should move the schedule_in inside the INACTIVITY_TIMEOUT #ifdef block.

```
#ifdef INACTIVITY_TIMEOUT
if (e == active_timeout)
  e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
#endif
```

I tried allocating a real Event instead of the dummy Event, but then the Event 
System called back on a deallocated UnixNetVConnection, because NetHandler had 
called close_UnixNetVConnection before the Event System delivered the Event 
queued by schedule_in.

NetHandler::_close_vc should therefore use the return value of 
UnixNetVConnection::mainEvent (EVENT_DONE or EVENT_CONT) to decide whether to 
do ++handle_event.

```
if (vc->handleEvent(EVENT_IMMEDIATE, ) == EVENT_DONE)
  ++handle_event;
```

The 3 MUTEX_TRY_LOCK calls do not always succeed in the InactivityCop callback,
because the mutex of the ServerSessionVC may differ from that of the ClientSessionVC.

When HttpSM picks up a ServerSession from the SessionPool, only the mutex of the 
ServerSession is set to the HttpSM mutex. The ServerSessionVC still keeps its old mutex.





[jira] [Created] (TS-4613) In UnixNetVConnection::mainEvent should not do e->schedule_in for dummy event callback

2016-06-29 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4613:


 Summary: In UnixNetVConnection::mainEvent should not do 
e->schedule_in for dummy event callback 
 Key: TS-4613
 URL: https://issues.apache.org/jira/browse/TS-4613
 Project: Traffic Server
  Issue Type: Bug
  Components: Cop
Reporter: Oknet Xu


NetHandler has a method, _close_vc, which is called by InactivityCop.

It first creates a dummy Event on the stack,
then calls UnixNetVConnection::mainEvent via vc->handleEvent(EVENT_IMMEDIATE, 
);

Here handleEvent resolves to mainEvent.

The relevant code in UnixNetVConnection::mainEvent:

```
int
UnixNetVConnection::mainEvent(int event, Event *e)
{
  ink_assert(event == EVENT_IMMEDIATE || event == EVENT_INTERVAL);
  ink_assert(thread == this_ethread());

  MUTEX_TRY_LOCK(hlock, get_NetHandler(thread)->mutex, e->ethread);
  MUTEX_TRY_LOCK(rlock, read.vio.mutex ? read.vio.mutex : e->ethread->mutex, e->ethread);
  MUTEX_TRY_LOCK(wlock, write.vio.mutex ? write.vio.mutex : e->ethread->mutex, e->ethread);

  if (!hlock.is_locked() || !rlock.is_locked() || !wlock.is_locked() ||
      (read.vio.mutex && rlock.get_mutex() != read.vio.mutex.get()) ||
      (write.vio.mutex && wlock.get_mutex() != write.vio.mutex.get())) {
#ifdef INACTIVITY_TIMEOUT
    if (e == active_timeout)
#endif
      e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
    return EVENT_CONT;
  }
```

When the try-locks fail and INACTIVITY_TIMEOUT is not defined, the dummy Event is 
scheduled into the Event System by 
e->schedule_in(HRTIME_MSECONDS(net_retry_delay));

I think we should move the schedule_in inside the INACTIVITY_TIMEOUT #ifdef block.

```
#ifdef INACTIVITY_TIMEOUT
if (e == active_timeout)
  e->schedule_in(HRTIME_MSECONDS(net_retry_delay));
#endif
```

I tried allocating a real Event instead of the dummy Event, but then the Event 
System called back on a deallocated UnixNetVConnection, because NetHandler had 
called close_UnixNetVConnection before the Event System delivered the Event 
queued by schedule_in.

NetHandler::_close_vc should therefore use the return value of 
UnixNetVConnection::mainEvent (EVENT_DONE or EVENT_CONT) to decide whether to 
do ++handle_event.

```
if (vc->handleEvent(EVENT_IMMEDIATE, ) == EVENT_DONE)
  ++handle_event;
```

The 3 MUTEX_TRY_LOCK calls do not always succeed in the InactivityCop callback,
because the mutex of the ServerSessionVC may differ from that of the ClientSessionVC.

When HttpSM picks up a ServerSession from the SessionPool, only the mutex of the 
ServerSession is set to the HttpSM mutex. The ServerSessionVC still keeps its old mutex.





[jira] [Updated] (TS-4487) haven't check the change of lock after return from wbe callback

2016-06-28 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4487:
-
Summary: haven't check the change of lock after return from wbe callback  
(was: Don't reschedule read depend on needs & did not check the change of lock 
at the return callback with wbe.)

> haven't check the change of lock after return from wbe callback
> ---
>
> Key: TS-4487
> URL: https://issues.apache.org/jira/browse/TS-4487
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: SSL
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>
> the code:
> {code}
> int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, 
> buf, needs);
> {code}
> At the end of write_to_net_io, 
> {code}
> if (!buf.reader()->read_avail()) {  // should check needs==0
>   write_disable(nh, vc);
>   return;
> }
> if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
>   write_reschedule(nh, vc);
> }
> if ((needs & EVENTIO_READ) == EVENTIO_READ) {
>   read_reschedule(nh, vc);
> }
> return;
> {code}
> Another issue in write_to_net_io(): the lock is not checked for a change after 
> returning from the wbe (write_buffer_empty_event) callback.
> {code}
>   if (s->vio.ntodo() <= 0) {
>     write_signal_done(VC_EVENT_WRITE_COMPLETE, nh, vc);
>     return;
>   } else if (signalled && (wbe_event != vc->write_buffer_empty_event)) {
>     // @a signalled means we won't send an event, and the event values differing means we
>     // had a write buffer trap and cleared it, so we need to send it now.
>     if (write_signal_and_update(wbe_event, vc) != EVENT_CONT)
>       return;
>     // > the change of lock is not checked here after the wbe callback returns.
>   } else if (!signalled) {
>     if (write_signal_and_update(VC_EVENT_WRITE_READY, vc) != EVENT_CONT) {
>       return;
>     }
>     // change of lock... don't look at shared variables!
>     if (lock.get_mutex() != s->vio.mutex.get()) {
>       write_reschedule(nh, vc);
>       return;
>     }
>   }
> {code}
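
The missing check can be illustrated with a small self-contained model. This is only a sketch under assumptions: ToyVIO, ToyMutexPtr, WriteOutcome and signal_and_check are hypothetical names, and std::shared_ptr<std::mutex> merely stands in for the reference-counted mutex pointer. The point is that the same comparison already done after the VC_EVENT_WRITE_READY callback would be repeated after the wbe callback.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>

using ToyMutexPtr = std::shared_ptr<std::mutex>;

struct ToyVIO {
  ToyMutexPtr mutex;
};

enum class WriteOutcome { Continue, Rescheduled, Stopped };

// Signal the continuation, then verify that the mutex held by the caller is
// still the one the VIO points at. The callback returns true to continue.
inline WriteOutcome signal_and_check(ToyVIO &vio, const ToyMutexPtr &held,
                                     const std::function<bool(ToyVIO &)> &callback) {
  if (!callback(vio))
    return WriteOutcome::Stopped;     // the callee asked us to stop
  if (held.get() != vio.mutex.get())
    return WriteOutcome::Rescheduled; // lock changed: do not touch shared state
  return WriteOutcome::Continue;
}
```

A callback that swaps vio.mutex yields Rescheduled, which corresponds to calling write_reschedule(nh, vc) and returning in the excerpt above.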





[jira] [Created] (TS-4590) INKVConnInternal didn't set m_free_magic to DEAD as INKContInternal

2016-06-27 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4590:


 Summary: INKVConnInternal didn't set m_free_magic to DEAD as 
INKContInternal
 Key: TS-4590
 URL: https://issues.apache.org/jira/browse/TS-4590
 Project: Traffic Server
  Issue Type: Improvement
  Components: TS API
Reporter: Oknet Xu


The class INKContInternal is the base class of INKVConnInternal.

INKVConnInternal overrides destroy() and handle_event(), but forgets to set 
m_free_magic (a debug flag) to DEAD.

I will add 2 methods for INKContInternal and INKVConnInternal:

- clear()
  - clear variables
- free()
  - call clear() first
  - call this->mutex.clear();
  - set m_free_magic
  - call xxxAllocator.free(this)

and rewrite destroy to call free().
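
A minimal sketch of how the clear()/free() split could look, not the real classes. ToyContInternal, ToyVConnInternal, has_mutex and the two magic constants are placeholders chosen for illustration, and the allocator call is left out so the object can still be inspected afterwards.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative magic values; the real constants are not shown in this report.
constexpr std::uint32_t TOY_MAGIC_ALIVE = 0x00009631;
constexpr std::uint32_t TOY_MAGIC_DEAD = 0xDEAD9631;

struct ToyContInternal {
  std::uint32_t m_free_magic = TOY_MAGIC_ALIVE;
  void *mdata = nullptr;
  bool has_mutex = true; // stands in for the mutex reference

  virtual ~ToyContInternal() = default;

  // Reset member variables only.
  virtual void clear() { mdata = nullptr; }

  // Common teardown: clear(), drop the mutex, mark the object dead.
  virtual void free() {
    clear();
    has_mutex = false; // models this->mutex.clear()
    m_free_magic = TOY_MAGIC_DEAD;
    // A real implementation would return the object to its allocator here.
  }

  void destroy() { free(); }
};

struct ToyVConnInternal : ToyContInternal {
  int m_closed = 0;

  // The derived class resets its own fields, then defers to the base.
  void clear() override {
    m_closed = 0;
    ToyContInternal::clear();
  }
  // free() is inherited in this sketch, so the DEAD flag cannot be forgotten.
};
```

In the proposal each class calls its own allocator in free(), so the derived class would override free() as well; the sketch keeps a single free() only to show where the flag is set.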





[jira] [Commented] (TS-4539) the mutex of server_vc is not set while server_session reuse.

2016-06-16 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15333670#comment-15333670
 ] 

Oknet Xu commented on TS-4539:
--

Client_vc->mutex is allocated in NetAccept by {code}vc->mutex = new_ProxyMutex();{code}

and

Server_vc->mutex is set by {code}vc->mutex = cont->mutex;{code} in 
UnixNetProcessor::connect_re_internal().

connect_re_internal() is called by NetProcessor::connect_re().

{code}
  if (scheme_to_use == URL_WKSIDX_HTTPS) {
DebugSM("http", "calling sslNetProcessor.connect_re");
int len = 0;
const char *host = t_state.hdr_info.server_request.host_get();
if (host && len > 0)
  opt.set_sni_servername(host, len);
connect_action_handle = sslNetProcessor.connect_re(this,
 // state machine
   
_state.current.server->dst_addr.sa, // addr + port
   );
  } else {
if (t_state.method != HTTP_WKSIDX_CONNECT) {
  DebugSM("http", "calling netProcessor.connect_re");
  connect_action_handle = netProcessor.connect_re(this, 
// state machine
  
_state.current.server->dst_addr.sa, // addr + port
  );
} else {
{code}

According to the above code in HttpSM::do_http_server_open(), the cont is the HttpSM.

So Server_vc->mutex is set to HttpSM->mutex.

Server_vc->mutex is not set to EThread->mutex as you said.


> the mutex of server_vc is not set while server_session reuse.
> -
>
> Key: TS-4539
> URL: https://issues.apache.org/jira/browse/TS-4539
> Project: Traffic Server
>  Issue Type: Bug
>  Components: HTTP
>Reporter: Oknet Xu
>
> NetAccept gets a client_vc and calls new_ProxyMutex() to assign it a mutex.
> The HttpClientSession, HttpSM, HttpServerSession and server_vc then share 
> that same mutex.
> The HttpServerSession and server_vc are later put into the ServerSessionPool and 
> may be assigned to a new client_vc.
> After taking an HttpServerSession from the ServerSessionPool, 
> HttpSM::attach_server_session() only sets the mutex of the HttpServerSession 
> to the mutex of the HttpSM.
> It forgets to set the mutex of the server_vc to the mutex of the HttpSM.
>  
> {code}
> void
> HttpSM::attach_server_session(HttpServerSession *s)
> {
>   hsm_release_assert(server_session == NULL);
>   hsm_release_assert(server_entry == NULL);
>   hsm_release_assert(s->state == HSS_ACTIVE);
>   server_session = s; 
>   server_session->transact_count++;
>   // Set the mutex so that we have something to update
>   //   stats with
>   server_session->mutex = this->mutex;
> {code}
> However, I could not find any resulting problem. Is this by design?
> Or is the problem just hard to locate, given my limited knowledge of HttpSM?





[jira] [Created] (TS-4539) the mutex of server_vc is not set while server_session reuse.

2016-06-14 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4539:


 Summary: the mutex of server_vc is not set while server_session 
reuse.
 Key: TS-4539
 URL: https://issues.apache.org/jira/browse/TS-4539
 Project: Traffic Server
  Issue Type: Bug
  Components: HTTP
Reporter: Oknet Xu


NetAccept gets a client_vc and calls new_ProxyMutex() to assign it a mutex.
The HttpClientSession, HttpSM, HttpServerSession and server_vc then share that 
same mutex.

The HttpServerSession and server_vc are later put into the ServerSessionPool and 
may be assigned to a new client_vc.

After taking an HttpServerSession from the ServerSessionPool, 
HttpSM::attach_server_session() only sets the mutex of the HttpServerSession to 
the mutex of the HttpSM.
It forgets to set the mutex of the server_vc to the mutex of the HttpSM.
 
{code}
void
HttpSM::attach_server_session(HttpServerSession *s)
{
  hsm_release_assert(server_session == NULL);
  hsm_release_assert(server_entry == NULL);
  hsm_release_assert(s->state == HSS_ACTIVE);
  server_session = s; 
  server_session->transact_count++;

  // Set the mutex so that we have something to update
  //   stats with
  server_session->mutex = this->mutex;
{code}

However, I could not find any resulting problem. Is this by design?
Or is the problem just hard to locate, given my limited knowledge of HttpSM?
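
To show the state this leaves behind, here is a toy model of the reuse path. All types are hypothetical stand-ins (ToySM, ToyServerSession, ToyServerVC), with std::shared_ptr<std::mutex> playing the role of the shared mutex pointer; it only mirrors the single assignment visible in the excerpt.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

using ToyMutexPtr = std::shared_ptr<std::mutex>;

struct ToyServerVC {
  ToyMutexPtr mutex;
};

struct ToyServerSession {
  ToyMutexPtr mutex;
  ToyServerVC *vc;
};

struct ToySM {
  ToyMutexPtr mutex = std::make_shared<std::mutex>();
  ToyServerSession *server_session = nullptr;

  // Mirrors the excerpt: only the session's mutex is re-pointed.
  void attach_server_session(ToyServerSession *s) {
    server_session = s;
    server_session->mutex = mutex;
  }
};
```

If a session created under a first state machine is attached to a second one, the session's mutex equals the second state machine's mutex while the VC's mutex is still the first one's. Whether that divergence is harmful or intentional is exactly the open question above.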





[jira] [Commented] (TS-4521) compile error on build proxy/http2/test_HPACK

2016-06-13 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326884#comment-15326884
 ] 

Oknet Xu commented on TS-4521:
--

Thanks! It works.

> compile error on build proxy/http2/test_HPACK
> -
>
> Key: TS-4521
> URL: https://issues.apache.org/jira/browse/TS-4521
> Project: Traffic Server
>  Issue Type: Bug
>Reporter: Oknet Xu
>Assignee: Masakazu Kitajo
> Fix For: 7.0.0
>
>
> OS: Debian 7 (wheezy)
> ATS Branch: master
> GCC: 4.7.2(Debian 4.7.2-5)
> {code}
> /usr/bin/ld: ../../proxy/hdrs/libhdrs.a(HttpCompat.o): undefined reference to 
> symbol 'Tcl_NextHashEntry'
> /usr/bin/ld: note: 'Tcl_NextHashEntry' is defined in DSO 
> /usr/lib/libtcl8.5.so.0 so try adding it to the linker command line
> /usr/lib/libtcl8.5.so.0: could not read symbols: Invalid operation
> collect2: error: ld returned 1 exit status
> {code}





[jira] [Created] (TS-4522) did not check EPIPE on write_to_net_io

2016-06-12 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4522:


 Summary: did not check EPIPE on write_to_net_io
 Key: TS-4522
 URL: https://issues.apache.org/jira/browse/TS-4522
 Project: Traffic Server
  Issue Type: Bug
  Components: Core, Network
Reporter: Oknet Xu


On a closed socket fd:
read(socketfd) returns 0
write(socketfd) fails with EPIPE

In write_to_net_io, we check the return value of write() the same way as 
for read().

{code}
if (!r || r == -ECONNRESET) {
{code}

As a result, write_to_net_io never delivers VC_EVENT_EOS for a closed socket; 
it signals VC_EVENT_ERROR instead. 
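
The read/write asymmetry can be reproduced outside of Traffic Server with a small POSIX probe. This is a sketch using a local AF_UNIX socket pair, where the behaviour is immediate; on a TCP connection the first write after the peer closes may still succeed before a later one fails. ClosedPeerResult and probe_closed_peer are names made up for this example.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <sys/socket.h>
#include <unistd.h>

struct ClosedPeerResult {
  ssize_t read_ret;
  ssize_t write_ret;
  int write_errno;
};

// Create a connected socket pair, close one end, then read from and write
// to the surviving end and record what the kernel reports.
inline ClosedPeerResult probe_closed_peer() {
  ClosedPeerResult res{-2, -2, 0};
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
    return res;

  signal(SIGPIPE, SIG_IGN); // otherwise the write below would raise SIGPIPE
  close(fds[1]);            // the peer goes away

  char byte = 'x';
  res.read_ret = read(fds[0], &byte, 1); // end of stream
  errno = 0;
  res.write_ret = write(fds[0], &byte, 1);
  res.write_errno = errno;

  close(fds[0]);
  return res;
}
```

The probe reports read() returning 0, and write() returning -1 with errno set to EPIPE, so a check written for the read side does not recognise the closed peer on the write side.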

full code here:
{code}
  int64_t r = vc->load_buffer_and_write(towrite, buf, total_written, needs);

  if (total_written > 0) {
NET_SUM_DYN_STAT(net_write_bytes_stat, total_written);
s->vio.ndone += total_written;
  }

  // check for errors
  if (r <= 0) { // if the socket was not ready,add to WaitList
if (r == -EAGAIN || r == -ENOTCONN) {
  NET_INCREMENT_DYN_STAT(net_calls_to_write_nodata_stat);
  if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
vc->write.triggered = 0;
nh->write_ready_list.remove(vc);
write_reschedule(nh, vc);
  }
  if ((needs & EVENTIO_READ) == EVENTIO_READ) {
vc->read.triggered = 0;
nh->read_ready_list.remove(vc);
read_reschedule(nh, vc);
  }
  return;
}
if (!r || r == -ECONNRESET) {
  vc->write.triggered = 0;
  write_signal_done(VC_EVENT_EOS, nh, vc);
  return;
}
vc->write.triggered = 0;
write_signal_error(nh, vc, (int)-total_written);
return;
{code}
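
One possible shape for the corrected check, written as a standalone classifier rather than as a patch to write_to_net_io. It assumes, as in the excerpt, that the write result is a negative errno value on failure. WriteSignal and classify_write_result are hypothetical names, and treating EPIPE as end of stream is the behaviour this report asks for, not something confirmed by a committed fix.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>

enum class WriteSignal { None, Retry, Eos, Error };

// Map the result of a write attempt (bytes written, 0, or -errno) to the
// kind of signal the caller would raise.
inline WriteSignal classify_write_result(std::int64_t r) {
  if (r > 0)
    return WriteSignal::None;  // progress was made, nothing to signal here
  if (r == -EAGAIN || r == -ENOTCONN)
    return WriteSignal::Retry; // socket not ready, reschedule
  if (r == 0 || r == -ECONNRESET || r == -EPIPE)
    return WriteSignal::Eos;   // peer is gone, including the EPIPE case
  return WriteSignal::Error;   // anything else is a real error
}
```

With this mapping, -EPIPE is reported as Eos while other negative values such as -EIO still map to Error.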





[jira] [Comment Edited] (TS-4488) Segmentation fault at HttpSM::tunnel_handler_ua

2016-06-12 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326419#comment-15326419
 ] 

Oknet Xu edited comment on TS-4488 at 6/12/16 12:34 PM:


I hit a similar crash; I only have the stack traces:
{code}
traffic_server: Segmentation fault (Address not mapped to object [(nil)])
traffic_server - STACK TRACE: 
/usr/bin/traffic_server(crash_logger_invoke(int, siginfo*, 
void*)+0xa2)[0x2ad121596e32]
/lib/x86_64-linux-gnu/libc.so.6(+0x321a0)[0x2ad1252711a0]
/usr/bin/traffic_server(HttpTunnel::consumer_handler(int, 
HttpTunnelConsumer*)+0xca)[0x2ad1216ab19a]
/usr/bin/traffic_server(HttpTunnel::main_handler(int, 
void*)+0x5e)[0x2ad1216ab58e]
/usr/bin/traffic_server(HttpSM::state_send_server_request_header(int, 
void*)+0xfd)[0x2ad12166ea9d]
/usr/bin/traffic_server(HttpSM::main_handler(int, void*)+0x80)[0x2ad12166cb10]
/usr/bin/traffic_server(UnixNetVConnection::mainEvent(int, 
Event*)+0x4e7)[0x2ad121808687]
/usr/bin/traffic_server(InactivityCop::check_inactivity(int, 
Event*)+0x287)[0x2ad1217fe8d7]
/usr/bin/traffic_server(EThread::process_event(Event*, 
int)+0x90)[0x2ad12182a020]
/usr/bin/traffic_server(EThread::execute()+0x69e)[0x2ad12182ac4e]
/usr/bin/traffic_server(+0x39938a)[0x2ad12182938a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2ad1255d3b50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2ad12531d30d]{code}

{code}
traffic_server: Segmentation fault (Address not mapped to object [0x3d])
traffic_server - STACK TRACE: 
/usr/bin/traffic_server(crash_logger_invoke(int, siginfo*, 
void*)+0xa2)[0x2afd2459fe32]
/lib/x86_64-linux-gnu/libc.so.6(+0x321a0)[0x2afd2827a1a0]
/usr/bin/traffic_server(HttpSM::tunnel_handler_transform_write(int, 
HttpTunnelConsumer*)+0x1a4)[0x2afd24663f94]
/usr/bin/traffic_server(HttpTunnel::consumer_handler(int, 
HttpTunnelConsumer*)+0xaf)[0x2afd246b417f]
/usr/bin/traffic_server(HttpTunnel::main_handler(int, 
void*)+0x5e)[0x2afd246b458e]
/usr/lib/trafficserver/modules/skg-spe.so(+0xc179)[0x2afd7b2a1179]
/usr/bin/traffic_server(EThread::process_event(Event*, 
int)+0x90)[0x2afd24833020]
/usr/bin/traffic_server(EThread::execute()+0x67f)[0x2afd24833c2f]
/usr/bin/traffic_server(+0x39938a)[0x2afd2483238a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2afd285dcb50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2afd2832630d]
traffic_server: using root directory '/usr'
{code}


was (Author: oknet):
I hit a similar crash; I only have the stack traces:
{code}
traffic_server: Segmentation fault (Address not mapped to object [(nil)])
traffic_server - STACK TRACE: 
/usr/bin/traffic_server(_Z19crash_logger_invokeiP7siginfoPv+0xa2)[0x2ad121596e32]
/lib/x86_64-linux-gnu/libc.so.6(+0x321a0)[0x2ad1252711a0]
/usr/bin/traffic_server(_ZN10HttpTunnel16consumer_handlerEiP18HttpTunnelConsumer+0xca)[0x2ad1216ab19a]
/usr/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0x5e)[0x2ad1216ab58e]
/usr/bin/traffic_server(_ZN6HttpSM32state_send_server_request_headerEiPv+0xfd)[0x2ad12166ea9d]
/usr/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x80)[0x2ad12166cb10]
/usr/bin/traffic_server(_ZN18UnixNetVConnection9mainEventEiP5Event+0x4e7)[0x2ad121808687]
/usr/bin/traffic_server(_ZN13InactivityCop16check_inactivityEiP5Event+0x287)[0x2ad1217fe8d7]
/usr/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x90)[0x2ad12182a020]
/usr/bin/traffic_server(_ZN7EThread7executeEv+0x69e)[0x2ad12182ac4e]
/usr/bin/traffic_server(+0x39938a)[0x2ad12182938a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2ad1255d3b50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2ad12531d30d]
{code}

{code}
traffic_server: Segmentation fault (Address not mapped to object [0x3d])
traffic_server - STACK TRACE: 
/usr/bin/traffic_server(_Z19crash_logger_invokeiP7siginfoPv+0xa2)[0x2afd2459fe32]
/lib/x86_64-linux-gnu/libc.so.6(+0x321a0)[0x2afd2827a1a0]
/usr/bin/traffic_server(_ZN6HttpSM30tunnel_handler_transform_writeEiP18HttpTunnelConsumer+0x1a4)[0x2afd24663f94]
/usr/bin/traffic_server(_ZN10HttpTunnel16consumer_handlerEiP18HttpTunnelConsumer+0xaf)[0x2afd246b417f]
/usr/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0x5e)[0x2afd246b458e]
/usr/lib/trafficserver/modules/skg-spe.so(+0xc179)[0x2afd7b2a1179]
/usr/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x90)[0x2afd24833020]
/usr/bin/traffic_server(_ZN7EThread7executeEv+0x67f)[0x2afd24833c2f]
/usr/bin/traffic_server(+0x39938a)[0x2afd2483238a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2afd285dcb50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2afd2832630d]
traffic_server: using root directory '/usr'
{code}

> Segmentation fault at HttpSM::tunnel_handler_ua
> ---
>
> Key: TS-4488
> URL: https://issues.apache.org/jira/browse/TS-4488
> Project: Traffic Server
>  Issue Type: Bug
>Reporter: Pavlo Yatsukhnenko
>Assignee: Pavlo Yatsukhnenko
>  

[jira] [Commented] (TS-4488) Segmentation fault at HttpSM::tunnel_handler_ua

2016-06-12 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326419#comment-15326419
 ] 

Oknet Xu commented on TS-4488:
--

I hit a similar crash; I only have the stack traces:
{code}
traffic_server: Segmentation fault (Address not mapped to object [(nil)])
traffic_server - STACK TRACE: 
/usr/bin/traffic_server(_Z19crash_logger_invokeiP7siginfoPv+0xa2)[0x2ad121596e32]
/lib/x86_64-linux-gnu/libc.so.6(+0x321a0)[0x2ad1252711a0]
/usr/bin/traffic_server(_ZN10HttpTunnel16consumer_handlerEiP18HttpTunnelConsumer+0xca)[0x2ad1216ab19a]
/usr/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0x5e)[0x2ad1216ab58e]
/usr/bin/traffic_server(_ZN6HttpSM32state_send_server_request_headerEiPv+0xfd)[0x2ad12166ea9d]
/usr/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x80)[0x2ad12166cb10]
/usr/bin/traffic_server(_ZN18UnixNetVConnection9mainEventEiP5Event+0x4e7)[0x2ad121808687]
/usr/bin/traffic_server(_ZN13InactivityCop16check_inactivityEiP5Event+0x287)[0x2ad1217fe8d7]
/usr/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x90)[0x2ad12182a020]
/usr/bin/traffic_server(_ZN7EThread7executeEv+0x69e)[0x2ad12182ac4e]
/usr/bin/traffic_server(+0x39938a)[0x2ad12182938a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2ad1255d3b50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2ad12531d30d]
{code}

{code}
traffic_server: Segmentation fault (Address not mapped to object [0x3d])
traffic_server - STACK TRACE: 
/usr/bin/traffic_server(_Z19crash_logger_invokeiP7siginfoPv+0xa2)[0x2afd2459fe32]
/lib/x86_64-linux-gnu/libc.so.6(+0x321a0)[0x2afd2827a1a0]
/usr/bin/traffic_server(_ZN6HttpSM30tunnel_handler_transform_writeEiP18HttpTunnelConsumer+0x1a4)[0x2afd24663f94]
/usr/bin/traffic_server(_ZN10HttpTunnel16consumer_handlerEiP18HttpTunnelConsumer+0xaf)[0x2afd246b417f]
/usr/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0x5e)[0x2afd246b458e]
/usr/lib/trafficserver/modules/skg-spe.so(+0xc179)[0x2afd7b2a1179]
/usr/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x90)[0x2afd24833020]
/usr/bin/traffic_server(_ZN7EThread7executeEv+0x67f)[0x2afd24833c2f]
/usr/bin/traffic_server(+0x39938a)[0x2afd2483238a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2afd285dcb50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2afd2832630d]
traffic_server: using root directory '/usr'
{code}

> Segmentation fault at HttpSM::tunnel_handler_ua
> ---
>
> Key: TS-4488
> URL: https://issues.apache.org/jira/browse/TS-4488
> Project: Traffic Server
>  Issue Type: Bug
>Reporter: Pavlo Yatsukhnenko
>Assignee: Pavlo Yatsukhnenko
>  Labels: Crash
> Fix For: 7.0.0
>
>
> From Github PR #674





[jira] [Created] (TS-4521) compile error on build proxy/http2/test_HPACK

2016-06-11 Thread Oknet Xu (JIRA)
Oknet Xu created TS-4521:


 Summary: compile error on build proxy/http2/test_HPACK
 Key: TS-4521
 URL: https://issues.apache.org/jira/browse/TS-4521
 Project: Traffic Server
  Issue Type: Bug
Reporter: Oknet Xu


OS: Debian 7 (wheezy)
ATS Branch: master
GCC: 4.7.2(Debian 4.7.2-5)

{code}
/usr/bin/ld: ../../proxy/hdrs/libhdrs.a(HttpCompat.o): undefined reference to 
symbol 'Tcl_NextHashEntry'
/usr/bin/ld: note: 'Tcl_NextHashEntry' is defined in DSO 
/usr/lib/libtcl8.5.so.0 so try adding it to the linker command line
/usr/lib/libtcl8.5.so.0: could not read symbols: Invalid operation
collect2: error: ld returned 1 exit status
{code}





[jira] [Commented] (TS-4483) NetAccept & SSLNetAccept Optimize

2016-05-27 Thread Oknet Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305195#comment-15305195
 ] 

Oknet Xu commented on TS-4483:
--

This optimization also fixes a bug in 
SSLNetAccept::init_accept_per_thread(bool isTransparent)

{code}
PollDescriptor *pd = get_PollDescriptor(t);
if (ep.start(pd, this, EVENTIO_READ) < 0) // ==> should be a->ep.start(pd, a, EVENTIO_READ)
  Debug("iocore_net", "error starting EventIO");
a->mutex = get_NetHandler(t)->mutex;
t->schedule_every(a, period, etype);
{code}
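
Going by the correction noted in the comment, the problem is that the per-thread object `a` is scheduled while the EventIO of `this` is the one registered. A toy version of the corrected pattern, with hypothetical ToyAccept and ToyEventIO types and the poll descriptor left out:

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct ToyAccept;

struct ToyEventIO {
  ToyAccept *registered = nullptr;
  // Stand-in for starting the EventIO; records which acceptor owns it.
  int start(ToyAccept *owner) {
    registered = owner;
    return 0;
  }
};

struct ToyAccept {
  ToyEventIO ep;
  std::vector<std::unique_ptr<ToyAccept>> clones;

  // One acceptor object per thread; each registers its own EventIO.
  void init_accept_per_thread(int n_threads) {
    for (int i = 0; i < n_threads; ++i) {
      std::unique_ptr<ToyAccept> a(new ToyAccept);
      // Register the per-thread object's ep with that object,
      // not this->ep with this.
      if (a->ep.start(a.get()) < 0) {
        // a real implementation would log the failure here
      }
      clones.push_back(std::move(a));
    }
  }
};
```

After init_accept_per_thread(3), each of the three per-thread objects has its own ep registered to itself and the original object's ep is untouched.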

> NetAccept & SSLNetAccept Optimize
> -
>
> Key: TS-4483
> URL: https://issues.apache.org/jira/browse/TS-4483
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Core
>Reporter: Oknet Xu
>Assignee: Oknet Xu
> Fix For: 7.0.0
>
>
> Replace getEtype() with the member etype.
> NetAccept has a member named 'etype', which is set by upgradeEtype before 
> NetAccept starts running.
> Thus, we can replace getEtype() with the member etype and make the SSLNetAccept 
> code clearer.
> Should we remove the getEtype() method? Nothing calls it after the patch.





[jira] [Updated] (TS-4487) Don't reschedule read depend on needs & did not check the change of lock at the return callback with wbe.

2016-05-27 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4487:
-
Summary: Don't reschedule read depend on needs & did not check the change 
of lock at the return callback with wbe.  (was: Don't reschedule read depend on 
needs)

> Don't reschedule read depend on needs & did not check the change of lock at 
> the return callback with wbe.
> -
>
> Key: TS-4487
> URL: https://issues.apache.org/jira/browse/TS-4487
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: SSL
>Reporter: Oknet Xu
>
> the code:
> {code}
> int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, 
> buf, needs);
> {code}
> At the end of write_to_net_io, 
> {code}
> if (!buf.reader()->read_avail()) {  // should check needs==0
>   write_disable(nh, vc);
>   return;
> }
> if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
>   write_reschedule(nh, vc);
> }
> if ((needs & EVENTIO_READ) == EVENTIO_READ) {
>   read_reschedule(nh, vc);
> }
> return;
> {code}
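
One reading of the `// should check needs==0` remark is that an empty buffer alone should not disable the write side while the last call still reported a pending need, for example an SSL operation waiting on a read. The sketch below encodes that reading only; EndOfWriteActions, end_of_write and the TOY_EVENTIO_* bit values are hypothetical and not taken from the source tree.

```cpp
#include <cassert>

// Stand-in bit flags; the real EVENTIO_* values depend on the platform.
constexpr int TOY_EVENTIO_READ = 1;
constexpr int TOY_EVENTIO_WRITE = 4;

struct EndOfWriteActions {
  bool disable_write;
  bool reschedule_write;
  bool reschedule_read;
};

// Decide what to do at the end of a write pass, given whether the buffer is
// drained and which directions the last write call said it still needs.
inline EndOfWriteActions end_of_write(bool buffer_empty, int needs) {
  EndOfWriteActions a{false, false, false};
  if (buffer_empty && needs == 0) {
    a.disable_write = true; // nothing left to send and nothing pending
    return a;
  }
  a.reschedule_write = (needs & TOY_EVENTIO_WRITE) == TOY_EVENTIO_WRITE;
  a.reschedule_read = (needs & TOY_EVENTIO_READ) == TOY_EVENTIO_READ;
  return a;
}
```

With an empty buffer and needs == 0 the write side is disabled as before; with an empty buffer and the read bit set, the read is rescheduled instead of being dropped.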





[jira] [Updated] (TS-4487) Don't reschedule read depend on needs

2016-05-27 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4487:
-
Description: 
the code:
{code}
int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, 
needs);
{code}

At the end of write_to_net_io, 

{code}
if (!buf.reader()->read_avail()) {  // should check needs==0
  write_disable(nh, vc);
  return;
}

if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
  write_reschedule(nh, vc);
}
if ((needs & EVENTIO_READ) == EVENTIO_READ) {
  read_reschedule(nh, vc);
}
return;
{code}

  was:
the code:
{code}
int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, 
needs);
{code}

At the end of write_to_net_io, 

{code}
if (!buf.reader()->read_avail()) {  // should check needs==0
  write_disable(nh, vc);
  return;
}

if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
  write_reschedule(nh, vc);
}
//  here r>0, don't need to check the needs.
if ((needs & EVENTIO_READ) == EVENTIO_READ) {
  read_reschedule(nh, vc);
}
return;
{code}


> Don't reschedule read depend on needs
> -
>
> Key: TS-4487
> URL: https://issues.apache.org/jira/browse/TS-4487
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: SSL
>Reporter: Oknet Xu
>
> the code:
> {code}
> int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, 
> buf, needs);
> {code}
> At the end of write_to_net_io, 
> {code}
> if (!buf.reader()->read_avail()) {  // should check needs==0
>   write_disable(nh, vc);
>   return;
> }
> if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
>   write_reschedule(nh, vc);
> }
> if ((needs & EVENTIO_READ) == EVENTIO_READ) {
>   read_reschedule(nh, vc);
> }
> return;
> {code}





[jira] [Updated] (TS-4487) Don't reschedule read depend on needs

2016-05-27 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4487:
-
Description: 
the code:
{code}
int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, 
needs);
{code}

At the end of write_to_net_io, 

{code}
if (!buf.reader()->read_avail()) {  // should check needs==0
  write_disable(nh, vc);
  return;
}

if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
  write_reschedule(nh, vc);
}
//  here r>0, don't need to check the needs.
if ((needs & EVENTIO_READ) == EVENTIO_READ) {
  read_reschedule(nh, vc);
}
return;
{code}

  was:
the code:
{code}
int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, 
needs);
{code}

In the SSLNetVConnection::load_buffer_and_write(), only set needs |= 
EVENTIO_WRITE on r>0.

At the end of write_to_net_io, 

{code}
if (!buf.reader()->read_avail()) {
  write_disable(nh, vc);
  return;
}

if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
  write_reschedule(nh, vc);
}
//  here r>0, don't need to check the needs.
if ((needs & EVENTIO_READ) == EVENTIO_READ) {
  read_reschedule(nh, vc);
}
return;
{code}


> Don't reschedule read depend on needs
> -
>
> Key: TS-4487
> URL: https://issues.apache.org/jira/browse/TS-4487
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: SSL
>Reporter: Oknet Xu
>
> the code:
> {code}
> int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, 
> buf, needs);
> {code}
> At the end of write_to_net_io, 
> {code}
> if (!buf.reader()->read_avail()) {  // should check needs==0
>   write_disable(nh, vc);
>   return;
> }
> if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
>   write_reschedule(nh, vc);
> }
> //  here r>0, don't need to check the needs.
> if ((needs & EVENTIO_READ) == EVENTIO_READ) {
>   read_reschedule(nh, vc);
> }
> return;
> {code}





[jira] [Updated] (TS-4487) Don't reschedule read depend on needs

2016-05-27 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4487:
-
Summary: Don't reschedule read depend on needs  (was: Don't need reschedule 
read depend on needs)

> Don't reschedule read depend on needs
> -
>
> Key: TS-4487
> URL: https://issues.apache.org/jira/browse/TS-4487
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: SSL
>Reporter: Oknet Xu
>
> the code:
> {code}
> int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, 
> buf, needs);
> {code}
> In the SSLNetVConnection::load_buffer_and_write(), only set needs |= 
> EVENTIO_WRITE on r>0.
> At the end of write_to_net_io, 
> {code}
> if (!buf.reader()->read_avail()) {
>   write_disable(nh, vc);
>   return;
> }
> if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
>   write_reschedule(nh, vc);
> }
> //  here r>0, don't need to check the needs.
> if ((needs & EVENTIO_READ) == EVENTIO_READ) {
>   read_reschedule(nh, vc);
> }
> return;
> {code}





[jira] [Updated] (TS-4487) Don't need reschedule read depend on needs

2016-05-27 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4487:
-
Description: 
the code:
{code}
int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, 
needs);
{code}

In the SSLNetVConnection::load_buffer_and_write(), only set needs |= 
EVENTIO_WRITE on r>0.

At the end of write_to_net_io, 

{code}
if (!buf.reader()->read_avail()) {
  write_disable(nh, vc);
  return;
}

if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
  write_reschedule(nh, vc);
}
//  here r>0, don't need to check the needs.
if ((needs & EVENTIO_READ) == EVENTIO_READ) {
  read_reschedule(nh, vc);
}
return;
{code}

  was:
the code:
{code}
int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, 
needs);
{code}

In the SSLNetVConnection::load_buffer_and_write(), only set needs |= 
EVENTIO_WRITE on r>0.

At the end of write_to_net_io, 

{code}
if (!buf.reader()->read_avail()) {
  write_disable(nh, vc);
  return;  <-- return from here, but don't reschedule read
}

if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
  write_reschedule(nh, vc);
}
if ((needs & EVENTIO_READ) == EVENTIO_READ) {
  read_reschedule(nh, vc);
}
return;
{code}


> Don't need reschedule read depend on needs
> --
>
> Key: TS-4487
> URL: https://issues.apache.org/jira/browse/TS-4487
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: SSL
>Reporter: Oknet Xu
>
> the code:
> {code}
> int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, 
> buf, needs);
> {code}
> In the SSLNetVConnection::load_buffer_and_write(), only set needs |= 
> EVENTIO_WRITE on r>0.
> At the end of write_to_net_io, 
> {code}
> if (!buf.reader()->read_avail()) {
>   write_disable(nh, vc);
>   return;
> }
> if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
>   write_reschedule(nh, vc);
> }
> //  here r>0, don't need to check the needs.
> if ((needs & EVENTIO_READ) == EVENTIO_READ) {
>   read_reschedule(nh, vc);
> }
> return;
> {code}





[jira] [Updated] (TS-4487) Don't need reschedule read depend on needs

2016-05-27 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4487:
-
Issue Type: Improvement  (was: Bug)

> Don't need reschedule read depend on needs
> --
>
> Key: TS-4487
> URL: https://issues.apache.org/jira/browse/TS-4487
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: SSL
>Reporter: Oknet Xu
>
> the code:
> {code}
> int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, 
> buf, needs);
> {code}
> In the SSLNetVConnection::load_buffer_and_write(), only set needs |= 
> EVENTIO_WRITE on r>0.
> At the end of write_to_net_io, 
> {code}
> if (!buf.reader()->read_avail()) {
>   write_disable(nh, vc);
>   return;  <-- return from here, but don't reschedule read
> }
> if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
>   write_reschedule(nh, vc);
> }
> if ((needs & EVENTIO_READ) == EVENTIO_READ) {
>   read_reschedule(nh, vc);
> }
> return;
> {code}





[jira] [Updated] (TS-4487) Don't reschedule read depend on needs while write buffer empty

2016-05-27 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4487:
-
Description: 
the code:
{code}
int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, buf, 
needs);
{code}

In the SSLNetVConnection::load_buffer_and_write(), only set needs |= 
EVENTIO_WRITE on r>0.

At the end of write_to_net_io, 

{code}
if (!buf.reader()->read_avail()) {
  write_disable(nh, vc);
  return;  <-- return from here, but don't reschedule read
}

if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
  write_reschedule(nh, vc);
}
if ((needs & EVENTIO_READ) == EVENTIO_READ) {
  read_reschedule(nh, vc);
}
return;
{code}

  was:
At the end of write_to_net_io
{code}
if (!buf.reader()->read_avail()) {
  write_disable(nh, vc);
  return;  <-- return from here, but don't reschedule read
}

if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
  write_reschedule(nh, vc);
}
if ((needs & EVENTIO_READ) == EVENTIO_READ) {
  read_reschedule(nh, vc);
}
return;
{code}


> Don't reschedule read depend on needs while write buffer empty
> --
>
> Key: TS-4487
> URL: https://issues.apache.org/jira/browse/TS-4487
> Project: Traffic Server
>  Issue Type: Bug
>  Components: SSL
>Reporter: Oknet Xu
>
> the code:
> {code}
> int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, 
> buf, needs);
> {code}
> In the SSLNetVConnection::load_buffer_and_write(), only set needs |= 
> EVENTIO_WRITE on r>0.
> At the end of write_to_net_io, 
> {code}
> if (!buf.reader()->read_avail()) {
>   write_disable(nh, vc);
>   return;  <-- return from here, but don't reschedule read
> }
> if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
>   write_reschedule(nh, vc);
> }
> if ((needs & EVENTIO_READ) == EVENTIO_READ) {
>   read_reschedule(nh, vc);
> }
> return;
> {code}





[jira] [Updated] (TS-4487) Don't need reschedule read depend on needs

2016-05-27 Thread Oknet Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oknet Xu updated TS-4487:
-
Summary: Don't need reschedule read depend on needs  (was: Don't reschedule 
read depend on needs while write buffer empty)

> Don't need reschedule read depend on needs
> --
>
> Key: TS-4487
> URL: https://issues.apache.org/jira/browse/TS-4487
> Project: Traffic Server
>  Issue Type: Bug
>  Components: SSL
>Reporter: Oknet Xu
>
> the code:
> {code}
> int64_t r = vc->load_buffer_and_write(towrite, wattempted, total_written, 
> buf, needs);
> {code}
> In the SSLNetVConnection::load_buffer_and_write(), only set needs |= 
> EVENTIO_WRITE on r>0.
> At the end of write_to_net_io, 
> {code}
> if (!buf.reader()->read_avail()) {
>   write_disable(nh, vc);
>   return;  <-- return from here, but don't reschedule read
> }
> if ((needs & EVENTIO_WRITE) == EVENTIO_WRITE) {
>   write_reschedule(nh, vc);
> }
> if ((needs & EVENTIO_READ) == EVENTIO_READ) {
>   read_reschedule(nh, vc);
> }
> return;
> {code}




