[jira] [Assigned] (TS-4991) jtest should handle Range request

2016-10-20 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming reassigned TS-4991:
-

Assignee: song

[~jasondmee], please take care of this request. Thanks.

> jtest should handle Range request
> -
>
> Key: TS-4991
> URL: https://issues.apache.org/jira/browse/TS-4991
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: HTTP, Tests, Tools
>Reporter: Zhao Yongming
>Assignee: song
>
> jtest is not able to generate or handle Range requests; we should make it do so.
> I'd like to see the simple "Range: bytes=100-200/1000" case work first; other 
> Range syntax or even multiple ranges can be considered later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TS-4991) jtest should handle Range request

2016-10-20 Thread Zhao Yongming (JIRA)
Zhao Yongming created TS-4991:
-

 Summary: jtest should handle Range request
 Key: TS-4991
 URL: https://issues.apache.org/jira/browse/TS-4991
 Project: Traffic Server
  Issue Type: Improvement
  Components: HTTP, Tests, Tools
Reporter: Zhao Yongming


jtest is not able to generate or handle Range requests; we should make it do so.

I'd like to see the simple "Range: bytes=100-200/1000" case work first; other 
Range syntax or even multiple ranges can be considered later.
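
For reference, a minimal sketch of the kind of exchange jtest would have to generate and verify (the path, host, and byte positions are only illustrative):

{code}
GET /range-test HTTP/1.1
Host: jtest.example.com
Range: bytes=100-200

HTTP/1.1 206 Partial Content
Content-Range: bytes 100-200/1000
Content-Length: 101
{code}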





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TS-2482) Problems with SOCKS

2016-08-26 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-2482:
--
Assignee: Oknet Xu  (was: weijin)

> Problems with SOCKS
> ---
>
> Key: TS-2482
> URL: https://issues.apache.org/jira/browse/TS-2482
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Reporter: Radim Kolar
>Assignee: Oknet Xu
> Fix For: sometime
>
>
> There are several problems with using SOCKS. I am interested in the case where TF 
> is the SOCKS client: the client sends an HTTP request and TF uses a SOCKS server to 
> make the connection to the internet.
> a/ - not documented enough in the default configs
> From the default config comments it seems that for running 
> TF 4.1.2 as a SOCKS client it is sufficient to add one line to socks.config:
> dest_ip=0.0.0.0-255.255.255.255 parent="10.0.0.7:9050"
> but the SOCKS proxy is not used. If I run tcpdump sniffing packets, TF never 
> tries to connect to that SOCKS server.
> From the source code - 
> https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc it 
> looks like "proxy.config.socks.socks_needed" needs to be set to activate 
> SOCKS support. This should be documented in both sample files: socks.config 
> and records.config.
> b/
> After enabling SOCKS, I am hit by this assert:
> Assertion failed: (ats_is_ip4(target_addr)), function init, file Socks.cc, 
> line 65.
> I run on a dual-stack system (IPv4, IPv6). 
> This code is setting the default destination for the SOCKS request? Can you not 
> use just 127.0.0.1 in the case where the client is connected over IPv6?
> https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc#L66
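
A minimal sketch putting the two settings described above together (the parent address is the reporter's example):

{code}
# records.config: SOCKS client support stays inactive unless this is enabled
CONFIG proxy.config.socks.socks_needed INT 1

# socks.config: send all destinations through the SOCKS parent
dest_ip=0.0.0.0-255.255.255.255 parent="10.0.0.7:9050"
{code}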



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-4396) Off-by-one error in max redirects with redirection enabled

2016-07-02 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360120#comment-15360120
 ] 

Zhao Yongming commented on TS-4396:
---

proxy.config.http.number_of_redirections = 1 does NOT work as expected; let us 
fix that first.

> Off-by-one error in max redirects with redirection enabled
> --
>
> Key: TS-4396
> URL: https://issues.apache.org/jira/browse/TS-4396
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core, Network
>Reporter: Felix Buenemann
>Assignee: Zhao Yongming
> Fix For: 7.0.0
>
>
> There is a problem in the current stable version 6.1.1 where the setting 
> proxy.config.http.number_of_redirections = 1 is incorrectly checked when 
> following origin redirects by setting proxy.config.http.redirection_enabled = 
> 1.
> If the requested URL is not already cached, ATS returns the redirect response 
> to the client instead of storing the target into the cache and returning it 
> to the client.
> The problem can be fixed by using proxy.config.http.number_of_redirections = 
> 2, but we are only following one redirect, so this is wrong.
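
In records.config terms, the report boils down to the following (setting names and values as described above):

{code}
# intent: follow a single origin redirect internally
CONFIG proxy.config.http.redirection_enabled INT 1
CONFIG proxy.config.http.number_of_redirections INT 1

# workaround for the off-by-one in 6.1.1: configure one more than you want followed
CONFIG proxy.config.http.number_of_redirections INT 2
{code}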



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (TS-4396) Off-by-one error in max redirects with redirection enabled

2016-07-02 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming reassigned TS-4396:
-

Assignee: Zhao Yongming

> Off-by-one error in max redirects with redirection enabled
> --
>
> Key: TS-4396
> URL: https://issues.apache.org/jira/browse/TS-4396
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core, Network
>Reporter: Felix Buenemann
>Assignee: Zhao Yongming
> Fix For: 7.0.0
>
>
> There is a problem in the current stable version 6.1.1 where the setting 
> proxy.config.http.number_of_redirections = 1 is incorrectly checked when 
> following origin redirects by setting proxy.config.http.redirection_enabled = 
> 1.
> If the requested URL is not already cached, ATS returns the redirect response 
> to the client instead of storing the target into the cache and returning it 
> to the client.
> The problem can be fixed by using proxy.config.http.number_of_redirections = 
> 2, but we are only following one redirect, so this is wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TS-4368) Segmentation fault

2016-04-21 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-4368:
--
Component/s: (was: Logging)

> Segmentation fault
> --
>
> Key: TS-4368
> URL: https://issues.apache.org/jira/browse/TS-4368
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Clustering
>Affects Versions: 6.1.2
>Reporter: Stef Fen
>
> We have a test trafficserver cluster of 2 nodes where the first node has 
> segfaults and the other doesn't.
> We are using this source 
> https://github.com/researchgate/trafficserver/tree/6.1.x
> which creates these packages (version 6.1.2)
> https://launchpad.net/~researchgate/+archive/ubuntu/trafficserver
> {code}
> [Apr 20 12:47:52.434] {0x2b72121ca600} ERROR: wrote crash log to 
> /var/log/trafficserver/crash-2016-04-20-124752.log
> traffic_server: Segmentation fault (Address not mapped to object [0x8050])
> traffic_server - STACK TRACE:
> /usr/bin/traffic_server(crash_logger_invoke(int, siginfo_t*, 
> void*)+0x97)[0x2ac6b8d676d7]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x2ac6bafdc340]
> /usr/bin/traffic_server(ink_aio_read(AIOCallback*, int)+0x36)[0x2ac6b8fe2e46]
> /usr/bin/traffic_server(CacheVC::handleRead(int, 
> Event*)+0x3a1)[0x2ac6b8f9d131]
> /usr/bin/traffic_server(Cache::open_read(Continuation*, ats::CryptoHash 
> const*, HTTPHdr*, CacheLookupHttpConfig*, CacheFragType, char const*, 
> int)+0x61f)[0x2ac6b8fc056f]
> /usr/bin/traffic_server(cache_op_ClusterFunction(ClusterHandler*, void*, 
> int)+0x94c)[0x2ac6b8f8fefc]
> /usr/bin/traffic_server(ClusterHandler::process_large_control_msgs()+0xf4)[0x2ac6b8f6dc84]
> /usr/bin/traffic_server(ClusterHandler::update_channels_read()+0x9b)[0x2ac6b8f7099b]
> /usr/bin/traffic_server(ClusterHandler::process_read(long)+0xae)[0x2ac6b8f7471e]
> /usr/bin/traffic_server(ClusterHandler::mainClusterEvent(int, 
> Event*)+0x158)[0x2ac6b8f75048]
> /usr/bin/traffic_server(ClusterState::doIO_read_event(int, 
> void*)+0x160)[0x2ac6b8f78d50]
> /usr/bin/traffic_server(+0x37e4c7)[0x2ac6b90114c7]
> /usr/bin/traffic_server(NetHandler::mainNetEvent(int, 
> Event*)+0x218)[0x2ac6b90005e8]
> /usr/bin/traffic_server(EThread::execute()+0xa82)[0x2ac6b9033b82]
> /usr/bin/traffic_server(+0x39f6ca)[0x2ac6b90326ca]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x2ac6bafd4182]
> /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2ac6bbd0847d]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TS-4368) Segmentation fault

2016-04-21 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-4368:
--
Description: 
We have a test trafficserver cluster of 2 nodes where the first node has 
segfaults and the other doesn't.

We are using this source 
https://github.com/researchgate/trafficserver/tree/6.1.x
which creates these packages (version 6.1.2)
https://launchpad.net/~researchgate/+archive/ubuntu/trafficserver

{code}
[Apr 20 12:47:52.434] {0x2b72121ca600} ERROR: wrote crash log to 
/var/log/trafficserver/crash-2016-04-20-124752.log
traffic_server: Segmentation fault (Address not mapped to object [0x8050])
traffic_server - STACK TRACE:
/usr/bin/traffic_server(crash_logger_invoke(int, siginfo_t*, 
void*)+0x97)[0x2ac6b8d676d7]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x2ac6bafdc340]
/usr/bin/traffic_server(ink_aio_read(AIOCallback*, int)+0x36)[0x2ac6b8fe2e46]
/usr/bin/traffic_server(CacheVC::handleRead(int, Event*)+0x3a1)[0x2ac6b8f9d131]
/usr/bin/traffic_server(Cache::open_read(Continuation*, ats::CryptoHash const*, 
HTTPHdr*, CacheLookupHttpConfig*, CacheFragType, char const*, 
int)+0x61f)[0x2ac6b8fc056f]
/usr/bin/traffic_server(cache_op_ClusterFunction(ClusterHandler*, void*, 
int)+0x94c)[0x2ac6b8f8fefc]
/usr/bin/traffic_server(ClusterHandler::process_large_control_msgs()+0xf4)[0x2ac6b8f6dc84]
/usr/bin/traffic_server(ClusterHandler::update_channels_read()+0x9b)[0x2ac6b8f7099b]
/usr/bin/traffic_server(ClusterHandler::process_read(long)+0xae)[0x2ac6b8f7471e]
/usr/bin/traffic_server(ClusterHandler::mainClusterEvent(int, 
Event*)+0x158)[0x2ac6b8f75048]
/usr/bin/traffic_server(ClusterState::doIO_read_event(int, 
void*)+0x160)[0x2ac6b8f78d50]
/usr/bin/traffic_server(+0x37e4c7)[0x2ac6b90114c7]
/usr/bin/traffic_server(NetHandler::mainNetEvent(int, 
Event*)+0x218)[0x2ac6b90005e8]
/usr/bin/traffic_server(EThread::execute()+0xa82)[0x2ac6b9033b82]
/usr/bin/traffic_server(+0x39f6ca)[0x2ac6b90326ca]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x2ac6bafd4182]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2ac6bbd0847d]
{code}


  was:
We have a test trafficserver cluster of 2 nodes where the first node has 
segfaults and the other doesn't.

We are using this source 
https://github.com/researchgate/trafficserver/tree/6.1.x
which creates these packages (version 6.1.2)
https://launchpad.net/~researchgate/+archive/ubuntu/trafficserver

{code}
[Apr 20 12:47:52.434] {0x2b72121ca600} ERROR: wrote crash log to 
/var/log/trafficserver/crash-2016-04-20-124752.log
traffic_server: Segmentation fault (Address not mapped to object [0x8050])
traffic_server - STACK TRACE:
/usr/bin/traffic_server(_Z19crash_logger_invokeiP9siginfo_tPv+0x97)[0x2ac6b8d676d7]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x2ac6bafdc340]
/usr/bin/traffic_server(_Z12ink_aio_readP11AIOCallbacki+0x36)[0x2ac6b8fe2e46]
/usr/bin/traffic_server(_ZN7CacheVC10handleReadEiP5Event+0x3a1)[0x2ac6b8f9d131]
/usr/bin/traffic_server(_ZN5Cache9open_readEP12ContinuationPKN3ats10CryptoHashEP7HTTPHdrP21CacheLookupHttpConfig13CacheFragTypePKci+0x61f)[0x2ac6b8fc056f]
/usr/bin/traffic_server(_Z24cache_op_ClusterFunctionP14ClusterHandlerPvi+0x94c)[0x2ac6b8f8fefc]
/usr/bin/traffic_server(_ZN14ClusterHandler26process_large_control_msgsEv+0xf4)[0x2ac6b8f6dc84]
/usr/bin/traffic_server(_ZN14ClusterHandler20update_channels_readEv+0x9b)[0x2ac6b8f7099b]
/usr/bin/traffic_server(_ZN14ClusterHandler12process_readEl+0xae)[0x2ac6b8f7471e]
/usr/bin/traffic_server(_ZN14ClusterHandler16mainClusterEventEiP5Event+0x158)[0x2ac6b8f75048]
/usr/bin/traffic_server(_ZN12ClusterState15doIO_read_eventEiPv+0x160)[0x2ac6b8f78d50]
/usr/bin/traffic_server(+0x37e4c7)[0x2ac6b90114c7]
/usr/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x218)[0x2ac6b90005e8]
/usr/bin/traffic_server(_ZN7EThread7executeEv+0xa82)[0x2ac6b9033b82]
/usr/bin/traffic_server(+0x39f6ca)[0x2ac6b90326ca]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x2ac6bafd4182]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2ac6bbd0847d]
{code}



> Segmentation fault
> --
>
> Key: TS-4368
> URL: https://issues.apache.org/jira/browse/TS-4368
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Clustering
>Affects Versions: 6.1.2
>Reporter: Stef Fen
>
> We have a test trafficserver cluster of 2 nodes where the first node has 
> segfaults and the other doesn't.
> We are using this source 
> https://github.com/researchgate/trafficserver/tree/6.1.x
> which creates these packages (version 6.1.2)
> https://launchpad.net/~researchgate/+archive/ubuntu/trafficserver
> {code}
> [Apr 20 12:47:52.434] {0x2b72121ca600} ERROR: wrote crash log to 
> /var/log/trafficserver/crash-2016-04-20-124752.log
> traffic_server: Segmentation fault (Address not mapped to object [0x8050])
> traffic_server - STACK TRACE:
> /usr/bin/traffic_server(crash_logger_invoke(int, 

[jira] [Updated] (TS-4368) Segmentation fault

2016-04-21 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-4368:
--
Affects Version/s: 6.1.2
  Component/s: Logging
   Clustering

> Segmentation fault
> --
>
> Key: TS-4368
> URL: https://issues.apache.org/jira/browse/TS-4368
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Clustering, Logging
>Affects Versions: 6.1.2
>Reporter: Stef Fen
>
> We have a test trafficserver cluster of 2 nodes where the first node has 
> segfaults and the other doesn't.
> We are using this source 
> https://github.com/researchgate/trafficserver/tree/6.1.x
> which creates these packages (version 6.1.2)
> https://launchpad.net/~researchgate/+archive/ubuntu/trafficserver
> {code}
> [Apr 20 12:47:52.434] {0x2b72121ca600} ERROR: wrote crash log to 
> /var/log/trafficserver/crash-2016-04-20-124752.log
> traffic_server: Segmentation fault (Address not mapped to object [0x8050])
> traffic_server - STACK TRACE:
> /usr/bin/traffic_server(_Z19crash_logger_invokeiP9siginfo_tPv+0x97)[0x2ac6b8d676d7]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x2ac6bafdc340]
> /usr/bin/traffic_server(_Z12ink_aio_readP11AIOCallbacki+0x36)[0x2ac6b8fe2e46]
> /usr/bin/traffic_server(_ZN7CacheVC10handleReadEiP5Event+0x3a1)[0x2ac6b8f9d131]
> /usr/bin/traffic_server(_ZN5Cache9open_readEP12ContinuationPKN3ats10CryptoHashEP7HTTPHdrP21CacheLookupHttpConfig13CacheFragTypePKci+0x61f)[0x2ac6b8fc056f]
> /usr/bin/traffic_server(_Z24cache_op_ClusterFunctionP14ClusterHandlerPvi+0x94c)[0x2ac6b8f8fefc]
> /usr/bin/traffic_server(_ZN14ClusterHandler26process_large_control_msgsEv+0xf4)[0x2ac6b8f6dc84]
> /usr/bin/traffic_server(_ZN14ClusterHandler20update_channels_readEv+0x9b)[0x2ac6b8f7099b]
> /usr/bin/traffic_server(_ZN14ClusterHandler12process_readEl+0xae)[0x2ac6b8f7471e]
> /usr/bin/traffic_server(_ZN14ClusterHandler16mainClusterEventEiP5Event+0x158)[0x2ac6b8f75048]
> /usr/bin/traffic_server(_ZN12ClusterState15doIO_read_eventEiPv+0x160)[0x2ac6b8f78d50]
> /usr/bin/traffic_server(+0x37e4c7)[0x2ac6b90114c7]
> /usr/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x218)[0x2ac6b90005e8]
> /usr/bin/traffic_server(_ZN7EThread7executeEv+0xa82)[0x2ac6b9033b82]
> /usr/bin/traffic_server(+0x39f6ca)[0x2ac6b90326ca]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x2ac6bafd4182]
> /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2ac6bbd0847d]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TS-4156) remove the traffic_sac, stand alone log collation server

2016-04-21 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-4156:
--
Fix Version/s: (was: sometime)
   7.0.0

> remove the traffic_sac, stand alone log collation server
> 
>
> Key: TS-4156
> URL: https://issues.apache.org/jira/browse/TS-4156
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Logging
>Reporter: Zhao Yongming
>Assignee: Zhao Yongming
> Fix For: 7.0.0
>
>
> The stand-alone collation server acts as a dedicated log server for ATS. It is a 
> dedicated log product from back in the Inktomi age, and we don't need it, since 
> these functions are built into the freely distributed traffic_server binary.
> It is time to nuke it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-4156) remove the traffic_sac, stand alone log collation server

2016-01-31 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15125402#comment-15125402
 ] 

Zhao Yongming commented on TS-4156:
---

The log collation works as follows:
1. Dedicated log collation server: no HTTP cache (or other) functions are active, but 
it is still a traffic_server with full functions installed; we just don't send any 
requests to this traffic_server.
   Someone with very high traffic may use this mode, to keep the collation 
server out of the HTTP cache service.

2. Mixed with a cache server: both the cache and the log collation server are active, 
and we log for the other hosts (the collation clients).
  Most users may choose this mode; it helps you collect all the logs 
in one single place, which is easy to check and back up.

3. traffic_sac stand-alone log server: no server functions, just the log collation 
server.
  This is the duplicated binary.

  By design, log collation helps you simplify logging by storing 
all the logs in one single place, one single file for the whole site, with just 
one timeline. And the log collation mode 'proxy.local.log.collation_mode' is a 
LOCAL directive in records.config, which makes it possible to activate a single 
host as the collation server while the others act as collation clients, while still 
keeping the cluster management of config files.

So I think a traffic_server with log collation mode as client or server is 
simply a must-have built-in function if we want to keep the log collation feature, 
and keeping a completely separate dedicated log collation server binary only brings 
more code complexity.
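
A minimal records.config sketch of modes 1 and 2 above (the port, secret, and host name are example values; the mode numbers follow the collation_mode documentation):

{code}
# on the host acting as the log collation server
LOCAL  proxy.local.log.collation_mode INT 1
CONFIG proxy.config.log.collation_port INT 8085
CONFIG proxy.config.log.collation_secret STRING foobar

# on every collation client
LOCAL  proxy.local.log.collation_mode INT 2
CONFIG proxy.config.log.collation_host STRING logger.example.com
CONFIG proxy.config.log.collation_port INT 8085
CONFIG proxy.config.log.collation_secret STRING foobar
{code}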
 

> remove the traffic_sac, stand alone log collation server
> 
>
> Key: TS-4156
> URL: https://issues.apache.org/jira/browse/TS-4156
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Logging
>Reporter: Zhao Yongming
>Assignee: Zhao Yongming
> Fix For: sometime
>
>
> The stand-alone collation server acts as a dedicated log server for ATS. It is a 
> dedicated log product from back in the Inktomi age, and we don't need it, since 
> these functions are built into the freely distributed traffic_server binary.
> It is time to nuke it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TS-4156) remove the traffic_sac, stand alone log collation server

2016-01-31 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15125402#comment-15125402
 ] 

Zhao Yongming edited comment on TS-4156 at 1/31/16 4:30 PM:


The log collation works as follows:
1. Dedicated log collation server: no HTTP cache (or other) functions are active, but 
it is still a traffic_server with full functions installed; we just don't send any 
requests to this traffic_server.
   Someone with very high traffic may use this mode, to keep the collation 
server out of the HTTP cache service.

2. Mixed with a cache server: both the cache and the log collation server are active, 
and we log for the other hosts (the collation clients).
  Most users may choose this mode; it helps you collect all the logs 
in one single place, which is easy to check and back up.

3. traffic_sac stand-alone log server: no server functions, just the log collation 
server.
  This is the duplicated binary.

  By design, log collation helps you simplify logging by storing 
all the logs in one single place, one single file for the whole site, with just 
one timeline. And the log collation mode 'proxy.local.log.collation_mode' is a 
LOCAL directive in records.config, which makes it possible to activate a single 
host as the collation server while the others act as collation clients, while still 
keeping the cluster management of config files.

So I think a traffic_server with log collation mode as client or server is 
simply a must-have built-in function if we want to keep the log collation feature, 
and keeping a completely separate dedicated log collation server binary only brings 
more code complexity.
 


was (Author: zym):
the log collation works as:
1, dedicated log collation server: no http cache (or others) function active, 
bug still a traffic_server, with full functions installed, that way we just 
don't put any request on this traffic_server.
   someone with very high traffic may use this mode, just don't keep the 
collation server out of the http cache service.

2, mixed with cache server: both cache and logging collation server in active, 
we log for other hosts(collation clients).
  most of the users may choose this mode, it will help you collect all the logs 
into one single place, and easy for check or backup.

3, traffic_sac stand alone log server: no server function, just log collation 
server.
  this is the duplicated binary.

  by design, the log collation is going to help you simple the logging by store 
all the logs into one single place, one single file the whole site, with just 
one timeline. and the log collation mode 'poxy.local.log.collation_mode' is a 
LOCAL directive in records.config, that make it possible to active a single 
host as collation server while others as collation server, while still got the 
cluster management of config files.

so, I think that traffic_server with log collation mode in client or server is 
just a must builtin function if we want to keep the log collation feature, and 
keep a completely dedicated log collation server may bring more code complex.
 

> remove the traffic_sac, stand alone log collation server
> 
>
> Key: TS-4156
> URL: https://issues.apache.org/jira/browse/TS-4156
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Logging
>Reporter: Zhao Yongming
>Assignee: Zhao Yongming
> Fix For: sometime
>
>
> The stand-alone collation server acts as a dedicated log server for ATS. It is a 
> dedicated log product from back in the Inktomi age, and we don't need it, since 
> these functions are built into the freely distributed traffic_server binary.
> It is time to nuke it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-4156) remove the traffic_sac, stand alone log collation server

2016-01-31 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15125409#comment-15125409
 ] 

Zhao Yongming commented on TS-4156:
---

Orphaned logs are a point where we can improve; I think we can build some tools for 
that. And because the orphaned logs sit outside the mainline log file, it is 
hard to archive a single log file for that period, even if we collect 
all the orphaned logs onto one single box.

Orphaned log files happen when the log server is down or having traffic problems; 
we have seen very few orphaned logs since we improved the log collation server 
performance.

> remove the traffic_sac, stand alone log collation server
> 
>
> Key: TS-4156
> URL: https://issues.apache.org/jira/browse/TS-4156
> Project: Traffic Server
>  Issue Type: Improvement
>  Components: Logging
>Reporter: Zhao Yongming
>Assignee: Zhao Yongming
> Fix For: sometime
>
>
> The stand-alone collation server acts as a dedicated log server for ATS. It is a 
> dedicated log product from back in the Inktomi age, and we don't need it, since 
> these functions are built into the freely distributed traffic_server binary.
> It is time to nuke it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TS-4156) remove the traffic_sac, stand alone log collation server

2016-01-27 Thread Zhao Yongming (JIRA)
Zhao Yongming created TS-4156:
-

 Summary: remove the traffic_sac, stand alone log collation server
 Key: TS-4156
 URL: https://issues.apache.org/jira/browse/TS-4156
 Project: Traffic Server
  Issue Type: Improvement
  Components: Logging
Reporter: Zhao Yongming


The stand-alone collation server acts as a dedicated log server for ATS. It is a 
dedicated log product from back in the Inktomi age, and we don't need it, since 
these functions are built into the freely distributed traffic_server binary.

It is time to nuke it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TS-4056) MemLeak: ~NetAccept() do not free alloc_cache(vc)

2015-12-07 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-4056:
--
Affects Version/s: 6.1.0

> MemLeak: ~NetAccept() do not free alloc_cache(vc)
> -
>
> Key: TS-4056
> URL: https://issues.apache.org/jira/browse/TS-4056
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 6.1.0
>Reporter: Oknet Xu
>
> NetAccept::alloc_cache is a void pointer used in net_accept().
> The alloc_cache is not released after the NetAccept is cancelled.
> I have looked through all the code and believe the "alloc_cache" is a bad idea here.
> I created a pull request on GitHub: 
> https://github.com/apache/trafficserver/pull/366
> It also adds a condition check for vc==NULL after allocate_vc().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-4059) Default value for proxy.config.bin_path does not use value from config.layout

2015-12-07 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046143#comment-15046143
 ] 

Zhao Yongming commented on TS-4059:
---

I think Craig Forbes wants the build & installation to honor 
'--bindir=DIR (user executables [EPREFIX/bin])', which defaults to 
'EPREFIX/bin' if not specified. You may submit a patch if it does not 
work as you wish.

IMO, all the *_path config options should be removed, as those are binary-release 
options; now that we are open source with the whole layout configurable, we 
should remove them from records.config (or hardcode them to the configure-specified 
directories).

Patches welcome.

FYI

> Default value for proxy.config.bin_path does not use value from config.layout
> -
>
> Key: TS-4059
> URL: https://issues.apache.org/jira/browse/TS-4059
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Configuration
>Reporter: Craig Forbes
>
> The default value for proxy.config.bin_path defined in RecordsConfig.cc is 
> hard coded to "bin".
> The value should be TS_BUILD_BINDIR so the value specified at configure time 
> is used.
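
A short illustration of the mismatch described above (the prefix and bindir values are made up for the example):

{code}
# configure with a non-default bindir ...
./configure --prefix=/opt/ats --bindir=/opt/ats/sbin

# ... TS_BUILD_BINDIR then carries /opt/ats/sbin, but RecordsConfig.cc still
# defaults proxy.config.bin_path to the hard-coded "bin", i.e. /opt/ats/bin.
{code}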



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-4058) Logging doesn't work when TS is compiled and run w/ --with-user

2015-12-07 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-4058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046156#comment-15046156
 ] 

Zhao Yongming commented on TS-4058:
---

Good catch. traffic_cop is designed to run as root, and --with-user=danielxu 
specifies that traffic_server runs as danielxu; that is the current setup. 
Currently an unprivileged user is not expected to run traffic_cop; in the past it 
even failed if you wanted to 'make install' as non-root, haha. In most cases we 
would advise running traffic_server directly for small testing.

It would be nice if you want to make traffic_cop runnable by an unprivileged user.
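
A rough sketch of the setup under discussion (paths and the user name follow the report; this is an assumed workflow, not a documented recipe):

{code}
# build configured for an unprivileged runtime user, as in the report
./configure --with-user=danielxu && make && sudo make install

# traffic_cop is designed to be started as root; it supervises traffic_manager/traffic_server
sudo /usr/bin/traffic_cop

# for small tests without root, run the server binary directly as that user
/usr/bin/traffic_server
{code}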

> Logging doesn't work when TS is compiled and run w/ --with-user
> ---
>
> Key: TS-4058
> URL: https://issues.apache.org/jira/browse/TS-4058
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Logging
>Reporter: Daniel Xu
>Assignee: Daniel Xu
>
> ie. we run this _without_ sudo. 
> traffic_cop output seems to point to permission errors that occur within 
> traffic_manager



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TS-3510) header_rewrite is blocking building on raspberry pi

2015-04-08 Thread Zhao Yongming (JIRA)
Zhao Yongming created TS-3510:
-

 Summary: header_rewrite is blocking building on raspberry pi 
 Key: TS-3510
 URL: https://issues.apache.org/jira/browse/TS-3510
 Project: Traffic Server
  Issue Type: Bug
  Components: Build, Plugins
Reporter: Zhao Yongming


ARM support is so good that we just have the Raspberry Pi failing to build 
header_rewrite.

{code}
pi@raspberrypi ~/trafficserver/plugins/header_rewrite $ make -j 2
  CXX      conditions.lo
  CXX      header_rewrite.lo
{standard input}: Assembler messages:
{standard input}:1221: Error: selected processor does not support ARM mode `dmb'
  CXX      lulu.lo
  CXX      matcher.lo
  CXX      operator.lo
Makefile:689: recipe for target 'conditions.lo' failed
make: *** [conditions.lo] Error 1
make: *** Waiting for unfinished jobs
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TS-3472) SNI proxy alike feature for TS

2015-03-30 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-3472:
--
Fix Version/s: sometime

 SNI proxy alike feature for TS
 --

 Key: TS-3472
 URL: https://issues.apache.org/jira/browse/TS-3472
 Project: Traffic Server
  Issue Type: New Feature
  Components: SSL
Reporter: Zhao Yongming
 Fix For: sometime


 When doing a forward-proxy-only setup, sniproxy 
 (https://github.com/dlundquist/sniproxy.git) is a very tiny but cool effort to 
 set up a TLS-layer proxy with SNI, very good for some dirty tasks.
 In ATS there is already very good support in all the basic components, so 
 adding an SNI blind proxy should be a very good feature, with maybe only tiny 
 changes.
 SNI in TLS will extend the proxy (without caching) to all TLS-based services, 
 such as mail, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TS-3472) SNI proxy alike feature for TS

2015-03-30 Thread Zhao Yongming (JIRA)
Zhao Yongming created TS-3472:
-

 Summary: SNI proxy alike feature for TS
 Key: TS-3472
 URL: https://issues.apache.org/jira/browse/TS-3472
 Project: Traffic Server
  Issue Type: New Feature
  Components: SSL
Reporter: Zhao Yongming


When doing a forward-proxy-only setup, sniproxy 
(https://github.com/dlundquist/sniproxy.git) is a very tiny but cool effort to 
set up a TLS-layer proxy with SNI, very good for some dirty tasks.

In ATS there is already very good support in all the basic components, so adding 
an SNI blind proxy should be a very good feature, with maybe only tiny changes.

SNI in TLS will extend the proxy (without caching) to all TLS-based services, such 
as mail, etc.
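
A purely hypothetical illustration of the kind of rule this request envisions (this is not existing ATS configuration syntax): route a TLS connection by its SNI name, without terminating TLS and without caching.

{code}
# hypothetical SNI blind-proxy rules, in the spirit of sniproxy
sni=mail.example.com  action=tunnel  to=mail-backend.example.com:465
sni=*.example.org     action=tunnel  to=origin.example.org:443
{code}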



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-2482) Problems with SOCKS

2015-03-30 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386809#comment-14386809
 ] 

Zhao Yongming commented on TS-2482:
---

No time to test; here is a rough patch:
{code}
diff --git a/proxy/http/HttpTransact.cc b/proxy/http/HttpTransact.cc
index c6f55ed..cc4ffdc 100644
--- a/proxy/http/HttpTransact.cc
+++ b/proxy/http/HttpTransact.cc
@@ -865,7 +865,7 @@ HttpTransact::EndRemapRequest(State* s)
   ///////////////////////////////////////////////////////////////
   if (s->http_config_param->reverse_proxy_enabled
       && !s->client_info.is_transparent
-      && !incoming_request->is_target_in_url()) {
+      && !(incoming_request->is_target_in_url() || incoming_request->m_host_length > 0)) {
     ///////////////////////////////////////////////////////////////
     // the url mapping failed, reverse proxy was enabled,
     // and the request contains no host:
{code}

and:

{code}
diff --git a/iocore/net/Socks.cc b/iocore/net/Socks.cc
index cfdd214..c04c0f4 100644
--- a/iocore/net/Socks.cc
+++ b/iocore/net/Socks.cc
@@ -62,7 +62,7 @@ SocksEntry::init(ProxyMutex * m, SocksNetVC * vc, unsigned char socks_support, u
   req_data.api_info = 0;
   req_data.xact_start = time(0);

-  assert(ats_is_ip4(target_addr));
+  //assert(ats_is_ip4(target_addr));
   ats_ip_copy(req_data.dest_ip, target_addr);

   //we dont have information about the source. set to destination's
{code}


The assert part of the patch may need more work, and the SOCKS server side only does 
HTTP checking with no other SOCKS support; that is not such a good SOCKS server, 
indeed. I'd like to see someone take this and continue to improve the SOCKS server 
feature.

So I am pasting the patch here before it gets lost in time.

 Problems with SOCKS
 ---

 Key: TS-2482
 URL: https://issues.apache.org/jira/browse/TS-2482
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Radim Kolar
Assignee: weijin
 Fix For: sometime


 There are several problems with using SOCKS. I am interested in the case where TF 
 is the SOCKS client: the client sends an HTTP request and TF uses a SOCKS server to 
 make the connection to the internet.
 a/ - not documented enough in the default configs
 From the default config comments it seems that for running 
 TF 4.1.2 as a SOCKS client it is sufficient to add one line to socks.config:
 dest_ip=0.0.0.0-255.255.255.255 parent="10.0.0.7:9050"
 but the SOCKS proxy is not used. If I run tcpdump sniffing packets, TF never 
 tries to connect to that SOCKS server.
 From the source code - 
 https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc it 
 looks like "proxy.config.socks.socks_needed" needs to be set to activate 
 SOCKS support. This should be documented in both sample files: socks.config 
 and records.config.
 b/
 After enabling SOCKS, I am hit by this assert:
 Assertion failed: (ats_is_ip4(target_addr)), function init, file Socks.cc, 
 line 65.
 I run on a dual-stack system (IPv4, IPv6). 
 This code is setting the default destination for the SOCKS request? Can you not 
 use just 127.0.0.1 in the case where the client is connected over IPv6?
 https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc#L66



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3472) SNI proxy alike feature for TS

2015-03-30 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386893#comment-14386893
 ] 

Zhao Yongming commented on TS-3472:
---

sniproxy does not need to intercept as an SSL server or client; it only takes the SNI 
name and routes to the backend. It does not even need to link against an SSL 
library.

With ssl_multicert.config:
dest_ip=* action=tunnel
that does not work, as we would need an SSL cert/key file to act as an SSL intercept, right?

 SNI proxy alike feature for TS
 --

 Key: TS-3472
 URL: https://issues.apache.org/jira/browse/TS-3472
 Project: Traffic Server
  Issue Type: New Feature
  Components: SSL
Reporter: Zhao Yongming
 Fix For: sometime


 When doing a forward-proxy-only setup, sniproxy 
 (https://github.com/dlundquist/sniproxy.git) is a very tiny but cool effort to 
 set up a TLS-layer proxy with SNI, very good for some dirty tasks.
 In ATS there is already very good support in all the basic components, so 
 adding an SNI blind proxy should be a very good feature, with maybe only tiny 
 changes.
 SNI in TLS will extend the proxy (without caching) to all TLS-based services, 
 such as mail, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3472) SNI proxy alike feature for TS

2015-03-30 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386944#comment-14386944
 ] 

Zhao Yongming commented on TS-3472:
---

Yes, sniproxy makes it possible to proxy (without caching) TLS-based 
services with remap-like origin routing control, something like a layer-7 
routing/proxy service?

Sometimes in a forwarding proxy, proxying is much more important than caching.

 SNI proxy alike feature for TS
 --

 Key: TS-3472
 URL: https://issues.apache.org/jira/browse/TS-3472
 Project: Traffic Server
  Issue Type: New Feature
  Components: SSL
Reporter: Zhao Yongming
 Fix For: sometime


 When doing a forward-proxy-only setup, sniproxy 
 (https://github.com/dlundquist/sniproxy.git) is a very tiny but cool effort to 
 set up a TLS-layer proxy with SNI, very good for some dirty tasks.
 In ATS there is already very good support in all the basic components, so 
 adding an SNI blind proxy should be a very good feature, with maybe only tiny 
 changes.
 SNI in TLS will extend the proxy (without caching) to all TLS-based services, 
 such as mail, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TS-3472) SNI proxy alike feature for TS

2015-03-30 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387868#comment-14387868
 ] 

Zhao Yongming edited comment on TS-3472 at 3/31/15 2:52 AM:


The forwarding proxy has nothing to control, meaning it tries to proxy whatever is 
not cache-able. While the reverse proxy does caching on the site side, most 
forwarding proxies work on the user side.


was (Author: zym):
the forwarding proxy have nothing to control, that means they try to proxy if 
not cache-able. while the reverse proxy do caching on the site side, most of 
the forwarding proxy works on the user site.

 SNI proxy alike feature for TS
 --

 Key: TS-3472
 URL: https://issues.apache.org/jira/browse/TS-3472
 Project: Traffic Server
  Issue Type: New Feature
  Components: SSL
Reporter: Zhao Yongming
 Fix For: sometime


 When doing a forward-proxy-only setup, sniproxy 
 (https://github.com/dlundquist/sniproxy.git) is a very tiny but cool effort to 
 set up a TLS-layer proxy with SNI, very good for some dirty tasks.
 In ATS there is already very good support in all the basic components, so 
 adding an SNI blind proxy should be a very good feature, with maybe only tiny 
 changes.
 SNI in TLS will extend the proxy (without caching) to all TLS-based services, 
 such as mail, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3472) SNI proxy alike feature for TS

2015-03-30 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387868#comment-14387868
 ] 

Zhao Yongming commented on TS-3472:
---

The forwarding proxy has nothing to control, meaning it tries to proxy whatever is 
not cache-able. While the reverse proxy does caching on the site side, most 
forwarding proxies work on the user side.

 SNI proxy alike feature for TS
 --

 Key: TS-3472
 URL: https://issues.apache.org/jira/browse/TS-3472
 Project: Traffic Server
  Issue Type: New Feature
  Components: SSL
Reporter: Zhao Yongming
 Fix For: sometime


 When doing a forward-proxy-only setup, sniproxy 
 (https://github.com/dlundquist/sniproxy.git) is a very tiny but cool effort to 
 set up a TLS-layer proxy with SNI, very good for some dirty tasks.
 In ATS there is already very good support in all the basic components, so 
 adding an SNI blind proxy should be a very good feature, with maybe only tiny 
 changes.
 SNI in TLS will extend the proxy (without caching) to all TLS-based services, 
 such as mail, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-2205) AIO caused system hang

2015-03-29 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386140#comment-14386140
 ] 

Zhao Yongming commented on TS-2205:
---

TS-3458 is reported as an index syncing issue; it needs more information, and I 
will try to get it.

Basically we haven't found anything else that needs looking into. I will leave this 
issue open for a while and close it if no further information arrives.

 AIO caused system hang
 --

 Key: TS-2205
 URL: https://issues.apache.org/jira/browse/TS-2205
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Affects Versions: 4.0.1
Reporter: Zhao Yongming
Assignee: weijin
Priority: Critical
 Fix For: 6.0.0


 the system may hang with AIO thread CPU usage rising:
 {code}
 top - 17:10:46 up 38 days, 22:43,  2 users,  load average: 11.34, 2.97, 2.75
 Tasks: 512 total,  55 running, 457 sleeping,   0 stopped,   0 zombie
 Cpu(s):  6.9%us, 54.8%sy,  0.0%ni, 37.3%id,  0.0%wa,  0.0%hi,  0.9%si,  0.0%st
 Mem:  65963696k total, 64318444k used,  1645252k free,   241496k buffers
 Swap: 33554424k total,20416k used, 33534008k free, 14864188k cached
   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
 32498 ats   20   0 59.3g  45g  25m R 65.8 72.1  24:44.15 [ET_AIO 5]
  3213 root  20   0 000 S 15.4  0.0  13:38.32 kondemand/7
  3219 root  20   0 000 S 15.1  0.0  16:32.78 kondemand/13
 4 root  20   0 000 S 13.8  0.0  33:18.13 ksoftirqd/0
13 root  20   0 000 S 13.4  0.0  21:45.18 ksoftirqd/2
37 root  20   0 000 S 13.4  0.0  19:42.34 ksoftirqd/8
45 root  20   0 000 S 13.4  0.0  18:31.17 ksoftirqd/10
 32483 ats   20   0 59.3g  45g  25m R 13.4 72.1  16:47.14 [ET_AIO 6]
 32487 ats   20   0 59.3g  45g  25m R 13.4 72.1  16:46.93 [ET_AIO 2]
25 root  20   0 000 S 13.1  0.0  19:02.18 ksoftirqd/5
65 root  20   0 000 S 13.1  0.0  19:24.04 ksoftirqd/15
 32477 ats   20   0 59.3g  45g  25m R 13.1 72.1  16:32.90 [ET_AIO 0]
 32478 ats   20   0 59.3g  45g  25m R 13.1 72.1  16:49.77 [ET_AIO 1]
 32479 ats   20   0 59.3g  45g  25m S 13.1 72.1  16:41.77 [ET_AIO 2]
 32481 ats   20   0 59.3g  45g  25m R 13.1 72.1  16:50.40 [ET_AIO 4]
 32482 ats   20   0 59.3g  45g  25m R 13.1 72.1  16:47.42 [ET_AIO 5]
 32484 ats   20   0 59.3g  45g  25m R 13.1 72.1  16:25.81 [ET_AIO 7]
 32485 ats   20   0 59.3g  45g  25m S 13.1 72.1  16:52.71 [ET_AIO 0]
 32486 ats   20   0 59.3g  45g  25m S 13.1 72.1  16:51.69 [ET_AIO 1]
 32491 ats   20   0 59.3g  45g  25m S 13.1 72.1  16:50.58 [ET_AIO 6]
 32492 ats   20   0 59.3g  45g  25m S 13.1 72.1  16:49.12 [ET_AIO 7]
 32480 ats   20   0 59.3g  45g  25m S 12.8 72.1  16:47.39 [ET_AIO 3]
 32488 ats   20   0 59.3g  45g  25m R 12.8 72.1  16:52.16 [ET_AIO 3]
 32489 ats   20   0 59.3g  45g  25m S 12.8 72.1  16:50.79 [ET_AIO 4]
 32490 ats   20   0 59.3g  45g  25m R 12.8 72.1  16:52.61 [ET_AIO 5]
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-965) cache.config can't deal with both revalidate= and ttl-in-cache= specified

2015-03-08 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352036#comment-14352036
 ] 

Zhao Yongming commented on TS-965:
--

I have no idea what the details are, as cache.config is a multi-matching rule 
system, and there are some hard-coded rules that are not explained anywhere; for 
example, if you match a 'no-cache' rule, then it will not cache.

I don't like the idea of the cache-control matching, which is hard to extend 
and hard to use in the real world; maybe we should avoid using it in favor of 
Lua remapping and Lua plugins.

 cache.config can't deal with both revalidate= and ttl-in-cache= specified
 -

 Key: TS-965
 URL: https://issues.apache.org/jira/browse/TS-965
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Affects Versions: 3.1.0, 3.0.1
Reporter: Igor Galić
Assignee: Alan M. Carroll
  Labels: A, cache-control
 Fix For: 5.3.0


 If both of these options are specified (with the same time?), nothing is 
 cached at all.
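
A cache.config sketch of the combination the report describes (the domain and the times are illustrative):

{code}
dest_domain=example.com  revalidate=2h  ttl-in-cache=1d
{code}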



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TS-3197) dest_ip in cache.config should be expand to network style

2015-03-03 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-3197:
--
Summary: dest_ip in cache.config should be expand to network style  (was: 
dest_ip in cache.config doesn't work)

 dest_ip in cache.config should be expand to network style
 -

 Key: TS-3197
 URL: https://issues.apache.org/jira/browse/TS-3197
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache, Configuration, Performance
Reporter: Luca Rea
 Fix For: sometime


 Hi,
 I'm trying to exclude a /22 netblock from the cache system but the dest_ip 
 syntax doesn't work; details below:
 dest_ip=x.y.84.0-x.y.87.255 action=never-cache
 I've tried to stop,clear-cache,start several times but every time images 
 have been put into the cache and log shows NONE FIN FIN TCP_MEM_HIT or 
 NONE FIN FIN TCP_IMS_HIT.
 Other Info:
 proxy.node.version.manager.long=Apache Traffic Server - traffic_manager - 
 5.1.0 - (build # 81013 on Sep 10 2014 at 13:13:42)
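
A cache.config sketch contrasting the range form from the report with the network-prefix form the new title asks for (the prefix form is the requested improvement, not syntax known to work in 5.1.0):

{code}
# range form from the report (a /22 block)
dest_ip=x.y.84.0-x.y.87.255  action=never-cache

# requested network/prefix form
dest_ip=x.y.84.0/22          action=never-cache
{code}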



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TS-3197) dest_ip in cache.config should be expand to network style

2015-03-03 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-3197:
--
Priority: Minor  (was: Major)

 dest_ip in cache.config should be expand to network style
 -

 Key: TS-3197
 URL: https://issues.apache.org/jira/browse/TS-3197
 Project: Traffic Server
  Issue Type: Improvement
  Components: Cache, Configuration, Performance
Reporter: Luca Rea
Priority: Minor
 Fix For: sometime


 Hi,
 I'm trying to exclude a /22 netblock from the cache system but the dest_ip 
 syntax doesn't work; details below:
 dest_ip=x.y.84.0-x.y.87.255 action=never-cache
 I've tried to stop,clear-cache,start several times but every time images 
 have been put into the cache and log shows NONE FIN FIN TCP_MEM_HIT or 
 NONE FIN FIN TCP_IMS_HIT.
 Other Info:
 proxy.node.version.manager.long=Apache Traffic Server - traffic_manager - 
 5.1.0 - (build # 81013 on Sep 10 2014 at 13:13:42)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TS-3197) dest_ip in cache.config should be expand to network style

2015-03-03 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-3197:
--
Issue Type: Improvement  (was: Bug)

 dest_ip in cache.config should be expand to network style
 -

 Key: TS-3197
 URL: https://issues.apache.org/jira/browse/TS-3197
 Project: Traffic Server
  Issue Type: Improvement
  Components: Cache, Configuration, Performance
Reporter: Luca Rea
 Fix For: sometime


 Hi,
 I'm trying to exclude a /22 netblock from the cache system but the dest_ip 
 syntax doesn't work; details below:
 dest_ip=x.y.84.0-x.y.87.255 action=never-cache
 I've tried to stop,clear-cache,start several times but every time images 
 have been put into the cache and log shows NONE FIN FIN TCP_MEM_HIT or 
 NONE FIN FIN TCP_IMS_HIT.
 Other Info:
 proxy.node.version.manager.long=Apache Traffic Server - traffic_manager - 
 5.1.0 - (build # 81013 on Sep 10 2014 at 13:13:42)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3212) 200 code is returned as 304

2015-03-03 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344883#comment-14344883
 ] 

Zhao Yongming commented on TS-3212:
---

Yeah, let us start tracking the cache-control issue then.

What confuses me is why the IMS is there at all, if your response includes all 
the 'no-cache' directives to inform the client that the content is not 
cache-able. That is weird.

Anyway, keep this issue open until we fix the cache-control handling and recheck it.

 200 code is returned as 304
 ---

 Key: TS-3212
 URL: https://issues.apache.org/jira/browse/TS-3212
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Luca Rea
 Fix For: sometime


 The live streaming videos from the akamaihd.net CDN cannot be watched because ATS 
 rewrites 200 responses into 304s and the videos continuously re-enter buffering status:
 {code}
 GET 
 http://abclive.abcnews.com/z/abc_live1@136327/1200_02769fd3e0d85977-p.bootstrap?g=PDSTQVGEMQKR&b=500,300,700,900,1200&hdcore=3.1.0&plugin=aasp-3.1.0.43.124
  HTTP/1.1
 Host: abclive.abcnews.com
 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 
 Firefox/33.0
 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
 Accept-Language: it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3
 Accept-Encoding: gzip, deflate
 Referer: 
 http://a.abcnews.com/assets/player/amp/2.0.0012/amp.premier/AkamaiPremierPlayer.swf
 Cookie: _alid_=0OHcZb9VLdpbE6LrNYyDDA==
 Connection: keep-alive
 HTTP/1.1 200 OK
 Server: ContactLab
 Mime-Version: 1.0
 Content-Type: video/abst
 Content-Length: 122
 Last-Modified: Tue, 25 Nov 2014 05:28:32 GMT
 Expires: Tue, 25 Nov 2014 15:31:53 GMT
 Cache-Control: max-age=0, no-cache
 Pragma: no-cache
 Date: Tue, 25 Nov 2014 15:31:53 GMT
 access-control-allow-origin: *
 Set-Cookie: _alid_=0OHcZb9VLdpbE6LrNYyDDA==; path=/z/abc_live1@136327/; 
 domain=abclive.abcnews.com
 Age: 0
 Connection: keep-alive
 GET 
 http://abclive.abcnews.com/z/abc_live1@136327/1200_02769fd3e0d85977-p.bootstrap?g=PDSTQVGEMQKR&b=500,300,700,900,1200&hdcore=3.1.0&plugin=aasp-3.1.0.43.124
  HTTP/1.1
 Host: abclive.abcnews.com
 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 
 Firefox/33.0
 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
 Accept-Language: it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3
 Accept-Encoding: gzip, deflate
 Referer: 
 http://a.abcnews.com/assets/player/amp/2.0.0012/amp.premier/AkamaiPremierPlayer.swf
 Cookie: _alid_=0OHcZb9VLdpbE6LrNYyDDA==
 Connection: keep-alive
 If-Modified-Since: Tue, 25 Nov 2014 05:28:32 GMT
 HTTP/1.1 304 Not Modified
 Date: Tue, 25 Nov 2014 15:31:58 GMT
 Expires: Tue, 25 Nov 2014 15:31:58 GMT
 Cache-Control: max-age=0, no-cache
 Connection: keep-alive
 Server: ContactLab
 {code}
 using the url_regex to skip cache/IMS doesn't work, the workaround is the 
 following line in records.config:
 CONFIG proxy.config.http.cache.cache_urls_that_look_dynamic INT 0



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3360) TS don't use peer IP address from icp.config

2015-03-02 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344597#comment-14344597
 ] 

Zhao Yongming commented on TS-3360:
---

I think that is just the default config file being misleading; according to the official 
doc:
https://docs.trafficserver.apache.org/en/latest/reference/configuration/icp.config.en.html#std:configfile-icp.config

only one of Hostname and Host IP needs to be specified, not both :D

Can you provide an update to the default config file to make this clear?

 TS don't use peer IP address from icp.config
 

 Key: TS-3360
 URL: https://issues.apache.org/jira/browse/TS-3360
 Project: Traffic Server
  Issue Type: Bug
  Components: Configuration, ICP
Reporter: Anton Ageev
 Fix For: 5.3.0


 I use TS 5.0.1.
 I try to add peer in icp.config:
 {code}
 peer1|192.168.0.2|2|80|3130|0|0.0.0.0|1|
 {code}
 But I got in the log:
 {code}
 DEBUG: (icp_warn) ICP query send, res=90, ip=*Not IP address [0]*
 {code}
 The only way to specify peer IP is to specify *real* hostname:
 {code}
 google.com|192.168.0.2|2|80|3130|0|0.0.0.0|1|
 {code}
 ICP request to google.com in the log:
 {code}
 DEBUG: (icp) [ICP_QUEUE_REQUEST] Id=617 send query to [173.194.112.96:3130]
 {code}
 Host IP (second field) is parsed to {{\*Not IP address \[0\]\*}} always.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3212) 200 code is returned as 304

2015-03-02 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344608#comment-14344608
 ] 

Zhao Yongming commented on TS-3212:
---

[~luca.rea] are you still following this issue? I think we have found out 
some of the dark sides:
1. Your client sends an IMS but does not want the proxy/cache to return a 
304. This is really hard to do unless you make the IMS comparison fail.

2. cache.config never-cache is not working as expected as a no-touch 
pass-through. That is a dark side of ATS cache-control, IMO.

I am going to sort out as many of the cache-control issues as possible; 
I'd like to hear from you.

 200 code is returned as 304
 ---

 Key: TS-3212
 URL: https://issues.apache.org/jira/browse/TS-3212
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Luca Rea
 Fix For: sometime


 The live streaming videos from the akamaihd.net CDN cannot be watched because ATS 
 rewrites 200 responses into 304s and the videos continuously re-enter buffering status:
 {code}
 GET 
 http://abclive.abcnews.com/z/abc_live1@136327/1200_02769fd3e0d85977-p.bootstrap?g=PDSTQVGEMQKR&b=500,300,700,900,1200&hdcore=3.1.0&plugin=aasp-3.1.0.43.124
  HTTP/1.1
 Host: abclive.abcnews.com
 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 
 Firefox/33.0
 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
 Accept-Language: it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3
 Accept-Encoding: gzip, deflate
 Referer: 
 http://a.abcnews.com/assets/player/amp/2.0.0012/amp.premier/AkamaiPremierPlayer.swf
 Cookie: _alid_=0OHcZb9VLdpbE6LrNYyDDA==
 Connection: keep-alive
 HTTP/1.1 200 OK
 Server: ContactLab
 Mime-Version: 1.0
 Content-Type: video/abst
 Content-Length: 122
 Last-Modified: Tue, 25 Nov 2014 05:28:32 GMT
 Expires: Tue, 25 Nov 2014 15:31:53 GMT
 Cache-Control: max-age=0, no-cache
 Pragma: no-cache
 Date: Tue, 25 Nov 2014 15:31:53 GMT
 access-control-allow-origin: *
 Set-Cookie: _alid_=0OHcZb9VLdpbE6LrNYyDDA==; path=/z/abc_live1@136327/; 
 domain=abclive.abcnews.com
 Age: 0
 Connection: keep-alive
 GET 
 http://abclive.abcnews.com/z/abc_live1@136327/1200_02769fd3e0d85977-p.bootstrap?g=PDSTQVGEMQKR&b=500,300,700,900,1200&hdcore=3.1.0&plugin=aasp-3.1.0.43.124
  HTTP/1.1
 Host: abclive.abcnews.com
 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 
 Firefox/33.0
 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
 Accept-Language: it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3
 Accept-Encoding: gzip, deflate
 Referer: 
 http://a.abcnews.com/assets/player/amp/2.0.0012/amp.premier/AkamaiPremierPlayer.swf
 Cookie: _alid_=0OHcZb9VLdpbE6LrNYyDDA==
 Connection: keep-alive
 If-Modified-Since: Tue, 25 Nov 2014 05:28:32 GMT
 HTTP/1.1 304 Not Modified
 Date: Tue, 25 Nov 2014 15:31:58 GMT
 Expires: Tue, 25 Nov 2014 15:31:58 GMT
 Cache-Control: max-age=0, no-cache
 Connection: keep-alive
 Server: ContactLab
 {code}
 using the url_regex to skip cache/IMS doesn't work, the workaround is the 
 following line in records.config:
 CONFIG proxy.config.http.cache.cache_urls_that_look_dynamic INT 0



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TS-3412) Segmentation fault ET_CLUSTER

2015-02-25 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-3412:
--
Description: 
Can anyone help me please?

2.6.32-431.el6.x86_64
mem : 16GB
cpu   : 6core * 2   HS = 24core

{noformat}
kernel: [ET_CLUSTER 1][4508]: segfault at a8 ip 006c1571 sp 
2b5b58738890 error 4 in traffic_server[40+421000]
{noformat}

traffic.out
{noformat}
traffic_server: using root directory '/opt/ats'
traffic_server: Segmentation fault (Address not mapped to object 
[0xa8])traffic_server - STACK TRACE:
/opt/ats/bin/traffic_server(crash_logger_invoke(int, siginfo*, 
void*)+0x99)[0x4aaf19]
/lib64/libpthread.so.0(+0xf710)[0x2b5a25658710]
/opt/ats/bin/traffic_server(ClusterProcessor::connect_local(Continuation*, 
ClusterVCToken*, int, int)+0xa
/opt/ats/bin/traffic_server(cache_op_ClusterFunction(ClusterHandler*, void*, 
int)+0xabd)[0x6a71cd]
/opt/ats/bin/traffic_server(ClusterHandler::process_large_control_msgs()+0xe9)[0x6ab5e9]
/opt/ats/bin/traffic_server(ClusterHandler::update_channels_read()+0x8b)[0x6b0d7b]
/opt/ats/bin/traffic_server(ClusterHandler::process_read(long)+0x138)[0x6b1528]
/opt/ats/bin/traffic_server(ClusterHandler::mainClusterEvent(int, 
Event*)+0x176)[0x6b3f56]
/opt/ats/bin/traffic_server(ClusterState::IOComplete()+0x8a)[0x6b701a]
/opt/ats/bin/traffic_server(ClusterState::doIO_read_event(int, 
void*)+0xa7)[0x6b7307]
/opt/ats/bin/traffic_server[0x72b2e7]
/opt/ats/bin/traffic_server[0x72c53d]
/opt/ats/bin/traffic_server(NetHandler::mainNetEvent(int, 
Event*)+0x1f2)[0x7213c2]
/opt/ats/bin/traffic_server(EThread::process_event(Event*, int)+0x125)[0x74d4e5]
/opt/ats/bin/traffic_server(EThread::execute()+0x4c9)[0x74de29]
/opt/ats/bin/traffic_server[0x74c92a]
/lib64/libpthread.so.0(+0x79d1)[0x2b5a256509d1]
/lib64/libc.so.6(clone+0x6d)[0x2b5a26fa38fd]
traffic_server: using root directory '/opt/ats'
traffic_server: Terminated (Signal sent by kill() 28739 0)[E. Mgmt] log == 
[TrafficManager] using rootats'
{noformat}

records.config
{noformat}
CONFIG proxy.config.proxy_name STRING cluster-v530
LOCAL proxy.local.cluster.type INT 1
CONFIG proxy.config.cluster.ethernet_interface STRING bond0
CONFIG proxy.config.cluster.cluster_port INT 8086
CONFIG proxy.config.cluster.rsport INT 8088
CONFIG proxy.config.cluster.mcport INT 8089
CONFIG proxy.config.cluster.mc_group_addr STRING 224.0.1.40
CONFIG proxy.config.cluster.cluster_configuration STRING cluster.config
CONFIG proxy.config.cluster.threads INT 4
{noformat}

  was:
Can anyone help me please?

2.6.32-431.el6.x86_64
mem : 16GB
cpu   : 6core * 2   HS = 24core

{noformat}
kernel: [ET_CLUSTER 1][4508]: segfault at a8 ip 006c1571 sp 
2b5b58738890 error 4 in traffic_server[40+421000]
{noformat}

traffic.out
{noformat}
traffic_server: using root directory '/opt/ats'
traffic_server: Segmentation fault (Address not mapped to object 
[0xa8])traffic_server - STACK TRACE:
/opt/ats/bin/traffic_server(_Z19crash_logger_invokeiP7siginfoPv+0x99)[0x4aaf19]
/lib64/libpthread.so.0(+0xf710)[0x2b5a25658710]
/opt/ats/bin/traffic_server(_ZN16ClusterProcessor13connect_localEP12ContinuationP14ClusterVCTokenii+0xa
/opt/ats/bin/traffic_server(_Z24cache_op_ClusterFunctionP14ClusterHandlerPvi+0xabd)[0x6a71cd]
/opt/ats/bin/traffic_server(_ZN14ClusterHandler26process_large_control_msgsEv+0xe9)[0x6ab5e9]
/opt/ats/bin/traffic_server(_ZN14ClusterHandler20update_channels_readEv+0x8b)[0x6b0d7b]
/opt/ats/bin/traffic_server(_ZN14ClusterHandler12process_readEl+0x138)[0x6b1528]
/opt/ats/bin/traffic_server(_ZN14ClusterHandler16mainClusterEventEiP5Event+0x176)[0x6b3f56]
/opt/ats/bin/traffic_server(_ZN12ClusterState10IOCompleteEv+0x8a)[0x6b701a]
/opt/ats/bin/traffic_server(_ZN12ClusterState15doIO_read_eventEiPv+0xa7)[0x6b7307]
/opt/ats/bin/traffic_server[0x72b2e7]
/opt/ats/bin/traffic_server[0x72c53d]
/opt/ats/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x1f2)[0x7213c2]
/opt/ats/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x125)[0x74d4e5]
/opt/ats/bin/traffic_server(_ZN7EThread7executeEv+0x4c9)[0x74de29]
/opt/ats/bin/traffic_server[0x74c92a]
/lib64/libpthread.so.0(+0x79d1)[0x2b5a256509d1]
/lib64/libc.so.6(clone+0x6d)[0x2b5a26fa38fd]
traffic_server: using root directory '/opt/ats'
traffic_server: Terminated (Signal sent by kill() 28739 0)[E. Mgmt] log == 
[TrafficManager] using rootats'
{noformat}

records.config
{noformat}
CONFIG proxy.config.proxy_name STRING cluster-v530
LOCAL proxy.local.cluster.type INT 1
CONFIG proxy.config.cluster.ethernet_interface STRING bond0
CONFIG proxy.config.cluster.cluster_port INT 8086
CONFIG proxy.config.cluster.rsport INT 8088
CONFIG proxy.config.cluster.mcport INT 8089
CONFIG proxy.config.cluster.mc_group_addr STRING 224.0.1.40
CONFIG proxy.config.cluster.cluster_configuration STRING cluster.config
CONFIG proxy.config.cluster.threads INT 4
{noformat}


 Segmentation fault ET_CLUSTER
 

[jira] [Commented] (TS-3395) Hit ratio drops with high concurrency

2015-02-21 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14331983#comment-14331983
 ] 

Zhao Yongming commented on TS-3395:
---

I am not sure what you are trying to do: do you want to stress out what ATS can 
do, or do you want ATS to behave the way Nginx/Squid would? In both issues I have 
pointed out the ATS way of dealing with your problems, and even guided you step by 
step towards the root cause and how we handle it with ATS. You are now using ATS, 
which is very different from Squid and friends; it is powerful, and designed in 
some ways that may look strange. If you are a fresh user, learning the ATS way is 
a good start, as it turns out that ATS performs well in most real-world cases.

On the testing side, please refer to jtest (tools/jtest/) if you don't know it; 
it is another good stress tool, and it is suitable for stressing a performance 
monster like ATS.

Anyway, welcome to the ATS Colosseum.



 Hit ratio drops with high concurrency
 -

 Key: TS-3395
 URL: https://issues.apache.org/jira/browse/TS-3395
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Luca Bruno
 Fix For: 5.3.0


 I'm doing some tests and I've noticed that the hit ratio drops with more than 
 300 simultaneous http connections.
 The cache is on a raw disk of 500gb and it's not filled, so no eviction. The 
 ram cache is disabled.
 The test is done with web-polygraph. Content sizes vary from 5kb to 20kb 
 uniformly, expected hit ratio 60%, 2000 http connections, documents expire 
 after months. There's no Vary.
 !http://i.imgur.com/Zxlhgnf.png!
 Then I thought it could be a problem of polygraph. I wrote my own 
 client/server test code, it works fine also with squid, varnish and nginx. I 
 register a hit if I get either cR or cH in the headers.
 {noformat}
 2015/02/19 12:38:28 Starting 100 requests
 2015/02/19 12:37:58 Elapsed: 3m51.23552164s
 2015/02/19 12:37:58 Total average: 231.235µs/req, 4324.60req/s
 2015/02/19 12:37:58 Average size: 12.50kb/req
 2015/02/19 12:37:58 Bytes read: 12498412.45kb, 54050.57kb/s
 2015/02/19 12:37:58 Errors: 0
 2015/02/19 12:37:58 Offered Hit ratio: 59.95%
 2015/02/19 12:37:58 Measured Hit ratio: 37.20%
 2015/02/19 12:37:58 Hit bytes: 4649000609
 2015/02/19 12:37:58 Hit success: 599476/599476 (100.00%), 469.840902ms/req
 2015/02/19 12:37:58 Miss success: 400524/400524 (100.00%), 336.301464ms/req
 {noformat}
 So similar results, 37.20% on average. Then I thought that could be a problem 
 of how I'm testing stuff, and tried with nginx cache. It achieves 60% hit 
 ratio, but request rate is very slow compared to ATS for obvious reasons.
 Then I wanted to check if with 200 connections but with longer test time hit 
 ratio also dropped, but no, it's fine:
 !http://i.imgur.com/oMHscuf.png!
 So not a problem of my tests I guess.
 Then I realized by debugging the test server that the same url was asked 
 twice.
 Out of 100 requests, 78600 URLs were requested at least twice. One URL was 
 even requested 9 times. These same URLs are not requested close to each other: 
 even more than 30sec can pass from one request to the other for the same url.
 I also tweaked the following parameters:
 {noformat}
 CONFIG proxy.config.http.cache.fuzz.time INT 0
 CONFIG proxy.config.http.cache.fuzz.min_time INT 0
 CONFIG proxy.config.http.cache.fuzz.probability FLOAT 0.00
 CONFIG proxy.config.http.cache.max_open_read_retries INT 4
 CONFIG proxy.config.http.cache.open_read_retry_time INT 500
 {noformat}
 And this is the result with polygraph, similar results:
 !http://i.imgur.com/YgOndhY.png!
 Tweaked the read-while-writer option, and yet having similar results.
 Then I've enabled 1GB of ram, it is slightly better at the beginning, but 
 then it drops:
 !http://i.imgur.com/dFTJI16.png!
 traffic_top says 25% ram hit, 37% fresh, 63% cold.
 So given that it doesn't seem to be a concurrency problem when requesting the 
 url to the origin server, could it be a problem of concurrent write access to 
 the cache? So that some pages are not cached at all? The traffic_top fresh 
 percentage also makes me think it can be a problem in writing the cache.
 Not sure if I explained the problem correctly, ask me further information in 
 case. But in summary: hit ratio drops with a high number of connections, and 
 the problem seems related to pages that are not written to the cache.
 This is some related issue: 
 http://mail-archives.apache.org/mod_mbox/trafficserver-users/201301.mbox/%3ccd28cb1f.1f44a%25peter.wa...@email.disney.com%3E
 Also this: 
 http://apache-traffic-server.24303.n7.nabble.com/why-my-proxy-node-cache-hit-ratio-drops-td928.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3395) Hit ratio drops with high concurrency

2015-02-21 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14330321#comment-14330321
 ] 

Zhao Yongming commented on TS-3395:
---

When you hit the disk write IO bottleneck, writes reach their high-water mark and 
nothing else can be written to the cache; ATS then forwards those requests to the 
origin instead. From the user's point of view that shows up as a lower cache hit 
ratio, but you get a higher requests-per-second number than other proxies would.

This is a feature by design, I think :D

 Hit ratio drops with high concurrency
 -

 Key: TS-3395
 URL: https://issues.apache.org/jira/browse/TS-3395
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Luca Bruno
 Fix For: 5.3.0





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3395) Hit ratio drops with high concurrency

2015-02-21 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14330323#comment-14330323
 ] 

Zhao Yongming commented on TS-3395:
---

My suggestion for performance testing is always to avoid the disk IO (IOPS) 
bottleneck, as that is a hard limit on the performance of ATS, or of any other 
proxy/cache system. And if the disks are the bottleneck of the whole system, you 
can even calculate the real production performance from their IOPS.
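
As a back-of-the-envelope illustration of that calculation (the disk and IO-size numbers below are assumptions for a single SATA spindle, not measurements from this ticket):
{code}
# Rough estimate of the write-limited miss rate one cache disk can sustain.
disk_write_iops = 150        # assumed sustainable write IOPS for one 7.2k SATA disk
avg_object_kb   = 12.5       # matches the 5-20kb uniform objects in the test
io_size_kb      = 128        # assumed aggregation size per cache write

objects_per_io  = io_size_kb / avg_object_kb
max_miss_rate   = disk_write_iops * objects_per_io   # objects/s that can be written

print("max cacheable misses per second: %.0f" % max_miss_rate)
# Misses arriving faster than this are still served from the origin, but they
# never make it into the cache, so the measured hit ratio drops.
{code}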

 Hit ratio drops with high concurrency
 -

 Key: TS-3395
 URL: https://issues.apache.org/jira/browse/TS-3395
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Luca Bruno
 Fix For: 5.3.0





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3395) Hit ratio drops with high concurrency

2015-02-21 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14330336#comment-14330336
 ] 

Zhao Yongming commented on TS-3395:
---

If you treat ATS as a proxy, why not limit the connections on the origin side? We 
have more options for protecting the origin server than limiting the disk IO.
If you treat ATS as a cache, disk IO and disk space are the key resources of the 
cache system, so why not add more disks if you can? A disk write bottleneck is a 
really rare case when we are talking about a cache system, right?

 Hit ratio drops with high concurrency
 -

 Key: TS-3395
 URL: https://issues.apache.org/jira/browse/TS-3395
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Luca Bruno
 Fix For: 5.3.0





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TS-3395) Hit ratio drops with high concurrency

2015-02-21 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14330349#comment-14330349
 ] 

Zhao Yongming edited comment on TS-3395 at 2/21/15 5:08 PM:


Good, that is exactly the case I'd like to avoid, haha.

I am talking about the limit on the origin side, while TS-3386 is about the UA-side 
connection limit; in your case the UA and OS limits got mixed up because you pointed 
everything at 127.0.0.1 at the beginning.

In our practice we rely heavily on the limit on the UA side, which is a very good 
solution for both the cache and the origin.
Please refer to 'proxy.config.http.origin_max_connections' for the limit on the 
OS side.

When cache connections are held up by waiting or other issues, the httpSM stays 
alive, which can cost you a huge amount of memory. In our production systems, at 
around 20k qps, we like to keep the cache connections at about 1k-2k; that is 
critical for a busy system if you want it to stay stable in service.

Cache writes can also cost more performance than reads, so pay extra 
attention to cache writes



was (Author: zym):
Good, that is exactly the case I'd like to avoid, haha.

I am talking about the limit on the origin side, while TS-3386 is about the UA-side 
connection limit; in your case the UA and OS limits got mixed up because you pointed 
everything at 127.0.0.1 at the beginning.

In our practice we rely heavily on the limit on the UA side, which is a very good 
solution for both the cache and the origin.
Please refer to 'proxy.config.http.origin_max_connections' for the limit on the 
UA side.

When cache connections are held up by waiting or other issues, the httpSM stays 
alive, which can cost you a huge amount of memory. In our production systems, at 
around 20k qps, we like to keep the cache connections at about 1k-2k; that is 
critical for a busy system if you want it to stay stable in service.

Cache writes can also cost more performance than reads, so pay extra 
attention to cache writes


 Hit ratio drops with high concurrency
 -

 Key: TS-3395
 URL: https://issues.apache.org/jira/browse/TS-3395
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Luca Bruno
 Fix For: 5.3.0


 

[jira] [Comment Edited] (TS-3395) Hit ratio drops with high concurrency

2015-02-21 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14330349#comment-14330349
 ] 

Zhao Yongming edited comment on TS-3395 at 2/21/15 5:08 PM:


Good, that is exactly the case I'd like to avoid, haha.

I am talking about the limit on the origin side, while TS-3386 is about the UA-side 
connection limit; in your case the UA and OS limits got mixed up because you pointed 
everything at 127.0.0.1 at the beginning.

In our practice we rely heavily on the limit on the OS side, which is a very good 
solution for both the cache and the origin.
Please refer to 'proxy.config.http.origin_max_connections' for the limit on the 
OS side.

When cache connections are held up by waiting or other issues, the httpSM stays 
alive, which can cost you a huge amount of memory. In our production systems, at 
around 20k qps, we like to keep the cache connections at about 1k-2k; that is 
critical for a busy system if you want it to stay stable in service.

Cache writes can also cost more performance than reads, so pay extra 
attention to cache writes



was (Author: zym):
Good, that is exactly the case I'd like to avoid, haha.

I am talking about the limit on the origin side, while TS-3386 is about the UA-side 
connection limit; in your case the UA and OS limits got mixed up because you pointed 
everything at 127.0.0.1 at the beginning.

In our practice we rely heavily on the limit on the UA side, which is a very good 
solution for both the cache and the origin.
Please refer to 'proxy.config.http.origin_max_connections' for the limit on the 
OS side.

When cache connections are held up by waiting or other issues, the httpSM stays 
alive, which can cost you a huge amount of memory. In our production systems, at 
around 20k qps, we like to keep the cache connections at about 1k-2k; that is 
critical for a busy system if you want it to stay stable in service.

Cache writes can also cost more performance than reads, so pay extra 
attention to cache writes


 Hit ratio drops with high concurrency
 -

 Key: TS-3395
 URL: https://issues.apache.org/jira/browse/TS-3395
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Luca Bruno
 Fix For: 5.3.0


 

[jira] [Commented] (TS-3395) Hit ratio drops with high concurrency

2015-02-21 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14330349#comment-14330349
 ] 

Zhao Yongming commented on TS-3395:
---

Good, that is exactly the case I'd like to avoid, haha.

I am talking about the limit on the origin side, while TS-3386 is about the UA-side 
connection limit; in your case the UA and OS limits got mixed up because you pointed 
everything at 127.0.0.1 at the beginning.

In our practice we rely heavily on the limit on the UA side, which is a very good 
solution for both the cache and the origin.
Please refer to 'proxy.config.http.origin_max_connections' for the limit on the 
UA side.

When cache connections are held up by waiting or other issues, the httpSM stays 
alive, which can cost you a huge amount of memory. In our production systems, at 
around 20k qps, we like to keep the cache connections at about 1k-2k; that is 
critical for a busy system if you want it to stay stable in service.

Cache writes can also cost more performance than reads, so pay extra 
attention to cache writes
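
A minimal records.config sketch of the origin-side limit referred to above (the value is only an example, not a recommendation):
{noformat}
# cap the number of concurrent connections ATS opens to any single origin server
CONFIG proxy.config.http.origin_max_connections INT 500
{noformat}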


 Hit ratio drops with high concurrency
 -

 Key: TS-3395
 URL: https://issues.apache.org/jira/browse/TS-3395
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Luca Bruno
 Fix For: 5.3.0





--
This message was sent by Atlassian JIRA

[jira] [Commented] (TS-3395) Hit ratio drops with high concurrency

2015-02-21 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14330352#comment-14330352
 ] 

Zhao Yongming commented on TS-3395:
---

And in practice the number of cache write connections will always be less than 
origin_max_connections. Sounds perfect?

 Hit ratio drops with high concurrency
 -

 Key: TS-3395
 URL: https://issues.apache.org/jira/browse/TS-3395
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Luca Bruno
 Fix For: 5.3.0





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3395) Hit ratio drops with high concurrency

2015-02-21 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14330257#comment-14330257
 ] 

Zhao Yongming commented on TS-3395:
---

Well, if that is a disk IO bottleneck, I think the behaviour is reasonable. 
Can you please attach an IOPS view of the disk I/O?
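
In case it helps, one way to capture the IOPS view being asked for here (standard sysstat tool; the device names are just examples):
{noformat}
# extended per-device statistics every second; r/s and w/s are the read/write IOPS,
# and %util shows how saturated each cache disk is
iostat -dxk 1 /dev/sdb /dev/sdc
{noformat}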

 Hit ratio drops with high concurrency
 -

 Key: TS-3395
 URL: https://issues.apache.org/jira/browse/TS-3395
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Luca Bruno
 Fix For: 5.3.0





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3164) why the load of trafficserver occurrs a abrupt rise on a occasion ?

2015-02-11 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316244#comment-14316244
 ] 

Zhao Yongming commented on TS-3164:
---

I have seen that too, but what I got were some very short lockup-like situations, 
mostly less than 15s. I still don't know why.

 why the load of trafficserver occurrs a abrupt rise on a occasion ?
 ---

 Key: TS-3164
 URL: https://issues.apache.org/jira/browse/TS-3164
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
 Environment: CentOS 6.3 64bit, 8 cores, 128G mem 
Reporter: taoyunxing
 Fix For: sometime


 I use Tsar to monitor the traffic status of ATS 4.2.0, and came across 
 the following problem:
 {code}
 Time            -cpu-  -mem-  -tcp--  ---traffic----  -sda-  -sdb-  -sdc-  -load-
 Time            util   util   retran  bytin   bytout  util   util   util   load1
 03/11/14-18:20  40.67  87.19  3.36    24.5M   43.9M   13.02  94.68  0.00   5.34
 03/11/14-18:25  40.30  87.20  3.27    22.5M   42.6M   12.38  94.87  0.00   5.79
 03/11/14-18:30  40.84  84.67  3.44    21.4M   42.0M   13.29  95.37  0.00   6.28
 03/11/14-18:35  43.63  87.36  3.21    23.8M   45.0M   13.23  93.99  0.00   7.37
 03/11/14-18:40  42.25  87.37  3.09    24.2M   44.8M   12.84  95.77  0.00   7.25
 03/11/14-18:45  42.96  87.44  3.46    23.3M   46.0M   12.96  95.84  0.00   7.10
 03/11/14-18:50  44.00  87.42  3.49    22.3M   43.0M   14.17  94.99  0.00   6.57
 03/11/14-18:55  42.20  87.44  3.46    22.3M   43.6M   13.19  96.05  0.00   6.09
 03/11/14-19:00  44.90  87.53  3.60    23.6M   46.5M   13.61  96.67  0.00   8.06
 03/11/14-19:05  46.26  87.73  3.24    25.8M   49.1M   15.39  94.05  0.00   9.98
 03/11/14-19:10  43.85  87.69  3.19    25.4M   50.9M   12.88  97.80  0.00   7.99
 03/11/14-19:15  45.28  87.69  3.36    25.6M   49.6M   13.10  96.86  0.00   7.47
 03/11/14-19:20  44.11  85.20  3.29    24.1M   47.8M   14.24  96.75  0.00   5.82
 03/11/14-19:25  45.26  87.78  3.52    24.4M   47.7M   13.21  95.44  0.00   7.61
 03/11/14-19:30  44.83  87.80  3.64    25.7M   50.8M   13.27  98.02  0.00   6.85
 03/11/14-19:35  44.89  87.78  3.61    23.9M   49.0M   13.34  97.42  0.00   7.04
 03/11/14-19:40  69.21  88.88  0.55    18.3M   33.7M   11.39  71.23  0.00   65.80
 03/11/14-19:45  72.47  88.66  0.27    15.4M   31.6M   11.51  72.31  0.00   11.56
 03/11/14-19:50  44.87  88.72  4.11    22.7M   46.3M   12.99  97.33  0.00   8.29
 {code}

 in addition, top command show
 {code}
 hi:0
 ni:0
 si:45.56
 st:0
 sy:13.92
 us:12.58
 wa:14.3
 id:15.96
 {code}
 Who can help me? Thanks in advance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3386) Heartbeat failed with high load, trafficserver restarted

2015-02-11 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316482#comment-14316482
 ] 

Zhao Yongming commented on TS-3386:
---

Well, things get more interesting.
q1: why do you lose the cached content when traffic_server restarts?
 q1.1: is that a cache issue?

q2: you are trying to protect the origin server; why do you think a limit on the 
UA-side connections is a better solution than a limit on the origin side?
 q2.1: have you seen any occurrence of a connection (httpSM) hangup?
 q2.2: what would be a better way to handle the connection issue, for example 
a timeout?

When you are handling tons of cache and tons of traffic, keeping it simple and 
robust always beats anything clever.

Yes, we have fixed many cache issues we have met, HttpSM issues, connection 
timeout issues, connection leaks... I think most of the important changes are 
already in the official tree. This is how we work out the root issues in ATS, 
and it may end up as just a very tiny fix that only affects a very 
high traffic site with very strict SLA requirements.
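
For q2.2, a minimal sketch of the timeout knobs that usually come into play here (the values are only examples):
{noformat}
# drop origin transactions that stop making progress
CONFIG proxy.config.http.transaction_no_activity_timeout_out INT 30
# drop idle client transactions as well
CONFIG proxy.config.http.transaction_no_activity_timeout_in INT 30
# fail connect attempts to a dead origin quickly
CONFIG proxy.config.http.connect_attempts_timeout INT 30
{noformat}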

 Heartbeat failed with high load, trafficserver restarted
 

 Key: TS-3386
 URL: https://issues.apache.org/jira/browse/TS-3386
 Project: Traffic Server
  Issue Type: Bug
  Components: Performance
Reporter: Luca Bruno

 I've been evaluating ATS for some days. I'm using it with mostly default 
 settings, except I've lowered the number of connections to the backend, I 
 have a raw storage of 500gb, and disabled ram cache.
 Working fine, then I wanted to stress it more. I've increased the test to 
 1000 concurrent requests, then the ATS worker has been restarted and thus 
 lost the whole cache.
 /var/log/syslog:
 {noformat}
 Feb 11 10:05:52 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:05:52 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:02 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:02 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:02 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:02 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 9: 
 Killed
 Feb 11 10:06:02 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [Alarms::signalAlarm] Server Process was reset
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: traffic_server 
 Version: Apache Traffic Server - traffic_server - 5.2.0 - (build # 11013 on 
 Feb 10 2015 at 13:04:42)
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:12 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:12 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:22 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR:  
 (last system error 32: Broken pipe)
 Feb 11 10:06:22 test-cache traffic_cop[32984]: cop received child status 
 signal [32985 256]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: traffic_manager not running, 
 making sure traffic_server is dead
 Feb 11 10:06:22 test-cache traffic_cop[32984]: spawning traffic_manager
 Feb 11 10:06:22 test-cache traffic_cop[32984]: binpath is bin
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: --- Manager Starting 
 ---
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: Manager Version: 
 Apache Traffic Server - traffic_manager - 5.2.0 - (build # 11013 on Feb 10 
 2015 at 13:05:19)
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: traffic_server 
 Version: Apache Traffic Server - traffic_server - 5.2.0 - (build # 11013 on 
 Feb 10 2015 at 13:04:42)
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:32 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:32 test-cache 

[jira] [Commented] (TS-3386) Heartbeat failed with high load, trafficserver restarted

2015-02-11 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316279#comment-14316279
 ] 

Zhao Yongming commented on TS-3386:
---

Well, the remap matters: please don't point everything at 127.0.0.1:8080 for most 
of the services, that is not how ATS is meant to work as a proxy.

Use something like a "map http://mydomain.com:8080/ ..." rule, and do your testing 
with a modified /etc/hosts or with -x 127.0.0.1:8080 in curl.
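
A minimal sketch of the kind of remap rule and test invocation meant here (the domain names are placeholders, not from this ticket):
{noformat}
# remap.config: publish the service on its real name and remap it to the origin
map http://mydomain.com:8080/ http://origin.mydomain.com/

# test through the proxy without touching DNS
curl -s -o /dev/null -D - -x 127.0.0.1:8080 http://mydomain.com:8080/some/object
{noformat}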

 Heartbeat failed with high load, trafficserver restarted
 

 Key: TS-3386
 URL: https://issues.apache.org/jira/browse/TS-3386
 Project: Traffic Server
  Issue Type: Bug
  Components: Performance
Reporter: Luca Bruno

 I've been evaluating ATS for some days. I'm using it with mostly default 
 settings, except I've lowered the number of connections to the backend, I 
 have a raw storage of 500gb, and disabled ram cache.
 Working fine, then I wanted to stress it more. I've increased the test to 
 1000 concurrent requests, then the ATS worker has been restarted and thus 
 lost the whole cache.
 /var/log/syslog:
 {noformat}
 Feb 11 10:05:52 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:05:52 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:02 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:02 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:02 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:02 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 9: 
 Killed
 Feb 11 10:06:02 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [Alarms::signalAlarm] Server Process was reset
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: traffic_server 
 Version: Apache Traffic Server - traffic_server - 5.2.0 - (build # 11013 on 
 Feb 10 2015 at 13:04:42)
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:12 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:12 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:22 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR:  
 (last system error 32: Broken pipe)
 Feb 11 10:06:22 test-cache traffic_cop[32984]: cop received child status 
 signal [32985 256]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: traffic_manager not running, 
 making sure traffic_server is dead
 Feb 11 10:06:22 test-cache traffic_cop[32984]: spawning traffic_manager
 Feb 11 10:06:22 test-cache traffic_cop[32984]: binpath is bin
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: --- Manager Starting 
 ---
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: Manager Version: 
 Apache Traffic Server - traffic_manager - 5.2.0 - (build # 11013 on Feb 10 
 2015 at 13:05:19)
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: traffic_server 
 Version: Apache Traffic Server - traffic_server - 5.2.0 - (build # 11013 on 
 Feb 10 2015 at 13:04:42)
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:32 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:32 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:42 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:42 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:42 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:42 test-cache traffic_manager[59057]: {0x7f2c94ded720} ERROR: 
 [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 9: 
 Killed
 Feb 11 10:06:42 test-cache traffic_manager[59057]: {0x7f2c94ded720} ERROR: 
 [Alarms::signalAlarm] Server Process was reset
 Feb 11 10:06:44 test-cache traffic_server[59077]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:44 test-cache 

[jira] [Commented] (TS-3386) Heartbeat failed with high load, trafficserver restarted

2015-02-11 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316376#comment-14316376
 ] 

Zhao Yongming commented on TS-3386:
---

if we are talking about the kill: I agree there should be more safeguards before 
taking the server down, but how else would you know that the connection table is 
full while everything else still works?

we have tried to put the heartbeat on a connection that is not subject to the 
connection limit, but that did not sound like a good solution either

the heartbeat is a synthetic L7 service health check, designed to catch exactly 
this kind of abnormal behavior :D
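
to make it concrete, here is a toy model of that health-check loop in Python; the 
URL, the 10s interval, and the two-failure kill threshold are only inferred from 
the syslog excerpt in this ticket, not taken from traffic_cop's code:
{code}
# Toy model of the synthetic L7 heartbeat (illustration only; the endpoint,
# interval and failure threshold below are assumptions inferred from the log).
import time
import urllib.request

HEARTBEAT_URL = "http://127.0.0.1:8083/synthetic.txt"   # hypothetical health-check endpoint
FAILURE_LIMIT = 2                                       # the log kills right after "failed [2]"

def heartbeat_ok() -> bool:
    try:
        return urllib.request.urlopen(HEARTBEAT_URL, timeout=5).status == 200
    except Exception:
        return False                                    # a 502 or a timeout counts as a failure

failures = 0
while True:
    if heartbeat_ok():
        failures = 0
    else:
        failures += 1
        print(f"server heartbeat failed [{failures}]")
        if failures >= FAILURE_LIMIT:
            print("killing server")                     # traffic_cop would SIGKILL traffic_server here
            break
    time.sleep(10)                                      # checks in the log are ~10 seconds apart
{code}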

 Heartbeat failed with high load, trafficserver restarted
 

 Key: TS-3386
 URL: https://issues.apache.org/jira/browse/TS-3386
 Project: Traffic Server
  Issue Type: Bug
  Components: Performance
Reporter: Luca Bruno

 I've been evaluating ATS for some days. I'm using it with mostly default 
 settings, except I've lowered the number of connections to the backend, I 
 have a raw storage of 500gb, and disabled ram cache.
 Working fine, then I wanted to stress it more. I've increased the test to 
 1000 concurrent requests, then the ATS worker has been restarted and thus 
 lost the whole cache.
 /var/log/syslog:
 {noformat}
 Feb 11 10:05:52 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:05:52 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:02 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:02 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:02 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:02 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 9: 
 Killed
 Feb 11 10:06:02 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [Alarms::signalAlarm] Server Process was reset
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: traffic_server 
 Version: Apache Traffic Server - traffic_server - 5.2.0 - (build # 11013 on 
 Feb 10 2015 at 13:04:42)
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:12 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:12 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:22 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR:  
 (last system error 32: Broken pipe)
 Feb 11 10:06:22 test-cache traffic_cop[32984]: cop received child status 
 signal [32985 256]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: traffic_manager not running, 
 making sure traffic_server is dead
 Feb 11 10:06:22 test-cache traffic_cop[32984]: spawning traffic_manager
 Feb 11 10:06:22 test-cache traffic_cop[32984]: binpath is bin
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: --- Manager Starting 
 ---
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: Manager Version: 
 Apache Traffic Server - traffic_manager - 5.2.0 - (build # 11013 on Feb 10 
 2015 at 13:05:19)
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: traffic_server 
 Version: Apache Traffic Server - traffic_server - 5.2.0 - (build # 11013 on 
 Feb 10 2015 at 13:04:42)
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:32 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:32 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:42 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:42 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:42 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:42 test-cache traffic_manager[59057]: {0x7f2c94ded720} ERROR: 
 [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 9: 
 Killed
 Feb 11 10:06:42 test-cache traffic_manager[59057]: {0x7f2c94ded720} ERROR: 
 [Alarms::signalAlarm] 

[jira] [Commented] (TS-3386) Heartbeat failed with high load, trafficserver restarted

2015-02-11 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316343#comment-14316343
 ] 

Zhao Yongming commented on TS-3386:
---

well, proxy.config.net.connections_throttle = 1000, are you kidding? ATS is not 
squid nor httpd 1.x; it is built for far more concurrent connections than that, 
and the throttle also has to cover origin connections and the cop heartbeat, not 
just your benchmark clients
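
as a rough illustration of why 1000 is far too low for a 1000-connection 
benchmark, a back-of-envelope sizing might look like this (the ratio and headroom 
numbers are made up for illustration, this is not an official formula):
{code}
# connections_throttle caps the total number of network connections on both the
# client and origin side, so it also has to leave room for the cop heartbeat.
def suggested_throttle(client_conns, origin_ratio=1.0, headroom=0.25):
    """Estimate a throttle that will not starve the heartbeat under load."""
    origin_conns = int(client_conns * origin_ratio)   # worst case: every request opens an origin connection
    total = client_conns + origin_conns
    return int(total * (1 + headroom))                # headroom for heartbeat/management traffic

# 1000 benchmark connections already exhaust a throttle of 1000 on their own:
print(suggested_throttle(1000))   # -> 2500, still tiny next to RLIMIT_NOFILE (736236 in the log)
{code}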

 Heartbeat failed with high load, trafficserver restarted
 

 Key: TS-3386
 URL: https://issues.apache.org/jira/browse/TS-3386
 Project: Traffic Server
  Issue Type: Bug
  Components: Performance
Reporter: Luca Bruno

 I've been evaluating ATS for some days. I'm using it with mostly default 
 settings, except I've lowered the number of connections to the backend, I 
 have a raw storage of 500gb, and disabled ram cache.
 Working fine, then I wanted to stress it more. I've increased the test to 
 1000 concurrent requests, then the ATS worker has been restarted and thus 
 lost the whole cache.
 /var/log/syslog:
 {noformat}
 Feb 11 10:05:52 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:05:52 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:02 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:02 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:02 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:02 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 9: 
 Killed
 Feb 11 10:06:02 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [Alarms::signalAlarm] Server Process was reset
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: traffic_server 
 Version: Apache Traffic Server - traffic_server - 5.2.0 - (build # 11013 on 
 Feb 10 2015 at 13:04:42)
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:12 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:12 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:22 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR:  
 (last system error 32: Broken pipe)
 Feb 11 10:06:22 test-cache traffic_cop[32984]: cop received child status 
 signal [32985 256]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: traffic_manager not running, 
 making sure traffic_server is dead
 Feb 11 10:06:22 test-cache traffic_cop[32984]: spawning traffic_manager
 Feb 11 10:06:22 test-cache traffic_cop[32984]: binpath is bin
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: --- Manager Starting 
 ---
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: Manager Version: 
 Apache Traffic Server - traffic_manager - 5.2.0 - (build # 11013 on Feb 10 
 2015 at 13:05:19)
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: traffic_server 
 Version: Apache Traffic Server - traffic_server - 5.2.0 - (build # 11013 on 
 Feb 10 2015 at 13:04:42)
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:32 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:32 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:42 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:42 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:42 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:42 test-cache traffic_manager[59057]: {0x7f2c94ded720} ERROR: 
 [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 9: 
 Killed
 Feb 11 10:06:42 test-cache traffic_manager[59057]: {0x7f2c94ded720} ERROR: 
 [Alarms::signalAlarm] Server Process was reset
 Feb 11 10:06:44 test-cache traffic_server[59077]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:44 test-cache traffic_server[59077]: NOTE: traffic_server 
 Version: Apache Traffic Server - traffic_server - 5.2.0 - (build # 11013 on 
 Feb 10 2015 at 13:04:42)
 Feb 11 10:06:44 

[jira] [Commented] (TS-3386) Heartbeat failed with high load, trafficserver restarted

2015-02-11 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316238#comment-14316238
 ] 

Zhao Yongming commented on TS-3386:
---

oh, please run it with no traffic loaded and enable debug on http.*|dns.* 
(proxy.config.diags.debug.tags, with proxy.config.diags.debug.enabled set to 1); 
I suspect this is a HostDB reverse lookup on 127.0.0.1, or a lookup on localhost, 
going wrong. let us dig it out.

 Heartbeat failed with high load, trafficserver restarted
 

 Key: TS-3386
 URL: https://issues.apache.org/jira/browse/TS-3386
 Project: Traffic Server
  Issue Type: Bug
  Components: Performance
Reporter: Luca Bruno

 I've been evaluating ATS for some days. I'm using it with mostly default 
 settings, except I've lowered the number of connections to the backend, I 
 have a raw storage of 500gb, and disabled ram cache.
 Working fine, then I wanted to stress it more. I've increased the test to 
 1000 concurrent requests, then the ATS worker has been restarted and thus 
 lost the whole cache.
 /var/log/syslog:
 {noformat}
 Feb 11 10:05:52 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:05:52 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:02 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:02 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:02 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:02 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 9: 
 Killed
 Feb 11 10:06:02 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [Alarms::signalAlarm] Server Process was reset
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: traffic_server 
 Version: Apache Traffic Server - traffic_server - 5.2.0 - (build # 11013 on 
 Feb 10 2015 at 13:04:42)
 Feb 11 10:06:04 test-cache traffic_server[59047]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:12 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:12 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:22 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Feb 11 10:06:22 test-cache traffic_manager[32985]: {0x7f975c537720} ERROR:  
 (last system error 32: Broken pipe)
 Feb 11 10:06:22 test-cache traffic_cop[32984]: cop received child status 
 signal [32985 256]
 Feb 11 10:06:22 test-cache traffic_cop[32984]: traffic_manager not running, 
 making sure traffic_server is dead
 Feb 11 10:06:22 test-cache traffic_cop[32984]: spawning traffic_manager
 Feb 11 10:06:22 test-cache traffic_cop[32984]: binpath is bin
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: --- Manager Starting 
 ---
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: Manager Version: 
 Apache Traffic Server - traffic_manager - 5.2.0 - (build # 11013 on Feb 10 
 2015 at 13:05:19)
 Feb 11 10:06:22 test-cache traffic_manager[59057]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: traffic_server 
 Version: Apache Traffic Server - traffic_server - 5.2.0 - (build # 11013 on 
 Feb 10 2015 at 13:04:42)
 Feb 11 10:06:24 test-cache traffic_server[59065]: NOTE: 
 RLIMIT_NOFILE(7):cur(736236),max(736236)
 Feb 11 10:06:32 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:32 test-cache traffic_cop[32984]: server heartbeat failed [1]
 Feb 11 10:06:42 test-cache traffic_cop[32984]: (http test) received non-200 
 status(502)
 Feb 11 10:06:42 test-cache traffic_cop[32984]: server heartbeat failed [2]
 Feb 11 10:06:42 test-cache traffic_cop[32984]: killing server
 Feb 11 10:06:42 test-cache traffic_manager[59057]: {0x7f2c94ded720} ERROR: 
 [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 9: 
 Killed
 Feb 11 10:06:42 test-cache traffic_manager[59057]: {0x7f2c94ded720} ERROR: 
 [Alarms::signalAlarm] Server Process was reset
 Feb 11 10:06:44 test-cache traffic_server[59077]: NOTE: --- traffic_server 
 Starting ---
 Feb 11 10:06:44 test-cache traffic_server[59077]: NOTE: traffic_server 
 Version: Apache Traffic Server - 

[jira] [Updated] (TS-2482) Problems with SOCKS

2015-01-14 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-2482:
--
Assignee: weijin

 Problems with SOCKS
 ---

 Key: TS-2482
 URL: https://issues.apache.org/jira/browse/TS-2482
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Radim Kolar
Assignee: weijin
 Fix For: sometime


 There are several problems with using SOCKS. I am interested in case when TF 
 is sock client. Client sends HTTP request and TF uses SOCKS server to make 
 connection to internet.
 a/ - not documented enough in default configs
 From default configs comments it seems that for running 
 TF 4.1.2 as socks client, it is sufficient to add one line to socks.config:
 dest_ip=0.0.0.0-255.255.255.255 parent=10.0.0.7:9050
 but socks proxy is not used. If i run tcpdump sniffing packets  TF never 
 tries to connect to that SOCKS.
 From source code - 
 https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc it 
 looks that is needed to set proxy.config.socks.socks_needed to activate 
 socks support. This should be documented in both sample files: socks.config 
 and record.config
 b/
 after enabling socks, i am hit by this assert:
 Assertion failed: (ats_is_ip4(target_addr)), function init, file Socks.cc, 
 line 65.
 i run on dual stack system (ip4,ip6). 
 This code is setting default destination for SOCKS request? Can not you use 
 just 127.0.0.1 for case if client gets connected over IP6?
 https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc#L66



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-2482) Problems with SOCKS

2015-01-14 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278227#comment-14278227
 ] 

Zhao Yongming commented on TS-2482:
---

we have a patch that should fix the problem, I think. it turns ATS into a SOCKS5 
server, but it is still pending full testing with the parent SOCKS feature. the 
problem here is not only the assert, but also the HTTP transactions.
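
while that patch is pending, a quick way to sanity-check the parent side is a 
bare SOCKS5 greeting probe (RFC 1928); a minimal sketch, using the parent address 
from the socks.config example in this report:
{code}
# Send a SOCKS5 "no authentication" greeting and check the reply (RFC 1928).
import socket

PARENT = ("10.0.0.7", 9050)            # parent from the socks.config line in the report

def socks5_greeting_ok(host, port):
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(b"\x05\x01\x00")     # VER=5, NMETHODS=1, METHOD=0 (no auth)
        reply = s.recv(2)
    return reply == b"\x05\x00"        # server accepts SOCKS5 without authentication

print(socks5_greeting_ok(*PARENT))
{code}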

 Problems with SOCKS
 ---

 Key: TS-2482
 URL: https://issues.apache.org/jira/browse/TS-2482
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Radim Kolar
 Fix For: sometime


 There are several problems with using SOCKS. I am interested in case when TF 
 is sock client. Client sends HTTP request and TF uses SOCKS server to make 
 connection to internet.
 a/ - not documented enough in default configs
 From default configs comments it seems that for running 
 TF 4.1.2 as socks client, it is sufficient to add one line to socks.config:
 dest_ip=0.0.0.0-255.255.255.255 parent=10.0.0.7:9050
 but socks proxy is not used. If i run tcpdump sniffing packets  TF never 
 tries to connect to that SOCKS.
 From source code - 
 https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc it 
 looks that is needed to set proxy.config.socks.socks_needed to activate 
 socks support. This should be documented in both sample files: socks.config 
 and record.config
 b/
 after enabling socks, i am hit by this assert:
 Assertion failed: (ats_is_ip4(target_addr)), function init, file Socks.cc, 
 line 65.
 i run on dual stack system (ip4,ip6). 
 This code is setting default destination for SOCKS request? Can not you use 
 just 127.0.0.1 for case if client gets connected over IP6?
 https://github.com/apache/trafficserver/blob/master/iocore/net/Socks.cc#L66



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3088) Have ATS look at /etc/hosts

2014-12-15 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247700#comment-14247700
 ] 

Zhao Yongming commented on TS-3088:
---

it looks like some of the SplitDNS code was removed; is that feature still working 
after this commit?

 Have ATS look at /etc/hosts
 ---

 Key: TS-3088
 URL: https://issues.apache.org/jira/browse/TS-3088
 Project: Traffic Server
  Issue Type: New Feature
  Components: DNS
Reporter: David Carlin
Assignee: Alan M. Carroll
Priority: Minor
 Fix For: 5.3.0

 Attachments: ts-3088-3-2-x-patch.diff


 It would be nice if /etc/hosts was read when resolving hostnames - useful for 
 testing/troubleshooting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3220) Update http cache stats so we can determine if a response was served from ram cache

2014-12-05 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235637#comment-14235637
 ] 

Zhao Yongming commented on TS-3220:
---

yeah, nice catch, we have seen ram cache hit ratios higher than expected too.
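
a minimal sketch of how the two views can disagree, using the counters quoted 
below with invented sample values:
{code}
# The cache's own ram_cache counters vs. what the HTTP layer actually served
# from ram; the numbers are made up just to show the arithmetic.
ram_hits, ram_misses = 9_000, 1_000      # proxy.process.cache.ram_cache.{hits,misses}
mem_fresh, total_fresh = 6_500, 9_500    # proxy.process.http.cache_hit_mem_fresh / cache_hit_fresh

raw_ratio = ram_hits / (ram_hits + ram_misses)   # cache's point of view
http_ratio = mem_fresh / total_fresh             # fresh hits actually served from ram

print(f"ram cache view: {raw_ratio:.1%}")    # 90.0% -- includes hits never served to a client
print(f"http view     : {http_ratio:.1%}")   # 68.4% -- what clients actually got from ram
{code}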

 Update http cache stats so we can determine if a response was served from ram 
 cache
 ---

 Key: TS-3220
 URL: https://issues.apache.org/jira/browse/TS-3220
 Project: Traffic Server
  Issue Type: Improvement
  Components: Metrics
Reporter: Bryan Call
  Labels: A, Yahoo
 Fix For: 5.3.0


 Currently we use a combination of ram cache stats and some http ram cache 
 information to try to determine if the response was served from ram cache.  
 The ram cache stats don't know about http and the entry in ram cache might 
 not be valid.  It is possible to have a ram cache hit from the cache's point 
 of view, but not serve the response from cache at all.
 The http cache stats are missing a few stats to determine if the response was 
 served from ram.  We would need to add stat for ims responses served from ram 
 {{proxy.process.http.cache_hit_mem_ims}} and a stat if the stale response was 
 served from ram {{proxy.process.http.cache_hit_mem_stale_served}}.
 Ram cache stats for reference
 {code}
 proxy.process.cache.ram_cache.hits
 proxy.process.cache.ram_cache.misses
 {code}
 Current http cache stats for reference
 {code}
 proxy.process.http.cache_hit_fresh
 proxy.process.http.cache_hit_mem_fresh
 proxy.process.http.cache_hit_revalidated
 proxy.process.http.cache_hit_ims
 proxy.process.http.cache_hit_stale_served
 proxy.process.http.cache_miss_cold
 proxy.process.http.cache_miss_changed
 proxy.process.http.cache_miss_client_no_cache
 proxy.process.http.cache_miss_client_not_cacheable
 proxy.process.http.cache_miss_ims
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3212) 200 code is returned as 304

2014-11-25 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225607#comment-14225607
 ] 

Zhao Yongming commented on TS-3212:
---

well, if ATS returns you a 304, there are two possible cases:
1. the UA's IMS request is passed through to the origin, the origin returns a 
304, and that 304 response itself gets saved
2. the content is saved in cache but has expired, ATS revalidates against the 
origin with a self-built IMS header, the origin responds with a 200, but ATS 
still answers the UA with a 304.

if it is case #2, please confirm that the content really is saved in cache and 
that the origin response is a 200; the http_ui, tcpdump, or the debug settings in 
records.config may help.

I think case #2 is fine in general, but the object should not have been saved at 
all here, since the content is marked 'no-cache', right?
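
to tell the two cases apart, one option is to send the same If-Modified-Since 
request once directly to the origin and once through ATS and compare the status 
codes; a rough sketch, where the proxy address is an assumption and the URL and 
date are taken from the capture below:
{code}
import urllib.error
import urllib.request

URL = "http://abclive.abcnews.com/z/abc_live1@136327/1200_02769fd3e0d85977-p.bootstrap"
IMS = {"If-Modified-Since": "Tue, 25 Nov 2014 05:28:32 GMT"}
PROXY = {"http": "http://127.0.0.1:8080"}   # assumed ATS listen address

def status(url, headers, proxies=None):
    handlers = [urllib.request.ProxyHandler(proxies)] if proxies else []
    opener = urllib.request.build_opener(*handlers)
    try:
        return opener.open(urllib.request.Request(url, headers=headers), timeout=10).status
    except urllib.error.HTTPError as e:     # a 304 is surfaced as an HTTPError
        return e.code

print("origin :", status(URL, IMS))         # a 200 here while ATS answers 304 means case #2
print("via ATS:", status(URL, IMS, PROXY))
{code}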

 200 code is returned as 304
 ---

 Key: TS-3212
 URL: https://issues.apache.org/jira/browse/TS-3212
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Luca Rea

 The live streaming videos from akamaihd.net CDN cannot be watched because ATS 
 rewrite codes 200 into 304 and videos enter continuosly in buffering status:
 {code}
 GET 
 http://abclive.abcnews.com/z/abc_live1@136327/1200_02769fd3e0d85977-p.bootstrap?g=PDSTQVGEMQKR&b=500,300,700,900,1200&hdcore=3.1.0&plugin=aasp-3.1.0.43.124
  HTTP/1.1
 Host: abclive.abcnews.com
 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 
 Firefox/33.0
 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
 Accept-Language: it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3
 Accept-Encoding: gzip, deflate
 Referer: 
 http://a.abcnews.com/assets/player/amp/2.0.0012/amp.premier/AkamaiPremierPlayer.swf
 Cookie: _alid_=0OHcZb9VLdpbE6LrNYyDDA==
 Connection: keep-alive
 HTTP/1.1 200 OK
 Server: ContactLab
 Mime-Version: 1.0
 Content-Type: video/abst
 Content-Length: 122
 Last-Modified: Tue, 25 Nov 2014 05:28:32 GMT
 Expires: Tue, 25 Nov 2014 15:31:53 GMT
 Cache-Control: max-age=0, no-cache
 Pragma: no-cache
 Date: Tue, 25 Nov 2014 15:31:53 GMT
 access-control-allow-origin: *
 Set-Cookie: _alid_=0OHcZb9VLdpbE6LrNYyDDA==; path=/z/abc_live1@136327/; 
 domain=abclive.abcnews.com
 Age: 0
 Connection: keep-alive
 GET 
 http://abclive.abcnews.com/z/abc_live1@136327/1200_02769fd3e0d85977-p.bootstrap?g=PDSTQVGEMQKR&b=500,300,700,900,1200&hdcore=3.1.0&plugin=aasp-3.1.0.43.124
  HTTP/1.1
 Host: abclive.abcnews.com
 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 
 Firefox/33.0
 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
 Accept-Language: it-IT,it;q=0.8,en-US;q=0.5,en;q=0.3
 Accept-Encoding: gzip, deflate
 Referer: 
 http://a.abcnews.com/assets/player/amp/2.0.0012/amp.premier/AkamaiPremierPlayer.swf
 Cookie: _alid_=0OHcZb9VLdpbE6LrNYyDDA==
 Connection: keep-alive
 If-Modified-Since: Tue, 25 Nov 2014 05:28:32 GMT
 HTTP/1.1 304 Not Modified
 Date: Tue, 25 Nov 2014 15:31:58 GMT
 Expires: Tue, 25 Nov 2014 15:31:58 GMT
 Cache-Control: max-age=0, no-cache
 Connection: keep-alive
 Server: ContactLab
 {code}
 using the url_regex to skip cache/IMS doesn't work, the workaround is the 
 following line in records.config:
 CONFIG proxy.config.http.cache.cache_urls_that_look_dynamic INT 0



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3192) implement proxy.config.config_dir

2014-11-11 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207663#comment-14207663
 ] 

Zhao Yongming commented on TS-3192:
---

that is a feature pending removal, IMO. the original TS was designed so that the 
config files were relocatable, because it shipped as a binary distribution, and 
it could take the location from records.config and shell ENV settings. after 
open-sourcing, we can set the config dir with configure options, so there is no 
need to keep things that complex.

FYI

 implement proxy.config.config_dir
 -

 Key: TS-3192
 URL: https://issues.apache.org/jira/browse/TS-3192
 Project: Traffic Server
  Issue Type: New Feature
  Components: Configuration
Reporter: James Peach
Assignee: James Peach
 Fix For: 5.2.0


 {{proxy.config.config_dir}} has never been implemented, but there are various 
 scenarios where is it useful to be able to point Traffic Server to a 
 non-default set of configuration files. {{TS_ROOT}} is not always sufficient 
 for this because the system config directory is a path relative to the prefix 
 which otherwise cannot be altered (even assuming you know it).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-1822) Do we still need proxy.config.system.mmap_max ?

2014-11-10 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205773#comment-14205773
 ] 

Zhao Yongming commented on TS-1822:
---

we make use of the reclaim freelist on our 48G memory systems, handling about 
24-32G of ram cache with an average content size of about 32KB; the default 
sysctl setting vm.max_map_count = 65530 is not enough, and we have to raise it to 
2x.

so, if we choose to keep this, I'd make it an option to raise the default sysctl 
setting, for example from the cop process.
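
a quick way to see how close a running traffic_server is to that limit (sketch 
only; assumes Linux with /proc and that you pass the traffic_server pid yourself):
{code}
# Compare the number of memory mappings of a process against vm.max_map_count.
import sys

def map_pressure(pid):
    with open("/proc/sys/vm/max_map_count") as f:
        limit = int(f.read())
    with open(f"/proc/{pid}/maps") as f:
        used = sum(1 for _ in f)             # one line per mapping
    return used, limit

if __name__ == "__main__":
    used, limit = map_pressure(int(sys.argv[1]))
    print(f"{used} mappings of {limit} allowed ({used / limit:.0%})")
{code}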

 Do we still need proxy.config.system.mmap_max ?
 ---

 Key: TS-1822
 URL: https://issues.apache.org/jira/browse/TS-1822
 Project: Traffic Server
  Issue Type: Improvement
  Components: Core
Reporter: Leif Hedstrom
Assignee: Phil Sorber
  Labels: compatibility
 Fix For: 6.0.0


 A long time ago, we added proxy.config.system.mmap_max to let the 
 traffic_server increase the max number of mmap segments that we want to use. 
 We currently set this to 2MM.
 I'm wondering, do we really need this still ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TS-3181) manager ports should only do local network interaction

2014-11-09 Thread Zhao Yongming (JIRA)
Zhao Yongming created TS-3181:
-

 Summary: manager ports should only do local network interaction
 Key: TS-3181
 URL: https://issues.apache.org/jira/browse/TS-3181
 Project: Traffic Server
  Issue Type: Improvement
  Components: Manager
Reporter: Zhao Yongming


the manager ports, such as 8088, 8089, etc., should only accept local network 
connections; by ignoring all connections from outside networks we can make the 
interactions more stable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3181) manager ports should only do local network interaction

2014-11-09 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203965#comment-14203965
 ] 

Zhao Yongming commented on TS-3181:
---

for example, with cluster mode enabled we should try to filter out messages like these:
{code}
[Nov  7 15:28:21.428] Manager {0x7f277bfff700} NOTE: 
[ClusterCom::drainIncomingChannel] Unexpected message on cluster port.  
Possibly an attack
[Nov  7 15:28:57.501] Manager {0x7f277bfff700} NOTE: 
[ClusterCom::drainIncomingChannel] Unexpected message on cluster port.  
Possibly an attack
[Nov  7 15:34:09.624] Manager {0x7f277bfff700} NOTE: 
[ClusterCom::drainIncomingChannel] Unexpected message on cluster port.  
Possibly an attack
[Nov  7 15:38:36.235] Manager {0x7f277bfff700} NOTE: 
[ClusterCom::drainIncomingChannel] Unexpected message on cluster port.  
Possibly an attack
[Nov  7 15:39:45.596] Manager {0x7f277bfff700} NOTE: 
[ClusterCom::drainIncomingChannel] Unexpected message on cluster port.  
Possibly an attack
{code}
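
the kind of source filter I have in mind looks roughly like this; the networks 
are assumptions, and the real check would of course live inside ClusterCom rather 
than in a script:
{code}
# Drop cluster/manager messages whose source address is not a known local network.
import ipaddress

CLUSTER_NETS = [ipaddress.ip_network(n) for n in ("10.0.0.0/8", "127.0.0.0/8")]

def is_cluster_peer(addr: str) -> bool:
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in CLUSTER_NETS)

# Anything else would be ignored silently instead of logging
# "Unexpected message on cluster port.  Possibly an attack" for every packet.
print(is_cluster_peer("10.1.2.3"))      # True
print(is_cluster_peer("203.0.113.9"))   # False
{code}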

 manager ports should only do local network interaction
 --

 Key: TS-3181
 URL: https://issues.apache.org/jira/browse/TS-3181
 Project: Traffic Server
  Issue Type: Improvement
  Components: Manager
Reporter: Zhao Yongming

 the manager ports, such as 8088 8089 etc shoud only accept local network 
 connections, and by ignore all the connections from outer network, we can 
 make the interactions more stable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TS-3181) manager ports should only do local network interaction

2014-11-09 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-3181:
--
Fix Version/s: sometime

 manager ports should only do local network interaction
 --

 Key: TS-3181
 URL: https://issues.apache.org/jira/browse/TS-3181
 Project: Traffic Server
  Issue Type: Improvement
  Components: Manager
Reporter: Zhao Yongming
 Fix For: sometime


 the manager ports, such as 8088 8089 etc shoud only accept local network 
 connections, and by ignore all the connections from outer network, we can 
 make the interactions more stable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3174) Kill LRU Ram Cache

2014-11-08 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203330#comment-14203330
 ] 

Zhao Yongming commented on TS-3174:
---

hmm, are you sure you have the correct understanding of the CLFUS effects? in our 
use of the ram cache, CLFUS can waste memory, especially under a heavily changing 
traffic pattern, because CLFUS tries to cache more small objects and swaps out 
the big objects once the ram cache memory is full. that is a good feature, but it 
needs more work on allocating and de-allocating memory during that step; I think 
more work is still needed so that the swapped-out big objects are actually 
de-allocated or reused.

I know that will not kill TS on most busy systems, but you still need to keep an 
eye on it. the TS cop process will bring back a failed server, which may hide 
most of the problems from users. :D

and it is easy to verify on a system with mixed object sizes, i.e. active objects 
ranging from 1KB to 100MB:
raise the ram cache cutoff size from 4M to 100M, follow 
doc/sdk/troubleshooting-tips/debugging-memory-leaks.en.rst to enable the memory 
dump, and compare the allocated and used memory for each size class.

FYI
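
the comparison at the end boils down to something like this; the figures are 
invented, plug in the allocated/in-use numbers from the memory dump:
{code}
# Flag allocation size classes where much more memory is allocated than in use.
sizes = {                      # size class -> (allocated bytes, in-use bytes), invented values
    "1MB":   (2_147_483_648, 1_900_000_000),
    "4MB":   (8_589_934_592, 1_200_000_000),
    "100MB": (12_884_901_888,  500_000_000),
}
for name, (allocated, in_use) in sizes.items():
    idle = allocated - in_use
    print(f"{name:>6}: {idle / 2**20:8.0f} MB idle ({in_use / allocated:.0%} in use)")
{code}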

 Kill LRU Ram Cache
 --

 Key: TS-3174
 URL: https://issues.apache.org/jira/browse/TS-3174
 Project: Traffic Server
  Issue Type: Task
Reporter: Susan Hinrichs
 Fix For: 6.0.0


 Comment from [~zwoop]. Now that CLFUS is both stable, and default, is there 
 even a reason to keep the old LRU cache. If no objections should remove for 
 the next major version change.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TS-3180) Linux native aio not support disk > 2T

2014-11-08 Thread Zhao Yongming (JIRA)
Zhao Yongming created TS-3180:
-

 Summary: Linux native aio not support disk > 2T
 Key: TS-3180
 URL: https://issues.apache.org/jira/browse/TS-3180
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Zhao Yongming


{code}
21:47  faysal [Nov  8 15:45:30.080] Server {0x2ab53ff36700} WARNING: unable 
to clear cache directory '/dev/sdc 548864:366283256'
21:48  faysal although brw-rw 1 nobody nobody 8, 32 Nov  8 15:45 /dev/sdc
21:48  faysal fedora 21
21:48  faysal ping anyone
21:49  ming_zym disk fail?
21:52  ming_zym try to restart traffic server?
21:55  faysal i did restarted traffic server couple of times no luck
21:56  faysal by the way this is build with linux native aio enabled
21:56  faysal and latest master pulled today
21:56  ming_zym o, please don't use linux native aio in production
21:57  ming_zym not that ready to be used expect in testing
21:58  ming_zym I am sorry we don't have time to track down all those native 
aio issues here
21:59  faysal ok
21:59  faysal am compiling now without native aio
21:59  faysal and see what happens and inform you
22:06  faysal ming_zym: if you are working on native aio stuff its the issue
22:07  faysal i compiled without it and now its working fine
22:07  faysal i have noticed this on harddisks over 2T size
22:07  faysal smaller disks work fine with native aio
22:12  ming_zym ok, cool
22:13  faysal thats because i guess my disks are 3T each and one with 240G
22:14  faysal the 240 was taken no problem
22:14  faysal but the 3T has to be in GPT patition format
22:14  faysal and Fedora for some reason had issues identifying it
22:14  ming_zym hmm, maybe that is bug
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-2314) New config to allow unsatifiable Range: request to go straight to Origin

2014-09-22 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143290#comment-14143290
 ] 

Zhao Yongming commented on TS-2314:
---

looks like a fundamental bug in rww (read_while_writer); should we take a deeper 
look at it? cutting down origin traffic is always a critical feature for a cache.

 New config to allow unsatifiable Range: request to go straight to Origin
 

 Key: TS-2314
 URL: https://issues.apache.org/jira/browse/TS-2314
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: jaekyung oh
  Labels: range
 Attachments: TS-2314.diff


 Basically read_while_writer works fine when ATS handles normal file.
 In progressive download and playback of mp4 in which moov atom is placed at 
 the end of the file, ATS makes and returns wrong response for range request 
 from unfulfilled cache when read_while_writer is 1.
 In origin, apache has h264 streaming module. Everything is ok whether the 
 moov atom is placed at the beginning of the file or not in origin except a 
 range request happens with read_while_writer.
 Mostly our customer’s contents placed moov atom at the end of the file and in 
 the case movie player stops playing when it seek somewhere in the movie.
 to check if read_while_writer works fine,
 1. prepare a mp4 file whose moov atom is placed at the end of the file.
 2. curl --range - http://www.test.com/mp4/test.mp4 1 
 no_cache_from_origin 
 3. wget http://www.test.com/mp4/test.mp4
 4. right after wget, execute “curl --range - 
 http://www.test.com/mp4/test.mp4 1 from_read_while_writer” on other terminal
 (the point is sending range request while ATS is still downloading)
 5. after wget gets done, curl --range - 
 http://www.test.com/mp4/test.mp4 1 from_cache
 6. you can check compare those files by bindiff.
 The response from origin(no_cache_from_origin) for the range request is 
 exactly same to from_cache resulted from #5's range request. but 
 from_read_while_writer from #4 is totally different from others.
 i think a range request should be forwarded to origin server if it can’t find 
 the content with the offset in cache even if the read_while_writer is on, 
 instead ATS makes(from where?) and sends wrong response. (In squid.log it 
 indicates TCP_HIT)
 That’s why a movie player stops when it seeks right after the movie starts.
 Well. we turned off read_while_writer and movie play is ok but the problems 
 is read_while_writer is global options. we can’t set it differently for each 
 remap entry by conf_remap.
 So the downloading of Big file(not mp4 file) gives overhead to origin server 
 because read_while_writer is off.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-2314) New config to allow unsatifiable Range: request to go straight to Origin

2014-09-22 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143348#comment-14143348
 ] 

Zhao Yongming commented on TS-2314:
---

yeah, I think your suggestion is great. and the current rww lacks support for 
many cases besides this one, for example:
1. how long should a waiting reader wait?
   in this case the answer sounds like: not at all, but use as much of the 
already-downloaded data as possible
2. should we enable rww for a file as big as 5G?
   for example, I'd like to limit it to relatively small files, say 30M, because 
the origin site is far away from the edge site.
3. should we consider the partial-object feature from 
https://cwiki.apache.org/confluence/display/TS/Partial+Object+Caching ?
4. and if it is a slow client that triggered the cache write, will that become a 
speed problem for the other readers that are waiting?

those are some of the issues we are thinking about for rww; I just want rww to 
get more love :D

 New config to allow unsatifiable Range: request to go straight to Origin
 

 Key: TS-2314
 URL: https://issues.apache.org/jira/browse/TS-2314
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: jaekyung oh
  Labels: range
 Attachments: TS-2314.diff


 Basically read_while_writer works fine when ATS handles normal file.
 In progressive download and playback of mp4 in which moov atom is placed at 
 the end of the file, ATS makes and returns wrong response for range request 
 from unfulfilled cache when read_while_writer is 1.
 In origin, apache has h264 streaming module. Everything is ok whether the 
 moov atom is placed at the beginning of the file or not in origin except a 
 range request happens with read_while_writer.
 Mostly our customer’s contents placed moov atom at the end of the file and in 
 the case movie player stops playing when it seek somewhere in the movie.
 to check if read_while_writer works fine,
 1. prepare a mp4 file whose moov atom is placed at the end of the file.
 2. curl --range - http://www.test.com/mp4/test.mp4 1 
 no_cache_from_origin 
 3. wget http://www.test.com/mp4/test.mp4
 4. right after wget, execute “curl --range - 
 http://www.test.com/mp4/test.mp4 1 from_read_while_writer” on other terminal
 (the point is sending range request while ATS is still downloading)
 5. after wget gets done, curl --range - 
 http://www.test.com/mp4/test.mp4 1 from_cache
 6. you can check compare those files by bindiff.
 The response from origin(no_cache_from_origin) for the range request is 
 exactly same to from_cache resulted from #5's range request. but 
 from_read_while_writer from #4 is totally different from others.
 i think a range request should be forwarded to origin server if it can’t find 
 the content with the offset in cache even if the read_while_writer is on, 
 instead ATS makes(from where?) and sends wrong response. (In squid.log it 
 indicates TCP_HIT)
 That’s why a movie player stops when it seeks right after the movie starts.
 Well. we turned off read_while_writer and movie play is ok but the problems 
 is read_while_writer is global options. we can’t set it differently for each 
 remap entry by conf_remap.
 So the downloading of Big file(not mp4 file) gives overhead to origin server 
 because read_while_writer is off.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TS-2314) New config to allow unsatifiable Range: request to go straight to Origin

2014-09-22 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143348#comment-14143348
 ] 

Zhao Yongming edited comment on TS-2314 at 9/22/14 4:10 PM:


yeah, I think your suggestion is great. and the current rww lack of support for 
many cases besides this case, for example:
1, how long a reader in waiting should wait for?
   in this case, the answer sounds like not at all, but use as much downloaded 
data as possible
2, should we enable the rww for a file as big as 5G?
   for example I'd like to make a limited usage with relatively small files 
such as 30m, due to the origin site is far away from the edge site.
3, should we consider on the patial feature in the 
https://cwiki.apache.org/confluence/display/TS/Partial+Object+Caching ?
4, well, if it is a low speed user that triggered the cache storing, will it be 
a speed problem for others readers that waiting?

well, that are some of the issues we thinking on rww, I just want rww get more 
loves :D


was (Author: zym):
yeah, I think your suggestion is great. and the current rww lack of support for 
many cases besides this case, for example:
1, how long should a reader in waiting should wait for?
   in this case, the answer sounds like not at all, but use as much downloaded 
data as possible
2, should we enable the rww for a file as big as 5G?
   for example I'd like to make a limited usage with relatively small files 
such as 30m, due to the origin site is far away from the edge site.
3, should we consider on the patial feature in the 
https://cwiki.apache.org/confluence/display/TS/Partial+Object+Caching ?
4, well, if it is a low speed user that triggered the cache storing, will it be 
a speed problem for others readers that waiting?

well, that are some of the issues we thinking on rww, I just want rww get more 
loves :D

 New config to allow unsatifiable Range: request to go straight to Origin
 

 Key: TS-2314
 URL: https://issues.apache.org/jira/browse/TS-2314
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: jaekyung oh
  Labels: range
 Attachments: TS-2314.diff


 Basically read_while_writer works fine when ATS handles normal file.
 In progressive download and playback of mp4 in which moov atom is placed at 
 the end of the file, ATS makes and returns wrong response for range request 
 from unfulfilled cache when read_while_writer is 1.
 In origin, apache has h264 streaming module. Everything is ok whether the 
 moov atom is placed at the beginning of the file or not in origin except a 
 range request happens with read_while_writer.
 Mostly our customer’s contents placed moov atom at the end of the file and in 
 the case movie player stops playing when it seek somewhere in the movie.
 to check if read_while_writer works fine,
 1. prepare a mp4 file whose moov atom is placed at the end of the file.
 2. curl --range - http://www.test.com/mp4/test.mp4 1 
 no_cache_from_origin 
 3. wget http://www.test.com/mp4/test.mp4
 4. right after wget, execute “curl --range - 
 http://www.test.com/mp4/test.mp4 1 from_read_while_writer” on other terminal
 (the point is sending range request while ATS is still downloading)
 5. after wget gets done, curl --range - 
 http://www.test.com/mp4/test.mp4 1 from_cache
 6. you can check compare those files by bindiff.
 The response from origin(no_cache_from_origin) for the range request is 
 exactly same to from_cache resulted from #5's range request. but 
 from_read_while_writer from #4 is totally different from others.
 i think a range request should be forwarded to origin server if it can’t find 
 the content with the offset in cache even if the read_while_writer is on, 
 instead ATS makes(from where?) and sends wrong response. (In squid.log it 
 indicates TCP_HIT)
 That’s why a movie player stops when it seeks right after the movie starts.
 Well. we turned off read_while_writer and movie play is ok but the problems 
 is read_while_writer is global options. we can’t set it differently for each 
 remap entry by conf_remap.
 So the downloading of Big file(not mp4 file) gives overhead to origin server 
 because read_while_writer is off.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3083) crash

2014-09-17 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138429#comment-14138429
 ] 

Zhao Yongming commented on TS-3083:
---

hmm, can you provide more information on your configure options and environment? 
I think we may get [~yunkai] to take a look if it is the freelist issue

 crash
 -

 Key: TS-3083
 URL: https://issues.apache.org/jira/browse/TS-3083
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 5.0.2
Reporter: bettydramit
  Labels: crash

 c++filt a.txt 
 {code}
 /lib64/libpthread.so.0(+0xf710)[0x2b4c37949710]
 /usr/lib64/trafficserver/libtsutil.so.5(ink_atomiclist_pop+0x3e)[0x2b4c35abb64e]
 /usr/lib64/trafficserver/libtsutil.so.5(reclaimable_freelist_new+0x65)[0x2b4c35abc065]
 /usr/bin/traffic_server(MIOBuffer_tracker::operator()(long)+0x2b)[0x4a33db]
 /usr/bin/traffic_server(PluginVCCore::init()+0x2e3)[0x4d9903]
 /usr/bin/traffic_server(PluginVCCore::alloc()+0x11d)[0x4dcf4d]
 /usr/bin/traffic_server(TSHttpConnectWithPluginId+0x5d)[0x4b9e9d]
 /usr/bin/traffic_server(FetchSM::httpConnect()+0x74)[0x4a0224]
 /usr/bin/traffic_server(PluginVC::process_read_side(bool)+0x375)[0x4da675]
 /usr/bin/traffic_server(PluginVC::process_write_side(bool)+0x57a)[0x4dafca]
 /usr/bin/traffic_server(PluginVC::main_handler(int, void*)+0x315)[0x4dc9a5]
 /usr/bin/traffic_server(EThread::process_event(Event*, int)+0x8f)[0x73788f]
 /usr/bin/traffic_server(EThread::execute()+0x57b)[0x7381fb]
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-3032) FATAL: ats_malloc: couldn't allocate XXXXXX bytes

2014-08-29 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115401#comment-14115401
 ] 

Zhao Yongming commented on TS-3032:
---

yeah, 64k is too small for you; I'd suggest > 128K, and you may use 256K I think.

 FATAL: ats_malloc: couldn't allocate XX bytes
 -

 Key: TS-3032
 URL: https://issues.apache.org/jira/browse/TS-3032
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 5.0.1
Reporter: Nikolai Gorchilov
Assignee: Brian Geffon
  Labels: crash
 Fix For: 5.2.0

 Attachments: memory.d.png


 ATS 5.0.1 under Unbuntu 12.04.4 running happily for days suddenly crashes due 
 to memory allocation issue. Happens once or twice a week.
 Server is having plenty of RAM - 128G - out of which 64G+ are free. Nothing 
 suspicious in dmesg.
 {noformat}
 FATAL: ats_malloc: couldn't allocate 155648 bytes
 /z/bin/traffic_server - STACK TRACE: 
 /z/lib/libtsutil.so.5(+0x1e837)[0x2b6251b3d837]
 /z/lib/libtsutil.so.5(ats_malloc+0x30)[0x2b6251b40c50]
 /z/bin/traffic_server(HdrHeap::coalesce_str_heaps(int)+0x34)[0x62e834]
 /z/bin/traffic_server(http_hdr_clone(HTTPHdrImpl*, HdrHeap*, 
 HdrHeap*)+0x8f)[0x62a54f]
 /z/bin/traffic_server(HttpTransactHeaders::copy_header_fields(HTTPHdr*, 
 HTTPHdr*, bool, long)+0x1ae)[0x5d08de]
 /z/bin/traffic_server(HttpTransact::build_request(HttpTransact::State*, 
 HTTPHdr*, HTTPHdr*, HTTPVersion)+0x5c)[0x5b280c]
 /z/bin/traffic_server(HttpTransact::HandleCacheOpenReadMiss(HttpTransact::State*)+0x2c8)[0x5c2ce8]
 /z/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
 (*)(HttpTransact::State*))+0x66)[0x58e356]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::do_hostdb_lookup()+0x27a)[0x58e84a]
 /z/bin/traffic_server(HttpSM::set_next_state()+0xd48)[0x5a1038]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/x3me_dscp.so(http_txn_hook(tsapi_cont*, TSEvent, 
 void*)+0x236)[0x2b626342b508]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_cache_open_read(int, 
 void*)+0x180)[0x59b070]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server(HttpCacheSM::state_cache_open_read(int, 
 void*)+0x173)[0x57bbb3]
 /z/bin/traffic_server(Cache::open_read(Continuation*, INK_MD5*, HTTPHdr*, 
 CacheLookupHttpConfig*, CacheFragType, char*, int)+0x616)[0x6d65a6]
 /z/bin/traffic_server(CacheProcessor::open_read(Continuation*, URL*, bool, 
 HTTPHdr*, CacheLookupHttpConfig*, long, CacheFragType)+0xb0)[0x6b1af0]
 /z/bin/traffic_server(HttpCacheSM::open_read(URL*, HTTPHdr*, 
 CacheLookupHttpConfig*, long)+0x83)[0x57c2d3]
 /z/bin/traffic_server(HttpSM::do_cache_lookup_and_read()+0xfb)[0x58baeb]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x888)[0x5a0b78]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x7e2)[0x5a0ad2]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/cacheurl.so(+0x17dc)[0x2b6263a477dc]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/tslua.so(+0x596f)[0x2b626363396f]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/stats_over_http.so(+0x1235)[0x2b6263228235]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_read_client_request_header(int, 
 void*)+0x22b)[0x59270b]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server[0x714a60]
 /z/bin/traffic_server(NetHandler::mainNetEvent(int, Event*)+0x1ed)[0x7077cd]
 /z/bin/traffic_server(EThread::process_event(Event*, int)+0x91)[0x736111]
 

[jira] [Comment Edited] (TS-3032) FATAL: ats_malloc: couldn't allocate XXXXXX bytes

2014-08-29 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115401#comment-14115401
 ] 

Zhao Yongming edited comment on TS-3032 at 8/29/14 4:26 PM:


yeah, 64k is too small for you; I'd suggest > 128K for a 48G RAM system. but is 
your ram cache set to 10G too? why does it still use so much memory here? can you 
dump out the memory allocator debug info?


was (Author: zym):
yeah, 64k is too small for you, I'd suggest you  128K, you may use 256K I 
think.

 FATAL: ats_malloc: couldn't allocate XX bytes
 -

 Key: TS-3032
 URL: https://issues.apache.org/jira/browse/TS-3032
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 5.0.1
Reporter: Nikolai Gorchilov
Assignee: Brian Geffon
  Labels: crash
 Fix For: 5.2.0

 Attachments: memory.d.png


 ATS 5.0.1 under Unbuntu 12.04.4 running happily for days suddenly crashes due 
 to memory allocation issue. Happens once or twice a week.
 Server is having plenty of RAM - 128G - out of which 64G+ are free. Nothing 
 suspicious in dmesg.
 {noformat}
 FATAL: ats_malloc: couldn't allocate 155648 bytes
 /z/bin/traffic_server - STACK TRACE: 
 /z/lib/libtsutil.so.5(+0x1e837)[0x2b6251b3d837]
 /z/lib/libtsutil.so.5(ats_malloc+0x30)[0x2b6251b40c50]
 /z/bin/traffic_server(HdrHeap::coalesce_str_heaps(int)+0x34)[0x62e834]
 /z/bin/traffic_server(http_hdr_clone(HTTPHdrImpl*, HdrHeap*, 
 HdrHeap*)+0x8f)[0x62a54f]
 /z/bin/traffic_server(HttpTransactHeaders::copy_header_fields(HTTPHdr*, 
 HTTPHdr*, bool, long)+0x1ae)[0x5d08de]
 /z/bin/traffic_server(HttpTransact::build_request(HttpTransact::State*, 
 HTTPHdr*, HTTPHdr*, HTTPVersion)+0x5c)[0x5b280c]
 /z/bin/traffic_server(HttpTransact::HandleCacheOpenReadMiss(HttpTransact::State*)+0x2c8)[0x5c2ce8]
 /z/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
 (*)(HttpTransact::State*))+0x66)[0x58e356]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::do_hostdb_lookup()+0x27a)[0x58e84a]
 /z/bin/traffic_server(HttpSM::set_next_state()+0xd48)[0x5a1038]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/x3me_dscp.so(http_txn_hook(tsapi_cont*, TSEvent, 
 void*)+0x236)[0x2b626342b508]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_cache_open_read(int, 
 void*)+0x180)[0x59b070]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server(HttpCacheSM::state_cache_open_read(int, 
 void*)+0x173)[0x57bbb3]
 /z/bin/traffic_server(Cache::open_read(Continuation*, INK_MD5*, HTTPHdr*, 
 CacheLookupHttpConfig*, CacheFragType, char*, int)+0x616)[0x6d65a6]
 /z/bin/traffic_server(CacheProcessor::open_read(Continuation*, URL*, bool, 
 HTTPHdr*, CacheLookupHttpConfig*, long, CacheFragType)+0xb0)[0x6b1af0]
 /z/bin/traffic_server(HttpCacheSM::open_read(URL*, HTTPHdr*, 
 CacheLookupHttpConfig*, long)+0x83)[0x57c2d3]
 /z/bin/traffic_server(HttpSM::do_cache_lookup_and_read()+0xfb)[0x58baeb]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x888)[0x5a0b78]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x7e2)[0x5a0ad2]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/cacheurl.so(+0x17dc)[0x2b6263a477dc]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/tslua.so(+0x596f)[0x2b626363396f]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/stats_over_http.so(+0x1235)[0x2b6263228235]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_read_client_request_header(int, 
 void*)+0x22b)[0x59270b]
 

[jira] [Commented] (TS-3032) FATAL: ats_malloc: couldn't allocate XXXXXX bytes

2014-08-29 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115416#comment-14115416
 ] 

Zhao Yongming commented on TS-3032:
---

well, it looks like your memory usage starts from about 20G. I'd guess that your index memory 
is roughly 20G, which indicates you may have ~20TB of storage, assuming you haven't 
changed proxy.config.cache.min_average_object_size. Is that right?
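
For reference, a back-of-the-envelope version of that estimate as a small Python sketch. The ~10 bytes per directory entry and the 8000-byte default for min_average_object_size are assumptions that may differ between versions:
{code}
# Rough estimate of cache directory (index) memory from raw storage size.
# Assumed constants (verify against your ATS version): ~10 bytes per
# directory entry, one entry per min_average_object_size bytes of storage.
DIR_ENTRY_BYTES = 10
min_average_object_size = 8000        # records.config default (assumed)
raw_storage_bytes = 20 * 10**12       # ~20 TB of disk cache

entries = raw_storage_bytes // min_average_object_size
index_bytes = entries * DIR_ENTRY_BYTES
print("directory entries: %d" % entries)               # 2,500,000,000
print("index memory: ~%.0f GB" % (index_bytes / 1e9))  # ~25 GB, same order as the ~20G above
{code}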

 FATAL: ats_malloc: couldn't allocate XX bytes
 -

 Key: TS-3032
 URL: https://issues.apache.org/jira/browse/TS-3032
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 5.0.1
Reporter: Nikolai Gorchilov
Assignee: Brian Geffon
  Labels: crash
 Fix For: 5.2.0

 Attachments: memory.d.png


 ATS 5.0.1 under Unbuntu 12.04.4 running happily for days suddenly crashes due 
 to memory allocation issue. Happens once or twice a week.
 Server is having plenty of RAM - 128G - out of which 64G+ are free. Nothing 
 suspicious in dmesg.
 {noformat}
 FATAL: ats_malloc: couldn't allocate 155648 bytes
 /z/bin/traffic_server - STACK TRACE: 
 /z/lib/libtsutil.so.5(+0x1e837)[0x2b6251b3d837]
 /z/lib/libtsutil.so.5(ats_malloc+0x30)[0x2b6251b40c50]
 /z/bin/traffic_server(HdrHeap::coalesce_str_heaps(int)+0x34)[0x62e834]
 /z/bin/traffic_server(http_hdr_clone(HTTPHdrImpl*, HdrHeap*, 
 HdrHeap*)+0x8f)[0x62a54f]
 /z/bin/traffic_server(HttpTransactHeaders::copy_header_fields(HTTPHdr*, 
 HTTPHdr*, bool, long)+0x1ae)[0x5d08de]
 /z/bin/traffic_server(HttpTransact::build_request(HttpTransact::State*, 
 HTTPHdr*, HTTPHdr*, HTTPVersion)+0x5c)[0x5b280c]
 /z/bin/traffic_server(HttpTransact::HandleCacheOpenReadMiss(HttpTransact::State*)+0x2c8)[0x5c2ce8]
 /z/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
 (*)(HttpTransact::State*))+0x66)[0x58e356]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::do_hostdb_lookup()+0x27a)[0x58e84a]
 /z/bin/traffic_server(HttpSM::set_next_state()+0xd48)[0x5a1038]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/x3me_dscp.so(http_txn_hook(tsapi_cont*, TSEvent, 
 void*)+0x236)[0x2b626342b508]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_cache_open_read(int, 
 void*)+0x180)[0x59b070]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server(HttpCacheSM::state_cache_open_read(int, 
 void*)+0x173)[0x57bbb3]
 /z/bin/traffic_server(Cache::open_read(Continuation*, INK_MD5*, HTTPHdr*, 
 CacheLookupHttpConfig*, CacheFragType, char*, int)+0x616)[0x6d65a6]
 /z/bin/traffic_server(CacheProcessor::open_read(Continuation*, URL*, bool, 
 HTTPHdr*, CacheLookupHttpConfig*, long, CacheFragType)+0xb0)[0x6b1af0]
 /z/bin/traffic_server(HttpCacheSM::open_read(URL*, HTTPHdr*, 
 CacheLookupHttpConfig*, long)+0x83)[0x57c2d3]
 /z/bin/traffic_server(HttpSM::do_cache_lookup_and_read()+0xfb)[0x58baeb]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x888)[0x5a0b78]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x7e2)[0x5a0ad2]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/cacheurl.so(+0x17dc)[0x2b6263a477dc]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/tslua.so(+0x596f)[0x2b626363396f]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/stats_over_http.so(+0x1235)[0x2b6263228235]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_read_client_request_header(int, 
 void*)+0x22b)[0x59270b]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server[0x714a60]
 

[jira] [Commented] (TS-3032) FATAL: ats_malloc: couldn't allocate XXXXXX bytes

2014-08-28 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113762#comment-14113762
 ] 

Zhao Yongming commented on TS-3032:
---

any update? [~ngorchilov]

 FATAL: ats_malloc: couldn't allocate XX bytes
 -

 Key: TS-3032
 URL: https://issues.apache.org/jira/browse/TS-3032
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 5.0.1
Reporter: Nikolai Gorchilov
Assignee: Brian Geffon
  Labels: crash
 Fix For: 5.2.0


 ATS 5.0.1 under Unbuntu 12.04.4 running happily for days suddenly crashes due 
 to memory allocation issue. Happens once or twice a week.
 Server is having plenty of RAM - 128G - out of which 64G+ are free. Nothing 
 suspicious in dmesg.
 {noformat}
 FATAL: ats_malloc: couldn't allocate 155648 bytes
 /z/bin/traffic_server - STACK TRACE: 
 /z/lib/libtsutil.so.5(+0x1e837)[0x2b6251b3d837]
 /z/lib/libtsutil.so.5(ats_malloc+0x30)[0x2b6251b40c50]
 /z/bin/traffic_server(HdrHeap::coalesce_str_heaps(int)+0x34)[0x62e834]
 /z/bin/traffic_server(http_hdr_clone(HTTPHdrImpl*, HdrHeap*, 
 HdrHeap*)+0x8f)[0x62a54f]
 /z/bin/traffic_server(HttpTransactHeaders::copy_header_fields(HTTPHdr*, 
 HTTPHdr*, bool, long)+0x1ae)[0x5d08de]
 /z/bin/traffic_server(HttpTransact::build_request(HttpTransact::State*, 
 HTTPHdr*, HTTPHdr*, HTTPVersion)+0x5c)[0x5b280c]
 /z/bin/traffic_server(HttpTransact::HandleCacheOpenReadMiss(HttpTransact::State*)+0x2c8)[0x5c2ce8]
 /z/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
 (*)(HttpTransact::State*))+0x66)[0x58e356]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::do_hostdb_lookup()+0x27a)[0x58e84a]
 /z/bin/traffic_server(HttpSM::set_next_state()+0xd48)[0x5a1038]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/x3me_dscp.so(http_txn_hook(tsapi_cont*, TSEvent, 
 void*)+0x236)[0x2b626342b508]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_cache_open_read(int, 
 void*)+0x180)[0x59b070]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server(HttpCacheSM::state_cache_open_read(int, 
 void*)+0x173)[0x57bbb3]
 /z/bin/traffic_server(Cache::open_read(Continuation*, INK_MD5*, HTTPHdr*, 
 CacheLookupHttpConfig*, CacheFragType, char*, int)+0x616)[0x6d65a6]
 /z/bin/traffic_server(CacheProcessor::open_read(Continuation*, URL*, bool, 
 HTTPHdr*, CacheLookupHttpConfig*, long, CacheFragType)+0xb0)[0x6b1af0]
 /z/bin/traffic_server(HttpCacheSM::open_read(URL*, HTTPHdr*, 
 CacheLookupHttpConfig*, long)+0x83)[0x57c2d3]
 /z/bin/traffic_server(HttpSM::do_cache_lookup_and_read()+0xfb)[0x58baeb]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x888)[0x5a0b78]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x7e2)[0x5a0ad2]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/cacheurl.so(+0x17dc)[0x2b6263a477dc]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/tslua.so(+0x596f)[0x2b626363396f]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/stats_over_http.so(+0x1235)[0x2b6263228235]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_read_client_request_header(int, 
 void*)+0x22b)[0x59270b]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server[0x714a60]
 /z/bin/traffic_server(NetHandler::mainNetEvent(int, Event*)+0x1ed)[0x7077cd]
 /z/bin/traffic_server(EThread::process_event(Event*, int)+0x91)[0x736111]
 /z/bin/traffic_server(EThread::execute()+0x4fc)[0x736bcc]
 /z/bin/traffic_server[0x7353aa]
 

[jira] [Commented] (TS-3021) hosting.config vs volume.config

2014-08-26 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110898#comment-14110898
 ] 

Zhao Yongming commented on TS-3021:
---

do hosting and volume serve the same purpose? I don't think so. volume.config defines 
how the storage space is partitioned into volumes, and hosting.config assigns those 
volumes to hostnames. Unless you want to remove the control matcher, I would not 
suggest changing their file syntax.

the config files are the end-user interface, and we should discuss carefully 
before we take any action. Changes in the UI are far more disruptive than function renames in 
the code.
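
As a rough illustration of that split (hostnames and sizes below are made up for the example, not taken from any real deployment):
{code}
# volume.config -- split the raw cache storage into volumes
volume=1 scheme=http size=50%
volume=2 scheme=http size=50%

# hosting.config -- assign hostnames to those volumes
hostname=static.example.com volume=1
hostname=* volume=2
{code}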

 hosting.config vs volume.config
 ---

 Key: TS-3021
 URL: https://issues.apache.org/jira/browse/TS-3021
 Project: Traffic Server
  Issue Type: Bug
  Components: Configuration
Reporter: Igor Galić
 Fix For: sometime


 It appears to me that hosting.config and volume.config have a very similar 
 purpose / use-case. Perhaps it would be good to merge those two.
 ---
 n.b.: I'm not up-to-date on the plans re lua-config, but even then we'll need 
 to consider how to present it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (TS-3032) FATAL: ats_malloc: couldn't allocate XXXXXX bytes

2014-08-26 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110941#comment-14110941
 ] 

Zhao Yongming commented on TS-3032:
---

I'd suggest you get some tool to log the memory usage and other historical data. A 
tool we use very often when tracing issues like this is 
https://github.com/alibaba/tsar and 
https://blog.zymlinux.net/index.php/archives/251; any other tool that can collect 
comparable data is great.

when we dealt with TS-1006, I even made a spreadsheet to point out that the 
memory was a big problem; the more data, the better
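
If tsar is not an option, a minimal Python sketch of the same idea: sample /proc once a minute and append to a CSV so there is history to compare (the output path and sampling interval are arbitrary choices):
{code}
#!/usr/bin/env python
# Append a timestamp, system MemFree and traffic_server VmRSS to a CSV once a
# minute. Pass the traffic_server pid as the first command-line argument.
import sys
import time

def read_kv(path):
    with open(path) as f:
        return dict(line.split(':', 1) for line in f if ':' in line)

pid = sys.argv[1]
with open('/tmp/ats-mem-history.csv', 'a') as out:
    while True:
        meminfo = read_kv('/proc/meminfo')
        status = read_kv('/proc/%s/status' % pid)
        out.write('%d,%s,%s\n' % (time.time(),
                                  meminfo['MemFree'].strip(),
                                  status['VmRSS'].strip()))
        out.flush()
        time.sleep(60)
{code}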

 FATAL: ats_malloc: couldn't allocate XX bytes
 -

 Key: TS-3032
 URL: https://issues.apache.org/jira/browse/TS-3032
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 5.0.1
Reporter: Nikolai Gorchilov
Assignee: Brian Geffon
  Labels: crash
 Fix For: 5.2.0


 ATS 5.0.1 under Unbuntu 12.04.4 running happily for days suddenly crashes due 
 to memory allocation issue. Happens once or twice a week.
 Server is having plenty of RAM - 128G - out of which 64G+ are free. Nothing 
 suspicious in dmesg.
 {noformat}
 FATAL: ats_malloc: couldn't allocate 155648 bytes
 /z/bin/traffic_server - STACK TRACE: 
 /z/lib/libtsutil.so.5(+0x1e837)[0x2b6251b3d837]
 /z/lib/libtsutil.so.5(ats_malloc+0x30)[0x2b6251b40c50]
 /z/bin/traffic_server(HdrHeap::coalesce_str_heaps(int)+0x34)[0x62e834]
 /z/bin/traffic_server(http_hdr_clone(HTTPHdrImpl*, HdrHeap*, 
 HdrHeap*)+0x8f)[0x62a54f]
 /z/bin/traffic_server(HttpTransactHeaders::copy_header_fields(HTTPHdr*, 
 HTTPHdr*, bool, long)+0x1ae)[0x5d08de]
 /z/bin/traffic_server(HttpTransact::build_request(HttpTransact::State*, 
 HTTPHdr*, HTTPHdr*, HTTPVersion)+0x5c)[0x5b280c]
 /z/bin/traffic_server(HttpTransact::HandleCacheOpenReadMiss(HttpTransact::State*)+0x2c8)[0x5c2ce8]
 /z/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
 (*)(HttpTransact::State*))+0x66)[0x58e356]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::do_hostdb_lookup()+0x27a)[0x58e84a]
 /z/bin/traffic_server(HttpSM::set_next_state()+0xd48)[0x5a1038]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/x3me_dscp.so(http_txn_hook(tsapi_cont*, TSEvent, 
 void*)+0x236)[0x2b626342b508]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_cache_open_read(int, 
 void*)+0x180)[0x59b070]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server(HttpCacheSM::state_cache_open_read(int, 
 void*)+0x173)[0x57bbb3]
 /z/bin/traffic_server(Cache::open_read(Continuation*, INK_MD5*, HTTPHdr*, 
 CacheLookupHttpConfig*, CacheFragType, char*, int)+0x616)[0x6d65a6]
 /z/bin/traffic_server(CacheProcessor::open_read(Continuation*, URL*, bool, 
 HTTPHdr*, CacheLookupHttpConfig*, long, CacheFragType)+0xb0)[0x6b1af0]
 /z/bin/traffic_server(HttpCacheSM::open_read(URL*, HTTPHdr*, 
 CacheLookupHttpConfig*, long)+0x83)[0x57c2d3]
 /z/bin/traffic_server(HttpSM::do_cache_lookup_and_read()+0xfb)[0x58baeb]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x888)[0x5a0b78]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x7e2)[0x5a0ad2]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/cacheurl.so(+0x17dc)[0x2b6263a477dc]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/tslua.so(+0x596f)[0x2b626363396f]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/stats_over_http.so(+0x1235)[0x2b6263228235]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_read_client_request_header(int, 
 

[jira] [Commented] (TS-3032) FATAL: ats_malloc: couldn't allocate XXXXXX bytes

2014-08-25 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109133#comment-14109133
 ] 

Zhao Yongming commented on TS-3032:
---

nothing looks unusual; I think the 'Cached: 25975284 kB' is caused by 
the access logging. We need more information from ATS itself:
1. your RAM cache setting, proxy.config.cache.ram_cache.size (see the sketch below); if it is not 
set, please tell us your storage device usage and proxy.config.cache.min_average_object_size.
2. let us dump some memory details from within ATS itself: 
https://docs.trafficserver.apache.org/en/latest/sdk/troubleshooting-tips/debugging-memory-leaks.en.html

and we had better collect all of that data at the breaking point too :D
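
A records.config sketch of the settings being asked about, plus the memory-tracking knobs described on the page above. The values are illustrative, and the two tracking knob names are written from memory of that documentation, so verify them against your version:
{code}
# RAM cache size in bytes; the default of -1 derives it from the disk cache size
CONFIG proxy.config.cache.ram_cache.size INT 17179869184

# average object size used to size the cache directory (index)
CONFIG proxy.config.cache.min_average_object_size INT 8000

# memory tracking / periodic dump (names assumed from the linked doc)
CONFIG proxy.config.res_track_memory INT 1
CONFIG proxy.config.dump_mem_info_frequency INT 3600
{code}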

 FATAL: ats_malloc: couldn't allocate XX bytes
 -

 Key: TS-3032
 URL: https://issues.apache.org/jira/browse/TS-3032
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 5.0.1
Reporter: Nikolai Gorchilov
Assignee: Brian Geffon
  Labels: crash
 Fix For: 5.2.0


 ATS 5.0.1 under Unbuntu 12.04.4 running happily for days suddenly crashes due 
 to memory allocation issue. Happens once or twice a week.
 Server is having plenty of RAM - 128G - out of which 64G+ are free. Nothing 
 suspicious in dmesg.
 {noformat}
 FATAL: ats_malloc: couldn't allocate 155648 bytes
 /z/bin/traffic_server - STACK TRACE: 
 /z/lib/libtsutil.so.5(+0x1e837)[0x2b6251b3d837]
 /z/lib/libtsutil.so.5(ats_malloc+0x30)[0x2b6251b40c50]
 /z/bin/traffic_server(HdrHeap::coalesce_str_heaps(int)+0x34)[0x62e834]
 /z/bin/traffic_server(http_hdr_clone(HTTPHdrImpl*, HdrHeap*, 
 HdrHeap*)+0x8f)[0x62a54f]
 /z/bin/traffic_server(HttpTransactHeaders::copy_header_fields(HTTPHdr*, 
 HTTPHdr*, bool, long)+0x1ae)[0x5d08de]
 /z/bin/traffic_server(HttpTransact::build_request(HttpTransact::State*, 
 HTTPHdr*, HTTPHdr*, HTTPVersion)+0x5c)[0x5b280c]
 /z/bin/traffic_server(HttpTransact::HandleCacheOpenReadMiss(HttpTransact::State*)+0x2c8)[0x5c2ce8]
 /z/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
 (*)(HttpTransact::State*))+0x66)[0x58e356]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::do_hostdb_lookup()+0x27a)[0x58e84a]
 /z/bin/traffic_server(HttpSM::set_next_state()+0xd48)[0x5a1038]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/x3me_dscp.so(http_txn_hook(tsapi_cont*, TSEvent, 
 void*)+0x236)[0x2b626342b508]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_cache_open_read(int, 
 void*)+0x180)[0x59b070]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server(HttpCacheSM::state_cache_open_read(int, 
 void*)+0x173)[0x57bbb3]
 /z/bin/traffic_server(Cache::open_read(Continuation*, INK_MD5*, HTTPHdr*, 
 CacheLookupHttpConfig*, CacheFragType, char*, int)+0x616)[0x6d65a6]
 /z/bin/traffic_server(CacheProcessor::open_read(Continuation*, URL*, bool, 
 HTTPHdr*, CacheLookupHttpConfig*, long, CacheFragType)+0xb0)[0x6b1af0]
 /z/bin/traffic_server(HttpCacheSM::open_read(URL*, HTTPHdr*, 
 CacheLookupHttpConfig*, long)+0x83)[0x57c2d3]
 /z/bin/traffic_server(HttpSM::do_cache_lookup_and_read()+0xfb)[0x58baeb]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x888)[0x5a0b78]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x7e2)[0x5a0ad2]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/cacheurl.so(+0x17dc)[0x2b6263a477dc]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/tslua.so(+0x596f)[0x2b626363396f]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/stats_over_http.so(+0x1235)[0x2b6263228235]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 

[jira] [Commented] (TS-3032) FATAL: ats_malloc: couldn't allocate XXXXXX bytes

2014-08-25 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109214#comment-14109214
 ] 

Zhao Yongming commented on TS-3032:
---

well, you have 7368964608 bytes allocated in the freelist and 4893378608 bytes in use, which 
is about 66% in use, with roughly 8000 active connections. None of that sounds too bad, except 
that ~7G is far smaller than the ~19G from the pid summary. Why?
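
The percentages here are simple ratios of the figures above; a quick check of the arithmetic and of the unexplained gap to the ~19G process figure:
{code}
# Verify the ratio quoted above and the gap to the pid summary.
freelist_allocated = 7368964608
freelist_in_use = 4893378608
rss_reported = 19e9   # "~19G from the pid summary", approximate

print("in use: %.0f%% of the freelist" % (100.0 * freelist_in_use / freelist_allocated))  # ~66%
print("gap to RSS: ~%.1f GB" % ((rss_reported - freelist_allocated) / 1e9))               # ~11.6 GB unexplained
{code}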


 FATAL: ats_malloc: couldn't allocate XX bytes
 -

 Key: TS-3032
 URL: https://issues.apache.org/jira/browse/TS-3032
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 5.0.1
Reporter: Nikolai Gorchilov
Assignee: Brian Geffon
  Labels: crash
 Fix For: 5.2.0


 ATS 5.0.1 under Unbuntu 12.04.4 running happily for days suddenly crashes due 
 to memory allocation issue. Happens once or twice a week.
 Server is having plenty of RAM - 128G - out of which 64G+ are free. Nothing 
 suspicious in dmesg.
 {noformat}
 FATAL: ats_malloc: couldn't allocate 155648 bytes
 /z/bin/traffic_server - STACK TRACE: 
 /z/lib/libtsutil.so.5(+0x1e837)[0x2b6251b3d837]
 /z/lib/libtsutil.so.5(ats_malloc+0x30)[0x2b6251b40c50]
 /z/bin/traffic_server(HdrHeap::coalesce_str_heaps(int)+0x34)[0x62e834]
 /z/bin/traffic_server(http_hdr_clone(HTTPHdrImpl*, HdrHeap*, 
 HdrHeap*)+0x8f)[0x62a54f]
 /z/bin/traffic_server(HttpTransactHeaders::copy_header_fields(HTTPHdr*, 
 HTTPHdr*, bool, long)+0x1ae)[0x5d08de]
 /z/bin/traffic_server(HttpTransact::build_request(HttpTransact::State*, 
 HTTPHdr*, HTTPHdr*, HTTPVersion)+0x5c)[0x5b280c]
 /z/bin/traffic_server(HttpTransact::HandleCacheOpenReadMiss(HttpTransact::State*)+0x2c8)[0x5c2ce8]
 /z/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
 (*)(HttpTransact::State*))+0x66)[0x58e356]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::do_hostdb_lookup()+0x27a)[0x58e84a]
 /z/bin/traffic_server(HttpSM::set_next_state()+0xd48)[0x5a1038]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/x3me_dscp.so(http_txn_hook(tsapi_cont*, TSEvent, 
 void*)+0x236)[0x2b626342b508]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_cache_open_read(int, 
 void*)+0x180)[0x59b070]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server(HttpCacheSM::state_cache_open_read(int, 
 void*)+0x173)[0x57bbb3]
 /z/bin/traffic_server(Cache::open_read(Continuation*, INK_MD5*, HTTPHdr*, 
 CacheLookupHttpConfig*, CacheFragType, char*, int)+0x616)[0x6d65a6]
 /z/bin/traffic_server(CacheProcessor::open_read(Continuation*, URL*, bool, 
 HTTPHdr*, CacheLookupHttpConfig*, long, CacheFragType)+0xb0)[0x6b1af0]
 /z/bin/traffic_server(HttpCacheSM::open_read(URL*, HTTPHdr*, 
 CacheLookupHttpConfig*, long)+0x83)[0x57c2d3]
 /z/bin/traffic_server(HttpSM::do_cache_lookup_and_read()+0xfb)[0x58baeb]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x888)[0x5a0b78]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x7e2)[0x5a0ad2]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/cacheurl.so(+0x17dc)[0x2b6263a477dc]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/tslua.so(+0x596f)[0x2b626363396f]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/stats_over_http.so(+0x1235)[0x2b6263228235]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_read_client_request_header(int, 
 void*)+0x22b)[0x59270b]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server[0x714a60]
 /z/bin/traffic_server(NetHandler::mainNetEvent(int, 

[jira] [Commented] (TS-3032) FATAL: ats_malloc: couldn't allocate XXXXXX bytes

2014-08-25 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109216#comment-14109216
 ] 

Zhao Yongming commented on TS-3032:
---

I'd like you to keep collecting that data for some more days, at the same time of day (to get 
the same load) if you can, to figure out which component is wasting the most 
memory.

 FATAL: ats_malloc: couldn't allocate XX bytes
 -

 Key: TS-3032
 URL: https://issues.apache.org/jira/browse/TS-3032
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 5.0.1
Reporter: Nikolai Gorchilov
Assignee: Brian Geffon
  Labels: crash
 Fix For: 5.2.0


 ATS 5.0.1 under Unbuntu 12.04.4 running happily for days suddenly crashes due 
 to memory allocation issue. Happens once or twice a week.
 Server is having plenty of RAM - 128G - out of which 64G+ are free. Nothing 
 suspicious in dmesg.
 {noformat}
 FATAL: ats_malloc: couldn't allocate 155648 bytes
 /z/bin/traffic_server - STACK TRACE: 
 /z/lib/libtsutil.so.5(+0x1e837)[0x2b6251b3d837]
 /z/lib/libtsutil.so.5(ats_malloc+0x30)[0x2b6251b40c50]
 /z/bin/traffic_server(HdrHeap::coalesce_str_heaps(int)+0x34)[0x62e834]
 /z/bin/traffic_server(http_hdr_clone(HTTPHdrImpl*, HdrHeap*, 
 HdrHeap*)+0x8f)[0x62a54f]
 /z/bin/traffic_server(HttpTransactHeaders::copy_header_fields(HTTPHdr*, 
 HTTPHdr*, bool, long)+0x1ae)[0x5d08de]
 /z/bin/traffic_server(HttpTransact::build_request(HttpTransact::State*, 
 HTTPHdr*, HTTPHdr*, HTTPVersion)+0x5c)[0x5b280c]
 /z/bin/traffic_server(HttpTransact::HandleCacheOpenReadMiss(HttpTransact::State*)+0x2c8)[0x5c2ce8]
 /z/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
 (*)(HttpTransact::State*))+0x66)[0x58e356]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::do_hostdb_lookup()+0x27a)[0x58e84a]
 /z/bin/traffic_server(HttpSM::set_next_state()+0xd48)[0x5a1038]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/x3me_dscp.so(http_txn_hook(tsapi_cont*, TSEvent, 
 void*)+0x236)[0x2b626342b508]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_cache_open_read(int, 
 void*)+0x180)[0x59b070]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server(HttpCacheSM::state_cache_open_read(int, 
 void*)+0x173)[0x57bbb3]
 /z/bin/traffic_server(Cache::open_read(Continuation*, INK_MD5*, HTTPHdr*, 
 CacheLookupHttpConfig*, CacheFragType, char*, int)+0x616)[0x6d65a6]
 /z/bin/traffic_server(CacheProcessor::open_read(Continuation*, URL*, bool, 
 HTTPHdr*, CacheLookupHttpConfig*, long, CacheFragType)+0xb0)[0x6b1af0]
 /z/bin/traffic_server(HttpCacheSM::open_read(URL*, HTTPHdr*, 
 CacheLookupHttpConfig*, long)+0x83)[0x57c2d3]
 /z/bin/traffic_server(HttpSM::do_cache_lookup_and_read()+0xfb)[0x58baeb]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x888)[0x5a0b78]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x7e2)[0x5a0ad2]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/cacheurl.so(+0x17dc)[0x2b6263a477dc]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/tslua.so(+0x596f)[0x2b626363396f]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/stats_over_http.so(+0x1235)[0x2b6263228235]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_read_client_request_header(int, 
 void*)+0x22b)[0x59270b]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server[0x714a60]
 /z/bin/traffic_server(NetHandler::mainNetEvent(int, Event*)+0x1ed)[0x7077cd]
 

[jira] [Commented] (TS-3032) FATAL: ats_malloc: couldn't allocate XXXXXX bytes

2014-08-25 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109225#comment-14109225
 ] 

Zhao Yongming commented on TS-3032:
---

and if you have more than one box with this issue, please consider testing one 
box with the following tweaks (see the records.config sketch below):
1. reinstall with the reclaimable freelist enabled, and make sure reclaim is enabled 
in records.config
2. use the standard LRU: set proxy.config.cache.ram_cache.algorithm to 1

and if you have more systems that can run a release test, we can identify which 
release is proven to be correct. :D
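
A records.config sketch of that tweak. proxy.config.cache.ram_cache.algorithm is a standard setting (1 selects LRU); the reclaim-related knob names are written from memory of the reclaimable-freelist patch (built with --enable-reclaimable-freelist) and are assumptions to verify against your build:
{code}
# 1 = plain LRU ram cache instead of the default CLFUS
CONFIG proxy.config.cache.ram_cache.algorithm INT 1

# reclaimable freelist (only meaningful when built with --enable-reclaimable-freelist;
# knob names below are assumptions, check the defaults shipped with your build)
CONFIG proxy.config.allocator.enable_reclaim INT 1
CONFIG proxy.config.allocator.reclaim_factor FLOAT 0.3
{code}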

 FATAL: ats_malloc: couldn't allocate XX bytes
 -

 Key: TS-3032
 URL: https://issues.apache.org/jira/browse/TS-3032
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 5.0.1
Reporter: Nikolai Gorchilov
Assignee: Brian Geffon
  Labels: crash
 Fix For: 5.2.0


 ATS 5.0.1 under Unbuntu 12.04.4 running happily for days suddenly crashes due 
 to memory allocation issue. Happens once or twice a week.
 Server is having plenty of RAM - 128G - out of which 64G+ are free. Nothing 
 suspicious in dmesg.
 {noformat}
 FATAL: ats_malloc: couldn't allocate 155648 bytes
 /z/bin/traffic_server - STACK TRACE: 
 /z/lib/libtsutil.so.5(+0x1e837)[0x2b6251b3d837]
 /z/lib/libtsutil.so.5(ats_malloc+0x30)[0x2b6251b40c50]
 /z/bin/traffic_server(HdrHeap::coalesce_str_heaps(int)+0x34)[0x62e834]
 /z/bin/traffic_server(http_hdr_clone(HTTPHdrImpl*, HdrHeap*, 
 HdrHeap*)+0x8f)[0x62a54f]
 /z/bin/traffic_server(HttpTransactHeaders::copy_header_fields(HTTPHdr*, 
 HTTPHdr*, bool, long)+0x1ae)[0x5d08de]
 /z/bin/traffic_server(HttpTransact::build_request(HttpTransact::State*, 
 HTTPHdr*, HTTPHdr*, HTTPVersion)+0x5c)[0x5b280c]
 /z/bin/traffic_server(HttpTransact::HandleCacheOpenReadMiss(HttpTransact::State*)+0x2c8)[0x5c2ce8]
 /z/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
 (*)(HttpTransact::State*))+0x66)[0x58e356]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::do_hostdb_lookup()+0x27a)[0x58e84a]
 /z/bin/traffic_server(HttpSM::set_next_state()+0xd48)[0x5a1038]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/x3me_dscp.so(http_txn_hook(tsapi_cont*, TSEvent, 
 void*)+0x236)[0x2b626342b508]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_cache_open_read(int, 
 void*)+0x180)[0x59b070]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server(HttpCacheSM::state_cache_open_read(int, 
 void*)+0x173)[0x57bbb3]
 /z/bin/traffic_server(Cache::open_read(Continuation*, INK_MD5*, HTTPHdr*, 
 CacheLookupHttpConfig*, CacheFragType, char*, int)+0x616)[0x6d65a6]
 /z/bin/traffic_server(CacheProcessor::open_read(Continuation*, URL*, bool, 
 HTTPHdr*, CacheLookupHttpConfig*, long, CacheFragType)+0xb0)[0x6b1af0]
 /z/bin/traffic_server(HttpCacheSM::open_read(URL*, HTTPHdr*, 
 CacheLookupHttpConfig*, long)+0x83)[0x57c2d3]
 /z/bin/traffic_server(HttpSM::do_cache_lookup_and_read()+0xfb)[0x58baeb]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x888)[0x5a0b78]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x7e2)[0x5a0ad2]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/cacheurl.so(+0x17dc)[0x2b6263a477dc]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/tslua.so(+0x596f)[0x2b626363396f]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/stats_over_http.so(+0x1235)[0x2b6263228235]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_read_client_request_header(int, 
 

[jira] [Commented] (TS-3032) FATAL: ats_malloc: couldn't allocate XXXXXX bytes

2014-08-24 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108410#comment-14108410
 ] 

Zhao Yongming commented on TS-3032:
---

I don't know of anyone with a success story on a BIG-memory system; I'd like to hear 
one if there is any.

for the problem you have, please attach some more data, such as (a small collection script is sketched below):
1. /proc/meminfo
2. the traffic_server process status: /proc/<pid>/status
3. more system logs related to allocation and memory, such as dmesg and syslog

and please tell us the configure options used when building the binary too. Hopefully 
that will help us inspect the problem.
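
A small Python sketch that gathers exactly those three items into one file to attach here (the output path is arbitrary; run it as a user that can read the files and call dmesg):
{code}
#!/usr/bin/env python
# Collect /proc/meminfo, /proc/<pid>/status and the tail of dmesg into a
# single report. Pass the traffic_server pid as the first argument.
import subprocess
import sys

pid = sys.argv[1]
with open('/tmp/ats-mem-report.txt', 'w') as out:
    for path in ('/proc/meminfo', '/proc/%s/status' % pid):
        out.write('===== %s =====\n' % path)
        out.write(open(path).read())
    out.write('===== dmesg (tail) =====\n')
    dmesg = subprocess.check_output(['dmesg']).decode('utf-8', 'replace')
    out.write(dmesg[-20000:])
{code}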

 FATAL: ats_malloc: couldn't allocate XX bytes
 -

 Key: TS-3032
 URL: https://issues.apache.org/jira/browse/TS-3032
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 5.0.1
Reporter: Nikolai Gorchilov
Assignee: Brian Geffon
  Labels: crash
 Fix For: 5.2.0


 ATS 5.0.1 under Unbuntu 12.04.4 running happily for days suddenly crashes due 
 to memory allocation issue. Happens once or twice a week.
 Server is having plenty of RAM - 128G - out of which 64G+ are free. Nothing 
 suspicious in dmesg.
 {noformat}
 FATAL: ats_malloc: couldn't allocate 155648 bytes
 /z/bin/traffic_server - STACK TRACE: 
 /z/lib/libtsutil.so.5(+0x1e837)[0x2b6251b3d837]
 /z/lib/libtsutil.so.5(ats_malloc+0x30)[0x2b6251b40c50]
 /z/bin/traffic_server(HdrHeap::coalesce_str_heaps(int)+0x34)[0x62e834]
 /z/bin/traffic_server(http_hdr_clone(HTTPHdrImpl*, HdrHeap*, 
 HdrHeap*)+0x8f)[0x62a54f]
 /z/bin/traffic_server(HttpTransactHeaders::copy_header_fields(HTTPHdr*, 
 HTTPHdr*, bool, long)+0x1ae)[0x5d08de]
 /z/bin/traffic_server(HttpTransact::build_request(HttpTransact::State*, 
 HTTPHdr*, HTTPHdr*, HTTPVersion)+0x5c)[0x5b280c]
 /z/bin/traffic_server(HttpTransact::HandleCacheOpenReadMiss(HttpTransact::State*)+0x2c8)[0x5c2ce8]
 /z/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
 (*)(HttpTransact::State*))+0x66)[0x58e356]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::do_hostdb_lookup()+0x27a)[0x58e84a]
 /z/bin/traffic_server(HttpSM::set_next_state()+0xd48)[0x5a1038]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/x3me_dscp.so(http_txn_hook(tsapi_cont*, TSEvent, 
 void*)+0x236)[0x2b626342b508]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_cache_open_read(int, 
 void*)+0x180)[0x59b070]
 /z/bin/traffic_server(HttpSM::main_handler(int, void*)+0xd8)[0x59ad98]
 /z/bin/traffic_server(HttpCacheSM::state_cache_open_read(int, 
 void*)+0x173)[0x57bbb3]
 /z/bin/traffic_server(Cache::open_read(Continuation*, INK_MD5*, HTTPHdr*, 
 CacheLookupHttpConfig*, CacheFragType, char*, int)+0x616)[0x6d65a6]
 /z/bin/traffic_server(CacheProcessor::open_read(Continuation*, URL*, bool, 
 HTTPHdr*, CacheLookupHttpConfig*, long, CacheFragType)+0xb0)[0x6b1af0]
 /z/bin/traffic_server(HttpCacheSM::open_read(URL*, HTTPHdr*, 
 CacheLookupHttpConfig*, long)+0x83)[0x57c2d3]
 /z/bin/traffic_server(HttpSM::do_cache_lookup_and_read()+0xfb)[0x58baeb]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x888)[0x5a0b78]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x7e2)[0x5a0ad2]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x343)[0x599c03]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/cacheurl.so(+0x17dc)[0x2b6263a477dc]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/tslua.so(+0x596f)[0x2b626363396f]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::state_api_callback(int, void*)+0x8a)[0x59c81a]
 /z/bin/traffic_server(TSHttpTxnReenable+0x141)[0x4caa51]
 /z/lib/plugins/stats_over_http.so(+0x1235)[0x2b6263228235]
 /z/bin/traffic_server(HttpSM::state_api_callout(int, void*)+0x102)[0x5999c2]
 /z/bin/traffic_server(HttpSM::set_next_state()+0x238)[0x5a0528]
 /z/bin/traffic_server(HttpSM::state_read_client_request_header(int, 
 

[jira] [Assigned] (TS-2966) Update Feature not working

2014-08-17 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming reassigned TS-2966:
-

Assignee: Zhao Yongming

 Update Feature not working
 --

 Key: TS-2966
 URL: https://issues.apache.org/jira/browse/TS-2966
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache, Core
Reporter: Thomas Stinner
Assignee: Zhao Yongming
 Fix For: sometime

 Attachments: traffic.out, trafficserver.patch


 I had a problem using the update feature. I received a SegFault in 
 do_host_db_lookup, which was caused by accessing ua_session when it was not 
 initialized (see attached patch). 
 After fixing that I no longer get a SegFault, but the files that are 
 retrieved by recursion are not placed into the cache; they are requested on 
 every schedule. 
 Only the starting file is placed correctly into the cache. 
 When retrieving the files with a client, caching works as expected, so I 
 don't think this is a configuration error.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (TS-2895) memory allocation failure

2014-08-17 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099899#comment-14099899
 ] 

Zhao Yongming commented on TS-2895:
---

[~wangjun] any update on this issue?

 memory allocation failure
 -

 Key: TS-2895
 URL: https://issues.apache.org/jira/browse/TS-2895
 Project: Traffic Server
  Issue Type: Test
  Components: Cache, Clustering
Reporter: wangjun
Assignee: Zhao Yongming
  Labels: crash
 Fix For: sometime

 Attachments: screenshot-1.jpg, screenshot-2.jpg


 In this version (ATS 4.0.2), I encountered a bug (memory allocation failure). 
 See the system log in the screenshot below (screenshot-1.jpg).
 See the program logs in the screenshot below (screenshot-2.jpg).
 Please help me, thank you.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (TS-2903) Connections are leaked at about 1000 per hour

2014-07-01 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049568#comment-14049568
 ] 

Zhao Yongming commented on TS-2903:
---

well, 3.2.5 is definitely a very old version; can you test this on the git master 
version?

and if you find that connections are leaking, you may need to check why the 
HttpSM is hanging. Please use the {http} page of the http_ui to get detailed 
information; it is the best tool for this issue.

good luck
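
For reference, a sketch of how the {http} page is usually exposed. proxy.config.http_ui_enabled is a real setting, but the remap hostname/path and the exact mapping form below are assumptions to check against the http_ui documentation for your version:
{code}
# records.config: enable the internal UI pages
CONFIG proxy.config.http_ui_enabled INT 3

# remap.config: expose the HttpSM list ({http}) under a URL of your choosing
# (hostname and path here are illustrative)
map http://debug.example.com/_httpsm/ http://{http}
{code}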

 Connections are leaked at about 1000 per hour
 -

 Key: TS-2903
 URL: https://issues.apache.org/jira/browse/TS-2903
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Puneet Dhaliwal

 For version 3.2.5, with keep alive on for in/out and post out, connections 
 were leaked at about 1000 per hour. The limit of 
 proxy.config.net.connections_throttle was reached at 30k and at 60k after 
 enough time.
 CONFIG proxy.config.http.keep_alive_post_out INT 1
 CONFIG proxy.config.http.keep_alive_enabled_in INT 1
 CONFIG proxy.config.http.keep_alive_enabled_out INT 1
 This might also be happening for 4.2.1 and 5.0.
 Please let me know if there is any further information required.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (TS-2796) Leaking CacheVConnections

2014-05-22 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006172#comment-14006172
 ] 

Zhao Yongming commented on TS-2796:
---

any update on this issue? Do you need me to push on the code diffing on Taobao's 
side?

 Leaking CacheVConnections
 -

 Key: TS-2796
 URL: https://issues.apache.org/jira/browse/TS-2796
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Affects Versions: 4.0.2, 4.2.1, 5.0.0
Reporter: Brian Geffon
Assignee: Brian Geffon
  Labels: yahoo
 Fix For: 5.0.0


 It appears there is a memory leak in 4.0.x, 4.2.x, and master leaking 
 CacheVConnections resulting in IOBufAllocator leaking also, here is an 
 example:
  allocated  |in-use  | type size  |   free list name
67108864 |  0 |2097152 | 
 memory/ioBufAllocator[14]
67108864 |   19922944 |1048576 | 
 memory/ioBufAllocator[13]
  4798283776 |   14155776 | 524288 | 
 memory/ioBufAllocator[12]
  7281311744 |   98304000 | 262144 | 
 memory/ioBufAllocator[11]
  1115684864 |  148242432 | 131072 | 
 memory/ioBufAllocator[10]
  497544 |  379977728 |  65536 | 
 memory/ioBufAllocator[9]
  9902751744 | 5223546880 |  32768 | 
 memory/ioBufAllocator[8]
 14762901504 |14762311680 |  16384 | 
 memory/ioBufAllocator[7]
  6558056448 | 6557859840 |   8192 | 
 memory/ioBufAllocator[6]
41418752 |   30502912 |   4096 | 
 memory/ioBufAllocator[5]
  524288 |  0 |   2048 | 
 memory/ioBufAllocator[4]
   0 |  0 |   1024 | 
 memory/ioBufAllocator[3]
   0 |  0 |512 | 
 memory/ioBufAllocator[2]
   32768 |  0 |256 | 
 memory/ioBufAllocator[1]
   0 |  0 |128 | 
 memory/ioBufAllocator[0]
 2138112 |2124192 |928 | 
 memory/cacheVConnection
 [~bcall] has observed this issue on 4.0.x, and we have observed this on 4.2.x.
 The code path in CacheVC that is allocating the IoBuffers is 
 memory/IOBuffer/Cache.cc:2603; however, that's just the observable symptom 
 the real issue here is the leaking CacheVC.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (TS-2796) Leaking CacheVConnections

2014-05-15 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997180#comment-13997180
 ] 

Zhao Yongming commented on TS-2796:
---

hmm, from what I know, many people have this memory issue, and it is a malloc 
and GC issue, but they just don't realize it. That is why I am pushing to have the 
reclaimable freelist enabled by default. Why not test it, if you can verify the 
result within hours?

and please take a look at 'allocated' minus 'in-use' for the 
memory/ioBufAllocator rows where the type size is > 32K; if you sum those up, that is the memory you 
are leaking. The same as TS-1006.
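
As a worked example using the buckets quoted in this issue (the 65536-byte row is skipped because its 'allocated' figure looks truncated in the archived dump):
{code}
# Sum "allocated - in-use" for the ioBufAllocator buckets larger than 32 KiB,
# using the figures from the memory dump quoted in this issue.
buckets = {  # type size : (allocated, in_use)
    2097152: (67108864, 0),
    1048576: (67108864, 19922944),
    524288:  (4798283776, 14155776),
    262144:  (7281311744, 98304000),
    131072:  (1115684864, 148242432),
}
idle = sum(alloc - used for alloc, used in buckets.values())
print("idle in large buckets: ~%.1f GB" % (idle / 1e9))   # ~13.0 GB
{code}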

 Leaking CacheVConnections
 -

 Key: TS-2796
 URL: https://issues.apache.org/jira/browse/TS-2796
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Affects Versions: 4.0.2, 4.2.1, 5.0.0
Reporter: Brian Geffon
  Labels: yahoo
 Fix For: 5.0.0


 It appears there is a memory leak in 4.0.x, 4.2.x, and master leaking 
 CacheVConnections resulting in IOBufAllocator leaking also, here is an 
 example:
  allocated  |in-use  | type size  |   free list name
67108864 |  0 |2097152 | 
 memory/ioBufAllocator[14]
67108864 |   19922944 |1048576 | 
 memory/ioBufAllocator[13]
  4798283776 |   14155776 | 524288 | 
 memory/ioBufAllocator[12]
  7281311744 |   98304000 | 262144 | 
 memory/ioBufAllocator[11]
  1115684864 |  148242432 | 131072 | 
 memory/ioBufAllocator[10]
  497544 |  379977728 |  65536 | 
 memory/ioBufAllocator[9]
  9902751744 | 5223546880 |  32768 | 
 memory/ioBufAllocator[8]
 14762901504 |14762311680 |  16384 | 
 memory/ioBufAllocator[7]
  6558056448 | 6557859840 |   8192 | 
 memory/ioBufAllocator[6]
41418752 |   30502912 |   4096 | 
 memory/ioBufAllocator[5]
  524288 |  0 |   2048 | 
 memory/ioBufAllocator[4]
   0 |  0 |   1024 | 
 memory/ioBufAllocator[3]
   0 |  0 |512 | 
 memory/ioBufAllocator[2]
   32768 |  0 |256 | 
 memory/ioBufAllocator[1]
   0 |  0 |128 | 
 memory/ioBufAllocator[0]
 2138112 |2124192 |928 | 
 memory/cacheVConnection
 [~bcall] has observed this issue on 4.0.x, and we have observed this on 4.2.x.
 The code path in CacheVC that is allocating the IoBuffers is 
 memory/IOBuffer/Cache.cc:2603; however, that's just the observable symptom 
 the real issue here is the leaking CacheVC.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (TS-2796) Leaking CacheVConnections

2014-05-14 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997221#comment-13997221
 ] 

Zhao Yongming commented on TS-2796:
---

yeah, I know you may think the reclaimable freelist is hard to manage and evil 
in the code, but if we can confirm it helps in this case, I'd like you to think 
about enabling it by default. We really should not waste so much time here, and we would win 
back some less experienced users who may otherwise think that we have a big 
memory problem in the core.

I'd push for whatever other enhancement you'd like in order to get it enabled by default. :D

 Leaking CacheVConnections
 -

 Key: TS-2796
 URL: https://issues.apache.org/jira/browse/TS-2796
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Affects Versions: 4.0.2, 4.2.1, 5.0.0
Reporter: Brian Geffon
  Labels: yahoo
 Fix For: 5.0.0


 It appears there is a memory leak in 4.0.x, 4.2.x, and master leaking 
 CacheVConnections resulting in IOBufAllocator leaking also, here is an 
 example:
  allocated  |in-use  | type size  |   free list name
67108864 |  0 |2097152 | 
 memory/ioBufAllocator[14]
67108864 |   19922944 |1048576 | 
 memory/ioBufAllocator[13]
  4798283776 |   14155776 | 524288 | 
 memory/ioBufAllocator[12]
  7281311744 |   98304000 | 262144 | 
 memory/ioBufAllocator[11]
  1115684864 |  148242432 | 131072 | 
 memory/ioBufAllocator[10]
  497544 |  379977728 |  65536 | 
 memory/ioBufAllocator[9]
  9902751744 | 5223546880 |  32768 | 
 memory/ioBufAllocator[8]
 14762901504 |14762311680 |  16384 | 
 memory/ioBufAllocator[7]
  6558056448 | 6557859840 |   8192 | 
 memory/ioBufAllocator[6]
41418752 |   30502912 |   4096 | 
 memory/ioBufAllocator[5]
  524288 |  0 |   2048 | 
 memory/ioBufAllocator[4]
   0 |  0 |   1024 | 
 memory/ioBufAllocator[3]
   0 |  0 |512 | 
 memory/ioBufAllocator[2]
   32768 |  0 |256 | 
 memory/ioBufAllocator[1]
   0 |  0 |128 | 
 memory/ioBufAllocator[0]
 2138112 |2124192 |928 | 
 memory/cacheVConnection
 [~bcall] has observed this issue on 4.0.x, and we have observed this on 4.2.x.
 The code path in CacheVC that is allocating the IoBuffers is 
 memory/IOBuffer/Cache.cc:2603; however, that's just the observable symptom 
 the real issue here is the leaking CacheVC.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (TS-2796) Leaking CacheVConnections

2014-05-12 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995956#comment-13995956
 ] 

Zhao Yongming commented on TS-2796:
---

I don't know exactly what this issue is after; if we are focusing on the last line of 
the memory dump, memory/cacheVConnection, please ignore my comment.

most of the memory leaking in your memory dump is in the 
memory/ioBufAllocator rows with type size > 32K. From what I can guess, you are using the 
default CLFUS RAM cache algorithm, which will produce this effect when the 
system has been running for a long time: the big objects in memory are replaced by the 
smaller ones, but the memory used by the big objects is not yet released back to the system.

and that issue was already addressed in TS-1006 and resulted in the reclaimable 
freelist memory management code, which already shipped in the 4.0 releases, with a 
configure option to enable it.

so, if this is the cause, please help verify whether your problem is still there 
with the reclaimable freelist enabled; you may test the simple LRU algorithm for 
the RAM cache too.

thanks


 Leaking CacheVConnections
 -

 Key: TS-2796
 URL: https://issues.apache.org/jira/browse/TS-2796
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Affects Versions: 4.0.2, 4.2.1, 5.0.0
Reporter: Brian Geffon
  Labels: yahoo
 Fix For: 5.0.0


 It appears there is a memory leak in 4.0.x, 4.2.x, and master leaking 
 CacheVConnections resulting in IOBufAllocator leaking also, here is an 
 example:
  allocated  |in-use  | type size  |   free list name
67108864 |  0 |2097152 | 
 memory/ioBufAllocator[14]
67108864 |   19922944 |1048576 | 
 memory/ioBufAllocator[13]
  4798283776 |   14155776 | 524288 | 
 memory/ioBufAllocator[12]
  7281311744 |   98304000 | 262144 | 
 memory/ioBufAllocator[11]
  1115684864 |  148242432 | 131072 | 
 memory/ioBufAllocator[10]
  497544 |  379977728 |  65536 | 
 memory/ioBufAllocator[9]
  9902751744 | 5223546880 |  32768 | 
 memory/ioBufAllocator[8]
 14762901504 |14762311680 |  16384 | 
 memory/ioBufAllocator[7]
  6558056448 | 6557859840 |   8192 | 
 memory/ioBufAllocator[6]
41418752 |   30502912 |   4096 | 
 memory/ioBufAllocator[5]
  524288 |  0 |   2048 | 
 memory/ioBufAllocator[4]
   0 |  0 |   1024 | 
 memory/ioBufAllocator[3]
   0 |  0 |512 | 
 memory/ioBufAllocator[2]
   32768 |  0 |256 | 
 memory/ioBufAllocator[1]
   0 |  0 |128 | 
 memory/ioBufAllocator[0]
 2138112 |2124192 |928 | 
 memory/cacheVConnection
 [~bcall] has observed this issue on 4.0.x, and we have observed this on 4.2.x.
 The code path in CacheVC that is allocating the IoBuffers is 
 memory/IOBuffer/Cache.cc:2603; however, that's just the observable symptom 
 the real issue here is the leaking CacheVC.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (TS-2796) Leaking CacheVConnections

2014-05-12 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995961#comment-13995961
 ] 

Zhao Yongming commented on TS-2796:
---

and if the reclaimable freelist helps you, please help me promote it to be 
enabled by default

 Leaking CacheVConnections
 -

 Key: TS-2796
 URL: https://issues.apache.org/jira/browse/TS-2796
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Affects Versions: 4.0.2, 4.2.1, 5.0.0
Reporter: Brian Geffon
  Labels: yahoo
 Fix For: 5.0.0


 It appears there is a memory leak in 4.0.x, 4.2.x, and master leaking 
 CacheVConnections resulting in IOBufAllocator leaking also, here is an 
 example:
  allocated  |in-use  | type size  |   free list name
67108864 |  0 |2097152 | 
 memory/ioBufAllocator[14]
67108864 |   19922944 |1048576 | 
 memory/ioBufAllocator[13]
  4798283776 |   14155776 | 524288 | 
 memory/ioBufAllocator[12]
  7281311744 |   98304000 | 262144 | 
 memory/ioBufAllocator[11]
  1115684864 |  148242432 | 131072 | 
 memory/ioBufAllocator[10]
  497544 |  379977728 |  65536 | 
 memory/ioBufAllocator[9]
  9902751744 | 5223546880 |  32768 | 
 memory/ioBufAllocator[8]
 14762901504 |14762311680 |  16384 | 
 memory/ioBufAllocator[7]
  6558056448 | 6557859840 |   8192 | 
 memory/ioBufAllocator[6]
41418752 |   30502912 |   4096 | 
 memory/ioBufAllocator[5]
  524288 |  0 |   2048 | 
 memory/ioBufAllocator[4]
   0 |  0 |   1024 | 
 memory/ioBufAllocator[3]
   0 |  0 |512 | 
 memory/ioBufAllocator[2]
   32768 |  0 |256 | 
 memory/ioBufAllocator[1]
   0 |  0 |128 | 
 memory/ioBufAllocator[0]
 2138112 |2124192 |928 | 
 memory/cacheVConnection
 [~bcall] has observed this issue on 4.0.x, and we have observed this on 4.2.x.
 The code path in CacheVC that is allocating the IoBuffers is 
 memory/IOBuffer/Cache.cc:2603; however, that's just the observable symptom 
 the real issue here is the leaking CacheVC.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (TS-2669) ATS crash, then restart with all cached objects cleared

2014-03-28 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950437#comment-13950437
 ] 

Zhao Yongming commented on TS-2669:
---

well, maybe something needs more checking. Please look for diags.log lines 
like:
{code}
[Mar 27 20:31:25.948] {0x2b02748fde00} STATUS: opened 
/var/log/trafficserver/diags.log
[Mar 27 20:31:25.948] {0x2b02748fde00} NOTE: updated diags config
[Mar 27 20:31:25.954] Server {0x2b02748fde00} NOTE: cache clustering disabled
[Mar 27 20:31:25.964] Server {0x2b02748fde00} NOTE: ip_allow.config updated, 
reloading
[Mar 27 20:31:25.969] Server {0x2b02748fde00} NOTE: loading SSL certificate 
configuration from /etc/trafficserver/ssl_multicert.config
[Mar 27 20:31:25.976] Server {0x2b02748fde00} NOTE: cache clustering disabled
[Mar 27 20:31:25.977] Server {0x2b02748fde00} NOTE: logging initialized[15], 
logging_mode = 3
[Mar 27 20:31:25.978] Server {0x2b02748fde00} NOTE: loading plugin 
'/usr/lib64/trafficserver/plugins/libloader.so'
[Mar 27 20:31:25.982] Server {0x2b02748fde00} NOTE: loading plugin 
'/usr/local/ironbee/libexec/ts_ironbee.so'
[Mar 27 20:31:25.983] Server {0x2b02748fde00} NOTE: Rolling interval adjusted 
from 0 sec to 300 sec for /var/log/trafficserver/ts-ironbee.log
[Mar 27 20:31:25.992] Server {0x2b02748fde00} NOTE: traffic server running
[Mar 27 20:31:26.077] Server {0x2b0275d8e700} NOTE: cache enabled
{code}
you may find that 'traffic server running' indicates that ATS is running, and 
'cache enabled' shows that the cache is working. since your system crashed and 
the cache was not enabled, I suspect that your ATS does not have that 'cache 
enabled' line.

this is often caused by something like a privilege issue, where the ATS server 
process does not have write privilege on the disk block device files, etc., or 
anything else you may find in diags.log or even the system logs.

the interim cache & AIO bugs may cause you to lose the saved data: the interim 
cache may lose data when the server process restarts, and the AIO bugs may get 
all data cleared. but all of those bugs are fixed in the v4.1.0 release.


 ATS crash, then restart with all cached objects cleared
 ---

 Key: TS-2669
 URL: https://issues.apache.org/jira/browse/TS-2669
 Project: Traffic Server
  Issue Type: Bug
Reporter: AnDao

 Hi all,
 I'm using ATS 4.1.2. My ATS just crashed and restarted, cleaning all the 
 cached objects and causing my backend servers to overload. Why does ATS clean 
 all the cached objects when it crashes and restarts?
 The log is:
 * manager.log
  [Mar 27 12:57:13.022] Manager {0x7f597e3477e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 [Mar 27 12:57:13.022] Manager {0x7f597e3477e0} NOTE: 
 [LocalManager::mgmtShutdown] Executing shutdown request.
 [Mar 27 12:57:13.022] Manager {0x7f597e3477e0} NOTE: 
 [LocalManager::processShutdown] Executing process shutdown request.
 [Mar 27 12:57:13.028] Manager {0x7f597e3477e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 [Mar 27 12:57:13.028] Manager {0x7f597e3477e0} ERROR:  (last system error 32: 
 Broken pipe)
 [Mar 27 12:57:13.174] {0x7ffaeec7e7e0} STATUS: opened 
 /zserver/log/trafficserver/manager.log
 [Mar 27 12:57:13.174] {0x7ffaeec7e7e0} NOTE: updated diags config
 [Mar 27 12:57:13.520] Manager {0x7ffaeec7e7e0} NOTE: [ClusterCom::ClusterCom] 
 Node running on OS: 'Linux' Release: '2.6.32-358.6.2.el6.x86_64'
 [Mar 27 12:57:13.550] Manager {0x7ffaeec7e7e0} NOTE: 
 [LocalManager::listenForProxy] Listening on port: 80
 [Mar 27 12:57:13.550] Manager {0x7ffaeec7e7e0} NOTE: [TrafficManager] Setup 
 complete
 [Mar 27 12:57:14.618] Manager {0x7ffaeec7e7e0} NOTE: 
 [LocalManager::startProxy] Launching ts process
 [Mar 27 12:57:14.632] Manager {0x7ffaeec7e7e0} NOTE: 
 [LocalManager::pollMgmtProcessServer] New process connecting fd '15'
 [Mar 27 12:57:14.632] Manager {0x7ffaeec7e7e0} NOTE: [Alarms::signalAlarm] 
 Server Process born
 *** traffic.out ***
 [E. Mgmt] log == [TrafficManager] using root directory 
 '/zserver/trafficserver-4.1.2'
 [TrafficServer] using root directory '/zserver/trafficserver-4.1.2'
 NOTE: Traffic Server received Sig 15: Terminated
 [E. Mgmt] log == [TrafficManager] using root directory 
 '/zserver/trafficserver-4.1.2'
 [TrafficServer] using root directory '/zserver/trafficserver-4.1.2'
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /zserver/trafficserver-4.1.2/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0(+0x35a360f500)[0x2b3b55819500]
 /zserver/trafficserver-4.1.2/bin/traffic_server(_ZN12HttpTransact47change_response_header_because_of_range_requestEPNS_5StateEP7HTTPHdr+0x240)[0x54b8a0]
 /zserver/trafficserver-4.1.2/bin/traffic_server(_ZN12HttpTransact28handle_content_length_headerEPNS_5StateEP7HTTPHdrS3_+0x2c8)[0x54bc38]
 

[jira] [Commented] (TS-2669) ATS crash, then restart with all cached objects cleared

2014-03-28 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950510#comment-13950510
 ] 

Zhao Yongming commented on TS-2669:
---

please attach the server start log from the diags.log file, as I attached, and 
please tell us how your storage is configured.
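
If it helps, a storage.config usually has one line per cache target; the paths and size below are only placeholders for illustration, not your actual layout (check the admin docs for the exact size syntax in your release):
{code}
# storage.config sketch (placeholder paths/sizes)
# a raw block device, the whole device used for the cache:
/dev/sdb
# or a directory-backed cache with an explicit size:
/var/cache/trafficserver 256M
{code}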

 ATS crash, then restart with all cached objects cleared
 ---

 Key: TS-2669
 URL: https://issues.apache.org/jira/browse/TS-2669
 Project: Traffic Server
  Issue Type: Bug
Reporter: AnDao
 Attachments: cachobjecs.png, storage.png


 Hi all,
 I'm using ATS 4.1.2. My ATS just crashed and restarted, cleaning all the 
 cached objects and causing my backend servers to overload. Why does ATS clean 
 all the cached objects when it crashes and restarts?
 The log is:
 * manager.log
  [Mar 27 12:57:13.022] Manager {0x7f597e3477e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 [Mar 27 12:57:13.022] Manager {0x7f597e3477e0} NOTE: 
 [LocalManager::mgmtShutdown] Executing shutdown request.
 [Mar 27 12:57:13.022] Manager {0x7f597e3477e0} NOTE: 
 [LocalManager::processShutdown] Executing process shutdown request.
 [Mar 27 12:57:13.028] Manager {0x7f597e3477e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 [Mar 27 12:57:13.028] Manager {0x7f597e3477e0} ERROR:  (last system error 32: 
 Broken pipe)
 [Mar 27 12:57:13.174] {0x7ffaeec7e7e0} STATUS: opened 
 /zserver/log/trafficserver/manager.log
 [Mar 27 12:57:13.174] {0x7ffaeec7e7e0} NOTE: updated diags config
 [Mar 27 12:57:13.520] Manager {0x7ffaeec7e7e0} NOTE: [ClusterCom::ClusterCom] 
 Node running on OS: 'Linux' Release: '2.6.32-358.6.2.el6.x86_64'
 [Mar 27 12:57:13.550] Manager {0x7ffaeec7e7e0} NOTE: 
 [LocalManager::listenForProxy] Listening on port: 80
 [Mar 27 12:57:13.550] Manager {0x7ffaeec7e7e0} NOTE: [TrafficManager] Setup 
 complete
 [Mar 27 12:57:14.618] Manager {0x7ffaeec7e7e0} NOTE: 
 [LocalManager::startProxy] Launching ts process
 [Mar 27 12:57:14.632] Manager {0x7ffaeec7e7e0} NOTE: 
 [LocalManager::pollMgmtProcessServer] New process connecting fd '15'
 [Mar 27 12:57:14.632] Manager {0x7ffaeec7e7e0} NOTE: [Alarms::signalAlarm] 
 Server Process born
 *** traffic.out ***
 [E. Mgmt] log == [TrafficManager] using root directory 
 '/zserver/trafficserver-4.1.2'
 [TrafficServer] using root directory '/zserver/trafficserver-4.1.2'
 NOTE: Traffic Server received Sig 15: Terminated
 [E. Mgmt] log == [TrafficManager] using root directory 
 '/zserver/trafficserver-4.1.2'
 [TrafficServer] using root directory '/zserver/trafficserver-4.1.2'
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /zserver/trafficserver-4.1.2/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0(+0x35a360f500)[0x2b3b55819500]
 /zserver/trafficserver-4.1.2/bin/traffic_server(_ZN12HttpTransact47change_response_header_because_of_range_requestEPNS_5StateEP7HTTPHdr+0x240)[0x54b8a0]
 /zserver/trafficserver-4.1.2/bin/traffic_server(_ZN12HttpTransact28handle_content_length_headerEPNS_5StateEP7HTTPHdrS3_+0x2c8)[0x54bc38]
 /zserver/trafficserver-4.1.2/bin/traffic_server(_ZN12HttpTransact14build_responseEPNS_5StateEP7HTTPHdrS3_11HTTPVersion10HTTPStatusPKc+0x3e3)[0x54c0c3]
 /zserver/trafficserver-4.1.2/bin/traffic_server(_ZN12HttpTransact22handle_transform_readyEPNS_5StateE+0x70)[0x54ca40]
 /zserver/trafficserver-4.1.2/bin/traffic_server(_ZN6HttpSM32call_transact_and_set_next_stateEPFvPN12HttpTransact5StateEE+0x28)[0x51b418]
 /zserver/trafficserver-4.1.2/bin/traffic_server(_ZN6HttpSM38state_response_wait_for_transform_readEiPv+0xed)[0x52988d]
 /zserver/trafficserver-4.1.2/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0xd8)[0x533178]
 /zserver/trafficserver-4.1.2/bin/traffic_server(_ZN17TransformTerminus12handle_eventEiPv+0x1d2)[0x4e8c62]
 /zserver/trafficserver-4.1.2/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8f)[0x6a5a0f]
 /zserver/trafficserver-4.1.2/bin/traffic_server(_ZN7EThread7executeEv+0x63b)[0x6a658b]
 /zserver/trafficserver-4.1.2/bin/traffic_server[0x6a48aa]
 /lib64/libpthread.so.0(+0x35a3607851)[0x2b3b55811851]
 /lib64/libc.so.6(clone+0x6d)[0x35a32e890d]
 [E. Mgmt] log == [TrafficManager] using root directory 
 '/zserver/trafficserver-4.1.2'
 [TrafficServer] using root directory '/zserver/trafficserver-4.1.2'



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (TS-2668) need a way to fetch from the cluster when doing cluster local caching

2014-03-26 Thread Zhao Yongming (JIRA)
Zhao Yongming created TS-2668:
-

 Summary: need a way to fetch from the cluster when doing cluster 
local caching
 Key: TS-2668
 URL: https://issues.apache.org/jira/browse/TS-2668
 Project: Traffic Server
  Issue Type: Sub-task
  Components: Cache, Clustering
Reporter: Zhao Yongming


this is the TS-2184 #2 feature subtask.

when you want to do local caching in a cluster environment, you must tell the 
cache to write the object to the local disk on a cluster hit. we need a good 
way to handle this.

maybe a new API or similar API changes.

be aware, the #2 feature may do harm, and it should work together with the 
other features.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (TS-2668) need a way to fetch from the cluster when doing cluster local caching

2014-03-26 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming reassigned TS-2668:
-

Assignee: weijin

Weijin and Yuqing are working on a feature that is related to the API change 
requirements. please help find a way to merge with this feature.

 need a way to fetch from the cluster when doing cluster local caching
 -

 Key: TS-2668
 URL: https://issues.apache.org/jira/browse/TS-2668
 Project: Traffic Server
  Issue Type: Sub-task
  Components: Cache, Clustering
Reporter: Zhao Yongming
Assignee: weijin
 Fix For: sometime


 this is the TS-2184 #2 feature subtask.
 when you want to do local caching in a cluster environment, you must tell the 
 cache to write the object to the local disk on a cluster hit. we need a good 
 way to handle this.
 maybe a new API or similar API changes.
 be aware, the #2 feature may do harm, and it should work together with the 
 other features.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (TS-2528) better bool handling in public APIs (ts / mgmt)

2014-03-26 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming reassigned TS-2528:
-

Assignee: Zhao Yongming

 better bool handling in public APIs (ts / mgmt)
 ---

 Key: TS-2528
 URL: https://issues.apache.org/jira/browse/TS-2528
 Project: Traffic Server
  Issue Type: Bug
  Components: Management API
Reporter: Zhao Yongming
Assignee: Zhao Yongming
  Labels: api-change
 Fix For: 5.0.0


 {code}
   tsapi bool TSListIsEmpty(TSList l);
   tsapi bool TSListIsValid(TSList l);
   tsapi bool TSIpAddrListIsEmpty(TSIpAddrList ip_addrl);
   tsapi bool TSIpAddrListIsValid(TSIpAddrList ip_addrl);
   tsapi bool TSPortListIsEmpty(TSPortList portl);
   tsapi bool TSPortListIsValid(TSPortList portl);
   tsapi bool TSStringListIsEmpty(TSStringList strl);
   tsapi bool TSStringListIsValid(TSStringList strl);
   tsapi bool TSIntListIsEmpty(TSIntList intl);
   tsapi bool TSIntListIsValid(TSIntList intl, int min, int max);
   tsapi bool TSDomainListIsEmpty(TSDomainList domainl);
   tsapi bool TSDomainListIsValid(TSDomainList domainl);
   tsapi TSError TSRestart(bool cluster);
   tsapi TSError TSBounce(bool cluster);
   tsapi TSError TSStatsReset(bool cluster, const char *name = NULL);
   tsapi TSError TSEventIsActive(char *event_name, bool * is_current);
 {code}
 and we have:
 {code}
 #if !defined(linux)
 #if defined (__SUNPRO_CC) || (defined (__GNUC__) || ! defined(__cplusplus))
 #if !defined (bool)
 #if !defined(darwin) && !defined(freebsd) && !defined(solaris)
 // XXX: What other platforms are there?
 #define bool int
 #endif
 #endif
 #if !defined (true)
 #define true 1
 #endif
 #if !defined (false)
 #define false 0
 #endif
 #endif
 #endif  // not linux
 {code}
 I'd like us to make it a typedef or to replace bool with int completely, to 
 make things easier to parse with SWIG tools etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (TS-1521) Enable compression for binary log format

2014-03-04 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13919388#comment-13919388
 ] 

Zhao Yongming commented on TS-1521:
---

[~bettydreamit] submitted their gzipping patch for the ASCII logging; please 
consider accepting this feature too.

 Enable compression for binary log format
 

 Key: TS-1521
 URL: https://issues.apache.org/jira/browse/TS-1521
 Project: Traffic Server
  Issue Type: New Feature
  Components: Logging
 Environment: RHEL 6+
Reporter: Lans Carstensen
Assignee: Yunkai Zhang
 Fix For: 6.0.0

 Attachments: logcompress.patch


 As noted in a discussion on #traffic-server, gzip can result in 90%+ 
 compression on the binary access logs.  By adding a reasonable streaming 
 compression algorithm to the binary format you could significantly reduce 
 logging-related IOPS.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (TS-727) Do we need support for streams partitions?

2014-02-27 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914204#comment-13914204
 ] 

Zhao Yongming commented on TS-727:
--

I think that removing the streams partitions will result in a complete removal 
of the MIXT cache. someone is working on an RTMP-like streaming cache for ATS; 
I'd like to talk to them before we nuke it.

IMO, the 'stream' cache is much more efficient than HTTP if you would like to 
use ATS for live streaming broadcasting.

 Do we need support for streams partitions?
 

 Key: TS-727
 URL: https://issues.apache.org/jira/browse/TS-727
 Project: Traffic Server
  Issue Type: Improvement
  Components: Cache
Reporter: Leif Hedstrom
Assignee: Alan M. Carroll
 Fix For: 5.0.0


 There's code in the cache related to MIXT streams volumes (caches). Since we 
 don't support streams, I'm thinking this code could be removed? Or 
 alternatively, we should expose APIs so that someone writing a plugin who 
 wishes to store a different protocol (e.g. QT) can register this media type 
 with the API and core. The idea being that the core only contains protocols 
 that are in the core, but expose the cache core so that plugins can take 
 advantage of it.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (TS-2184) Fetch from cluster with proxy.config.http.cache.cluster_cache_local enabled

2014-02-25 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911428#comment-13911428
 ] 

Zhao Yongming commented on TS-2184:
---

when the Cluster was designed, the original goal was to have ONLY one single 
valid copy of the content in the cluster. that is a good idea when you have a 
very huge volume of content, and we have kept that target: even when some of 
the machines are flapping in the cluster, we ensure that at any time there is 
only one valid copy of the content in the cluster.

but consider the ICP protocol and others: those may keep multiple copies of the 
same content in the ICP cluster, and if you make it complex, it may hold 
multiple versions of the content in the cluster at some point. so ICP-like 
protocols are considered not so cool (safe) if you need to enforce the 
consistency of the content you provide to the user agents.

back to this requirement: we can make the cluster act like ICP, where the first 
write goes to the cluster hashing machine, and a second or later read pulls 
that content from the cluster and writes it locally. but that introduces a 
consistency problem: you don't know who holds that content locally in the 
cluster when it is updated on the origin side within the freshness window. in 
most cases, every write to the cache has to be broadcast to all the machines in 
the cluster to enforce that change.

enabling proxy.config.http.cache.cluster_cache_local is a directive to disable 
cluster hashing in cluster mode; our original target is to use it to keep some 
very hot hostnames (or URLs) local, to reduce the intra-cluster traffic. 
proxy.config.http.cache.cluster_cache_local is overridable, and we have the 
same directive in cache.config too (see the sketch below). if it is active, the 
Cluster may be considered as mode=3, the single host mode.
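
For reference, a per-rule cache.config override for this might look roughly like the line below; the hostname is a placeholder and the action keyword is my best recollection of how the per-rule form is spelled, so verify it against your release's documentation:
{code}
# cache.config sketch (hostname is a placeholder)
dest_domain=hot.example.com action=cluster-cache-local
{code}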

so, if we want to achieve an ICP-like feature in the cluster, we should mostly:
1. write the content on the hashing machine if it is a miss in the cluster
2. read from the cluster if it is missing on the local machine
3. write locally if it is a hit in the cluster
4. broadcast the change to all the machines in the cluster if it is an 
overwrite (i.e. revalidation etc.)
5. purge on the hashing machine and broadcast the purge to all the machines in 
the cluster
it will be a very big change in the Cluster and the HTTP transaction code.

cc [~zwoop]


 Fetch from cluster with proxy.config.http.cache.cluster_cache_local enabled
 ---

 Key: TS-2184
 URL: https://issues.apache.org/jira/browse/TS-2184
 Project: Traffic Server
  Issue Type: Improvement
  Components: Cache, Clustering
Reporter: Scott Harris
Assignee: Bin Chen
 Fix For: 6.0.0


 With proxy.config.http.cache.cluster_cache_local enabled I would like cluster 
 nodes to store content locally but try to retrieve content from the cluster 
 first (if not cached locally) and if no cluster nodes have content cached 
 then retrieve from origin.
 Example - 2 Cluster nodes in Full cluster mode.
 1. Node1 and Node2 are both empty.
 2. Request to Node1 for "http://www.example.com/foo.html".
 3. Query the cluster for the object.
 4. Not cached in the cluster, so retrieve from origin and serve to the client; 
 the object is now cached on Node1.
 5. Request comes to Node2 for "http://www.example.com/foo.html".
 6. Node2 retrieves the cached version from Node1, serves it to the client, and 
 stores it locally.
 7. A subsequent request comes to Node1 or Node2 for 
 "http://www.example.com/foo.html"; the object is served to the client from the 
 local cache.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (TS-2184) Fetch from cluster with proxy.config.http.cache.cluster_cache_local enabled

2014-02-25 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911434#comment-13911434
 ] 

Zhao Yongming commented on TS-2184:
---

for the very HOT content in the cluster, we have another solution: track the 
hot content (from a traffic point of view) and put it in the cluster_cache_local 
list to make it dynamic. this solution needs a workaround for purging: you need 
to broadcast all purges to every machine in the cluster.

pulling from the hashing machine in the cluster is not implemented yet either; 
we are testing to see how well it will work. this function is provided by 
[~happy_fish100]

FYI

 Fetch from cluster with proxy.config.http.cache.cluster_cache_local enabled
 ---

 Key: TS-2184
 URL: https://issues.apache.org/jira/browse/TS-2184
 Project: Traffic Server
  Issue Type: Improvement
  Components: Cache, Clustering
Reporter: Scott Harris
Assignee: Bin Chen
 Fix For: 6.0.0


 With proxy.config.http.cache.cluster_cache_local enabled I would like cluster 
 nodes to store content locally but try to retrieve content from the cluster 
 first (if not cached locally) and if no cluster nodes have content cached 
 then retrieve from origin.
 Example - 2 Cluster nodes in Full cluster mode.
 1. Node1 and Node2 are both empty.
 2. Request to Node1 for "http://www.example.com/foo.html".
 3. Query the cluster for the object.
 4. Not cached in the cluster, so retrieve from origin and serve to the client; 
 the object is now cached on Node1.
 5. Request comes to Node2 for "http://www.example.com/foo.html".
 6. Node2 retrieves the cached version from Node1, serves it to the client, and 
 stores it locally.
 7. A subsequent request comes to Node1 or Node2 for 
 "http://www.example.com/foo.html"; the object is served to the client from the 
 local cache.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (TS-2531) The default remap rule doesn't match a forward proxy request

2014-02-08 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13895473#comment-13895473
 ] 

Zhao Yongming commented on TS-2531:
---

it seems you want a default host rule in a forward proxy setup. what is this 
config for? it is a bit weird.

 The default remap rule doesn't match a forward proxy request
 

 Key: TS-2531
 URL: https://issues.apache.org/jira/browse/TS-2531
 Project: Traffic Server
  Issue Type: Bug
  Components: HTTP
Reporter: Bryan Call
 Attachments: 0001-fix-bug-TS_2531.patch


 when doing a forward proxy request it won't match the default rule, but it will 
 match other rules that specify the hostname.
 Example request:
 GET http://foo.yahoo.com HTTP/1.1
 Host: foo.yahoo.com
 remap.config:
 map / http://www.yahoo.com
 Response:
 HTTP/1.1 404 Not Found
 ...
 
 However, this works:
 remap.config:
 map http://foo.yahoo.com http://www.yahoo.com



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (TS-2561) remove app-template from examples

2014-02-08 Thread Zhao Yongming (JIRA)
Zhao Yongming created TS-2561:
-

 Summary: remove app-template from examples
 Key: TS-2561
 URL: https://issues.apache.org/jira/browse/TS-2561
 Project: Traffic Server
  Issue Type: Bug
  Components: Cleanup
Reporter: Zhao Yongming


because the STANDALONE IOCORE has been removed, the app-template example should 
no longer be there, and most of the app-template / STANDALONE IOCORE design 
purpose can be satisfied with the protocol plugin.

let us remove it.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (TS-2561) remove app-template from examples

2014-02-08 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-2561:
--

Affects Version/s: 5.0.0
Fix Version/s: 5.0.0
 Assignee: Zhao Yongming

 remove app-template from examples
 -

 Key: TS-2561
 URL: https://issues.apache.org/jira/browse/TS-2561
 Project: Traffic Server
  Issue Type: Bug
  Components: Cleanup
Affects Versions: 5.0.0
Reporter: Zhao Yongming
Assignee: Zhao Yongming
 Fix For: 5.0.0


 because the STANDALONE IOCORE has been removed, the app-template example should 
 no longer be there, and most of the app-template / STANDALONE IOCORE design 
 purpose can be satisfied with the protocol plugin.
 let us remove it.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (TS-2019) find out what is the problem of reporting OpenReadHead failed on vector inconsistency

2014-02-08 Thread Zhao Yongming (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13895487#comment-13895487
 ] 

Zhao Yongming commented on TS-2019:
---

[~weijin] should check this issue

 find out what is the problem of reporting OpenReadHead failed on vector 
 inconsistency
 -

 Key: TS-2019
 URL: https://issues.apache.org/jira/browse/TS-2019
 Project: Traffic Server
  Issue Type: Task
  Components: Cache
Reporter: Zhao Yongming
Assignee: Alan M. Carroll
Priority: Critical
 Fix For: 5.0.0


 {code}
 [Jul 10 19:40:33.170] Server {0x2aaf1680} NOTE: OpenReadHead failed for 
 cachekey 44B5C68B : vector inconsistency with 4624
 [Jul 10 19:40:33.293] Server {0x2aaf1680} NOTE: OpenReadHead failed for 
 cachekey 2ABA746F : vector inconsistency with 4632
 [Jul 10 19:40:33.368] Server {0x2aaf1680} NOTE: OpenReadHead failed for 
 cachekey 389594A0 : vector inconsistency with 4632
 [Jul 10 19:40:33.399] Server {0x2aaf1680} NOTE: OpenReadHead failed for 
 cachekey FBC601A3 : vector inconsistency with 4632
 [Jul 10 19:40:33.506] Server {0x2aaf1680} NOTE: OpenReadHead failed for 
 cachekey 1F39AD5F : vector inconsistency with 4632
 [Jul 10 19:40:33.602] Server {0x2aaf1680} NOTE: OpenReadHead failed for 
 cachekey ABFC6D97 : vector inconsistency with 4632
 [Jul 10 19:40:33.687] Server {0x2aaf1680} NOTE: OpenReadHead failed for 
 cachekey 2420ABBF : vector inconsistency with 4632
 [Jul 10 19:40:33.753] Server {0x2aaf1680} NOTE: OpenReadHead failed for 
 cachekey 5DD061C8 : vector inconsistency with 4632
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (TS-2527) mgmtapi.h should be C style

2014-01-24 Thread Zhao Yongming (JIRA)
Zhao Yongming created TS-2527:
-

 Summary: mgmtapi.h should be C style
 Key: TS-2527
 URL: https://issues.apache.org/jira/browse/TS-2527
 Project: Traffic Server
  Issue Type: Bug
  Components: Management API
Reporter: Zhao Yongming


{code}
/*--- statistics operations ---*/
/* TSStatsReset: sets all the statistics variables to their default values
 * Input: cluster - Reset the stats clusterwide or not
 * Outpue: TSErrr
 */
  tsapi TSError TSStatsReset(bool cluster, const char *name = NULL);


{code}
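
As an illustration of the kind of change this implies (a sketch only, not a final API proposal): for the header to be consumable from plain C, the C++ default argument would have to go, e.g.:
{code}
/* sketch: a C-compatible form of the same declaration; callers would
 * pass NULL explicitly instead of relying on a default argument */
tsapi TSError TSStatsReset(bool cluster, const char *name);
{code}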



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (TS-2528) better bool handling in mgmtapi.h

2014-01-24 Thread Zhao Yongming (JIRA)
Zhao Yongming created TS-2528:
-

 Summary: better bool handling in mgmtapi.h
 Key: TS-2528
 URL: https://issues.apache.org/jira/browse/TS-2528
 Project: Traffic Server
  Issue Type: Bug
  Components: Management API
Reporter: Zhao Yongming


{code}
  tsapi bool TSListIsEmpty(TSList l);
  tsapi bool TSListIsValid(TSList l);
  tsapi bool TSIpAddrListIsEmpty(TSIpAddrList ip_addrl);
  tsapi bool TSIpAddrListIsValid(TSIpAddrList ip_addrl);
  tsapi bool TSPortListIsEmpty(TSPortList portl);
  tsapi bool TSPortListIsValid(TSPortList portl);
  tsapi bool TSStringListIsEmpty(TSStringList strl);
  tsapi bool TSStringListIsValid(TSStringList strl);
  tsapi bool TSIntListIsEmpty(TSIntList intl);
  tsapi bool TSIntListIsValid(TSIntList intl, int min, int max);
  tsapi bool TSDomainListIsEmpty(TSDomainList domainl);
  tsapi bool TSDomainListIsValid(TSDomainList domainl);
  tsapi TSError TSRestart(bool cluster);
  tsapi TSError TSBounce(bool cluster);
  tsapi TSError TSStatsReset(bool cluster, const char *name = NULL);
  tsapi TSError TSEventIsActive(char *event_name, bool * is_current);
{code}

and we have:

{code}
#if !defined(linux)
#if defined (__SUNPRO_CC) || (defined (__GNUC__) || ! defined(__cplusplus))
#if !defined (bool)
#if !defined(darwin) && !defined(freebsd) && !defined(solaris)
// XXX: What other platforms are there?
#define bool int
#endif
#endif

#if !defined (true)
#define true 1
#endif

#if !defined (false)
#define false 0
#endif

#endif
#endif  // not linux
{code}

I'd like us to make it a typedef or to replace bool with int completely, to make 
things easier to parse with SWIG tools etc.
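
To make the typedef idea concrete, here is a sketch of one possible direction; the TSMgmtBool name and the macro names are hypothetical, chosen only for illustration:
{code}
/* sketch only: an explicit integer boolean type that C, C++ and SWIG can all
 * parse, instead of conditionally defining bool via the preprocessor.
 * TSMgmtBool, TS_MGMT_TRUE and TS_MGMT_FALSE are hypothetical names. */
typedef int TSMgmtBool;
#define TS_MGMT_TRUE  1
#define TS_MGMT_FALSE 0

tsapi TSMgmtBool TSListIsEmpty(TSList l);
tsapi TSError    TSRestart(TSMgmtBool cluster);
{code}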



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (TS-2493) API: introducing UDP API

2014-01-14 Thread Zhao Yongming (JIRA)
Zhao Yongming created TS-2493:
-

 Summary: API: introducing UDP API
 Key: TS-2493
 URL: https://issues.apache.org/jira/browse/TS-2493
 Project: Traffic Server
  Issue Type: Improvement
Reporter: Zhao Yongming


when doing UDP tasks in plugins, there is no UDP API available to use; we need 
to introduce those APIs.

task for [~xinyuziran]
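
To give the request a concrete shape, here is a purely hypothetical sketch of what such plugin-facing UDP calls could look like; every symbol below is invented for illustration and is not taken from the existing TS API:
{code}
/* hypothetical sketch -- names and signatures are illustrative only */
tsapi TSUDPConn TSUDPConnCreate(TSCont contp, const struct sockaddr *bind_addr);
tsapi TSAction  TSUDPSendTo(TSUDPConn conn, const struct sockaddr *to,
                            const char *buf, int64_t len);
tsapi TSAction  TSUDPRecvFrom(TSUDPConn conn, TSIOBuffer buf);
tsapi void      TSUDPConnClose(TSUDPConn conn);
{code}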



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (TS-2493) API: introducing UDP API

2014-01-14 Thread Zhao Yongming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-2493:
--

  Component/s: TS API
Affects Version/s: 4.1.2
Fix Version/s: 4.2.0
 Assignee: Zhao Yongming
   Labels: UDP  (was: )

 API: introducing UDP API
 

 Key: TS-2493
 URL: https://issues.apache.org/jira/browse/TS-2493
 Project: Traffic Server
  Issue Type: Improvement
  Components: TS API
Affects Versions: 4.1.2
Reporter: Zhao Yongming
Assignee: Zhao Yongming
  Labels: UDP
 Fix For: 4.2.0


 when doing UDP tasks in plugins, there is no UDP API available to use; we 
 need to introduce those APIs.
 task for [~xinyuziran]



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

