[jira] [Updated] (TS-1201) Crash report: hostdb multicache, double free

2012-04-13 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1201:
--

Description: 
{code}
*** glibc detected *** /usr/bin/traffic_server: corrupted double-linked list: 
0x1fe10ef0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3db2072555]   
/lib64/libc.so.6(cfree+0x4b)[0x3db20728bb]
/usr/bin/traffic_server(MultiCacheSync::mcEvent(int, Event*)+0xa4)[0x5dae04]
/usr/bin/traffic_server(EThread::process_event(Event*, int)+0x22f)[0x691c8f]
/usr/bin/traffic_server(EThread::execute()+0x6a1)[0x692681]
/usr/bin/traffic_server[0x69115e]
/lib64/libpthread.so.0[0x3db280673d]
/lib64/libc.so.6(clone+0x6d)[0x3db20d44bd]
======= Memory map: ========
{code}

  was:

{code}
*** glibc detected *** /usr/bin/traffic_server: corrupted double-linked list: 
0x1fe10ef0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3db2072555]   
/lib64/libc.so.6(cfree+0x4b)[0x3db20728bb]
/usr/bin/traffic_server(_ZN14MultiCacheSync7mcEventEiP5Event+0xa4)[0x5dae04]
/usr/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x22f)[0x691c8f]
/usr/bin/traffic_server(_ZN7EThread7executeEv+0x6a1)[0x692681]
/usr/bin/traffic_server[0x69115e]
/lib64/libpthread.so.0[0x3db280673d]
/lib64/libc.so.6(clone+0x6d)[0x3db20d44bd]
======= Memory map: ========
{code}


 Crash report: hostdb multicache, double free
 

 Key: TS-1201
 URL: https://issues.apache.org/jira/browse/TS-1201
 Project: Traffic Server
  Issue Type: Bug
Reporter: Zhao Yongming
Assignee: weijin

 {code}
 *** glibc detected *** /usr/bin/traffic_server: corrupted double-linked list: 
 0x1fe10ef0 ***
 ======= Backtrace: =========
 /lib64/libc.so.6[0x3db2072555]   
 /lib64/libc.so.6(cfree+0x4b)[0x3db20728bb]
 /usr/bin/traffic_server(MultiCacheSync::mcEvent(int, Event*)+0xa4)[0x5dae04]
 /usr/bin/traffic_server(EThread::process_event(Event*, int)+0x22f)[0x691c8f]
 /usr/bin/traffic_server(EThread::execute()+0x6a1)[0x692681]
 /usr/bin/traffic_server[0x69115e]
 /lib64/libpthread.so.0[0x3db280673d]
 /lib64/libc.so.6(clone+0x6d)[0x3db20d44bd]
 ======= Memory map: ========
 {code}
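The "corrupted double-linked list" abort comes from glibc's malloc noticing that a free chunk's linkage was clobbered, which is the usual signature of a double free or a heap overrun near the freed block. As a general illustration only (SyncBuffer is a made-up name, not the actual MultiCacheSync code), single ownership rules out the double-free class entirely:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Illustrative buffer wrapper. With raw pointers, two owners both calling
// free() on the same block corrupt glibc's free-chunk list; unique_ptr
// enforces exactly one owner, so a second free cannot happen.
struct SyncBuffer {
  std::unique_ptr<char[]> data;  // sole owner of the allocation
  explicit SyncBuffer(std::size_t n) : data(new char[n]) {}
};

// Passing the buffer on transfers ownership instead of duplicating it.
SyncBuffer pass_along(SyncBuffer b) { return b; }
```

After `pass_along(std::move(buf))`, the source's `data` is null, so only one destructor ever releases the allocation.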

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (TS-980) change client_session schedule from global to thread local, and reduce the try_locks in UnixNetVConnection::reenable

2012-04-09 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-980:
-

Fix Version/s: (was: 3.1.4)
   3.3.0

reschedule to 3.3.0, may need more tweaking & testing

 change client_session schedule from global to thread local, and reduce the 
 try_locks in UnixNetVConnection::reenable
 -

 Key: TS-980
 URL: https://issues.apache.org/jira/browse/TS-980
 Project: Traffic Server
  Issue Type: Improvement
  Components: Network, Performance
Affects Versions: 3.1.0, 3.0.0
 Environment: all
Reporter: weijin
Assignee: weijin
 Fix For: 3.3.0

 Attachments: ts-980.diff


 I did some performance tests on ATS in the last days (cache disabled, 
 share_server_sessions set to 2, pure proxy mode). I did see significant 
 improvement under low load, but throughput dropped rapidly when load was 
 high, and some stability problems appeared. Through gdb, I found the 
 client_session's mutex can be acquired by two or more threads; I believe 
 some schedules happen during the sm lifetime. Maybe we need to find these 
 eventProcessor.schedule calls and change them to thread schedules.
 UnixNetVConnection::reenable {
   if (nh->mutex->thread_holding == t) {
     // put into ready_list
   } else {
     MUTEX_TRY_LOCK(lock, nh->mutex, t);
     if (!lock) {
       // put into enable_list;
     } else {
       // put into ready_list;
     }
   }
 }
 remove the UnixNetVConnection::reenable try_lock operations, for 3 reasons:
 1. a try_lock operation means an obj allocation and deallocation, and it 
 happens frequently.
 2. try_lock can hardly acquire the net-handler's mutex (the net-handler is 
 scheduled as soon as possible).
 3. try_lock should not acquire the net-handler's mutex anyway. That may add 
 more net io latency if an epoll event needs to be processed in another 
 thread; and if it is not an epoll event (a time event), I don't think putting 
 the vc in ready_list has any advantage over enable_list.
 maybe we can change the reenable function like this:
 UnixNetVConnection::reenable {
   if (nh->mutex->thread_holding == t) {
     // put into ready_list;
   } else {
     // put into enable_list;
   }
 }
 my buddies, any advice?
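The proposal above can be sketched with standard threading primitives. This is a toy model, not the real UnixNet code: NetHandler, ready_list and enable_list follow the names in the issue text, and the int ids stand in for vconnections. Same-thread reenables go straight to ready_list with no locking; cross-thread ones always defer to enable_list, so the try_lock on the net-handler's mutex disappears:

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <thread>

// Hedged sketch of the proposed reenable() scheme; not Traffic Server code.
struct NetHandler {
  std::thread::id holder;      // thread currently running this handler
  std::deque<int> ready_list;  // touched only by the holder, no lock needed
  std::mutex enable_mutex;     // protects the cross-thread enable_list
  std::deque<int> enable_list; // drained later by the handler's own thread
};

void reenable(NetHandler &nh, int vc_id) {
  if (nh.holder == std::this_thread::get_id()) {
    nh.ready_list.push_back(vc_id);  // same thread: no locking at all
  } else {
    std::lock_guard<std::mutex> g(nh.enable_mutex);
    nh.enable_list.push_back(vc_id); // defer; never try-lock nh's own mutex
  }
}
```

The point of the design is that the only contended lock guards a plain queue, held just long enough for a push, instead of racing for the net-handler's mutex.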





[jira] [Updated] (TS-1034) reduce futex locking period

2012-03-28 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1034:
--

Fix Version/s: (was: 3.1.4)
   3.3.0

 reduce futex locking period
 ---

 Key: TS-1034
 URL: https://issues.apache.org/jira/browse/TS-1034
 Project: Traffic Server
  Issue Type: Improvement
  Components: Core, HTTP
Affects Versions: 3.1.1
Reporter: Zhao Yongming
Assignee: Zhao Yongming
 Fix For: 3.3.0


 we need to reduce the futex locking period. here is a simple test on my 
 24-core HP 380 system, with 24 ab instances, all cached in memory:
 {code}
 #!/bin/sh
 for i in {1..24}
 do
  ab -n 1 -c 16 -X 127.0.0.$i:8080 http://img02.taobaocdn.com/tps/i2/T1o0ypXk4w-1000-40.png?$i &
 done
 {code}
 result:
 {code}
 Every 2.0s: echo "show:proxy-stats" | traffic_shell 
  Mon Nov 28 16:06:42 2011
 Successfully Initialized MgmtAPI in /var/run/trafficserver
 Document Hit Rate --------- 100.00 %  *
 Bandwidth Saving ---------- 100.00 %  *
 Cache Percent Free -------- 99.999619 %
 Open Server Connections --- 0
 Open Client Connections --- 9
 Open Cache Connections ---- 2
 Client Throughput  6824.747070 MBit/Sec
 Transaction Per Second --- 53914.925781
 * Value represents 10 second average.
 strace -c -p 11712
 Process 11712 attached - interrupt to quit
 ^CProcess 11712 detached
 % time     seconds  usecs/call     calls    errors syscall
 ------ ----------- ----------- --------- --------- ----------------
  26.85    0.890335          15     58920           writev
  24.45    0.810866           7    118147           epoll_ctl
  22.27    0.738451          13     58920           close
  11.50    0.381362           6     59227           getsockname
   9.86    0.326843           3    119192     59228 read
   3.53    0.117058          16      7100      1931 futex
   1.53    0.050884          58       884           epoll_wait
   0.00    0.000037           0       404           rt_sigprocmask
   0.00    0.000000           0         3           write
   0.00    0.000000           0         2           brk
   0.00    0.000000           0        10           msync
 ------ ----------- ----------- --------- --------- ----------------
 100.00    3.315836                422809     61159 total
 {code}
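The futex cost in the strace summary depends on how long each lock is held as much as on how often it is taken. A generic sketch of the "reduce the locking period" idea (illustrative code, not Traffic Server's): do the heavy work outside the critical section and hold the mutex only for the shared-state update:

```cpp
#include <cassert>
#include <mutex>
#include <vector>

std::mutex list_mutex;
std::vector<int> shared_list;

// Long critical section: the per-batch work runs while the lock is held,
// so every contending thread sleeps in futex for the whole duration.
int sum_locked_long(const std::vector<int> &batch) {
  std::lock_guard<std::mutex> g(list_mutex);
  int sum = 0;
  for (int v : batch) sum += v;  // work done under the lock
  shared_list.insert(shared_list.end(), batch.begin(), batch.end());
  return sum;
}

// Shorter period: compute first, then lock only for the splice.
int sum_locked_short(const std::vector<int> &batch) {
  int sum = 0;
  for (int v : batch) sum += v;  // work done outside the lock
  std::lock_guard<std::mutex> g(list_mutex);
  shared_list.insert(shared_list.end(), batch.begin(), batch.end());
  return sum;
}
```

Both variants produce the same result; the second simply spends less time inside the mutex, which under contention shows up directly as fewer and shorter futex waits.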





[jira] [Updated] (TS-1031) reduce lock in netHandler and reduce the possibility of acquiring expired server sessions

2012-03-28 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1031:
--

Fix Version/s: (was: 3.1.4)
   3.3.0

I think the patch should not be required after we fix the do_io_close issue; 
let us focus on other enhancements later.

 reduce lock in netHandler and reduce the possibility of acquiring expired 
 server sessions
 ---

 Key: TS-1031
 URL: https://issues.apache.org/jira/browse/TS-1031
 Project: Traffic Server
  Issue Type: Improvement
  Components: Core
Affects Versions: 3.1.1
Reporter: Zhao Yongming
Assignee: weijin
Priority: Minor
 Fix For: 3.3.0

 Attachments: ts-1031.diff


 reduce lock in netHandler and reduce the possibility of acquiring expired 
 server sessions. put your patch here for review :D





[jira] [Updated] (TS-1154) quick_filter on HEAD does not work

2012-03-27 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1154:
--

Fix Version/s: (was: 3.1.4)
   3.0.4

we do not need to fix the current git master; let us just fix it in 3.0.4

 quick_filter on HEAD does not work
 --

 Key: TS-1154
 URL: https://issues.apache.org/jira/browse/TS-1154
 Project: Traffic Server
  Issue Type: Bug
  Components: HTTP
Reporter: Zhao Yongming
Assignee: weijin
 Fix For: 3.0.4

 Attachments: head_method.diff


 we take the quick filter as a good solution for some security concerns, but 
 when I set it to 0x0733 it does not allow HEAD in, while setting 0x0723 does.
 Weijin has the patch in our tree: 
 https://gitorious.org/trafficserver/taobao/commit/cb23b87d167da4074e047fabc94786003ee94e9a/diffs/db7d0e5be69988b531e8d1e4eea717e6d46df5cd
 I will commit it if no one complains within 2 days.





[jira] [Updated] (TS-1151) in some strange situation, cop will crash

2012-03-18 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1151:
--

Description: 
we got some strange crashes where the manager & cop may die. we are not sure 
what it is, but I'd like to start one issue here in case others hit the same 
problem.

here is the log in /var/log/messages
{code}
Mar 19 10:08:24 cache172.cn77 kernel:: [1553138.961401] [ET_NET 2][17949]: 
segfault at 2aadf1387937 ip 003c5bc7bdbe sp 410f3188 error 4 in 
libc-2.5.so[3c5bc0+14d000]
Mar 19 10:08:27 cache172.cn77 traffic_manager[17935]: {0x7ff0c8d51720} FATAL: 
[LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
Mar 19 10:08:27 cache172.cn77 traffic_manager[17935]: {0x7ff0c8d51720} FATAL:  
(last system error 104: Connection reset by peer)
Mar 19 10:08:27 cache172.cn77 traffic_manager[17935]: {0x7ff0c8d51720} ERROR: 
[LocalManager::sendMgmtMsgToProcesses] Error writing message
Mar 19 10:08:27 cache172.cn77 traffic_manager[17935]: {0x7ff0c8d51720} ERROR:  
(last system error 32: Broken pipe)
Mar 19 10:08:33 cache172.cn77 traffic_cop[17933]: cop received child status 
signal [17935 2816]
Mar 19 10:08:33 cache172.cn77 traffic_cop[17933]: traffic_manager not running, 
making sure traffic_server is dead
Mar 19 10:08:33 cache172.cn77 traffic_cop[17933]: spawning traffic_manager
Mar 19 10:08:40 cache172.cn77 traffic_manager[2760]: NOTE: --- Manager Starting 
---
Mar 19 10:08:40 cache172.cn77 traffic_manager[2760]: NOTE: Manager Version: 
Apache Traffic Server - traffic_manager - 3.0.2 - (build # 299 on Mar  9 2012 
at 09:55:44)
Mar 19 10:08:40 cache172.cn77 traffic_manager[2760]: {0x7fd03d265720} STATUS: 
opened /var/log/trafficserver/manager.log
Mar 19 10:08:46 cache172.cn77 traffic_cop[17933]: (cli test) unable to retrieve 
manager_binary
Mar 19 10:08:54 cache172.cn77 traffic_server[2789]: NOTE: --- Server Starting 
---
Mar 19 10:08:54 cache172.cn77 traffic_server[2789]: NOTE: Server Version: 
Apache Traffic Server - traffic_server - 3.0.2 - (build # 299 on Mar  9 2012 at 
09:56:00)
Mar 19 10:09:00 cache172.cn77 traffic_server[2789]: {0x2b5a8ef03970} STATUS: 
opened /var/log/trafficserver/diags.log
Mar 19 10:14:02 cache172.cn77 kernel:: [1553476.364204] [ET_NET 0][2789]: 
segfault at 2aab1fa99ce3 ip 003c5bc7bdbe sp 7fff39743fa8 error 4 in 
libc-2.5.so[3c5bc0+14d000]
Mar 19 10:14:03 cache172.cn77 traffic_manager[2760]: {0x7fd03d265720} FATAL: 
[LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
Mar 19 10:14:03 cache172.cn77 traffic_manager[2760]: {0x7fd03d265720} FATAL:  
(last system error 104: Connection reset by peer)
Mar 19 10:14:03 cache172.cn77 traffic_manager[2760]: {0x7fd03d265720} ERROR: 
[LocalManager::sendMgmtMsgToProcesses] Error writing message
Mar 19 10:14:03 cache172.cn77 traffic_manager[2760]: {0x7fd03d265720} ERROR:  
(last system error 32: Broken pipe)
{code}

here is the message in traffic.out
{code}
Mar 19 10:11:06 cache162.cn77 kernel:: [2510081.212455] [ET_NET 3][319]: 
segfault at 2aaae6e986bc ip 003f7f27bdbe sp 40be2188 error 4 in 
libc-2.5.so[3f7f20+14d000]
Mar 19 10:11:09 cache162.cn77 traffic_manager[305]: {0x7fd3a665c720} FATAL: 
[LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
Mar 19 10:11:09 cache162.cn77 traffic_manager[305]: {0x7fd3a665c720} FATAL:  
(last system error 104: Connection reset by peer)
Mar 19 10:11:09 cache162.cn77 traffic_manager[305]: {0x7fd3a665c720} ERROR: 
[LocalManager::sendMgmtMsgToProcesses] Error writing message
Mar 19 10:11:09 cache162.cn77 traffic_manager[305]: {0x7fd3a665c720} ERROR:  
(last system error 32: Broken pipe)
Mar 19 10:11:09 cache162.cn77 traffic_cop[303]: cop received child status 
signal [305 2816]
Mar 19 10:11:09 cache162.cn77 traffic_cop[303]: traffic_manager not running, 
making sure traffic_server is dead
Mar 19 10:11:09 cache162.cn77 traffic_cop[303]: spawning traffic_manager
Mar 19 10:11:16 cache162.cn77 traffic_manager[1227]: NOTE: --- Manager Starting 
---
Mar 19 10:11:16 cache162.cn77 traffic_manager[1227]: NOTE: Manager Version: 
Apache Traffic Server - traffic_manager - 3.0.2 - (build # 299 on Mar  9 2012 
at 09:55:44)
Mar 19 10:11:16 cache162.cn77 traffic_manager[1227]: {0x7f8ae2f48720} STATUS: 
opened /var/log/trafficserver/manager.log
Mar 19 10:11:23 cache162.cn77 traffic_cop[303]: (cli test) unable to retrieve 
manager_binary
Mar 19 10:11:39 cache162.cn77 traffic_server[1260]: NOTE: --- Server Starting 
---
Mar 19 10:11:39 cache162.cn77 traffic_server[1260]: NOTE: Server Version: 
Apache Traffic Server - traffic_server - 3.0.2 - (build # 299 on Mar  9 2012 at 
09:56:00)
Mar 19 10:11:46 cache162.cn77 traffic_server[1260]: {0x2ad4afd3d970} STATUS: 
opened /var/log/trafficserver/diags.log
Mar 19 10:15:06 cache162.cn77 kernel:: [2510320.713808] [ET_NET 3][1277]: 
segfault at 2aab1cfa6a03 ip 003f7f27bdbe sp 4141c188 error 4 in 
{code}
[jira] [Updated] (TS-1114) Crash report: HttpTransactCache::SelectFromAlternates

2012-03-08 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1114:
--

Fix Version/s: (was: 3.1.5)
   3.1.3

this patch has been running perfectly in our production for weeks

 Crash report: HttpTransactCache::SelectFromAlternates
 -

 Key: TS-1114
 URL: https://issues.apache.org/jira/browse/TS-1114
 Project: Traffic Server
  Issue Type: Bug
Reporter: Zhao Yongming
Assignee: weijin
 Fix For: 3.1.3

 Attachments: cache_crash.diff


 it may or may not be an upstream issue; let us open it for tracking.
 {code}
 #0  0x0053075e in HttpTransactCache::SelectFromAlternates 
 (cache_vector=0x2aaab80ff500, client_request=0x2aaab80ff4c0, 
 http_config_params=0x2aaab547b800) at ../../proxy/hdrs/HTTP.h:1375
 1375        ((int32_t *) &val)[0] = m_alt->m_object_key[0];
 {code}
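The crashing statement copies the alternate's object key through a raw pointer cast, faulting when the alternate itself is bad. As a hedged sketch of the defensive variant (Alt and copy_object_key are illustrative stand-ins, not the real HTTP.h types): check the pointer first and copy with memcpy, which also avoids the aliasing cast:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-in for the alternate structure; the real layout lives in HTTP.h.
struct Alt {
  int32_t m_object_key[4];
};

// Copy the first 64 bits of the object key into val, refusing a null
// alternate instead of dereferencing it as the crash site did.
bool copy_object_key(const Alt *alt, int64_t *val) {
  if (alt == nullptr) return false;
  std::memcpy(val, alt->m_object_key, sizeof(*val));  // no pointer-cast write
  return true;
}
```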





[jira] [Updated] (TS-1002) log unmapped HOST when pristine_host_hdr disabled

2012-03-02 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1002:
--

Attachment: TS-1002.patch

I have created a new cquuh ( client_req_unmapped_url_host ) log field, which 
is what you need; this patch applies to git master

config:
  <Format = "%<cquuc> %<cquup> %<cquuh>"/>
result:
http://cdn.zymlinux.net/trafficserver/ts75.png /trafficserver/ts75.png 
cdn.zymlinux.net

please test & review

 log unmapped HOST when pristine_host_hdr disabled
 -

 Key: TS-1002
 URL: https://issues.apache.org/jira/browse/TS-1002
 Project: Traffic Server
  Issue Type: Wish
  Components: Logging
Reporter: Conan Wang
Assignee: Zhao Yongming
Priority: Minor
 Fix For: 3.1.5

 Attachments: TS-1002.patch


 I want to log the user request's Host header before remap. I write 
 logs_xml.config like: %<{Host}cqh>
 When proxy.config.url_remap.pristine_host_hdr is enabled, I get the 
 right Host, which is not rewritten.
 When the config is disabled, I always get the rewritten/mapped Host, which is 
 not what I need.
 logs_xml reference: 
 http://trafficserver.apache.org/docs/v2/admin/logfmts.htm#66912





[jira] [Updated] (TS-1002) log unmapped HOST when pristine_host_hdr disabled

2012-03-02 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1002:
--

Fix Version/s: (was: 3.1.5)
   3.1.3

 log unmapped HOST when pristine_host_hdr disabled
 -

 Key: TS-1002
 URL: https://issues.apache.org/jira/browse/TS-1002
 Project: Traffic Server
  Issue Type: Wish
  Components: Logging
Reporter: Conan Wang
Assignee: Zhao Yongming
Priority: Minor
 Fix For: 3.1.3

 Attachments: TS-1002.patch


 I want to log the user request's Host header before remap. I write 
 logs_xml.config like: %<{Host}cqh>
 When proxy.config.url_remap.pristine_host_hdr is enabled, I get the 
 right Host, which is not rewritten.
 When the config is disabled, I always get the rewritten/mapped Host, which is 
 not what I need.
 logs_xml reference: 
 http://trafficserver.apache.org/docs/v2/admin/logfmts.htm#66912





[jira] [Updated] (TS-348) Infinite core file creation

2012-03-02 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-348:
-

Assignee: Zhao Yongming

we should make sure the admin user is there, the attached patch is a good 
proposal.

 Infinite core file creation
 ---

 Key: TS-348
 URL: https://issues.apache.org/jira/browse/TS-348
 Project: Traffic Server
  Issue Type: Bug
  Components: Configuration
Affects Versions: 2.1.0
Reporter: Mladen Turk
Assignee: Zhao Yongming
Priority: Critical
 Fix For: sometime

 Attachments: ts-348-proposed.patch


 If traffic server is started with a non-root user account, the launch script 
 endlessly loops in start attempts, generating a core.PID file on each 
 iteration.
 This creates an 80+ MB core file about every second until the disk gets full.
 The following log entry is added on each iteration:
 [E. Mgmt] log ==> [TrafficManager] using root directory 
 '/home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local'
 [May 13 12:50:18.299] {3086546656} STATUS: opened 
 var/log/trafficserver/manager.log
 [TrafficServer] using root directory 
 '/home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local'
 [May 13 12:50:20.830] {1074246896} STATUS: opened 
 var/log/trafficserver/diags.log
 FATAL: Can't change group to user: nobody, gid: 99
 /home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local/bin/traffic_server
  - STACK TRACE:
 /home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local/bin/traffic_server(ink_fatal_va+0x8f)[0x83451c7]
 /home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local/bin/traffic_server(ink_fatal_die+0x1d)[0x83451f7]
 /home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local/bin/traffic_server(_Z14change_uid_gidPKc+0xd8)[0x8152a52]
 /home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local/bin/traffic_server(main+0x1296)[0x8153e68]
 /lib/libc.so.6(__libc_start_main+0xdc)[0x7bee9c]
 /home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local/bin/traffic_server[0x80f3b31]
 [May 13 12:50:21.176] Manager {3086546656} ERROR: 
 [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 6: 
 Aborted
 [May 13 12:50:21.176] Manager {3086546656} ERROR:  (last system error 2: No 
 such file or directory)
 [May 13 12:50:21.176] Manager {3086546656} ERROR: [Alarms::signalAlarm] 
 Server Process was reset
 [May 13 12:50:21.176] Manager {3086546656} ERROR:  (last system error 2: No 
 such file or directory)





[jira] [Updated] (TS-1111) crash in RangeTransform::handle_event

2012-02-28 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1111:
--

Fix Version/s: (was: 3.1.4)
   3.1.3

 crash in RangeTransform::handle_event
 -

 Key: TS-1111
 URL: https://issues.apache.org/jira/browse/TS-1111
 Project: Traffic Server
  Issue Type: Bug
Reporter: Zhao Yongming
Assignee: weijin
  Labels: crash
 Fix For: 3.1.3

 Attachments: transform.patch


 we have some crashes in range request processing. it is on our own tree 
 based on 3.0.x, and it may be the root cause of the do_io_close issue; we are 
 still looking at how to fix it. feedback is welcome.
 {code}
 #0  0x004dc624 in RangeTransform::handle_event (this=0x2aaed0001fb0, 
 event=1, edata=value optimized out) at Transform.cc:926
 926         Debug("transform_range", "RangeTransform destroy: %d", 
 m_output_vio ? m_output_vio->ndone : 0);
 (gdb) bt
 #0  0x004dc624 in RangeTransform::handle_event (this=0x2aaed0001fb0, 
 event=1, edata=value optimized out) at Transform.cc:926
 #1  0x0069145f in EThread::process_event (this=0x2b332010, 
 e=0x591d410, calling_code=1) at I_Continuation.h:146
 #2  0x00691b8b in EThread::execute (this=0x2b332010) at 
 UnixEThread.cc:218
 #3  0x0069092e in spawn_thread_internal (a=0x440bfa0) at Thread.cc:88
 #4  0x00359fe0673d in pthread_create@@GLIBC_2.2.5 () from 
 /lib64/libpthread.so.0
 #5  0x0000000000000000 in ?? ()
 (gdb) p m_output_vio
 $1 = (VIO *) 0x2aaed0029fe8
 (gdb) p *m_output_vio
 Cannot access memory at address 0x2aaed0029fe8
 {code}
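`p *m_output_vio` failing in gdb shows the VIO memory was already freed when handle_event ran. A hedged sketch of the defensive pattern involved (VIO and RangeState are stand-ins, not the real Transform.cc classes): null the pointer at teardown so later uses, like the crashing Debug() line, see null instead of a dangling address:

```cpp
#include <cassert>
#include <cstdint>

// Minimal stand-in for the real VIO.
struct VIO {
  int64_t ndone = 0;
};

struct RangeState {
  VIO *m_output_vio = nullptr;

  // Tear down the output side without leaving the pointer dangling.
  void close_output() { m_output_vio = nullptr; }

  // Mirrors the guarded expression in the crashing Debug() call.
  int64_t done() const { return m_output_vio ? m_output_vio->ndone : 0; }
};
```

The ternary in the Debug() call only helps if the pointer is actually cleared on close; a non-null pointer into freed memory defeats the guard, which is what the backtrace shows.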





[jira] [Updated] (TS-1109) stack dump may crash too

2012-02-26 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1109:
--

Fix Version/s: (was: 3.1.4)
   3.1.3

 stack dump may crash too
 

 Key: TS-1109
 URL: https://issues.apache.org/jira/browse/TS-1109
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 3.1.2
Reporter: Zhao Yongming
Assignee: weijin
  Labels: crash
 Fix For: 3.1.3

 Attachments: cop_crash.diff


 the code doing the stack dump may itself crash; in that case you will not be 
 able to get a core file, which will hide most of the rare issues.





[jira] [Updated] (TS-1111) crash in RangeTransform::handle_event

2012-02-20 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1111:
--

Attachment: transform.patch

this is a stupid patch for the current issue; it will not prevent all the 
crashing. taorui (Wei Jin) will submit a cleaner fix and clear all the 
do_io_close related crashing. we are testing it in our production, so far so 
good.

 crash in RangeTransform::handle_event
 -

 Key: TS-1111
 URL: https://issues.apache.org/jira/browse/TS-1111
 Project: Traffic Server
  Issue Type: Bug
Reporter: Zhao Yongming
Assignee: weijin
  Labels: crash
 Fix For: 3.1.3

 Attachments: transform.patch


 we have some crashes in range request processing. it is on our own tree 
 based on 3.0.x, and it may be the root cause of the do_io_close issue; we are 
 still looking at how to fix it. feedback is welcome.
 {code}
 #0  0x004dc624 in RangeTransform::handle_event (this=0x2aaed0001fb0, 
 event=1, edata=value optimized out) at Transform.cc:926
 926         Debug("transform_range", "RangeTransform destroy: %d", 
 m_output_vio ? m_output_vio->ndone : 0);
 (gdb) bt
 #0  0x004dc624 in RangeTransform::handle_event (this=0x2aaed0001fb0, 
 event=1, edata=value optimized out) at Transform.cc:926
 #1  0x0069145f in EThread::process_event (this=0x2b332010, 
 e=0x591d410, calling_code=1) at I_Continuation.h:146
 #2  0x00691b8b in EThread::execute (this=0x2b332010) at 
 UnixEThread.cc:218
 #3  0x0069092e in spawn_thread_internal (a=0x440bfa0) at Thread.cc:88
 #4  0x00359fe0673d in pthread_create@@GLIBC_2.2.5 () from 
 /lib64/libpthread.so.0
 #5  0x0000000000000000 in ?? ()
 (gdb) p m_output_vio
 $1 = (VIO *) 0x2aaed0029fe8
 (gdb) p *m_output_vio
 Cannot access memory at address 0x2aaed0029fe8
 {code}





[jira] [Updated] (TS-1068) we should not wait all the prefetch clients finish

2011-12-31 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1068:
--

  Component/s: HTTP
  Description: when one request is being prefetched, we should not wait 
for all the prefetch clients, as that could take minutes or even hours.
Affects Version/s: 3.1.1

 we should not wait all the prefetch clients finish
 --

 Key: TS-1068
 URL: https://issues.apache.org/jira/browse/TS-1068
 Project: Traffic Server
  Issue Type: Sub-task
  Components: HTTP
Affects Versions: 3.1.1
Reporter: Zhao Yongming
 Fix For: 3.1.4


 when one request is being prefetched, we should not wait for all the prefetch 
 clients, as that could take minutes or even hours.





[jira] [Updated] (TS-1069) better handle of the gzipped content

2011-12-31 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1069:
--

Description: when the triggered URL is gzipped, the prefetch engine will skip 
that request; should it instead put that URL in the prefetch url list and 
start another request without accepting gzip encoding?
Summary: better handle of the gzipped content  (was: better handle of 
the gziped content)

 better handle of the gzipped content
 

 Key: TS-1069
 URL: https://issues.apache.org/jira/browse/TS-1069
 Project: Traffic Server
  Issue Type: Sub-task
Reporter: Zhao Yongming
 Fix For: 3.1.4


 when the triggered URL is gzipped, the prefetch engine will skip that 
 request; should it instead put that URL in the prefetch url list and start 
 another request without accepting gzip encoding?





[jira] [Updated] (TS-1059) prefetch segment fault

2011-12-29 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1059:
--

Fix Version/s: 3.1.3
   Labels: crash  (was: )

 prefetch segment fault
 --

 Key: TS-1059
 URL: https://issues.apache.org/jira/browse/TS-1059
 Project: Traffic Server
  Issue Type: Sub-task
  Components: HTTP
Affects Versions: 3.0.2
 Environment: linux(ubuntu)
Reporter: yunfei chen
  Labels: crash
 Fix For: 3.1.3


 I encountered a segment fault when I was testing the Prefetch module.
 ab -n 1 -c 1 -X 192.168.16.198:8080 -d 
 http://club.baobao.sohu.com/r-mmbb-3954004-0-29-900.html
 configuration in records.config
  CONFIG proxy.config.prefetch.prefetch_enabled INT 1




[jira] [Updated] (TS-1059) prefetch segment fault

2011-12-25 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1059:
--

Issue Type: Sub-task  (was: Bug)
Parent: TS-893

 prefetch segment fault
 --

 Key: TS-1059
 URL: https://issues.apache.org/jira/browse/TS-1059
 Project: Traffic Server
  Issue Type: Sub-task
  Components: HTTP
Affects Versions: 3.0.2
 Environment: linux(ubuntu)
Reporter: yunfei chen

 I encountered a segmentation fault while testing the Prefetch module.
 ab -n 1 -c 1 -X 192.168.16.198:8080 -d 
 http://club.baobao.sohu.com/r-mmbb-3954004-0-29-900.html
 configuration in records.config
  CONFIG proxy.config.prefetch.prefetch_enabled INT 1
 configuration in prefetch.config
  prefetch_children 192.168.16.198
  html_tag img src
 #0  0x08124f17 in VIO::reenable (this=0x0) at 
 ../iocore/eventsystem/P_VIO.h:123
 #1  0x08147fe3 in KeepAliveConn::append (this=0xab9aef20, rdr=0x9c91b794) at 
 Prefetch.cc:1984
 #2  0x08145fd7 in KeepAliveConnTable::append (this=0xb2393608, ip=16777343, 
 buf=0x9c91b780, reader=0x9c91b794) at Prefetch.cc:2039
 #3  0x0814679b in KeepAliveLockHandler::handleEvent (this=0xb23e0b30, 
 event=2, data=0x8ab2f60) at Prefetch.cc:2168
 #4  0x08104ba5 in Continuation::handleEvent (this=0xb23e0b30, event=2, 
 data=0x8ab2f60)
 at ../iocore/eventsystem/I_Continuation.h:146
 #5  0x0830a9f5 in EThread::process_event (this=0xb7396008, e=0x8ab2f60, 
 calling_code=2) at UnixEThread.cc:140
 #6  0x0830add5 in EThread::execute (this=0xb7396008) at UnixEThread.cc:217
 #7  0x0830900e in spawn_thread_internal (a=0x895eed8) at Thread.cc:88
 #8  0x00165cc9 in start_thread (arg=0xb6f91b70) at pthread_create.c:304
 #9  0x0066f69e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
 #0  0x08124f17 in VIO::reenable (this=0x0) at 
 ../iocore/eventsystem/P_VIO.h:123
 #1  0x08147fe3 in KeepAliveConn::append (this=0xa5736888, rdr=0x8d0b2f4) at 
 Prefetch.cc:1984
 #2  0x08145fd7 in KeepAliveConnTable::append (this=0xb2393608, ip=16777343, 
 buf=0x8d0b2e0, reader=0x8d0b2f4) at Prefetch.cc:2039
 #3  0x08141db3 in PrefetchUrlBlaster::udpUrlBlaster (this=0x8abd3e0, 
 event=3300, data=0x0) at Prefetch.cc:885
 #4  0x0813e4ea in PrefetchUrlBlaster::init (this=0x8abd3e0, 
 list_head=0xabc59ac0, u_proto=TCP_BLAST) at Prefetch.h:280
 #5  0x08147806 in BlasterUrlList::invokeUrlBlaster (this=0xa7c22260) at 
 Prefetch.h:287
 #6  0x08141ac8 in BlasterUrlList::handleEvent (this=0xa7c22260, event=3302, 
 data=0xabc59ac0) at Prefetch.cc:803
 #7  0x08143c89 in PrefetchBlaster::handleEvent (this=0xa5739920, event=2, 
 data=0x0) at Prefetch.cc:1420
 #8  0x08144f42 in PrefetchBlaster::invokeBlaster (this=0xa5739920) at 
 Prefetch.cc:1769
 #9  0x08143e22 in PrefetchBlaster::handleEvent (this=0xa5739920, event=1102, 
 data=0xb23cdca0) at Prefetch.cc:1448
 #10 0x08104ba5 in Continuation::handleEvent (this=0xa5739920, event=1102, 
 data=0xb23cdca0)
 at ../iocore/eventsystem/I_Continuation.h:146
 #11 0x082c1abf in CacheVC::callcont (this=0xb23cdca0, event=1102) at 
 P_CacheInternal.h:629
 #12 0x082c1487 in CacheVC::openReadStartHead (this=0xb23cdca0, event=3900, 
 e=0x0) at CacheRead.cc:1115
 #13 0x08104ba5 in Continuation::handleEvent (this=0xb23cdca0, event=3900, 
 data=0x0) at ../iocore/eventsystem/I_Continuation.h:146
 #14 0x082c1431 in CacheVC::openReadStartHead (this=0xb23cdca0, event=2, 
 e=0x8ab48a0) at CacheRead.cc:1112
 #15 0x08104ba5 in Continuation::handleEvent (this=0xb23cdca0, event=2, 
 data=0x8ab48a0)
 at ../iocore/eventsystem/I_Continuation.h:146
 #16 0x0830a9f5 in EThread::process_event (this=0xb7295008, e=0x8ab48a0, 
 calling_code=2) at UnixEThread.cc:140
 #17 0x0830add5 in EThread::execute (this=0xb7295008) at UnixEThread.cc:217
 #18 0x0830900e in spawn_thread_internal (a=0x895dd00) at Thread.cc:88
 #19 0x00165cc9 in start_thread (arg=0xb6e90b70) at pthread_create.c:304
 #20 0x0066f69e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (TS-1034) reduce futex locking period

2011-11-28 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1034:
--

Description: 
we need to reduce the futex locking period. Here is a simple test on my 24-core 
HP380 system, with 24 ab instances and all content cached in memory:
{code}
#!/bin/sh
for i in {1..24}
do
 ab -n 1 -c 16 -X 127.0.0.$i:8080 
http://img02.taobaocdn.com/tps/i2/T1o0ypXk4w-1000-40.png?$i 
done
{code}
result:
{code}
Every 2.0s: echo show:proxy-stats | traffic_shell   
   Mon Nov 28 16:06:42 2011

Successfully Initialized MgmtAPI in /var/run/trafficserver

Document Hit Rate  100.00 %  *
Bandwidth Saving - 100.00 %  *
Cache Percent Free --- 99.999619 %
Open Server Connections -- 0
Open Client Connections -- 9 
Open Cache Connections --- 2
Client Throughput  6824.747070 MBit/Sec
Transaction Per Second --- 53914.925781

* Value represents 10 second average.



strace -c -p 11712
Process 11712 attached - interrupt to quit
^CProcess 11712 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 26.85    0.890335          15     58920           writev
 24.45    0.810866           7    118147           epoll_ctl
 22.27    0.738451          13     58920           close
 11.50    0.381362           6     59227           getsockname
  9.86    0.326843           3    119192     59228 read
  3.53    0.117058          16      7100      1931 futex
  1.53    0.050884          58       884           epoll_wait
  0.00    0.000037           0       404           rt_sigprocmask
  0.00    0.000000           0         3           write
  0.00    0.000000           0         2           brk
  0.00    0.000000           0        10           msync
------ ----------- ----------- --------- --------- ----------------
100.00    3.315836                422809     61159 total


{code}

  was:
we need to reduce futex locking period, here is a simple testing in my 24cores 
HP380 system, with 24 ab:
{codes}
#!/bin/sh
for i in {1..24}
do
 ab -n 1 -c 16 -X 127.0.0.$i:8080 
http://img02.taobaocdn.com/tps/i2/T1o0ypXk4w-1000-40.png?$i 
done
{codes}
result:
{codes}
Every 2.0s: echo show:proxy-stats | traffic_shell   
   Mon Nov 28 16:06:42 2011

Successfully Initialized MgmtAPI in /var/run/trafficserver

Document Hit Rate  100.00 %  *
Bandwidth Saving - 100.00 %  *
Cache Percent Free --- 99.999619 %
Open Server Connections -- 0
Open Client Connections -- 9 
Open Cache Connections --- 2
Client Throughput  6824.747070 MBit/Sec
Transaction Per Second --- 53914.925781

* Value represents 10 second average.



[r...@hp380g7test.sqa.cm4 ~]# strace -c -p 11712
Process 11712 attached - interrupt to quit
^CProcess 11712 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 26.85    0.890335          15     58920           writev
 24.45    0.810866           7    118147           epoll_ctl
 22.27    0.738451          13     58920           close
 11.50    0.381362           6     59227           getsockname
  9.86    0.326843           3    119192     59228 read
  3.53    0.117058          16      7100      1931 futex
  1.53    0.050884          58       884           epoll_wait
  0.00    0.000037           0       404           rt_sigprocmask
  0.00    0.000000           0         3           write
  0.00    0.000000           0         2           brk
  0.00    0.000000           0        10           msync
------ ----------- ----------- --------- --------- ----------------
100.00    3.315836                422809     61159 total
[r...@hp380g7test.sqa.cm4 ~]# 

{codes}


 reduce futex locking period
 ---

 Key: TS-1034
 URL: https://issues.apache.org/jira/browse/TS-1034
 Project: Traffic Server
  Issue Type: Improvement
  Components: Core, HTTP
Affects Versions: 3.1.1
Reporter: Zhao Yongming
Assignee: Zhao Yongming

 we need to reduce the futex locking period. Here is a simple test on my 
 24-core HP380 system, with 24 ab instances and all content cached in memory:
 {code}
 #!/bin/sh
 for i in {1..24}
 do
  ab -n 1 -c 16 -X 127.0.0.$i:8080 
 http://img02.taobaocdn.com/tps/i2/T1o0ypXk4w-1000-40.png?$i 
 done
 {code}
 result:
 {code}
 Every 2.0s: echo show:proxy-stats | traffic_shell 
  Mon Nov 28 16:06:42 2011
 Successfully Initialized MgmtAPI in /var/run/trafficserver
 Document Hit Rate  100.00 %  *
 Bandwidth Saving - 100.00 %  *
 Cache Percent Free --- 99.999619 %
 Open Server Connections -- 0
 Open 

[jira] [Updated] (TS-1006) memory management, cut down memory waste ?

2011-11-24 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1006:
--

Attachment: memusage.ods

This is the result after mohan_zl's patch that uses an rbtree to reduce the 
memory-alignment waste; 1.8G for my env.

 memory management, cut down memory waste ?
 --

 Key: TS-1006
 URL: https://issues.apache.org/jira/browse/TS-1006
 Project: Traffic Server
  Issue Type: Improvement
  Components: Core
Affects Versions: 3.1.1
Reporter: Zhao Yongming
Assignee: Zhao Yongming
 Fix For: 3.1.3

 Attachments: memusage.ods, memusage.ods


 when we review the memory usage in the production, there is something 
 abnormal, ie, looks like TS take much memory than index data + common system 
 waste, and here is some memory dump result by set 
 proxy.config.dump_mem_info_frequency
 1, the one on a not so busy forwarding system:
 physics memory: 32G
 RAM cache: 22G
 DISK: 6140 GB
 average_object_size 64000
 {code}
   allocated |     in-use |  type size | free list name
 ------------|------------|------------|--------------------------------------
   671088640 |   37748736 |    2097152 | memory/ioBufAllocator[14]
  2248146944 | 2135949312 |    1048576 | memory/ioBufAllocator[13]
  1711276032 | 1705508864 |     524288 | memory/ioBufAllocator[12]
  1669332992 | 1667760128 |     262144 | memory/ioBufAllocator[11]
  2214592512 |     221184 |     131072 | memory/ioBufAllocator[10]
  2325741568 | 2323775488 |      65536 | memory/ioBufAllocator[9]
  2091909120 | 2089123840 |      32768 | memory/ioBufAllocator[8]
  1956642816 | 1956478976 |      16384 | memory/ioBufAllocator[7]
  2094530560 | 2094071808 |       8192 | memory/ioBufAllocator[6]
   356515840 |  355540992 |       4096 | memory/ioBufAllocator[5]
     1048576 |      14336 |       2048 | memory/ioBufAllocator[4]
      131072 |          0 |       1024 | memory/ioBufAllocator[3]
       65536 |          0 |        512 | memory/ioBufAllocator[2]
       32768 |          0 |        256 | memory/ioBufAllocator[1]
       16384 |          0 |        128 | memory/ioBufAllocator[0]
           0 |          0 |        576 | memory/ICPRequestCont_allocator
           0 |          0 |        112 | memory/ICPPeerReadContAllocator
           0 |          0 |        432 | memory/PeerReadDataAllocator
           0 |          0 |         32 | memory/MIMEFieldSDKHandle
           0 |          0 |        240 | memory/INKVConnAllocator
           0 |          0 |         96 | memory/INKContAllocator
        4096 |          0 |         32 | memory/apiHookAllocator
           0 |          0 |        288 | memory/FetchSMAllocator
           0 |          0 |         80 | memory/prefetchLockHandlerAllocator
           0 |          0 |        176 | memory/PrefetchBlasterAllocator
           0 |          0 |         80 | memory/prefetchUrlBlaster
           0 |          0 |         96 | memory/blasterUrlList
           0 |          0 |         96 | memory/prefetchUrlEntryAllocator
           0 |          0 |        128 | memory/socksProxyAllocator
           0 |          0 |        144 | memory/ObjectReloadCont
     3258368 |     576016 |        592 | memory/httpClientSessionAllocator
      825344 |     139568 |        208 | memory/httpServerSessionAllocator
    22597632 |    1284848 |       9808 | memory/httpSMAllocator
           0 |          0 |         32 | memory/CacheLookupHttpConfigAllocator
           0 |          0 |       9856 | memory/httpUpdateSMAllocator
           0 |          0 |        128 | memory/RemapPluginsAlloc
           0 |          0 |         48 | memory/CongestRequestParamAllocator
           0 |          0 |        128 | memory/CongestionDBContAllocator
     5767168 |     704512 |       2048 | memory/hdrStrHeap
    18350080 |    1153024 |       2048 | memory/hdrHeap
       53248 |       2912 |        208 | memory/httpCacheAltAllocator
           0 |          0 |        112 | memory/OneWayTunnelAllocator
      157696 |

[jira] [Updated] (TS-1029) DNS crash if we free the memory into system

2011-11-23 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1029:
--

Description: 
when we start testing freeing memory back to the system, DNS crashes in 
dns_result():
{code}
  if (!e->post(h, ent)) {
    for (int i = 0; i < MAX_DNS_RETRIES; i++) {
      if (e->id[i] < 0)
        break;
      h->release_query_id(e->id[i]);
    }
    return;
  }
{code}

  was:
when we start to testing free memory into system, the DNS will cause crashing 
at:
dns_result()
{codes}
  if (!e->post(h, ent)) {
    for (int i = 0; i < MAX_DNS_RETRIES; i++) {
      if (e->id[i] < 0)
        break;
      h->release_query_id(e->id[i]);
    }
    return;
  }
{codes}


 DNS crash if we free the memory into system
 ---

 Key: TS-1029
 URL: https://issues.apache.org/jira/browse/TS-1029
 Project: Traffic Server
  Issue Type: Bug
  Components: DNS
Affects Versions: 3.1.2
Reporter: Zhao Yongming
Assignee: weijin
 Fix For: 3.1.2


 when we start testing freeing memory back to the system, DNS crashes in 
 dns_result():
 {code}
   if (!e->post(h, ent)) {
     for (int i = 0; i < MAX_DNS_RETRIES; i++) {
       if (e->id[i] < 0)
         break;
       h->release_query_id(e->id[i]);
     }
     return;
   }
 {code}





[jira] [Updated] (TS-1003) prefetch: the config file

2011-10-24 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1003:
--

  Component/s: Configuration
  Description: PreFetch and Update are the strangest plugins kept in the proxy 
dir. We have some PreFetch APIs that should see real use, but the prefetch 
config file and its options are not managed like the other config files in the 
tree. We should make PreFetch a big feature for TS and smooth out all the ugly 
code.
Fix Version/s: (was: 3.1.4)
   2.1.3

When dealing with config file and config option updates, I think we need to 
figure out some issues later:
1, how to deal with configs that our plugins need:
 1.1, how to add a config file to the mgmt system. Config files are monitored 
by mgmt, and the file list is hard-coded into fileUpdated(), which leaves no 
flexibility while the number of config files keeps increasing.
 1.2, how to deal with the config option callbacks? is there any guide in the 
code?
2, should we open a discussion on config file syntax?
 2.1, why are there so many types and so much code handling these syntaxes?

I think we need some roadmap on the configs, at least some directions :D

 prefetch: the config file
 -

 Key: TS-1003
 URL: https://issues.apache.org/jira/browse/TS-1003
 Project: Traffic Server
  Issue Type: Sub-task
  Components: Configuration
Reporter: Zhao Yongming
Assignee: Zhao Yongming
 Fix For: 2.1.3


 PreFetch and Update are the strangest plugins kept in the proxy dir. We have 
 some PreFetch APIs that should see real use, but the prefetch config file and 
 its options are not managed like the other config files in the tree. We 
 should make PreFetch a big feature for TS and smooth out all the ugly code.





[jira] [Updated] (TS-1003) prefetch: the config file

2011-10-24 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1003:
--

Attachment: TS-1003.patch

here is the patch that makes the config file behave the same as the others.

 prefetch: the config file
 -

 Key: TS-1003
 URL: https://issues.apache.org/jira/browse/TS-1003
 Project: Traffic Server
  Issue Type: Sub-task
  Components: Configuration
Reporter: Zhao Yongming
Assignee: Zhao Yongming
 Fix For: 2.1.3

 Attachments: TS-1003.patch


 PreFetch and Update are the strangest plugins kept in the proxy dir. We have 
 some PreFetch APIs that should see real use, but the prefetch config file and 
 its options are not managed like the other config files in the tree. We 
 should make PreFetch a big feature for TS and smooth out all the ugly code.





[jira] [Updated] (TS-994) X-Forwarded-For should follow the squid way

2011-10-21 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-994:
-

Attachment: XFF.patch

a quick fix

 X-Forwarded-For should follow the squid way
 ---

 Key: TS-994
 URL: https://issues.apache.org/jira/browse/TS-994
 Project: Traffic Server
  Issue Type: Improvement
  Components: HTTP
Affects Versions: 3.1.1
Reporter: Zhao Yongming
Assignee: Zhao Yongming
Priority: Minor
 Fix For: 3.1.1

 Attachments: XFF.patch


 TS appends the IP in the format: 
 X-Forwarded-For: 127.0.0.1,  10.32.102.41\r\n
 while http://en.wikipedia.org/wiki/X-Forwarded-For says:
 X-Forwarded-For: client1, proxy1, proxy2
 note that TS emits 2 spaces after the comma instead of 1.





[jira] [Updated] (TS-899) ts crash

2011-10-18 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-899:
-

Fix Version/s: (was: 3.1.1)
   3.1.3

The transform-related issue is still in progress; it is not that easy to fix. 
weijin is moving forward slowly.

 ts crash
 

 Key: TS-899
 URL: https://issues.apache.org/jira/browse/TS-899
 Project: Traffic Server
  Issue Type: Sub-task
  Components: HTTP, MIME
Affects Versions: 3.0.1
 Environment: readhat5.5, ts-3.0.1, X86-64
Reporter: weijin
Assignee: weijin
 Fix For: 3.1.3


 If a request URL is forbidden and then redirected to another URL, TS crashes.





[jira] [Updated] (TS-910) log collation in custom log will make dedicate connection to the same collation server

2011-10-15 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-910:
-

Fix Version/s: (was: 3.1.1)
   3.1.3

Well, it is not that urgent, as most clusters are under 32 hosts. I will try 
the fix in the next releases.

 log collation in custom log will make dedicate connection to the same 
 collation server
 --

 Key: TS-910
 URL: https://issues.apache.org/jira/browse/TS-910
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.1.0
Reporter: Zhao Yongming
Assignee: Zhao Yongming
 Fix For: 3.1.3


 when you define a LogObject in logs_xml.config and set CollationHosts, it 
 will open connections for each LogObject, even if you put the same host in 
 CollationHosts.
 it affects the default squid logging too. 





[jira] [Updated] (TS-801) Crash Report: enable update will triger Segmentation fault

2011-10-12 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-801:
-

Fix Version/s: (was: 3.1.1)
   3.1.2

I think the update code still needs checking, as I can reproduce the crash on trunk:

{code}
[Oct 13 10:35:28.724] Server {0x2b71049b3860} DEBUG: (update) (R) speculative 
start update id: 7 [http://cdn.zymlinux.net/icons/blank.gif]
[Oct 13 10:35:28.724] Server {0x2b710587e700} DEBUG: (update) Start HTTP GET 
id: 7 [http://cdn.zymlinux.net/icons/blank.gif]
NOTE: Traffic Server received Sig 11: Segmentation fault
/opt/ats/bin/traffic_server - STACK TRACE: 
/opt/ats/bin/traffic_server[0x53b188]
/lib64/libpthread.so.0(+0xf400)[0x2b7101e84400]
[0x6e]
/opt/ats/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
(*)(HttpTransact::State*))+0x6e)[0x591a86]
/opt/ats/bin/traffic_server(HttpSM::set_next_state()+0x165)[0x591cdf]
/opt/ats/bin/traffic_server(HttpUpdateSM::set_next_state()+0xad)[0x5c8861]
/opt/ats/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
(*)(HttpTransact::State*))+0x13d)[0x591b55]
/opt/ats/bin/traffic_server(HttpSM::handle_api_return()+0x138)[0x5816ca]
/opt/ats/bin/traffic_server(HttpUpdateSM::handle_api_return()+0x45)[0x5c84d1]
/opt/ats/bin/traffic_server(HttpSM::do_api_callout()+0x3f)[0x596d69]
/opt/ats/bin/traffic_server(HttpSM::set_next_state()+0x64)[0x591bde]
/opt/ats/bin/traffic_server(HttpUpdateSM::set_next_state()+0xad)[0x5c8861]
/opt/ats/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
(*)(HttpTransact::State*))+0x13d)[0x591b55]
/opt/ats/bin/traffic_server(HttpSM::handle_api_return()+0x138)[0x5816ca]
/opt/ats/bin/traffic_server(HttpUpdateSM::handle_api_return()+0x45)[0x5c84d1]
/opt/ats/bin/traffic_server(HttpSM::do_api_callout()+0x3f)[0x596d69]
/opt/ats/bin/traffic_server(HttpSM::set_next_state()+0x64)[0x591bde]
/opt/ats/bin/traffic_server(HttpUpdateSM::set_next_state()+0xad)[0x5c8861]
/opt/ats/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void 
(*)(HttpTransact::State*))+0x13d)[0x591b55]
/opt/ats/bin/traffic_server(HttpUpdateSM::handle_api_return()+0x34)[0x5c84c0]
/opt/ats/bin/traffic_server(HttpSM::do_api_callout()+0x3f)[0x596d69]
/opt/ats/bin/traffic_server(HttpSM::state_add_to_list(int, 
void*)+0x2b3)[0x57e31f]
/opt/ats/bin/traffic_server(HttpSM::main_handler(int, void*)+0x2a0)[0x584860]
/opt/ats/bin/traffic_server(Continuation::handleEvent(int, 
void*)+0x68)[0x4f4e90]
/opt/ats/bin/traffic_server(HttpUpdateSM::start_scheduled_update(Continuation*, 
HTTPHdr*)+0x174)[0x5c8438]
/opt/ats/bin/traffic_server(UpdateSM::http_scheme(UpdateSM*)+0x279)[0x54a1b5]
/opt/ats/bin/traffic_server(UpdateSM::HandleSMEvent(int, 
Event*)+0x1ab)[0x549d0f]
/opt/ats/bin/traffic_server(Continuation::handleEvent(int, 
void*)+0x68)[0x4f4e90]
/opt/ats/bin/traffic_server(EThread::process_event(Event*, int)+0x127)[0x704069]
/opt/ats/bin/traffic_server(EThread::execute()+0x9a)[0x704272]
/opt/ats/bin/traffic_server[0x703240]
/lib64/libpthread.so.0(+0x6d5c)[0x2b7101e7bd5c]
/lib64/libc.so.6(clone+0x6d)[0x2b71044f72dd]
[Oct 13 10:35:28.794] Manager {0x7f8adbb55720} ERROR: 
[LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 11: 
Segmentation fault
[Oct 13 10:35:28.794] Manager {0x7f8adbb55720} ERROR:  (last system error 2: No 
such file or directory)
[Oct 13 10:35:28.794] Manager {0x7f8adbb55720} ERROR: [Alarms::signalAlarm] 
Server Process was reset
[Oct 13 10:35:28.794] Manager {0x7f8adbb55720} ERROR:  (last system error 2: No 
such file or directory)
[Oct 13 10:35:29.798] Manager {0x7f8adbb55720} NOTE: [LocalManager::startProxy] 
Launching ts process
{code}

 Crash Report: enable update will triger Segmentation fault
 --

 Key: TS-801
 URL: https://issues.apache.org/jira/browse/TS-801
 Project: Traffic Server
  Issue Type: Bug
  Components: HTTP
Affects Versions: 2.1.8
 Environment: v2.1.8 and update function enabled.
Reporter: Zhao Yongming
  Labels: update
 Fix For: 3.1.2


 {code}
 b13621367...@hotmail.com: NOTE: Traffic Server received Sig 11: Segmentation 
 fault
 /usr/local/ts/bin/traffic_server - STACK TRACE:
 b13621367...@hotmail.com: 
 /usr/local/ts/bin/traffic_server[0x5260c9]
 /lib64/libpthread.so.0[0x3088e0f4c0]
 [0x6e]
 /usr/local/ts/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void
  (*)(HttpTransact::State*))+0x6e)[0x57e0e2]
 /usr/local/ts/bin/traffic_server(HttpSM::set_next_state()+0x18b)[0x57e369]
 /usr/local/ts/bin/traffic_server(HttpUpdateSM::set_next_state()+0xad)[0x5b604b]
 /usr/local/ts/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void
  (*)(HttpTransact::State*))+0x15e)[0x57e1d2]
 

[jira] [Updated] (TS-893) the prefetch function in codes need more love to show up

2011-10-10 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-893:
-

Fix Version/s: (was: 3.1.1)
   3.1.2
 Assignee: Zhao Yongming

I have done some work on target #1, the config files. Given the ugly RecCore 
usage and the config-file watching interface, a better design for such a 
transform-related in-tree plugin will need more long-term tweaking. I will 
split the job and set up more tickets for the different targets, and keep this 
issue as the key to cleaning up all the related issues.  

 the prefetch function in codes need more love to show up
 

 Key: TS-893
 URL: https://issues.apache.org/jira/browse/TS-893
 Project: Traffic Server
  Issue Type: Improvement
Reporter: Zhao Yongming
Assignee: Zhao Yongming
 Fix For: 3.1.2


 the prefetch function in proxy is a good solution when you really need to 
 speed up your users' download time: it can parse any allowed plain HTML file, 
 pull out all the resource tags, and batch-load them from the OS. I am going 
 to preload my site before we put it online, as it will take about 1 month for 
 the disk to fill and the hit rate to stabilize. it is a cool feature but it 
 has the following issues:
 1, the prefetch config file is not managed well, i.e. it is not managed by 
 the cluster
 2, it does not have any documentation in the admin guide or the old pdf file.
 3, prefetching only handles plain HTML files, without compression; should we 
 do some decompression? is that possible?
 hopefully this is the start of making prefetch really useful for some 
 cutting-edge situations.
