[jira] [Commented] (TS-1212) can not limit ram cache
[ https://issues.apache.org/jira/browse/TS-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258003#comment-13258003 ]

Zhao Yongming commented on TS-1212:
---
When you restart the server with traffic_line -L, the total_bytes value changes:

{code}
[yonghao@cache177 ~]$ links -dump http://localhost:8080/stat/ | grep ram
proxy.config.cache.ram_cache.size=10737418240
proxy.config.cache.ram_cache_cutoff=131072
proxy.config.cache.ram_cache.algorithm=1
proxy.config.cache.ram_cache.compress=0
proxy.config.cache.ram_cache.ssd_percent=25
proxy.config.cache.ram_cache.compress_percent=90
proxy.process.cache.ram_cache.total_bytes=10737418239
proxy.process.cache.volume_0.ram_cache.total_bytes=-79456895011
proxy.process.cache.ram_cache.bytes_used=0
proxy.process.cache.ram_cache.hits=0
proxy.process.cache.ram_cache.misses=0
proxy.process.cache.ram.read.success=0
proxy.process.cache.volume_0.ram_cache.bytes_used=0
proxy.process.cache.volume_0.ram_cache.hits=0
proxy.process.cache.volume_0.ram_cache.misses=0
proxy.process.cache.volume_0.ram.read.success=0
[yonghao@cache177 ~]$ links -dump http://localhost:8080/stat/ | grep ram
proxy.config.cache.ram_cache.size=10737418240
proxy.config.cache.ram_cache_cutoff=131072
proxy.config.cache.ram_cache.algorithm=1
proxy.config.cache.ram_cache.compress=0
proxy.config.cache.ram_cache.ssd_percent=25
proxy.config.cache.ram_cache.compress_percent=90
proxy.process.cache.ram_cache.total_bytes=10737418239
proxy.process.cache.volume_0.ram_cache.total_bytes=-85899345956
proxy.process.cache.ram_cache.bytes_used=0
proxy.process.cache.ram_cache.hits=0
proxy.process.cache.ram_cache.misses=0
proxy.process.cache.ram.read.success=0
proxy.process.cache.volume_0.ram_cache.bytes_used=0
proxy.process.cache.volume_0.ram_cache.hits=0
proxy.process.cache.volume_0.ram_cache.misses=0
proxy.process.cache.volume_0.ram.read.success=0
{code}

can not limit ram cache
---
Key: TS-1212
URL: https://issues.apache.org/jira/browse/TS-1212
Project: Traffic Server
Issue Type: Bug
Components: Cache
Affects Versions: 2.1.3
Environment: we are on v3.0.x, but v3.1 and later may be affected too.
Reporter: Zhao Yongming

The RAM cache limit is sometimes not enforced. In the dump below, bytes_used is above the configured ram_cache.size and the per-volume total_bytes is negative:

{code}
[yonghao@cache177 ~]$ links -dump http://localhost:8080/stat/ | grep ram
proxy.config.cache.ram_cache.size=10737418240
proxy.config.cache.ram_cache_cutoff=131072
proxy.config.cache.ram_cache.algorithm=1
proxy.config.cache.ram_cache.compress=0
proxy.config.cache.ram_cache.ssd_percent=25
proxy.config.cache.ram_cache.compress_percent=90
proxy.process.cache.ram_cache.total_bytes=12884901886
proxy.process.cache.volume_0.ram_cache.total_bytes=-7301066
proxy.process.cache.ram_cache.bytes_used=11840122880
{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
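The symptom in this report can be checked mechanically. The following is a minimal sketch, not part of Traffic Server: it parses whitespace-separated `name=value` tokens such as those printed by the stat page, then flags any negative `ram_cache.total_bytes` and a `bytes_used` above the configured `ram_cache.size`. The function names `parse_stats` and `check_ram_cache` are made up for this illustration; only the stat names come from the report.

```cpp
#include <cstdint>
#include <exception>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse whitespace-separated "name=value" tokens into a map.
// Tokens without '=' or with a non-numeric value are skipped.
std::map<std::string, std::int64_t> parse_stats(const std::string &dump) {
  std::map<std::string, std::int64_t> stats;
  std::istringstream in(dump);
  std::string token;
  while (in >> token) {
    const std::string::size_type eq = token.find('=');
    if (eq == std::string::npos) continue;
    try {
      stats[token.substr(0, eq)] = std::stoll(token.substr(eq + 1));
    } catch (const std::exception &) {
      // Not a numeric stat; ignore it.
    }
  }
  return stats;
}

// Hypothetical helper: report the two anomalies visible in the report.
std::vector<std::string> check_ram_cache(const std::map<std::string, std::int64_t> &stats) {
  std::vector<std::string> problems;
  const std::string size_key = "proxy.config.cache.ram_cache.size";
  const std::string used_key = "proxy.process.cache.ram_cache.bytes_used";
  const std::string suffix = "ram_cache.total_bytes";

  // A byte total can never legitimately be negative.
  for (const auto &kv : stats) {
    const std::string &name = kv.first;
    const bool is_total = name.size() >= suffix.size() &&
                          name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
    if (is_total && kv.second < 0) problems.push_back(name + " is negative");
  }

  // The bytes in use should stay at or below the configured limit.
  const auto size = stats.find(size_key);
  const auto used = stats.find(used_key);
  if (size != stats.end() && used != stats.end() && used->second > size->second)
    problems.push_back(used_key + " exceeds " + size_key);
  return problems;
}
```

Feeding it the dump from the issue description yields two findings: the negative per-volume total and the overshoot of bytes_used.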
[jira] [Commented] (TS-1212) can not limit ram cache
[ https://issues.apache.org/jira/browse/TS-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258033#comment-13258033 ]

Zhao Yongming commented on TS-1212:
---
In the code:

{code}
static void
reg_int(const char *str, int stat, RecRawStatBlock *rsb, const char *prefix, RecRawStatSyncCb sync_cb=RecRawStatSyncSum)
{
  char stat_str[256];
  snprintf(stat_str, sizeof(stat_str), "%s.%s", prefix, str);
  RecRegisterRawStat(rsb, RECT_PROCESS, stat_str, RECD_INT, RECP_NON_PERSISTENT, stat, sync_cb);
  DOCACHE_CLEAR_DYN_STAT(stat)
}
#define REG_INT(_str, _stat) reg_int(_str, (int)_stat, rsb, prefix)

// Register Stats
void
register_cache_stats(RecRawStatBlock *rsb, const char *prefix)
{
  char stat_str[256];

  // Special case for this sucker, since it uses its own aggregator.
  reg_int("bytes_used", cache_bytes_used_stat, rsb, prefix, cache_stats_bytes_used_cb);
  REG_INT("bytes_total", cache_bytes_total_stat);
  snprintf(stat_str, sizeof(stat_str), "%s.%s", prefix, "ram_cache.total_bytes");
  RecRegisterRawStat(rsb, RECT_PROCESS, stat_str, RECD_INT, RECP_NULL, (int) cache_ram_cache_bytes_total_stat, RecRawStatSyncSum);
  REG_INT("ram_cache.bytes_used", cache_ram_cache_bytes_stat);
  REG_INT("ram_cache.hits", cache_ram_cache_hits_stat);
  REG_INT("ram_cache.misses", cache_ram_cache_misses_stat);
  REG_INT("pread_count", cache_pread_count_stat);
{code}

The prefixed ram_cache.total_bytes stat is registered with RECP_NULL, while the others use RECP_NON_PERSISTENT. What does that mean? From reading the code, I think RECP_NULL and RECP_PERSISTENT are both treated as RECP_PERSISTENT, so the value persists across restarts. Why do we use RECP_NULL here?

can not limit ram cache
---
Key: TS-1212
URL: https://issues.apache.org/jira/browse/TS-1212
Project: Traffic Server
Issue Type: Bug
Components: Cache
Affects Versions: 2.1.3
Environment: we are on v3.0.x, but v3.1 and later may be affected too.
Reporter: Zhao Yongming

The RAM cache limit is sometimes not enforced:

{code}
[yonghao@cache177 ~]$ links -dump http://localhost:8080/stat/ | grep ram
proxy.config.cache.ram_cache.size=10737418240
proxy.config.cache.ram_cache_cutoff=131072
proxy.config.cache.ram_cache.algorithm=1
proxy.config.cache.ram_cache.compress=0
proxy.config.cache.ram_cache.ssd_percent=25
proxy.config.cache.ram_cache.compress_percent=90
proxy.process.cache.ram_cache.total_bytes=12884901886
proxy.process.cache.volume_0.ram_cache.total_bytes=-7301066
proxy.process.cache.ram_cache.bytes_used=11840122880
{code}
[jira] [Commented] (TS-1164) a race condition in cache init
[ https://issues.apache.org/jira/browse/TS-1164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250767#comment-13250767 ]

Zhao Yongming commented on TS-1164:
---
In git master: 1df0305b798a725dbbc1b44a1522936a425de8eb

a race condition in cache init
---
Key: TS-1164
URL: https://issues.apache.org/jira/browse/TS-1164
Project: Traffic Server
Issue Type: Bug
Components: Cache
Affects Versions: 3.1.1, 3.0.0
Environment: all
Reporter: weijin
Assignee: weijin
Fix For: 3.1.4
Attachments: taorui-cache.diff

There is a race condition in CacheProcessor::diskInitialized that can prevent the cache from being enabled.
[jira] [Commented] (TS-1103) Traffic Server ESI plugin issues
[ https://issues.apache.org/jira/browse/TS-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13248517#comment-13248517 ]

Zhao Yongming commented on TS-1103:
---
Hmm, I think the code may need more cleanup, but so far we have a working base.

Traffic Server ESI plugin issues
---
Key: TS-1103
URL: https://issues.apache.org/jira/browse/TS-1103
Project: Traffic Server
Issue Type: Bug
Components: Plugins
Affects Versions: sometime
Environment: Newest trunk.
Reporter: Kevin Fox
Assignee: Zhao Yongming
Fix For: 3.1.4
Attachments: esi.patch, gzip.patch

Patch to fix:
* Makefile fix to add missing files.
* Change return code checking to match what trunk trafficserver does.
* Include missing header files.
* Fix C++ namespace issues.
* Work around a strange name mangling/linking issue.
* Force the assumption that the cached data is RAW_ESI, not PACKED_ESI. Things wouldn't work without it.
* Comment out a block of code that looked to be handling EOF incorrectly.

After this, simply loading the plugin and setting the response header X-ESI in apache httpd seems to work.

A few further bugs I have bumped into that aren't addressed in this patch:
* It doesn't seem to parse gzip the way it looks like it should. To work around this, I had to disable gzip in apache httpd with RewriteRule . - [E=no-gzip:1]
* If the client requests gzip, the ESI processor will gzip the result. It works in firefox but is invalid in chrome. Pulling a dump with curl and running it through gzip --list shows the correct uncompressed and compressed sizes. zcat shows the correct data but prints the warning: invalid compressed data--length error. As far as I read the gzip spec, though, the raw binary file looks valid to me. Not sure what this is. It can probably simply be disabled for now.
* esi:include is slightly broken. You get all the data back properly, but sometimes the headers are sent prematurely with a Content-Length of 2**31-1. This causes clients to time out and fail. I'm currently unsure how to fix this.

I've tried a few of the more advanced ESI features, including ensuring cookies make it back to the origin server, and things seem to work well. So, once the above bugs are figured out (particularly the include one), I think it will be in pretty good shape.
[jira] [Commented] (TS-1085) traffic_shell enable command doesn't work
[ https://issues.apache.org/jira/browse/TS-1085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246999#comment-13246999 ]

Zhao Yongming commented on TS-1085:
---
I think that is because traffic_shell and traffic_line need write permission on the Unix sockets, which in most cases are owned by nobody or ats with mode 0755:

{code}
[root@test55 trafficserver]# ls -l /var/run/trafficserver/ | grep ^s
srwxr-xr-x 1 ats ats 0 Apr 5 10:09 eventapisocket
srwxr-xr-x 1 ats ats 0 Apr 5 10:09 mgmtapisocket
srwxr-xr-x 1 ats ats 0 Apr 5 10:09 process_server
{code}

Either way, you cannot write to those sockets unless you are root (or the owning user), which is why traffic_line and traffic_shell need root privileges.

traffic_shell enable command doesn't work
---
Key: TS-1085
URL: https://issues.apache.org/jira/browse/TS-1085
Project: Traffic Server
Issue Type: Bug
Components: Management
Reporter: James Peach
Assignee: James Peach
Fix For: 3.3.0

Let's try this as root:

blacko:trafficserver.git jpeach$ sudo /opt/ats/bin/traffic_shell
Successfully Initialized MgmtAPI in /opt/ats/var/trafficserver
% trafficserver enable
Already Enabled
trafficserver exit

OK, let's try as non-root:

blacko:trafficserver.git jpeach$ /opt/ats/bin/traffic_shell
[connect] ERROR (main_socket_fd 3): Permission denied
TSInit 5: Failed to initialize MgmtAPI in /opt/ats/var/trafficserver
[connect] ERROR (main_socket_fd 3): Permission denied
% trafficserver enable
FATAL: ConfigCmd.cc:137: failed assert `enable_restricted_commands`
/opt/ats/bin/traffic_shell - STACK TRACE:
0 libtsutil.3.dylib 0x00010cec8b8b ink_fatal_va + 283
1 libtsutil.3.dylib 0x00010cec8e94 ink_fatal + 356
2 libtsutil.3.dylib 0x00010cec66ff _ink_assert + 271
3 traffic_shell 0x00010ce253ab _Z10Cmd_EnablePvP10Tcl_InterpiPPKc + 395
4 Tcl 0x00010cf34261 TclInvokeStringCommand + 121
5 Tcl 0x00010cf360b7 Tcl_GetMathFuncInfo + 2533
6 Tcl 0x00010cf36d14 Tcl_GetMathFuncInfo + 5698
7 Tcl 0x00010cf370d2 Tcl_Eval + 42

So either enable does nothing or it crashes. Seems like we should fix or remove this.
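The permission argument in the comment above comes down to plain mode-bit arithmetic. Below is a minimal sketch assuming classic Unix semantics (check the owner class first, then group, then other, with root bypassing the bits) and assuming, as on Linux, that connecting to a Unix socket needs write permission on the socket file. `can_write` is a hypothetical helper for illustration; it is not part of Traffic Server and ignores supplementary groups, ACLs, and capabilities.

```cpp
#include <sys/types.h>
#include <sys/stat.h>

// Hypothetical helper: would a process running as (uid, gid) be granted write
// access to a file owned by (file_uid, file_gid) with the given mode bits?
// Simplified model: no supplementary groups, ACLs, or capabilities.
bool can_write(mode_t mode, uid_t file_uid, gid_t file_gid, uid_t uid, gid_t gid) {
  if (uid == 0) return true;                          // root bypasses the mode bits
  if (uid == file_uid) return (mode & S_IWUSR) != 0;  // owner class
  if (gid == file_gid) return (mode & S_IWGRP) != 0;  // group class
  return (mode & S_IWOTH) != 0;                       // everyone else
}
```

With the mode 0755 shown in the listing, only root and the owning user pass this check; a mode such as 0775 would additionally admit members of the owning group.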
[jira] [Commented] (TS-1178) cop will kill manager server, even cop it self
[ https://issues.apache.org/jira/browse/TS-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244911#comment-13244911 ]

Zhao Yongming commented on TS-1178:
---
This issue is related to cluster mode 1, and it is triggered in:

{code}
static int
heartbeat_manager()
{
  int err;

#ifdef TRACE_LOG_COP
  cop_log(COP_DEBUG, "Entering heartbeat_manager()\n");
#endif
  // the CLI, and the rsport if cluster is enabled.
  err = test_mgmt_cli_port();
  if ((0 == err) && (cluster_type != NO_CLUSTER))
    err = test_rs_port();
{code}

Here is the tcpdump on port 8088:

{code}
[root@test58 ~]# tcpdump -n -i lo -s 1500 -vX port 8083 or port 8088
tcpdump: listening on lo, link-type EN10MB (Ethernet), capture size 1500 bytes
09:48:21.876800 IP (tos 0x0, ttl 64, id 13344, offset 0, flags [DF], proto TCP (6), length 60)
    127.0.0.1.46344 > 127.0.0.1.radan-http: Flags [S], cksum 0x640b (correct), seq 2279143422, win 32792, options [mss 16396,sackOK,TS val 321504222 ecr 0,nop,wscale 9], length 0
        0x0000: 4500 003c 3420 4000 4006 089a 7f00 0001  E..4.@.@...
        0x0010: 7f00 0001 b508 1f98 87d8 f7fe
        0x0020: a002 8018 640b 0204 400c 0402 080a  d.@.
        0x0030: 1329 c3de 0103 0309  .)..
09:48:21.876812 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    127.0.0.1.radan-http > 127.0.0.1.46344: Flags [S.], cksum 0x4162 (correct), seq 2271462468, ack 2279143423, win 32768, options [mss 16396,sackOK,TS val 321504222 ecr 321504222,nop,wscale 9], length 0
        0x0000: 4500 003c 4000 4006 3cba 7f00 0001  E@.@..
        0x0010: 7f00 0001 1f98 b508 8763 c444 87d8 f7ff  .c.D
        0x0020: a012 8000 4162 0204 400c 0402 080a  Ab@.
        0x0030: 1329 c3de 1329 c3de 0103 0309  .)...)..
09:48:21.876821 IP (tos 0x0, ttl 64, id 13345, offset 0, flags [DF], proto TCP (6), length 52)
    127.0.0.1.46344 > 127.0.0.1.radan-http: Flags [.], cksum 0x2a48 (correct), ack 1, win 65, options [nop,nop,TS val 321504222 ecr 321504222], length 0
        0x0000: 4500 0034 3421 4000 4006 08a1 7f00 0001  E..44!@.@...
        0x0010: 7f00 0001 b508 1f98 87d8 f7ff 8763 c445  .c.E
        0x0020: 8010 0041 2a48 0101 080a 1329 c3de  ...A*H...)..
        0x0030: 1329 c3de  .)..
09:48:21.876903 IP (tos 0x0, ttl 64, id 13346, offset 0, flags [DF], proto TCP (6), length 85)
    127.0.0.1.46344 > 127.0.0.1.radan-http: Flags [P.], cksum 0xfe49 (incorrect -> 0xdcc6), seq 1:34, ack 1, win 65, options [nop,nop,TS val 321504222 ecr 321504222], length 33
        0x0000: 4500 0055 3422 4000 4006 087f 7f00 0001  E..U4@.@...
        0x0010: 7f00 0001 b508 1f98 87d8 f7ff 8763 c445  .c.E
        0x0020: 8018 0041 fe49 0101 080a 1329 c3de  ...A.I...)..
        0x0030: 1329 c3de 7265 6164 2070 726f 7879 2e63  .)..read.proxy.c
        0x0040: 6f6e 6669 672e 6d61 6e61 6765 725f 6269  onfig.manager_bi
        0x0050: 6e61 7279 0a  nary.
09:48:21.876941 IP (tos 0x0, ttl 64, id 1947, offset 0, flags [DF], proto TCP (6), length 52)
    127.0.0.1.radan-http > 127.0.0.1.46344: Flags [.], cksum 0x2a28 (correct), ack 34, win 64, options [nop,nop,TS val 321504222 ecr 321504222], length 0
        0x0000: 4500 0034 079b 4000 4006 3527 7f00 0001  E..4..@.@.5'
        0x0010: 7f00 0001 1f98 b508 8763 c445 87d8 f820  .c.E
        0x0020: 8010 0040 2a28 0101 080a 1329 c3de  ...@*(...)..
        0x0030: 1329 c3de  .)..
09:48:21.876983 IP (tos 0x0, ttl 64, id 1948, offset 0, flags [DF], proto TCP (6), length 113)
    127.0.0.1.radan-http > 127.0.0.1.46344: Flags [P.], cksum 0xfe65 (incorrect -> 0x0738), seq 1:62, ack 34, win 64, options [nop,nop,TS val 321504222 ecr 321504222], length 61
        0x0000: 4500 0071 079c 4000 4006 34e9 7f00 0001  E..q..@.@.4.
        0x0010: 7f00 0001 1f98 b508 8763 c445 87d8 f820  .c.E
        0x0020: 8018 0040 fe65 0101 080a 1329 c3de  ...@.e...)..
        0x0030: 1329 c3de 0a52 6563 6f72 6420 2770 726f  .)...Record.'pro
        0x0040: 7879 2e63 6f6e 6669 672e 6d61 6e61 6765  xy.config.manage
        0x0050: 725f 6269 6e61 7279 2720 5661 6c3a 2027  r_binary'.Val:.'
        0x0060: 7472 6166 6669 635f 6d61 6e61 6765 7227  traffic_manager'
        0x0070: 0a  .
09:48:21.876990 IP (tos 0x0, ttl 64, id 13347, offset 0, flags [DF], proto TCP (6), length 52)
    127.0.0.1.46344 > 127.0.0.1.radan-http: Flags [.], cksum 0x29ea (correct), ack 62, win 65, options [nop,nop,TS val 321504222 ecr 321504222], length 0
        0x0000: 4500 0034 3423 4000 4006 089f 7f00 0001  E..44#@.@...
        0x0010: 7f00 0001 b508 1f98 87d8 f820 8763 c482
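The payloads visible in the capture are a request `read proxy.config.manager_binary` followed by a reply of the form `Record '<name>' Val: '<value>'`. As a rough illustration of how such a reply could be validated, here is a minimal sketch. It is not traffic_cop's actual parsing code, and `parse_record_reply` is a hypothetical name; only the reply layout is taken from the capture.

```cpp
#include <string>

// Hypothetical helper: extract <value> from a reply shaped like
//   Record '<name>' Val: '<value>'
// Returns true and fills `value` only when the reply names the expected record.
bool parse_record_reply(const std::string &reply, const std::string &name, std::string &value) {
  const std::string head = "Record '" + name + "' Val: '";
  const std::string::size_type start = reply.find(head);
  if (start == std::string::npos) return false;  // wrong record or malformed reply

  const std::string::size_type vbegin = start + head.size();
  const std::string::size_type vend = reply.find('\'', vbegin);
  if (vend == std::string::npos) return false;   // closing quote missing

  value = reply.substr(vbegin, vend - vbegin);
  return true;
}
```

For the reply in the capture, this would return the value traffic_manager; a reply for a different record, or one cut off before the closing quote, is rejected.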
[jira] [Commented] (TS-1178) cop will kill manager server, even cop it self
[ https://issues.apache.org/jira/browse/TS-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243621#comment-13243621 ]

Zhao Yongming commented on TS-1178:
---
There may be two problems hidden here:
1. Why did the cop-manager heartbeat fail with "[WARNING]: (manager test) bad response value"?
2. Why did safe_kill end up killing cop itself?

cop will kill manager server, even cop it self
---
Key: TS-1178
URL: https://issues.apache.org/jira/browse/TS-1178
Project: Traffic Server
Issue Type: Bug
Components: Management
Affects Versions: 3.1.4
Environment: git master on RHEL6.1 x86_64
Reporter: Zhao Yongming
Assignee: Zhao Yongming
Fix For: 3.1.4

{code}
[root@test58 trafficserver]# cat /tmp/traffic_cop.trace
1333239680. [DEBUG]: Entering init()
1333239680. [DEBUG]: Entering init_signals()
1333239680. [DEBUG]: Entering set_alarm_death()
1333239680. [DEBUG]: Leaving set_alarm_death()
1333239680. [DEBUG]: Leaving init_signals()
1333239680. [DEBUG]: Entering init_config_dir()
1333239680. [DEBUG]: Leaving init_config_dir()
1333239680. [DEBUG]: Entering init_config_file()
1333239680. [DEBUG]: Leaving init_config_file()
1333239680. [DEBUG]: Entering init_lockfiles()
1333239680. [DEBUG]: Leaving init_lockfiles()
1333239680. [DEBUG]: Entering check_lockfile()
1333239680. [unknown]: --- Cop Starting [Version: Apache Traffic Server - traffic_cop - 3.1.4-unstable - (build # 310 on Apr 1 2012 at 00:34:30)] ---
1333239680. [DEBUG]: Leaving check_lockfile()
1333239680. [DEBUG]: Leaving init()
1333239680. [DEBUG]: Entering check()
1333239680. [DEBUG]: Entering check_no_run()
1333239680. [DEBUG]: Entering transient_error(2, 500)
1333239680. [DEBUG]: Leaving transient_error(2, 500) --> false
1333239680. [DEBUG]: Leaving check_no_run() --> 0
1333239680. [DEBUG]: Entering read_config()
1333239680. [DEBUG]: Entering build_config_table(33932704)
1333239680. [DEBUG]: Leaving build_config_table(33932704)
1333239680. [DEBUG]: Entering process_syslog_config()
1333239680. [DEBUG]: Leaving process_syslog_config()
1333239680. [DEBUG]: Leaving read_config()
1333239680. [DEBUG]: Entering check_programs()
1333239680. [DEBUG]: Entering heartbeat_manager()
1333239680. [WARNING]: (cli test) unable to retrieve manager_binary
1333239680. [WARNING]: manager heartbeat [variable] failed [1]
1333239680. [DEBUG]: Leaving heartbeat_manager() --> -1
1333239680. [DEBUG]: Entering check_memory()
1333239680. [DEBUG]: Leaving check_memory()
1333239680. [DEBUG]: Entering millisleep(1)
1333239680. [DEBUG]: Leaving millisleep(1)
1333239680. [DEBUG]: Entering check_no_run()
1333239680. [DEBUG]: Entering transient_error(2, 500)
1333239680. [DEBUG]: Leaving transient_error(2, 500) --> false
1333239680. [DEBUG]: Leaving check_no_run() --> 0
1333239680. [DEBUG]: Entering read_config()
1333239680. [DEBUG]: Entering check_programs()
1333239680. [DEBUG]: Entering heartbeat_manager()
1333239680. [DEBUG]: Entering milliseconds()
1333239680. [DEBUG]: Leaving milliseconds()
1333239680. [DEBUG]: Entering open_socket(8088, (null), (null))
1333239680. [DEBUG]: Entering transient_error(115, 500)
1333239680. [DEBUG]: Leaving transient_error(115, 500) --> false
1333239680. [DEBUG]: Leaving open_socket(8088, 127.0.0.1, (null)) --> 8
1333239680. [DEBUG]: Entering milliseconds()
1333239680. [DEBUG]: Leaving milliseconds()
1333239680. [DEBUG]: Entering milliseconds()
1333239680. [DEBUG]: Leaving milliseconds()
1333239680. [DEBUG]: Entering milliseconds()
1333239680. [DEBUG]: Leaving milliseconds()
1333239680. [WARNING]: (manager test) bad response value
1333239680. [WARNING]: manager heartbeat [variable] failed [2]
1333239680. [WARNING]: killing manager
1333239680. [DEBUG]: Entering safe_kill(/var/run/trafficserver/manager.lock, traffic_manager, 1)
1333239680. [DEBUG]: Entering set_alarm_warn()
1333239680. [DEBUG]: Leaving set_alarm_warn()
1333239680. [DEBUG]: Entering set_alarm_death()
1333239680. [DEBUG]: Leaving set_alarm_death()
1333239680. [DEBUG]: Leaving safe_kill(/var/run/trafficserver/manager.lock, traffic_manager, 1)
1333239680. [DEBUG]: Leaving heartbeat_manager() --> -1
1333239680. [DEBUG]: Entering check_memory()
1333239680. [DEBUG]: Leaving check_memory()
1333239680. [DEBUG]: Entering millisleep(1)
1333239680. [DEBUG]: Leaving millisleep(1)
1333239680. [DEBUG]: Entering check_no_run()
1333239680. [DEBUG]: Entering transient_error(2, 500)
1333239680.
[jira] [Commented] (TS-801) Crash Report: enable update will triger Segmentation fault
[ https://issues.apache.org/jira/browse/TS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242983#comment-13242983 ]

Zhao Yongming commented on TS-801:
---
I think that during an update the UA session is cleared from the SM, but in what situation does that happen?

{code}
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2b811f539700 (LWP 3024)]
HttpTransact::process_quick_http_filter (s=0x2b812c3dfb98, method=110) at HttpTransact.cc:6544
6544 if (!IpAllow::CheckMask(s->state_machine->ua_session->acl_method_mask, method)) {
(gdb) bt
#0 HttpTransact::process_quick_http_filter (s=0x2b812c3dfb98, method=110) at HttpTransact.cc:6544
#1 0x00554301 in HttpTransact::EndRemapRequest (s=0x2b812c3dfb98) at HttpTransact.cc:851
#2 0x0052f552 in HttpSM::call_transact_and_set_next_state (this=0x2b812c3dfb30, f=<optimized out>) at HttpSM.cc:6319
#3 0x0053e26a in HttpSM::set_next_state (this=0x2b812c3dfb30) at HttpSM.cc:6377
#4 0x0053e14c in HttpSM::set_next_state (this=0x2b812c3dfb30) at HttpSM.cc:6516
#5 0x0053e14c in HttpSM::set_next_state (this=0x2b812c3dfb30) at HttpSM.cc:6516
#6 0x00539961 in do_api_callout (this=0x2b812c3dfb30) at HttpSM.cc:499
#7 do_api_callout (this=0x2b812c3dfb30) at HttpSM.cc:504
#8 HttpSM::state_add_to_list (this=0x2b812c3dfb30, event=<optimized out>, data=<optimized out>) at HttpSM.cc:527
#9 0x0053a738 in HttpSM::main_handler (this=0x2b812c3dfb30, event=0, data=0x0) at HttpSM.cc:2440
#10 0x0056c267 in handleEvent (data=0x0, event=0, this=0x2b812c3dfb30) at ../../iocore/eventsystem/I_Continuation.h:146
#11 HttpUpdateSM::start_scheduled_update (this=0x2b812c3dfb30, cont=0x2b81201242c0, request=0x1246ab0) at HttpUpdateSM.cc:92
#12 0x004fbbf7 in UpdateSM::http_scheme (sm=0x2b81201242c0) at Update.cc:1567
#13 0x004f7008 in UpdateSM::HandleSMEvent (this=0x2b81201242c0, event=1, e=<optimized out>) at Update.cc:1478
#14 0x006a6380 in handleEvent (data=0x1202570, event=1, this=<optimized out>) at I_Continuation.h:146
#15 EThread::process_event (this=0x2b811f237010, e=0x1202570, calling_code=1) at UnixEThread.cc:142
#16 0x006a6f3b in EThread::execute (this=0x2b811f237010) at UnixEThread.cc:191
#17 0x006a5172 in spawn_thread_internal (a=0x11e27e0) at Thread.cc:88
#18 0x2b811b67ae2c in start_thread () from /lib64/libpthread.so.0
#19 0x2b811df5b3cd in clone () from /lib64/libc.so.6
(gdb) p s->state_machine->ua_session
$1 = (HttpClientSession *) 0x0
(gdb)
{code}

Crash Report: enable update will triger Segmentation fault
---
Key: TS-801
URL: https://issues.apache.org/jira/browse/TS-801
Project: Traffic Server
Issue Type: Bug
Components: HTTP
Affects Versions: 2.1.8
Environment: v2.1.8 and update function enabled.
Reporter: Zhao Yongming
Labels: update
Fix For: 3.1.4

{code}
NOTE: Traffic Server received Sig 11: Segmentation fault
/usr/local/ts/bin/traffic_server - STACK TRACE:
/usr/local/ts/bin/traffic_server[0x5260c9]
/lib64/libpthread.so.0[0x3088e0f4c0]
[0x6e]
/usr/local/ts/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void (*)(HttpTransact::State*))+0x6e)[0x57e0e2]
/usr/local/ts/bin/traffic_server(HttpSM::set_next_state()+0x18b)[0x57e369]
/usr/local/ts/bin/traffic_server(HttpUpdateSM::set_next_state()+0xad)[0x5b604b]
/usr/local/ts/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void (*)(HttpTransact::State*))+0x15e)[0x57e1d2]
/usr/local/ts/bin/traffic_server(HttpSM::handle_api_return()+0x138)[0x56d9aa]
/usr/local/ts/bin/traffic_server(HttpUpdateSM::handle_api_return()+0x47)[0x5b5cc1]
/usr/local/ts/bin/traffic_server(HttpSM::do_api_callout()+0x3f)[0x582cc3]
/usr/local/ts/bin/traffic_server(HttpSM::set_next_state()+0x64)[0x57e242]
/usr/local/ts/bin/traffic_server(HttpUpdateSM::set_next_state()+0xad)[0x5b604b]
/usr/local/ts/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void (*)(HttpTransact::State*))+0x15e)[0x57e1d2]
/usr/local/ts/bin/traffic_server(HttpSM::handle_api_return()+0x138)[0x56d9aa]
/usr/local/ts/bin/traffic_server(HttpUpdateSM::handle_api_return()+0x47)[0x5b5cc1]
/usr/local/ts/bin/traffic_server(HttpSM::do_api_callout()+0x3f)[0x582cc3]
/usr/local/ts/bin/traffic_server(HttpSM::set_next_state()+0x64)[0x57e242]
/usr/local/ts/bin/traffic_server(HttpUpdateSM::set_next_state()+0xad)[0x5b604b]
/usr/local/ts/bin/traffic_server(HttpSM::call_transact_and_set_next_state(void (*)(HttpTransact::State*))+0x15e)[0x57e1d2]
/usr/local/ts/bin/traffic_server(HttpUpdateSM::handle_api_return()+0x36)[0x5b5cb0]
[jira] [Commented] (TS-801) Crash Report: enable update will triger Segmentation fault
[ https://issues.apache.org/jira/browse/TS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243020#comment-13243020 ]

Zhao Yongming commented on TS-801:
---
{code}
Program terminated with signal 11, Segmentation fault.
#0 HttpTransact::process_quick_http_filter (s=0x2b65bc2a5b98, method=110) at HttpTransact.cc:6544
6544 if (!IpAllow::CheckMask(s->state_machine->ua_session->acl_method_mask, method)) {
(gdb) bt
#0 HttpTransact::process_quick_http_filter (s=0x2b65bc2a5b98, method=110) at HttpTransact.cc:6544
#1 0x00554301 in HttpTransact::EndRemapRequest (s=0x2b65bc2a5b98) at HttpTransact.cc:851
#2 0x0052f552 in HttpSM::call_transact_and_set_next_state (this=0x2b65bc2a5b30, f=<optimized out>) at HttpSM.cc:6319
#3 0x0053e26a in HttpSM::set_next_state (this=0x2b65bc2a5b30) at HttpSM.cc:6377
#4 0x0053e14c in HttpSM::set_next_state (this=0x2b65bc2a5b30) at HttpSM.cc:6516
#5 0x0053e14c in HttpSM::set_next_state (this=0x2b65bc2a5b30) at HttpSM.cc:6516
#6 0x00539961 in do_api_callout (this=0x2b65bc2a5b30) at HttpSM.cc:499
#7 do_api_callout (this=0x2b65bc2a5b30) at HttpSM.cc:504
#8 HttpSM::state_add_to_list (this=0x2b65bc2a5b30, event=<optimized out>, data=<optimized out>) at HttpSM.cc:527
#9 0x0053a738 in HttpSM::main_handler (this=0x2b65bc2a5b30, event=0, data=0x0) at HttpSM.cc:2440
#10 0x0056c267 in handleEvent (data=0x0, event=0, this=0x2b65bc2a5b30) at ../../iocore/eventsystem/I_Continuation.h:146
#11 HttpUpdateSM::start_scheduled_update (this=0x2b65bc2a5b30, cont=0x13e6880, request=0x2b65b0120e20) at HttpUpdateSM.cc:92
#12 0x004fbbf7 in UpdateSM::http_scheme (sm=0x13e6880) at Update.cc:1567
#13 0x004f7008 in UpdateSM::HandleSMEvent (this=0x13e6880, event=1, e=<optimized out>) at Update.cc:1478
#14 0x006a6380 in handleEvent (data=0x13ae130, event=1, this=<optimized out>) at I_Continuation.h:146
#15 EThread::process_event (this=0x2b65ad7c8010, e=0x13ae130, calling_code=1) at UnixEThread.cc:142
#16 0x006a6f3b in EThread::execute (this=0x2b65ad7c8010) at UnixEThread.cc:191
#17 0x0048b946 in main (argc=<optimized out>, argv=<optimized out>) at Main.cc:1841
(gdb) i threads
  Id Target Id Frame
  21 Thread 0x2b65b6de5700 (LWP 3286) 0x2b65ac6eea53 in epoll_wait () from /lib64/libc.so.6
  20 Thread 0x2b65adccc700 (LWP 3265) 0x2b65ac6eea53 in epoll_wait () from /lib64/libc.so.6
  19 Thread 0x2b65b61d9700 (LWP 3272) 0x2b65a9e120fb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  18 Thread 0x2b65b7710700 (LWP 3346) 0x2b65a9e120fb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  17 Thread 0x2b65b760f700 (LWP 3345) 0x2b65a9e120fb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  16 Thread 0x2b65b7912700 (LWP 3347) 0x2b65a9e14f3d in accept () from /lib64/libpthread.so.0
  15 Thread 0x2b65b730c700 (LWP 3344) 0x2b65a9e14f3d in accept () from /lib64/libpthread.so.0
  14 Thread 0x2b65b7088700 (LWP 3341) 0x2b65a9e11d7c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  13 Thread 0x2b65b6be3700 (LWP 3280) 0x2b65a9e120fb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  12 Thread 0x2b65b69e1700 (LWP 3278) 0x2b65a9e120fb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  11 Thread 0x2b65b67df700 (LWP 3277) 0x2b65a9e120fb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  10 Thread 0x2b65b65dd700 (LWP 3276) 0x2b65a9e120fb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  9 Thread 0x2b65b63db700 (LWP 3275) 0x2b65a9e120fb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  8 Thread 0x2b65b5fd7700 (LWP 3270) 0x2b65a9e120fb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  7 Thread 0x2b65b5dd5700 (LWP 3269) 0x2b65a9e120fb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6 Thread 0x2b65ae2d2700 (LWP 3268) 0x2b65ac6bdbfd in nanosleep () from /lib64/libc.so.6
  5 Thread 0x2b65ae0d0700 (LWP 3267) 0x2b65ac6bdbfd in nanosleep () from /lib64/libc.so.6
  4 Thread 0x2b65adece700 (LWP 3266) 0x2b65ac6bdbfd in nanosleep () from /lib64/libc.so.6
  3 Thread 0x2b65adbcb700 (LWP 3264) 0x2b65ac6eea53 in epoll_wait () from /lib64/libc.so.6
  2 Thread 0x2b65ad066700 (LWP 3261) 0x2b65ac6bdbfd in nanosleep () from /lib64/libc.so.6
* 1 Thread 0x2b65acdfb340 (LWP 3260) HttpTransact::process_quick_http_filter (s=0x2b65bc2a5b98, method=110) at HttpTransact.cc:6544
(gdb) bt
#0 HttpTransact::process_quick_http_filter (s=0x2b65bc2a5b98, method=110) at HttpTransact.cc:6544
#1 0x00554301 in HttpTransact::EndRemapRequest (s=0x2b65bc2a5b98) at HttpTransact.cc:851
#2
[jira] [Commented] (TS-1078) trafficserver-3.1.1-unstable.tar.bz2 core dumps during load test
[ https://issues.apache.org/jira/browse/TS-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240420#comment-13240420 ]

Zhao Yongming commented on TS-1078:
---
I think we can consider all crashes related to do_io_close fixed. Please confirm by testing git master and report back.

trafficserver-3.1.1-unstable.tar.bz2 core dumps during load test
---
Key: TS-1078
URL: https://issues.apache.org/jira/browse/TS-1078
Project: Traffic Server
Issue Type: Bug
Components: HTTP
Affects Versions: 3.1.1
Environment: Redhat linux with no plugins (although the stack trace shows our plugin being called). The tests were run with no plugin and exactly the same stack trace occurred.
Reporter: Alistair Stevenson
Assignee: weijin
Priority: Blocker
Fix For: 3.1.4

{code}
(gdb) bt
#0 ink_restore_signal_handler_frame (stack=0x7f2a67650490, len=<value optimized out>, signalhandler_frame=2) at ink_stack_trace.cc:68
#1 ink_stack_trace_get (stack=0x7f2a67650490, len=<value optimized out>, signalhandler_frame=2) at ink_stack_trace.cc:89
#2 0x7f2a682dcf7d in ink_stack_trace_dump (sighandler_frame=2) at ink_stack_trace.cc:114
#3 0x004d2512 in signal_handler (sig=11) at signals.cc:222
#4 <signal handler called>
#5 ink_atomiclist_push (l=0x100bb0, item=0x72a9100) at ink_queue.cc:457
#6 0x0065800b in ProtectedQueue::enqueue (this=0x100bb0, e=<value optimized out>, fast_signal=false) at ProtectedQueue.cc:53
#7 0x0062c3ca in NetVConnection::Handle::do_locked_io_close (this=0x6e01830, lerrno=-1) at NetVConnection.cc:93
#8 0x00507608 in HttpServerSession::do_io_close (this=0x6e01770, alerrno=<value optimized out>) at HttpServerSession.cc:122
#9 0x0050a78b in HttpSessionManager::acquire_session (this=0x941940, cont=<value optimized out>, ip=0xa5443c0, hostname=0x5732c19 "10.20.48.15", ua_session=<value optimized out>, sm=0xa543d20) at HttpSessionManager.cc:257
#10 0x0051d544 in HttpSM::do_http_server_open (this=0xa543d20, raw=false) at HttpSM.cc:4139
#11 0x0051e0e8 in HttpSM::set_next_state (this=0xa543d20) at HttpSM.cc:6464
#12 0x0050b8ff in HttpSM::call_transact_and_set_next_state (this=0xa543d20, f=<value optimized out>) at HttpSM.cc:6319
#13 0x005162d8 in HttpSM::state_api_callout (this=0xa543d20, event=6, data=0x0) at HttpSM.cc:1446
#14 0x00518499 in HttpSM::state_api_callback (this=0xa543d20, event=6, data=0x0) at HttpSM.cc:1265
#15 0x004a9af8 in TSHttpTxnReenable (txnp=<value optimized out>, event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5407
#16 0x7f2a63a3d45b in opwvPluginHandler (contp=0x7631780, event=TS_EVENT_HTTP_CACHE_LOOKUP_COMPLETE, edata=0xa543d20) at /bfs-build/build-area.93/builds/LinuxNBngp_andes/2012-01-09-0057/http/src/traffic_server/plugin.cpp:82
#17 0x00516585 in HttpSM::state_api_callout (this=0xa543d20, event=<value optimized out>, data=<value optimized out>) at HttpSM.cc:1372
#18 0x0051de1a in HttpSM::set_next_state (this=0xa543d20) at HttpSM.cc:6353
#19 0x0050b8ff in HttpSM::call_transact_and_set_next_state (this=0xa543d20, f=<value optimized out>) at HttpSM.cc:6319
#20 0x005162d8 in HttpSM::state_api_callout (this=0xa543d20, event=6, data=0x0) at HttpSM.cc:1446
#21 0x00518499 in HttpSM::state_api_callback (this=0xa543d20, event=6, data=0x0) at HttpSM.cc:1265
#22 0x004a9af8 in TSHttpTxnReenable (txnp=<value optimized out>, event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5407
#23 0x7f2a63a3d6cc in opwvPluginHandler (contp=0x7631780, event=TS_EVENT_HTTP_OS_DNS, edata=0xa543d20) at /bfs-build/build-area.93/builds/LinuxNBngp_andes/2012-01-09-0057/http/src/traffic_server/plugin.cpp:117
#24 0x00516585 in HttpSM::state_api_callout (this=0xa543d20, event=<value optimized out>, data=<value optimized out>) at HttpSM.cc:1372
#25 0x0051de1a in HttpSM::set_next_state (this=0xa543d20) at HttpSM.cc:6353
#26 0x0050b8ff in HttpSM::call_transact_and_set_next_state (this=0xa543d20, f=<value optimized out>) at HttpSM.cc:6319
#27 0x0050d58a in HttpSM::do_hostdb_lookup (this=0xa543d20) at HttpSM.cc:3749
#28 0x0051e72b in HttpSM::set_next_state (this=0xa543d20) at HttpSM.cc:6414
#29 0x0050b8ff in HttpSM::call_transact_and_set_next_state (this=0xa543d20, f=<value optimized out>) at HttpSM.cc:6319
#30 0x005162d8 in HttpSM::state_api_callout (this=0xa543d20, event=6, data=0x0) at HttpSM.cc:1446
#31 0x00518499 in HttpSM::state_api_callback (this=0xa543d20, event=6, data=0x0) at HttpSM.cc:1265
#32 0x004a9af8 in TSHttpTxnReenable (txnp=<value optimized out>,
[jira] [Commented] (TS-1114) Crash report: HttpTransactCache::SelectFromAlternates
[ https://issues.apache.org/jira/browse/TS-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232196#comment-13232196 ]

Zhao Yongming commented on TS-1114:
-----------------------------------

While tracking down this issue, we followed two directions.

Weijin investigated why the event is 8, since no event in the event system should have that value. In other core dumps we are also sure that the event is not a real event; it looks like random data. That points to something really interesting:
1, the old data (which may or may not be the same event) was freed, but the event was not canceled.
2, someone overwrote the data in this event.
Weijin tracked it down this way, and it turns out that the action cancel code may cause problems in certain situations. He made a patch in our tree and we applied it to half of our servers; they have run without any crash for weeks.

At the same time, Koutai worked on making the vector write and read safer, even in some very strange situations. That patch was applied to half of our servers, and they have run without any crash too.

After careful discussion, we concluded that Weijin's patch is the one we need to keep, and here comes the patch.

Back to TS-857: looking back at it, there is a strange event in the backtrace as well, we have only , is that the same issue here? Where is the action canceled without mutex protection? If we consider TS-1114 a good fix, then we should treat TS-857 as the same crash.

So far I am not sure how many crashes remain after patching with TS-1114; I just don't get many new backtraces for this issue. TS-1114 may have covered many strange crashes, because the bug makes the system behave really strangely.
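The "freed but not canceled" hazard described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchEvent`, `cancel()` and `dispatch()` are hypothetical names, not the Traffic Server Event or Action classes, and this is not the attached patch. It shows the general idea that the cancel flag and the handler call are guarded by the same mutex, so the event loop never touches data the owner has already released.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a scheduled event; NOT Traffic Server's Event or Action class.
struct SketchEvent {
    std::mutex mutex;          // guards `cancelled`, `payload` and the handler call
    bool cancelled = false;    // set by the owner before it frees its data
    int *payload = nullptr;    // data owned by someone else
    int handled = 0;           // number of times the handler actually ran

    // Owner side: cancel under the mutex; only after this is it safe to free the payload.
    void cancel() {
        std::lock_guard<std::mutex> lock(mutex);
        cancelled = true;
        payload = nullptr;
    }

    // Event loop side: check the flag under the same mutex before touching the payload.
    bool dispatch() {
        std::lock_guard<std::mutex> lock(mutex);
        if (cancelled || payload == nullptr) {
            return false;      // skip: the payload may already be gone
        }
        *payload += 1;
        ++handled;
        return true;
    }
};
```

If `cancel()` skipped the lock, `dispatch()` could observe a stale flag and write through a dangling pointer, which would look like the random event data seen in the core dumps.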
Crash report: HttpTransactCache::SelectFromAlternates
-----------------------------------------------------

Key: TS-1114
URL: https://issues.apache.org/jira/browse/TS-1114
Project: Traffic Server
Issue Type: Bug
Reporter: Zhao Yongming
Assignee: weijin
Fix For: 3.1.4
Attachments: cache_crash.diff

it may or may not be the upstream issue, let us open it for tracking.
{code}
#0 0x0053075e in HttpTransactCache::SelectFromAlternates (cache_vector=0x2aaab80ff500, client_request=0x2aaab80ff4c0, http_config_params=0x2aaab547b800) at ../../proxy/hdrs/HTTP.h:1375
1375 ((int32_t *) val)[0] = m_alt->m_object_key[0];
{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (TS-1114) Crash report: HttpTransactCache::SelectFromAlternates
[ https://issues.apache.org/jira/browse/TS-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232164#comment-13232164 ]

Zhao Yongming commented on TS-1114:
-----------------------------------

Yeah, we are confident that we have fixed the crash, and we need your review; that is what we are waiting for :D

Crash report: HttpTransactCache::SelectFromAlternates
-----------------------------------------------------

Key: TS-1114
URL: https://issues.apache.org/jira/browse/TS-1114
Project: Traffic Server
Issue Type: Bug
Reporter: Zhao Yongming
Assignee: weijin
Fix For: 3.1.4
Attachments: cache_crash.diff

it may or may not be the upstream issue, let us open it for tracking.
{code}
#0 0x0053075e in HttpTransactCache::SelectFromAlternates (cache_vector=0x2aaab80ff500, client_request=0x2aaab80ff4c0, http_config_params=0x2aaab547b800) at ../../proxy/hdrs/HTTP.h:1375
1375 ((int32_t *) val)[0] = m_alt->m_object_key[0];
{code}
[jira] [Commented] (TS-857) Crash Report: HttpTunnel::chain_abort_all - HttpServerSession::do_io_close - UnixNetVConnection::do_io_close
[ https://issues.apache.org/jira/browse/TS-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227664#comment-13227664 ]

Zhao Yongming commented on TS-857:
----------------------------------

We have the code committed in our tree at: https://gitorious.org/trafficserver/taobao/commit/c0fdec56bacea139a026412038378d7a8f65a731

Crash Report: HttpTunnel::chain_abort_all - HttpServerSession::do_io_close - UnixNetVConnection::do_io_close
------------------------------------------------------------------------------------------------------------

Key: TS-857
URL: https://issues.apache.org/jira/browse/TS-857
Project: Traffic Server
Issue Type: Bug
Components: HTTP, Network
Affects Versions: 3.1.0
Environment: in my branch that is something same as 3.0.x
Reporter: Zhao Yongming
Assignee: weijin
Fix For: 3.1.5
Attachments: ts-857.diff, ts-857.diff, ts-857.diff

Here is the backtrace from the crash. Some of the information is missing because we did not build with the --enable-debug configure option.
{code}
[New process 7532]
#0 ink_stack_trace_get (stack=<value optimized out>, len=<value optimized out>, signalhandler_frame=<value optimized out>) at ink_stack_trace.cc:68
68 fp = (void **) (*fp);
(gdb) bt
#0 ink_stack_trace_get (stack=<value optimized out>, len=<value optimized out>, signalhandler_frame=<value optimized out>) at ink_stack_trace.cc:68
#1 0x2ba641dccef1 in ink_stack_trace_dump (sighandler_frame=<value optimized out>) at ink_stack_trace.cc:114
#2 0x004df020 in signal_handler (sig=<value optimized out>) at signals.cc:225
#3 <signal handler called>
#4 0x006a1ea9 in UnixNetVConnection::do_io_close (this=0x1cc9bd20, alerrno=<value optimized out>) at ../../iocore/eventsystem/I_Lock.h:297
#5 0x0051f1d0 in HttpServerSession::do_io_close (this=0x2aaab0042c80, alerrno=20600) at HttpServerSession.cc:127
#6 0x0056d1e9 in HttpTunnel::chain_abort_all (this=0x2aabeeffdd70, p=0x2aabeeffdf68) at HttpTunnel.cc:1300
#7 0x005269ca in HttpSM::tunnel_handler_ua (this=0x2aabeeffc070, event=104, c=0x2aabeeffdda8) at HttpSM.cc:2987
#8 0x00571dfc in HttpTunnel::consumer_handler (this=0x2aabeeffdd70, event=104, c=0x2aabeeffdda8) at HttpTunnel.cc:1232
#9 0x00572032 in HttpTunnel::main_handler (this=0x2aabeeffdd70, event=1088608784, data=<value optimized out>) at HttpTunnel.cc:1456
#10 0x006a6307 in write_to_net_io (nh=0x2b12d688, vc=0x1cc876e0, thread=<value optimized out>) at ../../iocore/eventsystem/I_Continuation.h:146
#11 0x0069ce97 in NetHandler::mainNetEvent (this=0x2b12d688, event=<value optimized out>, e=0x171c1ed0) at UnixNet.cc:405
#12 0x006cddaf in EThread::process_event (this=0x2b12c010, e=0x171c1ed0, calling_code=5) at I_Continuation.h:146
#13 0x006ce6bc in EThread::execute (this=0x2b12c010) at UnixEThread.cc:262
#14 0x006cd0ee in spawn_thread_internal (a=0x171b58f0) at Thread.cc:88
#15 0x003c33c064a7 in start_thread () from /lib64/libpthread.so.0
#16 0x003c330d3c2d in clone () from /lib64/libc.so.6
(gdb) info f
Stack level 0, frame at 0x40e2b790:
 rip = 0x2ba641dccdf3 in ink_stack_trace_get(void**, int, int) (ink_stack_trace.cc:68); saved rip 0x2ba641dccef1
 called by frame at 0x40e2bbe0
 source language c++.
 Arglist at 0x40e2b770, args: stack=<value optimized out>, len=<value optimized out>, signalhandler_frame=<value optimized out>
 Locals at 0x40e2b770, Previous frame's sp is 0x40e2b790
 Saved registers:
  rbx at 0x40e2b778, rbp at 0x40e2b780, rip at 0x40e2b788
(gdb)
{code}
[jira] [Commented] (TS-1080) Assert under heavy load with logging enabled
[ https://issues.apache.org/jira/browse/TS-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221534#comment-13221534 ]

Zhao Yongming commented on TS-1080:
-----------------------------------

Hmm, in this case we have filled up the accepting queue with the 512*4 buffers before the flush thread has flushed the flush queue. If we don't increase FLUSH_ARRAY_SIZE, we have 2 options:
1, drop the buffer
this may be a good solution for me, as it is better than crashing.
2, speed up the flush thread
well, this takes us in another, more complex direction: can we add more flush threads? Or how can we flush at a higher speed while the IO is limited?

Assert under heavy load with logging enabled
--------------------------------------------

Key: TS-1080
URL: https://issues.apache.org/jira/browse/TS-1080
Project: Traffic Server
Issue Type: Bug
Components: Logging
Reporter: Leif Hedstrom
Priority: Critical
Fix For: 3.1.4

Given enough load (100,000 QPS or more), we run out of some sort of buffer space, with an assert of
{code}
#0 0x2ba719d50285 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x2ba719d51b9b in __GI_abort () at abort.c:91
#2 0x006b561a in ink_die_die_die (retval=<optimized out>) at ink_error.cc:43
#3 ink_fatal_va(int, const char *, typedef __va_list_tag __va_list_tag *) (return_code=<optimized out>, message_format=<optimized out>, ap=0x7fff7275a7d8) at ink_error.cc:65
#4 0x006b56a7 in ink_fatal (return_code=<optimized out>, message_format=<optimized out>) at ink_error.cc:73
#5 0x006b4970 in _ink_assert (a=0x6fd380 "_num_flush_buffers[_open_flush_array] < FLUSH_ARRAY_SIZE", f=<optimized out>, l=96) at ink_assert.cc:44
#6 0x005a8b34 in add_to_flush_queue (buffer=0x2ba7443ca970, this=0x22fb918) at LogObject.h:96
#7 LogObject::_checkout_write (this=0x22fb880, write_offset=0x7fff7275add8, bytes_needed=152) at LogObject.cc:455
#8 0x005a8fd3 in LogObject::log (this=0x22fb880, lad=0x7fff7275b030, text_entry=0x0) at LogObject.cc:580
#9 0x0058e956 in log (lad=0x7fff7275b030, this=<optimized out>) at LogObject.h:475
#10 Log::access (lad=0x7fff7275b030) at Log.cc:1086
{code}
Increasing FLUSH_ARRAY_SIZE alleviates the problem, but really, we shouldn't end up in this situation at all.
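Option 1 from the comment above (drop the buffer instead of asserting when the flush queue is full) could look roughly like the sketch below. It assumes a hypothetical `BoundedFlushQueue` class with made-up method names; it is not the `LogObject` flush array from Traffic Server, and the string payload merely stands in for a log buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical bounded queue; names are illustrative, not Traffic Server's logging API.
class BoundedFlushQueue {
public:
    explicit BoundedFlushQueue(std::size_t capacity) : capacity_(capacity) {}

    // Producer side: when the queue is full, count a drop and return false
    // instead of asserting and taking the whole process down.
    bool try_push(std::string buffer) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.size() >= capacity_) {
            ++dropped_;
            return false;
        }
        queue_.push_back(std::move(buffer));
        return true;
    }

    // Flush thread side: take everything queued so far in one swap.
    std::deque<std::string> drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::deque<std::string> out;
        out.swap(queue_);
        return out;
    }

    // Number of buffers discarded because the queue was full.
    std::size_t dropped() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return dropped_;
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::deque<std::string> queue_;
    std::size_t dropped_ = 0;
};
```

Exposing the drop counter keeps lost log entries visible to the operator, which is the trade-off that makes dropping preferable to crashing.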
[jira] [Commented] (TS-968) During/after daily logfile roll, trafficserver seg faults (Sig 11)
[ https://issues.apache.org/jira/browse/TS-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13220136#comment-13220136 ] Zhao Yongming commented on TS-968: -- is this issue still valid for now? can you test if it still valid with current stable release and the current git master tree? During/after daily logfile roll, trafficserver seg faults (Sig 11) -- Key: TS-968 URL: https://issues.apache.org/jira/browse/TS-968 Project: Traffic Server Issue Type: Bug Components: Logging Affects Versions: 3.0.1 Environment: Ubuntu 10.10, 2.6.35-24-virtual Reporter: Drew Rothstein Fix For: 3.1.4 Every day at 00:00:00 after/during the log file roll, I see a segfault. Here are the past couple days: [Sep 22 00:00:00.000] Server {47590596851456} STATUS: The logfile /usr/local/var/log/trafficserver/error.log was rolled to /usr/local/var/log/trafficserver/error.log_trafficserver02.20110921.00h00m06s-20110922.00h00m00s.old. [Sep 22 00:00:00.000] Server {47590596851456} STATUS: The logfile /usr/local/var/log/trafficserver/squid.log was rolled to /usr/local/var/log/trafficserver/squid.log_trafficserver02.20110921.00h00m01s-20110922.00h00m00s.old. [Sep 22 00:00:00.000] Server {47590596851456} STATUS: The logfile /usr/local/var/log/trafficserver/extended2.log was rolled to /usr/local/var/log/trafficserver/extended2.log_trafficserver02.20110921.00h00m01s-20110922.00h00m00s.old. NOTE: Traffic Server received Sig 11: Segmentation fault /usr/local/bin/traffic_server - STACK TRACE: [Sep 22 00:00:17.729] Manager {140722071643936} FATAL: [LocalManager::pollMgmtProcessServer] Error in read (errno: 104) [Sep 22 00:00:17.729] Manager {140722071643936} FATAL: (last system error 104: Connection reset by peer) [Sep 22 00:00:17.730] Manager {140722071643936} NOTE: [LocalManager::mgmtShutdown] Executing shutdown request. [Sep 22 00:00:17.730] Manager {140722071643936} NOTE: [LocalManager::processShutdown] Executing process shutdown request. 
[Sep 22 00:00:17.730] Manager {140722071643936} ERROR: [LocalManager::sendMgmtMsgToProcesses] Error writing message [Sep 22 00:00:17.730] Manager {140722071643936} ERROR: (last system error 32: Broken pipe) [E. Mgmt] log == [TrafficManager] using root directory '/usr/local' [Sep 22 00:00:17.786] {140131209512736} STATUS: opened /usr/local/var/log/trafficserver/manager.log [Sep 22 00:00:17.786] {140131209512736} NOTE: updated diags config [Sep 22 00:00:17.805] Manager {140131209512736} NOTE: [ClusterCom::ClusterCom] Node running on OS: 'Linux' Release: '2.6.35-24-virtual' [Sep 22 00:00:17.805] Manager {140131209512736} NOTE: [LocalManager::listenForProxy] Listening on port: 80 [Sep 22 00:00:17.805] Manager {140131209512736} NOTE: [TrafficManager] Setup complete [Sep 22 00:00:18.827] Manager {140131209512736} NOTE: [LocalManager::startProxy] Launching ts process [TrafficServer] using root directory '/usr/local' [Sep 22 00:00:18.849] Manager {140131209512736} NOTE: [LocalManager::pollMgmtProcessServer] New process connecting fd '13' [Sep 22 00:00:18.849] Manager {140131209512736} NOTE: [Alarms::signalAlarm] Server Process born [Sep 22 00:00:19.874] {47510015031936} STATUS: opened /usr/local/var/log/trafficserver/diags.log [Sep 22 00:00:19.874] {47510015031936} NOTE: updated diags config [Sep 22 00:00:19.879] Server {47510015031936} NOTE: cache clustering disabled [Sep 22 00:00:19.908] Server {47510015031936} NOTE: cache clustering disabled [Sep 22 00:00:20.019] Server {47510015031936} NOTE: logging initialized[7], logging_mode = 3 [Sep 22 00:00:20.032] Server {47510015031936} NOTE: traffic server running [Sep 22 00:00:20.045] Server {47510028859136} NOTE: cache enabled [Sep 23 00:00:00.000] Server {47409990321920} STATUS: The logfile /usr/local/var/log/trafficserver/error.log was rolled to /usr/local/var/log/trafficserver/error.log_trafficserver02.20110922.00h00m11s-20110923.00h00m00s.old. 
[Sep 23 00:00:00.000] Server {47409990321920} STATUS: The logfile /usr/local/var/log/trafficserver/squid.log was rolled to /usr/local/var/log/trafficserver/squid.log_trafficserver02.20110922.00h00m06s-20110923.00h00m00s.old. [Sep 23 00:00:00.000] Server {47409990321920} STATUS: The logfile /usr/local/var/log/trafficserver/extended2.log was rolled to /usr/local/var/log/trafficserver/extended2.log_trafficserver02.20110922.00h00m06s-20110923.00h00m00s.old. NOTE: Traffic Server received Sig 11: Segmentation fault /usr/local/bin/traffic_server - STACK TRACE: [Sep 23 00:00:14.668] Manager {140131209512736} FATAL: [LocalManager::pollMgmtProcessServer] Error in read (errno: 104) [Sep 23 00:00:14.668] Manager {140131209512736} FATAL: (last system error 104:
[jira] [Commented] (TS-1051) Updating logs_xml.config requires full restart
[ https://issues.apache.org/jira/browse/TS-1051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13220145#comment-13220145 ]

Zhao Yongming commented on TS-1051:
-----------------------------------

I don't think that will happen in the current master tree or the stable releases. Please check the result of:
traffic_line -r proxy.config.log.custom_logs_enabled
It must be set to 1 to enable logs_xml.config.

Updating logs_xml.config requires full restart
----------------------------------------------

Key: TS-1051
URL: https://issues.apache.org/jira/browse/TS-1051
Project: Traffic Server
Issue Type: Bug
Components: Logging
Affects Versions: 3.1.2
Reporter: Billy Vierra
Assignee: Leif Hedstrom
Fix For: 3.1.4

Using SVN Rev: 1214051
URL: http://svn.apache.org/repos/asf/trafficserver/traffic/trunk
Upon adding a new LogObject and doing traffic_line -x, you get the following in traffic.out:
[Dec 14 12:31:48.533] Manager {0x7f2f2abef700} NOTE: User has changed config file logs_xml.config
However, it does not go into effect (the new log is not created).
Upon a full restart (trafficserver stop, trafficserver start) it will add the new log file as expected.
Not sure if it is a bug with docs or in code.
[jira] [Commented] (TS-1002) log unmapped HOST when pristine_host_hdr disabled
[ https://issues.apache.org/jira/browse/TS-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13220150#comment-13220150 ]

Zhao Yongming commented on TS-1002:
-----------------------------------

I think this is fixed in the git master?
{code}
zymtest1 trafficserver # tail -n 100 taobaomiss.log
00:40:39 127.0.0.1 fe80::746d:74ff:febf:ebaa 428 200 TCP_MISS http://cdn.zymlinux.net/trafficserver/0; 411 402 0 text/plain http://zymlinux.net/trafficserver/0 -
00:40:39 127.0.0.1 fe80::746d:74ff:febf:ebaa 475 200 TCP_MISS http://cdn.zymlinux.net/trafficserver/ts75.png; 418 404 9520 image/png http://zymlinux.net/trafficserver/ts75.png -
00:42:42 127.0.0.1 fe80::746d:74ff:febf:ebaa 54 200 TCP_REFRESH_HIT http://cdn.zymlinux.net/trafficserver/ts75.png; 418 404 9520 image/png http://zymlinux.net/trafficserver/ts75.png -
00:42:42 127.0.0.1 fe80::746d:74ff:febf:ebaa 53 200 TCP_MISS http://cdn.zymlinux.net/trafficserver/0; 411 402 0 text/plain http://zymlinux.net/trafficserver/0 -
zymtest1 trafficserver # traffic_line -r proxy.config.url_remap.pristine_host_hdr
0
zymtest1 trafficserver # grep cdn.zymlinux.net /etc/trafficserver/remap.config
map http://cdn.zymlinux.net/ http://zymlinux.net
zymtest1 trafficserver #
{code}

log unmapped HOST when pristine_host_hdr disabled
-------------------------------------------------

Key: TS-1002
URL: https://issues.apache.org/jira/browse/TS-1002
Project: Traffic Server
Issue Type: Wish
Components: Logging
Reporter: Conan Wang
Priority: Minor
Fix For: 3.1.5

I want to log the Host header of the user's request as it was before remap. I write logs_xml.config like:
%{Host}cqh
When proxy.config.url_remap.pristine_host_hdr is enabled, I get the right Host, which is not rewritten.
When the config is disabled, I always get the rewritten/mapped Host, which is not what I need.
logs_xml reference: http://trafficserver.apache.org/docs/v2/admin/logfmts.htm#66912
[jira] [Commented] (TS-1114) Crash report: HttpTransactCache::SelectFromAlternates
[ https://issues.apache.org/jira/browse/TS-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13210040#comment-13210040 ]

Zhao Yongming commented on TS-1114:
-----------------------------------

{code}
(gdb) f 1
#1 0x00644387 in CacheVC::openReadChooseWriter (this=0x2aaab80ff400, event=8, e=<value optimized out>) at CacheRead.cc:341
(gdb) p vector
$19 = {magic = 0x0, data = {data = 0x2aaabcc8bc78, fast_data = {{alternate = {m_alt = 0x0}}, {alternate = {m_alt = 0x0}}, {alternate = {
 m_alt = 0x0}}, {alternate = {m_alt = 0x0}}}, default_val = 0xe85a58, size = 8, pos = 7}, xcount = 8, vector_buf = {m_ptr = 0x0}}
(gdb)
{code}

Crash report: HttpTransactCache::SelectFromAlternates
-----------------------------------------------------

Key: TS-1114
URL: https://issues.apache.org/jira/browse/TS-1114
Project: Traffic Server
Issue Type: Bug
Reporter: Zhao Yongming

it may or may not be the upstream issue, let us open it for tracking.
{code}
#0 0x0053075e in HttpTransactCache::SelectFromAlternates (cache_vector=0x2aaab80ff500, client_request=0x2aaab80ff4c0, http_config_params=0x2aaab547b800) at ../../proxy/hdrs/HTTP.h:1375
1375 ((int32_t *) val)[0] = m_alt->m_object_key[0];
{code}
[jira] [Commented] (TS-1006) memory management, cut down memory waste ?
[ https://issues.apache.org/jira/browse/TS-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175301#comment-13175301 ]

Zhao Yongming commented on TS-1006:
-----------------------------------

The memory waste in TS is mostly caused by the RAM cache:
1, it is counted by the memory actually in use.
2, it keeps whole blocks of memory from being freed back to the OS, the old glibc memory management issue.
3, the RAM cache uses the cache reader buffers, and those buffers are allocated from anywhere in the whole memory address space.
All of this makes TS use much more memory than the RAM cache is configured for. It will eat all your memory in the end.

What we want to do:
1, limit RAM cache memory by allocator
2, bound the RAM LRU list and the freelist
3, split the freelist by size
4, split RAM cache memory and cache memory
The code will be ready in days.

memory management, cut down memory waste ?
------------------------------------------

Key: TS-1006
URL: https://issues.apache.org/jira/browse/TS-1006
Project: Traffic Server
Issue Type: Improvement
Components: Core
Affects Versions: 3.1.1
Reporter: Zhao Yongming
Assignee: Zhao Yongming
Fix For: 3.1.3
Attachments: memusage.ods, memusage.ods

When we reviewed the memory usage in production, something looked abnormal: TS seems to take much more memory than index data + common system waste. Here is a memory dump obtained by setting proxy.config.dump_mem_info_frequency to 1, taken on a not so busy forwarding system:
physical memory: 32G
RAM cache: 22G
DISK: 6140 GB
average_object_size 64000
{code}
     allocated |       in-use | type size | free list name
---------------|--------------|-----------|----------------------------------
     671088640 |     37748736 |   2097152 | memory/ioBufAllocator[14]
    2248146944 |   2135949312 |   1048576 | memory/ioBufAllocator[13]
    1711276032 |   1705508864 |    524288 | memory/ioBufAllocator[12]
    1669332992 |   1667760128 |    262144 | memory/ioBufAllocator[11]
    2214592512 |       221184 |    131072 | memory/ioBufAllocator[10]
    2325741568 |   2323775488 |     65536 | memory/ioBufAllocator[9]
    2091909120 |   2089123840 |     32768 | memory/ioBufAllocator[8]
    1956642816 |   1956478976 |     16384 | memory/ioBufAllocator[7]
    2094530560 |   2094071808 |      8192 | memory/ioBufAllocator[6]
     356515840 |    355540992 |      4096 | memory/ioBufAllocator[5]
       1048576 |        14336 |      2048 | memory/ioBufAllocator[4]
        131072 |            0 |      1024 | memory/ioBufAllocator[3]
         65536 |            0 |       512 | memory/ioBufAllocator[2]
         32768 |            0 |       256 | memory/ioBufAllocator[1]
         16384 |            0 |       128 | memory/ioBufAllocator[0]
             0 |            0 |       576 | memory/ICPRequestCont_allocator
             0 |            0 |       112 | memory/ICPPeerReadContAllocator
             0 |            0 |       432 | memory/PeerReadDataAllocator
             0 |            0 |        32 | memory/MIMEFieldSDKHandle
             0 |            0 |       240 | memory/INKVConnAllocator
             0 |            0 |        96 | memory/INKContAllocator
          4096 |            0 |        32 | memory/apiHookAllocator
             0 |            0 |       288 | memory/FetchSMAllocator
             0 |            0 |        80 | memory/prefetchLockHandlerAllocator
             0 |            0 |       176 | memory/PrefetchBlasterAllocator
             0 |            0 |        80 | memory/prefetchUrlBlaster
             0 |            0 |        96 | memory/blasterUrlList
             0 |            0 |        96 | memory/prefetchUrlEntryAllocator
             0 |            0 |       128 | memory/socksProxyAllocator
             0 |            0 |       144 | memory/ObjectReloadCont
       3258368 |       576016 |       592 | memory/httpClientSessionAllocator
        825344 |       139568 |       208 | memory/httpServerSessionAllocator
      22597632 |      1284848 |      9808 | memory/httpSMAllocator
             0 |            0 |        32 | memory/CacheLookupHttpConfigAllocator
             0 |            0 |      9856 | memory/httpUpdateSMAllocator
             0 |            0 |       128 | memory/RemapPluginsAlloc
[jira] [Commented] (TS-1006) memory management, cut down memory waste ?
[ https://issues.apache.org/jira/browse/TS-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13153365#comment-13153365 ] Zhao Yongming commented on TS-1006: --- we have made some enhancement and the result looks like the following, and we are still tracking for bugs and this is the first try, we will submit then codes after it stable for our usage. {code} [Nov 19 11:25:42.351] Server {0x4257a940} WARNING: Document 644CE0FF truncated at 10485040 of 21289520, missing fragment 9DD38ED0 allocated |in-use | type size | free list name |||-- 67108864 | 0 |2097152 | memory/cacheBufAllocator[14] 33554432 | 0 |1048576 | memory/cacheBufAllocator[13] 33554432 |1572864 | 524288 | memory/cacheBufAllocator[12] 33554432 |6029312 | 262144 | memory/cacheBufAllocator[11] 37748736 | 17301504 | 131072 | memory/cacheBufAllocator[10] 35651584 | 32964608 | 65536 | memory/cacheBufAllocator[9] 70254592 | 64782336 | 32768 | memory/cacheBufAllocator[8] 55574528 | 46366720 | 16384 | memory/cacheBufAllocator[7] 28573696 | 22749184 | 8192 | memory/cacheBufAllocator[6] 11534336 |320 | 4096 | memory/cacheBufAllocator[5] 0 | 0 | 2048 | memory/cacheBufAllocator[4] 0 | 0 | 1024 | memory/cacheBufAllocator[3] 0 | 0 |512 | memory/cacheBufAllocator[2] 0 | 0 |256 | memory/cacheBufAllocator[1] 0 | 0 |128 | memory/cacheBufAllocator[0] 1572864 | 997376 | 2048 | memory/hdrStrHeap 2621440 |1466368 | 2048 | memory/hdrHeap 262144 | 203776 | 1024 | memory/ArenaBlock dallocated |in-use | type size | free list name |||-- 0 | 0 |2097152 | memory/ioBufAllocator[14] 0 | 0 |1048576 | memory/ioBufAllocator[13] 0 | 0 | 524288 | memory/ioBufAllocator[12] 0 | 0 | 262144 | memory/ioBufAllocator[11] 0 | 0 | 131072 | memory/ioBufAllocator[10] 0 | 0 | 65536 | memory/ioBufAllocator[9] 7340032 |6455296 | 32768 | memory/ioBufAllocator[8] 1048576 | 720896 | 16384 | memory/ioBufAllocator[7] 1310720 |1048576 | 8192 | memory/ioBufAllocator[6] 1572864 | 929792 | 4096 | memory/ioBufAllocator[5] 
0 | 0 | 2048 | memory/ioBufAllocator[4] 0 | 0 | 1024 | memory/ioBufAllocator[3] 0 | 0 |512 | memory/ioBufAllocator[2] 0 | 0 |256 | memory/ioBufAllocator[1] 0 | 0 |128 | memory/ioBufAllocator[0] 0 | 0 |592 | memory/ICPRequestCont_allocator 0 | 0 |128 | memory/ICPPeerReadContAllocator 0 | 0 |432 | memory/PeerReadDataAllocator 0 | 0 | 32 | memory/MIMEFieldSDKHandle 0 | 0 |256 | memory/INKVConnAllocator 0 | 0 |112 | memory/INKContAllocator 0 | 0 | 32 | memory/apiHookAllocator 0 | 0 |304 | memory/FetchSMAllocator 0 | 0 | 80 | memory/prefetchLockHandlerAllocator 0 | 0 |176 | memory/PrefetchBlasterAllocator 0 | 0 | 80 | memory/prefetchUrlBlaster 0 | 0 | 96 | memory/blasterUrlList 0 | 0 | 96 | memory/prefetchUrlEntryAllocator 0 | 0 |128 | memory/socksProxyAllocator 0 | 0 |160 | memory/ObjectReloadCont 155648 | 120992 |608 | memory/httpClientSessionAllocator
[jira] [Commented] (TS-1001) reload the changes in dns.resolv_conf
[ https://issues.apache.org/jira/browse/TS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135778#comment-13135778 ]

Zhao Yongming commented on TS-1001:
-----------------------------------

I think we need more input on how to handle config file monitoring. As commented in TS-1003, we may need to make some more changes to handle config file registration at server runtime.

reload the changes in dns.resolv_conf
-------------------------------------

Key: TS-1001
URL: https://issues.apache.org/jira/browse/TS-1001
Project: Traffic Server
Issue Type: Wish
Components: DNS
Reporter: Conan Wang
Priority: Trivial

a trivial wish: ATS can reload (traffic_line -x) resolv.conf if the nameserver changed.
[jira] [Commented] (TS-984) LogFile::roll crash at Machine::instance
[ https://issues.apache.org/jira/browse/TS-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128420#comment-13128420 ]

Zhao Yongming commented on TS-984:
----------------------------------

Regarding #2: what I have done is give both the log collation clients and the server the same logging format name in logs_xml.config, but I made some changes on the server, so the Format definitions are not the same:
{code}
zymtest1 trafficserver # cat logs_xml.config | grep combined
Format = combined [%cqtn] %chi %phi 81 %ttms \%{Referer}cqh\ \%cqtx\ %pssc %cqhl %psql %psct %crc \%{User-Agent}cqh\ %cquuc %cquup /
zymtest1 trafficserver # ssh 10.62.163.237 grep combined /etc/trafficserver/logs_xml.config
Format = combined [%cqtn] %chi %phi 81 %ttms \%{Referer}cqh\ \%cqtx\ %pssc %cqhl %psql %psct %crc \%{User-Agent}cqh\ /
{code}
That is the most stupid setting in the world, but you cannot stop me from doing it. :D

LogFile::roll crash at Machine::instance
----------------------------------------

Key: TS-984
URL: https://issues.apache.org/jira/browse/TS-984
Project: Traffic Server
Issue Type: Bug
Components: Logging
Affects Versions: 3.1.0
Reporter: Zhao Yongming
Assignee: Alan M. Carroll
Labels: crash
Fix For: 3.1.1

This is a strange error I hit when testing the current trunk with the log collation server enabled:
{code}
[Oct 16 01:24:48.643] Server {0x2b4967f51760} WARNING: File /var/log/trafficserver/ex_squid.log will be rolled because a LogObject with different format is requesting the same filename
FATAL: Machine.cc:37: failed assert `_instance || !"Machine instance accessed before initialization"`
/usr/bin/traffic_server - STACK TRACE:
/usr/lib64/trafficserver/libtsutil.so.3(ink_fatal_va+0xc5)[0x2b49651a5109]
/usr/lib64/trafficserver/libtsutil.so.3(ink_fatal+0xa3)[0x2b49651a51bb]
/usr/lib64/trafficserver/libtsutil.so.3(_ink_assert+0xc2)[0x2b49651a3aee]
/usr/bin/traffic_server(Machine::instance()+0x24)[0x634a04]
/usr/bin/traffic_server(LogFile::roll(long, long)+0x230)[0x5f8138]
/usr/bin/traffic_server(LogObjectManager::_solve_filename_conflicts(LogObject*, int)+0x489)[0x603263]
/usr/bin/traffic_server(LogObjectManager::_manage_object(LogObject*, bool, int)+0x126)[0x6029ea]
/usr/bin/traffic_server(LogObjectManager::manage_object(LogObject*, int)+0x2d)[0x5e9e27]
/usr/bin/traffic_server(LogConfig::read_xml_log_config(int)+0x3189)[0x5f28c9]
/usr/bin/traffic_server(LogConfig::setup_log_objects()+0x229)[0x5edd63]
/usr/bin/traffic_server(LogConfig::init(LogConfig*)+0x254)[0x5ec3ea]
/usr/bin/traffic_server(Log::init(int)+0x30c)[0x5e8134]
/usr/bin/traffic_server(main+0xb6a)[0x5161a4]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x2b49673b4ebd]
/usr/bin/traffic_server[0x4cc789]
{code}
There are two problems here:
1, why is the machine not initialized?
2, what does "WARNING: File /var/log/trafficserver/ex_squid.log will be rolled because a LogObject with different format is requesting the same filename" mean? Is it harmful?
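Question 1 above concerns an initialization-order assert: the stack trace shows `Log::init` reaching `Machine::instance()` through `LogFile::roll` before the machine object exists. The pattern behind such an assert can be sketched as below. This is a generic illustration, assuming a hypothetical `MachineSketch` class with an invented `hostname` field; it is not the real `Machine` class and does not explain why the real initialization happened late.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical singleton with an explicit init step; NOT Traffic Server's Machine class.
class MachineSketch {
public:
    // Must be called once at startup, before any subsystem asks for the instance.
    static MachineSketch *init(const std::string &hostname) {
        slot().reset(new MachineSketch(hostname));
        return slot().get();
    }

    // Lets callers (for example a log roll during startup) check instead of asserting.
    static bool initialized() { return slot() != nullptr; }

    // Asserts when called before init(), mirroring the failure mode in the report.
    static MachineSketch *instance() {
        assert(slot() && "MachineSketch instance accessed before initialization");
        return slot().get();
    }

    const std::string &hostname() const { return hostname_; }

private:
    explicit MachineSketch(const std::string &hostname) : hostname_(hostname) {}

    // Function-local static avoids depending on global construction order.
    static std::unique_ptr<MachineSketch> &slot() {
        static std::unique_ptr<MachineSketch> s;
        return s;
    }

    std::string hostname_;
};
```

In this sketch, either the startup sequence calls `init()` before logging is set up, or code that can run early checks `initialized()` first and falls back to a default.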