[ 
https://issues.apache.org/jira/browse/TS-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896236#action_12896236
 ] 

Zhao Yongming commented on TS-394:
----------------------------------

After tracing back from the v2.1 code, I am now examining the first commit on 
the trunk branch, 'TS-196:Merged 
traffic-branchdev(trafficserver/traffic/branches/dev) changes r891822:915884 
into trunk. Tested: ubuntu904, forward and reverse proxy.' This is a merge from 
the dev branch, and it is what caused cluster communication to stop working. I 
will test the dev branch to get a clearer view.

> traffic_server process sig abort in full cluster mode
> -----------------------------------------------------
>
>                 Key: TS-394
>                 URL: https://issues.apache.org/jira/browse/TS-394
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core
>         Environment: ATS in full cluster mode is unusable: the traffic_server 
> process aborts (SIGABRT) on every request. Tested with code from trunk. 
>            Reporter: Zhao Yongming
>            Priority: Critical
>             Fix For: 2.3.0
>
>         Attachments: traffic_full_cluster_sig_abort.patch
>
>
> I am trying to set up full cluster mode, but I get a connection abort on 
> every request. According to tcpdump, ATS gets the correct source file 
> from the backend but does not send the full file to the client (a TCP reset 
> occurs during the HTTP transfer), so I tried to find the root cause.
> With debug logging enabled in records.config:
> CONFIG proxy.config.diags.debug.enabled INT 1
> CONFIG proxy.config.diags.debug.tags STRING http.*|cluster.*
> I got the following log from traffic.out:
> [Jun 22 15:19:11.296] Manager {139809315796768} ERROR: 
> [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 6: 
> Aborted
> [Jun 22 15:19:11.296] Manager {139809315796768} ERROR:  (last system error 2: 
> No such file or directory)
> [Jun 22 15:19:11.296] Manager {139809315796768} ERROR: [Alarms::signalAlarm] 
> Server Process was reset
> [Jun 22 15:19:11.296] Manager {139809315796768} ERROR:  (last system error 2: 
> No such file or directory)
> Running strace on traffic_server, I got the following info:
> [pid 19830]      0.000306 <... epoll_wait resumed> {}, 32768, 10) = 0
> [pid 19830]      0.000031 gettimeofday({1277256740, 532763}, NULL) = 0
> [pid 19830]      0.000181 write(2, "FATAL: ClusterHandler.cc:2047: failed 
> assert `ntodo >= 0`\n", 58) = 58
> [pid 19830]      0.000076 gettimeofday({1277256740, 533020}, NULL) = 0
> [pid 19830]      0.000071 socket(PF_FILE, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 101
> [pid 19830]      0.000062 connect(101, {sa_family=AF_FILE, path="/dev/log"}, 
> 110) = -1 ENOENT (No such file or directory)
> [pid 19830]      0.000074 close(101)    = 0
> [pid 19830]      0.000056 write(2, "/usr/bin/traffic_server", 23) = 23
> [pid 19830]      0.000050 write(2, " - STACK TRACE: \n", 17) = 17
> [pid 19830]      0.000289 futex(0x2b3a08e6a5b0, FUTEX_WAKE_PRIVATE, 
> 2147483647) = 0
> [pid 19830]      0.000251 futex(0x2b3a08b11190, FUTEX_WAKE_PRIVATE, 
> 2147483647) = 0
> [pid 19830]      0.000656 writev(2, [{"/usr/bin/traffic_server", 23}, {"(", 
> 1}, {"ink_fatal_va", 12}, {"+0x", 3}, {"ab", 2}, {")", 1}, {"[0x", 3}, 
> {"6dcdab", 6}, {"]\n", 2}], 9) = 53
> I have fixed the bug by commenting out ClusterHandler.cc:2047. A patch will follow.
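
For anyone reproducing this, the failing assertion can be pulled out of a captured stderr or strace log automatically. The following is only a minimal sketch: it assumes the message keeps the format seen in the strace output quoted above (`FATAL: <file>:<line>: failed assert` followed by the expression in backticks), and the helper name `find_failed_asserts` is hypothetical, not part of Traffic Server.

```python
import re

# Assumed message format, taken from the strace output quoted above:
#   FATAL: <file>:<line>: failed assert `<expression>`
ASSERT_RE = re.compile(
    r"FATAL: (?P<file>[\w./-]+):(?P<line>\d+): failed assert `(?P<expr>[^`]*)`"
)


def find_failed_asserts(text):
    """Return a (file, line, expression) tuple for each failed-assert message."""
    return [
        (m.group("file"), int(m.group("line")), m.group("expr"))
        for m in ASSERT_RE.finditer(text)
    ]


if __name__ == "__main__":
    # Sample line copied from the report; a real log would be read from a file.
    sample = "FATAL: ClusterHandler.cc:2047: failed assert `ntodo >= 0`\n"
    for source_file, line_no, expr in find_failed_asserts(sample):
        print(f"{source_file}:{line_no} failed assert: {expr}")
```

Note that strace wraps long write() arguments in its own quoting, so the sketch is best applied to the process's stderr output rather than to the raw strace text.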

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
