`traffic_manager' handles sockets incorrectly: loops error 88: Socket operation on non-socket ---------------------------------------------------------------------------------------------
Key: TS-741 URL: https://issues.apache.org/jira/browse/TS-741 Project: Traffic Server Issue Type: Bug Components: Management Affects Versions: 2.1.7, 2.1.8 Environment: ATS 2.1.7/trunk as of today: Debian AMD64, gcc 4.5.2; FreeBSD 8.2, gcc 4.2.1 Reporter: Arno Toell Attachments: strace.out traffic_manager fails to read from its sockets. This is the logging output from the first start of traffic_manager: {code} root@build:/home/arno/build-area# traffic_manager [E. Mgmt] log ==> [TrafficManager] using root directory '/usr' [Apr 15 21:13:25.167] {140016265324320} STATUS: opened /var/log/trafficserver/manager.log [Apr 15 21:13:25.167] {140016265324320} NOTE: updated diags config [Apr 15 21:13:25.220] Manager {140016265324320} NOTE: [ClusterCom::ClusterCom] Node running on OS: 'Linux' Release: '2.6.32-5-amd64' [Apr 15 21:13:25.228] Manager {140016265324320} NOTE: [LocalManager::listenForProxy] Listening on port: 8080 [Apr 15 21:13:25.228] Manager {140016265324320} NOTE: [TrafficManager] Setup complete [Apr 15 21:13:26.292] Manager {140016265324320} NOTE: [LocalManager::startProxy] Launching ts process [TrafficServer] using root directory '/usr' [Apr 15 21:13:26.402] Manager {140016265324320} NOTE: [LocalManager::pollMgmtProcessServer] New process connecting fd '11' [Apr 15 21:13:26.402] Manager {140016265324320} NOTE: [Alarms::signalAlarm] Server Process born [Apr 15 21:13:27.497] {47891714364896} STATUS: opened /var/log/trafficserver/diags.log [Apr 15 21:13:27.501] {47891714364896} NOTE: updated diags config [Apr 15 21:13:27.528] Server {47891714364896} NOTE: cache clustering disabled [Apr 15 21:13:27.528] Server {47891714364896} NOTE: clearing statistics [Apr 15 21:13:27.729] Server {47891714364896} NOTE: file '/var/run/trafficserver/host.db' size changed from 0.00MB to 30.91MB [Apr 15 21:13:27.866] Server {47891714364896} WARNING: header missing/corrupt: [hostdb.config] : reinitializing database [Apr 15 21:13:27.866] Server {47891714364896} NOTE: reconfiguring host database [Apr 15 21:13:28.110] Server {47891714364896} NOTE: cache clustering disabled [Apr 15 21:13:28.113] Server {47891714364896} WARNING: cache unable to open '/var/cache/trafficserver/cache.db': Permission denied [Apr 15 21:13:28.113] Server {47891714364896} WARNING: unable to open cache disk(s): Cache Disabled [Apr 15 21:13:28.495] Server {47891714364896} NOTE: logging initialized[7], logging_mode = 3 [Apr 15 21:13:28.572] Server {47891714364896} NOTE: traffic server running [Apr 15 21:13:30.074] Manager {140016199149312} ERROR: [drainIncomingChannel] error accepting reliable connection [Apr 15 21:13:30.074] Manager {140016199149312} ERROR: (last system error 88: Socket operation on non-socket) [Apr 15 21:13:30.074] Manager {140016199149312} ERROR: [drainIncomingChannel] error accepting reliable connection [Apr 15 21:13:30.074] Manager {140016199149312} ERROR: (last system error 88: Socket operation on non-socket) [Apr 15 21:13:30.074] Manager {140016199149312} ERROR: [drainIncomingChannel] error accepting reliable connection ... loops until stopped {code} Nothing more happens beyond this point but to loop socket error 88 each time the process polls the file descriptor. I attached a strace of the traffic_manager binary. This looks all a bit odd, the select() does nothing but fail apparently. I have no clue where the gettimeofday() spam comes from. {code} gettimeofday({1302902814, 227582}, NULL) = 0 gettimeofday({1302902814, 228426}, NULL) = 0 write(11, "\27'\0\0\2\0\0\0000\0", 10) = 10 write(11, "\31'\0\0`o\2\0\3\0\0\0\30\0\0\0`o\2\0`o\2\0\304\2\0\0\0\0\0\0"..., 159592) = 159592 munmap(0x7f08d9ee4000, 159744) = 0 write(11, "\31'\0\0`o\2\0\5\0\0\0\30\0\0\0`o\2\0`o\2\0\304\2\0\0\0\0\0\0"..., 159592) = 159592 madvise(0x7f08d009d000, 28672, MADV_DONTNEED) = 0 select(1024, [5 11], NULL, NULL, {1, 0}) = 0 (Timeout) gettimeofday({1302902815, 321699}, NULL) = 0 {code} FreeBSD is the same (but the errno): {code} [Apr 15 23:21:11.811] Manager {34391239808} ERROR: [drainIncomingChannel] error accepting reliable connection [Apr 15 23:21:11.813] Manager {34391239808} ERROR: (last system error 38: Socket operation on non-socket) {code} ./configure was (yes, redundancy with {{--enable-layout}} is intentional): {code} ./configure --prefix=/usr --sysconfdir=/etc/trafficserver \ --localstatedir=/var/run --includedir=/usr/include \ --libdir=/usr/lib --datadir=/var/cache/trafficserver \ --enable-layout=Debian --with-user=root \ --with-group=root {code} I previously tested ATS 2.1.5 which isn't affected by this problem. Debian versions of packages trafficserver is linked against: {code} ii libc6 2.11.2-13 Embedded GNU C Library: Shared lib ii libexpat1 2.0.1-7 XML parsing C library - runtime li ii libgcc1 1:4.6.0-2 GCC support library ii libpcre3 8.12-3 Perl 5 Compatible Regular Expressi ii libssl1.0.0 1.0.0d-2 SSL shared libraries ii libstdc++6 4.6.0-2 The GNU Standard C++ Library v3 ii tcl8.4 8.4.19-4 Tcl (the Tool Command Language) v8 ii zlib1g 1:1.2.3.4.dfsg-3 compression library - runtime {code} Tell me if can further help you -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira