`traffic_manager' handles sockets incorrectly: loops error 88: Socket operation 
on non-socket
---------------------------------------------------------------------------------------------

                 Key: TS-741
                 URL: https://issues.apache.org/jira/browse/TS-741
             Project: Traffic Server
          Issue Type: Bug
          Components: Management
    Affects Versions: 2.1.7, 2.1.8
         Environment: ATS 2.1.7/trunk as of today: Debian AMD64, gcc 4.5.2; 
FreeBSD 8.2, gcc 4.2.1
            Reporter: Arno Toell
         Attachments: strace.out

traffic_manager fails to read from its sockets. This is the logging output from 
the first start of traffic_manager:

{code}
root@build:/home/arno/build-area# traffic_manager 
[E. Mgmt] log ==> [TrafficManager] using root directory '/usr'
[Apr 15 21:13:25.167] {140016265324320} STATUS: opened 
/var/log/trafficserver/manager.log
[Apr 15 21:13:25.167] {140016265324320} NOTE: updated diags config
[Apr 15 21:13:25.220] Manager {140016265324320} NOTE: [ClusterCom::ClusterCom] 
Node running on OS: 'Linux' Release: '2.6.32-5-amd64'
[Apr 15 21:13:25.228] Manager {140016265324320} NOTE: 
[LocalManager::listenForProxy] Listening on port: 8080
[Apr 15 21:13:25.228] Manager {140016265324320} NOTE: [TrafficManager] Setup 
complete
[Apr 15 21:13:26.292] Manager {140016265324320} NOTE: 
[LocalManager::startProxy] Launching ts process
[TrafficServer] using root directory '/usr'
[Apr 15 21:13:26.402] Manager {140016265324320} NOTE: 
[LocalManager::pollMgmtProcessServer] New process connecting fd '11'
[Apr 15 21:13:26.402] Manager {140016265324320} NOTE: [Alarms::signalAlarm] 
Server Process born
[Apr 15 21:13:27.497] {47891714364896} STATUS: opened 
/var/log/trafficserver/diags.log
[Apr 15 21:13:27.501] {47891714364896} NOTE: updated diags config
[Apr 15 21:13:27.528] Server {47891714364896} NOTE: cache clustering disabled
[Apr 15 21:13:27.528] Server {47891714364896} NOTE: clearing statistics
[Apr 15 21:13:27.729] Server {47891714364896} NOTE: file 
'/var/run/trafficserver/host.db' size changed from 0.00MB to 30.91MB
[Apr 15 21:13:27.866] Server {47891714364896} WARNING: header missing/corrupt: 
[hostdb.config] : reinitializing database
[Apr 15 21:13:27.866] Server {47891714364896} NOTE: reconfiguring host database
[Apr 15 21:13:28.110] Server {47891714364896} NOTE: cache clustering disabled
[Apr 15 21:13:28.113] Server {47891714364896} WARNING: cache unable to open 
'/var/cache/trafficserver/cache.db': Permission denied
[Apr 15 21:13:28.113] Server {47891714364896} WARNING: unable to open cache 
disk(s): Cache Disabled
[Apr 15 21:13:28.495] Server {47891714364896} NOTE: logging initialized[7], 
logging_mode = 3
[Apr 15 21:13:28.572] Server {47891714364896} NOTE: traffic server running

[Apr 15 21:13:30.074] Manager {140016199149312} ERROR: [drainIncomingChannel] 
error accepting reliable connection
[Apr 15 21:13:30.074] Manager {140016199149312} ERROR:  (last system error 88: 
Socket operation on non-socket)
[Apr 15 21:13:30.074] Manager {140016199149312} ERROR: [drainIncomingChannel] 
error accepting reliable connection
[Apr 15 21:13:30.074] Manager {140016199149312} ERROR:  (last system error 88: 
Socket operation on non-socket)
[Apr 15 21:13:30.074] Manager {140016199149312} ERROR: [drainIncomingChannel] 
error accepting reliable connection
... loops until stopped
{code}

Nothing more happens beyond this point but to loop socket error 88 each time 
the process polls the file descriptor. I attached a strace of the 
traffic_manager binary. This looks all a bit odd, the select() does nothing but 
fail apparently. I have no clue where the gettimeofday() spam comes from. 

{code}
gettimeofday({1302902814, 227582}, NULL) = 0
gettimeofday({1302902814, 228426}, NULL) = 0
write(11, "\27'\0\0\2\0\0\0000\0", 10)  = 10
write(11, "\31'\0\0`o\2\0\3\0\0\0\30\0\0\0`o\2\0`o\2\0\304\2\0\0\0\0\0\0"..., 
159592) = 159592
munmap(0x7f08d9ee4000, 159744)          = 0
write(11, "\31'\0\0`o\2\0\5\0\0\0\30\0\0\0`o\2\0`o\2\0\304\2\0\0\0\0\0\0"..., 
159592) = 159592
madvise(0x7f08d009d000, 28672, MADV_DONTNEED) = 0
select(1024, [5 11], NULL, NULL, {1, 0}) = 0 (Timeout)
gettimeofday({1302902815, 321699}, NULL) = 0
{code}

FreeBSD is the same (but the errno):
{code}
[Apr 15 23:21:11.811] Manager {34391239808} ERROR: [drainIncomingChannel] error 
accepting reliable connection
[Apr 15 23:21:11.813] Manager {34391239808} ERROR:  (last system error 38: 
Socket operation on non-socket)
{code}

./configure was (yes, redundancy with {{--enable-layout}} is intentional):
{code}
./configure --prefix=/usr --sysconfdir=/etc/trafficserver \
    --localstatedir=/var/run --includedir=/usr/include \
    --libdir=/usr/lib --datadir=/var/cache/trafficserver \
    --enable-layout=Debian --with-user=root \
    --with-group=root
{code}

I previously tested ATS 2.1.5 which isn't affected by this problem. 

Debian versions of packages trafficserver is linked against:
{code}
ii  libc6                   2.11.2-13        Embedded GNU C Library: Shared lib
ii  libexpat1               2.0.1-7          XML parsing C library - runtime li
ii  libgcc1                 1:4.6.0-2        GCC support library
ii  libpcre3                8.12-3           Perl 5 Compatible Regular Expressi
ii  libssl1.0.0             1.0.0d-2         SSL shared libraries
ii  libstdc++6              4.6.0-2          The GNU Standard C++ Library v3
ii  tcl8.4                  8.4.19-4         Tcl (the Tool Command Language) v8
ii  zlib1g                  1:1.2.3.4.dfsg-3 compression library - runtime
{code}

Tell me if can further help you

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to