shinrich opened a new pull request #7218:
URL: https://github.com/apache/trafficserver/pull/7218
We have some new machines with bad disks. When they are in this state, they
crash, generating cores like the following.
{code}
[ 00 ] libc-2.17.so raise
[ 01 ] libc-2.17.so abort
[ 02 ] libtscore.so.9.0.0 ink_abort ( ink_error.cc:99 )
[ 03 ] libtscore.so.9.0.0 _ink_assert ( ink_assert.cc:37 )
[ 04 ] traffic_server Cache::open(bool, bool) ( Cache.cc:2066 )
[ 05 ] traffic_server CacheProcessor::diskInitialized() ( Cache.cc:832 )
[ 06 ] traffic_server CacheDisk::openDone(int, void*) ( CacheDisk.cc:212 )
[ 07 ] traffic_server EThread::process_event(Event*, int) ( I_Continuation.h:167 )
[ 08 ] traffic_server EThread::execute_regular() ( UnixEThread.cc:241 )
[ 09 ] traffic_server execute ( UnixEThread.cc:332 )
[ 10 ] traffic_server EThread::execute() ( UnixEThread.cc:310 )
[ 11 ] traffic_server spawn_thread_internal ( Thread.cc:92 )
[ 12 ] libpthread-2.17.so start_thread
{code}
This PR fixes the issue in two ways. It replaces Fatal with Emergency so
traffic_manager stops trying to restart traffic_server. It also splits the
initialization so that the part that starts background AIO threads does not
run until it knows that enough disks are up to complete the startup.
The specific crash was due to the process shutting down while the background
threads were still running. The background threads would have the static data
structures freed from underneath them.
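The split initialization described above can be pictured roughly as follows.
This is only a minimal sketch of the general idea, not the code in this PR:
the names TwoPhaseCacheInit, DiskProbe and InitResult are hypothetical, and
plain std::thread workers stand in for the real AIO threads. The point it
illustrates is that the disk check runs first and no background thread exists
if the check fails, so an early shutdown has nothing left running.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the outcome of opening one cache disk.
struct DiskProbe {
  bool usable;
};

enum class InitResult { Started, NotEnoughDisks };

// Hypothetical two-phase initializer: check disks first, start threads second.
class TwoPhaseCacheInit {
public:
  explicit TwoPhaseCacheInit(std::size_t min_disks) : min_disks_(min_disks) {}

  // Join any workers before members are destroyed.
  ~TwoPhaseCacheInit() { stop(); }

  InitResult init(const std::vector<DiskProbe> &disks, std::size_t n_workers) {
    // Phase 1: count usable disks. No threads are started here.
    std::size_t good = 0;
    for (const auto &d : disks) {
      if (d.usable) {
        ++good;
      }
    }
    if (good < min_disks_) {
      // The caller can report the failure and exit; nothing is running.
      return InitResult::NotEnoughDisks;
    }

    // Phase 2: enough disks are up, so start the background workers.
    running_.store(true);
    for (std::size_t i = 0; i < n_workers; ++i) {
      workers_.emplace_back([this] {
        while (running_.load()) {
          std::this_thread::yield(); // placeholder for real background work
        }
      });
    }
    return InitResult::Started;
  }

  // Ask the workers to finish and wait for them.
  void stop() {
    running_.store(false);
    for (auto &t : workers_) {
      if (t.joinable()) {
        t.join();
      }
    }
    workers_.clear();
  }

  std::size_t worker_count() const { return workers_.size(); }

private:
  std::size_t min_disks_;
  std::atomic<bool> running_{false};
  std::vector<std::thread> workers_;
};
```

With this shape, a caller that gets NotEnoughDisks back can log the error and
shut down without racing against any worker that still touches shared state.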