Thanks, I totally forgot to check the tracker. I added the information I
collected there, but I don't have enough experience with Ceph to dig through
this myself, so let's see if someone is willing to sacrifice their free time
to help debug this issue.
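
For whoever picks this up, here is a minimal Python sketch for splitting the
backtrace frames from the crash log into index, symbol, offset and address.
The names `FRAME_RE` and `parse_frame` are made up for illustration and are
not part of Ceph. The sketch assumes the frame layout seen in the log,
"N: (symbol+0xOFFSET) [0xADDRESS]", and it tolerates leading ">" quote
markers.

```python
import re

# Assumed frame layout, taken from the log in this thread, e.g.
#   " 11: (DaemonPerfCounters::update(MMgrReport*)+0x821) [0x55f663f96651]"
# Leading whitespace and mail quote markers (">>") are skipped.
FRAME_RE = re.compile(
    r"^\s*(?:>+\s*)?(\d+):\s+\((.*)\+0x([0-9a-fA-F]+)\)\s+\[0x([0-9a-fA-F]+)\]"
)


def parse_frame(line):
    """Return a dict describing one backtrace frame, or None if the line
    does not look like a frame."""
    match = FRAME_RE.match(line)
    if match is None:
        return None
    index, symbol, offset, address = match.groups()
    return {
        "index": int(index),
        # Frames printed as "(()+0x...)" carry no symbol name.
        "symbol": None if symbol == "()" else symbol,
        "offset": int(offset, 16),
        "address": int(address, 16),
    }


if __name__ == "__main__":
    sample = [
        " 9: (()+0x2318ad) [0x55f663f348ad]",
        " 11: (DaemonPerfCounters::update(MMgrReport*)+0x821) [0x55f663f96651]",
    ]
    for line in sample:
        print(parse_frame(line))
```

The frames that come back with `symbol` set to None (frames 9 and 10, for
example) are the ones that still need the executable or the objdump output
to be resolved.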

--
Katie

On 2017-09-12 03:15, Brad Hubbard wrote:
> Looks like there is a tracker opened for this.
> 
> http://tracker.ceph.com/issues/21197
> 
> Please add your details there.
> 
> On Tue, Sep 12, 2017 at 11:04 AM, Katie Holly <[email protected]> wrote:
>> Hi,
>>
>> I recently upgraded one of our clusters from Kraken to Luminous (the cluster 
>> was initialized with Jewel) on Ubuntu 16.04 and deployed ceph-mgr on all of 
>> our ceph-mon nodes with ceph-deploy.
>>
>> Related log entries after initial deployment of ceph-mgr:
>>
>> 2017-09-11 06:41:53.535025 7fb5aa7b8500  0 set uid:gid to 64045:64045 
>> (ceph:ceph)
>> 2017-09-11 06:41:53.535048 7fb5aa7b8500  0 ceph version 12.2.0 
>> (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc), process (unknown), 
>> pid 17031
>> 2017-09-11 06:41:53.536853 7fb5aa7b8500  0 pidfile_write: ignore empty 
>> --pid-file
>> 2017-09-11 06:41:53.541880 7fb5aa7b8500  1 mgr send_beacon standby
>> 2017-09-11 06:41:54.547383 7fb5a1aec700  1 mgr handle_mgr_map Activating!
>> 2017-09-11 06:41:54.547575 7fb5a1aec700  1 mgr handle_mgr_map I am now 
>> activating
>> 2017-09-11 06:41:54.650677 7fb59dae4700  1 mgr start Creating threads for 0 
>> modules
>> 2017-09-11 06:41:54.650696 7fb59dae4700  1 mgr send_beacon active
>> 2017-09-11 06:41:55.542252 7fb59eae6700  1 mgr send_beacon active
>> 2017-09-11 06:41:55.542627 7fb59eae6700  1 mgr.server send_report Not 
>> sending PG status to monitor yet, waiting for OSDs
>> 2017-09-11 06:41:57.542697 7fb59eae6700  1 mgr send_beacon active
>> [... lots of "send_beacon active" messages ...]
>> 2017-09-11 07:29:29.640892 7fb59eae6700  1 mgr send_beacon active
>> 2017-09-11 07:29:30.866366 7fb59d2e3700 -1 *** Caught signal (Aborted) **
>>  in thread 7fb59d2e3700 thread_name:ms_dispatch
>>
>>  ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
>>  1: (()+0x3de6b4) [0x55f6640e16b4]
>>  2: (()+0x11390) [0x7fb5a8fef390]
>>  3: (gsignal()+0x38) [0x7fb5a7f7f428]
>>  4: (abort()+0x16a) [0x7fb5a7f8102a]
>>  5: (__gnu_cxx::__verbose_terminate_handler()+0x16d) [0x7fb5a88c284d]
>>  6: (()+0x8d6b6) [0x7fb5a88c06b6]
>>  7: (()+0x8d701) [0x7fb5a88c0701]
>>  8: (()+0x8d919) [0x7fb5a88c0919]
>>  9: (()+0x2318ad) [0x55f663f348ad]
>>  10: (()+0x3e91bd) [0x55f6640ec1bd]
>>  11: (DaemonPerfCounters::update(MMgrReport*)+0x821) [0x55f663f96651]
>>  12: (DaemonServer::handle_report(MMgrReport*)+0x1ae) [0x55f663f9b79e]
>>  13: (DaemonServer::ms_dispatch(Message*)+0x64) [0x55f663fa8d64]
>>  14: (DispatchQueue::entry()+0xf4a) [0x55f664438f3a]
>>  15: (DispatchQueue::DispatchThread::entry()+0xd) [0x55f6641dc44d]
>>  16: (()+0x76ba) [0x7fb5a8fe56ba]
>>  17: (clone()+0x6d) [0x7fb5a80513dd]
>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to 
>> interpret this.
>>
>> --- begin dump of recent events ---
>> [...]
>>
>>
>> I tried to manually run ceph-mgr with
>>> /usr/bin/ceph-mgr -f --cluster ceph --id $HOSTNAME --setuser ceph 
>>> --setgroup ceph
>> but it exits again within a few seconds.
>> stdout: http://xor.meo.ws/OyvoZF8v0aWq0D-rOOg2y6u03fp_yzYv.txt
>> logs: http://xor.meo.ws/jcMyjabCfFbTcfZ8GOangLdSfSSqJffr.txt
>> objdump: http://xor.meo.ws/oxo2q8h_oKAG6q7mARvNKkR_JdYjn89B.txt
>>
>> Has anyone seen this issue before, or does anyone know how to debug or even 
>> fix it?
>>
>>
>> --
>> Katie
>> _______________________________________________
>> ceph-users mailing list
>> [email protected]
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 