Chiming in to mirror this.

We run 250 OSDs. Since upgrading to 14.2.6, CPU usage on the mgr is much
lower and the balancer no longer hangs, which was the main thing that
stalled before.

Reed

> On Jan 16, 2020, at 4:30 AM, Dan van der Ster <[email protected]> wrote:
> 
> Hey Wido,
> We upgraded a 550-osd cluster from 14.2.4 to 14.2.6 and everything seems to 
> be working fine. Here's top:
> 
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> 1432693 ceph      20   0 3246580   2.0g  18260 S  78.4 13.9   2760:58 ceph-mgr
> 2075038 ceph      20   0 2235072   1.1g  16408 S  11.6  7.6 176:15.30 ceph-mon
> 
> And the balancer is quick:
> 
> # ceph balancer status
> {
>     "last_optimize_duration": "0:00:02.806449", 
>     "plans": [], 
>     "mode": "upmap", 
>     "active": true, 
>     "optimize_result": "Optimization plan created successfully", 
>     "last_optimize_started": "Thu Jan 16 11:26:19 2020"
> }
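
If anyone wants to track last_optimize_duration over time, here is a
minimal sketch that turns the string into seconds. It assumes the value
always has the H:MM:SS.ffffff shape shown above (longer forms, such as a
leading day count, are not handled), and parse_duration_seconds is a name
invented for this example, not something from Ceph.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: parses a duration such as "0:00:02.806449"
// into seconds. Returns -1.0 when the text is not in H:MM:SS[.fraction]
// form. Assumes the default "C" locale (decimal point is '.').
double parse_duration_seconds(const std::string& text)
{
    int hours = 0;
    int minutes = 0;
    double seconds = 0.0;
    int consumed = 0;

    // %n records how many characters were consumed so trailing junk
    // can be rejected; it does not count towards sscanf's return value.
    const int fields = std::sscanf(text.c_str(), "%d:%d:%lf%n",
                                   &hours, &minutes, &seconds, &consumed);
    if (fields != 3 || consumed != static_cast<int>(text.size())) {
        return -1.0;
    }
    // Written as negated range checks so that NaN is rejected as well.
    if (hours < 0 || minutes < 0 || minutes >= 60 ||
        !(seconds >= 0.0 && seconds < 60.0)) {
        return -1.0;
    }
    return hours * 3600.0 + minutes * 60.0 + seconds;
}
```

Returning a negative value on failure keeps the sketch free of newer
library types; a real tool would probably want a proper error path.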
> 
> Cheers, Dan
> 
> 
> On Thu, Jan 16, 2020 at 11:19 AM Wido den Hollander <[email protected] 
> <mailto:[email protected]>> wrote:
> Anybody upgraded to 14.2.6 yet?
> 
> On a 1800 OSD cluster I see that ceph-mgr is consuming 200 to 450% CPU
> on a 4C/8T system (Intel Xeon E3-1230 3.3GHz CPU).
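
For scale: in its default (Irix) mode top reports %CPU relative to one
logical CPU, so on 8 threads the ceiling is 800%. A small sketch of the
conversion follows; cpu_share_of_host is a made-up name for illustration.

```cpp
#include <cassert>

// Hypothetical helper: converts a top-style %CPU reading
// (100 = one logical CPU fully busy) into the fraction of the
// whole machine that is in use.
double cpu_share_of_host(double top_percent, int logical_cpus)
{
    assert(logical_cpus > 0);
    return top_percent / (100.0 * logical_cpus);
}
```

With 8 logical CPUs, 200% works out to 0.25 of the host and 450% to
about 0.56.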
> 
> The logs don't show anything very special, it's just that the mgr is
> super busy.
> 
> I noticed this when I executed:
> 
> $ ceph balancer status
> 
> That command wouldn't return, so I checked the mgr. Only after
> restarting ceph-mgr did the balancer module return results again. The
> restart didn't change the CPU usage, which is still high, but at
> least the balancer seems to work again.
> 
> Wido
> 
> On 1/9/20 10:21 AM, Lars Täuber wrote:
> > yesterday:
> > https://ceph.io/releases/v14-2-6-nautilus-released/
> > 
> > 
> > Cheers,
> > Lars
> > 
> > Thu, 9 Jan 2020 10:10:12 +0100
> > Wido den Hollander <[email protected] <mailto:[email protected]>> ==> Neha Ojha 
> > <[email protected] <mailto:[email protected]>>, Sasha Litvak 
> > <[email protected] <mailto:[email protected]>> :
> >> On 12/24/19 9:19 PM, Neha Ojha wrote:
> >>> The root cause of this issue is the overhead added by the network ping
> >>> time monitoring feature for the mgr to process.
> >>> We have a fix that disables sending the network ping time
> >>> stats to the mgr, and Eric has helped verify the fix (thanks, Eric!):
> >>> https://tracker.ceph.com/issues/43364#note-9. We'll get this fix out
> >>> in 14.2.6 after the holidays.
> >>>   
> >>
> >> It's after the holidays now and this is affecting a lot of deployments.
> >> Can people expect 14.2.6 soon?
> >>
> >> Wido
> >>
> >>>
> >>>
> >>> On Fri, Dec 20, 2019 at 6:24 PM Neha Ojha <[email protected] 
> >>> <mailto:[email protected]>> wrote:  
> >>>>
> >>>> Not yet, but we have a theory and a test build in
> >>>> https://tracker.ceph.com/issues/43364#note-6, if anybody would like to
> >>>> give it a try.
> >>>>
> >>>> Thanks,
> >>>> Neha
> >>>>
> >>>> On Fri, Dec 20, 2019 at 2:31 PM Sasha Litvak
> >>>> <[email protected] <mailto:[email protected]>> 
> >>>> wrote:  
> >>>>>
> >>>>> Was the root cause found and fixed?  If so, will the fix be available 
> >>>>> in 14.2.6 or sooner?
> >>>>>
> >>>>> On Thu, Dec 19, 2019 at 5:48 PM Mark Nelson <[email protected] 
> >>>>> <mailto:[email protected]>> wrote:  
> >>>>>>
> >>>>>> Hi Paul,
> >>>>>>
> >>>>>>
> >>>>>> Thanks for gathering this!  It looks to me like at the very least we
> >>>>>> should redo the fixed_u_to_string and fixed_to_string functions in
> >>>>>> common/Formatter.cc.  That alone looks like it's having a pretty
> >>>>>> significant impact.
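
A rough sketch of what a cheaper fixed_u_to_string could look like, in
case it helps. Assumptions on my part: that the function renders num as a
decimal with `scale` digits after the point, and the name
fixed_u_to_string_sketch is made up; this is not the patch that went into
Ceph. It avoids constructing a std::ostringstream (and the locale setup
that shows up under it in Paul's trace below) by working on a std::string
directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical sketch: formats num as a fixed-point decimal with
// `scale` fractional digits, e.g. (1234, 3) -> "1.234".
// No stream object is created, so no locale facets are looked up.
std::string fixed_u_to_string_sketch(std::uint64_t num, int scale)
{
    assert(scale >= 0);
    const std::size_t frac = static_cast<std::size_t>(scale);

    std::string digits = std::to_string(num);

    // Left-pad with zeros so at least one digit remains before the point.
    if (digits.size() < frac + 1) {
        const std::size_t pad = frac + 1 - digits.size();
        digits.insert(std::size_t{0}, pad, '0');
    }

    // Place the decimal point `scale` digits from the right.
    digits.insert(digits.size() - frac, std::size_t{1}, '.');
    return digits;
}
```

For example, fixed_u_to_string_sketch(5, 3) gives "0.005". Whether this
matches the existing output byte for byte in every corner case would need
checking against the real function.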
> >>>>>>
> >>>>>>
> >>>>>> Mark
> >>>>>>
> >>>>>>
> >>>>>> On 12/19/19 2:09 PM, Paul Mezzanini wrote:  
> >>>>>>> Based on what we've seen with perf, we think this is the relevant
> >>>>>>> section.  (The whole file is also attached.)
> >>>>>>>
> >>>>>>> Thread: 73 (mgr-fin) - 1000 samples
> >>>>>>>
> >>>>>>> + 100.00% clone
> >>>>>>>    + 100.00% start_thread
> >>>>>>>      + 100.00% Finisher::finisher_thread_entry()
> >>>>>>>        + 99.40% Context::complete(int)
> >>>>>>>        | + 99.40% FunctionContext::finish(int)
> >>>>>>>        |   + 99.40% ActivePyModule::notify(std::string const&, 
> >>>>>>> std::string const&)
> >>>>>>>        |     + 91.30% PyObject_CallMethod
> >>>>>>>        |     | + 91.30% call_function_tail
> >>>>>>>        |     |   + 91.30% PyObject_Call
> >>>>>>>        |     |     + 91.30% instancemethod_call
> >>>>>>>        |     |       + 91.30% PyObject_Call
> >>>>>>>        |     |         + 91.30% function_call
> >>>>>>>        |     |           + 91.30% PyEval_EvalCodeEx
> >>>>>>>        |     |             + 88.40% PyEval_EvalFrameEx
> >>>>>>>        |     |             | + 88.40% PyEval_EvalFrameEx
> >>>>>>>        |     |             |   + 88.40% 
> >>>>>>> ceph_state_get(BaseMgrModule*, _object*)
> >>>>>>>        |     |             |     + 88.40% 
> >>>>>>> ActivePyModules::get_python(std::string const&)
> >>>>>>>        |     |             |       + 51.10% 
> >>>>>>> PGMap::dump_osd_stats(ceph::Formatter*) const
> >>>>>>>        |     |             |       | + 51.10% 
> >>>>>>> osd_stat_t::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       |   + 22.50% 
> >>>>>>> ceph::fixed_u_to_string(unsigned long, int)
> >>>>>>>        |     |             |       |   | + 10.50% 
> >>>>>>> std::basic_ostringstream<char, std::char_traits<char>, 
> >>>>>>> std::allocator<char> >::basic_ostringstream(std::_Ios_Openmode)
> >>>>>>>        |     |             |       |   | | + 9.30% 
> >>>>>>> std::basic_ios<char, std::char_traits<char> 
> >>>>>>> >::init(std::basic_streambuf<char, std::char_traits<char> >*)
> >>>>>>>        |     |             |       |   | | | + 7.00% 
> >>>>>>> std::basic_ios<char, std::char_traits<char> 
> >>>>>>> >::_M_cache_locale(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 1.60% 
> >>>>>>> std::ctype<char> const& std::use_facet<std::ctype<char> >(std::locale 
> >>>>>>> const&)
> >>>>>>>        |     |             |       |   | | | | | + 1.50% 
> >>>>>>> __dynamic_cast
> >>>>>>>        |     |             |       |   | | | | |   + 0.80% 
> >>>>>>> __cxxabiv1::__vmi_class_type_info::__do_dyncast(long, 
> >>>>>>> __cxxabiv1::__class_type_info::__sub_kind, 
> >>>>>>> __cxxabiv1::__class_type_info const*, void const*, 
> >>>>>>> __cxxabiv1::__class_type_info const*, void const*, 
> >>>>>>> __cxxabiv1::__class_type_info::__dyncast_result&) const
> >>>>>>>        |     |             |       |   | | | | + 1.40% bool 
> >>>>>>> std::has_facet<std::ctype<char> >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | | + 1.30% 
> >>>>>>> __dynamic_cast
> >>>>>>>        |     |             |       |   | | | | |   + 0.90% 
> >>>>>>> __cxxabiv1::__vmi_class_type_info::__do_dyncast(long, 
> >>>>>>> __cxxabiv1::__class_type_info::__sub_kind, 
> >>>>>>> __cxxabiv1::__class_type_info const*, void const*, 
> >>>>>>> __cxxabiv1::__class_type_info const*, void const*, 
> >>>>>>> __cxxabiv1::__class_type_info::__dyncast_result&) const
> >>>>>>>        |     |             |       |   | | | | + 1.10% bool 
> >>>>>>> std::has_facet<std::num_put<char, std::ostreambuf_iterator<char, 
> >>>>>>> std::char_traits<char> > > >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | | + 0.90% 
> >>>>>>> __dynamic_cast
> >>>>>>>        |     |             |       |   | | | | + 1.00% bool 
> >>>>>>> std::has_facet<std::num_get<char, std::istreambuf_iterator<char, 
> >>>>>>> std::char_traits<char> > > >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | | + 0.70% 
> >>>>>>> __dynamic_cast
> >>>>>>>        |     |             |       |   | | | | | + 0.10% 
> >>>>>>> std::locale::id::_M_id() const
> >>>>>>>        |     |             |       |   | | | | | + 0.10% 
> >>>>>>> _ZNKSt6locale2id5_M_idEv@plt
> >>>>>>>        |     |             |       |   | | | | + 0.80% 
> >>>>>>> std::num_put<char, std::ostreambuf_iterator<char, 
> >>>>>>> std::char_traits<char> > > const& std::use_facet<std::num_put<char, 
> >>>>>>> std::ostreambuf_iterator<char, std::char_traits<char> > > 
> >>>>>>> >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.70% 
> >>>>>>> std::num_get<char, std::istreambuf_iterator<char, 
> >>>>>>> std::char_traits<char> > > const& std::use_facet<std::num_get<char, 
> >>>>>>> std::istreambuf_iterator<char, std::char_traits<char> > > 
> >>>>>>> >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.10% 
> >>>>>>> _ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale@plt
> >>>>>>>        |     |             |       |   | | | + 2.00% 
> >>>>>>> std::ios_base::_M_init()
> >>>>>>>        |     |             |       |   | | | | + 0.80% 
> >>>>>>> std::locale::operator=(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.80% 
> >>>>>>> std::locale::locale()
> >>>>>>>        |     |             |       |   | | | | + 0.30% 
> >>>>>>> std::locale::~locale()
> >>>>>>>        |     |             |       |   | | | | + 0.10% 
> >>>>>>> _ZNSt6localeC1Ev@plt
> >>>>>>>        |     |             |       |   | | | + 0.20% 
> >>>>>>> _ZNSt8ios_base7_M_initEv@plt
> >>>>>>>        |     |             |       |   | | + 0.90% 
> >>>>>>> std::locale::locale()
> >>>>>>>        |     |             |       |   | | + 0.10% 
> >>>>>>> std::ios_base::ios_base()
> >>>>>>>        |     |             |       |   | | + 0.10% 
> >>>>>>> _ZNSt9basic_iosIcSt11char_traitsIcEE4initEPSt15basic_streambufIcS1_E@plt
> >>>>>>>        |     |             |       |   | + 2.80% std::ostream& 
> >>>>>>> std::ostream::_M_insert<unsigned long>(unsigned long)
> >>>>>>>        |     |             |       |   | | + 2.40% std::num_put<char, 
> >>>>>>> std::ostreambuf_iterator<char, std::char_traits<char> > 
> >>>>>>> >::do_put(std::ostreambuf_iterator<char, std::char_traits<char> >, 
> >>>>>>> std::ios_base&, char, unsigned long) const
> >>>>>>>        |     |             |       |   | | | + 2.10% 
> >>>>>>> std::ostreambuf_iterator<char, std::char_traits<char> > 
> >>>>>>> std::num_put<char, std::ostreambuf_iterator<char, 
> >>>>>>> std::char_traits<char> > >::_M_insert_int<unsigned 
> >>>>>>> long>(std::ostreambuf_iterator<char, std::char_traits<char> >, 
> >>>>>>> std::ios_base&, char, unsigned long) const
> >>>>>>>        |     |             |       |   | | | | + 1.60% 
> >>>>>>> std::basic_streambuf<char, std::char_traits<char> >::xsputn(char 
> >>>>>>> const*, long)
> >>>>>>>        |     |             |       |   | | | | | + 1.40% 
> >>>>>>> std::basic_stringbuf<char, std::char_traits<char>, 
> >>>>>>> std::allocator<char> >::overflow(int)
> >>>>>>>        |     |             |       |   | | | | | | + 0.90% 
> >>>>>>> std::string::reserve(unsigned long)
> >>>>>>>        |     |             |       |   | | | | | | + 0.10% 
> >>>>>>> std::basic_stringbuf<char, std::char_traits<char>, 
> >>>>>>> std::allocator<char> >::_M_sync(char*, unsigned long, unsigned long)
> >>>>>>>        |     |             |       |   | | | | | | + 0.10% 
> >>>>>>> _ZNSt15basic_stringbufIcSt11char_traitsIcESaIcEE7_M_syncEPcmm@plt
> >>>>>>>        |     |             |       |   | | | | | + 0.20% 
> >>>>>>> __memcpy_ssse3_back
> >>>>>>>        |     |             |       |   | | | | + 0.20% ???
> >>>>>>>        |     |             |       |   | | | | + 0.10% 
> >>>>>>> std::num_put<char, std::ostreambuf_iterator<char, 
> >>>>>>> std::char_traits<char> > >::_M_pad(char, long, std::ios_base&, char*, 
> >>>>>>> char const*, int&) const
> >>>>>>>        |     |             |       |   | | | + 0.10% 
> >>>>>>> _ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE13_M_insert_intImEES3_S3_RSt8ios_basecT_@plt
> >>>>>>>        |     |             |       |   | | + 0.10% 
> >>>>>>> std::ostream::sentry::sentry(std::ostream&)
> >>>>>>>        |     |             |       |   | + 2.80% 
> >>>>>>> std::basic_stringbuf<char, std::char_traits<char>, 
> >>>>>>> std::allocator<char> >::str() const
> >>>>>>>        |     |             |       |   | | + 1.00% 
> >>>>>>> std::string::assign(std::string const&)
> >>>>>>>        |     |             |       |   | | + 0.90% char* 
> >>>>>>> std::string::_S_construct<char*>(char*, char*, std::allocator<char> 
> >>>>>>> const&, std::forward_iterator_tag) [clone .part.1796]
> >>>>>>>        |     |             |       |   | + 1.50% 
> >>>>>>> std::string::append(char const*, unsigned long)
> >>>>>>>        |     |             |       |   | | + 1.20% 
> >>>>>>> std::string::reserve(unsigned long)
> >>>>>>>        |     |             |       |   | |   + 0.60% 
> >>>>>>> std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned 
> >>>>>>> long)
> >>>>>>>        |     |             |       |   | |   + 0.10% tc_free
> >>>>>>>        |     |             |       |   | + 1.20% 
> >>>>>>> std::string::_Rep::_M_dispose(std::allocator<char> const&) [clone 
> >>>>>>> .isra.97] [clone .part.98]
> >>>>>>>        |     |             |       |   | + 1.00% 
> >>>>>>> std::string::append(std::string const&)
> >>>>>>>        |     |             |       |   | | + 0.70% 
> >>>>>>> std::string::reserve(unsigned long)
> >>>>>>>        |     |             |       |   | | + 0.10% __memcpy_ssse3_back
> >>>>>>>        |     |             |       |   | + 1.00% 
> >>>>>>> std::basic_string<char, std::char_traits<char>, std::allocator<char> 
> >>>>>>> >::basic_string(std::string const&, unsigned long, unsigned long)
> >>>>>>>        |     |             |       |   | | + 0.80% char* 
> >>>>>>> std::string::_S_construct<char*>(char*, char*, std::allocator<char> 
> >>>>>>> const&, std::forward_iterator_tag) [clone .part.220]
> >>>>>>>        |     |             |       |   | + 0.40% 
> >>>>>>> std::locale::~locale()
> >>>>>>>        |     |             |       |   | + 0.20% tc_free
> >>>>>>>        |     |             |       |   | + 0.20% __strlen_sse2_pminub
> >>>>>>>        |     |             |       |   | + 0.10% 
> >>>>>>> std::ios_base::~ios_base()
> >>>>>>>        |     |             |       |   | + 0.10% 
> >>>>>>> _ZNSt8ios_baseD2Ev@plt
> >>>>>>>        |     |             |       |   | + 0.10% 
> >>>>>>> _ZNKSt15basic_stringbufIcSt11char_traitsIcESaIcEE3strEv@plt
> >>>>>>>        |     |             |       |   + 18.20% 
> >>>>>>> PyFormatter::open_object_section(char const*)
> >>>>>>>        |     |             |       |   | + 17.10% PyDict_New
> >>>>>>>        |     |             |       |   | | + 16.70% _PyObject_GC_New
> >>>>>>>        |     |             |       |   | |   + 16.70% 
> >>>>>>> _PyObject_GC_Malloc
> >>>>>>>        |     |             |       |   | |     + 16.60% collect
> >>>>>>>        |     |             |       |   | |     | + 8.10% dict_traverse
> >>>>>>>        |     |             |       |   | |     | | + 3.20% 
> >>>>>>> visit_reachable
> >>>>>>>        |     |             |       |   | |     | | | + 0.10% 
> >>>>>>> type_is_gc
> >>>>>>>        |     |             |       |   | |     | | + 2.80% 
> >>>>>>> visit_decref
> >>>>>>>        |     |             |       |   | |     | | + 1.60% PyDict_Next
> >>>>>>>        |     |             |       |   | |     | + 1.30% list_traverse
> >>>>>>>        |     |             |       |   | |     | | + 0.40% 
> >>>>>>> visit_decref
> >>>>>>>        |     |             |       |   | |     | | + 0.30% 
> >>>>>>> visit_reachable
> >>>>>>>        |     |             |       |   | |     | + 0.60% func_traverse
> >>>>>>>        |     |             |       |   | |     | + 0.40% 
> >>>>>>> _PyDict_MaybeUntrack
> >>>>>>>        |     |             |       |   | |     | + 0.10% type_traverse
> >>>>>>>        |     |             |       |   | |     | + 0.10% 
> >>>>>>> subtype_traverse
> >>>>>>>        |     |             |       |   | |     | + 0.10% set_traverse
> >>>>>>>        |     |             |       |   | |     | + 0.10% 
> >>>>>>> class_traverse
> >>>>>>>        |     |             |       |   | |     | + 0.10% 
> >>>>>>> _PyDict_MaybeUntrack@plt
> >>>>>>>        |     |             |       |   | |     + 0.10% PyObject_Malloc
> >>>>>>>        |     |             |       |   | + 1.00% 
> >>>>>>> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       |   |   + 0.40% PyString_FromString
> >>>>>>>        |     |             |       |   |   + 0.20% 
> >>>>>>> dict_set_item_by_hash_or_entry
> >>>>>>>        |     |             |       |   |   + 0.20% PyDict_SetItem
> >>>>>>>        |     |             |       |   |   + 0.10% app1
> >>>>>>>        |     |             |       |   + 6.60% 
> >>>>>>> ceph::Formatter::dump_format_unquoted(char const*, char const*, ...)
> >>>>>>>        |     |             |       |   | + 6.60% 
> >>>>>>> PyFormatter::dump_format_va(char const*, char const*, bool, char 
> >>>>>>> const*, __va_list_tag*)
> >>>>>>>        |     |             |       |   |   + 3.90% __vsnprintf_chk
> >>>>>>>        |     |             |       |   |   | + 3.40% vfprintf
> >>>>>>>        |     |             |       |   |   | | + 0.50% strchrnul
> >>>>>>>        |     |             |       |   |   | | + 0.40% 
> >>>>>>> __GI__IO_default_xsputn
> >>>>>>>        |     |             |       |   |   | | + 0.20% tc_free
> >>>>>>>        |     |             |       |   |   | | + 0.10% free@plt
> >>>>>>>        |     |             |       |   |   | | + 0.10% (anonymous 
> >>>>>>> namespace)::free_null_or_invalid(void*, void (*)(void*)) [clone 
> >>>>>>> .constprop.41]
> >>>>>>>        |     |             |       |   |   | + 0.20% _IO_no_init
> >>>>>>>        |     |             |       |   |   | + 0.10% 
> >>>>>>> _IO_str_init_static_internal
> >>>>>>>        |     |             |       |   |   + 1.50% 
> >>>>>>> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       |   |   | + 0.50% 
> >>>>>>> PyString_FromString
> >>>>>>>        |     |             |       |   |   | + 0.40% PyDict_SetItem
> >>>>>>>        |     |             |       |   |   | + 0.10% 
> >>>>>>> dict_set_item_by_hash_or_entry
> >>>>>>>        |     |             |       |   |   | + 0.10% 
> >>>>>>> PyDict_SetItem@plt
> >>>>>>>        |     |             |       |   |   + 1.20% PyString_FromString
> >>>>>>>        |     |             |       |   |     + 0.60% PyObject_Malloc
> >>>>>>>        |     |             |       |   |     + 0.20% 
> >>>>>>> __strlen_sse2_pminub
> >>>>>>>        |     |             |       |   |     + 0.10% 
> >>>>>>> __memcpy_ssse3_back
> >>>>>>>        |     |             |       |   + 0.90% ctime_r
> >>>>>>>        |     |             |       |   + 0.80% 
> >>>>>>> PyFormatter::open_array_section(char const*)
> >>>>>>>        |     |             |       |   + 0.40% 
> >>>>>>> std::string::_Rep::_M_dispose(std::allocator<char> const&) [clone 
> >>>>>>> .isra.846] [clone .part.847]
> >>>>>>>        |     |             |       |   + 0.30% 
> >>>>>>> PyFormatter::dump_int(char const*, long)
> >>>>>>>        |     |             |       |   + 0.20% 
> >>>>>>> PyFormatter::close_section()
> >>>>>>>        |     |             |       |   + 0.10% tc_free
> >>>>>>>        |     |             |       |   + 0.10% 
> >>>>>>> std::basic_string<char, std::char_traits<char>, std::allocator<char> 
> >>>>>>> >::basic_string(char const*, std::allocator<char> const&)
> >>>>>>>        |     |             |       |   + 0.10% 
> >>>>>>> std::_Rb_tree_increment(std::_Rb_tree_node_base const*)
> >>>>>>>        |     |             |       |   + 0.10% 
> >>>>>>> pow2_hist_t::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       |   + 0.10% 
> >>>>>>> objectstore_perf_stat_t::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       |   + 0.10% 
> >>>>>>> PyFormatter::dump_string(char const*, std::basic_string_view<char, 
> >>>>>>> std::char_traits<char> >)
> >>>>>>>        |     |             |       |   + 0.10% 
> >>>>>>> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       + 21.80% Mutex::lock(bool)
> >>>>>>>        |     |             |       | + 21.80% pthread_mutex_lock
> >>>>>>>        |     |             |       |   + 21.80% _L_lock_883
> >>>>>>>        |     |             |       |     + 21.80% __lll_lock_wait
> >>>>>>>        |     |             |       + 11.70% 
> >>>>>>> PGMap::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       | + 11.70% 
> >>>>>>> PGMap::dump_pg_stats(ceph::Formatter*, bool) const
> >>>>>>>        |     |             |       |   + 10.90% 
> >>>>>>> pg_stat_t::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       |   | + 4.20% 
> >>>>>>> PyFormatter::dump_stream(char const*)
> >>>>>>>        |     |             |       |   | | + 2.80% 
> >>>>>>> std::basic_ios<char, std::char_traits<char> 
> >>>>>>> >::init(std::basic_streambuf<char, std::char_traits<char> >*)
> >>>>>>>        |     |             |       |   | | | + 2.10% 
> >>>>>>> std::basic_ios<char, std::char_traits<char> 
> >>>>>>> >::_M_cache_locale(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.50% 
> >>>>>>> std::num_put<char, std::ostreambuf_iterator<char, 
> >>>>>>> std::char_traits<char> > > const& std::use_facet<std::num_put<char, 
> >>>>>>> std::ostreambuf_iterator<char, std::char_traits<char> > > 
> >>>>>>> >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.50% bool 
> >>>>>>> std::has_facet<std::num_put<char, std::ostreambuf_iterator<char, 
> >>>>>>> std::char_traits<char> > > >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.40% bool 
> >>>>>>> std::has_facet<std::num_get<char, std::istreambuf_iterator<char, 
> >>>>>>> std::char_traits<char> > > >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.20% 
> >>>>>>> std::num_get<char, std::istreambuf_iterator<char, 
> >>>>>>> std::char_traits<char> > > const& std::use_facet<std::num_get<char, 
> >>>>>>> std::istreambuf_iterator<char, std::char_traits<char> > > 
> >>>>>>> >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.20% 
> >>>>>>> std::ctype<char> const& std::use_facet<std::ctype<char> >(std::locale 
> >>>>>>> const&)
> >>>>>>>        |     |             |       |   | | | | + 0.20% bool 
> >>>>>>> std::has_facet<std::ctype<char> >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.10% 
> >>>>>>> _ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale@plt
> >>>>>>>        |     |             |       |   | | | + 0.70% 
> >>>>>>> std::ios_base::_M_init()
> >>>>>>>        |     |             |       |   | | + 0.50% 
> >>>>>>> tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, int)
> >>>>>>>        |     |             |       |   | | + 0.40% 
> >>>>>>> std::string::assign(char const*, unsigned long)
> >>>>>>>        |     |             |       |   | | + 0.20% 
> >>>>>>> std::locale::locale()
> >>>>>>>        |     |             |       |   | | + 0.10% 
> >>>>>>> std::ios_base::ios_base()
> >>>>>>>        |     |             |       |   | + 1.80% 
> >>>>>>> object_stat_collection_t::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       |   | | + 1.70% 
> >>>>>>> object_stat_sum_t::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       |   | | | + 1.40% 
> >>>>>>> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       |   | | | | + 0.60% dictresize
> >>>>>>>        |     |             |       |   | | | | + 0.30% 
> >>>>>>> PyString_FromString
> >>>>>>>        |     |             |       |   | | | | + 0.20% PyDict_SetItem
> >>>>>>>        |     |             |       |   | | | | + 0.10% 
> >>>>>>> dict_set_item_by_hash_or_entry
> >>>>>>>        |     |             |       |   | | | + 0.20% 
> >>>>>>> PyFormatter::dump_int(char const*, long)
> >>>>>>>        |     |             |       |   | | + 0.10% 
> >>>>>>> PyFormatter::open_object_section(char const*)
> >>>>>>>        |     |             |       |   | + 1.80% 
> >>>>>>> PyFormatter::open_array_section(char const*)
> >>>>>>>        |     |             |       |   | | + 1.60% PyList_New
> >>>>>>>        |     |             |       |   | | | + 1.60% _PyObject_GC_New
> >>>>>>>        |     |             |       |   | | |   + 1.60% 
> >>>>>>> _PyObject_GC_Malloc
> >>>>>>>        |     |             |       |   | | |     + 1.60% collect
> >>>>>>>        |     |             |       |   | | |       + 0.80% 
> >>>>>>> dict_traverse
> >>>>>>>        |     |             |       |   | | |       + 0.10% 
> >>>>>>> subtype_traverse
> >>>>>>>        |     |             |       |   | | |       + 0.10% 
> >>>>>>> list_traverse
> >>>>>>>        |     |             |       |   | | |       + 0.10% 
> >>>>>>> func_traverse
> >>>>>>>        |     |             |       |   | | |       + 0.10% 
> >>>>>>> _PyDict_MaybeUntrack
> >>>>>>>        |     |             |       |   | | + 0.20% 
> >>>>>>> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       |   | + 1.70% 
> >>>>>>> utime_t::localtime(std::ostream&) const
> >>>>>>>        |     |             |       |   | | + 1.00% std::ostream& 
> >>>>>>> std::ostream::_M_insert<long>(long)
> >>>>>>>        |     |             |       |   | | | + 0.60% 
> >>>>>>> std::num_put<char, std::ostreambuf_iterator<char, 
> >>>>>>> std::char_traits<char> > >::do_put(std::ostreambuf_iterator<char, 
> >>>>>>> std::char_traits<char> >, std::ios_base&, char, long) const
> >>>>>>>        |     |             |       |   | | + 0.30% 
> >>>>>>> std::basic_ostream<char, std::char_traits<char> >& 
> >>>>>>> std::__ostream_insert<char, std::char_traits<char> 
> >>>>>>> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, 
> >>>>>>> long)
> >>>>>>>        |     |             |       |   | | + 0.20% __tz_convert
> >>>>>>>        |     |             |       |   | + 0.90% 
> >>>>>>> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       |   | + 0.20% 
> >>>>>>> pg_state_string(unsigned long)
> >>>>>>>        |     |             |       |   | + 0.20% 
> >>>>>>> operator<<(std::ostream&, eversion_t const&) [clone .isra.103]
> >>>>>>>        |     |             |       |   | + 0.10% std::ostream& 
> >>>>>>> std::ostream::_M_insert<unsigned long>(unsigned long)
> >>>>>>>        |     |             |       |   + 0.40% 
> >>>>>>> PyFormatter::dump_stream(char const*)
> >>>>>>>        |     |             |       |   + 0.30% 
> >>>>>>> operator<<(std::ostream&, pg_t const&)
> >>>>>>>        |     |             |       |   + 0.10% 
> >>>>>>> PyFormatter::open_object_section(char const*)
> >>>>>>>        |     |             |       + 2.70% 
> >>>>>>> PyFormatter::finish_pending_streams()
> >>>>>>>        |     |             |       | + 1.00% 
> >>>>>>> std::_List_base<std::shared_ptr<PyFormatter::PendingStream>, 
> >>>>>>> std::allocator<std::shared_ptr<PyFormatter::PendingStream> > 
> >>>>>>> >::_M_clear()
> >>>>>>>        |     |             |       | | + 0.40% 
> >>>>>>> std::_Sp_counted_ptr_inplace<PyFormatter::PendingStream, 
> >>>>>>> std::allocator<PyFormatter::PendingStream>, 
> >>>>>>> (__gnu_cxx::_Lock_policy)2>::_M_dispose()
> >>>>>>>        |     |             |       | | + 0.20% 
> >>>>>>> tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, 
> >>>>>>> unsigned int)
> >>>>>>>        |     |             |       | + 0.70% 
> >>>>>>> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       | + 0.50% 
> >>>>>>> std::string::replace(unsigned long, unsigned long, char const*, 
> >>>>>>> unsigned long)
> >>>>>>>        |     |             |       | + 0.30% PyString_FromString
> >>>>>>>        |     |             |       + 1.10% PyEval_RestoreThread
> >>>>>>>        |     |             |         + 1.10% PyThread_acquire_lock
> >>>>>>>        |     |             |           + 1.10% sem_wait@@GLIBC_2.2.5
> >>>>>>>        |     |             |             + 1.10% 
> >>>>>>> __new_sem_wait_slow.constprop.0
> >>>>>>>        |     |             |               + 1.10% 
> >>>>>>> do_futex_wait.constprop.1
> >>>>>>>        |     |             + 2.90% frame_dealloc
> >>>>>>>        |     |               + 2.90% dict_dealloc
> >>>>>>>        |     |                 + 2.90% list_dealloc
> >>>>>>>        |     |                   + 2.90% dict_dealloc
> >>>>>>>        |     |                     + 1.90% list_dealloc
> >>>>>>>        |     |                     | + 1.90% dict_dealloc
> >>>>>>>        |     |                     |   + 1.70% list_dealloc
> >>>>>>>        |     |                     |     + 1.50% dict_dealloc
> >>>>>>>        |     |                     |     | + 0.90% dict_dealloc
> >>>>>>>        |     |                     |     | + 0.10% PyObject_Free
> >>>>>>>        |     |                     |     + 0.10% 
> >>>>>>> tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, 
> >>>>>>> unsigned int)
> >>>>>>>        |     |                     + 0.30% PyObject_Free
> >>>>>>>        |     |                     + 0.20% dict_dealloc
> >>>>>>>        |     + 8.10% Gil::Gil(SafeThreadState&, bool)
> >>>>>>>        |       + 8.10% PyEval_RestoreThread
> >>>>>>>        |         + 8.10% PyThread_acquire_lock
> >>>>>>>        |           + 8.10% sem_wait@@GLIBC_2.2.5
> >>>>>>>        |             + 8.10% __new_sem_wait_slow.constprop.0
> >>>>>>>        |               + 8.10% do_futex_wait.constprop.1
> >>>>>>>        + 0.60% 
> >>>>>>> std::condition_variable::wait(std::unique_lock<std::mutex>&)
> >>>>>>>
> >>>>>>> --
> >>>>>>> Paul Mezzanini
> >>>>>>> Sr Systems Administrator / Engineer, Research Computing
> >>>>>>> Information & Technology Services
> >>>>>>> Finance & Administration
> >>>>>>> Rochester Institute of Technology
> >>>>>>> o:(585) 475-3245 | [email protected] <mailto:[email protected]>
> >>>>>>>
> >>>>>>> ________________________________________
> >>>>>>> From: Mark Nelson <[email protected] <mailto:[email protected]>>
> >>>>>>> Sent: Thursday, December 19, 2019 11:47 AM
> >>>>>>> To: [email protected] <mailto:[email protected]>
> >>>>>>> Subject: [ceph-users] Re: High CPU usage by ceph-mgr in 14.2.5
> >>>>>>>
> >>>>>>> If you can get a wallclock profiler on the mgr process, we might be able
> >>>>>>> to figure out specifics of what's taking so much time (i.e. processing
> >>>>>>> pg_summary or something else).  Assuming you have gdb with the python
> >>>>>>> bindings and the ceph debug packages installed, if you (or anyone)
> >>>>>>> could try gdbpmp on the 100% mgr process, that would be fantastic.
> >>>>>>>
> >>>>>>>
> >>>>>>> https://github.com/markhpc/gdbpmp
> >>>>>>>
> >>>>>>>
> >>>>>>> gdbpmp.py -p`pidof ceph-mgr` -n 1000 -o mgr.gdbpmp
> >>>>>>>
> >>>>>>>
> >>>>>>> If you want to view the results:
> >>>>>>>
> >>>>>>>
> >>>>>>> gdbpmp.py -i mgr.gdbpmp -t 1
> >>>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Mark
> >>>>>>>
> >>>>>>>
> >>>>>>> On 12/19/19 6:29 AM, Paul Emmerich wrote:  
> >>>>>>>> We're also seeing unusually high mgr CPU usage on some setups; the
> >>>>>>>> only thing they seem to have in common is > 300 OSDs.
> >>>>>>>>
> >>>>>>>> Threads using the CPU are "mgr-fin" and "ms_dispatch".
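
For anyone who wants to check which mgr threads are busy on their own cluster, here is a minimal sketch, not something posted in this thread. It assumes a Linux host with procps `ps`; the helper names `parse_thread_cpu` and `mgr_thread_cpu` are made up for illustration. Note that `pcpu` from `ps` is averaged over the thread's lifetime, so `top -H -p <pid>` is the better choice for a live view.

```python
import subprocess


def parse_thread_cpu(ps_output):
    """Parse `ps -L -o tid=,pcpu=,comm=` output.

    Returns a list of (thread_name, cpu_percent, tid) tuples,
    sorted with the busiest thread first.
    """
    threads = []
    for line in ps_output.splitlines():
        # Columns are: tid, pcpu, comm. comm is last so that a thread
        # name containing spaces stays in one piece.
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        tid, pcpu, name = parts
        threads.append((name.strip(), float(pcpu), int(tid)))
    return sorted(threads, key=lambda t: t[1], reverse=True)


def mgr_thread_cpu(pid):
    """Run ps for one process (e.g. the ceph-mgr pid) and parse its threads."""
    result = subprocess.run(
        ["ps", "-L", "-o", "tid=,pcpu=,comm=", "-p", str(pid)],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_thread_cpu(result.stdout)
```

Usage would be something like `mgr_thread_cpu(1234)[:5]` with the real ceph-mgr pid in place of the placeholder `1234`; on an affected cluster one would expect names such as "mgr-fin" and "ms_dispatch" near the top, going by the report above.
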
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Paul
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Paul Emmerich
> >>>>>>>>
> >>>>>>>> Looking for help with your Ceph cluster? Contact us at 
> >>>>>>>> https://croit.io
> >>>>>>>>
> >>>>>>>> croit GmbH
> >>>>>>>> Freseniusstr. 31h
> >>>>>>>> 81247 München
> >>>>>>>> www.croit.io
> >>>>>>>> Tel: +49 89 1896585 90
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Dec 19, 2019 at 9:40 AM Serkan Çoban <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>      +1
> >>>>>>>>      1500 OSDs, mgr is at a constant 100% after upgrading from 14.2.2
> >>>>>>>>      to 14.2.5.
> >>>>>>>>
> >>>>>>>>      On Thu, Dec 19, 2019 at 11:06 AM Toby Darling <[email protected]> wrote:
> >>>>>>>>      >
> >>>>>>>>      > On 18/12/2019 22:40, Bryan Stillwell wrote:  
> >>>>>>>>      > > That's how we noticed it too.  Our graphs went silent after
> >>>>>>>>      > > the upgrade completed.  Is your large cluster over 350 OSDs?
> >>>>>>>>      >
> >>>>>>>>      > A 'me too' on this - graphs have gone quiet, and mgr is using
> >>>>>>>>      > 100% CPU.  This happened when we grew our 14.2.5 cluster from
> >>>>>>>>      > 328 to 436 OSDs.
> >>>>>>>>      >
> >>>>>>>>      > Cheers
> >>>>>>>>      > Toby
> >>>>>>>>      > --
> >>>>>>>>      > Toby Darling, Scientific Computing (2N249)
> >>>>>>>>      > MRC Laboratory of Molecular Biology
> >>>>>>>>      > Francis Crick Avenue
> >>>>>>>>      > Cambridge Biomedical Campus
> >>>>>>>>      > Cambridge CB2 0QH
> >>>>>>>>      > Phone 01223 267070
> >>>>>>>>      > _______________________________________________
> >>>>>>>>      > ceph-users mailing list -- [email protected]
> >>>>>>>>      > To unsubscribe send an email to [email protected]


