Hi, thank you in advance for your time.
We are tracking down some very slow pwrite64() calls to a Ceph filesystem -
20965 11:04:24.049186 <... pwrite64 resumed>) = 65536 <4.489594>
20966 11:04:24.069765 <... pwrite64 resumed>) = 65536 <4.508859>
20967 11:04:24.090354 <... pwrite64 resumed>) = 65536 <4.510256>
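(These lines are from strace with per-syscall timing; the value in angle brackets is the seconds spent inside each call. For reference, an invocation along these lines, with <pid> as a placeholder for the app's pid, produces this kind of output:

    strace -f -tt -T -e trace=pwrite64 -o pwrite64.trace -p <pid>

where -f follows all threads, -tt adds timestamps, and -T records the time spent in each call.)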
But other pwrite64()s from the same program, in other threads writing to other files on the same Ceph fs, seem fine. We cannot really reproduce this on demand; it just happens occasionally.
It seems we are spending a good deal of time in ceph_aio_write() when this is happening (see the call graph below).
I've noticed that THP (Transparent Huge Pages) is enabled.
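For anyone who wants to check the same thing, these are the usual sysfs knobs (the bracketed value in each file is the active setting):

    cat /sys/kernel/mm/transparent_hugepage/enabled
    cat /sys/kernel/mm/transparent_hugepage/defrag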
We are running Ceph 15.2.17 (Octopus) on CentOS 7.9.
We do not seem to be under any significant memory pressure when this happens; there are just many threads of this app blocked on I/O in pwrite64().
I am suggesting an upgrade, but until then: do you think this situation involves Ceph, and could it be improved if we disable THP?
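If it would help, the idea would be to turn THP off at runtime first - a rough sketch, assuming the standard sysfs interface on this kernel:

    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    echo never > /sys/kernel/mm/transparent_hugepage/defrag

and only make it permanent by adding transparent_hugepage=never to the kernel command line if that actually makes a difference.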
Thanks for any advice or suggestion,
-mark
Call graph of the app while the slow pwrite64()s are happening -
--87.08%--system_call_fastpath
  --58.20%--sys_pwrite64
    --58.03%--vfs_write
      do_sync_write
        ceph_aio_write
          --54.33%--generic_file_buffered_write
            --27.09%--ceph_write_begin
              --17.01%--grab_cache_page_write_begin
                --7.51%--add_to_page_cache_lru
                  --4.28%--__add_to_page_cache_locked
                    --3.95%--mem_cgroup_cache_charge
                      mem_cgroup_charge_common
                        --3.75%--__mem_cgroup_commit_charge
                  --3.23%--lru_cache_add
                    __lru_cache_add
                      --2.94%--pagevec_lru_move_fn
                        --0.70%--mem_cgroup_page_lruvec
                        --0.67%--__pagevec_lru_add_fn
                        --0.57%--release_pages
                --5.31%--__page_cache_alloc
                  --5.00%--alloc_pages_current
                    __alloc_pages_nodemask
                      --4.23%--get_page_from_freelist
                        --1.85%--__rmqueue
                          --1.49%--list_del
                            __list_del_entry
                        --1.74%--list_del
                          __list_del_entry
                --3.92%--__find_lock_page
                  __find_get_page
                    --3.46%--radix_tree_lookup_slot
                      --2.76%--radix_tree_descend
                      --0.70%--__radix_tree_lookup
                        radix_tree_descend
              --9.45%--ceph_update_writeable_page
                --8.94%--readpage_nounlock
                  --8.60%--ceph_osdc_readpages
                    --8.18%--submit_request
                      __submit_request
                        calc_target.isra.50
                          ceph_pg_to_up_acting_osds
                            crush_do_rule
                              crush_choose_firstn
                                --4.89%--crush_choose_firstn
                                  is_out.isra.2.part.3
                                --3.30%--crush_bucket_choose
            --14.50%--copy_user_enhanced_fast_string
            --6.34%--ceph_write_end
              --3.76%--set_page_dirty
                ceph_set_page_dirty
                  --2.86%--__set_page_dirty_nobuffers
                    --0.92%--_raw_spin_unlock_irqrestore
                    --0.59%--radix_tree_tag_set
              --1.88%--unlock_page
                __wake_up_bit
            --3.44%--iov_iter_fault_in_readable
            --2.70%--mark_page_accessed
          --2.54%--mutex_lock
            __mutex_lock_slowpath
              --2.14%--schedule_preempt_disabled
                __schedule
                  --0.82%--finish_task_switch
                    __perf_event_task_sched_in
                      perf_pmu_enable
                        x86_pmu_enable
                          --0.80%--intel_pmu_enable_all
                            --0.74%--__intel_pmu_enable_all.isra.23
                              --0.69%--native_write_msr_safe
                  --0.75%--__perf_event_task_sched_out
          --0.56%--mutex_unlock
            __mutex_unlock_slowpath
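The call graph above is perf output; for reference, an invocation along these lines produces this kind of tree (pid and duration are just placeholders):

    perf record -g -p <pid> -- sleep 30
    perf report --stdio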
