I have an interesting problem. I've made no changes to the IB DDN
storage, yet I'm finding OSTs crashing left and right. The thread
watchdog gets triggered, and the most relevant part of the dump is
below. It appears that it took more than 100s to find a free extent.
On the OSS, iostat shows the LUN saturated with small read requests.
We've just hit 80% full (we planned on going to 90% full) and we do have
a lot of small files (~75 million).

Is there any way to tune the extent searching code? Does my analysis seem
likely? Is this fixed in 1.6.1, such that I should upgrade immediately?
Thanks,
Daniel
Call Trace:<ffffffffa0024125>{:sd_mod:sd_iostats_bump+147}
<ffffffffa031429a>{:ib_srp:srp_host_qcommand+399}
<ffffffff80253ebf>{deadline_next_request+34}
<ffffffff8024b329>{elv_next_request+238}
<ffffffff80309843>{io_schedule+38}
<ffffffff8017843c>{__wait_on_buffer+125}
<ffffffff801782c2>{bh_wake_function+0}
<ffffffff801782c2>{bh_wake_function+0}
<ffffffffa05771d9>{:ldiskfs:ldiskfs_mb_init_cache+469}
<ffffffff80157ba2>{add_to_page_cache+167}
<ffffffffa0577792>{:ldiskfs:ldiskfs_mb_load_buddy+257}
<ffffffffa057a89f>{:ldiskfs:ldiskfs_mb_new_blocks+1946}
<ffffffffa05b480e>{:fsfilt_ldiskfs:ldiskfs_ext_new_extent_cb+729}
<ffffffffa0574362>{:ldiskfs:ldiskfs_ext_find_extent+205}
<ffffffffa0575a69>{:ldiskfs:ldiskfs_ext_walk_space+535}
<ffffffffa05b4535>{:fsfilt_ldiskfs:ldiskfs_ext_new_extent_cb+0}
<ffffffffa05b4b56>{:fsfilt_ldiskfs:fsfilt_map_nblocks+236}
<ffffffffa05b4d50>{:fsfilt_ldiskfs:fsfilt_ldiskfs_map_ext_inode_pages+457}
<ffffffffa05d59dc>{:obdfilter:filter_direct_io+892}
<ffffffffa05b36f2>{:fsfilt_ldiskfs:fsfilt_ldiskfs_brw_start+649}
<ffffffffa05d6fb9>{:obdfilter:filter_commitrw_write+3494}
<ffffffff80308ecd>{thread_return+0}
<ffffffff80308f25>{thread_return+88}
<ffffffffa0378bf2>{:lnet:lnet_send+2251}
<ffffffffa05d100e>{:obdfilter:filter_commitrw+84}
<ffffffff8013f23f>{del_timer+107}
<ffffffff8013f2fc>{del_singleshot_timer_sync+9}
<ffffffff803099f7>{schedule_timeout+375}
<ffffffffa05a2c47>{:ost:ost_brw_write+5119}
<ffffffff801331a5>{default_wake_function+0}
<ffffffffa059f513>{:ost:ost_bulk_timeout+0}
<ffffffffa043269f>{:ptlrpc:lustre_msg_get_version+64}
<ffffffffa05a637e>{:ost:ost_handle+6987}
<ffffffffa034cf41>{:libcfs:libcfs_debug_vmsg2+1713}
<ffffffff801e9c83>{vsnprintf+1406} <ffffffff801e9d66>{snprintf+131}
<ffffffffa043906b>{:ptlrpc:ptlrpc_server_handle_request+2336}
<ffffffff8013f100>{__mod_timer+293}
<ffffffffa043ad29>{:ptlrpc:ptlrpc_main+2018}
<ffffffff801331a5>{default_wake_function+0}
<ffffffffa0439a47>{:ptlrpc:ptlrpc_retry_rqbds+0}
<ffffffffa0439a47>{:ptlrpc:ptlrpc_retry_rqbds+0}
<ffffffff80110e23>{child_rip+8}
<ffffffffa043a547>{:ptlrpc:ptlrpc_main+0}
<ffffffff80110e1b>{child_rip+0}
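
As a rough aid for reading a trace like the one above, here is a minimal
Python sketch that splits each frame into module, function and offset and
counts frames per module. It is not part of Lustre or any kernel tooling;
the names parse_frames, frames_per_module and SAMPLE are made up for
illustration, and the frame format is assumed to be exactly the
<address>{:module:function+offset} layout shown in this dump.

```python
import re
from collections import Counter

# Matches frames such as "<ffffffffa05771d9>{:ldiskfs:ldiskfs_mb_init_cache+469}"
# and "<ffffffff80309843>{io_schedule+38}". A frame without a ":module:" prefix
# is assumed to belong to the core kernel.
FRAME_RE = re.compile(r"<([0-9a-f]{16})>\{(?::(\w+):)?(\w+)\+(\d+)\}")

# A few frames copied from the trace above (hypothetical helper data).
SAMPLE = """\
<ffffffff80309843>{io_schedule+38}
<ffffffff8017843c>{__wait_on_buffer+125}
<ffffffff801782c2>{bh_wake_function+0}
<ffffffff801782c2>{bh_wake_function+0}
<ffffffffa05771d9>{:ldiskfs:ldiskfs_mb_init_cache+469}
<ffffffff80157ba2>{add_to_page_cache+167}
<ffffffffa0577792>{:ldiskfs:ldiskfs_mb_load_buddy+257}
<ffffffffa057a89f>{:ldiskfs:ldiskfs_mb_new_blocks+1946}
"""


def parse_frames(text):
    """Return (module, function, offset) tuples in the order they appear."""
    frames = []
    for _addr, module, func, offset in FRAME_RE.findall(text):
        # findall yields '' for the optional module group when it is absent.
        frames.append((module or "kernel", func, int(offset)))
    return frames


def frames_per_module(frames):
    """Count how many frames each module contributes to the trace."""
    return Counter(module for module, _func, _offset in frames)


if __name__ == "__main__":
    parsed = parse_frames(SAMPLE)
    for module, func, offset in parsed:
        print(f"{module:10s} {func}+{offset}")
    print(dict(frames_per_module(parsed)))
```

Run against the full dump, this makes the path easier to follow: the
thread is in ldiskfs_mb_new_blocks, loading buddy data through
ldiskfs_mb_init_cache, and waiting in __wait_on_buffer for reads to finish.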
--
Daniel Leaberry
Systems Administrator
iArchives
Tel: 801-494-6528
Cell: 801-376-6411