On 02/06/2012 03:50 AM, Jack David wrote:
> On Mon, Feb 6, 2012 at 1:20 PM, wangdi<[email protected]> wrote:
>> On 02/05/2012 11:26 PM, Jack David wrote:
>>> On Mon, Feb 6, 2012 at 12:42 PM, wangdi<[email protected]> wrote:
>>>> On 02/05/2012 10:37 PM, Jack David wrote:
>>>>> Hi All,
>>>>>
>>>>> I am using the following guide to understand how liblustre works.
>>>>>
>>>>> http://wiki.lustre.org/index.php/LibLustre_How-To_Guide
>>>>>
>>>>> But, I am not able to run the "sanity" test. The reason may be that I
>>>>> am not passing the correct "profile_name" file while running the test.
>>>>>
>>>>> I have created two directories under my client
>>>>> /mnt/lustre
>>>>> /mnt/liblustre_client
>>>>>
>>>>> My MDS ip address is 10.193.123.1, and I am using following command
>>>>> from my client:
>>>>>
>>>>> sanity --target 10.193.123.1:/mnt/liblustre_client
>>>> sanity --target mgsnid:/your_fsname
>>>>
>>> Thanks, I used the "fsname" in the sanity command, but I am now
>>> getting following error. (which says The mds_connect operation failed
>>> with -16)
>>> =====================================
>>> <root@niteshs /usr/src/lustre-release>$ lustre/liblustre/tests/sanity
>>> --target nanogon:/temp | head
>>>
>>> 1328512853.118823:23449:niteshs:(class_obd.c:492:init_obdclass()):
>>> Lustre: 23449-niteshs:(class_obd.c:492:init_obdclass()): Lustre: Build
>>> Version: 2.1.52-g48452fb-CHANGED-2.6.32-lustre-patched
>>> 1328512853.151641:23449:niteshs:(lov_obd.c:2892:lov_init()): Lustre:
>>> 23449-niteshs:(lov_obd.c:2892:lov_init()): Lustre LOV module
>>> (0x85d180).
>>> 1328512853.151687:23449:niteshs:(osc_request.c:4636:osc_init()):
>>> Lustre: 23449-niteshs:(osc_request.c:4636:osc_init()): Lustre OSC
>>> module (0x85da40).
>>> 1328512853.158760:23449:niteshs:(sec.c:1475:sptlrpc_import_sec_adapt()):
>>> Lustre: 23449-niteshs:(sec.c:1475:sptlrpc_import_sec_adapt()): import
>>> mgc_dev->10.193.186.112@tcp netid 20000: select flavor null
>>> 1328512853.175099:23449:niteshs:(sec.c:1475:sptlrpc_import_sec_adapt()):
>>> Lustre: 23449-niteshs:(sec.c:1475:sptlrpc_import_sec_adapt()): import
>>> temp-OST0000-osc-0x22c0670->10.193.184.135@tcp netid 20000: select
>>> flavor null
>>> 1328512853.175159:23449:niteshs:(sec.c:1475:sptlrpc_import_sec_adapt()):
>>> Lustre: 23449-niteshs:(sec.c:1475:sptlrpc_import_sec_adapt()): import
>>> temp-MDT0000-mdc-0x22c0670->10.193.186.112@tcp netid 20000: select
>>> flavor null
>>> 1328512853.179231:23449:niteshs:(client.c:1141:ptlrpc_check_status()):
>>> LustreError: 23449-niteshs:(client.c:1141:ptlrpc_check_status()):
>>> 11-0: an error occurred while communicating with 10.193.186.112@tcp.
>>> The mds_connect operation failed with -16
>>> 1328512853.179763:23449:niteshs:(client.c:1141:ptlrpc_check_status()):
>>> LustreError: 23449-niteshs:(client.c:1141:ptlrpc_check_status()):
>>> 11-0: an error occurred while communicating with 10.193.186.112@tcp.
>>> The mds_connect operation failed with -16
>>> 1328512853.180146:23449:niteshs:(client.c:1141:ptlrpc_check_status()):
>>> LustreError: 23449-niteshs:(client.c:1141:ptlrpc_check_status()):
>>> 11-0: an error occurred while communicating with 10.193.186.112@tcp.
>>> The mds_connect operation failed with -16
>>> =====================================
>> It seems mds is somehow stuck in a long recovery, so it can not access the
>> new connection. You might wait a bit. or just umount mds and remount it with
>> -o abort_recov.
>>
> I tried after some time, and the mds_connect failure error disappeared
> (as mentioned in my earlier email). But now I am stuck with the new
> problem, i.e. sanity test does not work.
> I ran it with "gdb" and found
> that it failed in the first test itself (test t1, which does
> touch+unlink). The failure is in "open" call. I am not sure why it
> fails. Does the "sanity" test have any prerequisite, like lustre
> should be mounted on a specific path? Because I could see that test_t1
> file was created when I mounted the filesystem using "mount" command.
>
> Following screen-log says that now it failed in "unlink" command.

Hmm, you can set the environment variable LIBLUSTRE_MOUNT_POINT to
indicate where Lustre is mounted.
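
For example, a minimal sketch, assuming the file system is mounted at
/mnt/lustre (the path that appears in the unlink error) and reusing the
binary path and target from the earlier command in this thread. The
mount point is an assumption, so adjust it to the real one; the script
only prints the command if the sanity binary is not found:

```shell
#!/bin/sh
# Assumption: Lustre is mounted at /mnt/lustre (path taken from the
# unlink error in the quoted log); change this to the real mount point.
LIBLUSTRE_MOUNT_POINT=/mnt/lustre
export LIBLUSTRE_MOUNT_POINT

# Binary path and target are copied from the earlier command in this thread.
SANITY=lustre/liblustre/tests/sanity
TARGET=nanogon:/temp

# Run the test only if the binary is present (i.e. we are inside the
# lustre-release tree); otherwise print the command that would be run.
if [ -x "$SANITY" ]; then
    "$SANITY" --target "$TARGET"
else
    echo "would run: LIBLUSTRE_MOUNT_POINT=$LIBLUSTRE_MOUNT_POINT $SANITY --target $TARGET"
fi
```

Exporting the variable rather than setting it inline keeps it in place
for any later test binaries run from the same shell.
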
What is your Lustre version? By the way, why not try
lustre/tests/liblustre.sh? It might make things easier for you.

Thanks
WangDi

> ===== START t1: touch+unlink
> 1328528905========================================
> 1328528905.411406:28664:niteshs:(/usr/src/lustre-release/lustre/include/obd_class.h:1980:md_intent_lock()):
> LustreError:
> 28664-niteshs:(/usr/src/lustre-release/lustre/include/obd_class.h:1980:md_intent_lock()):
> obd_intent_lock: NULL export
> 1328528905.411425:28664:niteshs:(/usr/src/lustre-release/lustre/include/obd_class.h:1980:md_intent_lock()):
> LustreError:
> 28664-niteshs:(/usr/src/lustre-release/lustre/include/obd_class.h:1980:md_intent_lock()):
> obd_intent_lock: NULL export
> unlink(/mnt/lustre/test_t1) error: No such device
>
> What am I missing?
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
