I would think with gigE that you should be able to get in the neighborhood of 80 MB/s (and with multiple processes maybe into the 90s).

That dd test may be too short to show your real bandwidth, though (that example only took 2 seconds). I would suggest increasing the count and/or the block size until you have a test that runs for about a minute to get a more stable number.
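
A rough way to size the longer run, assuming throughput stays near the ~48 MB/s from the quoted test and treating each 1 MiB block as roughly 1 MB. The function name and the 60 second target are only for illustration. It prints the dd command instead of running it, so you can check the path first.

```shell
# Sketch: pick a dd count so the test runs for roughly the target time.
# Assumes throughput stays near the rate seen in the short run.
blocks_for_duration() {
    rate_mb_s=$1   # observed rate in MB/s, rounded down (48 in the quoted run)
    seconds=$2     # how long the test should run
    echo $(( rate_mb_s * seconds ))
}

count=$(blocks_for_duration 48 60)
# Print the command rather than running it; adjust the path to your mount.
echo "dd if=/dev/zero of=/mnt/pvfs2/file.out bs=1048576 count=$count"
```

If the filesystem turns out faster than the short test suggested, the run will simply finish early, so rounding the count up is harmless.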

As a side note, the EAGAINs that you see when stracing pvfs2-client are perfectly normal. It uses nonblocking sockets for all of its communication, in which case that error code is a normal occurrence.
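
If the EAGAIN lines make the trace hard to read, one option is to filter them out of a saved trace. A small sketch, using three read() calls abbreviated from the quoted strace output; the variable names are only for illustration.

```shell
# Three read() calls on fd 5, abbreviated from the quoted strace output.
trace='read(5, 0x7d4450, 8544) = -1 EAGAIN (Resource temporarily unavailable)
read(5, "AQ"..., 8544) = 8544
read(5, 0x7d1020, 8544) = -1 EAGAIN (Resource temporarily unavailable)'

# Count the nonblocking reads that found nothing to read yet.
eagain=$(printf '%s\n' "$trace" | grep -c 'EAGAIN')
# Count the remaining calls by dropping the EAGAIN lines.
useful=$(printf '%s\n' "$trace" | grep -vc 'EAGAIN')
echo "EAGAIN polls: $eagain, other calls: $useful"
```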

If something like "cp" is running poorly, it might be helpful to strace the cp tool itself and see how big its access sizes are. I'm not familiar with SUSE10, but some past versions of coreutils fail to honor the block size reported by PVFS and end up using accesses that are really small by PVFS standards.
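
Something along these lines would show the access sizes. The strace call is left in a comment because the paths are placeholders. The summary step runs here on a few made-up sample lines (the 4096 and 1024 byte writes are hypothetical, not taken from your system), and the function name is only for illustration.

```shell
# To capture the real thing (paths are placeholders):
#   strace -e trace=read,write -o /tmp/cp.trace cp /tmp/bigfile /mnt/pvfs2/

# Summarize write() sizes from a trace: prints "<bytes> <number of calls>".
summarize_writes() {
    awk -F' = ' '/^write\(/ { n[$NF]++ } END { for (s in n) print s, n[s] }' "$1" | sort -n
}

# Hypothetical sample trace, only to show the output format.
sample=$(mktemp)
cat > "$sample" <<'EOF'
read(3, "\0\0\0"..., 4096) = 4096
write(4, "\0\0\0"..., 4096) = 4096
read(3, "\0\0\0"..., 4096) = 4096
write(4, "\0\0\0"..., 4096) = 4096
write(4, "\0\0\0"..., 1024) = 1024
EOF

summary=$(summarize_writes "$sample")
echo "$summary"
rm -f "$sample"
```

If most of the writes from the real trace come out at only a few KB each, that would match the block size problem described above.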

-Phil

Jalal wrote:
Hi kevin,


here is the output of dd:

****PVFS******
# dd if=/dev/zero of=file.out bs=1048576 count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 2.15666 seconds, 48.6 MB/s
*****local disk*****
dd if=/dev/zero of=file.out bs=1048576 count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.266154 seconds, 394 MB/s


Does that look reasonable for my setup, considering that I only have a 1GbE network on all nodes and am using 16 PVFS2 servers?


On Thu, Jul 16, 2009 at 6:53 PM, Kevin Harms wrote:


    have you tried using dd? what about:
    dd if=/dev/zero of=/mnt/pvfs2/file.out bs=1048576 count=100

    kevin


    On Jul 16, 2009, at 7:07 PM, Jalal wrote:

        hello there,

        I have been trying to set up pvfs2 on a small cluster (16 servers and
        16 clients) running SUSE10SP2-64bit, and I am running into some major
        performance problems that are causing me to doubt my install. I am
        hoping to get some help from this great users group.

        The server side of things seems to be working great. I have 14 I/O
        servers, and 2 metaDB servers. I don't see any errors at all. I can
        run the pvfs2 native tools (ex: pvfs2-cp) and I am seeing some
        fantastic results (500+ Mbs). The pvfs2-fs.conf is bone stock and is
        as generated by pvfs2-genconfig.

        When I use the native linux FS commands (ex: cp, rsync...) I am seeing
        some dismal results that are 10-15 times slower than the pvfs2 FS
        tools. The kernel driver build goes very smoothly, and I am not seeing
        any errors. Here are the steps that I am taking:


        cd /tmp
        tar zxvf pvfs-2.8.1.tar.gz
        cd pvfs-2.8.1/
        ./configure --prefix=/opt/pvfs2 --with-kernel=/tmp/linux \
            --disable-server --disable-karma
        make kmod
        make kmod_install
        depmod -a
        modprobe pvfs2
        /opt/pvfs2/sbin/pvfs2-client -p  /opt/pvfs2/sbin/pvfs2-client-core
        mount -t pvfs2 tcp://lab1:3334/pvfs2-fs /mnt/pvfs2

        I did an strace on the pvfs2-client process and I am seeing lots and
        lots of retries:

        readv(26, [{"p\27\0\0\2\0\0\0\4\0\0\0\0\0\0\0C\362\0\0d\0\0\0\244\1"..., 128}], 1) = 128
        read(5, 0x7d4450, 8544)                 = -1 EAGAIN (Resource temporarily unavailable)
        getrusage(RUSAGE_SELF, {ru_utime={2, 960185}, ru_stime={7, 904494}, ...}) = 0
        writev(5, [{"AQ\0\0", 4}, {")\5\3 ", 4}, {"5\316\0\0\0\0\0\0", 8}, {"\4\0\0\377\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0C\362"..., 8224}], 4) = 8240
        poll([{fd=5, events=POLLIN, revents=POLLIN}], 1, 10) = 1
        read(5, "AQ\0\0)\5\3 6\316\0\0\0\0\0\0\5\0\0\377\0\0\0\0C\362\0"..., 8544) = 8544
        read(5, 0x7d1020, 8544)                 = -1 EAGAIN (Resource temporarily unavailable)
        getrusage(RUSAGE_SELF, {ru_utime={2, 960185}, ru_stime={7, 904494}, ...}) = 0
        epoll_ctl(6, EPOLL_CTL_ADD, 26, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=6084112, u64=6084112}}) = -1 EEXIST (File exists)
        epoll_wait(6, {}, 16, 0)                = 0
        read(5, 0x7d2020, 8544)                 = -1 EAGAIN (Resource temporarily unavailable)
        writev(26, [{"\277\312\0\0\2\0\0\0\246\267\0\0\0\0\0\0L\0\0\0\0\0\0\0"..., 24}, {"p\27\0\0\2\0\0\0\10\0\0\0\0\0\0\0C\362\0\0d\0\0\0\1\0\0"..., 76}], 2) = 100
        epoll_wait(6, {{EPOLLIN, {u32=6084112, u64=6084112}}}, 16, 10) = 1
        fcntl(26, F_GETFL)                      = 0x802 (flags O_RDWR|O_NONBLOCK)
        recvfrom(26, "\277\312\0\0\4\0\0\0\246\267\0\0\0\0\0\0\30\0\0\0\0\0\0"..., 24, MSG_PEEK|MSG_NOSIGNAL, NULL, NULL) = 24
        fcntl(26, F_GETFL)                      = 0x802 (flags O_RDWR|O_NONBLOCK)
        recvfrom(26, "\277\312\0\0\4\0\0\0\246\267\0\0\0\0\0\0\30\0\0\0\0\0\0"..., 24, MSG_NOSIGNAL, NULL, NULL) = 24
        readv(26, [{"p\27\0\0\2\0\0\0\10\0\0\0\0\0\0\0\331\323\17\0\0\0\0\0"..., 24}], 1) = 24
        read(5, 0x7d2020, 8544)                 = -1 EAGAIN (Resource temporarily unavailable)
        epoll_ctl(6, EPOLL_CTL_ADD, 26, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=6084112, u64=6084112}}) = -1 EEXIST (File exists)
        epoll_wait(6, {}, 16, 0)                = 0
        read(5,  <unfinished ...>


        I appreciate any and all feedback!
        _______________________________________________
        Pvfs2-users mailing list
        [email protected]
        <mailto:[email protected]>
        http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users



