I use iperf for network testing.
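A minimal sketch of what I mean, run while DRBD is idle or disconnected
('node2-repl' is just a placeholder for whatever address your replication
link uses):

node2# iperf -s
node1# iperf -c node2-repl -t 30

On a healthy gigabit link that should report somewhere around 940 Mbits/sec.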

Those dd's are run on the machine directly with the HDDs attached, not over the network connection? It's also direct to the backing device, not through /dev/drbdX? If so, your storage is the problem.

On 05/06/14 01:51 AM, Bret Mette wrote:
Do you have any suggestions on how I can test the network in isolation
that would yield results helpful in this scenario?

DRBD was not syncing, as I got those results even with the secondary in
disconnect. Storage directly yields the following results:

node1
dd if=/dev/zero of=./testbin  bs=512 count=1000 oflag=direct
512000 bytes (512 kB) copied, 0.153541 s, 3.3 MB/s

node2
dd if=/dev/zero of=~/testbin  bs=512 count=1000 oflag=direct
512000 bytes (512 kB) copied, 0.864994 s, 592 kB/s
512000 bytes (512 kB) copied, 0.328994 s, 1.6 MB/s


On Wed, Jun 4, 2014 at 10:35 PM, Digimer <[email protected]> wrote:

    On 04/06/14 11:31 AM, Bret Mette wrote:

        Hello,

        I started looking at DRBD as an HA ISCSI target. I am
        experiencing very
        poor performance and decided to run some tests. My current setup
        is as
        follows:

        Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz
        CentOS 6.5 - 2.6.32-431.17.1.el6.x86_64
        drbd version: 8.3.16 (api:88/proto:86-97)
        md RAID10 using 7200rpm drives

        The 2 drbd nodes are synced using an Intel 82579LM Gigabit card

        I have created a logical volume using LVM and configured a
        couple of drbd resources on top of that. drbd0 holds my ISCSI
        configuration file, which is shared between the 2 nodes, and
        drbd1 is a 1.75TB ISCSI target.
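
        To make the layout concrete, the stack for the big target looks
        roughly like the sketch below (volume group, LV and IP names are
        just illustrative, not my real ones):

        resource r1 {
          on node1 {
            device    /dev/drbd1;
            disk      /dev/vg0/iscsi_lv;   # LVM logical volume backing drbd1
            address   10.0.0.1:7789;       # replication link
            meta-disk internal;
          }
          on node2 {
            device    /dev/drbd1;
            disk      /dev/vg0/iscsi_lv;
            address   10.0.0.2:7789;
            meta-disk internal;
          }
        }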

        I run heartbeat on the two nodes and expose a virtual IP to the
        ISCSI
        initiators.

        Originally I was running ISCSI with write-cache off (for data
        integrity
        reasons) but have recently switched to write-cache on during testing
        (with little to no gain).

        My major concern is the extremely high latency test result I
        got when running dd against drbd0 mounted on the primary node.

        dd if=/dev/zero of=./testbin  bs=512 count=1000 oflag=direct
        512000 bytes (512 kB) copied, 32.3254 s, 15.8 kB/s
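
        If my arithmetic is right, that works out to about 32 ms per
        512-byte synchronous write (32.3254 s / 1000 writes), versus
        roughly 0.15-0.9 ms per write for the same test without drbd in
        the path (results further down).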

        I have pinged the second node as a very basic network latency
        test and
        get 0.209ms response time. I have also run the same test on both
        nodes
        with drbd disconnected (or on partitions not associated with
        drbd) and
        get typical results:

        node1
        dd if=/dev/zero of=./testbin  bs=512 count=1000 oflag=direct
        512000 bytes (512 kB) copied, 0.153541 s, 3.3 MB/s

        node2
        dd if=/dev/zero of=~/testbin  bs=512 count=1000 oflag=direct
        512000 bytes (512 kB) copied, 0.864994 s, 592 kB/s
        512000 bytes (512 kB) copied, 0.328994 s, 1.6 MB/s

        node2's latency (without drbd connected) is inconsistent but always
        falls between those two values.

        These tests were run with no ISCSI targets exposed, no initiators
        connected, essentially on an idle system.

        My question is: why are my drbd-connected latency tests showing
        results 35 to 100 times slower than my results when drbd is not
        connected (or against partitions not backed by drbd)?

        This seems to be the source of my horrible performance on the ISCSI
        targets (300~900 K/sec dd writes on the initiators) and very high
        iowait (35-75%) on mildly busy initiators.


        Any advice, pointers, etc. would be highly appreciated. I have
        already tried numerous performance tuning settings (suggested by
        the drbd manual), but I am open to any suggestion and will try
        anything again if it might solve my problem.

        Here are the important bits of my current drbd.conf

                  net {
                  cram-hmac-alg sha1;
                  shared-secret "password";
                  after-sb-0pri disconnect;
                  after-sb-1pri disconnect;
                  after-sb-2pri disconnect;
                  rr-conflict disconnect;
                  max-buffers 8000;
                  max-epoch-size 8000;
                  sndbuf-size 0;
                  }

                  syncer {
                  rate 100M;
                  verify-alg sha1;
                  al-extents 3389;
                  }
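
        For reference, the disk-section flush/barrier options from the
        manual's latency tuning chapter would look something like the
        sketch below; I have not confirmed whether they are safe here,
        since that depends on having a battery-backed write cache, which
        I do not believe these nodes have (plain md RAID):

                  disk {
                  # sketch only: skips flushes/barriers on the backing
                  # device and metadata; only safe with battery-backed cache
                  no-disk-barrier;
                  no-disk-flushes;
                  no-md-flushes;
                  }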

        I've played with the watermark setting and a few others, and
        latency only seems to get worse or stay where it is.


        Thank you,
        Bret


    Have you tried testing the network in isolation? Is the DRBD
    resource syncing? With a syncer rate of 100M on a 1 Gbps NIC, that's
    just about all your bandwidth consumed by background sync. Can you
    test the speed of the storage directly, not over iSCSI/network?
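
    (Rough numbers: 'rate 100M;' lets a resync push up to 100 MByte/sec,
    while a gigabit link tops out around 110-117 MByte/sec, so an active
    resync leaves almost nothing for application writes.)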

    --
    Digimer
    Papers and Projects: https://alteeve.ca/w/
    What if the cure for cancer is trapped in the mind of a person
    without access to education?




--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?