2013/4/22 Ulrich Windl <[email protected]>:
>>>> Lars Marowsky-Bree <[email protected]> wrote on 19.04.2013 at 18:46 in
>>>> message <[email protected]>:
>> On 2013-04-19T16:27:14, Ulrich Windl <[email protected]>
>> wrote:
>>
>> > Hello,
>> >
>> > Using OCFS2 on top of a cLVM-mirrored LV is an absolute no-go for SLES11
>> > SP2:
>>
>> Note that this is unrelated to OCFS2; cLVM2 mirroring is rather slow,
>> since it communicates over the network to keep the dirty bitmaps and
>> locks in sync.
>
> That is the question: if cLVM mirroring floods the communication channel,
> then OCFS2 (which uses the same communication channel, the DLM) will suffer,
> just as the cluster stack will (we're talking about faulty rings then).
>
>>
>> > First, while mirroring the LV ("only" 300GB), access to any of the
>> > involved devices is very slow, but it's not the I/O that's the limit;
>> > I suspect it's "communication":
>> >   PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ P COMMAND
>> > 14883 root  RT  0  535m  11m 7808 R   42  0.0 34:35.24 0 corosync
>> > 16493 root  20  0 33804 2168 1752 R   14  0.0 10:32.29 0 cmirrord
>> >
>> > Shouldn't a busy mirroring job be in "D" state instead of "R" state,
>> > burning CPU?
>>
>> They're not in "D" because they are not waiting on disk IO, but have
>> a lot of network IO and data structure maintenance to handle.
>
> Interesting: while flooding a Gb network, the achieved mirroring rate is
> only about 60MB/s. But we are not mirroring through the network; we mirror
> through 4Gb/s FC (fully redundant fabrics).
>
>>
>> > So besides the inefficient I/O with cLVM there are more issues:
>> > 1) LVM should load-balance between the mirror legs
>>
>> It doesn't, because this simplifies the dirty logging. It always writes
>> to leg 1 first, hence all read requests can always be satisfied from
>> leg 1 without the need for a cluster-wide sync, if leg 1 and leg 2 are
>> already in sync in the IO paths.
>
> See the performance of MD-RAID for motivation: MD-RAID is much faster.
>
>>
>> > 2) LVM should use a leg-internal bitmap to resynchronize the mirror in
>> > a non-stupid way
>>
>> It does use a bitmap for syncing, if you created the lvmirror with a
>> persistent mirrorlog.
>
> That design is broken: if you have two separate storage systems in two
> locations, where do you put the bitmap? In HP-UX (similar to MD-RAID) each
> PV had its own bitmap; with Linux LVM you need a _third_ device to store
> the bitmap. That's nonsense.
>
>>
>> > 3) LVM should mirror from the more recent mirror leg to the outdated
>> > leg, not use a fixed direction.
>>
>> The only situation where this matters is a split brain combined with
>> split IO. That's a situation that even DRBD doesn't handle well, and
>> the resolution that LVM2 mirroring implements is as valid as any.
>
> Yes, DRBD dual-primary also failed in our scenario: manual repair was
> needed. The primary idea of mirroring is that systems keep running if one
> mirror leg fails. And the necessary condition for practical use in an HA
> environment is that once the failed leg returns (assuming an I/O outage)
> the systems still keep running while the data are being synchronized onto
> the stale leg. cLVM brings the system to a practical standstill in this
> situation.
>
>>
>> > So my advice is: Don't use it (for SLES11 SP2).
>>
>> You should not use it if performance is your primary goal for using it,
>> no.
>
> See above. I can only assume cLVM was tested in a "toy environment" with
> either tiny or extremely slow disks, so that the disks limited the
> mirroring speed.
>
>>
>> > I'm somewhat displeased about the situation, because I had a support
>> > request asking exactly whether this setup is a supported configuration,
>> > and it was confirmed.
>>
>> It *is* supported, but cLVM2 mirroring has constraints, especially with
>> regard to performance and flexibility.
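For reference, the "persistent mirrorlog" mentioned above is selected at LV
creation time. A minimal sketch (VG name `vg0` and the PV paths are made-up
placeholders; these commands require real cluster-attached devices and are
shown only to illustrate the options):

```shell
# Classic cluster mirror: 1 extra copy, dirty-region bitmap kept on a
# separate log device ("disk" log) -- this is the third device complained
# about above.
lvcreate --mirrors 1 --mirrorlog disk -L 300G -n lv_shared vg0 \
    /dev/mapper/leg1 /dev/mapper/leg2 /dev/mapper/logdev

# LVM2 can also mirror the log itself onto the mirror legs, avoiding a
# single point of failure for the bitmap (at the cost of extra log writes):
lvcreate --mirrors 1 --mirrorlog mirrored -L 300G -n lv_shared vg0 \
    /dev/mapper/leg1 /dev/mapper/leg2

# "core" keeps the log only in memory: a full resync is needed after every
# deactivation, which is exactly the "stupid" resynchronization behaviour.
```

Note that `--mirrorlog mirrored` still stores the log as a separate sub-LV
rather than as a per-PV bitmap the way MD-RAID does, so it only partially
addresses the two-site objection raised above.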
>>
>> If you can avoid the need for a concurrent cluster mirror, do so: use
>> SAN-based mirroring, use md raid1 if active/passive access is
>> sufficient, consider building an iSCSI server using raid1 and
>> re-exporting via iSCSI for your concurrent IO needs, consider using
>> DRBD, use cLVM2 mirroring but with local activation, etc. They all,
>> alas, have trade-offs.
>>
>> Cluster concurrency is a hard problem. cLVM2 mirroring performance is
>> certainly pretty close to the top of our priority lists, but the battle
>> is not won in a day.
>
> Yes, I had complained about the massive logging of cLVM (which showed
> that it communicates quite a lot (I'd say: way too much)), and the
> solution being applied seems to be disabling the logging. So the
> extensive communication still happens.
>
>>
>> > Now the first proposal regarding the terrible performance was to _try_
>> > SLES11 SP3 beta...
>>
>> The CPU overhead will have improved some, but the basic design of cLVM2
>> mirroring hasn't changed a lot.
>>
>> This is the same upstream and in all distributions; it is not SLES
>> specific.
>
> There were some rumours that Red Hat's LVM is ahead of SUSE's by at least
> one generation...
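To illustrate the md raid1 alternative suggested above: MD keeps a
write-intent bitmap inside each member device, which is the "leg-internal
bitmap" behaviour asked for earlier in the thread. A minimal sketch
(device paths `/dev/mapper/leg1` and `/dev/mapper/leg2` are placeholder
SAN LUNs; this requires real hardware and active/passive access):

```shell
# RAID1 over two SAN LUNs with an internal write-intent bitmap; after a
# leg fails and returns, only regions marked dirty in the bitmap are
# resynced, and the array stays usable meanwhile.
mdadm --create /dev/md0 --level=1 --raid-devices=2 --bitmap=internal \
    /dev/mapper/leg1 /dev/mapper/leg2

# Re-add a leg that was temporarily lost (e.g. an FC outage); the bitmap
# limits the resync to the stale regions:
mdadm /dev/md0 --re-add /dev/mapper/leg2

# Check the resync progress and bitmap state:
cat /proc/mdstat
mdadm --detail /dev/md0
```

The catch, as the post says, is that md raid1 is only safe with the array
assembled on one node at a time (active/passive), so concurrent cluster
access still needs something like re-export via iSCSI on top.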
Just out of curiosity: sources? Isn't cLVM included upstream?
>
> Regards,
> Ulrich
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems

Regards,
--
Ciro Iriarte
http://cyruspy.wordpress.com
