Hi Octave,
> However, for the best performance, I would recommend using SAN LUNs
> directly into the guests and have ZFS run there. The overhead in this
+1 for that. In clustered environments, we recommend mapping full LUNs
into guests for app data; the OS can be on a ZFS file or whatever. There
are other reasons for this, but performance is one of them.
However, on the point about:
> lose the ability to quickly clone an entire LDom from the control
From what I understand, most customers want the "clone"
operation for their OS/App installation, NOT for the application data.
App data can be sitting in its own Zpool on separate LUN(s), and the
OS can actually be in a ZFS file.
So, this allows you to still clone your guest domain. ZFS already
has the ability to export/import Zpools across different hosts
on the SAN network, so the ability to move around App data is
already there.
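A minimal sketch of the split described above, written as a dry run that only builds command lines and executes nothing. It assumes the guest's OS image lives in its own ZFS dataset in the control domain and the app data sits in a separate pool on SAN LUNs; the dataset, snapshot, and pool names are hypothetical, and the helper functions are illustrative only. The zfs snapshot, zfs clone, zpool export, and zpool import subcommands are the standard ones.

```python
import shlex


def clone_os_image(dataset, snap_name, clone_dataset):
    """Build the commands that clone a guest's OS backing dataset."""
    snapshot = f"{dataset}@{snap_name}"
    return [
        ["zfs", "snapshot", snapshot],              # freeze the installed OS image
        ["zfs", "clone", snapshot, clone_dataset],  # writable copy for the new guest
    ]


def move_app_pool(pool):
    """Build the commands that move an app-data pool between SAN-attached hosts."""
    return {
        "old host": [["zpool", "export", pool]],
        "new host": [["zpool", "import", pool]],
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for argv in clone_os_image("ldompool/guest1", "gold", "ldompool/guest2"):
        print(shlex.join(argv))
    for host, commands in move_app_pool("appdata").items():
        for argv in commands:
            print(f"[{host}] {shlex.join(argv)}")
```

The point of the sketch is that the two operations touch different pools: cloning only involves the OS dataset, while the app-data pool is moved as a whole and never needs to be cloned.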
Performance questions, for the most part, concern reads and writes to
the app data.
Perhaps you are talking more about "stand-alone" systems where
everything (OS/Apps/Config + App Data) is sitting on the
same system. Even there, I think that the idea of separating the
application data from your OS/App binaries/config is very useful.
-ashu
On 06/15/2010 08:30 AM, Octave Orgeron wrote:
Hi,
From my own experience, I would say that both of you are correct. If
you're comparing internal storage with ZFS, especially with the old
T1000/T2000 servers, the performance is pretty sad due to the poor SAS
controller on those systems. Now if you do the same on newer T-Series
servers and are using the latest S10 or, even better, OpenSolaris, the ZFS
performance is pretty good. If you use SAN LUNs instead of the internal
disks for your ZFS pool, the performance is significantly better. It's
really important to have at least 4 GB of RAM in the control or I/O
domain serving your ZFS pools for VDS back-ends. It's also a good idea
to cap the ARC at 1 GB of RAM and leave the rest to the kernel, networking,
and OS functions. In this configuration the overhead should
be under 10-12%, depending on your load.
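As a rough illustration of the ARC cap mentioned above: on Solaris 10 the usual way to limit the ARC is the zfs_arc_max tunable in /etc/system, which takes a byte count and applies after a reboot. The tunable name and file come from general Solaris practice, not from this thread, and the helper below is a hypothetical convenience that only formats the line for a given limit.

```python
GIB = 1024 ** 3  # bytes in one GiB


def arc_max_setting(limit_gib):
    """Format an /etc/system line that caps the ZFS ARC at limit_gib GiB."""
    limit_bytes = int(limit_gib * GIB)
    return f"set zfs:zfs_arc_max = {limit_bytes:#x}"


if __name__ == "__main__":
    # 1 GiB cap, as suggested for a 4 GB control or I/O domain.
    print(arc_max_setting(1))
```

Check the resulting line against the documentation for your Solaris release before adding it to /etc/system.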
However, for the best performance, I would recommend using SAN LUNs
directly into the guests and have ZFS run there. The overhead in this
configuration is as little as 5%, which is pretty good. However, you'll
lose the ability to quickly clone an entire LDom from the control
domain if you move ZFS further up the stack. So you really have to weigh
your requirements around performance and flexibility against these options.
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
Octave J. Orgeron
Solaris Virtualization Architect and Consultant
Web: http://unixconsole.blogspot.com
E-Mail: [email protected]
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
------------------------------------------------------------------------
*From:* Tony MacDoodle <[email protected]>
*To:* [email protected]
*Sent:* Tue, June 15, 2010 10:11:20 AM
*Subject:* Re: [ldoms-discuss] Performance Bad on Virtual Disks
Regardless, I think testing with iozone, bonnie, dd, whatever, will show
that using ZFS as the backend is unfortunately a
bad idea. ZFS itself is a resource "pig" but can be tuned. Personally,
for back-end mission-critical apps where performance is critical, I would
still want to use a dedicated server with no virtualization (hardware
domains excluded, i.e. 25K, Mx000's).
my 3 cents......
On Tue, Jun 15, 2010 at 2:15 AM, Stefan Hinker - Systems Practice - Sun
Microsystems Germany <[email protected]> wrote:
Nathan,
There are other factors to consider, blocksize and storage backend being
the two most important ones.
I just fired off a quick test with 1 instance of dd on two different LUNs.
One gave me 55MB/sec read and 110MB/sec write at 1MB blocksize, the other
gave me 250MB/sec read, 135MB/sec write, again at 1MB blocksize. Since we
are discussing the capabilities of dd and LDom virtual disk IO, I'll not
elaborate on the storage backends used... But please note the very
different behaviour of both.
However, this clearly shows that LDom virtual disks, when tested with dd,
are capable of transferring at least 250MB/sec read, 135MB/sec write at 1MB
blocksize. Note that the faster of the two devices delivered less than
40MB/sec at 8k blocksize and only 2.5MB/sec if dd is started without any
blocksize parameter (which means it'll use 512 byte blocks).
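The block-size figures for the faster LUN can be turned into request rates with a little arithmetic. The sketch below assumes dd issues one request at a time and reads "MB" as MiB; both are assumptions, and the 40MB/sec value is an upper bound since the measurement was "less than" that. The function name is hypothetical.

```python
MIB = 1024 * 1024  # bytes in one MiB


def serialized_io_stats(throughput_mib_s, block_bytes):
    """Return (requests per second, microseconds per request) for one-at-a-time I/O."""
    ops_per_s = throughput_mib_s * MIB / block_bytes
    latency_us = 1_000_000 / ops_per_s
    return ops_per_s, latency_us


if __name__ == "__main__":
    # (label, throughput in MiB/s, block size in bytes) from the figures above
    measurements = [
        ("512 B", 2.5, 512),
        ("8 KiB", 40, 8 * 1024),
        ("1 MiB", 250, MIB),
    ]
    for label, mib_s, size in measurements:
        ops, latency = serialized_io_stats(mib_s, size)
        print(f"{label:>6}: {ops:7.0f} I/Os per sec, about {latency:6.0f} us each")
```

Under those assumptions both small-block runs work out to roughly 5,000 requests per second, or about 0.2 ms per request, while the 1MB run needs only about 250 requests per second. That pattern suggests per-request round-trip time, not raw bandwidth, is what limits dd at small block sizes.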
my 2 cents
stefan
On 15.06.10 07:49, Nathan Kroenert wrote:
> Hi Alex -
>
> It's all well and good to say that it's not representative of real
> workloads, but that's sidestepping the issue completely.
>
> Doing a DB import or export, for example, is certainly something that
> likes fast sequential access.
>
> Batch workloads that do lots of table scanning like lots of fast
> sequential access.
>
> Lots of workloads require fast sequential access, and as such, dd is a
> great tool for such testing.
>
> Yes - it's not a 100% ACCURATE test case for all workloads, but it's a
> perfect example of what needs to happen when you need a single thread to
> spew out a lot of data onto disk or ingest it quickly, and I can
> certainly think of a few.
>
> 100MB/s is not fabulous. 40-50 is even worse. I can get 100MB/s from a
> single commodity 3.5" 7200rpm SATA spindle - hardly impressive.
>
> Note that I'm not pooh-poohing the LDOM capability, or saying that 100MB/s
> is all you can get from it. I have certainly had good experiences with
> LDOMs etc - I'm saying that just writing dd off as a bad test because
> you don't necessarily have a good answer for how to get it faster is a
> little weak.
>
> How about we consider what it would take to get dd running much faster
> so that it becomes a non-issue?
>
> Or - at least discuss why dd is a worst case, and what folks could do
> to help avoid such worst cases.
>
> Cheers!
>
> Nathan.
>
>
>
>
> Alexandre Chartre wrote:
>>
>> Using dd is not a good way to evaluate performance. dd does sequential
>> serialized I/Os, and this is the worst case scenario for vdisk. With real
>> applications, most of your disk I/Os will be multiple parallel random I/Os
>> (done by the filesystem), so this is definitely not what dd is testing.
>>
>> A better way to test I/Os is to use vdbench, with which you can simulate
>> different types of workload. But the best test is to run your
>> applications and see how they behave and whether you get unexpected
>> response times.
>>
>> alex.
>>
>> On 06/10/10 03:01, Tom Kuther wrote:
>>> We have seen the same problem on a T5220, LDOMs 1.3, 8 internal 10k
>>> SAS drives and ZFS all around.
>>>
>>> Even on the ZFS RAID1 ldompool inside the control domain, I couldn't
>>> get over 46MB/s with dd testing. Same for the 6-disk RAIDZ inside the
>>> guest LDOM (exported slice 2 of EFI labeled disk, formed into RAIDZ).
>>>
>>> I deleted all LDOM configs and rebooted factory-default; the same tests
>>> reveal around 100MB/s on the same ldompool now, and about 90MB/s for
>>> the RAIDZ.
>>>
>>> So now we end up using zones instead.
>>>
>>> (Originally posted on sun forums:
>>> http://forums.sun.com/thread.jspa?threadID=5441479)
>> _______________________________________________
>> ldoms-discuss mailing list
>> [email protected]
>> http://mail.opensolaris.org/mailman/listinfo/ldoms-discuss
--
Stefan Hinker
Systems LOB - Systems & Performance
Sun Microsystems GmbH Tel: +49 6103 752-300
Brandenburger Str. 2-6 [email protected]
D-40880 Ratingen http://www.sun.de/
http://blogs.sun.com/cmt
http://blogs.sun.com/cmt/en
---------------------------------------------------------------------------
Sitz der Gesellschaft:
Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Amtsgericht München: HRB 161028
Geschäftsführer: Jürgen Kunz