Hi, thanks all for your input.
I want to clarify my intentions and try to give answers to the
questions and suggestions below.
We want to test ZFS throughput over Fibre Channel/COMSTAR
on two 32-core quad-socket systems, one target and one initiator.
Therefore I created a zpool with two 6-disk RAIDZ2 vdevs (1GB SAS) plus a ZeusRAM ZIL
on the FC target (2x Emulex LPe12002 / 8 Gbit), running OmniOS stable.
The initiator is another system with identical hardware and software;
the two are interconnected over a zoned Cisco fabric switch (9148).
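For reference, the pool on the target was created roughly along these lines (the device names below are placeholders for the actual SAS disks and the ZeusRAM, not the real ones):

zpool create jbod_a1 \
    raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
    raidz2 c1t6d0 c1t7d0 c1t8d0 c1t9d0 c1t10d0 c1t11d0 \
    log c2t0d0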
1) The local dd performance on the target is around 750 MB/s, which is OK
for me.
time dd if=/dev/zero of=/jbod_a1/largefile bs=128k count=64k
65536+0 records in
65536+0 records out
8589934592 bytes (8,6 GB) copied, 11,0914 s, 774 MB/s
2) Exporting the large file / a ZVOL (I tested both; same performance)
as a LUN over 2 x 8 Gbit FC links and creating a zpool on it at the
initiator yields around 189 MB/s in the following dd test,
which is not OK for me.
zpool create fcvol1 c0t600144F00E7FCC0000005151C9A50003d0
time dd if=/dev/zero of=/fcvol1/file1 bs=128k count=64k
65536+0 records in
65536+0 records out
8589934592 bytes (8,6 GB) copied, 45,3851 s, 189 MB/s
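For completeness, the export on the target followed the usual COMSTAR steps, roughly like this (the ZVOL name and size are assumptions; the GUID is the one visible in the initiator device name above, and the view is left open to all hosts since the fabric is zoned):

svcadm enable stmf
zfs create -V 200g jbod_a1/vol1
sbdadm create-lu /dev/zvol/rdsk/jbod_a1/vol1    # prints the LU GUID
stmfadm add-view 600144f00e7fcc0000005151c9a50003

The file-backed variant would just point sbdadm create-lu at /jbod_a1/largefile instead of the ZVOL path.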
3) Eliminating the zpool disks on the target side by replacing them
with a RAM disk or a file in /tmp is not a good idea, as I learned
through the discussion. The observation that a dd test to tmpfs
(no FC involved) is much slower than on all my other systems is still
strange to me.
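For reference, the tmpfs comparison was just the same dd pointed at /tmp (same block size and count assumed as in the tests above):

time dd if=/dev/zero of=/tmp/largefile bs=128k count=64k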
Another observation: disabling hyper-threading gave a performance
increase of about 20% for the dd to tmpfs.
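Hyper-threading was presumably toggled in the BIOS; a quicker way to approximate the same experiment from within illumos would be to offline the sibling vCPUs, which per the lgrpinfo below are IDs 32-63 (assuming GNU seq is on the PATH):

psradm -f $(seq 32 63)    # -f takes the listed CPUs offline
psradm -n $(seq 32 63)    # bring them back online afterwards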
My next steps are firmware and driver updates for the LPe12002
cards, then eliminating the fabric switch by connecting the ports
directly, and then using other tools to investigate the problem.
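Some standard illumos tools that could narrow this down while the dd is running (nothing exotic assumed):

fcinfo hba-port    # confirm both LPe12002 ports negotiated 8 Gb
iostat -xn 1       # per-device throughput and service times
mpstat 1           # look for one saturated CPU or cross-socket imbalance
lockstat sleep 10  # kernel lock contention during the run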
I would be very grateful for any suggestions or help.
Thank you,
Franz Schober
For reference, here is the lgrpinfo output with hyper-threading disabled
and with it enabled.

With hyper-threading disabled (32 CPUs):
lgroup 0 (root):
Children: 3 4 6 8
CPUs: 0-31
Memory: installed 256G, allocated 14G, free 242G
Lgroup resources: 1 2 5 7 (CPU); 1 2 5 7 (memory)
Latency: 30
lgroup 1 (leaf):
Children: none, Parent: 3
CPUs: 0-7
Memory: installed 64G, allocated 1,7G, free 62G
Lgroup resources: 1 (CPU); 1 (memory)
Load: 0,00166
Latency: 10
lgroup 2 (leaf):
Children: none, Parent: 4
CPUs: 8-15
Memory: installed 64G, allocated 2,7G, free 61G
Lgroup resources: 2 (CPU); 2 (memory)
Load: 1,53e-05
Latency: 10
lgroup 3 (intermediate):
Children: 1, Parent: 0
CPUs: 0-15 24-31
Memory: installed 192G, allocated 13G, free 179G
Lgroup resources: 1 2 7 (CPU); 1 2 7 (memory)
Latency: 21
lgroup 4 (intermediate):
Children: 2, Parent: 0
CPUs: 0-23
Memory: installed 192G, allocated 5,1G, free 187G
Lgroup resources: 1 2 5 (CPU); 1 2 5 (memory)
Latency: 21
lgroup 5 (leaf):
Children: none, Parent: 6
CPUs: 16-23
Memory: installed 64G, allocated 658M, free 63G
Lgroup resources: 5 (CPU); 5 (memory)
Load: 0,0194
Latency: 10
lgroup 6 (intermediate):
Children: 5, Parent: 0
CPUs: 8-31
Memory: installed 192G, allocated 12G, free 180G
Lgroup resources: 2 5 7 (CPU); 2 5 7 (memory)
Latency: 21
lgroup 7 (leaf):
Children: none, Parent: 8
CPUs: 24-31
Memory: installed 64G, allocated 8,6G, free 55G
Lgroup resources: 7 (CPU); 7 (memory)
Load: 0,125
Latency: 10
lgroup 8 (intermediate):
Children: 7, Parent: 0
CPUs: 0-7 16-31
Memory: installed 192G, allocated 11G, free 181G
Lgroup resources: 1 5 7 (CPU); 1 5 7 (memory)
Latency: 21
With hyper-threading enabled (64 CPUs):

lgroup 0 (root):
Children: 3 4 6 8
CPUs: 0-63
Memory: installed 256G, allocated 27G, free 229G
Lgroup resources: 1 2 5 7 (CPU); 1 2 5 7 (memory)
Latency: 30
lgroup 1 (leaf):
Children: none, Parent: 3
CPUs: 0-7 32-39
Memory: installed 64G, allocated 9,4G, free 55G
Lgroup resources: 1 (CPU); 1 (memory)
Load: 0,0306
Latency: 10
lgroup 2 (leaf):
Children: none, Parent: 4
CPUs: 8-15 40-47
Memory: installed 64G, allocated 493M, free 64G
Lgroup resources: 2 (CPU); 2 (memory)
Load: 0,0624
Latency: 10
lgroup 3 (intermediate):
Children: 1, Parent: 0
CPUs: 0-15 24-47 56-63
Memory: installed 192G, allocated 26G, free 166G
Lgroup resources: 1 2 7 (CPU); 1 2 7 (memory)
Latency: 21
lgroup 4 (intermediate):
Children: 2, Parent: 0
CPUs: 0-23 32-55
Memory: installed 192G, allocated 11G, free 181G
Lgroup resources: 1 2 5 (CPU); 1 2 5 (memory)
Latency: 21
lgroup 5 (leaf):
Children: none, Parent: 6
CPUs: 16-23 48-55
Memory: installed 64G, allocated 1,0G, free 63G
Lgroup resources: 5 (CPU); 5 (memory)
Load: 0,000946
Latency: 10
lgroup 6 (intermediate):
Children: 5, Parent: 0
CPUs: 8-31 40-63
Memory: installed 192G, allocated 17G, free 175G
Lgroup resources: 2 5 7 (CPU); 2 5 7 (memory)
Latency: 21
lgroup 7 (leaf):
Children: none, Parent: 8
CPUs: 24-31 56-63
Memory: installed 64G, allocated 16G, free 48G
Lgroup resources: 7 (CPU); 7 (memory)
Load: 3,05e-05
Latency: 10
lgroup 8 (intermediate):
Children: 7, Parent: 0
CPUs: 0-7 16-39 48-63
Memory: installed 192G, allocated 26G, free 166G
Lgroup resources: 1 5 7 (CPU); 1 5 7 (memory)
Latency: 21
On 26.03.13 16:11, Garrett D'Amore wrote:
On Mar 26, 2013, at 6:44 AM, Bob Friesenhahn <[email protected]>
wrote:
On Tue, 26 Mar 2013, Sašo Kiselkov wrote:
Once I gave it bit more thought, I realized tmpfs *should* be faster,
since it doesn't traverse the block device/SCSI interface and instead
intercepts calls pretty high up the VFS stack. Nonetheless, I suspect
the tmpfs implementation isn't really designed for multi-GB/s throughput
(it's a filesystem for /tmp FFS, it's supposed to hold a couple of kB of
data anyway).
Sašo, you are continuing to ignore that the simple dd to tmpfs test turned in
abysmal results on this quad Xeon E5 system as compared to the many other
systems tested.
This seems to be a problem with the system, or the way Illumos runs on it.
Not necessarily. Higher lock contention can lead to surprising results in some
configurations. (Such as the speed of certain benchmarks actually *improving*
by either offlining cores or reducing processor speeds.) Whether that's the
case here I don't know. But tmpfs is the *wrong* way to benchmark memory speed.
I do have illumos on a Xeon E5 in my garage. It works pretty well, but I've
not spent a lot of time benchmarking or testing memory bandwidth.
- Garrett
I had not heard of Illumos running on a quad Xeon E5 system before now. It is
doubtful that Illumos has been seriously tweaked/tuned for particular newer
hardware since the days of Sun. Most efforts seem to be at the level of keeping
things running properly, without the considerable resources required for
performance tuning/testing. Maybe there are significant issues to be resolved.
Bob
--
Bob Friesenhahn
[email protected], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
--
---------------------------------------------------------------------
Dipl.-Ing. Franz Schober
[email protected]
FirmOS Business Solutions Gmbh
Obstweg 4
A-8073 Graz-Neupirka
Tel +43 316 242322-10
Fax +43 316 242322-99
http://www.firmos.at