Here's a long overdue update for you all...
After updating countless drivers, BIOSes and Nexenta, it seems that our issue
has disappeared. We're slowly moving our production to our three appliances
and things are going well so far. Sadly we don't know exactly what update
fixed our issue. I wish
Then set the zfs_write_limit_override to a reasonable
value.
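For reference, here's roughly how that tunable gets set (a sketch; the 384MB value is only illustrative, not a recommendation):

  # live, via mdb (takes effect immediately; 0t means decimal)
  echo zfs_write_limit_override/W0t402653184 | mdb -kw
  # persistent, via /etc/system (takes effect at next boot)
  set zfs:zfs_write_limit_override = 0x18000000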
Our first experiments are showing progress. We'll play with it some more and
let you know. Thanks!
Ian
If you do a dd to the storage from the heads do you still get the same
issues?
On 31 Oct 2010 12:40, Ian D rewar...@hotmail.com wrote:
I get that more cores don't necessarily mean better performance, but I doubt
that both the latest AMD CPUs (the Magny-Cours) and the latest Intel CPUs
(the
If you do a dd to the storage from the heads do
you still get the same issues?
No, local reads/writes are great; they never choke. It's whenever NFS or iSCSI
are involved and the reads/writes are done from a remote box that we
experience the problem. Local operations barely affect the
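The kind of local dd check we run looks like this (the path is illustrative):

  # sequential write, ~10GB
  dd if=/dev/zero of=/volumes/pool_sas/ddtest bs=1024k count=10000
  # sequential read back
  dd if=/volumes/pool_sas/ddtest of=/dev/null bs=1024k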
What if you connect locally via NFS or iSCSI?
SR
Check your TXG settings; it could be a timing issue, a Nagle's algorithm issue, or a TCP buffer issue. Check your system setup properties.
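These are the sorts of knobs I mean (a sketch; values are illustrative, check the defaults before changing anything):

  # Nagle: lower the coalescing limit to 1 byte, effectively off
  ndd -set /dev/tcp tcp_naglim_def 1
  # TCP buffers
  ndd -set /dev/tcp tcp_max_buf 4194304
  ndd -set /dev/tcp tcp_xmit_hiwat 1048576
  ndd -set /dev/tcp tcp_recv_hiwat 1048576
  # TXG timing: seconds between forced txg syncs
  echo zfs_txg_timeout/W0t5 | mdb -kw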
On 1 Nov 2010 19:36, SR rraj...@gmail.com wrote:
What if you connect locally via NFS or iSCSI?
SR
- Original Message -
Likely you don't have enough RAM or CPU in the box.
The Nexenta box has 256G of RAM and the latest X7500 series CPUs. That
said, the load does get crazy high (like 35+) very quickly. We can't
figure out what's taking so much CPU. It happens even when
Maybe you are experiencing this:
http://opensolaris.org/jive/thread.jspa?threadID=119421
You doubt AMD or Intel CPUs suffer from bad cache management?
To rule that out, we've tried using an older server (about 4 years old) as the head, and we see the same pattern. There it's actually more obvious that it consumes a whole lot of CPU cycles. Using the same box as a Linux-based NFS
Maybe you are experiencing this:
http://opensolaris.org/jive/thread.jspa?threadID=11942
It does look like this... Is this really the expected behaviour? That's just unacceptable. It is so bad it sometimes drops connections and fails copies and SQL queries...
Ian
On Nov 1, 2010, at 5:09 PM, Ian D rewar...@hotmail.com wrote:
Maybe you are experiencing this:
http://opensolaris.org/jive/thread.jspa?threadID=11942
It does look like this... Is this really the expected behaviour? That's just unacceptable. It is so bad it sometimes drops connections and
Ross Walker wrote:
On Nov 1, 2010, at 5:09 PM, Ian D rewar...@hotmail.com wrote:
Maybe you are experiencing this:
http://opensolaris.org/jive/thread.jspa?threadID=11942
It does look like this... Is this really the expected behaviour? That's just
unacceptable. It is so bad it
I get that more cores don't necessarily mean better performance, but I doubt that both the latest AMD CPUs (the Magny-Cours) and the latest Intel CPUs (the Beckton) suffer from incredibly bad cache management. Our two test systems have 2 and 4 of them respectively. The thing is that the
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Ian D
I get that more cores don't necessarily mean better performance, but I doubt that both the latest AMD CPUs (the Magny-Cours) and the latest Intel CPUs (the Beckton) suffer from
I owe you all an update...
We found a clear pattern we can now recreate at will. Whenever we read/write the pool, it gives expected throughput and IOPS for a while, but at some point it slows down to a crawl: nothing responds, pretty much everything hangs for a few seconds, and then things go
So maybe a next step is to run zilstat, arcstat, iostat -xe (I forget which params people like to use for these), and zpool iostat -v in 4 terminal windows while running the same test, and try to see what is spiking when that high-load period occurs. Not sure if there is a better version than this:
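Something like this, one per terminal, while reproducing the stall (zilstat and arcstat are the usual community scripts; names and paths may differ on your box):

  zilstat 1          # ZIL activity
  arcstat.pl 1       # ARC hit/miss rates
  iostat -xe 1       # per-device I/O and errors
  zpool iostat -v 1  # per-vdev throughput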
Here is a total guess - but what if it has to do with zfs processing running
on one CPU having to talk to the memory owned by a different CPU? I don't
know if many people are running fully populated boxes like you are, so maybe
it is something people are not seeing due to not having huge
We had the same issue with a 24-core box a while ago. Check your L2 cache hits and misses. Sometimes more cores does not mean more performance. DTrace is your friend!
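For a first look at the cache counters, cpustat works too (a sketch; the performance-counter event names vary by CPU model, so the placeholders below must be replaced with events your chip actually supports):

  cpustat -h                                       # lists the events your CPU supports
  cpustat -c pic0=<l2_hit_event>,pic1=<l2_miss_event> 1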
On 30 Oct 2010 14:12, zfs user zf...@itsbeen.sent.com wrote:
Here is a total guess - but what if it has to do with zfs processing
On Sat, Oct 30, 2010 at 02:10:49PM -0700, zfs user wrote:
1 Mangy-Cours CPU
^
Dunno whether deliberate, or malapropism, but I love it.
If you take a look at http://www.brendangregg.com/cachekit.html you will see some DTrace yumminess which should let you tell...
---
W. A. Khushil Dep - khushil@gmail.com - 07905374843
Visit my blog at http://www.khushil.com/
On 30 October 2010 15:49, Eugen Leitl eu...@leitl.org wrote:
I did it deliberately - how dumb are these product managers that they name
products with weird names and not expect them to be abused? On the other hand,
if you do a search for mangy cours you'll find a bunch of hits where it is
clearly a misspelling on serious tech articles, postings, etc.
On 10/30/2010 7:07 PM, zfs user wrote:
I did it deliberately - how dumb are these product managers that they
name products with weird names and not expect them to be abused? On
the other hand, if you do a search for mangy cours you'll find a bunch
of hits where it is clearly a misspelling on
Right, I realized it was Magny not Mangy, but I thought it was related to the
race track or racing not a town.
I completely agree with you on codenames; the Linux distro codenames irk me. Hey guys, it might be easy for you to keep track of which release is Bushy Beaver or Itchy Ibis or
A network switch that is being maxed out? Some switches cannot switch at rated line speed on all their ports all at the same time. Their internal buses simply don't have the bandwidth needed for that. Maybe you are running into that limit? (I know you mentioned bypassing the
Likely you don't have enough RAM or CPU in the box.
The Nexenta box has 256G of RAM and the latest X7500 series CPUs. That said, the load does get crazy high (like 35+) very quickly. We can't figure out what's taking so much CPU. It happens even when checksumming/compression/deduping are
I don't think the switch model was ever identified... perhaps it is a 1 GbE switch with a few 10 GbE ports? (Grasping at straws.)
It is a Dell 8024F. It has 24 SFP+ 10GbE ports, and all the NICs we connect to it are Intel X520s. One issue we do have with it is when we turn jumbo frames
On Oct 23, 2010, at 4:31 AM, Ian D wrote:
Likely you don't have enough RAM or CPU in the box.
The Nexenta box has 256G of RAM and the latest X7500 series CPUs. That said,
the load does get crazy high (like 35+) very quickly. We can't figure out
what's taking so much CPU. It happens
Some numbers...
# zpool status
  pool: Pool_sas
 state: ONLINE
  scan: none requested
config:
        NAME                 STATE   READ WRITE CKSUM
        Pool_sas             ONLINE     0     0     0
          c4t5000C506A6D3d0  ONLINE     0     0     0
One suspicious thing is that we notice a slowdown of one pool when the other is under load. How can that be?
Ian
A network switch that is being maxed out? Some switches cannot switch
at rated line speed on all their ports all at the same time. Their
internal buses simply don't have
On Fri, Oct 22, 2010 at 10:40 PM, Haudy Kazemi kaze0...@umn.edu wrote:
One suspicious thing is that we notice a slowdown of one pool when the other is under load. How can that be?
Ian
A network switch that is being maxed out? Some switches cannot switch at
rated line speed on all
What more info could you provide? Quite a lot more, actually, like: how many streams of SQL and copy are you running? How are the filesystems/zvols configured (recordsize, etc.)? Some CPU, VM and network stats would also be nice.
Based on the Nexenta iostats you've provided (a tiny window on
Tim Cook wrote:
On Fri, Oct 22, 2010 at 10:40 PM, Haudy Kazemi kaze0...@umn.edu wrote:
One suspicious thing is that we notice a slowdown of one pool when the other is under load. How can that be?
Ian
A network switch that
Ian,
It would help to have some config detail (e.g. what options are you using?
zpool status output; property lists for specific filesystems and zvols; etc)
Some basic Solaris stats can be very helpful too (e.g. peak-flow samples of vmstat 1, mpstat 1, iostat -xnz 1, etc.)
It would also be
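A minimal way to capture those while reproducing the problem (sample counts illustrative):

  vmstat 1 60 > vmstat.out &
  mpstat 1 60 > mpstat.out &
  iostat -xnz 1 60 > iostat.out &
  wait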
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Phil Harman
I'm wondering whether your HBA has a write-through or write-back cache enabled? The latter might make things very fast, but could put data at risk if not sufficiently
As I have mentioned already, we have the same performance issues whether we READ or WRITE to the array; shouldn't that rule out caching issues?
Also, we can get great performance with the LSI HBA if we use the JBODs as a local file system. The issues only arise when it is done through iSCSI
He already said he has SSDs for dedicated log. This means the best solution is to disable WriteBack and just use WriteThrough. Not only is it more reliable than WriteBack, it's faster. And I know I've said this many times before, but I don't mind repeating: if you have slog devices,
I've had a few people sending emails directly suggesting it might have something to do with the ZIL/SLOG. I guess I should have said that the issue happens both ways, whether we copy TO or FROM the Nexenta box.
You mentioned a second Nexenta box earlier. To rule out client-side issues,
As I have mentioned already, it would be useful to know more about the
config, how the tests are being done, and to see some basic system
performance stats.
On 15/10/2010 15:58, Ian D wrote:
As I have mentioned already, we have the same performance issues whether we
READ or we WRITE to the
You mentioned a second Nexenta box earlier. To rule out client-side issues, have you considered testing with Nexenta as the iSCSI/NFS client?
If you mean running the NFS client AND server on the same box then yes, and it
doesn't show the same performance issues. It's only when a Linux box
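The loopback test was along these lines (a sketch; dataset and mount paths are illustrative):

  # on the Nexenta box itself
  zfs create pool_sas/looptest
  zfs set sharenfs=on pool_sas/looptest
  mkdir -p /mnt/looptest
  mount -F nfs localhost:/volumes/pool_sas/looptest /mnt/looptest
  dd if=/dev/zero of=/mnt/looptest/f bs=1024k count=4096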
As I have mentioned already, it would be useful to know more about the config, how the tests are being done, and to see some basic system performance stats.
I will shortly. Thanks!
On 15/10/2010 19:09, Ian D wrote:
It's only when a Linux box SENDS/RECEIVES data to the NFS/iSCSI shares that we have problems. But if the Linux box sends/receives files through scp to the external disks mounted by the Nexenta box as a local filesystem, then there is no problem.
Does the Linux
Does the Linux box have the same issue to any other server? What if the client box isn't Linux but Solaris or Windows or Mac OS X?
That would be a good test. We'll try that.
We contacted LSI, and they say that the 9200-16e HBA is not supported in OpenSolaris, just Solaris. Aren't Solaris drivers the same as OpenSolaris?
Is there anyone here using 9200-16e HBAs? What about the 9200-8e? We have a
couple lying around and we'll test one shortly.
Ian
A little setback... We found out that we also have the issue with the Dell H800 controllers, not just the LSI 9200-16e. With the Dell it's initially faster as we benefit from the cache, but after a little while it goes sour: from 350MB/sec down to less than 40MB/sec. We've also tried with a
On 15 Oct 2010, at 22:19, Ian D wrote:
A little setback... We found out that we also have the issue with the Dell H800 controllers, not just the LSI 9200-16e. With the Dell it's initially faster as we benefit from the cache, but after a little while it goes sour: from 350MB/sec down to
-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Ian D
Sent: Friday, October 15, 2010 4:19 PM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Performance issues with iSCSI under Linux
A little
Has anyone suggested either removing L2ARC/SLOG entirely or relocating them so that all devices are coming off the same controller? You've swapped the external controller, but the H700 with the internal drives could be the real culprit. Could there be issues with cross-controller IO in this
On Oct 15, 2010, at 5:34 PM, Ian D rewar...@hotmail.com wrote:
Has anyone suggested either removing L2ARC/SLOG entirely or relocating them so that all devices are coming off the same controller? You've swapped the external controller, but the H700 with the internal drives could be the real
On 13 Oct 2010, at 18:37, Marty Scholes wrote:
The only thing that still stands out is that network operations (iSCSI and
NFS) to external drives are slow, correct?
Just for completeness, what happens if you scp a file to the three different
pools? If the results are the same as NFS
Sounding more and more like a networking issue. Are the network cards set up in an aggregate? I had some similar issues on GbE where there was a mismatch between the aggregate settings on the switches and the LACP settings on the server. Basically the network was wasting a ton of time
I've had a few people sending emails directly suggesting it might have something to do with the ZIL/SLOG. I guess I should have said that the issue happens both ways, whether we copy TO or FROM the Nexenta box.
Our next test is to try with a different kind of HBA; we have a Dell H800 lying around.
ok... we're making progress. After swapping the LSI HBA for a Dell H800 the
issue disappeared. Now, I'd rather not use those controllers because they
don't have a JBOD mode. We have no choice but to make
rewar...@hotmail.com said:
ok... we're making progress. After swapping the LSI HBA for a Dell H800 the issue disappeared. Now, I'd rather not use those controllers because they don't have a JBOD mode. We have no choice but to make individual RAID0 volumes for each disk, which means we need
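With MegaCli, that per-disk RAID0 setup would look something like this (a sketch; enclosure:slot numbers are illustrative, and WT/NORA/Direct keeps controller caching out of the picture per volume):

  MegaCli -CfgLdAdd -r0 [32:0] WT NORA Direct -a0
  MegaCli -CfgLdAdd -r0 [32:1] WT NORA Direct -a0
  # ...one per disk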
Earlier you said you had eliminated the ZIL as an issue, but one difference between the Dell H800 and the LSI HBA is that the H800 has an NV cache (if you have the battery backup present).
A very simple test would be, when things are running slow, to try disabling the ZIL temporarily, to see
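Depending on the build, disabling it looks like one of these (a sketch; not something to leave on with data you care about):

  # newer builds: per-dataset property
  zfs set sync=disabled pool_sas
  # older builds: global tunable, only affects datasets mounted afterwards
  echo zil_disable/W0t1 | mdb -kw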
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Ian D
ok... we're making progress. After swapping the LSI HBA for a Dell
H800 the issue disappeared. Now, I'd rather not use those controllers
because they don't have a JBOD mode. We have
On Thu, Oct 14, 2010 at 09:54:09PM -0400, Edward Ned Harvey wrote:
If you happen to find that MegaCLI is the right tool for your hardware, let
me know, and I'll paste a few commands here, which will simplify your life.
When I first started using it, I found it terribly
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Wilkinson, Alex
can you paste them anyway ?
Note: If you have more than one adapter, I believe you can specify -aALL in
the commands below, instead of -a0
I have 2 disks (slots 4 5) that
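A few of the basics (a sketch; adapter and slot numbers are illustrative):

  MegaCli -AdpAllInfo -aALL                # adapter overview
  MegaCli -PDList -aALL                    # physical disks, with enclosure:slot IDs
  MegaCli -LDInfo -LAll -aALL              # logical drive config and cache policy
  MegaCli -LDSetProp WT -LAll -aALL        # force WriteThrough on all volumes
  MegaCli -AdpBbuCmd -GetBbuStatus -aALL   # battery state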
Here are some more findings...
The Nexenta box has 3 pools:
syspool: made of 2 mirrored (hardware RAID) local SAS disks
pool_sas: made of 22 15K SAS disks in ZFS mirrors on 2 JBODs on 2 controllers
pool_sata: made of 42 SATA disks in 6 RAIDZ2 vdevs on a single controller
When we copy data from
Here are some more findings...
The Nexenta box has 3 pools:
syspool: made of 2 mirrored (hardware RAID) local SAS disks
pool_sas: made of 22 15K SAS disks in ZFS mirrors on 2 JBODs on 2 controllers
pool_sata: made of 42 SATA disks in 6 RAIDZ2 vdevs on a single controller
When we copy
The only thing that still stands out is that network operations (iSCSI and NFS) to external drives are slow, correct?
Yes, that pretty much sums it up.
Just for completeness, what happens if you scp a file to the three different pools? If the results are the same as NFS and iSCSI, then I
Would it be possible to install OpenSolaris to a USB disk, boot from it, and try? That would take 1-2h and could maybe help you narrow things down further?
I'm a little afraid of losing my data. It wouldn't be the end of the world, but I'd rather avoid that. I'll do it as a last resort.
Ian
More stuff...
We ran the same tests on another Nexenta box with fairly similar hardware and had the exact same issues. The two boxes have the same models of HBAs, NICs and JBODs but different CPUs and motherboards.
Our next test is to try with a different kind of HBA; we have a Dell H800
From the Linux side, it appears the drive in question is either sdb or dm-3, and both appear to be the same drive. Since switching to ZFS, my Linux-disk-fu has become a bit rusty. Is one an alias for the other?
Yes, dm-3 is the alias created by LVM while sdb is the physical (or raw) device
I'll suggest trying something completely different, like: dd if=/dev/zero bs=1024k | pv | ssh othermachine 'cat > /dev/null' ... Just to verify there isn't something horribly wrong with your hardware (network).
In Linux, run ifconfig ... You should see errors:0.
Make sure each machine has
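To check for errors, something like this (interface names illustrative):

  # Linux side
  ifconfig eth0 | grep -i errors
  ethtool -S eth0 | egrep -i 'err|drop'
  # Solaris side
  kstat -p | egrep 'ierrors|oerrors'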
A couple of notes: we know the Pool_sata is resilvering, but we're concerned about the performance of the other pool (Pool_sas). We also know that we're not using jumbo frames, as for some reason they make the Linux box crash. Could that explain it all?
What sort of drives are these? It
A couple of notes: we know the Pool_sata is resilvering, but we're concerned about the performance of the other pool (Pool_sas). We also know that we're not using jumbo frames, as for some reason they make the Linux box crash. Could that explain it all?
What sort of drives are these? It
What sort of drives are these? It looks like iSCSI or FC device names, and
not local drives
The Pool_sas is made of 15K SAS drives on external JBOD arrays (Dell
MD1000) connected on mirrored LSI 9200-8e SAS HBAs.
The Pool_sata is made of SATA drives on other JBODs.
The shorter
If you have a single SSD for dedicated log, that will surely be a bottleneck for you.
We're aware of that. The original plan was to use mirrored DDRDrive X1s, but we're experiencing stability issues. Chris George is being very responsive and will help us investigate that once we
We're aware of that. The original plan was to use mirrored DDRDrive X1s, but we're experiencing stability issues. Chris George is being very responsive and will help us investigate that once we figure out our most pressing performance problems.
I feel I need to add to my comment
Hi! We're trying to pinpoint our performance issues and we could use all the help the community can provide. We're running the latest version of Nexenta on a pretty powerful machine (4x Xeon 7550, 256GB RAM, 12x 100GB Samsung SSDs for the cache, 50GB Samsung SSD for the ZIL, 10GbE on a
Where should we look? What more information should I provide?
Start with 'iostat -xdn 1'. That'll provide info about the actual device I/O.
Kind regards / Best regards
roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Ian D
the help the community can provide. We're running the latest version of Nexenta on a pretty powerful machine (4x Xeon 7550, 256GB RAM, 12x 100GB Samsung SSDs for the cache, 50GB
To see if it is iSCSI-related or ZFS, have you tried testing performance over NFS to a ZFS filesystem instead of a zvol?
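i.e. on the Nexenta side, something like this (names illustrative), then point the same client workload at the NFS mount and compare:

  zfs create pool_sas/nfstest
  zfs set sharenfs=on pool_sas/nfstest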
SR