ZFS has always done a certain amount of write throttling. In the past
(or the present, for those of you running S10 or pre build 87 bits) this
throttling was controlled by a timer and the size of the ARC: we would
cut a transaction group every 5 seconds based on our timer, and
we would also
ugh, thanks for exploring this and isolating the problem. We will look
into what is going on (wrong) here. I have filed bug:
6545015 RAID-Z resilver broken
to track this problem.
-Mark
Marco van Lienen wrote:
On Sat, Apr 07, 2007 at 05:05:18PM -0500, in a galaxy far far away, Chris
Anton B. Rang wrote:
This sounds a lot like:
6417779 ZFS: I/O failure (write on ...) -- need to
reallocate writes
Which would allow us to retry write failures on
alternate vdevs.
Of course, if there's only one vdev, the write should be retried to a different
block on the original vdev ...
Joseph Barbey wrote:
Matthew Ahrens wrote:
Joseph Barbey wrote:
Robert Milkowski wrote:
JB So, normally, when the script runs, all snapshots finish in
maybe a minute
JB total. However, on Sundays, it continues to take longer and
longer. On
JB 2/25 it took 30 minutes, and this last
Atul Vidwansa wrote:
Hi,
I have few questions about the way a transaction group is created.
1. Is it possible to group transactions related to multiple operations
in same group? For example, an rmdir foo followed by mkdir bar,
can these end up in same transaction group?
Yes.
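A toy model (my own sketch, not ZFS source) of why both operations can land in the same group: operations simply join whichever transaction group is currently open, and a new group is only cut periodically (e.g. by the 5-second timer):

```python
# Toy model of transaction-group assignment. Class and method names
# are hypothetical; only the grouping behavior follows the text above.
class TxgModel:
    def __init__(self):
        self.current = 1          # the currently open txg number
        self.groups = {1: []}

    def assign(self, op):
        # Every operation joins the open group.
        self.groups[self.current].append(op)
        return self.current

    def cut(self):
        # Cut the group (e.g. the periodic timer fired); open the next.
        self.current += 1
        self.groups[self.current] = []

m = TxgModel()
a = m.assign("rmdir foo")
b = m.assign("mkdir bar")
# No cut happened in between, so a == b: same transaction group.
```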
2. Is it
Frederic Payet - Availability Services wrote:
Hi gurus,
When creating some small files in a ZFS directory, the number of used
blocks is not what could be expected:
hinano# zfs list
NAME       USED  AVAIL  REFER  MOUNTPOINT
pool2      702K  16.5G  26.5K  /pool2
pool2/new
This issue has been discussed a number of times in this forum.
To summarize:
ZFS (specifically, the ARC) will try to use *most* of the system's
available memory to cache file system data. The default is to
max out at physmem-1GB (i.e., use all of physical memory except
for 1GB). In the face of
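The default ceiling described above can be written out as a one-liner. A minimal sketch (my own arithmetic, not the OpenSolaris source; the 64MB floor is an assumption, the real code has its own minimums):

```python
# Default ARC ceiling: all of physical memory except 1 GB.
GiB = 1 << 30
MiB = 1 << 20

def default_arc_max(physmem_bytes):
    # Leave 1 GB for the rest of the system; keep a small floor so a
    # tiny machine still gets some cache (floor value is an assumption).
    return max(physmem_bytes - GiB, 64 * MiB)

# An 8 GB machine would default to a 7 GB ARC ceiling.
```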
Robert,
This doesn't look like cache flushing, rather it looks like we are
trying to finish up some writes... but are having a hard time allocating
space for them. Is this pool almost 100% full? There are lots of
instances of zio_write_allocate_gang_members(), which indicates a very
high
Peter Buckingham wrote:
Hi Eric,
eric kustarz wrote:
The first thing i would do is see if any I/O is happening ('zpool
iostat 1'). If there's none, then perhaps the machine is hung (which
you then would want to grab a couple of '::threadlist -v 10's from mdb
to figure out if there are hung
Al Hopper wrote:
On Wed, 10 Jan 2007, Mark Maybee wrote:
Jason J. W. Williams wrote:
Hi Robert,
Thank you! Holy mackerel! That's a lot of memory. With that type of a
calculation my 4GB arc_max setting is still in the danger zone on a
Thumper. I wonder if any of the ZFS developers could shed
Jason J. W. Williams wrote:
Hi Mark,
That does help tremendously. How does ZFS decide which zio cache to
use? I apologize if this has already been addressed somewhere.
The ARC caches data blocks in the zio_buf_xxx() cache that matches
the block size. For example, dnode data is stored on disk
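The cache-matching rule above can be sketched as follows. This is a hypothetical illustration, not kernel source: the real kernel has more size classes (including non-power-of-two ones), so the class list here is an assumption.

```python
# Pick the zio_buf_<size> kmem cache for a block: the smallest cache
# whose buffers fit the block. Size classes are assumed for illustration.
ZIO_BUF_SIZES = [512 << i for i in range(9)]  # 512 B .. 128 KB

def zio_buf_cache(block_size):
    for size in ZIO_BUF_SIZES:
        if block_size <= size:
            return f"zio_buf_{size}"
    raise ValueError("block larger than the biggest cache")

# A 16 KB block would come out of the zio_buf_16384 cache.
```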
Jason J. W. Williams wrote:
Hi Robert,
Thank you! Holy mackerel! That's a lot of memory. With that type of a
calculation my 4GB arc_max setting is still in the danger zone on a
Thumper. I wonder if any of the ZFS developers could shed some light
on the calculation?
In a worst-case scenario,
Thomas,
This could be fragmentation in the meta-data caches. Could you
print out the results of ::kmastat?
-Mark
Tomas Ögren wrote:
On 05 January, 2007 - Robert Milkowski sent me these 3,8K bytes:
Hello Tomas,
I saw the same behavior here when ncsize was increased from default.
Try with
hands
here, so it has no ability to reduce its size.
Number 3 is the most difficult issue. We are looking into that at the
moment as well.
-Mark
Tomas Ögren wrote:
On 05 January, 2007 - Mark Maybee sent me these 0,8K bytes:
Thomas,
This could be fragmentation in the meta-data caches. Could
Tomas Ögren wrote:
On 05 January, 2007 - Mark Maybee sent me these 1,5K bytes:
So it looks like this data does not include ::kmastat info from *after*
you reset arc_reduce_dnlc_percent. Can I get that?
Yeah, attached. (although about 18 hours after the others)
Excellent, this confirms #3
Tomas,
There are a couple of things going on here:
1. There is a lot of fragmentation in your meta-data caches (znode,
dnode, dbuf, etc). This is burning up about 300MB of space in your
hung kernel. This is a known problem that we are currently working
on.
2. While the ARC has set its
Hmmm, so there is lots of evictable cache here (mostly in the MFU
part of the cache)... could you make your core file available?
I would like to take a look at it.
-Mark
Tomas Ögren wrote:
On 03 January, 2007 - Mark Maybee sent me these 5,0K bytes:
Tomas,
There are a couple of things going
Ah yes! Thank you Casper. I knew this looked familiar! :-)
Yes, this is almost certainly what is happening here. The
bug was introduced in build 51 and fixed in build 54.
[EMAIL PROTECTED] wrote:
Hmmm, so there is lots of evictable cache here (mostly in the MFU
part of the cache)... could
[EMAIL PROTECTED] wrote:
Hello Casper,
Tuesday, December 12, 2006, 10:54:27 AM, you wrote:
So 'a' UB can become corrupt, but it is unlikely that 'all' UBs will
become corrupt through something that doesn't also make all the data
also corrupt or inaccessible.
CDSC So how does this work for
Andrew Miller wrote:
Quick question about the interaction of ZFS filesystem compression and the filesystem cache. We have an Opensolaris (actually Nexenta alpha-6) box running RRD collection. These files seem to be quite compressible. A test filesystem containing about 3,000 of these files
Jeremy Teo wrote:
On 12/5/06, Bill Sommerfeld [EMAIL PROTECTED] wrote:
On Mon, 2006-12-04 at 13:56 -0500, Krzys wrote:
mypool2/[EMAIL PROTECTED] 34.4M - 151G -
mypool2/[EMAIL PROTECTED] 141K - 189G -
mypool2/d3 492G 254G 11.5G legacy
I am so
Robert Milkowski wrote:
Hello John,
Thursday, November 9, 2006, 12:03:58 PM, you wrote:
JC Hi all,
JC When testing our programs, I got a problem. On UFS, we get the number of
JC free inode via 'df -e', then do some things based this value, such as
JC create an empty file, the value will
Matthew Flanagan wrote:
Matt,
Matthew Flanagan wrote:
mkfile 100m /data
zpool create tank /data
...
rm /data
...
panic[cpu0]/thread=2a1011d3cc0: ZFS: I/O failure
(write on unknown off 0: zio 60007432bc0 [L0
unallocated] 4000L/400P DVA[0]=0:b000:400
DVA[1]=0:120a000:400 fletcher4
Patrick wrote:
Hi,
So recently, i decided to test out some of the ideas i've been toying
with, and decided to create 50 000 and 100 000 filesystems, the test
machine was a nice V20Z with dual 1.8 opterons, 4gb ram, connecting a
scsi 3310 raid array, via two scsi controllers.
Now creating the
Yup, it's almost certain that this is the bug you are hitting.
-Mark
Alan Hargreaves wrote:
I know, bad form replying to myself, but I am wondering if it might be
related to
6438702 error handling in zfs_getpage() can trigger page not
locked
Which is marked fix in progress with a
Robert Milkowski wrote:
Hello Philippe,
It was recommended to lower ncsize and I did (to default ~128K).
So far it works ok for last days and staying at about 1GB free ram
(fluctuating between 900MB-1,4GB).
Do you think it's a long term solution or with more load and more data
the problem can
Jill Manfield wrote:
My customer is running java on a ZFS file system. His platform is Solaris 10
x86 SF X4200. When he enabled ZFS, his memory of 18 gigs dropped to 2 gigs rather
quickly. I had him do a # ps -e -o pid,vsz,comm | sort -n +1 and it came back:
The culprit application you see
Thomas Burns wrote:
Hi,
We have been using zfs for a couple of months now, and, overall, really
like it. However, we have run into a major problem -- zfs's memory
requirements
crowd out our primary application. Ultimately, we have to reboot the
machine
so there is enough free memory to
Thomas Burns wrote:
On Sep 12, 2006, at 2:04 PM, Mark Maybee wrote:
Thomas Burns wrote:
Hi,
We have been using zfs for a couple of months now, and, overall, really
like it. However, we have run into a major problem -- zfs's memory
requirements
crowd out our primary application
Jürgen Keil wrote:
We are trying to obtain a mutex that is currently held
by another thread trying to get memory.
Hmm, reminds me a bit on the zvol swap hang I got
some time ago:
http://www.opensolaris.org/jive/thread.jspa?threadID=11956&tstart=150
I guess if the other thread is stuck trying
Ivan,
What mail clients use your mail server? You may be seeing the
effects of:
6440499 zil should avoid txg_wait_synced() and use dmu_sync() to issue
parallel IOs when fsyncing
This bug was fixed in nevada build 43, and I don't think made it into
s10 update 2. It will, of course, be in
Hmmm, interesting data. See comments in-line:
Robert Milkowski wrote:
Yes, server has 8GB of RAM.
Most of the time there's about 1GB of free RAM.
bash-3.00# mdb 0
Loading modules: [ unix krtld genunix dtrace specfs ufs sd md ip sctp usba fcp
fctl qlc ssd lofs zfs random logindmux ptm cpc nfs
Robert Milkowski wrote:
On Wed, 6 Sep 2006, Mark Maybee wrote:
Robert Milkowski wrote:
::dnlc!wc
1048545 3145811 76522461
Well, that explains half your problem... and maybe all of it:
After I reduced vdev prefetch from 64K to 8K for last few hours system
is working properly
Robert,
I would be interested in seeing your crash dump. ZFS will consume much
of your memory *in the absence of memory pressure*, but it should be
responsive to memory pressure, and give up memory when this happens. It
looks like you have 8GB of memory on your system? ZFS should never
Michael Schuster - Sun Microsystems wrote:
Pawel Jakub Dawidek wrote:
On Tue, Aug 22, 2006 at 04:30:44PM +0200, Jeremie Le Hen wrote:
I don't know much about ZFS, but Sun states this is a 128 bits
filesystem. How will you handle this in regards to the FreeBSD
kernel interface that is
Robert,
Are you sure that nfs-s5-p0/d5110 and nfs-s5-p0/d5111 are mounted
following the import? These messages imply that the d5110 and d5111
directories in the top-level filesystem of pool nfs-s5-p0 are not
empty. Could you verify that 'df /nfs-s5-p0/d5110' displays
nfs-s5-p0/d5110 as the
Eric Lowe wrote:
Eric Schrock wrote:
Well the fact that it's a level 2 indirect block indicates why it can't
simply be removed. We don't know what data it refers to, so we can't
free the associated blocks. The panic on move is quite interesting -
after BFU give it another shot and file a bug
Luke Lonergan wrote:
Robert,
On 8/8/06 9:11 AM, Robert Milkowski [EMAIL PROTECTED] wrote:
1. UFS, noatime, HW RAID5 6 disks, S10U2
70MB/s
2. ZFS, atime=off, HW RAID5 6 disks, S10U2 (the same lun as in #1)
87MB/s
3. ZFS, atime=off, SW RAID-Z 6 disks, S10U2
130MB/s
4. ZFS,
Jürgen Keil wrote:
I've tried to use dmake lint on on-src-20060731, and was running out of swap
on my
Tecra S1 laptop, 32-bit x86, 768MB main memory, with a 512MB swap slice.
The FULL KERNEL: global crosschecks: lint run consumes lots (~800MB) of space
in /tmp, so the system was running out of
Yup, you're probably running up against the limitations of 32-bit kernel
addressability. We are currently very conservative in this environment,
and so tend to end up with a small cache as a result. It may be
possible to tweak things to get larger cache sizes, but you run the risk
of starving out