Re: [zfs-discuss] System hangs during zfs send

2010-08-25 Thread David Blasingame Oracle

What does ::zio_state show?
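
If you haven't captured it yet, you can grab it from the running system with
(assuming mdb -k is usable on your build):

echo ::zio_state | mdb -k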

Dave

On 08/25/10 07:41, Bryan Leaman wrote:

Hi, I've been following these forums for a long time but this is my first post. 
 I'm looking for some advice on debugging an issue.  I've been looking at all 
the bug reports and updates though b146 but I can't find a good match.  I tried 
the fix for 6937998 but it didn't help.

Running Nexenta NCP3.  When I attempt to do a simple zfs send of my root pool 
(syspool) > /dev/null, it sends all the volume streams but then all IO hangs at 
the moment the send seems like it should complete.  I have to restart the box 
at this point.

The following mdb output is from the hung system (from a savecore -L).  I'm 
still learning my way around mdb and kernel debugging so any suggestions on how 
to track this down would be really appreciated.  It seems like it's stuck 
waiting for txg_wait_synced.

  

::ptree


   ff02e8d97718  sshd
ff02e74c3570  sshd
 ff02e8d95e48  tcsh
  ff02d1cc3e20  bash
   ff02e7f4a720  bash
ff02e6bec900  zfs

  

ff02e6bec900::walk thread


ff02d1954720

  

ff02d1954720::threadlist -v


ADDR             PROC              LWP CLS PRI            WCHAN
ff02d1954720 ff02e6bec900 ff02cf543850   1  60 ff02cd54054a
  PC: _resume_from_idle+0xf1    CMD: zfs send -Rvp sysp...@20100824
  stack pointer for thread ff02d1954720: ff0010b6ca90
  [ ff0010b6ca90 _resume_from_idle+0xf1() ]
swtch+0x145()
cv_wait+0x61()
txg_wait_synced+0x7c()
dsl_sync_task_group_wait+0xee()
dsl_dataset_user_release+0x101()
zfs_ioc_release+0x51()
zfsdev_ioctl+0x177()
cdev_ioctl+0x45()
spec_ioctl+0x5a()
fop_ioctl+0x7b()
ioctl+0x18e()
sys_syscall32+0xff()

  

ff02d1954720::findstack -v


stack pointer for thread ff02d1954720: ff0010b6ca90
[ ff0010b6ca90 _resume_from_idle+0xf1() ]
  ff0010b6cac0 swtch+0x145()
  ff0010b6caf0 cv_wait+0x61(ff02cd54054a, ff02cd540510)
  ff0010b6cb40 txg_wait_synced+0x7c(ff02cd540380, 9291)
  ff0010b6cb80 dsl_sync_task_group_wait+0xee(ff02d0b1a868)
  ff0010b6cc10 dsl_dataset_user_release+0x101(ff02d1336000,
  ff02d1336400, ff02d1336c00, 1)
  ff0010b6cc40 zfs_ioc_release+0x51(ff02d1336000)
  ff0010b6ccc0 zfsdev_ioctl+0x177(b6, 5a32, 8045660, 13,
  ff02cd646588, ff0010b6cde4)
  ff0010b6cd00 cdev_ioctl+0x45(b6, 5a32, 8045660, 13,
  ff02cd646588, ff0010b6cde4)
  ff0010b6cd40 spec_ioctl+0x5a(ff02d17c3180, 5a32, 8045660, 13,
  ff02cd646588, ff0010b6cde4, 0)
  ff0010b6cdc0 fop_ioctl+0x7b(ff02d17c3180, 5a32, 8045660, 13,
  ff02cd646588, ff0010b6cde4, 0)
  ff0010b6cec0 ioctl+0x18e(3, 5a32, 8045660)
  ff0010b6cf10 sys_syscall32+0xff()

  

ff02cd540380::print dsl_pool_t dp_tx


dp_tx = {
dp_tx.tx_cpu = 0xff02cd540680
dp_tx.tx_sync_lock = {
_opaque = [ 0 ]
}
dp_tx.tx_open_txg = 0x9292
dp_tx.tx_quiesced_txg = 0
dp_tx.tx_syncing_txg = 0x9291
dp_tx.tx_synced_txg = 0x9290
dp_tx.tx_sync_txg_waiting = 0x9292
dp_tx.tx_quiesce_txg_waiting = 0x9292
dp_tx.tx_sync_more_cv = {
_opaque = 0
}
dp_tx.tx_sync_done_cv = {
_opaque = 0x2
}
dp_tx.tx_quiesce_more_cv = {
_opaque = 0x1
}
dp_tx.tx_quiesce_done_cv = {
_opaque = 0
}
dp_tx.tx_timeout_cv = {
_opaque = 0
}
dp_tx.tx_exit_cv = {
_opaque = 0
}
dp_tx.tx_threads = 0x2
dp_tx.tx_exiting = 0
dp_tx.tx_sync_thread = 0xff000fa05c60
dp_tx.tx_quiesce_thread = 0xff000f9fcc60
dp_tx.tx_commit_cb_taskq = 0

  

ff02cd540380::print dsl_pool_t dp_tx.tx_sync_thread


dp_tx.tx_sync_thread = 0xff000fa05c60

  

0xff000fa05c60::findstack -v


stack pointer for thread ff000fa05c60: ff000fa05860
[ ff000fa05860 _resume_from_idle+0xf1() ]
  ff000fa05890 swtch+0x145()
  ff000fa058c0 cv_wait+0x61(ff000fa05e3e, ff000fa05e40)
  ff000fa05900 delay_common+0xab(1)
  ff000fa05940 delay+0xc4(1)
  ff000fa05960 dnode_special_close+0x28(ff02e8aa2050)
  ff000fa05990 dmu_objset_evict+0x160(ff02e5b91100)
  ff000fa05a20 dsl_dataset_user_release_sync+0x52(ff02e000b928,
  ff02d0b1a868, ff02e5b9c6e0)
  ff000fa05a70 dsl_sync_task_group_sync+0xf3(ff02d0b1a868,
  ff02e5b9c6e0)
  ff000fa05af0 dsl_pool_sync+0x1ec(ff02cd540380, 9291)
  ff000fa05ba0 spa_sync+0x37b(ff02cdd40b00, 9291)
  ff000fa05c40 txg_sync_thread+0x247(ff02cd540380)
  ff000fa05c50 thread_start+8()

  

::spa


ADDR STATE NAME
ff02cdd40b00     ACTIVE  syspool

  

ff02cdd40b00::print spa_t 

Re: [zfs-discuss] Intermittent ZFS hang

2010-08-30 Thread David Blasingame Oracle

Charles,

Is it just ZFS hanging (or what appears to be ZFS slowing down or 
blocking), or does the whole system hang? 


A couple of questions

What does iostat show during the time period of the slowdown?
What does mpstat show during the time of the slowdown?

You can look at the metadata statistics by running the following.

echo ::arc | mdb -k

When looking at a ZFS problem, I usually like to gather

echo ::spa | mdb -k

echo ::zio_state | mdb -k

I suspect you could drill down more with dtrace or lockstat to see where 
the slowdown is happening.
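
If the slowdown lasts long enough to catch, a rough collection loop like this 
(just a sketch; adjust the intervals and output file to taste) makes it easier 
to line the numbers up afterwards:

#!/bin/sh
# Sample I/O, CPU and ZFS state repeatedly while the hang is in progress.
while :; do
        date
        iostat -xn 2 2             # per-device I/O
        mpstat 2 2                 # per-CPU activity
        echo ::arc | mdb -k        # ARC / metadata stats
        echo ::spa | mdb -k        # pool state
        echo ::zio_state | mdb -k  # outstanding zios
done >> /var/tmp/zfs-hang.out 2>&1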


Dave


On 08/30/10 11:02, Charles J. Knipe wrote:

Howdy,

We're having a ZFS performance issue over here that I was hoping you guys could 
help me troubleshoot.  We have a ZFS pool made up of 24 disks, arranged into 7 
raid-z devices of 4 disks each.  We're using it as an iSCSI back-end for VMWare 
and some Oracle RAC clusters.

Under normal circumstances performance is very good both in benchmarks and 
under real-world use.  Every couple days, however, I/O seems to hang for 
anywhere between several seconds and several minutes.  The hang seems to be a 
complete stop of all write I/O.  The following zpool iostat illustrates:

pool0   2.47T  5.13T120  0   293K  0
pool0   2.47T  5.13T127  0   308K  0
pool0   2.47T  5.13T131  0   322K  0
pool0   2.47T  5.13T144  0   347K  0
pool0   2.47T  5.13T135  0   331K  0
pool0   2.47T  5.13T122  0   295K  0
pool0   2.47T  5.13T135  0   330K  0

While this is going on our VMs all hang, as do any zfs create commands or attempts to 
touch/create files in the zfs pool from the local system.  After several minutes the system 
un-hangs and we see very high write rates before things return to normal across the 
board.

Some more information about our configuration:  We're running OpenSolaris 
svn-134.  ZFS is at version 22.  Our disks are 15kRPM 300gb Seagate Cheetahs, 
mounted in Promise J610S Dual enclosures, hanging off a Dell SAS 5/e 
controller.  We'd tried out most of this configuration previously on 
OpenSolaris 2009.06 without running into this problem.  The only thing that's 
new, aside from the newer OpenSolaris/ZFS is a set of four SSDs configured as 
log disks.

At first we blamed de-dupe, but we've disabled that.  Next we suspected the SSD 
log disks, but we've seen the problem with those removed, as well.

Has anyone seen anything like this before?  Are there any tools we can use to 
gather information during the hang which might be useful in determining what's 
going wrong?

Thanks for any insights you may have.

-Charles
  



--




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Intermittent ZFS hang

2010-08-31 Thread David Blasingame Oracle

Charles,

Just like UNIX, there are several ways to drill down on the problem.  I 
would probably start with a live crash dump (savecore -L) when you see 
the problem.  Another method would be to grab multiple stats commands 
during the problem so you can drill down later.  I would probably use 
that method if the problem lasts for a while, and then drill down with 
dtrace based on what I saw.  Each method is going to depend on your 
skill when looking at the problem.

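For the live dump, assuming a dedicated dump device is already configured, it 
is roughly:

# dumpadm        (confirm the dump device and savecore directory)
# savecore -L    (take a crash dump of the running system without rebooting)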

Dave

On 08/30/10 16:15, Charles J. Knipe wrote:

David,

Thanks for your reply.  Answers to your questions are below.

  

Is it just ZFS hanging (or what appears to be ZFS slowing down or
blocking), or does the whole system hang?



Only the ZFS storage is affected.  Any attempt to write to it blocks until the 
issue passes.  Other than that the system behaves normally.  I have not, as far 
as I remember, tried writing to the root pool while this is going on, I'll have 
to check that next time.  I suspect the problem is likely limited to a single 
pool.

  

What does iostat show during the time period of the
slowdown?
What does mpstat show during the time of the
slowdown?

You can look at the metadata statistics by running
the following.
echo ::arc | mdb -k
When looking at a ZFS problem, I usually like to
gather
echo ::spa | mdb -k
echo ::zio_state | mdb -k



I will plan to dump information from all of these sources next time I can catch 
it in the act.  Any other diag commands you think might be useful?

  

I suspect you could drill down more with dtrace or
lockstat to see
where the slowdown is happening.



I'm brand new to DTrace.  I'm doing some reading now toward being in a position 
to ask intelligent questions.

-Charles
  



--



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel panic on ZFS import - how do I recover?

2010-09-23 Thread David Blasingame Oracle
Have you tried setting zfs_recover and aok in /etc/system, or setting them 
with mdb?


Read how to set via /etc/system
http://opensolaris.org/jive/thread.jspa?threadID=114906

mdb debugger
http://www.listware.net/201009/opensolaris-zfs/46706-re-zfs-discuss-how-to-set-zfszfsrecover1-and-aok1-in-grub-at-startup.html

After you get the variables set and system booted, try importing, then 
running a scrub.
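
For reference, the /etc/system entries look roughly like this (verify the 
exact names against the links above for your build):

set zfs:zfs_recover = 1
set aok = 1

or, on the running system:

echo "aok/W 1" | mdb -kw
echo "zfs_recover/W 1" | mdb -kw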


Dave

On 09/23/10 19:48, Scott Meilicke wrote:
I posted this on the www.nexentastor.org forums, but no answer so far, so I apologize if you are seeing this twice. I am also engaged with nexenta support, but was hoping to get some additional insights here. 


I am running nexenta 3.0.3 community edition, based on 134. The box crashed 
yesterday, and goes into a reboot loop (kernel panic) when trying to import my 
data pool, screenshot attached. What I have tried thus far:

Boot off of DVD, both 3.0.3 and 3.0.4 beta 8. 'zpool import -f data01' causes 
the panic in both cases.
Boot off of 3.0.4 beta 8, ran zpool import -fF data01
That gives me a message like Pool data01 returned to its state as of ..., and 
then panics.

The import -fF does seem to import the pool, but then it immediately panics. So after booting off of the DVD, I can boot from my hard disks, and the system will not import the pool because it was last imported from another system. 


I have moved /etc/zfs/zfs.cache out of the way, but no luck after a reboot and 
import.

zpool import shows all of my disks are OK, and the pool itself is online.

Is it time to start working with zdb? Any suggestions?

This box is hosting development VMs, so I have some people idling their thumbs 
at the moment.

Thanks everyone,

-Scott
  



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  



--


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Data transfer taking a longer time than expected (Possibly dedup related)

2010-09-24 Thread David Blasingame Oracle

How do you know it is dedup causing the problem?

You can check how much time is being spent in the dedup code by looking at the threads (look for ddt functions):

mdb -k

::threadlist -v

or dtrace it.

fbt:zfs:ddt*:entry
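
For example, a quick count of which ddt functions are firing (a sketch; the 
exact probe names can vary by build):

dtrace -n 'fbt:zfs:ddt_*:entry { @[probefunc] = count(); }'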

You can disable dedup.  I believe data that has already been dedup'd stays 
that way until it gets overwritten.  I'm not sure what send would do, but I 
would assume that if dedup is not enabled on the new filesystem, it would 
not contain dedup'd data. 


You might also want to read.

http://blogs.sun.com/roch/entry/dedup_performance_considerations1

As far as the impact of Ctrl-C on a move operation: when I test moving a 
file from one file system to another and Ctrl-C the operation, the file is 
intact on the original filesystem and only partially copied on the new 
filesystem.  So you would have to be careful about which data has already 
been copied.


Dave

On 09/24/10 14:34, Thomas S. wrote:

Hi all

I'm currently moving a fairly big dataset (~2TB) within the same zpool. Data is 
being moved from a dataset to another, which has dedup enabled.

The transfer started at quite a slow transfer speed — maybe 12MB/s. But it is 
now crawling to a near halt. Only 800GB has been moved in 48 hours.

I looked for similar problems on the forums and other places, and it seems 
dedup needs a much bigger amount of RAM than the server currently has (3GB), to 
perform smoothly for such an operation.

My question is, how can I gracefully stop the ongoing operation? What I did was simply 
mv temp/* new/ in an ssh session (which is still open).

Can I disable dedup on the dataset while the transfer is going on? Can I simply 
Ctrl-C the process to stop it? Should I be careful of anything?

Help would be appreciated
  



--




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] My filesystem turned from a directory into a special character device

2010-09-29 Thread David Blasingame Oracle

Interesting thread.  So how would you go about fixing this?

I suspect you have to track down the vnode and znode_t and eventually 
modify one of the kernel buffers for the znode_phys_t.  If the alternative 
is to completely rebuild the filesystem, then repairing it in place might 
be the only choice some people have.


Dave

On 09/27/10 11:56, Victor Latushkin wrote:

On Sep 27, 2010, at 8:30 PM, Scott Meilicke wrote:

  
I am running nexenta CE 3.0.3. 


I have a file system that at some point in the last week went from a directory 
per 'ls -l' to a  special character device. This results in not being able to 
get into the file system. Here is my file system, scott2, along with a new file 
system I  just created, as seen by ls -l:

drwxr-xr-x 4 root root4 Sep 27 09:14 scott
crwxr-xr-x 9 root root 0, 0 Sep 20 11:51 scott2

Notice the 'c' vs. 'd' at the beginning of the permissions list. I had been 
fiddling with permissions last week, then had problems with a kernel panic.



Are you still running with aok/zfs_recover set? Have you seen this issue before the panic? 

  

Perhaps this is related?



May be.

  

Any ideas how to get access to my file system?



This can be fixed, but it is a bit more complicated and error-prone than 
setting a couple of variables.

Regards
Victor
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  



--




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Is there a way to limit ZFS File Data but maintain room for the ARC to cache metadata

2010-10-01 Thread David Blasingame Oracle
I'm working on a scenario in which file system activity appears to 
cause the ARC to evict metadata.  I would like to have a 
preference to keep the metadata in cache over ZFS File Data.

What I've noticed is that on import of a zpool, arc_meta_used goes up 
significantly.  ZFS metadata operations usually run pretty well.  
However, over time with IO operations the cache gets evicted and 
arc_no_grow gets set.



-bash-3.00# echo ::arc | mdb -k| grep arc
arc_no_grow   = 1
arc_tempreserve   = 0 MB
arc_meta_used =   277 MB
arc_meta_limit=  3789 MB
arc_meta_max  =  1951 MB

-bash-3.00# echo ::arc | mdb -k| grep arc
arc_no_grow   = 1
arc_tempreserve   = 0 MB
arc_meta_used =91 MB
arc_meta_limit=  3789 MB
arc_meta_max  =  1951 MB

This has the adverse effect of making zfs commands take longer.

-bash-3.00# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                    1185002              4628   29%
ZFS File Data             2186752              8542   53%
Anon                        41183               160    1%
Exec and libs                2378                 9    0%
Page cache                  13202                51    0%
Free (cachelist)           518567              2025   13%
Free (freelist)            195901               765    5%

Total                     4142985             16183
Physical                  4054870             15839

So, I would like to limit the amount of ZFS File Data that can be used 
and keep the arc cache warm with metadata.  Any suggestions?


Thanks

Dave

--




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is there a way to limit ZFS File Data but maintain room for the ARC to cache metadata

2010-10-06 Thread David Blasingame Oracle
Good idea.  That provides options, but it would be nice to be able to set a 
low water mark on what can be taken away from the ARC metadata cache 
without having to have something like an SSD.

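If I end up trying the L2ARC route, the knobs would look something like this 
(pool and device names are just placeholders):

zpool add pool0 cache c4t2d0
zfs set primarycache=metadata pool0
zfs set secondarycache=all pool0

Without an SSD, the only blunt instrument I know of is capping the ARC itself 
in /etc/system (for example, set zfs:zfs_arc_max = 0x200000000 for an 8 GB 
cap), which limits how much file data the ARC can hold but still isn't a real 
low water mark for metadata.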

Dave

On 10/01/10 14:02, Freddie Cash wrote:

On Fri, Oct 1, 2010 at 11:46 AM, David Blasingame Oracle
david.blasing...@oracle.com wrote:
  

I'm working on this scenario in which file system activity appears to cause
the arc cache to evict meta data.  I would like to have a preference to keep
the metadata in cache over ZFS File Data

What I've notice on import of a zpool the arc_meta_used goes up
significantly.  ZFS meta data operations usually run pretty good.  However
over time with IO Operations the cache get's evicted and arc_no_grow get
set.



snip

  

So, I would like to limit the amount of ZFS File Data that can be used and
keep the arc cache warm with metadata.  Any suggestions?



Would adding a cache device (L2ARC) and setting primarycache=metadata
and secondarycache=all on the root dataset do what you need?

That way ARC is used strictly for metadata, and L2ARC is used for metadata+data.

  



--



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Supermicro AOC-USAS2-L8i

2010-10-12 Thread David Blasingame Oracle

You might want to check this post.

http://opensolaris.org/jive/thread.jspa?threadID=122156

Dave

On 10/12/10 07:30, Alexander Lesle wrote:

Hello guys,

I want to built a new NAS and I am searching for a controller.
At supermicro I found this new one with the LSI 2008 controller.
http://www.supermicro.com/products/accessories/addon/AOC-USAS2-L8i.cfm?TYP=I

Who can confirm that this card runs under OSOL build134 or solaris10?
Why this card? Because it supports 6.0 Gb/s SATA.

  



--





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs diff cannot stat shares

2010-10-14 Thread David Blasingame Oracle

zfs diff is a new feature that lists the file differences between snapshots:

http://arc.opensolaris.org/caselog/PSARC/2010/105/mail

Dave

On 10/13/10 15:48, Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of dirk schelfhout

Wanted to test the zfs diff command and ran into this.



What's zfs diff?  I know it's been requested, but AFAIK, not implemented
yet.  Is that new feature being developed now or something?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  



--





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import panics

2010-11-11 Thread David Blasingame Oracle
The vmdump.0 is a compressed crash dump.  You will need to convert it to 
a format that can be read.


#  savecore -f ./vmdump.0  ./

This will create a couple of files; the ones you will need next are 
unix.0 and vmcore.0.  Use mdb to print out the stack.


#  mdb unix.0 vmcore.0

Run the following to print the stack.  This will at least tell you which 
function the system is panicking in.  You could then do a SunSolve 
or Google search on it.


$C

and gather zio_state data

::zio_state

And check the msgbuf to see if there are any hardware problems.

::msgbuf

Then quit mdb.  More drill down would be dependent on what you see.

::quit
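
If you would rather capture it all in one pass instead of an interactive 
session, something like this works too:

printf '$C\n::zio_state\n::msgbuf\n' | mdb unix.0 vmcore.0 > panic-info.txt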

Dave

On 11/11/10 08:51, Stephan Budach wrote:

Am 11.11.10 14:26, schrieb Steve Gonczi:

Dumpadm should tell you how your
Dumps are set up
Also you could load mdb before importing


I have located the dump, it's called vmdump.0. I also loaded mdb 
before I imported the pool, but that didn't help. Actually I tried it 
this way:


mdb -K -F
:c
zpool import -f -o readonly pool
Afterwards I tried to get back to mdb by hitting F1-a, but that didn't 
work - it was only printing 'a' on the console. Otherwise I would have 
tried to force a system dump, but that didn't come to pass.


Is there anything I can do with the vmdump.0 file? Unfortunately I am 
not a kernel hacker…


Thanks




-




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import panics

2010-11-11 Thread David Blasingame Oracle
In this function, the second argument is a pointer to the osname 
(the mount point).  You can dump out the string to see what it is.

ff0023b7db50 zfs_domount+0x17c(ff0588aaf698, ff0580cb3d80)

mdb unix.0 vmcore.0

ff0580cb3d80/S

This should print out the offending filesystem.  You could then try importing 
the pool read-only (-o ro) and setting that filesystem to readonly 
(zfs set readonly=on fs).

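A sketch of that sequence (the filesystem name is a placeholder; depending on 
your build the read-only import may be spelled -o ro or -o readonly=on):

# zpool import -f -o readonly=on obelixData
# zfs set readonly=on obelixData/<offending-fs>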

Dave

On 11/11/10 09:37, Stephan Budach wrote:

David,

thanks so much (and of course to all other helpful souls here as well) 
for providing such great guidance!

Here we go:

Am 11.11.10 16:17, schrieb David Blasingame Oracle:
The vmdump.0 is a compressed crash dump.  You will need to convert it 
to a format that can be read.


#  savecore -f ./vmdump.0  ./

This will create a couple of files, but the ones you will need next 
is unix.0  vmcore.0.  Use mdb to print out the stack.


#  mdb unix.0 vmcore.0

run the following to print the stack.  This would at least tell you 
what function the system is having a panic in.  You could then do a 
sunsolve search or google search.


$C

 $C
ff0023b7d450 zap_leaf_lookup_closest+0x40(ff0588c61750, 0, 0,
ff0023b7d470)
ff0023b7d4e0 fzap_cursor_retrieve+0xc9(ff0588c61750, 
ff0023b7d5c0,

ff0023b7d600)
ff0023b7d5a0 zap_cursor_retrieve+0x19a(ff0023b7d5c0, 
ff0023b7d600)

ff0023b7d780 zfs_purgedir+0x4c(ff0581079260)
ff0023b7d7d0 zfs_rmnode+0x52(ff0581079260)
ff0023b7d810 zfs_zinactive+0xb5(ff0581079260)
ff0023b7d860 zfs_inactive+0xee(ff058118ae00, ff056ac3c108, 0)
ff0023b7d8b0 fop_inactive+0xaf(ff058118ae00, ff056ac3c108, 0)
ff0023b7d8d0 vn_rele+0x5f(ff058118ae00)
ff0023b7dac0 zfs_unlinked_drain+0xaf(ff05874c8b00)
ff0023b7daf0 zfsvfs_setup+0xfb(ff05874c8b00, 1)
ff0023b7db50 zfs_domount+0x17c(ff0588aaf698, ff0580cb3d80)
ff0023b7dc70 zfs_mount+0x1e4(ff0588aaf698, ff0588a9f100,
ff0023b7de20, ff056ac3c108)
ff0023b7dca0 fsop_mount+0x21(ff0588aaf698, ff0588a9f100,
ff0023b7de20, ff056ac3c108)
ff0023b7de00 domount+0xae3(0, ff0023b7de20, ff0588a9f100,
ff056ac3c108, ff0023b7de18)
ff0023b7de80 mount+0x121(ff0580c7e548, ff0023b7de98)
ff0023b7dec0 syscall_ap+0x8c()
ff0023b7df10 _sys_sysenter_post_swapgs+0x149()





and gather zio_state data

::zio_state

 ::zio_state
ADDRESS          TYPE  STAGE            WAITER
ff0584be2348 NULL  OPEN -
ff0570ebcc88 NULL  OPEN -




And check the msgbuf to see if there are any hardware problems.

::msgbuf


 ::msgbuf
MESSAGE
QEL qlc(0,0): ql_status_error, check condition sense data, 
d_id=50925h, lun=0h

70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
QEL qlc(0,0): ql_status_error, check condition sense data, 
d_id=50aefh, lun=0h

70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
WARNING: pool 'obelixData' could not be loaded as it was last accessed 
by another system (host

: opensolaris hostid: 0x75b3c). See: http://www.sun.com/msg/ZFS-8000-EY
QEL qlc(0,0): ql_status_error, check condition sense data, 
d_id=50925h, lun=0h

70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
QEL qlc(0,0): ql_status_error, check condition sense data, 
d_id=50aefh, lun=0h

70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
WARNING: pool 'obelixData' could not be loaded as it was last accessed 
by another system (host

: opensolaris hostid: 0x75b3c). See: http://www.sun.com/msg/ZFS-8000-EY
QEL qlc(0,0): ql_status_error, check condition sense data, 
d_id=50aefh, lun=0h

70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
QEL qlc(0,0): ql_status_error, check condition sense data, 
d_id=50925h, lun=0h

70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
WARNING: pool 'obelixData' could not be loaded as it was last accessed 
by another system (host

: opensolaris hostid: 0x75b3c). See: http://www.sun.com/msg/ZFS-8000-EY
pseudo-device: devinfo0
devinfo0 is /pseudo/devi...@0
pcplusmp: asy (asy) instance 0 irq 0x4 vector 0xb0 ioapic 0x0 intin 
0x4 is bound to cpu 2

ISA-device: asy0
asy0 is /p...@0,0/i...@1f/a...@1,3f8
pcplusmp: asy (asy) instance 1 irq 0x3 vector 0xb1 ioapic 0x0 intin 
0x3 is bound to cpu 3

ISA-device: asy1
asy1 is /p...@0,0/i...@1f/a...@1,2f8
pseudo-device: ucode0
ucode0 is /pseudo/uc...@0
sgen0 at ata0: target 0 lun 0
sgen0 is /p...@0,0/pci-...@1f,2/i...@0/s...@0,0
sgen2 at mega_sas0: target 0 lun 1
sgen2 is /p...@0,0/pci8086,2...@1c/pci1028,1...@0/s...@0,1
QEL qlc(0,0): ql_status_error, check condition sense data, 
d_id=50925h, lun=0h

70h  0h  5h  0h  0h  0h  0h  ah  0h  0h  0h  0h 26h  0h  0h  0h  0h  0h
QEL qlc(0,0): ql_status_error, check condition sense data, 
d_id=50925h, lun=0h

70h  0h  5h  0h  0h  0h  0h  ah  0h

Re: [zfs-discuss] ZFS Write Performance Issues

2011-02-03 Thread David Blasingame Oracle
Can you clarify what you mean by ZFS write performance issues?  A single 
kstat isn't very helpful, at least not to me.  A few samples taken over a 
couple of seconds while you are hitting the write performance issue would be 
more useful.  A zpool iostat 1 may also be helpful.


As for the kstat data below, this one is interesting.  It is the number of 
times ZFS had to throttle ARC growth: 


zfs:0:arcstats:memory_throttle_count    6508
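
To see whether that counter is still climbing while writes are slow, sample it 
over time, e.g.:

kstat -p zfs:0:arcstats:memory_throttle_count 5 12
zpool iostat 1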

Dave

On 02/03/11 08:31, Tony MacDoodle wrote:
We seem to be having write issues with zfs, does anyone see anything 
in the following:
 
bash-3.00# kstat -p -n arcstats

zfs:0:arcstats:c        655251456
zfs:0:arcstats:c_max    5242011648
zfs:0:arcstats:c_min    655251456
zfs:0:arcstats:class    misc
zfs:0:arcstats:crtime   5699201.4918501
zfs:0:arcstats:data_size        331404288
zfs:0:arcstats:deleted  408216
zfs:0:arcstats:demand_data_hits 4316945
zfs:0:arcstats:demand_data_misses   113229
zfs:0:arcstats:demand_metadata_hits 2250630
zfs:0:arcstats:demand_metadata_misses   94943
zfs:0:arcstats:evict_l2_cached  0
zfs:0:arcstats:evict_l2_eligible        21616669184
zfs:0:arcstats:evict_l2_ineligible  13499421184
zfs:0:arcstats:evict_skip   17589309
zfs:0:arcstats:hash_chain_max   5
zfs:0:arcstats:hash_chains  2403
zfs:0:arcstats:hash_collisions  288486
zfs:0:arcstats:hash_elements        26930
zfs:0:arcstats:hash_elements_max    54713
zfs:0:arcstats:hdr_size 5582496
zfs:0:arcstats:hits 6802606
zfs:0:arcstats:l2_abort_lowmem  0
zfs:0:arcstats:l2_cksum_bad 0
zfs:0:arcstats:l2_evict_lock_retry  0
zfs:0:arcstats:l2_evict_reading 0
zfs:0:arcstats:l2_feeds 0
zfs:0:arcstats:l2_free_on_write 0
zfs:0:arcstats:l2_hdr_size  0
zfs:0:arcstats:l2_hits  0
zfs:0:arcstats:l2_io_error  0
zfs:0:arcstats:l2_misses        0
zfs:0:arcstats:l2_read_bytes    0
zfs:0:arcstats:l2_rw_clash  0
zfs:0:arcstats:l2_size  0
zfs:0:arcstats:l2_write_bytes   0
zfs:0:arcstats:l2_writes_done   0
zfs:0:arcstats:l2_writes_error  0
zfs:0:arcstats:l2_writes_hdr_miss   0
zfs:0:arcstats:l2_writes_sent   0
zfs:0:arcstats:memory_throttle_count    6508
zfs:0:arcstats:mfu_ghost_hits   88648
zfs:0:arcstats:mfu_hits 5486801
zfs:0:arcstats:misses   451045
zfs:0:arcstats:mru_ghost_hits   79387
zfs:0:arcstats:mru_hits 1112447
zfs:0:arcstats:mutex_miss   367
zfs:0:arcstats:other_size   309580160
zfs:0:arcstats:p        630831672
zfs:0:arcstats:prefetch_data_hits   107952
zfs:0:arcstats:prefetch_data_misses 212200
zfs:0:arcstats:prefetch_metadata_hits   127079
zfs:0:arcstats:prefetch_metadata_misses 30673
zfs:0:arcstats:recycle_miss 209685
zfs:0:arcstats:size 646566944
zfs:0:arcstats:snaptime 5944007.85720004
 
 
Thanks



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  



--





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS Performance

2011-02-25 Thread David Blasingame Oracle

Hi All,

In reading the ZFS Best Practices Guide, I'm curious whether this statement 
about keeping utilization under 80% is still true.


from :  
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide





 
Storage Pool Performance Considerations

.
Keep pool space under 80% utilization to maintain pool performance. 
Currently, pool performance can degrade when a pool is very full and 
file systems are updated frequently, such as on a busy mail server. Full 
pools might cause a performance penalty, but no other issues.
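
For anyone checking where a pool stands relative to that threshold, a quick 
look (column names vary slightly between releases):

zpool list         (the CAP column is the percentage of pool space in use)
zfs list -o space  (per-dataset breakdown of used and available space)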




Dave


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss