[zfs-discuss] ZFS bug - should I be worried about this?

2010-06-28 Thread Gabriele Bulfon
I found this today:

http://blog.lastinfirstout.net/2010/06/sunoracle-finally-announces-zfs-data.html?utm_source=feedburnerutm_medium=feedutm_campaign=Feed%3A+LastInFirstOut+%28Last+In%2C+First+Out%29utm_content=FriendFeed+Bot

How can I be sure my Solaris 10 systems are fine?
Is latest OpenSolaris (134) safe?

Thx
Gabriele.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS root recovery SMI/EFI label weirdness

2010-06-28 Thread Sean .
Thanks I don't know how I missed it.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS bug - should I be worried about this?

2010-06-28 Thread Ian Collins

On 06/28/10 08:15 PM, Gabriele Bulfon wrote:

I found this today:

http://blog.lastinfirstout.net/2010/06/sunoracle-finally-announces-zfs-data.html?utm_source=feedburnerutm_medium=feedutm_campaign=Feed%3A+LastInFirstOut+%28Last+In%2C+First+Out%29utm_content=FriendFeed+Bot

How can I be sure my Solaris 10 systems are fine?
Is latest OpenSolaris (134) safe?
   


Did you read the Sunsolve document?

b134 is not vulnerable.

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on Ubuntu

2010-06-28 Thread Joe Little
All true. I just saw too many "I need Ubuntu and ZFS" requests and thought to state the 
obvious, in case the patch set for Nexenta happens to differ enough to provide a 
working set. I've had Nexenta succeed where OpenSolaris quarterly releases failed, 
and vice versa.

On Jun 27, 2010, at 9:54 PM, Erik Trimble erik.trim...@oracle.com wrote:

 On 6/27/2010 9:07 PM, Richard Elling wrote:
 On Jun 27, 2010, at 8:52 PM, Erik Trimble wrote:
 
   
 But that won't solve the OP's problem, which was that OpenSolaris doesn't 
 support his hardware. Nexenta has the same hardware limitations as 
 OpenSolaris.
 
 AFAICT, the OP's problem is with a keyboard.  The vagaries of keyboards
 are well documented, but there is no silver bullet. Indeed, I have one box that
 seems to be more or less happy with PS/2 vs. USB for every other OS or
 hypervisor. My advice: have one of each handy, just in case.
  -- richard
 
   
 
 Right. I was just pointing out the fallacy of thinking that Nexenta might 
 work on hardware that OpenSolaris doesn't (or has problems with).
 
 
 
 -- 
 Erik Trimble
 Java System Support
 Mailstop:  usca22-123
 Phone:  x17195
 Santa Clara, CA
 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel panic on zpool status -v (build 143)

2010-06-28 Thread Andrej Podzimek

I ran 'zpool scrub' and will report what happens once it's finished. (It will 
take pretty long.)


The scrub finished successfully (with no errors) and 'zpool status -v' doesn't 
crash the kernel any more.

Andrej



smime.p7s
Description: S/MIME Cryptographic Signature
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS bug - should I be worried about this?

2010-06-28 Thread Gabriele Bulfon
Yes, I did read it.
And what worries me is patches availability...
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on Ubuntu

2010-06-28 Thread Roy Sigurd Karlsbakk
I think ZFS on Ubuntu is currently a rather bad idea. See the test below with 
Ubuntu Lucid 10.04 (amd64).

r...@bigone:~# cat /proc/partitions 
major minor  #blocks  name

   8        0  312571224 sda
   8        1     979933 sda1
   8        2    3911827 sda2
   8        3   48829567 sda3
   8        4          1 sda4
   8        5   49287388 sda5
   8        6   49287388 sda6
   8        7   49287388 sda7
   8        8   49287388 sda8
   8        9   49287388 sda9
   8       10   12410181 sda10
r...@bigone:~# zpool create zowhat raidz2 sda5 sda6 sda7 sda8 sda9
cannot open 'zowhat': dataset does not exist
r...@bigone:~# zpool status
  pool: zowhat
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zowhat      ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            sda5    ONLINE       0     0     0
            sda6    ONLINE       0     0     0
            sda7    ONLINE       0     0     0
            sda8    ONLINE       0     0     0
            sda9    ONLINE       0     0     0

errors: No known data errors
r...@bigone:~# zpool list
NAME     SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
-           -      -      -      -       -        -
r...@bigone:~# zfs list
no datasets available
r...@bigone:~# 
 

- Original Message -
 All true, I just saw too many need ubuntu and zfs and thought to
 state the obvious in case the patch set for nexenta happen to differ
 enough to provide a working set. I've had nexenta succeed where
 opensolaris quarter releases failed and vice versa
 
 On Jun 27, 2010, at 9:54 PM, Erik Trimble erik.trim...@oracle.com
 wrote:
 
  On 6/27/2010 9:07 PM, Richard Elling wrote:
  On Jun 27, 2010, at 8:52 PM, Erik Trimble wrote:
 
 
  But that won't solve the OP's problem, which was that OpenSolaris
  doesn't support his hardware. Nexenta has the same hardware
  limitations as OpenSolaris.
 
  AFAICT, the OP's problem is with a keyboard. The vagaries of
  keyboards
  is well documented, but there is no silver bullet. Indeed, I have
  one box that
  seems to be more or less happy with PS-2 vs USB for every other OS
  or
  hypervisor. My advice, have one of each handy, just in case.
   -- richard
 
 
 
  Right. I was just pointing out the fallacy of thinking that Nexenta
  might work on hardware that OpenSolaris doesn't (or has problems
  with).
 
 
 
  --
  Erik Trimble
  Java System Support
  Mailstop: usca22-123
  Phone: x17195
  Santa Clara, CA
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS bug - should I be worried about this?

2010-06-28 Thread Gabriele Bulfon
Mmm... I double-checked some of the running systems.
Most of them have the first patch (sparc-122640-05 and x86-122641-06), but not 
the second one (sparc-142900-09 and x86-142901-09)...

...I feel I'm right in the middle of the problem...
How much am I risking?! These systems are all mirrored via zpool...

Would this really make me safe without patching?? :

set zfs:zfs_immediate_write_sz=10
set zfs:zvol_immediate_write_sz=10

Or would a log device be preferred?

*sweat*
These systems have all been running for years now... and I considered them safe...
Have I been at risk all this time?!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS bug - should I be worried about this?

2010-06-28 Thread Dick Hoogendijk

On 28-6-2010 12:13, Gabriele Bulfon wrote:


*sweat*
These systems have all been running for years now... and I considered them safe...
Have I been at risk all this time?!


They're still running, are they not? So, stop sweating. <g>
But you're right about the changed patching service from Oracle.
It sucks big time. Security patches should be available, even if the OS is 
free. You can't expect users to run unsafe systems just because they 
have not paid for the OS. After all, it's Oracle (Sun) who gives away 
the OS.


--
+ All that's really worth doing is what we do for others (Lewis Carroll)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS bug - should I be worried about this?

2010-06-28 Thread Gabriele Bulfon
Yes... they're still running... but being aware that a power failure causing an 
unexpected poweroff may make the pool unreadable is a pain.

Yes, patches should be available.
Or adoption may drop a lot...
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS bug - should I be worried about this?

2010-06-28 Thread Victor Latushkin

On 28.06.10 16:16, Gabriele Bulfon wrote:

Yes...they're still running...but being aware that a power failure causing an
unexpected poweroff may make the pool unreadable is a pain


Pool integrity is not affected by this issue.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers

2010-06-28 Thread Frank Cusack

On 6/26/10 9:47 AM -0400 David Magda wrote:

Crikey. Who's the genius who thinks of these URLs?


SEOs
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS bug - should I be worried about this?

2010-06-28 Thread Garrett D'Amore
On Mon, 2010-06-28 at 05:16 -0700, Gabriele Bulfon wrote:
 Yes...they're still running...but being aware that a power failure causing an 
 unexpected poweroff may make the pool unreadable is a pain
 
 Yes. Patches should be available.
 Or adoption may be lowering a lot...


I don't have access to the information, but if this problem is the same
one I think it is, then the pool does not become unreadable.  Rather,
its state after such an event represents a *consistent* state from some
point in time *earlier* than the last confirmed fsync() (or a write on a
file opened with O_SYNC or O_DSYNC).

For most users, this is not a critical failing.  For users using
databases or requiring transactional integrity for data stored on ZFS,
then yes, this is a very nasty problem indeed.

I suspect that this is the problem I reported earlier in my blog
(http://gdamore.blogspot.com) about certain kernels having O_SYNC and
O_DSYNC problems.  I can't confirm this though, because I don't have
access to the SunSolve database to read the report.

(This is something I'll have to check into fixing... it seems like my
employer ought to have access to that information...)

- Garrett

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Announce: zfsdump

2010-06-28 Thread Tristram Scott
For quite some time I have been using zfs send -R fsn...@snapname | dd 
of=/dev/rmt/1ln to make a tape backup of my zfs file system.  A few weeks back 
the size of the file system grew to larger than would fit on a single DAT72 
tape, and I once again searched for a simple solution to allow dumping of a zfs 
file system to multiple tapes.  Once again I was disappointed...

I expect there are plenty of other ways this could have been handled, but none 
leapt out at me.  I didn't want to pay large sums of cash for a commercial 
backup product, and I didn't see that Amanda would be an easy thing to fit into 
my existing scripts.  In particular (and I could well be reading this 
incorrectly), it seems that the commercial products, Amanda, and star all 
dump the zfs file system file by file (with or without ACLs).  I found none 
which would allow me to dump the file system and its snapshots, unless I used 
zfs send to a scratch disk, and dumped to tape from there.  But, of course, 
that assumes I have a scratch disk large enough.

So, I have implemented zfsdump as a ksh script.  The method is as follows:
1. Make a bunch of fifos.
2. Pipe the stream from zfs send to split, with split writing to the fifos (in 
sequence).
3. Use dd to copy from the fifos to tape(s).

When the first tape is complete, zfsdump returns.  One then calls it again, 
specifying that the second tape is to be used, and so on.

From the man page:

 Example 1.  Dump the @Tues snapshot of the  tank  filesystem
 to  the  non-rewinding,  non-compressing  tape,  with a 36GB
 capacity:

  zfsdump -z t...@tues -a -R -f /dev/rmt/1ln  -s  36864 -t 0

 For the second tape:

  zfsdump -z t...@tues -a -R -f /dev/rmt/1ln  -s  36864 -t 1

If you would like to try it out, download the package from:
http://www.quantmodels.co.uk/zfsdump/

I have packaged it up, so do the usual pkgadd stuff to install.

Please, though, [b]try this out with caution[/b].  Build a few test file 
systems, and see that it works for you. 
[b]It comes without warranty of any kind.[/b]


Tristram
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Announce: zfsdump

2010-06-28 Thread Brian Kolaci

I use Bacula which works very well (much better than Amanda did).
You may be able to customize it to do direct zfs send/receive, however I find 
that although they are great for copying file systems to other machines, they 
are inadequate for backups unless you always intend to restore the whole file 
system.  Most people want to restore a file or directory tree of files, not a 
whole file system.  In the past 25 years of backups and restores, I've never 
had to restore a whole file system.  I get requests for a few files, or 
somebody's mailbox or somebody's http document root.
You can directly install it from CSW (or blastwave).

On 6/28/2010 11:26 AM, Tristram Scott wrote:

For quite some time I have been using zfs send -R fsn...@snapname | dd 
of=/dev/rmt/1ln to make a tape backup of my zfs file system.  A few weeks back 
the size of the file system grew to larger than would fit on a single DAT72 
tape, and I once again searched for a simple solution to allow dumping of a zfs 
file system to multiple tapes.  Once again I was disappointed...

I expect there are plenty of other ways this could have been handled, but none 
leapt out at me.  I didn't want to pay large sums of cash for a commercial 
backup product, and I didn't see that Amanda would be an easy thing to fit into 
my existing scripts.  In particular, (and I could well be reading this 
incorrectly) it seems that the commercial products, Amanda, star, all are 
dumping the zfs file system file by file (with or without ACLs).  I found none 
which would allow me to dump the file system and its snapshots, unless I used 
zfs send to a scratch disk, and dumped to tape from there.  But, of course, 
that assumes I have a scratch disk large enough.

So, I have implemented zfsdump as a ksh script.  The method is as follows:
1. Make a bunch of fifos.
2. Pipe the stream from zfs send to split, with split writing to the fifos (in 
sequence).
3. Use dd to copy from the fifos to tape(s).

When the first tape is complete, zfsdump returns.  One then calls it again, 
specifying that the second tape is to be used, and so on.

 From the man page:

  Example 1.  Dump the @Tues snapshot of the  tank  filesystem
  to  the  non-rewinding,  non-compressing  tape,  with a 36GB
  capacity:

   zfsdump -z t...@tues -a -R -f /dev/rmt/1ln  -s  36864 -t 0

  For the second tape:

   zfsdump -z t...@tues -a -R -f /dev/rmt/1ln  -s  36864 -t 1

If you would like to try it out, download the package from:
http://www.quantmodels.co.uk/zfsdump/

I have packaged it up, so do the usual pkgadd stuff to install.

Please, though, [b]try this out with caution[/b].  Build a few test file 
systems, and see that it works for you.
[b]It comes without warranty of any kind.[/b]


Tristram


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)

2010-06-28 Thread Andrew Jones
Now at 36 hours since zdb process start and:


 PID USERNAME  SIZE   RSS STATE  PRI NICE  TIME  CPU PROCESS/NLWP
   827 root     4936M 4931M sleep   59    0   0:50:47 0.2% zdb/209

Idling at 0.2% processor for nearly the past 24 hours... feels very stuck. 
Thoughts on how to determine where and why?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Announce: zfsdump

2010-06-28 Thread Tristram Scott
 I use Bacula which works very well (much better than
 Amanda did).
 You may be able to customize it to do direct zfs
 send/receive, however I find that although they are
 great for copying file systems to other machines,
 they are inadequate for backups unless you always
 intend to restore the whole file system.  Most people
 want to restore a file or directory tree of files,
 not a whole file system.  In the past 25 years of
 backups and restores, I've never had to restore a
 whole file system.  I get requests for a few files,
 or somebody's mailbox or somebody's http document
 root.
 You can directly install it from CSW (or blastwave).

Thanks for your comments, Brian.  I should look at Bacula in more detail.

As for full restore versus ad hoc requests for files I just deleted, my 
experience is mostly similar to yours, although I have had need for full system 
restore more than once.

For the restore of a few files here and there, I believe this is now well 
handled with zfs snapshots.  I have always found these requests to be down to 
human actions.  The need for full system restore has (almost) always been 
hardware failure. 

If the file was there an hour ago, or yesterday, or last week, or last month, 
then we have it in a snapshot.

If the disk died horribly during a power outage (grrr!) then it would be very 
nice to be able to restore not only the full file system, but also the 
snapshots too.  The only way I know of achieving that is by using zfs send etc. 
 

 
 On 6/28/2010 11:26 AM, Tristram Scott wrote:
[snip]

 
  Tristram
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discu
 ss

-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Announce: zfsdump

2010-06-28 Thread Brian Kolaci

On Jun 28, 2010, at 12:26 PM, Tristram Scott wrote:

 I use Bacula which works very well (much better than
 Amanda did).
 You may be able to customize it to do direct zfs
 send/receive, however I find that although they are
 great for copying file systems to other machines,
 they are inadequate for backups unless you always
 intend to restore the whole file system.  Most people
 want to restore a file or directory tree of files,
 not a whole file system.  In the past 25 years of
 backups and restores, I've never had to restore a
 whole file system.  I get requests for a few files,
 or somebody's mailbox or somebody's http document
 root.
 You can directly install it from CSW (or blastwave).
 
 Thanks for your comments, Brian.  I should look at Bacula in more detail.
 
 As for full restore versus ad hoc requests for files I just deleted, my 
 experience is mostly similar to yours, although I have had need for full 
 system restore more than once.
 
 For the restore of a few files here and there, I believe this is now well 
 handled with zfs snapshots.  I have always found these requests to be down to 
 human actions.  The need for full system restore has (almost) always been 
 hardware failure. 
 
 If the file was there an hour ago, or yesterday, or last week, or last month, 
 then we have it in a snapshot.
 
 If the disk died horribly during a power outage (grrr!) then it would be very 
 nice to be able to restore not only the full file system, but also the 
 snapshots too.  The only way I know of achieving that is by using zfs send 
 etc.  
 

I like snapshots when I'm making a major change to the system, or for cloning.  
So to me, snapshots are good for transaction-based operations.  Such as 
stopping and flushing a database, taking a snapshot, then resuming the database.  
Then you can back up the snapshot with Bacula and destroy the snapshot when the 
backup is complete.  I have Bacula configured with pre-backup and post-backup 
scripts to do just that.  When you do the restore, it will create something 
that looks like a snapshot from the file system perspective, but isn't really 
one.

But if you're looking for a copy of a file from a specific date, Bacula retains 
that.  In fact, you specify the retention period you want and you'll have access 
to any/all individual files on a per-date basis.  You can retain the files for 
months or years if you like, and you specify in the Bacula config file how long 
you want to keep the tapes around.  So it really comes down to your 
use case.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)

2010-06-28 Thread Andrew Jones
Update: have given up on the zdb write-mode repair effort, at least for now. 
Hoping for any guidance / direction anyone's willing to offer...

Re-running 'zpool import -F -f tank' with some stack trace debug, as suggested 
in similar threads elsewhere. Note that this appears hung at near idle.


ff03e278c520 ff03e9c60038 ff03ef109490   1  60 ff0530db4680
  PC: _resume_from_idle+0xf1    CMD: zpool import -F -f tank
  stack pointer for thread ff03e278c520: ff00182bbff0
  [ ff00182bbff0 _resume_from_idle+0xf1() ]
swtch+0x145()
cv_wait+0x61()
zio_wait+0x5d()
dbuf_read+0x1e8()
dnode_next_offset_level+0x129()
dnode_next_offset+0xa2()
get_next_chunk+0xa5()
dmu_free_long_range_impl+0x9e()
dmu_free_object+0xe6()
dsl_dataset_destroy+0x122()
dsl_destroy_inconsistent+0x5f()
findfunc+0x23()
dmu_objset_find_spa+0x38c()
dmu_objset_find_spa+0x153()
dmu_objset_find+0x40()
spa_load_impl+0xb23()
spa_load+0x117()
spa_load_best+0x78()
spa_import+0xee()
zfs_ioc_pool_import+0xc0()
zfsdev_ioctl+0x177()
cdev_ioctl+0x45()
spec_ioctl+0x5a()
fop_ioctl+0x7b()
ioctl+0x18e()
dtrace_systrace_syscall32+0x11a()
_sys_sysenter_post_swapgs+0x149()
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)

2010-06-28 Thread Roy Sigurd Karlsbakk
- Original Message -
 Now at 36 hours since zdb process start and:
 
 
 PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
 827 root 4936M 4931M sleep 59 0 0:50:47 0.2% zdb/209
 
 Idling at 0.2% processor for nearly the past 24 hours... feels very
 stuck. Thoughts on how to determine where and why?

Just a hunch, is this pool using dedup?

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)

2010-06-28 Thread Malachi de Ælfweald
I had a similar issue on boot after upgrade in the past and it was due to
the large number of snapshots I had...  don't know if that could be related
or not...


Malachi de Ælfweald
http://www.google.com/profiles/malachid


On Mon, Jun 28, 2010 at 8:59 AM, Andrew Jones andrewnjo...@gmail.comwrote:

 Now at 36 hours since zdb process start and:


  PID USERNAME  SIZE   RSS STATE  PRI NICE  TIME  CPU PROCESS/NLWP
   827 root 4936M 4931M sleep   590   0:50:47 0.2% zdb/209

 Idling at 0.2% processor for nearly the past 24 hours... feels very stuck.
 Thoughts on how to determine where and why?
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)

2010-06-28 Thread Andrew Jones
Dedup had been turned on in the past for some of the volumes, but I had turned 
it off altogether before entering production due to performance issues. GZIP 
compression was turned on for the volume I was trying to delete.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)

2010-06-28 Thread Andrew Jones
Malachi,

Thanks for the reply. There were no snapshots for the CSV1 volume that I 
recall... very few snapshots on any volume in the tank.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Kernel Panic on zpool clean

2010-06-28 Thread George
Hi,

I have a machine running 2009.06 with 8 SATA drives in a SCSI-connected enclosure.

I had a drive fail and accidentally replaced the wrong one, which 
unsurprisingly caused the rebuild to fail. The status of the zpool then ended 
up as:

 pool: storage2
 state: FAULTED
status: An intent log record could not be read.
Waiting for administrator intervention to fix the faulted pool.
action: Either restore the affected device(s) and run 'zpool online',
or ignore the intent log records by running 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-K4
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        storage2       FAULTED      0     0     1  bad intent log
          raidz1       ONLINE       0     0     0
            c9t4d2     ONLINE       0     0     0
            c9t4d3     ONLINE       0     0     0
            c10t4d2    ONLINE       0     0     0
            c10t4d4    ONLINE       0     0     0
          raidz1       DEGRADED     0     0     6
            c10t4d0    UNAVAIL      0     0     0  cannot open
            replacing  ONLINE       0     0     0
              c9t4d0   ONLINE       0     0     0
              c10t4d3  ONLINE       0     0     0
            c10t4d1    ONLINE       0     0     0
            c9t4d1     ONLINE       0     0     0

Running 'zpool clear storage2' caused the machine to dump and reboot.
I've tried removing the spare and putting back the faulty drive to give:

  pool: storage2
 state: FAULTED
status: An intent log record could not be read.
Waiting for administrator intervention to fix the faulted pool.
action: Either restore the affected device(s) and run 'zpool online',
or ignore the intent log records by running 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-K4
 scrub: none requested
config:

        NAME             STATE     READ WRITE CKSUM
        storage2         FAULTED      0     0     1  bad intent log
          raidz1         ONLINE       0     0     0
            c9t4d2       ONLINE       0     0     0
            c9t4d3       ONLINE       0     0     0
            c10t4d2      ONLINE       0     0     0
            c10t4d4      ONLINE       0     0     0
          raidz1         DEGRADED     0     0     6
            c10t4d0      FAULTED      0     0     0  corrupted data
            replacing    DEGRADED     0     0     0
              c9t4d0     ONLINE       0     0     0
              c9t4d4     UNAVAIL      0     0     0  cannot open
            c10t4d1      ONLINE       0     0     0
            c9t4d1       ONLINE       0     0     0

Again, this core dumps when I try to do 'zpool clear storage2'.

Does anyone have any suggestions what would be the best course of action now?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Dedup RAM requirements, vs. L2ARC?

2010-06-28 Thread valrh...@gmail.com
I'm putting together a new server, based on a Dell PowerEdge T410. 

I have a simple SAS controller, with six 2TB Hitachi DeskStar 7200 RPM SATA 
drives. The processor is a quad-core 2 GHz Core i7-based Xeon.

I will run the drives as one set of three mirror pairs striped together, for 6 
TB of homogeneous storage.

I'd like to run Dedup, but right now the server has only 4 GB of RAM. It has 
been pointed out to me several times that this is far too little. So how much 
should I buy? A few considerations:

1. I would like to run dedup on old copies of backups (dedup ratio for these 
filesystems are 3+). Basically I have a few years of backups onto tape, and 
will consolidate these. I need to have the data there on disk, but I rarely 
need to access it (maybe once a month). So those filesystems can be exported, 
and effectively shut off. Am I correct in guessing that, if a filesystem has 
been exported, its dedup table is not in RAM, and therefore is not relevant to 
RAM requirements? I don't mind if it's really slow to do the first and only 
copy to the file system, as I can let it run for a week without a problem.

2. Are the RAM requirements for ZFS with dedup based on the total available 
zpool size (I'm not using thin provisioning), or just on how much data is in 
the filesystem being deduped? That is, if I have 500 GB of deduped data but 6 
TB of possible storage, which number is relevant for calculating RAM 
requirements?

3. What are the RAM requirements for ZFS in the absence of dedup? That is, if I 
only have deduped filesystems in an exported state, and all that is active is 
non-deduped, is 4 GB enough?

4. How does the L2ARC come into play? I can afford to buy a fast Intel X25M G2, 
for instance, or any of the newer SandForce-based MLC SSDs to cache the dedup 
table. But does it work that way? It's not really affordable for me to get more 
than 16 GB of RAM on this system, because there are only four slots available, 
and the 8 GB DIMMS are a bit pricey.

5. Could I use one of the PCIe-based SSD cards for this purpose, such as the 
brand-new OCZ Revo? That should be somewhere between a SATA-based SSD and RAM.

Thanks in advance for all of your advice and help.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)

2010-06-28 Thread Roy Sigurd Karlsbakk
- Original Message -
 Dedup had been turned on in the past for some of the volumes, but I
 had turned it off altogether before entering production due to
 performance issues. GZIP compression was turned on for the volume I
 was trying to delete.

Was there a lot of deduped data still on disk before it was put into 
production? Turning off dedup won't un-dedup the existing data; it just prevents 
deduplication of new data...

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)

2010-06-28 Thread Andrew Jones
Just re-ran 'zdb -e tank' to confirm the CSV1 volume is still exhibiting error 
16:

snip
Could not open tank/CSV1, error 16
snip

Considering my attempt to delete the CSV1 volume lead to the failure in the 
first place, I have to think that if I can either 1) complete the deletion of 
this volume or 2) roll back to a transaction prior to this based on logging or 
3) repair whatever corruption has been caused by this partial deletion, that I 
will then be able to import the pool.

What does 'error 16' mean in the ZDB output, any suggestions?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?

2010-06-28 Thread Roy Sigurd Karlsbakk
 2. Are the RAM requirements for ZFS with dedup based on the total
 available zpool size (I'm not using thin provisioning), or just on how
 much data is in the filesystem being deduped? That is, if I have 500
 GB of deduped data but 6 TB of possible storage, which number is
 relevant for calculating RAM requirements?

It's based on the data stored in the zpool. You'll need about 200 bytes per 
DDT (data deduplication table) entry, meaning about 1.2GB per 1TB stored in 
128kB blocks. With smaller blocks (smaller files are stored in smaller blocks), 
that means more memory. With only large files, 1.2GB or 1.5GB per 1TB of stored 
data should suffice.
 
 3. What are the RAM requirements for ZFS in the absence of dedup? That
 is, if I only have deduped filesystems in an exported state, and all
 that is active is non-deduped, is 4 GB enough?

Probably not, see above.
 
 4. How does the L2ARC come into play? I can afford to buy a fast Intel
 X25M G2, for instance, or any of the newer SandForce-based MLC SSDs to
 cache the dedup table. But does it work that way? It's not really
 affordable for me to get more than 16 GB of RAM on this system,
 because there are only four slots available, and the 8 GB DIMMS are a
 bit pricey.

L2ARC will buffer the DDT along with the data, so if you get some good SSDs 
(such as Crucial RealSSD C300), this will speed things up quite a bit.

 5. Could I use one of the PCIe-based SSD cards for this purpose, such
 as the brand-new OCZ Revo? That should be somewhere between a
 SATA-based SSD and RAM.

If your budget is low, as it may seem, good SATA SSDs will probably be the 
best. They can help out quite a bit.

Just remember that dedup on opensolaris is not thoroughly tested yet. It works, 
but AFAIK there are still issues with long hangs in case of unexpected reboots.

Disclaimer: I'm not an Oracle (nor Sun) employee - this is just my advice to 
you based on testing dedup on my test systems.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?

2010-06-28 Thread Erik Trimble

On 6/28/2010 12:33 PM, valrh...@gmail.com wrote:

I'm putting together a new server, based on a Dell PowerEdge T410.

I have  simple SAS controller, with six 2TB Hitachi DeskStar 7200 RPM SATA 
drives. The processor is a quad-core 2 GHz Core i7-based Xeon.

I will run the drives as one set of three mirror pairs striped together, for 6 
TB of homogeneous storage.

I'd like to run Dedup, but right now the server has only 4 GB of RAM. It has 
been pointed out to me several times that this is far too little. So how much 
should I buy? A few considerations:

1. I would like to run dedup on old copies of backups (dedup ratio for these 
filesystems are 3+). Basically I have a few years of backups onto tape, and 
will consolidate these. I need to have the data there on disk, but I rarely 
need to access it (maybe once a month). So those filesystems can be exported, 
and effectively shut off. Am I correct in guessing that, if a filesystem has 
been exported, its dedup table is not in RAM, and therefore is not relevant to 
RAM requirements? I don't mind if it's really slow to do the first and only 
copy to the file system, as I can let it run for a week without a problem.

   
That's correct. An exported pool is effectively ignored by the system, 
so it won't contribute to any ARC requirements.



2. Are the RAM requirements for ZFS with dedup based on the total available 
zpool size (I'm not using thin provisioning), or just on how much data is in 
the filesystem being deduped? That is, if I have 500 GB of deduped data but 6 
TB of possible storage, which number is relevant for calculating RAM 
requirements?
   
Requirements are based on *current* BLOCK usage, after dedup has 
occurred.  That is, ZFS needs an entry in the DDT for each block 
actually allocated in the filesystem.  The number of times that block is 
referenced won't influence the DDT size, nor will the *potential* size 
of the pool matter (other than for capacity planning).  Remember that 
ZFS uses variable size blocks, so you need to determine what your 
average block size is in order to estimate your DDT usage.



3. What are the RAM requirements for ZFS in the absence of dedup? That is, if I 
only have deduped filesystems in an exported state, and all that is active is 
non-deduped, is 4 GB enough?
   
It of course depends heavily on your usage pattern, and the kind of 
files you are serving up. ZFS requires at a bare minimum a couple of 
dozen MB for its own usage. Everything above that is caching. Heavy 
write I/O will also eat up RAM, as ZFS needs to cache the writes in RAM 
before doing a large write I/O to backing store.   Take a look at the 
amount of data you expect to be using heavily - your RAM should probably 
exceed this amount, plus an additional 1GB or so for the 
OS/ZFS/kernel/etc use. That is assuming you are doing nothing but 
fileserving on the system.



4. How does the L2ARC come into play? I can afford to buy a fast Intel X25M G2, 
for instance, or any of the newer SandForce-based MLC SSDs to cache the dedup 
table. But does it work that way? It's not really affordable for me to get more 
than 16 GB of RAM on this system, because there are only four slots available, 
and the 8 GB DIMMS are a bit pricey.
   
L2ARC is secondary ARC. ZFS attempts to cache all reads in the ARC 
(Adaptive Read Cache) - should it find that it doesn't have enough space 
in the ARC (which is RAM-resident), it will evict some data over to the 
L2ARC (which in turn will simply dump the least-recently-used data when 
it runs out of space).  Remember, however, every time something gets 
written to the L2ARC, a little bit of space is taken up in the ARC 
itself (a pointer to the L2ARC entry needs to be kept in ARC).  So, it's 
not possible to have a giant L2ARC and a tiny ARC. As a rule of thumb, I 
try not to have my L2ARC exceed my main RAM by more than 10-15x (with 
really big-memory machines, I'm a bit looser and allow 20-25x or so, but 
still...).   So, if you are thinking of getting a 160GB SSD, it would be 
wise to go for at minimum 8GB of RAM.   Once again, the amount of ARC 
space reserved for an L2ARC entry is fixed, and independent of the actual 
block size stored in L2ARC.   The gist of this is that tiny files eat up 
a disproportionate amount of system resources for their size (smaller 
size = larger % overhead vis-a-vis large files).





5. Could I use one of the PCIe-based SSD cards for this purpose, such as the 
brand-new OCZ Revo? That should be somewhere between a SATA-based SSD and RAM.

Thanks in advance for all of your advice and help.
   


ZFS doesn't care what you use for the L2ARC.  Some of us actually use 
Hard drives, so a PCI-E Flash card is entirely possible.  The Revo is 
possibly the first PCI-E Flash card that wasn't massively expensive, 
otherwise, I don't think they'd be a good option.  They're going to be 
more expensive than even an SLC SSD, however.  In addition, given that 
L2ARC is heavily read-biased, cheap MLC SSDs are a good fit for it.

Re: [zfs-discuss] Dedup RAM requirements, vs. L2ARC?

2010-06-28 Thread Erik Trimble

On 6/28/2010 12:53 PM, Roy Sigurd Karlsbakk wrote:

2. Are the RAM requirements for ZFS with dedup based on the total
available zpool size (I'm not using thin provisioning), or just on how
much data is in the filesystem being deduped? That is, if I have 500
GB of deduped data but 6 TB of possible storage, which number is
relevant for calculating RAM requirements?
 

It's based on the data stored in the zpool. You'll need about 200 bytes of per 
DDT (data deduplication table) entry, meaning about 1,2GB per 1TB stored on 
128kB blocks. With smaller blocks (smaller files are stored in smaller blocks), 
that means more memory. With only large files, 1,2GB or 1,5GB per 1TB stored 
data should sufffice.
   


Actually, I think the rule-of-thumb is 270 bytes/DDT entry.  It's 200 
bytes of ARC for every L2ARC entry.


(The DDT itself doesn't count toward this per-entry ARC space usage.)

E.g.:    I have 1TB of 4k files that are to be deduped, and it turns 
out that I have about a 5:1 dedup ratio. I'd also like to see how much 
ARC usage I eat up with a 160GB L2ARC.


(1)    How many entries are there in the DDT:
            1TB of 4k files means there are 2^28 blocks (about 268 million).
            However, at a 5:1 dedup ratio, I'm only actually storing 
20% of that, so I have about 54 million unique blocks.

            Thus, I need a DDT of about 270 * 54 million  =~  14.5GB in size

(2)    My L2ARC is 160GB in size, but I'm using 14.5GB for the DDT.  Thus, 
I have about 145GB free for use as a data cache.
            145GB / 4k =~ 35 million blocks can be stored in the 
remaining L2ARC space.
            However, 35 million blocks take up:   200 * 35 million =~ 
7GB of space in ARC
            Thus, I'd better have at least 7GB of RAM allocated 
solely for L2ARC reference pointers, and no other use.







4. How does the L2ARC come into play? I can afford to buy a fast Intel
X25M G2, for instance, or any of the newer SandForce-based MLC SSDs to
cache the dedup table. But does it work that way? It's not really
affordable for me to get more than 16 GB of RAM on this system,
because there are only four slots available, and the 8 GB DIMMS are a
bit pricey.
 

L2ARC will buffer the DDT along with the data, so if you get some good SSDs 
(such as Crucial RealSSD C300), this will speed things up quite a bit.

   

5. Could I use one of the PCIe-based SSD cards for this purpose, such
as the brand-new OCZ Revo? That should be somewhere between a
SATA-based SSD and RAM.
 

If your budget is low, as it may seem, good SATA SSDs will probably be the 
best. They can help out quite a bit.

Just remember that dedup on opensolaris is not thoroughly tested yet. It works, 
but AFAIK there are still issues with long hangs in case of unexpected reboots.

Disclaimer: I'm not an Oracle (nor Sun) employee - this is just my advice to 
you based on testing dedup on my test systems.

Vennlige hilsener / Best regards

roy


While I'm an Oracle employee, I don't have any insider knowledge on 
this. It's solely my experience talking.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)

2010-06-28 Thread Victor Latushkin

On Jun 28, 2010, at 9:32 PM, Andrew Jones wrote:

 Update: have given up on the zdb write mode repair effort, as least for now. 
 Hoping for any guidance / direction anyone's willing to offer...
 
 Re-running 'zpool import -F -f tank' with some stack trace debug, as 
 suggested in similar threads elsewhere. Note that this appears hung at near 
 idle.

It looks like it is processing a huge inconsistent dataset that was destroyed 
previously. So you need to wait a bit longer.

regards
victor 

 
 
 ff03e278c520 ff03e9c60038 ff03ef109490   1  60 ff0530db4680
  PC: _resume_from_idle+0xf1CMD: zpool import -F -f tank
  stack pointer for thread ff03e278c520: ff00182bbff0
  [ ff00182bbff0 _resume_from_idle+0xf1() ]
swtch+0x145()
cv_wait+0x61()
zio_wait+0x5d()
dbuf_read+0x1e8()
dnode_next_offset_level+0x129()
dnode_next_offset+0xa2()
get_next_chunk+0xa5()
dmu_free_long_range_impl+0x9e()
dmu_free_object+0xe6()
dsl_dataset_destroy+0x122()
dsl_destroy_inconsistent+0x5f()
findfunc+0x23()
dmu_objset_find_spa+0x38c()
dmu_objset_find_spa+0x153()
dmu_objset_find+0x40()
spa_load_impl+0xb23()
spa_load+0x117()
spa_load_best+0x78()
spa_import+0xee()
zfs_ioc_pool_import+0xc0()
zfsdev_ioctl+0x177()
cdev_ioctl+0x45()
spec_ioctl+0x5a()
fop_ioctl+0x7b()
ioctl+0x18e()
dtrace_systrace_syscall32+0x11a()
_sys_sysenter_post_swapgs+0x149()
 -- 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Resilvering onto a spare - degraded because of read and cksum errors

2010-06-28 Thread Cindy Swearingen

Hi Donald,

I think this is just a reporting error in the zpool status output,
depending on what the Solaris release is.

Thanks,

Cindy

On 06/27/10 15:13, Donald Murray, P.Eng. wrote:

Hi,

I awoke this morning to a panic'd opensolaris zfs box. I rebooted it
and confirmed it would panic each time it tried to import the 'tank'
pool. Once I disconnected half of one of the mirrored disks, the box
booted cleanly and the pool imported without a panic.

Because this box has a hot spare, it began resilvering automatically.
This is the first time I've resilvered to a hot spare, so I'm not sure
whether the output below [1]  is normal.

In particular, I think it's odd that the spare has an equal number of
read and cksum errors. Is this normal? Is my spare a piece of junk,
just like the disk it replaced?


[1]
r...@weyl:~# zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver in progress for 3h42m, 97.34% done, 0h6m to go
config:

        NAME                       STATE     READ WRITE CKSUM
        tank                       DEGRADED     0     0     0
          mirror                   DEGRADED     0     0     0
            spare                  DEGRADED 1.36M     0     0
              9828443264686839751  UNAVAIL      0     0     0  was /dev/dsk/c6t1d0s0
              c7t1d0               DEGRADED     0     0 1.36M  too many errors
            c9t0d0                 ONLINE       0     0     0
          mirror                   ONLINE       0     0     0
            c7t0d0                 ONLINE       0     0     0
            c5t1d0                 ONLINE       0     0     0
        spares
          c7t1d0                   INUSE     currently in use

errors: No known data errors
r...@weyl:~#
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel Panic on zpool clean

2010-06-28 Thread Victor Latushkin

On Jun 28, 2010, at 11:27 PM, George wrote:

 I've tried removing the spare and putting back the faulty drive to give:
 
  pool: storage2
 state: FAULTED
 status: An intent log record could not be read.
Waiting for adminstrator intervention to fix the faulted pool.
 action: Either restore the affected device(s) and run 'zpool online',
or ignore the intent log records by running 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-K4
 scrub: none requested
 config:
 
        NAME             STATE     READ WRITE CKSUM
        storage2         FAULTED      0     0     1  bad intent log
          raidz1         ONLINE       0     0     0
            c9t4d2       ONLINE       0     0     0
            c9t4d3       ONLINE       0     0     0
            c10t4d2      ONLINE       0     0     0
            c10t4d4      ONLINE       0     0     0
          raidz1         DEGRADED     0     0     6
            c10t4d0      FAULTED      0     0     0  corrupted data
            replacing    DEGRADED     0     0     0
              c9t4d0     ONLINE       0     0     0
              c9t4d4     UNAVAIL      0     0     0  cannot open
            c10t4d1      ONLINE       0     0     0
            c9t4d1       ONLINE       0     0     0
 
 Again this core dumps when I try to do zpool clear storage2
 
 Does anyone have any suggestions what would be the best course of action now?

I think first we need to understand why it does not like 'zpool clear', as that 
may provide better understanding of what is wrong.

For that you need to create directory for saving crashdumps e.g. like this

mkdir -p /var/crash/`uname -n`

then run savecore and see if it would save a crash dump into that directory.

If crashdump is there, then you need to perform some basic investigation:

cd /var/crash/`uname -n`

mdb <dump number>

::status
::stack
::spa -c
::spa -v
::spa -ve
$q

for a start.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel Panic on zpool clean

2010-06-28 Thread George
I've attached the output of those commands. The machine is a v20z if that makes 
any difference.

Thanks,

George
-- 
This message posted from opensolaris.org

mdb: logging to debug.txt
 ::status
debugging crash dump vmcore.0 (64-bit) from crypt
operating system: 5.11 snv_111b (i86pc)
panic message: 
BAD TRAP: type=e (#pf Page fault) rp=ff00084fc660 addr=0 occurred in module 
unix due to a NULL pointer dereference
dump content: kernel pages only



 ::stack
mutex_enter+0xb()
metaslab_free+0x12e(ff01c9fb3800, ff01cce64668, 1b9528, 0)
zio_dva_free+0x26(ff01cce64608)
zio_execute+0xa0(ff01cce64608)
zio_nowait+0x5a(ff01cce64608)
arc_free+0x197(ff01cf0c80c0, ff01c9fb3800, 1b9528, ff01d389bcf0, 0, 
0)
dsl_free+0x30(ff01cf0c80c0, ff01d389bcc0, 1b9528, ff01d389bcf0, 0, 0
)
dsl_dataset_block_kill+0x293(0, ff01d389bcf0, ff01cf0c80c0, 
ff01d18cfd80)
dmu_objset_sync+0xc4(ff01cffe0080, ff01cf0c80c0, ff01d18cfd80)
dsl_pool_sync+0x1ee(ff01d389bcc0, 1b9528)
spa_sync+0x32a(ff01c9fb3800, 1b9528)
txg_sync_thread+0x265(ff01d389bcc0)
thread_start+8()



 ::spa -c
ADDR STATE NAME
ff01c8df3000ACTIVE rpool

version=000e
name='rpool'
state=
txg=056a6ad1
pool_guid=53825ef3c58abc97
hostid=00820b9b
hostname='crypt'
vdev_tree
type='root'
id=
guid=53825ef3c58abc97
children[0]
type='mirror'
id=
guid=e9b8daed37492cfe
whole_disk=
metaslab_array=0017
metaslab_shift=001d
ashift=0009
asize=001114e0
is_log=
children[0]
type='disk'
id=
guid=ad7e5022f804365a
path='/dev/dsk/c8t0d0s0'
devid='id1,s...@sseagate_st373307lc__3hz76yyd743809wm/a'
phys_path='/p...@0,0/pci1022,7...@a/pci17c2,1...@4/s...@0,0:a'
whole_disk=
DTL=0052
children[1]
type='disk'
id=0001
guid=2f7a03c75a4931ac
path='/dev/dsk/c8t1d0s0'
devid='id1,s...@sseagate_st373307lc__3hz80bdp743793pa/a'
phys_path='/p...@0,0/pci1022,7...@a/pci17c2,1...@4/s...@1,0:a'
whole_disk=
DTL=0050
ff01c9fb3800ACTIVE storage2

version=000e
name='storage2'
state=
txg=001b9406
pool_guid=cc049c0f1321fc28
hostid=00820b9b
hostname='crypt'
vdev_tree
type='root'
id=
guid=cc049c0f1321fc28
children[0]
type='raidz'
id=
guid=dc1ecf18721028c1
nparity=0001
metaslab_array=000e
metaslab_shift=0023
ashift=0009
asize=03a33f10
is_log=
children[0]
type='disk'
id=
guid=c7b64596709ebdef
path='/dev/dsk/c9t4d2s0'
devid='id1,s...@n600d0230006c8a5f0c3fd863ea736d00/a'
phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1/s...@4,2:a'
whole_disk=0001
DTL=012d
children[1]
type='disk'
id=0001
guid=cd7ba5d38162fe0d
path='/dev/dsk/c9t4d3s0'
devid='id1,s...@n600d0230006c8a5f0c3fd8514ed8d900/a'
phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1/s...@4,3:a'
whole_disk=0001
DTL=012c
children[2]
type='disk'
id=0002
guid=3b499fb48e06460b
path='/dev/dsk/c10t4d2s0'
devid='id1,s...@n600d0230006c8a5f0c3fd84312aa6d00/a'
phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1,1/s...@4,2:a'
whole_disk=0001
DTL=012b
children[3]
type='disk'
id=0003
guid=e205849496e5e447
path='/dev/dsk/c10t4d4s0'
devid='id1,s...@n600d0230006c8a5f0c3fd8415c62ae00/a'
phys_path='/p...@0,0/pci1022,7...@b/pci9005,4...@1,1/s...@4,4:a'
whole_disk=0001
DTL=0128
children[1]
type='raidz'

[zfs-discuss] COMSTAR ISCSI - configuration export/impo rt

2010-06-28 Thread bso...@epinfante.com
Hi all,

Having osol b134 exporting a couple of iSCSI targets to some hosts, how can the 
COMSTAR configuration be migrated to another host?
I can use ZFS send/receive to replicate the LUNs, but how can I replicate 
the targets and views from serverA to serverB?

Are there any best procedures to follow to accomplish this?
Thanks for all your time,

Bruno

Sent from my HTC

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)

2010-06-28 Thread Andrew Jones
Thanks Victor. I will give it another 24 hrs or so and will let you know how it 
goes...

You are right, a large 2TB volume (CSV1) was not in the process of being 
deleted, as described above. It is showing error 16 on  'zdb -e'
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] COMSTAR ISCSI - configuration export/import

2010-06-28 Thread Mike Devlin
I havnt tried it yet, but supposedly this will backup/restore the
comstar config:

$ svccfg export -a stmf  ⁠comstar⁠.bak.${DATE}

If you ever need to restore the configuration, you can attach the
storage and run an import:

$ svccfg import ⁠comstar⁠.bak.${DATE}


- Mike

On 6/28/10, bso...@epinfante.com bso...@epinfante.com wrote:
 Hi all,

 Having osol b134 exporting a couple of iscsi targets to some hosts,how can
 the COMSTAR configuration be migrated to other host?
 I can use the ZFS send/receive to replicate the luns but how can I
 replicate the target,views from serverA to serverB ?

 Is there any best procedures to follow to accomplish this?
 Thanks for all your time,

 Bruno

 Sent from my HTC



-- 
Sent from my mobile device
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS bug - should I be worried about this?

2010-06-28 Thread Gabriele Bulfon
Oh well, thanks for this answer.
It makes me feel much better!
What are the possible risks, if any?
Gabriele Bulfon - Sonicle S.r.l.
Tel +39 028246016 Int. 30 - Fax +39 028243880
Via Felice Cavallotti 16 - 20089, Rozzano - Milano - ITALY
http://www.sonicle.com
--
From: Victor Latushkin
To: Gabriele Bulfon
Cc: zfs-discuss@opensolaris.org
Date: 28 June 2010, 16:14:12 CEST
Subject: Re: [zfs-discuss] ZFS bug - should I be worried about this?
On 28.06.10 16:16, Gabriele Bulfon wrote:
Yes...they're still running...but being aware that a power failure causing an
unexpected poweroff may make the pool unreadable is a pain
Pool integrity is not affected by this issue.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import hangs indefinitely (retry post in parts; too long?)

2010-06-28 Thread Geoff Shipman

Andrew,

Looks like the zpool is telling you the devices are still doing work of 
some kind, or that there are locks still held.


From the Intro(2) man page, where the error numbers are listed, number 16 
looks to be EBUSY.



 16 EBUSYDevice busy

 An attempt was made to mount a  dev-
 ice  that  was already mounted or an
 attempt was made to unmount a device
 on  which  there  is  an active file
 (open   file,   current   directory,
 mounted-on  file,  active  text seg-
 ment). It  will  also  occur  if  an
 attempt is made to enable accounting
 when it  is  already  enabled.   The
 device or resource is currently una-
 vailable.   EBUSY is  also  used  by
 mutexes, semaphores, condition vari-
 ables, and r/w  locks,  to  indicate
 that   a  lock  is held,  and by the
 processor  control  function
 P_ONLINE.


On 06/28/10 01:50 PM, Andrew Jones wrote:

Just re-ran 'zdb -e tank' to confirm the CSV1 volume is still exhibiting error 
16:

snip
Could not open tank/CSV1, error 16
snip

Considering my attempt to delete the CSV1 volume lead to the failure in the 
first place, I have to think that if I can either 1) complete the deletion of 
this volume or 2) roll back to a transaction prior to this based on logging or 
3) repair whatever corruption has been caused by this partial deletion, that I 
will then be able to import the pool.

What does 'error 16' mean in the ZDB output, any suggestions?
   


--
Geoff Shipman | Senior Technical Support Engineer
Phone: +13034644710
Oracle Global Customer Services
500 Eldorado Blvd. UBRM-04 | Broomfield, CO 80021
Email: geoff.ship...@sun.com | Hours:9am-5pm MT,Monday-Friday

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Announce: zfsdump

2010-06-28 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Tristram Scott
 
 If you would like to try it out, download the package from:
 http://www.quantmodels.co.uk/zfsdump/

I haven't tried this yet, but thank you very much!

Other people have pointed out bacula is able to handle multiple tapes, and
individual file restores.  However, the disadvantage of
bacula/tar/cpio/rsync etc is that they all have to walk the entire
filesystem searching for things that have changed.

The advantage of zfs send (assuming incremental backups) is that it
already knows what's changed, and it can generate a continuous datastream
almost instantly.  Something like 1-2 orders of magnitude faster per
incremental backup.
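To illustrate: once a baseline snapshot exists, an incremental stream only
carries the blocks changed since that snapshot. A minimal sketch, with
hypothetical dataset and snapshot names, writing to tape as elsewhere in this
thread:

    zfs snapshot tank/data@mon
    # ... a day of changes ...
    zfs snapshot tank/data@tue
    zfs send -i tank/data@mon tank/data@tue | dd of=/dev/rmt/1ln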

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Resilvering onto a spare - degraded because of read and cksum errors

2010-06-28 Thread Donald Murray, P.Eng.
Thanks Cindy. I'm running 111b at the moment. I ran a scrub last
night, and it still reports the same status.

r...@weyl:~# uname -a
SunOS weyl 5.11 snv_111b i86pc i386 i86pc Solaris
r...@weyl:~# zpool status -x
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: scrub completed after 2h40m with 0 errors on Mon Jun 28 01:23:12 2010
config:

NAME   STATE READ WRITE CKSUM
tank   DEGRADED 0 0 0
  mirror   DEGRADED 0 0 0
spare  DEGRADED 1.37M 0 0
  9828443264686839751  UNAVAIL  0 0 0  was
/dev/dsk/c6t1d0s0
  c7t1d0   DEGRADED 0 0 1.37M  too many errors
c9t0d0 ONLINE   0 0 0
  mirror   ONLINE   0 0 0
c7t0d0 ONLINE   0 0 0
c5t1d0 ONLINE   0 0 0
spares
  c7t1d0   INUSE currently in use

errors: No known data errors
r...@weyl:~#
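Once the resilver has completed, the usual follow-up is either to detach the
failed device so the spare takes its place permanently, or to install a new
disk and run zpool replace so the spare can return to the spare list. A rough
sketch using the GUID from the status above -- double-check the device names
before running anything:

    zpool detach tank 9828443264686839751      # keep the spare permanently
or, after physically replacing the failed disk:
    zpool replace tank 9828443264686839751 c6t1d0
    zpool clear tank                           # reset the error counters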



On Mon, Jun 28, 2010 at 14:55, Cindy Swearingen
cindy.swearin...@oracle.com wrote:
 Hi Donald,

 I think this is just a reporting error in the zpool status output,
 depending on what Solaris release it is.

 Thanks,

 Cindy

 On 06/27/10 15:13, Donald Murray, P.Eng. wrote:

 Hi,

 I awoke this morning to a panic'd opensolaris zfs box. I rebooted it
 and confirmed it would panic each time it tried to import the 'tank'
 pool. Once I disconnected half of one of the mirrored disks, the box
 booted cleanly and the pool imported without a panic.

 Because this box has a hot spare, it began resilvering automatically.
 This is the first time I've resilvered to a hot spare, so I'm not sure
 whether the output below [1]  is normal.

 In particular, I think it's odd that the spare has an equal number of
 read and cksum errors. Is this normal? Is my spare a piece of junk,
 just like the disk it replaced?


 [1]
 r...@weyl:~# zpool status tank
  pool: tank
  state: DEGRADED
 status: One or more devices could not be opened.  Sufficient replicas
 exist for
        the pool to continue functioning in a degraded state.
 action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
  scrub: resilver in progress for 3h42m, 97.34% done, 0h6m to go
 config:

        NAME                       STATE     READ WRITE CKSUM
        tank                       DEGRADED     0     0     0
          mirror                   DEGRADED     0     0     0
            spare                  DEGRADED 1.36M     0     0
              9828443264686839751  UNAVAIL      0     0     0  was
 /dev/dsk/c6t1d0s0
              c7t1d0               DEGRADED     0     0 1.36M  too many
 errors
            c9t0d0                 ONLINE       0     0     0
          mirror                   ONLINE       0     0     0
            c7t0d0                 ONLINE       0     0     0
            c5t1d0                 ONLINE       0     0     0
        spares
          c7t1d0                   INUSE     currently in use

 errors: No known data errors
 r...@weyl:~#
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Announce: zfsdump

2010-06-28 Thread Asif Iqbal
On Mon, Jun 28, 2010 at 11:26 AM, Tristram Scott
tristram.sc...@quantmodels.co.uk wrote:
 For quite some time I have been using zfs send -R fsn...@snapname | dd 
 of=/dev/rmt/1ln to make a tape backup of my zfs file system.  A few weeks 
 back the size of the file system grew too large to fit on a single
 DAT72 tape, and I once again searched for a simple solution to allow dumping 
 of a zfs file system to multiple tapes.  Once again I was disappointed...

 I expect there are plenty of other ways this could have been handled, but 
 none leapt out at me.  I didn't want to pay large sums of cash for a 
 commercial backup product, and I didn't see that Amanda would be an easy 
 thing to fit into my existing scripts.  In particular, (and I could well be 
 reading this incorrectly) it seems that the commercial products, Amanda, 
 star, all are dumping the zfs file system file by file (with or without 
 ACLs).  I found none which would allow me to dump the file system and its 
 snapshots, unless I used zfs send to a scratch disk, and dumped to tape from 
 there.  But, of course, that assumes I have a scratch disk large enough.

 So, I have implemented zfsdump as a ksh script.  The method is as follows:
 1. Make a bunch of fifos.
 2. Pipe the stream from zfs send to split, with split writing to the fifos 
 (in sequence).

It would be nice if I could pipe the zfs send stream to split and then
send the split streams over the network to a remote system. That would
help get the data to the remote system quicker. Can your tool do that?

something like this

                s | - | j
    zfs send    p | - | o    zfs recv
     (local)    l | - | i    (remote)
                i | - | n
                t | - |
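For what it's worth, an unsplit stream can already be pushed straight to a
remote pool over ssh -- a minimal sketch, with hypothetical host and pool
names:

    zfs send -R tank@Tues | ssh backuphost zfs recv -d backuppool

Whether splitting buys extra speed would depend on whether multiple parallel
connections actually help on the link in question.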


 3. Use dd to copy from the fifos to tape(s).

 When the first tape is complete, zfsdump returns.  One then calls it again, 
 specifying that the second tape is to be used, and so on.
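The mechanism, very roughly (an illustration of the idea only, not the actual
script; the fifo prefix and chunk size are made up, the tape device is the one
used above, and one fifo is needed per expected tape):

    mkfifo /tmp/zd.aa /tmp/zd.ab
    # split writes 36GB chunks to the fifos in sequence, blocking on each
    # fifo until a reader opens it
    zfs send -R tank@Tues | split -b 36864m - /tmp/zd. &
    dd if=/tmp/zd.aa of=/dev/rmt/1ln bs=1024k    # tape 0
    # change tapes, then:
    dd if=/tmp/zd.ab of=/dev/rmt/1ln bs=1024k    # tape 1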

 From the man page:

     Example 1.  Dump the @Tues snapshot of the  tank  filesystem
     to  the  non-rewinding,  non-compressing  tape,  with a 36GB
     capacity:

          zfsdump -z t...@tues -a -R -f /dev/rmt/1ln  -s  36864 -t 0

     For the second tape:

          zfsdump -z t...@tues -a -R -f /dev/rmt/1ln  -s  36864 -t 1

 If you would like to try it out, download the package from:
 http://www.quantmodels.co.uk/zfsdump/

 I have packaged it up, so do the usual pkgadd stuff to install.

 Please, though, try this out with caution.  Build a few test file
 systems, and see that it works for you.
 It comes without warranty of any kind.


 Tristram
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




-- 
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Processes hang in /dev/zvol/dsk/poolname

2010-06-28 Thread James Dickens
After multiple power outages caused by storms coming through, I can no
longer access /dev/zvol/dsk/poolname, which holds the l2arc and slog devices
for another pool. I don't think that is related, since those pools are
offline pending access to the volumes.

I tried running find /dev/zvol/dsk/poolname -type f and here is the stack;
hopefully this gives someone a hint at what the issue is. I have scrubbed the
pool and no errors were found, and zdb -l reports no issues that I can see.

::ps ! grep find
R   1248   1243   1248   1243101 0x4a004000 ff02630d5728 find
 ff02630d5728::walk thread | ::findstack
stack pointer for thread ff025f15b3e0: ff000cb54650
[ ff000cb54650 _resume_from_idle+0xf1() ]
  ff000cb54680 swtch+0x145()
  ff000cb546b0 cv_wait+0x61()
  ff000cb54700 txg_wait_synced+0x7c()
  ff000cb54770 zil_replay+0xe8()
  ff000cb54830 zvol_create_minor+0x227()
  ff000cb54850 sdev_zvol_create_minor+0x19()
  ff000cb549c0 devzvol_create_link+0x49()
  ff000cb54ad0 sdev_call_dircallback+0xfe()
  ff000cb54c20 devname_lookup_func+0x4cf()
  ff000cb54ca0 devzvol_lookup+0xf8()
  ff000cb54d20 sdev_iter_datasets+0xb0()
  ff000cb54da0 devzvol_readdir+0xd6()
  ff000cb54e20 fop_readdir+0xab()
  ff000cb54ec0 getdents64+0xbc()
  ff000cb54f10 sys_syscall32+0xff()


-- DISK---
-bash-4.0$ sudo /usr/sbin/zdb -l /dev/dsk/c7t1d0s0

LABEL 0

version: 22
name: 'puddle'
state: 0
txg: 3139
pool_guid: 13462109782214169516
hostid: 4421991
hostname: 'amd'
top_guid: 15895240748538558983
guid: 15895240748538558983
vdev_children: 2
vdev_tree:
type: 'disk'
id: 0
guid: 15895240748538558983
path: '/dev/dsk/c7t1d0s0'
devid: 'id1,s...@sata_hitachi_hdt72101__stf607mh3a3ksk/a'
phys_path: '/p...@0,0/pci1043,8...@12/d...@1,0:a'
whole_disk: 1
metaslab_array: 23
metaslab_shift: 33
ashift: 9
asize: 1000191557632
is_log: 0
DTL: 605

LABEL 1

version: 22
name: 'puddle'
state: 0
txg: 3139
pool_guid: 13462109782214169516
hostid: 4421991
hostname: 'amd'
top_guid: 15895240748538558983
guid: 15895240748538558983
vdev_children: 2
vdev_tree:
type: 'disk'
id: 0
guid: 15895240748538558983
path: '/dev/dsk/c7t1d0s0'
devid: 'id1,s...@sata_hitachi_hdt72101__stf607mh3a3ksk/a'
phys_path: '/p...@0,0/pci1043,8...@12/d...@1,0:a'
whole_disk: 1
metaslab_array: 23
metaslab_shift: 33
ashift: 9
asize: 1000191557632
is_log: 0
DTL: 605

LABEL 2

version: 22
name: 'puddle'
state: 0
txg: 3139
pool_guid: 13462109782214169516
hostid: 4421991
hostname: 'amd'
top_guid: 15895240748538558983
guid: 15895240748538558983
vdev_children: 2
vdev_tree:
type: 'disk'
id: 0
guid: 15895240748538558983
path: '/dev/dsk/c7t1d0s0'
devid: 'id1,s...@sata_hitachi_hdt72101__stf607mh3a3ksk/a'
phys_path: '/p...@0,0/pci1043,8...@12/d...@1,0:a'
whole_disk: 1
metaslab_array: 23
metaslab_shift: 33
ashift: 9
asize: 1000191557632
is_log: 0
DTL: 605

LABEL 3

version: 22
name: 'puddle'
state: 0
txg: 3139
pool_guid: 13462109782214169516
hostid: 4421991
hostname: 'amd'
top_guid: 15895240748538558983
guid: 15895240748538558983
vdev_children: 2
vdev_tree:
type: 'disk'
id: 0
guid: 15895240748538558983
path: '/dev/dsk/c7t1d0s0'
devid: 'id1,s...@sata_hitachi_hdt72101__stf607mh3a3ksk/a'
phys_path: '/p...@0,0/pci1043,8...@12/d...@1,0:a'
whole_disk: 1
metaslab_array: 23
metaslab_shift: 33
ashift: 9
asize: 1000191557632
is_log: 0
DTL: 605
-bash-4.0$


There is also a volume attached that is part of the root pool; it is
accessed fine, but zdb -l against it does not return.

bash-4.0# /usr/sbin/zpool status -v
  pool: puddle
 state: ONLINE
 scrub: none requested
config:

NAME   STATE READ WRITE CKSUM
puddle ONLINE   0 0 0
  c7t1d0   ONLINE   0 0 0
logs
  /dev/zvol/dsk/rpool/puddle_slog  ONLINE   0 0 0


zfs list -rt volume puddle
NAMEUSED  AVAIL  REFER  MOUNTPOINT
puddle/l2arc