[zfs-discuss] ZFS shareiscsi and Comstar

2010-01-13 Thread Matthew Hollick
Good morning,

std_noob_disclaimer
I am new to the OpenSolaris game so forgive me if this is covered elsewhere.
/std_noob_disclaimer

While reading the documentation for ZFS and COMSTAR I note that there are two 
methods for creating iSCSI targets backed by ZFS volumes. The first (which appears 
to be unsupported) is to set the shareiscsi=on property on the volume. This 
method does not appear to use COMSTAR but the older, userspace iscsitgt 
service. In fact, if iscsitgt is offline it gets started when attempting to 
create a volume with the shareiscsi=on property. 

Is the intent to move away from iscsitgt? Can I change some configuration 
somewhere to specify which iscsi target service to use?


Regards,

Matthew.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS shareiscsi and Comstar

2010-01-13 Thread Cyril Plisko
On Wed, Jan 13, 2010 at 1:03 PM, Matthew Hollick matt...@thehollick.com wrote:


 Is the intent to move away from iscsitgt?

Yes. [1]

 Can I change some configuration somewhere to specify which iscsi target 
 service to use?

No. In fact the cited ARC case obsoletes the shareiscsi property altogether.


[1] http://arc.opensolaris.org/caselog/PSARC/2010/006/
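
For anyone following along, the COMSTAR route looks roughly like the sketch
below (command names from memory, and the pool/volume names are placeholders,
so double-check against the COMSTAR docs for your build):

# zfs create -V 20g tank/iscsivol                  # zvol to back the LUN
# svcadm enable stmf                               # make sure the STMF framework is running
# sbdadm create-lu /dev/zvol/rdsk/tank/iscsivol    # register the zvol as a logical unit
# stmfadm add-view 600144f0...                     # expose the LU; the GUID comes from sbdadm's output
# svcadm enable -r svc:/network/iscsi/target:default
# itadm create-target                              # create the iSCSI target for initiators to log in to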

-- 
Regards,
Cyril
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs fast mirror resync?

2010-01-13 Thread Max Levine
Veritas has a feature called fast mirror resync where they keep a
DRL on each side of the mirror, and detaching/re-attaching a mirror
causes only the changed bits to be re-synced. Is anything similar
planned for ZFS?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs fast mirror resync?

2010-01-13 Thread Cyril Plisko
On Wed, Jan 13, 2010 at 4:35 PM, Max Levine max...@gmail.com wrote:
 Veritas has a feature called fast mirror resync where they keep a
 DRL on each side of the mirror, and detaching/re-attaching a mirror
 causes only the changed bits to be re-synced. Is anything similar
 planned for ZFS?

ZFS has that feature from moment zero.
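
(To see it in action - pool and device names below are just an example - take
one side of a mirror offline, write for a while, and bring it back; the
resilver only copies what changed while the device was away, driven by ZFS's
per-vdev dirty time logs:)

# zpool offline tank c1t3d0     # temporarily drop one side of the mirror
  ... writes happen ...
# zpool online tank c1t3d0      # reattach it
# zpool status tank             # resilver covers only the blocks written while it was offline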


-- 
Regards,
Cyril
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] x4500/x4540 does the internal controllers have a bbu?

2010-01-13 Thread Richard Elling
On Jan 12, 2010, at 7:46 PM, Brad wrote:

 Richard,
 
 Yes, write cache is enabled by default, depending on the pool configuration.
 Is it enabled for a striped (mirrored configuration) zpool?  I'm asking 
 because of a concern I've read on this forum about a problem with SSDs (and 
 disks) where if a power outage occurs any data in cache would be lost if it 
 hasn't been flushed to disk.

If the vdev is a whole disk (for Solaris == not a slice), then ZFS will attempt
to enable the disk's write cache. By default, Solaris will not enable the write
cache on disks, in part because it causes bad juju for UFS.  This is independent
of the data protection configuration.
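
(If you want to check or flip it by hand, format(1M) in expert mode has a cache
menu - roughly as below; the device name is just an example, and the menu
wording may differ slightly between builds:)

# format -e c1t2d0
format> cache
cache> write_cache
write_cache> display      # reports whether the write cache is currently enabled
write_cache> enable       # or disable, if you want to change it
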
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] New ZFS Intent Log (ZIL) device available - Beta program now open!

2010-01-13 Thread Christopher George
The DDRdrive X1 OpenSolaris device driver is now complete,
please join us in our first-ever ZFS Intent Log (ZIL) beta test 
program.  A select number of X1s are available for loan,
preferred candidates would have a validation background 
and/or a true passion for torturing new hardware/driver :-)

We are singularly focused on the ZIL device market, so a test
environment bound by synchronous writes is required.  The
beta program will provide extensive technical support and a
unique opportunity to have direct interaction with the product
designers.

Would you like to take part in the advancement of Open
Storage and explore the far-reaching potential of ZFS
based Hybrid Storage Pools?

If so, please send an inquiry to zfs at ddrdrive dot com.

The drive for speed,

Christopher George
Founder/CTO
www.ddrdrive.com

*** Special thanks goes out to SUN employees Garrett D'Amore and
James McPherson for their exemplary help and support.  Well done!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thin device support in ZFS?

2010-01-13 Thread Miles Nordin
 et == Erik Trimble erik.trim...@sun.com writes:

et Probably, the smart thing to push for is inclusion of some new
et command in the ATA standard (in a manner like TRIM).  Likely
et something that would return both native Block and Page sizes
et upon query.

that would be the *sane* thing to do.  The *smart* thing to do would
be write a quick test to determine the apparent page size by
performance-testing write-flush-write-flush-write-flush with various
write sizes and finding the knee that indicates the smallest size at
which read-before-write has stopped.  The test could happen in 'zpool
create' and have its result written into the vdev label.  
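
(A very rough sketch of that measurement loop, purely illustrative: the sizes
and device name are placeholders, it destroys whatever is on the target, and a
real probe would interleave explicit cache flushes - e.g. DKIOCFLUSHWRITECACHE
from a small C helper - between the writes.  The knee where the per-byte cost
stops improving approximates the page size:)

# for bs in 2048 4096 8192 16384 32768 65536; do
>   echo bs=$bs
>   ptime dd if=/dev/zero of=/dev/rdsk/c2t0d0p0 bs=$bs count=4096
> done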

Inventing ATA commands takes too long to propagate through the
technosphere, and the EE's always implement them wrongly: for example,
a device with SDRAM + supercap should probably report 512 byte sectors
because the algorithm for copying from SDRAM to NAND is subject to
change and none of your business, but EE's are not good with language
and will try to apelike match up the paragraph in the spec with the
disorganized thoughts in their head, fit pegs into holes, and will end
up giving you the NAND page size without really understanding why you
wanted it other than that some standard they can't control demands it.
They may not even understand why their devices are faster and
slower---they are probably just hurling shit against an NTFS and
shipping whatever runs some testsuite fastest---so doing the empirical
test is the only way to document what you really care about in a way
that will make it across the language and cultural barriers between
people who argue about javascript vs python and ones that argue about
Agilent vs LeCroy.  Within the proprietary wall of these flash
filesystem companies the testsuites are probably worth as much as the
filesystem code, and here without the wall an open-source statistical
test is worth more than a haggled standard.  

Remember the ``removable'' bit in USB sticks and the mess that both
software and hardware made out of it.  (hot-swappable SATA drives are
``non-removable'' and don't need rmformat while USB/FireWire do?
yeah, sorry, u fail abstraction.  and USB drives have the ``removable
medium'' bit set when the medium and the controller are inseparable,
it's the _controller_ that's removable?  ya sorry u fail reading
English.)  If you can get an answer by testing, DO IT, and evolve the
test to match products on the market as necessary.  This promises to
be a lot more resilient than the track record with bullshit ATA
commands and will work with old devices too.  By the time you iron out
your standard we will be using optonanocyberflash instead: that's what
happened with the removeable bit and r/w optical storage.  BTW let me
know when read/write UDF 2.0 on dvd+r is ready---the standard was only
announced twelve years ago, thanks.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Recovering a broken mirror

2010-01-13 Thread Jim Sloey
We have a production SunFireV240 that had a zfs mirror until this week. One of 
the drives (c1t3d0) in the mirror failed. 
The system was shutdown and the bad disk replaced without an export.
I don't know what happened next but by the time I got involved there was no 
evidence that the remaining good disk (c1t2d0) had ever been part of a ZFS 
mirror.
Using dd on the raw device I can see data on slice 6 of the good disk but can't 
import it.
Is there any way to recover from this or are they SOL?

Thanks in advance

# zpool status
no pools available
# zpool import
# ls /etc/zfs
#

# ls /dev/dsk
c0t0d0s0  c0t0d0s3  c0t0d0s6  c1t0d0s1  c1t0d0s4  c1t0d0s7  c1t1d0s2  c1t1d0s5
c1t2d0    c1t2d0s2  c1t2d0s5  c1t3d0s0  c1t3d0s3  c1t3d0s6
c0t0d0s1  c0t0d0s4  c0t0d0s7  c1t0d0s2  c1t0d0s5  c1t1d0s0  c1t1d0s3  c1t1d0s6
c1t2d0s0  c1t2d0s3  c1t2d0s6  c1t3d0s1  c1t3d0s4
c0t0d0s2  c0t0d0s5  c1t0d0s0  c1t0d0s3  c1t0d0s6  c1t1d0s1  c1t1d0s4  c1t1d0s7
c1t2d0s1  c1t2d0s4  c1t3d0    c1t3d0s2  c1t3d0s5

# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>  bootdisk
          /p...@1c,60/s...@2/s...@0,0
       1. c1t1d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>  bootmirr
          /p...@1c,60/s...@2/s...@1,0
       2. c1t2d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /p...@1c,60/s...@2/s...@2,0
       3. c1t3d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /p...@1c,60/s...@2/s...@3,0
Specify disk (enter its number): 2
selecting c1t2d0

format> inquiry
Vendor:   FUJITSU
Product:  MAT3147N SUN146G
Revision: 1703

format> current
Current Disk = c1t2d0
<SUN146G cyl 14087 alt 2 hd 24 sec 848>
/p...@1c,60/s...@2/s...@2,0

format> verify

Primary label contents:

Volume name =
ascii name  = <SUN146G cyl 14087 alt 2 hd 24 sec 848>
pcyl        = 14089
ncyl        = 14087
acyl        =     2
nhead       =    24
nsect       =   848
Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 -    12      129.19MB    (13/0/0)       264576
  1       swap    wu      13 -    25      129.19MB    (13/0/0)       264576
  2     backup    wu       0 - 14086      136.71GB    (14087/0/0) 286698624
  3 unassigned    wm       0                0         (0/0/0)             0
  4 unassigned    wm       0                0         (0/0/0)             0
  5 unassigned    wm       0                0         (0/0/0)             0
  6        usr    wm      26 - 14086      136.46GB    (14061/0/0) 286169472
  7 unassigned    wm       0                0         (0/0/0)             0

format> quit

# dd if=/dev/rdsk/c1t2d0s6 count=1 | od -x
1+0 records in
1+0 records out
000 7215 2b79 046c 8ddc 3e31 6966 caa4 6950
020 9c60 4514 7d4a 2a13 9b66 e69e d484 a327
040 4eb0 220e 9c7f 6604 6182 7b39 1310 9c5c
060 4584 c7c6 bd51 aba9 7b4d ec9a 99b2 6bc2
100 6cab 7a88 46d7 937d 5026 86cd 4cf9 ae83
120 20f3 44ec c22e d322 e6cc 2c09 f598 caf4
140 a9c5 85ad a695 8862 c6cc 124d bb72 d540
160 8886 2173 57cc 9759 a209 d78e 9a11 df4d
200 cdc4 5c99 259a 56e5 a301 d540 e691 182b
220 b354 93a9 bc33 085e 1fb6 0445 ac95 59aa
240 fb5a dd66 21de 2f18 24e7 d4c9 c464 99a5
260 9ae4 628a a434 7b96 d1a0 d761 3c21 3ed5
300 c417 5364 e5a3 837a dfd6 266c 50a6 4b10
320 95d5 2952 0f8f cb30 9ef0 23ab 6abc 6872
340 ed58 1977 79ff 9a89 0533 530e 6b83 95aa
360 630b f638 8508 02b1 6266 ca8a 6990 8ad4
400 47c2 7db3 9d9c 62cc ccb4 db3a 0803 ef35
420 0bd3 46b3 04bb d778 c471 9d65 de1b 1861
440 e0b9 ae27 d084 19da 716d b0ca 67be 07ea
460 5650 268e eb2c d7cc 083d c1a8 55ac 4c3c
500 d699 f558 d353 dc61 e25b 2bb8 7d8c 249c
520 c853 258a 01cd b366 bad3 2599 f8ac b3dc
540 6783 72eb 9029 926b 72e6 c84c 3cd7 59e1
560 f122 f20e f8d8 f32f 8226 ceeb acd0 ccf0
600 df3c f3f5 1e71 5d67 da75 1d84 b177 d21b
620 5fa8 a340 6404 2bec 2884 1d62 83cc 2498
640 4288 cf67 c6de 0970 75fe 9e05 8ed8 2173
660 fd30 4ec8 9ea0 63ee bd3f 7a07 b01a d04b
700 8045 29a6 6203 9ed3 9c16 740f 335e 53d8
720 c70e 9c73 981a f0f1 3547 8b84 0651 b1fb
740 b5c8 4887 dafe 15ab 721b 60d2 c1d8 8441
760 eee2 1896 2311 76da 1bfb 4422 3439 07e5
0001000
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New ZFS Intent Log (ZIL) device available - Beta program now open!

2010-01-13 Thread Adam Leventhal
Hey Chris,

 The DDRdrive X1 OpenSolaris device driver is now complete,
 please join us in our first-ever ZFS Intent Log (ZIL) beta test 
 program.  A select number of X1s are available for loan,
 preferred candidates would have a validation background 
 and/or a true passion for torturing new hardware/driver :-)
 
 We are singularly focused on the ZIL device market, so a test
 environment bound by synchronous writes is required.  The
 beta program will provide extensive technical support and a
 unique opportunity to have direct interaction with the product
 designers.

Congratulations! This is great news for ZFS. I'll be very interested to
see the results members of the community can get with your device as part
of their pool. COMSTAR iSCSI performance should be dramatically improved
in particular.

Adam

--
Adam Leventhal, Fishworks                     http://blogs.sun.com/ahl

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Recovering a broken mirror

2010-01-13 Thread Victor Latushkin

Jim Sloey wrote:
We have a production SunFireV240 that had a zfs mirror until this week. One of the drives (c1t3d0) in the mirror failed. 
The system was shutdown and the bad disk replaced without an export.

I don't know what happened next but by the time I got involved there was no 
evidence that the remaining good disk (c1t2d0) had ever been part of a ZFS 
mirror.
Using dd on the raw device I can see data on slice 6 of the good disk but can't 
import it.


Did you use entire disks c1t2d0 and c1t3d0 to create your pool? If yes, 
then current labeling on c1t2d0 suggests that it got relabeled somehow 
in the process.



Is there any way to recover from this or are they SOL?


First I'd make a full copy of c1t2d0 content (so you can try recovery 
several times).


Then first thing I'd try is to relabel c1t2d0 with EFI label and check 
if the pool is there.
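
Something along these lines - the image path is a placeholder and needs ~140GB
of free space, and s2 is the backup slice covering the whole disk per the label
above:

# dd if=/dev/rdsk/c1t2d0s2 of=/backup/c1t2d0.img bs=1048576   # full copy first
# format -e c1t2d0        # label -> choose the EFI label when prompted
# zpool import            # see whether the pool shows up now
# zpool import -d /dev/dsk                                    # or search the device directory explicitly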


regards,
victor




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Recovering a broken mirror

2010-01-13 Thread Jim Sloey
No. Only slice 6 from what I understand. 
I didn't create this (the person who did has left the company) and all I know 
is that the pool was mounted on /oraprod before it faulted.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New ZFS Intent Log (ZIL) device available - Beta program now open!

2010-01-13 Thread Neil Perrin

Hi Adam,

So was FW aware of this or in contact with these guys?
Also are you requesting/ordering any of these cards to evaluate?

The device seems kind of small at 4GB, and uses a double wide PCI Express slot.

Neil.

On 01/13/10 12:27, Adam Leventhal wrote:

Hey Chris,


The DDRdrive X1 OpenSolaris device driver is now complete,
please join us in our first-ever ZFS Intent Log (ZIL) beta test 
program.  A select number of X1s are available for loan,
preferred candidates would have a validation background 
and/or a true passion for torturing new hardware/driver :-)


We are singularly focused on the ZIL device market, so a test
environment bound by synchronous writes is required.  The
beta program will provide extensive technical support and a
unique opportunity to have direct interaction with the product
designers.


Congratulations! This is great news for ZFS. I'll be very interested to
see the results members of the community can get with your device as part
of their pool. COMSTAR iSCSI performance should be dramatically improved
in particular.

Adam

--
Adam Leventhal, Fishworks                     http://blogs.sun.com/ahl

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New ZFS Intent Log (ZIL) device available - Beta program now open!

2010-01-13 Thread Tristan Ball
That's very interesting tech you've got there... :-) I have a couple of 
questions, with apologies in advance if I missed them on the website..


I see the PCI card has an external power connector - can you explain 
how/why that's required, as opposed to using an on card battery or 
similar. What happens if the power to the card fails?


The 155MB/s rate for sustained writes seems low for DDR RAM? Is this because 
the backup to NAND is a constant thing, rather than only at power fail?


Regards
   Tristan

Christopher George wrote:

The DDRdrive X1 OpenSolaris device driver is now complete,
please join us in our first-ever ZFS Intent Log (ZIL) beta test 
program.  A select number of X1s are available for loan,
preferred candidates would have a validation background 
and/or a true passion for torturing new hardware/driver :-)


We are singularly focused on the ZIL device market, so a test
environment bound by synchronous writes is required.  The
beta program will provide extensive technical support and a
unique opportunity to have direct interaction with the product
designers.

Would you like to take part in the advancement of Open
Storage and explore the far-reaching potential of ZFS
based Hybrid Storage Pools?

If so, please send an inquiry to zfs at ddrdrive dot com.

The drive for speed,

Christopher George
Founder/CTO
www.ddrdrive.com

*** Special thanks goes out to SUN employees Garrett D'Amore and
James McPherson for their exemplary help and support.  Well done!
  

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] NexentaStor 2.2.1 Developer Edition Released

2010-01-13 Thread Anil Gulecha
Hi All,

I'd like to announce the immediate availability of NexentaStor
Developer Edition v2.2.1.

Changes since v2.2 include many bug fixes. More information:

* This is a major stable release.
* Storage limit increased to 4TB.
* Built-in antivirus capability.
* Consistent snapshots of Oracle and MySQL databases.
* A Citrix StorageLink adapter.
* Asynchronous reverse replication support.
* A per-snapshot probabilistic search engine.
* Remote-access support.
* Japanese language support for the interface.

You can download CD image at
http://www.nexentastor.org/projects/site/wiki/DeveloperEdition

Summary of recent changes is on freshmeat at
http://freshmeat.net/projects/nexentastor/

A complete list of projects (14 and growing) is at
http://www.nexentastor.org/projects

Nightly images are available at
http://ftp.nexentastor.org/nightly/

Regards
--
Anil Gulecha
Community Lead, NexentaStor.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS import hangs with over 66000 context switches shown in top

2010-01-13 Thread Orvar Korvar
It seems there is more info on this issue here:
http://opensolaris.org/jive/thread.jspa?threadID=121568&tstart=0
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] opensolaris-vmware

2010-01-13 Thread Gregory Durham
Tim,
iSCSI was a design decision at the time. Performance was key and I wanted
to be able to hand a LUN on the SAN to ESXi and use it as a raw
disk in physical compatibility mode... however, what this means is that I
can no longer take snapshots on the ESXi server and must rely on ZFS
snapshots. Also, I have multiple *nix virtual machines I need to worry about
backing up, and about making sure that if everything fails the file systems are
consistent...
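
(For reference, the replication leg of this is plain incremental ZFS
send/recv - the snapshot, pool, and host names below are made up:)

# zfs snapshot sanpool/exch-db@20100113-2200
# zfs send -i sanpool/exch-db@20100113-2100 sanpool/exch-db@20100113-2200 | \
    ssh standby-san zfs recv -F sanpool/exch-db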

Thanks,
Greg

On Mon, Jan 11, 2010 at 7:36 PM, Tim Cook t...@cook.ms wrote:



 On Mon, Jan 11, 2010 at 6:17 PM, Greg gregory.dur...@gmail.com wrote:

 Hello All,
 I hope this makes sense, I have two opensolaris machines with a bunch of
 hard disks, one acts as a iSCSI SAN, and the other is identical other than
 the hard disk configuration. The only thing being served are VMWare esxi raw
 disks, which hold either virtual machines or data that the particular
 virtual machine uses, I.E. we have exchange 2007 virtualized and through its
 iSCSI initiator we are mounting two LUNs one for the database and another
 for the Logs, all on different arrays of course. Any how we are then
 snapshotting this data across the SAN network to the other box using
 snapshot send/recv. In the case the other box fails this box can immediatly
 serve all of the iSCSI LUNs. The problem, I don't really know if its a
 problem...Is when I snapshot a running vm will it come up alive in esxi or
 do I have to accomplish this in a different way. These snapshots will then
 be written to tape with bacula. I hope I am posting this in the correct
 place.

 Thanks,
 Greg
 --


 What you've got are crash consistent snapshots.  The disks are in the same
 state they would be in if you pulled the power plug.  They may come up just
 fine, or they may be in a corrupt state.  If you take snapshots frequently
 enough, you should have at least one good snapshot.  Your other option is
 scripting.  You can build custom scripts to leverage the VSS providers in
 Windows... but it won't be easy.

 Any reason in particular you're using iSCSI?  I've found NFS to be much
 more simple to manage, and performance to be equivalent if not better (in
 large clusters).

 --
 --Tim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] opensolaris-vmware

2010-01-13 Thread Gregory Durham
Arnaud,
The virtual machines coming up as if they were on is the least of my
worries; my biggest worry is keeping the filesystems of the VMs alive, i.e.
not corrupt. I have all of my virtual machines set up with raw LUNs in
physical compatibility mode. This has increased performance, but sadly at the
cost of VMware snapshots. Is there anything within the virtual machine
itself I can do to keep the filesystem intact?

In the case of Exchange, I have Exchange itself on a raw LUN in physical
compatibility mode, and I have 2 LUNs mounted with the Server 2008 iSCSI
initiator for the logs and the Exchange DB.

This setup is similar to several other *nix VMs I have residing on
this SAN, which I am also worrying about. Any other ideas?

Thanks,
Greg


On Tue, Jan 12, 2010 at 1:11 AM, Arnaud Brand abr...@esca.fr wrote:

  Your machines won’t come up running; they’ll start up from scratch (as
 if you had hit the reset button).



 If you want your machines to come up you have to make vmware snapshots,
 which capture the state of the running VM (memory, etc..). Typically this is
 automated with solutions like VCB (Vmware consolidated backup), but I’ve
 just found http://communities.vmware.com/docs/DOC-8760 (not tested though
 since we are running ESX and have bought VCB licenses).



  Bear in mind that VMware won’t be able to take a consistent snapshot if
  some disks in the VM come from VMDK files while other disks are raw
  LUNs (or are otherwise mounted directly in the VM, i.e. outside ESX’s
  control).  You’ll have to restart the machine from scratch in this case, and
  there is a strong potential for discrepancies between the VMDKs and the raw LUNs.

  On the other hand, I understand that you want the Exchange 2007 logs and DB to
  live their own lives, so that when you "revert to snapshot" you don’t lose all
  the mail that was sent/delivered in between.

 So this can be a perfectly valid design depending on how you have set it
 up.



 I don’t think snapshots (be they vmware or zfs) are a good tool for
 failover or redundancy here. Basically, if your storage is not accessible
 from your esxi hosts, your VMs are toasted and you have to restart them from
 scratch.

  Please note, I don’t know the specifics of ESXi’s iSCSI retry policies. For
  ESX we use an SVC cluster (a 2-node FC cluster), so our ESX hosts can always
  access the storage.



  You could try to set up an iSCSI cluster like this:
  http://docs.sun.com/app/docs/doc/820-7821/z4f557a?a=view (look for the
  figure at the bottom). You would obtain a mirrored pool where you could
  place the VMware zvols. Then you could iSCSI-share these zvols.

  Though I’m not sure if/how OpenHA could/would fail over if one of your nodes
  fails (I always wanted to play with OpenHA but have neither the time nor the
  hardware at hand to try it).



 This setup of course doesn’t prevent you from doing vmware snapshots and
 zfs snapshots, you’ll just achieve some level of fault-tolerance.



 Please note I don’t know anything about using NFS with esx/esxi. Maybe
 there are setups that are easier to achieve using NFS and provide the same
 (or a better) level of fault-tolerance.



 Hope this helps,

 Arnaud



  *From:* zfs-discuss-boun...@opensolaris.org [mailto:
  zfs-discuss-boun...@opensolaris.org] *On behalf of* Tim Cook
  *Sent:* Tuesday, 12 January 2010 04:36
  *To:* Greg
  *Cc:* zfs-discuss@opensolaris.org
  *Subject:* Re: [zfs-discuss] opensolaris-vmware





 On Mon, Jan 11, 2010 at 6:17 PM, Greg gregory.dur...@gmail.com wrote:

 Hello All,
 I hope this makes sense, I have two opensolaris machines with a bunch of
 hard disks, one acts as a iSCSI SAN, and the other is identical other than
 the hard disk configuration. The only thing being served are VMWare esxi raw
 disks, which hold either virtual machines or data that the particular
 virtual machine uses, I.E. we have exchange 2007 virtualized and through its
 iSCSI initiator we are mounting two LUNs one for the database and another
 for the Logs, all on different arrays of course. Any how we are then
 snapshotting this data across the SAN network to the other box using
 snapshot send/recv. In the case the other box fails this box can immediatly
 serve all of the iSCSI LUNs. The problem, I don't really know if its a
 problem...Is when I snapshot a running vm will it come up alive in esxi or
 do I have to accomplish this in a different way. These snapshots will then
 be written to tape with bacula. I hope I am posting this in the correct
 place.

 Thanks,
 Greg
 --


 What you've got are crash consistent snapshots.  The disks are in the same
 state they would be in if you pulled the power plug.  They may come up just
 fine, or they may be in a corrupt state.  If you take snapshots frequently
 enough, you should have at least one good snapshot.  Your other option is
 scripting.  You can build custom scripts to leverage the VSS providers in
 Windows... but it won't be easy.

 Any reason in particular you're using iSCSI?  I've found NFS to be much
 

Re: [zfs-discuss] opensolaris-vmware

2010-01-13 Thread Fajar A. Nugraha
On Thu, Jan 14, 2010 at 6:40 AM, Gregory Durham
gregory.dur...@gmail.com wrote:
 Arnaud,
 The virtual machines coming up as if they were on is the least of my
 worries; my biggest worry is keeping the filesystems of the VMs alive, i.e.
 not corrupt.

As Tim said, the snapshot disks are in the same state they would be in
if you pulled the power plug.
This is also the same thing you get, BTW, if you use LVM snapshots (on
Linux) or SAN/NAS-based snapshots (like NetApp).

 In the case of exchange, I have exchange itself on a raw lun in physical
 compatibility mode, and I have 2 LUNs mounted with the Server 2008 iSCSI
 initiator for logs and the exchange DB.

Most modern filesystems and databases have journaling that can recover
from power-failure scenarios, so they should be able to use the
snapshot and provide consistent, non-corrupt information.

So the question now is, have you tried restoring from snapshot?
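
(A low-risk way to test it, without touching the live zvol, is to clone the
snapshot and publish the clone as a scratch LUN for a test VM to mount, letting
the filesystem/DB replay its journal.  Assuming a COMSTAR setup - the names and
GUID below are placeholders:)

# zfs clone sanpool/exch-db@20100113-2200 sanpool/exch-db-test
# sbdadm create-lu /dev/zvol/rdsk/sanpool/exch-db-test
# stmfadm add-view 600144f0...        # GUID printed by sbdadm create-lu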

-- 
Fajar
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New ZFS Intent Log (ZIL) device available - Beta program now open!

2010-01-13 Thread Christopher George
Excellent questions!

 I see the PCI card has an external power connector - can you explain 
 how/why that's required, as opposed to using an on card battery or 
 similar. 

DDRdrive X1 ZIL functionality is best served by an externally attached UPS;
this allows the X1 to perform as a non-volatile storage device without specific
user configuration or unique operation.  An often overlooked aspect of batteries
(irrespective of technology, internal or external) is their limited lifetime and
the varying degrees of maintenance and oversight they require.  For example, a
lithium (Li-Ion) battery supply, as used by older NVRAM products but not the
X1, does have the minimum required energy density for an internal solution,
but it has a fatal flaw for enterprise applications - an ignition-mode failure
possibility.  Google "lithium battery fire".  Such an instance, even if rare,
would be catastrophic not only to the on-card data but to the host server and so
on...  Supercapacitors are another alternative which thankfully do not share
the ignition-mode failure mechanism of Li-Ion, but they are hampered mainly by
cost, with some longevity concerns which can be addressed.  In the end, we
selected data integrity, cost, and serviceability as our top three priorities.
This led us to the industry-standard external lead-acid battery as sold by APC.

Key benefits of the DDRdrive X1 power solution:

1)  Data Integrity - Supports multiple back-to-back power failures, a single
DDRdrive X1 uses less than 5W when the host is powered down, even a
small UPS is over-provisioned and unlike an internal solution will not normally
require a lengthy recharge time prior to the next power incident.  Optionally a
backup to NAND can be performed to remove the UPS duration as a factor.

2)  Cost Effective / Flexible - The Smart-UPS SC 450VA (280 Watts) is an
excellent choice for most installations and retails for approximately $150.00.
Flexibility is in regard to UPS selection, as it can be right-sized (duration)
for each individual application if needed.

3)  Reliability / Maintenance - UPS front-panel LED status for battery
replacement, and audible alarms when the battery is low or non-operational.
Industry-standard battery form factor backed by APC, the industry-leading
manufacturer of enterprise-class backup solutions.

 What happens if the *host* power to the card fails?

Nothing, the DDRdrive X1's data integrity is guaranteed by the attached UPS.

 The 155MB/s rate for sustained writes seems low for DDR RAM?

The DRAM's value-add is its extremely low latency (even compared to NAND)
and other intrinsic properties such as longevity and reliability.  The
read/write sequential bandwidth is completely bound by the PCI Express
interface.

 Is this because the backup to NAND is a constant thing, rather than only
 at power fail?

No, the backup to NAND is not continual.  All Host IO is directed to DRAM for
maximum performance while the NAND only provides an optional (user
configured) backup/restore feature.

Christopher George
Founder/CTO
www.ddrdrive.com
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] opensolaris-vmware

2010-01-13 Thread Gregory Durham
Haha, yeah, that's tomorrow. I have a test VM I will be testing on. I shall
report back! Thank you all!

On Wed, Jan 13, 2010 at 8:26 PM, Fajar A. Nugraha fa...@fajar.net wrote:

 On Thu, Jan 14, 2010 at 6:40 AM, Gregory Durham
 gregory.dur...@gmail.com wrote:
  Arnaud,
  The virtual machines coming up as if they were on is the least of my
  worries; my biggest worry is keeping the filesystems of the VMs alive,
  i.e. not corrupt.

 As Tim said, the snapshot disks are in the same state they would be in
 if you pulled the power plug.
 This is also the same thing you get, BTW, if you use LVM snapshots (on
 Linux) or SAN/NAS-based snapshots (like NetApp).

  In the case of exchange, I have exchange itself on a raw lun in physical
  compatibility mode, and I have 2 LUNs mounted with the Server 2008 iSCSI
  initiator for logs and the exchange DB.

 Most modern filesystems and databases have journaling that can recover
 from power-failure scenarios, so they should be able to use the
 snapshot and provide consistent, non-corrupt information.

 So the question now is, have you tried restoring from snapshot?

 --
 Fajar

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New ZFS Intent Log (ZIL) device available - Beta program now open!

2010-01-13 Thread Miles Nordin
 cg == Christopher George cgeo...@ddrdrive.com writes:

cg Nothing, the DDRdrive X1's data integrity is guaranteed by the
cg attached UPS.

I've found UPS power is less reliable than unprotected line power
where I live, especially when using bargain UPS's like the ones you
suggest.  I've tracked it for five years, and that's simply the case.
When devices have dual power inputs I do plug one into the UPS though.

I've also found unplanned powerdowns usually occur during maintenance
because of people tripping over cords (networking equipment likes to
put A/B power on opposite sides of the chassis.  Thanks for that, to
those who do it.), dropping things, bumping power strip switches
(which should not exist in the first place), provoking crappy devices
(ex poweron surges causing overcurrent), mucking around with the
batteries, or confusing highly-stupid UPS microcontrollers over their
buggy web interfaces (``reset controller''), clumsy buttonpads (a
single on/off/test button?  are you *CRAZY*?  and sometimes I have to
_hold the button down_?  What next, double-pressing?  there's on,
there's off, but what about the ``off-but-charging'' state: how's it
requested and how's it confirmed?  hazily?  thanks, assholes.).  Your
decision to use UPS power is based on the imaginary scenario you walk
us through: building loses line power for X minutes, UPS runs out.
Obviously I'm familiar with the scenario but honestly I've not run
into that one in practice as often as other ones, which is why I call
it fantasy.

cg NAND only provides an optional (user configured)
cg backup/restore feature.

so, it does not even attempt to query the UPS?  How can it live up to
the ideally-functioning-UPS protection scheme you describe, then?  To
do so it needs UPS communication: it'd need to NAND-backup before the
battery ran out, so it needs to get advance warning of a low battery
from the UPS.  It'd also need a way to halt the computer, or at least
to take itself offline and propagate the error up the driver stack, if
the UPS doesn't have enough charge to complete a NAND backup or if the UPS
considers its batteries defective.

Personally, I don't care if the card talks to the UPS, because I think
realistically if you take the cases when power stops coming out of a
UPS and overlap them with the cases when the UPS provided warning
before the power stopped coming out, there's not much overlap.
Spurious warnings and sudden shutdowns are *more* common over the life
of the units I've had than this imaginary graceful powerdown scenario.

Finally, data that's stored ``durably'' needs to survive yanked
cables.  IMHO most people who are certain cables will never be yanked
or are willing to take the risk, would be better off just disabling
the ZIL rather than using a slog.  Then you don't have to worry about
pools failing to import from missing slog if you do yank a cable,
which is a better tradeoff.
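
(For the archives: at the time of writing that means the global, unsupported
zil_disable tunable - it affects every dataset on the host and takes effect on
the next pool import/reboot, so treat the exact spelling as something to verify
on your build:)

# echo 'set zfs:zil_disable = 1' >> /etc/system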

NAND storage therefore needs to be self-contained, like a disk drive,
to be useful as a slog.  The ANS-9010 comes closer to that than this
card, though I don't know if it actually delivers, either.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New ZFS Intent Log (ZIL) device available - Beta program now open!

2010-01-13 Thread Tristan Ball

Thanks for the detailed response - further questions inline...

Christopher George wrote:

Excellent questions!

  
I see the PCI card has an external power connector - can you explain 
how/why that's required, as opposed to using an on card battery or 
similar. 



DDRdrive X1 ZIL functionality is best served by an externally attached UPS;
this allows the X1 to perform as a non-volatile storage device without specific
user configuration or unique operation.  An often overlooked aspect of batteries
(irrespective of technology, internal or external) is their limited lifetime and
the varying degrees of maintenance and oversight they require.  For example, a
lithium (Li-Ion) battery supply, as used by older NVRAM products but not the
X1, does have the minimum required energy density for an internal solution,
but it has a fatal flaw for enterprise applications - an ignition-mode failure
possibility.  Google "lithium battery fire".  Such an instance, even if rare,
would be catastrophic not only to the on-card data but to the host server and so
on...  Supercapacitors are another alternative which thankfully do not share
the ignition-mode failure mechanism of Li-Ion, but they are hampered mainly by
cost, with some longevity concerns which can be addressed.  In the end, we
selected data integrity, cost, and serviceability as our top three priorities.
This led us to the industry-standard external lead-acid battery as sold by APC.

Key benefits of the DDRdrive X1 power solution:

1)  Data Integrity - Supports multiple back-to-back power failures, a single
DDRdrive X1 uses less than 5W when the host is powered down, even a
small UPS is over-provisioned and unlike an internal solution will not normally
require a lengthy recharge time prior to the next power incident.  Optionally a
backup to NAND can be performed to remove the UPS duration as a factor.

2)  Cost Effective / Flexible - The Smart-UPS SC 450VA (280 Watts) is an
excellent choice for most installations and retails for approximately $150.00.
Flexibility is in regard to UPS selection, as it can be right-sized (duration)
for each individual application if needed.

3)  Reliability / Maintenance - UPS front-panel LED status for battery
replacement, and audible alarms when the battery is low or non-operational.
Industry-standard battery form factor backed by APC, the industry-leading
manufacturer of enterprise-class backup solutions.
  
OK, I take your point about battery fires; however, we've been using 
battery-backed cards (of various types) in servers for a while now, and 
I think you might have over-emphasized those risks when compared 
to the operational complexity of maintaining a separate power circuit 
for my PCI cards! But then, I haven't actually done the research on 
battery reliability either. :-)


I'm not sure about others on the list, but I have a dislike of AC power 
bricks in my racks. Sometimes they're unavoidable, but they're also 
physically awkward - where do we put them? Using up space on a dedicated 
shelf? Cable tied to the rack itself? Hidden under the floor?


Is the state of the power input exposed to software in some way? In 
other words, can I have a Nagios check running on my server that 
triggers an alert if the power cable accidentally gets pulled out?


  

What happens if the *host* power to the card fails?



Nothing, the DDRdrive X1's data integrity is guaranteed by the attached UPS.
  
OK, which means that the UPS must be separate from the UPS powering the 
server, then.
  

The 155MB/s rate for sustained writes seems low for DDR RAM?



The DRAM's value-add is its extremely low latency (even compared to NAND)
and other intrinsic properties such as longevity and reliability.  The
read/write sequential bandwidth is completely bound by the PCI Express
interface.
  
Any plans for a multi-lane PCIe version then? All my servers are still 
Gig-E, and I'm not likely to see more than 100MB/sec of NFS traffic; 
however, I'm sure there are plenty of NFS servers on 10G out there that 
will see quite a bit more than 155MB/sec for moderate amounts of time. I 
know we can put more than one of these cards in a server, but those 
slots are often taken up with other things!


I look forward to these being available in Australia  :-)

Thanks,
   Tristan


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss