Re: [zfs-discuss] virtualization, alignment and zfs variation stripes

2009-07-23 Thread Nicolas Williams
On Wed, Jul 22, 2009 at 02:45:52PM -0500, Bob Friesenhahn wrote:
 On Wed, 22 Jul 2009, t. johnson wrote:
 Let's say I have a simple-ish setup that uses VMware files for 
 virtual disks on an NFS share from zfs. I'm wondering how zfs' 
 variable block size comes into play? Does it make the alignment 
 problem go away? Does it make it worse? Or should we perhaps be
 
 My understanding is that zfs uses fixed block sizes except for the 
 tail block of a file, or if the filesystem has compression enabled.

For one block files, the block is variable, between 512 bytes and the
smaller of the dataset's recordsize or 128KB.  For multi-block files all
blocks are the same size, except the tail block.  But these are sizes in
file data, not actual on-disk sizes (which can be less because of
compression).

 Zfs's large blocks can definitely cause performance problems if the 
 system has insufficient memory to cache the blocks which are accessed, 
 or only part of the block is updated.

You should set the virtual disk image files' recordsize (or, rather, the
containing dataset's recordsize) to match the preferred block size of
the filesystem types (or data) that you'll put on those virtual disks.
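
A minimal sketch of what I mean (dataset name and size are just examples;
note that recordsize only affects files written after it is set):

  # zfs create -o recordsize=4k tank/vmimages   # e.g. for guests whose FS uses 4K blocks
  # zfs get recordsize tank/vmimages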

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] triple-parity: RAID-Z3

2009-07-23 Thread Victor Latushkin

On 22.07.09 10:45, Adam Leventhal wrote:

which gap?

'RAID-Z should mind the gap on writes' ?



I believe this is in reference to the raid 5 write hole, described here:
http://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_5_performance


It's not.

So I'm not sure what the 'RAID-Z should mind the gap on writes' 
comment is getting at either.


Clarification?



I'm planning to write a blog post describing this, but the basic problem 
is that RAID-Z, by virtue of supporting variable stripe writes (the 
insight that allows us to avoid the RAID-5 write hole), must round the 
number of sectors up to a multiple of nparity+1. This means that we may 
have sectors that are effectively skipped. ZFS generally lays down data 
in large contiguous streams, but these skipped sectors can stymie both 
ZFS's write aggregation and the hard drive's ability to group I/Os and 
write them quickly.
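
As a rough sketch of just that rounding step (the sector counts here are
made up, and the parity sector count really depends on the vdev width):

  $ nparity=2; alloc=26     # e.g. 24 data sectors + 2 parity sectors on raidz2
  $ echo $(( (alloc + nparity) / (nparity + 1) * (nparity + 1) - alloc ))
  1                         # one skipped sector to round up to a multiple of nparity+1 = 3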


Jeff Bonwick added some code to mind these gaps on reads. The key 
insight there is that if we're going to read 64K, say, with a 512 byte 
hole in the middle, we might as well do one big read rather than two 
smaller reads and just throw out the data that we don't care about.


Of course, doing this for writes is a bit trickier since we can't just 
blithely write over gaps as those might contain live data on the disk. 
To solve this we push the knowledge of those skipped sectors down to the 
I/O aggregation layer in the form of 'optional' I/Os purely for the 
purpose of coalescing writes into larger chunks.


This exact issue was discussed here almost three years ago:

http://www.opensolaris.org/jive/thread.jspa?messageID=60241


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] why is zpool import still hanging in opensolaris 2009.06 ??? no fix yet ???

2009-07-23 Thread Luc De Meyer
Follow-up: happy ending ...

It took quite some tinkering but... I have my data back...

I ended up starting without the troublesome ZFS storage array, de-installed the 
iscsitarget software and re-installed it... just to have Solaris boot without 
complaining about missing modules...

That left me with a system that would boot as long as the storage was 
disconnected... Reconnecting it made the boot stop at the hostname. Then the 
disk activity light would flash every second or so forever... I then rebooted 
using milestone=none. That also worked with the storage attached! So now I was 
sure that some software process was causing a hangup (or what appeared to be a 
hangup). I could now, in milestone none, verify that the pool was intact: and so 
it was... fortunately I had not broken the pool itself... all online with no 
errors to report.
I then went to milestone=all, which again made the system hang with the disk 
activity every second forever. I think the task doing this was devfsadm. I then 
assumed on a gut feeling that somehow the system was scanning or checking the 
pool. I left the system overnight in a desperate attempt because I calculated 
the 500GB checking to take about 8 hrs if the system would *really* scan 
everything. (I copied a 1 TB drive last week which took nearly 20 hrs, so I 
learned that sometimes I need to wait... copying these big disks takes a *lot* 
of time!)
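
For anyone wanting to repeat the milestone trick, it was roughly this (the
exact GRUB entry will of course differ on your box):

  At the GRUB menu, edit the kernel line and append:  -m milestone=none
  # svcadm milestone all      # once logged in, to resume normal startup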

This morning I switched on the monitor and lo and behold : a login screen 
The store was there!

Lesson for myself and others: you MUST wait at the hostname line: the system 
WILL eventually come online... but don't ask how long it takes... I hate to 
think how long it would take if I had a 10TB system. (but then again, a 
file-system-check on an ext2 disk also takes forever...)

I re-enabled the iscsitgtd and did a list: it saw one of the two targets! 
(which was OK because I remembered that I had turned off the shareiscsi flag on 
the second share.)
I then went ahead and connected the system back into the network and repaired 
the iscsi-target on the virtual mainframe: WORKED! Copied over the virtual 
disks to local store so I can at least start up these servers ASAP again.
Then set the iscsishare on the second and most important share: OK! Listed the 
targets: THERE, BOTH! Repaired its connection too: WORKED...!
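
The re-share step was basically just this (volume name is only an example):

  # zfs set shareiscsi=on tank/vol2     # turn the shareiscsi flag back on
  # iscsitadm list target               # both targets should now show up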

I am copying everything away from the ZFS pools now, but my data is 
recovered... fortunately.

I now have mixed feelings about the ordeal: yes, Sun Solaris kept its promise: I 
did not lose my data. But the time and trouble it took to recover in this case 
(just restarting a system, for example, taking an overnight wait!) is something 
that a few of my customers would *seriously* dislike...

But: a happy ending after all... most important: data rescued, and second most important: I 
learned a lot in the process...

Bye
Luc De Meyer
Belgium
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread F. Wessels
Thanks for posting this solution.

But I would like to point out that bug 6574286, "removing a slog doesn't work", 
still isn't resolved. A solution is on its way, according to George Wilson. 
But in the meantime, IF something happens you might be in a lot of trouble. 
Even without some unfortunate incident you cannot, for example, export your data 
pool, pull the drives and leave the root pool.
Don't get me wrong, I would like such a setup a lot. But I'm not going to 
implement it until the slog can be removed or the pool imported without the 
slog.

In the meantime, can someone confirm that in such a case (root pool and ZIL in 
two slices, both mirrored) the write cache can be enabled with format? Only 
ZFS is using the disk, but perhaps I'm wrong on this. There have been posts 
regarding enabling the write_cache, but I couldn't find a conclusive answer for 
the above scenario.
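
For reference, what I have been poking at is roughly this in format's expert
mode (I make no claim it is authoritative for the sliced case):

  # format -e
  (select the disk)
  format> cache
  cache> write_cache
  write_cache> display
  write_cache> enable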

Regards,

Frederik
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motherboard for home zfs/solaris file server

2009-07-23 Thread F. Wessels
Hi,

I'm using Asus M3A78 boards (with the SB700) for OpenSolaris and M2A* boards 
(with the SB600) for Linux, some of them with 4*1GB and others with 4*2GB ECC 
memory. ECC faults will be detected and reported. I tested it with a small 
tungsten light: by moving the light source slowly towards the memory banks 
you'll heat them up in a controlled way, and at a certain point bit flips will 
occur.
I recommend you go for an M4A board since they support up to 16 GB. 
I don't know if you can run OpenSolaris without a video card after installation; 
I think you can disable the "halt on no video card" option in the BIOS. But Simon 
Breden had some trouble with it, see his home-server blog. But you can go for 
one of the three M4A boards with a 780G onboard. Those will give you 2 PCIe 
x16 connectors. I don't think the onboard NIC is supported. I always put an 
Intel (e1000) in, just to prevent any trouble. I don't have any trouble with 
the SB700 in AHCI mode. Hotplugging works like a charm. Transferring a couple of 
GBs over eSATA takes considerably less time than via USB.
I have a PATA to dual CF adapter and two industrial 16GB CF cards as a mirrored 
root pool. It takes forever to install Nevada, at least 14 hours; I suspect 
the CF cards lack caches. But I don't update that regularly, still on snv_104. 
And I have 2 mirrors and a hot spare. The sixth port is an eSATA port I use to 
transfer large amounts of data. This system consumes about 73 watts idle and 82 
under I/O load. (5 disks, a separate NIC, 8 GB RAM and a BE-2400 all 
using just 73 watts!!!)
Please note that frequency scaling is only supported on the K10 architecture. 
But don't expect too much power saving from it. A lower voltage yields far 
greater savings than a lower frequency.
In September I'll do a post about the aforementioned M4A boards and an LSI SAS 
controller in one of the PCIe x16 slots.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Kyle McDonald

F. Wessels wrote:

Thanks posting this solution.

But I would like to point out that bug 6574286 removing a slog doesn't work 
still isn't resolved. A solution is under it's way, according to George Wilson. But in 
the mean time, IF something happens you might be in a lot of trouble. Even without some 
unfortunate incident you cannot for example export your data pool, pull the drives and 
leave the root pool.
  
In my case the slog slice wouldn't be the slog for the root pool, it 
would be the slog for a second data pool.


If the device went bad, I'd have to replace it, true. But if the device 
goes bad, then so did a good part of my root pool, and I'd have to 
replace that too.

Don't get me wrong I would like such a setup a lot. But I'm not going to 
implement it until the slog can be removed or the pool be imported without the 
slog.

In the mean time can someone confirm that in such a case, root pool and zil in 
two slices and mirrored, that the write cache can be enabled with format? Only 
zfs is using the disk, but perhaps I'm wrong on this. There have been post's 
regarding enabling the write_cache. But I couldn't find a conclusive answer for 
the above scenario.

  
When you have just the root pool on a disk, ZFS won't enable the write 
cache by default. I think you can manually enable it but I don't know 
the dangers. Adding the slog shouldn't be any different. To be honest, I 
don't know how closely the write caching on a SSD matches what a moving 
disk has.


 -Kyle

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Brian Hechinger
On Thu, Jul 23, 2009 at 10:28:38AM -0400, Kyle McDonald wrote:
   
 In my case the slog slice wouldn't be the slog for the root pool, it 
 would be the slog for a second data pool.

I didn't think you could add a slog to the root pool anyway.  Or has that
changed in recent builds?  I'm a little behind on my SXCE versions, been
too busy to keep up. :)

 When you have just the root pool on a disk, ZFS won't enable the write 
 cache by default.

I don't think this is limited to root pools.  None of my pools (root or
non-root) seem to have the write cache enabled.  Now that I think about
it, all my disks are hidden behind an LSI1078 controller so I'm not
sure what sort of impact that would have on the situation.

-brian
-- 
Coding in C is like sending a 3 year old to do groceries. You gotta
tell them exactly what you want or you'll end up with a cupboard full of
pop tarts and pancake mix. -- IRC User (http://www.bash.org/?841435)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Kyle McDonald

Brian Hechinger wrote:

On Thu, Jul 23, 2009 at 10:28:38AM -0400, Kyle McDonald wrote:
  
 
  
In my case the slog slice wouldn't be the slog for the root pool, it 
would be the slog for a second data pool.



I didn't think you could add a slog to the root pool anyway.  Or has that
changed in recent builds?  I'm a little behind on my SXCE versions, been
too busy to keep up. :)
  
I don't know either. It's not really what I was looking to do so I never 
even thought of it. :)
  
When you have just the root pool on a disk, ZFS won't enable the write 
cache by default.



I don't think this is limited to root pools.  None of my pools (root or
non-root) seem to have the write cache enabled.  Now that I think about
it, all my disks are hidden behind an LSI1078 controller so I'm not
sure what sort of impact that would have on the situation.

  
When you give ZFS the full disk (device name 'cWtXdY', with no 'sZ') then 
ZFS will usually instruct the drive to enable write caching.
You're right though: if your drives are really something like single 
drive RAID 0 LUNs, then who knows what happens.
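
As a hedged illustration of the difference (device names are just examples):

  # zpool create tank c1t2d0       # whole disk: ZFS labels it EFI and tries to enable the write cache
  # zpool create tank c1t2d0s0     # a slice: ZFS leaves the drive's cache setting alone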


 -Kyle


-brian
  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motherboard for home zfs/solaris file server

2009-07-23 Thread Richard Elling

On Jul 23, 2009, at 5:42 AM, F. Wessels wrote:


Hi,

I'm using asus m3a78 boards (with the sb700) for opensolaris and  
m2a* boards (with the sb600) for linux some of them with 4*1GB and  
others with 4*2Gb ECC memory. Ecc faults will be detected and  
reported. I tested it with a small tungsten light. By moving the  
light source slowly towards the memory banks you'll heat them up in  
a controlled way and at a certain point bit flips will occur.


I am impressed!  I don't know very many people interested in inducing
errors in their garage.  This is an excellent way to demonstrate random
DRAM errors. Well done!


I recommend you to go for a m4a board since they support up to 16 GB.
I don't know if you can run opensolaris without a videocard after  
installation I think you can disable the halt on no video card in  
the bios. But Simon Breden had some trouble with it, see his  
homeserver blog. But you can go for one of the three m4a boards with  
a 780g onboard. Those will give you 2 pci-e x16 connectors. I don't  
think the onboard nic is supported. I always put an intel (e1000)  
in, just to prevent any trouble. I don't have any trouble with the  
sb700 in ahci mode. Hotplugging works like a charm. Transfering a  
couple of GB's over esata takes considerable less time than via usb.
I have a pata to dual cf adapter and two industrial 16gb cf cards as  
mirrored root pool. It takes for ever to install nevada, at least 14  
hours. I suspect the cf cards lack caches. But I don't update that  
regularly, still on snv104.  And have 2 mirrors and a hot spare. The  
sixth port is an esata port I use to transfer large amounts of data.  
This system consumes about 73 watts idle and 82 under load i/o load.  
(5 disks , a separate nic  ,8 gb ram and a be2400 all using just 73  
watts!!!)


How much power does the tungsten light burn? :-)
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Richard Elling


On Jul 23, 2009, at 7:28 AM, Kyle McDonald wrote:


F. Wessels wrote:

Thanks posting this solution.

But I would like to point out that bug 6574286 removing a slog  
doesn't work still isn't resolved. A solution is under it's way,  
according to George Wilson. But in the mean time, IF something  
happens you might be in a lot of trouble. Even without some  
unfortunate incident you cannot for example export your data pool,  
pull the drives and leave the root pool.


In my case the slog slice wouldn't be the slog for the root pool, it  
would be the slog for a second data pool.


If the device went bad, I'd have to replace it, true. But if the  
device goes bad, then so did a good part of my root pool, and I'd  
have to replace that too.


Mirror the slog to match your mirrored root pool.

Don't get me wrong I would like such a setup a lot. But I'm not  
going to implement it until the slog can be removed or the pool be  
imported without the slog.


In the mean time can someone confirm that in such a case, root pool  
and zil in two slices and mirrored, that the write cache can be  
enabled with format? Only zfs is using the disk, but perhaps I'm  
wrong on this. There have been post's regarding enabling the  
write_cache. But I couldn't find a conclusive answer for the above  
scenario.



When you have just the root pool on a disk, ZFS won't enable the  
write cache by default. I think you can manually enable it but I  
don't know the dangers. Adding the slog shouldn't be any different.  
To be honest, I don't know how closely the write caching on a SSD  
matches what a moving disk has.


Write caches only help hard disks.  Most (all?) SSDs do not have  
volatile write buffers.
Volatile write buffers are another bad thing you can forget when you  
go to SSDs :-)

 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Kyle McDonald

Richard Elling wrote:


On Jul 23, 2009, at 7:28 AM, Kyle McDonald wrote:


F. Wessels wrote:

Thanks posting this solution.

But I would like to point out that bug 6574286 removing a slog 
doesn't work still isn't resolved. A solution is under it's way, 
according to George Wilson. But in the mean time, IF something 
happens you might be in a lot of trouble. Even without some 
unfortunate incident you cannot for example export your data pool, 
pull the drives and leave the root pool.


In my case the slog slice wouldn't be the slog for the root pool, it 
would be the slog for a second data pool.


If the device went bad, I'd have to replace it, true. But if the 
device goes bad, then so did a good part of my root pool, and I'd 
have to replace that too.


Mirror the slog to match your mirrored root pool.
Yep. That was the plan. I was just explaining that not being able to 
remove the slog wasn't an issue for me since I planned on always having 
that device available.


I was more curious about whether there were any downsides to sharing 
the SSD between the root pool and the slog?


Thanks for the valuable input, Richard.

 -Kyle



Don't get me wrong I would like such a setup a lot. But I'm not 
going to implement it until the slog can be removed or the pool be 
imported without the slog.


In the mean time can someone confirm that in such a case, root pool 
and zil in two slices and mirrored, that the write cache can be 
enabled with format? Only zfs is using the disk, but perhaps I'm 
wrong on this. There have been post's regarding enabling the 
write_cache. But I couldn't find a conclusive answer for the above 
scenario.



When you have just the root pool on a disk, ZFS won't enable the 
write cache by default. I think you can manually enable it but I 
don't know the dangers. Adding the slog shouldn't be any different. 
To be honest, I don't know how closely the write caching on a SSD 
matches what a moving disk has.


Write caches only help hard disks.  Most (all?) SSDs do not have 
volatile write buffers.
Volatile write buffers are another bad thing you can forget when you 
go to SSDs :-)

 -- richard



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Richard Elling

On Jul 23, 2009, at 9:37 AM, Kyle McDonald wrote:


Richard Elling wrote:


On Jul 23, 2009, at 7:28 AM, Kyle McDonald wrote:


F. Wessels wrote:

Thanks posting this solution.

But I would like to point out that bug 6574286 removing a slog  
doesn't work still isn't resolved. A solution is under it's way,  
according to George Wilson. But in the mean time, IF something  
happens you might be in a lot of trouble. Even without some  
unfortunate incident you cannot for example export your data  
pool, pull the drives and leave the root pool.


In my case the slog slice wouldn't be the slog for the root pool,  
it would be the slog for a second data pool.


If the device went bad, I'd have to replace it, true. But if the  
device goes bad, then so did a good part of my root pool, and I'd  
have to replace that too.


Mirror the slog to match your mirrored root pool.
Yep. That was the plan. I was just explaining that not being able to  
remove the slog wasn't an issue for me since I planned on always  
having that device available.


I was more curious about whether there were any diown sides to  
sharing the SSD between the root pool and the slog?


I think it is a great idea, assuming the SSD has good write performance.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Kyle McDonald

Richard Elling wrote:

On Jul 23, 2009, at 9:37 AM, Kyle McDonald wrote:


Richard Elling wrote:


On Jul 23, 2009, at 7:28 AM, Kyle McDonald wrote:


F. Wessels wrote:

Thanks posting this solution.

But I would like to point out that bug 6574286 removing a slog 
doesn't work still isn't resolved. A solution is under it's way, 
according to George Wilson. But in the mean time, IF something 
happens you might be in a lot of trouble. Even without some 
unfortunate incident you cannot for example export your data pool, 
pull the drives and leave the root pool.


In my case the slog slice wouldn't be the slog for the root pool, 
it would be the slog for a second data pool.


If the device went bad, I'd have to replace it, true. But if the 
device goes bad, then so did a good part of my root pool, and I'd 
have to replace that too.


Mirror the slog to match your mirrored root pool.
Yep. That was the plan. I was just explaining that not being able to 
remove the slog wasn't an issue for me since I planned on always 
having that device available.


I was more curious about whether there were any diown sides to 
sharing the SSD between the root pool and the slog?


I think it is a great idea, assuming the SSD has good write performance.

This one claims up to 230MB/s read and 180MB/s write and it's only $196.

http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393

Compared to this one (250MB/s read and 170MB/s write) which is $699.

Are those claims really trustworthy? They sound too good to be true!

 -Kyle


 -- richard



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Kyle McDonald

Kyle McDonald wrote:

Richard Elling wrote:

On Jul 23, 2009, at 9:37 AM, Kyle McDonald wrote:


Richard Elling wrote:


On Jul 23, 2009, at 7:28 AM, Kyle McDonald wrote:


F. Wessels wrote:

Thanks posting this solution.

But I would like to point out that bug 6574286 removing a slog 
doesn't work still isn't resolved. A solution is under it's way, 
according to George Wilson. But in the mean time, IF something 
happens you might be in a lot of trouble. Even without some 
unfortunate incident you cannot for example export your data 
pool, pull the drives and leave the root pool.


In my case the slog slice wouldn't be the slog for the root pool, 
it would be the slog for a second data pool.


If the device went bad, I'd have to replace it, true. But if the 
device goes bad, then so did a good part of my root pool, and I'd 
have to replace that too.


Mirror the slog to match your mirrored root pool.
Yep. That was the plan. I was just explaining that not being able to 
remove the slog wasn't an issue for me since I planned on always 
having that device available.


I was more curious about whether there were any diown sides to 
sharing the SSD between the root pool and the slog?


I think it is a great idea, assuming the SSD has good write performance.

This one claims up to 230MB/s read and 180MB/s write and it's only $196.

http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393

Compared to this one (250MB/s read and 170MB/s write) which is $699.


Oops. Forgot the link:

http://www.newegg.com/Product/Product.aspx?Item=N82E16820167014

Are those claims really trustworthy? They sound too good to be true!

 -Kyle


 -- richard



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Greg Mason
  I think it is a great idea, assuming the SSD has good write performance.
  This one claims up to 230MB/s read and 180MB/s write and it's only $196.
 
  http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393
 
  Compared to this one (250MB/s read and 170MB/s write) which is $699.
 
 Oops. Forgot the link:
 
 http://www.newegg.com/Product/Product.aspx?Item=N82E16820167014
  Are those claims really trustworthy? They sound too good to be true!
 
   -Kyle

Kyle-

The less expensive SSD is an MLC device. The Intel SSD is an SLC device.
That right there accounts for the cost difference. The SLC device (Intel
X25-E) will last quite a bit longer than the MLC device.

-Greg

-- 
Greg Mason
System Administrator
Michigan State University
High Performance Computing Center

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Kyle McDonald

Greg Mason wrote:

I think it is a great idea, assuming the SSD has good write performance.


This one claims up to 230MB/s read and 180MB/s write and it's only $196.

http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393

Compared to this one (250MB/s read and 170MB/s write) which is $699.

  

Oops. Forgot the link:

http://www.newegg.com/Product/Product.aspx?Item=N82E16820167014


Are those claims really trustworthy? They sound too good to be true!

 -Kyle
  


Kyle-

The less expensive SSD is an MLC device. The Intel SSD is an SLC device.
That right there accounts for the cost difference. The SLC device (Intel
X25-E) will last quite a bit longer than the MLC device.
  
I understand that. That's why I picked that one to compare. It was my 
understanding that the MLC drives weren't even close, performance-wise, to 
the SLC ones. This one seems pretty close. How can that be?


 -Kyle

 
-Greg


  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Adam Sherman
In the context of a low-volume file server, for a few users, is the  
low-end Intel SSD sufficient?


A.

--
Adam Sherman
+1.613.797.6819

On 2009-07-23, at 14:09, Greg Mason gma...@msu.edu wrote:

I think it is a great idea, assuming the SSD has good write  
performance.
This one claims up to 230MB/s read and 180MB/s write and it's only  
$196.


http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393

Compared to this one (250MB/s read and 170MB/s write) which is $699.


Oops. Forgot the link:

http://www.newegg.com/Product/Product.aspx?Item=N82E16820167014

Are those claims really trustworthy? They sound too good to be true!

-Kyle


Kyle-

The less expensive SSD is an MLC device. The Intel SSD is an SLC  
device.
That right there accounts for the cost difference. The SLC device  
(Intel

X25-E) will last quite a bit longer than the MLC device.

-Greg

--
Greg Mason
System Administrator
Michigan State University
High Performance Computing Center

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Kyle McDonald

Adam Sherman wrote:
In the context of a low-volume file server, for a few users, is the 
low-end Intel SSD sufficient?


You're right, it supposedly has less than half the write speed, and 
that probably won't matter for me, but I can't find a 64GB version of it 
for sale, and the 80GB version is over 50% more at $314.


 -Kyle




A.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motherboard for home zfs/solaris file server

2009-07-23 Thread Neal Pollack

On 07/23/09 09:19 AM, Richard Elling wrote:

On Jul 23, 2009, at 5:42 AM, F. Wessels wrote:


Hi,

I'm using asus m3a78 boards (with the sb700) for opensolaris and m2a* 
boards (with the sb600) for linux some of them with 4*1GB and others 
with 4*2Gb ECC memory. Ecc faults will be detected and reported. I 
tested it with a small tungsten light. By moving the light source 
slowly towards the memory banks you'll heat them up in a controlled 
way and at a certain point bit flips will occur.


I am impressed!  I don't know very many people interested in inducing
errors in their garage.  This is an excellent way to demonstrate random
DRAM errors. Well done!


I recommend you to go for a m4a board since they support up to 16 GB.
I don't know if you can run opensolaris without a videocard after 
installation I think you can disable the halt on no video card in 
the bios. But Simon Breden had some trouble with it, see his 
homeserver blog. But you can go for one of the three m4a boards with 
a 780g onboard. Those will give you 2 pci-e x16 connectors. I don't 
think the onboard nic is supported. 



What is the specific model of the onboard nic chip?
We may be working on it right now.

Neal


I always put an intel (e1000) in, just to prevent any trouble. I 
don't have any trouble with the sb700 in ahci mode. Hotplugging works 
like a charm. Transfering a couple of GB's over esata takes 
considerable less time than via usb.
I have a pata to dual cf adapter and two industrial 16gb cf cards as 
mirrored root pool. It takes for ever to install nevada, at least 14 
hours. I suspect the cf cards lack caches. But I don't update that 
regularly, still on snv104.  And have 2 mirrors and a hot spare. The 
sixth port is an esata port I use to transfer large amounts of data. 
This system consumes about 73 watts idle and 82 under load i/o load. 
(5 disks , a separate nic  ,8 gb ram and a be2400 all using just 73 
watts!!!)


How much power does the tungsten light burn? :-)
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread thomas
 I think it is a great idea, assuming the SSD has good write performance.
 This one claims up to 230MB/s read and 180MB/s write and it's only $196.
 
 http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393
 
 Compared to this one (250MB/s read and 170MB/s write) which is $699.
 
 Are those claims really trustworthy? They sound too good to be true!


MB/s numbers are not a good indication of performance. What you should pay 
attention to is usually random write and read IOPS. They tend to correlate a 
bit, but those numbers on Newegg are probably just best-case figures from the 
manufacturer.

In the world of consumer-grade SSDs, Intel has crushed everyone on IOPS 
performance, but the other manufacturers are starting to catch up a bit.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] No ZFS snapshot since upgrade to 2009.06

2009-07-23 Thread Axelle Apvrille
I've upgraded my OpenSolaris 2008.11 to 2009.06. During that process it created 
a new boot environment:
BE            Active Mountpoint Space Policy Created
--            ------ ---------- ----- ------ -------
opensolaris   NR     /          7.53G static 2009-01-03 13:18
opensolaris-1 -      -          2.80G static 2009-07-20 22:38

But now, all my zfs snapshot services are in maintenance mode:
svc:/system/filesystem/zfs/auto-snapshot:frequent (ZFS automatic snapshots)
 State: maintenance since Thu Jul 23 20:21:29 2009
Reason: Restarter svc:/system/svc/restarter:default gave no explanation.
   See: http://sun.com/msg/SMF-8000-9C
   See: /var/svc/log/system-filesystem-zfs-auto-snapshot:frequent.log
Impact: 1 dependent service is not running:
svc:/application/time-slider:default
etc

The logs say that my host tried to create a snapshot, but couldn't because 
'dataset is busy':
cannot create snapshot 
'rpool/ROOT/opensolari...@zfs-auto-snap:frequent-2009-07-23-20:21': dataset is 
busy
no snapshots were created
Error: Unable to take recursive snapshots of 
rpool/r...@zfs-auto-snap:frequent-2009-07-23-20:21.
Moving service svc:/system/filesystem/zfs/auto-snapshot:frequent to maintenance 
mode.

Does anyone know how to fix this? 
What is that boot environment opensolaris-1 for? Can I erase it safely?

Regards
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Joseph L. Casale
I don't think this is limited to root pools.  None of my pools (root or
non-root) seem to have the write cache enabled.  Now that I think about
it, all my disks are hidden behind an LSI1078 controller so I'm not
sure what sort of impact that would have on the situation.

I have a few of those controllers as well. I wouldn't believe for a second
that ZFS could change the controller config for an LD (logical drive); I couldn't see how.

# /usr/local/bin/CLI/MegaCli -LdGetProp -DskCache -LAll -a0

Adapter 0-VD 0(target id: 0): Disk Write Cache : Disabled
Adapter 0-VD 1(target id: 1): Disk Write Cache : Disabled
Adapter 0-VD 2(target id: 2): Disk Write Cache : Disabled
Adapter 0-VD 3(target id: 3): Disk Write Cache : Disabled
Adapter 0-VD 4(target id: 4): Disk Write Cache : Disabled

The comment later about defining a pool w/ and w/o the sX syntax warrants a 
test:)
Good to keep in mind...

jlc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] [SOLVED] Re: No ZFS snapshot since upgrade to 2009.06

2009-07-23 Thread Axelle Apvrille
OK, I've found the solution to my problem on the Internet, here: 
http://sigtar.com/2009/03/17/troubleshooting-time-slider-zfs-snapshots/

This was indeed caused by the old boot environment. This is how to solve it:
- disable snapshots on the old boot environment:
pfexec zfs set com.sun:auto-snapshot=false rpool/ROOT/opensolaris-1
- clear zfs snapshot services
pfexec svcadm clear auto-snapshot:frequent
pfexec svcadm clear auto-snapshot:hourly
pfexec svcadm clear auto-snapshot:daily
- launch the time slider and enable zfs snapshots on the appropriate systems
- check the services are online: svcs 
...
online 21:50:20 svc:/system/filesystem/zfs/auto-snapshot:hourly
online 21:50:26 svc:/system/filesystem/zfs/auto-snapshot:frequent
online 21:50:36 svc:/system/filesystem/zfs/auto-snapshot:daily

Hurray !
-- Axelle
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Richard Elling


On Jul 23, 2009, at 11:09 AM, Greg Mason wrote:

I think it is a great idea, assuming the SSD has good write  
performance.
This one claims up to 230MB/s read and 180MB/s write and it's only  
$196.


http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393

Compared to this one (250MB/s read and 170MB/s write) which is $699.


Oops. Forgot the link:

http://www.newegg.com/Product/Product.aspx?Item=N82E16820167014

Are those claims really trustworthy? They sound too good to be true!

-Kyle


Kyle-

The less expensive SSD is an MLC device. The Intel SSD is an SLC  
device.


Some newer designs use both SLC and MLC.  It is no longer possible
to use SLC vs MLC as a primary differentiator. Use the specifications.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motherboard for home zfs/solaris file server

2009-07-23 Thread chris
The Asus M4N78-VM uses an Nvidia GeForce 8200 chipset (this board only has 1 
PCIe x16 slot though; I should look at those that have 2 slots).
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motherboard for home zfs/solaris file server

2009-07-23 Thread chris
Oh, and another unrelated question: 

Would I be better off using OpenSolaris or Solaris Community Edition? 

I suspect SCE has more drivers (though maybe in a more beta state?), but its 
huge download size (several days in backward New Zealand, thanks Telecom NZ!) 
means I would only try it if there is good justification.
What would you guys recommend (I know, this is an OpenSolaris forum, but at 
least can you tell me how these 2 differ)?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread Erik Trimble
On Thu, 2009-07-23 at 14:24 -0700, Richard Elling wrote:
 On Jul 23, 2009, at 11:09 AM, Greg Mason wrote:
 
  I think it is a great idea, assuming the SSD has good write  
  performance.
  This one claims up to 230MB/s read and 180MB/s write and it's only  
  $196.
 
  http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393
 
  Compared to this one (250MB/s read and 170MB/s write) which is $699.
 
  Oops. Forgot the link:
 
  http://www.newegg.com/Product/Product.aspx?Item=N82E16820167014
  Are those claims really trustworthy? They sound too good to be true!
 
  -Kyle
 
  Kyle-
 
  The less expensive SSD is an MLC device. The Intel SSD is an SLC  
  device.
 
 Some newer designs use both SLC and MLC.  It is no longer possible
 to use SLC vs MLC as a primary differentiator. Use the specifications.
   -- richard
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


I'm finding the new-gen MLC with large DRAM cache and improved 
controller to be more than sufficient for a workgroup server.  

e.g. the OCZ Summit series and similar.  I suspect the Intel X25-M is
likely good enough, too.

I'm using one SSD for both read and write caches, and it's good enough
for a 20-person small workgroup server doing NFS.  I suspect that write
caches are much more sensitive to IOPS performance than read ones, but
that's just my feeling.  In any case, I'd pay more attention to the IOPS
rating for things, than the sync read/write speeds.
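
If it helps, the kind of split I mean looks roughly like this (slice layout
and device names are hypothetical):

  # zpool add tank log c2t0d0s0      # small slice as the separate intent log
  # zpool add tank cache c2t0d0s1    # rest of the SSD as L2ARC read cache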


I'm testing that setup right now for iSCSI-based xVM guests, so we'll
see if it can stand the IOPS.  



-- 
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motherboard for home zfs/solaris file server

2009-07-23 Thread Miles Nordin
 c == chris  no-re...@opensolaris.org writes:

 c do you know what the ECC BIOS modes mean?

It's about the hardware scrubbing feature I mentioned.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD's and ZFS...

2009-07-23 Thread F. Wessels
I didn't mean using a slog for the root pool. I meant using the slog for a data 
pool, where the data pool consists of (rotating) hard disks complemented with 
an SSD-based slog. But instead of a dedicated SSD for the slog, I want the 
root pool to share the SSD with the slog. Both can be mirrored to a second SSD.
I think that in this scenario my initial concern remains. Since you cannot 
remove a slog from a pool, if you want to move the pool or something bad 
happens you're in trouble.
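
To make the intended layout concrete (purely a sketch, device and pool names
invented):

  ssd1 s0 + ssd2 s0  ->  mirrored root pool (set up at install time)
  ssd1 s1 + ssd2 s1  ->  mirrored slog for the data pool, e.g.:
  # zpool add datapool log mirror c1t0d0s1 c1t1d0s1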

Richard, I'm under the impression that most current SSDs have a DRAM buffer. 
Some are used only for reading, some are also used for writing. I'm pretty sure 
the Sun LogZilla devices (the STEC Zeus) have a DRAM write buffer. Some have a 
supercap to flush the caches, others don't.
I'm trying to compile some guidelines regarding write caching, SSDs and ZFS. I 
don't like the posts like "I can't import my pool", "my pool went down the 
Niagara Falls", etc. So in order to prevent more of these stories I think it's 
important to get it out in the open whether write caching can be enabled on SSDs (full 
disk and slice usage). I'm really looking for a conclusive test to determine 
whether or not it can be enabled.

Regards,

Frederik
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motherboard for home zfs/solaris file server

2009-07-23 Thread Erik Trimble
I'm going the other route here, and using a Intel small server
motherboard.

I'm currently trying the Supermicro X7SBE, which supports a non-Xeon
CPU, and _should_ actually use the (unbuffered) ECC RAM I have in it.
It can also support a network KVM IPMI board, which is nice (though not
cheap - i.e. $100 or so). 


The Supermicro X7SBL-LN[12] boards also look good, though they won't
support the network KVM option.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] triple-parity: RAID-Z3

2009-07-23 Thread Robert Milkowski

Adam Leventhal wrote:

Hey Bob,

MTTDL analysis shows that given normal environmental conditions, the 
MTTDL of RAID-Z2 is already much longer than the life of the computer 
or the attendant human.  Of course sometimes one encounters unusual 
conditions where additional redundancy is desired.


To what analysis are you referring? Today the absolute fastest you can 
resilver a 1TB drive is about 4 hours. Real-world speeds might be half 
that. In 2010 we'll have 3TB drives, meaning it may take a full day to 
resilver. The odds of hitting a latent bit error are already 
reasonably high, especially with a large pool that's infrequently 
scrubbed. What then are the odds of a second drive failing in 
the 24 hours it takes to resilver?




I wish it was so good with raid-zN.
In real life, at least in my experience, it can take several days to 
resilver a disk for vdevs in raid-z2 made of 11x SATA disk drives with 
real data.
While the way ZFS synchronizes data is way faster under some 
circumstances, it is also much slower under others.
IIRC some builds ago there were some fixes integrated, so maybe it is 
different now.



I do think that it is worthwhile to be able to add another parity 
disk to an existing raidz vdev but I don't know how much work that 
entails.


It entails a bunch of work:

  http://blogs.sun.com/ahl/entry/expand_o_matic_raid_z

Matt Ahrens is working on a key component after which it should all be 
possible.



A lot of people are waiting for it! :) :) :)


ps. thank you for raid-z3!

--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] triple-parity: RAID-Z3

2009-07-23 Thread Adam Leventhal
Robert,

On Fri, Jul 24, 2009 at 12:59:01AM +0100, Robert Milkowski wrote:
 To what analysis are you referring? Today the absolute fastest you can 
 resilver a 1TB drive is about 4 hours. Real-world speeds might be half 
 that. In 2010 we'll have 3TB drives meaning it may take a full day to 
 resilver. The odds of hitting a latent bit error are already reasonably 
 high especially with a large pool that's infrequently scrubbed. 
 What then are the odds of a second drive failing in the 24 hours it takes 
 to resilver?

 I wish it was so good with raid-zN.
 In real life, at least from mine experience, it can take several days to 
 resilver a disk for vdevs in raid-z2 made of 11x sata disk drives with real 
 data.
 While the way zfs ynchronizes data is way faster under some circumstances 
 it is also much slower under other.
 IIRC some builds ago there were some fixes integrated so maybe it is 
 different now.

Absolutely. I was talking more or less about optimal timing. I realize that
due to the priorities within ZFS and real-world loads it can take far
longer.
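
The 4-hour optimal figure is basically just the streaming-bandwidth floor; as
a back-of-the-envelope check:

  $ echo $(( 1000000 / (4 * 3600) ))    # MB/s needed to rewrite 1 TB in 4 hours
  69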

Adam

-- 
Adam Leventhal, Fishworks http://blogs.sun.com/ahl
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] triple-parity: RAID-Z3

2009-07-23 Thread Robert Milkowski

Adam Leventhal wrote:


I just blogged about triple-parity RAID-Z (raidz3):

  http://blogs.sun.com/ahl/entry/triple_parity_raid_z

As for performance, on the system I was using (a max config Sun Storage
7410), I saw about a 25% improvement to 1GB/s for a streaming write
workload. YMMV, but I'd be interested in hearing your results.


25% improvement when comparing what exactly to what?


--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Mirror cloning

2009-07-23 Thread Jorgen Lundman


Ok, so it seems that with DiskSuite, detaching a mirror does nothing to 
the disk you detached.


However, zpool detach appears to mark the disk as blank, so nothing 
will find any pools (import, import -D etc). zdb -l will show labels, 
but no amount of work that we have found will bring the HDD back online 
in the new server. Grub is blank, and findroot can not see any pool.


zpool will not let you offline the 2nd disk in a mirror. This is 
incorrect behaviour.


You can not cfgadm unconfigure the sata device while zpool has the disk.

We can just yank the disk, but we had issues getting a new-blank disk 
recognised after that. cfgadm would not release the old disk.



However, we found we can do this:

# cfgadm -x sata_port_deactivate sata0/1::dsk/c0t1d0

This will make zpool mark it:

 c0t1d0s0  REMOVED  0 0 0

and eventually:

 c0t1d0s0  FAULTED  0 0 0  too many errors


After that, we pull out the disk, and issue:

# zpool detach zboot c0t1d0s0
# cfgadm -x sata_port_activate sata0/1::dsk/c0t1d0
# cfgadm -c configure sata0/1::dsk/c0t1d0
# format   (fdisk, partition as required to be the same)
# zpool attach zboot c0t0d0s0 c0t1d0s0


There is one final thing to address: when the disk is used in a new 
machine, it will generally panic with "pool was used previously with 
system-id xx", which requires more miniroot work. It would be nice 
to be able to avoid this as well. But you can't export the / pool 
before pulling out the disk, either.




Jorgen Lundman wrote:


Hello list,

Before we started changing to ZFS bootfs, we used DiskSuite mirrored ufs 
boot.


Very often, if we needed to grow a cluster by another machine or two, we 
would simply clone a running live server. Generally the procedure for 
this would be;


1 detach the 2nd HDD, metaclear, and delete metadb on 2nd disk.
2 mount the 2nd HDD under /mnt, and change system/vfstab to be a single 
boot HDD, and no longer mirrored, as well as host name, and IP addresses.

3 bootadm update-archive -R /mnt
4 unmount, cfgadm unconfigure, and pull out the HDD.

and generally, in about ~4 minutes, we have a new live server in the 
cluster.



We tried to do the same thing to day, but with a ZFS bootfs. We did:

1 zpool detach on the 2nd HDD.
2 cfgadm unconfigure the HDD, and pull out the disk.

The source server was fine, could insert new disk, attach it, and it 
resilvered.


However, the new destination server had lots of issues. At first, grub 
would give no menu at all, just the grub> command prompt.


The command: findroot(pool_zboot,0,a) would return Error 15: No such 
file.


After booting a Solaris Live CD, I could zpool import the pool, but of 
course it was in Degraded mode etc.


Now it would show menu, but if you boot it, it would flash the message 
that the pool was last accessed by Solaris $sysid, and panic.


After a lot of reboots, and fiddling, I managed to get miniroot to at 
least boot, then, only after inserting a new HDD and letting the pool 
become completely good would it let me boot into multi-user.


Is there something we should do perhaps, that will let the cloning 
procedure go smoothly? Should I export the 'now separated disk' 
somehow? In fact, can I mount that disk to make changes to it before 
pulling out the disk?


Most documentation on cloning uses zfs send, which would be possible, 
but 4 minutes is hard to beat when your cluster is under heavy load.


Lund



--
Jorgen Lundman   | lund...@lundman.net
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo| +81 (0)90-5578-8500  (cell)
Japan| +81 (0)3 -3375-1767  (home)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motherboard for home zfs/solaris file server

2009-07-23 Thread Haudy Kazemi

chris wrote:

Ok, so the choice for a MB boils down to:

- Intel desktop MB, no ECC support
  
This is mostly true.  The exceptions are some implementations of the 
Socket T LGA 775 (i.e. late Pentium 4 series, and Core 2) D975X and X38 
chipsets, and possibly some X48 boards as well.  Intel's other desktop 
chipsets do not support ECC.  Some motherboard examples include:


Intel DX38BT - ECC support is mentioned in the documentation and is a 
BIOS option
Gigabyte GA-X38-DS4, GA-EX38-DS4 - ECC support is mentioned in the 
documentation and is listed in the website FAQ

The Sun Ultra 24 also uses the X38 chipset.

It's not clear how well ECC support has actually been implemented on the 
Intel and Gigabyte boards, i.e. whether it is simply unbuffered ECC 
memory compatible, or actually able to initialize and use the ECC 
capability.  I mentioned the X48 chipset above because discussions 
surrounding it say it is just a higher binned X38 chip.


On Linux, the EDAC project maintains software to manage the 
motherboard's ECC capability.  A list of memory controllers currently 
supported by Linux EDAC is here:

http://buttersideup.com/edacwiki/Main_Page

A prior discussion thread in 'fm' titled 'X38/975x ECC memory support' 
is here:

http://opensolaris.org/jive/thread.jspa?threadID=52440&tstart=60

Thread links:
http://www.madore.org/~david/linux/#ECC_for_82x
http://developmentonsolaris.wordpress.com/2008/03/12/intel-82975x-mch-and-logging-of-ecc-events-on-solaris/

Note that the 'ecccheck.pl' script depends on the 'pcitweak' utility 
which is no longer present in OpenSolaris 2009.06 and Ubuntu 8.10 
because of Xorg changes.  One Linux user needing the utility copied it 
from another distro.  The version of pcitweak included with previous 
versions of OpenSolaris might work on 2009.06.

http://opensolaris.org/jive/thread.jspa?threadID=105975&tstart=90
http://ubuntuforums.org/showthread.php?t=1054516

Finally, on unbuffered ECC memory prices and speeds... they are a bit 
behind in price and speed vs. regular unbuffered RAM, but both are still 
reasonable. When comparing prices, remember that ECC RAM uses 9 
chips where non-ECC uses 8, so expect at least a 12.5% price increase.  
Consider:


DDR2: $64 for Crucial 4GB kit (2GBx2), 240-pin DIMM, Unbuffered DDR2 
PC2-6400 memory module

http://www.crucial.com/store/partspecs.aspx?IMODULE=CT2KIT25672AA800

DDR3: $108 for Crucial 6GB (3 x 2GB) 240-Pin DDR3 SDRAM ECC Unbuffered 
DDR3 1333 (PC3 10600) Triple Channel Kit Server Memory Model 
CT3KIT25672BA1339 - Retail

http://www.newegg.com/Product/Product.aspx?Item=N82E16820148259

-hk


- Intel server MB, ECC support, expensive (requires a Xeon for speedstep 
support). It is a shame to waste top kit doing nothing 24/7.
- AMD K8: ECC support(right?), no Cool'n'quiet support (but maybe still cool 
enough with the right CPU?)
- AMD K10: should have the best all of both worlds: ECC support, Cool'n'quiet, 
cheap-ish and lowish-power CPU like Athlon II 250

Is my understanding correct? Like many I want reliable, cheap, low power, ECC 
supporting MB. Integrated video and low power chipset would be best. The sata 
ports will have to come from an additional controller it seems, but that's life.

Intel gear is best supported, but they shoot themselves (or is that us?) 
in the foot with their ECC-on-server MB policy.

AMD K10 seems the most tempting as it has it all. I wonder about solaris support though. For example, is an AM3 MB OK with solaris? 


I'd like this hopefully to work right away with opensolaris 2009.06, without 
fiddling with drivers, I dont have much time or skills.

What AM3 MB do you guys know that is trouble free with solaris? 


If none, maybe top quality ram (suggestions?) would allow me to forego ECC and 
use a well supported low power intel board (suggestions?) instead? and a E5200?

Thanks for your insight.
  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Opensolaris attached to 70 disk HP array

2009-07-23 Thread Brent Jones
Looking at this external array by HP:
http://h18006.www1.hp.com/products/storageworks/600mds/index.html

70 disks in 5U, which could probably be configured in JBOD.
Has anyone attempted to connect this to a box running opensolaris to
create a 70 disk pool?

-- 
Brent Jones
br...@servuhome.net
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-23 Thread Frank Middleton

On 07/21/09 01:21 PM, Richard Elling wrote:


I never win the lottery either :-)


Let's see. Your chance of winning a 49 ball lottery is apparently
around 1 in 14*10^6, although it's much better than that because of
submatches (smaller payoffs for matches on less than 6 balls).

There are about 32*10^6 seconds in a year. If ZFS saves its writes
for 30 seconds and batches them out, that means 1 write leaves the
buffer exposed for roughly one millionth of a year. If you have 4GB
of memory, you might get 50 errors a year, but you say ZFS uses only
1/10 of this for writes, so that memory could see 5 errors/year. If
your single write was 1/70th of that (say around 6 MB), your chance
of a hit is around (5/70) * 10^-6, or 1 in 14*10^6, so you are correct!

So if you do one 6MB write/year, your chances of a hit in a year are
about the same as that of winning a grand slam lottery. Hopefully
not every hit will trash a file or pool, but odds are that you'll
do many more writes than that, so on the whole I think a ZFS hit
is quite a bit more likely than winning the lottery each year :-).

Conversely, if you average one big write every 3 minutes or so (20%
occupancy), odds are almost certain that you'll get one hit a year.
So some SOHO users who do far fewer writes won't see any hits (say)
over a 5 year period. But some will, and they will be most unhappy --
calculate your odds and then make a decision! I daresay the PC
makers have done this calculation, which is why PCs don't have ECC,
and hence IMO make for insufficiently reliable servers.

Conclusions from what I've gleaned from all the discussions here:
if you are too cheap to opt for mirroring, your best bet is to
disable checksumming and set copies=2. If you mirror but don't
have ECC then at least set copies=2 and consider disabling checksums.
Actually, set copies=2 regardless, so that you have some redundancy
if one half of the mirror fails and you have a 10 hour resilver,
in which time you could easily get a (real) disk read error.
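
For reference, setting it is just the following (dataset name is an example),
and it only applies to blocks written after the property is changed:

  # zfs set copies=2 tank/home
  # zfs get copies tank/home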

It seems to me some vendor is going to cotton onto the SOHO server
problem and make a bundle at the right price point. Sun's offerings
seem unfortunately mostly overkill for the SOHO market, although the
X4140 looks rather interesting... Shame there aren't any entry
level SPARCs any more :-(. Now what would doctors' front offices do
if they couldn't blame the computer for being down all the time?
 

It is quite simple -- ZFS sent the flush command and VirtualBox
ignored it. Therefore the bits on the persistent store are consistent.


But even on the most majestic of hardware, a flush command could be
lost, could it not? An obvious case in point is ZFS over iscsi and
a router glitch. But the discussion seems to be moot since CR
6667683 is being addressed. Now about those writes to mirrored disks :)

Cheers -- Frank

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Mirror cloning

2009-07-23 Thread Jorgen Lundman



Jorgen Lundman wrote:
However, zpool detach appears to mark the disk as blank, so nothing 
will find any pools (import, import -D etc). zdb -l will show labels, 


For kicks, I tried to demonstrate this does indeed happen, so I dd'ed 
the first 1024 1k blocks from the disk, ran zpool detach on it, then dd'ed the 
image back out to the HDD.
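
Roughly what I did, using the same device and pool names as before (purely
illustrative):

  # dd if=/dev/rdsk/c0t1d0s0 of=/var/tmp/labels.img bs=1k count=1024
  # zpool detach zboot c0t1d0s0
  # dd if=/var/tmp/labels.img of=/dev/rdsk/c0t1d0s0 bs=1k count=1024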


Pulled out disk and it boots directly without any interventions. If only 
zpool detach had a flag to tell it not to scribble over the detached disk.


Guess I could diff the before and after disk image and work out what it 
is that it does, and write a tool to undo it, or figure out if I can 
undo it using zdb.


Lund

--
Jorgen Lundman   | lund...@lundman.net
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo| +81 (0)90-5578-8500  (cell)
Japan| +81 (0)3 -3375-1767  (home)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motherboard for home zfs/solaris file server

2009-07-23 Thread chris
Cheers Miles, and thanks also for the tip to look in the BIOS options to see if 
ECC is actually used. 
Which mode would you use? "Max" seems the most appealing; why would anyone use 
something called "basic"? But there must be a catch if they provided several ECC 
support modes.

I am glad this thread seems to be going somewhere with lots of valuable 
contributions  =:^)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motherboard for home zfs/solaris file server

2009-07-23 Thread chris
More choice is good!

It seems Intel's server boards sometimes accept desktop CPUs, but don't support 
SpeedStep. Is all OK with those?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Mirror cloning

2009-07-23 Thread Fajar A. Nugraha
On Fri, Jul 24, 2009 at 9:24 AM, Jorgen Lundmanlund...@gmo.jp wrote:
 However, zpool detach appears to mark the disk as blank, so nothing will
 find any pools (import, import -D etc). zdb -l will show labels,

If both disks are bootable (with installboot or installgrub), removing
the mirror and putting it in the new server should create an exact clone
(including IP address and hostname). I don't think this is recommended,
though.
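
If you go that way, making the second disk bootable beforehand is roughly
(x86; device name is an example):

  # installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t1d0s0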

This page provides root pool recovery methods, which should also be
usable for cloning purposes.
http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide#ZFS_Root_Pool_Recovery

-- 
Fajar
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motherboard for home zfs/solaris file server

2009-07-23 Thread chris
Note that the 'ecccheck.pl' script depends on the 'pcitweak' utility 
which is no longer present in OpenSolaris 2009.06 and Ubuntu 8.10 
because of Xorg changes. 

This is exactly the kind of hidden trap I fear. One does everything right and 
then discovers that xx is missing or has been changed or depends on yy or 
doesn't work with zz. And that discovery comes after hours/days/weeks of trying 
to find out why something misbehaves. Thanks for the heads up! 
2008.11 would be a safer bet then? Or Solaris CE?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris attached to 70 disk HP array

2009-07-23 Thread thomas
That is an interesting bit of kit. I wish a white-box manufacturer would 
create something like this (hint hint, Supermicro).
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss