Re: [zfs-discuss] Dedup memory overhead

2010-02-04 Thread Mertol Ozyoney
Sorry for the late answer.

Approximately it's 150 bytes per individual block, so increasing the
blocksize is a good idea.
Also, when the L1 and L2 ARC are not enough, the system will start issuing disk IOPS,
and RAID-Z is not very effective for random IOPS, so it's likely that when your
DRAM is not enough your performance will suffer.
You may choose to use RAID 10, which is a lot better under random loads.
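For illustration, a larger block size is set per dataset - 'tank/data' below is just a
placeholder name, and recordsize only affects blocks written after the change:

# zfs set recordsize=128k tank/data
# zfs get recordsize,dedup tank/data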
Mertol 




Mertol Ozyoney 
Storage Practice - Sales Manager

Sun Microsystems, TR
Istanbul TR
Phone +902123352200
Mobile +905339310752
Fax +90212335
Email mertol.ozyo...@sun.com



-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of erik.ableson
Sent: Thursday, January 21, 2010 6:05 PM
To: zfs-discuss
Subject: [zfs-discuss] Dedup memory overhead

Hi all,

I'm going to be trying out some tests using b130 for dedup on a server with
about 1.7 TB of usable storage (14x146 GB in two raidz vdevs of 7 disks).  What
I'm trying to get a handle on is how to estimate the memory overhead
required for dedup on that amount of storage.  From what I gather, the dedup
hash keys are held in ARC and L2ARC and as such are in competition for the
available memory.

So the question is how much memory or L2ARC would be necessary to ensure
that I'm never going back to disk to read out the hash keys. Better yet
would be some kind of algorithm for calculating the overhead, e.g. an average
block size of 4K means a hash key for every 4K stored, and a hash occupies 256
bits. An associated question: how does the ARC handle competition between hash
keys and regular ARC functions?

Based on these estimations, I think that I should be able to calculate the
following:
    1.7                TB
    1,740.8            GB
    1,782,579.2        MB
    1,825,361,100.8    KB
    4                  average block size (KB)
    456,340,275.2      blocks
    256                hash key size (bits)
    1.16823E+11        hash key overhead (bits)
    14,602,888,806.4   hash key size (bytes)
    14,260,633.6       hash key size (KB)
    13,926.4           hash key size (MB)
    13.6               hash key overhead (GB)
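A minimal shell (bash/ksh) sketch of the same arithmetic - the 1.7 TB and 4 KB figures
are the assumptions from the table above, and 32 bytes corresponds to a bare 256-bit
checksum; with the ~150 bytes per DDT entry quoted elsewhere in this digest the result
would be roughly 4-5x larger:

pool_bytes=$((17 * 1024 * 1024 * 1024 * 1024 / 10))   # ~1.7 TiB of stored data
avg_block=4096                                        # assumed average block size in bytes
key_bytes=32                                          # one 256-bit hash key per block
blocks=$((pool_bytes / avg_block))
echo "$blocks blocks -> $((blocks * key_bytes / 1024 / 1024)) MB of hash keys"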

Of course the big question on this will be the average block size - or,
better yet, being able to analyze an existing datastore to see just how
many blocks it uses and what the current distribution of different block
sizes is. I'm currently playing around with zdb, with mixed success, at
extracting this kind of data. The above is also a worst case scenario since it's
counting really small blocks and assuming 100% of available storage is used - highly
unlikely.

# zdb -ddbb siovale/iphone
Dataset siovale/iphone [ZPL], ID 2381, cr_txg 3764691, 44.6G, 99 objects

ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0,
flags 0x0

Object  lvl   iblk   dblk   dsize  lsize   %full  type
     0    7    16K    16K   57.0K    64K   77.34  DMU dnode
     1    1    16K     1K   1.50K     1K  100.00  ZFS master node
     2    1    16K    512   1.50K    512  100.00  ZFS delete queue
     3    2    16K    16K   18.0K    32K  100.00  ZFS directory
     4    3    16K   128K    408M   408M  100.00  ZFS plain file
     5    1    16K    16K   3.00K    16K  100.00  FUID table
     6    1    16K     4K   4.50K     4K  100.00  ZFS plain file
     7    1    16K  6.50K   6.50K  6.50K  100.00  ZFS plain file
     8    3    16K   128K    952M   952M  100.00  ZFS plain file
     9    3    16K   128K    912M   912M  100.00  ZFS plain file
    10    3    16K   128K    695M   695M  100.00  ZFS plain file
    11    3    16K   128K    914M   914M  100.00  ZFS plain file
 
Now, if I'm understanding this output properly, object 4 is composed of
128KB blocks with a total size of 408MB, meaning that it uses 3264 blocks.
Can someone confirm (or correct) that assumption? Also, I note that each
object  (as far as my limited testing has shown) has a single block size
with no internal variation.

Interestingly, all of my zvols seem to use fixed size blocks - that is,
there is no variation in the block sizes - they're all the size defined on
creation with no dynamic block sizes being used. I previously thought that
the -b option set the maximum size, rather than fixing all blocks.  Learned
something today :-)
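For what it's worth, a minimal sketch of how that fixed block size gets set at volume
creation time - 'siovale/newvol' and the sizes are placeholders; the zvol object's dblk
column should then show 8K:

# zfs create -V 10g -b 8k siovale/newvol
# zdb -ddbb siovale/newvol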

# zdb -ddbb siovale/testvol
Dataset siovale/testvol [ZVOL], ID 45, cr_txg 4717890, 23.9K, 2 objects

Object  lvl   iblk   dblk   dsize  lsize   %full  type
     0    7    16K    16K   21.0K    16K    6.25  DMU dnode
     1    1    16K    64K       0    64K    0.00  zvol object
     2    1    16K    512   1.50K    512  100.00  zvol prop

# zdb -ddbb siovale/tm-media
Dataset siovale/tm-media [ZVOL], ID 706, cr_txg 4426997, 240G, 2 objects

ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0,
flags 0x0

Object  lvl   iblk   dblk   dsize  lsize   %full  type
     0    7    16K    16K   21.0K    16K    6.25  DMU dnode
     1    5    16K     8K    240G   250G   97.33  zvol object
     2    1    16K    512   1.50K    512  100.00  zvol prop


Re: [zfs-discuss] Large scale ZFS deployments out there (200 disks)

2010-02-04 Thread Mertol Ozyoney
We have 50+ X4500/X4540s running happily with ZFS in the same DC.
Approximately 2500 drives and growing every day...

Br
Mertol 



Mertol Ozyoney 
Storage Practice - Sales Manager

Sun Microsystems, TR
Istanbul TR
Phone +902123352200
Mobile +905339310752
Fax +90212335
Email mertol.ozyo...@sun.com



-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Henrik Johansen
Sent: Friday, January 29, 2010 10:45 AM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Large scale ZFS deployments out there (200
disks)

On 01/28/10 11:13 PM, Lutz Schumann wrote:
 While thinking about ZFS as the next generation filesystem without
 limits I am wondering if the real world is ready for this kind of
 incredible technology ...

 I'm actually speaking of hardware :)

 ZFS can handle a lot of devices. Once in the import bug
 (http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6761786)
 is fixed it should be able to handle a lot of disks.

That was fixed in build 125.

 I want to ask the ZFS community and users what large scale deployments
 are out there.  How many disks? How much capacity? Single pool or
 many pools on a server? How does resilver work in those
 environments? How do you back up? What is the experience so far?
 Major headaches?

 It would be great if large scale users would share their setups and
 experiences with ZFS.

The largest ZFS deployment that we have is currently comprised of 22 
Dell MD1000 enclosures (330 750 GB Nearline SAS disks). We have 3 head 
nodes and use one zpool per node, comprised of rather narrow (5+2) 
RAIDZ2 vdevs. This setup is exclusively used for storing backup data.

Resilver times could be better - I am sure that this will improve once 
we upgrade from S10u9 to 2010.03.

One of the things that I am missing in ZFS is the ability to prioritize 
background operations like scrub and resilver. All our disks are idle 
during daytime and I would love to be able to take advantage of this, 
especially during resilver operations.

This setup has been running for about a year with no major issues so 
far. The only hiccups we've had were all HW related (no fun in firmware 
upgrading 200+ disks).

 Will you ? :) Thanks, Robert


-- 
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-02-04 Thread Tonmaus
Hi Simon

 I.e. you'll have to manually intervene
 if a consumer drive causes the system to hang, and
 replace it, whereas the RAID edition drives will
 probably report the error quickly and then ZFS will
 rewrite the data elsewhere, and thus maybe not kick
 the drive.

IMHO the relevant aspects are whether ZFS is able to give an accurate account of cache 
flush status and even realize when a drive is not responsive. That being said, I 
have not seen a specific report of ZFS kicking green drives, either at random or in a 
pattern, the way the poor SoHo storage enclosure users experience all the time.

 
 So it sounds preferable to have TLER in operation, if
 one can find a consumer-priced drive that allows it,
 or just take the hit and go with whatever non-TLER
 drive you choose and expect to have to manually
 intervene if a drive plays up. OK for home user where
 he is not too affected, but not good for businesses
 which need to have something recovered quickly.

One point about TLER is that two error correction schemes compete when you run a 
consumer drive on an active RAID controller that has its own mechanisms. When you 
run ZFS on a RAID controller, contrary to the best practice recommendations, an 
analogous question arises. On the other hand, if you run a green consumer drive on 
a dumb HBA, I wouldn't know what is wrong with it in the first place. 
As for manual interventions, the only one I am aware of would be to re-attach a 
single drive. Not an option if you are really affected like those miserable Thecus 
N7000 users who see the entire array of only a handful of drives drop out within 
hours - over and over again - or never even get to finish formatting the stripe set.
The dire consequences of the rumored TLER problems lead me to believe that there 
would be many more, and quite specific, reports in this place if this were a 
systematic issue with ZFS. Other than that, we are operating outside supported 
specs when running consumer level drives in large arrays - so far at least from 
the perspective of Seagate and WD.

 
  That all rather points to singular issues with
  firmware bugs or similar than to a systematic
 issue,
  doesn't it?
 
 I'm not sure. Some people in the WDC threads seem to
 report problems with pauses during media streaming
 etc. 

This was again for SoHo storage enclosures - not for ZFS, right?

  when the
 32MB+ cache is empty, then it loads another 32MB into
 cache etc and so on? 

I am not sure any current disk has cache management so simplistic that it relies on 
completely cycling the buffer content, let alone for reads that belong to a single 
file (a disk is basically agnostic of files). Moreover, such buffer management would 
be completely useless for a striped array. I don't know much better what a disk 
cache does either, but I am afraid that direction is probably not helpful for 
understanding certain phenomena people have reported.

I think that at this time we are seeing quite a large number of evolutions going on 
in disk storage, where many established assumptions are being abandoned while 
backwards compatibility is not always taken care of. SAS 6G (will my controller 
really work in a PCIe 1.1 slot?) and 4k sectors are certainly only the prominent 
examples. It's probably truer than ever to fall back on established technologies in 
such times, including biting the bullet of a cost premium on occasion.

Best regards

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Henu
So do you mean I cannot gather the names and locations of  
changed/created/removed files just by analyzing a stream of  
(incremental) zfs_send?


Quoting Andrey Kuzmin andrey.v.kuz...@gmail.com:

On Wed, Feb 3, 2010 at 6:11 PM, Ross Walker rswwal...@gmail.com wrote:

On Feb 3, 2010, at 9:53 AM, Henu henrik.he...@tut.fi wrote:


Okay, so first of all, it's true that send is always fast and 100%
reliable because it uses blocks to see differences. Good, and thanks for
this information. If everything else fails, I can parse the information I
want from send stream :)

But am I right, that there is no other methods to get the list of changed
files other than the send command?


At zfs_send level there are no files, just DMU objects (modified in
some txg which is the basis for changed/unchanged decision).




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Henu
Whoa! That is exactly what I've been looking for. Is there any  
development version publicly available for testing?


Regards,
Henrik Heino

Quoting Matthew Ahrens matthew.ahr...@sun.com:
This is RFE 6425091 "want 'zfs diff' to list files that have changed  
between snapshots", which covers both file and directory changes, and  
file removal/creation/renaming.  We actually have a prototype of zfs  
diff. Hopefully someday we will finish it up...


--matt



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Ian Collins

Henu wrote:
So do you mean I cannot gather the names and locations of 
changed/created/removed files just by analyzing a stream of 
(incremental) zfs_send?


That's correct, you can't.  Snapshots do not work at the file level.

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread dick hoogendijk

Frank Cusack wrote:
 Is it possible to emulate a unionfs with zfs and zones somehow?  My zones
 are sparse zones and I want to make part of /usr writable within a zone.
 (/usr/perl5/mumble to be exact)

Why don't you just export that directory with NFS (rw) to your sparse zone
and mount it on /usr/perl5/mumble ? Or is this too simple a thought?
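Roughly like this (a sketch only - the share path and hostname are placeholders, and
loopback NFS mounts from the global zone come with their own caveats):

In the global zone:
# share -F nfs -o rw /export/perl5-extra

In the sparse zone:
# mount -F nfs globalhost:/export/perl5-extra /usr/perl5/mumble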

-- 
Dick Hoogendijk -- PGP/GnuPG key: F86289CE

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] What would happen with a zpool if you 'mirrored' a disk...

2010-02-04 Thread Karl Pielorz


Hi All,

I've been using ZFS for a while now - and everything's been going well. I 
use it under FreeBSD - but this question should almost certainly have the 
same answer whether it's FreeBSD or Solaris (I think/hope :)...



Imagine if I have a zpool with 2 disks in it, that are mirrored:


NAME STATE READ WRITE CKSUM
vol  ONLINE   0 0 0
  mirrorONLINE   0 0 0
ad1 ONLINE   0 0 0
ad2 ONLINE   0 0 0


(The device names are FreeBSD disks)

If I offline 'ad2' - and then did:


dd if=/dev/ad1 of=/dev/ad2


(i.e. make a mirror copy of ad1 to ad2 - on a *running* system).


What would happen when I tried to 'online' ad2 again?


I fully expect it might not be pleasant... I'm just curious as to what's 
going to happen.



When I 'online' ad2 will ZFS look at it, and be clever enough to figure out 
the disk is obviously corrupt/unusable/has bad meta data on it - and 
resilver accordingly?


Or is it going to see what it thinks is another 'ad1' and get a little 
upset?



I'm trying to setup something here so I can test what happens - I just 
thought I'd ask around a bit to see if anyone knows what'll happen from 
past experience.



Thanks,

-Karl

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Roshan Perera
Hi All,

Anyone in the group using ZFS compression on ClearCase VOBs? If so, any issues or 
gotchas?

IBM support informs us that ZFS compression is not supported. Any views on this?

Rgds

Roshan

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Darren J Moffat

On 04/02/2010 11:54, Roshan Perera wrote:

Anyone in the group using ZFS compression on clearcase vobs? If so any issues, 
gotchas?


There shouldn't be any issues, and I'd be very surprised if there were.


IBM support informs that ZFS compression is not supported. Any views on this?


We need more data on why they claim it isn't supported - what issue have 
they seen, or do they think there could be one?  I see no reason that ZFS 
compression wouldn't be supported; in fact Clearcase shouldn't even be 
able to tell.


Compression in ZFS is completely below the POSIX filesystem layer and 
completely out of the control of any application or even kernel service 
like NFS or CIFS that just uses POSIX interfaces.  Same is true of 
deduplication and will be true of encryption when it integrates as well.
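For reference, enabling it and checking the result is just a dataset property - 
'tank/vobstore' is a placeholder name:

# zfs set compression=on tank/vobstore
# zfs get compression,compressratio tank/vobstore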


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Peter Tribble
On Thu, Feb 4, 2010 at 2:09 AM, Frank Cusack
frank+lists/z...@linetwo.net wrote:
 Is it possible to emulate a unionfs with zfs and zones somehow?  My zones
 are sparse zones and I want to make part of /usr writable within a zone.
 (/usr/perl5/mumble to be exact)

 I can't just mount a writable directory on top of /usr/perl5 because then
 it hides all the stuff in the global zone.  I could repopulate it in the
 local zone but ugh that is unattractive.  I'm hoping for a better way.
 Creating a full zone is not an option for me.

 I don't think this is possible but maybe someone else knows better.  I
 was thinking something with snapshots and clones?

The way I normally do this is to (in the global zone) symlink /usr/perl5/mumble
to somewhere that would be writable such as /opt, and then put what you need
into that location in the zone. Leaves a dangling symlink in the global zone and
other zones, but that's relatively harmless.

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Roshan Perera
Hi Darren,

Thanks - IBM basically haven't tested ClearCase with ZFS compression and therefore 
don't currently support it. That may change in the future, but as it stands my customer 
cannot use compression. I have asked IBM for roadmap info to find out whether/when it 
will be supported. 

Thanks
Roshan

- Original Message -
From: Darren J Moffat darr...@opensolaris.org
Date: Thursday, February 4, 2010 11:59 am
Subject: Re: [zfs-discuss] ZFS compression on Clearcase
To: Roshan Perera roshan.per...@sun.com
Cc: zfs-discuss@opensolaris.org


 On 04/02/2010 11:54, Roshan Perera wrote:
  Anyone in the group using ZFS compression on clearcase vobs? If so 
 any issues, gotchas?
  
  There shouldn't be any issues and I'd be very surprised if there was.
  
  IBM support informs that ZFS compression is not supported. Any views 
 on this?
  
  Need more data on why the claim it isn't supported - what issue have 
 they seen or do they thing there could be.  I see no reason that ZFS 
 compression wouldn't be supported, in fact Clearcase shouldn't even be 
 able to tell.
  
  Compression in ZFS is completely below the POSIX filesystem layer and 
 completely out of the control of any application or even kernel 
 service like NFS or CIFS that just uses POSIX interfaces.  Same is 
 true of deduplication and will be true of encryption when it 
 integrates as well.
  
  -- 
  Darren J Moffat
  
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Darren Mackay
Hi Ross,

 zdb -  f...@snapshot | grep path | nawk '{print $2}'
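Extending that idea, a rough sketch for comparing the file lists of two snapshots - 
'tank/fs' and the snapshot names are placeholders, and I'm assuming -dddd is the 
verbosity level that emits the per-object path lines; note this only catches 
added/removed/renamed paths, not content-only changes:

# zdb -dddd tank/fs@snap1 | grep path | nawk '{print $2}' | sort > /tmp/snap1.files
# zdb -dddd tank/fs@snap2 | grep path | nawk '{print $2}' | sort > /tmp/snap2.files
# comm -3 /tmp/snap1.files /tmp/snap2.files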

Enjoy!

Darren Mackay
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Thomas Maier-Komor
On 04.02.2010 12:12, dick hoogendijk wrote:
 
 Frank Cusack wrote:
 Is it possible to emulate a unionfs with zfs and zones somehow?  My zones
 are sparse zones and I want to make part of /usr writable within a zone.
 (/usr/perl5/mumble to be exact)
 
 Why don't you just export that directory with NFS (rw) to your sparse zone
 and mount it on /usr/perl5/mumble ? Or is this too simple a thought?
 
What about lofs? I think lofs is the equivalent of unionfs on Solaris.

E.g.

mount -F lofs /originial/path /my/alternate/mount/point

- Thomas

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Darren J Moffat

On 04/02/2010 12:13, Roshan Perera wrote:

Hi Darren,

Thanks - IBM basically haven't test clearcase with ZFS compression therefore, 
they don't support currently. Future may change, as such my customer cannot use 
compression. I have asked IBM for roadmap info to find whether/when it will be 
supported.


That is FUD generation in my opinion and being overly cautious.  The 
whole point of the POSIX interfaces to a filesystem is that applications 
don't actually care how the filesystem stores their data.


UFS never had checksums before but ZFS adds those, but that didn't mean 
that applications had to be checked because checksums were now done on 
the data.


What if it was the disk drive that was doing the compression ?  There 
would be similarly no way for the application to actually know that it 
is happening.


What about every other feature we add to ZFS ?  Like dedup (which is a 
type of compression) - again the app can't tell.  Or snapshots - the 
app can't tell.


That's my opinion, though, and I know that ISVs can be very cautious about 
new features, sometimes overly so when the change is far below their part of 
the stack.


Taking another example, it would be like an ISV that supports their 
application running over NFS saying they don't support a certain vendor's 
switch in the network because they haven't tested it.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] verging OT: how to buy J4500 w/o overpriced drives

2010-02-04 Thread Eugen Leitl
On Wed, Feb 03, 2010 at 03:02:21PM -0800, Brandon High wrote:

 Another solution, for a true DIY x4500: BackBlaze has schematics for
 the 45 drive chassis that they designed available on their website.
 http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
 
 Someone brought it up on the list a few months ago (which is how I
 know about it) and there was some interesting discussion at that time.

IIRC the consensus was that the vibration dampening was inadequate,
the interfaces were oversubscribed, and the disks, not being nearline
class, were too unreliable - but I might be misremembering.

I'm still happy with my 16x WD RE4 drives (linux mdraid RAID 10,
CentOS, Oracle, no zfs). Supermicro does 36x drive chassis now
http://www.supermicro.com/products/chassis/4U/?chs=847 so budget
DIY for zfs is about 72 TByte raw storage with 2 TByte nearline
SATA drives.

I've had trouble finding internal 2x 2.5" in one 3.5" 
SSD mounts from Supermicro for hybrid zfs, but no doubt one 
could improvise something from the usual ricer supplies. 

On a smaller scale, http://www.supermicro.com/products/chassis/2U/?chs=216
works well with 2.5" Intel SSDs and VelociRaptors. I hope to be able
to use one for a hybrid zfs iSCSI target for VMWare, probably with
10 GBit Ethernet.

 There's no way I would use something like this for most installs, but
 there is definitely some use. Now that opensolaris supports sata pmp,
 you could use a similar chassis for a zfs pool.

-- 
Eugen* Leitl leitl http://leitl.org
__
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Roshan Perera
Hi Darren,

I totally agree with you and have raised some of the points mentioned, but you 
have given me even more items to pass on.
I will update the alias when I hear more.

Many Thanks

Roshan


- Original Message -
From: Darren J Moffat darr...@opensolaris.org
Date: Thursday, February 4, 2010 12:42 pm
Subject: Re: [zfs-discuss] ZFS compression on Clearcase
To: Roshan Perera roshan.per...@sun.com
Cc: zfs-discuss@opensolaris.org


 On 04/02/2010 12:13, Roshan Perera wrote:
  Hi Darren,
  
  Thanks - IBM basically haven't test clearcase with ZFS compression 
 therefore, they don't support currently. Future may change, as such my 
 customer cannot use compression. I have asked IBM for roadmap info to 
 find whether/when it will be supported.
  
  That is FUD generation in my opinion and being overly cautious.  The 
 whole point of the POSIX interfaces to a filesystem is that 
 applications don't actually care how the filesystem stores their data.
  
  UFS never had checksums before but ZFS adds those, but that didn't 
 mean that applications had to be checked because checksums were now 
 done on the data.
  
  What if it was the disk drive that was doing the compression ?  There 
 would be similarly no way for the application to actually know that it 
 is happening.
  
  What about every other feature we add to ZFS ?  Like dedup (which is 
 a type of compression) - again they app can't tell.  Or snapshots - 
 the app can't tell.
  
  Thats my opinion though and I know that ISVs can be very cautious 
 about new features sometimes and overly so when it is far below their 
 parts of the stack.
  
  Taking another example it would be like an ISV that supports their 
 application running over NFS saying they don't support a certain type 
 of vendors switch in the network because they haven't tested it.
  
  -- 
  Darren J Moffat
  
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Darren Mackay
Looking through some more code... I was a bit premature in my last post - it's been a 
long day.

Extracting the guids and querying the metadata seems logical - I think running a 
zfs send just to parse the data stream is a lot of overhead when you really only 
need to traverse the metadata directly.

The zdb sources have most of the bits there - you just need to unwind the deadlist 
(this seems to match the number of blocks that have been deleted since the last 
snap)...

I might look into this in the next week or two if I have time - it seems like a 
worthwhile project ;-)

Darren Mackay
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Ross Walker





On Feb 4, 2010, at 2:00 AM, Tomas Ögren st...@acc.umu.se wrote:


On 03 February, 2010 - Frank Cusack sent me these 0,7K bytes:

On February 3, 2010 12:04:07 PM +0200 Henu henrik.he...@tut.fi  
wrote:

Is there a possibility to get a list of changed files between two
snapshots? Currently I do this manually, using basic file system
functions offered by OS. I scan every byte in every file manually  
and it

 ^^^

On February 3, 2010 10:11:01 AM -0500 Ross Walker rswwal...@gmail.com 


wrote:
Not a ZFS method, but you could use rsync with the dry run option  
to list

all changed files between two file systems.


That's exactly what the OP is already doing ...


rsync by default compares metadata first, and only checks through  
every

byte if you add the -c (checksum) flag.

I would say rsync is the best tool here.

The find -newer blah suggested in other posts won't catch newer  
files

with an old timestamp (which could happen for various reasons, like
being copied with kept timestamps from somewhere else).


find -newer doesn't catch files added or removed; it assumes identical  
trees.


I would be interested in comparing ddiff, bart and rsync (local  
comparison only) to see empirically how they match up.
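For reference, a minimal sketch of the rsync dry-run comparison being discussed, run 
directly against the two snapshot directories (paths are placeholders; the trailing 
slashes matter):

# rsync -a -n -i --delete /tank/fs/.zfs/snapshot/new/ /tank/fs/.zfs/snapshot/old/

-n does a dry run, -i itemizes each difference, and --delete also reports files that 
exist only in the older snapshot.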


-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Darren Mackay
The delete queue and related blocks need further investigation...

r...@osol-dev:/data/zdb-test# zdb -dd data/zdb-test | more
Dataset data/zdb-test [ZPL], ID 641, cr_txg 529804, 24.5K, 6 objects

Object  lvl   iblk   dblk   dsize  lsize   %full  type
     0    7    16K    16K   15.0K    16K   18.75  DMU dnode
    -1    1    16K    512      1K    512  100.00  ZFS user/group used
    -2    1    16K    512      1K    512  100.00  ZFS user/group used
     1    1    16K    512      1K    512  100.00  ZFS master node
     2    1    16K    512      1K    512  100.00  ZFS delete queue
     3    1    16K  1.50K      1K  1.50K  100.00  ZFS directory
     4    1    16K    512      1K    512  100.00  ZFS directory
    19    1    16K    512     512    512  100.00  ZFS plain file
    22    1    16K     2K      2K     2K  100.00  ZFS plain file


All the info seems to be there (otherwise, we would not be able to store files 
at all!!).

Another *spare time* project for the coming couple of weeks...

Darren
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What would happen with a zpool if you 'mirrored' a disk...

2010-02-04 Thread Karl Pielorz


--On 04 February 2010 11:31 + Karl Pielorz kpielorz_...@tdx.co.uk 
wrote:



What would happen when I tried to 'online' ad2 again?


A reply to my own post... I tried this out, when you make 'ad2' online 
again, ZFS immediately logs a 'vdev corrupt' failure, and marks 'ad2' 
(which at this point is a byte-for-byte copy of 'ad1' as it was being 
written to in background) as 'FAULTED' with 'corrupted data'.


You can't replace it with itself at that point, but a detach on ad2, and 
then attaching ad2 back to ad1 results in a resilver, and recovery.


So to answer my own question - from my tests it looks like you can do this, 
and get away with it. It's probably not ideal, but it does work.


A safer bet would be to detach the drive from the pool, and then re-attach 
it (at which point ZFS assumes it's a new drive and probably ignores the 
'mirror image' data that's on it).


-Karl

(The reason for testing this is because of a weird RAID setup I have where 
if 'ad2' fails, and gets replaced - the RAID controller is going to mirror 
'ad1' over to 'ad2' - and cannot be stopped. However, once the re-mirroring 
is complete the RAID controller steps out of the way, and allows raw access to 
each disk in the mirror. Strange, a long story - but true).

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Booting OpenSolaris on ZFS root on Sun Netra 240

2010-02-04 Thread Saso Kiselkov
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

I'm kind stuck at trying to get my aging Netra 240 machine to boot
OpenSolaris. The live CD and installation worked perfectly, but when I
reboot and try to boot from the installed disk, I get:

Rebooting with command: boot disk0
Boot device: /p...@1c,60/s...@2/d...@0,0  File and args:
|
The file just loaded does not appear to be executable.


I suspect it's due to the fact that my OBP can't boot a ZFS root
(OpenBoot 4.22.19). Is there a way to work around this?

Regards,
- --
Saso
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAktqz7kACgkQRO8UcfzpOHCqhgCgl8I+5zCTBLb0MUVq9cz5zrqz
9LgAoIurhee3/+nfXtUBwVczkjKxQVaj
=7dXF
-END PGP SIGNATURE-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What would happen with a zpool if you 'mirrored' a disk...

2010-02-04 Thread Jacob Ritorto
Seems your controller is actually doing only harm here, or am I missing
something?

On Feb 4, 2010 8:46 AM, Karl Pielorz kpielorz_...@tdx.co.uk wrote:


--On 04 February 2010 11:31 + Karl Pielorz kpielorz_...@tdx.co.uk
wrote:

 What would happen...
A reply to my own post... I tried this out, when you make 'ad2' online
again, ZFS immediately logs a 'vdev corrupt' failure, and marks 'ad2' (which
at this point is a byte-for-byte copy of 'ad1' as it was being written to in
background) as 'FAULTED' with 'corrupted data'.

You can't replace it with itself at that point, but a detach on ad2, and
then attaching ad2 back to ad1 results in a resilver, and recovery.

So to answer my own question - from my tests it looks like you can do this,
and get away with it. It's probably not ideal, but it does work.

A safer bet would be to detach the drive from the pool, and then re-attach
it (at which point ZFS assumes it's a new drive and probably ignores the
'mirror image' data that's on it).

-Karl

(The reason for testing this is because of a weird RAID setup I have where
if 'ad2' fails, and gets replaced - the RAID controller is going to mirror
'ad1' over to 'ad2' - and cannot be stopped. However, once the re-mirroring
is complete the RAID controller steps out the way, and allows raw access to
each disk in the mirror. Strange, a long story - but true).


___
zfs-discuss mailing list
zfs-disc...@opensolaris.or...
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Marc Nicholas
I think you'll do just fine then. And I think the extra platter will
work to your advantage.

-marc

On 2/3/10, Simon Breden sbre...@gmail.com wrote:
 Probably 6 in a RAID-Z2 vdev.

 Cheers,
 Simon
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


-- 
Sent from my mobile device
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs/sol10u8 less stable than in sol10u5?

2010-02-04 Thread Carsten Aulbert
Hi all,

it might not be a ZFS issue (and thus on the wrong list), but maybe there's 
someone here who might be able to give us a good hint:

We are operating 13 x4500 and started to play with non-Sun blessed SSDs in 
there. As we were running Solaris 10u5 before and wanted to use them as log 
devices we upgraded to the latest and greatest 10u8 and changed the zpool 
layout[1]. However, on the first machine we found many, many problems with 
various disks failing in different vdevs (I wrote about this in December on 
this list IIRC).

After going through this with Sun they gave us hints but mostly blamed (maybe 
rightfully) the Intel X25-E in there; we considered the 2.5" to 3.5" converter 
to be at fault as well. Thus we did the next test by placing the SSD into the 
tray without a conversion unit, but that box (a different one) failed with the 
same problems.

Now, we learned from this experience and did the same to another box but 
without the SSD, i.e. jumpstarted the box and installed 10u8, redid the zpool 
and started to fill data in. In today's scrub suddenly this happened:

s09:~# zpool status   
  pool: atlashome 
 state: DEGRADED  
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors  
using 'zpool clear' or replace the device with 'zpool replace'. 
   see: http://www.sun.com/msg/ZFS-8000-9P  
 scrub: resilver in progress for 0h9m, 3.89% done, 4h2m to go   
config: 

        NAME          STATE     READ WRITE CKSUM
        atlashome     DEGRADED     0     0     0
          raidz1      ONLINE       0     0     0
            c0t0d0    ONLINE       0     0     0
            c1t0d0    ONLINE       0     0     0
            c4t0d0    ONLINE       0     0     0
            c6t0d0    ONLINE       0     0     0
            c7t0d0    ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            c0t1d0    ONLINE       0     0     0
            c1t1d0    ONLINE       0     0     0
            c4t1d0    ONLINE       0     0     0
            c5t1d0    ONLINE       0     0     0
            c6t1d0    ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            c7t1d0    ONLINE       0     0     1
            c0t2d0    ONLINE       0     0     0
            c1t2d0    ONLINE       0     0     2
            c4t2d0    ONLINE       0     0     0
            c5t2d0    ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            c6t2d0    ONLINE       0     0     0
            c7t2d0    ONLINE       0     0     0
            c0t3d0    ONLINE       0     0     0
            c1t3d0    ONLINE       0     0     0
            c4t3d0    ONLINE       0     0     0
          raidz1      DEGRADED     0     0     0
            c5t3d0    ONLINE       0     0     0
            c6t3d0    ONLINE       0     0     0
            c7t3d0    ONLINE       0     0     0
            c1t4d0    ONLINE       0     0     1
            spare     DEGRADED     0     0     0
              c4t4d0  DEGRADED     5     0    11  too many errors
              c0t4d0  ONLINE       0     0     0  5.38G resilvered
          raidz1      ONLINE       0     0     0
            c5t4d0    ONLINE       0     0     0
            c6t4d0    ONLINE       0     0     0
            c7t4d0    ONLINE       0     0     0
            c0t5d0    ONLINE       0     0     0
            c1t5d0    ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            c4t5d0    ONLINE       0     0     0
            c5t5d0    ONLINE       0     0     0
            c6t5d0    ONLINE       0     0     0
            c7t5d0    ONLINE       0     0     0
            c0t6d0    ONLINE       0     0     1
          raidz1      ONLINE       0     0     0
            c1t6d0    ONLINE       0     0     0
            c4t6d0    ONLINE       0     0     0
            c5t6d0    ONLINE       0     0     0
            c6t6d0    ONLINE       0     0     0
            c7t6d0    ONLINE       0     0     1
          raidz1      ONLINE       0     0     0
            c0t7d0    ONLINE       0     0     0
            c1t7d0    ONLINE       0     0     0
            c4t7d0    ONLINE       0     0     0
            c5t7d0    ONLINE       0     0     0
            c6t7d0    ONLINE       0     0     0
        spares
          c0t4d0      INUSE     currently in use
          c7t7d0      AVAIL


Also, similar to the other hosts was the much, much higher soft/hard error 
count in iostat:

s09:~# iostat -En|grep Soft
c2t0d0   Soft Errors: 1 Hard Errors: 2 Transport 

Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Tonmaus
Hi Arnaud,

which type of controller is this?

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What would happen with a zpool if you 'mirrored' a disk...

2010-02-04 Thread Karl Pielorz


--On 04 February 2010 08:58 -0500 Jacob Ritorto jacob.rito...@gmail.com 
wrote:



Seems your controller is actually doing only harm here, or am I missing
something?


The RAID controller presents the drives as both a mirrored pair, and JBOD - 
*at the same time*...


The machine boots off the partition on the 'mirrored' pair - and ZFS uses 
the JBOD devices (a different area of the disks, of course).


It's a little weird to say the least - and I wouldn't recommend it, but it 
does work 'for me' - and is a way of getting the system to boot off a 
mirror, and still be able to use ZFS with only 2 drives available in the 
chassis.


-Karl
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Arnaud Brand

On 04/02/10 16:57, Tonmaus wrote:

Hi Arnaud,

which type of controller is this?

Regards,

Tonmaus
   
I use two LSI SAS3081E-R in each server (16 hard disk trays, passive 
backplane AFAICT, no expander).

Works very well.

Arnaud
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What would happen with a zpool if you 'mirrored' a disk...

2010-02-04 Thread Robert Milkowski

On 04/02/2010 13:45, Karl Pielorz wrote:


--On 04 February 2010 11:31 + Karl Pielorz 
kpielorz_...@tdx.co.uk wrote:



What would happen when I tried to 'online' ad2 again?


A reply to my own post... I tried this out, when you make 'ad2' online 
again, ZFS immediately logs a 'vdev corrupt' failure, and marks 'ad2' 
(which at this point is a byte-for-byte copy of 'ad1' as it was being 
written to in background) as 'FAULTED' with 'corrupted data'.


You can't replace it with itself at that point, but a detach on ad2, 
and then attaching ad2 back to ad1 results in a resilver, and recovery.


So to answer my own question - from my tests it looks like you can do 
this, and get away with it. It's probably not ideal, but it does work.


it is actually fine - zfs is designed to detect and fix corruption like 
the one you induced.



A safer bet would be to detach the drive from the pool, and then 
re-attach it (at which point ZFS assumes it's a new drive and probably 
ignores the 'mirror image' data that's on it).




Yes, it should, and if you want to force resynchronization that's 
probably the best way to do it.
Another thing: if you suspect some of your data is corrupted on one 
half of the mirror, you might try running zpool scrub, as it will fix only 
the corrupted blocks instead of resynchronizing the entire mirror, which 
might be a faster and safer approach.



--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when: file-corrupted and no-redundancy?

2010-02-04 Thread Robert Milkowski

On 03/02/2010 21:45, Aleksandr Levchuk wrote:

Hardware RAID6 + hot spare worked well for us. So, I wanted to stick
with our SAN for data protection. I understand that the end-to-end checks
of ZFS make it better at detecting corruption.

In my case, I can imagine that ZFS would FREEZE the whole volume when a
single block or file is found to be corrupted.

Ideally, I would not like this to happen and instead get a log with
names of corrupted files.

What exactly does happens when zfs detects a corrupted block/file and
does not have redundancy to correct it?

Alex

   

I will repeat myself (as I sent the email below just yesterday...)

ZFS won't freeze a pool if a single block is corrupted, even if no 
redundancy is configured at the zfs level.


zpool status -v should provide you with a list of affected files, which you 
should be able to delete. In the case of a corrupted block containing metadata, 
zfs should actually be able to fix it on the fly for you, as all 
metadata-related blocks are kept in at least two copies even if no 
redundancy is configured at pool level.


Let's test it:

mi...@r600:~# mkfile 128m file1
mi...@r600:~# zpool create test `pwd`/file1
mi...@r600:~# zpool status test
  pool: test
 state: ONLINE
 scrub: none requested
config:

NAME                        STATE     READ WRITE CKSUM
test                        ONLINE       0     0     0
  /export/home/milek/file1  ONLINE       0     0     0

errors: No known data errors
mi...@r600:~#
mi...@r600:~# cp /bin/bash /test/file1
mi...@r600:~# cp /bin/bash /test/file2
mi...@r600:~# cp /bin/bash /test/file3
mi...@r600:~# cp /bin/bash /test/file4
mi...@r600:~# cp /bin/bash /test/file5
mi...@r600:~# cp /bin/bash /test/file6
mi...@r600:~# cp /bin/bash /test/file7
mi...@r600:~# cp /bin/bash /test/file8
mi...@r600:~# cp /bin/bash /test/file9
mi...@r600:~# sync
mi...@r600:~# dd if=/dev/zero of=file1 seek=50 count=1 conv=notrunc
1+0 records in
1+0 records out
512 bytes (5.1 MB) copied, 0.179617 s, 28.5 MB/s
mi...@r600:~# sync
mi...@r600:~# zpool scrub test
mi...@r600:~# zpool status -v test
  pool: test
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 0h0m with 7 errors on Thu Feb  4 00:18:40 
2010

config:

NAME                        STATE     READ WRITE CKSUM
test                        DEGRADED     0     0     7
  /export/home/milek/file1  DEGRADED     0     0    29  too many errors


errors: Permanent errors have been detected in the following files:

/test/file1
mi...@r600:~#
mi...@r600:~# rm /test/file1
mi...@r600:~# sync
mi...@r600:~# zpool scrub test
mi...@r600:~# zpool status -v test
  pool: test
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 0h0m with 0 errors on Thu Feb  4 00:19:55 
2010

config:

NAME                        STATE     READ WRITE CKSUM
test                        DEGRADED     0     0     7
  /export/home/milek/file1  DEGRADED     0     0    29  too many errors


errors: No known data errors
mi...@r600:~# zpool clear test
mi...@r600:~# zpool scrub test
mi...@r600:~# zpool status -v test
  pool: test
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Thu Feb  4 00:20:12 
2010

config:

NAME                        STATE     READ WRITE CKSUM
test                        ONLINE       0     0     0
  /export/home/milek/file1  ONLINE       0     0     0

errors: No known data errors
mi...@r600:~#
mi...@r600:~# ls -la /test/
total 7191
drwxr-xr-x  2 root root 10 2010-02-04 00:19 .
drwxr-xr-x 28 root root 30 2010-02-04 00:17 ..
-r-xr-xr-x  1 root root 799040 2010-02-04 00:17 file2
-r-xr-xr-x  1 root root 799040 2010-02-04 00:17 file3
-r-xr-xr-x  1 root root 799040 2010-02-04 00:17 file4
-r-xr-xr-x  1 root root 799040 2010-02-04 00:17 file5
-r-xr-xr-x  1 root root 799040 2010-02-04 00:17 file6
-r-xr-xr-x  1 root root 799040 2010-02-04 00:17 file7
-r-xr-xr-x  1 root root 799040 2010-02-04 00:18 file8
-r-xr-xr-x  1 root root 799040 2010-02-04 00:18 file9
mi...@r600:~#


--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Tonmaus
Hi again,

thanks for the answer. Another thing that came to my mind is that you mentioned 
that you mixed the disks among the controllers. Does that mean you mixed them 
among pools as well? Unsurprisingly, the WD20EADS is slower than the Hitachi, 
which is a fixed 7200 rpm drive. I wonder what impact that would have if you use 
them as vdevs of the same pool.

Cheers,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Alex Blewitt
On 4 Feb 2010, at 16:35, Bob Friesenhahn wrote:

 On Thu, 4 Feb 2010, Darren J Moffat wrote:
 Thanks - IBM basically haven't test clearcase with ZFS compression 
 therefore, they don't support currently. Future may change, as such my 
 customer cannot use compression. I have asked IBM for roadmap info to find 
 whether/when it will be supported.
 
 That is FUD generation in my opinion and being overly cautious.  The whole 
 point of the POSIX interfaces to a filesystem is that applications don't 
 actually care how the filesystem stores their data.
 
 Clearcase itself implements a versioning filesystem so perhaps it is not 
 being overly cautious.  Compression could change aspects such as how free 
 space is reported.

I'd also like to echo Bob's observations here. Darren's FUD call is based on 
limited experience of ClearCase, I expect ...

On the client side, ClearCase actually presents itself as a mounted filesystem, 
regardless of what the OS has under the covers. In other words, a ClearCase 
directory will never be 'ZFS' because it's not ZFS, it's ClearCaseFS. On the 
server side (which might be the case here) the way ClearCase works is to 
represent the files and contents in a way more akin to a database (e.g. Oracle) 
than traditional file-system approaches to data (e.g. CVS, SVN). In much the 
same way that there are app-specific issues with ZFS (e.g. matching block sizes, 
dealing with ZFS snapshots on a VM image and so forth) there may well be some 
with ClearCase.

At the very least, though, IBM may just be unable or unwilling to test it at the 
time and put their stamp of approval on it. In many cases for IBM products, 
there are supported platforms (often with specific patch levels), much like 
there are officially supported Solaris platforms and hot-fixes to go for certain 
applications. They may well just be being cautious about what they support until 
they've had time to test it out for themselves - or more likely, until the first 
set of paying customers wants to get invoiced for the investigation. But to claim 
it's FUD without any real data to back it up is just FUD^2.

Alex
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Robert Milkowski

On 04/02/2010 12:42, Darren J Moffat wrote:

On 04/02/2010 12:13, Roshan Perera wrote:

Hi Darren,

Thanks - IBM basically haven't test clearcase with ZFS compression
therefore, they don't support currently. Future may change, as such
my customer cannot use compression. I have asked IBM for roadmap info
to find whether/when it will be supported.


That is FUD generation in my opinion and being overly cautious.  The
whole point of the POSIX interfaces to a filesystem is that
applications don't actually care how the filesystem stores their data.



I agree (*). It is very similar to what EMC did some years ago by 
officially stating that while ZFS is supported on their disk arrays, ZFS 
snapshots are not. Even funnier.



(*) - however, compression is not entirely transparent in the sense 
that reported disk space usage might not be exactly what an application 
expects. But I'm not saying it is an issue here - I honestly don't know.
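To illustrate the (*) point, a quick sketch - 'tank/ctest' is a placeholder - where ls 
keeps reporting the logical file size while du reports the smaller, compressed on-disk 
usage:

# zfs create -o compression=on tank/ctest
# cp /usr/dict/words /tank/ctest/words
# ls -l /tank/ctest/words
# du -h /tank/ctest/words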



--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Frank Cusack
On February 4, 2010 12:12:04 PM +0100 dick hoogendijk d...@nagual.nl 
wrote:

Why don't you just export that directory with NFS (rw) to your sparse zone
and mount it on /usr/perl5/mumble ? Or is this too simple a thought?


On February 4, 2010 1:41:20 PM +0100 Thomas Maier-Komor 
tho...@maier-komor.de wrote:

What about lofs? I thinks lofs is the equivalent for unionfs on Solaris.


The problem with both of those solutions is a) writes will overwrite the
original filesystem data and b) writes will be visible to everyone else.

Neither suggestion provides unionfs capability.

On February 4, 2010 12:12:18 PM + Peter Tribble 
peter.trib...@gmail.com wrote:

The way I normally do this is to (in the global zone) symlink
/usr/perl5/mumble to somewhere that would be writable such as /opt, and
then put what you need into that location in the zone. Leaves a dangling
symlink in the global zone and other zones, but that's relatively
harmless.


The problem with that is you don't see the underlying data that exists
in the global zone.  I do use that technique for other data (e.g. the
entire /usr/local hierarchy), but it doesn't meet my desired needs in
this case.

I looked into clones (and at least now I understand them much better
than before) and they *almost* provide the functionality I want.  I
could mount a clone in the zoned version of /foo and it would see the
original /foo, and changes would go to the clone only, just like a real
unionfs.

What it's lacking though is that when the underlying filesystem changes
(in the global zone), those changes don't percolate up to the clone.
The clone's base view of files is from the snapshot it was generated
from, which cannot change.  It would be great if you could re-target
(or re-base?) a clone from a different snapshot than the one it was
originally generated from.  Since I don't need realtime updates, for
my purposes that would be a great equivalent to a true unionfs.

So the thread on zfs diff gave me an idea; I will use clones and will
write a 'zfs diff'-like tool.  When the original /usr/perl5/mumble
changes I will use that to pick out files that are different in the
clone and populate a new clone with them.
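As a sketch of the clone part (dataset, zone and path names are hypothetical; zonecfg's 
add fs would be an alternative way to expose the clone, and the re-base step is exactly 
what the home-grown 'zfs diff'-like tool would have to drive):

# zfs snapshot rpool/perl5@base
# zfs clone rpool/perl5@base rpool/zones/myzone-perl5
# zfs set mountpoint=/zones/myzone/root/usr/perl5/mumble rpool/zones/myzone-perl5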

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Frank Cusack

BTW, I could just install everything in the global zone and use the
default inheritance of /usr into each local zone to see the data.
But then my zones are not independent portable entities; they would
depend on some non-default software installed in the global zone.

Just wanted to explain why this is valuable to me and not just some
crazy way to do something simple.

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Frank Cusack

On 2/4/10 8:00 AM +0100 Tomas Ögren wrote:

rsync by default compares metadata first, and only checks through every
byte if you add the -c (checksum) flag.

I would say rsync is the best tool here.


ah, i didn't know that was the default.  no wonder recently when i was
incremental-rsyncing a few TB of data between 2 hosts (not using zfs)
i didn't get any speedup from --size-only or whatever the flag is.


The find -newer blah suggested in other posts won't catch newer files
with an old timestamp (which could happen for various reasons, like
being copied with kept timestamps from somewhere else).


good point.  that is definitely a restriction with find -newer.  but if
you meet that restriction, and don't need to find added or deleted files,
it will be faster since only 1 directory tree has to be walked.

but in the general case it does sound like rsync is the best.  unless
bart can find added and missing files.  in which case bart is better
because it only has to walk 1 dir tree -- assuming you have a saved
manifest from a previous walk over the original dir tree.

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Frank Cusack

On 2/4/10 8:21 AM -0500 Ross Walker wrote:

Find -newer doesn't catch files added or removed it assumes identical
trees.


This may be redundant in light of my earlier post, but yes it does.
Directory mtimes are updated when a file is added or removed, and
find -newer will detect that.
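For completeness, the usual marker-file pattern (paths are placeholders):

# touch /tank/fs/.lastscan
  ... time passes, files come and go ...
# find /tank/fs -newer /tank/fs/.lastscan -print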

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Nicolas Williams
On Thu, Feb 04, 2010 at 03:19:15PM -0500, Frank Cusack wrote:
 BTW, I could just install everything in the global zone and use the
 default inheritance of /usr into each local zone to see the data.
 But then my zones are not independent portable entities; they would
 depend on some non-default software installed in the global zone.
 
 Just wanted to explain why this is valuable to me and not just some
 crazy way to do something simple.

There's no unionfs for Solaris.

(For those of you who don't know, unionfs is a BSDism and is a
pseudo-filesystem which presents the union of two underlying
filesystems, but with all changes being made only to one of the two
filesystems.  The idea is that one of the underlying filesystems cannot
be modified through the union, with all changes made through the union
being recorded in an overlay fs.  Think, for example, of unionfs-
mounting read-only media containing sources: you could cd to the mount
point and build the sources, with all intermediate files and results
placed in the overlay.)
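For readers who haven't seen it, roughly how it looks on a system that has it - this
assumes FreeBSD's mount_unionfs(8), where the writable overlay directory is given first
and gets attached above the existing tree:

# mount_unionfs /var/overlay/usr-src /usr/src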

In Frank's case, IIUC, the better solution is to avoid the need for
unionfs in the first place by not placing pkg content in directories
that one might want to be writable from zones.  If there's anything
about Perl5 (or anything else) that causes this need to arise, then I
suggest filing a bug.

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Frank Cusack

On 2/4/10 2:46 PM -0600 Nicolas Williams wrote:

In Frank's case, IIUC, the better solution is to avoid the need for
unionfs in the first place by not placing pkg content in directories
that one might want to be writable from zones.  If there's anything
about Perl5 (or anything else) that causes this need to arise, then I
suggest filing a bug.


Right, and thanks for chiming in.  Problem is that perl wants to install
add-on packages in places that coincide with the system install.
Most stuff is limited to the site_perl directory, which is easily
redirected, but it also has some other locations it likes to meddle with.

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Nicolas Williams
On Thu, Feb 04, 2010 at 04:03:19PM -0500, Frank Cusack wrote:
 On 2/4/10 2:46 PM -0600 Nicolas Williams wrote:
 In Frank's case, IIUC, the better solution is to avoid the need for
 unionfs in the first place by not placing pkg content in directories
 that one might want to be writable from zones.  If there's anything
 about Perl5 (or anything else) that causes this need to arise, then I
 suggest filing a bug.
 
 Right, and thanks for chiming in.  Problem is that perl wants to install
 add-on packages in places that the coincide with the system install.
 Most stuff is limited to the site_perl directory, which is easily
 redirected, but it also has some other locations it likes to meddle with.

Maybe we need a zone_perl location.  Judicious use of the search paths
will get you out of this bind, I think.
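
Untested, but something along these lines should keep the add-ons out of the
pkg-owned directories (the prefix is just an example, and INSTALL_BASE needs a
reasonably recent ExtUtils::MakeMaker):

# build the module against a zone-writable prefix instead of site_perl
perl Makefile.PL INSTALL_BASE=/export/perl
make && make install
# then put that prefix on the search path
PERL5LIB=/export/perl/lib/perl5; export PERL5LIB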

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Pool disk replacing fails

2010-02-04 Thread Alexander M. Stetsenko

Hi all,
I'm trying to replace a broken LUN in the pool using zpool replace -f lun, 
but it fails. The physical disk is already replaced, and the new LUN has the 
same address as the broken one.  But zpool detach/attach works.

This is a simple configuration:

 pool: mypool
state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
   attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
   using 'zpool clear' or replace the device with 'zpool replace'.
  see: http://www.sun.com/msg/ZFS-8000-9P
scrub: resilver completed after 0h0m with 0 errors on Thu Feb  4 23:16:21 2010

config:

        NAME        STATE     READ WRITE CKSUM
        mypool      DEGRADED     0     0     0
          mirror    DEGRADED     0     0     0
            c1t4d0  DEGRADED     0     0    28  too many errors
            c1t5d0  ONLINE       0     0     0



c1t4d0 is the physically replaced LUN. Then I'm trying to replace it in the pool.

r...@myhost:~# zpool replace -f mypool c1t4d0
invalid vdev specification
the following errors must be manually repaired:
/dev/dsk/c1t4d0s0 is part of active ZFS pool mypool. Please see zpool(1M).

The zpool manual says: -f    Forces use of new_device, even if it appears 
to be in use. Not all devices can be overridden in this manner.



c1t4d0 is in use only in mypool.
What is the problem with zpool replace in my case? According to the 
zpool manual it should work.
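
For reference, the detach/attach sequence that does work is roughly this
(typed from memory, so treat it as a sketch):

# drop the faulted side of the mirror, then attach the new LUN to the survivor
zpool detach mypool c1t4d0
zpool attach mypool c1t5d0 c1t4d0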


Thank you


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Travis Tabbal
Supermicro USAS-L8i controllers. 

I agree with you, I'd much rather have the drives respond properly and promptly 
than save a little power if that means I'm going to get strange errors from the 
array. And these are the green drives; they just don't seem to cause me any 
problems. The issues people have noted with WD have made me stay away from them, 
since just about every drive I own ends up in some kind of RAID at some point in 
its life. I have a couple of laptop drives that are single; all desktops have at 
least a mirror. I'm a little nuts and would probably install mirrors in the 
laptops if there were somewhere to put them. :)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Darren Mackay
Hi Ross,

Yes - zdb - is dumping out info in the form of:

Object  lvl   iblk   dblk  dsize  lsize   %full  type
    19    1    16K    512    512    512  100.00  ZFS plain file
                                 264   bonus  ZFS znode
dnode flags: USED_BYTES USERUSED_ACCOUNTED 
dnode maxblkid: 0
path    /snapshot.sh
uid     0
gid     0
atime   Thu Feb  4 23:04:50 2010
mtime   Thu Feb  4 23:04:50 2010
ctime   Thu Feb  4 23:04:50 2010
crtime  Thu Feb  4 23:04:50 2010
gen     529806
mode    100755
size    174
parent  3
links   
xattr   0
rdev    0x


for all objects referenced in the snap.

Perhaps if you wanted to script this, you could parse the above output for time 
stamps that are after the previous snapshot.

Deleted files (and of course new files) can be diffed against the list for the 
snapshot you want to compare with, but I assume you also want files that have 
been modified, hence the requirement to parse the above outputs.

Unfortunately time does not permit me to come up with a working solution until 
mid next week (really snowed under - did someone say there is meant to be a 
weekend in there too?). But I am sure there is enough info here for someone 
to hack together a script.
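
As a starting point, something like this (untested, dataset name made up) pulls
the path/mtime pairs out of the zdb dump; comparing the timestamps against the
previous snapshot's creation time is the part that still needs doing:

zdb -dddd tank/home@snap2 2>/dev/null | nawk '
        /^[ \t]*path[ \t]/  { sub(/^[ \t]*path[ \t]+/, "");  p = $0 }
        /^[ \t]*mtime[ \t]/ { sub(/^[ \t]*mtime[ \t]+/, ""); print p "\t" $0 }
'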

Cheers,

Darren Mackay
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Brian
I am starting to put together a home NAS server that will have the following 
roles:

(1) Store TV recordings from SageTV over either iSCSI or CIFS.  Up to 4 or 5 HD 
streams at a time.  These will be streamed live to the NAS box during recording.
(2) Playback TV (could be stream being recorded, could be others) to 3 or more 
extenders
(3) Hold a music repository
(4) Hold backups from windows machines, mac (time machine), linux.
(5) Be an iSCSI target for several different Virtual Boxes.

Function 4 will use compression and deduplication.
Function 5 will use deduplication.

I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2 mirrored 
boot drives.  

I have been reading these forums off and on for about 6 months trying to figure 
out how to best piece together this system.

I am first trying to select the CPU.  I am leaning towards AMD because of ECC 
support and power consumption.

For items such as de-duplication, compression, checksums, etc., is it better to 
get a faster clock speed, or should I consider more cores?  I know certain 
functions such as compression may run on multiple cores.

I have so far narrowed it down to:

AMD Phenom II X2 550 Black Edition Callisto 3.1GHz
and
AMD Phenom X4 9150e Agena 1.8GHz Socket AM2+ 65W Quad-Core

As they are roughly the same price.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Glenn Lagasse
* Brian (broco...@vt.edu) wrote:
 I am Starting to put together a home NAS server that will have the
 following roles:
 
 (1) Store TV recordings from SageTV over either iSCSI or CIFS.  Up to
 4 or 5 HD streams at a time.  These will be streamed live to the NAS
 box during recording.  (2) Playback TV (could be stream being
 recorded, could be others) to 3 or more extenders (3) Hold a music
 repository (4) Hold backups from windows machines, mac (time machine),
 linux.  (5) Be an iSCSI target for several different Virtual Boxes.
 
 Function 4 will use compression and deduplication.  Function 5 will
 use deduplication.
 
 I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2
 mirrored boot drives.  
 
 I have been reading these forums off and on for about 6 months trying
 to figure out how to best piece together this system.
 
 I am first trying to select the CPU.  I am leaning towards AMD because
 of ECC support and power consumption.

I can't comment on most of your question, but I will point you at:

http://blogs.sun.com/mhaywood/entry/powernow_for_solaris

I *think* the CPUs you're looking at won't be an issue, but it's something
to be aware of when looking at AMD kit (especially if you want to manage
the processor speed).

Cheers,

-- 
Glenn
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Peter Radig
I was interested in the impact the type of an SSD has on the performance of the 
ZIL. So I did some benchmarking and just want to share the results.

My test case is simply untarring the latest ON source (528 MB, 53k files) on a 
Linux system that has a ZFS file system mounted via NFS over gigabit ethernet.

I got the following results:
- locally on the Solaris box: 30 sec
- remotely with no dedicated ZIL device: 36 min 37 sec (factor 73 compared to 
local)
- remotely with ZIL disabled: 1 min 54 sec (factor 3.8 compared to local)
- remotely with an OCZ VERTEX SATA II 120 GB as ZIL device: 14 min 40 sec 
(factor 29.3 compared to local)
- remotely with an Intel X25-E 32 GB as ZIL device: 3 min 11 sec (factor 6.4 
compared to local)

So it really makes a difference what type of SSD you use for your ZIL device. I 
was expecting good performance from the X25-E, but was really surprised that 
it is that good (only 1.7 times slower than with the ZIL completely 
disabled). So I will use the X25-E as the ZIL device on my box and will not 
consider disabling the ZIL at all to improve NFS performance.
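
(In case it helps anyone trying this: adding the SSD as a dedicated log device
is a one-liner; the pool and device names below are only placeholders.)

zpool add tank log c2t1d0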

-- Peter
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Marc Nicholas
I would go with cores (threads) rather than clock speed here. My home system
is a 4-core AMD @ 1.8Ghz and performs well.

I wouldn't use drives that big and you should be aware of the overheads of
RaidZ[x].

-marc



On Thu, Feb 4, 2010 at 6:19 PM, Brian broco...@vt.edu wrote:

 I am Starting to put together a home NAS server that will have the
 following roles:

 (1) Store TV recordings from SageTV over either iSCSI or CIFS.  Up to 4 or
 5 HD streams at a time.  These will be streamed live to the NAS box during
 recording.
 (2) Playback TV (could be stream being recorded, could be others) to 3 or
 more extenders
 (3) Hold a music repository
 (4) Hold backups from windows machines, mac (time machine), linux.
 (5) Be an iSCSI target for several different Virtual Boxes.

 Function 4 will use compression and deduplication.
 Function 5 will use deduplication.

 I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2
 mirrored boot drives.

 I have been reading these forums off and on for about 6 months trying to
 figure out how to best piece together this system.

 I am first trying to select the CPU.  I am leaning towards AMD because of
 ECC support and power consumption.

 For items such as de-dupliciation, compression, checksums etc.  Is it
 better to get a faster clock speed or should I consider more cores?  I know
 certain functions such as compression may run on multiple cores.

 I have so far narrowed it down to:

 AMD Phenom II X2 550 Black Edition Callisto 3.1GHz
 and
 AMD Phenom X4 9150e Agena 1.8GHz Socket AM2+ 65W Quad-Core

 As they are roughly the same price.
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Arnaud Brand




On 04/02/10 20:26, Tonmaus wrote:

  Hi again,

thanks for the answer. Another thing that came to my mind is that you mentioned that you mixed the disks among the controllers. Does that mean you mixed them as well among pools? Unsurprisingly,  the WD20EADS is slower than the Hitachi that is a fixed 7200 rpm drive. I wonder what impact that would have if you use them as vdevs of the same pool.

Cheers,

Tonmaus
  

Yes, we mixed them among controllers and pools.
We've done something that's not recommended: a 15 disk raidz3 pool.

Disks are as follows :
c3 (LSI SAS) has :
- 1x 64 GB Intel X25E
- 3 x 2TB WD20EADS
- 4 x 2TB Hitachi
c2 (LSI SAS) has :
- 4 x 2TB WD20EADS
- 4 x 2TB Hitachi
c5 (motherboard ICH10, if I remember correctly) has:
- 1x160GB 2,5'' WD
- DVD

All the 2TB drives are in the raidz3 zpool named tank (we've been very
innovative here ;-).
The X25E is sliced into 20GB for the system, 1GB for the ZIL for tank, and
the rest as cache for tank.

The 2.5'' 160GB WD was not initially part of the setup since we were
planning to slice the 2TB drives into 32GB for the system (mirrored
across all drives) and the rest for the big zpool, while the X25E was
just there for the ZIL and the cache, but two things we've read on
lists and forums made us change our minds:
- the disk write cache is disabled when you're not using the whole
drive 
- some reports on this list about the X25E losing up to 256 cache flushes
in case of power failures.

So we bought this 160GB disk (it was really the last thing that could
fit in the chassis) and sliced it in the same way as the X25E.
The system and the ZIL are mirrored between the X25E and the WD160.
We do not use the WD160 for the cache: we thought it would be better
to save IOPS on this disk for the ZIL mirror.
I don't know whether it's a good idea to mirror the ZIL on such a disk,
but we prefer having a slower setup and not losing that many cache flushes
on power failure.
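
In zpool terms the layout amounts to roughly the following (the slice numbers
here are placeholders, not the exact ones we used):

zpool add tank log mirror c3t0d0s1 c5t0d0s1    # ZIL mirrored across X25E and WD160
zpool add tank cache c3t0d0s3                  # L2ARC on the X25E only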

Regarding the performance obtained by using only Hitachi disks, I can't tell;
I haven't tested it, and can't do it right now as the system is in
preproduction testing.

Also, I should have mentioned in my previous post that some WD20EADS
(the 32SB0) have shorter response times (as reported by iostat). 
They're even "faster" than the Hitachi: I've seen them quite a few
times in the range 0.3 to 1.5 ms, which seems far too short for this
kind of drive.
I suspect they're sort of dropping flush requests. Add to it that 2 out
of 3 failed WD20EADS were 32SB0 and you get the picture...
Note they might also be hybrid drives with some flash memory which
allows quick acknowledgment of writes, but I think we would have heard
of such a feature on this list.

Arnaud


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Brian
Thanks for the reply.

Are cores better because the compression/deduplication is multi-threaded 
or because of multiple streams?  It is a pretty big difference in clock speed, 
so I'm curious as to why more cores would be better.  Glad to see your 4-core 
system is working well for you - so it seems like I won't really have a bad choice.

Why avoid large drives?  Reliability reasons?  My main thought on that is that 
there is a 3 year warranty and I am building raidz2 because I expect failure.  
Or are there other reasons to avoid large drives?

I thought I understood the overhead..  The write and read speeds should be 
roughly that of the slowest disk? 

Thanks.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Marc Nicholas
Very interesting stats -- thanks for taking the time and trouble to share
them!

One thing I found interesting is that the Gen 2 X25-M has higher write IOPS
than the X25-E according to Intel's documentation (6,600 IOPS for 4K writes
versus 3,300 IOPS for 4K writes on the E). I wonder if it'd perform better
as a ZIL? (The write latency on both drives is the same).

-marc

On Thu, Feb 4, 2010 at 6:43 PM, Peter Radig pe...@radig.de wrote:

 I was interested in the impact the type of an SSD has on the performance of
 the ZIL. So I did some benchmarking and just want to share the results.

 My test case is simply untarring the latest ON source (528 MB, 53k files)
 on an Linux system that has a ZFS file system mounted via NFS over gigabit
 ethernet.

 I got the following results:
 - locally on the Solaris box: 30 sec
 - remotely with no dedicated ZIL device: 36 min 37 sec (factor 73 compared
 to local)
 - remotely with ZIL disabled: 1 min 54 sec (factor 3.8 compared to local)
 - remotely with a OCZ VERTEX SATA II 120 GB as ZIL device: 14 min 40 sec
 (factor 29.3 compared to local)
 - remotely with an Intel X25-E 32 GB as ZIL device: 3 min 11 sec (factor
 6.4 compared to local)

 So it really makes a difference what type of SSD you use for your ZIL
 device. I was expecting a good performance from the X25-E, but was really
 suprised that it is that good (only 1.7 times slower than it takes with ZIL
 completely disabled). So I will use the X25-E as ZIL device on my box and
 will not consider disabling ZIL at all to improve NFS performance.

 -- Peter
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Richard Elling
Put your money into RAM, especially for dedup.
 -- richard

On Feb 4, 2010, at 3:19 PM, Brian wrote:

 I am Starting to put together a home NAS server that will have the following 
 roles:
 
 (1) Store TV recordings from SageTV over either iSCSI or CIFS.  Up to 4 or 5 
 HD streams at a time.  These will be streamed live to the NAS box during 
 recording.
 (2) Playback TV (could be stream being recorded, could be others) to 3 or 
 more extenders
 (3) Hold a music repository
 (4) Hold backups from windows machines, mac (time machine), linux.
 (5) Be an iSCSI target for several different Virtual Boxes.
 
 Function 4 will use compression and deduplication.
 Function 5 will use deduplication.
 
 I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2 mirrored 
 boot drives.  
 
 I have been reading these forums off and on for about 6 months trying to 
 figure out how to best piece together this system.
 
 I am first trying to select the CPU.  I am leaning towards AMD because of ECC 
 support and power consumption.
 
 For items such as de-dupliciation, compression, checksums etc.  Is it better 
 to get a faster clock speed or should I consider more cores?  I know certain 
 functions such as compression may run on multiple cores.
 
 I have so far narrowed it down to:
 
 AMD Phenom II X2 550 Black Edition Callisto 3.1GHz
 and
 AMD Phenom X4 9150e Agena 1.8GHz Socket AM2+ 65W Quad-Core
 
 As they are roughly the same price.
 -- 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Andrew Gabriel

Peter Radig wrote:

I was interested in the impact the type of an SSD has on the performance of the 
ZIL. So I did some benchmarking and just want to share the results.

My test case is simply untarring the latest ON source (528 MB, 53k files) on an 
Linux system that has a ZFS file system mounted via NFS over gigabit ethernet.

I got the following results:

- remotely with no dedicated ZIL device: 36 min 37 sec (factor 73 compared to 
local)

- remotely with an Intel X25-E 32 GB as ZIL device: 3 min 11 sec (factor 6.4 
compared to local)
  


That's about the same ratio I get when I demonstrate this on the 
SSD/Flash/Turbocharge Discovery Days I run in the UK from time to time (the 
name changes over time ;-).


--
Andrew Gabriel
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Mounting a snapshot of an iSCSI volume using Windows

2010-02-04 Thread Scott Meilicke
I have a single zfs volume, shared out using COMSTAR and connected to a Windows 
VM. I am taking snapshots of the volume regularly. I now want to mount a 
previous snapshot, but when I go through the process, Windows sees the new 
volume, but thinks it is blank and wants to initialize it. Any ideas how to get 
Windows to see that it has data on it?

Steps I took after the snap:

zfs clone snapshot data01/san/gallardo/g-recovery
sbdadm create-lu /dev/zvol/rdsk/data01/san/gallardo/g-recovery
stmfadm add-view -h HG-Gallardo -t TG-Gallardo -n 1 
600144F0EAE40A004B6B59090003

At this point, my server Gallardo can see the LUN, but like I said, it looks 
blank to the OS. I suspect the 'sbdadm create-lu' phase.
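
If I am reading the sbdadm man page correctly (not verified), the clone should
already carry the LU metadata of the original zvol, so importing it rather than
creating a fresh LU might preserve the data layout:

sbdadm import-lu /dev/zvol/rdsk/data01/san/gallardo/g-recovery
stmfadm add-view -h HG-Gallardo -t TG-Gallardo -n 1 <GUID reported by import-lu>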

Any help to get Windows to see it as a LUN with NTFS data would be appreciated.

Thanks,
Scott
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Arnaud Brand




On 05/02/10 01:00, Brian wrote:

  Thanks for the reply.

Are cores better because of the compression/deduplication being mult-threaded or because of multiple streams?  It is a pretty big difference in clock speed - so curious as to why core would be better.  Glad to see your 4 core system is working well for you - so seems like I won't really have a bad choice.

Why avoid large drives?  Reliability reasons?  My main thought on that is that there is a 3 year warranty and I am building raidz2 because I expect failure.  Or are there other reasons to avoid large drives?

I thought I understood the overhead..  The write and read speeds should be roughly that of the slowest disk? 

Thanks.
  

From what I saw, ZFS scales terribly well with
multiple cores. 
If you want to send/receive your filesystems through ssh to another
machine, speed matters since ssh only uses one core (but then you can
always use netcat).
On Xeon E5520 running at 2.27 GHz we achieve around 70/80 MB/s ssh
throughput.
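
For the record, the netcat variant looks roughly like this (the port number is
arbitrary, the dataset names are made up, and the exact nc flags vary between
netcat builds, so check your nc(1)):

# on the receiving box, listen first
nc -l 9000 | zfs receive tank/backup
# then on the sending box
zfs send tank/data@snap1 | nc receiver-host 9000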

For dedup, you want lots of RAM and if possible a large and fast ssd
for L2ARC.
Someone on this list was asking about estimates on ram/cache needs
based on blocksizes / fs size / estimated dedup ratio.
Either I missed the answer or there was no really simple answer (other
than more is better, which always stays true for ram and l2arc).
Anyway, we tested it and were surprised about the quantity of reads
that ensue.

Arnaud



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Cindy Swearingen

Hi Brian,

If you are considering testing dedup, particularly on large datasets, 
see the list of known issues, here:


http://hub.opensolaris.org/bin/view/Community+Group+zfs/dedup

Start with build 132.

Thanks,

Cindy


On 02/04/10 16:19, Brian wrote:

I am Starting to put together a home NAS server that will have the following 
roles:

(1) Store TV recordings from SageTV over either iSCSI or CIFS.  Up to 4 or 5 HD 
streams at a time.  These will be streamed live to the NAS box during recording.
(2) Playback TV (could be stream being recorded, could be others) to 3 or more 
extenders
(3) Hold a music repository
(4) Hold backups from windows machines, mac (time machine), linux.
(5) Be an iSCSI target for several different Virtual Boxes.

Function 4 will use compression and deduplication.
Function 5 will use deduplication.

I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2 mirrored boot drives.  


I have been reading these forums off and on for about 6 months trying to figure 
out how to best piece together this system.

I am first trying to select the CPU.  I am leaning towards AMD because of ECC 
support and power consumption.

For items such as de-dupliciation, compression, checksums etc.  Is it better to 
get a faster clock speed or should I consider more cores?  I know certain 
functions such as compression may run on multiple cores.

I have so far narrowed it down to:

AMD Phenom II X2 550 Black Edition Callisto 3.1GHz
and
AMD Phenom X4 9150e Agena 1.8GHz Socket AM2+ 65W Quad-Core

As they are roughly the same price.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Brian
It sounds like the consensus is more cores over clock speed.  Surprising to me 
since the difference in clock speed was over 1GHz.  So I will go with a quad 
core.

I was leaning towards 4GB of ram - which hopefully should be enough for dedup 
as I am only planning on dedupping my smaller file systems (backups and VMs).
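
(In other words, I would only turn it on per filesystem, something like the
following with made-up dataset names, rather than pool-wide:)

zfs set dedup=on tank/backups
zfs set compression=on tank/backups
zfs set dedup=on tank/vm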

Was my raidz2 performance comment above correct?  That the write speed is that 
of the slowest disk?  That is what I believe I have read.

Now on to the hard part of picking a motherboard that is supported and has 
enough SATA ports!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Marc Nicholas
On Thu, Feb 4, 2010 at 7:54 PM, Brian broco...@vt.edu wrote:

 It sounds like the consensus is more cores over clock speed.  Surprising to
 me since the difference in clocks speed was over 1Ghz.  So, I will go with a
 quad core.


Four cores @ 1.8Ghz = 7.2Ghz of threaded performance ([Open]Solaris is
relatively decent in terms of threading).

Two cores @ 3.1Ghz = 6.2Ghz

:)

You may find single-threaded operations slower, as someone pointed
out, but even those might wash out, as sometimes it's I/O that's the problem.

I was leaning towards 4GB of ram - which hopefully should be enough for
 dedup as I am only planning on dedupping my smaller file systems (backups
 and VMs)


4GB is a good start.


 Was my raidz2 performance comment above correct?  That the write speed is
 that of the slowest disk?  That is what I believe I have read.


You are sort-of-correct that it's the write speed of the slowest disk.

Mirrored drives will be faster, especially for random I/O. But you sacrifice
storage for that performance boost. That said, I have a similar setup as far
as number of spindles and can push 200MB/sec+ through it and saturate GigE
for iSCSI so maybe I'm being harsh on raidz2 :)


 Now on to the hard part of picking a motherboard that is supported and has
 enough SATA ports!


I used an ASUS board (M4A785-M) which has six (6) SATA2 ports onboard and
pretty decent Hypertransport throughput.

Hope that helps.

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Edward Ned Harvey
 I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2
 mirrored boot drives.

You want to use compression and deduplication and raidz2.  I hope you didn't
want to get any performance out of this system, because all of those are
compute or IO intensive.

FWIW ... 5 disks in raidz2 will have capacity of 3 disks.  But if you bought
6 disks in mirrored configuration, you have a small extra cost, and much
better performance.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Brian
Interesting comments..

But I am confused.

Performance for my backups (compression/deduplication) would most likely not be 
#1 priority.

I want my VMs to run fast - so is it deduplication that really slows things 
down?

Are you saying raidz2 would overwhelm current I/O controllers to the point where 
I could not saturate a 1 Gb network link?

Is the CPU I am looking at not capable of doing dedup and compression?  Or are 
no CPUs capable of doing that currently?  If I only enable it for the backup 
filesystem will all my filesystems suffer performance wise?

Where are the bottlenecks in a raidz2 system that I will only access over a 
single gigabit link?  Are they insurmountable?



  I plan to start with 5 1.5 TB drives in a raidz2
 configuration and 2
  mirrored boot drives.
 
 You want to use compression and deduplication and
 raidz2.  I hope you didn't
 want to get any performance out of this system,
 because all of those are
 compute or IO intensive.
 
 FWIW ... 5 disks in raidz2 will have capacity of 3
 disks.  But if you bought
 6 disks in mirrored configuration, you have a small
 extra cost, and much
 better performance.
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discu
 ss
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Bob Friesenhahn

On Thu, 4 Feb 2010, Brian wrote:

Was my raidz2 performance comment above correct?  That the write 
speed is that of the slowest disk?  That is what I believe I have 
read.


Data in raidz2 is striped so that it is split across multiple disks. 
In this (sequential) sense it is faster than a single disk.  For 
random access, the stripe performance can not be faster than the 
slowest disk though.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Bob Friesenhahn

On Thu, 4 Feb 2010, Marc Nicholas wrote:


Very interesting stats -- thanks for taking the time and trouble to share them!

One thing I found interesting is that the Gen 2 X25-M has higher write IOPS 
than the
X25-E according to Intel's documentation (6,600 IOPS for 4K writes versus 3,300 
IOPS for
4K writes on the E). I wonder if it'd perform better as a ZIL? (The write 
latency on
both drives is the same).


The write IOPS between the X25-M and the X25-E are different since 
with the X25-M, much more of your data gets completely lost.  Most of 
us prefer not to lose our data.


The X25-M is about as valuable as a paper weight for use as a zfs 
slog.  Toilet paper would be a step up.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Marc Nicholas
On Thu, Feb 4, 2010 at 10:18 PM, Bob Friesenhahn 
bfrie...@simple.dallas.tx.us wrote:

 On Thu, 4 Feb 2010, Marc Nicholas wrote:

  Very interesting stats -- thanks for taking the time and trouble to share
 them!

 One thing I found interesting is that the Gen 2 X25-M has higher write
 IOPS than the
 X25-E according to Intel's documentation (6,600 IOPS for 4K writes versus
 3,300 IOPS for
 4K writes on the E). I wonder if it'd perform better as a ZIL? (The
 write latency on
 both drives is the same).


 The write IOPS between the X25-M and the X25-E are different since with the
 X25-M, much more of your data gets completely lost.  Most of us prefer not
 to lose our data.

 Would you like to qualify your statement further?

While I understand the difference between MLC and SLC parts, I'm pretty sure
Intel didn't design the M version to make data get completely lost. ;)

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Bob Friesenhahn

On Thu, 4 Feb 2010, Marc Nicholas wrote:


The write IOPS between the X25-M and the X25-E are different since with the 
X25-M, much
more of your data gets completely lost.  Most of us prefer not to lose our data.

Would you like to qualify your statement further?


Google is your friend.  And check earlier on this list/forum as well.


While I understand the difference between MLC and SLC parts, I'm pretty sure 
Intel didn't
design the M version to make data get completely lost. ;)


It loses the most recently written data, even after a cache sync 
request.  A number of people have verified this for themselves and 
posted results.  Even the X25-E has been shown to lose some 
transactions.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Marc Nicholas
On Thu, Feb 4, 2010 at 10:35 PM, Bob Friesenhahn 
bfrie...@simple.dallas.tx.us wrote:

 On Thu, 4 Feb 2010, Marc Nicholas wrote:


 The write IOPS between the X25-M and the X25-E are different since with
 the X25-M, much
 more of your data gets completely lost.  Most of us prefer not to lose our
 data.

 Would you like to qualify your statement further?


 Google is your friend.  And check earlier on this list/forum as well.

  While I understand the difference between MLC and SLC parts, I'm pretty
 sure Intel didn't
 design the M version to make data get completely lost. ;)


 It loses the most recently written data, even after a cache sync request.
  A number of people have verified this for themselves and posted results.
  Even the X25-E has been shown to lose some transactions.

 The devices have some DRAM (16MB) that is used for write amplification
levelling. The sudden loss of power means that this DRAM doesn't get flushed
to Flash. This is the very reason the STEC devices have a supercap.

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Edward Ned Harvey
 Data in raidz2 is striped so that it is split across multiple disks.

Partial truth.
Yes, the data is on more than one disk, but it's a parity hash, requiring
computation overhead and a write operation on each and every disk.  It's not
simply striped.  Whenever you read or write, you need to access all the
disks (or a bunch of 'em) and use compute cycles to generate the actual data
stream.  I don't know enough about the underlying methods of calculating and
distributing everything to say intelligently *why*, but I know this:

 In this (sequential) sense it is faster than a single disk.  

Whenever I benchmark raid5 versus a mirror, the mirror is always faster.
Noticeably and measurably faster, as in 50% to 4x faster.  (50% for a single
disk mirror versus a 6-disk raid5, and 4x faster for a stripe of mirrors, 6
disks with the capacity of 3, versus a 6-disk raid5.)  Granted, I'm talking
about raid5 and not raidz.  There is possibly a difference there, but I
don't think so.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Edward Ned Harvey
 I want my VMs to run fast - so is it deduplication that really slows
 things down?
 
 Are you saying raidz2 would overwhelm current I/O controllers to where
 I could not saturate 1 GB network link?
 
 Is the CPU I am looking at not capable of doing dedup and compression?
 Or are no CPUs capable of doing that currently?  If I only enable it
 for the backup filesystem will all my filesystems suffer performance
 wise?
 
 Where are the bottlenecks in a raidz2 system that I will only access
 over a single gigabit link?  Are the insurmountable?

I'm not sure if anybody can answer your questions.  I will suggest you just
try things out, and see for yourself.  Everybody would have different
techniques to tweak performance...

If you want to use fast compression and dedup, lots of cpu and ram.  (You
said 4G, but I don't think that's a lot.  I never buy a laptop with less
than 4G nowadays.  I think a lot of ram is 16G and higher.)

As for raidz2, and Ethernet ... I don't know.  If you've got 5 disks in a
raidz2 configuration ... Assuming each disk can sustain 500Mbits, then
theoretically these disks might be able to achieve 1.5Gbit or 2.5Gbit with
perfect efficiency ... So maybe they can max out your Ethernet.  I don't
know.  But I do know, if you had a stripe of 3 mirrors, they would have
absolutely no trouble maxing out the Ethernet.  Even a single mirror could
just barely do that.  For 2 or more mirrors, it's cake.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Erik Trimble

Brian wrote:

Interesting comments..

But I am confused.

Performance for my backups (compression/deduplication) would most likely not be 
#1 priority.

I want my VMs to run fast - so is it deduplication that really slows things 
down?
  
Dedup requires a fair amount of CPU, but it really wants a big L2ARC and 
RAM.  I'd seriously consider no less than 8GB of RAM, and look at 
getting a smaller-sized (~40GB) SSD, something on the order of an Intel 
X25-M.


Also, iSCSI-served VMs tend to do mostly random I/O, which is better 
handled by a striped mirror than RaidZ. 


Are you saying raidz2 would overwhelm current I/O controllers to where I could 
not saturate 1 GB network link?
  

No.


Is the CPU I am looking at not capable of doing dedup and compression?  Or are 
no CPUs capable of doing that currently?  If I only enable it for the backup 
filesystem will all my filesystems suffer performance wise?
  
All the CPUs you indicate can handle the job, it's a matter of getting 
enough data to them.



Where are the bottlenecks in a raidz2 system that I will only access over a 
single gigabit link?  Are the insurmountable?
  
RaidZ is good for streaming writes of large size, where you should get 
performance roughly equal to the number of data drives.  Likewise, for 
streaming reads.  Small writes generally limit performance to a level of 
about 1 disk, regardless of the number of data drives in the RaidZ. 
Small reads are in-between in terms of performance.



Personally, I'd look into having 2 different zpools - a striped mirror 
for your iSCSI-shared VMs, and a raidz2 for your main storage. 
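
Something along these lines, with the disk names obviously being placeholders:

# striped mirror for the iSCSI-shared VMs
zpool create vmpool mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0
# raidz2 for the bulk storage
zpool create tank raidz2 c1t4d0 c1t5d0 c1t6d0 c1t7d0 c1t8d0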

In any case, for dedup, you really should have an SSD for L2ARC, if at 
all possible.  Being able to store all the metadata for the entire zpool 
in the L2ARC really, really helps speed up dedup.



Also, about your CPU choices, look here for a good summary of the 
current AMD processor features:


http://en.wikipedia.org/wiki/List_of_AMD_Phenom_microprocessors

(this covers the Phenom, Phenom II, and Athlon II families).


The main difference between the various models comes down to amount of 
L3 cache, and HT speed.  I'd be interested in doing some benchmarking to 
see exactly how the variations make a difference.



--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Jesus Cea
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 02/05/2010 03:21 AM, Edward Ned Harvey wrote:
 FWIW ... 5 disks in raidz2 will have capacity of 3 disks.  But if you bought
 6 disks in mirrored configuration, you have a small extra cost, and much
 better performance.

But the raidz2 can survive the loss of ANY two disks, while the 6 disk
mirror configuration will be destroyed if the two disks lost are from
the SAME pair.

- -- 
Jesus Cea Avion _/_/  _/_/_/_/_/_/
j...@jcea.es - http://www.jcea.es/ _/_/_/_/  _/_/_/_/  _/_/
jabber / xmpp:j...@jabber.org _/_/_/_/  _/_/_/_/_/
.  _/_/  _/_/_/_/  _/_/  _/_/
Things are not so easy  _/_/  _/_/_/_/  _/_/_/_/  _/_/
My name is Dump, Core Dump   _/_/_/_/_/_/  _/_/  _/_/
El amor es poner tu felicidad en la felicidad de otro - Leibniz
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBS2ukAZlgi5GaxT1NAQKD6wQAjI7zTFGmsHKtrhfSGS65edDecxwG8MSV
rDsxoDD0OFs5A1rAJBKZ0UWcRrrDt8iTUKyM0W13+3D2S3i6pxaMLU5jCLFEIPJ7
ZukQxUQ3eRLksXNCjsc7IlIyoe3GTwNclV8pymYCkHp+jggHASRyRtVnninDDX+g
zs1X2Rd4qwU=
=qzs+
-END PGP SIGNATURE-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Rob Logan

  I am leaning towards AMD because of ECC support 

well, let's look at Intel's offerings... RAM is faster than AMD's
at 1333MHz DDR3, and one gets ECC and a thermal sensor for $10 over non-ECC 
http://www.newegg.com/Product/Product.aspx?Item=N82E16820139040

This MB has two Intel ethernets and for an extra $30 an ether KVM (LOM)
http://www.newegg.com/Product/Product.aspx?Item=N82E16813182212

One needs a Xeon 34xx for ECC; the 45W version isn't on newegg, and ignoring
the one without Hyper-Threading leaves us 
http://www.newegg.com/Product/Product.aspx?Item=N82E16819117225

Yeah, @ 95W it isn't exactly low power, but 4 cores @ 2533MHz and another
4 Hyper-Threaded cores is nice.  If you only need one core, the marketing
paperwork claims it will push to 2.93GHz too. But the RAM bandwidth is the 
big win for Intel. 

Avoid the temptation, but @ 2.8GHz without ECC, this one is close in price:
http://www.newegg.com/Product/Product.aspx?Item=N82E16819115214

Now, this gets one to 8G of ECC easily... AMD's unfair advantage is all those
RAM slots on their multi-die MBs... A slow AMD CPU with 64G of RAM
might be better depending on your working set / dedup requirements.

Rob



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss