Re: [zfs-discuss] Dedup memory overhead

2010-02-04 Thread Mertol Ozyoney
Sorry for the late answer.

Approximately it's 150 bytes per individual block, so increasing the
blocksize is a good idea.
Also, when the L1 and L2 ARC are not enough, the system will start issuing disk IOPS,
and RAID-Z is not very effective for random IOPS, so it's likely that when your
DRAM is not enough your performance will suffer.
You may choose to use RAID 10, which is a lot better under random loads.
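For illustration, a larger block size is set per dataset - 'tank/data' below is just a
placeholder name, and recordsize only affects blocks written after the change:

# zfs set recordsize=128k tank/data
# zfs get recordsize,dedup tank/data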
Mertol 




Mertol Ozyoney 
Storage Practice - Sales Manager

Sun Microsystems, TR
Istanbul TR
Phone +902123352200
Mobile +905339310752
Fax +90212335
Email mertol.ozyo...@sun.com



-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of erik.ableson
Sent: Thursday, January 21, 2010 6:05 PM
To: zfs-discuss
Subject: [zfs-discuss] Dedup memory overhead

Hi all,

I'm going to be trying out some tests using b130 for dedup on a server with
about 1.7 TB of usable storage (14x146 GB in two raidz vdevs of 7 disks).  What
I'm trying to get a handle on is how to estimate the memory overhead
required for dedup on that amount of storage.  From what I gather, the dedup
hash keys are held in ARC and L2ARC and as such are in competition for the
available memory.

So the question is how much memory or L2ARC would be necessary to ensure
that I'm never going back to disk to read out the hash keys. Better yet
would be some kind of algorithm for calculating the overhead, e.g. an average
block size of 4K means a hash key for every 4K stored, and a hash occupies 256
bits. An associated question: how does the ARC handle competition between hash
keys and regular ARC functions?

Based on these estimations, I think that I should be able to calculate the
following:
    1.7                TB
    1,740.8            GB
    1,782,579.2        MB
    1,825,361,100.8    KB
    4                  average block size (KB)
    456,340,275.2      blocks
    256                hash key size (bits)
    1.16823E+11        hash key overhead (bits)
    14,602,888,806.4   hash key size (bytes)
    14,260,633.6       hash key size (KB)
    13,926.4           hash key size (MB)
    13.6               hash key overhead (GB)
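A minimal shell (bash/ksh) sketch of the same arithmetic - the 1.7 TB and 4 KB figures
are the assumptions from the table above, and 32 bytes corresponds to a bare 256-bit
checksum; with the ~150 bytes per DDT entry quoted elsewhere in this digest the result
would be roughly 4-5x larger:

pool_bytes=$((17 * 1024 * 1024 * 1024 * 1024 / 10))   # ~1.7 TiB of stored data
avg_block=4096                                        # assumed average block size in bytes
key_bytes=32                                          # one 256-bit hash key per block
blocks=$((pool_bytes / avg_block))
echo "$blocks blocks -> $((blocks * key_bytes / 1024 / 1024)) MB of hash keys"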

Of course the big question on this will be the average block size - or,
better yet, being able to analyze an existing datastore to see just how
many blocks it uses and what the current distribution of different block
sizes is. I'm currently playing around with zdb, with mixed success, at
extracting this kind of data. The above is also a worst case scenario since it's
counting really small blocks and assuming 100% of available storage is used - highly
unlikely.

# zdb -ddbb siovale/iphone
Dataset siovale/iphone [ZPL], ID 2381, cr_txg 3764691, 44.6G, 99 objects

ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0,
flags 0x0

Object  lvl   iblk   dblk   dsize  lsize   %full  type
     0    7    16K    16K   57.0K    64K   77.34  DMU dnode
     1    1    16K     1K   1.50K     1K  100.00  ZFS master node
     2    1    16K    512   1.50K    512  100.00  ZFS delete queue
     3    2    16K    16K   18.0K    32K  100.00  ZFS directory
     4    3    16K   128K    408M   408M  100.00  ZFS plain file
     5    1    16K    16K   3.00K    16K  100.00  FUID table
     6    1    16K     4K   4.50K     4K  100.00  ZFS plain file
     7    1    16K  6.50K   6.50K  6.50K  100.00  ZFS plain file
     8    3    16K   128K    952M   952M  100.00  ZFS plain file
     9    3    16K   128K    912M   912M  100.00  ZFS plain file
    10    3    16K   128K    695M   695M  100.00  ZFS plain file
    11    3    16K   128K    914M   914M  100.00  ZFS plain file
 
Now, if I'm understanding this output properly, object 4 is composed of
128KB blocks with a total size of 408MB, meaning that it uses 3264 blocks.
Can someone confirm (or correct) that assumption? Also, I note that each
object  (as far as my limited testing has shown) has a single block size
with no internal variation.

Interestingly, all of my zvols seem to use fixed size blocks - that is,
there is no variation in the block sizes - they're all the size defined on
creation with no dynamic block sizes being used. I previously thought that
the -b option set the maximum size, rather than fixing all blocks.  Learned
something today :-)
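For what it's worth, a minimal sketch of how that fixed block size gets set at volume
creation time - 'siovale/newvol' and the sizes are placeholders; the zvol object's dblk
column should then show 8K:

# zfs create -V 10g -b 8k siovale/newvol
# zdb -ddbb siovale/newvol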

# zdb -ddbb siovale/testvol
Dataset siovale/testvol [ZVOL], ID 45, cr_txg 4717890, 23.9K, 2 objects

Object  lvl   iblk   dblk   dsize  lsize   %full  type
     0    7    16K    16K   21.0K    16K    6.25  DMU dnode
     1    1    16K    64K       0    64K    0.00  zvol object
     2    1    16K    512   1.50K    512  100.00  zvol prop

# zdb -ddbb siovale/tm-media
Dataset siovale/tm-media [ZVOL], ID 706, cr_txg 4426997, 240G, 2 objects

ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0,
flags 0x0

Object  lvl   iblk   dblk   dsize  lsize   %full  type
     0    7    16K    16K   21.0K    16K    6.25  DMU dnode
     1    5    16K     8K    240G   250G   97.33  zvol object
     2    1    16K    512   1.50K    512  100.00  zvol prop


Re: [zfs-discuss] Large scale ZFS deployments out there (200 disks)

2010-02-04 Thread Mertol Ozyoney
We have 50+ X4500/X4540s running happily with ZFS in the same DC.
Approximately 2500 drives and growing every day...

Br
Mertol 



Mertol Ozyoney 
Storage Practice - Sales Manager

Sun Microsystems, TR
Istanbul TR
Phone +902123352200
Mobile +905339310752
Fax +90212335
Email mertol.ozyo...@sun.com



-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Henrik Johansen
Sent: Friday, January 29, 2010 10:45 AM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Large scale ZFS deployments out there (200
disks)

On 01/28/10 11:13 PM, Lutz Schumann wrote:
 While thinking about ZFS as the next generation filesystem without
 limits I am wondering if the real world is ready for this kind of
 incredible technology ...

 I'm actually speaking of hardware :)

 ZFS can handle a lot of devices. Once in the import bug
 (http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6761786)
 is fixed it should be able to handle a lot of disks.

That was fixed in build 125.

 I want to ask the ZFS community and users what large scale deployments
 are out there.  How many disks? How much capacity? Single pool or
 many pools on a server? How does resilver work in those
 environments? How do you back up? What is the experience so far?
 Major headaches?

 It would be great if large scale users would share their setups and
 experiences with ZFS.

The largest ZFS deployment that we have is currently comprised of 22 
Dell MD1000 enclosures (330 750 GB Nearline SAS disks). We have 3 head 
nodes and use one zpool per node, comprised of rather narrow (5+2) 
RAIDZ2 vdevs. This setup is exclusively used for storing backup data.

Resilver times could be better - I am sure that this will improve once 
we upgrade from S10u9 to 2010.03.

One of the things that I am missing in ZFS is the ability to prioritize 
background operations like scrub and resilver. All our disks are idle 
during daytime and I would love to be able to take advantage of this, 
especially during resilver operations.

This setup has been running for about a year with no major issues so 
far. The only hiccups we've had were all HW related (no fun in firmware 
upgrading 200+ disks).

 Will you ? :) Thanks, Robert


-- 
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-02-04 Thread Tonmaus
Hi Simon

 I.e. you'll have to manually intervene
 if a consumer drive causes the system to hang, and
 replace it, whereas the RAID edition drives will
 probably report the error quickly and then ZFS will
 rewrite the data elsewhere, and thus maybe not kick
 the drive.

IMHO the relevant aspects are whether ZFS is able to give an accurate account of cache 
flush status and even realize when a drive is not responsive. That being said, I 
have not seen a specific report of ZFS kicking green drives, either at random or in a 
pattern, the way the poor SoHo storage enclosure users experience all the time.

 
 So it sounds preferable to have TLER in operation, if
 one can find a consumer-priced drive that allows it,
 or just take the hit and go with whatever non-TLER
 drive you choose and expect to have to manually
 intervene if a drive plays up. OK for home user where
 he is not too affected, but not good for businesses
 which need to have something recovered quickly.

One point about TLER is that two error correction schemes compete when you run a 
consumer drive on an active RAID controller that has its own mechanisms. When you 
run ZFS on a RAID controller, contrary to the best practice recommendations, an 
analogous question arises. On the other hand, if you run a green consumer drive on 
a dumb HBA, I wouldn't know what is wrong with it in the first place. 
As for manual interventions, the only one I am aware of would be to re-attach a 
single drive. Not an option if you are really affected like those miserable Thecus 
N7000 users who see the entire array of only a handful of drives drop out within 
hours - over and over again - or never even get to finish formatting the stripe set.
The dire consequences of the rumored TLER problems lead me to believe that there 
would be many more, and quite specific, reports in this place if this were a 
systematic issue with ZFS. Other than that, we are operating outside supported 
specs when running consumer level drives in large arrays - so far at least from 
the perspective of Seagate and WD.

 
  That all rather points to singular issues with
  firmware bugs or similar than to a systematic
 issue,
  doesn't it?
 
 I'm not sure. Some people in the WDC threads seem to
 report problems with pauses during media streaming
 etc. 

This was again for SoHo storage enclosures - not for ZFS, right?

  when the
 32MB+ cache is empty, then it loads another 32MB into
 cache etc and so on? 

I am not sure any current disk has cache management so simplistic that it relies on 
completely cycling the buffer content, let alone for reads that belong to a single 
file (a disk is basically agnostic of files). Moreover, such buffer management would 
be completely useless for a striped array. I don't know much better what a disk 
cache does either, but I am afraid that direction is probably not helpful for 
understanding certain phenomena people have reported.

I think that at this time we are seeing quite a large number of evolutions going on 
in disk storage, where many established assumptions are being abandoned while 
backwards compatibility is not always taken care of. SAS 6G (will my controller 
really work in a PCIe 1.1 slot?) and 4k sectors are certainly only the prominent 
examples. It's probably truer than ever to fall back on established technologies in 
such times, including biting the bullet of a cost premium on occasion.

Best regards

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Henu
So do you mean I cannot gather the names and locations of  
changed/created/removed files just by analyzing a stream of  
(incremental) zfs_send?


Quoting Andrey Kuzmin andrey.v.kuz...@gmail.com:

On Wed, Feb 3, 2010 at 6:11 PM, Ross Walker rswwal...@gmail.com wrote:

On Feb 3, 2010, at 9:53 AM, Henu henrik.he...@tut.fi wrote:


Okay, so first of all, it's true that send is always fast and 100%
reliable because it uses blocks to see differences. Good, and thanks for
this information. If everything else fails, I can parse the information I
want from send stream :)

But am I right, that there is no other methods to get the list of changed
files other than the send command?


At zfs_send level there are no files, just DMU objects (modified in
some txg which is the basis for changed/unchanged decision).




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Henu
Whoa! That is exactly what I've been looking for. Is there any  
development version publicly available for testing?


Regards,
Henrik Heino

Quoting Matthew Ahrens matthew.ahr...@sun.com:
This is RFE 6425091 "want 'zfs diff' to list files that have changed  
between snapshots", which covers both file and directory changes, and  
file removal/creation/renaming.  We actually have a prototype of zfs  
diff. Hopefully someday we will finish it up...


--matt



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Ian Collins

Henu wrote:
So do you mean I cannot gather the names and locations of 
changed/created/removed files just by analyzing a stream of 
(incremental) zfs_send?


That's correct, you can't.  Snapshots do not work at the file level.

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread dick hoogendijk

Frank Cusack wrote:
 Is it possible to emulate a unionfs with zfs and zones somehow?  My zones
 are sparse zones and I want to make part of /usr writable within a zone.
 (/usr/perl5/mumble to be exact)

Why don't you just export that directory with NFS (rw) to your sparse zone
and mount it on /usr/perl5/mumble ? Or is this too simple a thought?
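Roughly like this (a sketch only - the share path and hostname are placeholders, and
loopback NFS mounts from the global zone come with their own caveats):

In the global zone:
# share -F nfs -o rw /export/perl5-extra

In the sparse zone:
# mount -F nfs globalhost:/export/perl5-extra /usr/perl5/mumble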

-- 
Dick Hoogendijk -- PGP/GnuPG key: F86289CE

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] What would happen with a zpool if you 'mirrored' a disk...

2010-02-04 Thread Karl Pielorz


Hi All,

I've been using ZFS for a while now - and everything's been going well. I 
use it under FreeBSD - but this question should almost certainly have the 
same answer whether it's FreeBSD or Solaris (I think/hope :)...



Imagine if I have a zpool with 2 disks in it, that are mirrored:


NAME STATE READ WRITE CKSUM
vol  ONLINE   0 0 0
  mirrorONLINE   0 0 0
ad1 ONLINE   0 0 0
ad2 ONLINE   0 0 0


(The device names are FreeBSD disks)

If I offline 'ad2' - and then did:


dd if=/dev/ad1 of=/dev/ad2


(i.e. make a mirror copy of ad1 to ad2 - on a *running* system).


What would happen when I tried to 'online' ad2 again?


I fully expect it might not be pleasant... I'm just curious as to what's 
going to happen.



When I 'online' ad2 will ZFS look at it, and be clever enough to figure out 
the disk is obviously corrupt/unusable/has bad meta data on it - and 
resilver accordingly?


Or is it going to see what it thinks is another 'ad1' and get a little 
upset?



I'm trying to setup something here so I can test what happens - I just 
thought I'd ask around a bit to see if anyone knows what'll happen from 
past experience.



Thanks,

-Karl

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Roshan Perera
Hi All,

Anyone in the group using ZFS compression on ClearCase VOBs? If so, any issues or 
gotchas?

IBM support informs us that ZFS compression is not supported. Any views on this?

Rgds

Roshan

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Darren J Moffat

On 04/02/2010 11:54, Roshan Perera wrote:

Anyone in the group using ZFS compression on clearcase vobs? If so any issues, 
gotchas?


There shouldn't be any issues, and I'd be very surprised if there were.


IBM support informs that ZFS compression is not supported. Any views on this?


We need more data on why they claim it isn't supported - what issue have 
they seen, or do they think there could be one?  I see no reason that ZFS 
compression wouldn't be supported; in fact Clearcase shouldn't even be 
able to tell.


Compression in ZFS is completely below the POSIX filesystem layer and 
completely out of the control of any application or even kernel service 
like NFS or CIFS that just uses POSIX interfaces.  Same is true of 
deduplication and will be true of encryption when it integrates as well.
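For reference, enabling it and checking the result is just a dataset property - 
'tank/vobstore' is a placeholder name:

# zfs set compression=on tank/vobstore
# zfs get compression,compressratio tank/vobstore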


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Peter Tribble
On Thu, Feb 4, 2010 at 2:09 AM, Frank Cusack
frank+lists/z...@linetwo.net wrote:
 Is it possible to emulate a unionfs with zfs and zones somehow?  My zones
 are sparse zones and I want to make part of /usr writable within a zone.
 (/usr/perl5/mumble to be exact)

 I can't just mount a writable directory on top of /usr/perl5 because then
 it hides all the stuff in the global zone.  I could repopulate it in the
 local zone but ugh that is unattractive.  I'm hoping for a better way.
 Creating a full zone is not an option for me.

 I don't think this is possible but maybe someone else knows better.  I
 was thinking something with snapshots and clones?

The way I normally do this is to (in the global zone) symlink /usr/perl5/mumble
to somewhere that would be writable such as /opt, and then put what you need
into that location in the zone. Leaves a dangling symlink in the global zone and
other zones, but that's relatively harmless.

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Roshan Perera
Hi Darren,

Thanks - IBM basically haven't tested ClearCase with ZFS compression and therefore 
don't currently support it. That may change in the future, but as it stands my customer 
cannot use compression. I have asked IBM for roadmap info to find out whether/when it 
will be supported. 

Thanks
Roshan

- Original Message -
From: Darren J Moffat darr...@opensolaris.org
Date: Thursday, February 4, 2010 11:59 am
Subject: Re: [zfs-discuss] ZFS compression on Clearcase
To: Roshan Perera roshan.per...@sun.com
Cc: zfs-discuss@opensolaris.org


 On 04/02/2010 11:54, Roshan Perera wrote:
  Anyone in the group using ZFS compression on clearcase vobs? If so 
 any issues, gotchas?
  
  There shouldn't be any issues and I'd be very surprised if there was.
  
  IBM support informs that ZFS compression is not supported. Any views 
 on this?
  
  Need more data on why the claim it isn't supported - what issue have 
 they seen or do they thing there could be.  I see no reason that ZFS 
 compression wouldn't be supported, in fact Clearcase shouldn't even be 
 able to tell.
  
  Compression in ZFS is completely below the POSIX filesystem layer and 
 completely out of the control of any application or even kernel 
 service like NFS or CIFS that just uses POSIX interfaces.  Same is 
 true of deduplication and will be true of encryption when it 
 integrates as well.
  
  -- 
  Darren J Moffat
  
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Darren Mackay
Hi Ross,

 zdb -  f...@snapshot | grep path | nawk '{print $2}'
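Extending that idea, a rough sketch for comparing the file lists of two snapshots - 
'tank/fs' and the snapshot names are placeholders, and I'm assuming -dddd is the 
verbosity level that emits the per-object path lines; note this only catches 
added/removed/renamed paths, not content-only changes:

# zdb -dddd tank/fs@snap1 | grep path | nawk '{print $2}' | sort > /tmp/snap1.files
# zdb -dddd tank/fs@snap2 | grep path | nawk '{print $2}' | sort > /tmp/snap2.files
# comm -3 /tmp/snap1.files /tmp/snap2.files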

Enjoy!

Darren Mackay
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Thomas Maier-Komor
On 04.02.2010 12:12, dick hoogendijk wrote:
 
 Frank Cusack wrote:
 Is it possible to emulate a unionfs with zfs and zones somehow?  My zones
 are sparse zones and I want to make part of /usr writable within a zone.
 (/usr/perl5/mumble to be exact)
 
 Why don't you just export that directory with NFS (rw) to your sparse zone
 and mount it on /usr/perl5/mumble ? Or is this too simple a thought?
 
What about lofs? I think lofs is the equivalent of unionfs on Solaris.

E.g.

mount -F lofs /originial/path /my/alternate/mount/point

- Thomas

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Darren J Moffat

On 04/02/2010 12:13, Roshan Perera wrote:

Hi Darren,

Thanks - IBM basically haven't test clearcase with ZFS compression therefore, 
they don't support currently. Future may change, as such my customer cannot use 
compression. I have asked IBM for roadmap info to find whether/when it will be 
supported.


That is FUD generation in my opinion and being overly cautious.  The 
whole point of the POSIX interfaces to a filesystem is that applications 
don't actually care how the filesystem stores their data.


UFS never had checksums before but ZFS adds those, but that didn't mean 
that applications had to be checked because checksums were now done on 
the data.


What if it was the disk drive that was doing the compression ?  There 
would be similarly no way for the application to actually know that it 
is happening.


What about every other feature we add to ZFS ?  Like dedup (which is a 
type of compression) - again the app can't tell.  Or snapshots - the 
app can't tell.


That's my opinion, though, and I know that ISVs can be very cautious about 
new features, sometimes overly so when the change is far below their part of 
the stack.


Taking another example, it would be like an ISV that supports their 
application running over NFS saying they don't support a certain vendor's 
switch in the network because they haven't tested it.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] verging OT: how to buy J4500 w/o overpriced drives

2010-02-04 Thread Eugen Leitl
On Wed, Feb 03, 2010 at 03:02:21PM -0800, Brandon High wrote:

 Another solution, for a true DIY x4500: BackBlaze has schematics for
 the 45 drive chassis that they designed available on their website.
 http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
 
 Someone brought it up on the list a few months ago (which is how I
 know about it) and there was some interesting discussion at that time.

IIRC the consensus was that the vibration dampening was inadequate,
the interfaces were oversubscribed, and the disks, not being nearline
class, were too unreliable - but I might be misremembering.

I'm still happy with my 16x WD RE4 drives (linux mdraid RAID 10,
CentOS, Oracle, no zfs). Supermicro does 36x drive chassis now
http://www.supermicro.com/products/chassis/4U/?chs=847 so budget
DIY for zfs is about 72 TByte raw storage with 2 TByte nearline
SATA drives.

I've had trouble finding internal 2x 2.5" in one 3.5" 
SSD mounts from Supermicro for hybrid zfs, but no doubt one 
could improvise something from the usual ricer supplies. 

On a smaller scale, http://www.supermicro.com/products/chassis/2U/?chs=216
works well with 2.5" Intel SSDs and VelociRaptors. I hope to be able
to use one for a hybrid zfs iSCSI target for VMWare, probably with
10 GBit Ethernet.

 There's no way I would use something like this for most installs, but
 there is definitely some use. Now that opensolaris supports sata pmp,
 you could use a similar chassis for a zfs pool.

-- 
Eugen* Leitl leitl http://leitl.org
__
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Roshan Perera
Hi Darren,

I totally agree with you and have raised some of the points mentioned, but you 
have given me even more items to pass on.
I will update the alias when I hear more.

Many Thanks

Roshan


- Original Message -
From: Darren J Moffat darr...@opensolaris.org
Date: Thursday, February 4, 2010 12:42 pm
Subject: Re: [zfs-discuss] ZFS compression on Clearcase
To: Roshan Perera roshan.per...@sun.com
Cc: zfs-discuss@opensolaris.org


 On 04/02/2010 12:13, Roshan Perera wrote:
  Hi Darren,
  
  Thanks - IBM basically haven't test clearcase with ZFS compression 
 therefore, they don't support currently. Future may change, as such my 
 customer cannot use compression. I have asked IBM for roadmap info to 
 find whether/when it will be supported.
  
  That is FUD generation in my opinion and being overly cautious.  The 
 whole point of the POSIX interfaces to a filesystem is that 
 applications don't actually care how the filesystem stores their data.
  
  UFS never had checksums before but ZFS adds those, but that didn't 
 mean that applications had to be checked because checksums were now 
 done on the data.
  
  What if it was the disk drive that was doing the compression ?  There 
 would be similarly no way for the application to actually know that it 
 is happening.
  
  What about every other feature we add to ZFS ?  Like dedup (which is 
 a type of compression) - again they app can't tell.  Or snapshots - 
 the app can't tell.
  
  Thats my opinion though and I know that ISVs can be very cautious 
 about new features sometimes and overly so when it is far below their 
 parts of the stack.
  
  Taking another example it would be like an ISV that supports their 
 application running over NFS saying they don't support a certain type 
 of vendors switch in the network because they haven't tested it.
  
  -- 
  Darren J Moffat
  
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Darren Mackay
Looking through some more code... I was a bit premature in my last post - it's been a 
long day.

Extracting the guids and querying the metadata seems logical - I think running a 
zfs send just to parse the data stream is a lot of overhead when you really only 
need to traverse the metadata directly.

The zdb sources have most of the bits there - you just need to unwind the deadlist 
(this seems to match the number of blocks that have been deleted since the last 
snap)...

I might look into this in the next week or two if I have time - it seems like a 
worthwhile project ;-)

Darren Mackay
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Ross Walker





On Feb 4, 2010, at 2:00 AM, Tomas Ögren st...@acc.umu.se wrote:


On 03 February, 2010 - Frank Cusack sent me these 0,7K bytes:

On February 3, 2010 12:04:07 PM +0200 Henu henrik.he...@tut.fi  
wrote:

Is there a possibility to get a list of changed files between two
snapshots? Currently I do this manually, using basic file system
functions offered by OS. I scan every byte in every file manually  
and it

 ^^^

On February 3, 2010 10:11:01 AM -0500 Ross Walker rswwal...@gmail.com 


wrote:
Not a ZFS method, but you could use rsync with the dry run option  
to list

all changed files between two file systems.


That's exactly what the OP is already doing ...


rsync by default compares metadata first, and only checks through  
every

byte if you add the -c (checksum) flag.

I would say rsync is the best tool here.

The find -newer blah suggested in other posts won't catch newer  
files

with an old timestamp (which could happen for various reasons, like
being copied with kept timestamps from somewhere else).


find -newer doesn't catch files added or removed; it assumes identical  
trees.


I would be interested in comparing ddiff, bart and rsync (local  
comparison only) to see empirically how they match up.
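For reference, a minimal sketch of the rsync dry-run comparison being discussed, run 
directly against the two snapshot directories (paths are placeholders; the trailing 
slashes matter):

# rsync -a -n -i --delete /tank/fs/.zfs/snapshot/new/ /tank/fs/.zfs/snapshot/old/

-n does a dry run, -i itemizes each difference, and --delete also reports files that 
exist only in the older snapshot.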


-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Darren Mackay
The delete queue and related blocks need further investigation...

r...@osol-dev:/data/zdb-test# zdb -dd data/zdb-test | more
Dataset data/zdb-test [ZPL], ID 641, cr_txg 529804, 24.5K, 6 objects

Object  lvl   iblk   dblk   dsize  lsize   %full  type
     0    7    16K    16K   15.0K    16K   18.75  DMU dnode
    -1    1    16K    512      1K    512  100.00  ZFS user/group used
    -2    1    16K    512      1K    512  100.00  ZFS user/group used
     1    1    16K    512      1K    512  100.00  ZFS master node
     2    1    16K    512      1K    512  100.00  ZFS delete queue
     3    1    16K  1.50K      1K  1.50K  100.00  ZFS directory
     4    1    16K    512      1K    512  100.00  ZFS directory
    19    1    16K    512     512    512  100.00  ZFS plain file
    22    1    16K     2K      2K     2K  100.00  ZFS plain file


All the info seems to be there (otherwise, we would not be able to store files 
at all!!).

Another *spare time* project for the coming couple of weeks...

Darren
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What would happen with a zpool if you 'mirrored' a disk...

2010-02-04 Thread Karl Pielorz


--On 04 February 2010 11:31 + Karl Pielorz kpielorz_...@tdx.co.uk 
wrote:



What would happen when I tried to 'online' ad2 again?


A reply to my own post... I tried this out, when you make 'ad2' online 
again, ZFS immediately logs a 'vdev corrupt' failure, and marks 'ad2' 
(which at this point is a byte-for-byte copy of 'ad1' as it was being 
written to in background) as 'FAULTED' with 'corrupted data'.


You can't replace it with itself at that point, but a detach on ad2, and 
then attaching ad2 back to ad1 results in a resilver, and recovery.


So to answer my own question - from my tests it looks like you can do this, 
and get away with it. It's probably not ideal, but it does work.


A safer bet would be to detach the drive from the pool, and then re-attach 
it (at which point ZFS assumes it's a new drive and probably ignores the 
'mirror image' data that's on it).


-Karl

(The reason for testing this is because of a weird RAID setup I have where 
if 'ad2' fails, and gets replaced - the RAID controller is going to mirror 
'ad1' over to 'ad2' - and cannot be stopped. However, once the re-mirroring 
is complete the RAID controller steps out of the way, and allows raw access to 
each disk in the mirror. Strange, a long story - but true).

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Booting OpenSolaris on ZFS root on Sun Netra 240

2010-02-04 Thread Saso Kiselkov
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

I'm kind stuck at trying to get my aging Netra 240 machine to boot
OpenSolaris. The live CD and installation worked perfectly, but when I
reboot and try to boot from the installed disk, I get:

Rebooting with command: boot disk0
Boot device: /p...@1c,60/s...@2/d...@0,0  File and args:
|
The file just loaded does not appear to be executable.


I suspect it's due to the fact that my OBP can't boot a ZFS root
(OpenBoot 4.22.19). Is there a way to work around this?

Regards,
- --
Saso
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAktqz7kACgkQRO8UcfzpOHCqhgCgl8I+5zCTBLb0MUVq9cz5zrqz
9LgAoIurhee3/+nfXtUBwVczkjKxQVaj
=7dXF
-END PGP SIGNATURE-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What would happen with a zpool if you 'mirrored' a disk...

2010-02-04 Thread Jacob Ritorto
Seems your controller is actually doing only harm here, or am I missing
something?

On Feb 4, 2010 8:46 AM, Karl Pielorz kpielorz_...@tdx.co.uk wrote:


--On 04 February 2010 11:31 + Karl Pielorz kpielorz_...@tdx.co.uk
wrote:

 What would happen...
A reply to my own post... I tried this out, when you make 'ad2' online
again, ZFS immediately logs a 'vdev corrupt' failure, and marks 'ad2' (which
at this point is a byte-for-byte copy of 'ad1' as it was being written to in
background) as 'FAULTED' with 'corrupted data'.

You can't replace it with itself at that point, but a detach on ad2, and
then attaching ad2 back to ad1 results in a resilver, and recovery.

So to answer my own question - from my tests it looks like you can do this,
and get away with it. It's probably not ideal, but it does work.

A safer bet would be to detach the drive from the pool, and then re-attach
it (at which point ZFS assumes it's a new drive and probably ignores the
'mirror image' data that's on it).

-Karl

(The reason for testing this is because of a weird RAID setup I have where
if 'ad2' fails, and gets replaced - the RAID controller is going to mirror
'ad1' over to 'ad2' - and cannot be stopped. However, once the re-mirroring
is complete the RAID controller steps out the way, and allows raw access to
each disk in the mirror. Strange, a long story - but true).


___
zfs-discuss mailing list
zfs-disc...@opensolaris.or...
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Marc Nicholas
I think you'll do just fine then. And I think the extra platter will
work to your advantage.

-marc

On 2/3/10, Simon Breden sbre...@gmail.com wrote:
 Probably 6 in a RAID-Z2 vdev.

 Cheers,
 Simon
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


-- 
Sent from my mobile device
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs/sol10u8 less stable than in sol10u5?

2010-02-04 Thread Carsten Aulbert
Hi all,

it might not be a ZFS issue (and thus on the wrong list), but maybe there's 
someone here who might be able to give us a good hint:

We are operating 13 x4500 and started to play with non-Sun blessed SSDs in 
there. As we were running Solaris 10u5 before and wanted to use them as log 
devices we upgraded to the latest and greatest 10u8 and changed the zpool 
layout[1]. However, on the first machine we found many, many problems with 
various disks failing in different vdevs (I wrote about this in December on 
this list IIRC).

After going through this with Sun they gave us hints but mostly blamed (maybe 
rightfully) the Intel X25-E in there; we considered the 2.5" to 3.5" converter 
to be at fault as well. Thus we did the next test by placing the SSD into the 
tray without a conversion unit, but that box (a different one) failed with the 
same problems.

Now, we learned from this experience and did the same to another box but 
without the SSD, i.e. jumpstarted the box and installed 10u8, redid the zpool 
and started to fill data in. In today's scrub suddenly this happened:

s09:~# zpool status   
  pool: atlashome 
 state: DEGRADED  
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors  
using 'zpool clear' or replace the device with 'zpool replace'. 
   see: http://www.sun.com/msg/ZFS-8000-9P  
 scrub: resilver in progress for 0h9m, 3.89% done, 4h2m to go   
config: 

        NAME          STATE     READ WRITE CKSUM
        atlashome     DEGRADED     0     0     0
          raidz1      ONLINE       0     0     0
            c0t0d0    ONLINE       0     0     0
            c1t0d0    ONLINE       0     0     0
            c4t0d0    ONLINE       0     0     0
            c6t0d0    ONLINE       0     0     0
            c7t0d0    ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            c0t1d0    ONLINE       0     0     0
            c1t1d0    ONLINE       0     0     0
            c4t1d0    ONLINE       0     0     0
            c5t1d0    ONLINE       0     0     0
            c6t1d0    ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            c7t1d0    ONLINE       0     0     1
            c0t2d0    ONLINE       0     0     0
            c1t2d0    ONLINE       0     0     2
            c4t2d0    ONLINE       0     0     0
            c5t2d0    ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            c6t2d0    ONLINE       0     0     0
            c7t2d0    ONLINE       0     0     0
            c0t3d0    ONLINE       0     0     0
            c1t3d0    ONLINE       0     0     0
            c4t3d0    ONLINE       0     0     0
          raidz1      DEGRADED     0     0     0
            c5t3d0    ONLINE       0     0     0
            c6t3d0    ONLINE       0     0     0
            c7t3d0    ONLINE       0     0     0
            c1t4d0    ONLINE       0     0     1
            spare     DEGRADED     0     0     0
              c4t4d0  DEGRADED     5     0    11  too many errors
              c0t4d0  ONLINE       0     0     0  5.38G resilvered
          raidz1      ONLINE       0     0     0
            c5t4d0    ONLINE       0     0     0
            c6t4d0    ONLINE       0     0     0
            c7t4d0    ONLINE       0     0     0
            c0t5d0    ONLINE       0     0     0
            c1t5d0    ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            c4t5d0    ONLINE       0     0     0
            c5t5d0    ONLINE       0     0     0
            c6t5d0    ONLINE       0     0     0
            c7t5d0    ONLINE       0     0     0
            c0t6d0    ONLINE       0     0     1
          raidz1      ONLINE       0     0     0
            c1t6d0    ONLINE       0     0     0
            c4t6d0    ONLINE       0     0     0
            c5t6d0    ONLINE       0     0     0
            c6t6d0    ONLINE       0     0     0
            c7t6d0    ONLINE       0     0     1
          raidz1      ONLINE       0     0     0
            c0t7d0    ONLINE       0     0     0
            c1t7d0    ONLINE       0     0     0
            c4t7d0    ONLINE       0     0     0
            c5t7d0    ONLINE       0     0     0
            c6t7d0    ONLINE       0     0     0
        spares
          c0t4d0      INUSE     currently in use
          c7t7d0      AVAIL


Also, similar to the other hosts was the much, much higher soft/hard error 
count in iostat:

s09:~# iostat -En|grep Soft
c2t0d0   Soft Errors: 1 Hard Errors: 2 Transport 

Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Tonmaus
Hi Arnaud,

which type of controller is this?

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What would happen with a zpool if you 'mirrored' a disk...

2010-02-04 Thread Karl Pielorz


--On 04 February 2010 08:58 -0500 Jacob Ritorto jacob.rito...@gmail.com 
wrote:



Seems your controller is actually doing only harm here, or am I missing
something?


The RAID controller presents the drives as both a mirrored pair, and JBOD - 
*at the same time*...


The machine boots off the partition on the 'mirrored' pair - and ZFS uses 
the JBOD devices (a different area of the disks, of course).


It's a little weird to say the least - and I wouldn't recommend it, but it 
does work 'for me' - and is a way of getting the system to boot off a 
mirror, and still be able to use ZFS with only 2 drives available in the 
chassis.


-Karl
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Arnaud Brand

On 04/02/10 16:57, Tonmaus wrote:

Hi Arnaud,

which type of controller is this?

Regards,

Tonmaus
   
I use two LSI SAS3081E-R in each server (16 hard disk trays, passive 
backplane AFAICT, no expander).

Works very well.

Arnaud
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What would happen with a zpool if you 'mirrored' a disk...

2010-02-04 Thread Robert Milkowski

On 04/02/2010 13:45, Karl Pielorz wrote:


--On 04 February 2010 11:31 + Karl Pielorz 
kpielorz_...@tdx.co.uk wrote:



What would happen when I tried to 'online' ad2 again?


A reply to my own post... I tried this out, when you make 'ad2' online 
again, ZFS immediately logs a 'vdev corrupt' failure, and marks 'ad2' 
(which at this point is a byte-for-byte copy of 'ad1' as it was being 
written to in background) as 'FAULTED' with 'corrupted data'.


You can't replace it with itself at that point, but a detach on ad2, 
and then attaching ad2 back to ad1 results in a resilver, and recovery.


So to answer my own question - from my tests it looks like you can do 
this, and get away with it. It's probably not ideal, but it does work.


it is actually fine - zfs is designed to detect and fix corruption like 
the one you induced.



A safer bet would be to detach the drive from the pool, and then 
re-attach it (at which point ZFS assumes it's a new drive and probably 
ignores the 'mirror image' data that's on it).




Yes, it should, and if you want to force resynchronization that's 
probably the best way to do it.
Another thing: if you suspect some of your data is corrupted on one 
half of the mirror, you might try running zpool scrub, as it will fix only 
the corrupted blocks instead of resynchronizing the entire mirror, which 
might be a faster and safer approach.



--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when: file-corrupted and no-redundancy?

2010-02-04 Thread Robert Milkowski

On 03/02/2010 21:45, Aleksandr Levchuk wrote:

Hardware RAID6 + hot spare worked well for us. So, I wanted to stick
with our SAN for data protection. I understand that the end-to-end checks
of ZFS make it better at detecting corruption.

In my case, I can imagine that ZFS would FREEZE the whole volume when a
single block or file is found to be corrupted.

Ideally, I would not like this to happen and instead get a log with
names of corrupted files.

What exactly does happens when zfs detects a corrupted block/file and
does not have redundancy to correct it?

Alex

   

I will repeat myself (as I sent the email below just yesterday...)

ZFS won't freeze a pool if a single block is corrupted, even if no 
redundancy is configured at the zfs level.


zpool status -v should provide you with a list of affected files, which you 
should be able to delete. In the case of a corrupted block containing metadata, 
zfs should actually be able to fix it on the fly for you, as all 
metadata-related blocks are kept in at least two copies even if no 
redundancy is configured at pool level.


Let's test it:

mi...@r600:~# mkfile 128m file1
mi...@r600:~# zpool create test `pwd`/file1
mi...@r600:~# zpool status test
  pool: test
 state: ONLINE
 scrub: none requested
config:

NAME                        STATE     READ WRITE CKSUM
test                        ONLINE       0     0     0
  /export/home/milek/file1  ONLINE       0     0     0

errors: No known data errors
mi...@r600:~#
mi...@r600:~# cp /bin/bash /test/file1
mi...@r600:~# cp /bin/bash /test/file2
mi...@r600:~# cp /bin/bash /test/file3
mi...@r600:~# cp /bin/bash /test/file4
mi...@r600:~# cp /bin/bash /test/file5
mi...@r600:~# cp /bin/bash /test/file6
mi...@r600:~# cp /bin/bash /test/file7
mi...@r600:~# cp /bin/bash /test/file8
mi...@r600:~# cp /bin/bash /test/file9
mi...@r600:~# sync
mi...@r600:~# dd if=/dev/zero of=file1 seek=50 count=1 conv=notrunc
1+0 records in
1+0 records out
512 bytes (5.1 MB) copied, 0.179617 s, 28.5 MB/s
mi...@r600:~# sync
mi...@r600:~# zpool scrub test
mi...@r600:~# zpool status -v test
  pool: test
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 0h0m with 7 errors on Thu Feb  4 00:18:40 
2010

config:

NAME                        STATE     READ WRITE CKSUM
test                        DEGRADED     0     0     7
  /export/home/milek/file1  DEGRADED     0     0    29  too many errors


errors: Permanent errors have been detected in the following files:

/test/file1
mi...@r600:~#
mi...@r600:~# rm /test/file1
mi...@r600:~# sync
mi...@r600:~# zpool scrub test
mi...@r600:~# zpool status -v test
  pool: test
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 0h0m with 0 errors on Thu Feb  4 00:19:55 
2010

config:

NAME                        STATE     READ WRITE CKSUM
test                        DEGRADED     0     0     7
  /export/home/milek/file1  DEGRADED     0     0    29  too many errors


errors: No known data errors
mi...@r600:~# zpool clear test
mi...@r600:~# zpool scrub test
mi...@r600:~# zpool status -v test
  pool: test
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Thu Feb  4 00:20:12 
2010

config:

NAME                        STATE     READ WRITE CKSUM
test                        ONLINE       0     0     0
  /export/home/milek/file1  ONLINE       0     0     0

errors: No known data errors
mi...@r600:~#
mi...@r600:~# ls -la /test/
total 7191
drwxr-xr-x  2 root root 10 2010-02-04 00:19 .
drwxr-xr-x 28 root root 30 2010-02-04 00:17 ..
-r-xr-xr-x  1 root root 799040 2010-02-04 00:17 file2
-r-xr-xr-x  1 root root 799040 2010-02-04 00:17 file3
-r-xr-xr-x  1 root root 799040 2010-02-04 00:17 file4
-r-xr-xr-x  1 root root 799040 2010-02-04 00:17 file5
-r-xr-xr-x  1 root root 799040 2010-02-04 00:17 file6
-r-xr-xr-x  1 root root 799040 2010-02-04 00:17 file7
-r-xr-xr-x  1 root root 799040 2010-02-04 00:18 file8
-r-xr-xr-x  1 root root 799040 2010-02-04 00:18 file9
mi...@r600:~#


--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Tonmaus
Hi again,

thanks for the answer. Another thing that came to my mind is that you mentioned 
that you mixed the disks among the controllers. Does that mean you mixed them 
among pools as well? Unsurprisingly, the WD20EADS is slower than the Hitachi, 
which is a fixed 7200 rpm drive. I wonder what impact that would have if you use 
them as vdevs of the same pool.

Cheers,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Alex Blewitt
On 4 Feb 2010, at 16:35, Bob Friesenhahn wrote:

 On Thu, 4 Feb 2010, Darren J Moffat wrote:
 Thanks - IBM basically haven't test clearcase with ZFS compression 
 therefore, they don't support currently. Future may change, as such my 
 customer cannot use compression. I have asked IBM for roadmap info to find 
 whether/when it will be supported.
 
 That is FUD generation in my opinion and being overly cautious.  The whole 
 point of the POSIX interfaces to a filesystem is that applications don't 
 actually care how the filesystem stores their data.
 
 Clearcase itself implements a versioning filesystem so perhaps it is not 
 being overly cautious.  Compression could change aspects such as how free 
 space is reported.

I'd also like to echo Bob's observations here. Darren's FUD call is based on 
limited experience of ClearCase, I expect ...

On the client side, ClearCase actually presents itself as a mounted filesystem, 
regardless of what the OS has under the covers. In other words, a ClearCase 
directory will never be 'ZFS' because it's not ZFS, it's ClearCaseFS. On the 
server side (which might be the case here) the way ClearCase works is to 
represent the files and contents in a way more akin to a database (e.g. Oracle) 
than traditional file-system approaches to data (e.g. CVS, SVN). In much the 
same way that there are app-specific issues with ZFS (e.g. matching block sizes, 
dealing with ZFS snapshots on a VM image and so forth) there may well be some 
with ClearCase.

At the very least, though, IBM may just be unable or unwilling to test it at the 
time and put their stamp of approval on it. In many cases for IBM products, 
there are supported platforms (often with specific patch levels), much like 
there are officially supported Solaris platforms and hot-fixes to go for certain 
applications. They may well just be being cautious about what they support until 
they've had time to test it out for themselves - or more likely, until the first 
set of paying customers wants to get invoiced for the investigation. But to claim 
it's FUD without any real data to back it up is just FUD^2.

Alex
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS compression on Clearcase

2010-02-04 Thread Robert Milkowski

On 04/02/2010 12:42, Darren J Moffat wrote:

On 04/02/2010 12:13, Roshan Perera wrote:

Hi Darren,

Thanks - IBM basically haven't test clearcase with ZFS compression
therefore, they don't support currently. Future may change, as such
my customer cannot use compression. I have asked IBM for roadmap info
to find whether/when it will be supported.


That is FUD generation in my opinion and being overly cautious.  The
whole point of the POSIX interfaces to a filesystem is that
applications don't actually care how the filesystem stores their data.



I agree (*). It is very similar to what EMC did some years ago by 
officially stating that while ZFS is supported on their disk arrays, ZFS 
snapshots are not. Even funnier.



(*) - however, compression is not entirely transparent in the sense 
that reported disk space usage might not be exactly what an application 
expects. But I'm not saying it is an issue here - I honestly don't know.
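To illustrate the (*) point, a quick sketch - 'tank/ctest' is a placeholder - where ls 
keeps reporting the logical file size while du reports the smaller, compressed on-disk 
usage:

# zfs create -o compression=on tank/ctest
# cp /usr/dict/words /tank/ctest/words
# ls -l /tank/ctest/words
# du -h /tank/ctest/words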



--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Frank Cusack
On February 4, 2010 12:12:04 PM +0100 dick hoogendijk d...@nagual.nl 
wrote:

Why don't you just export that directory with NFS (rw) to your sparse zone
and mount it on /usr/perl5/mumble ? Or is this too simple a thought?


On February 4, 2010 1:41:20 PM +0100 Thomas Maier-Komor 
tho...@maier-komor.de wrote:

What about lofs? I thinks lofs is the equivalent for unionfs on Solaris.


The problem with both of those solutions is a) writes will overwrite the
original filesystem data and b) writes will be visible to everyone else.

Neither suggestion provides unionfs capability.

On February 4, 2010 12:12:18 PM + Peter Tribble 
peter.trib...@gmail.com wrote:

The way I normally do this is to (in the global zone) symlink
/usr/perl5/mumble to somewhere that would be writable such as /opt, and
then put what you need into that location in the zone. Leaves a dangling
symlink in the global zone and other zones, but that's relatively
harmless.


The problem with that is you don't see the underlying data that exists
in the global zone.  I do use that technique for other data (e.g. the
entire /usr/local hierarchy), but it doesn't meet my desired needs in
this case.

I looked into clones (and at least now I understand them much better
than before) and they *almost* provide the functionality I want.  I
could mount a clone in the zoned version of /foo and it would see the
original /foo, and changes would go to the clone only, just like a real
unionfs.

What it's lacking though is that when the underlying filesystem changes
(in the global zone), those changes don't percolate up to the clone.
The clone's base view of files is from the snapshot it was generated
from, which cannot change.  It would be great if you could re-target
(or re-base?) a clone from a different snapshot than the one it was
originally generated from.  Since I don't need realtime updates, for
my purposes that would be a great equivalent to a true unionfs.

So the thread on zfs diff gave me an idea; I will use clones and will
write a 'zfs diff'-like tool.  When the original /usr/perl5/mumble
changes I will use that to pick out files that are different in the
clone and populate a new clone with them.
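As a sketch of the clone part (dataset, zone and path names are hypothetical; zonecfg's 
add fs would be an alternative way to expose the clone, and the re-base step is exactly 
what the home-grown 'zfs diff'-like tool would have to drive):

# zfs snapshot rpool/perl5@base
# zfs clone rpool/perl5@base rpool/zones/myzone-perl5
# zfs set mountpoint=/zones/myzone/root/usr/perl5/mumble rpool/zones/myzone-perl5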

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Frank Cusack

BTW, I could just install everything in the global zone and use the
default inheritance of /usr into each local zone to see the data.
But then my zones are not independent portable entities; they would
depend on some non-default software installed in the global zone.

Just wanted to explain why this is valuable to me and not just some
crazy way to do something simple.

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Frank Cusack

On 2/4/10 8:00 AM +0100 Tomas Ögren wrote:

rsync by default compares metadata first, and only checks through every
byte if you add the -c (checksum) flag.

I would say rsync is the best tool here.


ah, i didn't know that was the default.  no wonder recently when i was
incremental-rsyncing a few TB of data between 2 hosts (not using zfs)
i didn't get any speedup from --size-only or whatever the flag is.


The find -newer blah suggested in other posts won't catch newer files
with an old timestamp (which could happen for various reasons, like
being copied with kept timestamps from somewhere else).


good point.  that is definitely a restriction with find -newer.  but if
you meet that restriction, and don't need to find added or deleted files,
it will be faster since only 1 directory tree has to be walked.

but in the general case it does sound like rsync is the best.  unless
bart can find added and missing files.  in which case bart is better
because it only has to walk 1 dir tree -- assuming you have a saved
manifest from a previous walk over the original dir tree.

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Frank Cusack

On 2/4/10 8:21 AM -0500 Ross Walker wrote:

Find -newer doesn't catch files added or removed it assumes identical
trees.


This may be redundant in light of my earlier post, but yes it does.
Directory mtimes are updated when a file is added or removed, and
find -newer will detect that.
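For completeness, the usual marker-file pattern (paths are placeholders):

# touch /tank/fs/.lastscan
  ... time passes, files come and go ...
# find /tank/fs -newer /tank/fs/.lastscan -print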

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Nicolas Williams
On Thu, Feb 04, 2010 at 03:19:15PM -0500, Frank Cusack wrote:
 BTW, I could just install everything in the global zone and use the
 default inheritance of /usr into each local zone to see the data.
 But then my zones are not independent portable entities; they would
 depend on some non-default software installed in the global zone.
 
 Just wanted to explain why this is valuable to me and not just some
 crazy way to do something simple.

There's no unionfs for Solaris.

(For those of you who don't know, unionfs is a BSDism and is a
pseudo-filesystem which presents the union of two underlying
filesystems, but with all changes being made only to one of the two
filesystems.  The idea is that one of the underlying filesystems cannot
be modified through the union, with all changes made through the union
being recorded in an overlay fs.  Think, for example, of unionfs-
mounting read-only media containing sources: you could cd to the mount
point and build the sources, with all intermediate files and results
placed in the overlay.)
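For readers who haven't seen it, roughly how it looks on a system that has it - this
assumes FreeBSD's mount_unionfs(8), where the writable overlay directory is given first
and gets attached above the existing tree:

# mount_unionfs /var/overlay/usr-src /usr/src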

In Frank's case, IIUC, the better solution is to avoid the need for
unionfs in the first place by not placing pkg content in directories
that one might want to be writable from zones.  If there's anything
about Perl5 (or anything else) that causes this need to arise, then I
suggest filing a bug.

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Frank Cusack

On 2/4/10 2:46 PM -0600 Nicolas Williams wrote:

In Frank's case, IIUC, the better solution is to avoid the need for
unionfs in the first place by not placing pkg content in directories
that one might want to be writable from zones.  If there's anything
about Perl5 (or anything else) that causes this need to arise, then I
suggest filing a bug.


Right, and thanks for chiming in.  Problem is that perl wants to install
add-on packages in places that coincide with the system install.
Most stuff is limited to the site_perl directory, which is easily
redirected, but it also has some other locations it likes to meddle with.

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] unionfs help

2010-02-04 Thread Nicolas Williams
On Thu, Feb 04, 2010 at 04:03:19PM -0500, Frank Cusack wrote:
 On 2/4/10 2:46 PM -0600 Nicolas Williams wrote:
 In Frank's case, IIUC, the better solution is to avoid the need for
 unionfs in the first place by not placing pkg content in directories
 that one might want to be writable from zones.  If there's anything
 about Perl5 (or anything else) that causes this need to arise, then I
 suggest filing a bug.
 
 Right, and thanks for chiming in.  Problem is that perl wants to install
 add-on packages in places that the coincide with the system install.
 Most stuff is limited to the site_perl directory, which is easily
 redirected, but it also has some other locations it likes to meddle with.

Maybe we need a zone_perl location.  Judicious use of the search paths
will get you out of this bind, I think.
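
Untested, but something along these lines should keep the add-ons out of the
pkg-owned directories (the prefix is just an example, and INSTALL_BASE needs a
reasonably recent ExtUtils::MakeMaker):

# build the module against a zone-writable prefix instead of site_perl
perl Makefile.PL INSTALL_BASE=/export/perl
make && make install
# then put that prefix on the search path
PERL5LIB=/export/perl/lib/perl5; export PERL5LIB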

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Pool disk replacing fails

2010-02-04 Thread Alexander M. Stetsenko

Hi all,
I'm trying to replace a broken LUN in the pool using zpool replace -f lun, 
but it fails. The physical disk is already replaced, and the new LUN has the 
same address as the broken one.  But zpool detach/attach works.

This is a simple configuration:

 pool: mypool
state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
   attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
   using 'zpool clear' or replace the device with 'zpool replace'.
  see: http://www.sun.com/msg/ZFS-8000-9P
scrub: resilver completed after 0h0m with 0 errors on Thu Feb  4 23:16:21 2010

config:

        NAME        STATE     READ WRITE CKSUM
        mypool      DEGRADED     0     0     0
          mirror    DEGRADED     0     0     0
            c1t4d0  DEGRADED     0     0    28  too many errors
            c1t5d0  ONLINE       0     0     0



c1t4d0 is the physically replaced LUN. Then I'm trying to replace it in the pool.

r...@myhost:~# zpool replace -f mypool c1t4d0
invalid vdev specification
the following errors must be manually repaired:
/dev/dsk/c1t4d0s0 is part of active ZFS pool mypool. Please see zpool(1M).

The zpool manual says: -f    Forces use of new_device, even if it appears 
to be in use. Not all devices can be overridden in this manner.



c1t4d0 is in use only in mypool.
What is the problem with zpool replace in my case? According to the 
zpool manual it should work.
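
For reference, the detach/attach sequence that does work is roughly this
(typed from memory, so treat it as a sketch):

# drop the faulted side of the mirror, then attach the new LUN to the survivor
zpool detach mypool c1t4d0
zpool attach mypool c1t5d0 c1t4d0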


Thank you


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Travis Tabbal
Supermicro USAS-L8i controllers. 

I agree with you, I'd much rather have the drives respond properly and promptly 
than save a little power if that means I'm going to get strange errors from the 
array. And these are the green drives; they just don't seem to cause me any 
problems. The issues people have noted with WD have made me stay away from them, 
since just about every drive I own ends up in some kind of RAID at some point in 
its life. I have a couple of laptop drives that are single; all desktops have at 
least a mirror. I'm a little nuts and would probably install mirrors in the 
laptops if there were somewhere to put them. :)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to get a list of changed files between two snapshots?

2010-02-04 Thread Darren Mackay
Hi Ross,

Yes - zdb - is dumping out info in the form of:

Object  lvl   iblk   dblk  dsize  lsize   %full  type
    19    1    16K    512    512    512  100.00  ZFS plain file
                                 264   bonus  ZFS znode
dnode flags: USED_BYTES USERUSED_ACCOUNTED 
dnode maxblkid: 0
path    /snapshot.sh
uid     0
gid     0
atime   Thu Feb  4 23:04:50 2010
mtime   Thu Feb  4 23:04:50 2010
ctime   Thu Feb  4 23:04:50 2010
crtime  Thu Feb  4 23:04:50 2010
gen     529806
mode    100755
size    174
parent  3
links   
xattr   0
rdev    0x


for all objects referenced in the snap.

Perhaps if you wanted to script this, you could parse the above output for time 
stamps that are after the previous snapshot.

Deleted files (and of course new files) can be diffed against the list for the 
snapshot you want to compare with, but I assume you also want files that have 
been modified, hence the requirement to parse the above outputs.

Unfortunately time does not permit me to come up with a working solution until 
mid next week (really snowed under - did someone say there is meant to be a 
weekend in there too?). But I am sure there is enough info here for someone 
to hack together a script.
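
As a starting point, something like this (untested, dataset name made up) pulls
the path/mtime pairs out of the zdb dump; comparing the timestamps against the
previous snapshot's creation time is the part that still needs doing:

zdb -dddd tank/home@snap2 2>/dev/null | nawk '
        /^[ \t]*path[ \t]/  { sub(/^[ \t]*path[ \t]+/, "");  p = $0 }
        /^[ \t]*mtime[ \t]/ { sub(/^[ \t]*mtime[ \t]+/, ""); print p "\t" $0 }
'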

Cheers,

Darren Mackay
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Brian
I am starting to put together a home NAS server that will have the following 
roles:

(1) Store TV recordings from SageTV over either iSCSI or CIFS.  Up to 4 or 5 HD 
streams at a time.  These will be streamed live to the NAS box during recording.
(2) Playback TV (could be stream being recorded, could be others) to 3 or more 
extenders
(3) Hold a music repository
(4) Hold backups from windows machines, mac (time machine), linux.
(5) Be an iSCSI target for several different Virtual Boxes.

Function 4 will use compression and deduplication.
Function 5 will use deduplication.

I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2 mirrored 
boot drives.  

I have been reading these forums off and on for about 6 months trying to figure 
out how to best piece together this system.

I am first trying to select the CPU.  I am leaning towards AMD because of ECC 
support and power consumption.

For items such as de-duplication, compression, checksums, etc., is it better to 
get a faster clock speed, or should I consider more cores?  I know certain 
functions such as compression may run on multiple cores.

I have so far narrowed it down to:

AMD Phenom II X2 550 Black Edition Callisto 3.1GHz
and
AMD Phenom X4 9150e Agena 1.8GHz Socket AM2+ 65W Quad-Core

As they are roughly the same price.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Glenn Lagasse
* Brian (broco...@vt.edu) wrote:
 I am Starting to put together a home NAS server that will have the
 following roles:
 
 (1) Store TV recordings from SageTV over either iSCSI or CIFS.  Up to
 4 or 5 HD streams at a time.  These will be streamed live to the NAS
 box during recording.  (2) Playback TV (could be stream being
 recorded, could be others) to 3 or more extenders (3) Hold a music
 repository (4) Hold backups from windows machines, mac (time machine),
 linux.  (5) Be an iSCSI target for several different Virtual Boxes.
 
 Function 4 will use compression and deduplication.  Function 5 will
 use deduplication.
 
 I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2
 mirrored boot drives.  
 
 I have been reading these forums off and on for about 6 months trying
 to figure out how to best piece together this system.
 
 I am first trying to select the CPU.  I am leaning towards AMD because
 of ECC support and power consumption.

I can't comment on most of your question, but I will point you at:

http://blogs.sun.com/mhaywood/entry/powernow_for_solaris

I *think* the CPUs you're looking at won't be an issue, but it's something
to be aware of when looking at AMD kit (especially if you want to manage
the processor speed).

Cheers,

-- 
Glenn
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Peter Radig
I was interested in the impact the type of an SSD has on the performance of the 
ZIL. So I did some benchmarking and just want to share the results.

My test case is simply untarring the latest ON source (528 MB, 53k files) on a 
Linux system that has a ZFS file system mounted via NFS over gigabit ethernet.

I got the following results:
- locally on the Solaris box: 30 sec
- remotely with no dedicated ZIL device: 36 min 37 sec (factor 73 compared to 
local)
- remotely with ZIL disabled: 1 min 54 sec (factor 3.8 compared to local)
- remotely with an OCZ VERTEX SATA II 120 GB as ZIL device: 14 min 40 sec 
(factor 29.3 compared to local)
- remotely with an Intel X25-E 32 GB as ZIL device: 3 min 11 sec (factor 6.4 
compared to local)

So it really makes a difference what type of SSD you use for your ZIL device. I 
was expecting good performance from the X25-E, but was really surprised that 
it is that good (only 1.7 times slower than with the ZIL completely 
disabled). So I will use the X25-E as the ZIL device on my box and will not 
consider disabling the ZIL at all to improve NFS performance.
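
(In case it helps anyone trying this: adding the SSD as a dedicated log device
is a one-liner; the pool and device names below are only placeholders.)

zpool add tank log c2t1d0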

-- Peter
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Marc Nicholas
I would go with cores (threads) rather than clock speed here. My home system
is a 4-core AMD @ 1.8Ghz and performs well.

I wouldn't use drives that big and you should be aware of the overheads of
RaidZ[x].

-marc



On Thu, Feb 4, 2010 at 6:19 PM, Brian broco...@vt.edu wrote:

 I am Starting to put together a home NAS server that will have the
 following roles:

 (1) Store TV recordings from SageTV over either iSCSI or CIFS.  Up to 4 or
 5 HD streams at a time.  These will be streamed live to the NAS box during
 recording.
 (2) Playback TV (could be stream being recorded, could be others) to 3 or
 more extenders
 (3) Hold a music repository
 (4) Hold backups from windows machines, mac (time machine), linux.
 (5) Be an iSCSI target for several different Virtual Boxes.

 Function 4 will use compression and deduplication.
 Function 5 will use deduplication.

 I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2
 mirrored boot drives.

 I have been reading these forums off and on for about 6 months trying to
 figure out how to best piece together this system.

 I am first trying to select the CPU.  I am leaning towards AMD because of
 ECC support and power consumption.

 For items such as de-dupliciation, compression, checksums etc.  Is it
 better to get a faster clock speed or should I consider more cores?  I know
 certain functions such as compression may run on multiple cores.

 I have so far narrowed it down to:

 AMD Phenom II X2 550 Black Edition Callisto 3.1GHz
 and
 AMD Phenom X4 9150e Agena 1.8GHz Socket AM2+ 65W Quad-Core

 As they are roughly the same price.
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Arnaud Brand




On 04/02/10 20:26, Tonmaus wrote:

  Hi again,

thanks for the answer. Another thing that came to my mind is that you mentioned that you mixed the disks among the controllers. Does that mean you mixed them as well among pools? Unsurprisingly,  the WD20EADS is slower than the Hitachi that is a fixed 7200 rpm drive. I wonder what impact that would have if you use them as vdevs of the same pool.

Cheers,

Tonmaus
  

Yes, we mixed them among controllers and pools.
We've done something that's not recommended: a 15 disk raidz3 pool.

Disks are as follows :
c3 (LSI SAS) has :
- 1x 64 GB Intel X25E
- 3 x 2TB WD20EADS
- 4 x 2TB Hitachi
c2 (LSI SAS) has :
- 4 x 2TB WD20EADS
- 4 x 2TB Hitachi
c5 (motherboard ICH10, if I remember correctly) has:
- 1x160GB 2,5'' WD
- DVD

All the 2TB drives are in the raidz3 zpool named tank (we've been very
innovative here ;-).
The X25E is sliced into 20GB for the system, 1GB for the ZIL for tank, and
the rest as cache for tank.

The 2.5'' 160GB WD was not initially part of the setup since we were
planning to slice the 2TB drives into 32GB for the system (mirrored
across all drives) and the rest for the big zpool, while the X25E was
just there for the ZIL and the cache, but two things we've read on
lists and forums made us change our minds:
- the disk write cache is disabled when you're not using the whole
drive 
- some reports on this list about the X25E losing up to 256 cache flushes
in case of power failures.

So we bought this 160GB disk (it was really the last thing that could
fit in the chassis) and sliced it in the same way as the X25E.
The system and the ZIL are mirrored between the X25E and the WD160.
We do not use the WD160 for the cache: we thought it would be better
to save IOPS on this disk for the ZIL mirror.
I don't know whether it's a good idea to mirror the ZIL on such a disk,
but we prefer having a slower setup and not losing that many cache flushes
on power failure.
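
In zpool terms the layout amounts to roughly the following (the slice numbers
here are placeholders, not the exact ones we used):

zpool add tank log mirror c3t0d0s1 c5t0d0s1    # ZIL mirrored across X25E and WD160
zpool add tank cache c3t0d0s3                  # L2ARC on the X25E only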

Regarding the performance obtained by using only Hitachi disks, I can't tell;
I haven't tested it, and can't do it right now as the system is in
preproduction testing.

Also, I should have mentioned in my previous post that some WD20EADS
(the 32SB0) have shorter response times (as reported by iostat). 
They're even "faster" than the Hitachi: I've seen them quite a few
times in the range 0.3 to 1.5 ms, which seems far too short for this
kind of drive.
I suspect they're sort of dropping flush requests. Add to it that 2 out
of 3 failed WD20EADS were 32SB0 and you get the picture...
Note they might also be hybrid drives with some flash memory which
allows quick acknowledgment of writes, but I think we would have heard
of such a feature on this list.

Arnaud


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Brian
Thanks for the reply.

Are cores better because the compression/deduplication is multi-threaded 
or because of multiple streams?  It is a pretty big difference in clock speed, 
so I'm curious as to why more cores would be better.  Glad to see your 4-core 
system is working well for you - so it seems like I won't really have a bad choice.

Why avoid large drives?  Reliability reasons?  My main thought on that is that 
there is a 3 year warranty and I am building raidz2 because I expect failure.  
Or are there other reasons to avoid large drives?

I thought I understood the overhead..  The write and read speeds should be 
roughly that of the slowest disk? 

Thanks.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Marc Nicholas
Very interesting stats -- thanks for taking the time and trouble to share
them!

One thing I found interesting is that the Gen 2 X25-M has higher write IOPS
than the X25-E according to Intel's documentation (6,600 IOPS for 4K writes
versus 3,300 IOPS for 4K writes on the E). I wonder if it'd perform better
as a ZIL? (The write latency on both drives is the same).

-marc

On Thu, Feb 4, 2010 at 6:43 PM, Peter Radig pe...@radig.de wrote:

 I was interested in the impact the type of an SSD has on the performance of
 the ZIL. So I did some benchmarking and just want to share the results.

 My test case is simply untarring the latest ON source (528 MB, 53k files)
 on an Linux system that has a ZFS file system mounted via NFS over gigabit
 ethernet.

 I got the following results:
 - locally on the Solaris box: 30 sec
 - remotely with no dedicated ZIL device: 36 min 37 sec (factor 73 compared
 to local)
 - remotely with ZIL disabled: 1 min 54 sec (factor 3.8 compared to local)
 - remotely with a OCZ VERTEX SATA II 120 GB as ZIL device: 14 min 40 sec
 (factor 29.3 compared to local)
 - remotely with an Intel X25-E 32 GB as ZIL device: 3 min 11 sec (factor
 6.4 compared to local)

 So it really makes a difference what type of SSD you use for your ZIL
 device. I was expecting a good performance from the X25-E, but was really
 suprised that it is that good (only 1.7 times slower than it takes with ZIL
 completely disabled). So I will use the X25-E as ZIL device on my box and
 will not consider disabling ZIL at all to improve NFS performance.

 -- Peter
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Richard Elling
Put your money into RAM, especially for dedup.
 -- richard

On Feb 4, 2010, at 3:19 PM, Brian wrote:

 I am Starting to put together a home NAS server that will have the following 
 roles:
 
 (1) Store TV recordings from SageTV over either iSCSI or CIFS.  Up to 4 or 5 
 HD streams at a time.  These will be streamed live to the NAS box during 
 recording.
 (2) Playback TV (could be stream being recorded, could be others) to 3 or 
 more extenders
 (3) Hold a music repository
 (4) Hold backups from windows machines, mac (time machine), linux.
 (5) Be an iSCSI target for several different Virtual Boxes.
 
 Function 4 will use compression and deduplication.
 Function 5 will use deduplication.
 
 I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2 mirrored 
 boot drives.  
 
 I have been reading these forums off and on for about 6 months trying to 
 figure out how to best piece together this system.
 
 I am first trying to select the CPU.  I am leaning towards AMD because of ECC 
 support and power consumption.
 
 For items such as de-dupliciation, compression, checksums etc.  Is it better 
 to get a faster clock speed or should I consider more cores?  I know certain 
 functions such as compression may run on multiple cores.
 
 I have so far narrowed it down to:
 
 AMD Phenom II X2 550 Black Edition Callisto 3.1GHz
 and
 AMD Phenom X4 9150e Agena 1.8GHz Socket AM2+ 65W Quad-Core
 
 As they are roughly the same price.
 -- 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Andrew Gabriel

Peter Radig wrote:

I was interested in the impact the type of an SSD has on the performance of the 
ZIL. So I did some benchmarking and just want to share the results.

My test case is simply untarring the latest ON source (528 MB, 53k files) on an 
Linux system that has a ZFS file system mounted via NFS over gigabit ethernet.

I got the following results:

- remotely with no dedicated ZIL device: 36 min 37 sec (factor 73 compared to 
local)

- remotely with an Intel X25-E 32 GB as ZIL device: 3 min 11 sec (factor 6.4 
compared to local)
  


That's about the same ratio I get when I demonstrate this on the 
SSD/Flash/Turbocharge Discovery Days I run in the UK from time to time (the 
name changes over time ;-).


--
Andrew Gabriel
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Mounting a snapshot of an iSCSI volume using Windows

2010-02-04 Thread Scott Meilicke
I have a single zfs volume, shared out using COMSTAR and connected to a Windows 
VM. I am taking snapshots of the volume regularly. I now want to mount a 
previous snapshot, but when I go through the process, Windows sees the new 
volume, but thinks it is blank and wants to initialize it. Any ideas how to get 
Windows to see that it has data on it?

Steps I took after the snap:

zfs clone snapshot data01/san/gallardo/g-recovery
sbdadm create-lu /dev/zvol/rdsk/data01/san/gallardo/g-recovery
stmfadm add-view -h HG-Gallardo -t TG-Gallardo -n 1 
600144F0EAE40A004B6B59090003

At this point, my server Gallardo can see the LUN, but like I said, it looks 
blank to the OS. I suspect the 'sbdadm create-lu' phase.
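
If I am reading the sbdadm man page correctly (not verified), the clone should
already carry the LU metadata of the original zvol, so importing it rather than
creating a fresh LU might preserve the data layout:

sbdadm import-lu /dev/zvol/rdsk/data01/san/gallardo/g-recovery
stmfadm add-view -h HG-Gallardo -t TG-Gallardo -n 1 <GUID reported by import-lu>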

Any help to get Windows to see it as a LUN with NTFS data would be appreciated.

Thanks,
Scott
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Arnaud Brand




On 05/02/10 01:00, Brian wrote:

  Thanks for the reply.

Are cores better because of the compression/deduplication being mult-threaded or because of multiple streams?  It is a pretty big difference in clock speed - so curious as to why core would be better.  Glad to see your 4 core system is working well for you - so seems like I won't really have a bad choice.

Why avoid large drives?  Reliability reasons?  My main thought on that is that there is a 3 year warranty and I am building raidz2 because I expect failure.  Or are there other reasons to avoid large drives?

I thought I understood the overhead..  The write and read speeds should be roughly that of the slowest disk? 

Thanks.
  

From what I saw, ZFS scales terribly well with
multiple cores. 
If you want to send/receive your filesystems through ssh to another
machine, speed matters since ssh only uses one core (but then you can
always use netcat).
On Xeon E5520 running at 2.27 GHz we achieve around 70/80 MB/s ssh
throughput.
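
For the record, the netcat variant looks roughly like this (the port number is
arbitrary, the dataset names are made up, and the exact nc flags vary between
netcat builds, so check your nc(1)):

# on the receiving box, listen first
nc -l 9000 | zfs receive tank/backup
# then on the sending box
zfs send tank/data@snap1 | nc receiver-host 9000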

For dedup, you want lots of RAM and if possible a large and fast ssd
for L2ARC.
Someone on this list was asking about estimates on ram/cache needs
based on blocksizes / fs size / estimated dedup ratio.
Either I missed the answer or there was no really simple answer (other
than more is better, which always stays true for ram and l2arc).
Anyway, we tested it and were surprised about the quantity of reads
that ensue.

Arnaud



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Cindy Swearingen

Hi Brian,

If you are considering testing dedup, particularly on large datasets, 
see the list of known issues, here:


http://hub.opensolaris.org/bin/view/Community+Group+zfs/dedup

Start with build 132.

Thanks,

Cindy


On 02/04/10 16:19, Brian wrote:

I am Starting to put together a home NAS server that will have the following 
roles:

(1) Store TV recordings from SageTV over either iSCSI or CIFS.  Up to 4 or 5 HD 
streams at a time.  These will be streamed live to the NAS box during recording.
(2) Playback TV (could be stream being recorded, could be others) to 3 or more 
extenders
(3) Hold a music repository
(4) Hold backups from windows machines, mac (time machine), linux.
(5) Be an iSCSI target for several different Virtual Boxes.

Function 4 will use compression and deduplication.
Function 5 will use deduplication.

I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2 mirrored boot drives.  


I have been reading these forums off and on for about 6 months trying to figure 
out how to best piece together this system.

I am first trying to select the CPU.  I am leaning towards AMD because of ECC 
support and power consumption.

For items such as de-dupliciation, compression, checksums etc.  Is it better to 
get a faster clock speed or should I consider more cores?  I know certain 
functions such as compression may run on multiple cores.

I have so far narrowed it down to:

AMD Phenom II X2 550 Black Edition Callisto 3.1GHz
and
AMD Phenom X4 9150e Agena 1.8GHz Socket AM2+ 65W Quad-Core

As they are roughly the same price.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Brian
It sounds like the consensus is more cores over clock speed.  Surprising to me 
since the difference in clock speed was over 1GHz.  So I will go with a quad 
core.

I was leaning towards 4GB of ram - which hopefully should be enough for dedup 
as I am only planning on dedupping my smaller file systems (backups and VMs).
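
(In other words, I would only turn it on per filesystem, something like the
following with made-up dataset names, rather than pool-wide:)

zfs set dedup=on tank/backups
zfs set compression=on tank/backups
zfs set dedup=on tank/vm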

Was my raidz2 performance comment above correct?  That the write speed is that 
of the slowest disk?  That is what I believe I have read.

Now on to the hard part of picking a motherboard that is supported and has 
enough SATA ports!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Marc Nicholas
On Thu, Feb 4, 2010 at 7:54 PM, Brian broco...@vt.edu wrote:

 It sounds like the consensus is more cores over clock speed.  Surprising to
 me since the difference in clocks speed was over 1Ghz.  So, I will go with a
 quad core.


Four cores @ 1.8Ghz = 7.2Ghz of threaded performance ([Open]Solaris is
relatively decent in terms of threading).

Two cores @ 3.1Ghz = 6.2Ghz

:)

You may find single-threaded operations slower, as someone pointed
out, but even those might wash out, as sometimes it's I/O that's the problem.

I was leaning towards 4GB of ram - which hopefully should be enough for
 dedup as I am only planning on dedupping my smaller file systems (backups
 and VMs)


4GB is a good start.


 Was my raidz2 performance comment above correct?  That the write speed is
 that of the slowest disk?  That is what I believe I have read.


You are sort-of-correct that it's the write speed of the slowest disk.

Mirrored drives will be faster, especially for random I/O. But you sacrifice
storage for that performance boost. That said, I have a similar setup as far
as number of spindles and can push 200MB/sec+ through it and saturate GigE
for iSCSI so maybe I'm being harsh on raidz2 :)


 Now on to the hard part of picking a motherboard that is supported and has
 enough SATA ports!


I used an ASUS board (M4A785-M) which has six (6) SATA2 ports onboard and
pretty decent Hypertransport throughput.

Hope that helps.

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Edward Ned Harvey
 I plan to start with 5 1.5 TB drives in a raidz2 configuration and 2
 mirrored boot drives.

You want to use compression and deduplication and raidz2.  I hope you didn't
want to get any performance out of this system, because all of those are
compute or IO intensive.

FWIW ... 5 disks in raidz2 will have capacity of 3 disks.  But if you bought
6 disks in mirrored configuration, you have a small extra cost, and much
better performance.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Brian
Interesting comments..

But I am confused.

Performance for my backups (compression/deduplication) would most likely not be 
#1 priority.

I want my VMs to run fast - so is it deduplication that really slows things 
down?

Are you saying raidz2 would overwhelm current I/O controllers to the point where 
I could not saturate a 1 Gb network link?

Is the CPU I am looking at not capable of doing dedup and compression?  Or are 
no CPUs capable of doing that currently?  If I only enable it for the backup 
filesystem will all my filesystems suffer performance wise?

Where are the bottlenecks in a raidz2 system that I will only access over a 
single gigabit link?  Are they insurmountable?



  I plan to start with 5 1.5 TB drives in a raidz2
 configuration and 2
  mirrored boot drives.
 
 You want to use compression and deduplication and
 raidz2.  I hope you didn't
 want to get any performance out of this system,
 because all of those are
 compute or IO intensive.
 
 FWIW ... 5 disks in raidz2 will have capacity of 3
 disks.  But if you bought
 6 disks in mirrored configuration, you have a small
 extra cost, and much
 better performance.
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discu
 ss
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Bob Friesenhahn

On Thu, 4 Feb 2010, Brian wrote:

Was my raidz2 performance comment above correct?  That the write 
speed is that of the slowest disk?  That is what I believe I have 
read.


Data in raidz2 is striped so that it is split across multiple disks. 
In this (sequential) sense it is faster than a single disk.  For 
random access, the stripe performance can not be faster than the 
slowest disk though.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Bob Friesenhahn

On Thu, 4 Feb 2010, Marc Nicholas wrote:


Very interesting stats -- thanks for taking the time and trouble to share them!

One thing I found interesting is that the Gen 2 X25-M has higher write IOPS 
than the
X25-E according to Intel's documentation (6,600 IOPS for 4K writes versus 3,300 
IOPS for
4K writes on the E). I wonder if it'd perform better as a ZIL? (The write 
latency on
both drives is the same).


The write IOPS between the X25-M and the X25-E are different since 
with the X25-M, much more of your data gets completely lost.  Most of 
us prefer not to lose our data.


The X25-M is about as valuable as a paper weight for use as a zfs 
slog.  Toilet paper would be a step up.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Marc Nicholas
On Thu, Feb 4, 2010 at 10:18 PM, Bob Friesenhahn 
bfrie...@simple.dallas.tx.us wrote:

 On Thu, 4 Feb 2010, Marc Nicholas wrote:

  Very interesting stats -- thanks for taking the time and trouble to share
 them!

 One thing I found interesting is that the Gen 2 X25-M has higher write
 IOPS than the
 X25-E according to Intel's documentation (6,600 IOPS for 4K writes versus
 3,300 IOPS for
 4K writes on the E). I wonder if it'd perform better as a ZIL? (The
 write latency on
 both drives is the same).


 The write IOPS between the X25-M and the X25-E are different since with the
 X25-M, much more of your data gets completely lost.  Most of us prefer not
 to lose our data.

 Would you like to qualify your statement further?

While I understand the difference between MLC and SLC parts, I'm pretty sure
Intel didn't design the M version to make data get completely lost. ;)

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Bob Friesenhahn

On Thu, 4 Feb 2010, Marc Nicholas wrote:


The write IOPS between the X25-M and the X25-E are different since with the 
X25-M, much
more of your data gets completely lost.  Most of us prefer not to lose our data.

Would you like to qualify your statement further?


Google is your friend.  And check earlier on this list/forum as well.


While I understand the difference between MLC and SLC parts, I'm pretty sure 
Intel didn't
design the M version to make data get completely lost. ;)


It loses the most recently written data, even after a cache sync 
request.  A number of people have verified this for themselves and 
posted results.  Even the X25-E has been shown to lose some 
transactions.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Impact of an enterprise class SSD on ZIL performance

2010-02-04 Thread Marc Nicholas
On Thu, Feb 4, 2010 at 10:35 PM, Bob Friesenhahn 
bfrie...@simple.dallas.tx.us wrote:

 On Thu, 4 Feb 2010, Marc Nicholas wrote:


 The write IOPS between the X25-M and the X25-E are different since with
 the X25-M, much
 more of your data gets completely lost.  Most of us prefer not to lose our
 data.

 Would you like to qualify your statement further?


 Google is your friend.  And check earlier on this list/forum as well.

  While I understand the difference between MLC and SLC parts, I'm pretty
 sure Intel didn't
 design the M version to make data get completely lost. ;)


 It loses the most recently written data, even after a cache sync request.
  A number of people have verified this for themselves and posted results.
  Even the X25-E has been shown to lose some transactions.

 The devices have some DRAM (16MB) that is used for write amplification
levelling. The sudden loss of power means that this DRAM doesn't get flushed
to Flash. This is the very reason the STEC devices have a supercap.

-marc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Edward Ned Harvey
 Data in raidz2 is striped so that it is split across multiple disks.

Partial truth.
Yes, the data is on more than one disk, but it's a parity hash, requiring
computation overhead and a write operation on each and every disk.  It's not
simply striped.  Whenever you read or write, you need to access all the
disks (or a bunch of 'em) and use compute cycles to generate the actual data
stream.  I don't know enough about the underlying methods of calculating and
distributing everything to say intelligently *why*, but I know this:

 In this (sequential) sense it is faster than a single disk.  

Whenever I benchmark raid5 versus a mirror, the mirror is always faster.
Noticeably and measurably faster, as in 50% to 4x faster.  (50% for a single
disk mirror versus a 6-disk raid5, and 4x faster for a stripe of mirrors, 6
disks with the capacity of 3, versus a 6-disk raid5.)  Granted, I'm talking
about raid5 and not raidz.  There is possibly a difference there, but I
don't think so.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Edward Ned Harvey
 I want my VMs to run fast - so is it deduplication that really slows
 things down?
 
 Are you saying raidz2 would overwhelm current I/O controllers to where
 I could not saturate 1 GB network link?
 
 Is the CPU I am looking at not capable of doing dedup and compression?
 Or are no CPUs capable of doing that currently?  If I only enable it
 for the backup filesystem will all my filesystems suffer performance
 wise?
 
 Where are the bottlenecks in a raidz2 system that I will only access
 over a single gigabit link?  Are the insurmountable?

I'm not sure if anybody can answer your questions.  I will suggest you just
try things out, and see for yourself.  Everybody would have different
techniques to tweak performance...

If you want to use fast compression and dedup, lots of cpu and ram.  (You
said 4G, but I don't think that's a lot.  I never buy a laptop with less
than 4G nowadays.  I think a lot of ram is 16G and higher.)

As for raidz2, and Ethernet ... I don't know.  If you've got 5 disks in a
raidz2 configuration ... Assuming each disk can sustain 500Mbits, then
theoretically these disks might be able to achieve 1.5Gbit or 2.5Gbit with
perfect efficiency ... So maybe they can max out your Ethernet.  I don't
know.  But I do know, if you had a stripe of 3 mirrors, they would have
absolutely no trouble maxing out the Ethernet.  Even a single mirror could
just barely do that.  For 2 or more mirrors, it's cake.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Erik Trimble

Brian wrote:

Interesting comments..

But I am confused.

Performance for my backups (compression/deduplication) would most likely not be 
#1 priority.

I want my VMs to run fast - so is it deduplication that really slows things 
down?
  
Dedup requires a fair amount of CPU, but it really wants a big L2ARC and 
RAM.  I'd seriously consider no less than 8GB of RAM, and look at 
getting a smaller-sized (~40GB) SSD, something on the order of an Intel 
X25-M.


Also, iSCSI-served VMs tend to do mostly random I/O, which is better 
handled by a striped mirror than RaidZ. 


Are you saying raidz2 would overwhelm current I/O controllers to where I could 
not saturate 1 GB network link?
  

No.


Is the CPU I am looking at not capable of doing dedup and compression?  Or are 
no CPUs capable of doing that currently?  If I only enable it for the backup 
filesystem will all my filesystems suffer performance wise?
  
All the CPUs you indicate can handle the job, it's a matter of getting 
enough data to them.



Where are the bottlenecks in a raidz2 system that I will only access over a 
single gigabit link?  Are the insurmountable?
  
RaidZ is good for streaming writes of large size, where you should get 
performance roughly equal to the number of data drives.  Likewise, for 
streaming reads.  Small writes generally limit performance to a level of 
about 1 disk, regardless of the number of data drives in the RaidZ. 
Small reads are in-between in terms of performance.



Personally, I'd look into having 2 different zpools - a striped mirror 
for your iSCSI-shared VMs, and a raidz2 for your main storage. 
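
Something along these lines, with the disk names obviously being placeholders:

# striped mirror for the iSCSI-shared VMs
zpool create vmpool mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0
# raidz2 for the bulk storage
zpool create tank raidz2 c1t4d0 c1t5d0 c1t6d0 c1t7d0 c1t8d0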

In any case, for dedup, you really should have an SSD for L2ARC, if at 
all possible.  Being able to store all the metadata for the entire zpool 
in the L2ARC really, really helps speed up dedup.



Also, about your CPU choices, look here for a good summary of the 
current AMD processor features:


http://en.wikipedia.org/wiki/List_of_AMD_Phenom_microprocessors

(this covers the Phenom, Phenom II, and Athlon II families).


The main difference between the various models comes down to amount of 
L3 cache, and HT speed.  I'd be interested in doing some benchmarking to 
see exactly how the variations make a difference.



--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Jesus Cea
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 02/05/2010 03:21 AM, Edward Ned Harvey wrote:
 FWIW ... 5 disks in raidz2 will have capacity of 3 disks.  But if you bought
 6 disks in mirrored configuration, you have a small extra cost, and much
 better performance.

But the raidz2 can survive the loss of ANY two disks, while the 6 disk
mirror configuration will be destroyed if the two disks lost are from
the SAME pair.

- -- 
Jesus Cea Avion _/_/  _/_/_/_/_/_/
j...@jcea.es - http://www.jcea.es/ _/_/_/_/  _/_/_/_/  _/_/
jabber / xmpp:j...@jabber.org _/_/_/_/  _/_/_/_/_/
.  _/_/  _/_/_/_/  _/_/  _/_/
Things are not so easy  _/_/  _/_/_/_/  _/_/_/_/  _/_/
My name is Dump, Core Dump   _/_/_/_/_/_/  _/_/  _/_/
El amor es poner tu felicidad en la felicidad de otro - Leibniz
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBS2ukAZlgi5GaxT1NAQKD6wQAjI7zTFGmsHKtrhfSGS65edDecxwG8MSV
rDsxoDD0OFs5A1rAJBKZ0UWcRrrDt8iTUKyM0W13+3D2S3i6pxaMLU5jCLFEIPJ7
ZukQxUQ3eRLksXNCjsc7IlIyoe3GTwNclV8pymYCkHp+jggHASRyRtVnninDDX+g
zs1X2Rd4qwU=
=qzs+
-END PGP SIGNATURE-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cores vs. Speed?

2010-02-04 Thread Rob Logan

  I am leaning towards AMD because of ECC support 

well, let's look at Intel's offerings... RAM is faster than AMD's
at 1333MHz DDR3, and one gets ECC and a thermal sensor for $10 over non-ECC 
http://www.newegg.com/Product/Product.aspx?Item=N82E16820139040

This MB has two Intel ethernets and for an extra $30 an ether KVM (LOM)
http://www.newegg.com/Product/Product.aspx?Item=N82E16813182212

One needs a Xeon 34xx for ECC; the 45W version isn't on newegg, and ignoring
the one without Hyper-Threading leaves us 
http://www.newegg.com/Product/Product.aspx?Item=N82E16819117225

Yeah, @ 95W it isn't exactly low power, but 4 cores @ 2533MHz and another
4 Hyper-Threaded cores is nice.  If you only need one core, the marketing
paperwork claims it will push to 2.93GHz too. But the RAM bandwidth is the 
big win for Intel. 

Avoid the temptation, but @ 2.8GHz without ECC, this one is close in price:
http://www.newegg.com/Product/Product.aspx?Item=N82E16819115214

Now, this gets one to 8G of ECC easily... AMD's unfair advantage is all those
RAM slots on their multi-die MBs... A slow AMD CPU with 64G of RAM
might be better depending on your working set / dedup requirements.

Rob



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss