Re: [zfs-discuss] zfs send/receive as backup - reliability?
On 20/01/2010 15:45, David Dyer-Bennet wrote: On Wed, January 20, 2010 09:23, Robert Milkowski wrote: Now you rsync all the data from your clients to a dedicated filesystem per client, then create a snapshot. Is there an rsync out there that can reliably replicate all file characteristics between two ZFS/Solaris systems? I haven't found one. The ZFS ACLs seem to be beyond all of them, in particular. No, it doesn't support ZFS ACLs - fortunately that is not an issue for us. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On 20/01/2010 19:20, Ian Collins wrote: Julian Regel wrote: It is actually not that easy. Compare the cost of 2x x4540 with 1TB disks to an equivalent solution on LTO. Each x4540 could be configured as: 4x 11 disks in raidz-2 + 2x hot spare + 2x OS disks. The four raidz2 groups form a single pool. This would provide well over 30TB of logical storage per box. Now you rsync all the data from your clients to a dedicated filesystem per client, then create a snapshot. All snapshots are replicated to a 2nd x4540, so even if you were to lose an entire box or its data for some reason you would still have a spare copy. Now compare that to the cost of a library, LTO drives, tapes, software + licenses, support costs, ... See more details at http://milek.blogspot.com/2009/12/my-presentation-at-losug.html I've just read your presentation Robert. Interesting stuff. I've also just done a pen and paper exercise to see how much 30TB of tape would cost as a comparison to your disk based solution. Using list prices from Sun's website (and who pays list..?), an SL48 with 2 x LTO3 drives would cost £14000. I couldn't see a price on an LTO4-equipped SL48 despite the Sun website saying it's a supported option. Each LTO3 has a native capacity of 300GB and the SL48 can hold up to 48 tapes in the library (14.4TB native per library). To match the 30TB in your solution, we'd need two libraries totalling £28000. You would also need 100 LTO3 tapes to provide 30TB of native storage. I recently bought a pack of 20 tapes for £340, so five packs would be £1700. So you could provision a tape backup for just under £30000 (~$49000). In comparison, the cost of one X4540 with ~36TB usable storage is UK list price £30900. I've not factored in backup software since you could use an open source solution such as Amanda or Bacula. A more apples to apples comparison would be to compare the storage only. Both removable drive and tape options require a server with FC or SCSI ports, so that can be excluded from the comparison. I think one should actually compare whole solutions - including servers, FC infrastructure, tape drives, robots, software costs, rack space, ... Servers like the x4540 are ideal for a zfs+rsync backup solution - very compact, good $/GB ratio, enough CPU power for their capacity, easy to scale horizontally, and neither too small nor too big. And thanks to their compactness they are very easy to administer. Depending on the environment, one could always deploy them in pairs - one in one datacenter and the second in another datacenter, with ZFS send based replication of all backups (snapshots). Or one may replicate (cross-replicate) only selected clients if needed. -- Robert Milkowski http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
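A minimal sketch of the rsync + snapshot + replication cycle described above, for one client (hostnames, pool and filesystem names are placeholders, and rsync options and error handling are simplified):

  # on the first x4540: one filesystem per client, sync then snapshot
  rsync -aH --delete client1:/data/ /backup/client1/
  zfs snapshot backup/client1@20100121

  # replicate the new snapshot incrementally to the second x4540
  zfs send -i backup/client1@20100120 backup/client1@20100121 | \
      ssh x4540-2 zfs receive -F backup/client1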
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Robert Milkowski wrote: On 20/01/2010 19:20, Ian Collins wrote: Julian Regel wrote: It is actually not that easy. Compare the cost of 2x x4540 with 1TB disks to an equivalent solution on LTO. Each x4540 could be configured as: 4x 11 disks in raidz-2 + 2x hot spare + 2x OS disks. The four raidz2 groups form a single pool. This would provide well over 30TB of logical storage per box. Now you rsync all the data from your clients to a dedicated filesystem per client, then create a snapshot. All snapshots are replicated to a 2nd x4540, so even if you were to lose an entire box or its data for some reason you would still have a spare copy. Now compare that to the cost of a library, LTO drives, tapes, software + licenses, support costs, ... See more details at http://milek.blogspot.com/2009/12/my-presentation-at-losug.html I've just read your presentation Robert. Interesting stuff. I've also just done a pen and paper exercise to see how much 30TB of tape would cost as a comparison to your disk based solution. Using list prices from Sun's website (and who pays list..?), an SL48 with 2 x LTO3 drives would cost £14000. I couldn't see a price on an LTO4-equipped SL48 despite the Sun website saying it's a supported option. Each LTO3 has a native capacity of 300GB and the SL48 can hold up to 48 tapes in the library (14.4TB native per library). To match the 30TB in your solution, we'd need two libraries totalling £28000. You would also need 100 LTO3 tapes to provide 30TB of native storage. I recently bought a pack of 20 tapes for £340, so five packs would be £1700. So you could provision a tape backup for just under £30000 (~$49000). In comparison, the cost of one X4540 with ~36TB usable storage is UK list price £30900. I've not factored in backup software since you could use an open source solution such as Amanda or Bacula. A more apples to apples comparison would be to compare the storage only. Both removable drive and tape options require a server with FC or SCSI ports, so that can be excluded from the comparison. I think one should actually compare whole solutions - including servers, FC infrastructure, tape drives, robots, software costs, rack space, ... Servers like the x4540 are ideal for a zfs+rsync backup solution - very compact, good $/GB ratio, enough CPU power for their capacity, easy to scale horizontally, and neither too small nor too big. And thanks to their compactness they are very easy to administer. Until you try to pick one up and put it in a fire safe! Depending on the environment, one could always deploy them in pairs - one in one datacenter and the second in another datacenter, with ZFS send based replication of all backups (snapshots). Or one may replicate (cross-replicate) only selected clients if needed. Yes, I agree. That's how my client's systems are configured (pairs). We also have another with an attached tape library. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
zpool create -f testpool mirror c0t0d0 c1t0d0 mirror c4t0d0 c6t0d0 mirror c0t1d0 c1t1d0 mirror c4t1d0 c5t1d0 mirror c6t1d0 c7t1d0 mirror c0t2d0 c1t2d0 mirror c4t2d0 c5t2d0 mirror c6t2d0 c7t2d0 mirror c0t3d0 c1t3d0 mirror c4t3d0 c5t3d0 mirror c6t3d0 c7t3d0 mirror c0t4d0 c1t4d0 mirror c4t4d0 c6t4d0 mirror c0t5d0 c1t5d0 mirror c4t5d0 c5t5d0 mirror c6t5d0 c7t5d0 mirror c0t6d0 c1t6d0 mirror c4t6d0 c5t6d0 mirror c6t6d0 c7t6d0 mirror c0t7d0 c1t7d0 mirror c4t7d0 c5t7d0 mirror c6t7d0 c7t7d0 mirror c7t0d0 c7t4d0 This looks good. But you probably want to stick a spare in there, and add a SSD disk specified by log ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
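If you do want the spare and the log SSD suggested here, each is one more command against the same pool (the c5t0d0 / c5t4d0 names below are placeholders for whichever spare disk and SSD are free in your system):

  zpool add testpool spare c5t0d0
  zpool add testpool log c5t4d0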
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
Zfs does not strictly support RAID 1+0. However, your sample command will create a pool based on mirror vdevs which is written to in a load-shared fashion (not striped). This type of pool is ideal for [...] Although it's not technically striped according to the RAID definition of striping, it does achieve the same performance result (actually better), so people will generally refer to this as striping anyway. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
On Thursday 21 January 2010 10:29:16 Edward Ned Harvey wrote: zpool create -f testpool mirror c0t0d0 c1t0d0 mirror c4t0d0 c6t0d0 mirror c0t1d0 c1t1d0 mirror c4t1d0 c5t1d0 mirror c6t1d0 c7t1d0 mirror c0t2d0 c1t2d0 mirror c4t2d0 c5t2d0 mirror c6t2d0 c7t2d0 mirror c0t3d0 c1t3d0 mirror c4t3d0 c5t3d0 mirror c6t3d0 c7t3d0 mirror c0t4d0 c1t4d0 mirror c4t4d0 c6t4d0 mirror c0t5d0 c1t5d0 mirror c4t5d0 c5t5d0 mirror c6t5d0 c7t5d0 mirror c0t6d0 c1t6d0 mirror c4t6d0 c5t6d0 mirror c6t6d0 c7t6d0 mirror c0t7d0 c1t7d0 mirror c4t7d0 c5t7d0 mirror c6t7d0 c7t7d0 mirror c7t0d0 c7t4d0 This looks good. But you probably want to stick a spare in there, and add a SSD disk specified by log May I jump in here and ask how people are using SSDs reliably in an x4500? So far we have had very little success with X25-E drives and a 3.5-to-2.5 inch converter. So far two systems have shown pretty bad instabilities with that. Anyone with a success here? Cheers Carste ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
zpool create testpool disk1 disk2 disk3 In the traditional sense of RAID, this would create a concatenated data set. The size of the data set is the size of disk1 + disk2 + disk3. However, since this is ZFS, it's not constrained to linearly assigning virtual disk blocks to physical disk blocks ... ZFS will happily write a single large file to all 3 disks simultaneously and just keep track of where all the blocks landed. As a result, you get performance which is 3x a single disk for large files (like striping) but the performance for small files has not been harmed (as it is in striping)... As an added bonus, unlike striping, you can still just add more disks to your zpool, and expand your volume on the fly. The filesystem will dynamically adjust to accommodate more space and more devices, and will intelligently optimize for performance. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
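As an illustration of that last point, growing the pool later is just another add - a sketch using the same placeholder disk names as above:

  zpool create testpool disk1 disk2 disk3
  zpool add testpool disk4     # capacity grows immediately; new writes spread across all four disks
  zpool list testpool          # shows the enlarged pool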
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Robert Milkowski wrote: I think one should actually compare whole solutions - including servers, FC infrastructure, tape drives, robots, software costs, rack space, ... Servers like the x4540 are ideal for a zfs+rsync backup solution - very compact, good $/GB ratio, enough CPU power for their capacity, easy to scale horizontally, and neither too small nor too big. And thanks to their compactness they are very easy to administer. Depending on the environment, one could always deploy them in pairs - one in one datacenter and the second in another datacenter, with ZFS send based replication of all backups (snapshots). Or one may replicate (cross-replicate) only selected clients if needed. Something else that often sells the 4500/4540 relates to internal company politics. Often, inside a company, storage has to be provisioned from the company's storage group using very expensive SAN based storage - indeed so expensive, by the time the storage group has added its overhead onto the already expensive SAN, that whole projects become unviable. Instead, teams find they can order 4500/4540's which slip under the radar as servers (or even PCs), and they now have affordable storage for their projects, which makes them viable once more. -- Andrew ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
Can ASM match ZFS for checksum and self healing? The reason I ask is that the x45x0 uses inexpensive (less reliable) SATA drives. Even the J4xxx paper you cite uses SAS for production data (only using SATA for Oracle Flash, although I gave my concerns about that too). The thing is, ZFS and the x45x0 seem made for each other. The latter only makes sense to me with all the goodness and assurance added by the former. Phil On 21 Jan 2010, at 02:58, John hort...@gmail.com wrote: Have you looked at using Oracle ASM instead of or with ZFS? Recent Sun docs concerning the F5100 seem to recommend a hybrid of both. If you don't go that route, generally you should separate redo logs from actual data so they don't compete for I/O, since a redo switch lagging hangs the database. If you use archive logs, separate those onto yet another pool. Realistically, it takes lots of analysis with different configurations. Every workload and database is different. A decent overview of configuring JBOD-type storage for databases is here, though it doesn't use ASM... https://www.sun.com/offers/docs/j4000_oracle_db.pdf It's a couple years old and that might contribute to the lack of an ASM mention. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On 21/01/2010 09:07, Ian Collins wrote: Robert Milkowski wrote: On 20/01/2010 19:20, Ian Collins wrote: Julian Regel wrote: It is actually not that easy. Compare the cost of 2x x4540 with 1TB disks to an equivalent solution on LTO. Each x4540 could be configured as: 4x 11 disks in raidz-2 + 2x hot spare + 2x OS disks. The four raidz2 groups form a single pool. This would provide well over 30TB of logical storage per box. Now you rsync all the data from your clients to a dedicated filesystem per client, then create a snapshot. All snapshots are replicated to a 2nd x4540, so even if you were to lose an entire box or its data for some reason you would still have a spare copy. Now compare that to the cost of a library, LTO drives, tapes, software + licenses, support costs, ... See more details at http://milek.blogspot.com/2009/12/my-presentation-at-losug.html I've just read your presentation Robert. Interesting stuff. I've also just done a pen and paper exercise to see how much 30TB of tape would cost as a comparison to your disk based solution. Using list prices from Sun's website (and who pays list..?), an SL48 with 2 x LTO3 drives would cost £14000. I couldn't see a price on an LTO4-equipped SL48 despite the Sun website saying it's a supported option. Each LTO3 has a native capacity of 300GB and the SL48 can hold up to 48 tapes in the library (14.4TB native per library). To match the 30TB in your solution, we'd need two libraries totalling £28000. You would also need 100 LTO3 tapes to provide 30TB of native storage. I recently bought a pack of 20 tapes for £340, so five packs would be £1700. So you could provision a tape backup for just under £30000 (~$49000). In comparison, the cost of one X4540 with ~36TB usable storage is UK list price £30900. I've not factored in backup software since you could use an open source solution such as Amanda or Bacula. A more apples to apples comparison would be to compare the storage only. Both removable drive and tape options require a server with FC or SCSI ports, so that can be excluded from the comparison. I think one should actually compare whole solutions - including servers, FC infrastructure, tape drives, robots, software costs, rack space, ... Servers like the x4540 are ideal for a zfs+rsync backup solution - very compact, good $/GB ratio, enough CPU power for their capacity, easy to scale horizontally, and neither too small nor too big. And thanks to their compactness they are very easy to administer. Until you try to pick one up and put it in a fire safe! Then you back up to tape from the x4540 whatever data you need. In the case of enterprise products you save on licensing here, as you need only one client license per x4540 but can in fact back up data from many clients which are stored there. :) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
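For the tape step, one simple approach is to stream a snapshot straight to the drive - a rough sketch, assuming a local tape at /dev/rmt/0n and example dataset names, and keeping in mind that a zfs send stream can only be restored with zfs receive:

  zfs send backup/client1@20100121 | dd of=/dev/rmt/0n bs=1048576
  # and to restore:
  dd if=/dev/rmt/0n bs=1048576 | zfs receive backup/client1-restore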
[zfs-discuss] zones and other filesystems
I'm pretty new to OpenSolaris. I come from FreeBSD. Naturally, after using FreeBSD for a while I've been big on the use of FreeBSD jails, so I just had to try zones. I've figured out how to get zones running but now I'm stuck and need help. Is there anything like nullfs in OpenSolaris... or maybe there is a more Solaris way of doing what I need to do. Basically, what I'd like to do is give a specific zone access to 2 ZFS filesystems which are available to the global zone. My new zones are in: /export/home/zone1 /export/home/zone2 What I'd like to do is give them access to: /tank/nas/Video /tank/nas/JeffB I'm sure I overlooked something hugely easy and important... thanks. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Until you try to pick one up and put it in a fire safe! Then you back up to tape from the x4540 whatever data you need. In the case of enterprise products you save on licensing here, as you need only one client license per x4540 but can in fact back up data from many clients which are stored there. Which brings us full circle... What do you then use to back up to tape, bearing in mind that the Sun-provided tools all have significant limitations? I guess you need to use a third party tool and watch carefully that they provide complete backups. JR ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zones and other filesystems
On 21 Jan 2010 at 12:33, Thomas Burgess wrote: I'm pretty new to OpenSolaris. I come from FreeBSD. Naturally, after using FreeBSD for a while I've been big on the use of FreeBSD jails, so I just had to try zones. I've figured out how to get zones running but now I'm stuck and need help. Is there anything like nullfs in OpenSolaris... or maybe there is a more Solaris way of doing what I need to do. Basically, what I'd like to do is give a specific zone access to 2 ZFS filesystems which are available to the global zone. My new zones are in: /export/home/zone1 /export/home/zone2 The path of the root of your zone is not important for that feature. What I'd like to do is give them access to: /tank/nas/Video /tank/nas/JeffB With zonecfg, you can add a configuration like this one to your zone:

add fs
set dir=/some/path/Video
set special=/tank/nas/Video
set type=lofs
end
add fs
set dir=/some/path/JeffB
set special=/tank/nas/JeffB
set type=lofs
end

Your filesystems will appear in /some/path/Video and /some/path/JeffB in your zone, and still be accessible in the global zone. http://docs.sun.com/app/docs/doc/817-1592/z.conf.start-29?a=view This option doesn't let you manage the filesystems from the zone, though. You must use add dataset in that case. http://docs.sun.com/app/docs/doc/819-5461/gbbst?a=view Gaëtan -- Gaëtan Lehmann Biologie du Développement et de la Reproduction INRA de Jouy-en-Josas (France) tel: +33 1 34 65 29 66  fax: 01 34 65 29 09 http://voxel.jouy.inra.fr http://www.itk.org http://www.mandriva.org http://www.bepo.fr PGP.sig Description: This is a PGP digital signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zones and other filesystems
The path of the root of your zone is not important for that feature. Ok, cool. With zonecfg, you can add a configuration like this one to your zone:

add fs
set dir=/some/path/Video
set special=/tank/nas/Video
set type=lofs
end
add fs
set dir=/some/path/JeffB
set special=/tank/nas/JeffB
set type=lofs
end

Thanks, I thought I read that this wouldn't work unless it was a legacy mount. So I'll be able to access the filesystem from both the global zone and my new zone? Your filesystems will appear in /some/path/Video and /some/path/JeffB in your zone, and still be accessible in the global zone. http://docs.sun.com/app/docs/doc/817-1592/z.conf.start-29?a=view Guess that answers that question =) Thanks, I'll try that. This option doesn't let you manage the filesystems from the zone, though. You must use add dataset in that case. Actually, this is GOOD, I don't WANT the zone to have the ability to change anything, just the ability to create new files. Thanks for the help. http://docs.sun.com/app/docs/doc/819-5461/gbbst?a=view Gaëtan -- Gaëtan Lehmann Biologie du Développement et de la Reproduction INRA de Jouy-en-Josas (France) tel: +33 1 34 65 29 66  fax: 01 34 65 29 09 http://voxel.jouy.inra.fr http://www.itk.org http://www.mandriva.org http://www.bepo.fr ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
No. But, that's where the hybrid solution comes in. ASM would be used for the database files and ZFS for the redo/archive logs and undo. Corrupt blocks in the datafiles would be repaired with data from redo during a recovery, and ZFS should give you assurance that the redo didn't get corrupted. Sun's docs on the F5100 point to this as the best solution for performance and recoverability/reliability. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zones and other filesystems
Now I'm stuck again... sorry to clog the tubes with my noobishness. I can't seem to create users inside the zone... I'm sure it's due to ZFS privileges somewhere but I'm not exactly sure how to fix it... I don't mind if I need to manage the ZFS filesystem outside of the zone, I'm just not sure WHERE I'm supposed to do it. When I try to create a home dir I get this: mkdir: Failed to make directory wonslung; Operation not applicable When I try to do it via useradd I get this: UX: useradd: ERROR: Unable to create the home directory: Operation not applicable. And when I try to enter the zone home dir from the global zone I get this, even as root: bash: cd: home: Not owner Have I seriously screwed up or did I again miss something vital? Thanks again. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zones and other filesystems
On 21 Jan 2010 at 14:14, Thomas Burgess wrote: Now I'm stuck again... sorry to clog the tubes with my noobishness. I can't seem to create users inside the zone... I'm sure it's due to ZFS privileges somewhere but I'm not exactly sure how to fix it... I don't mind if I need to manage the ZFS filesystem outside of the zone, I'm just not sure WHERE I'm supposed to do it. When I try to create a home dir I get this: mkdir: Failed to make directory wonslung; Operation not applicable When I try to do it via useradd I get this: UX: useradd: ERROR: Unable to create the home directory: Operation not applicable. And when I try to enter the zone home dir from the global zone I get this, even as root: bash: cd: home: Not owner Have I seriously screwed up or did I again miss something vital? Maybe it's because of the automounter. If you don't need that feature, try to disable it in your zone with svcadm disable autofs Gaëtan -- Gaëtan Lehmann Biologie du Développement et de la Reproduction INRA de Jouy-en-Josas (France) tel: +33 1 34 65 29 66  fax: 01 34 65 29 09 http://voxel.jouy.inra.fr http://www.itk.org http://www.mandriva.org http://www.bepo.fr PGP.sig Description: This is a PGP digital signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zones and other filesystems
Hrm... that seemed to work... I'm so new to Solaris... it's SO different... What exactly did I just disable? Does that mount NFS shares or something? Why should that prevent me from creating home directories? Thanks 2010/1/21 Gaëtan Lehmann gaetan.lehm...@jouy.inra.fr On 21 Jan 2010 at 14:14, Thomas Burgess wrote: Now I'm stuck again... sorry to clog the tubes with my noobishness. I can't seem to create users inside the zone... I'm sure it's due to ZFS privileges somewhere but I'm not exactly sure how to fix it... I don't mind if I need to manage the ZFS filesystem outside of the zone, I'm just not sure WHERE I'm supposed to do it. When I try to create a home dir I get this: mkdir: Failed to make directory wonslung; Operation not applicable When I try to do it via useradd I get this: UX: useradd: ERROR: Unable to create the home directory: Operation not applicable. And when I try to enter the zone home dir from the global zone I get this, even as root: bash: cd: home: Not owner Have I seriously screwed up or did I again miss something vital? Maybe it's because of the automounter. If you don't need that feature, try to disable it in your zone with svcadm disable autofs Gaëtan -- Gaëtan Lehmann Biologie du Développement et de la Reproduction INRA de Jouy-en-Josas (France) tel: +33 1 34 65 29 66  fax: 01 34 65 29 09 http://voxel.jouy.inra.fr http://www.itk.org http://www.mandriva.org http://www.bepo.fr ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zones and other filesystems
Thomas, If you're trying to make user home directories on your local machine in /home, you have to watch out because the initial Solaris config assumes that you're in an enterprise environment and the convention is to have a filer somewhere that serves everyone's home directories which, with the default automount config, get mounted onto your machine's /home. Personally, when setting up a standalone box, I don't put home directories in /home just to avoid clobbering enterprise unix conventions (an alternative that keeps the automounter running is sketched below). Gaëtan gave you the quick solution of just shutting off the automounter, which allows you to avoid addressing the problem this time around. --jake Thomas Burgess wrote: Hrm... that seemed to work... I'm so new to Solaris... it's SO different... What exactly did I just disable? Does that mount NFS shares or something? Why should that prevent me from creating home directories? Thanks 2010/1/21 Gaëtan Lehmann gaetan.lehm...@jouy.inra.fr On 21 Jan 2010 at 14:14, Thomas Burgess wrote: Now I'm stuck again... sorry to clog the tubes with my noobishness. I can't seem to create users inside the zone... I'm sure it's due to ZFS privileges somewhere but I'm not exactly sure how to fix it... I don't mind if I need to manage the ZFS filesystem outside of the zone, I'm just not sure WHERE I'm supposed to do it. When I try to create a home dir I get this: mkdir: Failed to make directory wonslung; Operation not applicable When I try to do it via useradd I get this: UX: useradd: ERROR: Unable to create the home directory: Operation not applicable. And when I try to enter the zone home dir from the global zone I get this, even as root: bash: cd: home: Not owner Have I seriously screwed up or did I again miss something vital? Maybe it's because of the automounter. If you don't need that feature, try to disable it in your zone with svcadm disable autofs Gaëtan -- Gaëtan Lehmann Biologie du Développement et de la Reproduction INRA de Jouy-en-Josas (France) tel: +33 1 34 65 29 66  fax: 01 34 65 29 09 http://voxel.jouy.inra.fr http://www.itk.org http://www.mandriva.org http://www.bepo.fr ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
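If you do want local accounts under /home without turning the automounter off, the conventional alternative is to point the auto_home map at the local filesystem - a sketch, assuming home directories live under /export/home (the jdoe entry is just an example user):

  # /etc/auto_home
  jdoe    localhost:/export/home/jdoe
  # or, to map every user the same way:
  *       localhost:/export/home/&

  # then restart the automounter so it rereads the map
  svcadm restart autofs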
Re: [zfs-discuss] zones and other filesystems
Ahh, On Thu, Jan 21, 2010 at 8:55 AM, Jacob Ritorto jacob.rito...@gmail.com wrote: Thomas, If you're trying to make user home directories on your local machine in /home, you have to watch out because the initial Solaris config assumes that you're in an enterprise environment and the convention is to have a filer somewhere that serves everyone's home directories which, with the default automount config, get mounted onto your machine's /home. Personally, when setting up a standalone box, I don't put home directories in /home just to avoid clobbering enterprise unix conventions. Gaëtan gave you the quick solution of just shutting off the automounter, which allows you to avoid addressing the problem this time around. --jake Yes, I just realized this... I feel quite silly now. I'm not used to the whole /home vs /export/home difference, and when you add zones to the mix it's quite confusing. I'm just playing around with this zone... to learn, but in the next REAL zone I'll probably: mount the home directories from the base system (this machine itself IS a file server, and the zone I intend to config will be an FTP server and possibly a BitTorrent client), or create a couple of standalone users which AREN'T in /home. This makes a lot more sense now... I also forgot to set a default router in my zone so I can't even connect to the internet right now. When I edit it with zonecfg can I just do: add net set defrouter=192.168.1.1 end Thanks again ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zones and other filesystems
add net set defrouter=192.168.1.1 end Thanks again I must be doing something wrong... I can access the zone on my network but I can't for the life of me get the zone to access the internet. I'm googling like crazy but maybe someone here knows what I'm doing wrong. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
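For a shared-IP zone, the net resource usually needs the zone's address and physical interface as well as the default router, so a defrouter-only fragment won't give the zone a route on its own. A sketch of adding the router to the net resource the zone already has (select it by whatever address the zone already uses; the address, interface and router below are only examples):

  # in the global zone
  zonecfg -z zone1
  select net address=192.168.1.50
  set defrouter=192.168.1.1
  end
  commit
  exit
  zoneadm -z zone1 reboot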
[zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller ?
Does anyone know if the current OpenSolaris mpt driver supports the recent LSI SAS2008 controller? This controller/ASIC is used in the next generation SAS-2 6Gbps PCIe cards from LSI and SuperMicro etc, e.g.: 1. SuperMicro AOC-USAS2-L8e and the AOC-USAS2-L8i 2. LSI SAS 9211-8i Cheers, Simon http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/ -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zones and other filesystems
Thomas Burgess wrote: I'm not used to the whole /home vs /export/home difference and when you add zones to the mix it's quite confusing. I'm just playing around with this zone.to learn but in the next REAL zone i'll probably: mount the home directories from the base system (this machine itself IS a file server, and the zone i intend to config will be a ftp server and possible a bit torrent client) or create a couple stand alone users which AREN't in /home This makes a lot more sense nowI also forgot to set a default router in my zone so i can't even connect to the internet right now.. When i edit it with zonecfg can i just do: add net set defrouter=192.168.1.1** end OK, so if you're the filer too, the automount system still works for you the same as it does for all other machines using automount - it'll nfs mount to itself, etc. Check out and follow the convention if you're so inclined. Then of course, it helps to become a nis or ldap expert too, which is a bit much to chew on if you're just here to check out zones, so your simplification above is fine, as is Gaëtan's original recommendation... At least until your network grows to the point that you start to notice the home dir chaos and can't hit nfs shares at will.. Then you have to go back and undo all your automount breakage. And yes, your zonecfg tweak should do the trick. But you don't have to take my word for it -- the experts hang out in zones-discuss ;) http://mail.opensolaris.org/mailman/listinfo/zones-discuss ttyl jake ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Degraded Zpool
Hi list, I have a serious issue with my zpool. My zpool consists of 4 devices which are assembled into 2 mirrors. One of these mirrors got degraded because of too many errors on each device of the mirror. Yes, both devices of the mirror got degraded. According to Murphy's law I don't have a current backup either (I have a backup which was made several months ago, and some backups spread across several disks). Neither of these backups is great, so I want to access my data on the zpool so I can make a backup and replace the OpenSolaris server. As the two faulted devices are connected to different controllers, I assume that the problem is located on the server, not on the hard disks/controllers. One of the faulted hard disks was replaced some weeks ago due to CRC errors, so I assume the server is bad, not the disks/cables/controllers. My state is as follows:

        NAME         STATE     READ WRITE CKSUM
        performance  DEGRADED     0     0     8
          mirror     DEGRADED     0     0    16
            c1t1d0   DEGRADED     0     0    23  too many errors
            c2d0     DEGRADED     0     0    24  too many errors
          mirror     ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     7
            c3d0     ONLINE       0     0     7

The disks (at least the c1t1/2 ones; I don't see the c2/3 ones via cfgadm) are online as far as I can see via cfgadm. Is there a possibility to force the two degraded devices online, so I can fully access the zpool and do a backup? I wanted to ask first, before doing anything stupid and losing the whole pool. Any suggestions are very welcome. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
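One non-destructive first step (not a fix for whatever is causing the errors, but it may confirm the pool is readable enough for a backup): DEGRADED devices, unlike FAULTED ones, are still in use, so you can clear the error counters and let a scrub re-verify everything - a sketch using the pool name from the status output above:

  zpool clear performance
  zpool scrub performance
  zpool status -v performance    # watch whether the checksum errors come back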
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
On Thu, 21 Jan 2010, Edward Ned Harvey wrote: Although it's not technically striped according to the RAID definition of striping, it does achieve the same performance result (actually better) so people will generally refer to this as striping anyway. People will say a lot of things, but that does not make them right. At some point, using the wrong terminology becomes foolish and counterproductive. Striping and load-share seem quite different to me. The difference is immediately apparent when watching the drive activity LEDs. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Dedup memory overhead
Hi all, I'm going to be trying out some tests using b130 for dedup on a server with about 1,7TB of usable storage (14x146 in two raidz vdevs of 7 disks). What I'm trying to get a handle on is how to estimate the memory overhead required for dedup on that amount of storage. From what I gather, the dedup hash keys are held in ARC and L2ARC and as such are in competition for the available memory. So the question is how much memory or L2ARC would be necessary to ensure that I'm never going back to disk to read out the hash keys. Better yet would be some kind of algorithm for calculating the overhead, e.g. an average block size of 4K = a hash key for every 4K stored, and a hash occupies 256 bits. An associated question is then how does the ARC handle competition between hash keys and regular ARC functions? Based on these estimations, I think that I should be able to calculate the following:

1,7 TB
1740,8 GB
1782579,2 MB
1825361100,8 KB
4 KB average block size
456340275,2 blocks
256 bits hash key size
1,16823E+11 bits hash key overhead
14602888806,4 bytes hash key overhead
14260633,6 KB hash key overhead
13926,4 MB hash key overhead
13,6 GB hash key overhead

Of course the big question on this will be the average block size - or better yet - to be able to analyze an existing datastore to see just how many blocks it uses and what is the current distribution of different block sizes. I'm currently playing around with zdb with mixed success on extracting this kind of data. That's also a worst case scenario since it's counting really small blocks and using 100% of available storage - highly unlikely.

# zdb -ddbb siovale/iphone
Dataset siovale/iphone [ZPL], ID 2381, cr_txg 3764691, 44.6G, 99 objects

ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0, flags 0x0

    Object  lvl  iblk   dblk   dsize  lsize  %full  type
         0    7   16K    16K   57.0K    64K  77.34  DMU dnode
         1    1   16K     1K   1.50K     1K 100.00  ZFS master node
         2    1   16K    512   1.50K    512 100.00  ZFS delete queue
         3    2   16K    16K   18.0K    32K 100.00  ZFS directory
         4    3   16K   128K    408M   408M 100.00  ZFS plain file
         5    1   16K    16K   3.00K    16K 100.00  FUID table
         6    1   16K     4K   4.50K     4K 100.00  ZFS plain file
         7    1   16K  6.50K   6.50K  6.50K 100.00  ZFS plain file
         8    3   16K   128K    952M   952M 100.00  ZFS plain file
         9    3   16K   128K    912M   912M 100.00  ZFS plain file
        10    3   16K   128K    695M   695M 100.00  ZFS plain file
        11    3   16K   128K    914M   914M 100.00  ZFS plain file

Now, if I'm understanding this output properly, object 4 is composed of 128KB blocks with a total size of 408MB, meaning that it uses 3264 blocks. Can someone confirm (or correct) that assumption? Also, I note that each object (as far as my limited testing has shown) has a single block size with no internal variation. Interestingly, all of my zvols seem to use fixed size blocks - that is, there is no variation in the block sizes - they're all the size defined on creation with no dynamic block sizes being used. I previously thought that the -b option set the maximum size, rather than fixing all blocks.
Learned something today :-)

# zdb -ddbb siovale/testvol
Dataset siovale/testvol [ZVOL], ID 45, cr_txg 4717890, 23.9K, 2 objects

    Object  lvl  iblk   dblk   dsize  lsize  %full  type
         0    7   16K    16K   21.0K    16K   6.25  DMU dnode
         1    1   16K    64K       0    64K   0.00  zvol object
         2    1   16K    512   1.50K    512 100.00  zvol prop

# zdb -ddbb siovale/tm-media
Dataset siovale/tm-media [ZVOL], ID 706, cr_txg 4426997, 240G, 2 objects

ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0, flags 0x0

    Object  lvl  iblk   dblk   dsize  lsize  %full  type
         0    7   16K    16K   21.0K    16K   6.25  DMU dnode
         1    5   16K     8K    240G   250G  97.33  zvol object
         2    1   16K    512   1.50K    512 100.00  zvol prop

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
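For a quick back-of-the-envelope number, the same arithmetic can be scripted - a rough sketch; ENTRY_BYTES=32 matches the bare 256-bit hash used in the estimate above, but a real DDT entry carries quite a bit more than the hash, so treat the result as a lower bound:

  #!/bin/ksh
  POOL_BYTES=1869169767219   # ~1,7 TB of usable space
  AVG_BLOCK=4096             # assumed average block size
  ENTRY_BYTES=32             # bare 256-bit hash; a full DDT entry is several times larger
  echo "$POOL_BYTES / $AVG_BLOCK * $ENTRY_BYTES" | bc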
Re: [zfs-discuss] New Supermicro SAS/SATA controller: AOC-USAS2-L8e in SOHO NAS and HD HT
That looks promising. As the main thing here is that OpenSolaris supports the LSI SAS2008 controller, I have created a new post to ask for confirmation of driver support -- see here: http://opensolaris.org/jive/thread.jspa?threadID=122156&tstart=0 Cheers, Simon http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/ -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS filesystem lock after running auto-replicate.ksh - how to clear?
Hi, I found this script for replicating zfs data: http://www.infrageeks.com/groups/infrageeks/wiki/8fb35/zfs_autoreplicate_script.html - I am testing it out in the lab with b129. It error-ed out the first run with some syntax error about the send component (recursive needed?) But I have not been able to run it again - it says the destination filesystem is locked: g...@lab-zfs-01:~ 10:50am 3 # ./auto-replicate.ksh data1/vms data1 lab-zfs-02 Destination filesystem data1/vms exists Filesystem locked, quitting: data1/vms g...@lab-zfs-01:~ 10:50am 4 How do I clear the lock - I have not been able to find documentation on this... thanks! ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
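I haven't used that particular script, but replication scripts of this kind typically record their in-progress flag as a ZFS user property on the destination filesystem. If that's the case here, something like this should show it and let you reset it (the property name below is purely hypothetical - use whatever name actually appears in the zfs get output):

  zfs get -s local all data1/vms              # list locally-set (user) properties
  zfs inherit com.example:locked data1/vms    # clear the lock property the script left behind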
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On Jan 21, 2010, at 3:55 AM, Julian Regel wrote: Until you try to pick one up and put it in a fire safe! Then you back up to tape from the x4540 whatever data you need. In the case of enterprise products you save on licensing here, as you need only one client license per x4540 but can in fact back up data from many clients which are stored there. Which brings us full circle... What do you then use to back up to tape, bearing in mind that the Sun-provided tools all have significant limitations? Poor choice of words. Sun resells NetBackup and (IIRC) that which was formerly called NetWorker. Thus, Sun does provide enterprise backup solutions. If I may put on my MBA hat, the competition is not ufsdump. ufsdump has nearly zero market penetration and no prospects for improving its market share. Making another ufsdump will also gain no market share. The market leaders are the likes of EMC, IBM, and Symantec with their heterogeneous backup support. If Sun wanted to provide a better solution that might gain market share against the others, then it would also need to be heterogeneous. So I think it would be hard to make a business case for a whole new backup solution. A less costly and less risky approach is to work with the market leaders to better integrate with dataset replication. Caveat: this may already be available, I haven't looked recently. I guess you need to use a third party tool and watch carefully that they provide complete backups. This is a good idea anyway. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] L2ARC in Cluster is picked up althought not part of the pool
On Jan 20, 2010, at 4:17 PM, Daniel Carosone wrote: On Wed, Jan 20, 2010 at 03:20:20PM -0800, Richard Elling wrote: Though the ARC case, PSARC/2007/618 is unpublished, I gather from googling and the source that L2ARC devices are considered auxiliary, in the same category as spares. If so, then it is perfectly reasonable to expect that it gets picked up regardless of the GUID. This also implies that it is shareable between pools until assigned. Brief testing confirms this behaviour. I learn something new every day :-) So, I suspect Lutz sees a race when both pools are imported onto one node. This still makes me nervous though... Yes. What if device reconfiguration renumbers my controllers, will l2arc suddenly start trashing a data disk? The same problem used to be a risk for swap, but less so now that we swap to named zvol. This will not happen unless the labels are rewritten on your data disk, and if that occurs, all bets are off. There's work afoot to make l2arc persistent across reboot, which implies some organised storage structure on the device. Fixing this shouldn't wait for that. Upon further review, the ruling on the field is confirmed ;-) The L2ARC is shared amongst pools just like the ARC. What is important is that at least one pool has a cache vdev. I suppose one could make the case that a new command is needed in addition to zpool and zfs (!) to manage such devices. But perhaps we can live with the oddity for a while? As such, for Lutz's configuration, I am now less nervous. If I understand correctly, you could add the cache vdev to rpool and forget about how it works with the shared pools. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
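In practice that is a one-liner, assuming your build allows cache devices on the root pool (the device name is only an example):

  zpool add rpool cache c2t5d0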
Re: [zfs-discuss] Dedup memory overhead
On Jan 21, 2010, at 8:04 AM, erik.ableson wrote: Hi all, I'm going to be trying out some tests using b130 for dedup on a server with about 1,7Tb of useable storage (14x146 in two raidz vdevs of 7 disks). What I'm trying to get a handle on is how to estimate the memory overhead required for dedup on that amount of storage. From what I gather, the dedup hash keys are held in ARC and L2ARC and as such are in competition for the available memory. ... and written to disk, of course. For ARC sizing, more is always better. So the question is how much memory or L2ARC would be necessary to ensure that I'm never going back to disk to read out the hash keys. Better yet would be some kind of algorithm for calculating the overhead. eg - averaged block size of 4K = a hash key for every 4k stored and a hash occupies 256 bits. An associated question is then how does the ARC handle competition between hash keys and regular ARC functions? AFAIK, there is no special treatment given to the DDT. The DDT is stored like other metadata and (currently) not easily accounted for. Also the DDT keys are 320 bits. The key itself includes the logical and physical block size and compression. The DDT entry is even larger. I think it is better to think of the ARC as caching the uncompressed DDT blocks which were written to disk. The number of these will be data dependent. zdb -S poolname will give you an idea of the number of blocks and how well dedup will work on your data, but that means you already have the data in a pool. -- richard Based on these estimations, I think that I should be able to calculate the following: 1,7 TB 1740,8GB 1782579,2 MB 1825361100,8 KB 4 average block size 456340275,2 blocks 256 hash key size-bits 1,16823E+11 hash key overhead - bits 1460206,4 hash key size-bytes 14260633,6hash key size-KB 13926,4 hash key size-MB 13,6 hash key overhead-GB Of course the big question on this will be the average block size - or better yet - to be able to analyze an existing datastore to see just how many blocks it uses and what is the current distribution of different block sizes. I'm currently playing around with zdb with mixed success on extracting this kind of data. That's also a worst case scenario since it's counting really small blocks and using 100% of available storage - highly unlikely. # zdb -ddbb siovale/iphone Dataset siovale/iphone [ZPL], ID 2381, cr_txg 3764691, 44.6G, 99 objects ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0, flags 0x0 Object lvl iblk dblk dsize lsize %full type 0716K16K 57.0K64K 77.34 DMU dnode 1116K 1K 1.50K 1K 100.00 ZFS master node 2116K512 1.50K512 100.00 ZFS delete queue 3216K16K 18.0K32K 100.00 ZFS directory 4316K 128K 408M 408M 100.00 ZFS plain file 5116K16K 3.00K16K 100.00 FUID table 6116K 4K 4.50K 4K 100.00 ZFS plain file 7116K 6.50K 6.50K 6.50K 100.00 ZFS plain file 8316K 128K 952M 952M 100.00 ZFS plain file 9316K 128K 912M 912M 100.00 ZFS plain file 10316K 128K 695M 695M 100.00 ZFS plain file 11316K 128K 914M 914M 100.00 ZFS plain file Now, if I'm understanding this output properly, object 4 is composed of 128KB blocks with a total size of 408MB, meaning that it uses 3264 blocks. Can someone confirm (or correct) that assumption? Also, I note that each object (as far as my limited testing has shown) has a single block size with no internal variation. Interestingly, all of my zvols seem to use fixed size blocks - that is, there is no variation in the block sizes - they're all the size defined on creation with no dynamic block sizes being used. 
I previously thought that the -b option set the maximum size, rather than fixing all blocks. Learned something today :-) # zdb -ddbb siovale/testvol Dataset siovale/testvol [ZVOL], ID 45, cr_txg 4717890, 23.9K, 2 objects Object lvl iblk dblk dsize lsize %full type 0716K16K 21.0K16K6.25 DMU dnode 1116K64K 064K0.00 zvol object 2116K512 1.50K512 100.00 zvol prop # zdb -ddbb siovale/tm-media Dataset siovale/tm-media [ZVOL], ID 706, cr_txg 4426997, 240G, 2 objects ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0, flags 0x0 Object lvl iblk dblk dsize lsize %full type 0716K16K 21.0K16K6.25 DMU dnode 1516K 8K 240G 250G 97.33 zvol object 2116K512 1.50K512 100.00 zvol prop ___
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Julian Regel wrote: Until you try to pick one up and put it in a fire safe! Then you backup to tape from x4540 whatever data you need. In case of enterprise products you save on licensing here as you need a one client license per x4540 but in fact can backup data from many clients which are there. Which brings up full circle... What do you then use to backup to tape bearing in mind that the Sun-provided tools all have significant limitations? In addition to Richard's comments, I doubt many medium to large businesses would use ufsdump/restore as their backup solution. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller ?
On 22/01/10 12:28 AM, Simon Breden wrote: Does anyone know if the current OpenSolaris mpt driver supports the recent LSI SAS2008 controller? This controller/ASIC is used in the next generation SAS-2 6Gbps PCIe cards from LSI and SuperMicro etc, e.g.: 1. SuperMicro AOC-USAS2-L8e and the AOC-USAS2-L8i 2. LSI SAS 9211-8i No, the 2nd generation non-RAID LSI SAS controllers make use of the mpt_sas(7d). Second generation RAID LSI SAS controllers use mr_sas(7d). Code for both of these drivers is Open and you can find it on src.opensolaris.org. James C. McPherson -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] 2gig file limit on ZFS?
Hi Folks, Situation, 64 bit Open Solaris on AMD. 2009-6 111b - I can't successfully update the OS. I've got three external 1.5 Tb drives in a raidz pool connected via USB. Hooked on to an IDE channel is a 750gig hard drive that I'm copying the data off. It is an ext3 drive from an Ubuntu server. Copying is being done on the machine using the cp command as root. So far, two files have failed... /mirror2/applications/Microsoft/Operating Systems/Virtual PC/vm/XP-SP2/XP-SP2 Hard Disk.vhd: File too large /mirror2/applications/virtualboximages/xp/xp.tar.bz2: File too large The files are... -rwxr-x--- 1 adminapplications 4177570654 Nov 4 08:02 xp.tar.bz2 -rwxr-x--- 1 adminapplications 2582259712 Feb 14 2007 XP-SP2 Hard Disk.vhd The system is a home server and contains files of all types and sizes. Any ideas please? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] 2gig file limit on ZFS?
CC'ed to ext3-disc...@opensolaris.org because this is an ext3 on Solaris issue. ZFS has no problem with large files, but the older ext3 did. See also the ext3 project page and documentation, especially http://hub.opensolaris.org/bin/view/Project+ext3/Project_status -- richard On Jan 21, 2010, at 11:58 AM, Michelle Knight wrote: Hi Folks, Situation, 64 bit Open Solaris on AMD. 2009-6 111b - I can't successfully update the OS. I've got three external 1.5 Tb drives in a raidz pool connected via USB. Hooked on to an IDE channel is a 750gig hard drive that I'm copying the data off. It is an ext3 drive from an Ubuntu server. Copying is being done on the machine using the cp command as root. So far, two files have failed... /mirror2/applications/Microsoft/Operating Systems/Virtual PC/vm/XP-SP2/XP-SP2 Hard Disk.vhd: File too large /mirror2/applications/virtualboximages/xp/xp.tar.bz2: File too large The files are... -rwxr-x--- 1 adminapplications 4177570654 Nov 4 08:02 xp.tar.bz2 -rwxr-x--- 1 adminapplications 2582259712 Feb 14 2007 XP-SP2 Hard Disk.vhd The system is a home server and contains files of all types and sizes. Any ideas please? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller
Thanks a lot for the info James. For the benefit of myself and others then: 1. mpt_sas driver is used for the SuperMicro AOC-USAS2-L8e 2. mr_sas driver is used for the SuperMicro AOC-USAS2-L8i and LSI SAS 9211-8i And how does the maturity/robustness of the mpt_sas mr_sas drivers compare to the mpt driver which I'm currently using for my LSI 1068-based AOC-USAS-L8i card? (in the default IT mode) It might be hard to answer that one, but I thought I'd ask anyway, as it would make choosing new kit for OpenSolaris + ZFS a bit easier. Cheers, Simon http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/ -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Dedup memory overhead
On Thu, Jan 21, 2010 at 10:00 PM, Richard Elling richard.ell...@gmail.com wrote: On Jan 21, 2010, at 8:04 AM, erik.ableson wrote: Hi all, I'm going to be trying out some tests using b130 for dedup on a server with about 1,7Tb of useable storage (14x146 in two raidz vdevs of 7 disks). What I'm trying to get a handle on is how to estimate the memory overhead required for dedup on that amount of storage. From what I gather, the dedup hash keys are held in ARC and L2ARC and as such are in competition for the available memory. ... and written to disk, of course. For ARC sizing, more is always better. So the question is how much memory or L2ARC would be necessary to ensure that I'm never going back to disk to read out the hash keys. Better yet would be some kind of algorithm for calculating the overhead. eg - averaged block size of 4K = a hash key for every 4k stored and a hash occupies 256 bits. An associated question is then how does the ARC handle competition between hash keys and regular ARC functions? AFAIK, there is no special treatment given to the DDT. The DDT is stored like other metadata and (currently) not easily accounted for. Also the DDT keys are 320 bits. The key itself includes the logical and physical block size and compression. The DDT entry is even larger. Looking at dedupe code, I noticed that on-disk DDT entries are compressed less efficiently than possible: key is not compressed at all (I'd expect roughly 2:1 compression ration with sha256 data), while other entry data is currently passed through zle compressor only (I'd expect this one to be less efficient than off-the-shelf compressors, feel free to correct me if I'm wrong). Is this v1, going to be improved in the future? Further, with huge dedupe memory footprint and heavy performance impact when DDT entries need to be read from disk, it might be worthwhile to consider compression of in-core ddt entries (specifically for DDTs or, more generally, making ARC/L2ARC compression-aware). Has this been considered? Regards, Andrey I think it is better to think of the ARC as caching the uncompressed DDT blocks which were written to disk. The number of these will be data dependent. zdb -S poolname will give you an idea of the number of blocks and how well dedup will work on your data, but that means you already have the data in a pool. -- richard Based on these estimations, I think that I should be able to calculate the following: 1,7 TB 1740,8 GB 1782579,2 MB 1825361100,8 KB 4 average block size 456340275,2 blocks 256 hash key size-bits 1,16823E+11 hash key overhead - bits 1460206,4 hash key size-bytes 14260633,6 hash key size-KB 13926,4 hash key size-MB 13,6 hash key overhead-GB Of course the big question on this will be the average block size - or better yet - to be able to analyze an existing datastore to see just how many blocks it uses and what is the current distribution of different block sizes. I'm currently playing around with zdb with mixed success on extracting this kind of data. That's also a worst case scenario since it's counting really small blocks and using 100% of available storage - highly unlikely. 
# zdb -ddbb siovale/iphone Dataset siovale/iphone [ZPL], ID 2381, cr_txg 3764691, 44.6G, 99 objects ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0, flags 0x0 Object lvl iblk dblk dsize lsize %full type 0 7 16K 16K 57.0K 64K 77.34 DMU dnode 1 1 16K 1K 1.50K 1K 100.00 ZFS master node 2 1 16K 512 1.50K 512 100.00 ZFS delete queue 3 2 16K 16K 18.0K 32K 100.00 ZFS directory 4 3 16K 128K 408M 408M 100.00 ZFS plain file 5 1 16K 16K 3.00K 16K 100.00 FUID table 6 1 16K 4K 4.50K 4K 100.00 ZFS plain file 7 1 16K 6.50K 6.50K 6.50K 100.00 ZFS plain file 8 3 16K 128K 952M 952M 100.00 ZFS plain file 9 3 16K 128K 912M 912M 100.00 ZFS plain file 10 3 16K 128K 695M 695M 100.00 ZFS plain file 11 3 16K 128K 914M 914M 100.00 ZFS plain file Now, if I'm understanding this output properly, object 4 is composed of 128KB blocks with a total size of 408MB, meaning that it uses 3264 blocks. Can someone confirm (or correct) that assumption? Also, I note that each object (as far as my limited testing has shown) has a single block size with no internal variation. Interestingly, all of my zvols seem to use fixed size blocks - that is, there is no variation in the block sizes - they're all the size defined on creation with no dynamic block sizes being used. I previously thought that the -b option set the maximum size, rather than fixing all blocks. Learned something today :-) # zdb -ddbb
Re: [zfs-discuss] 2gig file limit on ZFS?
Apologies for not explaining myself correctly; I'm copying from ext3 onto ZFS - it appears to my amateur eyes that it is ZFS that is having the problem. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller
On 22/01/10 06:14 AM, Simon Breden wrote: Thanks a lot for the info James. For the benefit of myself and others then: 1. mpt_sas driver is used for the SuperMicro AOC-USAS2-L8e 2. mr_sas driver is used for the SuperMicro AOC-USAS2-L8i and LSI SAS 9211-8i Correct. I only know the internal chip code names, not what the actual shipping products are called :| And how does the maturity/robustness of the mpt_sas mr_sas drivers compare to the mpt driver which I'm currently using for my LSI 1068-based AOC-USAS-L8i card? (in the default IT mode) It might be hard to answer that one, but I thought I'd ask anyway, as it would make choosing new kit for OpenSolaris + ZFS a bit easier. I really don't have any specs re maturity or robustness, sorry, but I can tell you that (a) these two drivers were joint development efforts between Sun and LSI, (b) the requirements list that we had for the drivers is extensive, [note that MPxIO is on by default with mpt_sas] (c) we went through an insane amount of testing (and with some very rigorous tools) at every stage of the cycle before integration, and (d) we're confident that you'll find these drivers and chips to be up to the task. If you do come across problems, please bring it up in storage-discuss or zfs-discuss, and if necessary file a bug on bugs.opensolaris.org solaris/driver/mpt-sas, and solaris/driver/mr_sas are the two subcats that you'll need in that case. James C. McPherson -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] L2ARC in Cluster is picked up althought not part of the pool
On Thu, Jan 21, 2010 at 09:36:06AM -0800, Richard Elling wrote: On Jan 20, 2010, at 4:17 PM, Daniel Carosone wrote: On Wed, Jan 20, 2010 at 03:20:20PM -0800, Richard Elling wrote: Though the ARC case, PSARC/2007/618 is unpublished, I gather from googling and the source that L2ARC devices are considered auxiliary, in the same category as spares. If so, then it is perfectly reasonable to expect that it gets picked up regardless of the GUID. This also implies that it is shareable between pools until assigned. Brief testing confirms this behaviour. I learn something new every day :-) So, I suspect Lutz sees a race when both pools are imported onto one node. This still makes me nervous though... Yes. What if device reconfiguration renumbers my controllers, will l2arc suddenly start trashing a data disk? The same problem used to be a risk for swap, but less so now that we swap to named zvol. This will not happen unless the labels are rewritten on your data disk, and if that occurs, all bets are off. It occurred to me later yesterday, while offline, that the pool in question might have autoreplace=on set. If that were true, it would explain why a disk in the same controller slot was overwritten and used. Lutz, is the pool autoreplace property on? If so, god help us all is no longer quite so necessary. There's work afoot to make l2arc persistent across reboot, which implies some organised storage structure on the device. Fixing this shouldn't wait for that. Upon further review, the ruling on the field is confirmed ;-) The L2ARC is shared amongst pools just like the ARC. What is important is that at least one pool has a cache vdev. Wait, huh? That's a totally separate issue from what I understood from the discussion. What I was worried about was that disk Y, that happened to have the same cLtMdN address as disk X on another node, was overwritten and trashed on import to become l2arc. Maybe I missed some other detail in the thread and reached the wrong conclusion? As such, for Lutz's configuration, I am now less nervous. If I understand correctly, you could add the cache vdev to rpool and forget about how it works with the shared pools. The fact that l2arc devices could be caching data from any pool in the system is .. a whole different set of (mostly performance) wrinkles. For example, if I have a pool of very slow disks (usb or remote iscsi), and a pool of faster disks, and l2arc for the slow pool on the same faster disks, it's pointless having the faster pool using l2arc on the same disks or even the same type of disks. I'd need to set the secondarycache properties of one pool according to the configuration of another. I suppose one could make the case that a new command is needed in addition to zpool and zfs (!) to manage such devices. But perhaps we can live with the oddity for a while? This part, I expect, will be resolved or clarified as part of the l2arc persistence work, since then their attachment to specific pools will need to be clear and explicit. Perhaps the answer is that the cache devices become their own pool (since they're going to need filesystem-like structured storage anyway). The actual cache could be a zvol (or new object type) within that pool, and then (if necessary) an association is made between normal pools and the cache (especially if I have multiple of them). No new top-level commands needed. -- Dan. pgp0MK26F4Jvy.pgp Description: PGP signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] 2gig file limit on ZFS?
Fair enough. So where do you think my problem lies? Do you think it could be a limitation of the driver I loaded to read the ext3 partition? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller
Correct. I only know the internal chip code names, not what the actual shipping products are called :| Now 'knew' ;-) It's reassuring to hear your points a thru d regarding the development/test cycle. I could always use the 'try before you buy' approach: others try it, and if it works, I buy it ;-) Thanks a lot. Simon http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/ -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] 2gig file limit on ZFS?
Michelle Knight wrote: Fair enough. So where do you think my problem lies? Do you think it could be a limitation of the driver I loaded to read the ext3 partition? Without knowing exactly what commands you typed and exactly what error messages they produced, and which directories/files are on which types of file systems, we're limited to guessing. -- Andrew ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Dedup memory overhead
On Thu, Jan 21, 2010 at 05:04:51PM +0100, erik.ableson wrote: What I'm trying to get a handle on is how to estimate the memory overhead required for dedup on that amount of storage. We'd all appreciate better visibility of this. This requires: - time and observation and experience, and - better observability tools and (probably) data exposed for them So the question is how much memory or L2ARC would be necessary to ensure that I'm never going back to disk to read out the hash keys. I think that's a wrong-goal for optimisation. For performance (rather than space) issues, I look at dedup as simply increasing the size of the working set, with a goal of reducing the amount of IO (avoided duplicate writes) in return. If saving one large async write costs several small sync reads, you fall off a very steep performance cliff, especially for IOPS-limited seeking media. However, it doesn't matter whether those reads are for DDT entries or other filesystem metadata necessary to complete the write. Nor does it even matter if those reads are data reads, for other processes that have been pushed out of ARC because of the larger working set. So I think it's right that arc doesn't treat DDT entries specially. The trouble is that the hash function produces (we can assume) random hits across the DDT, so the working set depends on the amount of data and the rate of potentially dedupable writes as well as the actual dedup hit ratio. A high rate of writes also means a large amount of data in ARC waiting to be written at the same time. This makes analysis very hard (and pushes you very fast towards that very steep cliff, as we've all seen). Separately, what might help is something like dedup=opportunistic that would keep the working set smaller: - dedup the block IFF the DDT entry is already in (l2)arc - otherwise, just write another copy - maybe some future async dedup cleaner, using bp-rewrite, to tidy up later. I'm not sure what, in this scheme, would ever bring DDT entries into cache, though. Reads for previously dedup'd data? I also think a threshold on the size of blocks to try deduping would help. If I only dedup blocks (say) 64k and larger, i might well get most of the space benefit for much less overhead. -- Dan. pgpfZ1iTPb0nB.pgp Description: PGP signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] 2gig file limit on ZFS?
The error messages are in the original post. They are... /mirror2/applications/Microsoft/Operating Systems/Virtual PC/vm/XP-SP2/XP-SP2 Hard Disk.vhd: File too large /mirror2/applications/virtualboximages/xp/xp.tar.bz2: File too large The system installed to read the EXT3 system is here - http://blogs.sun.com/pradhap/entry/mount_ntfs_ext2_ext3_in The ZFS partition is on /mirror The EXT3 partition is on /mirror2 The command to start the copy is... cp -R /mirror2/* . ...while being CD'd to /mirror and logged in as root. Anything else I can get that would help this? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] 2gig file limit on ZFS?
On Thu, Jan 21, 2010 at 01:55:53PM -0800, Michelle Knight wrote: The error messages are in the original post. They are... /mirror2/applications/Microsoft/Operating Systems/Virtual PC/vm/XP-SP2/XP-SP2 Hard Disk.vhd: File too large /mirror2/applications/virtualboximages/xp/xp.tar.bz2: File too large The system installed to read the EXT3 system is here - http://blogs.sun.com/pradhap/entry/mount_ntfs_ext2_ext3_in The ZFS partition is on /mirror The EXT3 partition is on /mirror2 Which is the path in the error filename. You're having trouble reading the file off ext3 - you can verify this by trying something like cat'ing the file to /dev/null. The command to start the copy is... cp -R /mirror2/* . ...while being CD'd to /mirror and logged in as root. Anything else I can get that would help this? Best would be to plug the ext3 disk into something that can read it fully, and copy over the network. Linux, NetBSD, maybe a newer opensolaris. Note that this could be running in a VM on the same box, if necessary. -- Dan. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
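Concretely, a way to test that theory before moving any disks is to read one of the failing files straight off the ext3 mount, taking ZFS out of the picture entirely (path taken from the error messages above):

# ls -l "/mirror2/applications/virtualboximages/xp/xp.tar.bz2"
# cat "/mirror2/applications/virtualboximages/xp/xp.tar.bz2" > /dev/null

If ls under-reports the size, or the cat stops with an error somewhere around the 2GB mark, the ext3 driver is the limiting factor; if it reads cleanly, suspicion moves back to the copy into ZFS.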
Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller
We tried the new LSI controllers in our configuration, trying to replace Areca 1680 controllers. The tests were done on 2009.06. Unlike the mpt driver, which was rock solid (but obviously does not support the new chips), the mr_sas was a complete disaster. (We got ours from the LSI website.) Timeouts, missing drives, errors in /var/adm/messages. The driver may have stabilized since then, but I wouldn't use it in production yet. 2010/03 - maybe, but not 2009.06. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] 2gig file limit on ZFS?
| On Thu, Jan 21, 2010 at 01:55:53PM -0800, Michelle Knight wrote: Anything else I can get that would help this? split(1)? :-) -- bda cyberpunk is dead. long live cyberpunk. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] need a few suggestions for a poor man's ZIL/SLOG device
PS: For data that you want to mostly archive, consider using the Amazon Web Services (AWS) S3 service. Right now there is no charge to push data into the cloud, and it's $0.15/gigabyte to keep it there. Do a quick (back of the napkin) calculation on what storage you can get for $30/month and factor in bandwidth costs (to pull the data when/if you need it). My napkin calculations tell me that I cannot compete with AWS S3 for up to 100GB of storage available 7x24. Even the electric utility bill would be more than AWS charges - especially when you consider UPS and air conditioning. And that's not including any hardware (capital equipment) costs! See: http://aws.amazon.com/s3/ When going the Amazon route, you always need to take into account retrieval time/bandwidth cost. If you were to store 100GB on Amazon - how fast can you get your data back, and how much would bandwidth cost you to retrieve it in a timely manner? It is all a matter of requirements, of course. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Zpool is a bit Pessimistic at failures
Hello, Has anyone else noticed that zpool is kind of negative when reporting back from some error conditions? Like:

cannot import 'zpool01': I/O error
Destroy and re-create the pool from a backup source.

or even worse:

cannot import 'rpool': pool already exists
Destroy and re-create the pool from a backup source.

The first one I got when doing some failure testing on my new storage node: I pulled several disks from a raidz2 to simulate loss of connectivity, and lastly pulled a third one which, as expected, made the pool unusable; I later exported the pool. But when I reconnected one of the first two drives and tried an import, I got this message. The pool was fine once I reconnected the last disk to fail, so the message seems a bit pessimistic. The second one I got when importing an old rpool with altroot but forgetting to specify a new name for the pool; the solution of just giving the pool a new name was much better than recreating the pool and restoring from backup. I think this could scare new users, or even make them do terrible things, when the errors could actually be fixed. I think I'll file a bug, agree? Henrik http://sparcv9.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
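For the record, the fix for the second case is a one-liner - zpool import accepts a new pool name on import, so something along these lines (the altroot path and new name are only examples):

# zpool import -R /a rpool rpool-old

which is rather friendlier than destroying and re-creating the pool from a backup source.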
Re: [zfs-discuss] Dedup memory overhead
On Fri, Jan 22, 2010 at 08:55:16AM +1100, Daniel Carosone wrote: For performance (rather than space) issues, I look at dedup as simply increasing the size of the working set, with a goal of reducing the amount of IO (avoided duplicate writes) in return. I should add and avoided future duplicate reads in those parentheses as well. A CVS checkout, with identical CVS/Root files in every directory, is a great example. Every one of those files is read on cvs update. Developers often have multiple checkouts (different branches) from the same server. Good performance gains can be had by avoiding potentially many thousands of extra reads and cache entries, whether with dedup or simply by hardlinking them all together. I've hit the 64k limit on hardlinks to the one file more than once with this, on bsd FFS. It's not a great example for my suggestion of a threshold lower blocksize for dedup, however :-/ -- Dan. pgpleAwmVO8zb.pgp Description: PGP signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] need a few suggestions for a poor man's ZIL/SLOG device
On Thu, Jan 21, 2010 at 02:11:31PM -0800, Moshe Vainer wrote: PS: For data that you want to mostly archive, consider using Amazon Web Services (AWS) S3 service. Right now there is no charge to push data into the cloud and its $0.15/gigabyte to keep it there. Do a quick (back of the napkin) calculation on what storage you can get for $30/month and factor in bandwidth costs (to pull the data when/if you need it). My napkin calculations tell me that I cannot compete with AWS S3 for up to 100Gb of storage available 7x24. Even the electric utility bill would be more than AWS charges - especially when you consider UPS and air conditioning. And thats not including any hardware (capital equipment) costs! see: http://aws.amazon.com/s3/ When going the amazon route, you always need to take into account retrieval time/bandwidth cost. If you were to store 100GB on Amazon - how fast can you get your data back, or how much would bandwidth cost you to retrieve it in a timely manner. It is all a matter of requirements of course. Don't forget asymmetric upload/download bandwidth. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] hard drive choice, TLER/ERC/CCTL
+1 I agree 100% I have a website whose ZFS Home File Server articles are read around 1 million times a year, and so far I have recommended Western Digital drives wholeheartedly, as I have found them to work flawlessly within my RAID system using ZFS. With this recent action by Western Digital of disabling the ability to time-limit the error reporting period, thus effectively forcing consumer RAID users to buy their RAID-version drives at 50%-100% price premium, I have decided not to use Western Digital drives any longer, and have explained why here: http://breden.org.uk/2009/05/01/home-fileserver-a-year-in-zfs/ (look in the Drives section) Like yourself, I too am searching for consumer-priced drives where it's still possible to set the error reporting period. I'm also looking at the Samsung models at the moment -- either the HD154UI 1.5TB drive or the HD203WI 2TB drives... and if it's possible to set the error reporting time then these will be my next purchase. They have quite good user ratings at newegg.com... If WD lose money over this, they might rethink their strategy. Until then, bye bye WD. Cheers, Simon http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/ -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] L2ARC in Cluster is picked up althought not part of the pool
[Richard makes a hobby of confusing Dan :-)] more below.. On Jan 21, 2010, at 1:13 PM, Daniel Carosone wrote: On Thu, Jan 21, 2010 at 09:36:06AM -0800, Richard Elling wrote: On Jan 20, 2010, at 4:17 PM, Daniel Carosone wrote: On Wed, Jan 20, 2010 at 03:20:20PM -0800, Richard Elling wrote: Though the ARC case, PSARC/2007/618 is unpublished, I gather from googling and the source that L2ARC devices are considered auxiliary, in the same category as spares. If so, then it is perfectly reasonable to expect that it gets picked up regardless of the GUID. This also implies that it is shareable between pools until assigned. Brief testing confirms this behaviour. I learn something new every day :-) So, I suspect Lutz sees a race when both pools are imported onto one node. This still makes me nervous though... Yes. What if device reconfiguration renumbers my controllers, will l2arc suddenly start trashing a data disk? The same problem used to be a risk for swap, but less so now that we swap to named zvol. This will not happen unless the labels are rewritten on your data disk, and if that occurs, all bets are off. It occurred to me later yesterday, while offline, that the pool in question might have autoreplace=on set. If that were true, it would explain why a disk in the same controller slot was overwritten and used. Lutz, is the pool autoreplace property on? If so, god help us all is no longer quite so necessary. I think this is a different issue. But since the label in a cache device does not associate it with a pool, it is possible that any pool which expects a cache will find it. This seems to be as designed. There's work afoot to make l2arc persistent across reboot, which implies some organised storage structure on the device. Fixing this shouldn't wait for that. Upon further review, the ruling on the field is confirmed ;-) The L2ARC is shared amongst pools just like the ARC. What is important is that at least one pool has a cache vdev. Wait, huh? That's a totally separate issue from what I understood from the discussion. What I was worried about was that disk Y, that happened to have the same cLtMdN address as disk X on another node, was overwritten and trashed on import to become l2arc. Maybe I missed some other detail in the thread and reached the wrong conclusion? As such, for Lutz's configuration, I am now less nervous. If I understand correctly, you could add the cache vdev to rpool and forget about how it works with the shared pools. The fact that l2arc devices could be caching data from any pool in the system is .. a whole different set of (mostly performance) wrinkles. For example, if I have a pool of very slow disks (usb or remote iscsi), and a pool of faster disks, and l2arc for the slow pool on the same faster disks, it's pointless having the faster pool using l2arc on the same disks or even the same type of disks. I'd need to set the secondarycache properties of one pool according to the configuration of another. Don't use slow devices for L2ARC. Secondarycache is a dataset property, not a pool property. You can definitely manage the primary and secondary cache policies for each dataset. I suppose one could make the case that a new command is needed in addition to zpool and zfs (!) to manage such devices. But perhaps we can live with the oddity for a while? This part, I expect, will be resolved or clarified as part of the l2arc persistence work, since then their attachment to specific pools will need to be clear and explicit. 
Since the ARC is shared amongst all pools, it makes sense to share L2ARC amongst all pools. Perhaps the answer is that the cache devices become their own pool (since they're going to need filesystem-like structured storage anyway). The actual cache could be a zvol (or new object type) within that pool, and then (if necessary) an association is made between normal pools and the cache (especially if I have multiple of them). No new top-level commands needed. I propose a best practice of adding the cache device to rpool and be happy. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
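For completeness, a minimal sketch of that best practice - the device name is only a placeholder for whatever SSD is spare on your system:

# zpool add rpool cache c2t5d0
# zpool status rpool
# zpool iostat -v rpool 5

zpool status should show the device under a cache heading, and zpool iostat -v lets you watch it warm up as reads start hitting it.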
Re: [zfs-discuss] 2gig file limit on ZFS?
On Thu, Jan 21, 2010 at 02:54:21PM -0800, Richard Elling wrote: + support file systems larger than 2GiB, include 32-bit UIDs and GIDs File systems, but what about individual files within? -- Dan. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller
Ouch. Was that on the original 2009.06 vanilla install, or a later updated build? Hopefully a lot of the original bugs have been fixed by now, or soon will be. Has anyone got any from the trenches experience of using the mpt_sas driver? Any comments? Cheers, Simon http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/ -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Zpool is a bit Pessimistic at failures
On Thu, Jan 21, 2010 at 11:14:33PM +0100, Henrik Johansson wrote: I think this could scare or even make new users do terrible things, even if the errors could be fixed. I think I'll file a bug, agree? Yes, very much so. -- Dan. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] 2gig file limit on ZFS?
On Jan 21, 2010, at 6:47 PM, Daniel Carosone d...@geek.com.au wrote: On Thu, Jan 21, 2010 at 02:54:21PM -0800, Richard Elling wrote: + support file systems larger than 2GiB, include 32-bit UIDs and GIDs file systems, but what about individual files within? I think the original author meant files bigger than 2GiB and file systems bigger than 2TiB. I don't know why that wasn't built in from the start - it's been out for a long, long time now, between 5 and 10 years if I had to guess. -Ross ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] L2ARC in Cluster is picked up althought not part of the pool
On Thu, Jan 21, 2010 at 03:33:28PM -0800, Richard Elling wrote: [Richard makes a hobby of confusing Dan :-)] Heh. Lutz, is the pool autoreplace property on? If so, god help us all is no longer quite so necessary. I think this is a different issue. I agree. For me, it was the main issue, and I still want clarity on it. However, at this point I'll go back to the start of the thread and look at what was actually reported again in more detail. But since the label in a cache device does not associate it with a pool, it is possible that any pool which expects a cache will find it. This seems to be as designed. Hm. My recollection was that node b's disk in that controller slot was totally unlabelled, but perhaps I'm misremembering.. as above. For example, if I have a pool of very slow disks (usb or remote iscsi), and a pool of faster disks, and l2arc for the slow pool on the same faster disks, it's pointless having the faster pool using l2arc on the same disks or even the same type of disks. I'd need to set the secondarycache properties of one pool according to the configuration of another. Don't use slow devices for L2ARC. Slow is entirely relative, as we discussed here just recently. They just need to be faster than the pool devices I want to cache. The wrinkle here is that it's now clear they should be faster than the devices in all other pools as well (or I need to take special measures). Faster is better regardless, and suitable l2arc ssd's are cheap enough now. It's mostly academic that, previously, faster/local hard disks were fast enough, since now you can have both. Secondarycache is a dataset property, not a pool property. You can definitely manage the primary and secondary cache policies for each dataset. Yeah, properties of the root fs and of the pool are easily conflated. such devices. But perhaps we can live with the oddity for a while? This part, I expect, will be resolved or clarified as part of the l2arc persistence work, since then their attachment to specific pools will need to be clear and explicit. Since the ARC is shared amongst all pools, it makes sense to share L2ARC amongst all pools. Of course it does - apart from the wrinkles we now know we need to watch out for. Perhaps the answer is that the cache devices become their own pool (since they're going to need filesystem-like structured storage anyway). The actual cache could be a zvol (or new object type) within that pool, and then (if necessary) an association is made between normal pools and the cache (especially if I have multiple of them). No new top-level commands needed. I propose a best practice of adding the cache device to rpool and be happy. It is *still* not that simple. Forget my slow disks caching an even slower pool (which is still fast enough for my needs, thanks to the cache and zil). Consider a server config thus: - two MLC SSDs (x25-M, OCZ Vertex, whatever) - SSDs partitioned in two, mirrored rpool 2x l2arc - a bunch of disks for a data pool This is a likely/common configuration, commodity systems being limited mostly by number of sata ports. I'd even go so far as to propose it as another best practice, for those circumstances. Now, why would I waste l2arc space, bandwidth, and wear cycles to cache rpool to the same ssd's that would be read on a miss anyway? So, there's at least one more step required for happiness: # zfs set secondarycache=none rpool (plus relying on property inheritance through the rest of rpool) -- Dan. 
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
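A quick way to check the inheritance Dan is relying on, once the property is set (dataset names will of course differ per system):

# zfs set secondarycache=none rpool
# zfs get -r secondarycache rpool

Everything under rpool should report the value as inherited from rpool, while datasets in the data pool keep the default of all.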
Re: [zfs-discuss] hard drive choice, TLER/ERC/CCTL
And I agree as well. WD was about to get upwards of $500-$700 of my money, and is now getting zero over this issue alone moving me to look harder for other drives. I'm sure a WD rep would tell us about how there are extra unseen goodies in the RE line. Maybe. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] hard drive choice, TLER/ERC/CCTL
Thanks! Yep, I was about to buy six or so WD15EADS or WD15EARS drives, but it looks like I will not be ordering them now. The bad news is that, after looking at the Samsungs, it seems they too have no way of changing the error reporting time in the 'desktop' drives. I hope I'm wrong though. I refuse to pay silly money for 'raid editions' of these drives. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller
Vanilla 2009.06, mr_sas drivers from LSI website. To answer your other question - the mpt driver is very solid on 2009.06 -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller
On Thu, Jan 21, 2010 at 7:37 PM, Moshe Vainer mvai...@doyenz.com wrote: Vanilla 2009.06, mr_sas drivers from LSI website. To answer your other question - the mpt driver is very solid on 2009.06 Are you sure those are the open source drivers he's referring to? LSI has a habit of releasing their own drivers with similar names. It sounds to me like that's what you were using. On that front, exactly where did you find the driver? They have nothing listed on the downloads page: http://lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/internal/sas9211-8i/index.html?locale=ENremote=1 -- --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] L2ARC in Cluster is picked up althought not part of the pool
On Jan 21, 2010, at 4:32 PM, Daniel Carosone wrote: I propose a best practice of adding the cache device to rpool and be happy. It is *still* not that simple. Forget my slow disks caching an even slower pool (which is still fast enough for my needs, thanks to the cache and zil). Consider a server config thus: - two MLC SSDs (x25-M, OCZ Vertex, whatever) - SSDs partitioned in two, mirrored rpool 2x l2arc - a bunch of disks for a data pool This is a likely/common configuration, commodity systems being limited mostly by number of sata ports. I'd even go so far as to propose it as another best practice, for those circumstances. Now, why would I waste l2arc space, bandwidth, and wear cycles to cache rpool to the same ssd's that would be read on a miss anyway? So, there's at least one more step required for happiness: # zfs set secondarycache=none rpool (plus relying on property inheritance through the rest of rpool) I agree with this, except for the fact that the most common installers (LiveCD, Nexenta, etc.) use the whole disk for rpool[1]. So the likely and common configuration today is moving towards one whole root disk. That could change in the future. [1] Solaris 10? well... since installation hard anyway, might as well do this. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller
On Thu, Jan 21, 2010 at 8:05 PM, Moshe Vainer mvai...@doyenz.com wrote: http://lsi.com/storage_home/products_home/internal_raid/megaraid_sas/6gb_s_value_line/sas9260-8i/index.html 2009.06 didn't have the drivers integrated, so those aren't the open source ones. As i said, it is possible that 2010.03 will resolve this. But we do not put development releases in production. You should probably make that clear from the start then. You just bashed the opensource drivers based on your experience with something completely different. -- --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] L2ARC in Cluster is picked up althought not part of the pool
On Thu, Jan 21, 2010 at 05:52:57PM -0800, Richard Elling wrote: I agree with this, except for the fact that the most common installers (LiveCD, Nexenta, etc.) use the whole disk for rpool[1]. Er, no. You certainly get the option of using the whole disk or making partitions, at least with the opensolaris livecd. -- Dan. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zfs zvol available space vs used space vs reserved space
Hello all, I have a small issue with zfs. I created a 1TB volume.

# zfs get all tank/test01
NAME         PROPERTY              VALUE                  SOURCE
tank/test01  type                  volume                 -
tank/test01  creation              Thu Jan 21 15:05 2010  -
tank/test01  used                  1T                     -
tank/test01  available             2.26T                  -
tank/test01  referenced            79.4G                  -
tank/test01  compressratio         1.00x                  -
tank/test01  reservation           none                   default
tank/test01  volsize               1T                     -
tank/test01  volblocksize          8K                     -
tank/test01  checksum              on                     default
tank/test01  compression           off                    default
tank/test01  readonly              off                    default
tank/test01  shareiscsi            off                    default
tank/test01  copies                1                      default
tank/test01  refreservation        1T                     local
tank/test01  primarycache          all                    default
tank/test01  secondarycache        all                    default
tank/test01  usedbysnapshots       0                      -
tank/test01  usedbydataset         79.4G                  -
tank/test01  usedbychildren        0                      -
tank/test01  usedbyrefreservation  945G                   -

What bugs me is the available: 2.26T. Any ideas on why that is? Thanks, Younes -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs zvol available space vs used space vs reserved space
On Thu, Jan 21, 2010 at 07:33:47PM -0800, Younes wrote: Hello all, I have a small issue with zfs. I create a volume 1TB.

# zfs get all tank/test01
NAME         PROPERTY              VALUE  SOURCE
tank/test01  used                  1T     -
tank/test01  available             2.26T  -
tank/test01  referenced            79.4G  -
tank/test01  reservation           none   default
tank/test01  refreservation        1T     local
tank/test01  usedbydataset         79.4G  -
tank/test01  usedbychildren        0      -
tank/test01  usedbyrefreservation  945G   -

I've trimmed the less relevant properties. What bugs me is the available: 2.26T. Any ideas on why that is? That's the available space in the rest of the pool. This includes space that could be used (i.e., available for) potential snapshots of the volume (which would show in usedbychildren), since the volume size is a refreservation, not a reservation. -- Dan. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
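One way to see this for yourself is to put the dataset numbers next to the pool's, and - if the up-front 1T refreservation isn't wanted - create the zvol sparse instead (tank/sparse01 is just an example name):

# zfs get used,available,refreservation,usedbyrefreservation tank/test01
# zpool list tank
# zfs create -s -V 1T tank/sparse01

A sparse (-s) volume carries no refreservation, so its used only grows as blocks are actually written - at the price of possibly running the pool out of space underneath the volume later.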
[zfs-discuss] question about which build to install
I installed b130 on my server, and I'm being hit by this bug: http://defect.opensolaris.org/bz/show_bug.cgi?id=13540 where I can't log into gnome. I've been trying to deal with it, hoping that a workaround would show up... if there IS a workaround, I'd love to have it... if not, I'm wondering: is there another version I can downgrade to? I'm pretty new to opensolaris and I've tried to google to find this answer but I can't find it. My zpool is version 22. Thanks for any help. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS loses configuration
I have just installed EON .599 on a machine with a 6 disk raidz2 configuration. I run updimg after creating a zpool. When I reboot, and attempt to run 'zpool list' it returns 'no pools configured'. I've checked /etc/zfs/zpool.cache, and it appears to have configuration information about the disks in place. If I run zpool import, it loads properly, but for whatever reason with EON, it's not saving the configuration. Any ideas where I should start looking? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
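Not an EON expert, but a sketch of what I would check (pool name is just an example): confirm the pool really is recorded in the cache file on the running system, then make sure the boot image is rebuilt afterwards so that copy of /etc/zfs/zpool.cache is the one that comes back at boot.

# zpool import tank
# zdb -C tank
# ls -l /etc/zfs/zpool.cache

If zdb -C shows the pool but it still vanishes after a reboot, the updimg step is probably running before the cache file is updated, or producing an image that doesn't include it - the exact updimg invocation and image path depend on your EON install.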
Re: [zfs-discuss] Dedup memory overhead
On Thu, Jan 21, 2010 at 2:51 PM, Andrey Kuzmin andrey.v.kuz...@gmail.com wrote: Looking at dedupe code, I noticed that on-disk DDT entries are compressed less efficiently than possible: key is not compressed at all (I'd expect roughly 2:1 compression ratio with sha256 data), A cryptographic hash such as sha256 should not be compressible. A trivial example shows this to be the case:

$ for i in {1..10000} ; do echo $i | openssl dgst -sha256 -binary ; done > /tmp/sha256
$ gzip -c sha256 > sha256.gz
$ compress -c sha256 > sha256.Z
$ bzip2 -c sha256 > sha256.bz2
$ ls -go sha256*
-rw-r--r-- 1 320000 Jan 22 04:13 sha256
-rw-r--r-- 1 428411 Jan 22 04:14 sha256.Z
-rw-r--r-- 1 321846 Jan 22 04:14 sha256.bz2
-rw-r--r-- 1 320068 Jan 22 04:14 sha256.gz

-- Mike Gerdts http://mgerdts.blogspot.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] question about which build to install
Wrong list. But anyhow, I was able to install b128 and then upgrade to b130. I had to relink some OpenGL files to get Compiz to work, but apart from that it looks OK. /peter On 2010-01-22 11.03, Thomas Burgess wrote: I installed b130 on my server, and i'm being hit by this bug: http://defect.opensolaris.org/bz/show_bug.cgi?id=13540 Where i can't log into gnome. I've bee trying to deal with it hoping that i a workaround would show up.. if there IS a workaround, i'd love to have it...if not, i'm wondering: is there another version i can downgrade to? I'm pretty new to opensolaris and i've tried to google to find this answer but i can't find it. my zpool is version 22. thanks for any help. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
Did you buy the SSDs directly from Sun? I've heard there could possibly be firmware that's vendor specific for the X25-E. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
Hi On Friday 22 January 2010 07:04:06 Brad wrote: Did you buy the SSDs directly from Sun? I've heard there could possibly be firmware that's vendor specific for the X25-E. No. So far I've heard that they are not readily available as certification procedures are still underway (apart from this the 8850 firmware should be ok, but that's just what I've heard). C ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss