Re: [zfs-discuss] ZFS loses configuration

2010-01-22 Thread Oyvind Syljuasen
You will have to uncomment the zpool import -a line in /mnt/eon0/.exec for
this to automatically import your pools on startup.
(took a while before I found this too...)
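For reference, the relevant line in /mnt/eon0/.exec looks roughly like this once
uncommented (a sketch; exact file contents vary by EON build):

zpool import -a        <- ships with a leading '#'; remove it so pools import at boot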

Other than that, for me, EON is great!

br,
syljua
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Old quota tools

2010-01-22 Thread Martijn de Munnik
Hi List,

Does anybody have scripts available which mimic the ufs quota tools on
zfs? A tool I use relies on the old quota tools (quota, edquota, quotaon,
quotaoff, repquota, quotacheck).

I use zfs filesystem quota and reservation for /export/home/username
filesystems. I would like the tool to manage the quotas, but I'm not able to
change its code. I can only change the quota commands it calls, and it still
expects those commands to behave like the ufs quota commands.
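In case a starting point helps anyone attempting the same: a minimal sketch of a
quota(1M)-style wrapper around 'zfs get'. The dataset naming (one filesystem per
user under rpool/export/home) and the output format are assumptions, so this is
nowhere near a drop-in replacement for the ufs tools, but the same pattern works
for repquota/edquota-style wrappers:

#!/bin/sh
# quota-like report for a user's ZFS home filesystem (sketch only)
user=${1:?usage: $0 username}
ds="rpool/export/home/$user"      # assumed dataset layout - adjust as needed
zfs get -H -p -o property,value used,quota "$ds" |
  nawk -v ds="$ds" '{ v[$1] = $2 }
    END { printf("%s: %d MB used, quota %d MB\n", ds, v["used"]/1048576, v["quota"]/1048576) }'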

Thanks,
Martijn

-- 
YoungGuns
Kasteleinenkampweg 7b
5222 AX 's-Hertogenbosch
T. 073 623 56 40
F. 073 623 56 39
www.youngguns.nl
KvK 18076568
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] etc on separate pool

2010-01-22 Thread Alexander
Is it possible to have /etc on a separate zfs pool in OpenSolaris? 
The purpose is to have a rw non-persistent main pool and a rw persistent /etc...
I've tried to make a legacy etcpool/etc file system and mount it in 
/etc/vfstab... 
Is it possible to extend the boot-archive in such a way that it includes most of the 
files necessary for mounting /etc from a separate pool? Has anyone tried such 
configurations?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zones and other filesystems

2010-01-22 Thread Zoram Thanga

On 01/21/10 17:03, Thomas Burgess wrote:
I'm pretty new to opensolaris.  I come from FreeBSD. 

Naturally, after using FreeBSD for a while I've been big on the use of 
FreeBSD jails, so I just had to try zones.  I've figured out how to get 
zones running, but now I'm stuck and need help.  Is there anything like 
nullfs in opensolaris...


or maybe there is a more solaris way of doing what i need to do.

Basically, what I'd like to do is give a specific zone access to 2 zfs 
filesystems which are available to the global zone.

My new zones are in:

/export/home/zone1
/export/home/zone2


What I'd like to do is give them access to:

/tank/nas/Video
/tank/nas/JeffB


# zonecfg -z zone1
add dataset
set name=tank/nas/Video
end
add dataset
set name=tank/nas/JeffB
end
exit

# zoneadm -z zone1 reboot
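If you'd rather keep the filesystems mounted in the global zone and just loan
them to the zone (which is closer to FreeBSD's nullfs), an lofs mount is the
other option -- a sketch using the same paths:

# zonecfg -z zone1
add fs
set dir=/tank/nas/Video
set special=/tank/nas/Video
set type=lofs
end
exit

# zoneadm -z zone1 reboot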

Thanks,
Zoram




I'm sure I looked over something hugely easy and important...thanks.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Dedup memory overhead

2010-01-22 Thread erik.ableson

On 21 janv. 2010, at 22:55, Daniel Carosone wrote:

 On Thu, Jan 21, 2010 at 05:04:51PM +0100, erik.ableson wrote:
 
 What I'm trying to get a handle on is how to estimate the memory
 overhead required for dedup on that amount of storage.   
 
 We'd all appreciate better visibility of this. This requires:
 - time and observation and experience, and
 - better observability tools and (probably) data exposed for them

I'd guess that since every written block is going to go and ask for the hash 
keys, this should result in that data living in the ARC under the MFU 
ruleset.  The theory being that, as a result, if I can determine the maximum 
memory requirement for these keys, I know what my minimum memory baseline 
will be to guarantee that I won't be caught short.
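(As a rough way to observe this today -- assuming a build recent enough to have
dedup, treating the reported sizes as approximations, and with the pool name as
an example:)

# zdb -DD tank      <- DDT statistics for a pool that already has dedup enabled
# zdb -S tank       <- simulate dedup on an existing pool to size the table beforehand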

 So the question is how much memory or L2ARC would be necessary to
 ensure that I'm never going back to disk to read out the hash keys. 
 
 I think that's a wrong-goal for optimisation.
 
 For performance (rather than space) issues, I look at dedup as simply
 increasing the size of the working set, with a goal of reducing the
 amount of IO (avoided duplicate writes) in return.

True, but as a practical matter, we've seen that overall performance drops off 
a cliff if you overstep your memory bounds and the system is obliged to go to 
disk to check a new block to be written against the hash keys. This is 
compounded by the fact that the ARC is full, so it's obliged to go straight to 
disk, further exacerbating the problem.

It's this particular scenario that I'm trying to avoid and from a business 
aspect of selling ZFS based solutions (whether to a client or to an internal 
project) we need to be able to ensure that the performance is predictable with 
no surprises.

Realizing of course that all of this is based on a slew of uncontrollable 
variables (size of the working set, IO profiles, ideal block sizes, etc.).  The 
empirical approach of "give it lots and we'll see if we need to add an L2ARC 
later" is not really viable for many managers (despite the fact that the real 
world works like this).

 The trouble is that the hash function produces (we can assume) random
 hits across the DDT, so the working set depends on the amount of
 data and the rate of potentially dedupable writes as well as the
 actual dedup hit ratio.  A high rate of writes also means a large
 amount of data in ARC waiting to be written at the same time. This
 makes analysis very hard (and pushes you very fast towards that very
 steep cliff, as we've all seen). 

I don't think it would be random, since _any_ write operation on a deduplicated 
filesystem would require a hash check, forcing the keys to live in the MFU.  
However, I agree that a high write rate would result in memory pressure on the 
ARC, which could result in the eviction of the hash keys. So the next factor to 
include in memory sizing is the maximum write rate (determined by IO 
availability). So with a team of two GbE cards, I could conservatively say that 
I need to size for inbound write IO of 160MB/s, worst case accumulated over the 
30 second flush cycle, so say about 5GB of memory (leaving aside ZIL issues 
etc.). Noting that this is all very back-of-the-napkin estimation, I also 
need to have some idea of what my physical storage is capable of ingesting, 
which could add to this value.
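Making that napkin explicit, with purely illustrative numbers (10TiB of unique
data, 64K blocks, and an assumed ~320 bytes per in-core DDT entry):

$ echo '(10 * 2^40) / (64 * 2^10) * 320 / 2^30' | bc
50
$ echo '160 * 30' | bc
4800

So roughly 50GB for the table itself at that scale, on top of the ~5GB of
in-flight write data estimated above.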

 I also think a threshold on the size of blocks to try deduping would
 help.  If I only dedup blocks (say) 64k and larger, i might well get
 most of the space benefit for much less overhead.

Well - since my primary use case is iSCSI presentation to VMware backed by 
zvols, and I can manually force the block size to 64K on volume creation, this 
reduces the unpredictability a little bit. That's based on the hypothesis that 
zvols use a fixed block size.
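For what it's worth, that hypothesis holds: a zvol's volblocksize is fixed at
creation time. A sketch of forcing it (pool/volume names and size made up):

# zfs create -V 500G -o volblocksize=64K tank/vmware-lun0
# zfs get volblocksize tank/vmware-lun0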
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS raidz2 on Aberdeen SCSI DAS problems

2010-01-22 Thread Dragan
Hi, I'm trying to build an OpenSolaris storage server, but I'm experiencing regular 
zpool corruptions after one or two days of operation.
I would appreciate it if someone could comment on the hardware I use in this setup 
and give me some pointers on how to troubleshoot it.

The machine OpenSolaris is installed on has a Supermicro Intel X7DCT 
motherboard and an LSI22320SE SGL SCSI HBA. An Aberdeen XDAS P6 Series 3U SCSI DAS 
(16 bays with 2TB Hitachi drives) is attached to the HBA, and the drives are 
configured as Pass Through.
I built just one pool, testPool, with a single raidz2 vdev containing 8 drives.

This is the zpool status output after the zpool crash:

  pool: testPool
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
repaired.
 scrub: resilver completed after 0h0m with 0 errors on Fri Jan 22 09:29:51 2010
config:

NAME STATE READ WRITE CKSUM
testPool UNAVAIL 0 0 0  insufficient replicas
  raidz2 UNAVAIL 128 0  insufficient replicas
c10t0d0  FAULTED  695 3  too many errors
c10t0d1  FAULTED  589 3  too many errors
c10t0d2  ONLINE   2 0 0
c10t0d3  ONLINE   2 1 0  6K resilvered
c10t0d4  ONLINE   4 4 0  5.50K resilvered
c10t0d5  ONLINE   2 8 0  4K resilvered
c10t0d6  DEGRADED1 9 3  too many errors
c10t0d7  ONLINE   3 8 0  3.50K resilvered

errors: 3 data errors, use '-v' for a list


And this is the relevant lines from my /var/adm/messages:

Jan 22 08:02:54 disk    got firmware SCSI bus reset.
Jan 22 08:02:54 disk    log info = 0
Jan 22 08:03:07 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8 (mpt0):
Jan 22 08:03:07 disk    Rev. 8 LSI, Inc. 1030 found.
Jan 22 08:03:07 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8 (mpt0):
Jan 22 08:03:07 disk    mpt0 supports power management.
Jan 22 08:03:07 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8 (mpt0):
Jan 22 08:03:07 disk    mpt0 unrecognized capability 0x6.
Jan 22 08:03:10 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8 (mpt0):
Jan 22 08:03:10 disk    mpt0: IOC Operational.
Jan 22 08:03:13 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8,1 (mpt1):
Jan 22 08:03:13 disk    Rev. 8 LSI, Inc. 1030 found.
Jan 22 08:03:13 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8,1 (mpt1):
Jan 22 08:03:13 disk    mpt1 supports power management.
Jan 22 08:03:13 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8,1 (mpt1):
Jan 22 08:03:13 disk    mpt1 unrecognized capability 0x0.
Jan 22 08:03:13 disk scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci10b5,8...@0/pci1000,1...@8,1 (mpt1):
Jan 22 08:03:13 disk    mpt1: IOC Operational.
Jan 22 08:04:50 disk fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
Jan 22 08:04:50 disk EVENT-TIME: Fri Jan 22 08:04:50 GMT 2010
Jan 22 08:04:50 disk PLATFORM: X7DCT, CSN: 0123456789, HOSTNAME: disk
Jan 22 08:04:50 disk SOURCE: zfs-diagnosis, REV: 1.0
Jan 22 08:04:50 disk EVENT-ID: 857d4e64-9a2f-e6fb-94c2-9337566aa6c9
Jan 22 08:04:50 disk DESC: The number of I/O errors associated with a ZFS device exceeded
Jan 22 08:04:50 disk    acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD for more information.
Jan 22 08:04:50 disk AUTO-RESPONSE: The device has been offlined and marked as faulted.  An attempt
Jan 22 08:04:50 disk    will be made to activate a hot spare if available.
Jan 22 08:04:50 disk IMPACT: Fault tolerance of the pool may be compromised.
Jan 22 08:04:50 disk REC-ACTION: Run 'zpool status -x' and replace the bad device.



Can I build a ZFS storage server with this kind of hardware? If yes, how can I 
troubleshoot the problem?
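(For the record, the usual starting points for narrowing down whether this is a
drive, enclosure/cabling or HBA problem:)

# fmdump -eV | more           <- raw error telemetry behind the ZFS-8000-FD fault
# iostat -En                  <- per-device error counters plus vendor/firmware info
# zpool status -xv testPool   <- which devices and files, if any, are affected
# zpool clear testPool        <- after reseating/replacing hardware...
# zpool scrub testPool        <- ...then verify with a scrub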
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] etc on separate pool

2010-01-22 Thread Chris Ridd

On 22 Jan 2010, at 08:55, Alexander wrote:

 Is it possible to have /etc on a separate zfs pool in OpenSolaris? 
 The purpose is to have a rw non-persistent main pool and a rw persistent /etc...
 I've tried to make a legacy etcpool/etc file system and mount it in 
 /etc/vfstab... 
 Is it possible to extend the boot-archive in such a way that it includes most of 
 the files necessary for mounting /etc from a separate pool? Has anyone tried 
 such configurations?

What does the live CD do?

Cheers,

Chris
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] etc on separate pool

2010-01-22 Thread Alexander
  Is it possible to extend the boot-archive in such a way
 that it includes most of the files necessary for
 mounting /etc from a separate pool? Has anyone tried
 such configurations?
 
 What does the live CD do?
I'm not sure that it is the same configuration, but maybe it is quite 
similar... The LiveCD has a ramdisk which is mounted on boot. /etc is on this 
ramdisk... 
And in a real system configuration we would need some way to sync the real /etc 
and the ramdisk (or boot archive) /etc. With a ramdisk this may be a problem. 
But:
1) I don't understand deeply how the LiveCD boots (maybe I need to look at this 
process more attentively)
2) In my opinion its boot process is quite specific and quite different from 
real system behavior. I'm not sure these practices can be adopted...  
However, I'm not confident about (1)...
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 2gig file limit on ZFS?

2010-01-22 Thread Michelle Knight
On Thursday 21 Jan 2010 22:00:55 Daniel Carosone wrote:
 Best would be to plug the ext3 disk into something that can read it
 fully, and copy over the network.  Linux, NetBSD, maybe newer
 opensolaris. Note that this could be running in a VM on the same box,
 if necessary.

Yep, done.  Ubuntu gave me some grief about mounting part elements of a raid 
set, but I managed it and the files are now copying over to the OS box happily. 
 

The problem must have been the drivers loaded to read the ext3 partition.

 split(1)?

Har, har!!! :-)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller

2010-01-22 Thread Moshe Vainer
http://lsi.com/storage_home/products_home/internal_raid/megaraid_sas/6gb_s_value_line/sas9260-8i/index.html

2009.06 didn't have the drivers integrated, so those aren't the open source 
ones. As I said, it is possible that 2010.03 will resolve this. But we do not 
put development releases in production.


From: Tim Cook [mailto:t...@cook.ms]
Sent: Thursday, January 21, 2010 5:45 PM
To: Moshe Vainer
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 
controller


On Thu, Jan 21, 2010 at 7:37 PM, Moshe Vainer mvai...@doyenz.com wrote:
Vanilla 2009.06, mr_sas drivers from LSI website.
To answer your other question - the mpt driver is very solid on 2009.06


Are you sure those are the open source drivers he's referring to?  LSI has a 
habit of releasing their own drivers with similar names.  It sounds to me like 
that's what you were using.

On that front, exactly where did you find the driver?  They have nothing listed 
on the downloads page:
http://lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/internal/sas9211-8i/index.html?locale=ENremote=1



--
--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller

2010-01-22 Thread Moshe Vainer
I thought I made it very clear - mr_sas drivers from the LSI website. No intention 
to bash anything, just a user experience. Sorry if that was misunderstood.

From: Tim Cook [mailto:t...@cook.ms]
Sent: Thursday, January 21, 2010 6:07 PM
To: Moshe Vainer
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 
controller


On Thu, Jan 21, 2010 at 8:05 PM, Moshe Vainer mvai...@doyenz.com wrote:
http://lsi.com/storage_home/products_home/internal_raid/megaraid_sas/6gb_s_value_line/sas9260-8i/index.html

2009.06 didn't have the drivers integrated, so those aren't the open source 
ones. As I said, it is possible that 2010.03 will resolve this. But we do not 
put development releases in production.


You should probably make that clear from the start then.  You just bashed the 
opensource drivers based on your experience with something completely different.


--
--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs zvol available space vs used space vs reserved space

2010-01-22 Thread younes naguib
Hi dan,

Thanks for your reply.

I'm not sure about that, as it shows different values for different zvol.
# zfs list -o space
NAME        AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
tank        1.33T  27.2T         0   32.0K              0      27.2T
tank/test1  2.28T     1T         0   51.4G           973G          0
tank/test2  2.33T     1T         0   1.31G          1023G          0
tank/test3  1.38T   100G         0   50.4G          49.6G          0


Thanks,
Younes

On Thu, Jan 21, 2010 at 10:48 PM, Daniel Carosone d...@geek.com.au wrote:

 On Thu, Jan 21, 2010 at 07:33:47PM -0800, Younes wrote:
  Hello all,
 
  I have a small issue with zfs.
  I create a volume 1TB.
 
  # zfs get all tank/test01
  NAME         PROPERTY              VALUE      SOURCE
  tank/test01  used  1T -
  tank/test01  available 2.26T  -
  tank/test01  referenced79.4G  -
  tank/test01  reservation   none   default
  tank/test01  refreservation1T local
  tank/test01  usedbydataset 79.4G  -
  tank/test01  usedbychildren0  -
  tank/test01  usedbyrefreservation  945G   -

 I've trimmed some not relevant properties.

  What bugs me is the available:2.26T.
 
  Any ideas on why is that?

 That's the available space in the rest of the pool. This includes
 space that could be used (ie, available for) potential snapshots of
 the volume (which would show in usedbychildren), since the volume size
 is a refreservation not a reservation.

 --
 Dan.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [ext3-discuss] 2gig file limit on ZFS?

2010-01-22 Thread Milan Jurik
Hi,

it would be very good to know the version of the driver used for ext3fs.
Where was the driver obtained and how was it installed?

Best regards,

Milan

Richard Elling wrote on Thu, 21 Jan 2010 at 12:08 -0800:
 CC'ed to ext3-disc...@opensolaris.org because this is an ext3 on Solaris
 issue.  ZFS has no problem with large files, but the older ext3 did.
 
 See also the ext3 project page and documentation, especially
 http://hub.opensolaris.org/bin/view/Project+ext3/Project_status
  -- richard
 
 
 On Jan 21, 2010, at 11:58 AM, Michelle Knight wrote:
 
  Hi Folks,
  
  Situation, 64 bit Open Solaris on AMD. 2009-6 111b - I can't successfully 
  update the OS.
  
  I've got three external 1.5 Tb drives in a raidz pool connected via USB.
  
  Hooked on to an IDE channel is a 750gig hard drive that I'm copying the 
  data off. It is an ext3 drive from an Ubuntu server.
  
  Copying is being done on the machine using the cp command as root.
  
  So far, two files have failed...
  /mirror2/applications/Microsoft/Operating Systems/Virtual 
  PC/vm/XP-SP2/XP-SP2 Hard Disk.vhd: File too large
  /mirror2/applications/virtualboximages/xp/xp.tar.bz2: File too large
  
  The files are...
  -rwxr-x---   1 adminapplications 4177570654 Nov  4 08:02 xp.tar.bz2
  -rwxr-x---   1 adminapplications 2582259712 Feb 14  2007 XP-SP2 Hard 
  Disk.vhd
  
  The system is a home server and contains files of all types and sizes.
  
  Any ideas please?
  


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Remove ZFS Mount Points

2010-01-22 Thread Tony MacDoodle
Can I move the below mounts under / ?

rpool/export        /export
rpool/export/home   /export/home

It was a result of the default install...

Thanks
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive as backup - reliability?

2010-01-22 Thread Mike Gerdts
On Thu, Jan 21, 2010 at 11:28 AM, Richard Elling
richard.ell...@gmail.com wrote:
 On Jan 21, 2010, at 3:55 AM, Julian Regel wrote:
  Until you try to pick one up and put it in a fire safe!

 Then you back up to tape from the x4540 whatever data you need.
 In the case of enterprise products you save on licensing here, as you need one 
 client license per x4540 but can in fact back up data from the many clients 
 which are there.

 Which brings us full circle...

 What do you then use to backup to tape bearing in mind that the Sun-provided 
 tools all have significant limitations?

 Poor choice of words.  Sun resells NetBackup and (IIRC) that which was
 formerly called NetWorker.  Thus, Sun does provide enterprise backup
 solutions.

(Symantec nee Veritas) NetBackup and (EMC nee Legato) Networker are
different products that compete in the enterprise backup space.

Under the covers NetBackup uses gnu tar to gather file data for the
backup stream.  At one point (maybe still the case), one of the
claimed features of netbackup is that if a tape is written without
multiplexing, you can use gnu tar to extract data.  This seems to be
most useful when you need to recover master and/or media servers and
to be able to extract your data after you no longer use netbackup.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Remove ZFS Mount Points

2010-01-22 Thread Mark J Musante

On Fri, 22 Jan 2010, Tony MacDoodle wrote:


Can I move the below mounts under / ?

rpool/export        /export
rpool/export/home   /export/home


Sure.  Just copy the data out of the directory, do a zfs destroy on the 
two filesystems, and copy it back.


For example:

# mkdir /save
# cp -r /export/home /save
# zfs destroy rpool/export/home
# zfs destroy rpool/export
# mkdir /export
# mv /save/home /export
# rmdir /save

I'm sure there are other ways to do it, but that's the gist.
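One variant that also preserves ownership, permissions and timestamps during 
the copy steps (substitute for the cp/mv lines above):

# cd /export && tar cf - home | (cd /save && tar xpf -)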


Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Removing large holey file does not free space 6792701 (still)

2010-01-22 Thread Henrik Johansson
Hello,

I mentioned this problem a year ago here and filed 6792701, and I know it has 
been discussed since. It should have been fixed in snv_118, but I can still 
trigger the same problem. It is only triggered if the creation of a large file 
is aborted, for example by loss of power, crash or SIGINT to mkfile(1M). The 
bug should probably be reopened, but I post it here since some people were 
seeing something similar.

Example and attached zdb output:

filer01a:/$ uname -a
SunOS filer01a 5.11 snv_130 i86pc i386 i86pc Solaris
filer01a:/$ zpool create zpool01 raidz2 c4t0d0 c4t1d0 c4t2d0 c4t4d0 c4t5d0 c4t6d0
filer01a:/$ zfs list zpool01
NAME      USED  AVAIL  REFER  MOUNTPOINT
zpool01   123K  5.33T  42.0K  /zpool01
filer01a:/$ df -h /zpool01
Filesystem  Size  Used  Avail  Use%  Mounted on
zpool01     5.4T   42K   5.4T    1%  /zpool01
filer01a:/$ mkfile 1024G /zpool01/largefile
^C
filer01a:/$ zfs list zpool01
NAME      USED  AVAIL  REFER  MOUNTPOINT
zpool01   160G  5.17T   160G  /zpool01
filer01a:/$ ls -hl /zpool01/largefile
-rw-------  1 root  root  1.0T 2010-01-22 15:02 /zpool01/largefile
filer01a:/$ rm /zpool01/largefile
filer01a:/$ sync
filer01a:/$ zfs list zpool01
NAME      USED  AVAIL  REFER  MOUNTPOINT
zpool01   160G  5.17T   160G  /zpool01
filer01a:/$ df -h /zpool01
Filesystem  Size  Used  Avail  Use%  Mounted on
zpool01     5.4T  161G   5.2T    3%  /zpool01
filer01a:/$ ls -l /zpool01
total 0
filer01a:/$ zfs list -t all zpool01
NAME      USED  AVAIL  REFER  MOUNTPOINT
zpool01   160G  5.17T   160G  /zpool01
filer01a:/$ zpool export zpool01
filer01a:/$ zpool import zpool01
filer01a:/$ zfs list zpool01
NAME      USED  AVAIL  REFER  MOUNTPOINT
zpool01   160G  5.17T   160G  /zpool01
filer01a:/$ zdb -ddd zpool01
<cut>
    Object  lvl  iblk  dblk  dsize  lsize  %full  type
<cut>
         5    5   16K  128K   160G     1T  15.64  ZFS plain file
<cut>

zpool01.zdb
Description: Binary data

Henrik
http://sparcv9.blogspot.com


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] etc on separate pool

2010-01-22 Thread Cindy Swearingen

Hi Alexander,

I'm not sure about the OpenSolaris release specifically, but for the
SXCE and Solaris 10 releases, we provide this requirement:

http://docs.sun.com/app/docs/doc/817-2271/zfsboot-1?a=view

* Solaris OS Components – All subdirectories of the root file system
that are part of the OS image, with the exception of /var, must be in
the same dataset as the root file system. In addition, all Solaris OS
components must reside in the root pool with the exception of the swap
and dump devices.

Maybe someone else can comment on their OpenSolaris experiences. I'm not 
sure we've done enough testing to relax this requirement for OpenSolaris 
releases.


In the meantime, I would suggest following the above requirement until
we're sure alternate configurations are supportable.

Thanks,

Cindy


On 01/22/10 05:17, Alexander wrote:

Is it possible to extend the boot-archive in such a way
that it includes most of the files necessary for
mounting /etc from a separate pool? Has anyone tried
such configurations?

What does the live CD do?
I'm not sure that it is the same configuration, but maybe it is quite similar... The LiveCD has a ramdisk which is mounted on boot. /etc is on this ramdisk... 
And in a real system configuration we would need some way to sync the real /etc and the ramdisk (or boot archive) /etc. With a ramdisk this may be a problem. 
But:
1) I don't understand deeply how the LiveCD boots (maybe I need to look at this process more attentively)

2) In my opinion its boot process is quite specific and quite different from 
real system behavior. I'm not sure these practices can be adopted...  
However, I'm not confident about (1)...

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] etc on separate pool

2010-01-22 Thread Lori Alt

On 01/22/10 01:55, Alexander wrote:
Is it possible to have /etc on a separate zfs pool in OpenSolaris? 
The purpose is to have a rw non-persistent main pool and a rw persistent /etc...
I've tried to make a legacy etcpool/etc file system and mount it in /etc/vfstab... 
Is it possible to extend the boot-archive in such a way that it includes most of the files necessary for mounting /etc from a separate pool? Has anyone tried such configurations?
  
There have been efforts (some ongoing) to enable what you are trying to 
do, but they involve substantial changes to Solaris configuration and 
system administration. 

As Solaris works right now, it is not supported to have /etc in a 
separate dataset, let alone a separate pool.


Lori
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs zvol available space vs used space vs reserved space

2010-01-22 Thread Cindy Swearingen

Younes,

Including your zpool list output for tank would be helpful because zfs
list includes the AVAILABLE pool space. Determining volume space is a
bit trickier because volume size is set at creation time but the
allocated size might not be consumed.

I include a simple example below that might help.

The Nevada release revises the zpool list output to include
SIZE, ALLOC, and FREE, which helps clarify the values.

Cindy



A mirrored pool tank of 2 x 136GB disks:

# zpool create tank mirror c1t1d0 c1t2d0
# zpool list tank
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank   136G  76.5K   136G 0%  ONLINE  -

Review how much space is available for datasets:
# zfs list -r tank
NAME   USED  AVAIL  REFER  MOUNTPOINT
tank72K   134G21K  /tank

Approx 2 GB of pool space is consumed for metadata.

I create two volumes, 10GB and 20GB, in size:
# zfs create -V 10G tank/vol1
# zfs create -V 20G tank/vol2
# zfs list -r tank
NAMEUSED  AVAIL  REFER  MOUNTPOINT
tank   30.0G   104G21K  /tank
tank/vol110G   114G16K  -
tank/vol220G   124G16K  -

In the above output, USED is 30.0G due to the creation of the volumes.

If we check the pool space consumed:

# zpool list tank
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank   136G   124K   136G 0%  ONLINE  -

USED is only 124K because the volumes contain no data yet and USED also
included a small amount of metadata.

On 01/21/10 20:53, younes naguib wrote:

Hi dan,

Thanks for your reply.

I'm not sure about that, as it shows different values for different zvol.
# zfs list -o space
NAME        AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
tank        1.33T  27.2T         0   32.0K              0      27.2T
tank/test1  2.28T     1T         0   51.4G           973G          0
tank/test2  2.33T     1T         0   1.31G          1023G          0
tank/test3  1.38T   100G         0   50.4G          49.6G          0


Thanks,
Younes

On Thu, Jan 21, 2010 at 10:48 PM, Daniel Carosone d...@geek.com.au wrote:


On Thu, Jan 21, 2010 at 07:33:47PM -0800, Younes wrote:
  Hello all,
 
  I have a small issue with zfs.
  I create a volume 1TB.
 
  # zfs get all tank/test01
NAME         PROPERTY              VALUE      SOURCE
  tank/test01  used  1T -
  tank/test01  available 2.26T  -
  tank/test01  referenced79.4G  -
  tank/test01  reservation   none   default
  tank/test01  refreservation1T local
  tank/test01  usedbydataset 79.4G  -
  tank/test01  usedbychildren0  -
  tank/test01  usedbyrefreservation  945G   -

I've trimmed some not relevant properties.

  What bugs me is the available:2.26T.
 
  Any ideas on why is that?

That's the available space in the rest of the pool. This includes
space that could be used (ie, available for) potential snapshots of
the volume (which would show in usedbychildren), since the volume size
is a refreservation not a reservation.

--
Dan.





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zero out block / sectors

2010-01-22 Thread John Hoogerdijk
Is there a way to zero out unused blocks in a pool?  I'm looking for
ways to shrink the size of an opensolaris virtualbox VM, and using the
compact subcommand will remove zero'd sectors.

Thanks,

John Hoogerdijk

Sun Microsystems of Canada IMO
Network Computer NC Ltd.
808 240 Graham Avenue
Winnipeg, Manitoba, R3C 0J7, Canada
Phone: 204.927.1932
Cell: 204.230.6720
Fax: 204.927.1939
Email: john.hoogerd...@sun.com




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hard drive choice, TLER/ERC/CCTL

2010-01-22 Thread Miles Nordin
 dc == Daniel Carosone d...@geek.com.au writes:
 w == Willy  willy.m...@gmail.com writes:
 sb == Simon Breden sbre...@gmail.com writes:

First of all, I've been so far assembling vdev stripes from different
manufacturers, such that one manufacturer can have a bad batch or
firmware bug killing all their drives at once without losing my pool.
Based on recent drive problems I think this is a really wise idea.

 w http://www.csc.liv.ac.uk/~greg/projects/erc/

dead link?

 w Unfortunately, smartmontools has limited SATA drive support in
 w opensolaris, and you cannot query or set the values.

also the driver stack is kind of a mess with different mid-layers
depending on which SATA low-level driver you use, and many proprietary
no-source low-level drivers, neither of which you have to deal with on
Linux.  Maybe in a decade it will get better if the oldest driver we
have to deal with is AHCI, but yes smartmontools vs. uscsi still needs
fixing!

 w I have 4 of the HD154UI Samsung Ecogreens, and was able to set
 w the error reporting time using HDAT2.  The settings would
 w survive a warm reboot, but not a powercycle.

after stfw this seems to be some MesS-DOS binary-only tool.  Maybe you
can run it in virtualbox and snoop on its behavior---this worked for
me with Wine and a lite-on RPC tool.  At least on Linux you can for
example run CD burning programs from within Wine---it is that good.

sb RAID-version drives at 50%-100% price premium, I have decided
sb not to use Western Digital drives any longer, and have
sb explained why here:

sb http://breden.org.uk/2009/05/01/home-fileserver-a-year-in-zfs/

IMHO it is just a sucker premium because the feature is worthless
anyway.  From the discussion I've read here, the feature is designed
to keep drives which are *reporting failures* still considered
*GOOD*, and to not drop out of RAIDsets in RAID-on-a-card
implementations with RAID-level timeouts under 60 seconds.  It is a
workaround for huge modern high-BER drives and RAID-on-card firmware
that's (according to some person's debatable idea) not well-matched
to the drive.  Of course they are going to sell it as this big
valuable enterprise optimisation, but at its root it has evolved as a
workaround for someone else's broken (from WD's POV) software.

The solaris timeout, because of m * n * o multiplicative layered
speculative retry nonsense, is 60 seconds or 180 seconds or many
hours, so solaris is IMHO quite broken in this regard but also does
not benefit from the TLER workaround: the long-TLER drives will not
drop out of RAIDsets on ZFS even if they report an error now and then.

What's really needed for ZFS or RAID in general is (a) for drives to
never spend more than x% of their time attempting recovery, so they
don't effectively lose ALL the data on a partially-damaged drive by
degrading performance to the point it would take n years to read out
what data they're able to deliver and (b) RAID-level smarts to
dispatch reads for redundant data when a drive becomes slow without
reporting failure, and to diagnose drives as failed based on
statistical measurements of their speed.  TLER does not deliver (a)
because reducing error retries to 5 seconds is still 10^3 slowdown
instead of 10^4 and thus basically no difference, and the hard drive
can never do (b) it's a ZFS-level feature.  

so my question is, have you actually found cases where ZFS needs TLER
adjustments, or are you just speculating and synthesizing ideas from a
mess of whitepaper marketing blurbs?  

Because a 7-second-per-read drive will fuck your pool just as badly as
a 70-second-per-read drive: you're going to have to find and unplug it
before the pool will work again.


pgpXHCdaAwoIH.pgp
Description: PGP signature
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zero out block / sectors

2010-01-22 Thread Darren J Moffat

John Hoogerdijk wrote:
Is there a way to zero out unused blocks in a pool?  I'm looking for
ways to shrink the size of an opensolaris virtualbox VM, and using the
compact subcommand will remove zero'd sectors.


Not yet, but this has been discussed here before.  It is something I 
want to look at after I've got encryption support integrated.  No 
promises but I want it for this and other purposes.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Dedup Performance

2010-01-22 Thread Steve Radich, BitShop, Inc.
We're having to split data across multiple pools if we enable dedup, 1+ TB pools 
each (one 6x750GB pool is particularly bad). 

The timeouts cause COMSTAR / iSCSI to fail; Windows clients are dropping the 
persistent targets due to timeouts (over 15 seconds, it seems). This is causing 
bigger problems.

Disabling dedup is an option, but it shouldn't be *THAT* much load, I wouldn't 
think. Having it on a cache drive is reasonable; however, if this is required, 
OpenSolaris should add something like a DDTCacheDevice so we can dedicate a 
device to it separate from the secondarycache. 

I'll drop in a 150GB cache drive tonight to see if it improves things.
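For reference, adding it is just (device name is an example):

# zpool add tank1 cache c2t5d0
# zpool iostat -v tank1 5     <- watch how much read traffic the cache device absorbs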

Steve Radich
www.BitShop.com
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Dedup Performance

2010-01-22 Thread Steve Radich, BitShop, Inc.
I should note that running "zfs set primarycache=metadata tank1" took a few 
minutes. It seems like changing what is cached in RAM should be instant (we don't 
need to flush the data out of RAM, just stop putting it back in).

During this, disk I/O seemed slow, but that could have been unrelated.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive as backup - reliability?

2010-01-22 Thread A Darren Dunham
On Wed, Jan 20, 2010 at 08:11:27AM +1300, Ian Collins wrote:
 True, but I wonder how viable its future is.  One of my clients
 requires 17 LTO4 tapes for a full backup, which cost more and take
 up more space than the equivalent in removable hard drives.

What kind of removable hard drives are you getting that are cheaper than
tape?

LTO4 media is less than 2.5 cents/GB for us (before compression,
acquisition cost only).

-- 
Darren
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hard drive choice, TLER/ERC/CCTL

2010-01-22 Thread Simon Breden
Thanks for your reply Miles.

I think I understand your points, but unfortunately my historical knowledge of 
the need for TLER etc. solutions is lacking.

How I've understood it to be (as generic as possible, but possibly inaccurate 
as a result):

1. In simple non-RAID single-drive 'desktop' PC scenarios where you have one 
drive, if your drive is experiencing read/write errors, as this is the only 
drive you have, and therefore you have no alternative redundant source of data 
to help with required reconstruction/recovery, you REALLY NEED your drive to 
try as much as possible to recover from the error; therefore a long 
'deep recovery' process may be kicked off to try to fix/recover the problematic 
data being read/written. 

2. Historically, hardware RAID arrays, where redundant data *IS* available, you 
really DON'T want any drive with trivial occasional block read errors to be 
kicked from the array, so the idea was to have drives experiencing read errors 
report quickly to the hardware RAID controller that there's a problem, so that 
the hardware RAID controller can then quickly reconstruct the missing data by 
using the redundant parity data.

3. With ZFS, I think you're saying that if, for example, there's a block read 
error, then even with a RAID EDITION (TLER) drive, you're still looking at a 
circa 7 second delay before the error is reported to ZFS, and if you're using a 
cheapo standard non-RAID edition drive then you're looking at a likely circa 
60/70 second delay before ZFS is notified. Either way, you say that ZFS won't 
kick the drive, yes? And worst case is that, depending on arbitrary 'unknowns' 
in the particular drive's firmware and storage stack, and on the storage 
array's responsiveness, 'some time' could be 'mucho time' if 
you're unlucky.

And to summarise, you don't see any point in spending a high premium on 
RAID-edition drives if using with ZFS, yes? And also, you don't think that 
using non-RAID edition drives presents a significant additional data loss risk?

Cheers,
Simon

http://breden.org.uk/2009/05/01/home-fileserver-a-year-in-zfs/
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-01-22 Thread Mirko
Well, I've purchased 5 Barracuda LP 1.5TB drives.
They run very quiet and cool, 5 in a cage, and the vibration is nearly zero.

Reliability? Well, every HDD is unreliable, and every major brand has problems at 
this time, so go for the best bang for the buck.

In my country Seagate has the best RMA service, with a turnaround of about 1 
week or so; WD is 3-4 weeks. Samsung has no direct RMA service, and Hitachi has 
one foot out of the HDD business IMHO, with no attractive product at the moment.

The enterprise SATA class HDD is a joke: the same construction as the consumer 
line, only with longer warranties, but at a hefty price premium. If you need a 
real enterprise-class HDD, you want SAS, not SATA.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive as backup - reliability?

2010-01-22 Thread A Darren Dunham
On Thu, Jan 21, 2010 at 12:38:56AM +0100, Ragnar Sundblad wrote:
 On 21 jan 2010, at 00.20, Al Hopper wrote:
  I remember for about 5 years ago (before LT0-4 days) that streaming
  tape drives would go to great lengths to ensure that the drive kept
  streaming - because it took so much time to stop, backup and stream
  again.  And one way the drive firmware accomplished that was to write
  blocks of zeros when there was no data available.

 I haven't seen drives that fill out with zeros. Sounds like an ugly
 solution, but maybe it could be useful in some strange case.

It was closer to 15 years ago than 5, but this may be a reference to the
first release of the DLT 7000.  That version came out with only 4MB as a
RAM buffer, which is insufficient to buffer at speed during a stop/start
cycle.  It didn't write zeros, but it would disable the on-drive
compression to try to keep the speed of bits being written to tape up.
So the effect was similar in that the capacity of the media was reduced.
The later versions had 8MB buffers and that behavior was removed.

-- 
Darren
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Does OpenSolaris mpt driver support LSI 2008 controller

2010-01-22 Thread Simon Breden
OK, gotcha.

Relating to my request for robustness feedback on the other driver, I was 
referring in fact to the mpt_sas driver that James says is used for the 
non-RAID LSI SAS2008-based cards like the SuperMicro AOC-USAS2-L8e (as opposed 
to the RAID-capable AOC-USAS2-L8i and LSI SAS 9211-8i cards, which use the mr_sas 
driver).

As far as I'm aware, the standard mpt driver is used for the card I already 
own, the LSI SAS1068E-based AOC-USAS-L8i etc.

Cheers,
Simon

http://breden.org.uk/2009/05/01/home-fileserver-a-year-in-zfs/
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zero out block / sectors

2010-01-22 Thread Mike Gerdts
On Fri, Jan 22, 2010 at 1:00 PM, John Hoogerdijk
john.hoogerd...@sun.com wrote:
 Is there a way to zero out unused blocks in a pool?  I'm looking for ways to
 shrink the size of an opensolaris virtualbox VM, and using the compact
 subcommand will remove zero'd sectors.

I've long suspected that you should be able to just use mkfile or dd
if=/dev/zero ... to create a file that consumes most of the free
space, then delete that file.  Certainly it is not an ideal solution,
but it seems quite likely to be effective.
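A sketch of that approach (path made up; note that deliberately filling a pool
hurts performance while the file exists, and that compression on the dataset
would defeat the purpose since the zeros never reach disk):

# dd if=/dev/zero of=/tank/zerofill bs=1024k   <- runs until the pool is (nearly) full
# rm /tank/zerofill
# sync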

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive as backup - reliability?

2010-01-22 Thread Ian Collins

A Darren Dunham wrote:

On Wed, Jan 20, 2010 at 08:11:27AM +1300, Ian Collins wrote:
  

True, but I wonder how viable its future is.  One of my clients
requires 17 LTO4 tapes for a full backup, which cost more and take
up more space than the equivalent in removable hard drives.



What kind of removable hard drives are you getting that are cheaper than
tape?

  
It's not the raw cost per GB, it's the way the tapes are used.  To aid 
recovery times, a number of different backup sets (groups of 
filesystems) are written, so the tapes aren't all used to capacity.


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss