[zfs-discuss] Re: Re: Re: Re: concatenation stripe - zfs?

2007-04-26 Thread shay
That would surprise me. Can it be that you are saturating the PCI slot
your 2342 card sits in? IIRC not every slot on a V240 can handle a dual-port
2342 card going at full rate.

I didn't understand what you mean. There are only 3 slots on a V240; which of
them cannot handle a dual-port HBA?

Generalizations like that often turn out not to be true at all. It just depends
on too many factors.
I know that there are many factors involved, but my performance test was clear:
it is much faster to write 20 GB to the SAN than to the local disk.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: software RAID vs. HW RAID - part III

2007-04-26 Thread Gino
Hello Robert,

it would be really interesting if you could add a HW RAID-10 LUN with UFS to
your comparison.

gino
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: [nfs-discuss] Multi-tera, small-file filesystems

2007-04-26 Thread Gavin Maltby



On 04/24/07 01:37, Richard Elling wrote:

Leon Koll wrote:
My guess is that Yaniv assumes that 8 pools with 62.5 million files each 
have a significantly smaller chance of being corrupted/causing data loss 
than 1 pool with 500 million files in it.

Do you agree with this?


I do not agree with this statement.  The probability is the same,
regardless of the number of files.  By analogy, if I have 100 people
and the risk of heart attack is 0.1%/year/person, then dividing those
people into groups does not change their risk of heart attack.


Is that not because heart attacks in different people are (under normal
circumstances!) independent events?  8 filesystems backed by a single
pool are not independent; 8 filesystems from 8 distinct pools are a lot
more independent.
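
To put rough numbers on it (mine, purely illustrative, and assuming the
pools really are independent): if a whole-pool-loss event happens with
probability p per year, then one pool of 500 million files loses an expected
p * 500M files/year and a single event takes everything; 8 independent pools
of 62.5 million files each still lose an expected 8 * p * 62.5M = p * 500M
files/year, so the per-file risk is unchanged, but the chance of losing all
500 million files at once drops from p to roughly p^8. Same expectation,
very different worst case.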

Gavin
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[2]: [zfs-discuss] Re: Status Update before Reinstall?

2007-04-26 Thread Robert Milkowski
Hello Brian,

Thursday, April 26, 2007, 3:55:16 AM, you wrote:

BG If I recall, the dump partition needed to be at least as large as RAM.

BG In Solaris 8(?) this changed, in that crash dump streams were
BG compressed as they were written out to disk. Although I've never read
BG this anywhere, I assumed the reasons this was done are as follows:

BG 1) Large enterprise systems could support ridiculous (at the time)
BG amounts of physical RAM. Providing a physical disk/LUN partition that
BG could hold such a large crashdump seemed wasteful and expensive.

BG 2) Compressing the dump before writing to disk would be faster, thus
BG improving the chances of getting a full dump. (CPU performance has
BG progressed at a much higher rate of change than disk throughputs
BG have).

BG (I don't know what the compression ratios are, but I'd imagine they
BG would be pretty high).

By default only kernel pages are saved to the dump device, so even without
compression the dump can be smaller than the RAM size in a server. I often see
compression ratios of 1.x or 2.x, nothing more (it's lzjb after all).

Now with ZFS the story is a little bit different, as its caches are
treated as kernel pages, so on file servers you are basically dumping all
of memory... there's an open bug for it.
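
For anyone who wants to check what their machine is actually configured to
dump, something like this shows (and changes) the dump content setting; a
minimal sketch, assuming stock dumpadm, with example paths only:

  dumpadm                          # show dump device, content type and savecore directory
  dumpadm -c kernel                # kernel pages only (the default described above)
  dumpadm -c all                   # all of memory, if user pages are needed too
  dumpadm -s /var/crash/`hostname` # where savecore writes the (compressed) dump at boot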

-- 
Best regards,
 Robert    mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Re[2]: Re: opensol-20060605 # zpool iostat -v 1

2007-04-26 Thread Robert Milkowski
Hello Ron,

Tuesday, April 24, 2007, 4:54:52 PM, you wrote:

RH Thanks Robert. This will be put to use.

Please let us know about the results.

-- 
Best regards,
 Robert    mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Status Update before Reinstall?

2007-04-26 Thread Brian Hechinger
On Wed, Apr 25, 2007 at 09:55:16PM -0400, Brian Gupta wrote:
 
 In Solaris 8(?) this changed, in that crash dump streams were
 compressed as they were written out to disk. Although I've never read
 this anywhere, I assumed the reasons this was done are as follows:

What happens if the dump slice is too small?  Does the dump just fail?
I mostly don't care about dumps, so..  ;)

-brian
-- 
Perl can be fast and elegant as much as J2EE can be fast and elegant.
In the hands of a skilled artisan, it can and does happen; it's just
that most of the shit out there is built by people who'd be better
suited to making sure that my burger is cooked thoroughly.  -- Jonathan 
Patschke
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Status Update before Reinstall?

2007-04-26 Thread Brian Hechinger
On Wed, Apr 25, 2007 at 09:30:12PM -0700, Richard Elling wrote:
 
 IMHO, only a few people in the world care about dumps at all (and you
 know who you are :-).  If you care, setup dump to an NFS server somewhere,
 no need to have it local.

a) What does this entail?

b) With zvols not supporting dump, what would happen if one were to
set up a machine with no dump slice at all? Would it just skip the
dump completely?

-brian
-- 
Perl can be fast and elegant as much as J2EE can be fast and elegant.
In the hands of a skilled artisan, it can and does happen; it's just
that most of the shit out there is built by people who'd be better
suited to making sure that my burger is cooked thoroughly.  -- Jonathan 
Patschke
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] XServe Raid Complex Storage Considerations

2007-04-26 Thread cedric briner

The Xraid is a very well thought of storage device with a heck of a price
point.  Attached is an image of the Settings/Performance Screen where
you see Allow Host Cache Flushing.

I think when you use ZFS, it would be best to uncheck that box.
This is what happens when you use the GUI in your native language (French 
in my case). I finally understood what it meant in French after 
reading it from your image in English :)


And your setting just boosted my bandwidth from 0.8 MiB/s to 7 MiB/s* !!
Good to see that it just works.


The only 2 drawbacks to using Xserve raid that I have found are:

1. Partition Management, dynamic expansion and Volume management.  If we
stay native in OS X tools/filesystems we can't partition with free space and then
later try to create a partition and retain the data from the already created
partition.  This really sucks.  I'm betting Xsan changes these limitations,
however.
I would love it if the Xserve RAID just provided a way to export the 14 disks 
to the host. That way, we could manage them with ZFS in a more fine-grained 
fashion, something like the sketch below.
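
If the array ever did export the individual disks as LUNs, what I have in
mind is roughly the sketch below; the device names are hypothetical, the
idea being to mirror each disk behind controller A against one behind
controller B so ZFS survives a controller failure:

  # one ZFS mirror per disk pair, first half on controller A (c2), second on controller B (c3)
  zpool create xtank \
    mirror c2t0d0 c3t0d0 \
    mirror c2t1d0 c3t1d0 \
    mirror c2t2d0 c3t2d0
  zpool status xtank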



2. Each controller can only talk to 7 disks (1/2 the array).

Other than that, the thing is really fast, and quite reliable.  Not to
mention the sexy blue lights that tell you it's hummin'.

yeah right.. quite sexy !


-Andy


Ced.
* (a MiB is a mebibyte, 2^20 bytes; ref: http://en.wikipedia.org/wiki/Mebibyte)


--

Cedric BRINER
Geneva - Switzerland
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS disables nfs/server on a host

2007-04-26 Thread Ben Miller
I was able to duplicate this problem on a test Ultra 10.  I put in a workaround 
by adding a service that depends on /milestone/multi-user-server which does a 
'zfs share -a'.  It's strange this hasn't happened on other systems, but maybe 
it's related to slower systems...

Ben
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS disables nfs/server on a host

2007-04-26 Thread Mark J Musante
On Thu, 26 Apr 2007, Ben Miller wrote:

 I just rebooted this host this morning and the same thing happened again.  I 
 have the core file from zfs.

 [ Apr 26 07:47:01 Executing start method (/lib/svc/method/nfs-server start) ]
 Assertion failed: pclose(fp) == 0, file ../common/libzfs_mount.c, line 380, function zfs_share
 Abort - core dumped

The fact that there is no output between the 'executing start method' line and
the assertion means that the popen() succeeded but the fgets() failed.  It's
possible that fgets() was interrupted and returned EINTR, which isn't
currently being handled by the code in zfs_share_nfs().

The code I'm looking at starts here:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libzfs/common/libzfs_mount.c#454

It'd be nice to see a truss or dtrace of this to help narrow it down.
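
If it helps, one way to capture that (just a sketch, paths are examples) is
to reproduce the failing path by hand and let truss follow the child
processes:

  truss -f -o /tmp/zfs-share.truss zfs share -a
  grep EINTR /tmp/zfs-share.truss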


Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] HowTo: UPS + ZFS NFS + no fsync

2007-04-26 Thread cedric briner

Hello,

I wonder if the subject of this email is not self-explanatory?


Okay, let's say that it is not. :)
Imagine that I set up a box:
 - with Solaris
 - with many HDs (directly attached)
 - using ZFS as the FS
 - exporting the data with NFS
 - on a UPS.

Then, after reading
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_and_Complex_Storage_Considerations
I wonder if there is a way to tell the OS to ignore the fsync/cache-flush 
commands, since the cached writes are likely to survive a power outage anyway.



Ced.

--

Cedric BRINER
Geneva - Switzerland
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HowTo: UPS + ZFS NFS + no fsync

2007-04-26 Thread Wee Yeh Tan

On 4/26/07, cedric briner [EMAIL PROTECTED] wrote:

okay let'say that it is not. :)
Imagine that I setup a box:
  - with Solaris
  - with many HDs (directly attached).
  - use ZFS as the FS
  - export the Data with NFS
  - on an UPS.

Then after reading the :
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_and_Complex_Storage_Considerations
I wonder if there is a way to tell the OS to ignore the fsync flush
commands since they are likely to survive a power outage.


Cedric,

You do not want to ignore syncs from ZFS if your hard disk is directly
attached to the server.  As the document mentions, that is really for
Complex Storage with NVRAM where flushing is not necessary.



--
Just me,
Wire ...
Blog: prstat.blogspot.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HowTo: UPS + ZFS NFS + no fsync

2007-04-26 Thread Roch - PAE

You might set zil_disable to 1 (_then_ mount the fs to be
shared). But you're still exposed to OS crashes; those would 
still corrupt your nfs clients.
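
For completeness, the usual ways this tunable gets set on this vintage of
Nevada/Solaris (a sketch, not a recommendation; note the caveats above, and
that already-mounted filesystems need a remount to pick it up):

  # persistently, applied at the next boot:
  echo 'set zfs:zil_disable = 1' >> /etc/system

  # or on a running system:
  echo 'zil_disable/W0t1' | mdb -kw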

-r


cedric briner writes:
  Hello,
  
  I wonder if the subject of this email is not self-explanetory ?
  
  
  okay let'say that it is not. :)
  Imagine that I setup a box:
- with Solaris
- with many HDs (directly attached).
- use ZFS as the FS
- export the Data with NFS
- on an UPS.
  
  Then after reading the : 
  http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_and_Complex_Storage_Considerations
  I wonder if there is a way to tell the OS to ignore the fsync flush 
  commands since they are likely to survive a power outage.
  
  
  Ced.
  
  -- 
  
  Cedric BRINER
  Geneva - Switzerland
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HowTo: UPS + ZFS NFS + no fsync

2007-04-26 Thread cedric briner

okay let'say that it is not. :)
Imagine that I setup a box:
  - with Solaris
  - with many HDs (directly attached).
  - use ZFS as the FS
  - export the Data with NFS
  - on an UPS.

Then after reading the :
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_and_Complex_Storage_Considerations 


I wonder if there is a way to tell the OS to ignore the fsync flush
commands since they are likely to survive a power outage.


Cedric,

You do not want to ignore syncs from ZFS if your harddisk is directly
attached to the server.  As the document mentioned, that is really for
Complex Storage with NVRAM where flush is not necessary.


This post follows `XServe Raid  Complex Storage Considerations'
http://www.opensolaris.org/jive/thread.jspa?threadID=29276tstart=0

where we made the assumption (*1) that if the XServe RAID is connected 
to a UPS, we can treat the RAM in the XServe RAID as if it were NVRAM.


(*1)
  This assumption is even pointed out by Roch:
  http://blogs.sun.com/roch/#zfs_to_ufs_performance_comparison
   Intelligent Storage
  through: `the Shenanigans with ZFS flushing and intelligent arrays...'
  http://blogs.digitar.com/jjww/?itemid=44
   Tell your array to ignore ZFS' flush commands

So in this way, when we export it over NFS we get a boost in bandwidth.

Okay, then is there any difference that I'm not catching between:
 - the `Shenanigans with ZFS flushing and intelligent arrays...' setup
 - and my situation?

I mean, I want to have a cheap and reliable NFS service. Why should I 
buy expensive `Complex Storage with NVRAM' instead of just buying a machine 
with 8 IDE HDs?



Ced.
--

Cedric BRINER
Geneva - Switzerland
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Status Update before Reinstall?

2007-04-26 Thread Lori Alt

So first of all, we're not proposing dumping to a filesystem.
We're proposing dumping to a zvol, which is a raw volume
implemented within a pool (see the -V option to the zfs
create command).  As Malachi points out, the advantage
of this is that it simplifies the ongoing administration.  You don't
have to pre-allocate a slice of the appropriate size and
then be unable to grow the space later.
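
To make the administration point concrete, once dump-to-zvol support
arrives the setup would presumably look something like the sketch below
(hypothetical today, since dump on a zvol isn't supported yet, and the pool
and volume names are made up); the zvol can be grown later with volsize,
which is exactly what a fixed slice can't do:

  zfs create -V 2g rpool/dump              # a 2 GB raw volume inside the pool
  dumpadm -d /dev/zvol/dsk/rpool/dump      # point dumpadm at the zvol (not possible yet)
  zfs set volsize=8g rpool/dump            # grow it later if the machine gains RAM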

You are right that at crash dump time, you want as little
complexity as possible in the process of writing out the dump
because there's no knowing how broken the system is.
So consider what happens with dump files and UFS.  With
UFS, you can set up a file as a dump device.  This is not as
crazy as it sounds because at the time you set up the dump
device (through dumpadm), UFS allocates the space and
sets up an array of offset-length pointers to the space, so
that at the time the crash dump takes place, some really
dumb code in the kernel just has to run that list and hose
the memory contents into those pre-allocated areas on the
disk.  We are looking at doing something similar with zfs,
where the space is allocated and pointers to it prepared in
advance, so that at crash time, we only need very simple
code to write out the dump.

I'm not in charge of the zfs dump development, so I don't
know the technical details, but I think that the development
is proceeding along these lines. 


Lori



Brian Gupta wrote:

Please bear with me, as I am not very familiar with ZFS. (And
unfortunately probably won't have time to be until ZFS supports root
boot and clustering in a named release).

I do understand the reasons why you would want to dump to a virtual
construct. I am just not very comfortable with the concept.

My instinct is that you want the fewest layers of software involved in
the event of a system crashdump.

To me dumping to logical volumes or filesystems seems like asking for
trouble. Now on the other hand, if you were to dump to an underlying
zdev it starts to make sense. (Assuming a zdev is basically a
physical chunk of a LUN or disk.)

Please educate me as to what I am missing.

Thanks,
Brian

On 4/25/07, Malachi de Ælfweald [EMAIL PROTECTED] wrote:

Maybe so that it can grow rather than being tied to a specific piece of
hardware?

Malachi


On 4/25/07, Brian Gupta  [EMAIL PROTECTED] wrote:

  Yes, dump on ZVOL isn't currently supported, so a dump slice is still
needed.

 Maybe a dumb question, but why would anyone ever want to dump to an
 actual filesystem? (Or is my head thinking too Solaris)

 Actually I could see why, but I don't think it is a good idea.

 -brian
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HowTo: UPS + ZFS NFS + no fsync

2007-04-26 Thread Neil . Perrin

cedric briner wrote:

You might set zil_disable to 1 (_then_ mount the fs to be
shared). But you're still exposed to OS crashes; those would still 
corrupt your nfs clients.


-r



hello Roch,

I have a few questions

1)
from:
   Shenanigans with ZFS flushing and intelligent arrays...
   http://blogs.digitar.com/jjww/?itemid=44
I read :
  Disable the ZIL. The ZIL is the way ZFS maintains _consistency_ until 
it can get the blocks written to their final place on the disk.


This is wrong. The on-disk format is always consistent.
The author of this blog is misinformed and is probably getting
confused with traditional journalling.


That's why the ZIL flushes the cache.


The ZIL flushes its blocks to ensure that if a power failure/panic occurs
then the data the system guarantees to be on stable storage (due say to a fsync
or O_DSYNC) is actually on stable storage.

If you don't have the ZIL and a power 
outage occurs, your blocks may go poof in your server's RAM...'cause 
they never made it to the disk Kemosabe.


True, but not blocks, rather system call transactions - as this is what the
ZIL handles.



from :
   Eric Kustarz's Weblog
   http://blogs.sun.com/erickustarz/entry/zil_disable
I read :
   Note: disabling the ZIL does _NOT_ compromise filesystem integrity. 
Disabling the ZIL does NOT cause corruption in ZFS.


then:
   I don't understand: in one they say that:
- we can lose _consistency_
   and in the other they say that:
- it does not compromise filesystem integrity
   so... which one is right?


Eric's, who works on ZFS!




2)
from :
   Eric Kustarz's Weblog
   http://blogs.sun.com/erickustarz/entry/zil_disable
I read:
  Disabling the ZIL is definitely frowned upon and can cause your 
applications much confusion. Disabling the ZIL can cause corruption for 
NFS clients in the case where a reply to the client is done before the 
server crashes, and the server crashes before the data is commited to 
stable storage. If you can't live with this, then don't turn off the ZIL.


then:
   The service that we export with ZFS and NFS is not something like 
databases or some really stressful system; we are just exporting home 
directories. So it feels to me that we can just disable the ZIL.


3)
from:
   NFS and ZFS, a fine combination
   http://blogs.sun.com/roch/#zfs_to_ufs_performance_comparison
I read:
   NFS service with risk of corruption of client's side view :

nfs/ufs :  7 sec (write cache enable)
nfs/zfs :  4.2   sec (write cache enable,zil_disable=1)
nfs/zfs :  4.7   sec (write cache disable,zil_disable=1)

Semantically correct NFS service :

nfs/ufs : 17 sec (write cache disable)
nfs/zfs : 12 sec (write cache disable,zil_disable=0)
nfs/zfs :  7 sec (write cache enable,zil_disable=0)

then:
   Does this mean that when you just create a UFS filesystem and export it 
with NFS, you are running a non-semantically-correct NFS service? And that 
you have to disable the write cache to have a correct NFS server???


Yes. UFS requires the write cache to be disabled to maintain consistency.



4)
so can we say that people who are used to an NFS service with a risk of 
corruption of the client-side view can just take ZFS and disable the ZIL?


I suppose, but we aim to strive for better than expected corruption.
We (ZFS) recommend not disabling the ZIL.
We also recommend not disabling disk write-cache flushing unless the caches
are backed by NVRAM or a UPS.



thanks in advance for your clarifications

Ced.
P.S. Do any of you know the best way to send an email containing 
many questions? Should I create a thread for each of them 
next time?


This works.

- Good questions.

Neil.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Status Update before Reinstall?

2007-04-26 Thread Bill Sommerfeld
On Wed, 2007-04-25 at 21:30 -0700, Richard Elling wrote:
 Brian Gupta wrote:
  Maybe a dumb question, but why would anyone ever want to dump to an
  actual filesystem? (Or is my head thinking too Solaris)
 
 IMHO, only a few people in the world care about dumps at all (and you
 know who you are :-). 

sorry, but that's an attitude which is toxic to quality.

EVERY installed Solaris system should be able to generate and save a
valid crash dump, to increase the chance that a bug can be root-caused
the first time a customer sees it.

- Bill


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: zfs boot image conversion kit is posted

2007-04-26 Thread Benjamin Perrault
Don't mean to be a pest - but is there an ETA on when the b62_zfsboot.iso will 
be posted? 

I'm really looking forward to ZFS root, but I'd rather download a working DVD 
image than attempt to patch the image myself :-)

cheers and thanks,
-bp
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[2]: [zfs-discuss] HowTo: UPS + ZFS NFS + no fsync

2007-04-26 Thread Robert Milkowski
Hello Wee,

Thursday, April 26, 2007, 4:21:00 PM, you wrote:

WYT On 4/26/07, cedric briner [EMAIL PROTECTED] wrote:
 okay let'say that it is not. :)
 Imagine that I setup a box:
   - with Solaris
   - with many HDs (directly attached).
   - use ZFS as the FS
   - export the Data with NFS
   - on an UPS.

 Then after reading the :
 http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_and_Complex_Storage_Considerations
 I wonder if there is a way to tell the OS to ignore the fsync flush
 commands since they are likely to survive a power outage.

WYT Cedric,

WYT You do not want to ignore syncs from ZFS if your harddisk is directly
WYT attached to the server.  As the document mentioned, that is really for
WYT Complex Storage with NVRAM where flush is not necessary.


What??

Setting zil_disable=1 has nothing to do with NVRAM in storage arrays.
It disables the ZIL in ZFS, which means that if an application calls fsync() or
opens a file with O_DSYNC, etc., then ZFS won't honor it (it returns
immediately without committing to stable storage).

Once the transaction group (txg) closes, data will be written to the disks and
SCSI write-cache flush commands will be sent.

Setting zil_disable to 1 is not that bad actually, and if someone
doesn't mind losing the last N seconds of data in case of a server
crash (ZFS itself will still be consistent), it can actually speed up
NFS operations a lot.

btw: in a way, people accustomed to Linux have always had zil_disable set
to 1... :)


-- 
Best regards,
 Robert    mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS Boot: Dividing up the name space

2007-04-26 Thread Mike Dotson
 Peter Tribble wrote:
  On 4/24/07, Darren J Moffat [EMAIL PROTECTED] wrote:
  With reference to Lori's blog posting[1] I'd like to throw out a few of
  my thoughts on splitting up the namespace.
 
  Just a plea with my sysadmin hat on - please don't go overboard
  and make new filesystems just because we can. Each extra
  filesystem generates more work for the administrator, if only
  for the effort to parse df output (which is more than cluttered
  enough already).
 My first reaction to that is: yes, of course, extra file systems are extra
 work.  Don't require them, and don't even make them the default unless
 they buy you a lot.  But then I thought, no, let's challenge that a bit.
 
 Why do administrators do 'df' commands?  It's to find out how much space
 is used or available in a single file system.  That made sense when file
 systems each had their own dedicated slice, but now it doesn't make that
 much sense anymore.  Unless you've assigned a quota to a zfs file system,
 space available is meaningful more at the pool level.  And if you DID
 assign a quota to the file system, then you really did want that part of
 the name space to be a separate, and separately manageable, file system.

I'd like to put my sysadmin hat on and add to this:  

Yes, if you start adding quotas, etc. you'll have to start looking at doing 
df's again, but this is actually easier with zfs (zfs list).  Now I can see, 
very easily, where my space is being allocated and start diving in from there 
instead of the multiple du -ks * | sort -n recursive rampages I do on one big 
filesystem.

Also, if I start using zfs and some of the other features (read-only, for 
example), I can start locking down some of these filesystems (/usr perhaps???) 
so I no longer need to worry about the space being allocated in /usr.  Or 
setting reservations and quotas on file systems, basically eliminating them 
from my constant monitoring and the free-space shuffle of where did my space 
go.
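
To give a flavour of what I mean (dataset names and sizes are just examples
off the top of my head):

  zfs list                                 # see where the space went, no root needed
  zfs set quota=10g tank/home/mike         # cap a filesystem
  zfs set reservation=2g tank/postgres     # guarantee space for the data that matters
  zfs set readonly=on tank/usr             # lock down a filesystem that shouldn't change
  zfs set compression=on tank/opt          # squeeze the less performance-critical stuff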

 
 With zfs, file systems are in many ways more like directories than what
 we used to call file systems.  They draw from pooled storage.  They have
 low overhead and are easy to create and destroy.  File systems are sort
 of like super-functional directories, with quality-of-service control
 and cloning and snapshots.  Many of the things that sysadmins used to
 have to do with file systems just aren't necessary or even meaningful
 anymore.  And so maybe the additional work of managing more file systems
 is actually a lot smaller than you might initially think.

I believe so.  Just having zfs boot on my system for a couple of days and 
breaking out the major food groups, I can easily see where my space is at - 
again, zfs list is much faster than du -ks and I don't have to be root for it 
to be 100% accurate - my postgres data files aren't owned by me ;)

Other things (I've mentioned to Lori off-alias) include the possible ability 
to compress some file systems - again, possibly /usr and /opt???

Breaking out the namespace provides the flexibility of separate file systems 
and snapping/cloning/administering those as needed, with the benefits of a 
single root file system - one disk and not having to get the partition space 
right.

But there is the matter of balance - too much would be overkill.  Perhaps the 
split and merge RFEs would bridge that gap and provide even more flexibility?

 
 In other words, think about ALL of the implications of using zfs,
 not just some.
 
 We've come up with a lot of good reasons for having multiple file
 systems.  So we know that there are benefits.  We also know that there
 are costs.  But if we can figure out a way to keep the costs low, the
 benefits might outweigh them.
 
 
  In other words, let people have a system with just one filesystem.
 I think I can agree with this, but I'm not absolutely certain.  On the
 one hand, sure, more freedom is better.  But I'm concerned that our
 long-term install and upgrade strategies might be constrained by having
 to support configurations that haven't been set up with the granularity
 needed for some kinds of valuable storage management features.
 
 This conversation is great!  I'm getting lots of good information and I
 *really* want to figure out what's best, even if it challenges some of
 my cherished notions.
 
 Lori
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: zfs boot image conversion kit is posted

2007-04-26 Thread Ian Collins
Lori Alt wrote:

 Benjamin Perrault wrote:

 Don't mean to be a pest - but is there an eta on when the
 b62_zfsboot.iso will be posted?
 I'm really looking forward to ZFS root, but I'd rather download a
 working dvd image then attempt to patch the image myself :-)

 Actually, we hadn't planned to release zfsboot dvd images, but I'll
 look into it.

Or an alternative CD1, for those of us who do network installs after
booting from CD.

Ian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] cow performance penatly

2007-04-26 Thread Erblichs
Ming,

Let's take a pro example with a minimal performance
tradeoff.

All FSs that modify a disk block, IMO, do a full
disk-block read before anything else.

If doing an extending write and moving to a
larger block size, with COW you give yourself
the ability to write a single block
versus
having to fill the original block and also needing
to write the next block. The performance loss is the
additional latency to transfer more bytes within the
larger block on the next access.

This doesn't just benefit writes at the end of the file
but also at both ends of a hole within the file. In
addition, the next non-recent I/O op that accesses the
disk block will be able to perform a single seek. Also,
if we allow ourselves to dynamically increase the size
of the block and we still have direct access to the
blocks, we can put off the additional latency of
going to an indirect block or...

So, this has a performance benefit in addition to
removing the case where an OS panic occurs in the
middle of writing the disk block and loses both the
original and the full next iteration of the file. After the
write completes we should be able to update the
FS's node data struct.

Mitchell Erblich
Ex-Sun Kernel Engineer who proposed and implemented this
 in a limited release of UFS many years ago.
--


Ming Zhang wrote:
 
 Hi All
 
 I wonder if any one have idea about the performance loss caused by COW
 in ZFS? If you have to read old data out before write it to some other
 place, it involve disk seek.
 
 Thanks
 
 Ming
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Status Update before Reinstall?

2007-04-26 Thread Malachi de Ælfweald

Just an interesting side note: network-based logging isn't always a bad
thing. I'll give you an example. My Netgear router will crash within 1/2
hour if I turn local logging on.  However, it has no problem sending the
logs via syslog to another machine.

Just a thought.

Mal

On 4/26/07, Adam Leventhal [EMAIL PROTECTED] wrote:


On Wed, Apr 25, 2007 at 09:30:12PM -0700, Richard Elling wrote:
 IMHO, only a few people in the world care about dumps at all (and you
 know who you are :-).  If you care, setup dump to an NFS server
somewhere,
 no need to have it local.

Well IMHO, every Solaris customer cares about crash dumps (although they
may not know it). There are failures that occur once -- no dump means no
solution.

And you're not going to be dumping directly over NFS if you care about
your crash dump (see previous point).

Adam

--
Adam Leventhal, Solaris Kernel Development   http://blogs.sun.com/ahl
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss