Re: [zfs-discuss] ZFS - Sudden decrease in write performance
arc-discuss doesn't have anything specifically to do with ZFS; in particular, it has nothing to do with the ZFS ARC. Just an unfortunate overlap of acronyms. Cross-posted to zfs-discuss, where this probably belongs.

Hey all! Recently I've decided to implement OpenSolaris as a target for BackupExec. The server I've converted into a storage appliance is an IBM x3650 M2 with ~4TB of on-board storage via ~10 local SATA drives, and I'm using OpenSolaris snv_134. I'm using a QLogic 4Gb FC HBA with the qlt driver, and presented an 8TB sparse volume to the host because dedup and compression are turned on for the zpool.

When writes begin, I see anywhere from 4.5GB/min to 5.5GB/min and then it drops off quickly (I mean down to 1GB/min or less). I've already swapped out the card, cable, and port with no results. I have since ensured that every piece of equipment on the box had its firmware updated. While doing so, I installed Windows Server 2008 to flash all the firmware (IBM doesn't have a Solaris installer). While in Server 2008, I decided to just attempt a backup via a share on the 1Gb/s copper connection. I saw speeds of up to 5.5GB/min consistently, and they were sustained throughout 3 days of testing.

Today I decided to move back to OpenSolaris with confidence. All writes began at 5.5GB/min and quickly dropped off. In my troubleshooting efforts, I have also dropped the fiber connection and made it an iSCSI target with no performance gains. I have let the on-board RAID controller do the RAID portion instead of creating a zpool of multiple disks, with no performance gains. And I have created the target LUN using both rdsk and dsk paths.

I did notice today, though, that there is a direct correlation between the ARC memory usage and speed. Using arcstat.pl, as soon as arcsz hits 1G (half of the c column, the ARC target size), my throughput hits the floor (i.e. 600MB/min or less). I can't figure it out. I tried every configuration possible.
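As a rough sketch of the kind of experiment that would confirm or rule out the ARC ceiling as the culprit (the 4GB figure below is only a placeholder, not a recommendation), one can pin the ARC's maximum size in /etc/system, reboot, and then watch the size and target kstats while a backup runs:

  # /etc/system: cap the ARC at 4 GB (value in bytes); takes effect after a reboot
  set zfs:zfs_arc_max=0x100000000

  # after reboot, watch the current ARC size and target
  kstat -p zfs:0:arcstats:size zfs:0:arcstats:c

If throughput still collapses at half of whatever target you set, the ARC size itself probably isn't the limiting factor.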
Re: [zfs-discuss] zfs under medium load causes SMB to delay writes
This is not the appropriate group/list for this message. Crossposting to zfs-discuss (where it perhaps primarily belongs) and to cifs-discuss, which also relates.

Hi, I have an I/O load issue and after days of searching wanted to know if anyone has pointers on how to approach this. My zfs system (raidz3, 8 2TB drives, all OK), stable for a year, just started to cause problems when I introduced a new backup script that puts a medium I/O load on it. This script simply tars up a few filesystems and md5sums the tarball, to copy to another system for off-system backup. The simple commands are:

  tar -c /tank/[filesystem]/.zfs/snapshot/[snapshot] > /tank/[otherfilesystem]/file.tar
  md5sum -b /tank/[otherfilesystem]/file.tar > /tank/[otherfilesystem]/file.md5sum

These two commands obviously cause high read/write I/O because the 8 drives are directly reading and writing a large amount of data as fast as the system can go. This is OK and OpenSolaris functions fine. The problem is I host VMware images on another PC which access their images on this zfs box over SMB, and during this high I/O period, these VMware guests are crashing. What I think is happening is that during the backup with high I/O, zfs is delaying reads/writes to the VMware images. This is quickly causing VMware to freeze the guest machines. When the backup script is not running, the VMware guests are fine, and have been fine for a year (the setup has been rock solid).

Any idea how to address this? I'd thought of putting the relevant filesystem (tank/vmware) on a higher priority for reads/writes, but haven't figured out how. Another way is to deprioritize the backup somehow. Any pointers would be appreciated. Thanks, Tom
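Not a real fix for the I/O-scheduling side, but as a crude sketch of deprioritizing the backup itself (this only helps if CPU contention is part of the problem; the paths here are placeholders), the tar can be run in the fixed-priority scheduling class at the lowest priority:

  # run the backup at FX priority 0 so it never competes with normal work for CPU
  priocntl -e -c FX -m 0 -p 0 tar -c /tank/fs/.zfs/snapshot/snap > /tank/backup/file.tar

If the stall really is zfs delaying the VMware writes behind the backup's writes, throttling the backup's read rate may matter more than its CPU priority.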
Re: [zfs-discuss] sharesmb should be ignored if filesystem is not mounted
> On 10/28/10 08:40 AM, Richard L. Hamilton wrote:
>> I have sharesmb=on set for a bunch of filesystems, including three that weren't mounted. Nevertheless, all of those are advertised. Needless to say, the one that isn't mounted can't be accessed remotely, even though, since it's advertised, it looks like it could be.
>
> When you say advertised, do you mean that it appears in /etc/dfs/sharetab when the dataset is not mounted, and/or that you can see it from a client with 'net view'? I'm using a recent build and I see the smb share disappear from both when the dataset is unmounted.

I could see it in Finder on a Mac client; presumably were I on a Windows client, it would have appeared with net view. I've since turned off the sharesmb property on those filesystems, so I may need to reboot (which I'd much rather not) to re-create the problem. But if recent builds don't have the problem, that's the main thing.
[zfs-discuss] sharesmb should be ignored if filesystem is not mounted
I have sharesmb=on set for a bunch of filesystems, including three that weren't mounted. Nevertheless, all of those are advertised. Needless to say, the one that isn't mounted can't be accessed remotely, even though, since it's advertised, it looks like it could be.

  # zfs list -o name,mountpoint,sharesmb,mounted | awk '$(NF-1)!="off" && $(NF-1)!="-" && $NF!="yes"'
  NAME                MOUNTPOINT                  SHARESMB  MOUNTED
  rpool/ROOT          legacy                      on        no
  rpool/ROOT/snv_129  /                           on        no
  rpool/ROOT/snv_93   /tmp/.alt.luupdall.22709    on        no
  #

So I think that if a zfs filesystem is not mounted, sharesmb should be ignored. This is in snv_97 (SXCE; with a pending LU BE not yet activated, and an old one no longer active); I don't know if it's still a problem in current builds that unmounted filesystems are advertised, but if it is, I can see how it could confuse clients. So I thought I'd mention it.
Re: [zfs-discuss] sharesmb should be ignored if filesystem is not mounted
PS obviously these are home systems; in a real environment, I'd only be sharing out filesystems with user or application data, and not local system filesystems! But since it's just me, I somewhat trust myself not to shoot myself in the foot.
Re: [zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side
On Thu, Sep 30, 2010 at 08:14:24PM -0400, Miles Nordin wrote:
>> Can the user in (3) fix the permissions from Windows?
> no, not under my proposal.

Then your proposal is a non-starter. Support for multiple remote filesystem access protocols is key for ZFS and Solaris. The impedance mismatches between these various protocols mean that we need to make some trade-offs. In this case I think the business (as well as the engineers involved) would assert that being a good SMB server is critical, and that being able to authoritatively edit file permissions via SMB clients is part of what it means to be a good SMB server.

Now, you could argue that we should bring aclmode back and let the user choose which trade-offs to make. And you might propose new values for aclmode or enhancements to the groupmask setting of aclmode.

> but it sounds like currently people cannot ``fix'' permissions through the quirky autotranslation anyway, certainly not to the point where neither unix nor windows users are confused: windows users are always confused, and unix users don't get to see all the permissions.

Thus the current behavior is the same as the old aclmode=discard setting.

> Now what? set the unix perms to 777 as a sign to the unix people to either (a) leave it alone, or (b) learn to use 'chmod A...'. This will actually work: it's not a hand-waving hypothetical that just doesn't play out.

That's not an option, not for a default behavior anyway.

Nico

One question: Casper, where are you? The guy that did fine-grained permissions IMO ought to have an idea of how to do something with ACLs that's both safe and unsurprising for the various sorts of clients.
Re: [zfs-discuss] Please warn a home user against OpenSolaris under VirtualBox under WinXP ; )
Hmm...according to http://www.mail-archive.com/vbox-users-commun...@lists.sourceforge.net/msg00640.html that's only needed before VirtualBox 3.2, or for IDE. >= 3.2, non-IDE should honor flush requests, if I read that correctly. Which is good, because I haven't seen an example of how to enable flushing for SAS (which is the emulation I usually use because it's supposed to have better performance).
[zfs-discuss] fs root inode number?
Typically on most filesystems, the inode number of the root directory of the filesystem is 2, with 0 being unused and 1 historically once invisible and used for bad blocks (no longer done, but kept reserved so as not to invalidate assumptions implicit in ufsdump tapes). However, my observation seems to be (at least back at snv_97) that the inode number of ZFS filesystem root directories (including at the top level of a zpool) is 3, not 2.

If there's any POSIX/SUS requirement for the traditional number 2, I haven't found it. So maybe there's no reason founded in official standards for keeping it the same. But there are bound to be programs that make what was, with other filesystems, a safe assumption. Perhaps a warning is in order, if there isn't already one.

Is there some _reason_ why the inode number of filesystem root directories in ZFS is 3 rather than 2?
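Easy enough for anyone to check on their own system; the paths below are only examples, and -d with -i makes ls report the inode number of the directory itself rather than of its contents:

  ls -di /tank /tank/home        # ZFS filesystem roots: prints 3, per the observation above
  ls -di /ufs_mountpoint         # a UFS root directory: prints 2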
Re: [zfs-discuss] Debunking the dedup memory myth
Even the most expensive decompression algorithms generally run significantly faster than I/O to disk -- at least when real disks are involved. So, as long as you don't run out of CPU and have to wait for CPU to be available for decompression, the decompression will win.

The same concept is true for dedup, although I don't necessarily think of dedup as a form of compression (others might reasonably do so though). Effectively, dedup is a form of compression of the filesystem rather than of any single file, but one oriented toward not interfering with access to any of what may be sharing blocks. I would imagine that if it's read-mostly, it's a win, but otherwise it costs more than it saves. Even conventional compression tends to be more resource-intensive on the compress side than on the decompress side...

What I'm wondering is when dedup is a better value than compression. Most obviously, when there are a lot of identical blocks across different files; but I'm not sure how often that happens, aside from maybe blocks of zeros (which may well be sparse anyway).
Re: [zfs-discuss] Legality and the future of zfs...
> Losing ZFS would indeed be disastrous, as it would leave Solaris with only the Veritas File System (VxFS) as a semi-modern filesystem, and a non-native FS at that (i.e. VxFS is a 3rd-party for-pay FS, which severely inhibits its uptake). UFS is just way too old to be competitive these days.

Having come to depend on them, the absence of some of the features would certainly be significant. But how come everyone forgets about QFS?
http://www.sun.com/storage/management_software/data_management/qfs/index.xml
http://en.wikipedia.org/wiki/QFS
http://hub.opensolaris.org/bin/view/Project+samqfs/WebHome
Re: [zfs-discuss] Legality and the future of zfs...
On Tue, 13 Jul 2010, Edward Ned Harvey wrote:
> It is true there's no new build published in the last 3 months. But you can't use that to assume they're killing the community.

Hmm, the community seems to think they're killing the community:
http://developers.slashdot.org/story/10/07/14/1448209/OpenSolaris-Governing-Board-Closing-Shop?from=rss

ZFS is great. It's pretty much the only reason we're running Solaris. But I don't have much confidence Oracle Solaris is going to be a product I'm going to want to run in the future. We barely put our ZFS stuff into production last year, but quite frankly I'm already on the lookout for something to replace it. No new version of OpenSolaris (which we were about to start migrating to). No new update of Solaris 10. *Zero* information about what the hell's going on...

Presumably if you have a maintenance contract or some other formal relationship, you could get an NDA briefing. Not having been to one yet myself, I don't know what that would tell you, but presumably more than without it. Still, the silence is quite unhelpful, and the apparent lack of anyone willing to recognize that, and with the authority to do anything about it, is troubling.

ZFS will surely live on as the filesystem under the hood in the doubtlessly forthcoming Oracle database appliances, and I'm sure they'll keep selling their NAS devices. But for home users? I doubt it. I was about to build a big storage box at home running OpenSolaris; I froze that project. Oracle is all about the money. Which I guess is why they're succeeding and Sun failed to the point of having to sell out to them.

My home use wasn't exactly going to make them a profit, but on the other hand, the philosophy that led to my not using the product at home is a direct cause of my lack of desire to continue using it at work, and while we're not exactly a huge client, we've dropped a decent penny or two in Sun's wallet over the years. FWIW, you're not the only one that's tried to make that point!

Who knows, maybe Oracle will start to play ball before August 16th and the OpenSolaris Governing Board won't shut themselves down. But I wouldn't hold my breath. Postponement of respiration pending hypothetical actions by others is seldom an effective survival strategy.

Nevertheless, the zfs on my Sun Blade 2000 currently running SXCE snv_97 (pending luactivate and reboot to switch to snv_129) is doing just fine with what is presently 3TB of redundant storage, and will eventually grow to 9TB as I populate the rest of the slots in my JBOD (8 slots; 2 x 1TB mirror for root; presently also 2 x 2TB mirror for data, but that will change to 5 x 2TB raidz + 1 x 2TB hot spare when I can afford four more 2TB drives). I have a spare power supply and some other odds and ends for the Sun Blade 2000, so, with fingers crossed, it will run (and heat my house :-) for quite some time to come, regardless of availability of future software updates. If not, I'm sure I have an ISO of SXCE 129 or so for x86 somewhere too, which I could put on any cheap x86 box with a PCIx slot for my SAS controller, and just import the zpools and go.
Re: [zfs-discuss] Legality and the future of zfs...
> never make it any better. Just for a record: Solaris 9 and 10 from Sun was a plain crap to work with, and still is inconvenient conservative stagnationware. They won't build a free cool tools

Everybody but geeks _wants_ stagnationware, if by that you mean something that just runs. Even my old Sun Blade 100 at home still has Solaris 9 on it, because I haven't had a day to kill to split the mirror, load something newer like the last SXCE, and get everything on there working on it. (My other SPARC is running a semi-recent SXCE, and pending activation of an already-installed most recent SXCE. Sitting at a Sun, I still prefer CDE to GNOME, and the best graphics card I have for that box won't work with the newer Xorg server, so I can't see putting OpenSolaris on it.)

For instance, recent enough Solaris 10 updates to be able to do zfs root are pretty decent; you get into the habit of doing live upgrades even for patching, so you can minimize downtime. Hardly stagnant, considering that the initial release of Solaris 10 didn't even have zfs in it yet.

> for Solaris, hence the whole thing will turned to be a dry job for trained monkeys wearing suits in a corporations. Nothing more. That's a philosophy of last decade, but IT now is very changing and is very different. That is why Oracle's idea to kill community is totally stupid. And that's why IBM will win, because you run the same Linux on their hardware as you run at your home. Yes, Oracle will run good for a while, using the inertia of a hype (and latest their financial report proves that), but soon people will realize that Oracle is just another evil mean beast with great marketing and the same sh*tty products as they always had. Buy Solaris for any single little purpose? No way ever! I may buy support and/or security patches, updates. But not the OS itself. If that is the only option, then I'd rather stick to Linux from other vendor, i.e. RedHat. That will lead me to no more talk to Oracle about software at OS level, only applications (if I am an idiot enough to jump into APEX or something like that). Hence, if all I can do is talk only about hardware (well, not really, because no more hardware-only support!!!), then I'd better talk to IBM, if I need a brand and I consider myself too dumb to get SuperMicro instead. IBM System x3550 M3 is still better by characteristics than equivalent from Oracle, it is OEM if somebody needs that at first place and is still cheaper than Oracle's similar class. And IBM stuff just works great (at least if we talk about hardware).

I'm not going to say you're wrong, because in part I agree with you. Systems people can run at home, desktops, laptops, those are all what get future mindshare and eventually get people with big bucks spending them. But the simple fact that Sun went down suggests that just being all lovey-dovey (and plenty of people thought that Sun wasn't lovey-dovey _enough_?) won't keep you in business either.

> [...] But for home users? I doubt it. I was about to build a big storage box at home running OpenSolaris, I froze that project.

Mine's running SXCE, and unless I can find a solution to getting decent graphics working with Xorg on it, probably always will be. But the big (well, target 9TB redundant; presently 3TB redundant) storage is doing just fine. Being super latest and greatest just isn't necessary for that.

> Same here. A lot of nice ideas and potential open-source tools basically frozen and I think gonna be dumped. We (geeks) won't build stuff for Larry just for free. We need OS back opened in reward. So I think OpenSolaris is pretty much game over, thanks to the Oracle. Some Oracle fanboys might call it a plain FUD, hope to get updates etc, but the reality is that Oracle to OpenSolaris is pretty much the same what Palm did for BeOS. Enjoy your last snv_134 build.

I can't rule out that possibility, but I see some reasons to think that it's worth being patient for a couple more months. As it is, I find myself updating my Mac and Windows every darn week, so I'm pretty much past getting a kick out of updating just to see what's kewl.
Re: [zfs-discuss] Exporting iSCSI - it's still getting all the ZFS protect
AFAIK, zfs should be able to protect against (if the pool is redundant), or at least detect, corruption from the point that it is handed the data to the point that the data is written to permanent storage, _provided_that_ the system has ECC RAM (so it can detect and often correct random background-radiation-caused memory errors), and that, if zfs controls the whole disk and the disk has a write cache, the disk correctly honors requests to flush the write cache to permanent storage. That should be just as true for a zvol as for a regular zfs file.

What I'm trying to say is that zfs should give you a lot of protection in your situation, but that it can do nothing about it if it is handed bad data: for example, if the client is buggy and sends corrupt data, if somehow a network error goes undetected (unlikely, given that AFAIK iSCSI runs over TCP and at least thus far never over UDP, and TCP always checksums (UDP might not)), if the iSCSI server software corrupts data before writing it to disk, etc.

In other words, zfs probably gives more protection to a larger portion of the data path than just about anything else, but in the case of a remote client, whether iSCSI, NFS, CIFS, or whatever, the data path is longer and distributed, and the verification that zfs does only covers part of that. What I'm saying would _not_ apply if the client were doing zfs onto iSCSI storage; in that case, the client's zfs would also be looking after data integrity. So the closer to the data-generating application that the integrity from that point on is provided, the fewer places something bad can happen without being at least detected.

Note: I can't guarantee that any of what I said is correct, although I would be willing to risk my own data as if it were.
Re: [zfs-discuss] why both dedup and compression?
> I've googled this for a bit, but can't seem to find the answer. What does compression bring to the party that dedupe doesn't cover already? Thank you for your patience and answers.

That almost sounds like a classroom question. Pick a simple example: large text files, of which each is unique, maybe lines of data or something. Not likely to be much in the way of duplicate blocks to share, but very likely to be highly compressible. Contrast that with binary files, which might have blocks of zero bytes in them (without being strictly sparse, sometimes). With deduping, one such block is all that's actually stored (along with all the references to it, of course).

In the 30 seconds or so I've been thinking about it to type this, I would _guess_ that one might want one or the other, but rarely both, since compression might tend to work against deduplication. So given the availability of both, and how lightweight zfs filesystems are, one might want to create separate filesystems within a pool with one or the other as appropriate, and separate the data according to which would likely work better on it. Also, one might as well put compressed video, audio, and image formats in a filesystem that was _not_ compressed, since compressing an already-compressed file seldom gains much if anything more.
Re: [zfs-discuss] why both dedup and compression?
Another thought is this: _unless_ the CPU is the bottleneck on a particular system, compression (_when_ it actually helps) can speed up overall operation, by reducing the amount of I/O needed. But storing already-compressed files in a filesystem with compression is likely to result in wasted effort, with little or no gain to show for it.

Even deduplication requires some extra effort. Looking at the documentation, it implies a particular checksum algorithm _plus_ verification (if the checksum or digest matches, then make sure by doing a byte-for-byte compare of the blocks, since nothing shorter than the data itself can _guarantee_ that they're the same, just like no lossless compression can possibly work for all possible bitstreams). So doing either of these where the success rate is likely to be too low is probably not helpful.

There are stats that show the savings for a filesystem due to compression or deduplication. What I think would be interesting is some advice as to how much (percentage) savings one should be getting to expect to come out ahead not just on storage, but on overall system performance. Of course, no such guidance would exactly fit any particular workload, but I think one might be able to come up with some approximate numbers, or at least a range, below which those features probably represented a waste of effort unless space was at an absolute premium.
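For what it's worth, the per-dataset and per-pool numbers I have in mind can be pulled with something like the following (pool and dataset names are only examples):

  zfs get compressratio tank/data     # compression savings for one filesystem
  zpool get dedupratio tank           # pool-wide dedup ratio
  zdb -DD tank                        # dedup table histogram, plus its in-core/on-disk size

The zdb output is also the quickest way I know of to see how big the dedup table has grown, which bears directly on whether dedup is costing more than it saves.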
Re: [zfs-discuss] zpool rename?
> [...] To answer Richard's question, if you have to rename a pool during import due to a conflict, the only way to change it back is to re-import it with the original name. You'll have to either export the conflicting pool, or (if it's rpool) boot off of a LiveCD which doesn't use an rpool to do the rename.

Thanks. The latter is what I ended up doing (well, off of the SXCE install CD image that I'd used to set up that disk image in VirtualBox in the first place).
[zfs-discuss] zpool rename?
One can rename a zpool on import:

  zpool import -f pool_or_id newname

Is there any way to rename it (back again, perhaps) on export? (I had to rename rpool in an old disk image to access some stuff in it, and I'd like to put it back the way it was so it's properly usable if I ever want to boot off of it.) But I suppose there must be other scenarios where that would be useful too...
Re: [zfs-discuss] Mapping inode numbers to file names
> [...] There is a way to do this kind of object-to-name mapping, though there's no documented public interface for it. See the zfs_obj_to_path() function and the ZFS_IOC_OBJ_TO_PATH ioctl. I think it should also be possible to extend it to handle multiple names (in case of multiple hardlinks) in some way, as the id of the parent directory is recorded at the time of link creation in znode attributes.

To add a bit: these sorts of things are _not_ required by any existing standard, and may be limited to use by root (since they bypass directory permissions). So they're typically private, undocumented, and subject to change without notice. Some other examples:

UFS _FIOIO ioctl: obtain a read-only file descriptor given an existing file descriptor on the file system (to make the ioctl on) and the inode number and generation number (which keeps inode numbers from being reused too quickly, mostly to make NFS happy I think) in an argument to the ioctl.

Mac OS X /.vol directory: allows pre-OS X style access by volume-ID/folder-ID/name triplet.

Those are all hidden behind a particular library or application that is the only supported way of using them. It is perhaps unfortunate that there is no generic root-only way to look up fsid/inode (problematic though, due to hard links) or fsid/dir_inode/name (could fail if the name has been moved to another directory on the same filesystem), but implementing a generic solution would likely be a lot of work (requiring support from every filesystem, most of which were _not_ designed to do a reverse lookup, i.e. from inode back to name), and the use cases seem to be very few indeed. (As an example of that, /.vol on a Mac is said to only work for HFS or HFS+ volumes, not old UFS volumes (Macs used to support their own flavor of UFS, apparently; no doubt one considerably different from on Solaris, so don't go there). In fact, I'm not sure that /.vol works at all on the latest Mac OS X.)
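Along the same unsupported lines, zdb can usually do the object-number-to-path mapping after the fact, as long as you're root and don't mind an interface that can change at any time; the dataset name and object number below are placeholders:

  # find the object (inode) number of some file
  ls -i /tank/fs/some/file
  # dump that object's metadata, which includes a "path" line
  zdb -dddd tank/fs 12345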
[zfs-discuss] customizing zfs list with less typing
It might be nice if zfs list would check an environment variable for a default list of properties to show (the same as the comma-separated list used with the -o option). If not set, it would use the current default list; if set, it would use the value of that environment variable as the list. I find there are a lot of times I want to see the same one additional property when using zfs list; an environment variable would mean a one-time edit of .profile rather than typing the -o option with the default list modified by whatever I want to add.

Along those lines, pseudo-properties that were abbreviated (constant-length) versions of some real properties might help. For instance, sharenfs can be on, off, or a rather long list of nfs sharing options. A pseudo-property with a related name and a value of on, off, or spec (with spec implying some arbitrary list of applicable options) would have a constant length. Given two potentially long properties (mountpoint and the dataset name), output lines are already close to cumbersome (that assumes one at the beginning of the line and one at the end). Additional potentially long properties in the output would tend to make it unreadable.

Both of those, esp. together, would make quickly checking or familiarizing oneself with a server that much more civilized, IMO.
Re: [zfs-discuss] customizing zfs list with less typing
> Just make 'zfs' an alias to your version of it. A one-time edit of .profile can update that alias.

Sure; write a shell function, and add an alias to it. And use a quoted command name (or full path) within the function to get to the real command. Been there, done that. But to do a good job of it means parsing the command line the same way the real command would, so that it only adds

  -o ${ZFS_LIST_PROPS:-name,used,available,referenced,mountpoint}

or perhaps better

  ${ZFS_LIST_PROPS:+-o ${ZFS_LIST_PROPS}}

to zfs list (rather than to other subcommands), and only if the -o option wasn't given explicitly. That's not only a bit of a pain, but anytime one is duplicating parsing, it's begging for trouble: in case they don't really handle it the same, or in case the underlying command is changed. And unless that sort of thing is handled with extreme care (quoting all shell variable references, just for starters), it can turn out to be a security problem.

And that's just the implicit-options part of what I want; the other part would take optionally filtering to modify the command output as well. That's starting to get nuts, IMO. Heck, I can grab a copy of the source for the zfs command, modify it, and recompile it (without building all of OpenSolaris) faster than I can write a really good shell wrapper that does the same thing. But then I have to maintain my own divergent implementation, unless I can interest someone else in the idea... OTOH by the time the hoop-jumping for getting something accepted is over, it's definitely been more bother than gain...
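A stripped-down version of the wrapper, for what it's worth (ksh93/bash syntax; it deliberately ignores the hard part, i.e. noticing whether -o was already given, which is exactly the parsing I'd rather not duplicate):

  zfs() {
      if [ "$1" = "list" ]; then
          shift
          # add the property list only for the list subcommand
          command zfs list ${ZFS_LIST_PROPS:+-o "$ZFS_LIST_PROPS"} "$@"
      else
          command zfs "$@"
      fi
  }

  # one-time edit in .profile:
  ZFS_LIST_PROPS=name,used,available,referenced,mountpoint,compressratio
  export ZFS_LIST_PROPS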
Re: [zfs-discuss] high read iops - more memory for arc?
FYI, the arc and arc-discuss lists or forums are not appropriate for this. There are two "arc" acronyms:

* the Architecture Review Committee (the arc list is for cases being considered, arc-discuss is for other discussion; non-committee business is most unwelcome on the arc list), and
* the ZFS Adaptive Replacement Cache, which is what you are talking about.

The zfs-discuss list is appropriate for that subject; storage-discuss and database-discuss _may_ relate, but rather than sending to every list that _might_ relate, I'd suggest starting with the most appropriate first, and reading enough of the posts already on a list to get some idea of what's appropriate there and what isn't, before just adding it as an additional CC in the hope that someone might answer.

Very few people are likely to be responding here at this time, insofar as the largest part of the people that might are probably observing (at least socially) the Christmas holiday right now (their families might not appreciate them being distracted by anything else!), and many of the rest aren't interacting much because of how many are not around right now. Don't expect too much until the first Monday after 1 January. And anyway, discussion lists are not a place where anyone is _obligated_ to answer. Those with support contracts presumably have other ways of getting help.

Now... I probably couldn't answer your question even if I had all the information you left out, but maybe someone could, eventually. Some of the information they might need:

* what are you running (uname -a will do)? ZFS is constantly being improved; problems get fixed (and sometimes introduced) in just about every build
* what system, how is it configured, exactly what disk models, etc?

Free memory is _supposed_ to be low. Free memory is wasted memory, except that a little is kept free to quickly respond to requests for more. Most memory not otherwise being used for mappings, kernel data structures, etc., is used as either additional VM page cache of pages that might be used again, or by the ZFS ARC. The tools to report on just how memory is used behave differently on Solaris (and even on different versions) than they do on other OSs, because Solaris tries really hard to make best use of all RAM. The uname -a information would also help someone (more knowledgeable than I, although I might be able to look it up) suggest which tools would best help to understand your situation.

So while free memory alone doesn't tell you much, there's a good chance that more would help unless there's some specific problem that's involved. There's also a good chance that your problem is known, recognizable, and probably has a fix in a newer version or a workaround, if you provide enough information to help someone find that for you.
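As a starting point, the output of a few of the usual commands would give whoever answers something concrete to work with (::memstat needs root, and can take a little while on a large-memory box):

  uname -a
  prtconf | grep Memory                            # physical RAM
  echo ::memstat | mdb -k                          # where that RAM is going
  kstat -p zfs:0:arcstats:size zfs:0:arcstats:c    # current ARC size and target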
[zfs-discuss] Why is st_size of a zfs directory equal to the number of entries?
Cute idea, maybe. But very inconsistent with the size in blocks (reported by ls -dls dir). Is there a particular reason for this, or is it one of those "just for the heck of it" things?

Granted that it isn't necessarily _wrong_. I just checked SUSv3 for stat() and sys/stat.h, and it appears that st_size is only well-defined for regular files and symlinks. So I suppose it could be (a) undefined, or (b) whatever is deemed to be useful, for directories, device files, etc.

This is of course inconsistent with the behavior on other filesystems. On UFS (a bit of a special case perhaps, in that it still allows read(2) on a directory, for compatibility), the st_size seems to reflect the actual number of bytes used by the implementation to hold the directory's current contents. That may well also be the case for tmpfs, but from user-land one can't tell, since it (reasonably enough) disallows read(2) on directories. Haven't checked any other filesystems. Don't have anything else (pcfs, hsfs, udfs, ...) mounted at the moment to check.

(Other stuff: ISTR that devices on Solaris will give a size if applicable, but for non-LF-aware 32-bit, that may be capped at MAXOFF32_T rather than returning an error; I think maybe for pipes, one sees the number of bytes available to be read. None of which is portable or should necessarily be depended on...)

Cool ideas are fine, but IMO, if one does wish to make something nominally undefined have some particular behavior, I wonder why one wouldn't at least try for consistency...
Re: [zfs-discuss] Why is st_size of a zfs directory equal to the
> Richard L. Hamilton rlha...@smart.net wrote:
>> Cute idea, maybe. But very inconsistent with the size in blocks (reported by ls -dls dir). Is there a particular reason for this, or is it one of those "just for the heck of it" things? Granted that it isn't necessarily _wrong_. I just checked SUSv3 for stat() and sys/stat.h, and it appears that st_size is only well-defined for regular files and symlinks. So I suppose it could be (a) undefined, or (b) whatever is deemed to be useful, for directories, device files, etc.
>
> You could also return 0 for st_size for all directories and would still be POSIX compliant.
>
> Jörg

Yes, some do IIRC (certainly for empty directories, maybe always; I forget what OS I saw that on). Heck, undefined means it wouldn't be _wrong_ to return a random number. Even a _negative_ number wouldn't necessarily be wrong (although it would be a new low in rudeness, perhaps).

I did find the earlier discussion on the subject (someone e-mailed me that there had been such). It seemed to conclude that some apps are statically linked with old scandir() code that (incorrectly) assumed that the number of directory entries could be estimated as st_size/24; and worse, that some such apps might be seeing the small st_size that zfs offers via NFS, so they might not even be something that could be fixed on Solaris at all. But I didn't see anything in the discussion that suggested that this was going to be changed. Nor did I see a compelling argument for leaving it the way it is, either. In the face of "undefined", all arguments end up as pragmatism rather than principle, IMO.

Maybe it's not a bad thing to go and break incorrect code. But if that code has worked for a long time (maybe long enough for the source to have been lost), I don't know that it's helpful to just remind everyone that st_size is only defined for certain types of objects, and directories aren't one of them. (Now if one wanted to write something to break code depending on 32-bit time_t _now_ rather than waiting for 2038, that might be a good deed in terms of breaking things. But I'll be 80 then (if I'm still alive), and I probably won't care.)
Re: [zfs-discuss] Mac Mini (OS X 10.5.4) with globalSAN
On Wed, 13 Aug 2008, Richard L. Hamilton wrote:
>> Reasonable enough guess, but no, no compression, nothing like that; nor am I running anything particularly demanding most of the time. I did have the volblocksize set down to 512 for that volume, since I thought that for the purpose, that reflected hardware-like behavior. But maybe there's some reason that's not a good idea.
>
> Yes, that would normally be a very bad idea. The default is 128K. The main reason to want to reduce it is if you have an application doing random-access I/O with small block sizes (e.g. 8K is common for applications optimized for UFS). In that case the smaller block sizes decrease overhead since zfs reads and updates whole blocks. If the block size is 512 then that means you are normally performing more low-level I/Os, doing more disk seeks, and wasting disk space. The hardware itself does not really deal with 512 bytes any more, since buffering on the disk drive is sufficient to buffer entire disk tracks, and when data is read, it is common for the disk drive to read the entire track into its local buffer. A hardware RAID controller often extends that 512 bytes to a somewhat larger value for its own purposes.
>
> Bob

Ok, but that leaves the question of what a better value would be. I gather that HFS+ operates in terms of 512-byte sectors but larger allocation units; however, unless those allocation units are a power of two between 512 and 128k inclusive _and_ are accordingly aligned within the device (or actually, with the choice of a proper volblocksize, can be made to correspond to blocks in the underlying zvol), it seems to me that a larger volblocksize would not help; it might well mean that a one-a.u. write by HFS+ equated to two blocks read and written by zfs, because the alignment didn't match, whereas at least with the smallest volblocksize, there should never be a need to read/merge/write.

I'm having trouble figuring out how to get the info to make a better choice on the HFS+ side; maybe I'll just fire up Wireshark and see if it knows how to interpret iSCSI, and/or run truss on iscsitgtd to see what it actually is reading from/writing to the zvol; if there is a consistent least common aligned blocksize, I would expect the latter especially to reveal it, and probably the former to confirm it.

I did string Ethernet; I think that sped things up a bit, but it didn't change the annoying pauses. In the end, I found a 500GB USB drive on sale for $89.95 (US), and put that on the Mac, with one partition for backups, and one each for possible future [Open]Solaris x86, Linux, and Windows OSs, assuming they can be booted from a USB drive on a Mac Mini.

Still, I want to know if the pausing with iscsitgtd is in part something I can tune down to being non-obnoxious, or is (as I suspect) in some sense a real bug. cc-ing zfs-discuss, since I suspect the problem might be there at least as much as with iscsitgtd (not that the latter is a prize-winner, having core-dumped with an assert() somewhere a number of times).
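If anyone wants to experiment along these lines: volblocksize can only be set when the zvol is created, so trying a different value means making a new volume and re-initializing it from the Mac side (the name, size, and 4k value below are just placeholders):

  # create a 50 GB zvol with a 4K volume block size
  zfs create -V 50g -o volblocksize=4k tank/maczvol
  zfs get volblocksize,volsize tank/maczvol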
Re: [zfs-discuss] memory hog
Hmm... my SB2K, 2GB RAM, 2x 1050MHz UltraSPARC III Cu CPUs, seems to freeze momentarily for a couple of seconds every now and then in a zfs root setup on snv_90, which it never did with mostly ufs on snv_81; that despite having much faster disks now (an LSI SAS 3800X and a pair of Seagate 1TB SAS drives (mirrored), vs. the 2x internal 73GB FC drives; the SAS drives, at a mere 7200 RPM, can sustain a sequential transfer rate about 2.5x that of the 10K RPM FC drives!). Then again, between the hardware differences and any other software differences, as well as the configuration change, I'm not absolutely ready to blame any particular one of those for those annoying pauses... but my suspicions are on zfs...
Re: [zfs-discuss] Boot from mirrored vdev
Are you using set md:mirrored_root_flag=1 in /etc/system? See the entry for md:mirrored_root_flag on http://docs.sun.com/app/docs/doc/819-2724/chapter2-156?a=view keeping in mind all the cautions...
Re: [zfs-discuss] Can't rm file when No space left on device...
I wonder if one couldn't reduce (but probably not eliminate) the likelihood of this sort of situation by setting refreservation significantly lower than reservation?

Along those lines, I don't see any property that would restrict the number of concurrent snapshots of a dataset :-( I think that would be real handy, along with one that would say whether to refuse another when the limit was reached, or to automatically delete the oldest snapshot. Yes, one can script the rotation of snapshots, but it might be nice to just make it policy for a given dataset instead, particularly together with delegated snapshot permission (provided that that didn't also delegate the ability to change the maximum number of allowed snapshots).
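Until something like that exists as a property, the rotation does have to live in a script; a bare-bones sketch of the sort of thing I mean (the dataset name and keep-count are arbitrary, ksh/bash assumed):

  #!/bin/ksh
  fs=tank/data    # dataset to snapshot (example)
  keep=8          # number of snapshots to retain

  zfs snapshot $fs@auto-$(date +%Y%m%d%H%M%S)

  # count this dataset's auto- snapshots and destroy the oldest beyond the limit
  n=$(zfs list -H -t snapshot -o name | grep -c "^$fs@auto-")
  if [ "$n" -gt "$keep" ]; then
      zfs list -H -t snapshot -o name -s creation | grep "^$fs@auto-" | \
          head -$((n - keep)) | xargs -n1 zfs destroy
  fi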
Re: [zfs-discuss] Filesystem for each home dir - 10,000 users?
On Sat, 7 Jun 2008, Mattias Pantzare wrote:
>> If I need to count usage I can use du. But if you can implement space usage info on a per-uid basis you are not far from quota per uid...
>
> That sounds like quite a challenge. UIDs are just numbers and new ones can appear at any time. Files with existing UIDs can have their UIDs switched from one to another at any time. The space used per UID needs to be tallied continuously and needs to track every change, including real-time file growth and truncation. We are ultimately talking about 128-bit counters here. Instead of having one counter per filesystem we now have potentially hundreds of thousands, which represents substantial memory.

But if you already have the ZAP code, you ought to be able to do quick lookups of arbitrary byte sequences, right? Just assume that a value not stored is zero (or infinity, or uninitialized, as applicable), and you have the same functionality as the sparse quota file on ufs, without the problems.

Besides, uid/gid/sid quotas would usually make more sense at the zpool level than at the individual filesystem level, so perhaps it's not _that_ bad. Which is to say, you want user X to have an n GB quota over the whole zpool, and you probably don't so much care whether the filesystem within the zpool corresponds to his home directory or to some shared directory.

> Multicore systems have the additional challenge that this complex information needs to be effectively shared between cores. Imagine if you have 512 CPU cores, all of which are running some of the ZFS code and have their own caches which become invalidated whenever one of those counters is updated. This sounds like a no-go for an almost infinite-sized pooled last-word filesystem like ZFS. ZFS is already quite lazy at evaluating space consumption. With ZFS, 'du' does not always reflect true usage since updates are delayed.

Whatever mechanism can check at block allocation/deallocation time to keep track of per-filesystem space (vs. a filesystem quota, if there is one) could surely also do something similar against per-uid/gid/sid quotas. I suspect a lot of existing functions and data structures could be reused or adapted for most of it. Just one more piece of metadata to update, right? Not as if ufs quotas had zero runtime penalty if enabled. And you only need counters and quotas in-core for identifiers applicable to in-core znodes, not for every identifier used on the zpool.

Maybe I'm off base on the details. But in any event, I expect that it's entirely possible to make it happen, scalably. Just a question of whether it's worth the cost of designing, coding, testing, documenting. I suspect there may be enough scenarios for sites with really high numbers of accounts (particularly universities, which are not only customers in their own right, but a chance for future mindshare) that it might be worthwhile, but I don't know that to be the case.

IMO, even if no one sort of site using existing deployment architectures would justify it, given the future blurring of server, SAN, and NAS (think recent SSD announcement + COMSTAR + iSCSI initiator + separate device for zfs zil cache + in-kernel CIFS + enterprise authentication with Windows interoperability + Thumper + ...), the ability to manage all that storage in all sorts of as-yet unforeseen deployment configurations _by user or other identity_ may well be important across a broad base of customers. Maybe identity-based, as well as filesystem-based, quotas should be part of that.
Re: [zfs-discuss] SSD reliability, wear levelling, warranty period
>> btw: it seems to me that this thread is a little bit OT.
>
> I don't think it's OT - because SSDs make perfect sense as ZFS log and/or cache devices. If I did not make that clear in my OP then I failed to communicate clearly. In both these roles (log/cache) reliability is of the utmost importance.

Older SSDs (before cheap and relatively high-cycle-limit flash) were RAM cache + battery + hard disk. Surely RAM + battery + flash is also possible; the battery only needs to keep the RAM alive long enough to stage to the flash. That keeps the write count on the flash down, and the speed up (RAM being faster than flash). Such a device would of course cost more, and be less dense (given having to have battery + charging circuits and RAM as well as flash), than a pure flash device. But with more limited write rates needed, and no moving parts, _provided_ it has full ECC and maybe radiation-hardened flash (if that exists), I can't imagine why such a device couldn't be exceedingly reliable and have quite a long lifetime (with the battery, hopefully replaceable, being more of a limitation than the flash). It could be a matter of paying for how much quality you want...

As for reliability, from zpool(1m):

  log
    A separate intent log device. If more than one log device is specified, then writes are load-balanced between devices. Log devices can be mirrored. However, raidz and raidz2 are not supported for the intent log. For more information, see the "Intent Log" section.

  cache
    A device used to cache storage pool data. A cache device cannot be mirrored or part of a raidz or raidz2 configuration. For more information, see the "Cache Devices" section.

  [...]

  Cache Devices
    Devices can be added to a storage pool as "cache devices." These devices provide an additional layer of caching between main memory and disk. For read-heavy workloads, where the working set size is much larger than what can be cached in main memory, using cache devices allows much more of this working set to be served from low-latency media. Using cache devices provides the greatest performance improvement for random read-workloads of mostly static content.

    To create a pool with cache devices, specify a "cache" vdev with any number of devices. For example:

      # zpool create pool c0d0 c1d0 cache c2d0 c3d0

    Cache devices cannot be mirrored or part of a raidz configuration. If a read error is encountered on a cache device, that read I/O is reissued to the original storage pool device, which might be part of a mirrored or raidz configuration. The content of the cache devices is considered volatile, as is the case with other system caches.

That tells me that the zil can be mirrored and zfs can recover from cache errors. I think that means that these devices don't need to be any more reliable than regular disks, just much faster. So... expensive ultra-reliability SSD, or much less expensive SSD plus mirrored zil? Given what zfs can do with cheap SATA, my bet is on the latter...
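And concretely, adding a mirrored slog and an (unmirrorable, but harmless-to-lose) cache device to an existing pool looks like this; the device names are only placeholders:

  # mirrored separate intent log
  zpool add tank log mirror c4t0d0 c4t1d0
  # L2ARC cache device; on a read error zfs falls back to the main pool
  zpool add tank cache c4t2d0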
Re: [zfs-discuss] Growing root pool ?
On Tue, Jun 10, 2008 at 11:33:36AM -0700, Wyllys Ingersoll wrote:
>> I'm running build 91 with ZFS boot. It seems that ZFS will not allow me to add an additional partition to the current root/boot pool because it is a bootable dataset. Is this a known issue that will be fixed or a permanent limitation?
>
> The current limitation is that a bootable pool is limited to one disk, or one disk and a mirror. When your data is striped across multiple disks, that makes booting harder. From a post to zfs-discuss about two months ago:
>
>   ... we do have plans to support booting from RAID-Z. The design is still being worked out, but it's likely that it will involve a new kind of dataset which is replicated on each disk of the RAID-Z pool, and which contains the boot archive and other crucial files that the booter needs to read. I don't have a projected date for when it will be available. It's a lower priority project than getting the install support for zfs boot done.
>
> - Darren

If I read you right, with little or nothing extra, that would enable growing rpool as well, since what it would really do is ensure /boot (and whatever if anything else) was mirrored even though the rest of the zpool was raidz or raidz2; which would also ensure that those critical items were _not_ spread across the stripe that would result from adding devices to an existing zpool.

Of course installation and upgrade would have to be able to recognize and deal with such exotica too. Which seems to pose a problem, since having one dataset in the zpool mirrored while the rest is raidz and/or extended by a stripe implies to me that some space is more or less reserved for that purpose, or that such a dataset couldn't be snapshotted, or both; so I suppose there might be a smaller-than-total-capacity limit on the number of BEs possible. http://en.wikipedia.org/wiki/TANSTAAFL ...
Re: [zfs-discuss] Growing root pool ?
> I'm not even trying to stripe it across multiple disks, I just want to add another partition (from the same physical disk) to the root pool. Perhaps that is a distinction without a difference, but my goal is to grow my root pool, not stripe it across disks or enable raid features (for now). Currently, my root pool is using c1t0d0s4 and I want to add c1t0d0s0 to the pool, but can't.
>
> -Wyllys

Right, that's how it is right now (which the other guy seemed to be suggesting might change eventually, but nobody knows when, because it's just not that important compared to other things). AFAIK, if you could shrink the partition whose data is after c1t0d0s4 on the disk, you could grow c1t0d0s4 by that much, and I _think_ zfs would pick up the growth of the device automatically. (ufs partitions can be grown like that, or by being on an SVM or VxVM volume that's grown, but then one has to run a command specific to ufs to grow the filesystem to use the additional space.) I think zpools are supposed to grow automatically if SAN LUNs are grown, and this should be a similar situation, anyway. But if you can do that, and want to try it, just be careful. And of course you couldn't shrink it again, either.
Re: [zfs-discuss] SATA controller suggestion
I don't presently have any working x86 hardware, nor do I routinely work with x86 hardware configurations. But it's not hard to find previous discussion on the subject: http://www.opensolaris.org/jive/thread.jspa?messageID=96790 for example...

Also, remember that SAS controllers can usually also talk to SATA drives; they're usually more expensive, of course, but sometimes you can find a deal. I have an LSI SAS 3800X, and I paid a heck of a lot less than list for it (eBay), I'm guessing because someone bought the bulk package and sold off whatever they didn't need (new board, sealed, but no docs). That was a while ago, and being around US $100, it might still not have been what you'd call cheap. If you want to stay around $50, you might have better luck looking at the earlier discussion. But I suspect to some extent you get what you pay for; the throughput on the higher-end boards may well be a good bit higher, although for one disk (or even two, to mirror the system disk), it might not matter so much.
Re: [zfs-discuss] Can't rm file when No space left on device...
On Thu, Jun 05, 2008 at 09:13:24PM -0600, Keith Bierman wrote:
> On Jun 5, 2008, at 8:58 PM, Brad Diggs wrote:
>> Hi Keith, Sure you can truncate some files but that effectively corrupts the files in our case and would cause more harm than good. The only files in our volume are data files.
>
> So an rm is ok, but a truncation is not? Seems odd to me, but if that's your constraint so be it.

Neither will help, since before the space can be freed a transaction must be written, which in turn requires free space. (So you say let ZFS save some just-in-case space for this, but, how much is enough?) If you make it a parameter, that's the admin's problem.

Although since each rm of a file also present in a snapshot just increases the divergence, only an rm of a file _not_ present in a snapshot would actually recover space, right? So in some circumstances, even if it's the admin's problem, there might be no amount that's enough to do what one wants to do without removing a snapshot.

Specifically, take a snapshot of a filesystem that's very nearly full, and then use dd or whatever to create a single new file that fills up the filesystem. At that point, only removing that single new file will help, and even that's not possible without a just-in-case reserve of enough to handle the worst-case metadata (including system attributes, if any) update + transaction log + any other fudge I forgot, for at least one file's worth. Maybe that's a simplistic view of the scenario, I dunno...
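A quick way to see which snapshots are actually pinning space, before counting on any given rm to help (the dataset name is just an example):

  zfs list -r -t snapshot -o name,used,referenced tank/data

The "used" column is the space that would come back if only that snapshot were destroyed, which is also roughly the space an rm can't give back while the snapshot still references the file.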
Re: [zfs-discuss] zfs incremental-forever
If I read the man page right, you might only have to keep a minimum of two on each side (maybe even just one on the receiving side), although I might be tempted to keep an extra just in case; say near current, 24 hours old, and a week old (space permitting for the larger interval of the last one). Adjust frequency, spacing, and number according to available space, keeping in mind that the more COW-ing between snapshots (the longer interval if activity is more or less constant), the more space required. (assuming my head is more or less on straight right now...) Of course if you get messed up, you can always resync with a non-incremental transfer, so if you could live with that occasionally, there may be no need for more than two. Your script would certainly have to be careful to check for successful send _and_ receive before removing old snapshots on either side. ssh remotehost exit 1 seems to have a return code of 1 (cool). rsh does _not_ have that desirable property. But that still leaves the problem of how to check the exit status of the commands on both ends of a pipeline; maybe someone has solved that? Anyway, correctly verifying successful completion of the commands on both ends might be a bit tricky, but is critical if you don't want failures or the need for frequent non-incremental transfers. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
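To make the shell-dependent part concrete: bash (and recent ksh) expose the per-stage exit codes of a pipeline, which is enough to decide whether it's safe to prune old snapshots. A rough sketch, assuming dataset tank/data on the sender and backup/data on the receiver (all names are examples):

    #!/bin/bash
    zfs snapshot tank/data@new
    zfs send -i tank/data@prev tank/data@new | \
        ssh backuphost /usr/sbin/zfs receive backup/data
    status=("${PIPESTATUS[@]}")     # capture both stages before running anything else
    if [ "${status[0]}" -eq 0 ] && [ "${status[1]}" -eq 0 ]; then
        zfs destroy tank/data@prev
        ssh backuphost /usr/sbin/zfs destroy backup/data@prev
        # (next run sends from @new, so rename or track snapshot names accordingly)
    else
        echo "send/receive failed; keeping tank/data@prev" >&2
    fi

That still isn't bulletproof, but it covers the "both ends of the pipe" problem: ssh passes back the remote command's exit status, and PIPESTATUS holds the status of each pipeline stage.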
Re: [zfs-discuss] Per-user home filesystems and OS-X Leopard anomaly
I encountered an issue that people using OS-X systems as NFS clients need to be aware of. While not strictly a ZFS issue, it may be encountered most often by ZFS users since ZFS makes it easy to support and export per-user filesystems. The problem I encountered was when using ZFS to create exported per-user filesystems and the OS-X automounter to perform the necessary mount magic. OS-X creates hidden .DS_Store files in every directory which is accessed (http://en.wikipedia.org/wiki/.DS_Store). OS-X decided that it wanted to create the path /home/.DS_Store and it would not take `no' for an answer. First it would try to create /home/.DS_Store and then it would try an alternate name. Since the automounter was used, there would be an automount request for /home/.DS_Store, which does not exist on the server so the mount request would fail. Since OS-X does not take 'no' for an answer, there would subsequently be thousands of back-to-back mount requests. The end result was that 'mountd' was one of the top three resource consumers on my system, there would be bursts of high network traffic (1500 packets/second), and the affected OS-X system would operate more strangely than normal. The simple solution was to create a /home/.DS_Store directory on the server so that the mount request would succeed. Too bad it appears to be non-obvious how to do loopback mounts (a mount of one local directory onto another, without having to be an NFS server) on Darwin/MacOS X; then you could mount the /home/.DS_Store locally from a directory elsewhere (e.g. /export/home/.DS_Store) on each machine, rather than bothering the server with it. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
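For anyone hitting the same thing, the server-side band-aid really is that small; assuming home directories are exported from /export/home and automounted as /home (paths are examples):

    # on the NFS server: give the automounter something that can actually be
    # looked up and mounted, so the client stops retrying
    mkdir /export/home/.DS_Store

Ugly, but it turns thousands of failing mount attempts into one successful, idle mount.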
Re: [zfs-discuss] Filesystem for each home dir - 10,000 users?
[...] That's not to say that there might not be other problems with scaling to thousands of filesystems. But you're certainly not the first one to test it. For cases where a single filesystem must contain files owned by multiple users (/var/mail being one example), old fashioned UFS quotas still solve the problem where the alternative approach with ZFS doesn't. A single /var/mail doesn't work well for 10,000 users either. When you start getting into that scale of service provisioning, you might look at how the big boys do it... Apple, Verizon, Google, Amazon, etc. You should also look at e-mail systems designed to scale to large numbers of users which implement limits without resorting to file system quotas. Such e-mail systems actually tell users that their mailbox is too full rather than just failing to deliver mail. So please, when we start having this conversation again, let's leave /var/mail out. I'm not recommending such a configuration; I quite agree that it is neither scalable nor robust. Its only merit is that it's an obvious example of where one would have potentially large files owned by many users necessarily on one filesystem, inasmuch as they were in one common directory. But there must be other examples where the ufs quota model is a better fit than the zfs quota model with potentially one filesystem per user. In terms of the limitations they can provide, zfs filesystem quotas remind me of DG/UX control point directories (presumably a relic of AOS/VS) - like regular directories except they could have a quota bound to them restricting the sum of the space of the subtree rooted there (the native filesystem on DG/UX didn't have UID-based quotas). Given restricted chown (non-root can't give files away), per-UID*filesystem quotas IMO make just as much sense as per-filesystem quotas themselves do on zfs, save only that per-UID*filesystem quotas make the filesystem less lightweight. For zfs, perhaps an answer might be if it were possible to have per-zpool uid/gid/projid/zoneid/sid quotas too? This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
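Purely as an illustration of what a per-UID*filesystem model might look like from the admin's chair (hypothetical syntax, not something you can type into current bits):

    # hypothetical: a user quota scoped to one filesystem, alongside the
    # existing per-filesystem quota
    zfs set userquota@alice=5g tank/mail
    zfs get userquota@alice tank/mail

That's more or less the ufs edquota model grafted onto a dataset, which is what /var/mail-style shared directories seem to want.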
Re: [zfs-discuss] Filesystem for each home dir - 10,000 users?
Hi All, I'm new to ZFS but I'm intrigued by the possibilities it presents. I'm told one of the greatest benefits is that, instead of setting quotas, each user can have their own 'filesystem' under a single pool. This is obviously great if you've got 10 users but what if you have 10,000? Are the overheads too great and do they outweigh the potential benefits? I've got a test system running with 5,000 dummy users which seems to perform fine, even if my 'df' output is a little sluggish :-) . Any advice or experiences would be greatly appreciated. I think sharemgr was created to speed up the case of sharing out very high numbers of filesystems on NFS servers, which otherwise took quite a long time. That's not to say that there might not be other problems with scaling to thousands of filesystems. But you're certainly not the first one to test it. For cases where a single filesystem must contain files owned by multiple users (/var/mail being one example), old fashioned UFS quotas still solve the problem where the alternative approach with ZFS doesn't. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
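If it helps anyone else who wants to kick the tires, the brute-force way to generate that kind of test setup is just a loop over zfs create; a sketch, with pool name and quota value as examples only:

    zfs create -o mountpoint=/export/home tank/home
    zfs set sharenfs=on tank/home             # children inherit the share setting
    for u in $(getent passwd | cut -d: -f1); do
        # (filter out system accounts in real life)
        zfs create -o quota=2g tank/home/$u
    done

The sluggish df mentioned above is mostly just the cost of statting a few thousand mounts; zfs list is usually a kinder way to look at them.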
Re: [zfs-discuss] system backup and recovery
Hi list, for windows we use ghost to backup system and recovery. can we do similar thing for solaris by ZFS? I want to create a image and install to another machine, So that the personal configuration will not be lost. Since I don't do Windows, I'm not familiar with ghost, but I gather from Wikipedia that it's more a disk cloning tool (bare metal backup/restore) than a conventional backup program, although some people may well use it for backups too. Zfs has send and receive commands, which more or less correspond to ufsdump and ufsrestore for ufs, except that the names send and receive are perhaps more appropriate, since the zfs(1m) man page says: The format of the stream is evolving. No backwards compatibility is guaranteed. You may not be able to receive your streams on future versions of ZFS. which means to me that it's not a really good choice for archiving or long-term backups, but it should be ok for transferring zfs filesystems between systems that are the same OS version (or at any rate, close enough that the format of the zfs send/receive datastream is compatible). There are of course also generic archiving utilities that can be used for backup/restore, like tar (or star), pax, cpio, and so on. But as far as I know, there's no bare metal backup/restore facility that comes with Solaris, although there are some commercial (and probably quite expensive) products that do that. But there's probably nothing at all that's quite equivalent to Norton Ghost. One can of course use dd to copy entire raw disk partitions, but that won't set up the partitions, nor will it work as expected unless all disk sizes are identical (for filesystems that don't have the OS on them), or if the OS is on there, all hardware is identical. Depending on just what personal configuration you mean, you may not necessarily need to back up the whole system anyway. Which is another way of saying that I'm not sure your post was specific enough about what you're doing to make it possible to suggest the best available (and preferably free) solution. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
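To make the send/receive suggestion concrete, a rough sketch of pushing a filesystem tree to another machine (pool, dataset, and host names are examples; this moves data, not bootability):

    # on the source machine
    zfs snapshot -r tank/export@migrate
    zfs send -R tank/export@migrate | ssh newhost /usr/sbin/zfs receive -d newpool

    # generic archive tools work too, e.g.
    cd /export && tar cf - . | ssh newhost 'cd /export && tar xf -'

Note the caveat from the man page quoted above: the stream format can change between releases, so this is a transfer mechanism, not an archive format.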
Re: [zfs-discuss] system backup and recovery
On Thu, 2008-06-05 at 15:44 +0800, Aubrey Li wrote: for windows we use ghost to backup system and recovery. can we do similar thing for solaris by ZFS? How about flar ? http://docs.sun.com/app/docs/doc/817-5668/flash-24?a=view [ I'm actually not sure if it's supported for zfs root though ] cheers, tim Oops, forgot about that one... This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] More USB Storage Issues
Nathan Kroenert wrote: For what it's worth, I started playing with USB + flash + ZFS and was most unhappy for quite a while. I was suffering with things hanging, going slow or just going away and breaking, and thought I was witnessing something zfs was doing as I was trying to do mirror recovery and all that sort of stuff. On a hunch, I tried doing UFS and RAW instead and saw the same issues. It's starting to look like my USB hubs. Once they are under any reasonable read/write load, they just make bunches of things go offline. Yep - They are powered and plugged in. So, at this stage, I'll be grabbing a couple of 'better' USB hubs (Mine are pretty much the cheapest I could buy) and see how that goes. For gags, take ZFS out of the equation and validate that your hardware is actually providing a stable platform for ZFS... Mine wasn't... That's my experience too. USB HUBs are cheap [ expletive deleted ] mostly... What do you expect? They're mostly consumer-grade, which is to say garbage, rather than datacenter-grade. And it's not just USB hubs - I've got a consumer-grade external modem, and I swear it must have little or no ECC and/or watchdog, because I have to power-cycle it every so often. Wish I had a lead box to put it in to shield it from the cosmic rays, maybe that would help... This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] new install - when is zfs root offered? (snv_90)
A Darren Dunham [EMAIL PROTECTED] writes: On Tue, Jun 03, 2008 at 05:56:44PM -0700, Richard L. Hamilton wrote: How about SPARC - can it do zfs install+root yet, or if not, when? Just got a couple of nice 1TB SAS drives, and I think I'd prefer to have a mirrored pool where zfs owns the entire drives, if possible. (I'd also eventually like to have multiple bootable zfs filesystems in that pool, corresponding to multiple versions.) Are they just under 1TB? I don't believe there's any boot support in Solaris for EFI labels, which would be required for 1TB+. ISTR that I saw an ARC case go past about a week ago about extending SMI labels to allow 1TB disks, for exactly this reason. Thanks. Just searched, that's http://www.opensolaris.org/jive/thread.jspa?messageID=237603 (approved) Since format didn't choke, and since a close reading suggests the actual older limit is 1TiB (or maybe 1TiB - 1 sector), I should be fine on that score. The LSI SAS 3800x is supposed to have fcode boot support. And snv_90 is supposed to have zfs boot install working on both SPARC and x86. So I guess I'll just have to try it. That only leaves me wondering whether I should attempt a live upgrade from SXCE snv_81, or just do the text install off a DVD onto one of the new disks (hoping the installer takes care of setting up the disk however it needs to be to be bootable), and then adding identical partitioning to the other disk, attaching a suitable partition on the 2nd disk to the zpool, and using LVM (Disk Suite) to mirror any non-zfs partitions the installation created. Never having used live upgrade myself (although having read about it), I suppose it would be an educational experience either way. Time was once, I'd have looked forward to that...must be getting tired... This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
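In case it's useful to anyone trying the same thing, the manual "second disk" half of that plan usually boils down to something like this (device names, slice layout, and metadevice numbers are examples only):

    # copy the label/partitioning from the installed disk to the second disk
    prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2

    # mirror the root pool and make the second disk bootable (SPARC)
    zpool attach rpool c1t0d0s0 c1t1d0s0
    installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c1t1d0s0

    # mirror any remaining ufs slices with SVM
    metadb -a -f -c 3 c1t0d0s7 c1t1d0s7
    metainit d11 1 1 c1t0d0s5
    metainit d12 1 1 c1t1d0s5
    metainit d10 -m d11
    metattach d10 d12

(Whether the snv_90 installer lays things out exactly like this is exactly the sort of thing I'll find out by trying it.)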
Re: [zfs-discuss] new install - when is zfs root offered? (snv_90)
On Tue, Jun 03, 2008 at 05:56:44PM -0700, Richard L. Hamilton wrote: How about SPARC - can it do zfs install+root yet, or if not, when? Just got a couple of nice 1TB SAS drives, and I think I'd prefer to have a mirrored pool where zfs owns the entire drives, if possible. (I'd also eventually like to have multiple bootable zfs filesystems in that pool, corresponding to multiple versions.) Are they just under 1TB? I don't believe there's any boot support in Solaris for EFI labels, which would be required for 1TB+. Don't know about Solaris or the on-disk bootloader (I would think they ought to have that eventually if not already), but since it's been a while since I've seen a new firmware update for the SB2K, I doubt the firmware could handle EFI labels. But format is perfectly happy putting either Sun or EFI labels on these drives, so that shouldn't be a problem. SCSI read capacity shows 1953525168 (512-byte) sectors, which multiplied out is 1,000,204,886,016 bytes; more than 10^12 (1TB), but less than 2^40 (1TiB). This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] new install - when is zfs root offered? (snv_90)
P.S. the ST31000640SS drives, together with the LSI SAS 3800x controller (in a 64-bit 66MHz slot) gave me, using dd with a block size of either 1024k or 16384k (1MB or 16MB) and a count of 1024, a sustained read rate that worked out to a shade over 119MB/s, even better than the nominal sustained transfer rate of 116MB/s documented for the drives. Even at a miserly 7200 RPM, that was more than 2 1/2 times faster than the internal 10,000 RPM 73GB FC-AL (2Gb/s) drives, which impressed the heck out of me. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
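(For reference, the test was nothing fancier than dd against the raw device; the device name below is an example:

    dd if=/dev/rdsk/c2t0d0s2 of=/dev/null bs=16384k count=1024
    # 1024 x 16 MB read sequentially; divide by elapsed time for MB/s

so it's measuring streaming reads, which is about the kindest workload there is for a 7200 RPM drive.)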
Re: [zfs-discuss] new install - when is zfs root offered? (snv_90)
How about SPARC - can it do zfs install+root yet, or if not, when? Just got a couple of nice 1TB SAS drives, and I think I'd prefer to have a mirrored pool where zfs owns the entire drives, if possible. (I'd also eventually like to have multiple bootable zfs filesystems in that pool, corresponding to multiple versions.) Is/will all that be possible? Would it be ok to pre-create the pool, and if so, any particular requirements? Currently running snv_81 on a Sun Blade 2000; SAS/SATA controller is an LSI Logic SAS 3800X 8-port, in the 66MHz slot. I chose SAS drives for the first two (of 8) trusting SCSI support to probably be more mature and functional than SATA support, but the rest (as I'm willing to part with the $$) will probably be SATA for price. The current two SAS drives are Seagate ST31000640SS (which I just used smartctl to confirm have SMART support including temperature reporting). Enclosure is an Enhance E8-ML (no enclosure services support). This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] The ZFS inventor and Linus sitting in a tree?
On Mon, May 19, 2008 at 10:06 PM, Bill McGonigle [EMAIL PROTECTED] wrote: On May 18, 2008, at 14:01, Mario Goebbels wrote: I mean, if the Linux folks want it, fine. But if Sun's actually helping with such a possible effort, then it's just shooting itself in the foot here, in my opinion. [...] they're quick to do it - they threatened to sue me when they couldn't figure out how to take back a try-out server). There's a story contained within that for sure! :) You brought a smile to this subscriber when I read it. Having ZFS as a de facto standard lifts all boats, IMHO. It's still hard to believe (in one sense) that the entire world isn't beating a path to Sun's door and PLEADING for ZFS. This is (if y'all will forgive the colloquialism) a kick-ass amazing piece of software. It appears to defy all the rules, a bit like levitation in a way, or perhaps it just rewrites those rules. There are days I still can't get my head around what ZFS really is. In general, licensing issues just make my brain bleed, but one hopes that the licensing gurus can get their heads together and find a way to get this done. I don't personally believe that Open Solaris *OR* Solaris will lose if ZFS makes its way over the fence to Linux, I think that this is a big enough tent for everyone. Sure hope so anyway, it would be immensely sad to see technology like this not being adopted/ported/migrated/whatever more widely because of damn lawyers and the morass called licensing. Perhaps (gazing into a cloudy crystal ball that hasn't been cleaned in a while) Solaris/Open Solaris can manage to hold onto ZFS-on-boot which is perhaps *the* most mind bending accomplishment within the zfs concept, and let the rest procreate elsewhere. That could contribute to the must-have/must-install cachet of Solaris/OpenSolaris. Umm, I think it's too late for that; as I recall, the bits needed for read-only access had to be made dual CDDL/GPL to be linked with GRUB. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS for write-only media?
Dana H. Myers [EMAIL PROTECTED] wrote: Bob Friesenhahn wrote: Are there any plans to support ZFS for write-only media such as optical storage? It seems that if mirroring or even zraid is used that ZFS would be a good basis for long term archival storage. I'm just going to assume that write-only here means write-once, read-many, since it's far too late for an April Fool's joke. I know two write-only device types: WOM Write-only media WORN Write-once read never (this one is often used for backups ;-) Jörg Save $$ (or €€) - use /dev/null instead. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] utf8only-property
So, I set utf8only=on and try to create a file with a filename that is a byte array that can't be decoded to text using UTF-8. What's supposed to happen? Should fopen(), or whatever syscall 'touch' uses, fail? Should the syscall somehow escape utf8-incompatible bytes, or maybe replace them with ?s or somesuch? Or should it automatically convert the filename from the active locale's fs-encoding (LC_CTYPE?) to UTF-8? First, utf8only can AFAIK only be set when a filesystem is created. Second, use the source, Luke: http://src.opensolaris.org/source/search?q=&defs=&refs=z_utf8&path=%2Fonnv%2Fonnv-gate%2Fusr%2Fsrc%2Futs%2Fcommon%2Ffs%2Fzfs%2Fzfs_vnops.c&hist=&project=%2Fonnv Looks to me like lookups, file create, directory create, creating symlinks, and creating hard links will all fail with error EILSEQ (Illegal byte sequence) if utf8only is enabled and they are presented with a name that is not valid UTF-8. Thus, on a filesystem where it is enabled (since creation), no such names can be created or would ever be there to be found anyway. So in that case, the system is refusing non UTF-8 compatible byte strings and there's no need to escape anything. Further, your last sentence suggests that you might hold the incorrect idea that the kernel knows or cares what locale an application is running in: it does not. Nor indeed does the kernel know about environment variables at all, except as the third argument passed to execve(2); it doesn't interpret them, or even validate that they are of the usual name=value form, they're typically handled pretty much the same as the command line args, and the only illusion of magic is that with the more widely used variants of exec that don't explicitly pass the environment, they internally call execve(2) with the external variable environ as the last arg, thus passing the environment automatically. There have been Unix-like OSs that make the environment available to additional system calls (give or take what's a true system call in the example I'm thinking of, namely variant links (symlinks with embedded environment variable references) in the now defunct Apollo Domain/OS), but AFAIK, that's not the case in those that are part of the historical Unix source lineage. (I have no idea off the top of my head whether or not Linux, or oddballs like OSF/1 might make environment variables implicitly available to syscalls other than execve(2).) This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
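A quick way to see the EILSEQ behavior for yourself (dataset name is an example; the byte string just needs to be invalid UTF-8):

    zfs create -o utf8only=on tank/strict
    touch "/tank/strict/$(printf 'bad\377name')"
    # expected result on a utf8only filesystem:
    # touch: cannot create /tank/strict/bad?name: Illegal byte sequence

whereas the same touch on a filesystem created with utf8only=off just gives you a file with an ugly name.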
Re: [zfs-discuss] vxfs vs ufs vs zfs
Hello, I have just done comparison of all the above filesystems using the latest filebench. If you are interested: http://przemol.blogspot.com/2008/02/zfs-vs-vxfs-vs-ufs-on-x4500-thumper.html Regards przemol I would think there'd be a lot more variation based on workload, such that the overall comparison may fall far short of telling the whole story. For example, IIRC, VxFS is more or less extent-based (like mainframe storage), so serial I/O for large files should be perhaps its strongest point, while other workloads may do relatively better with the other filesystems. The free basic edition sounds cool, though - downloading now. I could use a bit of practice with VxVM/VxFS; it's always struck me as very good when it was good (online reorgs of storage and such), and an utter terror to untangle when it got messed up, not to mention rather more complicated than DiskSuite/SVM (and of course _waay_ more complicated than zfs :-) Any idea if it works with reasonably recent OpenSolaris (build 81)? This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] 'du' is not accurate on zfs
On Sat, 16 Feb 2008, Richard Elling wrote: ls -l shows the length. ls -s shows the size, which may be different than the length. You probably want size rather than du. That is true. Unfortunately 'ls -s' displays in units of disk blocks and does not also consider the 'h' option in order to provide a value suitable for humans. Bob ISTR someone already proposing to make ls -h -s work in a way one might hope for. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
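In other words (file name is an example; exact block units vary with ls flags and locale):

    ls -l bigfile     # length in bytes, i.e. the offset after the last byte
    ls -s bigfile     # blocks actually allocated; compression and sparse
                      # regions make this smaller than the length on zfs
    du -h bigfile     # same allocation-based view, already humanized

so du and ls -s agree with each other and disagree with ls -l by design, not by accident.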
Re: [zfs-discuss] sharenfs with over 10000 file systems
New, yes. Aware - probably not. That users would create many filesystems, given how cheap they are, was an easy guess, but I somehow don't think anybody envisioned that users would be creating tens of thousands of filesystems. ZFS - too good for its own good :-p IMO (and given mails/posts I've seen typically by people using or wanting to use zfs at large universities and the like, for home directories) this is frequently driven by the need for per-user quotas. Since zfs doesn't have per-uid quotas, this means they end up creating (at least) one filesystem per user. That means a share per user, and locally a mount per user, which will never scale as well as (locally) a single share of /export/home, and a single mount (although there would of course be automounts to /home on demand, but they wouldn't slow down bootup). sharemgr and the like may be attempts to improve the situation, but they mitigate rather than eliminate the consequences of exploding what used to be a single large filesystem into a bunch of relatively small ones, simply based on the need to have per-user quotas with zfs. And there are still situations where a per-uid quota would be useful, such as /var/mail (although I could see that corrupting mailboxes in some cases) or other sorts of shared directories. OTOH, the implementation could certainly vary a little. The equivalent of the quotas file should be automatically created when quotas are enabled, and invisible; and unless quotas are not only disabled but purged somehow, it should maintain per-uid use statistics even for uids with no quotas, to eliminate the need for quotacheck (initialization of quotas might well be restricted to filesystem creation time, to eliminate the need for a cumbersome pass through existing data, at least at first; but that would probably be wanted too, since people don't always plan ahead). But other quota-related functionality could IMO remain, although the implementations might have to get smarter, and there ought to be some alternative to the method presently used with ufs of simply reading the quotas file to iterate through the available stats. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] 7zip compression?
Hello Marc, Sunday, July 29, 2007, 9:57:13 PM, you wrote: MB MC rac at eastlink.ca writes: Obviously 7zip is far more CPU-intensive than anything in use with ZFS today. But maybe with all these processor cores coming down the road, a high-end compression system is just the thing for ZFS to use. MB I am not sure you realize the scale of things here. Assuming the worst case: MB that lzjb (default ZFS compression algorithm) performs as bad as lha in [1], MB 7zip would compress your data only 20-30% better at the cost of being 4x-5x MB slower ! MB Also, in most cases, the bottleneck in data compression is the CPU, so MB switching to 7zip would reduce the I/O throughput by about 4x. 1. it depends on a specific case - sometimes it's cpu sometimes not 2. sometimes you don't really care about cpu - you have hundreds TBs of data rarely used and then squeezing 20-30% more space is a huge benefit - especially when you only read those files once they are written * disks are probably cheaper than CPUs * it looks to me like 7z may also be RAM-hungry; and there are probably better ways to use the RAM, too No doubt it's an option that would serve _someone_ well despite its shortcomings. But are there enough such someones to make it worthwhile? This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
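For scale, the knobs that exist today are per-dataset and cheap to experiment with (dataset name is an example; the gzip levels are only there if your build has the gzip compression support):

    zfs set compression=lzjb tank/archive      # the fast default
    zfs set compression=gzip-9 tank/archive    # much heavier CPU, better ratio
    zfs get compressratio tank/archive         # see what you actually gained

so anyone curious whether the extra 20-30% matters for their data can measure it on a copy before arguing about 7zip.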
Re: [zfs-discuss] Cluster File System Use Cases
Bringing this back towards ZFS-land, I think that there are some clever things we can do with snapshots and clones. But the age-old problem of arbitration rears its ugly head. I think I could write an option to expose ZFS snapshots to read-only clients. But in doing so, I don't see how to prevent an ill-behaved client from clobbering the data. To solve that problem, an arbiter must decide who can write where. The SCSI protocol has almost nothing to assist us in this cause, but NFS, QFS, and pxfs do. There is room for cleverness, but not at the SCSI or block level. -- richard Yeah; ISTR that IBM mainframe complexes with what they called shared DASD (DASD==Direct Access Storage Device, i.e. disk, drum, or the like) depended on extent reserves. IIRC, SCSI dropped extent reserve support, and indeed it was never widely nor reliably available anyway. AFAIK, all SCSI offers is reserves of an entire LUN; that doesn't even help with slices, let alone anything else. Nor (unlike either the VTOC structure on MVS or VxFS) is ZFS extent-based anyway; so even if extent reserves were available, they'd only help a little. Which means, as he says, some sort of arbitration. I wonder whether the hooks for putting the ZIL on a separate device will be of any use for the cluster filesystem problem; it almost makes me wonder if there could be any parallels between pNFS and a refactored ZFS. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: Re: ZFS - SAN and Raid
Victor Engle wrote: Roshan, As far as I know, there is no problem at all with using SAN storage with ZFS and it does look like you were having an underlying problem with either powerpath or the array. Correct. A write failed. The best practices guide on opensolaris does recommend replicated pools even if your backend storage is redundant. There are at least 2 good reasons for that. ZFS needs a replica for the self healing feature to work. Also there is no fsck like tool for ZFS so it is a good idea to make sure self healing can work. Yes, currently ZFS on Solaris will panic if a non-redundant write fails. This is known and being worked on, but there really isn't a good solution if a write fails, unless you have some ZFS-level redundancy. Why not? If O_DSYNC applies, a write() can still fail with EIO, right? And if O_DSYNC does not apply, an app could not assume that the written data was on stable storage anyway. Or the write() can just block until the problem is corrected (if correctable) or the system is rebooted. In any case, IMO there ought to be some sort of consistent behavior possible short of a panic. I've seen UFS based systems stay up even with their disks incommunicado for awhile, although they were hardly useful like that except insofar as activity strictly involving reading already cached pages was involved. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: Re: OT: extremely poor experience with Sun Download
Well, I just grabbed the latest SXCE, and just for the heck of it, fooled around until I got the Java Web Start to work. Basically, one's browser needs to know the following (how to do that depends on the browser): MIME Type: application/x-java-jnlp-file File Extension: jnlp Open With: /usr/bin/javaws I got that working with both firefox and opera without inordinate difficulty. Once that was done, after clicking accept and selecting the three files, I clicked on the 'download with sdm' box, it started sdm, and passed all three files to it. I think I also had to click start on sdm. That's it...not so bad after all. sdm has a major advantage over typical downloads done directly by browsers for such large files: if the server supports it (needs to be able to handle requests for portions of files rather than just an entire file), it can restart failed transfers more or less automatically; and they can even be paused and resumed more or less arbitrarily. I've used that in the past to download the entire Solaris 10 CD set over a _dialup_. Took a week (well, 8 hours a day connected), but it worked. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: OT: extremely poor experience with Sun Download
Intending to experiment with ZFS, I have been struggling with what should be a simple download routine. Sun Download Manager leaves a great deal to be desired. In the Online Help for Sun Download Manager there's a section on troubleshooting, but if it causes *anyone* this much trouble http://fuzzy.wordpress.com/2007/06/14/sundownloadmanagercrap/ then it should, surely, be fixed. Sun Download Manager -- a FORCED step in an introduction to downloadable software from Sun -- should be PROBLEM FREE in all circumstances. It gives an extraordinarily poor first impression. If it can't assuredly be fixed, then we should not be forced to use it. (True, I might have ordered rather than downloaded a DVD, but Sun Download Manager has given such a poor impression that right now I'm disinclined to pay.) For trying out zfs, you could always request the free Starter Kit DVD at http://www.opensolaris.org/kits/ which contains the SXCE, Nexenta, Belenix and Schillix distros (all newer than Solaris 10). Beyond that, while I'm sure you're right about that providing a poor first impression, I guess I'm too old to have much sympathy for something taking minutes rather than seconds of attention being a barrier to entry. Yes, the download experience should be vastly improved, but if you let that stop you, I wonder if you're all that interested in the first place. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: Re: Re: Re: Re: ZFS consistency guarantee
I wish there was a uniform way whereby applications could register their ability to achieve or release consistency on demand, and if registered, could also communicate back that they had either achieved consistency on-disk, or were unable to do so. That would allow backup procedures to automatically talk to apps capable of such functions, to get them to a known state on-disk before taking a snapshot. That would allow one to for example not stop a DBMS, but simply have it seem to pause for a moment while achieving consistency and until told that the snapshot was complete; thus providing minimum impact while still having fully usable backups (and without needing to do the database backups _through_ the DBMS). Something I heard once leads me to believe that some such facility or convention for how to communicate such issues with e.g. database server processes exists on Windows. If they've got it, we really ought to have something even better, right? :-) (That's of course not specific to ZFS, but would be useful with any filesystem that can take snapshots.) This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
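Today that handshake has to be scripted by hand per application; a minimal sketch, assuming a hypothetical dbadmin command that can quiesce and resume the database, and a dataset tank/db (both are stand-ins, not real interfaces):

    dbadmin quiesce                              # ask the app for an on-disk-consistent state
    zfs snapshot tank/db@backup-`date +%Y%m%d`
    dbadmin resume                               # app carries on; back up the snapshot at leisure

The wish above is essentially for a registration mechanism so backup tools could do this dance generically instead of per-application.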
[zfs-discuss] Re: shareiscsi is cool, but what about sharefc or sharescsi?
I'd love to be able to serve zvols out as SCSI or FC targets. Are there any plans to add this to ZFS? That would be amazingly awesome. Can one use a spare SCSI or FC controller as if it were a target? Even if the hardware is capable, I don't see what you describe as a ZFS thing really; it isn't for iSCSI, except that ZFS supports a shareiscsi option (and property?) by knowing how to tell the iSCSI server to do the right thing. That is, there would have to be something like an iSCSI server except that it listened on an otherwise unused SCSI or FC interface. I think that would require not just the daemon but probably new driver facilities as well. Given that one can run IP over FC, it seems to me that in principle it ought to be possible, at least for FC. Not so sure about SCSI. Also not sure about performance. I suspect even high-end SAN controllers have a bit more latency than the underlying drives. And this is a general-purpose OS we're talking about doing this to; I don't know that it would be acceptably close, or as robust (depending on the hardware) as a high-end FC SAN, although it might be possible to be a good deal cheaper. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
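For comparison, the iSCSI version of this is already just a property on a zvol (names and size are examples):

    zfs create -V 100g tank/lun0
    zfs set shareiscsi=on tank/lun0
    iscsitadm list target        # confirm the target the iSCSI daemon now exports

A sharefc/sharescsi property would presumably need an analogous target-mode driver underneath before it could mean anything.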
[zfs-discuss] Re: storage type for ZFS
Well, no; his quote did say software or hardware. The theory is apparently that ZFS can do better at detecting (and with redundancy, correcting) errors if it's dealing with raw hardware, or as nearly so as possible. Most SANs _can_ hand out raw LUNs as well as RAID LUNs, the folks that run them are just not used to doing it. Another issue that may come up with SANs and/or hardware RAID: supposedly, storage systems with large non-volatile caches will tend to have poor performance with ZFS, because ZFS issues cache flush commands as part of committing every transaction group; this is worse if the filesystem is also being used for NFS service. Most such hardware can be configured to ignore cache flushing commands, which is safe as long as the cache is non-volatile. The above is simply my understanding of what I've read; I could be way off base, of course. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
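The host-side tunable usually mentioned alongside that advice is zfs_nocacheflush; it's only sensible when every pool on the host sits behind genuinely non-volatile cache, since it turns off the flushes globally:

    # /etc/system (takes effect at next boot)
    set zfs:zfs_nocacheflush = 1

    # or poke it live with mdb, same caveats apply
    echo zfs_nocacheflush/W0t1 | mdb -kw

Configuring the array itself to ignore SYNCHRONIZE CACHE, where the vendor supports it, is the more targeted fix.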
[zfs-discuss] Re: Testing of UFS, VxFS and ZFS
# zpool create pool raidz d1 … d8 Surely you didn't create the zfs pool on top of SVM metadevices? If so, that's not useful; the zfs pool should be on top of raw devices. Also, because VxFS is extent based (if I understand correctly), not unlike how MVS manages disk space I might add, _it ought_ to blow the doors off of everything for sequential reads, and probably sequential writes too, depending on the write size. OTOH, if a lot of files are created and deleted, it needs to be defragmented (although I think it can do that automatically; but there's still at least some overhead while a defrag is running). Finally, don't forget complexity. VxVM+VxFS is quite capable, but it doesn't always recover from problems as gracefully as one might hope, and it can be a real bear to get untangled sometimes (not to mention moderately tedious just to set up). SVM, although not as capable as VxVM, is much easier IMO. And zfs on top of raw devices is about as easy as it gets. That may not matter _now_, when whoever sets these up is still around; but when their replacement has to troubleshoot or rebuild, it might help to have something that's as easy as possible. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
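In other words, something along these lines, straight onto the disks (device names are examples):

    zpool create tank raidz c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
                            c1t5d0 c1t6d0 c1t7d0 c1t8d0

Layering the pool on SVM metadevices just adds another volume manager in the data path and hides the real disks from zfs's own error handling.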
[zfs-discuss] Re: FreeBSD's system flags.
So you're talking about not just reserving something for on-disk compatibility, but also maybe implementing these for Solaris? Cool. Might be fairly useful for hardening systems (although as long as someone had raw device access, or physical access, they could of course still get around it; that would have to be taken into account in the overall design for it to make much of a difference). Other problems: from a quick look at the header files there's no room left in the 64-bit version of the stat structure to add something in which to retrieve the flags; that may mean a new and incompatible (with other chflags(2) supporting systems) system call? Also, there's no provision in pkgmap(4) for file flags; could that be extended compatibly? This message posted from opensolaris.org ___ zfs-discuss mailing list [EMAIL PROTECTED] http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
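For anyone who hasn't met them, this is roughly how the flags behave where they already exist (FreeBSD shown; the file name is an example):

    chflags schg /etc/master.passwd     # set the system-immutable flag
    ls -lo /etc/master.passwd           # -o adds a flags column to the listing
    chflags noschg /etc/master.passwd   # clearing it is refused at raised securelevels

which is why the raw-device and securelevel questions above matter as much as the on-disk bits themselves.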
[zfs-discuss] Re: ZFS and Linux
I hope this isn't turning into a License flame war. But why do Linux contributors not deserve the right to retain their choice of license as equally as Sun, or any other copyright holder, does? The anti-GPL kneejerk just witnessed on this list is astonishing. The BSD license, for instance, is fundamentally undesirable to many GPL licensors (myself included). Nothing wrong with GPL as an abstract ideology. But when ideology trumps practicality (which it does when code can't be as widely reused as possible), I have a problem with that. As far as I'm concerned, GPL is to open licenses as political correctness is to free speech. Of course, anyone who writes something is free to use any license they please. And anyone else is free to choose an incompatible license, either for reasons that have nothing specifically to do with being incompatible, or because they just don't want the sucking sound of their goodies being adopted and very little being returned (which strikes me as a major element of the relationship between Linux and *BSD; although to be sure, there is some two-way cooperation). I have zero problem with Linux using GPLv2 (and as some have said, perhaps being stuck with it at this point). I'm not sure I'd want their code anyway, and even if I did, I darn sure wouldn't want the we don't need no steekin' DDI 'cause we're source based philosophy that comes with it, because to my mind that ends up justifying a lot of poor design and engineering discipline in the name of not being limited by backwards compatibility. So if, having chosen a license based on the ideology of being a lever to free other software (but on their terms, since everyone else is expected to become license-compatible with them), the Linux folks now have to re-invent equivalents of ZFS and Dtrace, it serves them right, IMO. And as someone else also mentioned, competition is good anyway. Not as if a lot of ideas don't cross-pollinate. But if every free OS used compatible licenses, I think 20 years later, the result would resemble the result of inbreeding...not pretty, and a shallower meme pool overall. This message posted from opensolaris.org ___ zfs-discuss mailing list [EMAIL PROTECTED] http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: Re: How big a write to a regular file is atomic?
On Wed, Mar 28, 2007 at 06:55:17PM -0700, Anton B. Rang wrote: It's not defined by POSIX (or Solaris). You can rely on being able to atomically write a single disk block (512 bytes); anything larger than that is risky. Oh, and it has to be 512-byte aligned. File systems with overwrite semantics (UFS, QFS, etc.) will never guarantee atomicity for more than a disk block, because that's the only guarantee from the underlying disks. I thought UFS and others have a guarantee of atomicity for O_APPEND writes vis-a-vis other O_APPEND writes up to some write size. (Of course, NFS does not have true O_APPEND support, so this wouldn't apply to NFS.) That's mainly what I was thinking of, since the overwrite case would get more complicated. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] How big a write to a regular file is atomic?
and does it vary by filesystem type? I know I ought to know the answer, but it's been a long time since I thought about it, and I must not be looking at the right man pages. And also, if it varies, how does one tell? For a pipe, there's fpathconf() with _PC_PIPE_BUF, but how about for a regular file? This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] missing features? Could/should zfs support a new ioctl, constrained if needed
_FIOSATIME - why doesn't zfs support this (assuming I didn't just miss it)? Might be handy for backups. Could/should zfs support a new ioctl, constrained if needed to files of zero size, that sets an explicit (and fixed) blocksize for a particular file? That might be useful for performance in special cases when one didn't necessarily want to specify (or depend on the specification of perhaps) the attribute at the filesystem level. One could imagine a database that was itself tunable per-file to a similar range of blocksizes, which would almost certainly benefit if it used those sizes for the corresponding files. Additional capabilities that might be desirable: setting the blocksize to zero to let the system return to default behavior for a file; being able to discover the file's blocksize (does fstat() report this?) as well as whether it was fixed at the filesystem level, at the file level, or in default state. Wasn't there some work going on to add real per-user (and maybe per-group) quotas, so one doesn't necessarily need to be sharing or automounting thousands of individual filesystems (slow)? Haven't heard anything lately though... This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
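For what it's worth, the closest existing knob is per-filesystem rather than per-file, which is why carving a dataset per database works today (names and sizes are examples):

    zfs create tank/db
    zfs set recordsize=8k tank/db     # match the database's own block size
    zfs get recordsize tank/db

The per-file ioctl being wished for above would let a mixed directory get the same effect without splitting things into separate datasets.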
[zfs-discuss] mirror question
If I create a mirror, presumably if possible I use two or more identically sized devices, since it can only be as large as the smallest. However, if later I want to replace a disk with a larger one, and detach the mirror (and anything else on the disk), replace the disk (and if applicable repartition it), since it _is_ a larger disk (and/or the partitions will likely be larger since they mustn't be smaller, and blocks per cylinder will likely differ, and partitions are on cylinder boundaries), once I reattach everything, I'll now have two different sized devices in the mirror. So far, the mirror is still the original size. But what if I later replace the other disks with ones identical to the first one I replaced? With all the devices within the mirror now the larger size, will the mirror and the zpool of which it is a part expand? And if that won't happen automatically, can it (without inordinate trickery, and online, i.e. without backup and restore) be forced to do so? This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
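The sequence being described, with made-up device names:

    zpool replace tank c1t2d0 c1t4d0     # swap in the first larger disk, wait for the resilver
    zpool replace tank c1t3d0 c1t5d0     # then the second

In my (limited) understanding, once every device in the vdev is the larger size the extra space should show up, though on some builds it reportedly takes an export/import of the pool before the new size is picked up; that would be the thing to test on scratch disks before relying on it.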
[zfs-discuss] Re: UFS on zvol: volblocksize and maxcontig
I hope there will be consideration given to providing compatibility with UFS quotas (except that inode limits would be ignored). At least to the point of having edquota(1m), quot(1m), quota(1m), repquota(1m), rquotad(1m), and possibly quotactl(7i) work with zfs (with the exception previously mentioned). OTOH, quotaon(1m)/quotaoff(1m)/quotacheck(1m) may not be needed for support of per-user quotas in zfs (since it will presumably have its own ways of enabling these, and will simply never mess up?) None of which need preclude new interfaces with greater functionality (like both user and group quotas), but where there is similar functionality, IMO it would be easier for a lot of folks if quota maintenance (esp. edquota and reporting) could be done the same way for ufs and zfs. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: zpool split
...such that a snapshot (cloned if need be) won't do what you want? This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: A versioning FS
What would a version FS buy us that cron+ zfs snapshots doesn't? Some people are making money on the concept, so I suppose there are those who perceive benefits: http://en.wikipedia.org/wiki/Rational_ClearCase (I dimly remember DSEE on the Apollos; also some sort of versioning file type on (probably long-dead) Harris VOS real-time OS.) This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: ZFS + rsync, backup on steroids.
Are both of you doing a umount/mount (or export/import, I guess) of the source filesystem before both the first and second test? Otherwise, there might still be a fair bit of cached data left over from the first test, which would give the 2nd an unfair advantage. I'm fairly sure unmounting a filesystem invalidates all cached pages associated with files on that filesystem, as well as any cached [iv]node entries, all of which is needed to ensure both tests are starting from the most similar situation possible. Ideally, all this would even be done in single-user mode, so that nothing else could interfere. If there were a list of precautions to take that would put comparisons like this on firmer ground, it might provide a good starting point for such comparisons to be more than anecdotes, saving time for all concerned, both those attempting to replicate a prior casual observation for reporting, and those looking at the report. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
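One checklist item for that list: bounce the filesystem (or the whole pool) between runs so the second pass can't ride on the first pass's cache. Something like (dataset/pool names are examples):

    zfs umount tank/src && zfs mount tank/src     # drops cached file data for that filesystem
    # or, more thoroughly
    zpool export tank && zpool import tank

plus unmounting/remounting the other filesystem in the comparison the same way, so both start cold.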
[zfs-discuss] Re: Re: Re: SCSI synchronize cache cmd
Filed as 6462690. If our storage qualification test suite doesn't yet check for support of this bit, we might want to get that added; it would be useful to know (and gently nudge vendors who don't yet support it). Is either the test suite, or at least a list of what it tests (which it looks like may more or less track what Solaris requires) publicly available, or could it be made so? Seems to me that if people can independently discover problem hardware, that might make your job easier insofar as they're smarter before they start asking you questions; even more so if they feed back what they find (not unlike the do-it-yourself x86 compatibility testing). This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss