[zfs-discuss] Announcing ZFS code discussion forum

2006-05-03 Thread Eric Schrock
Folks -

Given the response to my previous mail, we've created
'[EMAIL PROTECTED]' and the corresponding Jive discussion forum:

http://www.opensolaris.org/jive/forum.jspa?forumID=131

This forum should be used for detailed discussion of ZFS code,
implementation details, code review requests, porting problems, etc.
General questions about ZFS, or discussion of user-visible ZFS
architecture should continue to be directed to
'zfs-discuss@opensolaris.org'.  Please do not cross-post between the two
lists.

I've gone ahead and added the ZFS team members to this new list.  Anyone
else wishing to subscribe to the forum should send mail to
'[EMAIL PROTECTED]'.

Thanks,

Eric

--
Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] XATTRs, ZAP and the Mac

2006-05-03 Thread Matthew Ahrens
On Wed, May 03, 2006 at 03:22:53PM -0400, Maury Markowitz wrote:
  I think that's the disconnect. WHY are they full-fledged files?
 
 Because that's what the specification calls for.
 
 Right, but that's my concern. To me this sounds like historically 
 circular reasoning...
 
 20xx) we need a new file system that supports xattrs
 well xattrs are this second file, so...
 
 To me it appears that there is some confusion between the purpose and 
 implementation.

 Certainly if xattrs were originally introduced to store, well, x
 attrs, then the implementation is a poor one. Years later the
 _implementation_ was copied, even though it was never a good one.

I think you are confusing the interface with the implementation.  ZFS
has copied (aka. adhered to) a pre-existing interface[*].  Our
implementation of that interface is in some ways similar to other
implementations.  I believe that our implementation is a very good one,
but if you have specific suggestions for how it could be improved, we'd
love to hear them.

[*] The Solaris extended attributes interface is actually more
accurately called "named streams", and has been used as the back-end for
the CIFS (Windows) and NFSv4 named-streams protocols.  See the fsattr(5)
manpage.
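
A minimal sketch of using that interface from C, via the attropen(3C)
call described in fsattr(5) (the file name and attribute name below are
just made-up examples, and error handling is kept minimal):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    const char *data = "image/jpeg";
    int fd;

    /*
     * Open (creating it if necessary) a named attribute "mimetype"
     * attached to the existing file "photo.jpg"; from here on it
     * behaves like an ordinary file descriptor.
     */
    fd = attropen("photo.jpg", "mimetype", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd == -1) {
        perror("attropen");
        return (1);
    }
    if (write(fd, data, strlen(data)) == -1)
        perror("write");
    (void) close(fd);
    return (0);
}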

We appreciate your suggestion that we implement a higher-performance
method for storing additional metadata associated with files.  This will
most likely not be possible within the extended attribute interface, and
will require that we design (and applications use) a new interface.
Having specific examples of how that interface would be used will help
us to design a useful feature.

 The real problem is that there is nothing like a general overview of
 the zfs system as a whole

I agree that a higher-level overview would be useful.

 COMPARING the system with the widely understood UFS would be
 invaluable, IMHO.

Agreed, thanks for the suggestion.  Unfortunately, ZFS and UFS are
sufficiently different that I think the comparison would only be useful
for a very limited part of ZFS, say from the file/directory down.

 But to the specifics. You asked why I thought it was that the file
 name did not appear. Well, that's because the term file name (or
 filename) does not appear anywhere in the document.

Thanks, maybe we should use that keyword in section 6.2 to help when
doing a search.

 So then, at a first glance it seems that one would expect to find the
 directory description in Chapter 6, which has a subsection called
 Directories and Directory Traversal.

I believe that that section does in fact describe directories.  Perhaps
the description could be made more explicit (e.g., "The ZAP object which
stores the directory maps from filename to object number.  Each entry in
the ZAP is a single directory entry.  The entry's name is the filename,
and its value is the object number which identifies that file.").
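
In other words, a directory is conceptually just a map from name to
object number.  A toy model in C (purely illustrative; the real ZAP is
a scalable hashed on-disk structure, not a linear array):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Toy model of a directory: each entry maps a filename to the object
 * number of the znode that holds the file's metadata.
 */
typedef struct dir_entry {
    const char *de_name;    /* the ZAP entry's name: the filename */
    uint64_t de_object;     /* the ZAP entry's value: an object number */
} dir_entry_t;

/*
 * Look up "name" in a directory; return its object number, or 0 if the
 * name is not present (0 stands in for "no such entry" here).
 */
static uint64_t
dir_lookup(const dir_entry_t *entries, size_t nentries, const char *name)
{
    for (size_t i = 0; i < nentries; i++) {
        if (strcmp(entries[i].de_name, name) == 0)
            return (entries[i].de_object);
    }
    return (0);
}

int
main(void)
{
    dir_entry_t dir[] = { { "passwd", 1234 }, { "shadow", 1235 } };

    (void) printf("passwd -> object %llu\n",
        (unsigned long long)dir_lookup(dir, 2, "passwd"));
    return (0);
}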

 That section describes the znode_phys_t structure.

You're right, it also describes the znode_phys_t.  There should be a
section break after the first paragraph, before we start talking about
the znode_phys_t.

 Maybe I'm going down a dark alley here, but is there any reason this
 split still exists under zfs? I.e., I assumed that the znode_phys_t would
 be located in the directory ZAP, because to my mind, that's where
 metadata belongs.

ZFS must support POSIX semantics, part of which is hard links.  Hard
links allow you to create multiple names (directory entries) for the
same file.  Therefore, all UNIX filesystems have chosen to store the
file information separately from the directory entries (otherwise, you'd
have multiple copies, and need pointers between all of them so you could
update them all -- yuck).
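
The standard demonstration needs nothing ZFS-specific, just link(2) and
stat(2); both names resolve to the same object, which is why the file's
metadata cannot live in either directory entry alone:

#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int
main(void)
{
    struct stat sa, sb;
    int fd;

    /* Create a file, then a second name (hard link) for it. */
    if ((fd = open("a", O_CREAT | O_WRONLY, 0644)) == -1) {
        perror("open");
        return (1);
    }
    (void) close(fd);
    if (link("a", "b") == -1) {
        perror("link");
        return (1);
    }

    /* Both names report the same object number and a link count of 2. */
    (void) stat("a", &sa);
    (void) stat("b", &sb);
    (void) printf("a: object %llu, links %u\n",
        (unsigned long long)sa.st_ino, (unsigned)sa.st_nlink);
    (void) printf("b: object %llu, links %u\n",
        (unsigned long long)sb.st_ino, (unsigned)sb.st_nlink);
    return (0);
}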

Hard links suck for FS designers because they constrain our
implementation in this way.  We'd love to have the flexibility to easily
store metadata with the directory entry.  We've actually contemplated
caching the metadata needed to do a stat(2) in the directory entry, to
improve performance of directory traversals like find(1).  Perhaps we'll
be able to add this performance improvement in a future release.

--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: PSARC 2006/288 zpool history

2006-05-03 Thread Ed Gould

On May 3, 2006, at 15:21, eric kustarz wrote:
There are basically two writes that need to happen: one for the time and one
for the subcommand string.  The kernel just needs to make sure that if a
write completes, the data is parseable (has a delimiter).  It's then up
to the userland parser (zpool history) to figure out if there are
incomplete records, but the kernel guarantees it is parseable.  The
userland command only prints out complete records, nothing partial.
So the userland command needs to handle the case where, say, the time
entry for a record was written but the subcommand was not.


I'm not clear on how the parser knows enough to do that.  I believe I 
saw that a record looked like


\0arbitrary number of NUL-terminated strings

If this is correct, how can the parser know if a string (or part of 
one) got dropped?


I think this might be a case where a structured record (like the 
compact XML suggestion made earlier) would help.  At least having 
distinguished start and end markers (whether they be one byte each, 
or XML constructs) for a record looks necessary to me.
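
Something like the following sketch (the marker bytes and record layout
are invented purely for illustration, not the proposed on-disk format):
frame each record with begin/end markers, and have the userland side
print only records whose end marker actually made it to disk.

#include <stdio.h>

#define REC_BEGIN 0x01  /* hypothetical start-of-record marker */
#define REC_END   0x02  /* hypothetical end-of-record marker */

/*
 * Print only complete records, i.e. those with both markers present.
 * A trailing record whose end marker never made it to disk is skipped.
 */
static void
print_complete_records(const char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        if (buf[i] != REC_BEGIN) {  /* resync on anything unexpected */
            i++;
            continue;
        }
        size_t start = ++i;
        while (i < len && buf[i] != REC_END)
            i++;
        if (i == len)               /* torn record at the end: skip it */
            break;
        (void) printf("%.*s\n", (int)(i - start), buf + start);
        i++;                        /* step past the end marker */
    }
}

int
main(void)
{
    /* One complete record followed by one torn (incomplete) record. */
    const char buf[] = { REC_BEGIN, 'z', 'p', 'o', 'o', 'l', ' ',
        'c', 'r', 'e', 'a', 't', 'e', REC_END,
        REC_BEGIN, 'z', 'f', 's', ' ', 'd', 'e' };

    print_complete_records(buf, sizeof (buf));
    return (0);
}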


--Ed

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: PSARC 2006/288 zpool history

2006-05-03 Thread Nicolas Williams
On Wed, May 03, 2006 at 03:34:56PM -0700, Ed Gould wrote:
 I think this might be a case where a structured record (like the 
 compact XML suggestion made earlier) would help.  At least having 
 distinguished start and end markers (whether they be one byte each, 
 or XML constructs) for a record looks necessary to me.

NVLISTs are perfect for this...
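
For example, a self-describing record built with libnvpair might look
roughly like this (the field names are made up for illustration and
aren't necessarily what the case would use; compile with -lnvpair):

#include <libnvpair.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int
main(void)
{
    nvlist_t *rec;
    char *buf = NULL;
    size_t buflen = 0;

    if (nvlist_alloc(&rec, NV_UNIQUE_NAME, 0) != 0) {
        (void) fprintf(stderr, "nvlist_alloc failed\n");
        return (1);
    }

    /* Hypothetical fields for a single history record. */
    (void) nvlist_add_uint64(rec, "time", (uint64_t)time(NULL));
    (void) nvlist_add_string(rec, "subcommand", "zfs create tank/fs");
    (void) nvlist_add_string(rec, "zone", "global");

    /*
     * The packed form is self-describing, so a truncated record should
     * fail to unpack rather than be silently misparsed.
     */
    if (nvlist_pack(rec, &buf, &buflen, NV_ENCODE_XDR, 0) == 0) {
        (void) printf("packed record: %u bytes\n", (unsigned)buflen);
        free(buf);
    }
    nvlist_free(rec);
    return (0);
}
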
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 'zpool history' proposal

2006-05-03 Thread Edward Pilatowicz
On Wed, May 03, 2006 at 03:05:25PM -0700, Eric Schrock wrote:
 On Wed, May 03, 2006 at 02:47:57PM -0700, eric kustarz wrote:
  Jason Schroeder wrote:
 
  eric kustarz wrote:
  
  The following case is about to go to PSARC.  Comments are welcome.
  
  eric
  
  To piggyback on earlier comments re: adding hostname and user:
  
  What is the need for zpool history to distinguish zfs commands that
  were executed by privileged users in non-global zones for those
  datasets under ngz ownership?
  
  I personally don't see a need to distinguish between zones.  However,
  with delegated administration, it would be nice to know who did (say)
  destroy that file system - the local root or some remote user.

 Keep in mind that one username (or uid) in a local zone is different
 from the same username in the global zone, since they can be running
 different name services.  In the simplest example, you could have an
 entry that said something like:

 root  zfs destroy tank/foo

 And if you were using datasets delegated to local zones, you wouldn't
 know if that was 'root' in the global zone or 'root' in the local zone.
 If you are going to log a user at all, you _need_ to log the zone name
 as well.  Even without usernames, it would probably be useful to know
 that a particular action was done in a particular zone.

 Imagine a service provider with several zones delegated to different
 users, and each user has their own portion of the namespace.  At some
 point, you get a service call from a customer saying "someone deleted my
 filesystems!"  You could look at the zpool history, but without a
 zone name, you wouldn't know if it was your fault (from the global zone) or
 theirs (from the local zone).

 - Eric


why don't you see a need to distinguish between zones?

in most cases (but not all) a zone administrator doesn't deal with
pools.  they deal with datasets allocated to their zone, and for the
same reasons that the global zone administrator might want access to zfs
command histories, a zone administrator might want access to zfs
command histories that apply to datasets allocated to their zones.

which makes me wonder if perhaps zfs command history buffers should
also be supported on datasets allocated to zones?  or perhaps a zone
administrator should be able to view a subset of the zfs command
history, specifically the transactions that affect datasets allocated
to their zone?

ed
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 'zpool history' proposal

2006-05-03 Thread Craig Cory
I, too, am late to this thread, but I caught something that didn't seem right
to me in this specific example. For the administration of non-global
zones, Sun Education (for whom I am an instructor) is stressing that the ng
zones are "Software Virtualizations" (my quotes) and that the hardware and
infrastructure are managed by the global zone admin. In this case, the ngz
admins would not have access or permission to corrupt their filesystems at the
zpool/zfs level. Unless zfs is to offer a different management model, I don't
suspect we will need to differentiate the (incapacitated) ngz admins from the
gz admins.

Regards,

Craig

On Wed, May 3, 2006 3:05 pm, Eric Schrock said:
 On Wed, May 03, 2006 at 02:47:57PM -0700, eric kustarz wrote:
 Jason Schroeder wrote:

 eric kustarz wrote:
 
 The following case is about to go to PSARC.  Comments are welcome.
 
 eric
 
 To piggyback on earlier comments re: adding hostname and user:
 
 What is the need for zpool history to distinguish zfs commands that
 were executed by privileged users in non-global zones for those
 datasets under ngz ownership?
 
 I personally don't see a need to distinguish between zones.  However,
 with delegated administration, it would be nice to know who did (say)
 destroy that file system - the local root or some remote user.

 Keep in mind that one username (or uid) in a local zone is different
 from the same username in the global zone, since they can be running
 different name services.  In the simplest example, you could have an
 entry that said something like:

 root  zfs destroy tank/foo

 And if you were using datasets delegated to local zones, you wouldn't
 know if that was 'root' in the global zone or 'root' in the local zone.
 If you are going to log a user at all, you _need_ to log the zone name
 as well.  Even without usernames, it would probably be useful to know
 that a particular action was done in a particular zone.

 Imagine a service provider with several zones delegated to different
 users, and each user has their own portion of the namespace.  At some
 point, you get a service call from a customer saying "someone deleted my
 filesystems!"  You could look at the zpool history, but without a
 zone name, you wouldn't know if it was your fault (from the global zone) or
 theirs (from the local zone).

 - Eric

 --
 Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 'zpool history' proposal

2006-05-03 Thread eric kustarz

Bill Sommerfeld wrote:
...

So it's really both - the subcommand successfully executes when it's
actually written to disk and the txg is synced.
   



I found myself backtracking while reading that sentence due to the
ambiguity in the first half -- did you mean the write of the literal
text of the command itself to the audit trail, or the intended changes
to the pool produced by the subcommand?  (just a wordsmithing thing,
really.)
 



Sorry, "the subcommand" refers to whatever writes currently happen
today to put something like 'zfs create pool/fs' on disk.


"The log I/O" refers to the new writes that happen with the proposed
changes for command history.


So we're trying to determine what happens when the newly introduced 
writes (due to the command history) fail.



eric
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: 'zpool history' proposal

2006-05-03 Thread Matthew A. Ahrens
 # zpool history jen
 History for 'jen':
 2006-04-27T10:38:36 zpool create jen mirror ...

I have two suggestions which are just minor nits compared with the rest of this 
discussion:

1. Why do you print a 'T' between the date and the time?  I think a space would
be more readable.  (See the strftime sketch below.)

2. When printing the history for a specific pool, I don't think we should print
the "History for 'pool':" line.  It seems unnecessary, and that way every line
of output will have the same format (better for later machine parsing).
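
Regarding suggestion 1, the difference is a single character in the
strftime(3C) format string; a quick illustration:

#include <stdio.h>
#include <time.h>

int
main(void)
{
    char iso[32], spaced[32];
    time_t now = time(NULL);
    struct tm *tm = localtime(&now);

    /* ISO 8601 style, with the 'T' separator ... */
    (void) strftime(iso, sizeof (iso), "%Y-%m-%dT%H:%M:%S", tm);
    /* ... versus a space, as suggested above. */
    (void) strftime(spaced, sizeof (spaced), "%Y-%m-%d %H:%M:%S", tm);

    (void) printf("%s\n%s\n", iso, spaced);
    return (0);
}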

--matt
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] XATTRs, ZAP and the Mac

2006-05-03 Thread Bill Sommerfeld
On Wed, 2006-05-03 at 17:20, Matthew Ahrens wrote:
 We appreciate your suggestion that we implement a higher-performance
 method for storing additional metadata associated with files.  This will
 most likely not be possible within the extended attribute interface, and
 will require that we design (and applications use) a new interface.
 Having specific examples of how that interface would be used will help
 us to design a useful feature.

Another potential consumer for an extra-metadata extension is Trusted
Extensions, for per-file security labels and similar obscurity.




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] been busy working on ZFS stuff

2006-05-03 Thread James Dickens

In case anyone is bored and wants some ZFS reading, here are some links for you.

comparison of ZFS vs. Linux RAID and LVM
http://unixconsult.org/zfs_vs_lvm.html


zfs ready for home use  http://uadmin.blogspot.com/2006/05/why-zfs-for-home.html

moving zfs filesystems using the zfs backup/restore commands
http://uadmin.blogspot.com/2006/05/moving-zfs-pools.html

James Dickens
uadmin.blogspot.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss