[zfs-discuss] Announcing ZFS code discussion forum
Folks -

Given the response to my previous mail, we've created '[EMAIL PROTECTED]' and the corresponding Jive discussion forum:

http://www.opensolaris.org/jive/forum.jspa?forumID=131

This forum should be used for detailed discussion of ZFS code, implementation details, code review requests, porting problems, etc. General questions about ZFS, or discussion of user-visible ZFS architecture, should continue to be directed to 'zfs-discuss@opensolaris.org'. Please do not cross-post between the two lists.

I've gone ahead and added the ZFS team members to this new list. Anyone else wishing to subscribe to the forum should send mail to '[EMAIL PROTECTED]'.

Thanks,
Eric

--
Eric Schrock, Solaris Kernel Development
http://blogs.sun.com/eschrock

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] XATTRs, ZAP and the Mac
On Wed, May 03, 2006 at 03:22:53PM -0400, Maury Markowitz wrote:
> > > I think that's the disconnect. WHY are they full-fledged files?
> >
> > Because that's what the specification calls for.
>
> Right, but that's my concern. To me this sounds like historically
> circular reasoning... 20xx) we need a new file system that supports
> xattrs well; xattrs are this second file, so... To me it appears that
> there is some confusion between the purpose and the implementation.
> Certainly if xattrs were originally introduced to store, well, extended
> attributes, then the implementation is a poor one. Years later the
> _implementation_ was copied, even though it was never a good one.

I think you are confusing the interface with the implementation. ZFS has copied (i.e. adhered to) a pre-existing interface[*]. Our implementation of that interface is in some ways similar to other implementations. I believe that our implementation is a very good one, but if you have specific suggestions for how it could be improved, we'd love to hear them.

[*] The Solaris extended attributes interface is actually more accurately called named streams, and has been used as the back end for the CIFS (Windows) and NFSv4 named-streams protocols. See the fsattr(5) manpage.

We appreciate your suggestion that we implement a higher-performance method for storing additional metadata associated with files. This will most likely not be possible within the extended attribute interface, and will require that we design (and that applications use) a new interface. Having specific examples of how that interface would be used will help us to design a useful feature.

> The real problem is that there is nothing like a general overview of
> the ZFS system as a whole.

I agree that a higher-level overview would be useful.

> COMPARING the system with the widely understood UFS would be
> invaluable, IMHO.

Agreed, thanks for the suggestion.
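To make the "attributes are full-fledged files" point concrete, here is a conceptual sketch (in Python, purely for illustration) of the named-streams model the fsattr(5) interface describes: each attribute is itself a small file, reached through a hidden per-file attribute directory. The class and method names are invented for this sketch and are not part of any real API; on Solaris the equivalent operation is opening the attribute with openat() and the O_XATTR flag.

```python
# Conceptual model only: an extended attribute is itself a file, living
# in a hidden attribute directory hanging off its parent file.

class AttrFile:
    """An extended attribute behaves like an ordinary file: it has its
    own contents and can be read or written independently."""
    def __init__(self, data=b""):
        self.data = data

class File:
    """A file plus its hidden attribute directory (name -> AttrFile)."""
    def __init__(self, data=b""):
        self.data = data
        self.xattrs = {}          # the hidden attribute directory

    def open_xattr(self, name):
        # Open an attribute "file", creating it if absent -- loosely
        # analogous to openat(fd, name, O_XATTR | O_CREAT) on Solaris.
        return self.xattrs.setdefault(name, AttrFile())

f = File(b"document body")
f.open_xattr("com.example.finderinfo").data = b"32 bytes of metadata"
```

Because the attribute is a file in its own right, it carries the full weight of one -- which is exactly the implementation cost being debated in this thread.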
Unfortunately, ZFS and UFS are sufficiently different that I think the comparison would only be useful for a very limited part of ZFS, say from the file/directory level down.

But to the specifics.

> You asked why I thought it was that the file name did not appear.
> Well, that's because the term file name (or filename) does not appear
> anywhere in the document.

Thanks; maybe we should use that keyword in section 6.2 to help when doing a search.

> So then, at a first glance it seems that one would expect to find the
> directory description in Chapter 6, which has a subsection called
> Directories and Directory Traversal.

I believe that that section does in fact describe directories. Perhaps the description could be made more explicit (e.g.: The ZAP object which stores the directory maps from filename to object number. Each entry in the ZAP is a single directory entry. The entry's name is the filename, and its value is the object number which identifies that file.)

> That section describes the znode_phys_t structure.

You're right, it also describes the znode_phys_t. There should be a section break after the first paragraph, before we start talking about the znode_phys_t.

> Maybe I'm going down a dark alley here, but is there any reason this
> split still exists under ZFS? I.e., I assumed that the znode_phys_t
> would be located in the directory ZAP, because to my mind, that's
> where metadata belongs.

ZFS must support POSIX semantics, part of which is hard links. Hard links allow you to create multiple names (directory entries) for the same file. Therefore, all UNIX filesystems have chosen to store the file information separately from the directory entries (otherwise, you'd have multiple copies, and need pointers between all of them so you could update them all -- yuck). Hard links suck for FS designers because they constrain our implementation in this way. We'd love to have the flexibility to easily store metadata with the directory entry.
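The directory/znode split described above can be sketched in a few lines. This is a toy model, not the on-disk format: a directory is a ZAP mapping filename to object number, file metadata (the znode) lives in a separate object table, and a hard link is simply a second name holding the same object number.

```python
# Toy model of the split: one znode, two directory entries.
object_table = {
    123: {"type": "file", "links": 2, "size": 4096},  # the znode
}
directory_zap = {
    "report.txt": 123,   # directory entry: filename -> object number
    "report.lnk": 123,   # hard link: second name, same object number
}

def lookup(name):
    """Resolve a name to its file metadata via the object number."""
    return object_table[directory_zap[name]]
```

Storing the znode inside the directory entry instead would leave the two names of a hard-linked file with two copies of the metadata to keep in sync, which is exactly the problem described above.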
We've actually contemplated caching the metadata needed to do a stat(2) in the directory entry, to improve performance of directory traversals like find(1). Perhaps we'll be able to add this performance improvement in a future release.

--matt
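The contemplated optimization might look something like the following sketch (hypothetical, not implemented): the stat(2) fields a traversal needs are cached in the directory entry itself, so a find(1)-style walk can avoid a second lookup per file, while the authoritative copy still lives with the znode.

```python
# Hypothetical sketch of stat-caching in directory entries.
# Authoritative metadata stays in the object table (the znode) ...
object_table = {123: {"mode": 0o644, "size": 4096, "mtime": 1146700000}}

# ... but the directory entry carries a cached copy of the stat fields,
# which must be refreshed whenever the file changes.
directory = {
    "report.txt": (123, {"mode": 0o644, "size": 4096, "mtime": 1146700000}),
}

def fast_stat(name):
    """Answer a stat-style query from the directory entry alone,
    without touching the object table."""
    _, cached = directory[name]
    return cached
```

The cost, of course, is cache coherence: every hard link's cached copy would need updating on change, which is why this remains a contemplated optimization rather than a done deal.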
[zfs-discuss] Re: PSARC 2006/288 zpool history
On May 3, 2006, at 15:21, eric kustarz wrote:
> There are basically two writes that need to happen: one for the time
> and one for the subcommand string. The kernel just needs to make sure
> that if a write completes, the data is parseable (has a delimiter).
> It's then up to the userland parser (zpool history) to figure out if
> there are incomplete records, but the kernel guarantees it is
> parseable. The userland command only prints out complete records,
> nothing partial. So the userland command needs to handle the case
> where, say, for one record the time entry was written but the
> subcommand was not.

I'm not clear on how the parser knows enough to do that. I believe I saw that a record looked like:

	<arbitrary number of NUL-terminated strings>

If this is correct, how can the parser know if a string (or part of one) got dropped? I think this might be a case where a structured record (like the compact XML suggestion made earlier) would help. At least having distinguished start and end markers (whether they be one byte each, or XML constructs) for a record looks necessary to me.

--Ed
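The userland rule described above -- print only complete records, drop a trailing fragment whose write never finished -- can be sketched as follows. The record layout (NUL-terminated strings) is taken from this thread; the function name is invented for illustration.

```python
# Sketch: records are NUL-terminated strings; a trailing fragment with
# no delimiter is an incomplete write and must be silently dropped.
def parse_history(buf: bytes):
    """Return only the complete (NUL-terminated) records in buf."""
    records = []
    start = 0
    while True:
        end = buf.find(b"\0", start)
        if end == -1:          # no delimiter before end of buffer:
            break              # an incomplete record -- drop it
        records.append(buf[start:end].decode())
        start = end + 1
    return records

# Two complete records, then a write that was cut short mid-string:
log = b"2006-04-27T10:38:36\0zpool create jen mirror\0zfs des"
```

Note that this scheme handles a *truncated* string, but -- as Ed points out -- it cannot detect a whole string that was dropped cleanly, since nothing distinguishes the boundary between two records from the boundary between two strings within one record.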
[zfs-discuss] Re: PSARC 2006/288 zpool history
On Wed, May 03, 2006 at 03:34:56PM -0700, Ed Gould wrote:
> I think this might be a case where a structured record (like the
> compact XML suggestion made earlier) would help. At least having
> distinguished start and end markers (whether they be one byte each,
> or XML constructs) for a record looks necessary to me.

NVLISTs are perfect for this...
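A rough illustration of why a self-describing container (such as a Solaris nvlist) fits here: each record carries an explicit length, so a reader can tell a truncated record from a complete one without guessing at delimiters. This is a toy encoding for illustration only, not libnvpair's actual format.

```python
# Toy length-prefixed name/value encoding, nvlist-flavored.
import struct

def pack_record(pairs: dict) -> bytes:
    body = b"".join(
        k.encode() + b"\0" + v.encode() + b"\0" for k, v in pairs.items()
    )
    return struct.pack(">I", len(body)) + body   # 4-byte length prefix

def unpack_records(buf: bytes):
    """Decode complete records; stop cleanly at a truncated tail."""
    out, off = [], 0
    while off + 4 <= len(buf):
        (length,) = struct.unpack_from(">I", buf, off)
        if off + 4 + length > len(buf):
            break                                # truncated record
        body = buf[off + 4 : off + 4 + length]
        fields = body.split(b"\0")[:-1]
        out.append({fields[i].decode(): fields[i + 1].decode()
                    for i in range(0, len(fields), 2)})
        off += 4 + length
    return out
```

With bare NUL-terminated strings, a reader cannot tell where one record ends and the next begins; with the length prefix, a half-written record is unambiguously detectable and ignorable.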
Re: [zfs-discuss] 'zpool history' proposal
On Wed, May 03, 2006 at 03:05:25PM -0700, Eric Schrock wrote:
> On Wed, May 03, 2006 at 02:47:57PM -0700, eric kustarz wrote:
> > Jason Schroeder wrote:
> > > eric kustarz wrote:
> > > > The following case is about to go to PSARC. Comments are welcome.
> > > > eric
> > >
> > > To piggyback on earlier comments re: adding hostname and user:
> > > What is the need for zpool history to distinguish zfs commands that
> > > were executed by privileged users in non-global zones for those
> > > datasets under ngz ownership?
> >
> > I personally don't see a need to distinguish between zones.
>
> However, with delegated administration, it would be nice to know who
> did (say) destroy that file system - the local root or some remote
> user. Keep in mind that one username (or uid) in a local zone is
> different from the same username in the global zone, since they can be
> running different name services. In the simplest example, you could
> have an entry that said something like:
>
>	root zfs destroy tank/foo
>
> And if you were using datasets delegated to local zones, you wouldn't
> know if that was 'root' in the global zone or 'root' in the local
> zone. If you are going to log a user at all, you _need_ to log the
> zone name as well.
>
> Even without usernames, it would probably be useful to know that a
> particular action was done in a particular zone. Imagine a service
> provider with several zones delegated to different users, and each
> user has their own portion of the namespace. At some point, you get a
> service call from a customer saying "someone deleted my filesystems".
> You could look at the zpool history, but without a zone name, you
> wouldn't know if it was your fault (from the global zone) or theirs
> (from the local zone).
>
> - Eric

why don't you see a need to distinguish between zones? in most cases (but not all) a zone administrator doesn't deal with pools. they deal with datasets allocated to their zone, and for the same reasons that the global zone administrator might want access to zfs command histories, a zone administrator might want access to zfs command histories that apply to datasets allocated to their zones.

which makes me wonder if perhaps zfs command history buffers should also be supported on datasets allocated to zones? or perhaps a zone administrator should be able to view a subset of the zfs command history, specifically the transactions that affect datasets allocated to their zone?

ed
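The two suggestions in this thread -- log the zone name alongside the user, and give a zone administrator a filtered view -- can be sketched together. The record format here is hypothetical, purely to show how the ambiguity between global-zone root and local-zone root disappears once the zone is recorded.

```python
# Hypothetical history records carrying zone name as well as user:
# without the "zone" field, the two "root" entries below would be
# indistinguishable.
history = [
    {"zone": "global",  "user": "root", "cmd": "zpool create tank mirror"},
    {"zone": "webzone", "user": "root", "cmd": "zfs destroy tank/foo"},
]

def history_for_zone(zone):
    """The per-zone view: only commands issued from that zone, which is
    the subset a zone administrator would be allowed to see."""
    return [rec for rec in history if rec["zone"] == zone]
```

With records shaped like this, the service-provider scenario above resolves immediately: the destroy either carries the customer's zone name or it doesn't.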
Re: [zfs-discuss] 'zpool history' proposal
I, too, am late to this thread, but I caught something that didn't seem right to me in this specific example. For the administration of non-global zones, Sun Education (for whom I am an instructor) is stressing that the ng zones are "Software Virtualizations" (my quotes) and that the hardware and infrastructure are managed by the global zone admin. In this case, the ngz admins would not have access or permission to corrupt their filesystems at the zpool/zfs level. Unless zfs is to offer a different management model, I don't suspect we will need to differentiate the (incapacitated) ngz admins from the gz admins.

Regards,
Craig

On Wed, May 3, 2006 3:05 pm, Eric Schrock said:
> On Wed, May 03, 2006 at 02:47:57PM -0700, eric kustarz wrote:
> > Jason Schroeder wrote:
> > > To piggyback on earlier comments re: adding hostname and user:
> > > What is the need for zpool history to distinguish zfs commands that
> > > were executed by privileged users in non-global zones for those
> > > datasets under ngz ownership?
> >
> > I personally don't see a need to distinguish between zones.
>
> However, with delegated administration, it would be nice to know who
> did (say) destroy that file system - the local root or some remote
> user. Keep in mind that one username (or uid) in a local zone is
> different from the same username in the global zone, since they can be
> running different name services. In the simplest example, you could
> have an entry that said something like:
>
>	root zfs destroy tank/foo
>
> And if you were using datasets delegated to local zones, you wouldn't
> know if that was 'root' in the global zone or 'root' in the local
> zone. If you are going to log a user at all, you _need_ to log the
> zone name as well.
>
> Even without usernames, it would probably be useful to know that a
> particular action was done in a particular zone. Imagine a service
> provider with several zones delegated to different users, and each
> user has their own portion of the namespace. At some point, you get a
> service call from a customer saying "someone deleted my filesystems".
> You could look at the zpool history, but without a zone name, you
> wouldn't know if it was your fault (from the global zone) or theirs
> (from the local zone).
>
> - Eric
> --
> Eric Schrock, Solaris Kernel Development
> http://blogs.sun.com/eschrock
Re: [zfs-discuss] 'zpool history' proposal
Bill Sommerfeld wrote:
> > ... So it's really both - the subcommand successfully executes when
> > it's actually written to disk and the txg is synced.
>
> I found myself backtracking while reading that sentence due to the
> ambiguity in the first half -- did you mean the write of the literal
> text of the command itself to the audit trail, or the intended changes
> to the pool produced by the subcommand? (just a wordsmithing thing,
> really.)

Sorry, "the subcommand" refers to whatever writes currently happen today to put something like 'zfs create pool/fs' on disk. "The log I/O" refers to the new writes that happen with the proposed changes for command history. So we're trying to determine what happens when the newly introduced writes (due to the command history) fail.

eric
[zfs-discuss] Re: 'zpool history' proposal
> # zpool history jen
> History for 'jen':
> 2006-04-27T10:38:36 zpool create jen mirror ...

I have two suggestions, which are just minor nits compared with the rest of this discussion:

1. Why do you print a 'T' between the date and the time? I think a space would be more readable.

2. When printing the history for a specific pool, I don't think we should print the "History for 'pool':" line. It seems unnecessary, and that way every line of output will have the same format (better for later machine parsing).

--matt

This message posted from opensolaris.org
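For what it's worth, the 'T' in the sample output is the ISO 8601 date/time separator; the suggestion is to trade standards conformance for readability. Sketched with Python's strftime, the two formats differ by a single character:

```python
# The two timestamp styles discussed above, side by side.
from datetime import datetime

ts = datetime(2006, 4, 27, 10, 38, 36)
iso_style = ts.strftime("%Y-%m-%dT%H:%M:%S")   # as 'zpool history' prints it
readable  = ts.strftime("%Y-%m-%d %H:%M:%S")   # the suggested alternative
```

One consideration either way: with the 'T', a timestamp is a single whitespace-free token, which keeps naive field-splitting parsers happy.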
Re: [zfs-discuss] XATTRs, ZAP and the Mac
On Wed, 2006-05-03 at 17:20, Matthew Ahrens wrote:
> We appreciate your suggestion that we implement a higher-performance
> method for storing additional metadata associated with files. This
> will most likely not be possible within the extended attribute
> interface, and will require that we design (and applications use) a
> new interface. Having specific examples of how that interface would
> be used will help us to design a useful feature.

Another potential consumer for an extra-metadata extension is Trusted Extensions, for per-file security labels and similar obscurity.
[zfs-discuss] been busy working on ZFS stuff
In case anyone is bored and wants some ZFS reading, here are some links for you.

Comparison of ZFS vs. Linux RAID and LVM:
http://unixconsult.org/zfs_vs_lvm.html

ZFS ready for home use:
http://uadmin.blogspot.com/2006/05/why-zfs-for-home.html

Moving ZFS filesystems using the zfs backup/restore commands:
http://uadmin.blogspot.com/2006/05/moving-zfs-pools.html

James Dickens
uadmin.blogspot.com