Re: [zfs-discuss] ZFS Mountroot and Bootroot Comparison
Please do share how you managed to have a separate ZFS /usr since b64; there are dependencies on /usr and they are not documented. -kv doesn't help either. I tried adding /usr/lib/libdisk* to a /usr/lib dir on the root partition and failed. Jurgen also pointed out that there are two related bugs already filed: Bug ID 6570056 Synopsis: /sbin/zpool should not link to files in /usr/lib http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6570056 Bug ID 6494840 Synopsis: libzfs should dlopen libiscsitgt rather than linking to it http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6494840 I can do a snapshot on bootroot too ... after I tried to do a rollback from failsafe I couldn't boot anymore, probably because there was no straightforward way to rebuild the boot archive. Regarding compression, if I am not mistaken, grub cannot access files that are compressed. Regards, K. On 05/10/2007, at 5:55 AM, Andre Wenas wrote: > Hi, > > Using bootroot I can do a separate /usr filesystem since b64. I can > also do snapshot, clone and compression. > > Rgds, > Andre W. > > Kugutsumen wrote: >> Lori Alt told me that mountroot was a temporary hack until grub >> could boot zfs natively. >> Since build 62, mountroot support was dropped and I am not >> convinced that this was the right decision. >> >> Let's compare the two: >> >> Mountroot: >> >> Pros: >> * can have root partition on raid-z: YES >> * can have root partition on zfs striping mirror: YES >> * can have usr partition on separate ZFS partition with build >> < 72 : YES >> * can snapshot and rollback root partition: YES >> * can use copies on root partition on a single root disk (e.g. >> a laptop ): YES >> * can use compression on root partition: YES >> Cons: >> * grub native support: NO (if you use raid-z or striping >> mirror, you will need to have a small UFS partition >> to bootstrap the system, but you can use a small usb stick >> for that purpose.) >> >> New and "improved" *sigh* bootroot scheme: >> >> Pros: >> * grub native support: YES >> Cons: >> * can have root partition on raid-z: NO >> * can have root partition on zfs striping mirror: NO >> * can use copies on root partition on a single root disk (e.g. >> a laptop ): NO >> * can have usr partition on separate ZFS partition with build >> < 72 : NO >> * can snapshot and rollback root partition: NO >> * can use compression on root partition: NO >> * No backward compatibility with zfs mountroot. >> >> Why did we completely drop support for the old mountroot approach >> which is so much more flexible? >> >> Kugutsumen >> >> ___ >> zfs-discuss mailing list >> zfs-discuss@opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >> > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
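For anyone who hits the same wall, a rough sketch of rebuilding the boot archive from failsafe after a ZFS root rollback (the pool, dataset and snapshot names here are invented for illustration, and the exact steps vary by build):

    # from the failsafe shell
    zpool import -R /a rootpool                # import the root pool under an alternate root
    zfs rollback rootpool/root@known-good      # roll the root filesystem back
    mount -F zfs rootpool/root /a              # mount it under /a if the import did not
    bootadm update-archive -R /a               # regenerate the boot archive for the rolled-back root
    reboot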
Re: [zfs-discuss] Direct I/O ability with zfs?
I've been thinking about this for a while, but Anton's analysis makes me think about it even more: We all love ZFS, right? It's futuristic in a bold new way, with many virtues; I won't preach to the choir. But making it all glue together requires some CPU/memory-intensive operations around checksum generation/validation, compression, encryption, data placement/component load balancing, etc. Processors have gotten really powerful, much more so than the relative disk I/O gains, which in all honesty is what makes ZFS possible. My question: Is anyone working on an offload engine for ZFS? I can envision a highly optimized, pipelined system, where writes and reads pass through checksum, compression, encryption ASICs that also locate data properly on disk. This could even be in the form of a PCIe SATA/SAS card with many ports, or different options. This would make direct I/O, or DMA I/O, possible again. The file system abstraction with ZFS is really too much and too important to ignore, and too hard to optimize with different load conditions (my rookie opinion), to expect any RDBMS app to have a clue what to do with it. I guess what I'm saying is the RDBMS app will know what blocks it needs, and wants to get them in and out speedy quick, but the mapping to disk is not linear with ZFS the way it is with other file systems. An offload engine could translate this instead. Just throwing this out there for the purpose of blue sky fluff. Jon Anton B. Rang wrote: 5) DMA straight from user buffer to disk avoiding a copy. This is what the "direct" in "direct i/o" has historically meant. :-) line has been that 5) won't help latency much and latency is where I think the game is currently played. Now the disconnect might be because people might feel that the game is not latency but CPU efficiency : "how many CPU cycles do I burn to get data from disk to user buffer". Actually, it's fewer CPU cycles in many cases than memory cycles. For many databases, most of the I/O is writes (reads wind up cached in memory). What's the cost of a write? With direct I/O: CPU writes to memory (spread out over many transactions), disk DMAs from memory. We write LPS (log page size) bytes of data from CPU to memory, we read LPS bytes from memory. On processors without a cache line zero, we probably read the LPS data from memory as part of the write. Total cost = W:LPS, R:2*LPS. Without direct I/O: The cost of getting the data into the user buffer remains the same (W:LPS, R:LPS). We copy the data from user buffer to system buffer (W:LPS, R:LPS). Then we push it out to disk. Total cost = W:2*LPS, R:3*LPS. We've nearly doubled the cost, not including any TLB effects. On a memory-bandwidth-starved system (which should be nearly all modern designs, especially with multi-threaded chips like Niagara), replacing buffered I/O with direct I/O should give you nearly a 2x improvement in log write bandwidth. That's without considering cache effects (which shouldn't be too significant, really, since LPS should be << the size of L2). How significant is this? We'd have to measure; and it will likely vary quite a lot depending on which database is used for testing. But note that, for ZFS, the win with direct I/O will be somewhat less. That's because you still need to read the page to compute its checksum. So for direct I/O with ZFS (with checksums enabled), the cost is W:LPS, R:2*LPS. Is saving one page of writes enough to make a difference? Possibly not.
Anton This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Jonathan Loran - IT Manager - Space Sciences Laboratory, UC Berkeley - (510) 643-5146 [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zfs + iscsi target + vmware esx server
I'm posting here as this seems to be a zfs issue. We also have an open ticket with Sun support and I've heard another large Sun customer is also reporting this as an issue. Basic problem: Create a zfs file system and set shareiscsi to on. On a VMware ESX server, discover that iscsi target. It shows up as 249 luns. When attempting to then add the storage, the ESX server eventually times out; if you view it from the command line you see it checking each lun and crashing out before it gets to 249. Test environment: Sun iSCSI target host: Compaq PC with 2 80GB SATA drives, 1GB RAM, 3.2GHz CPU running Solaris 10 x86 update 4 (08/07). The second drive is set up as a storage pool with 1 filesystem created with shareiscsi on. ESX server: Compaq PC with SCSI HD, 1GB RAM, 3.2GHz CPU running VMware ESX Server 3.0.2. Other test scenarios: We also created an iscsi target via iscsitadm using a spare slice on the primary disk. The ESX server sees this target just fine with a single lun as expected. Trying to get more creative, I did an iscsitadm modify admin and set the path to my zfs filesystem. I then used iscsitadm to create a new target that was a file (which would get created on the zfs filesystem). However in ESX server I see the same results with 249 luns. The one difference this time is the first lun is the size I created the target as and the other 248 are the size of the zfs filesystem. So if zfs is involved it screws up; if it's not, it's fine. I do not yet have another Solaris 10 update 4 system to test as an initiator, but my update 3 system sees it just fine in all test scenarios. It seems to be an issue with how the iscsi target is being broadcast when zfs is involved that the Solaris initiator doesn't seem to mind but ESX server sure does. Since it works fine when I use a regular disk slice I think it's something to do with zfs, but I wouldn't rule out an issue with ESX completely yet. Any help is greatly appreciated. We are looking to roll out multiple Thumpers and VMware servers using the Thumpers as their backend storage via iscsi (or nfs if we have to, but would rather go iscsi). Thanks! Adam This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
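For reference, a minimal sketch of the ZFS side of this setup (pool, volume name and size are placeholders, not the poster's configuration; shareiscsi is normally applied to an emulated volume/zvol rather than a plain filesystem):

    zpool create tank c1t1d0
    zfs create -V 50g tank/esxvol          # a zvol to export as an iSCSI LUN
    zfs set shareiscsi=on tank/esxvol      # let ZFS create and share the target
    iscsitadm list target -v               # check how the target and its LUNs are advertised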
Re: [zfs-discuss] ZFS for OSX - it'll be in there.
Dale Ghent wrote: > ...and eventually in a read-write capacity: > > http://www.macrumors.com/2007/10/04/apple-seeds-zfs-read-write- > developer-preview-1-1-for-leopard/ > > Apple has seeded version 1.1 of ZFS (Zettabyte File System) for Mac > OS X to Developers this week. The preview updates a previous build > released on June 26, 2007. > Y! Finally my USB Thumb Drives will work on my MacBook! :) I wonder if it'll automatically mount the Zpool on my iPod when I sync it. benr. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS for OSX - it'll be in there.
...and eventually in a read-write capacity: http://www.macrumors.com/2007/10/04/apple-seeds-zfs-read-write- developer-preview-1-1-for-leopard/ Apple has seeded version 1.1 of ZFS (Zettabyte File System) for Mac OS X to Developers this week. The preview updates a previous build released on June 26, 2007. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Direct I/O ability with zfs?
> 5) DMA straight from user buffer to disk avoiding a copy. This is what the "direct" in "direct i/o" has historically meant. :-) > line has been that 5) won't help latency much and > latency is where I think the game is currently played. Now the > disconnect might be because people might feel that the game > is not latency but CPU efficiency : "how many CPU cycles do I > burn to get data from disk to user buffer". Actually, it's fewer CPU cycles in many cases than memory cycles. For many databases, most of the I/O is writes (reads wind up cached in memory). What's the cost of a write? With direct I/O: CPU writes to memory (spread out over many transactions), disk DMAs from memory. We write LPS (log page size) bytes of data from CPU to memory, we read LPS bytes from memory. On processors without a cache line zero, we probably read the LPS data from memory as part of the write. Total cost = W:LPS, R:2*LPS. Without direct I/O: The cost of getting the data into the user buffer remains the same (W:LPS, R:LPS). We copy the data from user buffer to system buffer (W:LPS, R:LPS). Then we push it out to disk. Total cost = W:2*LPS, R:3*LPS. We've nearly doubled the cost, not including any TLB effects. On a memory-bandwidth-starved system (which should be nearly all modern designs, especially with multi-threaded chips like Niagara), replacing buffered I/O with direct I/O should give you nearly a 2x improvement in log write bandwidth. That's without considering cache effects (which shouldn't be too significant, really, since LPS should be << the size of L2). How significant is this? We'd have to measure; and it will likely vary quite a lot depending on which database is used for testing. But note that, for ZFS, the win with direct I/O will be somewhat less. That's because you still need to read the page to compute its checksum. So for direct I/O with ZFS (with checksums enabled), the cost is W:LPS, R:2*LPS. Is saving one page of writes enough to make a difference? Possibly not. Anton This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
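To put rough numbers on the model above (a worked restatement, assuming a hypothetical 8 KB log page, LPS = 8 KB): with direct I/O, memory traffic per log write is W:8 KB + R:16 KB = 24 KB; without it, W:16 KB + R:24 KB = 40 KB, i.e. roughly 1.7x more, before any TLB or cache effects -- which is the near-doubling described above.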
Re: [zfs-discuss] O.T. "patches" for OpenSolaris
On 30/09/2007, William Papolis <[EMAIL PROTECTED]> wrote: > Henk, > > By upgrading do you mean, rebooting and installing Open Solaris from DVD or > Network? > > Like, no Patch Manager install some quick patches and updates and a quick > reboot, right? You can live upgrade and then do a quick reboot: http://number9.hellooperator.net/articles/2007/08/08/solaris-laptop-live-upgrade -- Rasputin :: Jack of All Trades - Master of Nuns http://number9.hellooperator.net/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
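For anyone who hasn't done it before, the live upgrade flow in that article boils down to something like the following (BE names and the media path are placeholders; on a UFS-only layout lucreate will also need -m to name a target slice):

    lucreate -n newBE                         # clone the running boot environment
    luupgrade -u -n newBE -s /cdrom/cdrom0    # upgrade the inactive copy from the new media
    luactivate newBE                          # make it the default for the next boot
    init 6                                    # the quick reboot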
Re: [zfs-discuss] ZFS Mountroot and Bootroot Comparison
Hi, Using bootroot I can do a separate /usr filesystem since b64. I can also do snapshot, clone and compression. Rgds, Andre W. Kugutsumen wrote: > Lori Alt told me that mountroot was a temporary hack until grub > could boot zfs natively. > Since build 62, mountroot support was dropped and I am not convinced > that this was the right decision. > > Let's compare the two: > > Mountroot: > > Pros: > * can have root partition on raid-z: YES > * can have root partition on zfs striping mirror: YES > * can have usr partition on separate ZFS partition with build < > 72 : YES > * can snapshot and rollback root partition: YES > * can use copies on root partition on a single root disk (e.g. a > laptop ): YES > * can use compression on root partition: YES > Cons: > * grub native support: NO (if you use raid-z or striping mirror, > you will need to have a small UFS partition > to bootstrap the system, but you can use a small usb stick for > that purpose.) > > New and "improved" *sigh* bootroot scheme: > > Pros: > * grub native support: YES > Cons: > * can have root partition on raid-z: NO > * can have root partition on zfs striping mirror: NO > * can use copies on root partition on a single root disk (e.g. a > laptop ): NO > * can have usr partition on separate ZFS partition with build < > 72 : NO > * can snapshot and rollback root partition: NO > * can use compression on root partition: NO > * No backward compatibility with zfs mountroot. > > Why did we completely drop support for the old mountroot approach > which is so much more flexible? > > Kugutsumen > > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?
Awesome. Thanks, Eric. :) This type of feature / fix is quite important to a number of the guys in our local OSUG. In particular, they are adamant that they cannot use ZFS in production until it stops panicking the whole box for isolated filesystem / zpool failures. This will be a big step. :) Cheers. Nathan. Eric Schrock wrote: > On Fri, Oct 05, 2007 at 08:20:13AM +1000, Nathan Kroenert wrote: >> Erik - >> >> Thanks for that, but I know the pool is corrupted - That was kind of the >> point of the exercise. >> >> The bug (at least to me) is ZFS panicking Solaris just trying to import >> the dud pool. >> >> But, maybe I'm missing your point? >> >> Nathan. > > This is a variation on the "read error while writing" problem. It is a > known issue and a generic solution (to handle any kind of non-replicated > writes failing) is in the works (see PSARC 2007/567). > > - Eric > > -- > Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?
On Fri, Oct 05, 2007 at 08:20:13AM +1000, Nathan Kroenert wrote: > Erik - > > Thanks for that, but I know the pool is corrupted - That was kind of the > point of the exercise. > > The bug (at least to me) is ZFS panicking Solaris just trying to import > the dud pool. > > But, maybe I'm missing your point? > > Nathan. This is a variation on the "read error while writing" problem. It is a known issue and a generic solution (to handle any kind of non-replicated writes failing) is in the works (see PSARC 2007/567). - Eric -- Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?
Erik - Thanks for that, but I know the pool is corrupted - That was kind of the point of the exercise. The bug (at least to me) is ZFS panicking Solaris just trying to import the dud pool. But, maybe I'm missing your point? Nathan. eric kustarz wrote: >> >> Client A >> - import pool make couple-o-changes >> >> Client B >> - import pool -f (heh) >> >> Client A + B - With both mounting the same pool, touched a couple of >> files, and removed a couple of files from each client >> >> Client A + B - zpool export >> >> Client A - Attempted import and dropped the panic. >> > > ZFS is not a clustered file system. It cannot handle multiple readers > (or multiple writers). By importing the pool on multiple machines, you > have corrupted the pool. > > You purposely did that by adding the '-f' option to 'zpool import'. > Without the '-f' option ZFS would have told you that its already > imported on another machine (A). > > There is no bug here (besides admin error :) ). > > eric ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] chgrp -R hangs all writes to pool
On Mon, Jul 16, 2007 at 09:36:06PM -0700, Stuart Anderson wrote: > Running Solaris 10 Update 3 on an X4500 I have found that it is possible > to reproducibly block all writes to a ZFS pool by running "chgrp -R" > on any large filesystem in that pool. As can be seen in the zpool > iostat output below, after about 10-sec of running the chgrp command all > writes to the pool stop, and the pool starts exclusively running a slow > background task of 1kB reads. > > At this point the chgrp -R command is not killable via root kill -9, > and in fact even the command "halt -d" does not do anything. > For posterity, this appears to have been fixed in S10U4; at least I am unable to reproduce the problem that was easy to trigger with S10U3. Thanks. -- Stuart Anderson [EMAIL PROTECTED] http://www.ligo.caltech.edu/~anderson ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
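For anyone who wants to check their own build, a sketch of the reproduction described above (pool, group and path names are placeholders):

    zpool iostat tank 1 &          # watch per-second pool I/O
    chgrp -R staff /tank/bigfs     # recursive group change across a large filesystem

On S10U3 the reported symptom was that pool writes stopped after roughly 10 seconds, leaving only slow 1 kB reads; on S10U4 the poster could no longer trigger it.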
Re: [zfs-discuss] Do we have a successful installation method for patch 120011-14?
It was 120272-12 that caused the snmpd.conf problem and was withdrawn. 120272-13 has replaced it and has that bug fixed. 122660-10 does not have any issues that I am aware of. It is only obsolete, not withdrawn. Additionally, it appears that the circular patch dependency is by design if you read this BugID: 6574472 "U4 feature Ku's need to hard require a patch that enforces zoneadmd patch is installed". So hacking the prepatch script for 125547-02/125548-02 to bypass the dependency check (as others have recommended) is a BAD THING and you may wind up with a broken system. -Brian Rob Windsor wrote: > Yeah, the only thing wrong with that patch is that it eats > /etc/sma/snmp/snmpd.conf > > All is not lost, your original is copied to > /etc/sma/snmp/snmpd.conf.save in the process. > > Rob++ > > Brian H. Nelson wrote: > >> Manually installing the obsolete patch 122660-10 has worked fine for me. >> Until sun fixes the patch dependencies, I think that is the easiest way. >> >> -Brian >> >> >> -- --- Brian H. Nelson Youngstown State University System Administrator Media and Academic Computing bnelson[at]cis.ysu.edu --- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Do we have a successful installation method for patch 120011-14?
Yeah, the only thing wrong with that patch is that it eats /etc/sma/snmp/snmpd.conf All is not lost, your original is copied to /etc/sma/snmp/snmpd.conf.save in the process. Rob++ Brian H. Nelson wrote: > Manually installing the obsolete patch 122660-10 has worked fine for me. > Until sun fixes the patch dependencies, I think that is the easiest way. > > -Brian > > Bruce Shaw wrote: >> It fails on my machine because it requires a patch that's deprecated. >> >> This email and any files transmitted with it are confidential and intended >> solely for the use of the individual or entity to whom they are addressed. >> If you have received this email in error please notify the system manager. >> This message contains confidential information and is intended only for the >> individual named. If you are not the named addressee you should not >> disseminate, distribute or copy this e-mail. >> >> ___ >> zfs-discuss mailing list >> zfs-discuss@opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >> > -- Internet: [EMAIL PROTECTED] __o Life: [EMAIL PROTECTED]_`\<,_ (_)/ (_) "They couldn't hit an elephant at this distance." -- Major General John Sedgwick ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Convert Raid-Z to Mirror
Update to this. Before destroying the original pool the first time, offline the disk you plan on re-using in the new pool. Otherwise when you destroy the original pool for the second time it causes issues with the new pool. In fact, if you attempt to destroy the new pool immediately after destroying the original pool, the system will panic. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
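In command form, the ordering being recommended looks roughly like this (pool and device names are invented; this is only a sketch of the report above, not a tested procedure):

    zpool offline oldpool c1t2d0              # first, offline the disk you intend to reuse
    zpool destroy oldpool                     # then destroy the original raid-z pool
    zpool create newpool mirror c1t2d0 c1t3d0 # build the new mirror with the freed disk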
Re: [zfs-discuss] Do we have a successful installation method for patch 120011-14?
Manually installing the obsolete patch 122660-10 has worked fine for me. Until sun fixes the patch dependencies, I think that is the easiest way. -Brian Bruce Shaw wrote: > It fails on my machine because it requires a patch that's deprecated. > > This email and any files transmitted with it are confidential and intended > solely for the use of the individual or entity to whom they are addressed. If > you have received this email in error please notify the system manager. This > message contains confidential information and is intended only for the > individual named. If you are not the named addressee you should not > disseminate, distribute or copy this e-mail. > > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > -- --- Brian H. Nelson Youngstown State University System Administrator Media and Academic Computing bnelson[at]cis.ysu.edu --- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
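Concretely, the workaround amounts to something like the following (the patch spool path is a placeholder; check the current patch READMEs before relying on this):

    patchadd /var/spool/patch/122660-10   # the obsolete-but-required dependency first
    patchadd /var/spool/patch/120011-14   # then the kernel update that wants it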
Re: [zfs-discuss] Direct I/O ability with zfs?
I'd like to second a couple of comments made recently: * If they don't regularly do so, I too encourage the ZFS, Solaris performance, and Sun Oracle support teams to sit down and talk about the utility of Direct I/O for databases. * I too suspect that absent Direct I/O (or some ringing endorsement from Oracle about how ZFS doesn't need Direct I/O), there will be a drain of customer escalations regarding the lack-- plus FUD and other sales inhibitors. While I realize that Sun has not published a TPC-C result since 2001 and offers a different value proposition to customers, performance does matter and for some cases Direct I/O can contribute to that. Historically, every TPC-C database benchmark run can be converted from being I/O bound to being CPU bound by adding enough disk spindles and enough main memory. In that context, saving the CPU cycles (and cache misses) from a copy is important. Another historical trend was that for performance, portability across different operating systems, and perhaps just because they could, databases tended to use as few OS capabilities as possible and to do their own resource management. So for instance databases were often benchmarked using raw devices. Customers on the other hand preferred the manageability of filesystems and tended to deploy there. In that context, Direct I/O is an attempt to get the best of both worlds. Finally, besides UFS Direct I/O on Solaris, other filesystems including VxFS also have various forms of Direct I/O-- either separate APIs or mount options that bypass the cache on large writes, etc. Understanding those benefits, both real and advertised, helps understand the opportunities and shortfalls for ZFS. It may be that this is not the most important thing for ZFS performance or capability right now-- measurement in targeted configurations and workloads is the only way to tell-- but I'd be highly surprised if there isn't something (bypass cache on really large writes?) that can be learned from experiences with Direct I/O. Eric (Hamilton) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
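For comparison, the UFS behaviour being referred to is a simple switch, either per mount point or per file via directio(3C); a sketch with a placeholder device and mount point:

    mount -F ufs -o forcedirectio /dev/dsk/c0t0d0s6 /oradata

ZFS has no equivalent knob today, which is exactly the gap this thread is about.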
Re: [zfs-discuss] Direct I/O ability with zfs?
On Thu, Oct 04, 2007 at 06:59:56PM +0200, Roch - PAE wrote: > Nicolas Williams writes: > > On Thu, Oct 04, 2007 at 03:49:12PM +0200, Roch - PAE wrote: > > > So the DB memory pages should not be _contended_ for. > > > > What if your executable text, and pretty much everything lives on ZFS? > > You don't want to contend for the memory caching those things either. > > It's not just the DB's memory you don't want to contend for. > > On the read side, > > We're talking here about 1000 disks each running 35 > concurrent I/Os of 8K, so a footprint of 250MB, to stage a > ton of work. I'm not sure what you mean, but extra copies and memory just to stage the I/Os is not the same as the systemic memory pressure issue. Now, I'm _speculating_ as to what the real problem is, but it seems very likely that putting things in the cache that needn't be there would push out things that should be there, and since restoring those things to the cache later would cost I/Os... ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Direct I/O ability with zfs?
Nicolas Williams writes: > On Wed, Oct 03, 2007 at 04:31:01PM +0200, Roch - PAE wrote: > > > It does, which leads to the core problem. Why do we have to store the > > > exact same data twice in memory (i.e., once in the ARC, and once in > > > the shared memory segment that Oracle uses)? > > > > We do not retain 2 copies of the same data. > > > > If the DB cache is made large enough to consume most of memory, > > the ZFS copy will quickly be evicted to stage other I/Os on > > their way to the DB cache. > > > > What problem does that pose ? > > Other things deserving of staying in the cache get pushed out by things > that don't deserve being in the cache. Thus systemic memory pressure > (e.g., more on-demand paging of text). > > Nico > -- I agree. That's why I submitted both of these. 6429855 Need way to tell ZFS that caching is a lost cause 6488341 ZFS should avoiding growing the ARC into trouble -r ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
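While those RFEs are open, the usual stopgap is to cap the ARC by hand in /etc/system (the value below is only an example, and the tunable's name and behaviour depend on the build):

    * /etc/system: limit the ZFS ARC to ~1 GB (example value); takes effect after a reboot
    set zfs:zfs_arc_max = 0x40000000

Note this only bounds ARC growth; it does not make ZFS give memory back gracefully, which is what 6488341 is really asking for.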
[zfs-discuss] Do we have a successful installation method for patch 120011-14?
It fails on my machine because it requires a patch that's deprecated. This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS Mountroot and Bootroot Comparison
Remember that you have to maintain an entirely separate slice with yet another boot environment. This causes huge amounts of complexity in terms of live upgrade, multiple BE management, etc. The old mountroot solution was useful for mounting ZFS root, but completely unmaintainable from an installation and upgrade perspective. It was dropped because we could not possibly develop installation, packaging, and upgrade software that would work across multiple BEs under such a scheme. - Eric On Thu, Oct 04, 2007 at 11:27:46PM +0700, Kugutsumen wrote: > Lori Alt told me that mountroot was a temporary hack until grub > could boot zfs natively. > Since build 62, mountroot support was dropped and I am not convinced > that this was the right decision. > > Let's compare the two: > > Mountroot: > > Pros: > * can have root partition on raid-z: YES > * can have root partition on zfs striping mirror: YES > * can have usr partition on separate ZFS partition with build < > 72 : YES > * can snapshot and rollback root partition: YES > * can use copies on root partition on a single root disk (e.g. a > laptop ): YES > * can use compression on root partition: YES > Cons: > * grub native support: NO (if you use raid-z or striping mirror, > you will need to have a small UFS partition > to bootstrap the system, but you can use a small usb stick for > that purpose.) > > New and "improved" *sigh* bootroot scheme: > > Pros: > * grub native support: YES > Cons: > * can have root partition on raid-z: NO > * can have root partition on zfs striping mirror: NO > * can use copies on root partition on a single root disk (e.g. a > laptop ): NO > * can have usr partition on separate ZFS partition with build < > 72 : NO > * can snapshot and rollback root partition: NO > * can use compression on root partition: NO > Why did we completely drop support for the old mountroot approach > which is so much more flexible? -- Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Direct I/O ability with zfs?
Nicolas Williams writes: > On Thu, Oct 04, 2007 at 03:49:12PM +0200, Roch - PAE wrote: > > ...memory utilisation... OK so we should implement the 'lost cause' rfe. > > > > In all cases, ZFS must not steal pages from other memory consumers : > > > > 6488341 ZFS should avoiding growing the ARC into trouble > > > > So the DB memory pages should not be _contended_ for. > > What if your executable text, and pretty much everything lives on ZFS? > You don't want to contend for the memory caching those things either. > It's not just the DB's memory you don't want to contend for. On the read side, We're talking here about 1000 disks each running 35 concurrent I/Os of 8K, so a footprint of 250MB, to stage a ton of work. On the write side we do have to play with the transaction group so that will be 5-10 seconds worth of synchronous write activity. But how much memory does a 1000-disk server have? -r ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] About bug 6486493 (ZFS boot incompatible with
On Thu, Oct 04, 2007 at 05:22:58AM -0700, Ivan Wang wrote: > > This bug was rendered moot via 6528732 in build > > snv_68 (and s10_u5). We > > now store physical devices paths with the vnodes, so > > even though the > > SATA framework doesn't correctly support open by > > devid in early boot, we > > But if I read it right, there is still a problem in SATA framework (failing > ldi_open_by_devid,) right? > If this problem is framework-wide, it might just bite back some time in the > future. > Yes, there is still a bug in the SATA framework, in that ldi_open_by_devid() doesn't work early in boot. Opening by device path works so long as you don't recable your boot devices. If we had open by devid working in early boot, then this wouldn't be a problem. - Eric -- Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Direct I/O ability with zfs?
On Thu, Oct 04, 2007 at 03:49:12PM +0200, Roch - PAE wrote: > ...memory utilisation... OK so we should implement the 'lost cause' rfe. > > In all cases, ZFS must not steal pages from other memory consumers : > > 6488341 ZFS should avoiding growing the ARC into trouble > > So the DB memory pages should not be _contended_ for. What if your executable text, and pretty much everything lives on ZFS? You don't want to contend for the memory caching those things either. It's not just the DB's memory you don't want to contend for. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Direct I/O ability with zfs?
On Wed, Oct 03, 2007 at 04:31:01PM +0200, Roch - PAE wrote: > > It does, which leads to the core problem. Why do we have to store the > > exact same data twice in memory (i.e., once in the ARC, and once in > > the shared memory segment that Oracle uses)? > > We do not retain 2 copies of the same data. > > If the DB cache is made large enough to consume most of memory, > the ZFS copy will quickly be evicted to stage other I/Os on > their way to the DB cache. > > What problem does that pose ? Other things deserving of staying in the cache get pushed out by things that don't deserve being in the cache. Thus systemic memory pressure (e.g., more on-demand paging of text). Nico -- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Direct I/O ability with zfs?
eric kustarz writes: > > > > Anyhow, in the case of DBs, ARC indeed becomes a vestigial organ. I'm > > surprised that this is being met with skepticism considering that > > Oracle highly recommends direct IO be used, and, IIRC, Oracle > > performance was the main motivation for adding DIO to UFS back in > > Solaris 2.6. This isn't a problem with ZFS or any specific fs per se, > > it's the buffer caching they all employ. So I'm a big fan of seeing > > 6429855 come to fruition. > > The point is that directI/O typically means two things: > 1) concurrent I/O > 2) no caching at the file system > In my blog I also mention: 3) no readahead (but it can be viewed as an implicit consequence of 2) And someone chimed in with 4) ability to do I/O at sector granularity. I also think that for many, 2) is too weak a form of what they expect: 5) DMA straight from user buffer to disk avoiding a copy. So 1) concurrent I/O we have in ZFS. 2) No caching. We could do this by taking a directio hint and evicting the ARC buffer immediately after copyout to user space for reads, and after txg completion for writes. 3) No prefetching. We have 2 levels of prefetching. The low level was fixed recently and should not cause problems for DB loads. The high level still needs fixing on its own. Then we should take the same hint as 2) to disable it altogether. In the meantime we can tune our way into this mode. 4) Sector-sized I/O is really foreign to the ZFS design. 5) Zero copy & more CPU efficiency. This, I think, is where the debate is. My line has been that 5) won't help latency much and latency is where I think the game is currently played. Now the disconnect might be because people might feel that the game is not latency but CPU efficiency: "how many CPU cycles do I burn to get data from disk to user buffer". This is a valid point. Configurations with a very large number of disks can end up saturated by filesystem CPU utilisation. So I still think that the major area for ZFS perf gains is the latency front: block allocation (now much improved with the separate intent log), I/O scheduling, and other fixes to the threading & ARC behavior. But at some point we can turn our microscope on the CPU efficiency of the implementation. The copy will certainly be a big chunk of the CPU cost per I/O, but I would still like to gather that data. Also consider, 50 disks at 200 IOPS of 8K is 80 MB/sec. That means maybe 1/10th of a single CPU to be saved by avoiding just the copy. Probably not what people have in mind. How many CPUs do you have when attaching 1000 drives to a host running a 100TB database? That many drives will barely occupy 2 cores running the copies. People want performance and efficiency. Directio is just an overloaded name that delivered those gains to other filesystems. Right now, what I think is worth gathering is cycles spent in ZFS per read & write in a large DB environment where the DB holds 90% of memory. For comparison with another FS, we should disable checksum, file prefetching and vdev prefetching, cap the ARC, set atime off, and use an 8K recordsize. A breakdown and comparison of the CPU cost per layer will be quite interesting and will point to what needs work. Another interesting thing for me would be: what is your budget? "How many cycles per DB read and write are you willing to spend, and how did you come to that number?" But, as Eric says, let's develop 2) and I'll try in parallel to figure out the per-layer breakdown cost. -r > Most file systems (ufs, vxfs, etc.) don't do 1) or 2) without turning > on "directI/O". > > ZFS *does* 1.
It doesn't do 2 (currently). > > That is what we're trying to discuss here. > > Where does the win come from with "directI/O"? Is it 1), 2), or some > combination? If its a combination, what's the percentage of each > towards the win? > > We need to tease 1) and 2) apart to have a full understanding. I'm > not against adding 2) to ZFS but want more information. I suppose > i'll just prototype it and find out for myself. > > eric ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
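A sketch of the comparison setup Roch describes, for anyone who wants to try it (the dataset name is a placeholder; the prefetch tunable is build-dependent, and the ARC cap is the same /etc/system setting noted earlier in this thread):

    zfs set checksum=off tank/db       # disable checksums for the comparison only
    zfs set atime=off tank/db
    zfs set recordsize=8k tank/db      # match the DB block size
    * /etc/system: disable file-level prefetch (example tunable; name/behaviour varies by build)
    set zfs:zfs_prefetch_disable = 1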
[zfs-discuss] ZFS Mountroot and Bootroot Comparison
Lori Alt told me that mountroot was a temporary hack until grub could boot zfs natively. Since build 62, mountroot support was dropped and I am not convinced that this was the right decision. Let's compare the two: Mountroot: Pros: * can have root partition on raid-z: YES * can have root partition on zfs striping mirror: YES * can have usr partition on separate ZFS partition with build < 72 : YES * can snapshot and rollback root partition: YES * can use copies on root partition on a single root disk (e.g. a laptop ): YES * can use compression on root partition: YES Cons: * grub native support: NO (if you use raid-z or striping mirror, you will need to have a small UFS partition to bootstrap the system, but you can use a small usb stick for that purpose.) New and "improved" *sigh* bootroot scheme: Pros: * grub native support: YES Cons: * can have root partition on raid-z: NO * can have root partition on zfs striping mirror: NO * can use copies on root partition on a single root disk (e.g. a laptop ): NO * can have usr partition on separate ZFS partition with build < 72 : NO * can snapshot and rollback root partition: NO * can use compression on root partition: NO * No backward compatibility with zfs mountroot. Why did we completely drop support for the old mountroot approach which is so much more flexible? Kugutsumen ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
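For what it's worth, the two per-dataset settings behind the "copies" and compression bullets look like this (the pool/dataset name is hypothetical, and whether they can be applied to the root filesystem is exactly what differs between the two schemes):

    zfs set copies=2 rpool/root        # keep two copies of every block, even on a single disk
    zfs set compression=on rpool/root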
Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?
On Thu, Oct 04, 2007 at 08:36:10AM -0600, eric kustarz wrote: > > Client A > > - import pool make couple-o-changes > > > > Client B > > - import pool -f (heh) > > > > Client A + B - With both mounting the same pool, touched a couple of > > files, and removed a couple of files from each client > > > > Client A + B - zpool export > > > > Client A - Attempted import and dropped the panic. > > > > ZFS is not a clustered file system. It cannot handle multiple > readers (or multiple writers). By importing the pool on multiple > machines, you have corrupted the pool. Yes. > You purposely did that by adding the '-f' option to 'zpool import'. > Without the '-f' option ZFS would have told you that its already > imported on another machine (A). > > There is no bug here (besides admin error :) ). My reading is that the complaint is not about corrupting the pool. The complaint is that once a pool has become corrupted, it shouldn't cause a panic on import. It seems reasonable to detect this and fail the import instead. -- Darren Dunham [EMAIL PROTECTED] Senior Technical Consultant TAOS http://www.taos.com/ Got some Dr Pepper? San Francisco, CA bay area < This line left intentionally blank to confuse you. > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS Crypto Alpha Release
I'm pleased to announce that the ZFS Crypto project now has Alpha release binaries that you can download and try. Currently we only have x86/x64 binaries available; SPARC will be available shortly. Information on the Alpha release of ZFS Crypto and links for downloading the binaries is here: http://opensolaris.org/os/project/zfs-crypto/phase1/alpha/ Please pay particular attention to the important information at the top of the above page. One of the main purposes of this Alpha release is to get feedback so that we can complete the design and schedule our second design review and our PSARC commitment review. Note that the feature set is NOT committed at this time, and neither is the user interface. -- Darren J Moffat ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?
> > Client A > - import pool make couple-o-changes > > Client B > - import pool -f (heh) > > Client A + B - With both mounting the same pool, touched a couple of > files, and removed a couple of files from each client > > Client A + B - zpool export > > Client A - Attempted import and dropped the panic. > ZFS is not a clustered file system. It cannot handle multiple readers (or multiple writers). By importing the pool on multiple machines, you have corrupted the pool. You purposely did that by adding the '-f' option to 'zpool import'. Without the '-f' option ZFS would have told you that its already imported on another machine (A). There is no bug here (besides admin error :) ). eric ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
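The protection Eric describes is visible directly in the commands; a sketch with a placeholder pool name:

    zpool import            # list importable pools; a pool still active elsewhere is flagged
    zpool import tank       # refuses if the pool looks in use by another host
    zpool import -f tank    # overrides the check -- the step that corrupted the pool here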
Re: [zfs-discuss] Direct I/O ability with zfs?
Jim Mauro writes: > > > Where does the win come from with "directI/O"? Is it 1), 2), or some > > combination? If its a combination, what's the percentage of each > > towards the win? > > > That will vary based on workload (I know, you already knew that ... :^). > Decomposing the performance win between what is gained as a result of > single writer > lock breakup and no caching is something we can only guess at, because, > at least > for UFS, you can't do just one - it's all or nothing. > > We need to tease 1) and 2) apart to have a full understanding. > > We can't. We can only guess (for UFS). > > My opinion - it's a must-have for ZFS if we're going to get serious > attention > in the database space. I'll bet dollars-to-donuts that, over the next > several years, > we'll burn many tens-of-millions of dollars on customer support > escalations that > come down to memory utilization issues and contention between database > specific buffering and the ARC. This is entirely my opinion (not that of > Sun), ...memory utilisation... OK so we should implement the 'lost cause' rfe. In all cases, ZFS must not steal pages from other memory consumers : 6488341 ZFS should avoiding growing the ARC into trouble So the DB memory pages should not be _contended_ for. -r > and I've been wrong before. > > Thanks, > /jim > > > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Direct I/O ability with zfs?
> Where does the win come from with "directI/O"? Is it 1), 2), or some > combination? If its a combination, what's the percentage of each > towards the win? > That will vary based on workload (I know, you already knew that ... :^). Decomposing the performance win between what is gained as a result of single writer lock breakup and no caching is something we can only guess at, because, at least for UFS, you can't do just one - it's all or nothing. > We need to tease 1) and 2) apart to have a full understanding. We can't. We can only guess (for UFS). My opinion - it's a must-have for ZFS if we're going to get serious attention in the database space. I'll bet dollars-to-donuts that, over the next several years, we'll burn many tens-of-millions of dollars on customer support escalations that come down to memory utilization issues and contention between database specific buffering and the ARC. This is entirely my opinion (not that of Sun), and I've been wrong before. Thanks, /jim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] About bug 6486493 (ZFS boot incompatible with
> This bug was rendered moot via 6528732 in build > snv_68 (and s10_u5). We > now store physical devices paths with the vnodes, so > even though the > SATA framework doesn't correctly support open by > devid in early boot, we But if I read it right, there is still a problem in SATA framework (failing ldi_open_by_devid,) right? If this problem is framework-wide, it might just bite back some time in the future. Ivan. > can fallback to the device path just fine. ZFS root > works great on > thumper, which uses the marvell SATA driver. > > - Eric > > On Wed, Oct 03, 2007 at 08:10:16AM +, Marc Bevand > wrote: > > I would like to test ZFS boot on my home server, > but according to bug > > 6486493 ZFS boot cannot be used if the disks are > attached to a SATA > > controller handled by a driver using the new SATA > framework (which > > is my case: driver si3124). I have never heard of > someone having > > successfully used ZFS boot with the SATA framework, > so I assume this > > bug is real and everybody out there playing with > ZFS boot is doing so > > with PATA controllers, or SATA controllers > operating in compatibility > > mode, or SCSI controllers, right ? > > > > -marc > > > > ___ > > zfs-discuss mailing list > > zfs-discuss@opensolaris.org > > > http://mail.opensolaris.org/mailman/listinfo/zfs-discu > ss > > -- > Eric Schrock, Solaris Kernel Development > http://blogs.sun.com/eschrock > _ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discu > ss This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?
> Perhaps it's the same cause, I don't know... > > But I'm certainly not convinced that I'd be happy with a 25K, for > example, panicing just because I tried to import a dud pool... > > I'm ok(ish) with the panic on a failed write to a non-redundant storage. > I expect it by now... > I agree, forcing a panic seems to be pretty severe and may cause as much grief as it prevents. Why not just stop allowing I/O to the pool so the sys admin can gracefully shutdown the system? Applications would be disrupted but no more so than they would be disrupted during a panic. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zones on zfs
Hi I have a Netra T1 with 2 internal disks. I want to install Sol 10 8/07 and build 2 zones (one as an ftp server and one as an scp server) and would like the system mirrored. My thoughts are to use SVM to mirror the / partitions, then build a mirrored zfs pool using slice 5 on both disks (I know this isn't recommended but 2 disks is all I have). The zones will then be built on the zfs filesystem. Filesystem size used avail capacity Mounted on /dev/md/dsk/d2 5.8G 4.0G 1.7G 70% / zfspool 9.6G 220M 9.4G 3% /zfspool Does this sound feasible or are there any better ways of doing this? Thanks Neal This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
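A rough sketch of that layout in commands (device names and the zone name are placeholders, and the usual metadb/metaroot/vfstab steps for the root mirror are omitted):

    # SVM mirror for /
    metainit -f d12 1 1 c0t0d0s0
    metainit d22 1 1 c0t1d0s0
    metainit d2 -m d12
    metattach d2 d22
    # mirrored ZFS pool on slice 5 of each disk
    zpool create zfspool mirror c0t0d0s5 c0t1d0s5
    zfs create zfspool/zones
    # configure and install a zone on it
    zonecfg -z ftpzone 'create; set zonepath=/zfspool/zones/ftpzone; commit'
    zoneadm -z ftpzone install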
Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?
I think it's a little more sinister than that... I'm only just trying to import the pool. Not even yet doing any I/O to it... Perhaps it's the same cause, I don't know... But I'm certainly not convinced that I'd be happy with a 25K, for example, panicing just because I tried to import a dud pool... I'm ok(ish) with the panic on a failed write to a non-redundant storage. I expect it by now... Cheers! Nathan. Victor Engle wrote: > Wouldn't this be the known feature where a write error to zfs forces a panic? > > Vic > > > > On 10/4/07, Ben Rockwood <[EMAIL PROTECTED]> wrote: >> Dick Davies wrote: >>> On 04/10/2007, Nathan Kroenert <[EMAIL PROTECTED]> wrote: >>> >>> Client A - import pool make couple-o-changes Client B - import pool -f (heh) >>> Oct 4 15:03:12 fozzie ^Mpanic[cpu0]/thread=ff0002b51c80: Oct 4 15:03:12 fozzie genunix: [ID 603766 kern.notice] assertion failed: dmu_read(os, smo->smo_object, offset, size, entry_map) == 0 (0x5 == 0x0) , file: ../../common/fs/zfs/space_map.c, line: 339 Oct 4 15:03:12 fozzie unix: [ID 10 kern.notice] Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51160 genunix:assfail3+b9 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51200 zfs:space_map_load+2ef () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51240 zfs:metaslab_activate+66 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51300 zfs:metaslab_group_alloc+24e () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b513d0 zfs:metaslab_alloc_dva+192 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51470 zfs:metaslab_alloc+82 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b514c0 zfs:zio_dva_allocate+68 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b514e0 zfs:zio_next_stage+b3 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51510 zfs:zio_checksum_generate+6e () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51530 zfs:zio_next_stage+b3 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b515a0 zfs:zio_write_compress+239 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b515c0 zfs:zio_next_stage+b3 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51610 zfs:zio_wait_for_children+5d () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51630 zfs:zio_wait_children_ready+20 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51650 zfs:zio_next_stage_async+bb () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51670 zfs:zio_nowait+11 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51960 zfs:dbuf_sync_leaf+1ac () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b519a0 zfs:dbuf_sync_list+51 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51a10 zfs:dnode_sync+23b () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51a50 zfs:dmu_objset_sync_dnodes+55 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51ad0 zfs:dmu_objset_sync+13d () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51b40 zfs:dsl_pool_sync+199 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51bd0 zfs:spa_sync+1c5 () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51c60 zfs:txg_sync_thread+19a () Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51c70 unix:thread_start+8 () Oct 4 15:03:12 fozzie unix: [ID 10 kern.notice] >>> Is this a known issue, already fixed in a later build, or should I bug it? 
>>> It shouldn't panic the machine, no. I'd raise a bug. >>> >>> After spending a little time playing with iscsi, I have to say it's almost inevitable that someone is going to do this by accident and panic a big box for what I see as no good reason. (though I'm happy to be educated... ;) >>> You use ACLs and TPGT groups to ensure 2 hosts can't simultaneously >>> access the same LUN by accident. You'd have the same problem with >>> Fibre Channel SANs. >>> >> I ran into similar problems when replicating via AVS. >> >> benr. >> ___ >> zfs-discuss mailing list >> zfs-discuss@opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >> > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mai
Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?
Wouldn't this be the known feature where a write error to zfs forces a panic? Vic On 10/4/07, Ben Rockwood <[EMAIL PROTECTED]> wrote: > Dick Davies wrote: > > On 04/10/2007, Nathan Kroenert <[EMAIL PROTECTED]> wrote: > > > > > >> Client A > >> - import pool make couple-o-changes > >> > >> Client B > >> - import pool -f (heh) > >> > > > > > >> Oct 4 15:03:12 fozzie ^Mpanic[cpu0]/thread=ff0002b51c80: > >> Oct 4 15:03:12 fozzie genunix: [ID 603766 kern.notice] assertion > >> failed: dmu_read(os, smo->smo_object, offset, size, entry_map) == 0 (0x5 > >> == 0x0) > >> , file: ../../common/fs/zfs/space_map.c, line: 339 > >> Oct 4 15:03:12 fozzie unix: [ID 10 kern.notice] > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51160 > >> genunix:assfail3+b9 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51200 > >> zfs:space_map_load+2ef () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51240 > >> zfs:metaslab_activate+66 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51300 > >> zfs:metaslab_group_alloc+24e () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b513d0 > >> zfs:metaslab_alloc_dva+192 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51470 > >> zfs:metaslab_alloc+82 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b514c0 > >> zfs:zio_dva_allocate+68 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b514e0 > >> zfs:zio_next_stage+b3 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51510 > >> zfs:zio_checksum_generate+6e () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51530 > >> zfs:zio_next_stage+b3 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b515a0 > >> zfs:zio_write_compress+239 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b515c0 > >> zfs:zio_next_stage+b3 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51610 > >> zfs:zio_wait_for_children+5d () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51630 > >> zfs:zio_wait_children_ready+20 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51650 > >> zfs:zio_next_stage_async+bb () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51670 > >> zfs:zio_nowait+11 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51960 > >> zfs:dbuf_sync_leaf+1ac () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b519a0 > >> zfs:dbuf_sync_list+51 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51a10 > >> zfs:dnode_sync+23b () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51a50 > >> zfs:dmu_objset_sync_dnodes+55 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51ad0 > >> zfs:dmu_objset_sync+13d () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51b40 > >> zfs:dsl_pool_sync+199 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51bd0 > >> zfs:spa_sync+1c5 () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51c60 > >> zfs:txg_sync_thread+19a () > >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51c70 > >> unix:thread_start+8 () > >> Oct 4 15:03:12 fozzie unix: [ID 10 kern.notice] > >> > > > > > >> Is this a known issue, already fixed in a later build, or should I bug it? > >> > > > > It shouldn't panic the machine, no. I'd raise a bug. 
> > > > > >> After spending a little time playing with iscsi, I have to say it's > >> almost inevitable that someone is going to do this by accident and panic > >> a big box for what I see as no good reason. (though I'm happy to be > >> educated... ;) > >> > > > > You use ACLs and TPGT groups to ensure 2 hosts can't simultaneously > > access the same LUN by accident. You'd have the same problem with > > Fibre Channel SANs. > > > I ran into similar problems when replicating via AVS. > > benr. > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?
Dick Davies wrote: > On 04/10/2007, Nathan Kroenert <[EMAIL PROTECTED]> wrote: > > >> Client A >> - import pool make couple-o-changes >> >> Client B >> - import pool -f (heh) >> > > >> Oct 4 15:03:12 fozzie ^Mpanic[cpu0]/thread=ff0002b51c80: >> Oct 4 15:03:12 fozzie genunix: [ID 603766 kern.notice] assertion >> failed: dmu_read(os, smo->smo_object, offset, size, entry_map) == 0 (0x5 >> == 0x0) >> , file: ../../common/fs/zfs/space_map.c, line: 339 >> Oct 4 15:03:12 fozzie unix: [ID 10 kern.notice] >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51160 >> genunix:assfail3+b9 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51200 >> zfs:space_map_load+2ef () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51240 >> zfs:metaslab_activate+66 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51300 >> zfs:metaslab_group_alloc+24e () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b513d0 >> zfs:metaslab_alloc_dva+192 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51470 >> zfs:metaslab_alloc+82 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b514c0 >> zfs:zio_dva_allocate+68 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b514e0 >> zfs:zio_next_stage+b3 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51510 >> zfs:zio_checksum_generate+6e () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51530 >> zfs:zio_next_stage+b3 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b515a0 >> zfs:zio_write_compress+239 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b515c0 >> zfs:zio_next_stage+b3 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51610 >> zfs:zio_wait_for_children+5d () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51630 >> zfs:zio_wait_children_ready+20 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51650 >> zfs:zio_next_stage_async+bb () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51670 >> zfs:zio_nowait+11 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51960 >> zfs:dbuf_sync_leaf+1ac () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b519a0 >> zfs:dbuf_sync_list+51 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51a10 >> zfs:dnode_sync+23b () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51a50 >> zfs:dmu_objset_sync_dnodes+55 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51ad0 >> zfs:dmu_objset_sync+13d () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51b40 >> zfs:dsl_pool_sync+199 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51bd0 >> zfs:spa_sync+1c5 () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51c60 >> zfs:txg_sync_thread+19a () >> Oct 4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51c70 >> unix:thread_start+8 () >> Oct 4 15:03:12 fozzie unix: [ID 10 kern.notice] >> > > >> Is this a known issue, already fixed in a later build, or should I bug it? >> > > It shouldn't panic the machine, no. I'd raise a bug. > > >> After spending a little time playing with iscsi, I have to say it's >> almost inevitable that someone is going to do this by accident and panic >> a big box for what I see as no good reason. (though I'm happy to be >> educated... 
;) >> > > You use ACLs and TPGT groups to ensure 2 hosts can't simultaneously > access the same LUN by accident. You'd have the same problem with > Fibre Channel SANs. > I ran into similar problems when replicating via AVS. benr. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss