Re: [zfs-discuss] 3ware support
On Tuesday, 12 February 2008 at 07:22 +0100, Johan Kooijman wrote:

> Good morning all,

Hi,

> can anyone confirm that 3ware raid controllers are indeed not working
> under Solaris/OpenSolaris? I can't seem to find it in the HCL.

I do confirm they don't work.

> We're now using a 3Ware 9550SX as a S-ATA RAID controller. The original
> plan was to disable all its RAID functions and use just the S-ATA
> controller functionality for ZFS deployment. If indeed 3Ware isn't
> supported, I have to buy a new controller. Any specific controller/brand
> you can recommend for Solaris?

I use Areca cards, with the driver supplied by Areca (certified in the HCL).

Have a nice day,

--
Nicolas Szalay
Systems and network administrator
Re: [zfs-discuss] 3ware support
Jason J. W. Williams wrote:

> X4500 problems seconded. Still having issues with port resets due to the
> Marvell driver. Though they seem considerably more transient and less
> likely to lock up the entire system in the most recent (b72) OpenSolaris
> builds.

Build 72 is pretty old. The build date for that build was August 27, 2007.
It looks like build 75 should have been pretty good (October 8, 2007), but
for the absolute most up-to-date stuff you want build 84 (which will be in a
build on February 25, 2008). The source code changes are already visible in
OpenSolaris and the marvell88sx binary is likewise downloadable (I wish I
could provide the source, but Marvell says no).

Please try something more recent than over 5 months old.

Regards,
Lida

> -J
>
> On Feb 12, 2008 9:35 AM, Carson Gaspar [EMAIL PROTECTED] wrote:
> > Tim wrote:
> > > A much cheaper (and probably the BEST supported card) is the
> > > SuperMicro based on the Marvell chipset. This is the same chipset
> > > that is used in the Thumper X4500, so you know that the folks at Sun
> > > are doing their due diligence to make sure the drivers are solid.
> >
> > Except the drivers _aren't_ solid, at least in Solaris(tm). The
> > OpenSolaris drivers may have been fixed (I know a lot of work is going
> > into them, but I haven't tested them), but those fixes have not made it
> > back into the supported realm. So if you need to run a supported OS,
> > I'd skip the Marvell chips if possible, at least for now.
> >
> > --
> > Carson
Re: [zfs-discuss] 3ware support
X4500 problems seconded. Still having issues with port resets due to the
Marvell driver. Though they seem considerably more transient and less likely
to lock up the entire system in the most recent (b72) OpenSolaris builds.

-J

On Feb 12, 2008 9:35 AM, Carson Gaspar [EMAIL PROTECTED] wrote:

> Tim wrote:
> > A much cheaper (and probably the BEST supported card) is the SuperMicro
> > based on the Marvell chipset. This is the same chipset that is used in
> > the Thumper X4500, so you know that the folks at Sun are doing their
> > due diligence to make sure the drivers are solid.
>
> Except the drivers _aren't_ solid, at least in Solaris(tm). The
> OpenSolaris drivers may have been fixed (I know a lot of work is going
> into them, but I haven't tested them), but those fixes have not made it
> back into the supported realm. So if you need to run a supported OS, I'd
> skip the Marvell chips if possible, at least for now.
>
> --
> Carson
Re: [zfs-discuss] Lost intermediate snapshot; incremental backup still possible?
I think so. On your backup pool, roll back to the last snapshot that was
successfully received. Then you should be able to send an incremental
between that one and the present.

Jeff

On Thu, Feb 07, 2008 at 08:38:38AM -0800, Ian wrote:

> I keep my system synchronized to a USB disk from time to time. The script
> works by sending incremental snapshots to a pool on the USB disk, then
> deleting those snapshots from the source machine. A botched script ended
> up deleting a snapshot that was not successfully received on the USB
> disk. Now I've lost the ability to send incrementally, since the
> intermediate snapshot is lost. From what I gather, if I try to send a
> full snapshot, it will require deleting and replacing the dataset on the
> USB disk. Is there any way around this?
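[Editor's note: a minimal sketch of Jeff's suggestion, assuming hypothetical
dataset names tank/data (source) and backup/data (the USB pool), where
@snap1 is the last snapshot both sides still have in common:]

    # On the backup pool, discard anything newer than the last snapshot
    # that was received intact (-r also destroys any later snapshots there)
    zfs rollback -r backup/data@snap1

    # Take a fresh snapshot on the source, then send only the increment
    # between the shared snapshot and the new one
    zfs snapshot tank/data@snap2
    zfs send -i tank/data@snap1 tank/data@snap2 | zfs receive backup/data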
Re: [zfs-discuss] Need help with a dead disk
[EMAIL PROTECTED] said:

> One thought I had was to unconfigure the bad disk with cfgadm. Would that
> force the system back into the 'offline' response?

In my experience (X4100 internal drive), that will make ZFS stop trying to
use it. It's also a good idea to do this before you hot-unplug the bad drive
to replace it with a new one.

Regards,
Marion
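[Editor's note: a minimal sketch of the cfgadm step Marion describes; the
attachment-point name below is hypothetical and should be taken from your
own cfgadm listing:]

    # List attachment points and find the one backing the bad disk
    cfgadm -al

    # Unconfigure the failed drive (example ap_id) before hot-unplugging it
    cfgadm -c unconfigure c2::dsk/c2t2d0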
Re: [zfs-discuss] Computer usable output for zpool commands
On Feb 1, 2008, at 7:17 AM, Nicolas Dorfsman wrote:

> Hi,
>
> I wrote a Hobbit script around lunmap/hbamap commands to monitor SAN
> health. I'd like to add detail on what is being hosted by those LUNs.
> With SVM, metastat -p is helpful. With ZFS, the zpool status output is
> awful for scripts. Is there a utility somewhere to show zpool information
> in a scriptable format?

What exactly do you want to display? We have the '-H' option to 'zfs list'
and 'zfs get' for parsing. Feel free to experiment with the code to make
zpool output more scriptable:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/zpool/zpool_main.c

eric
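[Editor's note: a minimal sketch of the parsing-friendly '-H' output eric
mentions (tab-separated, no header row); the pool and dataset names are
hypothetical:]

    # One tab-separated line per dataset, no header
    zfs list -H -o name,used,avail,mountpoint

    # A single property value, convenient in shell scripts
    used=$(zfs get -H -o value used tank/home)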
Re: [zfs-discuss] Need help with a dead disk
Here's a bit more info. The drive appears to have failed at 22:19 EST, but
it wasn't until 1:30 EST the next day that the system finally decided that
it was bad. (Why?) Here's some relevant log output (with lots of repeated
'device not responding' errors removed). I don't know if it will be useful:

Feb 11 22:19:09 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 11 22:19:09 maxwell   SCSI transport failed: reason 'incomplete': retrying command
Feb 11 22:19:10 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 11 22:19:10 maxwell   disk not responding to selection
...
Feb 11 22:21:08 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED] (isp0):
Feb 11 22:21:08 maxwell   SCSI Cable/Connection problem.
Feb 11 22:21:08 maxwell scsi: [ID 107833 kern.notice]   Hardware/Firmware error.
Feb 11 22:21:08 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED] (isp0):
Feb 11 22:21:08 maxwell   Fatal error, resetting interface, flg 16
...
(Why did this take so long?)
Feb 12 01:30:05 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 12 01:30:05 maxwell   offline
...
Feb 12 01:30:22 maxwell fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-D3, TYPE: Fault, VER: 1, SEVERITY: Major
Feb 12 01:30:22 maxwell EVENT-TIME: Tue Feb 12 01:30:22 EST 2008
Feb 12 01:30:22 maxwell PLATFORM: SUNW,Ultra-250, CSN: -, HOSTNAME: maxwell
Feb 12 01:30:22 maxwell SOURCE: zfs-diagnosis, REV: 1.0
Feb 12 01:30:22 maxwell EVENT-ID: 7f48f376-2eb1-ccaf-afc5-e56f5bf4576f
Feb 12 01:30:22 maxwell DESC: A ZFS device failed. Refer to http://sun.com/msg/ZFS-8000-D3 for more information.
Feb 12 01:30:22 maxwell AUTO-RESPONSE: No automated response will occur.
Feb 12 01:30:22 maxwell IMPACT: Fault tolerance of the pool may be compromised.
Feb 12 01:30:22 maxwell REC-ACTION: Run 'zpool status -x' and replace the bad device.

One thought I had was to unconfigure the bad disk with cfgadm. Would that
force the system back into the 'offline' response?

Thanks,
-Brian

Brian H. Nelson wrote:

> Ok, I think I answered my own question. ZFS _didn't_ realize that the
> disk was bad/stale. I power-cycled the failed drive (external) to see if
> it would come back up and/or run diagnostics on it. As soon as I did
> that, ZFS put the disk ONLINE and started using it again! Observe:
>
> bash-3.00# zpool status
>   pool: pool1
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error. An
>         attempt was made to correct the error. Applications are
>         unaffected.
> action: Determine if the device needs to be replaced, and clear the
>         errors using 'zpool clear' or replace the device with
>         'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: none requested
> config:
>
>         NAME         STATE     READ WRITE CKSUM
>         pool1        ONLINE        0     0     0
>           raidz1     ONLINE        0     0     0
>             c0t9d0   ONLINE        0     0     0
>             c0t10d0  ONLINE        0     0     0
>             c0t11d0  ONLINE        0     0     0
>             c0t12d0  ONLINE        0     0     0
>             c2t0d0   ONLINE        0     0     0
>             c2t1d0   ONLINE        0     0     0
>             c2t2d0   ONLINE    2.11K 20.09     0
>
> errors: No known data errors
>
> Now I _really_ have a problem. I can't offline the disk myself:
>
> bash-3.00# zpool offline pool1 c2t2d0
> cannot offline c2t2d0: no valid replicas
>
> I don't understand why, as 'zpool status' says all the other drives are
> OK.
>
> What's worse, if I just power off the drive in question (trying to get
> back to where I started), the zpool hangs completely! I let it go for
> about 7 minutes thinking maybe there was some timeout, but still nothing.
> Any command that would access the zpool (including 'zpool status') hangs.
> The only way to fix it is to power the external disk back on, upon which
> everything starts working like nothing has happened. Nothing gets logged
> other than lots of these, only while the drive is powered off:
>
> Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
> Feb 12 11:49:32 maxwell   disk not responding to selection
> Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
> Feb 12 11:49:32 maxwell   offline or reservation conflict
> Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
[zfs-discuss] Need help with a dead disk (was: ZFS keeps trying to open a dead disk: lots of logging)
Ok, I think I answered my own question. ZFS _didn't_ realize that the disk
was bad/stale. I power-cycled the failed drive (external) to see if it would
come back up and/or run diagnostics on it. As soon as I did that, ZFS put
the disk ONLINE and started using it again! Observe:

bash-3.00# zpool status
  pool: pool1
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool1        ONLINE        0     0     0
          raidz1     ONLINE        0     0     0
            c0t9d0   ONLINE        0     0     0
            c0t10d0  ONLINE        0     0     0
            c0t11d0  ONLINE        0     0     0
            c0t12d0  ONLINE        0     0     0
            c2t0d0   ONLINE        0     0     0
            c2t1d0   ONLINE        0     0     0
            c2t2d0   ONLINE    2.11K 20.09     0

errors: No known data errors

Now I _really_ have a problem. I can't offline the disk myself:

bash-3.00# zpool offline pool1 c2t2d0
cannot offline c2t2d0: no valid replicas

I don't understand why, as 'zpool status' says all the other drives are OK.

What's worse, if I just power off the drive in question (trying to get back
to where I started), the zpool hangs completely! I let it go for about 7
minutes thinking maybe there was some timeout, but still nothing. Any
command that would access the zpool (including 'zpool status') hangs. The
only way to fix it is to power the external disk back on, upon which
everything starts working like nothing has happened. Nothing gets logged
other than lots of these, only while the drive is powered off:

Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 12 11:49:32 maxwell   disk not responding to selection
Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 12 11:49:32 maxwell   offline or reservation conflict
Feb 12 11:49:32 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 12 11:49:32 maxwell   i/o to invalid geometry

What's going on here? What can I do to make ZFS let go of the bad drive?
This is a production machine and I'm getting concerned. I _really_ don't
like the fact that ZFS is using a suspect drive, but I can't seem to make it
stop!

Thanks,
-Brian

Brian H. Nelson wrote:

> This is Solaris 10U3 w/127111-05. It appears that one of the disks in my
> zpool died yesterday. I got several SCSI errors, finally ending with
> 'device not responding to selection'. That seems to be all well and good.
> ZFS figured it out and the pool is degraded:
>
> maxwell /var/adm > zpool status
>   pool: pool1
>  state: DEGRADED
> status: One or more devices could not be opened. Sufficient replicas
>         exist for the pool to continue functioning in a degraded state.
> action: Attach the missing device and online it using 'zpool online'.
>    see: http://www.sun.com/msg/ZFS-8000-D3
>  scrub: none requested
> config:
>
>         NAME         STATE     READ WRITE CKSUM
>         pool1        DEGRADED      0     0     0
>           raidz1     DEGRADED      0     0     0
>             c0t9d0   ONLINE        0     0     0
>             c0t10d0  ONLINE        0     0     0
>             c0t11d0  ONLINE        0     0     0
>             c0t12d0  ONLINE        0     0     0
>             c2t0d0   ONLINE        0     0     0
>             c2t1d0   ONLINE        0     0     0
>             c2t2d0   UNAVAIL   1.88K 17.98     0  cannot open
>
> errors: No known data errors
>
> My question is: why does ZFS keep attempting to open the dead device? At
> least that's what I assume is happening. About every minute, I get eight
> of these entries in the messages log:
>
> Feb 12 10:15:54 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
> Feb 12 10:15:54 maxwell   disk not responding to selection
>
> I also got a number of these thrown in for good measure:
>
> Feb 11 22:21:58 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
> Feb 11 22:21:58 maxwell   SYNCHRONIZE CACHE command failed (5)
>
> Since the disk died last night (at about 11:20pm EST) I now have over 15K
> of similar entries in my log. What gives? Is this expected behavior? If
> ZFS knows the device is having problems, why does it not just leave it
Re: [zfs-discuss] 3ware support
Carson Gaspar wrote:

> Tim wrote:
> > A much cheaper (and probably the BEST supported card) is the SuperMicro
> > based on the Marvell chipset. This is the same chipset that is used in
> > the Thumper X4500, so you know that the folks at Sun are doing their
> > due diligence to make sure the drivers are solid.
>
> Except the drivers _aren't_ solid, at least in Solaris(tm). The
> OpenSolaris drivers may have been fixed (I know a lot of work is going
> into them, but I haven't tested them), but those fixes have not made it
> back into the supported realm. So if you need to run a supported OS, I'd
> skip the Marvell chips if possible, at least for now.

Does this mean that support still has not provided you with working code? I
am surprised if that is true. I do not know of any reason why this should be
the case. If you have not been given fixed code, I think you should escalate
up the support chain.

Further, the more customers push for getting the latest changes that are in
OpenSolaris into Solaris 10, the more likely it is that the individuals
responsible for evaluating what should be back-ported to Solaris 10 will
accept those changes.

Regards and sympathy,
Lida
Re: [zfs-discuss] 3ware support
Johan Kooijman wrote:

> Good morning all,
>
> can anyone confirm that 3ware raid controllers are indeed not working
> under Solaris/OpenSolaris? I can't seem to find it in the HCL.
>
> We're now using a 3Ware 9550SX as a S-ATA RAID controller. The original
> plan was to disable all its RAID functions and use just the S-ATA
> controller functionality for ZFS deployment. If indeed 3Ware isn't
> supported, I have to buy a new controller. Any specific controller/brand
> you can recommend for Solaris?

3ware cards do not work (as previously specified). Even in Linux/Windows
they're pretty flaky -- if you had Solaris drivers, you'd probably shoot
yourself in a month anyway.

I'm using the SuperMicro AOC-SAT2-MV8 at the recommendation of someone else
on this list. It's a JBOD card, which is perfect for ZFS. Also, you won't be
paying for RAID functionality that you're wanting to disable anyway.

Rob++

"They couldn't hit an elephant at this distance."
    -- Major General John Sedgwick
Re: [zfs-discuss] Need help with a dead disk
Hmm... this won't help you, but I think I'm having similar problems with an
iSCSI target device. If I offline the target, ZFS hangs for just over 5
minutes before it realises the device is unavailable, and even then it
doesn't report the problem until I repeat the zpool status command. What I
see here every time is:

- iSCSI device disconnected
- zpool status, and all file I/O, appears to hang for 5 mins
- zpool status then finishes (reporting pools ok), and I/O carries on
- immediately running zpool status again correctly shows the device as
  faulty and the pool as degraded

It seems either ZFS or the Solaris driver stack has a problem when devices
go offline. Both of us have seen zpool status hang for huge amounts of time
when there's a problem with a drive. Not something that inspires confidence
in a RAID system.
[zfs-discuss] Which DTrace provider to use
Hi List,

I'm wondering if one of you expert DTrace gurus can help me. I want to write
a DTrace script to print out a histogram of how long I/O requests sit in the
service queue. I can output the results with the quantize() action. I'm not
sure which provider I should be using for this. Does anyone know? I can
easily adapt one of the DTrace Toolkit routines for this if I can find the
provider.

I'll also throw out the problem I'm trying to meter. We are using ZFS on a
large SAN array (4TB). The pool on this array serves up a lot of users (250
home file systems/directories) and also /usr/local and other OTS software.
It works fine most of the time, but then gets overloaded during busy
periods. I'm going to reconfigure the array to help with this, but I sure
would love to have some metrics to know how big a difference my tweaks are
making.

Basically, the problem users experience when the load shoots up is huge
latencies. An ls on a non-cached directory, which usually is instantaneous,
will take 20, 30, 40 seconds or more. Then when the storage array catches
up, things get better. My clients are not happy campers.

I know, I know, I should have gone with a JBOD setup, but it's too late for
that in this iteration of this server. When we set this up I had the gear
already, and it's not in my budget to get new stuff right now.

Thanks for any help anyone can offer.

Jon
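[Editor's note: one candidate is the stable io provider -- a sketch, not a
tested answer to Jon's exact question. io:::start and io:::done fire when an
I/O is handed to the driver and when it completes, so the delta below is the
whole round trip (queue time plus service time), which bounds the time spent
sitting in the queue:]

#!/usr/sbin/dtrace -s
/* iolatency.d -- per-device histogram of I/O issue-to-completion time */

io:::start
{
        /* key the start timestamp by the buf pointer (arg0) */
        start[arg0] = timestamp;
}

io:::done
/start[arg0]/
{
        /* nanoseconds between issue and completion, bucketed per device */
        @lat[args[1]->dev_statname] = quantize(timestamp - start[arg0]);
        start[arg0] = 0;
}

[Run it during a busy period and compare the distributions before and after
the array tweaks.]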
Re: [zfs-discuss] 3ware support
Tim wrote:

> A much cheaper (and probably the BEST supported card) is the SuperMicro
> based on the Marvell chipset. This is the same chipset that is used in
> the Thumper X4500, so you know that the folks at Sun are doing their due
> diligence to make sure the drivers are solid.

Except the drivers _aren't_ solid, at least in Solaris(tm). The OpenSolaris
drivers may have been fixed (I know a lot of work is going into them, but I
haven't tested them), but those fixes have not made it back into the
supported realm. So if you need to run a supported OS, I'd skip the Marvell
chips if possible, at least for now.

--
Carson
[zfs-discuss] ZFS keeps trying to open a dead disk: lots of logging
This is Solaris 10U3 w/127111-05. It appears that one of the disks in my
zpool died yesterday. I got several SCSI errors, finally ending with 'device
not responding to selection'. That seems to be all well and good. ZFS
figured it out and the pool is degraded:

maxwell /var/adm > zpool status
  pool: pool1
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist
        for the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        pool1        DEGRADED      0     0     0
          raidz1     DEGRADED      0     0     0
            c0t9d0   ONLINE        0     0     0
            c0t10d0  ONLINE        0     0     0
            c0t11d0  ONLINE        0     0     0
            c0t12d0  ONLINE        0     0     0
            c2t0d0   ONLINE        0     0     0
            c2t1d0   ONLINE        0     0     0
            c2t2d0   UNAVAIL   1.88K 17.98     0  cannot open

errors: No known data errors

My question is: why does ZFS keep attempting to open the dead device? At
least that's what I assume is happening. About every minute, I get eight of
these entries in the messages log:

Feb 12 10:15:54 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 12 10:15:54 maxwell   disk not responding to selection

I also got a number of these thrown in for good measure:

Feb 11 22:21:58 maxwell scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],4000/[EMAIL PROTECTED]/SUNW,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd32):
Feb 11 22:21:58 maxwell   SYNCHRONIZE CACHE command failed (5)

Since the disk died last night (at about 11:20pm EST) I now have over 15K of
similar entries in my log. What gives? Is this expected behavior? If ZFS
knows the device is having problems, why does it not just leave it alone and
wait for user intervention?

Also, I noticed that the 'action' says to attach the device and 'zpool
online' it. Am I correct in assuming that a 'zpool replace' is what would
really be needed, as the data on the disk will be outdated?

Thanks,
-Brian

--
---
Brian H. Nelson          Youngstown State University
System Administrator     Media and Academic Computing
bnelson[at]cis.ysu.edu
---
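[Editor's note: a minimal sketch of the replace step Brian is asking about,
not confirmation that it is the right one for his situation; pool and device
names are taken from his own output, and the one-argument form assumes the
replacement drive appears at the same target:]

    # Swap in the new drive, then tell ZFS to resilver onto it.
    # With a single device argument, zpool replace reuses the old name.
    zpool replace pool1 c2t2d0

    # Watch the resilver progress
    zpool status pool1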
Re: [zfs-discuss] scrub halts
Will Murnane wrote:

> On Feb 12, 2008 4:45 AM, Lida Horn [EMAIL PROTECTED] wrote:
> > The latest changes to the sata and marvell88sx modules have been put
> > back to Solaris Nevada and should be available in the next build
> > (build 84). Hopefully, those of you who use it will find the changes
> > helpful.
>
> I have indeed found it beneficial. I installed the new drivers on two
> machines, both of which were intermittently giving errors about device
> resets. One card did this so often that I believed the card was faulty
> and I would have to replace either the card or the motherboard.

I'm glad you find the new modules useful and am pleased with your results.
One thing of which I would like you to be aware is that some of what was
done was to suppress the messages. In other words, some of what was
happening before is still happening, just silently.

> Since installing the new drivers I've had no issues whatsoever with
> drives on either box. I ran zpool scrubs continuously on the flaky box,
> replaced a disk with another one, and copied data about in an attempt to
> replicate the bus errors I had previously seen, to no avail. The other
> box has been similarly stable, as far as I can tell; I see no messages
> in the logs and the users haven't complained when I asked them.

No issues whatsoever -- wonderful words to hear!

> Thank you for the work you've put into improving the state of these
> drivers; I meant to email you earlier this week and mention the great
> strides they have made, but other things took precedence. That, to my
> mind, is the primary evolution these drivers have made: I don't have to
> worry about my HBAs any more.

I appreciate your taking the time to post and hope you have no further
issues with the driver.

Thank you,
Lida

> Thanks!
> Will
Re: [zfs-discuss] 3ware support
On 2/12/08, Johan Kooijman [EMAIL PROTECTED] wrote:

> Good morning all,
>
> can anyone confirm that 3ware raid controllers are indeed not working
> under Solaris/OpenSolaris? I can't seem to find it in the HCL.
>
> We're now using a 3Ware 9550SX as a S-ATA RAID controller. The original
> plan was to disable all its RAID functions and use just the S-ATA
> controller functionality for ZFS deployment. If indeed 3Ware isn't
> supported, I have to buy a new controller. Any specific controller/brand
> you can recommend for Solaris?
>
> --
> Met vriendelijke groeten / With kind regards,
>
> Johan Kooijman
>
> T +31(0) 6 43 44 45 27
> F +31(0) 76 201 1179
> E [EMAIL PROTECTED]

Johan,

A much cheaper (and probably the BEST supported card) is the SuperMicro
based on the Marvell chipset. This is the same chipset that is used in the
Thumper X4500, so you know that the folks at Sun are doing their due
diligence to make sure the drivers are solid. It's also much cheaper than
almost all RAID-based alternatives to boot. If you aren't using the RAID
functionality, don't waste your money on a RAID card :)

http://www.supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm

Here's where I purchased mine, but I'm guessing you are not in the US and
they don't ship to your country of origin:

http://www.ewiz.com/detail.php?p=AOC-SAT2MV&c=fr&pid=84b59337aa4414aa488fdf95dfd0de1a1e2a21528d6d2fbf89732c9ed77b72a4
[zfs-discuss] We can't import pool zfs faulted
Yesterday we needed to stop our NFS file server. After the restart, one ZFS
pool was marked as faulted. We exported the faulted pool and tried to import
it (even with option -f), but it can't be imported; the message from Solaris
is "cannot import: one or more devices currently unavailable".

What can I do? Is there a way to rebuild a faulted pool? I have no backup.

Thanks for any help.
Re: [zfs-discuss] Avoiding performance decrease when pool usage is over 80%
bda wrote:

> I haven't noticed this behavior when ZFS has (as recommended) the full
> disk.

Good to know, as I intended to use the whole disks anyway.

Thanks,
Tom
Re: [zfs-discuss] Avoiding performance decrease when pool usage is over 80%
Ralf Ramge wrote:

> Quotas are applied to file systems, not pools, and as such are pretty
> independent of the pool size. I found it best to give every user his/her
> own filesystem and apply individual quotas afterwards.

Does this mean that if I have a pool of 7TB with one filesystem for all
users with a quota of 6TB, I'd be alright? The usage of that fs would never
be over 80%, right? Like in the following example for the pool "shares" with
a pool size of 228G and one fs with a quota of 100G:

shares              228G    28K   220G    1%  /shares
shares/production   100G   8,4G    92G    9%  /shares/production

This would suit me perfectly, as this is exactly what I wanted to do ;)

Thanks,
Tom
Re: [zfs-discuss] Avoiding performance decrease when pool usage is over 80%
On 2008-02-12 02:40:33, Thomas Liesner wrote:

> Nobody out there who ever had problems with low diskspace?

Only in shared-disk setups, i.e. ZFS lives on a slice, or on partition 2
with (typically) UFS slices or UFS on partition 1. I've definitely tried to
keep disk utilization under 80% for this reason. Things become very slow as
you pass that limit.

I haven't noticed this behavior when ZFS has (as recommended) the full disk.

--
bda
Cyberpunk is dead. Long live cyberpunk.
http://mirrorshades.org
Re: [zfs-discuss] Avoiding performance decrease when pool usage is over 80%
Nobody out there who ever had problems with low diskspace?

Regards,
Tom
Re: [zfs-discuss] 3ware support
Nicolas Szalay wrote:

> On Tuesday, 12 February 2008 at 07:22 +0100, Johan Kooijman wrote:
> > Good morning all,
> >
> > can anyone confirm that 3ware raid controllers are indeed not working
> > under Solaris/OpenSolaris? I can't seem to find it in the HCL.
>
> I do confirm they don't work.
>
> > We're now using a 3Ware 9550SX as a S-ATA RAID controller. The original
> > plan was to disable all its RAID functions and use just the S-ATA
> > controller functionality for ZFS deployment. If indeed 3Ware isn't
> > supported, I have to buy a new controller. Any specific
> > controller/brand you can recommend for Solaris?
>
> I use Areca cards, with the driver supplied by Areca (certified in the
> HCL)

I'm working on getting arcmsr integrated into OpenSolaris, and I hope to
integrate it into build 87. The RFE is

    6614012 add Areca SAS/SATA RAID adapter driver
    PSARC 2008/079 arcmsr SAS/SATA RAID driver

The existing case materials (spec and manpage) should be visible on
www.opensolaris.org/os/community/arc

cheers,
James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp    http://www.jmcp.homeunix.com/blog
Re: [zfs-discuss] Avoiding performance decrease when pool usage is over 80%
Thomas Liesner wrote:

> Nobody out there who ever had problems with low diskspace?

Okay, I found your original mail :-)

Quotas are applied to file systems, not pools, and as such are pretty
independent of the pool size. I found it best to give every user his/her own
filesystem and apply individual quotas afterwards.

--
Ralf Ramge
Senior Solaris Administrator, SCNA, SCSA

Tel. +49-721-91374-3963
[EMAIL PROTECTED] - http://web.de/

1&1 Internet AG
Brauerstraße 48
76135 Karlsruhe

Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas
Gauger, Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn,
Norbert Lang, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren
Re: [zfs-discuss] Avoiding performance decrease when pool usage is over 80%
Thomas Liesner wrote:

> Does this mean that if I have a pool of 7TB with one filesystem for all
> users with a quota of 6TB, I'd be alright?

Yep. Although I *really* recommend creating individual file systems, e.g. if
you have 1,000 users on your server, I'd create 1,000 file systems with a
quota of 6 GB each. Easier to handle, more flexible to use, easier to back
up, it allows better use of snapshots, and it's easier to migrate single
users to other servers.

> The usage of that fs would never be over 80%, right?

Nope. Don't mix up pools and file systems. Your pool of 7TB will only be
filled to a maximum of 6TB, but the file system will be 100% full -- which
shouldn't impact your overall performance.

> Like in the following example for the pool "shares" with a pool size of
> 228G and one fs with a quota of 100G:
>
> shares              228G    28K   220G    1%  /shares
> shares/production   100G   8,4G    92G    9%  /shares/production
>
> This would suit me perfectly, as this is exactly what I wanted to do ;)

Yep, you got it.

--
Ralf Ramge
Senior Solaris Administrator, SCNA, SCSA

Tel. +49-721-91374-3963
[EMAIL PROTECTED] - http://web.de/

1&1 Internet AG
Brauerstraße 48
76135 Karlsruhe

Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas
Gauger, Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn,
Norbert Lang, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren
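[Editor's note: a minimal sketch of the per-user layout Ralf recommends;
the pool name and user names are hypothetical and the quota value is
illustrative:]

    # One file system per user, each with its own quota
    zfs create tank/home
    for user in alice bob carol; do
            zfs create tank/home/$user
            zfs set quota=6G tank/home/$user
    done

    # Per-user snapshots then come for free, e.g.
    zfs snapshot tank/home/alice@daily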
Re: [zfs-discuss] We can't import pool zfs faulted
If you can't use zpool status, you probably should check whether the system
is right and not all of the devices needed for this pool are actually
available, e.g. with format...

Regards,
Tom
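[Editor's note: a minimal sketch of that check; no pool name is needed for
either command:]

    # With the pool exported, list pools visible for import; the output
    # shows each constituent device and whether it is ONLINE or missing
    zpool import

    # Cross-check that the underlying disks themselves show up
    format </dev/null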