[zfs-discuss] RFE 4852783

2008-06-16 Thread Miles Nordin
Is RFE 4852783 (need for an equivalent to LVM2's pvmove) likely to happen within the next year? My use-case is home user. I have 16 disks spinning, two towers of eight disks each, exporting some of them as iSCSI targets. Four disks are 1TB disks already in ZFS mirrors, and 12 disks are 180 -

Re: [zfs-discuss] Problem with missing disk in RaidZ

2008-06-20 Thread Miles Nordin
ph == Peter Hawkins [EMAIL PROTECTED] writes: ph Tried zpool replace. Unfortunately that takes me back into the ph cycle where as soon as the resilver starts the system hangs, ph not even CAPS Lock works. When I reset the system I have about ph a 10 second window to detach the

Re: [zfs-discuss] zfs mirror broken?

2008-06-20 Thread Miles Nordin
jb == Jeff Bonwick [EMAIL PROTECTED] writes: jb If you say 'zpool online pool disk' that should tell ZFS jb that the disk is healthy again and automatically kick off a jb resilver. jb Of course, that should have happened automatically. with b71 I find that it does sometimes

Re: [zfs-discuss] Oracle and ZFS

2008-06-23 Thread Miles Nordin
mo == Mertol Ozyoney [EMAIL PROTECTED] writes: mo One of our customers suffered from FS being corrupted after mo an unattended shutdown due to power problem. mo They want to switch to ZFS. mo From what I read on, ZFS will most probably not be corrupted mo from the same

Re: [zfs-discuss] Oracle and ZFS

2008-06-23 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: kb == Keith Bierman [EMAIL PROTECTED] writes: re the disk lies about the persistence of the data. ZFS knows re disks lie, so it sends sync commands when necessary (1) i don't think ``lie'' is a correct characterization given that the

Re: [zfs-discuss] ZFS configuration for VMware

2008-06-28 Thread Miles Nordin
et == Erik Trimble [EMAIL PROTECTED] writes: et SSD used to refer strictly to standard DRAM backed with a et battery (and, maybe some sort of a fancy enclosure with a hard et drive to write all DRAM data to after a power outage). et * 3.5 LP disk form factor, SCSI hotswap/SATA2

Re: [zfs-discuss] [caiman-discuss] swap dump on ZFS volume

2008-07-01 Thread Miles Nordin
bf == Bob Friesenhahn [EMAIL PROTECTED] writes: re == Richard Elling [EMAIL PROTECTED] writes: re If you run out of space, things fail. Pinwheels are a symptom re of running out of RAM, not running out of swap. okay. But what is the point? Pinwheels are a symptom of thrashing.

Re: [zfs-discuss] [caiman-discuss] swap dump on ZFS volume

2008-07-01 Thread Miles Nordin
bf == Bob Friesenhahn [EMAIL PROTECTED] writes: bf What is the relationship between the size of the memory bf reservation and thrashing? The problem is that size-capping is the only control we have over thrashing right now. Maybe there are better ways to predict thrashing than through

Re: [zfs-discuss] [caiman-discuss] swap dump on ZFS volume

2008-07-01 Thread Miles Nordin
bf == Bob Friesenhahn [EMAIL PROTECTED] writes: bf sequential access to virtual memory causes reasonably bf sequential I/O requests to disk. no, thrashing is not when memory is accessed randomly instead of sequentially. It's when the working set of pages is too big to fit in physical
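The distinction drawn above (thrashing depends on whether the working set fits in physical memory, not on whether access is sequential or random) can be illustrated with a toy simulation. This is only a sketch under stated assumptions: it models plain LRU page replacement, which is not necessarily what any real kernel uses, and the frame count and page counts are made-up numbers, not figures from the thread.

```python
from collections import OrderedDict


def lru_fault_rate(accesses, frames):
    """Return the fraction of accesses that page-fault under LRU replacement."""
    resident = OrderedDict()  # keys are resident pages, oldest use first
    faults = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)  # hit: mark as most recently used
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict the least recently used page
            resident[page] = None
    return faults / len(accesses)


def sequential_sweep(pages, passes):
    """Touch pages 0..pages-1 strictly in order, repeated `passes` times."""
    return [p for _ in range(passes) for p in range(pages)]


FRAMES = 100  # hypothetical amount of physical memory, in pages

# Purely sequential access in both cases; only the working-set size differs.
fits = lru_fault_rate(sequential_sweep(100, 50), FRAMES)
too_big = lru_fault_rate(sequential_sweep(101, 50), FRAMES)

print(f"working set fits in memory:   fault rate {fits:.2f}")
print(f"working set one page too big: fault rate {too_big:.2f}")
```

In this toy model a perfectly sequential sweep goes from faulting only on the first pass to faulting on every access once the working set exceeds memory by a single page, which is the sense in which sequential access does not protect against thrashing.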

Re: [zfs-discuss] Large zpool design considerations

2008-07-03 Thread Miles Nordin
djm == Darren J Moffat [EMAIL PROTECTED] writes: bf == Bob Friesenhahn [EMAIL PROTECTED] writes: djm Why are you planning on using RAIDZ-2 rather than mirroring ? isn't MTTDL sometimes shorter for mirroring than raidz2? I think that is the biggest point of raidz2, is it not? bf The

Re: [zfs-discuss] zfs-discuss Digest, Vol 33, Issue 19

2008-07-09 Thread Miles Nordin
r == Ross [EMAIL PROTECTED] writes: np == Neil Perrin [EMAIL PROTECTED] writes: np 2. I received the board and driver from another group within np Sun. It would be better to contact Micro Memory (or whoever np took them over) directly, as it's not my place to give out 3rd np

Re: [zfs-discuss] confusion and frustration with zpool

2008-07-09 Thread Miles Nordin
ah == Al Hopper [EMAIL PROTECTED] writes: ah I've had bad experiences with the Seagate products. I've had bad experiences with all of them. (maxtor, hgst, seagate, wd) ah My guess is that it's related to duty cycle - Recently I've been getting a lot of drives from companies like

Re: [zfs-discuss] zfs-discuss Digest, Vol 33, Issue 19

2008-07-09 Thread Miles Nordin
r == Ross [EMAIL PROTECTED] writes: r I think the problem Miles is that this isn't Sun hardware In this case it's not, but please do not muddle my point: Marvell SATA and LSI Logic mpt SATARAID and many other (most?) drivers have the same problem. Right now there are, AIUI: *

Re: [zfs-discuss] previously mentioned J4000 released

2008-07-11 Thread Miles Nordin
bf == Bob Friesenhahn [EMAIL PROTECTED] writes: bf since the dawn of time since the dawn of time Sun has been playing these games with hard drive ``sleds''. I still have sparc32 stuff on the shelf with missing/extra sleds. bf POTS line bf cell phone bf You are free to select

Re: [zfs-discuss] raid or mirror

2008-07-11 Thread Miles Nordin
jh == Johan Hartzenberg [EMAIL PROTECTED] writes: jh To be even MORE safe, you want the two disks to be on separate jh controllers, so that you can survive a controller failure too. or a controller-driver-failure. At least on Linux, when a disk goes bad, Linux starts resetting

Re: [zfs-discuss] checksum errors on root pool after upgrade to snv_94

2008-07-20 Thread Miles Nordin
jk == Jürgen Keil [EMAIL PROTECTED] writes: jk And a zpool scrub under snv_85 doesn't find checksum errors, jk either. how about a second scrub with snv_94? are the checksum errors gone the second time around? I get checksum errors counted all the time when it is really just

Re: [zfs-discuss] install opensolaris on raidz

2008-07-20 Thread Miles Nordin
r == Ross [EMAIL PROTECTED] writes: r the benefit of mirroring that CF drive would be minimal. rather short-sighted. What if you want to replace the CF with a bigger or faster one without shutting down?

Re: [zfs-discuss] ZFS deduplication

2008-07-22 Thread Miles Nordin
et == Erik Trimble [EMAIL PROTECTED] writes: et Dedup Advantages: et (1) save space (2) coalesce data which is frequently used by many nodes in a large cluster into a small nugget of common data which can fit into RAM or L2 fast disk (3) back up non-ZFS filesystems that don't

Re: [zfs-discuss] The best motherboard for a home ZFS fileserver

2008-07-23 Thread Miles Nordin
mh == Matt Harrison [EMAIL PROTECTED] writes: mh http://breden.org.uk/2008/03/02/home-fileserver-zfs-hardware/ that's very helpful. I'll reshop for nForce 570 boards. i think my untested guess was an nForce 630 or something, so it probably won't work. I would add: 1. do not get three

Re: [zfs-discuss] The best motherboard for a home ZFS fileserver

2008-07-23 Thread Miles Nordin
ic == Ian Collins [EMAIL PROTECTED] writes: ic I'd use mirrors rather than raidz2. You should see better ic performance the problem is that it's common for a very large drive to have unreadable sectors. This can happen because the drive is so big that its bit-error-rate matters. But

Re: [zfs-discuss] The best motherboard for a home ZFS fileserver

2008-07-24 Thread Miles Nordin
s == Steve [EMAIL PROTECTED] writes: s About freedom: I for sure would prefer open source drivers s availability, let's account for it! There is source for the Intel gigabit cards in the source browser.

Re: [zfs-discuss] Ideal Setup: RAID-5, Areca, etc!

2008-07-25 Thread Miles Nordin
bh == Brandon High [EMAIL PROTECTED] writes: bh a system built around the Marvell or LSI chipsets according to The Blogosphere, source of all reliable information, there's some issue with LSI, too. The driver is not available in stable Solaris nor OpenSolaris, or there are two drivers, or

Re: [zfs-discuss] zfs, raidz, spare and jbod

2008-07-25 Thread Miles Nordin
jcm == James C McPherson [EMAIL PROTECTED] writes: jcm I'm not convinced that this is a valid test; yanking a disk it is the ONLY valid test. it's just testing more than ZFS.

Re: [zfs-discuss] zfs, raidz, spare and jbod

2008-07-25 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: re I will submit that this failure mode is often best re solved by door locks, not software. First, not just door locks, but: * redundant power supplies * sleds and Maintain Me, Please lights * high-strung extremely conservative

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-07-28 Thread Miles Nordin
mp == Mattias Pantzare [EMAIL PROTECTED] writes: This is a big one: ZFS can continue writing to an unavailable pool. It doesn't always generate errors (I've seen it copy over 100MB before erroring), and if not spotted, this *will* cause data loss after you reboot. mp

Re: [zfs-discuss] The best motherboard for a home ZFS fileserver

2008-07-31 Thread Miles Nordin
s == Steve [EMAIL PROTECTED] writes: s http://www.newegg.com/Product/Product.aspx?Item=N82E16813128354 no ECC: http://en.wikipedia.org/wiki/List_of_Intel_chipsets#Core_2_Chipsets

Re: [zfs-discuss] Can I trust ZFS?

2008-07-31 Thread Miles Nordin
r == Ross [EMAIL PROTECTED] writes: r This is a big step for us, we're a 100% windows company and r I'm really going out on a limb by pushing Solaris. I'm using it in anger. I'm angry at it, and can't afford anything that's better. Whatever I replaced ZFS with, I would make sure it

Re: [zfs-discuss] Terrible zfs performance under NFS load

2008-08-01 Thread Miles Nordin
cs == Chris Siebenmann [EMAIL PROTECTED] writes: cs (Some versions of syslog let you turn this off for specific cs log files, which is very useful for high volume, low cs importance ones.) To ensure that kernel messages are written to disk promptly, syslogd(8)

Re: [zfs-discuss] checksum errors after online'ing device

2008-08-02 Thread Miles Nordin
tn == Thomas Nau [EMAIL PROTECTED] writes: tn Nevertheless during the first hour of operation after onlining tn we recognized numerous checksum errors on the formerly tn offlined device. We decided to scrub the pool and after tn several hours we got about 3500 errors in 600GB of

Re: [zfs-discuss] checksum errors after online'ing device

2008-08-02 Thread Miles Nordin
tn == Thomas Nau [EMAIL PROTECTED] writes: tn I never experienced that one but we usually don't touch any of tn the iSCSI settings as long as a device is offline. At least tn as long as we don't have to for any reason Usually I do 'zpool offline' followed by 'iscsiadm remove

[zfs-discuss] 'zpool status' intrusiveness

2008-08-02 Thread Miles Nordin
c == Miles Nordin [EMAIL PROTECTED] writes: tn == Thomas Nau [EMAIL PROTECTED] writes: c 'zpool status' should not be touching the disk at all. I found this on some old worklog: http://web.Ivy.NET/~carton/oneNightOfWork/20061119-carton.html -8- Also, zpool status takes forEVer

Re: [zfs-discuss] are these errors dangerous

2008-08-03 Thread Miles Nordin
mh == Matt Harrison [EMAIL PROTECTED] writes: mh I'm worried about is if the entire batch is failing slowly mh and will all die at the same time. If you can download smartctl, you can use the approach described here: http://web.Ivy.NET/~carton/rant/ml/raid-findingBadDisks-0.html

Re: [zfs-discuss] Supermicro AOC-USAS-L8i

2008-08-04 Thread Miles Nordin
bh == Brandon High [EMAIL PROTECTED] writes: nk == Nathan Kroenert [EMAIL PROTECTED] writes: nk And I can certainly vouch for that series of chipsets... I nk have a 750a-sli chipset (the one below the 790) um...what? 750a is an nVidia chip

Re: [zfs-discuss] help me....

2008-08-04 Thread Miles Nordin
np == Neal Pollack [EMAIL PROTECTED] writes: wj == wan jm [EMAIL PROTECTED] writes: np Yes, it's too easy to administer. This makes it rough to np charge a lot as a sysadmin. yeah, sure, until you get a simple question like this: wj there are two disks in one ZFS pool used as

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed

2008-08-04 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: pf == Paul Fisher [EMAIL PROTECTED] writes: re I was able to reproduce this in b93, but might have a re different interpretation You weren't able to reproduce the hang of 'zpool status'? Your 'zpool status' was after the FMA fault kicked

Re: [zfs-discuss] Supermicro AOC-USAS-L8i

2008-08-05 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: re This was fixed some months ago, and it should be hard to find re the old B2 chips anymore (not many were made or sold). -- well, they all ended up on newegg. :)

Re: [zfs-discuss] OpenSolaris+ZFS+RAIDZ+VirtualBox - ready for production systems?

2008-08-05 Thread Miles Nordin
em == Evert Meulie [EMAIL PROTECTED] writes: em OpenSolaris+ZFS+RAIDZ+VirtualBox. I'm using snv b83 + ZFS-unredundant + 32bit CPU + VirtualBox. It's stable, but not all the features like USB and RDP are working for me. Also it is being actively developed, so that's good. I'm planning to

Re: [zfs-discuss] more ZFS recovery

2008-08-06 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: tb == Tom Bird [EMAIL PROTECTED] writes: tb There was a problem with the SAS bus which caused various tb errors including the inevitable kernel panic, the thing came tb back up with 3 out of 4 zfs mounted. re In general, ZFS can

Re: [zfs-discuss] more ZFS recovery

2008-08-06 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: c If that's really the excuse for this situation, then ZFS is c not ``always consistent on the disk'' for single-VDEV pools. re I disagree with your assessment. The on-disk format (any re on-disk format) necessarily assumes no

Re: [zfs-discuss] more ZFS recovery

2008-08-06 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: re If your pool is not redundant, the chance that data re corruption can render some or all of your data inaccessible is re always present. 1. data corruption != unclean shutdown 2. other filesystems do not need a mirror to recover

Re: [zfs-discuss] more ZFS recovery

2008-08-06 Thread Miles Nordin
nw == Nicolas Williams [EMAIL PROTECTED] writes: nw Without ZFS the OP would have had silent, undetected (by the nw OS that is) data corruption. It sounds to me more like the system would have panicked as soon as he pulled the cord, and when it rebooted, it would have rolled the UFS log

Re: [zfs-discuss] more ZFS recovery

2008-08-07 Thread Miles Nordin
r == Ross [EMAIL PROTECTED] writes: r Tom wrote There was a problem with the SAS bus which caused r various errors including the inevitable kernel panic. It's r the various errors part that catches my eye, yeah, possibly, but there are checksums on the SAS bus, and its

Re: [zfs-discuss] RFE 4852783

2008-08-08 Thread Miles Nordin
t == Tim [EMAIL PROTECTED] writes: t Why would you have to buy smaller disks? You can replace the t 320's with 1tb drives and after the last 320 is out of the t raidgroup, it will grow automatically. This does work for me to grow a mirrored vdev on nevada b71. The way I found

Re: [zfs-discuss] ZFS, SATA, LSI and stability

2008-08-12 Thread Miles Nordin
ff I have check the drives with smartctl: ff ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE ff 1 Raw_Read_Error_Rate 0x000f 115 075 006Pre-fail Always - 94384069 ff 5 Reallocated_Sector_Ct 0x0033

Re: [zfs-discuss] corrupt zfs stream? checksum mismatch

2008-08-12 Thread Miles Nordin
mp == Mattias Pantzare [EMAIL PROTECTED] writes: mp Or the file was corrupted when you transfered it. he stored the backup streams on ZFS, so obviously they couldn't possibly be corrupt. :p Jonathan, does 'zfs receive -nv' also detect the checksum error, or is it only detected when you

Re: [zfs-discuss] more ZFS recovery

2008-08-12 Thread Miles Nordin
cs == Cromar Scott [EMAIL PROTECTED] writes: cs It appears that the metadata on that pool became corrupted cs when the processor failed. The exact mechanism is a bit of a cs mystery, [...] cs We were told that the probability of metadata corruption would cs have been

Re: [zfs-discuss] corrupt zfs stream? checksum mismatch

2008-08-13 Thread Miles Nordin
jw == Jonathan Wheeler [EMAIL PROTECTED] writes: mp == Mattias Pantzare [EMAIL PROTECTED] writes: jw Miles: zfs receive -nv works ok one might argue 'zfs receive' should validate checksums with the -n option, so you can check if a just-written dump is clean before counting on it. Without

Re: [zfs-discuss] more ZFS recovery

2008-08-13 Thread Miles Nordin
cs == Cromar Scott [EMAIL PROTECTED] writes: cs We opened a call with Sun support. We were told that the cs corruption issue was due to a race condition within ZFS. We cs were also told that the issue was known and was scheduled for cs a fix in S10U6. nice. Is there a bug

Re: [zfs-discuss] corrupt zfs stream? checksum mismatch

2008-08-13 Thread Miles Nordin
jw == Jonathan Wheeler [EMAIL PROTECTED] writes: jw A common example used all over the place is zfs send | ssh jw $host. In these examples is ssh guaranteeing the data delivery jw somehow? it is really all just apologetics. It sounds like a zfs bug to me. The only alternative is

Re: [zfs-discuss] Kernel panic at zpool import

2008-08-14 Thread Miles Nordin
mb == Marc Bevand [EMAIL PROTECTED] writes: mb Ask your hardware vendor. The hardware corrupted your data, mb not ZFS. You absolutely do NOT have adequate basis to make this statement. I would further argue that you are probably wrong, and that I think based on what we know that the

Re: [zfs-discuss] shrinking a zpool - roadmap

2008-08-20 Thread Miles Nordin
j == John [EMAIL PROTECTED] writes: j There is also the human error factor. If someone accidentally j grows a zpool or worse, accidentally adds an unredundant vdev to a redundant pool. Once you press return, all you can do is scramble to find mirrors for it. vdev removal is also

Re: [zfs-discuss] ZFS with Traditional SAN

2008-08-21 Thread Miles Nordin
vf == Vincent Fox [EMAIL PROTECTED] writes: vf Because arrays drives can suffer silent errors in the data vf that are not found until too late. My zpool scrubs vf occasionally find FIX errors that none of the array or vf RAID-5 stuff caught. well, just to make it clear again:

Re: [zfs-discuss] Best layout for 15 disks?

2008-08-22 Thread Miles Nordin
m == mike [EMAIL PROTECTED] writes: m can you combine two zpools together? no. You can have many vdevs in one pool. for example you can have a mirror vdev and a raidz2 vdev in the same pool. You can also destroy pool B, and add its (now empty) devices to pool A. but once two separate

Re: [zfs-discuss] Best layout for 15 disks?

2008-08-22 Thread Miles Nordin
m == mike [EMAIL PROTECTED] writes: m that could only be accomplished through combinations of pools? m i don't really want to have to even think about managing two m separate partitions - i'd like to group everything together m into one large 13tb instance You're not

Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-25 Thread Miles Nordin
jcm == James C McPherson [EMAIL PROTECTED] writes: thp == Todd H Poole [EMAIL PROTECTED] writes: mh == Matt Harrison [EMAIL PROTECTED] writes: js == John Sonnenschein [EMAIL PROTECTED] writes: re == Richard Elling [EMAIL PROTECTED] writes: cg == Carson Gaspar [EMAIL PROTECTED] writes:

Re: [zfs-discuss] Unable to import zpool since system hang during zfs destroy

2008-08-25 Thread Miles Nordin
cm == Chris Murray [EMAIL PROTECTED] writes: cm The next issue is that when the pool is actually imported cm (zpool import -f zp), it too hangs the whole system, albeit cm after a minute or so of disk activity. could it be #6573681?

Re: [zfs-discuss] NFS shares don't come back online with pool

2008-08-26 Thread Miles Nordin
r == Ross [EMAIL PROTECTED] writes: r I've just gotten a pool back online after the server booted r with it unavailable, but found that NFS shares were not r automatically restarted when the pool came online. ``me, too.'' in b44, in b71. for workarounds, export/import can

Re: [zfs-discuss] pulling disks was: ZFS hangs/freezes after disk failure,

2008-08-26 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: re unrecoverable read as the dominant disk failure mode. [...] re none of the traditional software logical volume managers nor re the popular open source file systems (other than ZFS :-) re address this problem. Other LVM's should

Re: [zfs-discuss] pulling disks was: ZFS hangs/freezes after disk failure,

2008-08-27 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: re not all devices return error codes which indicate re unrecoverable reads. What you mean is, ``devices sometimes return bad data instead of an error code.'' If you really mean there are devices out there which never return error codes,

Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-27 Thread Miles Nordin
m == MC [EMAIL PROTECTED] writes: m file another bug about how solaris recognizes your AHCI SATA m hardware as old ide hardware. I don't have that board but AIUI the driver attachment's choosable in the BIOS Blue Screen of Setup, by setting the controller to ``Compatibility'' mode

Re: [zfs-discuss] ZFS hangs/freezes after disk failure,

2008-08-27 Thread Miles Nordin
thp == Todd H Poole [EMAIL PROTECTED] writes: Would try this with your pci/pci-e cards in this system? I think not. thp Unplugging one of them seems like a fine test to me I've done it, with 32-bit 5 volt PCI, I forget why. I might have been trying to use a board, but bypass the

Re: [zfs-discuss] [Fwd: File system size reduction]

2008-08-27 Thread Miles Nordin
vk == Vikas Kakkar [EMAIL PROTECTED] writes: vk Actually customer wants to reduce the pool size, I guess we vk cannot do this today; there is a pending RFP on this. RFE 4852783 is decreasing. There was maybe some recent activity about INcreasing a pool size which you can do already,

Re: [zfs-discuss] pulling disks was: ZFS hangs/freezes after disk failure,

2008-08-27 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: If you really mean there are devices out there which never return error codes, and always silently return bad data, please tell us which one and the story of when you encountered it, re I blogged about one such case. re

Re: [zfs-discuss] pulling disks was: ZFS hangs/freezes after disk failure,

2008-08-27 Thread Miles Nordin
t == Tim [EMAIL PROTECTED] writes: t Solaris does not do this. yeah but the locators for local disks are still based on pci/controller/channel not devid, so the disk will move to a different device name if he changes BIOS from pci-ide to AHCI because it changes the driver attachment.

Re: [zfs-discuss] pulling disks was: ZFS hangs/freezes after disk failure,

2008-08-27 Thread Miles Nordin
t == Tim [EMAIL PROTECTED] writes: t Except he was, and is referring to a non-root disk. wait, what? his root disk isn't plugged into the pci-ide controller? t LVM hardly changes the way devices move around in Linux, fine, be pedantic. It makes systems boot and mount all their

Re: [zfs-discuss] pulling disks was: ZFS hangs/freezes after disk failure,

2008-08-27 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: re I really don't know how to please you. dd from the raw device instead of through ZFS would be better. If you could show that you can write data to a sector, and read back different data, without getting an error, over and over, I'd be

Re: [zfs-discuss] pulling disks was: ZFS hangs/freezes after disk failure,

2008-08-28 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: re There is no error in my math. I presented a failure rate for re a time interval, What is a ``failure rate for a time interval''? AIUI, the failure rate for a time interval is 0.46% / yr, no matter how many drives you have.

Re: [zfs-discuss] pulling disks was: ZFS hangs/freezes after disk failure,

2008-08-28 Thread Miles Nordin
rm == Robert Milkowski [EMAIL PROTECTED] writes: rm Please look for slides 23-27 at rm http://unixdays.pl/i/unixdays-prezentacje/2007/milek.pdf yeah, ok, ONCE AGAIN, I never said that checksums are worthless. relling: some drives don't return errors on unrecoverable read events.

Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better

2008-08-28 Thread Miles Nordin
es == Eric Schrock [EMAIL PROTECTED] writes: es Finally, imposing additional timeouts in ZFS is a bad idea. es [...] As such, it doesn't have the necessary context to know es what constitutes a reasonable timeout. you're right in terms of fixed timeouts, but there's no reason it

Re: [zfs-discuss] pulling disks was: ZFS hangs/freezes after disk failure,

2008-08-28 Thread Miles Nordin
jl == Jonathan Loran [EMAIL PROTECTED] writes: jl Fe = 46% failures/month * 12 months = 5.52 failures the original statistic wasn't of this kind. It was ``likelihood a single drive will experience one or more failures within 12 months''. so, you could say, ``If I have a thousand drives,
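The difference between the two kinds of statistic being argued over here can be worked through numerically. The sketch below is an illustration only: it assumes drive failures are independent, and it uses the 0.46% per year figure mentioned elsewhere in this thread purely as an example input, not as a claim about any particular drive population.

```python
def expected_failed_drives(p_year, drives):
    """Expected number of drives that see at least one failure in a year."""
    return p_year * drives


def prob_any_failure(p_year, drives):
    """Probability that at least one of `drives` independent drives fails in a year."""
    return 1.0 - (1.0 - p_year) ** drives


# Per-drive likelihood of one or more failures within 12 months.
# Illustrative value; independence between drives is an assumption.
P_YEAR = 0.0046

for n in (1, 10, 1000):
    print(
        f"{n:5d} drives: expect {expected_failed_drives(P_YEAR, n):.3f} failed, "
        f"P(at least one failure) = {prob_any_failure(P_YEAR, n):.4f}"
    )
```

The per-drive likelihood stays the same however many drives there are; only the fleet-level quantities (the expected count of failed drives and the chance of seeing at least one failure) grow with the number of drives.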

Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better

2008-08-28 Thread Miles Nordin
es == Eric Schrock [EMAIL PROTECTED] writes: es I don't think you understand how this works. Imagine two es I/Os, just with different sd timeouts and retry logic - that's es B_FAILFAST. It's quite simple, and independent of any es hardware implementation. AIUI the main timeout

Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better

2008-08-28 Thread Miles Nordin
bf == Bob Friesenhahn [EMAIL PROTECTED] writes: bf If the system or device is simply overwhelmed with work, then bf you would not want the system to go haywire and make the bf problems much worse. None of the decisions I described its making based on performance statistics are

Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better

2008-08-29 Thread Miles Nordin
es == Eric Schrock [EMAIL PROTECTED] writes: es The main problem with exposing tunables like this is that they es have a direct correlation to service actions, and es mis-diagnosing failures costs everybody (admin, companies, es Sun, etc) lots of time and money. Once you expose

Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better

2008-08-29 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: re if you use Ethernet switches in the interconnect, you need to re disable STP on the ports used for interconnects or risk re unnecessary cluster reconfigurations. RSTP/802.1w plus setting the ports connected to Solaris as ``edge'' is

Re: [zfs-discuss] Sidebar to ZFS Availability discussion

2008-08-31 Thread Miles Nordin
dc == David Collier-Brown [EMAIL PROTECTED] writes: dc one discovers latency growing without bound on disk dc saturation, yeah, ZFS needs the same thing just for scrub. I guess if the disks don't let you tag commands with priorities, then you have to run them at slightly below max

Re: [zfs-discuss] Sidebar to ZFS Availability discussion

2008-09-02 Thread Miles Nordin
bs == Bill Sommerfeld [EMAIL PROTECTED] writes: bs In an ip network, end nodes generally know no more than the bs pipe size of the first hop -- and in some cases (such as true bs CSMA networks like classical ethernet or wireless) only have bs an upper bound on the pipe size.

Re: [zfs-discuss] faulty sub-mirror and CKSUM errors

2008-09-03 Thread Miles Nordin
rm == Robert Milkowski [EMAIL PROTECTED] writes: rm What bothers me is why did I got CKSUM errors? I think they accumulated latently while you had the pool imported on Node 2 with half of the mirror missing. ZFS seems to count unexpected resilvering as CKSUM errors sometimes.

Re: [zfs-discuss] SAS or SATA HBA with write cache

2008-09-03 Thread Miles Nordin
mb == Matt Beebe [EMAIL PROTECTED] writes: mb Anyone know of a SATA and/or SAS HBA with battery backed write mb cache? I've never heard of a battery that's used for anything but RAID features. It's an interesting question, if you use the controller in ``JBOD mode'' will it use the

Re: [zfs-discuss] ZFS over multiple iSCSI targets

2008-09-07 Thread Miles Nordin
mp == Matthew Plumb [EMAIL PROTECTED] writes: mp how best to use the disk I have to ensure I have a safe backup mp strategy? continue using rsync between a ZFS pool and an LVM2 pool. At the very least, have two ZFS pools. For ZFS over iSCSI, have some zpool-layer redundancy because

Re: [zfs-discuss] ZFS Failing Drive procedure (mirrored pairs) - did I mess this up?

2008-09-08 Thread Miles Nordin
kp == Karl Pielorz [EMAIL PROTECTED] writes: kp Thinking about it - perhaps I should have detached ad4 (the kp failing drive) before attaching another device? no, I think ZFS should be fixed. 1. the procedure you used is how hot spares are used, so anyone who says it's wrong for any

Re: [zfs-discuss] ZFS over multiple iSCSI targets

2008-09-08 Thread Miles Nordin
ps == Peter Schuller [EMAIL PROTECTED] writes: ps The software raid in Linux does not support [write barriers] ps with raid5/raid6, yeah i read this warning also and think it's a good argument for not using it. http://lwn.net/Articles/283161/ With RAID5 or RAID6 there is of course

Re: [zfs-discuss] Will ZFS stay consistent with AVS/ZFS and async replication

2008-09-11 Thread Miles Nordin
mb == Matt Beebe [EMAIL PROTECTED] writes: mb When using AVS's Async replication with memory queue, am I mb guaranteed a consistent ZFS on the distant end? The assumed mb failure case is that the replication broke, and now I'm trying mb to promote the secondary replicate with

Re: [zfs-discuss] zfs import -f not working!

2008-09-11 Thread Miles Nordin
Did you guys ever fix this, or get a bug number, or anything? Should I avoid that release? I was about to install b96 for ZFS fixes but this 'zpool import -f' problem looks bad. Corey -8- pr1# zpool offline tank c5t0d0s0 pr1# zpool status pool: rpool state: ONLINE scrub: none

Re: [zfs-discuss] zfs import -f not working!

2008-09-11 Thread Miles Nordin
c == Miles Nordin [EMAIL PROTECTED] writes: c Did you guys ever fix this, or get a bug number, or c anything? I found two bugs about this: http://bugs.opensolaris.org/view_bug.do?bug_id=6736213 http://bugs.opensolaris.org/view_bug.do?bug_id=6739532 I don't think either one fits

Re: [zfs-discuss] Interesting Pool Import Failure

2008-09-16 Thread Miles Nordin
s == Solaris [EMAIL PROTECTED] writes: s Point being that even if you can't run OpenSolaris due to s support issues, you may still be able to use OpenSolaris to s help resolve ZFS issues that you might run into in Solaris 10. glad ZFS is improving, but this sentence is a

Re: [zfs-discuss] ZPOOL Import Problem

2008-09-16 Thread Miles Nordin
jd == Jim Dunham [EMAIL PROTECTED] writes: jd If at the time the SNDR replica is deleted the set was jd actively replicating, along with ZFS actively writing to the jd ZFS storage pool, I/O consistency will be lost, leaving ZFS jd storage pool in an indeterministic state on the

Re: [zfs-discuss] zpool with multiple mirrors question

2008-09-17 Thread Miles Nordin
djm == Darren J Moffat [EMAIL PROTECTED] writes: djm If c0t6d0 and c0t7d0 both fail (ie both sides of the same djm mirror vdev) then the pool will be unable to retrieve all the djm data stored in it. won't be able to retrieve ANY of the data stored on it. It's correct as you wrote it,
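The point about losing both sides of one mirror can be sketched as a small model. This is a deliberately coarse illustration, not ZFS code: it assumes a pool built only from top-level mirror vdevs, treats the pool as usable only when every mirror still has at least one healthy member, and the device names c0t4d0 and c0t5d0 are hypothetical additions alongside the c0t6d0 and c0t7d0 named in the message.

```python
def mirror_alive(members, failed):
    """A mirror vdev survives while at least one of its members is healthy."""
    return any(disk not in failed for disk in members)


def pool_alive(mirrors, failed):
    """Data is striped across all top-level vdevs, so every one must survive."""
    return all(mirror_alive(members, failed) for members in mirrors)


# Hypothetical two-mirror pool; the first pair of names is invented.
POOL = [("c0t4d0", "c0t5d0"), ("c0t6d0", "c0t7d0")]

cases = {
    "one disk from each mirror": {"c0t5d0", "c0t7d0"},
    "both sides of one mirror": {"c0t6d0", "c0t7d0"},
}
for label, failed in cases.items():
    state = "usable" if pool_alive(POOL, failed) else "lost"
    print(f"{label}: pool {state}")
```

Both cases lose two disks, but only the second takes out a whole top-level vdev, which is why the placement of failures matters more than their count.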

Re: [zfs-discuss] resilver keeps starting over? snv_95

2008-09-17 Thread Miles Nordin
t == Tomas Ögren [EMAIL PROTECTED] writes: t I recall some issue with 'zpool status' as root restarting t resilvering.. Doing it as a regular user will not.. is there an mdb command similar to zpool status? maybe it's safer.

Re: [zfs-discuss] resilver being killed by 'zpool status' when root

2008-09-23 Thread Miles Nordin
bi == Blake Irvin [EMAIL PROTECTED] writes: bi running 'zpool status' or 'zpool status -xv' bi during a resilver as a non-privileged user has no adverse bi effect, but if i do the same as root, the resilver restarts. I have this in my ZFS bug notes: From: Thomas Bleek [EMAIL

Re: [zfs-discuss] zpool file corruption

2008-09-25 Thread Miles Nordin
np == Neal Pollack [EMAIL PROTECTED] writes: np No attempt to acknowledge or recall defective silicon. No np interest in customer data loss. Well, this customer has no np further interest in Silicon Image. I refuse to acknowledge np that they exist. 1. too bad Sil is the only

Re: [zfs-discuss] SIL3124 stability?

2008-09-25 Thread Miles Nordin
mk == Mikael Karlsson [EMAIL PROTECTED] writes: mk Anyone with experience with the SIL3124 chipset? Does it work mk good? In Solaris, I believe Sil3124 has a SATA framework driver while SIL3114 is the old IDE framework. There is more than one version of the 3124, but I've not heard

Re: [zfs-discuss] SIL3124 stability?

2008-09-25 Thread Miles Nordin
js == Joerg Schilling [EMAIL PROTECTED] writes: js If it works for your system, be happy. I mentioned that the js controller may not be usable in all systems as it hangs up the js BIOS in my machine if there is a disk connected to the card. There are three different chips under

Re: [zfs-discuss] Automatic removal of old snapshots

2008-09-25 Thread Miles Nordin
tf == Tim Foster [EMAIL PROTECTED] writes: tf anyone else have an opinion? keep the number of snapshots small until the performance problems with booting/importing/scrubbing while having lots of snapshots are resolved.

[zfs-discuss] working closed blob driver

2008-09-25 Thread Miles Nordin
wm == Will Murnane [EMAIL PROTECTED] writes: wm I'd rather have a working closed blob than a driver that is wm Free Software for a device that is faulty. Ideals are very wm nice, but broken hardware isn't. except, 1. part of the reason the closed Solaris drivers are (also)

Re: [zfs-discuss] zpool file corruption

2008-09-25 Thread Miles Nordin
c 2. if the .vmdk's were stored in ZFS why was the corruption not c flagged as a CKSUM error? wm They were. From the OP: NAME STATE READ WRITE CKSUM testing ONLINE 0 0 16 mirror ONLINE 0 0 16

Re: [zfs-discuss] working closed blob driver

2008-09-25 Thread Miles Nordin
jcm == James C McPherson [EMAIL PROTECTED] writes: jcm I assume you're referring to mpt(7d) here? jcm Since we started shipping it at all, with Solaris _8_, it's jcm definitely been available in Solaris 10. no, I was mistaken then. My perhaps mistaken understanding though was that

Re: [zfs-discuss] RAIDZ one of the disk showing unavail

2008-09-26 Thread Miles Nordin
sc == Srinivas Chadalavada [EMAIL PROTECTED] writes: rr == Ralf Ramge [EMAIL PROTECTED] writes: sc I see the first disk as unavailable, How do I make it online? rr By replacing it with a non-broken one. Ralf, aren't you missing this obstinence-error: sc the following errors must

Re: [zfs-discuss] working closed blob driver

2008-09-26 Thread Miles Nordin
t == Tim [EMAIL PROTECTED] writes: t http://www.supermicro.com/products/accessories/addon/AOC-USASLP-L8i.cfm I'm not sure. A different thing is wrong with it depending on what driver attaches to it. I can't tell for sure because this page: http://linuxmafia.com/faq/Hardware/sas.html

Re: [zfs-discuss] working closed blob driver

2008-09-28 Thread Miles Nordin
jcm == James C McPherson [EMAIL PROTECTED] writes: t == Tim [EMAIL PROTECTED] writes: jcm find out from Miles why mega_sas is new and unproven jcm given that it's been in NV since build 88. This has degenerated to an argument over definition, so if you like I can retract ``new and

Re: [zfs-discuss] ZFS poor performance on Areca 1231ML

2008-09-28 Thread Miles Nordin
jl == Jonathan Loran [EMAIL PROTECTED] writes: jl the single drive speed is in line with the raidz2 vdev, reviewing the OP: UFS single drive: 50MB/s write, 70MB/s read; ZFS 1-drive: 42MB/s write, 43MB/s read; raidz2 11-drive: 40MB/s write, 40MB/s read. so,
