Joe S js.lists at gmail.com writes:
I'm going to create 3x 2-way mirrors. I guess I don't really *need* the
raidz at this point. My biggest concern with raidz is getting locked into
a configuration i can't grow out of. I like the idea of adding more
2 way mirrors to a pool.
The raidz2
It occurred to me that there are scenarios where it would be useful to be
able to zfs send -i A B where B is a snapshot older than A. I am
trying to design an encrypted disk-based off-site backup solution on top
of ZFS, where budget is the primary constraint, and I wish zfs send/recv
would allow me
Matthew Ahrens Matthew.Ahrens at sun.com writes:
True, but presumably restoring the snapshots is a rare event.
You are right, this would only happen in case of disaster and total
loss of the backup server.
I thought that your onsite and offsite pools were the same size? If so then
you
Matthew Ahrens Matthew.Ahrens at sun.com writes:
So the errors on the raidz2 vdev indeed indicate that at least 3 disks below
it gave the wrong data for those 2 blocks; we just couldn't tell which 3+
disks they were.
Something must be seriously wrong with this server. This is the first
MC rac at eastlink.ca writes:
Obviously 7zip is far more CPU-intensive than anything in use with ZFS
today. But maybe with all these processor cores coming down the road,
a high-end compression system is just the thing for ZFS to use.
I am not sure you realize the scale of things here.
Pawel Jakub Dawidek pjd at FreeBSD.org writes:
This is how RAIDZ fills the disks (follow the numbers):
Disk0 Disk1 Disk2 Disk3
D0 D1 D2 P3
D4 D5 D6 P7
D8 D9 D10 P11
D12 D13 D14 P15
D16
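The fill pattern Pawel describes can be mimicked with a toy script. This is a sketch only: real RAIDZ uses variable-width stripes and does not pin parity to one column; the fixed 4-wide, parity-last numbering here just reproduces the diagram above.

```python
# Toy model of the single-parity layout shown in the diagram above:
# 4-wide stripes, sectors numbered left to right, parity in the last column.

def layout(rows, ncols=4):
    """Return a table of labels: D<n> for data sectors, P<n> for parity."""
    table = []
    n = 0
    for _ in range(rows):
        row = []
        for col in range(ncols):
            label = ("P" if col == ncols - 1 else "D") + str(n)
            row.append(label)
            n += 1
        table.append(row)
    return table

for row in layout(4):
    print(" ".join(f"{c:>4}" for c in row))
```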
David Runyon david.runyon at sun.com writes:
I'm trying to get maybe 200 MB/sec over NFS for large movie files (need
large capacity to hold all of them). Are there any rules of thumb on how
much RAM is needed to handle this (probably
(I assume you meant 200 Mb/sec with a lower case b.)
I would like to test ZFS boot on my home server, but according to bug
6486493 ZFS boot cannot be used if the disks are attached to a SATA
controller handled by a driver using the new SATA framework (which
is my case: driver si3124). I have never heard of someone having
successfully used ZFS boot
Michael m.kucharski at bigfoot.com writes:
Excellent.
Oct 9 13:36:01 zeta1 scsi: [ID 107833 kern.warning] WARNING:
/pci at 2,0/pci1022,7458 at 8/pci11ab,11ab at 1/disk at 2,0 (sd13):
Oct 9 13:36:01 zeta1 Error for Command: read
Error Level: Retryable
Scrubbing
cases where neither ZFS nor any other checksumming
filesystem is capable of detecting anything (e.g. the sequence of events: data
is corrupted, checksummed, written to disk).
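That failure mode is easy to demonstrate: if the data is corrupted before the checksum is computed, the checksum faithfully covers the bad bytes and verification passes. A minimal sketch (hashlib's sha256 stands in for ZFS's block checksum):

```python
import hashlib

def write_block(data: bytes):
    # ZFS-style: the checksum is computed at write time and stored
    # alongside the block pointer.
    return data, hashlib.sha256(data).digest()

def verify(data: bytes, cksum: bytes) -> bool:
    return hashlib.sha256(data).digest() == cksum

good = b"important payload"
corrupted = b"imp0rtant payload"   # flipped BEFORE the checksum was computed

stored_data, stored_cksum = write_block(corrupted)
print(verify(stored_data, stored_cksum))  # True: the checksum covers bad data
```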
--
Marc Bevand
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http
Robert slask at telia.com writes:
I simply need to rename/remove one of the erroneous c2d0 entries/disks in
the pool so that I can use it in full again, since at this time I can't
reconnect the 10th disk in my raid and if one more disk fails all my
data would be lost (4 TB is a lot of disk to
William Fretts-Saxton william.fretts.saxton at sun.com writes:
Some more information about the system. NOTE: Cpu utilization never
goes above 10%.
Sun Fire v40z
4 x 2.4 GHz proc
8 GB memory
3 x 146 GB Seagate Drives (10k RPM)
1 x 146 GB Fujitsu Drive (10k RPM)
And what version of
William Fretts-Saxton william.fretts.saxton at sun.com writes:
I disabled file prefetch and there was no effect.
Here are some performance numbers. Note that, when the application server
used a ZFS file system to save its data, the transaction took TWICE as long.
For some reason, though,
Neil Perrin Neil.Perrin at Sun.COM writes:
The ZIL doesn't do a lot of extra IO. It usually just does one write per
synchronous request and will batch up multiple writes into the same log
block if possible.
Ok. I was wrong then. Well, William, I think Marion Hakanson has the
most plausible
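Neil's point about batching can be illustrated with a toy model: several pending synchronous requests get packed into one log-block write instead of one write each. The 4 KB log-block capacity below is an assumption for illustration, not the real ZIL block size.

```python
LOG_BLOCK_SIZE = 4096  # hypothetical log block capacity in bytes

def batch_into_log_blocks(requests):
    """Pack pending synchronous write records (sizes in bytes) into as
    few log blocks as possible; one block write satisfies several
    sync requests."""
    blocks, current, used = [], [], 0
    for size in requests:
        if used + size > LOG_BLOCK_SIZE and current:
            blocks.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        blocks.append(current)
    return blocks

# Seven small sync requests fit in two log-block writes instead of seven:
print(len(batch_into_log_blocks([600, 900, 1200, 800, 700, 1000, 500])))
```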
To answer Paul's question about how to upgrade to snv_73 (if you
still want to upgrade for another reason): actually I would recommend
you the latest SXDE (Solaris Express Developer Edition 1/08, based
on build 79). Boot from the install disc, and choose the Upgrade
Install option.
--
Marc Bevand
I figured the following ZFS 'success story' may interest some readers here.
I was interested to see how much sequential read/write performance it would be
possible to obtain from ZFS running on commodity hardware with modern features
such as PCI-E busses, SATA disks, well-designed SATA
Anton B. Rang rang at acm.org writes:
Be careful of changing the Max_Payload_Size parameter. It needs to match,
and be supported, between all PCI-E components which might communicate with
each other. You can tell what values are supported by reading the Device
Capabilities Register and
all successful (aggregate data rate of 610 MB/s
generated by reading the disks for 24+ hours, 6 million head
seeks performed by each disk, etc).
Thanks for your much appreciated comments.
--
Marc Bevand
Brandon High bhigh at freaks.com writes:
Do you have access to a Sil3726 port multiplier?
Nope. But AFAIK OpenSolaris doesn't support port multipliers yet. Maybe
FreeBSD does.
Keep in mind that three modern drives (334GB/platter) are all it takes to
saturate a SATA 3.0Gbps link.
It's also
Brandon High bhigh at freaks.com writes:
[...]
The lack of documentation for supported devices is a general complaint
of mine with Solaris x86, perhaps better taken to the opensolaris-discuss
list however.
I replied to all your questions in opensolaris-discuss.
-marc
Mark Shellenbaum Mark.Shellenbaum at Sun.COM writes:
# ls -V a
-rw-r--r--+ 1 root root 0 Mar 19 13:04 a
owner@:--:--I:allow
group@:--:--I:allow
everyone@:--:--I:allow
The ls(1) manpage
Sachin Palav palavsachin27 at indiatimes.com writes:
3. Currently there is no command that prints the entire configuration of ZFS.
Well there _is_ a command to show all (and only) the dataset properties
that have been manually zfs set:
$ zfs get -s local all
For the pool properties, zpool has
(Keywords: solaris hang zfs scrub heap space kernelbase marvell 88sx6081)
I am experiencing system hangs on a 32-bit x86 box with 1.5 GB RAM
running Solaris 10 Update 4 (with only patch 125205-07) during ZFS
scrubs of an almost full 3 TB zpool (6 disks on an AOC-SAT2-MV8
controller). I found out
For the record a parallel install of snv_83 on the same machine allows me to
set kernelbase to 0x8000 with no problem, no init crash. This increased the
kernel heap size to 1912 MB (up from 632 MB with kernelbase=0xd000 in
sol10u4) and the system doesn't hang anymore. The max heap usage I have
Pascal Vandeputte pascal_vdp at hotmail.com writes:
I'm at a loss, I'm thinking about just settling for the 20MB/s write
speeds with a 3-drive raidz and enjoy life...
As Richard Elling pointed out, the ~10ms per IO operation implies
seeking, or hardware/firmware problems. The mere fact you
Rustam rustam at code.az writes:
Didn't help. Keeps crashing.
The worst thing is that I don't know where the problem is. Any more ideas
on how to find it?
Lots of CKSUM errors like you see is often indicative of bad hardware. Run
memtest for 24-48 hours.
-marc
Tim tim at tcsac.net writes:
So we're still stuck in the same place we were a year ago. No high port
count pci-E compatible non-raid sata cards. You'd think with all the
demand SOMEONE would've stepped up to the plate by now. Marvell, cmon ;)
Here is a 6-port SATA PCI-Express x1 controller
Kyle McDonald KMcDonald at Egenera.COM writes:
Marc Bevand wrote:
Overall, like you I am frustrated by the lack of non-RAID inexpensive
native PCI-E SATA controllers.
Why non-raid? Is it cost?
Primarily cost, reliability (less complex hw = less hw that can fail),
and serviceability
Brandon High bhigh at freaks.com writes:
I'm going to be putting together a home NAS
based on OpenSolaris using the following:
1 SUPERMICRO CSE-743T-645B Black Chassis
1 ASUS M2N-LR AM2 NVIDIA nForce Professional 3600 ATX Server Motherboard
1 SUPERMICRO AOC-SAT2-MV8 64-bit
Marc Bevand m.bevand at gmail.com writes:
What I hate about mobos with no onboard video is that these days it is
impossible to find cheap fanless video cards. So usually I just go headless.
Didn't finish my sentence: ...fanless and *power-efficient*.
Most cards consume 20+W when idle
So you are experiencing slow I/O which is making the deletion of this clone
and the replay of the ZIL take forever. It could be because of random I/O ops,
or one of your disks which is dying (not reporting any errors, but very slow
to execute every single ATA command). You provided the output
Hernan Freschi hjf at hjf.com.ar writes:
Here's the output. Numbers may be a little off because I'm doing a nightly
build and compressing a crashdump with bzip2 at the same time.
Thanks. Your disks look healthy. But one question: why is it
c5t0/c5t1/c6t0/c6t1 when in another post you referred
Ben Middleton ben at drn.org writes:
[...]
But that simply had the effect of transferring the issue to the new drive:
When you see this behavior, it most likely means it's not your drive
which is failing, but instead it indicates a bad SATA/SAS cable, or
port on the disk controller.
PS: have
Buy a 2-port SATA II PCI-E x1 SiI3132 controller ($20). The solaris driver is
very stable.
Or, a solution I would personally prefer, don't use a 7th disk. Partition
each of your 6 disks with a small ~7-GB slice at the beginning and the rest of
the disk for ZFS. Install the OS in one of the
Richard L. Hamilton rlhamil at smart.net writes:
But I suspect to some extent you get what you pay for; the throughput on the
higher-end boards may well be a good bit higher.
Not really. Nowadays, even the cheapest controllers, processors, and mobos are
EASILY capable of handling the platter-speed
Weird. I have no idea how you could remove that file (beside destroying the
entire filesystem)...
One other thing I noticed:
NAME        STATE   READ WRITE CKSUM
rpool       ONLINE     0     0     8
  raidz1    ONLINE     0     0     8
    c0t7d0  ONLINE
I remember a similar problem with an AOC-SAT2-MV8 controller in a system of mine:
Solaris rebooted each time the marvell88sx driver tried to detect the disks
attached to it. I don't remember if it happened during installation, or during
the first boot after a successful install. I ended up spending a
Erik Trimble Erik.Trimble at Sun.COM writes:
* Huge RAM drive in a 1U small case (ala Cisco 2500-series routers),
with SAS or FC attachment.
Almost what you want:
http://www.superssd.com/products/ramsan-400/
128 GB RAM-based device, 3U chassis, FC and Infiniband connectivity.
However as a
Marc Bevand m.bevand at gmail.com writes:
I have recently had to replace this AOC-SAT2-MV8 controller with another one
(we accidentally broke a SATA connector during a maintenance operation). Its
firmware version is using a totally different numbering scheme (it's probably
more recent
Chris Cosby ccosby+zfs at gmail.com writes:
You're backing up 40TB+ of data, increasing at 20-25% per year.
That's insane.
Over time, backing up his data will require _fewer_ and fewer disks.
Disk sizes increase by about 40% every year.
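The arithmetic behind "fewer and fewer disks": with data growing ~22.5%/year (midpoint of 20-25%) and per-disk capacity growing ~40%/year, the required disk count shrinks by roughly 12% a year. A sketch, with the starting sizes assumed purely for illustration:

```python
import math

def disks_needed(years, data_tb=40.0, disk_tb=1.0,
                 data_growth=1.225, capacity_growth=1.40):
    """Disks required each year when data grows ~22.5%/yr while per-disk
    capacity grows ~40%/yr (starting sizes are assumptions)."""
    counts = []
    for _ in range(years):
        counts.append(math.ceil(data_tb / disk_tb))
        data_tb *= data_growth
        disk_tb *= capacity_growth
    return counts

print(disks_needed(6))  # the count shrinks year over year
```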
-marc
Matt Harrison iwasinnamuknow at genestate.com writes:
Aah, excellent, just did an export/import and its now showing the
expected capacity increase. Thanks for that, I should've at least tried
a reboot :)
More recent OpenSolaris builds don't even need the export/import anymore when
It looks like you *think* you are trying to add the new drive, when you are in
fact re-adding the old (failing) one. A new drive should never show up as
ONLINE in a pool with no action on your part, if only because it contains no
partition and no vdev label with the right pool GUID.
If I am
I noticed some errors in ls(1), acl(5) and the ZFS Admin Guide about ZFS/NFSv4
ACLs:
ls(1): read_acl (r) Permission to read the ACL of a file. The compact
representation of read_acl is c, not r.
ls(1): -c | -v    The same as -l, and in addition displays the [...] The
options are in fact
Vanja vanjab at gmail.com writes:
And finally, if this is the case, is it possible to make an array with
3 drives, and then add the mirror later?
I assume you are asking if it is possible to create a temporary 3-way raidz,
then transfer your data to it, then convert it to a 4-way raidz? No
Alan alan at peak.org writes:
I was just thinking of a similar feature request: one of the things
I'm doing is hosting vm's. I build a base vm with standard setup in a
dedicated filesystem, then when I need a new instance zfs clone and voila!
ready to start tweaking for the needs of the new
Bryan, Thomas: these hangs of 32-bit Solaris under heavy (fs, I/O) loads are a
well known problem. They are caused by memory contention in the kernel heap.
Check 'kstat vmem::heap'. The usual recommendation is to change the
kernelbase. It worked for me. See:
Borys Saulyak borys.saulyak at eumetsat.int writes:
root at omases11:~[8]#zpool import
[...]
pool: private
id: 3180576189687249855
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
private ONLINE
Borys Saulyak borys.saulyak at eumetsat.int writes:
Your pools have no redundancy...
Box is connected to two fabric switches via different HBAs, storage is
RAID5, MPxIO is ON, and all after that my pools have no redundancy?!?!
As Darren said: no, there is no redundancy that ZFS can use.
Tim tim at tcsac.net writes:
That's because the faster SATA drives cost just as much money as
their SAS counterparts for less performance and none of the
advantages SAS brings such as dual ports.
SAS drives are far from always being the best choice, because absolute IOPS or
throughput
Marc Bevand m.bevand at gmail.com writes:
Well let's look at a concrete example:
- cheapest 15k SAS drive (73GB): $180 [1]
- cheapest 7.2k SATA drive (160GB): $40 [2] (not counting a 80GB at $37)
The SAS drive most likely offers 2x-3x the IOPS/$. Certainly not 180/40=4.5x
Doh! I said
Erik Trimble Erik.Trimble at Sun.COM writes:
Bottom line here is that when it comes to making statements about SATA
vs SAS, there are ONLY two statements which are currently absolute:
(1) a SATA drive has better GB/$ than a SAS drive
(2) a SAS drive has better throughput and IOPs than a
About 2 years ago I used to run snv_55b with a raidz on top of 5 500GB SATA
drives. After 10 months I ran out of space and added a mirror of 2 250GB
drives to my pool with zpool add. No problem. I scrubbed it weekly. I only saw 1
CKSUM error one day (ZFS self-healed itself automatically of course).
Charles Menser charles.menser at gmail.com writes:
Nearly every time I scrub a pool I get small numbers of checksum
errors on random drives on either controller.
These are the typical symptoms of bad RAM/CPU/Mobo. Run memtest for 24h+.
-marc
Ross myxiplx at googlemail.com writes:
Now this is risky if you don't have backups, but one possible approach might
be:
- Take one of the 1TB drives off your raid-z pool
- Use your 3 1TB drives, plus two sparse 1TB files and create a 5 drive
raid-z2
- disconnect the sparse files. You now
Robert Rodriguez robertro at comcast.net writes:
A couple of follow up question, have you done anything similar before?
I have done similar manipulations to experiment with ZFS
(using files instead of drives).
Can you assess the risk involved here?
If any one of your 8 drives die during the
Carsten Aulbert carsten.aulbert at aei.mpg.de writes:
Put some stress on the system with bonnie and other tools and try to
find slow disks
Just run iostat -Mnx 2 (not zpool iostat) while ls is slow to find the slow
disks. Look at the %b (busy) values.
-marc
Aaron Blew aaronblew at gmail.com writes:
I've done some basic testing with a X4150 machine using 6 disks in a
RAID 5 and RAID Z configuration. They perform very similarly, but RAIDZ
definitely has more system overhead.
Since hardware RAID 5 implementations usually do not checksum data
Carsten Aulbert carsten.aulbert at aei.mpg.de writes:
In RAID6 you have redundant parity, thus the controller can find out
if the parity was correct or not. At least I think that to be true
for Areca controllers :)
Are you sure about that? The latest research I know of [1] says that
Carsten Aulbert carsten.aulbert at aei.mpg.de writes:
Well, I probably need to wade through the paper (and recall Galois field
theory) before answering this. We did a few tests in a 16 disk RAID6
where we wrote data to the RAID, powered the system down, pulled out one
disk, inserted it into
Mattias Pantzare pantzare at gmail.com writes:
He was talking about errors that the disk can't detect (errors
introduced by other parts of the system, writes to the wrong sector or
very bad luck). You can simulate that by writing different data to the
sector,
Well yes you can. Carsten and I
Mattias Pantzare pantzare at gmail.com writes:
On Tue, Dec 30, 2008 at 11:30, Carsten Aulbert wrote:
[...]
where we wrote data to the RAID, powered the system down, pulled out one
disk, inserted it into another computer and changed the sector checksum
of a few sectors (using hdparm's
The copy operation will make all the disks start seeking at the same time and
will make your CPU activity jump to a significant percentage to compute the
ZFS checksum and RAIDZ parity. I think you could be overloading your PSU
because of the sudden increase in power consumption...
However if
dick hoogendijk dick at nagual.nl writes:
I live in Holland and it is not easy to find motherboards that (a)
truly support ECC ram and (b) are (Open)Solaris compatible.
Virtually all motherboards for AMD processors support ECC RAM because the
memory controller is in the CPU and all AMD CPUs
dick hoogendijk dick at nagual.nl writes:
Then why is it that most AMD MoBo's in the shops clearly state that ECC
Ram is not supported on the MoBo?
To restate what Erik explained: *all* AMD CPUs support ECC RAM, however poorly
written motherboard specs often make the mistake of confusing
Bill Moore Bill.Moore at sun.com writes:
Moving on, modern high-capacity SATA drives are in the 100-120MB/s
range. Let's call it 125MB/s for easier math. A 5-port port multiplier
(PM) has 5 links to the drives, and 1 uplink. SATA-II speed is 3Gb/s,
which after all the framing overhead,
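The back-of-the-envelope math for the port multiplier goes like this (a sketch: the 8b/10b line encoding accounts for most of the loss, and the exact framing overhead beyond that is glossed over):

```python
# 5 drives behind a SATA-II port multiplier sharing one 3 Gb/s uplink.
LINK_GBPS = 3.0
uplink_mb_s = LINK_GBPS * 1e9 * (8 / 10) / 8 / 1e6   # 8b/10b -> ~300 MB/s payload
drives, per_drive_mb_s = 5, 125.0                    # from the figures above

print(f"drives can source {drives * per_drive_mb_s:.0f} MB/s, "
      f"uplink carries ~{uplink_mb_s:.0f} MB/s")
print(f"-> ~{uplink_mb_s / drives:.0f} MB/s per drive when all 5 stream at once")
```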
Marc Bevand m.bevand at gmail.com writes:
So in conclusion, my SBNSWAG (scientific but not so wild-ass guess)
is that the max I/O throughput when reading from all the disks on
1 of their storage pod is about 1000MB/s.
Correction: the SiI3132 are on x1 (not x2) links, so my guess
Tim Cook tim at cook.ms writes:
Whats the point of arguing what the back-end can do anyways? This is bulk
data storage. Their MAX input is ~100MB/sec. The backend can more than
satisfy that. Who cares at that point whether it can push 500MB/s or
5000MB/s? It's not a database processing
Neal Pollack Neal.Pollack at Sun.COM writes:
Pliant Technologies just released two Lightning high performance
enterprise SSDs that threaten to blow away the competition.
One can build an SSD-based storage device that gives you:
o 320GB of storage capacity (2.1x better than their 2.5" model:
Richard Connamacher rich at indieimage.com writes:
I was thinking of custom building a server, which I think I can do for
around $10,000 of hardware (using 45 SATA drives and a custom enclosure),
and putting OpenSolaris on it. It's a bit of a risk compared to buying a
$30,000 server, but
Richard Connamacher rich at indieimage.com writes:
Also, one of those drives will need to be the boot drive.
(Even if it's possible I don't want to boot from the
data drive, need to keep it focused on video storage.)
But why?
By allocating 11 drives instead of 12 to your data pool, you will
Frank Middleton f.middleton at apogeect.com writes:
As noted in another thread, 6GB is way too small. Based on
actual experience, an upgradable rpool must be more than
20GB.
It depends on how minimal your install is.
The OpenSolaris install instructions recommend 8GB minimum, I have
one
Bob Friesenhahn bfriesen at simple.dallas.tx.us writes:
[...]
X25-E's write cache is volatile), the X25-E has been found to offer a
bit more than 1000 write IOPS.
I think this is incorrect. On the paper the X25-E offers 3300 random write
4kB IOPS (and Intel is known to be very conservative
Bob Friesenhahn bfriesen at simple.dallas.tx.us writes:
The Intel specified random write IOPS are with the cache enabled and
without cache flushing.
For random write I/O, caching improves I/O latency not sustained I/O
throughput (which is what random write IOPS usually refer to). So Intel
Russ Price rjp_sun at fubegra.net writes:
I had recently started setting up a homegrown OpenSolaris NAS with
a large RAIDZ2 pool, and had found its RAIDZ2 performance severely
lacking - more like downright atrocious. As originally set up:
* Asus M4A785-M motherboard
* Phenom II X2 550
Russ Price rjp_sun at fubegra.net writes:
Did you enable AHCI mode on _every_ SATA controller?
I have the exact opposite experience with 2 of your 3
types of controllers.
It wasn't possible to do so, and that also made me think that a real HBA would
work better. First off, with the
Oliver Seidel osol at os1.net writes:
Hello,
I'm a grown-up and willing to read, but I can't find where to read.
Please point me to the place that explains how I can diagnose this
situation: adding a mirror to a disk fills the mirror with an
apparent rate of 500k per second.
I don't
I have done quite some research over the past few years on the best (ie.
simple, robust, inexpensive, and performant) SATA/SAS controllers for ZFS.
Especially in terms of throughput analysis (many of them are designed with an
insufficient PCIe link width). I have seen many questions on this
The LSI SAS1064E slipped through the cracks when I built the list.
This is a 4-port PCIe x8 HBA with very good Solaris (and Linux)
support. I don't remember having seen it mentioned on zfs-discuss@
before, even though many were looking for 4-port controllers. Perhaps
the fact it is priced too
Marc Nicholas geekything at gmail.com writes:
Nice write-up, Marc. Aren't the SuperMicro cards their funny UIO form
factor? Wouldn't want someone buying a card that won't work in a standard
chassis.
Yes, 4 of the 6 Supermicro cards are UIO cards. I added a warning about it.
Thanks.
-mrb
Thomas Burgess wonslung at gmail.com writes:
A really great alternative to the UIO cards for those who don't want the
headache of modifying the brackets or cases is the Intel SASUC8I
This is a rebranded LSI SAS3081E-R
It can be flashed with the LSI IT firmware from the LSI website and
Deon Cui deon.cui at gmail.com writes:
So I had a bunch of them lying around. We've bought a 16x SAS hotswap
case and I've put in an AMD X4 955 BE with an ASUS M4A89GTD Pro as
the mobo.
In the two 16x PCI-E slots I've put in the 1068E controllers I had
lying around. Everything is still
On Wed, May 26, 2010 at 6:09 PM, Giovanni Tirloni gtirl...@sysdroid.com wrote:
On Wed, May 26, 2010 at 9:22 PM, Brandon High bh...@freaks.com wrote:
I'd wager it's the PCIe x4. That's about 1000MB/s raw bandwidth, about
800MB/s after overhead.
Makes perfect sense. I was calculating the
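Brandon's numbers check out (a sketch: PCIe 1.x at 2.5 Gb/s per lane with 8b/10b encoding; the 20% packet overhead figure is a rough assumption, not a measured value):

```python
lanes = 4
lane_gbps = 2.5                        # PCIe 1.x line rate per lane
raw_mb_s = lanes * lane_gbps * 1e9 * (8 / 10) / 8 / 1e6  # 8b/10b -> MB/s
usable_mb_s = raw_mb_s * 0.8           # ~20% lost to TLP/DLLP packet overhead

print(f"raw {raw_mb_s:.0f} MB/s, ~{usable_mb_s:.0f} MB/s after protocol overhead")
```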
Graham McArdle graham.mcardle at ccfe.ac.uk writes:
This thread from Marc Bevand and his blog linked therein might have some
useful alternative suggestions.
http://opensolaris.org/jive/thread.jspa?messageID=480925
I've bookmarked it because it's quite a handy summary and I hope he keeps
(I am aware I am replying to an old post...)
Arne Jansen sensille at gmx.net writes:
Now the test for the Vertex 2 Pro. This was fun.
For more explanation please see the thread Crucial RealSSD C300 and cache
flush?
This time I made sure the device is attached via 3GBit SATA. This is also
Marc Bevand m.bevand at gmail.com writes:
This discrepancy between tests with random data and zero data is puzzling
to me. Does this suggest that the SSD does transparent compression between
its Sandforce SF-1500 controller and the NAND flash chips?
Replying to myself: yes, SF-1500 does
Richard Jacobsen richard at unixboxen.net writes:
Hi all,
I'm getting a very strange problem with a recent OpenSolaris b134 install.
System is:
Supermicro X5DP8-G2 BIOS 1.6a
2x Supermicro AOC-SAT2-MV8 1.0b
As Richard pointed out this is a bug in the AOC-SAT2-MV8 firmware 1.0b.
It