So I have read in the ZFS Wiki:
# The minimum size of a log device is the same as the minimum size of device
in pool, which is 64 Mbytes. The amount of in-play data that might be stored on
a log device is relatively small. Log blocks are freed when the log transaction
(system call) is
You were just lucky before and unlucky now.
I had a PC back in like Pentium-133 days go CRASH because I moved it too
roughly while the drive was spinning.
I have moved many PCs in my life with drives spinning, no problems, but I don't COUNT
on it and avoid it if humanly possible. Don't people do it
Thanks I think I get it now.
Do you think having log on a 15K RPM drive with the main pool composed of 10K
RPM drives will show worthwhile improvements? Or am I chasing a few percentage
points?
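For what it's worth, a separate log device is cheap to experiment with, since it can be added to a live pool; the pool and device names below are placeholders:

```shell
# Add the 15K RPM drive as a dedicated log (ZIL) device
zpool add tank log c4t0d0

# Compare per-vdev activity under a synchronous-write workload
zpool iostat -v tank 5
```

The gain shows up mainly on synchronous writes (NFS service, databases); for mostly-async workloads it may well be only a few percentage points.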
I don't have money for new hardware like an SSD. Just recycling some old components
here and there.
JZ, cease and desist all this junk e-mail. I am adding you to my spam filters.
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
I happen to have some 3510, 3511 on SAN, and older 3310 direct-attach arrays
around here.
Also some newer 2540 arrays.
Our preferred setup for the past year or so, is 2 arrays available to the
server.
From each array make 2 LUNS available. Take these LUNs on the server and
ZFS them as
Dear Admin
you said:
choose a 3-disk RAID-1 or a 4-disk RAID10 set for tank over RAIDZ. With a 3-disk mirror you'd have a disk left over to be a hot spare for a failure in either rpool or tank.
Why do you prefer RAID-1 with one spare rather than RAIDZ?
I heard RAIDZ has better redundancy than RAID 1 or 5?!
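For comparison, the two layouts under discussion would be built roughly like this (disk names are placeholders):

```shell
# 3-disk mirror for tank, with the leftover disk as a hot spare
zpool create tank mirror c1t1d0 c1t2d0 c1t3d0
zpool add tank spare c1t4d0

# versus a 4-disk single-parity RAIDZ
zpool create tank raidz c1t1d0 c1t2d0 c1t3d0 c1t4d0
```

The 3-way mirror survives the loss of any two of its disks; the 4-disk RAIDZ survives only one, though it yields more usable space.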
Dear Admin, I have a server with 6 HDDs. In a fresh
installation I selected two disks for mirroring,
created rpool, and installed Solaris 10, but I need
more space than the default rpool file systems for
installing my application (for example MySQL), so I
decided
To answer original post, simple answer:
Almost all old RAID designs have holes in their logic where they are
insufficiently paranoid on writes or reads, and sometimes both. One example
is the infamous RAID-5 write hole.
Look at a simple example of mirrored SVM versus ZFS on page 1516 of
Seems a lot simpler to create a multi-way mirror.
Then symlink your /opt/BIND or whatever off to new place.
# zpool create opt-new mirror c3t40d0 c3t40d1 c3t40d2 c3t40d3
# zpool status
pool: opt-new
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
Just put commands to create them in the finish script.
I create several and set options on them like so:
### Create ZFS additional filesystems
echo setting up additional filesystems.
zfs create -o compression=on -o mountpoint=/ucd rpool/ucd
zfs create -o compression=on -o mountpoint=/local/d01
Whether 'tis nobler.
Just wondering if (excepting the existing zones thread) there are any
compelling arguments to keep /var as its own filesystem for your typical
Solaris server. Web servers and the like.
Or arguments against it.
It just seems like in a typical ZFS root install the need for a separate /var
is difficult for me to justify now. By default there are no quotas or
reservations set on /var. Okay I set them.
I have a monitoring system able to tell me when disks are getting full. It
seems easier to say just
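For reference, the quota and reservation mentioned above are each a one-line property change; the sizes and boot-environment dataset name here are made up:

```shell
# Cap /var so runaway logs can't fill the root pool
zfs set quota=8G rpool/ROOT/myBE/var
# Guarantee /var some space so other datasets can't starve it
zfs set reservation=2G rpool/ROOT/myBE/var
```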
Followup to my own post.
Looks like my SVM setup was having problems prior to patch being applied.
If I boot net:dhcp -s and poke around on the disks, it looks like disk0 is
pre-patch state and disk1 is post-patch.
I can get a shell if I
boot disk1 -s
So I think I am in SVM hell here not
Reviving this thread.
We have a Solaris 10u4 system recently patched with 137137-09.
Unfortunately the patch was applied from multi-user mode; I wonder if this
may have been the original poster's problem as well? Anyhow, we are now stuck
with an unbootable system as well.
I have submitted a case to
The SupportTech responding to case #66153822 so far
has only suggested boot from cdrom and patchrm 137137-09
which tells me I'm dealing with a level-1 binder monkey.
It's the idle node of a cluster holding 10K email accounts
so I'm proceeding cautiously. It is unfortunate the admin doing
the
I don't want to steer you wrong under the circumstances, so I think we need more information.
First, is the failure the same as in the earlier part of this thread? I.e., when you boot, do you get a failure like this?
Warning: Fcode sequence resulted in a net stack depth change of
I noticed this while patching to 137137-09 on a UFS Sparc today:
Patch 137137-09 has been successfully installed.
See /var/run/.patchSafeMode/root/var/sadm/patch/137137-09/log for details
Executing postpatch script...
Detected SVM root.
Installing bootblk on /dev/rdsk/c1t0d0s0
Installing bootblk
Ummm, could you name a specific patch number that would apply to a stock
install?
Or suggest a way to search?
I poked at the 10_PatchReport from Nov 1st which has a handful but none of
the ones I picked out were needed on a 10u6 OEM install. I kept seeing patches
that applied to
Never mind, I ran through the cluster_install for 10_Recommended and found
patch 126868-02 is new so I used that one.
Just wondering if anyone knows of a patch released for 10u6?
I realize this is OT but want to test my new ability with ZFS root to do
lucreate, patch the alternate BE, and luactivate it.
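The cycle being tested would look roughly like this (the BE name and patch location are hypothetical):

```shell
# Clone the running boot environment
lucreate -n patched-be
# Patch the inactive clone instead of the live system
luupgrade -t -n patched-be -s /var/tmp/patches 137137-09
# Switch to it on next reboot
luactivate patched-be
init 6
```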
Fit-PC Slim uses Geode LX800 which is 500 MHz CPU with 512 megs RAM.
Well... it's easy to disable graphical login:
svcadm disable cde-login
The problem is there's no option during install to say no graphics, so during
firstboot it's going to try anyhow. At which point my console is hosed
Anyone tried Nevada with ZFS on small platforms?
I ordered one of these:
http://www.fit-pc.com/new/fit-pc-slim-specifications.html
Planning to stick in a 160-gig Samsung drive and use it as a lightweight
household server. Probably some Samba usage, and a tiny bit of Apache and
RADIUS. I don't
You know that you need a minimum of 2 disks to form a (mirrored) pool with ZFS? A pool with no redundancy is not a good idea!
According to the slides I have seen, a ZFS filesystem even on a single disk can
handle massive amounts of sector failure before it becomes unusable. I seem
to
Has anyone done a script to check for filesystem problems?
On our existing UFS infrastructure we have a cron job run metacheck.pl
periodically so we get email if an SVM setup has problems.
We can scratch something like this together, but maybe someone else already has.
So after tinkering with lucreate and luactivate I now have several boot
environments but the active one is unfortunately not bootable.
How can I access the luactivate command from boot:net?
boot net:dhcp -s
mkdir /tmp/mnt
zpool import -R /tmp/mnt rpool
I poke around in /tmp/mnt but do not find
Thanks I have restated the question over there.
Just thought this a ZFS question since I am doing Sparc ZFS root mirrors.
Dunno about the text installer mentioned in other replies as I never use it.
JumpStart installs working fine though with ZFS root.
I am also in finish script pre-creating some additional filesystems etc.
No, the last arguments are not options. Unfortunately, the syntax doesn't provide a way to specify compression at creation time. It should, though. Or perhaps compression should be the default.
Should I submit an RFE somewhere then?
Does it seem feasible/reasonable to enable compression on ZFS root disks during
JumpStart?
Seems like it could buy some space, and perhaps some performance.
Did you enable it in the jumpstart profile somehow?
If you do it after install the OS files are not compressed.
Just make SURE the other host is actually, truly DEAD!
If for some reason it's simply wedged, or you have lost console access but
hostA is still live, then you can end up with 2 systems having access to the same
ZFS pool.
I have done this in test, 2 hosts accessing same pool, and the result is
Why would the customer need to use raidz or zfs mirroring if the array is doing it for them? As someone else posted, metadata is already redundant by default and doesn't consume a ton of space.
Because array drives can suffer silent errors in the data that are not found
until too
All, I'm currently working out details on an upgrade
from UFS/SDS on DAS to ZFS on a SAN fabric. I'm
interested in hearing how ZFS has behaved in more
traditional SAN environments using gear that scales
vertically, like EMC CLARiiON/HDS AMS/3PAR etc.
I'm not sure why you'd want to separate out all these filesystems on a root disk
these days. The reason I recall needing to do it over a decade ago was
because disks were so small that perhaps you couldn't FIT /opt onto the same
disk with /usr. So you needed to be able to say /usr is on this
Once upon a time I ran a lab with a whole bunch of SGI workstations.
A company that barely exists now.
This ButterFS may be the Next Big Thing. But I recall one time how hot
everyone was for Reiser. Look how that turned out.
3 years is an entire production lifecycle for the systems in this
We have 50,000 users worth of mail spool on ZFS.
So we've been trusting it for production usage for THE most critical visible
enterprise app.
Works fine. Our stores are ZFS RAID-10 built of LUNS from pairs of 3510FC.
Had an entire array go down once, the system kept going fine. Brought the
I'm not sure why people obsess over this issue so much. Disk is cheap.
We have a fair number of 3510 and 2540 on our SAN. They make RAID-5 LUNs
available to various servers.
On the servers we take RAID-5 LUNs from different arrays and ZFS mirror them.
So if any array goes away we are still
Followup with modified test plan:
1) Yank disk0 from V240.
Waited for it to be marked FAULTED in zpool status -x
2) Inserted new disk0 scavenged from another system
3) Ran format to set s0 as full-disk to agree with other system
4) Halted system
5) boot disk1
Wanted to make sure Jumpstart mirror
You want to install the zfs boot block, not the ufs bootblock.
Oh duh. I tried to correct my mistake using this:
installboot /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c1t0d0s0
And now get this:
Boot device: disk File and args:
Can't mount root
Evaluating:
The file just loaded
Your key problem is going to be:
Will Sun use SLC or MLC?
From what I have read, the trend now is towards MLC chips, which have a much lower
number of write cycles but are cheaper and offer more storage. So then they end up
layering ECC and wear-levelling on top to address this shortened life-span. A
This is one of those issues where the developers generally seem to think that
old-style quotas are legacy baggage, and that people running large
home-directory sort of servers with 10,000+ users are a minority that can
safely be ignored.
I can understand their thinking. However it does
So I decided to test out failure modes of ZFS root mirrors.
Installed on a V240 with nv90. Worked great.
Pulled out disk1, then replaced it and attached again, resilvered, all good.
Now I pull out disk0 to simulate failure there. OS up and running fine, but
lots of error messages about SYNC
Ummm, could you back up a bit there?
What do you mean the disk isn't sync'd so boot should fail? I'm coming from UFS,
of course, where I'd expect to be able to fix a damaged boot drive as it drops
into a single-user root prompt.
I believe I did try boot disk1, but that failed, I think due to prior
So can I jumpstart and set up the ZFS root config?
Anyone have example profile?
Ahh!!!
I can resize swap on a live system.
I shudder to think how many times in my sysadmin career I've had to resolve
insufficient swap issues the HARD WAY with downtime.
If I ever meet any ZFS devs your bar tab is covered.
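For anyone following along, resizing swap on a ZFS root is just a zvol property change; the 4G figure is an example:

```shell
# Drop the swap zvol from use, grow it, and add it back
swap -d /dev/zvol/dsk/rpool/swap
zfs set volsize=4G rpool/swap
swap -a /dev/zvol/dsk/rpool/swap
```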
Way to drag my post into the mud there.
Ever try adding a swapfile onto the same disk where there's already a swap
partition? At least in my experience this led to really poor performance. Can we
just move on?
I run 3510FC and 2540 units in pairs. I build 2 5-disk RAID5 LUNs in each
array, with 2 disks as global spares. Each array has dual controllers and I'm
doing multipath.
Then from the server I have access to 2 LUNs from 2 arrays, and I build a ZFS
RAID-10 set from these 4 LUNs being sure
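A minimal sketch of that layout, mirroring each LUN against its counterpart on the other array and striping across the mirrors (device names are placeholders; c6 and c7 stand for the two arrays):

```shell
# Each mirror pair spans both arrays, so either whole array can die
zpool create mailpool \
  mirror c6t0d0 c7t0d0 \
  mirror c6t0d1 c7t0d1
```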
So it's pushed back to build 90 now?
There was an announcement the other day that build 88 was being skipped and
build 89 would be the official release with ZFS boot.
Not a big deal but someone should do an announcement about the change.
Intel EagleLake boards look very promising for this purpose. Compact low-power
home NAS but with some power to it. Onboard gigabit.
http://www.mini-itx.com/2008/03/06/intels-eaglelake-mini-itx-boards
Cyrus mail-stores for UC Davis are in ZFS.
Began as failure, ended as success. We hit the FSYNC performance issue and our
systems collapsed under user load. We could not track it down, and neither
could the Sun reps we contacted. Eventually I found a reference to the FSYNC bug
and we tried out
I would hope at least it has that giant FSYNC patch for ZFS already present?
We ran into this issue and it nearly killed Solaris as a product here in our Data
Center, it was such a bad experience.
Fix was in 127728 (x86) and 127729 (Sparc).
Well anyhow good to see U5 is out, hadn't even heard
You DO mean IPMP then. That's what I was trying to sort out, to make sure that
you were talking about the IP part of things, the iSCSI layer. And not the
paths from the target system to its local storage.
You say non-ethernet for your network transport; what ARE you using?
Oh sure, pick nits. Yeah, I should have said network multipath instead of
ethernet multipath, but really, how often do I encounter non-ethernet networks?
I can't recall the last time I saw a token ring or anything else.
I don't think ANY situation in which you are mirrored and one half of the
mirror pair becomes unavailable will panic the system. At least this has been
the case when I've tested with local storage; I haven't tried with iSCSI yet but
will give it a whirl.
I had a simple single ZVOL shared over
Followup, my initiator did eventually panic.
I will have to do some setup to get a ZVOL from another system to mirror with,
and see what happens when one of them goes away. Will post in a day or two on
that.
I assume you mean IPMP here, which refers to ethernet multipath.
There is also the other meaning of multipath referring to multiple paths to the
storage array typically enabled by stmsboot command.
We run active-passive (failover) IPMP as it keeps things simple for us and I
have run into some
Fascinating read, thanks Simon!
I have been using ZFS in production data center for some while now, but it
never occurred to me to use iSCSI with ZFS also.
This gives me some ideas on how to backup our mail pools into some older slower
disks offsite. I find it interesting that while a local
We are working very hard to get it into build 88.
*sigh*
Last I heard it was going to be build 86. I saw build 85 come out and thought
GREAT only a couple more weeks!
Oh well..
Will we ever be able to boot from a RAIDZ pool, or is that fantasy?
Insufficient data.
How big is the pool? How much stored?
Are the external drives all on the same USB bus?
I am switching to eSATA for my next external drive setup as both USB 2.0 and
firewire are just too fricking slow for the large drives these days.
Is it still the case that there is a kernel panic if the device(s) with the ZFS
pool die?
I was thinking to attach some cheap SATA disks to a system to use for nearline
storage. Then I could use ZFS send/recv on the local system (without ssh) to
keep backups of the stuff in the main pool.
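Local send/recv needs no ssh at all; it's just a pipe between pools on the same host (pool and snapshot names are examples):

```shell
# Full initial copy to the nearline pool
zfs snapshot tank/data@mon
zfs send tank/data@mon | zfs recv nearline/data

# Thereafter, send only the changes since the last snapshot
zfs snapshot tank/data@tue
zfs send -i @mon tank/data@tue | zfs recv nearline/data
```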
I recall reading that ZFS boot/root will be possible in NV_86 Sparc coming soon.
Do we expect to be able to to define disk setup in the Jumpstart profile at
that time?
Or will that come later?
Let's say you are paranoid and have built a pool with 40+ disks in a Thumper.
Is there a way to set metadata copies=3 manually?
After having built RAIDZ2 sets with 7-9 disks and then pooled these together,
it just seems like a little bit of extra insurance to increase metadata copies.
I don't
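As far as I know there is no knob for metadata copies by itself; the nearest control is the per-dataset `copies` property, which raises data redundancy (ZFS scales its metadata ditto blocks alongside):

```shell
# Keep 3 copies of every block in this dataset,
# on top of the pool's vdev-level redundancy;
# affects newly written data only
zfs set copies=3 tank/important
```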
-Setting zfs_nocacheflush, though, got me drastically increased throughput--client requests took, on average, less than 2 seconds each!
So, in order to use this, I should have a storage array, w/ battery backup, instead of using the internal drives, correct? I have the option of using a
Solaris 10u4, eh?
Sounds a lot like fsync issues we ran into, trying to run Cyrus mail-server
spools in ZFS.
This was highlighted for us by the filebench software varmail test.
OpenSolaris nv78 however worked very well.
Does anyone have any particularly creative ZFS replication strategies they
could share?
I have 5 high-performance Cyrus mail-servers, with about a Terabyte of storage
each, of which only 200-300 gigs is used, even including 14 days of
snapshot space.
I am thinking about setting up a
I package up 5 or 6 disks into a RAID-5 LUN on our Sun 3510 and 2540 arrays.
Then I use ZFS to RAID-10 these volumes.
Safety first!
Quite frankly I've had ENOUGH of rebuilding trashed filesystems. I am tired
of chasing performance like it's the Holy Grail and shoving other
considerations
So the point is, a JBOD with a flash drive in one (or two to mirror the ZIL)
of the slots would be a lot SIMPLER.
We've all spent the last decade or two offloading functions into specialized
hardware, which has turned into these massive, unnecessarily complex things.
I don't want to go to a
zfs boot on sparc will not be putback on its own. It will be putback with the rest of zfs boot support, sometime around build 86.
Does this still seem likely to occur, or will it be pushed back further? I see
that build 81 is out today which means we are not far from seeing ZFS boot on
Awesome work you and your team are doing. Thanks Lori!
Are you already running with zfs_nocacheflush=1? We have SAN arrays with dual
battery-backed controllers for the cache, so we definitely have this set on all
our production systems. It makes a big difference for us.
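For reference, the setting is a line in /etc/system and takes effect at the next boot; it is only safe when every device in the pool sits behind non-volatile cache:

```shell
# /etc/system -- disable ZFS cache-flush commands to the arrays
set zfs:zfs_nocacheflush = 1
```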
As I said before I don't see the catastrophe in disabling ZIL though.
We
No, we're not using zfs_nocacheflush=1, but our SAN arrays are set to cache all writebacks, so it shouldn't be needed. I may test this, if I get the chance to reboot one of the servers, but I'll bet the storage arrays are working correctly.
Bzzzt, wrong.
Read up on a few threads
As other poster noted, you can disable it completely for testing.
From my understanding though, it's not as production-catastrophic as it
sounds to delay or disable the ZIL.
Many people run Linux boxes with ext3 in the standard setting, which only
journals metadata, not file content. So the
We loaded Nevada_78 on a peer T2000 unit. Imported the same ZFS pool. I
didn't even upgrade the pool since we wanted to be able to move it back to
10u4. Cut 'n paste of my colleague's email with the results:
Here's the latest Pepsi Challenge results.
Sol10u4 vs Nevada78. Same tuning
So does anyone have any insight on BugID 6535160?
We have verified on a similar system, that ZFS shows big latency in filebench
varmail test.
We formatted the same LUN with UFS and latency went down from 300 ms to 1-2 ms.
http://sunsolve.sun.com/search/document.do?assetkey=1-1-6535160-1
We
) The write cache is non-volatile, but ZFS hasn't been configured to stop flushing it (set zfs:zfs_nocacheflush = 1).
These are a pair of 2540 with dual-controllers, definitely non-volatile cache.
We set the zfs_nocacheflush=1 and that improved things considerably.
ZFS filesystem (2540
On Wed, 5 Dec 2007, Brian Hechinger wrote:
[1] Finally, someone built a flash SSD that rocks (and they know how fast it is judging by the pricetag):
http://www.tomshardware.com/2007/11/21/mtron_ssd_32_gb/
http://www.anandtech.com/storage/showdoc.aspx?i=3167
Great now if only Sun would
Thanks for your observations.
HOWEVER, I didn't pose the question "How do I architect the HA and storage and
everything for an email system?"
Our site, like many other data centers, has HA standards and politics and all
this other baggage that may lead a design to a certain point. Thus our answer
On Dec 1, 2007 7:15 AM, Vincent Fox
Any reason why you are using a mirror of raid-5 lun's?
I can understand that perhaps you want ZFS to be in control of rebuilding broken vdev's, if anything should go wrong ... but rebuilding RAID-5's seems a little over the top.
Because the decision
Sounds good so far: lots of small files in a largish system with presumably significant access parallelism makes RAID-Z a non-starter, but RAID-5 should be OK, especially if the workload is read-dominated. ZFS might aggregate small writes such that their performance would be good as well
From Neil's comment in the blog entry that you referenced, that sounds *very* dicey (at least by comparison with the level of redundancy that you've built into the rest of your system) - even if you have rock-solid UPSs (which have still been known to fail). Allowing a disk to lie to higher
Bill, you have a long-winded way of saying "I don't know." But thanks for
elucidating the possibilities.
We will be using Cyrus to store mail on 2540 arrays.
We have chosen to build 5-disk RAID-5 LUNs in 2 arrays which are both connected
to the same host, and to mirror and stripe the LUNs. So a ZFS RAID-10 set composed of
4 LUNs. Multi-pathing is also in use for redundancy.
My question is any guidance on
The info in that tuning guide depends on what Solaris version you are working
with. Last I checked it was not current.
I use Solaris 10u4 and have zfs_nocacheflush set. Haven't played with using
alternate disks for the ZIL yet; not really sure what that does to my HA model. I
have mirrored LUNS
Hmmm, well it depends on what you are looking for. Is the speed not enough, or
the size of the RAM? I am thinking people found out the original GLY would
actually work with a 2-gig DIMM. So it's possible the GLY2 will accept 2-gig
also, which seems plenty for me. YMMV.
The new Intel D201GLY2 looks quite good.
Fanless 64-bit CPU, low-power consumption from what I have read. Awaiting
first substantive review from SilentPCReview.com before ordering one.
Thanks, we have switched over a couple of our arrays. Have not noticed a
performance change so perhaps the effect is minor.
Yes we are using ZFS to do the mirroring between the array LUNs and quite happy
with it for reliability. As someone else said, speed and costs are metrics to
look at
In our data center, on CRITICAL systems, we plan to survive chains of several
single-type failures. The HA standards we apply to a mail-server for 30,000
people are necessarily quite high.
A fully redundant 2-node failover clustered system can survive failures of half
or more of its systems
To my mind ZFS has a serious deficiency for JBOD usage in a high-availability
clustered environment.
Namely, the inability to tie spare drives to a particular storage group.
For example, in clustered HA setups you would want 2 SAS JBOD units and
mirror between them. In this way, if a chassis
We had a Sun Engineer on-site recently who said this:
We should set our array controllers to sequential I/O *even* if we are doing
random I/O, if we are using ZFS. This is because the ARC is already
grouping requests up sequentially, so to the array controller it will appear
like
Yes, we do this currently on some systems where we haven't had time to install
and test the Cluster software.
Even an old 3310 array can be set up so 2 servers have the storage visible. We export
the pool on one system and import it on the other, move over a virtual IP, and the
service is back up.
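A rough sketch of that manual failover (the pool name, interface, and address are made up):

```shell
# On the old host, if it's still reachable:
zpool export mailpool

# On the takeover host:
zpool import mailpool
ifconfig bge0:1 plumb 192.0.2.10 netmask 255.255.255.0 up
```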
You
So what are the failure modes to worry about?
I'm not exactly sure what the implications of this nocache option are for my
configuration.
Say, from a recent example: I have an overtemp, and first one array shuts down,
then the other one.
I come in after the A/C is restored, shutdown and repower
So the problem with the zfs send/receive thing is: what if your network glitches
out during the transfers?
We have these once a day due to some as-yet-undiagnosed switch problem, a
chop-out of 50 seconds or so, which is enough to trip all our IPMP setups and
enough to abort SSH transfers in
So I went ahead and loaded 10u4 on a pair of V210 units.
I am going to set this nocacheflush option and cross my fingers and see how it
goes.
I have my ZPool mirroring LUNs off 2 different arrays. I have
single-controllers in each 3310. My belief is it's OK for me to do this even
without
Vincent Fox wrote:
Is this what you're referring to?
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Cache_Flushes
As I wrote several times in this thread, this kernel variable does not work in
Sol 10u3.
Probably not in u4 although I haven't tried it.
I would like
Where is ZFS with regards to the NVRAM cache present on arrays?
I have a pile of 3310s with 512 megs of cache, and even some 3510FCs with 1-gig
of cache. It seems silly that it's going to waste. These are dual-controller
units, so I have no worry about loss of cache information.
It looks like
As a novice, I understand that if you don't have any redundancy between vdevs,
this is going to be a problem. Perhaps you can add mirroring to your existing
pool and make it work that way?
A pool made up of mirror pairs:
{cyrus4:137} zpool status
pool: ms2
state: ONLINE
scrub: scrub
I ran testing of hardware RAID versus all-software and didn't see any
differences that made either a clear winner. For production platforms you're
just as well off having JBODs.
This was with bonnie++ on a V240 running Solaris 10u3. A 3511 array fully
populated with 12 380-gig drives, single