Why does zfs define raidz/raidz2/mirror/stripe at the pool level instead of the
filesystem/volume level?
A sample use case: two filesystems in an eight-disk pool. The first
filesystem is a stripe across four mirrors. The second filesystem is a
raidz2. Both utilizing the free space in the 8-disk pool.
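Today that geometry can only be declared per pool, so the closest
approximation I can see is two separate pools; a rough sketch, with
hypothetical device names c0t0d0 through c0t7d0, which of course means
the two filesystems no longer share all eight spindles:

# zpool create mpool mirror c0t0d0 c0t1d0 mirror c0t2d0 c0t3d0
# zpool create rzpool raidz2 c0t4d0 c0t5d0 c0t6d0 c0t7d0
# zfs create mpool/fs1
# zfs create rzpool/fs2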
Vizzini Sampere wrote:
Why does zfs define raidz/raidz2/mirror/stripe at the pool level instead of
the filesystem/volume level?
To take the burden away from the system admin.
Turnaround question - why *should* ZFS define an underlying
storage arrangement at the filesystem level?
A sample
Hello Anantha,
Wednesday, January 17, 2007, 2:35:01 PM, you wrote:
ANS You're probably hitting the same wall/bug that I came across;
ANS ZFS in all versions up to and including Sol10U3 generates
ANS excessive I/O when it encounters 'fsync' or if any of the files
ANS were opened with 'O_DSYNC'
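If you want to confirm that an application is doing this, truss on
Solaris will show both the open flags and the fsync-family calls (which
appear as fdsync at the syscall level); a sketch against a hypothetical
pid, output lines illustrative:

$ truss -t open,fdsync -p 12345
open("/db/data01.dbf", O_RDWR|O_DSYNC) = 4
fdsync(4, FSYNC)                       = 0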
On 17 January, 2007 - Christian Rost sent me these 2.4K bytes:
I'm using
SunOS cassandra 5.10 Generic_118833-33 sun4u sparc SUNW,Sun-Fire-V240
[..]
cassandra# zpool list
NAME    SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
tray30  7.25T  10.4G  7.24T
Bug 6413510 is the root cause. ZFS maestros, please correct me if I'm
quoting an incorrect bug.
What do you mean by UFS wasn't an option due to
number of files?
Exactly that. UFS has a 1 million file limit under Solaris. Each Oracle
Financials environment well exceeds this limitation.
Also, do you have any tunables set in /etc/system?
Can you send 'zpool status' output? (raidz, mirror,
...?)
Thanks for the feedback!
This does sound like what we're hitting. From our testing, you are absolutely
correct--separating out the parts is a major help. The big problem we still
see, though, is doing the clones/recoveries. The DBA group clones the
production environment for Education. Since
Also, as a workaround, you could disable the ZIL if it's acceptable to
you (in case of a system panic or hard reset you can end up with an
unrecoverable database).
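For reference, on Solaris 10 and Nevada builds of this vintage that
switch is a kernel tunable picked up from /etc/system at boot; the same
warning applies:

* /etc/system: disable the ZFS intent log. Synchronous write
* semantics are lost, so a panic or hard reset can leave the
* database unrecoverable.
set zfs:zil_disable = 1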
Again, not an option, but thanks for the pointer. I read a bit about
this last week, and it sounds way too scary.
Rainer
This
Hello Christian,
Wednesday, January 17, 2007, 4:48:00 PM, you wrote:
CR Now I got the point.
CR Thank you :)
CR Is it true that one group should contain fewer than 10 disks for
CR performance reasons?
CR Or am I free to use 16 disks in one group without performance drops?
Depends on workload.
But
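As a rough illustration of the trade-off, compare one wide group
against two narrower ones over the same 16 disks (hypothetical device
names). The second layout gives the pool two top-level vdevs to stripe
across and a smaller reconstruction domain per group, at the cost of
two more disks' worth of parity:

One 16-disk raidz2 group:

# zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
    c1t5d0 c1t6d0 c1t7d0 c1t8d0 c1t9d0 c1t10d0 c1t11d0 c1t12d0 \
    c1t13d0 c1t14d0 c1t15d0

Two 8-disk raidz2 groups in the same pool:

# zpool create tank \
    raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
    raidz2 c1t8d0 c1t9d0 c1t10d0 c1t11d0 c1t12d0 c1t13d0 c1t14d0 c1t15d0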
Rainer Heilke wrote:
I'll know for sure later today or tomorrow, but it sounds like they are
seriously considering the ASM route. Since we will be going to RAC later
this year, this move makes the most sense. We'll just have to hope that
the DBA group gets a better understanding of LUNs and
There appears to be code in the DLM package that either
connects to the HDS box and queries it for some info, or
reads some attributes in the SCSI mode pages to get the info. (I'm
guessing this one.)
In any case HDS would have to share said knowledge with Sun so the
correct
Rainer Heilke wrote:
What do you mean by UFS wasn't an option due to
number of files?
Exactly that. UFS has a 1 million file limit under Solaris. Each Oracle
Financials environment well exceeds this limitation.
Really?!? I thought Oracle would use a database for storage...
Also do you
On Wed, 2007-01-17 at 20:34 +1100, James C. McPherson wrote:
Vizzini Sampere wrote:
Why does zfs define raidz/raidz2/mirror/stripe at the pool level instead of
the filesystem/volume level?
To take the burden away from the system admin.
Turnaround question - why *should* ZFS define an
What do you mean by UFS wasn't an option due to
number of files?
Exactly that. UFS has a 1 million file limit under Solaris. Each Oracle
Financials environment well exceeds this limitation.
what?
$ uname -a
SunOS core 5.10 Generic_118833-17 sun4u sparc SUNW,UltraSPARC-IIi-cEngine
$ df -F
Dennis Clarke wrote:
What do you mean by UFS wasn't an option due to
number of files?
Exactly that. UFS has a 1 million file limit under Solaris. Each Oracle
Financials environment well exceeds this limitation.
what?
$ uname -a
SunOS core 5.10 Generic_118833-17 sun4u sparc
Sorry, yes - update 2.
Andrew.
We had a 2TB filesystem. No matter what options I set explicitly, the UFS
filesystem kept getting written with a 1 million file limit. Believe me, I
tried a lot of options, and they kept getting set back on me.
After a fair bit of poking around (Google, Sun's site, etc.) I found several
other
We had a 2TB filesystem. No matter what options I set explicitly, the
UFS filesystem kept getting written with a 1 million file limit.
Believe me, I tried a lot of options, and they kept getting set back
on me.
The limit is documented as 1 million inodes per TB. So something
must not have gone
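If you want to see what was actually laid down, mkfs can echo back the
parameters a UFS filesystem was really built with, and df can report
the inode counts; device and mount point hypothetical:

# mkfs -F ufs -m /dev/rdsk/c1t0d0s0   (show the construction parameters,
                                       including the effective nbpi)
$ df -o i /bigfs                      (inodes used/free on the mount)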
Hi Anantha,
I was curious why segregating at the FS level would provide adequate
I/O isolation? Since all FS are on the same pool, I assumed flogging a
FS would flog the pool and negatively affect all the other FS on that
pool?
Best Regards,
Jason
On 1/17/07, Anantha N. Srirama [EMAIL
Hi Torrey,
On 1/15/07, Torrey McMahon [EMAIL PROTECTED]
wrote:
What you will find is that while you can mount the
UFS (ZFS should
prevent mounting but that's another story), any
updates on previously
read files will not be visible.
Actually it's quite a bit worse with UFS: mounting a filesystem
It turns out we're probably going to go the UFS/ZFS route, with 4
filesystems (the DB files on UFS with directio).
It seems that the pain of moving from a single-node ASM to a RAC'd ASM
is great, and not worth it. The DBA group decided on doing the
migration to UFS for the DB files now, and then
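For what it's worth, the directio piece is just a UFS mount option; a
sketch with hypothetical device and mount point:

# mount -F ufs -o forcedirectio /dev/dsk/c1t0d0s6 /u01/oradata

or the equivalent forcedirectio entry in the mount options field of
/etc/vfstab so it survives a reboot.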
Bag-o-tricks-r-us, I suggest the following in such a case:
- Two ZFS pools:
  - One for production
  - One for Education
- Isolate the LUNs feeding the pools if possible; don't share spindles.
  Remember on EMC/Hitachi you've got logical LUNs created by
  striping/concat'ing carved-up physical disks,
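A minimal sketch of that layout, with hypothetical LUN names; the
point is that no LUN appears in both pools:

# zpool create prod mirror c2t0d0 c2t1d0 mirror c2t2d0 c2t3d0
# zpool create edu mirror c3t0d0 c3t1d0 mirror c3t2d0 c3t3d0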
I did some straight-up Oracle/ZFS testing but not on Zvols. I'll give
it a shot and report back; next week is the earliest.
Rainer Heilke wrote on 01/17/07 15:44:
It turns out we're probably going to go the UFS/ZFS route, with 4 filesystems
(the DB files on
UFS with Directio).
It seems that the pain of moving from a single-node ASM to a RAC'd ASM is
great, and not worth it.
The DBA group decided on doing the
The limit is documented as 1 million inodes per TB.
So something must not have gone right. But many people have
complained, and you could take the newfs source and fix the
limitation.
Patching the source ourselves would not fly very far, but thanks for the
clarification. I guess I have to
Hello Jason,
Wednesday, January 17, 2007, 11:24:50 PM, you wrote:
JJWW Hi Anantha,
JJWW I was curious why segregating at the FS level would provide adequate
JJWW I/O isolation? Since all FS are on the same pool, I assumed flogging a
JJWW FS would flog the pool and negatively affect all the
Hi Robert,
I see. So it really doesn't get around the idea of putting DB files
and logs on separate spindles?
Best Regards,
Jason
On 1/17/07, Robert Milkowski [EMAIL PROTECTED] wrote:
Hello Jason,
Wednesday, January 17, 2007, 11:24:50 PM, you wrote:
JJWW Hi Anantha,
JJWW I was curious why
I explore ZFS on X4500 (thumper) MTTDL models in yet another blog.
http://blogs.sun.com/relling/entry/a_story_of_two_mttdl
I hope you find it interesting.
-- richard
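For readers who don't follow the link: the usual first-order
approximation behind such models (my paraphrase; the blog post may use
a more refined version) is, for a single-parity group of G disks,

  MTTDL ~= MTBF^2 / (G * (G-1) * MTTR)

and a pool of N independent groups divides that by N. Double parity
adds roughly another MTBF/MTTR factor, which is why raidz2 comes out
so far ahead in these models.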
Turnaround question - why *should* ZFS define an underlying
storage arrangement at the filesystem level?
It would be nice to provide it at the directory hierarchy level, but
since file systems in ZFS are cheap, providing it at the file system
level instead might be reasonable. (I say might be
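Worth noting: some per-filesystem redundancy knobs already exist as
properties, e.g. the copies property in recent Nevada builds (if I
remember right it is not in the Solaris 10 updates of this vintage).
It stores extra copies of the data within whatever vdevs the pool
already has, so it is not a substitute for pool-level RAID:

# zfs set copies=2 tank/important
# zfs get copies tank/important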
Yes, Anantha is correct; that is the bug ID, which could be responsible
for more disk writes than expected.
I believe, though, that this would explain at most a factor of 2 of write
expansion (user data getting pushed to disk once in the intent log, then again
in its final location). If the
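One way to put a number on the actual expansion is to compare the rate
the application believes it is writing with what the pool writes; pool
name hypothetical:

$ zpool iostat tank 5

A sustained pool write bandwidth of roughly twice the application's
write rate would be consistent with the data going to the intent log
and then again to its final location.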
Anton B. Rang wrote on 01/17/07 20:31:
Yes, Anantha is correct; that is the bug ID, which could be responsible
for more disk writes than expected.
I believe, though, that this would explain at most a factor of 2
of write expansion (user data getting pushed to disk once in the
intent log,
At the September LISA meeting Jeff B. did suggest that they planned to
eventually add the distributed aspect to ZFS, and when he talks about
the filesystem as a 'pool of blocks' it certainly seems like there's no
reason (beyond some minor implementation issues :) why those blocks could