physical memory."
This is also a reasonable explanation of what the output of 'swap -s'
actually means.
http://www.softpanorama.org/Solaris/Processes_and_memory/swap_space_management.shtml
Look about half-way down the page, under "Monitoring Swap Resources".
That page. If you can explain the individual numbers and how they add up
across 'swap -s', 'swap -l' and 'top -b', that would be great!
-devsk
*From:* Erik Trimble
*Cc:* devsk ; zfs-discuss@opensolaris.org
on of the ZFS *design* intended for
the Linux kernel, then Yea! Great! (fortunately, it does sound like this is
what's going on) Otherwise, OpenSolaris CDDL'd code can't go into a Linux
kernel, module or otherwise.
--
Erik Trimble
Java System Suppo
just wanted to
make sure that the original developers understood that there may well be
issues with using CDDL code in conjunction with GPL'd code. If they
are indeed using OpenSolaris ZFS code, then at the very minimum they should
consult an IP lawyer to get the OK.
End of this Discussion.
explicit about that. I didn't mean to start a
license minutiae discussion.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
t you're back in business.
:-)
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
either buy more RAM, or find
something you can use as an L2ARC cache device for your pool. Ideally,
it would be an SSD. However, in this case, a plain hard drive would do
OK (NOT one already in a pool). To add such a device, you would do:
'zpool add tank cache mycachedevice'
electronics
prototyping shops. It would be really nice if I could solve 99% of the
need with 1 or 2 2GB SODIMMs and the chips from a cheap 4GB USB thumb
drive...
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
if
your workload is lots of small files). About 200 bytes per record are needed in RAM
for an L2ARC entry.
I.e.,
if you have a 1KB average record size, for 600GB of L2ARC you'll need
600GB / 1KB * 200B = 120GB of RAM.
If you have a more manageable 8KB record size, then 600GB / 8KB * 200B = 15GB of RAM.
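As a quick back-of-the-envelope check (shell arithmetic; the 600GB device and 8KB
recordsize are just example values, and the 200 bytes/entry is the estimate above):
  l2arc=$((600 * 1024 * 1024 * 1024))    # L2ARC device size in bytes
  recsize=$((8 * 1024))                  # average record size in bytes
  echo "$(( l2arc / recsize * 200 / 1024 / 1024 )) MB of ARC needed"   # ~15000 MB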
On 6/15/2010 6:17 AM, Darren J Moffat wrote:
On 15/06/2010 14:09, Erik Trimble wrote:
I'm going to say something sacrilegious here: 128GB of RAM may be
overkill. You have the SSDs for L2ARC - much of which will be the DDT,
The point of L2ARC is that you start adding L2ARC when you c
of that struct?
Vennlige hilsener / Best regards
roy
--
A DDT entry takes up about 250 bytes, regardless of where it is stored.
For every "normal" (i.e. block, metadata, etc - NOT DDT ) L2ARC entry,
about 200 bytes has to be stored in main memory (ARC).
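So a rough DDT sizing sketch (made-up example numbers, using the ~250 bytes/entry
figure above):
  unique=$((1024 * 1024 * 1024 * 1024))  # 1TB of unique (post-dedup) data
  bs=$((64 * 1024))                      # 64KB average block size
  echo "$(( unique / bs * 250 / 1024 / 1024 )) MB of DDT"   # ~4000 MB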
--
Erik Trimble
Java System
definitely beta. There are known
severe bugs and performance issues which will take time to fix, as not
all of them have obvious solutions. Given current schedules, I predict
that it should be production-ready some time in 2011. *When* in 2011, I
couldn't hazard...
Maybe time to make Sola
On 6/15/2010 10:52 AM, Erik Trimble wrote:
Frankly, dedup isn't practical for anything but enterprise-class
machines. It's certainly not practical for desktops or anything
remotely low-end.
This isn't just a ZFS issue - all implementations I've seen so far
requ
was referenced from. The cached block has no idea where it was
referenced from (that's in the metadata). So, even if I have 10 VMs,
requesting access to 10 different files, if those files have been
dedup-ed, then any "common" (i.e. deduped) blocks will be stored only
once in the
visibility to
that than the public.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
in
software, it's more difficult than normal for something like ZFS. Not to
say that we *really* could have the BP rewrite stuff finished sometime
soon...
:-)
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Try exporting, then re-importing by UID, not name.
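Roughly (pool name and numeric ID here are placeholders):
# zpool export tank
# zpool import          (with no arguments, this lists importable pools and their numeric IDs)
# zpool import 1234567890123456789 tank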
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
_
heir pool? I'm
assuming so, since it's both block and metadata that get stored there.
I'm considering adding a couple of very large SSDs so I might be able to
cache most of my DB in the L2ARC, if that works.
--
Erik Trimble
Java System
al 2009.06 "stable"
OpenSolaris version? It might not have the install issues you're running
into with the Dev branch, and give you something to do while you wait
for the next 2010.X stable version of OpenSolaris...)
--
Erik Trimble
Java Sy
On 6/27/2010 9:07 PM, Richard Elling wrote:
On Jun 27, 2010, at 8:52 PM, Erik Trimble wrote:
But that won't solve the OP's problem, which was that OpenSolaris doesn't
support his hardware. Nexenta has the same hardware limitations as OpenSolaris.
AFAICT, the OP'
hard to beat for
performance/$. The biggest problem with PCI-E cards is that they
require OS-specific drivers, and OpenSolaris doesn't always make the
cut for support.
In your specific case, I'd consider upgrading to 8GB RAM, and looking at
an 80GB MLC SSD. That's just bl
n Oracle employee, but I don't have any insider knowledge on
this. It's solely my experience talking.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
L usage, should
you choose to use a portion of the device for that purpose.
For what you've said your usage pattern is, I think the Intel X25M is
the best fit for good performance and size for the dollar.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Cl
eds to sustain the data until it has
> been saved (perhaps 10 milliseconds). It is different than a RAID
> array battery.
>
> Bob
>
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
up.
You almost certainly need an SSD for L2ARC, and probably at least 2x the
RAM.
The "hangs" you see are likely the Dedup Table being built on-the-fly
from the datasets, which is massively I/O intensive.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
San
Good question, and I don't know. My educated guess is the latter
(initially stored in ARC, then moved to L2ARC as size increases).
Anyone?
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT
On 7/1/2010 10:17 PM, Neil Perrin wrote:
On 07/01/10 22:33, Erik Trimble wrote:
On 7/1/2010 9:23 PM, Geoff Nordli wrote:
Hi Erik.
Are you saying the DDT will automatically look to be stored in an
L2ARC device if one exists in the pool, instead of using ARC?
Or is there some sort of memory
On 7/2/2010 6:30 AM, Neil Perrin wrote:
On 07/02/10 00:57, Erik Trimble wrote:
That's what I assumed. One further thought, though. Is the DDT
treated as a single entity - so it's *all* either in the ARC or in
the L2ARC? Or does it move one entry at a time into the L2ARC as it
tually be to your benefit to split your disks into multiple POOLS, and
put different DB tables on different pools. I'd suggest trying it out
with 3 pools of mirrors, and see what that gets you.
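Something like the following, just as a sketch (disk names are placeholders):
# zpool create dbpool1 mirror c2t0d0 c2t1d0
# zpool create dbpool2 mirror c2t2d0 c2t3d0
# zpool create dbpool3 mirror c2t4d0 c2t5d0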
And, of course, as for all database work, you need to get your DB
indexes into RAM (or on a very
de a full vdev (i.e every disk in the
vdev), but you don't otherwise have to get a new enclosure.
I'd love to get any status update on the BP Rewrite code, but, given our
rather tight-lipped Oracle policies these days, I'm not hopeful.
--
Erik Trimble
Java System Support
Oracle here in any way, nor have any privileged
knowledge of the suit.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
volving forking over a portion of your
revenue to NetApp for a considerable time.
Do remember: Oracle has much deeper pockets than NetApp, and much less
incentive to settle.
None of the preceding should imply that I speak for Oracle, Inc., nor do
I have any special knowledge of the progress of
able using Richard's 270
bytes-per-entry estimate.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
ed, that ZFS will
respect 4k block boundaries? That is, why do you think that ZFS would
put any effort into doing block alignment with its L2ARC writes?
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
ntroller (or, possibly, just one from the same OEM) to
replace a broken controller with.
Even in JBOD mode, I wouldn't trust a RAID controller to not write
proprietary bits onto the disks. It's one of the big reasons to choose an
HBA and not a RAID controller.
--
Erik Trimble
Java Sys
ll ZFS (perhaps through a pool property?) that building an
in-ARC DDT isn't really needed.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
On 7/10/2010 10:14 AM, Brandon High wrote:
On Sat, Jul 10, 2010 at 5:33 AM, Erik Trimble <mailto:erik.trim...@oracle.com>> wrote:
Which brings up an interesting idea: if I have a pool with good
random I/O (perhaps made from SSDs, or even one of those nifty
Oracle F51
e async
writes, so ZIL will be of no benefit, since it's not being used.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
a
snapshot is taken.
Given that snapshots will probably be more popular in the future (WAFL
NFS/LUNs, ZFS, Btrfs, VMware disk image snapshots, etc.), an agreed upon
consensus would be handy (D-Bus? POSIX?).
--
Erik Trimble
Java System Support
Ma
On 7/12/2010 8:49 AM, Linder, Doug wrote:
Erik Trimble wrote:
it does look like they'll win, I would bet huge chunks of money that
Oracle cross-licenses the patents or pays for a license, rather than
kill ZFS (it simply makes too much money for Oracle to abandon).
Out of
- Garrett
Losing ZFS would indeed be disastrous, as it would leave Solaris with
only the Veritas File System (VxFS) as a semi-modern filesystem, and a
non-native FS at that (i.e. VxFS is a 3rd-party for-pay FS, which
severely inhibits its uptake). UFS is just way too old to be competitive.
oms!). You can't compare an OEM server
(Dell, Sun, whatever) to a custom-built box from a parts assembler. Not
the same thing. Different standards, different prices.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
100 kph". Up front pricing is but one of many
different aspects of buying a server, and for many of us, it's not even
the most important.
When doing price comparison, you have to compare within the same class.
Cross-class comparisons are generally meaningless, since they're to
wpool/. (use whichever rsync options fit you
best)
# zpool destroy datapool
# zpool replace newpool /foo D
# rm /foo
During this process, you will have your data on both mirrors exposed to
a disk failure, and when it's complete, the rpool will of course remain
unprotected.
--
Erik Trim
(though iSCSI volumes are coming up fast), and as
such, I've got 20G files which would really, really, benefit from having
a much larger slab size.
(e) and, of course, seeing if there's some way we can cut down on
dedup's piggy DDT size. :-)
--
Erik Trimble
Java Syst
isk's pools via a plain
'import', correct? Have you tried importing via UID rather than via
name - also, try importing with a different mountpoint option.
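For instance (the numeric ID and names here are made up; -R puts all the
mountpoints under an alternate root so nothing collides):
# zpool import -R /a 1234567890123456789 oldrpool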
Last resort - boot from the LiveCD, import the old disks' rpool by UID,
and then rename the whole pool something el
ZFS really only supports 1 level
of metadevice. Your pool can be made up of multiple vdevs (virtual
devices), but a vdev *must* be something real (file, disk, lun, etc.).
There's no real way to do what you want in real-time.
-Erik
--
Erik Trimble
Java System Support
Mailstop: u
think of everything (or, if you can, it takes
awhile) - and, the 20 hours it just took you to fix that machine could
have been 2 hours if it had a service contract. Doesn't take too long
for that kind of math to blow out any savings whiteboxes may have had.
Worst case, someone goes
ith a temporary directory and
clever use of the "zpool import -d" command. Examples are
in the archives." (from Richard Elling's post)
Where are these archives located?
http://mail.opensolaris.org/pipermail/zfs-discuss/
use Google to search them.
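The trick is roughly along these lines (paths and device names are just
illustrative): put links to only the devices you care about in a temporary
directory, then point 'zpool import -d' at it:
# mkdir /tmp/mydevs
# ln -s /dev/dsk/c1t2d0s0 /tmp/mydevs/c1t2d0s0
# zpool import -d /tmp/mydevs tank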
--
Erik Trimble
Java System
keep up with a Gbit Ethernet.
For doing things like compiling over NFS/CIFS, the disks are going to be
your bottleneck.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
k-space's worth
of just storage.
Oh, and just ignore the RAID controller's configuration. Enable the
NVRAM cache on the controllers, but otherwise, run the disks as either
JBOD (if the controller allows use of the NVRAM for a JBOD config), or as
1-disk stripes (if it requires arrays to
out of the disks, as they're only
being infrequently touched. This is good. :-)
Also, I'm assuming you mean 'zpool iostat -v', right?
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
across all devices (there is no dedicated parity-only device), it
becomes simpler to recover data while retaining performance.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
ce seen iSCSI beat out NFS in a VMware
> environment. I have however seen countless examples of their
> "clustered filesystem" causing permanent SCSI locks on a LUN that
> result in an entire datastore going offline.
>
>
> --Tim
phrase
Ripley).
Not just for Sun kit, but I'd be very wary of using any
no-service-contract hardware for something that is business critical,
which I imagine your digital editing system is. Don't be
penny-wise and pound-foolish.
--
Erik Trimble
Java System Suppo
are useful for adding protection to a number of vdevs, not a
single vdev.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
y a snapshot for some reason, that's what the
'zfs clone' function is for. Clone your snapshot, promote it, and make
your modifications.
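For example (dataset and snapshot names are placeholders):
# zfs clone tank/data@mysnap tank/data_work
# zfs promote tank/data_work
then make your changes in tank/data_work.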
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
are still the preferred method of arranging
things in ZFS, even with hardware raid backing the underlying LUN
(whether the LUN is from a SAN or local HBA doesn't matter).
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US
ose underlying LUN has gone away,
eventually reporting an inability to complete the relevant transaction
to the calling software.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Victor Latushkin wrote:
Erik Trimble wrote:
ZFS no longer has the issue where loss of a single device (even
intermittently) causes pool corruption. That's been fixed.
Erik, it does not help at all when you are talking about some issue
being fixed but do not provide the corresponding CR n
can get it to work is to
offline (ie export) the whole pool, and then pray that nothing
interrupts the expansion process.
That all said, I'm not a /real/ developer, so maybe someone else has
some free time to try.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x1719
ocess. My primary problem is
that I have to keep both schemes in memory during the migration, and if
something should happen (i.e. reboot, panic, etc) then I lose the
current state of the zpool, and everything goes to hell in a handbasket.
--
Erik Trimble
Java System Support
Mailstop: usca22
a 10u6 ZFS filesystem to a 10u8 machine, resulting in
creating a new v10 filesystem on the 10u8 machine. However, you can't
send a v12 filesystem from the 10u8 machine to the 10u6 machine. If you
explicitly create a v10 filesystem on the 10u8 machine, you can send
that filesystem to the
pertank raidz c0t0d0 c0t1d0 c0t2d0 c1t0d0s0 c1t1d0s0
c1t2d0s0 raidz c1t0d0s1 c1t1d0s1 c1t2d0s1
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
n appliance, and frankly, you have to live with
the limited configurations it's sold in.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
performance. I guess what it boils down to is what
is the access time/throughput of a single local 15k SCSI drive vs a GigE
iSCSI volume?
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
t there should be no gotchas on the zpool import (of course,
remember to zpool export from the original machines first as a good
practice).
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
u need to
unset the 'shareiscsi' property BEFORE you export them (or, after you
import them, then reboot). This prevents a potential conflict between
the old iSCSI implementation and COMSTAR.
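For example (the zvol name is a placeholder; 'zfs inherit shareiscsi tank/myvol'
would clear the property entirely instead):
# zfs set shareiscsi=off tank/myvol
# zpool export tank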
--
Erik Trimble
Java System Support
Mailstop: usca22-123
ot; SSDs - the Intel X25-M is a good fit here for a Readzilla.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
arder to manage, and allows you to get a
virgin zpool which will provide the best performance.
Sometimes, ignorance is bliss :-)
-- richard
oooh, then I must be ecstatically happy!
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
h other up and splitting the load?
> Len Zaifman
> Systems Manager, High Performance Systems
> The Centre for Computational Biology
> The Hospital for Sick Children
> 555 University Ave.
> Toronto, Ont M5G 1X8
>
> tel: 416-813-5513
> email: leona...@sickkids.ca
>
.
Worst case scenario is that you can blow away the AmberRoad software
load, and install OpenSolaris/Solaris. The hardware is a standard X4140
and J4200.
Note, that if you do that, well, you can't re-load A-R without a support
contract.
--
Erik Trimble
Java System Sup
Erik Trimble wrote:
Miles Nordin wrote:
"lz" == Len Zaifman writes:
lz> So I now have 2 disk paths and two network paths as opposed to
lz> only one in the 7310 cluster.
You're configuring all your failover on the client, so the HA stuff is
sta
Miles Nordin wrote:
"et" == Erik Trimble writes:
et> I'd still get the 7310 hardware.
et> Worst case scenario is that you can blow away the AmberRoad
okay but, AIUI he was saying pricing is 6% more for half as much
physical disk. This is also why
zpool add tank cache c1t1d0
And from then on, I just import/export between the two hosts, and it
auto-picks the correct c1t1d0 drive.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
ably with a SCA interface)
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
nowledge, I would stick with it for the time being. The
differences for something like FreeNAS are relatively minor, and it's
better to Go With What You Know. Exploring OpenSolaris for a future
migration would be good, but for right now, I'd stick to FreeBSD.
--
Erik Trimble
Ja
whole pool, destroy the pool, remove the device, remake the pool, then reimport
the pool) to even bother with?
--
BP rewrite is key to several oft-asked features: vdev removal, defrag,
raidz expansion, among others.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
San
siderably less time than a full-drive resilver.
That said, if your drive really is taking 10-15 seconds to remap bad
sectors, maybe you _should_ replace it.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
A remains the same, while
file B consists of all dedup'd blocks pointing to those shared with A,
EXCEPT the block where I changed the single bit. This is the same
process that happens when updates are made after a snapshot.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
ghtly used SSD will likely
outlast an HD. And, in the case of SSDs, writes are far harder on the
SSD than reads are.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
hat's not enough time for that level of IOPS to wear out the SSDs
(which are likely OEM Intel X25-E). Something else is wrong.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
ilable to the failover machine.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
st putting in a straight-through cable between the
two machines is the best idea here, rather than going through a switch.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
mirrored ZIL into the zpool.
It's a (relatively) simple and ingenious suggestion.
-Erik
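For reference, adding a mirrored log device looks roughly like this (device
names are placeholders):
# zpool add tank log mirror c3t0d0 c4t0d0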
On Wed, Dec 23, 2009 at 9:40 AM, Erik Trimble wrote:
Charles Hedrick wrote:
Is ISCSI reliable enough for this?
YES.
The original idea is a good one, and one that I'd not
lmost exclusively in the case of NFS traffic.
In fact, I think that the Vertex's sustained random write IOPS
performance is actually inferior to that of a 15k SAS drive.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timez
Richard Elling wrote:
On Dec 25, 2009, at 4:15 PM, Erik Trimble wrote:
I haven't seen this mentioned before, but the OCZ Vertex Turbo is
still an MLC-based SSD, and is /substantially/ inferior to an Intel
X25-E in terms of random write performance, which is what a ZIL
device does a
will be able to recover/reconstruct some metadata which fails checksumming.
In short, Checksumming is how ZFS /detects/ data corruption, and
Redundancy is how ZFS /fixes/ it. Checksumming is /always/ present,
while redundancy depends on the pool layout and options (cf. "copies"
local stores)
20GB for an rpool is sufficient, so the rest can go to L2ARC. I would
disable any swap volume on the SSDs, however. If you need swap, put it
somewhere else.
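Assuming the usual rpool swap zvol, that's roughly:
# swap -d /dev/zvol/dsk/rpool/swap
# zfs destroy rpool/swap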
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
how many blocks of X size have to be
written at once) and that's about it. Let filesystem makers worry about
scheduling writes appropriately, doing redundancy, etc.
Oooh! Oooh! a whole cluster of USB thumb drives! Yeah!
--
Erik Trimble
Java System Support
Mailstop: usca22-12
Bob Friesenhahn wrote:
On Fri, 1 Jan 2010, Erik Trimble wrote:
Maybe it's approaching time for vendors to just produce really stupid
SSDs: that is, ones that just do wear-leveling, and expose their true
page-size info (e.g. for MLC, how many blocks of X size have to be
written at once
Eric D. Mudama wrote:
On Fri, Jan 1 at 21:21, Erik Trimble wrote:
That all said, it certainly would be really nice to get a SSD
controller which can really push the bandwidth, and the only way I
see this happening now is to go the "stupid" route, and dumb down the
controller
SD page size). Reads could be in smaller sections,
though. Which would be interesting: ZFS would write in Page Size
increments, and read in Block Size amounts.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Joerg Schilling wrote:
Erik Trimble wrote:
From ZFS's standpoint, the optimal configuration would be for the SSD
to inform ZFS as to its PAGE size, and ZFS would use this as the
fundamental BLOCK size for that device (i.e. all writes are in integer
It seems that a
Ragnar Sundblad wrote:
On 2 jan 2010, at 13.10, Erik Trimble wrote
Joerg Schilling wrote:
the TRIM command is what is intended for an OS to notify the SSD as to which blocks are deleted/erased, so the SSD's internal free list can be updated (that is, it allows formerly-in-use blocks
they are doing some research/development on a next-gen
filesystem (didn't make it into Windows 2008, but maybe Win2011), so
we'll have to see what that entails.
All that said, it would certainly be limited to Enterprise SSDs, which
are low-volume. But, on the up side, they're Hi
Ragnar Sundblad wrote:
On 2 jan 2010, at 22.49, Erik Trimble wrote:
Ragnar Sundblad wrote:
On 2 jan 2010, at 13.10, Erik Trimble wrote
Joerg Schilling wrote:
the TRIM command is what is intended for an OS to notify the SSD as to which
blocks are deleted/erased, so the
David Magda wrote:
On Jan 2, 2010, at 16:49, Erik Trimble wrote:
My argument is that the OS has a far better view of the whole data
picture, and access to much higher performing caches (i.e.
RAM/registers) than the SSD, so not only can the OS make far better
decisions about the data and how
Ragnar Sundblad wrote:
On 3 jan 2010, at 04.19, Erik Trimble wrote:
Let's say I have 4k blocks, grouped into a 128k page. That is, the SSD's
fundamental minimum unit size is 4k, but the minimum WRITE size is 128k. Thus,
32 blocks in a page.
Do you know of SSD disks t