[zfs-discuss] NFS asynchronous writes being written to ZIL

2012-06-13 Thread Timothy Coalson
I noticed recently that the SSDs hosting the ZIL for my pool had a large number in the SMART attribute for total LBAs written (with some calculation, it seems to be the total amount of data written to the pool so far), did some testing, and found that the ZIL is being used quite heavily (matching
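
A rough way to turn that SMART attribute into bytes, assuming smartmontools is installed, the drive reports it as attribute 241 Total_LBAs_Written, and the vendor counts 512-byte units (all of which vary by model; the device path is hypothetical):

  smartctl -A /dev/rdsk/c2t0d0 \
      | awk '$2 == "Total_LBAs_Written" { printf "%.1f GiB written\n", $10 * 512 / (1024*1024*1024) }'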

Re: [zfs-discuss] NFS asynchronous writes being written to ZIL

2012-06-13 Thread Timothy Coalson
-0500, Timothy Coalson wrote:
client: ubuntu 11.10
/etc/fstab entry:
server:/mainpool/storage  /mnt/myelin  nfs  bg,retry=5,soft,proto=tcp,intr,nfsvers=3,noatime,nodiratime,async  0  0
nfsvers=3
NAME              PROPERTY  VALUE     SOURCE
mainpool/storage  sync      standard

Re: [zfs-discuss] NFS asynchronous writes being written to ZIL

2012-06-14 Thread Timothy Coalson
Carosone wrote: On Wed, Jun 13, 2012 at 05:56:56PM -0500, Timothy Coalson wrote:
client: ubuntu 11.10
/etc/fstab entry:
server:/mainpool/storage  /mnt/myelin  nfs  bg,retry=5,soft,proto=tcp,intr,nfsvers=3,noatime,nodiratime,async  0  0
nfsvers=3
NAME              PROPERTY  VALUE

Re: [zfs-discuss] NFS asynchronous writes being written to ZIL

2012-06-14 Thread Timothy Coalson
The client is using async writes, which include commits. Sync writes do not need commits. Are you saying nfs commit operations sent by the client aren't always reported by that script? What happens is that the ZFS transaction group commit occurs at more-or-less regular intervals, likely 5
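
For reference, on illumos/Solaris that interval is the zfs_txg_timeout tunable, 5 seconds by default; assuming root and mdb, the live value can be checked with:

  echo zfs_txg_timeout/D | mdb -k
  # a persistent override would go in /etc/system, e.g.  set zfs:zfs_txg_timeout = 10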

Re: [zfs-discuss] NFS asynchronous writes being written to ZIL

2012-06-14 Thread Timothy Coalson
with the current behavior (and the SSDs shouldn't give out any time soon even being used like this), if it isn't possible to change it. Thanks for all the help, Tim On Thu, Jun 14, 2012 at 10:30 PM, Phil Harman phil.har...@gmail.com wrote: On 14 Jun 2012, at 23:15, Timothy Coalson tsc...@mst.edu wrote

Re: [zfs-discuss] NFS asynchronous writes being written to ZIL

2012-06-15 Thread Timothy Coalson
, is it smart enough to do this? Tim On Fri, Jun 15, 2012 at 10:56 AM, Richard Elling richard.ell...@gmail.com wrote: [Phil beat me to it] Yes, the 0s are a result of integer division in DTrace/kernel. On Jun 14, 2012, at 9:20 PM, Timothy Coalson wrote: Indeed they are there, shown with 1 second
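
The integer-division effect is easy to reproduce outside that script; scaling the numerator before dividing is the usual workaround (standalone example, nothing here is from the original script):

  dtrace -qn 'BEGIN {
      this->part = 3; this->total = 4;
      printf("naive:  %d%%\n", this->part / this->total * 100);  /* 0%, division truncates first */
      printf("scaled: %d%%\n", this->part * 100 / this->total);  /* 75% */
      exit(0);
  }'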

Re: [zfs-discuss] NFS asynchronous writes being written to ZIL

2012-06-15 Thread Timothy Coalson
On Fri, Jun 15, 2012 at 12:56 PM, Timothy Coalson tsc...@mst.edu wrote: Thanks for the suggestions.  I think it would also depend on whether the nfs server has tried to write asynchronously to the pool in the meantime, which I am unsure how to test, other than making the txgs extremely

Re: [zfs-discuss] Migrating 512 byte block zfs root pool to 4k disks

2012-06-15 Thread Timothy Coalson
On Fri, Jun 15, 2012 at 5:35 PM, Jim Klimov jimkli...@cos.ru wrote: 2012-06-16 0:05, John Martin wrote: It's important to know... ...whether the drive is really 4096p or 512e/4096p. BTW, is there a surefire way to learn that programmatically from Solaris or its derivatives? prtvtoc device
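
One data point, assuming a Solaris/illumos host and a hypothetical device name: prtvtoc reports the sector size the OS is using, but that is the logical size, so a 512e/4096p drive will still show 512 here:

  prtvtoc /dev/rdsk/c0t0d0s0 | grep 'bytes/sector'
  # "512 bytes/sector" -- logical size only; 512e emulation is not visible in this output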

Re: [zfs-discuss] Migrating 512 byte block zfs root pool to 4k disks

2012-06-15 Thread Timothy Coalson
Sorry, if you meant distinguishing between true 512 and emulated 512/4k, I don't know; it may be vendor-specific whether drives expose that through device commands at all. Tim On Fri, Jun 15, 2012 at 6:02 PM, Timothy Coalson tsc...@mst.edu wrote: On Fri, Jun 15, 2012 at 5:35 PM, Jim Klimov

Re: [zfs-discuss] Recommendation for home NAS external JBOD

2012-06-17 Thread Timothy Coalson
So I can either exchange the disks one by one with autoexpand, use 2-4 TB disks, and be happy. This was my original approach. However, I am totally unclear about the 512-byte vs. 4 KB sector issue. What SATA disk could I use that is big enough and still uses 512-byte sectors? I know about the discussion about the

Re: [zfs-discuss] Recommendation for home NAS external JBOD

2012-06-17 Thread Timothy Coalson
worst case).  The worst case for 512 emulated sectors on zfs is probably small (4KB or so) synchronous writes (which if they mattered to you, you would probably have a separate log device, in which case the data disk write penalty may not matter). Good to know. This really opens up the
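
Whether that penalty applies also depends on the ashift the pool was created with; one way to check it, assuming zdb can read the pool's cached config (pool name hypothetical):

  zdb -C tank | grep ashift
  # ashift: 9 means 512-byte allocation units, ashift: 12 means 4 KB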

Re: [zfs-discuss] Recommendation for home NAS external JBOD

2012-06-18 Thread Timothy Coalson
What makes you think the Barracuda 7200.14 drives report 4k sectors? I gave up looking for 4kn drives, as everything I could find was 512e. I would _love_ to be wrong, as I have 8 4TB Hitachis on backorder that I would gladly replace with 4kn drives, even if I had to drop to 3TB density. I

Re: [zfs-discuss] Recommendation for home NAS external JBOD

2012-06-19 Thread Timothy Coalson
- Will I be able to buy a replacement in 3-3 years that reports the disk in such a way that resilvering will work? According to the Advanced Format thread this seems to be a problem. I was hoping to get around this with these disks and have a more future-proof solution. I think that if you are

Re: [zfs-discuss] Recommendation for home NAS external JBOD

2012-06-19 Thread Timothy Coalson
I think that if you are running an illumos kernel, you can use /kernel/drv/sd.conf That refers to creating a new pool and is good to know. Two things: one, it looks like you should also be able to trick it into using 512 sectors on a 4k disk, allowing you to do exactly such a replacement

Re: [zfs-discuss] ZFS snapshot used space question

2012-08-29 Thread Timothy Coalson
As I understand it, the used space of a snapshot does not include anything that is in more than one snapshot. There is a bit of a hack, using the verbose and dry run options of zfs send, that will tell you how much data must be transferred to replicate each snapshot incrementally, which should
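
The hack looks roughly like this (dataset and snapshot names hypothetical): -n makes it a dry run, so nothing is actually sent, and -v prints the estimated size of the incremental stream.

  zfs send -nv -i pool/fs@snap1 pool/fs@snap2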

Re: [zfs-discuss] ZFS snapshot used space question

2012-08-30 Thread Timothy Coalson
Is there a way to get the total amount of data referenced by a snapshot that isn't referenced by a specified snapshot/filesystem? I think this is what is really desired in order to locate snapshots with offending space usage. The written and written@ attributes seem to only do the reverse. I
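
For comparison, the written@ property gives only that forward direction, i.e. space referenced by the current filesystem that was written after the named snapshot (names hypothetical):

  zfs get -Hp -o value written@snap1 pool/fs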

Re: [zfs-discuss] ZFS snapshot used space question

2012-08-31 Thread Timothy Coalson
I went ahead and hacked this script together, so let me elaborate. First, though, a teaser:
$ ./snapspace.sh mainpool/storage
SNAPSHOT                                 OLDREFS  UNIQUE  UNIQUE%
zfs-auto-snap_monthly-2011-11-14-18h59   34.67G   11.0G   31%

Re: [zfs-discuss] scripting incremental replication data streams

2012-09-12 Thread Timothy Coalson
When I wrote a script for this, I used separate snapshots, with a different naming convention, to use as the endpoints for the incremental send. With this, it becomes easier: find the newest snapshot with that naming convention on the sending side, and check that it exists on the receiving side.
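
A minimal sketch of that approach, with hypothetical pool, host, and prefix names (the initial full send and most error handling are left out):

  #!/bin/sh
  SRC=mainpool/storage
  DST=backuppool/storage
  HOST=backuphost
  PREFIX=repl

  NEW=${PREFIX}-$(date +%Y%m%d-%H%M%S)
  # newest snapshot on the sending side that follows our naming convention
  LAST=$(zfs list -H -t snapshot -o name -s creation -d 1 "$SRC" \
         | grep "@${PREFIX}-" | tail -1 | cut -d@ -f2)

  # only send incrementally if the receiving side still has that snapshot
  if ssh "$HOST" zfs list -H -o name "$DST@$LAST" >/dev/null 2>&1; then
      zfs snapshot "$SRC@$NEW"
      zfs send -i "$SRC@$LAST" "$SRC@$NEW" | ssh "$HOST" zfs receive "$DST"
  else
      echo "common snapshot $LAST not found on $HOST" >&2
      exit 1
  fi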

Re: [zfs-discuss] scripting incremental replication data streams

2012-09-12 Thread Timothy Coalson
Unless I'm missing something, they didn't solve the matching-snapshots problem yet; from their site: To Do: Additional error handling for mismatched snapshots (last destination snap no longer exists on the source): walk backwards through the remote snaps until a common snapshot is found and destroy

Re: [zfs-discuss] scripting incremental replication data streams

2012-09-12 Thread Timothy Coalson
On Wed, Sep 12, 2012 at 7:16 PM, Ian Collins i...@ianshome.com wrote: On 09/13/12 10:23 AM, Timothy Coalson wrote: Unless i'm missing something, they didn't solve the matching snapshots thing yet, from their site: To Do: Additional error handling for mismatched snapshots (last destination

Re: [zfs-discuss] cannot replace X with Y: devices have different sector alignment

2012-09-23 Thread Timothy Coalson
I think you can fool a recent Illumos kernel into thinking a 4k disk is 512 (incurring a performance hit for that disk, and therefore the vdev and pool, but to save a raidz1, it might be worth it): http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks , see Overriding the Physical
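
The override on that wiki page is an sd.conf entry along these lines (the vendor/product string and the value are illustrative; the vendor field is padded to 8 characters, and for the replacement case above you would force 512 rather than 4096):

  # /kernel/drv/sd.conf
  sd-config-list =
      "ATA     ST3000DM001-9YN1", "physical-block-size:4096";
  # then reload with 'update_drv -vf sd' (or reboot) before inserting the disk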

Re: [zfs-discuss] cannot replace X with Y: devices have different sector alignment

2012-09-24 Thread Timothy Coalson
On Mon, Sep 24, 2012 at 12:20 AM, Timothy Coalson tsc...@mst.edu wrote: I think you can fool a recent Illumos kernel into thinking a 4k disk is 512 (incurring a performance hit for that disk, and therefore the vdev and pool, but to save a raidz1, it might be worth it): http

Re: [zfs-discuss] Making ZIL faster

2012-10-03 Thread Timothy Coalson
I found something similar happening when writing over NFS (at significantly lower throughput than available on the system directly), specifically that effectively all data, even asynchronous writes, were being written to the ZIL, which I eventually traced (with help from Richard Elling and others

Re: [zfs-discuss] Making ZIL faster

2012-10-03 Thread Timothy Coalson
believe that a failed unmirrored log device is only a problem if the pool is ungracefully closed before ZFS notices that the log device failed (i.e., simultaneous power failure and log device failure), so mirroring them may not be required. Tim On Wed, Oct 3, 2012 at 2:54 PM, Timothy Coalson tsc
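
If you do decide to mirror them anyway, both of these work on a live pool (device names hypothetical):

  zpool add tank log mirror c4t0d0 c4t1d0   # add a new mirrored log
  zpool attach tank c4t0d0 c4t1d0           # mirror an existing single log device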

Re: [zfs-discuss] vm server storage mirror

2012-10-19 Thread Timothy Coalson
Several times, I destroyed the pool and recreated it completely from backup. zfs send and zfs receive both work fine. But strangely - when I launch a VM, the IO grinds to a halt, and I'm forced to powercycle (usually) the host. A shot in the dark here, but perhaps one of the disks involved
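
One way to check that guess while a VM is running, on Solaris/illumos, is to watch per-disk service times; a dying or overloaded disk usually stands out:

  iostat -xn 5
  # look for one device with asvc_t or %b far above its peers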

Re: [zfs-discuss] vm server storage mirror

2012-10-20 Thread Timothy Coalson
On Sat, Oct 20, 2012 at 7:39 AM, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) opensolarisisdeadlongliveopensola...@nedharvey.com wrote: From: Timothy Coalson [mailto:tsc...@mst.edu] Sent: Friday, October 19, 2012 9:43 PM A shot in the dark here, but perhaps one of the disks

Re: [zfs-discuss] Scrub and checksum permutations

2012-10-25 Thread Timothy Coalson
On Thu, Oct 25, 2012 at 7:35 AM, Jim Klimov jimkli...@cos.ru wrote: If scrubbing works the way we logically expect it to, it should enforce validation of such combinations for each read of each copy of a block, in order to ensure that parity sectors are intact and can be used for data

Re: [zfs-discuss] Zpool LUN Sizes

2012-10-26 Thread Timothy Coalson
Disclaimer: I haven't used LUNs with ZFS, so take this with a grain of salt. On Fri, Oct 26, 2012 at 4:08 PM, Morris Hooten mhoo...@us.ibm.com wrote: I'm creating a zpool that is 25TB in size. What are the recommendations in regards to LUN sizes? The first standard advice I can give is that

Re: [zfs-discuss] Zpool LUN Sizes

2012-10-27 Thread Timothy Coalson
On Sat, Oct 27, 2012 at 9:21 AM, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) opensolarisisdeadlongliveopensola...@nedharvey.com wrote: From: Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) Performance is much better if you use mirrors instead of raid. (Sequential

Re: [zfs-discuss] Scrub and checksum permutations

2012-10-27 Thread Timothy Coalson
On Sat, Oct 27, 2012 at 12:35 PM, Jim Klimov jimkli...@cos.ru wrote: 2012-10-27 20:54, Toby Thain wrote: Parity is very simple to calculate and doesn't use a lot of CPU - just slightly more work than reading all the blocks: read all the stripe blocks on all the drives involved in a stripe,

Re: [zfs-discuss] Scrub and checksum permutations

2012-10-31 Thread Timothy Coalson
On Wed, Oct 31, 2012 at 6:47 PM, Matthew Ahrens mahr...@delphix.com wrote: On Thu, Oct 25, 2012 at 2:25 AM, Jim Klimov jimkli...@cos.ru wrote: Hello all, I was describing how raidzN works recently, and got myself wondering: does zpool scrub verify all the parity sectors and the mirror

Re: [zfs-discuss] mixing WD20EFRX and WD2002FYPS in one pool

2012-11-21 Thread Timothy Coalson
On Wed, Nov 21, 2012 at 4:45 AM, Eugen Leitl eu...@leitl.org wrote: A couple of questions: is there a way to make WD20EFRX (2 TByte, 4k sectors) and WD2002FYPS (4k internally, reported as 512 bytes?) work well together on a current OpenIndiana? Which parameters do I need to give the zfs pool in regards

Re: [zfs-discuss] mixing WD20EFRX and WD2002FYPS in one pool

2012-11-21 Thread Timothy Coalson
On Wed, Nov 21, 2012 at 1:29 PM, Timothy Coalson tsc...@mst.edu wrote: On Wed, Nov 21, 2012 at 4:45 AM, Eugen Leitl eu...@leitl.org wrote: A couple of questions: is there a way to make WD20EFRX (2 TByte, 4k sectors) and WD2002FYPS (4k internally, reported as 512 bytes?) work well together

Re: [zfs-discuss] Digging in the bowels of ZFS

2012-12-05 Thread Timothy Coalson
On Tue, Dec 4, 2012 at 10:52 PM, Jim Klimov jimkli...@cos.ru wrote: On 2012-12-03 18:23, Jim Klimov wrote: On 2012-12-02 05:42, Jim Klimov wrote: 4) Where are the redundancy algorithms specified? Is there any simple tool that would recombine a given algo-N redundancy sector with

Re: [zfs-discuss] Digging in the bowels of ZFS

2012-12-09 Thread Timothy Coalson
On Sun, Dec 9, 2012 at 1:27 PM, Jim Klimov jimkli...@cos.ru wrote: In two of three cases, some of the sectors (in the range which mismatches the parity data) are not only clearly invalid, like being filled with long stretches of zeroes while other sectors are uniform-looking binary data

Re: [zfs-discuss] S11 vs illumos zfs compatiblity

2012-12-17 Thread Timothy Coalson
On Mon, Dec 17, 2012 at 9:36 AM, Truhn, Chad chad.tr...@bowheadsupport.comwrote: I am not disagreeing with this, but isn't this the opposite test from what Ned asked? You can send from an old version (6) to a new version (28), but I don't believe you can send the other way from the new

Re: [zfs-discuss] Pool performance when nearly full

2012-12-20 Thread Timothy Coalson
On Thu, Dec 20, 2012 at 2:32 PM, Jim Klimov jimkli...@cos.ru wrote: Secondly, there's 8 vdevs each of 11 disks. 6 vdevs show used 8.19 TB, free 1.81 TB, free = 18.1%; 2 vdevs show used 6.39 TB, free 3.61 TB, free = 36.1%. How did you look that up? ;) zpool iostat -v or zpool list -v? Tim
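
Either command shows per-vdev totals; the percentages above are just FREE divided by SIZE (pool name hypothetical):

  zpool list -v tank
  # per-vdev SIZE, ALLOC and FREE columns; free% = FREE / SIZE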

Re: [zfs-discuss] Heavy write IO for no apparent reason

2013-01-17 Thread Timothy Coalson
On Thu, Jan 17, 2013 at 5:33 PM, Peter Wood peterwood...@gmail.com wrote: The 'zpool iostat -v' output is uncomfortably static. The values of read/write operations and bandwidth are the same for hours and even days. I'd expect at least some variations between morning and night. The load on
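
Without an interval argument, zpool iostat reports averages since the pool was imported, which is why the numbers barely move; per-interval numbers need an interval argument (pool name hypothetical):

  zpool iostat -v tank 5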

Re: [zfs-discuss] Heavy write IO for no apparent reason

2013-01-18 Thread Timothy Coalson
On Fri, Jan 18, 2013 at 4:55 PM, Freddie Cash fjwc...@gmail.com wrote: On Thu, Jan 17, 2013 at 4:48 PM, Peter Blajev pbla...@taaz.com wrote: Right on Tim. Thanks. I didn't know that. I'm sure it's documented somewhere and I should have read it so double thanks for explaining it. When in