Re: [zfs-discuss] ZIL on a dedicated HDD slice (1-2 disk systems)

2012-01-09 Thread Darren J Moffat
On 01/08/12 18:21, Bob Friesenhahn wrote: Something else to be aware of is that even if you don't have a dedicated ZIL device, zfs will create a ZIL using devices in the main pool so Terminology nit: The log device is a SLOG. Every ZFS dataset has a ZIL. Where the ZIL writes (slog or main

Re: [zfs-discuss] ZIL on a dedicated HDD slice (1-2 disk systems)

2012-01-09 Thread Jim Klimov
2012-01-08 5:45, Richard Elling wrote: I think you will see a tradeoff on the read side of the mixed read/write workload. Sync writes have higher priority than reads so the order of I/O sent to the disk will appear to be very random and not significantly coalesced. This is the pathological

Re: [zfs-discuss] zfs defragmentation via resilvering?

2012-01-09 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Bob Friesenhahn To put things in proper perspective, with 128K filesystem blocks, the worst case file fragmentation as a percentage is 0.39% (100*1/((128*1024)/512)). On a Microsoft Windows
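Bob's 0.39% figure can be checked directly: with 128K filesystem blocks and 512-byte sectors, the worst case is one discontinuity per block, i.e. 1 in 256 sectors. A minimal sketch of that arithmetic (the function name is mine, not from the thread):

```python
# Worst-case file fragmentation as a percentage: at most one seek per
# filesystem block, relative to the sectors that make up each block.
def worst_case_frag_pct(block_size_bytes, sector_bytes=512):
    sectors_per_block = block_size_bytes // sector_bytes
    return 100.0 * 1 / sectors_per_block

print(worst_case_frag_pct(128 * 1024))  # 0.390625, i.e. ~0.39%
```

This matches the 100*1/((128*1024)/512) expression quoted above.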

Re: [zfs-discuss] ZIL on a dedicated HDD slice (1-2 disk systems)

2012-01-09 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Jim Klimov 1) Sync writes will land on disk randomly into nearest (to disk heads) available blocks, in order to have them committed ASAP; This is true - but you need to make the distinction

Re: [zfs-discuss] zfs defragmentation via resilvering?

2012-01-09 Thread Richard Elling
On Jan 9, 2012, at 5:44 AM, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Bob Friesenhahn To put things in proper perspective, with 128K filesystem blocks, the worst case file fragmentation as a percentage is

Re: [zfs-discuss] zfs read-ahead and L2ARC

2012-01-09 Thread John Martin
On 01/08/12 20:10, Jim Klimov wrote: Is it true or false that: ZFS might skip the cache and go to disks for streaming reads? I don't believe this was ever suggested. Instead, if data is not already in the file system cache and a large read is made from disk should the file system put this

Re: [zfs-discuss] zfs read-ahead and L2ARC

2012-01-09 Thread Jim Klimov
Thanks for the replies, some more questions follow. Your answers below seem to contradict each other somewhat. Is it true that: 1) VDEV cache before b70 used to contain a full copy of prefetched disk contents, 2) VDEV cache since b70 analyzes the prefetched sectors and only keeps metadata

Re: [zfs-discuss] zfs read-ahead and L2ARC

2012-01-09 Thread John Martin
On 01/08/12 10:15, John Martin wrote: I believe Joerg Moellenkamp published a discussion several years ago on how the L1ARC attempts to deal with the pollution of the cache by large streaming reads, but I don't have a bookmark handy (nor the knowledge of whether the behavior is still accurate).

Re: [zfs-discuss] zfs read-ahead and L2ARC

2012-01-09 Thread Jim Klimov
2012-01-09 18:15, John Martin wrote: On 01/08/12 20:10, Jim Klimov wrote: Is it true or false that: ZFS might skip the cache and go to disks for streaming reads? (The more I think about it, the more senseless this sentence seems, and I might have just mistaken it with ZIL writes of bulk

Re: [zfs-discuss] zfs defragmentation via resilvering?

2012-01-09 Thread Bob Friesenhahn
On Mon, 9 Jan 2012, Edward Ned Harvey wrote: I don't think that's correct... But it is! :-) Suppose you write a 1G file to disk. It is a database store. Now you start running your db server. It starts performing transactions all over the place. It overwrites the middle 4k of the file,
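The database scenario Edward describes can be illustrated with a toy copy-on-write model (entirely hypothetical, for illustration only): the file starts out contiguous, but because ZFS never overwrites in place, each 4k-style update is written to a freshly allocated location, so a later sequential read has to seek:

```python
# Toy copy-on-write allocator: a file begins as 100 contiguous on-disk
# locations; each overwrite remaps one logical block to the current end
# of the "disk", fragmenting the once-sequential layout.
def cow_overwrite(block_map, next_free, index):
    """Remap one logical block to a freshly allocated location."""
    block_map[index] = next_free
    return next_free + 1

blocks = list(range(100))        # logical block -> on-disk location
next_free = 100                  # next unallocated on-disk location
for i in (50, 51, 52):           # updates in the middle of the file
    next_free = cow_overwrite(blocks, next_free, i)

# The sequential read order now jumps mid-file:
print(blocks[48:55])             # [48, 49, 100, 101, 102, 53, 54]
```

With large (128K) blocks the cost of each such jump is amortized over many sectors, which is the point being argued in this subthread.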

Re: [zfs-discuss] zfs defragmentation via resilvering?

2012-01-09 Thread Jim Klimov
2012-01-09 19:14, Bob Friesenhahn wrote: In summary, with zfs's default 128K block size, data fragmentation is not a significant issue, If the zfs filesystem block size is reduced to a much smaller value (e.g. 8K) then it can become a significant issue. As Richard Elling points out, a database
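The claim that a smaller block size makes fragmentation significant follows from the same worst-case ratio quoted earlier in the thread; at an 8K recordsize the bound is 16 times worse than at 128K. A quick check:

```python
# Worst-case fragmentation percentage at an 8K recordsize:
# one discontinuity per block, 512-byte sectors.
block = 8 * 1024
sectors_per_block = block // 512   # 16 sectors per block
pct = 100.0 / sectors_per_block    # 6.25%, vs ~0.39% at 128K
print(pct)
```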

Re: [zfs-discuss] Thinking about spliting a zpool in system and data

2012-01-09 Thread Jesus Cea
On 07/01/12 13:39, Jim Klimov wrote: I have transitioned a number of systems roughly by the same procedure as you've outlined. Sadly, my notes are not in English so they wouldn't be of much help directly; Yes, my Russian is rusty :-). I have