Re: [zfs-discuss] Understanding directio, O_DSYNC and zfs_nocacheflush on ZFS

2011-02-07 Thread Nico Williams
On Mon, Feb 7, 2011 at 1:17 PM, Yi Zhang yizhan...@gmail.com wrote: On Mon, Feb 7, 2011 at 1:51 PM, Brandon High bh...@freaks.com wrote: Maybe I didn't make my intention clear. UFS with directio is reasonably close to a raw disk from my application's perspective: when the app writes to a file

Re: [zfs-discuss] RAID Failure Calculator (for 8x 2TB RAIDZ)

2011-02-14 Thread Nico Williams
On Feb 14, 2011 6:56 AM, Paul Kraus p...@kraus-haus.org wrote: P.S. I am measuring number of objects via `zdb -d` as that is faster than trying to count files and directories and I expect is a much better measure of what the underlying zfs code is dealing with (a particular dataset may have

Re: [zfs-discuss] disable zfs/zpool destroy for root user

2011-02-17 Thread Nico Williams
On Thu, Feb 17, 2011 at 3:07 PM, Richard Elling richard.ell...@gmail.com wrote: On Feb 17, 2011, at 12:44 PM, Stefan Dormayer wrote: Hi all, is there a way to disable the subcommand destroy of zpool/zfs for the root user? Which OS? Heheh. Great answer. The real answer depends also on

Re: [zfs-discuss] ls reports incorrect file size

2011-05-02 Thread Nico Williams
Also, sparseness need not be apparent to applications. Until recent improvements to lseek(2) to expose hole/non-hole offsets, the only way to know about sparseness was to notice that a file's reported size is more than the file's reported filesystem blocks times the block size. Sparse files in
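Both detection methods described here can be sketched in Python (names and sizes are illustrative; `os.SEEK_HOLE` is only exposed on platforms that support it, such as Solaris and Linux):

```python
import os, tempfile

def apparent_and_allocated(path):
    """Return (apparent size, allocated bytes); st_blocks is in 512-byte units."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512

# Create a wholly sparse file: 1 MiB apparent size, no data blocks written.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.truncate(1024 * 1024)
    path = f.name

size, allocated = apparent_and_allocated(path)
sparse_by_size = allocated < size   # the pre-lseek(2) heuristic from the post

# Where supported, SEEK_HOLE reports hole offsets directly.
fd = os.open(path, os.O_RDONLY)
try:
    first_hole = os.lseek(fd, 0, os.SEEK_HOLE)   # 0 for an all-hole file
finally:
    os.close(fd)
os.unlink(path)
```

On filesystems without hole reporting, SEEK_HOLE conservatively returns end-of-file, so the size-vs-blocks heuristic remains the portable fallback.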

Re: [zfs-discuss] ls reports incorrect file size

2011-05-02 Thread Nico Williams
On Mon, May 2, 2011 at 3:56 PM, Eric D. Mudama edmud...@bounceswoosh.org wrote: Yea, kept googling and it makes sense.  I guess I am simply surprised that the application would have done the seek+write combination, since on NTFS (which doesn't support sparse) these would have been real 1.5GB

Re: [zfs-discuss] ls reports incorrect file size

2011-05-02 Thread Nico Williams
Then again, Windows apps may be doing seek+write to pre-allocate storage. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
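A minimal sketch of that seek+write pattern (sizes arbitrary): on POSIX filesystems it produces a sparse file rather than real pre-allocation, which is exactly the surprise discussed in this thread.

```python
import os, tempfile

# "Pre-allocate" by seeking past EOF and writing a single byte.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.seek(100 * 1024 * 1024 - 1)   # seek ~100 MiB out
    f.write(b"\0")
    path = f.name

st = os.stat(path)
apparent = st.st_size           # 100 MiB apparent size
allocated = st.st_blocks * 512  # typically one block, not 100 MiB
os.unlink(path)
```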

Re: [zfs-discuss] [cryptography] rolling hashes, EDC/ECC vs MAC/MIC, etc.

2011-05-22 Thread Nico Williams
On Sun, May 22, 2011 at 10:20 AM, Richard Elling richard.ell...@gmail.com wrote: ZFS already tracks the blocks that have been written, and the time that they were written. So we already know when something was written, though that does not answer the question of whether the data was changed. I

Re: [zfs-discuss] [cryptography] rolling hashes, EDC/ECC vs MAC/MIC, etc.

2011-05-22 Thread Nico Williams
On Sun, May 22, 2011 at 1:52 PM, Nico Williams n...@cryptonector.com wrote: [...] Or perhaps you'll argue that no one should ever need bi-di replication, that if one finds oneself wanting that then one has taken a wrong turn somewhere. You could also grant the premise and argue instead

Re: [zfs-discuss] ZFS, Oracle and Nexenta

2011-05-26 Thread Nico Williams
On May 25, 2011 7:15 AM, Garrett D'Amore garr...@nexenta.com wrote: You are welcome to your beliefs. There are many groups that do standards that do not meet in public. [...] True. [...] In fact, I can't think of any standards bodies that *do* hold open meetings. I can: the IETF, for

Re: [zfs-discuss] ZFS Hard link space savings

2011-06-12 Thread Nico Williams
On Sun, Jun 12, 2011 at 4:14 PM, Scott Lawson scott.law...@manukau.ac.nz wrote: I have an interesting question that may or may not be answerable from some internal ZFS semantics. This is really standard Unix filesystem semantics. [...] So total storage used is around ~7.5MB due to the hard

Re: [zfs-discuss] ZFS Hard link space savings

2011-06-13 Thread Nico Williams
On Mon, Jun 13, 2011 at 5:50 AM, Roy Sigurd Karlsbakk r...@karlsbakk.net wrote: If anyone has any ideas be it ZFS based or any useful scripts that could help here, I am all ears. Something like this one-liner will show what would be allocated by everything if hardlinks weren't used: #

Re: [zfs-discuss] ZFS Hard link space savings

2011-06-13 Thread Nico Williams
On Mon, Jun 13, 2011 at 12:59 PM, Nico Williams n...@cryptonector.com wrote: Try this instead: (echo 0; find . -type f \! -links 1 | xargs stat -c '%b %B *+' $p; echo p) | dc s/\$p//

Re: [zfs-discuss] ZFS Hard link space savings

2011-06-13 Thread Nico Williams
And, without a sub-shell: find . -type f \! -links 1 | xargs stat -c '%b %B *+p' /dev/null | dc 2>/dev/null | tail -1 (The stderr redirection is because otherwise dc whines once that the stack is empty, and the tail is because we print interim totals as we go.) Also, this doesn't quite work, since
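The same accounting can be done without stat/dc quoting pitfalls; this Python sketch (function name hypothetical) sums st_blocks once per inode for the actual figure and once per link for the would-be figure:

```python
import os, tempfile

def space_usage(root):
    """Naive bytes (each hard link counted) vs actual allocated bytes
    (each (st_dev, st_ino) counted once) under root."""
    naive = actual = 0
    seen = set()
    for dirpath, _, files in os.walk(root):
        for name in files:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue
            nbytes = st.st_blocks * 512   # st_blocks is in 512-byte units
            naive += nbytes
            if (st.st_dev, st.st_ino) not in seen:
                seen.add((st.st_dev, st.st_ino))
                actual += nbytes
    return naive, actual

# Demo: one 8 KiB file plus a second hard link doubles the naive total.
d = tempfile.mkdtemp()
p = os.path.join(d, "a")
with open(p, "wb") as f:
    f.write(b"x" * 8192)
os.link(p, os.path.join(d, "b"))
naive, actual = space_usage(d)
```

The hardlink savings are then simply `naive - actual`.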

Re: [zfs-discuss] Versioning FS was: question about COW and snapshots

2011-06-16 Thread Nico Williams
As Casper pointed out, the right thing to do is to build applications such that they can detect mid-transaction state and roll it back (or forward, if there's enough data). Then mid-transaction snapshots are fine, and the lack of APIs by which to inform the filesystem of application transaction

Re: [zfs-discuss] question about COW and snapshots

2011-06-16 Thread Nico Williams
On Thu, Jun 16, 2011 at 8:51 AM, casper@oracle.com wrote: If a database engine or another application keeps both the data and the log in the same filesystem, a snapshot wouldn't create inconsistent data (I think this would be true with vim and a large number of database engines; vim will

Re: [zfs-discuss] question about COW and snapshots

2011-06-16 Thread Nico Williams
That said, losing committed transactions when you needed and thought you had ACID semantics... is bad. But that's implied in any restore-from-backups situation. So you replicate/distribute transactions so that restore from backups (or snapshots) is an absolutely last resort matter, and if you

Re: [zfs-discuss] Encryption accelerator card recommendations.

2011-06-27 Thread Nico Williams
IMO a faster processor with built-in AES and other crypto support is most likely to give you the most bang for your buck, particularly if you're using closed Solaris 11, as Solaris engineering is likely to add support for new crypto instructions faster than Illumos (but I don't really know enough

Re: [zfs-discuss] Encryption accelerator card recommendations.

2011-06-27 Thread Nico Williams
On Jun 27, 2011 9:24 PM, David Magda dma...@ee.ryerson.ca wrote: AESNI is certainly better than nothing, but RSA, SHA, and the RNG would be nice as well. It'd also be handy for ZFS crypto in addition to all the network IO stuff. The most important reason for AES-NI might be not performance but

Re: [zfs-discuss] Encryption accelerator card recommendations.

2011-06-27 Thread Nico Williams
On Jun 27, 2011 4:15 PM, David Magda dma...@ee.ryerson.ca wrote: The (Ultra)SPARC T-series processors do, but to a certain extent it goes against a CPU manufacturer's best (financial) interest to provide this: crypto is very CPU intensive using 'regular' instructions, so if you need to do a lot

Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)

2011-07-24 Thread Nico Williams
On Jul 9, 2011 1:56 PM, Edward Ned Harvey opensolarisisdeadlongliveopensola...@nedharvey.com wrote: Given the abysmal performance, I have to assume there is a significant number of overhead reads or writes in order to maintain the DDT for each actual block write operation. Something I didn't

Re: [zfs-discuss] SSD vs hybrid drive - any advice?

2011-07-28 Thread Nico Williams
On Wed, Jul 27, 2011 at 9:22 PM, Daniel Carosone d...@geek.com.au wrote: Absent TRIM support, there's another way to do this, too.  It's pretty easy to dd /dev/zero to a file now and then.  Just make sure zfs doesn't prevent these being written to the SSD (compress and dedup are off).  I have

Re: [zfs-discuss] zfs scripts

2011-09-09 Thread Nico Williams
On Fri, Sep 9, 2011 at 5:33 AM, Sriram Narayanan sri...@belenix.org wrote: Plus, you'll need an '&' character at the end of each command. And a wait command, if you want the script to wait for the sends to finish (which you should). Nico --

Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread Nico Williams
On Mon, Sep 26, 2011 at 1:55 PM, Jesus Cea j...@jcea.es wrote: I just upgraded to Solaris 10 Update 10, and one of the improvements is zfs diff. Using the birthtime of the sectors, I would expect very high performance. The actual performance doesn't seem better than a standard rdiff,

Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread Nico Williams
Ah yes, of course. I'd misread your original post. Yes, disabling atime updates will reduce the number of superfluous transactions. It's *all* transactions that count, not just the ones the app explicitly caused, and atime implies lots of transactions. Nico --

Re: [zfs-discuss] Wanted: sanity check for a clustered ZFS idea

2011-10-11 Thread Nico Williams
On Tue, Oct 11, 2011 at 11:15 PM, Richard Elling richard.ell...@gmail.com wrote: On Oct 9, 2011, at 10:28 AM, Jim Klimov wrote: ZFS developers have for a long time stated that ZFS is not intended, at least not in near term, for clustered environments (that is, having a pool safely imported by

Re: [zfs-discuss] Wanted: sanity check for a clustered ZFS idea

2011-10-11 Thread Nico Williams
On Sun, Oct 9, 2011 at 12:28 PM, Jim Klimov jimkli...@cos.ru wrote: So, one version of the solution would be to have a single host which imports the pool in read-write mode (i.e. the first one which boots), and other hosts would write thru it (like iSCSI or whatever; maybe using SAS or FC to

Re: [zfs-discuss] Wanted: sanity check for a clustered ZFS idea

2011-10-14 Thread Nico Williams
On Thu, Oct 13, 2011 at 9:13 PM, Jim Klimov jimkli...@cos.ru wrote: Thanks to Nico for concerns about POSIX locking. However, hopefully, in the usecase I described - serving images of VMs in a manner where storage, access and migration are efficient - whole datasets (be it volumes or FS

Re: [zfs-discuss] Wanted: sanity check for a clustered ZFS idea

2011-10-14 Thread Nico Williams
Also, it's not worth doing a clustered ZFS thing that is too application-specific. You really want to nail down your choices of semantics, explore what design options those yield (or approach from the other direction, or both), and so on. Nico --

Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Nico Williams
On Tue, Oct 18, 2011 at 9:35 AM, Brian Wilson bfwil...@doit.wisc.edu wrote: I just wanted to add something on fsck on ZFS - because for me that used to make ZFS 'not ready for prime-time' in 24x7 5+ 9s uptime environments. Where ZFS doesn't have an fsck command - and that really used to bug me

Re: [zfs-discuss] about btrfs and zfs

2011-10-19 Thread Nico Williams
On Wed, Oct 19, 2011 at 7:24 AM, Garrett D'Amore garrett.dam...@nexenta.com wrote: I'd argue that from a *developer* point of view, an fsck tool for ZFS might well be useful.  Isn't that what zdb is for? :-) But ordinary administrative users should never need something like this, unless

Re: [zfs-discuss] Wanted: sanity check for a clustered ZFS idea

2011-11-08 Thread Nico Williams
To some people active-active means all cluster members serve the same filesystems. To others active-active means all cluster members serve some filesystems and can serve all filesystems ultimately by taking over failed cluster members. Nico --

Re: [zfs-discuss] about btrfs and zfs

2011-11-11 Thread Nico Williams
On Fri, Nov 11, 2011 at 4:27 PM, Paul Kraus p...@kraus-haus.org wrote: The command syntax paradigm of zfs (command sub-command object parameters) is not unique to zfs, but seems to have been the way of doing things in Solaris 10. The _new_ functions of Solaris 10 were all this way (to the best

Re: [zfs-discuss] about btrfs and zfs

2011-11-14 Thread Nico Williams
On Mon, Nov 14, 2011 at 8:33 AM, Edward Ned Harvey opensolarisisdeadlongliveopensola...@nedharvey.com wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Paul Kraus Is it really B-Tree based? Apple's HFS+ is B-Tree based and falls apart

[zfs-discuss] aclmode=mask

2011-11-14 Thread Nico Williams
I see, with great pleasure, that ZFS in Solaris 11 has a new aclmode=mask property. http://download.oracle.com/docs/cd/E23824_01/html/821-1448/gbscy.html#gkkkp http://download.oracle.com/docs/cd/E23824_01/html/821-1448/gbchf.html#gljyz

Re: [zfs-discuss] aclmode=mask

2011-11-14 Thread Nico Williams
On Mon, Nov 14, 2011 at 6:20 PM, Nico Williams n...@cryptonector.com wrote: I see, with great pleasure, that ZFS in Solaris 11 has a new aclmode=mask property. Also, congratulations on shipping. And thank you for implementing aclmode=mask. Nico

Re: [zfs-discuss] virtualbox rawdisk discrepancy

2011-11-21 Thread Nico Williams
Moving boot disks from one machine to another used to work as long as the machines were of the same architecture. I don't recall if it was *supported* (and wouldn't want to pretend to speak for Oracle now), but it was meant to work (unless you minimized the install and removed drivers not needed

Re: [zfs-discuss] grrr, How to get rid of mis-touched file named `-c'

2011-11-28 Thread Nico Williams
On Mon, Nov 28, 2011 at 11:28 AM, Smith, David W. smith...@llnl.gov wrote: You could list by inode, then use find with rm. # ls -i 7223 -O # find . -inum 7223 -exec rm {} \; This is the one solution I'd recommend against, since it would remove hardlinks that you might care about. Also,

Re: [zfs-discuss] bug moving files between two zfs filesystems (too many open files)

2011-11-29 Thread Nico Williams
On Tue, Nov 29, 2011 at 12:17 PM, Cindy Swearingen cindy.swearin...@oracle.com wrote: I think the too many open files is a generic error message about running out of file descriptors. You should check your shell ulimit information. Also, see how many open files you have: echo /proc/self/fd/*
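Both checks suggested here can be done programmatically; a sketch (the /proc path is Linux/Solaris-specific, hence the fallback):

```python
import os, resource

# The descriptor limits behind the shell's `ulimit -n`.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# /proc/self/fd lists this process's open descriptors, which is what
# `echo /proc/self/fd/*` counts in the thread.
try:
    open_fds = len(os.listdir("/proc/self/fd"))
except FileNotFoundError:
    open_fds = None   # no /proc on this platform
```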

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-13 Thread Nico Williams
On Dec 11, 2011 5:12 AM, Nathan Kroenert nat...@tuneunix.com wrote: On 12/11/11 01:05 AM, Pawel Jakub Dawidek wrote: On Wed, Dec 07, 2011 at 10:48:43PM +0200, Mertol Ozyoney wrote: Unfortunately the answer is no. Neither L1 nor L2 cache is dedup aware. The only vendor I know that can do

Re: [zfs-discuss] S11 vs illumos zfs compatiblity

2011-12-27 Thread Nico Williams
On Tue, Dec 27, 2011 at 2:20 PM, Frank Cusack fr...@linetwo.net wrote: http://sparcv9.blogspot.com/2011/12/solaris-11-illumos-and-source.html If I upgrade ZFS to use the new features in Solaris 11 I will be unable to import my pool using the free ZFS implementation that is available in

Re: [zfs-discuss] S11 vs illumos zfs compatiblity

2011-12-27 Thread Nico Williams
On Tue, Dec 27, 2011 at 8:44 PM, Frank Cusack fr...@linetwo.net wrote: So with a de facto fork (illumos) now in place, is it possible that two zpools will report the same version yet be incompatible across implementations? Not likely: the Illumos community has developed a method for managing

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-28 Thread Nico Williams
On Wed, Dec 28, 2011 at 3:14 PM, Brad Diggs brad.di...@oracle.com wrote: The two key takeaways from this exercise were as follows.  There is tremendous caching potential through the use of ZFS deduplication.  However, the current block level deduplication does not benefit directory as much

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-29 Thread Nico Williams
On Thu, Dec 29, 2011 at 9:53 AM, Brad Diggs brad.di...@oracle.com wrote: Jim, You are spot on.  I was hoping that the writes would be close enough to identical that there would be a high ratio of duplicate data since I use the same record size, page size, compression algorithm, … etc.  

Re: [zfs-discuss] S11 vs illumos zfs compatiblity

2011-12-29 Thread Nico Williams
On Thu, Dec 29, 2011 at 2:06 PM, sol a...@yahoo.com wrote: Richard Elling wrote:  many of the former Sun ZFS team regularly contribute to ZFS through the illumos developer community. Does this mean that if they provide a bug fix via illumos then the fix won't make it into the Oracle code? If

Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

2011-12-29 Thread Nico Williams
On Thu, Dec 29, 2011 at 6:44 PM, Matthew Ahrens mahr...@delphix.com wrote: On Mon, Dec 12, 2011 at 11:04 PM, Erik Trimble tr...@netdemons.com wrote: (1) when constructing the stream, every time a block is read from a fileset (or volume), its checksum is sent to the receiving machine. The

Re: [zfs-discuss] S11 vs illumos zfs compatiblity

2012-01-05 Thread Nico Williams
On Thu, Jan 5, 2012 at 8:53 AM, sol a...@yahoo.com wrote: if a bug fixed in Illumos is never reported to Oracle by a customer, it would likely never get fixed in Solaris either :-( I would have liked to think that there was some good-will between the ex- and current-members of the zfs

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-11 Thread Nico Williams
On Wed, Jan 11, 2012 at 9:16 AM, Jim Klimov jimkli...@cos.ru wrote: I've recently had a sort of an opposite thought: yes, ZFS redundancy is good - but also expensive in terms of raw disk space. This is especially bad for hardware space-constrained systems like laptops and home-NASes, where

Re: [zfs-discuss] Data loss by memory corruption?

2012-01-18 Thread Nico Williams
On Wed, Jan 18, 2012 at 4:53 AM, Jim Klimov jimkli...@cos.ru wrote: 2012-01-18 1:20, Stefan Ring wrote: I don’t care too much if a single document gets corrupted – there’ll always be a good copy in a snapshot. I do care however if a whole directory branch or old snapshots were to disappear.

Re: [zfs-discuss] ZFS on Linux vs FreeBSD

2012-04-25 Thread Nico Williams
As I understand it LLNL has very large datasets on ZFS on Linux. You could inquire with them, as well as http://groups.google.com/a/zfsonlinux.org/group/zfs-discuss/topics?pli=1 . My guess is that it's quite stable for at least some use cases (most likely: LLNL's!), but that may not be yours.

Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)

2012-04-25 Thread Nico Williams
I agree, you need something like AFS, Lustre, or pNFS. And/or an NFS proxy to those. Nico --

Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)

2012-04-25 Thread Nico Williams
On Wed, Apr 25, 2012 at 4:26 PM, Paul Archer p...@paularcher.org wrote: 2:20pm, Richard Elling wrote: Ignoring lame NFS clients, how is that architecture different than what you would have with any other distributed file system? If all nodes share data to all other nodes, then...? Simple.

Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)

2012-04-25 Thread Nico Williams
On Wed, Apr 25, 2012 at 5:22 PM, Richard Elling richard.ell...@gmail.com wrote: Unified namespace doesn't relieve you of 240 cross-mounts (or equivalents). FWIW, automounters were invented 20+ years ago to handle this in a nearly seamless manner. Today, we have DFS from Microsoft and NFS

Re: [zfs-discuss] cluster vs nfs

2012-04-25 Thread Nico Williams
On Wed, Apr 25, 2012 at 5:42 PM, Ian Collins i...@ianshome.com wrote: Aren't those general considerations when specifying a file server? There are Lustre clusters with thousands of nodes, hundreds of them being servers, and high utilization rates. Whatever specs you might have for one server

Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)

2012-04-25 Thread Nico Williams
On Wed, Apr 25, 2012 at 7:37 PM, Richard Elling richard.ell...@gmail.com wrote: On Apr 25, 2012, at 3:36 PM, Nico Williams wrote: I disagree vehemently.  automount is a disaster because you need to synchronize changes with all those clients.  That's not realistic. Really?  I did it with NIS

Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)

2012-04-25 Thread Nico Williams
On Wed, Apr 25, 2012 at 8:57 PM, Paul Kraus pk1...@gmail.com wrote: On Wed, Apr 25, 2012 at 9:07 PM, Nico Williams n...@cryptonector.com wrote: Nothing's changed.  Automounter + data migration - rebooting clients (or close enough to rebooting).  I.e., outage.    Uhhh, not if you design your

Re: [zfs-discuss] cluster vs nfs

2012-04-26 Thread Nico Williams
On Thu, Apr 26, 2012 at 12:10 AM, Richard Elling richard.ell...@gmail.com wrote: On Apr 25, 2012, at 8:30 PM, Carson Gaspar wrote: Reboot requirement is a lame client implementation. And lame protocol design. You could possibly migrate read-write NFSv3 on the fly by preserving FHs and somehow

Re: [zfs-discuss] cluster vs nfs

2012-04-26 Thread Nico Williams
On Thu, Apr 26, 2012 at 5:45 PM, Carson Gaspar car...@taltos.org wrote: On 4/26/12 2:17 PM, J.P. King wrote: I don't know SnapMirror, so I may be mistaken, but I don't see how you can have non-synchronous replication which can allow for seamless client failover (in the general case).

Re: [zfs-discuss] cluster vs nfs

2012-04-26 Thread Nico Williams
On Thu, Apr 26, 2012 at 12:37 PM, Richard Elling richard.ell...@gmail.com wrote: [...] NFSv4 had migration in the protocol (excluding protocols between servers) from the get-go, but it was missing a lot (FedFS) and was not implemented until recently. I've no idea what clients and servers

Re: [zfs-discuss] current status of SAM-QFS?

2012-05-03 Thread Nico Williams
On Wed, May 2, 2012 at 7:59 AM, Paul Kraus p...@kraus-haus.org wrote: On Wed, May 2, 2012 at 7:46 AM, Darren J Moffat darr...@opensolaris.org wrote: If Oracle is only willing to share (public) information about the roadmap for products via official sales channels then there will be lots of

Re: [zfs-discuss] Terminology question on ZFS COW

2012-06-05 Thread Nico Williams
COW goes back at least to the early days of virtual memory and fork(). On fork() the kernel would arrange for writable pages in the parent process to be made read-only so that writes to them could be caught and then the page fault handler would copy the page (and restore write access) so the
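That fork-time behavior is directly observable: after fork(2) the two processes share pages copy-on-write, so a write in the child faults, gets a private copy of the page, and leaves the parent's view untouched. A small demonstration:

```python
import os

data = bytearray(b"original")
r, w = os.pipe()
pid = os.fork()
if pid == 0:                      # child
    os.close(r)
    data[:] = b"mutated!"         # triggers the COW page fault
    os.write(w, bytes(data))
    os._exit(0)
os.close(w)
child_view = os.read(r, 64)       # what the child saw after its write
os.close(r)
os.waitpid(pid, 0)
parent_view = bytes(data)         # still the pre-fork contents
```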

Re: [zfs-discuss] Is there an actual newsgroup for zfs-discuss?

2012-06-11 Thread Nico Williams
On Mon, Jun 11, 2012 at 5:05 PM, Tomas Forsman st...@acc.umu.se wrote: .. or use a mail reader that doesn't suck. Or the mailman thread view.

Re: [zfs-discuss] [developer] Re: History of EPERM for unlink() of directories on ZFS?

2012-06-26 Thread Nico Williams
On Tue, Jun 26, 2012 at 9:44 AM, Alan Coopersmith alan.coopersm...@oracle.com wrote: On 06/26/12 05:46 AM, Lionel Cons wrote: On 25 June 2012 11:33,  casper@oracle.com wrote: To be honest, I think we should also remove this from all other filesystems and I think ZFS was created this way

Re: [zfs-discuss] History of EPERM for unlink() of directories on ZFS?

2012-06-26 Thread Nico Williams
On Tue, Jun 26, 2012 at 8:12 AM, Lionel Cons lionelcons1...@googlemail.com wrote: On 26 June 2012 14:51,  casper@oracle.com wrote: We've already asked our Netapp representative. She said it's not hard to add that. Did NetApp tell you that they'll add support for using the NFSv4 LINK

Re: [zfs-discuss] Interaction between ZFS intent log and mmap'd files

2012-07-02 Thread Nico Williams
On Mon, Jul 2, 2012 at 3:32 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Mon, 2 Jul 2012, Iwan Aucamp wrote: I'm interested in some more detail on how ZFS intent log behaves for updated done via a memory mapped file - i.e. will the ZIL log updates done to an mmap'd file or not ?

Re: [zfs-discuss] Interaction between ZFS intent log and mmap'd files

2012-07-03 Thread Nico Williams
On Tue, Jul 3, 2012 at 9:48 AM, James Litchfield jim.litchfi...@oracle.com wrote: On 07/02/12 15:00, Nico Williams wrote: You can't count on any writes to mmap(2)ed files hitting disk until you msync(2) with MS_SYNC. The system should want to wait as long as possible before committing any

Re: [zfs-discuss] Interaction between ZFS intent log and mmap'd files

2012-07-04 Thread Nico Williams
On Wed, Jul 4, 2012 at 11:14 AM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Tue, 3 Jul 2012, James Litchfield wrote: Agreed - msync/munmap is the only guarantee. I don't see that the munmap definition assures that anything is written to disk. The system is free to buffer the data
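The point the thread settles on — msync(2) with MS_SYNC is the only portable durability guarantee for mmap'd writes — looks like this in Python, where `mmap.flush()` issues the msync (file name and sizes arbitrary):

```python
import mmap, os, tempfile

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.truncate(4096)
    path = f.name

fd = os.open(path, os.O_RDWR)
mm = mmap.mmap(fd, 4096)   # defaults: MAP_SHARED, PROT_READ|PROT_WRITE
mm[:5] = b"hello"          # a store into the page cache, not yet durable
mm.flush()                 # msync(MS_SYNC): durable on return
mm.close()                 # munmap alone would promise nothing
os.close(fd)

with open(path, "rb") as f:
    first = f.read(5)
os.unlink(path)
```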

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Nico Williams
On Wed, Jul 11, 2012 at 9:48 AM, casper@oracle.com wrote: Huge space, but still finite… Dan Brown seems to think so in Digital Fortress but it just means he has no grasp on big numbers. I couldn't get past that. I had to put the book down. I'm guessing it was as awful as it threatened

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Nico Williams
On Wed, Jul 11, 2012 at 3:45 AM, Sašo Kiselkov skiselkov...@gmail.com wrote: It's also possible to set dedup=verify with checksum=sha256, however, that makes little sense (as the chances of getting a random hash collision are essentially nil). IMO dedup should always verify. Nico --

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Nico Williams
You can treat whatever hash function as an idealized one, but actual hash functions aren't. There may well be as-yet-undiscovered input bit pattern ranges where there's a large density of collisions in some hash function, and indeed, since our hash functions aren't ideal, there must be. We just
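The dedup=verify stance argued above can be sketched as a toy block store (class and field names hypothetical): a hash match is only a *candidate* duplicate, and the block is byte-compared before it is ever shared.

```python
import hashlib

class VerifyingDedup:
    """Toy dedup table: verify candidate duplicates byte-for-byte."""
    def __init__(self):
        self.by_hash = {}        # digest -> canonical block bytes
        self.verified_hits = 0
        self.collisions = 0

    def write(self, block):
        d = hashlib.sha256(block).digest()
        existing = self.by_hash.get(d)
        if existing is not None:
            if existing == block:    # verify before sharing the block
                self.verified_hits += 1
                return d
            self.collisions += 1     # same digest, different bytes:
                                     # store separately, never share
        self.by_hash.setdefault(d, block)
        return d

store = VerifyingDedup()
store.write(b"A" * 128)
store.write(b"A" * 128)   # candidate hit, byte-verified, then shared
store.write(b"B" * 128)
```

The verify read costs I/O on every candidate hit, which is the usual argument against it; the argument here is that correctness should not rest on the hash alone.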

Re: [zfs-discuss] Can the ZFS copies attribute substitute HW disk redundancy?

2012-07-30 Thread Nico Williams
The copies thing is a really only for laptops, where the likelihood of redundancy is very low (there are some high-end laptops with multiple drives, but those are relatively rare) and where this idea is better than nothing. It's also nice that copies can be set on a per-dataset manner (whereas

Re: [zfs-discuss] Solaris 11 System Reboots Continuously Because of a ZFS-Related Panic (7191375)

2013-01-14 Thread Nico Williams
On Mon, Jan 14, 2013 at 1:48 PM, Tomas Forsman st...@acc.umu.se wrote: https://bug.oraclecorp.com/pls/bug/webbug_print.show?c_rptno=15852599 Host oraclecorp.com not found: 3(NXDOMAIN) Would oracle.internal be a better domain name? Things like that cannot be changed easily. They (Oracle) are

Re: [zfs-discuss] RFE: Un-dedup for unique blocks

2013-01-19 Thread Nico Williams
I've wanted a system where dedup applies only to blocks being written that have a good chance of being dups of others. I think one way to do this would be to keep a scalable Bloom filter (on disk) into which one inserts block hashes. To decide if a block needs dedup one would first check the

Re: [zfs-discuss] RFE: Un-dedup for unique blocks

2013-01-20 Thread Nico Williams
Bloom filters are very small, that's the difference. You might only need a few bits per block for a Bloom filter. Compare to the size of a DDT entry. A Bloom filter could be cached entirely in main memory.
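A minimal sketch of the idea (class name and sizing choices are illustrative, using the standard m = -n ln p / (ln 2)^2 formula): around ten bits per block at a 1% false-positive rate, versus a few hundred bytes per DDT entry.

```python
import hashlib, math

class BloomFilter:
    """Bloom filter over block hashes: 'maybe present' or 'definitely not'."""
    def __init__(self, n_items, fp_rate):
        self.m = max(8, int(-n_items * math.log(fp_rate) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive k indexes from one SHA-256 digest.
        h = hashlib.sha256(key).digest()
        a = int.from_bytes(h[:8], "big")
        b = int.from_bytes(h[8:16], "big")
        return [(a + i * b) % self.m for i in range(self.k)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key):
        return all(self.bits[p // 8] >> (p % 8) & 1
                   for p in self._positions(key))

bf = BloomFilter(n_items=10000, fp_rate=0.01)
bf.add(b"block:42")
present = b"block:42" in bf
bits_per_item = bf.m / 10000   # ~9.6 bits per tracked block
```

A miss means the block has definitely not been seen, so the expensive DDT lookup is skipped; only "maybe present" answers pay for a full DDT probe.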

Re: [zfs-discuss] RFE: Un-dedup for unique blocks

2013-01-22 Thread Nico Williams
IIRC dump is special. As for swap... really, you don't want to swap. If you're swapping you have problems. Any swap space you have is to help you detect those problems and correct them before apps start getting ENOMEM. There *are* exceptions to this, such as Varnish. For Varnish and any other