[zfs-discuss] Improving L1ARC cache efficiency with dedup
Hello,

I have a hypothetical question regarding ZFS deduplication. Does the L1ARC cache benefit from deduplication, in the sense that the L1ARC will only need to cache one copy of the deduplicated data versus many copies? Here is an example: imagine that I have a server with 2TB of RAM and a PB of disk storage. On this server I create a single 1TB data file that is full of unique data. Then I make 9 copies of that file, giving each file a unique name and location within the same ZFS zpool. If I start up 10 application instances, where each application reads all of its own copy of the data, will the L1ARC contain only the deduplicated data, or will it cache separate copies of the data from each file? In simpler terms, will the L1ARC require 10TB of RAM or just 1TB of RAM to cache all 10 of the 1TB files' worth of data? My hope is that since the data only physically occupies 1TB of storage via deduplication, the L1ARC will also only require 1TB of RAM for the data.

Note that I know the deduplication table will use the L1ARC as well. However, the focus of my question is on how the L1ARC would benefit from a data caching standpoint.

Thanks in advance!
Brad

Brad Diggs | Principal Sales Consultant
Tech Blog: http://TheZoneManager.com
LinkedIn: http://www.linkedin.com/in/braddiggs

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup
Unfortunately, the answer is no. Neither the L1 nor the L2 cache is dedup aware. The only vendor I know of that can do this is NetApp.

In fact, most of our functions, such as replication, are not dedup aware. However, we have the significant advantage that ZFS keeps checksums regardless of whether dedup is on or off. So, in the future, we could perhaps make these functions more dedup friendly whether or not dedup is enabled. For example, it is technically possible to optimize our replication so that it does not send a data chunk if a chunk with the same checksum already exists on the target, without enabling dedup on either the target or the source.

Best regards
Mertol Ozyoney

Sent from a mobile device

On 07 Ara 2011, at 20:46, Brad Diggs brad.di...@oracle.com wrote: [...]
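Mertol's proposed optimization can be sketched in a few lines. This is a purely illustrative model of checksum-driven replication, with made-up names; it is not how zfs send works today. The sender consults the target's set of known checksums and ships a chunk's payload only when the target lacks it:

```python
# Sketch of checksum-driven replication as described above: skip sending
# a data chunk when the target already holds a chunk with the same checksum.
# Illustrative only; 'zfs send' does not currently work this way.

import hashlib

def replicate(source_chunks, target_store):
    """target_store maps checksum -> chunk; returns payload bytes actually sent."""
    sent = 0
    for chunk in source_chunks:
        csum = hashlib.sha256(chunk).hexdigest()
        if csum in target_store:
            continue                # target already has it: no payload needed
        target_store[csum] = chunk  # cold chunk: ship the payload
        sent += len(chunk)
    return sent

target = {}
chunks = [b"alpha", b"beta", b"alpha", b"alpha"]  # duplicates on the source
print(replicate(chunks, target))   # 9 -> only "alpha" (5) + "beta" (4) are sent
print(replicate(chunks, target))   # 0 -> a re-run sends nothing
```

Note that this gains the bandwidth savings of dedup without requiring the dedup property to be enabled on either pool, exactly because the checksums already exist.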
Re: [zfs-discuss] First zone creation - getting ZFS error
On 12/ 7/11 05:12 AM, Mark Creamer wrote:

I'm running OI 151a. I'm trying to create a zone for the first time, and am getting an error about ZFS. I'm logged in as me, then su - to root before running these commands. I have a pool called datastore, mounted at /datastore. Per the wiki document http://wiki.openindiana.org/oi/Building+in+zones, I first created the zfs file system (note that the command syntax in the document appears to be wrong, so I did the options I wanted separately):

zfs create datastore/zones
zfs set compression=on datastore/zones
zfs set mountpoint=/zones datastore/zones

zfs list shows:

NAME                         USED  AVAIL  REFER  MOUNTPOINT
datastore                   28.5M  7.13T  57.9K  /datastore
datastore/dbdata            28.1M  7.13T  28.1M  /datastore/dbdata
datastore/zones             55.9K  7.13T  55.9K  /zones
rpool                       27.6G   201G    45K  /rpool
rpool/ROOT                  2.89G   201G    31K  legacy
rpool/ROOT/openindiana      2.89G   201G  2.86G  /
rpool/dump                  12.0G   201G  12.0G  -
rpool/export                5.53M   201G    32K  /export
rpool/export/home           5.50M   201G    32K  /export/home
rpool/export/home/mcreamer  5.47M   201G  5.47M  /export/home/mcreamer
rpool/swap                  12.8G   213G   137M  -

Then I went about creating the zone:

zonecfg -z zonemaster
create
set autoboot=true
set zonepath=/zones/zonemaster
set ip-type=exclusive
add net
set physical=vnic0
end
exit

That all goes fine, then:

zoneadm -z zonemaster install

which returns:

ERROR: the zonepath must be a ZFS dataset.
The parent directory of the zonepath must be a ZFS dataset so that the zonepath ZFS dataset can be created properly.

That's odd; it should have worked. Since the ZFS dataset datastore/zones is created, I don't understand what the error is trying to get me to do. Do I have to do:

zfs create datastore/zones/zonemaster

before I can create a zone in that path? That's not in the documentation, so I didn't want to do anything until someone can point out my error for me. Thanks for your help!

You shouldn't have to, but it won't do any harm. If you don't get any further, try zones-discuss.

-- Ian.
Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup
It was my understanding that both dedup and caching work at the block level. So if you have identical on-disk blocks (the same original data after the same compression and encryption), they turn into one(*) on-disk block with several references from the DDT. And that one block is only cached once, saving ARC space.

* (Technically, for very often referenced blocks there are a number of copies, controlled by the ditto attribute.)

HTH,
//Jim Klimov
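The mechanism Jim describes can be illustrated with a toy model. The names below (ddt, arc_cache, write_block) are invented for illustration and greatly simplified; the real ARC and DDT are far more involved. The point is only that when dedup collapses identical blocks to one on-disk address, a cache keyed by that address holds a single copy no matter how many logical files reference it:

```python
# Toy model of block-level dedup plus a block cache, illustrating why ten
# logical copies of identical data occupy cache space only once.
# All names are illustrative; this is not the real ZFS ARC/DDT code.

import hashlib

ddt = {}        # checksum -> on-disk block address (simplified dedup table)
arc_cache = {}  # block address -> block payload (simplified block cache)

def write_block(payload: bytes) -> str:
    """Dedup write: identical payloads collapse to one stored block."""
    csum = hashlib.sha256(payload).hexdigest()
    if csum not in ddt:
        ddt[csum] = f"dva-{len(ddt)}"   # allocate one new on-disk block
    return ddt[csum]                     # every copy references the same address

def read_block(addr: str, payload: bytes) -> bytes:
    """Read path: the cache is keyed by block address, so duplicates hit."""
    if addr not in arc_cache:
        arc_cache[addr] = payload        # cold read fills the cache once
    return arc_cache[addr]

# Ten "files" holding the same data, modeled as one block each:
data = b"1TB of unique data"
addrs = [write_block(data) for _ in range(10)]

print(len(set(addrs)))   # 1 -> one physical block on disk
for a in addrs:
    read_block(a, data)
print(len(arc_cache))    # 1 -> one cached copy serves all ten readers
```

Under this model, Brad's 10 x 1TB scenario would indeed need roughly 1TB of ARC for the data; Mertol's answer indicates the shipping implementation at the time did not behave this way for the L1/L2 caches.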