Re: svn commit: r286570 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys

2015-08-14 Thread Andriy Gapon
On 10/08/2015 13:34, Alexander Motin wrote:
 Author: mav
 Date: Mon Aug 10 10:34:23 2015
 New Revision: 286570
 URL: https://svnweb.freebsd.org/changeset/base/286570
 
 Log:
   MFV r277426: 5408 managing ZFS cache devices requires lots of RAM
   Reviewed by: Christopher Siden christopher.si...@delphix.com
   Reviewed by: George Wilson george.wil...@delphix.com
   Reviewed by: Matthew Ahrens mahr...@delphix.com
   Reviewed by: Don Brady dev.fs@gmail.com
   Reviewed by: Josef 'Jeff' Sipek josef.si...@nexenta.com
   Approved by: Garrett D'Amore garr...@damore.org
   Author: Chris Williamson chris.william...@delphix.com
   
   illumos/illumos-gate@89c86e32293a30cdd7af530c38b2073fee01411c
   
   Currently, every buffer cached in the L2ARC is accompanied by a 240-byte
   header in memory, leading to very high memory consumption when using very
   large cache devices. These changes significantly reduce this overhead.
   
   Currently:
   
   L1-only header = 176 bytes
   L1 + L2 or L2-only header = 176 bytes + 32 byte checksum + 32 byte l2hdr
   = 240 bytes
   
   Memory-optimized:
   
   L1-only header = 176 bytes
   L1 + L2 header = 176 bytes + 32 byte checksum = 208 bytes
   L2-only header = 96 bytes + 32 byte checksum = 128 bytes
   
   So overall:
   
              Trunk    Optimized
            +--------+-----------+
    L1-only | 176 B  | 176 B     | (same)
            +--------+-----------+
    L1 + L2 | 240 B  | 208 B     | (saved 32 bytes)
            +--------+-----------+
    L2-only | 240 B  | 128 B     | (saved 112 bytes)
            +--------+-----------+
   
   For an average blocksize of 8KB, this means that for the L2ARC, the ratio
   of metadata to data has gone down from about 2.92% to 1.56%.  For a
   'storage optimized' EC2 instance with 1600GB of SSD and 60GB of RAM, this
   means that we expect a completely full L2ARC to use (1600 GB * 0.0156) /
   60GB = 41% of the available memory, down from 78%.
 
 Modified:
   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h
 

Looking at the code in the arc_buf_l2_cdata_free() function, it seems that
this MFV contains elements of both the 5408 change and the later 5497 "lock
contention on arcs_mtx" change, because the latter is where
arc_buf_l2_cdata_free() was actually added in illumos.
It's hard to review such a composite change.
I think that you used a better approach when you merged 5497 where you
first reverted our local changes and then merged the upstream changes.
I think that it would make sense to use the same approach with this
change as well.
For example, ZoL did just that, and the result was much easier to
understand and review:
https://github.com/zfsonlinux/zfs/pull/3481/commits
Note the revert commits, one of which is "Revert fix l2arc compression
buffers leak".

P.S. The original illumos commit also made small changes to ztest.c that
do not seem to have been merged.

P.P.S. it sucks that svn is not git :-)
-- 
Andriy Gapon
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


Re: svn commit: r286570 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys

2015-08-11 Thread Dmitry Morozovsky
Alexander,

On Mon, 10 Aug 2015, Alexander Motin wrote:

 Author: mav
 Date: Mon Aug 10 10:34:23 2015
 New Revision: 286570
 URL: https://svnweb.freebsd.org/changeset/base/286570
 
 Log:
   MFV r277426: 5408 managing ZFS cache devices requires lots of RAM
   
   illumos/illumos-gate@89c86e32293a30cdd7af530c38b2073fee01411c

[snip]

   For an average blocksize of 8KB, this means that for the L2ARC, the ratio
   of metadata to data has gone down from about 2.92% to 1.56%.  For a
   'storage optimized' EC2 instance with 1600GB of SSD and 60GB of RAM, this
   means that we expect a completely full L2ARC to use (1600 GB * 0.0156) /
   60GB = 41% of the available memory, down from 78%.

Thanks!

Any MFC planned please?


-- 
Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]

*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***



Re: svn commit: r286570 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys

2015-08-11 Thread Alexander Motin
On 11.08.2015 12:38, Dmitry Morozovsky wrote:
 Alexander,
 
 On Mon, 10 Aug 2015, Alexander Motin wrote:
 
 Author: mav
 Date: Mon Aug 10 10:34:23 2015
 New Revision: 286570
 URL: https://svnweb.freebsd.org/changeset/base/286570

 Log:
   MFV r277426: 5408 managing ZFS cache devices requires lots of RAM
   
   illumos/illumos-gate@89c86e32293a30cdd7af530c38b2073fee01411c
 
 [snip]
 
   For an average blocksize of 8KB, this means that for the L2ARC, the ratio
   of metadata to data has gone down from about 2.92% to 1.56%.  For a
   'storage optimized' EC2 instance with 1600GB of SSD and 60GB of RAM, this
   means that we expect a completely full L2ARC to use (1600 GB * 0.0156) /
   60GB = 41% of the available memory, down from 78%.
 
 Thanks!
 
 Any MFC planned please?

Planned in principle, but no specific dates yet -- too many changes.

-- 
Alexander Motin


svn commit: r286570 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys

2015-08-10 Thread Alexander Motin
Author: mav
Date: Mon Aug 10 10:34:23 2015
New Revision: 286570
URL: https://svnweb.freebsd.org/changeset/base/286570

Log:
  MFV r277426: 5408 managing ZFS cache devices requires lots of RAM
  Reviewed by: Christopher Siden christopher.si...@delphix.com
  Reviewed by: George Wilson george.wil...@delphix.com
  Reviewed by: Matthew Ahrens mahr...@delphix.com
  Reviewed by: Don Brady dev.fs@gmail.com
  Reviewed by: Josef 'Jeff' Sipek josef.si...@nexenta.com
  Approved by: Garrett D'Amore garr...@damore.org
  Author: Chris Williamson chris.william...@delphix.com
  
  illumos/illumos-gate@89c86e32293a30cdd7af530c38b2073fee01411c
  
  Currently, every buffer cached in the L2ARC is accompanied by a 240-byte
  header in memory, leading to very high memory consumption when using very
  large cache devices. These changes significantly reduce this overhead.
  
  Currently:
  
  L1-only header = 176 bytes
  L1 + L2 or L2-only header = 176 bytes + 32 byte checksum + 32 byte l2hdr
  = 240 bytes
  
  Memory-optimized:
  
  L1-only header = 176 bytes
  L1 + L2 header = 176 bytes + 32 byte checksum = 208 bytes
  L2-only header = 96 bytes + 32 byte checksum = 128 bytes
  
  So overall:
  
             Trunk    Optimized
           +--------+-----------+
   L1-only | 176 B  | 176 B     | (same)
           +--------+-----------+
   L1 + L2 | 240 B  | 208 B     | (saved 32 bytes)
           +--------+-----------+
   L2-only | 240 B  | 128 B     | (saved 112 bytes)
           +--------+-----------+
  
  For an average blocksize of 8KB, this means that for the L2ARC, the ratio
  of metadata to data has gone down from about 2.92% to 1.56%.  For a
  'storage optimized' EC2 instance with 1600GB of SSD and 60GB of RAM, this
  means that we expect a completely full L2ARC to use (1600 GB * 0.0156) /
  60GB = 41% of the available memory, down from 78%.
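
The overhead figures in the log can be reproduced with a few lines of integer arithmetic. The sketch below is only an illustration: the function names are hypothetical and are not part of arc.c, and it assumes every cached buffer is exactly one 8KB block with an L2-only header.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative helpers only; these names do not exist in arc.c.
 * Header sizes (240 and 128 bytes) come from the commit log.
 */

/*
 * Header bytes per data byte, in hundredths of a percent,
 * rounded to the nearest unit.
 */
uint64_t
l2arc_overhead_bp(uint64_t hdr_bytes, uint64_t block_bytes)
{
	assert(block_bytes > 0);
	return ((hdr_bytes * 10000 + block_bytes / 2) / block_bytes);
}

/*
 * RAM (in MB) consumed by headers for a completely full cache
 * device of dev_mb megabytes, one header per block.
 */
uint64_t
l2arc_hdr_ram_mb(uint64_t dev_mb, uint64_t hdr_bytes, uint64_t block_bytes)
{
	assert(block_bytes > 0);
	return (dev_mb * hdr_bytes / block_bytes);
}

/* Integer percentage of whole taken up by part, truncated. */
uint64_t
pct_of(uint64_t part, uint64_t whole)
{
	assert(whole > 0);
	return (part * 100 / whole);
}
```

With a 1600GB device (1600 * 1024 MB) and 8192-byte blocks, l2arc_hdr_ram_mb() gives 48000 MB for 240-byte headers and 25600 MB for 128-byte headers; pct_of() against 60 * 1024 MB of RAM then yields 78 and 41, matching the log. The 240-byte case works out to 2.93% when rounded, which the log quotes truncated as "about 2.92%".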

Modified:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	Mon Aug 10 10:29:32 2015	(r286569)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	Mon Aug 10 10:34:23 2015	(r286570)
@@ -111,7 +111,7 @@
  * Note that the majority of the performance stats are manipulated
  * with atomic operations.
  *
- * The L2ARC uses the l2arc_buflist_mtx global mutex for the following:
+ * The L2ARC uses the l2ad_mtx on each vdev for the following:
  *
  * - L2ARC buflist creation
  * - L2ARC buflist eviction
@@ -399,6 +399,7 @@ typedef struct arc_stats {
 	kstat_named_t arcstat_l2_writes_hdr_miss;
 	kstat_named_t arcstat_l2_evict_lock_retry;
 	kstat_named_t arcstat_l2_evict_reading;
+	kstat_named_t arcstat_l2_evict_l1cached;
 	kstat_named_t arcstat_l2_free_on_write;
 	kstat_named_t arcstat_l2_cdata_free_on_write;
 	kstat_named_t arcstat_l2_abort_lowmem;
@@ -481,6 +482,7 @@ static arc_stats_t arc_stats = {
 	{ "l2_writes_hdr_miss",		KSTAT_DATA_UINT64 },
 	{ "l2_evict_lock_retry",	KSTAT_DATA_UINT64 },
 	{ "l2_evict_reading",		KSTAT_DATA_UINT64 },
+	{ "l2_evict_l1cached",		KSTAT_DATA_UINT64 },
 	{ "l2_free_on_write",		KSTAT_DATA_UINT64 },
 	{ "l2_cdata_free_on_write",	KSTAT_DATA_UINT64 },
 	{ "l2_abort_lowmem",		KSTAT_DATA_UINT64 },
@@ -585,8 +587,6 @@ static int		arc_no_grow;	/* Don't try to
 static uint64_t		arc_tempreserve;
 static uint64_t		arc_loaned_bytes;
 
-typedef struct l2arc_buf_hdr l2arc_buf_hdr_t;
-
 typedef struct arc_callback arc_callback_t;
 
 struct arc_callback {
@@ -607,29 +607,53 @@ struct arc_write_callback {
 	arc_buf_t	*awcb_buf;
 };
 
-struct arc_buf_hdr {
-	/* protected by hash lock */
-	dva_t			b_dva;
-	uint64_t		b_birth;
-	uint64_t		b_cksum0;
-
+/*
+ * ARC buffers are separated into multiple structs as a memory saving measure:
+ *   - Common fields struct, always defined, and embedded within it:
+ *   - L2-only fields, always allocated but undefined when not in L2ARC
+ *   - L1-only fields, only allocated when in L1ARC
+ *
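+ *              Buffer in L1                     Buffer only in L2
+ *    +------------------------+          +------------------------+
+ *    | arc_buf_hdr_t          |          | arc_buf_hdr_t          |
+ *    |                        |          |                        |
+ *    |                        |          |                        |
+ *    |                        |          |                        |
+ *    +------------------------+          +------------------------+
+ *    | l2arc_buf_hdr_t        |          | l2arc_buf_hdr_t        |
+ *    | (undefined if L1-only) |          |                        |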