Author: avg
Date: Tue Aug  8 11:24:13 2017
New Revision: 322242

  8373 TXG_WAIT in ZIL commit path
    The code that writes ZIL blocks uses dmu_tx_assign(TXG_WAIT) to assign a
    transaction to a transaction group.
    That seems to be logically incorrect as writing of the ZIL block does not
    introduce any new dirty data.
    Also, when there is a lot of dirty data, the call can introduce significant
    delays into the ZIL commit path, thus affecting all synchronous writes.
    Additionally, ARC throttling may affect the ZIL writing.
    We probably need a new mechanism similar to dmu_tx_create_assigned to assign
    ZIL transactions.
    (Ab)using TXG_WAITED does not seem to be sufficient.
  Reviewed by: Matthew Ahrens
  Reviewed by: Prakash Surya
  Approved by: Dan McDonald
  Author: Andriy Gapon
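
The reasoning in the log message can be illustrated with a toy model. This is
only a sketch under stated assumptions: every name in it (toy_pool,
toy_txg_how, toy_assign_delay) is hypothetical, and the delay formula is
invented for illustration. It is not how dmu_tx_assign() is implemented. It
shows only that an assignment subject to a dirty-data throttle gets slower as
dirty data grows, while one that claims to have already waited skips the
throttle.

```c
#include <assert.h>

/* Hypothetical stand-ins for the assignment modes; not the ZFS definitions. */
enum toy_txg_how { TOY_TXG_WAIT, TOY_TXG_WAITED };

/* Hypothetical pool state: just enough to model a dirty-data throttle. */
struct toy_pool {
	unsigned long dirty_bytes;	/* dirty data currently outstanding */
	unsigned long delay_threshold;	/* above this, writers are throttled */
};

/*
 * Return the number of delay ticks an assignment incurs in this toy model.
 * TOY_TXG_WAIT is subject to the dirty-data throttle; TOY_TXG_WAITED claims
 * that the caller has already waited and therefore skips it.
 */
static unsigned long
toy_assign_delay(const struct toy_pool *p, enum toy_txg_how how)
{
	assert(p != 0);

	if (how == TOY_TXG_WAITED)
		return (0);
	if (p->dirty_bytes <= p->delay_threshold)
		return (0);
	/* Invented formula: one tick per KiB of dirty data over the threshold. */
	return ((p->dirty_bytes - p->delay_threshold) / 1024);
}
```

In this model a ZIL writer that uses the TXG_WAIT-like mode pays a delay that
it did nothing to cause, since it adds no dirty data of its own.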


Modified: vendor-sys/illumos/dist/uts/common/fs/zfs/zil.c
--- vendor-sys/illumos/dist/uts/common/fs/zfs/zil.c     Tue Aug  8 11:21:58 2017        (r322241)
+++ vendor-sys/illumos/dist/uts/common/fs/zfs/zil.c     Tue Aug  8 11:24:13 2017        (r322242)
@@ -974,7 +974,24 @@ zil_lwb_write_start(zilog_t *zilog, lwb_t *lwb)
         * to clean up in the event of allocation failure or I/O failure.
         */
        tx = dmu_tx_create(zilog->zl_os);
-       VERIFY(dmu_tx_assign(tx, TXG_WAIT) == 0);
+       /*
+        * Since we are not going to create any new dirty data and we can even
+        * help with clearing the existing dirty data, we should not be subject
+        * to the dirty data based delays.
+        * We (ab)use TXG_WAITED to bypass the delay mechanism.
+        * One side effect from using TXG_WAITED is that dmu_tx_assign() can
+        * fail if the pool is suspended.  Those are dramatic circumstances,
+        * so we return NULL to signal that the normal ZIL processing is not
+        * possible and txg_wait_synced() should be used to ensure that the data
+        * is on disk.
+        */
+       error = dmu_tx_assign(tx, TXG_WAITED);
+       if (error != 0) {
+               ASSERT3S(error, ==, EIO);
+               dmu_tx_abort(tx);
+               return (NULL);
+       }
        dsl_dataset_dirty(dmu_objset_ds(zilog->zl_os), tx);
        txg = dmu_tx_get_txg(tx);
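
The new comment says that a NULL return signals that normal ZIL processing is
not possible and that txg_wait_synced() should be used instead. A minimal
sketch of that caller-side pattern follows. All names in it (toy_zilog,
toy_lwb, toy_lwb_write_start, toy_wait_synced, toy_commit) are hypothetical
stand-ins that only model the roles of the real functions; the actual callers
in zil.c are not part of this diff and may be structured differently.

```c
#include <stddef.h>

/* All names below are hypothetical stand-ins, not the real ZFS API. */
struct toy_lwb {
	int id;			/* sequence number of the log block */
};

struct toy_zilog {
	int pool_suspended;	/* nonzero models a suspended pool */
	int fallback_syncs;	/* how often we fell back to a full sync */
	struct toy_lwb next;	/* next log block to hand out */
};

/* Models the changed function: NULL means the log block cannot be written. */
static struct toy_lwb *
toy_lwb_write_start(struct toy_zilog *zl)
{
	if (zl->pool_suspended)
		return (NULL);
	zl->next.id++;
	return (&zl->next);
}

/* Models the role of txg_wait_synced(): get the data to disk the slow way. */
static void
toy_wait_synced(struct toy_zilog *zl)
{
	zl->fallback_syncs++;
}

/* Returns 1 if the log block path was used, 0 if the fallback was taken. */
static int
toy_commit(struct toy_zilog *zl)
{
	if (toy_lwb_write_start(zl) == NULL) {
		toy_wait_synced(zl);
		return (0);
	}
	return (1);
}
```

The point of the pattern is that the caller checks for NULL rather than
assuming that the assignment always succeeds, as the old VERIFY() did.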