[PATCH] btrfs-progs: Restrict e2fsprogs version for new convert

2016-04-13 Thread Qu Wenruo
The new btrfs-convert uses many new macros introduced in e2fsprogs 1.42.
Unfortunately, the new compatibility layer for older e2fsprogs is still
under development.

So restrict the e2fsprogs version for now to avoid compiler errors.

Reported-by: David Sterba 
Signed-off-by: Qu Wenruo 
---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index fc343ea..05fdc32 100644
--- a/configure.ac
+++ b/configure.ac
@@ -105,7 +105,7 @@ AS_IF([test "x$enable_convert" = xyes], [DISABLE_BTRFSCONVERT=0], [DISABLE_BTRFS
 AC_SUBST([DISABLE_BTRFSCONVERT])
 
 if test "x$enable_convert" = xyes; then
-   PKG_CHECK_MODULES(EXT2FS, [ext2fs])
+   PKG_CHECK_MODULES(EXT2FS, [ext2fs >= 1.42])
PKG_CHECK_MODULES(COM_ERR, [com_err])
 fi
 
-- 
2.8.0



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/42] v5: separate operations from flags in the bio/request structs

2016-04-13 Thread Hannes Reinecke
On 04/13/2016 09:35 PM, mchri...@redhat.com wrote:
> The following patches begin to cleanup the request->cmd_flags and
> bio->bi_rw mess. We currently use cmd_flags to specify the operation,
> attributes and state of the request. For bi_rw we use it for similar
> info and also the priority but then also have another bi_flags field
> for state. At some point, we abused them so much we just made cmd_flags
> 64 bits, so we could add more.
> 
> The following patches separate the operation (read, write, discard,
> flush, etc.) from cmd_flags/bi_rw.
> 
> This patchset was made against linux-next from today April 13
> (git tag next-20160413).
> 
> I put a git tree here:
> https://github.com/mikechristie/linux-kernel.git
> The patches are in the op branch.
> 
A round of applause for you.

For the entire series:

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


[PATCH v3] btrfs: qgroup: Fix qgroup accounting when creating snapshot

2016-04-13 Thread Qu Wenruo
The current btrfs qgroup design implies a requirement that a commit root
switch must follow every call to btrfs_qgroup_account_extents().

Normally this is OK, as btrfs_qgroup_account_extents() is only called
inside btrfs_commit_transaction(), just before commit_cowonly_roots().

However, there is an exception in create_pending_snapshot(), which
calls btrfs_qgroup_account_extents() without any commit root switch.

In case of creating a snapshot whose parent root is itself (create a
snapshot of fs tree), it will corrupt qgroup by the following trace:
(skipped unrelated data)
==
btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 1
qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 0, excl = 0
qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 16384, excl = 16384
btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 0
==

The problem is that in the first qgroup_account_extent() call, the
nr_new_roots of the extent is 1, which means its reference count
increased, so qgroup increased its rfer and excl.

By the second qgroup_account_extent() call, the reference has been
dropped again, but no commit root switch happened between the two calls.
nr_old_roots is therefore unchanged (still 0), so qgroup just ignores
the extent, which leaves it wrongly accounted.

Fix it by calling commit_cowonly_roots() after qgroup_account_extent() in
create_pending_snapshot(), with the needed preparation.
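
The accounting problem described above can be illustrated with a small userspace toy model. This is only a sketch under loud assumptions: the toy_* names are hypothetical, a single extent is referenced by at most one root, and "commit root" is reduced to a saved reference count. It is not the kernel implementation; it only mirrors the nr_old_roots/nr_new_roots values shown in the trace.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only (hypothetical names): one extent, one qgroup. */
struct toy_extent {
	uint64_t num_bytes;
	int refs_commit;   /* roots seen through the commit root (old_roots) */
	int refs_current;  /* roots seen in the running transaction (new_roots) */
};

struct toy_qgroup {
	uint64_t rfer;
	uint64_t excl;
};

/* Stand-in for the per-extent accounting step. */
static void toy_account_extent(struct toy_qgroup *qg, const struct toy_extent *e)
{
	int nr_old_roots = e->refs_commit;
	int nr_new_roots = e->refs_current;

	if (nr_old_roots == nr_new_roots)
		return; /* looks unchanged, so the extent is ignored */

	if (nr_old_roots == 0 && nr_new_roots == 1) {
		qg->rfer += e->num_bytes;
		qg->excl += e->num_bytes;
	} else if (nr_old_roots == 1 && nr_new_roots == 0) {
		qg->rfer -= e->num_bytes;
		qg->excl -= e->num_bytes;
	}
}

/* Stand-in for a commit root switch: the old view catches up. */
static void toy_switch_commit_root(struct toy_extent *e)
{
	e->refs_commit = e->refs_current;
}

/*
 * Add a reference, account, drop the reference, account again.
 * Returns the final rfer; the correct answer is 0 because the
 * extent ends up unreferenced.
 */
static uint64_t toy_run(int switch_between)
{
	struct toy_extent e = { .num_bytes = 16384, .refs_commit = 0, .refs_current = 0 };
	struct toy_qgroup qg = { 0, 0 };

	e.refs_current = 1;
	toy_account_extent(&qg, &e);      /* old = 0, new = 1 */
	if (switch_between)
		toy_switch_commit_root(&e);
	e.refs_current = 0;
	toy_account_extent(&qg, &e);      /* old = 0 or 1, new = 0 */
	return qg.rfer;
}
```

In this model, toy_run(0) leaves rfer at 16384 because the second call sees old = new = 0 and skips the extent, while toy_run(1) returns to 0 because the switch lets the second call see the dropped reference.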

Reported-by: Mark Fasheh 
Signed-off-by: Qu Wenruo 
---
v2:
  Fix a soft lockup caused by missing switch_commit_root() call.
  Fix a warning caused by dirty-but-not-committed root.
v3:
  Fix a behavior change where btrfs qgroup would start accounting
  dropped roots while creating snapshots, rather than always
  accounting them in the next transaction.
---
 fs/btrfs/transaction.c | 122 +++--
 1 file changed, 87 insertions(+), 35 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 43885e5..5ba0d9a 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -126,7 +126,8 @@ static void clear_btree_io_tree(struct extent_io_tree *tree)
 }
 
 static noinline void switch_commit_roots(struct btrfs_transaction *trans,
-struct btrfs_fs_info *fs_info)
+struct btrfs_fs_info *fs_info,
+int free_dropped_roots)
 {
struct btrfs_root *root, *tmp;
 
@@ -142,16 +143,18 @@ static noinline void switch_commit_roots(struct btrfs_transaction *trans,
}
 
/* We can free old roots now. */
-   spin_lock(&trans->dropped_roots_lock);
-   while (!list_empty(&trans->dropped_roots)) {
-   root = list_first_entry(&trans->dropped_roots,
-   struct btrfs_root, root_list);
-   list_del_init(&root->root_list);
-   spin_unlock(&trans->dropped_roots_lock);
-   btrfs_drop_and_free_fs_root(fs_info, root);
+   if (free_dropped_roots) {
spin_lock(&trans->dropped_roots_lock);
+   while (!list_empty(&trans->dropped_roots)) {
+   root = list_first_entry(&trans->dropped_roots,
+   struct btrfs_root, root_list);
+   list_del_init(&root->root_list);
+   spin_unlock(&trans->dropped_roots_lock);
+   btrfs_drop_and_free_fs_root(fs_info, root);
+   spin_lock(&trans->dropped_roots_lock);
+   }
+   spin_unlock(&trans->dropped_roots_lock);
}
-   spin_unlock(&trans->dropped_roots_lock);
up_write(&fs_info->commit_root_sem);
 }
 
@@ -311,12 +314,13 @@ loop:
  * when the transaction commits
  */
 static int record_root_in_trans(struct btrfs_trans_handle *trans,
-  struct btrfs_root *root)
+  struct btrfs_root *root,
+  int force)
 {
-   if (test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
-   root->last_trans < trans->transid) {
+   if ((test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
+   root->last_trans < trans->transid) || force) {
WARN_ON(root == root->fs_info->extent_root);
-   WARN_ON(root->commit_root != root->node);
+   WARN_ON(root->commit_root != root->node && !force);
 
/*
 * see below for IN_TRANS_SETUP usage rules
@@ -331,7 +335,7 @@ static int record_root_in_trans(struct btrfs_trans_handle *trans,
smp_wmb();
 
spin_lock(&root->fs_info->fs_roots_radix_lock);
-   if (root->last_trans == trans->transid) {
+   if (root->last_trans == trans->transid && !force) {

Re: [PATCH] Btrfs: fix loading of orphan roots leading to BUG_ON

2016-04-13 Thread Qu Wenruo

Ping?

Cc: Chris and David

It seems that this fix is missing in 4.6 merge window.
Or did I miss something?

Thanks,
Qu

Filipe Manana wrote on 2016/03/03 09:10 +:

On Thu, Mar 3, 2016 at 4:31 AM, Duncan <1i5t5.dun...@cox.net> wrote:

fdmanana posted on Wed, 02 Mar 2016 15:49:38 + as excerpted:


When looking for orphan roots during mount we can end up hitting a
BUG_ON() (at root-item.c:btrfs_find_orphan_roots()) if a log tree is
replayed and qgroups are enabled.


This should hit 4.6, right?  Will it hit 4.5 before release?


It's not the first time you've asked a similar question, and if it's
directed at me, all I can tell you is that I don't know. It's the
maintainers (Chris, Josef, David) who decide when to pick up patches and
for which releases.



Because I wasn't sure of current quota functionality status, but this bug
obviously resets the counter on my ongoing "two kernel cycles with no
known quota bugs before you try to use quotas" recommendation.


You shouldn't spread such claims with such a level of certainty
every time a user reports a problem.
There are many bugs affecting the last 2 to 3 releases, but there are
also many bugs present since btrfs was added to the Linux kernel tree,
and many others present for 2+ years, etc.



Meanwhile, what /is/ current quota feature status?  Other than this bug,
is it now considered known bug free, or is more quota reworking and/or
bug fixing known to be needed for 4.6 and beyond?

IOW, given that two release cycles no known bugs counter, are we
realistically looking at that being 4.8, or are we now looking at 4.9 or
beyond for reasonable quota stability?


I don't know. I generally don't actively look at qgroups, and I'm
not a user either.
You can only draw conclusions based on user bug reports. There are
probably no more bugs in qgroups than there are in send/receive or
even in non-btrfs-specific features, for example.



--
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: [PATCH] Btrfs: remove BUG_ON()'s in btrfs_map_block

2016-04-13 Thread Anand Jain



On 04/13/2016 12:54 AM, Josef Bacik wrote:

btrfs_map_block can go horribly wrong in the face of fs corruption, let's agree
to not be assholes and panic at any possible chance things are all fucked up.
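
The general pattern the patch applies (log the inconsistency and return an error instead of hitting BUG_ON()) can be sketched in userspace roughly as follows. All names here (toy_map, toy_map_block, toy_crit) are hypothetical and only illustrate the idea; the logging macro appends the newline itself, so callers pass format strings without a trailing "\n".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical logging helper: it adds the trailing newline on its own. */
#define toy_crit(fmt, ...) fprintf(stderr, "toy: " fmt "\n", __VA_ARGS__)

/* Hypothetical, heavily simplified chunk map; values may come from disk. */
struct toy_map {
	uint64_t stripe_len;
	unsigned int num_stripes;
};

/*
 * Validate the mapping inputs and report bad values with -EINVAL
 * instead of aborting. On success, fills in the stripe index and the
 * offset of the block inside its stripe.
 */
static int toy_map_block(const struct toy_map *map, uint64_t offset,
			 unsigned int mirror_num, unsigned int *stripe_index,
			 uint64_t *stripe_offset)
{
	unsigned int index;

	if (map->stripe_len == 0 || mirror_num == 0) {
		toy_crit("invalid input, stripe_len=%llu, mirror_num=%u",
			 (unsigned long long)map->stripe_len, mirror_num);
		return -EINVAL;
	}

	index = mirror_num - 1;
	if (index >= map->num_stripes) {
		/* A corrupted map ends up here; the caller can fail the I/O. */
		toy_crit("stripe index out of range, stripe_index=%u, num_stripes=%u",
			 index, map->num_stripes);
		return -EINVAL;
	}

	*stripe_index = index;
	*stripe_offset = offset % map->stripe_len;
	return 0;
}
```

The caller decides what to do with the error, which keeps a corrupted filesystem from taking the whole machine down.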

Signed-off-by: Josef Bacik 
---
  fs/btrfs/volumes.c | 22 --
  1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index e2b54d5..ba8216b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5278,7 +5278,18 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
stripe_nr = div64_u64(stripe_nr, stripe_len);

stripe_offset = stripe_nr * stripe_len;
-   BUG_ON(offset < stripe_offset);
+   if (offset < stripe_offset) {
+   btrfs_crit(fs_info, "stripe math has gone wrong, "
+  "stripe_offset=%llu, offset=%llu, start=%llu, "




+  "logical=%llu, stripe_len=%llu\n",


 btrfs_crit adds the \n suffix on its own.


+  (unsigned long long)stripe_offset,
+  (unsigned long long)offset,
+  (unsigned long long)em->start,
+  (unsigned long long)logical,
+  (unsigned long long)stripe_len);
+   free_extent_map(em);
+   return -EINVAL;
+   }

/* stripe_offset is the offset of this block in its stripe*/
stripe_offset = offset - stripe_offset;
@@ -5519,7 +5530,14 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
&stripe_index);
mirror_num = stripe_index + 1;
}
-   BUG_ON(stripe_index >= map->num_stripes);
+   if (stripe_index >= map->num_stripes) {
+   btrfs_crit(fs_info, "stripe index math went horribly wrong, "



+  "got stripe_index=%lu, num_stripes=%lu\n",


 -same-

Thanks, Anand


+  (unsigned long)stripe_index,
+  (unsigned long)map->num_stripes);
+   ret = -EINVAL;
+   goto out;
+   }

num_alloc_stripes = num_stripes;
if (dev_replace_is_ongoing) {




Re: [PATCH v2] btrfs: qgroup: Fix qgroup accounting when creating snapshot

2016-04-13 Thread Qu Wenruo



Filipe Manana wrote on 2016/04/13 17:23 +0100:

On Tue, Apr 12, 2016 at 8:35 AM, Qu Wenruo  wrote:

The current btrfs qgroup design implies a requirement that a commit root
switch must follow every call to btrfs_qgroup_account_extents().

Normally this is OK, as btrfs_qgroup_account_extents() is only called
inside btrfs_commit_transaction(), just before commit_cowonly_roots().

However, there is an exception in create_pending_snapshot(), which
calls btrfs_qgroup_account_extents() without any commit root switch.

In case of creating a snapshot whose parent root is itself (create a
snapshot of fs tree), it will corrupt qgroup by the following trace:
(skipped unrelated data)
==
btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 1
qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 0, excl = 0
qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 16384, excl = 16384
btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 0
==

The problem is that in the first qgroup_account_extent() call, the
nr_new_roots of the extent is 1, which means its reference count
increased, so qgroup increased its rfer and excl.

By the second qgroup_account_extent() call, the reference has been
dropped again, but no commit root switch happened between the two calls.
nr_old_roots is therefore unchanged (still 0), so qgroup just ignores
the extent, which leaves it wrongly accounted.

Fix it by calling commit_cowonly_roots() after qgroup_account_extent() in
create_pending_snapshot(), with the needed preparation.

Reported-by: Mark Fasheh 
Signed-off-by: Qu Wenruo 
---
changelog:
v2:
   Fix a soft lockup caused by missing switch_commit_root() call.
   Fix a warning caused by dirty-but-not-committed root.

Note:
   This may be the dirtiest hack I have ever done.


I don't like it either. But, more importantly, I don't think this is
correct. See below.


   There are already several different checks for deciding whether a fs
   root should be updated, from root->last_trans to root->commit_root ==
   root->node.

   With this patch, we must switch the roots of at least the related fs
   tree and the extent tree to allow qgroup to call
   btrfs_qgroup_account_extents().
   But this will break some transid checks, as transid is already
   updated to the current transid.
   (Maybe we need a special sub-transid for qgroup use only?)

   As long as the current qgroup code uses commit_root to determine
   old_roots, there is no better idea though.
---
  fs/btrfs/transaction.c | 96 +-
  1 file changed, 71 insertions(+), 25 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 43885e5..0f299a56 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -311,12 +311,13 @@ loop:
   * when the transaction commits
   */
  static int record_root_in_trans(struct btrfs_trans_handle *trans,
-  struct btrfs_root *root)
+  struct btrfs_root *root,
+  int force)
  {
-   if (test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
-   root->last_trans < trans->transid) {
+   if ((test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
+   root->last_trans < trans->transid) || force) {
 WARN_ON(root == root->fs_info->extent_root);
-   WARN_ON(root->commit_root != root->node);
+   WARN_ON(root->commit_root != root->node && !force);

 /*
  * see below for IN_TRANS_SETUP usage rules
@@ -331,7 +332,7 @@ static int record_root_in_trans(struct btrfs_trans_handle *trans,
 smp_wmb();

 spin_lock(&root->fs_info->fs_roots_radix_lock);
-   if (root->last_trans == trans->transid) {
+   if (root->last_trans == trans->transid && !force) {
 spin_unlock(&root->fs_info->fs_roots_radix_lock);
 return 0;
 }
@@ -402,7 +403,7 @@ int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
 return 0;

 mutex_lock(&root->fs_info->reloc_mutex);
-   record_root_in_trans(trans, root);
+   record_root_in_trans(trans, root, 0);
 mutex_unlock(&root->fs_info->reloc_mutex);

 return 0;
@@ -1383,7 +1384,7 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 dentry = pending->dentry;
 parent_inode = pending->dir;
 parent_root = BTRFS_I(parent_inode)->root;
-   record_root_in_trans(trans, parent_root);
+   record_root_in_trans(trans, parent_root, 0);

 cur_time = current_fs_time(parent_inode->i_sb);

@@ -1420,7 +1421,7 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 goto fail;
 }

-   record_root_in_trans(trans, root);
+   

[PATCH] Btrfs: Set superblock s_bdev field properly at device closing

2016-04-13 Thread Yauhen Kharuzhy
The fs_info->sb->s_bdev field isn't set to any value at mount time, but it
is set after a device replace or at device closing. The existing code in
device_force_close() checks whether the current s_bdev equals the bdev
being closed and, if so, replaces it with the bdev field of the first
btrfs_device in the device list. This device may be the same one that is
being closed, in which case the s_bdev field will be invalid.

If s_bdev is not NULL but references a freed block device, the kernel
oopses at filesystem sync time on unmount.

For a multi-device FS, setting this field may be meaningless, but its use
should be consistent across all of the btrfs code. So set it at mount
time and select a valid device at device closing time.

An alternative solution may be to not set s_bdev at all.
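
The device selection described above can be sketched in plain C as follows. This is a minimal illustration with hypothetical toy_* types and an array instead of the kernel's linked list; it only shows the rule "pick the first open device that is not the one being closed, or NULL if there is none".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a block device and a btrfs device entry. */
struct toy_bdev {
	int id;
};

struct toy_device {
	struct toy_bdev *bdev; /* NULL when the device is not open */
};

/*
 * Return the bdev of the first device that is open and is not the one
 * being closed. Returns NULL when no such device exists, so the caller
 * never keeps a pointer to the device that is going away.
 */
static struct toy_bdev *toy_pick_replacement(struct toy_device *devs, size_t n,
					     const struct toy_bdev *closing)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].bdev && devs[i].bdev != closing)
			return devs[i].bdev;
	}
	return NULL;
}
```

Taking the first list entry unconditionally, as the old code did, is the case this loop avoids: if the first entry is the device being closed, the result would dangle.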

Signed-off-by: Yauhen Kharuzhy 
---
 fs/btrfs/super.c   |  1 +
 fs/btrfs/volumes.c | 16 
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 3dd154e..1a2c58f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1522,6 +1522,7 @@ static struct dentry *btrfs_mount(struct file_system_type *fs_type, int flags,
char b[BDEVNAME_SIZE];
 
strlcpy(s->s_id, bdevname(bdev, b), sizeof(s->s_id));
+   s->s_bdev = bdev;
btrfs_sb(s)->bdev_holder = fs_type;
error = btrfs_fill_super(s, fs_devices, data,
 flags & MS_SILENT ? 1 : 0);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 08ab116..f14f3f2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7132,6 +7132,7 @@ void device_force_close(struct btrfs_device *device)
 {
struct btrfs_device *next_device;
struct btrfs_fs_devices *fs_devices;
+   int found = 0;
 
fs_devices = device->fs_devices;
 
@@ -7139,13 +7140,20 @@ void device_force_close(struct btrfs_device *device)
mutex_lock(&fs_devices->fs_info->chunk_mutex);
spin_lock(&fs_devices->fs_info->free_chunk_lock);
 
-   next_device = list_entry(fs_devices->devices.next,
-   struct btrfs_device, dev_list);
+   list_for_each_entry(next_device, &fs_devices->devices, dev_list) {
+   if (next_device->bdev && next_device->bdev != device->bdev) {
+   found = 1;
+   break;
+   }
+   }
+
if (device->bdev == fs_devices->fs_info->sb->s_bdev)
-   fs_devices->fs_info->sb->s_bdev = next_device->bdev;
+   fs_devices->fs_info->sb->s_bdev =
+   found ? next_device->bdev : NULL;
 
if (device->bdev == fs_devices->latest_bdev)
-   fs_devices->latest_bdev = next_device->bdev;
+   fs_devices->latest_bdev =
+   found ? next_device->bdev : NULL;
 
if (device->bdev)
fs_devices->open_devices--;
-- 
2.5.0



Re: [PATCH v4 00/13] Introduce device state 'failed', spare device and auto replace

2016-04-13 Thread Anand Jain



On 04/13/2016 04:02 AM, Yauhen Kharuzhy wrote:

On Tue, Apr 12, 2016 at 10:15:50PM +0800, Anand Jain wrote:

Thanks for various comments, tests and feedback.


Seems to be working for me. I triggered the OOM killer while testing this in
VirtualBox, but




I don't think that it is related to autoreplace,


Yep, looks like it. I suggest reporting those bugs separately, not as a
review/testing reply to the patch.

Thanks, Anand


> it seems to be scrub implementation issue:


[  449.615157] CPU: 0 PID: 1771 Comm: btrfs-health Not tainted 4.4.5-scst31x+ #25
[  449.621763] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  449.647614]   8800601c7660 813529e3 
8800601c7858
[  449.659766]  88005ba66140 8800601c76d0 8121b41e 
8800601c7680
[  449.683167]  810d7ccd 8800601c76a0 0206 
81c6d0e0
[  449.700746] Call Trace:
[  449.705078]  [] dump_stack+0x85/0xc2
[  449.715238]  [] dump_header+0x5a/0x21d
[  449.725400]  [] ? trace_hardirqs_on+0xd/0x10
[  449.741261]  [] oom_kill_process+0x200/0x3d0
[  449.753042]  [] out_of_memory+0x562/0x580
[  449.765923]  [] ? out_of_memory+0x2d3/0x580
[  449.768455]  [] __alloc_pages_nodemask+0xafc/0xc80
[  449.770281]  [] alloc_pages_current+0x9b/0x1c0
[  449.783371]  [] scrub_pages+0xb5/0x400 [btrfs]
[  449.804598]  [] ? scrub_find_csum+0xd5/0x110 [btrfs]
[  449.819145]  [] scrub_stripe+0x82e/0x1180 [btrfs]
[  449.829299]  [] scrub_chunk+0x110/0x160 [btrfs]
[  449.835859]  [] scrub_enumerate_chunks+0x27c/0x560 [btrfs]
[  449.852805]  [] ? wake_atomic_t_function+0x30/0x70
[  449.867081]  [] btrfs_scrub_dev+0x1cd/0x680 [btrfs]
[  449.876784]  [] btrfs_dev_replace_start+0x334/0x540 [btrfs]
[  449.891503]  [] btrfs_auto_replace_start+0xf8/0x140 [btrfs]
[  449.911958]  [] health_kthread+0x246/0x490 [btrfs]
[  449.922132]  [] ? health_kthread+0x138/0x490 [btrfs]
[  449.946273]  [] ? btrfs_congested_fn+0x180/0x180 [btrfs]
[  449.975742]  [] kthread+0xef/0x110
[  449.994914]  [] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[  450.022306]  [] ? kthread_create_on_node+0x200/0x200
[  450.036069]  [] ret_from_fork+0x3f/0x70
[  450.045622]  [] ? kthread_create_on_node+0x200/0x200
[  450.047625] Mem-Info:
[  450.055195] active_anon:30 inactive_anon:71 isolated_anon:0
[  450.055195]  active_file:220 inactive_file:980 isolated_file:0
[  450.055195]  unevictable:527 dirty:41 writeback:59 unstable:0
[  450.055195]  slab_reclaimable:18226 slab_unreclaimable:283931
[  450.055195]  mapped:612 shmem:10 pagetables:1209 bounce:0
[  450.055195]  free:3310 free_pcp:153 free_cma:0
[  450.069070] Node 0 DMA free:6232kB min:48kB low:60kB high:72kB active_anon:0kB inactive_anon:0kB active_file:8kB inactive_file:16kB unevictable:28kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:28kB dirty:4kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:788kB slab_unreclaimable:6236kB kernel_stack:96kB pagetables:48kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:220 all_unreclaimable? yes
[  450.161023] lowmem_reserve[]: 0 1546 1546 1546
[  450.181786] Node 0 DMA32 free:10620kB min:4896kB low:6120kB high:7344kB active_anon:120kB inactive_anon:176kB active_file:964kB inactive_file:1132kB unevictable:2080kB isolated(anon):0kB isolated(file):0kB present:1668032kB managed:1583780kB mlocked:2080kB dirty:160kB writeback:112kB mapped:2568kB shmem:40kB slab_reclaimable:72116kB slab_unreclaimable:1129488kB kernel_stack:4192kB pagetables:4788kB unstable:0kB bounce:0kB free_pcp:740kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  450.267804] lowmem_reserve[]: 0 0 0 0
[  450.272899] Node 0 DMA: 45*4kB (UME) 31*8kB (UME) 19*16kB (ME) 10*32kB (ME) 7*64kB (ME) 7*128kB (UME) 3*256kB (UME) 2*512kB (UM) 2*1024kB (M) 0*2048kB 0*4096kB = 6236kB
[  450.286381] Node 0 DMA32: 2006*4kB (UME) 453*8kB (UME) 68*16kB (UME) 15*32kB (UM) 2*64kB (UM) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13472kB
[  450.299928] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  450.304622] 985 total pagecache pages
[  450.306857] 111 pages in swap cache
[  450.308870] Swap cache stats: add 9380, delete 9269, find 113/183
[  450.312090] Free swap  = 381628kB
[  450.314188] Total swap = 418492kB
[  450.317644] 421006 pages RAM
[  450.319573] 0 pages HighMem/MovableOnly
[  450.322100] 21084 pages reserved
[  450.323853] 0 pages hwpoisoned
...




Re: [PATCH v4 00/13] Introduce device state 'failed', spare device and auto replace

2016-04-13 Thread Yauhen Kharuzhy
On Tue, Apr 12, 2016 at 10:15:50PM +0800, Anand Jain wrote:
> Thanks for various comments, tests and feedback.

Hmm... I broke it :)

I get a kernel oops after a few cycles of drive removal, insertion, and
replacement.

My steps to reproduce:
1) create RAID (I used RAID6)
2) remove a drive (I tested both the /sys interface and VBox storage
management for this, and it reproduced with both methods). Write & sync
the fs to detect the failure.
3) insert the drive again
4) wipe it
5) replace the missing device (reproduced with user-initiated replace and
autoreplace)
6) repeat steps 2-3

At reboot, the kernel oopses (see below). Sometimes more than one repeat of
steps 2-5 is needed (I am still working to narrow this down).

Commands from my last session:

root@grack12:~# btrfs fi show
Label: 'test'  uuid: 833fef31-5536-411c-8f58-53b527569fa5
        Total devices 4 FS bytes used 768.00KiB
        devid    1 size 8.00GiB used 1.41GiB path /dev/sdc
        devid    2 size 8.00GiB used 1.41GiB path /dev/sdd
        devid    3 size 8.00GiB used 1.41GiB path /dev/sde
        devid    5 size 8.00GiB used 1.12GiB path /dev/sdg

Global spare

root@test:~# ls -l /sys/block/sdg
lrwxrwxrwx 1 root root 0 Apr  8 20:03 /sys/block/sdg -> ../devices/pci:00/:00:1f.2/ata7/host6/target6:0:0/6:0:0:0/block/sdg
root@test:~# echo 1 > /sys/class/scsi_device/6\:0\:0\:0/device/delete 
root@test:~# touch /media/833fef31-5536-411c-8f58-53b527569fa5/ && btrfs fi sync /media/833fef31-5536-411c-8f58-53b527569fa5/
FSSync '/media/833fef31-5536-411c-8f58-53b527569fa5/'
root@test:~# echo 0 0 0 > /sys/class/scsi_host/host6/scan 
root@test:~# wipefs -a /dev/sdg
8 bytes were erased at offset 0x10040 (btrfs)
they were: 5f 42 48 52 66 53 5f 4d
root@test:~# btrfs replace start 5 /dev/sdg /media/833fef31-5536-411c-8f58-53b527569fa5/
root@test:~# echo 1 > /sys/class/scsi_device/6\:0\:0\:0/device/delete 
root@test:~# echo 0 0 0 > /sys/class/scsi_host/host6/scan 
root@test:~# echo 1 > /sys/class/scsi_device/6\:0\:0\:0/device/delete 
root@test:~# touch /media/833fef31-5536-411c-8f58-53b527569fa5/ && btrfs fi sync /media/833fef31-5536-411c-8f58-53b527569fa5/
FSSync '/media/833fef31-5536-411c-8f58-53b527569fa5/'
root@test:~# echo 0 0 0 > /sys/class/scsi_host/host6/scan 
root@test:~# wipefs -a /dev/sdg
8 bytes were erased at offset 0x10040 (btrfs)
they were: 5f 42 48 52 66 53 5f 4d
root@test:~# btrfs replace start 5 /dev/sdg /media/833fef31-5536-411c-8f58-53b527569fa5/
root@test:~# echo 1 > /sys/class/scsi_device/6\:0\:0\:0/device/delete 
root@test:~# touch /media/833fef31-5536-411c-8f58-53b527569fa5/ && btrfs fi sync /media/833fef31-5536-411c-8f58-53b527569fa5/
FSSync '/media/833fef31-5536-411c-8f58-53b527569fa5/'
root@test:~# echo 0 0 0 > /sys/class/scsi_host/host6/scan 
root@test:~# wipefs -a /dev/sdg
8 bytes were erased at offset 0x10040 (btrfs)
they were: 5f 42 48 52 66 53 5f 4d
root@test:~# btrfs replace start 5 /dev/sdg /media/833fef31-5536-411c-8f58-53b527569fa5/
root@test:~# reboot

Oops itself:

[  349.559019] BTRFS info (device sdd): dev_replace from  (devid 5) to /dev/sdg started
[  349.647966] BTRFS info (device sdd): dev_replace from  (devid 5) to /dev/sdg finished
[  373.701691] general protection fault:  [#1] SMP DEBUG_PAGEALLOC 
[  373.731698] Modules linked in: cpufreq_powersave cpufreq_stats cpufreq_userspace cpufreq_conservative softdog nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc ipmi_devintf ipmi_msghandler iosf_mbi crct10dif_pclmul crc32_pclmul sha256_ssse3 sha256_generic snd_pcm snd_timer iTCO_wdt hmac drbg iTCO_vendor_support ansi_cprng snd soundcore aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse evdev serio_raw pcspkr acpi_cpufreq 8250_fintek lpc_ich video ac battery parport_pc tpm_tis tpm mfd_core parport button processor rng_core i2c_piix4 btrfs xor raid6_pq dm_mod raid1 md_mod sg sd_mod sr_mod cdrom ata_generic ahci libahci ata_piix libata crc32c_intel scsi_mod pcnet32 mii
[  373.933548] CPU: 0 PID: 3955 Comm: umount Not tainted 4.4.5-scst31x-debug+ #33
[  373.941730] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  373.945337] task: 88005b2fe080 ti: 880056cbc000 task.ti: 880056cbc000
[  373.951991] RIP: 0010:[]  [] __filemap_fdatawrite_range+0x29/0xf0
[  373.954135] RSP: 0018:880056cbfd50  EFLAGS: 00010286
[  373.972201] RAX:  RBX: 880056cbfd50 RCX: 
[  374.003989] RDX: 7fff RSI:  RDI: 880056cbfdb0
[  374.044001] RBP: 880056cbfdc8 R08:  R09: 0002
[  374.099584] R10: 81d1b880 R11: 81d1b840 R12: 00441f0f441f
[  374.113566] R13: 88005b2fe080 R14:  R15: 88005b2fe080
[  374.157600] FS:  7f9281eea7e0() GS:88006660() knlGS:
[  374.164870] CS:  0010 DS:  ES:  CR0: 8005003b
[  374.184379] CR2: 01277048 CR3: 60324000 CR4: 000406f0
[  374.1

[PATCH 04/42] fs: have submit_bh users pass in op and flags separately

2016-04-13 Thread mchristi
From: Mike Christie 

This has submit_bh users pass in the operation and flags separately,
so submit_bh_wbc can set up bio->bi_op and bio->bi_rw on the bio that
is submitted.
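
As a rough illustration of the calling-convention change, here is a
user-space sketch, not kernel code. The names mock_bio, mock_submit_bh and
the MOCK_* constants (and their values) are made up for this example; only
the idea of passing the operation and the flags as two separate arguments
comes from the patch.

```c
#include <assert.h>

/* Hypothetical stand-ins; the values are illustrative, not the kernel's. */
enum mock_op { MOCK_OP_READ = 0, MOCK_OP_WRITE = 1 };
#define MOCK_FLAG_SYNC (1u << 4)
#define MOCK_FLAG_META (1u << 5)

/* Minimal stand-in for a bio: operation and flags live in separate fields. */
struct mock_bio {
    int bi_op;          /* what to do (read, write, ...) */
    unsigned int bi_rw; /* how to do it (sync, meta, ...) */
};

/*
 * New-style helper: the caller passes the operation and the flags
 * separately, and the helper stores each in its own field.
 */
void mock_submit_bh(int op, unsigned int op_flags, struct mock_bio *bio)
{
    assert(op == MOCK_OP_READ || op == MOCK_OP_WRITE);
    bio->bi_op = op;
    bio->bi_rw = op_flags;
}
```

With this shape, a call that used to look like submit_bh(WRITE | REQ_SYNC, bh)
corresponds to mock_submit_bh(MOCK_OP_WRITE, MOCK_FLAG_SYNC, &bio) in the
sketch, and a plain read passes 0 for the flags.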

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 drivers/md/bitmap.c |  4 ++--
 fs/btrfs/check-integrity.c  | 24 ++--
 fs/btrfs/check-integrity.h  |  2 +-
 fs/btrfs/disk-io.c  |  4 ++--
 fs/buffer.c | 54 +++--
 fs/ext4/balloc.c|  2 +-
 fs/ext4/ialloc.c|  2 +-
 fs/ext4/inode.c |  2 +-
 fs/ext4/mmp.c   |  4 ++--
 fs/fat/misc.c   |  2 +-
 fs/gfs2/bmap.c  |  2 +-
 fs/gfs2/dir.c   |  2 +-
 fs/gfs2/meta_io.c   |  6 ++---
 fs/jbd2/commit.c|  6 ++---
 fs/jbd2/journal.c   |  8 +++
 fs/nilfs2/btnode.c  |  6 ++---
 fs/nilfs2/btnode.h  |  2 +-
 fs/nilfs2/btree.c   |  6 +++--
 fs/nilfs2/gcinode.c |  5 +++--
 fs/nilfs2/mdt.c | 11 -
 fs/ntfs/aops.c  |  6 ++---
 fs/ntfs/compress.c  |  2 +-
 fs/ntfs/file.c  |  2 +-
 fs/ntfs/logfile.c   |  2 +-
 fs/ntfs/mft.c   |  4 ++--
 fs/ocfs2/buffer_head_io.c   |  8 +++
 fs/reiserfs/inode.c |  4 ++--
 fs/reiserfs/journal.c   |  6 ++---
 fs/ufs/util.c   |  2 +-
 include/linux/buffer_head.h |  9 
 30 files changed, 103 insertions(+), 96 deletions(-)

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 3fe86b5..8b2e16f 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -294,7 +294,7 @@ static void write_page(struct bitmap *bitmap, struct page 
*page, int wait)
atomic_inc(&bitmap->pending_writes);
set_buffer_locked(bh);
set_buffer_mapped(bh);
-   submit_bh(WRITE | REQ_SYNC, bh);
+   submit_bh(REQ_OP_WRITE, REQ_SYNC, bh);
bh = bh->b_this_page;
}
 
@@ -389,7 +389,7 @@ static int read_page(struct file *file, unsigned long index,
atomic_inc(&bitmap->pending_writes);
set_buffer_locked(bh);
set_buffer_mapped(bh);
-   submit_bh(READ, bh);
+   submit_bh(REQ_OP_READ, 0, bh);
}
block++;
bh = bh->b_this_page;
diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
index 9400acd..f82190f 100644
--- a/fs/btrfs/check-integrity.c
+++ b/fs/btrfs/check-integrity.c
@@ -2856,12 +2856,12 @@ static struct btrfsic_dev_state 
*btrfsic_dev_state_lookup(
return ds;
 }
 
-int btrfsic_submit_bh(int rw, struct buffer_head *bh)
+int btrfsic_submit_bh(int op, int op_flags, struct buffer_head *bh)
 {
struct btrfsic_dev_state *dev_state;
 
if (!btrfsic_is_initialized)
-   return submit_bh(rw, bh);
+   return submit_bh(op, op_flags, bh);
 
mutex_lock(&btrfsic_mutex);
/* since btrfsic_submit_bh() might also be called before
@@ -2870,26 +2870,26 @@ int btrfsic_submit_bh(int rw, struct buffer_head *bh)
 
/* Only called to write the superblock (incl. FLUSH/FUA) */
if (NULL != dev_state &&
-   (rw & WRITE) && bh->b_size > 0) {
+   (op == REQ_OP_WRITE) && bh->b_size > 0) {
u64 dev_bytenr;
 
dev_bytenr = 4096 * bh->b_blocknr;
if (dev_state->state->print_mask &
BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH)
printk(KERN_INFO
-  "submit_bh(rw=0x%x, blocknr=%llu (bytenr %llu),"
-  " size=%zu, data=%p, bdev=%p)\n",
-  rw, (unsigned long long)bh->b_blocknr,
+  "submit_bh(op=0x%x,0x%x, blocknr=%llu "
+  "(bytenr %llu), size=%zu, data=%p, bdev=%p)\n",
+  op, op_flags, (unsigned long long)bh->b_blocknr,
   dev_bytenr, bh->b_size, bh->b_data, bh->b_bdev);
btrfsic_process_written_block(dev_state, dev_bytenr,
  &bh->b_data, 1, NULL,
- NULL, bh, rw);
-   } else if (NULL != dev_state && (rw & REQ_FLUSH)) {
+ NULL, bh, op_flags);
+   } else if (NULL != dev_state && (op_flags & REQ_FLUSH)) {
if (dev_state->state->print_mask &
BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH)
printk(KERN_INFO
-  "submit_bh(rw=0x%x FLUSH, bdev=%p)\n",
-  rw, bh->b_bdev);
+  "submit_bh(op=0x%x,0x%x FLUSH, bdev=%p)\n",
+  op, op_fl

[PATCH 05/42] fs: have ll_rw_block users pass in op and flags separately

2016-04-13 Thread mchristi
From: Mike Christie 

This has ll_rw_block users pass in the operation and flags separately,
so ll_rw_block can set up bio->bi_op and bio->bi_rw on the bio that
is submitted.

v2:

1. Fix for kbuild error in ll_rw_block comments.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 fs/buffer.c | 19 ++-
 fs/ext4/inode.c |  6 +++---
 fs/ext4/namei.c |  3 ++-
 fs/ext4/super.c |  2 +-
 fs/gfs2/bmap.c  |  2 +-
 fs/gfs2/meta_io.c   |  4 ++--
 fs/gfs2/quota.c |  2 +-
 fs/isofs/compress.c |  2 +-
 fs/jbd2/journal.c   |  2 +-
 fs/jbd2/recovery.c  |  4 ++--
 fs/ocfs2/aops.c |  2 +-
 fs/ocfs2/super.c|  2 +-
 fs/reiserfs/journal.c   |  8 
 fs/reiserfs/stree.c |  4 ++--
 fs/reiserfs/super.c |  2 +-
 fs/squashfs/block.c |  4 ++--
 fs/udf/dir.c|  2 +-
 fs/udf/directory.c  |  2 +-
 fs/udf/inode.c  |  2 +-
 fs/ufs/balloc.c |  2 +-
 include/linux/buffer_head.h |  2 +-
 21 files changed, 40 insertions(+), 38 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 1e1a474..68c8f27 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -588,7 +588,7 @@ void write_boundary_block(struct block_device *bdev,
struct buffer_head *bh = __find_get_block(bdev, bblock + 1, blocksize);
if (bh) {
if (buffer_dirty(bh))
-   ll_rw_block(WRITE, 1, &bh);
+   ll_rw_block(REQ_OP_WRITE, 0, 1, &bh);
put_bh(bh);
}
 }
@@ -1395,7 +1395,7 @@ void __breadahead(struct block_device *bdev, sector_t 
block, unsigned size)
 {
struct buffer_head *bh = __getblk(bdev, block, size);
if (likely(bh)) {
-   ll_rw_block(READA, 1, &bh);
+   ll_rw_block(REQ_OP_READ, READA, 1, &bh);
brelse(bh);
}
 }
@@ -1955,7 +1955,7 @@ int __block_write_begin(struct page *page, loff_t pos, 
unsigned len,
if (!buffer_uptodate(bh) && !buffer_delay(bh) &&
!buffer_unwritten(bh) &&
 (block_start < from || block_end > to)) {
-   ll_rw_block(READ, 1, &bh);
+   ll_rw_block(REQ_OP_READ, 0, 1, &bh);
*wait_bh++=bh;
}
}
@@ -2852,7 +2852,7 @@ int block_truncate_page(struct address_space *mapping,
 
if (!buffer_uptodate(bh) && !buffer_delay(bh) && !buffer_unwritten(bh)) 
{
err = -EIO;
-   ll_rw_block(READ, 1, &bh);
+   ll_rw_block(REQ_OP_READ, 0, 1, &bh);
wait_on_buffer(bh);
/* Uhhuh. Read error. Complain and punt. */
if (!buffer_uptodate(bh))
@@ -3052,7 +3052,8 @@ EXPORT_SYMBOL(submit_bh);
 
 /**
  * ll_rw_block: low-level access to block devices (DEPRECATED)
- * @rw: whether to %READ or %WRITE or maybe %READA (readahead)
+ * @op: whether to %READ or %WRITE
+ * @op_flags: rq_flag_bits or %READA (readahead)
  * @nr: number of &struct buffer_heads in the array
  * @bhs: array of pointers to &struct buffer_head
  *
@@ -3075,7 +3076,7 @@ EXPORT_SYMBOL(submit_bh);
  * All of the buffers must be for the same device, and must also be a
  * multiple of the current approved size for the device.
  */
-void ll_rw_block(int rw, int nr, struct buffer_head *bhs[])
+void ll_rw_block(int op, int op_flags,  int nr, struct buffer_head *bhs[])
 {
int i;
 
@@ -3084,18 +3085,18 @@ void ll_rw_block(int rw, int nr, struct buffer_head 
*bhs[])
 
if (!trylock_buffer(bh))
continue;
-   if (rw == WRITE) {
+   if (op == WRITE) {
if (test_clear_buffer_dirty(bh)) {
bh->b_end_io = end_buffer_write_sync;
get_bh(bh);
-   submit_bh(rw, 0, bh);
+   submit_bh(op, op_flags, bh);
continue;
}
} else {
if (!buffer_uptodate(bh)) {
bh->b_end_io = end_buffer_read_sync;
get_bh(bh);
-   submit_bh(rw, 0, bh);
+   submit_bh(op, op_flags, bh);
continue;
}
}
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d15d92e..fe96d2e 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -963,7 +963,7 @@ struct buffer_head *ext4_bread(handle_t *handle, struct 
inode *inode,
return bh;
if (!bh || buffer_uptodate(bh))
return bh;
-   ll_rw_block(READ | REQ_META | REQ_PRIO, 1, &bh);
+   ll_rw_block(REQ_OP_READ, REQ_META | REQ_PRIO, 1, &bh);
wait_on_buffer(bh);
if (buffer_

[PATCH 01/42] block/fs/drivers: remove rw argument from submit_bio

2016-04-13 Thread mchristi
From: Mike Christie 

This has callers of submit_bio/submit_bio_wait set bio->bi_rw themselves
instead of passing it in. This makes their use consistent with
generic_make_request and with how we set the other bio fields.
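
A minimal user-space sketch of the new convention, assuming nothing about
the real block layer: mock_bio2, mock_submit_bio, mock_submit_bio_wait and
the MOCK_REQ_* values are hypothetical names invented for illustration. The
caller fills in bi_rw before submitting, and the waiting variant ORs in the
sync flag on the bio itself, as the bio.c hunk in this patch does.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values only; not the kernel's definitions. */
#define MOCK_REQ_WRITE (1u << 0)
#define MOCK_REQ_SYNC  (1u << 4)

/* Tiny stand-in for struct bio. */
struct mock_bio2 {
    unsigned int bi_rw; /* set by the caller before submission */
    int submitted;      /* counts submissions, for demonstration */
};

/* No rw argument any more: everything is read from the bio. */
void mock_submit_bio(struct mock_bio2 *bio)
{
    assert(bio != NULL);
    bio->submitted++;
}

/* The waiting variant marks the bio as synchronous, then submits it. */
int mock_submit_bio_wait(struct mock_bio2 *bio)
{
    bio->bi_rw |= MOCK_REQ_SYNC;
    mock_submit_bio(bio);
    return 0; /* the real function returns the completion status */
}
```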

v5:
1. Missed crypto fs submit_bio_wait call.

v2:

1. Set bi_rw instead of ORing it. For cloned bios, I still OR it
to keep the old behavior in case there are bits we wanted to keep.

Signed-off-by: Mike Christie 
Reviewed-by: Bart Van Assche 
Reviewed-by: Christoph Hellwig 
---
 block/bio.c |  7 +++
 block/blk-core.c| 11 ---
 block/blk-flush.c   |  3 ++-
 block/blk-lib.c |  9 ++---
 drivers/block/drbd/drbd_actlog.c|  2 +-
 drivers/block/drbd/drbd_bitmap.c|  4 ++--
 drivers/block/floppy.c  |  3 ++-
 drivers/block/xen-blkback/blkback.c |  4 +++-
 drivers/block/xen-blkfront.c|  4 ++--
 drivers/md/bcache/debug.c   |  6 --
 drivers/md/bcache/journal.c |  2 +-
 drivers/md/bcache/super.c   |  4 ++--
 drivers/md/dm-bufio.c   |  3 ++-
 drivers/md/dm-io.c  |  3 ++-
 drivers/md/dm-log-writes.c  |  9 ++---
 drivers/md/dm-thin.c|  3 ++-
 drivers/md/md.c | 10 +++---
 drivers/md/raid1.c  |  3 ++-
 drivers/md/raid10.c |  4 +++-
 drivers/md/raid5-cache.c|  7 ---
 drivers/target/target_core_iblock.c | 24 +---
 fs/btrfs/check-integrity.c  | 18 ++
 fs/btrfs/check-integrity.h  |  4 ++--
 fs/btrfs/disk-io.c  |  3 ++-
 fs/btrfs/extent_io.c|  7 ---
 fs/btrfs/raid56.c   | 17 -
 fs/btrfs/scrub.c| 15 ++-
 fs/btrfs/volumes.c  | 14 +++---
 fs/buffer.c |  3 ++-
 fs/crypto/crypto.c  |  3 ++-
 fs/direct-io.c  |  3 ++-
 fs/ext4/crypto.c|  3 ++-
 fs/ext4/page-io.c   |  3 ++-
 fs/ext4/readpage.c  |  9 +
 fs/f2fs/data.c  | 13 -
 fs/f2fs/segment.c   |  6 --
 fs/gfs2/lops.c  |  3 ++-
 fs/gfs2/meta_io.c   |  3 ++-
 fs/gfs2/ops_fstype.c|  3 ++-
 fs/hfsplus/wrapper.c|  3 ++-
 fs/jfs/jfs_logmgr.c |  6 --
 fs/jfs/jfs_metapage.c   | 10 ++
 fs/logfs/dev_bdev.c | 15 ++-
 fs/mpage.c  |  3 ++-
 fs/nfs/blocklayout/blocklayout.c| 22 --
 fs/nilfs2/segbuf.c  |  3 ++-
 fs/ocfs2/cluster/heartbeat.c| 12 +++-
 fs/xfs/xfs_aops.c   | 15 ++-
 fs/xfs/xfs_buf.c|  4 ++--
 include/linux/bio.h |  2 +-
 include/linux/fs.h  |  2 +-
 kernel/power/swap.c |  5 +++--
 mm/page_io.c| 10 ++
 53 files changed, 221 insertions(+), 146 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 807d25e..f319b78 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -865,21 +865,20 @@ static void submit_bio_wait_endio(struct bio *bio)
 
 /**
  * submit_bio_wait - submit a bio, and wait until it completes
- * @rw: whether to %READ or %WRITE, or maybe to %READA (read ahead)
  * @bio: The &struct bio which describes the I/O
  *
  * Simple wrapper around submit_bio(). Returns 0 on success, or the error from
  * bio_endio() on failure.
  */
-int submit_bio_wait(int rw, struct bio *bio)
+int submit_bio_wait(struct bio *bio)
 {
struct submit_bio_ret ret;
 
-   rw |= REQ_SYNC;
init_completion(&ret.event);
bio->bi_private = &ret;
bio->bi_end_io = submit_bio_wait_endio;
-   submit_bio(rw, bio);
+   bio->bi_rw |= REQ_SYNC;
+   submit_bio(bio);
wait_for_completion_io(&ret.event);
 
return ret.error;
diff --git a/block/blk-core.c b/block/blk-core.c
index c502277..f3895c1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2093,7 +2093,6 @@ EXPORT_SYMBOL(generic_make_request);
 
 /**
  * submit_bio - submit a bio to the block device layer for I/O
- * @rw: whether to %READ or %WRITE, or maybe to %READA (read ahead)
  * @bio: The &struct bio which describes the I/O
  *
  * submit_bio() is very similar in purpose to generic_make_request(), and
@@ -2101,10 +2100,8 @@ EXPORT_SYMBOL(generic_make_request);
  * interfaces; @bio must be presetup and ready for I/O.
  *
  */
-blk_qc_t submit_bio(int rw, struct bio *bio)
+blk_qc_t submit_bio(struct bio *bio)
 {
-   bio->bi_rw |= rw;
-
/*
 * If it's a regular read/write or a barrier with data attached,
 * go through the normal accounting stuff before submission.
@@ -2112,12 +2109,12 @@ blk_qc_t submit_bio(int rw, struct bi

[PATCH 08/42] btrfs: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has btrfs use the bio bi_op for REQ_OP and bi_rw for rq_flag_bits.

v5:
- Fix finish_parity_scrub, where bi_rw was mistakenly set to REQ_OP_WRITE

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 fs/btrfs/check-integrity.c | 19 +--
 fs/btrfs/compression.c |  4 
 fs/btrfs/disk-io.c |  7 ---
 fs/btrfs/inode.c   | 20 +---
 fs/btrfs/raid56.c  | 10 +-
 fs/btrfs/scrub.c   |  9 +
 fs/btrfs/volumes.c | 20 ++--
 7 files changed, 50 insertions(+), 39 deletions(-)

diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
index f82190f..c4a48e8 100644
--- a/fs/btrfs/check-integrity.c
+++ b/fs/btrfs/check-integrity.c
@@ -1673,7 +1673,7 @@ static int btrfsic_read_block(struct btrfsic_state *state,
}
bio->bi_bdev = block_ctx->dev->bdev;
bio->bi_iter.bi_sector = dev_bytenr >> 9;
-   bio->bi_rw = READ;
+   bio->bi_op = REQ_OP_READ;
 
for (j = i; j < num_pages; j++) {
ret = bio_add_page(bio, block_ctx->pagev[j],
@@ -2922,7 +2922,6 @@ int btrfsic_submit_bh(int op, int op_flags, struct 
buffer_head *bh)
 static void __btrfsic_submit_bio(struct bio *bio)
 {
struct btrfsic_dev_state *dev_state;
-   int rw = bio->bi_rw;
 
if (!btrfsic_is_initialized)
return;
@@ -2932,7 +2931,7 @@ static void __btrfsic_submit_bio(struct bio *bio)
 * btrfsic_mount(), this might return NULL */
dev_state = btrfsic_dev_state_lookup(bio->bi_bdev);
if (NULL != dev_state &&
-   (rw & WRITE) && NULL != bio->bi_io_vec) {
+   (bio->bi_op == REQ_OP_WRITE) && NULL != bio->bi_io_vec) {
unsigned int i;
u64 dev_bytenr;
u64 cur_bytenr;
@@ -2944,9 +2943,9 @@ static void __btrfsic_submit_bio(struct bio *bio)
if (dev_state->state->print_mask &
BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH)
printk(KERN_INFO
-  "submit_bio(rw=0x%x, bi_vcnt=%u,"
+  "submit_bio(rw=%d,0x%lx, bi_vcnt=%u,"
   " bi_sector=%llu (bytenr %llu), bi_bdev=%p)\n",
-  rw, bio->bi_vcnt,
+  bio->bi_op, bio->bi_rw, bio->bi_vcnt,
   (unsigned long long)bio->bi_iter.bi_sector,
   dev_bytenr, bio->bi_bdev);
 
@@ -2977,18 +2976,18 @@ static void __btrfsic_submit_bio(struct bio *bio)
btrfsic_process_written_block(dev_state, dev_bytenr,
  mapped_datav, bio->bi_vcnt,
  bio, &bio_is_patched,
- NULL, rw);
+ NULL, bio->bi_rw);
while (i > 0) {
i--;
kunmap(bio->bi_io_vec[i].bv_page);
}
kfree(mapped_datav);
-   } else if (NULL != dev_state && (rw & REQ_FLUSH)) {
+   } else if (NULL != dev_state && (bio->bi_rw & REQ_FLUSH)) {
if (dev_state->state->print_mask &
BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH)
printk(KERN_INFO
-  "submit_bio(rw=0x%x FLUSH, bdev=%p)\n",
-  rw, bio->bi_bdev);
+  "submit_bio(rw=%d,0x%lx FLUSH, bdev=%p)\n",
+  bio->bi_op, bio->bi_rw, bio->bi_bdev);
if (!dev_state->dummy_block_for_bio_bh_flush.is_iodone) {
if ((dev_state->state->print_mask &
 (BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH |
@@ -3006,7 +3005,7 @@ static void __btrfsic_submit_bio(struct bio *bio)
block->never_written = 0;
block->iodone_w_error = 0;
block->flush_gen = dev_state->last_flush_gen + 1;
-   block->submit_bio_bh_rw = rw;
+   block->submit_bio_bh_rw = bio->bi_rw;
block->orig_bio_bh_private = bio->bi_private;
block->orig_bio_bh_end_io.bio = bio->bi_end_io;
block->next_in_same_bio = NULL;
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index ff61a41..334a00c 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -363,6 +363,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 
start,
kfree(cb);
return -ENOMEM;
}
+   bio->bi_op = REQ_OP_WRITE;
bio->bi_private = cb;
bio->bi_end_io = end_compressed_bio_write;
atomic_inc(&cb->pending_bios);
@@ -408,6 +409,7 @@ int btrfs_submit_compressed_write(struct inode *inode

[PATCH 03/42] block, fs, mm, drivers: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch converts the simple bi_rw use cases in the block,
drivers, mm and fs code to set the bio->bi_op to a REQ_OP.

These should be simple one liner cases, so I just did them
in one patch. The next patches handle the more complicated
cases in a module per patch.
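
To make the one-liner conversions concrete, here is a hedged user-space
sketch of how an old combined value maps onto the two new fields. The helper
split_old_rw, the struct split_bio and all OLD_* and SPLIT_OP_* constants
(including their values) are hypothetical and do not exist in this series;
the patch converts each call site by hand rather than through a helper.

```c
#include <assert.h>
#include <stddef.h>

/* Old-style combined bits (illustrative values only). */
#define OLD_WRITE      (1u << 0)
#define OLD_SYNC       (1u << 4)
#define OLD_DISCARD    (1u << 7)
#define OLD_WRITE_SAME (1u << 9)

/* New-style operations, kept apart from the flags. */
enum split_op {
    SPLIT_OP_READ,
    SPLIT_OP_WRITE,
    SPLIT_OP_DISCARD,
    SPLIT_OP_WRITE_SAME
};

struct split_bio {
    int bi_op;          /* exactly one operation */
    unsigned int bi_rw; /* remaining attribute flags */
};

/* Pick the operation out of the old bitmap and leave only flags behind. */
void split_old_rw(unsigned int old_rw, struct split_bio *bio)
{
    assert(bio != NULL);
    if (old_rw & OLD_DISCARD)
        bio->bi_op = SPLIT_OP_DISCARD;
    else if (old_rw & OLD_WRITE_SAME)
        bio->bi_op = SPLIT_OP_WRITE_SAME;
    else if (old_rw & OLD_WRITE)
        bio->bi_op = SPLIT_OP_WRITE;
    else
        bio->bi_op = SPLIT_OP_READ;
    bio->bi_rw = old_rw & ~(OLD_WRITE | OLD_DISCARD | OLD_WRITE_SAME);
}
```

This mirrors, for example, the blk-lib.c hunk below, where the discard path
stops carrying REQ_WRITE | REQ_DISCARD in the flags and sets bi_op to
REQ_OP_DISCARD instead.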

v5:
1. Add missed crypto call.
2. Change nfs bi_rw check to bi_op.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 block/bio.c  |  8 +---
 block/blk-flush.c|  1 +
 block/blk-lib.c  |  7 ---
 block/blk-map.c  |  2 +-
 drivers/block/floppy.c   |  2 +-
 drivers/block/pktcdvd.c  |  4 ++--
 drivers/lightnvm/rrpc.c  |  4 ++--
 drivers/scsi/osd/osd_initiator.c |  8 
 fs/crypto/crypto.c   |  2 +-
 fs/exofs/ore.c   |  2 +-
 fs/ext4/crypto.c |  2 +-
 fs/ext4/page-io.c|  8 +---
 fs/ext4/readpage.c   |  2 +-
 fs/jfs/jfs_logmgr.c  |  2 ++
 fs/jfs/jfs_metapage.c|  4 ++--
 fs/logfs/dev_bdev.c  | 12 ++--
 fs/nfs/blocklayout/blocklayout.c |  4 ++--
 mm/page_io.c |  4 ++--
 18 files changed, 43 insertions(+), 35 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index f319b78..921de2e 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -587,6 +587,7 @@ void __bio_clone_fast(struct bio *bio, struct bio *bio_src)
 */
bio->bi_bdev = bio_src->bi_bdev;
bio_set_flag(bio, BIO_CLONED);
+   bio->bi_op = bio_src->bi_op;
bio->bi_rw = bio_src->bi_rw;
bio->bi_iter = bio_src->bi_iter;
bio->bi_io_vec = bio_src->bi_io_vec;
@@ -669,6 +670,7 @@ struct bio *bio_clone_bioset(struct bio *bio_src, gfp_t 
gfp_mask,
return NULL;
 
bio->bi_bdev= bio_src->bi_bdev;
+   bio->bi_op  = bio_src->bi_op;
bio->bi_rw  = bio_src->bi_rw;
bio->bi_iter.bi_sector  = bio_src->bi_iter.bi_sector;
bio->bi_iter.bi_size= bio_src->bi_iter.bi_size;
@@ -1177,7 +1179,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
goto out_bmd;
 
if (iter->type & WRITE)
-   bio->bi_rw |= REQ_WRITE;
+   bio->bi_op = REQ_OP_WRITE;
 
ret = 0;
 
@@ -1347,7 +1349,7 @@ struct bio *bio_map_user_iov(struct request_queue *q,
 * set data direction, and check if mapped pages need bouncing
 */
if (iter->type & WRITE)
-   bio->bi_rw |= REQ_WRITE;
+   bio->bi_op = REQ_OP_WRITE;
 
bio_set_flag(bio, BIO_USER_MAPPED);
 
@@ -1540,7 +1542,7 @@ struct bio *bio_copy_kern(struct request_queue *q, void 
*data, unsigned int len,
bio->bi_private = data;
} else {
bio->bi_end_io = bio_copy_kern_endio;
-   bio->bi_rw |= REQ_WRITE;
+   bio->bi_op = REQ_OP_WRITE;
}
 
return bio;
diff --git a/block/blk-flush.c b/block/blk-flush.c
index f2fbf9a..b05acca 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -484,6 +484,7 @@ int blkdev_issue_flush(struct block_device *bdev, gfp_t 
gfp_mask,
 
bio = bio_alloc(gfp_mask, 0);
bio->bi_bdev = bdev;
+   bio->bi_op = REQ_OP_WRITE;
bio->bi_rw = WRITE_FLUSH;
 
ret = submit_bio_wait(bio);
diff --git a/block/blk-lib.c b/block/blk-lib.c
index 87e3de4..d01b5f2 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -42,7 +42,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t 
sector,
 {
DECLARE_COMPLETION_ONSTACK(wait);
struct request_queue *q = bdev_get_queue(bdev);
-   int type = REQ_WRITE | REQ_DISCARD;
+   int type = 0;
unsigned int granularity;
int alignment;
struct bio_batch bb;
@@ -102,6 +102,7 @@ int blkdev_issue_discard(struct block_device *bdev, 
sector_t sector,
bio->bi_end_io = bio_batch_end_io;
bio->bi_bdev = bdev;
bio->bi_private = &bb;
+   bio->bi_op = REQ_OP_DISCARD;
bio->bi_rw = type;
 
bio->bi_iter.bi_size = req_sects << 9;
@@ -178,7 +179,7 @@ int blkdev_issue_write_same(struct block_device *bdev, 
sector_t sector,
bio->bi_io_vec->bv_page = page;
bio->bi_io_vec->bv_offset = 0;
bio->bi_io_vec->bv_len = bdev_logical_block_size(bdev);
-   bio->bi_rw = REQ_WRITE | REQ_WRITE_SAME;
+   bio->bi_op = REQ_OP_WRITE_SAME;
 
if (nr_sects > max_write_same_sectors) {
bio->bi_iter.bi_size = max_write_same_sectors << 9;
@@ -240,7 +241,7 @@ static int __blkdev_issue_zeroout(struct block_device 
*bdev, sector_t sector,
bio->bi_bdev   = bdev;
bio->bi_end_io = bio_batch_end_io;
bio->bi_private = &bb;
-   bio->bi_rw = WRITE;
+   bio->

[PATCH 09/42] btrfs: update __btrfs_map_block for bi_op transition

2016-04-13 Thread mchristi
From: Mike Christie 

We no longer pass in a bitmap of rq_flag_bits bits to __btrfs_map_block.
It will always be a REQ_OP, or the btrfs specific REQ_GET_READ_MIRRORS,
so this drops the bit tests.
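
The sketch below illustrates why equality tests are the safer form once the
argument is a single operation rather than a bitmap. It is a user-space
example with hypothetical names (demo_op and the demo_* functions), and the
consecutive numbering of the operations is an assumption made for the
example, not something this patch states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enumerated operations: consecutive values, not bit flags. */
enum demo_op {
    DEMO_OP_READ = 0,
    DEMO_OP_WRITE = 1,
    DEMO_OP_DISCARD = 2,
    DEMO_OP_WRITE_SAME = 3
};

/* Old habit: treat the value as a bitmap. Misfires on enumerated values. */
bool demo_is_write_bit_test(int op)
{
    return (op & DEMO_OP_WRITE) != 0;
}

/* Converted form: compare against exactly one operation. */
bool demo_is_write(int op)
{
    assert(op >= DEMO_OP_READ && op <= DEMO_OP_WRITE_SAME);
    return op == DEMO_OP_WRITE;
}

/* A mask test over several bits becomes a chain of comparisons. */
bool demo_is_write_or_discard(int op)
{
    return op == DEMO_OP_WRITE || op == DEMO_OP_DISCARD;
}
```

With these assumed values, the bit test would wrongly report
DEMO_OP_WRITE_SAME (3) as a write because its low bit is set, while the
equality test does not.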

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 fs/btrfs/extent-tree.c |  2 +-
 fs/btrfs/inode.c   |  2 +-
 fs/btrfs/volumes.c | 55 +++---
 fs/btrfs/volumes.h |  4 ++--
 4 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 99decfb..f4bc8c1 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2053,7 +2053,7 @@ int btrfs_discard_extent(struct btrfs_root *root, u64 
bytenr,
 
 
/* Tell the block device(s) that the sectors can be discarded */
-   ret = btrfs_map_block(root->fs_info, REQ_DISCARD,
+   ret = btrfs_map_block(root->fs_info, REQ_OP_DISCARD,
  bytenr, &num_bytes, &bbio, 0);
/* Error condition is -ENOMEM */
if (!ret) {
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index f693490..c2dc75b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8300,7 +8300,7 @@ static int btrfs_submit_direct_hook(int rw, struct 
btrfs_dio_private *dip,
int i;
 
map_length = orig_bio->bi_iter.bi_size;
-   ret = btrfs_map_block(root->fs_info, rw, start_sector << 9,
+   ret = btrfs_map_block(root->fs_info, orig_bio->bi_op, start_sector << 9,
  &map_length, NULL, 0);
if (ret)
return -EIO;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 96fdf4b..dc56558 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5212,7 +5212,7 @@ void btrfs_put_bbio(struct btrfs_bio *bbio)
kfree(bbio);
 }
 
-static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
+static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int op,
 u64 logical, u64 *length,
 struct btrfs_bio **bbio_ret,
 int mirror_num, int need_raid_map)
@@ -5290,7 +5290,7 @@ static int __btrfs_map_block(struct btrfs_fs_info 
*fs_info, int rw,
raid56_full_stripe_start *= full_stripe_len;
}
 
-   if (rw & REQ_DISCARD) {
+   if (op == REQ_OP_DISCARD) {
/* we don't discard raid56 yet */
if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
ret = -EOPNOTSUPP;
@@ -5303,7 +5303,7 @@ static int __btrfs_map_block(struct btrfs_fs_info 
*fs_info, int rw,
   For other RAID types and for RAID[56] reads, just allow a 
single
   stripe (on a single disk). */
if ((map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) &&
-   (rw & REQ_WRITE)) {
+   (op == REQ_OP_WRITE)) {
max_len = stripe_len * nr_data_stripes(map) -
(offset - raid56_full_stripe_start);
} else {
@@ -5328,8 +5328,8 @@ static int __btrfs_map_block(struct btrfs_fs_info 
*fs_info, int rw,
btrfs_dev_replace_set_lock_blocking(dev_replace);
 
if (dev_replace_is_ongoing && mirror_num == map->num_stripes + 1 &&
-   !(rw & (REQ_WRITE | REQ_DISCARD | REQ_GET_READ_MIRRORS)) &&
-   dev_replace->tgtdev != NULL) {
+   op != REQ_OP_WRITE && op != REQ_OP_DISCARD &&
+   op != REQ_GET_READ_MIRRORS && dev_replace->tgtdev != NULL) {
/*
 * in dev-replace case, for repair case (that's the only
 * case where the mirror is selected explicitly when
@@ -5416,15 +5416,17 @@ static int __btrfs_map_block(struct btrfs_fs_info 
*fs_info, int rw,
(offset + *length);
 
if (map->type & BTRFS_BLOCK_GROUP_RAID0) {
-   if (rw & REQ_DISCARD)
+   if (op == REQ_OP_DISCARD)
num_stripes = min_t(u64, map->num_stripes,
stripe_nr_end - stripe_nr_orig);
stripe_nr = div_u64_rem(stripe_nr, map->num_stripes,
&stripe_index);
-   if (!(rw & (REQ_WRITE | REQ_DISCARD | REQ_GET_READ_MIRRORS)))
+   if (op != REQ_OP_WRITE && op != REQ_OP_DISCARD &&
+   op != REQ_GET_READ_MIRRORS)
mirror_num = 1;
} else if (map->type & BTRFS_BLOCK_GROUP_RAID1) {
-   if (rw & (REQ_WRITE | REQ_DISCARD | REQ_GET_READ_MIRRORS))
+   if (op == REQ_OP_WRITE || op == REQ_OP_DISCARD ||
+   op == REQ_GET_READ_MIRRORS)
num_stripes = map->num_stripes;
else if (mirror_num)
stripe_index = mirror_num - 1;
@@ -5437,7 +5439,8 @@ static int __btrfs_map_block(struct btrfs_fs_info 
*fs_info, int rw,
}
 
} else if (map->

[PATCH 15/42] mpage: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has mpage.c use bio->bi_op for REQ_OPs and bio->bi_rw for
rq_flag_bits.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 fs/mpage.c | 41 +
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/fs/mpage.c b/fs/mpage.c
index 2c251ec..89f58f1 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -56,11 +56,12 @@ static void mpage_end_io(struct bio *bio)
bio_put(bio);
 }
 
-static struct bio *mpage_bio_submit(int rw, struct bio *bio)
+static struct bio *mpage_bio_submit(int op, int op_flags, struct bio *bio)
 {
bio->bi_end_io = mpage_end_io;
-   bio->bi_rw = rw;
-   guard_bio_eod(rw, bio);
+   bio->bi_op = op;
+   bio->bi_rw = op_flags;
+   guard_bio_eod(op, bio);
submit_bio(bio);
return NULL;
 }
@@ -270,7 +271,7 @@ do_mpage_readpage(struct bio *bio, struct page *page, 
unsigned nr_pages,
 * This page will go to BIO.  Do we need to send this BIO off first?
 */
if (bio && (*last_block_in_bio != blocks[0] - 1))
-   bio = mpage_bio_submit(READ, bio);
+   bio = mpage_bio_submit(REQ_OP_READ, 0, bio);
 
 alloc_new:
if (bio == NULL) {
@@ -287,7 +288,7 @@ alloc_new:
 
length = first_hole << blkbits;
if (bio_add_page(bio, page, length, 0) < length) {
-   bio = mpage_bio_submit(READ, bio);
+   bio = mpage_bio_submit(REQ_OP_READ, 0, bio);
goto alloc_new;
}
 
@@ -295,7 +296,7 @@ alloc_new:
nblocks = map_bh->b_size >> blkbits;
if ((buffer_boundary(map_bh) && relative_block == nblocks) ||
(first_hole != blocks_per_page))
-   bio = mpage_bio_submit(READ, bio);
+   bio = mpage_bio_submit(REQ_OP_READ, 0, bio);
else
*last_block_in_bio = blocks[blocks_per_page - 1];
 out:
@@ -303,7 +304,7 @@ out:
 
 confused:
if (bio)
-   bio = mpage_bio_submit(READ, bio);
+   bio = mpage_bio_submit(REQ_OP_READ, 0, bio);
if (!PageUptodate(page))
block_read_full_page(page, get_block);
else
@@ -385,7 +386,7 @@ mpage_readpages(struct address_space *mapping, struct 
list_head *pages,
}
BUG_ON(!list_empty(pages));
if (bio)
-   mpage_bio_submit(READ, bio);
+   mpage_bio_submit(REQ_OP_READ, 0, bio);
return 0;
 }
 EXPORT_SYMBOL(mpage_readpages);
@@ -406,7 +407,7 @@ int mpage_readpage(struct page *page, get_block_t get_block)
bio = do_mpage_readpage(bio, page, 1, &last_block_in_bio,
&map_bh, &first_logical_block, get_block, gfp);
if (bio)
-   mpage_bio_submit(READ, bio);
+   mpage_bio_submit(REQ_OP_READ, 0, bio);
return 0;
 }
 EXPORT_SYMBOL(mpage_readpage);
@@ -487,7 +488,7 @@ static int __mpage_writepage(struct page *page, struct 
writeback_control *wbc,
struct buffer_head map_bh;
loff_t i_size = i_size_read(inode);
int ret = 0;
-   int wr = (wbc->sync_mode == WB_SYNC_ALL ?  WRITE_SYNC : WRITE);
+   int op_flags = (wbc->sync_mode == WB_SYNC_ALL ?  WRITE_SYNC : 0);
 
if (page_has_buffers(page)) {
struct buffer_head *head = page_buffers(page);
@@ -596,7 +597,7 @@ page_is_mapped:
 * This page will go to BIO.  Do we need to send this BIO off first?
 */
if (bio && mpd->last_block_in_bio != blocks[0] - 1)
-   bio = mpage_bio_submit(wr, bio);
+   bio = mpage_bio_submit(REQ_OP_WRITE, op_flags, bio);
 
 alloc_new:
if (bio == NULL) {
@@ -623,7 +624,7 @@ alloc_new:
wbc_account_io(wbc, page, PAGE_SIZE);
length = first_unmapped << blkbits;
if (bio_add_page(bio, page, length, 0) < length) {
-   bio = mpage_bio_submit(wr, bio);
+   bio = mpage_bio_submit(REQ_OP_WRITE, op_flags, bio);
goto alloc_new;
}
 
@@ -633,7 +634,7 @@ alloc_new:
set_page_writeback(page);
unlock_page(page);
if (boundary || (first_unmapped != blocks_per_page)) {
-   bio = mpage_bio_submit(wr, bio);
+   bio = mpage_bio_submit(REQ_OP_WRITE, op_flags, bio);
if (boundary_block) {
write_boundary_block(boundary_bdev,
boundary_block, 1 << blkbits);
@@ -645,7 +646,7 @@ alloc_new:
 
 confused:
if (bio)
-   bio = mpage_bio_submit(wr, bio);
+   bio = mpage_bio_submit(REQ_OP_WRITE, op_flags, bio);
 
if (mpd->use_writepage) {
ret = mapping->a_ops->writepage(page, wbc);
@@ -702,9 +703,9 @@ mpage_writepages(struct address_space *mapping,
 
ret = write_cache_pages(mapping, wbc, __mpage_writepage, &mpd);
if (mpd.bio) {
-   int wr = (wbc->sync_mode == WB_SYNC_ALL ?
-

[PATCH 12/42] gfs2: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has gfs2 use bio->bi_op for REQ_OPs and bio->bi_rw for rq_flag_bits.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 fs/gfs2/log.c|  8 
 fs/gfs2/lops.c   | 12 +++-
 fs/gfs2/lops.h   |  2 +-
 fs/gfs2/meta_io.c|  8 +---
 fs/gfs2/ops_fstype.c |  1 +
 5 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 0ff028c..e58ccef0 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -657,7 +657,7 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 
flags)
struct gfs2_log_header *lh;
unsigned int tail;
u32 hash;
-   int rw = WRITE_FLUSH_FUA | REQ_META;
+   int op_flags = WRITE_FLUSH_FUA | REQ_META;
struct page *page = mempool_alloc(gfs2_page_pool, GFP_NOIO);
enum gfs2_freeze_state state = atomic_read(&sdp->sd_freeze_state);
lh = page_address(page);
@@ -682,12 +682,12 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 
flags)
if (test_bit(SDF_NOBARRIERS, &sdp->sd_flags)) {
gfs2_ordered_wait(sdp);
log_flush_wait(sdp);
-   rw = WRITE_SYNC | REQ_META | REQ_PRIO;
+   op_flags = WRITE_SYNC | REQ_META | REQ_PRIO;
}
 
sdp->sd_log_idle = (tail == sdp->sd_log_flush_head);
gfs2_log_write_page(sdp, page);
-   gfs2_log_flush_bio(sdp, rw);
+   gfs2_log_flush_bio(sdp, REQ_OP_WRITE, op_flags);
log_flush_wait(sdp);
 
if (sdp->sd_log_tail != tail)
@@ -738,7 +738,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock 
*gl,
 
gfs2_ordered_write(sdp);
lops_before_commit(sdp, tr);
-   gfs2_log_flush_bio(sdp, WRITE);
+   gfs2_log_flush_bio(sdp, REQ_OP_WRITE, 0);
 
if (sdp->sd_log_head != sdp->sd_log_flush_head) {
log_flush_wait(sdp);
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index ce28242..c1099b4 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -230,17 +230,19 @@ static void gfs2_end_log_write(struct bio *bio)
 /**
  * gfs2_log_flush_bio - Submit any pending log bio
  * @sdp: The superblock
- * @rw: The rw flags
+ * @op: REQ_OP
+ * @op_flags: rq_flag_bits
  *
  * Submit any pending part-built or full bio to the block device. If
  * there is no pending bio, then this is a no-op.
  */
 
-void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int rw)
+void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int op, int op_flags)
 {
if (sdp->sd_log_bio) {
atomic_inc(&sdp->sd_log_in_flight);
-   sdp->sd_log_bio->bi_rw = rw;
+   sdp->sd_log_bio->bi_op = op;
+   sdp->sd_log_bio->bi_rw = op_flags;
submit_bio(sdp->sd_log_bio);
sdp->sd_log_bio = NULL;
}
@@ -300,7 +302,7 @@ static struct bio *gfs2_log_get_bio(struct gfs2_sbd *sdp, 
u64 blkno)
nblk >>= sdp->sd_fsb2bb_shift;
if (blkno == nblk)
return bio;
-   gfs2_log_flush_bio(sdp, WRITE);
+   gfs2_log_flush_bio(sdp, REQ_OP_WRITE, 0);
}
 
return gfs2_log_alloc_bio(sdp, blkno);
@@ -329,7 +331,7 @@ static void gfs2_log_write(struct gfs2_sbd *sdp, struct 
page *page,
bio = gfs2_log_get_bio(sdp, blkno);
ret = bio_add_page(bio, page, size, offset);
if (ret == 0) {
-   gfs2_log_flush_bio(sdp, WRITE);
+   gfs2_log_flush_bio(sdp, REQ_OP_WRITE, 0);
bio = gfs2_log_alloc_bio(sdp, blkno);
ret = bio_add_page(bio, page, size, offset);
WARN_ON(ret == 0);
diff --git a/fs/gfs2/lops.h b/fs/gfs2/lops.h
index a65a7ba..e529f53 100644
--- a/fs/gfs2/lops.h
+++ b/fs/gfs2/lops.h
@@ -27,7 +27,7 @@ extern const struct gfs2_log_operations gfs2_databuf_lops;
 
 extern const struct gfs2_log_operations *gfs2_log_ops[];
 extern void gfs2_log_write_page(struct gfs2_sbd *sdp, struct page *page);
-extern void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int rw);
+extern void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int op, int op_flags);
 extern void gfs2_pin(struct gfs2_sbd *sdp, struct buffer_head *bh);
 
 static inline unsigned int buf_limit(struct gfs2_sbd *sdp)
diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index f56f3ca..55ae188 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -213,7 +213,8 @@ static void gfs2_meta_read_endio(struct bio *bio)
  * Submit several consecutive buffer head I/O requests as a single bio I/O
  * request.  (See submit_bh_wbc.)
  */
-static void gfs2_submit_bhs(int rw, struct buffer_head *bhs[], int num)
+static void gfs2_submit_bhs(int op, int op_flags, struct buffer_head *bhs[],
+   int num)
 {
struct buffer_head *bh = bhs[0];
struct bio *bio;
@@ -230,7 +231,8 @@ static void gfs2_submit_bhs(int rw, struct buffer_head 
*bhs[], int num)
bio_add_page(bio, bh->b_page, bh->b_size, bh_offset(bh));
}

[PATCH 13/42] xfs: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has xfs use bio->bi_op for REQ_OPs and bio->bi_rw for rq_flag_bits.
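
As a rough illustration of the conversion pattern, the sketch below mirrors the shape of the _xfs_buf_ioapply change: buffer flags are mapped to one operation plus a separate flags word. All DEMO_* names and bit values are hypothetical stand-ins and not the kernel definitions, and the sync case is simplified to a single flag.

```c
#include <assert.h>

/* Hypothetical stand-ins for the XBF_* buffer flags (values are made up). */
#define DEMO_XBF_WRITE      (1u << 0)
#define DEMO_XBF_READ_AHEAD (1u << 1)
#define DEMO_XBF_SYNCIO     (1u << 2)
#define DEMO_XBF_FUA        (1u << 3)
#define DEMO_XBF_FLUSH      (1u << 4)

/* Hypothetical stand-ins for the request flag bits (values are made up). */
#define DEMO_REQ_SYNC   (1u << 0)
#define DEMO_REQ_META   (1u << 1)
#define DEMO_REQ_FUA    (1u << 2)
#define DEMO_REQ_FLUSH  (1u << 3)
#define DEMO_REQ_RAHEAD (1u << 4)

enum demo_op { DEMO_OP_READ, DEMO_OP_WRITE };

struct demo_io {
	enum demo_op op;        /* what to do */
	unsigned int op_flags;  /* how to do it */
};

/* Map buffer flags to an operation and a separate set of modifier flags. */
static struct demo_io demo_buf_to_io(unsigned int b_flags)
{
	struct demo_io io = { DEMO_OP_READ, 0 };

	if (b_flags & DEMO_XBF_WRITE) {
		io.op = DEMO_OP_WRITE;
		if (b_flags & DEMO_XBF_SYNCIO)
			io.op_flags |= DEMO_REQ_SYNC;
		if (b_flags & DEMO_XBF_FUA)
			io.op_flags |= DEMO_REQ_FUA;
		if (b_flags & DEMO_XBF_FLUSH)
			io.op_flags |= DEMO_REQ_FLUSH;
	} else if (b_flags & DEMO_XBF_READ_AHEAD) {
		io.op_flags |= DEMO_REQ_RAHEAD;
	}

	/* The buffer cache is only used for metadata, so always tag it. */
	io.op_flags |= DEMO_REQ_META;
	return io;
}

/* Small self-check showing how the helper is meant to be called. */
static void demo_buf_to_io_check(void)
{
	struct demo_io io = demo_buf_to_io(DEMO_XBF_WRITE | DEMO_XBF_FUA);

	assert(io.op == DEMO_OP_WRITE);
	assert(io.op_flags == (DEMO_REQ_FUA | DEMO_REQ_META));
}
```

The point of the split is that the write/read decision is made once, while modifiers accumulate independently in op_flags.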

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Acked-by: Dave Chinner 
---
 fs/xfs/xfs_aops.c |  3 +--
 fs/xfs/xfs_buf.c  | 27 +++
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 5852c5a..6c20336 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -439,10 +439,9 @@ xfs_submit_ioend(
 
ioend->io_bio->bi_private = ioend;
ioend->io_bio->bi_end_io = xfs_end_bio;
+   ioend->io_bio->bi_op = REQ_OP_WRITE;
if (wbc->sync_mode)
ioend->io_bio->bi_rw = WRITE_SYNC;
-   else
-   ioend->io_bio->bi_rw = WRITE;
/*
 * If we are failing the IO now, just mark the ioend with an
 * error and finish it. This will run IO completion immediately
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 079bb77..917774e 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1131,7 +1131,8 @@ xfs_buf_ioapply_map(
int map,
int *buf_offset,
int *count,
-   int rw)
+   int op,
+   int op_flags)
 {
int page_index;
int total_nr_pages = bp->b_page_count;
@@ -1170,7 +1171,8 @@ next_chunk:
bio->bi_iter.bi_sector = sector;
bio->bi_end_io = xfs_buf_bio_end_io;
bio->bi_private = bp;
-   bio->bi_rw = rw;
+   bio->bi_op = op;
+   bio->bi_rw = op_flags;
 
for (; size && nr_pages; nr_pages--, page_index++) {
int rbytes, nbytes = PAGE_SIZE - offset;
@@ -1214,7 +1216,8 @@ _xfs_buf_ioapply(
struct xfs_buf  *bp)
 {
struct blk_plug plug;
-   int rw;
+   int op;
+   int op_flags = 0;
int offset;
int size;
int i;
@@ -1233,14 +1236,13 @@ _xfs_buf_ioapply(
bp->b_ioend_wq = bp->b_target->bt_mount->m_buf_workqueue;
 
if (bp->b_flags & XBF_WRITE) {
+   op = REQ_OP_WRITE;
if (bp->b_flags & XBF_SYNCIO)
-   rw = WRITE_SYNC;
-   else
-   rw = WRITE;
+   op_flags = WRITE_SYNC;
if (bp->b_flags & XBF_FUA)
-   rw |= REQ_FUA;
+   op_flags |= REQ_FUA;
if (bp->b_flags & XBF_FLUSH)
-   rw |= REQ_FLUSH;
+   op_flags |= REQ_FLUSH;
 
/*
 * Run the write verifier callback function if it exists. If
@@ -1270,13 +1272,14 @@ _xfs_buf_ioapply(
}
}
} else if (bp->b_flags & XBF_READ_AHEAD) {
-   rw = READA;
+   op = REQ_OP_READ;
+   op_flags = REQ_RAHEAD;
} else {
-   rw = READ;
+   op = REQ_OP_READ;
}
 
/* we only use the buffer cache for meta-data */
-   rw |= REQ_META;
+   op_flags |= REQ_META;
 
/*
 * Walk all the vectors issuing IO on them. Set up the initial offset
@@ -1288,7 +1291,7 @@ _xfs_buf_ioapply(
size = BBTOB(bp->b_io_length);
blk_start_plug(&plug);
for (i = 0; i < bp->b_map_count; i++) {
-   xfs_buf_ioapply_map(bp, i, &offset, &size, rw);
+   xfs_buf_ioapply_map(bp, i, &offset, &size, op, op_flags);
if (bp->b_error)
break;
if (size <= 0)
-- 
2.7.2



[PATCH 14/42] hfsplus: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has hfsplus use bio->bi_op for REQ_OPs and bio->bi_rw
for rq_flag_bits.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 fs/hfsplus/hfsplus_fs.h |  2 +-
 fs/hfsplus/part_tbl.c   |  5 +++--
 fs/hfsplus/super.c  |  6 --
 fs/hfsplus/wrapper.c| 15 +--
 4 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
index f91a1fa..80154aa 100644
--- a/fs/hfsplus/hfsplus_fs.h
+++ b/fs/hfsplus/hfsplus_fs.h
@@ -525,7 +525,7 @@ int hfsplus_compare_dentry(const struct dentry *parent,
 
 /* wrapper.c */
 int hfsplus_submit_bio(struct super_block *sb, sector_t sector, void *buf,
-  void **data, int rw);
+  void **data, int op, int op_flags);
 int hfsplus_read_wrapper(struct super_block *sb);
 
 /* time macros */
diff --git a/fs/hfsplus/part_tbl.c b/fs/hfsplus/part_tbl.c
index eb355d8..63164eb 100644
--- a/fs/hfsplus/part_tbl.c
+++ b/fs/hfsplus/part_tbl.c
@@ -112,7 +112,8 @@ static int hfs_parse_new_pmap(struct super_block *sb, void 
*buf,
if ((u8 *)pm - (u8 *)buf >= buf_size) {
res = hfsplus_submit_bio(sb,
 *part_start + HFS_PMAP_BLK + i,
-buf, (void **)&pm, READ);
+buf, (void **)&pm, REQ_OP_READ,
+0);
if (res)
return res;
}
@@ -136,7 +137,7 @@ int hfs_part_find(struct super_block *sb,
return -ENOMEM;
 
res = hfsplus_submit_bio(sb, *part_start + HFS_PMAP_BLK,
-buf, &data, READ);
+buf, &data, REQ_OP_READ, 0);
if (res)
goto out;
 
diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
index c359113..d3646c2 100644
--- a/fs/hfsplus/super.c
+++ b/fs/hfsplus/super.c
@@ -219,7 +219,8 @@ static int hfsplus_sync_fs(struct super_block *sb, int wait)
 
error2 = hfsplus_submit_bio(sb,
   sbi->part_start + HFSPLUS_VOLHEAD_SECTOR,
-  sbi->s_vhdr_buf, NULL, WRITE_SYNC);
+  sbi->s_vhdr_buf, NULL, REQ_OP_WRITE,
+  WRITE_SYNC);
if (!error)
error = error2;
if (!write_backup)
@@ -227,7 +228,8 @@ static int hfsplus_sync_fs(struct super_block *sb, int wait)
 
error2 = hfsplus_submit_bio(sb,
  sbi->part_start + sbi->sect_count - 2,
- sbi->s_backup_vhdr_buf, NULL, WRITE_SYNC);
+ sbi->s_backup_vhdr_buf, NULL, REQ_OP_WRITE,
+ WRITE_SYNC);
if (!error)
error2 = error;
 out:
diff --git a/fs/hfsplus/wrapper.c b/fs/hfsplus/wrapper.c
index d026bb3..c5c916d 100644
--- a/fs/hfsplus/wrapper.c
+++ b/fs/hfsplus/wrapper.c
@@ -30,7 +30,8 @@ struct hfsplus_wd {
  * @sector: block to read or write, for blocks of HFSPLUS_SECTOR_SIZE bytes
  * @buf: buffer for I/O
  * @data: output pointer for location of requested data
- * @rw: direction of I/O
+ * @op: direction of I/O
+ * @op_flags: request op flags
  *
  * The unit of I/O is hfsplus_min_io_size(sb), which may be bigger than
  * HFSPLUS_SECTOR_SIZE, and @buf must be sized accordingly. On reads
@@ -44,7 +45,7 @@ struct hfsplus_wd {
  * will work correctly.
  */
 int hfsplus_submit_bio(struct super_block *sb, sector_t sector,
-   void *buf, void **data, int rw)
+  void *buf, void **data, int op, int op_flags)
 {
struct bio *bio;
int ret = 0;
@@ -65,9 +66,10 @@ int hfsplus_submit_bio(struct super_block *sb, sector_t 
sector,
bio = bio_alloc(GFP_NOIO, 1);
bio->bi_iter.bi_sector = sector;
bio->bi_bdev = sb->s_bdev;
-   bio->bi_rw = rw;
+   bio->bi_op = op;
+   bio->bi_rw = op_flags;
 
-   if (!(rw & WRITE) && data)
+   if (op != WRITE && data)
*data = (u8 *)buf + offset;
 
while (io_size > 0) {
@@ -182,7 +184,7 @@ int hfsplus_read_wrapper(struct super_block *sb)
 reread:
error = hfsplus_submit_bio(sb, part_start + HFSPLUS_VOLHEAD_SECTOR,
   sbi->s_vhdr_buf, (void **)&sbi->s_vhdr,
-  READ);
+  REQ_OP_READ, 0);
if (error)
goto out_free_backup_vhdr;
 
@@ -214,7 +216,8 @@ reread:
 
error = hfsplus_submit_bio(sb, part_start + part_size - 2,
   sbi->s_backup_vhdr_buf,
-  (void **)&sbi->s_backup_vhdr, READ);
+  (void **)&sbi->s_backup_vhdr, REQ_OP_READ,
+

[PATCH 02/42] block: add REQ_OP definitions and bi_op/op fields

2016-04-13 Thread mchristi
From: Mike Christie 

The following patches separate the operation (write, read, discard,
etc) from the rq_flag_bits flags. This patch adds definitions for
request/bio operations, adds fields to the request/bio to set them, and
some temporary compat code so the kernel/modules can use either one. In
the final patches this compat code will be removed when everything is converted.

In this patch the REQ_OP values match the corresponding REQ rq_flag_bits
values for compat reasons while the rest of the code is converted in this
set. In the last patches that aliasing will be removed and the bi_op field
will be shrunk.
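
A minimal sketch of the transitional scheme follows, assuming made-up COMPAT_* bit positions rather than the real rq_flag_bits layout. Because each op value aliases a flag bit, folding the op into the legacy flag word lets unconverted code keep testing the old bits.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bits; the positions are assumptions, not the kernel's. */
#define COMPAT_REQ_WRITE      (1u << 0)
#define COMPAT_REQ_SYNC       (1u << 4)
#define COMPAT_REQ_DISCARD    (1u << 7)
#define COMPAT_REQ_WRITE_SAME (1u << 9)

/* During the transition each op value is the same as its legacy flag bit. */
enum compat_req_op {
	COMPAT_OP_READ       = 0,
	COMPAT_OP_WRITE      = COMPAT_REQ_WRITE,
	COMPAT_OP_DISCARD    = COMPAT_REQ_DISCARD,
	COMPAT_OP_WRITE_SAME = COMPAT_REQ_WRITE_SAME,
};

/* Hypothetical, cut-down bio with only the two fields of interest. */
struct compat_bio {
	unsigned long bi_rw; /* legacy flag word */
	int bi_op;           /* new operation field */
};

/* Fold the op into the flag word so callers may set either field or both. */
static void compat_fold_op(struct compat_bio *bio)
{
	assert(bio != NULL);
	bio->bi_rw |= (unsigned long)bio->bi_op;
}

/* Unconverted code can still look at the legacy write bit. */
static int compat_legacy_is_write(const struct compat_bio *bio)
{
	return (bio->bi_rw & COMPAT_REQ_WRITE) != 0;
}
```

Once every caller sets bi_op, the fold step and the aliasing can both go away and the op values no longer need to be bit masks.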

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 block/blk-core.c  | 19 ---
 include/linux/blk_types.h | 15 ++-
 include/linux/blkdev.h|  1 +
 include/linux/fs.h| 37 +++--
 4 files changed, 66 insertions(+), 6 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index f3895c1..6bcc22e 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1698,7 +1698,8 @@ void init_request_from_bio(struct request *req, struct 
bio *bio)
 {
req->cmd_type = REQ_TYPE_FS;
 
-   req->cmd_flags |= bio->bi_rw & REQ_COMMON_MASK;
+   /* tmp compat. Allow users to set bi_op or bi_rw */
+   req->cmd_flags |= (bio->bi_rw | bio->bi_op) & REQ_COMMON_MASK;
if (bio->bi_rw & REQ_RAHEAD)
req->cmd_flags |= REQ_FAILFAST_MASK;
 
@@ -2033,6 +2034,12 @@ blk_qc_t generic_make_request(struct bio *bio)
struct bio_list bio_list_on_stack;
blk_qc_t ret = BLK_QC_T_NONE;
 
+   /* tmp compat. Allow users to set either one or both.
+* This will be removed when we have converted
+* everyone in the next patches.
+*/
+   bio->bi_rw |= bio->bi_op;
+
if (!generic_make_request_checks(bio))
goto out;
 
@@ -2102,6 +2109,12 @@ EXPORT_SYMBOL(generic_make_request);
  */
 blk_qc_t submit_bio(struct bio *bio)
 {
+   /* tmp compat. Allow users to set either one or both.
+* This will be removed when we have converted
+* everyone in the next patches.
+*/
+   bio->bi_rw |= bio->bi_op;
+
/*
 * If it's a regular read/write or a barrier with data attached,
 * go through the normal accounting stuff before submission.
@@ -2975,8 +2988,8 @@ EXPORT_SYMBOL_GPL(__blk_end_request_err);
 void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
 struct bio *bio)
 {
-   /* Bit 0 (R/W) is identical in rq->cmd_flags and bio->bi_rw */
-   rq->cmd_flags |= bio->bi_rw & REQ_WRITE;
+   /* tmp compat. Allow users to set bi_op or bi_rw */
+   rq->cmd_flags |= bio_data_dir(bio);
 
if (bio_has_data(bio))
rq->nr_phys_segments = bio_phys_segments(q, bio);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 86a38ea..6e49c91 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -48,9 +48,15 @@ struct bio {
struct block_device *bi_bdev;
unsigned intbi_flags;   /* status, command, etc */
int bi_error;
-   unsigned long   bi_rw;  /* bottom bits READ/WRITE,
+   unsigned long   bi_rw;  /* bottom bits rq_flags_bits
 * top bits priority
 */
+   /*
+* this will be a u8 in the next patches and bi_rw can be shrunk to
+* a u32. For compat in these transitional patches op is an int here.
+*/
+   int bi_op;  /* REQ_OP */
+
 
struct bvec_iterbi_iter;
 
@@ -242,6 +248,13 @@ enum rq_flag_bits {
 #define REQ_HASHED (1ULL << __REQ_HASHED)
 #define REQ_MQ_INFLIGHT(1ULL << __REQ_MQ_INFLIGHT)
 
+enum req_op {
+   REQ_OP_READ,
+   REQ_OP_WRITE= REQ_WRITE,
+   REQ_OP_DISCARD  = REQ_DISCARD,
+   REQ_OP_WRITE_SAME   = REQ_WRITE_SAME,
+};
+
 typedef unsigned int blk_qc_t;
 #define BLK_QC_T_NONE  -1U
 #define BLK_QC_T_SHIFT 16
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f3f232f..e2b2881 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -96,6 +96,7 @@ struct request {
struct request_queue *q;
struct blk_mq_ctx *mq_ctx;
 
+   int op;
u64 cmd_flags;
unsigned cmd_type;
unsigned long atomic_flags;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 69bdd03..96ace0f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2452,15 +2452,48 @@ extern void make_bad_inode(struct inode *);
 extern bool is_bad_inode(struct inode *);
 
 #ifdef CONFIG_BLOCK
+
+static inline bool op_is_write(int op)
+{
+   switch (op) {
+   case REQ_OP_WRITE:
+   case REQ_OP_WRITE_SAME:
+   case REQ_OP_DISCARD:
+   return true;
+   defa

[PATCH 07/42] btrfs: have submit_one_bio users setup bio bi_op

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has btrfs's submit_one_bio callers set the bio->bi_op to a
REQ_OP and the bi_rw to rq_flag_bits.

The next patches will continue to convert btrfs,
so submit_bio_hook and merge_bio_hook
related code will be modified to take only the bio. I did
not do it in this patch to try and keep it smaller.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 fs/btrfs/extent_io.c | 88 +++-
 1 file changed, 45 insertions(+), 43 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index fdeb8fa..45fa3be 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2377,7 +2377,7 @@ static int bio_readpage_error(struct bio *failed_bio, u64 
phy_offset,
int read_mode;
int ret;
 
-   BUG_ON(failed_bio->bi_rw & REQ_WRITE);
+   BUG_ON(failed_bio->bi_op == REQ_OP_WRITE);
 
ret = btrfs_get_io_failure_record(inode, start, end, &failrec);
if (ret)
@@ -2403,6 +2403,8 @@ static int bio_readpage_error(struct bio *failed_bio, u64 
phy_offset,
free_io_failure(inode, failrec);
return -EIO;
}
+   bio->bi_op = REQ_OP_READ;
+   bio->bi_rw = read_mode;
 
pr_debug("Repair Read Error: submitting new read[%#x] to 
this_mirror=%d, in_validation=%d\n",
 read_mode, failrec->this_mirror, failrec->in_validation);
@@ -2714,8 +2716,8 @@ struct bio *btrfs_io_bio_alloc(gfp_t gfp_mask, unsigned 
int nr_iovecs)
 }
 
 
-static int __must_check submit_one_bio(int rw, struct bio *bio,
-  int mirror_num, unsigned long bio_flags)
+static int __must_check submit_one_bio(struct bio *bio, int mirror_num,
+  unsigned long bio_flags)
 {
int ret = 0;
struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1;
@@ -2726,12 +2728,12 @@ static int __must_check submit_one_bio(int rw, struct 
bio *bio,
start = page_offset(page) + bvec->bv_offset;
 
bio->bi_private = NULL;
-   bio->bi_rw = rw;
bio_get(bio);
 
if (tree->ops && tree->ops->submit_bio_hook)
-   ret = tree->ops->submit_bio_hook(page->mapping->host, rw, bio,
-  mirror_num, bio_flags, start);
+   ret = tree->ops->submit_bio_hook(page->mapping->host,
+bio->bi_rw, bio, mirror_num,
+bio_flags, start);
else
btrfsic_submit_bio(bio);
 
@@ -2739,20 +2741,20 @@ static int __must_check submit_one_bio(int rw, struct 
bio *bio,
return ret;
 }
 
-static int merge_bio(int rw, struct extent_io_tree *tree, struct page *page,
+static int merge_bio(struct extent_io_tree *tree, struct page *page,
 unsigned long offset, size_t size, struct bio *bio,
 unsigned long bio_flags)
 {
int ret = 0;
if (tree->ops && tree->ops->merge_bio_hook)
-   ret = tree->ops->merge_bio_hook(rw, page, offset, size, bio,
-   bio_flags);
+   ret = tree->ops->merge_bio_hook(bio->bi_op, page, offset, size,
+   bio, bio_flags);
BUG_ON(ret < 0);
return ret;
 
 }
 
-static int submit_extent_page(int rw, struct extent_io_tree *tree,
+static int submit_extent_page(int op, int op_flags, struct extent_io_tree 
*tree,
  struct writeback_control *wbc,
  struct page *page, sector_t sector,
  size_t size, unsigned long offset,
@@ -2780,10 +2782,9 @@ static int submit_extent_page(int rw, struct 
extent_io_tree *tree,
 
if (prev_bio_flags != bio_flags || !contig ||
force_bio_submit ||
-   merge_bio(rw, tree, page, offset, page_size, bio, 
bio_flags) ||
+   merge_bio(tree, page, offset, page_size, bio, bio_flags) ||
bio_add_page(bio, page, page_size, offset) < page_size) {
-   ret = submit_one_bio(rw, bio, mirror_num,
-prev_bio_flags);
+   ret = submit_one_bio(bio, mirror_num, prev_bio_flags);
if (ret < 0) {
*bio_ret = NULL;
return ret;
@@ -2804,6 +2805,8 @@ static int submit_extent_page(int rw, struct 
extent_io_tree *tree,
bio_add_page(bio, page, page_size, offset);
bio->bi_end_io = end_io_func;
bio->bi_private = tree;
+   bio->bi_op = op;
+   bio->bi_rw = op_flags;
if (wbc) {
wbc_init_bio(wbc, bio);
wbc_account_io(wbc, page, page_size);
@@ -2812,7 +2815,7 @@ static int submit_extent_page(int rw, struct 
extent_io_tree *tree,
if (bio_ret)
 

[PATCH 00/42] v5: separate operations from flags in the bio/request structs

2016-04-13 Thread mchristi
The following patches begin to clean up the request->cmd_flags and
bio->bi_rw mess. We currently use cmd_flags to specify the operation,
attributes and state of the request. For bi_rw we use it for similar
info and also the priority but then also have another bi_flags field
for state. At some point, we abused them so much we just made cmd_flags
64 bits, so we could add more.

The following patches separate the operation (read, write, discard,
flush, etc) from cmd_flags/bi_rw.
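
To make the target of the series concrete, here is a small sketch of pulling one combined legacy word apart into an enumerated operation and a flags word. The LEGACY_* bits and split_* names are hypothetical and only model read, write and discard.

```c
#include <assert.h>

/* Hypothetical legacy bits: operation and attributes share one word. */
#define LEGACY_WRITE   (1u << 0)
#define LEGACY_SYNC    (1u << 1)
#define LEGACY_META    (1u << 2)
#define LEGACY_DISCARD (1u << 3)

#define LEGACY_OP_MASK (LEGACY_WRITE | LEGACY_DISCARD)

/* After the split the operation is a plain enumerated value. */
enum split_op { SPLIT_OP_READ, SPLIT_OP_WRITE, SPLIT_OP_DISCARD };

struct split_io {
	enum split_op op;
	unsigned int flags; /* attributes only, no operation bits */
};

static struct split_io split_legacy_rw(unsigned int rw)
{
	struct split_io io;

	/* In this model a discard is issued with the write bit also set,
	 * so the discard bit has to be tested first. */
	if (rw & LEGACY_DISCARD)
		io.op = SPLIT_OP_DISCARD;
	else if (rw & LEGACY_WRITE)
		io.op = SPLIT_OP_WRITE;
	else
		io.op = SPLIT_OP_READ;

	io.flags = rw & ~LEGACY_OP_MASK;
	return io;
}

/* Small self-check showing the intended use. */
static void split_legacy_rw_check(void)
{
	struct split_io io = split_legacy_rw(LEGACY_WRITE | LEGACY_SYNC);

	assert(io.op == SPLIT_OP_WRITE);
	assert(io.flags == LEGACY_SYNC);
}
```

With the operation stored as a value instead of a bit, a check such as op == SPLIT_OP_WRITE no longer matches discards by accident.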

This patchset was made against linux-next from today April 13
(git tag next-20160413).

I put a git tree here:
https://github.com/mikechristie/linux-kernel.git
The patches are in the op branch.

v5:
1. Missed crypto fs submit_bio_wait call.
2. Change nfs bi_rw check to bi_op.
3. btrfs. Convert finish_parity_scrub.
4. Reworked against Jens's QUEUE_FLAG patches so I could drop my similar
code.
5. Separated the core block layer change into multiple patches for
merging, elevator, stats, mq and non mq request allocation to try
and make it easier to read.

v4:
1. Rebased to current linux-next tree.

v3:

1. Used "=" instead of "|="  to setup bio bi_rw.
2. Removed __get_request cmd_flags compat code.
3. Merged initial dm related changes requested by Mike Snitzer.
4. Fixed ubd kbuild errors in flush related patches.
5. Fix 80 char col issues in several patches.
6. Fix issue with one of the btrfs patches where it looks like I reverted
a patch when trying to fix a merge error.

v2:

1. Dropped arguments from submit_bio, and had callers setup
bio.
2. Add REQ_OP_FLUSH for request_fn users and renamed REQ_FLUSH
to REQ_PREFLUSH for make_request_fn users.
3. Dropped bio/rq_data_dir functions, and added a op_is_write
function instead.




[PATCH 10/42] btrfs: use bio fields for op and flags

2016-04-13 Thread mchristi
From: Mike Christie 

The bio bi_op and bi_rw fields are now set up, so there is no need
to pass around the rq_flag_bits too. btrfs users should
access the bio instead.

v2:

1. Fix merge_bio issue where instead of removing rw/op argument
I passed it in again to the merge_bio related functions.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 fs/btrfs/compression.c | 13 ++---
 fs/btrfs/ctree.h   |  2 +-
 fs/btrfs/disk-io.c | 30 --
 fs/btrfs/disk-io.h |  2 +-
 fs/btrfs/extent_io.c   | 12 +---
 fs/btrfs/extent_io.h   |  8 
 fs/btrfs/inode.c   | 44 
 fs/btrfs/volumes.c |  6 +++---
 fs/btrfs/volumes.h |  2 +-
 9 files changed, 53 insertions(+), 66 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 334a00c..356ac36 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -374,7 +374,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 
start,
page = compressed_pages[pg_index];
page->mapping = inode->i_mapping;
if (bio->bi_iter.bi_size)
-   ret = io_tree->ops->merge_bio_hook(WRITE, page, 0,
+   ret = io_tree->ops->merge_bio_hook(page, 0,
   PAGE_SIZE,
   bio, 0);
else
@@ -402,7 +402,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 
start,
BUG_ON(ret); /* -ENOMEM */
}
 
-   ret = btrfs_map_bio(root, WRITE, bio, 0, 1);
+   ret = btrfs_map_bio(root, bio, 0, 1);
BUG_ON(ret); /* -ENOMEM */
 
bio_put(bio);
@@ -433,7 +433,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 
start,
BUG_ON(ret); /* -ENOMEM */
}
 
-   ret = btrfs_map_bio(root, WRITE, bio, 0, 1);
+   ret = btrfs_map_bio(root, bio, 0, 1);
BUG_ON(ret); /* -ENOMEM */
 
bio_put(bio);
@@ -659,7 +659,7 @@ int btrfs_submit_compressed_read(struct inode *inode, 
struct bio *bio,
page->index = em_start >> PAGE_SHIFT;
 
if (comp_bio->bi_iter.bi_size)
-   ret = tree->ops->merge_bio_hook(READ, page, 0,
+   ret = tree->ops->merge_bio_hook(page, 0,
PAGE_SIZE,
comp_bio, 0);
else
@@ -690,8 +690,7 @@ int btrfs_submit_compressed_read(struct inode *inode, 
struct bio *bio,
sums += DIV_ROUND_UP(comp_bio->bi_iter.bi_size,
 root->sectorsize);
 
-   ret = btrfs_map_bio(root, READ, comp_bio,
-   mirror_num, 0);
+   ret = btrfs_map_bio(root, comp_bio, mirror_num, 0);
if (ret) {
bio->bi_error = ret;
bio_endio(comp_bio);
@@ -721,7 +720,7 @@ int btrfs_submit_compressed_read(struct inode *inode, 
struct bio *bio,
BUG_ON(ret); /* -ENOMEM */
}
 
-   ret = btrfs_map_bio(root, READ, comp_bio, mirror_num, 0);
+   ret = btrfs_map_bio(root, comp_bio, mirror_num, 0);
if (ret) {
bio->bi_error = ret;
bio_endio(comp_bio);
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index a0c6aca..4e50f59 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -4087,7 +4087,7 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle 
*trans,
 struct btrfs_root *new_root,
 struct btrfs_root *parent_root,
 u64 new_dirid);
-int btrfs_merge_bio_hook(int rw, struct page *page, unsigned long offset,
+int btrfs_merge_bio_hook(struct page *page, unsigned long offset,
 size_t size, struct bio *bio,
 unsigned long bio_flags);
 int btrfs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf);
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 932268b..bea6df6 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -124,7 +124,6 @@ struct async_submit_bio {
struct list_head list;
extent_submit_bio_hook_t *submit_bio_start;
extent_submit_bio_hook_t *submit_bio_done;
-   int rw;
int mirror_num;
unsigned long bio_flags;
/*
@@ -797,7 +796,7 @@ static void run_one_async_start(struct btrfs_work *work)
int ret;
 
async = container_of(work, struct  async_submit_bio, work);
-   ret = async->submit_bio_start(async->inode, async->rw, async->bio,
+   ret = async->submit_bio_start(async->inode, async->b

[PATCH 06/42] direct-io: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has the dio code set the bio bi_op to a REQ_OP and the bio bi_rw
to rq_flag_bits.

It also begins to convert btrfs's dio_submit_t because of the dio submit_io
callout use. In the btrfs_submit_direct change, I OR'd the op and flags back
together. This is only temporary. The next patches will completely convert
all the btrfs code paths.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 fs/btrfs/inode.c   |  9 +
 fs/direct-io.c | 35 +--
 include/linux/fs.h |  2 +-
 3 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9cc2256..d999fdf 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8401,14 +8401,14 @@ out_err:
return 0;
 }
 
-static void btrfs_submit_direct(int rw, struct bio *dio_bio,
-   struct inode *inode, loff_t file_offset)
+static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
+   loff_t file_offset)
 {
struct btrfs_dio_private *dip = NULL;
struct bio *io_bio = NULL;
struct btrfs_io_bio *btrfs_bio;
int skip_sum;
-   int write = rw & REQ_WRITE;
+   bool write = (dio_bio->bi_op == REQ_OP_WRITE);
int ret = 0;
 
skip_sum = BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM;
@@ -8459,7 +8459,8 @@ static void btrfs_submit_direct(int rw, struct bio 
*dio_bio,
dio_data->unsubmitted_oe_range_end;
}
 
-   ret = btrfs_submit_direct_hook(rw, dip, skip_sum);
+   ret = btrfs_submit_direct_hook(dio_bio->bi_op | dio_bio->bi_rw, dip,
+  skip_sum);
if (!ret)
return;
 
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 1890ad2..64bfab0 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -108,7 +108,8 @@ struct dio_submit {
 /* dio_state communicated between submission path and end_io */
 struct dio {
int flags;  /* doesn't change */
-   int rw;
+   int op;
+   int op_flags;
blk_qc_t bio_cookie;
struct block_device *bio_bdev;
struct inode *inode;
@@ -163,7 +164,7 @@ static inline int dio_refill_pages(struct dio *dio, struct 
dio_submit *sdio)
ret = iov_iter_get_pages(sdio->iter, dio->pages, LONG_MAX, DIO_PAGES,
&sdio->from);
 
-   if (ret < 0 && sdio->blocks_available && (dio->rw & WRITE)) {
+   if (ret < 0 && sdio->blocks_available && (dio->op == REQ_OP_WRITE)) {
struct page *page = ZERO_PAGE(0);
/*
 * A memory fault, but the filesystem has some outstanding
@@ -242,7 +243,8 @@ static ssize_t dio_complete(struct dio *dio, loff_t offset, 
ssize_t ret,
transferred = dio->result;
 
/* Check for short read case */
-   if ((dio->rw == READ) && ((offset + transferred) > dio->i_size))
+   if ((dio->op == REQ_OP_READ) &&
+   ((offset + transferred) > dio->i_size))
transferred = dio->i_size - offset;
}
 
@@ -265,7 +267,7 @@ static ssize_t dio_complete(struct dio *dio, loff_t offset, 
ssize_t ret,
inode_dio_end(dio->inode);
 
if (is_async) {
-   if (dio->rw & WRITE) {
+   if (dio->op == REQ_OP_WRITE) {
int err;
 
err = generic_write_sync(dio->iocb->ki_filp, offset,
@@ -374,7 +376,8 @@ dio_bio_alloc(struct dio *dio, struct dio_submit *sdio,
 
bio->bi_bdev = bdev;
bio->bi_iter.bi_sector = first_sector;
-   bio->bi_rw = dio->rw;
+   bio->bi_op = dio->op;
+   bio->bi_rw = dio->op_flags;
if (dio->is_async)
bio->bi_end_io = dio_bio_end_aio;
else
@@ -402,14 +405,13 @@ static inline void dio_bio_submit(struct dio *dio, struct 
dio_submit *sdio)
dio->refcount++;
spin_unlock_irqrestore(&dio->bio_lock, flags);
 
-   if (dio->is_async && dio->rw == READ && dio->should_dirty)
+   if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty)
bio_set_pages_dirty(bio);
 
dio->bio_bdev = bio->bi_bdev;
 
if (sdio->submit_io) {
-   sdio->submit_io(dio->rw, bio, dio->inode,
-  sdio->logical_offset_in_bio);
+   sdio->submit_io(bio, dio->inode, sdio->logical_offset_in_bio);
dio->bio_cookie = BLK_QC_T_NONE;
} else
dio->bio_cookie = submit_bio(bio);
@@ -478,14 +480,14 @@ static int dio_bio_complete(struct dio *dio, struct bio 
*bio)
if (bio->bi_error)
dio->io_error = -EIO;
 
-   if (dio->is_async && dio->rw == READ && dio->should_dirty) {
+   if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty) {
err = bio->bi_error;
bio_check_pages_dirty

[PATCH 16/42] nilfs: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has nilfs use bio->bi_op for REQ_OPs and bio->bi_rw
for rq_flag_bits.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 fs/nilfs2/segbuf.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/fs/nilfs2/segbuf.c b/fs/nilfs2/segbuf.c
index 7666f1d..7b13e14 100644
--- a/fs/nilfs2/segbuf.c
+++ b/fs/nilfs2/segbuf.c
@@ -350,7 +350,8 @@ static void nilfs_end_bio_write(struct bio *bio)
 }
 
 static int nilfs_segbuf_submit_bio(struct nilfs_segment_buffer *segbuf,
-  struct nilfs_write_info *wi, int mode)
+  struct nilfs_write_info *wi, int mode,
+  int mode_flags)
 {
struct bio *bio = wi->bio;
int err;
@@ -368,7 +369,8 @@ static int nilfs_segbuf_submit_bio(struct 
nilfs_segment_buffer *segbuf,
 
bio->bi_end_io = nilfs_end_bio_write;
bio->bi_private = segbuf;
-   bio->bi_rw = mode;
+   bio->bi_op = mode;
+   bio->bi_rw = mode_flags;
submit_bio(bio);
segbuf->sb_nbio++;
 
@@ -442,7 +444,7 @@ static int nilfs_segbuf_submit_bh(struct 
nilfs_segment_buffer *segbuf,
return 0;
}
/* bio is FULL */
-   err = nilfs_segbuf_submit_bio(segbuf, wi, mode);
+   err = nilfs_segbuf_submit_bio(segbuf, wi, mode, 0);
/* never submit current bh */
if (likely(!err))
goto repeat;
@@ -466,19 +468,19 @@ static int nilfs_segbuf_write(struct nilfs_segment_buffer 
*segbuf,
 {
struct nilfs_write_info wi;
struct buffer_head *bh;
-   int res = 0, rw = WRITE;
+   int res = 0;
 
wi.nilfs = nilfs;
nilfs_segbuf_prepare_write(segbuf, &wi);
 
list_for_each_entry(bh, &segbuf->sb_segsum_buffers, b_assoc_buffers) {
-   res = nilfs_segbuf_submit_bh(segbuf, &wi, bh, rw);
+   res = nilfs_segbuf_submit_bh(segbuf, &wi, bh, REQ_OP_WRITE);
if (unlikely(res))
goto failed_bio;
}
 
list_for_each_entry(bh, &segbuf->sb_payload_buffers, b_assoc_buffers) {
-   res = nilfs_segbuf_submit_bh(segbuf, &wi, bh, rw);
+   res = nilfs_segbuf_submit_bh(segbuf, &wi, bh, REQ_OP_WRITE);
if (unlikely(res))
goto failed_bio;
}
@@ -488,8 +490,8 @@ static int nilfs_segbuf_write(struct nilfs_segment_buffer 
*segbuf,
 * Last BIO is always sent through the following
 * submission.
 */
-   rw |= REQ_SYNC;
-   res = nilfs_segbuf_submit_bio(segbuf, &wi, rw);
+   res = nilfs_segbuf_submit_bio(segbuf, &wi, REQ_OP_WRITE,
+ REQ_SYNC);
}
 
  failed_bio:
-- 
2.7.2



[PATCH 17/42] ocfs2: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has ocfs2 use bio->bi_op for REQ_OPs and bio->bi_rw
for rq_flag_bits.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 fs/ocfs2/cluster/heartbeat.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index e37373d..0619a20 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -530,7 +530,8 @@ static void o2hb_bio_end_io(struct bio *bio)
 static struct bio *o2hb_setup_one_bio(struct o2hb_region *reg,
  struct o2hb_bio_wait_ctxt *wc,
  unsigned int *current_slot,
- unsigned int max_slots, int rw)
+ unsigned int max_slots, int op,
+ int op_flags)
 {
int len, current_page;
unsigned int vec_len, vec_start;
@@ -556,7 +557,8 @@ static struct bio *o2hb_setup_one_bio(struct o2hb_region 
*reg,
bio->bi_bdev = reg->hr_bdev;
bio->bi_private = wc;
bio->bi_end_io = o2hb_bio_end_io;
-   bio->bi_rw = rw;
+   bio->bi_op = op;
+   bio->bi_rw = op_flags;
 
vec_start = (cs << bits) % PAGE_SIZE;
while(cs < max_slots) {
@@ -593,7 +595,7 @@ static int o2hb_read_slots(struct o2hb_region *reg,
 
while(current_slot < max_slots) {
bio = o2hb_setup_one_bio(reg, &wc, &current_slot, max_slots,
-READ);
+REQ_OP_READ, 0);
if (IS_ERR(bio)) {
status = PTR_ERR(bio);
mlog_errno(status);
@@ -625,7 +627,8 @@ static int o2hb_issue_node_write(struct o2hb_region *reg,
 
slot = o2nm_this_node();
 
-   bio = o2hb_setup_one_bio(reg, write_wc, &slot, slot+1, WRITE_SYNC);
+   bio = o2hb_setup_one_bio(reg, write_wc, &slot, slot+1, REQ_OP_WRITE,
+WRITE_SYNC);
if (IS_ERR(bio)) {
status = PTR_ERR(bio);
mlog_errno(status);
-- 
2.7.2



[PATCH 20/42] dm: pass dm stats data dir instead of bi_rw

2016-04-13 Thread mchristi
From: Mike Christie 

It looks like dm stats cares about the data direction
(READ vs WRITE) and does not need the bio/request flags.
Commands like REQ_FLUSH, REQ_DISCARD and REQ_WRITE_SAME
are currently always set with REQ_WRITE, so the extra check for
REQ_DISCARD in dm_stats_account_io is not needed.

This patch has it use the bio and request data_dir helpers
instead of accessing the bi_rw/cmd_flags directly. This makes
the next patches that remove the operation from the cmd_flags
and bi_rw easier, because we will no longer have the REQ_WRITE
bit set for operations like discards.
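
The idea can be sketched as follows, with hypothetical STAT_* and stat_* names: once the write bit is gone, accounting derives a two-valued direction index from the operation, and discard and write-same count as writes in the same way the op_is_write helper in this series treats them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation values; these are not the kernel's enum. */
enum stat_req_op {
	STAT_OP_READ,
	STAT_OP_WRITE,
	STAT_OP_DISCARD,
	STAT_OP_WRITE_SAME,
};

/* Discard and write-same modify the device, so they count as writes. */
static bool stat_op_is_write(enum stat_req_op op)
{
	switch (op) {
	case STAT_OP_WRITE:
	case STAT_OP_WRITE_SAME:
	case STAT_OP_DISCARD:
		return true;
	default:
		return false;
	}
}

enum { STAT_DIR_READ = 0, STAT_DIR_WRITE = 1 };

/* Per-direction counters, indexed by data direction only. */
struct stat_counters {
	unsigned long ios[2];
	unsigned long sectors[2];
};

static void stat_account_io(struct stat_counters *s, enum stat_req_op op,
			    unsigned long len)
{
	int idx;

	assert(s != NULL);
	idx = stat_op_is_write(op) ? STAT_DIR_WRITE : STAT_DIR_READ;
	s->ios[idx]++;
	s->sectors[idx] += len;
}
```

Callers pass the direction (or the op) instead of the raw flag word, so the accounting code does not depend on which bits happen to be set in bi_rw.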

This patch is compile tested only.

v2:

1. Merged Mike Snitzer's fixes to pass in int instead of
unsigned long.

2. Fix 80 char col issues.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 drivers/md/dm-stats.c |  9 -
 drivers/md/dm.c   | 21 -
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/md/dm-stats.c b/drivers/md/dm-stats.c
index 8289804..4fba26c 100644
--- a/drivers/md/dm-stats.c
+++ b/drivers/md/dm-stats.c
@@ -514,11 +514,10 @@ static void dm_stat_round(struct dm_stat *s, struct dm_stat_shared *shared,
 }
 
 static void dm_stat_for_entry(struct dm_stat *s, size_t entry,
- unsigned long bi_rw, sector_t len,
+ int idx, sector_t len,
  struct dm_stats_aux *stats_aux, bool end,
  unsigned long duration_jiffies)
 {
-   unsigned long idx = bi_rw & REQ_WRITE;
struct dm_stat_shared *shared = &s->stat_shared[entry];
struct dm_stat_percpu *p;
 
@@ -584,7 +583,7 @@ static void dm_stat_for_entry(struct dm_stat *s, size_t entry,
 #endif
 }
 
-static void __dm_stat_bio(struct dm_stat *s, unsigned long bi_rw,
+static void __dm_stat_bio(struct dm_stat *s, int bi_rw,
  sector_t bi_sector, sector_t end_sector,
  bool end, unsigned long duration_jiffies,
  struct dm_stats_aux *stats_aux)
@@ -645,8 +644,8 @@ void dm_stats_account_io(struct dm_stats *stats, unsigned long bi_rw,
last = raw_cpu_ptr(stats->last);
stats_aux->merged =
(bi_sector == (ACCESS_ONCE(last->last_sector) &&
-  ((bi_rw & (REQ_WRITE | REQ_DISCARD)) ==
-   (ACCESS_ONCE(last->last_rw) & (REQ_WRITE | REQ_DISCARD)))
+  ((bi_rw == WRITE) ==
+   (ACCESS_ONCE(last->last_rw) == WRITE))
   ));
ACCESS_ONCE(last->last_sector) = end_sector;
ACCESS_ONCE(last->last_rw) = bi_rw;
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 5432da7..98fea0e 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -723,8 +723,9 @@ static void start_io_acct(struct dm_io *io)
atomic_inc_return(&md->pending[rw]));
 
if (unlikely(dm_stats_used(&md->stats)))
-   dm_stats_account_io(&md->stats, bio->bi_rw, bio->bi_iter.bi_sector,
-   bio_sectors(bio), false, 0, &io->stats_aux);
+   dm_stats_account_io(&md->stats, bio_data_dir(bio),
+   bio->bi_iter.bi_sector, bio_sectors(bio),
+   false, 0, &io->stats_aux);
 }
 
 static void end_io_acct(struct dm_io *io)
@@ -738,8 +739,9 @@ static void end_io_acct(struct dm_io *io)
generic_end_io_acct(rw, &dm_disk(md)->part0, io->start_time);
 
if (unlikely(dm_stats_used(&md->stats)))
-   dm_stats_account_io(&md->stats, bio->bi_rw, bio->bi_iter.bi_sector,
-   bio_sectors(bio), true, duration, &io->stats_aux);
+   dm_stats_account_io(&md->stats, bio_data_dir(bio),
+   bio->bi_iter.bi_sector, bio_sectors(bio),
+   true, duration, &io->stats_aux);
 
/*
 * After this is decremented the bio must not be touched if it is
@@ -1121,9 +1123,9 @@ static void rq_end_stats(struct mapped_device *md, struct request *orig)
if (unlikely(dm_stats_used(&md->stats))) {
struct dm_rq_target_io *tio = tio_from_request(orig);
tio->duration_jiffies = jiffies - tio->duration_jiffies;
-   dm_stats_account_io(&md->stats, orig->cmd_flags, blk_rq_pos(orig),
-   tio->n_sectors, true, tio->duration_jiffies,
-   &tio->stats_aux);
+   dm_stats_account_io(&md->stats, rq_data_dir(orig),
+   blk_rq_pos(orig), tio->n_sectors, true,
+   tio->duration_jiffies, &tio->stats_aux);
}
 }
 
@@ -2082,8 +2084,9 @@ static void dm_start_request(struct mapped_device *md, struct request *ori

[PATCH 19/42] dm: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has dm set bio->bi_op to the REQ_OP and bio->bi_rw
to the rq_flag_bits.
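
As a rough illustration of what testing the operation by equality
instead of by bit mask looks like, here is a userspace sketch modelled
loosely on the do_region() hunk in this patch. The MODEL_* names and
struct model_limits are hypothetical; this is not the dm code itself.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical operation codes; a request carries exactly one of them. */
enum model_op {
	MODEL_OP_READ,
	MODEL_OP_WRITE,
	MODEL_OP_DISCARD,
	MODEL_OP_WRITE_SAME
};

/* Hypothetical queue limits; zero means the operation is unsupported. */
struct model_limits {
	unsigned int max_discard_sectors;
	unsigned int max_write_same_sectors;
};

/* Reject discard and write-same requests the device cannot handle. */
static int model_check_special(enum model_op op,
			       const struct model_limits *lim)
{
	unsigned int special_max = 0;

	assert(lim);
	if (op == MODEL_OP_DISCARD)
		special_max = lim->max_discard_sectors;
	else if (op == MODEL_OP_WRITE_SAME)
		special_max = lim->max_write_same_sectors;

	if ((op == MODEL_OP_DISCARD || op == MODEL_OP_WRITE_SAME) &&
	    special_max == 0)
		return -EOPNOTSUPP;
	return 0;
}

/* Special operations use one bvec; others are capped at max_pages. */
static int model_num_bvecs(enum model_op op, int pages_needed, int max_pages)
{
	if (op == MODEL_OP_DISCARD || op == MODEL_OP_WRITE_SAME)
		return 1;
	return pages_needed < max_pages ? pages_needed : max_pages;
}
```

Because the operation is a single value and no longer a bit in a shared
word, the equality tests cannot be confused by unrelated flag bits.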

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 drivers/md/dm-bufio.c   |  8 +++---
 drivers/md/dm-crypt.c   |  1 +
 drivers/md/dm-io.c  | 57 ++---
 drivers/md/dm-kcopyd.c  | 25 +-
 drivers/md/dm-log-writes.c  |  6 ++---
 drivers/md/dm-log.c |  5 ++--
 drivers/md/dm-raid1.c   | 11 +---
 drivers/md/dm-snap-persistent.c | 24 +
 drivers/md/dm-thin.c|  7 ++---
 drivers/md/dm.c |  1 +
 include/linux/dm-io.h   |  3 ++-
 11 files changed, 82 insertions(+), 66 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 9d3ee7f..b6055f2 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -574,7 +574,8 @@ static void use_dmio(struct dm_buffer *b, int rw, sector_t block,
 {
int r;
struct dm_io_request io_req = {
-   .bi_rw = rw,
+   .bi_op = rw,
+   .bi_op_flags = 0,
.notify.fn = dmio_complete,
.notify.context = b,
.client = b->c->dm_io,
@@ -634,7 +635,7 @@ static void use_inline_bio(struct dm_buffer *b, int rw, sector_t block,
 * the dm_buffer's inline bio is local to bufio.
 */
b->bio.bi_private = end_io;
-   b->bio.bi_rw = rw;
+   b->bio.bi_op = rw;
 
/*
 * We assume that if len >= PAGE_SIZE ptr is page-aligned.
@@ -1327,7 +1328,8 @@ EXPORT_SYMBOL_GPL(dm_bufio_write_dirty_buffers);
 int dm_bufio_issue_flush(struct dm_bufio_client *c)
 {
struct dm_io_request io_req = {
-   .bi_rw = WRITE_FLUSH,
+   .bi_op = REQ_OP_WRITE,
+   .bi_op_flags = WRITE_FLUSH,
.mem.type = DM_IO_KMEM,
.mem.ptr.addr = NULL,
.client = c->dm_io,
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 4f3cb35..70fbf11 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1136,6 +1136,7 @@ static void clone_init(struct dm_crypt_io *io, struct bio *clone)
clone->bi_private = io;
clone->bi_end_io  = crypt_endio;
clone->bi_bdev= cc->dev->bdev;
+   clone->bi_op  = io->base_bio->bi_op;
clone->bi_rw  = io->base_bio->bi_rw;
 }
 
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 50f17e3..0f723ca 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -278,8 +278,9 @@ static void km_dp_init(struct dpages *dp, void *data)
 /*-
  * IO routines that accept a list of pages.
  *---*/
-static void do_region(int rw, unsigned region, struct dm_io_region *where,
- struct dpages *dp, struct io *io)
+static void do_region(int op, int op_flags, unsigned region,
+ struct dm_io_region *where, struct dpages *dp,
+ struct io *io)
 {
struct bio *bio;
struct page *page;
@@ -295,24 +296,25 @@ static void do_region(int rw, unsigned region, struct dm_io_region *where,
/*
 * Reject unsupported discard and write same requests.
 */
-   if (rw & REQ_DISCARD)
+   if (op == REQ_OP_DISCARD)
special_cmd_max_sectors = q->limits.max_discard_sectors;
-   else if (rw & REQ_WRITE_SAME)
+   else if (op == REQ_OP_WRITE_SAME)
special_cmd_max_sectors = q->limits.max_write_same_sectors;
-   if ((rw & (REQ_DISCARD | REQ_WRITE_SAME)) && special_cmd_max_sectors == 0) {
+   if ((op == REQ_OP_DISCARD || op == REQ_OP_WRITE_SAME) &&
+   special_cmd_max_sectors == 0) {
dec_count(io, region, -EOPNOTSUPP);
return;
}
 
/*
-* where->count may be zero if rw holds a flush and we need to
+* where->count may be zero if op holds a flush and we need to
 * send a zero-sized flush.
 */
do {
/*
 * Allocate a suitably sized-bio.
 */
-   if ((rw & REQ_DISCARD) || (rw & REQ_WRITE_SAME))
+   if ((op == REQ_OP_DISCARD) || (op == REQ_OP_WRITE_SAME))
num_bvecs = 1;
else
num_bvecs = min_t(int, BIO_MAX_PAGES,
@@ -322,14 +324,15 @@ static void do_region(int rw, unsigned region, struct dm_io_region *where,
bio->bi_iter.bi_sector = where->sector + (where->count - remaining);
bio->bi_bdev = where->bdev;
bio->bi_end_io = endio;
-   bio->bi_rw = rw;
+   bio->bi_op = op;
+   bio->bi_rw = op_flags;
store_io_and_region_in_bio(bio, io, region);
 
-   if (rw & REQ_DISCARD) {
+   

[PATCH 27/42] block: prepare request creation/destruction code to use REQ_OPs

2016-04-13 Thread mchristi
From: Mike Christie 

This patch prepares *_get_request/*_put_request and freed_request
to use separate variables for the operation and flags. In the
next patches the struct request users will be converted, as
was done for bios. request->op will be used for the REQ_OP and
request->cmd_flags for the rq_flag_bits.

There is some temporary compat code in __get_request to
allow users to read the operation from the cmd_flags. This will
be deleted in one of the last patches when all drivers have
been converted.
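
The temporary compat behaviour can be pictured with the following
userspace sketch. It assumes, for illustration only, that the operation
value does not collide with any flag bit, so that OR-ing it into the
flag word loses nothing. All MODEL_* names and bit values are
hypothetical, and model_is_sync() is an assumed rule for this sketch
rather than a copy of rw_is_sync().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values; the write op is assumed to occupy its own bit. */
#define MODEL_OP_READ     0u
#define MODEL_OP_WRITE    (1u << 0)
#define MODEL_REQ_SYNC    (1u << 4)
#define MODEL_REQ_ALLOCED (1u << 12)

/* A request that records the operation twice during the transition. */
struct model_request {
	unsigned int op;        /* new home of the operation */
	unsigned int cmd_flags; /* flags, plus the op for old readers */
};

/* Store the op in both places so unconverted users keep working. */
static void model_rq_init(struct model_request *rq, unsigned int op,
			  unsigned int op_flags)
{
	assert((op & op_flags) == 0);
	rq->cmd_flags = op | op_flags | MODEL_REQ_ALLOCED;
	rq->op = op;
}

/* Assumed rule: reads are sync, writes only when the SYNC flag is set. */
static bool model_is_sync(unsigned int word)
{
	return !(word & MODEL_OP_WRITE) || (word & MODEL_REQ_SYNC);
}
```

Once every driver reads rq->op, the op no longer needs to be mirrored
into cmd_flags, which is what the later cleanup patch is described as
doing.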

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 block/blk-core.c | 56 +++-
 1 file changed, 31 insertions(+), 25 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 4224775..f1545d1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -959,10 +959,10 @@ static void __freed_request(struct request_list *rl, int sync)
  * A request has just been released.  Account for it, update the full and
  * congestion status, wake up any waiters.   Called under q->queue_lock.
  */
-static void freed_request(struct request_list *rl, unsigned int flags)
+static void freed_request(struct request_list *rl, int op, unsigned int flags)
 {
struct request_queue *q = rl->q;
-   int sync = rw_is_sync(flags);
+   int sync = rw_is_sync(op | flags);
 
q->nr_rqs[sync]--;
rl->count[sync]--;
@@ -1054,7 +1054,8 @@ static struct io_context *rq_ioc(struct bio *bio)
 /**
  * __get_request - get a free request
  * @rl: request list to allocate from
- * @rw_flags: RW and SYNC flags
+ * @op: REQ_OP_READ/REQ_OP_WRITE
+ * @op_flags: rq_flag_bits
  * @bio: bio to allocate request for (can be %NULL)
  * @gfp_mask: allocation mask
  *
@@ -1065,21 +1066,22 @@ static struct io_context *rq_ioc(struct bio *bio)
  * Returns ERR_PTR on failure, with @q->queue_lock held.
  * Returns request pointer on success, with @q->queue_lock *not held*.
  */
-static struct request *__get_request(struct request_list *rl, int rw_flags,
-struct bio *bio, gfp_t gfp_mask)
+static struct request *__get_request(struct request_list *rl, int op,
+int op_flags, struct bio *bio,
+gfp_t gfp_mask)
 {
struct request_queue *q = rl->q;
struct request *rq;
struct elevator_type *et = q->elevator->type;
struct io_context *ioc = rq_ioc(bio);
struct io_cq *icq = NULL;
-   const bool is_sync = rw_is_sync(rw_flags) != 0;
+   const bool is_sync = rw_is_sync(op | op_flags) != 0;
int may_queue;
 
if (unlikely(blk_queue_dying(q)))
return ERR_PTR(-ENODEV);
 
-   may_queue = elv_may_queue(q, rw_flags);
+   may_queue = elv_may_queue(q, op | op_flags);
if (may_queue == ELV_MQUEUE_NO)
goto rq_starved;
 
@@ -1123,7 +1125,7 @@ static struct request *__get_request(struct request_list *rl, int rw_flags,
 
/*
 * Decide whether the new request will be managed by elevator.  If
-* so, mark @rw_flags and increment elvpriv.  Non-zero elvpriv will
+* so, mark @op_flags and increment elvpriv.  Non-zero elvpriv will
 * prevent the current elevator from being destroyed until the new
 * request is freed.  This guarantees icq's won't be destroyed and
 * makes creating new ones safe.
@@ -1132,14 +1134,14 @@ static struct request *__get_request(struct request_list *rl, int rw_flags,
 * it will be created after releasing queue_lock.
 */
if (blk_rq_should_init_elevator(bio) && !blk_queue_bypass(q)) {
-   rw_flags |= REQ_ELVPRIV;
+   op_flags |= REQ_ELVPRIV;
q->nr_rqs_elvpriv++;
if (et->icq_cache && ioc)
icq = ioc_lookup_icq(ioc, q);
}
 
if (blk_queue_io_stat(q))
-   rw_flags |= REQ_IO_STAT;
+   op_flags |= REQ_IO_STAT;
spin_unlock_irq(q->queue_lock);
 
/* allocate and init request */
@@ -1149,10 +1151,12 @@ static struct request *__get_request(struct request_list *rl, int rw_flags,
 
blk_rq_init(q, rq);
blk_rq_set_rl(rq, rl);
-   rq->cmd_flags = rw_flags | REQ_ALLOCED;
+   /* tmp compat - allow users to check either one for the op */
+   rq->cmd_flags = op | op_flags | REQ_ALLOCED;
+   rq->op = op;
 
/* init elvpriv */
-   if (rw_flags & REQ_ELVPRIV) {
+   if (op_flags & REQ_ELVPRIV) {
if (unlikely(et->icq_cache && !icq)) {
if (ioc)
icq = ioc_create_icq(ioc, q, gfp_mask);
@@ -1178,7 +1182,7 @@ out:
if (ioc_batching(q, ioc))
ioc->nr_batch_requests--;
 
-   trace_block_getrq(q, bio, rw_flags & 1);
+   trace_block_getrq(q, bio, op);
return rq;
 
 fail_elvpriv:
@@ -1208,7 +1212,7 @@ fail_alloc:
 * que

[PATCH 11/42] f2fs: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has f2fs set bio->bi_op to the REQ_OP and bio->bi_rw to the rq_flag_bits.
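
One consequence of the split shows up in the merge check in
f2fs_submit_page_mbio(): a pending bio can only take another page when
both the operation and the flags match. A minimal userspace sketch of
that condition follows; struct model_fio and model_must_submit() are
hypothetical names, not f2fs code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduction of the I/O description to the fields compared. */
struct model_fio {
	int op;                   /* operation, e.g. read or write */
	int op_flags;             /* modifier flags */
	unsigned int new_blkaddr; /* target block address */
};

/*
 * Return true when the pending bio has to be submitted before the new
 * page can be queued: the block is not contiguous, or the operation or
 * the flags differ.
 */
static bool model_must_submit(bool have_bio, unsigned int last_block,
			      const struct model_fio *pending,
			      const struct model_fio *fio)
{
	assert(pending && fio);
	if (!have_bio)
		return false;
	return last_block != fio->new_blkaddr - 1 ||
	       pending->op != fio->op ||
	       pending->op_flags != fio->op_flags;
}
```

Before the split a single comparison of the combined rw word covered
both cases; with two fields, both comparisons are needed.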

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 fs/f2fs/checkpoint.c| 10 ++
 fs/f2fs/data.c  | 33 -
 fs/f2fs/f2fs.h  |  5 +++--
 fs/f2fs/gc.c|  9 ++---
 fs/f2fs/inline.c|  3 ++-
 fs/f2fs/node.c  |  8 +---
 fs/f2fs/segment.c   | 10 +++---
 fs/f2fs/trace.c |  7 ---
 include/trace/events/f2fs.h | 34 +-
 9 files changed, 74 insertions(+), 45 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index b92782f..6f9dc16 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -55,14 +55,15 @@ static struct page *__get_meta_page(struct f2fs_sb_info *sbi, pgoff_t index,
struct f2fs_io_info fio = {
.sbi = sbi,
.type = META,
-   .rw = READ_SYNC | REQ_META | REQ_PRIO,
+   .op = REQ_OP_READ,
+   .op_flags = READ_SYNC | REQ_META | REQ_PRIO,
.old_blkaddr = index,
.new_blkaddr = index,
.encrypted_page = NULL,
};
 
if (unlikely(!is_meta))
-   fio.rw &= ~REQ_META;
+   fio.op_flags &= ~REQ_META;
 repeat:
page = grab_cache_page(mapping, index);
if (!page) {
@@ -149,13 +150,14 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
struct f2fs_io_info fio = {
.sbi = sbi,
.type = META,
-   .rw = sync ? (READ_SYNC | REQ_META | REQ_PRIO) : READA,
+   .op = REQ_OP_READ,
+   .op_flags = sync ? (READ_SYNC | REQ_META | REQ_PRIO) : READA,
.encrypted_page = NULL,
};
struct blk_plug plug;
 
if (unlikely(type == META_POR))
-   fio.rw &= ~REQ_META;
+   fio.op_flags &= ~REQ_META;
 
blk_start_plug(&plug);
for (; nrpages-- > 0; blkno++) {
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 74cf5cb..03b6362 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -105,11 +105,12 @@ static void __submit_merged_bio(struct f2fs_bio_info *io)
if (!io->bio)
return;
 
-   if (is_read_io(fio->rw))
+   if (is_read_io(fio->op))
trace_f2fs_submit_read_bio(io->sbi->sb, fio, io->bio);
else
trace_f2fs_submit_write_bio(io->sbi->sb, fio, io->bio);
-   io->bio->bi_rw = fio->rw;
+   io->bio->bi_op = fio->op;
+   io->bio->bi_rw = fio->op_flags;
 
submit_bio(io->bio);
io->bio = NULL;
@@ -177,10 +178,12 @@ static void __f2fs_submit_merged_bio(struct f2fs_sb_info *sbi,
/* change META to META_FLUSH in the checkpoint procedure */
if (type >= META_FLUSH) {
io->fio.type = META_FLUSH;
+   io->fio.op = REQ_OP_WRITE;
if (test_opt(sbi, NOBARRIER))
-   io->fio.rw = WRITE_FLUSH | REQ_META | REQ_PRIO;
+   io->fio.op_flags = WRITE_FLUSH | REQ_META | REQ_PRIO;
else
-   io->fio.rw = WRITE_FLUSH_FUA | REQ_META | REQ_PRIO;
+   io->fio.op_flags = WRITE_FLUSH_FUA | REQ_META |
+   REQ_PRIO;
}
__submit_merged_bio(io);
 out:
@@ -222,13 +225,14 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio)
f2fs_trace_ios(fio, 0);
 
/* Allocate a new bio */
-   bio = __bio_alloc(fio->sbi, fio->new_blkaddr, 1, is_read_io(fio->rw));
+   bio = __bio_alloc(fio->sbi, fio->new_blkaddr, 1, is_read_io(fio->op));
 
if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
bio_put(bio);
return -EFAULT;
}
-   bio->bi_rw = fio->rw;
+   bio->bi_op = fio->op;
+   bio->bi_rw = fio->op_flags;
 
submit_bio(bio);
return 0;
@@ -239,7 +243,7 @@ void f2fs_submit_page_mbio(struct f2fs_io_info *fio)
struct f2fs_sb_info *sbi = fio->sbi;
enum page_type btype = PAGE_TYPE_OF_BIO(fio->type);
struct f2fs_bio_info *io;
-   bool is_read = is_read_io(fio->rw);
+   bool is_read = is_read_io(fio->op);
struct page *bio_page;
 
io = is_read ? &sbi->read_io : &sbi->write_io[btype];
@@ -254,7 +258,7 @@ void f2fs_submit_page_mbio(struct f2fs_io_info *fio)
inc_page_count(sbi, F2FS_WRITEBACK);
 
if (io->bio && (io->last_block_in_bio != fio->new_blkaddr - 1 ||
-   io->fio.rw != fio->rw))
+   (io->fio.op != fio->op || io->fio.op_flags != fio->op_flags)))
__submit_merged_bio(io);
 alloc_new:
if (io->bio == NULL) {
@@ -359,7 +363,7 @@ int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index)
 }
 
 struct 

[PATCH 21/42] bcache: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has bcache set bio->bi_op to the REQ_OP and bio->bi_rw
to the rq_flag_bits.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 drivers/md/bcache/btree.c |  2 ++
 drivers/md/bcache/debug.c |  2 ++
 drivers/md/bcache/io.c|  2 +-
 drivers/md/bcache/journal.c   |  7 ---
 drivers/md/bcache/movinggc.c  |  2 +-
 drivers/md/bcache/request.c   |  9 +
 drivers/md/bcache/super.c | 26 +++---
 drivers/md/bcache/writeback.c |  4 ++--
 8 files changed, 32 insertions(+), 22 deletions(-)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 22b9e34..752a44f 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -295,6 +295,7 @@ static void bch_btree_node_read(struct btree *b)
closure_init_stack(&cl);
 
bio = bch_bbio_alloc(b->c);
+   bio->bi_op  = REQ_OP_READ;
bio->bi_rw  = REQ_META|READ_SYNC;
bio->bi_iter.bi_size = KEY_SIZE(&b->key) << 9;
bio->bi_end_io  = btree_node_read_endio;
@@ -397,6 +398,7 @@ static void do_btree_node_write(struct btree *b)
 
b->bio->bi_end_io   = btree_node_write_endio;
b->bio->bi_private  = cl;
+   b->bio->bi_op   = REQ_OP_WRITE;
b->bio->bi_rw   = REQ_META|WRITE_SYNC|REQ_FUA;
b->bio->bi_iter.bi_size = roundup(set_bytes(i), block_bytes(b->c));
bch_bio_map(b->bio, i);
diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
index 52b6bcf..8df9e66 100644
--- a/drivers/md/bcache/debug.c
+++ b/drivers/md/bcache/debug.c
@@ -52,6 +52,7 @@ void bch_btree_verify(struct btree *b)
bio->bi_bdev= PTR_CACHE(b->c, &b->key, 0)->bdev;
bio->bi_iter.bi_sector  = PTR_OFFSET(&b->key, 0);
bio->bi_iter.bi_size= KEY_SIZE(&v->key) << 9;
+   bio->bi_op  = REQ_OP_READ;
bio->bi_rw  = REQ_META|READ_SYNC;
bch_bio_map(bio, sorted);
 
@@ -114,6 +115,7 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
check = bio_clone(bio, GFP_NOIO);
if (!check)
return;
+   check->bi_op = REQ_OP_READ;
check->bi_rw |= READ_SYNC;
 
if (bio_alloc_pages(check, GFP_NOIO))
diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
index 86a0bb8..f10a9a0 100644
--- a/drivers/md/bcache/io.c
+++ b/drivers/md/bcache/io.c
@@ -111,7 +111,7 @@ void bch_bbio_count_io_errors(struct cache_set *c, struct bio *bio,
struct bbio *b = container_of(bio, struct bbio, bio);
struct cache *ca = PTR_CACHE(c, &b->key, 0);
 
-   unsigned threshold = bio->bi_rw & REQ_WRITE
+   unsigned threshold = op_is_write(bio->bi_op)
? c->congested_write_threshold_us
: c->congested_read_threshold_us;
 
diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index af3f9f7..68fa0f0 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -54,7 +54,7 @@ reread:   left = ca->sb.bucket_size - offset;
bio_reset(bio);
bio->bi_iter.bi_sector  = bucket + offset;
bio->bi_bdev= ca->bdev;
-   bio->bi_rw  = READ;
+   bio->bi_op  = REQ_OP_READ;
bio->bi_iter.bi_size= len << 9;
 
bio->bi_end_io  = journal_read_endio;
@@ -452,7 +452,7 @@ static void do_journal_discard(struct cache *ca)
bio->bi_iter.bi_sector  = bucket_to_sector(ca->set,
ca->sb.d[ja->discard_idx]);
bio->bi_bdev= ca->bdev;
-   bio->bi_rw  = REQ_WRITE|REQ_DISCARD;
+   bio->bi_op  = REQ_OP_DISCARD;
bio->bi_max_vecs= 1;
bio->bi_io_vec  = bio->bi_inline_vecs;
bio->bi_iter.bi_size= bucket_bytes(ca);
@@ -626,7 +626,8 @@ static void journal_write_unlocked(struct closure *cl)
bio_reset(bio);
bio->bi_iter.bi_sector  = PTR_OFFSET(k, i);
bio->bi_bdev= ca->bdev;
-   bio->bi_rw  = REQ_WRITE|REQ_SYNC|REQ_META|REQ_FLUSH|REQ_FUA;
+   bio->bi_op  = REQ_OP_WRITE;
+   bio->bi_rw  = REQ_SYNC|REQ_META|REQ_FLUSH|REQ_FUA;
bio->bi_iter.bi_size = sectors << 9;
 
bio->bi_end_io  = journal_write_endio;
diff --git a/drivers/md/bcache/movinggc.c b/drivers/md/bcache/movinggc.c
index b929fc9..f33860a 100644
--- a/drivers/md/bcache/movinggc.c
+++ b/drivers/md/bcache/movinggc.c
@@ -163,7 +163,7 @@ static void read_moving(struct cache_set *c)
moving_init(io);
bio = &io->bio.bio;
 
-   bio->bi_rw  = READ;
+   bio->bi_op  = REQ_OP_READ;
bio->bi_end_io  = read_moving_endio;
 
if (bio_alloc_pages(bio, GFP_KERNEL))
diff 

[PATCH 24/42] xen: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has the xen block driver set bio->bi_op to the REQ_OP and
bio->bi_rw to the rq_flag_bits.
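
The switch in dispatch_rw_block_io() now yields a pair (operation,
operation_flags) instead of one combined value. The sketch below models
that decoding in userspace. Every MODEL_* name and value is a
hypothetical placeholder, and the error return for unknown requests is
an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical front-end request types. */
enum model_blkif_op {
	MODEL_BLKIF_READ,
	MODEL_BLKIF_WRITE,
	MODEL_BLKIF_WRITE_BARRIER,
	MODEL_BLKIF_FLUSH_DISKCACHE,
	MODEL_BLKIF_UNKNOWN
};

/* Hypothetical back-end operations and flag words. */
enum { MODEL_OP_READ, MODEL_OP_WRITE };
#define MODEL_WRITE_ODIRECT (1 << 4)
#define MODEL_WRITE_FLUSH   ((1 << 4) | (1 << 9))

/* Result of decoding one front-end request. */
struct model_decoded {
	int operation;
	int operation_flags;
	bool drain;
};

/* Translate a front-end request type into an op plus flags. */
static int model_decode(enum model_blkif_op in, struct model_decoded *out)
{
	assert(out);
	out->operation = MODEL_OP_READ;
	out->operation_flags = 0;
	out->drain = false;

	switch (in) {
	case MODEL_BLKIF_READ:
		break;
	case MODEL_BLKIF_WRITE:
		out->operation = MODEL_OP_WRITE;
		out->operation_flags = MODEL_WRITE_ODIRECT;
		break;
	case MODEL_BLKIF_WRITE_BARRIER:
		out->drain = true;
		/* fall through */
	case MODEL_BLKIF_FLUSH_DISKCACHE:
		out->operation = MODEL_OP_WRITE;
		out->operation_flags = MODEL_WRITE_FLUSH;
		break;
	default:
		return -1;
	}
	return 0;
}
```

With this shape, checks such as "is this a flush?" compare the flags,
while the read-only check compares the operation.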

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 drivers/block/xen-blkback/blkback.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 79fe493..854ecca 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -501,7 +501,7 @@ static int xen_vbd_translate(struct phys_req *req, struct xen_blkif *blkif,
struct xen_vbd *vbd = &blkif->vbd;
int rc = -EACCES;
 
-   if ((operation != READ) && vbd->readonly)
+   if ((operation != REQ_OP_READ) && vbd->readonly)
goto out;
 
if (likely(req->nr_sects)) {
@@ -1014,7 +1014,7 @@ static int dispatch_discard_io(struct xen_blkif_ring *ring,
preq.sector_number = req->u.discard.sector_number;
preq.nr_sects  = req->u.discard.nr_sectors;
 
-   err = xen_vbd_translate(&preq, blkif, WRITE);
+   err = xen_vbd_translate(&preq, blkif, REQ_OP_WRITE);
if (err) {
pr_warn("access denied: DISCARD [%llu->%llu] on dev=%04x\n",
preq.sector_number,
@@ -1229,6 +1229,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring,
struct bio **biolist = pending_req->biolist;
int i, nbio = 0;
int operation;
+   int operation_flags = 0;
struct blk_plug plug;
bool drain = false;
struct grant_page **pages = pending_req->segments;
@@ -1247,17 +1248,19 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring,
switch (req_operation) {
case BLKIF_OP_READ:
ring->st_rd_req++;
-   operation = READ;
+   operation = REQ_OP_READ;
break;
case BLKIF_OP_WRITE:
ring->st_wr_req++;
-   operation = WRITE_ODIRECT;
+   operation = REQ_OP_WRITE;
+   operation_flags = WRITE_ODIRECT;
break;
case BLKIF_OP_WRITE_BARRIER:
drain = true;
case BLKIF_OP_FLUSH_DISKCACHE:
ring->st_f_req++;
-   operation = WRITE_FLUSH;
+   operation = REQ_OP_WRITE;
+   operation_flags = WRITE_FLUSH;
break;
default:
operation = 0; /* make gcc happy */
@@ -1269,7 +1272,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring,
nseg = req->operation == BLKIF_OP_INDIRECT ?
   req->u.indirect.nr_segments : req->u.rw.nr_segments;
 
-   if (unlikely(nseg == 0 && operation != WRITE_FLUSH) ||
+   if (unlikely(nseg == 0 && operation_flags != WRITE_FLUSH) ||
unlikely((req->operation != BLKIF_OP_INDIRECT) &&
 (nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST)) ||
unlikely((req->operation == BLKIF_OP_INDIRECT) &&
@@ -1310,7 +1313,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring,
 
if (xen_vbd_translate(&preq, ring->blkif, operation) != 0) {
pr_debug("access denied: %s of [%llu,%llu] on dev=%04x\n",
-operation == READ ? "read" : "write",
+operation == REQ_OP_READ ? "read" : "write",
 preq.sector_number,
 preq.sector_number + preq.nr_sects,
 ring->blkif->vbd.pdevice);
@@ -1369,7 +1372,8 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring,
bio->bi_private = pending_req;
bio->bi_end_io  = end_block_io_op;
bio->bi_iter.bi_sector  = preq.sector_number;
-   bio->bi_rw  = operation;
+   bio->bi_op  = operation;
+   bio->bi_rw  = operation_flags;
}
 
preq.sector_number += seg[i].nsec;
@@ -1377,7 +1381,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring,
 
/* This will be hit if the operation was a flush or discard. */
if (!bio) {
-   BUG_ON(operation != WRITE_FLUSH);
+   BUG_ON(operation_flags != WRITE_FLUSH);
 
bio = bio_alloc(GFP_KERNEL, 0);
if (unlikely(bio == NULL))
@@ -1387,7 +1391,8 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring,
bio->bi_bdev= preq.bdev;
bio->bi_private = pending_req;
bio->bi_end_io  = end_block_io_op;
-   bio->bi_rw  = operation;
+   bio->bi_op  = operation;
+   bio->bi_rw  = operation_flags;
}
 
atomic_set(&pending_req->pendcnt, nbio);
@@ -1399,9 +1404,9 @@ static int dispatch_rw_block_io(struct xen_blkif_ring *ring,
/* Let the I/Os go.. */
blk_finish_pl

[PATCH 18/42] pm: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has the pm code set bio->bi_op to the REQ_OP and bio->bi_rw
to the rq_flag_bits.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 kernel/power/swap.c | 31 +++
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 4d050eb..adbcb1b 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -250,7 +250,7 @@ static void hib_end_io(struct bio *bio)
bio_put(bio);
 }
 
-static int hib_submit_io(int rw, pgoff_t page_off, void *addr,
+static int hib_submit_io(int op, int op_flags, pgoff_t page_off, void *addr,
struct hib_bio_batch *hb)
 {
struct page *page = virt_to_page(addr);
@@ -260,7 +260,8 @@ static int hib_submit_io(int rw, pgoff_t page_off, void *addr,
bio = bio_alloc(__GFP_RECLAIM | __GFP_HIGH, 1);
bio->bi_iter.bi_sector = page_off * (PAGE_SIZE >> 9);
bio->bi_bdev = hib_resume_bdev;
-   bio->bi_rw = rw;
+   bio->bi_op = op;
+   bio->bi_rw = op_flags;
 
if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
printk(KERN_ERR "PM: Adding page to bio failed at %llu\n",
@@ -296,7 +297,8 @@ static int mark_swapfiles(struct swap_map_handle *handle, unsigned int flags)
 {
int error;
 
-   hib_submit_io(READ_SYNC, swsusp_resume_block, swsusp_header, NULL);
+   hib_submit_io(REQ_OP_READ, READ_SYNC, swsusp_resume_block,
+ swsusp_header, NULL);
if (!memcmp("SWAP-SPACE",swsusp_header->sig, 10) ||
!memcmp("SWAPSPACE2",swsusp_header->sig, 10)) {
memcpy(swsusp_header->orig_sig,swsusp_header->sig, 10);
@@ -305,8 +307,8 @@ static int mark_swapfiles(struct swap_map_handle *handle, unsigned int flags)
swsusp_header->flags = flags;
if (flags & SF_CRC32_MODE)
swsusp_header->crc32 = handle->crc32;
-   error = hib_submit_io(WRITE_SYNC, swsusp_resume_block,
-   swsusp_header, NULL);
+   error = hib_submit_io(REQ_OP_WRITE, WRITE_SYNC,
+ swsusp_resume_block, swsusp_header, NULL);
} else {
printk(KERN_ERR "PM: Swap header not found!\n");
error = -ENODEV;
@@ -379,7 +381,7 @@ static int write_page(void *buf, sector_t offset, struct hib_bio_batch *hb)
} else {
src = buf;
}
-   return hib_submit_io(WRITE_SYNC, offset, src, hb);
+   return hib_submit_io(REQ_OP_WRITE, WRITE_SYNC, offset, src, hb);
 }
 
 static void release_swap_writer(struct swap_map_handle *handle)
@@ -982,7 +984,8 @@ static int get_swap_reader(struct swap_map_handle *handle,
return -ENOMEM;
}
 
-   error = hib_submit_io(READ_SYNC, offset, tmp->map, NULL);
+   error = hib_submit_io(REQ_OP_READ, READ_SYNC, offset,
+ tmp->map, NULL);
if (error) {
release_swap_reader(handle);
return error;
@@ -1006,7 +1009,7 @@ static int swap_read_page(struct swap_map_handle *handle, void *buf,
offset = handle->cur->entries[handle->k];
if (!offset)
return -EFAULT;
-   error = hib_submit_io(READ_SYNC, offset, buf, hb);
+   error = hib_submit_io(REQ_OP_READ, READ_SYNC, offset, buf, hb);
if (error)
return error;
if (++handle->k >= MAP_PAGE_ENTRIES) {
@@ -1508,7 +1511,8 @@ int swsusp_check(void)
if (!IS_ERR(hib_resume_bdev)) {
set_blocksize(hib_resume_bdev, PAGE_SIZE);
clear_page(swsusp_header);
-   error = hib_submit_io(READ_SYNC, swsusp_resume_block,
+   error = hib_submit_io(REQ_OP_READ, READ_SYNC,
+   swsusp_resume_block,
swsusp_header, NULL);
if (error)
goto put;
@@ -1516,7 +1520,8 @@ int swsusp_check(void)
if (!memcmp(HIBERNATE_SIG, swsusp_header->sig, 10)) {
memcpy(swsusp_header->sig, swsusp_header->orig_sig, 10);
/* Reset swap signature now */
-   error = hib_submit_io(WRITE_SYNC, swsusp_resume_block,
+   error = hib_submit_io(REQ_OP_WRITE, WRITE_SYNC,
+   swsusp_resume_block,
swsusp_header, NULL);
} else {
error = -EINVAL;
@@ -1560,10 +1565,12 @@ int swsusp_unmark(void)
 {
int error;
 
-   hib_submit_io(READ_SYNC, swsusp_resume_block, swsusp_header, NULL);
+   hib_submit_io(REQ_OP_READ, READ_SYNC, swsusp_resume_block,
+ swsusp_header, NULL);
if (!memcmp(HIBERNATE_SIG,swsusp_he

[PATCH 23/42] md/raid: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has md set bio->bi_op to the REQ_OP and bio->bi_rw
to the rq_flag_bits.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 drivers/md/bitmap.c  |  2 +-
 drivers/md/dm-raid.c |  5 +++--
 drivers/md/md.c  | 11 +++
 drivers/md/md.h  |  3 ++-
 drivers/md/raid1.c   | 34 
 drivers/md/raid10.c  | 50 ++--
 drivers/md/raid5-cache.c | 25 +++-
 drivers/md/raid5.c   | 48 ++
 8 files changed, 101 insertions(+), 77 deletions(-)

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 8b2e16f..9e8019e 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -159,7 +159,7 @@ static int read_sb_page(struct mddev *mddev, loff_t offset,
 
if (sync_page_io(rdev, target,
 roundup(size, bdev_logical_block_size(rdev->bdev)),
-page, READ, true)) {
+page, REQ_OP_READ, 0, true)) {
page->index = index;
return 0;
}
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index a090121..43a749c 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -792,7 +792,7 @@ static int read_disk_sb(struct md_rdev *rdev, int size)
if (rdev->sb_loaded)
return 0;
 
-   if (!sync_page_io(rdev, 0, size, rdev->sb_page, READ, 1)) {
+   if (!sync_page_io(rdev, 0, size, rdev->sb_page, REQ_OP_READ, 0, 1)) {
DMERR("Failed to read superblock of device at position %d",
  rdev->raid_disk);
md_error(rdev->mddev, rdev);
@@ -1646,7 +1646,8 @@ static void attempt_restore_of_faulty_devices(struct raid_set *rs)
for (i = 0; i < rs->md.raid_disks; i++) {
r = &rs->dev[i].rdev;
if (test_bit(Faulty, &r->flags) && r->sb_page &&
-   sync_page_io(r, 0, r->sb_size, r->sb_page, READ, 1)) {
+   sync_page_io(r, 0, r->sb_size, r->sb_page, REQ_OP_READ, 0,
+1)) {
DMINFO("Faulty %s device #%d has readable super block."
   "  Attempting to revive it.",
   rs->raid_type->name, i);
diff --git a/drivers/md/md.c b/drivers/md/md.c
index ec3c98d..9c40368 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -392,6 +392,7 @@ static void submit_flushes(struct work_struct *ws)
bi->bi_end_io = md_end_flush;
bi->bi_private = rdev;
bi->bi_bdev = rdev->bdev;
+   bi->bi_op = REQ_OP_WRITE;
bi->bi_rw = WRITE_FLUSH;
atomic_inc(&mddev->flush_pending);
submit_bio(bi);
@@ -741,6 +742,7 @@ void md_super_write(struct mddev *mddev, struct md_rdev *rdev,
bio_add_page(bio, page, size, 0);
bio->bi_private = rdev;
bio->bi_end_io = super_written;
+   bio->bi_op = REQ_OP_WRITE;
bio->bi_rw = WRITE_FLUSH_FUA;
 
atomic_inc(&mddev->pending_writes);
@@ -754,14 +756,15 @@ void md_super_wait(struct mddev *mddev)
 }
 
 int sync_page_io(struct md_rdev *rdev, sector_t sector, int size,
-struct page *page, int rw, bool metadata_op)
+struct page *page, int op, int op_flags, bool metadata_op)
 {
struct bio *bio = bio_alloc_mddev(GFP_NOIO, 1, rdev->mddev);
int ret;
 
bio->bi_bdev = (metadata_op && rdev->meta_bdev) ?
rdev->meta_bdev : rdev->bdev;
-   bio->bi_rw = rw;
+   bio->bi_op = op;
+   bio->bi_rw = op_flags;
if (metadata_op)
bio->bi_iter.bi_sector = sector + rdev->sb_start;
else if (rdev->mddev->reshape_position != MaxSector &&
@@ -787,7 +790,7 @@ static int read_disk_sb(struct md_rdev *rdev, int size)
if (rdev->sb_loaded)
return 0;
 
-   if (!sync_page_io(rdev, 0, size, rdev->sb_page, READ, true))
+   if (!sync_page_io(rdev, 0, size, rdev->sb_page, REQ_OP_READ, 0, true))
goto fail;
rdev->sb_loaded = 1;
return 0;
@@ -1473,7 +1476,7 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
return -EINVAL;
bb_sector = (long long)offset;
if (!sync_page_io(rdev, bb_sector, sectors << 9,
- rdev->bb_page, READ, true))
+ rdev->bb_page, REQ_OP_READ, 0, true))
return -EIO;
bbp = (u64 *)page_address(rdev->bb_page);
rdev->badblocks.shift = sb->bblog_shift;
diff --git a/drivers/md/md.h b/drivers/md/md.h
index b5c4be7..2e0918f 100644
--- a/drivers/md/md.h
+++ b/drivers/m

[PATCH 22/42] drbd: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has drbd set bio->bi_op to the REQ_OP and bio->bi_rw
to the rq_flag_bits.

Lars and Philip, I might have split this patch up a little weird.
The block layer has compat so you can set either bi_rw or bi_op.
This patch handles setting up the bio in drbd. I then converted
all the block device drivers and in
0037-block-fs-drivers-do-not-test-bi_rw-for-REQ_OPs.patch I modified
the bi_rw checks so they use bi_op.
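
For the bio setup part, the flag handling in the
_drbd_md_sync_page_io() hunk below can be modelled as a small pure
function. This is a userspace sketch under stated assumptions: the
MODEL_* names and bit values are hypothetical, and md_no_fua stands in
for the MD_NO_FUA device flag test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operations and flag bits; values are illustrative. */
enum { MODEL_OP_READ, MODEL_OP_WRITE };
#define MODEL_REQ_SYNC   (1 << 0)
#define MODEL_REQ_NOIDLE (1 << 1)
#define MODEL_REQ_FUA    (1 << 2)
#define MODEL_REQ_FLUSH  (1 << 3)

/*
 * Build the flag word for a metadata I/O. The operation is only
 * inspected, never modified; all modifiers go into the returned flags.
 */
static int model_md_op_flags(int op, bool md_no_fua)
{
	int op_flags = 0;

	assert(op == MODEL_OP_READ || op == MODEL_OP_WRITE);
	if (op == MODEL_OP_WRITE && !md_no_fua)
		op_flags |= MODEL_REQ_FUA | MODEL_REQ_FLUSH;
	op_flags |= MODEL_REQ_SYNC | MODEL_REQ_NOIDLE;
	return op_flags;
}
```

The caller would then assign the operation and the returned flags to
two separate fields, mirroring the bi_op/bi_rw assignments in the diff.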

This patch is compile tested only.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 drivers/block/drbd/drbd_actlog.c   | 29 -
 drivers/block/drbd/drbd_bitmap.c   |  6 +++---
 drivers/block/drbd/drbd_int.h  |  4 ++--
 drivers/block/drbd/drbd_main.c |  5 +++--
 drivers/block/drbd/drbd_receiver.c | 37 +
 drivers/block/drbd/drbd_worker.c   |  3 ++-
 6 files changed, 51 insertions(+), 33 deletions(-)

diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c
index 6069e15..2fa8534 100644
--- a/drivers/block/drbd/drbd_actlog.c
+++ b/drivers/block/drbd/drbd_actlog.c
@@ -137,19 +137,19 @@ void wait_until_done_or_force_detached(struct drbd_device *device, struct drbd_b
 
 static int _drbd_md_sync_page_io(struct drbd_device *device,
 struct drbd_backing_dev *bdev,
-sector_t sector, int rw)
+sector_t sector, int op)
 {
struct bio *bio;
/* we do all our meta data IO in aligned 4k blocks. */
const int size = 4096;
-   int err;
+   int err, op_flags = 0;
 
device->md_io.done = 0;
device->md_io.error = -ENODEV;
 
-   if ((rw & WRITE) && !test_bit(MD_NO_FUA, &device->flags))
-   rw |= REQ_FUA | REQ_FLUSH;
-   rw |= REQ_SYNC | REQ_NOIDLE;
+   if ((op == REQ_OP_WRITE) && !test_bit(MD_NO_FUA, &device->flags))
+   op_flags |= REQ_FUA | REQ_FLUSH;
+   op_flags |= REQ_SYNC | REQ_NOIDLE;
 
bio = bio_alloc_drbd(GFP_NOIO);
bio->bi_bdev = bdev->md_bdev;
@@ -159,9 +159,10 @@ static int _drbd_md_sync_page_io(struct drbd_device *device,
goto out;
bio->bi_private = device;
bio->bi_end_io = drbd_md_endio;
-   bio->bi_rw = rw;
+   bio->bi_op = op;
+   bio->bi_rw = op_flags;
 
-   if (!(rw & WRITE) && device->state.disk == D_DISKLESS && device->ldev == NULL)
+   if (op != REQ_OP_WRITE && device->state.disk == D_DISKLESS && device->ldev == NULL)
/* special case, drbd_md_read() during drbd_adm_attach(): no get_ldev */
;
else if (!get_ldev_if_state(device, D_ATTACHING)) {
@@ -174,7 +175,7 @@ static int _drbd_md_sync_page_io(struct drbd_device *device,
bio_get(bio); /* one bio_put() is in the completion handler */
atomic_inc(&device->md_io.in_use); /* drbd_md_put_buffer() is in the completion handler */
device->md_io.submit_jif = jiffies;
-   if (drbd_insert_fault(device, (rw & WRITE) ? DRBD_FAULT_MD_WR : DRBD_FAULT_MD_RD))
+   if (drbd_insert_fault(device, (op == REQ_OP_WRITE) ? DRBD_FAULT_MD_WR : DRBD_FAULT_MD_RD))
bio_io_error(bio);
else
submit_bio(bio);
@@ -188,7 +189,7 @@ static int _drbd_md_sync_page_io(struct drbd_device *device,
 }
 
 int drbd_md_sync_page_io(struct drbd_device *device, struct drbd_backing_dev *bdev,
-sector_t sector, int rw)
+sector_t sector, int op)
 {
int err;
D_ASSERT(device, atomic_read(&device->md_io.in_use) == 1);
@@ -197,19 +198,21 @@ int drbd_md_sync_page_io(struct drbd_device *device, struct drbd_backing_dev *bd
 
dynamic_drbd_dbg(device, "meta_data io: %s [%d]:%s(,%llus,%s) %pS\n",
 current->comm, current->pid, __func__,
-(unsigned long long)sector, (rw & WRITE) ? "WRITE" : "READ",
+(unsigned long long)sector, (op == REQ_OP_WRITE) ? "WRITE" : "READ",
 (void*)_RET_IP_ );
 
if (sector < drbd_md_first_sector(bdev) ||
sector + 7 > drbd_md_last_sector(bdev))
drbd_alert(device, "%s [%d]:%s(,%llus,%s) out of range md access!\n",
 current->comm, current->pid, __func__,
-(unsigned long long)sector, (rw & WRITE) ? "WRITE" : "READ");
+(unsigned long long)sector,
+(op == REQ_OP_WRITE) ? "WRITE" : "READ");
 
-   err = _drbd_md_sync_page_io(device, bdev, sector, rw);
+   err = _drbd_md_sync_page_io(device, bdev, sector, op);
if (err) {
drbd_err(device, "drbd_md_sync_page_io(,%llus,%s) failed with error %d\n",
-   (unsigned long long)sector, (rw & WRITE) ? "WRITE" : "READ", err);
+   (unsigned long long)sector,
+   (op == REQ_OP_WRITE) ? "WRITE" : "READ", err);
}
return err;
 }
diff

[PATCH 32/42] block: convert is_sync helpers to use REQ_OPs.

2016-04-13 Thread mchristi
From: Mike Christie 

This patch converts the is_sync helpers to use separate variables
for the operation and flags.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 block/blk-core.c   | 6 +++---
 block/blk-mq.c | 8 
 block/cfq-iosched.c| 2 +-
 include/linux/blkdev.h | 6 +++---
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 5632cd1..a240657 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -962,7 +962,7 @@ static void __freed_request(struct request_list *rl, int sync)
 static void freed_request(struct request_list *rl, int op, unsigned int flags)
 {
struct request_queue *q = rl->q;
-   int sync = rw_is_sync(op | flags);
+   int sync = rw_is_sync(op, flags);
 
q->nr_rqs[sync]--;
rl->count[sync]--;
@@ -1075,7 +1075,7 @@ static struct request *__get_request(struct request_list *rl, int op,
struct elevator_type *et = q->elevator->type;
struct io_context *ioc = rq_ioc(bio);
struct io_cq *icq = NULL;
-   const bool is_sync = rw_is_sync(op | op_flags) != 0;
+   const bool is_sync = rw_is_sync(op, op_flags) != 0;
int may_queue;
 
if (unlikely(blk_queue_dying(q)))
@@ -1246,7 +1246,7 @@ static struct request *get_request(struct request_queue *q, int op,
   int op_flags, struct bio *bio,
   gfp_t gfp_mask)
 {
-   const bool is_sync = rw_is_sync(op | op_flags) != 0;
+   const bool is_sync = rw_is_sync(op, op_flags) != 0;
DEFINE_WAIT(wait);
struct request_list *rl;
struct request *rq;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4843c0b..64d61be 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -206,7 +206,7 @@ static void blk_mq_rq_ctx_init(struct request_queue *q, struct blk_mq_ctx *ctx,
rq->end_io_data = NULL;
rq->next_rq = NULL;
 
-   ctx->rq_dispatched[rw_is_sync(op | op_flags)]++;
+   ctx->rq_dispatched[rw_is_sync(op, op_flags)]++;
 }
 
 static struct request *
@@ -1181,7 +1181,7 @@ static struct request *blk_mq_map_request(struct request_queue *q,
ctx = blk_mq_get_ctx(q);
hctx = q->mq_ops->map_queue(q, ctx->cpu);
 
-   if (rw_is_sync(bio->bi_op | bio->bi_rw))
+   if (rw_is_sync(bio->bi_op, bio->bi_rw))
op_flags |= REQ_SYNC;
 
trace_block_getrq(q, bio, op);
@@ -1249,7 +1249,7 @@ static int blk_mq_direct_issue_request(struct request *rq, blk_qc_t *cookie)
  */
 static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 {
-   const int is_sync = rw_is_sync(bio->bi_op | bio->bi_rw);
+   const int is_sync = rw_is_sync(bio->bi_op, bio->bi_rw);
const int is_flush_fua = bio->bi_rw & (REQ_FLUSH | REQ_FUA);
struct blk_map_ctx data;
struct request *rq;
@@ -1346,7 +1346,7 @@ done:
  */
 static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
 {
-   const int is_sync = rw_is_sync(bio->bi_op | bio->bi_rw);
+   const int is_sync = rw_is_sync(bio->bi_op, bio->bi_rw);
const int is_flush_fua = bio->bi_rw & (REQ_FLUSH | REQ_FUA);
struct blk_plug *plug;
unsigned int request_count = 0;
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 0dfa2dd..2fd5bcf 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -4311,7 +4311,7 @@ static int cfq_may_queue(struct request_queue *q, int op, int op_flags)
if (!cic)
return ELV_MQUEUE_MAY;
 
-   cfqq = cic_to_cfqq(cic, rw_is_sync(op | op_flags));
+   cfqq = cic_to_cfqq(cic, rw_is_sync(op, op_flags));
if (cfqq) {
cfq_init_prio_data(cfqq, cic);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 39df8ef..550b371 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -618,14 +618,14 @@ static inline unsigned int blk_queue_cluster(struct request_queue *q)
 /*
  * We regard a request as sync, if either a read or a sync write
  */
-static inline bool rw_is_sync(unsigned int rw_flags)
+static inline bool rw_is_sync(int op, unsigned int rw_flags)
 {
-   return !(rw_flags & REQ_WRITE) || (rw_flags & REQ_SYNC);
+   return op == REQ_OP_READ || (rw_flags & REQ_SYNC);
 }
 
 static inline bool rq_is_sync(struct request *rq)
 {
-   return rw_is_sync(rq->cmd_flags);
+   return rw_is_sync(rq->op, rq->cmd_flags);
 }
 
 static inline bool blk_rl_full(struct request_list *rl, bool sync)
-- 
2.7.2
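
For readers following along, here is a minimal userspace sketch of the
helper change in include/linux/blkdev.h above. The SKETCH_* names and
their values are hypothetical stand-ins, not the kernel definitions;
only the shape of the two predicates is taken from the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-ins. The real values live in
 * include/linux/blk_types.h and are not reproduced here. */
enum sketch_req_op {
	SKETCH_OP_READ,
	SKETCH_OP_WRITE,
	SKETCH_OP_DISCARD,
};

#define SKETCH_REQ_WRITE (1u << 0)	/* old style: direction bit kept in the flags */
#define SKETCH_REQ_SYNC  (1u << 4)	/* attribute flag, used by both styles */

/* Old form: direction and attributes share a single bitmask. */
static inline bool sketch_rw_is_sync_old(unsigned int rw_flags)
{
	return !(rw_flags & SKETCH_REQ_WRITE) || (rw_flags & SKETCH_REQ_SYNC);
}

/* New form: the operation is its own value, the flags carry attributes only. */
static inline bool sketch_rw_is_sync(int op, unsigned int op_flags)
{
	return op == SKETCH_OP_READ || (op_flags & SKETCH_REQ_SYNC);
}
```

With these stand-ins a read is always sync, a plain write is not, and a
write carrying the sync flag is, in both the old and the new form.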



[PATCH 34/42] drivers: set request op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has the block drivers use the request->op for REQ_OP
operations and cmd_flags for rq_flag_bits.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 drivers/block/loop.c  |  6 +++---
 drivers/block/mtip32xx/mtip32xx.c |  2 +-
 drivers/block/nbd.c   |  2 +-
 drivers/block/rbd.c   |  2 +-
 drivers/block/skd_main.c  | 11 ---
 drivers/block/xen-blkfront.c  |  8 +---
 drivers/md/dm.c   |  2 +-
 drivers/mmc/card/block.c  |  7 +++
 drivers/mmc/card/queue.c  |  6 ++
 drivers/mmc/card/queue.h  |  5 -
 drivers/mtd/mtd_blkdevs.c |  2 +-
 drivers/nvme/host/core.c  |  2 +-
 drivers/nvme/host/nvme.h  |  2 +-
 drivers/nvme/host/pci.c   |  2 +-
 drivers/scsi/sd.c | 25 -
 15 files changed, 45 insertions(+), 39 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 7e5e27a..0e80c9b 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -538,7 +538,7 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
if (rq->cmd_flags & REQ_WRITE) {
if (rq->cmd_flags & REQ_FLUSH)
ret = lo_req_flush(lo, rq);
-   else if (rq->cmd_flags & REQ_DISCARD)
+   else if (rq->op == REQ_OP_DISCARD)
ret = lo_discard(lo, rq, pos);
else if (lo->transfer)
ret = lo_write_transfer(lo, rq, pos);
@@ -1653,8 +1653,8 @@ static int loop_queue_rq(struct blk_mq_hw_ctx *hctx,
if (lo->lo_state != Lo_bound)
return -EIO;
 
-   if (lo->use_dio && !(cmd->rq->cmd_flags & (REQ_FLUSH |
-   REQ_DISCARD)))
+   if (lo->use_dio && !(cmd->rq->cmd_flags & REQ_FLUSH) &&
+cmd->rq->op != REQ_OP_DISCARD)
cmd->use_aio = true;
else
cmd->use_aio = false;
diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 6053e46..7638273 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3765,7 +3765,7 @@ static int mtip_submit_request(struct blk_mq_hw_ctx *hctx, struct request *rq)
return -ENODATA;
}
 
-   if (rq->cmd_flags & REQ_DISCARD) {
+   if (rq->op == REQ_OP_DISCARD) {
int err;
 
err = mtip_send_trim(dd, blk_rq_pos(rq), blk_rq_sectors(rq));
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 31e73a7..68a1476 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -282,7 +282,7 @@ static int nbd_send_req(struct nbd_device *nbd, struct request *req)
 
if (req->cmd_type == REQ_TYPE_DRV_PRIV)
type = NBD_CMD_DISC;
-   else if (req->cmd_flags & REQ_DISCARD)
+   else if (req->op == REQ_OP_DISCARD)
type = NBD_CMD_TRIM;
else if (req->cmd_flags & REQ_FLUSH)
type = NBD_CMD_FLUSH;
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 94a1843..e8935af 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3371,7 +3371,7 @@ static void rbd_queue_workfn(struct work_struct *work)
goto err;
}
 
-   if (rq->cmd_flags & REQ_DISCARD)
+   if (rq->op == REQ_OP_DISCARD)
op_type = OBJ_OP_DISCARD;
else if (rq->cmd_flags & REQ_WRITE)
op_type = OBJ_OP_WRITE;
diff --git a/drivers/block/skd_main.c b/drivers/block/skd_main.c
index 41aaae3..5739223 100644
--- a/drivers/block/skd_main.c
+++ b/drivers/block/skd_main.c
@@ -576,7 +576,6 @@ static void skd_request_fn(struct request_queue *q)
struct request *req = NULL;
struct skd_scsi_request *scsi_req;
struct page *page;
-   unsigned long io_flags;
int error;
u32 lba;
u32 count;
@@ -624,12 +623,11 @@ static void skd_request_fn(struct request_queue *q)
lba = (u32)blk_rq_pos(req);
count = blk_rq_sectors(req);
data_dir = rq_data_dir(req);
-   io_flags = req->cmd_flags;
 
-   if (io_flags & REQ_FLUSH)
+   if (req->cmd_flags & REQ_FLUSH)
flush++;
 
-   if (io_flags & REQ_FUA)
+   if (req->cmd_flags & REQ_FUA)
fua++;
 
pr_debug("%s:%s:%d new req=%p lba=%u(0x%x) "
@@ -735,7 +733,7 @@ static void skd_request_fn(struct request_queue *q)
else
skreq->sg_data_dir = SKD_DATA_DIR_HOST_TO_CARD;
 
-   if (io_flags & REQ_DISCARD) {
+   if (req->op == REQ_OP_DISCARD) {
page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
if (!page) {
pr_err("request_fn:Page allocation failed.\n");
@@ -852,

[PATCH 33/42] block: convert rq_data_dir helper to use REQ_OPs

2016-04-13 Thread mchristi
From: Mike Christie 

The request->op field is now always set up with a REQ_OP.
This patch has the rq_data_dir helper convert the operation
to a WRITE or READ direction based on that.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 include/linux/blkdev.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 550b371..25ec7ce 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -599,7 +599,7 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 
 #define list_entry_rq(ptr) list_entry((ptr), struct request, queuelist)
 
-#define rq_data_dir(rq) ((int)((rq)->cmd_flags & 1))
+#define rq_data_dir(rq) (op_is_write((rq)->op) ? WRITE : READ)
 
 /*
  * Driver can handle struct request, if it either has an old style
-- 
2.7.2
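
As an illustration of the macro change above, here is a small userspace
sketch. Everything prefixed sketch_/SKETCH_ is hypothetical, and the
body of sketch_op_is_write() is an assumption (anything that is not a
read counts as a write), since op_is_write() itself is introduced
elsewhere in the series and is not shown in this patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's operation values and
 * READ/WRITE direction constants. */
enum sketch_req_op {
	SKETCH_OP_READ,
	SKETCH_OP_WRITE,
	SKETCH_OP_DISCARD,
	SKETCH_OP_WRITE_SAME,
};

#define SKETCH_READ  0
#define SKETCH_WRITE 1

struct sketch_request {
	int op;				/* the operation, one value */
	unsigned long long cmd_flags;	/* attribute bits only */
};

/* Assumption: every operation other than a read moves data to the device. */
static inline bool sketch_op_is_write(int op)
{
	return op != SKETCH_OP_READ;
}

/* Direction is derived from the op field, not from bit 0 of cmd_flags.
 * The argument is parenthesized so any pointer expression can be passed. */
#define sketch_rq_data_dir(rq) \
	(sketch_op_is_write((rq)->op) ? SKETCH_WRITE : SKETCH_READ)
```

Under that assumption a discard or write-same request reports the WRITE
direction even though no write bit is set in cmd_flags.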



[PATCH 25/42] target: set bi_op to REQ_OP

2016-04-13 Thread mchristi
From: Mike Christie 

This patch has the target modules use bio->bi_op for REQ_OPs and
bio->bi_rw for rq_flag_bits.

Signed-off-by: Mike Christie 
Acked-by: Nicholas Bellinger 
Reviewed-by: Christoph Hellwig 
---
 drivers/target/target_core_iblock.c | 38 +
 drivers/target/target_core_pscsi.c  |  2 +-
 2 files changed, 23 insertions(+), 17 deletions(-)

diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index c887f7d..fb1543b 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -312,7 +312,8 @@ static void iblock_bio_done(struct bio *bio)
 }
 
 static struct bio *
-iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 sg_num, int rw)
+iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 sg_num, int op,
+  int op_flags)
 {
struct iblock_dev *ib_dev = IBLOCK_DEV(cmd->se_dev);
struct bio *bio;
@@ -334,7 +335,8 @@ iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 sg_num, int rw)
bio->bi_private = cmd;
bio->bi_end_io = &iblock_bio_done;
bio->bi_iter.bi_sector = lba;
-   bio->bi_rw = rw;
+   bio->bi_op = op;
+   bio->bi_rw = op_flags;
 
return bio;
 }
@@ -480,7 +482,7 @@ iblock_execute_write_same(struct se_cmd *cmd)
goto fail;
cmd->priv = ibr;
 
-   bio = iblock_get_bio(cmd, block_lba, 1, WRITE);
+   bio = iblock_get_bio(cmd, block_lba, 1, REQ_OP_WRITE, 0);
if (!bio)
goto fail_free_ibr;
 
@@ -493,7 +495,8 @@ iblock_execute_write_same(struct se_cmd *cmd)
while (bio_add_page(bio, sg_page(sg), sg->length, sg->offset)
!= sg->length) {
 
-   bio = iblock_get_bio(cmd, block_lba, 1, WRITE);
+   bio = iblock_get_bio(cmd, block_lba, 1, REQ_OP_WRITE,
+0);
if (!bio)
goto fail_put_bios;
 
@@ -679,8 +682,7 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
struct scatterlist *sg;
u32 sg_num = sgl_nents;
unsigned bio_cnt;
-   int rw = 0;
-   int i;
+   int i, op, op_flags = 0;
 
if (data_direction == DMA_TO_DEVICE) {
struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
@@ -690,17 +692,20 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 * is not enabled, or if initiator set the Force Unit Access bit.
 */
if (q->flush_flags & REQ_FUA) {
-   if (cmd->se_cmd_flags & SCF_FUA)
-   rw = WRITE_FUA;
-   else if (!(q->flush_flags & REQ_FLUSH))
-   rw = WRITE_FUA;
-   else
-   rw = WRITE;
+   if (cmd->se_cmd_flags & SCF_FUA) {
+   op = REQ_OP_WRITE;
+   op_flags = WRITE_FUA;
+   } else if (!(q->flush_flags & REQ_FLUSH)) {
+   op = REQ_OP_WRITE;
+   op_flags = WRITE_FUA;
+   } else {
+   op = REQ_OP_WRITE;
+   }
} else {
-   rw = WRITE;
+   op = REQ_OP_WRITE;
}
} else {
-   rw = READ;
+   op = REQ_OP_READ;
}
 
ibr = kzalloc(sizeof(struct iblock_req), GFP_KERNEL);
@@ -714,7 +719,7 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
return 0;
}
 
-   bio = iblock_get_bio(cmd, block_lba, sgl_nents, rw);
+   bio = iblock_get_bio(cmd, block_lba, sgl_nents, op, op_flags);
if (!bio)
goto fail_free_ibr;
 
@@ -738,7 +743,8 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
bio_cnt = 0;
}
 
-   bio = iblock_get_bio(cmd, block_lba, sg_num, rw);
+   bio = iblock_get_bio(cmd, block_lba, sg_num, op,
+op_flags);
if (!bio)
goto fail_put_bios;
 
diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c
index de18790..2cf915c 100644
--- a/drivers/target/target_core_pscsi.c
+++ b/drivers/target/target_core_pscsi.c
@@ -922,7 +922,7 @@ pscsi_map_sg(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
goto fail;
 
if (rw)
-   bio->bi_rw |= REQ_WRITE;
+   bio->bi_op = REQ_OP_WRITE;
 
p

[PATCH 28/42] block: prepare mq request creation to use REQ_OPs

2016-04-13 Thread mchristi
From: Mike Christie 

This patch modifies the blk-mq request creation code to use
separate variables for the operation and flags, because in the
next patches the struct request users will be converted as was
done for bios. request->op will be used for the REQ_OP and
request->cmd_flags for the rq_flag_bits.

As in the non-mq patch, there is some temporary compat code in
blk_mq_rq_ctx_init to allow users to read the operation from the
cmd_flags. This will be deleted in one of the last patches when all
drivers have been converted.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 block/blk-mq.c | 38 +-
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1699baf..4843c0b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -159,16 +159,19 @@ bool blk_mq_can_queue(struct blk_mq_hw_ctx *hctx)
 EXPORT_SYMBOL(blk_mq_can_queue);
 
 static void blk_mq_rq_ctx_init(struct request_queue *q, struct blk_mq_ctx *ctx,
-  struct request *rq, unsigned int rw_flags)
+  struct request *rq, int op,
+  unsigned int op_flags)
 {
if (blk_queue_io_stat(q))
-   rw_flags |= REQ_IO_STAT;
+   op_flags |= REQ_IO_STAT;
 
INIT_LIST_HEAD(&rq->queuelist);
/* csd/requeue_work/fifo_time is initialized before use */
rq->q = q;
rq->mq_ctx = ctx;
-   rq->cmd_flags |= rw_flags;
+   rq->op = op;
+   /* tmp compat - allow users to check either one for the op */
+   rq->cmd_flags |= op | op_flags;
/* do not touch atomic flags, it needs atomic ops against the timer */
rq->cpu = -1;
INIT_HLIST_NODE(&rq->hash);
@@ -203,11 +206,11 @@ static void blk_mq_rq_ctx_init(struct request_queue *q, struct blk_mq_ctx *ctx,
rq->end_io_data = NULL;
rq->next_rq = NULL;
 
-   ctx->rq_dispatched[rw_is_sync(rw_flags)]++;
+   ctx->rq_dispatched[rw_is_sync(op | op_flags)]++;
 }
 
 static struct request *
-__blk_mq_alloc_request(struct blk_mq_alloc_data *data, int rw)
+__blk_mq_alloc_request(struct blk_mq_alloc_data *data, int op, int op_flags)
 {
struct request *rq;
unsigned int tag;
@@ -222,7 +225,7 @@ __blk_mq_alloc_request(struct blk_mq_alloc_data *data, int rw)
}
 
rq->tag = tag;
-   blk_mq_rq_ctx_init(data->q, data->ctx, rq, rw);
+   blk_mq_rq_ctx_init(data->q, data->ctx, rq, op, op_flags);
return rq;
}
 
@@ -246,7 +249,7 @@ struct request *blk_mq_alloc_request(struct request_queue *q, int rw,
hctx = q->mq_ops->map_queue(q, ctx->cpu);
blk_mq_set_alloc_data(&alloc_data, q, flags, ctx, hctx);
 
-   rq = __blk_mq_alloc_request(&alloc_data, rw);
+   rq = __blk_mq_alloc_request(&alloc_data, rw, 0);
if (!rq && !(flags & BLK_MQ_REQ_NOWAIT)) {
__blk_mq_run_hw_queue(hctx);
blk_mq_put_ctx(ctx);
@@ -254,7 +257,7 @@ struct request *blk_mq_alloc_request(struct request_queue *q, int rw,
ctx = blk_mq_get_ctx(q);
hctx = q->mq_ops->map_queue(q, ctx->cpu);
blk_mq_set_alloc_data(&alloc_data, q, flags, ctx, hctx);
-   rq =  __blk_mq_alloc_request(&alloc_data, rw);
+   rq =  __blk_mq_alloc_request(&alloc_data, rw, 0);
ctx = alloc_data.ctx;
}
blk_mq_put_ctx(ctx);
@@ -1170,28 +1173,29 @@ static struct request *blk_mq_map_request(struct request_queue *q,
struct blk_mq_hw_ctx *hctx;
struct blk_mq_ctx *ctx;
struct request *rq;
-   int rw = bio_data_dir(bio);
+   int op = bio_data_dir(bio);
+   int op_flags = 0;
struct blk_mq_alloc_data alloc_data;
 
blk_queue_enter_live(q);
ctx = blk_mq_get_ctx(q);
hctx = q->mq_ops->map_queue(q, ctx->cpu);
 
-   if (rw_is_sync(bio->bi_rw))
-   rw |= REQ_SYNC;
+   if (rw_is_sync(bio->bi_op | bio->bi_rw))
+   op_flags |= REQ_SYNC;
 
-   trace_block_getrq(q, bio, rw);
+   trace_block_getrq(q, bio, op);
blk_mq_set_alloc_data(&alloc_data, q, BLK_MQ_REQ_NOWAIT, ctx, hctx);
-   rq = __blk_mq_alloc_request(&alloc_data, rw);
+   rq = __blk_mq_alloc_request(&alloc_data, op, op_flags);
if (unlikely(!rq)) {
__blk_mq_run_hw_queue(hctx);
blk_mq_put_ctx(ctx);
-   trace_block_sleeprq(q, bio, rw);
+   trace_block_sleeprq(q, bio, op);
 
ctx = blk_mq_get_ctx(q);
hctx = q->mq_ops->map_queue(q, ctx->cpu);
blk_mq_set_alloc_data(&alloc_data, q, 0, ctx, hctx);
-   rq = __blk_mq_alloc_request(&alloc_data, rw);
+   rq = __blk_mq_alloc_request(&alloc_data, op, op_flags);
ctx = alloc_data.ctx;
hctx = alloc_data.hctx;
 

[PATCH 35/42] blktrace: get op from req->op/bio->bi_op

2016-04-13 Thread mchristi
From: Mike Christie 

The bio and request structs now store the operation in
bio->bi_op/request->op. This patch has blktrace use that
field instead of bi_rw/cmd_flags.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 include/linux/blktrace_api.h  |  2 +-
 include/trace/events/bcache.h | 12 ++
 include/trace/events/block.h  | 31 +-
 kernel/trace/blktrace.c   | 52 +++
 4 files changed, 57 insertions(+), 40 deletions(-)

diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h
index afc1343..ee25ba4 100644
--- a/include/linux/blktrace_api.h
+++ b/include/linux/blktrace_api.h
@@ -109,7 +109,7 @@ static inline int blk_cmd_buf_len(struct request *rq)
 }
 
 extern void blk_dump_cmd(char *buf, struct request *rq);
-extern void blk_fill_rwbs(char *rwbs, u32 rw, int bytes);
+extern void blk_fill_rwbs(char *rwbs, int op, u32 rw, int bytes);
 
 #endif /* CONFIG_EVENT_TRACING && CONFIG_BLOCK */
 
diff --git a/include/trace/events/bcache.h b/include/trace/events/bcache.h
index 981acf7..8abe564 100644
--- a/include/trace/events/bcache.h
+++ b/include/trace/events/bcache.h
@@ -27,7 +27,8 @@ DECLARE_EVENT_CLASS(bcache_request,
__entry->sector = bio->bi_iter.bi_sector;
__entry->orig_sector= bio->bi_iter.bi_sector - 16;
__entry->nr_sector  = bio->bi_iter.bi_size >> 9;
-   blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_iter.bi_size);
+   blk_fill_rwbs(__entry->rwbs, bio->bi_op, bio->bi_rw,
+ bio->bi_iter.bi_size);
),
 
TP_printk("%d,%d %s %llu + %u (from %d,%d @ %llu)",
@@ -101,7 +102,8 @@ DECLARE_EVENT_CLASS(bcache_bio,
__entry->dev= bio->bi_bdev->bd_dev;
__entry->sector = bio->bi_iter.bi_sector;
__entry->nr_sector  = bio->bi_iter.bi_size >> 9;
-   blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_iter.bi_size);
+   blk_fill_rwbs(__entry->rwbs, bio->bi_op, bio->bi_rw,
+ bio->bi_iter.bi_size);
),
 
TP_printk("%d,%d  %s %llu + %u",
@@ -136,7 +138,8 @@ TRACE_EVENT(bcache_read,
__entry->dev= bio->bi_bdev->bd_dev;
__entry->sector = bio->bi_iter.bi_sector;
__entry->nr_sector  = bio->bi_iter.bi_size >> 9;
-   blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_iter.bi_size);
+   blk_fill_rwbs(__entry->rwbs, bio->bi_op, bio->bi_rw,
+ bio->bi_iter.bi_size);
__entry->cache_hit = hit;
__entry->bypass = bypass;
),
@@ -167,7 +170,8 @@ TRACE_EVENT(bcache_write,
__entry->inode  = inode;
__entry->sector = bio->bi_iter.bi_sector;
__entry->nr_sector  = bio->bi_iter.bi_size >> 9;
-   blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_iter.bi_size);
+   blk_fill_rwbs(__entry->rwbs, bio->bi_op, bio->bi_rw,
+ bio->bi_iter.bi_size);
__entry->writeback = writeback;
__entry->bypass = bypass;
),
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index e8a5eca..4416dcd 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -84,7 +84,8 @@ DECLARE_EVENT_CLASS(block_rq_with_error,
0 : blk_rq_sectors(rq);
__entry->errors= rq->errors;
 
-   blk_fill_rwbs(__entry->rwbs, rq->cmd_flags, blk_rq_bytes(rq));
+   blk_fill_rwbs(__entry->rwbs, rq->op, rq->cmd_flags,
+ blk_rq_bytes(rq));
blk_dump_cmd(__get_str(cmd), rq);
),
 
@@ -162,7 +163,7 @@ TRACE_EVENT(block_rq_complete,
__entry->nr_sector = nr_bytes >> 9;
__entry->errors= rq->errors;
 
-   blk_fill_rwbs(__entry->rwbs, rq->cmd_flags, nr_bytes);
+   blk_fill_rwbs(__entry->rwbs, rq->op, rq->cmd_flags, nr_bytes);
blk_dump_cmd(__get_str(cmd), rq);
),
 
@@ -198,7 +199,8 @@ DECLARE_EVENT_CLASS(block_rq,
__entry->bytes = (rq->cmd_type == REQ_TYPE_BLOCK_PC) ?
blk_rq_bytes(rq) : 0;
 
-   blk_fill_rwbs(__entry->rwbs, rq->cmd_flags, blk_rq_bytes(rq));
+   blk_fill_rwbs(__entry->rwbs, rq->op, rq->cmd_flags,
+ blk_rq_bytes(rq));
blk_dump_cmd(__get_str(cmd), rq);
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
@@ -272,7 +274,8 @@ TRACE_EVENT(block_bio_bounce,
  bio->bi_bdev->bd_dev : 0;
__entry->sector = bio->bi_iter.bi_sector;
__entry->n

[PATCH 36/42] ide cd: do not set REQ_WRITE on requests.

2016-04-13 Thread mchristi
From: Mike Christie 

The block layer will set the correct READ/WRITE operation flags/fields
when creating a request, so there is no need for drivers to set the
REQ_WRITE flag.

This patch is compile tested only.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 drivers/ide/ide-cd_ioctl.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/ide/ide-cd_ioctl.c b/drivers/ide/ide-cd_ioctl.c
index 474173e..5887a7a 100644
--- a/drivers/ide/ide-cd_ioctl.c
+++ b/drivers/ide/ide-cd_ioctl.c
@@ -459,9 +459,6 @@ int ide_cdrom_packet(struct cdrom_device_info *cdi,
   layer. the packet must be complete, as we do not
   touch it at all. */
 
-   if (cgc->data_direction == CGC_DATA_WRITE)
-   flags |= REQ_WRITE;
-
if (cgc->sense)
memset(cgc->sense, 0, sizeof(struct request_sense));
 
-- 
2.7.2



[PATCH 26/42] block: copy bio op to request op

2016-04-13 Thread mchristi
From: Mike Christie 

The bio users should now always be setting up the bio->bi_op. This patch
has us copy that to the struct request op field.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 block/blk-core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 6bcc22e..4224775 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2990,6 +2990,7 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
 {
/* tmp compat. Allow users to set bi_op or bi_rw */
rq->cmd_flags |= bio_data_dir(bio);
+   rq->op = bio->bi_op;
 
if (bio_has_data(bio))
rq->nr_phys_segments = bio_phys_segments(q, bio);
@@ -3074,6 +3075,7 @@ EXPORT_SYMBOL_GPL(blk_rq_unprep_clone);
 static void __blk_rq_prep_clone(struct request *dst, struct request *src)
 {
dst->cpu = src->cpu;
+   dst->op = src->op;
dst->cmd_flags |= (src->cmd_flags & REQ_CLONE_MASK) | REQ_NOMERGE;
dst->cmd_type = src->cmd_type;
dst->__sector = blk_rq_pos(src);
-- 
2.7.2
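
To make the two one-line additions above easier to picture, here is a
rough userspace sketch of the op being carried from a bio into a
request and from one request into its clone. The struct layouts, the
sketch_ helpers and the mask value are hypothetical simplifications and
do not mirror the real blk_rq_bio_prep() or __blk_rq_prep_clone().

```c
/* Hypothetical, cut-down stand-ins for struct bio and struct request. */
enum sketch_req_op {
	SKETCH_OP_READ,
	SKETCH_OP_WRITE,
	SKETCH_OP_DISCARD,
};

#define SKETCH_REQ_SYNC       (1ull << 4)
#define SKETCH_REQ_CLONE_MASK (SKETCH_REQ_SYNC)	/* assumed mask for the sketch */

struct sketch_bio {
	int bi_op;			/* operation */
	unsigned long long bi_rw;	/* attribute flags */
};

struct sketch_request {
	int op;
	unsigned long long cmd_flags;
};

/* The operation travels in its own field; flags are copied separately. */
static inline void sketch_rq_from_bio(struct sketch_request *rq,
				      const struct sketch_bio *bio)
{
	rq->op = bio->bi_op;
	rq->cmd_flags |= bio->bi_rw;
}

/* A clone inherits the op as is and only the masked subset of the flags. */
static inline void sketch_rq_clone(struct sketch_request *dst,
				   const struct sketch_request *src)
{
	dst->op = src->op;
	dst->cmd_flags |= src->cmd_flags & SKETCH_REQ_CLONE_MASK;
}
```

The point of the sketch is only that the op no longer has to be
recovered from the flag bits at either step.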



[PATCH 37/42] block, fs, drivers: do not use bi_rw/cmd_flags for REQ_OPs.

2016-04-13 Thread mchristi
From: Mike Christie 

We no longer use the bio->bi_rw and request->cmd_flags fields for the
REQ_OPs (REQ_WRITE, REQ_DISCARD, REQ_WRITE_SAME), so this patch stops
checking for them in bi_rw/cmd_flags and also removes the related compat code.

v2:

1. Remove compat code in __get_request.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 block/bio.c |  6 ++---
 block/blk-core.c| 34 -
 block/blk-merge.c   | 14 ++--
 block/blk-mq.c  |  3 +--
 drivers/ata/libata-scsi.c   |  2 +-
 drivers/block/brd.c |  2 +-
 drivers/block/drbd/drbd_main.c  | 15 +++--
 drivers/block/drbd/drbd_worker.c|  4 ++--
 drivers/block/loop.c|  6 ++---
 drivers/block/rbd.c |  2 +-
 drivers/block/rsxx/dma.c|  2 +-
 drivers/block/umem.c|  2 +-
 drivers/block/zram/zram_drv.c   |  2 +-
 drivers/ide/ide-floppy.c|  2 +-
 drivers/lightnvm/rrpc.c |  2 +-
 drivers/md/bcache/request.c | 10 -
 drivers/md/dm-cache-target.c| 10 +
 drivers/md/dm-crypt.c   |  2 +-
 drivers/md/dm-log-writes.c  |  2 +-
 drivers/md/dm-raid1.c   |  8 +++
 drivers/md/dm-region-hash.c |  4 ++--
 drivers/md/dm-stripe.c  |  4 ++--
 drivers/md/dm-thin.c| 15 -
 drivers/md/dm.c |  6 ++---
 drivers/md/linear.c |  2 +-
 drivers/md/raid0.c  |  2 +-
 drivers/scsi/osd/osd_initiator.c|  4 ++--
 drivers/staging/lustre/lustre/llite/lloop.c |  6 ++---
 include/linux/bio.h | 15 -
 include/linux/fs.h  | 25 +++--
 30 files changed, 99 insertions(+), 114 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 921de2e..bec5b54 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -675,10 +675,10 @@ struct bio *bio_clone_bioset(struct bio *bio_src, gfp_t gfp_mask,
bio->bi_iter.bi_sector  = bio_src->bi_iter.bi_sector;
bio->bi_iter.bi_size= bio_src->bi_iter.bi_size;
 
-   if (bio->bi_rw & REQ_DISCARD)
+   if (bio->bi_op == REQ_OP_DISCARD)
goto integrity_clone;
 
-   if (bio->bi_rw & REQ_WRITE_SAME) {
+   if (bio->bi_op == REQ_OP_WRITE_SAME) {
bio->bi_io_vec[bio->bi_vcnt++] = bio_src->bi_io_vec[0];
goto integrity_clone;
}
@@ -1797,7 +1797,7 @@ struct bio *bio_split(struct bio *bio, int sectors,
 * Discards need a mutable bio_vec to accommodate the payload
 * required by the DSM TRIM and UNMAP commands.
 */
-   if (bio->bi_rw & REQ_DISCARD)
+   if (bio->bi_op == REQ_OP_DISCARD)
split = bio_clone_bioset(bio, gfp, bs);
else
split = bio_clone_fast(bio, gfp, bs);
diff --git a/block/blk-core.c b/block/blk-core.c
index a240657..cee131b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1151,8 +1151,7 @@ static struct request *__get_request(struct request_list *rl, int op,
 
blk_rq_init(q, rq);
blk_rq_set_rl(rq, rl);
-   /* tmp compat - allow users to check either one for the op */
-   rq->cmd_flags = op | op_flags | REQ_ALLOCED;
+   rq->cmd_flags = op_flags | REQ_ALLOCED;
rq->op = op;
 
/* init elvpriv */
@@ -1705,8 +1704,7 @@ void init_request_from_bio(struct request *req, struct bio *bio)
 {
req->cmd_type = REQ_TYPE_FS;
 
-   /* tmp compat. Allow users to set bi_op or bi_rw */
-   req->cmd_flags |= (bio->bi_rw | bio->bi_op) & REQ_COMMON_MASK;
+   req->cmd_flags |= bio->bi_rw & REQ_COMMON_MASK;
if (bio->bi_rw & REQ_RAHEAD)
req->cmd_flags |= REQ_FAILFAST_MASK;
 
@@ -1856,9 +1854,9 @@ static void handle_bad_sector(struct bio *bio)
char b[BDEVNAME_SIZE];
 
printk(KERN_INFO "attempt to access beyond end of device\n");
-   printk(KERN_INFO "%s: rw=%ld, want=%Lu, limit=%Lu\n",
+   printk(KERN_INFO "%s: rw=%d,%ld, want=%Lu, limit=%Lu\n",
bdevname(bio->bi_bdev, b),
-   bio->bi_rw,
+   bio->bi_op, bio->bi_rw,
(unsigned long long)bio_end_sector(bio),
(long long)(i_size_read(bio->bi_bdev->bd_inode) >> 9));
 }
@@ -1979,14 +1977,14 @@ generic_make_request_checks(struct bio *bio)
}
}
 
-   if ((bio->bi_rw & REQ_DISCARD) &&
+   if ((bio->bi_op == REQ_OP_DISCARD) &&
(!blk_queue_discard(q) ||
 ((bio->bi_rw & REQ_SECURE) && !blk_queue_secdiscard(q)))) {
err = -EO

[PATCH 38/42] block, fs: remove old REQ definitions.

2016-04-13 Thread mchristi
From: Mike Christie 

We no longer use REQ_WRITE, REQ_WRITE_SAME and REQ_DISCARD,
so this patch removes them.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 include/linux/blk_types.h   | 21 ++---
 include/linux/fs.h  | 21 +++--
 include/trace/events/f2fs.h |  1 -
 3 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 6e49c91..b4251ed 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -151,7 +151,6 @@ struct bio {
  */
 enum rq_flag_bits {
/* common flags */
-   __REQ_WRITE,/* not set, read. set, write */
__REQ_FAILFAST_DEV, /* no driver retries of device errors */
__REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */
__REQ_FAILFAST_DRIVER,  /* no driver retries of driver errors */
@@ -159,9 +158,7 @@ enum rq_flag_bits {
__REQ_SYNC, /* request is sync (sync write or read) */
__REQ_META, /* metadata io request */
__REQ_PRIO, /* boost priority in cfq */
-   __REQ_DISCARD,  /* request to discard sectors */
-   __REQ_SECURE,   /* secure discard (used with __REQ_DISCARD) */
-   __REQ_WRITE_SAME,   /* write same block many times */
+   __REQ_SECURE,   /* secure discard (used with REQ_OP_DISCARD) */
 
__REQ_NOIDLE,   /* don't anticipate more IO after this one */
__REQ_INTEGRITY,/* I/O includes block integrity payload */
@@ -197,28 +194,22 @@ enum rq_flag_bits {
__REQ_NR_BITS,  /* stops here */
 };
 
-#define REQ_WRITE  (1ULL << __REQ_WRITE)
 #define REQ_FAILFAST_DEV   (1ULL << __REQ_FAILFAST_DEV)
 #define REQ_FAILFAST_TRANSPORT (1ULL << __REQ_FAILFAST_TRANSPORT)
 #define REQ_FAILFAST_DRIVER(1ULL << __REQ_FAILFAST_DRIVER)
 #define REQ_SYNC   (1ULL << __REQ_SYNC)
 #define REQ_META   (1ULL << __REQ_META)
 #define REQ_PRIO   (1ULL << __REQ_PRIO)
-#define REQ_DISCARD(1ULL << __REQ_DISCARD)
-#define REQ_WRITE_SAME (1ULL << __REQ_WRITE_SAME)
 #define REQ_NOIDLE (1ULL << __REQ_NOIDLE)
 #define REQ_INTEGRITY  (1ULL << __REQ_INTEGRITY)
 
 #define REQ_FAILFAST_MASK \
(REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER)
 #define REQ_COMMON_MASK \
-   (REQ_WRITE | REQ_FAILFAST_MASK | REQ_SYNC | REQ_META | REQ_PRIO | \
-REQ_DISCARD | REQ_WRITE_SAME | REQ_NOIDLE | REQ_FLUSH | REQ_FUA | \
-REQ_SECURE | REQ_INTEGRITY)
+   (REQ_FAILFAST_MASK | REQ_SYNC | REQ_META | REQ_PRIO | REQ_NOIDLE | \
+REQ_FLUSH | REQ_FUA | REQ_SECURE | REQ_INTEGRITY)
 #define REQ_CLONE_MASK REQ_COMMON_MASK
 
-#define BIO_NO_ADVANCE_ITER_MASK   (REQ_DISCARD|REQ_WRITE_SAME)
-
 /* This mask is used for both bio and request merge checking */
 #define REQ_NOMERGE_FLAGS \
(REQ_NOMERGE | REQ_STARTED | REQ_SOFTBARRIER | REQ_FLUSH | REQ_FUA | REQ_FLUSH_SEQ)
@@ -250,9 +241,9 @@ enum rq_flag_bits {
 
 enum req_op {
REQ_OP_READ,
-   REQ_OP_WRITE= REQ_WRITE,
-   REQ_OP_DISCARD  = REQ_DISCARD,
-   REQ_OP_WRITE_SAME   = REQ_WRITE_SAME,
+   REQ_OP_WRITE,
+   REQ_OP_DISCARD, /* request to discard sectors */
+   REQ_OP_WRITE_SAME,  /* write same block many times */
 };
 
 typedef unsigned int blk_qc_t;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9becf20..509e21f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -153,9 +153,10 @@ typedef void (dax_iodone_t)(struct buffer_head *bh_map, int uptodate);
 #define CHECK_IOVEC_ONLY -1
 
 /*
- * The below are the various read and write types that we support. Some of
+ * The below are the various read and write flags that we support. Some of
  * them include behavioral modifiers that send information down to the
- * block layer and IO scheduler. Terminology:
+ * block layer and IO scheduler. They should be used along with a req_op.
+ * Terminology:
  *
  * The block layer uses device plugging to defer IO a little bit, in
  * the hope that we will see more IO very shortly. This increases
@@ -194,19 +195,19 @@ typedef void (dax_iodone_t)(struct buffer_head *bh_map, int uptodate);
  * non-volatile media on completion.
  *
  */
-#define RW_MASKREQ_WRITE
+#define RW_MASKREQ_OP_WRITE
 #define RWA_MASK   REQ_RAHEAD
 
-#define READ   0
+#define READ   REQ_OP_READ
 #define WRITE  RW_MASK
 #define READA  RWA_MASK
 
-#define READ_SYNC  (READ | REQ_SYNC)
-#define WRITE_SYNC (WRITE | REQ_SYNC | REQ_NOIDLE)
-#define WRITE_ODIRECT  (WRITE | REQ_SYNC)
-#define WRITE_FLUSH(WRITE | REQ_SYNC | REQ_NOIDLE | REQ_FLUSH)
-#define WRITE_FU

[PATCH 31/42] block: convert merge/insert code to check for REQ_OPs.

2016-04-13 Thread mchristi
From: Mike Christie 

This patch converts the block layer merging code to use separate variables
for the operation and flags, and to check request->op for the REQ_OP.
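
As a rough illustration of the converted merge check, here is a userspace
sketch that follows the blk_check_merge_flags() hunk of this patch. The enum
values and the REQ_SECURE bit position are assumptions made for the sketch,
and the function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Operation values as plain enumerators (assumed layout for this sketch). */
enum req_op { REQ_OP_READ, REQ_OP_WRITE, REQ_OP_DISCARD, REQ_OP_WRITE_SAME };

/* Hypothetical bit value; the kernel derives it from enum rq_flag_bits. */
#define REQ_SECURE	(1U << 7)

/* Follows the converted blk_check_merge_flags(): operations are compared
 * by value, modifier flags are still tested as bits. */
static inline bool check_merge_flags(unsigned int flags1, unsigned int op1,
				     unsigned int flags2, unsigned int op2)
{
	if ((op1 == REQ_OP_DISCARD) != (op2 == REQ_OP_DISCARD))
		return false;

	if ((flags1 & REQ_SECURE) != (flags2 & REQ_SECURE))
		return false;

	if ((op1 == REQ_OP_WRITE_SAME) != (op2 == REQ_OP_WRITE_SAME))
		return false;

	return true;
}

/* Example calls; each assertion holds for the definitions above. */
static inline void check_merge_flags_examples(void)
{
	/* A discard never merges with a normal write. */
	assert(!check_merge_flags(0, REQ_OP_DISCARD, 0, REQ_OP_WRITE));
	/* Two discards merge only if both or neither are secure. */
	assert(!check_merge_flags(REQ_SECURE, REQ_OP_DISCARD,
				  0, REQ_OP_DISCARD));
	assert(check_merge_flags(REQ_SECURE, REQ_OP_DISCARD,
				 REQ_SECURE, REQ_OP_DISCARD));
}
```

The sketch only covers the flag/op compatibility test; data direction and the
other merge conditions are checked elsewhere in the real code.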

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 block/blk-core.c   |  2 +-
 block/blk-merge.c  | 10 ++
 include/linux/blkdev.h | 20 ++--
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 3ec1310..5632cd1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2175,7 +2175,7 @@ EXPORT_SYMBOL(submit_bio);
 static int blk_cloned_rq_check_limits(struct request_queue *q,
  struct request *rq)
 {
-   if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, rq->cmd_flags)) {
+   if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, rq->op)) {
printk(KERN_ERR "%s: over max size limit.\n", __func__);
return -EIO;
}
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2613531..c02371f 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -649,7 +649,8 @@ static int attempt_merge(struct request_queue *q, struct request *req,
if (!rq_mergeable(req) || !rq_mergeable(next))
return 0;
 
-   if (!blk_check_merge_flags(req->cmd_flags, next->cmd_flags))
+   if (!blk_check_merge_flags(req->cmd_flags, req->op, next->cmd_flags,
+  next->op))
return 0;
 
/*
@@ -663,7 +664,7 @@ static int attempt_merge(struct request_queue *q, struct request *req,
|| req_no_special_merge(next))
return 0;
 
-   if (req->cmd_flags & REQ_WRITE_SAME &&
+   if (req->op == REQ_OP_WRITE_SAME &&
!blk_write_same_mergeable(req->bio, next->bio))
return 0;
 
@@ -751,7 +752,8 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
if (!rq_mergeable(rq) || !bio_mergeable(bio))
return false;
 
-   if (!blk_check_merge_flags(rq->cmd_flags, bio->bi_rw))
+   if (!blk_check_merge_flags(rq->cmd_flags, rq->op, bio->bi_rw,
+  bio->bi_op))
return false;
 
/* different data direction or already started, don't merge */
@@ -767,7 +769,7 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
return false;
 
/* must be using the same buffer */
-   if (rq->cmd_flags & REQ_WRITE_SAME &&
+   if (rq->op == REQ_OP_WRITE_SAME &&
!blk_write_same_mergeable(rq->bio, bio))
return false;
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e2b2881..39df8ef 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -660,16 +660,16 @@ static inline bool rq_mergeable(struct request *rq)
return true;
 }
 
-static inline bool blk_check_merge_flags(unsigned int flags1,
-unsigned int flags2)
+static inline bool blk_check_merge_flags(unsigned int flags1, unsigned int op1,
+unsigned int flags2, unsigned int op2)
 {
-   if ((flags1 & REQ_DISCARD) != (flags2 & REQ_DISCARD))
+   if ((op1 == REQ_OP_DISCARD) != (op2 == REQ_OP_DISCARD))
return false;
 
if ((flags1 & REQ_SECURE) != (flags2 & REQ_SECURE))
return false;
 
-   if ((flags1 & REQ_WRITE_SAME) != (flags2 & REQ_WRITE_SAME))
+   if ((op1 == REQ_OP_WRITE_SAME) != (op2 == REQ_OP_WRITE_SAME))
return false;
 
return true;
@@ -870,12 +870,12 @@ static inline unsigned int blk_rq_cur_sectors(const struct request *rq)
 }
 
 static inline unsigned int blk_queue_get_max_sectors(struct request_queue *q,
-unsigned int cmd_flags)
+int op)
 {
-   if (unlikely(cmd_flags & REQ_DISCARD))
+   if (unlikely(op == REQ_OP_DISCARD))
return min(q->limits.max_discard_sectors, UINT_MAX >> 9);
 
-   if (unlikely(cmd_flags & REQ_WRITE_SAME))
+   if (unlikely(op == REQ_OP_WRITE_SAME))
return q->limits.max_write_same_sectors;
 
return q->limits.max_sectors;
@@ -902,11 +902,11 @@ static inline unsigned int blk_rq_get_max_sectors(struct request *rq)
if (unlikely(rq->cmd_type != REQ_TYPE_FS))
return q->limits.max_hw_sectors;
 
-   if (!q->limits.chunk_sectors || (rq->cmd_flags & REQ_DISCARD))
-   return blk_queue_get_max_sectors(q, rq->cmd_flags);
+   if (!q->limits.chunk_sectors || (rq->op == REQ_OP_DISCARD))
+   return blk_queue_get_max_sectors(q, rq->op);
 
return min(blk_max_size_offset(q, blk_rq_pos(rq)),
-   blk_queue_get_max_sectors(q, rq->cmd_flags));
+   blk_queue_get_max_sectors(q, rq->op));
 }
 
 static inline unsigned int blk_rq_count_bios(struc

[PATCH 30/42] blkg_rwstat: separate op from flags

2016-04-13 Thread mchristi
From: Mike Christie 

The bio and request operation and flags are going to be separate definitions,
so we cannot pass them in as a bitmap. This patch converts the blkg_rwstat
code and its caller, cfq, to pass in the values separately.
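
To make the idea concrete, here is a simplified userspace sketch of a
read/write/sync/async counter that takes the operation and the flags as two
arguments. It is not the blk-cgroup implementation: the struct, the index
names, the REQ_SYNC bit value and the op_is_write() definition are all
assumptions made for illustration:

```c
#include <assert.h>
#include <stdint.h>

enum req_op { REQ_OP_READ, REQ_OP_WRITE, REQ_OP_DISCARD, REQ_OP_WRITE_SAME };

/* Hypothetical bit value for the sync modifier flag. */
#define REQ_SYNC	(1U << 4)

enum rwstat_idx { RWSTAT_READ, RWSTAT_WRITE, RWSTAT_SYNC, RWSTAT_ASYNC,
		  RWSTAT_NR };

/* Simplified stand-in for struct blkg_rwstat: plain counters, no percpu. */
struct rwstat {
	int64_t cnt[RWSTAT_NR];
};

/* Assumption for this sketch: every op except READ counts as a write. */
static inline int op_is_write(int op)
{
	return op != REQ_OP_READ;
}

/* The direction comes from op, the sync/async split from op_flags. */
static inline void rwstat_add(struct rwstat *s, int op, int op_flags,
			      int64_t val)
{
	s->cnt[op_is_write(op) ? RWSTAT_WRITE : RWSTAT_READ] += val;
	s->cnt[(op_flags & REQ_SYNC) ? RWSTAT_SYNC : RWSTAT_ASYNC] += val;
}

static inline void rwstat_example(void)
{
	struct rwstat queued = { { 0 } };

	rwstat_add(&queued, REQ_OP_WRITE, REQ_SYNC, 1);	/* sync write queued */
	rwstat_add(&queued, REQ_OP_READ, 0, 1);		/* async read queued */
	rwstat_add(&queued, REQ_OP_WRITE, REQ_SYNC, -1);	/* write removed */

	assert(queued.cnt[RWSTAT_WRITE] == 0);
	assert(queued.cnt[RWSTAT_READ] == 1);
	assert(queued.cnt[RWSTAT_SYNC] == 0);
	assert(queued.cnt[RWSTAT_ASYNC] == 1);
}
```

With a single bitmap argument the two lookups could share one value; with
separate definitions the caller has to hand over both, which is what the cfq
call sites in the diff do with rq->op and rq->cmd_flags.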

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 block/cfq-iosched.c| 49 +++---
 include/linux/blk-cgroup.h | 13 ++--
 2 files changed, 36 insertions(+), 26 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 3fcc598..0dfa2dd 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -667,9 +667,10 @@ static inline void cfqg_put(struct cfq_group *cfqg)
 } while (0)
 
 static inline void cfqg_stats_update_io_add(struct cfq_group *cfqg,
-   struct cfq_group *curr_cfqg, int rw)
+   struct cfq_group *curr_cfqg, int op,
+   int op_flags)
 {
-   blkg_rwstat_add(&cfqg->stats.queued, rw, 1);
+   blkg_rwstat_add(&cfqg->stats.queued, op, op_flags, 1);
cfqg_stats_end_empty_time(&cfqg->stats);
cfqg_stats_set_start_group_wait_time(cfqg, curr_cfqg);
 }
@@ -683,26 +684,30 @@ static inline void cfqg_stats_update_timeslice_used(struct cfq_group *cfqg,
 #endif
 }
 
-static inline void cfqg_stats_update_io_remove(struct cfq_group *cfqg, int rw)
+static inline void cfqg_stats_update_io_remove(struct cfq_group *cfqg, int op,
+  int op_flags)
 {
-   blkg_rwstat_add(&cfqg->stats.queued, rw, -1);
+   blkg_rwstat_add(&cfqg->stats.queued, op, op_flags, -1);
 }
 
-static inline void cfqg_stats_update_io_merged(struct cfq_group *cfqg, int rw)
+static inline void cfqg_stats_update_io_merged(struct cfq_group *cfqg, int op,
+  int op_flags)
 {
-   blkg_rwstat_add(&cfqg->stats.merged, rw, 1);
+   blkg_rwstat_add(&cfqg->stats.merged, op, op_flags, 1);
 }
 
 static inline void cfqg_stats_update_completion(struct cfq_group *cfqg,
-   uint64_t start_time, uint64_t io_start_time, int rw)
+   uint64_t start_time, uint64_t io_start_time, int op,
+   int op_flags)
 {
struct cfqg_stats *stats = &cfqg->stats;
unsigned long long now = sched_clock();
 
if (time_after64(now, io_start_time))
-   blkg_rwstat_add(&stats->service_time, rw, now - io_start_time);
+   blkg_rwstat_add(&stats->service_time, op, op_flags,
+   now - io_start_time);
if (time_after64(io_start_time, start_time))
-   blkg_rwstat_add(&stats->wait_time, rw,
+   blkg_rwstat_add(&stats->wait_time, op, op_flags,
io_start_time - start_time);
 }
 
@@ -781,13 +786,16 @@ static inline void cfqg_put(struct cfq_group *cfqg) { }
 #define cfq_log_cfqg(cfqd, cfqg, fmt, args...) do {} while (0)
 
 static inline void cfqg_stats_update_io_add(struct cfq_group *cfqg,
-   struct cfq_group *curr_cfqg, int rw) { }
+   struct cfq_group *curr_cfqg, int op, int op_flags) { }
 static inline void cfqg_stats_update_timeslice_used(struct cfq_group *cfqg,
unsigned long time, unsigned long unaccounted_time) { }
-static inline void cfqg_stats_update_io_remove(struct cfq_group *cfqg, int rw) { }
-static inline void cfqg_stats_update_io_merged(struct cfq_group *cfqg, int rw) { }
+static inline void cfqg_stats_update_io_remove(struct cfq_group *cfqg, int op,
+   int op_flags) { }
+static inline void cfqg_stats_update_io_merged(struct cfq_group *cfqg, int op,
+   int op_flags) { }
 static inline void cfqg_stats_update_completion(struct cfq_group *cfqg,
-   uint64_t start_time, uint64_t io_start_time, int rw) { }
+   uint64_t start_time, uint64_t io_start_time, int op,
+   int op_flags) { }
 
 #endif /* CONFIG_CFQ_GROUP_IOSCHED */
 
@@ -2461,10 +2469,10 @@ static void cfq_reposition_rq_rb(struct cfq_queue *cfqq, struct request *rq)
 {
elv_rb_del(&cfqq->sort_list, rq);
cfqq->queued[rq_is_sync(rq)]--;
-   cfqg_stats_update_io_remove(RQ_CFQG(rq), rq->cmd_flags);
+   cfqg_stats_update_io_remove(RQ_CFQG(rq), rq->op, rq->cmd_flags);
cfq_add_rq_rb(rq);
cfqg_stats_update_io_add(RQ_CFQG(rq), cfqq->cfqd->serving_group,
-rq->cmd_flags);
+rq->op, rq->cmd_flags);
 }
 
 static struct request *
@@ -2517,7 +2525,7 @@ static void cfq_remove_request(struct request *rq)
cfq_del_rq_rb(rq);
 
cfqq->cfqd->rq_queued--;
-   cfqg_stats_update_io_remove(RQ_CFQG(rq), rq->cmd_flags);
+   cfqg_stats_update_io_remove(RQ_CFQG(rq), rq->op, rq->cmd_flags);
if (rq->cmd_flags 

[PATCH 29/42] block: prepare elevator to use REQ_OPs.

2016-04-13 Thread mchristi
From: Mike Christie 

This patch converts the elevator code to use separate variables
for the operation and flags, and to check request->op for the REQ_OP.
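
The shape of the changed callback can be sketched in userspace as follows.
The structs are heavily reduced stand-ins (the real hook lives in the
elevator ops table, not directly in the queue), the enum values are
assumptions, and demo_may_queue() is a hypothetical scheduler hook:

```c
#include <assert.h>
#include <stddef.h>

enum req_op { REQ_OP_READ, REQ_OP_WRITE, REQ_OP_DISCARD, REQ_OP_WRITE_SAME };
enum { ELV_MQUEUE_MAY, ELV_MQUEUE_NO, ELV_MQUEUE_MUST };

struct request_queue;	/* forward declaration, defined below */

/* New callback shape: operation and flags arrive as two arguments. */
typedef int (elevator_may_queue_fn)(struct request_queue *, int, int);

/* Reduced stand-in: the hook is stored directly in the queue here. */
struct request_queue {
	elevator_may_queue_fn *may_queue_fn;	/* NULL if there is no hook */
};

static inline int elv_may_queue(struct request_queue *q, int op, int op_flags)
{
	if (q->may_queue_fn)
		return q->may_queue_fn(q, op, op_flags);

	return ELV_MQUEUE_MAY;
}

/* Hypothetical hook: refuse to queue discards, allow everything else. */
static inline int demo_may_queue(struct request_queue *q, int op, int op_flags)
{
	(void)q;
	(void)op_flags;
	return op == REQ_OP_DISCARD ? ELV_MQUEUE_NO : ELV_MQUEUE_MAY;
}

static inline void elv_may_queue_example(void)
{
	struct request_queue no_hook = { .may_queue_fn = NULL };
	struct request_queue with_hook = { .may_queue_fn = demo_may_queue };

	assert(elv_may_queue(&no_hook, REQ_OP_DISCARD, 0) == ELV_MQUEUE_MAY);
	assert(elv_may_queue(&with_hook, REQ_OP_DISCARD, 0) == ELV_MQUEUE_NO);
	assert(elv_may_queue(&with_hook, REQ_OP_WRITE, 0) == ELV_MQUEUE_MAY);
}
```

A scheduler that still wants one combined value can OR the two arguments
itself, as cfq_may_queue() does in this patch for rw_is_sync().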

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 block/blk-core.c | 2 +-
 block/cfq-iosched.c  | 4 ++--
 block/elevator.c | 7 +++
 include/linux/elevator.h | 4 ++--
 4 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index f1545d1..3ec1310 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1081,7 +1081,7 @@ static struct request *__get_request(struct request_list *rl, int op,
if (unlikely(blk_queue_dying(q)))
return ERR_PTR(-ENODEV);
 
-   may_queue = elv_may_queue(q, op | op_flags);
+   may_queue = elv_may_queue(q, op, op_flags);
if (may_queue == ELV_MQUEUE_NO)
goto rq_starved;
 
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 4a34978..3fcc598 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -4285,7 +4285,7 @@ static inline int __cfq_may_queue(struct cfq_queue *cfqq)
return ELV_MQUEUE_MAY;
 }
 
-static int cfq_may_queue(struct request_queue *q, int rw)
+static int cfq_may_queue(struct request_queue *q, int op, int op_flags)
 {
struct cfq_data *cfqd = q->elevator->elevator_data;
struct task_struct *tsk = current;
@@ -4302,7 +4302,7 @@ static int cfq_may_queue(struct request_queue *q, int rw)
if (!cic)
return ELV_MQUEUE_MAY;
 
-   cfqq = cic_to_cfqq(cic, rw_is_sync(rw));
+   cfqq = cic_to_cfqq(cic, rw_is_sync(op | op_flags));
if (cfqq) {
cfq_init_prio_data(cfqq, cic);
 
diff --git a/block/elevator.c b/block/elevator.c
index c3555c9..6a282bf 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -366,8 +366,7 @@ void elv_dispatch_sort(struct request_queue *q, struct request *rq)
list_for_each_prev(entry, &q->queue_head) {
struct request *pos = list_entry_rq(entry);
 
-   if ((rq->cmd_flags & REQ_DISCARD) !=
-   (pos->cmd_flags & REQ_DISCARD))
+   if ((rq->op == REQ_OP_DISCARD) != (pos->op == REQ_OP_DISCARD))
break;
if (rq_data_dir(rq) != rq_data_dir(pos))
break;
@@ -717,12 +716,12 @@ void elv_put_request(struct request_queue *q, struct request *rq)
e->type->ops.elevator_put_req_fn(rq);
 }
 
-int elv_may_queue(struct request_queue *q, int rw)
+int elv_may_queue(struct request_queue *q, int op, int op_flags)
 {
struct elevator_queue *e = q->elevator;
 
if (e->type->ops.elevator_may_queue_fn)
-   return e->type->ops.elevator_may_queue_fn(q, rw);
+   return e->type->ops.elevator_may_queue_fn(q, op, op_flags);
 
return ELV_MQUEUE_MAY;
 }
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index 638b324..953d286 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -26,7 +26,7 @@ typedef int (elevator_dispatch_fn) (struct request_queue *, int);
 typedef void (elevator_add_req_fn) (struct request_queue *, struct request *);
 typedef struct request *(elevator_request_list_fn) (struct request_queue *, struct request *);
 typedef void (elevator_completed_req_fn) (struct request_queue *, struct request *);
-typedef int (elevator_may_queue_fn) (struct request_queue *, int);
+typedef int (elevator_may_queue_fn) (struct request_queue *, int, int);
 
 typedef void (elevator_init_icq_fn) (struct io_cq *);
 typedef void (elevator_exit_icq_fn) (struct io_cq *);
@@ -134,7 +134,7 @@ extern struct request *elv_former_request(struct request_queue *, struct request
 extern struct request *elv_latter_request(struct request_queue *, struct request *);
 extern int elv_register_queue(struct request_queue *q);
 extern void elv_unregister_queue(struct request_queue *q);
-extern int elv_may_queue(struct request_queue *, int);
+extern int elv_may_queue(struct request_queue *, int, int);
 extern void elv_completed_request(struct request_queue *, struct request *);
 extern int elv_set_request(struct request_queue *q, struct request *rq,
   struct bio *bio, gfp_t gfp_mask);
-- 
2.7.2



[PATCH 41/42] block: use QUEUE flags instead of flush_flags to test for flush support

2016-04-13 Thread mchristi
From: Mike Christie 

The last patch added a REQ_OP_FLUSH for request_fn drivers
and the next patch renames REQ_FLUSH to REQ_PREFLUSH which
will be used by file systems and make_request_fn drivers so
they can send a write/flush combo.

Jens's cleanup in:

block: add ability to flag write back caching on a device
93e9d8e836cb1a9a58b33eb6643bf061c6119ef2

also added QUEUE flags that can be used to get the same info
as flush_flags.

This patch has drivers check the QUEUE_FLAG bits, so we do not
have to have the extra REQ_FLUSH definition.
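
The converted policy decision can be sketched in userspace like this. The
logic follows the blk_flush_policy() hunk of this patch, but the structs are
reduced stand-ins, two booleans replace the QUEUE_FLAG_WC and QUEUE_FLAG_FUA
bits, and all bit values are assumptions made for the sketch:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for the request flags used below. */
#define REQ_FLUSH		(1U << 0)
#define REQ_FUA			(1U << 1)

/* Flush sequence steps; the values are assumptions of this sketch. */
#define REQ_FSEQ_PREFLUSH	(1U << 0)
#define REQ_FSEQ_DATA		(1U << 1)
#define REQ_FSEQ_POSTFLUSH	(1U << 2)

/* Reduced stand-ins: two booleans replace the queue flag bits. */
struct request_queue {
	bool wc;	/* device has a volatile write cache */
	bool fua;	/* device supports FUA natively */
};

struct request {
	struct request_queue *q;
	unsigned int cmd_flags;
	unsigned int sectors;
};

static inline bool blk_queue_flush(const struct request_queue *q)
{
	return q->wc;
}

static inline bool blk_queue_fua(const struct request_queue *q)
{
	return q->fua;
}

/* Queue capabilities are read from the queue, not from a cached bitmap. */
static inline unsigned int flush_policy(const struct request *rq)
{
	const struct request_queue *q = rq->q;
	unsigned int policy = 0;

	if (rq->sectors)
		policy |= REQ_FSEQ_DATA;

	if (blk_queue_flush(q)) {
		if (rq->cmd_flags & REQ_FLUSH)
			policy |= REQ_FSEQ_PREFLUSH;
		if (!blk_queue_fua(q) && (rq->cmd_flags & REQ_FUA))
			policy |= REQ_FSEQ_POSTFLUSH;
	}
	return policy;
}

static inline void flush_policy_example(void)
{
	struct request_queue wc_only = { .wc = true, .fua = false };
	struct request rq = { .q = &wc_only,
			      .cmd_flags = REQ_FLUSH | REQ_FUA,
			      .sectors = 8 };

	/* No native FUA: the write is bracketed by a pre- and a post-flush. */
	assert(flush_policy(&rq) ==
	       (REQ_FSEQ_PREFLUSH | REQ_FSEQ_DATA | REQ_FSEQ_POSTFLUSH));
}
```

On a queue without a write cache the same request reduces to the data step
alone, which matches the comment in generic_make_request_checks() that
drivers without flush support do not have to worry about these bits.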

Signed-off-by: Mike Christie 
---
 block/blk-core.c|  3 ++-
 block/blk-flush.c   | 12 +-
 block/blk-settings.c| 11 -
 drivers/block/xen-blkback/xenbus.c  |  2 +-
 drivers/block/xen-blkfront.c| 48 +++--
 drivers/md/dm-table.c   | 20 +++-
 drivers/md/raid5-cache.c|  3 ++-
 drivers/target/target_core_iblock.c |  6 ++---
 include/linux/blkdev.h  |  3 ++-
 9 files changed, 59 insertions(+), 49 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 71ba3a9..ef69d04 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1969,7 +1969,8 @@ generic_make_request_checks(struct bio *bio)
 * drivers without flush support don't have to worry
 * about them.
 */
-   if ((bio->bi_rw & (REQ_FLUSH | REQ_FUA)) && !q->flush_flags) {
+   if ((bio->bi_rw & (REQ_FLUSH | REQ_FUA)) &&
+   !(blk_queue_flush(q) || blk_queue_fua(q))) {
bio->bi_rw &= ~(REQ_FLUSH | REQ_FUA);
if (!nr_sectors) {
err = 0;
diff --git a/block/blk-flush.c b/block/blk-flush.c
index af0c805..7682680 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -95,17 +95,18 @@ enum {
 static bool blk_kick_flush(struct request_queue *q,
   struct blk_flush_queue *fq);
 
-static unsigned int blk_flush_policy(unsigned int fflags, struct request *rq)
+static unsigned int blk_flush_policy(struct request *rq)
 {
+   struct request_queue *q = rq->q;
unsigned int policy = 0;
 
if (blk_rq_sectors(rq))
policy |= REQ_FSEQ_DATA;
 
-   if (fflags & REQ_FLUSH) {
+   if (blk_queue_flush(q)) {
if (rq->cmd_flags & REQ_FLUSH)
policy |= REQ_FSEQ_PREFLUSH;
-   if (!(fflags & REQ_FUA) && (rq->cmd_flags & REQ_FUA))
+   if (!blk_queue_fua(q) && (rq->cmd_flags & REQ_FUA))
policy |= REQ_FSEQ_POSTFLUSH;
}
return policy;
@@ -385,8 +386,7 @@ static void mq_flush_data_end_io(struct request *rq, int error)
 void blk_insert_flush(struct request *rq)
 {
struct request_queue *q = rq->q;
-   unsigned int fflags = q->flush_flags;   /* may change, cache */
-   unsigned int policy = blk_flush_policy(fflags, rq);
+   unsigned int policy = blk_flush_policy(rq);
struct blk_flush_queue *fq = blk_get_flush_queue(q, rq->mq_ctx);
 
/*
@@ -394,7 +394,7 @@ void blk_insert_flush(struct request *rq)
 * REQ_FLUSH and FUA for the driver.
 */
rq->cmd_flags &= ~REQ_FLUSH;
-   if (!(fflags & REQ_FUA))
+   if (!blk_queue_fua(q))
rq->cmd_flags &= ~REQ_FUA;
 
/*
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 80d9327..51bc002 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -837,16 +837,13 @@ EXPORT_SYMBOL_GPL(blk_queue_flush_queueable);
 void blk_queue_write_cache(struct request_queue *q, bool wc, bool fua)
 {
spin_lock_irq(q->queue_lock);
-   if (wc) {
+   if (wc)
queue_flag_set(QUEUE_FLAG_WC, q);
-   q->flush_flags = REQ_FLUSH;
-   } else
+   else
queue_flag_clear(QUEUE_FLAG_WC, q);
-   if (fua) {
-   if (wc)
-   q->flush_flags |= REQ_FUA;
+   if (fua)
queue_flag_set(QUEUE_FLAG_FUA, q);
-   } else
+   else
queue_flag_clear(QUEUE_FLAG_FUA, q);
spin_unlock_irq(q->queue_lock);
 }
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 26aa080..1291c35 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -477,7 +477,7 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
vbd->type |= VDISK_REMOVABLE;
 
q = bdev_get_queue(bdev);
-   if (q && q->flush_flags)
+   if (q && (blk_queue_flush(q) || blk_queue_fua(q)))
vbd->flush_support = true;
 
if (q && blk_queue_secdiscard(q))
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index f01691a..a0651f3 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -196,6 +196,7 @@ struct blkfront_info
unsigned int nr_ring_pages;
struct re

[PATCH 40/42] block, drivers: add REQ_OP_FLUSH operation

2016-04-13 Thread mchristi
From: Mike Christie 

This adds a REQ_OP_FLUSH operation that is sent to request_fn
based drivers by the block layer's flush code, instead of
sending requests with the request->cmd_flags REQ_FLUSH bit set.
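
For a driver author the change boils down to dispatching on the operation
instead of testing a flag bit. A minimal userspace sketch, assuming
REQ_OP_FLUSH is appended to the existing enum; the request struct is a
reduced stand-in and the demo_* names are hypothetical:

```c
#include <assert.h>

/* Assumed enum layout with the new operation appended at the end. */
enum req_op {
	REQ_OP_READ,
	REQ_OP_WRITE,
	REQ_OP_DISCARD,
	REQ_OP_WRITE_SAME,
	REQ_OP_FLUSH,
};

/* What the hypothetical driver decides to do with a request. */
enum demo_action {
	DEMO_READ,
	DEMO_WRITE,
	DEMO_DISCARD,
	DEMO_FLUSH_CACHE,
	DEMO_UNSUPPORTED,
};

/* Reduced stand-in for struct request. */
struct request {
	int op;
	unsigned int cmd_flags;
};

/* Hypothetical request_fn-style handler: the flush is its own op now. */
static inline enum demo_action demo_handle_request(const struct request *rq)
{
	switch (rq->op) {
	case REQ_OP_FLUSH:
		return DEMO_FLUSH_CACHE;
	case REQ_OP_DISCARD:
		return DEMO_DISCARD;
	case REQ_OP_WRITE:
		return DEMO_WRITE;
	case REQ_OP_READ:
		return DEMO_READ;
	default:
		return DEMO_UNSUPPORTED;
	}
}

static inline void demo_handle_request_example(void)
{
	struct request flush = { .op = REQ_OP_FLUSH, .cmd_flags = 0 };
	struct request write = { .op = REQ_OP_WRITE, .cmd_flags = 0 };

	assert(demo_handle_request(&flush) == DEMO_FLUSH_CACHE);
	assert(demo_handle_request(&write) == DEMO_WRITE);
}
```

Modifier bits such as REQ_FUA would still be read from cmd_flags inside the
write path; only the empty flush request moves to its own operation.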

For the following 3 flush related patches, I have not tested
every driver. I have only tested scsi with xfs and btrfs.

v2.

1. Fix kbuild failures. Forgot to update ubd driver.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 Documentation/block/writeback_cache_control.txt | 6 +++---
 arch/um/drivers/ubd_kern.c  | 2 +-
 block/blk-flush.c   | 3 ++-
 drivers/block/loop.c| 4 ++--
 drivers/block/nbd.c | 2 +-
 drivers/block/osdblk.c  | 2 +-
 drivers/block/ps3disk.c | 4 ++--
 drivers/block/skd_main.c| 2 +-
 drivers/block/virtio_blk.c  | 2 +-
 drivers/block/xen-blkfront.c| 8 
 drivers/ide/ide-disk.c  | 2 +-
 drivers/md/dm.c | 2 +-
 drivers/mmc/card/block.c| 5 ++---
 drivers/mmc/card/queue.h| 2 +-
 drivers/mtd/mtd_blkdevs.c   | 2 +-
 drivers/nvme/host/core.c| 2 +-
 drivers/scsi/sd.c   | 7 +++
 include/linux/blk_types.h   | 1 +
 include/linux/blkdev.h  | 3 +++
 kernel/trace/blktrace.c | 5 -
 20 files changed, 36 insertions(+), 30 deletions(-)

diff --git a/Documentation/block/writeback_cache_control.txt b/Documentation/block/writeback_cache_control.txt
index 59e0516..da70bda 100644
--- a/Documentation/block/writeback_cache_control.txt
+++ b/Documentation/block/writeback_cache_control.txt
@@ -73,9 +73,9 @@ doing:
 
blk_queue_write_cache(sdkp->disk->queue, true, false);
 
-and handle empty REQ_FLUSH requests in its prep_fn/request_fn.  Note that
+and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn.  Note that
 REQ_FLUSH requests with a payload are automatically turned into a sequence
-of an empty REQ_FLUSH request followed by the actual write by the block
+of an empty REQ_OP_FLUSH request followed by the actual write by the block
 layer.  For devices that also support the FUA bit the block layer needs
 to be told to pass through the REQ_FUA bit using:
 
@@ -83,4 +83,4 @@ to be told to pass through the REQ_FUA bit using:
 
 and the driver must handle write requests that have the REQ_FUA bit set
 in prep_fn/request_fn.  If the FUA bit is not natively supported the block
-layer turns it into an empty REQ_FLUSH request after the actual write.
+layer turns it into an empty REQ_OP_FLUSH request after the actual write.
diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index 17e96dc..0cb2dab 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -1286,7 +1286,7 @@ static void do_ubd_request(struct request_queue *q)
 
req = dev->request;
 
-   if (req->cmd_flags & REQ_FLUSH) {
+   if (req->op == REQ_OP_FLUSH) {
io_req = kmalloc(sizeof(struct io_thread_req),
 GFP_ATOMIC);
if (io_req == NULL) {
diff --git a/block/blk-flush.c b/block/blk-flush.c
index b05acca..af0c805 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -29,7 +29,7 @@
  * The actual execution of flush is double buffered.  Whenever a request
  * needs to execute PRE or POSTFLUSH, it queues at
  * fq->flush_queue[fq->flush_pending_idx].  Once certain criteria are met, a
- * flush is issued and the pending_idx is toggled.  When the flush
+ * REQ_OP_FLUSH is issued and the pending_idx is toggled.  When the flush
  * completes, all the requests which were pending are proceeded to the next
  * step.  This allows arbitrary merging of different types of FLUSH/FUA
  * requests.
@@ -330,6 +330,7 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq)
 
flush_rq->cmd_type = REQ_TYPE_FS;
flush_rq->cmd_flags = WRITE_FLUSH | REQ_FLUSH_SEQ;
+   flush_rq->op = REQ_OP_FLUSH;
flush_rq->rq_disk = first_rq->rq_disk;
flush_rq->end_io = flush_end_io;
 
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index f1f7a25..7d7d7a4f 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -536,7 +536,7 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset;
 
if (op_is_write(rq->op)) {
-   if (rq->cmd_flags & REQ_FLUSH)
+   if (rq->op == REQ_OP_FLUSH)
ret = lo_req_flush(lo, rq);
else if (rq->op == REQ_OP_DISCARD)
ret = lo_discard(lo, rq, pos);
@@ -1653,7 +165

[PATCH 42/42] block, drivers, fs: rename REQ_FLUSH to REQ_PREFLUSH

2016-04-13 Thread mchristi
From: Mike Christie 

To avoid confusion between REQ_OP_FLUSH, which is handled by
request_fn drivers, and upper layers requesting the block layer
perform a flush sequence along with possibly a WRITE, this patch
renames REQ_FLUSH to REQ_PREFLUSH.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 Documentation/block/writeback_cache_control.txt | 22 +++---
 Documentation/device-mapper/log-writes.txt  | 10 +-
 block/blk-core.c| 12 ++--
 block/blk-flush.c   | 16 
 block/blk-mq.c  |  4 ++--
 drivers/block/drbd/drbd_actlog.c|  4 ++--
 drivers/block/drbd/drbd_main.c  |  2 +-
 drivers/block/drbd/drbd_protocol.h  |  2 +-
 drivers/block/drbd/drbd_receiver.c  |  2 +-
 drivers/block/drbd/drbd_req.c   |  2 +-
 drivers/md/bcache/journal.c |  2 +-
 drivers/md/bcache/request.c |  8 
 drivers/md/dm-cache-target.c| 12 ++--
 drivers/md/dm-crypt.c   |  7 ---
 drivers/md/dm-era-target.c  |  4 ++--
 drivers/md/dm-io.c  |  2 +-
 drivers/md/dm-log-writes.c  |  2 +-
 drivers/md/dm-raid1.c   |  5 +++--
 drivers/md/dm-region-hash.c |  4 ++--
 drivers/md/dm-snap.c|  6 +++---
 drivers/md/dm-stripe.c  |  2 +-
 drivers/md/dm-thin.c|  8 
 drivers/md/dm.c | 12 ++--
 drivers/md/linear.c |  2 +-
 drivers/md/md.c |  2 +-
 drivers/md/md.h |  2 +-
 drivers/md/multipath.c  |  2 +-
 drivers/md/raid0.c  |  2 +-
 drivers/md/raid1.c  |  3 ++-
 drivers/md/raid10.c |  2 +-
 drivers/md/raid5-cache.c|  2 +-
 drivers/md/raid5.c  |  2 +-
 fs/btrfs/check-integrity.c  |  8 
 fs/jbd2/journal.c   |  2 +-
 fs/xfs/xfs_buf.c|  2 +-
 include/linux/blk_types.h   |  8 
 include/linux/fs.h  |  4 ++--
 include/trace/events/f2fs.h |  2 +-
 kernel/trace/blktrace.c |  5 +++--
 39 files changed, 102 insertions(+), 98 deletions(-)

diff --git a/Documentation/block/writeback_cache_control.txt b/Documentation/block/writeback_cache_control.txt
index da70bda..8a6bdad 100644
--- a/Documentation/block/writeback_cache_control.txt
+++ b/Documentation/block/writeback_cache_control.txt
@@ -20,11 +20,11 @@ a forced cache flush, and the Force Unit Access (FUA) flag for requests.
 Explicit cache flushes
 --
 
-The REQ_FLUSH flag can be OR ed into the r/w flags of a bio submitted from
+The REQ_PREFLUSH flag can be OR ed into the r/w flags of a bio submitted from
 the filesystem and will make sure the volatile cache of the storage device
 has been flushed before the actual I/O operation is started.  This explicitly
 guarantees that previously completed write requests are on non-volatile
-storage before the flagged bio starts. In addition the REQ_FLUSH flag can be
+storage before the flagged bio starts. In addition the REQ_PREFLUSH flag can be
 set on an otherwise empty bio structure, which causes only an explicit cache
 flush without any dependent I/O.  It is recommend to use
 the blkdev_issue_flush() helper for a pure cache flush.
@@ -41,21 +41,21 @@ signaled after the data has been committed to non-volatile storage.
 Implementation details for filesystems
 --
 
-Filesystems can simply set the REQ_FLUSH and REQ_FUA bits and do not have to
+Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
 worry if the underlying devices need any explicit cache flushing and how
-the Forced Unit Access is implemented.  The REQ_FLUSH and REQ_FUA flags
+the Forced Unit Access is implemented.  The REQ_PREFLUSH and REQ_FUA flags
 may both be set on a single bio.
 
 
 Implementation details for make_request_fn based block drivers
 --
 
-These drivers will always see the REQ_FLUSH and REQ_FUA bits as they sit
+These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit
 directly below the submit_bio interface.  For remapping drivers the REQ_FUA
 bits need to be propagated to underlying devices, and a global flush needs
-to be implemented for bios with the REQ_FLUSH bit set.  For real device
-drivers that do not have a volatile cache the REQ_FLUSH a

[PATCH 39/42] block: shrink bio/request fields

2016-04-13 Thread mchristi
From: Mike Christie 

bi_op only needed to be an int for temp compat reasons, so
this patch shrinks it to u8.

There is no need for bi_rw to be so large now, so that is
reduced to an unsigned int and bi_ioprio is just put in
its own field.
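
The format string changes in the diff follow directly from the new types. A
small userspace sketch, assuming a reduced struct with only the two fields
discussed here (the field order and the helper name are hypothetical):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reduced stand-in for struct bio with the shrunken fields. */
struct bio {
	unsigned int bi_rw;	/* modifier flags, previously unsigned long */
	uint8_t bi_op;		/* operation, previously int */
};

/* Hypothetical helper that prints the pair the way the patch does. */
static inline int format_bio_rw(char *buf, size_t len, const struct bio *bio)
{
	/* bi_op is promoted to int in a variadic call, so %d still fits;
	 * bi_rw is now unsigned int, so %ld/%lu become %u. */
	return snprintf(buf, len, "rw=%d,%u", bio->bi_op, bio->bi_rw);
}

static inline void format_bio_rw_example(void)
{
	struct bio bio = { .bi_rw = 16, .bi_op = 1 };
	char buf[32];

	format_bio_rw(buf, sizeof(buf), &bio);
	assert(strcmp(buf, "rw=1,16") == 0);
}
```

Leaving an old %ld or %lu in place after the type change would mismatch the
argument, which is why every debug print that touches bi_rw is updated in
this patch.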

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
---
 block/blk-core.c   |  2 +-
 drivers/md/dm-flakey.c |  2 +-
 drivers/md/raid5.c | 13 +++--
 fs/btrfs/check-integrity.c |  4 ++--
 fs/btrfs/inode.c   |  2 +-
 include/linux/bio.h| 13 ++---
 include/linux/blk_types.h  | 11 +++
 include/linux/blkdev.h |  2 +-
 8 files changed, 18 insertions(+), 31 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index cee131b..71ba3a9 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1854,7 +1854,7 @@ static void handle_bad_sector(struct bio *bio)
char b[BDEVNAME_SIZE];
 
printk(KERN_INFO "attempt to access beyond end of device\n");
-   printk(KERN_INFO "%s: rw=%d,%ld, want=%Lu, limit=%Lu\n",
+   printk(KERN_INFO "%s: rw=%d,%u, want=%Lu, limit=%Lu\n",
bdevname(bio->bi_bdev, b),
bio->bi_op, bio->bi_rw,
(unsigned long long)bio_end_sector(bio),
diff --git a/drivers/md/dm-flakey.c b/drivers/md/dm-flakey.c
index b7341de..29b99fb 100644
--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -266,7 +266,7 @@ static void corrupt_bio_data(struct bio *bio, struct flakey_c *fc)
data[fc->corrupt_bio_byte - 1] = fc->corrupt_bio_value;
 
DMDEBUG("Corrupting data bio=%p by writing %u to byte %u "
-   "(rw=%c bi_rw=%lu bi_sector=%llu cur_bytes=%u)\n",
+   "(rw=%c bi_rw=%u bi_sector=%llu cur_bytes=%u)\n",
bio, fc->corrupt_bio_value, fc->corrupt_bio_byte,
(bio_data_dir(bio) == WRITE) ? 'w' : 'r', bio->bi_rw,
(unsigned long long)bio->bi_iter.bi_sector, bio_bytes);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c36b817..7fb693f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1006,9 +1006,9 @@ again:
: raid5_end_read_request;
bi->bi_private = sh;
 
-   pr_debug("%s: for %llu schedule op %ld on disc %d\n",
+   pr_debug("%s: for %llu schedule op %d,%u on disc %d\n",
__func__, (unsigned long long)sh->sector,
-   bi->bi_rw, i);
+   bi->bi_op, bi->bi_rw, i);
atomic_inc(&sh->count);
if (sh != head_sh)
atomic_inc(&head_sh->count);
@@ -1058,10 +1058,10 @@ again:
rbi->bi_end_io = raid5_end_write_request;
rbi->bi_private = sh;
 
-   pr_debug("%s: for %llu schedule op %ld on "
+   pr_debug("%s: for %llu schedule op %d,%u on "
 "replacement disc %d\n",
__func__, (unsigned long long)sh->sector,
-   rbi->bi_rw, i);
+   rbi->bi_op, rbi->bi_rw, i);
atomic_inc(&sh->count);
if (sh != head_sh)
atomic_inc(&head_sh->count);
@@ -1093,8 +1093,9 @@ again:
if (!rdev && !rrdev) {
if (op_is_write(op))
set_bit(STRIPE_DEGRADED, &sh->state);
-   pr_debug("skip op %ld on disc %d for sector %llu\n",
-   bi->bi_rw, i, (unsigned long long)sh->sector);
+   pr_debug("skip op %d,%u on disc %d for sector %llu\n",
+bi->bi_op, bi->bi_rw, i,
+(unsigned long long)sh->sector);
clear_bit(R5_LOCKED, &sh->dev[i].flags);
set_bit(STRIPE_HANDLE, &sh->state);
}
diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
index c4a48e8..921a858 100644
--- a/fs/btrfs/check-integrity.c
+++ b/fs/btrfs/check-integrity.c
@@ -2943,7 +2943,7 @@ static void __btrfsic_submit_bio(struct bio *bio)
if (dev_state->state->print_mask &
BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH)
printk(KERN_INFO
-  "submit_bio(rw=%d,0x%lx, bi_vcnt=%u,"
+  "submit_bio(rw=%d,0x%x, bi_vcnt=%u,"
   " bi_sector=%llu (bytenr %llu), bi_bdev=%p)\n",
   bio->bi_op, bio->bi_rw, bio->bi_vcnt,
   (unsigned long long)bio->bi_iter.bi_sector,
@@ -2986,7 +2986,7 @@ static void __btrfsic_submit_bio(struct bio *bio)
if (dev_state->s

Broken file system after sudden power off

2016-04-13 Thread Michael Laß
Hi,

recently I had a bit of a file system corruption after a sudden system
power off. I could fix the file system using btrfsck but before that I
dumped the FS to allow finding the root cause if this turns out to be a
bug.

My system is running Linux 4.5.0 and the file system lies on top of dm-
crypt and LVM, so the hierarchy looks like the following:
btrfs > dm-crypt > LVM LV > SINGLE PV = MBR partition > SSD

After the sudden power off the issue initially showed up as a non-
responsive system. I booted into a recovery system and issued btrfsck.
Here's the output:

checking extents
checking free space cache
checking fs roots
root 259 inode 37727 errors 200, dir isize wrong
root 259 inode 37820 errors 200, dir isize wrong
root 259 inode 1890803 errors 1, no inode item
unresolved ref dir 37727 index 122063 namelen 8 name Channels filetype 1 errors 5, no dir item, no inode ref
root 259 inode 1890826 errors 1, no inode item
unresolved ref dir 37820 index 95273 namelen 17 name TransportSecurity filetype 1 errors 5, no dir item, no inode ref
found 68391559241 bytes used err is 1
total csum bytes: 63782984
total tree bytes: 2501541888
total fs tree bytes: 2308063232
total extent tree bytes: 108118016
btree space waste bytes: 482660866
file data blocks allocated: 693883981824
 referenced 101900402688

Both "Channels" and "TransportSecurity" are part of a chromium
profile, and in fact the system became unresponsive as soon as chromium
was started. Running "btrfsck --repair" got rid of these errors.
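
For readers of the btrfsck output above, here is a minimal sketch of how
the "errors N" values can be read as bitmasks. It assumes the numbers are
printed in hexadecimal and that each set bit corresponds to one of the
labels printed after the comma. The bit positions in the two tables are
inferred only from the lines quoted in this report, not copied from the
btrfs-progs sources, and the helper name decode_errors is hypothetical.

```python
# Bit -> label tables inferred from the output quoted in this report
# (assumption: values are hex bitmasks; tables are incomplete).
INODE_ERRORS = {0: "no inode item", 9: "dir isize wrong"}
REF_ERRORS = {0: "no dir item", 2: "no inode ref"}


def decode_errors(mask_hex, table):
    """Return the labels for every bit set in a hex error mask."""
    mask = int(mask_hex, 16)
    labels = []
    for bit in range(mask.bit_length()):
        if mask & (1 << bit):
            # Bits not seen in the report are reported as unknown.
            labels.append(table.get(bit, "unknown bit %d" % bit))
    return labels


print(decode_errors("200", INODE_ERRORS))
print(decode_errors("1", INODE_ERRORS))
print(decode_errors("5", REF_ERRORS))
```

Read this way, "errors 5" on the two unresolved refs is two separate
flags (bit 0 and bit 2), which matches the two labels printed on those
lines.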

When mounting the image of the broken file system on another system, I
see that "Channels" and "TransportSecurity" are listed twice in an ls
on the corresponding directory. Also the following message appears in
dmesg:
init_special_inode: bogus i_mode (0) for inode loop0:1890816

If you want any more information on this I've still got the image of
the broken file system and I'd be glad to assist.

Cheers,
Michael


Re: [PATCH v2] btrfs: qgroup: Fix qgroup accounting when creating snapshot

2016-04-13 Thread Filipe Manana
On Tue, Apr 12, 2016 at 8:35 AM, Qu Wenruo  wrote:
> Current btrfs qgroup design implies a requirement that after calling
> btrfs_qgroup_account_extents() there must be a commit root switch.
>
> Normally this is OK, as btrfs_qgroup_account_extents() is only called
> inside btrfs_commit_transaction(), just before commit_cowonly_roots().
>
> However, there is an exception in create_pending_snapshot(), which
> calls btrfs_qgroup_account_extents() without any commit root switch.
>
> In case of creating a snapshot whose parent root is itself (create a
> snapshot of fs tree), it will corrupt qgroup by the following trace:
> (skipped unrelated data)
> ==
> btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 1
> qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 0, excl = 0
> qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 16384, excl = 16384
> btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 0
> ==
>
> The problem here is that in the first qgroup_account_extent() call,
> nr_new_roots of the extent is 1, which means its reference count was
> increased, and qgroup increased its rfer and excl.
>
> In the second qgroup_account_extent() call its reference count was
> decreased, but there is no commit root switch between the two calls.
> This leaves nr_old_roots unchanged, so the extent is simply ignored by
> qgroup and ends up wrongly accounted.
>
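
[A toy model of the sequence described above, as a rough sketch only. It
tracks a single qgroup and a single extent, and assumes that nr_old_roots
comes from the commit root while nr_new_roots comes from the current
root, as the quoted description states. The names account_extent and
snapshot_sequence are hypothetical and this is not the kernel code.]

```python
def account_extent(counters, num_bytes, nr_old_roots, nr_new_roots):
    """Toy update: only 0 -> N and N -> 0 transitions change counters."""
    if nr_old_roots == 0 and nr_new_roots > 0:
        counters["rfer"] += num_bytes
        counters["excl"] += num_bytes
    elif nr_old_roots > 0 and nr_new_roots == 0:
        counters["rfer"] -= num_bytes
        counters["excl"] -= num_bytes
    # Otherwise nothing changes: the extent is ignored.


def snapshot_sequence(switch_commit_root):
    """Reference an extent, then drop it, accounting after each step."""
    counters = {"rfer": 0, "excl": 0}
    committed_refs = 0  # view from the commit root (nr_old_roots)
    current_refs = 1    # the extent gains a reference
    account_extent(counters, 16384, committed_refs, current_refs)
    if switch_commit_root:
        committed_refs = current_refs  # commit root now sees the reference
    current_refs = 0    # the reference is dropped again
    account_extent(counters, 16384, committed_refs, current_refs)
    return counters


print(snapshot_sequence(switch_commit_root=False))
print(snapshot_sequence(switch_commit_root=True))
```

[Without the switch, the second call sees 0 old roots and 0 new roots, so
the 16384 bytes stay accounted although nothing references the extent any
more. With the switch, the counters return to zero.]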
> Fix it by calling commit_cowonly_roots() after qgroup_account_extent() in
> create_pending_snapshot(), with the needed preparation.
>
> Reported-by: Mark Fasheh 
> Signed-off-by: Qu Wenruo 
> ---
> changelog:
> v2:
>   Fix a soft lockup caused by missing switch_commit_root() call.
>   Fix a warning caused by dirty-but-not-committed root.
>
> Note:
>   This may be the dirtiest hack I have ever done.

I don't like it either. But, more importantly, I don't think this is
correct. See below.

>   There are already several different checks to decide whether a fs root
>   should be updated, from root->last_trans to root->commit_root ==
>   root->node.
>
>   With this patch, we must switch the root of at least related fs tree
>   and extent tree to allow qgroup to call
>   btrfs_qgroup_account_extents().
>   But this will break some transid judgement, as transid is already
>   updated to current transid.
>   (maybe we need a special sub-transid for qgroup use only?)
>
>   As long as the current qgroup code uses commit_root to determine
>   old_roots, there is no better idea though.
> ---
>  fs/btrfs/transaction.c | 96 
> +-
>  1 file changed, 71 insertions(+), 25 deletions(-)
>
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 43885e5..0f299a56 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -311,12 +311,13 @@ loop:
>   * when the transaction commits
>   */
>  static int record_root_in_trans(struct btrfs_trans_handle *trans,
> -  struct btrfs_root *root)
> +  struct btrfs_root *root,
> +  int force)
>  {
> -   if (test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
> -   root->last_trans < trans->transid) {
> +   if ((test_bit(BTRFS_ROOT_REF_COWS, &root->state) &&
> +   root->last_trans < trans->transid) || force) {
> WARN_ON(root == root->fs_info->extent_root);
> -   WARN_ON(root->commit_root != root->node);
> +   WARN_ON(root->commit_root != root->node && !force);
>
> /*
>  * see below for IN_TRANS_SETUP usage rules
> @@ -331,7 +332,7 @@ static int record_root_in_trans(struct btrfs_trans_handle *trans,
> smp_wmb();
>
> spin_lock(&root->fs_info->fs_roots_radix_lock);
> -   if (root->last_trans == trans->transid) {
> +   if (root->last_trans == trans->transid && !force) {
> spin_unlock(&root->fs_info->fs_roots_radix_lock);
> return 0;
> }
> @@ -402,7 +403,7 @@ int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
> return 0;
>
> mutex_lock(&root->fs_info->reloc_mutex);
> -   record_root_in_trans(trans, root);
> +   record_root_in_trans(trans, root, 0);
> mutex_unlock(&root->fs_info->reloc_mutex);
>
> return 0;
> @@ -1383,7 +1384,7 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
> dentry = pending->dentry;
> parent_inode = pending->dir;
> parent_root = BTRFS_I(parent_inode)->root;
> -   record_root_in_trans(trans, parent_root);
> +   record_root_in_trans(trans, parent_root, 0);
>
> cur_time = current_fs_time(parent_inode->i_sb);
>
> @@ -1420,7 +1421,7 @@ static noinline int create_pending_snapshot(

Re: enospace regression in 4.4

2016-04-13 Thread Henk Slager
On Tue, Apr 12, 2016 at 5:52 PM, Julian Taylor
 wrote:
> smaller testcase that shows the immediate enospc after fallocate -> rm,
> though I don't know if it is really related to the full filesystem
> bugging out as the balance does work if you wait a few seconds after the
> balance.
> But this sequence of commands did work in 4.2.
>
>  $ sudo btrfs fi show /dev/mapper/lvm-testing
> Label: none  uuid: 25889ba9-a957-415a-83b0-e34a62cb3212
> Total devices 1 FS bytes used 225.18MiB
> devid1 size 5.00GiB used 788.00MiB path /dev/mapper/lvm-testing
>
>  $ fallocate -l 4.4G test.dat
>  $ rm -f test.dat
>  $ sudo btrfs fi balance start -dusage=0 .
> ERROR: error during balancing '.': No space left on device
> There may be more info in syslog - try dmesg | tail

The effect is the same with kernel / progs v4.6.0-rc3 / v4.5.1.
It also doesn't matter whether   fallocate -l 4400M test.dat   or   dd
if=/dev/zero of=test.dat bs=1M count=4400   is used to create test.dat
(I was looking at the --dig-holes and --punch-hole options earlier and was
wondering if the use of fallocate would make a difference).


Re: Infinite loop/hang in btrfs_find_space_cluster()

2016-04-13 Thread Eric Wheeler
Hello all,

We just got this backtrace in 4.4.6 on an ARM AM335x (beaglebone 
compatible).  The trace looks similar to this one:
  http://permalink.gmane.org/gmane.comp.file-systems.btrfs/54734 but I 
don't have nice backtraces on this hardware (maybe hang traces are a 
compile option?).

In my case, the system appeared to hang but sysrq functions still worked 
and I was able to send a sysrq-(c)rash over serial.  The filesystem was 
just formatted in RAID1, and while I cannot access it because this hangs 
at boot, there can't be very much data yet.

This looks like the top of the relevant section, full trace below:

[   80.738518] [] (setup_cluster_no_bitmap [btrfs]) from [] (btrfs_find_space_cluster+0x10c/0x1dc [btrfs])

Any help you can offer would be greatly appreciated!

-Eric

[   80.005546] sysrq: SysRq : Trigger a crash
[   80.018339] pgd = c0004000
[   80.021160] [] *pgd=
[   80.024897] Internal error: Oops: 817 [#1] ARM
[   80.029531] Modules linked in: btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid0 multipath linear zram lz4_compress zsmalloc dm_thin_pool dm_persistent_data dm_bio_prison dm_snapshot dm_bufio dm_zero dm_mod raid1 md_mod
[   80.054411] CPU: 0 PID: 6 Comm: kworker/u2:0 Not tainted 4.4.6-vr3-4-g2dfa78e #15
[   80.062579] Hardware name: Generic AM33XX (Flattened Device Tree)
[   80.069476] Workqueue: btrfs-delalloc btrfs_delalloc_helper [btrfs]
[   80.076026] task: cc8344c0 ti: cc858000 task.ti: cc858000
[   80.081669] PC is at sysrq_handle_crash+0x28/0x30
[   80.086576] LR is at sysrq_handle_crash+0x24/0x30
[   80.091484] pc : []lr : []psr: 6193
[   80.091484] sp : cc8599e8  ip : cc8599e8  fp : cc8599fc
[   80.103460] r10: cc8ecb80  r9 : c06008ce  r8 : 0001
[   80.108909] r7 : 0007  r6 : 0063  r5 : c05d6308  r4 : 0001
[   80.115717] r3 :   r2 : c05d62c8  r1 :   r0 : 0063
[   80.122529] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   80.130063] Control: 10c5387d  Table: 8a42c019  DAC: 0051
[   80.136057] Process kworker/u2:0 (pid: 6, stack limit = 0xcc858210)
[   80.142595] Stack: (0xcc8599e8 to 0xcc85a000)
[   80.147145] 99e0:   c02936cc c05e34f4 cc859a24 cc859a00 
c0293d00 c02936d8
[   80.155683] 9a00: c05b2bc4 cca49010  0101 0002 0005 
cc859a34 cc859a28
[   80.164220] 9a20: c0294208 c0293c64 cc859a74 cc859a38 c02ad3fc c02941e4 
 
[   80.172758] 9a40: 0001 0001 00015200 cca22380 cc8ecb90  
 009b
[   80.181296] 9a60: c06008ce cc8ecb80 cc859aac cc859a78 c006652c c02ad050 
c00409f4 c00405e0
[   80.189833] 9a80:  cc8ecb80 cc8ecb90   cc804000 
cc15ec50 ccd02518
[   80.198373] 9aa0: cc859ac4 cc859ab0 c008 c00664d4 00022000 cc8ecb80 
cc859adc cc859ac8
[   80.206912] 9ac0: c00690ec c0066644 009b c05de0a0 cc859aec cc859ae0 
c0065e58 c006905c
[   80.215451] 9ae0: cc859b14 cc859af0 c0065f7c c0065e44 cc859b30 c06342c0 
6013 
[   80.223990] 9b00: cc859b64 00201000 cc859b2c cc859b18 c00093b8 c0065f34 
bf187974 bf18799c
[   80.232529] 9b20: cc859bcc cc859b30 c0412e14 c000938c cc15ec50 cc15fba8 
00201000 
[   80.241066] 9b40: 5000  cc859bf8 ca144880 00201000 cc15ec50 
ccd02518 cc859bcc
[   80.249604] 9b60: cc859b80 cc859b80 bf187974 bf18799c 6013  
0051 c020aba4
[   80.258142] 9b80:  0001  cc859bac cc859bf8 c0203d08 
2013 ccd02518
[   80.266680] 9ba0: 0051 00201000  00201000  00201000 
 cc859bf8
[   80.275218] 9bc0: cc859c2c cc859bd0 bf18b3b8 bf18791c 41c0  
00201000 
[   80.283756] 9be0: 00201000  00201000  ccd02518 ccb31200 
cc15fbd4 cc15fbd4
[   80.292293] 9c00: cce624d0 1000   0001 ccb31200 
0001 ccd02518
[   80.300832] 9c20: cc859cdc cc859c30 bf12c868 bf18b2b8 41c0  
1000 
[   80.309369] 9c40: 0020   cc13aef0 cc859c8c 0001 
0002 0001
[   80.317906] 9c60: 0001 0001 0001 cce62570 0001 ca157800 
1000 
[   80.326442] 9c80: cce62500 0001 bf11e508 bf11da44 0001  
1000 
[   80.334979] 9ca0:   0020  0001 0011 
0001 cc859daf
[   80.343517] 9cc0: 0001 1000  ccdf7000 cc859d4c cc859ce0 
bf12cff4 bf12c1f8
[   80.352054] 9ce0:   41c0  cc859daf 0001 
0011 
[   80.360590] 9d00: 0001  1000   0001 
0011 
[   80.369127] 9d20:   cc859daf 0001  cc126350 
 ccdf7000
[   80.377664] 9d40: cc859dec cc859d50 bf145250 bf12cef4 1000  
 
[   80.386201] 9d60: 41c0  cc859daf 0001 0001 ca264f80 
 
[   80.394737] 9d80: cc8e0480  ccdf7000 cfdc3e6