[PATCH v2 1/9] btrfs: Cleanup the unused struct async_sched.

2013-09-12 Thread Qu Wenruo
The struct async_sched is not used by any code and can be removed.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 fs/btrfs/volumes.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 78b8717..12eaf89 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5031,13 +5031,6 @@ static void btrfs_end_bio(struct bio *bio, int err)
}
 }
 
-struct async_sched {
-   struct bio *bio;
-   int rw;
-   struct btrfs_fs_info *info;
-   struct btrfs_work work;
-};
-
 /*
  * see run_scheduled_bios for a description of why bios are collected for
  * async submit.
-- 
1.8.4

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/9] btrfs: Replace the btrfs_workers with kernel workqueue

2013-09-12 Thread Qu Wenruo
Use the kernel workqueue, together with a new btrfs_workqueue_struct built
on top of it, to replace the old btrfs_workers.
The main goals are to reduce redundant code (800 lines vs 200 lines) and
to benefit from the latest workqueue changes.

As for performance, I tested with bonnie++ and there seems to be no
significant regression.

Compared with the 3.10 kernel, the patched kernel shows the following
differences on an HDD in a two-way, 4-core server (10 runs each, comparing
the averages):

putc:           -0.97%
getc:           +1.48%
random_del:     +2.38%
random_create:  -2.27%
seq_del:        +0.94%

Other changes are smaller than 0.5% and can be ignored.
Since the testing so far is limited and the results may be unstable,
further tests are welcome.

--
Changelog:
v1->v2: In patch 2/9:
  Add ret = -ENOMEM for some workqueue allocation failures in scrub.c.
  Add an allocation check for qgroup_rescan_workers.
--
Qu Wenruo (9):
  btrfs: Cleanup the unused struct async_sched.
  btrfs: use kernel workqueue to replace the btrfs_workers functions
  btrfs: Added btrfs_workqueue_struct implemented ordered execution
based on kernel workqueue
  btrfs: Add high priority workqueue support for btrfs_workqueue_struct
  btrfs: Use btrfs_workqueue_struct to replace the fs_info->workers
  btrfs: Use btrfs_workqueue_struct to replace the
fs_info->delalloc_workers
  btrfs: Replace the fs_info->submit_workers with kernel workqueue.
  btrfs: Cleanup the old btrfs workqueue
  btrfs: Replace thread_pool_size with workqueue default value

 fs/btrfs/Makefile        |   5 +-
 fs/btrfs/async-thread.c  | 714 ---
 fs/btrfs/async-thread.h  | 119 
 fs/btrfs/bwq.c           | 136 +
 fs/btrfs/bwq.h           |  67 +
 fs/btrfs/ctree.h         |  46 ++-
 fs/btrfs/delayed-inode.c |   9 +-
 fs/btrfs/dev-replace.c   |   1 -
 fs/btrfs/disk-io.c       | 238 ++--
 fs/btrfs/extent-tree.c   |   6 +-
 fs/btrfs/inode.c         |  57 ++--
 fs/btrfs/ordered-data.c  |  11 +-
 fs/btrfs/ordered-data.h  |   4 +-
 fs/btrfs/qgroup.c        |  16 +-
 fs/btrfs/raid56.c        |  38 ++-
 fs/btrfs/reada.c         |   8 +-
 fs/btrfs/relocation.c    |   1 -
 fs/btrfs/scrub.c         |  78 +++---
 fs/btrfs/super.c         |  41 ++-
 fs/btrfs/volumes.c       |  25 +-
 fs/btrfs/volumes.h       |   3 +-
 21 files changed, 451 insertions(+), 1172 deletions(-)
 delete mode 100644 fs/btrfs/async-thread.c
 delete mode 100644 fs/btrfs/async-thread.h
 create mode 100644 fs/btrfs/bwq.c
 create mode 100644 fs/btrfs/bwq.h

-- 
1.8.4


[PATCH v2 9/9] btrfs: Replace thread_pool_size with workqueue default value

2013-09-12 Thread Qu Wenruo
The original btrfs_workers uses fs_info->thread_pool_size as max_active,
and the previous patches followed this approach.

But the kernel workqueue has a default max_active value (selected by
passing 0), and the workqueue itself has a threshold mechanism to prevent
creating too many threads, so we should use the default value.

Since the thread_pool_size algorithm is no longer used, the related code
should also be changed.
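
The effect on the async submit limit can be sketched in userspace C. This is
only a minimal model: `struct fs_info_model`, `min_ul()` and
`async_submit_limit()` are hypothetical names that stand in for the two
btrfs_fs_info fields and the min_t() call the patch touches; it is not the
kernel code itself.

```c
/* Hypothetical stand-in for the two btrfs_fs_info fields the patch reads. */
struct fs_info_model {
    unsigned long thread_pool_size; /* 0 means "use the workqueue default" */
    unsigned long open_devices;
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

/*
 * Models the patched btrfs_async_submit_limit(): when no explicit
 * thread_pool_size is set, only the number of open devices bounds the limit.
 */
unsigned long async_submit_limit(const struct fs_info_model *info)
{
    unsigned long limit;

    limit = info->thread_pool_size ?
        min_ul(info->thread_pool_size, info->open_devices) :
        info->open_devices;
    return 256 * limit;
}
```

With thread_pool_size left at 0, a filesystem with 4 open devices gets a
limit of 256 * 4 in this model; an explicit thread_pool_size still caps the
limit as before.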

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 fs/btrfs/disk-io.c | 12 +++-
 fs/btrfs/super.c   |  3 +--
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index a61e1fe..0446d27 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -750,9 +750,11 @@ int btrfs_bio_wq_end_io(struct btrfs_fs_info *info, struct bio *bio,
 
 unsigned long btrfs_async_submit_limit(struct btrfs_fs_info *info)
 {
-   unsigned long limit = min_t(unsigned long,
-                               info->thread_pool_size,
-                               info->fs_devices->open_devices);
+   unsigned long limit;
+   limit = info->thread_pool_size ?
+       min_t(unsigned long, info->thread_pool_size,
+             info->fs_devices->open_devices) :
+       info->fs_devices->open_devices;
    return 256 * limit;
 }
 
@@ -2191,8 +2193,8 @@ int open_ctree(struct super_block *sb,
    INIT_RADIX_TREE(&fs_info->reada_tree, GFP_NOFS & ~__GFP_WAIT);
    spin_lock_init(&fs_info->reada_lock);
 
-   fs_info->thread_pool_size = min_t(unsigned long,
-                                     num_online_cpus() + 2, 8);
+   /* use the default value of kernel workqueue */
+   fs_info->thread_pool_size = 0;
 
    INIT_LIST_HEAD(&fs_info->ordered_roots);
    spin_lock_init(&fs_info->ordered_root_lock);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 63e653c..ccf412f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -898,8 +898,7 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
    if (info->alloc_start != 0)
        seq_printf(seq, ",alloc_start=%llu",
                   (unsigned long long)info->alloc_start);
-   if (info->thread_pool_size !=  min_t(unsigned long,
-                                        num_online_cpus() + 2, 8))
+   if (info->thread_pool_size)
        seq_printf(seq, ",thread_pool=%d", info->thread_pool_size);
    if (btrfs_test_opt(root, COMPRESS)) {
        if (info->compress_type == BTRFS_COMPRESS_ZLIB)
-- 
1.8.4



[PATCH v2 8/9] btrfs: Cleanup the old btrfs workqueue

2013-09-12 Thread Qu Wenruo
Since the previous patches implemented the new kernel-workqueue-based
btrfs_workqueue_struct, the old btrfs workqueue (btrfs_workers) can be
removed safely.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 fs/btrfs/Makefile   |   2 +-
 fs/btrfs/async-thread.c | 714 
 fs/btrfs/async-thread.h | 119 
 fs/btrfs/ctree.h|   3 -
 fs/btrfs/dev-replace.c  |   1 -
 fs/btrfs/disk-io.c  |  25 +-
 fs/btrfs/raid56.c   |   1 -
 fs/btrfs/relocation.c   |   1 -
 fs/btrfs/super.c|   8 -
 fs/btrfs/volumes.c  |   1 -
 fs/btrfs/volumes.h  |   1 -
 11 files changed, 10 insertions(+), 866 deletions(-)
 delete mode 100644 fs/btrfs/async-thread.c
 delete mode 100644 fs/btrfs/async-thread.h

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index d7439df..e2162af 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -5,7 +5,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o 
root-tree.o dir-item.o \
   file-item.o inode-item.o inode-map.o disk-io.o \
   transaction.o inode.o file.o tree-defrag.o \
   extent_map.o sysfs.o struct-funcs.o xattr.o ordered-data.o \
-  extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
+  extent_io.o volumes.o ioctl.o locking.o orphan.o \
   export.o tree-log.o free-space-cache.o zlib.o lzo.o \
   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
deleted file mode 100644
index 58b7d14..000
--- a/fs/btrfs/async-thread.c
+++ /dev/null
@@ -1,714 +0,0 @@
-/*
- * Copyright (C) 2007 Oracle.  All rights reserved.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public
- * License v2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- *
- * You should have received a copy of the GNU General Public
- * License along with this program; if not, write to the
- * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
- * Boston, MA 021110-1307, USA.
- */
-
-#include <linux/kthread.h>
-#include <linux/slab.h>
-#include <linux/list.h>
-#include <linux/spinlock.h>
-#include <linux/freezer.h>
-#include "async-thread.h"
-
-#define WORK_QUEUED_BIT 0
-#define WORK_DONE_BIT 1
-#define WORK_ORDER_DONE_BIT 2
-#define WORK_HIGH_PRIO_BIT 3
-
-/*
- * container for the kthread task pointer and the list of pending work
- * One of these is allocated per thread.
- */
-struct btrfs_worker_thread {
-   /* pool we belong to */
-   struct btrfs_workers *workers;
-
-   /* list of struct btrfs_work that are waiting for service */
-   struct list_head pending;
-   struct list_head prio_pending;
-
-   /* list of worker threads from struct btrfs_workers */
-   struct list_head worker_list;
-
-   /* kthread */
-   struct task_struct *task;
-
-   /* number of things on the pending list */
-   atomic_t num_pending;
-
-   /* reference counter for this struct */
-   atomic_t refs;
-
-   unsigned long sequence;
-
-   /* protects the pending list. */
-   spinlock_t lock;
-
-   /* set to non-zero when this thread is already awake and kicking */
-   int working;
-
-   /* are we currently idle */
-   int idle;
-};
-
-static int __btrfs_start_workers(struct btrfs_workers *workers);
-
-/*
- * btrfs_start_workers uses kthread_run, which can block waiting for memory
- * for a very long time.  It will actually throttle on page writeback,
- * and so it may not make progress until after our btrfs worker threads
- * process all of the pending work structs in their queue
- *
- * This means we can't use btrfs_start_workers from inside a btrfs worker
- * thread that is used as part of cleaning dirty memory, which pretty much
- * involves all of the worker threads.
- *
- * Instead we have a helper queue who never has more than one thread
- * where we scheduler thread start operations.  This worker_start struct
- * is used to contain the work and hold a pointer to the queue that needs
- * another worker.
- */
-struct worker_start {
-   struct btrfs_work work;
-   struct btrfs_workers *queue;
-};
-
-static void start_new_worker_func(struct btrfs_work *work)
-{
-   struct worker_start *start;
-   start = container_of(work, struct worker_start, work);
-   __btrfs_start_workers(start->queue);
-   kfree(start);
-}
-
-/*
- * helper function to move a thread onto the idle list after it
- * has finished some requests.
- */
-static void check_idle_worker(struct btrfs_worker_thread *worker)
-{
-   if (!worker-idle  

[PATCH v2 3/9] btrfs: Added btrfs_workqueue_struct implemented ordered execution based on kernel workqueue

2013-09-12 Thread Qu Wenruo
Use the kernel workqueue to implement a new btrfs_workqueue_struct, which
provides ordered execution like btrfs_workers.

The func callbacks are executed concurrently, while
ordered_func/ordered_free are executed in the order in which the works were
queued, each after its corresponding func is done.
The new btrfs_workqueue uses 2 workqueues to implement the original
btrfs_workers behavior: one for normal work and one for ordered work.

In this patch, the high priority workqueue is not added yet.
The high priority feature will be added in a following patch.
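
The ordering guarantee described above can be illustrated with a small
single-threaded model in userspace C. Note that this is not how the patch
does it: the patch queues an ordered helper that calls wait_for_completion()
on an ordered workqueue, whereas the sketch below simply drains the head of
the queue whenever a func finishes. All names (`struct work_model`,
`model_queue_work()`, `model_func_finished()`) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORKS 16

/* Hypothetical model of one queued work item. */
struct work_model {
    int id;
    bool func_done;     /* set once the concurrent "func" stage finished */
    bool ordered_done;  /* set once the "ordered_func" stage ran */
};

struct ordered_queue_model {
    struct work_model works[MAX_WORKS]; /* kept in queueing order */
    size_t count;
    size_t next_ordered;                /* first work not yet ordered */
    int ordered_log[MAX_WORKS];         /* ids in ordered_func order */
    size_t logged;
};

/* Queue a work item; returns its slot index, or -1 if the model is full. */
int model_queue_work(struct ordered_queue_model *q, int id)
{
    if (q->count >= MAX_WORKS)
        return -1;
    q->works[q->count] = (struct work_model){ .id = id };
    return (int)q->count++;
}

/*
 * Called when "func" of the work in the given slot finishes, in any order.
 * Runs the ordered stage for every work at the head of the queue whose
 * func is done, and stops at the first one that is still running.
 */
void model_func_finished(struct ordered_queue_model *q, int slot)
{
    q->works[slot].func_done = true;
    while (q->next_ordered < q->count &&
           q->works[q->next_ordered].func_done) {
        struct work_model *w = &q->works[q->next_ordered++];

        w->ordered_done = true;              /* "ordered_func" runs here */
        q->ordered_log[q->logged++] = w->id;
    }
}
```

In this model, if three works are queued and the third finishes its func
first, nothing is logged until the first one finishes; the ordered stage
then always observes the works in queueing order.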

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 fs/btrfs/Makefile |   3 +-
 fs/btrfs/bwq.c| 109 ++
 fs/btrfs/bwq.h|  59 +
 3 files changed, 170 insertions(+), 1 deletion(-)
 create mode 100644 fs/btrfs/bwq.c
 create mode 100644 fs/btrfs/bwq.h

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 3932224..d7439df 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -8,7 +8,8 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o 
root-tree.o dir-item.o \
   extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
   export.o tree-log.o free-space-cache.o zlib.o lzo.o \
   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
-  reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o
+  reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
+  bwq.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/bwq.c b/fs/btrfs/bwq.c
new file mode 100644
index 000..feccf21
--- /dev/null
+++ b/fs/btrfs/bwq.c
@@ -0,0 +1,109 @@
+/*
+ * Copyright (C) 2013 Fujitsu.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#include <linux/slab.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/freezer.h>
+#include <linux/workqueue.h>
+#include <linux/completion.h>
+#include <linux/spinlock.h>
+#include "bwq.h"
+
+struct btrfs_workqueue_struct *btrfs_alloc_workqueue(char *name,
+                                                     char *ordered_name,
+                                                     int max_active)
+{
+   int wq_flags = WQ_UNBOUND | WQ_MEM_RECLAIM;
+   struct btrfs_workqueue_struct *ret = kzalloc(sizeof(*ret), GFP_NOFS);
+   if (unlikely(!ret))
+       return NULL;
+   ret->normal_wq = alloc_workqueue(name, wq_flags, max_active);
+   if (unlikely(!ret->normal_wq)) {
+       kfree(ret);
+       return NULL;
+   }
+
+   ret->ordered_wq = alloc_ordered_workqueue(ordered_name,
+                                             WQ_MEM_RECLAIM);
+   if (unlikely(!ret->ordered_wq)) {
+       destroy_workqueue(ret->normal_wq);
+       kfree(ret);
+       return NULL;
+   }
+
+   spin_lock_init(&ret->insert_lock);
+   return ret;
+}
+
+/*
+ * When in out-of-order mode(SSD), high concurrency is OK, so no need
+ * to do the completion things, just call the ordered_func after the
+ * normal work is done
+ */
+
+static void normal_work_helper(struct work_struct *arg)
+{
+   struct btrfs_work_struct *work;
+   work = container_of(arg, struct btrfs_work_struct, normal_work);
+   work->func(work);
+   complete(&work->normal_completion);
+}
+
+static void ordered_work_helper(struct work_struct *arg)
+{
+   struct btrfs_work_struct *work;
+   work = container_of(arg, struct btrfs_work_struct, ordered_work);
+   wait_for_completion(&work->normal_completion);
+   work->ordered_func(work);
+   if (work->ordered_free)
+       work->ordered_free(work);
+}
+
+void btrfs_init_work(struct btrfs_work_struct *work,
+                     void (*func)(struct btrfs_work_struct *),
+                     void (*ordered_func)(struct btrfs_work_struct *),
+                     void (*ordered_free)(struct btrfs_work_struct *))
+{
+   work->func = func;
+   work->ordered_func = ordered_func;
+   work->ordered_free = ordered_free;
+   init_completion(&work->normal_completion);
+}
+
+void btrfs_queue_work(struct btrfs_workqueue_struct *wq,
+                      struct btrfs_work_struct *work)
+{
+   INIT_WORK(&work->normal_work, normal_work_helper);
+   

[PATCH v2 2/9] btrfs: use kernel workqueue to replace the btrfs_workers functions

2013-09-12 Thread Qu Wenruo
Use the kernel workqueue to replace the btrfs_workers that are only
used as normal workqueues.

Other btrfs_workers use extra features such as requeue, high priority
and ordered work.
Those btrfs_workers are not touched in this patch.

The following btrfs_workers are left untouched:

generic_worker:     Helper for other btrfs_workers
workers:            Uses the ordering and high priority features
delalloc_workers:   Uses the ordering feature
submit_workers:     Uses the requeue feature

All other workers can be replaced with the kernel workqueue directly.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 fs/btrfs/ctree.h |  39 +--
 fs/btrfs/delayed-inode.c |   9 ++-
 fs/btrfs/disk-io.c   | 164 ++-
 fs/btrfs/extent-tree.c   |   6 +-
 fs/btrfs/inode.c |  38 +--
 fs/btrfs/ordered-data.c  |  11 ++--
 fs/btrfs/ordered-data.h  |   4 +-
 fs/btrfs/qgroup.c|  16 ++---
 fs/btrfs/raid56.c|  37 +--
 fs/btrfs/reada.c |   8 +--
 fs/btrfs/scrub.c |  84 
 fs/btrfs/super.c |  23 ---
 12 files changed, 196 insertions(+), 243 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index e795bf1..0dd6ec9 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1202,7 +1202,7 @@ struct btrfs_caching_control {
struct list_head list;
struct mutex mutex;
wait_queue_head_t wait;
-   struct btrfs_work work;
+   struct work_struct work;
struct btrfs_block_group_cache *block_group;
u64 progress;
atomic_t count;
@@ -1479,25 +1479,26 @@ struct btrfs_fs_info {
struct btrfs_workers generic_worker;
struct btrfs_workers workers;
struct btrfs_workers delalloc_workers;
-   struct btrfs_workers flush_workers;
-   struct btrfs_workers endio_workers;
-   struct btrfs_workers endio_meta_workers;
-   struct btrfs_workers endio_raid56_workers;
-   struct btrfs_workers rmw_workers;
-   struct btrfs_workers endio_meta_write_workers;
-   struct btrfs_workers endio_write_workers;
-   struct btrfs_workers endio_freespace_worker;
struct btrfs_workers submit_workers;
-   struct btrfs_workers caching_workers;
-   struct btrfs_workers readahead_workers;
+
+   struct workqueue_struct *flush_workers;
+   struct workqueue_struct *endio_workers;
+   struct workqueue_struct *endio_meta_workers;
+   struct workqueue_struct *endio_raid56_workers;
+   struct workqueue_struct *rmw_workers;
+   struct workqueue_struct *endio_meta_write_workers;
+   struct workqueue_struct *endio_write_workers;
+   struct workqueue_struct *endio_freespace_worker;
+   struct workqueue_struct *caching_workers;
+   struct workqueue_struct *readahead_workers;
 
/*
 * fixup workers take dirty pages that didn't properly go through
 * the cow mechanism and make them safe to write.  It happens
 * for the sys_munmap function call path
 */
-   struct btrfs_workers fixup_workers;
-   struct btrfs_workers delayed_workers;
+   struct workqueue_struct *fixup_workers;
+   struct workqueue_struct *delayed_workers;
struct task_struct *transaction_kthread;
struct task_struct *cleaner_kthread;
int thread_pool_size;
@@ -1576,9 +1577,9 @@ struct btrfs_fs_info {
wait_queue_head_t scrub_pause_wait;
struct rw_semaphore scrub_super_lock;
int scrub_workers_refcnt;
-   struct btrfs_workers scrub_workers;
-   struct btrfs_workers scrub_wr_completion_workers;
-   struct btrfs_workers scrub_nocow_workers;
+   struct workqueue_struct *scrub_workers;
+   struct workqueue_struct *scrub_wr_completion_workers;
+   struct workqueue_struct *scrub_nocow_workers;
 
 #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
u32 check_integrity_print_mask;
@@ -1619,9 +1620,9 @@ struct btrfs_fs_info {
/* qgroup rescan items */
struct mutex qgroup_rescan_lock; /* protects the progress item */
struct btrfs_key qgroup_rescan_progress;
-   struct btrfs_workers qgroup_rescan_workers;
+   struct workqueue_struct *qgroup_rescan_workers;
struct completion qgroup_rescan_completion;
-   struct btrfs_work qgroup_rescan_work;
+   struct work_struct qgroup_rescan_work;
 
/* filesystem state */
unsigned long fs_state;
@@ -3542,7 +3543,7 @@ struct btrfs_delalloc_work {
int delay_iput;
struct completion completion;
struct list_head list;
-   struct btrfs_work work;
+   struct work_struct work;
 };
 
 struct btrfs_delalloc_work *btrfs_alloc_delalloc_work(struct inode *inode,
diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 5615eac..2b8da0a7 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -1258,10 +1258,10 @@ void 

[PATCH v2 5/9] btrfs: Use btrfs_workqueue_struct to replace the fs_info->workers

2013-09-12 Thread Qu Wenruo
Use the newly created btrfs_workqueue_struct to replace the original
fs_info->workers.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 fs/btrfs/ctree.h   |  2 +-
 fs/btrfs/disk-io.c | 36 +++-
 fs/btrfs/super.c   |  3 ++-
 3 files changed, 18 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 0dd6ec9..2662ef2 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1477,10 +1477,10 @@ struct btrfs_fs_info {
 * two
 */
struct btrfs_workers generic_worker;
-   struct btrfs_workers workers;
struct btrfs_workers delalloc_workers;
struct btrfs_workers submit_workers;
 
+   struct btrfs_workqueue_struct *workers;
struct workqueue_struct *flush_workers;
struct workqueue_struct *endio_workers;
struct workqueue_struct *endio_meta_workers;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d02a552..364c409 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -48,6 +48,7 @@
 #include "rcu-string.h"
 #include "dev-replace.h"
 #include "raid56.h"
+#include "bwq.h"
 
 #ifdef CONFIG_X86
 #include <asm/cpufeature.h>
@@ -108,7 +109,7 @@ struct async_submit_bio {
 * can't tell us where in the file the bio should go
 */
u64 bio_offset;
-   struct btrfs_work work;
+   struct btrfs_work_struct work;
int error;
 };
 
@@ -751,12 +752,12 @@ int btrfs_bio_wq_end_io(struct btrfs_fs_info *info, struct bio *bio,
 unsigned long btrfs_async_submit_limit(struct btrfs_fs_info *info)
 {
    unsigned long limit = min_t(unsigned long,
-                               info->workers.max_workers,
+                               info->thread_pool_size,
                                info->fs_devices->open_devices);
    return 256 * limit;
 }
 
-static void run_one_async_start(struct btrfs_work *work)
+static void run_one_async_start(struct btrfs_work_struct *work)
 {
struct async_submit_bio *async;
int ret;
@@ -769,7 +770,7 @@ static void run_one_async_start(struct btrfs_work *work)
        async->error = ret;
 }
 
-static void run_one_async_done(struct btrfs_work *work)
+static void run_one_async_done(struct btrfs_work_struct *work)
 {
struct btrfs_fs_info *fs_info;
struct async_submit_bio *async;
@@ -796,7 +797,7 @@ static void run_one_async_done(struct btrfs_work *work)
                           async->bio_offset);
 }
 
-static void run_one_async_free(struct btrfs_work *work)
+static void run_one_async_free(struct btrfs_work_struct *work)
 {
struct async_submit_bio *async;
 
@@ -824,11 +825,9 @@ int btrfs_wq_submit_bio(struct btrfs_fs_info *fs_info, struct inode *inode,
    async->submit_bio_start = submit_bio_start;
    async->submit_bio_done = submit_bio_done;
 
-   async->work.func = run_one_async_start;
-   async->work.ordered_func = run_one_async_done;
-   async->work.ordered_free = run_one_async_free;
+   btrfs_init_work(&async->work, run_one_async_start,
+                   run_one_async_done, run_one_async_free);
 
-   async->work.flags = 0;
    async->bio_flags = bio_flags;
    async->bio_offset = bio_offset;
 
@@ -837,9 +836,9 @@ int btrfs_wq_submit_bio(struct btrfs_fs_info *fs_info, struct inode *inode,
    atomic_inc(&fs_info->nr_async_submits);
 
    if (rw & REQ_SYNC)
-       btrfs_set_work_high_prio(&async->work);
+       btrfs_set_work_high_priority(&async->work);
 
-   btrfs_queue_worker(&fs_info->workers, &async->work);
+   btrfs_queue_work(fs_info->workers, &async->work);
 
    while (atomic_read(&fs_info->async_submit_draining) &&
          atomic_read(&fs_info->nr_async_submits)) {
@@ -1987,7 +1986,7 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
 {
    btrfs_stop_workers(&fs_info->generic_worker);
    btrfs_stop_workers(&fs_info->delalloc_workers);
-   btrfs_stop_workers(&fs_info->workers);
+   btrfs_destroy_workqueue(fs_info->workers);
    btrfs_stop_workers(&fs_info->submit_workers);
    destroy_workqueue(fs_info->fixup_workers);
    destroy_workqueue(fs_info->endio_workers);
@@ -2462,9 +2461,8 @@ int open_ctree(struct super_block *sb,
    btrfs_init_workers(&fs_info->generic_worker,
                       "genwork", 1, NULL);
 
-   btrfs_init_workers(&fs_info->workers, "worker",
-                      fs_info->thread_pool_size,
-                      &fs_info->generic_worker);
+   fs_info->workers = btrfs_alloc_workqueue("worker", "worker-ordered",
+                                            "worker-high", max_active);
 
btrfs_init_workers(fs_info-delalloc_workers, delalloc,
   fs_info-thread_pool_size,
@@ -2478,9 +2476,6 @@ int open_ctree(struct super_block *sb,
   fs_info-generic_worker);
fs_info-caching_workers = alloc_workqueue(cache, flags, 2);
 
-   fs_info-workers.idle_thresh = 16;
-   

[PATCH v2 6/9] btrfs: Use btrfs_workqueue_struct to replace the fs_info->delalloc_workers

2013-09-12 Thread Qu Wenruo
Much like fs_info->workers, replace fs_info->delalloc_workers with the
same btrfs_workqueue_struct.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 fs/btrfs/ctree.h   |  2 +-
 fs/btrfs/disk-io.c | 13 +
 fs/btrfs/inode.c   | 19 +--
 fs/btrfs/super.c   |  2 +-
 4 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 2662ef2..81aba0e 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1477,10 +1477,10 @@ struct btrfs_fs_info {
 * two
 */
struct btrfs_workers generic_worker;
-   struct btrfs_workers delalloc_workers;
struct btrfs_workers submit_workers;
 
struct btrfs_workqueue_struct *workers;
+   struct btrfs_workqueue_struct *delalloc_workers;
struct workqueue_struct *flush_workers;
struct workqueue_struct *endio_workers;
struct workqueue_struct *endio_meta_workers;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 364c409..fd795b6 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1985,7 +1985,7 @@ static noinline int next_root_backup(struct btrfs_fs_info *info,
 static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
 {
    btrfs_stop_workers(&fs_info->generic_worker);
-   btrfs_stop_workers(&fs_info->delalloc_workers);
+   btrfs_destroy_workqueue(fs_info->delalloc_workers);
    btrfs_destroy_workqueue(fs_info->workers);
    btrfs_stop_workers(&fs_info->submit_workers);
    destroy_workqueue(fs_info->fixup_workers);
@@ -2464,9 +2464,9 @@ int open_ctree(struct super_block *sb,
    fs_info->workers = btrfs_alloc_workqueue("worker", "worker-ordered",
                                             "worker-high", max_active);
 
-   btrfs_init_workers(&fs_info->delalloc_workers, "delalloc",
-                      fs_info->thread_pool_size,
-                      &fs_info->generic_worker);
+   fs_info->delalloc_workers = btrfs_alloc_workqueue("delalloc",
+                                                     "delalloc-ordered",
+                                                     NULL, max_active);
 
    fs_info->flush_workers = alloc_workqueue("flush_delalloc", flags,
                                             max_active);
@@ -2476,9 +2476,6 @@ int open_ctree(struct super_block *sb,
                       &fs_info->generic_worker);
    fs_info->caching_workers = alloc_workqueue("cache", flags, 2);
 
-   fs_info->delalloc_workers.idle_thresh = 2;
-   fs_info->delalloc_workers.ordered = 1;
-
    fs_info->fixup_workers = alloc_workqueue("fixup", flags, 1);
    fs_info->endio_workers = alloc_workqueue("endio", flags, max_active);
    fs_info->endio_meta_workers = alloc_workqueue("endio-meta", flags,
@@ -2503,11 +2500,11 @@ int open_ctree(struct super_block *sb,
     * return -ENOMEM if any of these fail.
     */
    ret = btrfs_start_workers(&fs_info->generic_worker);
-   ret |= btrfs_start_workers(&fs_info->delalloc_workers);
    ret |= btrfs_start_workers(&fs_info->submit_workers);
 
    if (ret || !(fs_info->flush_workers && fs_info->endio_workers &&
         fs_info->endio_meta_workers && fs_info->workers &&
+        fs_info->delalloc_workers &&
         fs_info->endio_raid56_workers &&
         fs_info->rmw_workers && fs_info->qgroup_rescan_workers &&
         fs_info->endio_meta_write_workers &&
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 53901a5..0ae21a6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -59,6 +59,7 @@
 #include "inode-map.h"
 #include "backref.h"
 #include "hash.h"
+#include "bwq.h"
 
 struct btrfs_iget_args {
u64 ino;
@@ -295,7 +296,7 @@ struct async_cow {
u64 start;
u64 end;
struct list_head extents;
-   struct btrfs_work work;
+   struct btrfs_work_struct work;
 };
 
 static noinline int add_async_extent(struct async_cow *cow,
@@ -1057,7 +1058,7 @@ static noinline int cow_file_range(struct inode *inode,
 /*
  * work queue call back to started compression on a file and pages
  */
-static noinline void async_cow_start(struct btrfs_work *work)
+static noinline void async_cow_start(struct btrfs_work_struct *work)
 {
struct async_cow *async_cow;
int num_added = 0;
@@ -1075,7 +1076,7 @@ static noinline void async_cow_start(struct btrfs_work 
*work)
 /*
  * work queue call back to submit previously compressed pages
  */
-static noinline void async_cow_submit(struct btrfs_work *work)
+static noinline void async_cow_submit(struct btrfs_work_struct *work)
 {
struct async_cow *async_cow;
struct btrfs_root *root;
@@ -1096,7 +1097,7 @@ static noinline void async_cow_submit(struct btrfs_work 
*work)
        submit_compressed_extents(async_cow->inode, async_cow);
 }
 
-static noinline void async_cow_free(struct btrfs_work *work)
+static noinline void async_cow_free(struct btrfs_work_struct *work)
 {
struct async_cow 

[PATCH v2 4/9] btrfs: Add high priority workqueue support for btrfs_workqueue_struct

2013-09-12 Thread Qu Wenruo
Add high priority workqueue support by adding a new workqueue to
btrfs_workqueue_struct.

Whether to use the high priority workqueue must be decided at
initialization.
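
The queue selection this implies can be sketched as follows. This is a
userspace model under stated assumptions: `struct wq_model`,
`struct bwq_model`, `struct bwork_model` and `pick_dest_wq()` are
hypothetical stand-ins for the kernel structures and for the selection step
inside btrfs_queue_work(); only the branch logic is meant to match the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in; a real workqueue_struct is opaque kernel state. */
struct wq_model {
    const char *name;
};

struct bwq_model {
    struct wq_model *normal_wq;
    struct wq_model *high_wq;   /* NULL when no high_name was given */
};

struct bwork_model {
    int high;                   /* set by the "set high priority" helper */
};

/*
 * Pick the destination queue: a work marked high priority only goes to the
 * high priority queue if that queue was created at initialization.
 */
struct wq_model *pick_dest_wq(const struct bwq_model *wq,
                              const struct bwork_model *work)
{
    if (work->high && wq->high_wq)
        return wq->high_wq;
    return wq->normal_wq;
}
```

A work marked high priority therefore silently falls back to the normal
queue when the btrfs_workqueue_struct was allocated without a high_name.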

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 fs/btrfs/bwq.c | 29 -
 fs/btrfs/bwq.h |  8 
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/bwq.c b/fs/btrfs/bwq.c
index feccf21..c2a089c 100644
--- a/fs/btrfs/bwq.c
+++ b/fs/btrfs/bwq.c
@@ -27,6 +27,7 @@
 
 struct btrfs_workqueue_struct *btrfs_alloc_workqueue(char *name,
 char *ordered_name,
+char *high_name,
 int max_active)
 {
int wq_flags = WQ_UNBOUND | WQ_MEM_RECLAIM;
@@ -46,6 +47,17 @@ struct btrfs_workqueue_struct *btrfs_alloc_workqueue(char *name,
        kfree(ret);
        return NULL;
    }
+   if (high_name) {
+       ret->high_wq = alloc_workqueue(high_name, wq_flags | WQ_HIGHPRI,
+                                      max_active);
+       if (unlikely(!ret->high_wq)) {
+           destroy_workqueue(ret->normal_wq);
+           destroy_workqueue(ret->ordered_wq);
+           kfree(ret);
+           return NULL;
+       }
+   }
+
 
    spin_lock_init(&ret->insert_lock);
    return ret;
@@ -89,10 +101,16 @@ void btrfs_init_work(struct btrfs_work_struct *work,
 void btrfs_queue_work(struct btrfs_workqueue_struct *wq,
                       struct btrfs_work_struct *work)
 {
+   struct workqueue_struct *dest_wq;
+   if (work->high && wq->high_wq)
+       dest_wq = wq->high_wq;
+   else
+       dest_wq = wq->normal_wq;
+
    INIT_WORK(&work->normal_work, normal_work_helper);
    INIT_WORK(&work->ordered_work, ordered_work_helper);
    spin_lock(&wq->insert_lock);
-   queue_work(wq->normal_wq, &work->normal_work);
+   queue_work(dest_wq, &work->normal_work);
    queue_work(wq->ordered_wq, &work->ordered_work);
    spin_unlock(&wq->insert_lock);
 }
@@ -100,10 +118,19 @@ void btrfs_queue_work(struct btrfs_workqueue_struct *wq,
 void btrfs_destroy_workqueue(struct btrfs_workqueue_struct *wq)
 {
    destroy_workqueue(wq->ordered_wq);
+   if (wq->high_wq)
+       destroy_workqueue(wq->high_wq);
    destroy_workqueue(wq->normal_wq);
 }
 
 void btrfs_workqueue_set_max(struct btrfs_workqueue_struct *wq, int max)
 {
    workqueue_set_max_active(wq->normal_wq, max);
+   if (wq->high_wq)
+       workqueue_set_max_active(wq->high_wq, max);
+}
+
+void btrfs_set_work_high_priority(struct btrfs_work_struct *work)
+{
+   work->high = 1;
 }
diff --git a/fs/btrfs/bwq.h b/fs/btrfs/bwq.h
index bf12c90..d9a7ded 100644
--- a/fs/btrfs/bwq.h
+++ b/fs/btrfs/bwq.h
@@ -22,6 +22,7 @@
 struct btrfs_workqueue_struct {
struct workqueue_struct *normal_wq;
struct workqueue_struct *ordered_wq;
+   struct workqueue_struct *high_wq;
 
/*
 * Spinlock to ensure that both ordered and normal work can
@@ -43,10 +44,16 @@ struct btrfs_work_struct {
struct work_struct normal_work;
struct work_struct ordered_work;
struct completion normal_completion;
+   int high;
 };
 
+/*
+ * name and ordered_name are mandatory; if high_name is not given (NULL),
+ * the high priority workqueue feature will not be available
+ */
 struct btrfs_workqueue_struct *btrfs_alloc_workqueue(char *name,
 char *ordered_name,
+char *high_name,
 int max_active);
 void btrfs_init_work(struct btrfs_work_struct *work,
 void (*func)(struct btrfs_work_struct *),
@@ -56,4 +63,5 @@ void btrfs_queue_work(struct btrfs_workqueue_struct *wq,
  struct btrfs_work_struct *work);
 void btrfs_destroy_workqueue(struct btrfs_workqueue_struct *wq);
 void btrfs_workqueue_set_max(struct btrfs_workqueue_struct *wq, int max);
+void btrfs_set_work_high_priority(struct btrfs_work_struct *work);
 #endif
-- 
1.8.4
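
For readers following the queue selection in btrfs_queue_work() above, here is a minimal user-space sketch of the same decision. The types struct queue, struct bwq and struct bwork and the function pick_dest_queue() are hypothetical stand-ins and not part of the patch; only the if/else mirrors the hunk.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields needed here. */
struct queue {
	const char *name;
};

struct bwq {
	struct queue *normal_wq;
	struct queue *high_wq;	/* NULL when no high_name was given */
};

struct bwork {
	int high;		/* what btrfs_set_work_high_priority() sets */
};

/* Mirrors the dest_wq choice: use the high queue only if it exists. */
static struct queue *pick_dest_queue(struct bwq *wq, struct bwork *work)
{
	if (work->high && wq->high_wq)
		return wq->high_wq;
	return wq->normal_wq;
}
```

The point of the fallback is that a work item marked high priority is still queued somewhere when the workqueue was allocated without a high_name.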



[PATCH v2] Btrfs: don't leak transaction in btrfs_sync_file()

2013-09-12 Thread Filipe David Borba Manana
In btrfs_sync_file(), if the call to btrfs_log_dentry_safe() returns
a negative error (e.g. -ENOMEM via btrfs_log_inode()), we would
return without ending/freeing the transaction.

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: If btrfs_log_dentry_safe() returns error, don't fall through because
that will override the final return value, and can make us return
success (0) instead of an error.

 fs/btrfs/file.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 5ba87b0..8c305f5 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1860,6 +1860,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
ret = btrfs_log_dentry_safe(trans, root, dentry);
if (ret < 0) {
mutex_unlock(&inode->i_mutex);
+   btrfs_end_transaction(trans, root);
goto out;
}
 
-- 
1.7.9.5
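
The error path fixed above can be illustrated with a small user-space sketch. Everything here is hypothetical: a counter stands in for open transaction handles, and log_step() stands in for btrfs_log_dentry_safe(). It only shows the shape of the fix (release the handle before jumping to out, and keep the error in ret), not the real btrfs_sync_file().

```c
#include <errno.h>

/* Hypothetical model: a counter tracks transaction handles still open. */
static int open_transactions;

static void start_transaction(void) { open_transactions++; }
static void end_transaction(void)   { open_transactions--; }

/* Stand-in for btrfs_log_dentry_safe(): returns the injected result. */
static int log_step(int injected) { return injected; }

static int sync_file_sketch(int injected)
{
	int ret;

	start_transaction();
	ret = log_step(injected);
	if (ret < 0) {
		end_transaction();	/* the call the patch adds */
		goto out;		/* do not fall through and lose ret */
	}
	/* ... success path work would go here ... */
	end_transaction();
	ret = 0;
out:
	return ret;
}
```

With the end_transaction() call removed from the error branch, the counter would stay at 1 after a failure, which is the leak the patch description refers to.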



extent data disk byte 0

2013-09-12 Thread Anand Jain



 In item 7 below, any idea why the disk byte would be 0?
 (it's not an inline extent)

 --
item 6 key (257 EXTENT_DATA 0) itemoff 3531 itemsize 53
extent data disk byte 456130560 nr 4096
extent data offset 0 nr 4096 ram 4096
extent compression 0
item 7 key (257 EXTENT_DATA 4096) itemoff 3478 itemsize 53
extent data disk byte 0 nr 0
extent data offset 0 nr 258048 ram 258048
extent compression 0
item 8 key (257 EXTENT_DATA 262144) itemoff 3425 itemsize 53
extent data disk byte 456265728 nr 4096
extent data offset 0 nr 4096 ram 4096
extent compression 0
 ---

Thanks, Anand


Re: [PATCH] btrfs-progs: btrfs.8.in: Add info about reverting back to root subvolume.

2013-09-12 Thread David Sterba
On Wed, Sep 11, 2013 at 02:34:05PM +0530, chandan wrote:
 --- a/man/btrfs.8.in
 +++ b/man/btrfs.8.in
 @@ -244,7 +244,8 @@ is similar to \fBsubvolume list\fR command.
  \fBsubvolume set-default\fR\fI id path\fR
  Set the subvolume of the filesystem \fIpath\fR which is mounted as
  \fIdefault\fR. The subvolume is identified by \fIid\fR, which
 -is returned by the \fBsubvolume list\fR command.
 +is returned by the \fBsubvolume list\fR command. The default subvolume
 +can be set to the root subvolume by passing an \fIid\fR value of 5.

The number 5 is an implementation detail; we should recommend using 0.


[GIT PULL] Btrfs

2013-09-12 Thread Chris Mason
Hi Linus,

Please pull my for-linus branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus

This is against 3.11-rc7, but was pulled and tested against your tree as
of yesterday.  We do have two small incrementals queued up, but I wanted
to get this bunch out the door before I hop on an airplane.

This is a fairly large batch of fixes, performance improvements, and
cleanups from the usual Btrfs suspects.

We've included Stefan Behrens' work to index subvolume UUIDs, which is
targeted at speeding up send/receive with many subvolumes or snapshots
in place.  It closes a long standing performance issue that was built
in to the disk format.

Mark Fasheh's offline dedup work is also here.  In this case offline
means the FS is mounted and active, but the dedup work is not done
inline during file IO.  This is a building block where utilities are
able to ask the FS to dedup a series of extents.  The kernel takes
care of verifying the data involved really is the same.  Today this
involves reading both extents, but we'll continue to evolve the patches.
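
As a rough illustration of the "verify, then dedup" idea described above, the following user-space sketch compares two byte ranges and only reports them as shareable when they are identical. The helpers ranges_identical() and dedupe_if_same() are hypothetical names for this note; they are not the kernel interface.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if both ranges hold exactly the same bytes, 0 otherwise. */
static int ranges_identical(const unsigned char *a, const unsigned char *b,
			    size_t len)
{
	return memcmp(a, b, len) == 0;
}

/*
 * Hypothetical helper: returns the number of bytes that could be shared,
 * which is len when verification succeeds and 0 when the data differs.
 */
static size_t dedupe_if_same(const unsigned char *src,
			     const unsigned char *dst, size_t len)
{
	return ranges_identical(src, dst, len) ? len : 0;
}
```

The caller only proposes candidates; the side doing the sharing re-reads both ranges, so a stale or wrong request cannot merge different data.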

Anand Jain (3):
  btrfs: fix get set label blocking against balance
  btrfs: use BTRFS_SUPER_INFO_SIZE macro at btrfs_read_dev_super()
  btrfs: return btrfs error code for dev excl ops err

Andy Shevchenko (1):
  btrfs: reuse kbasename helper

Carey Underwood (1):
  Btrfs: Release uuid_mutex for shrink during device delete

Dan Carpenter (1):
  btrfs/raid56: fix and cleanup some error paths

Dave Jones (1):
  Fix leak in __btrfs_map_block error path

David Sterba (2):
  btrfs: make errors in btrfs_num_copies less noisy
  btrfs: add mount option to set commit interval

Filipe David Borba Manana (18):
  Btrfs: optimize btrfs_lookup_extent_info()
  Btrfs: add missing error checks to add_data_references
  Btrfs: optimize function btrfs_read_chunk_tree
  Btrfs: add missing error check to find_parent_nodes
  Btrfs: add missing error handling to read_tree_block
  Btrfs: fix inode leak on kmalloc failure in tree-log.c
  Btrfs: don't ignore errors from btrfs_run_delayed_items
  Btrfs: return ENOSPC when target space is full
  Btrfs: add missing error code to BTRFS_IOC_INO_LOOKUP handler
  Btrfs: don't miss inode ref items in BTRFS_IOC_INO_LOOKUP
  Btrfs: reset force_compress on btrfs_file_defrag failure
  Btrfs: fix memory leak of orphan block rsv
  Btrfs: fix printing of non NULL terminated string
  Btrfs: fix race between removing a dev and writing sbs
  Btrfs: fix race conditions in BTRFS_IOC_FS_INFO ioctl
  Btrfs: fix memory leak of uuid_root in free_fs_info
  Btrfs: fix deadlock in uuid scan kthread
  Btrfs: optimize key searches in btrfs_search_slot

Geert Uytterhoeven (12):
  Btrfs: Remove superfluous casts from u64 to unsigned long long
  Btrfs: Make BTRFS_DEV_REPLACE_DEVID an unsigned long long constant
  Btrfs: Format PAGE_SIZE as unsigned long
  Btrfs: Format mirror_num as int
  Btrfs: Make btrfs_device_uuid() return unsigned long
  Btrfs: Make btrfs_device_fsid() return unsigned long
  Btrfs: Make btrfs_dev_extent_chunk_tree_uuid() return unsigned long
  Btrfs: Make btrfs_header_fsid() return unsigned long
  Btrfs: Make btrfs_header_chunk_tree_uuid() return unsigned long
  Btrfs: PAGE_CACHE_SIZE is already unsigned long
  Btrfs: Do not truncate sector_t on 32-bit with CONFIG_LBDAF=y
  Btrfs: Use %z to format size_t

Ilya Dryomov (5):
  Btrfs: find_next_devid: root - fs_info
  Btrfs: add btrfs_alloc_device and switch to it
  Btrfs: add alloc_fs_devices and switch to it
  Btrfs: rollback btrfs_device fields on umount
  Btrfs: stop refusing the relocation of chunk 0

Jeff Mahoney (1):
  btrfs: fall back to global reservation when removing subvolumes

Josef Bacik (30):
  Btrfs: stop using GFP_ATOMIC for the tree mod log allocations
  Btrfs: set lockdep class before locking new extent buffer
  Btrfs: reset ret in record_one_backref
  Btrfs: cleanup reloc roots properly on error
  Btrfs: don't bother autodefragging if our root is going away
  Btrfs: cleanup arguments to extent_clear_unlock_delalloc
  Btrfs: fix what bits we clear when erroring out from delalloc
  Btrfs: check to see if we have an inline item properly
  Btrfs: change how we queue blocks for backref checking
  Btrfs: don't bug_on when we fail when cleaning up transactions
  Btrfs: handle errors when doing slow caching
  Btrfs: check our parent dir when doing a compare send
  Btrfs: deal with enomem in the rewind path
  Btrfs: stop using GFP_ATOMIC when allocating rewind ebs
  Btrfs: skip subvol entries when checking if we've created a dir already
  Btrfs: don't allow a subvol to be deleted if it is the default subovl
  Btrfs: fix the error handling wrt orphan items
  Btrfs: fix heavy delalloc related deadlock
  Btrfs: 

Re: [PATCH v2 2/2] btrfs-progs: use kernel for mounted and lblkid to scan disks

2013-09-12 Thread David Sterba
On Fri, Sep 06, 2013 at 05:37:53PM +0800, Anand Jain wrote:
 Further, to scan for the disks this patch will use
 lblkid, so that we don't have to manually scan the
 /dev or /dev/mapper which means we don't need the
 all-devices options.

Thanks for implementing it! I found a few things to fix, comments below.

I wonder if we should keep --all-devices as a last-resort fallback, e.g.
when the blkid cache is not available.

 --- a/cmds-filesystem.c
 +++ b/cmds-filesystem.c
 -static int uuid_search(struct btrfs_fs_devices *fs_devices, char *search)
 -{
 - char uuidbuf[37];
 - struct list_head *cur;
 - struct btrfs_device *device;
 - int search_len = strlen(search);
 -
 - search_len = min(search_len, 37);
 - uuid_unparse(fs_devices->fsid, uuidbuf);
 - if (!strncmp(uuidbuf, search, search_len))
 - return 1;
 -
 - list_for_each(cur, &fs_devices->devices) {
 - device = list_entry(cur, struct btrfs_device, dev_list);
 - if ((device->label && strcmp(device->label, search) == 0) ||
 - strcmp(device->name, search) == 0)
 - return 1;
 - }
 - return 0;
 -}

This is removing functionality and I don't understand why. It's used for

$ btrfs fi show 9f135b48-cc15-424b-8730-a6432c67dc34
[prints only the given filesystem]

and with your patch does not work as such. Can it be implemented on top
of the code you're adding in this patch? If yes, please make a separate
patch for that.

 @@ -232,8 +214,108 @@ static void print_one_uuid(struct btrfs_fs_devices *fs_devices)
   printf("\n");
  }
  
 +/* adds up all the used spaces as reported by the space info ioctl
 + */
 +static u64 cal_used_bytes(struct btrfs_ioctl_space_args *si)

calc_used_bytes

 +static int btrfs_scan_kernel(void *input, int type)
 +{
 + int ret = 0, fd;
 + FILE *f;
 + struct mntent *mnt;
 + struct btrfs_ioctl_fs_info_args fi;
 + struct btrfs_ioctl_dev_info_args *di = NULL;
 + struct btrfs_ioctl_space_args *si;

the variable names are not very descriptive

 + char label[BTRFS_LABEL_SIZE];
 +
 + f = setmntent("/proc/mounts", "r");

should be /proc/self/mounts

 + if (f == NULL)
 + return -errno;

man says that setmntent does not set errno 

 +
 + while ((mnt = getmntent(f)) != NULL) {
 + if (strcmp(mnt->mnt_type, "btrfs"))
 + continue;
 + ret = get_fs_info(mnt->mnt_dir, &fi, &di);
 + if (ret)
 + return ret;
 +
 + switch (type) {

Please add defines instead of the integers representing 'type'.

 + case 0:
 + break;
 + case 1:
 + if (uuid_compare(fi.fsid, (u8 *)input))
 + continue;
 + break;
 + case 2:
 + if (strcmp(input, mnt->mnt_dir))
 + continue;
 + break;

I haven't seen 1 and 2 used anywhere

 + default:
 + break;
 + }
 +
 + fd = open(mnt->mnt_dir, O_RDONLY);
 + if (fd > 0 && !get_df(fd, &si)) {
 + get_label_mounted(mnt->mnt_dir, label);
 + print_one_fs(&fi, di, si, label, mnt->mnt_dir);
 + free(si);
 + }
 + if (fd > 0)
 + close(fd);
 + free(di);
 + }
 + return ret;
 +}
 @@ -244,36 +326,42 @@ static int cmd_show(int argc, char **argv)
 + /* show only mounted btrfs disks */
 + if (argc > 1 && !strcmp(argv[1], "--mounted"))
 + where = BTRFS_SCAN_MOUNTED;
  
 - all_uuids = btrfs_scanned_uuids();
 - list_for_each(cur_uuid, all_uuids) {
 - fs_devices = list_entry(cur_uuid, struct btrfs_fs_devices,
 + switch (where) {
 + case 0:
 + /* no option : show both mounted and unmounted
 +  */
 + /* mounted */
 + ret = btrfs_scan_kernel(NULL, 0);
 + if (ret)
 + fprintf(stderr, "ERROR: scan kernel failed, %d\n",
 + ret);

I see this warning and there are no mounted filesystems listed in the
output:

$ ./btrfs fi show
ERROR: scan kernel failed, -1


 +
 + /* unmounted */
 + scan_for_btrfs_v2(!BTRFS_UPDATE_KERNEL);
 + all_uuids = btrfs_scanned_uuids();
 + list_for_each(cur_uuid, all_uuids) {
 + fs_devices = list_entry(cur_uuid,
 + struct btrfs_fs_devices,
   list);
 - if (search  uuid_search(fs_devices, search) == 0)
 - continue;
 - print_one_uuid(fs_devices);
 + print_one_uuid(fs_devices);
 + }
 + break;
 + case BTRFS_SCAN_MOUNTED:
 + ret = btrfs_scan_kernel(NULL, 0);
 + if (ret)
 + 

Re: [PATCH 1/2 resend] btrfs-progs: v4, move out print in cmd_df to another function

2013-09-12 Thread David Sterba
On Fri, Sep 06, 2013 at 05:37:52PM +0800, Anand Jain wrote:
 +static char *group_type_str(u64 flag)
  {
 - struct btrfs_ioctl_space_args *sargs, *sargs_orig;
 - u64 count = 0, i;
 - int ret;
 - int fd;
 - int e;
 - char *path;
 - DIR  *dirstream = NULL;
 -
 - if (check_argc_exact(argc, 2))
 - usage(cmd_df_usage);
 -
 - path = argv[1];
 + switch (flag & BTRFS_BLOCK_GROUP_TYPE_MASK) {
 + case BTRFS_BLOCK_GROUP_DATA:
 + return "data";
 + case BTRFS_BLOCK_GROUP_SYSTEM:
 + return "system";
 + case BTRFS_BLOCK_GROUP_METADATA:
 + return "metadata";
 + case BTRFS_BLOCK_GROUP_DATA|BTRFS_BLOCK_GROUP_METADATA:
 + return "mixed";

I think the profile names should stay unchanged, ie Data, System etc,
and Data+Metadata instead of mixed. We can change the output format
later, but for this preparatory patch I'd stick with what it is.

 + default:
 + return "unknown";
 + }
 +}
  
 - fd = open_file_or_dir(path, &dirstream);
 - if (fd < 0) {
 - fprintf(stderr, "ERROR: can't access to '%s'\n", path);
 - return 12;
 +static char *group_profile_str(u64 flag)
 +{
 + switch (flag & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
 + case 0:
 + return "single";

The 'single' profile was not explicitly mentioned before, I tend to
think that it's better to be consistent with the rest and add it as you
do in this patch.

Sample output:

$ ./btrfs fi df /mnt/enospc/mnt
data, single: total=5.92GiB, used=4.41GiB
system, DUP: total=8.00MiB, used=4.00KiB
system, single: total=4.00MiB, used=0.00
metadata, DUP: total=1.02GiB, used=828.10MiB
metadata, single: total=8.00MiB, used=0.00

looks imho ok.
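
For reference, a self-contained sketch of the type/profile mapping under discussion, using the capitalized names suggested in the review (Data, System, Metadata, Data+Metadata). The BG_* macros are local names for this sketch; their bit values are assumed to match the btrfs on-disk block group flags, and only a subset of profiles is listed.

```c
#include <stdint.h>
#include <string.h>

/* Local names; bit values assumed to match the btrfs block group flags. */
#define BG_DATA		(1ULL << 0)
#define BG_SYSTEM	(1ULL << 1)
#define BG_METADATA	(1ULL << 2)
#define BG_RAID0	(1ULL << 3)
#define BG_RAID1	(1ULL << 4)
#define BG_DUP		(1ULL << 5)
#define BG_RAID10	(1ULL << 6)
#define BG_TYPE_MASK	(BG_DATA | BG_SYSTEM | BG_METADATA)
#define BG_PROFILE_MASK	(BG_RAID0 | BG_RAID1 | BG_DUP | BG_RAID10)

/* Block group type, with the names the review asks to keep. */
static const char *group_type_str(uint64_t flag)
{
	switch (flag & BG_TYPE_MASK) {
	case BG_DATA:
		return "Data";
	case BG_SYSTEM:
		return "System";
	case BG_METADATA:
		return "Metadata";
	case BG_DATA | BG_METADATA:
		return "Data+Metadata";
	default:
		return "unknown";
	}
}

/* Profile; no profile bit set means "single". */
static const char *group_profile_str(uint64_t flag)
{
	switch (flag & BG_PROFILE_MASK) {
	case 0:
		return "single";
	case BG_RAID0:
		return "RAID0";
	case BG_RAID1:
		return "RAID1";
	case BG_DUP:
		return "DUP";
	case BG_RAID10:
		return "RAID10";
	default:
		return "unknown";
	}
}
```

Masking first keeps the two lookups independent, so a flag word such as Metadata plus DUP yields one string from each function.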


Re: [PATCH v2 0/9] btrfs: Replace the btrfs_workers with kernel workqueue

2013-09-12 Thread David Sterba
On Thu, Sep 12, 2013 at 04:08:15PM +0800, Qu Wenruo wrote:
 Use kernel workqueue and kernel workqueue based new btrfs_workqueue_struct to 
 replace
 the old btrfs_workers.
 The main goal is to reduce the redundant codes(800 lines vs 200 lines) and
 try to get benefits from the latest workqueue changes.
 
 About the performance, the test suite I used is bonnie++,
 and there seems no significant regression.

You're replacing a core infrastructure building block, more testing is
absolutely required, but using the available infrastructure is a good
move.

I found a few things that do not replace the current implementation
one-to-one:

* the thread names lost the btrfs- prefix, which makes it hard to
  identify the processes; we want the prefix for both debugging and
  performance monitoring

* old high priority tasks were handled in threads with unchanged priority
  and just prioritized within the queue;
  the newly added WQ_HIGHPRI elevates the nice level of the thread, i.e.
  it's not the same thing as before -- I need to look closer

* the idle_thresh attribute is not reflected in the new code, I don't
  know if the kernel workqueues have something equivalent


Other random comments:

* you can use the same files for the new helpers, instead of bwq.[ch]

* btrfs_workqueue_struct can drop the _struct suffix

* WQ_MEM_RECLAIM for the scrub thread does not seem right

* WQ_FREEZABLE should be probably set


david


Re: [GIT PULL] Btrfs

2013-09-12 Thread Josh Boyer
On Thu, Sep 12, 2013 at 11:36 AM, Chris Mason chris.ma...@fusionio.com wrote:
 Mark Fasheh (4):
   btrfs: offline dedupe

This commit adds calls to __put_user_unaligned, which causes build
failures on ARM if btrfs is configured:

+ make -s ARCH=arm V=1 -j4 modules
fs/btrfs/ioctl.c: In function 'btrfs_ioctl_file_extent_same':
fs/btrfs/ioctl.c:2802:3: error: implicit declaration of function
'__put_user_unaligned' [-Werror=implicit-function-declaration]
   if (__put_user_unaligned(info.status, &args->info[i].status) ||
   ^
cc1: some warnings being treated as errors
make[2]: *** [fs/btrfs/ioctl.o] Error 1
make[1]: *** [fs/btrfs] Error 2
make[1]: *** Waiting for unfinished jobs
make: *** [fs] Error 2
make: *** Waiting for unfinished jobs

josh


[PATCH] Btrfs: improve replacing nocow extents

2013-09-12 Thread Josef Bacik
Various people have hit a deadlock when running btrfs/011.  This is because when
replacing nocow extents we will take the i_mutex to make sure nobody messes with
the file while we are replacing the extent.  The problem is we are already
holding a transaction open, which is a locking inversion, so instead we need to
save these inodes we find and then process them outside of the transaction.

Further we can't just lock the inode and assume we are good to go.  We need to
lock the extent range and then read back the extent cache for the inode to make
sure the extent really still points at the physical block we want.  If it
doesn't we don't have to copy it.  Thanks,
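
The reordering described above (record the inodes while the transaction is held, then process them after it has ended) can be sketched in user space as follows. All names are hypothetical; the in_transaction flag merely models "a transaction handle is open", and the violations counter records any processing that happened while it was set.

```c
#include <stdlib.h>

struct nocow_inode {
	unsigned long long inum;
	struct nocow_inode *next;
};

struct nocow_ctx {
	struct nocow_inode *head, *tail;
	int in_transaction;	/* models "transaction handle is held" */
	int processed;		/* entries handled so far */
	int violations;		/* entries handled while in a transaction */
};

/* Phase 1: called from the iteration callback; only remembers the inode. */
static int record_inode(struct nocow_ctx *ctx, unsigned long long inum)
{
	struct nocow_inode *e = calloc(1, sizeof(*e));

	if (!e)
		return -1;
	e->inum = inum;
	if (ctx->tail)
		ctx->tail->next = e;
	else
		ctx->head = e;
	ctx->tail = e;
	return 0;
}

/* Phase 2: drain the list; this is where locks on the inode may be taken. */
static void process_recorded(struct nocow_ctx *ctx)
{
	while (ctx->head) {
		struct nocow_inode *e = ctx->head;

		ctx->head = e->next;
		if (ctx->in_transaction)
			ctx->violations++;
		ctx->processed++;
		free(e);
	}
	ctx->tail = NULL;
}
```

Splitting the work this way means the lock taken in phase 2 is never acquired while the transaction is open, which is the inversion the description is about.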

Signed-off-by: Josef Bacik jba...@fusionio.com
---
 fs/btrfs/scrub.c |  112 +++---
 1 files changed, 98 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 0afcd45..c2463aa 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -158,12 +158,20 @@ struct scrub_fixup_nodatasum {
int mirror_num;
 };
 
+struct scrub_nocow_inode {
+   u64 inum;
+   u64 offset;
+   u64 root;
+   struct list_head list;
+};
+
 struct scrub_copy_nocow_ctx {
struct scrub_ctx *sctx;
u64 logical;
u64 len;
int mirror_num;
u64 physical_for_dev_replace;
+   struct list_head inodes;
struct btrfs_work work;
 };
 
@@ -245,7 +253,7 @@ static void scrub_wr_bio_end_io_worker(struct btrfs_work *work);
 static int write_page_nocow(struct scrub_ctx *sctx,
u64 physical_for_dev_replace, struct page *page);
 static int copy_nocow_pages_for_inode(u64 inum, u64 offset, u64 root,
- void *ctx);
+ struct scrub_copy_nocow_ctx *ctx);
 static int copy_nocow_pages(struct scrub_ctx *sctx, u64 logical, u64 len,
int mirror_num, u64 physical_for_dev_replace);
 static void copy_nocow_pages_worker(struct btrfs_work *work);
@@ -3126,12 +3134,30 @@ static int copy_nocow_pages(struct scrub_ctx *sctx, u64 logical, u64 len,
nocow_ctx->mirror_num = mirror_num;
nocow_ctx->physical_for_dev_replace = physical_for_dev_replace;
nocow_ctx->work.func = copy_nocow_pages_worker;
+   INIT_LIST_HEAD(&nocow_ctx->inodes);
btrfs_queue_worker(&fs_info->scrub_nocow_workers,
   &nocow_ctx->work);
 
return 0;
 }
 
+static int record_inode_for_nocow(u64 inum, u64 offset, u64 root, void *ctx)
+{
+   struct scrub_copy_nocow_ctx *nocow_ctx = ctx;
+   struct scrub_nocow_inode *nocow_inode;
+
+   nocow_inode = kzalloc(sizeof(*nocow_inode), GFP_NOFS);
+   if (!nocow_inode)
+   return -ENOMEM;
+   nocow_inode->inum = inum;
+   nocow_inode->offset = offset;
+   nocow_inode->root = root;
+   list_add_tail(&nocow_inode->list, &nocow_ctx->inodes);
+   return 0;
+}
+
+#define COPY_COMPLETE 1
+
 static void copy_nocow_pages_worker(struct btrfs_work *work)
 {
struct scrub_copy_nocow_ctx *nocow_ctx =
@@ -3167,8 +3193,7 @@ static void copy_nocow_pages_worker(struct btrfs_work *work)
}
 
ret = iterate_inodes_from_logical(logical, fs_info, path,
- copy_nocow_pages_for_inode,
- nocow_ctx);
+ record_inode_for_nocow, nocow_ctx);
if (ret != 0 && ret != -ENOENT) {
pr_warn("iterate_inodes_from_logical() failed: log %llu, phys %llu, len %llu, mir %u, ret %d\n",
logical, physical_for_dev_replace, len, mirror_num,
@@ -3177,7 +3202,33 @@ static void copy_nocow_pages_worker(struct btrfs_work *work)
goto out;
}
 
+   btrfs_end_transaction(trans, root);
+   trans = NULL;
+   while (!list_empty(&nocow_ctx->inodes)) {
+   struct scrub_nocow_inode *entry;
+   entry = list_first_entry(&nocow_ctx->inodes,
+struct scrub_nocow_inode,
+list);
+   list_del_init(&entry->list);
+   ret = copy_nocow_pages_for_inode(entry->inum, entry->offset,
+entry->root, nocow_ctx);
+   kfree(entry);
+   if (ret == COPY_COMPLETE) {
+   ret = 0;
+   break;
+   } else if (ret) {
+   break;
+   }
+   }
 out:
+   while (!list_empty(&nocow_ctx->inodes)) {
+   struct scrub_nocow_inode *entry;
+   entry = list_first_entry(&nocow_ctx->inodes,
+struct scrub_nocow_inode,

Re: [PATCH v2 2/9] btrfs: use kernel workqueue to replace the btrfs_workers functions

2013-09-12 Thread Liu Bo
On Thu, Sep 12, 2013 at 04:08:17PM +0800, Qu Wenruo wrote:
 Use the kernel workqueue to replace the btrfs_workers which are only
 used as normal workqueue.
 
 Other btrfs_workers will use some extra functions like requeue, high
 priority and ordered work.
 These btrfs_workers will not be touched in this patch.
 
 The followings are the untouched btrfs_workers:
 
 generic_worker:   As the helper for other btrfs_workers
 workers:  Use the ordering and high priority features
 delalloc_workers: Use the ordering feature
 submit_workers:   Use requeue feature
 
 All other workers can be replaced using the kernel workqueue directly.

Interesting, I've been doing the same work for a while, but I'm still
doing the tuning work on kernel wq + btrfs.

 
 Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
 ---
  fs/btrfs/ctree.h |  39 +--
  fs/btrfs/delayed-inode.c |   9 ++-
  fs/btrfs/disk-io.c   | 164 
 ++-
  fs/btrfs/extent-tree.c   |   6 +-
  fs/btrfs/inode.c |  38 +--
  fs/btrfs/ordered-data.c  |  11 ++--
  fs/btrfs/ordered-data.h  |   4 +-
  fs/btrfs/qgroup.c|  16 ++---
  fs/btrfs/raid56.c|  37 +--
  fs/btrfs/reada.c |   8 +--
  fs/btrfs/scrub.c |  84 
  fs/btrfs/super.c |  23 ---
  12 files changed, 196 insertions(+), 243 deletions(-)
 
 diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
 index e795bf1..0dd6ec9 100644
 --- a/fs/btrfs/ctree.h
 +++ b/fs/btrfs/ctree.h
 @@ -1202,7 +1202,7 @@ struct btrfs_caching_control {
   struct list_head list;
   struct mutex mutex;
   wait_queue_head_t wait;
 - struct btrfs_work work;
 + struct work_struct work;
   struct btrfs_block_group_cache *block_group;
   u64 progress;
   atomic_t count;
 @@ -1479,25 +1479,26 @@ struct btrfs_fs_info {
   struct btrfs_workers generic_worker;
   struct btrfs_workers workers;
   struct btrfs_workers delalloc_workers;
 - struct btrfs_workers flush_workers;
 - struct btrfs_workers endio_workers;
 - struct btrfs_workers endio_meta_workers;
 - struct btrfs_workers endio_raid56_workers;
 - struct btrfs_workers rmw_workers;
 - struct btrfs_workers endio_meta_write_workers;
 - struct btrfs_workers endio_write_workers;
 - struct btrfs_workers endio_freespace_worker;
   struct btrfs_workers submit_workers;
 - struct btrfs_workers caching_workers;
 - struct btrfs_workers readahead_workers;
 +
 + struct workqueue_struct *flush_workers;
 + struct workqueue_struct *endio_workers;
 + struct workqueue_struct *endio_meta_workers;
 + struct workqueue_struct *endio_raid56_workers;
 + struct workqueue_struct *rmw_workers;
 + struct workqueue_struct *endio_meta_write_workers;
 + struct workqueue_struct *endio_write_workers;
 + struct workqueue_struct *endio_freespace_worker;
 + struct workqueue_struct *caching_workers;
 + struct workqueue_struct *readahead_workers;
  
   /*
* fixup workers take dirty pages that didn't properly go through
* the cow mechanism and make them safe to write.  It happens
* for the sys_munmap function call path
*/
 - struct btrfs_workers fixup_workers;
 - struct btrfs_workers delayed_workers;
 + struct workqueue_struct *fixup_workers;
 + struct workqueue_struct *delayed_workers;
   struct task_struct *transaction_kthread;
   struct task_struct *cleaner_kthread;
   int thread_pool_size;
 @@ -1576,9 +1577,9 @@ struct btrfs_fs_info {
   wait_queue_head_t scrub_pause_wait;
   struct rw_semaphore scrub_super_lock;
   int scrub_workers_refcnt;
 - struct btrfs_workers scrub_workers;
 - struct btrfs_workers scrub_wr_completion_workers;
 - struct btrfs_workers scrub_nocow_workers;
 + struct workqueue_struct *scrub_workers;
 + struct workqueue_struct *scrub_wr_completion_workers;
 + struct workqueue_struct *scrub_nocow_workers;
  
  #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
   u32 check_integrity_print_mask;
 @@ -1619,9 +1620,9 @@ struct btrfs_fs_info {
   /* qgroup rescan items */
   struct mutex qgroup_rescan_lock; /* protects the progress item */
   struct btrfs_key qgroup_rescan_progress;
 - struct btrfs_workers qgroup_rescan_workers;
 + struct workqueue_struct *qgroup_rescan_workers;
   struct completion qgroup_rescan_completion;
 - struct btrfs_work qgroup_rescan_work;
 + struct work_struct qgroup_rescan_work;
  
   /* filesystem state */
   unsigned long fs_state;
 @@ -3542,7 +3543,7 @@ struct btrfs_delalloc_work {
   int delay_iput;
   struct completion completion;
   struct list_head list;
 - struct btrfs_work work;
 + struct work_struct work;
  };
  
  struct btrfs_delalloc_work *btrfs_alloc_delalloc_work(struct inode *inode,
 diff --git 

Re: [PATCH v2 2/9] btrfs: use kernel workqueue to replace the btrfs_workers functions

2013-09-12 Thread Qu Wenruo

On 2013-09-13 09:29, Liu Bo wrote:

On Thu, Sep 12, 2013 at 04:08:17PM +0800, Qu Wenruo wrote:

Use the kernel workqueue to replace the btrfs_workers which are only
used as normal workqueue.

Other btrfs_workers will use some extra functions like requeue, high
priority and ordered work.
These btrfs_workers will not be touched in this patch.

The followings are the untouched btrfs_workers:

generic_worker: As the helper for other btrfs_workers
workers:Use the ordering and high priority features
delalloc_workers:   Use the ordering feature
submit_workers: Use requeue feature

All other workers can be replaced using the kernel workqueue directly.

Interesting, I've been doing the same work for a while, but I'm still
doing the tuning work on kernel wq + btrfs.


Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
  fs/btrfs/ctree.h |  39 +--
  fs/btrfs/delayed-inode.c |   9 ++-
  fs/btrfs/disk-io.c   | 164 ++-
  fs/btrfs/extent-tree.c   |   6 +-
  fs/btrfs/inode.c |  38 +--
  fs/btrfs/ordered-data.c  |  11 ++--
  fs/btrfs/ordered-data.h  |   4 +-
  fs/btrfs/qgroup.c|  16 ++---
  fs/btrfs/raid56.c|  37 +--
  fs/btrfs/reada.c |   8 +--
  fs/btrfs/scrub.c |  84 
  fs/btrfs/super.c |  23 ---
  12 files changed, 196 insertions(+), 243 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index e795bf1..0dd6ec9 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1202,7 +1202,7 @@ struct btrfs_caching_control {
struct list_head list;
struct mutex mutex;
wait_queue_head_t wait;
-   struct btrfs_work work;
+   struct work_struct work;
struct btrfs_block_group_cache *block_group;
u64 progress;
atomic_t count;
@@ -1479,25 +1479,26 @@ struct btrfs_fs_info {
struct btrfs_workers generic_worker;
struct btrfs_workers workers;
struct btrfs_workers delalloc_workers;
-   struct btrfs_workers flush_workers;
-   struct btrfs_workers endio_workers;
-   struct btrfs_workers endio_meta_workers;
-   struct btrfs_workers endio_raid56_workers;
-   struct btrfs_workers rmw_workers;
-   struct btrfs_workers endio_meta_write_workers;
-   struct btrfs_workers endio_write_workers;
-   struct btrfs_workers endio_freespace_worker;
struct btrfs_workers submit_workers;
-   struct btrfs_workers caching_workers;
-   struct btrfs_workers readahead_workers;
+
+   struct workqueue_struct *flush_workers;
+   struct workqueue_struct *endio_workers;
+   struct workqueue_struct *endio_meta_workers;
+   struct workqueue_struct *endio_raid56_workers;
+   struct workqueue_struct *rmw_workers;
+   struct workqueue_struct *endio_meta_write_workers;
+   struct workqueue_struct *endio_write_workers;
+   struct workqueue_struct *endio_freespace_worker;
+   struct workqueue_struct *caching_workers;
+   struct workqueue_struct *readahead_workers;
  
  	/*

 * fixup workers take dirty pages that didn't properly go through
 * the cow mechanism and make them safe to write.  It happens
 * for the sys_munmap function call path
 */
-   struct btrfs_workers fixup_workers;
-   struct btrfs_workers delayed_workers;
+   struct workqueue_struct *fixup_workers;
+   struct workqueue_struct *delayed_workers;
struct task_struct *transaction_kthread;
struct task_struct *cleaner_kthread;
int thread_pool_size;
@@ -1576,9 +1577,9 @@ struct btrfs_fs_info {
wait_queue_head_t scrub_pause_wait;
struct rw_semaphore scrub_super_lock;
int scrub_workers_refcnt;
-   struct btrfs_workers scrub_workers;
-   struct btrfs_workers scrub_wr_completion_workers;
-   struct btrfs_workers scrub_nocow_workers;
+   struct workqueue_struct *scrub_workers;
+   struct workqueue_struct *scrub_wr_completion_workers;
+   struct workqueue_struct *scrub_nocow_workers;
  
  #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY

u32 check_integrity_print_mask;
@@ -1619,9 +1620,9 @@ struct btrfs_fs_info {
/* qgroup rescan items */
struct mutex qgroup_rescan_lock; /* protects the progress item */
struct btrfs_key qgroup_rescan_progress;
-   struct btrfs_workers qgroup_rescan_workers;
+   struct workqueue_struct *qgroup_rescan_workers;
struct completion qgroup_rescan_completion;
-   struct btrfs_work qgroup_rescan_work;
+   struct work_struct qgroup_rescan_work;
  
  	/* filesystem state */

unsigned long fs_state;
@@ -3542,7 +3543,7 @@ struct btrfs_delalloc_work {
int delay_iput;
struct completion completion;
struct list_head list;
-   struct btrfs_work work;
+   struct work_struct work;
  };
  
  struct btrfs_delalloc_work 

Re: [PATCH v2 9/9] btrfs: Replace thread_pool_size with workqueue default value

2013-09-12 Thread Liu Bo
On Thu, Sep 12, 2013 at 04:08:24PM +0800, Qu Wenruo wrote:
 The original btrfs_workers uses the fs_info->thread_pool_size as the
 max_active, and the previous patches followed this way.
 
 But the kernel workqueue has the default value(0) for workqueue,
 and workqueue itself has some threshold mechanism to prevent creating
 too many threads, so we should use the default value.
 
 Since the thread_pool_size algorithm is not used, related codes should
 also be changed.

Ohh, I should have seen this mail first before commenting 'max_active'.

I think that some tuning work should be done on this part, according to
my tests, setting max_active=0 will create ~258 worker helpers
(kworker/uX:X if you set WQ_UNBOUND), this may cause too many context
switches which will have an impact on performance in some cases.

-liubo
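
To make the numbers in this discussion concrete, here is a small sketch of the arithmetic in the patch quoted below: the old default pool size, min(num_online_cpus() + 2, 8), and the patched btrfs_async_submit_limit(), where a thread_pool_size of 0 means "use the workqueue default". The function names and the explicit ncpus/open_devices parameters are assumptions made so the sketch runs in user space.

```c
/* Old default: min(num_online_cpus() + 2, 8); ncpus is passed in here. */
static unsigned long old_default_pool_size(unsigned long ncpus)
{
	unsigned long n = ncpus + 2;

	return n < 8 ? n : 8;
}

/*
 * Patched limit: with pool_size == 0 only open_devices bounds the limit,
 * otherwise the smaller of pool_size and open_devices is used.
 */
static unsigned long async_submit_limit(unsigned long pool_size,
					unsigned long open_devices)
{
	unsigned long limit = open_devices;

	if (pool_size && pool_size < open_devices)
		limit = pool_size;
	return 256 * limit;
}
```

On a 4-CPU machine the old default is 6 threads per pool, which gives a sense of scale for the roughly 258 worker helpers reported with max_active=0.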

 
 Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
 ---
  fs/btrfs/disk-io.c | 12 +++-
  fs/btrfs/super.c   |  3 +--
  2 files changed, 8 insertions(+), 7 deletions(-)
 
 diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
 index a61e1fe..0446d27 100644
 --- a/fs/btrfs/disk-io.c
 +++ b/fs/btrfs/disk-io.c
 @@ -750,9 +750,11 @@ int btrfs_bio_wq_end_io(struct btrfs_fs_info *info, struct bio *bio,
  
  unsigned long btrfs_async_submit_limit(struct btrfs_fs_info *info)
  {
 - unsigned long limit = min_t(unsigned long,
 - info->thread_pool_size,
 - info->fs_devices->open_devices);
 + unsigned long limit;
 + limit = info->thread_pool_size ?
 + min_t(unsigned long, info->thread_pool_size,
 +   info->fs_devices->open_devices) :
 + info->fs_devices->open_devices;
   return 256 * limit;
  }
  
 @@ -2191,8 +2193,8 @@ int open_ctree(struct super_block *sb,
   INIT_RADIX_TREE(&fs_info->reada_tree, GFP_NOFS & ~__GFP_WAIT);
   spin_lock_init(&fs_info->reada_lock);
  
 - fs_info->thread_pool_size = min_t(unsigned long,
 -   num_online_cpus() + 2, 8);
 + /* use the default value of kernel workqueue */
 + fs_info->thread_pool_size = 0;
  
   INIT_LIST_HEAD(&fs_info->ordered_roots);
   spin_lock_init(&fs_info->ordered_root_lock);
 diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
 index 63e653c..ccf412f 100644
 --- a/fs/btrfs/super.c
 +++ b/fs/btrfs/super.c
 @@ -898,8 +898,7 @@ static int btrfs_show_options(struct seq_file *seq, 
 struct dentry *dentry)
   if (info-alloc_start != 0)
   seq_printf(seq, ,alloc_start=%llu,
  (unsigned long long)info-alloc_start);
 - if (info-thread_pool_size !=  min_t(unsigned long,
 -  num_online_cpus() + 2, 8))
 + if (info-thread_pool_size)
   seq_printf(seq, ,thread_pool=%d, info-thread_pool_size);
   if (btrfs_test_opt(root, COMPRESS)) {
   if (info-compress_type == BTRFS_COMPRESS_ZLIB)
 -- 
 1.8.4
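
As a reading aid for the quoted patch, here is a small userspace sketch of the arithmetic in btrfs_async_submit_limit() before and after the change. It is not kernel code: min_ul(), old_default_pool_size(), submit_limit_old() and submit_limit_new() are hypothetical names used only for illustration, and the struct fields are replaced by plain parameters.

```c
/* Userspace stand-in for the kernel's min_t(unsigned long, a, b). */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Old default removed from open_ctree(): min(num_online_cpus() + 2, 8). */
unsigned long old_default_pool_size(unsigned long online_cpus)
{
	return min_ul(online_cpus + 2, 8);
}

/* Before the patch: the limit is always capped by thread_pool_size. */
unsigned long submit_limit_old(unsigned long thread_pool_size,
			       unsigned long open_devices)
{
	return 256 * min_ul(thread_pool_size, open_devices);
}

/*
 * After the patch: thread_pool_size == 0 means "use the workqueue
 * default", so only the number of open devices bounds the limit.
 */
unsigned long submit_limit_new(unsigned long thread_pool_size,
			       unsigned long open_devices)
{
	unsigned long limit = thread_pool_size ?
		min_ul(thread_pool_size, open_devices) : open_devices;

	return 256 * limit;
}
```

With thread_pool_size set to 0, the old formula would return 0 for any number of devices, which appears to be why the patch adds the extra branch. For example, submit_limit_new(0, 4) gives 256 * 4.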
 


Re: [PATCH v2 0/9] btrfs: Replace the btrfs_workers with kernel workqueue

2013-09-12 Thread Qu Wenruo

On 2013-09-13 01:37, David Sterba wrote:
> On Thu, Sep 12, 2013 at 04:08:15PM +0800, Qu Wenruo wrote:
>> Use kernel workqueue and kernel workqueue based new btrfs_workqueue_struct to
>> replace the old btrfs_workers.
>> The main goal is to reduce the redundant codes(800 lines vs 200 lines) and
>> try to get benefits from the latest workqueue changes.
>>
>> About the performance, the test suite I used is bonnie++,
>> and there seems no significant regression.
>
> You're replacing a core infrastructure building block, more testing is
> absolutely required, but using the available infrastructure is a good
> move.

It definitely needs more testing, since I lack enough different disks to
test with.


> I found a few things that do not replace the current implementation
> one-to-one:
>
> * the thread names lost the btrfs- prefix, this makes it hard to
>   identify the processes and we want this, either debugging or
>   performance monitoring

Yes, that's right.
But the problem is that even if I add the btrfs- prefix to the wq,
the work is actually executed by kernel workers, which carry no such prefix.
So debugging stays hard because of the workqueue mechanism.


> * old high priority tasks were handled in threads with unchanged priority
>   and just prioritized within the queue
>   newly added WQ_HIGHPRI elevates the nice level of the thread, ie.
>   it's not the same thing as before -- I need to look closer

Also true. Since I didn't find a way to ensure that high priority work
is executed before any normal priority work, I chose this workaround.
(It seems the original btrfs_workers also has some mechanism to avoid
starvation, so I think this way may be OK.)
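
To make "prioritized within the queue" concrete, here is a minimal userspace model. It is only a sketch and not the btrfs_workers code: struct work, struct fifo, struct prio_queue and the two functions are hypothetical names, and it deliberately leaves out any anti-starvation logic. High priority items live on their own FIFO and are always handed out first, while the thread that runs them keeps its normal scheduling priority.

```c
#include <stddef.h>

struct work {
	int id;
	struct work *next;
};

/* One FIFO per priority class. */
struct fifo {
	struct work *head;
	struct work *tail;
};

struct prio_queue {
	struct fifo high;
	struct fifo normal;
};

/* Append a work item to the tail of one FIFO. */
static void fifo_push(struct fifo *f, struct work *w)
{
	w->next = NULL;
	if (f->tail)
		f->tail->next = w;
	else
		f->head = w;
	f->tail = w;
}

/* Remove and return the oldest item, or NULL if the FIFO is empty. */
static struct work *fifo_pop(struct fifo *f)
{
	struct work *w = f->head;

	if (!w)
		return NULL;
	f->head = w->next;
	if (!f->head)
		f->tail = NULL;
	w->next = NULL;
	return w;
}

void prio_queue_add(struct prio_queue *q, struct work *w, int high_prio)
{
	fifo_push(high_prio ? &q->high : &q->normal, w);
}

/* High priority items always come out first; FIFO order within a class. */
struct work *prio_queue_next(struct prio_queue *q)
{
	struct work *w = fifo_pop(&q->high);

	return w ? w : fifo_pop(&q->normal);
}
```

In this model, queueing a normal item, then a high priority item, then another normal item yields the high priority item first, followed by the two normal ones in submission order. WQ_HIGHPRI works on a different axis: it changes which worker pool (and thus which nice level) runs the work, not the order inside one queue.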


> * the idle_thresh attribute is not reflected in the new code, I don't
>   know if the kernel workqueues have something equivalent

It seems that the kernel does not create kthreads without some control,
but this still needs more investigation to make sure.



> Other random comments:
>
> * you can use the same files for the new helpers, instead of bwq.[ch]

I used separate files to avoid naming conflicts and to make cleanup easy.
If needed, I'll use async-thread.[ch] as well.

> * btrfs_workqueue_struct can drop the _struct suffix

The naming rule is mostly copied from the kernel wq, just adding the btrfs_
prefix where there is no naming conflict.
I will modify it if needed.

> * WQ_MEM_RECLAIM for the scrub thread does not seem right
>
> * WQ_FREEZABLE should be probably set

Will modify soon.




> david



Thanks for the comment.

Qu

--
-
Qu Wenruo
Development Dept.I
Nanjing Fujitsu Nanda Software Tech. Co., Ltd.(FNST)
No. 6 Wenzhu Road, Nanjing, 210012, China
TEL: +86+25-86630566-8526
COINS: 7998-8526
FAX: +86+25-83317685
MAIL: quwen...@cn.fujitsu.com
-



Re: [PATCH v2 9/9] btrfs: Replace thread_pool_size with workqueue default value

2013-09-12 Thread Qu Wenruo

On 2013-09-13 09:47, Liu Bo wrote:
> On Thu, Sep 12, 2013 at 04:08:24PM +0800, Qu Wenruo wrote:
>> The original btrfs_workers uses the fs_info->thread_pool_size as the
>> max_active, and the previous patches followed this way.
>>
>> But the kernel workqueue has the default value(0) for workqueue,
>> and workqueue itself has some threshold mechanism to prevent creating
>> too many threads, so we should use the default value.
>>
>> Since the thread_pool_size algorithm is not used, related codes should
>> also be changed.
>
> Ohh, I should have seen this mail first before commenting 'max_active'.
>
> I think that some tuning work should be done on this part, according to
> my tests, setting max_active=0 will create ~258 worker helpers
> (kworker/uX:X if you set WQ_UNBOUND), this may cause too many context
> switches which will have an impact on performance in some cases.

Yes, but the default number when using max_active=0 should be 256
(WQ_DFL_ACTIVE, which is half of WQ_MAX_ACTIVE).

Also, in my test (single thread), the performance and CPU usage do not
change much.

So it seems that in this situation the kernel has some control over
creating kthreads.

Still, further max_active tuning would be quite good.
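
For reference, a userspace sketch of how a requested max_active of 0 turns into 256. The constants are written down as I understand include/linux/workqueue.h of that era (treat the values as assumptions and check your tree), and effective_max_active() is a hypothetical helper that only models the default-and-clamp step, not the real allocation path.

```c
/* Values assumed to mirror include/linux/workqueue.h around v3.10. */
enum {
	WQ_MAX_ACTIVE          = 512,
	WQ_MAX_UNBOUND_PER_CPU = 4,
	WQ_DFL_ACTIVE          = WQ_MAX_ACTIVE / 2,
};

/*
 * Hypothetical model: resolve the max_active a caller asked for.
 *   requested == 0  -> fall back to WQ_DFL_ACTIVE
 *   otherwise       -> clamp into [1, limit], where the limit for an
 *                      unbound workqueue may grow with the CPU count.
 */
int effective_max_active(int requested, int unbound, int possible_cpus)
{
	int lim = WQ_MAX_ACTIVE;
	int active = requested ? requested : WQ_DFL_ACTIVE;

	if (unbound && possible_cpus * WQ_MAX_UNBOUND_PER_CPU > lim)
		lim = possible_cpus * WQ_MAX_UNBOUND_PER_CPU;

	if (active < 1)
		active = 1;
	if (active > lim)
		active = lim;
	return active;
}
```

Under these assumptions, effective_max_active(0, 1, 8) is 256, which is in the same range as the ~258 helpers reported above. Note that max_active bounds the number of in-flight work items per workqueue, which is related to, but not the same as, the number of kworker threads.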

Qu


> -liubo
>
>> Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
>> ---
>>  fs/btrfs/disk-io.c | 12 +++-
>>  fs/btrfs/super.c   |  3 +--
>>  2 files changed, 8 insertions(+), 7 deletions(-)
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index a61e1fe..0446d27 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -750,9 +750,11 @@ int btrfs_bio_wq_end_io(struct btrfs_fs_info *info, struct bio *bio,
>>
>>  unsigned long btrfs_async_submit_limit(struct btrfs_fs_info *info)
>>  {
>> -	unsigned long limit = min_t(unsigned long,
>> -				    info->thread_pool_size,
>> -				    info->fs_devices->open_devices);
>> +	unsigned long limit;
>> +	limit = info->thread_pool_size ?
>> +		min_t(unsigned long, info->thread_pool_size,
>> +		      info->fs_devices->open_devices) :
>> +		info->fs_devices->open_devices;
>>  	return 256 * limit;
>>  }
>>
>> @@ -2191,8 +2193,8 @@ int open_ctree(struct super_block *sb,
>>  	INIT_RADIX_TREE(&fs_info->reada_tree, GFP_NOFS & ~__GFP_WAIT);
>>  	spin_lock_init(&fs_info->reada_lock);
>>
>> -	fs_info->thread_pool_size = min_t(unsigned long,
>> -					  num_online_cpus() + 2, 8);
>> +	/* use the default value of kernel workqueue */
>> +	fs_info->thread_pool_size = 0;
>>
>>  	INIT_LIST_HEAD(&fs_info->ordered_roots);
>>  	spin_lock_init(&fs_info->ordered_root_lock);
>> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
>> index 63e653c..ccf412f 100644
>> --- a/fs/btrfs/super.c
>> +++ b/fs/btrfs/super.c
>> @@ -898,8 +898,7 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
>>  	if (info->alloc_start != 0)
>>  		seq_printf(seq, ",alloc_start=%llu",
>>  			   (unsigned long long)info->alloc_start);
>> -	if (info->thread_pool_size !=  min_t(unsigned long,
>> -					     num_online_cpus() + 2, 8))
>> +	if (info->thread_pool_size)
>>  		seq_printf(seq, ",thread_pool=%d", info->thread_pool_size);
>>  	if (btrfs_test_opt(root, COMPRESS)) {
>>  		if (info->compress_type == BTRFS_COMPRESS_ZLIB)
>> --
>> 1.8.4








Re: [PATCH] btrfs-progs: btrfs.8.in: Add info about reverting back to root subvolume.

2013-09-12 Thread chandan
On Thursday, September 12, 2013 03:29:52 PM David Sterba wrote:
> The number 5 is an implementation detail, we should recommend to use 0.

In the current btrfs kernel code, if 0 is passed as the subvolume id,
the btrfs_ioctl_default_subvol() function sets the subvolume id to the
objectid of the current subvolume, not to that of the root subvolume.
The patch provided at
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg17973.html
should fix the issue.
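
A small userspace model of the difference, for illustration only. resolve_default_subvol_current() follows the behaviour described above. resolve_default_subvol_toplevel() is an assumption about what a fix could look like (treating 0 as an alias for the top-level subvolume, id 5); I have not restated the linked patch here, and both function names are hypothetical.

```c
#include <stdint.h>

/* Object id of the top-level subvolume (the "5" mentioned above). */
#define BTRFS_FS_TREE_OBJECTID 5ULL

/* Behaviour described above: 0 falls back to the caller's own subvolume. */
uint64_t resolve_default_subvol_current(uint64_t requested,
					uint64_t current_subvol)
{
	return requested ? requested : current_subvol;
}

/* Assumed alternative: 0 is an alias for the top-level subvolume. */
uint64_t resolve_default_subvol_toplevel(uint64_t requested)
{
	return requested ? requested : BTRFS_FS_TREE_OBJECTID;
}
```

With the first variant, a user working inside subvolume 257 who passes 0 ends up with 257 as the default again, so documenting 0 as the way back to the root subvolume would be misleading until the kernel side changes.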
