Re: [f2fs-dev] [PATCH v2] f2fs: fix max orphan inodes calculation

2015-03-08 Thread Changman Lee

-- >8 --

From ce2462523dd5940b59f770c09a50d4babff5fcdb Mon Sep 17 00:00:00 2001
From: Changman Lee cm224@samsung.com
Date: Mon, 9 Mar 2015 08:07:04 +0900
Subject: [PATCH] f2fs: cleanup statement about max orphan inodes calc

The macros themselves make the meaning easy to read, so the comment is unnecessary.

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/checkpoint.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 53bc328..384bfc4 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1104,13 +1104,6 @@ void init_ino_entry_info(struct f2fs_sb_info *sbi)
 		im->ino_num = 0;
 	}
 
-	/*
-	 * considering 512 blocks in a segment 8+cp_payload blocks are
-	 * needed for cp and log segment summaries. Remaining blocks are
-	 * used to keep orphan entries with the limitation one reserved
-	 * segment for cp pack we can have max 1020*(504-cp_payload)
-	 * orphan entries
-	 */
 	sbi->max_orphans = (sbi->blocks_per_seg - F2FS_CP_PACKS -
 				NR_CURSEG_TYPE - __cp_payload(sbi)) *
 				F2FS_ORPHANS_PER_BLOCK;
-- 
1.9.1


--
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the 
conversation now. http://goparallel.sourceforge.net/
___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH] fs/f2fs: add cond_resched() to sync_dirty_dir_inodes()

2015-03-02 Thread Changman Lee
On Fri, Feb 27, 2015 at 01:13:14PM +0100, Sebastian Andrzej Siewior wrote:
 In a preempt-off environment with a lot of FS activity (write/delete) I run
 into a CPU stall:
 
 | NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u2:2:59]
 | Modules linked in:
 | CPU: 0 PID: 59 Comm: kworker/u2:2 Tainted: GW 3.19.0-00010-g10c11c51ffed #153
 | Workqueue: writeback bdi_writeback_workfn (flush-179:0)
 | task: df23 ti: df23e000 task.ti: df23e000
 | PC is at __submit_merged_bio+0x6c/0x110
 | LR is at f2fs_submit_merged_bio+0x74/0x80
 …
 | [c00085c4] (gic_handle_irq) from [c0012e84] (__irq_svc+0x44/0x5c)
 | Exception stack(0xdf23fb48 to 0xdf23fb90)
 | fb40:   deef3484 0001 0001 0027 deef3484 
 
 | fb60: deef3440  de426000 deef34ec deefc440 df23fbb4 df23fbb8 
 df23fb90
 | fb80: c02191f0 c0218fa0 6013 
 | [c0012e84] (__irq_svc) from [c0218fa0] (__submit_merged_bio+0x6c/0x110)
 | [c0218fa0] (__submit_merged_bio) from [c02191f0] (f2fs_submit_merged_bio+0x74/0x80)
 | [c02191f0] (f2fs_submit_merged_bio) from [c021624c] (sync_dirty_dir_inodes+0x70/0x78)
 | [c021624c] (sync_dirty_dir_inodes) from [c0216358] (write_checkpoint+0x104/0xc10)
 | [c0216358] (write_checkpoint) from [c021231c] (f2fs_sync_fs+0x80/0xbc)
 | [c021231c] (f2fs_sync_fs) from [c0221eb8] (f2fs_balance_fs_bg+0x4c/0x68)
 | [c0221eb8] (f2fs_balance_fs_bg) from [c021e9b8] (f2fs_write_node_pages+0x40/0x110)
 | [c021e9b8] (f2fs_write_node_pages) from [c00de620] (do_writepages+0x34/0x48)
 | [c00de620] (do_writepages) from [c0145714] (__writeback_single_inode+0x50/0x228)
 | [c0145714] (__writeback_single_inode) from [c0146184] (writeback_sb_inodes+0x1a8/0x378)
 | [c0146184] (writeback_sb_inodes) from [c01463e4] (__writeback_inodes_wb+0x90/0xc8)
 | [c01463e4] (__writeback_inodes_wb) from [c01465f8] (wb_writeback+0x1dc/0x28c)
 | [c01465f8] (wb_writeback) from [c0146dd8] (bdi_writeback_workfn+0x2ac/0x460)
 | [c0146dd8] (bdi_writeback_workfn) from [c003c3fc] (process_one_work+0x11c/0x3a4)
 | [c003c3fc] (process_one_work) from [c003c844] (worker_thread+0x17c/0x490)
 | [c003c844] (worker_thread) from [c0041398] (kthread+0xec/0x100)
 | [c0041398] (kthread) from [c000ed10] (ret_from_fork+0x14/0x24)
 
 As it turns out, the code loops in sync_dirty_dir_inodes() and waits for
 others to make progress but since it never leaves the CPU there is no
 progress made. At the time of this stall, there is also an rm process
 blocked:
 | rm  R running  0  1989   1774 0x
 | [c047c55c] (__schedule) from [c00486dc] (__cond_resched+0x30/0x4c)
 | [c00486dc] (__cond_resched) from [c047c8c8] (_cond_resched+0x4c/0x54)
 | [c047c8c8] (_cond_resched) from [c00e1aec] (truncate_inode_pages_range+0x1f0/0x5e8)
 | [c00e1aec] (truncate_inode_pages_range) from [c00e1fd8] (truncate_inode_pages+0x28/0x30)
 | [c00e1fd8] (truncate_inode_pages) from [c00e2148] (truncate_inode_pages_final+0x60/0x64)
 | [c00e2148] (truncate_inode_pages_final) from [c020c92c] (f2fs_evict_inode+0x4c/0x268)
 | [c020c92c] (f2fs_evict_inode) from [c0137214] (evict+0x94/0x140)
 | [c0137214] (evict) from [c01377e8] (iput+0xc8/0x134)
 | [c01377e8] (iput) from [c01333e4] (d_delete+0x154/0x180)
 | [c01333e4] (d_delete) from [c0129870] (vfs_rmdir+0x114/0x12c)
 | [c0129870] (vfs_rmdir) from [c012d644] (do_rmdir+0x158/0x168)
 | [c012d644] (do_rmdir) from [c012dd90] (SyS_unlinkat+0x30/0x3c)
 | [c012dd90] (SyS_unlinkat) from [c000ec40] (ret_fast_syscall+0x0/0x4c)
 
 As explained by Jaegeuk Kim:
 |This inode is the directory (c.f., do_rmdir) causing an infinite loop in
 |sync_dirty_dir_inodes.
 |The sync_dirty_dir_inodes tries to flush dirty dentry pages, but if the
 |inode is under eviction, it submits bios and does it again until eviction
 |is finished.
 
 This patch adds a cond_resched() (as suggested by Jaegeuk) after a BIO
 is submitted so other threads can make progress.
 
 Signed-off-by: Sebastian Andrzej Siewior bige...@linutronix.de
 ---
 Hi Jaegeuk,
 
 How about adding cond_resched() right after f2fs_submit_merged_bio in
 sync_dirty_dir_inodes?
 
 Could you test this?
 
 So I added it as you suggested. I've seen that the two functions looped
 for 5 sec but the system did not freeze like before that patch. So it
 seems to work, thanks.
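
The retry-and-yield pattern described above can be illustrated with a
minimal user-space sketch. This is not the kernel code: struct fake_inode,
try_flush() and sync_with_yield() are hypothetical names, sched_yield()
stands in for cond_resched(), and the counter decrement only simulates the
progress the evicting thread would make once it gets the CPU.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for a directory inode that is being evicted. */
struct fake_inode {
	int evict_steps_left;	/* work the evicting thread still has to do */
};

/*
 * Hypothetical flush attempt: it succeeds only once eviction has finished.
 * The decrement simulates progress made by the other thread while the
 * caller is off the CPU.
 */
static bool try_flush(struct fake_inode *inode)
{
	if (inode->evict_steps_left == 0)
		return true;
	inode->evict_steps_left--;
	return false;
}

/* Retry until the flush succeeds, yielding the CPU after each failure. */
int sync_with_yield(struct fake_inode *inode)
{
	int retries = 0;

	assert(inode->evict_steps_left >= 0);
	while (!try_flush(inode)) {
		retries++;
		sched_yield();	/* user-space analogue of cond_resched() */
	}
	return retries;
}
```

Without the yield, a loop like this on a non-preemptible CPU would spin
while the thread it waits for never gets to run.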

Hi Sebastian,

With this patch, does your test complete without any CPU stall?
IMHO, the context should be switched even without cond_resched() once the
task has consumed its time quota, so the patch only reduces system latency
by yielding earlier.
I also thought of another way: discard the pages of the inode being evicted
from the merged bio instead of submitting them. Then evict() would not need
to wait for writeback.
Just my curiosity prompted by this problem.

Thanks,

 
  fs/f2fs/checkpoint.c | 1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
 index 7f794b72b3b7..a2ad3df39f24 100644
 --- a/fs/f2fs/checkpoint.c
 +++ b/fs/f2fs/checkpoint.c
 @@ -796,6 +796,7 @@ void 

Re: [f2fs-dev] [PATCH v2] f2fs: fix max orphan inodes calculation

2015-03-02 Thread Changman Lee
On Fri, Feb 27, 2015 at 05:38:13PM +0800, Wanpeng Li wrote:
 cp_payload is introduced for the sit bitmap to support large volumes, and it
 is placed just after the block of f2fs_checkpoint + nat bitmap, so the first
 segment should include F2FS_CP_PACKS + NR_CURSEG_TYPE + cp_payload + orphan
 blocks. However, the current max orphan inodes calculation doesn't consider
 cp_payload. This patch fixes it by subtracting the number of cp_payload blocks
 from the total blocks of the first segment when calculating max orphan inodes.
 
 Signed-off-by: Wanpeng Li wanpeng...@linux.intel.com
 ---
 v1 -> v2:
  * adjust comments above the code
  * fix coding style issue
 
  fs/f2fs/checkpoint.c | 12 +++++++-----
  1 file changed, 7 insertions(+), 5 deletions(-)
 
 diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
 index db82e09..a914e99 100644
 --- a/fs/f2fs/checkpoint.c
 +++ b/fs/f2fs/checkpoint.c
 @@ -1103,13 +1103,15 @@ void init_ino_entry_info(struct f2fs_sb_info *sbi)
   }
  
   /*
 -  * considering 512 blocks in a segment 8 blocks are needed for cp
 -  * and log segment summaries. Remaining blocks are used to keep
 -  * orphan entries with the limitation one reserved segment
 -  * for cp pack we can have max 1020*504 orphan entries
 +  * considering 512 blocks in a segment 8+cp_payload blocks are
 +  * needed for cp and log segment summaries. Remaining blocks are
 +  * used to keep orphan entries with the limitation one reserved
 +  * segment for cp pack we can have max 1020*(504-cp_payload)
 +  * orphan entries
    */

Hi all,

I think the code below gives us enough information, so the comment above is
not needed. Besides, someone could be confused by the constant 1020.
What do you think about removing the comment?

Regards,
Changman

  	sbi->max_orphans = (sbi->blocks_per_seg - F2FS_CP_PACKS -
 -				NR_CURSEG_TYPE) * F2FS_ORPHANS_PER_BLOCK;
 +				NR_CURSEG_TYPE - __cp_payload(sbi)) *
 +				F2FS_ORPHANS_PER_BLOCK;
  }
  
  int __init create_checkpoint_caches(void)
 -- 
 1.9.1
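
To make the arithmetic in the removed comment concrete, here is a small
stand-alone sketch of the same formula. It is an illustration, not the
kernel source: max_orphans() is a hypothetical helper, and the constant
values are assumptions chosen to match the comment (2 cp packs + 6 log
segment summary blocks = 8 reserved blocks, 1020 orphan entries per block).

```c
#include <assert.h>

/* Assumed values, chosen to match the "8 blocks" and "1020" in the comment. */
#define F2FS_CP_PACKS		2
#define NR_CURSEG_TYPE		6
#define F2FS_ORPHANS_PER_BLOCK	1020

/*
 * Blocks left in the first segment after the cp packs, the log segment
 * summaries and the cp_payload, times the orphan entries per block.
 */
unsigned int max_orphans(unsigned int blocks_per_seg, unsigned int cp_payload)
{
	unsigned int reserved = F2FS_CP_PACKS + NR_CURSEG_TYPE + cp_payload;

	assert(blocks_per_seg >= reserved);
	return (blocks_per_seg - reserved) * F2FS_ORPHANS_PER_BLOCK;
}
```

With 512 blocks per segment and no cp_payload this gives 504 * 1020 =
514080 entries; every cp_payload block lowers the limit by 1020.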



Re: [f2fs-dev] [PATCH 5/5 v2] f2fs: introduce a batched trim

2015-02-02 Thread Changman Lee
Hi Jaegeuk,

IMHO, it would be better to let the user decide the trim size, taking the
latency of trim into account.
Otherwise, additional checkpoints that the user doesn't want will occur.

Regards,
Changman

On Mon, Feb 02, 2015 at 03:29:25PM -0800, Jaegeuk Kim wrote:
 Change log from v1:
  o add description
  o change the # of batched segments suggested by Chao
  o make consistent for # of batched segments
 
 This patch introduces a batched trimming feature, which submits split discard
 commands.
 
 This is to avoid long latency due to huge trim commands.
 If fstrim was triggered ranging from 0 to the end of device, we should lock
 all the checkpoint-related mutexes, resulting in very long latency.
 
 Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
 ---
  fs/f2fs/f2fs.h    |  2 ++
  fs/f2fs/segment.c | 16 +++++++++++-----
  2 files changed, 13 insertions(+), 5 deletions(-)
 
 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 index 8231a59..ec5e66f 100644
 --- a/fs/f2fs/f2fs.h
 +++ b/fs/f2fs/f2fs.h
 @@ -105,6 +105,8 @@ enum {
   CP_DISCARD,
  };
  
 +#define BATCHED_TRIM_SEGMENTS(sbi)	(((sbi)->segs_per_sec) << 5)
 +
  struct cp_control {
   int reason;
   __u64 trim_start;
 diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
 index 5ea57ec..b85bb97 100644
 --- a/fs/f2fs/segment.c
 +++ b/fs/f2fs/segment.c
 @@ -1066,14 +1066,20 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
  	end_segno = (end >= MAX_BLKADDR(sbi)) ? MAIN_SEGS(sbi) - 1 :
  						GET_SEGNO(sbi, end);
  	cpc.reason = CP_DISCARD;
 -	cpc.trim_start = start_segno;
 -	cpc.trim_end = end_segno;
  	cpc.trim_minlen = range->minlen >> sbi->log_blocksize;
  
  	/* do checkpoint to issue discard commands safely */
 -	mutex_lock(&sbi->gc_mutex);
 -	write_checkpoint(sbi, &cpc);
 -	mutex_unlock(&sbi->gc_mutex);
 +	for (; start_segno <= end_segno;
 +				start_segno += BATCHED_TRIM_SEGMENTS(sbi)) {
 +		cpc.trim_start = start_segno;
 +		cpc.trim_end = min_t(unsigned int,
 +				start_segno + BATCHED_TRIM_SEGMENTS(sbi) - 1,
 +				end_segno);
 +
 +		mutex_lock(&sbi->gc_mutex);
 +		write_checkpoint(sbi, &cpc);
 +		mutex_unlock(&sbi->gc_mutex);
 +	}
  out:
  	range->len = cpc.trimmed << sbi->log_blocksize;
  	return 0;
 -- 
 2.1.1
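
The batching loop in the patch can be tried outside the kernel with the
sketch below. It only reproduces the range arithmetic; struct trim_batch
and split_trim_range() are hypothetical names, batch_segs plays the role
of BATCHED_TRIM_SEGMENTS(sbi), and locking and the checkpoint itself are
left out.

```c
#include <assert.h>

/* One trim batch: an inclusive range of segment numbers. */
struct trim_batch {
	unsigned int start;
	unsigned int end;
};

/*
 * Split [start_segno, end_segno] into batches of at most batch_segs
 * segments. Stores up to max_batches entries in out and returns how many
 * batches the whole range needs. Overflow near UINT_MAX is ignored here.
 */
unsigned int split_trim_range(unsigned int start_segno, unsigned int end_segno,
			      unsigned int batch_segs,
			      struct trim_batch *out, unsigned int max_batches)
{
	unsigned int n = 0;

	assert(batch_segs > 0);
	for (; start_segno <= end_segno; start_segno += batch_segs) {
		unsigned int last = start_segno + batch_segs - 1;

		if (last > end_segno)
			last = end_segno;	/* clamp the final batch */
		if (n < max_batches) {
			out[n].start = start_segno;
			out[n].end = last;
		}
		n++;
	}
	return n;
}
```

Each returned batch corresponds to one write_checkpoint() call in the
patch, which is why a smaller batch size means more checkpoints.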
 
 


Re: [f2fs-dev] [RFC PATCH 06/10] f2fs: add core functions for rb-tree extent cache

2015-01-20 Thread Changman Lee
Hi Chao,

Great work. :)

2015-01-12 16:14 GMT+09:00 Chao Yu chao2...@samsung.com:
 This patch adds core functions including slab cache init function and
 init/lookup/update/shrink/destroy function for rb-tree based extent cache.

 Thanks to Jaegeuk Kim and Changman Lee, who gave many suggestions about the
 detailed design and implementation of the extent cache.

 Todo:
  * add a cached_ei into struct extent_tree for a quick recent cache.
  * register rb-based extent cache shrink with mm shrink interface.
  * disable dir inode's extent cache.

 Signed-off-by: Chao Yu chao2...@samsung.com
 Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
 Signed-off-by: Changman Lee cm224@samsung.com
 ---
  fs/f2fs/data.c | 458 
 +
  fs/f2fs/node.c |   9 +-
  2 files changed, 466 insertions(+), 1 deletion(-)

 diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
 index 4f5b871e..bf8c5eb 100644
 --- a/fs/f2fs/data.c
 +++ b/fs/f2fs/data.c
 @@ -25,6 +25,9 @@
  #include "trace.h"
  #include <trace/events/f2fs.h>


~ snip ~

 +
 +static void f2fs_update_extent_tree(struct inode *inode, pgoff_t fofs,
 +					block_t blkaddr)
 +{
 +	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 +	nid_t ino = inode->i_ino;
 +	struct extent_tree *et;
 +	struct extent_node *en = NULL, *en1 = NULL, *en2 = NULL, *en3 = NULL;
 +	struct extent_node *den = NULL;
 +	struct extent_info *pei;
 +	struct extent_info ei;
 +	unsigned int endofs;
 +
 +	if (is_inode_flag_set(F2FS_I(inode), FI_NO_EXTENT))
 +		return;
 +
 +retry:
 +	down_write(&sbi->extent_tree_lock);
 +	et = radix_tree_lookup(&sbi->extent_tree_root, ino);
 +	if (!et) {
We already have some useful helper functions.
How about using f2fs_kmem_cache_alloc and f2fs_radix_tree_insert?

 +		et = kmem_cache_alloc(extent_tree_slab, GFP_ATOMIC);
 +		if (!et) {
 +			up_write(&sbi->extent_tree_lock);
 +			goto retry;
 +		}
 +		if (radix_tree_insert(&sbi->extent_tree_root, ino, et)) {
 +			up_write(&sbi->extent_tree_lock);
 +			kmem_cache_free(extent_tree_slab, et);
 +			goto retry;
 +		}
 +		memset(et, 0, sizeof(struct extent_tree));
 +		et->ino = ino;
 +		et->root = RB_ROOT;
 +		rwlock_init(&et->lock);
 +		atomic_set(&et->refcount, 0);
 +		et->count = 0;
 +		sbi->total_ext_tree++;
 +	}
 +	atomic_inc(&et->refcount);
 +	up_write(&sbi->extent_tree_lock);
 +

~ snip ~

 +
 +	write_unlock(&et->lock);
 +	atomic_dec(&et->refcount);
 +}
 +
 +void f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink)
 +{
 +	struct extent_tree *treevec[EXT_TREE_VEC_SIZE];
 +	struct extent_node *en, *tmp;
 +	unsigned long ino = F2FS_ROOT_INO(sbi);
 +	struct radix_tree_iter iter;
 +	void **slot;
 +	unsigned int found;
 +	unsigned int node_cnt = 0, tree_cnt = 0;
 +
 +	if (available_free_memory(sbi, EXTENT_CACHE))
 +		return;
 +
 +	spin_lock(&sbi->extent_lock);
 +	list_for_each_entry_safe(en, tmp, &sbi->extent_list, list) {
 +		if (!nr_shrink--)
 +			break;
 +		list_del_init(&en->list);
 +	}
 +	spin_unlock(&sbi->extent_lock);
 +

IMHO, it's expensive to walk all the extent_trees just to free the
extent_nodes whose list_empty() is true.
Is there any idea to improve this?
For example, if each extent_node had its extent_root, it would be faster
because we would not have to walk all the trees.
Of course, that uses more memory.

But I think your patchset might as well be merged, because the patches are
well made and the feature is clearly separated by a mount option. We could
improve this next time.

Regards,
Changman
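
The trade-off suggested above (one extra pointer per node in exchange for
not walking every tree) could look roughly like the user-space sketch
below. It is only an illustration under simplifying assumptions: the
structures are cut down to what the example needs, the names struct lru,
lru_init(), lru_add(), lru_shrink() and the owner field are hypothetical,
and removing the node from the owner's rb-tree is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct extent_tree {
	unsigned long ino;
	unsigned int count;	/* nodes currently owned by this tree */
};

struct extent_node {
	struct extent_node *prev, *next;	/* global LRU links */
	struct extent_tree *owner;	/* back-pointer: one pointer per node */
};

struct lru {
	struct extent_node head;	/* sentinel; head.next is the LRU end */
};

void lru_init(struct lru *l)
{
	l->head.prev = l->head.next = &l->head;
	l->head.owner = NULL;
}

/* Append a new node for tree et at the MRU end of the global list. */
struct extent_node *lru_add(struct lru *l, struct extent_tree *et)
{
	struct extent_node *en = malloc(sizeof(*en));

	if (!en)
		return NULL;
	en->owner = et;
	en->prev = l->head.prev;
	en->next = &l->head;
	l->head.prev->next = en;
	l->head.prev = en;
	et->count++;
	return en;
}

/*
 * Free up to nr_shrink nodes from the LRU end. The owner is reached
 * through the back-pointer, so no tree has to be searched.
 */
unsigned int lru_shrink(struct lru *l, unsigned int nr_shrink)
{
	unsigned int freed = 0;

	while (freed < nr_shrink && l->head.next != &l->head) {
		struct extent_node *en = l->head.next;

		en->prev->next = en->next;
		en->next->prev = en->prev;
		assert(en->owner->count > 0);
		en->owner->count--;
		free(en);
		freed++;
	}
	return freed;
}
```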

 +	down_read(&sbi->extent_tree_lock);
 +	while ((found = radix_tree_gang_lookup(&sbi->extent_tree_root,
 +				(void **)treevec, ino, EXT_TREE_VEC_SIZE))) {
 +		unsigned i;
 +
 +		ino = treevec[found - 1]->ino + 1;
 +		for (i = 0; i < found; i++) {
 +			struct extent_tree *et = treevec[i];
 +
 +			atomic_inc(&et->refcount);
 +			write_lock(&et->lock);
 +			node_cnt += __free_extent_tree(sbi, et, false);
 +			write_unlock(&et->lock);
 +			atomic_dec(&et->refcount);
 +		}
 +	}
 +	up_read(&sbi->extent_tree_lock);
 +
 +	down_write(&sbi->extent_tree_lock);
 +	radix_tree_for_each_slot(slot, &sbi->extent_tree_root, &iter,
 +				F2FS_ROOT_INO(sbi)) {
 +		struct extent_tree *et = (struct extent_tree *)*slot

Re: [f2fs-dev] [RFC PATCH] f2fs: add extent cache base on rb-tree

2015-01-07 Thread Changman Lee
Hi Chao,

On Sun, Jan 04, 2015 at 11:19:28AM +0800, Chao Yu wrote:
 Hi Changman,
 
 Sorry for replying late!
 
  -Original Message-
  From: Changman Lee [mailto:cm224@samsung.com]
  Sent: Tuesday, December 30, 2014 8:32 AM
  To: Jaegeuk Kim
  Cc: Chao Yu; linux-f2fs-devel@lists.sourceforge.net; 
  linux-ker...@vger.kernel.org
  Subject: Re: [RFC PATCH] f2fs: add extent cache base on rb-tree
  
  Hi all,
  
  On Mon, Dec 29, 2014 at 01:23:00PM -0800, Jaegeuk Kim wrote:
   Hi Chao,
  
   On Mon, Dec 29, 2014 at 03:19:18PM +0800, Chao Yu wrote:
  
   [snip]
  
   Nice draft. :)
  
   
Please see the draft below.
   
1) Extent management:
If we use global management, i.e. managing all extents from different
inodes in sbi, we will face serious lock contention when we access
extents belonging to different inodes concurrently; the loss may outweigh
the gain.
  
   Agreed.
  
So we choose local management for extents, which means all extents are
managed by the inode itself to avoid the lock contention above. Additionally,
we manage all extents globally by linking all inodes into a global lru list
for the extent cache shrinker.
Approach:
a) build extent tree/rwlock/lru list/extent count in each inode.
*extent tree: link all extents in an rb-tree;
*rwlock: protect fields when accessing the extent cache concurrently;
*lru list: sort all extents in access-time order;
*extent count: record the total count of extents in the cache.
b) use an lru shrink list in sbi to manage all inodes which have cached extents.
*an inode is added or repositioned in this global list whenever
an extent is accessed in this inode.
*use a spinlock to protect this shrink list.
  
   1. How about adding a data structure with inode number instead of referring
   inode pointer?
  
   2. How about managing extent entries globally and setting an upper bound to
   the number of extent entries instead of limiting them per each inode?
   (The rb-tree will handle many extents per inode.)
  
   3. It needs to set a minimum length for the candidate of extent cache.
(e.g., 64)
  
  
  Agreed.
  
   So, for example,
   struct ino_entry_for_extents {
 inode number;
 rb_tree for extent_entry objects;
 rwlock;
   };
  
   struct extent_entry {
 blkaddr, len;
 list_head *;
   };
  
   Something like this.
  
   [A, B, C, ... are extent entry]
  
   The sbi has
   1. an extent_list: (LRU) A - B - C - D - E - F - G (MRU)
   2. radix_tree:  ino_entry_for_extents (#10) has D, B in rb-tree
 ` ino_entry_for_extents (#11) has A, C in rb-tree
 ` ino_entry_for_extents (#12) has Fin rb-tree
 ` ino_entry_for_extents (#13) has G, E in rb-tree
  
   In f2fs_update_extent_cache and __get_data_block for #10,
 ino_entry_for_extents (#10) is found and D or B is updated.
 Then, updated entries are moved to MRU.
  
   In f2fs_evict_inode for #11, A and C are moved to LRU.
   But, if this inode is unlinked, all the A, C, and ino_entry_for_extents (#11)
   should be released.
  
   In f2fs_balance_fs_bg, some LRU extents are released according to the amount
   of consumed memory. Then, it frees any ino_entry_for_extents having no extent.
  
   IMO, we don't need to consider readahead for this, since get_data_block will
   be called by VFS readahead.
  
   Furthermore, we need to think about whether LRU is really best or not.
   IMO, the extent cache aims to improve second access speed, rather than initial
   cold misses. So, maybe MRU or another algorithm would be better.
  
  
  Right. It's very complicated to judge which is better.
  In the read or write path, extents could be created every time. At that point,
  if we set an upper bound, we should decide which extent to evict in favor of
  the new ones.
  On update, one extent could be separated into 3. That requires 3 insertions
  and 1 deletion.
  So if updates happen frequently, we could give up extent management for some
  ranges.
  And we need to borrow ideas from VM management, for example active/inactive
  lists with a second chance for promotion, or batched insertion/deletion.
  
  It suddenly occurred to me that 'simple is best'.
  Let's think about better ideas together.
 
 Yeah, how about ordering it the opposite way to the page cache manager?
 
 for example:
 node pages A,B,C,D are in the page cache;
 extents a,b,c,d are in the extent cache;
 extent a is built from page A, ..., d is built from page D.
 page cache: LRU A - B - C - D MRU
 extent cache: LRU a - b - c - d MRU
 
 If we use
 1) the same LRU order, the cache pairs A-a, B-b, ... may be reclaimed at the
 same time under OOM.
 2) the opposite order, maybe A,B in the page cache and d,c in the extent cache
 will be reclaimed,
 but we still can hit whole cache

Re: [f2fs-dev] [RFC PATCH] f2fs: add extent cache base on rb-tree

2014-12-29 Thread Changman Lee
Hi all,

On Mon, Dec 29, 2014 at 01:23:00PM -0800, Jaegeuk Kim wrote:
 Hi Chao,
 
 On Mon, Dec 29, 2014 at 03:19:18PM +0800, Chao Yu wrote:
 
 [snip]
 
 Nice draft. :)
 
  
  Please see the draft below.
  
  1) Extent management:
  If we use global management, i.e. managing all extents from different
  inodes in sbi, we will face serious lock contention when we access
  extents belonging to different inodes concurrently; the loss may outweigh
  the gain.
 
 Agreed.
 
  So we choose local management for extents, which means all extents are
  managed by the inode itself to avoid the lock contention above. Additionally,
  we manage all extents globally by linking all inodes into a global lru list
  for the extent cache shrinker.
  Approach:
  a) build extent tree/rwlock/lru list/extent count in each inode.
  *extent tree: link all extents in an rb-tree;
  *rwlock: protect fields when accessing the extent cache concurrently;
  *lru list: sort all extents in access-time order;
  *extent count: record the total count of extents in the cache.
  b) use an lru shrink list in sbi to manage all inodes which have cached extents.
  *an inode is added or repositioned in this global list whenever
  an extent is accessed in this inode.
  *use a spinlock to protect this shrink list.
 
 1. How about adding a data structure with inode number instead of referring
 inode pointer?
 
 2. How about managing extent entries globally and setting an upper bound to
 the number of extent entries instead of limiting them per each inode?
 (The rb-tree will handle many extents per inode.)
 
 3. It needs to set a minimum length for the candidate of extent cache.
  (e.g., 64)
 

Agreed.

 So, for example,
 struct ino_entry_for_extents {
   inode number;
   rb_tree for extent_entry objects;
   rwlock;
 };
 
 struct extent_entry {
   blkaddr, len;
   list_head *;
 };
 
 Something like this.
 
 [A, B, C, ... are extent entry]
 
 The sbi has
 1. an extent_list: (LRU) A - B - C - D - E - F - G (MRU)
 2. radix_tree:  ino_entry_for_extents (#10) has D, B in rb-tree
   ` ino_entry_for_extents (#11) has A, C in rb-tree
   ` ino_entry_for_extents (#12) has Fin rb-tree
   ` ino_entry_for_extents (#13) has G, E in rb-tree
 
 In f2fs_update_extent_cache and __get_data_block for #10,
   ino_entry_for_extents (#10) is found and D or B is updated.
   Then, updated entries are moved to MRU.
 
 In f2fs_evict_inode for #11, A and C are moved to LRU.
 But, if this inode is unlinked, all the A, C, and ino_entry_for_extents (#11)
 should be released.
 
 In f2fs_balance_fs_bg, some LRU extents are released according to the amount
 of consumed memory. Then, it frees any ino_entry_for_extents having no extent.
 
 IMO, we don't need to consider readahead for this, since get_data_block will
 be called by VFS readahead.
 
 Furthermore, we need to think about whether LRU is really best or not.
 IMO, the extent cache aims to improve second access speed, rather than initial
 cold misses. So, maybe MRU or another algorithm would be better.
 

Right. It's very complicated to judge which is better.
In the read or write path, extents could be created every time. At that point,
if we set an upper bound, we should decide which extent to evict in favor of
the new ones.
On update, one extent could be separated into 3. That requires 3 insertions
and 1 deletion.
So if updates happen frequently, we could give up extent management for some
ranges.
And we need to borrow ideas from VM management, for example active/inactive
lists with a second chance for promotion, or batched insertion/deletion.

It suddenly occurred to me that 'simple is best'.
Let's think about better ideas together.
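
The "one extent separated into 3" case can be sketched as follows. This is
a hypothetical illustration, not code from the patch: struct extent and
split_extent() are made-up names, offsets and lengths are counted in
blocks, and the rewritten block is assumed to land at a new, non-contiguous
address (as with COW). Replacing the cached extent then means deleting the
old entry and inserting up to three new ones.

```c
#include <assert.h>

struct extent {
	unsigned int fofs;	/* first file offset covered, in blocks */
	unsigned int blk;	/* block address of that first offset */
	unsigned int len;	/* number of contiguous blocks; 0 = empty */
};

/*
 * Split old around the single block at fofs, which now lives at new_blk.
 * out[0] is the left remainder, out[1] the rewritten block and out[2] the
 * right remainder. Returns the number of non-empty pieces.
 */
int split_extent(const struct extent *old, unsigned int fofs,
		 unsigned int new_blk, struct extent out[3])
{
	unsigned int off;
	int i, pieces = 0;

	assert(fofs >= old->fofs && fofs - old->fofs < old->len);
	off = fofs - old->fofs;

	out[0] = (struct extent){ old->fofs, old->blk, off };
	out[1] = (struct extent){ fofs, new_blk, 1 };
	out[2] = (struct extent){ fofs + 1, old->blk + off + 1,
				  old->len - off - 1 };

	for (i = 0; i < 3; i++)
		if (out[i].len)
			pieces++;
	return pieces;
}
```

Rewriting the first or last block of an extent yields only two pieces, so
the 3-insertion case is the worst case for a single-block update.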

 Thanks,
 
  
  2) Limitation:
  In one inode, as we split or add extent in extent cache when read/write, 
  extent
  number will enlarge, so memory and CPU overhead will increase.
  In order to control the overhead of memory and CPU, we try to set a upper 
  bound
  number to limit total extent number in each inode, This number is global
  configuration which is visable to all inode. This number will be exported to
  sysfs for configuring according to requirement of user. By default, designed
  number is 8.
  

Chao,
As Jaegeuk said, it would be better to control the number of extents globally
rather than limit extents per inode, to reduce the extent management overhead.

  3) Shrinker:
  There are two shrink paths:
  a) one is triggered when the extent count has exceeded the upper bound of
  the inode's extent cache. We will try to release extent(s) from the head of
  the inode's inner extent lru list until the extent count is equal to the
  upper bound. This operation could be in f2fs_update_extent_cache().
  b) the other one is triggered when memory utilization exceeds a threshold; we
  try to get an inode from the head of the global lru list(s), and release
  extent(s) with a fixed number (by default: 64 extents) 

Re: [f2fs-dev] [PATCH v2] f2fs: add block count by in-place-update in stat info

2014-12-23 Thread Changman Lee
Changes from v1:
 o use atomic_t inplace_count for better accuracy, as suggested by Chao

-- >8 --

From 7a42b27c8df45494e806d625be03830bfa8c30ff Mon Sep 17 00:00:00 2001
From: Changman Lee cm224@samsung.com
Date: Wed, 24 Dec 2014 02:16:54 +0900
Subject: [PATCH] f2fs: add block count by in-place-update in stat info

This patch adds the count of blocks written by in-place update to the stat info.

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/debug.c   | 4 ++++
 fs/f2fs/f2fs.h    | 5 ++++-
 fs/f2fs/segment.c | 1 +
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
index 91e8f69..2b64221 100644
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -79,6 +79,8 @@ static void update_general_status(struct f2fs_sb_info *sbi)
 		si->segment_count[i] = sbi->segment_count[i];
 		si->block_count[i] = sbi->block_count[i];
 	}
+
+	si->inplace_count = atomic_read(&sbi->inplace_count);
 }
 
 /*
@@ -277,6 +279,7 @@ static int stat_show(struct seq_file *s, void *v)
 		for (j = 0; j < si->util_free; j++)
 			seq_putc(s, '-');
 		seq_puts(s, "]\n\n");
+		seq_printf(s, "IPU: %u blocks\n", si->inplace_count);
 		seq_printf(s, "SSR: %u blocks in %u segments\n",
 			   si->block_count[SSR], si->segment_count[SSR]);
 		seq_printf(s, "LFS: %u blocks in %u segments\n",
@@ -331,6 +334,7 @@ int f2fs_build_stats(struct f2fs_sb_info *sbi)
 
 	atomic_set(&sbi->inline_inode, 0);
 	atomic_set(&sbi->inline_dir, 0);
+	atomic_set(&sbi->inplace_count, 0);
 
 	mutex_lock(&f2fs_stat_mutex);
 	list_add_tail(&si->stat_list, &f2fs_stat_list);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ec58bb2..72d2aab 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -589,6 +589,7 @@ struct f2fs_sb_info {
struct f2fs_stat_info *stat_info;   /* FS status information */
unsigned int segment_count[2];  /* # of allocated segments */
unsigned int block_count[2];/* # of allocated blocks */
+   atomic_t inplace_count; /* # of inplace update */
int total_hit_ext, read_hit_ext;/* extent cache hit ratio */
atomic_t inline_inode;  /* # of inline_data inodes */
atomic_t inline_dir;/* # of inline_dentry inodes */
@@ -1514,6 +1515,7 @@ struct f2fs_stat_info {
 
unsigned int segment_count[2];
unsigned int block_count[2];
+   unsigned int inplace_count;
unsigned base_mem, cache_mem;
 };
 
@@ -1553,7 +1555,8 @@ static inline struct f2fs_stat_info *F2FS_STAT(struct f2fs_sb_info *sbi)
 		((sbi)->segment_count[(curseg)->alloc_type]++)
 #define stat_inc_block_count(sbi, curseg)				\
 		((sbi)->block_count[(curseg)->alloc_type]++)
-
+#define stat_inc_inplace_blocks(sbi)					\
+		(atomic_inc(&(sbi)->inplace_count))
 #define stat_inc_seg_count(sbi, type)					\
 	do {								\
 		struct f2fs_stat_info *si = F2FS_STAT(sbi);		\
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 42607a6..fd9bc96 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1235,6 +1235,7 @@ void write_data_page(struct page *page, struct dnode_of_data *dn,
 void rewrite_data_page(struct page *page, block_t old_blkaddr,
struct f2fs_io_info *fio)
 {
+   stat_inc_inplace_blocks(F2FS_P_SB(page));
f2fs_submit_page_mbio(F2FS_P_SB(page), page, old_blkaddr, fio);
 }
 
-- 
1.9.1
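
For readers who want to try the counter pattern outside the kernel, here is
a rough user-space analogue using C11 atomics in place of the kernel's
atomic_t. The structure and function names (struct fs_stats, struct
stat_snapshot, stats_init(), stats_snapshot()) are hypothetical; only
stat_inc_inplace_blocks and inplace_count are taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>

/* Live counter, bumped from the write path (possibly by many threads). */
struct fs_stats {
	atomic_uint inplace_count;	/* # of in-place updated blocks */
};

/* Plain copy that is filled in when the statistics are displayed. */
struct stat_snapshot {
	unsigned int inplace_count;
};

void stats_init(struct fs_stats *s)
{
	atomic_init(&s->inplace_count, 0);
}

/* Analogue of the stat_inc_inplace_blocks() macro in the patch. */
void stat_inc_inplace_blocks(struct fs_stats *s)
{
	atomic_fetch_add(&s->inplace_count, 1);
}

/* Analogue of the atomic_read() added to update_general_status(). */
void stats_snapshot(struct fs_stats *s, struct stat_snapshot *out)
{
	assert(s && out);
	out->inplace_count = atomic_load(&s->inplace_count);
}
```

The atomic increment keeps the count accurate when several writers update
it concurrently, which a plain unsigned int increment would not guarantee.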




Re: [f2fs-dev] [RFC PATCH] f2fs: add extent cache base on rb-tree

2014-12-23 Thread Changman Lee
On Mon, Dec 22, 2014 at 11:36:09PM -0800, Jaegeuk Kim wrote:
 Hi Chao,
 
 On Tue, Dec 23, 2014 at 11:01:39AM +0800, Chao Yu wrote:
  Hi Jaegeuk,
  
   -Original Message-
   From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
   Sent: Tuesday, December 23, 2014 7:16 AM
   To: Chao Yu
   Cc: 'Changman Lee'; linux-f2fs-devel@lists.sourceforge.net; 
   linux-ker...@vger.kernel.org
   Subject: Re: [RFC PATCH] f2fs: add extent cache base on rb-tree
   
   Hi Chao,
   
   On Mon, Dec 22, 2014 at 03:10:30PM +0800, Chao Yu wrote:
Hi Changman,
   
 -Original Message-
 From: Changman Lee [mailto:cm224@samsung.com]
 Sent: Monday, December 22, 2014 10:03 AM
 To: Chao Yu
 Cc: Jaegeuk Kim; linux-f2fs-devel@lists.sourceforge.net; 
 linux-ker...@vger.kernel.org
 Subject: Re: [RFC PATCH] f2fs: add extent cache base on rb-tree

 Hi Yu,

 Good approach.
   
Thank you. :)
   
 As you know, however, f2fs breaks extent itself due to COW.
   
Yes, and sometimes f2fs uses IPU when overwriting; in this case, this
approach lets us cache more contiguous mapping extents for better
performance.
   
   Hmm. When f2fs faces this case, there is no chance to make an extent
   at all.
  
  With the new implementation in this patch, f2fs will build the extent cache
  in readpage/readpages.
 
 I don't understand your points exactly. :(
 If there are no on-disk extents, it doesn't matter when the caches are built.
 Could you define what scenarios you're looking at?
 
  
   
   
 Unlike other filesystems such as btrfs, the minimum extent of f2fs could have
 4KB granularity.
 So we would have lots of extents per inode and it could lead to overhead
 to manage extents.
   
Agree, the more extents grow in one inode, the more memory pressure and the
longer rb-tree operation latency we face.
IMO, to solve this problem, we'd better add a limit or shrink ability to the
extent cache:
1. limit the extent number per inode with a value set from sysfs, and discard
extents from the inode's extent lru list if we hit the limit; (e.g. in FAT, the
max number of mapping extents per inode is fixed: 8)
2. add all extents of inodes into a global lru list, and try to shrink this
list if we're facing memory pressure.
   
How do you think? or any better ideas are welcome. :)
   
   Historically, the reason that I added only one small extent cache is
   that I wanted to avoid additional data structures adding any overhead
   to the critical data write path.
  
  Thank you for telling me the history of original extent cache.
  
   Instead, I intended to use a well operating node page cache.
   
   We need to consider what the benefit would be of using an extent cache
   rather than the existing node page cache.
  
  IMO, the node page cache is a system-level cache that the filesystem
  cannot control completely: cached up-to-date node pages can be
  invalidated by drop_caches (/proc/sys/vm/drop_caches) or by the mm
  reclaimer, resulting in more IO when we need these node pages next time.
 
 Yes, that's exactly what I wanted.
 
  The new extent cache is a filesystem-level cache that is completely
  controlled by the filesystem itself. What we gain is: on the one hand,
  it is used as a first-level cache above the node page cache, which can
  also increase the cache hit ratio.
 
 I don't think so. The hit ratio depends on the cache policy. The node page
 cache is managed globally by the kernel in an LRU manner, so I think it can
 show a reasonable hit ratio.
 
  On the other hand, it is more stable and controllable than the node page
  cache.
 
 It depends on how you can control the extent cache. But I'm not sure that
 it would be better than the page cache managed by MM.
 
 So, my concerns are:
 
 1. Redundant memory overhead
  : The extent cache likely sits on top of the node page cache, which will
  consume memory redundantly.
 
 2. CPU overhead
  : On every block address update, it needs to traverse extent cache entries.
 
 3. Effectiveness
  : We have a node page cache that is managed by MM in LRU order. I think this
  provides a good hit ratio, system-wide memory reclaiming algorithms, and a
  well-defined locking mechanism.
 
 4. Cache reclaiming policy
  a. global approach: it needs to consider lock contention, CPU overhead, and
  the shrinker. I don't think it is better than the page cache.
  b. local approach: there still exist cold misses at the initial read
  operations. After that, how does the extent cache increase the
  hit ratio more than the node page cache does?
 
  For example, in a pretty normal scenario like
  open - read - close - open - read ..., we can't get
  benefits from a locally-managed extent cache, while the node
  page

Re: [f2fs-dev] [PATCH 1/2] f2fs: conduct f2fs_gc as explicit gc_type

2014-12-23 Thread Changman Lee
Hi,

On Tue, Dec 23, 2014 at 12:00:37AM -0800, Jaegeuk Kim wrote:
 Hi Changman,
 
 On Tue, Dec 23, 2014 at 08:37:38AM +0900, Changman Lee wrote:
  f2fs has 2 gc_type; foreground gc and background gc.
  In the case of foreground gc, f2fs will select victim as greedy.
  Otherwise, as cost-benefit. And also it runs as greedy in SSR mode.
  Until now, f2fs_gc conducted with BG_GC as default. So we couldn't
  expect how it runs; BG_GC or FG_GC and GREEDY or COST_BENEFIT.
 
 What does this mean?
 In f2fs_gc, the gc_type will be changed according to the number of free
 sections.

Right, but when I turn on trace I saw 3 cases.
1. BG_GC and COST_BENEFIT
2. BG_GC and GREEDY
3. FG_GC and GREEDY

I expected that case 1 is likely to operate only by gc_thread.
But it was not.

 
  Therefore sometimes it runs as BG_GC/COST_BENEFIT although gc_thread
  doesn't put f2fs_gc to work.
 
 You mean f2fs_balance_fs?
 In this case, again, the gc_type will be assigned FG_GC.
 
 Why do you want to set FG_GC/GREEDY for the SSR victims?

We should allocate a block as soon as possible.
In the case of FG_GC, it also uses invalid blocks dirtied by background gc.
In the other case, if (BG_GC && test_bit(victim_secmap)), it will be
skipped.
I intended SSR to operate as quickly as FG_GC does.

Regards,
Changman
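
The three combinations seen in the trace follow from how the victim policy is derived from the caller's gc_type and the allocation mode. The following is a minimal standalone sketch of that decision, not the kernel code itself: the enum names mirror the thread, while select_gc_mode() and skip_victim() are hypothetical helpers introduced only for illustration.

```c
#include <stdbool.h>

/* Names mirror the thread; the helpers below are illustrative only. */
enum gc_type { BG_GC, FG_GC };
enum alloc_mode { LFS, SSR };
enum gc_mode { GC_CB, GC_GREEDY };	/* cost-benefit vs. greedy */

/* Pick the victim-selection policy from the caller and allocation mode. */
static inline enum gc_mode select_gc_mode(enum gc_type type,
					  enum alloc_mode mode)
{
	if (mode == SSR)
		return GC_GREEDY;	/* SSR always selects greedily */
	return type == BG_GC ? GC_CB : GC_GREEDY;
}

/* With BG_GC, a section already marked in the victim map is skipped. */
static inline bool skip_victim(enum gc_type type, bool in_victim_secmap)
{
	return type == BG_GC && in_victim_secmap;
}
```

Under these assumptions, passing FG_GC on the SSR path does not change the greedy policy; it only removes the skip of sections that background gc has already marked.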

 
 Thanks,
 
  
  Signed-off-by: Changman Lee cm224@samsung.com
  ---
   fs/f2fs/f2fs.h| 2 +-
   fs/f2fs/gc.c  | 5 ++---
   fs/f2fs/segment.c | 6 +++---
   3 files changed, 6 insertions(+), 7 deletions(-)
  
  diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
  index ae6dfb6..c956535 100644
  --- a/fs/f2fs/f2fs.h
  +++ b/fs/f2fs/f2fs.h
  @@ -1476,7 +1476,7 @@ int f2fs_fiemap(struct inode *inode, struct 
  fiemap_extent_info *, u64, u64);
   int start_gc_thread(struct f2fs_sb_info *);
   void stop_gc_thread(struct f2fs_sb_info *);
   block_t start_bidx_of_node(unsigned int, struct f2fs_inode_info *);
  -int f2fs_gc(struct f2fs_sb_info *);
  +int f2fs_gc(struct f2fs_sb_info *, int);
   void build_gc_manager(struct f2fs_sb_info *);
   int __init create_gc_caches(void);
   void destroy_gc_caches(void);
  diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
  index eec0933..e1fa53a 100644
  --- a/fs/f2fs/gc.c
  +++ b/fs/f2fs/gc.c
  @@ -80,7 +80,7 @@ static int gc_thread_func(void *data)
  stat_inc_bggc_count(sbi);
   
  /* if return value is not zero, no victim was selected */
  -   if (f2fs_gc(sbi))
  +   if (f2fs_gc(sbi, BG_GC))
  wait_ms = gc_th->no_gc_sleep_time;
   
  /* balancing f2fs's metadata periodically */
  @@ -691,10 +691,9 @@ static void do_garbage_collect(struct f2fs_sb_info 
  *sbi, unsigned int segno,
  f2fs_put_page(sum_page, 1);
   }
   
  -int f2fs_gc(struct f2fs_sb_info *sbi)
  +int f2fs_gc(struct f2fs_sb_info *sbi, int gc_type)
   {
  unsigned int segno, i;
  -   int gc_type = BG_GC;
  int nfree = 0;
  int ret = -1;
  struct cp_control cpc;
  diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
  index fd9bc96..3b32404 100644
  --- a/fs/f2fs/segment.c
  +++ b/fs/f2fs/segment.c
  @@ -281,7 +281,7 @@ void f2fs_balance_fs(struct f2fs_sb_info *sbi)
   */
  if (has_not_enough_free_secs(sbi, 0)) {
  mutex_lock(&sbi->gc_mutex);
  -   f2fs_gc(sbi);
  +   f2fs_gc(sbi, FG_GC);
  }
   }
   
  @@ -994,12 +994,12 @@ static int get_ssr_segment(struct f2fs_sb_info *sbi, int type)
   
  	if (IS_NODESEG(type) || !has_not_enough_free_secs(sbi, 0))
  		return v_ops->get_victim(sbi,
  -				&(curseg)->next_segno, BG_GC, type, SSR);
  +				&(curseg)->next_segno, FG_GC, type, SSR);
   
  	/* For data segments, let's do SSR more intensively */
  	for (; type >= CURSEG_HOT_DATA; type--)
  		if (v_ops->get_victim(sbi, &(curseg)->next_segno,
  -						BG_GC, type, SSR))
  +						FG_GC, type, SSR))
  			return 1;
  return 0;
   }
  -- 
  1.9.1
  
  

[f2fs-dev] [PATCH] f2fs: add block count by in-place-update in stat info

2014-12-22 Thread Changman Lee
This patch adds block count by in-place-update in stat.

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/debug.c   | 3 +++
 fs/f2fs/f2fs.h| 5 -
 fs/f2fs/segment.c | 1 +
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
index 91e8f69..46bef86 100644
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -79,6 +79,8 @@ static void update_general_status(struct f2fs_sb_info *sbi)
 		si->segment_count[i] = sbi->segment_count[i];
 		si->block_count[i] = sbi->block_count[i];
 	}
+
+	si->inplace_count = sbi->inplace_count;
 }
 
 /*
@@ -277,6 +279,7 @@ static int stat_show(struct seq_file *s, void *v)
 		for (j = 0; j < si->util_free; j++)
 			seq_putc(s, '-');
 		seq_puts(s, "]\n\n");
+		seq_printf(s, "IPU: %u blocks\n", si->inplace_count);
 		seq_printf(s, "SSR: %u blocks in %u segments\n",
 			   si->block_count[SSR], si->segment_count[SSR]);
 		seq_printf(s, "LFS: %u blocks in %u segments\n",
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ec58bb2..ae6dfb6 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -589,6 +589,7 @@ struct f2fs_sb_info {
struct f2fs_stat_info *stat_info;   /* FS status information */
unsigned int segment_count[2];  /* # of allocated segments */
unsigned int block_count[2];/* # of allocated blocks */
+   unsigned int inplace_count; /* # of inplace update */
int total_hit_ext, read_hit_ext;/* extent cache hit ratio */
atomic_t inline_inode;  /* # of inline_data inodes */
atomic_t inline_dir;/* # of inline_dentry inodes */
@@ -1514,6 +1515,7 @@ struct f2fs_stat_info {
 
unsigned int segment_count[2];
unsigned int block_count[2];
+   unsigned int inplace_count;
unsigned base_mem, cache_mem;
 };
 
@@ -1553,7 +1555,8 @@ static inline struct f2fs_stat_info *F2FS_STAT(struct f2fs_sb_info *sbi)
 		((sbi)->segment_count[(curseg)->alloc_type]++)
 #define stat_inc_block_count(sbi, curseg)				\
 		((sbi)->block_count[(curseg)->alloc_type]++)
-
+#define stat_inc_inplace_blocks(sbi)					\
+		((sbi)->inplace_count++)
 #define stat_inc_seg_count(sbi, type)					\
 	do {								\
 		struct f2fs_stat_info *si = F2FS_STAT(sbi);		\
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 42607a6..fd9bc96 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1235,6 +1235,7 @@ void write_data_page(struct page *page, struct dnode_of_data *dn,
 void rewrite_data_page(struct page *page, block_t old_blkaddr,
struct f2fs_io_info *fio)
 {
+   stat_inc_inplace_blocks(F2FS_P_SB(page));
f2fs_submit_page_mbio(F2FS_P_SB(page), page, old_blkaddr, fio);
 }
 
-- 
1.9.1




Re: [f2fs-dev] [RFC PATCH] f2fs: add extent cache base on rb-tree

2014-12-22 Thread Changman Lee
Hi,

On Mon, Dec 22, 2014 at 03:10:30PM +0800, Chao Yu wrote:
 Hi Changman,
 
  -Original Message-
  From: Changman Lee [mailto:cm224@samsung.com]
  Sent: Monday, December 22, 2014 10:03 AM
  To: Chao Yu
  Cc: Jaegeuk Kim; linux-f2fs-devel@lists.sourceforge.net; 
  linux-ker...@vger.kernel.org
  Subject: Re: [RFC PATCH] f2fs: add extent cache base on rb-tree
  
  Hi Yu,
  
  Good approach.
 
 Thank you. :)
 
  As you know, however, f2fs breaks extent itself due to COW.
 
 Yes, and sometimes f2fs uses IPU when overwriting; in this condition,
 by using this approach we can cache more contiguous mapping extents for
 better performance.
 
  Unlike other filesystems such as btrfs, the minimum extent of f2fs could
  have 4KB granularity.
  So we would have lots of extents per inode, and managing them could lead
  to overhead.
 
 Agreed. The more extents there are in one inode, the more memory pressure
 and the longer the rb-tree operation latency we face.
 IMO, to solve this problem, we'd better add a limit or a shrink ability
 to the extent cache:
 1. limit the number of extents per inode to a value set through sysfs, and
 discard extents from the inode's extent LRU list when we hit the limit
 (e.g. in FAT, the max number of mapping extents per inode is fixed: 8);
 2. add all extents of all inodes to a global LRU list, and try to shrink
 this list when we're facing memory pressure.
 
 What do you think? Any better ideas are welcome. :)
 

I think both are options worth considering.
How about adding the extent cache only to inodes selected by the user
through an ioctl or xattr?
For read-mostly files of large size, the user would surely benefit even
if the files are split into several pieces.

Thanks,
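
As a rough illustration of option 1 above (a fixed per-inode limit with LRU discard), here is a minimal userspace sketch. It deliberately uses a small array instead of the rb-tree the RFC proposes, and every name in it (MAX_EXTENTS, struct extent_cache, ec_lookup, ec_insert) is hypothetical. The cap of 8 only echoes the FAT figure quoted in the thread, and 0 is used as the "miss" value as an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_EXTENTS 8	/* hypothetical per-inode cap */

struct extent {
	uint32_t fofs;	/* first file offset, in blocks */
	uint32_t blk;	/* first block address */
	uint32_t len;	/* length in blocks */
	uint64_t stamp;	/* last-use tick, for LRU ordering */
};

struct extent_cache {
	struct extent ext[MAX_EXTENTS];
	size_t count;
	uint64_t tick;
};

/* Return the block address mapped at file offset fofs, or 0 on a miss. */
static inline uint32_t ec_lookup(struct extent_cache *ec, uint32_t fofs)
{
	for (size_t i = 0; i < ec->count; i++) {
		struct extent *e = &ec->ext[i];

		if (fofs >= e->fofs && fofs - e->fofs < e->len) {
			e->stamp = ++ec->tick;	/* mark as recently used */
			return e->blk + (fofs - e->fofs);
		}
	}
	return 0;
}

/* Insert an extent; at the cap, overwrite the least recently used one. */
static inline void ec_insert(struct extent_cache *ec, uint32_t fofs,
			     uint32_t blk, uint32_t len)
{
	size_t slot = 0;

	if (ec->count < MAX_EXTENTS) {
		slot = ec->count++;
	} else {
		for (size_t i = 1; i < MAX_EXTENTS; i++)
			if (ec->ext[i].stamp < ec->ext[slot].stamp)
				slot = i;
	}
	ec->ext[slot] = (struct extent){ fofs, blk, len, ++ec->tick };
}
```

The linear scan is acceptable only because the cap is tiny; with a larger or tunable limit, an ordered structure such as the proposed rb-tree would be the natural replacement.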

  
  Anyway, mount option could be alternative for this patch.
 
 Yes, will do.
 
 Thanks,
 Yu
 
  
  On Fri, Dec 19, 2014 at 06:49:29PM +0800, Chao Yu wrote:
   Now f2fs has a page-block mapping cache which can cache only one extent
   mapping between contiguous logical addresses and physical addresses.
   Normally, this design works well because f2fs expands the coverage area
   of the mapping extent when we write forward sequentially. But when we
   write data randomly in Out-Place-Update mode, the extent will be
   shortened and can hardly be expanded most of the time, for the following
   reasons:
   1. The shorter part of the extent is discarded if we break the contiguous
   mapping in the middle of the extent.
   2. A new mapping is added to the mapping cache only at the head or tail
   of the extent.
   3. We drop the extent cache when the extent becomes very fragmented.
   4. We do not update the extent with mappings which we get from readpages
   or readpage.
  
   To solve the above problems, this patch adds an extent cache based on an
   rb-tree, like other filesystems (e.g. ext4/btrfs), to f2fs. In this way,
   f2fs can support another, more effective cache between the dnode page
   cache and disk. It will supply a high hit ratio in the cache with less
   memory when the dnode page cache is reclaimed in a low-memory
   environment.
  
   Todo:
   *introduce mount option for extent cache.
   *add shrink ability for extent cache.
  
   Signed-off-by: Chao Yu chao2...@samsung.com
   ---
 



Re: [f2fs-dev] [PATCH v2] f2fs: merge two uchar variable in struct node_info to reduce memory cost

2014-12-17 Thread Changman Lee
Hi Yu,

This patch is effective only on 32-bit machines. On a 64-bit
machine, nat_entry will be aligned to 8 bytes due to its pointer members
(i.e. struct list_head), so it gets no benefit in terms of memory
usage. In the case of node_info, however, there is a gain in terms of
memory usage.
Hence, I think the commit log does not describe this patch correctly.

Thanks,

Reviewed-by: Changman Lee cm224@samsung.com
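
Since the effect of moving the flag depends on padding and alignment, it can be checked directly with sizeof. The sketch below reproduces the two layouts from the patch in standalone C. It assumes nid_t and block_t are 32-bit unsigned integers, and uses stand-in names (list_head_sketch, the _old/_new suffixes, nat_entry_saving) that are not in the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's struct list_head: two pointers. */
struct list_head_sketch {
	struct list_head_sketch *next, *prev;
};

/* Layout before the patch: flag is a member of nat_entry. */
struct node_info_old {
	uint32_t nid, ino, blk_addr;	/* assumed 32-bit nid_t/block_t */
	unsigned char version;
};

struct nat_entry_old {
	struct list_head_sketch list;
	unsigned char flag;
	struct node_info_old ni;
};

/* Layout after the patch: flag sits next to version inside node_info. */
struct node_info_new {
	uint32_t nid, ino, blk_addr;
	unsigned char version;
	unsigned char flag;
};

struct nat_entry_new {
	struct list_head_sketch list;
	struct node_info_new ni;
};

/* Bytes saved per nat_entry on the ABI this is compiled for. */
static inline size_t nat_entry_saving(void)
{
	return sizeof(struct nat_entry_old) - sizeof(struct nat_entry_new);
}
```

Printing sizeof for both nat_entry variants (or the value of nat_entry_saving()) on 32-bit and 64-bit builds is a quick way to check the figures discussed in this thread against a given ABI.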

2014-12-15 18:33 GMT+09:00 Chao Yu chao2...@samsung.com:
 This patch moves one member of struct nat_entry, _flag_, to struct node_info,
 so that _version_ in struct node_info and _flag_, both of unsigned char type,
 share one 32-bit space in register/memory. The size of nat_entry is then
 reduced from 28 bytes to 24 bytes, and the slab memory used by f2fs is
 reduced.

 changes from v1:
  o introduce inline copy_node_info() to copy valid data from node info, as
    suggested by Jaegeuk Kim; it can avoid a bug.

 Signed-off-by: Chao Yu chao2...@samsung.com
 ---
  fs/f2fs/node.c |  4 ++--
  fs/f2fs/node.h | 33 ++---
  2 files changed, 24 insertions(+), 13 deletions(-)

 diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
 index f83326c..5aa54a0 100644
 --- a/fs/f2fs/node.c
 +++ b/fs/f2fs/node.c
 @@ -268,7 +268,7 @@ static void set_node_addr(struct f2fs_sb_info *sbi, struct node_info *ni,
  	e = __lookup_nat_cache(nm_i, ni->nid);
  	if (!e) {
  		e = grab_nat_entry(nm_i, ni->nid);
 -		e->ni = *ni;
 +		copy_node_info(&e->ni, ni);
  		f2fs_bug_on(sbi, ni->blk_addr == NEW_ADDR);
  	} else if (new_blkaddr == NEW_ADDR) {
  		/*
 @@ -276,7 +276,7 @@ static void set_node_addr(struct f2fs_sb_info *sbi, struct node_info *ni,
  		 * previous nat entry can be remained in nat cache.
  		 * So, reinitialize it with new information.
  		 */
 -		e->ni = *ni;
 +		copy_node_info(&e->ni, ni);
  		f2fs_bug_on(sbi, ni->blk_addr != NULL_ADDR);
  	}

 diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
 index d10b644..eb59167 100644
 --- a/fs/f2fs/node.h
 +++ b/fs/f2fs/node.h
 @@ -29,6 +29,14 @@
  /* return value for read_node_page */
  #define LOCKED_PAGE1

 +/* For flag in struct node_info */
 +enum {
 +   IS_CHECKPOINTED,/* is it checkpointed before? */
 +   HAS_FSYNCED_INODE,  /* is the inode fsynced before? */
 +   HAS_LAST_FSYNC, /* has the latest node fsync mark? */
 +   IS_DIRTY,   /* this nat entry is dirty? */
 +};
 +
  /*
   * For node information
   */
 @@ -37,18 +45,11 @@ struct node_info {
 nid_t ino;  /* inode number of the node's owner */
 block_t blk_addr;   /* block address of the node */
 unsigned char version;  /* version of the node */
 -};
 -
 -enum {
 -   IS_CHECKPOINTED,/* is it checkpointed before? */
 -   HAS_FSYNCED_INODE,  /* is the inode fsynced before? */
 -   HAS_LAST_FSYNC, /* has the latest node fsync mark? */
 -   IS_DIRTY,   /* this nat entry is dirty? */
 +   unsigned char flag; /* for node information bits */
  };

  struct nat_entry {
 struct list_head list;  /* for clean or dirty nat list */
 -   unsigned char flag; /* for node information bits */
 struct node_info ni;/* in-memory node information */
  };

 @@ -63,20 +64,30 @@ struct nat_entry {

  #define inc_node_version(version)  (++version)

 +static inline void copy_node_info(struct node_info *dst,
 +						struct node_info *src)
 +{
 +	dst->nid = src->nid;
 +	dst->ino = src->ino;
 +	dst->blk_addr = src->blk_addr;
 +	dst->version = src->version;
 +	/* should not copy flag here */
 +}
 +
  static inline void set_nat_flag(struct nat_entry *ne,
  				unsigned int type, bool set)
  {
  	unsigned char mask = 0x01 << type;
  	if (set)
 -		ne->flag |= mask;
 +		ne->ni.flag |= mask;
  	else
 -		ne->flag &= ~mask;
 +		ne->ni.flag &= ~mask;
  }

  static inline bool get_nat_flag(struct nat_entry *ne, unsigned int type)
  {
  	unsigned char mask = 0x01 << type;
 -	return ne->flag & mask;
 +	return ne->ni.flag & mask;
  }

  static inline void nat_reset_flag(struct nat_entry *ne)
 --
 2.1.2



 --
 Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
 from Actuate! Instantly Supercharge Your Business Reports and Dashboards
 with Interactivity, Sharing, Native Excel Exports, App Integration  more
 Get technology previously reserved for billion-dollar corporations, FREE
 http://pubads.g.doubleclick.net/gampad/clk?id=164703151iu=/4140/ostg.clktrk
 ___
 Linux-f2fs-devel mailing list
 Linux

[f2fs-dev] [PATCH 1/3] f2fs: check if inode state is dirty at fsync

2014-12-07 Thread Changman Lee
If inode state is dirty, go straight to write.

Suggested-by: Jaegeuk Kim jaeg...@kernel.org
Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/file.c | 25 +++--
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index b6f3fbf..0b97002 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -138,6 +138,17 @@ static inline bool need_do_checkpoint(struct inode *inode)
 	return need_cp;
 }
 
+static bool need_inode_page_update(struct f2fs_sb_info *sbi, nid_t ino)
+{
+	struct page *i = find_get_page(NODE_MAPPING(sbi), ino);
+	bool ret = false;
+	/* But we need to avoid that there are some inode updates */
+	if ((i && PageDirty(i)) || need_inode_block_update(sbi, ino))
+		ret = true;
+	f2fs_put_page(i, 0);
+	return ret;
+}
+
 int f2fs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 {
 	struct inode *inode = file->f_mapping->host;
@@ -168,19 +179,21 @@ int f2fs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 		return ret;
 	}
 
+	/* if the inode is dirty, let's recover all the time */
+	if (!datasync && is_inode_flag_set(fi, FI_DIRTY_INODE)) {
+		update_inode_page(inode);
+		goto go_write;
+	}
+
 	/*
 	 * if there is no written data, don't waste time to write recovery info.
 	 */
 	if (!is_inode_flag_set(fi, FI_APPEND_WRITE) &&
 			!exist_written_data(sbi, ino, APPEND_INO)) {
-		struct page *i = find_get_page(NODE_MAPPING(sbi), ino);
 
-		/* But we need to avoid that there are some inode updates */
-		if ((i && PageDirty(i)) || need_inode_block_update(sbi, ino)) {
-			f2fs_put_page(i, 0);
+		/* it may call write_inode just prior to fsync */
+		if (need_inode_page_update(sbi, ino))
 			goto go_write;
-		}
-		f2fs_put_page(i, 0);
 
 		if (is_inode_flag_set(fi, FI_UPDATE_WRITE) ||
 				exist_written_data(sbi, ino, UPDATE_INO))
-- 
1.9.1
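
The control flow this patch sets up can be read as a small decision function. The sketch below is a userspace model under stated assumptions: the struct fields stand for the kernel checks named in the comments, fsync_decide() and the enum are hypothetical names, and the tail of the no-append branch (flush only versus nothing to do) is not visible in the diff context above, so it is an assumption here. The need_cp checkpoint path that precedes this code is left out.

```c
#include <stdbool.h>

enum fsync_action { FSYNC_SKIP, FSYNC_FLUSH_ONLY, FSYNC_WRITE };

struct fsync_state {
	bool datasync;			/* fdatasync() rather than fsync() */
	bool inode_dirty;		/* FI_DIRTY_INODE is set */
	bool appended;			/* FI_APPEND_WRITE or an APPEND_INO entry */
	bool inode_page_pending;	/* need_inode_page_update() would be true */
	bool updated;			/* FI_UPDATE_WRITE or an UPDATE_INO entry */
};

/* Model of the branch order introduced by the patch. */
static inline enum fsync_action fsync_decide(const struct fsync_state *s)
{
	/* A dirty inode goes straight to write, except for fdatasync. */
	if (!s->datasync && s->inode_dirty)
		return FSYNC_WRITE;

	if (!s->appended) {
		/* write_inode may have run just before this fsync. */
		if (s->inode_page_pending)
			return FSYNC_WRITE;
		/* Assumed tail: in-place updates only need a flush. */
		return s->updated ? FSYNC_FLUSH_ONLY : FSYNC_SKIP;
	}
	return FSYNC_WRITE;
}
```

Modelling it this way makes the effect of the !datasync guard visible: a dirty inode with fdatasync still falls through to the older checks instead of forcing a node write.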




Re: [f2fs-dev] [PATCH] f2fs: check if inode's state is dirty or not before skip fsync

2014-12-04 Thread Changman Lee
On Thu, Dec 04, 2014 at 04:58:29PM -0800, Jaegeuk Kim wrote:
 On Wed, Dec 03, 2014 at 10:46:38AM +0900, Changman Lee wrote:
  Hi Jaegeuk,
  
  Thanks for explanation.
  
  On Tue, Dec 02, 2014 at 11:42:19AM -0800, Jaegeuk Kim wrote:
   On Tue, Dec 02, 2014 at 01:21:31PM +0900, Changman Lee wrote:
Hi,

f2fs_dirty_inode just sets fi->flag to FI_DIRTY_INODE and does not
call update_inode_page. Instead, we do it when f2fs_write_inode is
called.
Do you have any reason for doing it like this?
   
   Actually, I'd like to use inode caches instead of dirty node pages as 
   much as
   possible to mitigate memory pressure as well as reduce node page writes.
   But, the reality is that f2fs triggers update_inode_page frequently, 
   since some
   inode information like i_blocks and i_links should be recovered 
   consistently
   from sudden power-cuts.
  
  I got it. No objection.
  
   
How about moving update_inode_page from write_inode to dirty_inode?
Then we can update the inode page when mark_inode_dirty or
mark_inode_dirty_sync is called, and control write I/O in
write_inode according to wbc->sync_mode.
   
   What do you mean controlling write I/O in write_inode?
   The write_inode does not trigger any I/Os.
   We're controlling node page writes by f2fs_write_node_pages.
  
  Sorry, my explanation was not sufficient.
  In __writeback_single_inode, write_inode is called if the inode is dirty.
  And ext4_write_inode and btrfs_write_inode issue writes
  according to wbc->sync_mode. However, current f2fs doesn't issue any
  write I/O. Could you review it?
 
 Hi,
 
 Well, I'm not quite sure that f2fs should do this.
 In terms of recovery, we don't need to do this.
 
  
   
   Anyway, if we call update_inode_page in mark_inode_dirty, f2fs would 
   suffer from
   a lot of dirty node pages.
  
  Got it. But I think we should write the dirty node after
  update_inode_page in write_inode if wbc->sync_mode == WB_SYNC_ALL.
 
 Why do we have to do this?
 Again, there is no problem wrt recovery, but that causes unnecessary IOs.
 
  
  
  Finally, I have one more question.
  At f2fs_sync_file, in the case of need_cp is true and file_wrong_pino
  f2fs calls write_inode. But the inode isn't written back. Is it okay?
  Could you elaborate on it?
 
 No problem. That pino will be used only for fsynced inodes after checkpoint.

I got it. My concern started from this. If there is no problem here,
I think the current f2fs_write_inode is also no problem.
Thanks Jaegeuk.

Then, let's merge your suggestion below.

Lastly, I'm curious about the node write type: APPEND or UPDATE.
Before fsync is called, isn't there any possibility that it changes from
UPDATE to APPEND? If so, we might lose recovery info.
I think we'd better check whether such a situation exists.

Regards,
Changman

 
 Thanks,
 
  
  Thanks,
  
   
   Thanks,
   
Could you consider this once?

Thanks,

On Mon, Dec 01, 2014 at 02:52:57PM -0800, Jaegeuk Kim wrote:
 On Mon, Dec 01, 2014 at 04:05:20PM +0900, Changman Lee wrote:
  It makes sense to check inode's state than check if
  inode page is dirty or not.
 
 Nice catch.
 However, we should leave the original condition, since write_inode 
 can be called
 in prior to this fsync call.
 And, this is not a proper fix, since it still can skip to write its 
 inode page. 
 
 How about this one?
 
 diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
 index 146e58a..6690599 100644
 --- a/fs/f2fs/file.c
 +++ b/fs/f2fs/file.c
 @@ -168,6 +168,12 @@ int f2fs_sync_file(struct file *file, loff_t 
 start, loff_t end, int datasync)
   return ret;
   }
  
 + /* if the inode is dirty, let's recover all the time */
 + if (is_inode_flag_set(fi, FI_DIRTY_INODE)) {
 + update_inode_page(inode);
 + goto go_write;
 + }
 +
   /*
* if there is no written data, don't waste time to write 
 recovery info.
*/
 -- 
 2.1.1
 
  
  Signed-off-by: Changman Lee cm224@samsung.com
  ---
   fs/f2fs/file.c | 7 ++-
   1 file changed, 2 insertions(+), 5 deletions(-)
  
  diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
  index 7c2ec3e..0c5ae87 100644
  --- a/fs/f2fs/file.c
  +++ b/fs/f2fs/file.c
  @@ -173,14 +173,11 @@ int f2fs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
  	 */
  	if (!is_inode_flag_set(fi, FI_APPEND_WRITE) &&
  			!exist_written_data(sbi, ino, APPEND_INO)) {
  -		struct page *i = find_get_page(NODE_MAPPING(sbi), ino);
   
  		/* But we need to avoid that there are some inode updates */
  -		if ((i && PageDirty(i)) || need_inode_block_update(sbi, ino)) {
  -			f2fs_put_page(i, 0);
  +		if (is_inode_flag_set(fi

Re: [f2fs-dev] [PATCH] f2fs: check if inode's state is dirty or not before skip fsync

2014-12-02 Thread Changman Lee
Hi Jaegeuk,

Thanks for explanation.

On Tue, Dec 02, 2014 at 11:42:19AM -0800, Jaegeuk Kim wrote:
 On Tue, Dec 02, 2014 at 01:21:31PM +0900, Changman Lee wrote:
  Hi,
  
  f2fs_dirty_inode just sets fi->flag to FI_DIRTY_INODE and does not
  call update_inode_page. Instead, we do it when f2fs_write_inode is called.
  Do you have any reason for doing it like this?
 
 Actually, I'd like to use inode caches instead of dirty node pages as much as
 possible to mitigate memory pressure as well as reduce node page writes.
 But, the reality is that f2fs triggers update_inode_page frequently, since 
 some
 inode information like i_blocks and i_links should be recovered consistently
 from sudden power-cuts.

I got it. No objection.

 
  How about moving update_inode_page from write_inode to dirty_inode?
  Then we can update the inode page when mark_inode_dirty or
  mark_inode_dirty_sync is called, and control write I/O in
  write_inode according to wbc->sync_mode.
 
 What do you mean controlling write I/O in write_inode?
 The write_inode does not trigger any I/Os.
 We're controlling node page writes by f2fs_write_node_pages.

Sorry, my explanation was not sufficient.
In __writeback_single_inode, write_inode is called if the inode is dirty.
And ext4_write_inode and btrfs_write_inode issue writes
according to wbc->sync_mode. However, current f2fs doesn't issue any
write I/O. Could you review it?

 
 Anyway, if we call update_inode_page in mark_inode_dirty, f2fs would suffer 
 from
 a lot of dirty node pages.

Got it. But I think we should write the dirty node after
update_inode_page in write_inode if wbc->sync_mode == WB_SYNC_ALL.


Finally, I have one more question.
At f2fs_sync_file, in the case of need_cp is true and file_wrong_pino
f2fs calls write_inode. But the inode isn't written back. Is it okay?
Could you elaborate on it?

Thanks,

 
 Thanks,
 
  Could you consider this once?
  
  Thanks,
  
  On Mon, Dec 01, 2014 at 02:52:57PM -0800, Jaegeuk Kim wrote:
   On Mon, Dec 01, 2014 at 04:05:20PM +0900, Changman Lee wrote:
It makes sense to check inode's state than check if
inode page is dirty or not.
   
   Nice catch.
   However, we should leave the original condition, since write_inode can be 
   called
   in prior to this fsync call.
   And, this is not a proper fix, since it still can skip to write its inode 
   page. 
   
   How about this one?
   
   diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
   index 146e58a..6690599 100644
   --- a/fs/f2fs/file.c
   +++ b/fs/f2fs/file.c
   @@ -168,6 +168,12 @@ int f2fs_sync_file(struct file *file, loff_t start, 
   loff_t end, int datasync)
 return ret;
 }

   + /* if the inode is dirty, let's recover all the time */
   + if (is_inode_flag_set(fi, FI_DIRTY_INODE)) {
   + update_inode_page(inode);
   + goto go_write;
   + }
   +
 /*
  * if there is no written data, don't waste time to write recovery info.
  */
   -- 
   2.1.1
   

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/file.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 7c2ec3e..0c5ae87 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -173,14 +173,11 @@ int f2fs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	 */
 	if (!is_inode_flag_set(fi, FI_APPEND_WRITE) &&
 			!exist_written_data(sbi, ino, APPEND_INO)) {
-		struct page *i = find_get_page(NODE_MAPPING(sbi), ino);
 
 		/* But we need to avoid that there are some inode updates */
-		if ((i && PageDirty(i)) || need_inode_block_update(sbi, ino)) {
-			f2fs_put_page(i, 0);
+		if (is_inode_flag_set(fi, FI_DIRTY_INODE) ||
+				need_inode_block_update(sbi, ino))
 			goto go_write;
-		}
-		f2fs_put_page(i, 0);
 
 		if (is_inode_flag_set(fi, FI_UPDATE_WRITE) ||
 				exist_written_data(sbi, ino, UPDATE_INO))
-- 
1.9.1



[f2fs-dev] f2fs_write_inode

2014-12-01 Thread Changman Lee
Hi guys,

I was wondering why f2fs_write_inode doesn't submit any I/O according to
wbc->sync_mode.
If you have any idea, please answer my questions.

And at f2fs_sync_file,

	if (need_cp) {

Q: We've already called sync_fs. Is there any scenario like the one below?
I referred to 354a3399dc6f7e556d04e1c731cd50e08eeb44bd but I can't guess the
situation.

		if (file_wrong_pino(inode) && inode->i_nlink == 1 &&
				get_parent_ino(inode, &pino)) {
			fi->i_pino = pino;
			file_got_pino(inode);
			up_write(&fi->i_sem);
			mark_inode_dirty_sync(inode);

Q: Update but no write I/O. How to recover after SPO?

			ret = f2fs_write_inode(inode, NULL);
			if (ret)
				goto out;
		} else {
			up_write(&fi->i_sem);
		}
	} else {

~ snip ~

out:
	return ret;


Regards,
Changman



Re: [f2fs-dev] [PATCH] f2fs: check if inode's state is dirty or not before skip fsync

2014-12-01 Thread Changman Lee
Hi,

f2fs_dirty_inode only sets FI_DIRTY_INODE in fi->flag and does not
call update_inode_page. Instead, we do that when f2fs_write_inode is called.
Is there a reason for doing it this way?
How about moving update_inode_page from write_inode to dirty_inode?
Then the inode page is updated whenever mark_inode_dirty or
mark_inode_dirty_sync is called, and we can control write I/O in
write_inode according to wbc->sync_mode.
Could you consider this?
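
To make the proposal concrete, here is a minimal user-space sketch of the
split being suggested. It is only a model under stated assumptions: the
model_* names, the struct fields, and the two sync modes are hypothetical
stand-ins, not the kernel's types or the f2fs implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's types or functions. */
enum model_sync_mode { MODEL_SYNC_NONE, MODEL_SYNC_ALL };

struct model_inode {
	long size;          /* in-memory inode field */
	long page_size;     /* copy of that field held in the cached inode page */
	bool page_dirty;    /* cached inode page differs from what is on disk */
	int submitted_ios;  /* number of write I/Os issued so far */
};

/* Proposed dirty_inode: copy in-memory fields into the inode page right away. */
static void model_dirty_inode(struct model_inode *inode)
{
	assert(inode != NULL);
	inode->page_size = inode->size;
	inode->page_dirty = true;
}

/* Proposed write_inode: the page is already current, so only decide on I/O. */
static int model_write_inode(struct model_inode *inode,
			     enum model_sync_mode mode)
{
	assert(inode != NULL);
	if (!inode->page_dirty)
		return 0;
	if (mode == MODEL_SYNC_ALL) {
		/* Synchronous writeback: submit the page now. */
		inode->submitted_ios++;
		inode->page_dirty = false;
	}
	/* MODEL_SYNC_NONE: leave the dirty page for a later flush. */
	return 0;
}
```

In this model a MODEL_SYNC_NONE call leaves the page dirty and issues
nothing, while a MODEL_SYNC_ALL call issues exactly one write. Whether the
real code can be arranged this way depends on locking and recovery details
that the sketch ignores.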

Thanks,

On Mon, Dec 01, 2014 at 02:52:57PM -0800, Jaegeuk Kim wrote:
 On Mon, Dec 01, 2014 at 04:05:20PM +0900, Changman Lee wrote:
  It makes more sense to check the inode's state than to check whether
  the inode page is dirty.
 
 Nice catch.
 However, we should leave the original condition, since write_inode can be
 called prior to this fsync call.
 And this is not a proper fix, since it can still skip writing the inode
 page.
 
 How about this one?
 
 diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
 index 146e58a..6690599 100644
 --- a/fs/f2fs/file.c
 +++ b/fs/f2fs/file.c
 @@ -168,6 +168,12 @@ int f2fs_sync_file(struct file *file, loff_t start, 
 loff_t end, int datasync)
   return ret;
   }
  
 + /* if the inode is dirty, let's recover all the time */
 + if (is_inode_flag_set(fi, FI_DIRTY_INODE)) {
 + update_inode_page(inode);
 + goto go_write;
 + }
 +
   /*
* if there is no written data, don't waste time to write recovery info.
*/
 -- 
 2.1.1
 
  
  Signed-off-by: Changman Lee cm224@samsung.com
  ---
   fs/f2fs/file.c | 7 ++-
   1 file changed, 2 insertions(+), 5 deletions(-)
  
  diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
  index 7c2ec3e..0c5ae87 100644
  --- a/fs/f2fs/file.c
  +++ b/fs/f2fs/file.c
  @@ -173,14 +173,11 @@ int f2fs_sync_file(struct file *file, loff_t start, 
  loff_t end, int datasync)
   */
  if (!is_inode_flag_set(fi, FI_APPEND_WRITE) &&
  !exist_written_data(sbi, ino, APPEND_INO)) {
  -   struct page *i = find_get_page(NODE_MAPPING(sbi), ino);
   
  /* But we need to avoid that there are some inode updates */
  -   if ((i && PageDirty(i)) || need_inode_block_update(sbi, ino)) {
  -   f2fs_put_page(i, 0);
  +   if (is_inode_flag_set(fi, FI_DIRTY_INODE) ||
  +   need_inode_block_update(sbi, ino))
  goto go_write;
  -   }
  -   f2fs_put_page(i, 0);
   
  if (is_inode_flag_set(fi, FI_UPDATE_WRITE) ||
  exist_written_data(sbi, ino, UPDATE_INO))
  -- 
  1.9.1
  
  



[f2fs-dev] [PATCH] f2fs: more fast lookup for gc_inode list

2014-11-27 Thread Changman Lee
If many inodes have data blocks in the victim segment, it takes a long
time to find an inode in the gc_inode list.
Let's use a radix tree to reduce the lookup time.

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/gc.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 29fc7e5..fc765c1 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -24,6 +24,7 @@
 #include "gc.h"
 #include <trace/events/f2fs.h>
 
+RADIX_TREE(gc_inode_root, GFP_ATOMIC);
 static struct kmem_cache *winode_slab;
 
 static int gc_thread_func(void *data)
@@ -338,13 +339,13 @@ static const struct victim_selection default_v_ops = {
 	.get_victim = get_victim_by_default,
 };
 
-static struct inode *find_gc_inode(nid_t ino, struct list_head *ilist)
+static struct inode *find_gc_inode(nid_t ino)
 {
 	struct inode_entry *ie;
 
-	list_for_each_entry(ie, ilist, list)
-		if (ie->inode->i_ino == ino)
-			return ie->inode;
+	ie = radix_tree_lookup(&gc_inode_root, ino);
+	if (ie)
+		return ie->inode;
 	return NULL;
 }
 
@@ -352,13 +353,19 @@ static void add_gc_inode(struct inode *inode, struct list_head *ilist)
 {
 	struct inode_entry *new_ie;
 
-	if (inode == find_gc_inode(inode->i_ino, ilist)) {
+	new_ie = radix_tree_lookup(&gc_inode_root, inode->i_ino);
+	if (new_ie) {
 		iput(inode);
 		return;
 	}
 
 	new_ie = f2fs_kmem_cache_alloc(winode_slab, GFP_NOFS);
 	new_ie->inode = inode;
+
+	if (radix_tree_insert(&gc_inode_root, inode->i_ino, new_ie)) {
+		kmem_cache_free(winode_slab, new_ie);
+		return;
+	}
 	list_add_tail(&new_ie->list, ilist);
 }
 
@@ -367,7 +374,7 @@ static void put_gc_inode(struct list_head *ilist)
 	struct inode_entry *ie, *next_ie;
 	list_for_each_entry_safe(ie, next_ie, ilist, list) {
 		iput(ie->inode);
-		list_del(&ie->list);
+		radix_tree_delete(&gc_inode_root, ie->inode->i_ino);
 		kmem_cache_free(winode_slab, ie);
 	}
 }
@@ -614,7 +621,7 @@ next_step:
 		}
 
 		/* phase 3 */
-		inode = find_gc_inode(dni.ino, ilist);
+		inode = find_gc_inode(dni.ino);
 		if (inode) {
 			start_bidx = start_bidx_of_node(nofs,
 					F2FS_I(inode));
-- 
1.9.1
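
The lookup-cost argument in the commit message can be illustrated outside
the kernel. The following is a minimal user-space sketch, not the kernel's
radix tree: toy_radix is a hypothetical fixed-depth tree that consumes a
32-bit inode number 8 bits at a time, so a lookup touches at most four
nodes no matter how many inodes are tracked, whereas the list walk it
replaces is linear in the number of entries. All toy_* names are made up
for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_BITS   8                  /* key bits consumed per level */
#define TOY_FANOUT (1u << TOY_BITS)   /* 256 slots per node */
#define TOY_LEVELS 4                  /* 4 levels * 8 bits = 32-bit keys */

/* Inner levels store child nodes in slot[]; the last level stores items. */
struct toy_node {
	void *slot[TOY_FANOUT];
};

struct toy_radix {
	struct toy_node *root;
};

/* Slot index for `key` at `level` (level 0 uses the most significant bits). */
static unsigned toy_index(uint32_t key, int level)
{
	int shift = (TOY_LEVELS - 1 - level) * TOY_BITS;
	return (key >> shift) & (TOY_FANOUT - 1);
}

/* Returns 0 on success, -1 if the key is already present or on OOM. */
static int toy_insert(struct toy_radix *t, uint32_t key, void *item)
{
	assert(t != NULL && item != NULL);
	if (!t->root && !(t->root = calloc(1, sizeof(*t->root))))
		return -1;
	struct toy_node *n = t->root;
	for (int level = 0; level < TOY_LEVELS - 1; level++) {
		unsigned idx = toy_index(key, level);
		if (!n->slot[idx] && !(n->slot[idx] = calloc(1, sizeof(*n))))
			return -1;
		n = n->slot[idx];
	}
	unsigned leaf = toy_index(key, TOY_LEVELS - 1);
	if (n->slot[leaf])
		return -1;
	n->slot[leaf] = item;
	return 0;
}

/* Walks at most TOY_LEVELS nodes; returns NULL when the key is absent. */
static void *toy_lookup(const struct toy_radix *t, uint32_t key)
{
	const struct toy_node *n = t->root;
	for (int level = 0; n && level < TOY_LEVELS - 1; level++)
		n = n->slot[toy_index(key, level)];
	return n ? n->slot[toy_index(key, TOY_LEVELS - 1)] : NULL;
}

/* Clears the slot and returns the old item (inner nodes are kept). */
static void *toy_delete(struct toy_radix *t, uint32_t key)
{
	struct toy_node *n = t->root;
	for (int level = 0; n && level < TOY_LEVELS - 1; level++)
		n = n->slot[toy_index(key, level)];
	if (!n)
		return NULL;
	unsigned leaf = toy_index(key, TOY_LEVELS - 1);
	void *old = n->slot[leaf];
	n->slot[leaf] = NULL;
	return old;
}
```

The sketch never frees inner nodes and does no locking. Note also that an
index kept next to a list needs every removal applied to both structures.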




[f2fs-dev] [PATCH] f2fs: move put_gc_inode into gc_mutex

2014-11-27 Thread Changman Lee
There is no lock protecting the gc_inode list, so let's move put_gc_inode
inside gc_mutex; otherwise links of the list might be lost.

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/gc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 657683c9..99e1720 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -733,9 +733,9 @@ gc_more:
 	if (gc_type == FG_GC)
 		write_checkpoint(sbi, &cpc);
 stop:
-	mutex_unlock(&sbi->gc_mutex);
-
 	put_gc_inode(&ilist);
+
+	mutex_unlock(&sbi->gc_mutex);
 	return ret;
 }
 
-- 
1.9.1




Re: [f2fs-dev] [PATCH] f2fs: move put_gc_inode into gc_mutex

2014-11-27 Thread Changman Lee
On Thu, Nov 27, 2014 at 07:55:14PM -0800, Jaegeuk Kim wrote:
 Hi Changman,
 
 On Thu, Nov 27, 2014 at 06:42:54PM +0900, Changman Lee wrote:
  There is no lock protecting the gc_inode list, so let's move put_gc_inode
  inside gc_mutex; otherwise links of the list might be lost.
 
 Could you explain why the links can be lost?
 After all, ilist is a local variable.

Hi Jaegeuk,

Oh, I missed that ilist is a local variable.
Sorry, ignore this patch.

Thanks,

 
 IIRC, the reason why put_gc_inode is called outside of gc_mutex is to avoid
 deadlock between f2fs_evict_inode and gc operations.
 I'm not sure it still has a problem, but it is unclear that we have to move
 put_gc_inode inside gc_mutex.
 
 Are you facing any bug related to this?
 
 Thanks,
 
  
  Signed-off-by: Changman Lee cm224@samsung.com
  ---
   fs/f2fs/gc.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)
  
  diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
  index 657683c9..99e1720 100644
  --- a/fs/f2fs/gc.c
  +++ b/fs/f2fs/gc.c
  @@ -733,9 +733,9 @@ gc_more:
   	if (gc_type == FG_GC)
   		write_checkpoint(sbi, &cpc);
   stop:
  -	mutex_unlock(&sbi->gc_mutex);
  -
   	put_gc_inode(&ilist);
  +
  +	mutex_unlock(&sbi->gc_mutex);
   	return ret;
   }
   
  -- 
  1.9.1
  
  



[f2fs-dev] [PATCH] f2fs: cleanup if-statement of phase in gc_data_segment

2014-11-26 Thread Changman Lee
Little cleanup to distinguish each phase easily

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/gc.c | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 81686b2..de00713 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -602,27 +602,28 @@ next_step:
 
data_page = find_data_page(inode,
start_bidx + ofs_in_node, false);
-   if (IS_ERR(data_page))
-   goto next_iput;
+   if (IS_ERR(data_page)) {
+   iput(inode);
+   continue;
+   }
 
f2fs_put_page(data_page, 0);
add_gc_inode(inode, ilist);
-   } else {
-   inode = find_gc_inode(dni.ino, ilist);
-   if (inode) {
-   start_bidx = start_bidx_of_node(nofs,
-   F2FS_I(inode));
-   data_page = get_lock_data_page(inode,
-   start_bidx + ofs_in_node);
-   if (IS_ERR(data_page))
-   continue;
-   move_data_page(inode, data_page, gc_type);
-   stat_inc_data_blk_count(sbi, 1);
-   }
+   continue;
+   }
+
+   /* phase 3 */
+   inode = find_gc_inode(dni.ino, ilist);
+   if (inode) {
+   start_bidx = start_bidx_of_node(nofs,
+   F2FS_I(inode));
+   data_page = get_lock_data_page(inode,
+   start_bidx + ofs_in_node);
+   if (IS_ERR(data_page))
+   continue;
+   move_data_page(inode, data_page, gc_type);
+   stat_inc_data_blk_count(sbi, 1);
}
-   continue;
-next_iput:
-   iput(inode);
}
 
 	if (++phase < 4)
-- 
1.9.1




Re: [f2fs-dev] [PATCH 1/3] f2fs: call flush_dcache_page when the page was updated

2014-11-25 Thread Changman Lee
Hi Simon,

Thank you very much for your interest.
Your explanation made it much clearer.

Regards,
Changman

On Tue, Nov 25, 2014 at 08:05:23PM +0100, Simon Baatz wrote:
 Hi Changman,
 
 On Mon, Nov 24, 2014 at 11:46:46AM +0900, Changman Lee wrote:
  Hi Simon,
  Thanks for your explanation kindly.
  
  On Sun, Nov 23, 2014 at 11:08:54AM +0100, Simon Baatz wrote:
   Hi Changman, Jaegeuk,
   
   On Thu, Nov 20, 2014 at 05:47:29PM +0900, Changman Lee wrote:
On Wed, Nov 19, 2014 at 10:45:33PM -0800, Jaegeuk Kim wrote:
 On Thu, Nov 20, 2014 at 03:04:10PM +0900, Changman Lee wrote:
  Hi Jaegeuk,
  
  We should call flush_dcache_page before kunmap because the purpose 
  of the cache flush is to address aliasing problem related to 
  virtual address.
 
 Oh, I just followed zero_user_segments below.
 
 static inline void zero_user_segments(struct page *page,
   unsigned start1, unsigned end1,
   unsigned start2, unsigned end2)
 {
   void *kaddr = kmap_atomic(page);
 
   BUG_ON(end1 > PAGE_SIZE || end2 > PAGE_SIZE);
 
   if (end1 > start1)
           memset(kaddr + start1, 0, end1 - start1);
 
   if (end2 > start2)
           memset(kaddr + start2, 0, end2 - start2);
 
   kunmap_atomic(kaddr);
   flush_dcache_page(page);
 }
 
 Is this a wrong reference? Or, a bug?
 

Well, the data in the cache only has to be flushed before other users
read it.
If so, it's not a bug.

   
   Yes, it is not a bug, since flush_dcache_page() needs to be able to
   deal with non-kmapped pages. However, this may create overhead in
   some situations.
   
  
  Previously I was vague, but I thought the behavior would differ
  depending on whether a vaddr exists or not. So I told Jaegeuk that it
  would be better to swap the order of flush_dcache_page and kunmap.
  But actually, the order between them doesn't matter except in
  the situation you mentioned.
  Could you explain the situation in which flushing after kunmap creates
  overhead?
  I can't imagine it just from reading the flush_dcache_page code.
  
 
 I was not very precise here. Yes, flush_dcache_page() on ARM does
 the same in both situations since it has no idea whether it is called
 before or after kunmap.  However, flush_kernel_dcache_page() can
 assume that it is called before kunmap and thus, for example, does not
 need to pin a highmem page by kmap_high_get() (apart from not having
 to care about flushing user space mappings)
 
   According to documentation (see Documentation/cachetlb.txt), this is
   a use for flush_kernel_dcache_page(), since the page has been
   modified by the kernel only.  In contrast to flush_dcache_page(),
   this function must be called before kunmap().
   
   flush_kernel_dcache_page() does not need to flush the user space
   aliases.  Additionally, at least on ARM, it does not flush at all
   when called within kmap_atomic()/kunmap_atomic(), when
   kunmap_atomic() is going to flush the page anyway.  (I know that
   almost no one uses flush_kernel_dcache_page() (probably because
   almost no one knows when to use which of the two functions), but it
   may save a few cache flushes on architectures which are affected by
   aliasing)
   
   
 Anyway I modified as below.
 
 Thanks,
 
 From 7cb7b27c8cd2efc8a31d79239bef5b41c6e79216 Mon Sep 17 00:00:00 
 2001
 From: Jaegeuk Kim jaeg...@kernel.org
 Date: Tue, 18 Nov 2014 10:50:21 -0800
 Subject: [PATCH] f2fs: call flush_dcache_page when the page was 
 updated
 
 Whenever f2fs updates mapped pages, it needs to call 
 flush_dcache_page.
 
 Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
 ---
  fs/f2fs/dir.c| 7 ++-
  fs/f2fs/inline.c | 2 ++
  2 files changed, 8 insertions(+), 1 deletion(-)
 
 diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
 index 5a49995..fabf4ee 100644
 --- a/fs/f2fs/dir.c
 +++ b/fs/f2fs/dir.c
 @@ -287,8 +287,10 @@ void f2fs_set_link(struct inode *dir, struct 
 f2fs_dir_entry *de,
   f2fs_wait_on_page_writeback(page, type);
   de->ino = cpu_to_le32(inode->i_ino);
   set_de_type(de, inode);
 - if (!f2fs_has_inline_dentry(dir))
 + if (!f2fs_has_inline_dentry(dir)) {
 + flush_dcache_page(page);
   kunmap(page);
 + }
   
   Is this a page that may be mapped into user space? (I may be
   completely wrong here, since I have no idea how this code works.  But
   it looks like as if the answer is no ;-) ).
   
   It is not necessary to flush pages that cannot be seen by user space
   (see also the NOTE in the documentation of flush_dcache_page() in
   cachetlb.txt). Thus, if you know that a page will not be mapped into
   user space, please don't create the overhead of flushing it.
   
  
  In the case of dentry unlike inline data

Re: [f2fs-dev] [PATCH] f2fs: add cleancache support

2014-11-24 Thread Changman Lee
On Sun, Nov 23, 2014 at 11:18:00PM -0800, Jaegeuk Kim wrote:
 On Mon, Nov 24, 2014 at 03:19:43PM +0900, Changman Lee wrote:
  On Sun, Nov 23, 2014 at 09:42:12PM -0800, Jaegeuk Kim wrote:
   On Thu, Nov 20, 2014 at 01:38:51PM +0900, Changman Lee wrote:
On Fri, Nov 14, 2014 at 02:53:02PM +0900, Changman Lee wrote:
 On Thu, Nov 13, 2014 at 05:27:51PM -0800, Jaegeuk Kim wrote:
  Hi Changman,
  
  On Thu, Nov 13, 2014 at 02:34:50PM +0900, Changman Lee wrote:
   To use cleancache, fs must explicitly enable cleancache by calling
   cleancache_init_fs.
  
  Good catch!
  
  Prior to merge this patch, can you share any testing results or 
  performance
  numbers?
  
 Not yet, I'll try to get numbers.
 

Hi,

This is the result of a kernel compile on xen-4.4 with tmem enabled
(cleancache and frontswap).
I'm afraid cleancache makes little difference.
Cleancache shows some cache hits, but no visible effect from them.
I don't know the best benchmark to verify it yet.
Finally, I didn't discover any bugs during the test.

[before patch]
Run             1           2           3
Elapsed time    25:00.67    25:07.09    25:00.38
Major fault     31100       31410       31333
Minor fault     276869398   276869318   276871144

[after patch]
Run             1           2           3
Elapsed time    25:12.34    25:13.29    25:11.99
Major fault     31559       32069       31801
Minor fault     276870283   276868046   276869251

[cleancache] - diff between start and end
Run             1           2           3
failed_gets     1277980     1296355     1300368
invalidates     2588227     2651722     2655285
puts            1289970     1323685     1320623
*succ_gets*     11          121299      114310
   
   Hi Changman,
   
   So, what is your suggestion?
   IMO, we first need to find a way exploiting cleancache over f2fs, so that
   we can introduce some guide for users.
   Until then, how about keeping this patch for a while?
   
  
  The performance of cleancache depends on the workload, but ext4 and btrfs
  already support it. So how about allowing cleancache to be enabled on f2fs?
  If no cleancache backend exists, there is no effect on f2fs.
  I think the negative effect of cleancache is small.
  Anyway, the final decision is in your hands.
 
 I'm not sure, but it seems that nobody uses the cleancache.
 https://www.google.co.kr/trends/explore#q=cleancache
 
 And, as you've shown even worse performance under a simple workload, I don't
 understand why you want to add this.
 
 Let me know, if I'm missing any rationale.
 

Okay, let's hold it until we find a way to exploit it well.

I thought about measuring Firefox's startup time. To do that, however, I
needed to install Ubuntu on f2fs, and setting up that test environment
takes a long time. So I gave up. :(
I have no rationale now.

  
  Thanks
  
   Thanks,
   

Thanks,
Changman

  What condition will be the best way to exploit f2fs and cleancache?
  
 Not clear.
 I think we can make a cleancache client for f2fs that can compensate for
 the penalty of node pages, which are mostly read.
 
  Can we confirm that f2fs satisfies most of requirements described by
  cleancache.txt below?
 
 Good point.
 At a quick glance, F2FS seems to satisfy most of requirements.
 Through an experiment, I'll try to check for side effects.
 
  
  Some points for a filesystem to consider:
  
  - The FS should be block-device-based (e.g. a ram-based FS such
as tmpfs should not enable cleancache)
  - To ensure coherency/correctness, the FS must ensure that all
file removal or truncation operations either go through VFS or
add hooks to do the equivalent cleancache invalidate operations
  - To ensure coherency/correctness, either inode numbers must
be unique across the lifetime of the on-disk file OR the
FS must provide an encode_fh function.
  - The FS must call the VFS superblock alloc and deactivate routines
or add hooks to do the equivalent cleancache calls done there.
  - To maximize performance, all pages fetched from the FS should
go through the do_mpag_readpage routine or the FS should add
hooks to do the equivalent (cf. btrfs)
  - Currently, the FS blocksize must be the same as PAGESIZE.  This
is not an architectural restriction, but no backends currently
support anything different.
  - A clustered FS should invoke the shared_init_fs cleancache
hook to get best performance for some backends.
  
  Thanks,
  
   
   Signed-off-by: Changman Lee cm224@samsung.com
   ---
fs/f2fs/super.c | 3

[f2fs-dev] [PATCH 2/2] f2fs: no more dirty_nat_entires when flushing

2014-11-24 Thread Changman Lee
After flushing dirty nat entries, there must be no dirty nat entries
left in the set.

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/node.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index f6bd222..fc1077b 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1925,10 +1925,10 @@ static void __flush_nat_entry_set(struct f2fs_sb_info *sbi,
 	else
 		f2fs_put_page(page, 1);
 
-	if (!set->entry_cnt) {
-		radix_tree_delete(&NM_I(sbi)->nat_set_root, set->set);
-		kmem_cache_free(nat_entry_set_slab, set);
-	}
+	f2fs_bug_on(sbi, set->entry_cnt);
+
+	radix_tree_delete(&NM_I(sbi)->nat_set_root, set->set);
+	kmem_cache_free(nat_entry_set_slab, set);
 }
 
 /*
-- 
1.9.1




Re: [f2fs-dev] [PATCH 1/3] f2fs: call flush_dcache_page when the page was updated

2014-11-23 Thread Changman Lee
Hi Simon,
Thank you for the kind explanation.

On Sun, Nov 23, 2014 at 11:08:54AM +0100, Simon Baatz wrote:
 Hi Changman, Jaegeuk,
 
 On Thu, Nov 20, 2014 at 05:47:29PM +0900, Changman Lee wrote:
  On Wed, Nov 19, 2014 at 10:45:33PM -0800, Jaegeuk Kim wrote:
   On Thu, Nov 20, 2014 at 03:04:10PM +0900, Changman Lee wrote:
Hi Jaegeuk,

We should call flush_dcache_page before kunmap because the purpose of 
the cache flush is to address aliasing problem related to virtual 
address.
   
   Oh, I just followed zero_user_segments below.
   
   static inline void zero_user_segments(struct page *page,
 unsigned start1, unsigned end1,
 unsigned start2, unsigned end2)
   {
 void *kaddr = kmap_atomic(page);
   
 BUG_ON(end1 > PAGE_SIZE || end2 > PAGE_SIZE);
   
 if (end1 > start1)
         memset(kaddr + start1, 0, end1 - start1);
   
 if (end2 > start2)
         memset(kaddr + start2, 0, end2 - start2);
   
 kunmap_atomic(kaddr);
 flush_dcache_page(page);
   }
   
   Is this a wrong reference? Or, a bug?
   
  
  Well, the data in the cache only has to be flushed before other users
  read it.
  If so, it's not a bug.
  
 
 Yes, it is not a bug, since flush_dcache_page() needs to be able to
 deal with non-kmapped pages. However, this may create overhead in
 some situations.
 

Previously I was vague, but I thought the behavior would differ
depending on whether a vaddr exists or not. So I told Jaegeuk that it
would be better to swap the order of flush_dcache_page and kunmap.
But actually, the order between them doesn't matter except in
the situation you mentioned.
Could you explain the situation in which flushing after kunmap creates
overhead?
I can't imagine it just from reading the flush_dcache_page code.

 According to documentation (see Documentation/cachetlb.txt), this is
 a use for flush_kernel_dcache_page(), since the page has been
 modified by the kernel only.  In contrast to flush_dcache_page(),
 this function must be called before kunmap().
 
 flush_kernel_dcache_page() does not need to flush the user space
 aliases.  Additionally, at least on ARM, it does not flush at all
 when called within kmap_atomic()/kunmap_atomic(), when
 kunmap_atomic() is going to flush the page anyway.  (I know that
 almost no one uses flush_kernel_dcache_page() (probably because
 almost no one knows when to use which of the two functions), but it
 may save a few cache flushes on architectures which are affected by
 aliasing)
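
The ordering rule described above can be sketched with a small user-space
model. This is not kernel code: the model_* functions and struct
model_page are hypothetical stand-ins that only record whether the page is
still mapped when the kernel-side flush is requested. The point is the
call order for a page modified by the kernel only: map, write, flush the
kernel alias, then unmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* User-space stand-ins for illustration only; none of this is kernel code. */
struct model_page {
	unsigned char data[64];
	bool mapped;         /* true between model_kmap() and model_kunmap() */
	int kernel_flushes;  /* accepted model_flush_kernel_dcache_page() calls */
};

static void *model_kmap(struct model_page *p)
{
	p->mapped = true;
	return p->data;
}

static void model_kunmap(struct model_page *p)
{
	p->mapped = false;
}

/*
 * Models the documented constraint: the kernel-only flush is valid only
 * while the page is still mapped. Returns false if called too late.
 */
static bool model_flush_kernel_dcache_page(struct model_page *p)
{
	if (!p->mapped)
		return false;
	p->kernel_flushes++;
	return true;
}

/* Kernel-only update: write, flush the kernel alias, then unmap. */
static bool model_update(struct model_page *p, const void *src, size_t len)
{
	assert(p != NULL && src != NULL);
	if (len > sizeof(p->data))
		return false;
	void *kaddr = model_kmap(p);
	memcpy(kaddr, src, len);
	bool ok = model_flush_kernel_dcache_page(p);  /* before kunmap */
	model_kunmap(p);
	return ok;
}
```

Swapping the last two calls in model_update() makes the model's flush
report failure, which mirrors the "must be called before kunmap()"
wording; the real architecture-specific behavior is of course richer than
a boolean.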
 
 
   Anyway I modified as below.
   
   Thanks,
   
   From 7cb7b27c8cd2efc8a31d79239bef5b41c6e79216 Mon Sep 17 00:00:00 2001
   From: Jaegeuk Kim jaeg...@kernel.org
   Date: Tue, 18 Nov 2014 10:50:21 -0800
   Subject: [PATCH] f2fs: call flush_dcache_page when the page was updated
   
   Whenever f2fs updates mapped pages, it needs to call flush_dcache_page.
   
   Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
   ---
fs/f2fs/dir.c| 7 ++-
fs/f2fs/inline.c | 2 ++
2 files changed, 8 insertions(+), 1 deletion(-)
   
   diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
   index 5a49995..fabf4ee 100644
   --- a/fs/f2fs/dir.c
   +++ b/fs/f2fs/dir.c
   @@ -287,8 +287,10 @@ void f2fs_set_link(struct inode *dir, struct 
   f2fs_dir_entry *de,
 f2fs_wait_on_page_writeback(page, type);
 de->ino = cpu_to_le32(inode->i_ino);
 set_de_type(de, inode);
   - if (!f2fs_has_inline_dentry(dir))
   + if (!f2fs_has_inline_dentry(dir)) {
   + flush_dcache_page(page);
 kunmap(page);
   + }
 
 Is this a page that may be mapped into user space? (I may be
 completely wrong here, since I have no idea how this code works.  But
 it looks like as if the answer is no ;-) ).
 
 It is not necessary to flush pages that cannot be seen by user space
 (see also the NOTE in the documentation of flush_dcache_page() in
 cachetlb.txt). Thus, if you know that a page will not be mapped into
 user space, please don't create the overhead of flushing it.
 

In the case of a dentry page, unlike inline data, the page is not mapped
into user space, so the dcache flush is only overhead. Is that what you mean?

Best regards,
Changman

 
 - Simon



Re: [f2fs-dev] [PATCH 1/3] f2fs: call flush_dcache_page when the page was updated

2014-11-20 Thread Changman Lee
On Wed, Nov 19, 2014 at 10:45:33PM -0800, Jaegeuk Kim wrote:
 On Thu, Nov 20, 2014 at 03:04:10PM +0900, Changman Lee wrote:
  Hi Jaegeuk,
  
  We should call flush_dcache_page before kunmap because the purpose of the 
  cache flush is to address aliasing problem related to virtual address.
 
 Oh, I just followed zero_user_segments below.
 
 static inline void zero_user_segments(struct page *page,
   unsigned start1, unsigned end1,
   unsigned start2, unsigned end2)
 {
   void *kaddr = kmap_atomic(page);
 
   BUG_ON(end1 > PAGE_SIZE || end2 > PAGE_SIZE);
 
   if (end1 > start1)
           memset(kaddr + start1, 0, end1 - start1);
 
   if (end2 > start2)
           memset(kaddr + start2, 0, end2 - start2);
 
   kunmap_atomic(kaddr);
   flush_dcache_page(page);
 }
 
 Is this a wrong reference? Or, a bug?
 

Well, the data in the cache only has to be flushed before other users
read it.
If so, it's not a bug.

 Anyway I modified as below.
 
 Thanks,
 
 From 7cb7b27c8cd2efc8a31d79239bef5b41c6e79216 Mon Sep 17 00:00:00 2001
 From: Jaegeuk Kim jaeg...@kernel.org
 Date: Tue, 18 Nov 2014 10:50:21 -0800
 Subject: [PATCH] f2fs: call flush_dcache_page when the page was updated
 
 Whenever f2fs updates mapped pages, it needs to call flush_dcache_page.
 
 Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
 ---
  fs/f2fs/dir.c| 7 ++-
  fs/f2fs/inline.c | 2 ++
  2 files changed, 8 insertions(+), 1 deletion(-)
 
 diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
 index 5a49995..fabf4ee 100644
 --- a/fs/f2fs/dir.c
 +++ b/fs/f2fs/dir.c
 @@ -287,8 +287,10 @@ void f2fs_set_link(struct inode *dir, struct 
 f2fs_dir_entry *de,
   f2fs_wait_on_page_writeback(page, type);
   de->ino = cpu_to_le32(inode->i_ino);
   set_de_type(de, inode);
 - if (!f2fs_has_inline_dentry(dir))
 + if (!f2fs_has_inline_dentry(dir)) {
 + flush_dcache_page(page);
   kunmap(page);
 + }
   set_page_dirty(page);
   dir->i_mtime = dir->i_ctime = CURRENT_TIME;
   mark_inode_dirty(dir);
 @@ -365,6 +367,7 @@ static int make_empty_dir(struct inode *inode,
   make_dentry_ptr(&d, (void *)dentry_blk, 1);
   do_make_empty_dir(inode, parent, &d);
  
 + flush_dcache_page(dentry_page);
   kunmap_atomic(dentry_blk);
  
   set_page_dirty(dentry_page);
 @@ -578,6 +581,7 @@ fail:
   update_inode_page(dir);
   clear_inode_flag(F2FS_I(dir), FI_UPDATE_DIR);
   }
 + flush_dcache_page(dentry_page);
   kunmap(dentry_page);
   f2fs_put_page(dentry_page, 1);
   return err;
 @@ -660,6 +664,7 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, 
 struct page *page,
   bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
   NR_DENTRY_IN_BLOCK,
   0);
 + flush_dcache_page(page);
   kunmap(page); /* kunmap - pair of f2fs_find_entry */
   set_page_dirty(page);
  
 diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
 index f26fb87..4291c1f 100644
 --- a/fs/f2fs/inline.c
 +++ b/fs/f2fs/inline.c
 @@ -106,6 +106,7 @@ int f2fs_convert_inline_page(struct dnode_of_data *dn, 
 struct page *page)
   src_addr = inline_data_addr(dn->inode_page);
   dst_addr = kmap_atomic(page);
   memcpy(dst_addr, src_addr, MAX_INLINE_DATA);
 + flush_dcache_page(page);
   kunmap_atomic(dst_addr);
   SetPageUptodate(page);
  no_update:
 @@ -357,6 +358,7 @@ static int f2fs_convert_inline_dir(struct inode *dir, 
 struct page *ipage,
   memcpy(dentry_blk->filename, inline_dentry->filename,
   NR_INLINE_DENTRY * F2FS_SLOT_LEN);
  
 + flush_dcache_page(page);
   kunmap_atomic(dentry_blk);
   SetPageUptodate(page);
   set_page_dirty(page);
 -- 
 2.1.1
 



[f2fs-dev] [PATCH] f2fs: fix wrong data structure when create slab

2014-11-20 Thread Changman Lee
The slab for sit_entry_set was created with the size of struct
nat_entry_set instead of struct sit_entry_set.

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/segment.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index e094675..9de857f 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2231,7 +2231,7 @@ int __init create_segment_manager_caches(void)
goto fail;
 
 	sit_entry_set_slab = f2fs_kmem_cache_create("sit_entry_set",
-			sizeof(struct nat_entry_set));
+			sizeof(struct sit_entry_set));
if (!sit_entry_set_slab)
goto destory_discard_entry;
 
-- 
1.9.1
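
What the wrong sizeof costs depends on the two struct layouts. The sketch
below uses simplified stand-ins (the *_model structs are assumptions
modeled loosely on the f2fs headers, and may not match any particular
kernel version) to show how one could check whether such a mix-up would
over-allocate or under-allocate each slab object.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real definitions live in the f2fs headers. */
struct list_head_model {
	struct list_head_model *next, *prev;
};

struct nat_entry_set_model {
	struct list_head_model set_list;    /* link with other nat sets */
	struct list_head_model entry_list;  /* link with dirty nat entries */
	unsigned int set;                   /* set number */
	unsigned int entry_cnt;             /* number of nat entries in set */
};

struct sit_entry_set_model {
	struct list_head_model set_list;    /* link with other sit sets */
	unsigned int start_segno;           /* first segno covered by the set */
	unsigned int entry_cnt;             /* number of sit entries in set */
};

/*
 * Bytes wasted (positive) or missing (negative) per object when a cache
 * meant for objects of size `want` is created with size `got`.
 */
static long slab_size_mismatch(size_t got, size_t want)
{
	assert(got > 0 && want > 0);
	return (long)got - (long)want;
}
```

With these stand-in layouts the nat set is larger by one list head, so the
mismatch would waste memory per object rather than overflow it. A negative
result from slab_size_mismatch() would indicate the more dangerous case.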




Re: [f2fs-dev] [PATCH] f2fs: add cleancache support

2014-11-19 Thread Changman Lee
On Fri, Nov 14, 2014 at 02:53:02PM +0900, Changman Lee wrote:
 On Thu, Nov 13, 2014 at 05:27:51PM -0800, Jaegeuk Kim wrote:
  Hi Changman,
  
  On Thu, Nov 13, 2014 at 02:34:50PM +0900, Changman Lee wrote:
   To use cleancache, fs must explicitly enable cleancache by calling
   cleancache_init_fs.
  
  Good catch!
  
  Prior to merge this patch, can you share any testing results or performance
  numbers?
  
 Not yet, I'll try to get numbers.
 

Hi,

This is the result of a kernel compile on xen-4.4 with tmem enabled
(cleancache and frontswap).
I'm afraid cleancache makes little difference.
Cleancache shows a few cache hits, but no visible effect from them.
I don't know of a better benchmark to verify it yet.
Finally, I couldn't find any bug during the test.

[before patch]
                1           2           3
Elapsed time    25:00.67    25:07.09    25:00.38
Major fault     31100       31410       31333
Minor fault     276869398   276869318   276871144

[after patch]
                1           2           3
Elapsed time    25:12.34    25:13.29    25:11.99
Major fault     31559       32069       31801
Minor fault     276870283   276868046   276869251

[cleancache] - diff between start and end
                1           2           3
failed_gets     1277980     1296355     1300368
invalidates     2588227     2651722     2655285
puts            1289970     1323685     1320623
*succ_gets*     11          121299      114310

Thanks,
Changman
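
To put the cleancache counters above in proportion, the hit ratio per run can
be computed as below. This is only a sketch: it assumes that the total number
of gets is succ_gets + failed_gets, which the mail does not state, and the
struct and function names are made up for illustration.

```c
#include <stdio.h>

/* Counters for one run, copied from the [cleancache] table. */
struct cc_run {
	unsigned long failed_gets;
	unsigned long succ_gets;
};

/* Hit ratio in percent, assuming total gets = succ_gets + failed_gets. */
static double cc_hit_percent(struct cc_run r)
{
	unsigned long total = r.failed_gets + r.succ_gets;

	return total ? 100.0 * (double)r.succ_gets / (double)total : 0.0;
}

static const struct cc_run cc_runs[3] = {
	{ 1277980UL, 11UL },
	{ 1296355UL, 121299UL },
	{ 1300368UL, 114310UL },
};

/* Print the ratio for each of the three runs. */
static void cc_print_hit_ratios(void)
{
	for (int i = 0; i < 3; i++)
		printf("run %d: %.2f%% of gets hit\n", i + 1,
		       cc_hit_percent(cc_runs[i]));
}
```

Under that assumption the hit ratio stays below 10% in every run, which is
consistent with the elapsed times showing no gain.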

  What condition will be the best way to exploit f2fs and cleancache?
  
 Not clear.
 I think we can make a cleancache client for f2fs that can compensate for
 the penalty of node pages, which are mostly read.
 
  Can we confirm that f2fs satisfies most of requirements described by
  cleancache.txt below?
 
 Good point.
 At a quick glance, F2FS seems to satisfy most of requirements.
 Through an experiment, I'll try to check for side effects.
 
  
  Some points for a filesystem to consider:
  
  - The FS should be block-device-based (e.g. a ram-based FS such
as tmpfs should not enable cleancache)
  - To ensure coherency/correctness, the FS must ensure that all
file removal or truncation operations either go through VFS or
add hooks to do the equivalent cleancache invalidate operations
  - To ensure coherency/correctness, either inode numbers must
be unique across the lifetime of the on-disk file OR the
FS must provide an encode_fh function.
  - The FS must call the VFS superblock alloc and deactivate routines
or add hooks to do the equivalent cleancache calls done there.
  - To maximize performance, all pages fetched from the FS should
go through the do_mpag_readpage routine or the FS should add
hooks to do the equivalent (cf. btrfs)
  - Currently, the FS blocksize must be the same as PAGESIZE.  This
is not an architectural restriction, but no backends currently
support anything different.
  - A clustered FS should invoke the shared_init_fs cleancache
hook to get best performance for some backends.
  
  Thanks,
  
   
   Signed-off-by: Changman Lee cm224@samsung.com
   ---
fs/f2fs/super.c | 3 +++
1 file changed, 3 insertions(+)
   
   diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
   index 512ffd8..2ebb960 100644
   --- a/fs/f2fs/super.c
   +++ b/fs/f2fs/super.c
   @@ -24,6 +24,7 @@
#include <linux/blkdev.h>
#include <linux/f2fs_fs.h>
#include <linux/sysfs.h>
   +#include <linux/cleancache.h>

#include "f2fs.h"
#include "node.h"
   @@ -1144,6 +1145,8 @@ try_onemore:
 if (err)
 goto free_kobj;
 }
   +
   + cleancache_init_fs(sb);
 return 0;

free_kobj:
   -- 
   1.9.1
   
   

Re: [f2fs-dev] [PATCH 1/3] f2fs: call flush_dcache_page when the page was updated

2014-11-19 Thread Changman Lee
Hi Jaegeuk,

We should call flush_dcache_page before kunmap, because the purpose of the
cache flush is to address the aliasing problem related to virtual addresses.

On Wed, Nov 19, 2014 at 02:35:08PM -0800, Jaegeuk Kim wrote:
 Whenever f2fs updates mapped pages, it needs to call flush_dcache_page.
 
 Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
 ---
  fs/f2fs/dir.c| 7 ++-
  fs/f2fs/inline.c | 4 +++-
  2 files changed, 9 insertions(+), 2 deletions(-)
 
 diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
 index 5a49995..312fbfc 100644
 --- a/fs/f2fs/dir.c
 +++ b/fs/f2fs/dir.c
 @@ -287,8 +287,10 @@ void f2fs_set_link(struct inode *dir, struct 
 f2fs_dir_entry *de,
   f2fs_wait_on_page_writeback(page, type);
   de->ino = cpu_to_le32(inode->i_ino);
   set_de_type(de, inode);
 - if (!f2fs_has_inline_dentry(dir))
 + if (!f2fs_has_inline_dentry(dir)) {
   kunmap(page);
 + flush_dcache_page(page);
 + }
   set_page_dirty(page);
   dir->i_mtime = dir->i_ctime = CURRENT_TIME;
   mark_inode_dirty(dir);
 @@ -366,6 +368,7 @@ static int make_empty_dir(struct inode *inode,
   do_make_empty_dir(inode, parent, d);
  
   kunmap_atomic(dentry_blk);
 + flush_dcache_page(dentry_page);
  
   set_page_dirty(dentry_page);
   f2fs_put_page(dentry_page, 1);
 @@ -579,6 +582,7 @@ fail:
   clear_inode_flag(F2FS_I(dir), FI_UPDATE_DIR);
   }
   kunmap(dentry_page);
 + flush_dcache_page(dentry_page);
   f2fs_put_page(dentry_page, 1);
   return err;
  }
 @@ -661,6 +665,7 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, 
 struct page *page,
   NR_DENTRY_IN_BLOCK,
   0);
   kunmap(page); /* kunmap - pair of f2fs_find_entry */
 + flush_dcache_page(page);
   set_page_dirty(page);
  
   dir->i_ctime = dir->i_mtime = CURRENT_TIME;
 diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
 index f26fb87..8b7cc51 100644
 --- a/fs/f2fs/inline.c
 +++ b/fs/f2fs/inline.c
 @@ -45,8 +45,8 @@ void read_inline_data(struct page *page, struct page *ipage)
   src_addr = inline_data_addr(ipage);
   dst_addr = kmap_atomic(page);
   memcpy(dst_addr, src_addr, MAX_INLINE_DATA);
 - flush_dcache_page(page);
   kunmap_atomic(dst_addr);
 + flush_dcache_page(page);
   SetPageUptodate(page);
  }
  
 @@ -107,6 +107,7 @@ int f2fs_convert_inline_page(struct dnode_of_data *dn, 
 struct page *page)
   dst_addr = kmap_atomic(page);
   memcpy(dst_addr, src_addr, MAX_INLINE_DATA);
   kunmap_atomic(dst_addr);
 + flush_dcache_page(page);
   SetPageUptodate(page);
  no_update:
   /* write data page to try to make data consistent */
 @@ -358,6 +359,7 @@ static int f2fs_convert_inline_dir(struct inode *dir, 
 struct page *ipage,
   NR_INLINE_DENTRY * F2FS_SLOT_LEN);
  
   kunmap_atomic(dentry_blk);
 + flush_dcache_page(page);
   SetPageUptodate(page);
   set_page_dirty(page);
  
 -- 
 2.1.1
 
 


[f2fs-dev] [PATCH] mkfs.f2fs: introduce some macros to simplify coding style

2014-11-16 Thread Changman Lee
This patch tries to simplify the coding style for readability.
Shorten a name:
 o rename super_block to sb

And introduce some macros:
 o set/get_cp
 o set/get_sb
 o next/prev_zone, last_zone and last_section
 o ALIGN, SEG_ALIGN and ZONE_ALIGN

Signed-off-by: Changman Lee cm224@samsung.com
---
 include/f2fs_fs.h   |   6 +
 lib/libf2fs.c   |   1 +
 mkfs/f2fs_format.c  | 548 +++-
 mkfs/f2fs_format_main.c |   1 +
 4 files changed, 272 insertions(+), 284 deletions(-)

diff --git a/include/f2fs_fs.h b/include/f2fs_fs.h
index efddfca..0c3ba04 100644
--- a/include/f2fs_fs.h
+++ b/include/f2fs_fs.h
@@ -230,6 +230,7 @@ struct f2fs_configuration {
u_int32_t cur_seg[6];
u_int32_t segs_per_sec;
u_int32_t secs_per_zone;
+   u_int32_t segs_per_zone;
u_int32_t start_sector;
u_int64_t total_sectors;
u_int32_t sectors_per_blk;
@@ -786,4 +787,9 @@ f2fs_hash_t f2fs_dentry_hash(const unsigned char *, int);
 
 extern struct f2fs_configuration config;
 
+#define ALIGN(val, size)   ((val) + (size) - 1) / (size)
+#define SEG_ALIGN(blks)ALIGN(blks, config.blks_per_seg)
+#define ZONE_ALIGN(blks)   ALIGN(blks, config.blks_per_seg * \
+   config.segs_per_zone)
+
 #endif /*__F2FS_FS_H */
diff --git a/lib/libf2fs.c b/lib/libf2fs.c
index 14e4164..8123528 100644
--- a/lib/libf2fs.c
+++ b/lib/libf2fs.c
@@ -357,6 +357,7 @@ void f2fs_init_configuration(struct f2fs_configuration *c)
 	c->overprovision = 5;
 	c->segs_per_sec = 1;
 	c->secs_per_zone = 1;
+	c->segs_per_zone = 1;
 	c->heap = 1;
 	c->vol_label = "";
 	c->device_name = NULL;
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index c0028a3..a8d2db6 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -22,7 +22,71 @@
 #include "f2fs_format_utils.h"
 
 extern struct f2fs_configuration config;
-struct f2fs_super_block super_block;
+struct f2fs_super_block sb;
+struct f2fs_checkpoint *cp;
+
+/* Return first segment number of each area */
+#define prev_zone(cur) (config.cur_seg[cur] - config.segs_per_zone)
+#define next_zone(cur) (config.cur_seg[cur] + config.segs_per_zone)
+#define last_zone(cur) ((cur - 1) * config.segs_per_zone)
+#define last_section(cur)  (cur + (config.secs_per_zone - 1) * config.segs_per_sec)
+
+#define set_sb_le64(member, val)   (sb.member = cpu_to_le64(val))
+#define set_sb_le32(member, val)   (sb.member = cpu_to_le32(val))
+#define set_sb_le16(member, val)   (sb.member = cpu_to_le16(val))
+#define get_sb_le64(member)le64_to_cpu(sb.member)
+#define get_sb_le32(member)le32_to_cpu(sb.member)
+#define get_sb_le16(member)le16_to_cpu(sb.member)
+
+#define set_sb(member, val)\
+   do {\
+   typeof(sb.member) t;\
+   switch (sizeof(t)) {\
+   case 8: set_sb_le64(member, val); break; \
+   case 4: set_sb_le32(member, val); break; \
+   case 2: set_sb_le16(member, val); break; \
+   } \
+   } while(0)
+
+#define get_sb(member) \
+   ({  \
+   typeof(sb.member) t;\
+   switch (sizeof(t)) {\
+   case 8: t = get_sb_le64(member); break; \
+   case 4: t = get_sb_le32(member); break; \
+   case 2: t = get_sb_le16(member); break; \
+   }   \
+   t; \
+   })
+
+#define set_cp_le64(member, val)   (cp->member = cpu_to_le64(val))
+#define set_cp_le32(member, val)   (cp->member = cpu_to_le32(val))
+#define set_cp_le16(member, val)   (cp->member = cpu_to_le16(val))
+#define get_cp_le64(member)        le64_to_cpu(cp->member)
+#define get_cp_le32(member)        le32_to_cpu(cp->member)
+#define get_cp_le16(member)        le16_to_cpu(cp->member)
+
+#define set_cp(member, val)        \
+   do {                            \
+   typeof(cp->member) t;           \
+   switch (sizeof(t)) {\
+   case 8: set_cp_le64(member, val); break; \
+   case 4: set_cp_le32(member, val); break; \
+   case 2: set_cp_le16(member, val); break
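
One detail of the ALIGN macro in this patch is worth illustrating: its
expansion has no outer parentheses, so it can regroup when it appears as the
right-hand operand of a division or multiplication. The sketch below
reproduces the posted definition under a different name and compares it with
a fully parenthesized variant. Both macro names are made up here, and whether
any call site in the patch is affected cannot be seen from the truncated diff.
Note also that both forms return a rounded-up count of units, not a value
rounded up to a multiple of size.

```c
/* As posted in the patch: no parentheses around the whole expansion. */
#define ALIGN_AS_POSTED(val, size)	((val) + (size) - 1) / (size)

/* Fully parenthesized variant (hypothetical, not from the patch). */
#define ALIGN_PAREN(val, size)	(((val) + (size) - 1) / (size))

/* Expands to 12 / ((10) + (4) - 1) / (4), which groups as (12 / 13) / 4. */
static int units_as_posted(void)
{
	return 12 / ALIGN_AS_POSTED(10, 4);
}

/* Expands to 12 / (((10) + (4) - 1) / (4)), which is 12 / 3. */
static int units_paren(void)
{
	return 12 / ALIGN_PAREN(10, 4);
}
```

Used on its own, as in SEG_ALIGN(blks), the posted form gives the same result
as the parenthesized one.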

[f2fs-dev] [PATCH 2/2] mkfs.f2fs: fix missing endian conversion

2014-11-13 Thread Changman Lee
This adds the missing conversion between CPU and little-endian byte order.

Signed-off-by: Changman Lee cm224@samsung.com
---
 mkfs/f2fs_format.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a9d728..c0028a3 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -71,7 +71,7 @@ static void configure_extension_list(void)
memcpy(super_block.extension_list[i++], *extlist, name_len);
extlist++;
}
-   super_block.extension_count = i;
+   super_block.extension_count = cpu_to_le32(i);
 
if (!ext_str)
return;
@@ -86,7 +86,7 @@ static void configure_extension_list(void)
break;
}
 
-   super_block.extension_count = i;
+   super_block.extension_count = cpu_to_le32(i);
 
free(config.extension_list);
 }
@@ -211,7 +211,7 @@ static int f2fs_prepare_super_block(void)
 	if (max_sit_bitmap_size >
 			(CHECKSUM_OFFSET - sizeof(struct f2fs_checkpoint) + 65)) {
 		max_nat_bitmap_size = CHECKSUM_OFFSET - sizeof(struct f2fs_checkpoint) + 1;
-		super_block.cp_payload = F2FS_BLK_ALIGN(max_sit_bitmap_size);
+		super_block.cp_payload = cpu_to_le32(F2FS_BLK_ALIGN(max_sit_bitmap_size));
 	} else {
 		max_nat_bitmap_size = CHECKSUM_OFFSET - sizeof(struct f2fs_checkpoint) + 1
 			- max_sit_bitmap_size;
-- 
1.9.1
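
Why the conversion matters can be shown with a small userspace model. The
sketch below is not the f2fs-tools implementation of cpu_to_le32; the
demo_ helpers and the demo_super_block struct are hypothetical stand-ins that
only illustrate the idea that an on-disk field must hold little-endian bytes
regardless of the host byte order.

```c
#include <stdint.h>
#include <string.h>

/*
 * Stand-in for cpu_to_le32(): returns a value whose in-memory bytes are
 * the little-endian encoding of v, whatever the host byte order is.
 */
static uint32_t demo_cpu_to_le32(uint32_t v)
{
	unsigned char b[4] = {
		(unsigned char)(v & 0xff),
		(unsigned char)((v >> 8) & 0xff),
		(unsigned char)((v >> 16) & 0xff),
		(unsigned char)((v >> 24) & 0xff),
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Inverse: decode the four little-endian bytes stored in v. */
static uint32_t demo_le32_to_cpu(uint32_t v)
{
	unsigned char b[4];

	memcpy(b, &v, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Hypothetical on-disk record, standing in for the superblock field. */
struct demo_super_block {
	uint32_t extension_count;	/* little endian on disk */
};
```

On a little-endian host both helpers leave the value unchanged, which is why
a missing conversion goes unnoticed there and only breaks on big-endian
machines.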




Re: [f2fs-dev] [PATCH] f2fs: add cleancache support

2014-11-13 Thread Changman Lee
On Fri, Nov 14, 2014 at 11:08:15AM +0800, Chao Yu wrote:
 Hi Changman,
 
  -Original Message-
  From: Changman Lee [mailto:cm224@samsung.com]
  Sent: Thursday, November 13, 2014 1:35 PM
  To: linux-fsde...@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net
  Subject: [f2fs-dev] [PATCH] f2fs: add cleancache support
  
  To use cleancache, fs must explicitly enable cleancache by calling
  cleancache_init_fs.
 
 Good catch!
 
 AFAIK, cleancache will work only if we init its backend and register related
 ops, but since we merged commit 962564604873 ("staging: zcache: delete it"),
 we have lost the zcache one. Are there other backends?
 
 Regards,
 Yu
 

Hi Yu,

AFAIK, hypervisors like Xen actively use cleancache and frontswap.
Also, GCMA (Guaranteed CMA) has recently been submitted, and it is planned
to be used by cleancache.
I think it's not bad for us to be prepared to accept them.

Thanks,

  
  Signed-off-by: Changman Lee cm224@samsung.com
  ---
   fs/f2fs/super.c | 3 +++
   1 file changed, 3 insertions(+)
  
  diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
  index 512ffd8..2ebb960 100644
  --- a/fs/f2fs/super.c
  +++ b/fs/f2fs/super.c
  @@ -24,6 +24,7 @@
   #include <linux/blkdev.h>
   #include <linux/f2fs_fs.h>
   #include <linux/sysfs.h>
  +#include <linux/cleancache.h>
  
   #include "f2fs.h"
   #include "node.h"
  @@ -1144,6 +1145,8 @@ try_onemore:
  if (err)
  goto free_kobj;
  }
  +
  +   cleancache_init_fs(sb);
  return 0;
  
   free_kobj:
  --
  1.9.1
  
  


Re: [f2fs-dev] [PATCH] f2fs: add cleancache support

2014-11-13 Thread Changman Lee
On Thu, Nov 13, 2014 at 05:27:51PM -0800, Jaegeuk Kim wrote:
 Hi Changman,
 
 On Thu, Nov 13, 2014 at 02:34:50PM +0900, Changman Lee wrote:
  To use cleancache, fs must explicitly enable cleancache by calling
  cleancache_init_fs.
 
 Good catch!
 
 Prior to merging this patch, can you share any testing results or performance
 numbers?
 
Not yet, I'll try to get numbers.

 What condition will be the best way to exploit f2fs and cleancache?
 
Not clear.
I think we can make a cleancache client for f2fs that can compensate for
the penalty of node pages, which are mostly read.

 Can we confirm that f2fs satisfies most of requirements described by
 cleancache.txt below?

Good point.
At a quick glance, F2FS seems to satisfy most of requirements.
Through an experiment, I'll try to check for side effects.

 
 Some points for a filesystem to consider:
 
 - The FS should be block-device-based (e.g. a ram-based FS such
   as tmpfs should not enable cleancache)
 - To ensure coherency/correctness, the FS must ensure that all
   file removal or truncation operations either go through VFS or
   add hooks to do the equivalent cleancache invalidate operations
 - To ensure coherency/correctness, either inode numbers must
   be unique across the lifetime of the on-disk file OR the
   FS must provide an encode_fh function.
 - The FS must call the VFS superblock alloc and deactivate routines
   or add hooks to do the equivalent cleancache calls done there.
 - To maximize performance, all pages fetched from the FS should
   go through the do_mpag_readpage routine or the FS should add
   hooks to do the equivalent (cf. btrfs)
 - Currently, the FS blocksize must be the same as PAGESIZE.  This
   is not an architectural restriction, but no backends currently
   support anything different.
 - A clustered FS should invoke the shared_init_fs cleancache
   hook to get best performance for some backends.
 
 Thanks,
 
  
  Signed-off-by: Changman Lee cm224@samsung.com
  ---
   fs/f2fs/super.c | 3 +++
   1 file changed, 3 insertions(+)
  
  diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
  index 512ffd8..2ebb960 100644
  --- a/fs/f2fs/super.c
  +++ b/fs/f2fs/super.c
  @@ -24,6 +24,7 @@
   #include <linux/blkdev.h>
   #include <linux/f2fs_fs.h>
   #include <linux/sysfs.h>
  +#include <linux/cleancache.h>
   
   #include "f2fs.h"
   #include "node.h"
  @@ -1144,6 +1145,8 @@ try_onemore:
  if (err)
  goto free_kobj;
  }
  +
  +   cleancache_init_fs(sb);
  return 0;
   
   free_kobj:
  -- 
  1.9.1
  
  


Re: [f2fs-dev] [PATCH 1/5] f2fs: disable roll-forward when active_logs = 2

2014-11-11 Thread Changman Lee
On Mon, Nov 10, 2014 at 07:07:59AM -0800, Jaegeuk Kim wrote:
 Hi Changman,
 
 On Mon, Nov 10, 2014 at 06:54:37PM +0900, Changman Lee wrote:
  On Sat, Nov 08, 2014 at 11:36:05PM -0800, Jaegeuk Kim wrote:
   The roll-forward mechanism should be activated when the number of active
   logs is not 2.
   
   Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
   ---
fs/f2fs/file.c| 2 ++
fs/f2fs/segment.c | 4 ++--
2 files changed, 4 insertions(+), 2 deletions(-)
   
   diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
   index 46311e7..54722a0 100644
   --- a/fs/f2fs/file.c
   +++ b/fs/f2fs/file.c
   @@ -132,6 +132,8 @@ static inline bool need_do_checkpoint(struct inode 
   *inode)
 need_cp = true;
 else if (test_opt(sbi, FASTBOOT))
 need_cp = true;
   + else if (sbi->active_logs == 2)
   + need_cp = true;

 return need_cp;
}
   diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
   index 2fb3d7f..16721b5d 100644
   --- a/fs/f2fs/segment.c
   +++ b/fs/f2fs/segment.c
   @@ -1090,8 +1090,8 @@ static int __get_segment_type_4(struct page *page, 
   enum page_type p_type)
 else
 return CURSEG_COLD_DATA;
 } else {
   - if (IS_DNODE(page) && !is_cold_node(page))
   - return CURSEG_HOT_NODE;
   + if (IS_DNODE(page) && is_cold_node(page))
   + return CURSEG_WARM_NODE;
  
  Hi Jaegeuk,
  
  We should take hot/cold separation into account as well.
  In the case of a dir inode, it will be mixed with COLD_NODE.
  If it's a trade-off, let's note it in a comment.
 
 NAK.
 This patch tries to fix a bug, which is not a trade-off.
 We should write files' direct node blocks in CURSEG_WARM_NODE for recovery.
 
 Thanks,

Okay, 'trade-off' was the wrong word. We must be able to do recovery.
However, this breaks the hot/cold separation rule we want, so I thought we
should note its negative effect.
Anyway, how about putting WARM and HOT together instead of HOT and COLD?
At recovery time we can still tell them apart well enough if they are direct
nodes and have fsync_mark, even though HOT and WARM are mixed.
Let me know if I have misunderstood something.

Thanks,

 
  
  Regards,
  Changman
  
 else
 return CURSEG_COLD_NODE;
 }
   -- 
   2.1.1
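
The node-page branch of the quoted __get_segment_type_4 hunk, before and
after the patch, can be modeled in userspace as follows. This is a sketch
only: the enum values, the struct and the function names are hypothetical,
and the two flags simply stand for the results of IS_DNODE(page) and
is_cold_node(page) in the patch.

```c
#include <stdbool.h>

/* Hypothetical names; the kernel's CURSEG_* values are not reproduced. */
enum demo_node_log { DEMO_HOT_NODE, DEMO_WARM_NODE, DEMO_COLD_NODE };

struct demo_node_page {
	bool is_dnode;	/* models IS_DNODE(page): a direct node block */
	bool is_cold;	/* models is_cold_node(page) */
};

/* Node branch before the patch: non-cold direct nodes use the hot log. */
static enum demo_node_log demo_node_log_before(struct demo_node_page p)
{
	if (p.is_dnode && !p.is_cold)
		return DEMO_HOT_NODE;
	return DEMO_COLD_NODE;
}

/* Node branch after the patch: cold direct nodes use the warm log. */
static enum demo_node_log demo_node_log_after(struct demo_node_page p)
{
	if (p.is_dnode && p.is_cold)
		return DEMO_WARM_NODE;
	return DEMO_COLD_NODE;
}
```

In this model a direct node without the cold flag, which is the directory
case raised in the reply, moves from the hot log to the cold log, where it
shares space with non-direct nodes.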
   
   


Re: [f2fs-dev] [PATCH 1/5] f2fs: disable roll-forward when active_logs = 2

2014-11-10 Thread Changman Lee
On Sat, Nov 08, 2014 at 11:36:05PM -0800, Jaegeuk Kim wrote:
 The roll-forward mechanism should be activated when the number of active
 logs is not 2.
 
 Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
 ---
  fs/f2fs/file.c| 2 ++
  fs/f2fs/segment.c | 4 ++--
  2 files changed, 4 insertions(+), 2 deletions(-)
 
 diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
 index 46311e7..54722a0 100644
 --- a/fs/f2fs/file.c
 +++ b/fs/f2fs/file.c
 @@ -132,6 +132,8 @@ static inline bool need_do_checkpoint(struct inode *inode)
   need_cp = true;
   else if (test_opt(sbi, FASTBOOT))
   need_cp = true;
 + else if (sbi->active_logs == 2)
 + need_cp = true;
  
   return need_cp;
  }
 diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
 index 2fb3d7f..16721b5d 100644
 --- a/fs/f2fs/segment.c
 +++ b/fs/f2fs/segment.c
 @@ -1090,8 +1090,8 @@ static int __get_segment_type_4(struct page *page, enum 
 page_type p_type)
   else
   return CURSEG_COLD_DATA;
   } else {
 - if (IS_DNODE(page) && !is_cold_node(page))
 - return CURSEG_HOT_NODE;
 + if (IS_DNODE(page) && is_cold_node(page))
 + return CURSEG_WARM_NODE;

Hi Jaegeuk,

We should take hot/cold separation into account as well.
In the case of a dir inode, it will be mixed with COLD_NODE.
If it's a trade-off, let's note it in a comment.

Regards,
Changman

   else
   return CURSEG_COLD_NODE;
   }
 -- 
 2.1.1
 
 


Re: [f2fs-dev] [PATCH 3/5] f2fs: control the memory footprint used by ino entries

2014-11-09 Thread Changman Lee
On Sat, Nov 08, 2014 at 11:36:07PM -0800, Jaegeuk Kim wrote:
 This patch adds to control the memory footprint used by ino entries.
 This will conduct best effort, not strictly.
 
 Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
 ---
  fs/f2fs/node.c| 28 ++--
  fs/f2fs/node.h|  3 ++-
  fs/f2fs/segment.c |  3 ++-
  3 files changed, 26 insertions(+), 8 deletions(-)
 
 diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
 index 44b8afe..4ea2c47 100644
 --- a/fs/f2fs/node.c
 +++ b/fs/f2fs/node.c
 @@ -31,22 +31,38 @@ bool available_free_memory(struct f2fs_sb_info *sbi, int 
 type)
  {
   struct f2fs_nm_info *nm_i = NM_I(sbi);
   struct sysinfo val;
 + unsigned long avail_ram;
   unsigned long mem_size = 0;
   bool res = false;
  
   si_meminfo(&val);
 - /* give 25%, 25%, 50% memory for each components respectively */
 +
 + /* only uses low memory */
 + avail_ram = val.totalram - val.totalhigh;
 +
 + /* give 25%, 25%, 50%, 50% memory for each components respectively */

Hi Jaegeuk,

The memory usage of nm_i should be 100% but it's 125%.
Mistake or intended?

   if (type == FREE_NIDS) {
 - mem_size = (nm_i->fcnt * sizeof(struct free_nid)) >> 12;
 - res = mem_size < ((val.totalram * nm_i->ram_thresh / 100) >> 2);
 + mem_size = (nm_i->fcnt * sizeof(struct free_nid)) >>
 + PAGE_CACHE_SHIFT;
 + res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 2);
   } else if (type == NAT_ENTRIES) {
 - mem_size = (nm_i->nat_cnt * sizeof(struct nat_entry)) >> 12;
 - res = mem_size < ((val.totalram * nm_i->ram_thresh / 100) >> 2);
 + mem_size = (nm_i->nat_cnt * sizeof(struct nat_entry)) >>
 + PAGE_CACHE_SHIFT;
 + res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 2);
   } else if (type == DIRTY_DENTS) {
   if (sbi->sb->s_bdi->dirty_exceeded)
   return false;
   mem_size = get_pages(sbi, F2FS_DIRTY_DENTS);
 - res = mem_size < ((val.totalram * nm_i->ram_thresh / 100) >> 1);
 + res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 1);
 + } else if (type == INO_ENTRIES) {
 + int i;
 +
 + if (sbi->sb->s_bdi->dirty_exceeded)
 + return false;
 + for (i = 0; i <= UPDATE_INO; i++)
 + mem_size += (sbi->ino_num[i] * sizeof(struct ino_entry))
 + >> PAGE_CACHE_SHIFT;
 + res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 1);
   }
   return res;
  }
 diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
 index acb71e5..d10b644 100644
 --- a/fs/f2fs/node.h
 +++ b/fs/f2fs/node.h
 @@ -106,7 +106,8 @@ static inline void raw_nat_from_node_info(struct 
 f2fs_nat_entry *raw_ne,
  enum mem_type {
   FREE_NIDS,  /* indicates the free nid list */
   NAT_ENTRIES,/* indicates the cached nat entry */
 - DIRTY_DENTS /* indicates dirty dentry pages */
 + DIRTY_DENTS,/* indicates dirty dentry pages */
 + INO_ENTRIES,/* indicates inode entries */
  };
  
  struct nat_entry_set {
 diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
 index 16721b5d..e094675 100644
 --- a/fs/f2fs/segment.c
 +++ b/fs/f2fs/segment.c
 @@ -276,7 +276,8 @@ void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi)
  {
   /* check the # of cached NAT entries and prefree segments */
   if (try_to_free_nats(sbi, NAT_ENTRY_PER_BLOCK) ||
 - excess_prefree_segs(sbi))
 + excess_prefree_segs(sbi) ||
 + available_free_memory(sbi, INO_ENTRIES))
   f2fs_sync_fs(sbi-sb, true);
  }
  
 -- 
 2.1.1
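
The budget arithmetic in the quoted available_free_memory() hunk can be tried
out in userspace with the sketch below. It assumes 4 KiB pages and that the
operators lost in this archive copy were ">>" and "<"; the demo_ names are
hypothetical and the sample numbers are arbitrary, not f2fs defaults.

```c
#include <stdbool.h>

#define DEMO_PAGE_SHIFT 12	/* assumes 4 KiB pages */

/*
 * Page budget for one cache type: ram_thresh percent of low memory,
 * shifted right by 2 (a quarter) or by 1 (a half).
 */
static unsigned long demo_budget_pages(unsigned long avail_ram_pages,
				       unsigned int ram_thresh,
				       unsigned int shift)
{
	return (avail_ram_pages * ram_thresh / 100) >> shift;
}

/* True while count objects of obj_size bytes still fit under the budget. */
static bool demo_has_free_memory(unsigned long count, unsigned long obj_size,
				 unsigned long budget_pages)
{
	unsigned long mem_pages = (count * obj_size) >> DEMO_PAGE_SHIFT;

	return mem_pages < budget_pages;
}
```

For example, with 1,000,000 low-memory pages and a threshold value of 10, a
shift of 2 yields a budget of 25,000 pages and a shift of 1 yields 50,000
pages. How the per-type shares should add up is the question raised in the
reply above.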
 
 


[f2fs-dev] Don't merge not sended patches

2014-11-09 Thread Changman Lee
Hi Jaegeuk,

I found two new patches when I pulled f2fs-tools.
They never appeared on the mailing list.
Even if patches are very trivial, I think they should be posted to our
mailing list.

Thanks,
Changman




Re: [f2fs-dev] [PATCH 4/5] f2fs: write node pages if checkpoint is not doing

2014-11-09 Thread Changman Lee
On Sat, Nov 08, 2014 at 11:36:08PM -0800, Jaegeuk Kim wrote:
 It needs to write node pages if checkpoint is not doing in order to avoid
 memory pressure.
 
 Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
 ---
  fs/f2fs/node.c | 10 ++
  1 file changed, 6 insertions(+), 4 deletions(-)
 
 diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
 index 4ea2c47..6f514fb 100644
 --- a/fs/f2fs/node.c
 +++ b/fs/f2fs/node.c
 @@ -1314,10 +1314,12 @@ static int f2fs_write_node_page(struct page *page,
   return 0;
   }
  
 - if (wbc->for_reclaim)
 - goto redirty_out;
 -
 - down_read(&sbi->node_write);
 + if (wbc->for_reclaim) {
 + if (!down_read_trylock(&sbi->node_write))
 + goto redirty_out;

Previously, we skipped write_page on the reclaim path, but from now on we
will write out node pages to reclaim memory at any time except during a
checkpoint.
We should remember that this may break bio merging.
Got it.

Reviewed-by: Changman Lee cm224@samsung.com

 + } else {
 + down_read(&sbi->node_write);
 + }
   set_page_writeback(page);
   write_node_page(sbi, page, &fio, nid, ni.blk_addr, &new_addr);
   set_node_addr(sbi, &ni, new_addr, is_fsync_dnode(page));
 -- 
 2.1.1
 
 


Re: [f2fs-dev] [PATCH] f2fs: implement -o dirsync

2014-11-09 Thread Changman Lee
On Sun, Nov 09, 2014 at 10:24:22PM -0800, Jaegeuk Kim wrote:
 If a mount option has dirsync, we should call checkpoint for all the directory
 operations.
 
 Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
 ---
  fs/f2fs/namei.c | 24 
  1 file changed, 24 insertions(+)
 
 diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
 index 6312dd2..db3ee09 100644
 --- a/fs/f2fs/namei.c
 +++ b/fs/f2fs/namei.c
 @@ -138,6 +138,9 @@ static int f2fs_create(struct inode *dir, struct dentry 
 *dentry, umode_t mode,
   stat_inc_inline_inode(inode);
   d_instantiate(dentry, inode);
   unlock_new_inode(inode);
 +
 + if (IS_DIRSYNC(dir))
 + f2fs_sync_fs(sbi->sb, 1);
   return 0;
  out:
   handle_failed_inode(inode);
 @@ -164,6 +167,9 @@ static int f2fs_link(struct dentry *old_dentry, struct 
 inode *dir,
   f2fs_unlock_op(sbi);
  
   d_instantiate(dentry, inode);
 +
 + if (IS_DIRSYNC(dir))
 + f2fs_sync_fs(sbi->sb, 1);
   return 0;
  out:
   clear_inode_flag(F2FS_I(inode), FI_INC_LINK);
 @@ -233,6 +239,9 @@ static int f2fs_unlink(struct inode *dir, struct dentry 
 *dentry)
   f2fs_delete_entry(de, page, dir, inode);
   f2fs_unlock_op(sbi);
  
 + if (IS_DIRSYNC(dir))
 + f2fs_sync_fs(sbi->sb, 1);
 +
   /* In order to evict this inode, we set it dirty */
   mark_inode_dirty(inode);

Let's move it below mark_inode_dirty.
After the sync, there is no need to insert the inode into the dirty list.


  fail:
 @@ -268,6 +277,9 @@ static int f2fs_symlink(struct inode *dir, struct dentry 
 *dentry,
  
   d_instantiate(dentry, inode);
   unlock_new_inode(inode);
 +
 + if (IS_DIRSYNC(dir))
 + f2fs_sync_fs(sbi->sb, 1);
   return err;
  out:
   handle_failed_inode(inode);
 @@ -304,6 +316,8 @@ static int f2fs_mkdir(struct inode *dir, struct dentry 
 *dentry, umode_t mode)
   d_instantiate(dentry, inode);
   unlock_new_inode(inode);
  
 + if (IS_DIRSYNC(dir))
 + f2fs_sync_fs(sbi->sb, 1);
   return 0;
  
  out_fail:
 @@ -346,8 +360,12 @@ static int f2fs_mknod(struct inode *dir, struct dentry 
 *dentry,
   f2fs_unlock_op(sbi);
  
   alloc_nid_done(sbi, inode->i_ino);
 +
   d_instantiate(dentry, inode);
   unlock_new_inode(inode);
 +
 + if (IS_DIRSYNC(dir))
 + f2fs_sync_fs(sbi->sb, 1);
   return 0;
  out:
   handle_failed_inode(inode);
 @@ -461,6 +479,9 @@ static int f2fs_rename(struct inode *old_dir, struct 
 dentry *old_dentry,
   }
  
   f2fs_unlock_op(sbi);
 +
 + if (IS_DIRSYNC(old_dir) || IS_DIRSYNC(new_dir))
 + f2fs_sync_fs(sbi->sb, 1);
   return 0;
  
  put_out_dir:
 @@ -600,6 +621,9 @@ static int f2fs_cross_rename(struct inode *old_dir, 
 struct dentry *old_dentry,
   update_inode_page(new_dir);
  
   f2fs_unlock_op(sbi);
 +
 + if (IS_DIRSYNC(old_dir) || IS_DIRSYNC(new_dir))
 + f2fs_sync_fs(sbi->sb, 1);
   return 0;
  out_undo:
   /* Still we may fail to recover name info of f2fs_inode here */
 -- 
 2.1.1
 
 


[f2fs-dev] [PATCH] mkfs.f2fs: reclaim free space in case of regular file

2014-11-04 Thread Changman Lee
If we use a regular file instead of a block device, let's reclaim its free
space.

Signed-off-by: Changman Lee cm224@samsung.com
---
 configure.ac |  2 +-
 mkfs/f2fs_format_utils.c | 18 --
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/configure.ac b/configure.ac
index 0111e72..d66cb73 100644
--- a/configure.ac
+++ b/configure.ac
@@ -57,7 +57,7 @@ PKG_CHECK_MODULES([libuuid], [uuid])
 
 # Checks for header files.
 AC_CHECK_HEADERS([linux/fs.h fcntl.h mntent.h stdlib.h string.h \
-   sys/ioctl.h sys/mount.h unistd.h])
+   sys/ioctl.h sys/mount.h unistd.h linux/falloc.h])
 
 # Checks for typedefs, structures, and compiler characteristics.
 AC_C_INLINE
diff --git a/mkfs/f2fs_format_utils.c b/mkfs/f2fs_format_utils.c
index 9892a8f..88b9953 100644
--- a/mkfs/f2fs_format_utils.c
+++ b/mkfs/f2fs_format_utils.c
@@ -6,18 +6,26 @@
  *
  * Dual licensed under the GPL or LGPL version 2 licenses.
  */
+#define _LARGEFILE_SOURCE
 #define _LARGEFILE64_SOURCE
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE
+#endif
 
 #include <stdio.h>
 #include <unistd.h>
 #include <sys/ioctl.h>
 #include <sys/stat.h>
+#include <fcntl.h>
 
 #include "f2fs_fs.h"
 
 #ifdef HAVE_LINUX_FS_H
 #include <linux/fs.h>
 #endif
+#ifdef HAVE_LINUX_FALLOC_H
+#include <linux/falloc.h>
+#endif
 
 int f2fs_trim_device()
 {
@@ -37,9 +45,15 @@ int f2fs_trim_device()
 
 #if defined(WITH_BLKDISCARD) && defined(BLKDISCARD)
    MSG(0, "Info: Discarding device\n");
-   if (S_ISREG(stat_buf.st_mode))
+   if (S_ISREG(stat_buf.st_mode)) {
+#ifdef FALLOC_FL_PUNCH_HOLE
+           if (fallocate(config.fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+                           range[0], range[1]) < 0) {
+                   MSG(0, "Info: fallocate(PUNCH_HOLE|KEEP_SIZE) is failed\n");
+           }
+#endif
            return 0;
-   else if (S_ISBLK(stat_buf.st_mode)) {
+   } else if (S_ISBLK(stat_buf.st_mode)) {
            if (ioctl(config.fd, BLKDISCARD, range) < 0) {
                    MSG(0, "Info: This device doesn't support TRIM\n");
            } else {
-- 
1.9.1
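
For readers who want to try the regular-file path outside mkfs, here is a minimal userspace sketch of the same fallocate() call. The helper names punch_hole_keep_size and demo_punch_tmpfile, and the use of a temporary file under /tmp, are assumptions made for this example and are not part of f2fs-tools. On filesystems without hole punching the call fails, which is treated as non-fatal, as in the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from f2fs-tools): deallocate the byte range
 * [offset, offset + len) of a regular file while keeping its apparent size.
 * Returns 0 on success or a negative errno value on failure.
 */
static int punch_hole_keep_size(int fd, off_t offset, off_t len)
{
	assert(offset >= 0 && len > 0);

	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      offset, len) < 0)
		return -errno;
	return 0;
}

/*
 * Demo: write 1 MiB to a temporary file, punch it out, and return the
 * file size reported afterwards (-1 if the temporary file cannot be set up).
 */
static long long demo_punch_tmpfile(void)
{
	char path[] = "/tmp/punch-demo-XXXXXX";
	char buf[4096];
	struct stat st;
	long long size = -1;
	int fd = mkstemp(path);
	int i;

	if (fd < 0)
		return -1;

	memset(buf, 0xab, sizeof(buf));
	for (i = 0; i < 256; i++)
		if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
			goto out;

	/* A failure here (e.g. EOPNOTSUPP) is not fatal, as in the patch. */
	(void)punch_hole_keep_size(fd, 0, 256 * (off_t)sizeof(buf));

	if (fstat(fd, &st) == 0)
		size = (long long)st.st_size;
out:
	close(fd);
	unlink(path);
	return size;
}
```

Because FALLOC_FL_KEEP_SIZE is set, the reported size stays at 1 MiB whether or not the hole was punched; only the allocated block count (st_blocks) is expected to drop on success.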




Re: [f2fs-dev] [PATCH 04/10] f2fs: give an option to enable in-place-updates during fsync to users

2014-09-14 Thread Changman Lee
Hi JK,

I think it would be nicer if this could be combined ('OR'ed) with the other
policies. If so, we can also cover the weakness at high utilization.

Regards,
Changman
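
One way to read the 'OR' suggestion is to treat each policy as a bit in a mask, so that the fsync-driven policy and the utilization-driven policy can be enabled together. The following is a userspace sketch of that idea only. The toy_* names, the struct layout, the bitmask encoding and the exact comparisons are assumptions for illustration and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Policy indexes as listed in the documentation hunk of the patch. */
enum {
	F2FS_IPU_FORCE,
	F2FS_IPU_SSR,
	F2FS_IPU_UTIL,
	F2FS_IPU_SSR_UTIL,
	F2FS_IPU_FSYNC,
};

/* Hypothetical tunables, mirroring the sysfs entries discussed here. */
struct toy_sm_info {
	unsigned int ipu_policy;	/* assumed bitmask: 1 << F2FS_IPU_* */
	unsigned int min_ipu_util;	/* utilization threshold, percent */
	unsigned int min_fsync_blocks;	/* dirty page threshold for fsync */
};

/* Hypothetical per-request state. */
struct toy_request {
	unsigned int utilization;	/* filesystem utilization, percent */
	unsigned int dirty_pages;	/* dirty pages of the inode */
	bool in_fsync;			/* called from the fsync path */
};

static bool toy_need_inplace_update(const struct toy_sm_info *sm,
				    const struct toy_request *rq)
{
	assert(sm && rq);

	if (sm->ipu_policy & (1U << F2FS_IPU_FORCE))
		return true;

	/* Utilization-based policy: covers the high-utilization case. */
	if ((sm->ipu_policy & (1U << F2FS_IPU_UTIL)) &&
	    rq->utilization > sm->min_ipu_util)
		return true;

	/* fsync-based policy: only for small amounts of dirty data. */
	if ((sm->ipu_policy & (1U << F2FS_IPU_FSYNC)) &&
	    rq->in_fsync && rq->dirty_pages <= sm->min_fsync_blocks)
		return true;

	return false;
}
```

With both bits set, a small fsync and a nearly full filesystem each trigger in-place updates independently, which is the combination asked for above.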

On Sun, Sep 14, 2014 at 03:14:18PM -0700, Jaegeuk Kim wrote:
 If user wrote F2FS_IPU_FSYNC:4 in /sys/fs/f2fs/ipu_policy, f2fs_sync_file
 only starts to try in-place-updates.
 And, if the number of dirty pages is over /sys/fs/f2fs/min_fsync_blocks, it
 keeps out-of-order manner. Otherwise, it triggers in-place-updates.
 
 This may be used by storage showing very high random write performance.
 
 For example, it can be used when,
 
 Seq. writes (Data) + wait + Seq. writes (Node)
 
 is pretty much slower than,
 
 Rand. writes (Data)
 
 Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
 ---
  Documentation/ABI/testing/sysfs-fs-f2fs |  7 +++
  Documentation/filesystems/f2fs.txt  |  9 -
  fs/f2fs/f2fs.h  |  1 +
  fs/f2fs/file.c  |  7 +++
  fs/f2fs/segment.c   |  3 ++-
  fs/f2fs/segment.h   | 14 ++
  fs/f2fs/super.c |  2 ++
  7 files changed, 33 insertions(+), 10 deletions(-)
 
 diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs 
 b/Documentation/ABI/testing/sysfs-fs-f2fs
 index 62dd725..6f9157f 100644
 --- a/Documentation/ABI/testing/sysfs-fs-f2fs
 +++ b/Documentation/ABI/testing/sysfs-fs-f2fs
 @@ -44,6 +44,13 @@ Description:
Controls the FS utilization condition for the in-place-update
policies.
  
 +What:/sys/fs/f2fs/disk/min_fsync_blocks
 +Date:September 2014
 +Contact: Jaegeuk Kim jaeg...@kernel.org
 +Description:
 +  Controls the dirty page count condition for the in-place-update
 +  policies.
 +
  What:/sys/fs/f2fs/disk/max_small_discards
  Date:November 2013
  Contact: Jaegeuk Kim jaegeuk@samsung.com
 diff --git a/Documentation/filesystems/f2fs.txt 
 b/Documentation/filesystems/f2fs.txt
 index a2046a7..d010da8 100644
 --- a/Documentation/filesystems/f2fs.txt
 +++ b/Documentation/filesystems/f2fs.txt
 @@ -194,13 +194,20 @@ Files in /sys/fs/f2fs/devname
updates in f2fs. There are five policies:
 0: F2FS_IPU_FORCE, 1: F2FS_IPU_SSR,
 2: F2FS_IPU_UTIL,  3: F2FS_IPU_SSR_UTIL,
 -   4: F2FS_IPU_DISABLE.
 +   4: F2FS_IPU_FSYNC, 5: F2FS_IPU_DISABLE.
  
   min_ipu_util This parameter controls the threshold to 
 trigger
in-place-updates. The number indicates 
 percentage
of the filesystem utilization, and used by
F2FS_IPU_UTIL and F2FS_IPU_SSR_UTIL policies.
  
 + min_fsync_blocks This parameter controls the threshold to 
 trigger
 +  in-place-updates when F2FS_IPU_FSYNC mode is 
 set.
 +   The number indicates the number of dirty pages
 +   when fsync needs to flush on its call path. If
 +   the number is less than this value, it triggers
 +   in-place-updates.
 +
   max_victim_search This parameter controls the number of trials to
 find a victim segment when conducting SSR and
 cleaning operations. The default value is 4096
 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 index 2756c16..4f84d2a 100644
 --- a/fs/f2fs/f2fs.h
 +++ b/fs/f2fs/f2fs.h
 @@ -386,6 +386,7 @@ struct f2fs_sm_info {
  
   unsigned int ipu_policy;/* in-place-update policy */
   unsigned int min_ipu_util;  /* in-place-update threshold */
 + unsigned int min_fsync_blocks;  /* threshold for fsync */
  
   /* for flush command control */
   struct flush_cmd_control *cmd_control_info;
 diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
 index 77426c7..af06e22 100644
 --- a/fs/f2fs/file.c
 +++ b/fs/f2fs/file.c
 @@ -154,12 +154,11 @@ int f2fs_sync_file(struct file *file, loff_t start, 
 loff_t end, int datasync)
   trace_f2fs_sync_file_enter(inode);
  
   /* if fdatasync is triggered, let's do in-place-update */
 - if (datasync)
 + if (get_dirty_pages(inode) <= SM_I(sbi)->min_fsync_blocks)
   set_inode_flag(fi, FI_NEED_IPU);
 -
   ret = filemap_write_and_wait_range(inode->i_mapping, start, end);
 - if (datasync)
 - clear_inode_flag(fi, FI_NEED_IPU);
 + clear_inode_flag(fi, FI_NEED_IPU);
 +
   if (ret) {
   trace_f2fs_sync_file_exit(inode, need_cp, datasync, ret);
   return ret;
 diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
 index e158d63..c6f627b 100644
 --- a/fs/f2fs/segment.c
 +++ b/fs/f2fs/segment.c
 @@ -1928,8 +1928,9 @@ int build_segment_manager(struct 

Re: [f2fs-dev] [PATCH] f2fs: reposition unlock_new_inode to prevent accessing invalid inode

2014-08-27 Thread Changman Lee
Hi Chao,

I agree that unlock_new_inode should come after make_bad_inode.

About this scenario,
I think we should check whether the following can occur:
a newly allocated inode could be picked as a victim by the gc thread.
Then f2fs_iget called by Thread A has to fail, because we handled the inode as a
bad inode in Thread B. However, f2fs_iget could still get the inode.
How about checking it with is_bad_inode() in f2fs_iget?

Thanks,
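
The ordering in the patch and the extra check suggested above can be modelled in a few lines of userspace C. This is only a toy, single-threaded model: the toy_inode structure, its flags and the toy_* functions are invented for illustration and do not reflect the kernel's inode locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Invented stand-in for an inode; "is_new" plays the role of I_NEW. */
struct toy_inode {
	bool is_new;	/* still being set up; lookups must wait */
	bool is_bad;	/* creation failed; must not be used */
};

/* Error path in the patched order: mark bad first, then unlock. */
static void toy_fail_new_inode(struct toy_inode *inode)
{
	assert(inode && inode->is_new);

	inode->is_bad = true;	/* stands in for make_bad_inode() */
	inode->is_new = false;	/* stands in for unlock_new_inode() */
}

/*
 * Lookup with the suggested check: return NULL while the inode is still
 * locked (a real lookup would sleep instead) or once it is marked bad.
 */
static struct toy_inode *toy_iget(struct toy_inode *inode)
{
	if (inode->is_new)
		return NULL;
	if (inode->is_bad)
		return NULL;
	return inode;
}
```

In this model a lookup can never observe an unlocked inode that is not yet marked bad, because the bad flag is set before the new flag is cleared.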

On Tue, Aug 26, 2014 at 06:35:29PM +0800, Chao Yu wrote:
 As the race condition on the inode cache, following scenario can appear:
 [Thread a]                              [Thread b]
 ->f2fs_mkdir
   ->f2fs_add_link
     ->__f2fs_add_link
       ->init_inode_metadata failed here
                                         ->gc_thread_func
                                           ->f2fs_gc
                                             ->do_garbage_collect
                                               ->gc_data_segment
                                                 ->f2fs_iget
                                                   ->iget_locked
                                                     ->wait_on_inode
       ->unlock_new_inode
                                                 ->move_data_page
       ->make_bad_inode
       ->iput
 
 When we fail in create/symlink/mkdir/mknod/tmpfile, the new allocated inode
 should be set as bad to avoid being accessed by other thread. But in above
 scenario, it allows f2fs to access the invalid inode before this inode was set
 as bad.
 This patch fix the potential problem, and this issue was found by code review.
 
 Signed-off-by: Chao Yu chao2...@samsung.com
 ---
  fs/f2fs/namei.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)
 
 diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
 index 6b53ce9..845f1be 100644
 --- a/fs/f2fs/namei.c
 +++ b/fs/f2fs/namei.c
 @@ -134,8 +134,8 @@ static int f2fs_create(struct inode *dir, struct dentry 
 *dentry, umode_t mode,
   return 0;
  out:
   clear_nlink(inode);
 - unlock_new_inode(inode);
   make_bad_inode(inode);
 + unlock_new_inode(inode);
   iput(inode);
   alloc_nid_failed(sbi, ino);
   return err;
 @@ -267,8 +267,8 @@ static int f2fs_symlink(struct inode *dir, struct dentry 
 *dentry,
   return err;
  out:
   clear_nlink(inode);
 - unlock_new_inode(inode);
   make_bad_inode(inode);
 + unlock_new_inode(inode);
   iput(inode);
   alloc_nid_failed(sbi, inode->i_ino);
   return err;
 @@ -308,8 +308,8 @@ static int f2fs_mkdir(struct inode *dir, struct dentry 
 *dentry, umode_t mode)
  out_fail:
   clear_inode_flag(F2FS_I(inode), FI_INC_LINK);
   clear_nlink(inode);
 - unlock_new_inode(inode);
   make_bad_inode(inode);
 + unlock_new_inode(inode);
   iput(inode);
   alloc_nid_failed(sbi, inode->i_ino);
   return err;
 @@ -354,8 +354,8 @@ static int f2fs_mknod(struct inode *dir, struct dentry 
 *dentry,
   return 0;
  out:
   clear_nlink(inode);
 - unlock_new_inode(inode);
   make_bad_inode(inode);
 + unlock_new_inode(inode);
   iput(inode);
   alloc_nid_failed(sbi, inode->i_ino);
   return err;
 @@ -688,8 +688,8 @@ release_out:
  out:
   f2fs_unlock_op(sbi);
   clear_nlink(inode);
 - unlock_new_inode(inode);
   make_bad_inode(inode);
 + unlock_new_inode(inode);
   iput(inode);
   alloc_nid_failed(sbi, inode->i_ino);
   return err;
 -- 
 2.0.0.421.g786a89d
 
 
 


Re: [f2fs-dev] [PATCH 07/11] f2fs: enable in-place-update for fdatasync

2014-07-29 Thread Changman Lee
On Tue, Jul 29, 2014 at 05:22:15AM -0700, Jaegeuk Kim wrote:
 Hi Changman,
 
 On Tue, Jul 29, 2014 at 09:41:11AM +0900, Changman Lee wrote:
  Hi Jaegeuk,
  
  On Fri, Jul 25, 2014 at 03:47:21PM -0700, Jaegeuk Kim wrote:
   This patch enforces in-place-updates only when fdatasync is requested.
   If we adopt this in-place-updates for the fdatasync, we can skip to write 
   the
   recovery information.
  
  But, as you know, random writes occur when switching to in-place-updates,
  which will degrade write performance. Is there any case where
  in-place-updates are better, except for recovery or high utilization?
 
 As I described, you can easily imagine that if users request a small amount of
 data writes with fdatasync, we have to do data writes + node writes.
 But if we can do an in-place-update, we don't need to write node blocks.
 Surely it triggers random writes; however, the amount of data is pretty small
 and the device handles it very fast in its internal cache, so that it can
 enhance the performance.
 
 Thanks,

Partially agree. Sometimes I see that SSR shows lower performance than
IPU. One of the reasons might be node writes.
Anyway, if so, we should know the total number of dirty pages for fdatasync,
and the threshold should be tunable according to the random write performance
of the device.

Thanks,

 
  
  Thanks
  
   
   Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
   ---
fs/f2fs/f2fs.h| 1 +
fs/f2fs/file.c| 7 +++
fs/f2fs/segment.h | 4 
3 files changed, 12 insertions(+)
   
   diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
   index ab36025..8f8685e 100644
   --- a/fs/f2fs/f2fs.h
   +++ b/fs/f2fs/f2fs.h
   @@ -998,6 +998,7 @@ enum {
 FI_INLINE_DATA, /* used for inline data*/
 FI_APPEND_WRITE,/* inode has appended data */
 FI_UPDATE_WRITE,/* inode has in-place-update data */
   + FI_NEED_IPU,/* used fo ipu for fdatasync */
};

static inline void set_inode_flag(struct f2fs_inode_info *fi, int flag)
   diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
   index 121689a..e339856 100644
   --- a/fs/f2fs/file.c
   +++ b/fs/f2fs/file.c
   @@ -127,11 +127,18 @@ int f2fs_sync_file(struct file *file, loff_t start, 
   loff_t end, int datasync)
 return 0;

 trace_f2fs_sync_file_enter(inode);
   +
   + /* if fdatasync is triggered, let's do in-place-update */
   + if (datasync)
   + set_inode_flag(fi, FI_NEED_IPU);
   +
 ret = filemap_write_and_wait_range(inode->i_mapping, start, end);
 if (ret) {
 trace_f2fs_sync_file_exit(inode, need_cp, datasync, ret);
 return ret;
 }
   + if (datasync)
   + clear_inode_flag(fi, FI_NEED_IPU);

 /*
  * if there is no written data, don't waste time to write recovery info.
   diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
   index ee5c75e..55973f7 100644
   --- a/fs/f2fs/segment.h
   +++ b/fs/f2fs/segment.h
   @@ -486,6 +486,10 @@ static inline bool need_inplace_update(struct inode 
   *inode)
 if (S_ISDIR(inode->i_mode))
 return false;

   + /* this is only set during fdatasync */
   + if (is_inode_flag_set(F2FS_I(inode), FI_NEED_IPU))
   + return true;
   +
 switch (SM_I(sbi)->ipu_policy) {
 case F2FS_IPU_FORCE:
 return true;
   -- 
   1.8.5.2 (Apple Git-48)
   
   


Re: [f2fs-dev] [PATCH] Remove an unnecessary line in allocate_data_block.

2014-07-29 Thread Changman Lee
On Tue, Jul 29, 2014 at 06:24:48AM -0700, Jaegeuk Kim wrote:
 Hi Dongho,
 
 At first, please write a patch under the correct rule.
 (e.g., description)
 
 About this change, it's negative.
 When considering SSR, we need to take care of the following scenario.
 - old segno : X
 - new address : Z
 - old curseg : Y
 This means, a new block is supposed to be written to Z from X.
 And Z is newly allocated in the same path from Y.
 
 In that case, we should trigger locate_dirty_segment for Y, since
 it was a current_segment and can be dirty owing to SSR.
 But that was not included in the dirty list.
 
 Thanks,
 

We have already chosen the old curseg (Y), and then we allocate the new address
(Z) from the old curseg (Y). After that we call refresh_sit_entry(old address,
new address). In that function, we call locate_dirty_segment with the old seg
and the old curseg. So calling locate_dirty_segment again after
refresh_sit_entry is redundant.

Thanks,

 On Mon, Jul 28, 2014 at 08:34:25AM +, Dongho Sim wrote:
  Hi, Chao.
  It's my mistake.
  
  Thanks :-)
  
  Signed-off-by: Dongho Sim dh@samsung.com
  ---
   fs/f2fs/segment.c | 3 ---
   1 file changed, 3 deletions(-)
  
  diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
  index 8a6e57d..7af4a8d 100644
  --- a/fs/f2fs/segment.c
  +++ b/fs/f2fs/segment.c
  @@ -973,14 +973,12 @@ void allocate_data_block(struct f2fs_sb_info *sbi, 
  struct page *page,
   {
  struct sit_info *sit_i = SIT_I(sbi);
  struct curseg_info *curseg;
  -   unsigned int old_cursegno;
   
  curseg = CURSEG_I(sbi, type);
   
  mutex_lock(&curseg->curseg_mutex);
   
  *new_blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
  -   old_cursegno = curseg->segno;
   
  /*
   * __add_sum_entry should be resided under the curseg_mutex
  @@ -1001,7 +999,6 @@ void allocate_data_block(struct f2fs_sb_info *sbi, 
  struct page *page,
   * since SSR needs latest valid block information.
   */
  refresh_sit_entry(sbi, old_blkaddr, *new_blkaddr);
  -   locate_dirty_segment(sbi, old_cursegno);
   
  mutex_unlock(&sit_i->sentry_lock);
   
  -- 
  1.9.1
  
  --- Original Message ---
  Sender : Chao Yu chao2...@samsung.com Engineer/SRC-Nanjing-Mobile Solution Lab/Samsung Electronics
  Date : 2014-07-28 16:21 (GMT+09:00)
  Title : Re: [f2fs-dev] [PATCH] Remove an unnecessary line in 
  allocate_data_block.
  
  Hi Dongho,
  
   - Original Message -
   
   From: Dongho Sim 
   Sent: Monday, July 28, 2014 1:51 PM
   To: Chao Yu 
   Cc: jaeg...@kernel.org, linux-f2fs-devel@lists.sourceforge.net
   Subject: Re: [f2fs-dev] [PATCH] Remove an unnecessary line in 
   allocate_data_block.
   
   Yes, there was another one.
   Thanks Chao, :-)
   
   Signed-off-by: Dongho Sim 
   ---
fs/f2fs/segment.c | 2 --
1 file changed, 2 deletions(-)
   
   diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
   index 8a6e57d..3ab7749 100644
   --- a/fs/f2fs/segment.c
   +++ b/fs/f2fs/segment.c
   @@ -980,7 +980,6 @@ void allocate_data_block(struct f2fs_sb_info *sbi, 
   struct page *page,
   mutex_lock(&curseg->curseg_mutex);
   
   *new_blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
   -   old_cursegno = curseg->segno;
  
  The definition of old_cursegno also should be removed.
  
  Thanks,
  Yu
  
   
   /*
* __add_sum_entry should be resided under the curseg_mutex
   @@ -1001,7 +1000,6 @@ void allocate_data_block(struct f2fs_sb_info *sbi, 
   struct page *page,
* since SSR needs latest valid block information.
*/
   refresh_sit_entry(sbi, old_blkaddr, *new_blkaddr);
   -   locate_dirty_segment(sbi, old_cursegno);
   
   mutex_unlock(&sit_i->sentry_lock);
   
   --
   1.9.1
   
   --- Original Message ---
   Sender : Chao Yu 
   Date : 2014-07-28 14:35 (GMT+09:00)
   Title : RE: [f2fs-dev] [PATCH] Remove an unnecessary line in 
   allocate_data_block.
   
   Hi Dongho,
   
-Original Message-
From: Dongho Sim [mailto:dh@samsung.com]
Sent: Monday, July 28, 2014 7:03 AM
To: jaeg...@kernel.org; linux-f2fs-devel@lists.sourceforge.net
Subject: [f2fs-dev] [PATCH] Remove an unnecessary line in 
allocate_data_block.
   
Hi. There was an unnecessary line in function, allocate_data_block.
It is already done in
   refresh_sit_entry(sbi, old_blkaddr, *new_blkaddr);
   
Thanks.
   
   Agreed,
   How about removing old_cursegno too as it's no longer used in 
   allocate_data_block?
   
   Thanks,
   Yu
   
   
Signed-off-by: Dongho Sim
---
 fs/f2fs/segment.c | 1 -
 1 file changed, 1 deletion(-)
   
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 8a6e57d..a3c7aae 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1001,7 +1001,6 @@ void allocate_data_block(struct f2fs_sb_info 
*sbi, struct page *page,
  * since SSR needs latest valid block information.
  */
  refresh_sit_entry(sbi, old_blkaddr, *new_blkaddr);
- locate_dirty_segment(sbi, 

Re: [f2fs-dev] [PATCH 07/11] f2fs: enable in-place-update for fdatasync

2014-07-28 Thread Changman Lee
Hi Jaegeuk,

On Fri, Jul 25, 2014 at 03:47:21PM -0700, Jaegeuk Kim wrote:
 This patch enforces in-place-updates only when fdatasync is requested.
 If we adopt this in-place-updates for the fdatasync, we can skip to write the
 recovery information.

But, as you know, random writes occur when switching to in-place-updates,
which will degrade write performance. Is there any case where
in-place-updates are better, except for recovery or high utilization?

Thanks

 
 Signed-off-by: Jaegeuk Kim jaeg...@kernel.org
 ---
  fs/f2fs/f2fs.h| 1 +
  fs/f2fs/file.c| 7 +++
  fs/f2fs/segment.h | 4 
  3 files changed, 12 insertions(+)
 
 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 index ab36025..8f8685e 100644
 --- a/fs/f2fs/f2fs.h
 +++ b/fs/f2fs/f2fs.h
 @@ -998,6 +998,7 @@ enum {
   FI_INLINE_DATA, /* used for inline data*/
   FI_APPEND_WRITE,/* inode has appended data */
   FI_UPDATE_WRITE,/* inode has in-place-update data */
 + FI_NEED_IPU,/* used fo ipu for fdatasync */
  };
  
  static inline void set_inode_flag(struct f2fs_inode_info *fi, int flag)
 diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
 index 121689a..e339856 100644
 --- a/fs/f2fs/file.c
 +++ b/fs/f2fs/file.c
 @@ -127,11 +127,18 @@ int f2fs_sync_file(struct file *file, loff_t start, 
 loff_t end, int datasync)
   return 0;
  
   trace_f2fs_sync_file_enter(inode);
 +
 + /* if fdatasync is triggered, let's do in-place-update */
 + if (datasync)
 + set_inode_flag(fi, FI_NEED_IPU);
 +
   ret = filemap_write_and_wait_range(inode->i_mapping, start, end);
   if (ret) {
   trace_f2fs_sync_file_exit(inode, need_cp, datasync, ret);
   return ret;
   }
 + if (datasync)
 + clear_inode_flag(fi, FI_NEED_IPU);
  
   /*
* if there is no written data, don't waste time to write recovery info.
 diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
 index ee5c75e..55973f7 100644
 --- a/fs/f2fs/segment.h
 +++ b/fs/f2fs/segment.h
 @@ -486,6 +486,10 @@ static inline bool need_inplace_update(struct inode 
 *inode)
   if (S_ISDIR(inode->i_mode))
   return false;
  
 + /* this is only set during fdatasync */
 + if (is_inode_flag_set(F2FS_I(inode), FI_NEED_IPU))
 + return true;
 +
   switch (SM_I(sbi)->ipu_policy) {
   case F2FS_IPU_FORCE:
   return true;
 -- 
 1.8.5.2 (Apple Git-48)
 
 


Re: [f2fs-dev] [PATCH 1/2 V4] mkfs.f2fs: large volume support

2014-07-10 Thread Changman Lee
Hi, Jaegeuk

A long time ago, I sent 3 patches for large volume support in mkfs, fsck
and the kernel, but the mkfs patch was missed. So I am resending it with the
conflicts against the current git tree resolved.

Changes from V3
 o remove cp_payload in f2fs_super_block

Changes from V2
 o remove CP_LARGE_VOL_LFLAG instead, use cp_payload in superblock
  because disk size is determined at format

Changes from V1
 o fix orphan node blkaddr


Regards,
Changman Lee
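
The size limit quoted in the patch comment can be checked with a few lines of arithmetic. The sketch below assumes a 4KB block and a 2MB segment, and reuses the F2FS_BLK_ALIGN macro from the patch; the TOY_SEGMENT_SIZE constant and the toy_* helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define F2FS_BLKSIZE		4096	/* support only 4KB block */
#define F2FS_BLK_ALIGN(x)	(((x) + F2FS_BLKSIZE - 1) / F2FS_BLKSIZE)
#define TOY_SEGMENT_SIZE	(2ULL * 1024 * 1024)	/* assumed 2MB segment */

/* Largest volume addressable with a 4-byte block address, in bytes. */
static uint64_t toy_max_volume_bytes(void)
{
	return (UINT64_C(1) << 32) * F2FS_BLKSIZE;
}

/* Number of segments in such a volume. */
static uint64_t toy_max_segments(void)
{
	return toy_max_volume_bytes() / TOY_SEGMENT_SIZE;
}

/* Blocks needed to hold a bitmap of the given size in bytes. */
static uint32_t toy_blocks_for_bitmap(uint32_t bitmap_bytes)
{
	/* Guard against wrap-around in the round-up addition. */
	assert(bitmap_bytes <= UINT32_MAX - (F2FS_BLKSIZE - 1));

	return F2FS_BLK_ALIGN(bitmap_bytes);
}
```

Under these assumptions, 2^32 blocks of 4KB give 16 TB, and dividing by the 2MB segment size gives 16 * 1024 * 1024 / 2 segments, which matches the F2FS_MAX_SEGMENT definition in the patch.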

-- >8 --

From b7d46c6aaf786d28f82c0fe5d116b561c03b4cb2 Mon Sep 17 00:00:00 2001
From: Changman Lee cm224@samsung.com
Date: Thu, 10 Jul 2014 15:26:04 +0900
Subject: [PATCH] mkfs.f2fs: large volume support

This patch adds support for large volumes over about 3TB.

Signed-off-by: Changman Lee cm224@samsung.com
---
 include/f2fs_fs.h  |  8 ++
 mkfs/f2fs_format.c | 79 +++---
 2 files changed, 71 insertions(+), 16 deletions(-)

diff --git a/include/f2fs_fs.h b/include/f2fs_fs.h
index 53b8cb9..80ce918 100644
--- a/include/f2fs_fs.h
+++ b/include/f2fs_fs.h
@@ -221,6 +221,7 @@ enum {
 #define F2FS_LOG_SECTORS_PER_BLOCK 3   /* 4KB: F2FS_BLKSIZE */
 #define F2FS_BLKSIZE   4096/* support only 4KB block */
 #define F2FS_MAX_EXTENSION 64  /* # of extension entries */
+#define F2FS_BLK_ALIGN(x)  (((x) + F2FS_BLKSIZE - 1) / F2FS_BLKSIZE)
 
 #define NULL_ADDR  0x0U
 #define NEW_ADDR   -1U
@@ -456,6 +457,13 @@ struct f2fs_nat_block {
 #define SIT_ENTRY_PER_BLOCK (PAGE_CACHE_SIZE / sizeof(struct f2fs_sit_entry))
 
 /*
+ * F2FS uses 4 bytes to represent block address. As a result, supported size of
+ * disk is 16 TB and it equals to 16 * 1024 * 1024 / 2 segments.
+ */
+#define F2FS_MAX_SEGMENT   ((16 * 1024 * 1024) / 2)
+#define MAX_SIT_BITMAP_SIZE((F2FS_MAX_SEGMENT / SIT_ENTRY_PER_BLOCK) / 8)
+
+/*
  * Note that f2fs_sit_entry-vblocks has the following bit-field information.
  * [15:10] : allocation type such as CURSEG__TYPE
  * [9:0] : valid block count
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 1568545..a62a8fe 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -101,7 +101,8 @@ static int f2fs_prepare_super_block(void)
u_int32_t blocks_for_sit, blocks_for_nat, blocks_for_ssa;
u_int32_t total_valid_blks_available;
u_int64_t zone_align_start_offset, diff, total_meta_segments;
-   u_int32_t sit_bitmap_size, max_nat_bitmap_size, max_nat_segments;
+   u_int32_t sit_bitmap_size, max_sit_bitmap_size;
+   u_int32_t max_nat_bitmap_size, max_nat_segments;
u_int32_t total_zones;
 
super_block.magic = cpu_to_le32(F2FS_SUPER_MAGIC);
@@ -197,8 +198,26 @@ static int f2fs_prepare_super_block(void)
 */
    sit_bitmap_size = ((le32_to_cpu(super_block.segment_count_sit) / 2) <<
                log_blks_per_seg) / 8;
-   max_nat_bitmap_size = CHECKSUM_OFFSET - sizeof(struct f2fs_checkpoint) + 1 -
-           sit_bitmap_size;
+
+   if (sit_bitmap_size > MAX_SIT_BITMAP_SIZE)
+           max_sit_bitmap_size = MAX_SIT_BITMAP_SIZE;
+   else
+           max_sit_bitmap_size = sit_bitmap_size;
+
+   /*
+    * It should be reserved minimum 1 segment for nat.
+    * When sit is too large, we should expand cp area. It requires more pages for cp.
+    */
+   if (max_sit_bitmap_size >
+                   (CHECKSUM_OFFSET - sizeof(struct f2fs_checkpoint) + 65)) {
+           max_nat_bitmap_size = CHECKSUM_OFFSET - sizeof(struct f2fs_checkpoint) + 1;
+           super_block.cp_payload = F2FS_BLK_ALIGN(max_sit_bitmap_size);
+   } else {
+           max_nat_bitmap_size = CHECKSUM_OFFSET - sizeof(struct f2fs_checkpoint) + 1
+                   - max_sit_bitmap_size;
+           super_block.cp_payload = 0;
+   }
+
    max_nat_segments = (max_nat_bitmap_size * 8) >> log_blks_per_seg;
 
    if (le32_to_cpu(super_block.segment_count_nat) > max_nat_segments)
@@ -414,6 +433,7 @@ static int f2fs_write_check_point_pack(void)
u_int64_t cp_seg_blk_offset = 0;
u_int32_t crc = 0;
int i;
+   char *cp_payload = NULL;
 
ckp = calloc(F2FS_BLKSIZE, 1);
if (ckp == NULL) {
@@ -427,6 +447,12 @@ static int f2fs_write_check_point_pack(void)
return -1;
}
 
+   cp_payload = calloc(F2FS_BLKSIZE, 1);
+   if (cp_payload == NULL) {
+   MSG(1, "\tError: Calloc Failed for cp_payload!!!\n");
+   return -1;
+   }
+
/* 1. cp page 1 of checkpoint pack 1 */
ckp->checkpoint_ver = cpu_to_le64(1);
ckp->cur_node_segno[0] =
@@ -465,9 +491,10 @@ static int f2fs_write_check_point_pack(void)
((le32_to_cpu(ckp->free_segment_count) + 6 -
le32_to_cpu(ckp->overprov_segment_count)) *
 config.blks_per_seg));
-   ckp->cp_pack_total_block_count

Re: [f2fs-dev] [PATCH 3/4] f2fs: use find_next_bit_le rather than test_bit_le in, find_in_block

2014-07-06 Thread Changman Lee
Hello,

On Fri, Jul 04, 2014 at 11:25:35PM -0700, Jaegeuk Kim wrote:
 To Changman,
 
 Just for sure, can you reproduce this issue in the x86 machine with proper
 benchmarks? (i.e., test_bit_le vs. find_next_bit_le)

It shows quite a different result of bit_mod_test between server and desktop.

CPU i5 x86_64 Ubuntu Server - 3.16.0-rc3

[266627.204776] find_next_bit_le    test_bit_le
[266627.205319] 1832                1774
[266627.206223] 1292                1746
[266627.207092] 1205                1746
[266627.207876]  914                1746
[266627.208710] 1082                1746
[266627.209506]  956                1746
[266627.210175]  523                1746

[266627.211839] 3907                1746
[266627.212898] 1850                1746
[266627.214046] 2153                1746
[266627.215118] 1894                1746


CPU i7 x86_64 Mint Desktop - 3.13.0-24-generic

[432284.422356] find_next_bit_le    test_bit_le
[432284.423470] 3771                3878
[432284.425400] 2671                3696
[432284.427221] 2492                3760
[432284.428908] 1971                3696
[432284.430640] 2191                3730
[432284.432323] 1986                3696
[432284.433741] 1123                3698

[432284.437269] 8299                3696
[432284.439487] 3842                3696
[432284.441850] 4334                3696
[432284.444080] 3885                3696

 
 To all,
 
 I cautiously suspect that the performances might be different when processing
 f2fs_find_entry, since L1/L2 cache misses due to the intermediate routines 
 like
 matching strings can make some effect on it.
 
 But, IMO, it is still worth to investigate this issue and contemplate how to
 detect all ones or not.
 
 Ah, one solution may be using 2 bytes from the reserved space, total 3, to
 indicate how many valid dentries are stored in the dentry block.
 
 Any ideas?

Agreed. When more than half of the bits are ones, test_bit is better
than find_next_bit. So we could decide whether to use test_bit or
find_next_bit depending on the count of one bits.

Comparing just test_bit and find_next_bit, I think test_bit is more effective
in f2fs because of f2fs's dentry management policy.
One dentry bucket is filled, then the next dentry bucket is filled, from the
lower to the higher level. If empty slots exist at a lower level, they are used
first. So I guess that the one bits come to outnumber the zero bits as time
goes by.

Thanks,
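
The trade-off discussed above can be sketched in userspace. Both counting functions below walk a 216-bit bitmap (the dentry bitmap size mentioned in this thread): one tests every bit, the other uses a find-next style helper that skips zero bytes. All toy_* names are invented, the helpers are simplified stand-ins rather than the kernel's test_bit_le/find_next_bit_le, and the half-full threshold in toy_prefer_test_bit is just the heuristic proposed in this mail.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_BITS	216	/* same size as the f2fs dentry bitmap */
#define TOY_NR_BYTES	(TOY_NR_BITS / 8)

/* Little-endian bit test: bit nr lives in byte nr / 8, position nr % 8. */
static int toy_test_bit_le(size_t nr, const unsigned char *map)
{
	return (map[nr / 8] >> (nr % 8)) & 1;
}

/* Return the first set bit at or after 'start', or 'size' if there is none. */
static size_t toy_find_next_bit_le(const unsigned char *map, size_t size,
				   size_t start)
{
	size_t pos = start;

	while (pos < size) {
		/* Skip a whole zero byte at once when byte-aligned. */
		if (pos % 8 == 0 && map[pos / 8] == 0) {
			pos += 8;
			continue;
		}
		if (toy_test_bit_le(pos, map))
			return pos;
		pos++;
	}
	return size;
}

/* Count set bits by testing every position. */
static size_t toy_count_by_test_bit(const unsigned char *map)
{
	size_t pos, count = 0;

	for (pos = 0; pos < TOY_NR_BITS; pos++)
		if (toy_test_bit_le(pos, map))
			count++;
	return count;
}

/* Count set bits by jumping from one set bit to the next. */
static size_t toy_count_by_find_next(const unsigned char *map)
{
	size_t pos = 0, count = 0;

	while ((pos = toy_find_next_bit_le(map, TOY_NR_BITS, pos)) < TOY_NR_BITS) {
		count++;
		pos++;
	}
	return count;
}

/* Heuristic from the mail: prefer test_bit once over half the bits are set. */
static int toy_prefer_test_bit(size_t ones)
{
	assert(ones <= TOY_NR_BITS);

	return ones > TOY_NR_BITS / 2;
}
```

On a sparse bitmap the find-next loop takes few iterations because zero bytes are skipped; on a dense bitmap it degenerates to one helper call per bit, which is the situation the heuristic tries to avoid.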

 
 Thanks,
 
 On Fri, Jul 04, 2014 at 04:04:09PM +0800, Gu Zheng wrote:
  Hi Yu,
  Thanks.
  On 07/04/2014 02:21 PM, Chao Yu wrote:
  
   Hi Jaegeuk, Gu, Changman
   
   -Original Message-
   From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
   Sent: Friday, July 04, 2014 1:36 PM
   To: Gu Zheng
   Cc: f2fs; fsdevel; 이창만; 俞
   Subject: Re: [PATCH 3/4] f2fs: use find_next_bit_le rather than 
   test_bit_le in, find_in_block
  
   Well, how about testing with many ones in the bit streams?
   Thanks,
  
   On Thu, Jul 03, 2014 at 06:14:02PM +0800, Gu Zheng wrote:
   Hi Jaegeuk, Changman
  
   Just a simple test, not very sure it can address
   our qualm.
  
   Bitmap size:216(the same as f2fs dentry_bits).
   CPU: Intel i5 x86_64.
  
   Time counting based on tsc(the less the fast).
   [Index of 1]    find_next_bit_le    test_bit_le
   0   20  117
   1   20  114
   2   20  113
   3   20  139
   4   22  121
   5   22  118
   6   22  115
   8   22  112
   9   22  106
   10  22  105
   11  22  100
   16  22  98
   48  22  97
   80  27  95
   104 27  92
   136 32  95
   160 32  92
   184 32  90
   200 27  87
   208 35  84
  
   According to the result, find_next_bit_le is always
   better than test_bit_le, though there may be some
   noise, but I think the result is clear.
   Hope it can help us.:)
   ps.The sample is attached too.
  
   Thanks,
   Gu
   
   I hope this could provide some help for this patch.
   
   I modify Gu's code like this, and add few test case:
   
   static void test_bit_search_speed(void)
   {
 unsigned long flags;
 uint64_t tsc_s_b1, tsc_s_e1, tsc_s_b2, tsc_s_e2;
 int i, j, pos;
 const void *bit_addr;
   
 local_irq_save(flags);
 preempt_disable();
 
 printk("find_next_bit   test_bit_le\n");
   
 for (i = 0; i < 24; i++) {
   
 

Re: [f2fs-dev] [PATCH 3/4] f2fs: use find_next_bit_le rather than test_bit_le in, find_in_block

2014-07-02 Thread Changman Lee
Hi, Gu

Unfortunately, find_next_bit isn't always better than test_bit.
Refer to commit 5d0c667121bfc8be76d1580f485bddbe73465d1a.

I remember that Jaegeuk previously changed find_next_bit to test_bit because
find_next_bit spent a lot of CPU time when there are lots of dentries, as with
postmark.
Sorry, I should have reported this sooner.

On Tue, Jun 24, 2014 at 06:20:41PM +0800, Gu Zheng wrote:
 Use find_next_bit_le rather than test_bit_le to improve search speed
 lightly.
 
 Signed-off-by: Gu Zheng guz.f...@cn.fujitsu.com
 ---
  fs/f2fs/dir.c |   43 +--
  1 files changed, 21 insertions(+), 22 deletions(-)
 
 diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
 index 3edd561..ba510fb 100644
 --- a/fs/f2fs/dir.c
 +++ b/fs/f2fs/dir.c
 @@ -93,42 +93,41 @@ static struct f2fs_dir_entry *find_in_block(struct page 
 *dentry_page,
   const char *name, size_t namelen, int *max_slots,
   f2fs_hash_t namehash, struct page **res_page)
  {
 - struct f2fs_dir_entry *de;
 - unsigned long bit_pos = 0;
 + unsigned long bit_pos = 0, bit_start = 0;
   struct f2fs_dentry_block *dentry_blk = kmap(dentry_page);
   const void *dentry_bits = &dentry_blk->dentry_bitmap;
 - int max_len = 0;
  
 - while (bit_pos < NR_DENTRY_IN_BLOCK) {
 - if (!test_bit_le(bit_pos, dentry_bits)) {
 - if (bit_pos == 0)
 - max_len = 1;
 - else if (!test_bit_le(bit_pos - 1, dentry_bits))
 - max_len++;
 - bit_pos++;
 - continue;
 + while (bit_start < NR_DENTRY_IN_BLOCK) {
 + struct f2fs_dir_entry *de;
 + int max_len = 0;
 +
 + bit_pos = find_next_bit_le(dentry_bits,
 + NR_DENTRY_IN_BLOCK, bit_start);
 +
 + max_len = bit_pos - bit_start;
 + if (max_len > *max_slots) {
 + *max_slots = max_len;
 + max_len = 0;
   }
 +
 + if (bit_pos >= NR_DENTRY_IN_BLOCK)
 + break;
 +
   de = &dentry_blk->dentry[bit_pos];
   if (early_match_name(name, namelen, namehash, de)) {
   if (!memcmp(dentry_blk->filename[bit_pos],
   name, namelen)) {
   *res_page = dentry_page;
 - goto found;
 + return de;
   }
   }
 - if (max_len > *max_slots) {
 - *max_slots = max_len;
 - max_len = 0;
 - }
 - bit_pos += GET_DENTRY_SLOTS(le16_to_cpu(de->name_len));
 +
 + bit_start = bit_pos
 + + GET_DENTRY_SLOTS(le16_to_cpu(de->name_len));
   }
  
 - de = NULL;
   kunmap(dentry_page);
 -found:
 - if (max_len > *max_slots)
 - *max_slots = max_len;
 - return de;
 + return NULL;
  }
  
  static struct f2fs_dir_entry *find_in_level(struct inode *dir,
 -- 
 1.7.7

___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH] f2fs: avoid overflow when large directory feathure is enabled

2014-05-27 Thread Changman Lee
Hi, Chao
Good catch. Please also update Documentation/filesystems/f2fs.txt.
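
For readers following along, the overflow in the quoted patch can be
reproduced with plain arithmetic. With the old check (level <
MAX_DIR_HASH_DEPTH / 2), level = 30 and dir_level = 2 pass the test and then
shift by 32, which is undefined for a 32-bit value. The sketch below is a
user-space copy of the clamped logic, not the kernel source; using 1U instead
of 1 for the shift is my own choice to keep the arithmetic unsigned.

```c
#include <assert.h>

#define MAX_DIR_HASH_DEPTH	63
/* Largest bucket count allowed in one level: 1 << 30. */
#define MAX_DIR_BUCKETS		(1U << ((MAX_DIR_HASH_DEPTH / 2) - 1))

/* Clamped bucket count, following the logic of the quoted patch. */
static unsigned int dir_buckets(unsigned int level, int dir_level)
{
	assert(dir_level >= 0);

	/* Shift only while the combined depth stays below 31 bits. */
	if (level + dir_level < MAX_DIR_HASH_DEPTH / 2)
		return 1U << (level + dir_level);
	return MAX_DIR_BUCKETS;
}
```

Checking the sum rather than level alone is what closes the gap: any
combination that would shift by 31 or more now returns the fixed maximum.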

On Tue, May 27, 2014 at 09:06:52AM +0800, Chao Yu wrote:
 When the large directory feature is enabled, there is one case which could
 cause an overflow in dir_buckets():
 special case: level + dir_level >= 32 and level < MAX_DIR_HASH_DEPTH / 2.
 
 Here we define MAX_DIR_BUCKETS to limit the return value when the condition
 could trigger potential overflow.
 
 Signed-off-by: Chao Yu chao2...@samsung.com
 ---
  fs/f2fs/dir.c   |4 ++--
  include/linux/f2fs_fs.h |3 +++
  2 files changed, 5 insertions(+), 2 deletions(-)
 
 diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
 index c3f1485..966acb0 100644
 --- a/fs/f2fs/dir.c
 +++ b/fs/f2fs/dir.c
 @@ -23,10 +23,10 @@ static unsigned long dir_blocks(struct inode *inode)
  
  static unsigned int dir_buckets(unsigned int level, int dir_level)
  {
 - if (level < MAX_DIR_HASH_DEPTH / 2)
 + if (level + dir_level < MAX_DIR_HASH_DEPTH / 2)
   return 1 << (level + dir_level);
   else
 - return 1 << ((MAX_DIR_HASH_DEPTH / 2 + dir_level) - 1);
 + return MAX_DIR_BUCKETS;
  }
  
  static unsigned int bucket_blocks(unsigned int level)
 diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
 index 8c03f71..ba6f312 100644
 --- a/include/linux/f2fs_fs.h
 +++ b/include/linux/f2fs_fs.h
 @@ -394,6 +394,9 @@ typedef __le32 f2fs_hash_t;
  /* MAX level for dir lookup */
  #define MAX_DIR_HASH_DEPTH   63
  
 +/* MAX buckets in one level of dir */
 +#define MAX_DIR_BUCKETS  (1 << ((MAX_DIR_HASH_DEPTH / 2) - 1))
 +
  #define SIZE_OF_DIR_ENTRY 11  /* by byte */
  #define SIZE_OF_DENTRY_BITMAP ((NR_DENTRY_IN_BLOCK + BITS_PER_BYTE - 1) / \
   BITS_PER_BYTE)
 -- 
 1.7.10.4
 
 
 



Re: [f2fs-dev] [PATCH v2] f2fs: avoid crash when trace f2fs_submit_page_mbio event in ra_sum_pages

2014-05-27 Thread Changman Lee
On Tue, May 27, 2014 at 02:32:57PM +0800, Chao Yu wrote:
 Hi changman,
 
  -Original Message-
  From: Changman Lee [mailto:cm224@samsung.com]
  Sent: Tuesday, May 27, 2014 9:25 AM
  To: Chao Yu
  Cc: Jaegeuk Kim; linux-fsde...@vger.kernel.org; 
  linux-ker...@vger.kernel.org;
  linux-f2fs-devel@lists.sourceforge.net
  Subject: Re: [f2fs-dev] [PATCH v2] f2fs: avoid crash when trace 
  f2fs_submit_page_mbio event
  in ra_sum_pages
  
  Hi, Chao
  
  Could you consider the following?
  Move the node_inode setup in front of build_segment_manager, then use
  node_inode instead of bd_inode.
 
 Jaegeuk and I discussed this solution previously in
 [PATCH 3/3 V3] f2fs: introduce f2fs_cache_node_page() to add page into 
 node_inode cache
 
 You can see it from this url:
 http://sourceforge.net/p/linux-f2fs/mailman/linux-f2fs-devel/?viewmonth=201312&page=5
 
 Also, it does not seem easy to change the order of build_*_manager and the
 creation of node_inode, because there are dependencies between them.
 

Sorry for making a mess of your patch thread.
I understand now. In your patch, using the NAT journal seems to be
possible. Anyway, thanks for your answer.

  
  On Tue, May 27, 2014 at 08:41:07AM +0800, Chao Yu wrote:
   Previously we allocate pages with no mapping in ra_sum_pages(), so we may
   encounter a crash in event trace of f2fs_submit_page_mbio where we access
   mapping data of the page.
  
   We'd better allocate pages in bd_inode mapping and invalidate these pages 
   after
   we restore data from pages. It could avoid crash in above scenario.
  
   Changes from V1
o remove redundant code in ra_sum_pages() suggested by Jaegeuk Kim.
  
   Call Trace:
[f1031630] ? ftrace_raw_event_f2fs_write_checkpoint+0x80/0x80 [f2fs]
[f10377bb] f2fs_submit_page_mbio+0x1cb/0x200 [f2fs]
[f103c5da] restore_node_summary+0x13a/0x280 [f2fs]
[f103e22d] build_curseg+0x2bd/0x620 [f2fs]
[f104043b] build_segment_manager+0x1cb/0x920 [f2fs]
[f1032c85] f2fs_fill_super+0x535/0x8e0 [f2fs]
[c115b66a] mount_bdev+0x16a/0x1a0
[f102f63f] f2fs_mount+0x1f/0x30 [f2fs]
[c115c096] mount_fs+0x36/0x170
[c1173635] vfs_kern_mount+0x55/0xe0
[c1175388] do_mount+0x1e8/0x900
[c1175d72] SyS_mount+0x82/0xc0
[c16059cc] sysenter_do_call+0x12/0x22
  
   Suggested-by: Jaegeuk Kim jaegeuk@samsung.com
   Signed-off-by: Chao Yu chao2...@samsung.com
   ---
fs/f2fs/node.c |   52 
   
1 file changed, 24 insertions(+), 28 deletions(-)
  
   diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
   index 3d60d3d..02a59e9 100644
   --- a/fs/f2fs/node.c
   +++ b/fs/f2fs/node.c
   @@ -1658,35 +1658,29 @@ int recover_inode_page(struct f2fs_sb_info *sbi, 
   struct page *page)
  
/*
 * ra_sum_pages() merge contiguous pages into one bio and submit.
   - * these pre-readed pages are linked in pages list.
   + * these pre-readed pages are alloced in bd_inode's mapping tree.
 */
   -static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head 
   *pages,
   +static int ra_sum_pages(struct f2fs_sb_info *sbi, struct page **pages,
 int start, int nrpages)
{
   - struct page *page;
   - int page_idx = start;
   + struct inode *inode = sbi->sb->s_bdev->bd_inode;
   + struct address_space *mapping = inode->i_mapping;
   + int i, page_idx = start;
 struct f2fs_io_info fio = {
 .type = META,
 .rw = READ_SYNC | REQ_META | REQ_PRIO
 };
  
   - for (; page_idx < start + nrpages; page_idx++) {
   - /* alloc temporal page for read node summary info*/
   - page = alloc_page(GFP_F2FS_ZERO);
   - if (!page)
   + for (i = 0; page_idx < start + nrpages; page_idx++, i++) {
   + /* alloc page in bd_inode for reading node summary info */
   + pages[i] = grab_cache_page(mapping, page_idx);
   + if (!pages[i])
 break;
   -
   - lock_page(page);
   - page->index = page_idx;
   - list_add_tail(&page->lru, pages);
   + f2fs_submit_page_mbio(sbi, pages[i], page_idx, &fio);
 }
  
   - list_for_each_entry(page, pages, lru)
   - f2fs_submit_page_mbio(sbi, page, page->index, &fio);
   -
 f2fs_submit_merged_bio(sbi, META, READ);
   -
   - return page_idx - start;
   + return i;
}
  
int restore_node_summary(struct f2fs_sb_info *sbi,
   @@ -1694,11 +1688,11 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
{
 struct f2fs_node *rn;
 struct f2fs_summary *sum_entry;
   - struct page *page, *tmp;
   + struct inode *inode = sbi->sb->s_bdev->bd_inode;
 block_t addr;
 int bio_blocks = MAX_BIO_BLOCKS(max_hw_blocks(sbi));
   - int i, last_offset, nrpages, err = 0;
   - LIST_HEAD(page_list);
   + struct page *pages[bio_blocks];
   + int i, idx, last_offset, nrpages, err = 0;
  
 /* scan the node segment */
 last_offset = sbi->blocks_per_seg;
   @@ -1709,29 +1703,31 @@ int restore_node_summary(struct f2fs_sb_info *sbi

Re: [f2fs-dev] [PATCH] f2fs: avoid crash when trace f2fs_submit_page_mbio event in ra_sum_pages

2014-05-26 Thread Changman Lee
On Mon, May 26, 2014 at 02:26:24PM +0800, Chao Yu wrote:
 Hi changman,
 
  -Original Message-
  From: Changman Lee [mailto:cm224@samsung.com]
  Sent: Friday, May 23, 2014 1:14 PM
  To: Jaegeuk Kim
  Cc: Chao Yu; linux-fsde...@vger.kernel.org; linux-ker...@vger.kernel.org;
  linux-f2fs-devel@lists.sourceforge.net
  Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid crash when trace 
  f2fs_submit_page_mbio event in
  ra_sum_pages
  
  On Wed, May 21, 2014 at 12:36:46PM +0900, Jaegeuk Kim wrote:
   Hi Chao,
  
   2014-05-16 (금), 17:14 +0800, Chao Yu:
Previously we allocate pages with no mapping in ra_sum_pages(), so we 
may
encounter a crash in event trace of f2fs_submit_page_mbio where we 
access
mapping data of the page.
   
We'd better allocate pages in bd_inode mapping and invalidate these 
pages after
we restore data from pages. It could avoid crash in above scenario.
   
Call Trace:
 [f1031630] ? ftrace_raw_event_f2fs_write_checkpoint+0x80/0x80 [f2fs]
 [f10377bb] f2fs_submit_page_mbio+0x1cb/0x200 [f2fs]
 [f103c5da] restore_node_summary+0x13a/0x280 [f2fs]
 [f103e22d] build_curseg+0x2bd/0x620 [f2fs]
 [f104043b] build_segment_manager+0x1cb/0x920 [f2fs]
 [f1032c85] f2fs_fill_super+0x535/0x8e0 [f2fs]
 [c115b66a] mount_bdev+0x16a/0x1a0
 [f102f63f] f2fs_mount+0x1f/0x30 [f2fs]
 [c115c096] mount_fs+0x36/0x170
 [c1173635] vfs_kern_mount+0x55/0xe0
 [c1175388] do_mount+0x1e8/0x900
 [c1175d72] SyS_mount+0x82/0xc0
 [c16059cc] sysenter_do_call+0x12/0x22
   
Signed-off-by: Chao Yu chao2...@samsung.com
---
 fs/f2fs/node.c |   49 -
 1 file changed, 28 insertions(+), 21 deletions(-)
   
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 3d60d3d..b5cd814 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1658,13 +1658,16 @@ int recover_inode_page(struct f2fs_sb_info 
*sbi, struct page *page)
   
 /*
  * ra_sum_pages() merge contiguous pages into one bio and submit.
- * these pre-readed pages are linked in pages list.
+ * these pre-readed pages are alloced in bd_inode's mapping tree.
  */
-static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head 
*pages,
+static int ra_sum_pages(struct f2fs_sb_info *sbi, struct page **pages,
int start, int nrpages)
 {
struct page *page;
+   struct inode *inode = sbi->sb->s_bdev->bd_inode;
  
  How about using sbi->meta_inode instead of bd_inode? Then we can cache
  summary pages for further I/O.
 
 As I understand it, ra_sum_pages() reads ahead node pages in the NODE segment,
 and we then fill the current summary cache with the nid from each node page's
 footer. So we should not cache these read-ahead pages in meta_inode's mapping.
 Am I missing something?
 
 Regards
 

Sorry, you're right. Forget about caching. I confused ra_sum_pages with
summary segments.

  
+   struct address_space *mapping = inode->i_mapping;
int page_idx = start;
+   int alloced, readed;
struct f2fs_io_info fio = {
.type = META,
.rw = READ_SYNC | REQ_META | REQ_PRIO
@@ -1672,21 +1675,23 @@ static int ra_sum_pages(struct f2fs_sb_info 
*sbi, struct list_head
  *pages,
   
for (; page_idx < start + nrpages; page_idx++) {
/* alloc temporal page for read node summary info*/
-   page = alloc_page(GFP_F2FS_ZERO);
+   page = grab_cache_page(mapping, page_idx);
if (!page)
break;
-
-   lock_page(page);
-   page->index = page_idx;
-   list_add_tail(&page->lru, pages);
+   page_cache_release(page);
  
   IMO, we don't need to do it like this.
   Instead,
 for() {
 page = grab_cache_page();
 if (!page)
 break;
 pages[page_idx - start] = page;
 f2fs_submit_page_mbio(sbi, page, &fio);
 }
 f2fs_submit_merged_bio(sbi, META, READ);
 return page_idx - start;
  
   Afterwards, in restore_node_summary(),
 lock_page() will wait for the read end_io.
 ...
 f2fs_put_page(pages[index], 1);
  
   Thanks,
  
}
   
-   list_for_each_entry(page, pages, lru)
-   f2fs_submit_page_mbio(sbi, page, page->index, &fio);
+   alloced = page_idx - start;
+   readed = find_get_pages_contig(mapping, start, alloced, pages);
+   BUG_ON(alloced != readed);
+
+   for (page_idx = 0; page_idx < readed; page_idx++)
+   f2fs_submit_page_mbio(sbi, pages[page_idx],
+   pages[page_idx]->index, &fio);
   
f2fs_submit_merged_bio(sbi, META, READ);
   
-   return page_idx - start;
+   return readed

[f2fs-dev] [PATCH 1/2 V3] mkfs.f2fs: large volume support

2014-05-26 Thread Changman Lee

Changes from V2
 o remove CP_LARGE_VOL_FLAG; use cp_payload in the superblock instead,
 because the disk size is determined at format time

Changes from V1
 o fix orphan node blkaddr
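
As a quick check of the limits quoted in the patch below (a 4-byte block
address, 4 KB blocks, and "16 * 1024 * 1024 / 2" segments), here is a
user-space sketch of the arithmetic. The 2 MB segment size is an assumption
taken from the default f2fs layout, and blk_align() only mirrors the idea of
the F2FS_BLK_ALIGN macro; none of these helper names come from f2fs-tools.

```c
#include <assert.h>
#include <stdint.h>

#define F2FS_BLKSIZE	4096ULL			/* bytes per block */
#define SEGMENT_BYTES	(2ULL * 1024 * 1024)	/* assumed: 2 MB per segment */

/* Largest volume addressable with a 32-bit block address, in bytes. */
static uint64_t max_volume_bytes(void)
{
	return (1ULL << 32) * F2FS_BLKSIZE;
}

/* Number of whole segments that fit in 'bytes'. */
static uint64_t segments_for(uint64_t bytes)
{
	return bytes / SEGMENT_BYTES;
}

/* Blocks needed to hold 'bytes', rounded up (same idea as F2FS_BLK_ALIGN). */
static uint64_t blk_align(uint64_t bytes)
{
	return (bytes + F2FS_BLKSIZE - 1) / F2FS_BLKSIZE;
}

/* Usage example: the numbers quoted in the patch comment. */
static void check_limits(void)
{
	assert(max_volume_bytes() == (16ULL << 40));	/* 16 TB */
	assert(segments_for(max_volume_bytes()) == 16 * 1024 * 1024 / 2);
}
```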

-- 8 --

From 7e5e66699bb383e4fa7ce970e1cc8e10eb0a5c6f Mon Sep 17 00:00:00 2001
From: root root@f2fs-00.(none)
Date: Mon, 12 May 2014 22:01:38 +0900
Subject: [PATCH 1/2] mkfs.f2fs: large volume support

This patch supports large volumes over about 3TB.

Signed-off-by: Changman Lee cm224@samsung.com
---
 include/f2fs_fs.h  |9 +++
 mkfs/f2fs_format.c |   68 +++-
 2 files changed, 66 insertions(+), 11 deletions(-)

diff --git a/include/f2fs_fs.h b/include/f2fs_fs.h
index 94d8dc3..3003f7f 100644
--- a/include/f2fs_fs.h
+++ b/include/f2fs_fs.h
@@ -223,6 +223,7 @@ enum {
 #define F2FS_LOG_SECTORS_PER_BLOCK 3   /* 4KB: F2FS_BLKSIZE */
 #define F2FS_BLKSIZE   4096/* support only 4KB block */
 #define F2FS_MAX_EXTENSION 64  /* # of extension entries */
+#define F2FS_BLK_ALIGN(x)  (((x) + F2FS_BLKSIZE - 1) / F2FS_BLKSIZE)
 
 #define NULL_ADDR  0x0U
 #define NEW_ADDR   -1U
@@ -279,6 +280,7 @@ struct f2fs_super_block {
__le16 volume_name[512];/* volume name */
__le32 extension_count; /* # of extensions below */
__u8 extension_list[F2FS_MAX_EXTENSION][8]; /* extension array */
+   __le32 cp_payload;
 } __attribute__((packed));
 
 /*
@@ -457,6 +459,13 @@ struct f2fs_nat_block {
 #define SIT_ENTRY_PER_BLOCK (PAGE_CACHE_SIZE / sizeof(struct f2fs_sit_entry))
 
 /*
+ * F2FS uses 4 bytes to represent block address. As a result, supported size of
+ * disk is 16 TB and it equals to 16 * 1024 * 1024 / 2 segments.
+ */
+#define F2FS_MAX_SEGMENT   ((16 * 1024 * 1024) / 2)
+#define MAX_SIT_BITMAP_SIZE((F2FS_MAX_SEGMENT / SIT_ENTRY_PER_BLOCK) / 8)
+
+/*
  * Note that f2fs_sit_entry->vblocks has the following bit-field information.
  * [15:10] : allocation type such as CURSEG__TYPE
  * [9:0] : valid block count
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index cdbf74a..58550a2 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -102,7 +102,8 @@ static int f2fs_prepare_super_block(void)
u_int32_t blocks_for_sit, blocks_for_nat, blocks_for_ssa;
u_int32_t total_valid_blks_available;
u_int64_t zone_align_start_offset, diff, total_meta_segments;
-   u_int32_t sit_bitmap_size, max_nat_bitmap_size, max_nat_segments;
+   u_int32_t sit_bitmap_size, max_sit_bitmap_size;
+   u_int32_t max_nat_bitmap_size, max_nat_segments;
u_int32_t total_zones;
 
super_block.magic = cpu_to_le32(F2FS_SUPER_MAGIC);
@@ -217,8 +218,25 @@ static int f2fs_prepare_super_block(void)
 */
sit_bitmap_size = ((le32_to_cpu(super_block.segment_count_sit) / 2) <<
log_blks_per_seg) / 8;
-   max_nat_bitmap_size = CHECKSUM_OFFSET - sizeof(struct f2fs_checkpoint) + 1 -
-   sit_bitmap_size;
+
+   if (sit_bitmap_size > MAX_SIT_BITMAP_SIZE)
+   max_sit_bitmap_size = MAX_SIT_BITMAP_SIZE;
+   else
+   max_sit_bitmap_size = sit_bitmap_size;
+
+   /*
+    * It should be reserved minimum 1 segment for nat.
+    * When sit is too large, we should expand cp area. It requires more pages for cp.
+    */
+   if (max_sit_bitmap_size >
+   (CHECKSUM_OFFSET - sizeof(struct f2fs_checkpoint) + 65)) {
+   max_nat_bitmap_size = CHECKSUM_OFFSET - sizeof(struct f2fs_checkpoint) + 1;
+   super_block.cp_payload = F2FS_BLK_ALIGN(max_sit_bitmap_size);
+   } else {
+   max_nat_bitmap_size = CHECKSUM_OFFSET - sizeof(struct f2fs_checkpoint) + 1 - max_sit_bitmap_size;
+   super_block.cp_payload = 0;
+   }
+
max_nat_segments = (max_nat_bitmap_size * 8) >> log_blks_per_seg;
 
if (le32_to_cpu(super_block.segment_count_nat) > max_nat_segments)
@@ -434,6 +452,7 @@ static int f2fs_write_check_point_pack(void)
u_int64_t cp_seg_blk_offset = 0;
u_int32_t crc = 0;
int i;
+   char *cp_payload = NULL;
 
ckp = calloc(F2FS_BLKSIZE, 1);
if (ckp == NULL) {
@@ -447,6 +466,12 @@ static int f2fs_write_check_point_pack(void)
return -1;
}
 
+   cp_payload = calloc(F2FS_BLKSIZE, 1);
+   if (cp_payload == NULL) {
+   MSG(1, "\tError: Calloc Failed for cp_payload!!!\n");
+   return -1;
+   }
+
/* 1. cp page 1 of checkpoint pack 1 */
ckp->checkpoint_ver = cpu_to_le64(1);
ckp->cur_node_segno[0] =
@@ -485,9 +510,11 @@ static int f2fs_write_check_point_pack(void)
((le32_to_cpu(ckp->free_segment_count) + 6 -
le32_to_cpu(ckp->overprov_segment_count)) *
 config.blks_per_seg));
-   ckp->cp_pack_total_block_count

[f2fs-dev] [PATCH 2/2 V3] fsck.f2fs: large volume support

2014-05-26 Thread Changman Lee
Changes from V2
 o remove CP_LARGE_VOL_FLAG; use cp_payload in the superblock instead,
  because the disk size is determined at format time

Changes from V1
 o fix orphan node blkaddr
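
The two address calculations that the patch below changes in fsck can be
summarised in a few lines of user-space C. This is a sketch under my own
assumptions: bitmap_off stands for the offset of sit_nat_version_bitmap inside
the first checkpoint block, and both function names are hypothetical rather
than taken from fsck.f2fs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLKSIZE 4096u

enum { NAT_BITMAP, SIT_BITMAP };

/*
 * Byte offset of a version bitmap, measured from the start of the
 * in-memory checkpoint buffer. 'bitmap_off' is the (hypothetical) offset
 * of the bitmap area inside the first checkpoint block.
 */
static size_t bitmap_offset(uint32_t cp_payload, int flag,
			    size_t bitmap_off, uint32_t sit_bytes)
{
	assert(flag == NAT_BITMAP || flag == SIT_BITMAP);

	if (cp_payload > 0) {
		/* Large volume: the NAT bitmap stays in block 0 and the
		 * SIT bitmap starts at the first payload block. */
		return flag == NAT_BITMAP ? bitmap_off : BLKSIZE;
	}
	/* Small volume: SIT bitmap first, NAT bitmap right after it. */
	return bitmap_off + (flag == NAT_BITMAP ? sit_bytes : 0);
}

/* First orphan block of a checkpoint pack that starts at 'cp_start'. */
static uint32_t orphan_start_blk(uint32_t cp_start, uint32_t cp_payload)
{
	return cp_start + 1 + cp_payload;
}
```

With cp_payload equal to zero both helpers reduce to the old behaviour, which
is why volumes formatted before this change keep working.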

-- 8 --

From 405367374f868a8cf29bef62c06bf53271b58f52 Mon Sep 17 00:00:00 2001
From: Changman Lee cm224@samsung.com
Date: Mon, 12 May 2014 22:03:46 +0900
Subject: [PATCH 2/2] fsck.f2fs: large volume support

This patch supports large volumes over about 3TB.

Signed-off-by: Changman Lee cm224@samsung.com
---
 fsck/f2fs.h   |   14 +++---
 fsck/fsck.c   |7 +--
 fsck/mount.c  |   22 --
 lib/libf2fs.c |4 ++--
 4 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/fsck/f2fs.h b/fsck/f2fs.h
index e1740fe..427a733 100644
--- a/fsck/f2fs.h
+++ b/fsck/f2fs.h
@@ -203,9 +203,17 @@ static inline unsigned long __bitmap_size(struct 
f2fs_sb_info *sbi, int flag)
 static inline void *__bitmap_ptr(struct f2fs_sb_info *sbi, int flag)
 {
struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
-   int offset = (flag == NAT_BITMAP) ?
-   le32_to_cpu(ckpt->sit_ver_bitmap_bytesize) : 0;
-   return &ckpt->sit_nat_version_bitmap + offset;
+   int offset;
+   if (le32_to_cpu(F2FS_RAW_SUPER(sbi)->cp_payload) > 0) {
+   if (flag == NAT_BITMAP)
+   return &ckpt->sit_nat_version_bitmap;
+   else
+   return ((char *)ckpt + F2FS_BLKSIZE);
+   } else {
+   offset = (flag == NAT_BITMAP) ?
+   le32_to_cpu(ckpt->sit_ver_bitmap_bytesize) : 0;
+   return &ckpt->sit_nat_version_bitmap + offset;
+   }
 }
 
 static inline bool is_set_ckpt_flags(struct f2fs_checkpoint *cp, unsigned int 
f)
diff --git a/fsck/fsck.c b/fsck/fsck.c
index 20582c9..a1d5dd0 100644
--- a/fsck/fsck.c
+++ b/fsck/fsck.c
@@ -653,11 +653,14 @@ int fsck_chk_orphan_node(struct f2fs_sb_info *sbi)
 
block_t start_blk, orphan_blkaddr, i, j;
struct f2fs_orphan_block *orphan_blk;
+   struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
 
-   if (!is_set_ckpt_flags(F2FS_CKPT(sbi), CP_ORPHAN_PRESENT_FLAG))
+   if (!is_set_ckpt_flags(ckpt, CP_ORPHAN_PRESENT_FLAG))
return 0;
 
-   start_blk = __start_cp_addr(sbi) + 1;
+   start_blk = __start_cp_addr(sbi) + 1 +
+   le32_to_cpu(F2FS_RAW_SUPER(sbi)->cp_payload);
+
orphan_blkaddr = __start_sum_addr(sbi) - 1;
 
orphan_blk = calloc(BLOCK_SZ, 1);
diff --git a/fsck/mount.c b/fsck/mount.c
index e2f3ace..24ef3bf 100644
--- a/fsck/mount.c
+++ b/fsck/mount.c
@@ -129,6 +129,7 @@ void print_raw_sb_info(struct f2fs_sb_info *sbi)
DISP_u32(sb, root_ino);
DISP_u32(sb, node_ino);
DISP_u32(sb, meta_ino);
+   DISP_u32(sb, cp_payload);
printf("\n");
 }
 
@@ -285,6 +286,7 @@ void *validate_checkpoint(struct f2fs_sb_info *sbi, block_t 
cp_addr, unsigned lo
/* Read the 2nd cp block in this CP pack */
cp_page_2 = malloc(PAGE_SIZE);
cp_addr += le32_to_cpu(cp_block->cp_pack_total_block_count) - 1;
+
if (dev_read_block(cp_page_2, cp_addr) < 0)
goto invalid_cp2;
 
@@ -295,7 +297,7 @@ void *validate_checkpoint(struct f2fs_sb_info *sbi, block_t 
cp_addr, unsigned lo
 
crc = *(unsigned int *)((unsigned char *)cp_block + crc_offset);
if (f2fs_crc_valid(crc, cp_block, crc_offset))
-   goto invalid_cp1;
+   goto invalid_cp2;
 
cur_version = le64_to_cpu(cp_block->checkpoint_ver);
 
@@ -319,8 +321,9 @@ int get_valid_checkpoint(struct f2fs_sb_info *sbi)
unsigned long blk_size = sbi->blocksize;
unsigned long long cp1_version = 0, cp2_version = 0;
unsigned long long cp_start_blk_no;
+   unsigned int cp_blks = 1 + le32_to_cpu(F2FS_RAW_SUPER(sbi)->cp_payload);
 
-   sbi->ckpt = malloc(blk_size);
+   sbi->ckpt = malloc(cp_blks * blk_size);
if (!sbi->ckpt)
return -ENOMEM;
/*
@@ -351,6 +354,20 @@ int get_valid_checkpoint(struct f2fs_sb_info *sbi)
 
memcpy(sbi->ckpt, cur_page, blk_size);
 
+   if (cp_blks > 1) {
+   int i;
+   unsigned long long cp_blk_no;
+
+   cp_blk_no = le32_to_cpu(raw_sb->cp_blkaddr);
+   if (cur_page == cp2)
+   cp_blk_no += 1 << le32_to_cpu(raw_sb->log_blocks_per_seg);
+   /* copy sit bitmap */
+   for (i = 1; i < cp_blks; i++) {
+   unsigned char *ckpt = (unsigned char *)sbi->ckpt;
+   dev_read_block(cur_page, cp_blk_no + i);
+   memcpy(ckpt + i * blk_size, cur_page, blk_size);
+   }
+   }
free(cp1);
free(cp2);
return 0;
@@ -697,6 +714,7 @@ void check_block_count(struct f2fs_sb_info *sbi,
int valid_blocks = 0;
int i;
 
+
/* check segment usage */
ASSERT(GET_SIT_VBLOCKS(raw_sit) <= sbi->blocks_per_seg);
 
diff --git a/lib

Re: [f2fs-dev] [PATCH v2] f2fs: avoid crash when trace f2fs_submit_page_mbio event in ra_sum_pages

2014-05-26 Thread Changman Lee
Hi, Chao

Could you consider the following?
Move the node_inode setup in front of build_segment_manager, then use
node_inode instead of bd_inode.

On Tue, May 27, 2014 at 08:41:07AM +0800, Chao Yu wrote:
 Previously we allocate pages with no mapping in ra_sum_pages(), so we may
 encounter a crash in event trace of f2fs_submit_page_mbio where we access
 mapping data of the page.
 
 We'd better allocate pages in bd_inode mapping and invalidate these pages 
 after
 we restore data from pages. It could avoid crash in above scenario.
 
 Changes from V1
  o remove redundant code in ra_sum_pages() suggested by Jaegeuk Kim.
 
 Call Trace:
  [f1031630] ? ftrace_raw_event_f2fs_write_checkpoint+0x80/0x80 [f2fs]
  [f10377bb] f2fs_submit_page_mbio+0x1cb/0x200 [f2fs]
  [f103c5da] restore_node_summary+0x13a/0x280 [f2fs]
  [f103e22d] build_curseg+0x2bd/0x620 [f2fs]
  [f104043b] build_segment_manager+0x1cb/0x920 [f2fs]
  [f1032c85] f2fs_fill_super+0x535/0x8e0 [f2fs]
  [c115b66a] mount_bdev+0x16a/0x1a0
  [f102f63f] f2fs_mount+0x1f/0x30 [f2fs]
  [c115c096] mount_fs+0x36/0x170
  [c1173635] vfs_kern_mount+0x55/0xe0
  [c1175388] do_mount+0x1e8/0x900
  [c1175d72] SyS_mount+0x82/0xc0
  [c16059cc] sysenter_do_call+0x12/0x22
 
 Suggested-by: Jaegeuk Kim jaegeuk@samsung.com
 Signed-off-by: Chao Yu chao2...@samsung.com
 ---
  fs/f2fs/node.c |   52 
  1 file changed, 24 insertions(+), 28 deletions(-)
 
 diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
 index 3d60d3d..02a59e9 100644
 --- a/fs/f2fs/node.c
 +++ b/fs/f2fs/node.c
 @@ -1658,35 +1658,29 @@ int recover_inode_page(struct f2fs_sb_info *sbi, 
 struct page *page)
  
  /*
   * ra_sum_pages() merge contiguous pages into one bio and submit.
 - * these pre-readed pages are linked in pages list.
 + * these pre-readed pages are alloced in bd_inode's mapping tree.
   */
 -static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
 +static int ra_sum_pages(struct f2fs_sb_info *sbi, struct page **pages,
   int start, int nrpages)
  {
 - struct page *page;
 - int page_idx = start;
 + struct inode *inode = sbi->sb->s_bdev->bd_inode;
 + struct address_space *mapping = inode->i_mapping;
 + int i, page_idx = start;
   struct f2fs_io_info fio = {
   .type = META,
   .rw = READ_SYNC | REQ_META | REQ_PRIO
   };
  
 - for (; page_idx < start + nrpages; page_idx++) {
 - /* alloc temporal page for read node summary info*/
 - page = alloc_page(GFP_F2FS_ZERO);
 - if (!page)
 + for (i = 0; page_idx < start + nrpages; page_idx++, i++) {
 + /* alloc page in bd_inode for reading node summary info */
 + pages[i] = grab_cache_page(mapping, page_idx);
 + if (!pages[i])
   break;
 -
 - lock_page(page);
 - page->index = page_idx;
 - list_add_tail(&page->lru, pages);
 + f2fs_submit_page_mbio(sbi, pages[i], page_idx, &fio);
   }
  
 - list_for_each_entry(page, pages, lru)
 - f2fs_submit_page_mbio(sbi, page, page->index, &fio);
 -
   f2fs_submit_merged_bio(sbi, META, READ);
 -
 - return page_idx - start;
 + return i;
  }
  
  int restore_node_summary(struct f2fs_sb_info *sbi,
 @@ -1694,11 +1688,11 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
  {
   struct f2fs_node *rn;
   struct f2fs_summary *sum_entry;
 - struct page *page, *tmp;
 + struct inode *inode = sbi->sb->s_bdev->bd_inode;
   block_t addr;
   int bio_blocks = MAX_BIO_BLOCKS(max_hw_blocks(sbi));
 - int i, last_offset, nrpages, err = 0;
 - LIST_HEAD(page_list);
 + struct page *pages[bio_blocks];
 + int i, idx, last_offset, nrpages, err = 0;
  
   /* scan the node segment */
   last_offset = sbi->blocks_per_seg;
 @@ -1709,29 +1703,31 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
   nrpages = min(last_offset - i, bio_blocks);
  
   /* read ahead node pages */
 - nrpages = ra_sum_pages(sbi, page_list, addr, nrpages);
 + nrpages = ra_sum_pages(sbi, pages, addr, nrpages);
   if (!nrpages)
   return -ENOMEM;
  
 - list_for_each_entry_safe(page, tmp, page_list, lru) {
 + for (idx = 0; idx < nrpages; idx++) {
   if (err)
   goto skip;
  
 - lock_page(page);
 - if (unlikely(!PageUptodate(page))) {
 + lock_page(pages[idx]);
 + if (unlikely(!PageUptodate(pages[idx]))) {
   err = -EIO;
   } else {
 - rn = F2FS_NODE(page);
 + rn = F2FS_NODE(pages[idx]);
   sum_entry->nid = rn->footer.nid;
   sum_entry->version = 0;

[f2fs-dev] [PATCH V3] f2fs: large volume support

2014-05-22 Thread Changman Lee

Changes from V2
 o fix conversion like le32_to_cpu
 o use is_set_ckpt_flags instead of bit operation
 o check return value after memory allocation

Changes from V1
 o fix orphan node blkaddr for large volume

Jaegeuk,
What is your opinion about the reallocation of sbi->ckpt? If you have any
ideas, let me know.
Thanks.
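
For reference, the block accounting that the patch below changes in
do_checkpoint() can be sketched in user space as follows. This is an
illustration, not the kernel code: the constants 1020 and 3 are my assumptions
for F2FS_ORPHANS_PER_BLOCK and NR_CURSEG_NODE_TYPE, the struct and function
names are hypothetical, and the non-umount total is inferred by symmetry
because the diff is cut off at that point.

```c
#include <assert.h>
#include <stdint.h>

#define BLKSIZE			4096u
#define ORPHANS_PER_BLOCK	1020u	/* assumed F2FS_ORPHANS_PER_BLOCK */
#define NR_CURSEG_NODE_TYPE	3u	/* assumed: hot, warm, cold node logs */

struct cp_layout {
	uint32_t start_sum;	/* offset of the first summary block */
	uint32_t total_blocks;	/* blocks in the whole checkpoint pack */
};

static uint32_t div_round_up(uint32_t n, uint32_t d)
{
	return (n + d - 1) / d;
}

static struct cp_layout cp_pack_layout(int large_vol, uint32_t sit_bitmap_bytes,
				       uint32_t n_orphans,
				       uint32_t data_sum_blocks, int is_umount)
{
	/* Extra SIT bitmap blocks exist only on large volumes. */
	uint32_t sit_blks = large_vol ?
			div_round_up(sit_bitmap_bytes, BLKSIZE) : 0;
	uint32_t orphan_blks = div_round_up(n_orphans, ORPHANS_PER_BLOCK);
	struct cp_layout l;

	l.start_sum = 1 + sit_blks + orphan_blks;
	/* 2 = the first and the last checkpoint block of the pack. */
	l.total_blocks = 2 + sit_blks + data_sum_blocks + orphan_blks;
	if (is_umount)
		l.total_blocks += NR_CURSEG_NODE_TYPE;
	return l;
}

/* Usage example: a small volume with no orphans and one summary block. */
static void layout_example(void)
{
	struct cp_layout l = cp_pack_layout(0, 0, 0, 1, 0);

	assert(l.start_sum == 1);
	assert(l.total_blocks == 3);
}
```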

-- 8 --

From 5a821fcec79fb9570a26104238b3c2391f6160ae Mon Sep 17 00:00:00 2001
From: Changman Lee cm224@samsung.com
Date: Mon, 12 May 2014 12:27:43 +0900
Subject: [PATCH] f2fs: large volume support

f2fs's cp has one page which consists of struct f2fs_checkpoint and
version bitmap of sit and nat. To support lots of segments, we need more
blocks for sit bitmap. So let's arrange sit bitmap as following:
+-----------------+------------+
| f2fs_checkpoint | sit bitmap |
| + nat bitmap    |            |
+-----------------+------------+
0                 4k           N blocks

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/checkpoint.c|   59
+++
 fs/f2fs/f2fs.h  |   13 +--
 include/linux/f2fs_fs.h |2 ++
 3 files changed, 68 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index fe968c7..cf2d1a7 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -366,12 +366,18 @@ static void recover_orphan_inode(struct
f2fs_sb_info *sbi, nid_t ino)
 void recover_orphan_inodes(struct f2fs_sb_info *sbi)
 {
block_t start_blk, orphan_blkaddr, i, j;
+   struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
 
if (!is_set_ckpt_flags(F2FS_CKPT(sbi), CP_ORPHAN_PRESENT_FLAG))
return;
 
sbi->por_doing = true;
-   start_blk = __start_cp_addr(sbi) + 1;
+
+   if (is_set_ckpt_flags(ckpt, CP_LARGE_VOL_FLAG))
+   start_blk = __start_cp_addr(sbi) + F2FS_BLK_ALIGN(
+   le32_to_cpu(ckpt->sit_ver_bitmap_bytesize));
+   else
+   start_blk = __start_cp_addr(sbi) + 1;
orphan_blkaddr = __start_sum_addr(sbi) - 1;
 
ra_meta_pages(sbi, start_blk, orphan_blkaddr, META_CP);
@@ -544,6 +550,35 @@ int get_valid_checkpoint(struct f2fs_sb_info *sbi)
cp_block = (struct f2fs_checkpoint *)page_address(cur_page);
memcpy(sbi->ckpt, cp_block, blk_size);
 
+   if (is_set_ckpt_flags(sbi->ckpt, CP_LARGE_VOL_FLAG)) {
+   int i, cp_blks;
+   block_t cp_blk_no;
+
+   cp_blk_no = le32_to_cpu(fsb->cp_blkaddr);
+   if (cur_page == cp2)
+   cp_blk_no += 1 << le32_to_cpu(fsb->log_blocks_per_seg);
+
+   cp_blks = 1 + F2FS_BLK_ALIGN(
+   le32_to_cpu(cp_block->sit_ver_bitmap_bytesize));
+
+   kfree(sbi->ckpt);
+   sbi->ckpt = kzalloc(cp_blks * blk_size, GFP_KERNEL);
+   if (!sbi->ckpt)
+   return -ENOMEM;
+
+   memcpy(sbi->ckpt, cp_block, blk_size);
+
+   for (i = 1; i < cp_blks; i++) {
+   void *sit_bitmap_ptr;
+   unsigned char *ckpt = (unsigned char *)sbi->ckpt;
+
+   cur_page = get_meta_page(sbi, cp_blk_no + i);
+   sit_bitmap_ptr = page_address(cur_page);
+   memcpy(ckpt + i * blk_size, sit_bitmap_ptr, blk_size);
+   f2fs_put_page(cur_page, 1);
+   }
+   }
+   }
+
f2fs_put_page(cp1, 1);
f2fs_put_page(cp2, 1);
return 0;
@@ -736,6 +771,7 @@ static void do_checkpoint(struct f2fs_sb_info *sbi,
bool is_umount)
__u32 crc32 = 0;
void *kaddr;
int i;
+   int sit_bitmap_blks = 0;
 
/*
 * This avoids to conduct wrong roll-forward operations and uses
@@ -786,16 +822,22 @@ static void do_checkpoint(struct f2fs_sb_info
*sbi, bool is_umount)
 
orphan_blocks = (sbi->n_orphans + F2FS_ORPHANS_PER_BLOCK - 1)
/ F2FS_ORPHANS_PER_BLOCK;
-   ckpt->cp_pack_start_sum = cpu_to_le32(1 + orphan_blocks);
+   if (is_set_ckpt_flags(ckpt, CP_LARGE_VOL_FLAG))
+   sit_bitmap_blks = F2FS_BLK_ALIGN(
+   le32_to_cpu(ckpt->sit_ver_bitmap_bytesize));
+   ckpt->cp_pack_start_sum = cpu_to_le32(1 + sit_bitmap_blks +
+   orphan_blocks);
 
if (is_umount) {
set_ckpt_flags(ckpt, CP_UMOUNT_FLAG);
ckpt->cp_pack_total_block_count = cpu_to_le32(2 +
-   data_sum_blocks + orphan_blocks + NR_CURSEG_NODE_TYPE);
+   sit_bitmap_blks + data_sum_blocks +
+   orphan_blocks + NR_CURSEG_NODE_TYPE);
} else {
clear_ckpt_flags(ckpt, CP_UMOUNT_FLAG);
ckpt->cp_pack_total_block_count = cpu_to_le32(2 +
-   data_sum_blocks + orphan_blocks);
+   sit_bitmap_blks + data_sum_blocks

Re: [f2fs-dev] [PATCH] f2fs: avoid crash when trace f2fs_submit_page_mbio event in ra_sum_pages

2014-05-22 Thread Changman Lee
On Wed, May 21, 2014 at 12:36:46PM +0900, Jaegeuk Kim wrote:
 Hi Chao,
 
 2014-05-16 (금), 17:14 +0800, Chao Yu:
  Previously we allocate pages with no mapping in ra_sum_pages(), so we may
  encounter a crash in event trace of f2fs_submit_page_mbio where we access
  mapping data of the page.
  
  We'd better allocate pages in bd_inode mapping and invalidate these pages 
  after
  we restore data from pages. It could avoid crash in above scenario.
  
  Call Trace:
   [f1031630] ? ftrace_raw_event_f2fs_write_checkpoint+0x80/0x80 [f2fs]
   [f10377bb] f2fs_submit_page_mbio+0x1cb/0x200 [f2fs]
   [f103c5da] restore_node_summary+0x13a/0x280 [f2fs]
   [f103e22d] build_curseg+0x2bd/0x620 [f2fs]
   [f104043b] build_segment_manager+0x1cb/0x920 [f2fs]
   [f1032c85] f2fs_fill_super+0x535/0x8e0 [f2fs]
   [c115b66a] mount_bdev+0x16a/0x1a0
   [f102f63f] f2fs_mount+0x1f/0x30 [f2fs]
   [c115c096] mount_fs+0x36/0x170
   [c1173635] vfs_kern_mount+0x55/0xe0
   [c1175388] do_mount+0x1e8/0x900
   [c1175d72] SyS_mount+0x82/0xc0
   [c16059cc] sysenter_do_call+0x12/0x22
  
  Signed-off-by: Chao Yu chao2...@samsung.com
  ---
   fs/f2fs/node.c |   49 -
   1 file changed, 28 insertions(+), 21 deletions(-)
  
  diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
  index 3d60d3d..b5cd814 100644
  --- a/fs/f2fs/node.c
  +++ b/fs/f2fs/node.c
  @@ -1658,13 +1658,16 @@ int recover_inode_page(struct f2fs_sb_info *sbi, 
  struct page *page)
   
   /*
* ra_sum_pages() merge contiguous pages into one bio and submit.
  - * these pre-readed pages are linked in pages list.
  + * these pre-readed pages are alloced in bd_inode's mapping tree.
*/
  -static int ra_sum_pages(struct f2fs_sb_info *sbi, struct list_head *pages,
  +static int ra_sum_pages(struct f2fs_sb_info *sbi, struct page **pages,
  int start, int nrpages)
   {
  struct page *page;
  +   struct inode *inode = sbi->sb->s_bdev->bd_inode;

How about using sbi->meta_inode instead of bd_inode? Then we can cache
summary pages for further I/O.

  +   struct address_space *mapping = inode->i_mapping;
  int page_idx = start;
  +   int alloced, readed;
  struct f2fs_io_info fio = {
  .type = META,
  .rw = READ_SYNC | REQ_META | REQ_PRIO
  @@ -1672,21 +1675,23 @@ static int ra_sum_pages(struct f2fs_sb_info *sbi, 
  struct list_head *pages,
   
  for (; page_idx < start + nrpages; page_idx++) {
  /* alloc temporal page for read node summary info*/
  -   page = alloc_page(GFP_F2FS_ZERO);
  +   page = grab_cache_page(mapping, page_idx);
  if (!page)
  break;
  -
  -   lock_page(page);
  -   page->index = page_idx;
  -   list_add_tail(&page->lru, pages);
  +   page_cache_release(page);
 
 IMO, we don't need to do it like this.
 Instead,
   for() {
   page = grab_cache_page();
   if (!page)
   break;
   pages[page_idx] = page;
   f2fs_submit_page_mbio(sbi, page, fio);
   }
   f2fs_submit_merged_bio(sbi, META, READ);
   return page_idx - start;
 
 Afterwards, in restore_node_summary(),
   lock_page() will wait for the end_io of the read.
   ...
   f2fs_put_page(pages[index], 1);
 
 Thanks,
 
  }
   
  -   list_for_each_entry(page, pages, lru)
  -   f2fs_submit_page_mbio(sbi, page, page->index, &fio);
  +   alloced = page_idx - start;
  +   readed = find_get_pages_contig(mapping, start, alloced, pages);
  +   BUG_ON(alloced != readed);
  +
  +   for (page_idx = 0; page_idx < readed; page_idx++)
  +   f2fs_submit_page_mbio(sbi, pages[page_idx],
  +   pages[page_idx]->index, &fio);
   
  f2fs_submit_merged_bio(sbi, META, READ);
   
  -   return page_idx - start;
  +   return readed;
   }
   
   int restore_node_summary(struct f2fs_sb_info *sbi,
  @@ -1694,11 +1699,11 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
   {
  struct f2fs_node *rn;
  struct f2fs_summary *sum_entry;
  -   struct page *page, *tmp;
  +   struct inode *inode = sbi->sb->s_bdev->bd_inode;
  block_t addr;
  int bio_blocks = MAX_BIO_BLOCKS(max_hw_blocks(sbi));
  -   int i, last_offset, nrpages, err = 0;
  -   LIST_HEAD(page_list);
  +   struct page *pages[bio_blocks];
  +   int i, index, last_offset, nrpages, err = 0;
   
  /* scan the node segment */
  last_offset = sbi->blocks_per_seg;
  @@ -1709,29 +1714,31 @@ int restore_node_summary(struct f2fs_sb_info *sbi,
  nrpages = min(last_offset - i, bio_blocks);
   
  /* read ahead node pages */
  -   nrpages = ra_sum_pages(sbi, &page_list, addr, nrpages);
  +   nrpages = ra_sum_pages(sbi, pages, addr, nrpages);
  if (!nrpages)
  return -ENOMEM;
   
  -   list_for_each_entry_safe(page, tmp, &page_list, lru) {
  +   for (index = 0; index < nrpages; 

Re: [f2fs-dev] [PATCH] f2fs: large volume support

2014-05-21 Thread Changman Lee
On Wed, 2014-05-21 at 13:33 +0900, Jaegeuk Kim wrote:
 Hi Changman,
 
 2014-05-12 (Mon), 15:59 +0900, Changman Lee:
  f2fs's cp has one page, which consists of struct f2fs_checkpoint and the
  version bitmaps of sit and nat. To support lots of segments, we need more
  blocks for the sit bitmap. So let's arrange the sit bitmap as follows:
  +-----------------+------------+
  | f2fs_checkpoint | sit bitmap |
  | + nat bitmap    |            |
  +-----------------+------------+
  0                 4k        N blocks
  
  Signed-off-by: Changman Lee cm224@samsung.com
  ---
   fs/f2fs/checkpoint.c|   47 
  ---
   fs/f2fs/f2fs.h  |   13 +++--
   include/linux/f2fs_fs.h |2 ++
   3 files changed, 57 insertions(+), 5 deletions(-)
  
  diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
  index fe968c7..f418243 100644
  --- a/fs/f2fs/checkpoint.c
  +++ b/fs/f2fs/checkpoint.c
  @@ -544,6 +544,32 @@ int get_valid_checkpoint(struct f2fs_sb_info *sbi)
  cp_block = (struct f2fs_checkpoint *)page_address(cur_page);
  memcpy(sbi->ckpt, cp_block, blk_size);
   
  +   if (is_set_ckpt_flags(sbi->ckpt, CP_LARGE_VOL_FLAG)) {
  +   int i, cp_blks;
  +   block_t cp_blk_no;
  +
  +   cp_blk_no = le32_to_cpu(fsb->cp_blkaddr);
  +   if (cur_page == cp2)
  +   cp_blk_no += 1 << le32_to_cpu(fsb->log_blocks_per_seg);
  +
  +   cp_blks = 1 + F2FS_BLK_ALIGN(cp_block->sit_ver_bitmap_bytesize);
 
 This should be converted with le32_to_cpu(cp_block->sit_ver_bitmap_bytesize).
 
Got it.
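
The conversion matters because the checkpoint fields are stored little-endian on disk, so reading sit_ver_bitmap_bytesize raw gives the wrong block count on a big-endian CPU. A minimal user-space sketch of what an le32_to_cpu-style helper does follows; le32_decode is a hypothetical name used only for illustration, not the kernel API.

```c
#include <stdint.h>

/*
 * Decode a 32-bit little-endian value from raw bytes.
 * Hypothetical helper: it gives the same result on little- and
 * big-endian hosts because it never reinterprets memory as an integer.
 */
static uint32_t le32_decode(const unsigned char *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```

For example, the on-disk bytes 00 20 00 00 decode to 8192, which is a two-block sit bitmap with 4KB blocks.
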
  +
  +   kfree(sbi->ckpt);
  +   sbi->ckpt = kzalloc(cp_blks * blk_size, GFP_KERNEL);
 
 Why does it have to reallocate this, and why doesn't it handle -ENOMEM correctly?

I think it's simpler than using another variable to point to
sit_ver_bitmap, and it doesn't require an alloc and free for that variable.
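
For reference, the -ENOMEM concern can be addressed while still reallocating in place. The following is a user-space sketch only, with calloc/free standing in for kzalloc/kfree; grow_zeroed is a hypothetical helper and not code from the patch. It allocates the larger buffer first, so the old one stays valid if the allocation fails.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Replace *bufp (old_size bytes) with a zeroed buffer of new_size bytes,
 * keeping the first min(old_size, new_size) bytes.
 * Returns 0 on success or -ENOMEM; on failure *bufp is left untouched.
 * Assumes new_size > 0.
 */
static int grow_zeroed(unsigned char **bufp, size_t old_size, size_t new_size)
{
	unsigned char *bigger = calloc(1, new_size);

	if (!bigger)
		return -ENOMEM;

	memcpy(bigger, *bufp, old_size < new_size ? old_size : new_size);
	free(*bufp);
	*bufp = bigger;
	return 0;
}
```

The caller checks the return value before touching the extra blocks, which is the step the quoted hunk skips.
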

 
  +
  +   memcpy(sbi->ckpt, cp_block, blk_size);
  +
  +   for (i = 1; i < cp_blks; i++) {
  +   void *sit_bitmap_ptr;
  +   unsigned char *ckpt = (unsigned char *)sbi->ckpt;
  +
  +   cur_page = get_meta_page(sbi, cp_blk_no + i);
  +   sit_bitmap_ptr = page_address(cur_page);
  +   memcpy(ckpt + i * blk_size, sit_bitmap_ptr, blk_size);
  +   f2fs_put_page(cur_page, 1);
  +   }
  +   }
  +
  f2fs_put_page(cp1, 1);
  f2fs_put_page(cp2, 1);
  return 0;
  @@ -736,6 +762,7 @@ static void do_checkpoint(struct f2fs_sb_info *sbi, 
  bool is_umount)
  __u32 crc32 = 0;
  void *kaddr;
  int i;
  +   int sit_bitmap_blks = 0;
   
  /*
   * This avoids to conduct wrong roll-forward operations and uses
  @@ -786,16 +813,21 @@ static void do_checkpoint(struct f2fs_sb_info *sbi, 
  bool is_umount)
   
  orphan_blocks = (sbi->n_orphans + F2FS_ORPHANS_PER_BLOCK - 1)
  / F2FS_ORPHANS_PER_BLOCK;
  -   ckpt->cp_pack_start_sum = cpu_to_le32(1 + orphan_blocks);
  +   if (is_set_ckpt_flags(ckpt, CP_LARGE_VOL_FLAG))
  +   sit_bitmap_blks = F2FS_BLK_ALIGN(ckpt->sit_ver_bitmap_bytesize);
  +   ckpt->cp_pack_start_sum = cpu_to_le32(1 + sit_bitmap_blks +
  +   orphan_blocks);
   
  if (is_umount) {
  set_ckpt_flags(ckpt, CP_UMOUNT_FLAG);
  ckpt->cp_pack_total_block_count = cpu_to_le32(2 +
  -   data_sum_blocks + orphan_blocks + NR_CURSEG_NODE_TYPE);
  +   sit_bitmap_blks + data_sum_blocks +
  +   orphan_blocks + NR_CURSEG_NODE_TYPE);
  } else {
  clear_ckpt_flags(ckpt, CP_UMOUNT_FLAG);
  ckpt->cp_pack_total_block_count = cpu_to_le32(2 +
  -   data_sum_blocks + orphan_blocks);
  +   sit_bitmap_blks + data_sum_blocks +
  +   orphan_blocks);
  }
   
  if (sbi->n_orphans)
  @@ -821,6 +853,15 @@ static void do_checkpoint(struct f2fs_sb_info *sbi, 
  bool is_umount)
  set_page_dirty(cp_page);
  f2fs_put_page(cp_page, 1);
   
  +   for (i = 1; i < 1 + sit_bitmap_blks; i++) {
  +   cp_page = grab_meta_page(sbi, start_blk++);
  +   kaddr = page_address(cp_page);
  +   memcpy(kaddr, (char *)ckpt + i * F2FS_BLKSIZE,
  +   (1 << sbi->log_blocksize));
  +   set_page_dirty(cp_page);
  +   f2fs_put_page(cp_page, 1);
  +   }
  +
  if (sbi->n_orphans) {
  write_orphan_inodes(sbi, start_blk);
  start_blk += orphan_blocks;
  diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
  index 676a2c6..9e147ae 100644
  --- a/fs/f2fs/f2fs.h
  +++ b/fs/f2fs/f2fs.h
  @@ -764,9 +764,18 @@ static inline unsigned long __bitmap_size(struct 
  f2fs_sb_info *sbi, int flag)
   static inline void *__bitmap_ptr(struct f2fs_sb_info *sbi, int flag)
   {
  struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
  -   int offset = (flag == NAT_BITMAP

[f2fs-dev] [PATCH] f2fs: large volume support

2014-05-18 Thread Changman Lee
f2fs's cp has one page, which consists of struct f2fs_checkpoint and the
version bitmaps of sit and nat. To support lots of segments, we need more
blocks for the sit bitmap. So let's arrange the sit bitmap as follows:
+-----------------+------------+
| f2fs_checkpoint | sit bitmap |
| + nat bitmap    |            |
+-----------------+------------+
0                 4k        N blocks

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/checkpoint.c|   55 +++
 fs/f2fs/f2fs.h  |   13 +--
 include/linux/f2fs_fs.h |2 ++
 3 files changed, 64 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index fe968c7..05e18f8 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -366,12 +366,18 @@ static void recover_orphan_inode(struct f2fs_sb_info 
*sbi, nid_t ino)
 void recover_orphan_inodes(struct f2fs_sb_info *sbi)
 {
block_t start_blk, orphan_blkaddr, i, j;
+   struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
 
if (!is_set_ckpt_flags(F2FS_CKPT(sbi), CP_ORPHAN_PRESENT_FLAG))
return;
 
sbi->por_doing = true;
-   start_blk = __start_cp_addr(sbi) + 1;
+
+   if (is_set_ckpt_flags(ckpt, CP_LARGE_VOL_FLAG))
+   start_blk = __start_cp_addr(sbi) +
+   F2FS_BLK_ALIGN(ckpt->sit_ver_bitmap_bytesize);
+   else
+   start_blk = __start_cp_addr(sbi) + 1;
orphan_blkaddr = __start_sum_addr(sbi) - 1;
 
ra_meta_pages(sbi, start_blk, orphan_blkaddr, META_CP);
@@ -544,6 +550,32 @@ int get_valid_checkpoint(struct f2fs_sb_info *sbi)
cp_block = (struct f2fs_checkpoint *)page_address(cur_page);
memcpy(sbi->ckpt, cp_block, blk_size);
 
+   if (is_set_ckpt_flags(sbi->ckpt, CP_LARGE_VOL_FLAG)) {
+   int i, cp_blks;
+   block_t cp_blk_no;
+
+   cp_blk_no = le32_to_cpu(fsb->cp_blkaddr);
+   if (cur_page == cp2)
+   cp_blk_no += 1 << le32_to_cpu(fsb->log_blocks_per_seg);
+
+   cp_blks = 1 + F2FS_BLK_ALIGN(cp_block->sit_ver_bitmap_bytesize);
+
+   kfree(sbi->ckpt);
+   sbi->ckpt = kzalloc(cp_blks * blk_size, GFP_KERNEL);
+
+   memcpy(sbi->ckpt, cp_block, blk_size);
+
+   for (i = 1; i < cp_blks; i++) {
+   void *sit_bitmap_ptr;
+   unsigned char *ckpt = (unsigned char *)sbi->ckpt;
+
+   cur_page = get_meta_page(sbi, cp_blk_no + i);
+   sit_bitmap_ptr = page_address(cur_page);
+   memcpy(ckpt + i * blk_size, sit_bitmap_ptr, blk_size);
+   f2fs_put_page(cur_page, 1);
+   }
+   }
+
f2fs_put_page(cp1, 1);
f2fs_put_page(cp2, 1);
return 0;
@@ -736,6 +768,7 @@ static void do_checkpoint(struct f2fs_sb_info *sbi, bool 
is_umount)
__u32 crc32 = 0;
void *kaddr;
int i;
+   int sit_bitmap_blks = 0;
 
/*
 * This avoids to conduct wrong roll-forward operations and uses
@@ -786,16 +819,21 @@ static void do_checkpoint(struct f2fs_sb_info *sbi, bool 
is_umount)
 
orphan_blocks = (sbi->n_orphans + F2FS_ORPHANS_PER_BLOCK - 1)
/ F2FS_ORPHANS_PER_BLOCK;
-   ckpt->cp_pack_start_sum = cpu_to_le32(1 + orphan_blocks);
+   if (is_set_ckpt_flags(ckpt, CP_LARGE_VOL_FLAG))
+   sit_bitmap_blks = F2FS_BLK_ALIGN(ckpt->sit_ver_bitmap_bytesize);
+   ckpt->cp_pack_start_sum = cpu_to_le32(1 + sit_bitmap_blks +
+   orphan_blocks);
 
if (is_umount) {
set_ckpt_flags(ckpt, CP_UMOUNT_FLAG);
ckpt->cp_pack_total_block_count = cpu_to_le32(2 +
-   data_sum_blocks + orphan_blocks + NR_CURSEG_NODE_TYPE);
+   sit_bitmap_blks + data_sum_blocks +
+   orphan_blocks + NR_CURSEG_NODE_TYPE);
} else {
clear_ckpt_flags(ckpt, CP_UMOUNT_FLAG);
ckpt->cp_pack_total_block_count = cpu_to_le32(2 +
-   data_sum_blocks + orphan_blocks);
+   sit_bitmap_blks + data_sum_blocks +
+   orphan_blocks);
}
 
if (sbi->n_orphans)
@@ -821,6 +859,15 @@ static void do_checkpoint(struct f2fs_sb_info *sbi, bool 
is_umount)
set_page_dirty(cp_page);
f2fs_put_page(cp_page, 1);
 
+   for (i = 1; i < 1 + sit_bitmap_blks; i++) {
+   cp_page = grab_meta_page(sbi, start_blk++);
+   kaddr = page_address(cp_page);
+   memcpy(kaddr, (char *)ckpt + i * F2FS_BLKSIZE,
+   (1 << sbi->log_blocksize));
+   set_page_dirty(cp_page);
+   f2fs_put_page(cp_page, 1);
+   }
+
if (sbi->n_orphans) {
write_orphan_inodes(sbi, start_blk

[f2fs-dev] [PATCH 2/2] fsck.f2fs: large volume support

2014-05-12 Thread Changman Lee
When the volume size is over 2.x TB, the checkpoint pack also grows
beyond 4KB. It then consists of f2fs_checkpoint and the nat bitmap in
one block, followed by n blocks of sit bitmap.

Signed-off-by: Changman Lee cm224@samsung.com
---
 fsck/f2fs.h   |   14 +++---
 fsck/mount.c  |   30 +-
 lib/libf2fs.c |4 ++--
 3 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/fsck/f2fs.h b/fsck/f2fs.h
index e1740fe..439ab8c 100644
--- a/fsck/f2fs.h
+++ b/fsck/f2fs.h
@@ -203,9 +203,17 @@ static inline unsigned long __bitmap_size(struct 
f2fs_sb_info *sbi, int flag)
 static inline void *__bitmap_ptr(struct f2fs_sb_info *sbi, int flag)
 {
struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
-   int offset = (flag == NAT_BITMAP) ?
-   le32_to_cpu(ckpt->sit_ver_bitmap_bytesize) : 0;
-   return &ckpt->sit_nat_version_bitmap + offset;
+   int offset;
+   if (ckpt->ckpt_flags & CP_LARGE_VOL_FLAG) {
+   if (flag == NAT_BITMAP)
+   return &ckpt->sit_nat_version_bitmap;
+   else
+   return ((char *)ckpt + F2FS_BLKSIZE);
+   } else {
+   offset = (flag == NAT_BITMAP) ?
+   le32_to_cpu(ckpt->sit_ver_bitmap_bytesize) : 0;
+   return &ckpt->sit_nat_version_bitmap + offset;
+   }
 }
 
 static inline bool is_set_ckpt_flags(struct f2fs_checkpoint *cp, unsigned int 
f)
diff --git a/fsck/mount.c b/fsck/mount.c
index e2f3ace..a12a6cf 100644
--- a/fsck/mount.c
+++ b/fsck/mount.c
@@ -265,6 +265,7 @@ void *validate_checkpoint(struct f2fs_sb_info *sbi, block_t 
cp_addr, unsigned lo
unsigned long long cur_version = 0, pre_version = 0;
unsigned int crc = 0;
size_t crc_offset;
+   unsigned int sit_bitmap_blks = 0;
 
/* Read the 1st cp block in this CP pack */
cp_page_1 = malloc(PAGE_SIZE);
@@ -284,7 +285,10 @@ void *validate_checkpoint(struct f2fs_sb_info *sbi, 
block_t cp_addr, unsigned lo
 
/* Read the 2nd cp block in this CP pack */
cp_page_2 = malloc(PAGE_SIZE);
+   if (cp_block->ckpt_flags & CP_LARGE_VOL_FLAG)
+   sit_bitmap_blks = F2FS_BLK_ALIGN(cp_block->sit_ver_bitmap_bytesize);
cp_addr += le32_to_cpu(cp_block->cp_pack_total_block_count) - 1;
+
if (dev_read_block(cp_page_2, cp_addr) < 0)
goto invalid_cp2;
 
@@ -295,7 +299,7 @@ void *validate_checkpoint(struct f2fs_sb_info *sbi, block_t 
cp_addr, unsigned lo
 
crc = *(unsigned int *)((unsigned char *)cp_block + crc_offset);
if (f2fs_crc_valid(crc, cp_block, crc_offset))
-   goto invalid_cp1;
+   goto invalid_cp2;
 
cur_version = le64_to_cpu(cp_block->checkpoint_ver);
 
@@ -351,6 +355,29 @@ int get_valid_checkpoint(struct f2fs_sb_info *sbi)
 
memcpy(sbi->ckpt, cur_page, blk_size);
 
+   if (sbi->ckpt->ckpt_flags & CP_LARGE_VOL_FLAG) {
+   int i, cp_blks;
+   unsigned long long cp_blk_no;
+
+   free(sbi->ckpt);
+
+   cp_blk_no = le32_to_cpu(raw_sb->cp_blkaddr);
+   if (cur_page == cp2)
+   cp_blk_no += 1 << le32_to_cpu(raw_sb->log_blocks_per_seg);
+
+   cp_blks = 1 + F2FS_BLK_ALIGN(sbi->ckpt->sit_ver_bitmap_bytesize);
+
+   /* allocate cp size */
+   sbi->ckpt = malloc(cp_blks * blk_size);
+   /* copy first cp data including nat bitmap */
+   memcpy(sbi->ckpt, cur_page, blk_size);
+   /* copy sit bitmap */
+   for (i = 1; i < cp_blks; i++) {
+   unsigned char *ckpt = (unsigned char *)sbi->ckpt;
+   dev_read_block(cur_page, cp_blk_no + i);
+   memcpy(ckpt + i * blk_size, cur_page, blk_size);
+   }
+   }
free(cp1);
free(cp2);
return 0;
@@ -697,6 +724,7 @@ void check_block_count(struct f2fs_sb_info *sbi,
int valid_blocks = 0;
int i;
 
+
/* check segment usage */
ASSERT(GET_SIT_VBLOCKS(raw_sit) <= sbi->blocks_per_seg);
 
diff --git a/lib/libf2fs.c b/lib/libf2fs.c
index fb3f8c1..1a16dd2 100644
--- a/lib/libf2fs.c
+++ b/lib/libf2fs.c
@@ -342,8 +342,8 @@ int f2fs_crc_valid(u_int32_t blk_crc, void *buf, int len)
cal_crc = f2fs_cal_crc32(F2FS_SUPER_MAGIC, buf, len);
 
if (cal_crc != blk_crc) {
-   DBG(0,"CRC validation failed: cal_crc = %u \
-   blk_crc = %u buff_size = 0x%x",
+   DBG(0,"CRC validation failed: cal_crc = %u, "
+   "blk_crc = %u buff_size = 0x%x\n",
cal_crc, blk_crc, len);
return -1;
}
-- 
1.7.9.5
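
As a rough illustration of the layout described in the commit message, the number of blocks at the head of a checkpoint pack can be computed as below. This is a sketch under stated assumptions: DEMO_BLKSIZE, blk_align and cp_head_blocks are hypothetical names, and blk_align assumes that F2FS_BLK_ALIGN rounds a byte count up to whole 4KB blocks.

```c
#include <stdint.h>

#define DEMO_BLKSIZE 4096u	/* assumed f2fs block size */

/* Round a byte count up to whole blocks (assumed meaning of F2FS_BLK_ALIGN). */
static uint32_t blk_align(uint32_t bytes)
{
	return (bytes + DEMO_BLKSIZE - 1) / DEMO_BLKSIZE;
}

/*
 * Blocks used by the checkpoint head: one block holding f2fs_checkpoint
 * plus the nat bitmap, then N sit bitmap blocks on large volumes only.
 */
static uint32_t cp_head_blocks(uint32_t sit_bitmap_bytes, int large_vol)
{
	return 1 + (large_vol ? blk_align(sit_bitmap_bytes) : 0);
}
```

With an 8192-byte sit bitmap this gives 3 blocks on a large volume and 1 block otherwise, which matches the "1 + F2FS_BLK_ALIGN(...)" expression used for cp_blks in the patch.
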



[f2fs-dev] [PATCH 2/2] f2fs: call set_dirty_dir_page if inode is directory.

2014-03-11 Thread Changman Lee
It's more legible and efficient to check whether inode->i_mode is a
directory in the caller and to call set_dirty_dir_page only in that case.

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/checkpoint.c |3 ---
 fs/f2fs/data.c   |3 ++-
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 1e03ca5..cc61962 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -578,9 +578,6 @@ void set_dirty_dir_page(struct inode *inode, struct page 
*page)
struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
struct dir_inode_entry *new;
 
-   if (!S_ISDIR(inode->i_mode))
-   return;
-
new = f2fs_kmem_cache_alloc(inode_entry_slab, GFP_NOFS);
new->inode = inode;
INIT_LIST_HEAD(&new->list);
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index acd0159..ecfa674 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1043,7 +1043,8 @@ static int f2fs_set_data_page_dirty(struct page *page)
 
if (!PageDirty(page)) {
__set_page_dirty_nobuffers(page);
-   set_dirty_dir_page(inode, page);
+   if (S_ISDIR(inode->i_mode))
+   set_dirty_dir_page(inode, page);
return 1;
}
return 0;
-- 
1.7.10.4




[f2fs-dev] [PATCH] f2fstat: add memory information used by f2fs

2014-02-05 Thread Changman Lee
This patch adds memory information used by f2fs.
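
The parsing step this patch adds can be sketched in isolation as follows. The sketch assumes that the relevant status lines carry a value of the form "<dirty count> in <page count>" after the colon and that a page is 4KB; parse_dirty_in_pages is a hypothetical helper, not a function from f2fstat.c. Unlike the hunk below, it checks that "in" was actually found before reading past it.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<dirty> in <npages>" and report the page count in KB,
 * assuming 4KB pages. Returns 0 on success and -1 on malformed input.
 */
static int parse_dirty_in_pages(const char *s, unsigned long *dirty,
				unsigned long *kb)
{
	char *end;
	const char *in;
	unsigned long npages;

	*dirty = strtoul(s, &end, 10);
	if (end == s)
		return -1;		/* no leading number */

	in = strstr(end, "in");
	if (!in)
		return -1;		/* separator missing */
	in += 2;

	npages = strtoul(in, &end, 10);
	if (end == in)
		return -1;		/* no page count after "in" */

	*kb = npages * 4;
	return 0;
}
```
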

Signed-off-by: Changman Lee cm224@samsung.com
---
 tools/f2fstat.c |   69 +++
 1 file changed, 44 insertions(+), 25 deletions(-)

diff --git a/tools/f2fstat.c b/tools/f2fstat.c
index b4f22ae..8ece660 100644
--- a/tools/f2fstat.c
+++ b/tools/f2fstat.c
@@ -16,6 +16,9 @@
  */
 #define F2FS_STATUS "/sys/kernel/debug/f2fs/status"
 
+#define KEY_NODE   0x0001
+#define KEY_META   0x0010
+
 unsigned long util;
 unsigned long used_node_blks;
 unsigned long used_data_blks;
@@ -33,9 +36,9 @@ unsigned long gc_node_blks;
 
 //unsigned long extent_hit_ratio;
 
-unsigned long dirty_node;
+unsigned long dirty_node, node_kb;
 unsigned long dirty_dents;
-unsigned long dirty_meta;
+unsigned long dirty_meta, meta_kb;
 unsigned long nat_caches;
 unsigned long dirty_sit;
 
@@ -43,7 +46,7 @@ unsigned long free_nids;
 
 unsigned long ssr_blks;
 unsigned long lfs_blks;
-
+unsigned long memory_kb;
 
 struct options {
int delay;
@@ -54,6 +57,7 @@ struct options {
 struct mm_table {
const char *name;
unsigned long *val;
+   int flag;
 };
 
 static int compare_mm_table(const void *a, const void *b)
@@ -84,21 +88,22 @@ void f2fstat(struct options *opt)
int found_cnt = 0;
 
static struct mm_table f2fstat_table[] = {
-   { "  - Data",    &used_data_blks },
-   { "  - Dirty",   &dirty_segs },
-   { "  - Free",    &free_segs },
-   { "  - NATs",    &nat_caches },
-   { "  - Node",    &used_node_blks },
-   { "  - Prefree", &prefree_segs },
-   { "  - SITs",    &dirty_sit },
-   { "  - Valid",   &valid_segs },
-   { "  - dents",   &dirty_dents },
-   { "  - meta",    &dirty_meta },
-   { "  - nodes",   &dirty_node },
-   { "GC calls",    &gc },
-   { "LFS",         &lfs_blks },
-   { "SSR",         &ssr_blks },
-   { "Utilization", &util },
+   { "  - Data",    &used_data_blks, 0 },
+   { "  - Dirty",   &dirty_segs,     0 },
+   { "  - Free",    &free_segs,      0 },
+   { "  - NATs",    &nat_caches,     0 },
+   { "  - Node",    &used_node_blks, 0 },
+   { "  - Prefree", &prefree_segs,   0 },
+   { "  - SITs",    &dirty_sit,      0 },
+   { "  - Valid",   &valid_segs,     0 },
+   { "  - dents",   &dirty_dents,    0 },
+   { "  - meta",    &dirty_meta,     KEY_META },
+   { "  - nodes",   &dirty_node,     KEY_NODE },
+   { "GC calls",    &gc,             0 },
+   { "LFS",         &lfs_blks,       0 },
+   { "Memory",      &memory_kb,      0 },
+   { "SSR",         &ssr_blks,       0 },
+   { "Utilization", &util,           0 },
};
 
f2fstat_table_cnt = sizeof(f2fstat_table)/sizeof(struct mm_table);
@@ -147,6 +152,20 @@ void f2fstat(struct options *opt)
goto nextline;
 
*(found->val) = strtoul(head, &tail, 10);
+   if (found->flag) {
+   int npages;
+   tail = strstr(head, "in");
+   head = tail + 2;
+   npages = strtoul(head, &tail, 10);
+   switch (found->flag & (KEY_NODE | KEY_META)) {
+   case KEY_NODE:
+   node_kb = npages * 4;
+   break;
+   case KEY_META:
+   meta_kb = npages * 4;
+   break;
+   }
+   }
if (++found_cnt == f2fstat_table_cnt)
break;
 nextline:
@@ -193,13 +212,13 @@ void parse_option(int argc, char *argv[], struct options 
*opt)
 
 void print_head(void)
 {
-   printf(---utilization--- ---main area ---balancing 
async-- -gc- ---alloc---\n);
-   printf(util  node   data   free  valid  dirty prefree node  dent meta 
sit   gcssrlfs\n);
+   fprintf(stderr, ---utilization--- ---main area 
---balancing async-- -gc- ---alloc--- -memory-\n);
+   fprintf(stderr, util  node   data   free  valid  dirty prefree node  
dent meta sit   gcssrlfs  total  node  meta\n);
 }
 
 int main(int argc, char *argv[])
 {
-   char format[] = %3ld %6ld %6ld %6ld %6ld %6ld %6ld %5ld %5ld %3ld %3ld 
%5ld %6ld %6ld\n;
+   char format[] = %3ld %6ld %6ld %6ld %6ld %6ld %6ld %5ld %5ld %3ld %3ld 
%5ld %6ld %6ld %6ld %6ld %6ld\n;
int

[f2fs-dev] [PATCH 2/2] fibmap.f2fs: add bdev information

2014-01-15 Thread Changman Lee
This patch shows devname and start_lba, counted from sector zero of the
disk. fibmap reports relative LBAs, but sometimes we want to know the
absolute LBA of a file so that we can compare it with blktrace output.

Signed-off-by: Changman Lee cm224@samsung.com
---
 tools/fibmap.c |   44 +++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/tools/fibmap.c b/tools/fibmap.c
index 9eb6b90..ed0a08e 100644
--- a/tools/fibmap.c
+++ b/tools/fibmap.c
@@ -7,6 +7,8 @@
 #include <errno.h>
 #include <sys/ioctl.h>
 #include <sys/stat.h>
+#include <libgen.h>
+#include <linux/hdreg.h>
 #include <linux/types.h>
 #include <linux/fs.h>
 
@@ -41,6 +43,42 @@ void print_stat(struct stat64 *st)
printf(\n\n);
 }
 
+void stat_bdev(struct stat64 *st, unsigned int *start_lba)
+{
+   struct stat bdev_stat;
+   struct hd_geometry geom;
+   char devname[32] = { 0, };
+   char linkname[32] = { 0, };
+   int fd;
+
+   sprintf(devname, "/dev/block/%d:%d", major(st->st_dev), minor(st->st_dev));
+
+   fd = open(devname, O_RDONLY);
+   if (fd < 0)
+   return;
+
+   if (fstat(fd, &bdev_stat) < 0)
+   goto out;
+
+   if (S_ISBLK(bdev_stat.st_mode)) {
+   if (ioctl(fd, HDIO_GETGEO, &geom) < 0)
+   *start_lba = 0;
+   else
+   *start_lba = geom.start;
+   }
+
+   if (readlink(devname, linkname, sizeof(linkname)) < 0)
+   goto out;
+
+   printf("bdev info---\n");
+   printf("devname = %s\n", basename(linkname));
+   printf("start_lba = %u\n", *start_lba);
+
+out:
+   close(fd);
+
+}
+
 int main(int argc, char *argv[])
 {
int fd;
@@ -50,6 +88,7 @@ int main(int argc, char *argv[])
int total_blks;
unsigned int i;
struct file_ext ext;
+   __u32 start_lba;
__u32 blknum;
 
if (argc != 2) {
@@ -73,9 +112,12 @@ int main(int argc, char *argv[])
goto out;
}
 
+   stat_bdev(&st, &start_lba);
+
total_blks = (st.st_size + st.st_blksize - 1) / st.st_blksize;
 
-   printf("\n%s :\n", filename);
+   printf("\nfile info---\n");
+   printf("%s :\n", filename);
print_stat(&st);
printf("file_pos   start_blk end_blkblks\n");
 
-- 
1.7.9.5
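
To relate the two numbers, a block number reported by FIBMAP can be turned into an absolute LBA roughly as follows. This is a sketch, assuming 512-byte sectors, a start_lba expressed in sectors as returned in the HDIO_GETGEO start field, and a filesystem block size that is a multiple of 512; abs_lba is a hypothetical helper that is not part of the patch.

```c
#include <stdint.h>

/*
 * Convert a partition-relative filesystem block number into an absolute
 * LBA in 512-byte sectors, given the partition's starting sector.
 */
static uint64_t abs_lba(uint64_t start_lba, uint64_t fs_blk, uint32_t blksize)
{
	return start_lba + fs_blk * (blksize / 512u);
}
```

For example, with start_lba = 2048 and 4096-byte blocks, block 10 of the partition sits at sector 2048 + 10 * 8 = 2128, which is the kind of value blktrace reports for the whole disk.
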




[f2fs-dev] [PATCH] f2fs-tools: add f2fstat to print f2fs's status in sec

2014-01-10 Thread Changman Lee
This tool prints /sys/kernel/debug/f2fs/status periodically, at an
interval given in seconds, so that we can monitor how the f2fs status
changes over time.
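
The lookup technique the tool relies on is a table of names sorted in strcmp order and searched with bsearch(). A self-contained sketch of that idea is shown below; demo_table, store_value and the three demo counters are hypothetical stand-ins, not the names used in f2fstat.c.

```c
#include <stdlib.h>
#include <string.h>

struct entry {
	const char *name;
	unsigned long *val;
};

static unsigned long demo_gc, demo_lfs, demo_ssr;

/* Must stay sorted in strcmp() order, otherwise bsearch() misses entries. */
static struct entry demo_table[] = {
	{ "GC calls", &demo_gc },
	{ "LFS",      &demo_lfs },
	{ "SSR",      &demo_ssr },
};

static int cmp_entry(const void *a, const void *b)
{
	return strcmp(((const struct entry *)a)->name,
		      ((const struct entry *)b)->name);
}

/* Parse one "name: value" line and store the value; 0 on success, -1 otherwise. */
static int store_value(const char *line)
{
	char name[32];
	const char *colon = strchr(line, ':');
	struct entry key, *found;
	size_t len;

	if (!colon)
		return -1;
	len = (size_t)(colon - line);
	if (len >= sizeof(name))
		return -1;		/* key too long for the buffer */
	memcpy(name, line, len);
	name[len] = '\0';

	key.name = name;
	key.val = NULL;
	found = bsearch(&key, demo_table,
			sizeof(demo_table) / sizeof(demo_table[0]),
			sizeof(demo_table[0]), cmp_entry);
	if (!found)
		return -1;

	*found->val = strtoul(colon + 1, NULL, 10);
	return 0;
}
```

Keeping the table sorted is the design constraint to watch: a new key has to be inserted at its strcmp position rather than appended.
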

Signed-off-by: Changman Lee cm224@samsung.com
---
 Makefile.am   |2 +-
 configure.ac  |1 +
 tools/Makefile.am |7 ++
 tools/f2fstat.c   |  216 +
 4 files changed, 225 insertions(+), 1 deletion(-)
 create mode 100644 tools/Makefile.am
 create mode 100644 tools/f2fstat.c

diff --git a/Makefile.am b/Makefile.am
index ca376b4..d2921d6 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -2,4 +2,4 @@
 
 ACLOCAL_AMFLAGS = -I m4
 
-SUBDIRS = man lib mkfs fsck
+SUBDIRS = man lib mkfs fsck tools
diff --git a/configure.ac b/configure.ac
index c5ca858..c2dafb0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -83,6 +83,7 @@ AC_CONFIG_FILES([
lib/Makefile
mkfs/Makefile
fsck/Makefile
+   tools/Makefile
 ])
 
 AC_OUTPUT
diff --git a/tools/Makefile.am b/tools/Makefile.am
new file mode 100644
index 000..8442387
--- /dev/null
+++ b/tools/Makefile.am
@@ -0,0 +1,7 @@
+## Makefile.am
+
+AM_CPPFLAGS = ${libuuid_CFLAGS} -I$(top_srcdir)/include
+AM_CFLAGS = -Wall
+sbin_PROGRAMS = f2fstat
+f2fstat_SOURCES = f2fstat.c
+f2fstat_LDADD = ${libuuid_LIBS} $(top_builddir)/lib/libf2fs.la
diff --git a/tools/f2fstat.c b/tools/f2fstat.c
new file mode 100644
index 000..75027a8
--- /dev/null
+++ b/tools/f2fstat.c
@@ -0,0 +1,216 @@
+#include <stdio.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <string.h>
+#include <fcntl.h>
+
+#ifdef DEBUG
+#define dbg(fmt, args...)  printf(fmt, __VA_ARGS__);
+#else
+#define dbg(fmt, args...)
+#endif
+
+/*
+ * f2fs status
+ */
+#define F2FS_STATUS "/sys/kernel/debug/f2fs/status"
+
+unsigned long util;
+unsigned long used_node_blks;
+unsigned long used_data_blks;
+//unsigned long inline_inode;
+
+unsigned long free_segs;
+unsigned long valid_segs;
+unsigned long dirty_segs;
+unsigned long prefree_segs;
+
+unsigned long gc;
+unsigned long bg_gc;
+unsigned long gc_data_blks;
+unsigned long gc_node_blks;
+
+//unsigned long extent_hit_ratio;
+
+unsigned long dirty_node;
+unsigned long dirty_dents;
+unsigned long dirty_meta;
+unsigned long nat_caches;
+unsigned long dirty_sit;
+
+unsigned long free_nids;
+
+unsigned long ssr_blks;
+unsigned long lfs_blks;
+
+
+struct options {
+   int delay;
+   int interval;
+};
+
+struct mm_table {
+   const char *name;
+   unsigned long *val;
+};
+
+static int compare_mm_table(const void *a, const void *b)
+{
+   dbg("[COMPARE] %s, %s\n", ((struct mm_table *)a)->name, ((struct mm_table *)b)->name);
+   return strcmp(((struct mm_table *)a)->name, ((struct mm_table *)b)->name);
+}
+
+static inline void remove_newline(char **head)
+{
+again:
+   if (**head == '\n') {
+   *head = *head + 1;
+   goto again;
+   }
+}
+
+void f2fstat(void)
+{
+   int fd;
+   int ret;
+   char keyname[32];
+   char buf[4096];
+   struct mm_table key = { keyname, NULL };
+   struct mm_table *found;
+   int f2fstat_table_cnt;
+   char *head, *tail;
+
+   static struct mm_table f2fstat_table[] = {
+   { "  - Data",    &used_data_blks },
+   { "  - Dirty",   &dirty_segs },
+   { "  - Free",    &free_segs },
+   { "  - NATs",    &nat_caches },
+   { "  - Node",    &used_node_blks },
+   { "  - Prefree", &prefree_segs },
+   { "  - SITs",    &dirty_sit },
+   { "  - Valid",   &valid_segs },
+   { "  - dents",   &dirty_dents },
+   { "  - meta",    &dirty_meta },
+   { "  - nodes",   &dirty_node },
+   { "GC calls",    &gc },
+   { "LFS",         &lfs_blks },
+   { "SSR",         &ssr_blks },
+   { "Utilization", &util },
+   };
+
+   f2fstat_table_cnt = sizeof(f2fstat_table)/sizeof(struct mm_table);
+
+   fd = open(F2FS_STATUS, O_RDONLY);
+   if (fd < 0) {
+   perror("open " F2FS_STATUS);
+   exit(EXIT_FAILURE);
+   }
+
+   ret = read(fd, buf, 4096);
+   if (ret < 0) {
+   perror("read " F2FS_STATUS);
+   exit(EXIT_FAILURE);
+   }
+   buf[ret] = '\0';
+
+   head = buf;
+   for (;;) {
+   remove_newline(&head);
+   tail = strchr(head, ':');
+   if (!tail)
+   break;
+   *tail = '\0';
+   if (strlen(head) >= sizeof(keyname)) {
+   dbg("[OVER] %s\n", head);
+   *tail = ':';
+   tail = strchr(head, '\n');
+   head = tail + 1;
+   continue;
+   }
+
+   strcpy(keyname, head);
+
+   found = bsearch(&key, f2fstat_table, f2fstat_table_cnt, sizeof(struct mm_table), compare_mm_table

[f2fs-dev] [PATCH] f2fs: unify rw and sync parameter into rw

2013-12-09 Thread Changman Lee
When we submit I/O, the rw flags already tell us whether it is a read or
a write and whether it is synchronous. So we can remove the redundant
sync parameter.
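
The idea of folding a separate bool into the flag word can be sketched as below. The flag values are hypothetical and chosen only for illustration; they are not the kernel's WRITE or WRITE_SYNC definitions, and pick_write_mode and is_sync_io are made-up helper names.

```c
#include <limits.h>

/* Hypothetical flag bits, for illustration only. */
enum {
	X_WRITE = 1 << 0,
	X_SYNC  = 1 << 4,
};
#define X_WRITE_SYNC (X_WRITE | X_SYNC)

/* The caller encodes "sync or not" directly in the rw word. */
static int pick_write_mode(long nr_to_write)
{
	return (nr_to_write == LONG_MAX) ? X_WRITE_SYNC : X_WRITE;
}

/* The callee recovers the information from rw, so no bool is needed. */
static int is_sync_io(int rw)
{
	return (rw & X_SYNC) != 0;
}
```

This mirrors the sync_meta_pages() hunk, where the former "nr_to_write == LONG_MAX" argument becomes a choice between WRITE_SYNC and WRITE.
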

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/checkpoint.c |   12 ++--
 fs/f2fs/data.c   |   19 ---
 fs/f2fs/f2fs.h   |2 +-
 fs/f2fs/gc.c |2 +-
 fs/f2fs/node.c   |   15 ---
 fs/f2fs/segment.c|   16 
 6 files changed, 32 insertions(+), 34 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 38f4a224..76b557c 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -158,8 +158,8 @@ long sync_meta_pages(struct f2fs_sb_info *sbi, enum 
page_type type,
}
 
if (nwritten)
-   f2fs_submit_merged_bio(sbi, type, nr_to_write == LONG_MAX,
-   WRITE);
+   f2fs_submit_merged_bio(sbi, type,
+   (nr_to_write == LONG_MAX) ? WRITE_SYNC : WRITE);
 
return nwritten;
 }
@@ -592,7 +592,7 @@ retry:
 * We should submit bio, since it exists several
 * wribacking dentry pages in the freeing inode.
 */
-   f2fs_submit_merged_bio(sbi, DATA, true, WRITE);
+   f2fs_submit_merged_bio(sbi, DATA, WRITE_SYNC);
}
goto retry;
 }
@@ -798,9 +798,9 @@ void write_checkpoint(struct f2fs_sb_info *sbi, bool 
is_umount)
 
trace_f2fs_write_checkpoint(sbi->sb, is_umount, "finish block_ops");
 
-   f2fs_submit_merged_bio(sbi, DATA, true, WRITE);
-   f2fs_submit_merged_bio(sbi, NODE, true, WRITE);
-   f2fs_submit_merged_bio(sbi, META, true, WRITE);
+   f2fs_submit_merged_bio(sbi, DATA, WRITE_SYNC);
+   f2fs_submit_merged_bio(sbi, NODE, WRITE_SYNC);
+   f2fs_submit_merged_bio(sbi, META, WRITE_SYNC);
 
/*
 * update checkpoint pack index
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 4e2fc09..470db6a 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -95,7 +95,7 @@ static void f2fs_write_end_io(struct bio *bio, int err)
 
 static void __submit_merged_bio(struct f2fs_sb_info *sbi,
struct f2fs_bio_info *io,
-   enum page_type type, bool sync, int rw)
+   enum page_type type, int rw)
 {
enum page_type btype = PAGE_TYPE_OF_BIO(type);
 
@@ -106,16 +106,12 @@ static void __submit_merged_bio(struct f2fs_sb_info *sbi,
rw |= REQ_META;
 
if (is_read_io(rw)) {
-   if (sync)
-   rw |= READ_SYNC;
submit_bio(rw, io->bio);
trace_f2fs_submit_read_bio(sbi->sb, rw, type, io->bio);
io->bio = NULL;
return;
}
 
-   if (sync)
-   rw |= WRITE_SYNC;
if (type >= META_FLUSH)
rw |= WRITE_FLUSH_FUA;
 
@@ -136,7 +132,7 @@ static void __submit_merged_bio(struct f2fs_sb_info *sbi,
 }
 
 void f2fs_submit_merged_bio(struct f2fs_sb_info *sbi,
-   enum page_type type, bool sync, int rw)
+   enum page_type type, int rw)
 {
enum page_type btype = PAGE_TYPE_OF_BIO(type);
struct f2fs_bio_info *io;
@@ -144,7 +140,7 @@ void f2fs_submit_merged_bio(struct f2fs_sb_info *sbi,
io = is_read_io(rw) ? &sbi->read_io : &sbi->write_io[btype];
 
mutex_lock(&io->io_mutex);
-   __submit_merged_bio(sbi, io, type, sync, rw);
+   __submit_merged_bio(sbi, io, type, rw);
mutex_unlock(&io->io_mutex);
 }
 
@@ -195,7 +191,7 @@ void f2fs_submit_page_mbio(struct f2fs_sb_info *sbi, struct 
page *page,
inc_page_count(sbi, F2FS_WRITEBACK);
 
if (io->bio && io->last_block_in_bio != blk_addr - 1)
-   __submit_merged_bio(sbi, io, type, true, rw);
+   __submit_merged_bio(sbi, io, type, rw);
 alloc_new:
if (io->bio == NULL) {
bio_blocks = MAX_BIO_BLOCKS(max_hw_blocks(sbi));
@@ -212,7 +208,7 @@ alloc_new:
 
if (bio_add_page(io->bio, page, PAGE_CACHE_SIZE, 0) <
PAGE_CACHE_SIZE) {
-   __submit_merged_bio(sbi, io, type, true, rw);
+   __submit_merged_bio(sbi, io, type, rw);
goto alloc_new;
}
 
@@ -733,7 +729,7 @@ write:
goto redirty_out;
 
if (wbc->for_reclaim)
-   f2fs_submit_merged_bio(sbi, DATA, true, WRITE);
+   f2fs_submit_merged_bio(sbi, DATA, WRITE_SYNC);
 
clear_cold_data(page);
 out:
@@ -785,7 +781,8 @@ static int f2fs_write_data_pages(struct address_space 
*mapping,
ret = write_cache_pages(mapping, wbc, __f2fs_writepage, mapping);
if (locked)
mutex_unlock(&sbi->writepages);
-   f2fs_submit_merged_bio(sbi, DATA, wbc->sync_mode == WB_SYNC_ALL, WRITE);
+   f2fs_submit_merged_bio(sbi

Re: [f2fs-dev] [PATCH] f2fs: introduce f2fs_find_next(_zero)_bit

2013-11-14 Thread Changman Lee
I agree. Your suggestion is better.
Thanks for your review.

On 2013-11-15 13:31, Jaegeuk Kim wrote:

 Hi,

 IMO, it would be better to give them names like __find_rev_next(_zero)_bit.
 If there is no objection, I'll modify and apply them myself.
 Thanks, :)

 2013-11-15 (Fri), 10:42 +0900, Changman Lee:
 f2fs_set_bit reverses the MSB and LSB order within each byte, so bitmaps
 set with it can be searched with f2fs_find_next_bit or
 f2fs_find_next_zero_bit.

 Signed-off-by: Changman Lee cm224@samsung.com
 ---
   fs/f2fs/segment.c |  143 +
   1 file changed, 143 insertions(+)

 diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
 index fa284d3..b2de887 100644
 --- a/fs/f2fs/segment.c
 +++ b/fs/f2fs/segment.c
 @@ -21,6 +21,149 @@
   #include <trace/events/f2fs.h>
   
   /*
 + * f2fs_ffs is copied from include/asm-generic/bitops/__ffs.h because
 + * MSB and LSB are reversed in a byte by f2fs_set_bit.
 + */
 +static inline unsigned long f2fs_ffs(unsigned long word)
 +{
 +	int num = 0;
 +
 +#if BITS_PER_LONG == 64
 +	if ((word & 0xffffffff) == 0) {
 +		num += 32;
 +		word >>= 32;
 +	}
 +#endif
 +	if ((word & 0xffff) == 0) {
 +		num += 16;
 +		word >>= 16;
 +	}
 +	if ((word & 0xff) == 0) {
 +		num += 8;
 +		word >>= 8;
 +	}
 +	if ((word & 0xf0) == 0)
 +		num += 4;
 +	else
 +		word >>= 4;
 +	if ((word & 0xc) == 0)
 +		num += 2;
 +	else
 +		word >>= 2;
 +	if ((word & 0x2) == 0)
 +		num += 1;
 +	return num;
 +}
 +
 +#define f2fs_ffz(x) f2fs_ffs(~(x))
 +
 +/*
 + * f2fs_find_next(_zero)_bit is copied from lib/find_next_bit.c because
 + * f2fs_set_bit makes MSB and LSB reversed in a byte.
 + * Example:
 + *                             LSB <--> MSB
 + *   f2fs_set_bit(0, bitmap) => 0000 0001
 + *   f2fs_set_bit(7, bitmap) => 1000 0000
 + */
 +static unsigned long f2fs_find_next_bit(const unsigned long *addr,
 +			unsigned long size, unsigned long offset)
 +{
 +	const unsigned long *p = addr + BIT_WORD(offset);
 +	unsigned long result = offset & ~(BITS_PER_LONG - 1);
 +	unsigned long tmp;
 +	unsigned long mask, submask;
 +	unsigned long quot, rest;
 +
 +	if (offset >= size)
 +		return size;
 +	size -= result;
 +	offset %= BITS_PER_LONG;
 +	if (!offset)
 +		goto aligned;
 +	tmp = *(p++);
 +	quot = (offset >> 3) << 3;
 +	rest = offset & 0x7;
 +	mask = ~0UL << quot;
 +	submask = (unsigned char)(0xff << rest) >> rest;
 +	submask <<= quot;
 +	mask &= submask;
 +	tmp &= mask;
 +	if (size < BITS_PER_LONG)
 +		goto found_first;
 +	if (tmp)
 +		goto found_middle;
 +	size -= BITS_PER_LONG;
 +	result += BITS_PER_LONG;
 +aligned:
 +	while (size & ~(BITS_PER_LONG-1)) {
 +		tmp = *(p++);
 +		if (tmp)
 +			goto found_middle;
 +		result += BITS_PER_LONG;
 +		size -= BITS_PER_LONG;
 +	}
 +	if (!size)
 +		return result;
 +	tmp = *p;
 +
 +found_first:
 +	tmp &= (~0UL >> (BITS_PER_LONG - size));
 +	if (tmp == 0UL)		/* Are any bits set? */
 +		return result + size;	/* Nope. */
 +found_middle:
 +	return result + f2fs_ffs(tmp);
 +}
 +
 +static unsigned long f2fs_find_next_zero_bit(const unsigned long *addr,
 +			unsigned long size, unsigned long offset)
 +{
 +	const unsigned long *p = addr + BIT_WORD(offset);
 +	unsigned long result = offset & ~(BITS_PER_LONG - 1);
 +	unsigned long tmp;
 +	unsigned long mask, submask;
 +	unsigned long quot, rest;
 +
 +	if (offset >= size)
 +		return size;
 +	size -= result;
 +	offset %= BITS_PER_LONG;
 +	if (!offset)
 +		goto aligned;
 +	tmp = *(p++);
 +	quot = (offset >> 3) << 3;
 +	rest = offset & 0x7;
 +	mask = ~(~0UL << quot);
 +	submask = (unsigned char)~((unsigned char)(0xff << rest) >> rest);
 +	submask <<= quot;
 +	mask += submask;
 +	tmp |= mask;
 +	if (size < BITS_PER_LONG)
 +		goto found_first;
 +	if (~tmp)
 +		goto found_middle;
 +	size -= BITS_PER_LONG;
 +	result += BITS_PER_LONG;
 +aligned:
 +	while (size & ~(BITS_PER_LONG - 1)) {
 +		tmp = *(p++);
 +		if (~tmp)
 +			goto found_middle;
 +		result += BITS_PER_LONG;
 +		size -= BITS_PER_LONG;
 +	}
 +	if (!size)
 +		return result;
 +	tmp = *p;
 +
 +found_first:
 +	tmp |= ~0UL << size;
 +	if (tmp == ~0UL)	/* Are any bits zero? */
 +		return result + size;	/* Nope. */
 +found_middle:
 +	return result + f2fs_ffz(tmp);
 +}
 +
 +/*
* This function balances dirty node and dentry pages.
* In addition, it controls garbage collection.
*/
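
The reversed bit order that the patch above works around can be illustrated
with a small userspace sketch. It assumes that f2fs_set_bit maps bit number
nr to the mask 1 << (7 - (nr & 7)) inside byte nr >> 3; the rev_* names are
hypothetical, and the searches are plain bit-by-bit reference loops, not the
word-at-a-time code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: set bit nr with MSB-first numbering inside each
 * byte, so bit 0 uses mask 0x80 and bit 7 uses mask 0x01. */
static void rev_set_bit(unsigned int nr, unsigned char *bitmap)
{
	bitmap[nr >> 3] |= (unsigned char)(1u << (7 - (nr & 7)));
}

/* Hypothetical helper: test bit nr using the same numbering. */
static int rev_test_bit(unsigned int nr, const unsigned char *bitmap)
{
	return (bitmap[nr >> 3] >> (7 - (nr & 7))) & 1;
}

/* Reference search: first set bit at or after offset, or size if none. */
static unsigned int rev_find_next_bit(const unsigned char *bitmap,
				      unsigned int size, unsigned int offset)
{
	assert(bitmap != NULL);
	for (; offset < size; offset++)
		if (rev_test_bit(offset, bitmap))
			return offset;
	return size;
}

/* Reference search: first clear bit at or after offset, or size if none. */
static unsigned int rev_find_next_zero_bit(const unsigned char *bitmap,
					   unsigned int size,
					   unsigned int offset)
{
	assert(bitmap != NULL);
	for (; offset < size; offset++)
		if (!rev_test_bit(offset, bitmap))
			return offset;
	return size;
}
```

A generic find_next_bit would number the bits of each byte from the other
end, which is why the patch carries its own copies of the search helpers.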



[f2fs-dev] [PATCH] f2fs: issue more large discard command

2013-11-07 Thread Changman Lee
When f2fs issues a discard command and the prefree segments are contiguous,
gather the adjacent segments and issue one larger discard for them.

** blktrace **
179,1    0     5859    42.619023770   971  C   D 131072 + 2097152 [0]
179,1    0    33665   108.840475468   971  C   D 2228224 + 2494464 [0]
179,1    0    33671   109.131616427   971  C   D 14909440 + 344064 [0]
179,1    0    33677   109.137100677   971  C   D 15261696 + 4096 [0]

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/segment.c |   40 ++--
 1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index b7186a3..09f1375 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -141,8 +141,12 @@ void clear_prefree_segments(struct f2fs_sb_info *sbi)
 	struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
 	unsigned int segno = -1;
 	unsigned int total_segs = TOTAL_SEGS(sbi);
+	bool init = true;
+	int count = 0;
+	int start_segno, prev_segno;
 
 	mutex_lock(&dirty_i->seglist_lock);
+
 	while (1) {
 		segno = find_next_bit(dirty_i->dirty_segmap[PRE], total_segs,
 				segno + 1);
@@ -152,15 +156,39 @@ void clear_prefree_segments(struct f2fs_sb_info *sbi)
 		if (test_and_clear_bit(segno, dirty_i->dirty_segmap[PRE]))
 			dirty_i->nr_dirty[PRE]--;
 
-		/* Let's use trim */
-		if (test_opt(sbi, DISCARD))
-			blkdev_issue_discard(sbi->sb->s_bdev,
-					START_BLOCK(sbi, segno) <<
+		if (init) {
+			init = false;
+			start_segno = segno;
+			prev_segno = segno;
+			count = 1;
+			continue;
+		}
+
+		if (segno == prev_segno + 1) {
+			count++;
+			prev_segno = segno;
+		} else {
+			if (test_opt(sbi, DISCARD))
+				blkdev_issue_discard(sbi->sb->s_bdev,
+					START_BLOCK(sbi, start_segno) <<
 					sbi->log_sectors_per_block,
-					1 << (sbi->log_sectors_per_block +
-						sbi->log_blocks_per_seg),
+					(1 << (sbi->log_sectors_per_block +
+					sbi->log_blocks_per_seg)) * count,
 					GFP_NOFS, 0);
+			start_segno = segno;
+			prev_segno = segno;
+			count = 1;
+		}
 	}
+
+	if (count && test_opt(sbi, DISCARD))
+		blkdev_issue_discard(sbi->sb->s_bdev,
+			START_BLOCK(sbi, start_segno) <<
+			sbi->log_sectors_per_block,
+			(1 << (sbi->log_sectors_per_block +
+			sbi->log_blocks_per_seg)) * count,
+			GFP_NOFS, 0);
+
 	mutex_unlock(&dirty_i->seglist_lock);
 }
 
-- 
1.7.9.5
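
The gathering step in the patch above can be sketched in userspace as
follows. This is only an illustration under assumptions: struct seg_run,
coalesce_segments and run_sectors are hypothetical names, the input is an
already sorted array rather than a bitmap walk, and the example figures
assume 4KB blocks (log_sectors_per_block = 3) and 512 blocks per segment
(log_blocks_per_seg = 9).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: one contiguous run of segments to discard at once. */
struct seg_run {
	unsigned int start;	/* first segment number in the run */
	unsigned int count;	/* number of contiguous segments */
};

/* Merge a sorted list of segment numbers into contiguous runs.
 * Returns the number of runs stored in out (at most max_runs). */
static size_t coalesce_segments(const unsigned int *segnos, size_t n,
				struct seg_run *out, size_t max_runs)
{
	size_t nr_runs = 0;
	size_t i;

	assert(out != NULL);
	for (i = 0; i < n; i++) {
		/* Extend the current run when this segment is adjacent. */
		if (nr_runs > 0 && segnos[i] ==
		    out[nr_runs - 1].start + out[nr_runs - 1].count) {
			out[nr_runs - 1].count++;
			continue;
		}
		if (nr_runs == max_runs)
			break;
		/* Otherwise start a new run at this segment. */
		out[nr_runs].start = segnos[i];
		out[nr_runs].count = 1;
		nr_runs++;
	}
	return nr_runs;
}

/* Sectors covered by a run, mirroring the length expression in the patch:
 * (1 << (log_sectors_per_block + log_blocks_per_seg)) * count. */
static unsigned long long run_sectors(const struct seg_run *run,
				      unsigned int log_sectors_per_block,
				      unsigned int log_blocks_per_seg)
{
	return (1ULL << (log_sectors_per_block + log_blocks_per_seg)) *
		run->count;
}
```

Under those assumptions one segment is 4096 sectors, so a run of 512
segments covers 2097152 sectors, which matches the length of the first
discard in the blktrace output, and the last entry (4096) is a single
segment.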




[f2fs-dev] [PATCH v2] f2fs: cleanup waiting routine for writeback pages in cp

2013-11-06 Thread Changman Lee
Use the general wait-queue mechanism provided by the kernel.

 o changes from v1
   If any waiter exists at end_io, wake it up.

Signed-off-by: Changman Lee cm224@samsung.com
---
 fs/f2fs/checkpoint.c |   25 -
 fs/f2fs/f2fs.h   |2 +-
 fs/f2fs/segment.c|5 +++--
 fs/f2fs/super.c  |1 +
 4 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index d430157..5716e5e 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -634,6 +634,21 @@ static void unblock_operations(struct f2fs_sb_info *sbi)
 	f2fs_unlock_all(sbi);
 }
 
+static void wait_on_all_pages_writeback(struct f2fs_sb_info *sbi)
+{
+	DEFINE_WAIT(wait);
+
+	for (;;) {
+		prepare_to_wait(&sbi->cp_wait, &wait, TASK_UNINTERRUPTIBLE);
+
+		if (!get_pages(sbi, F2FS_WRITEBACK))
+			break;
+
+		io_schedule();
+	}
+	finish_wait(&sbi->cp_wait, &wait);
+}
+
 static void do_checkpoint(struct f2fs_sb_info *sbi, bool is_umount)
 {
 	struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
@@ -743,15 +758,7 @@ static void do_checkpoint(struct f2fs_sb_info *sbi, bool is_umount)
 	f2fs_put_page(cp_page, 1);
 
 	/* wait for previous submitted node/meta pages writeback */
-	sbi->cp_task = current;
-	while (get_pages(sbi, F2FS_WRITEBACK)) {
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		if (!get_pages(sbi, F2FS_WRITEBACK))
-			break;
-		io_schedule();
-	}
-	__set_current_state(TASK_RUNNING);
-	sbi->cp_task = NULL;
+	wait_on_all_pages_writeback(sbi);
 
 	filemap_fdatawait_range(sbi->node_inode->i_mapping, 0, LONG_MAX);
 	filemap_fdatawait_range(sbi->meta_inode->i_mapping, 0, LONG_MAX);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 625eb4b..89dc750 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -372,7 +372,7 @@ struct f2fs_sb_info {
struct mutex writepages;/* mutex for writepages() */
bool por_doing; /* recovery is doing or not */
bool on_build_free_nids;/* build_free_nids is doing */
-   struct task_struct *cp_task;/* checkpoint task */
+   wait_queue_head_t cp_wait;
 
/* for orphan inode management */
struct list_head orphan_inode_list; /* orphan inode list */
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 3d4d5fc..74e81cb 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -592,8 +592,9 @@ static void f2fs_end_io_write(struct bio *bio, int err)
 	if (p->is_sync)
 		complete(p->wait);
 
-	if (!get_pages(p->sbi, F2FS_WRITEBACK) && p->sbi->cp_task)
-		wake_up_process(p->sbi->cp_task);
+	if (!get_pages(p->sbi, F2FS_WRITEBACK) &&
+			!list_empty(&p->sbi->cp_wait.task_list))
+		wake_up(&p->sbi->cp_wait);
 
 	kfree(p);
 	bio_put(bio);
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index e42351c..00e79df 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -876,6 +876,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 	spin_lock_init(&sbi->stat_lock);
 	init_rwsem(&sbi->bio_sem);
 	init_rwsem(&sbi->cp_rwsem);
+	init_waitqueue_head(&sbi->cp_wait);
 	init_sb_info(sbi);
 
 	/* get an inode for meta space */
-- 
1.7.10.4




Re: [f2fs-dev] [PATCH V2 RESEND] f2fs: check all ones or zeros bitmap with bitops for better mount performance

2013-10-29 Thread Changman Lee
Review attached patch, please.

-----Original Message-----
From: Chao Yu [mailto:chao2...@samsung.com] 
Sent: Tuesday, October 29, 2013 3:51 PM
To: jaegeuk@samsung.com
Cc: linux-fsde...@vger.kernel.org; linux-ker...@vger.kernel.org;
linux-f2fs-devel@lists.sourceforge.net
Subject: [f2fs-dev] [PATCH V2 RESEND] f2fs: check all ones or zeros bitmap
with bitops for better mount performance

Previously, check_block_count tested valid_map one bit at a time, even in the
common case where a SIT entry has an all-ones or all-zeros bitmap, which made
mount slow.
So let's check these special bitmaps with an integer data type instead of bit
by bit.

v1-->v2:
	use find_next_{zero_}bit_le for better performance and readability, as
Jaegeuk suggested.
	use neat logogram in comment as Gu Zheng suggested.
	search continuous ones or zeros for better performance when checking
mixed bitmap.

Suggested-by: Jaegeuk Kim jaegeuk@samsung.com
Signed-off-by: Shu Tan shu@samsung.com
Signed-off-by: Chao Yu chao2...@samsung.com
---
 fs/f2fs/segment.h |   19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index abe7094..a7abfa8 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -550,8 +550,9 @@ static inline void check_block_count(struct f2fs_sb_info *sbi,
 {
 	struct f2fs_sm_info *sm_info = SM_I(sbi);
 	unsigned int end_segno = sm_info->segment_count - 1;
+	bool is_valid  = test_bit_le(0, raw_sit->valid_map) ? true : false;
 	int valid_blocks = 0;
-	int i;
+	int cur_pos = 0, next_pos;
 
 	/* check segment usage */
 	BUG_ON(GET_SIT_VBLOCKS(raw_sit) > sbi->blocks_per_seg);
@@ -560,9 +561,19 @@ static inline void check_block_count(struct f2fs_sb_info *sbi,
 	BUG_ON(segno > end_segno);
 
 	/* check bitmap with valid block count */
-	for (i = 0; i < sbi->blocks_per_seg; i++)
-		if (f2fs_test_bit(i, raw_sit->valid_map))
-			valid_blocks++;
+	do {
+		if (is_valid) {
+			next_pos = find_next_zero_bit_le(&raw_sit->valid_map,
+					sbi->blocks_per_seg,
+					cur_pos);
+			valid_blocks += next_pos - cur_pos;
+		} else
+			next_pos = find_next_bit_le(&raw_sit->valid_map,
+					sbi->blocks_per_seg,
+					cur_pos);
+		cur_pos = next_pos;
+		is_valid = !is_valid;
+	} while (cur_pos < sbi->blocks_per_seg);
 	BUG_ON(GET_SIT_VBLOCKS(raw_sit) != valid_blocks);
 }
 
--
1.7.9.5
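
The run-based counting in the patch above can be tried out in userspace with
a sketch like the following. It is an illustration under assumptions, not
the kernel code: the helper names are hypothetical, bits are numbered
LSB-first within each byte (as the _le helpers do), and the searches are
simple loops rather than word-at-a-time scans.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: test bit nr, LSB-first within each byte. */
static int test_bit_simple(const unsigned char *map, unsigned int nr)
{
	return (map[nr >> 3] >> (nr & 7)) & 1;
}

/* First position at or after pos whose bit equals value, or size. */
static unsigned int next_bit_with_value(const unsigned char *map,
					unsigned int size, unsigned int pos,
					int value)
{
	while (pos < size && test_bit_simple(map, pos) != value)
		pos++;
	return pos;
}

/* Count set bits by walking alternating runs of ones and zeros. */
static unsigned int count_valid_by_runs(const unsigned char *map,
					unsigned int size)
{
	unsigned int cur_pos = 0, next_pos, valid = 0;
	int is_valid;

	assert(map != NULL);
	if (size == 0)
		return 0;

	is_valid = test_bit_simple(map, 0);
	do {
		/* Find where the current run ends. */
		next_pos = next_bit_with_value(map, size, cur_pos, !is_valid);
		if (is_valid)
			valid += next_pos - cur_pos;
		cur_pos = next_pos;
		is_valid = !is_valid;
	} while (cur_pos < size);

	return valid;
}
```

For an all-ones or all-zeros bitmap the loop body runs once, which is the
case the commit message says is common at mount time.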





0001-f2fs-use-pre-calculated-value-to-get-sum-of-valid-bl.patch
Description: Binary data


Re: [f2fs-dev] [PATCH V2 RESEND] f2fs: check all ones or zeros bitmap with bitops for better mount performance

2013-10-29 Thread Changman Lee

As you know, if data or a function is used only once, we can mark it with
keywords such as __initdata for data and __init for functions.


-----Original Message-----
From: Chao Yu [mailto:chao2...@samsung.com] 
Sent: Tuesday, October 29, 2013 7:52 PM
To: 'Changman Lee'; jaegeuk@samsung.com
Cc: linux-fsde...@vger.kernel.org; linux-ker...@vger.kernel.org;
linux-f2fs-devel@lists.sourceforge.net
Subject: RE: [f2fs-dev] [PATCH V2 RESEND] f2fs: check all ones or zeros
bitmap with bitops for better mount performance

Hi Lee,

 -----Original Message-----
 From: Changman Lee [mailto:cm224@samsung.com]
 Sent: Tuesday, October 29, 2013 3:36 PM
 To: 'Chao Yu'; jaegeuk@samsung.com
 Cc: linux-fsde...@vger.kernel.org; linux-ker...@vger.kernel.org;
 linux-f2fs-devel@lists.sourceforge.net
 Subject: RE: [f2fs-dev] [PATCH V2 RESEND] f2fs: check all ones or zeros
 bitmap with bitops for better mount performance
 
 Review attached patch, please.

Could we hide the pre-calculated value by generating it in memory allocated
by a function? The value is of no use after build_sit_entries().

Regards
Yu

 




[f2fs-dev] [PATCH] f2fs-tools: discard is default but not set in config

2013-08-30 Thread Changman Lee
Flash devices support discard, so discard is meant to be the default, but the
default was not set in the configuration.

Signed-off-by: Changman Lee cm224@samsung.com
---
 lib/libf2fs.c  |1 +
 mkfs/f2fs_format.c |1 +
 2 files changed, 2 insertions(+)

diff --git a/lib/libf2fs.c b/lib/libf2fs.c
index 6947425..9046986 100644
--- a/lib/libf2fs.c
+++ b/lib/libf2fs.c
@@ -364,6 +364,7 @@ void f2fs_init_configuration(struct f2fs_configuration *c)
 	c->heap = 1;
 	c->vol_label = "";
 	c->device_name = NULL;
+	c->trim = 1;
 }
 
 static int is_mounted(const char *mpt, const char *device)
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 5b017c7..364bb46 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -917,6 +917,7 @@ int f2fs_trim_device()
 		return -1;
 	}
 
+	MSG(0, "Info: Discarding device\n");
 	if (S_ISREG(stat_buf.st_mode))
 		return 0;
 	else if (S_ISBLK(stat_buf.st_mode)) {
-- 
1.7.9.5




[f2fs-dev] [PATCH] f2fs-tools: add stat information into fibmap

2013-08-07 Thread Changman Lee
This patch shows stat information about a file with fragmented state.

Signed-off-by: Changman Lee cm224@samsung.com
---
 fsck/fibmap.c |   16 
 1 file changed, 16 insertions(+)

diff --git a/fsck/fibmap.c b/fsck/fibmap.c
index 8726d3d..0ced7ca 100644
--- a/fsck/fibmap.c
+++ b/fsck/fibmap.c
@@ -26,6 +26,21 @@ void print_ext(struct file_ext *ext)
 					ext->end_blk, ext->blk_count);
 }
 
+void print_stat(struct stat64 *st)
+{
+	printf("\n");
+	printf("dev   [%d:%d]\n", major(st->st_dev), minor(st->st_dev));
+	printf("ino   [0x%8lx : %ld]\n", st->st_ino, st->st_ino);
+	printf("mode  [0x%8x : %d]\n", st->st_mode, st->st_mode);
+	printf("nlink [0x%8lx : %ld]\n", st->st_nlink, st->st_nlink);
+	printf("uid   [0x%8x : %d]\n", st->st_uid, st->st_uid);
+	printf("gid   [0x%8x : %d]\n", st->st_gid, st->st_gid);
+	printf("size  [0x%8lx : %ld]\n", st->st_size, st->st_size);
+	printf("blksize   [0x%8lx : %ld]\n", st->st_blksize, st->st_blksize);
+	printf("blocks[0x%8lx : %ld]\n", st->st_blocks, st->st_blocks);
+	printf("\n\n");
+}
+
 int main(int argc, char *argv[])
 {
 	int fd;
@@ -61,6 +76,7 @@ int main(int argc, char *argv[])
 	total_blks = (st.st_size + st.st_blksize - 1) / st.st_blksize;
 
 	printf("\n%s :\n", filename);
+	print_stat(&st);
 	printf("file_pos   start_blk end_blkblks\n");
 
 	blknum = 0;
-- 
1.7.10.4




Re: [f2fs-dev] [PATCH 1/4] f2fs: reorganize the f2fs_setattr() function.

2013-06-23 Thread Changman Lee
On Jun 21, 2013, at 4:30 PM, Namjae Jeon linkinj...@gmail.com wrote:

 
  Sorry for the late reply. I was very busy.
 
  Could you tell me what you will do if xattr and i_mode end up
  different?
 First of all, I want to know which case causes mismatched permissions
 between xattr and i_mode.
 Also, when we call chmod, the inode is locked in sys_chmod. If so, can
 inode->i_mode be changed by another inode update during chmod even
 though the inode is locked?


update_inode updates the raw inode on disk from inode->i_mode.
As you know, a dirtied inode page is written back to disk at an unpredictable
time, depending on the dirty ratio or expiry time. If you modify inode->i_mode
immediately, the inode could be written back earlier than the xattr. So I think
inode->i_mode and the xattr might differ when, for example, an SPO (sudden
power-off) occurs.

  The purpose of i_acl_mode is to update i_mode and xattr together in the
  same lock region.
 Could you please tell me which lock region you mean? (inode->i_mutex or
 mutex_lock_op(sbi))

 Thanks.

I meant the latter.

 
 
  
  

   Subject: [PATCH v2] f2fs: reorganize the f2fs_setattr(),
f2fs_set_acl,
   f2fs_setxattr()
   From: Namjae Jeon namjae.j...@samsung.com
  

