Re: [Ocfs2-devel] [PATCH] make ocfs2 do not log ENOENT error

2014-01-24 Thread Andrew Morton
On Fri, 24 Jan 2014 14:11:09 +0800 xiaowei...@oracle.com wrote:

 From: Xiaowei.Hu xiaowei...@oracle.com
 
 suppress log message like this:
 (open_delete,8328,0):ocfs2_unlink:951 ERROR: status = -2
 
 Orabug:17445485
 ---
  fs/ocfs2/namei.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
 index 0ba1cf0..ca6c5ba 100644
 --- a/fs/ocfs2/namei.c
 +++ b/fs/ocfs2/namei.c
 @@ -947,7 +947,7 @@ leave:
   ocfs2_free_dir_lookup_result(orphan_insert);
   ocfs2_free_dir_lookup_result(lookup);
  
 - if (status  (status != -ENOTEMPTY))
 + if (status  (status != -ENOTEMPTY)  (status != -ENOENT))
   mlog_errno(status);
  
   return status;

Please send a signed-off-by: for this patch.

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


[Ocfs2-devel] stalled ocfs2 patches

2014-01-24 Thread Andrew Morton

Guys, I'm developing an increasingly large backlog of ocfs2 patches
which I am unprepared to send upstream.  Some I feel need review which
is beyond my understanding and some were reviewed, with an unclear
outcome.

Can we please take another look at all of these and try to get this all
cleared away?

I'll send all the patches now.  My notes are as follows:

ocfs2-fix-ocfs2_sync_file-if-filesystem-is-readonly.patch

  Needs acks

ocfs2-free-allocated-clusters-if-error-occurs-after-ocfs2_claim_clusters.patch

  Needs acks

ocfs2-o2net-incorrect-to-terminate-accepting-connections-loop-upon-rejecting-an-invalid-one.patch

  Needs acks

ocfs2-fix-a-tiny-race-when-running-dirop_fileop_racer.patch

  Needs acks

ocfs2-code-cleanup-remove-unused-functions.patch

  Might have been nacked?

ocfs2-o2net-o2net_listen_data_ready-should-do-nothing-if-socket-state-is-not-tcp_listen.patch

  Needs acks

ocfs2-check-existence-of-old-dentry-in-ocfs2_link.patch

  Needs acks

ocfs2-should-call-ocfs2_journal_access_di-before-ocfs2_delete_entry-in-ocfs2_orphan_del.patch

  Needs more review.  Sunil might have nacked this, but it was foggy.

ocfs2-llseek-requires-ocfs2-inode-lock-for-the-file-in-seek_end.patch

  Sunil was worried about performance, Joel had a question, Mark
  wanted an update, Joel acked it.

ocfs2-fix-issue-that-ocfs2_setattr-does-not-deal-with-new_i_size==i_size.patch

  To be updated in response to comment from Jeff Liu

ocfs2-update-inode-size-after-zeronig-the-hole.patch

  Needs acks


___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


[Ocfs2-devel] [patch 01/11] ocfs2: fix ocfs2_sync_file() if filesystem is readonly

2014-01-24 Thread akpm
From: Younger Liu younger.li...@gmail.com
Subject: ocfs2: fix ocfs2_sync_file() if filesystem is readonly

If filesystem is readonly, there is no need to flush drive's caches or
force any uncommitted transactions.

Signed-off-by: Younger Liu younger.li...@gmail.com
Cc: Joel Becker jl...@evilplan.org
Cc: Mark Fasheh mfas...@suse.com
Signed-off-by: Andrew Morton a...@linux-foundation.org
---

 fs/ocfs2/file.c |3 +++
 1 file changed, 3 insertions(+)

diff -puN fs/ocfs2/file.c~ocfs2-fix-ocfs2_sync_file-if-filesystem-is-readonly 
fs/ocfs2/file.c
--- a/fs/ocfs2/file.c~ocfs2-fix-ocfs2_sync_file-if-filesystem-is-readonly
+++ a/fs/ocfs2/file.c
@@ -185,6 +185,9 @@ static int ocfs2_sync_file(struct file *
  file-f_path.dentry-d_name.name,
  (unsigned long long)datasync);
 
+   if (ocfs2_is_hard_readonly(osb) || ocfs2_is_soft_readonly(osb))
+   return 0;
+
err = filemap_write_and_wait_range(inode-i_mapping, start, end);
if (err)
return err;
_

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


[Ocfs2-devel] [patch 03/11] ocfs2/o2net: incorrect to terminate accepting connections loop upon rejecting an invalid one

2014-01-24 Thread akpm
From: Tariq Saeed tariq.x.sa...@oracle.com
Subject: ocfs2/o2net: incorrect to terminate accepting connections loop upon 
rejecting an invalid one

When o2net-accept-one() rejects an illegal connection, it terminates the
loop picking up the remaining queued connections.  This fix will continue
accepting connections till the queue is emtpy.

Addresses Orabug 17489469.

Signed-off-by: Tariq Saseed tariq.x.sa...@oracle.com
Cc: Mark Fasheh mfas...@suse.com
Cc: Joel Becker jl...@evilplan.org
Signed-off-by: Andrew Morton a...@linux-foundation.org
---

 fs/ocfs2/cluster/tcp.c |   13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff -puN 
fs/ocfs2/cluster/tcp.c~ocfs2-o2net-incorrect-to-terminate-accepting-connections-loop-upon-rejecting-an-invalid-one
 fs/ocfs2/cluster/tcp.c
--- 
a/fs/ocfs2/cluster/tcp.c~ocfs2-o2net-incorrect-to-terminate-accepting-connections-loop-upon-rejecting-an-invalid-one
+++ a/fs/ocfs2/cluster/tcp.c
@@ -1826,7 +1826,7 @@ int o2net_register_hb_callbacks(void)
 
 /*  */
 
-static int o2net_accept_one(struct socket *sock)
+static int o2net_accept_one(struct socket *sock, int *more)
 {
int ret, slen;
struct sockaddr_in sin;
@@ -1837,6 +1837,7 @@ static int o2net_accept_one(struct socke
struct o2net_node *nn;
 
BUG_ON(sock == NULL);
+   *more = 0;
ret = sock_create_lite(sock-sk-sk_family, sock-sk-sk_type,
   sock-sk-sk_protocol, new_sock);
if (ret)
@@ -1848,6 +1849,7 @@ static int o2net_accept_one(struct socke
if (ret  0)
goto out;
 
+   *more = 1;
new_sock-sk-sk_allocation = GFP_ATOMIC;
 
ret = o2net_set_nodelay(new_sock);
@@ -1949,8 +1951,15 @@ out:
 static void o2net_accept_many(struct work_struct *work)
 {
struct socket *sock = o2net_listen_sock;
-   while (o2net_accept_one(sock) == 0)
+   int more;
+   int err;
+
+   for (;;) {
+   err = o2net_accept_one(sock, more);
+   if (!more)
+   break;
cond_resched();
+   }
 }
 
 static void o2net_listen_data_ready(struct sock *sk, int bytes)
_

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


[Ocfs2-devel] [patch 08/11] ocfs2: should call ocfs2_journal_access_di() before ocfs2_delete_entry() in ocfs2_orphan_del()

2014-01-24 Thread akpm
From: Younger Liu younger@huawei.com
Subject: ocfs2: should call ocfs2_journal_access_di() before 
ocfs2_delete_entry() in ocfs2_orphan_del()

While deleting a file into orphan dir in ocfs2_orphan_del(), it calls
ocfs2_delete_entry() before ocfs2_journal_access_di().  If
ocfs2_delete_entry() succeeded and ocfs2_journal_access_di() failed, there
would be a inconsistency: the file is deleted from orphan dir, but orphan
dir dinode is not updated.

So we need to call ocfs2_journal_access_di() before ocfs2_orphan_del().

Signed-off-by: Younger Liu younger@huawei.com
Reviewed-by: Jensen shencanq...@huawei.com
Cc: Jie Liu jeff@oracle.com
Cc: Joel Becker jl...@evilplan.org
Cc: Mark Fasheh mfas...@suse.com
Signed-off-by: Andrew Morton a...@linux-foundation.org
---

 fs/ocfs2/namei.c |   12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff -puN 
fs/ocfs2/namei.c~ocfs2-should-call-ocfs2_journal_access_di-before-ocfs2_delete_entry-in-ocfs2_orphan_del
 fs/ocfs2/namei.c
--- 
a/fs/ocfs2/namei.c~ocfs2-should-call-ocfs2_journal_access_di-before-ocfs2_delete_entry-in-ocfs2_orphan_del
+++ a/fs/ocfs2/namei.c
@@ -2207,17 +2207,17 @@ int ocfs2_orphan_del(struct ocfs2_super
goto leave;
}
 
-   /* remove it from the orphan directory */
-   status = ocfs2_delete_entry(handle, orphan_dir_inode, lookup);
+   status = ocfs2_journal_access_di(handle,
+INODE_CACHE(orphan_dir_inode),
+orphan_dir_bh,
+OCFS2_JOURNAL_ACCESS_WRITE);
if (status  0) {
mlog_errno(status);
goto leave;
}
 
-   status = ocfs2_journal_access_di(handle,
-INODE_CACHE(orphan_dir_inode),
-orphan_dir_bh,
-OCFS2_JOURNAL_ACCESS_WRITE);
+   /* remove it from the orphan directory */
+   status = ocfs2_delete_entry(handle, orphan_dir_inode, lookup);
if (status  0) {
mlog_errno(status);
goto leave;
_

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


[Ocfs2-devel] [patch 04/11] ocfs2: fix a tiny race when running dirop_fileop_racer

2014-01-24 Thread akpm
From: Yiwen Jiang jiangyi...@huawei.com
Subject: ocfs2: fix a tiny race when running dirop_fileop_racer

When running dirop_fileop_racer we found a dead lock case.

2 nodes, say Node A and Node B, mount the same ocfs2 volume.  Create
/race/16/1 in the filesystem, and let the inode number of dir 16 is less
than the inode number of dir race.

Node ANode B
mv /race/16/1 /race/
  right after Node A has got the
  EX mode of /race/16/, and tries to
  get EX mode of /race
  ls /race/16/

In this case, Node A has got the EX mode of /race/16/, and wants to get EX
mode of /race/.  Node B has got the PR mode of /race/, and wants to get
the PR mode of /race/16/.  Since EX and PR are mutually exclusive, dead
lock happens.

This patch fixes this case by locking in ancestor order before trying
inode number order.

Signed-off-by: Yiwen Jiang jiangyi...@huawei.com
Signed-off-by: Joseph Qi joseph...@huawei.com
Cc: Joel Becker jl...@evilplan.org
Cc: Mark Fasheh mfas...@suse.com
Signed-off-by: Andrew Morton a...@linux-foundation.org
---

 fs/ocfs2/namei.c |   97 -
 1 file changed, 95 insertions(+), 2 deletions(-)

diff -puN 
fs/ocfs2/namei.c~ocfs2-fix-a-tiny-race-when-running-dirop_fileop_racer 
fs/ocfs2/namei.c
--- a/fs/ocfs2/namei.c~ocfs2-fix-a-tiny-race-when-running-dirop_fileop_racer
+++ a/fs/ocfs2/namei.c
@@ -954,6 +954,65 @@ leave:
return status;
 }
 
+static int ocfs2_check_if_ancestor(struct ocfs2_super *osb,
+   u64 src_inode_no, u64 dest_inode_no)
+{
+   int ret = 0, i = 0;
+   u64 parent_inode_no = 0;
+   u64 child_inode_no = src_inode_no;
+   struct inode *child_inode;
+
+#define MAX_LOOKUP_TIMES 32
+   while (1) {
+   child_inode = ocfs2_iget(osb, child_inode_no, 0, 0);
+   if (IS_ERR(child_inode)) {
+   ret = PTR_ERR(child_inode);
+   break;
+   }
+
+   ret = ocfs2_inode_lock(child_inode, NULL, 0);
+   if (ret  0) {
+   iput(child_inode);
+   if (ret != -ENOENT)
+   mlog_errno(ret);
+   break;
+   }
+
+   ret = ocfs2_lookup_ino_from_name(child_inode, .., 2,
+   parent_inode_no);
+   ocfs2_inode_unlock(child_inode, 0);
+   iput(child_inode);
+   if (ret  0) {
+   ret = -ENOENT;
+   break;
+   }
+
+   if (parent_inode_no == dest_inode_no) {
+   ret = 1;
+   break;
+   }
+
+   if (parent_inode_no == osb-root_inode-i_ino) {
+   ret = 0;
+   break;
+   }
+
+   child_inode_no = parent_inode_no;
+
+   if (++i = MAX_LOOKUP_TIMES) {
+   mlog(ML_NOTICE, max lookup times reached, filesystem 
+   may have nested directories, 
+   src inode: %llu, dest inode: %llu.\n,
+   (unsigned long long)src_inode_no,
+   (unsigned long long)dest_inode_no);
+   ret = 0;
+   break;
+   }
+   }
+
+   return ret;
+}
+
 /*
  * The only place this should be used is rename!
  * if they have the same id, then the 1st one is the only one locked.
@@ -965,6 +1024,7 @@ static int ocfs2_double_lock(struct ocfs
 struct inode *inode2)
 {
int status;
+   int inode1_is_ancestor, inode2_is_ancestor;
struct ocfs2_inode_info *oi1 = OCFS2_I(inode1);
struct ocfs2_inode_info *oi2 = OCFS2_I(inode2);
struct buffer_head **tmpbh;
@@ -978,9 +1038,26 @@ static int ocfs2_double_lock(struct ocfs
if (*bh2)
*bh2 = NULL;
 
-   /* we always want to lock the one with the lower lockid first. */
+   /* we always want to lock the one with the lower lockid first.
+* and if they are nested, we lock ancestor first */
if (oi1-ip_blkno != oi2-ip_blkno) {
-   if (oi1-ip_blkno  oi2-ip_blkno) {
+   inode1_is_ancestor = ocfs2_check_if_ancestor(osb, oi2-ip_blkno,
+   oi1-ip_blkno);
+   if (inode1_is_ancestor  0) {
+   status = inode1_is_ancestor;
+   goto bail;
+   }
+
+   inode2_is_ancestor = ocfs2_check_if_ancestor(osb, oi1-ip_blkno,
+   oi2-ip_blkno);
+   if (inode2_is_ancestor  0) {
+   status = inode2_is_ancestor;
+   goto bail;
+  

[Ocfs2-devel] [patch 10/11] ocfs2: fix issue that ocfs2_setattr() does not deal with new_i_size==i_size

2014-01-24 Thread akpm
From: Younger Liu younger@huawei.com
Subject: ocfs2: fix issue that ocfs2_setattr() does not deal with 
new_i_size==i_size

The issue scenario is as following:

- Create a small file and fallocate a large disk space for a file with
  FALLOC_FL_KEEP_SIZE option.

- ftruncate the file back to the original size again.  but the disk free
  space is not changed back.  This is a real bug that be fixed in this
  patch.

In order to solve the issue above, we modified ocfs2_setattr(), if
attr-ia_size != i_size_read(inode), It calls ocfs2_truncate_file(), and
truncate disk space to attr-ia_size.

Signed-off-by: Younger Liu younger@huawei.com
Reviewed-by: Jie Liu jeff@oracle.com
Tested-by: Jie Liu jeff@oracle.com
Cc: Joel Becker jl...@evilplan.org
Cc: Mark Fasheh mfas...@suse.com
Cc: Sunil Mushran sunil.mush...@gmail.com
Reviewed-by: Jensen shencanq...@huawei.com
Signed-off-by: Andrew Morton a...@linux-foundation.org
---

 fs/ocfs2/alloc.c |2 +-
 fs/ocfs2/file.c  |9 ++---
 2 files changed, 3 insertions(+), 8 deletions(-)

diff -puN 
fs/ocfs2/alloc.c~ocfs2-fix-issue-that-ocfs2_setattr-does-not-deal-with-new_i_size==i_size
 fs/ocfs2/alloc.c
--- 
a/fs/ocfs2/alloc.c~ocfs2-fix-issue-that-ocfs2_setattr-does-not-deal-with-new_i_size==i_size
+++ a/fs/ocfs2/alloc.c
@@ -7158,7 +7158,7 @@ int ocfs2_truncate_inline(struct inode *
if (end  i_size_read(inode))
end = i_size_read(inode);
 
-   BUG_ON(start = end);
+   BUG_ON(start  end);
 
if (!(OCFS2_I(inode)-ip_dyn_features  OCFS2_INLINE_DATA_FL) ||
!(le16_to_cpu(di-i_dyn_features)  OCFS2_INLINE_DATA_FL) ||
diff -puN 
fs/ocfs2/file.c~ocfs2-fix-issue-that-ocfs2_setattr-does-not-deal-with-new_i_size==i_size
 fs/ocfs2/file.c
--- 
a/fs/ocfs2/file.c~ocfs2-fix-issue-that-ocfs2_setattr-does-not-deal-with-new_i_size==i_size
+++ a/fs/ocfs2/file.c
@@ -477,11 +477,6 @@ static int ocfs2_truncate_file(struct in
goto bail;
}
 
-   /* lets handle the simple truncate cases before doing any more
-* cluster locking. */
-   if (new_i_size == le64_to_cpu(fe-i_size))
-   goto bail;
-
down_write(OCFS2_I(inode)-ip_alloc_sem);
 
ocfs2_resv_discard(osb-osb_la_resmap,
@@ -1148,14 +1143,14 @@ int ocfs2_setattr(struct dentry *dentry,
goto bail_unlock_rw;
}
 
-   if (size_change  attr-ia_size != i_size_read(inode)) {
+   if (size_change) {
status = inode_newsize_ok(inode, attr-ia_size);
if (status)
goto bail_unlock;
 
inode_dio_wait(inode);
 
-   if (i_size_read(inode)  attr-ia_size) {
+   if (i_size_read(inode) = attr-ia_size) {
if (ocfs2_should_order_data(inode)) {
status = ocfs2_begin_ordered_truncate(inode,
  
attr-ia_size);
_

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


[Ocfs2-devel] [patch 07/11] ocfs2: check existence of old dentry in ocfs2_link()

2014-01-24 Thread akpm
From: Xue jiufei xuejiu...@huawei.com
Subject: ocfs2: check existence of old dentry in ocfs2_link()

System call linkat first calls user_path_at(), check the existence of old
dentry, and then calls vfs_link()-ocfs2_link() to do the actual work. 
There may exist a race when Node A create a hard link for file while node
B rm it.

 Node A  Node B
user_path_at()
  -ocfs2_lookup(),
find old dentry exist
rm file, add inode say inodeA
to orphan_dir

call ocfs2_link(),create a
hard link for inodeA.

rm the link, add inodeA to orphan_dir
again

When orphan_scan work start, it calls ocfs2_queue_orphans() to do the main
work.  It first tranverses entrys in orphan_dir, linking all inodes in
this orphan_dir to a list look like this:

inodeA-inodeB-...-inodeA

When tranvering this list, it will fall into loop, calling iput() again
and again.  And finally trigger BUG_ON(inode-i_state  I_CLEAR).

Signed-off-by: joyce xuejiu...@huawei.com
Cc: Joel Becker jl...@evilplan.org
Cc: Mark Fasheh mfas...@suse.com
Signed-off-by: Andrew Morton a...@linux-foundation.org
---

 fs/ocfs2/namei.c |   13 +
 1 file changed, 13 insertions(+)

diff -puN fs/ocfs2/namei.c~ocfs2-check-existence-of-old-dentry-in-ocfs2_link 
fs/ocfs2/namei.c
--- a/fs/ocfs2/namei.c~ocfs2-check-existence-of-old-dentry-in-ocfs2_link
+++ a/fs/ocfs2/namei.c
@@ -644,6 +644,7 @@ static int ocfs2_link(struct dentry *old
struct ocfs2_super *osb = OCFS2_SB(dir-i_sb);
struct ocfs2_dir_lookup_result lookup = { NULL, };
sigset_t oldset;
+   u64 old_de_ino;
 
trace_ocfs2_link((unsigned long long)OCFS2_I(inode)-ip_blkno,
 old_dentry-d_name.len, old_dentry-d_name.name,
@@ -665,6 +666,18 @@ static int ocfs2_link(struct dentry *old
err = -ENOENT;
goto out;
}
+
+   err = ocfs2_lookup_ino_from_name(dir, old_dentry-d_name.name,
+   old_dentry-d_name.len, old_de_ino);
+   if (err) {
+   err = -ENOENT;
+   goto out;
+   }
+
+   if (old_de_ino != OCFS2_I(inode)-ip_blkno) {
+   err = -ENOENT;
+   goto out;
+   }
 
err = ocfs2_check_dir_for_entry(dir, dentry-d_name.name,
dentry-d_name.len);
_

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


[Ocfs2-devel] [patch 11/11] ocfs2: update inode size after zeroing the hole

2014-01-24 Thread akpm
From: Junxiao Bi junxiao...@oracle.com
Subject: ocfs2: update inode size after zeroing the hole

fs-writeback will release the dirty pages without page lock whose offset
are over inode size, the release happens at block_write_full_page_endio().
 If not update, dirty pages in file holes may be released before flushed
to the disk, then file holes will contain some non-zero data, this will
cause sparse file md5sum error.

To reproduce the bug, find a big sparse file with many holes, like vm
image file, its actual size should be bigger than available mem size to
make writeback work more frequently, tar it with -S option, then keep
untar it and check its md5sum again and again until you get a wrong
md5sum.

Signed-off-by: Junxiao Bi junxiao...@oracle.com
Cc: Younger Liu younger@huawei.com
Cc: Mark Fasheh mfas...@suse.com
Cc: Joel Becker jl...@evilplan.org
Signed-off-by: Andrew Morton a...@linux-foundation.org
---

 fs/ocfs2/file.c |   40 
 1 file changed, 32 insertions(+), 8 deletions(-)

diff -puN fs/ocfs2/file.c~ocfs2-update-inode-size-after-zeronig-the-hole 
fs/ocfs2/file.c
--- a/fs/ocfs2/file.c~ocfs2-update-inode-size-after-zeronig-the-hole
+++ a/fs/ocfs2/file.c
@@ -716,7 +716,8 @@ leave:
  * While a write will already be ordering the data, a truncate will not.
  * Thus, we need to explicitly order the zeroed pages.
  */
-static handle_t *ocfs2_zero_start_ordered_transaction(struct inode *inode)
+static handle_t *ocfs2_zero_start_ordered_transaction(struct inode *inode,
+   struct buffer_head *di_bh)
 {
struct ocfs2_super *osb = OCFS2_SB(inode-i_sb);
handle_t *handle = NULL;
@@ -733,7 +734,14 @@ static handle_t *ocfs2_zero_start_ordere
}
 
ret = ocfs2_jbd2_file_inode(handle, inode);
-   if (ret  0)
+   if (ret  0) {
+   mlog_errno(ret);
+   goto out;
+   }
+
+   ret = ocfs2_journal_access_di(handle, INODE_CACHE(inode), di_bh,
+ OCFS2_JOURNAL_ACCESS_WRITE);
+   if (ret)
mlog_errno(ret);
 
 out:
@@ -749,7 +757,7 @@ out:
  * to be too fragile to do exactly what we need without us having to
  * worry about recursive locking in -write_begin() and -write_end(). */
 static int ocfs2_write_zero_page(struct inode *inode, u64 abs_from,
-u64 abs_to)
+u64 abs_to, struct buffer_head *di_bh)
 {
struct address_space *mapping = inode-i_mapping;
struct page *page;
@@ -757,6 +765,7 @@ static int ocfs2_write_zero_page(struct
handle_t *handle = NULL;
int ret = 0;
unsigned zero_from, zero_to, block_start, block_end;
+   struct ocfs2_dinode *di = (struct ocfs2_dinode *)di_bh-b_data;
 
BUG_ON(abs_from = abs_to);
BUG_ON(abs_to  (((u64)index + 1)  PAGE_CACHE_SHIFT));
@@ -799,7 +808,8 @@ static int ocfs2_write_zero_page(struct
}
 
if (!handle) {
-   handle = ocfs2_zero_start_ordered_transaction(inode);
+   handle = ocfs2_zero_start_ordered_transaction(inode,
+ di_bh);
if (IS_ERR(handle)) {
ret = PTR_ERR(handle);
handle = NULL;
@@ -816,8 +826,22 @@ static int ocfs2_write_zero_page(struct
ret = 0;
}
 
-   if (handle)
+   if (handle) {
+   /*
+* fs-writeback will release the dirty pages without page lock
+* whose offset are over inode size, the release happens at
+* block_write_full_page_endio().
+*/
+   i_size_write(inode, abs_to);
+   inode-i_blocks = ocfs2_inode_sector_count(inode);
+   di-i_size = cpu_to_le64((u64)i_size_read(inode));
+   inode-i_mtime = inode-i_ctime = CURRENT_TIME;
+   di-i_mtime = di-i_ctime = cpu_to_le64(inode-i_mtime.tv_sec);
+   di-i_ctime_nsec = cpu_to_le32(inode-i_mtime.tv_nsec);
+   di-i_mtime_nsec = di-i_ctime_nsec;
+   ocfs2_journal_dirty(handle, di_bh);
ocfs2_commit_trans(OCFS2_SB(inode-i_sb), handle);
+   }
 
 out_unlock:
unlock_page(page);
@@ -913,7 +937,7 @@ out:
  * has made sure that the entire range needs zeroing.
  */
 static int ocfs2_zero_extend_range(struct inode *inode, u64 range_start,
-  u64 range_end)
+  u64 range_end, struct buffer_head *di_bh)
 {
int rc = 0;
u64 next_pos;
@@ -929,7 +953,7 @@ static int ocfs2_zero_extend_range(struc
next_pos = (zero_pos  PAGE_CACHE_MASK) + PAGE_CACHE_SIZE;
if (next_pos  range_end)
next_pos = range_end;
-   rc = 

[Ocfs2-devel] [patch 06/11] ocfs2/o2net: o2net_listen_data_ready should do nothing if socket state is not TCP_LISTEN

2014-01-24 Thread akpm
From: Tariq Saeed tariq.x.sa...@oracle.com
Subject: ocfs2/o2net: o2net_listen_data_ready should do nothing if socket state 
is not TCP_LISTEN

Orabug: 17330860

When accepting an incomming connection o2net_accept_one clones a child
data socket from the parent listening socket.  It then proceeds to setup
the child with callback o2net_data_ready() and sk_user_data to NULL.  If
data arrives in this window, o2net_listen_data_ready will be called with
some non-deterministic value in sk_user_data (not inherited).  We panic
when we page fault on sk_user_data -- in parent it is sock_def_readable().
 The fix is to recognize that this is a data socket being set up by
looking at the socket state and do nothing.

Signed-off-by: Tariq Saseed tariq.x.sa...@oracle.com
Signed-off-by: Srinivas Eeda srinivas.e...@oracle.com
Cc: Joel Becker jl...@evilplan.org
Cc: Mark Fasheh mfas...@suse.com
Signed-off-by: Andrew Morton a...@linux-foundation.org
---

 fs/ocfs2/cluster/tcp.c |   22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff -puN 
fs/ocfs2/cluster/tcp.c~ocfs2-o2net-o2net_listen_data_ready-should-do-nothing-if-socket-state-is-not-tcp_listen
 fs/ocfs2/cluster/tcp.c
--- 
a/fs/ocfs2/cluster/tcp.c~ocfs2-o2net-o2net_listen_data_ready-should-do-nothing-if-socket-state-is-not-tcp_listen
+++ a/fs/ocfs2/cluster/tcp.c
@@ -1973,18 +1973,30 @@ static void o2net_listen_data_ready(stru
goto out;
}
 
-   /* -sk_data_ready is also called for a newly established child socket
-* before it has been accepted and the acceptor has set up their
-* data_ready.. we only want to queue listen work for our listening
-* socket */
+   /* This callback may called twice when a new connection
+* is  being established as a child socket inherits everything
+* from a parent LISTEN socket, including the data_ready cb of
+* the parent. This leads to a hazard. In o2net_accept_one()
+* we are still initializing the child socket but have not
+* changed the inherited data_ready callback yet when
+* data starts arriving.
+* We avoid this hazard by checking the state.
+* For the listening socket,  the state will be TCP_LISTEN; for the new
+* socket, will be  TCP_ESTABLISHED. Also, in this case,
+* sk-sk_user_data is not a valid function pointer.
+*/
+
if (sk-sk_state == TCP_LISTEN) {
mlog(ML_TCP, bytes: %d\n, bytes);
queue_work(o2net_wq, o2net_listen_work);
+   } else {
+   ready = NULL;
}
 
 out:
read_unlock(sk-sk_callback_lock);
-   ready(sk, bytes);
+   if (ready != NULL)
+   ready(sk, bytes);
 }
 
 static int o2net_open_listening_sock(__be32 addr, __be16 port)
_

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


[Ocfs2-devel] [patch 09/11] ocfs2: llseek requires ocfs2 inode lock for the file in SEEK_END

2014-01-24 Thread akpm
From: Jensen shencanq...@huawei.com
Subject: ocfs2: llseek requires ocfs2 inode lock for the file in SEEK_END

llseek requires ocfs2 inode lock for updating the file size in SEEK_END. 
because the file size maybe update on another node.

This bug can be reproduce the following scenario: at first, we dd a test
fileA, the file size is 10k.

on NodeA:
-
1) open the test fileA, lseek the end of file. and print the position.
2) close the test fileA

on NodeB:
1) open the test fileA, append the 5k data to test FileA.
2) lseek the end of file. and print the position.
3) close file.

At first we run the test program1 on NodeA , the result is 10k.  And then
run the test program2 on NodeB, the result is 15k.  At last, we run the
test program1 on NodeA again, the result is 10k.

After applying this patch the three step result is 15k.

Signed-off-by: Jensen shencanq...@huawei.com
Cc: Jie Liu jeff@oracle.com
Acked-by: Joel Becker jl...@evilplan.org
Cc: Mark Fasheh mfas...@suse.com
Cc: Sunil Mushran sunil.mush...@gmail.com
Signed-off-by: Andrew Morton a...@linux-foundation.org
---

 fs/ocfs2/file.c |   11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff -puN 
fs/ocfs2/file.c~ocfs2-llseek-requires-ocfs2-inode-lock-for-the-file-in-seek_end 
fs/ocfs2/file.c
--- 
a/fs/ocfs2/file.c~ocfs2-llseek-requires-ocfs2-inode-lock-for-the-file-in-seek_end
+++ a/fs/ocfs2/file.c
@@ -2626,7 +2626,16 @@ static loff_t ocfs2_file_llseek(struct f
case SEEK_SET:
break;
case SEEK_END:
-   offset += inode-i_size;
+   /* SEEK_END requires the OCFS2 inode lock for the file
+* because it references the file's size.
+*/
+   ret = ocfs2_inode_lock(inode, NULL, 0);
+   if (ret  0) {
+   mlog_errno(ret);
+   goto out;
+   }
+   offset += i_size_read(inode);
+   ocfs2_inode_unlock(inode, 0);
break;
case SEEK_CUR:
if (offset == 0) {
_

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


Re: [Ocfs2-devel] [patch 03/11] ocfs2/o2net: incorrect to terminate accepting connections loop upon rejecting an invalid one

2014-01-24 Thread Mark Fasheh
On Fri, Jan 24, 2014 at 12:47:02PM -0800, a...@linux-foundation.org wrote:
 From: Tariq Saeed tariq.x.sa...@oracle.com
 Subject: ocfs2/o2net: incorrect to terminate accepting connections loop upon 
 rejecting an invalid one
 
 When o2net-accept-one() rejects an illegal connection, it terminates the
 loop picking up the remaining queued connections.  This fix will continue
 accepting connections till the queue is emtpy.
 
 Addresses Orabug 17489469.

Thanks for sending this, review comments below.


 diff -puN 
 fs/ocfs2/cluster/tcp.c~ocfs2-o2net-incorrect-to-terminate-accepting-connections-loop-upon-rejecting-an-invalid-one
  fs/ocfs2/cluster/tcp.c
 --- 
 a/fs/ocfs2/cluster/tcp.c~ocfs2-o2net-incorrect-to-terminate-accepting-connections-loop-upon-rejecting-an-invalid-one
 +++ a/fs/ocfs2/cluster/tcp.c
 @@ -1826,7 +1826,7 @@ int o2net_register_hb_callbacks(void)
  
  /*  */
  
 -static int o2net_accept_one(struct socket *sock)
 +static int o2net_accept_one(struct socket *sock, int *more)
  {
   int ret, slen;
   struct sockaddr_in sin;
 @@ -1837,6 +1837,7 @@ static int o2net_accept_one(struct socke
   struct o2net_node *nn;
  
   BUG_ON(sock == NULL);
 + *more = 0;
   ret = sock_create_lite(sock-sk-sk_family, sock-sk-sk_type,
  sock-sk-sk_protocol, new_sock);
   if (ret)
 @@ -1848,6 +1849,7 @@ static int o2net_accept_one(struct socke
   if (ret  0)
   goto out;
  
 + *more = 1;
   new_sock-sk-sk_allocation = GFP_ATOMIC;
  
   ret = o2net_set_nodelay(new_sock);
 @@ -1949,8 +1951,15 @@ out:
  static void o2net_accept_many(struct work_struct *work)
  {
   struct socket *sock = o2net_listen_sock;
 - while (o2net_accept_one(sock) == 0)
 + int more;
 + int err;
 +
 + for (;;) {
 + err = o2net_accept_one(sock, more);
 + if (!more)
 + break;

We're throwing out 'err' here and trusting the variable 'more'. However, err
could be set and more would be 0 regardless of whether there actually are
more connections to be had. This makes more sense given when 'more' is set:

if (err)
break;
/* only trust the value of 'more' when err == 0 */
if (more)
break;

Thanks,
--Mark

--
Mark Fasheh

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


Re: [Ocfs2-devel] [patch 05/11] ocfs2: code cleanup: remove unused functions

2014-01-24 Thread Mark Fasheh
On Fri, Jan 24, 2014 at 12:47:04PM -0800, a...@linux-foundation.org wrote:
 From: Goldwyn Rodrigues rgold...@suse.de
 Subject: ocfs2: code cleanup: remove unused functions
 
 These functions are either coded in individual files as static or not used
 at all.  Remove them.

NAK - some of this may be used by userspace (ocfs2_fs.h is shared with
ocfs2-tools).

The other reason for my NAK is that some of these either complete a set of
functions or provide information on how to handle structures:
ocfs2_gd_is_discontig or ocfs2_system_inode_is_global are good examples and
it's all in ocfs2_fs.h which is where I'd personally go look to see how the
fs structures are stored.
--Mark

--
Mark Fasheh

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


Re: [Ocfs2-devel] [patch 01/11] ocfs2: fix ocfs2_sync_file() if filesystem is readonly

2014-01-24 Thread Goldwyn Rodrigues


On 01/24/2014 02:46 PM, a...@linux-foundation.org wrote:
 From: Younger Liu younger.li...@gmail.com
 Subject: ocfs2: fix ocfs2_sync_file() if filesystem is readonly

 If filesystem is readonly, there is no need to flush drive's caches or
 force any uncommitted transactions.

An ocfs2 filesystem can be set to read-only because of an error, in 
which case, you should return -EROFS.

Nak.


-- 
Goldwyn

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


Re: [Ocfs2-devel] [patch 01/11] ocfs2: fix ocfs2_sync_file() if filesystem is readonly

2014-01-24 Thread Mark Fasheh
On Fri, Jan 24, 2014 at 12:46:59PM -0800, a...@linux-foundation.org wrote:
 From: Younger Liu younger.li...@gmail.com
 Subject: ocfs2: fix ocfs2_sync_file() if filesystem is readonly
 
 If filesystem is readonly, there is no need to flush drive's caches or
 force any uncommitted transactions.
 
 Signed-off-by: Younger Liu younger.li...@gmail.com
 Cc: Joel Becker jl...@evilplan.org
 Cc: Mark Fasheh mfas...@suse.com
 Signed-off-by: Andrew Morton a...@linux-foundation.org

Looks good, thanks for this.

Signed-off-by: Mark Fasheh mfas...@suse.de
--Mark

--
Mark Fasheh

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


Re: [Ocfs2-devel] [patch 03/11] ocfs2/o2net: incorrect to terminate accepting connections loop upon rejecting an invalid one

2014-01-24 Thread Srinivas Eeda
On 01/24/2014 01:55 PM, Mark Fasheh wrote:
 On Fri, Jan 24, 2014 at 12:47:02PM -0800, a...@linux-foundation.org wrote:
 From: Tariq Saeed tariq.x.sa...@oracle.com
 Subject: ocfs2/o2net: incorrect to terminate accepting connections loop upon 
 rejecting an invalid one

 When o2net-accept-one() rejects an illegal connection, it terminates the
 loop picking up the remaining queued connections.  This fix will continue
 accepting connections till the queue is emtpy.

 Addresses Orabug 17489469.
 Thanks for sending this, review comments below.


 diff -puN 
 fs/ocfs2/cluster/tcp.c~ocfs2-o2net-incorrect-to-terminate-accepting-connections-loop-upon-rejecting-an-invalid-one
  fs/ocfs2/cluster/tcp.c
 --- 
 a/fs/ocfs2/cluster/tcp.c~ocfs2-o2net-incorrect-to-terminate-accepting-connections-loop-upon-rejecting-an-invalid-one
 +++ a/fs/ocfs2/cluster/tcp.c
 @@ -1826,7 +1826,7 @@ int o2net_register_hb_callbacks(void)
   
   /*  */
   
 -static int o2net_accept_one(struct socket *sock)
 +static int o2net_accept_one(struct socket *sock, int *more)
   {
  int ret, slen;
  struct sockaddr_in sin;
 @@ -1837,6 +1837,7 @@ static int o2net_accept_one(struct socke
  struct o2net_node *nn;
   
  BUG_ON(sock == NULL);
 +*more = 0;
  ret = sock_create_lite(sock-sk-sk_family, sock-sk-sk_type,
 sock-sk-sk_protocol, new_sock);
  if (ret)
 @@ -1848,6 +1849,7 @@ static int o2net_accept_one(struct socke
  if (ret  0)
  goto out;
   
 +*more = 1;
  new_sock-sk-sk_allocation = GFP_ATOMIC;
   
  ret = o2net_set_nodelay(new_sock);
 @@ -1949,8 +1951,15 @@ out:
   static void o2net_accept_many(struct work_struct *work)
   {
  struct socket *sock = o2net_listen_sock;
 -while (o2net_accept_one(sock) == 0)
 +int more;
 +int err;
 +
 +for (;;) {
 +err = o2net_accept_one(sock, more);
 +if (!more)
 +break;
 We're throwing out 'err' here and trusting the variable 'more'. However, err
 could be set and more would be 0 regardless of whether there actually are
 more connections to be had. This makes more sense given when 'more' is set:
Hi Mark,

thanks for reviewing this :). Please correct me if I am wrong.

The idea was if accept itself fails exit the loop, otherwise continue 
scanning all queued connections. For eg: if a connection request from 
unknown node followed by known node happens at the same time, we can 
ignore the ret value of the first connection request and still continue 
accepting the next connection.

ret value appears to be of not much importance other than just for 
logging or debugging purpose.



   if (err)
   break;
   /* only trust the value of 'more' when err == 0 */
   if (more)
   break;

 Thanks,
   --Mark

 --
 Mark Fasheh

 ___
 Ocfs2-devel mailing list
 Ocfs2-devel@oss.oracle.com
 https://oss.oracle.com/mailman/listinfo/ocfs2-devel


___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


Re: [Ocfs2-devel] [patch 01/11] ocfs2: fix ocfs2_sync_file() if filesystem is readonly

2014-01-24 Thread Mark Fasheh
On Fri, Jan 24, 2014 at 04:02:09PM -0600, Goldwyn Rodrigues wrote:
 
 
 On 01/24/2014 02:46 PM, a...@linux-foundation.org wrote:
  From: Younger Liu younger.li...@gmail.com
  Subject: ocfs2: fix ocfs2_sync_file() if filesystem is readonly
 
  If filesystem is readonly, there is no need to flush drive's caches or
  force any uncommitted transactions.
 
 An ocfs2 filesystem can be set to read-only because of an error, in 
 which case, you should return -EROFS.
 
 Nak.

Goldwyn's right actually - disregard my sign off for the last one.

Basically the patch does this:

if  (we're in some readonly state)
return 0;

What we want, at the top of ocfs2_sync_file() is a return of -EROFS. This
will satisfy Goldwyn's requirement that we bubble -EROFS up the stack but at
the same time avoiding the extra work of trying to sync on a RO fs.

So the new version of the patch would be:

if  (we're in some readonly state)
return -EROFS;

Thanks,
--Mark

--
Mark Fasheh

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


Re: [Ocfs2-devel] [patch 03/11] ocfs2/o2net: incorrect to terminate accepting connections loop upon rejecting an invalid one

2014-01-24 Thread tariq saeed

Thanks for the comments.
To understand the consequences of ignoring the err, we need to look at 
what is going on.
We get a softIRQ when a connection packet (tcp SYN). It is critical to 
note that we may not

get a softIRQ_for every connection s_ince connection packets can arrive
back-to-back (as happened in this bug). So, one softIRQ could be 
delivered for  1 pending accept.

_This is the KEY point. _

If we terminate the loop calling o2net_accept_one() upon seeing an 
error, what happens
to the rest of the connections in the queue. If no new connection 
arrives for hours, no new softIRQ

will be delivered, and the connections will just sit in the queue.

Thanks,
-Tariq


On 1/24/2014 1:55 PM, Mark Fasheh wrote:

On Fri, Jan 24, 2014 at 12:47:02PM -0800, a...@linux-foundation.org wrote:

From: Tariq Saeed tariq.x.sa...@oracle.com
Subject: ocfs2/o2net: incorrect to terminate accepting connections loop upon 
rejecting an invalid one

When o2net-accept-one() rejects an illegal connection, it terminates the
loop picking up the remaining queued connections.  This fix will continue
accepting connections till the queue is emtpy.

Addresses Orabug 17489469.

Thanks for sending this, review comments below.



diff -puN 
fs/ocfs2/cluster/tcp.c~ocfs2-o2net-incorrect-to-terminate-accepting-connections-loop-upon-rejecting-an-invalid-one
 fs/ocfs2/cluster/tcp.c
--- 
a/fs/ocfs2/cluster/tcp.c~ocfs2-o2net-incorrect-to-terminate-accepting-connections-loop-upon-rejecting-an-invalid-one
+++ a/fs/ocfs2/cluster/tcp.c
@@ -1826,7 +1826,7 @@ int o2net_register_hb_callbacks(void)
  
  /*  */
  
-static int o2net_accept_one(struct socket *sock)

+static int o2net_accept_one(struct socket *sock, int *more)
  {
int ret, slen;
struct sockaddr_in sin;
@@ -1837,6 +1837,7 @@ static int o2net_accept_one(struct socke
struct o2net_node *nn;
  
  	BUG_ON(sock == NULL);

+   *more = 0;
ret = sock_create_lite(sock-sk-sk_family, sock-sk-sk_type,
   sock-sk-sk_protocol, new_sock);
if (ret)
@@ -1848,6 +1849,7 @@ static int o2net_accept_one(struct socke
if (ret  0)
goto out;
  
+	*more = 1;

new_sock-sk-sk_allocation = GFP_ATOMIC;
  
  	ret = o2net_set_nodelay(new_sock);

@@ -1949,8 +1951,15 @@ out:
  static void o2net_accept_many(struct work_struct *work)
  {
struct socket *sock = o2net_listen_sock;
-   while (o2net_accept_one(sock) == 0)
+   int more;
+   int err;
+
+   for (;;) {
+   err = o2net_accept_one(sock, more);
+   if (!more)
+   break;

We're throwing out 'err' here and trusting the variable 'more'. However, err
could be set and more would be 0 regardless of whether there actually are
more connections to be had. This makes more sense given when 'more' is set:

if (err)
break;
/* only trust the value of 'more' when err == 0 */
if (more)
break;

Thanks,
--Mark

--
Mark Fasheh


___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

Re: [Ocfs2-devel] [patch 01/11] ocfs2: fix ocfs2_sync_file() if filesystem is readonly

2014-01-24 Thread Andrew Morton
On Fri, 24 Jan 2014 14:21:19 -0800 Mark Fasheh mfas...@suse.de wrote:

 On Fri, Jan 24, 2014 at 04:02:09PM -0600, Goldwyn Rodrigues wrote:
  
  
  On 01/24/2014 02:46 PM, a...@linux-foundation.org wrote:
   From: Younger Liu younger.li...@gmail.com
   Subject: ocfs2: fix ocfs2_sync_file() if filesystem is readonly
  
   If filesystem is readonly, there is no need to flush drive's caches or
   force any uncommitted transactions.
  
  An ocfs2 filesystem can be set to read-only because of an error, in 
  which case, you should return -EROFS.
  
  Nak.
 
 Goldwyn's right actually - disregard my sign off for the last one.
 
 Basically the patch does this:
 
 if  (we're in some readonly state)
   return 0;
 
 What we want, at the top of ocfs2_sync_file() is a return of -EROFS. This
 will satisfy Goldwyn's requirement that we bubble -EROFS up the stack but at
 the same time avoiding the extra work of trying to sync on a RO fs.
 
 So the new version of the patch would be:
 
 if  (we're in some readonly state)
   return -EROFS;
 

So it's this?

--- a/fs/ocfs2/file.c~ocfs2-fix-ocfs2_sync_file-if-filesystem-is-readonly
+++ a/fs/ocfs2/file.c
@@ -185,6 +185,9 @@ static int ocfs2_sync_file(struct file *
  file-f_path.dentry-d_name.name,
  (unsigned long long)datasync);
 
+   if (ocfs2_is_hard_readonly(osb) || ocfs2_is_soft_readonly(osb))
+   return -EROFS;
+
err = filemap_write_and_wait_range(inode-i_mapping, start, end);
if (err)
return err;
_


___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel