On Tue, Jun 06, 2017 at 03:04:08PM -0400, David Miller wrote:
> From: David Miller
> Date: Fri, 02 Jun 2017 11:28:54 -0400 (EDT)
>
> >
> > On sparc, if we have an alloca() like situation, as is the case with
> > SHASH_DESC_ON_STACK(), we can end up referencing deallocated
On Wed, Jun 07, 2017 at 11:08:01AM +0800, Eryu Guan wrote:
> On Tue, Jun 06, 2017 at 05:03:05PM -0700, Omar Sandoval wrote:
> > On Sat, Jun 03, 2017 at 12:37:00AM -0700, Christoph Hellwig wrote:
> > > This looks like a btrfs-specific test, and not like a generic one
> > > to me.
> >
> > Nothing
Most of the patches regroup if/else logic
in an attempt to avoid useless checks.
The last patch converts an if/else chain to a switch,
because the code there works with an enum,
and using a switch can make it more obvious
to the compiler how to optimize that code.
This is if/else vs. switch in GCC C++,
but I think GCC does
In the worst case the code did 6 comparisons;
by adding some checks and switching so the right branch is found faster,
the worst case is now 4 comparisons.
Signed-off-by: Timofey Titovets
---
fs/btrfs/inode.c | 21 -
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git
In the worst case the code did 8 comparisons;
by adding some checks and switching so the right branch is found faster,
the worst case is now 5 comparisons.
Signed-off-by: Timofey Titovets
---
fs/btrfs/backref.c | 28
1 file changed, 16 insertions(+), 12 deletions(-)
In the worst case the code did 4 comparisons;
by adding some checks and switching so the right branch is found faster,
the worst case is now 3 comparisons.
Signed-off-by: Timofey Titovets
---
fs/btrfs/file.c | 17 +
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git
By the comparison logic, if ret >= 0 then ret becomes 0,
so let's clean up the comparison logic.
Signed-off-by: Timofey Titovets
---
fs/btrfs/backref.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 897d664a9..f53045891
In the worst case the code did 4 comparisons;
by adding some checks and switching so the right branch is found faster,
the worst case is now 3 comparisons.
Signed-off-by: Timofey Titovets
---
fs/btrfs/ctree.c | 20 +++-
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git
If the argument to a switch is known and its values are consecutive numbers
(here, enum btrfs_wq_endio_type),
the compiler can create a jump table to optimize the dispatch logic.
Signed-off-by: Timofey Titovets
---
fs/btrfs/disk-io.c | 17 +
1 file changed, 13 insertions(+), 4
On Tue, Jun 06, 2017 at 05:03:05PM -0700, Omar Sandoval wrote:
> On Sat, Jun 03, 2017 at 12:37:00AM -0700, Christoph Hellwig wrote:
> > This looks like a btrfs-specific test, and not like a generic one
> > to me.
>
> Nothing about the workload itself is btrfs-specific, we just have the
> extra
'replay_one_buffer' first reads buffers and dispatches items according
to item type.
In this patch, 'add_inode_ref' handles inode_ref and inode_extref.
'add_inode_ref' then calls 'ref_get_fields' and 'extref_get_fields', which
read the ref/extref name for the first time.
So check name_len before reading it in.
Since 'iterate_dir_item' checks name_len in its own way,
use 'btrfs_is_name_len_valid' rather than 'verify_dir_item' to make the
name_len check stricter.
Signed-off-by: Su Yue
---
fs/btrfs/send.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/fs/btrfs/send.c
When reading out a name from an inode_ref or dir_item, a corrupted
name_len can lead to a read beyond the item boundary.
Since there are already patches for btrfs-progs, this patchset is
for btrfs.
Introduce 'btrfs_is_name_len_valid' to check name_len against the
item boundary.
If the name is read from
On 06/03/17 00:58, David Sterba wrote:
Christoph pointed out that bio allocations backed by a bioset will never
fail.
David,
Looks like this feature comes when __GFP_DIRECT_RECLAIM is
set and we aren't, such as [1]. Any idea why? Looks like I am
missing something ?
[1]
-
static
On Tue, Jun 06, 2017 at 06:21:17PM +0800, Anand Jain wrote:
> On 06/03/17 00:58, David Sterba wrote:
> > Christoph pointed out that bio allocations backed by a bioset will never
> > fail.
>
> Looks like this feature comes when __GFP_DIRECT_RECLAIM is
> set and we aren't, such as [1]. Any
This series adds a nonblocking feature to asynchronous I/O writes.
io_submit() can be delayed for a number of reasons:
- Block allocation for files
- Data writebacks for direct I/O
- Sleeping because of waiting to acquire i_rwsem
- Congested block device
The goal of the patch series is to
From: Goldwyn Rodrigues
IOCB_NOWAIT translates to IOMAP_NOWAIT for iomaps.
This is used by XFS in the XFS patch.
Reviewed-by: Christoph Hellwig
Reviewed-by: Jan Kara
Signed-off-by: Goldwyn Rodrigues
---
fs/iomap.c|
From: Goldwyn Rodrigues
Reviewed-by: Christoph Hellwig
Reviewed-by: Jan Kara
Signed-off-by: Goldwyn Rodrigues
---
fs/read_write.c| 12 +++-
include/linux/fs.h | 14 ++
2 files changed, 17 insertions(+),
From: Goldwyn Rodrigues
Return EAGAIN if any of the following conditions hold:
+ i_rwsem is not lockable
+ NODATACOW or PREALLOC is not set
+ Cannot nocow at the desired location
+ The write extends beyond end of file into unallocated space
Acked-by: David Sterba
From: Goldwyn Rodrigues
Find out if the write will trigger a wait due to writeback. If yes,
return -EAGAIN.
Return -EINVAL for buffered AIO: there are multiple causes of
delay such as page locks, dirty throttling logic, page loading
from disk etc. which cannot be taken care
From: Goldwyn Rodrigues
RWF_NOWAIT informs the kernel to bail out if an AIO request would block
for reasons such as file allocations, a triggered writeback,
or blocking while allocating requests during
direct I/O.
RWF_NOWAIT is translated to IOCB_NOWAIT for
From: Goldwyn Rodrigues
aio_rw_flags is introduced in struct iocb (using aio_reserved1) which will
carry the RWF_* flags. We cannot use aio_flags because they are not
checked for validity which may break existing applications.
Note, the only place RWF_HIPRI comes in effect is
From: Goldwyn Rodrigues
If IOCB_NOWAIT is set, bail if the i_rwsem is not immediately
lockable.
If IOMAP_NOWAIT is set, return EAGAIN in xfs_file_iomap_begin
if it needs allocation, whether due to file extension, writing to a hole,
COW, or waiting for other DIOs to finish.
From: Goldwyn Rodrigues
Return EAGAIN for direct I/O if any of the following hold:
+ i_rwsem is not immediately lockable
+ The write extends beyond end of file (which would trigger allocation)
+ Blocks are not allocated at the write location
Signed-off-by: Goldwyn Rodrigues
From: Goldwyn Rodrigues
A new bio operation flag REQ_NOWAIT is introduced to identify bios
originating from an iocb with IOCB_NOWAIT. This flag indicates
to return immediately if a request cannot be made instead
of retrying.
Stacked devices such as md (the ones with
From: Goldwyn Rodrigues
filemap_range_has_page() returns true if the file's mapping has
a page within the mentioned range. This function will be used
to check whether a write() call will cause a writeback of previous
writes.
Reviewed-by: Christoph Hellwig
Reviewed-by:
On Mon, Jun 05, 2017 at 09:56:14PM +0300, Timofey Titovets wrote:
> 2017-06-05 19:10 GMT+03:00 David Sterba :
> > On Tue, May 30, 2017 at 02:18:05AM +0300, Timofey Titovets wrote:
> >> Btrfs already skip store of data where compression didn't
> >> free at least one byte. Let's
On Thu, Apr 13, 2017 at 06:11:48PM -0700, Liu Bo wrote:
> Currently dio read also goes to verify checksum if -EIO has been returned,
> although it usually fails on checksum, it's not necessary at all, we could
> directly check if there is another copy to read.
>
> And with this, the behavior of
While talking to another btrfs user on IRC today, it became clear that a
major point of confusion in the btrfs send manual is that it's not
telling the user soon enough that send/receive solely operates on
subvolume snapshots instead of the original (read/write) subvolumes.
So, change the first
From: Omar Sandoval
The total_bytes_pinned counter is completely broken when accounting
delayed refs:
- If two drops for the same extent are merged, we will decrement
total_bytes_pinned twice but only increment it once.
- If an add is merged into a drop or vice versa, we will
From: Omar Sandoval
There are a few places where we pass in a negative num_bytes, so make it
signed for clarity. Also move it up in the file since later patches will
need it there.
Signed-off-by: Omar Sandoval
---
fs/btrfs/extent-tree.c | 41
From: Omar Sandoval
The extents marked in pin_down_extent() will be unpinned later in
unpin_extent_range(), which decrements total_bytes_pinned.
pin_down_extent() must increment the counter to avoid underflowing it.
Also adjust btrfs_free_tree_block() to avoid accounting for the
From: Omar Sandoval
Currently, we only increment total_bytes_pinned in
btrfs_free_tree_block() when dropping the last reference on the block.
However, when the delayed ref is run later, we will decrement
total_bytes_pinned regardless of whether it was the last reference or
not.
From: Omar Sandoval
Signed-off-by: Omar Sandoval
---
fs/btrfs/extent-tree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 7c01b4e9e3b6..6032e9a635f2 100644
--- a/fs/btrfs/extent-tree.c
From: Omar Sandoval
This series fixes several problems with the total_bytes_pinned counter.
Patches 1 and 2 are cleanups. Patches 3 and 4 are straightforward fixes.
Patch 5 is prep for patch 6. Patch 6 is the most complicated fix.
Patches 5 and 6 are ugly, I'd love any
On Mon, Jun 05, 2017 at 09:29:47PM -0700, Liu Bo wrote:
> On Fri, Jun 02, 2017 at 11:14:13AM -0700, Omar Sandoval wrote:
> > On Fri, May 19, 2017 at 11:39:15AM -0600, Liu Bo wrote:
> > > We commit transaction in order to reclaim space from pinned bytes because
> > > it could process delayed refs,
From: Omar Sandoval
We need this to decide when to account pinned bytes.
Signed-off-by: Omar Sandoval
---
fs/btrfs/delayed-ref.c | 29
fs/btrfs/delayed-ref.h | 6 --
fs/btrfs/extent-tree.c | 51
From: Omar Sandoval
Catch any future/remaining leaks or underflows of total_bytes_pinned.
Signed-off-by: Omar Sandoval
---
fs/btrfs/extent-tree.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index
On Sat, Jun 03, 2017 at 12:37:00AM -0700, Christoph Hellwig wrote:
> This looks like a btrfs-specific test, and not like a generic one
> to me.
Nothing about the workload itself is btrfs-specific, we just have the
extra check at the end. But I don't really care, I can make it a btrfs
test unless
Since the code already knows that (node->ref_mod > 0),
the 'else if (node->ref_mod <= 0)' check is useless,
so just leave a plain 'else'.
Signed-off-by: Timofey Titovets
---
fs/btrfs/backref.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
On Tue, Jun 06, 2017 at 05:57:00PM +0800, Su Yue wrote:
> + if (item_end != read_end &&
> + item_end - read_end < size) {
This fits on one line, no need to split. Fixed.
Btrfs already skips storing data where compression didn't
free at least one byte. Let's improve the logic and check
that compression frees at least one sector,
because otherwise it is useless to store that data compressed.
Signed-off-by: Timofey Titovets
Cc: David
Originally, 'verify_dir_item' verified the name_len of a dir_item against
fixed values but not against the item boundary.
If a corrupted name_len was not bigger than the fixed value, for example 255,
the function would consider the dir_item fine. Then reading beyond the
boundary would cause a crash.
Example:
1. Corrupt
Call 'verify_dir_item' before 'memcmp_extent_buffer' reads the name from the
dir_item.
Signed-off-by: Su Yue
---
fs/btrfs/props.c | 7 +++
1 file changed, 7 insertions(+)
diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c
index d6cb155ef7a1..4b23ae5d0e5c 100644
---
Introduce the function btrfs_is_name_len_valid.
The function compares the argument @name_len with the item boundary and
returns whether name_len is valid.
Signed-off-by: Su Yue
---
fs/btrfs/ctree.h| 2 ++
fs/btrfs/dir-item.c | 73
'btrfs_del_root_ref' does a search_slot and reads the name from the root_ref.
Call 'btrfs_is_name_len_valid' before the memcmp.
Signed-off-by: Su Yue
---
fs/btrfs/root-tree.c | 7 +++
1 file changed, 7 insertions(+)
diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
index
'replay_xattr_deletes' calls 'btrfs_search_slot' to get the buffer and
reads the name.
Call 'verify_dir_item' in 'replay_xattr_deletes' to check name_len
and avoid reading out of the boundary.
Signed-off-by: Su Yue
---
fs/btrfs/tree-log.c | 7 +++
1 file changed, 7
In 'btrfs_log_inode', 'btrfs_search_forward' gets the buffer, and then
'btrfs_check_ref_name_override' reads the name from the inode_ref/inode_extref
for the first time.
Call 'btrfs_is_name_len_valid' before reading the name.
Signed-off-by: Su Yue
---
fs/btrfs/tree-log.c | 6
'btrfs_get_name' does a 'btrfs_search_slot' and reads the name from an
inode_ref/root_ref.
Call btrfs_is_name_len_valid in btrfs_get_name.
Signed-off-by: Su Yue
---
fs/btrfs/export.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/fs/btrfs/export.c
On Tue, Jun 06, 2017 at 06:21:17PM +0800, Anand Jain wrote:
>
>
> On 06/03/17 00:58, David Sterba wrote:
>> Christoph pointed out that bio allocations backed by a bioset will never
>> fail.
>
> David,
>
> Looks like this feature comes when __GFP_DIRECT_RECLAIM is
> set and we aren't, such as
On Tue, Jun 06, 2017 at 05:57:05PM +0800, Su Yue wrote:
> Since 'iterate_dir_item' checks name_len in its way,
> so use 'btrfs_is_name_len_valid' not 'verify_dir_item' to make more strict
> name_len check.
>
> Signed-off-by: Su Yue
> ---
> fs/btrfs/send.c | 6 ++
>
On Tue, Jun 06, 2017 at 05:56:59PM +0800, Su Yue wrote:
> When reading out name from inode_ref, dir_item, it's possible that
> corrupted name_len leads to read beyond boundary.
> Since there are already patches for btrfs-progs, this patchset is
> for btrfs.
>
> Introduce 'btrfs_is_name_len_valid'
On Wed, May 17, 2017 at 03:42:00PM -0600, Liu Bo wrote:
> With raid1 profile, dio read isn't tolerating IO errors if read length is
> less than the stripe length (64K).
>
> Our bio didn't get split in btrfs_submit_direct_hook() if (dip->flags &
> BTRFS_DIO_ORIG_BIO_SUBMITTED) is true and that
From: David Miller
Date: Fri, 02 Jun 2017 11:28:54 -0400 (EDT)
>
> On sparc, if we have an alloca() like situation, as is the case with
> SHASH_DESC_ON_STACK(), we can end up referencing deallocated stack
> memory. The result can be that the value is clobbered if a trap
>
With the switch to btrfs_bio_clone_partial() for splitting bios in the
direct IO path, the read endio was adapted by recording an
iterator in btrfs_bio. However, this breaks bios which are shorter
than the stripe length, and thus need no split, resulting in a NULL
pointer dereference.
This fixes the