btrfs-convert: what's the minimum free space requirement?

2017-06-08 Thread Lakshmipathi.G
Hi.
Just wanted to check whether we have any numbers on the minimum
free space required on the source file system for btrfs-convert
to work?

For example: is something like 5% free space on ext3/4 needed for
btrfs-convert to succeed?

Cheers.
Lakshmipathi.G


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: status of swapfiles on Btrfs

2017-06-08 Thread Omar Sandoval
On Thu, Jun 08, 2017 at 03:35:10PM -0600, Chris Murphy wrote:
> What's the status?

I rebased the patches on v4.9 back in November and ran into a circular
locking issue between mmap_sem and i_rwsem. I never figured out how to
resolve that. Christoph was the last one that I talked to about this,
maybe he has some ideas. Latest code rebased on v4.12-rc4 is here:
https://github.com/osandov/linux/tree/btrfs-swap.

[  991.245632] ==
[  991.246493] WARNING: possible circular locking dependency detected
[  991.246590] 4.12.0-rc4-6-g0e2e3e2ba974 #3 Not tainted
[  991.246590] --
[  991.246590] swapme/626 is trying to acquire lock:
[  991.246590]  (&sb->s_type->i_mutex_key#16){++}, at: [] nfs_start_io_direct+0x1e/0x70 [nfs]
[  991.246590]
[  991.246590] but task is already holding lock:
[  991.246590]  (&mm->mmap_sem){++}, at: [] __do_page_fault+0x17a/0x550
[  991.246590]
[  991.246590] which lock already depends on the new lock.
[  991.246590]
[  991.246590]
[  991.246590] the existing dependency chain (in reverse order) is:
[  991.246590]
[  991.246590] -> #1 (&mm->mmap_sem){++}:
[  991.246590]lock_acquire+0xa5/0x250
[  991.246590]__might_fault+0x68/0x90
[  991.246590]copy_page_to_iter+0xc4/0x310
[  991.246590]generic_file_read_iter+0x325/0x7d0
[  991.246590]nfs_file_read+0x7c/0xa0 [nfs]
[  991.246590]__vfs_read+0xe1/0x130
[  991.246590]vfs_read+0xa8/0x150
[  991.246590]SyS_read+0x58/0xd0
[  991.246590]entry_SYSCALL_64_fastpath+0x1f/0xbe
[  991.246590]
[  991.246590] -> #0 (&sb->s_type->i_mutex_key#16){++}:
[  991.246590]__lock_acquire+0x15e1/0x1940
[  991.246590]lock_acquire+0xa5/0x250
[  991.246590]down_read+0x3e/0x70
[  991.246590]nfs_start_io_direct+0x1e/0x70 [nfs]
[  991.246590]nfs_file_direct_write+0x1b6/0x290 [nfs]
[  991.246590]nfs_file_write+0x169/0x1f0 [nfs]
[  991.246590]__swap_writepage+0x121/0x2f0
[  991.246590]swap_writepage+0x34/0x90
[  991.246590]pageout.isra.18+0xf8/0x3c0
[  991.246590]shrink_page_list+0x779/0xac0
[  991.246590]shrink_inactive_list+0x200/0x580
[  991.246590]shrink_node_memcg+0x367/0x750
[  991.246590]shrink_node+0xf7/0x2f0
[  991.246590]do_try_to_free_pages+0xd7/0x350
[  991.246590]try_to_free_mem_cgroup_pages+0x111/0x390
[  991.246590]try_charge+0x14b/0xa20
[  991.246590]mem_cgroup_try_charge+0x87/0x480
[  991.246590]__handle_mm_fault+0xbc0/0x11b0
[  991.246590]handle_mm_fault+0x174/0x340
[  991.246590]__do_page_fault+0x290/0x550
[  991.246590]trace_do_page_fault+0x9a/0x260
[  991.246590]do_async_page_fault+0x4f/0x70
[  991.246590]async_page_fault+0x28/0x30
[  991.246590]
[  991.246590] other info that might help us debug this:
[  991.246590]
[  991.246590]  Possible unsafe locking scenario:
[  991.246590]
[  991.246590]        CPU0                    CPU1
[  991.246590]        ----                    ----
[  991.246590]   lock(&mm->mmap_sem);
[  991.246590]                                lock(&sb->s_type->i_mutex_key#16);
[  991.246590]                                lock(&mm->mmap_sem);
[  991.246590]   lock(&sb->s_type->i_mutex_key#16);
[  991.246590]
[  991.246590]  *** DEADLOCK ***
[  991.246590]
[  991.246590] 1 lock held by swapme/626:
[  991.246590]  #0:  (&mm->mmap_sem){++}, at: [] __do_page_fault+0x17a/0x550
[  991.246590]
[  991.246590] stack backtrace:
[  991.246590] CPU: 2 PID: 626 Comm: swapme Not tainted 4.12.0-rc4-6-g0e2e3e2ba974 #3
[  991.246590] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
[  991.246590] Call Trace:
[  991.246590]  dump_stack+0x8e/0xcd
[  991.246590]  print_circular_bug+0x1f8/0x2e0
[  991.246590]  __lock_acquire+0x15e1/0x1940
[  991.246590]  lock_acquire+0xa5/0x250
[  991.246590]  ? lock_acquire+0xa5/0x250
[  991.246590]  ? nfs_start_io_direct+0x1e/0x70 [nfs]
[  991.246590]  down_read+0x3e/0x70
[  991.246590]  ? nfs_start_io_direct+0x1e/0x70 [nfs]
[  991.246590]  nfs_start_io_direct+0x1e/0x70 [nfs]
[  991.246590]  nfs_file_direct_write+0x1b6/0x290 [nfs]
[  991.246590]  nfs_file_write+0x169/0x1f0 [nfs]
[  991.246590]  ? SyS_madvise+0x870/0x870
[  991.246590]  __swap_writepage+0x121/0x2f0
[  991.246590]  swap_writepage+0x34/0x90
[  991.246590]  pageout.isra.18+0xf8/0x3c0
[  991.246590]  shrink_page_list+0x779/0xac0
[  991.246590]  shrink_inactive_list+0x200/0x580
[  991.246590]  ? mark_lock+0x5d0/0x670
[  991.246590]  shrink_node_memcg+0x367/0x750
[  991.246590]  ? mem_cgroup_iter+0x1c1/0x760
[  991.246590]  shrink_node+0xf7/0x2f0
[  991.246590]  ? shrink_node+0xf7/0x2f0
[  991.246590]  do_try_to_free_pages+0xd7/0x350
[  991.246590]  try_to_free_mem_cgroup_pages+0x111/0x390
[  991.246590]  try_charge+0x14b/0xa20
[  991.246590]  ? 
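[Editor's note] The ABBA inversion lockdep reports above can be boiled down to a small sketch. The following is an illustrative Python model (not kernel code) of the acquisition-order check lockdep roughly performs: one path takes mmap_sem then the inode rwsem, the other takes them in reverse, and the cycle is detected without having to hit the actual deadlock.

```python
# Illustrative model of the lock-order inversion reported above:
# the page-fault path holds mmap_sem and then takes the inode rwsem
# (nfs_start_io_direct), while the read path holds the inode lock and
# then faults, taking mmap_sem. Lockdep flags the cycle in the
# acquisition-order graph rather than waiting for a real deadlock.

def find_inversion(acquisition_sequences):
    """Return two locks that were taken in both orders, if any."""
    order = set()  # (a, b): lock a was held while b was acquired
    for seq in acquisition_sequences:
        held = []
        for lock in seq:
            for h in held:
                order.add((h, lock))
            held.append(lock)
    for a, b in order:
        if (b, a) in order:
            return (a, b)
    return None

# CPU0: page fault -> swap-out to NFS; CPU1: direct read -> page fault.
print(sorted(find_inversion([
    ["mmap_sem", "i_rwsem"],
    ["i_rwsem", "mmap_sem"],
])))  # -> ['i_rwsem', 'mmap_sem']
```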

status of swapfiles on Btrfs

2017-06-08 Thread Chris Murphy
What's the status?

This message gave me an idea:
http://www.spinics.net/lists/linux-btrfs/msg40323.html

What about an xattr on both the subvolume and the swapfile that would
inhibit the btrfs user space tools from snapshotting the subvolume or
reflinking the swapfile?

There could be a feature in the btrfs user space tools to "create a
swapfile", which would do the full sequence:
1. create a subvolume at the top level
2. set a "do not snapshot" xattr on subvolume
3. fallocate a swapfile, presumably contiguous since swap can't use a
file with holes or gaps
4. set a "do not snapshot/reflink" xattr on the swapfile

Does this solve any usability concerns?
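[Editor's note] If such xattrs existed, the sequence might look like the sketch below. This is purely hypothetical: the btrfs.no_snapshot and btrfs.no_reflink xattr names are invented for illustration and do not exist in any btrfs release.

```sh
# Hypothetical sketch of the proposed "create a swapfile" sequence;
# the xattr names are made up, matching the proposal above.
btrfs subvolume create /swap                       # 1. top-level subvolume
setfattr -n btrfs.no_snapshot -v 1 /swap           # 2. inhibit snapshots
fallocate -l 8G /swap/swapfile                     # 3. contiguous swapfile
setfattr -n btrfs.no_reflink -v 1 /swap/swapfile   # 4. inhibit reflinks
mkswap /swap/swapfile && swapon /swap/swapfile
```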



-- 
Chris Murphy


Re: About free space fragmentation, metadata write amplification and (no)ssd

2017-06-08 Thread Hans van Kranenburg
On 06/08/2017 08:47 PM, Roman Mamedov wrote:
> On Thu, 8 Jun 2017 19:57:10 +0200
> Hans van Kranenburg  wrote:
> 
>> There is an improvement with subvolume delete + nossd that is visible
>> between 4.7 and 4.9.
> 
> I don't remember if I asked before, but did you test on 4.4?

No, I jumped from 3.16 lts (debian) to 4.7.8 to 4.9.25 now. I haven't
been building my own (yet), it's all debian kernels.

The biggest improvement I needed was the free space tree (>=4.5),
because with 3.16 transaction commit disk write IO was going through the
roof, blocking the fs for too long every few seconds. 4.7.8 was about
the first kernel I tested that I couldn't easily get to explode and
corrupt file systems. The 3.16 lts was (is) a really stable kernel for
btrfs.

> The two latest
> longterm series are 4.9 and 4.4. 4.7 should be abandoned and forgotten by now
> really, certainly not used daily in production,

I know, I know. They're already gone now. :)

> it's not even listed on
> kernel.org anymore. Also it's possible the 4.7 branch that you test did not
> receive all the bugfix backports from mainline like the longterm series do.

Well, I wouldn't say "all" the bugfixes, looking at the history of
fs/btrfs in current 4.9. It's more like.. sporadically, someone might
take time to also think about the longterm kernel. ;-)

>> I have no idea what change between 4.7 and 4.9 is responsible for this, but
>> it's good.  
> 
> FWIW, this appears to be the big Btrfs change between 4.7 and 4.9 (in 4.8):
> 
> Btrfs: introduce ticketed enospc infrastructure
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=957780eb2788d8c218d539e19a85653f51a96dc1

Since that part of the problem is gone now, I don't think it makes sense
any more to spend time to find where it improved...

-- 
Hans van Kranenburg


Re: About free space fragmentation, metadata write amplification and (no)ssd

2017-06-08 Thread Roman Mamedov
On Thu, 8 Jun 2017 19:57:10 +0200
Hans van Kranenburg  wrote:

> There is an improvement with subvolume delete + nossd that is visible
> between 4.7 and 4.9.

I don't remember if I asked before, but did you test on 4.4? The two latest
longterm series are 4.9 and 4.4. 4.7 should be abandoned and forgotten by now
really, certainly not used daily in production, it's not even listed on
kernel.org anymore. Also it's possible the 4.7 branch that you test did not
receive all the bugfix backports from mainline like the longterm series do.

> I have no idea what change between 4.7 and 4.9 is responsible for this, but
> it's good.  

FWIW, this appears to be the big Btrfs change between 4.7 and 4.9 (in 4.8):

Btrfs: introduce ticketed enospc infrastructure
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=957780eb2788d8c218d539e19a85653f51a96dc1

-- 
With respect,
Roman


Lock between userspace and btrfs-cleaner on extent_buffer

2017-06-08 Thread Sargun Dhillon
I have a deadlock caught in the wild between two processes:
btrfs-cleaner and a userspace process (Docker). Below you can see both
backtraces. btrfs-cleaner is trying to get a lock on
9859d360caf0, which is owned by Docker's pid, while Docker is
trying to get a lock on 9859dc0f0578, which is owned by
btrfs-cleaner's pid.

This is on vanilla 4.11.3 without much workload. The background
workload was basically starting and stopping Docker with a medium
sized image like ubuntu:latest with sleep 5. So: snapshot creation and
destruction, plus some stuff that's logging to btrfs.

crash> bt -FF
PID: 3423   TASK: 985ec7a16580  CPU: 2   COMMAND: "btrfs-cleaner"
 #0 [afca9d9078e8] __schedule at bb235729
afca9d9078f0:  [985eccb2e580:task_struct]
afca9d907900: [985ec7a16580:task_struct] 985ed949b280
afca9d907910: afca9d907978 __schedule+953
afca9d907920: btree_get_extent 9de968f0
afca9d907930: 985ed949b280 afca9d907958
afca9d907940: 0004 00a90842012fd9df
afca9d907950: [985ec7a16580:task_struct]
[9859d360cb50:btrfs_extent_buffer]
afca9d907960: [9859d360cb58:btrfs_extent_buffer]
[985ec7a16580:task_struct]
afca9d907970: [985ec7a16580:task_struct] afca9d907990
afca9d907980: schedule+54
 #1 [afca9d907980] schedule at bb235c96
afca9d907988: [9859d360caf0:btrfs_extent_buffer] afca9d9079f8
afca9d907998: btrfs_tree_read_lock+204
 #2 [afca9d907998] btrfs_tree_read_lock at c03e112c [btrfs]
afca9d9079a0: 985e [985ec7a16580:task_struct]
afca9d9079b0: autoremove_wake_function
[9859d360cb60:btrfs_extent_buffer]
afca9d9079c0: [9859d360cb60:btrfs_extent_buffer] 00a90842012fd9df
afca9d9079d0: [985a6ca3c370:Acpi-State]
[9859d360caf0:btrfs_extent_buffer]
afca9d9079e0: afca9d907ac0 [985e751bc000:kmalloc-8192]
afca9d9079f0: [985e751bc000:kmalloc-8192] afca9d907a48
afca9d907a00: __add_missing_keys+190
 #3 [afca9d907a00] __add_missing_keys at c040abae [btrfs]
afca9d907a08:  afca9d907a28
afca9d907a18: free_extent_buffer+75 00a90842012fd9df
afca9d907a28: afca9d907ab0 afca9d907be8
afca9d907a38:  [985e78dae540:btrfs_path]
afca9d907a48: afca9d907b28 find_parent_nodes+889
 #4 [afca9d907a50] find_parent_nodes at c040c4d9 [btrfs]
afca9d907a58: [985e751bc000:kmalloc-8192]
[9859d613cf40:kmalloc-32]
afca9d907a68: [9859d613c220:kmalloc-32] 
afca9d907a78: 030dc000 
afca9d907a88: [985e78dae540:btrfs_path] 
afca9d907a98: 000178dae540 
afca9d907aa8: 0002 afca9d907ab0
afca9d907ab8: afca9d907ab0 [985a6ca3c370:Acpi-State]
afca9d907ac8: [985a6ca3ce10:Acpi-State] c000985e751bc000
afca9d907ad8: 01a9030d 
afca9d907ae8: a9030dc0 0001
afca9d907af8: 00a90842012fd9df [9859d613c220:kmalloc-32]
afca9d907b08: afca9d907be8 030dc000
afca9d907b18: [985e751bc000:kmalloc-8192] 
afca9d907b28: afca9d907b98 __btrfs_find_all_roots+169
 #5 [afca9d907b30] __btrfs_find_all_roots at c040cb09 [btrfs]
afca9d907b38:  
afca9d907b48:  
afca9d907b58:  [9859d5e63c10:kmalloc-64]
afca9d907b68: 00a90842012fd9df [985e751bc788:kmalloc-8192]
afca9d907b78: [9859d5e63140:kmalloc-64]
[985e9dfa8ee8:btrfs_transaction]
afca9d907b88: [985e9dfa8d80:btrfs_transaction] 0321
afca9d907b98: afca9d907bd0 btrfs_find_all_roots+85
 #6 [afca9d907ba0] btrfs_find_all_roots at c040cbf5 [btrfs]
afca9d907ba8: afca9d907be8 
afca9d907bb8: 42b93000 [985e751bc000:kmalloc-8192]
afca9d907bc8: [985e751bc000:kmalloc-8192] afca9d907c18
afca9d907bd8: btrfs_qgroup_trace_extent+302
 #7 [afca9d907bd8] btrfs_qgroup_trace_extent at c04115ee [btrfs]
afca9d907be0: 1000 [9859d613cf40:kmalloc-32]
afca9d907bf0: 00a90842012fd9df [9859dc0f0578:btrfs_extent_buffer]
afca9d907c00: 0ce5 2fa4
afca9d907c10: [985e751bc000:kmalloc-8192] afca9d907c80
afca9d907c20: btrfs_qgroup_trace_leaf_items+279
 #8 [afca9d907c20] btrfs_qgroup_trace_leaf_items at c0411747 [btrfs]
afca9d907c28: 42b93000 [985eb63d9a40:btrfs_trans_handle]
afca9d907c38: 72ffc03848e8 

Re: About free space fragmentation, metadata write amplification and (no)ssd

2017-06-08 Thread Hans van Kranenburg
Ehrm,

On 05/28/2017 02:59 AM, Hans van Kranenburg wrote:
> A small update...
> 
> Original (long) message:
> https://www.spinics.net/lists/linux-btrfs/msg64446.html
> 
> On 04/08/2017 10:19 PM, Hans van Kranenburg wrote:
>> [...]
>>
>> == But! The Meta Mummy returns! ==
>>
>> After changing to nossd, another thing happened. The expiry process,
>> which normally takes about 1.5 hour to remove ~2500 subvolumes (keeping
>> it queued up to a 100 orphans all the time), suddenly took the entire
>> rest of the day, not being done before the nightly backups had to start
>> again at 10PM...
>>
>> And the only thing it seemed to do is writing, writing, writing 100MB/s
>> all day long.
> 
> This behaviour was observed with a 4.7.5 linux kernel.
> 
> When running 4.9.25 now with -o nossd, this weird behaviour is gone. I
> have no idea what change between 4.7 and 4.9 is responsible for this,
> but it's good.

Ok, that hooray was a bit too early...

 

There is an improvement with subvolume delete + nossd that is visible
between 4.7 and 4.9.

This example that I saved shows what happened when doing remount,nossd
on 4.7.8:

https://syrinx.knorrie.org/~knorrie/btrfs/keep/2017-06-08-xvdb-nossd-sub-del.png

That example filesystem has about 1.5TiB of small files (subversion
repositories) on it, and every 15 minutes, using send/receive (helped by
btrbk) incremental changes are being sent to another location, and
snapshots older than a day are removed.

When switching to nossd, the snapshot removals (also every 15 mins)
suddenly showed quite a lot more disk writes happening (metadata).

With 4.9.25, that effect on this one and smaller filesystems is gone.
The graphs look the same when switching to nossd.

 

But still, on the large filesystem (>30TiB), removing
subvolumes/snapshots takes like >10x the time (and metadata write IO)
with nossd than with ssd.

An example:

https://syrinx.knorrie.org/~knorrie/btrfs/keep/2017-06-08-big-expire-ssd-nossd.png

With -o nossd, I was able to remove 900 subvolumes (varying fs tree
sizes) in about 17 hours, doing sustained 100MB/s writes to disk.

When switching to -o ssd, I was able to remove 4300 of them within 4
hours, with way less disk write activity.

So, I'm still suspecting it's simply the SZ_64K vs SZ_2M difference for
the metadata *empty_cluster that is making this huge difference, and
that the absurd metadata overhead is generated by the fact that the
extent tree's own blocks are tracked inside the extent tree itself.

To gather proof of this, and to research the effect of different
settings and patches (like playing with the empty_cluster values, the
shift-to-left page patch, bulk csum, etc.), I need to be able to
measure some things first.

So, my current idea is to put per-tree cow counters in (with all fs
trees combined under tree 5), exposed via sysfs, so that I can create
munin cow-rate graphs per filesystem. Currently, I've put the
python-to-C btrfs-progs bindings project aside again, and am teaching
myself enough to get this done first. :) Free time is a bit limited
nowadays, but progress is steady.
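[Editor's note] Turning such monotonic cow counters into the per-interval rates that munin graphs is straightforward; a minimal sketch (illustrative Python, assuming (timestamp, counter) samples):

```python
def rates(samples):
    """samples: list of (timestamp, counter) pairs from a monotonic
    counter. Returns per-second rates between consecutive samples,
    skipping counter resets and non-advancing clocks (as munin does)."""
    out = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        if c1 < c0 or t1 <= t0:  # counter reset or bad clock: skip
            continue
        out.append((c1 - c0) / (t1 - t0))
    return out

# 5-minute samples: 1500 cows, then none, then a counter reset.
print(rates([(0, 0), (300, 1500), (600, 1500), (900, 600)]))
# -> [5.0, 0.0]
```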

To be continued...

>> == So, what do we want? ssd? nossd? ==
>>
>> Well, both don't do it for me. I want my expensive NetApp disk space to
>> be filled up, without requiring me to clean up after it all the time
>> using painful balance actions and I want to quickly get rid of old
>> snapshots.
>>
>> So currently, there's two mount -o remount statements before and after
>> doing the expiries...
> 
> With 4.9+ now, it stays on nossd for sure, everywhere. :)

Nope, the daily remounts are back again, well only on the biggest
filesystems. :@

-- 
Hans van Kranenburg


Re: dedicated error codes for the block layer V3

2017-06-08 Thread Mike Snitzer
On Thu, Jun 08 2017 at 11:42am -0400,
Jens Axboe  wrote:

> On 06/03/2017 01:37 AM, Christoph Hellwig wrote:
> > This series introduces a new blk_status_t error code type for the block
> > layer so that we can have tighter control and explicit semantics for
> > block layer errors.
> > 
> > All but the last three patches are cleanups that lead to the new type.
> > 
> > The series is mostly limited to the block layer and drivers, and touches
> > file systems a little bit.  The only major exception is btrfs, which
> > does funny things with bios and thus sees a larger amount of propagation
> > of the new blk_status_t.
> > 
> > A git tree is also available at:
> > 
> > git://git.infradead.org/users/hch/block.git block-errors
> > 
> > gitweb:
> > 
> > 
> > http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/block-errors
> > 
> > Note that the two biggest patches didn't make it to linux-block and
> > linux-btrfs last time.  If you didn't get them they are available in
> > the git tree above.  Unfortunately there is no easy way to split them
> > up.
> 
> Mike, can you take a look at the dm bits in this series? I'd like to get
> this queued up, but I'd also greatly prefer if the dm patches had sign
> off from your end.

Will do.  I'll have a look by the end of the week.


Re: dedicated error codes for the block layer V3

2017-06-08 Thread Jens Axboe
On 06/03/2017 01:37 AM, Christoph Hellwig wrote:
> This series introduces a new blk_status_t error code type for the block
> layer so that we can have tighter control and explicit semantics for
> block layer errors.
> 
> All but the last three patches are cleanups that lead to the new type.
> 
> The series is mostly limited to the block layer and drivers, and touches
> file systems a little bit.  The only major exception is btrfs, which
> does funny things with bios and thus sees a larger amount of propagation
> of the new blk_status_t.
> 
> A git tree is also available at:
> 
> git://git.infradead.org/users/hch/block.git block-errors
> 
> gitweb:
> 
> 
> http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/block-errors
> 
> Note that the two biggest patches didn't make it to linux-block and
> linux-btrfs last time.  If you didn't get them they are available in
> the git tree above.  Unfortunately there is no easy way to split them
> up.

Mike, can you take a look at the dm bits in this series? I'd like to get
this queued up, but I'd also greatly prefer if the dm patches had sign
off from your end.
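[Editor's note] The core idea of the series can be modelled in a few lines. This is an illustrative userspace sketch (Python, not kernel code); the BLK_STS_* names and blk_status_to_errno() mirror the series, but the values and mapping table here are abridged stand-ins, not the kernel's exact ones.

```python
# Model of a dedicated block-layer status type: errors travel as
# blk_status_t codes inside the block layer and are explicitly
# converted to errnos only at the filesystem boundary, instead of
# passing raw negative errnos around with fuzzy semantics.
import errno

BLK_STS_OK, BLK_STS_NOTSUPP, BLK_STS_MEDIUM, BLK_STS_IOERR = range(4)

_TO_ERRNO = {
    BLK_STS_OK:      0,
    BLK_STS_NOTSUPP: -errno.EOPNOTSUPP,
    BLK_STS_MEDIUM:  -errno.ENODATA,
    BLK_STS_IOERR:   -errno.EIO,
}

def blk_status_to_errno(status):
    """Convert a block-layer status to a kernel-style negative errno."""
    return _TO_ERRNO[status]

print(blk_status_to_errno(BLK_STS_IOERR))  # -EIO, i.e. -5 on Linux
```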

-- 
Jens Axboe



Investment Interest & Offer

2017-06-08 Thread Seydou Thieba
We are a fund management company located in the United Kingdom. We specialize
in searching for potential investment opportunities for our high-net-worth
clients globally. Should this be of interest to you, please do not hesitate to
email me for further information.

Thanks
Seydou Thieba


Re: [PATCH 0/10 v11] No wait AIO

2017-06-08 Thread Christoph Hellwig
As already indicated this whole series looks fine to me.

Al: are you going to pick this up?  Or Andrew?

On Tue, Jun 06, 2017 at 06:19:29AM -0500, Goldwyn Rodrigues wrote:
> This series adds nonblocking feature to asynchronous I/O writes.
> io_submit() can be delayed because of a number of reasons:
>  - Block allocation for files
>  - Data writebacks for direct I/O
>  - Sleeping because of waiting to acquire i_rwsem
>  - Congested block device
> 
> The goal of the patch series is to return -EAGAIN/-EWOULDBLOCK if
> any of these conditions are met. This way userspace can push most
> of the write()s to the kernel, which completes them to the best of
> its ability, and defer any that return -EAGAIN to another thread.
> 
> In order to enable this, IOCB_RW_FLAG_NOWAIT is introduced in
> uapi/linux/aio_abi.h. If set for aio_rw_flags, it translates to
> IOCB_NOWAIT for struct iocb, REQ_NOWAIT for bio.bi_opf and IOMAP_NOWAIT for
> iomap. aio_rw_flags is a new flag replacing aio_reserved1. We could
> not use aio_flags because it is not currently checked for invalidity
> in the kernel.
> 
> This feature is provided for asynchronous direct I/O only. I have
> tested it against xfs, ext4, and btrfs; I intend to add more filesystems.
> The nowait feature is for request based devices. In the future, I intend to
> add support to stacked devices such as md.
> 
> Applications will have to check for support by sending an async
> direct write; any error besides -EAGAIN means the feature is not
> supported.
> 
> First two patches are prep patches into nowait I/O.
> 
> Changes since v1:
>  + changed name from _NONBLOCKING to *_NOWAIT
>  + filemap_range_has_page call moved to just before the call to
> filemap_write_and_wait_range().
>  + BIO_NOWAIT limited to get_request()
>  + XFS fixes 
>   - included reflink 
>   - use of xfs_ilock_nowait() instead of a XFS_IOLOCK_NONBLOCKING flag
>   - Translate the flag through IOMAP_NOWAIT (iomap) to check for
> block allocation for the file.
>  + ext4 coding style
> 
> Changes since v2:
>  + Using aio_reserved1 as aio_rw_flags instead of aio_flags
>  + blk-mq support
>  + xfs uptodate with kernel and reflink changes
> 
>  Changes since v3:
>   + Added FS_NOWAIT, which is set if the filesystem supports the NOWAIT feature.
>   + Checks in generic_make_request() to make sure BIO_NOWAIT comes in
> for async direct writes only.
>   + Added QUEUE_FLAG_NOWAIT, which is set if the device supports BIO_NOWAIT.
> This is added (rather not set) to block devices such as dm/md currently.
> 
>  Changes since v4:
>   + Ported AIO code to use RWF_* flags. Check for RWF_* flags in
> generic_file_write_iter().
>   + Changed IOCB_RW_FLAGS_NOWAIT to RWF_NOWAIT.
> 
>  Changes since v5:
>   + BIO_NOWAIT to REQ_NOWAIT
>   + Common helper for RWF flags.
> 
>  Changes since v6:
>   + REQ_NOWAIT will be ignored for request based devices since they
> cannot block. So, removed QUEUE_FLAG_NOWAIT since it is not
> required in the current implementation. It will be resurrected
> when we program for stacked devices.
>   + changed kiocb_rw_flags() to kiocb_set_rw_flags() in order to accommodate
> errors. Moved checks into the function.
> 
>  Changes since v7:
>   + split patches into prep so the main patches are smaller and easier
> to understand
>   + All patches are reviewed or acked!
>  
>  Changes since v8:
>  + Err out AIO reads with -EINVAL flagged as RWF_NOWAIT
> 
>  Changes since v9:
>  + Retract - Err out AIO reads with -EINVAL flagged as RWF_NOWAIT
>  + XFS returns EAGAIN if extent list is not in memory
>  + Man page updates to io_submit with iocb description and nowait features.
> 
>  Changes since v10:
>  + Corrected comment and subject in "return on congested block device"
> 
> -- 
> Goldwyn
> 
> 
---end quoted text---
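[Editor's note] The deferral pattern the series enables can be sketched from userspace. This is illustrative Python; the real interface is io_submit() with RWF_NOWAIT set in aio_rw_flags, and the function names here are made up for the sketch.

```python
# Try the fast nonblocking submit first; only when the kernel says the
# write would block (-EAGAIN) is it handed off to a blocking worker.
import queue

def write_or_defer(submit_fn, deferred, data, offset):
    """submit_fn stands in for a nowait io_submit(); on EAGAIN the
    write is queued for a worker thread that may block."""
    try:
        return submit_fn(data, offset)
    except BlockingIOError:           # kernel returned -EAGAIN
        deferred.put((data, offset))  # a blocking worker drains this
        return None

# Fake submit that "would block" past a cutoff, to show both paths:
def fake_submit(data, offset):
    if offset > 4096:
        raise BlockingIOError
    return len(data)

q = queue.Queue()
print(write_or_defer(fake_submit, q, b"abc", 0))     # 3
print(write_or_defer(fake_submit, q, b"abc", 8192))  # None (deferred)
print(q.qsize())                                     # 1
```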