Hello Jan,
Am Donnerstag, 9. Februar 2017, 13:44:23 BRST schrieb Jan Kara:
> People, please have a look at the patches. They are mostly simple, however the
> interactions are rather complex so I may have missed something. Also I'm
> happy for any additional testing these patches can get - I've stressed
On Thu, Jan 26, 2017 at 02:57:49PM +0300, Kirill A. Shutemov wrote:
> Later we can add logic to accumulate information from shadow entries to
> return to caller (average eviction time?).
I would say minimum rather than average. That will become the refault
time of the entire page, so minimum
On Thu, Jan 26, 2017 at 02:57:46PM +0300, Kirill A. Shutemov wrote:
> Let's add FileHugePages and FilePmdMapped fields into meminfo and smaps.
> It indicates how many times we allocate and map file THP.
>
> Signed-off-by: Kirill A. Shutemov
Reviewed-by: Matthew
On Thu, Jan 26, 2017 at 02:57:47PM +0300, Kirill A. Shutemov wrote:
> @@ -2146,6 +2146,23 @@ int split_huge_page_to_list(struct page *page, struct
> list_head *list)
> goto out;
> }
>
> + /* Try to free buffers before attempt split */
> +
On Wed, Feb 08, 2017 at 11:34:07AM -0500, Mike Snitzer wrote:
> On Tue, Feb 07 2017 at 11:58pm -0500,
> Kent Overstreet wrote:
>
> > On Tue, Feb 07, 2017 at 09:39:11PM +0100, Pavel Machek wrote:
> > > On Mon 2017-02-06 17:49:06, Kent Overstreet wrote:
> > > > On Mon,
Hi all,
During our test of the CFQ group scheduler, we found a performance-related
problem.
Rate-capped fio jobs in one CFQ group will degrade the performance of fio jobs in
another CFQ group. Both CFQ groups have the same blkio.weight.
We launch two fios in different terminals. The
On Thu, Feb 09, 2017 at 01:18:49PM +0900, Damien Le Moal wrote:
> +
> +/*
> + * Target BIO completion.
> + */
> +static inline void dmz_bio_end(struct bio *bio, int err)
> +{
> + struct dm_zone_bioctx *bioctx =
> + dm_per_bio_data(bio, sizeof(struct dm_zone_bioctx));
> +
> + if
On Thu, Jan 26, 2017 at 02:57:50PM +0300, Kirill A. Shutemov wrote:
> +++ b/mm/filemap.c
> @@ -1886,6 +1886,7 @@ static ssize_t do_generic_file_read(struct file *filp,
> loff_t *ppos,
> if (unlikely(page == NULL))
> goto no_cached_page;
>
On Thu, Jan 26, 2017 at 02:57:52PM +0300, Kirill A. Shutemov wrote:
> @@ -405,9 +405,14 @@ static int __filemap_fdatawait_range(struct
> address_space *mapping,
> if (page->index > end)
> continue;
>
> + page =
On Feb 9, 2017, at 4:34 PM, Matthew Wilcox wrote:
>
> On Thu, Jan 26, 2017 at 02:57:53PM +0300, Kirill A. Shutemov wrote:
>> Most page cache allocation happens via readahead (sync or async), so if
>> we want to have significant number of huge pages in page cache we need
>>
Mike,
On 2/10/17 00:27, Mike Snitzer wrote:
> Looks like this work has come a long way. While I am _still_ not
> hearing from any customers (or partners) that SMR is a priority I do
> acknowledge that that obviously isn't the case for WDC and others.
>
> So, I'll sign up for reviewing this DM
On Thu, Jan 26, 2017 at 02:57:51PM +0300, Kirill A. Shutemov wrote:
> Write path allocate pages using pagecache_get_page(). We should be able
> to allocate huge pages there, if it's allowed. As usually, fallback to
> small pages, if failed.
>
> Signed-off-by: Kirill A. Shutemov
Hannes,
On 2/10/17 01:07, Hannes Reinecke wrote:
>> +/*
>> + * CRC32
>> + */
>> +static u32 dmz_sb_crc32(u32 crc, const void *buf, size_t length)
>> +{
>> +unsigned char *p = (unsigned char *)buf;
>> +int i;
>> +
>> +#define CRCPOLY_LE 0xedb88320
>> +
>> +while (length--) {
>> +
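The truncated helper above is the standard bit-reflected CRC-32 loop over the polynomial CRCPOLY_LE. A self-contained userspace sketch of that loop — assuming the conventional ~0 initial value and final inversion, which the actual dmz_sb_crc32() caller may or may not apply:

```c
#include <stddef.h>
#include <stdint.h>

#define CRCPOLY_LE 0xedb88320u

/* Bit-reflected CRC-32 over buf[0..length), one bit at a time. */
static uint32_t crc32_le(uint32_t crc, const void *buf, size_t length)
{
	const unsigned char *p = buf;

	while (length--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			/* -(crc & 1) is all-ones when the low bit is set */
			crc = (crc >> 1) ^ (CRCPOLY_LE & -(crc & 1));
	}
	return crc;
}
```

With init 0xFFFFFFFF and a final XOR, this matches the common CRC-32 check value: ~crc32_le(~0u, "123456789", 9) == 0xCBF43926.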
On Thu, Jan 26, 2017 at 02:57:53PM +0300, Kirill A. Shutemov wrote:
> Most page cache allocation happens via readahead (sync or async), so if
> we want to have significant number of huge pages in page cache we need
> to find a ways to allocate them from readahead.
>
> Unfortunately, huge pages
Christoph,
On 2/9/17 18:36, Christoph Hellwig wrote:
> On Thu, Feb 09, 2017 at 01:18:49PM +0900, Damien Le Moal wrote:
>> +
>> +/*
>> + * Target BIO completion.
>> + */
>> +static inline void dmz_bio_end(struct bio *bio, int err)
>> +{
>> +struct dm_zone_bioctx *bioctx =
>> +
On Thu, Feb 09 2017, Jan Kara wrote:
> Commit 6cd18e711dd8 "block: destroy bdi before blockdev is
> unregistered." moved bdi unregistration (at that time through
> bdi_destroy()) from blk_release_queue() to blk_cleanup_queue() because
> it needs to happen before blk_unregister_region() call in
On Thu, Feb 09 2017, Jan Kara wrote:
> Currently switching of inode between different writeback structures is
> asynchronous and not guaranteed to succeed. Add a variant of switching
> that is synchronous and reliable so that it can reliably move inode to
> the default writeback structure
On Wed, Feb 8, 2017 at 4:08 PM, James Bottomley
wrote:
> On Mon, 2017-02-06 at 21:42 -0800, Dan Williams wrote:
[..]
>> ...but it reproduces on current mainline with the same config. I
>> haven't spotted what makes scsi_debug behave like this.
>
> Looking at
On 02/09/2017 06:20 PM, Scott Bauer wrote:
> When CONFIG_KASAN is enabled, compilation fails:
>
> block/sed-opal.c: In function 'sed_ioctl':
> block/sed-opal.c:2447:1: error: the frame size of 2256 bytes is larger than
> 2048 bytes [-Werror=frame-larger-than=]
>
> Moved all the ioctl structures
Make the function available for outside use and fortify it against a NULL
kobject.
Signed-off-by: Jan Kara
---
include/linux/kobject.h | 2 ++
lib/kobject.c | 5 ++++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/linux/kobject.h
When device open races with device shutdown, we can get the following
oops in scsi_disk_get():
[11863.044351] general protection fault: [#1] SMP
[11863.045561] Modules linked in: scsi_debug xfs libcrc32c netconsole btrfs
raid6_pq zlib_deflate lzo_compress xor [last unloaded: loop]
Move it up in fs/fs-writeback.c so that we don't have to use forward
declarations. No code change.
Signed-off-by: Jan Kara
---
fs/fs-writeback.c | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/fs/fs-writeback.c
When block device is closed, we call inode_detach_wb() in __blkdev_put()
which sets inode->i_wb to NULL. That is contrary to expectations that
inode->i_wb stays valid once set during the whole inode's lifetime and
leads to oops in wb_get() in locked_inode_to_wb_and_lock_list() because
Currently switching of inode between different writeback structures is
asynchronous and not guaranteed to succeed. Add a variant of switching
that is synchronous and reliable so that it can reliably move inode to
the default writeback structure (bdi->wb) when writeback on bdi is going
to be
"previous" is a better name for the variable storing the previous
asynchronous request, better than the opaque name "data" atleast.
We see that we assign the return status to the returned variable
on all code paths, so we might as well just do that immediately
after calling mmc_finalize_areq().
As we want to complete requests autonomously from feeding the
host with new requests, we create a worker thread to deal with
this specifically in response to the callback from a host driver.
This patch just adds the worker, later patches will make use of
it.
Signed-off-by: Linus Walleij
The waitqueue in the host context is there to signal back from
mmc_request_done() through mmc_wait_data_done() that the hardware
is done with a command, and when the wait is over, the core
will typically submit the next asynchronous request that is pending
just waiting for the hardware to be
Remove all the pipeline flushing: i.e. repeatedly sending NULL
down to the core layer to flush out asynchronous requests,
and also sending NULL after "special" commands to achieve the
same flush.
Instead: let the "special" commands wait for any ongoing
asynchronous transfers using the completion,
The host context member "is_new_req" is only assigned values,
never checked. Delete it.
Signed-off-by: Linus Walleij
---
drivers/mmc/core/core.c | 1 -
drivers/mmc/core/queue.c | 5 -
include/linux/mmc/host.h | 2 --
3 files changed, 8 deletions(-)
diff --git
The "is_done_rcv" in the context info for the host is no longer
needed: it is clear from context (ha!) that as long as we are
waiting for the asynchronous request to come to completion,
we are not done receiving data, and when the finalization work
has run and completed the completion, we are
The last member of the context info: is_waiting_last_req is
just assigned values, never checked. Delete that and the whole
context info as a result.
Signed-off-by: Linus Walleij
---
drivers/mmc/core/bus.c | 1 -
drivers/mmc/core/core.c | 13 -
The if() statement checking if there is no current or previous
request is now just looking ahead at something that will be
concluded a few lines below. Simplify the logic by moving the
assignment of .asleep.
Signed-off-by: Linus Walleij
---
drivers/mmc/core/queue.c | 9
On Thu 09-02-17 13:44:26, Jan Kara wrote:
> When a device gets removed, the block device inode is unhashed so that it is not
> used anymore (bdget() will not find it anymore). Later when a new device
> gets created with the same device number, we create new block device
> inode. However there may be file
On Wed, Feb 08 2017 at 11:18pm -0500,
Damien Le Moal wrote:
> The dm-zoned device mapper target provides transparent write access
> to zoned block devices (ZBC and ZAC compliant block devices).
> dm-zoned hides from the device user (a file system or an application
> doing
On Thu, Feb 09, 2017 at 05:50:31AM -0800, Christoph Hellwig wrote:
> On Sun, Feb 05, 2017 at 06:15:17PM +0100, Christoph Hellwig wrote:
> > Hi Michael, hi Jason,
> >
> > This patch series applies a few cleanups to the virtio PCI interrupt handling
> > code, and then converts the virtio PCI code to use
On Thu 09-02-17 16:36:13, Boaz Harrosh wrote:
> On 02/02/2017 07:34 PM, Jan Kara wrote:
> > So far we just relied on block device to hold a bdi reference for us
> > while the filesystem is mounted. While that works perfectly fine, it is
> > a bit awkward that we have a pointer to a refcounted
This moves the asynchronous post-processing of a request over
to the finalization function.
The patch has a slight semantic change:
Both places will be in the code path for if (host->areq) and
in the same sequence, but before this patch, the next request
was started before performing
The following is the latest attempt at rewriting the MMC/SD
stack to cope with multiqueueing.
If you just want to grab a branch and test the patches with
your hardware, I put a git branch with this series here:
HACK ALERT: DO NOT MERGE THIS! IT IS A FYI PATCH FOR DISCUSSION
ONLY.
This is a totally new implementation of how to do multiqueue
in the MMC/SD stack. It is based on top of my refactorings in the
series which ends with this patch, and now makes proper use of
.init_request() and .exit_request()
Instead of passing two pointers around and reassigning them
left and right, pass the struct mmc_queue_req and dereference
the queue from the request where needed. The struct mmc_queue_req
is the thing that has a lifecycle after all: this is what we are
keeping in our queue. Augment all users to
This makes a crucial change to the issuing mechanism for the
MMC requests:
Before commit "mmc: core: move the asynchronous post-processing"
some parallelism on the read/write requests was achieved by
speculatively postprocessing a request and re-preprocess and
re-issue the request if something
The per-hardware-transaction struct mmc_queue_req is assigned
from a pool of 2 requests using a current/previous scheme and
then swapped around.
This is confusing, especially if we need more than two to make
our work efficient and parallel.
Rewrite the mechanism to have a pool of struct
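The pool idea replacing the current/previous swap can be sketched in a few lines of userspace C — a fixed array of slots claimed and released with a busy flag (the mq_req type and POOL_SIZE here are made-up stand-ins, not the actual struct mmc_queue_req code):

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4

struct mq_req {
	bool busy;	/* slot currently owned by an in-flight request */
	int tag;	/* slot index, handy for debugging */
};

static struct mq_req pool[POOL_SIZE];

/* Claim a free slot from the pool, or NULL if all are busy. */
static struct mq_req *mq_req_claim(void)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].busy) {
			pool[i].busy = true;
			pool[i].tag = i;
			return &pool[i];
		}
	}
	return NULL;
}

/* Return a slot to the pool. */
static void mq_req_release(struct mq_req *req)
{
	req->busy = false;
}
```

Unlike the two-slot current/previous scheme, nothing here is special about slot count: sizing the pool up is a one-line change, which is the point of the rewrite.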
Instead of doing retries at the same time as trying to submit new
requests, do the retries when the request is reported as completed
by the driver, in the finalization worker.
This is achieved by letting the core worker call back into the block
layer using mmc_blk_rw_done(), that will read the
> What immediately jumps out at you is that linear read/writes
> perform just as nicely or actually better with MQ than with the
> old block layer.
>
> What is amazing is that just a little randomness, such as the
> find . > /dev/null immediately seems to visibly regress with MQ.
> My best guess
On Thu 09-02-17 12:52:47, Thiago Jung Bauermann wrote:
> Hello Jan,
>
> Am Donnerstag, 9. Februar 2017, 13:44:23 BRST schrieb Jan Kara:
> > People, please have a look at the patches. They are mostly simple, however the
> > interactions are rather complex so I may have missed something. Also I'm
> >
On 02/09/2017 05:18 AM, Damien Le Moal wrote:
> The dm-zoned device mapper target provides transparent write access
> to zoned block devices (ZBC and ZAC compliant block devices).
> dm-zoned hides from the device user (a file system or an application
> doing raw block device accesses) any constraint
Move bdev_unhash_inode() after invalidate_partition() as
invalidate_partition() looks up bdev and will unnecessarily recreate it
if bdev_unhash_inode() destroyed it. Also use part_devt() when calling
bdev_unhash_inode() instead of manually creating the device number.
Signed-off-by: Jan Kara
__inode_wait_for_writeback() waits for I_SYNC on inode to get cleared.
There's nothing specific regarding I_SYNC in that function. Generalize
it so that we can use it also for the I_WB_SWITCH bit. Also the function
uses __wait_on_bit() unnecessarily. Switch it to wait_on_bit() to remove
some code.
On Sun, Feb 05, 2017 at 06:15:17PM +0100, Christoph Hellwig wrote:
> Hi Michael, hi Jason,
>
> This patch series applies a few cleanups to the virtio PCI interrupt handling
> code, and then converts the virtio PCI code to use the automatic MSI-X
> vectors spreading, as well as using the information in
On Wed 08-02-17 12:24:00, Richard Weinberger wrote:
> Am 02.02.2017 um 18:34 schrieb Jan Kara:
> > Allocate struct backing_dev_info separately instead of embedding it
> > inside the superblock. This unifies handling of bdi among users.
> >
> > CC: Richard Weinberger
> > CC: Artem
> > @@ -1249,6 +1254,50 @@ mount_fs(struct file_system_type *type, int flags,
> > const char *name, void *data)
> > }
> >
> > /*
> > + * Setup private BDI for given superblock. I gets automatically cleaned up
>
> (typo) s/I/It/
>
> Looks fine otherwise.
Thanks, fixed.
On 02/02/2017 07:34 PM, Jan Kara wrote:
> So far we just relied on block device to hold a bdi reference for us
> while the filesystem is mounted. While that works perfectly fine, it is
> a bit awkward that we have a pointer to a refcounted structure in the
> superblock without proper reference. So
On Wed, Feb 8, 2017 at 11:12 PM, Scott Bauer wrote:
> On Wed, Feb 08, 2017 at 02:58:28PM -0700, Scott Bauer wrote:
>> Thank you for the report. We want to keep the function calls agnostic to
>> userland.
>> In the future we will have in-kernel callers and I don't want to
On 02/02/2017 07:34 PM, Jan Kara wrote:
> Allocate struct backing_dev_info separately instead of embedding it
> inside the superblock. This unifies handling of bdi among users.
>
> CC: Boaz Harrosh
> CC: Benny Halevy
> CC: osd-...@open-osd.org
From: Arnd Bergmann
> Sent: 08 February 2017 21:15
>
> When CONFIG_KASAN is in use, the sed_ioctl function uses unusually large
> stack,
> as each possible ioctl argument gets its own stack area plus redzone:
Why not do a single copy_from_user() at the top of sed_ioctl() based on
the _IOC_DIR()
Am 09.02.2017 um 13:17 schrieb Jan Kara:
>> So ->capabilities is now zero by default since you use __GFP_ZERO in
>> bdi_alloc().
>> At least for UBIFS I'll add a comment on this, otherwise it is not so
>> clear that UBIFS wants a BDI with no capabilities and how it achieves that.
>
> OK, I've
The IOW for IOC_OPAL_ACTIVATE_LSP took the wrong structure, which
would give us the wrong size when using _IOC_SIZE; switch it to the
right structure.
Fixes: 058f8a2 ("Include: Uapi: Add user ABI for Sed/Opal")
Signed-off-by: Scott Bauer
---
It may be too late to change anything in the uapi header. When we
switched over to using IOC_SIZE I found a bug where I had switched
up a structure in one of the series from v4 to v5 but never changed
the structure in the IOW. The structure that was in there was too small
so when we kzalloc on it
When CONFIG_KASAN is enabled, compilation fails:
block/sed-opal.c: In function 'sed_ioctl':
block/sed-opal.c:2447:1: error: the frame size of 2256 bytes is larger than
2048 bytes [-Werror=frame-larger-than=]
Moved all the ioctl structures off the stack; they are now dynamically
allocated using _IOC_SIZE()
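The pattern behind that fix can be shown in userspace: size a single heap allocation from the ioctl number itself instead of declaring every argument struct on the stack. IOC_EXAMPLE and opal_example_arg below are hypothetical stand-ins, not the real sed-opal commands:

```c
#include <stdlib.h>
#include <sys/ioctl.h>	/* _IOW, _IOC_SIZE */

/* Hypothetical ioctl argument, standing in for the sed-opal structs. */
struct opal_example_arg {
	unsigned char key[32];
	unsigned int flags;
};

#define IOC_EXAMPLE _IOW('p', 0x42, struct opal_example_arg)

/*
 * _IOW() encodes sizeof(type) into the command number, so the argument
 * buffer can be allocated from the command itself, off the stack.
 */
static void *ioc_arg_alloc(unsigned int cmd)
{
	return calloc(1, _IOC_SIZE(cmd));
}
```

In the kernel the allocation would be kzalloc() followed by copy_from_user(), but the sizing trick via _IOC_SIZE() is the same.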
mmc_wait_for_data_req_done() is called in exactly one place,
and having it spread out makes things hard to follow.
Factor this function into mmc_finalize_areq().
Signed-off-by: Linus Walleij
---
drivers/mmc/core/core.c | 86
On Wed, Feb 08, 2017 at 07:57:27PM -0800, Matthew Wilcox wrote:
> On Thu, Jan 26, 2017 at 02:57:43PM +0300, Kirill A. Shutemov wrote:
> > +++ b/include/linux/pagemap.h
> > @@ -332,6 +332,15 @@ static inline struct page
> > *grab_cache_page_nowait(struct address_space *mapping,
> >
We have this construction:
if (a && b && !c)
        finalize;
else {
        block;
        finalize;
}
Which is equivalent by boolean logic to:
if (!a || !b || c)
        block;
finalize;
Which is simpler code.
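The equivalence is just De Morgan's law: !(a && b && !c) == (!a || !b || c), so inverting the guard lets the common finalize fall through unconditionally. A quick exhaustive check over all eight input combinations:

```c
#include <stdbool.h>

/* Verify (a && b && !c) is the exact negation of (!a || !b || c). */
static bool demorgan_holds(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			for (int c = 0; c <= 1; c++)
				if ((a && b && !c) != !(!a || !b || c))
					return false;
	return true;
}
```

Since the two conditions are exact negations, the else-branch of the first form runs exactly when the if-branch of the second does, and finalize runs in every case either way.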
Signed-off-by: Linus Walleij
---
drivers/mmc/core/core.c | 27
From: Scott Bauer
> Sent: 09 February 2017 17:20
> It may be too late to change anything in the uapi header. When we
> switched over to using IOC_SIZE I found a bug where I had switched
> up a structure in one of the series from v4 to v5 but never changed
> the structure in the IOW. The structure
On Thu, Feb 09, 2017 at 05:43:20PM +, David Laight wrote:
> From: Scott Bauer
> > Sent: 09 February 2017 17:20
> > It may be too late to change anything in the uapi header. When we
> > switched over to using IOC_SIZE I found a bug where I had switched
> > up a structure in one of the series