On Fri, May 18, 2018 at 03:03:49PM +0200, Roman Pen wrote:
> Function is going to be used in transport over RDMA module
> in subsequent patches.
>
> Signed-off-by: Roman Pen
> Cc: Tejun Heo
> Cc: linux-ker...@vger.kernel.org
Acked-by: Tejun Heo
Please feel free to apply
On Sun, May 27, 2018 at 11:21:25AM +0900, Tetsuo Handa wrote:
> From 8a8222698163d1fe180258566e9a3ff43f54fcd9 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa
> Date: Sun, 27 May 2018 11:08:20 +0900
> Subject: [PATCH] bdi: Fix another oops in wb_workfn()
>
> syzbot is still hitting NULL pointer dere
On Sun, May 27, 2018 at 01:43:45PM +0900, Tetsuo Handa wrote:
> Tejun Heo wrote:
> > On Sun, May 27, 2018 at 11:21:25AM +0900, Tetsuo Handa wrote:
> > > syzbot is still hitting NULL pointer dereference at wb_workfn() [1].
> > > This might be because we overlooked that d
Hey, Linus.
On Mon, Jun 04, 2018 at 09:24:28AM -0700, Linus Torvalds wrote:
> Tejun, the code in question is mddev_put() in drivers/md/md.c, and it
> basically does
>
> INIT_WORK(&mddev->del_work, mddev_delayed_delete);
> queue_work(md_misc_wq, &mddev->del_work);
>
Hello,
On Mon, Jun 11, 2018 at 11:12:48AM +0200, Jan Kara wrote:
> However this is wrong and so is the patch. The problem is in
> cgwb_bdi_unregister() which does cgwb_kill() and thus drops bdi's
> reference to wb structures before going through the list of wbs again and
> calling wb_shutdown() on
> way we also no longer need synchronization using WB_shutting_down as the
> mutex provides it for CONFIG_CGROUP_WRITEBACK case and without
> CONFIG_CGROUP_WRITEBACK wb_shutdown() can be called only once from
> bdi_unregister().
>
> Reported-by: syzbot
> Signed-off-by: Jan Kara
Acked-by: Tejun Heo
Thanks.
--
tejun
Hello,
On Thu, Aug 02, 2018 at 11:29:39AM -0700, Bart Van Assche wrote:
> Introduce a function that allows reading the value of a per-cpu counter.
> This function will be used in the next patch to check whether a per-cpu
> counter in atomic mode has the value one.
I'm not a big fan of exposing th
Hello, Bart.
On Thu, Aug 02, 2018 at 01:04:43PM -0700, Bart Van Assche wrote:
> As you probably know one of the long term goals for the block layer
> is to switch to blk-mq and to drop the legacy block layer. Hence
> this patch series that adds support for runtime power management to
> blk-mq beca
Hello,
On Tue, Aug 07, 2018 at 03:51:28PM -0700, Bart Van Assche wrote:
> Introduce a function that allows determining whether a per-cpu refcount
> is in use. This function will be used in a later patch to determine
> whether or not any block layer requests are being executed.
I thought about it
Hello,
On Thu, Aug 09, 2018 at 07:53:36AM -0700, Bart Van Assche wrote:
> +/**
> + * blkg_lookup - look up blkg for the specified request queue
> + * @q: request_queue of interest
> + *
> + * Lookup blkg for @q at the root level. See also blkg_lookup().
> + */
> +static inline struct blkcg_gq *blk
rcu grace period.
Can you please note that in the function comment? Provided that,
Acked-by: Tejun Heo
Thanks.
--
tejun
Hello, Bart.
On Mon, Sep 24, 2018 at 01:43:35PM -0700, Bart Van Assche wrote:
> Thanks for the review. But could you please clarify what "after ->percpu_ref
> is called" refers to?
Oops, sorry, I meant the confirm_kill callback of
percpu_ref_kill_and_confirm().
Thanks.
--
tejun
Hello, Konstantin.
On Wed, Feb 27, 2019 at 11:05:44AM +0300, Konstantin Khlebnikov wrote:
> Unlike the memory cgroup, the blkio throttler does not support value suffixes.
>
> It silently ignores everything after the last digit. For example this command
> will set rate limit 1 byte per second rather than 1
Hello,
On Tue, Apr 11, 2017 at 03:43:01PM +0200, Paolo Valente wrote:
> From: Arianna Avanzini
>
> Add complete support for full hierarchical scheduling, with a cgroups
> interface. Full hierarchical scheduling is implemented through the
> 'entity' abstraction: both bfq_queues, i.e., the interna
Hello, Paolo.
On Wed, Apr 12, 2017 at 07:22:03AM +0200, Paolo Valente wrote:
> could you elaborate a bit more on this? I mean, cgroups support has
> been in BFQ (and CFQ) for almost ten years, perfectly working as far
> as I know. Of course it is perfectly working in terms of I/O and not
> of CP
Hello,
On Mon, May 15, 2017 at 09:49:13PM +0200, Paolo Valente wrote:
> So, unless you tell me that there are other races I haven't seen, or,
> even worse, that I'm just talking nonsense, I have thought of a simple
> solution to address this issue without resorting to the request_queue
> lock: fur
On Wed, May 17, 2017 at 01:07:23PM -0700, Shaohua Li wrote:
> Hi,
>
> The patchset fixes a few issues for the io.low limit. The titles of the
> patches explain the issues pretty well.
Looks good to me.
Acked-by: Tejun Heo
Thanks.
--
tejun
Hello, Paolo.
On Fri, May 19, 2017 at 10:39:08AM +0200, Paolo Valente wrote:
> Operations on blkg objects in blk-cgroup are protected with the
> request_queue lock, which is no longer the lock that protects
> I/O-scheduler operations in blk-mq. The latter are now protected with
> finer-grained per-s
Hello, Paolo.
On Sat, May 20, 2017 at 09:27:33AM +0200, Paolo Valente wrote:
> Consider a process or a group that is moved from a given source group
> to a different group, or simply removed from a group (although I
> didn't yet succeed in just removing a process from a group :) ). The
> pointer
Hello, Paolo.
On Wed, May 24, 2017 at 12:53:26PM +0100, Paolo Valente wrote:
> Exact, but even after all blkgs, as well as the cfq_group and pd, are
> gone, the children cfq_queues of the gone cfq_group continue to point
> to nonexistent objects, until new cfq_set_requests are executed for
> those
> __blkg_release_rcu+0x59/0x170
> rcu_process_callbacks+0x260/0x4e0
> __do_softirq+0x116/0x250
> smpboot_thread_fn+0x123/0x1e0
> kthread+0x109/0x140
> ret_from_fork+0x31/0x40
>
> Fixes: commit e9c787e65c0c ("scsi: allocate scsi_cmnd structures as part of
> struct
Hello,
On Wed, May 24, 2017 at 05:43:18PM +0100, Paolo Valente wrote:
> > so none of the above objects can be destroyed before the request is
> > done.
>
> ... the issue seems just to move to a more subtle position: cfq is ok,
> because it protects itself with rq lock, but blk-mq schedulers don't
Hello, Paolo.
On Thu, May 25, 2017 at 09:10:59AM +0200, Paolo Valente wrote:
> Ok. So, just to better understand: as of now, i.e., before you make
> the changes you are proposing, the address returned by blkg_lookup can
> be used safely only if one both invokes blkg_lookup and gets a
> reference,
On Sun, Jun 04, 2017 at 02:42:19PM +0200, Christoph Hellwig wrote:
> Hi all,
>
> this series adds support for using our new generic TCG OPAL code with
> SATA disks, and as side effect for SCSI disks (although so far it doesn't
> seem like any of those actually exist).
Applied 1-5 to libata/for-4
Hello,
On Fri, Jun 02, 2017 at 02:53:55PM -0700, Shaohua Li wrote:
> @@ -630,7 +633,11 @@ static struct kernfs_node *__kernfs_new_node(struct
> kernfs_root *root,
> if (!kn)
> goto err_out1;
>
> - ret = ida_simple_get(&root->ino_ida, 1, 0, GFP_KERNEL);
> + idr_preloa
Hello,
On Fri, Jun 02, 2017 at 02:53:56PM -0700, Shaohua Li wrote:
> --- a/fs/kernfs/dir.c
> +++ b/fs/kernfs/dir.c
> @@ -643,6 +643,7 @@ static struct kernfs_node *__kernfs_new_node(struct
> kernfs_root *root,
> kn->ino = ret;
> kn->generation = atomic_inc_return(&root->next_generatio
Hello,
On Fri, Jun 02, 2017 at 02:53:57PM -0700, Shaohua Li wrote:
> @@ -570,7 +570,8 @@ static int kernfs_dop_revalidate(struct dentry *dentry,
> unsigned int flags)
> goto out_bad;
>
> /* The kernfs node has been moved? */
> - if (dentry->d_parent->d_fsdata != kn->pare
Ooh, one more thing.
On Mon, Jun 12, 2017 at 02:20:28PM -0400, Tejun Heo wrote:
> > +struct kernfs_node *kernfs_get_node_by_ino(struct kernfs_root *root,
> > + unsigned int ino)
Can we name this kernfs_find_and_get_by_ino() for consistency?
On Fri, Jun 02, 2017 at 02:53:59PM -0700, Shaohua Li wrote:
> From: Shaohua Li
>
> Add an API to export cgroup fhandle info. We don't export a full 'struct
> file_handle'; it contains unneeded info. Specifically, cgroup is always
> a directory, so we don't need a 'FILEID_INO32_GEN_PARENT' type fh
On Fri, Jun 02, 2017 at 02:54:01PM -0700, Shaohua Li wrote:
> @@ -691,6 +691,8 @@ static inline bool blkcg_bio_issue_check(struct
> request_queue *q,
> rcu_read_lock();
> blkcg = bio_blkcg(bio);
>
> + bio_associate_blkcg(bio, &blkcg->css);
> +
Let's please note that this only es
really is a problem, we should
> find better data structure.
>
> Signed-off-by: Shaohua Li
Acked-by: Tejun Heo
Thanks.
--
tejun
number.
>
> Signed-off-by: Shaohua Li
Acked-by: Tejun Heo
Thanks.
--
tejun
Hello,
On Thu, Jun 15, 2017 at 11:17:11AM -0700, Shaohua Li wrote:
> diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
> index 33f711f..7a4f327 100644
> --- a/fs/kernfs/dir.c
> +++ b/fs/kernfs/dir.c
> @@ -508,6 +508,10 @@ void kernfs_put(struct kernfs_node *kn)
> struct kernfs_node *parent;
>
Look at the kernfs code
> closely, there is no point in setting dentry->d_fsdata. inode->i_private
> already points to the kernfs_node, and we can get the inode from a dentry. So
> this patch just deletes the d_fsdata usage.
>
> Signed-off-by: Shaohua Li
Acked-by: Tejun Heo
Thanks.
--
tejun
Hello,
On Thu, Jun 15, 2017 at 11:17:13AM -0700, Shaohua Li wrote:
> +/* represent a kernfs node */
> +struct kernfs_node_id {
> + u32 ino;
> + u32 generation;
> +} __attribute__((packed));
Can we just make it a u64? kernfs cares about the details
Hello,
On Thu, Jun 15, 2017 at 11:17:14AM -0700, Shaohua Li wrote:
> -static int kernfs_fill_super(struct super_block *sb, unsigned long magic)
> +static int kernfs_fill_super(struct super_block *sb, unsigned long magic,
> + bool enable_expop)
Hmm... can't we make this a
e cgroup info into bio at
> blkcg_bio_issue_check. This also makes blktrace outputs correct cgroup
> info.
>
> Signed-off-by: Shaohua Li
Acked-by: Tejun Heo
Thanks.
--
tejun
Hello,
On Thu, Jun 15, 2017 at 11:17:19AM -0700, Shaohua Li wrote:
> From: Shaohua Li
>
> By default we output cgroup id in blktrace. This adds an option to
> display the cgroup path. Since getting the cgroup path is a relatively heavy
> operation, we don't enable it by default.
>
> with the option enabled,
On Mon, Jun 26, 2017 at 12:43:27PM -0400, Martin K. Petersen wrote:
>
> Christoph,
>
> > ping?
>
> Looks good to me. I'll queue it up for 4.13 as soon as Linus has pulled
> in the ata bits.
I can route it through libata tree w/ your ack if that's more convenient.
Thanks.
--
tejun
To make the API lock free, kernfs node is freed in RCU
> context. And we depend on kernfs_node count/ino number to filter out
> stale kernfs nodes.
>
> Signed-off-by: Shaohua Li
Acked-by: Tejun Heo
Thanks.
--
tejun
Hello,
On Wed, Jun 28, 2017 at 10:54:28AM -0600, Jens Axboe wrote:
> >> Series looks fine to me. I don't know how you want to split or funnel it,
> >> since it touches multiple different parts. Would it make sense to split
> >> this
> >> series into two - one for the kernfs changes, and then a su
ed a 'FILEID_INO32_GEN_PARENT' type fhandle,
> we only need export the inode number and generation number just like
> what generic_fh_to_dentry does. And we can avoid the overhead of getting
> an inode too, since kernfs_node_id (ino and generation) has all the info
> requi
On Mon, Jun 19, 2017 at 02:26:46PM +0200, Christoph Hellwig wrote:
> Just wire up the generic TCG OPAL infrastructure to the SCSI disk driver
> and the Security In/Out commands.
>
> Note that I don't know of any actual SCSI disks that do support TCG OPAL,
> but this is required to support ATA disk
Hello,
On Wed, Jun 28, 2017 at 02:57:38PM -0600, Jens Axboe wrote:
> Personally I don't care that much, but the risk of conflicts is much
> higher on the block side, than on the kernfs side. So might be the
> path of less resistance to pull it through the block tree. And I'd be
> happy to do that,
Hello, Paolo.
On Mon, Nov 19, 2018 at 11:34:14AM +0100, Paolo Valente wrote:
> - if all entities produce the same output, then this common output is
> shown only once;
> - if the outputs differ, then every per-entity output is shown,
> followed by the name of the entity that produced that output.
are associating based on the current css. If there is already a blkg
> associated, the css will be reused and association will be redone as the
> request_queue may have changed.
>
> Signed-off-by: Dennis Zhou
Acked-by: Tejun Heo
A minor nit below.
> +/**
> + * bio_associate_b
On Mon, Nov 26, 2018 at 04:19:38PM -0500, Dennis Zhou wrote:
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index 62715a5a4f32..8bc9d9b29fd3 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -486,6 +486,12 @@ extern unsigned int bvec_nr_vecs(unsigned short idx);
> ext
by: Dennis Zhou
Acked-by: Tejun Heo
Thanks.
--
tejun
Hello, Paolo.
On Fri, Nov 30, 2018 at 07:23:24PM +0100, Paolo Valente wrote:
> > Then we understood that exactly the same happens with throttling, in
> > case the latter is activated on different devices w.r.t. bfq.
> >
> > In addition, the same may happen, in the near future, with the
> > bandwi
Hello, Paolo.
On Tue, Dec 18, 2018 at 08:48:10AM +0100, Paolo Valente wrote:
> If Tejun cannot see any solution to his concern, then can we just
> switch to this extension, considering that
> - for non-shared names the interface is *identical* to the current
> one;
> - by using this new interfac
Hello, Paolo.
On Sun, Dec 23, 2018 at 12:00:14PM +0100, Paolo Valente wrote:
> 4.21 is coming ... and the legacy proportional share interface will
> be gone with cfq. This will break legacy code using the
> proportional-share interface on top of bfq. This code may just fail
> when trying to use
Hello, Paolo.
On Sun, Dec 30, 2018 at 11:25:25AM +0100, Paolo Valente wrote:
> What's the benefit of throwing away months of work, on which we agreed
> before starting it, and that solves a problem already acknowledged by
> interested parties?
Showing multiple conflicting numbers definitely isn't
oc() // fails
> - .seq_stop()
>- class_dev_iter_exit(seqf->private) // boom! old pointer
>
> As the comment in disk_seqf_stop() says, stop is called even if start
> failed, so we need to reinitialise the private pointer to NULL when seq
> iteration stops.
>
> An
Hello, Tom.
On Sun, Aug 07, 2016 at 10:10:17PM +0800, Tom Yan wrote:
> So the (not so) recent bump of BLK_DEF_MAX_SECTORS from 1024 to 2560
> (commit d2be537c3ba3) seemed to have caused trouble to some of the ATA
> devices, which were then worked around with ATA_HORKAGE_MAX_SEC_1024.
>
> However,
Hello,
On Mon, Aug 08, 2016 at 01:39:08PM +0200, Roman Pen wrote:
> Long time ago there was a similar fix proposed by Akinobu Mita[1],
> but it seems that time everyone decided to fix this subtle race in
> percpu-refcount and Tejun Heo[2] did an attempt (as I can see that
> patc
Hello,
On Tue, Aug 09, 2016 at 10:45:47PM +0800, tom.t...@gmail.com wrote:
> From: Tom Yan
>
> Currently block layer limit max_hw_sectors is set to
> ATA_MAX_SECTORS_LBA48 (65535), for devices with LBA48 support.
>
> However, block layer limit max_sectors (which is the effective
> one; also adj
On Wed, Aug 10, 2016 at 10:42:09AM +0200, Roman Penyaev wrote:
> Hi,
>
> On Wed, Aug 10, 2016 at 5:55 AM, Tejun Heo wrote:
> > Hello,
> >
> > On Mon, Aug 08, 2016 at 01:39:08PM +0200, Roman Pen wrote:
> >> Long time ago there was a similar fix proposed by Aki
Hello, Tom.
On Wed, Aug 10, 2016 at 04:32:39PM +0800, Tom Yan wrote:
> I have to admit that libata may not be the right place to deal with my
> concern over the current BLK_DEF_MAX_SECTORS, which seems nonsensical
> to me. In the original commit message:
>
> (d2be537c3ba3, "block: bump BLK_DEF_M
Hello, Tom.
On Wed, Aug 10, 2016 at 06:04:10PM +0800, Tom Yan wrote:
> On 10 August 2016 at 11:26, Tejun Heo wrote:
> > Hmmm.. why not? The hardware limit is 64k and the driver is using a
>
> Is that referring to the maximum number of entries allowed in the
> PRDT, Physical
Hello, Dan.
On Fri, Aug 12, 2016 at 02:29:30PM -0700, Dan Williams wrote:
> Before spending effort trying to flush the destruction of old bdi
> instances before new ones are registered, is it rather time to
> complete the conversion of sd to only use dynamically allocated devt?
I think that proba
Hello,
On Sat, Aug 27, 2016 at 11:30:22AM +0200, Vegard Nossum wrote:
> > Don't know what the right fix is, but I posted a slightly different one
> > for the same crash some months ago:
> > https://patchwork.kernel.org/patch/8556941/
> >
>
> Ah, I'm sorry, I didn't see that.
>
> Your patch is 10
Hello,
On Mon, Aug 29, 2016 at 10:49:57PM +0200, Vegard Nossum wrote:
> That didn't work at all. I guess bd_acquire() would just do a bdgrab()
> and not touch ->bd_holders, whereas blkdev_get() would increment
Yeah, bdev has two different refs - one for bdev struct itself and the
other for the ac
Hello,
On Mon, Aug 29, 2016 at 11:33:41PM +0200, Vegard Nossum wrote:
> On 08/29/2016 09:55 PM, Tejun Heo wrote:
> > I think the right thing to do there is doing blkdev_get() /
> > blkdev_put() around func() invocation in iterate_bdevs() rather than
> > holding bd_mutex acro
On Fri, Sep 23, 2016 at 09:07:56AM -0700, Bart Van Assche wrote:
> Avoid that sparse complains about blkg_hint manipulations.
>
> Fixes: a637120e4902 ("blkcg: use radix tree to index blkgs from blkcg")
> Signed-off-by: Bart Van Assche
> Cc: Tejun Heo
Acked-by: Tejun
Hello,
On Tue, Sep 27, 2016 at 11:14:54AM -0700, Adam Manzanares wrote:
> Patch adds association between iocontext and a request.
>
> Signed-off-by: Adam Manzanares
Can you please describe how this may impact existing usages?
Thanks.
--
tejun
Hello,
On Tue, Sep 27, 2016 at 11:14:55AM -0700, Adam Manzanares wrote:
> +/**
> + * ata_ncq_prio_enabled - Test whether NCQ prio is enabled
> + * @dev: ATA device to test for
> + *
> + * LOCKING:
> + * spin_lock_irqsave(host lock)
> + *
> + * RETURNS:
> + * 1 if NCQ prio is enabled fo
k-cgroup.c:1378: blkcg_policy_register() error: double unlock
> 'mutex:&blkcg_pol_mutex'
>
> Signed-off-by: Bart Van Assche
> Cc: Tejun Heo
> Cc:
> ---
> block/blk-cgroup.c | 6 --
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --gi
k-cgroup.c:1378: blkcg_policy_register() error: double unlock
> 'mutex:&blkcg_pol_mutex'
>
> Fixes: 06b285bd1125 ("blkcg: fix blkcg_policy_data allocation bug")
> Signed-off-by: Bart Van Assche
> Cc: Tejun Heo
> Cc:
Applied to cgroup/for-4.9. We're right
Hello, Adam.
On Fri, Sep 30, 2016 at 09:02:17AM -0700, Adam Manzanares wrote:
> I'll start with the changes I made and work my way through a grep of
>
> ioprio. Please add or correct any of the assumptions I have made.
>
Well, it looks like you're the one who's most f
Hello, Vivek.
On Tue, Oct 04, 2016 at 09:28:05AM -0400, Vivek Goyal wrote:
> On Mon, Oct 03, 2016 at 02:20:19PM -0700, Shaohua Li wrote:
> > Hi,
> >
> > The background is we don't have an ioscheduler for blk-mq yet, so we can't
> > prioritize processes/cgroups.
>
> So this is an interim solution
Hello,
On Tue, Oct 04, 2016 at 06:22:28PM +0200, Paolo Valente wrote:
> Could you please elaborate more on this point? BFQ uses sectors
> served to measure service, and, on the all the fast devices on which
> we have tested it, it accurately distributes
> bandwidth as desired, redistributes exces
Hello, Vivek.
On Tue, Oct 04, 2016 at 02:12:45PM -0400, Vivek Goyal wrote:
> Agreed that we don't have a good basic unit to measure IO cost. I was
> thinking of measuring cost in terms of sectors as that's simple and
> gets more accurate on faster devices with almost no seek penalty. And
If this
Hello, Paolo.
On Tue, Oct 04, 2016 at 07:43:48PM +0200, Paolo Valente wrote:
> > I don't think IO bandwidth does not matter. The problem is bandwidth can't
> > measure IO cost. For example, you can't say 8k IO costs 2x IO resource than
> > 4k
> > IO.
>
> For what goal do you need to be able to s
Hello, Paolo.
On Tue, Oct 04, 2016 at 09:02:47PM +0200, Paolo Valente wrote:
> That's exactly what BFQ has succeeded in doing in all the tests
> devised so far. Can you give me a concrete example for which I can
> try with BFQ and with any other mechanism you deem better. If
> you are right, num
Hello, Paolo.
On Tue, Oct 04, 2016 at 09:29:48PM +0200, Paolo Valente wrote:
> > Hmm... I think we already discussed this but here's a really simple
> > case. There are three unknown workloads A, B and C and we want to
> > give A certain best-effort guarantees (let's say around 80% of the
> > und
Hello, Adam.
On Tue, Oct 04, 2016 at 08:49:18AM -0700, Adam Manzanares wrote:
> > I wonder whether the right thing to do is adding bio->bi_ioprio which
> > is initialized on bio submission and carried through req->ioprio.
>
> I looked around and thought about this and I'm not sure if this will he
Hello, Paolo.
On Wed, Oct 05, 2016 at 02:37:00PM +0200, Paolo Valente wrote:
> In this respect, for your generic, unpredictable scenario to make
> sense, there must exist at least one real system that meets the
> requirements of such a scenario. Or, if such a real system does not
> yet exist, it
Hello,
On Thu, Oct 13, 2016 at 04:00:30PM -0700, Adam Manzanares wrote:
> This patch checks to see if an ATA device supports NCQ command priorities.
> If so and the user has specified an iocontext that indicates
> IO_PRIO_CLASS_RT then we build a tf with a high priority command.
>
> This is done
Hello, Adam.
Sorry about late reply. Was on vacation.
On Thu, Oct 13, 2016 at 04:00:31PM -0700, Adam Manzanares wrote:
> Add a sysfs entry to turn on priority information being passed
> to an ATA device. By default this feature is turned off.
>
> This patch depends on ata: Enabling ATA Command P
Hello, Kyle.
On Sat, Oct 08, 2016 at 06:15:14PM -0700, Kyle Sanderson wrote:
> How is this even a discussion when hard numbers, and trying any
> reproduction case easily reproduce the issues that CFQ causes. Reading
> this thread, and many others only grows not only my disappointment,
> but whenev
Hello, Paolo.
On Fri, Oct 14, 2016 at 07:13:41PM +0200, Paolo Valente wrote:
> That said, your 'thus' seems a little too strong: "bfq does not yet
> handle fast SSDs, thus we need something else". What about the
> millions of devices (and people) still within 10-20 K IOPS, and
> experiencing awfu
On Mon, Oct 17, 2016 at 11:27:30AM -0700, Adam Manzanares wrote:
> Add a sysfs entry to turn on priority information being passed
> to an ATA device. By default this feature is turned off.
>
> This patch depends on ata: Enabling ATA Command Priorities
>
> Signed-off-by: Adam Manzanares
> ---
> d
Hello,
On Mon, Oct 17, 2016 at 11:27:28AM -0700, Adam Manzanares wrote:
> Patch adds an association between iocontext ioprio and the ioprio of a
> request. This is done to give request-based drivers the ability to
> act on priority information stored in the request. An example being
> ATA device
s done to improve the tail latency of commands that are high
priority by passing priority to the device.
tj: Removed trivial ata_ncq_prio_enabled() and open-coded the test.
Signed-off-by: Adam Manzanares
Signed-off-by: Tejun Heo
---
drivers/ata/libata-core.
On Mon, Oct 17, 2016 at 11:27:27AM -0700, Adam Manzanares wrote:
> This patch builds ATA commands with high priority if the iocontext of a
> process
> is set to real time. The goal of the patch is to improve tail latencies of
> workloads that use higher queue depths. This requires setting the ioc
This patch depends on ata: Enabling ATA Command Priorities
tj: Renamed ncq_prio_on to ncq_prio_enable and removed trivial
ata_ncq_prio_on() and open-coded the test.
Signed-off-by: Adam Manzanares
Signed-off-by: Tejun Heo
---
drivers/ata/libahci.c | 1 +
drivers/ata/libata-core.c | 3 ++-
fied
through the @gfp parameter. This currently doesn't cause any actual
issues because all current callers specify GFP_KERNEL. Fix it.
Signed-off-by: Tejun Heo
Reported-by: Dan Carpenter
Fixes: e4a9bde9589f ("blkcg: replace blkcg_policy->cpd_size with
->cpd_alloc/free_fn() method
Hello, Shaohua.
Sorry about the delay.
On Mon, Nov 14, 2016 at 02:22:09PM -0800, Shaohua Li wrote:
> @@ -1376,11 +1414,37 @@ static ssize_t tg_set_max(struct kernfs_open_file *of,
> goto out_finish;
> }
>
> - tg->bps[READ][LIMIT_MAX] = v[0];
> - tg->bps[WRITE
On Mon, Nov 14, 2016 at 02:22:10PM -0800, Shaohua Li wrote:
> each queue will have a state machine. Initially queue is in LIMIT_HIGH
> state, which means all cgroups will be throttled according to their high
> limit. After all cgroups with high limit cross the limit, the queue state
> gets upgraded
Hello,
On Mon, Nov 14, 2016 at 02:22:12PM -0800, Shaohua Li wrote:
> +static unsigned long tg_last_high_overflow_time(struct throtl_grp *tg)
> +{
> + struct throtl_service_queue *parent_sq;
> + struct throtl_grp *parent = tg;
> + unsigned long ret = __tg_last_high_overflow_time(tg);
>
Hello,
On Mon, Nov 14, 2016 at 02:22:14PM -0800, Shaohua Li wrote:
> throtl_slice is important for blk-throttling. A lot of stuff depends on
> it, for example, throughput measurement. It has 100ms default value,
> which is not appropriate for all disks. For example, for SSD we might
> use a small
Hello,
On Tue, Nov 22, 2016 at 04:21:21PM -0500, Tejun Heo wrote:
> 1. A cgroup and its high and max limits don't have much to do with
>other cgroups and their limits. I don't get how the choice between
>high and max limits can be a td-wide state.
Ah, okay, this
Hello,
On Tue, Nov 22, 2016 at 03:08:36PM -0800, Shaohua Li wrote:
> Yep, the limit could be high or max. It's doable moving the restrictions on
> input, but will increase trouble using the limits. If high is bigger than max,
> can I set high to max? if not, I'd prefer to keep the restrictions.
Y
Hello,
On Tue, Nov 22, 2016 at 03:18:24PM -0800, Shaohua Li wrote:
> Hmm, it's not a real 'time slice'. The name is a bit confusing. Maybe rename it
> to 'throtl_interval' or 'throtl_sampling_time'? Not sure. Bandwidth and iops
> are always measured in terms of a time interval. We can't
Hello,
On Mon, Nov 14, 2016 at 02:22:16PM -0800, Shaohua Li wrote:
> cg1/cg2 bps: 10/80 -> 15/105 -> 20/100 -> 25/95 -> 30/90 -> 35/85 -> 40/80
> -> 45/75 -> 10/80
I wonder whether it'd make sense to make the clamping down gradual too
(way faster than the ramping up but still gradual). The contr
On Mon, Nov 14, 2016 at 02:22:18PM -0800, Shaohua Li wrote:
> Add interface to configure the threshold
>
> Signed-off-by: Shaohua Li
> ---
> block/blk-sysfs.c| 7 +++
> block/blk-throttle.c | 25 +
> block/blk.h | 4
> 3 files changed, 36 insertion
Hello, Shaohua.
On Mon, Nov 14, 2016 at 02:22:17PM -0800, Shaohua Li wrote:
> Unfortunately it's very hard to determine if a cgroup is really idle. This
> patch uses the 'think time check' idea from CFQ for the purpose. Please
> note, the idea doesn't work for all workloads. For example, a workload
Hello, Shaohua.
On Wed, Nov 23, 2016 at 05:06:30PM -0800, Shaohua Li wrote:
> > Shouldn't this be a per-cgroup setting along with latency target?
> > These two are the parameters which define how the cgroup should be
> > treated time-wise.
>
> It should be easy to make it per-cgroup. Just not sur
Hello, Shaohua.
On Wed, Nov 23, 2016 at 05:15:18PM -0800, Shaohua Li wrote:
> > Hmm... I'm not sure thinktime is the best measure here. Think time is
> > used by cfq mainly to tell the likely future behavior of a workload so
> > that cfq can take speculative actions on the prediction. However,
>
Hello, Shaohua.
On Mon, Nov 28, 2016 at 03:10:18PM -0800, Shaohua Li wrote:
> > But we can increase sharing by upping the target latency. That should
> > be the main knob - if low, the user wants stricter service guarantee
> > at the cost of lower overall utilization; if high, the workload can
>