Re: [PATCH 2/7] block: have ->poll_fn() return number of entries polled

2018-11-27 Thread Jens Axboe
On 11/27/18 3:10 AM, Sagi Grimberg wrote:
>> No, it'll be exactly the same, and believe me, I've done plenty of
>> performance testing to ensure that it works well.  In fact, with the
>> changes queued up and this on top, polling is both faster and more
>> efficient than it ever has been before, for both the classic
>> preadv2/pwritev2 and the async polled IO.
> 
> I don't argue that this has not been tested, I was referring to the
> following use-case: app queues say 16 I/Os with io_submit and then
> issues preadv2 for an "important" 512B sector.
> 
> With this proposed change, is poll_fn more likely to return with the
> first completion it sees rather than continue polling until it sees the
> specific I/O it polled for?

Any IO completion will make it return, that was the case before and that
is still the case. If it isn't your specific IO, then you get to go
around the loop again.
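The retry behavior described here can be sketched in plain C. This is a toy model, not kernel code: `poll_fn`, `pending`, `ours_done`, and `wait_for_our_io` are invented stand-ins for the real `->poll_fn()` and its sync caller loop.

```c
#include <assert.h>

/* Toy model of the sync polling retry loop described above. All names
 * here are invented for illustration; none are actual kernel interfaces. */

static int pending = 3;    /* other threads' IOs completing ahead of ours */
static int ours_done;      /* set once our own IO finally completes */

/* Stand-in for ->poll_fn(): reaps one completion per pass and returns
 * how many entries it found (the new return convention). */
static int poll_fn(void)
{
    if (pending > 0) {
        pending--;         /* found an IO, but not the one we wanted */
        return 1;
    }
    ours_done = 1;         /* this pass reaped our IO */
    return 1;
}

/* Caller side: any completion makes poll_fn() return; if it wasn't our
 * IO, we simply go around the loop again. */
static int wait_for_our_io(void)
{
    int passes = 0;

    while (!ours_done) {
        poll_fn();
        passes++;
    }
    return passes;
}
```

With three foreign completions ahead of ours, the caller simply goes around the loop four times; the extra passes are the only cost of not polling for a specific cookie.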

>> I think that would be useless. For SYNC type polling with one thread, it
>> doesn't matter if we're looking for a particular IO, or just any IO.
>> Once you're into two threads, you're already wasting huge amounts of CPU,
>> just to get to QD=2. The poll loop handles finding each other's IOs just
>> fine, so there's no worry on that side.
>>
>> Polling was originally designed for the strictly SYNC interfaces, and
>> since we had the cookie anyway, it was tempting to look for a specific
>> IO. I think that was a mistake, as it assumed that we'd never do async
>> polling.
> 
> Not arguing. Just now that we already have the selective interface in
> place, I'm asking: is it OK to change that (IFF the above question is
> indeed real), esoteric as this use-case may be...

It won't cause any real change for the sync use case. There's no
difference between multiple sync users of polling and a mixed async/sync
polled use case in that regard.

-- 
Jens Axboe



Re: [PATCH 2/7] block: have ->poll_fn() return number of entries polled

2018-11-27 Thread Sagi Grimberg




>>> We currently only really support sync poll, ie poll with 1
>>> IO in flight. This prepares us for supporting async poll.

>> Hey Jens,
>>
>> So are we sure that this is fine to simply replace the
>> poll functionality? You say that we support poll
>> with only 1 I/O in flight, but is it entirely true,
>> semantically speaking?

> It is, unless you have multiple threads all doing polling.  Which is
> pretty inefficient, as I'm sure you know.

I am, though I wasn't referring to multiple threads..

>> The interface is selective polling, which means that
>> the user can have more than a single I/O inflight but wants
>> to poll for a specific one that it really cares about.
>>
>> Would this introduce a regression for users that rely
>> on preadv2 to basically "complete *this* IO as fast as possible"?

> No, it'll be exactly the same, and believe me, I've done plenty of
> performance testing to ensure that it works well.  In fact, with the
> changes queued up and this on top, polling is both faster and more
> efficient than it ever has been before, for both the classic
> preadv2/pwritev2 and the async polled IO.

I don't argue that this has not been tested, I was referring to the
following use-case: an app queues, say, 16 I/Os with io_submit and then
issues preadv2 for an "important" 512B sector.

With this proposed change, is poll_fn more likely to return with the
first completion it sees rather than continue polling until it sees the
specific I/O it polled for?

> I think that would be useless. For SYNC type polling with one thread, it
> doesn't matter if we're looking for a particular IO, or just any IO.
> Once you're into two threads, you're already wasting huge amounts of CPU,
> just to get to QD=2. The poll loop handles finding each other's IOs just
> fine, so there's no worry on that side.
>
> Polling was originally designed for the strictly SYNC interfaces, and
> since we had the cookie anyway, it was tempting to look for a specific
> IO. I think that was a mistake, as it assumed that we'd never do async
> polling.

Not arguing. Just now that we already have the selective interface in
place, I'm asking: is it OK to change that (IFF the above question is
indeed real), esoteric as this use-case may be...


Re: [PATCH 2/7] block: have ->poll_fn() return number of entries polled

2018-11-25 Thread Jens Axboe
On 11/25/18 3:52 AM, Sagi Grimberg wrote:
>> We currently only really support sync poll, ie poll with 1
>> IO in flight. This prepares us for supporting async poll.
> 
> Hey Jens,
> 
> So are we sure that this is fine to simply replace the
> poll functionality? You say that we support poll
> with only 1 I/O in flight, but is it entirely true,
> semantically speaking?

It is, unless you have multiple threads all doing polling.  Which is
pretty inefficient, as I'm sure you know.

> The interface is selective polling, which means that
> the user can have more than a single I/O inflight but wants
> to poll for a specific one that it really cares about.
> 
> Would this introduce a regression for users that rely
> on preadv2 to basically "complete *this* IO as fast as possible"?

No, it'll be exactly the same, and believe me, I've done plenty of
performance testing to ensure that it works well.  In fact, with the
changes queued up and this on top, polling is both faster and more
efficient than it ever has been before, for both the classic
preadv2/pwritev2 and the async polled IO.

> Note that I like the proposed direction, I'm merely questioning
> if it is OK to simply change how selective polling worked until
> today instead of introducing a separate blk_poll_all(q) (but with
> a better name) which would be wired up to aio polling and friends.

I think that would be useless. For SYNC type polling with one thread, it
doesn't matter if we're looking for a particular IO, or just any IO.
Once you're into two threads, you're already wasting huge amounts of CPU,
just to get to QD=2. The poll loop handles finding each other's IOs just
fine, so there's no worry on that side.

Polling was originally designed for the strictly SYNC interfaces, and
since we had the cookie anyway, it was tempting to look for a specific
IO. I think that was a mistake, as it assumed that we'd never do async
polling.

-- 
Jens Axboe



Re: [PATCH 2/7] block: have ->poll_fn() return number of entries polled

2018-11-25 Thread Sagi Grimberg

> We currently only really support sync poll, ie poll with 1
> IO in flight. This prepares us for supporting async poll.

Hey Jens,

So are we sure that this is fine to simply replace the
poll functionality? You say that we support poll
with only 1 I/O in flight, but is it entirely true,
semantically speaking?

The interface is selective polling, which means that
the user can have more than a single I/O inflight but wants
to poll for a specific one that it really cares about.

Would this introduce a regression for users that rely
on preadv2 to basically "complete *this* IO as fast as possible"?

Note that I like the proposed direction, I'm merely questioning
if it is OK to simply change how selective polling worked until
today instead of introducing a separate blk_poll_all(q) (but with
a better name) which would be wired up to aio polling and friends.


Re: [PATCH 2/7] block: have ->poll_fn() return number of entries polled

2018-11-18 Thread Christoph Hellwig
Looks good,

Reviewed-by: Christoph Hellwig 


[PATCH 2/7] block: have ->poll_fn() return number of entries polled

2018-11-17 Thread Jens Axboe
We currently only really support sync poll, ie poll with 1
IO in flight. This prepares us for supporting async poll.

Note that the returned value isn't necessarily 100% accurate.
If poll races with IRQ completion, we assume that the fact
that the task is now runnable means we found at least one
entry. In reality it could be more than 1, or not even 1.
This is fine, the caller will just need to take this into
account.
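As a rough illustration of "the caller will just need to take this into account", a consumer might treat the return value purely as a hint of work done and keep its own budget. This is a hypothetical sketch, not part of this patch: `reap_until_budget` and `fake_poll` are invented names.

```c
#include <assert.h>

/* Hypothetical caller-side sketch: consume ->poll_fn()-style return
 * values as an inexact count of completions found, stopping either when
 * a budget is met or when a pass finds nothing. */
static int reap_until_budget(int (*poll)(void), int budget)
{
    int found = 0;

    while (found < budget) {
        int n = poll();

        if (n <= 0)       /* nothing reaped this pass: stop spinning */
            break;
        found += n;       /* may be inexact if poll raced with an IRQ */
    }
    return found;
}

/* Fake poll source for demonstration: 5 completions, reaped 2 at a time. */
static int remaining = 5;

static int fake_poll(void)
{
    int n = remaining >= 2 ? 2 : remaining;

    remaining -= n;
    return n;
}
```

Since the count only has to be roughly right, an occasional over- or under-count from an IRQ race merely makes the caller poll one pass more or less, which is harmless.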

Signed-off-by: Jens Axboe 
---
 block/blk-mq.c                | 18 +++++++++---------
 drivers/nvme/host/multipath.c |  4 ++--
 include/linux/blkdev.h        |  2 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7fc4abb4cc36..52b1c97cd7c6 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -38,7 +38,7 @@
 #include "blk-mq-sched.h"
 #include "blk-rq-qos.h"
 
-static bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
+static int blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
 static void blk_mq_poll_stats_start(struct request_queue *q);
 static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb);
 
@@ -3305,7 +3305,7 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
return true;
 }
 
-static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
+static int __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
 {
struct request_queue *q = hctx->queue;
long state;
@@ -3318,7 +3318,7 @@ static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
 * straight to the busy poll loop.
 */
if (blk_mq_poll_hybrid_sleep(q, hctx, rq))
-   return true;
+   return 1;
 
hctx->poll_considered++;
 
@@ -3332,30 +3332,30 @@ static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
if (ret > 0) {
hctx->poll_success++;
__set_current_state(TASK_RUNNING);
-   return true;
+   return ret;
}
 
if (signal_pending_state(state, current))
__set_current_state(TASK_RUNNING);
 
if (current->state == TASK_RUNNING)
-   return true;
+   return 1;
if (ret < 0)
break;
cpu_relax();
}
 
__set_current_state(TASK_RUNNING);
-   return false;
+   return 0;
 }
 
-static bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
+static int blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
 {
struct blk_mq_hw_ctx *hctx;
struct request *rq;
 
if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
-   return false;
+   return 0;
 
hctx = q->queue_hw_ctx[blk_qc_t_to_queue_num(cookie)];
if (!blk_qc_t_is_internal(cookie))
@@ -3369,7 +3369,7 @@ static bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
 * so we should be safe with just the NULL check.
 */
if (!rq)
-   return false;
+   return 0;
}
 
return __blk_mq_poll(hctx, rq);
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index b82b0d3ca39a..65539c8df11d 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -220,11 +220,11 @@ static blk_qc_t nvme_ns_head_make_request(struct request_queue *q,
return ret;
 }
 
-static bool nvme_ns_head_poll(struct request_queue *q, blk_qc_t qc)
+static int nvme_ns_head_poll(struct request_queue *q, blk_qc_t qc)
 {
struct nvme_ns_head *head = q->queuedata;
struct nvme_ns *ns;
-   bool found = false;
+   int found = 0;
int srcu_idx;
 
srcu_idx = srcu_read_lock(&head->srcu);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 1ad6eafc43f2..e97c0a3b2262 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -283,7 +283,7 @@ static inline unsigned short req_get_ioprio(struct request *req)
 struct blk_queue_ctx;
 
 typedef blk_qc_t (make_request_fn) (struct request_queue *q, struct bio *bio);
-typedef bool (poll_q_fn) (struct request_queue *q, blk_qc_t);
+typedef int (poll_q_fn) (struct request_queue *q, blk_qc_t);
 
 struct bio_vec;
 typedef int (dma_drain_needed_fn)(struct request *);
-- 
2.17.1