Re: [Cluster-devel] [PATCH V10 00/19] block: support multi-page bvec

2018-11-16 Thread Ming Lei
On Fri, Nov 16, 2018 at 03:03:14PM +0100, Christoph Hellwig wrote:
> It seems like bi_phys_segments is still around of this series.
> Shouldn't it be superflous now?

Even though multi-page bvec is supported, the segment number doesn't
equal to the actual bvec count yet, for example, one bvec may be too
bigger to be held in one single segment.


Thanks,
Ming



[Cluster-devel] [GFS2 PATCH] gfs2: Remove vestigial bd_ops

2018-11-16 Thread Bob Peterson
Hi,

Field bd_ops was set but never used, so I removed it.

Signed-off-by: Bob Peterson 
---
 fs/gfs2/incore.h | 1 -
 fs/gfs2/log.c| 1 -
 fs/gfs2/trans.c  | 1 -
 3 files changed, 3 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 888b62cfd6d1..663759abe60d 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -165,7 +165,6 @@ struct gfs2_bufdata {
u64 bd_blkno;
 
struct list_head bd_list;
-   const struct gfs2_log_operations *bd_ops;
 
struct gfs2_trans *bd_tr;
struct list_head bd_ail_st_list;
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 4dcd2b48189e..5bfaf381921a 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -605,7 +605,6 @@ void gfs2_add_revoke(struct gfs2_sbd *sdp, struct 
gfs2_bufdata *bd)
bd->bd_blkno = bh->b_blocknr;
gfs2_remove_from_ail(bd); /* drops ref on bh */
bd->bd_bh = NULL;
-   bd->bd_ops = _revoke_lops;
sdp->sd_log_num_revoke++;
atomic_inc(>gl_revokes);
set_bit(GLF_LFLUSH, >gl_flags);
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 423bc2d03dd8..8a40bce33a69 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -132,7 +132,6 @@ static struct gfs2_bufdata *gfs2_alloc_bufdata(struct 
gfs2_glock *gl,
bd = kmem_cache_zalloc(gfs2_bufdata_cachep, GFP_NOFS | __GFP_NOFAIL);
bd->bd_bh = bh;
bd->bd_gl = gl;
-   bd->bd_ops = lops;
INIT_LIST_HEAD(>bd_list);
bh->b_private = bd;
return bd;



Re: [Cluster-devel] [GIT PULL] gfs2: 4.20 fixes

2018-11-16 Thread Linus Torvalds
On Thu, Nov 15, 2018 at 2:44 PM Andreas Gruenbacher  wrote:
>
> Ok, I've changed the merge commit as follows now:
>
> Merge tag 'v4.20-rc1'
>
> Pull in the gfs2 fixes that went into v4.19-rc8:
>
>   gfs2: Fix iomap buffered write support for journaled files
>   gfs2: Fix iomap buffered write support for journaled files (2)
>
> Without these two commits, the following fix would cause conflicts.
>
> So merging v4.19-rc8 would have been sufficient. v4.20-rc1 is what I
> ended up testing, though.
>
> Are you okay with that now?

If the only reason for this was the trivial one-liner conflict, the
real fix is to just not do the merge at all, and not worry about
conflicts.

I handle simple conflicts like this in my sleep. It's actually much
better and clearer if the development trees just do their own
development, and not worry about "Oh, Linus will get a small conflict
due to other changes he has pulled".

So what I did was to actually try to generate the tree I think it
*should* have been, and just merge that with the simple conflict
resolution.

You might want to double-check it - not only because I rewrote your
pull to not have the merge at all, and in the process modified things,
but just to see what I think the right approach would have been.

Now, there *are* reasons to do back-merges, but no, "Oh, there's a
trivial conflict" is not one of them.

  Linus



Re: [Cluster-devel] [PATCH V10 18/19] block: kill QUEUE_FLAG_NO_SG_MERGE

2018-11-16 Thread Omar Sandoval
On Fri, Nov 16, 2018 at 02:59:22PM +0100, Christoph Hellwig wrote:
> On Thu, Nov 15, 2018 at 06:18:11PM -0800, Omar Sandoval wrote:
> > This commit message wasn't very clear. Is it the case that
> > QUEUE_FLAG_NO_SG_MERGE is no longer set by any drivers?
> 
> I think he wants to say that not doing S/G merging is rather pointless
> with the current setup of the I/O path, as it isn't going to save
> you a significant amount of cycles.

Okay, that makes sense. Ming, you can add

Reviewed-by: Omar Sandoval 



Re: [Cluster-devel] [PATCH V10 03/19] block: use bio_for_each_bvec() to compute multi-page bvec count

2018-11-16 Thread Omar Sandoval
On Fri, Nov 16, 2018 at 10:19:56AM +0100, Christoph Hellwig wrote:
> On Thu, Nov 15, 2018 at 02:18:47PM -0800, Omar Sandoval wrote:
> > My only reason to prefer unsigned int is consistency. unsigned int is
> > much more common in the kernel:
> > 
> > $ ag --cc -s 'unsigned\s+int' | wc -l
> > 129632
> > $ ag --cc -s 'unsigned\s+(?!char|short|int|long)' | wc -l
> > 22435
> > 
> > checkpatch also warns on plain unsigned.
> 
> Talk about chicken and egg.  unsigned is perfectly valid C, and being
> shorter often helps being more readable.  checkpath is as so often
> wrongly opinionated..

Fair enough. Since enough people don't mind bare unsigned in the block
code, I retract my comment :)



Re: [Cluster-devel] [PATCH V10 18/19] block: kill QUEUE_FLAG_NO_SG_MERGE

2018-11-16 Thread Christoph Hellwig
On Thu, Nov 15, 2018 at 06:18:11PM -0800, Omar Sandoval wrote:
> This commit message wasn't very clear. Is it the case that
> QUEUE_FLAG_NO_SG_MERGE is no longer set by any drivers?

I think he wants to say that not doing S/G merging is rather pointless
with the current setup of the I/O path, as it isn't going to save
you a significant amount of cycles.



Re: [Cluster-devel] [PATCH V10 17/19] block: don't use bio->bi_vcnt to figure out segment number

2018-11-16 Thread Christoph Hellwig
On Thu, Nov 15, 2018 at 04:53:04PM +0800, Ming Lei wrote:
> It is wrong to use bio->bi_vcnt to figure out how many segments
> there are in the bio even though CLONED flag isn't set on this bio,
> because this bio may be splitted or advanced.
> 
> So always use bio_segments() in blk_recount_segments(), and it shouldn't
> cause any performance loss now because the physical segment number is figured
> out in blk_queue_split() and BIO_SEG_VALID is set meantime since
> bdced438acd83ad83a6c ("block: setup bi_phys_segments after splitting").

Looks good, but shouldn't this go to the beginning of the series?

Reviewed-by: Christoph Hellwig 



Re: [Cluster-devel] [PATCH V10 18/19] block: kill QUEUE_FLAG_NO_SG_MERGE

2018-11-16 Thread Christoph Hellwig
On Thu, Nov 15, 2018 at 04:53:05PM +0800, Ming Lei wrote:
> Since bdced438acd83ad83a6c ("block: setup bi_phys_segments after splitting"),
> physical segment number is mainly figured out in blk_queue_split() for
> fast path, and the flag of BIO_SEG_VALID is set there too.
> 
> Now only blk_recount_segments() and blk_recalc_rq_segments() use this
> flag.
> 
> Basically blk_recount_segments() is bypassed in fast path given BIO_SEG_VALID
> is set in blk_queue_split().
> 
> For another user of blk_recalc_rq_segments():
> 
> - run in partial completion branch of blk_update_request, which is an unusual 
> case
> 
> - run in blk_cloned_rq_check_limits(), still not a big problem if the flag is 
> killed
> since dm-rq is the only user.
> 
> Multi-page bvec is enabled now, QUEUE_FLAG_NO_SG_MERGE doesn't make sense any 
> more.
> 
> Cc: Dave Chinner 
> Cc: Kent Overstreet 
> Cc: Mike Snitzer 
> Cc: dm-de...@redhat.com
> Cc: Alexander Viro 
> Cc: linux-fsde...@vger.kernel.org
> Cc: Shaohua Li 
> Cc: linux-r...@vger.kernel.org
> Cc: linux-er...@lists.ozlabs.org
> Cc: David Sterba 
> Cc: linux-bt...@vger.kernel.org
> Cc: Darrick J. Wong 
> Cc: linux-...@vger.kernel.org
> Cc: Gao Xiang 
> Cc: Christoph Hellwig 
> Cc: Theodore Ts'o 
> Cc: linux-e...@vger.kernel.org
> Cc: Coly Li 
> Cc: linux-bca...@vger.kernel.org
> Cc: Boaz Harrosh 
> Cc: Bob Peterson 
> Cc: cluster-devel@redhat.com
> Signed-off-by: Ming Lei 
> ---
>  block/blk-merge.c  | 31 ++-
>  block/blk-mq-debugfs.c |  1 -
>  block/blk-mq.c |  3 ---
>  drivers/md/dm-table.c  | 13 -
>  include/linux/blkdev.h |  1 -
>  5 files changed, 6 insertions(+), 43 deletions(-)
> 
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 153a659fde74..06be298be332 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -351,8 +351,7 @@ void blk_queue_split(struct request_queue *q, struct bio 
> **bio)
>  EXPORT_SYMBOL(blk_queue_split);
>  
>  static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
> -  struct bio *bio,
> -  bool no_sg_merge)
> +  struct bio *bio)
>  {
>   struct bio_vec bv, bvprv = { NULL };
>   int cluster, prev = 0;
> @@ -379,13 +378,6 @@ static unsigned int __blk_recalc_rq_segments(struct 
> request_queue *q,
>   nr_phys_segs = 0;
>   for_each_bio(bio) {
>   bio_for_each_bvec(bv, bio, iter) {
> - /*
> -  * If SG merging is disabled, each bio vector is
> -  * a segment
> -  */
> - if (no_sg_merge)
> - goto new_segment;
> -
>   if (prev && cluster) {
>   if (seg_size + bv.bv_len
>   > queue_max_segment_size(q))
> @@ -420,27 +412,16 @@ static unsigned int __blk_recalc_rq_segments(struct 
> request_queue *q,
>  
>  void blk_recalc_rq_segments(struct request *rq)
>  {
> - bool no_sg_merge = !!test_bit(QUEUE_FLAG_NO_SG_MERGE,
> - >q->queue_flags);
> -
> - rq->nr_phys_segments = __blk_recalc_rq_segments(rq->q, rq->bio,
> - no_sg_merge);
> + rq->nr_phys_segments = __blk_recalc_rq_segments(rq->q, rq->bio);

Can we rename __blk_recalc_rq_segments to blk_recalc_rq_segments
can kill the old blk_recalc_rq_segments now?

Otherwise looks fine:

Reviewed-by: Christoph Hellwig 



Re: [Cluster-devel] [PATCH V10 00/19] block: support multi-page bvec

2018-11-16 Thread Christoph Hellwig
It seems like bi_phys_segments is still around of this series.
Shouldn't it be superflous now?



Re: [Cluster-devel] [PATCH V10 19/19] block: kill BLK_MQ_F_SG_MERGE

2018-11-16 Thread Christoph Hellwig
On Thu, Nov 15, 2018 at 04:53:06PM +0800, Ming Lei wrote:
> QUEUE_FLAG_NO_SG_MERGE has been killed, so kill BLK_MQ_F_SG_MERGE too.

Looks fine,

Reviewed-by: Christoph Hellwig 



Re: [Cluster-devel] [PATCH V10 07/19] btrfs: use bvec_last_segment to get bio's last page

2018-11-16 Thread Christoph Hellwig
On Thu, Nov 15, 2018 at 04:52:54PM +0800, Ming Lei wrote:
> index 2955a4ea2fa8..161e14b8b180 100644
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -400,8 +400,11 @@ blk_status_t btrfs_submit_compressed_write(struct inode 
> *inode, u64 start,
>  static u64 bio_end_offset(struct bio *bio)
>  {
>   struct bio_vec *last = bio_last_bvec_all(bio);
> + struct bio_vec bv;
>  
> - return page_offset(last->bv_page) + last->bv_len + last->bv_offset;
> + bvec_last_segment(last, );
> +
> + return page_offset(bv.bv_page) + bv.bv_len + bv.bv_offset;

I don't think we need this.  If last is a multi-page bvec bv_offset
will already contain the correct offset from the first page.



Re: [Cluster-devel] [PATCH V10 01/19] block: introduce multi-page page bvec helpers

2018-11-16 Thread Christoph Hellwig
> -#define bvec_iter_page(bvec, iter)   \
> +#define mp_bvec_iter_page(bvec, iter)\
>   (__bvec_iter_bvec((bvec), (iter))->bv_page)
>  
> -#define bvec_iter_len(bvec, iter)\
> +#define mp_bvec_iter_len(bvec, iter) \

I'd much prefer if we would stick to the segment naming that
we also use in the higher level helper.

So segment_iter_page, segment_iter_len, etc.

> + * This helpers are for building sp bvec in flight.

Please spell out single page, sp is not easy understandable.



Re: [Cluster-devel] [PATCH V10 13/19] iomap & xfs: only account for new added page

2018-11-16 Thread Christoph Hellwig
I'd much rather have __bio_try_merge_page only do merges in
the same page, and have a new __bio_try_merge_segment that does
multi-page merges.  This will keep the accounting a lot simpler.



Re: [Cluster-devel] [PATCH V10 11/19] bcache: avoid to use bio_for_each_segment_all() in bch_bio_alloc_pages()

2018-11-16 Thread Christoph Hellwig
> - bio_for_each_segment_all(bv, bio, i) {
> + for (i = 0, bv = bio->bi_io_vec; i < bio->bi_vcnt; bv++) {

This really needs a comment.  Otherwise it looks fine to me.



Re: [Cluster-devel] [PATCH V10 09/19] block: introduce bio_bvecs()

2018-11-16 Thread Christoph Hellwig
On Thu, Nov 15, 2018 at 04:52:56PM +0800, Ming Lei wrote:
> There are still cases in which we need to use bio_bvecs() for get the
> number of multi-page segment, so introduce it.

The only user in your final tree seems to be the loop driver, and
even that one only uses the helper for read/write bios.

I think something like this would be much simpler in the end:

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index d509902a8046..712511815ac6 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -514,16 +514,18 @@ static int lo_rw_aio(struct loop_device *lo, struct 
loop_cmd *cmd,
struct request *rq = blk_mq_rq_from_pdu(cmd);
struct bio *bio = rq->bio;
struct file *file = lo->lo_backing_file;
+   struct bvec_iter bvec_iter;
+   struct bio_vec tmp;
unsigned int offset;
int nr_bvec = 0;
int ret;
 
+   __rq_for_each_bio(bio, rq)
+   bio_for_each_bvec(tmp, bio, bvec_iter)
+   nr_bvec++;
+
if (rq->bio != rq->biotail) {
-   struct bvec_iter iter;
-   struct bio_vec tmp;
 
-   __rq_for_each_bio(bio, rq)
-   nr_bvec += bio_bvecs(bio);
bvec = kmalloc_array(nr_bvec, sizeof(struct bio_vec),
 GFP_NOIO);
if (!bvec)
@@ -537,7 +539,7 @@ static int lo_rw_aio(struct loop_device *lo, struct 
loop_cmd *cmd,
 * API will take care of all details for us.
 */
__rq_for_each_bio(bio, rq)
-   bio_for_each_bvec(tmp, bio, iter) {
+   bio_for_each_bvec(tmp, bio, bvec_iter) {
*bvec = tmp;
bvec++;
}
@@ -551,7 +553,6 @@ static int lo_rw_aio(struct loop_device *lo, struct 
loop_cmd *cmd,
 */
offset = bio->bi_iter.bi_bvec_done;
bvec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
-   nr_bvec = bio_bvecs(bio);
}
atomic_set(>ref, 2);
 
diff --git a/include/linux/bio.h b/include/linux/bio.h
index dcad0b69f57a..379440d1ced0 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -200,30 +200,6 @@ static inline unsigned bio_segments(struct bio *bio)
}
 }
 
-static inline unsigned bio_bvecs(struct bio *bio)
-{
-   unsigned bvecs = 0;
-   struct bio_vec bv;
-   struct bvec_iter iter;
-
-   /*
-* We special case discard/write same/write zeroes, because they
-* interpret bi_size differently:
-*/
-   switch (bio_op(bio)) {
-   case REQ_OP_DISCARD:
-   case REQ_OP_SECURE_ERASE:
-   case REQ_OP_WRITE_ZEROES:
-   return 0;
-   case REQ_OP_WRITE_SAME:
-   return 1;
-   default:
-   bio_for_each_bvec(bv, bio, iter)
-   bvecs++;
-   return bvecs;
-   }
-}
-
 /*
  * get a reference to a bio, so it won't disappear. the intended use is
  * something like:



Re: [Cluster-devel] [PATCH V10 15/19] block: always define BIO_MAX_PAGES as 256

2018-11-16 Thread Christoph Hellwig
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index 5040e9a2eb09..277921ad42e7 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -34,15 +34,7 @@
>  #define BIO_BUG_ON
>  #endif
>  
> -#ifdef CONFIG_THP_SWAP
> -#if HPAGE_PMD_NR > 256
> -#define BIO_MAX_PAGESHPAGE_PMD_NR
> -#else
>  #define BIO_MAX_PAGES256
> -#endif
> -#else
> -#define BIO_MAX_PAGES256
> -#endif

Looks good.  This mess should have never gone in.

Reviewed-by: Christoph Hellwig 



Re: [Cluster-devel] [PATCH V10 14/19] block: enable multipage bvecs

2018-11-16 Thread Christoph Hellwig
> -
> - if (page == bv->bv_page && off == bv->bv_offset + bv->bv_len) {
> - bv->bv_len += len;
> - bio->bi_iter.bi_size += len;
> - return true;
> - }
> + struct request_queue *q = NULL;
> +
> + if (page == bv->bv_page && off == (bv->bv_offset + bv->bv_len)
> + && (off + len) <= PAGE_SIZE)

How could the page struct be the same, but the range beyond PAGE_SIZE
(at least with the existing callers)?

Also no need for the inner btraces, and the && always goes on the
first line.

> + if (bio->bi_disk)
> + q = bio->bi_disk->queue;
> +
> + /* disable multi-page bvec too if cluster isn't enabled */
> + if (!q || !blk_queue_cluster(q) ||
> + ((page_to_phys(bv->bv_page) + bv->bv_offset + bv->bv_len) !=
> +  (page_to_phys(page) + off)))
> + return false;
> + merge:
> + bv->bv_len += len;
> + bio->bi_iter.bi_size += len;
> + return true;

Ok, this is scary, as it will give differen results depending on when
bi_disk is assigned.  But then again we shouldn't really do the cluster
check here, but rather when splitting the bio for the actual low-level
driver.

(and eventually we should kill this clustering setting off in favor
of our normal segment limits).



Re: [Cluster-devel] [PATCH V10 05/19] block: introduce bvec_last_segment()

2018-11-16 Thread Christoph Hellwig
> + unsigned last_page = total / PAGE_SIZE;
> +
> + if (last_page * PAGE_SIZE == total)

Not sure it matters much with modern compilers, but generally
we shit by PAGE_SHIFT instead of multiplying with or dividing
by PAGE_SIZE.



Re: [Cluster-devel] [PATCH V10 08/19] btrfs: move bio_pages_all() to btrfs

2018-11-16 Thread Christoph Hellwig
On Thu, Nov 15, 2018 at 04:52:55PM +0800, Ming Lei wrote:
> BTRFS is the only user of this helper, so move this helper into
> BTRFS, and implement it via bio_for_each_segment_all(), since
> bio->bi_vcnt may not equal to number of pages after multipage bvec
> is enabled.

btrfs only uses the value to check if it is larger than 1.  No amount
of multipage bio merging should ever make bi_vcnt go from 0 to 1 or
vice versa.



Re: [Cluster-devel] [PATCH V10 06/19] fs/buffer.c: use bvec iterator to truncate the bio

2018-11-16 Thread Christoph Hellwig
Looks fine,

Reviewed-by: Christoph Hellwig 



Re: [Cluster-devel] [PATCH V10 02/19] block: introduce bio_for_each_bvec()

2018-11-16 Thread Christoph Hellwig
> +static inline void __bio_advance_iter(struct bio *bio, struct bvec_iter 
> *iter,
> +   unsigned bytes, bool mp)

I think these magic 'bool np' arguments and wrappers over wrapper
don't help anyone to actually understand the code.  I'd vote for
removing as many wrappers as we really don't need, and passing the
actual segment limit instead of the magic bool flag.  Something like
this untested patch:

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 277921ad42e7..dcad0b69f57a 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -138,30 +138,21 @@ static inline bool bio_full(struct bio *bio)
bvec_for_each_segment(bvl, &((bio)->bi_io_vec[iter_all.idx]), 
i, iter_all)
 
 static inline void __bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
- unsigned bytes, bool mp)
+ unsigned bytes, unsigned max_segment)
 {
iter->bi_sector += bytes >> 9;
 
if (bio_no_advance_iter(bio))
iter->bi_size -= bytes;
else
-   if (!mp)
-   bvec_iter_advance(bio->bi_io_vec, iter, bytes);
-   else
-   mp_bvec_iter_advance(bio->bi_io_vec, iter, bytes);
+   __bvec_iter_advance(bio->bi_io_vec, iter, bytes, max_segment);
/* TODO: It is reasonable to complete bio with error here. */
 }
 
 static inline void bio_advance_iter(struct bio *bio, struct bvec_iter *iter,
unsigned bytes)
 {
-   __bio_advance_iter(bio, iter, bytes, false);
-}
-
-static inline void bio_advance_mp_iter(struct bio *bio, struct bvec_iter *iter,
-  unsigned bytes)
-{
-   __bio_advance_iter(bio, iter, bytes, true);
+   __bio_advance_iter(bio, iter, bytes, PAGE_SIZE);
 }
 
 #define __bio_for_each_segment(bvl, bio, iter, start)  \
@@ -177,7 +168,7 @@ static inline void bio_advance_mp_iter(struct bio *bio, 
struct bvec_iter *iter,
for (iter = (start);\
 (iter).bi_size &&  \
((bvl = bio_iter_mp_iovec((bio), (iter))), 1);  \
-bio_advance_mp_iter((bio), &(iter), (bvl).bv_len))
+__bio_advance_iter((bio), &(iter), (bvl).bv_len, 0))
 
 /* returns one real segment(multipage bvec) each time */
 #define bio_for_each_bvec(bvl, bio, iter)  \
diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index 02f26d2b59ad..5e2ed46c1c88 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -138,8 +138,7 @@ struct bvec_iter_all {
 })
 
 static inline bool __bvec_iter_advance(const struct bio_vec *bv,
-  struct bvec_iter *iter,
-  unsigned bytes, bool mp)
+   struct bvec_iter *iter, unsigned bytes, unsigned max_segment)
 {
if (WARN_ONCE(bytes > iter->bi_size,
 "Attempted to advance past end of bvec iter\n")) {
@@ -148,18 +147,18 @@ static inline bool __bvec_iter_advance(const struct 
bio_vec *bv,
}
 
while (bytes) {
-   unsigned len;
+   unsigned segment_len = mp_bvec_iter_len(bv, *iter);
 
-   if (mp)
-   len = mp_bvec_iter_len(bv, *iter);
-   else
-   len = bvec_iter_len(bv, *iter);
+   if (max_segment) {
+   max_segment -= bvec_iter_offset(bv, *iter);
+   segment_len = min(segment_len, max_segment);
+   }
 
-   len = min(bytes, len);
+   segment_len = min(bytes, segment_len);
 
-   bytes -= len;
-   iter->bi_size -= len;
-   iter->bi_bvec_done += len;
+   bytes -= segment_len;
+   iter->bi_size -= segment_len;
+   iter->bi_bvec_done += segment_len;
 
if (iter->bi_bvec_done == __bvec_iter_bvec(bv, *iter)->bv_len) {
iter->bi_bvec_done = 0;
@@ -197,14 +196,7 @@ static inline bool bvec_iter_advance(const struct bio_vec 
*bv,
 struct bvec_iter *iter,
 unsigned bytes)
 {
-   return __bvec_iter_advance(bv, iter, bytes, false);
-}
-
-static inline bool mp_bvec_iter_advance(const struct bio_vec *bv,
-   struct bvec_iter *iter,
-   unsigned bytes)
-{
-   return __bvec_iter_advance(bv, iter, bytes, true);
+   return __bvec_iter_advance(bv, iter, bytes, PAGE_SIZE);
 }
 
 #define for_each_bvec(bvl, bio_vec, iter, start)   \



Re: [Cluster-devel] [PATCH V10 04/19] block: use bio_for_each_bvec() to map sg

2018-11-16 Thread Christoph Hellwig
> + if (!*sg)
> + return sglist;
> + else {

No need for an else after an early return.



Re: [Cluster-devel] [PATCH V10 03/19] block: use bio_for_each_bvec() to compute multi-page bvec count

2018-11-16 Thread Mike Snitzer
On Thu, Nov 15 2018 at  3:20pm -0500,
Omar Sandoval  wrote:

> On Thu, Nov 15, 2018 at 04:52:50PM +0800, Ming Lei wrote:
> > First it is more efficient to use bio_for_each_bvec() in both
> > blk_bio_segment_split() and __blk_recalc_rq_segments() to compute how
> > many multi-page bvecs there are in the bio.
> > 
> > Secondly once bio_for_each_bvec() is used, the bvec may need to be
> > splitted because its length can be very longer than max segment size,
> > so we have to split the big bvec into several segments.
> > 
> > Thirdly when splitting multi-page bvec into segments, the max segment
> > limit may be reached, so the bio split need to be considered under
> > this situation too.
> > 
> > Cc: Dave Chinner 
> > Cc: Kent Overstreet 
> > Cc: Mike Snitzer 
> > Cc: dm-de...@redhat.com
> > Cc: Alexander Viro 
> > Cc: linux-fsde...@vger.kernel.org
> > Cc: Shaohua Li 
> > Cc: linux-r...@vger.kernel.org
> > Cc: linux-er...@lists.ozlabs.org
> > Cc: David Sterba 
> > Cc: linux-bt...@vger.kernel.org
> > Cc: Darrick J. Wong 
> > Cc: linux-...@vger.kernel.org
> > Cc: Gao Xiang 
> > Cc: Christoph Hellwig 
> > Cc: Theodore Ts'o 
> > Cc: linux-e...@vger.kernel.org
> > Cc: Coly Li 
> > Cc: linux-bca...@vger.kernel.org
> > Cc: Boaz Harrosh 
> > Cc: Bob Peterson 
> > Cc: cluster-devel@redhat.com
> > Signed-off-by: Ming Lei 
> > ---
> >  block/blk-merge.c | 90 
> > ++-
> >  1 file changed, 76 insertions(+), 14 deletions(-)
> > 
> > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > index 91b2af332a84..6f7deb94a23f 100644
> > --- a/block/blk-merge.c
> > +++ b/block/blk-merge.c
> > @@ -160,6 +160,62 @@ static inline unsigned get_max_io_size(struct 
> > request_queue *q,
> > return sectors;
> >  }
> >  
> > +/*
> > + * Split the bvec @bv into segments, and update all kinds of
> > + * variables.
> > + */
> > +static bool bvec_split_segs(struct request_queue *q, struct bio_vec *bv,
> > +   unsigned *nsegs, unsigned *last_seg_size,
> > +   unsigned *front_seg_size, unsigned *sectors)
> > +{
> > +   bool need_split = false;
> > +   unsigned len = bv->bv_len;
> > +   unsigned total_len = 0;
> > +   unsigned new_nsegs = 0, seg_size = 0;
> 
> "unsigned int" here and everywhere else.

Curious why?  I've wondered what govens use of "unsigned" vs "unsigned
int" recently and haven't found _the_ reason to pick one over the other.

Thanks,
Mike



Re: [Cluster-devel] [PATCH V10 03/19] block: use bio_for_each_bvec() to compute multi-page bvec count

2018-11-16 Thread Gao Xiang


On 2018/11/16 17:19, Christoph Hellwig wrote:
> On Thu, Nov 15, 2018 at 02:18:47PM -0800, Omar Sandoval wrote:
>> My only reason to prefer unsigned int is consistency. unsigned int is
>> much more common in the kernel:
>>
>> $ ag --cc -s 'unsigned\s+int' | wc -l
>> 129632
>> $ ag --cc -s 'unsigned\s+(?!char|short|int|long)' | wc -l
>> 22435
>>
>> checkpatch also warns on plain unsigned.
> 
> Talk about chicken and egg.  unsigned is perfectly valid C, and being
> shorter often helps being more readable.  checkpath is as so often
> wrongly opinionated..
> 

sigh...I personally tend to use "unsigned" instead of "unsigned int" as well,
but checkpatch.pl also suggests erofs to use "unsigned int" :-(

Thanks,
Gao Xiang



Re: [Cluster-devel] [PATCH 0/3] dlm: fix various incorrect behaviors

2018-11-16 Thread Kees Cook
On Fri, Nov 9, 2018 at 9:15 AM, Tycho Andersen  wrote:
> On Wed, Nov 07, 2018 at 04:20:42PM -0600, David Teigland wrote:
>> On Fri, Nov 02, 2018 at 02:18:19PM -0600, Tycho Andersen wrote:
>> > Hi,
>> >
>> > here's a series to fix some bugs I noticed in the DLM. The third patch
>> > in the series and maybe the first should probably go to stable, assuming
>> > everyone agrees they're indeed bugs.
>> >
>> > Thanks,
>> >
>> > Tycho
>> >
>> > Tycho Andersen (3):
>> >   dlm: fix invalid free
>> >   dlm: don't allow zero length names
>> >   dlm: don't leak kernel pointer to userspace
>> >
>> >  fs/dlm/lockspace.c | 2 +-
>> >  fs/dlm/member.c| 5 +++--
>> >  fs/dlm/user.c  | 2 +-
>> >  3 files changed, 5 insertions(+), 4 deletions(-)
>>
>> I've pushed these to linux-dlm next.
>
> Great, thanks! Should we send 1 and 3 to stable?

Yes please! :)

-- 
Kees Cook



Re: [Cluster-devel] [PATCH V10 03/19] block: use bio_for_each_bvec() to compute multi-page bvec count

2018-11-16 Thread Christoph Hellwig
On Thu, Nov 15, 2018 at 02:18:47PM -0800, Omar Sandoval wrote:
> My only reason to prefer unsigned int is consistency. unsigned int is
> much more common in the kernel:
> 
> $ ag --cc -s 'unsigned\s+int' | wc -l
> 129632
> $ ag --cc -s 'unsigned\s+(?!char|short|int|long)' | wc -l
> 22435
> 
> checkpatch also warns on plain unsigned.

Talk about chicken and egg.  unsigned is perfectly valid C, and being
shorter often helps being more readable.  checkpath is as so often
wrongly opinionated..