On 2018/5/14 15:05, Changwei Ge wrote: > Hi Jun, > > IMO, o2hb_issue_node_write() just needs _one_ vec so there is no problem > in it. > > Calculating bio number might be get related code more complicated which > I think is not necessary. > > It doesn't solve any existed problem or improve performance or clean > code. :(
OK, it sounds reasonable, thanks for your correction. Thanks, Jun > > Thanks, > Changwei > > On 2018/5/14 14:26, piaojun wrote: >> Hi Changwei, >> >> I got your point, we should let the caller retry if bio is not enough, >> right? But some caller like o2hb_issue_node_write() won't retry once error >> happens, though the bio will always be enough. I think if we could >> calculate the number of bio we need before calling bio_add_page()? >> >> Thanks >> Jun >> >> On 2018/5/14 11:21, Changwei Ge wrote: >>> Hi Jun, >>> >>> Right now, I am afraid that the easiest and fasted way to fix this issue >>> is to revert your patch. >>> >>> From comments before function bio_add_page(), we can see that it only >>> fails if either ::bi_vcnt == ::bi_max_vecs or it's a cloned bio. >>> >>> >>> So we can judge if bio is full from its return value is zero or not. >>> >>> >>> Thanks, >>> >>> Changwei >>> >>> >>> On 2018/5/10 9:13, Changwei Ge wrote: >>>> >>>> On 2018/5/10 8:24, piaojun wrote: >>>>> On 2018/5/9 20:01, Changwei Ge wrote: >>>>>> Hi Jun, >>>>>> >>>>>> >>>>>> On 2018/5/9 18:08, piaojun wrote: >>>>>>> Hi Changwei, >>>>>>> >>>>>>> On 2018/4/13 13:51, Changwei Ge wrote: >>>>>>>> If cluster scale exceeds 16 nodes, bio will be full and >>>>>>>> bio_add_page() >>>>>>>> returns 0 when adding pages to bio. Returning -EIO to >>>>>>>> o2hb_read_slots() >>>>>>>> from o2hb_setup_one_bio() will lead to losing chance to allocate more >>>>>>>> bios to present all heartbeat region. >>>>>>>> >>>>>>>> So o2hb_read_slots() fails. >>>>>>>> >>>>>>>> In my test, making fs fails in starting o2cb service. >>>>>>>> >>>>>>>> Attach error log: >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 0, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 1, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 2, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 3, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 4, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 5, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 6, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 7, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 8, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 9, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 10, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 11, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 12, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 13, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 14, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 15, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 16, vec_len = >>>>>>>> 4096, vec_start = 0 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:471 ERROR: Adding page[16] >>>>>>>> to bio failed, page ffffea0002d7ed40, len 0, vec_len 4096, >>>>>>>> vec_start 0, bi_sector 8192 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_read_slots:500 ERROR: status = -5 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_populate_slot_data:1911 ERROR: status = -5 >>>>>>>> (mkfs.ocfs2,27479,2):o2hb_region_dev_write:2012 ERROR: status = -5 >>>>>>>> >>>>>>>> Fixes: ba16ddfbeb9d ("ocfs2/o2hb: check len for bio_add_page() to >>>>>>>> avoid getting incorrect bio" >>>>>>>> >>>>>>>> Signed-off-by: Changwei Ge <ge.chang...@h3c.com> >>>>>>>> --- >>>>>>>> fs/ocfs2/cluster/heartbeat.c | 8 ++++++-- >>>>>>>> 1 file changed, 6 insertions(+), 2 deletions(-) >>>>>>>> >>>>>>>> diff --git a/fs/ocfs2/cluster/heartbeat.c >>>>>>>> b/fs/ocfs2/cluster/heartbeat.c >>>>>>>> index 91a8889abf9b..2809e29d612d 100644 >>>>>>>> --- a/fs/ocfs2/cluster/heartbeat.c >>>>>>>> +++ b/fs/ocfs2/cluster/heartbeat.c >>>>>>>> @@ -540,11 +540,12 @@ static struct bio *o2hb_setup_one_bio(struct >>>>>>>> o2hb_region *reg, >>>>>>>> struct bio *bio; >>>>>>>> struct page *page; >>>>>>>> +#define O2HB_BIO_VECS 16 >>>>>>>> /* Testing has shown this allocation to take long enough under >>>>>>>> * GFP_KERNEL that the local node can get fenced. It would be >>>>>>>> * nicest if we could pre-allocate these bios and avoid this >>>>>>>> * all together. */ >>>>>>>> - bio = bio_alloc(GFP_ATOMIC, 16); >>>>>>>> + bio = bio_alloc(GFP_ATOMIC, O2HB_BIO_VECS); >>>>>>>> if (!bio) { >>>>>>>> mlog(ML_ERROR, "Could not alloc slots BIO!\n"); >>>>>>>> bio = ERR_PTR(-ENOMEM); >>>>>>>> @@ -570,7 +571,10 @@ static struct bio *o2hb_setup_one_bio(struct >>>>>>>> o2hb_region *reg, >>>>>>>> current_page, vec_len, vec_start); >>>>>>> Should we check the validity of 'current_page' before >>>>>>> bio_add_page()? And >>>>>>> that will prevent error happen. Others looks OK. >>>>>> If I understand correctly, you mean we should check current page is >>>>>> NULL or not? >>>>>> If so I think there is no need since o2hb should guarantee that it has >>>>>> already reserved enough pages for disk heartbeat read/write behalf. >>>>> I mean we could check if 'current_page' equals O2HB_BIO_VECS before >>>>> bio_add_page() to avoid NULL pointer referrence. >>>> Yes, that might work. >>>> I find another problem within this patch. >>>> I will post v2 patch later to fix them all with consideration about >>>> your suggestion. >>>> >>>> Thanks, >>>> Changwei >>>> >>>>> thanks, >>>>> Jun >>>>> >>>>>> Thanks, >>>>>> Changwei >>>>>>> thanks, >>>>>>> Jun >>>>>>>> len = bio_add_page(bio, page, vec_len, vec_start); >>>>>>>> - if (len != vec_len) { >>>>>>>> + if (len == 0 && current_page == O2HB_BIO_VECS) { >>>>>>>> + /* bio is full now. */ >>>>>>>> + goto bail; >>>>>>>> + } else if (len != vec_len) { >>>>>>>> mlog(ML_ERROR, "Adding page[%d] to bio failed, " >>>>>>>> "page %p, len %d, vec_len %u, vec_start %u, " >>>>>>>> "bi_sector %llu\n", current_page, page, len, >>>>>>>> >>>>>>> _______________________________________________ >>>>>>> Ocfs2-devel mailing list >>>>>>> Ocfs2-devel@oss.oracle.com >>>>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel > _______________________________________________ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel