Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13975 )

Change subject: Consider the available space when selecting data dirs for 
blocks.
......................................................................


Patch Set 9:

> Patch Set 8:
>
> > (9 comments)
>  > 
>  > I'll look into the test failure some more. If I check out this
>  > patch, I see it failing a small percentage (2-6%) of the time.
>
> About the failure in DiskErrorITest.TestFailDuringScanWorkload: IIUC, the 
> test only injects the disk failure into data_dir[1], and after the reflush 
> check change, that directory may be removed from the candidate dirs. So the 
> failure may never be triggered, and the test will fail.

I added some logging and think I understand what's going on. When we first 
create the tablets, we always refresh the available space, and this refresh 
isn't atomic across the data dirs, so we can sometimes end up with the data 
directories registering different amounts of available space, even though 
they share the same disk.

When this happens, because there are only three directories and this 
implementation of PO2C always compares two different directories, it might 
end up completely ignoring the data dir with the least amount of space in it.
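
The effect described above can be reproduced with a minimal Python sketch. This 
is not Kudu's actual C++ code: `pick_distinct` and the space readings are 
hypothetical, and the sketch assumes the selection samples two different 
directories and keeps the one reporting more available space.

```python
import random

def pick_distinct(avail, rng):
    """Sample two different dirs; return the index with more available space."""
    a, b = rng.sample(range(len(avail)), 2)
    return a if avail[a] >= avail[b] else b

# Hypothetical readings for three dirs on the same disk. The non-atomic
# refresh left dir 1 reporting slightly less space than the others.
avail = [1000, 990, 1000]

rng = random.Random(0)
counts = [0] * len(avail)
for _ in range(10000):
    counts[pick_distinct(avail, rng)] += 1

# Dir 1 is always paired with a dir that reports more space, so it never wins.
print(counts)
```

Under these assumptions, a test that injects failures only into the directory 
with the lowest reading would never see a block placed there.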

So I see two paths forward for this. Either:
1) update the implementation of PO2C to sometimes select the data dir with the 
least space. For example, select two random indices (which may be the same) 
and compare their available space (as opposed to what we have now, which 
always compares two different data directories). OR...
2) update disk_failure-itest to inject failures into two data directories 
instead of one. With the current PO2C implementation, it's a safe bet that 
killing two data dirs will touch blocks.
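
As a rough illustration of option 1, here is a minimal Python sketch of 
sampling with replacement. It is not the Kudu implementation: 
`pick_with_replacement`, `selection_probabilities`, and the space readings are 
hypothetical names and values chosen for the example.

```python
import itertools
import random

def pick_with_replacement(avail, rng):
    """Sample two indices independently (they may be equal); keep the roomier."""
    a = rng.randrange(len(avail))
    b = rng.randrange(len(avail))
    return a if avail[a] >= avail[b] else b

def selection_probabilities(avail):
    """Exact chance of each dir being picked, by enumerating all index pairs."""
    n = len(avail)
    wins = [0] * n
    for a, b in itertools.product(range(n), repeat=2):
        wins[a if avail[a] >= avail[b] else b] += 1
    return [w / (n * n) for w in wins]

# Hypothetical readings: dir 1 reports the least available space.
avail = [1000, 990, 1000]
probs = selection_probabilities(avail)
print(probs)
```

With three directories, the one with the least space is chosen only when both 
sampled indices land on it, i.e. with probability 1/9 in this sketch, so it is 
selected rarely but no longer never.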


--
To view, visit http://gerrit.cloudera.org:8080/13975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I194c4965ee64aed728e3b84e684c04d445cbe529
Gerrit-Change-Number: 13975
Gerrit-PatchSet: 9
Gerrit-Owner: ZhangYao <triplesheep0...@gmail.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <405403...@qq.com>
Gerrit-Reviewer: ZhangYao <triplesheep0...@gmail.com>
Gerrit-Comment-Date: Fri, 09 Aug 2019 20:58:12 +0000
Gerrit-HasComments: No
