[
https://issues.apache.org/jira/browse/KUDU-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Adar Dembo updated KUDU-2665:
-----------------------------
Attachment: block_manager-stress-test.txt.gz
Priority: Blocker (was: Major)
Issue Type: Bug (was: New Feature)
I think I've identified the root cause.
The new runtime container deletion code works as follows:
# When a deletion transaction goes out of scope, we check all containers that
participated. If any are both full and have no live blocks _right now_, we
declare them to be dead, mark them as such, and remove their refs from global
log block manager state.
# The dead containers continue to live in memory because they have other
referents. These referents are ongoing {{WritableBlock}} instances (there
shouldn't be any because the container is dead) and opened {{ReadableBlock}}
instances (these may exist).
# When the last referent is closed, the container's destructor runs. Because
the container was marked as dead, its on-disk files are now removed.
To make all this work, it is assumed that when a container is both full and has
no live blocks anymore, it is going to remain in that state in perpetuity.
That's logically true: a full container with no live blocks isn't going to be
used for any new blocks. However, due to the nature of {{WritableBlock}}
finalization/closing, it's possible for a container with outstanding
{{WritableBlock}} instances to briefly appear as dead. That's because:
# The container's next block offset (responsible for determining fullness) is
incremented when the {{WritableBlock}} is finalized, but
# The container's live block count is incremented when the {{WritableBlock}} is
_closed_.
Thus, if the "last" block in a container is deleted after a {{WritableBlock}}
has been finalized but before it has been closed, the container will be
erroneously marked as dead. What's the effect? When the container's last
referent disappears (i.e. the last outstanding {{ReadableBlock}} is closed), it
will be deleted from disk _despite having live blocks in it_. Because
block_manager-stress-test restarts from time to time, the block manager thus
loses blocks that the test still expects to find.
I'm attaching the test's output with a lot more instrumentation showing the bug.
We absolutely need to fix this before releasing 1.9, or at least disable the
runtime container deletion code.
> BlockManagerStressTest.StressTest is extremely flaky
> ----------------------------------------------------
>
> Key: KUDU-2665
> URL: https://issues.apache.org/jira/browse/KUDU-2665
> Project: Kudu
> Issue Type: Bug
> Components: fs
> Affects Versions: 1.9.0
> Reporter: Mike Percy
> Assignee: HeLifu
> Priority: Blocker
> Fix For: 1.9.0
>
> Attachments: block_manager-stress-test.txt.gz
>
>
> After some recent block manager changes the Block Manager Stress Test is
> about 50% flaky on certain precommit builds. The failure looks like this:
> {code:java}
> /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/fs/block_manager-stress-test.cc:518:
> Failure
> Failed
> Bad status: Not found:
> /data/somelongdirectorytoavoidrpathissues/src/kudutest/block_manager-stress-test.0.BlockManagerStressTest_1.StressTest.1547778831841692-23619/data/e8ab31ef3e2143a5bc6d7a2b40e7805b.data:
> No such file or directory (error 2)
> /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/fs/block_manager-stress-test.cc:549:
> Failure
> Expected: this->InjectNonFatalInconsistencies() doesn't generate new fatal
> failures in the current thread.
> Actual: it does.
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)