[ https://issues.apache.org/jira/browse/KUDU-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15472759#comment-15472759 ]
Adar Dembo commented on KUDU-1597:
----------------------------------
bq. Does it seem reasonable to just add a call to tablet_peer_->CheckRunning()
in the SysCatalogTable::Visit*() functions?

I don't think that's sufficient. The window of sadness begins in
ServerBase::Init() and ends just after CatalogManager::InitSysCatalogAsync().
During that time, there's not even a guarantee that
catalog_manager().sys_catalog() exists, so you won't necessarily be able to get
at the underlying tablet peer.

But, you could condition on CatalogManager::IsInitialized() (or CheckOnline(),
if you want a Status). I think that should be safe.
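
To make the suggestion concrete, here is a minimal, self-contained sketch of a web UI handler that checks the initialization flag before touching the sys catalog. It is not Kudu's actual code: CatalogManagerSketch, SysCatalogSketch, HttpResponse, FinishInit(), and HandleDumpEntities() are hypothetical stand-ins, and only the names IsInitialized() and sys_catalog() are borrowed from the comment above. The 503 response is likewise an assumption about what a reasonable handler might return.

```cpp
#include <atomic>
#include <memory>
#include <string>

// Hypothetical stand-in for the sys catalog; not Kudu's real class.
struct SysCatalogSketch {
  std::string DumpEntities() const { return "tables: []"; }
};

// Hypothetical stand-in for the catalog manager. The sys catalog does not
// exist until FinishInit() runs, which models the startup window in which
// sys_catalog() cannot be dereferenced safely.
class CatalogManagerSketch {
 public:
  bool IsInitialized() const {
    return initialized_.load(std::memory_order_acquire);
  }

  // Models the end of startup: create the sys catalog first, then publish
  // the flag, so a reader that sees "true" also sees the sys catalog.
  void FinishInit() {
    sys_catalog_.reset(new SysCatalogSketch());
    initialized_.store(true, std::memory_order_release);
  }

  const SysCatalogSketch* sys_catalog() const { return sys_catalog_.get(); }

 private:
  std::atomic<bool> initialized_{false};
  std::unique_ptr<SysCatalogSketch> sys_catalog_;
};

// Hypothetical response type for the sketch.
struct HttpResponse {
  int code;
  std::string body;
};

// Web UI handler: bail out early instead of dereferencing a sys catalog
// that may not exist yet.
HttpResponse HandleDumpEntities(const CatalogManagerSketch& cm) {
  if (!cm.IsInitialized()) {
    return {503, "Master is not yet initialized; try again later"};
  }
  return {200, cm.sys_catalog()->DumpEntities()};
}
```

The point of the design is that the handler consults only the catalog manager's own flag, which is valid for the whole life of the process, rather than anything reachable through sys_catalog(), which may still be null during startup.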
> Master crashes if web UI is visited during startup
> --------------------------------------------------
>
> Key: KUDU-1597
> URL: https://issues.apache.org/jira/browse/KUDU-1597
> Project: Kudu
> Issue Type: Bug
> Components: master
> Affects Versions: 0.10.0
> Reporter: Adar Dembo
> Priority: Critical
>
> Accessing certain paths in the master web UI during startup can cause the
> master to crash. For example, /dump-entities while the master tablet is still
> bootstrapping will lead to a CHECK crash (SysCatalogTable has an
> uninitialized schema at this point). There are no doubt others too.
>
> We haven't seen this in testing because our tests generally don't access the
> web UI, or they don't try to bootstrap a large master tablet. We've seen at
> least one user hit this, though: for some reason the master tablet was
> enormous (i.e. it took at least a minute to bootstrap) and the Cloudera
> Manager agent was periodically accessing /dump-entities.