[ 
https://issues.apache.org/jira/browse/KUDU-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15472759#comment-15472759
 ] 

Adar Dembo commented on KUDU-1597:
----------------------------------

bq. Does it seem reasonable to just add a call to tablet_peer_->CheckRunning() 
in the SysCatalogTable::Visit*() functions?

I don't think that's sufficient. The window of sadness begins in 
ServerBase::Init() and ends just after CatalogManager::InitSysCatalogAsync(). 
During that time, there's not even a guarantee that 
catalog_manager().sys_catalog() exists, so you won't necessarily be able to get 
at the underlying tablet peer.

But you could condition on CatalogManager::IsInitialized() (or CheckOnline(), 
if you want a Status). I think that should be safe.
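
To make the suggestion concrete, here is a minimal, self-contained sketch of the 
guard. It is not Kudu code: FakeCatalogManager, MarkInitialized(), 
DumpEntities() and HandleDumpEntities() are hypothetical stand-ins, and only the 
IsInitialized() name is borrowed from the comment above. The idea is that a web 
UI handler checks the flag first and returns a "not ready" message instead of 
touching catalog state that may not exist yet.

```cpp
#include <atomic>
#include <string>

// Hypothetical stand-in for the master's catalog manager (not Kudu's class).
class FakeCatalogManager {
 public:
  // Named after the CatalogManager::IsInitialized() check suggested above.
  bool IsInitialized() const {
    return initialized_.load(std::memory_order_acquire);
  }

  // Hypothetical hook: called once the system catalog has finished loading.
  void MarkInitialized() {
    initialized_.store(true, std::memory_order_release);
  }

  // Placeholder for work that is only safe after initialization.
  std::string DumpEntities() const { return "tables: []"; }

 private:
  std::atomic<bool> initialized_{false};
};

// Hypothetical handler for a path such as /dump-entities.
std::string HandleDumpEntities(const FakeCatalogManager& cm) {
  // Bail out early during startup rather than reading uninitialized state.
  if (!cm.IsInitialized()) {
    return "Master is still starting up; try again later";
  }
  return cm.DumpEntities();
}
```

A handler written this way never dereferences the sys catalog or its tablet 
peer before the check passes, which is why conditioning on the catalog manager 
is safer than calling CheckRunning() on a peer that may not exist yet.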



> Master crashes if web UI is visited during startup
> --------------------------------------------------
>
>                 Key: KUDU-1597
>                 URL: https://issues.apache.org/jira/browse/KUDU-1597
>             Project: Kudu
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.10.0
>            Reporter: Adar Dembo
>            Priority: Critical
>
> Accessing certain paths in the master web UI during startup can cause the 
> master to crash. For example, /dump-entities while the master tablet is still 
> bootstrapping will lead to a CHECK crash (SysCatalogTable has an 
> uninitialized schema at this point). There are no doubt others too.
> We haven't seen this in testing because our tests generally don't access the 
> web UI, or they don't try to bootstrap a large master tablet. We've seen at 
> least one user hit this, though: for some reason the master tablet was 
> enormous (i.e. it took at least a minute to bootstrap) and the Cloudera 
> Manager agent was periodically accessing /dump-entities.



