Any opinions? Thanks!

Best,
Hao

On Thu, Dec 17, 2015 at 11:54 PM, Hao Hao <[email protected]> wrote:

> Hi all,
>
> For large metastores, HDFS path sync can currently take up to 10 minutes
> to start up. We need to improve the load time for starting the Hive
> Metastore (HMS), which is documented in the Jira
> <https://issues.apache.org/jira/browse/SENTRY-990>. Proposed solutions:
>
> Solution 1: During initialization, chunk all updates into small pieces
> and do not block startup by waiting for the updates to be sent. The
> plugin can send the updates to the Sentry service as deltas after HMS
> has started.
>
> Problems:
>
> - How to decide when to chunk? We can use a configurable timer or a
>   limit on the number of path updates to decide the chunk boundaries.
> - How to track the delta and the order of the requests? Make use of
>   the current update sequence number mechanism.
> - How to work with HA? (Need some input here.)
> - How do customers work with the new design, especially during
>   startup?
> - Client-side connections need to be thread safe.
>
> Solution 2: Use a lazy updating mechanism: update the paths based on
> NameNode requests. I do not prefer this approach, since it can impact
> the performance of the HDFS plugin.
>
> Any opinions about the proposal? Thanks a lot!
>
> Best,
> Hao
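
To make Solution 1 a bit more concrete, here is a rough sketch of chunking by a path-count limit and ordering chunks by sequence number. It is only an illustration, not Sentry code: `PathChunk`, `chunk_path_updates`, `ChunkReceiver`, and the `max_paths` parameter are hypothetical names, and the timer-based trigger, HA, and thread safety are left out.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class PathChunk:
    """One piece of the initial path update (hypothetical structure)."""
    seq_num: int      # assumed to increase by 1 per chunk
    paths: List[str]  # path updates carried by this chunk


def chunk_path_updates(paths: Iterable[str], max_paths: int,
                       start_seq: int = 1) -> Iterator[PathChunk]:
    """Split path updates into chunks of at most max_paths entries."""
    if max_paths < 1:
        raise ValueError("max_paths must be >= 1")
    seq = start_seq
    batch: List[str] = []
    for path in paths:
        batch.append(path)
        if len(batch) == max_paths:
            yield PathChunk(seq, batch)
            seq += 1
            batch = []
    if batch:
        # Emit the final, possibly smaller, chunk.
        yield PathChunk(seq, batch)


class ChunkReceiver:
    """Applies chunks strictly in sequence order, buffering early arrivals."""

    def __init__(self, start_seq: int = 1) -> None:
        self.next_seq = start_seq
        self.pending: Dict[int, PathChunk] = {}
        self.applied: List[str] = []

    def receive(self, chunk: PathChunk) -> None:
        if chunk.seq_num < self.next_seq:
            return  # already applied; ignore the duplicate
        self.pending[chunk.seq_num] = chunk
        # Drain every chunk that is now contiguous with what was applied.
        while self.next_seq in self.pending:
            self.applied.extend(self.pending.pop(self.next_seq).paths)
            self.next_seq += 1
```

In this sketch the sender could iterate over `chunk_path_updates(all_paths, max_paths=1000)` in a background thread after HMS is up, while the receiving side tolerates out-of-order or repeated deliveries by keying on `seq_num`.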
