Any opinions? Thanks!

Best,
Hao

On Thu, Dec 17, 2015 at 11:54 PM, Hao Hao <[email protected]> wrote:

> Hi all,
>
> For large metastores, the HDFS path sync can currently take up to 10
> minutes at startup. We need to reduce the load time for starting the Hive
> Metastore, as documented in the Jira
> <https://issues.apache.org/jira/browse/SENTRY-990>. Proposed solutions
> are below:
>
> Solution 1: During initialization, split the full update into small
> chunks and do not block startup while waiting for the updates to be sent.
> The plugin can then send the updates to the Sentry service, based on the
> delta, after HMS has started.
>
>
> Problems:
>
>    - How to decide when to chunk? We can use a configurable timer or a
>    limit on the number of path updates to decide the size of each chunk.
>    - How to track the delta and the order of the requests? Make use of
>    the current update sequence number mechanism.
>    - How to work with HA? (Need some input here.)
>    - How do customers work with the new design? (Especially during
>    startup.)
>    - Client-side connections need to be thread safe.
>
>
> Solution 2: Use a lazy update mechanism: update paths on demand, based on
> namenode requests. I do not prefer this approach, since it can hurt the
> performance of the HDFS plugin.
>
> Any opinions about the proposal? Thanks a lot!
>
> Best,
> Hao
>
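
To make Solution 1 easier to discuss, here is a rough sketch of the chunking and ordering idea. It is only an illustration, not Sentry code: `PathUpdate`, `chunk_paths`, `UpdateReceiver`, and `max_paths` are all hypothetical names, and it covers only the path-count limit (not the timer). It assumes each chunk carries a sequence number so the receiving side can apply chunks in order even if they arrive out of order.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class PathUpdate:
    """One chunk of the full path snapshot (hypothetical type)."""
    seq_num: int      # increasing sequence number used to order chunks
    paths: List[str]  # HDFS paths carried by this chunk


def chunk_paths(paths: List[str], max_paths: int,
                start_seq: int = 1) -> Iterator[PathUpdate]:
    """Split a path list into chunks of at most max_paths paths each."""
    if max_paths <= 0:
        raise ValueError("max_paths must be positive")
    for offset in range(0, len(paths), max_paths):
        seq_num = start_seq + offset // max_paths
        yield PathUpdate(seq_num, paths[offset:offset + max_paths])


class UpdateReceiver:
    """Stand-in for the receiving service: applies chunks in sequence order."""

    def __init__(self, first_seq: int = 1) -> None:
        self.next_seq = first_seq                 # next sequence number to apply
        self.pending: Dict[int, PathUpdate] = {}  # chunks that arrived early
        self.applied: List[str] = []              # paths applied so far

    def receive(self, update: PathUpdate) -> None:
        if update.seq_num < self.next_seq:
            return  # already applied; ignore the duplicate
        self.pending[update.seq_num] = update
        # Apply every chunk that is now contiguous with what was applied.
        while self.next_seq in self.pending:
            self.applied.extend(self.pending.pop(self.next_seq).paths)
            self.next_seq += 1
```

With this shape, HMS startup would only need to build the chunks; a background sender could then call `receive` (or its remote equivalent) for each chunk after HMS is up. The HA and thread-safety questions above are not addressed by this sketch.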
