I think it's important to keep supporting "hive --service metastore" for
backwards compatibility. I also think it would be great to have some
separation in the lib/bin if possible, so that it is easier for someone who
downloads Hive but just wants to deploy a standalone metastore to identify
which jars are needed to run the metastore.

On Thu, Jan 25, 2018 at 8:13 AM, Alexander Kolbasov <ak...@cloudera.com>
wrote:

> Alan,
>
> While continuing to ship HMS with Hive makes sense (at least for a while),
> what do you think about somehow separating lib/bin directories created in
> the distro so Hive and metastore have a separate set of bin/lib dirs?
>
> - Alex
>
> On Wed, Jan 24, 2018 at 12:16 PM, Alan Gates <alanfga...@gmail.com> wrote:
>
> > In HIVE-17983 I have been working on packaging and start/stop scripts for
> > the standalone metastore.  One question this brings up is how Hive will be
> > released now, with or without the metastore.  I can see two options:
> >
> > 1) We continue to ship the metastore with Hive.  Not only does this mean
> > the metastore code is in the Hive source code release and the metastore
> > jars are in the Hive binary distribution, but scripts like metastore.sh are
> > still included in Hive's bin directory, so that Hive admins can still do
> > 'hive --service metastore' to start the metastore.  I see the following
> > advantages of this:
> > a) it is completely backwards compatible;
> > b) it is what users would expect (I have installed many databases and never
> > been asked to first install a separate package for its data catalog or any
> > other essential piece);
> > c) this will still be the metastore's most frequent use case for at least
> > the near future.
> >
> > The disadvantage is that it is error-prone when Hive is set up to connect
> > to a separate metastore.  An operator could easily start the metastore in
> > the Hive package, not realizing Hive is configured to connect to a
> > different one.
> >
> > 2) We remove the metastore from the packaging completely, as we do with
> > Hadoop, and require the user to install it separately.  The advantages and
> > disadvantages of this exactly mirror those of option 1.
> >
> > Based on both the 80/20 rule (most metastore users will still be
> > single-system Hive users) and the law of least astonishment (people expect
> > a database to have a data catalog), I vote for option 1.
> >
> > Anyone strongly feel we should do 2 instead?
> >
> > Any other options I haven't considered?
> >
> > Alan.
> >
>
