In Hive 2, update and delete take what are called semi-shared locks (meaning
they allow shared locks through while not allowing other semi-shared
locks), and insert and select take shared locks.  So you can insert or
select while deleting; you just can't have two simultaneous deletes in the
same partition.
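
For what it's worth, here's a minimal JDBC sketch of that behavior; the
HiveServer2 URL and the "events" table are made up, and it assumes "events"
is a transactional table partitioned by ds:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DeleteVsSelect {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    String url = "jdbc:hive2://hs2-host:10000/default";  // hypothetical HiveServer2 URL

    // The delete runs on its own connection/thread; it takes a semi-shared lock
    // on the target partition, so it only blocks other semi-shared holders.
    Thread deleter = new Thread(() -> {
      try (Connection c = DriverManager.getConnection(url, "hive", "");
           Statement s = c.createStatement()) {
        s.execute("DELETE FROM events WHERE ds = '2019-09-09' AND id = 42");
      } catch (Exception e) {
        e.printStackTrace();
      }
    });
    deleter.start();

    // A select only needs a shared lock, so it is not blocked by the delete.
    try (Connection c = DriverManager.getConnection(url, "hive", "");
         Statement s = c.createStatement();
         ResultSet rs = s.executeQuery("SELECT count(*) FROM events WHERE ds = '2019-09-09'")) {
      while (rs.next()) {
        System.out.println("rows visible to the reader: " + rs.getLong(1));
      }
    }
    deleter.join();
  }
}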

The reason insert can take a shared lock is that Hive does not enforce
uniqueness constraints, so there's no concept of overwriting an existing
row.  Multiple inserts can also proceed simultaneously.
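
If you want to see which lock types are actually held while writers and
readers are running, SHOW LOCKS will list them.  A small sketch against the
same hypothetical table (column names are printed generically rather than
assumed):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;

public class ShowLocks {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection c = DriverManager.getConnection("jdbc:hive2://hs2-host:10000/default", "hive", "");
         Statement s = c.createStatement();
         ResultSet rs = s.executeQuery("SHOW LOCKS events EXTENDED")) {
      ResultSetMetaData md = rs.getMetaData();
      while (rs.next()) {
        // One row per lock: concurrent inserts and selects show up as shared locks,
        // while an in-flight update/delete shows up as a semi-shared lock.
        StringBuilder row = new StringBuilder();
        for (int i = 1; i <= md.getColumnCount(); i++) {
          row.append(md.getColumnName(i)).append('=').append(rs.getString(i)).append("  ");
        }
        System.out.println(row);
      }
    }
  }
}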

This changes in Hive 3, where update and delete also take shared locks and
a first-committer-wins strategy is employed instead.
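
As a rough sketch of what first-committer-wins looks like from a client
(assuming a Hive 3 cluster, the same hypothetical URL and table as above,
and that the two statements really do overlap; how the abort surfaces and
the exact error text can vary):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class FirstCommitterWins {
  static final String URL = "jdbc:hive2://hs2-host:10000/default";  // hypothetical

  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    Runnable update = () -> {
      try (Connection c = DriverManager.getConnection(URL, "hive", "");
           Statement s = c.createStatement()) {
        // Both writers take shared locks in Hive 3, so they can overlap;
        // whichever transaction commits second is aborted with a write-conflict error.
        s.execute("UPDATE events SET payload = 'x' WHERE id = 42");
      } catch (SQLException e) {
        System.out.println("aborted (lost the race): " + e.getMessage());
      }
    };
    Thread t1 = new Thread(update);
    Thread t2 = new Thread(update);
    t1.start(); t2.start();
    t1.join(); t2.join();
  }
}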

Alan.

On Mon, Sep 9, 2019 at 8:29 AM David Morin <morin.david....@gmail.com>
wrote:

> Hello,
>
> I use in production HDP 2.6.5 with Hive 2.1.0
> We use transactional tables and we try to ingest data in a streaming way
> (despite the fact we still use Hive 2)
> I've read some docs but I would like some clarifications concerning the
> use of locks with transactional tables.
> Do we have to use locks during insert or delete?
> Let's consider this pipeline:
> 1. create a transaction for delete and related lock (shared)
> 2. create delta directory + file with this new transaction (the original
> transaction != the current one, where the original transaction is the one
> used for the last insert)
> 3. same as steps 1 and 2 for insert (except original transaction = current)
> 4. commit transactions
>
> Can we use a shared lock here, so that select queries can still be used?
>
> Thanks
> David
>
