Thanks Alan,

When you say "you just can't have two simultaneous deletes in the same
partition", simultaneous means for the same transaction ?
If a create 2 "transactions" for 2 deletes on the same table/partition it
works. Am I right ?
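
Concretely, what I have in mind is something like the sketch below
(hypothetical HiveServer2 URL and table; the Hive JDBC driver has to be on the
classpath, and the cluster is assumed to already run with
hive.support.concurrency=true and DbTxnManager). Each session issues its own
DELETE, so each statement gets its own transaction:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ConcurrentDeletes {
  // Hypothetical HiveServer2 URL and table name -- adjust to your cluster.
  static final String URL = "jdbc:hive2://hiveserver2:10000/default";

  static void delete(String where) throws Exception {
    // One session per thread; Hive 2 wraps each statement on an ACID table in
    // its own transaction (there is no explicit BEGIN/COMMIT in HiveQL here).
    try (Connection con = DriverManager.getConnection(URL, "hive", "");
         Statement st = con.createStatement()) {
      st.executeUpdate("DELETE FROM events WHERE dt = '2019-09-09' AND " + where);
    }
  }

  static Thread deleteInBackground(String where) {
    Thread t = new Thread(() -> {
      try {
        delete(where);
      } catch (Exception e) {
        e.printStackTrace();
      }
    });
    t.start();
    return t;
  }

  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    // Two separate transactions targeting the same partition.
    Thread t1 = deleteInBackground("id % 2 = 0");
    Thread t2 = deleteInBackground("id % 2 = 1");
    t1.join();
    t2.join();
    // If I read your explanation correctly, both deletes succeed, but the
    // second one waits on the first one's semi-shared lock instead of
    // running in parallel with it.
  }
}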


On Mon, Sep 9, 2019 at 7:04 PM Alan Gates <alanfga...@gmail.com> wrote:

> In Hive 2 update and delete take what are called semi-shared locks
> (meaning they allow shared locks through, while not allowing other
> semi-shared locks), and insert and select take shared locks.  So you can
> insert or select while deleting, you just can't have two simultaneous
> deletes in the same partition.
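
For anyone reading along, one way to watch this is to run SHOW LOCKS from a
second session while a delete is in flight. A minimal sketch (same hypothetical
URL/table as above) that just dumps the lock list; if I read the DbLockManager
right, the semi-shared lock shows up there as SHARED_WRITE and the shared ones
as SHARED_READ:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;

public class DumpLocks {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection con = DriverManager.getConnection(
             "jdbc:hive2://hiveserver2:10000/default", "hive", "");
         Statement st = con.createStatement();
         // Run this while a DELETE is in flight in another session.
         ResultSet rs = st.executeQuery("SHOW LOCKS events")) {
      ResultSetMetaData md = rs.getMetaData();
      while (rs.next()) {
        StringBuilder row = new StringBuilder();
        for (int i = 1; i <= md.getColumnCount(); i++) {
          row.append(rs.getString(i)).append('\t');
        }
        System.out.println(row);
      }
    }
  }
}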
>
> The reason insert can take a shared lock is because Hive does not enforce
> uniqueness constraints, so there's no concept of overwriting an existing
> row.  Multiple inserts can also proceed simultaneously.
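
So, if I follow, the insert side of a streaming pipeline can fan out freely.
A small sketch along the lines of the delete one above, again with a
hypothetical table events (id INT, name STRING) partitioned by dt, where
neither writer should block the other:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConcurrentInserts {
  static final String URL = "jdbc:hive2://hiveserver2:10000/default"; // hypothetical

  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    ExecutorService pool = Executors.newFixedThreadPool(2);
    for (int i = 0; i < 2; i++) {
      final int n = i;
      pool.submit(() -> {
        try (Connection con = DriverManager.getConnection(URL, "hive", "");
             Statement st = con.createStatement()) {
          // Shared lock only: the two writers do not block each other,
          // each one just adds its own delta file to the partition.
          st.executeUpdate("INSERT INTO events PARTITION (dt='2019-09-09') "
              + "VALUES (" + n + ", 'row-" + n + "')");
        } catch (Exception e) {
          e.printStackTrace();
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.MINUTES);
  }
}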
>
> This changes in Hive 3, where update and delete also take shared locks and
> a first committer wins strategy is employed instead.
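
Good to know. So on Hive 3 the two-deletes sketch above would not serialize on
a lock; instead, if I understand the first-committer-wins rule, the delete that
commits second is aborted with a write-conflict error and the client has to
retry it. A rough sketch of what that retry might look like (same hypothetical
URL/table; which exception/message to check for is an assumption on my part):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class DeleteWithRetry {
  static final String URL = "jdbc:hive2://hiveserver2:10000/default"; // hypothetical

  static void deleteWithRetry(String where, int maxAttempts) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    for (int attempt = 1; ; attempt++) {
      try (Connection con = DriverManager.getConnection(URL, "hive", "");
           Statement st = con.createStatement()) {
        st.executeUpdate("DELETE FROM events WHERE dt = '2019-09-09' AND " + where);
        return;                     // this transaction committed first (or alone)
      } catch (SQLException e) {
        // Assumption: the losing side of a first-committer-wins conflict
        // surfaces as a SQLException here; a real job should inspect the
        // message for a write-conflict abort before blindly retrying.
        if (attempt >= maxAttempts) {
          throw e;
        }
      }
    }
  }
}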
>
> Alan.
>
> On Mon, Sep 9, 2019 at 8:29 AM David Morin <morin.david....@gmail.com>
> wrote:
>
>> Hello,
>>
>> We run HDP 2.6.5 with Hive 2.1.0 in production.
>> We use transactional tables and try to ingest data in a streaming way
>> (despite the fact that we are still on Hive 2).
>> I've read some docs, but I would like some clarification concerning the
>> use of locks with transactional tables.
>> Do we have to take locks ourselves during insert or delete?
>> Let's consider this pipeline:
>> 1. open a transaction for the delete and take the related (shared) lock
>> 2. create the delta directory + file for this new transaction (the original
>> transaction != the current one, where the original transaction is the one
>> used for the last insert)
>> 3. same as steps 1 and 2 for the insert (except that original transaction =
>> current)
>> 4. commit the transactions
>>
>> Can we use a shared lock here, so that select queries can still run?
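
For the insert half of steps 1-4, I think the Hive 2 streaming API
(org.apache.hive.hcatalog.streaming) already does this for us: as far as I
understand, it opens the transactions, takes the shared locks and writes the
delta files, so there is no manual lock handling; deletes would still go
through SQL and the lock rules above. A rough sketch, assuming a bucketed ORC
transactional table events (id INT, name STRING) partitioned by dt and a
hypothetical metastore URI:

import java.util.Arrays;

import org.apache.hive.hcatalog.streaming.DelimitedInputWriter;
import org.apache.hive.hcatalog.streaming.HiveEndPoint;
import org.apache.hive.hcatalog.streaming.StreamingConnection;
import org.apache.hive.hcatalog.streaming.TransactionBatch;

public class StreamingInsert {
  public static void main(String[] args) throws Exception {
    // Hypothetical metastore URI, db/table and partition value.
    HiveEndPoint endPoint = new HiveEndPoint("thrift://metastore-host:9083",
        "default", "events", Arrays.asList("2019-09-09"));
    StreamingConnection conn = endPoint.newConnection(true); // create partition if missing
    DelimitedInputWriter writer =
        new DelimitedInputWriter(new String[] {"id", "name"}, ",", endPoint);

    // A TransactionBatch opens the transactions, acquires the locks and writes
    // into a delta directory; commit() is what makes the rows visible.
    TransactionBatch txnBatch = conn.fetchTransactionBatch(10, writer);
    try {
      txnBatch.beginNextTransaction();
      txnBatch.write("1,hello".getBytes("UTF-8"));
      txnBatch.write("2,world".getBytes("UTF-8"));
      txnBatch.commit();
    } finally {
      txnBatch.close();
      conn.close();
    }
  }
}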
>>
>> Thanks
>> David
>>
