Thanks Alan,

When you say "you just can't have two simultaneous deletes in the same partition", does "simultaneous" mean within the same transaction? If I create 2 transactions for 2 deletes on the same table/partition, it works. Am I right?
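To make sure I understand, here is roughly the scenario I have in mind, driven through JDBC. It is only a sketch: the table events (partitioned by dt) and the partition value are made up, and I assume HiveServer2 on localhost:10000 with hive.support.concurrency=true and DbTxnManager enabled.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class TwoDeletes {
  public static void main(String[] args) throws Exception {
    // Two JDBC connections = two Hive sessions; with DbTxnManager each DML
    // statement below runs in its own transaction.
    try (Connection c1 = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Connection c2 = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement s1 = c1.createStatement();
         Statement s2 = c2.createStatement()) {

      // Transaction 1: delete some rows from one partition.
      s1.execute("DELETE FROM events WHERE dt = '2019-09-09' AND id < 1000");

      // Transaction 2: another delete on the same partition. Run back to back
      // like this (the first delete has already committed), both succeed. If I
      // understand your point, it is only when the two deletes overlap in time
      // that the second semi-shared lock cannot proceed alongside the first.
      s2.execute("DELETE FROM events WHERE dt = '2019-09-09' AND id >= 1000");
    }
  }
}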
On Mon, Sep 9, 2019 at 19:04 Alan Gates <alanfga...@gmail.com> wrote:

> In Hive 2, update and delete take what are called semi-shared locks
> (meaning they allow shared locks through, while not allowing other
> semi-shared locks), and insert and select take shared locks. So you can
> insert or select while deleting; you just can't have two simultaneous
> deletes in the same partition.
>
> The reason insert can take a shared lock is that Hive does not enforce
> uniqueness constraints, so there's no concept of overwriting an existing
> row. Multiple inserts can also proceed simultaneously.
>
> This changes in Hive 3, where update and delete also take shared locks and
> a first-committer-wins strategy is employed instead.
>
> Alan.
>
> On Mon, Sep 9, 2019 at 8:29 AM David Morin <morin.david....@gmail.com>
> wrote:
>
>> Hello,
>>
>> I use HDP 2.6.5 with Hive 2.1.0 in production.
>> We use transactional tables and we try to ingest data in a streaming way
>> (despite the fact that we still use Hive 2).
>> I've read some docs, but I would like some clarification concerning the
>> use of locks with transactional tables.
>> Do we have to use locks during insert or delete?
>> Let's consider this pipeline:
>> 1. Create a transaction for the delete and the related (shared) lock.
>> 2. Create the delta directory + file with this new transaction (original
>> transaction != current, where "original" is the transaction used for the
>> last insert).
>> 3. Same as steps 1 and 2 for the insert (except original transaction = current).
>> 4. Commit the transactions.
>>
>> Can we use a shared lock here? That way select queries can still be used.
>>
>> Thanks
>> David
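For reference, this is the kind of sequence the pipeline above tries to run, written here as plain HiveQL over JDBC rather than by writing the delta files myself. Again only a sketch with made-up names: a transactional table events(id INT, msg STRING) partitioned by dt (ORC, bucketed, created with 'transactional'='true'), HiveServer2 on localhost:10000, DbTxnManager enabled.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DeleteThenInsert {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement st = conn.createStatement()) {

      // Steps 1-2 of the pipeline: the delete runs in its own transaction and
      // writes its own delta for the partition.
      st.execute("DELETE FROM events WHERE dt = '2019-09-09' AND id = 42");

      // Step 3: the insert runs in another transaction and writes an insert
      // delta; per your explanation it only takes a shared lock, so selects
      // (and other inserts) can keep running against the table.
      st.execute("INSERT INTO events PARTITION (dt = '2019-09-09') VALUES (42, 'updated')");

      // From another session, SHOW LOCKS can be run while the statements above
      // execute, to see which lock types are held on the table/partition.
      try (ResultSet rs = st.executeQuery("SHOW LOCKS events")) {
        while (rs.next()) {
          System.out.println(rs.getString(1) + " " + rs.getString(2));
        }
      }
    }
  }
}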