That's the old (Hive 2) version of ACID. In the newer version (Hive 3) there's no update operation, just insert and delete (an update is an insert plus a delete). If you're working against Hive 2, what you have is what you want. If you're working against Hive 3, you'll need the newer format.
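For illustration, here's a rough, untested sketch of what a Hive 3 style update looks like as a pair of events, written with the ORC core Java API. The table columns (id, msg), the file path, and the transaction ids are made up, and the real layout has more to it (delete events actually land in separate delete_delta files, and bucket ids are encoded), so treat this as a sketch of the event schema from the ACID docs, not a drop-in writer:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.StructColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.orc.OrcFile;
import org.apache.orc.TypeDescription;
import org.apache.orc.Writer;

public class AcidUpdateSketch {
  // Operation codes from https://orc.apache.org/docs/acid.html
  static final int INSERT = 0, UPDATE = 1, DELETE = 2;

  public static void main(String[] args) throws Exception {
    // Every row in an ACID delta file is wrapped in this metadata struct.
    TypeDescription schema = TypeDescription.fromString(
        "struct<operation:int,originalTransaction:bigint,bucket:int,"
        + "rowId:bigint,currentTransaction:bigint,"
        + "row:struct<id:int,msg:string>>");

    Writer writer = OrcFile.createWriter(
        new Path("delta_0000002_0000002/bucket_00000"),  // placeholder path
        OrcFile.writerOptions(new Configuration()).setSchema(schema));

    VectorizedRowBatch batch = schema.createRowBatch();
    LongColumnVector operation = (LongColumnVector) batch.cols[0];
    LongColumnVector origTxn   = (LongColumnVector) batch.cols[1];
    LongColumnVector bucket    = (LongColumnVector) batch.cols[2];
    LongColumnVector rowId     = (LongColumnVector) batch.cols[3];
    LongColumnVector currTxn   = (LongColumnVector) batch.cols[4];
    StructColumnVector row     = (StructColumnVector) batch.cols[5];
    LongColumnVector id        = (LongColumnVector) row.fields[0];
    BytesColumnVector msg      = (BytesColumnVector) row.fields[1];

    // 1) Delete event: points at the old row version (written by txn 1,
    //    rowId 0) by its id; delete events carry no row data.
    int r = batch.size++;
    operation.vector[r] = DELETE;
    origTxn.vector[r] = 1; bucket.vector[r] = 0; rowId.vector[r] = 0;
    currTxn.vector[r] = 2;
    row.noNulls = false; row.isNull[r] = true;

    // 2) Insert event: the new version of the row in the current txn.
    r = batch.size++;
    operation.vector[r] = INSERT;
    origTxn.vector[r] = 2; bucket.vector[r] = 0; rowId.vector[r] = 0;
    currTxn.vector[r] = 2;
    id.vector[r] = 42;
    msg.setVal(r, "updated value".getBytes());

    writer.addRowBatch(batch);
    writer.close();
  }
}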
Alan.

On Tue, Mar 12, 2019 at 12:24 PM David Morin <morin.david....@gmail.com> wrote:

> Thanks Alan.
> Yes, the problem in fact was that this streaming API does not handle
> updates and deletes.
> I've used native ORC files, and the next step I've planned is to use
> the ACID support described here: https://orc.apache.org/docs/acid.html
> The INSERT/UPDATE/DELETE operations seem to be implemented:
>
>   OPERATION   SERIALIZATION
>   INSERT      0
>   UPDATE      1
>   DELETE      2
>
> Do you think this approach is suitable?
>
> On Tue, Mar 12, 2019 at 19:30, Alan Gates <alanfga...@gmail.com> wrote:
>
>> Have you looked at Hive's streaming ingest?
>> https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest
>> It is designed for this case, though it only handles insert (not update),
>> so if you need updates you'd have to do the merge as you are currently
>> doing.
>>
>> Alan.
>>
>> On Mon, Mar 11, 2019 at 2:09 PM David Morin <morin.david....@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> I've just implemented a pipeline based on Apache Flink to synchronize
>>> data between MySQL and Hive (transactional + bucketed) on an HDP
>>> cluster. Flink jobs run on YARN.
>>> I've used ORC files, but without ACID properties.
>>> Then we've created external tables on the HDFS directories that contain
>>> these delta ORC files.
>>> Then MERGE INTO queries are executed periodically to merge data into
>>> the Hive target table.
>>> It works pretty well, but we want to avoid the use of these MERGE
>>> queries. How can I update ORC files directly from my Flink job?
>>>
>>> Thanks,
>>> David
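For reference, the streaming ingest API linked above (the Hive 2 hcatalog streaming flavor) is used roughly like this. This is a hedged sketch, not production code: the metastore URI, database, table, partition value, and column names are all placeholders, and the target must be a transactional, bucketed table. It also shows the limitation discussed in the thread, since the API only exposes inserts:

import java.util.Arrays;

import org.apache.hive.hcatalog.streaming.DelimitedInputWriter;
import org.apache.hive.hcatalog.streaming.HiveEndPoint;
import org.apache.hive.hcatalog.streaming.StreamingConnection;
import org.apache.hive.hcatalog.streaming.TransactionBatch;

public class StreamingIngestSketch {
  public static void main(String[] args) throws Exception {
    // Placeholder metastore URI, database, table, and partition value.
    HiveEndPoint endPoint = new HiveEndPoint(
        "thrift://metastore-host:9083",
        "mydb", "mytable", Arrays.asList("2019"));

    // true = create the partition if it doesn't exist yet.
    StreamingConnection conn = endPoint.newConnection(true);
    DelimitedInputWriter writer =
        new DelimitedInputWriter(new String[]{"id", "msg"}, ",", endPoint);

    // Each batch groups several transactions to amortize metastore calls.
    TransactionBatch txnBatch = conn.fetchTransactionBatch(10, writer);
    txnBatch.beginNextTransaction();
    txnBatch.write("1,hello".getBytes());  // insert only -- no update/delete
    txnBatch.write("2,world".getBytes());
    txnBatch.commit();
    txnBatch.close();
    conn.close();
  }
}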