[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498877#comment-14498877
 ] 

Sushanth Sowmyan commented on HIVE-10228:
-----------------------------------------

Sorry, yeah, this is a big patch. :)

It's really a cumulative patch of a bunch of work, but a lot of that was 
overwriting itself so much that splitting them out into a bunch of patches 
would have been difficult. Forking hive to do dev of this on a separate branch 
and merging in one go might have been easier.

I'd created https://issues.apache.org/jira/browse/HIVE-10264 as a doc jira, and 
I've attached a presentation-like document there outlining various points of 
why we're doing a bunch of what we're doing, but that still needs some 
wiki-fication that I am working on. I've also attached the replay-protocol 
document on that jira after updating it slightly with your question on DROP 
TABLE here.

I'll reply to code-level comments on review board, and reply to your 
higher-level comments here.

DROP TABLE : This is not quite a DROP TABLE IF EXISTS, it's a DROP TABLE IF 
OLDER THAN(x). There are a couple of cases this can happen in:

a) To make it more resilient in cases of parallelization of events (in the 
cases of a worker that times out and does not respond back, for eg., but might 
still be running, albeit slowly in the background), one of the goals of all 
Commands generated by Replication is that they should be idempotent, and 
reprocessing of events older than the state of an object should not cause any 
error. For example, a drone processing events (41,42,43) might perform 41 
and then not respond for a significant amount of time, causing Falcon to 
queue another HiveDR job that starts performing (41,42,43). That job's 43 might 
return successfully before the first job performs 42, which would then fail. So, one of the 
early design goals was that all commands should be resilient to repeats. This 
is a way of achieving that goal.
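
To make the repeat scenario concrete, here is a minimal sketch of the idea, not the actual Hive or Falcon code. It assumes a single replicated object whose state is the id of the last event applied to it; `apply_event` and the starting state of 40 are made up for illustration.

```python
# Hypothetical model: one replicated object whose state is the id of the
# last event applied to it. Names here are illustrative, not Hive APIs.

def apply_event(state, event_id):
    """Return (new_state, applied). Events not newer than `state` are no-ops."""
    if event_id <= state:
        return state, False
    return event_id, True


# Job A applies 41 and stalls; job B is queued and replays 41, 42, 43;
# job A then wakes up and applies its stale 42.
interleaving = [("A", 41), ("B", 41), ("B", 42), ("B", 43), ("A", 42)]

state = 40  # assumed starting state of the destination object
log = []
for job, event_id in interleaving:
    state, applied = apply_event(state, event_id)
    log.append((job, event_id, applied))
```

In this sketch the repeated 41 and the late 42 are skipped without raising an error, and the object ends at state 43 regardless of which job finishes first.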

b) In the case of a 
CREATE1->DROP1->CREATE2->REPL(CREATE1)->REPL(DROP1)->REPL(CREATE2), since the 
REPL(CREATE1) occurs after CREATE2, it picks up a newer state of the table, and 
the destination is at a newer state than the table which was dropped. Thus, by 
making the DROP ignore the destination table if it's already newer than the 
event that spawned the DROP, we can optimize away a bit of re-importing that 
REPL(CREATE2) would have needed to do. In the future, we'll add in 
event-nullification, and can do it at a higher level if we batch events, but 
this helps out even when processing at an individual level.
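
The sequence in (b) can be traced with a small sketch. This is an illustration only, assuming the destination is a dict of table name to state id and that event ids 1, 2 and 3 stand in for CREATE1, DROP1 and CREATE2; the function name is hypothetical, and treating an equal state id as "not older" is an assumption of this sketch.

```python
# Hypothetical model: the destination maps table name -> state id.
# Event ids 1, 2, 3 stand in for CREATE1, DROP1, CREATE2; they are made up.

def repl_drop_if_older(destination, table, event_id):
    """Drop `table` only if its state id is older than `event_id`."""
    state = destination.get(table)
    if state is None or state >= event_id:
        return False  # missing or already newer: leave the destination alone
    del destination[table]
    return True


destination = {}

# REPL(CREATE1) runs after CREATE2 on the source, so the export it imports
# already reflects the table as of event 3.
destination["t"] = 3

# REPL(DROP1) carries event id 2; the destination is newer, so it is skipped.
dropped = repl_drop_if_older(destination, "t", 2)
```

Because the drop is skipped, the table is still present at the destination when REPL(CREATE2) arrives, so it does not have to be re-created from scratch.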

c) In addition to being a DROP-IF-OLDER, it also acts like a recursive 
DROP-TABLE-IF-OLDER: in cases where it doesn't result in the dropping of the 
table, it will still result in dropping older partitions in a newer table. For 
eg., if a T(state=50) has partitions P1(state=45) and P2(state=53), then 
DROP_TABLE_IF_OLDER_THAN(47) will drop P1 but not P2. This is because a 
Drop-table event does not result in a series of DropPtn events that are 
associated with the appropriate table. So, given that our replication works on 
a per-object basis, if DropTable should not drop the destination table because 
the destination table is newer than the origin table at the time of the drop, 
it might still contain older partitions which should be nuked. (This mode is 
tested in one of the tests in TestCommands in HIVE-10227 if you want to have a 
look at an example of what's expected)
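
The T/P1/P2 example above can be sketched as follows. This is a simplified model rather than the real implementation: the dict layout and the function name `drop_table_if_older_than` are assumptions for illustration, and the actual expected behaviour is what the TestCommands test checks.

```python
# Hypothetical model of a destination table:
#   {"state": <table state id>, "partitions": {<name>: <partition state id>}}

def drop_table_if_older_than(table, event_id):
    """Apply a replicated drop-table event to `table`.

    Returns (table_dropped, dropped_partitions). If the table itself is
    older than the event, the whole table goes. Otherwise the table is
    kept, but partitions older than the event are still removed.
    """
    if table["state"] < event_id:
        return True, sorted(table["partitions"])
    stale = sorted(
        name for name, state in table["partitions"].items() if state < event_id
    )
    for name in stale:
        del table["partitions"][name]
    return False, stale


t = {"state": 50, "partitions": {"P1": 45, "P2": 53}}
table_dropped, dropped_partitions = drop_table_if_older_than(t, 47)
```

With these inputs the table (state 50) is kept, P1 (state 45) is dropped, and P2 (state 53) survives.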

--

Regarding the keyword addition, thanks for the feedback, it was not my intent 
to make them "reserved keywords". I talked to [~pxiong] and [~ashutoshc] about 
it, and the latter is the way that makes sense. As long as I add them to the 
nonReserved entry in IdentifiersParser.g, it should be good. So, I'll add that 
in and have another update here.


> Changes to Hive Export/Import/DropTable/DropPartition to support replication 
> semantics
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-10228
>                 URL: https://issues.apache.org/jira/browse/HIVE-10228
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Import/Export
>    Affects Versions: 1.2.0
>            Reporter: Sushanth Sowmyan
>            Assignee: Sushanth Sowmyan
>         Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch
>
>
> We need to update a couple of hive commands to support replication semantics. 
> To wit, we need the following:
> EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
> Export will now support an extra optional clause to tell it that this export 
> is being prepared for the purpose of replication. There is also an additional 
> optional clause here, that allows for the export to be a metadata-only 
> export, to handle cases of capturing the diff for alter statements, for 
> example.
> Also, if done for replication, the non-presence of a table, or a table being 
> a view/offline table/non-native table is not considered an error, and 
> instead, will result in a successful no-op.
> IMPORT ... (as normal) – but handles new semantics 
> No syntax changes for import, but import will have to change to be able to 
> handle all the permutations of export dumps possible. Also, import will have 
> to ensure that it should update the object only if the update being imported 
> is not older than the state of the object. Also, import currently does not 
> work with dbname.tablename kind of specification, this should be fixed to 
> work.
> DROP TABLE ... FOR REPLICATION('eventid')
> Drop Table now has an additional clause, to specify that this drop table is 
> being done for replication purposes, and that the drop should not actually 
> drop the table if the table is newer than that event id specified.
> ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
> Similarly, Drop Partition also has an equivalent change to Drop Table.
> =
> In addition, we introduce a new property "repl.last.id", which when tagged on 
> to table properties or partition properties on a replication-destination, 
> holds the effective "state identifier" of the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
