[
https://issues.apache.org/jira/browse/HIVE-14841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668011#comment-15668011
]
Sergey Shelukhin commented on HIVE-14841:
-----------------------------------------
Is it possible to do this work in the branch? This causes immense conflicts with
the hive-14535 branch, and I see tons of comments with FIXMEs and the like that
promise to move code around and refactor this and that.
I think this should be done on the branch and merged once, when ready, so that
conflicts with parallel changes to the code affected by the moves are minimized.
> Replication - Phase 2
> ---------------------
>
> Key: HIVE-14841
> URL: https://issues.apache.org/jira/browse/HIVE-14841
> Project: Hive
> Issue Type: New Feature
> Components: repl
> Affects Versions: 2.1.0
> Reporter: Sushanth Sowmyan
> Assignee: Sushanth Sowmyan
>
> Per the email sent out to the dev list, the current implementation of replication
> in Hive has certain drawbacks, for instance:
> * Replication follows a rubberbanding pattern, wherein different tables/ptns
> can be in a different/mixed state on the destination, so that unless all
> events are caught up on, we do not have an equivalent warehouse. Thus, this
> only satisfies DR cases, not load balancing usecases, and the secondary
> warehouse is really only seen as a backup, rather than as a live warehouse
> that trails the primary.
> * The base implementation is a naive implementation, and has several
> performance problems, including a large amount of duplication of data for
> subsequent events, as mentioned in HIVE-13348, and having to copy out entire
> partitions/tables when just a delta of files might be sufficient. Also,
> using EXPORT/IMPORT allows us a simple implementation, but at the cost of
> tons of temporary space, much of which is not actually applied at the
> destination.
> Thus, to track this, we now create a new branch (repl2) and an uber-jira (this
> one) to track experimental development towards improvement of this situation.