[ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296728&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296728
 ]

ASF GitHub Bot logged work on HIVE-22068:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Aug/19 04:54
            Start Date: 17/Aug/19 04:54
    Worklog Time Spent: 10m 
      Work Description: sankarh commented on pull request #742: HIVE-22068 : Add more logging to notification cleaner and replication to track events
URL: https://github.com/apache/hive/pull/742#discussion_r314934579
 
 

 ##########
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java
 ##########
 @@ -522,6 +525,25 @@ private int executeIncrementalLoad(DriverContext driverContext) {
       // bootstrap of tables if exist.
       if (builder.hasMoreWork() || work.getPathsToCopyIterator().hasNext() || work.hasBootstrapLoadTasks()) {
         DAGTraversal.traverse(childTasks, new AddDependencyToLeaves(TaskFactory.get(work, conf)));
+      } else if (work.dbNameToLoadIn != null) {
+        // Nothing to be done for repl load now. Add a task to update the last.repl.id of the
+        // target database to the event id of the last event considered by the dump. Next
+        // incremental cycle if starts from this id, the events considered for this dump, won't
+        // be considered again. If we are replicating to multiple databases at a time, it's not
+        // possible to know which all databases we are replicating into and hence we can not
+        // update repl id in all those databases.
+        String lastEventid = builder.eventTo().toString();
 
 Review comment:
   OK
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 296728)
    Time Spent: 1.5h  (was: 1h 20m)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-22068
>                 URL: https://issues.apache.org/jira/browse/HIVE-22068
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch, HIVE-22068.05.patch
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In repl load, update the status of the target database to the last event dumped 
> so that repl status returns it and the next incremental dump can specify it as 
> the event from which to start. Without that, repl status might return an 
> old event, which might cause older events to be dumped again and/or a 
> notification event missing error if the older events have been removed by the 
> cleaner.
> While at it:
>  * Add more logging to the DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** How many events were cleared
>  ** The min and max id after the cleaning.
>  * In REPL::START, document the starting event, the end event if specified, and the 
> maximum number of events if specified.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
